How the Windows Subsystem for Linux Redirects Syscalls

How the Windows Subsystem for Linux Redirects Syscalls (blogs.msdn.microsoft.com)

359 points by jackhammons 10 years ago | 266 comments

ataylor284_ 10 years ago |

> The real NtQueryDirectoryFile API takes 11 parameters

Curiosity got the best of me here: I had to look this up in the docs to see how a linux syscall that takes 3 parameters could possibly take 11 parameters. Spoiler alert: they are used for async callbacks, filtering by name, allowing only partial results, and the ability to progressively scan with repeated calls.
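To make the comparison concrete, here is a small sketch that lays the two parameter lists side by side. The names are taken from the documented NtQueryDirectoryFile prototype and the getdents(2) man page; the "purpose" labels are my own informal grouping, not official terminology, and the helper functions are made up for illustration.

```python
# Parameter names as documented for NtQueryDirectoryFile and getdents(2).
# The purpose labels are an informal grouping, not official terminology.
NT_QUERY_DIRECTORY_FILE = [
    ("FileHandle", "which directory"),
    ("Event", "async completion"),
    ("ApcRoutine", "async completion"),
    ("ApcContext", "async completion"),
    ("IoStatusBlock", "result status"),
    ("FileInformation", "output buffer"),
    ("Length", "output buffer"),
    ("FileInformationClass", "output format"),
    ("ReturnSingleEntry", "partial results"),
    ("FileName", "filter by name"),
    ("RestartScan", "progressive scan"),
]

LINUX_GETDENTS = [
    ("fd", "which directory"),
    ("dirp", "output buffer"),
    ("count", "output buffer"),
]


def purposes(params):
    """Return the distinct purposes in first-seen order."""
    seen = []
    for _, purpose in params:
        if purpose not in seen:
            seen.append(purpose)
    return seen


def nt_only_purposes():
    """Purposes covered by the NT call that getdents has no parameter for."""
    linux = set(purposes(LINUX_GETDENTS))
    return [p for p in purposes(NT_QUERY_DIRECTORY_FILE) if p not in linux]


if __name__ == "__main__":
    print(len(NT_QUERY_DIRECTORY_FILE), "vs", len(LINUX_GETDENTS))
    for purpose in nt_only_purposes():
        print("NT only:", purpose)
```

Read this way, the extra eight parameters are less "bloat" than features that Linux exposes through other calls or leaves to user space.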

bitwize 10 years ago | |

This is a recurring pattern in Windows development. Unix devs look at the Windows API and go "This syscall takes 11 parameters? GROAN." But the NT kernel is much more sophisticated and powerful than Linux, so its system calls are going to be necessarily more complicated.

trentnelson 10 years ago | | |

Curiosity got the better of me recently when I re-read Russinovich's [NT and VMS - The Rest Of The Story](http://windowsitpro.com/windows-client/windows-nt-and-vms-re...), and I bought a copy of [VMS Internals and Data Structures](http://www.amazon.com/VAX-VMS-Internals-Data-Structures/dp/1...).

Side-by-side, comparing VMS to UNIX, VMS's approach to a few key areas like I/O, ASTs and tiered interrupt levels is simply more sophisticated. NT inherited all of that. It was fundamentally superior, as a kernel, to UNIX, from day 1.

I haven't met a single person that has understood NT and Linux/UNIX, and still thinks UNIX is superior as far as the kernels go. I have definitely alienated myself the more I've discovered that though, as it's such a wildly unpopular sentiment in open source land.

Cutler got a call from Gates in 89, and from 89-93, NT was built. He was 47 at the time, and was one of the lead developers of VMS, which was a rock-solid operating system.

In 93, Linus was 22, and started "implementing enough syscalls until bash ran" as a fun project to work on.

Cutler despised the UNIX I/O model. "Getta byte getta byte getta byte byte byte." The I/O request packet approach to I/O (and tiered interrupts) is one of the key reasons behind NT's superiority. And once you've grok'd things like APCs and structured exception handling, signals just seem absolutely ghastly in comparison.

tremon 10 years ago | | |

> But the NT kernel is much more sophisticated and powerful than Linux

That does not follow from the example. All it shows is that Microsoft prefers to put a lot of functionality in one interface, while Linux probably prefers low-level functions to be as small as possible, and probably offers things like filtering on a higher level (in glibc, for example).

Neither explanation has anything to do with sophistication. I personally believe that small interfaces are a better design.
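As a rough illustration of filtering above the syscall layer, here is a sketch in Python. It assumes a POSIX-like system where `os.scandir` simply enumerates every entry and the name matching happens afterwards in library code; `list_matching` and `demo` are hypothetical helpers for this example, not anything glibc or the kernel provides.

```python
import fnmatch
import os
import tempfile


def list_matching(path, pattern):
    """List a directory, then filter by name in user space.

    os.scandir asks the kernel for every entry; the pattern match happens
    afterwards in ordinary library code, not inside the system call.
    """
    with os.scandir(path) as entries:
        names = [entry.name for entry in entries]
    return sorted(fnmatch.filter(names, pattern))


def demo():
    """Create three files in a scratch directory and keep only the *.txt ones."""
    with tempfile.TemporaryDirectory() as scratch:
        for name in ("a.txt", "b.txt", "c.log"):
            with open(os.path.join(scratch, name), "w"):
                pass
        return list_matching(scratch, "*.txt")


if __name__ == "__main__":
    print(demo())
```

The trade-off is that the kernel hands back every name even when the caller only wants a few, which is presumably part of why NT lets the filter travel down with the request.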

ckaygusu 10 years ago | | |

I think the problem here is not a syscall taking 11 parameters, it's a syscall that merely lists what is inside a directory taking 11 parameters. ataylor284_ explained the reasons (how convincingly is arguable), but at first sight that surely smells of bloat.

I'd also object to the NT kernel being called more "powerful". Sure, Unixy kernels and NT have their differences, but I don't think either one is superior.

darkengine 10 years ago | | |

It may be more "sophisticated" (sounds like a more positive synonym of "complex" to me), but I certainly don't think it's more powerful.

deprave 10 years ago | | |

Since when is kernel complexity a measure of quality...? :)

pjmlp 10 years ago | | |

Also UNIX devs seem to forget how cumbersome the X11, Xlib and Motif APIs are.

pbarnes_1 10 years ago | | |

This was maybe the case at Linux 2.0, but is not the case now.

Also, Windows development is infinitely more painful than Unix/Linux.

uudecode 10 years ago | | |

"... so its system calls are going to be necessarily more complicated."

Are you implying that an increase in "power" can never be achieved through increasing simplicity?

zxcvcxz 10 years ago | | |

>the NT kernel is much more sophisticated and powerful than Linux

Source?

It's not sophisticated enough or powerful enough to be the most used kernel on super computers (and in the world). Windows pretty much only dominates the desktop market. Servers, super computers, mainframes, etc, mostly use Linux.

A few years ago there was even a bug in Windows that degraded network performance during multimedia playback. It was directly connected to mechanisms employed by the Multimedia Class Scheduler Service (MMCSS), which is used on a lot of audio setups. If they can't even get audio setups right, how can people consider anything Windows releases "sophisticated"?

It's made to do anything you throw at it I guess, it's definitely complicated, but powerful and sophisticated aren't words I would use to describe NT.

tptacek 10 years ago | |

Overloaded system call entrypoints are a fact of life on all mainstream platforms. Consider for instance "ioctl".
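For anyone who hasn't run into it, here is a minimal sketch of what "overloaded" means, assuming Linux and Python's `fcntl`/`termios` modules. The single `ioctl` entry point takes a request code, and the meaning of the third argument depends entirely on that code; `bytes_readable` and `demo` are names invented for this example.

```python
import fcntl
import os
import struct
import termios


def bytes_readable(fd):
    """Ask the kernel how many bytes are waiting on fd (Linux FIONREAD)."""
    # ioctl(fd, request, arg): the request code picks the operation, and the
    # meaning of arg changes with it. Here arg is a buffer the size of a C int
    # that the kernel fills in with the pending byte count.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def demo():
    """Write five bytes into a pipe and ask how many are pending."""
    read_end, write_end = os.pipe()
    try:
        os.write(write_end, b"hello")
        return bytes_readable(read_end)
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    print(demo())
```

Swap in a different request code and the same call does something unrelated, which is the sense in which it is one entry point hiding many.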

marvy 10 years ago | | |

I've heard that Plan 9 doesn't have ioctl. But I guess that doesn't count as mainstream.

deprave 10 years ago | |

A lot of Microsoft APIs and subsystems are similarly bloated. There are probably tons of factors at play, but I believe being closed-source and having to support many individual use cases is one fundamental reason. (See for example CreateProcess vs. fork...)

bitwize 10 years ago | | |

When it comes to system call interfaces, it's because Dave Cutler has forgotten more than many modern "kernel hackers" will ever know about how to design an OS.

luchs 10 years ago |

>As of this article, lxss.sys has ~235 of the Linux syscalls implemented with varying level of support.

Is there a list of these syscalls somewhere? It would be cool to check it against the recent Linux API compatibility paper [0, 1].

[0]: http://oscar.cs.stonybrook.edu/api-compat-study/ [1]: http://www.oscar.cs.stonybrook.edu/papers/files/syspop16.pdf

besselheim 10 years ago | |

You piqued my curiosity - just made one by extracting the syscall dispatch table from lxcore.sys and placing it alongside the Linux syscall list: https://goo.gl/QHGe1U

A lot of coverage there, but interesting to see which ones aren't yet implemented, at least in the recent build 14342.

(I used Filippo Valsorda's work from https://filippo.io/linux-syscall-table as the Linux syscall data source.)

xorblurb 10 years ago | |

A list of (at least partly) supported syscalls is here: https://msdn.microsoft.com/en-us/commandline/wsl/release_not...

No details on which ones are fully or partly supported, though.

Maarten88 10 years ago |

I have installed the current fast ring build and have tried installing several packages on Windows. Some do install and work (compilers, build environment, node, redis server), but packages that use more advanced socket options (such as Ethereum) or that configure a daemon (most databases) still end with an error. Compatibility is improving with every new build, and you can ditch/reset the whole Linux environment on Windows with a single command, which is nice for testing.

_khhm 10 years ago | |

They've said the initial intent is for developers to use it, not for running servers / etc (which is why they only target Windows 10 client and not Windows Server OSs).

ygjb-dupe 10 years ago | | |

There is "running servers" in production and there is "running servers" in dev.

If I can't run the entire stack I use for dev under the subsystem then I will go the other route, which is to continue using VMs. I am excited about the initial release, and the prospect of being able to use Windows for all of the regular things I do, but it's clear that this isn't ready for primetime even as a dev tool.

stuaxo 10 years ago | | |

Yup, when I'm developing I need to run pretty much most stuff. I guess, I can install say postgres using the windows native version, but then we are back at square zero.

caf 10 years ago |

> Since NT syscalls follow the x64 calling convention, the kernel does not need to save off volatile registers since that was handled by the compiler emitting instructions before the syscall to save off any volatile registers that needed to be preserved.

Say what? The NT kernel doesn't restore caller-saved registers at syscall exit? This seems extraordinary, because unless it either restores them or zaps them then it will be in danger of leaking internal kernel values to userspace - and if it zaps them then it might as well save and restore them, so userspace won't need to.

trentnelson 10 years ago | |

I think that's referring to the prolog/epilog convention and "homing" of parameter registers, e.g.

    Frame struct
        ReturnAddress dq ?
        HomeRcx       dq ?
        HomeRdx       dq ?
        HomeR8        dq ?
        HomeR9        dq ?
    Frame ends

    NESTED_ENTRY Foo, _TEXT$00

    ; Home the four register parameters into the caller-provided slots.
    mov Frame.HomeRcx[rsp], rcx
    mov Frame.HomeRdx[rsp], rdx
    mov Frame.HomeR8[rsp], r8
    mov Frame.HomeR9[rsp], r9

    alloc_stack 64

    END_PROLOG

    ; *do stuff*

    BEGIN_EPILOG

    add rsp, 64
    ret

    NESTED_END Foo, _TEXT$00

https://msdn.microsoft.com/en-us/library/tawsa7cb.aspx
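If the offsets in that Frame struct aren't obvious, here is a quick way to check them, sketched with Python's `ctypes`. It assumes each `dq` slot is an 8-byte quadword laid out back to back starting at `[rsp]` on function entry; the `frame_offsets` helper is just for this example.

```python
import ctypes


class Frame(ctypes.Structure):
    # Mirrors the MASM struct: five 8-byte (dq) slots starting at [rsp] on entry.
    _fields_ = [
        ("ReturnAddress", ctypes.c_uint64),
        ("HomeRcx", ctypes.c_uint64),
        ("HomeRdx", ctypes.c_uint64),
        ("HomeR8", ctypes.c_uint64),
        ("HomeR9", ctypes.c_uint64),
    ]


def frame_offsets():
    """Map each field name to its byte offset from rsp at function entry."""
    return {name: getattr(Frame, name).offset for name, _ in Frame._fields_}


if __name__ == "__main__":
    for name, offset in frame_offsets().items():
        print(f"{name:14s} [rsp+{offset}]")
```

So the return address sits at `[rsp]` and the four home slots follow at `[rsp+8]` through `[rsp+32]`, which is the 32 bytes of home space the caller has to reserve.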

emcrazyone 10 years ago |

I can't think of much that would benefit from this except, perhaps, headless command-line applications. The one that comes to mind is rsync. Being able to compile the latest version/protocol of rsync on a Linux machine and then run the same binary on a Windows host would be nice, but the fun seems to end there. Plus, with Cygwin, this is largely a no-brainer without M$ help.

What about applications that hook into X Windows or do things like open the frame buffer device? I've got a messaging application that can be compiled for both Windows and Linux, and depending on the OS, I compile a different transport layer. Under Linux it makes heavy use of epoll, which is very different from how NT handles async I/O, especially with sockets. So my application's "transport driver" is compiled either from an NT code base using WinSock & OVERLAPPED I/O or from a Linux code base using epoll and pthreads.
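To show the shape of the Linux side, here is a minimal readiness-style sketch using Python's `select.epoll` (Linux only). It is not my actual transport code, and `wait_for_readable` is a made-up name: epoll only reports *when* a socket can be read, and the application then performs the read itself. With overlapped I/O the order is reversed: you hand the kernel a buffer up front and are told when the operation has completed.

```python
import select
import socket


def wait_for_readable(timeout=1.0):
    """Readiness model: epoll says when to read; the app then does the read."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(b.fileno(), select.EPOLLIN)
        a.sendall(b"ping")
        events = ep.poll(timeout)  # list of (fd, event mask) pairs
        ready = [fd for fd, mask in events if mask & select.EPOLLIN]
        # Only now does the application issue the actual read.
        return b.recv(16) if b.fileno() in ready else b""
    finally:
        ep.close()
        a.close()
        b.close()


if __name__ == "__main__":
    print(wait_for_readable())
```

That inversion is why the two code bases end up structured so differently, and why an emulation layer has to do real work to map one model onto the other.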

Overall it seems like a nice-to-have, but I'm struggling to extract any real benefit.

Can anyone offer up some real good use cases I may be overlooking?

quux 10 years ago | |

There are both free and commercial X servers for Windows, and you can get a linux app running under WSL to work with one of those X servers very easily. I played with it a little bit and it worked fine.

coverband 10 years ago |

With this feature, if you're a Linux developer, you're automatically a Windows developer as well. Almost like being able to run all Android or iOS apps on Windows phones.[1][2]

[1] http://www.pcworld.com/article/3038652/windows/microsoft-kil... [2] https://developer.microsoft.com/en-us/windows/bridges/ios

Edit: Now I am puzzled as to why this got downvoted?

besselheim 10 years ago | |

If you disassemble lxcore.sys you can still see hints of the Android subsystem project that it grew from: the \Device\adss and /dev/adss devices, the application name Microsoft.Windows.Subsystem.Adss, various function names containing "Adss", and some other textual references to Android.

Animats 10 years ago |

It's too bad that x86 hardware doesn't do virtualization as well as IBM hardware. You can't stack VMs. That's exactly what's needed here - a non-kernel VM that runs above NT but below the application.

pmalynin 10 years ago | |

You can have nested VMs.

https://www.kernel.org/doc/Documentation/virtual/kvm/nested-...

overgryphon 10 years ago | | |

Windows also now supports nested virtualization.

https://msdn.microsoft.com/en-us/virtualization/hyperv_on_wi...

geofft 10 years ago | |

I thought that a) the conclusion of VMware's "Comparison of techniques" paper [1] was that x86 and possibly everything is Popek-and-Goldberg-virtualizable [2] via binary translation, and b) the last several years of Intel and AMD chips all have hardware virtualization support, including nested virtualization, that made their architectures Popek-and-Goldberg-virtualizable in the obvious way?

[1] https://www.vmware.com/pdf/asplos235_adams.pdf

[2] https://en.wikipedia.org/wiki/Popek_and_Goldberg_virtualizat...

pjmlp 10 years ago | |

Looking at the way mainframes work, with their higher level languages, JIT compilers at kernel level, object databases, type 1 hypervisors, ....

It is quite interesting to see mainstream OSes increasingly adopting all those features.

Animats 10 years ago | | |

Very, very slowly. Microprocessors still have DMA instead of mainframe-like "channels", although we're starting to see MMUs on the I/O side. With channels, devices can't blither all over memory and neither driver nor device need be trusted.

kevincox 10 years ago |

> the Linux fork syscall has no *documented* equivalent for Windows

Emphasis is mine. I wonder if this is something that cygwin could (ab)use. Also I wonder why they would need this undocumented call.

wfunction 10 years ago | |

It's been tried and failed. See [1].

[1] https://cygwin.com/ml/cygwin-developers/2011-04/msg00036.htm...

bboreham 10 years ago | |

> Also I wonder why they would need this undocumented call.

To implement the first NT Posix subsystem, which was a FIPS requirement.

xorblurb 10 years ago | |

Cygwin is layered above Win32. Win32 has no provision to nicely handle forks. So even if there was an NT API fork syscall (I don't think there is on Windows 10; WSL does not use the NT API, and there is no longer a Posix/SFU/{Whatever Unix NT classic subsys of the day} as far as I know), this would not go anywhere.

pcwalton 10 years ago | | |

> So even if there was an NT API fork syscall

You can do it with NtCreateProcess: https://groups.google.com/d/msg/microsoft.public.win32.progr...

(The Win32 userland won't understand what you did, but you can still do it.)

rossy 10 years ago | | |

Cygwin programs technically run under the Win32 subsystem, but they're not that cleanly layered. The runtime calls into a lot of Nt* functions, including undocumented ones. midipix (which another commenter mentioned) is another Unix-like environment for Windows that also runs under the Win32 subsystem, and apparently it has successfully implemented a real copy-on-write fork() on top of undocumented NT syscalls, so it's definitely possible.
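For reference, this is the behaviour any fork() emulation has to reproduce, sketched in Python on a POSIX system (the `fork_isolation_demo` name is made up). The child starts with a copy of the parent's memory and its changes stay private. Note that this only shows the observable semantics; whether the pages are copied eagerly or copy-on-write is not visible from user code.

```python
import os


def fork_isolation_demo():
    """Return (parent_view, child_view) of a list the child mutates after fork()."""
    values = [1, 2, 3]
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy of the parent's memory; changes stay private.
        try:
            os.close(read_end)
            values.append(4)
            os.write(write_end, repr(values).encode())
        finally:
            os._exit(0)  # leave immediately without running parent cleanup code
    # Parent: read the child's report until EOF, then reap the child.
    os.close(write_end)
    with os.fdopen(read_end, "rb") as pipe:
        child_view = pipe.read().decode()
    os.waitpid(pid, 0)
    return repr(values), child_view


if __name__ == "__main__":
    print(fork_isolation_demo())
```

Emulating this by creating a fresh process and copying state across by hand is slow, which is why a real kernel-level fork is such a selling point.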

CUViper 10 years ago | |

midipix is trying to use it, and advertises copy-on-write fork as an advantage over Cygwin, but I don't know how well it works yet. http://midipix.org/#sec-midipix http://midipix.org/git/cgit.cgi/ntapi/tree/src/process

bla2 10 years ago |

Does anybody know how fork() is implemented? This blog post kind of sounds like fork() would do the slow emulation of it through CreateProcess().

xorblurb 10 years ago | |

fork() is properly implemented by the NT kernel. WSL is not layered above Win32.

obnauticus 10 years ago |

Excellent post, Jack.

quux 10 years ago |

Interesting, I wonder how much overhead is added to syscalls to look up the process type. Does NT still do this check when no WSL processes are running?

stuaxo 10 years ago | |

Pretty sure these are different entry points, so you wouldn't need to do anything different for normal Windows processes whether WSL is running or not.

quux 10 years ago | | |

I don't think so... both linux and windows binaries are using the same SYSCALL cpu instruction, and thus must be going to the same handler in the NT kernel.
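What I have in mind is roughly the toy model below. It is purely a sketch: the `is_pico` flag, the `dispatch` function and the NT table entries are placeholders I made up, not real kernel structures. Only the Linux x86-64 numbers are real.

```python
from dataclasses import dataclass

# Linux x86-64 syscall numbers (from the kernel's syscall table).
LINUX_SYSCALLS = {39: "getpid", 57: "fork", 78: "getdents"}
# Placeholder NT entries: the real numbers are undocumented and change between builds.
NT_SYSCALLS = {1: "NtExampleOne", 2: "NtExampleTwo"}


@dataclass
class Process:
    name: str
    is_pico: bool  # hypothetical flag standing in for the kernel's process-type check


def dispatch(process, number):
    """One shared entry point; the process type selects which table is used."""
    if process.is_pico:
        return ("pico provider", LINUX_SYSCALLS.get(number, "ENOSYS"))
    return ("NT", NT_SYSCALLS.get(number, "STATUS_INVALID_SYSTEM_SERVICE"))


if __name__ == "__main__":
    print(dispatch(Process("bash", True), 57))
    print(dispatch(Process("notepad.exe", False), 57))
```

If it works anything like this, the cost for ordinary Windows processes would be one extra check per syscall, whether or not any WSL process is running.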

_RPM 10 years ago |

Does Microsoft document all system calls?

detaro 10 years ago | |

They document the WinAPI, but how that talks to the kernel is not documented. You can talk to it directly if you want, but there is nothing from Microsoft on how to do that. So if you see those as the true system calls, they are not documented at all.

xorblurb 10 years ago | | |

Well, tiny parts of the NT API (callable from userspace) are documented, but then often with the caveat that they are not stable (in practice, even some undocumented ones can be considered stable if used by enough programs in the wild, especially if they are simple and standalone and have no Win32 equivalent)

The very precise mechanism, though, is extremely unstable. For example virtually every release of Windows (even sometimes SP) changes the syscall numbers. You have to go through the ntdll, which is kind of a more heavyweight version of the Linux VDSO. (The NTDLL approach was invented way before the VDSO, though)

davidgerard 10 years ago |

Yes, yes, but can we run Wine on it?

negus 10 years ago |

wtf is "pico process" and "pico driver"?

wereHamster 10 years ago | |

https://blogs.msdn.microsoft.com/wsl/2016/05/23/pico-process...

prirun 10 years ago |

Step 1: embrace

smegel 10 years ago |

Funny they don't mention ioctl.

vegabook 10 years ago |

Next step is Microsoft basically needs to turn Windows into a flavour of Linux. If they don't, they're under massive pincer threat from Android and Chrome, which are rapidly becoming the consumer endpoints of the future. Windows is about to "do an IBM" and throw away a market that it created. See PS/2 and OS/2.

They should probably just buy Canonical. That would put the shivers into Google, properly.

mxuribe 10 years ago | |

Funny, years ago i would have been reflexively flabbergasted at the thought of microsoft buying canonical (or any linux distro producer)...but actually thinking on that concept, and seeing the recent (perhaps less-than-hostile) approach that microsoft has taken towards open source and linux, that wouldn't be a bad idea. I mean if microsoft could have both offerings - for windows servers and ubuntu-installed servers - i suppose that would be a very smart business move. Assuming they don't actually butcher or deny resources to whatever linux company they would buy, i could see several benefits - not only to microsoft but to developers, system integrators, etc. worldwide. Hey if a side benefit is that it would spur the market (a la google, apple, etc.) a little - to the benefit of us civilians - that's cool too.

orionblastar 10 years ago | | |

I think Microsoft should do what Apple did with BSD Unix aka Nextstep and merge it with their old OS.

Microsoft should take the Windows GUI and put it over Linux as a desktop manager. Microsoft could sell the Windows GUI for Linux users that want to run Windows apps.

vegabook 10 years ago | | |

I've been heavily downvoted for the view, but the fact is, there are hundreds of billions of dollars being spent in the Linux ecosystem by corporations. Microsoft cannot afford not to be present in it. It's as simple as that. Canonical is starting to look like it's hitting Red Hat a bit on support contracts for corporations etc., so that's why I suggested that, but as you say, it could be another big and credible Linux distro (though Ubuntu all over the cloud must surely be tempting). Generally the idea that Microsoft wants to/must go big into Linux is uncontroversial, for me.

zxcvcxz 10 years ago |

I used to run Linux in a VM on Windows and use Chocolatey for package management and cygwin and powershell etc, then I realized I was just trying to make Windows into Linux. Seems to be the way things are going, and the addition of the Linux subsystem kind of proves that Windows really isn't a good OS on its own, especially not for developers.

I wish Windows/MS would abandon NT and just create a Linux distro. I don't know anyone who particularly likes NT and jamming multiple systems together seems like an awful idea.

Windows services and Linux services likely won't play nice together (think long file paths created by Linux services and other incompatibilities), for them to be 100% backward compatible they need to not only make Windows compatible with the things Linux outputs, but Linux compatible with the things windows services output, and to keep the Linux people from figuring out how to use Windows on Linux systems they'd need to make a lot of what they do closed source.

So I don't see a Linux+Windows setup being deployed for production. It's cool for developers, but even then you can't do much real world stuff that utilizes both Windows and Linux. If you're only taking advantage of one system then what's the point of having two?

I went ahead and made the switch to Linux since I was trying to make Windows behave just like Linux.

l3m0ndr0p 10 years ago |

Pretty neat stuff. I think that MS should just create their own Linux Distribution & port all MS products. Get rid of the Windows NT Kernel. I believe it's outdated & doesn't have the same update cycle that the Linux Kernel has.

Why run a Linux Application/binary on a windows server OS? When you can just run it on Linux OS and get better performance & stability.

jjtheblunt 10 years ago | |

What makes you believe it's outdated?

zxcvcxz 10 years ago | | |

Can you show me the source so I can check?

serge2k 10 years ago | |

> Get rid of the Windows NT Kernel. I believe it's outdated & doesn't have the same update cycle that the Linux Kernel has.

Curious why you claim this? What's outdated about the NT Kernel?

l3m0ndr0p 10 years ago | | |

Here are some, or maybe this is not part of the NT Kernel...

1. The use of drive letters A-Z for file system access.
2. Creating symbolic links to files and folders, like you can in Unix/Linux. You have to set a setting somewhere to enable this, but there's a security risk.
3. Standard functional/usable non-gui terminal application like Unix/Linux ssh. PowerShell doesn't come close.
4. Ability to SUDO or su Admin like Unix/Linux.

Maybe these are not kernel related above, but the OS specific layer.