Windows: A software engineering odyssey (2000) (usenix.org)
It's a wonder the thing ever worked at all.
Of course, things got better as time went on thanks to process improvement. I started in 1991, and I remember driving over to the NT team's building with a large (for the time) hard drive to grab a physical copy of the source tree. This was before NT was first released - when you tried running your build and went to shut it down, you had to watch the activity LED on the drive flash a few times to be sure the cache had synced to disk and powering down was safe. Fast forward a few years, and building all of WinNT was more routine, to the point it was just another component built by the automated VC++ checkin procedure (we called it submitting code to The Gauntlet), along with Excel and other Office components.
I might be misremembering whether NT was part of Gauntlet, but it was definitely something we could and would build as desired.
Windows 2000 was certainly complex, but was it really substantially more complex than a full Linux distribution (including compilers, desktop environment, office suite, etc)? Why was it so difficult to build from scratch?
Also, they have about 10k people working on Windows (and devices) and about 10k people working on ads nowadays (that paints a good story of priorities).
Source: 2nd hand from MS friends
An interesting question is: why are we still having the same problems today? Why haven't they been solved yet?
https://www.amazon.de/Show-Stopper-Cloth-BREAKNECK-GENERATIO...
Dave was an engineer on NT and creator of Task Manager and zip folders. Lots of interesting stories and anecdotes from that period on that channel.
Serialized Development
The model from NT 3.1 -> Windows 2000
All developers on team check-in to a single main line branch
Master build lab synchs to main branch and builds and releases from that branch
Checked in defect affects everyone waiting for results
Diagram: Developer, Developer, Developer, Developer -> Single Main Branch -> Product Build Machine -> Product Release Server
Ouch, and it looks like they only had version control with branching for the last nine months of development.
You had patches you'd float with "changelists" on top of enlistments. Each part of the org that is large enough (for example Excel or Word) gets a "branch", and it gets "forward integrated" and "reverse integrated" to the main "branch".
From your perspective, with the tool used to submit stuff (usually usubmit), you just push to the same branch as everyone else in your org, and if your code breaks things it gets "backed out" by an automatic process.
Using git now is so much nicer.
Windows sources 20 years ago used to have a ridiculously complicated branching strategy, driven by middle managers and made worse by having actual devs sneak around the edges to do "buddy builds" of changes with some godawful batch file that I heard may have originated with RaymondC (who was exactly the kind of person to make ridiculous MSFT somehow bearable for the rest of us). It was Conway's Law, somehow twisted and applied to version control. With permissions SNAFUs.
I still see companies today trying to map their org chart into their branching strategy and just shake my head . . . and run away.
I'm pretty sure XP came with a digital app store right at the top of the redesigned Start menu, Windows Media Player had ten music stores integrated all selling DRM'd WMA files...
The system requirements especially, must have created a lot of work right down into the kernel team.
Like, rhetorical dude from 2002, you're mad that Windows XP will not let you remove Internet Explorer easily and that it requires online or phone activation to work? Let me tell you about Windows 11...
Yet another thing where Microsoft was ahead of the curve; nowadays we get Electron (aka Chrome) all over the place.
People even buy laptops where the browser turned into the OS!
Apologies. This was me. Pretty much all 10. Most of them were just white-labels of the same code. Believe me, I hated doing it. MS didn't want to do the right thing and vertically-integrate everything like Apple was doing, which was the better solution as then you owned the entire user experience from end-to-end.
We know how that story ended.
I think people are forgetting how unreliable Windows was in its early days. If you were doing anything complex (programming, editing pictures, ...) Windows couldn't run for 2 hours without crashing every so often.
If anything, the core of the Windows operating system has only gotten better with time. Yes, they keep adding fluff to the desktop environment, but that doesn't take away from the progress they have made in stabilizing their core operating system.
I'm really curious which version of Windows you mean.
Because I don't remember this on Win 3.11, Win 95, Win XP, etc. Of course there were sometimes hardware/driver issues, and sometimes programs corrupted system files, etc. But crashing every so often... that's strange.
Not to mention being as easy to attack as a house made of butter.
That sounds like Windows 3.1, where applications could easily take down the operating system. Windows 9x wasn't quite as bad. If I recall correctly, properly written applications could not take down the operating system, though drivers certainly could. That said, there were certainly ways for developers to break the rules, since there was little (if any) enforcement, so some applications did take down the operating system. With the Windows NT series, there was sufficient isolation, and enforcement of that isolation, that it was very reliable. Drivers could be an issue, as could bugs in Microsoft's code, but that was nothing in comparison to contemporary versions of 3.1 and 9x.
On the whole, I don't think it is reasonable to blame Microsoft for the reliability of their operating system. There were certainly design issues that resulted in it being unreliable, especially when running third-party code. On the other hand, the operating system was basically an evolution of a product line that started on the 8088 with very limited memory (I'm speaking of PC-DOS here) and a great degree of compatibility had to be maintained. Keep in mind, the computer industry did not work at the same pace: features had to wait until processors incorporated them, processor adoption had to wait for manufacturers to build them into their systems, and then consumers had to buy those systems in sufficient numbers. For example: the 286 was introduced in early 1982, but the IBM PC AT did not come out for another 2.5 years. Microsoft was also limited by the hardware their customers owned, even when it supported particular features. Life is much harder when you cannot throw memory at the problem because people had 2 or 4 or 8 MB of RAM.
On the other hand, Windows NT was a completely different product. There was much less concern over compatibility. There was much more intent to throw away baggage to create a modern (for the time) operating system. It did not crash every two hours.
1. budget devices from OEMs that cut corners at every cost
2. capacitor plague with merchants unable to guarantee good capacitors from any source
Windows, despite its legitimately annoying monetization strategy, has absolutely done the opposite - it does More Stuff every release, and the stuff it did before largely still works.
Do you have some examples of how macOS is doing less / capable of less today, than say 1 or 2 or 3 releases ago?
Another would be fragmenting the settings between the control panel and the new settings menu. It does more stuff (you have twice as many settings apps!) but it is less useful, because you are less likely to find the setting you are looking for.
Another example of doing more and becoming less useful is requiring a TPM for Windows 11. My security should be my decision. Not letting one install Windows obviously makes Windows less useful than if it could be installed.
In general (ie, not a Windows specific issue) ever growing hardware requirements makes the software less useful over time, as it can only run on a smaller and smaller subset of hardware. As software gets better, it should run on more hardware than it did before. Not less. Windows will simply not run on hardware from 15-20 years ago that is otherwise fully functional. That means it is less useful than it was before.
I wouldn't say "doing more" is better. I'd be happy if it did a lot less. I don't care about most of the big new features in Windows. I'd be a lot happier if they'd rework their old antiquated stuff that keeps causing problems (drivers, registry, focus handling, etc.).
> Apple is making its desktop OS more "secure" (read: convoluted and does less stuff)
What is Apple really making less useful with time? For me, I really like many of the new features. The only reason I stick to Windows is that gaming is still horrible on macOS.
I agree there are definitely shitty chunks of Windows, but there are still some very solid foundations there to this day.
https://devblogs.microsoft.com/dotnet/working-through-things...
But I miss the ability to only pull down a portion of a monorepo, and the ability to remap where folders are at, or to pull down a single folder into multiple locations.
So much bullshit with monorepos in Git land exists because Git doesn't support things that Source Depot (and Perforce I presume) supported decades ago.
As an aside for those who don't know what I am talking about, when pulling down a repo in source depot you can specify which directories to pull down locally, and you can also remap directories to a different path. This is super useful for header files, or any other sort of shared dependency. Instead of making the build system get all funky and fancy, the source control system handled putting files into expected locations.
So imagine a large monorepo for a company where some shared CSS styles exist and they always end up in every project's `styles` folder or what have you.
Or the repo keeps all shared styles in a single place, and you can then import them into your project, but instead of build system bullshit you just go to your mappings and tell it to pull the proper files and put them into a sub-directory of your project.
It is a really damn nice-to-have feature. (That also got misused a ton...)
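To make the mapping idea concrete, here is a toy model in Python. It is only a sketch: the depot paths, the `VIEW` table and the `local_paths` helper are all made up for illustration, and this is not Source Depot's or Perforce's actual view syntax or behaviour.

```python
# Hypothetical client view: (depot prefix, local prefix) pairs.
# All paths here are invented for illustration.
VIEW = [
    ("//depot/shared/styles", "apps/shop/styles/shared"),
    ("//depot/shared/styles", "apps/blog/styles/shared"),  # same folder, second location
    ("//depot/apps/shop", "apps/shop"),
]


def local_paths(depot_path, view=VIEW):
    """Return every local path a depot file maps to ([] if it is not mapped)."""
    results = []
    for depot_prefix, local_prefix in view:
        # Match the prefix itself or anything below it, but not "//depot/apps/shopping".
        if depot_path == depot_prefix or depot_path.startswith(depot_prefix + "/"):
            results.append(local_prefix + depot_path[len(depot_prefix):])
    return results


shared = local_paths("//depot/shared/styles/base.css")
shop = local_paths("//depot/apps/shop/main.js")
unmapped = local_paths("//depot/apps/other/readme.txt")
print(shared, shop, unmapped)
```

The point is that "only pull part of the repo" and "put this folder where the project expects it" both fall out of the same table: anything not listed is simply never synced, and the build system never has to know where the shared files really live.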
We have all that with git in Microsoft though. We don't check out the entire office monorepo - only the parts relevant to what you're working on (Excel in my case).
Also, sharing stuff in SourceDepot wasn't the bad part (you get links to changelists and those open in a desktop program). The bad part was the branching model, commits, no real/good CI (we had a commit queue), etc. SourceDepot was just overall a bad SCM for us.
I’m moderately confident the correct path is monorepo + centralization + virtual filesystem. Not every tool plays nice with VFS but at this point most do.
The D in DVCS is almost entirely a waste. Source control systems should, imho, trivially support petabytes of history and terabyte scale clones.
We did move stuff we could to other git repos inside Microsoft.
SourceDepot is still running for some stuff and is still awful but git is working great.
> Also, they have about 10k people working on Windows (and devices) and about 10k people working on ads nowadays (that paints a good story of priorities).
I'm not sure I'm privy to all the information, but looking at the org chart, this part is false. The ads org is much, much smaller than E+D.
> Windows is only $5m a year
https://news.ycombinator.com/item?id=34934946
I was very impressed to determine that was only $416k/mo. Since I read that I've been like "that can't be right." (There's certainly no qualification of scope to work with.) That's roughly 15-20 (~$250k-$333k) senior developer salaries.
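For anyone who wants to check the arithmetic, a quick sketch in Python. The only input is the "$5m a year" figure quoted above; the 15 and 20 headcounts are just the two ends of the range mentioned.

```python
ANNUAL = 5_000_000  # the "$5m a year" figure quoted above

monthly = ANNUAL / 12
print(f"per month: ${monthly:,.0f}")

# Cost per person if the whole amount went to 15 or 20 salaries.
per_person = {headcount: ANNUAL / headcount for headcount in (15, 20)}
for headcount, cost in per_person.items():
    print(f"{headcount} people: ${cost:,.0f} each")
```

So the monthly figure and the ~$250k-$333k range are consistent with each other; whether the $5m number itself is right is a separate question.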
I'm very curious how and where Windows practically fits into the pie chart nowadays, mostly just from the perspective of a passively curious person who likes to file away watermarks and yardsticks :)
There's probably some perfectly externally-facing info out there under a rock I'm not sure where to look for...
Look at Panos' org and compare to the WebXT org (both under E+D).
I did, it was just a very long and complicated process. You had to set up a lot of tooling for it and you were strongly discouraged from doing it so in the year of SourceDepot (on Office) I saw this option being used exactly once.
And why our brilliance does not make a difference: it is a human problem :)
NT solved that problem by not allowing a lot of that nonsense, breaking code in the process. This incompatibility is the reason new Windows 95/98 PCs are produced until this day (https://nixsys.com/legacy-computers/windows-95-computers, https://nixsys.com/legacy-computers/windows-98-computers): back in the Win9x days, programming your computer like you would program a microcontroller today was quite a reasonable thing to do for certain applications, like controlling production lines.
There is the uptime overflow bug to deal with, but a monthly reboot is easier than reverse engineering and porting control software.
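Assuming the bug meant here is the well-known 32-bit millisecond tick counter wrapping around, the numbers work out like this (a sketch in Python; `elapsed_ms` is a hypothetical helper showing the usual wrap-safe way to compare two ticks, not code from any Windows component):

```python
WRAP = 2**32                      # a 32-bit counter holds this many distinct values
MS_PER_DAY = 24 * 60 * 60 * 1000

wrap_days = WRAP / MS_PER_DAY     # how long until a millisecond counter wraps
print(f"counter wraps after {wrap_days:.2f} days")


def elapsed_ms(start_tick, now_tick):
    """Wrap-safe elapsed time: do the subtraction modulo 2**32."""
    return (now_tick - start_tick) % WRAP


# A timer started 100 ms before the wrap and read 50 ms after it.
naive = 50 - (WRAP - 100)         # hugely negative, which is how code gets stuck
safe = elapsed_ms(WRAP - 100, 50)
print(naive, safe)
```

That is a little under 50 days, which is why a monthly reboot stays comfortably inside the window.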
There are two levels of features here (maybe three) that we should consider:
- There are consumer facing features, the stuff pushed by marketing departments since it will grab the attention of customers and (perhaps) make it more desirable for customers. A lot of this is targeted towards specific groups of users, while being less useful to others, and goes out of fashion very quickly (assuming it ever went into fashion).
- There is the infrastructure. This stuff is harder to sell users on because relatively few people care about the details. It includes everything from exposing functionality to developers to improving performance and security. Sometimes it turns out this functionality is only of interest to a limited subset of developers. Sometimes it is retrospectively seen as a problem that needs to be addressed. Either way it is very difficult to alter or remove because other software depends upon it. (Heck, even internal software depends upon it. While they may have the means to update internal software, that doesn't mean they have the resources to.)
I'm tempted to split the second category into two, but the net effect is the same so we may as well keep it simple.
As for the Apple thing, well, Apple has a more focused market. Choosing Apple also tends to be a conscious decision, while choosing Windows tends to be more a default position. For those reasons, I have no doubt that macOS is a better OS in the eyes of its users than Windows is in the eyes of its users.
https://www.oreilly.com/library/view/windows-xp-professional... "Windows Catalog"
[1] https://web.archive.org/web/20011113052730/http://www.micros...
[2] https://web.archive.org/web/20020409123842/http://www.micros...
AFAIK it is possible to do this on Linux (either through mprotect + SIGSEGV or userfaultfd) but it's slow. But there's a work-in-progress patch that the Collabora folks (probably on a contract from Valve, if I had to guess, as some games do use this) are working on, which will add a new fast way of doing this.
Even in current form, userfaultfd is useful for GC, so Linux's lack of the feature in 2015 was unfortunate. Android 13 added a new GC taking advantage of userfaultfd: https://android-developers.googleblog.com/2022/08/android-13....
> A new garbage collector based on the Linux kernel feature userfaultfd is coming to ART on Android 13... The new garbage collector... leading to as much as ~10% reduction in compiled code size.
The nice thing about COM is that it provides a well-defined, C-based ABI for calling object-oriented interfaces; if your language has a FFI that supports C, then you can call COM objects.
I’m a big believer that COM bindings for any language with automatic memory management should not expose refcounts directly to the programmer (at least in 90% of cases). It’s not far fetched — the original, pre-.NET Visual Basic did a very good job of this.
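A rough sketch of what "not exposing refcounts" can look like, in Python. Everything here is hypothetical: `FakeComObject` only imitates the AddRef/Release contract of IUnknown, and `ComRef` is an invented wrapper, not the API of any real COM binding.

```python
class FakeComObject:
    """Stand-in for a COM object; models only AddRef/Release plus one method."""

    def __init__(self):
        self.refcount = 1          # the creator starts with one reference
        self.destroyed = False

    def AddRef(self):
        self.refcount += 1
        return self.refcount

    def Release(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.destroyed = True  # a real object would free itself here
        return self.refcount

    def Greet(self):
        return "hello"


class ComRef:
    """Owns exactly one reference and releases it exactly once."""

    def __init__(self, obj):
        self._obj = obj            # takes over the caller's reference

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False               # never swallow exceptions

    def close(self):
        if self._obj is not None:
            self._obj.Release()
            self._obj = None

    def share(self):
        """Hand out a second owner; AddRef happens here, not in user code."""
        self._obj.AddRef()
        return ComRef(self._obj)

    def __getattr__(self, name):
        # Forward ordinary methods, but keep the raw refcount calls hidden.
        if name in ("AddRef", "Release"):
            raise AttributeError(name)
        return getattr(self._obj, name)


raw = FakeComObject()
with ComRef(raw) as ref:
    other = ref.share()            # refcount goes 1 -> 2
    greeting = ref.Greet()
count_after_with = raw.refcount    # the with-block released one reference
other.close()                      # last owner gone, object destroyed
```

User code only ever sees `share()`, `close()` and the object's own methods, which is roughly the experience classic Visual Basic gave you without any explicit wrapper at all.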
The goals were good, and other platforms haven't really tried to achieve them (KParts and Bonobo were the closest equivalents but both were abandoned a long time ago, DBUS isn't quite the same thing). But COM was fiddly.
Ran out of memory? BAM. An official driver from Intel or nVidia or ATI did something slightly off-time because silicon decided to wait a clock for something, BAM. You had a professional capture card with high bandwidth for that time, and you wanted to capture a video, BAM.
A blue screen because of a spinlock access violation, a Windows bundled driver, or any high-end software was common back in these days.
This is how you can tell if people grew up in that era: they have muscle memory of Ctrl + S every few minutes burned into their soul.
I'm honestly afraid that my child will be born with it and do that pinky-midfinger combo in the air, like playing air guitar, on day 1.
It's arguably a bit shit from a business perspective, but has no real impact on power-users day to day.
I enjoy working for Microsoft (mostly) but I have _no idea_ what our sales look like.
Also, the first beta of NT 5.0 shipped in September 1997 and it was renamed to Windows 2000 in October 1998 (https://en.wikipedia.org/wiki/Windows_2000#History), and 1998 vs 2003 is about 3 times 1½ years, so, at the time, about three performance doublings.
Chances are your hardware was at least 5 times as powerful as what the early Windows 2000 engineers used.
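Spelling the estimate out (a sketch in Python; the 1½-year doubling period is the rule of thumb used above, not a measured number):

```python
start_year, end_year = 1998, 2003
doubling_period_years = 1.5        # rule-of-thumb assumption from the comment above

doublings = (end_year - start_year) / doubling_period_years
speedup = 2 ** doublings           # unrounded estimate
speedup_rounded_down = 2 ** int(doublings)

print(f"{doublings:.2f} doublings, roughly {speedup:.0f}x "
      f"(or {speedup_rounded_down}x with three whole doublings)")
```

Either way the rule of thumb lands well above 5x, so "at least 5 times as powerful" is the conservative reading.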
Yes, I would frequently download e.g. new kernel release tarballs (this was before Git) and slot them into the system. This didn't require recompiling anything but the kernel. Actually installing Stage 1 Gentoo required compiling everything (although it was on top of a compiler binary for bootstrapping).
My hardware was cheap 2001 era consumer hardware, so I doubt it was that much faster than what the Windows developers had available. Besides, my question is more about why Windows (or anything else) would be difficult to compile, rather than just time-consuming. The nice thing about recompiling an entire operating system from scratch is that there are no external dependencies, because you're building everything! (Except the bootstrapping compiler, but for the Windows operating system there's no reason to rebuild that.)
I think that's a huge help. I think that it's also helpful that the system is intended to be compiled by a bunch of other people and the code is released with that in mind.
My main question is also not so much why compilation of Windows should be time consuming, but why it should be difficult.
Did it? It had more third party drivers, but did NT itself build in that many?
There's no reason to build everything from scratch. It's like working on a patch to e.g. KWrite and deciding to build the kernel in order to do it. If you're working on a Windows component, you install a daily build so it's close to your equivalent of main/master, write your code, and overwrite the binaries on your test machine / test VM. Your development loop is pretty fast in practice.
It's also an uphill battle against the ever encroaching Microsoft Edge bullshit; every time you remove part of the bullshit, Microsoft comes out with an update that adds more.
If you're stuck with Windows I'd consider the safe defaults for ShutUp1x as essential but you do need to read the notes for every setting you enable, which may require some Googling so you understand what you're doing.
Semi-related, I try to use symlink shenanigans in git to share common files between monorepo projects w/o using 3rd party tooling, but my latest attempt worked on Windows but the symlink fell apart when the repo was pulled down on a Mac!
Not the OS that I thought would have issues. :)
Working offline is distinct from distributed. In practice almost all development is de facto centralized on GitHub (or another central host).
> In software development, distributed version control (also known as distributed revision control) is a form of version control in which the complete codebase, including its full history, is mirrored on every developer's computer.
That’s a super mega anti-feature to me. Git still sucks for large binary files which is an insane limitation.
The Linux kernel is not, and Git was designed by the creator of the Linux kernel to serve the needs of the Linux kernel developer community. And I am certain they are not the only ones with that workflow.
Git makes fundamental design choices that are (maybe possibly but not necessarily) good for the Linux kernel. They’re objectively bad and problematic for the majority of dev work. Which makes it really fucking shitty that the industry standardized on a tool that is bad for standard workflows.