Why the CrowdStrike bug hit banks hard (bitsaboutmoney.com)
So not really a failure of IT, at least not for this reason.
I know, not really the DailyWTF material that the majority of HNers were led to believe.
I don’t know if I somehow just have little exposure to Windows in my life or if there’s an untold resiliency story for the global internet in the face of such a massive outage.
All I can say is THANK YOU to all the unsung heroes who answered the call and worked their butts off. Infrastructure doesn’t work without you. We see you & we thank you!
My first question was "do you shut your machine off at the end of the day?" She did, and that's probably why about half of her office was affected, and the other half was not.
Can't update it if it isn't on.
But they took out a far larger fraction of the installed base in regulated industries. The very industries that are tightly regulated because they are supposed to keep the wheels of society turning.
Supply chain risks are everywhere, and in regulated industries they are highly concentrated.
I asked the guy at the luggage counter, and he said the day before was pretty crazy, but they had everything straightened out by the next day.
Vanguard.co.uk was down.
But yes, I echo your feelings. When you examine how complex everything is under the hood it's almost unbelievable that anything works.
But yeah, other than that, the only issue we ran into was that the Jimmy John’s we stopped at for lunch outside of MSP was slammed because Delta had ordered hundreds of sandwiches for their staff.
I’ve definitely experienced much worse travel disruptions due to normal weather (though obviously we got real lucky compared to some Delta customers).
The section that goes over why this wasn't federally pushed is largely accurate, mind. Not all capture is at the federal level. It's why you can get frustrated with customer support for asking you a checklist of questions unrelated to the problem you have called in about.
And the super frustrating thing is that these checklists are often very effective for why they exist.
This would be the third incident I'm familiar with of a file of entirely zeroes breaking something big.
Folks, as much as we wish it weren't true, null comes up all the damn time, and if you don't have tests trying to force-feed null into your system in novel and exciting ways, production will demonstrate them for you.
Never assume 'zero' (for whatever form zero takes in context) can't be an input.
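One way to act on this is a test that force-feeds degenerate inputs into a parser. The sketch below is Python with a made-up file format (`MAGIC`, `parse_content_file` and the record layout are all hypothetical, not anything a real vendor uses); the only point is that empty, all-zero and truncated inputs get rejected with a clean error instead of being trusted.

```python
import struct

MAGIC = b"CFG1"  # hypothetical 4-byte magic for an illustrative format


class InvalidContentFile(ValueError):
    """Raised when a content file fails validation."""


def parse_content_file(data: bytes) -> list[int]:
    """Parse MAGIC, a little-endian uint32 record count, then that many uint32 records."""
    if len(data) < 8:
        raise InvalidContentFile("file too short for header")
    if data[:4] != MAGIC:
        raise InvalidContentFile("bad magic (all-zero files land here)")
    (count,) = struct.unpack_from("<I", data, 4)
    if count == 0:
        raise InvalidContentFile("zero records")
    if len(data) != 8 + 4 * count:
        raise InvalidContentFile("length does not match record count")
    return list(struct.unpack_from(f"<{count}I", data, 8))


def rejects(data: bytes) -> bool:
    """Return True if the parser refuses the input with a clean error."""
    try:
        parse_content_file(data)
    except InvalidContentFile:
        return True
    return False


# Degenerate inputs that production will eventually supply.
HOSTILE_INPUTS = [
    b"",                                       # empty file
    bytes(4096),                               # a file of entirely zeroes
    MAGIC,                                     # truncated header
    MAGIC + struct.pack("<I", 0),              # header claims zero records
    MAGIC + struct.pack("<I", 3) + bytes(4),   # count larger than the payload
]
```

Running `all(rejects(d) for d in HOSTILE_INPUTS)` in a test suite is the cheap version of "force-feeding null"; a fuzzer is the thorough one.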
> Many contractors are small businesses. Many small businesses are very thinly capitalized. Many employees of small businesses are extremely dependent on receiving compensation exactly on payday and not after it. And so, while many people in Chicago were basically unaffected on that Friday because their money kept working (on mobile apps, via Venmo/Cash App, via credit cards, etc), cash-dependent people got an enormous wrench thrown into their plans.
I never really thought about not having to worry about cashflow problems as a privilege before, but it makes sense, considering having access to the banking system to begin with is a privilege. I remember my bank's website and app were offline, but card processing was unaffected - you could still swipe your cards at retailers. For me, the disruption was a minor annoyance since I couldn't check my balance, but I imagine many people were probably panicking about making rent and buying groceries while everything was playing out.
For example (making up numbers here): if 75% of all airline computers have CrowdStrike Falcon installed, that seems like a very concentrated risk.
I actually wouldn't be surprised if, had we this data, we saw really high concentrations of a small number of vendors in any industry.
Australia got hit hard because they modernized their bank systems and now most are cloud based. I am not aware of any major bank running their core systems on the cloud or on windows.
You mean they made them more vulnerable?
I work in medical device software -- the stuff that runs on machines in hospital labs, ER's or at patient bedside.
The first "ohmigod do we need to recall this?" bug I remember was an innocuous piece of code that was inserted to debug a specific problem, but which was supposed to be disabled in the "non-debug" configuration.
Then somehow, the software update shipped with a change to the configuration file that enabled that code to run. Timing-critical debug code running on a real-time system with a hard deadline is a recipe for disaster.
Thankfully, we got out of that pretty easily before it affected more than a small handful of users, but things could have been a lot worse.
To answer the question, CrowdStrike is a global company with thousands of employees around the world. Not sure why the EU wasn't hit as hard.
There is something to be said for a diverse banking industry when it comes to this kind of problem. Also, this event is a powerful argument for keeping the core systems on unusual mainframe architectures. I think building a bank core on windows would be a really bad choice, but some vendors have already done this.
Hospitals, for instance, weren't that widely affected as they barely have any money to buy security tooling.
Silver linings and all that, I guess.
Everybody seems to be quick to forget about WannaCry.
A kernel-level driver from a third party is something that you willingly add to the OS; it wasn't there to begin with.
Just because Windows allows you to do it doesn't mean you should.
I mean, you can apply some dangerous mods to your car's engine, but you probably shouldn't, and if you do, it's your responsibility, not the car company.
It's an old term at this point, but I don't think the reasons for it being called "userspace" have changed or become outdated since then, so I wouldn't call them historic per se.
Decide who you're writing for, and write to that audience.
He has, and he does.
"In which an HN commenter offers me writing advice but fails to understand the implication of second sentence"
Where is the "user" when the machine is a Windows box stuffed behind a façade wall that displays airport directions, notifications, and ads on rotate?
And banks/airlines etc were hit hard because their _Windows_ didn't boot, not because of an application crash on a perfectly working Windows.
Fictional statements like this make me reluctant to read further, and inclined to ignore the source of such "news" in the future.
Also, bragging about your inability to read text seems an odd way to interact.
I'm not so sure about this:
> money is core societal infrastructure, like the power grid and transportation systems are. It would be really bad if hackers working for a foreign government could just turn off money.
Sure, it would be inconvenient in the short term. But I think the current design is holding us back.
I suspect that most of us would have more to gain than to lose if we managed to shut off money-as-we-know-it and keep it off for long enough to iterate on alternatives. Any design that even tried to step beyond "well that's how we've always done it" would likely land somewhere better than what we're doing. Much has changed since Alexander Hamilton.
Unless the OS is locked down to the point that even its owner cannot do that. Actually, this is something I like about Operational Technology: you run into a lot of doodads where the elevation process requires turning a physical key, and the device's main functionality is disabled while it is in service mode. Ofc the doodad has to be engineered to operate reliably, perpetually, for years, and you can't really expect that from a desktop computer.
If such outages were more frequent, then it could definitely become a liability. But such risks have to be balanced against the risk of being compromised and leaking customer data and other confidential trade secrets, and the risk posed by the latter one is far higher, not to say it's also more common.
It's the only way to detect certain types of advanced threats.
All of these requirements essentially become transitive across a company's entire supply chain.
* Big bank needs to comply with X, so do all of their vendors.
* Vendor wants to sell to big bank, so they comply with X. They also need all of their vendors to comply with X.
* So on and so on.
----
Ultimately, there are a lot more options than CrowdStrike, but this is a case of "Nobody gets fired for buying IBM". Even if CrowdStrike isn't the "best", it's good enough. Because its use is so widespread, an issue with it often affects dozens and dozens of other companies when you're affected. One of the great things about this effect is everyone "goes down at the same time", so people don't tend to point fingers at you. In fact, they might not have any clue you're down because some other, more critical system is down internally and preventing them from accessing you.
I remember a similar situation happening a few years back. A big outage hit large parts of the internet. A pretty major part of our app got taken offline with this outage. This was a known risk and something that we accepted. We expected some backlash and inquiries if this situation should ever happen. It was a calculated risk to dedicate more effort towards building customer-facing value.
I think we got one inquiry. It was basically just an FYI. This person had so many things broken on their end that "one more thing" being broken was just a drop in the bucket.
On the other hand, count me surprised at the sales prowess of Crowdstrike, I did not know how big they were.
I disagree. Long term, the fundamentals of CRWD continue to remain unabated.
Endpoint protection is still a critical need no matter what - for every bug like CRWD's, there's always a company you can point to whose operations were shut down due to an attack.
CRWD skimped on QA and customer support, but long term there aren't many other vendors that can provide a similar service, and CRWD is large enough to pull a PANW and M&A into entirely new segments (eg. DSPM with Flow Security, Observability/Data Lake with Humio, ASPM with Bionic) along with greenfield category makers like Charlotte AI for AI Security and AI EDR.
There will be short term pain for CRWD's Windows endpoint business with churn to MDE, SentinelOne, Tanium, etc but they have enough dry powder and a diversified security portfolio that they can safely recover within a year at most.
> crush their sales pipeline
With CRWD sized companies, most of their revenue comes from multi-year contracts and renewals.
They'll probably have a decently large layoff in the sales org, but enterprise sales tends to be fairly stable due to contract sizes along with riders about liability.
A lot of lawsuits are going to be thrown out, I think.
Windows cannot simply "skip" failed drivers. Say the Crowdstrike driver failed as a one-time thing and Windows skipped it instead of retrying, which left the endpoint vulnerable, and a ransomware attack happened. We'd be saying the opposite now.
This is a high-impact ability Windows offers to applications - and applications should take responsibility and treat it as such.
I spoke to another EDR lead I know - they said they had provisions in place to read the dump if boot crashed, check if it was due to their driver and skip it if it was (and then send telemetry after startup so that it can be fixed, probably). Crowdstrike should have done the same.
One more thing to note is that we cannot say Windows shouldn't provide this ability - that becomes an anti-trust monopoly, because MS themselves are a competitor in this space.
We'd end in a situation similar to Mac OS where there's a single gatekeeper and whole industries are subjected to the will of the platform owner.
Enterprises have chosen Windows because of that flexibility and control, while having a business partner they don't get with Linux. If anything the blame should fall on them for getting hosed even as they fully had the means to avoid that situation.
Furthermore, if a driver is marked as optional and crashes, Windows can reboot with that optional driver disabled next time, preventing infinite crash/boot loops. Obviously that's no good if your antimalware driver gets disabled, so they can mark theirs as "required." Obviously in the CrowdStrike case, we got the worst of both worlds.
Maybe this is the loophole that needs closing. You can't claim a driver is certified for Windows if the manufacturer can push arbitrary files that change its behavior. Especially if that manufacturer has sloppy development practices.
I understand that a primary goal of endpoint monitoring software is to be able to quickly react to new threats, and that the turn around time for Windows certification is surely unacceptable in this scenario, but this functionality can never be allowed to jeopardize the stability of the system it's supposed to protect. So it's ultimately on Microsoft to fix this for their users.
It is, perhaps, a guarantee that no vendor should be expected to make.
Or did they choose to keep their own security software to run in kernel space thus forcing themselves to let others play by the same rules?
Nothing in that means they need ring-0 access.
If I sell you a bike and you remove the brakes you can’t sue me when you crash.
Any OS which allows users to do what they generally want to do, also allows users to fubar their own systems.
Let's say I've developed a laptop that bricks whenever you open a website with incorrectly formatted HTML.
Not sure how to adapt your bike analogy to this... Let's say you made a bike that's intended to be ridden outdoors, but breaks down whenever the user sits on it indoors. Yea, no one is supposed to ride it indoors. Not sure it's the best analogy though.
UPDATE: let's say the bike breaks down completely whenever it's ridden in the rain.
I don't understand how this has anything to do with Windows, Crowdstrike is the one who built the application.
Applications crash all the time. But in this case people weren't even able to load Windows to figure out what's wrong or what app has crashed.
Microsoft allowed a third party to self-update and didn't put a proper system of review and update control at the heart of its OS.
If you replace parts in your BMW, and put in some garbage or incompatible parts, it's your fault if it doesn’t run.
You expect to sue your mechanic if he messed up, and for him to cover the full cost. For some reason people do not expect CrowdStrike to pay for their stupidity, which is the root of the problem. And the management that installed CrowdStrike without due diligence.
https://www.theregister.com/2024/07/22/windows_crowdstrike_k...
Microsoft, interestingly enough, is working on a project to add an eBPF[0] runtime to the NT kernel. If they were to use this for their own security products then I doubt the EU would prohibit them from transitioning third-party security products to eBPF programs. Antitrust and competition law do not care about specific technical measures competitors use to compete, just that dominant companies are not shutting competitors out of markets.
[0] Formerly "extended Berkeley Packet Filter", eBPF lets you run safety-verified code in kernel space. Notably, the verifier isn't just a signing check, it can actually ensure the code won't crash the kernel directly.
Furthermore, Microsoft does actually have some rules regarding what you can and can't put into a signed kernel driver. Specifically, they won't sign kernel code unless they've seen and tested it first. CrowdStrike deliberately circumvented this rule by implementing their own configuration format - really, just a fancy way of loading code into the kernel that Microsoft doesn't have signing control over.
If there is blame to be had here for Microsoft, maybe it's that their kernel code signing program doesn't scrutinize third-party configuration formats hard enough. I mean, if you sign a code loader, you're really signing all possible programs, making code signing irrelevant. And configuration is more often than not, code in a trenchcoat. It's often Turing-complete, and almost certainly more complicated than the actual programming languages used to write the compiled code being signed off on.
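To make the "code in a trenchcoat" point concrete, here is a toy Python sketch. Everything in it is hypothetical (the `HANDLERS` table, the rule format, `run_rules`); it is not how any real driver works. The reviewed part is the interpreter. The rules arrive later as data, yet they decide which code path runs and which table slot is read, so a bad rule file can crash the reviewed code unless the loader validates it first.

```python
# The reviewed, "signed" part: a fixed table of handlers and an interpreter.
HANDLERS = [
    lambda x: x + 1,   # slot 0
    lambda x: x * 2,   # slot 1
]


def run_rules_unchecked(rules: list[tuple[int, int]]) -> list[int]:
    """Each rule is (handler_slot, argument). No validation: the rules are trusted."""
    return [HANDLERS[slot](arg) for slot, arg in rules]


def validate_rules(rules: list[tuple[int, int]]) -> bool:
    """A minimal check a loader could run before interpreting the rules."""
    return all(
        isinstance(slot, int) and 0 <= slot < len(HANDLERS)
        for slot, _arg in rules
    )


def run_rules(rules: list[tuple[int, int]]) -> list[int]:
    """Refuse the whole rule set rather than fault halfway through it."""
    if not validate_rules(rules):
        return []  # fall back to "no rules" instead of crashing
    return run_rules_unchecked(rules)
```

In Python the unchecked version merely raises `IndexError` on a rule like `(7, 5)`. In kernel code the analogous mistake is an invalid memory access, and nobody who reviewed the interpreter ever saw the rule that triggered it.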
But at the same time I imagine Microsoft tried this and got pushback. That might be why they feel (incorrectly) like they can blame the EU for this. Every third-party security solution does absolutely unspeakable things in kernel space that no one with actual computer science training would sign off on, using configuration to wrestle signing control away from Microsoft. Remember: Crowdstrike is designed to backdoor Windows systems so that their owners know if an attack has succeeded, not to make them more secure from attacks in the first place. Corporations are states[0], and states fundamentally suffer from poor legibility: they own and operate far too much stuff for a tribe[1] of humans to meaningfully control or remember.
The problem is that we have two different entities that both have the ability to stop this madness. When states run into this situation, they impose "joint and several liability", which means "I don't care how we precisely assign blame, I'm just going to say you all caused it and move on". In other words, it's Microsoft's fault and it's CrowdStrike's fault.
[0] ancaps fite me
[1] Maximally connected social graph with node degree below Dunbar's number.
One only needs to look at what's happening with Google's privacy sandbox to know the perils of antitrust with regard to introducing new interfaces. Even though Google has offered new interfaces and APIs that they themselves intend to migrate to (and take a ~20% revenue reduction), they've attracted the scrutiny of regulators who claim that this is a way of locking out competitors in the advertising space.
> [0] ancaps fite me
This part is simply inciting a flamewar, and something that you can do without in the spirit of the website guidelines[1].
How is Microsoft not to blame, it's their product? We wouldn't blame a Toyota supplier for a failure in a car, but we somehow segment that in the software world?
Crowdstrike is entirely optional software that doesn't come from Microsoft. Microsoft doesn't market it. Microsoft had no hand in making it. Microsoft doesn't sell it. Microsoft had no hand in a user installing Crowdstrike.
Do you not see the obvious differences there?
Do you think Crowdstrike is a Microsoft product?
Believe it or not, that really did not help the low and low-middle classes with their growing financial problems; and the upper-middle and top classes mostly operated in dollars (or less often, in deutschmarks) by this time anyhow, so that didn't inconvenience them much at all.
What I think would help is something that evolved in a less stable computing environment. Something which had to be partition tolerant. Such a thing would have to remain more closely coupled with the consent and merits of its participants because it would lack a reliable connection to a far away authority (currently used to uphold the wishes of extraneous parties to the transaction). Something like local-first software, but for money.
But now all those people who were using currency to trade for housing now suddenly need to find a new way to trade for shelter.
Who got hurt worse here?
So yeah, it could go as you say, but only if the wealthy are behaving in a way that justifies their outsized share while the renters are just spending from a pile of money that they got through less honorable means.
I don't think that's the most likely scenario though.
The nature of “content updates” vs a full product update. Though you may be right, perhaps they provide controls for those updates, I’ve never used their software. But doesn’t sound like it.
Maybe the more critical infrastructure and health care orgs are at the end of that rollout plan so they are at lower risk. It's not ideal if one sandwich shop in Idaho can't run their reports that day, but that's far better than shutting down the hospital next door. CrowdStrike could even compensate those one system shops that are on the front line when something goes down.
Again, better to pay a sandwich shop a few thousand dollars for their lost day of sales than get sued by the people in the hospital who couldn't get their meds, x-rays, etc in time.
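A rollout like that could be sketched roughly as follows. This is Python with invented names (`Host`, `criticality`, `plan_rings`, `roll_out`) and an invented 0 to 9 criticality scale; it is an illustration of the idea, not a description of any vendor's actual deployment system.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    criticality: int  # hypothetical scale: 0 = sandwich shop, 9 = hospital


def plan_rings(hosts: list[Host], ring_size: int) -> list[list[Host]]:
    """Order hosts from least to most critical and split them into rings."""
    ordered = sorted(hosts, key=lambda h: h.criticality)
    return [ordered[i:i + ring_size] for i in range(0, len(ordered), ring_size)]


def roll_out(rings, apply_update, max_failure_rate: float = 0.0):
    """Apply the update ring by ring; halt once a ring's failure rate exceeds the limit.

    `apply_update(host) -> bool` is supplied by the caller and returns True on success.
    Returns (names of hosts updated successfully, names of hosts never touched).
    """
    updated, untouched = [], []
    halted = False
    for ring in rings:
        if halted:
            untouched.extend(h.name for h in ring)
            continue
        failures = 0
        for host in ring:
            if apply_update(host):
                updated.append(host.name)
            else:
                failures += 1
        if failures / len(ring) > max_failure_rate:
            halted = True
    return updated, untouched
```

With a bad update, the first ring (the sandwich shops) takes the hit and every later ring, including the hospital, is never touched.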
This was just Symantec and McAfee ranting about PatchGuard and MS did not remove it.
I've never actually heard anyone claim Privacy Sandbox[0] APIs would give third-party ad networks the same level of tracking as Google. But I imagine even if they did, the APIs would probably be a poor fit for competing ad networks, in the same way that, say, the iOS File Provider APIs are a terrible fit for Dropbox[1].
There are three different ways you can introduce a new standard or interface:
- You can go to or form a standards body with all the relevant market players and agree on a technical specification for that interface. This is preferred, and it's how the Web is usually done.
- You can take a competitor's interface people are already using and adopt that. This is how you get de-facto standards, and while they might have loads of technical problems[2], none of them give you an unfair market advantage.
- You can make your own interface and force competitors to adopt that. You get all the technical problems of a de-facto standard, but those are all problems your competition has to deal with, not you.
The difference is a matter of market advantage. Out of all the major browser vendors, only Google has dominance in online marketing. Microsoft and Apple would like to have a piece of that pie, but they all dropped third-party cookies without tying it to their own competing standards that they wanted to force other people to use.
[0] Hell of an Orwellian name
[1] For example, if you use Dropbox as your file storage, you can't pick folders. At all. On an operating system built by the company whose engineers are obsessed with bundles (directories that look and act like files instead of folders).
[2] laughs in SWF
If I install some kernel level anti cheats and they stop Windows from booting, I need to blame the game developers. Not Microsoft.
You're free to install pretty much whatever you want on Windows.
Exactly
The fact that developers do not take their responsibility as seriously as an average car mechanic brings shame on our entire industry.
Bingo. That's the buy signal
Even on home machines where no user has a password, having to do something special to get into administrator mode will stop several attacks just because people will slow down and ask.
Also, if you want to stop everyone from using administrator accounts, the simplest way is to not have the Windows installer/OOBE setup make an administrator account first.
Windows has a built-in Administrator account already not unlike Root in Linux, there is no reason (other than tradition and absolute convenience) the Windows installer/OOBE setup needs to make an administrator account for the user installing/setting up.
This would just result in more UAC prompts and thus annoyed users who get taught to click on "Allow" whenever a dialog pops up.
Unlike normal software development, anti-malware software has to be resilient against all kinds of tampering. The price for having an OS that isn't heavily locked down and tamper-resistant through hardware-enforced checks is having to rely on kernel-mode code to enforce tamper resistance. Evasion is another issue: you can already hook API calls from user space (some EDRs do this), but evading that as a privileged user is trivial. It boils down to how, on x86/x64, the CPU enforces privilege rings (four in hardware, of which mainstream OSes use two): by design, things that are integrated with the OS and require OS-level privileges and system-wide access must run in the same ring as the OS (ring 0/kernel mode).
There are many ways to tackle this but I haven't heard of any (even from Microsoft's blogs/proposals after the incident) that won't reduce the capabilities and tamper/evasion resiliency of this kind of security software. If x64 had a "secure world" concept like ARM, for example, that would be different, but it doesn't.
[0] https://www.youtube.com/watch?v=EGttFWntctU - I need to state here that I do not possess the level of knowledge the author of video presents and therefore am unable to confirm findings included in the video
And we're back to Microsoft -- they are responsible for not having a proper way to handle such third-party apps, nor did they maintain a process and controls to prevent such rogue breaking updates.
So I don't understand why you're focusing on Windows here. Linux allows anyone to update too, there's no review or control either.
Just because an OS allows you to break it, does not mean the maker of the OS is liable when you do break it.
PS: I believe BSD-based systems would be more resilient because of microkernel architecture.
And installing a third-party kernel module (driver) is...a third party addon that changes the behavior of the product outside of the original designs of the product?
Honda didn't build the engine with NOS in mind. Microsoft didn't build the NT kernel for CrowdStrike. It is a third-party modification to the system the user chose to add on after taking delivery of the product that ultimately changes the behaviors of the system.
Arguing like Microsoft is liable for CrowdStrike's bad software is like arguing Honda is responsible for that NOS kit.
If I write a buggy kernel module that instantly kernel panics my Linux system, is Linus Torvalds responsible? Or am I responsible for the software I wrote?
If you zoom out: Microsoft has a system, and a feature allowed on that system, signed by a cert, etc., can take down 8.5 million devices running it. That is a fault of the system.
A counter example of how to architect the thing? MacOS, Linux.
https://access.redhat.com/solutions/7068083
https://lists.debian.org/debian-kernel/2024/04/msg00202.html
https://forums.rockylinux.org/t/crowdstrike-freezing-rockyli...
Anyone can make a program that can crash MacOS or Linux especially when you convince the user to install it with very high permissions. It is really not too difficult. Heck, Linux comes with the ability to really mess up your system out of the box. Give it a try:
sudo rm -rf --no-preserve-root /
Gee, why would they possibly ship such malware on their system, something that could break the whole thing just hanging around. Would the distro developers be responsible for the damage caused if you decided to run that command?
If you zoom out, Linux has a system, a feature allowed on that system, signed by a cert, etc, can take down any Linux machine, that is a fault of your system.
> Microsoft's platform is meant to integrate with third party software
Sure, but Microsoft offers no warranty on any of the third-party software. Just like Honda offers no warranty on third-party modifications made to your car. Which yes, it's normal and fine to use non-OE equipment on your car, but if you swap OE equipment with non-OE equipment they're no longer going to warranty that equipment. It is not like every component of your car is welded together.
Going back to your original comment here, CrowdStrike was not in any way a supplier of parts to Microsoft. This is why Microsoft shouldn't be held responsible in the same way auto makers are liable for the parts made by their suppliers. And even then, often with the way auto parts suppliers' contracts are written, the final liability just might lie with the parts suppliers! It is not like Honda went under with the Takata airbag recall. Takata was negligent and didn't build to the standards and requirements their contracts required.
Microsoft isn't going to warranty Chrome having a security issue with their JS sandbox or Photoshop corrupting a file. Neither is Apple if it happens on MacOS.
https://www.microsoft.com/en-us/security/business/endpoint-s...
https://learn.microsoft.com/en-us/defender-endpoint/non-wind...
But there's always a chance that the skipping mechanism could break as well. And there must be some form of networking available to be able to send that and ask for approval.
One change - this approval and telemetry doesn't happen during the boot loading process. It's just logged and skipped.
Once bootup is done, the EDR app auto starts, checks logs for anomalies and sends telemetry over whenever network is available (it usually is, because they update malware signatures etc frequently). Someone at the company gets paged, they fix and the process continues.
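The flow described here (log and skip at boot, report later) can be sketched as a small state machine. This is Python with hypothetical names throughout (`BootState`, `decide_at_boot`, `after_boot`, the `edr.sys` driver name in the usage note); a real agent would persist the state on disk and read the blamed module out of an actual crash dump, both of which are assumed away here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BootState:
    """State that survives reboots (hypothetical; kept in memory for this sketch)."""
    last_boot_crashed_in: Optional[str] = None  # module blamed by the previous crash dump
    skipped: set = field(default_factory=set)
    log: list = field(default_factory=list)


def decide_at_boot(state: BootState, driver: str) -> bool:
    """Return True to load the driver, False to skip it.

    No network and no approval step at this stage: just log and move on.
    """
    if state.last_boot_crashed_in == driver:
        state.skipped.add(driver)
        state.log.append(f"skipped {driver}: blamed for previous boot crash")
        state.last_boot_crashed_in = None
        return False
    return driver not in state.skipped


def after_boot(state: BootState, network_up: bool, send) -> int:
    """Once the system is up, ship queued log entries if the network is available.

    `send` is any callable taking one log line; returns how many entries were sent.
    """
    if not network_up:
        return 0
    sent = len(state.log)
    for entry in state.log:
        send(entry)
    state.log.clear()
    return sent
```

For example, after a crash blamed on `edr.sys`, `decide_at_boot(state, "edr.sys")` returns `False` on the next boot, and `after_boot(state, True, print)` later emits the one queued log line. The sketch never clears `skipped`; deciding when it is safe to re-enable the driver is exactly the part a human or a fixed update has to settle.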
So a web browser can't be trusted or certified, ever. Unless JavaScript is disabled?
Sandboxing is such a way to attempt to enforce a guarantee (modulo sandbox bugs, of course). Since crexs aren't entirely in the sandbox, vetting and signoff is supposed to provide the added assurance of security the sandbox can't provide. And those assurances are hollow when the vetted crex is running arbitrary code from a third-party source.
This is incorrect. CrowdStrike caused a similar "won't boot after update was pushed" issue on Linux earlier this year, see https://news.ycombinator.com/item?id=41005936
Microsoft should have no say to decide what software I am allowed to run on my computer.
> Mac, linux don't have this problem due to how THEY architected the system.
You're joking right? You're arguing kernel panics can't happen on Linux? FFS, the CrowdStrike sensor caused kernel panics on multiple Linux distros in the last few months! Linux is not immune to kernel panics for buggy kernel modules.
Two: Here I'm not arguing about what's possible but rather what happened in the real world. 8.5 M machines down, my org runs Macs, we knew about it from the news...
It's not code execution without signing, and I think probably they do want these files to be updated hands free.
The real problem was the lack of testing, rather than the actual mechanism I think.
There is no guarantee the law is written soundly.
[0]: https://learn.microsoft.com/en-us/windows-hardware/drivers/i...
The problem is that you're assuming you can prove a program doesn't have security holes and bad processes.
The CISO and security ops will demand to be completely independent from corp IT, for legit reasons, as the security team needs to treat IT as potential insider threat actors with elevated privileges.
They will also demand the ability to push out updates everywhere at any time in response to real-time threats, and per the previous point they will not coordinate or even announce these changes with IT.
There has always been an implicit conflict between security and usability, because of the inherent nature of security deny policies, but they also inherently conflict with conservative change management policies such as IT slow rolling changes through lower environments on fixed schedules and operating with transparency.
I always wondered: why should security ops not be a potential insider threat actor? In fact, if they were compromised, it would be even worse.
Do we need two different security ops that monitor each other? :)
So I guess 5 security OPS teams in different regions of the world, and they can all call a vote if one of the teams is now 'bad' :)
For many high-privilege operations there is more segregation of duties on the act side of things: these can be broken down into plan, authorise, configure, activate, validate, or some rollup of these. Another is dual control on the act side, since conspiracy is generally quite hard to pull off, especially if it’s just for pocket change. It's different, of course, if $$billions of fungible cash are at stake.
People often overcomplicate - simple do/check is often enough.
IMO there are no legit reasons except politics, empire building, NIH and toxic relationships for such a crazy state of affairs.
Technical people will make a recommendation, knowing it’s going to be ignored and that the decision's already been made.
So sure, IT gets to "decide" - between CrowdStrike, SentinelOne, or Palo Alto (and maybe a couple others). But they don't really have much choice, they can't use an OSS solution, or roll their own, or anything else. They have to pick one of a small number of existing solutions.
And yes, in the real-world, third-party software can and does cause Macs to crash.
8.5M machines...out of what, 1.4 billion? That's what, 0.6% of machines?
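For anyone who wants to check the back-of-envelope math, a quick sketch in Python. The 1.4 billion installed base is the rough guess from the comment above, not a verified figure.

```python
affected = 8_500_000            # machines reported down
installed_base = 1_400_000_000  # rough guess from the comment, an assumption

share = affected / installed_base
print(f"{share:.2%} affected, {1 - share:.2%} unaffected")
# prints "0.61% affected, 99.39% unaffected"
```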
"And yes, in the real-world, third-party software can and does cause Macs to crash." Thanks for adding so much to the conversation (eyes rolled).
In the absolute sense 8.5M machines is a lot. Airlines down is a lot. Hospitals down is a lot. "Hey, we guarantee we won't wreck 99.4% of our machines out there!" is not a good guarantee.
And sure, why shouldn't you be able to modify the software on hardware you own? It's your microwave. If you modify the software on it and that causes it to burn up don't go to the manufacturer when it burns your house down. But that's true if you open it up and rewire it as well. Which, sure, feel free to open it up. It is your microwave.
Are you arguing you shouldn't be able to modify the things you own?
> Thanks for adding so much to the conversation
I mean it seriously seems like you're arguing MacOS and Linux are immune to third party software crashing the system. Do you agree or disagree that third party software can cause MacOS and Linux instability, especially when the user chooses to run it at root level permissions?
> we guarantee we won't wreck
Microsoft didn't wreck these machines. CrowdStrike wrecked these machines. Every Windows machine that did not have CrowdStrike installed was unaffected by this, which is 99.4% of Windows machines.
> what happened in the real world
And yes, look at those bug reports, those are crashes happening in the real world not something theoretical. Kernel panics happen!
Maybe not. Intel is considering removing rings 1 and 2 for a future 64-bit only x86 architecture, because they "are unused by modern software".
https://www.intel.com/content/www/us/en/developer/articles/t...
And the reason why not is simple. Anything that Microsoft thinks is a good thing to add to the API, they'll add for themselves. When the new API is released, their software is released with it. This gives them a competitive advantage over competitors who have to wait for Microsoft to have the idea that they want, and then scramble to implement it after Microsoft does.
The EU is suspicious of this for the simple reason that Microsoft has a several decade history of doing exactly that. Repeatedly. My favorite example being the release of Windows 95 with Microsoft Word available at the same time, and with WordPerfect unable to run. By the time WordPerfect had figured out how to port their software to Windows 95, they were no longer the market leader.
That is somewhat revisionist history. WordPerfect admitted at the time they saw OS/2 as the future and were focused on that. Only in hindsight did they realize OS/2 was going nowhere (too bad, it was better than 95) and had to rush to get a WordPerfect for 95. Worse for them, they wrote each release of WordPerfect in platform-specific code (mostly assembly), so it wasn't a case of porting to 95; it was a case of starting over mostly from scratch.
Yes WordPerfect lost to Word with 95 - but it was bad decisions on WordPerfect's part. They had opportunity to get WordPerfect on 95 much faster. I don't know if it would have been fast enough, but they didn't even try until it was too late.
It was full of ad infested solutions, which would crash your computer from time to time.
Defender at least was reasonably performant and tended to be stable.
You could say that since they had access to kernel source, they were better informed, but I guess if there was an API, the provided documentation would solve the issue (not necessarily, not everyone bothers to read the docs).
But then you get back to how to enforce equal and open access for everyone (the EU did try to make Microsoft open the Word file format, but it turned out to be so complicated, and documented in legacy code only, that Microsoft had trouble giving useful docs).
Anyway, as you said, it's complicated...
The ultimate lesson then is to stop using MS stuff.
The fact that Microsoft abandoned it as soon as a regulator pointed out how anti-competitive the design of the API was makes you wonder what Microsoft's true intention was. To me that implies the anti-competitive design was its main feature and to Microsoft it would've been pointless to continue without it.
The use of platform specific code was a performance necessity at the time, everyone did it. Part of the promise of Windows 95 was that it could run your Windows 3.1 programs. They bent over backwards for a ton of programs, but not WordPerfect. Microsoft also had an early access program to Windows 95. WordPerfect applied for it - and was denied access. After that the OS/2 bet was their only real hope.
The truth is that Microsoft had a long and documented history of using one monopoly to leverage into another. Over and over again they lost antitrust lawsuits, but internally regarded them as speeding tickets on the way to greater monopoly power. This history showed up in court. The internal documentation on the WordPerfect case showed up in the Netscape case, and is part of why Microsoft lost.
It wasn't until the EU started fining Microsoft €1.5 million per day for noncompliance in 2006 that Microsoft's attitude started to change. Now I see them as just normal big guys with a worse than average history. But back in the 90s and early 2000s? They EARNED the title of "evil empire".
But another way of looking at this would be that perhaps they wanted to be the beta testers of the API themselves because opening it up would have been a maintenance liability for the company. Microsoft tends to be pretty good about backwards compatibility in ways that Apple is not.
We also don't know that these APIs were cancelled; they may make it into future versions of Windows.
And as mentioned elsewhere, an eBPF module behaving badly but in valid ways can still make your system pretty unusable.
If we want to talk microwaves, Microsoft is the microwave manufacturer. Users installing CrowdStrike are people sticking a giant ball of foil and paper towels in the microwave and turning it on for an hour. You're arguing Microsoft is liable for the things people stick in their microwaves, and that Microsoft should put in place guards to prevent people from putting whatever they want in their own microwaves. That Microsoft should control the things people put in their microwaves. Only Microsoft tested and Microsoft approved foods in Microsoft microwaves. And the microwave needs to ensure only the proper cook time applies to the properly signed food products to make sure it doesn't get burnt. Sorry, Microsoft hasn't fully validated Red Gold potatoes, it can only cook Russet potatoes.
That is the same logic as Microsoft is liable for the third-party software people install on Windows machines and that Microsoft shouldn't have allowed the third-party software to run.
Why should Microsoft be able to say what antivirus software I choose to install or not? Why should Microsoft be able to say what browser I install? If I install some software that breaks my Windows machine, is that the fault of Microsoft or the fault of the software maker? If I stick foil in the microwave is the ensuing fire GE's fault?
Microsoft didn't make CrowdStrike. They didn't make the update. They didn't enforce it. They didn't sell it.
If I write a new OS how will you force the "word processing, video conferencing, and music-selling" companies to write code for it? If they don't write the above my OS is worthless, but if my OS fails in the market anyway they just wasted a lot of money. This is why OS companies tend to have the other things, their OS cannot exist in a vacuum and the only way to ensure they have those needed tools is to write them themselves.
Nobody wants to try to be selling consumer software that is optimized for the out of date and unsupported version of the OS.
If you could prevail on a government to decide that, maybe it could work.
One thing I see, is that AV has a component of maintaining a DB of signatures of bad things. This does not seem at all the job of the core os. Would the Debian team maintain such a DB?
https://redmondmag.com/articles/2014/04/28/court-nixes-novel...
The case brought to light an Oct. 3, 1994 memo from then-Microsoft CEO Bill Gates, who indicated that Microsoft should withhold namespace extension APIs in Windows 95 from its competitors, WordPerfect and IBM, in order to gain market advantage for Microsoft Word.
In other words, your revisionist history is wrong. Microsoft really was big enough. We know that because WordPerfect asked for early access to Windows 95. It was Microsoft who turned them down. (And no, I don't believe Gates's testimony about security. I think that Gates was bamboozling the judge, and the judge bought it.)
(I had misremembered which court case brought that memo to light. But regardless, it was obvious to the whole industry at the time. Incidentally this memo came while Microsoft was under a consent decree signed on July 25, 1994 with the Justice Department to not try to maintain their monopoly by tying specific products to Windows. Technically, they didn't here, but they were walking the line. They crossed the line with IE though, and that later resulted in the Netscape loss.)
As for BeOS, the question was how a LEADING operating system company was supposed to cope with getting software for the next version of their OS. No matter how many good things we can say about BeOS, they never got to the point of being a leading operating system company.