If not, how’d a small-time outfit get access to something the rest of us can’t have because we’re (apparently) not trustworthy enough?
No shade on these guys - I’m thinking it’s just another plot twist in “Hours of AI’s Lives”.
Is it? It's an arms race between the "good guys / defenders" and "bad guys / attackers". Assuming both sides have access to the same tools, how is this going to make any difference? Their relative strength will stay the same.
What is actually different is that 1. anybody without tool access is out of the game, which includes security professionals from poorer backgrounds (for them it's not too amazing of a journey) and 2. the AI vendors get a constant stream of what's essentially an AI tax from everybody - so yeah, for them it's gonna be an amazing ride.
I mean really, given how AI is being marketed, what is the point of the internet going forward when all its contents are going to be AI slop anyhow? Just disconnect and run local models if all you are getting is slop anyhow. The original purpose of the internet is now dead. Real people communicating and sharing information with each other? Ha! That is no longer valued.
In fact, actual innovation might no longer be valued. What is innovation but opening Pandora's box and potentially seeing a disruptive competitor take your slice of the Federal Reserve's money printing? Better to nip that in the bud and control all the devils we already know, from a pure sociopathic profitmaking standpoint, which seems to be a very dominant viewpoint among the people in charge of the world's power structures right now.
I jest, but I did notice having more confidence to take on more ambitious work lately. We're all centaurs now.
They simply have to show it against a beta version of macOS, and frame it as unauthorized access, and maybe from locked mode if possible.
> Apple spent five years building it. Probably billions of dollars too.
This seems higher than I'd expect.
Also, Apple claims it was an effort spanning half a decade (https://security.apple.com/blog/memory-integrity-enforcement...), so depending on what you consider part of this (for example, do you include time spent on their secure memory allocator, on designing/implementing the ARM Memory Tagging Extension or Extended Memory Tagging Extension in the costs of this feature?),
Haha! Nerd jokes are the best jokes
Arm published the Memory Tagging Extension (MTE) specification in 2019 as a tool for hardware to help find memory corruption bugs. MTE is a memory tagging and tag-checking system, where every memory allocation is tagged with a secret. The hardware guarantees that later requests to access memory are granted only if the request contains the correct secret. If the secrets don’t match, the app crashes, and the event is logged. This allows developers to identify memory corruption bugs immediately as they occur.
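As a rough illustration of the tag check described above, here is a toy Python model. It is not how the hardware or any real allocator works: `ToyTaggedMemory`, `Ptr`, and `TagMismatch` are made-up names, tags are kept per cell rather than per 16-byte granule, and the rule that neighbouring allocations never share a tag is an assumption made to keep the demo deterministic.

```python
import secrets
from typing import NamedTuple


class TagMismatch(Exception):
    """Stands in for the fault that real hardware would raise."""


class Ptr(NamedTuple):
    tag: int   # the "secret" carried alongside the address
    addr: int


class ToyTaggedMemory:
    FREED = 0  # reserved tag that no live pointer ever carries

    def __init__(self, size: int) -> None:
        self.cells = [0] * size
        self.tags = [self.FREED] * size
        self.sizes = {}          # base address -> allocation length
        self.next_free = 0       # simple bump allocator, never reuses memory
        self.last_tag = self.FREED

    def malloc(self, n: int) -> Ptr:
        # Pick a random tag in 1..15 that differs from the previous allocation's
        # (assumption of this sketch, so adjacent blocks always differ).
        tag = self.last_tag
        while tag == self.last_tag:
            tag = secrets.randbelow(15) + 1
        base = self.next_free
        self.tags[base:base + n] = [tag] * n
        self.sizes[base] = n
        self.next_free += n
        self.last_tag = tag
        return Ptr(tag, base)

    def free(self, ptr: Ptr) -> None:
        self._check(ptr, 0)  # a double free fails the tag check too
        n = self.sizes.pop(ptr.addr)
        # Retag the block so stale pointers no longer match.
        self.tags[ptr.addr:ptr.addr + n] = [self.FREED] * n

    def load(self, ptr: Ptr, offset: int) -> int:
        self._check(ptr, offset)
        return self.cells[ptr.addr + offset]

    def store(self, ptr: Ptr, offset: int, value: int) -> None:
        self._check(ptr, offset)
        self.cells[ptr.addr + offset] = value

    def _check(self, ptr: Ptr, offset: int) -> None:
        # Access is granted only if the pointer's tag matches the memory's tag.
        if self.tags[ptr.addr + offset] != ptr.tag:
            raise TagMismatch(f"tag mismatch at address {ptr.addr + offset}")
```

In this model, writing one cell past the end of a block or touching a block after `free` raises `TagMismatch`, while any in-bounds access through a live pointer passes the check regardless of the value written.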
https://support.apple.com/guide/security/operating-system-in...
(I’m sure they’re not lying, but we’re not learning anything here)
However, it is no different from the Linux kernel: just because Rust is now allowed, the world hasn't been rewritten, and no sane person is going to do a Claude rewrite of the kernel.
https://docs.swift.org/compiler/documentation/diagnostics/st...
My opinion is that it is over-hyped because, like any LLM, it requires a suitable human in the loop to keep the LLM on the straight and narrow, and then to weed through the inevitable false positives and hallucinations.
Nicholas Carlini, for example, whose name is on many of the recent high-profile Mythos findings, is not just some random dude with a Claude sub on his credit card ... he's an experienced security researcher.
Random inexperienced people thinking Mythos can replace the need for experienced pen-testers, auditors etc. are likely to be sorely disappointed if/when they get their hands on Mythos.
I don’t think Mythos is hype for all kinds of reasons.
Anthropic is a young company but their track record is solid; they don’t seem to hype things just for the sake of hyping things. Sam Altman at OpenAI? We already know his track record…
I’m going Occam’s razor here: the simplest explanation is usually the correct one.
Anthropic had an “oh shit” moment when they realized what Mythos can do. They decided to do the responsible thing: give the industry a heads-up and an opportunity to use the preview to identify and fix the most dangerous zero-day vulnerabilities.
Since the FAANG companies have billions of users, it makes sense to start with them.
There’s still going to be major issues for users of systems too old to get patches or updates. Or for IT organizations who think Mythos is a replay of Y2K, where, compared to the warnings, not a lot happened.
The bottom line is someone with Mythos won’t need to be an experienced security expert to cause real problems. That’s kind of the point.
"Suitable human" is a dry phrase indeed. ^_^
The hype is "gosh look at all the bad things this brilliant almost conscious tool found!"
The reality: an insecure toolchain for an insecure language with an insecure compiler produced a runnable but insecure binary for an insecure OS. We couldn't be arsed to address any of this before, but now we're being billed the full price of our laziness.
At first they will be delighted. So much money and time saved. When their adversaries get their hands on their system (with or without Mythos), then they'll be sorely disappointed.
They got all that compute with the SpaceX partnership but now the PR has taken a life of its own, so might as well keep hyping up Mythos and artificial scarcity if they have an asset people want now
Just roll with it
OpenAI has a history of doing the same thing and it's the same people. GPT-5 was supposed to be AGI at one point, remember.
I agree, but it's the people I'm worried about.
I'm hearing anecdotes from all over about devs pushing LLM-generated code changes into production without retaining any knowledge of what it is they're pushing. The changes compound, their understanding of the codebase diminishes, and so the actions become riskier.
What's worse is a lot of this behavior is being driven by leaders, whether directly (e.g. unrealistic velocity goals, promoting people based on hand-wavy "use AI" initiatives, etc) or indirectly (e.g. layoffs overloading remaining devs, putting inexperienced devs in senior roles, etc).
The world's gone mad and large swaths of the industry seem hellbent on rediscovering the security basics the hard way.
No anecdotes needed, it's entirely happening.
But it's also devs, being devs.
juniors have been writing code forever that is imperfect and not memorized by the people reviewing
isn't the important thing the mechanisms for maintaining the code?
I don’t think so.
An LLM can produce higher-quality documentation than most humans. If it's not already happening, when a new developer joins a team, they're going to have an LLM produce any documentation a new developer needs, including why certain decisions were made.
It could also summarize years of email threads and code reviews that, let's face it, a new person wouldn’t be able to ingest anyway; it's not like a new developer gets to take a week off to get caught up on everything that happened before they got there. English not their first language? Well, the LLM can present the information in virtually any language required.
As the models continue to improve, they'll spot patterns in the code that a human wouldn’t be able to see.
(https://www.usenix.org/publications/loginonline/data-only-at...)
This makes more sense. You don't trigger MTE since you're not doing anything to force MTE to take action; the program itself isn't actually changing.
My other question would be, why didn't Apple use fbounds checking here? They've been doing it aggressively everywhere else.
MTE plus fbounds checking everywhere should lead to an extremely hardened OS.
It's not the first time bugs have gotten past MTE; it happened with Google Pixel last year ... https://github.blog/security/vulnerability-research/bypassin...
1. Any given system has a finite number of findable vulnerabilities.
2. All findable vulnerabilities are fixable (if not in software then with a new hardware revision).
3. Fixing a vulnerability while keeping the same intended functionality introduces on average less than 1 other findable vulnerability.
4. It is possible to cease adding new features to a system and from that point forward only focus on fixing vulnerabilities.
If all 4 are true, then perfect security seems possible, in some sense. I think some vulnerabilities might not be fixable, if you include things like the idea that users can be tricked into revealing their passwords. If you restrict the definition of vulnerability to some narrower meaning that still captures most of what people mean when they say computer vulnerability, then I think those 4 statements are probably true.
Perfect security might be near impossible in practice because vulnerabilities will get more difficult to find and fix over time, but I think we should expect the discovery of vulnerabilities to eventually become arbitrarily slow in a hypothetical system that prioritized security above all else.
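Statement 3 is what makes the process converge. As a back-of-the-envelope sketch (the symbols $N_0$ and $r$ are assumptions of this sketch, not measurements): suppose a system starts with $N_0$ findable vulnerabilities and each fix introduces on average $r < 1$ new ones. Then the expected total number of fixes is a geometric series:

```latex
\mathbb{E}[\text{total fixes}] \;=\; N_0 \sum_{k=0}^{\infty} r^{k} \;=\; \frac{N_0}{1 - r}, \qquad 0 \le r < 1
```

For example, $N_0 = 100$ and $r = 0.2$ gives 125 fixes in expectation. As $r$ approaches 1 the total blows up, and for $r \ge 1$ the sum no longer converges, which is why the "less than 1" in statement 3 carries the argument.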
If you imagine you had a vulnerability scanner as fast and convenient as a linter, it would be much cheaper to write secure code right away. Probably not perfectly secure, but still secure enough to make sure finding exploits stays expensive.
Ironically the AIs will probably help us produce higher quality software in the end, because "everything gets pwned" becomes the forcing function for software actually being correct.
In other words I think we are actually entering an age where correctness makes economic sense. (One can dream!) The cost of producing correctness is dropping, and the cost of not doing so is rising massively.
My dad was on one of the many Y2K teams that major tech companies had to make sure nothing went wrong. I feel like history may have undersold what could've been if not for considerable effort leading up to Jan 1, 2000.
(or if you tasked an AI to write that: how appropriate!)
When Sundar Pichai announces that 75% of all new code at Google is AI-generated, their stock price goes up. If he were to announce that 75% of all new code at Google is now written by junior engineers, this would trigger a massive sell-off and a lot of employees would resign.
Seniors are only part of the picture as team leads, or when it escalates after big screwups.
"Can" bears some heavy weight.
LLM-generated documentation has such low information density that it’s useless. Yes, it writes nice sentences… or even writes. But it contains so much noise that, currently, reading the code is better documentation than any LLM-generated documentation I’ve seen.
The same goes for LLM-generated articles. I close them after the second sentence because at least 90% of it is useless filler.
Now compare that to this: https://slate.com/technology/2004/11/the-death-of-the-last-m...
I almost closed it when I read the first few sentences because these kinds of articles are useless, time-wasting nonsense. But this was different. This was old. Most sentences contained something new. Something worthy. (Of course, people also write unnecessarily long articles… looking at you, Atlantic)
You can throw out almost everything by volume from LLM-generated documentation without losing any information.
Currently, if I smell (and it’s very easy to smell) an LLM-generated document or article, I close it immediately, because it’s good for only one thing: wasting my time, for no good reason.
I should clarify: the documentation I’m talking about is not generated using a generic LLM prompt, which would mostly suck.
With the proper context and additions (skills, plugins, MCPs) LLMs can produce high-quality documentation. You'd also have subagents doing QA of the documentation.
But it does require effort; it’s not magic.
If stuff really goes wrong, you need people who deeply understand the codebase so that they know where to look and how to diagnose the issue. It might be the case in the future that LLMs become so powerful they'll diagnose any issue (I doubt it), but until then, we need people in the loop.
if you want to be a one man show handcrafting an artisan iOS app that will be fine, but you should probably let Claude bang against it for a while to shake out whatever bugs
Will we now have leetcode of prompt writing?
1. It’s too performance sensitive
Or
2. The OS is so darn large it’s hard to recompile everything
A simultaneous total world build is relatively rare (is that needed here?), but it does happen. Sometimes new compiler versions or features need this.
I'd imagine this set is very similar to just "the set of software in the world". Even before the AI stuff, it was a pretty good bet that any given piece of software had some vulnerability; it was just a question of how easy it was to find it.
So much out of date software with known exploits left running for years. The only reason there hasn't been total disaster is no one has tried to hack it yet.
The root problem is the world runs on C code that is riddled with vulnerabilities.
Then you have the many companies in the UK, US, Canada, EU that have compliance and regulatory laws that require them to exist in some capacity in house. Though that is changing with MDR services, but someone still has to interface with the MDR.
[1]: https://www.elastic.co/pdf/sans-soc-survey-2025.pdf
[2]: https://github.com/jacobdjwilson/awesome-annual-security-rep...
I mean we are literally in a thread about how the 4 trillion dollar company, literally the 3rd most valuable company in the world, with a core competency in software has, yet again, released a core product riddled with security defects for the 50th year in a row.
Commercial IT security is an industry that is incapable to a fault and has, so far, faced basically zero consequences for it.
Even more so in the future, when a software company can be launched by a farm of AI agents with a founder at the helm who has no clue about computing or security.
What's debatable is how many of those companies actually need airtight security, because they are never realistically going to be targets of criminals and/or they have nothing valuable to steal/corrupt in the first place (other than the owner's pride).
I was pointing out how even Apple, an entity that by all rights should have top-notch security, is still absolutely hopeless in the face of commonplace commercial, profit-motivated attackers.
Massive, extremely well-resourced divisions supported by management in a technically competent organization that is actually trying to solve the problem struggle to produce at best middling security that is inadequate against commonplace threats. This is not a prioritization problem; even if you do “everything right” you are still vulnerable to run-of-the-mill commercial attackers. This is a fundamental capability problem, like how we can not make a net positive fusion reactor right now.
It is actually unfair to blame these companies for not having a fusion reactor because they “were not trying hard enough”. Actual security is not an easy problem, and it is a great disservice to portray it as one that is only unsolved due to dunderheads being in charge, since it leads to underestimating what actually needs to be done.
That is not to say you cannot do dramatically worse than the “gold standard”, and most organizations actually are incompetent; but the “gold standard” is still objectively grossly inadequate. You need to be dramatically better than the 4 trillion dollar software company to be adequate against prevailing threats.
I still have nightmares about the contact form on my low-stakes personal website getting hijacked to use as a spam sender (because I used unsanitized input in mail headers).
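For anyone with a similar form, a minimal sketch of the usual fix: refuse any user-supplied value containing a line break before it goes anywhere near a mail header. The function names and the `example.com` addresses are placeholders I made up, and a real form would also want address validation and rate limiting.

```python
from email.message import EmailMessage


def clean_header(value: str) -> str:
    """Reject any value that could start a new header line."""
    if "\r" in value or "\n" in value:
        raise ValueError("line break in header value")
    return value.strip()


def build_contact_mail(sender: str, subject: str, body: str) -> EmailMessage:
    msg = EmailMessage()
    msg["From"] = "contact-form@example.com"  # fixed, never user-controlled
    msg["To"] = "owner@example.com"           # fixed recipient
    msg["Reply-To"] = clean_header(sender)    # user input, checked first
    msg["Subject"] = clean_header(subject)    # user input, checked first
    msg.set_content(body)                     # newlines are fine in the body
    return msg
```

The classic attack submits something like `alice@example.com\r\nBcc: victim@example.com` as the sender; with the check above that raises `ValueError` instead of adding a `Bcc` header.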
This is true in America in many industries now, but most of the rest of the world (even the rest of the OECD) is still far behind.
Exploits are BAD!