Project Glasswing: what Mythos showed us (blog.cloudflare.com)

363 points by Fysi 45 days ago | 141 comments

roxolotl 45 days ago |

What does this mean?

> It's a different kind of tool doing a different kind of work, and that makes a clean apples-to-apples comparison to earlier models difficult.

They claim it’s a different kind of tool and then describe using it the same way you’d use any other model. This really felt way worse than the average Cloudflare blog and really just rehashed the Mythos announcement which had already called out the key parts being chaining and crafting examples.

JeremyNT 45 days ago | |

> They claim it’s a different kind of tool and then describe using it the same way you’d use any other model. This really felt way worse than the average Cloudflare blog and really just rehashed the Mythos announcement which had already called out the key parts being chaining and crafting examples.

Hah, I was trying to parse this too.

Charitably perhaps they're being vague on exactly what's different because they're still under NDA.

password4321 45 days ago | |

> way worse than the average Cloudflare blog

How long has it been since you took your average? Lately all Cloudflare output has been heavily AI'd.

__natty__ 45 days ago | |

Sounds different because it’s hidden advertisement not a regular blog post

grim_io 45 days ago | | |

But why would Cloudflare advertise Anthropic? They are competing with Anthropic by hosting open-weight models.

alsetmusic 44 days ago | | |

> Sounds different because it’s hidden advertisement not a regular blog post

Yep. Cloudflare has lost my respect over the last six months.

Posting about pro-AI initiatives and APIs for AI and then laying off a lot of people was a pretty impressive example of how to do the wrong thing.

samstokes 45 days ago | |

The post says they wrote a custom harness that orchestrates work between multiple separate model invocations. That is different from running Claude Code (which is a specific existing harness around the Claude models).

The post takes a while to get around to saying that, and could have included more detail besides the workflow diagram and table (which they flag as only "an example of" such a harness), but it does answer the question. It's a different kind of tool because it's a model rather than a harness+model pair.

meander_water 45 days ago | |

> the model has its own emergent guardrails that sometimes cause it to push back on legitimate security research requests. But as we found, these organic refusals aren’t consistent - the same task, framed differently or presented in a different context, could produce completely different outcomes as illustrated in the examples below.

This was new. I'm surprised that a model specifically designed for security research and gated to professionals is refusing legitimate requests.

_alternator_ 45 days ago | | |

There's pretty strong evidence that (mis)alignment in one area creates (mis)alignment in others. The "aligned behavior" vectors are not orthogonal from cybersecurity to bioweapons to prejudice, so having alignment in some will likely bleed into others.

sn0rl27 44 days ago | | |

The model wasn’t created specifically for security research. It’s a general model that just happens to be dangerously good at security research (according to Anthropic).

getnormality 45 days ago | |

I think they're saying it has qualitatively different capabilities that make certain kinds of security work more worth pursuing with the model, not that the model of human-AI interaction has changed.

You're right that they're using a harness like everyone else. The general idea of giving the model a harness is not going to change. I mean even humans need harnesses to accomplish some things.

mycall 44 days ago | | |

Google Maps is my favorite human harness.

smusamashah 45 days ago | |

'It's not X, it's Y' is also a common LLM trope.

FergusArgyll 45 days ago | |

I think what they might mean is:

Because of its capabilities, a new kind of harness can be built for it, thus the entire system (model + harness) is a different kind of tool than, say, Claude Code.

Xirdus 45 days ago | | |

But did they build this different harness? And are they sure other models can't cope with it?

eikenberry 45 days ago | |

My guess is because it is a model trained specifically for security/hacking. So comparing it to Opus, trained for chat/code/etc., is apples-to-oranges.

rs_rs_rs_rs_rs 45 days ago | | |

It is not, that's what surprised Anthropic employees too.

sandeepkd 45 days ago |

I was expecting some more concrete numbers and surprises. It just seems like a balanced promotional article, probably written using an LLM itself.

wslh 45 days ago | |

Over the last few days I have been recommending the insights from XBOW [1]. It's a competitor, but it adds more information to the discussion.

[1] https://xbow.com/blog/mythos-offensive-security-xbow-evaluat...

sandeepkd 45 days ago | | |

Thanks for sharing. It's definitely more concrete. Some of the things I was hoping to find were the number of false positives, the time it takes to tell the false positives from the real ones, and the toll this exercise takes on the human mind. Did anyone manually verify the exploits identified by the LLM, or were they assumed correct based on the explanation? I do understand that the target audience of these articles is probably the decision makers, so the language and content have to be tailored accordingly.

FergusArgyll 45 days ago | | |

That is a good article.

Interesting that gpt-5.5, while not as good as Mythos, also seems like a decent step up.

xnorswap 45 days ago |

The real question is whether it was Mythos or Opus that wrote this post.

> "Why it matters"

It doesn't, it's a corporate blog, they were rarely written in one author's voice anyway, but it's interesting to see that even large organisations are outsourcing their blogs to LLMs.

Illniyar 45 days ago |

'Narrow scope produces better findings - Telling the model "Find vulnerabilities in this repository" makes it wander. Telling it "Look for command injection in this specific function, with this trust boundary above it, here's the architecture document and here's prior coverage of this area" makes it do something much closer to what a researcher would actually do.'

So what, we take every function and every vulnerability type and just run the agents millions of times?

I would expect Mythos to be able to find vulnerabilities without having them pointed out to it; otherwise it's no better than other agents. It just has a better harness.
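
For a rough sense of scale, here is a minimal sketch of what "every function times every vulnerability type" looks like as a task list. The function names, bug classes, and prompt wording are all made up for illustration; none of them come from the Cloudflare post.

```python
from itertools import product

# Hypothetical inputs: neither list comes from the Cloudflare post.
FUNCTIONS = ["parse_header", "run_hook", "load_config"]
BUG_CLASSES = ["command injection", "out-of-bounds read", "path traversal"]


def scoped_tasks(functions, bug_classes):
    """Yield one narrowly scoped prompt per (function, bug class) pair."""
    for func, bug in product(functions, bug_classes):
        yield f"Look for {bug} in {func}()."


tasks = list(scoped_tasks(FUNCTIONS, BUG_CLASSES))
print(len(tasks))  # 3 functions x 3 bug classes = 9 tasks
print(tasks[0])    # Look for command injection in parse_header().
```

The task count is the product of the two lists, so it grows quickly: 10,000 functions and 20 bug classes would already mean 200,000 invocations before any filtering.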

robot_jesus 45 days ago |

The "Four lessons" that came out of running this work at scale made me chuckle. Three of the four were essentially identical and entirely obvious. In short: specific, narrow requests work better than "find vulnerabilities." Well, d'uh.

But I did think the adversarial review (while not novel at all, and much talked about in HN circles) is interesting and distinct, at least. I need to put this to work in more of my workflows. I think it could be beneficial for non-coding tasks, too.

https://blog.cloudflare.com/cyber-frontier-models/#what-a-ha...
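
As a minimal sketch of the adversarial review idea: one pass proposes findings, a second skeptical pass tries to refute them, and only the survivors are kept. Both "models" below are toy stand-in functions, and the names `adversarial_review`, `toy_finder`, and `toy_reviewer` are my own; this is not how the post's harness is implemented.

```python
from typing import Callable, List

# Both "models" are stand-in callables; a real harness would call an LLM here.
Finder = Callable[[str], List[str]]
Reviewer = Callable[[str, str], bool]  # True means the finding survived review


def adversarial_review(code: str, finder: Finder, reviewer: Reviewer) -> List[str]:
    """Keep only the findings that a second, skeptical pass fails to refute."""
    return [finding for finding in finder(code) if reviewer(code, finding)]


def toy_finder(code: str) -> List[str]:
    # Deliberately over-reports, the way a first pass tends to.
    return ["shell injection via os.system", "SQL injection"]


def toy_reviewer(code: str, finding: str) -> bool:
    # Toy rule: a finding survives only if the construct it names is in the code.
    return "os.system" in finding and "os.system" in code


snippet = "import os\nos.system('ls ' + user_input)"
confirmed = adversarial_review(snippet, toy_finder, toy_reviewer)
print(confirmed)  # ['shell injection via os.system']
```

The same shape should carry over to non-coding tasks: swap the finder for a drafter and the reviewer for a critic.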

MattSayar 45 days ago |

> The loudest reaction to Mythos Preview from other security leaders has been about speed - scan faster, patch faster, compress the response cycle. More than one team we have spoken with is now operating under a two-hour SLA from CVE release to patch in production [...] If regression testing takes a day, you cannot get to a two-hour SLA without skipping it, and the bugs you ship when you skip regression testing tend to be worse than the bugs you were trying to patch.

Over time, I wonder if these models will be able to generate more secure code by default by doing this kind of exploitability testing before ever merging their code.

krupan 45 days ago | |

I don't know, but it always seems weird to me when people notice AI isn't performing super well and then conclude that the solution to the problem is to try using more AI.

tskj 45 days ago | | |

Yeah why not? That's how I work. If I don't review my work, it's way worse than if I do review it and revise and iterate. I don't see why AI should be different: in fact it very clearly seems to be the case that it isn't.

germandiago 44 days ago | | |

Reminds me of people adding more intervention and bureaucracy because the last round did not do well, so we need more of it.

The problem is never the results of it. It is that we did not do well enough.

edu 45 days ago | |

Or they don’t, and they* sell access to Mythos and successors through their services company or network of partners and charge a premium.

* by "they" I mean all foundation model providers, as OpenAI seems to be going in the same direction

dataflow 45 days ago |

That's great and all but how severe were the most severe vulnerabilities found? I imagine they don't want to talk about it, but that's really the most interesting and important bit.

sf_tristanb 45 days ago |

Great, but why don't you share real data on how many security vulnerabilities it found? How many were real, and how many weren't?

ofjcihen 45 days ago | |

Yeah I’m waiting for this as well.

I get that you want to address them or whatever before releasing info but I keep seeing these claims with barely any data and I’m like…how do you expect people to not be skeptical?

I mean hell if you’re a security professional you’re literally paid to be skeptical.

rithdmc 45 days ago | | |

the curl maintainer goes into some more detail on this

https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...

sanxiyn 44 days ago | | |

Mozilla published some numbers and actual bugs.

https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...

dheatov 44 days ago |

They used SO many words to say so little. At this point it seems pretty safe to say Mythos is purely a PR stunt.

keyle 44 days ago | |

It reads like a long-winded version of paid advertising.

unethical_ban 45 days ago |

Interesting for teams looking to implement AI into their deployment process.

I don't think guardrails are useful long term. Assuming we don't see the end of open near-frontier models, it is folly to try to keep models from doing exploit generation. The solution needs to be all software projects writing code under the assumption that hackers will be running LLMs against their code in search of exploits and write secure code accordingly.

sterlind 45 days ago | |

even careful programmers working in unsafe languages will introduce bugs; it's inevitable. in 2026 we should be using safe languages for all new projects, but there's a gargantuan amount of C/C++ handling protocols.

but I agree that guardrails will only help for like, 3-6 months. we should be screening as much as we can with Mythos; unfortunately, Anthropic is only giving access to the big players.

Arcuru 45 days ago |

> What changed with Mythos Preview is that a model can now take those low-severity bugs (which would traditionally sit invisible in a backlog) and chain them into a single, more severe exploit.

I think this statement seems to align with some of the other independent tests of Mythos[1]. It did very well on long agentic work which I expect is what they trained it for, and that requires being able to find these tangential links between loosely related topics in the context window.

[1] I'm mainly referring to https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos...
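
To make the chaining idea concrete, here is a toy sketch that treats each low-severity bug as a step from one attacker capability to another and searches for a sequence reaching a more severe outcome. Every bug name and capability below is hypothetical, and breadth-first search is just my stand-in for whatever the model actually does.

```python
from collections import deque

# Hypothetical low-severity bugs: (name, capability required, capability gained).
BUGS = [
    ("verbose error page", "unauthenticated", "knows internal paths"),
    ("path traversal in export", "knows internal paths", "reads config"),
    ("token in config file", "reads config", "admin session"),
    ("open redirect", "unauthenticated", "phishing link"),
]


def find_chain(bugs, start, goal):
    """Breadth-first search for the shortest bug sequence from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        capability, path = queue.popleft()
        if capability == goal:
            return path
        for name, needs, gains in bugs:
            if needs == capability and gains not in seen:
                seen.add(gains)
                queue.append((gains, path + [name]))
    return None  # no chain reaches the goal


chain = find_chain(BUGS, "unauthenticated", "admin session")
print(chain)
# ['verbose error page', 'path traversal in export', 'token in config file']
```

None of the three bugs in the chain would rank high on its own; the severity comes from the path between them.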

btown 45 days ago |

This is worth a read specifically for this section and the ones following it, re: custom vs. agentic-coding harnesses. https://blog.cloudflare.com/cyber-frontier-models/#why-point...

Claude Code's harness is remarkable for many use cases, particularly with 1M context sizes. But it's also limited when the scale of code or data to read becomes close to that, or exceeds it. The idea that a cluster of actors can work on a shared, structured set of context snippets, and have guidance around what is relevant to them, is an incredibly useful model outside of cybersecurity as well.
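
A minimal sketch of that shared-context idea, assuming a simple tag-based notion of relevance: snippets go into one pool, and each actor only sees the ones matching its interests. The `SharedContext` class, the tags, and the actor names are all hypothetical and not taken from the post.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Snippet:
    text: str
    tags: Set[str]


@dataclass
class SharedContext:
    """A shared pool of snippets; each actor reads only what matches its tags."""
    snippets: List[Snippet] = field(default_factory=list)

    def add(self, text: str, *tags: str) -> None:
        self.snippets.append(Snippet(text, set(tags)))

    def view_for(self, interests: Set[str]) -> List[str]:
        # A snippet is relevant if it shares at least one tag with the actor.
        return [s.text for s in self.snippets if s.tags & interests]


ctx = SharedContext()
ctx.add("parse_header() trusts Content-Length", "parser", "memory")
ctx.add("run_hook() shells out with user input", "exec")
ctx.add("auth tokens are checked in middleware", "auth")

# Hypothetical actors and the tags they care about.
actors: Dict[str, Set[str]] = {
    "memory_hunter": {"memory"},
    "injection_hunter": {"exec", "auth"},
}
views = {name: ctx.view_for(tags) for name, tags in actors.items()}
print(views["memory_hunter"])     # ['parse_header() trusts Content-Length']
print(views["injection_hunter"])  # the run_hook and auth snippets, in insertion order
```

Each actor's prompt then stays small no matter how large the shared pool grows, which is the part that seems useful outside cybersecurity.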

jerrythegerbil 45 days ago |

This blog was written by AI.

vbernat 45 days ago |

I don't understand why Cloudflare got unrestricted access while Daniel Stenberg got Mythos run by a third party on cURL and only got a report. Well, I understand, but I may be wrong.

whh 45 days ago |

The pushback is quite funny. I have found, in my own usage, that I had to evidence my legitimate access to the codebase before it would proceed.

Archit3ch 44 days ago |

> Programming language - C and C++ give you direct memory control and, with it, bug classes - buffer overflows, out-of-bounds reads and writes - that memory-safe languages like Rust eliminate at compile time. We saw consistently more false positives from projects written in memory-unsafe languages.

Re-write your Rust into C++ to drown the attacker in false positives? ;)

jongjong 45 days ago |

> we tried letting the model write its own patches and watched a few go out that fixed the original bug while quietly breaking something else the code depended on.

This is something I've been anticipating. Imagine this happening on a 500k+ line project scattered across 10+ repos.

It would be easier and cheaper to pay me to rewrite the whole thing from scratch than to fix all the vulnerabilities.

charcircuit 44 days ago |

>Why pointing a generic coding agent at a repo doesn't work

The author of this blog post does not acknowledge the existence of subagents and thinks that it's not possible for a model to come up with multiple ideas and have multiple streams of thought at the same time.

miraculixx 45 days ago |

Did they compare it to other models? A lot of this sounds like this is the first time they have applied AI to security, and they are just amazed at the unreasonable performance of a pattern matching machine. Well, it matches patterns. duh

dools 44 days ago |

> They ingest a lot of source code, hold a single hypothesis at a time, and iterate against it. That's exactly the wrong shape for vulnerability research, which is narrow and parallel by nature

LLMs are trained on Ed Sheeran lyrics

staticassertion 45 days ago |

> The harder question is what the architecture around the vulnerability should look like. The principle is to make exploitation harder for an attacker even when a bug exists, so that the gap between when a vulnerability is disclosed and when it is patched matters less. That means defenses that sit in front of the application and block the bug from being reached. It means designing the application so that a flaw in one part of the code cannot give an attacker access to other parts. It means being able to roll out a fix to every place the code is running at the same moment, rather than waiting on individual teams to deploy it.

So nothing new then.

yieldcrv 45 days ago |

“Sorry Dave I’m afraid I can’t do that“

I’m a security researcher

“Oh in that case”

hydra-f 45 days ago |

Besides the poorly written post, the vulnerability discovery workflow might actually give good results.

pizzafeelsright 45 days ago | |

The part on the harness is spot on.

I have been encouraging people to think about agentic coding in the same way.

Let agents do the reading and writing and inspections. Human does the thinking.

Ask an agent that is looking at a firearm specification schematic "what is wrong with this?" and the response is "this thing contains an explosion and can kill". The human replies "that's the function", when the human should have been asking "based upon the materials used, are the fault tolerances sufficient to maintain structural integrity?"

sherlockx 44 days ago |

"Why it matters"

Cringe, sloppy AI writing.

perching_aix 45 days ago |

It's nice to see them address the instrumentation side of this.

I expressed some concerns along the same lines in the thread about the Mythos evaluation curl did a few days ago, which sounded a lot like the "passing in the repo and telling it go!" type of workflow that this post describes as dramatically less effective.

Disappointed that the post is very slim on details beyond this however. No hard numbers. Not comparatively, not in isolation. Would have arguably been kinda the point.

mring33621 44 days ago |

We got a special thing that did special things. Yay!

wnevets 45 days ago |

I can't wait to be told that Cloudflare is now part of "The Mythos FUD" campaign.

whizzter 45 days ago | |

2 things can be true at the same time.

I think the curl folks finding it underwhelming is more of a testament to their code being subjected to a lot of tests/attacks/auditing over the past years compared to many other codebases. It's not going to find magically insurmountable exploits on its own and "pwn teh w0rld".

At the same time, there is so much shitty non-memory-safe code out there (C/C++ mainly) or logically weak code (much of it vibe-coded or otherwise written by inexperienced devs) that will be easy pickings for anyone pointing Mythos at those codebases/services, and that will eventually lead to chaos, since the cost of a customized exploit has gone from days to months of expensive researcher time to some token spending.

Now if they noticed that they could find exploit chains easily in a lot of popular software, some embargo and hardening to give popular OSS packages time to not be exploitable by default does help people (and the NSA that probably has a preview).

adrian_b 45 days ago | | |

While it is true that C/C++ are prone to bugs when used by careless programmers, Cloudflare also said:

"We saw consistently more false positives from projects written in memory-unsafe languages."

So while there may be a greater probability to find bugs in C/C++ projects, there is also a greater probability that there will be more work that must be done by humans to verify that real bugs have been found.

pixl97 45 days ago | | |

The amount of code that is absolute trash in F500 could drown the world.

Static scanners are OK at finding a few particular types of issues, and really bad at more abstract issues. Also, having rules where you must pass static analysis has to be followed up with actually making sure your code monkeys aren't writing bullshit that confuses the scanner and lets it pass while doing nothing for security (or adding nice logic traps).

Most external security firms looking at code are more useless than a zero with the circle rubbed out. Had a fun example from a while back where the team that wrote the code inserted intentional security flaws to be sure the firm was catching anything. Problem is they were giving access to the entire git history, so these stood out. The moment they just gave flat code, the security team's ability to find flaws disappeared.

LLM models seem to have a pretty good grasp on finding flaws in code like this once you can get the issue to stay in context and execution time. When I hear things like Mythos getting much longer time to work on the problem then at least to me it makes a lot more sense on the number of issues it's picking up.

k33P1Tr3aL 45 days ago |

well how many CVE vulns did it find?

schnitzelstoat 44 days ago |

Nice content marketing piece.

unglaublich 44 days ago |

> Model refusals [..]

That even their model aimed at security research tries to be a pedantic better-than-thou annoys me much.

I build an agentic loop framework at work, and I need the model to test some boundaries and error-mechanisms, but Opus keeps whining that it's not ready to do these "bad" things and tells me to do it myself instead. Makes me roll my eyes...

wutwutwat 45 days ago |

Technically speaking, Cloudflare is, at its core, a security vulnerability itself. World's largest MITM.

reducesuffering 45 days ago |

There will be no mea culpa from folks insinuating Mythos is a marketing stunt. Nor will there be every time AI capabilities repeatedly blast through the naive expectations.