What it feels like to work with Mythos (oneusefulthing.org)
The first item in the article, the first thing it showed, was wrong.
It is 100% faster to go from London to New York in 1881 than to Volgograd, or to any of the Russian hinterland colored green, or to Turkey or Egypt.
Not a good look.
It also burned through my usage quota like a late-90s Hummer.
In a project like mine (https://github.com/tsz-org/tsz) I was constantly frustrated that models were not doing enough research and were not taking other situations into account. Again and again models would produce code that fixed one thing and broke two other tests that were "unrelated".
With Fable it seems like tasks are taking much longer (I have not seen a pull request from Fable sessions yet), but reading the transcripts of those sessions I can see how it is doing the right thing by leaving no stone unturned.
As the article says, it's hard to communicate this "feeling" about models because it is very project specific, but I thought I'd share.
A small portion of this effort is having a high-quality Lua-in-Rust repo. I’m using Mythos to fix some of the performance issues with my Lua interpreter that GPT 5.5 / Opus 4.8 had stonewalled on.
Not sure if Mythos will be able to crack this but it has been running for a couple hours now with some promising results.
Performance charts are linked here if you're curious: https://github.com/ianm199/lua-rs
I don't consider myself a genius, but for my workflow(s), for what and how I've grown accustomed to building over the last 6-9 months, and for the speed at which I'm able to produce entirely new integrated platform features, DS just isn't cutting it compared to Anthropic's models.
https://isochronic-passage-chart.netlify.app/
Doesn’t work too well on mobile, but it looks interesting.
> Again, it wasn’t perfect. As an expert, I was able to spot some errors and omissions (some as a result of the design I had asked for) that I had the AI correct
That's the bit that stuck out to me - that's longer than I would expect to work on a problem in a day or even expect to go back & fix the output of something that has a core reward loop of hours.
My customers are currently clamoring for me to push my agent response times down from 85 seconds to below the 20-second mark.
At the same time, it is very dissonant to see the industry heading towards hour-plus workflows with an agent.
What makes me excited is that GPT 5.6 (it's actually GPT 6) is going to be crazy.
> Switched to Opus 4.8: Fable 5 has safety measures that flag messages on most cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Send feedback or learn more.
He is a professor but sadly also an AI shill. He should switch to advertising washing powder.
We're gonna go back to the days where our bosses ask why we're just sitting around, but instead of saying "compiling," we'll just say, "waiting for Claude."
Will Claude's code be perfect in one shot? Probably not. Will it get you 80 to 90% of the way there with your chosen design patterns in under a few hours? Absolutely.
At this point, pay me significantly more, and I'll do it.
I'm amazed we're so far into SOTA bloat that the Chinese will kill once they start etching silicon with these models.
There are people that almost feel physical pain if something is unnecessarily incorrect.
+ That if the mental model of something is accurate, it is actually _more_ work to say something that is incorrect than just saying the correct thing.
Similar to "My game just crashed".
Jira otoh is not yours, because it's in the cloud. It might be "my internet connection", "my browser" or "my account" that is having trouble.
___
Hm. "My train got delayed" is interesting in this context. I don't find that offensive. But that also might be because trains don't seek rent.