"Programmers don't need unions or professional standards, it will stand in our way of making as much money as we can, and will slow down the speed of software development".
With that said, HN does provide the tools to find users who said things like this, and if I weren't lazy I'd love to find at least a few who said things like the above but are now clutching their pearls over AI so hard that they are going to crush them down to singularities.
And of course I'm just yet another envious hater from "the orange website". Your conscience is clear, AI bros. /s
You need to try really hard to convince me that OpenClaw is more important or has done more good than React or the 10 projects "below" it.
As if any of this matters.
By which metrics?
> This isn’t clowning.
Why?
Because a solo dev has deployed to millions of people in less than eight months spending I believe zero dollars on marketing.
We should all be so lucky to clown at this scale.
I don’t feel the need to spend all day auditing, and I don’t care very much, but generally I think the combination of Nvidia corporate enthusiasm, available GitHub stats and industry analysis tells a pretty coherent story: a project with 70k forks on GitHub is likely to have more than, say, 700k users. My own fork-to-usage ratio is far less than that.
Put another way, I would suggest that most public evidence points one direction. If you believe something else, that’s fine. But if you want to convince me there’s less than, say, 100k deployments worldwide, I’d want to understand where those numbers came from before being convinced.
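To make the fork argument concrete, here is a rough sketch of the ratios involved. The fork and user counts are the round numbers quoted above, not measured data, and `users_per_fork` is just a helper name made up for illustration.

```python
# Implied users-per-fork ratios, using only the round numbers from the comment above.
FORKS = 70_000  # "70k forks on github"


def users_per_fork(users: int, forks: int = FORKS) -> float:
    """Return how many users each fork would have to represent."""
    return users / forks


# The optimistic guess of 700k users implies ten users per fork.
optimistic = users_per_fork(700_000)

# The skeptical floor of 100k deployments implies fewer than two per fork.
skeptical = users_per_fork(100_000)

print(f"700k users -> {optimistic:.1f} users per fork")
print(f"100k users -> {skeptical:.2f} users per fork")
```

Whether either ratio is realistic is exactly what is being argued here; the sketch only shows what each claim implies.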
I don't think they would do that unless this was widely deployed.
This AI boom feels similar: a lot of hype, and the AI usage costs are being subsidized by private equity/VC so far. IPOs are supposed to happen this fall for OpenAI and Anthropic. They're going to have to face the music of corporate governance, accounting rules, reporting revenue, earnings, etc. Subsidizing users seems unsustainable; they need to either jack up rates or downgrade usage per plan. Then there are the circular investments between all of them and Google, Microsoft, etc. Seems like a house of cards.
He did clarify that it was with fast mode. Without fast mode it'd "only" be $300k in raw API cost, or ~60 $200 Codex subscriptions.
Business: Amazing, that’s great what did you do?
I ran 50 instances and had them all fix the same bugs at the same time and then analyzed the results of all 50 runs to have AI score each of the attempts, then sort them, then compare them to each other in a round robin tournament style double elimination to ensure I got the best result. Then I had AI convert this into a skill, and then ran all 50 attempts again and repeated the process to ensure that I had the absolute best result. It was amazing and I used 1.3 billion tokens!
Business: That is amazing! What did you fix?
A spelling mistake on the About page.
Wish me luck on a raise!
Eventually Codex's subscription subsidization will diminish to near-zero, like the rest of the providers.
It's extremely important that people understand how expensive these models currently are. Even $300k in raw API costs is alarming for the output.
Because it does not say “equivalent of”, it literally says he spent money that he did not spend
The money going to the American model companies is not going to their hosting costs.
What I mean by this:
1. Intern, analyst, junior, or offshore level coding is cheaper when done by the machine.
// Side note: There is good reason the industry invests in suboptimal output from this set which moves to the "cost" column when using an LLM, but nobody's accounting for that.
2. For the interns, analysts, junior, or offshoring to do the right thing costs a multiple of the coding effort: the PdM/PjM stuff of course, but also the Stakeholder, Product Owner, Architect, Principal Engineer, QA, and SRE stuff.
3. If you are not a principal- or staff-level engineer, you are likely unqualified to catch and fix the errors LLMs make across engineering, much less across the rest of the PDLC (product development lifecycle, which includes SDLC and SRE) loop.
4. For LLM output to be useful, your 'harness' has to incorporate all of that as well, which because it's so much harder than transliterating spec-to-code, balloons tokens exponentially.
5. Today it is faster, more efficient, and costs less, to work with LLMs "XP" (eXtreme Programming) style, pairing with the LLM actively co-creating and co-reviewing, steering for more effective turns.
So, your options are:
- ship garbage while costing less than a median first world SWE
- pair with the LLM actively for the benefits of XP
- add enough harness and steering that the LLM costs more than SWEs, and still needs a human loop “move fast and break things to find out what's broken” style
I would expect that within a couple of years, these other disciplines can be baked in enough that the machine costs less for everything but surprises.
They already are. I’m successfully using frameworks like bmad to deliver complex apps at that level. My job is to manage the SWE, QA, UX, and SRE processes and catch errors.
I spend more time refining PRDs, epics, and stories than I do elbows deep in code.
If I don’t like the output of a story, I nuke it, change the story, and have the flanker try again. I’m using the open source GLM, Kimi, and DeepSeek models. I expect the full pipeline to be good enough by the end of the year.
Y'all can't have it both ways; either it's really worth the cost or it's a bunch of token burn with no smoke.
They literally are. (If by "all this" you mean the subscription future bait-and-switch plans.)
Let's say I was at the casino and was spending a lot on casino chips, but I also happen to work at the casino. I'm not really losing money whether I win or lose, since I'm using the house's money and there's little risk involved in every dice roll or press of the button. The risk is far higher if I don't have that level of access and continue to spend the same amount of money on lots of tokens (or casino chips, spins or button presses).
The same is true here with these agents. Some companies will realize that they can no longer afford to spend millions a month on tokens or even startups spending $5k - $6k per person per month on tokens.
I can only see efficient local models making sense as a way to recover from this unnecessary spending, or even light gambling, on tokens.
Doubtful lol, dude's killing the environment just for fun at this point.
[0] https://old.reddit.com/r/ProgrammerHumor/comments/1teswot/pe...
I won’t lie, if I had the access to this, I’d do the same exact thing.
Privacy: Reuses existing provider sessions — OAuth, device flow, API keys, browser cookies, local files — so no passwords are stored.
macOS permissions: Full Disk Access for Safari cookies, Keychain access for cookie decryption and OAuth flows...
It's excellent this is disclosed as a reminder of how things work and the tradeoffs you're making to use it.
Btw, same frustration for me setting up Signal, WhatsApp or Slack...
We know it’s totally stupid, but unfortunately tokenmaxxing is real. I know our management line isn’t that dumb, but this is what you get when the business is selling it.
One person using 600B tokens in a month. The most I’ve hit is around 500M tokens and I thought that was a huge amount.
We’re going to have some major compute shortages for a while
> humanity is going to need 1000x the current energy production
and all that heat is going to go where?
I use more than 150B/month with just 15 codex accounts.
60 accounts is "just" $12,000/month. So Peter could "save" 100x by using monthly accounts.
Of course, he doesn't have to, as he works at OpenAI now.
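For anyone who wants to check the subscription math in this subthread, a quick back-of-the-envelope sketch follows. Every input is a figure quoted by commenters; the $1.3M fast-mode total is taken from elsewhere in the thread and treated here as an assumption, as is the idea that every account yields the same token throughput.

```python
# Back-of-the-envelope check of the subscription math quoted in this subthread.
# All inputs are numbers quoted by commenters, not measured data.

TOKENS_15_ACCOUNTS = 150_000_000_000  # "more than 150B/month with just 15 codex accounts"
ACCOUNTS_OBSERVED = 15
TARGET_TOKENS = 600_000_000_000       # "600B tokens in a month"
SUBSCRIPTION_USD = 200                # price of one monthly plan
API_COST_FAST_USD = 1_300_000         # assumed headline figure with fast mode
API_COST_NORMAL_USD = 300_000         # "$300k in raw API cost" without fast mode


def accounts_needed(target_tokens: int) -> int:
    """Accounts required, assuming each one matches the observed per-account throughput."""
    per_account = TOKENS_15_ACCOUNTS // ACCOUNTS_OBSERVED
    return -(-target_tokens // per_account)  # integer ceiling division


accounts = accounts_needed(TARGET_TOKENS)
subscription_total = accounts * SUBSCRIPTION_USD

print(f"accounts needed:      {accounts}")
print(f"subscription total:   ${subscription_total:,}/month")
print(f"savings vs fast mode: {API_COST_FAST_USD / subscription_total:.0f}x")
print(f"savings vs normal:    {API_COST_NORMAL_USD / subscription_total:.0f}x")
```

Under these assumptions the "100x" only holds against the fast-mode figure; against the $300k number the gap is closer to 25x.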
Narrator: there was no moat
For me it's not even a "what the hell are you working on" so much as complete inability to understand how you can keep so many different processes working on distinct tasks. It simply doesn't map on to how I use these tools.
I spend most of my day writing extremely detailed prompts and that's how I'm able to get the sort of excellent results that confound skeptics. But I have to be honest with you: I don't think I can write (or think) fast enough to do two of these at a time, much less 15.
I definitely could not review what they are generating with any degree of confidence.
I'm really hoping you can explain what the heck your usage pattern actually looks like, because reading this makes me feel like I'm missing something.
Building compilers has a _lot_ of parallel tasks agents can work on.
Wish me luck..
Just last week I saw a dude boasting about how they used their $20/month ChatGPT subscription to earn $15 (or similar trivial amount) in a bug bounty by running the model the whole day. Sam Altman replied to that tweet but not entirely positively.
OpenAI has been removing limits on token usage to take on Anthropic, but I'm sure most of the users they are acquiring are these AI bros who are burning tokens for the sake of it. Massive price hikes are coming after the OpenAI and Anthropic IPOs, probably an order of magnitude larger than what happened to ride sharing.
Grifters gonna grift. What a state of affairs.
Hopefully we will eventually go back to evaluating the output. Not that I am very hopeful that we will learn to do it in a sensible way.
If you look at what happened with Sora, you know none of this matters.
Just wait till this OpenClaw thing is over.
Has everyone gone crazy?
Cool heads will prevail
[0] https://github.com/steipete/CodexBar
However, I do not see a strong reason to believe that this is his actual, personal usage. It could be all openclaw usage or some subset of openai usage, given that he is inside them. I suspect it is far more likely to be fake data [1] that exercises the graph library in a visually satisfying way. Notice that it has no usage for a 'week' after April 15 (a Wednesday), but picks up a bunch later. As marketing copy it needn't have any basis in reality [2]. I should hope openai would put a procedure in front of their entrepreneur acquisition that prevents accidentally exposing trade secrets [3].
[1] https://github.com/faker-js/faker
[2] https://www.reddit.com/r/proceduralgeneration/comments/lf2n4...
[3] https://tvtropes.org/pmwiki/pmwiki.php/Main/PostingWhatYouSh...
I’d actually seen the original DB episode years before when it first aired and it definitely had an effect on me through this form of manipulation - it altered my internal understanding of marketing/advertising, which was the actual underlying purpose of the episode.
It’s altered how I internally accept and process information from any 2nd or 3rd hand source. BTW, people aren’t necessarily always aware they’re doing it. We all suffer from our own internal biases and deceptions, and sometimes we spread them unknowingly!
i built my personal app mostly with ollama and it’s been smooth sailing so far. basically openclaw + hermes-style agents running on android phones, and the stuff it can do is kinda insane
He was. When it comes to marketing. This is what most people don't understand. Peter is a great marketing guy who got hired because of a hype vision, not because he is an outstanding engineer. Think of it like OpenAI hiring the MrBeast of the coding world.
We really need better standards for disagreement.
Now let's wait until the moderators clean up the wrongthink. He also has censors on his side.
Opencode has the same problems. They often do multiple releases of that app a day, yet within the span of a week or two I have had to update my config because some random change has altered the behaviour and my permissions broke. Or I've noticed the way the app renders is suddenly different.
Yet, my day to day usage has barely changed since the version I installed last year. It's like everything changes but nothing changes.
https://github.com/openclaw/openclaw/commits/main/
GitHub insights over the last week.
Excluding merges, 216 authors have pushed 5864 commits to main and 6568 commits to all branches.
On main, 6965 files have changed and there have been 418,110 additions and 126,691 deletions.
In one month Peter makes 12k commits. So he is spending about $100 per commit depending on how much other stuff is going on.
That means he spends about
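A rough sketch of the per-commit arithmetic, for reference. The 12k commits per month comes from the comment above; the two monthly cost figures are the ones quoted elsewhere in the thread and are assumptions here, not audited numbers.

```python
# Rough cost-per-commit estimate from figures quoted in the thread.
COMMITS_PER_MONTH = 12_000       # "In one month Peter makes 12k commits"
MONTHLY_COST_USD = {
    "fast mode": 1_300_000,      # assumed headline figure quoted in the thread
    "normal mode": 300_000,      # "$300k in raw API cost"
}


def cost_per_commit(monthly_cost_usd: float, commits: int = COMMITS_PER_MONTH) -> float:
    """Spread a monthly bill evenly across commits (ignores all non-commit usage)."""
    return monthly_cost_usd / commits


for label, cost in MONTHLY_COST_USD.items():
    print(f"{label}: about ${cost_per_commit(cost):.0f} per commit")
```

The "about $100 per commit" estimate only matches the fast-mode figure; at raw API prices without fast mode it comes out to roughly a quarter of that.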
All projects can become fast if they drop guardrails.
This does not correlate with productivity increase
That doesn't sound very positive to me...
I just checked the code and feature outputs, and I can build all that in 15 days, for 1.3M USD. Fuck I would do it for 1M...
Scratch that, if it's 300K then sure, I could do the same too, if you paid me that for 30 days of work. Lmao, the quality and the feature volume are just not worth paying that much money for.
I am not saying this because I don't like LLMs or I may think that AI coding can't work, but folks whatever openclaw has built for that much money is not worth nearly that much money...
Do existing companies run entire end-to-end product integration tests on every single change they make to a repo to make sure something hasn't broken? No, they just architect things in a way such that a minor change to something can be tested in isolation. And that can be automated, deterministically and efficiently.
Where I work we can release changes to our production site in minutes almost completely autonomously with high confidence with absolutely zero AI agents in the loop. How did we do it? With lessons learned from the past 5 decades of professional software development experience.
Let's not forget what OpenClaw is at its core. It's a glorified cron scheduler. Why on earth does any of this effort need to exist? It's not that deep, it's not that complex, it's all AI for AI's sake.
Yes, that is _exactly_ the problem that is being solved. Is it easier to spin up some LLMs or pay a team of experienced engineers?
As inference costs fall, which will be cheaper?
I run it in a firewalled VM and am very conscious about any tokens I give it access to - so far for all I know this was unnecessary.
PS. for me the core feature of OpenClaw isn't the cron, though that is nice. It's the memory and instant extensibility. Like it takes 5-15 minutes to add an SSH tool where all agent requests go through a manual review, together with a good auto loaded description that just works in all future sessions.
He has a different opinion of what it means to be lean than almost everyone else. That's fine, he's allowed to, but it's something you have to understand to make sense of any of his comments on things. He has a radically different set of values to most people.
This is clearly an implementation and not a conceptual issue, as I had none of these issues using the same model with Hermes, for example.
And do you enjoy this more than writing code? I used to look forward to writing code, solving these little optimization puzzles, learning, and staying sharp. Working with agents is dreadful in comparison. They lie, rarely learn, and I feel like a proctor.
Sure, you sometimes get to see something amazing, but usually I am just very annoyed by their performance and ever-changing but never-ending billing issues. First, with Claude Code, now with Codex, which was fine for a minute, but now I am out of tokens for the majority of time. (I don't have the income for those Pro INTx plans.)
Now, I'm master of about a thousand lines of pricing code plus documentation and research which actually matters. The AI can handle the rest as a very skilled junior with a TBI.
You absolutely are a proctor, or senior manager. The AI is the smartest most well read junior you will ever meet, but don't go out of its happy path.
As you go out of the commonly read happy path for CRUD apps, you'll have to get more and more involved. I wouldn't write a new kernel design with AI right now, I might write a Linux kernel driver with it though.
They make more money from inference than they do training the model, but then the next model gets so much more expensive to train so their annual figures have been in the red.
The hard part is not building such toys, it's the convincing people with money to buy said toy. This is where he earned his applause.
So yeah, it's misleading, but in the other direction.
It sucks at both security and usability as a result (all the vibe-designed security layers are constantly getting in my way).
He has agents write shitty code for features other agents think other people want, then has it reviewed by other agents in hopes of catching bugs that the first agent put there, then has some more agents try to find security bugs in the now double-agented code to make it triple-agented and at the end of the day, he spent a shitton of tokens, probably emitted enough carbon to heat our planet by another degree, and has a feature nobody really asked for that might or might not work.
He then has the sense of humor to call this grotesque process "incredibly lean".
What's the point in all of this? What problems is this solving? Who's benefiting?
The moral issues around consumption and climate impact are not his alone, nor are they unique to his endeavor. Every company with an enterprise LLM agreement has a share, for instance.
Firstly, who TF would use that crap in the first place at all? Yeah, he did some crap he got paid for. So did the people who created the addictive algorithms for social media, or the creators of the brainrot videos that infest kids' minds. Should we applaud them too?
> What's the point in all of this? What problems is this solving? Who's benefiting?
The economy doesn't work the way you think it does. It's not central planning. All the usages aren't detailed in a specification, submitted for approval to 100 agencies and then allowed to be used.
It shows lack of intellectual curiosity to not engage deeply with obviously profound technology and what the implications are. I find this exercise helpful.
Peter is predicting how LLMs will be used in the future when the prices go down. And they will definitely go down. I think his predictions are correct and we will definitely have something similar to OpenClaw.
like one bot finding similar issues and PRs, then another bot closing issues for "lack of activity", meanwhile people are reacting and pleading to speak to a real human?
Congrats builders of the future, you've turned software development into automated voice systems.
I'm aware. That is in fact my central critique. The way it works is incredibly wasteful of our limited resources, as illustrated by this guy burning through fuel during a time of crisis for no perceptible gain.
> It shows lack of intellectual curiosity to not engage deeply with obviously profound technology and what the implications are.
The "obviously profound" is an assertion without proof.
The rest I agree with, we should engage with the implications of burning through energy to build features that bots think humans want, but nobody actually asked for, all while climate scientists are telling us we're heading for the apocalypse. It is intellectually incurious to just ignore the questions of why and at what cost, maybe even dangerously so.
I didn't know that studying photocopiers is suddenly linked to "intellectual curiosity". Being a photocopier maintenance guy was always considered boring.
What you put on top of the machine was intellectually interesting.
“He has /people/ write shitty code for features other /people/ think other people want, then has it reviewed by other /people/ in hopes of catching bugs that the first /people/ put there, then has some more /people/ try to find security bugs in the now /double-peopled/ code to make it /triple-peopled/ and at the end of the day, he spent a shitton of /money, the people/ probably emitted enough carbon to heat our planet by another degree, and has a feature nobody really asked for that might or might not work.”
Honestly sounds like a normal tech company to me. Just with much dumber “people” who are getting exponentially smarter, eventually never die, eventually never forget.
You have to skate to where the puck is going, not where it is.
They haven't gotten any smarter yet, let alone exponentially smarter. They are still the same dumb parrots that they were in the beginning.
The execution in the case of OpenClaw is a hot mess.
I don't think there's any way most people would call that lean. It's lean on exactly one axis, which is people, but no one really cares about that; people are always a proxy for cost.
If these methods prove successful it isn't going to matter. A user doesn't care if code is 'slop' or artisanal, so long as the app/site/whatever works.
If you can combine autonomous flows (and millions of dollars in tokens) to produce work comparable to a traditional engineering team, then why would the user care which wrote the app/site/whatever?
You should try playing the game “workers and resources”; it’s a simcity like game, but based in the Soviet system of central planning, not capitalism. It will make you loathe the inefficiencies in central planning.
The appropriate comparison is command vs market. Capitalism is efficient in utilising the characteristics of humans to bring about expansion of markets.
The site says 1200 GitHub contributors, and looking on GitHub there are now 2105, so it doesn't seem to be dropping that much.
The reality is if the thing can’t survive financially without being subsidised in the long run - it deserves to die.
There is no trend showing that these expensive things exist in the long run in this manner - it’s pure speculation and, for many, hopes and dreams.
Who knew it was that simple..?
Agricultural mechanisation didn't eliminate human labor over the 20th century. A huge fraction of the world's farmers have little or no mechanization today, well over a century after the invention of the revolutionary farm tractor.
With apologies to Ada Lovelace, humanity has been writing code in anger for only, like, 80 years? We'll still be at it in 100 more.
I'm personally just impressed with the rate of improvement and _hope_ that it will continue, and that inference prices will fall (or on-device LLM become more feasible/powerful).
Anyway I appreciate your perspective even though I don't necessarily share it
It's a very simple question, the subthread you created based on reducing everything to "he did a thing" and calling the comment you didn't interact with at all "agonizing".
Why not rather leave it at "they wrote a comment"? What is so hard to understand about that, to use your words?