Labor market impacts of AI: A new measure and early evidence (anthropic.com)
If you claim regulation favors the incumbents, well take it up with parent commenter whose comment seems to imply that regulation is needed!
It is a form of regulatory capture.
These companies are not going to regulate themselves. Capitalism is going to drive them to relentlessly compete for growth at all costs, a lot of which would be imposed on society. At least they are honest that regulation is needed, unlike Big Tobacco.
We basically have ~40 components and 6 pages to go until complete rewrite, I am sure we will run into bumps in the road, but it's been crazy to watch.
We also added i18n (English + Spanish), ThemeProvider for white labeling solution, and WCAG 2A compliance, all in one shot.
If I went to a third party and asked them to rewrite just the static pages it would have been $200k and 3 months of work.
I think experienced people move faster because they can evaluate the output and redirect it, less experienced people often struggle because they don’t yet know what “good” looks like.
The interesting long-term question is how companies rebuild the environments where that judgment gets developed in the first place.
No one knows what’s going to happen in the future. Yes there already are fewer SWE jobs than before because of AI, and yes the days of companies hiring new grads in droves at $300k+ packages are likely over. IMO all you can really do is study what you’re interested in, learn it deeply, and do good work with cool people. If unsure, it’s possible to go back to what you were doing before if the new path doesn’t work out.
The TL;DR is that there is little measurable impact (and I'd personally add "yet").
To quote:
"We find no systematic increase in unemployment for highly exposed workers since late 2022, though we find suggestive evidence that hiring of younger workers has slowed in exposed occupations"
My belief based on personal experience is that in software engineering it wasn't until November/December 2025 that AI had enough impact to measurably accelerate delivery throughout the whole software development lifecycle.
I have doubts that this impact is measurable yet - there is a lag between hiring intention and impact on jobs, and outside Silicon Valley large scale hiring decisions are rarely made in a 3 month timeframe.
The most interesting part is the radar plot showing the lack of usage of AI in many industries where the capability is there!
Gemini 3 and Opus 4.6 were the "woah, they're actually useful now!" moment for me.
I keep saying to colleagues that it's like a rising tide. Initially the AIs were lapping around our ankles, now the level of capability is at waist height.
Many people have commented that 50% of developers think AI-generated code is "Great!" and 50% think it's trash. That's a sign that AI code quality is that of the median developer. This will likely improve to 60%-40%, then 70%-30%, etc...
I think there are some advantages to being first.
It's time to re-evaluate strategies if we've been operating under the assumption that this is going to be a bubble, or otherwise largely bullshit. It definitely works. Not everywhere all the time, but often enough to be "scary" now. Some of my prior dismissals like "text 2 sql will never work" are looking pale in the face today.
Also, it seems to me the concept of "observed exposure" is analogous to OpenAI's concept of "capability overhang" - https://cdn.openai.com/pdf/openai-ending-the-capability-over...
I think the underlying reason is simply because companies are "shaped wrong" to absorb AI fully. I always harp on how there's a learning curve (and significant self-adaptation) to really use AI well. Companies face the same challenge.
Let's focus on software. By many estimates code-related activities are only 20 - 60%, maybe even as low as 11%, of software engineers' time (e.g. https://medium.com/@vikpoca/developers-spend-only-11-of-thei...) But consider where the rest of the time goes. Largely coordination overhead. Meetings etc. drain a lot of time (and more the more senior you get), and those are mostly getting a bunch of people across the company along the dependency web to align on technical directions and roadmaps.
I call this "Conway Overhead."
This is inevitable because the only way to scale cognitive work was to distribute it across a lot of people with narrow, specialized knowledge and domain ownership. It's effectively the overhead of distributed systems applied to organizations. Hence each team owned a couple of products / services / platforms / projects, with each member working on an even smaller part of it at a time. Coordination happened along the hierarchy of the org chart because that is most efficient.
Now imagine, a single AI-assisted person competently owns everything a team used to own.
Suddenly the team at the leaf layer is reduced to 1 from about... 5? This instantly gets rid of a lot of overhead like daily standups, regular 1:1s and intra-team blockers. And inter-team coordination is reduced to a couple of devs hashing it out over Slack instead of meetings and tickets and timelines and backlog grooming and blockers.
So not only has the speed of coding increased, the amount of time spent coding has also gone up. The acceleration is super-linear.
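A back-of-the-envelope version of that claim, with made-up numbers: if output is roughly (share of time spent coding) times (coding speed), then raising both at once multiplies the gains. The 20% and 60% shares and the 2x speed below are illustrative assumptions, not measurements.

```python
def relative_output(coding_share, coding_speed):
    """Output per week, modelled as share of time spent coding times coding speed."""
    return coding_share * coding_speed

# Assumed numbers: 20% of the week coding before, 60% after; AI doubles coding speed.
before = relative_output(coding_share=0.2, coding_speed=1.0)
after = relative_output(coding_share=0.6, coding_speed=2.0)

gain = after / before  # roughly 6x, versus 2x from the speed-up alone
print(f"{gain:.1f}x")
```

The speed-up alone would give 2x and the extra coding time alone 3x; together they compound.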
But, this headcount reduction ripples up the org tree. This means the middle management layers, and the total headcount, are thinned out by the same factor that the bottom-most layer is!
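To make the "same factor" point concrete, here is a toy org tree. It assumes a fixed span of control (5 reports per manager) and 625 individual contributors, with each team of 5 collapsing into one person; all of these numbers are hypothetical.

```python
import math

def headcount_by_layer(num_ics, span):
    """Return headcount per layer, bottom-up, for a tree with a fixed span of control."""
    if span < 2:
        raise ValueError("span of control must be at least 2")
    layers = [num_ics]
    while layers[-1] > 1:
        # Each manager covers up to `span` people in the layer below.
        layers.append(math.ceil(layers[-1] / span))
    return layers

before = headcount_by_layer(num_ics=625, span=5)  # [625, 125, 25, 5, 1]
after = headcount_by_layer(num_ics=125, span=5)   # [125, 25, 5, 1]
print(sum(before), sum(after))  # 781 156
```

Total headcount drops from 781 to 156, almost exactly the 5x applied at the leaf layer, because every management layer is sized off the one below it.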
And this focused only on the engineering aspect. Imagine the same dynamic playing out across departments when all kinds of adjacent roles are rolled up into the same person: product, design, reliability...
These are radical changes to workflows and organizations. However, at this stage we're simply shoe-horning AI into the old, now-obsolete ticket-driven way of doing things.
So of course AI has a "capability overhang" and is going to take time to have broad impact... but when it does, it's not going to be pretty.
Look at GPT 5.4 and Opus, we’re clearly hitting diminishing returns already and these guys are pumping unsustainable amounts of money into them.
I’m bullish on AI, it’s been a net positive for me and my team. All I see here though is propaganda disguised as science to convince businesses to shrink their engineering budgets and redirect it to AI companies.
TL;DR: AI company says AI is amazing, more at 10.
Kinda done with this.
If you have something important to say, say it up front and back it up with literature later.
note that this concept was not invented by OpenAI
Here is my take on AI's impact on productivity:
First let's review what LLMs are objectively good at:
1. Writing boilerplate code
2. Translating between two different coding languages (migration)
3. Learning new things: summarizing knowledge, explaining concepts
4. Documentation, menial tasks
At a big tech product company #1 #2 #3 are not as frequent as one would think - most of the time is spent in meetings and meetings about meetings. Things move slowly - it's designed to be like that. The majority of devs are working on integrating systems - whatever their manager sold to their manager and so on. The only time AI really helped me at my job was when I did a one-week hackathon. Outside of that, integrations of AI felt like more work rather than less - without much productivity boost.
Outside, it has proven to be a real productivity boost for me. It checks all the four boxes. Plus, I don't have to worry about legal, integrations, production bugs (eventually those will come).
So, depends who you are asking -- it is a huge game changer (or not).
It is quite good at following most orders. Hence why you must ALWAYS be in the loop. AI can augment, but not replace. Maybe some day it might. But it's not now, even with the latest SOTA models.
I let AI write my emails for me, but never give it the ability to hit send. I let AI access my data to make informed decisions, but never let it make the final decision.
You may think I'm being paranoid, but I'm a very cautious person. I don't jump into new technology fresh out of the oven and this has served me well for the last 15 years. (I learned my lesson courtesy of MongoDB).
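One way to enforce "draft yes, send no" in code is to make approval a separate step that only a human triggers. This is a minimal sketch of that idea; the `EmailDraft`, `approve`, and `send` names are hypothetical and the transport is just a callable.

```python
from dataclasses import dataclass

@dataclass
class EmailDraft:
    to: str
    body: str
    approved: bool = False  # only a human action should flip this

def approve(draft):
    """Meant to be called from the human's UI, never by the model."""
    draft.approved = True

def send(draft, transport):
    """Refuse to send anything a human has not approved."""
    if not draft.approved:
        raise PermissionError("draft has not been approved by a human")
    transport(draft.to, draft.body)
```

The model only ever produces `EmailDraft` objects; the send path fails closed until someone approves.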
With AI, I am taking the same approach. Experiment, understand the limits and only then implement. Working really well so far and have managed to automate tons of tedious tasks from emails to sales to even meetings.
I don't use Clawdbot, nor any library. I wrote my own wrappers for everything using Elixir. I used Instructor and Ash framework with Phoenix and a bunch of generators to automate tedious tasks. I control the endpoints the models are loaded from (OpenRouter) and use a multi-model flow so no one company has enough data about me. Only bits and pieces of random user IDs.
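The multi-model idea can be sketched roughly like this (in Python rather than Elixir, and not the actual wrappers described above): rotate fragments of a task across providers and attach a fresh random user ID to each request. The model identifiers and the request shape are hypothetical.

```python
import itertools
import uuid

# Hypothetical model identifiers; substitute whatever your router exposes.
MODELS = ["provider-a/model-x", "provider-b/model-y", "provider-c/model-z"]
_rotation = itertools.cycle(MODELS)

def build_request(prompt_fragment):
    """Pair one fragment with the next provider in rotation and a fresh random user ID."""
    return {
        "model": next(_rotation),
        "user": uuid.uuid4().hex,  # new ID per request, so requests cannot be linked by ID
        "prompt": prompt_fragment,
    }

batch = [build_request(part) for part in ["part one", "part two", "part three"]]
```

Each provider then sees only one fragment under an ID it has never seen before. Whether that is enough depends on what the fragments themselves reveal.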
Privacy is the real challenge with AI.
Lol why? You've been suckered in and will eventually crash and burn. But carry on.
Just remember when things go wrong - it's your ass on the line.
AI is making everyone faster that I’ve seen. I’d say 30% of the tickets I’ve seen in the last month have been solved by just clicking the delegate to AI button
I agree it's in the 2-7 person range.
The challenge for those teams is distribution. They will crush at building, but I'm not sure how they can crack distribution. Some will, but maybe there is a way to help thousands of small teams distribute.
Big corporations are full of people who love to entertain 20+ people in video calls. 1-2 people speak, the others nod their heads while browsing Amazon.
I wouldn’t be sad if those jobs vanished.
(1) LLMs are basically Stack Overflow on steroids. No need to go look up examples or read the documentation in most cases, spit out a mostly working starting point.
(3) Learning. Ramping up on an unfamiliar project by asking Antigravity questions is really useful.
I do think it makes devs faster, in that it takes less time to do these two things. But you're running into the 80% of the job that does not involve writing code, especially at a larger company.
In theory, this should allow a company to do more with fewer devs, but in reality it just means that these two activities become easier, and the 80% is still the bottleneck.
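This bottleneck argument has the same shape as Amdahl's law (my framing, not the commenter's): if only 20% of the job is coding, even an enormous coding speed-up barely moves the total. A quick sketch, taking the 20% share from the comment above and the speed-up factors as assumptions:

```python
def overall_speedup(coding_share, coding_speedup):
    """Amdahl's law: only the coding share of the job gets faster."""
    return 1.0 / ((1.0 - coding_share) + coding_share / coding_speedup)

# With 20% of the job being code, the overall gain never reaches 1.25x.
for s in (2, 10, 1000):
    print(s, round(overall_speedup(0.2, s), 3))
```

Doubling coding speed yields about an 11% overall gain, and the ceiling is 1 / 0.8 = 1.25x no matter how fast the coding part gets.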
That, and I've never had to beg an LLM for an answer, or waste 5 minutes of my life typing up a paragraph to pre-empt the XY Problem Problem. Also never had it close my question as a duplicate of an unrelated question.
The accuracy tends to be somewhat lower than SO, but IMO this is a fair tradeoff to avoid having to potentially fight for an answer.
Are you generating revenue or, otherwise, what productivity are you measuring?
Without generating revenue (which to be clear is a very good proxy to measure impact) everyone can be indeed very prolific in their hobbies. But labor market is about making money for a living and unless you can directly impact your day-to-day needs from your work, it can't be called productive.
At my previous employer, I was generating $2.5million per year (revenue per employee). I didn't ship a single line of code. All the time was spent trying to convince various stake holders.
Now, I have already built a couple of apps that help me better manage my tech news (keeps me sane) plus I am writing a blog that generates $0. It's only been a month.
If you measure the immediate dollar value, you are right. But in life, pay-offs are not always realized immediately. Just my opinion anyway.
Working on a side project, and it's truly incredible how good AI has been for MOST of it.
Also, bewildering how truly awful it was at some seemingly random things - like writing not terribly difficult Assembly that mostly exists already to do Go-style hot splitting (to even get it to understand what older versions of Go did).
I suspect it'll still be 3 years before AI is as good at the FAANGs as it is outside, just due to the ungodly huge context and the amount of proprietary stuff it would need to learn to use effectively, plus getting all the access to it, etc.
But, even when it does all that, that's maybe 33% of the job.
I just don't see mass layoffs at the really big tech companies, unless it's more focused on just cutting and cutting than actually because people have been made redundant.
Even at the management level, I'm not sure we're going to see managers managing teams of 30 instead of teams of 10.
At the end of the day, a manager needs to know what you're doing and if you're any good at it, and there's only so many people a person can do that effectively with.
Maybe low-level managers go away, and it's just TLMs, but someone still needs to do your 1-on-1s and babysit those that need babysat.
I have a game written in XNA
100% of the code is there, including all the physics that I hand-wrote.
All the assets are there.
I tried to get Gemini and Claude to do it numerous times, always with utter failure of epic proportions with anything that's actually detailed.
1 - my transition from the lobby screen into gameplay? 0% replicated on all attempts
2 - the actual physics in gameplay? 0% replicated, none of it works
3 - the lobby screen itself? non-functional
Okay so what did it even do? Well it put together sort of a boilerplate main menu and barebones options with weird looking text that isn't what I provided (given that I provided a font file), a lobby that I had to manually adjust numerous times before it could get into gameplay, and then nonfunctional gameplay that only handles directional movement and nothing else with sort of half-working fish traveling behavior.
I've tried this a dozen times since 2023 with AI and as late as late last year.
ALL of the source code is there; every single thing that could be translated to be a functional game in another language is there. It NEVER once works or even comes remotely close.
The entire codebase is about 20,000 lines, with maybe 3,000 of it being really important stuff.
So yeah I don't really think AI is "really good" at anything complex. I haven't really been proven wrong in my 4 years of using it now.
And then, maybe someone slightly crazy comes along and tries seeing how much they can do with regular codegen approaches, without any LLMs in the mix, but also not manual porting.
- Do not say: "just convert this"
- On critical sections you do a method-per-method-translation
- Don't forget: your 20,000-line source as a whole will distract any model on longer tasks (and sessions, for sure)
- Do dedicated projects within Claude for each sub-module
But I do feel this is a solvable problem long term.
Because I am terrified by the output I am getting while working on huge legacy codebases: it works. I described one of my workflow changes here: https://news.ycombinator.com/item?id=47271168 but in general, compared to the old way of working, I am saving half of the steps consistently, whether it's researching the codebase, or integrating new things, or even making fixes. I have stopped writing code. Occasionally I jump into the changes proposed by the LLM and make manual edits if it is feasible; otherwise I revert the changes and ask it to generate again, but based on what I learned from the rejected output.
I am terrified about what's coming
Every time I say this people get really angry, but: so far AI has had almost no impact on my job. Neither my dev team nor my vendors are getting me software faster than they were two years ago. Docker had a bigger impact on the pipeline to me than AI has.
Maybe this will change, but until it does I'm mostly watching bemusedly.
However, I can't imagine vibe-coders actually shipping anything.
I really have to ride herd on the output from the LLM. Sometimes, the error is PEBCAK, because I erred, when I prompted, and that can lead to very subtle issues.
I no longer review every line, but I also have not yet gotten to the point, where I can just "trust" the LLM. I assume there's going to be problems, and haven't been disappointed, yet. The good news is, the LLM is pretty good at figuring out where we messed up.
I'm afraid to turn on SwiftLint. The LLM code is ... prolix ...
All that said, it has enormously accelerated the project. I've been working on a rewrite (server and native client) that took a couple of years to write, the first time, and it's only been a month. I'm more than half done, already.
To be fair, the slow part is still ahead. I can work alone (at high speed) on the backend and communication stuff, but once the rest of the team (especially shudder the graphic designer) gets on board, things are going to slow to a crawl.
The real impact is for indie-devs or freelancers but that usually doesn't account for much of the GDP.
Don't know if this is effective and I don't think management knows either, but it's what they're doing
Doesn't mean the two are related.
Is AI just the excuse? We've got tariffs, war, uncertainty and other drama non stop.
Instead they are using Electron and calling it a day. Very ironic isn't it? If AI is so good then why don't we get native software from Anthropic?
It just becomes a source of truth for media and corporate decision machines.
His rationale is he won’t let the company log his prompts and responses so they can’t build an agentic replacement for him. Corporate rules about shadow IT be damned.
Only the paranoid survive I guess
I’d argue this can’t be trusted either considering the AI labs already established they’re willing to break laws (copyright) if the ultimate legal consequence is just a small fine or settlement.
For me, the impact is absolutely in hiring juniors. We basically just stopped considering it. There's almost no work a junior can do that now I would look at and think it isn't easier to hand off in some form (possibly different to what the junior would do) to an AI.
It's a bit illusory though. It was always the case that handing off work to a junior person was often more work than doing it yourself. It's an investment in the future to hire someone and get their productivity up to a point of net gain. As much as anything it's a pause while we reassess what the shape of expertise now looks like. I know what juniors did before is now less valuable than it used to be, but I don't know what the value proposition of the future looks like. So until we know, we pause and hold - and the efficiency gains from using AI currently are mostly being invested in that "hold" - they are keeping us viable from a workload perspective long enough to restructure work around AI. Once we do that, I think there will be a reset and hiring of juniors will kick back in.
If AI increases productivity, and juniors are cheaper to hire but just as able as seniors to hand off tasks to AI, then it makes more sense to hire more juniors and get them working with an AI as soon as possible. This produces output faster, from which more revenue could be derived.
So the only limiting factor is the possibility of not deriving more revenue - which is not related to the AI issue, but broader, macroeconomic issue(s).
I think this is the crux of it. Someone who doesn't know the right thing to do just isn't in a position to hand off anything. Accelerating their work will just make them do the wrong thing faster.
Let's say I sell snake oil and I survey every buyer, trying to convince everyone doctors won't be needed in the future.
First conclusion is that retired population seeks medical services the most (reality check - according to CDC most doctor visits are for infants).
Second conclusion is that because it's a snake oil, it heals all the problems and those people will never return to outdated healthcare system.
The leading AI exposure indices (Anthropic, Eloundou et al.) focus on which jobs get automated. They treat low exposure as “safe.”
But the least exposed workers—cooks, roofers, dishwashers, construction laborers—are often in the worst jobs: low pay, high physical toll, short career spans, and little upward mobility. Safe from AI, but not from burnout or injury.
I built JQADI (Job Quality-Adjusted Displacement Index) to combine AI exposure with job quality. It surfaces three kinds of risk:
- High AI exposure → classic displacement risk
- Low AI, low quality → “trapped” workers in grinding, unsustainable jobs
- Moderate AI, low quality → partial automation strips cognitive work and leaves physical drudgery (the “task residual” effect)
Findings: 83.5M workers are in low-AI, low-quality jobs. Customer service reps, data entry keyers, and medical records specialists sit at the intersection of high exposure and poor quality. Meanwhile, chief executives and lawyers are both low-exposure and high-quality.
The index uses O*NET, BLS, and Anthropic exposure data. Code and methodology are open source: https://github.com/quinndupont/JQADI
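For readers who want the three risk types spelled out as logic, here is a toy classifier. It is not the JQADI code from the repo: the [0, 1] score scale, the 0.33 / 0.66 cut-offs, and the function name are all assumptions made for illustration.

```python
def risk_category(ai_exposure, job_quality, low=0.33, high=0.66):
    """Classify one occupation; both scores are assumed to be normalised to [0, 1]."""
    if ai_exposure >= high:
        return "displacement"          # high AI exposure
    if job_quality < low:
        if ai_exposure < low:
            return "trapped"           # low AI, low quality
        return "task residual"         # moderate AI, low quality
    return "lower risk"                # everything else

print(risk_category(ai_exposure=0.5, job_quality=0.1))  # task residual
```

The point of combining the two axes is that "low exposure" on its own lands in either "trapped" or "lower risk", depending on job quality.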
It’s not quite at the place where LLMs can take over 100% coding, but give it a few more months.
And I don't even think it'll stay that way, just that it's what I've seen so far.
It's anecdotal of me, I apologize for that.
Anthropic can cause layoffs through pure marketing. People were crediting an Anthropic statement in causing a drop in IBM's stock value, which may genuinely lead to layoffs: https://finance.yahoo.com/news/ibm-stock-plunges-ai-threat-1...
We'll probably have to wait for the hype to wear off to get a better idea, but that might take a long while.
Then the 2008 crash happened and those people were gone in a blink of an eye and never replaced. The companies grew in staff after that, but it was in things like sales and marketing.
> It can't think, it just predicts likely tokens
> I can't believe this industry I once cherished for rational professionalism has fallen for nondeterminism
> Sorry, I'm just not going to participate in destroying the planet with these power-hungry DCs
> All this stuff actually costs 10x what a human developer costs but they're dumping the service at a low price to make us dependent.
> It's a bubble, or a scam, in a year or two everything will go back to normal.
Tell me sentiments like these don't get bandied about by devs who want to keep doing things the way they know and like.
Shipping speed never was/is the issue. Most companies are terrible at figuring out what exactly they should be allocating resources behind.
Speeding up does not solve the problem that most humans who are at the top of the hierarchy are poor thinkers. In fact it compounds it. More noise, nice.
Writing code is a lesser problem than figuring out what we want and when we want it, and getting stakeholders in one place.
I'm curious how the system will maneuver itself to deprive workers of pay so that they can stay competitive with the ever-decreasing cost of AI.
Conversely, I'm curious how disruptors will find ways to provide workers with pay (perhaps through mutual aid networks, grants and alternative socioeconomic systems) so that they can use AI to produce the resources they need outside of the contracting labor market.
The junior hiring slowdown makes sense in that context. Junior roles were often the execution layer. That layer is getting absorbed. Whether that's bad long-term depends on whether there's still a path to build judgment without first doing the execution work for years. But what can be seen on entry level teams is you typically have 20% of these people that are outstanding, and 80% average. I assume this 20% will simply be able to cover more ground.
There goes my excuse of not finding a job in this market.
To give an example from a field where LLMs started causing employment worries earlier than software development: translation. Some translators made their living doing the equivalent of routine, repetitive coding tasks: translating patents, manuals, text strings for localized software, etc. Some of that work was already threatened by pre-LLM machine translation, despite its poor quality; context-aware LLMs have pretty much taken over the rest. Translators who were specialized in that type of work and too old or inflexible to move into other areas were hurt badly.
The potential demand for translation between languages has always been immense, and until the past few years only a tiny portion of that demand was being met. Now that translation is practically free, much more of that demand is being met, though not always well. Few people using an app or browser extension to translate between languages have much sense of what makes a good translation or of how translation can go bad. Professional translators who are able to apply their higher-level knowledge and language skills to facilitate intercultural communication in various ways can still make good money. But it requires a mindset change that can be difficult.
On a macro level, if you were in a rising economic tide, you would still be hiring, and turning those productivity gains into more business.
I wonder what the parallels are to past automations. When part producing companies moved from manual mills to CNC mills, did they fire a bunch of people or did they make more parts?
AI needs documentation, automation, integration tests... It works very well for a remote-first company, but not for an in-person, informal grinding approach.
Just a year ago, a client told me to delete integration tests, because "they ran too long"!
If, and it's a big if, AI models really boost productivity by an order of magnitude (I personally, while being skeptical a year or two ago, am leaning towards this idea) then engineers have a chance to realize their ideas, improve current system design patterns and build successful companies, which will inevitably (hopefully) require hiring personnel to keep competing, bringing entire software engineering market to a newly balanced state.
Once you get to a certain size company, this means a lot of bloat. Heck, I've seen small(ish) companies that had as many managers and administrators as ICs.
But you're not wrong, I'm just pointing out how an org that has 4k people can lay off a few hundred with modest impact on the financials (though extensive impact on morale).
It’s refreshing to see the same sentiment from so many other people independently here.
Doesn't exclude the possibility of short term distribution, though.
That's one of the reasons why I am terrified, because it can lead to burnout, and I personally don't like to babysit a bunch of agents, because the output doesn't feel "mine", and when it's not "mine" I don't feel ownership.
And I am deliberately hitting the brake from time to time not to increase expectations, because I feel like driving someone else's car while not understanding fully how they tuned their car (even though I did those tunings by prompting)
If you look at my post history I'm essentially saying the same stuff lol.
For anything else, I find I spend so much time coaxing them into doing 85% of what I need that I'm better off doing it myself.
So they're not useless but there's only so many times in a week that I need a function to pretty-print a table in some fashion. And the code they write on anything more complex than a snippet is usually written poorly enough that it's a write-once-never-touch-again situation. If the code needs to be solid, maintainable, testable, correct (and these are kind of minimal requirements in my book) then LLMs make little impact on my productivity.
They're still an improvement on Google and Stack exchange, but again - only gets you so far.
YMMV
What was the last thing you built in which you felt this was the case?
However now that it's in the beta stage the amount of issues and bugs is insane. I reviewed a lot of the code that went in as well. I suspect the bug fixing stage is going to take longer than the initial implementation. There are so many issues and my mental model of the codebase has severely degraded.
It was an interesting experiment but I don't think I would do it again this way.
Not only that, the less coding you do in general? Guess what: fixing issues that in the past would've been a doddle (muscle memory) becomes harder due to atrophy.
Swear most people don't think straight and can't see the obvious.
I came to the same conclusion when producing a video with Grok. Did the job but utterly painful and it was definitely very costly - I used 50 free-trial accounts and maxed them out each day for a month.
I'm pretty sure these conclusions hold across all models and therefore the technology by extension.
The group used feature flags...
    if (a) {
      // new code
    } else {
      // old code
    }

    void testOff() {
      disableFlag(a);
      // test it still works
    }

    void testOn() {
      enableFlag(a);
      // test it still works
    }
However, as with any cleanup, it doesn't happen. We have thousands of these things lying around taking up space. I thought "I can give this to the AI, it won't get bored or complain."
I can do one flag in ~3 minutes. Code edit, PR prepped and sent.
The AI can do one in 10mins, but I couldn't look away. It kept trying to use find/grep to search through a huge repo to find symbols (instead of the MCP service).
Then it ignored instructions and didn't clean up one or the other test, left unused fields or parameters and generally made a mess.
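For anyone unfamiliar with what a complete flag cleanup involves, here is a small before/after sketch in Python. The function and flag names are hypothetical, and it assumes the flag has been permanently switched on.

```python
# Before: behaviour depends on a flag, and each branch has its own test.
def greeting_before(name, flags):
    if flags["new_greeting"]:
        return f"Hello, {name}!"   # new code
    return f"Hi {name}"            # old code

def test_flag_off_before():
    assert greeting_before("Ada", {"new_greeting": False}) == "Hi Ada"

def test_flag_on_before():
    assert greeting_before("Ada", {"new_greeting": True}) == "Hello, Ada!"

# After cleanup: the flag check, the old branch, the now-unused `flags`
# parameter, and the flag-off test are all gone.
def greeting(name):
    return f"Hello, {name}!"

def test_greeting():
    assert greeting("Ada") == "Hello, Ada!"
```

Leaving the flag-off test or the `flags` parameter behind is exactly the kind of half-finished cleanup described above.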
Finally, I needed to review and fix the results, taking another 3-5 minutes, with no guarantee that it compiled.
At that point, a task that takes me 3 minutes has taken me 15.
Sure, it made code changes, and felt "cool", but it cost the company 5x the cost of not using the AI (before considering the token cost).
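Spelling out the arithmetic behind those figures, using only the minute counts quoted above and assuming the 10-minute AI run needed full attention:

```python
manual_minutes = 3
ai_run_minutes = 10          # watched the whole time, so it counts as my time
review_fix_minutes = (3, 5)  # range quoted above

ai_total = [ai_run_minutes + m for m in review_fix_minutes]
ratio = [t / manual_minutes for t in ai_total]
print(ai_total, [round(r, 1) for r in ratio])  # [13, 15] [4.3, 5.0]
```

So the 15 minutes and 5x are the upper end of the range; the lower end is still more than 4x the manual time.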
Even worse, the CI/CD system couldn't keep up with my individual velocity of cleaning these up. Using an automated tool? Yeah, not going to be pleasant.
However, I need to try again, everyone's saying there was a step change in December.
Claude Code took 4 hours, with multiple prompts. At the end, it started to break the previous fixes in favor of new features. The code was spaghetti. There was no way I could fix it myself or steer Claude Code into fixing it the right way. Either it was a dead-end or a dice roll with every prompt.
Then I implemented my own version with Cursor tab completion. It took the same amount of time, 4 hours. The code had a clear object-oriented architecture, with a structure for evolution. Adding a new feature didn't require any prompts at all.
As a result, Claude Code was worse in terms of productivity: the same amount of time, worse quality output, no possibility of (or at best very high cost of) code evolution.
I wanted it to finish up some tests that I had already prefilled; basically all the AI had to do was convert my comments into the final assertions. After a few minutes of looping, I see it finish and all tests are green.
A third of the tests were still unfilled, I guess left as an exercise for the reader. Another third was modified beyond what I told it to do, including hardcoding some things which made the test quite literally useless and the last third was fine, but because of all the miscellaneous changes it made I had to double check those anyways. This is about the bare minimum where I would expect these things to do good work, a simple take comment -> spit out the `assert()` block.
I ended up wasting more time arguing with it than if I had just done the menial task of filling out the tests myself. It sure did generate a shit ton of code though, and ran in an impressive looking loop for 5-10 minutes! And sure, the majority of the test cases were either not implemented or hardcoded so that they wouldn't actually catch a breakage, but it was all green!!
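To show why a hardcoded test is "quite literally useless", here is a minimal illustration. The function under test and the numbers are hypothetical, not the tests from the story above.

```python
def apply_discount(price_cents, percent):
    """Function under test (hypothetical): integer cents, whole-number percent."""
    return price_cents * (100 - percent) // 100

def test_hardcoded():
    # Compares a literal with itself: green no matter what apply_discount does.
    expected = 9000
    assert expected == 9000

def test_real():
    # Actually calls the function, so a regression turns it red.
    assert apply_discount(10000, 10) == 9000
```

Both tests pass today, but only `test_real` would fail if someone broke `apply_discount`. A green run tells you nothing unless the assertions depend on the code being tested.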
That's ultimately where this hype is leading us. It's a genuinely useful tool in some circumstances, but we've collectively lost the plot because untold billions have poured into these systems and we now have clueless managers and executives seeing "tests green -> code good" and making decisions based on that.
It’s fine at replacing what stack overflow did nearly a decade ago, but that isn’t really an improvement from my baseline.
I end up replacing any saved time with QA and code review and I really don’t see how that’s going to change.
In my mind I see Claude as a better search engine that understands code well enough to find answers and gain understanding faster. That’s about it.
The question is really, velocity _of what_?
I got this from an HN comment. It really hit for me because the default mentality for engineers is to build. The more you build the better. That's not "wrong" but in a business setting it is very much necessary but not sufficient. And so whenever we think about productivity, impact, velocity, whatever measure of output, the real question is _of what_? More code? More product surface area? That was never really the problem. In fact it makes life worse the majority of the time.
They've already admitted they just 'throw the code away and start again'.
I think we've got another victim of perceived productivity gains vs actual productivity drop.
People sitting around watching Claude churn out poor code at a slower rate than if they just wrote it themselves.
Don't get me wrong, great for getting you started or writing a little prototype.
But the code is bad, riddled with subtle bugs, and if you're shoving large amounts of AI code into your codebase without rewriting it, good luck in 6-12 months' time.
If I can just vibe and shrug when someone asks why production is down globally then I'm sure the amount of features I can push out increases, but if I am still expected to understand and fix the systems I generate, I'm not convinced it's actually faster to vibe and then try to understand what's going on rather than thinking and writing.
In my experience the more I delegate to AI, the less I understand the results. The "slowness and thinking" might just be a feature not a bug, at times I feel that AI was simply the final straw that finally gave the nudge to lower standards.
You're pretty high up in the development, decision and value-addition chain, if YOU are the responsible go-to person for these questions. AI has no impact on your position.
At review time.
There are simply too many software industries that can't delegate both authorship _and_ review to non-humans because the maintenance/use of such software, especially in libraries and backwards-compat-concerning environments, cannot justify an "ends justify the means" approach (yet).
I also heard an opinion that since writing code is cheap, people implement things that have no economic value without really thinking it through.
Only it doesn't, there's product positioning, UX, information architecture, onboarding and training, support, QA, change management, analytics, reporting… sigh
We can now make $1 million commercials for $100,000 or less. So a 90% reduction in costs - if we use AI.
The issue is they don’t look great. AI isn’t that great at some key details.
But the agencies are really trying to push for it.
They think this is the way back to the big flashy commercials of old. Budgets are lower than ever, and shrinking.
The big issue here is really the misunderstanding of the cause - budgets are lower because advertising has changed in general (TV is less and less important) and a lot of studies showed that advertising is actually not all that effective.
So they are grabbing onto a lifeboat. But I’m worried there’s no land.
I’ve planned my exit.
Also what are you exiting to?
I can give you many, many examples of where it failed for me:
1. Efficient implementation of Union-Find: complete garbage result
2. Spark pipelines: mostly garbage
3. Fuzzer for testing something: half success, non-replicateable ("creative") part was garbage.
4. Confidential Computing (niche): complete garbage if starting from scratch, good at extracting existing abstractions and replicating existing code.
Where it succeeds:
1. SQL queries
2. Following more precise descriptions of what to do
3. Replicating existing code patterns
The pattern is very clear. Novel things, things that require deeper domain knowledge, coming up with the to-be-replicated patterns themselves, problems with little data don't work. Everything else works.
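For readers unfamiliar with the first failure case: an "efficient" Union-Find usually means a disjoint-set forest with path compression (or path halving) plus union by size or rank. The following is a minimal Python sketch under that assumption, not the commenter's code or the LLM's output; the class and attribute names are illustrative.

```python
class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.size = [1] * n           # size of the tree rooted at each index
        self.components = n           # number of disjoint sets

    def find(self, x):
        # Path halving: point every visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union by size: attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.components -= 1
        return True
```

With both optimizations the amortized cost per operation is near constant, which is what distinguishes this from the naive linked-list version.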
I believe the reason why there is a big split in the reception is because senior engineers work on problems that don't have existing solutions - LLMs are terrible at those. What they are missing is that the software and the methodology must be modified in order to make the LLM work. There are methodical ways to do this, but this shift in the industry is still in its infancy, and we don't yet have a shared understanding of what this methodology is.
Personally I have very strong opinions on how this should be done. But I'm urging everyone to start thinking about it, perhaps even going as far as quitting if this isn't something people can pursue at their current job. The carnage is coming:/
I have never heard anyone say "it works" as a positive thing when reviewing code..
Yes, there is a productivity boost but you can't tell me there is no decrease in quality
Isn't it a very inefficient way to learn things? Like, normally, you would learn how things work and then write the code, refining your knowledge while you are writing. Now you don't learn anything in advance, and only do so reluctantly when things break? In the end there is a codebase that no one knows how it works.
It is. But there are 2 things:
1. Do I want to learn that? (if I am coming back to this topic again in 5 months, knowledge accumulates, but there is a temptation to finish the thing quickly, because it is so boring to swim in a huge legacy codebase)
2. How long does it take to grasp it and implement the solution? If I can complete it with AI in 2 days vs on my own in 2 weeks, I probably do not want to spend too much time on this thing
As I mentioned in other comments, this is exactly what makes me worried about the future of the work I will be doing, because there is no attachment to the product in my brain, no mental models being built, no muscles trained. It feels like someone else's "work", because it explores the code, it writes the code. I just judge it when I get a task.
In a legacy codebase this may require learning a lot of things about how things work just to make small changes, which may be much less efficient.
Edit: Ha, and the report claims it's relatively good at business and finance...
Edit 2: After discussion in this thread, I went back to opus and asked it to link to articles about how to handle non-normally distributed data, and it actually did link to some useful articles, and an online calculator that I believe works for my data. So I'll eat some humble pie and say my initial take was at least partially wrong here. At the same time, it was important to know the correct question to ask, and honestly if it wasn't for this thread I'm not sure I would have gotten there.
A good way to use AI is to treat it like a brilliant junior. It knows a lot about how things work in general but very little about your specific domain. If your data has a particular shape (e.g lots of orders with a few large orders as outliers) you have to tell it that to improve the results you get back.
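The outlier example can be made concrete with a small sketch. All order values below are invented for illustration, and the percentile bootstrap is just one common way to handle skewed data, not necessarily what the linked articles or calculator used; `bootstrap_median_ci` is a hypothetical helper name.

```python
import random
import statistics


def bootstrap_median_ci(data, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return medians[lo_idx], medians[hi_idx]


# Made-up data: 95 small orders (20-29) and 5 very large ones.
orders = [20 + (i % 10) for i in range(95)] + [5000] * 5

mean_order = statistics.mean(orders)      # pulled far upward by the outliers
median_order = statistics.median(orders)  # stays with the typical order
ci_low, ci_high = bootstrap_median_ci(orders)
```

Here the mean lands far above any typical order while the median and its bootstrap interval stay in the small-order range, which is the kind of shape you would need to describe to the model up front.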
On the other hand, our corporate AI is.. not great atm. It was briefly kinda decent and then suddenly it kinda degraded. Worst case is, no one is communicating with us so we don't know what was changed. It is possible companies are already trying to 'optimize'.
I know it is not exactly what you are asking. You are saying capability is there, but I am personally starting to see a crack in corporate willingness to spend.
LLMs also are quite bad for security. They can find simple bugs, but they don't find the really interesting ones that leverage "gap between mental model and implementation" or "combination of features and bugs" etc, which is where most of the interesting security work is imo.
I am doing novel work with codex but it does need some prompting, i.e. exploring possibilities from the current codebase, adding papers to the prompt etc.
For security, I think I generally start a new thread before committing to review from security pov.
https://aisle.com/blog/what-ai-security-research-looks-like-...
Sometimes I realise that this particular task has been slower than if I’d done it myself when I take into account full wall clock time.
I can’t tell what type of task is going to work ahead of time yet.
People who are saying they're not seeing a productivity boost, can you please share where it is failing?
Believe it or not, I still know many devs who do not use any agents. They're still using free ChatGPT copy and paste. I'm going to guess that many people on HN are also on the "free ChatGPT isn't that good at programming" train.
Probably that's the reason why some people are sure their job is still safe.
The nature of the job is changing rapidly.
Til then wtf_are_these_abstractions.jpg
Tests were always important, but now they are the gatekeepers to velocity.
Yesterday a colleague didn’t quite manage to implement a loading container with a Vue directive instead of DOM hacks. It was easier for me to just throw AI at the problem and produce a working and tested solution plus developer docs than to have a similarly long meeting and have them iterate for hours.
Then I got back to training a CNN to recognize crops from space (ploughing and mowing will need to be estimated alongside inference, since no markers in training data but can look at BSI changes for example), deployed a new version of an Ollama/OpenAI/Anthropic proxy that can work with AWS Bedrock and updated the docs site instructions, deployed a new app that will have a standup bot and on-demand AI code review (LiteLLM and Django) and am working on codegen to migrate some Oracle forms that have been stagnating otherwise.
It’s not funny how overworked I am and sure I still have to babysit parallel Claude Code sessions and sometimes test things manually and write out changes, but this is completely different work compared to two or three years ago.
Maybe the problem spaces I’m dealing with are nothing novel, but I assume most devs are like that - and I’d be surprised at people’s productivity not increasing.
When people nag in meetings about needing to change something in a codebase, or not knowing how to implement something and its value add, I’ll often have something working shortly after the meeting is over (due to starting during it).
Instead of sending Vitest to the backlog graveyard, I had it integrated and running in one or two evenings with about 1200 tests (and fixed some bugs). Instead of talking about hypothetical Oxlint and Oxfmt performance improvements, I had both benchmarked against ESLint and Prettier within the hour.
Same for making server config changes with Ansible that I previously didn’t due to additional friction - it is mostly just gone (as long as I allow some free time planned in case things get fucked up and I need to fix them).
Edit: oh and in my free time I built a Whisper + VLM + LLM pipeline based on OpenVINO so that I can feed it hours long stream VODs and get an EDL cut to desired length that I can then import in DaVinci Resolve and work on video editing after the first basic editing prepass is done (also PyScene detect and some audio alignment to prevent bad cuts). And then I integrated it with subscription Claude Code, not just LiteLLM and cloud providers with per-token costs for the actual cuts making part (scene description and audio transcriptions stay local since those don't need a complex LLM, but can use cloud for cuts).
Oh and I'm moving from my Contabo VPSes to running stuff inside of a Hetzner Server Auction server that now has Proxmox and VMs in that, except this time around I'm moving over to Ansible for managing it instead of manual scripts as well, and also I'm migrating over from Docker Swarm to regular Docker Compose + Tailscale networks (maybe Headscale later) and also using more upstream containers where needed instead of trying to build all of mine myself, since storage isn't a problem and consistency isn't that important. At the same time I also migrated from Drone CI to Woodpecker CI and from Nexus to Gitea Packages, since I'm already using Gitea and since Nexus is a maintenance burden.
If this becomes the new “normal” in regards to everyone’s productivity though, there will be an insane amount of burnout and devaluation of work.
We've started building harnesses to allow people who don't understand code to create PRs to implement their little nags. We rely on an engineer to review, merge, and steward the change but it means that non-eng folks do not rely on us as a gate. (We're a startup and can't really afford "teams" to do this hand-holding and triage for us.)
As you say we're all a bit overworked and burned out. I've been context switching so much that on days when I'm very productive I've started just getting headaches. I'm achieving a lot more than before but holding the various threads in my head and context switching is just a lot.
I've always done more in days than others might in a week. YMMV.
God I hope I never ever have to work with you
Productivity is a term of art in economics and means you generate more units of output (for example per person, per input, per wages paid) but doesn't take quality or otherwise desirability into account. It's best suited for commodities and industrial outputs (and maybe slop?).
I don't think features per hour is really what is holding back most established businesses.
My experiences suggest that we still have some time before the people that understand the plumbing of the business _and_ AI bubble up to positions of authority through wielding it practically and successfully at increasingly greater scale.
Note1: I have "expert" level research skills. But LLMs still help me in research, but the boost is probably 1.2x max. But
Note2: By research, I mean googling, github search, forum search, etc. And quickly testing using jsfiddle/codepen, etc.
But I also think you are overestimating your RESEARCH skills. Even if you are very good at research, I am sure you can't read 25 files in parallel, summarize them (even if it's missing some details) in 1 minute and then come up with a somewhat working solution in the next 2 minutes.
I am pretty sure humans can't comprehend reading 25 code files, each having at least 400 lines of non-boilerplate code, in 2 minutes. An LLM can do it and it's very, very good at summarizing.
I can even steer its summarizing skills by prompting where to focus when it's reading files (because now I can iterate 2-3 times for each RESEARCH task and improve my next attempt based on shortcomings in the previous attempt)
* you probably lack good RESEARCH skills
* I can see at most 1.25x improvements - now it is 2-3x
By updating your comment you are making my reply irrelevant to your past response
The productivity gains are blatantly obvious at this point. Even in large distributed code bases. From jr to senior engineer.
Gaslight me by telling me I must be a time traveler because I use go 1.26 but the latest version actually is 1.24
And tell me I can't use wg.Go() because this function does not exist (it does)
Why? This is great. AI fixing up huge legacy codebases is just taking the jobs humans would never want to do.
I've seen lots of people say AI can basically code a project for them. Maybe it can, but that seems to heavily depend on the field. Other than boilerplate code or very generic projects, it's a step above useless imo when it comes to gamedev. It's about as useful as a guy who read some documentation for an engine a couple years ago and kind of remembers it but not quite and makes lots of mistakes. The best it can do is point me in the general direction I need to go, but it'll hallucinate basic functions and mess up any sort of logic.
1) Do that inside their IDEs, which is less funny
2) Generate blog post about it instead of memes
So the good old days before search engines were drowning with ads and dark patterns. My assumption is big LLMs will go in the same direction after market capture is complete and they need to start turning a profit. If we are lucky the open source models can keep up.
For me this is a huge boost in productivity. If I remember how I was working in the past (example of Google integration):
Before:
* go through docs to understand how to start (quick start) and things to know
* start boilerplate (e.g. install the scripts/libs)
* figure out configs to enable in GCP console
* integrate basic API and test
* of course it fails, because it's a Google API, so difficult to work with
* along the way figure out why Python lib is failing to install, oh version mismatch, ohh gcc not installed, ohh libffmpeg is required,...
* somehow copy paste and integrate first basic API
* prepare for production, ohhh production requires different type of Auth flow
* deploy, redeploy, fix, deploy, redeploy
* 3 days later -> finally hello world is working
Now:
* Hey my LLM buddy, I want to integrate Google API, where do I start, come up with a plan
* Enable things which require manual intervention
* In the meantime the LLM integrates the code, installs libs, asks me to approve installation of libpg, libffmpeg,....
* test, if it fails, feed the error back to LLM + prompt to fix it
* deploy
If not, then you’re not close to the cutting edge.
I can turn out some scripts a little bit quicker, or find an answer to something a little quicker than googling, but I'm still waiting on others most of the time, the overall company processes haven't improved or gotten more efficient. The same blockers as always still exist.
Like you said, there has been other tech that has changed my job over time more than AI has. The move to the cloud, Docker, Terraform, Ansible, etc. have all had far more of an impact on my job. I see literally zero change in the output of others, both internally and externally.
So either this is a massively overblown bubble, or I'm just missing something.
I've been in ops for 30 years, Claude Code has changed how I work. Ops-related scripting seems to be a real sweet spot for the LLMs, especially as they tend to be smaller tools working together. It can convert a few sentences into working code in 15-30 minutes while you do something else. I've given it access to my apache logs Elastic cluster, and it does a great job at analyzing them ("We suspect this user has been compromised, can you find evidence of that?"). It's quite startling, actually, what it's able to do.
And that's the key problem, isn't it? I maintain current organizations have the "wrong shape" to fully leverage AI. Imagine instead of the scope of your current ownership, you own everything your team or your whole department owns. Consider what that would do to the meetings and dependencies and processes and tickets and blockers and other bureaucracy, something I call "Conway Overhead."
Now imagine that playing out across multiple roles, i.e. you also take on product and design. Imagine what that would do to your company org chart.
I added a much more detailed comment here: https://news.ycombinator.com/item?id=47270142
Imo it's only a matter of time as companies start to figure out how to use AI. Companies don't seem to have real plans yet and everyone is still figuring AI out in general.
Soon, though, I think agents will start popping up, things like first line response to pages, executing automation
Humans are funny. But most can't seem to understand that the tool is a mirage and they are putting false expectations on it. E.g. management of firms cutting back on hiring under the expectation that LLMs will do magic - with many cheering "this is the worst it'll be bro!!".
I just hope more people realise before Anthropic and OAI can IPO. I would wager they are in the process of cleaning up their financials for it.
A famous economist once said, "You can see the computer age everywhere but in the productivity statistics."
There are many reasons for the lag in productivity gain but it certainly will come.
A Commodore 64 was a cool gadget, but “the family computer” became a device that commoditized the productivity. The opportunity cost of applying a computer to try something new went to near zero.
It might have been harder for someone to improve the productivity of an old factory in Shreveport, Louisiana with a computer than it was for the upstarts at id to make Doom.
Predictions without a deadline are unfalsifiable.
Because I can get so much done, I've lost my sense for what's enough. And if I can squeeze out a bit more relatively easily, why wouldn't I? When do I hit the brakes?
There are some tasks where LLMs are not all that helpful, and I find myself kind of savoring those tasks.
I'm surprised you don't notice a difference. Where I work it has been transformative. Perhaps it's because we're relatively small and scrappy, so the change in pace is easier with less organizational inertia. We've dramatically changed processes and increased outputs without a loss in quality. For less experienced programmers who are more interested in simple scripts for processing data, their outputs are actually far better, and they're learning faster because the Claude Code UI exposes them to so many techniques in the shell. I now see people using bash pipes for basic operations who wouldn't have known a thing about bash a couple years ago. The other day a couple less-technical people came to me to learn about what tests are. They never would have been motivated to learn this before. It's really cool.
It doesn't reduce work at all, though. We're an under-funded NGO with high ambition. These changes allow us to do more with the same funding. Hopefully it allows us to get more funding, too. I can't see it leading to anyone being let go; we need every brain we can get.
I'm not sure what to say. It's like someone claiming that automobiles don't improve personal mobility. There are a lot of logical reasons to be against the mass adoption of automobiles, but "lack of effectiveness as a form of personal mobility" is not one of them.
Hearing things like this does give me a little hope though, as I think it means the total collapse of the software engineering industry is probably still a few years away, if so many companies are still so far behind the curve.
I prefer walking or cycling and often walk about 8km a day around town, for both mobility and exercise. (Other people's) automobiles make my experience worse, not better.
I'm sure there's an analogy somewhere.
(Sure, automobiles improve the speed of mobility, if that's the only thing you care about...)
Are you hiring?
IMO AI will make 70-80% of jobs obsolete for sure.
The specific way it applies to your specific situation, if it exists, either hasn't been found or hasn't made its way to you. It really is early days.
I recently used copilot.com (which uses GPT 5.1) to help solve a tricky problem:
I have an arbitrary width rectangle that needs to be broken into smaller
random width rectangles (maintaining depth) within a given min/max range.
The first solution merged the remainder (if less than min) into the last rectangle created (regardless if it exceeded the max).
So I poked the machine.
The next result used dynamic programming and generated every possible output combination. With a sufficiently large (yet small) rectangle, this is a factorial explosion and stalled the software.
So I poked the machine.
I realized this problem was essentially finding the distinct multisets of numbers that sum to some value. The next result used dynamic programming and only calculated the distinct sets (order is ignored). That way I could choose a random width from the set and then remove that value. (The LLM did not suggest this). However, even this was slow with a large enough rectangle.
So I poked my brain.
I realized I could start off with a greedy solution: Choose a random width within range, subtract from remaining width. Once remaining width is small enough, use dynamic programming. Then I had to handle the edge cases (no sets, when it's okay to break the rules, etc.)
So the LLMs are useful, but this took 2-3 hours IIRC (thinking, implementation, testing in an environment). Pretty sure I would have landed on a solution within the same time frame. Probably greedy with back tracking to force-fit the output.
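The greedy idea described above can be sketched roughly as follows. This is not the commenter's code: it assumes integer widths and swaps the dynamic-programming tail for a simple feasibility check (a remainder r can be split into parts within [lo, hi] exactly when ceil(r/hi) <= floor(r/lo)), so each random pick is only allowed if the rest can still be completed. The names `feasible` and `split_width` are made up for illustration.

```python
import random


def feasible(r, lo, hi):
    """True if r is a sum of zero or more integers, each in [lo, hi]."""
    if r == 0:
        return True
    if r < lo:
        return False
    # Some part count k must satisfy k*lo <= r <= k*hi,
    # i.e. ceil(r / hi) <= floor(r / lo).
    return -(-r // hi) <= r // lo


def split_width(total, lo, hi, rng=None):
    """Split `total` into random integer widths in [lo, hi] that sum to it."""
    if lo < 1 or lo > hi:
        raise ValueError("need 1 <= lo <= hi")
    if total <= 0 or not feasible(total, lo, hi):
        raise ValueError("total cannot be split within the given range")
    rng = rng or random.Random()
    widths = []
    remaining = total
    while remaining > 0:
        # Only keep widths that leave a remainder which can still be completed.
        candidates = [
            w for w in range(lo, min(hi, remaining) + 1)
            if feasible(remaining - w, lo, hi)
        ]
        w = rng.choice(candidates)
        widths.append(w)
        remaining -= w
    return widths
```

For example, `split_width(100, 7, 15)` returns a list of widths between 7 and 15 that sum to 100, while `split_width(10, 4, 4)` raises `ValueError` because no valid split exists. The distribution over splits is not uniform; if that matters, the multiset enumeration the commenter describes is the more principled route.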
No, simply 'producing a feature' ain't it, bud. That's one piece of the puzzle.
If you can’t be exposed to it in your day job, start using Claude opus in the evening so you know what’s coming.
Maybe I will be replaced by matrix multiplication in my job, but if I need to use LLM at some point I expect little benefit from starting now.
Yes, I tried to use Claude Code two months ago. It was scary, but not useful.
People who actually know how to think can see it a mile away.
Being overworked is sometimes better than being underworked. Sometimes the reverse is better. They both have challenges.
Best time to be a solo founder in underserved markets :)
Same here. This is also why I haven't been able to switch to Claude Code, despite trying to multiple times. I feel like its mode of operation is much more "just trust the generated code" than Cursor, which lets you review and accept/reject diffs with a very obvious and easy to use UX.
I don't have the bandwidth to juggle four independent things being worked on by agents in parallel so the single-IDE "bottleneck" is not slowing me down. That seems to work a lot better for heavy-boilerplate or heavy-greenfield stuff.
I am curious about if we refactored our codebase the right way, would more small/isolatable subtasks be parallelizable with lower cognitive load? But I haven't found it yet.
I'm a vibe-coder, and I've shipped lots! The key is to vibe-code apps that have a single user (me). Hadn't coded anything for 15 years prior to January, too.
I am usually my principal customer, but I tend to release publicly.
I haven’t turned it on, yet, because of the velocity of the work, but I think I’ve found my stride.
There's a ton of millennials (myself included) turning 40, that have been in this field since 2005 or earlier. It's all we know, and at this point we're getting too old to just "go do physical labor for minimum wage so AI can write code instead." I'm certainly too old to go back to school and try to pass the bar exam to be a lawyer at 50+, and I have zero interest in any kind of people management whatsoever.
IMO Anthropic, OpenAI, Google, etc. should all be helping governments work toward a plan and lobbying for regulation on it instead of just charging full steam ahead "damn the consequences, those are someone else's problem."
It's going to obliterate what little is left of the middle class and leave a massive amount of unemployed middle aged tech workers with nowhere to go. What then? We either get ahead of the problem now (Outlook not so good), or we collapse into massive civil unrest and chaos.
Our city just spent >$15MM on "case management software" that took 5 years to build by some fly-by-night outfit in California who won the contract, haphazardly bolted together MSFT Azure components, then vanished with zero support.
These teams can't in good faith freely adopt AI tooling into their workflow because they don't have the bandwidth to do it well, so they don't do it at all.
Will I crash and burn? Maybe, you're right. But, that's why I'm taking things at a very slow pace. Only automating internal tasks. Only things I trust AI to do. Very very limited scope. What's really my alternative here?
Just sit back and watch the world move on? My alternative is not changing with the times and being stagnant. That's not really a solution. Even if I'm doing that, I want to have data points that AI is really a dead end instead of just assumptions. My alternative reality isn't a bed of roses - a lot of people at the top do believe they can replace me and my work (CTO) with AI, thanks to the hype. I'm just trying to evolve so I don't become a meme down the line. Can they actually replace me or my job with AI? Absolutely not from what I'm seeing. But the hype of cutting costs is always attractive to people at the top. Just trying to stay alive man, lol.
A) how old the product is: Twitter during its first 5 years probably had more work to do compared to Twitter after 15 years. I suspect that is why they were able to get rid of so many developers.
B) The industry: many b2c / ecommerce businesses are straightforward and don't have an endless need for new features. This is different from deep tech companies
I’ve built several of such tools where I work. We don’t even have a dev team, it’s just IT Ops, and all of what I’ve built is effectively “done” software unless the business changes.
I suspect there’s a lot of that out there in the world.
I'll be honest, just the idea of working there makes me feel like vomiting. For me, they are bizarrely evil. They're not evil like, "we're going to destroy our competition through anti competitive practices," (which they do), but "let's destroy a whole generation of minds."
And now with the glasses. I mean, jeeze. Can there be a stronger signal of not caring for others?
It's as if Meta sees people as cattle. Though I think a lot of techies see humans as cattle, truthfully.
What was your rationale?
I guess this question is out-of-the-blue, and I don't mean for you to justify your existence, but I've never understood why people choose to work for Meta.
Then he took a contracting gig for Meta. His rationalization was that the project was an ill-specified prototype that would never see the light of day - if they wanted to throw money at him for stuff like that, he would accept it.
That gig is finished, and he's now thoroughly disillusioned with working for big tech.
Both sell things that are bad for you, but that the consumer has complete control over whether or not to consume.
And not all of what Meta is selling is bad. There's a lot of information exchanged on Facebook, Instagram, etc. that is good for society. Like health/nutrition advice, etc.
Where livelihood is concerned, rational individuals with strong morals can do irrational and immoral things (e.g., work at the Palantirs of the world).
TLDR: incentives don't just shape perception, they form it
That is, if AI is going to cause job losses it will feel very small for some time, then it will happen suddenly all at once with little to no time to properly react.
Tangential, I don't even know what "responsible" in the corporate world means anymore, it seems to me no one is really responsible for anything. But the one thing that's almost certain is that I will fix the damn thing if I made it go boom.
Tip to budding software engineers: try to not work in these sorts of places, as they're about "looking busy" rather than engineering software, where the latter is where real long-lasting things are built, and the former is where startup founders spend most of their money.
The last paragraph is where the tricky and valuable parts are, and also where AI isn't super helpful today, and where you as a human can actually help out a lot if you're just 10% better than the rest of the "engineers" who only want to ship as fast as possible.
The docs used to be good enough that there would be an example which did exactly what you needed more often than the llm gets it right today.
lol, that sounds like a disaster for the codebase.
I worry that we're returning to an era of renting core development tools. After the huge benefits from free and open source tools, that's a bitter pill to swallow.
In this example, if the 25 files are organized nicely, and I had a nice IDE that listed class/namespace members of each file neatly, I might take 30 minutes to understand the overall structure.
Moreover, if I critically analyzed this, I would ask "how many times does this event of summarizing 25 files happen"? I mean, are we changing codebases every day? No, it's a one time cost. Moreover, manually going through will provide insight not returned by the LLM.
Obviously, every case is different, and perhaps you do need to RESEARCH new codebases often, I dunno!
It might not be about bringing more revenues but retaining market share.
This is what's so frustrating about the hype bros for me. In most cases, everything AI spits out are code smells.
We're all just supposed to toss out every engineering principle we've learned all so the owner class can hire less developers and suppress wages?
I'm sure it's working great for everyone working on SaaS CRUD or web apps, but it's still not anywhere close to solving problems outside that sphere. Native? It's very hit and miss. It has very little design sense (because, why would it? It's a language model) so it chokes on SwiftUI, it also can't stop using deprecated stuff.
And that's not even that specialized. It still hallucinates cmdlets if you try to do anything with PowerShell, and has near zero knowledge about the industry I work in, a historically not tech-forward industry where things are still shared in handcrafted PDF reports emailed out to subscribers.
I'm going to leave this field entirely if the answer just becomes "just make everything in React/React Native because it's what the AI does best."
Me too. After listening to all the claims about Claude Code's productivity benefits, I was surprised to get the result I got.
I'm not able to share details of my work. I was using Claude Opus 4.5, if I recall correctly.
It pains the anti-capitalist fibers in my body to say this, but no they are not. At the maximum the value is in organizational knowledge and existing assets (= source code, documentation), so that people with the least knowledge possible can make changes. In software companies in general, technical excellence and knowledge is not strongly correlated with economic success as long as you clear a certain bar (that's not that high). In comparison, in hardware/engineering companies, that's a lot more correlated.
In the concrete example of a legacy codebase we have here, there is even less value in trying to build up knowledge in the company, as it has already been decided that the system is to be discarded anyways.
Am trying to compare this to reports that people are not reviewing code any more.
from stackoverflow import quick_sort
https://github.com/drathier/stack-overflow-import
1: pre-AI. Not keen on becoming a manager of an idiot savant, so I’m planning my exit.
Arguably this solution is "better" because you don't even really need to understand that you have specific problems to have the agent solve them for you, but I fail to see the point of keeping these people employed in that case. If you haven't been able to solve your own workflow issues up until now I have zero trust in you being able to solve business problems.
I disagree with "always".
This is only the recent wave of brogrammers who care nothing about the quality of the tech and are only in this industry for the gold rush.
They aren't inherently technically minded, they just know how to schmooze their way around and convince decision makers to follow capricious trends over solid practices.
We aren't fungible workers in a low skill industry. And if you find yourself working in a tech company without equity: just don't, leave. Either find a new tech company or do something else altogether.
There’s some neat stuff, don’t get me wrong. But every additional tool so far has started strong but then always falls over. Always.
Right now there’s this “orchestrator” nonsense. Cool in principle, but as someone who used to write automation scripts all the time, it’s not impressive. Spent $200 to automate doing some bug finding and fixing. It found and fixed the easy stuff (still pretty neat), and then “partially verified” it fixed the other stuff.
The “partial verification” was it justifying why it was okay it was broken.
The company has mandated we use this technology. I have an “AI Native” rating. We’re being told to put out at least 28 commits a month. It’s nonsense.
They’re letting me play with an expensive, super-high-level, probabilistic language. So I’m having a lot of fun. But I’m not going to lie, I’m very disappointed. Got this job a year ago. 12 years programming experience. First big tech job. Was hoping to learn a lot. Know my use of data to prioritize work could be better. Was sold on their use of data. I’m sure some teams here use data really well, but I’m just not impressed.
And I’m not even getting into the people gaming the metrics to look good while actually making more work for everyone else.
It's not rocket science to measure, actually. The issue is most people don't know how to think properly to invent the right proxies.
Sunk cost fallacy is very real, for all involved. Especially the model producers and their investors.
Sunk cost fallacy is also real for devs who are now giving up how they used to work - they've made a sunk investment in learning to use LLMs etc. Hence the 'there's no going back' comments that crop up on here.
As I said in this thread - anyone who can think straight - I'm referring to those who adhere to fundamental economic principles - can see what's going on from a mile away.
- https://cxl.com/blog/outliers/
- https://www.blastx.com/insights/the-best-revenue-significanc...
- (online tool to calculate significance) https://www.blastx.com/rpv-calculator
I'm not checking their math, but the articles make sense to me, and I trust they did implement it correctly. In the end the LLM did get me to the correct answer by suggesting the articles, so I guess I should eat some humble pie and say it _did_ help me. At the same time, if I didn't have the intuition that using rpv as-is in a t-test would be noisy, and the suggestions from this comment thread, I think I could have gone down the wrong path. So I'm not sure what my conclusion is -- maybe something like LLMs are helpful once you ask the right question.
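For anyone following along, here is a rough sketch of the intuition above. It is not the method from the linked articles (I have not reproduced those); it swaps in a plain permutation test, which makes no normality assumption about revenue per visitor. The function names, the iteration count, and the toy data are all made up for illustration.

```python
import random


def revenue_per_visitor(revenues):
    """Mean revenue across all visitors (non-buyers count as 0)."""
    return sum(revenues) / len(revenues)


def permutation_p_value(control, variant, n_iter=2000, seed=0):
    """Two-sided permutation test on the difference in RPV.

    Shuffles the group labels and counts how often a random split
    produces a gap at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(revenue_per_visitor(variant) - revenue_per_visitor(control))
    pooled = list(control) + list(variant)
    n_control = len(control)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(
            revenue_per_visitor(pooled[n_control:])
            - revenue_per_visitor(pooled[:n_control])
        )
        if diff >= observed:
            extreme += 1
    # The +1 correction keeps the estimate away from an impossible p = 0.
    return (extreme + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Hypothetical data: most visitors spend nothing, a few spend a lot.
    control = [0] * 95 + [20, 25, 30, 40, 500]
    variant = [0] * 93 + [20, 25, 30, 35, 40, 45, 50]
    print(permutation_p_value(control, variant))
```

A single large order (the 500 above) can dominate the mean, which is the noise problem mentioned earlier; capping or trimming outliers before running the test is a separate decision this sketch does not make for you.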
When the metrics arrived with digital, they saw that advertising, in some ways, was just not as effective as they’d hoped. In some ways the ROI wasn’t there. Seth Godin agrees. He says that advertising in the digital era could be as simple as just having a good product. I think this is Tesla’s position on it - make the best product and the internet takes care of it.
Legacy companies have kept large ad budgets but those are diminishing. When I spoke with my friend at WPP, he said their data science team showed that outside of a new product or a product that is not recognised by consumers, the actual outcomes from ads are marginal or incremental. That's what he told me. If your product is already known to consumers, the ROI is questionable.
There’s something about AIs that feels wrong for storytelling. I just don’t think people will want AIs to tell them stories. And if they do… Well, I believe in human storytelling.
I'm not a tech CEO but people who are anti-LLM for programming have no place on my team.
FWIW I find it useful if I know exactly what I want and it's quicker to prompt it than type it myself. Also for research and building understanding it's generally good. I still catch it being wrong on details if you're really paying attention or literally contradicting itself between prompts. That gives me a lot of pause about trusting things it told me that I just accepted as fact without having enough knowledge myself to question it.
Someone deciding to drop a spreadsheet of customer data into their personal AI account to increase their productivity would be catastrophic for business, so you need rules. And rules means paying for enterprise AI tooling.
The $20 a month tier in particular is a trivial expense, on par with businesses that expect their workers to wear steel toed shoes. Some may give workers a little stipend to buy those boots, some not. Either way, it doesn't really matter.
Brooks's law (“Adding manpower to a late software project makes it later”) is just the surface of some of the metaphorical language that has most stuck with me: large systems and teams quickening entanglement in tar pits through their struggle against coordination scaling pains, conceptual integrity in design akin to preserving architectural unity of Reims cathedral, roles and limitations attempting to expand surgical teams, etc.
Love a good metaphor, even when its foundation is overextended or out of date. Highly recommend.
Mostly it's because when we hit a point where one person would get stuck, the other usually knows what to do, and we sail through almost anything with little friction.
And to your point, a single person can easily get stuck, I know that applies to me many times.
I don't get why people try to simplify - you're removing important details that determine performance and therefore output. This leads to false conclusions.
Why are you surprised customers don't like spending money on the items that don't add business value? Add to that QA, documentation, security audits, etc.
They want to ship stuff that brings in customers and revenue day one, everything else is a cost.
They absolutely do add value / prevent loss, but you need some understanding in order to see that. Not seeing it is a marker of not understanding.
Not to the non-technical bean counters. When they allocate money they want to see you prove how that extra money translates to an immediate ROI, and it's difficult to prove that in an Excel sheet exactly what the ROI will be without making stuff up on vibes and feels.
Like at one German company I was at ~15 years ago, all the devs wanted a second 19" monitor on our workstations for increased productivity, and the bean counters wouldn't approve that because they wanted proof of how that expense across hundreds of people would increase our productivity and by how much %, to see if that would offset the cost.
This is how these people think. If you don't bring hard numbers on how much their "line will go up", they won't give you money.
I know this is difficult to understand from the PoV of SV Americans where gazillions of dollars just fall from the sky at their tech companies.
In that sort of high-fear, change-averse environment, "get rid of all the devs and let the AI do it" may not be the most compelling sales pitch to leadership. ("Use it to port the code faster so we can spend more time on the migration plan and manual testing" might have better luck.)
1. No one cares about quality. Even in fields you'd expect to require the 'human touch' (e.g. novel translation), publishers are replacing translators with AI. It doesn't matter if you have higher-level knowledge or skills if the company gains more from cutting your contract than it loses in sales.
2. Translation jobs have been replaced with jobs proofreading machine translations, which pays peanuts (since AI is 'doing most of the work') but in fact takes almost as much effort as translating from scratch (since AI is often wrong in very subtle ways). The comparison to PR reviews makes itself.
After all, this has been Apple's strategy since the '80s, and, even though there were some ups and downs, overall it's a success.
Maybe, but it probably requires a very strong and opinionated leader to pull off. The conventional wisdom in American business leadership seems to be to pursue the lowest level of quality you can get away with, and focus on cutting costs. And you'll have to fight that every second.
I don't think that's true at the individual-contributor level (pursuing quality is very motivating), but the people who move up are the ones who sound "smart" by aping conventional wisdom.
> After all, this has been Apple's strategy since the '80s, and, even though there were some ups and downs, overall it's a success.
I might give you that "since the late 90s," but there have been significant periods where that wasn't true (e.g. the early mid-90s Mac OS was buggy and had poor foundations).
With what time and money? The statue of David and the Sistine Chapel ceiling would not have come to be from hobbyists. There are precious few culturally relevant works of art that lacked both a patron and a sales motive.
If it were useful I would continue to use it, but at this point I would not use it even if it were free, non-proprietary, and not funding my replacement.
The problem with analyzing logs is determinism. If I ask Claude to look for evidence of compromise, I can't trust the output without also going and verifying myself. It's now an extra step, for what? I still have to go into Elastic and run the actual queries to verify what Claude said. A saved Kibana search is faster, and more importantly, deterministic. I'm not going to leave something like finding evidence of compromise up to an LLM that can, and does, hallucinate especially when you fill the context up with a ton of logs.
An auditor isn't going to buy "But Claude said everything was fine."
Is AI actually finding things your SIEM rules were missing? Because otherwise, I just don't see the value in having a natural language interface for queries I already know how to run, it's less intuitive for me and non deterministic.
It's certainly a useful tool, there's no arguing that. I wouldn't want to go back to working without it. But I don't buy that it's already this huge labor market transformation force that's magically 100x-ing everyone's productivity. That part is 100% pure hype, not reality.
Is it? A couple days ago I had it build tooling for a one-off task I need to run, it wrote ~800 lines of Python to accomplish this, in <30m. I found it was too slow, so I got it to convert it to run multiple tasks in parallel in another prompt. Would have taken a couple days for me to build from hand, given the number of interruptions I have in the average day. This isn't a one-off, it's happening all the time.
It's not going to be written exactly like you would do it, but that's ok - because you care about the results of the solution and not its precise implementation. At some point you have to make an engineering decision whether to write it yourself for critical bits or allow the agent/junior to get a good enough result.
You're reviewing the code and hand editing anyway, right? You understand the specs even if your agent/junior doesn't, so you can take credit even if you didn't physically write the code. It's the same thing.
Yes, yes!
And this is a problem for me: because of the pace, my brain muscles are not developing enough compared to when I was doing those things myself.
Before, I was changing my mind while implementing the code, because I see more things while typing and digging deeper. But now, because juniors are doing things, they don't offer me refactorings or improvements while quickly typing the code, because they obey my command instead of having an "aha" moment and suggesting better ways.
The engineer who worked with you took ownership of the code! Have you forgotten this?
Management often has a perverse short-term incentive to make labor feel insecure. It’s a quick way to make people feel insecure and work harder ... for a while.
Also, “AI makes us more productive so we can cut our labor costs” sounds so much better to investors than some variation of “layoffs because we fucked up / business is down / etc”
"We've frozen hiring because our growth potential is tapped out."
"We've frozen hiring because AI can replace employees."
When you're unemployed, it doesn't matter. When executives cargo cult, it doesn't matter.
So why aren't they using their own software to generate a Linux-optimized package for Linux, a Swift app for macOS, and whatever Windows uses?
That would be the best ad for AI. See, we use our own product!
But it doesn't happen.
So essentially by not generating custom binaries for every platform and using Electron they're doing one thing but saying something else. So maybe generating code isn't the #1 problem in the world!
Also I remember them saying that their engineers write less code and they use Claude to write Claude. If Claude can be used to write Claude then why not use Claude to write OS-specific binaries?!
Apple already showed this decades ago - they got the iPhone and iPod developed and out the door on relatively short timescales given the impact of the products on the world. Once you know what you want, exactly what you want, things move fast - really fast.
But sure let's buy 200$ per month claude to ship things faster lol
- AI tools are expensive so until the increased productivity translates to increased revenue we need to make room in the budget
- We expect the bottlenecks in our org to move from writing code to something else (PM or design or something) so we're cutting SWEs in anticipation of needing to move that budget elsewhere.
- We anticipate the skillsets needed by developers in the AI world to be so fundamentally different from what they are now that it's cheaper to just lay people off, run as lean as possible, and rehire people with the skills we want in a year or two than it is to try and retrain.
I don't necessarily agree with those arguments (especially the last one), but I think they're somewhat valid arguments.
> rehire people with the skills we want in a year or two than it is to try and retrain.
before that future comes your company might become obsolete already, because you have lost your market share to new entrants
> We expect the bottlenecks in our org to move from writing code to something else
I would love to tell them: hey, let's leverage the current momentum and build. When those times come, we offer existing people with accumulated knowledge the chance to retrain for a new type of work. If they think they're not a good fit, they can leave; if they're willing, give them a chance, invest in people, make them feel safe, and earn trust and loyalty from them.
> AI tools are expensive so until the increased productivity translates to increased revenue we need to make room in the budget
1. It's not that expensive: 150$/seat/month -> 5 lunches? Or maybe squeeze it from Sales personnel traveling in Business class?
2. By the time increased productivity is realized by others, company who resisted could be so far behind, that they won't be able to afford hiring engineers with those skillsets, if they think 150$ is expensive now, I am sure they will say "What??? 350k$ for this engineer?, no way, I will instead hire contractors"
However AI definitely is capable of lower end software tasks and really well trodden ground, especially when managed by a developer, so perhaps what we will see is a bigger gap in pay and talent not too different from the off-shore vs on-shore market comparisons.
The key for me, though, is that if AI makes your employees 20% more valuable, that will either get priced into their wage or captured by the business, but it still doesn't replace the need for good talent (software engineer, agent handler, whatever it will get called).
Jobs where a machinist is in charge of large chunks of the process are rarer. Large shops will have one person setting up many machines to maximize throughput.
If it is truly because of AI, then it's still a losing strategy long term in my opinion.
I don't do tech outside of 9-5, so either my employer pays for it all, or I don't use it. Simple as that. Thankfully, they do pay for it, but I couldn't imagine working somewhere that says "You need to use AI" and then not providing it on their dime.
Quite frankly it should be regulation that if a W2 employee needs something to perform their job duties, the employer must provide it.
The craft changes with all these AI helpers, so the juniors have to also catch up/change with it. Or there won't be any seniors in due time.
You would hire someone with the expectation that they learn, but you also need to pay them. New hires always slow the team down. And currently you wouldn't even get much out of them, as you can delegate those tasks to AI.
Additionally, you cannot even be sure that the junior will learn rather than just throw stuff at AI. The amount of vibecoded code I have to review at the moment from Seniors is stunning.
So yeah, the market needs Seniors, but there is basically no incentive for a company to hire a Junior at the moment. It's just easier and cheaper to pay a bit better than the market and hire Seniors than to train a Junior for years.
It's just shortsighted to not train any/enough juniors as an industry. Shortsightedness, what else is new
(I also just love statistics and think it's some of the most applicable math to everyday life in everything from bus arrival times to road traffic to order values to financial markets.)
All of these people will consequently be on the job market competing for your opportunities.
Yes you may feel superior to their capabilities - and may even be justified in your opinion (I know nothing about you beyond this comment)... But it'll still significantly impact your professional future if this actually happens. It would massively impact wages at the very least
Your viewpoint is incredibly short-sighted and not actually realizing the broad effect on the industry as a whole such a change would bring.
Every efficiency wave made life better for humans. Why should this one be different?
Assume many people lose their jobs. This in turn means companies will have higher margins. Higher margins attract more competition. More competition means lower margins since some will use the lower costs to offer lower prices.
Lower prices increase quality of life for everyone.
People who lost their job might be able to pick up doing something they actually enjoy…
That's so out of touch.
First, you're conveniently ignoring the possibility that people actually like the job they are about to lose.
And believe it or not most people aren't toiling away at jobs they hate because it never occurred to them to do something they like more. They work jobs they dislike because it's the only choice they have because they have to pay their bills so they can survive and so that their dependents can have an acceptable life.
We just gloss over them and vilify the ones who tried to do anything about it (the ones that weren't executed also died in poverty).
It's more probable they lose everything before ending up with a worse job that pays less.
If you are relying on the LLM and context, then unless your context is a secret your competitor is only ever one prompt behind you. If you're willing to pursue true novelty, you need a human and you can leap beyond your competition.
The reality is that a huge portion of my time is spent doing similar work and what LLMs largely do is pick up the smaller tasks or features that I may not have prioritized otherwise. Revolutionary in one sense, completely banal and a really minor part of my job in many others.
It still is completely and utterly hopeless
I found success with it pretty easily for those smaller projects. They were gamedev projects, and the process was basically to generate a source of truth AST and diff it vs a target language AST, and then do some more verifier steps of comparing log output, screenshot output, and getting it to write integration tests. I wrote up a bit of a blog on it. I'm not sure if this will be of any use to you, maybe your case is more difficult, but anyway here you go: https://sigsegv.land/blog/migrating-typescript-to-csharp-acc...
For me it worked great, and I would use (and am using) a similar method for more projects.
The tests it writes in my experience are extremely terrible, even with verbose descriptions of what they should do. Every single test I've ever written with an LLM I've had to modify manually to adjust it or straight up redo it. This was as recent as a couple months ago for a C# MAUI project, doing playwright-style UI-based functionality testing.
I'm not sure your AST idea would work for my scenario. I'd be wanting to convert XNA game-play code to PhaserJS. It wouldn't even be close to 95% similar. Several things done manually in XNA would just be automated away with PhaserJS built-ins.
That's... not a good look for your engineers?
And in the longer term those people will also get deprecated.
Then any company that was staffed at levels needed prior to the arrival of current-level LLM coding assistants is bloated.
If the company was person-hour starved before, a significant amount of that demand is being satisfied by LLMs now.
It all depends on where the company is in the arc of its technology and business development, and where it was when powerful coding agents became viable.
But, to be robust you want a signal handler with clean shutdown, a circuit breaker, argument processing (100 lines right there), logging, reporting progress to our dashboard (it's going to run 10-15 days), checking errors and exceptions, retrying on temp fail, documentation... It adds up.
So it could be shorter, but it's not like there is anything superfluous in it.
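To give a feel for how those lines pile up, here is a stripped-down sketch of just a few of the pieces listed above (clean shutdown on a signal, a circuit breaker, retry on temporary failure, argument parsing). It is not the script being discussed; every name in it (TemporaryFailure, CircuitBreaker, with_retry, run, the --max-failures flag) is invented for illustration, and the dashboard reporting and documentation are left out entirely.

```python
import argparse
import logging
import signal
import time

log = logging.getLogger("longjob")
shutdown_requested = False


def _request_shutdown(signum, frame):
    """Signal handler: only set a flag so the main loop can exit cleanly."""
    global shutdown_requested
    shutdown_requested = True


def install_signal_handlers():
    signal.signal(signal.SIGINT, _request_shutdown)
    signal.signal(signal.SIGTERM, _request_shutdown)


class TemporaryFailure(Exception):
    """Hypothetical error type for failures worth retrying."""


class CircuitBreaker:
    """Opens after max_failures consecutive failed items."""

    def __init__(self, max_failures=5):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.max_failures

    def record(self, ok):
        self.failures = 0 if ok else self.failures + 1


def with_retry(task, attempts=3, delay=0.0):
    """Call task(), retrying on TemporaryFailure up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except TemporaryFailure:
            log.warning("attempt %d/%d failed", attempt, attempts)
            if attempt == attempts:
                raise
            time.sleep(delay)


def run(items, process, breaker):
    """Process items until done, shutdown is requested, or the breaker opens."""
    done = 0
    for item in items:
        if shutdown_requested or breaker.open:
            break
        try:
            with_retry(lambda: process(item))
            breaker.record(True)
            done += 1
        except TemporaryFailure:
            breaker.record(False)
    return done


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Long-running one-off job")
    parser.add_argument("--max-failures", type=int, default=5)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    logging.basicConfig(level=logging.INFO)
    install_signal_handlers()
    finished = run(range(10), lambda item: None, CircuitBreaker(args.max_failures))
    log.info("processed %d items", finished)
```

That is already around 70 lines with a do-nothing work function, which is roughly the point being made: none of it is superfluous, and it adds up.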
Maybe 2 years ago AI was doing random stuff and we got all those funny screenshots of dumb Gemini answers. The indeterminism leading to random stuff isn't really an issue any more.
The way it thinks keeps it on track.
Job done fella.
I totally understand that not everyone is having that experience. And yet until people live it, it seems they just discount the experience others are having.
I'll take the 12 month bet.
It's clearly relative. For all we know you're a crap coder and AI is now your crutch. We have no evidence that with AI you are as good as an average developer with a fair amount of experience. And even if you do have a fair amount of experience, that doesn't mean you're a good coder.
Let us not lose sight of how we got here.
Guys like you don't get it. You think OAI, Amazon etc can freely put large amounts of money into this for 5-10 years? Lmao - delusional. Investors are impatient. Show huge jumps in revenue this year or you no longer have permission to put monumental amounts of money into this anymore.
Short of that they'll just destroy the stock price by selling off; leaving employees who get paid via SBC very unhappy.
For the tests, I'm not sure why we have such different results but essentially it took a codebase I had no tests in, and in the port it one shot a ton of tests that have already helped me in adding new features. My game server for it runs in kubernetes and has a "auto-distribute" system that matches players to servers and redistributes them if one server is taken offline. The integration tests it wrote for testing that auto-distribute system found a legit race condition that was there in both the old and new code (it migrated it accurately enough that it had the same bugs) and as part of implementing that test it fixed the bug.
Of course I wouldn't use it if it wasn't a good tool but for me the difference between doing this port via this method versus doing it manually in prior massive projects was such an insane time save that I would have been crazy to do it any other way. I'm super happy with the new code and after also getting the test infra and stuff like that up it's honestly a huge upgrade from my original code that I thought I had so painstakingly crafted.
"I have an arbitrary width rectangle that needs to be broken into smaller random width rectangles (maintaining depth) within a given min/max range. The solution needs to be highly performant from an algorithmic standpoint, well-tested using TDD and Red/Green testing, written in python, and not have any subtle errors."
It got the answer you ended up with (if I'm understanding you correctly) the first time in just over 2 minutes of working, and included a solid test suite examining edge cases and with input validation.
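Since no code was posted at this point in the thread, here is one possible shape of a solution to the prompt above, for reference only. It is not the code either commenter ended up with. It assumes integer widths (e.g. pixels) and a hypothetical function name, split_width. The idea: pick a feasible piece count, give every piece the minimum width, then hand out the leftover slack so that no piece exceeds the maximum.

```python
import random


def split_width(total, min_w, max_w, seed=None):
    """Split `total` into random integer widths, each within [min_w, max_w].

    Raises ValueError when no such split exists (e.g. total=7, range 5..6).
    """
    if total <= 0 or not (0 < min_w <= max_w):
        raise ValueError("total and widths must be positive, min_w <= max_w")
    rng = random.Random(seed)

    fewest = -(-total // max_w)  # ceil(total / max_w)
    most = total // min_w        # floor(total / min_w)
    if fewest > most:
        raise ValueError("no valid split for this total and range")

    count = rng.randint(fewest, most)
    slack = total - count * min_w  # width still to hand out above the minimums
    room = max_w - min_w           # most extra width any one piece can absorb

    widths = []
    for remaining in range(count - 1, -1, -1):
        # Take at least what the later pieces cannot absorb,
        # and at most what this piece has room for.
        low = max(0, slack - remaining * room)
        high = min(room, slack)
        extra = rng.randint(low, high)
        widths.append(min_w + extra)
        slack -= extra
    return widths


if __name__ == "__main__":
    pieces = split_width(100, 5, 20, seed=42)
    print(pieces, sum(pieces))
```

It runs in time linear in the number of pieces. The widths are not uniformly distributed over all valid splits (earlier pieces tend to grab more slack), so if that matters, shuffle the result or use a different sampling scheme.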
I appreciate you testing, even though it's not a great comparison:
- My feedback cycle of LLM prompting forced me to be more explicit with each call, which benefited your prompt since I gave you exactly what to look for with fewer nuances.
- Maybe GPT 5.1 is old or kneecapped for newer versions of GPT
- Maybe Opus/Claude is just a way better model :P
Please post the code!
Edit: Regarding "exactly what to look for", when solving a new problem, rarely is all the nuance available for the first iteration.
Sorry to be so blunt, but it's not surprising that you aren't able to get much value from these tools, considering you don't use them much.
Getting value from LLMs / agents is a skill like any other. If you don't practice it deliberately, you will likely be bad at it. It would be a mistake to confuse lack of personal skill for lack of tool capability. But I see people make this mistake all the time.
If it's "you didn't explain the problem clearly enough", then that aligns with my original comment.
The loss of skills, complete loss of visibility and experience with the codebase, and the complete lack of software architecture design, seems like a massive killer in the long term
I have a feeling that we're going to see productivity with AI drop through the floor
Just yesterday I asked Opus 4.6 what I could do to make an old macOS AppKit project more testable, too lazy to even encumber the question with my own preferences like I usually do, and it pitched a refactor into Elm architecture. And then it did the refactor while I took a piss.
The idea that AI writes bad software or can't improve existing software in substantial ways is really outdated. Just consider how most human-written software is untested despite everyone agreeing testing is a good idea, simply because test-friendly arch takes a lot of thought and test maintenance slows you down. AI will do all of that, just mention something about 'testability' in AGENTS.md.
AI writes bad software by virtue of it being written by the AI, not you. No actual team member understands what's going on with the code. You can't interrogate the AI for its decision making. It doesn't understand the architecture it has built. There's nobody you can ask about why anything is built the way it is - it just exists
It's interesting watching people forget that the #1 most important thing is developers who understand a codebase thoroughly. Institutional knowledge is absolutely key to maintaining a codebase, and making good decisions in the long term
It's always been possible to trade long term productivity for short term gains like this. But now you simply have no idea what's going on in your code, which is an absolute nightmare for long term productivity
Remember this famously happened before, in the 1970s
> Now imagine
> Imagine what that would do
Imagine if your grandma had wheels! She'd be a bicycle. Now imagine she had an engine. She could be a motorcycle! Unfortunately for grandma, she lives in reality and is not actually a motorcycle, which would be cool as hell. Our imagination can only take us so far.
To more substantively reply to your longer linked comment: your hypothesis is that people spend as little as 10% of time coding and the other 90% of time in meetings, but that if they could code more, they wouldn't need to meet other people because they could do all the work of an entire team themselves[1]. The problem with your hypothesis is that you take for granted that LLMs actually allow people to do the work of an entire team themselves, and that it is merely bureaucracy holding them back. There have been absolutely zero indicators that this is true. No productivity studies of individual developers tackling tasks show a 10x speedup; results tend to be anywhere from +20% to minus 20%. We aren't seeing amazing software being built by individual developers using LLMs. There is still only one Fabrice Bellard in the world, even though if your premise could escape the containment zone of imagination anyone should be able to be a Bellard on their own time with the help of LLMs.
[1] Also, this is basically already true without LLMs. It is the reason startups are able to disrupt corporate behemoths. If you have just a small handful of people who spend the majority of their work time writing code (by hand! No LLMs required!), they can build amazing new products that outcompete products funded by trillion-dollar entities. Your observation of more coding = less meetings required in the first place has an element of truth to it, but not because LLMs are related to it in any particular way.
> Imagine if your grandma had wheels! She'd be a bicycle.
I always took this to be a sharp jab saying the entire village is riding your grandma, giving it a very aggressive undertone. It's pretty funny nonetheless.
Too early to say what AI brings to the efficiency table, I think. In some major things I do it's a 1000x speedup. In others it is more a different way of approaching a problem than a speedup. In yet others, it is a bit of an impediment. It works best when you learn to quickly recognize patterns and whether it will help. I don't know how people who are raised with AI will navigate and leverage it, which is the real long-term question (just as the difference between pre- and post-smartphone generations is a thing).
EDIT: Retracted, I think the example given below is reasonably valid.
The only study showing a -20% came back and said, "we now think it's +9% - +38%, but we can't prove rigorously because developers don't want to work without AI anymore": https://news.ycombinator.com/item?id=47142078
Even at the time of the original study, most other rigorous studies showed -5% (for legacy projects, obsolete languages) to 30% (more typical greenfield AND brownfield projects) way back in 2024. Today I hear numbers up to 60% from reports like DX.
But this is exactly missing the point. Most of them are still doing things the old way, including the very process of writing code. Which brings me to this point:
> There have been absolutely zero indicators that this is true.
I could tell you my personal experience, or link various comments on HN, or point you to blogs like https://ghuntley.com/real/ (which also talks about the organizational impedance mismatch for AI), but actual code would be a better data point.
So there are some open-source projects worth looking at, but they are typically dismissed because they look so weird to us. Here are two mostly vibe-coded (as in, minimal code review, apparently) projects that people shredded for having weird code, but that are already used by tens of thousands of people, at 11K to 18K stars now. Look at the commit volume and patterns for O(300K) LoC in a couple of months, mostly from one guy and his agent:
https://github.com/steveyegge/beads/graphs/commit-activity
https://github.com/steveyegge/gastown/graphs/commit-activity
It's like nothing we've seen before, almost equal number of LoC additions and deletions, in the 100s of Ks! It's still not clear how this will pan out long term, but the volume of code and apparent utility (based purely on popularity) is undeniable.
If you are referring to the following quote [0], you are off by a sign:
> we now estimate a speedup of -18% with a confidence interval between -38% and +9%.
I dismiss them because Yegge's work (if it can even be called his work, given that he doesn't look at the code) is steaming garbage with zero real-world utility, not "because they look weird". You suggest the apparent utility is undeniable, while saying "based purely on popularity" -- but popularity is in no way a measure of utility. Yegge is a conman who profited hundreds of thousands of dollars shilling a memecoin rugpull tied to these projects. The actual thousands of users are people joining the hypetrain, looking to get in on the promised pyramid scheme of free money where AI will build the next million dollar software for you, if only you have the right combination of .md files to make it work. None of these software are actually materialising, so all the people in this bubble can do is make more AI wrappers that promise to make other AI wrappers that will totally make them money.
I am completely open to being proven wrong by a vibe-coded open source application that is actually useful, but I haven't seen a single one. Literally not even one. I would count literally anything where the end-product is not an AI wrapper itself, which has tens to hundreds of thousands of users, and which was written entirely by agents. One example of that would be great. Just one. There have been a couple of attempts at a web browser, and Claude's C compiler, but neither is actually useful or has any real users; they are just proofs of concept, and I have seen nothing that convinces me they are a solid foundation from which you could actually build useful software, or that models will ever be on a trajectory to make them actually useful.
The iPod project was done in months, not years. I'm convinced most people aren't as good at programming / focusing on the right stuff as they claim.
You'd get the same sort of results if you were studying the benefits of substance abuse.
"It is difficult to study the downsides of opiates because none of our participants were willing to go a day without opiates. For this reason, opiates must be really good and we're just missing something."
The language is confusing, but the chart helps: https://metr.org/assets/images/uplift-2026-post/uplift_timel...
There’s obviously a benefit of paying higher rates for US programmers, but does that benefit change when LLMs are thrown into the mix?
It takes more planning, more specification, more coordination, more QA. The quality is almost always worse, and remediation takes forever. So your BA, QA and PM time goes way up and absorbs any cost savings.
YMMV.
This assumes that the companies' business growth is a function of the amount of code written, but that would not make much sense for a software company.
Many companies (including mine) are building our product with an engineering team 1/4 the size of what would have been required a few years ago. The whole idea is that we can build the machine to scale our business with far fewer workers.
Even in companies that are no longer growing I've always seen the roadmap only ever get larger (at that point you get desperate to try to catch back up, or expand into new markets, while also laying people off to cut costs).
Will we finally out-write the backlog of ideas to try and of feature requests? Or will the market get more fragmented as more smaller competitors can carve out different niches in different markets, each with more-complex offerings than they could've offered 5 years ago?
This is already happening. Fewer people are getting hired. Companies are quietly (sometimes not, like Block) letting people go. At a personal level all the leaders in my company are sounding the “catch up or you’ll be left behind” alarm. People are going to be let go at an accelerated pace in the future (1-3 years).
> We find no systematic increase in unemployment for highly exposed workers since late 2022
It is absolutely likely. The hiring market for juniors is fucked atm.
(And if it is, what is the cause?)
Lots going on right now in the market, but IMO that retreat is the biggest one still.
Many companies were basically on a path of infinite hiring between ~2011 and ~2022 until the rapid COVID-era whiplash really drove home "maybe we've been overhiring" and caused the reaction and slowdown that many had been predicting annually since, oh, 2015.
There's a lot of perverse interests and incentives at play.
> We find no systematic increase in unemployment for highly exposed workers since late 2022
Also, don't forget there are only so many viable revenue-generating and cost-saving projects to take on. And, as said above, overhiring during COVID.
Opus 4.6 didn't have an issue with this question though.
No, but it can show unreliability for adjacent tasks. Identifying a CIDR block in traffic logs is a normal part of an ops workflow. It means it's more likely to fail if you need to generate a complex regex to filter PII from a terabyte of logs. If the model has a blind spot for specific characters because it tokenizes words instead of seeing individual characters, then it can miss a critical path of failure because the service name didn't fit its probabilistic training.
Maybe you need to boilerplate Terraform. If the model can't reliably (reliably, as in, 100% deterministic, does this without fail) parse constraints, it's not just a funny mistake, it's a potential 5-figure billing error.
Ops can't run on "mostly accurate." That's just simply not good enough. We need deterministic precision.
For AI to be useful in this world to the extent others have claimed it is for software eng, we'll likely need more advanced world models, not just something that can predict the next most likely token.
If your AI workflow is still dumping logs into a chat and telling it to search for some pattern, then you should see how something like Claude Code approaches problems. These agents are building scripts to solve problems, which is your deterministic solution.
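For concreteness, the kind of script being talked about for the CIDR case might look like the sketch below. It is only an illustration: the function name `lines_in_cidr`, the sample log lines, and the log format are all made up here, and the regex only finds IPv4-looking candidates, with the actual validation and containment check left to Python's standard `ipaddress` module.

```python
import ipaddress
import re

# Finds IPv4-looking tokens; ipaddress does the real validation below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def lines_in_cidr(lines, cidr):
    """Return the log lines that mention at least one address inside `cidr`."""
    network = ipaddress.ip_network(cidr, strict=False)
    matches = []
    for line in lines:
        for token in IPV4_CANDIDATE.findall(line):
            try:
                address = ipaddress.ip_address(token)
            except ValueError:
                continue  # e.g. "999.1.1.1" looks like an IP but is not one
            if address in network:
                matches.append(line)
                break  # one hit is enough to keep the line
    return matches

# Hypothetical log lines, invented for this example.
sample_log = [
    "GET /health from 10.0.3.17 status=200",
    "GET /login from 192.168.1.5 status=401",
    "POST /api from 10.0.4.1 status=500",
    "GET /x from 999.1.1.1 status=400",
]
hits = lines_in_cidr(sample_log, "10.0.3.0/24")
print(hits)
```

Whoever (or whatever) writes it, the same input always gives the same output, which is the property being argued about here. Whether the script is correct still has to be checked by someone who can read it.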
People always say: "Things ended up working out in the end"
Things only worked out in the sense that society carried on without all the people who lost their jobs.
The U.S. has recent examples of large scale job destruction.
Michigan: From 2000-2009. Massive job destruction. 330,000 auto workers in 2000. Down to 109,000 in 2009. Estimates are that 1/3-1/2 of all those affected never achieved equal/similar employment. That is, somewhere around ~70k-120k workers never earned as much as they previously did. Since this was mostly contained within one city (Detroit), it's pretty easy for the country to ignore it and go on with their lives.
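A quick back-of-the-envelope check of those figures, as a sketch. It assumes "those affected" means the net drop in auto jobs between the two years, which is an interpretation and not something the sources are quoted as saying:

```python
# Figures quoted in the comment.
auto_jobs_2000 = 330_000
auto_jobs_2009 = 109_000

# Assumption: "those affected" = net jobs lost between 2000 and 2009.
jobs_lost = auto_jobs_2000 - auto_jobs_2009   # 221,000
share_lost = jobs_lost / auto_jobs_2000       # roughly two thirds

# One third to one half of those never regained similar employment.
never_recovered_low = jobs_lost / 3
never_recovered_high = jobs_lost / 2

print(f"jobs lost: {jobs_lost:,} ({share_lost:.0%} of the 2000 level)")
print(f"never recovered: {never_recovered_low:,.0f} to {never_recovered_high:,.0f}")
```

Under that reading the range comes out at roughly 74k to 110k workers, in the same ballpark as the rounded figure quoted.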
(Detroit was in decline since the 50's really. 2000-2009 is just a particularly bad snapshot.)
Coal mining towns have experienced the same phenomenon but more gradually. The poverty left behind by the destruction of those jobs has never been addressed.
With AI, we are heading into a situation where potentially a much larger number of people will be affected. So maybe that changes the calculus on the government stepping in and fixing the problem. But I wouldn't count on it.
Sources for Michigan numbers:
https://lehd.ces.census.gov/doc/workshop/2010/LEDautopres031...
https://research.upjohn.org/cgi/viewcontent.cgi?article=1205...
It's concentrated in Detroit but also distributed throughout the state, as you can observe in the census.gov slides.
The devastation is regional. It's been a wild experience, watching it all fall apart over the last 40+ years. The decay is immense and impossible to convey to someone from a rich state. Someone from the Eastern Bloc might get it, but I've never been able to communicate it to a Californian. Hop in a car and drive from town to town. Once-prosperous communities are boarded up and gradually reclaimed by nature. Department stores are converted into soup kitchens or marijuana dispensaries.
"Things will work themselves out" is not a law of nature, unless we broaden our definition of "things working out" to include outcomes like "everyone young enough flees, everyone else clutches their savings until they eventually die impoverished."
But with AI, even outcomes like that might be overly optimistic. Where will young people flee to? Where can they go, what trade can they learn, to be safe enough to eventually die in comfort?
When I look at Michigan I see both the past and the future, and I am planning accordingly.
During the Industrial Revolution many artisans and skilled tradespeople lost their livelihoods.
And yet, while many people did suffer serious short-term hardship and wage collapse, most did not simply remain in lifelong poverty, because over time industrialization created new types of employment and average wages eventually rose.
You don’t want to go back to before the Industrial Revolution. Do you?
It's not acceptable to attack a fellow community member like this on HN. The guidelines make it clear we're aiming for better than this:
Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.
Comments should get more thoughtful and substantive, not less, as a topic gets more divisive.
When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."
Please don't use Hacker News for political or ideological battle. It tramples curiosity.
The status quo was that I have no better understanding of code I haven't touched in a year, or code built by other people. Now I have the option to query the code with AI to bootstrap my understanding to exactly the level necessary.
But you're wrong on every claim about LLM capabilities. You can ask the AI exactly why it decided on a given design. You can ask it what the best options were and why it chose that option. You can ask it for the trade-offs.
In fact, this should be part of your Plan feedback loop before you move to Implementation.
If you choose to take AI reasoning at face value, you're choosing to accept pretty strong technical debt
You can be an exec with 10-20% fewer random products/departments in your company, and maybe 40% fewer middle managers in the rest of them. You might even get a nice bonus for cutting all that cost! Bonuses for growth, bonuses for "efficiency" when the macro vibe shifts. Trim sails and carry on.
How long have we been hearing about crushing affordability problems for property? And how long ago did that start moving into essentials? The COVID-era bullwhip-effect inflation waves triggered a lot of price ratcheting that has slowed but never really reversed. Asset prices are doing great, as people with money continue to need somewhere to put it, and have been very effective at capturing greater and greater shares of productivity increases. But how's the average waiter, cleaning-business sole-proprietor, Uber driver, schoolteacher, or pet supply shopowner doing? How's their debt load trending? How's their savings trending?
[1] https://www.npr.org/2026/02/12/nx-s1-5711455/revised-labor-d...
[2] https://www.marketplace.org/story/2025/12/18/expect-more-of-...
> popularity is in no way a measure of utility
Why would it be popular if it's not useful? Yegge is not like some superstar whose products are popular just because he made them. And while some people may be chasing dollars, most of them are building software that scratches an itch. (Search for Beads on GitHub, you'll find thousands of public repos, and lord knows how many private repos.)
Beads has certainly made my agents much more effective, even the older models. To understand its utility you have to do agentic coding for a while, see the stupid mistakes agents make because they forget everything, and then introduce Beads and see almost all those issues melt away.
> None of these software are actually materialising
They are if you look for them. There are many indications (often discussed here) showing spikes in apps on app stores, number of GitHub projects, and Show HN entries. Now, you may dismiss these as "not actually useful", and at this volume that's undoubtedly true for a lot of them.
But there is already early data showing growth not only in mobile app downloads, but also time spent per user and revenue -- which are pretty clear indications of utility: https://sensortower.com/blog/state-of-mobile-2026
Edit: it occurs to me that by "vibe-coding" we may be talking about two different things -- I tend to mean "heavily AI-assisted coding" whereas you likely mean "never look at the code YOLO coding." I'll totally agree that YOLO vibe-coded apps by non-experts will be crap. Other than Beads and Gastown I don't know of any such app that is non-trivial. But then those were steered by a highly experienced engineer, and my original point was, vibe-coding correctly could look very weird by today's best practices.
The original point that sparked this sub-thread though is that AI is being overhyped. If actual vibe coding (YOLO it, never look at or understand the code, thus truly enabling non-technical folk to have revolutionary power and ability) doesn't work, then AI is just yet another tool in the toolbelt, like any other developer-life-enhancing tech we've had so far; it's just a new form of IDE.
Being a new form of IDE, while very useful, isn't exactly revolutionary tech that transforms the entire economy. If it can't be used by someone with zero computer/eng experience to build something useful and revenue-generating, the amount of investment we've seen in it is way overblown and is well overdue for a pretty severe correction.
I buy AI as a "developer enhancing tool" just like any other devtools that we've seen over my career. I don't currently buy it as a "total labor economy transformation force."
So when I see people online say they feel 10x productive, I tend to believe them. But I'm working solo, and have none of the encumbrances and "Conway Overhead" of coordinating with a lot of other people, so I also understand why the overall effect is so limited. Which is why I think current companies are "shaped wrong" for AI.
When companies eventually adapt, it will be a "labor economy transformation force" because the same dynamics will play out across all knowledge work. And I am not talking as an AI booster, but as a parent whose kids are interested in software engineering; I have every incentive to hope my prognostications do not come true, but I prefer being prepared for the worst.
My point is the cat is out of the bag. It doesn't take massive investments to achieve iterative improvements on SOTA. As long as the technology does not plateau, smaller labs have shown it's possible to advance the frontiers independent of large companies/investments. And as these frontiers advance, more and more economically valuable knowledge work will be subsumed by AI. I don't see a way out of this, which is why I am a strong proponent of wealth distribution, e.g. UBI.
Won't matter. The Chinese models will be running on potatoes by then and be better than ever.
This place is full of bozos.
If you have a different understanding of the topic, share it, so all can benefit. That's what people do when they are sincere about contributing positively here.
If instead you insist on continuing to use abusive terms towards others here, we'll have to ban the account.
I’m reminded of https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-s...
How much bloat and bureaucracy bottleneck is sitting in middle management whose favorite pastime is wasting everyone's time on meetings that could have been an email? HR? Not the execs, but the HR drones that do nothing but answer employee questions about policy, could have already been replaced with not even an AI, just an old school chatbot, a long time ago.
Instead of cutting engineers, cut the non-tech jobs, flatten the structure.
Why would you think you have your finger on the pulse of general software trends like that when you use the same, what, dozen apps every week?
Just looking at my own productivity, as mere side projects this month, I've shipped my own terminal app (replaced iTerm2), btrfs+luks NAS system manager, overhauled my macOS gamepad mapper for the app store, and more. All fully tested and really polished, yet I didn't write any code by hand. I would have done none of that this month without AI.
You'd need some real empirics to pick up productivity stories like mine across the software world, not vibes.
Most of the software I replaced was software I was paying for (iStat Menus, Wispr Flow, Synology/Unraid). That I was paying for a project I could trivially take on with AI was one of the main incentives to do it.
I'm happy to sell it to you, though it is also free. I guided Claude to write this in three weeks, after never having written a line of JavaScript or set up a server before. I'm sure a better JavaScript programmer than I could do this in three weeks, but there's no way I could. I just had a cool idea for making advertising a force for good, and now I have a working version in beta.
I'd say it is better software, but better is doing a lot of heavy lifting there. Claude's execution is average and always will be, that's a function of being a prediction engine. But I genuinely think the idea is better than how advertising works today, and this product would not exist at all if I had to write it myself. And I'm someone who has written code before, enough that I was probably a somewhat early adopter of this whole thing. Multiply that by all the people whose ideas get to live now, and I'm sure some ideas will prove to be better even with average execution. Like an LLM, that's a function of statistics.
I guess it does depend on the languages involved; one study suggests that it's even worse than Google Translate for some languages, but maybe actually okay at English<-->Spanish?
> There were 132 sentences between the two documents. In Spanish, ChatGPT incorrectly translated 3.8% of all sentences, while GT incorrectly translated 18.1% of sentences. In Russian, ChatGPT and GT incorrectly translated 35.6% and 41.6% of all sentences, respectively. In Vietnamese, ChatGPT and GT incorrectly translated 24.2% and 10.6% of sentences, respectively.
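To put those percentages in terms of actual sentences, here is a rough conversion. It assumes each system translated all 132 sentences in each language, which the quote implies but does not state outright, so treat the counts as implied rather than reported:

```python
TOTAL_SENTENCES = 132  # from the quoted passage

# Quoted error rates in percent: (ChatGPT, Google Translate)
error_rates = {
    "Spanish": (3.8, 18.1),
    "Russian": (35.6, 41.6),
    "Vietnamese": (24.2, 10.6),
}

def implied_count(percent, total=TOTAL_SENTENCES):
    """Nearest whole number of sentences that a quoted percentage implies."""
    return round(percent / 100 * total)

counts = {
    language: (implied_count(chatgpt), implied_count(gt))
    for language, (chatgpt, gt) in error_rates.items()
}

for language, (chatgpt, gt) in counts.items():
    print(f"{language}: ChatGPT ~{chatgpt}, GT ~{gt} of {TOTAL_SENTENCES}")
```

With only 132 sentences, a single mistranslated sentence moves the rate by about 0.76 percentage points, so the smaller gaps rest on a handful of sentences.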
It's all right there on my website in parallel text, everybody can check and come to their own conclusion rather than driving by with unhelpful generalizations. And really, that is the primary scope of these translations: as aids in reading an original text.
Claude resorting to writing code for everything, because that's all the model can do without too many hallucinations and context poisoning, is just a higher speed REPL. Great, that's useful.
But that's not what is being hyped and sold. What's being hyped and sold is "You don't need an Ops guy anymore, just talk to the computer." Well, what happens when the AI decides the "fix" is to just open up 0.0.0.0/0 to the world to make the errors go away? The non-technical minimum wage person now just talking to the computer has no idea they just pwned the company.
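To make the 0.0.0.0/0 scenario concrete: the check a technical reviewer would run on a proposed change is tiny, as in the sketch below. The rule format (dicts with `port` and `source`) and the function name `world_open_rules` are invented for illustration and do not correspond to any particular cloud provider's API.

```python
import ipaddress

def world_open_rules(rules):
    """Return ingress rules whose source range covers the whole address space."""
    flagged = []
    for rule in rules:
        network = ipaddress.ip_network(rule["source"], strict=False)
        if network.prefixlen == 0:  # matches 0.0.0.0/0 and ::/0
            flagged.append(rule)
    return flagged

# Hypothetical rule set an agent might propose.
proposed = [
    {"port": 443, "source": "0.0.0.0/0"},
    {"port": 22, "source": "10.0.0.0/8"},
    {"port": 5432, "source": "::/0"},
]

for rule in world_open_rules(proposed):
    print(f"needs review: port {rule['port']} open to {rule['source']}")
```

This only flags rules for a human to look at. Knowing that a public 443 may be fine while a public 5432 almost never is takes exactly the judgment the person "just talking to the computer" is assumed not to have.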
If AI's answer is "Just write a script to solve the prompt" then you still need technical people, and it's vastly overhyped.
I'll be interested when you actually can just dump logs in a chat and analyze it without the model having to resort to writing code to solve the problem. That will be revolutionary. Imagine all the time I'd save by not having to make business reports, I can just tell the business people to point AI at terabytes of CSV exports and just ask it questions. That is when it will stop just being labor compression for existing engineers, and start being a world changing paradigm shift.
For now, it's just yet another tool in my toolbelt.
Can you expand on that? Because it sure seems to me like it is in fact deterministic unless the person deliberately made it otherwise
This concept will never work outside of their own head. People continue to think producing something is the hard part, my word.
No idea. I certainly didn't get it. Goal tracker is one thing, ad blocker is another thing. Why would I want to combine them? And why would I want to see any ads at all? Perhaps I'm just not the target audience...
I didn't get it either at first glance when scrolling down the whole page.
That second point is the part that seems obvious to me but I have a hard time communicating.