Agentic Coding Is a Trap (larsfaye.com)
If you ask it to, say, update the major version of some library, it will read the source of the new version, check the deprecations, attempt the changes based on that, rerun tests... a completely different level of utility.
It's even more ridiculous with access to server logs and such, as you can point it to a chart, say there were some errors in X service at Y time, and it'll dutifully look at logs in that window, check traces if available, look at caller services, check the database if needed, and come up with a hypothesis on what happened based on all the available information. It might miss things, but that's why you are there too. No need to be a prompting wizard that gives it everything it needs to get you the right answer in one shot: It's like pair programming with someone that has encyclopedic knowledge in many topics, but hasn't worked at your company before. A completely different experience.
It's quite different.
The actual trap is not writing agentic code, but relying on it within a team environment where I fail to build a mental model. In a solo context, that cannot happen since the mental model will still fail for new cases, and I am the only one who can repair it.
I think many people already recognize the problem:
- “Our ability to write code is being damaged.”
- “If our ability to write code declines, our ability to recognize good code also declines.”
But the problem is that the market no longer works without LLMs.
Freelance rates and deadlines are now calibrated around LLM-assisted output. Even clients who write “do not vibe code” often set deadlines that are impossible to meet unless you use something like vibe coding. The client’s expectations themselves are becoming abnormal.
That is the irony of the market.
I honestly do not know what to do.
Recent Hacker News discussions are mostly a negative echo chamber about AI use. In other places, it is often the opposite: only positive echo. But almost nobody discusses the actual solution.
The main topics I keep seeing are roughly these:
1. Is the large repository PR system failing a fundamental stress test? Or should AI-generated (GenAI) code simply not be merged? If PR review is moving from handmade production to mass production, how should the PR system change? Or should it remain the same?
2. As vendor lock-in continues, can we move toward local LLMs to escape it? Are cost and harness design manageable? What level of local model is required to reach a similar coding speed?
3. If we are forced to use agentic coding, how do we avoid damaging our own ability to code? There is a passage from Christopher Alexander that I keep thinking about:
“A whole academic field has grown up around the idea of ‘design methods’—and I have been hailed as one of the leading exponents of these so-called design methods. I am very sorry that this has happened, and want to state, publicly, that I reject the whole idea of design methods as a subject of study, since I think it is absurd to separate the study of designing from the practice of design. In fact, people who study design methods without also practicing design are almost always frustrated designers who have no sap in them, who have lost, or never had, the urge to shape things.” — Christopher Alexander, 1971
This quote feels relevant to programming now. If we separate the study and supervision of programming from the actual practice of making, something important may be lost.
In architecture, there is this idea that without practice, the architect loses meaning. But now the market is forcing the separation.
People with enough symbolic capital and high status have the freedom not to use AI. But people lower in the market are under pressure to use it.
So I think the discussion now needs to move beyond whether AI coding is good or bad.
The real question is: how do we keep using AI because the market demands it, while still preserving the human practice that makes programming meaningful and keeps our judgment alive?
I think these are the important questions. How do you maintain market value without using AI?
Or, if you do use AI, how do you avoid being treated as low-quality?
If you do not use AI, how can you remain more competitive than people who do use it?
If you do use AI, what advantage do you have over people who do not use it, and how should you position yourself?
I know that agentic coding can cause skill degradation. I can feel it happening to me already. But for someone like me, who does not have strong status, credentials, or symbolic capital, social and market pressure makes AI almost unavoidable.
What frustrates me is that I do not see practical answers anywhere.
Software engineering without a proper SDLC is a trap.
Driving without a seatbelt is a trap.
Agentic agile > agentic waterfall (at least for now)
Don't give the AI a spec, work with it every step of the way.
> pulls the slot machine lever over and over (link to "One More Prompt: The Dopamine Trap of Agentic Coding")
I'm sure the first cave-person to discover how to make fire was equally "addicted" to making fires. That doesn't really say anything about the underlying technology.
> An increase in the complexity of the surrounding systems to mitigate the increased ambiguity of AI's non-determinism
I don't know what this means, exactly. Anyone have any ideas?
> Atrophying skills for a wide swath of the population
This is very real and something we're going to have to contend with. Software can't really become less complex, and there's a minimum amount of knowledge you need, with or without AIs there to help you. We may need specialized training academies for developers where they spend a few years without AI to learn to program, and then are given a few years of AI programming.
> Vendor lock-in for individuals and entire teams
This isn't really a big problem; you can always switch AI providers if there's frequent downtime.
> only a skilled developer who's thinking critically, and comfortable operating at the architectural level, can spot issues in the thousands of lines of generated code, before they become a problem
Agreed...
> Yet, in an ironic twist of fate, it's the individual's critical thinking skills and cognitive clarity that AI tooling has now been proven to impact negatively.
...well, yes and no. AI tooling can help you _reduce_ cognitive debt. Picture this: There is one senior developer (Person A) on the team who understands Service X. Your other developers could schedule time with Person A to get an understanding. Or, they could ask the AI to analyze the project and explain it to them. This scales much better, and if Person A is a poor communicator (let's face it, many senior engineers are), it might be the only working option.
If you're afraid of cognitive decline - try to get to proper orchestration using multiple agents. That's a fun exercise.
There's also the 'more stuff is being delivered, but it's not right, full of holes and papercuts'.
I'm 22 years into development and couldn't think of going back to non AI programming now. Not only has it sped up velocity by an order of magnitude, it's also helped me unlock side projects that I would never even begin in the past as I knew I didn't have that time.
It's just like any tool though, and I've found enormous differences in outcome depending on how you drive it. Launching into 'build this' and expecting it to output code that you would manually write would not get you there; and I feel this is where most developers stall out.
Getting the right outcomes takes a lot of harness setup - the same as if you wanted to hire new devs and get them productive without pairing with them. You would set up linting, good test coverage and approaches, thorough documentation about what your project is, the domain, the architecture etc. This at least gets good code consistency for the most part.
For how to build, https://github.com/bmad-code-org/BMAD-METHOD is really good and I've onboarded a few Saas projects into it now. Tech speccing and multiple cycles of elicitation are what deal with all the edge cases that you normally only encounter during coding. It does front-load all of the planning brainwork; but condensing that into a couple days of solid speccing is far more productive than spreading it out over months.
It's taken a while to get to this point, and most agents aren't good for substantial work out of the box. Most of the time what the agent does will be a product of its environment.
I created a project called Ninchi to force myself to read my code and understand it. Recently I began also sharing it to see if there may be a larger need/opportunity. It's a small effort. We need to make a variety of efforts I think to encourage responsible AI usage before we end up drowning in slop.
At some point, if you're just pulling the lever on the slop machine with zero insights into your code, maybe your boss is justified in asking why you should earn more than 50k a year.
This is a personal thought experiment so think it through for yourself. What would the consequence be if the agents really were better than you and you acknowledged that?
The major premise of "It's a trap!" is that it matters if you lose your coding skill. (I'll gloss over general critical thinking and stick with coding for now) However in the world where on any given task it would be done to a higher level of quality and faster if you gave it to the agent, then what are you doing trying to do it yourself? There's plenty of room for that kind of thinking in hobbies, but in the professional world?
Maybe you can add some value in code reviews, but you may also be better off never reading the code at all. Maybe the how of coding stops mattering and the what of products needs to be your top concern.
I can tell you that the agents that I use today are much better coders than I am in the language we're using. I don't write it at all. I couldn't fizzbuzz in it. But with a small team we are building useful internal tools and features at a breakneck pace. I certainly feel the same feelings of getting dumber and losing my coding chops, but I have to step back and say, could what we've built have been built in 5x the time without agents? And the answer is probably no.
The thing I'm mastering now is conjuring software with agents. What lets them rip, what slows them down, where they are today and where they will likely be tomorrow.
I can tell you that you should re-invest in small, modular systems, because agents can build modules and greenfield projects instantly. I can tell you that there is a point at which agents fall over completely even on mid-sized projects, but that that point is receding with each new generation of model, and that Codex 5.4 XHigh Fast set to 500K context window is a beast. (5.5 has yet to win me over)
I can tell you that pushing direct to main is viable, that PRs slow down fully agentic teams, and if your agents have sufficient permissions they can fix things fast enough to be let loose even knowing they may delete your service. I wouldn't do it with your main product yet (unless you're starting your startup today) and I wouldn't try it with a large legacy project. But maybe that rewrite you've always wanted to do is here and just a prompt away.
Now, the sane among you will note that agents are not better today, that they might not ever be, and either way you should never trust a computer to make a decision because it can't suffer the consequences of its actions. Or more down to earth, there are some things that are too important to yolo.
But I will argue that a huge swath of us work in domains where if you're willing to challenge some of the basic assumptions of software development (you should understand the code, it should be maintainable by humans, it should be built to last) then you'll be able to provide very useful software much more quickly than you would otherwise be able to do. Save the skill for your hobbies, and build things people want.
When I first learned to code, I would do complete rewrites of a project several times. Each time I learned a lot, and the final result would be very stable and very well designed.
At the time, those seemed like large projects, but they were relatively small.
So I have learned to slow down, and spend considerable time thinking or overthinking before coding, since for large projects rewrites are not so efficient.
Except that all those rewrites were upfront thinking and overthinking of the highest quality.
I have recently attacked a couple of new greenfield projects in "orchestrator" mode. The fact that I know I am exploring and creating throwaway code lets me try things out ambitiously. I can obsess about the original and critical code, as I did before. But now I can quickly surround it with the mundane code it needs to be usable - which can happen very very fast - especially when it's a throwaway experiment.
My conclusion from these successes, and others, is agentic coding isn't something that can be judged without factoring in all kinds of context, and the ability of the "orchestrator" to come up with orchestration patterns well suited to the work.
If agentic coding is a trap, it is a trap created with the cooperation of the orchestrator.
EDIT: An agent is what you make it. I insist mine keep all memories in an in-project folder. And that documentation is for "both of us", whereas their folder is for actively developing their own understanding and ideas. They are not to be agreeable or contrarian, but to collaborate and contribute by considering anything and everything all the time, at their highest level of operation. At the beginning of every session, they review everything and from their "fresh" perspective, update their own materials for anything that strikes them or they believe is important. And they do the same thing at the end of every session before the last submit. Project appropriate "harnesses" like this make a massive difference. Never operate an agent in plain helpful-servant mode, it is a serious waste of talent. Push them to operate at a high level, all the time, and develop their own material purely for their own project related self-enhancement, and they contribute far more than speed coding.
Another interesting thing that seems to be helpful. I have the agent write a kind of zen document about what it values. Its first task, before any project related review, is to consider its own words, and update them if they don't feel right, or they want to add something important. To remind themselves of who they are, before engaging with the project. This moment of intentional self-reflection before diving straight into project details seems to help them maintain a bird's-eye view, and a stronger self-aware/self-motivated commitment to quality (defined completely in their own terms!). Their own words do appear to ring true to them. They reliably respond to this session-start ritual as an intriguing surprise.
Stop using AI for coding. Period...there is no other solution. You can't make it work, nobody else can either. Without determinism, the entire process is useless. We need to stop trying to act like we all know that this isn't true. We have given it a chance, it failed, time to move on to something else no matter how much the VCs and execs don't want to. Those that do move on have a chance, the others have no future in software.
The market realigns, and unless you handwrite the highest possible quality at a quick pace, you won't be competitive with the vibe-coders who can fix a hundred issues a month.
It was the same with gps-assisted driving, now most people can't orient themselves autonomously. Worse, there are no roadsigns with directions installed, meaning that you are stuck with using the GPS.
So while I agree with your point, it does not feel like a practical answer for my situation. For someone who is already well known and has enough reputation, refusing to use AI may be a matter of principle. But I am dealing with survival.
I do not think your answer is bad. But because this is a survival problem, it is difficult for me to risk everything on principle.
In other words, I know that your answer may be the morally correct one. If everyone boycotted this, perhaps it would not be adopted so aggressively.
But I cannot do that.
What I need is a way to use AI while degrading my own ability as little as possible, and while still preserving my skills.
I am not saying you are wrong. I am saying that your answer is too idealistic for someone in my position.
"An increase in the complexity of the surrounding systems to mitigate the increased ambiguity of AI's non-determinism"
I'm referring to the layers of review that need to be put in place to rein in the fact that the code that is generated is blurry and obfuscated, and in amounts that exceed what someone can review in one or two sittings. I'm seeing multi-phase AI review stages that will try and distill review finer each time. Then another agent layer to document. Then another to create a PR. Then the human reviews, but uses Coderabbit locally. Then sends it back into the pipeline and round and round we go, all because there's simply too much volume of code to review.
Going against the grain here which statistically is more likely to be right given how HN was so wrong about self driving and AI being useless for coding. I think HNers given that their identity is tied around coding are of course going to defend that identity till the bitter end in the same way artists did.
Re the understanding code point: you can still use LLMs to understand code. If you write the spec without knowing anything about the code, of course the architecture might suck. Maybe there is already a subsystem that you can modify and extend instead of adding a completely new one for the new feature you are adding, etc.
I use LLMs for my daily workflows and they do understand code perfectly and much more quickly than if I read it.
I was just looking through HN search for "show HN", and I saw many fitness and calorie tracking apps.
A lot of them disappeared just after a few months of launch; a few of them survived a year, then died out as their domain name expired.
People are making things, but they are not reaching their "audience".
I created https://macrocodex.app/, launched on 16 Mar 2026, and reached 10,000+ monthly active users.
Fitness/Calorie tracking is a competitive space where there are tons of apps and services.
I could never have built such an app because I do not know how to design pages; I can talk to a designer, but from past experience, it takes them a long time to understand what the market wants and projects. And companies with small budgets find it very difficult to find a good guy.
Many of my projects never got shipped because I dreaded making landing pages, icons, UI, etc.
I am not saying we did a very good job with AI on landing pages or UI at all; that's not an area of my expertise; the domain knowledge is, but the fact that many people find it useful, I think I’ve succeeded.
I've even put a ticket system in the app for support and received a few bug reports, which I resolved.
Here's the latency of my other service: https://prnt.sc/6474F4gba_he
I no longer use managed services in AWS, and my costs are very low; this enables me to offer my apps and services for free to many users.
- tech for management who can delegate but don't have the expertise to know when it's wrong or just plainly impossible
- tech for coders who have the expertise... but who will gradually lose it.
So I'm not sure who it is for, beside VCs and shareholders until the next quarter obviously.
> – Jeremy Howard, creator of fast.ai
This so much summarizes it.
I wrote shortly about it from a slightly different angle (piano playing) just this morning: https://livingsystems.substack.com/p/playing-freely
This is really validating to read. I recently was having a call with a friend where I was arguing against 100% AI usage, and I was saying, some problems the LLM just can't solve. He asked for an example, and I tried to explain a complex chart I was trying to make at a previous gig, and in the end said "well to be fair neither the AI or I could figure it out lol." He replied "how could you even code it if you didn't know exactly what you were trying to build? You're supposed to know exactly what you're building before you write a single line of code, that's what they teach you in school."
He was poking fun at the fact that I have a boot camp background and he has a uni degree - it's been ten years for both of us now so he's running out of ways to poke fun at that difference as we even out our differences, but this one poke brought back the old imposter syndrome, since for my entire career I've thought via coding.
When I get a ticket, I tend to jump into the codebase to figure out the context I need to know about, the current patterns, what files I'll need to worry about; and while I'm there, I tend to start writing some things, and as I do that I pull in a shared function, and in doing so just check out of curiosity where else the function is used, and in doing so discover oh, actually, we have similar functionality elsewhere, lemme just abstract this work for this ticket and the previous functionality into a shared function, and use it in both places. And so on. Before I know it, I'm looking back at the ticket checking if I've covered everything, and sending in the PR.
I've never had complaints about my productivity; in fact I'm often lauded for it, so I think it at least hasn't been a process that slows me down long term even if it's messier. But I had been wondering if it makes me less than a "real" engineer. I'm happy to hear others may be doing it this way too.
I do agree that if we just rely on AI for all outputs and some reviews (at least to a threshold, because we simply can't keep up with the AI throughput as humans) we will eventually have skills atrophy. Here's where the tangents intersect: I've been working on a way to have the best of both worlds. We can still use AI to generate a large swathe of code, but use good old software engineering to do it. My project (https://salesforce-misc.github.io/switchplane/) inverts the control. Rather than having LLM-as-runtime and doing all the things, you define and write LangGraph control flows that only use the LLM when judgement is actually required. The basic principle is:
If it's deterministic, write it in code. If it requires judgement, use the LLM.
Switchplane itself is local-only but the principles can be applied to deployed agentic services as well. Because the approach is code-first, we can have that vendor independence: Use whatever model you want anywhere in the graph. One goes down? No problem. Swap the config without impacting the overarching control flow.
Cost becoming a factor? Limit LLM loops or constrain their access however you want. It's just code that needs to be updated. You control the runtime, not the LLM.
Concerned about non-deterministic behaviour when you need determinism? Don't be. It's in code.
Worried about skills atrophying because we're handing off everything to an LLM? That's mitigated somewhat here because you still need to think in systems in order to build execution graphs in the first place.
It might not demo as well as a number of markdown files being executed by an LLM. It's definitely a more reliable approach in the long run though.
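To make the "deterministic in code, judgement to the LLM" split concrete, here is a minimal sketch in plain Python. It does not use Switchplane's or LangGraph's actual APIs; `Ticket`, `route`, `Judge`, and `stub_judge` are hypothetical names invented for illustration, and the "model" is just any callable that maps a prompt to text.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns text.
# Swapping providers means passing a different callable; the flow is untouched.
Judge = Callable[[str], str]


@dataclass
class Ticket:
    title: str
    body: str


def normalize(ticket: Ticket) -> Ticket:
    """Deterministic step: plain code, no model involved."""
    return Ticket(ticket.title.strip(), " ".join(ticket.body.split()))


def route(ticket: Ticket, judge: Judge, max_llm_calls: int = 1) -> str:
    """Apply fixed rules first; consult the judge only when no rule matches."""
    ticket = normalize(ticket)
    text = f"{ticket.title} {ticket.body}".lower()

    # Deterministic rules live in code.
    if "invoice" in text or "refund" in text:
        return "billing"
    if "password" in text or "login" in text:
        return "auth"

    # Judgement call: bounded number of model calls, output validated in code.
    allowed = {"billing", "auth", "other"}
    for _ in range(max_llm_calls):
        answer = judge(f"Classify as one of {sorted(allowed)}: {text}").strip().lower()
        if answer in allowed:
            return answer
    return "other"  # deterministic fallback if the judge never gives a valid label


def stub_judge(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    return "other"
```

Calling `route(Ticket("Refund please", "I was charged twice"), stub_judge)` never touches the judge at all, and setting `max_llm_calls=0` is one way to cap cost without changing the control flow.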
Why make this assumption so confidently?
The arrival of the electronic computer did not turn human computers into programmers, it simply eliminated them en masse.
I would add that no one learns assembler anymore – and that’s a problem.
It will be the same with coding – people that know how to code will become very very valuable. I don’t think they will disappear hence.
I don’t disagree on AI being a massive revolution though, but maybe not so much in software development. Education and most of all, the art, are the most impacted in my view.
Does anyone really do this? You want verification and self-correction in a loop, not rerolling and cherrypicking. The non-determinism point is really tiresome to hear over and over.
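For what "verification and self-correction in a loop" could look like, as opposed to rerolling, here is a rough sketch. Everything in it is hypothetical: `generate` stands in for a model call that receives the previous failure as feedback, and `verify` stands in for whatever check you trust (tests, a type checker, a linter).

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures for illustration only.
Generate = Callable[[str, Optional[str]], str]  # (task, feedback) -> candidate
Verify = Callable[[str], Optional[str]]         # candidate -> error text, or None if it passes


def generate_with_verification(
    task: str, generate: Generate, verify: Verify, max_rounds: int = 3
) -> Tuple[Optional[str], int]:
    """Feed each verification failure back into the next attempt."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        candidate = generate(task, feedback)
        feedback = verify(candidate)
        if feedback is None:
            return candidate, round_no
    return None, max_rounds  # give up explicitly rather than ship an unverified result


def verify_add(candidate: str) -> Optional[str]:
    """Toy verifier: run the candidate and check one known input."""
    namespace: dict = {}
    try:
        exec(candidate, namespace)  # acceptable for a toy; sandbox real generated code
        result = namespace["add"](2, 3)
    except Exception as exc:
        return f"error: {exc!r}"
    return None if result == 5 else f"add(2, 3) returned {result}, expected 5"


def stub_generate(task: str, feedback: Optional[str]) -> str:
    """Stand-in for a model: wrong on the first try, corrected once it sees feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b"
    return "def add(a, b):\n    return a + b"
```

The difference from cherrypicking is that nothing is accepted on the strength of looking right: a candidate is either verified or the loop ends with `None`.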
Yes, lots of people. It’s a whole issue.
When the problem is fixed, you'll stop hearing about it.
But I still want to be in touch with coding by hand and have ventured into systems programming, outside of work, which I feel AI is less useful for currently.
It won't do everything exactly the way you would've coded it but I find this model much better at setting and maintaining "guardrails" for your codebase so you don't find yourself wondering how it all fits together.
- Socrates, decrying the invention of writing
Does the person who wrote that still work there?
[1] https://www.anthropic.com/research/how-ai-is-transforming-wo...
define observable behaviors, let codegen workout most of the rest by implementing the contract.
decompose specs into proper units (a hundred lines or so), not god-awful, unreadable, vibe-coded, frankenstein documents. follow the software engineering best practices you've honed for the last 25 years.
you end up slowing down, sitting in the problem, figuring out "the right problem to solve", all the benefits of writing the code. but now you have a spec to iterate on, and the behavior statements in the spec generate code and tests from a single source of truth (each statement a provable assertion).
as bugs arise, trace back to the spec. changes often end up being a single line of text in the spec which cascades into an easy to review diff plus tests.
spec prose allows writing "why" and "how" together without code comments which go stale.
and lean on your type system to leverage opportunities to be terse, creating a spec which has fewer words, yet still produces strong correctness guarantees (aka, the spec can be shorter than the code, and still be readable).
bonus: versioning a spec is easy, so now you have a change signal when reviewing your peer's code changes. be more careful with major/minor bumps, skim patch bumps.
while many of my peers are taking the giant-ass sdd approach, shipping fast, and losing touch with the actual system behavior, ive been taking the approach outlined above with a modest 2x speedup in feature delivery (their speedup appears much larger), without losing touch with the underlying system.
i am working on large, complex, overgrown, legacy code, so i dont have the luxury of floating in a vibe coding cloud, miles above the scary jungle and tigers and lava and spike traps that i call home.
ive found this approach to be a brilliant balance of speed, incremental AI opt-in, hands-on to avoid context loss, and most importantly to me: maintainability.
i suspect a subset of "proper" ai-codegen software engineer tooling and flows will settle in this vicinity.
encourage folks to swiftly vibe-code prototypes, but then from there, let software engineers do what we do best: engineer software, and transform the prototypes into something maintainable.
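a small illustration of "each statement a provable assertion", as a sketch only: this is not the commenter's actual tooling, and `Behavior`, `SLUGIFY_SPEC`, `failing_statements`, and `slugify` are hypothetical names. each spec line carries its prose and a check, so a failing check traces straight back to one sentence of the spec.

```python
from dataclasses import dataclass
from typing import Callable, List

Impl = Callable[[str], str]


@dataclass(frozen=True)
class Behavior:
    statement: str                 # one line of spec prose
    check: Callable[[Impl], bool]  # the same line, as an assertion on an implementation


# Hypothetical spec for a tiny unit; a real spec would be versioned alongside the code.
SLUGIFY_SPEC: List[Behavior] = [
    Behavior("lowercases input", lambda f: f("Hello") == "hello"),
    Behavior("replaces runs of whitespace with a single hyphen", lambda f: f("a  b") == "a-b"),
    Behavior("strips leading and trailing whitespace", lambda f: f("  a ") == "a"),
]


def failing_statements(impl: Impl, spec: List[Behavior]) -> List[str]:
    """Return the spec statements the implementation violates, in spec order."""
    return [b.statement for b in spec if not b.check(impl)]


def slugify(text: str) -> str:
    """Candidate implementation (hand-written or generated) under test."""
    return "-".join(text.lower().split())
```

`failing_statements(slugify, SLUGIFY_SPEC)` comes back empty for this implementation; swap in a generated one and any non-empty result names the exact spec lines to look at.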
Seems safebox went after a subset.
If you want, I'm happy to jump on a zoom and talk. You can use calendly.com/safebotsai
That's exactly what I do. I know I am lucky to be gifted in this skillset. But that's not a good reason to excuse people destroying the market for everyone.
Would you refuse to work on a navigation app such as Waze, since it allows everyone to work with platforms as Uber and pushes traditional taxis out of the market?
For geniuses like you it will still be around the corner only until you can call a Waymo in the middle of the fucking ocean. You’re right. It will never happen.
The claim you should know everything about everything you work on is an intensely naive one. If you’ve worked on a team of more than one there’s a lot of stuff you don’t totally grok. If you work in an old code base, almost every bit of it is unfamiliar. If you work in a massive monorepo built over decades, you’re lucky if you even understand the parts everyone considers you an expert in.
I often get the impression folks making these claims are either very junior themselves or work basically alone or on some project for 20 years. No one who works in a team or larger org can claim they know everything in their code base. No one doing agentic programming can either. But I can at least ask the agent a question and it will be able to answer it. And after reading other people’s code for most of my adult life, I absolutely can read the LLM’s. The fact a machine wrote crappy code vs a human bothers me not in the least, and at least the machine will take my feedback and act on it.
the bar to "start" is lower and the bar to actual competency is higher now, juniors who want to actually learn instead of just pressing enter over and over again will do so regardless of whatever you do to "help" them.
There's no better time to have a curious mind
It’s not all that different than writing code directly and having it turn into a mess they can’t debug—something we all did when we were learning to program.
It is in many ways far easier to write robust, modular, and secure software with agents than by hand, because it’s now so easy to refactor and write extensive tests. There is nothing magical about coding by hand that makes it the only way to learn the principles of software design. You can learn through working with agents too.
Y'all need to stop worrying about the kids.
They're smarter than us and will run circles around us.
They're going to look at us like dinosaurs and they're going to solve problems of scale and scope 10x or more than what we ever did.
Hate to "old man yells at cloud" this, but so many people are falling into the trap because of personal biases.
While the fear that "smartphones might make kids less computer literate" is true, that's because PCs are not as necessary as they once were. The kids that turn into engineers are fine and are every bit as capable.
Due to English-language limitations, for most of my adult life I struggled to code. Used visual coding etc. But of course, I can't make a living on a drag-and-drop harness.
Comes in GPT-3.5, accelerated my learning. Now I'm running my incorporated company, just launched one software-hardware hybrid product. Second one is a micro-SaaS in closed beta.
The point is: when people use "juniors" as a fixed shaped blobs of matter, they focus on the juniors that were in any case going to make mistakes: AI or not. Misses the key point of agentic usage.
What is important is not being afraid to learn the rest of your system and keeping an index.
Most importantly it's about being able to spin up on anything quickly. That's how you have wide reach. Digging in when you have to, gliding high when you have to. Appropriate level for the problem at hand.
When I was in college eons ago they taught CS folks all of engineering. "When do I need to know chem-e or analog control systems?" We asked. "You won't. You just need to be able to spin up on it enough to code it and then forget it. We're providing you a strong base."
That holds even within just large code bases.
I disagree with this take. Personally, I pride myself in learning the code bases I work on in detail, sometimes better than the leads for those code bases. I’m not saying that everyone should do so, but it’s achievable and not naive at all.
Nothing in the article made that claim.
This is a slight tangent from that, but I place a lot of value on the ability to offload some/most of the mental model to AI. I need to know less about everything (involved in this one task) when working on it, because a lot of the peripheral information can be handled by the AI. I find that _incredibly_ useful.
From a person perspective though, I'm apprehensive about the effect AI will have on the human "very well read intern." People who know a lot very deeply about specific areas are fascinating to talk to, but now almost everyone is able to at least emulate deep knowledge about an area through the use of AI. The productivity is there, but the human connection is missing.
It is true that you normally do not need to know everything, or even most of it.
Despite this, it is necessary to be able to discover and understand quickly anything about the project or system on which you work.
I have seen plenty of software teams that became stuck at some point because they could not solve some trivial problem that required a zoom into the project where some extra skills were required for understanding what they saw, like understanding a lower-level language, or assembly language or some less usual algorithms or networking protocols and so on.
Or otherwise they were stuck not because they lacked the skills to interpret what they saw, but because they used something that was a black box, like a proprietary library or a proprietary operating system, and it was impossible to determine what it really did instead of what it was expected to do, without being able to dive into its internals.
So I believe that the environment should always enable you to know everything about everything you work on, even if this should be only very seldom necessary.
> But I can at least ask the agent a question and it will be able to answer it
A problem here is that, in some sense, the agent that wrote the code is not the same agent that is answering questions about it. If the original agent didn't leave its reasoning, you are probably out of luck.
There are tools like git-ai [0] that capture LLM sessions and associate each file edit to a specific agent action, and let agents query a given piece of code to read the conversation around it (what the user prompted, what was the reasoning of the LLM that created the code, etc). They could change the balance, but are not widely used.
Author here. Where did you find I was stating that? As other users said, that's not at all found in my writing. The rest of your post goes on a tangent about this notion, but seems like it's more of a personal pontification, rather than a critique of anything I wrote.
The questions came flying in fast, without any introduction, and this was about an external integration out of a dozen. They have their own lingo, different from ours, to make the situation worse.
I had a _very hard time_ making sense of the questions, as I indeed relied heavily on a model to produce these integrations (extremely boring job + external thick specs provided).
I'm still positive these simply would not have happened, even in 10x the time, if I had not used models. However, I'm now carefully considering re-documenting the "ohhs" and "aahs" of these so that this kind of uncomfortable moment never happens again.
I haven't felt so clueless and embarrassed in a meeting, ever. All I could say was "I'll get back to you on that one, and that one, and this one".
Cognitive debt is very real, and it hurts worse than technical debt on a personal level! Tech debt is shared across the team, cognitive debt is personal, and when you're the guy that built the thing, you should know better!
To be continued... But from now on, the work isn't done if I don't get a little 5 mins flash-card type markdown list of "what is this" and "what is that", type glossary.
An additional factor: to find issues in generated code, the developer has to care. Many developers (especially at big firms) are already profoundly checked out from their work and are just looking for a way to close their tickets and pass the buck with the minimum possible effort. Those developers - even the capable ones - aren't going to put in the effort to understand their generated code well enough to find issues that the agents missed. Especially during the current AI-driven speed mania.
There is skill loss from heavy AI use.
But I want to acknowledge the awkward elephant in the room. AI is making people too fast. I don't mean that faster output is bad. It's that people get faster output and code instead of the full understanding and experience that come from producing the code. It's rewarding people who try to talk about business value rather than the people that are building and making safe decisions with deep knowledge.
AI: Yes, it's good and it can produce some good solutions, however it ultimately doesn't know what it's doing and in the best of cases needs strong orchestrators.
We're in a cesspit of business-driven development and they're not getting the right harsh and reputational punishments for bad decisions.
First, you've got to plan everything, using whatever Agile or Waterfall planning ritual your company uses, get the task breakdown, file the JIRA tickets, decide who's doing the work. That all can take days or even weeks. Then you need to write a design doc with your proposed design, and get that reviewed by your peers/teammates. Again, another week for any substantial feature. If there are multiple teams involved, you need to get buy-in and design agreement among those multiple teams, let's add another week. At some places, you need approval to commence work, which can take multiple days, depending on the approver's schedule and availability.
Then, you take a day and write the code and make sure it passes tests.
Then, it's code review time, and this can involve a lot of back and forth with your team, resulting in multiple iterations and additional code reviews. Another "days or weeks" stretch. At bigger companies, you're going to need to pass all sorts of reviews from other departments, like legal, privacy, performance, accessibility, QA... even if done in parallel, let's add a conservative 2 weeks. Finally, you push to staging, and need to get some soak time internally among dogfooders, so you have some confidence that it's working. +1 week. Then you're ready to push from staging to prod, but since you work at a serious company, nothing goes to 100% prod right away--you need to slowly ramp up and check feedback/metrics in case you need to roll back. The ramp to fully launched could take another two weeks.
So here's a feature that took, what, maybe two months from design to release, and we're falling all over ourselves to optimize the part that took a day so that it takes 5 minutes instead...
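The arithmetic behind this point can be sketched roughly as follows. The figures are illustrative assumptions layered on the comment's own rough numbers: "two months" is taken as about 40 working days of 8 hours, and "a day" of coding as 8 hours. `overall_speedup` is a hypothetical helper, not something from the thread.

```python
def overall_speedup(total_hours: float, step_hours: float, new_step_hours: float) -> float:
    """End-to-end speedup when only one step of a pipeline gets faster."""
    # Everything except the optimized step takes as long as before.
    new_total = total_hours - step_hours + new_step_hours
    return total_hours / new_total


# Assumed figures: 40 working days * 8 h for the whole feature,
# 8 h of coding, reduced to 5 minutes with an agent.
total = 40 * 8
speedup = overall_speedup(total, step_hours=8, new_step_hours=5 / 60)
print(f"end-to-end speedup: {speedup:.3f}x")
```

Under these assumptions the feature ships only about 2.5 percent sooner, because the coding step was a small share of the total to begin with.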
Nothing stopping you from iterating with the agent till the code is the exact same quality that you yourself would write
Wait, is this the same AWS I have been using?
However, the code review study needs to compare between surface scanning and reviewing long enough to get over a theoretical slough of perspective: when you assume the coding chair and are in their frame, whether the brain shifts into a different cognitive mode.
Otherwise, just stamping "Looks good to me" is likely to lead to the same atrophy. There's no critical thought, even a self-summary of the change or active questioning.
Thoughtful, deliberate code review just plain takes longer. AI can help here a lot, although it still takes over the "get into review mode" process.
And they will deserve it.
Code review alone is kind of like being able to understand a foreign language enough to read it, but not really understand it in flowing conversation or being able to speak it, much less construct a complex piece of literature.
Retention also suffers, as you will quickly forget what you just reviewed. What is the last PR you remember?
This is, alas, pretty consistent with SV technocracy through the ages. The same simplification instinct that compels people to see timeless human or political problems, for example, as mere insufficiencies of code or apps, invites this rather facile notion that a job with automatable tasks is a defunct job.
Tasks can be automated, to varying degrees. Jobs are a different thing from an economic point of view.
+1 to de-skilling being the main concern, per the author. Curating agents' output--a job no senior developer really wants--absolutely and totally depends on skills acquired by banging out code the hard way, potentially for decades.
Likewise, +1 to the keen insight that the tactile, motor-memory, man-machine process of typing the code is a vital discovery pathway to what you're actually going to write and how it's actually going to work, as opposed to specifying it in natural language. A model trained on the output of this process might spit out passably acceptable code in well-trodden, happy CRUD app paths, but it's not much good once you go spelunking outside that kind of domain. Ask me how I know.
Also, let’s not forget. The developer is rarely the person pitching the feature, and is normally given the constraints and the PRD…
Soooo people can keep tiptapping on the keyboard, but eventually they need to open their mind to the possibility that “the old way” is actually dead.
This heavily depends on the industry and company culture.
I've pitched plenty of features and I've basically never had a spec land on my desk ready to go. Part of my job as a SWE is to help product folks decide what to build.
Knowing some machining still lets you design parts and assemblies that are some combination of cheaper, better, etc. This is noticeable with precision or high performance assemblies. And how many revisions are needed.
I have been described as a decel and a Luddite though so be wary of my opinions.
The result of that though would be establishment of development patterns that are good practices.
The rule of thumb is: An agent can write it, but a human has to understand it before it gets pushed to prod.
I'm still not convinced about the doom and gloom over developers being replaced. I'm not a dev as part of my main job function, but where I do use LLMs, it has been to do things I couldn't have done before because I just didn't have time, and had to de-prioritize. You can ship more and better features. I think LLMs being tools and all, there is too much focus on how the tool should be used without considering desired and actualized results.
If you just want an app shipped with little hassle and that's it, just let Claude do most of the work and get it over with. If you have other requirements, well that's where the best practices and standards would come in the future (I hope), but for now we're all just reading random blog posts and see how others are faring and experimenting.
its easier for me to code now, because its like i have a 24/7 insane intern that needs to be supervised via pair programming but also understands most topics enough to be useful/ dangerous.
ironically ive been spending much of my time iterating on ways to improve model reasoning and reliability and aside from the challenge of benchmark design, ive had some pretty good success!!
my fork of omp: https://github.com/cartazio/oh-punkin-pi has a bunch of my ideas layered on top. ultimately its just a bridge till i’ve finished the build of the proper 2nd gen harness with some other really cool stuff folded in. not sure if theres a bizop in a hosted version of what ive got planned, but the changes ive done in my forks have made enough difference that i can see the difference in per model reasoning
My sense is that a decade from now, the people who generally see their place as the driver seat but recognize when it's not are going to be writing the code that matters.
You can debate with agentic coding who is monitoring and who is flying but, if we assume the user is monitoring what that means, in practice, for me is that I'm reading and making sure I understand all the changes the agent is proposing to make, as well as providing instruction, guidance, correction, etc. That includes reading and understanding all the code changes.
> An increase in the complexity of the surrounding systems to mitigate the increased ambiguity of AI's non-determinism.
My question is why isn’t there an effort from the author to mitigate the insane things that LLMs do? For example, I set up a hexagonal design pattern for our backend. Claude Code printed out directionally ok but actually nonsensical code when I asked it to riff off the canonical example.
Then, I built linters specific to the conventions I want. For example, all hexagonal features share the same directory structure, and the port.py file has a Protocol class suffixed with “Port”.
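A convention linter of this kind might look roughly like the sketch below, which uses Python's standard `ast` module. It is only an illustration under assumptions: the function names `check_port_module` and `_base_name` and the exact rule wording are hypothetical, and the commenter's actual linters are not shown in the thread.

```python
import ast


def _base_name(base: ast.expr) -> str:
    """Best-effort name of a base class expression (Protocol, typing.Protocol, Protocol[T])."""
    if isinstance(base, ast.Name):
        return base.id
    if isinstance(base, ast.Attribute):
        return base.attr
    if isinstance(base, ast.Subscript):
        return _base_name(base.value)
    return ""


def check_port_module(source: str, filename: str = "port.py") -> list[str]:
    """Return convention violations for the source text of one port.py file."""
    tree = ast.parse(source, filename=filename)
    violations = []
    found_protocol = False
    for node in tree.body:  # only top-level classes are inspected
        if not isinstance(node, ast.ClassDef):
            continue
        if "Protocol" not in {_base_name(b) for b in node.bases}:
            continue
        found_protocol = True
        if not node.name.endswith("Port"):
            violations.append(
                f"{filename}:{node.lineno}: Protocol class "
                f"'{node.name}' must be suffixed with 'Port'"
            )
    if not found_protocol:
        violations.append(f"{filename}:1: no Protocol class found")
    return violations


example = "from typing import Protocol\nclass UserRepo(Protocol):\n    pass\n"
for line in check_port_module(example):
    print(line)
```

Because the check works on the syntax tree rather than on imports, it runs without executing or installing the code under review.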
That was better but there was a bunch of wheel spinning so then I built a scaffolder as part of the linter to print out templated code depending on what I want to do.
Then I was worried it was hallucinating the data, so I wrote a fixture generator that reads from our db and creates accurate fixtures for our adapters.
Since good code has never been “explained for itself 100%, without comments”, I employ BDD so the LLM can print out in a human readable way what the expected logical flow is. And for example, any disable of a custom rule I wrote requires an explanation of why as a comment.
Meanwhile, I’m collecting feedback from the agents along the way where they get tripped up, and what can improve in the architecture so we can promote more trust to the output. Like, I only have a fixture printer because it called out that real data (redacted yes) would be a better truth than any mocks I made.
Finally, code review is now less focused on the boilerplate and much more control flow in the use_case.
The stakes to have shitty code in these in-house tools is almost zero since new rules and rule version bumps are enforced w a ratchet pattern. Let the world fail on first pass.
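One way to read the ratchet pattern mentioned here is sketched below. This is an assumption about how such a ratchet could work, not the commenter's implementation: each rule has a recorded violation count, a new rule is recorded at whatever it finds on its first pass, and after that the count may only go down. The function name `ratchet` and the rule names are hypothetical.

```python
def ratchet(baseline: dict[str, int], current: dict[str, int]) -> tuple[list[str], dict[str, int]]:
    """Compare per-rule violation counts against a baseline.

    Returns (regressions, new_baseline). Counts may fall but never rise.
    """
    regressions = []
    new_baseline = {}
    for rule, count in current.items():
        if rule not in baseline:
            # First pass for a new rule: record what exists today.
            new_baseline[rule] = count
        elif count > baseline[rule]:
            regressions.append(
                f"{rule}: {count} violations, baseline allows {baseline[rule]}"
            )
            new_baseline[rule] = baseline[rule]  # do not loosen the ratchet
        else:
            new_baseline[rule] = count  # tighten to the improved count
    return regressions, new_baseline


# Hypothetical rule names and counts, for illustration only.
problems, updated = ratchet({"port-suffix": 5}, {"port-suffix": 3, "no-mocks": 10})
print(problems, updated)
```

A CI job would fail when `regressions` is non-empty and otherwise commit the tightened baseline.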
Anyway, it seems to me like with investment you can slap rails on your code and stay sharp along the way. I have a strong vision for what works, am able to prove it deterministically with my homespun linters, and am being challenged by the LLMs daily with new ideas to bolt on.
So I don’t know, seems like the issue comes down to choosing to mistrust instead of slapping on rails.
Edit: I wanted to ask if anyone is taking this approach or something similar, or have thought about things like writing linters for popular packages that would encourage a canonical implementation (I have seen some crazy crazy modeling with ORMs just from folks not reading the docs). HMU would love to chat youngii.jc@gmail
Where I worry is beginners. The hard-won intuition for "this is a reasonable approach" vs. "this will bite you in six months" takes years to develop. With experience, you steer the agent. Without it, the agent steers you -- and it steers confidently in every direction, good and bad alike.
The sooner programmers start thinking about modeling the domain, user mental models, architecture and data structures and less focus on the mechanics of writing code, the better.
Writing code is the EASY part. LLMs have basically solved the easiest part of software development. They however are bad at all the stuff I mentioned. LLMs don't have a point of view, you do as a software developer.
You could have always had a role that did that; they're called BAs and most engineers didn't choose those roles because actually coding paid more.
Now that all the coders are also going to be BAs, the salary even for BAs will drop (increased supply).
You may think you want to be the person who only specs and never codes, but I doubt you want the salary of that person.
The quote “The market can stay irrational longer than you can stay solvent” is usually applied to markets, but it can be applied to software engineering as well: all the jobs can be gone even while the world is submerged in a technological crisis, with single-nine availability (and I'm talking about 9% :) ) and all accounts compromised.
The funny thing is that he mentions that Spec Driven Development is the future.
Technically we already did this when we were doing Waterfall. I kind of miss having good documentation. For the last decade (maybe more?) I have been getting Jira tickets with a one-liner. I often need to call people as they specify almost nothing.
I am still avoiding working with AI. I try to use some models locally for experiments. I refuse to pay for something that was built on ripping others off. And the local models are underwhelming so far.
When you start coding as a way to scope out a problem, you're biasing yourself to think of everything in terms of abstractions which you invented for problems which you don't yet fully understand. My experience is that this distorts your own thinking; you are injecting your own biases into your learning process and locking down on decisions too early due to sunk-cost bias.
Having a solution all coded-up and working after a couple of days creates the illusion that you've built something solid and maintainable and that any additional functionality needs to be added on top. Before you know it, the prototype has become the foundation.
It's like if I took you to some random country and told you to build a house and you started chopping wood and putting up the walls straight away. You might immediately have noticed that it's hot so you would put lots of windows... Good... But what you don't know yet is that this country gets hit by powerful cyclones once a year on average and your wooden house won't survive the first one. You started with the wrong material. It might work really well for the first few months until a point when it won't work at all and you'll have to rebuild the entire thing from scratch.
Code was never the scarce resource. Your neighbor's nephew could have coded a working website that would service the corner shop's needs as far back as ten years ago.
The value is and always will be the support and community behind the code.
HN isn't valuable because of a code repo, but because of the community of users and a steady hand cultivating the continued participation thereof. But this too has no real moat, something could change the zeitgeist and tomorrow it joins friendster and TheGlobe.com
This is a common thing doctors complain about. Patients come in, saying they just need a prescription for some drug or other. Good doctors often refuse to give any drugs or any advice until they understand the whole situation properly.
If you're a senior developer, you're the one who has to push back against behaviour you don't like. You have the authority. "Hm, interesting question. I'm going to need more context before I can give you my point of view. Can you give me a quick overview of the system architecture / explain what actual problems you're trying to solve with this approach?"
One of us misunderstood the GP; I understood him to mean that, because he did not write the code, he was not able to answer the questions based on that code.
You seem to think he meant that he could not answer questions on someone else's code.
Off topic, but this must be a USA-specific problem, where prescription drugs are actually marketed at Joe consumer. I think there is maybe one other country where this insane practice is allowed. Nowhere else are patients told to “Ask your doctor about Procrapin for your irritable bowl syndrome!!”
I think what the OP is saying is that it's the OP's job to know that, and didn't, because they over leverage the LLM.
Like if a doctor was brought in on a cardio consult on their patient because they had a maybe unrelated heart condition, and the only thing they could answer to "why did you prescribe cemidine instead of decimine" is "lemme get back to you on that."
Not just that, but it seems like the grandparent had issues understanding what they were talking about. This is absolutely fine, and they should have just asked to continue explaining more until the problem was fully understood.
It’s obvious your opinion is important, but it’s not worth a lot if you don’t understand what the actual problem is.
Also, I personally don’t like to appeal to authority (not sure if that is what you meant), and instead just use the Socratic method to keep asking questions until they themselves understand the weaknesses. It’s a very friendly way of doing things.
"I'll need to study the docs and code to answer these questions properly" is a perfectly fine (and very diplomatic) response to treatment like that.
But now they're not an expert in the code they've recently committed.
Maybe that's OK and expectations need to change, but I'd bet there are a lot of cases where the organization really wants to produce a (code, expert-in-the-code) pair, and should be willing to pay a little time to do that over producing just (code, guy-who-prompted-it).
By the time you said that, the AI could have given the 80% answer. So, no, this is no longer an adequate response. The right response would have been to take your tools and give an informed opinion on the AI answer, right there.
It's quite common to search for the author of a piece of code to ask questions about that code.
If you're happy for me to do the job in 1/10th of the time, be happy with me not being fully across it on a whim.
Not the kind of meetings I enjoy where basically expertise isn't seen as something to build on but rather just (creative) confirmation bias.
Easy, just have the agent write them for you at the end, then never read them...
(/joking)
No such thing. The AI hype bubble is only a couple years old, and will not last a couple years more.
Since LLMs have no internal evaluation, as a reviewer one has to account for it and evaluate line by line, rebuild from scratch any hidden rationale and tacit knowledge the LLM didn't have in the first place, only to be misled into non-concerns that drain costly hours.
At this point, the investment is often deeper than writing from scratch.
But also, yeah, it starts to get worse than classic legacy code because you could try to build a theory of mind about the legacy code author(s). There were skills in trying to "mind read" a past generation. To find clues in poetry words more than the poetry form. (The variable names and whatever comments may have survived including commit logs; things written for humans to help explain the whys/hows, not just the whats.)
Came here to say this, but you said it for me. When you are an infinite code generator and your only parlour trick, your only hammer, is generation, and every nail is a problem of as-yet-insufficient generation, then generate you shall.
But the cognitive burden of metabolising this ultra-verbose, circuitous, often brute-force excreta is quite a bit higher than thumbing through a (competent) human's relatively terse approach.
Maybe companies today are being sold junk AI, and next step is being promised “solutions”.. capitalism is working exactly as expected
Apple didn’t go from near bankrupt to where it is today without that discipline.
In a business-driven world with business-driven governments writing business-driven rules, what's the alternative if you want to optimize for success?
Everybody cargo culting winner takes all VC plays is gonna deprive everyone involved of the chance to build the knowledge assets they need to compete, while playing into the hands of the winner takes all VC plays providing the tools driving all of this.
Seems clever from the point of view of the soon-to-IPO vendors behind these tools, and also seems extremely naive of all the people, teams and orgs buying into this ecosystem.
My current play is to do deep dives into fundamentals and stealth a team of like minded people to start putting out packaged expertise to help escape this quagmire.
I'm not sure skill loss is such a huge issue, in other words. It might just be a sign that the nature of our work is shifting. Being able to recite the C++ standard and using all the 100s of features correctly will just not be as highly regarded as knowing good architecture instead?
It's not just businesses doing it either, I regularly see big PRs get merged on open source projects that seem fine on the surface but contain a 1000 paper cuts worth of bugs (not critical, but just enough to annoy you)
On top of that, the code wasn't idiomatic C++ (for this specific project) and the LLM completely ignored available APIs. Sure, it can be fixed, and maintainers should've caught it, but the amount of code being generated requires so much energy on everyone's behalf.
Another aspect is that it reorders some of our problems.
In typical development, we're more likely to go back and forth about "is this really what we want to make" or "what could possibly go wrong if we do that", and ideally we do it before PR's get approved or anything is merged/deployed. Some portion of that is getting moved to "we'll see if anyone complains later". As they say, an ounce of prevention is worth a pound of cure.
I actually mention this exact thing in the article under the section "LLMs accelerate the wrong parts", which seems to be saying exactly what you're saying:
https://larsfaye.com/articles/agentic-coding-is-a-trap#llms-...
I would find it hard to believe there is any developer in history who ever uttered the words:
"I really wish I had a tool that could generate code I don't understand, and at a rate faster than I can review".
This reminds me of one of my software engineering axioms:
When making software, remember that it is a snapshot of
your understanding of the problem. It states to all,
including your future-self, your approach, clarity, and
appropriateness of the solution for the problem at hand.
Choose your statements wisely.
> So here's a feature that took, what, maybe two months from design to release, and we're falling all over ourselves to optimize the part that took a day so that it takes 5 minutes instead...
Well said.
Relevant: Programming as Theory Building (1985) by Peter Naur. The actual text is rather stuffy, but basically the code+docs cannot replace the richer in-human-heads ideas for what the real-world problem is and how computers should (or shouldn't) be used to face the problem.
2. technically risky ideas that you never would have tried because it didn't make sense from risk+effort/reward standpoint are now within reach. it isn't "go faster" per se but the speed at which you can try something out still changes the nature of engineering process.
I confess that I don't understand why this isn't true, because it seems to be true on the micro level, but it really hasn't been my experience. The platform engineers I'm familiar with are desperately trying to tread water to keep their systems healthy against the now-higher code velocity without falling to pieces. (Perhaps people used to make minor day-to-day improvements while coding that Claude enables us to ignore?)
All the process you described exists to maximise the amount of time your software engineers spend writing code[0]. You put this process in place because software engineers are among the most expensive employees in the business. Their time being wasted is meaningful to the bottom line.
Make the software engineers cheap enough and the need for a lot of this process evaporates. Companies that already _have_ these processes in place will be SOL because it's incredibly challenging to break a bureaucracy like that, but companies that either don't have these processes or manage to eliminate them will have a significant competitive advantage.
Which shouldn't be news. Startups have always competed with established businesses via speed of execution. What's new is the ability to maintain that speed for longer.
> At bigger companies, you're going to need to pass all sorts of reviews from other departments, like legal, privacy, performance, accessibility, QA...
These are all in the firing line. If the company could outsource their legal liability to an external provider of these reviews, they would.
[0] We'll just ignore the irony that much of this process ends up being foisted on the employees whose time you're hoping to save.
AI writes the plans now. I just review and modify.
Big tech has a lot of wankery like that but smaller companies can be fast and scrappy
Ask the agent questions about all the other teams' code, reaching out to them for questions it can't answer or clarification. With agent capabilities atm this is rare or can be done fairly async: "please confirm these things".
Maybe realise your code architecture is completely wrong. Manually code up some new abstractions that fit better, write the learnings into the spec plan. Strip out any implementation that largely doesn't fit your updated abstractions. Ask the agent to migrate the code to the new structure.
Repeat until spike is operational and you're happy with the abstractions used
Chat with the agent to create a Design Doc for the approach in the spike. Create a single JIRA ticket for "Productionise CodeShmode's spike". Get reviews and feedback from stakeholders.
Integrate feedback into your spike, or even the original spec document and regenerate the whole thing.
So much of the ritual you've outlined here is overhead from working in a large org where roles are siloed. When one person is empowered to do more, the actual work per person goes down and the overhead becomes dominant. But that overhead isn't needed anymore because one person can now do many people's work.
I've whipped up spikes in a few days that would've been a month of work across a team, with multiple DDs and approvals. In the past this wasn't feasible so we would need to justify what those people would work on. Now you can whip it up, show a working demo and ask "should we productionise this"
What happens if this “spike” violates someone’s patent or puts the company at legal/regulatory risk? What if it leaks users’ personal information? What if it introduces a vulnerability that my 13 year old can exploit? What if it crashes 2 million of your users’ devices because it didn’t anticipate some unusual configuration? What if the code totally conflicts with some other team’s future plans and you didn’t know because they never reviewed it?
This kind of yolo “just try it” development only works if you are very small and low-profile, don’t have hundreds of millions of users, or your software is inconsequential (nobody cares if it goes down or doesn’t work).
Short-lived tightly-scoped agents can do alarmingly thorough and high-quality knowledge work, as long as the work itself is relatively mechanical and can either be carried out in independent chunks or sequentially. For example, a research agent like the Gemini "deep research" tool can save hours of digging around the web and compiling information. With careful prompting, sufficient background context, and good self-evaluation tools, an agentic loop can do very detailed data analysis, carry out serious statistics and machine learning projects, produce high-quality data visualization thereof, and put together a handy executive summary.
They occasionally hallucinate, go off track, get confused, and make mistakes. But they "know" everything that's been published in English for the last 200 years, they never get tired, and the code they write is good enough for throwaway scripting. The real power of agents being able to write code is that they can be extremely self-sufficient and flexible in carrying out these kinds of tree- and sequence-structured knowledge work tasks.
That's of course a different thing from "designing good software", which is neither tree-structured nor sequential, and requires a level of intelligence (for lack of a better term) that LLMs do not seem to be capable of, at least not yet. But that's a more specific thing than just writing code in order to get stuff done that happens to require code.
I think that’s mostly true, but also I think there is some skill to using agents well. Specifically, work with agents to get a really good product requirements document, then task it out into very narrow user stories / vertical slices (this takes some iterating—the AI really seems to want to think in horizontal layers today), then maybe walk through the code interfaces to be super sure you are aligned. At each step, I make the agent interrogate me thoroughly with every question it can think of, and even if we stop now we will have a system design and tickets that are much higher quality than me thinking alone. I could hand those off to anyone to implement, but I think having an agent TDD their way through the code is the sweet spot.
Whenever the agent is doing something I don’t like (e.g., some coding style thing), I pause and have another agent help me write a style guide that agents must read. This slows me down at first but I think it will pay off in time.
I don't want my code quality, I want AGI code quality - that's what I was promised and jetpacks and flying cars too!
That's what we're spending 7+ TRILLION dollars, destroying ecosystems to build datacenters, and ruining society's social contract on truth and employment for? To build something that produces the average quality of a human, all while making the same types of mistakes along the way?
Sounds like a shit deal, really.
Yeah, but in my experience, it takes the same amount of time or longer to cajole the AI to get it there. I'd rather write it myself and know how it works than insert an LLM as the middleman, especially when it isn't really proving to be any faster.
These articles frustrate me greatly. That said, the author's point about token cost is real, and a risk.
I will admit there are occasional times after iterating so much I’m not sure if I’ve even saved time because going from “it works” to “it’s up to quality” takes so long
Still very significant savings over all that rather mechanical work. It's ultimately cheaper than doing a code review, and it's faster, because there's less need to manage the emotional state of the person whose code is being reviewed. Maybe I am a slow developer or something, but I am getting a lot of quality changes like that done that I previously would not have, solely because of the time they took.
And not increasing the quality just causes problems anyway. Given the same quality, more changes mean more outages than before, just by probability. Increasing rate of change demands a similar increase in quality if you don't want your production support costs to go up. So spending at least a bit of time on quality, letting the LLM do the nagging little things that before you didn't do because they took too long and were not a core part of quarterly goals, is basically mandatory.
And yea usually does for me
If this was an actual paid job, I do wonder how that would change my LLM use. The reason I'm a software developer at all is because I love the craft. The act of building, of using my brain to transform ideas into code... that's what I enjoy. If it was just prompting an LLM, would I still do that job? I don't know. I'd probably start looking into the idea of switching careers, at least.
I still reject > 50% of AI suggestions, because they're too mediocre, like moving code for no reason or sometimes it is just plain wrong.
The default Claude Code style harness is bad for complicated problems as well. Just taking the specific class or function you're working on, and putting it into a deep research style loop yields way better results. Limiting the initial context by hand is still the way to go in a lot of cases.
> While Erdős generated a huge number of problems, they are not all equally significant and important. I have, unfortunately, seen some mathematicians grow dismissive of Erdős problems recently, perhaps because they have seen reports of AI solving problems on this site that turned out to be quite simple, and wrongly generalised this to assume that all problems posed by Erdős are amusing novelties, of the level of olympiad problems.
From: https://www.erdosproblems.com/forum/thread/blog:5
The rest of the article isn't about AI at all, but I did think it was funny that it describes mathematicians as having more or less the same opinion as SWEs.
Now if your career is built on writing out the same boilerplate code in its infinite slight variations every day, congrats, you've been automated. Thank god we can free up our intellects to focus on the actual hard problems, the ones that are somewhat cutting edge, the ones that actually push our field and humanity forward.
Literally every example of AI generated code (without significant human input) is just basic stuff that is wholly unimpressive. Oh wow, you had an AI generate a Next.js app? It's writing HTML for you? It made a generic SAAS? Guess I'll become a farmer now.
Or, wait, I'll continue to write my multithreaded real-time multiplayer network for a MMO, since the AI currently generates something that would get me fired 10 seconds ago if I tried to push it to production.
It's amazing how you introduce just the slightest difficulty or novelty to an AI and it just craps the bed. And then you go online and apparently we're gonna be replaced -6 months ago or something.
People need a reality check.
You will still need to QA stuff and review PRs, but I think AI done properly can genuinely make some tasks better.
I can certainly get it to do things that are reasonably common it seems like.
As for the article itself, I can agree with much of it.
[0] For those with AI scraping PTSD, it was a government site with public domain info and I know how to scrape politely
Yeah, likely
> development patterns that are good practices.
Wait, now you lost me
The article essentially claims that no, that line of thinking is false. If the agent writes all of it (or too much of it, where "too much" is still not well defined), then your ability to understand it will atrophy with time, and you will either a) never push to prod, because you can't understand it well enough, or b) push to prod anyway, and cause bugs and outages.
I think the article is correct.
> I'm still not convinced about the doom and gloom over developers being replaced.
Agreed. The agents are just not good enough to write code unsupervised, or supervised by people without senior-level skills. And frankly it's hard to imagine them getting there. Each new release of the coding tools/models is a mixed bag. Some things are better, some things are worse, and the gains are diminishing with each iteration. I am afraid that we're going to hit a ceiling at some point, at least with the transformer architecture.
> but for now we're all just reading random blog posts and see how others are faring and experimenting.
Yes, exactly, and many people are not faring well. The article cites several examples of people feeling less capable after using LLMs to write code for a while.
What I said doesn't contradict the article. If what you said is true, then since a human can't understand it well enough, that approach is not good according to my rule of thumb, thus agreeing with the article. I only established the litmus test.
agreed on other points you made.
> standardized like AGILE and SCRUM
perhaps too cynical, but if its anything like agile and scrum in $CORPORATION it will just add to the daily slog and gum up everything...I would think that these days all of this is incorporated into the CAD/CAM software that they're using, right?
I really hope I'm wrong but I can see it already happening.
If the AI could answer the questions, why would they ask OP about it?
I think this matters even less in higher tech companies, because they're not playing in the margins where an inefficiency can hurt. Though perhaps ironically, I think AI providers really are in a realm where technical execution at the margins will make or break them.
Eventually this will be automated as well. Discipline, rigor and correctness are not strictly human tasks.
So far, it's been pretty underwhelming--on par or slower, and it's definitely more frustrating lol.
Honestly, the only killer feature I found so far is overcoming ADHD activation energy lol. Getting annoyed with the idiot robot screwing up the Terraform migration is apparently a good way to get me to finish a Terraform migration.
And yes you should completely grok what you're finally committing, making sure it's fully tested. What it enables is lots of fast experimentation before settling on your final change.
Claude Code is insanely useful for explaining other people's code
again, maybe overly cynical but ime "good practices" usually end up getting warped into "bad practices" caused by cargo-culting/up-selling by consultants as they try to mass-produce a new dev paradigm
Sometimes hard like interesting and you get to do really novel thinking. A load of p2p/decentralised things are hard like this.
Also sometimes hard like you get to a particular challenge and it turns out to be a notoriously unsolved mathematical thing, or you push against subtle boundaries of core libraries, runtimes, systems etc. Working with metagenome assemblies is this kind of hard.
Honestly the hard code I've done made such a difference to my brain. There's plenty of trivial stuff I'm happy to have automated, but if I can't work on the hard problems I may as well not be involved at all.
"Depends".
Here's an example unrelated to the Erdos problems: https://arxiv.org/abs/2510.23513
Also that paper admits the problem turned out to be pretty trivial and was only unsolved because nobody had bothered to try that hard (page 11)
There's a lot of problems with paid scientific journals being a walled garden and I am by no means defending that system, buuuut it's also true that anything published to an open repository is almost certainly there because it wasn't good enough for anything else.
> Relevant: Programming as Theory Building (1985) by Peter Naur.
Great reference and I agree. From the abstract in the PDF I have of same:
Peter Naur’s classic 1985 essay “Programming as Theory
Building” argues that a program is not its source code. A
program is a shared mental construct (he uses the word
theory) that lives in the minds of the people who work on
it. If you lose the people, you lose the program. The code
is merely a written representation of the program, and it’s
lossy, so you can’t reconstruct a program from its code.
Programming is a fascinating combination of mathematical determinism and pure expression of consciousness. Both are entirely abstract, whose worth is only quantified indirectly. Entire organizations are built upon these intangible work products. Careers are made, promotions given, "free valence problem solvers" allowed to soar, stock options issued to birth millionaires.
But Valhalla is only reached if a cadre of engineers can "see" the system, both for what it is now as well as what it must become.
EDIT: removed irrelevant "physical world" sentence fragment.
Yet everyone is buying into the discourse of a handful of vendors that they need to lock into their ecosystems and trade in their knowledge for short term speed gains.
It's brilliant from the standpoint of the vendors, and absolutely crazy that people are arguing vehemently that walking into that trap is a competitive necessity.
it's interesting cuz my intuition is to give the language model writing the files as much context as possible, which means all of the previous planning thread. but I also thought you should plan with a small model and implement with a large one, and the meta seems to be plan with an expensive one and delegate code output to smaller ones. so what do I know.
> The agent should make very small changes at a time and then test that everything still works.
yeah I think if it's treated like a codegen machine it's basically just outputting code as if you're using a dsl, except the dsl is natural language and the output is meant to be edited, no `// this is generated code, do not edit` headers
> I think AI done properly can genuinely make some tasks better
thank god I dont need to write html by hand anymore, what a pita
But I should also emphasize my limited experience and the rapid pace that this stuff is evolving.
If you are coding by hand like the old days you are probably not literally writing everything from scratch anyway, you are copy pasting a bunch of shit off google and stackoverflow or installing open source libraries.
This doesn't happen at all with agentic coding: What the programmer wants and what the boss wants are pretty well aligned. There are corner cases where someone isn't allowed to use LLMs, but does it anyway, but in most cases, the organization agrees.
Unless the teacher's role is to scaffold and support the students in acquiring what the students want, gain trust and lower the disconnect.
Thinking is happening at a higher level. Humans are adept at abstraction, and they are always capable of looking under the hood when needed.
We've never had so much societal capability as now. And that's only going to accelerate. Smart people will use these tools effectively. Don't be so bearish on human ingenuity.
Think of these tools as bullshit / busy work removers. You can focus on what matters and get more done than ever before. Deeper work, more connective work. It also opens fields of research up in an interdisciplinary fashion. People might explore outside of their limited domain now that they have help.
For example preliterate people have absolutely insane memory. In comparison my memory sucks. Having to use notes, look things up etc sucks. Literacy is a tradeoff but at least it can be argued to be worth it.
Then there are smartphones. This is not the same. The tradeoffs compared to pre smartphones cannot be argued to be worth it imo and I was 20 years old when they were introduced. They make society and lives worse. It's not just about not being able to use a PC but your attention and social skills sucking.
Then there is AI which is even worse than smartphones. The tradeoffs are so unthinkably bad I can't really even describe it.
The proof is in the fact that the savvy Atherton dwellers work hard to keep their kids away from the crack they themselves have foisted upon the world, or at least to delay or forestall the encounter.
Sometimes patients will also have an expired prescription from a different doctor, and they want a top-up. Good doctors check. Just because some other doctor prescribed some drug 3 months ago doesn't mean it's actually the right choice, or the right choice now.
It's not just a US thing. I have a few GP friends here in Australia. They complain about it too.
I live in Norway and have heard from lots of people (coworkers, friends, acquaintances) that doctors are very reluctant to prescribe anything at all - the running joke is that they'll advise you to "get some fresh air and go for a walk" even if you just broke your leg in half
As someone from The Netherlands myself, I’m fairly frustrated with our healthcare system being like this, optimizing for GPs being the gatekeeper to the specialist healthcare system, and as such being super reluctant to actually help us.
To be clear: one of my siblings is a GP and they themselves are frustrated by this as well but can’t change the system. It’s the half privatized / half socialized toxic combination that stings here.
I had to go to another country to get some of my symptoms taken seriously.
To be clear: no amount of “but I’ll pay myself for this study” helps. It’s just not possible. It’s super moronic.
Never mind the fact that specialist healthcare is super distributed, without any central oversight, and you have to behave like your own project manager when you’re being taken care of.
Healthcare is a mess in most of the developed world, probably because it’s incredibly expensive.
Same in the Netherlands. Except it's not a joke and I've met enough people who've suffered avoidable long-term effects caused by apathetic doctors, including yours truly.
You tell the doctor your symptoms, he examines you and performs any tests considered appropriate, and the doctor decides the prescription.
But you might have prior experience and preference.
A friend of mine always asks "Hey, can you bring antibiotics back?" when I go on vacation somewhere where access is easier.
Doctors here will give you a hard time if you want it.
I think that's how it should be. But sometimes you know you need antibiotics.
Perfect
Also, it's always the case where you think LLMs are great at doing whatever it is that you don't understand or value.
They're absolutely shit at this. You only say that because it's a thing you "don't want to be concerned with".
> in five minutes it will be better at debugging production issues
In my circles "debugging production issues" is running perf to diagnose memory allocation hot paths and tcpdump to figure out who is sending bad packets.
You also wrongly assume that requirements can always easily be expressed as natural language.
Another point: Software Engineering always starts where tooling capabilities stop. You don't get a competitive advantage by building without engineers what everybody else can build without engineers.
What I think will happen is AI will write code and it will do the best it can to mitigate mistakes prior to rollout, but once rollout time occurs, rollout will be incremental and it will self monitor by defining success conditions at rollout time. The nature of the code will mitigate "catastrophe" to a small group at worst, but most likely initial rollout will just run new versions of the code in a simulated context (language design could benefit from this) and analyze potential outcomes without affecting current functionality.
But when the code goes live... it will slowly scope changes in progressively (think feature/experiment flags) and if it fails in the initial cohort, it will redirect. If it succeeds, it will increase the rollout cohort.
This is a normal software engineering practice today, but it's labor and process intensive when driven by humans. But in a world where humans are less involved, this process is scalable.
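A rough sketch of the rollout loop described above, in Python. Everything here is hypothetical and for illustration only: `progressive_rollout`, the cohort fractions, the 1% error threshold, and the `error_rate_for` hook are assumptions, not part of any real deployment tool.

```python
from typing import Callable, Sequence


def progressive_rollout(
    error_rate_for: Callable[[float], float],
    max_error_rate: float = 0.01,
    cohorts: Sequence[float] = (0.01, 0.05, 0.25, 1.0),
) -> float:
    """Widen exposure cohort by cohort and return the final exposed fraction.

    error_rate_for(fraction) is a hypothetical hook that exposes `fraction`
    of traffic to the new version and reports the observed error rate.
    Returns 0.0 if any cohort breaches the success condition (rolled back).
    """
    for fraction in cohorts:
        observed = error_rate_for(fraction)
        if observed > max_error_rate:
            # Success condition failed: redirect everyone to the old version.
            return 0.0
    # Every cohort met the success condition, so the last one stays live.
    return cohorts[-1]
```

The success condition is fixed before the rollout starts and checked at every step, which is the part that is tedious for humans and cheap to automate.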
Counter points to my own arguments:
1. We don't know yet in detail what AI is good at.
2. AI doesn't need to be perfect, just "good enough", whatever that means for a specific project. More failures while saving hundreds of thousands dollars each year might be acceptable, for example.
This sounds a lot like allowing an LLM to define tests as well as implementation, and allowing the LLM to update the tests to make the code pass. Recently people have come to understand (again?) that testing and evaluation works better outside of the sandbox.
I'd note here that the long arc of software engineering has been commodifying the discipline into tooling. Ask any unix greybeard how shitty modern abstractions are and they'll give you all you can stomach and yet the wheel turns despite their treasured insights.
I think you’re overly hyped if you actually believe this is going to be a reality in 5-10 years.
People were burying their heads.
Today, there are not many of those people left. Some, but not a lot. Because you can only deny reality for so long.
I don't know what the coding world is going to look like in 5-10 years, but everything has changed radically in the space of a year from maybe 10% of people using agents to code to probably 95% of people now. In about a YEAR.
I don't know, but my assumption is these things will get better to a point where they will be automating close to 100% of coding, and deploying, and verifying, etc. The old job we had will be completely changed well before 10 years. I still think us "engineers" will have a role to play, but I genuinely don't know what it will look like.
Last I saw about a week ago, the stats were about 35%. There may be some confusion around this:
1. The absolute number could have remained the same but the sheer volume of vibe-coders who never coded before raised the percentage. For example, if 100 out of a population of 1000 people use AI then the percentage is 10%. If, over the next year 9k new vibers were created but none of the existing 1000 people changed their workflow, you will see 9100 people out of 10000 people using AI - that's now a 91% rate of people using AI to code even though none of the people since last year changed the way they work.
2. Last I checked, pre-AI, there were about 12m working developers in the world (SO survey extrapolated). As of February this year, CC, by itself, had 60k subscribers. Even if we err on the side of optimism and assume every single subscriber is running the agent, that's still not 95% of developers.
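The arithmetic in the first point can be checked in a few lines of Python. The numbers are the ones from the comment; the function name `adoption_rate` is just a label chosen for this sketch.

```python
def adoption_rate(ai_users: int, population: int) -> float:
    """Share of the population that uses AI to code."""
    return ai_users / population


# Last year: 100 of 1000 existing developers use AI.
before = adoption_rate(100, 1_000)

# This year: 9000 newcomers arrive, all using AI; nobody else changes.
after = adoption_rate(100 + 9_000, 1_000 + 9_000)

print(f"{before:.0%} -> {after:.0%}")  # prints "10% -> 91%"
```

The rate jumps from 10% to 91% while the original 1000 people work exactly as they did before.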
> I still think us "engineers" will have a role to play, but I genuinely don't know what it will look like.
??? We already know what it looks like - "Business Analyst" has been a role since forever (at least since 1995, when I entered the workforce). If you wanted a role where you wrote no code but merely drew up specs for the programmers to code, you could have had it as a BA.
It's just that few of us wanted to do that as it paid half what an engineer made. Now with the supply of BAs potentially doubling, it will pay a quarter of what an engineer used to make.
This pre-supposes the idea that the business is _willing_ to let that happen, which is increasingly unlikely. The current, widespread attitude amongst stakeholders is “who cares, get the model to fix it and move on”.
At least, when we wrote code by hand, needing to fix things by hand was a forcing function: one that now, from the business perspective, no longer exists.
Don’t forget to mention that.
And of course in the current workplace where there’s often a push from managers to use LLMs as much as possible and to put as much work as possible on yourself, in this churn juniors will not get to learn anything besides prompting and simple tooling.
None of this requires coding by hand. I can do those things better and faster with agents helping me. That includes unfamiliar areas where I am effectively a junior.
Haha i got some news for you…
The earth will move on. Smart people will continue to exist. I'll bet on the smart people.
It's an impressive force multiplier, but there has to be some force to multiply.
If you're not seeing this, at best you're probably unable to direct them or use them well.
FWIW, if you don't believe the above, I challenge you to put up a quick git repo, where you are unable to get the deserved quality out, and we can quickly show you how the same quality is available via SOTA agents, within a fraction of hand-coded time.
Depending on the task, it can sometimes be just as arduous to produce enough guidance and guardrails to get the LLM to output exactly what you need, in a form you can trust without issue or extensive review, as it is to write it yourself and use the LLM just for ad-hoc generation. It's a constant balance and an endless amount of micro-decisions, honestly, but it's pretty essential to stay engaged and not YOLO with agents the way so many are. Most of my interactions with models these days are done in pseudo-code.
LLMs can't reach the metaphorical. LLMs don't know what true beauty is. I will grant you they have gotten great at the literal and the poetry forms. But it is the beauty that elevates things to my quality bar, and makes a difference between "legacy code" and "innovation" to me.
I use hundreds of millions of tokens a month, and LLMs have completely transformed the way I work. They're also, frankly, pretty mid programmers.
I'll still use 'agents' for throwaway tasks--mostly with local models--including tasks where some sort of ad-hoc code generation is in the critical path (e.g. scraping data).
> ...
> Comes in GPT-3.5, accelerated my learning.
So now you can code? If I sat you in front of a computer with no internet and no GPU but your choice of IDE, you would actually be able to produce a product?
Anyone wondering about my proficiency, I can code without internet or AI help. But it takes an enormous amount of time and mistakes.
You're looking at a specimen of one of the most rattled species nowadays. It is so fun finding them under these articles.
Their last attempts at finding inner-validation.
What a waste.
Was that claim not true?
I don't want to shit on what you've done but you're coming in WAY too hot for how trivial your work is and how inflated your description of it is.
You're demanding humbleness while being extremely proud. Check yourself.
> accelerated what learning? learning to code? learning to engineer? learning to manage? learning to market?
I'm pretty certain that you think you're talking to an owner of a business but you're actually talking to an AI-techbro whose "software-hardware hybrid product" and "incorporated company" has exactly zero revenue after it was prompted into existence in the hope that it will make some money before other people realise they could prompt the same thing for less.
Cope harder. Try harder to demoralize. It is not going to work.
I had heard HackerNews had some of this loser bunch of personalities. I didn't expect them to show themselves so soon.
But hey, I realized "AI bashing" articles are the best places to find these gate-keepers. Makes sense now.
Now AI lets you write code using libraries whose documentation you can't even read? How is this a win?
English alphabets came into my education at the age of 10. I got my first computer at the age of 21. I began speaking broken English around the age of 23. Proper internet at the age of 25 or so.
Not to mention, my native language doesn't have programming books, even today.
Of course, an avid reader and Science nerd here. Curiosity and tinkering never stopped.
In my country, English is hardly anyone's first language, but it's mandatory in schools, so I've never had the experience of having to find knowledge that is gate-kept behind a translation wall.
Now there actually is time to make things robust if you learn how to do it.
And I know this is the same in peers' workplaces.
Why would that matter? I'm not giving you my experience of developers next to me, I'm telling you what I gleaned from published reports.
In any case, I know how these arguments go. It doesn't really matter.
This I think is the unexplored aspect of what's happening right now. Guardrails around "good enough" systems is where the future value lies. In the future code will never be as good as when the artisans were writing it, but if you have an automated process to validate/verify mediocre code (and kick it back to AI for refinement when it fails) before it's fully productionized, then you have a pathway to scaling agentic coding.
And this could easily apply to every change we made by hand before AI; it was just a tedious process to layer these things into code when we were just fixing bugs and whatnot. In an AI-writes-all-the-code world, adding this kind of stuff as table stakes for a changeset is zero cost, effort-wise.
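The "validate, and kick it back for refinement when it fails" idea can be sketched as a small loop. This is only an illustration under stated assumptions: `refine_until_valid`, `generate`, and `validate` are hypothetical names, and `generate` stands in for whatever call actually asks a model for code.

```python
from typing import Callable, Optional


def refine_until_valid(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first candidate that passes validation, else None.

    generate(feedback) is a hypothetical stand-in for asking a model for code,
    given the previous failure message (None on the first attempt).
    validate(candidate) returns None on success or an error message on failure.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = generate(feedback)
        feedback = validate(candidate)
        if feedback is None:
            return candidate
    # Attempt budget exhausted: escalate to a human rather than ship it.
    return None
```

The validator is deliberately a separate callable, so the thing that writes the code is not also the thing that decides whether the code passes.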
Absolutely, I understand what you're saying.
One of the things people miss out, in most of the discussions is that they think "if you were really serious, you would have figured it out". I agree with that in most instances but language and skill acquisition is a complex process as everyone knows.
English being the de-facto reservoir of programming knowledge and applications, it takes substantial amount of time and effort to cross the threshold of understanding and transference.
In any case, I'm an eternal optimist and I believe in action. It was a great experience listening to people's opinion here and I was kind of shocked to find that some of them are so siloed in their chambers, that's interesting nonetheless.
Honestly, I want you to live with your assumptions and beliefs. No more goodwill from my side to your username.
No discourse - you made a claim, but it appears now that your claim is untrue.
It's faster for you to answer the question than to dodge it: Have you actually learned what you claimed you did?
To me the trend seems to be that AI produces the same challenges as humans did before and that the same solutions help. Without a good maintainable code base, AI will eventually fail to even fulfill quantifiable requirements of changes.
That's kind of the point of software since the beginning. Nobody cares about the easy stuff that can be produced without much effort and what's possible without much effort has changed dramatically over the years.
No report is going to help because there are no actual figures for token providers at the moment. We'll have to wait for them to IPO before we'll know for sure.
Vibing a product into existence without needing any development knowledge or experience just means you now have a "product" that can't really be sold for money.
---
Now you want to have a "discourse"? No thanks.
Honestly, I want you to live with your assumptions and beliefs. No more goodwill from my side to your username.
Why is this? Are you not proud that you can produce products without possessing any skill?
This is my last response to you. Expended your credits.
What makes you think you are going to be given time to polish it? You would be pushed to another project. You have more responsibilities with none of the growth.
Or you’re getting the model to do the polishing, thereby developing no skills of your own, and we’re back to the start.
The argument seems to be that AI is causing managers to demand faster results, and so everything has to be a one-shotted mess of slop that just barely works. My point is that it doesn’t take much longer to build something solid instead. Implementation time and quality/robustness are not tightly coupled in the way they used to be.