An AI coding agent, used to write code, needs to reduce your maintenance costs (jamesshore.com)

378 points by cratermoon 7 days ago | 109 comments

keithnz 7 days ago |

In my experience AI reduces maintenance costs. Context might matter here, though: I'm working on a multi-decade set of projects, and while there is a lot of greenfield feature development, the old code / older projects have suddenly become a lot easier to work with and modernize, and in a bunch of cases have been eliminated. Dependencies on old libraries and build tools have in some cases been updated and in other cases just eliminated; builds are faster and easier for developers, etc. End-to-end testing has become a lot easier to set up and automate. DevOps has improved a lot, and diagnosing production issues has drastically improved: we have a ton of logs and information, and while we have various consolidated dashboards / monitoring to capture critical things, now we can do a lot more analysis on our deployed system (~50 ish projects).

theteapot 7 days ago | |

This rings true for me too, but I don't think it counts if you're just using AI to aid maintenance. The basic argument in the article is around how many hours of maintenance you have to do for each hour of "value-add" feature development. So A. you're only measuring maintenance costs, not the ratio, and B. the "old code" whp wasn't written with AI in the first place.

samrus 6 days ago | | |

True. The critical calculus here is whether AI decreases maintenance costs faster than it increases code output (which, the article hypothesizes, maintenance costs are proportional to).

btbuildem 6 days ago | |

I agree - AI makes it easier to wrangle legacy code. I think the author's point is that if you lose access to the AI tools, everything becomes more daunting -- because you've been comfortably moving mountains with heavy equipment, and now it's back to hand tools.

samrus 6 days ago | | |

That is true, but if you lost access to modern computers and had to do everything by punch card you would lose productivity too. I kinda don't like that argument.

aprilthird2021 6 days ago | |

I have a very very opposite experience in a large company where everyone shits code all over parts of the codebase they don't understand with AI assistance.

We have had outages increase in tandem with lines of code shipped and outages are getting more and more severe. Yes we have improved much old code, deleted more old code, can automate code modernization, can better diagnose issues, have more options for mitigations, etc.

But all that has not offset the sheer magnitude of code being shipped which no one really understood.

therealdrag0 5 days ago | | |

Shipping code you don’t understand isn’t really opposite to GP, that’s a different point and more of a “skill issue”.

richardbarosky 7 days ago |

Insightful. Agree with this take.

Unfortunately, maintainability is simply bucketed as a "non-functional" requirement.

Maintainability (and similar NFRs) should actually be considered what preserves and enables the delivery of future functional requirements -- in contrast to framing non-functional requirements as simply "how" the software must do what it does vs. the "what"/functional requirements that "actually matter".

From that standpoint, if a steady flow of features/improvements is important for a project, maintainability isn't really a non-functional requirement at all, and amounts to being a functional requirement, in practice, over anything except the shortest of time horizons.

p0nce 6 days ago |

In my Dconf'24 talk "Software as investment" I proposed a basic framework based upon a value function (compositional) for each piece of software. This framework doesn't really need an update due to AI, apart from the (unrelated!) cost model being updated depending on how good AI is at maintenance. Apparently it would do 1.7x the number of bugs, but perhaps it fixes them faster too? I don't know.

Seeing software as investment avoids speaking about "technical debt" by speaking about "value", a liability just being an asset with < 0 value. When software exits the high-margin world of yesterday it needs to develop a precise definition of what software deserves to exist, economically.

dirkc 7 days ago |

Two things I'd add

1. software doesn't only have tech maintenance - there is also user support and it increases as software grows.

2. I'm not convinced maintenance costs scale linearly. And even if it scales linearly, you will eventually get to a point where maintenance takes up all your time.

samrus 6 days ago | |

You think it scales superlinearly? Could make sense, with the maintenance of not just the parts but how they interact with each other.

Seattle3503 6 days ago |

My team has been using AI to add code, but also to aggressively remove old deprecated code. "Is anyone still using this? How does this get called" is easier to answer when you can toss your FE, BE, and entire codebase at an agent and let it create a map of your software project. IDEs can do this in a single language to some degree usually in a single project, but RPC, REST, etc... break some of these tools in a lot of IDEs.

m463 7 days ago |

Same with code reviews.

I wonder if AI could make code reviews more presentable.

for example, with human code reviews, developers learn quickly not to visually change code like reflowing code or comments, changing indent (where the tools can't suppress it), moving functions around or removing lines or other spurious changes.

And don't refactor code needlessly.

also, could break reviews up into two reviews - functional changes and cosmetic changes.
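One way the functional/cosmetic split could be automated, at least for Python sources, is to compare syntax trees instead of text. This is only a sketch: the function name `is_cosmetic_only` is made up for illustration, and it assumes both versions of the file parse cleanly. Comments, indentation, blank lines, and reflowed expressions never reach the AST, so two versions with equal trees differ in presentation only.

```python
import ast


def is_cosmetic_only(before: str, after: str) -> bool:
    """Return True when two versions of a Python file parse to the same AST.

    Comments, indentation style, blank lines, and line reflow do not appear
    in the AST, so True means the edit changed presentation only.
    Docstring edits and reordered functions DO change the AST, so they
    return False and would land in the "functional" review.
    """
    # ast.dump omits line/column attributes by default, so position shifts
    # caused by reformatting do not affect the comparison.
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


old = "def area(w, h):\n    return w * h  # width times height\n"
new = "def area(w, h):\n\n    # compute the area\n    return w*h\n"

print(is_cosmetic_only(old, new))                    # True
print(is_cosmetic_only(old, old.replace("*", "+")))  # False
```

A check like this is deliberately conservative: moving a function counts as a non-cosmetic change, which matches the advice above to keep such moves out of ordinary reviews.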

jpollock 6 days ago | |

Do any refactorings in separate reviews, and say things like "REFACTOR_ONLY:", with a rule that none of the code changes behavior.

That makes reviews a lot easier. The review starts from "nothing should be changing" and then reviewers can pattern match on that.

Otherwise, the reviewer is re-evaluating every line of code to make sure nothing has changed. That's really hard to do properly.

The version control systems I've worked with have allowed queues of changes, each one reviewed independently. As I'm developing, if I need a refactor, I go up a commit, refactor, send out for review, rebase my in progress work and continue.

I send out a continual stream of "CLEANUP:" "REFACTOR_ONLY:", and similar changes with the final change being a lot smaller than a big monster of a change.

Your reviewers will appreciate the effort.

Plays the metric game (if you're working in that type of org) without being evil too.

crooked-v 7 days ago | |

https://github.com/ReviewStage/stage-cli looks like an interesting start on that subject.

whattheheckheck 7 days ago | | |

And nwave

https://github.com/nWave-ai/nWave

They have /nw-buddy to point you in the right direction

Very nifty

NanoWar 6 days ago | |

First Agent I used: Do a proper code review of the changeset, it adds comments in my merge requests. Then the junior devs paste these into their IDEs and loop forever :-P

jasonlotito 6 days ago | |

These are problems with a code review tool. Not a code change problem.

m463 6 days ago | | |

but I've had to review ai code changes that did these kinds of things in a way that confounded the (decent) review tools.

It also killed the general readability of the source files, quite apart from making it hard to make sense of the review changes.

jwpapi 6 days ago |

I really like this question:

If you could wish for a codebase, which codebase would you wish for?

If you think for a second about that question, you’ll realize you're probably not wishing for a super feature-rich one, but an easy to understand one, quite close to what you have now. One that is easy to maintain and extend, depending on the upcoming business challenges.

KaseKun 6 days ago | |

Code doesn't exist in a vacuum though.

Code bases that you "work in" (maintain, etc) solve real world problems, and solving those problems should trump cleanliness every time

Codebases that are clean are typically showcase examples that sit on a shelf to be admired and appreciated.

skydhash 6 days ago | | |

> Code bases that you "work in" (maintain, etc) solve real world problems, and solving those problems should trump cleanliness every time

Only if you value your time more than the users’ time and your fellow developers’ time. Code is run and read more often than it is written. You may need to do some hacky coding, but the hacks should be small in scope, surrounded by warnings, and have a ticket filed for properly resolving the issue. Otherwise, it’s not worth it.

QuercusMax 6 days ago | |

> an easy to understand one, quite close to what you have now

oh, you sweet summer child... I wish I had one of those

aetherspawn 7 days ago |

I think AI is great for the soul destroying boring stuff that makes me want to quit my job like wrapping legacy code in test cases. Hey I’ll take on any idiot who’s willing to do that job, even if he’s artificial.

WhereIsTheTruth 7 days ago | |

You can only type at 50WPM and read one file at a time; the LLM doesn't have those physical limits. Use it to your advantage so you can actually focus on the work that matters.

gitaarik 7 days ago |

Yeah, but to be honest, I sometimes just tell Claude to clean up / refactor stuff; it finds a lot of things, discusses them with me, I approve the plan, and it churns away my tokens for some time. I do this once in a while, and I've been doing this for over 6 months and I don't feel like my development has significantly slowed down. Yeah, my token usage is higher for sure, but my codebase is bigger too, so I'm not worried about that. To me AI seems to make maintenance very easy, like the rest. You just need to do it.

Edit: I make it sound a bit simple maybe. I do more extensive refactors also, where I'm more involved and opinionated. But I don't feel the need to do that very often or very deeply. But yeah, sometimes it's definitely necessary to prevent the project from going off the rails.

hombre_fatal 6 days ago | |

Yeah, there's a double-standard I've been seeing in LLM discourse here: LLMs suck but they are also somehow expected to proactively do maintenance sweeps in your code over time and repay technical debt, presumably on your behalf.

If you want to build well-architected, well-tested code or pay back debt, the LLMs make the world your oyster. And it's easier than ever since LLMs have no problem doing ridiculous cross-cutting refactoring that you'd never have done on your own.

That LLMs essentially lead to code that's harder to maintain, or that human-produced code is easier to maintain by default just aren't claims I'd sign off on, and TFA doesn't try to render the argument.

I'd argue the opposite, since LLMs make it trivial to plan arch/tests for all the code changes where you wouldn't have had the energy to do so yourself.

tossandthrow 7 days ago | |

This is my experience exactly.

I have reduced our response time on our api to 30ms from 80ms and gotten a setup we can comfortably grow into.

I would not have had time to track down these optimizations without Claude Code.

gitaarik 6 days ago | |

I'm getting downvotes for this. Why exactly?

joshka 7 days ago |

I feel like AI might let us model some of the things that we initially didn't scope that led to these problems (e.g. "Decided not to fix every bug, or upgrade every dependency") - being able to more easily ask a system that can dig into "how much time are we spending on stuff related to foo"

AI tooling can also be a place where we start building our view of what maintainable software practices look like so we don't make decisions that have these same tail effort profiles. That can be things like building out tooling to handle maintenance updates

I think the real thing that comes out of AI tooling is probably that the tooling needs to be trained (or steered) towards activities that enhance human attention management.

stevepotter 7 days ago |

For me, if I can make a kickass testing system that people love so much that they actually build features with it and it’s not an afterthought, then maintenance becomes much easier. It’s often called test driven development but I’ve rarely seen it done in such a way that the dev ex is good enough for it to work.

But say you have that. Then you have great profiling. At that point you can measure correctness and performance. Then implementation becomes less of a focal point. And that makes it a lot easier to concede coding to ai

NotGMan 7 days ago | |

This will probably be how things will work in future: devs will shift to specifying features which will be validated through tests.

The AI will then be the middle layer that will iterate until tests pass.

Layer 1: Specs (Humans)

Layer 2: Code (AI mostly)

Layer 3: Tests (AI + human checks).

visarga 7 days ago | | |

Yes, that is how I see it too. What I would add is - intent testing - collect user messages, and check them against executed work from time to time. Every ask must be implemented and tested, every code must be justified by a user message.

jplusequalt 7 days ago | | |

What a boring fucking future.

ttariq 5 days ago |

In my experience, the easiest way to reduce maintenance costs comes from better planning upstream (better task definitions, ACs, test coverage etc), and most important of all, it comes from giving the AI coding tools a shared context for the entire project that is a living artifact that grows with the project. In order to make that happen, we employ a single AI for the entire project (instead of each developer having their own) that is able to provide the right context to agents when they need it and update the context as each agent produces the output.

ianmarcinkowski 7 days ago |

My low value comment. This feels directionally correct to me. The problems I've been struggling with in my dev job for the past 6 months have been 80% maintenance/legacy code interfering with new feature development.

Some of our developers are overly aggressive about using AI and I've started going down that path because I need to keep up and actually enjoy the flow of working with AI in my IDE.

I put a lot of work into keeping my area of the codebase understandable and coherent but I do not see that from the others on our team. I'm not perfect, but I am extremely sensitive to code that is incoherent or un-grok-able at a glance.

Anyway, I like the novel (to me at least) framing of this article!

deterministic 4 days ago |

My experience: Don't ever discuss maintenance or technical debt with non-developers (managers etc.)

Instead, simply be a professional and fix what is needed, while working on non-developer visible tasks.

Don't allow people with zero clue to make those decisions. You are the expert. Make the decision.

hamhamed 7 days ago |

This is what I've been preaching to my team. With 5.5 and 4.7 the coding agents are good enough now to almost never take on any tech debt. Any new feature or fix should come with a cleanup or refactor, in the same PR.

esailija 7 days ago | |

That's better than 99.99999% humans. Where do I put my credit card details?

saulpw 6 days ago | | |

I agree it's rare, but I think there are more than 830 humans on the planet who can do this.
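For anyone wondering where 830 comes from: it appears to be the share of humans left over after "better than 99.99999%", applied to a world population of roughly 8.3 billion. The population figure is an assumption inferred from the number in the comment, not something stated in the thread. A quick check with exact fractions:

```python
from fractions import Fraction

# "Better than 99.99999% of humans" leaves 1 in 10 million.
share_beaten = Fraction(9_999_999, 10_000_000)

# Assumed world population (inferred from the 830 figure, not from the thread).
population = 8_300_000_000

remaining = population * (1 - share_beaten)
print(remaining)  # 830
```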

lovich 7 days ago |

So what are all of these agentic based strategies going to do once the infinite money spigot of investment into AI ends and they need to start charging prices that actually make a profit?

I get that most of the cost is in training and not inference, but I don’t see how models stay useful once the world's software updates in the few months after training, since the models can’t learn without said training.

Are we just going to have shops do the equivalent of old COBOL shops, where everything is built to one year's standards and the main language/framework is mostly set in stone?

tedbradley 7 days ago | |

Glad you asked. AI empowers people who couldn't do a job before to do a job. With more supply of qualified workers, these workers compete with each other by lowering the salary they'll take.

So:

* You get paid less.

* The company might pay a similar amount due to LLM costs. Although, it could be more or less as well, depending on how it works out.

A couple of years ago, I saw a story of a guy writing two articles for a website a day. The boss asked him if he wanted to transition to AI-assisted writer for less pay. He said, "No." After a couple of weeks, he got canned. He checked the website out, and it had a bunch of AI writing on it.

LLMs are there to reduce your salaries and increase the business owner's profits. Bigger inequality in wealth, and it's only going to grow more and more. Also, a ton of people fired across many different fields.

ehnto 7 days ago | | |

That is one possibility (that is playing out). Another one worth contrasting is the idea of AI as leverage for the worker. If you can take a regular developer and augment their output by 25%, then they have become more valuable to you and you should pay them more. Why should you pay them more? Because the market rate will price in that they provide more value now and you'll lose those workers to competitors if you don't.

That's a pretty old economic idea, and it will be interesting to see if it holds up in this instance. I have no idea how this all plays out. I do think it won't be one size fits all though.

psychoslave 7 days ago |

https://www.laws-of-software.com/laws/kernighan/ relates here.

The incentives for remote LLMs are off when it comes to providing defaults which optimize for maintainable, sound architecture, though. In the same way, Claude is going to produce an overview of the indexes of the summaries of comprehensive reports that no one is going to read. No doubt this feels like an excellent KPI for how much output was generated.

Jimmy0252 7 days ago |

The maintenance-cost framing is the useful constraint. I’d rather see agents default to smaller diffs, test scaffolding, and explicit assumptions than maximize lines changed per prompt.

robotbikes 7 days ago | |

I think this is still the role of human oversight; these tools will forever be imperfect and the instructions we give them as prompts will always be prone to inaccuracies/misinterpretation. I find it useful to evaluate the code and often ask for simpler solutions, and so far that has produced slightly more elegant solutions. There is a tendency to spawn helper functions to solve every problem, or to do things in a slightly weird or at least unconventional way when there is an easier/standard way of doing it that would create less code. Your ideas, if automated, would definitely make things more maintainable, but even code produced by machines requires a human to be responsible for verifying that it works.

voncheese 6 days ago |

The article is good in that it highlights the need for AI agents/assistants to help with different parts of software development, not just the up front "build me a new widget" part. The author (correctly imo) frames that if someone just uses an AI agent/assistant at the new widget part, then they'll end up with a lot more code to maintain since with AI, they crank out more code. Even if it's high quality, there is maintenance cost over time.

That being said, the problem the author talks about is more of a self-imposed thing than an "everyone is going to suffer" thing. The author correctly points out the startup scenario, where it's just "get this damn thing to work somehow so I can see if there's market fit and nab some customers". That scenario has typically always come with higher maintenance costs down the road because quality is (rightfully) lowered in the name of speed to see if there's a business and if there is, get it going.

Also felt like the author was reluctant to talk about how AI can actually help with the maintenance part. AI can be great at fixing old dependencies and annoying bugs (with human guidance). Those tasks can feel like toil for software engineers and the kinds of things a software engineer will want AI to help with

kristianc 6 days ago |

This hasn't been my experience. I'm finding that I'm spending less time on maintenance precisely because it's less tedious to maintain as I go now, and because the "make a small change across 400 files" type thing which would have seemed impossible before can be chewed through by an agent in a couple of hours.

ACCount37 6 days ago | |

Usually, that is the kind of thing that's done with search-replace tools and some manual touch at the edge cases.

But AI does indeed make it faster and easier. Why sift through the changed lines and hammer down the edge cases yourself if you can get an AI to do it?

caymanjim 6 days ago |

The author seems to be starting with the assumption that humans need to review all AI code in detail and be able to understand and maintain it without the AI assisting. In my experience, this isn't how people are using AI.

It starts out that way. In the beginning, before they trust it, and before they've learned to prompt it to get the results they want, they use it to automate some tedious bits, but humans still create the initial implementation or pattern and then have AI fill in the gaps. More like turbocharged autocomplete than a sea change in how they write code.

The more people work with AI, the less they worry about the actual code it's producing. I'm not saying this is a good thing. It can introduce bugs, performance problems, security holes. It's reality, though. AI code produced a bug? Tell AI to fix it. AI code is bloated and hard to read? If you care, tell AI to fix it. A lot of people don't care.

When humans are removed completely from code maintenance, the need for maintainable code isn't there anymore.

We're not 100% there yet, but that's where we're headed. For a lot of companies, it's worth the risk to YOLO it because it's already good enough.

I don't personally trust it enough to stop reading the code it produces, but I don't read every single line. I pay more attention to the tests than the code under test. I pay more attention to parts of the code where performance matters. I guide the overall structure. But whenever any of that doesn't meet my standards, I'm not the one who's maintaining it. I just tell the AI to fix it.

Maintenance costs aren't on my radar when maintenance is this cheap.

znort_ 6 days ago |

interesting perspective, but i'd throw in some caveats:

- productivity isn't the be-all end-all, it's just one metric and a consultant's mantra. taking a productivity hit can be more than fine if it gives you a tactical/strategic advantage or opportunity.

- i'm not convinced at all that agents will become prohibitively expensive. that's indeed some companies' wet dream but a) good cheap competition is emerging and b) you don't really need the latest models or massive computing power to get shit done.

i do agree though with the emphasis on code quality and debt, and for not blindly going for the silver bullet fad and throwing money at it like there's no tomorrow in the hopes of some "productivity figures boost". then again i doubt that companies going for that would heed such advice, we've been there many times.

topherhunt 6 days ago |

I don't buy the math here because it seems to only model half of what AI coding agents do. The entire argument treats AI as a code-generation accelerant -- more output, therefore more maintenance burden, therefore compounding debt. But in my experience (solo dev, ~30k LOC apps), Claude Code has decimated my maintenance costs. I throw broken tests at it. I use it to diagnose bugs, trace data flows, reason through unfamiliar code, and refactor when things get unwieldy. AI isn't just a faster typist -- it's a faster debugger, reader, refactorer. Modeling AI's impact on codebase growth without modeling its impact on maintenance speed seems like a very selective way to model the future. The maintenance cost curves cited here come from pre-AI dev data; using them to predict post-AI outcomes assumes the answer to the most important question (does AI reduce per-line maintenance cost?) rather than investigating it directly. Nobody has nine years of data on this because halfway-decent coding agents have existed for < 6 months. I like the cautionary advice -- watch out for how much maintenance burden you're incurring with all that delicious AI code slop, folks -- but I don't think his confident quantitative predictions ("gains erased after 5 months") are justified. Am I missing something obvious here?

richardbarosky 6 days ago | |

Yeah, it seems like the article should have qualified the issue more or been more precise. Instead of "You Need AI That Reduces Maintenance Costs", something like "Your Use of AI Should Reduce Maintenance Costs".

Some of the maintenance costs you mentioned are primarily read-only, slam dunk AI use cases. Input from AI to diagnose bugs, trace data flows, and help with reasoning. Tests are something of a gray area in the sense that they are not read-only but they don't affect the logic of the app itself.

The "write" use cases (you mention refactoring and the author seems to primarily focus on writing code) is where the author's point seems to be primarily aimed at.

Definitely agree on the read-only improvements to maintenance. Those are unquestionable slam dunk, high value improvements.

momentmaker 6 days ago |

I feel like the exponential growth of the models will make this a non-issue in the future. So the tweaks we might make right now might not even be worth it once even more powerful models arrive.

comboy 6 days ago | |

Yes and no. Imagine some superintelligence tackling the unmaintainable mess. It can untangle it, but the problem is that unless you have a test suite that's covering 100% of possible cases (which should on its own be enough to build the codebase), you will stumble upon the "what's actually the intent here" problem: some other software depending on buggy behavior in this one, etc.

Which is why I think additional intent preserving abstraction is where software coding agents are likely heading.

rimliu 6 days ago | |

exponential what? Marketing aside in day-to-day use models get worse, not better.

tuo-lei 6 days ago |

maintenance cost on AI code isn't really uniform per line. most of it follows standard patterns, maybe easier to maintain than average human code. but the 5% where something went subtly wrong costs way more to fix because you can't retrace the reasoning, you just re-derive the whole thing from scratch. average looks fine but the tail kills you.

azurewraith 5 days ago | |

The "can't retrace the reasoning" problem is solvable with deterministic workflow constraints. Agents currently run with carte blanche, and they take a mile if given an inch. If the agent was in a specific phase with only specific tools available (and guarded against mega edits) the decision trace is there in the workflow definition. Moving to the next phase is a result of solving the current phase, with a reasoning. If you have no guardrails then there is no observability.

stronglikedan 6 days ago |

R&D is always expensive and that is the phase we're in with AI tools. They will eventually reduce costs, but every company has to make an upfront investment in figuring out their own specific "hows". We'll get there, some faster than others.

swiftcoder 7 days ago |

> Your crowd might tell you that, for each month you spend writing code, you’ll spend... 10 days on maintenance in the first year; and 5 days on maintenance each year after that

Someone is an optimist! I'd estimate those significantly higher, and even worse if you are in a field that has to do any sort of SOC/HIPAA/GDPR audit
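The quoted rule of thumb is easy to play with numerically. The sketch below is a simplified reading of it and not the article's actual model: it assumes 20 working days per month, spreads each yearly maintenance figure evenly over twelve months, and treats whatever time is not spent on maintenance as new code that itself needs maintaining later. The function name and parameters are made up for illustration, and bumping `first_year_days` / `later_year_days` shows how quickly the more pessimistic estimates bite.

```python
WORKDAYS_PER_MONTH = 20.0  # assumption, not a figure from the article


def feature_fraction_by_month(months, first_year_days=10.0, later_year_days=5.0,
                              workdays=WORKDAYS_PER_MONTH):
    """Simulate the share of each month that is left for feature work.

    Every "month of coding" costs first_year_days of maintenance during its
    first twelve months of life and later_year_days per year afterwards,
    spread evenly across the months of each year (an assumption).
    """
    written = []    # coding effort produced in each past month, in months
    fractions = []  # share of each month available for features
    for now in range(months):
        owed = 0.0  # maintenance days due this month
        for then, effort in enumerate(written):
            age = now - then  # months since that code was written
            yearly = first_year_days if age <= 12 else later_year_days
            owed += effort * yearly / 12.0
        feature_days = max(0.0, workdays - owed)
        written.append(feature_days / workdays)
        fractions.append(feature_days / workdays)
    return fractions


ten_years = feature_fraction_by_month(120)
for year in (1, 2, 5, 10):
    month = year * 12 - 1
    print(f"end of year {year}: {ten_years[month]:.0%} of time left for features")
```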

afro88 6 days ago |

The bet that he misses, which a lot of companies are starting to make or at least think about, is that AI will get better at coding. So the model / harness / whatever is next takes care of the maintenance burden.

That's the theory anyway.

stevepotter 6 days ago | |

Be careful there. The whole "just wait 6 months" thing is problematic. It gives you an excuse to make a mess now, because "in 6 months" AI will magically fix it.

It also belittles the human resources. "I heard that in 6 months AI will do everything, so why would I hire new engineers or promote the ones we have?"

m0llusk 6 days ago | |

Meanwhile many LLM users have seen generated code quality drop as prices and service are brought in line with costs. If graduate student level work costs many times the price of a student worker then why bother?

dailywriterguy 6 days ago |

sssshhhh daddy Sam doesn't want you to tell people that you can't actually replace people altogether.

panny 6 days ago |

Then use AI for maintenance. It's AI all the way down. AI is taking over. AI is going to win and there's no stopping it. You're going to be left behind forever if you don't use AI right now. In fact, you may be too far behind already to start.

Did I do that right? ;)

Anyway, AI maintenance can be a time saver if the maintenance is easy. Like upgrading a dependency and all you really need to do is fix imports on five hundred files and modify a method or two. That would have been time consuming, but it's not hard. I think the OP has hit a good point though. Writing code is the fun part, not the bottleneck. The pain is the maintenance, so let's apply the AI there and keep having the fun to ourselves.

rotis 6 days ago |

Not really convinced by the first graph (or the ones that follow). According to it, on a 10-year-old project developers only manage to spend 10% of their available time adding features because the rest is consumed by maintenance? By the 10-year mark I would expect most software to be mature, with fewer new features needed and most known bugs fixed.

throwthrowuknow 6 days ago |

This could have been a good piece of writing if the author chose not to be so smugly overconfident in their belief and show real evidence to support their claim. Mentioning the front page of HN as your source is glib and immediately made me doubt the conclusions. I was interested to see what work the author put into researching this but apparently they didn’t do any work at all.

When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

frumiousirc 6 days ago | |

> When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

You draw made up lines on made up plots and call it evidence, obviously.

lacymorrow 6 days ago |

The strongest signal I have seen for whether AI actually reduces maintenance cost is whether the developer treats AI output as a first draft or a final artifact.

When I use AI tools on existing codebases - understanding unfamiliar modules, generating targeted refactors, writing migration scripts - the maintenance burden genuinely drops. The AI is working on code I already understand architecturally, so I can evaluate its output quickly.

The problem shows up when AI generates greenfield code that nobody deeply understands. That code still has to be maintained by humans who did not write it AND did not design it. At least with code another human wrote, you can reason about their intent from naming, structure, and commit history. AI-generated code often lacks that legibility because the "author" had no persistent intent across files.

The article is right that we need to measure maintenance cost, not just velocity. In practice that means tracking time-to-understand and change-failure-rate on AI-assisted code vs. human-written code over months, not days.

azurewraith 5 days ago | |

There's a third mode that works better: structured phases (scoped to each feature, like humans do). (Plan phase => human reviews plan) => (Implement phase => human reviews diff) => (Test phase => tests run). The current TUI tooling gives you the option to do this type of bite-size scoping but you have to enable plan mode and not auto-accept edits. I've been taking the (enhanced) phased approach out of the default toolchains, having discrete phases (even simple plan=>implement=>test) that loop while capping tool access and edit sizes and that's been really promising in the realm of obtaining better agentic coding quality
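A rough sketch of what such a phased loop could look like, to make the idea concrete. Everything here is hypothetical: the tool names, the 200-line edit cap, and the `Workflow` class are invented for illustration and do not correspond to any existing agent toolchain. The point is only that each phase whitelists tools and caps edit size, and that every tool call and phase transition lands in a trace that can be read back later.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Phase:
    name: str
    allowed_tools: frozenset  # tools the agent may call in this phase
    max_edit_lines: int       # guard against "mega edits"


# Hypothetical phase definitions; tool names and limits are illustrative only.
PHASES = [
    Phase("plan", frozenset({"read_file", "search"}), 0),
    Phase("implement", frozenset({"read_file", "search", "edit_file"}), 200),
    Phase("test", frozenset({"read_file", "run_tests"}), 0),
]


@dataclass
class Workflow:
    index: int = 0
    trace: list = field(default_factory=list)  # decision trace for later review

    @property
    def phase(self) -> Phase:
        return PHASES[self.index]

    def use_tool(self, tool: str, edit_lines: int = 0) -> None:
        """Record a tool call, rejecting it if the current phase forbids it."""
        phase = self.phase
        if tool not in phase.allowed_tools:
            raise PermissionError(f"{tool!r} is not allowed in phase {phase.name!r}")
        if edit_lines > phase.max_edit_lines:
            raise ValueError(f"edit of {edit_lines} lines exceeds cap in {phase.name!r}")
        self.trace.append((phase.name, tool, edit_lines))

    def advance(self, reason: str) -> None:
        """Move to the next phase; a stated reason is mandatory and is logged."""
        if not reason.strip():
            raise ValueError("a reason is required to leave a phase")
        self.trace.append((self.phase.name, "advance", reason))
        self.index = (self.index + 1) % len(PHASES)  # phases loop per feature


wf = Workflow()
wf.use_tool("read_file")
wf.advance("plan reviewed and approved by a human")
wf.use_tool("edit_file", edit_lines=40)
print(wf.phase.name)  # implement
```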

aroido-bigcat 6 days ago |

One thing I like about framing this as maintenance cost is that it moves the measurement boundary. The usual AI coding metric is something like accepted diff per hour, but the more interesting unit is probably future decisions created per hour.

An agent can reduce typing while increasing the number of things nobody really owns later: rationale, invariants, tradeoffs, half-meaningful tests, files that changed because they were nearby, etc. The PR can pass and still leave the team with more intent to rediscover.

The useful agent workflows I keep coming back to are less about "write more code" and more about making every change come with a maintenance handle: what invariant changed, what should fail if this is wrong, what files should not have changed, what rollback looks like. It feels slower in the moment, but it gives future-you something to grab onto.
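
One way to picture that "maintenance handle" is as a small record attached to each change, plus a check that flags empty answers and files that were supposed to stay untouched. This is only a sketch; the class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaintenanceHandle:
    # Hypothetical fields mirroring the four questions above.
    invariant_changed: str
    should_fail_if_wrong: str
    rollback: str
    files_that_must_not_change: tuple = ()


def review_problems(handle, changed_files):
    """Return a list of reasons the change is missing something to grab onto later."""
    problems = []
    for field_name in ("invariant_changed", "should_fail_if_wrong", "rollback"):
        if not getattr(handle, field_name).strip():
            problems.append(f"missing: {field_name}")
    # Files the author promised not to touch but that show up in the diff.
    touched = set(changed_files) & set(handle.files_that_must_not_change)
    for path in sorted(touched):
        problems.append(f"unexpected change: {path}")
    return problems
```

An empty list means the change at least answers all four questions; it says nothing about whether the answers are right.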

tomtomatoide 6 days ago |

The math works for codebases that survive past year one but not so much for less mature ones, no?

I shipped a small Stripe storefront with Claude Code over the weekend. Three pages, four integrations, one database. At that size you can read every file before merging, and that's basically the lever. Shore's argument really bites when you can't.

The thing missing from his model, I think, is project shape. Bounded greenfield has a maintenance ceiling because the code itself does. A long-lived monolith with the agent extending it is where the math gets ugly.

immanuwell 7 days ago |

But the dirtier truth is that LLM-generated code skews the maintenance curve worse than human code, because it optimizes for "compiles and passes the happy path" rather than for the boring stuff that makes future-you's life easier.

eddyaipt 6 days ago |

A useful check is whether the agent leaves the repo easier to reason about after the change. If the diff ships the feature but increases review time, hidden coupling, or rollback ambiguity, the maintenance bill is already showing up.

philipp-gayret 7 days ago |

Would be an interesting concept and read were it grounded in reality. Unfortunately, it's data and graphs pulled out of someone's imagination. The reality is that nowadays, with the right skillset, you can take state-of-the-art AI tools and get a complete language rewrite and/or refactor and be done the same afternoon.

pdhborges 7 days ago | |

At least if you have a test suite that doesn't have to be migrated. I too would like to migrate some services from Python to Rust, but my test suite is written in Python, so I would have to manually check that the test suite migration was correct (I can't even compile it!) before doing the rewrite.

devinabox 7 days ago |

Great Article! I think ultimately we are heading towards a world where much better software will be created. This is the major roadblock we need to cross over before that can be true, but I think it is a very tractable problem!

I created a video that talks about this in more detail:

https://www.youtube.com/watch?v=G3Q7Y-nrUbk

faangguyindia 7 days ago |

With AI, you can hypothesise what can potentially break with each new addition (which your regression tests do not even capture at present). Then you can write tests for each of those hypotheses, ask AI to deploy a canary, and ask AI to divert 5% of traffic to the canary. Ask AI to analyse the logs for any signs of regression in performance, and ask AI to roll it out to 100% if everything is good. Congrats! At this point, you've become a slave to AI and cannot do without it. Even logging into a remote server now causes mental pain; having to do anything by hand causes pain. You just wait for your limit to be reset to return to slavery again. A master of a slave is as much of a slave to his slave as the slave is to the master.

danielbln 7 days ago | |

My local model humming next to me will always be available. Is it as good as a foundational model? No. But it'll work just fine for most pedestrian tasks, and I don't need to keep now-useless mechanical knowledge in my brain.