LLMs are breaking 20 year old system design

> Long running work: an agent doing a 10 minute task isn’t a ‘request’, it’s a long-running async process.

Correct, but we solved this a long time ago when we started sending files to servers to be converted, for example. We either got a 'job_id' or a call to a webhook when the job was finished.
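For illustration, here is a minimal sketch of the 'job_id' pattern described above. It assumes an in-memory job table and a background thread; `submit_job`, `get_status`, and `convert_file` are hypothetical names, and a real service would likely keep job state in a database or queue instead.

```python
import threading
import time
import uuid

# Hypothetical in-memory job table; stands in for a database or queue.
_jobs = {}
_lock = threading.Lock()


def submit_job(work, *args):
    """Start `work` in the background and return a job_id immediately."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "running", "result": None}

    def runner():
        try:
            update = {"status": "done", "result": work(*args)}
        except Exception as exc:  # record the failure instead of losing it
            update = {"status": "failed", "result": repr(exc)}
        with _lock:
            _jobs[job_id] = update

    threading.Thread(target=runner, daemon=True).start()
    return job_id


def get_status(job_id):
    """Return a copy of the current job record for the given job_id."""
    with _lock:
        return dict(_jobs[job_id])


def convert_file(name):
    """Stand-in for a slow conversion; only renames the extension."""
    time.sleep(0.05)
    return name.rsplit(".", 1)[0] + ".pdf"
```

The caller gets the job_id back right away and checks `get_status(job_id)` later; a webhook variant would call a client-supplied URL from `runner` instead of waiting to be asked.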

pizza234 49 days ago | |

The article argues that although this is an asynchronous process:

> Long running work: an agent doing a 10 minute task isn’t a ‘request’, it’s a long-running async process.

it should not be stateful at the database/storage level:

> Stateful compute: an agent might run multiple turns of a conversation, might process multiple tool calls, and relies on accumulated context. That state is not really ‘database state’, it’s the agents memory.

According to the author, the problem is already solved, but implemented with the wrong design assumptions.

(Uploading a file for conversion could be framed as a slightly different problem from the author's, though, due to size constraints)

grugdev42 49 days ago |

Article doesn't make sense. Some of the "horizontally scaled" servers have their own state. A local cache, a temporary filesystem etc.

Also, has the author never heard of long running queued jobs? Or long running scheduled jobs? They ultimately report back into the DB (updating their status etc).

This article reeks of someone using AI to make huge leaping jumps of logic. The "single source of truth" rule has survived this long for a reason. It works!

zknill 49 days ago | |

And once those long running jobs have reported their status back to the database, how will the client find out about that status?

Please, please, please don't say "polling". Because you've clearly missed the entire argument of the article if you say polling.

dial9-1 49 days ago | | |

PostgreSQL has LISTEN and NOTIFY. Redis and Kafka have pub/sub. This is a solved problem.

zknill 49 days ago | | |

Isn't the point that you no longer have a connection to the client?

So you can be notified by the database, but you can't (with the stateless HTTP + loadbalancer design explained in the article) get that notification back to the client. Because the client isn't connected anymore; so how does the client know that there's new information?

gatlin 49 days ago | | |

*I guess there would have to be some mechanism for the database to push notifications to the client. This is not a fundamentally unsolvable or particularly interesting problem.*

bilbo-b-baggins 49 days ago |

Claude Code runs as a nearly stateless server using session JSONL files as a conversation database, sending stateless API requests to Anthropic, etc.
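As a rough sketch of the "JSONL file as conversation database" idea: each turn is appended as one JSON line, and the full context is rebuilt by re-reading the file before every stateless request. The record shape (`role`, `content`) and the function names are assumptions for illustration, not the actual session format used by any particular harness.

```python
import json
import os
import tempfile


def append_turn(path, role, content):
    """Append one conversation turn as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"role": role, "content": content}) + "\n")


def load_session(path):
    """Rebuild the conversation from disk; an unknown session is empty."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def demo():
    """Write two turns to a temporary session file and read them back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "session.jsonl")
        append_turn(path, "user", "hi")
        append_turn(path, "assistant", "hello")
        return load_session(path)
```

Because the process holds nothing between turns, it can be restarted at any point and pick up from the file.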

This post doesn’t seem to understand how these systems work at the core of agent harnesses.

OutOfHere 49 days ago | |

That is a limited argument because sometimes the harness can run itself in the cloud, not on the client side. Something then has to maintain the communication between the harness and the client.

skywhopper 49 days ago |

This article is clearly written by someone who’s never done any work on actually complex web applications. Nothing here is a new problem nor unsolved. The pattern identified as being “LLM specific” (long-running async jobs) is not particularly unusual.

ahofmann 49 days ago |

To me this makes no sense. Nothing in web development changes because of long running requests; there are plenty of solutions for this. The easiest one is to just listen long enough on an HTTP request for the answer. The routing problem can be mitigated with session pinning. HTTP/2 and HTTP/3 have solutions for streaming data, WebSockets can be used, and pub/sub also. Heck, we could push the LLM response into a k2v system/Redis and read it from there. "State is in the DB" is running strong and will be for decades to come.

zknill 49 days ago | |

The industry decided a long time ago that sticky sessions were a terrible idea. They only half-solve the problem, while suffering from session loss on server loss and imbalanced load over time.

ahofmann 49 days ago | | |

The services I ran didn't care about decisions "the industry" made. They worked just fine.

mattjoyce 49 days ago |

Durable is used 13 times in this article.

bunzee 46 days ago |

To be honest, the more difficult change is not a technical issue, but the fact that existing assumptions regarding the direction of product improvement are crumbling. Information such as who the competitors are and what the market situation is like is changing too quickly for most teams to grasp manually.

NitpickLawyer 49 days ago |

> LLMs just make this problem more visible.

This theme keeps popping up everywhere. Lots of things were "the way we did things" for a lot of reasons. LLMs just amplify some of them and give them more visibility. It can be a good thing, if you're able to understand what changed, why, and how, or it can be a bad thing if you insist that "this is how we do things, because this is how we've always done things".

endofreach 49 days ago | |

> or it can be a bad thing if you insist that "this is how we do things, because this is how we've always done things"

Or... maybe... just maybe... it can be a bad thing, because it's a bad thing.

NitpickLawyer 49 days ago | | |

Many things can be wrong, for many reasons. The problem is when people think LLMs make it wrong, instead of understanding that LLMs just expose the thing for what it was. It's like shooting the messenger just because the messenger is an LLM. That was my point, in case I worded it badly.

cowl 49 days ago | |

or maybe it is a bad thing, because right now the model is "throw it against the wall and see what sticks or how many billions we need to make it stick"?

NitpickLawyer 49 days ago | | |

> right now the model is "throw it against the wall and see what sticks

When was it not? We've been doing this for decades. Something usually sticks.

foo42 49 days ago |

It feels like virtual actors are the primitive the author is reaching for. As an erstwhile Elixir hobbyist I've often found myself wishing for the simplicity of actors when solving problems in my day job. I tend to work in an AWS environment, but I believe over in Azure they have something like it. I think it was called Orleans when I read about it, but I think it's got a more corporate name now.

cronin101 49 days ago | |

It’s still Orleans! https://learn.microsoft.com/en-us/dotnet/orleans/overview?pi...

pmargam 49 days ago |

I'm using Cloudflare's Durable Objects https://developers.cloudflare.com/durable-objects/concepts/w... for this, and it works pretty well.

ventana 49 days ago |

If I'm reading it correctly, the TL;DR of the article is: given a client and a server, we need to be able to ingest messages into the client-server communication channel, and this channel should survive a disconnection. The article suggests using named pub/sub channels for communication, so that the “connection” between a given client and a given (cloud) server has a name and data chunks can be ingested into that named channel.
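To make that summary concrete, here is a toy sketch of a named channel that survives a disconnect. The broker keeps an ordered log per channel name, and a reconnecting client resumes from the last offset it saw. `ChannelBroker`, `publish`, and `read_from` are hypothetical names and the log lives in memory only; this is an illustration of the idea, not the article's implementation.

```python
from collections import defaultdict


class ChannelBroker:
    """Hypothetical in-memory broker: each named channel is an ordered log."""

    def __init__(self):
        self._logs = defaultdict(list)

    def publish(self, channel, message):
        """Append a message to the channel and return the new log length."""
        log = self._logs[channel]
        log.append(message)
        return len(log)

    def read_from(self, channel, offset=0):
        """Return (messages at or after `offset`, next offset to resume from)."""
        log = self._logs.get(channel, [])
        return log[offset:], len(log)


broker = ChannelBroker()
broker.publish("session-42", "chunk-1")

# Client reads what is there, then loses its connection.
seen, resume_at = broker.read_from("session-42")

# The server keeps publishing while nobody is connected.
broker.publish("session-42", "chunk-2")
broker.publish("session-42", "chunk-3")

# On reconnect the client presents its offset and receives only what it missed.
missed, resume_at = broker.read_from("session-42", resume_at)
```

The channel name plays the role of the "connection": any server can publish to it and any reconnecting client can find it, without the two ever sharing a socket.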

I would suggest that there is a much, much older technology than pub/sub that can be used for this kind of data transfer: it's UDP, documented in 1980.

I can't stop thinking how overcomplicated our software engineering reality is so we need to reinvent layers and layers of stuff on top of the other stuff. We must make applications for browsers; browsers disallow basic network communication for the code they execute; so sending a chunk of data from a client to a server becomes a real adventure.

lxgr 49 days ago | |

UDP and nothing layered on top?

Then you'll be reimplementing host discovery (i.e. how do clients find the host that has context on their request), retransmissions, flow control, congestion control, and many other things on top of it, and suddenly it doesn't sound so simple anymore.

haileys 49 days ago |

The premise is incorrect and ignorant of the history - this is sticky sessions and the idea has been around longer than 20 years.

The "cloud native" (as the author refers to it) idea that app servers should be stateless is actually the new idea.

The industry eventually reached consensus on sticky sessions being a bad idea a lot of the time. That's why stateless app servers became the norm.

deafpolygon 49 days ago |

Written by AI…

throwaway27448 49 days ago |

? Yes if you treat llms like deterministic computation you'll get fucked, news at eleven. In terms of apps "shitty but uncannily useful search" seems like a better fit