I was recently working on a UI system that attempts to let AI build websites autonomously. When you start working on it you quickly realize how many tradeoffs you have to think through. Most of the questions arise from the limitedness of the context window. For example, Claude 3 supports 200k tokens.
You also have to take into consideration, that since chat history is sent with every new message, the price of the conversion growths ~ n^2 of the number of messages. So do you send the whole codebase? Or do you let AI run commands like ls and cat to read the files it needs? Do you want a file in the directory with quick history of what's already done, and what needs to be done?
Another thing I find interesting is how microservices became a natural choose vs. monolith apps when building with AI, again due to limits of the context window. So you focus on thinking through all the components and their APIs, and then let AI build each of the component. If it can be done in isolation without any knowledge of other components, that's better.
Also, it quickly becomes obvious that fully-autonoumous builder does not make any practical sense. Real person still needs to look at the progress, and give guidance. Not even because AI can't do this, it probably can. But because your own understand of what you are building changes over time. So it should be semi-automatical, with real users being able to change the course any moment.
How do you build the autonomous loop?
One thing I find useful is to let AI write tests first, and then run those automatically on each new chat message. TypeScript types also helps catching broken code early. In those case automatical message is sent "Hey, you broke the tests. Here are the error messages. Go ahead and fix those." Operator doesn't have to bother, until it's fixed.
Another loop can be build with the ability to send screenshots. So at any moment system can send a screenshot to AI, and ask if it's good enough, and if it wants to make any changes. That also improves the quality.
Well, you get the ides. It's an interesting task to ponder.