Again, this is not true, and I have a recent real-world example.
For context: for the project I’m about to describe, I did the three-week discovery process where I iterated through the design. I designed the architecture starting from an empty AWS account with IaC and an empty git repo. I know every decision that was made and why.
An issue was reported while the client was testing - a duplicate message was displayed to the user.
I gave Codex three pieces of information - the duplicate IDs and the fact that they were duplicates.
Codex:
1. Created and ran a query against the Postgres database after finding the ARN of the credentials - in AWS you don’t have to pass credentials to the database yourself, you pass the Secrets Manager entry along with the query, as long as you have permission to both (this was the Dev account). I didn’t tell it which database to query or where I was storing the events.
2. It found the lambda that stored the events in the database.
3. It looked at the CloudFormation template to figure out the Lambda was triggered by messages in an SQS queue
4. Looking at the same template, it saw that the SQS queue was subscribed to an SNS topic
5. It found the code that sent the events - a 3,000-line Lambda
6. It was able to explain what the lambda did and find there wasn’t a bug in the logic
7. It saw that the flow was data driven and got the information from a DDB table defined by an environment variable.
8. It then looked at the CloudFormation template that deployed the Lambda
9. It ran a query on the DDB table after looking at that CloudFormation template to figure out the schema
It then told me that there was a duplicate entry in the database.
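To make the first and last steps concrete, here is a rough Python sketch. It is not what Codex actually ran. It assumes the Postgres query went through the RDS Data API (`execute_statement` on the boto3 `rds-data` client, which takes a `secretArn` instead of a password), and the table, column, and attribute names (`events`, `message_id`, `event_type`, `channel`) are made up for illustration.

```python
from collections import defaultdict


def build_data_api_request(cluster_arn, secret_arn, database, message_id):
    """Build keyword arguments for the RDS Data API `execute_statement` call.

    Assumption: the database is reachable through the Data API. The caller
    would pass the result as boto3.client("rds-data").execute_statement(**req).
    The `events` table and `message_id` column are hypothetical names.
    """
    return {
        "resourceArn": cluster_arn,  # ARN of the database cluster
        "secretArn": secret_arn,     # ARN of the Secrets Manager entry
        "database": database,
        "sql": "SELECT * FROM events WHERE message_id = :message_id",
        "parameters": [
            {"name": "message_id", "value": {"stringValue": message_id}}
        ],
    }


def find_duplicate_entries(items, fields):
    """Group items by the given attributes and return groups with 2+ items.

    A DynamoDB table cannot hold two items with the same primary key, so a
    "duplicate entry" here means two items that differ in key but agree on
    the attributes that drive the flow. `fields` names those attributes.
    """
    groups = defaultdict(list)
    for item in items:
        groups[tuple(item[f] for f in fields)].append(item)
    return {key: group for key, group in groups.items() if len(group) > 1}
```

With the rows from a table scan already loaded as plain dicts, `find_duplicate_entries(rows, ["event_type", "channel"])` returns only the attribute combinations that appear more than once, which is the kind of answer I got back.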
I knew the entire structure of the system - again I designed all of this myself. I wanted to see how codex would do.
Everything you are describing is something a modern LLM can do.
I won’t even go into how well it debugged a vibe coded internal website just by telling it to use a Docker container with headless Chromium and Playwright. It debugged it by taking screenshots while navigating and making changes.
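For anyone who hasn't seen that loop, a minimal sketch of the screenshot part looks roughly like this. It assumes Playwright's Python sync API. The URL, output directory, and function names are placeholders, not what was actually used.

```python
from pathlib import Path


def capture_steps(page, urls, out_dir):
    """Visit each URL in order and save a full-page screenshot of it.

    `page` is a Playwright Page (or anything with compatible `goto` and
    `screenshot` methods). Returns the list of screenshot paths written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        page.goto(url)
        path = out / f"step-{index:02d}.png"
        page.screenshot(path=str(path), full_page=True)
        saved.append(path)
    return saved


def main():
    # Imported here so capture_steps can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        # Placeholder URL: point this at the site under test.
        capture_steps(page, ["http://localhost:3000/"], "screenshots")
        browser.close()
```

Calling `main()` inside the container leaves one PNG per visited page in `screenshots/`, which the agent can then look at before deciding on the next change.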