Ask HN: Does anybody still FEEL improvements between latest LLMs for coding?
Title, basically. For me the latest generations of LLMs feel about equal in usefulness for coding. Does anybody have anecdotes to the contrary?
Antigravity started the workflow where you give a list of things to do and it goes off and does those things without supervision, including drafting and testing edge cases. It can even spin up images. Fable is the latest form of this workflow.
Gemini 3 Pro is actually at the level of a mid-level designer. None of the others are even at a junior human designer's level.
Sonnet 4.5 was, for a brief moment, creative brilliance. But we're talking coding, not writing, right? They ditched it all for 4.6.
Opus 4.6 and 4.8 are extremely good for coding. I use them to reliably go through logs that are around 15k lines long. They can read my code, map out the logs that should appear, check the logs for what actually happened, and from there form hypotheses and add the logging needed to validate them.
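For anyone who wants the "expected vs. actual logs" step spelled out, here is a rough sketch of doing it by hand. It is only an illustration, not what the model does internally: the function name `first_divergence`, the marker strings, and the sample log lines are all made up, and it assumes the expected events can be recognized by simple substrings that appear in a fixed order.

```python
from typing import Iterable, List, Optional, Tuple


def first_divergence(
    expected: List[str], log_lines: Iterable[str]
) -> Tuple[int, Optional[str]]:
    """Walk the log once, matching expected markers in order.

    Returns (number of markers matched, first marker never seen or None).
    """
    matched = 0
    for line in log_lines:
        if matched == len(expected):
            break  # every expected marker has been seen
        if expected[matched] in line:
            matched += 1
    missing = expected[matched] if matched < len(expected) else None
    return matched, missing


# Hypothetical markers (what the code says should be logged) and a
# hypothetical log excerpt (what was actually logged).
expected = [
    "request received",
    "cache miss",
    "db query start",
    "db query done",
    "response sent",
]
log = [
    "12:00:01 INFO request received id=7",
    "12:00:01 DEBUG cache miss key=user:7",
    "12:00:02 INFO db query start",
    "12:00:09 ERROR timeout waiting for db",
]

matched, missing = first_divergence(expected, log)
print(f"matched {matched} of {len(expected)}; first missing marker: {missing!r}")
```

The first missing marker is where a hypothesis starts: in this made-up excerpt the query begins but never finishes, so the next step would be adding logging around the database call.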
Codex/ChatGPT is probably second best in all of the above.
A lot of that is because in the former case (AI does everything) I wasn't paying enough attention.