Personally, I'm currently using a free trial of Codex, and I'll just quote what I wrote on that channel:
I have seen some stupid behavior with Codex and GPT-5.4, FWIW, even on extra high.
Like, I tell it to "use TDD" and then I find that more than zero of the tests open the source code and run a regex to confirm only that certain substrings exist in it, which obviously misses all questions of functionality.
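
To make that concrete, here is a minimal sketch of the pattern, not anything Codex actually produced. The `add` function, the `SOURCE` string, and both test names are hypothetical, made up for illustration. The implementation is deliberately buggy, so the substring-style test passes while the test that actually calls the function fails:

```python
import re

# Hypothetical implementation under test, deliberately buggy: it subtracts.
SOURCE = '''
def add(a, b):
    return a - b
'''

# Load the hypothetical implementation so it can also be called directly.
namespace = {}
exec(SOURCE, namespace)
add = namespace["add"]


def source_regex_test():
    """Anti-pattern: only checks that the text 'def add(' appears in the source."""
    return re.search(r"def add\(", SOURCE) is not None


def behavioral_test():
    """Calls the function and checks the value it returns."""
    return add(2, 3) == 5


if __name__ == "__main__":
    print("source regex test passes:", source_regex_test())
    print("behavioral test passes:", behavioral_test())
```

The regex check stays green no matter what the body of `add` does, which is exactly why it says nothing about functionality.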