It's gotten very bad. It had been degrading since late Feb, and since March 8th it has become unusable. "Simplest fix" and "You're right, I'm sorry" are strong indicators. It went from senior engineer to entitled intern, and I went from having a team of peers to having a lazy jerk who only tries to cut corners. I've got quantitative analytics of it, too. The other day it briefly returned to normal for about 24 hours, and then someone flipped the switch again mid-session. I was a massive proponent of Claude/Opus, and for the last several weeks I have felt rug-pulled. It's such an obvious degradation that even non-technical friends have noticed it. It's optimizing for minimum effort instead of correct and clean solutions. It sucks, because had I experienced it like this from the start I'd have bounced from agentic coding and never looked back - unfortunately, I thought it'd only get better and adjusted my workflow around it. When my Qwen3.5 27B local model gets into fewer reasoning loops than Opus does, it makes me wonder if anyone there cares or if they are just chasing IPO energy from scaling.
I had to build a stop hook to catch its garbage, and even then it's not enough. I used to have 30min-1hr uninterrupted sessions (with some slipstreamed comments), and now I can't get a single diff that I can accept without comment. Half of the work it does is more destructive than helpful (removing comments from existing code, ignoring directives and wandering off into nowhere, etc).
From the 2 weeks after installing the stop hook (around March 8th):
```
Breakdown of the 173 violations:
73x ownership dodging (caught saying variants of "not caused by my changes")
40x unnecessary permission-seeking ("should I continue?", "want me to keep going?")
18x premature stopping ("good stopping point", "natural checkpoint")
14x "known limitation" dodging
14x "future work" / "known issue" labeling
Various: "next session", "pause here", etc.
Peak day: March 18 with 43 violations in a single day.
```
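For anyone who wants to count the same kind of thing, here is a minimal sketch of the phrase matching only. It is not the actual hook: the category names and phrases are just the examples quoted in the breakdown above, `find_violations` is a hypothetical helper name, and wiring it into a stop hook is left out.

```python
import re
from collections import Counter

# Phrases copied from the examples in the breakdown above.
# Assumption: a real hook would need many more variants per category.
PATTERNS = {
    "ownership dodging": [r"not caused by my changes"],
    "permission-seeking": [r"should I continue\?", r"want me to keep going\?"],
    "premature stopping": [r"good stopping point", r"natural checkpoint"],
    "known limitation": [r"known limitation"],
    "future work / known issue": [r"future work", r"known issue"],
    "various": [r"next session", r"pause here"],
}

# Compile once, case-insensitively, so "Should I continue?" also matches.
COMPILED = {
    category: [re.compile(p, re.IGNORECASE) for p in phrases]
    for category, phrases in PATTERNS.items()
}


def find_violations(message: str) -> Counter:
    """Count phrase hits per category in one assistant message."""
    counts = Counter()
    for category, regexes in COMPILED.items():
        hits = sum(len(rx.findall(message)) for rx in regexes)
        if hits:
            counts[category] = hits
    return counts


if __name__ == "__main__":
    sample = "This failure is not caused by my changes. Should I continue?"
    print(find_violations(sample))
```

Summing the returned counters per day would give a daily tally like the peak-day figure, assuming each message is logged with a date.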
The other one is reasoning loops, which I'm familiar with from small local models, not frontier ones:
```
Sessions containing 5+ instances of reasoning-loop phrases ("oh wait", "actually,", "let me reconsider", "I was wrong"):
Period            Sessions with 5+ loops
Before March 8    0
After March 8     7 (up to 23 instances in one session)
```
(I've even had it write code where it has "Wait, actually, we should do X" in comments in the code!)
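The loop count is easy to approximate. This is a rough sketch, assuming each session's output is available as one plain-text string keyed by a session id; `count_loop_phrases` and `flag_sessions` are hypothetical names, and the phrase list is just the four quoted above.

```python
# Reasoning-loop phrases quoted above, lowercased for case-insensitive matching.
LOOP_PHRASES = ("oh wait", "actually,", "let me reconsider", "i was wrong")


def count_loop_phrases(text: str) -> int:
    """Count occurrences of any loop phrase in one session's text."""
    lowered = text.lower()
    return sum(lowered.count(phrase) for phrase in LOOP_PHRASES)


def flag_sessions(sessions: dict[str, str], threshold: int = 5) -> dict[str, int]:
    """Return {session_id: count} for sessions at or above the threshold."""
    counts = {sid: count_loop_phrases(text) for sid, text in sessions.items()}
    return {sid: n for sid, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    sessions = {
        "a": "Oh wait. Actually, no. Let me reconsider. I was wrong. "
             "Oh wait, actually, yes.",
        "b": "Done.",
    }
    print(flag_sessions(sessions))
```

Plain substring matching will over-count ordinary uses of "actually,", so treat the numbers as a trend indicator rather than an exact measure.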
The worst is the dodging; it said, literally, "not my code, not my problem" to a build failure it created 5 messages ago in the same session.
```
I had to tell Claude "there's no such thing as [an issue that existed before your changes]" on average:
Once per week in January
2-3 times per week in February
Nearly daily from March 8 onward
```
Honestly, just venting, because I'm extremely depressed. I had the equivalent of a team of engineers I could trust, and overnight someone at Anthropic flicked a switch and killed them. I'm getting better results from random models on OpenRouter now (and OmniCoder 9B! 9B!). They aren't _good_ results, mind you, but they aren't idiotic.
Sad. Very sad.