It took two weeks to make Claude's "overnight solution" for flaky tests useful (thoughtbot.com)

2 points by zingar 6 days ago | 1 comment

james_ross 4 days ago

This matches my experience as a one-person team on a desktop app with thousands of unit tests and hundreds of Playwright e2e tests. I had a number of flaky tests, and it concerned me that Claude was choosing on its own to isolate them when running the suite. The breakthrough for me was using the superpowers debugging skill and setting a focused goal: fix the one test that failed most often. The cause turned out to be a race condition that I'd never have found on my own, and Claude then went on to find the dozen or so similar issues elsewhere in the codebase. No e2e failures now. This is a very satisfying use of an AI agent for me.