Your AI coding benchmark is hiding a 2x quality gap | Dark Hacker News