Show HN: Cheddar-bench – unsupervised benchmark for coding agents | Dark Hacker News