"A benchmark for catching when code doesn't do what its documentation claims" | Dark Hacker News