Show HN: Auto-generate hard evaluation data for LLMs | Dark Hacker News