First Proof: Research-Level Math for AI Evaluation | Dark Hacker News