Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult (simonwillison.net) 1 points by gingersnap 174 days ago | 1 comment