Vibes Are Not a Metric: A Guide to LLM Evals in Python | Dark Hacker News