Ask HN: What tools are you using for AI evals? Everything feels half-baked | Dark Hacker News