Show HN: Verdict – model evals on your own data, not someone else's benchmark (github.com)
2 points by agunapal 10 days ago | 0 comments