Show HN: Vision AI Checkup, an Optometrist for VLMs (visioncheckup.com)

2 points by zerojames 1 year ago | 0 comments

Evaluating visual capabilities of language models is hard.

At one end of the evaluation spectrum are vibe checks, which are useful for building intuition but time-consuming to run across a dozen or more models. At the other end are benchmarks so large that they are intractable for most users.

Vision AI Checkup is a new tool for evaluating VLMs. The site is made up of hand-crafted prompts focused on real-world problems: defect detection, understanding how the position of one object relates to another, colour understanding, and more.

Our prompts are especially focused on industrial tasks -- serial number reading, assembly line understanding, and more -- although we're excited to add more general prompts.

The tool lets you see how models perform across categories of prompts, and compare how different models perform on a single prompt.
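
As a rough illustration of those two views (a sketch with made-up data, not the site's actual code or schema), both comparisons can be computed from a flat list of pass/fail results. The model names, prompt IDs, and record layout below are hypothetical:

```python
from collections import defaultdict

# Hypothetical records: (model, category, prompt_id, passed).
# None of these names or outcomes come from the real site.
results = [
    ("model-a", "defect detection", "p1", True),
    ("model-a", "defect detection", "p2", False),
    ("model-a", "colour understanding", "p3", True),
    ("model-b", "defect detection", "p1", True),
    ("model-b", "defect detection", "p2", True),
    ("model-b", "colour understanding", "p3", False),
]


def accuracy_by_category(records):
    """Return {model: {category: fraction of prompts passed}}."""
    # Each tally is a two-item list: [number passed, number attempted].
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for model, category, _prompt_id, passed in records:
        tally = counts[model][category]
        tally[0] += int(passed)
        tally[1] += 1
    return {
        model: {cat: p / t for cat, (p, t) in cats.items()}
        for model, cats in counts.items()
    }


def results_for_prompt(records, prompt_id):
    """Return {model: passed} for one prompt, to compare models side by side."""
    return {m: passed for m, _c, pid, passed in records if pid == prompt_id}


by_category = accuracy_by_category(results)
single_prompt = results_for_prompt(results, "p2")
print(by_category)
print(single_prompt)
```

The first function gives the per-category view and the second gives the single-prompt view; a real assessment would presumably also need to store each model's raw answer, which this sketch leaves out.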

We have open sourced the codebase, with instructions on how to add a prompt to the assessment: https://github.com/roboflow/vision-ai-checkup. You can also add new models.

We'd love feedback, as well as ideas for areas where VLMs struggle that you'd like to see assessed!
