Show HN: Text-to-Video Arena (t2vleaderboard.lambdalabs.com)
But how do we measure the quality of the outputs? Is the choice of color more important than realism, or is it the composition of the scene?
We’ve launched a Text-to-Video Model Leaderboard to explore these questions, inspired by the LLM Leaderboard (lmarena.ai). Our idea: many models exist, but only an unbiased comparison can help evaluate what users of text-to-video models actually find most important.
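One common way to turn head-to-head votes into a ranking is an Elo-style update. The sketch below assumes votes arrive as (winner, loser) pairs; the function name, the K-factor, the starting rating, and the model names are placeholders for illustration, not a description of how this leaderboard actually scores models.

```python
from collections import defaultdict


def elo_ratings(votes, k=32.0, base=1000.0):
    """Compute Elo-style ratings from (winner, loser) pairs, in arrival order.

    `k` and `base` are assumed values chosen for illustration.
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in votes:
        # Probability that the winner would win, under the Elo logistic model.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        # An upset (low expected score) moves the ratings more than a sure win.
        delta = k * (1.0 - expected)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Made-up votes with hypothetical model names; not real leaderboard data.
votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
ratings = elo_ratings(votes)
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Sequential Elo depends on the order in which votes arrive, so a real system might prefer fitting all votes at once (for example with a Bradley-Terry model); that choice is left open here.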
Right now, the leaderboard includes five open-source models:
* HunyuanVideo
* Mochi1
* CogVideoX-5b
* Open-Sora 1.2
* PyramidFlow
We plan to expand it to include proprietary models from Kling AI, LumaLabs.ai, and Pika.art. You can check out the current leaderboard here: https://t2vleaderboard.lambdalabs.com/leaderboard/
We’re looking for feedback from the HN community:
* How should text-to-video models be evaluated?
* What criteria or benchmarks would you find meaningful?
* Are there other models we should include?
We’d love to hear your thoughts and suggestions!