Beyond Benchmark Maxxing: Measuring Open Source Models as Real-World Agents (ultravox.ai)
1 point by zkoch 263 days ago | 0 comments
No comments yet