Open-world evaluations for measuring frontier AI capabilities [pdf](cruxevals.com)
2 points by randomwalker 32 days ago | 0 comments