Bluffbench is near saturation: LLMs can interpret counterintuitive plots (opensource.posit.co)
2 points by ionychal 3 days ago | 0 comments