Bluffbench: Effective agents need to prioritize evidence over preconceptions