We built scalable evaluation infrastructure for AI web agents | Dark Hacker News