Benchmarking LLMs for Web Tasks | Dark Hacker News