Benchmarking Vulnerability of Agent-Generated Code in Real-World Tasks | Dark Hacker News