Beyond demoware: how do you evaluate an AI agent? | Dark Hacker News