What's the best way to benchmark neuro‑symbolic‑causal AI agents? | Dark Hacker News