Agent-evals: Overlap, boundary, and metacognitive scoring for coding agents (thinkwright.ai)
1 point by oceanwaves 90 days ago | 1 comment