I've been working on a way for agents to query production systems, so they can help me debug issues and close the loop on things I work on day to day. It works as a hook that rewrites ssh, awscli, gcloud, az, and kubectl commands to verify that they are read-only and safe. It also keeps track of sessions in files, and when agents debug the same things again, it adds hints to the tool calls like:
━━ Past Investigation (May 10, 87% similar) ━━
Root cause: php-fpm pool exhaustion causing nginx 502
Hosts involved: web1
Investigation path:
  web1: systemctl status nginx
  web1: journalctl -u nginx --no-pager -n 20
  web1: systemctl status php-fpm
Consider checking: systemctl status php-fpm
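
To make the two pieces concrete, here is a rough sketch of how a read-only check and a past-session lookup could look. This is not the actual hook: the allowlist contents, the function names `is_read_only` and `best_hint`, the session dictionary layout, and the use of Jaccard similarity over command sets are all assumptions for illustration. The check is deliberately conservative, so something like `kubectl -n foo get pods` is rejected because the subcommand is not the second token.

```python
import shlex

# Hypothetical allowlist: tool -> subcommands treated as read-only.
# A real hook would need much broader coverage (ssh, awscli, gcloud, az).
READ_ONLY = {
    "kubectl": {"get", "describe", "logs", "top"},
    "systemctl": {"status", "is-active", "show"},
}

# Characters that let a shell chain, redirect, or substitute commands.
FORBIDDEN = set(";|&><`$\n")


def is_read_only(command: str) -> bool:
    """Return True only if the command matches the allowlist exactly."""
    if any(ch in FORBIDDEN for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if len(argv) < 2 or argv[0] not in READ_ONLY:
        return False
    return argv[1] in READ_ONLY[argv[0]]


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_hint(current_commands, past_sessions, threshold=0.5):
    """Return (session, score) for the most similar past session, or None.

    Assumes each session is a dict with a "commands" list; similarity is
    Jaccard overlap of the command sets (an assumption, not the real metric).
    """
    best, best_score = None, threshold
    for session in past_sessions:
        score = jaccard(set(current_commands), set(session["commands"]))
        if score >= best_score:
            best, best_score = session, score
    return (best, best_score) if best is not None else None
```

With a stored session containing the three commands from the example above, a new investigation that has run the first two would match with a score of 2/3, and the hint could then suggest the remaining command.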
Tools with memory seems like an interesting idea in its own right, but lmk what you think!