Ask HN: Is there any tool that can stop LLM calls at runtime (not just monitor)?

I’ve been running into cases where LLM/agent systems make unexpected or repeated calls and costs spike quickly.

Most tools I’ve found focus on observability (logs, traces, dashboards), but not actual enforcement.

Is there anything that can:

- stop or cut off a call mid-execution (based on budget, tokens, or conditions)?
- enforce limits at runtime instead of just alerting after the fact?

Curious if people here are solving this in practice, or just handling it at the application level.
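
To make "handling it at the application level" concrete, here is a rough sketch of the kind of enforcement in question: a wrapper that counts tokens as a streamed response arrives and aborts once a budget is exceeded. Everything in it is a hypothetical illustration, not any real tool's API: `BudgetGuard`, `BudgetExceeded`, `fake_stream`, and `run_with_budget` are made-up names, and splitting on whitespace is only a crude stand-in for a real tokenizer.

```python
from typing import Iterable, Iterator, List, Tuple


class BudgetExceeded(RuntimeError):
    """Raised when the next chunk would push usage past the limit."""


class BudgetGuard:
    """Hypothetical runtime guard: tracks token usage and cuts a stream off."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def guard(self, stream: Iterable[str]) -> Iterator[str]:
        """Yield chunks from `stream` until the token budget would be exceeded."""
        for chunk in stream:
            cost = len(chunk.split())  # assumption: whitespace words ~ tokens
            if self.used + cost > self.max_tokens:
                # Stop consuming the stream here. With a real streaming HTTP
                # response, this is where the connection would be closed.
                raise BudgetExceeded(
                    f"budget of {self.max_tokens} tokens exceeded"
                )
            self.used += cost
            yield chunk


def fake_stream() -> Iterator[str]:
    """Stand-in for a streaming LLM response (not a real client)."""
    for chunk in ["one two", "three four", "five six", "seven eight"]:
        yield chunk


def run_with_budget(max_tokens: int) -> Tuple[List[str], bool, int]:
    """Consume the fake stream under a budget; report what got through."""
    guard = BudgetGuard(max_tokens)
    received: List[str] = []
    stopped = False
    try:
        for chunk in guard.guard(fake_stream()):
            received.append(chunk)
    except BudgetExceeded:
        stopped = True
    return received, stopped, guard.used
```

Sharing one guard instance across every call an agent makes would turn this into a per-run budget rather than a per-call one. The limitation is that it only works where the application controls the call site, which is exactly why a tool that enforces this outside the application would be interesting.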