Often the damage is done by non-LLM calls: tool calls like sending email, adding records or files, placing orders, etc. Budget enforcement at the LLM layer won't work for those.
I built an open protocol plus a reference implementation that handles tool calls, LLM calls, or any other call: https://runcycles.io, open sourced under Apache 2.0
What it does:
- Set budgets at company, customer, and feature level
- SDK checks budget first and blocks requests that exceed limits
- Your app still calls OpenAI/Anthropic/etc directly (no proxy/gateway)
- Prompts and outputs go directly between your app and the AI API provider
- MarginDash only receives usage metadata (token counts)
- TypeScript and Python SDKs
Flow:
Checks the limit you set for the customer/feature -> sends the AI call if within that limit -> records the cost
I’d love feedback on any missing enforcement scope you’d need in production.
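To make the check -> call -> record flow concrete for a non-LLM tool call, here is a rough Python sketch. It is not the actual SDK API: `Budgets`, `BudgetExceeded`, `guarded`, and the scope names are all made up for illustration, amounts are in integer cents, and the in-memory ledger ignores concurrency and persistence.

```python
from dataclasses import dataclass, field
from typing import Callable, TypeVar

T = TypeVar("T")


class BudgetExceeded(Exception):
    """Raised with the name of the first scope whose limit would be exceeded."""


@dataclass
class Budgets:
    # Hypothetical in-memory ledger: scope name -> limit / spend, in cents.
    limits: dict[str, int]
    spent: dict[str, int] = field(default_factory=dict)

    def check(self, scopes: list[str], estimate: int) -> None:
        # Every scope that has a limit must have room for the estimated cost.
        for scope in scopes:
            limit = self.limits.get(scope)
            if limit is not None and self.spent.get(scope, 0) + estimate > limit:
                raise BudgetExceeded(scope)

    def record(self, scopes: list[str], cost: int) -> None:
        # Charge the cost against every scope, limited or not.
        for scope in scopes:
            self.spent[scope] = self.spent.get(scope, 0) + cost


def guarded(budgets: Budgets, scopes: list[str], cost: int,
            action: Callable[[], T]) -> T:
    """Check the budget, run the action only if allowed, then record the cost."""
    budgets.check(scopes, cost)   # blocks before any side effect happens
    result = action()             # e.g. send an email or place an order
    budgets.record(scopes, cost)
    return result
```

Usage would look like `guarded(budgets, ["company", "customer:acme", "feature:email"], 5, lambda: send_email(...))`, where `send_email` stands in for whatever tool call you want to gate. The point is that the check runs before the side effect, so a blocked call never sends anything.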