I built this after reading too many incident reports of agent loops spending $200 in 4 minutes because a quality threshold was never met.

The pattern is always the same: an agent retries, fans out, or loops. Each iteration passes individual rate-limit checks. Observability fires an alert after the money is gone. Provider caps are per-provider, not cross-provider. None of these stop the spend before it happens.

RunCycles takes a different approach: reserve budget before the call, commit actual spend after, release the remainder if the work is cancelled. The reservation is atomic across all affected budget scopes (tenant, workspace, agent), using Redis Lua scripts so concurrent agents sharing a budget can't collectively overrun it.

The integration surface is small:
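To make the reserve / commit / release flow concrete, here is a rough in-memory sketch in Python. It is not the RunCycles client API: `InMemoryLedger`, `BudgetExceeded`, the scope names, and the amounts in cents are all hypothetical, and a `threading.Lock` stands in for the Redis Lua script that makes the multi-scope check atomic on the real server.

```python
import threading
import uuid


class BudgetExceeded(Exception):
    """Hypothetical stand-in for the server's 409 BUDGET_EXCEEDED response."""
    status = 409
    code = "BUDGET_EXCEEDED"


class InMemoryLedger:
    """Toy ledger; the lock plays the role of the atomic Lua script."""

    def __init__(self, limits):
        self.remaining = dict(limits)  # scope -> remaining budget, in cents
        self.reservations = {}         # reservation id -> (scopes, amount)
        self._lock = threading.Lock()

    def reserve(self, scopes, amount):
        with self._lock:
            # Check every scope first so a failure leaves no partial holds.
            for scope in scopes:
                if self.remaining[scope] < amount:
                    raise BudgetExceeded(scope)
            for scope in scopes:
                self.remaining[scope] -= amount
            rid = str(uuid.uuid4())
            self.reservations[rid] = (tuple(scopes), amount)
            return rid

    def commit(self, rid, actual):
        # Assumes actual <= reserved; overage policies are out of scope here.
        with self._lock:
            scopes, reserved = self.reservations.pop(rid)
            for scope in scopes:
                self.remaining[scope] += reserved - actual

    def release(self, rid):
        # Cancelled work: return the whole hold to every scope.
        with self._lock:
            scopes, reserved = self.reservations.pop(rid)
            for scope in scopes:
                self.remaining[scope] += reserved


ledger = InMemoryLedger({"tenant:acme": 1000, "agent:summarizer": 300})
scopes = ["tenant:acme", "agent:summarizer"]

rid = ledger.reserve(scopes, 200)  # hold 200 before the model call
ledger.commit(rid, 120)            # actual cost was 120; 80 goes back

rid = ledger.reserve(scopes, 150)
ledger.release(rid)                # work cancelled; full hold returned

try:
    ledger.reserve(scopes, 250)    # agent scope has only 180 left
    blocked_scope = None
except BudgetExceeded as exc:
    blocked_scope = exc.args[0]    # the downstream call is never made
```

The point of checking all scopes before decrementing any of them is that a rejected reservation leaves every budget untouched, which is what lets concurrent agents share a tenant budget without overrunning it.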
When budget is exhausted, the next reservation attempt gets a 409 BUDGET_EXCEEDED before the downstream call is made.

The architecture is three pieces:

- Cycles Protocol: an open OpenAPI spec defining the reservation lifecycle, idempotency semantics, scope hierarchy, and overage policies.
- RunCycles Server: Spring Boot + Redis, implements the spec. Runs in Docker.
- Clients: Python, TypeScript, Java/Spring Boot.

The hardest part was idempotency under retries: if a commit fails transiently and retries with the same key, it should get the original response back, not double-charge. The Lua scripts handle this atomically.

What it's not: a billing system, observability dashboard, or agent framework. It's the layer that decides whether an action may proceed before it proceeds.

Org: https://github.com/runcycles
Docs: https://runcycles.github.io/docs
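For anyone curious about the idempotent-commit behavior mentioned above, here is a minimal sketch of the idea in Python. Everything in it is hypothetical (`IdempotentCommits`, the response fields, the key format), and it is not how the server is written: there the lookup and the charge happen inside one Lua script, while here a lock gives the same all-or-nothing effect in a single process.

```python
import threading


class IdempotentCommits:
    """Toy commit handler that replays the first response for a repeated key."""

    def __init__(self):
        self.spent = 0
        self._responses = {}  # idempotency key -> first response
        self._lock = threading.Lock()

    def commit(self, key, amount):
        with self._lock:
            # Retry with a known key: return the stored response, charge nothing.
            if key in self._responses:
                return self._responses[key]
            self.spent += amount
            response = {"status": "COMMITTED", "charged": amount}
            self._responses[key] = response
            return response


commits = IdempotentCommits()
first = commits.commit("commit-abc", 120)
retry = commits.commit("commit-abc", 120)  # e.g. the client timed out and retried
other = commits.commit("commit-def", 30)   # a different key is charged normally
```

Because the key check and the charge sit inside the same critical section, a retry can never slip in between them and be charged twice.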