DEV Community

Albert Mavashev

I burned $153 in 30 minutes with an agent loop — here's the pattern that stopped it

The incident

Short version: my agent kept retrying a failing loop across multiple models and tools.

My agent:
GPT-4o + Stable Diffusion + TradingView Charts + Kling.
Hundreds of iterations. $153 in under 30 minutes.

My enforcement layer did not work

Rate limiters: control velocity, not total cost.
Provider caps: per-provider, not cross-provider.
Observability: tells me after the money is spent. I need a check before.
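To see why a rate limiter alone does not bound spend, here is a small back-of-the-envelope sketch. The rate and per-call cost are made-up illustrative numbers, not figures from the incident.

```python
# Illustrative numbers only (assumptions, not measurements from the incident).
CALLS_PER_MINUTE = 10   # hypothetical rate limit
COST_PER_CALL = 0.50    # hypothetical average cost per call, in dollars


def spend_under_rate_limit(minutes: int) -> float:
    """Worst-case spend for a loop that always runs at the rate limit."""
    return minutes * CALLS_PER_MINUTE * COST_PER_CALL


# The limiter holds the per-minute rate constant, but the total keeps growing
# for as long as the loop keeps running.
for minutes in (1, 10, 30):
    print(f"{minutes:>2} min -> ${spend_under_rate_limit(minutes):.2f}")
```

The limiter fixes the slope of the cost curve, not its ceiling. A budget check has to cap the total.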

The pattern that worked for me: reserve-run-commit

My budget control in 3 steps:

  1. Reserve estimated cost before the call (instrumented code)
  2. Execute the action (LLM call, tool call)
  3. Settle: on success, commit actual cost and release the unused portion; on failure, release the full reservation
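The lifecycle above can be sketched with a small in-memory ledger. This is a minimal illustration, not the Cycles implementation: `Budget`, `BudgetExceeded`, and `run_with_budget` are hypothetical names, and the action is assumed to return its own actual cost.

```python
class BudgetExceeded(Exception):
    """Raised when a reservation would exceed the remaining budget."""


class Budget:
    def __init__(self, total: int) -> None:
        self.remaining = total  # units neither reserved nor spent
        self.reserved = 0       # units locked by in-flight actions
        self.spent = 0          # units committed as actual cost

    def reserve(self, estimate: int) -> int:
        if estimate > self.remaining:
            raise BudgetExceeded(f"need {estimate}, have {self.remaining}")
        self.remaining -= estimate
        self.reserved += estimate
        return estimate

    def commit(self, reservation: int, actual: int) -> None:
        # Charge the actual cost and hand back the unused part of the reservation.
        # (If actual exceeds the reservation, the overrun comes out of remaining.)
        self.reserved -= reservation
        self.spent += actual
        self.remaining += reservation - actual

    def release(self, reservation: int) -> None:
        # Nothing was spent: return the whole reservation.
        self.reserved -= reservation
        self.remaining += reservation


def run_with_budget(budget: Budget, estimate: int, action):
    """Reserve, run `action`, then commit or release.

    `action` is assumed to return a tuple of (actual_cost, result).
    """
    reservation = budget.reserve(estimate)   # reserve before the call
    try:
        actual_cost, result = action()       # execute
    except Exception:
        budget.release(reservation)          # failure: release everything
        raise
    budget.commit(reservation, actual_cost)  # success: commit actual cost
    return result


budget = Budget(total=100)
answer = run_with_budget(budget, 40, lambda: (25, "ok"))
print(answer, budget.remaining, budget.spent)
```

In this sketch a failed action leaves `remaining` exactly where it was, so a crashing call does not eat into the budget.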

The critical piece for retries: each reservation has an idempotency key. If the agent retries the exact same action, the second reservation is a no-op. Budget only gets locked once per logical action, not once per attempt.
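A minimal sketch of the retry behavior, assuming the idempotency key is a hash of the action name and its payload. That derivation, and the names `idempotency_key` and `IdempotentReservations`, are assumptions for illustration; the post does not say how Cycles builds its keys.

```python
import hashlib


def idempotency_key(action_name: str, payload: str) -> str:
    """Hypothetical key: identical action + payload gives an identical key."""
    return hashlib.sha256(f"{action_name}\n{payload}".encode()).hexdigest()


class IdempotentReservations:
    def __init__(self, total: int) -> None:
        self.remaining = total
        self._by_key: dict[str, int] = {}  # idempotency key -> reserved amount

    def reserve(self, key: str, estimate: int) -> bool:
        """Return True if budget is locked for `key`, False if it does not fit."""
        if key in self._by_key:
            return True  # retry of the same logical action: no-op
        if estimate > self.remaining:
            return False
        self.remaining -= estimate
        self._by_key[key] = estimate
        return True


ledger = IdempotentReservations(total=100)
key = idempotency_key("openai:gpt-4o", "summarize this chart")

# Five attempts at the same logical action lock the budget only once.
for _ in range(5):
    ledger.reserve(key, 30)

print(ledger.remaining)
```

Without the key lookup, the same five attempts would have asked for 150 units against a budget of 100.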

The critical piece for concurrent agents: the reservation is atomic. Two agents can't both check the balance, both see enough, and both proceed. One gets through, the other gets blocked.
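A minimal single-process sketch of that atomicity, using a lock so the balance check and the deduction happen as one step. `AtomicBudget` is a hypothetical name; agents running in separate processes or on separate machines would need the same guarantee from a shared store, which is not shown here.

```python
import threading


class AtomicBudget:
    def __init__(self, total: int) -> None:
        self._remaining = total
        self._lock = threading.Lock()

    def try_reserve(self, estimate: int) -> bool:
        # Check and deduct under one lock, so no other thread can read the
        # balance between the check and the deduction.
        with self._lock:
            if estimate > self._remaining:
                return False
            self._remaining -= estimate
            return True


budget = AtomicBudget(total=100)
results = []


def agent(name: str) -> None:
    # Each agent wants 80 units; only one such reservation fits in 100.
    results.append((name, budget.try_reserve(80)))


threads = [threading.Thread(target=agent, args=(n,)) for n in ("agent-a", "agent-b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Which agent wins depends on scheduling, but exactly one reservation succeeds and the other is refused.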

What it looks like in code: the @cycles decorator

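import openai
from runcycles import cycles

# The decorator reserves the estimated cost before the call and settles it afterwards.
@cycles(estimate=5000, action_kind="llm.completion", action_name="openai:gpt-4o")
def ask(prompt: str) -> str:
    response = openai.chat.completions.create(...)  # arguments elided
    # Return the text of the first choice so the function matches its `-> str` hint.
    return response.choices[0].message.content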

The decorator handles the reserve-commit lifecycle. If the budget is gone before the call, it raises a BudgetExceededError and the call never fires. Nothing is billed.
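To illustrate how a decorator can enforce "the call never fires", here is a self-contained toy version. It is not the runcycles implementation: `budgeted`, the `ledger` dict, and the local `BudgetExceededError` class are stand-ins, and for simplicity the sketch treats the estimate as the actual cost.

```python
import functools


class BudgetExceededError(Exception):
    """Local stand-in for the error named above; not imported from runcycles."""


def budgeted(ledger: dict, estimate: int):
    """Hypothetical decorator: reserve before the call, release on failure."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if estimate > ledger["remaining"]:
                # Raised before fn runs, so the provider is never called.
                raise BudgetExceededError("budget exhausted; call skipped")
            ledger["remaining"] -= estimate      # reserve
            try:
                return fn(*args, **kwargs)       # success: estimate stays charged
            except Exception:
                ledger["remaining"] += estimate  # failure: release
                raise
        return inner
    return wrap


ledger = {"remaining": 8000}
calls = []  # records which prompts actually reached the wrapped function


@budgeted(ledger, estimate=5000)
def ask(prompt: str) -> str:
    calls.append(prompt)
    return f"echo: {prompt}"


ask("first")       # fits: 8000 -> 3000
try:
    ask("second")  # 5000 > 3000: raises before ask() runs
except BudgetExceededError:
    pass

print(calls, ledger["remaining"])
```

Only the first prompt reaches the wrapped function; the second is rejected at the budget check, which is the behavior the paragraph above describes.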

Works with any LLM provider: OpenAI, Anthropic, Bedrock, anything.

The demo

I built a runaway agent demo that shows the failure mode in under 60 seconds — same agent, same bug, two outcomes. No API key needed to run it.

Demo: https://github.com/runcycles/cycles-runaway-demo

What I built (Cycles Protocol + Reference implementation)

Full docs: https://runcycles.io
Self-hosted, Multi-language SDKs, Apache 2.0

What's your approach to agent cost control? Still rolling your own counters and limiters, nothing at all, or something else?
