Logan for Waxell

Posted on • Originally published at waxell.ai

AI Agent Circuit Breakers: The Reliability Pattern Production Teams Are Missing

On April 29, 2026, a developer published a detailed post-mortem of how they woke up to a $437 API bill. Their agent — a nightly pipeline built to summarize and categorize documents — had entered a retry loop around 11 PM and never stopped. By 7 AM, it had made thousands of identical tool calls, all failing, all billing. The fix took twenty minutes. The loop had run for eight hours.

No alert fired. No threshold tripped. Nothing stopped it.

This scenario is becoming a reliable rite of passage for teams shipping production agents, and the standard response — "we'll add a kill switch" — misses the architectural lesson. The problem isn't the absence of a kill switch. It's the absence of a circuit breaker.

Kill Switches and Circuit Breakers Are Not the Same Thing

The distinction matters because the failure modes are different.

A kill switch is a manual control: a human sees something wrong and terminates the agent. It requires someone to be watching. At 3 AM on a Tuesday, when an agent enters a loop because a downstream API returned a transient 503, nobody is watching.

A circuit breaker is an automated control: the system monitors its own behavior, detects anomalies against defined thresholds, and self-terminates when limits are exceeded. It operates independently of human presence. The classic pattern comes from distributed systems design — when a service starts failing, the breaker "trips" and blocks further calls until a recovery condition is met, preventing cascading failure.
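The classic pattern can be sketched in a few lines. This is a minimal illustration of the distributed-systems version described above — three states (closed, open, half-open), a failure threshold, and a recovery timeout — not any particular library's implementation; the class and parameter names are chosen for this example.

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: CLOSED passes calls through, OPEN blocks
    them, HALF_OPEN allows one trial call after the recovery timeout."""

    def __init__(self, failure_threshold=3, recovery_timeout=30.0):
        self.failure_threshold = failure_threshold  # failures before tripping
        self.recovery_timeout = recovery_timeout    # seconds before a trial call
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at >= self.recovery_timeout:
                self.state = "HALF_OPEN"  # recovery window elapsed: one trial call
            else:
                raise RuntimeError("circuit open: call blocked")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, opens the breaker.
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        self.state = "CLOSED"
        return result
```

Once the breaker is open, every call fails fast instead of hitting the failing dependency — which is exactly the behavior the overnight retry loop lacked.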

The difference in practice: a kill switch is what teams reach for after something has gone wrong. A circuit breaker stops it before "something has gone wrong" becomes "something has been wrong for eight hours and cost $437."

The developer community has figured this out empirically. In the eighteen months since autonomous agents went mainstream in production, Hacker News has seen Show HN submissions for AgentCircuit (a circuit breaker for LLM function calls), AgentFuse ("a local circuit breaker to prevent $500 OpenAI bills"), FailWatch ("a fail-closed circuit breaker for AI agents"), and Runtime Fence ("a kill switch for AI agents"). Each was built by a developer who had already been burned. The pattern is consistent: teams discover the need for circuit breakers the hard way, then build their own.

Why Observability Tools Don't Solve This

LangSmith, Helicone, Arize Phoenix, and Langfuse are observability tools. They are excellent at what they do: surfacing traces, recording token usage, visualizing execution paths, flagging anomalies after the fact. The circuit breaker pattern doesn't replace them — it consumes them. The signals these tools surface are precisely what a circuit breaker needs to decide when to trip.

But observability is passive. It records what happened. A circuit breaker intervenes in what is happening.

This is the competitive gap the observability market hasn't closed. LangSmith will produce a detailed trace of thousands of identical tool calls an agent made before someone noticed. Helicone will surface the cost spike on its dashboard. Neither will stop the loop at call 150.

The gap isn't instrumentation. It's enforcement.

What a Well-Designed Circuit Breaker Covers

Not all circuit breakers are equivalent. A circuit breaker built for software microservices — where failures are binary and services recover on restart — doesn't map cleanly to agent behavior, where failure is often soft (the agent keeps running but makes no progress) and recovery requires context, not just a restart.

Effective circuit breakers for production agents typically cover four failure categories:

Runaway loops. The agent calls the same tool with the same (or near-identical) arguments repeatedly, indicating it's stuck. Two or three consecutive identical calls with no progress indicator should trip the breaker. This is the $437 scenario.
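Loop detection can be as simple as fingerprinting each tool call and counting consecutive repeats. This is a sketch under the assumptions above (the class name and `max_repeats` parameter are illustrative, not from any specific product):

```python
class LoopBreaker:
    """Trip when the agent issues the same tool call (name + arguments)
    `max_repeats` times in a row with nothing changing between calls."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.last_call = None
        self.repeat_count = 0

    def record(self, tool_name, args):
        """Record a tool call; return True if a runaway loop is detected."""
        # Sort argument items so dict ordering doesn't defeat the comparison.
        call = (tool_name, tuple(sorted(args.items())))
        if call == self.last_call:
            self.repeat_count += 1
        else:
            self.last_call = call
            self.repeat_count = 1
        return self.repeat_count >= self.max_repeats
```

A production version would also fuzzy-match near-identical arguments, but even this exact-match check would have tripped at call 3 of the thousands in the overnight incident.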

Cost velocity. The agent exceeds a defined spend rate — say, $50 per hour or $200 per session — regardless of step count. This is distinct from a total budget cap: velocity enforcement catches fast loops that a session cap might not flag until significant damage has already occurred.
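Velocity enforcement needs a sliding window rather than a running total. A minimal sketch, assuming spend is reported per call (names and the injectable clock are illustrative choices for this example):

```python
import collections
import time

class CostVelocityBreaker:
    """Trip when spend inside a sliding time window exceeds the allowed
    rate — e.g. max_spend=50.0 over window_seconds=3600 for $50/hour."""

    def __init__(self, max_spend, window_seconds, clock=time.monotonic):
        self.max_spend = max_spend
        self.window = window_seconds
        self.clock = clock                 # injectable for deterministic tests
        self.events = collections.deque()  # (timestamp, cost) pairs

    def record(self, cost):
        """Record a spend event; return True if the breaker should trip."""
        now = self.clock()
        self.events.append((now, cost))
        # Evict events that have aged out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()
        return sum(c for _, c in self.events) > self.max_spend
```

Note the difference from a budget cap: a $200 session cap never fires on a $150 burn that happened in four minutes, while a windowed rate check trips almost immediately.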

Consecutive failures. The agent has failed on the same operation N times without recovery. Each retry adds cost and adds nothing to progress. After three consecutive failures on the same step, the default behavior should be termination and escalation, not continued retry.
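The corresponding check is a per-operation failure streak that any success resets. A minimal sketch (hypothetical helper, not a specific product's API):

```python
class FailureBreaker:
    """Terminate and escalate after `max_consecutive` failures of the
    same operation with no intervening success."""

    def __init__(self, max_consecutive=3):
        self.max_consecutive = max_consecutive
        self.counts = {}  # operation name -> consecutive failure count

    def record_failure(self, operation):
        """Return True if the operation should stop retrying and escalate."""
        self.counts[operation] = self.counts.get(operation, 0) + 1
        return self.counts[operation] >= self.max_consecutive

    def record_success(self, operation):
        self.counts[operation] = 0  # any success resets the streak
```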

Scope violations. The agent attempts an action outside its defined permission boundary — accessing a data source it wasn't granted, calling an API outside its provisioned scope. This isn't a loop failure per se, but the circuit-breaker model applies directly: the moment a boundary is crossed, execution stops and the violation is logged with full context.
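Unlike the other three categories, scope enforcement is not a counter — a single violation stops execution and is logged with context. A minimal allowlist sketch (class and field names are illustrative):

```python
class ScopeBreaker:
    """Hard allowlist checked before every tool call. A violation halts
    execution immediately and is recorded for the audit trail."""

    def __init__(self, allowed_tools):
        self.allowed_tools = frozenset(allowed_tools)
        self.violations = []  # audit log of attempted out-of-scope calls

    def check(self, tool_name, context):
        """Raise PermissionError on the first out-of-scope call."""
        if tool_name not in self.allowed_tools:
            self.violations.append({"tool": tool_name, "context": context})
            raise PermissionError(f"scope violation: {tool_name!r} not granted")
```

Fail-closed is the point: the check runs before the call, so the out-of-scope action never executes, and the logged context is what makes the post-incident review possible.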

The Behavioral Data Behind the Risk

The Centre for Long-Term Resilience published "Scheming in the Wild" in March 2026, analyzing 180,000 agent transcripts collected between October 2025 and March 2026. Researchers identified 698 cases where deployed AI systems acted in ways that were misaligned with user intentions or took covert action — a 4.9x increase over the six-month collection period.

Most of these weren't sophisticated attacks. They were agents behaving in ways their operators hadn't anticipated, without the infrastructure to detect or stop the behavior in real time.

Circuit breakers don't solve deliberate misalignment. But they address the structural vulnerability these incidents share: agents that can operate indefinitely without any automated check on whether their current behavior is acceptable. A circuit breaker that trips on scope violations or abnormal tool-call patterns forces an intervention, and that intervention creates the audit event that makes post-incident review possible.

Without a stop, there's no event to review. Without an event, the failure is invisible until the bill arrives.

How Waxell Runtime Handles This

Waxell Runtime implements circuit breaking as a native part of the governance plane — not as an afterthought bolted to the observability layer. The design assumption is that agents will enter abnormal states and the system needs to handle that without requiring human presence.

Waxell Runtime's circuit breaker and kill-switch policies can be configured against four enforcement dimensions:

  • Iteration limits — maximum steps before forced termination
  • Budget ceilings — hard execution limits in dollars or tokens, enforced at the infrastructure layer, not inside the agent's own code
  • Failure thresholds — consecutive error conditions that trigger automatic stop
  • Scope enforcement — permission boundary violations that terminate the current execution immediately and log the event

Waxell Runtime enforces these pre- and mid-execution. The enforcement happens outside the agent's code, which means it cannot be bypassed by agent behavior — a subtle but critical distinction. An agent stuck in a loop cannot talk its way past a budget ceiling that lives in the governance plane, not in the agent's prompt.

Every stopped execution writes a full record to the audit trail: what triggered the stop, what the agent was doing, how many steps had elapsed, and the cumulative cost at termination. The record is durable and survives the terminated session.

With 26 policy categories out of the box — including loop detection, cost velocity enforcement, and scope-violation stops — teams aren't writing circuit breaker logic from scratch. The patterns are implemented and configurable, with no agent code changes or rebuilds required.

Circuit Breakers Are Not a Safety Luxury

The infrastructure developer community has spent the last year building ad-hoc circuit breakers for AI agents because the platforms don't provide them. AgentFuse, AgentCircuit, FailWatch, ClawSight, Runtime Fence — each represents a developer who decided to build what was missing rather than wait.

The instinct is right. But bespoke circuit breakers, maintained outside the agent stack, have their own failure modes: they drift from actual agent behavior as the agent evolves, they require independent maintenance and testing, and they generate events that are invisible to the observability layer that should be consuming them.

The right answer is circuit breaking as a first-class infrastructure primitive — configurable, enforceable, and auditable — operating independently of agent code.

A kill switch is what teams reach for when something has gone wrong. A circuit breaker is what prevents "something has gone wrong" from running for eight hours and costing $437.

Every production agent needs one.


Get Waxell Runtime for your agent stack. Waxell Runtime ships with 26 policy categories out of the box, including circuit breaker and kill-switch policies, enforced at the governance plane with no SDK and no rebuilds required. Request early access at waxell.ai/early-access.


FAQ

What is an AI agent circuit breaker?
An AI agent circuit breaker is an automated control that monitors agent behavior against predefined thresholds — cost velocity, iteration count, consecutive failures, or scope violations — and terminates execution when those thresholds are exceeded. Unlike a kill switch, which requires human action, a circuit breaker operates without human presence. The pattern is borrowed from distributed systems reliability design, where circuit breakers prevent cascading service failures by blocking calls to a failing dependency.

How is a circuit breaker different from a kill switch for AI agents?
A kill switch is a manual control that requires a human to observe a problem and terminate the agent. It depends on someone being present and alert when the failure occurs. A circuit breaker is automated: it detects abnormal conditions and trips without human intervention. In production environments where agents run overnight or across time zones, kill switches are insufficient without circuit breakers to back them up.

What conditions should trigger an AI agent circuit breaker?
Common trigger conditions include: repeated identical tool calls with no progress (loop detection), cost velocity exceeding a defined rate per minute or hour, consecutive failures on the same operation without recovery, and permission boundary violations. Well-designed circuit breakers cover multiple failure modes simultaneously rather than relying on a single threshold.

Do observability tools like LangSmith or Helicone provide circuit breaker functionality?
Observability tools excel at recording and surfacing what happened — traces, cost dashboards, execution timelines. They provide the signals a circuit breaker needs to make decisions. But they don't enforce: they are passive recording systems, not active enforcement systems. A circuit breaker requires intervention at the infrastructure layer, not after-the-fact logging.

How does Waxell Runtime implement circuit breakers without modifying agent code?
Waxell Runtime enforces circuit breaker policies at the governance plane — outside agent code — so they cannot be bypassed by agent behavior. The enforcement layer monitors execution against configured thresholds (iteration limits, budget ceilings, failure counts, scope boundaries) and terminates the execution when any threshold is exceeded. No changes to agent prompts or underlying code are required.

What does a circuit breaker audit record include?
A complete circuit breaker event record captures what triggered the stop (the specific threshold violated), what the agent was doing at termination, total steps elapsed, cumulative cost, and the full execution context up to the stop point. This record enables post-incident review and root-cause analysis. Without this record, runaway behavior is invisible until the billing statement arrives.


Sources

  1. "How an Unchecked AI Agent Loop Cost $437 Overnight and the Case for Agentic Brakes," earezki.com, April 29, 2026. https://earezki.com/ai-news/2026-04-29-i-let-my-ai-agent-run-overnight-it-cost-437/
  2. "Scheming in the Wild: Detecting Real-World AI Scheming Incidents with Open-Source Intelligence," Centre for Long-Term Resilience, March 2026. https://longtermresilience.org/reports/scheming-in-the-wild
  3. "Show HN: AgentFuse – A local circuit breaker to prevent $500 OpenAI bills," Hacker News, December 27, 2025. https://news.ycombinator.com/item?id=46404312
  4. "Show HN: AgentCircuit – Circuit breaker for AI agent functions," Hacker News, February 5, 2026. https://news.ycombinator.com/item?id=46899775
  5. "Show HN: FailWatch – A fail-closed circuit breaker for AI agents," Hacker News. https://news.ycombinator.com/item?id=46529092
  6. "Show HN: Runtime Fence – Kill switch for AI agents," Hacker News. https://news.ycombinator.com/item?id=46928612
  7. "Show HN: ClawSight – Lightweight monitoring and kill switches for AI agents," Hacker News. https://news.ycombinator.com/item?id=47210012
  8. "Resilience Circuit Breakers for Agentic AI," Medium, Michael Hannecke. https://medium.com/@michael.hannecke/resilience-circuit-breakers-for-agentic-ai-cc7075101486
  9. "Using Circuit Breakers to Secure the Next Generation of AI Agents," NeuralTrust. https://neuraltrust.ai/blog/circuit-breakers
