<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Albert Mavashev</title>
    <description>The latest articles on DEV Community by Albert Mavashev (@amavashev).</description>
    <link>https://dev.to/amavashev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3831698%2Fe24cea68-fba8-440b-864e-7760e556464e.png</url>
      <title>DEV Community: Albert Mavashev</title>
      <link>https://dev.to/amavashev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/amavashev"/>
    <language>en</language>
    <item>
      <title>AI Agents Are Cross-Cutting. Your Controls Aren't.</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Sun, 12 Apr 2026 01:28:26 +0000</pubDate>
      <link>https://dev.to/amavashev/ai-agents-are-cross-cutting-your-controls-arent-4c28</link>
      <guid>https://dev.to/amavashev/ai-agents-are-cross-cutting-your-controls-arent-4c28</guid>
      <description>&lt;p&gt;A pattern keeps showing up in production agent systems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;the agent is cross-cutting, but the controls usually are not.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One agent run can touch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multiple model providers&lt;/li&gt;
&lt;li&gt;multiple tools&lt;/li&gt;
&lt;li&gt;multiple tenants&lt;/li&gt;
&lt;li&gt;multiple workers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But the controls teams usually start with still live inside one slice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;provider spending caps&lt;/li&gt;
&lt;li&gt;observability tools&lt;/li&gt;
&lt;li&gt;framework loop limits&lt;/li&gt;
&lt;li&gt;some Redis counter somebody wrote in an afternoon&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those controls are useful. They solve real problems.&lt;/p&gt;

&lt;p&gt;They just do not answer the actual runtime question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can this agent, for this customer, on this worker, take the next action right now?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is not a missing feature in one product.&lt;/p&gt;

&lt;p&gt;It is a &lt;strong&gt;layer problem&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shape of the problem
&lt;/h2&gt;

&lt;p&gt;Take a normal SaaS setup.&lt;/p&gt;

&lt;p&gt;You have an agent feature serving many customers. A single run might use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI for reasoning&lt;/li&gt;
&lt;li&gt;Anthropic for long context&lt;/li&gt;
&lt;li&gt;a cheaper model for embeddings&lt;/li&gt;
&lt;li&gt;a search API&lt;/li&gt;
&lt;li&gt;a CRM client&lt;/li&gt;
&lt;li&gt;an outbound mailer&lt;/li&gt;
&lt;li&gt;a payment API&lt;/li&gt;
&lt;li&gt;a vector store&lt;/li&gt;
&lt;li&gt;a web fetcher&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And it does not all happen in one process. It runs on stateless workers, with retries, fan-out, and concurrent jobs hitting shared budgets.&lt;/p&gt;

&lt;p&gt;Now ask a simple question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where do we put the budget cap?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is where things get messy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the obvious controls fall short
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Provider spending caps
&lt;/h3&gt;

&lt;p&gt;Provider caps are useful safety nets.&lt;/p&gt;

&lt;p&gt;They can stop a catastrophic monthly bill on that provider.&lt;/p&gt;

&lt;p&gt;But they only see &lt;strong&gt;their own billing surface&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;An OpenAI cap does not know:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what you spent on Anthropic&lt;/li&gt;
&lt;li&gt;what the search API billed&lt;/li&gt;
&lt;li&gt;which customer the run belonged to&lt;/li&gt;
&lt;li&gt;whether this is one runaway workflow or normal traffic across fifty tenants&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Provider caps govern the provider’s exposure to your account.&lt;/p&gt;

&lt;p&gt;They do not govern your application’s full runtime surface.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Observability tools
&lt;/h3&gt;

&lt;p&gt;Observability tools are also useful.&lt;/p&gt;

&lt;p&gt;They help you answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what happened&lt;/li&gt;
&lt;li&gt;where cost went&lt;/li&gt;
&lt;li&gt;which step was slow&lt;/li&gt;
&lt;li&gt;which tool was called&lt;/li&gt;
&lt;li&gt;where the run failed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But they answer those questions &lt;strong&gt;after the action already happened&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That is the constraint.&lt;/p&gt;

&lt;p&gt;A trace can tell you an agent burned through budget over the weekend. It cannot stop the bad call at iteration 50 when the damage was still small.&lt;/p&gt;

&lt;p&gt;Alerts reduce reaction time.&lt;/p&gt;

&lt;p&gt;They do not create a pre-execution decision point.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Framework limits
&lt;/h3&gt;

&lt;p&gt;Frameworks often provide things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;max iterations&lt;/li&gt;
&lt;li&gt;max execution time&lt;/li&gt;
&lt;li&gt;step limits&lt;/li&gt;
&lt;li&gt;tool-call ceilings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those are sensible defaults.&lt;/p&gt;

&lt;p&gt;But they are still &lt;strong&gt;local&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;They usually apply to one orchestrator instance. Not to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multiple workers&lt;/li&gt;
&lt;li&gt;multiple retries&lt;/li&gt;
&lt;li&gt;multiple runs hitting the same budget&lt;/li&gt;
&lt;li&gt;multi-tenant shared infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They also speak in loop counts, not in money or blast radius.&lt;/p&gt;

&lt;p&gt;A loop limit does not know whether the next step costs $0.001 or $4.00.&lt;br&gt;
It does not know whether the action is a harmless read or a customer-facing mutation.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. DIY counters
&lt;/h3&gt;

&lt;p&gt;This is the classic path.&lt;/p&gt;

&lt;p&gt;A team writes a Redis-backed counter. Increment per call. Check before request. Done.&lt;/p&gt;

&lt;p&gt;It works for a sprint.&lt;/p&gt;

&lt;p&gt;Then reality shows up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TOCTOU races under concurrency&lt;/li&gt;
&lt;li&gt;drift across providers&lt;/li&gt;
&lt;li&gt;retries that double-count or under-count&lt;/li&gt;
&lt;li&gt;counters lost on worker crashes&lt;/li&gt;
&lt;li&gt;awkward logic around estimate vs actual&lt;/li&gt;
&lt;li&gt;tenant isolation problems&lt;/li&gt;
&lt;li&gt;a growing pile of exceptions and patches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What started as “just a counter” turns into a distributed transactional system.&lt;/p&gt;

&lt;p&gt;That is usually the moment people realize they are not building a helper anymore. They are building a control plane.&lt;/p&gt;
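
&lt;p&gt;The first item in that list is easy to reproduce. The sketch below interleaves two workers by hand rather than using real concurrency; the &lt;code&gt;NaiveCounter&lt;/code&gt; class, its method names, and the amounts are hypothetical and stand in for whatever shared counter a team might write.&lt;/p&gt;

```typescript
// Hypothetical in-memory stand-in for a shared counter (for example a Redis key).
class NaiveCounter {
  private limit: number;
  private spent = 0;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Step 1 of the naive pattern: read the remaining budget.
  remaining(): number {
    return this.limit - this.spent;
  }

  // Step 2 of the naive pattern: record the spend, with no re-check.
  add(amount: number): void {
    this.spent += amount;
  }

  total(): number {
    return this.spent;
  }
}

// Simulate two workers that both check before either one records its spend.
function simulateRace(limit: number, cost: number): number {
  const counter = new NaiveCounter(limit);
  const workerASeesEnough = counter.remaining() >= cost; // worker A checks
  const workerBSeesEnough = counter.remaining() >= cost; // worker B checks
  if (workerASeesEnough) counter.add(cost); // worker A acts
  if (workerBSeesEnough) counter.add(cost); // worker B acts on a stale read
  return counter.total();
}

const overspent = simulateRace(10, 8);
```

&lt;p&gt;With a limit of 10 and a cost of 8, both checks pass and the counter ends at 16. Closing that gap means the check and the hold have to happen as one atomic step in the shared store.&lt;/p&gt;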

&lt;h2&gt;
  
  
  The structural mismatch
&lt;/h2&gt;

&lt;p&gt;All of those tools share the same limit.&lt;/p&gt;

&lt;p&gt;They govern &lt;strong&gt;themselves&lt;/strong&gt;, not the full agent.&lt;/p&gt;

&lt;p&gt;That is the mismatch.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Control&lt;/th&gt;
&lt;th&gt;What it governs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Provider cap&lt;/td&gt;
&lt;td&gt;One provider’s billing surface&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;A read path into events and traces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Framework limit&lt;/td&gt;
&lt;td&gt;One orchestrator instance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DIY counter&lt;/td&gt;
&lt;td&gt;One local or semi-shared budget view&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;An agent, on the other hand, spans all of them.&lt;/p&gt;

&lt;p&gt;So the control layer has to span the same surface the agent spans.&lt;/p&gt;

&lt;p&gt;Anything inside only one dimension is a partial view.&lt;/p&gt;

&lt;p&gt;And a partial view is not governance.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the missing layer actually needs
&lt;/h2&gt;

&lt;p&gt;Once you accept that the control layer has to live &lt;strong&gt;outside&lt;/strong&gt; any one provider, tool, framework, or worker, the requirements become pretty clear.&lt;/p&gt;

&lt;h3&gt;
  
  
  External authority
&lt;/h3&gt;

&lt;p&gt;The decision point has to live outside any one runtime slice.&lt;/p&gt;

&lt;p&gt;Otherwise it will always be blind to the rest of the system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Atomic distributed reservations
&lt;/h3&gt;

&lt;p&gt;Two workers cannot both think the same remaining budget is available.&lt;/p&gt;

&lt;p&gt;The concurrency problem has to be solved at the control layer itself, not patched afterward.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hierarchical scope
&lt;/h3&gt;

&lt;p&gt;The same primitive should work across:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tenant&lt;/li&gt;
&lt;li&gt;workspace&lt;/li&gt;
&lt;li&gt;app&lt;/li&gt;
&lt;li&gt;workflow&lt;/li&gt;
&lt;li&gt;agent&lt;/li&gt;
&lt;li&gt;toolset&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is how you answer both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how much can this customer spend?&lt;/li&gt;
&lt;li&gt;how much can this single run spend?&lt;/li&gt;
&lt;/ul&gt;
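
&lt;p&gt;One way to picture this is a single check that walks a path of scopes from widest to narrowest. This is a minimal sketch under assumed names: &lt;code&gt;ScopeBudget&lt;/code&gt;, &lt;code&gt;firstExhaustedScope&lt;/code&gt;, and the scope labels are illustrative, and a real control layer would hold these numbers in shared storage.&lt;/p&gt;

```typescript
// Hypothetical budget record for one level of the scope hierarchy.
type ScopeBudget = { scope: string; limit: number; spent: number };

// A spend is allowed only if every level in the path can absorb it.
// Returns the first scope that would be exceeded, or null if all levels fit.
function firstExhaustedScope(levels: ScopeBudget[], amount: number): string | null {
  for (const level of levels) {
    if (level.spent + amount > level.limit) return level.scope;
  }
  return null;
}

// Ordered from widest (customer) to narrowest (single run).
const scopePath: ScopeBudget[] = [
  { scope: "tenant:acme", limit: 100, spent: 40 },
  { scope: "workflow:onboarding", limit: 20, spent: 5 },
  { scope: "run:42", limit: 2, spent: 1.5 },
];

const smallCall = firstExhaustedScope(scopePath, 0.25); // fits at every level
const largeCall = firstExhaustedScope(scopePath, 1); // the run-level cap is hit first
```

&lt;p&gt;The customer-level and run-level questions are answered by the same function; only the position in the path differs.&lt;/p&gt;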

&lt;h3&gt;
  
  
  Reserve, commit, release
&lt;/h3&gt;

&lt;p&gt;You need to hold budget &lt;strong&gt;before&lt;/strong&gt; the action runs, then settle with the actual amount after.&lt;/p&gt;

&lt;p&gt;Otherwise you end up with one of two bad choices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;optimistic execution with bad enforcement&lt;/li&gt;
&lt;li&gt;conservative blocking with permanently stranded budget&lt;/li&gt;
&lt;/ul&gt;
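
&lt;p&gt;The lifecycle can be sketched with a small in-memory ledger. The &lt;code&gt;BudgetLedger&lt;/code&gt; class and its method names are assumptions made for illustration, not the API of any particular product, and a production version would need the atomic, shared storage described earlier.&lt;/p&gt;

```typescript
// Hypothetical in-memory ledger; a real one would live in a shared, atomic store.
class BudgetLedger {
  private limit: number;
  private committed = 0;
  private held: { [id: string]: number } = {};
  private nextId = 1;

  constructor(limit: number) {
    this.limit = limit;
  }

  private heldTotal(): number {
    let total = 0;
    for (const id in this.held) total += this.held[id];
    return total;
  }

  // Hold an estimate before the action runs; null means the budget cannot cover it.
  reserve(estimate: number): string | null {
    if (this.committed + this.heldTotal() + estimate > this.limit) return null;
    const id = `r${this.nextId++}`;
    this.held[id] = estimate;
    return id;
  }

  // Settle with the actual amount; any unused part of the hold is freed.
  commit(id: string, actual: number): void {
    if (!(id in this.held)) throw new Error(`unknown reservation ${id}`);
    delete this.held[id];
    this.committed += actual;
  }

  // Drop the hold without charging anything, for example when the action failed.
  release(id: string): void {
    delete this.held[id];
  }

  available(): number {
    return this.limit - this.committed - this.heldTotal();
  }
}

const ledger = new BudgetLedger(10);
const firstHold = ledger.reserve(6); // holds 6, leaving 4 available
const secondHold = ledger.reserve(6); // denied: only 4 is available
if (firstHold !== null) ledger.commit(firstHold, 2.5); // actual cost was lower
const afterCommit = ledger.available();
```

&lt;p&gt;The second reservation is refused while the first is in flight, and committing the smaller actual amount returns the unused part of the hold, so nothing stays stranded.&lt;/p&gt;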

&lt;h3&gt;
  
  
  More than binary allow/deny
&lt;/h3&gt;

&lt;p&gt;Real systems need graceful degradation.&lt;/p&gt;

&lt;p&gt;Sometimes the right answer is not just ALLOW or DENY.&lt;/p&gt;

&lt;p&gt;Sometimes it is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;allow, but cap tools&lt;/li&gt;
&lt;li&gt;allow, but downgrade the model&lt;/li&gt;
&lt;li&gt;allow, but reduce context&lt;/li&gt;
&lt;li&gt;allow, but disable optional steps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That requires a real decision layer, not just a counter.&lt;/p&gt;
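
&lt;p&gt;A decision with more than two outcomes might be modeled as a tagged union. This is only a sketch: the &lt;code&gt;Decision&lt;/code&gt; and &lt;code&gt;Caps&lt;/code&gt; shapes, the 20% threshold, and the cap values are all hypothetical choices made for the example.&lt;/p&gt;

```typescript
// Hypothetical constraints a decision can attach to an allowed action.
type Caps = { maxTokens?: number; model?: string; deniedTools?: string[] };

// Hypothetical decision shape: a verdict plus optional constraints.
type Decision =
  | { kind: "allow" }
  | { kind: "allow_with_caps"; caps: Caps }
  | { kind: "deny"; reason: string };

// Illustrative policy: deny when the estimate does not fit, degrade when
// less than 20% of the budget remains, otherwise allow without constraints.
function decide(remaining: number, limit: number, estimate: number): Decision {
  if (estimate > remaining) {
    return { kind: "deny", reason: "budget exhausted" };
  }
  const lowBudget = limit * 0.2 > remaining;
  if (lowBudget) {
    return {
      kind: "allow_with_caps",
      caps: { maxTokens: 512, model: "small-model", deniedTools: ["send_email"] },
    };
  }
  return { kind: "allow" };
}

const healthy = decide(50, 100, 1);
const degraded = decide(10, 100, 1);
const refused = decide(0.5, 100, 1);
```

&lt;p&gt;The caller switches on &lt;code&gt;kind&lt;/code&gt; and applies whatever caps come back, which keeps the degradation policy in the decision layer rather than scattered through agent code.&lt;/p&gt;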

&lt;h3&gt;
  
  
  Provider- and framework-agnostic design
&lt;/h3&gt;

&lt;p&gt;The control primitive cannot care whether the next action is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an OpenAI call&lt;/li&gt;
&lt;li&gt;an Anthropic call&lt;/li&gt;
&lt;li&gt;a Stripe charge&lt;/li&gt;
&lt;li&gt;an outbound email&lt;/li&gt;
&lt;li&gt;a search query&lt;/li&gt;
&lt;li&gt;a database write&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the agent spans all of them, governance has to as well.&lt;/p&gt;

&lt;h2&gt;
  
  
  This is a layer, not a feature
&lt;/h2&gt;

&lt;p&gt;It is tempting to think the gap closes with one more feature release.&lt;/p&gt;

&lt;p&gt;Maybe observability tools add enforcement.&lt;br&gt;
Maybe providers add richer caps.&lt;br&gt;
Maybe frameworks add distributed counters.&lt;/p&gt;

&lt;p&gt;Possible? Sure.&lt;/p&gt;

&lt;p&gt;But that would turn each of those tools into a different category of system.&lt;/p&gt;

&lt;p&gt;This is why I think the missing piece is not another feature inside one runtime component.&lt;/p&gt;

&lt;p&gt;It is an &lt;strong&gt;external, cross-cutting authority layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Keep the provider cap.&lt;br&gt;
Keep the observability stack.&lt;br&gt;
Keep the framework guardrails.&lt;br&gt;
Keep the local circuit breakers.&lt;/p&gt;

&lt;p&gt;But add the layer that answers the one question none of those can answer on their own:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can this agent, for this customer, on this worker, take the next action right now?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is the control question that actually matters in production.&lt;/p&gt;

&lt;p&gt;And it is why agent governance has to be cross-cutting.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Using RAII to Add Budget and Action Guardrails to Rust AI Agents</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Tue, 31 Mar 2026 18:35:47 +0000</pubDate>
      <link>https://dev.to/amavashev/using-raii-to-add-budget-and-action-guardrails-to-rust-ai-agent-1fii</link>
      <guid>https://dev.to/amavashev/using-raii-to-add-budget-and-action-guardrails-to-rust-ai-agent-1fii</guid>
      <description>&lt;p&gt;Rust is a strong fit for agent runtimes, but until now it has largely lacked a first-class layer for enforcing budgets and action limits at runtime.&lt;/p&gt;

&lt;p&gt;We built &lt;code&gt;cycles&lt;/code&gt; to add pre-execution budget and action control to Rust agents with an API that leans into ownership and compile-time safety.&lt;/p&gt;

&lt;p&gt;The key ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;commit(self)&lt;/code&gt; consumes the guard, so double-commit becomes a compile-time error&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;#[must_use]&lt;/code&gt; helps catch ignored reservations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Drop&lt;/code&gt; triggers best-effort release if a guard is never finalized&lt;/li&gt;
&lt;li&gt;one lifecycle works across simple calls, streaming, and multi-agent workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The guide walks through three integration levels:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;with_cycles()&lt;/code&gt; for simple LLM/tool calls&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ReservationGuard&lt;/code&gt; for streaming and manual commit&lt;/li&gt;
&lt;li&gt;low-level client methods for custom lifecycles&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It also covers caps-aware execution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;max token caps&lt;/li&gt;
&lt;li&gt;tool allow/deny lists&lt;/li&gt;
&lt;li&gt;step limits&lt;/li&gt;
&lt;li&gt;cooldowns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And a practical rollout path:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;start in &lt;code&gt;dry_run&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;inspect decisions and caps&lt;/li&gt;
&lt;li&gt;move to partial enforcement&lt;/li&gt;
&lt;li&gt;then hard limits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full guide: &lt;a href="https://runcycles.io/blog/how-to-add-budget-and-action-guardrails-to-rust-ai-agents-with-cycles" rel="noopener noreferrer"&gt;https://runcycles.io/blog/how-to-add-budget-and-action-guardrails-to-rust-ai-agents-with-cycles&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Docs: &lt;a href="https://runcycles.io/quickstart/getting-started-with-the-rust-client" rel="noopener noreferrer"&gt;https://runcycles.io/quickstart/getting-started-with-the-rust-client&lt;/a&gt;&lt;br&gt;&lt;br&gt;
Repo: &lt;a href="https://github.com/runcycles/cycles-client-rust" rel="noopener noreferrer"&gt;https://github.com/runcycles/cycles-client-rust&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rust</category>
      <category>agents</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Five Lessons from Building a Production OpenClaw Plugin</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Mon, 30 Mar 2026 14:22:15 +0000</pubDate>
      <link>https://dev.to/amavashev/five-lessons-from-building-a-production-openclaw-plugin-2e3o</link>
      <guid>https://dev.to/amavashev/five-lessons-from-building-a-production-openclaw-plugin-2e3o</guid>
      <description>&lt;p&gt;We built a non-trivial &lt;a href="https://github.com/runcycles/cycles-openclaw-budget-guard" rel="noopener noreferrer"&gt;budget enforcement plugin&lt;/a&gt; for OpenClaw and ran into several behaviors that were not obvious from the public plugin surface: missing model metadata, no clean way to block model calls, install-time config validation traps, and a security-scanner false positive. The most surprising discovery: OpenClaw's &lt;code&gt;before_model_resolve&lt;/code&gt; hook has no way to prevent a model call, so we had to redirect to a fake model name to force a provider-side rejection.&lt;/p&gt;

&lt;p&gt;This post is a practical writeup of the five issues that mattered most, the workarounds we shipped, and the feature requests we filed.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;None of this is a complaint about OpenClaw. The platform is well-designed and the hook lifecycle is the right abstraction. These are field notes from building a production plugin, shared so other developers don't have to rediscover the same things.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 1: The model name isn't in the model resolve event
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;before_model_resolve&lt;/code&gt; hook is called before the LLM provider is invoked. You'd expect the event to include which model is being resolved. It doesn't.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// What we expected&lt;/span&gt;
&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;BeforeModelResolveEvent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// What OpenClaw actually passes&lt;/span&gt;
&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;BeforeModelResolveEvent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// that's it&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We discovered this by logging &lt;code&gt;Object.keys(event)&lt;/code&gt; — which returned &lt;code&gt;["prompt"]&lt;/code&gt;. No &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;modelId&lt;/code&gt;, &lt;code&gt;modelName&lt;/code&gt;, &lt;code&gt;model_id&lt;/code&gt;, or any variant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt; Our plugin needs the model name to look up per-model cost estimates, apply fallback chains (Opus → Sonnet → Haiku), and track per-model spend in the session summary. Without it, budget enforcement for models is blind.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; We added a &lt;code&gt;defaultModelName&lt;/code&gt; config property and a multi-source auto-detection chain that checks &lt;code&gt;api.config&lt;/code&gt;, &lt;code&gt;api.pluginConfig&lt;/code&gt;, and several nested paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;eventModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;
  &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;modelId&lt;/span&gt;
  &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;modelName&lt;/span&gt;
  &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;
  &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;defaultModelName&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If none of those resolve, the plugin logs the available keys at info level so operators can configure &lt;code&gt;defaultModelName&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;before_model_resolve: cannot determine model name.
Event keys: [prompt]. Metadata keys: [].
Set defaultModelName in plugin config.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Feature request:&lt;/strong&gt; &lt;a href="https://github.com/openclaw/openclaw/issues/55771" rel="noopener noreferrer"&gt;openclaw/openclaw#55771&lt;/a&gt; — include &lt;code&gt;model&lt;/code&gt; and &lt;code&gt;provider&lt;/code&gt; in the &lt;code&gt;before_model_resolve&lt;/code&gt; event.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 2: You can't cleanly block a model call
&lt;/h2&gt;

&lt;p&gt;OpenClaw's &lt;code&gt;before_tool_call&lt;/code&gt; hook has clean blocking semantics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Tool hooks support this — works perfectly&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;blockReason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Budget exhausted&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;before_model_resolve&lt;/code&gt; hook has no equivalent. The return type only supports &lt;code&gt;{ modelOverride?, providerOverride? }&lt;/code&gt;. There is no &lt;code&gt;block&lt;/code&gt; field and no &lt;code&gt;shouldStop&lt;/code&gt; policy in the hook runner.&lt;/p&gt;

&lt;p&gt;When our plugin throws &lt;code&gt;BudgetExhaustedError&lt;/code&gt;, OpenClaw catches it (the default &lt;code&gt;catchErrors: true&lt;/code&gt; behavior), logs "handler failed," and proceeds with the model call. The agent gets a response. Budget enforcement is bypassed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; We redirect to a non-existent model. When budget is exhausted, the plugin returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;modelOverride&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;__cycles_budget_exhausted__&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenClaw passes this to the LLM provider, which rejects it (&lt;code&gt;model not found&lt;/code&gt;) before any generation happens, so the agent produces no response. The user sees:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;⚠ Agent failed before reply: Unknown model: openai/__cycles_budget_exhausted__
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not pretty, but the budget is enforced. The model call costs nothing because the provider never executes it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Feature request:&lt;/strong&gt; We've asked for &lt;code&gt;block&lt;/code&gt; support in &lt;code&gt;before_model_resolve&lt;/code&gt;, matching the &lt;code&gt;before_tool_call&lt;/code&gt; pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 3: Your plugin initializes multiple times
&lt;/h2&gt;

&lt;p&gt;A smaller but confusing runtime behavior: OpenClaw calls the plugin's default export once per internal channel or worker — typically 4–5 times on startup. Each instance gets its own isolated state, which is correct for concurrency. But our startup banner printed 5 times and it looked broken.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; A module-level &lt;code&gt;startupBannerShown&lt;/code&gt; flag shows the full config banner once; subsequent inits get a one-liner with a sequential instance counter: &lt;code&gt;Cycles Budget Guard initialized (tenant=cyclist, dryRun=false, instance=3)&lt;/code&gt;.&lt;/p&gt;
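
&lt;p&gt;A minimal sketch of that flag-plus-counter pattern follows. It assumes module-level variables are shared by every initialization in the same process; &lt;code&gt;startupLines&lt;/code&gt;, &lt;code&gt;BannerConfig&lt;/code&gt;, and the banner text are placeholders, not the plugin's actual code.&lt;/p&gt;

```typescript
// Module-level state: assumed to be shared across all inits in one process.
let startupBannerShown = false;
let instanceCounter = 0;

// Hypothetical config shape; only the fields that appear in the log line.
type BannerConfig = { tenant: string; dryRun: boolean };

// Returns the lines to log for one plugin initialization.
function startupLines(config: BannerConfig): string[] {
  instanceCounter += 1;
  const summary = `tenant=${config.tenant}, dryRun=${config.dryRun}, instance=${instanceCounter}`;
  const oneLiner = `Cycles Budget Guard initialized (${summary})`;
  if (!startupBannerShown) {
    startupBannerShown = true;
    // Placeholder for the full config banner, printed only on the first init.
    return ["=== Cycles Budget Guard ===", oneLiner];
  }
  return [oneLiner];
}
```

&lt;p&gt;Each init would pass the returned lines to its logger, so only the first one prints the banner and later ones stay on a single line.&lt;/p&gt;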

&lt;h2&gt;
  
  
  Lesson 4: process.env triggers a security warning
&lt;/h2&gt;

&lt;p&gt;OpenClaw's plugin installer scans the bundled &lt;code&gt;dist/index.js&lt;/code&gt; for dangerous code patterns. Our plugin read &lt;code&gt;process.env.CYCLES_API_KEY&lt;/code&gt; as a config fallback, and the same bundle contained &lt;code&gt;fetch()&lt;/code&gt; calls for webhook delivery and OTLP metrics.&lt;/p&gt;

&lt;p&gt;The scanner flagged this combination:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WARNING: Plugin "openclaw-budget-guard" contains dangerous code patterns:
Environment variable access combined with network send — possible
credential harvesting
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a false positive — we read the API key to authenticate with the Cycles server, not to exfiltrate it. But users see "dangerous code patterns" during &lt;code&gt;openclaw plugins install&lt;/code&gt; and understandably hesitate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; We removed all &lt;code&gt;process.env&lt;/code&gt; access from the plugin. Both &lt;code&gt;cyclesBaseUrl&lt;/code&gt; and &lt;code&gt;cyclesApiKey&lt;/code&gt; are now required in the plugin config. For secrets management, we document OpenClaw's built-in env var interpolation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cyclesBaseUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${CYCLES_BASE_URL}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cyclesApiKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${CYCLES_API_KEY}"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenClaw resolves &lt;code&gt;${...}&lt;/code&gt; before passing config to the plugin, so the env var access happens in OpenClaw's trusted code — not in the scanned plugin bundle.&lt;/p&gt;

&lt;p&gt;Verification: &lt;code&gt;grep -c process.env dist/index.js&lt;/code&gt; returns &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 5: The plugin contract has undocumented rules
&lt;/h2&gt;

&lt;p&gt;Several behaviors of the OpenClaw plugin system are not documented but are critical to get right:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;api.pluginConfig&lt;/code&gt; vs &lt;code&gt;api.config&lt;/code&gt;:&lt;/strong&gt; Your plugin config is on &lt;code&gt;api.pluginConfig&lt;/code&gt; (from &lt;code&gt;plugins.entries.&amp;lt;id&amp;gt;.config&lt;/code&gt; in &lt;code&gt;openclaw.json&lt;/code&gt;). We initially read &lt;code&gt;api.config&lt;/code&gt; — which is the &lt;em&gt;full system config&lt;/em&gt; — and couldn't figure out why our settings were always undefined.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Manifest &lt;code&gt;id&lt;/code&gt; derivation:&lt;/strong&gt; The &lt;code&gt;id&lt;/code&gt; field in &lt;code&gt;openclaw.plugin.json&lt;/code&gt; must match what OpenClaw derives from the npm package name. For &lt;code&gt;@runcycles/openclaw-budget-guard&lt;/code&gt;, OpenClaw strips the scope and gets &lt;code&gt;openclaw-budget-guard&lt;/code&gt;. Our manifest originally said &lt;code&gt;cycles-openclaw-budget-guard&lt;/code&gt;, which produced a mismatch warning on every load.&lt;/p&gt;
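
&lt;p&gt;A small pre-publish check can catch this mismatch early. The sketch below assumes the derivation is simply "drop the npm scope", which matches the behavior we observed but is not OpenClaw's actual implementation; both function names are hypothetical.&lt;/p&gt;

```typescript
// Assumption: the derived id is the package name with any "@scope/" prefix removed.
function deriveIdFromPackageName(packageName: string): string {
  const slash = packageName.indexOf("/");
  if (packageName.charAt(0) === "@") {
    if (slash > 0) return packageName.slice(slash + 1);
  }
  return packageName;
}

// Returns a warning message when the manifest id would not match, else null.
function checkManifestId(manifestId: string, packageName: string): string | null {
  const expected = deriveIdFromPackageName(packageName);
  if (manifestId === expected) return null;
  return `manifest id "${manifestId}" does not match derived id "${expected}"`;
}

const matching = checkManifestId("openclaw-budget-guard", "@runcycles/openclaw-budget-guard");
const mismatched = checkManifestId("cycles-openclaw-budget-guard", "@runcycles/openclaw-budget-guard");
```

&lt;p&gt;Running a check like this in CI against &lt;code&gt;package.json&lt;/code&gt; and &lt;code&gt;openclaw.plugin.json&lt;/code&gt; would surface the warning before users see it.&lt;/p&gt;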

&lt;p&gt;&lt;strong&gt;Config validation timing:&lt;/strong&gt; If your &lt;code&gt;configSchema&lt;/code&gt; includes &lt;code&gt;required&lt;/code&gt; fields, OpenClaw validates during &lt;code&gt;openclaw plugins install&lt;/code&gt;, before the user has written any config. We had &lt;code&gt;required: ["tenant"]&lt;/code&gt;, which crashed the install. Fix: remove &lt;code&gt;required&lt;/code&gt; from the schema and validate at runtime in your &lt;code&gt;resolveConfig()&lt;/code&gt;.&lt;/p&gt;
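
&lt;p&gt;Runtime validation of this kind could look roughly like the sketch below. The &lt;code&gt;GuardConfig&lt;/code&gt; shape is an assumption built from the field names mentioned in this post, and the error wording is illustrative; it is not the plugin's actual &lt;code&gt;resolveConfig()&lt;/code&gt;.&lt;/p&gt;

```typescript
// Hypothetical resolved shape; field names follow the ones mentioned in this post.
type GuardConfig = {
  tenant: string;
  cyclesBaseUrl: string;
  cyclesApiKey: string;
  dryRun: boolean;
};

// Validate at runtime instead of via `required` in the schema, so an empty
// config at install time yields one readable error listing every missing field.
function resolveConfig(raw: { [key: string]: unknown } | undefined): GuardConfig {
  const source = raw ?? {};
  const missing: string[] = [];

  // Reads a non-empty string field, recording the key if it is absent.
  const readString = (key: string): string => {
    const value = source[key];
    if (typeof value === "string") {
      if (value.length > 0) return value;
    }
    missing.push(key);
    return "";
  };

  const resolved: GuardConfig = {
    tenant: readString("tenant"),
    cyclesBaseUrl: readString("cyclesBaseUrl"),
    cyclesApiKey: readString("cyclesApiKey"),
    dryRun: source["dryRun"] === true,
  };

  if (missing.length > 0) {
    throw new Error(`missing required config: ${missing.join(", ")}`);
  }
  return resolved;
}
```

&lt;p&gt;Because the function throws a single descriptive error, the caller can catch it, log the message, and skip registration instead of failing the install.&lt;/p&gt;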

&lt;p&gt;&lt;strong&gt;Install-time loading:&lt;/strong&gt; OpenClaw loads and executes the plugin during install to inspect it. If your plugin throws on missing config, the install fails with a confusing error. Wrap your initialization in try/catch and log a friendly message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;resolveConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[openclaw-budget-guard] Skipping registration: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What OpenClaw gets right
&lt;/h2&gt;

&lt;p&gt;This post focuses on rough edges, but the foundation is solid:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The 5-hook lifecycle is well-designed.&lt;/strong&gt; &lt;code&gt;before_model_resolve&lt;/code&gt; → &lt;code&gt;before_prompt_build&lt;/code&gt; → &lt;code&gt;before_tool_call&lt;/code&gt; → &lt;code&gt;after_tool_call&lt;/code&gt; → &lt;code&gt;agent_end&lt;/code&gt; covers the full agent execution lifecycle. You can build meaningful enforcement without modifying agent code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;before_tool_call&lt;/code&gt; blocking is clean.&lt;/strong&gt; &lt;code&gt;{ block: true, blockReason }&lt;/code&gt; with &lt;code&gt;shouldStop&lt;/code&gt; is exactly the right pattern. We just want the same for model calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugin isolation per channel is correct.&lt;/strong&gt; Each channel gets its own plugin instance with its own state. No shared-state bugs across concurrent sessions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;api.logger&lt;/code&gt; integration works well.&lt;/strong&gt; Plugin log output appears in OpenClaw's log stream with proper prefixes and levels.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The install/enable flow is simple.&lt;/strong&gt; &lt;code&gt;openclaw plugins install&lt;/code&gt; + &lt;code&gt;openclaw plugins enable&lt;/code&gt; — two commands and you're running.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What we'd like to see
&lt;/h2&gt;

&lt;p&gt;These are filed or planned feature requests:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;block&lt;/code&gt; support in &lt;code&gt;before_model_resolve&lt;/code&gt;&lt;/strong&gt; — same pattern as &lt;code&gt;before_tool_call&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model name in &lt;code&gt;before_model_resolve&lt;/code&gt; event&lt;/strong&gt; — &lt;code&gt;event.model&lt;/code&gt; and &lt;code&gt;event.provider&lt;/code&gt; (&lt;a href="https://github.com/openclaw/openclaw/issues/55771" rel="noopener noreferrer"&gt;#55771&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;after_model_call&lt;/code&gt; hook&lt;/strong&gt; — with &lt;code&gt;tokensInput&lt;/code&gt;, &lt;code&gt;tokensOutput&lt;/code&gt;, &lt;code&gt;latencyMs&lt;/code&gt; for actual cost tracking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Channel/worker ID on the &lt;code&gt;api&lt;/code&gt; object&lt;/strong&gt; — so plugins can differentiate instances in logs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugin contract documentation&lt;/strong&gt; — &lt;code&gt;api.pluginConfig&lt;/code&gt; vs &lt;code&gt;api.config&lt;/code&gt;, manifest &lt;code&gt;id&lt;/code&gt; rules, config validation timing, install-time behavior&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Build your own
&lt;/h2&gt;

&lt;p&gt;If you're building an OpenClaw plugin, start with our source as a reference: &lt;a href="https://github.com/runcycles/cycles-openclaw-budget-guard" rel="noopener noreferrer"&gt;github.com/runcycles/cycles-openclaw-budget-guard&lt;/a&gt;. The patterns for config resolution, hook registration, state management, and error handling are all used in our released plugin.&lt;/p&gt;

&lt;p&gt;Full integration guide: &lt;a href="https://runcycles.io/how-to/integrating-cycles-with-openclaw"&gt;Integrating Cycles with OpenClaw&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>typescript</category>
      <category>openclaw</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Why 200 OK Is the Most Dangerous Response in Agent Production</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Thu, 26 Mar 2026 15:57:38 +0000</pubDate>
      <link>https://dev.to/amavashev/why-200-ok-is-the-most-dangerous-response-in-agent-production-4ao0</link>
      <guid>https://dev.to/amavashev/why-200-ok-is-the-most-dangerous-response-in-agent-production-4ao0</guid>
      <description>&lt;p&gt;The scary failures are not always the ones that crash.&lt;/p&gt;

&lt;p&gt;Sometimes everything looks fine.&lt;/p&gt;

&lt;p&gt;The API returns &lt;code&gt;200 OK&lt;/code&gt;.&lt;br&gt;&lt;br&gt;
The logs are clean.&lt;br&gt;&lt;br&gt;
The workflow completes.&lt;br&gt;&lt;br&gt;
No alert fires.&lt;/p&gt;

&lt;p&gt;And the result is wrong.&lt;/p&gt;

&lt;p&gt;That is a much worse failure mode than a timeout or a hard error, because nothing tells you to go look. The system says success. The output just quietly drifts away from reality.&lt;/p&gt;

&lt;p&gt;This failure mode shows up more often in agent systems than in conventional software.&lt;/p&gt;

&lt;p&gt;A normal service usually fails loudly. Bad input throws an exception. A downstream service times out. A database call returns an error. Something breaks in a way people know how to detect.&lt;/p&gt;

&lt;p&gt;Agents can fail differently.&lt;/p&gt;

&lt;p&gt;They can keep going.&lt;/p&gt;

&lt;p&gt;They can produce something that looks plausible, structured, and complete while being based on the wrong state, the wrong tool result, or the wrong interpretation of the task.&lt;/p&gt;

&lt;p&gt;That is where &lt;code&gt;200 OK&lt;/code&gt; gets dangerous.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The tool call "worked"
&lt;/h3&gt;

&lt;p&gt;An agent is supposed to pull data from a system and summarize it.&lt;/p&gt;

&lt;p&gt;The request goes through. The workflow finishes. The output looks polished.&lt;/p&gt;

&lt;p&gt;But the underlying tool response was incomplete, malformed, or misunderstood, and the agent filled in the gaps with something that sounded reasonable.&lt;/p&gt;

&lt;p&gt;No crash.&lt;br&gt;&lt;br&gt;
No red light.&lt;br&gt;&lt;br&gt;
Just bad output wrapped in a success path.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The coding agent fixed the wrong thing
&lt;/h3&gt;

&lt;p&gt;A coding agent gets asked to make the test suite green.&lt;/p&gt;

&lt;p&gt;It does.&lt;/p&gt;

&lt;p&gt;CI passes. Everyone moves on.&lt;/p&gt;

&lt;p&gt;Later someone realizes the agent did not actually fix the bug. It changed the tests to match the broken behavior.&lt;/p&gt;

&lt;p&gt;Again: success on paper, failure in reality.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The workflow lost state in the middle
&lt;/h3&gt;

&lt;p&gt;One agent gathers context. Another agent is supposed to use it.&lt;/p&gt;

&lt;p&gt;Somewhere in the handoff, part of the state gets dropped. Not enough to crash. Just enough to make the next decision wrong.&lt;/p&gt;

&lt;p&gt;The rest of the pipeline still runs. The final report still gets produced. It just happens to be built on partial data.&lt;/p&gt;

&lt;p&gt;That is the pattern: wrong result, valid-looking execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why monitoring does not solve this
&lt;/h2&gt;

&lt;p&gt;The default reaction is usually: we need better observability.&lt;/p&gt;

&lt;p&gt;Observability absolutely matters. Traces, logs, dashboards, metrics, all useful.&lt;/p&gt;

&lt;p&gt;But they mostly tell you what happened &lt;strong&gt;after&lt;/strong&gt; the system already acted.&lt;/p&gt;

&lt;p&gt;That helps with debugging.&lt;/p&gt;

&lt;p&gt;It does not help much when the system keeps doing the wrong thing while still looking healthy.&lt;/p&gt;

&lt;p&gt;A dashboard is great at showing crashes, latency spikes, and budget overruns.&lt;/p&gt;

&lt;p&gt;It is much worse at telling you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;this agent used the wrong tool output&lt;/li&gt;
&lt;li&gt;this handoff lost key context&lt;/li&gt;
&lt;li&gt;this branch decision was wrong but still valid enough to continue&lt;/li&gt;
&lt;li&gt;this run should have stopped three steps ago&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core problem is not missing charts.&lt;/p&gt;

&lt;p&gt;It is missing checkpoints.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is actually missing
&lt;/h2&gt;

&lt;p&gt;Most agent stacks have a gap between:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;the agent deciding to do something&lt;/li&gt;
&lt;li&gt;the action actually happening&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In a lot of systems, that gap is basically empty.&lt;/p&gt;

&lt;p&gt;The agent reasons, chooses, acts, and reports success in one uninterrupted flow.&lt;/p&gt;

&lt;p&gt;If the reasoning is wrong, the action still happens.&lt;/p&gt;

&lt;p&gt;That is why silent failures spread so easily. There is no mandatory pause where the system asks:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Should this step be allowed to proceed?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not "did it return 200?"&lt;/p&gt;

&lt;p&gt;Not "did the code throw?"&lt;/p&gt;

&lt;p&gt;A different question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does this step still make sense under the current budget, policy, and expected shape of the run?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That checkpoint matters more than another dashboard.&lt;/p&gt;

&lt;h2&gt;
  
  
  A better pattern
&lt;/h2&gt;

&lt;p&gt;The safer pattern looks more like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;decide -&amp;gt; check -&amp;gt; act -&amp;gt; record&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;decide -&amp;gt; act -&amp;gt; maybe notice later&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That checkpoint can be simple.&lt;/p&gt;

&lt;p&gt;Before a model call, tool invocation, file write, or external side effect, force the run through a control point.&lt;/p&gt;

&lt;p&gt;At that point, you can ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;is this action expected here?&lt;/li&gt;
&lt;li&gt;is the run still within budget?&lt;/li&gt;
&lt;li&gt;is this tool allowed?&lt;/li&gt;
&lt;li&gt;does the cost pattern still look normal?&lt;/li&gt;
&lt;li&gt;is the agent starting to loop or fan out unexpectedly?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This will not catch every semantic mistake.&lt;/p&gt;

&lt;p&gt;But it will catch a lot of structural ones, which is already a big improvement over "everything returned 200 so I guess we're fine."&lt;/p&gt;
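
&lt;p&gt;As a rough illustration, the check step can be a small function that every action has to pass through before it runs. The sketch below is a minimal in-process version. The names (&lt;code&gt;Checkpoint&lt;/code&gt;, &lt;code&gt;Action&lt;/code&gt;, &lt;code&gt;max_steps&lt;/code&gt;) and the cost unit are assumptions made up for this example, not taken from any particular framework:&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    tool: str            # hypothetical tool name, e.g. "search"
    estimated_cost: int  # estimated cost in cents (assumed unit)


@dataclass
class Checkpoint:
    budget_cents: int
    allowed_tools: set
    max_steps: int
    spent_cents: int = 0
    steps: int = 0
    log: list = field(default_factory=list)

    def check(self, action: Action) -> tuple:
        """Return (allowed, reason) without performing the action."""
        if action.tool not in self.allowed_tools:
            return False, "tool not allowed"
        if self.steps >= self.max_steps:
            return False, "step cap reached"
        if self.spent_cents + action.estimated_cost > self.budget_cents:
            return False, "budget exceeded"
        return True, "ok"

    def run(self, action: Action, execute):
        """decide -> check -> act -> record for a single action."""
        allowed, reason = self.check(action)
        if not allowed:
            # Record the refusal; the action itself never executes.
            self.log.append((action.tool, "blocked", reason))
            return None
        result = execute(action)
        self.steps += 1
        self.spent_cents += action.estimated_cost
        self.log.append((action.tool, "done", reason))
        return result


cp = Checkpoint(budget_cents=100, allowed_tools={"search"}, max_steps=2)
print(cp.run(Action("search", 40), lambda a: "results"))
print(cp.run(Action("send_email", 1), lambda a: "sent"))
print(cp.log)
```

&lt;p&gt;The second call is refused because &lt;code&gt;send_email&lt;/code&gt; is not on the allow-list, so the function passed in never runs. A real control point would keep this state outside the agent process, but the shape of the question is the same.&lt;/p&gt;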

&lt;h2&gt;
  
  
  Why this matters in production
&lt;/h2&gt;

&lt;p&gt;Silent failures are expensive because they compound.&lt;/p&gt;

&lt;p&gt;A crash stops the workflow.&lt;/p&gt;

&lt;p&gt;A silent failure keeps feeding bad state into later steps.&lt;/p&gt;

&lt;p&gt;One wrong tool result becomes a wrong decision.&lt;br&gt;&lt;br&gt;
That wrong decision becomes a wrong action.&lt;br&gt;&lt;br&gt;
That wrong action becomes a wrong report, bad write, or misleading recommendation.&lt;/p&gt;

&lt;p&gt;And by the time someone notices, the original step is buried.&lt;/p&gt;

&lt;p&gt;That is why the cleanest run is not always the safest run.&lt;/p&gt;

&lt;p&gt;In agent systems, a green dashboard can be lying to you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I would do first
&lt;/h2&gt;

&lt;p&gt;If you are running agents in production, I would start here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Identify one workflow where a wrong answer actually matters.&lt;/li&gt;
&lt;li&gt;Add a mandatory checkpoint before each costly or risky action.&lt;/li&gt;
&lt;li&gt;Record what the step was supposed to do and what actually happened.&lt;/li&gt;
&lt;li&gt;Put a hard cap on how far one run is allowed to go.&lt;/li&gt;
&lt;li&gt;Look for runs that are "successful" but economically or behaviorally weird.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last one matters.&lt;/p&gt;

&lt;p&gt;Wrong runs often have a shape.&lt;/p&gt;

&lt;p&gt;They loop.&lt;br&gt;
They fan out.&lt;br&gt;
They use the wrong tool.&lt;br&gt;
They cost too little for the work they claim to have done.&lt;br&gt;
Or they cost too much for what should have been simple.&lt;/p&gt;

&lt;p&gt;Those signals are often more useful than waiting for an exception that never comes.&lt;/p&gt;
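
&lt;p&gt;A shape check like this does not need to be sophisticated to be useful. The following is a minimal sketch that flags runs which finished without errors but still look odd. The &lt;code&gt;RunSummary&lt;/code&gt; fields and every threshold are assumptions chosen for illustration, and would need tuning per workflow:&lt;/p&gt;

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RunSummary:
    tool_calls: list      # ordered tool names used by the run
    cost_cents: int       # what the run actually cost
    expected_cost: tuple  # (low, high) band for this kind of task


def shape_flags(run: RunSummary, max_repeats: int = 3, max_calls: int = 20) -> list:
    """Return the reasons a run that reported success still looks odd."""
    flags = []
    counts = Counter(run.tool_calls)
    # The same tool called many times is a cheap proxy for looping.
    if counts and max(counts.values()) > max_repeats:
        flags.append("possible loop")
    # Far more calls than expected suggests unplanned fan-out.
    if len(run.tool_calls) > max_calls:
        flags.append("fan-out")
    low, high = run.expected_cost
    if low > run.cost_cents:
        flags.append("suspiciously cheap")
    if run.cost_cents > high:
        flags.append("too expensive")
    return flags


run = RunSummary(tool_calls=["search"] * 5, cost_cents=2, expected_cost=(10, 200))
print(shape_flags(run))
```

&lt;p&gt;None of these flags prove a run is wrong. They only mark which "successful" runs deserve a second look.&lt;/p&gt;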

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;The most dangerous response in agent production is not &lt;code&gt;500&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It is &lt;code&gt;200 OK&lt;/code&gt; attached to the wrong result.&lt;/p&gt;

&lt;p&gt;That is the failure mode that slips through monitoring, avoids alerts, and reaches users looking completely normal.&lt;/p&gt;

&lt;p&gt;Loud failures are annoying.&lt;/p&gt;

&lt;p&gt;Silent ones are how you lose trust in the system.&lt;/p&gt;

&lt;p&gt;Original post: &lt;a href="https://runcycles.io/blog/ai-agent-silent-failures-why-200-ok-is-the-most-dangerous-response" rel="noopener noreferrer"&gt;AI Agent Silent Failures: Why 200 OK Is the Most Dangerous Response&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Project: &lt;a href="https://runcycles.io" rel="noopener noreferrer"&gt;runcycles.io&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>devops</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Your AI Agent Budget Check Has a Race Condition</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Wed, 25 Mar 2026 15:48:17 +0000</pubDate>
      <link>https://dev.to/amavashev/your-ai-agent-budget-check-has-a-race-condition-33ei</link>
      <guid>https://dev.to/amavashev/your-ai-agent-budget-check-has-a-race-condition-33ei</guid>
      <description>&lt;p&gt;When I first started putting budget limits around agent workflows, I thought the solution would be simple.&lt;/p&gt;

&lt;p&gt;Track the spend.&lt;br&gt;&lt;br&gt;
Check what is left.&lt;br&gt;&lt;br&gt;
Stop the next call if the budget is gone.&lt;/p&gt;

&lt;p&gt;That works in a demo. It even works in light testing.&lt;/p&gt;

&lt;p&gt;Then you run the same workflow with concurrency, retries, or a restart in the middle, and the whole thing gets shaky.&lt;/p&gt;

&lt;p&gt;The problem is not the math.&lt;br&gt;&lt;br&gt;
The problem is where the decision gets made.&lt;/p&gt;
&lt;h2&gt;
  
  
  The naive version
&lt;/h2&gt;

&lt;p&gt;A lot of first implementations look roughly like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;estimated_cost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_remaining_budget&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;estimated_cost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;budget exceeded&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;llm_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;actual_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;calculate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;record_spend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual_cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At first glance, this seems fine.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check the remaining budget&lt;/li&gt;
&lt;li&gt;Make the call&lt;/li&gt;
&lt;li&gt;Record the spend&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a single worker, single process, no retries, no failures, it mostly works.&lt;/p&gt;

&lt;p&gt;Production is not that environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it breaks
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Concurrency
&lt;/h3&gt;

&lt;p&gt;Say you have $5 left.&lt;/p&gt;

&lt;p&gt;Now 10 workers all check the budget at about the same time.&lt;/p&gt;

&lt;p&gt;They all read the same value.&lt;br&gt;&lt;br&gt;
They all think they have room.&lt;br&gt;&lt;br&gt;
They all proceed.&lt;/p&gt;

&lt;p&gt;You did not have one bug. You had ten correct reads and one broken design.&lt;/p&gt;

&lt;p&gt;That is a classic time-of-check to time-of-use (TOCTOU) problem.&lt;/p&gt;
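
&lt;p&gt;The interleaving is easy to reproduce without any threads. The sketch below replays the scenario deterministically: every worker runs its check first, then every approved worker records its spend. The $1 cost per call is an assumption added for the example:&lt;/p&gt;

```python
# Shared state: $5 left, stored in cents. In production this would live in a
# database or cache that every worker reads from.
balance_cents = 500
COST_CENTS = 100  # assumed cost of one call
WORKERS = 10

# Phase 1: every worker performs its check before any worker records spend.
snapshots = [balance_cents for _ in range(WORKERS)]
approved = [seen >= COST_CENTS for seen in snapshots]

# Phase 2: every approved worker makes its call and records the spend.
for ok in approved:
    if ok:
        balance_cents -= COST_CENTS

print(sum(approved), "calls approved")
print("balance after:", balance_cents, "cents")
```

&lt;p&gt;All ten checks pass, and the balance ends at -500 cents. Each individual check was correct against the value it read. The overspend comes entirely from the gap between reading and writing.&lt;/p&gt;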
&lt;h3&gt;
  
  
  2. Retries
&lt;/h3&gt;

&lt;p&gt;Now add retry logic.&lt;/p&gt;

&lt;p&gt;Maybe the model call times out.&lt;br&gt;&lt;br&gt;
Maybe the network flakes.&lt;br&gt;&lt;br&gt;
Maybe your framework retries automatically and your application retries too.&lt;/p&gt;

&lt;p&gt;Did the first attempt spend money?&lt;br&gt;&lt;br&gt;
Did the second one?&lt;br&gt;&lt;br&gt;
Did both get recorded?&lt;br&gt;&lt;br&gt;
Did neither?&lt;/p&gt;

&lt;p&gt;If your budget tracking is tied to local control flow, retries turn accounting into guesswork.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Restarts and crashes
&lt;/h3&gt;

&lt;p&gt;If the process dies after the model call but before &lt;code&gt;record_spend()&lt;/code&gt;, your state is wrong.&lt;/p&gt;

&lt;p&gt;The money is gone.&lt;br&gt;&lt;br&gt;
Your counter says it is not.&lt;/p&gt;

&lt;p&gt;If you keep the budget in memory, a restart makes it even worse. The counter resets. The spend does not.&lt;/p&gt;
&lt;h2&gt;
  
  
  The real issue
&lt;/h2&gt;

&lt;p&gt;A budget check inside application code is not an authority.&lt;/p&gt;

&lt;p&gt;It is a hint.&lt;/p&gt;

&lt;p&gt;It is only as correct as the current process, the current thread, and the current execution path. Once multiple workers share the same budget, you need the decision to happen in one place, atomically.&lt;/p&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;check the budget&lt;/li&gt;
&lt;li&gt;reserve the amount&lt;/li&gt;
&lt;li&gt;make the call&lt;/li&gt;
&lt;li&gt;reconcile the actual cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;read the budget&lt;/li&gt;
&lt;li&gt;hope nothing else changes&lt;/li&gt;
&lt;li&gt;make the call&lt;/li&gt;
&lt;li&gt;update later&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those are not the same thing.&lt;/p&gt;
&lt;h2&gt;
  
  
  A better pattern: reserve, execute, commit
&lt;/h2&gt;

&lt;p&gt;The simplest durable shape I found was this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;reservation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;reserve_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenant/acme/workflow/summarizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;estimated_cost&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;run_step_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;llm_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;actual_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;calculate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;commit_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;actual_cost&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;release_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That changes the semantics in a useful way.&lt;/p&gt;

&lt;p&gt;Before the model call happens, the budget is already spoken for.&lt;/p&gt;

&lt;p&gt;A concurrent worker cannot grab the same dollars.&lt;br&gt;&lt;br&gt;
A retry can reuse the same idempotency key.&lt;br&gt;&lt;br&gt;
A failure can release what was reserved but not spent.&lt;/p&gt;

&lt;p&gt;The important part is not the API shape.&lt;br&gt;&lt;br&gt;
The important part is that the reservation is atomic.&lt;/p&gt;
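
&lt;p&gt;To make "atomic" concrete, here is a minimal single-process sketch of what could sit behind a call like &lt;code&gt;reserve_budget&lt;/code&gt;. It is an illustration under stated assumptions, not a description of how Cycles works internally: the &lt;code&gt;BudgetLedger&lt;/code&gt; class is hypothetical, state lives in memory, and a lock stands in for whatever atomic primitive a shared store would provide:&lt;/p&gt;

```python
import threading
import uuid
from dataclasses import dataclass


@dataclass
class Reservation:
    id: str
    key: str     # idempotency key that created this hold
    amount: int  # amount held, in cents (assumed unit)


class BudgetLedger:
    """In-memory stand-in for a budget authority (single process only)."""

    def __init__(self, limit: int):
        self._lock = threading.Lock()
        self._available = limit
        self._open = {}    # reservation id -> Reservation
        self._by_key = {}  # idempotency key -> Reservation

    def reserve(self, amount: int, idempotency_key: str) -> Reservation:
        # Check and deduct happen under one lock, so two callers can never
        # both claim the same remaining budget.
        with self._lock:
            if idempotency_key in self._by_key:
                return self._by_key[idempotency_key]  # retry reuses the hold
            if amount > self._available:
                raise RuntimeError("budget exceeded")
            self._available -= amount
            res = Reservation(str(uuid.uuid4()), idempotency_key, amount)
            self._open[res.id] = res
            self._by_key[idempotency_key] = res
            return res

    def commit(self, reservation_id: str, actual: int) -> None:
        # Settle the hold: hand back the unused part, or charge the overage.
        with self._lock:
            res = self._open.pop(reservation_id)
            self._available += res.amount - actual

    def release(self, reservation_id: str) -> None:
        # Nothing was spent: return the whole hold and forget the key.
        with self._lock:
            res = self._open.pop(reservation_id)
            self._by_key.pop(res.key, None)
            self._available += res.amount

    def available(self) -> int:
        with self._lock:
            return self._available


# Ten workers race for a $5 budget at $1 each; only five can get a hold.
ledger = BudgetLedger(limit=500)
outcomes = []


def worker(n: int) -> None:
    try:
        ledger.reserve(100, idempotency_key=f"step-{n}")
        outcomes.append(True)
    except RuntimeError:
        outcomes.append(False)


threads = [threading.Thread(target=worker, args=(n,)) for n in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sum(outcomes), "reservations granted")
```

&lt;p&gt;Which five workers win depends on scheduling, but the count does not: the budget can no longer be oversubscribed. A production version would also need persistence and expiry for holds whose owner crashed, which this sketch leaves out.&lt;/p&gt;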
&lt;h2&gt;
  
  
  Why this belongs outside the agent
&lt;/h2&gt;

&lt;p&gt;I also learned that once agents start sharing budgets across tenants, workflows, runs, and tools, this logic stops being “just a wrapper.”&lt;/p&gt;

&lt;p&gt;Now you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;atomic reservations&lt;/li&gt;
&lt;li&gt;idempotency&lt;/li&gt;
&lt;li&gt;retry safety&lt;/li&gt;
&lt;li&gt;shared scope rules&lt;/li&gt;
&lt;li&gt;audit history&lt;/li&gt;
&lt;li&gt;behavior that is consistent across runtimes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At that point, budget control starts to look less like a helper function and more like infrastructure.&lt;/p&gt;

&lt;p&gt;That was the point where I stopped treating it as app code and pulled it into its own service.&lt;/p&gt;
&lt;h2&gt;
  
  
  One practical rule
&lt;/h2&gt;

&lt;p&gt;If your budget check looks like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;read_balance&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;estimated_cost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;do_the_thing&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;write_new_balance&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;you do not have enforcement yet.&lt;/p&gt;

&lt;p&gt;You have a race condition with good intentions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing thought
&lt;/h2&gt;

&lt;p&gt;Most agent failures are not exotic.&lt;/p&gt;

&lt;p&gt;They come from very ordinary bugs:&lt;br&gt;&lt;br&gt;
stale reads, duplicate retries, counters that live in the wrong place, and side effects that happen before anyone realizes the run has gone off the rails.&lt;/p&gt;

&lt;p&gt;A budget limit only matters if it can say &lt;strong&gt;no before the next step happens&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Everything else is reporting.&lt;/p&gt;




&lt;p&gt;I ran into this while building &lt;strong&gt;Cycles&lt;/strong&gt;, an open-source budget authority for autonomous agents. What looked like a simple spend check turned into a distributed systems problem: concurrency, retries, idempotency, and state that had to stay correct under failure.&lt;/p&gt;

&lt;p&gt;That was the real lesson. Once multiple workers can spend from the same pool, the budget check has to be atomic or it is not real.&lt;/p&gt;

&lt;p&gt;If you want to see how I implemented this in Cycles, there is more at &lt;a href="https://runcycles.io" rel="noopener noreferrer"&gt;https://runcycles.io&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>architecture</category>
      <category>distributedsystems</category>
      <category>opensource</category>
    </item>
    <item>
      <title>The AI Agent Control Layer Nobody Talks About</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Mon, 23 Mar 2026 16:28:48 +0000</pubDate>
      <link>https://dev.to/amavashev/the-ai-agent-control-layer-nobody-talks-about-5eid</link>
      <guid>https://dev.to/amavashev/the-ai-agent-control-layer-nobody-talks-about-5eid</guid>
      <description>&lt;p&gt;A lot of agent control discussion still sits at the wrong layer.&lt;br&gt;
Observability tells you what happened. Guardrails help shape behavior.&lt;/p&gt;

&lt;p&gt;Neither answers the production question that matters most when agents are looping, retrying, and fanning out:&lt;/p&gt;

&lt;p&gt;Can this agent still act — given what it has already done?&lt;/p&gt;

&lt;p&gt;That's the control point I've been building toward with Cycles.&lt;/p&gt;

&lt;p&gt;Simple example — a support agent with CRM and email access:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Without a runtime decision layer: the customer email fires, possibly more than once.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;With Cycles: the send is blocked before execution. The function never runs, and no email goes out.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An email that fires is a customer commitment, a compliance exposure, or a support promise.&lt;/p&gt;

&lt;p&gt;Demo here: &lt;a href="https://github.com/runcycles/cycles-agent-action-authority-demo" rel="noopener noreferrer"&gt;https://github.com/runcycles/cycles-agent-action-authority-demo&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>agents</category>
      <category>devops</category>
    </item>
    <item>
      <title>Budget Limits for Claude Code, Cursor, and Windsurf via MCP</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Sun, 22 Mar 2026 12:00:00 +0000</pubDate>
      <link>https://dev.to/amavashev/budget-limits-for-claude-code-cursor-and-windsurf-via-mcp-2b1f</link>
      <guid>https://dev.to/amavashev/budget-limits-for-claude-code-cursor-and-windsurf-via-mcp-2b1f</guid>
      <description>&lt;h2&gt;
  
  
  Budget Limits for Claude Code, Cursor, and Windsurf via MCP
&lt;/h2&gt;

&lt;p&gt;A developer starts a Claude Code session to clean up an auth flow.&lt;/p&gt;

&lt;p&gt;A few hours later, the agent has read a bunch of files, rewritten services, generated tests, retried a few times, and kept going longer than expected. The work may still be useful, but the bill is now much higher than planned.&lt;/p&gt;

&lt;p&gt;That is the gap with coding agents in Claude Code, Cursor, and Windsurf.&lt;/p&gt;

&lt;p&gt;They are designed for long, mostly unsupervised sessions. That is the benefit. It is also the risk.&lt;/p&gt;

&lt;p&gt;Most of the time, there is no built-in way to say:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;stop after this amount&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not after one more retry.&lt;br&gt;&lt;br&gt;
Not after one more tool call.&lt;br&gt;&lt;br&gt;
Just stop.&lt;/p&gt;

&lt;p&gt;That is where MCP becomes useful.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why MCP matters
&lt;/h2&gt;

&lt;p&gt;MCP gives coding hosts a standard way to discover and call external tools.&lt;/p&gt;

&lt;p&gt;That makes it a clean place to add budget enforcement without wrapping every model call in application code.&lt;/p&gt;

&lt;p&gt;With Cycles, the idea is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;agent estimates the next step&lt;/li&gt;
&lt;li&gt;Cycles reserves budget for it&lt;/li&gt;
&lt;li&gt;action runs&lt;/li&gt;
&lt;li&gt;actual usage is committed&lt;/li&gt;
&lt;li&gt;unused budget is released&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;estimate -&amp;gt; reserve -&amp;gt; execute -&amp;gt; commit/release&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is stronger than dashboards or alerts, because the decision happens &lt;strong&gt;before&lt;/strong&gt; the next expensive step.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why provider caps are not enough
&lt;/h2&gt;

&lt;p&gt;Provider caps help, but they are usually the wrong layer.&lt;/p&gt;

&lt;p&gt;They are often:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;account-level, not session-level&lt;/li&gt;
&lt;li&gt;vendor-specific, not workflow-wide&lt;/li&gt;
&lt;li&gt;blind to tool calls and other side effects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What you actually want is a runtime question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Is this session still allowed to take the next expensive step?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the question a budget authority should answer.&lt;/p&gt;
&lt;h2&gt;
  
  
  Thin MCP setup
&lt;/h2&gt;

&lt;p&gt;For Claude Code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add cycles &lt;span class="nt"&gt;--&lt;/span&gt; npx &lt;span class="nt"&gt;-y&lt;/span&gt; @runcycles/mcp-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then set the API key:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CYCLES_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;cyc_live_...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Cursor or Windsurf:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"cycles"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@runcycles/mcp-server"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"CYCLES_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cyc_live_..."&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the appeal here: no SDK in the project, no custom wrapper around every call, no deep integration work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Better than a kill switch
&lt;/h2&gt;

&lt;p&gt;A useful budget control system should not only say yes or no.&lt;/p&gt;

&lt;p&gt;It should be able to return:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ALLOW&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ALLOW_WITH_CAPS&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DENY&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That middle state matters.&lt;/p&gt;

&lt;p&gt;If budget is getting tight, the agent can finish in a constrained way instead of crashing abruptly. For example, it can reduce scope, use smaller token limits, skip expensive tools, and wrap up cleanly.&lt;/p&gt;
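
&lt;p&gt;One way to picture the three-way answer is a function that compares the estimate against what is left and a low-water mark. This is a hedged sketch rather than the Cycles API: the &lt;code&gt;decide&lt;/code&gt; function, the &lt;code&gt;low_water&lt;/code&gt; threshold, and the fixed token cap are all assumptions introduced for illustration:&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "ALLOW"
    ALLOW_WITH_CAPS = "ALLOW_WITH_CAPS"
    DENY = "DENY"


@dataclass
class Verdict:
    decision: Decision
    max_tokens: Optional[int] = None  # only set for ALLOW_WITH_CAPS


def decide(remaining: int, estimate: int, low_water: int) -> Verdict:
    """Three-way budget decision; all amounts share one unit, e.g. cents.

    low_water is a hypothetical threshold: if paying for this step would
    leave less than that amount, the step is allowed only with caps.
    """
    if estimate > remaining:
        return Verdict(Decision.DENY)
    if low_water > remaining - estimate:
        # Assumed cap policy for this sketch: a fixed, small token limit.
        return Verdict(Decision.ALLOW_WITH_CAPS, max_tokens=512)
    return Verdict(Decision.ALLOW)


for remaining in (1000, 150, 20):
    verdict = decide(remaining=remaining, estimate=50, low_water=200)
    print(remaining, verdict.decision.value, verdict.max_tokens)
```

&lt;p&gt;With plenty of budget the step is allowed outright, near the limit it is allowed with a token cap, and once the estimate exceeds what is left it is denied. The caller can use the capped verdict to shrink the request instead of failing.&lt;/p&gt;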

&lt;h2&gt;
  
  
  Wrapper vs authority
&lt;/h2&gt;

&lt;p&gt;You can absolutely vibe-code a lightweight wrapper that tracks spend.&lt;/p&gt;

&lt;p&gt;That is fine for a demo.&lt;/p&gt;

&lt;p&gt;But wrappers usually break where real control matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retries&lt;/li&gt;
&lt;li&gt;concurrency&lt;/li&gt;
&lt;li&gt;partial failures&lt;/li&gt;
&lt;li&gt;non-LLM tool actions&lt;/li&gt;
&lt;li&gt;inconsistent enforcement across hosts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A wrapper observes.&lt;/p&gt;

&lt;p&gt;An authority decides whether the next action is allowed.&lt;/p&gt;

&lt;p&gt;That is the real difference.&lt;/p&gt;
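
&lt;p&gt;To make the retry and concurrency points concrete, here is a toy comparison. Both classes are hypothetical and in-memory; they are not how Cycles is implemented, only a sketch of the difference between recording spend and deciding on it.&lt;/p&gt;

```python
import threading


class NaiveSpendTracker:
    """Wrapper-style tracker: records spend after the fact, never refuses."""

    def __init__(self):
        self.spent = 0

    def record(self, amount):
        self.spent += amount  # a retried request is simply counted again


class BudgetAuthority:
    """Authority-style ledger: decides atomically before the action runs."""

    def __init__(self, limit):
        self._limit = limit
        self._reserved = 0
        self._decisions = {}  # idempotency key -> earlier decision
        self._lock = threading.Lock()

    def reserve(self, idempotency_key, amount):
        with self._lock:  # the check and the update happen as one step
            if idempotency_key in self._decisions:
                # A retry gets the same answer and does not hold budget twice.
                return self._decisions[idempotency_key]
            allowed = self._limit - self._reserved >= amount
            if allowed:
                self._reserved += amount
            self._decisions[idempotency_key] = allowed
            return allowed

    @property
    def reserved(self):
        return self._reserved


tracker = NaiveSpendTracker()
tracker.record(60)
tracker.record(60)  # the same call, retried: the tracker now shows 120

authority = BudgetAuthority(limit=100)
first = authority.reserve("call-1", 60)   # granted
retry = authority.reserve("call-1", 60)   # same key: granted again, still 60 held
second = authority.reserve("call-2", 60)  # refused: only 40 is left
```

&lt;p&gt;A single process with a lock is the easy case. Across hosts, the same guarantee needs a shared service, which is the gap a central authority is meant to fill.&lt;/p&gt;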

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;Coding agents are great at compressing lots of work into one session.&lt;/p&gt;

&lt;p&gt;They are also very good at turning a simple task into many model calls, tool invocations, retries, and side effects before anyone notices.&lt;/p&gt;

&lt;p&gt;Dashboards help.&lt;br&gt;
Alerts help.&lt;br&gt;
Provider caps help.&lt;/p&gt;

&lt;p&gt;But they do not answer the only question that matters inside the loop:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;should this agent be allowed to continue right now?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is the role Cycles is trying to fill through MCP.&lt;/p&gt;

&lt;p&gt;Original post: &lt;a href="https://runcycles.io/blog/claude-code-cursor-windsurf-budget-limits-mcp" rel="noopener noreferrer"&gt;Budget Limits for Claude Code, Cursor, and Windsurf via MCP&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Project: &lt;a href="https://runcycles.io" rel="noopener noreferrer"&gt;runcycles.io&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
      <category>mcp</category>
    </item>
    <item>
      <title>How to Add Budget Control to a LangChain Agent</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Thu, 19 Mar 2026 18:52:40 +0000</pubDate>
      <link>https://dev.to/amavashev/how-to-add-budget-control-to-a-langchain-agent-2l56</link>
      <guid>https://dev.to/amavashev/how-to-add-budget-control-to-a-langchain-agent-2l56</guid>
      <description>&lt;h2&gt;
  
  
  How to Add Budget Control to a LangChain Agent
&lt;/h2&gt;

&lt;p&gt;LangChain makes it easy to build agents that call LLMs, search the web, execute code, and chain tool calls together. What it doesn't give you is any way to cap how much a single agent run is allowed to spend.&lt;/p&gt;

&lt;p&gt;That's fine when you're experimenting. It's a real problem when you're running agents in production — especially across multiple users or tenants. A single misbehaving agent loop can burn through hundreds of dollars before anyone notices.&lt;/p&gt;

&lt;p&gt;This guide shows how to add per-run budget control to a LangChain agent using &lt;a href="https://runcycles.io" rel="noopener noreferrer"&gt;Cycles&lt;/a&gt; — without rewriting your agent logic.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Already using the callback handler?&lt;/strong&gt;&lt;br&gt;
If you want per-LLM-call budget tracking (a reservation around every model invocation), see &lt;a href="https://runcycles.io/how-to/integrating-cycles-with-langchain" rel="noopener noreferrer"&gt;Integrating Cycles with LangChain&lt;/a&gt;. This guide covers a different pattern: a &lt;strong&gt;single reservation around the entire agent run&lt;/strong&gt;, plus optional tool-level checks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;Here's a typical LangChain agent loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create_openai_functions_agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
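&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MessagesPlaceholder&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Search the web for information.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;your_search_implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# The agent needs a prompt with an agent_scratchpad placeholder
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_messages&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful research assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{input}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;MessagesPlaceholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variable_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_scratchpad&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_openai_functions_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;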
&lt;span class="n"&gt;executor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Research the top 10 competitors...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works. But nothing here limits what the run costs. &lt;code&gt;AgentExecutor&lt;/code&gt; does cap the number of loop steps (&lt;code&gt;max_iterations&lt;/code&gt;, 15 by default), but that bounds iterations, not tokens or dollars. If the agent gets stuck in a loop, retries a failing tool, or expands scope unexpectedly, it keeps running and spending until it finishes, reaches that step cap, or hits the provider's rate limits.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: reserve before, commit after
&lt;/h2&gt;

&lt;p&gt;The pattern Cycles uses is borrowed from database transactions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reserve&lt;/strong&gt; budget before the agent run starts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execute&lt;/strong&gt; the agent if the reservation is granted&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit&lt;/strong&gt; actual usage after the run, which releases the unused remainder of the reservation back to the pool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Release&lt;/strong&gt; the full reservation if the run fails&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This gives you hard limits that are enforced &lt;em&gt;before&lt;/em&gt; spend happens — not discovered afterward on your bill.&lt;/p&gt;
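
&lt;p&gt;The four steps can be sketched with a toy in-memory ledger. This is only an illustration of the accounting, assuming a single process and no persistence; &lt;code&gt;BudgetLedger&lt;/code&gt; and its methods are hypothetical names, not the Cycles API. How a real server treats a commit larger than the hold is a policy choice this sketch does not model.&lt;/p&gt;

```python
class BudgetLedger:
    """Toy in-memory ledger; a real authority would persist this state."""

    def __init__(self, limit):
        self.limit = limit
        self.spent = 0
        self.held = {}  # reservation id -> amount on hold
        self._next_id = 0

    def available(self):
        return self.limit - self.spent - sum(self.held.values())

    def reserve(self, estimate):
        """Step 1: hold the estimate, or return None if it does not fit."""
        if estimate > self.available():
            return None
        self._next_id += 1
        reservation_id = f"res-{self._next_id}"
        self.held[reservation_id] = estimate
        return reservation_id

    def commit(self, reservation_id, actual):
        """Step 3: charge actual usage; the rest of the hold is freed."""
        self.held.pop(reservation_id)
        self.spent += actual

    def release(self, reservation_id):
        """Step 4: drop the hold without charging anything."""
        self.held.pop(reservation_id, None)


ledger = BudgetLedger(limit=100)
run = ledger.reserve(80)       # hold 80 of 100
blocked = ledger.reserve(30)   # does not fit: only 20 is still available
ledger.commit(run, actual=50)  # charge 50, free the other 30
```

&lt;p&gt;The second reservation is refused while the first is still open, even though nothing has been spent yet. That is what makes the limit hold before spend happens.&lt;/p&gt;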

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;runcycles langchain langchain-openai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CYCLES_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:7878"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CYCLES_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-api-key"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sk-..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Need an API key?&lt;/strong&gt; Create one via the Admin Server — see &lt;a href="https://runcycles.io/quickstart/deploying-the-full-cycles-stack#step-3-create-an-api-key" rel="noopener noreferrer"&gt;Deploy the Full Stack&lt;/a&gt; or &lt;a href="https://runcycles.io/how-to/api-key-management-in-cycles" rel="noopener noreferrer"&gt;API Key Management&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Per-run budget wrapper
&lt;/h2&gt;

&lt;p&gt;Wrap your &lt;code&gt;AgentExecutor&lt;/code&gt; invocation in a single Cycles reservation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create_openai_functions_agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
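&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MessagesPlaceholder&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;runcycles&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;CyclesClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CyclesConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ReservationCreateRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CommitRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ReleaseRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Subject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BudgetExceededError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CyclesProtocolError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CyclesClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CyclesConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_env&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# The agent needs a prompt with an agent_scratchpad placeholder
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_messages&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful research assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{input}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;MessagesPlaceholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;variable_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_scratchpad&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;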

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Search the web for information.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;your_search_implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_agent_with_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tenant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;budget_microcents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="c1"&gt;# 1. Reserve budget for the entire run
&lt;/span&gt;    &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_reservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ReservationCreateRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Subject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tenant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent.run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;USD_MICROCENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;budget_microcents&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# 1 USD = 100_000_000 microcents
&lt;/span&gt;        &lt;span class="n"&gt;ttl_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;120_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_error_response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BUDGET_EXCEEDED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;BudgetExceededError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;error_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="nf"&gt;else &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error_message&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reservation failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;CyclesProtocolError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_body_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reservation_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;decision&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_body_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decision&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Execute the agent — optionally downgrade if budget is tight
&lt;/span&gt;    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ALLOW_WITH_CAPS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_openai_functions_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;executor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;max_iterations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="c1"&gt;# 3. Commit actual usage
&lt;/span&gt;        &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit_reservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;CommitRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;commit-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;USD_MICROCENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;budget_microcents&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# replace with real tracking
&lt;/span&gt;            &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# 4. Release on failure — budget returns to the pool
&lt;/span&gt;        &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;release_reservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;ReleaseRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;release-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt;

&lt;span class="c1"&gt;# Run it
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_agent_with_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Research the top 10 competitors in the CRM space&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tenant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;acme&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;budget_microcents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5_000_000_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# $50.00
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
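
&lt;p&gt;The wrapper above commits a placeholder amount (&lt;code&gt;budget_microcents // 2&lt;/code&gt;). Wherever the real dollar figure comes from (a cost-tracking callback, a provider usage report), it has to be converted into whole microcents before committing. A minimal sketch of that arithmetic, using the 1 USD = 100_000_000 microcents ratio from the code comment; &lt;code&gt;usd_to_microcents&lt;/code&gt; is a hypothetical helper, not part of the &lt;code&gt;runcycles&lt;/code&gt; package:&lt;/p&gt;

```python
from decimal import Decimal, ROUND_HALF_UP

MICROCENTS_PER_USD = 100_000_000  # 100 cents x 1_000_000 microcents per cent


def usd_to_microcents(cost_usd) -> int:
    """Convert a dollar amount (str, int, or float) to whole microcents."""
    # Going through str() avoids carrying binary float noise into the result.
    exact = Decimal(str(cost_usd)) * MICROCENTS_PER_USD
    return int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


run_budget = usd_to_microcents(50)       # 5_000_000_000, the $50.00 used above
search_cost = usd_to_microcents("0.40")  # 40_000_000
```

&lt;p&gt;Keeping amounts as integers avoids rounding drift when many small charges are added up.&lt;/p&gt;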



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Crash safety&lt;/strong&gt;&lt;br&gt;
If the agent crashes before committing or releasing, the reservation expires automatically after &lt;code&gt;ttl_ms&lt;/code&gt; and the held budget returns to the pool. See &lt;a href="https://runcycles.io/protocol/reservation-ttl-grace-period-and-extend-in-cycles" rel="noopener noreferrer"&gt;TTL, Grace Period, and Extend&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
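
&lt;p&gt;The expiry behavior can be pictured with a small hold table where every reservation carries a deadline. This is a sketch under simplifying assumptions: &lt;code&gt;ExpiringHolds&lt;/code&gt; is a hypothetical class, time is passed in explicitly, and the grace period and extend operations from the linked page are not modeled.&lt;/p&gt;

```python
class ExpiringHolds:
    """Toy hold table where each reservation carries an expiry time."""

    def __init__(self):
        self._holds = {}  # reservation id -> (amount, expires_at_ms)

    def hold(self, reservation_id, amount, now_ms, ttl_ms):
        self._holds[reservation_id] = (amount, now_ms + ttl_ms)

    def settle(self, reservation_id):
        """Commit or release: either way the hold is removed."""
        self._holds.pop(reservation_id, None)

    def held_total(self, now_ms):
        """Sum of holds still alive at now_ms; expired ones are dropped."""
        self._holds = {
            rid: (amount, expires)
            for rid, (amount, expires) in self._holds.items()
            if expires > now_ms
        }
        return sum(amount for amount, _ in self._holds.values())


holds = ExpiringHolds()
holds.hold("run-1", amount=50, now_ms=0, ttl_ms=120_000)
midway = holds.held_total(now_ms=60_000)       # the hold is still active
after_ttl = holds.held_total(now_ms=120_000)   # the hold has expired
```

&lt;p&gt;The practical consequence is that &lt;code&gt;ttl_ms&lt;/code&gt; should comfortably exceed the longest run you expect, otherwise a slow but healthy run can lose its hold before it commits.&lt;/p&gt;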
&lt;h2&gt;
  
  
  Adding tool-level budget checks
&lt;/h2&gt;

&lt;p&gt;Individual tools can also reserve budget before costly operations. If the tool's reservation is denied, it returns a skip message instead of failing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_web&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Search the web for information.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;tool_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="c1"&gt;# Reserve before the tool call
&lt;/span&gt;    &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_reservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ReservationCreateRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tool_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Subject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;acme&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;toolset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web-search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool.call&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search-web&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;USD_MICROCENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100_000_000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# $1.00
&lt;/span&gt;        &lt;span class="n"&gt;ttl_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Budget exhausted — skipping web search.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;tool_reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_body_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reservation_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Execute the tool
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;your_search_implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Commit actual usage
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit_reservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;CommitRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;commit-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tool_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;USD_MICROCENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;40_000_000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# $0.40
&lt;/span&gt;    &lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Multi-tenant scoping
&lt;/h2&gt;

&lt;p&gt;Use the &lt;code&gt;Subject&lt;/code&gt; hierarchy to give each customer their own budget scope:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_for_customer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;run_agent_with_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;tenant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;budget_microcents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10_000_000_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# $100.00, or pull from the customer's plan
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each customer's spend is tracked independently. One customer burning through their budget doesn't affect others.&lt;/p&gt;
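
&lt;p&gt;The inline comment in the snippet above mentions pulling the budget from the customer's plan. A minimal sketch of that lookup follows. The &lt;code&gt;PLAN_BUDGETS_USD&lt;/code&gt; table, the plan names, and &lt;code&gt;budget_microcents_for&lt;/code&gt; are hypothetical and not part of Cycles; only the microcents conversion matches the amounts used in this post.&lt;/p&gt;

```python
# Hypothetical plan table: plan names and dollar amounts are illustrative only.
PLAN_BUDGETS_USD = {"free": 5, "team": 100, "enterprise": 1_000}

# $1.00 = 100 cents, and 1 cent = 1_000_000 microcents.
MICROCENTS_PER_USD = 100_000_000


def budget_microcents_for(plan):
    """Return a plan's budget in USD microcents, falling back to the free tier."""
    usd = PLAN_BUDGETS_USD.get(plan, PLAN_BUDGETS_USD["free"])
    return usd * MICROCENTS_PER_USD


print(budget_microcents_for("team"))  # 10000000000, i.e. $100.00
```

&lt;p&gt;The result could then be passed as &lt;code&gt;budget_microcents&lt;/code&gt; in &lt;code&gt;run_for_customer&lt;/code&gt; instead of the hard-coded value.&lt;/p&gt;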

&lt;h2&gt;
  
  
  Graceful degradation with ALLOW_WITH_CAPS
&lt;/h2&gt;

&lt;p&gt;When budget is running low, Cycles can return &lt;code&gt;ALLOW_WITH_CAPS&lt;/code&gt; instead of a hard denial. Use the decision to switch to a cheaper model or limit tool access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_reservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ReservationCreateRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Subject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tenant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workflow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent.run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;USD_MICROCENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5_000_000_000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# $50.00
&lt;/span&gt;    &lt;span class="n"&gt;ttl_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;120_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Budget denial arrives as a 409 BUDGET_EXCEEDED error — handle it here
&lt;/span&gt;    &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_error_response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BUDGET_EXCEEDED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;BudgetExceededError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;error_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;CyclesProtocolError&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;

&lt;span class="c1"&gt;# On success, decision is ALLOW or ALLOW_WITH_CAPS
&lt;/span&gt;&lt;span class="n"&gt;decision&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_body_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decision&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ALLOW_WITH_CAPS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Budget is tight — switch to a cheaper model
&lt;/span&gt;    &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# ALLOW — full capacity
&lt;/span&gt;    &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://runcycles.io/protocol/caps-and-the-three-way-decision-model-in-cycles" rel="noopener noreferrer"&gt;Caps and Three-Way Decisions&lt;/a&gt; for more on how &lt;code&gt;ALLOW_WITH_CAPS&lt;/code&gt; works and what cap fields are available.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you get
&lt;/h2&gt;

&lt;p&gt;With this pattern in place:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Per-tenant isolation&lt;/strong&gt; — &lt;code&gt;Subject(tenant="acme")&lt;/code&gt; means each customer's budget is tracked and enforced independently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful degradation&lt;/strong&gt; — &lt;code&gt;ALLOW_WITH_CAPS&lt;/code&gt; lets agents downgrade instead of stopping cold&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic reconciliation&lt;/strong&gt; — committing less than the reserved amount releases the difference back to the pool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crash safety&lt;/strong&gt; — if the agent crashes before committing, the reservation expires automatically and budget is released&lt;/li&gt;
&lt;/ul&gt;
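
&lt;p&gt;To make the reconciliation and crash-safety arithmetic concrete, here is a toy in-memory ledger. It is a sketch under stated assumptions, not how the Cycles server is implemented: the &lt;code&gt;Ledger&lt;/code&gt; class, its method names, and the plain-number clock passed as &lt;code&gt;now&lt;/code&gt; are all made up for illustration.&lt;/p&gt;

```python
class Ledger:
    """Toy in-memory budget ledger (illustrative only, not the Cycles server)."""

    def __init__(self, budget):
        self.budget = budget
        self.spent = 0
        self.held = {}  # reservation_id -> (amount, expires_at)

    def available(self, now):
        # Holds whose TTL has passed stop counting against the budget,
        # which is what makes a crashed worker harmless.
        live = sum(amount for amount, expires_at in self.held.values() if expires_at > now)
        return self.budget - self.spent - live

    def reserve(self, rid, amount, now, ttl):
        if amount > self.available(now):
            return False
        self.held[rid] = (amount, now + ttl)
        return True

    def commit(self, rid, actual):
        # Dropping the hold and charging only `actual` releases the difference.
        self.held.pop(rid)
        self.spent += actual


ledger = Ledger(budget=100)
ledger.reserve("a", 60, now=0, ttl=30)
ledger.commit("a", 25)                  # 35 goes back to the pool
ledger.reserve("b", 70, now=0, ttl=30)  # fits, because 100 - 25 = 75
# Suppose the worker holding "b" crashes and never commits.
print(ledger.available(now=10))  # 5
print(ledger.available(now=31))  # 75
```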

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://runcycles.io/how-to/integrating-cycles-with-langchain" rel="noopener noreferrer"&gt;Integrating Cycles with LangChain&lt;/a&gt; — per-LLM-call callback handler pattern&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://runcycles.io/protocol/how-reserve-commit-works-in-cycles" rel="noopener noreferrer"&gt;Reserve / Commit Lifecycle&lt;/a&gt; — protocol deep-dive&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://runcycles.io/how-to/how-to-think-about-degradation-paths-in-cycles-deny-downgrade-disable-or-defer" rel="noopener noreferrer"&gt;Degradation Paths&lt;/a&gt; — strategies for deny, downgrade, disable, or defer&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://runcycles.io/quickstart/getting-started-with-the-python-client" rel="noopener noreferrer"&gt;Add to a Python App&lt;/a&gt; — Python client quickstart&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>langchain</category>
      <category>agents</category>
      <category>python</category>
    </item>
    <item>
      <title>I burned $153 in 30 minutes with an agent loop — here's the pattern that stopped it</title>
      <dc:creator>Albert Mavashev</dc:creator>
      <pubDate>Wed, 18 Mar 2026 16:08:49 +0000</pubDate>
      <link>https://dev.to/amavashev/i-burned-153-in-30-minutes-with-an-agent-loop-heres-the-pattern-that-stopped-it-5332</link>
      <guid>https://dev.to/amavashev/i-burned-153-in-30-minutes-with-an-agent-loop-heres-the-pattern-that-stopped-it-5332</guid>
      <description>&lt;h2&gt;
  
  
  The incident
&lt;/h2&gt;

&lt;p&gt;Short story: my agent kept retrying a failing loop that used multiple models.&lt;/p&gt;

&lt;p&gt;My agent: &lt;br&gt;
GPT-4o + Stable Diffusion + TradingView Charts + Kling. &lt;br&gt;
Hundreds of iterations. $153 in under 30 minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  My enforcement layer did not work
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Rate limiters&lt;/strong&gt; — control velocity, not total cost.&lt;br&gt;
&lt;strong&gt;Provider caps&lt;/strong&gt; — per-provider, not cross-provider.&lt;br&gt;
&lt;strong&gt;Observability&lt;/strong&gt; — tells me after. I need before.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pattern that worked for me: reserve-run-commit
&lt;/h2&gt;

&lt;p&gt;My budget control in 4 steps (3 and 4 are the two possible endings):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reserve estimated cost before the call (instrumented code)&lt;/li&gt;
&lt;li&gt;Execute the action (LLM call, tool call)&lt;/li&gt;
&lt;li&gt;On success — commit actual cost, release unused portion&lt;/li&gt;
&lt;li&gt;On failure — release the full reservation&lt;/li&gt;
&lt;/ol&gt;
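
&lt;p&gt;The list above can be sketched as a small wrapper. This is a single-process toy, not the Cycles client: &lt;code&gt;Budget&lt;/code&gt;, &lt;code&gt;run_with_budget&lt;/code&gt;, and the convention that the action returns a &lt;code&gt;(result, actual_cost)&lt;/code&gt; pair are assumptions made for the example.&lt;/p&gt;

```python
class Budget:
    """Toy single-process budget; stands in for a real budget service."""

    def __init__(self, total):
        self.remaining = total

    def reserve(self, estimate):
        if estimate > self.remaining:
            raise RuntimeError("budget exceeded")
        self.remaining -= estimate  # lock the estimate up front
        return estimate

    def release(self, amount):
        self.remaining += amount


def run_with_budget(budget, estimate, action):
    held = budget.reserve(estimate)        # 1. reserve before the call
    try:
        result, actual_cost = action()     # 2. execute (LLM call, tool call)
    except Exception:
        budget.release(held)               # 4. failure: release the full reservation
        raise
    budget.release(held - actual_cost)     # 3. success: keep actual, release the rest
    return result


budget = Budget(total=100)
print(run_with_budget(budget, 40, lambda: ("ok", 15)))  # ok
print(budget.remaining)  # 85
```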

&lt;p&gt;&lt;strong&gt;The critical piece for retries&lt;/strong&gt;: each reservation has an &lt;strong&gt;idempotency key&lt;/strong&gt;. If the agent retries the exact same action, the second reservation is a no-op. Budget only gets locked once per logical action, not once per attempt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The critical piece for concurrent agents&lt;/strong&gt;: the reservation is atomic. Two agents can't both check the balance, both see enough, and both proceed. One gets through, the other gets blocked.&lt;/p&gt;
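
&lt;p&gt;Both properties can be shown with a lock and a dictionary. The sketch below is a single-process stand-in, assuming that a retried key simply replays the earlier outcome; &lt;code&gt;AtomicBudget&lt;/code&gt; and its fields are hypothetical names, and a real deployment would need the same guarantee from a shared store rather than a &lt;code&gt;threading.Lock&lt;/code&gt;.&lt;/p&gt;

```python
import threading


class AtomicBudget:
    """Toy budget with atomic, idempotent reservations (single process only)."""

    def __init__(self, total):
        self._lock = threading.Lock()
        self._remaining = total
        self._seen = {}  # idempotency_key -> outcome of the first attempt

    def reserve(self, key, amount):
        # The check and the deduction happen under one lock, so two callers
        # cannot both see the same balance and both proceed.
        with self._lock:
            if key in self._seen:
                return self._seen[key]  # retry: replay, do not lock budget again
            ok = self._remaining >= amount
            if ok:
                self._remaining -= amount
            self._seen[key] = ok
            return ok

    @property
    def remaining(self):
        with self._lock:
            return self._remaining


budget = AtomicBudget(total=100)
results = []
workers = [
    threading.Thread(target=lambda k=k: results.append(budget.reserve(k, 80)))
    for k in ("agent-a", "agent-b")
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(sorted(results))  # [False, True]: exactly one agent got through

budget.reserve("retry-1", 10)
budget.reserve("retry-1", 10)  # same key, so nothing more is locked
print(budget.remaining)  # 10
```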

&lt;h2&gt;
  
  
  What it looks like in code: the @cycles decorator
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;runcycles&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cycles&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;

&lt;span class="nd"&gt;@cycles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action_kind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm.completion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai:gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The decorator handles the reserve-commit lifecycle. If the budget is gone before the call, it raises a &lt;code&gt;BudgetExceededError&lt;/code&gt; and the call never fires. Nothing is billed.&lt;/p&gt;
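
&lt;p&gt;For intuition, here is roughly what a decorator of this shape does. It is not the &lt;code&gt;@cycles&lt;/code&gt; implementation: &lt;code&gt;budgeted&lt;/code&gt;, &lt;code&gt;OutOfBudget&lt;/code&gt;, and the in-memory &lt;code&gt;pool&lt;/code&gt; dict are hypothetical, and this toy charges the full estimate on success instead of committing an actual cost.&lt;/p&gt;

```python
import functools


class OutOfBudget(Exception):
    """Hypothetical stand-in for the library's BudgetExceededError."""


def budgeted(pool, estimate):
    """Hypothetical decorator: reserve `estimate` from `pool` before each call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if estimate > pool["remaining"]:
                raise OutOfBudget(fn.__name__)  # fn is never invoked
            pool["remaining"] -= estimate       # reserve
            try:
                return fn(*args, **kwargs)      # run
            except Exception:
                pool["remaining"] += estimate   # failed call: give it all back
                raise
        return wrapper
    return decorator


pool = {"remaining": 8_000}
calls = []


@budgeted(pool, estimate=5_000)
def ask(prompt):
    calls.append(prompt)
    return prompt.upper()


print(ask("hi"))  # HI
try:
    ask("again")  # needs 5_000, only 3_000 remain
except OutOfBudget:
    print(len(calls))  # 1: the second call never ran
```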

&lt;p&gt;Works with any LLM provider: OpenAI, Anthropic, Bedrock, or anything else.&lt;/p&gt;

&lt;h2&gt;
  
  
  The demo
&lt;/h2&gt;

&lt;p&gt;I built a runaway agent demo that shows the failure mode in under 60 seconds — same agent, same bug, two outcomes. No API key needed to run it.&lt;/p&gt;

&lt;p&gt;Demo: &lt;a href="https://github.com/runcycles/cycles-runaway-demo" rel="noopener noreferrer"&gt;https://github.com/runcycles/cycles-runaway-demo&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built (Cycles protocol + reference implementation)
&lt;/h2&gt;

&lt;p&gt;Full docs: &lt;a href="https://runcycles.io" rel="noopener noreferrer"&gt;https://runcycles.io&lt;/a&gt;&lt;br&gt;
Self-hosted, multi-language SDKs, Apache 2.0&lt;/p&gt;

&lt;p&gt;What's your approach to agent cost control? Still rolling your own counters and limiters, nothing at all, or something else?&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>agents</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
