<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Anthony Zender</title>
    <description>The latest articles on DEV Community by Anthony Zender (@azender1).</description>
    <link>https://dev.to/azender1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3681962%2Ff6210fe3-edb0-45ef-9a66-e8323a8ff7df.png</url>
      <title>DEV Community: Anthony Zender</title>
      <link>https://dev.to/azender1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/azender1"/>
    <language>en</language>
    <item>
      <title>Mastercard just launched Agent Pay for Machines. Here's the execution gap they didn't mention.</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Sat, 13 Jun 2026 14:30:48 +0000</pubDate>
      <link>https://dev.to/azender1/mastercard-just-launched-agent-pay-for-machines-heres-the-execution-gap-they-didnt-mention-1dcl</link>
      <guid>https://dev.to/azender1/mastercard-just-launched-agent-pay-for-machines-heres-the-execution-gap-they-didnt-mention-1dcl</guid>
      <description>&lt;p&gt;On June 10, Mastercard launched Agent Pay for Machines with Stripe, Coinbase, Adyen, and 30 other partners. It covers agent identity, spend limits, and payment settlement.&lt;/p&gt;

&lt;p&gt;It doesn't cover what happens when the agent crashes after the payment fires.&lt;/p&gt;

&lt;h2&gt;
  
  
  The gap nobody is talking about
&lt;/h2&gt;

&lt;p&gt;Here's the failure mode:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Agent calls &lt;code&gt;create_payment_intent&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Stripe processes the charge&lt;/li&gt;
&lt;li&gt;Agent crashes before receiving confirmation&lt;/li&gt;
&lt;li&gt;Orchestrator retries&lt;/li&gt;
&lt;li&gt;Agent calls &lt;code&gt;create_payment_intent&lt;/code&gt; again&lt;/li&gt;
&lt;li&gt;Stripe processes the charge again&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Identity verified. Spend limit not exceeded. Two charges. One customer.&lt;/p&gt;

&lt;p&gt;This has happened in production: $47K in duplicate transactions with LangChain, and $4.2K over a weekend with LangGraph. In my own live trading session, the guard blocked six duplicate executions, $3,653 in total exposure.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;idempotentHint&lt;/code&gt; annotation in MCP tells clients a tool can be safely retried. It doesn't prevent the side effect from firing twice. It's advisory, not a guard.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: claim before execute
&lt;/h2&gt;

&lt;p&gt;Before any irreversible action, derive a deterministic &lt;code&gt;request_id&lt;/code&gt; from the action's inputs and claim it in durable storage outside the execution context. If the agent crashes and retries, the guard returns the cached result without re-executing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;safe_payment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payment:stripe:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;claim&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://safeagent-production.up.railway.app/claim/test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payment.send&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;scope&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SKIP&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;existing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stripe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PaymentIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;usd&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://safeagent-production.up.railway.app/settle/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;request_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same pattern works with LangChain tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_payment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create a payment. Exactly-once guarded.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;claim&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://safeagent-production.up.railway.app/claim/test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;langchain-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payment.send&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;scope&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stripe:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SKIP&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Already processed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;existing&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stripe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PaymentIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;usd&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://safeagent-production.up.railway.app/settle/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;request_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
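
&lt;p&gt;For readers who want to see the mechanics without the hosted service, here is a minimal local sketch of claim-before-execute. It is an illustration under assumptions, not SafeAgent's implementation: the &lt;code&gt;claims&lt;/code&gt; table, the &lt;code&gt;derive_request_id&lt;/code&gt;, &lt;code&gt;open_store&lt;/code&gt;, and &lt;code&gt;guarded&lt;/code&gt; helpers, and the choice of SQLite as the store are all hypothetical. Only the PROCEED and SKIP status names come from the examples above.&lt;/p&gt;

```python
import hashlib
import json
import sqlite3


def derive_request_id(agent_id: str, action_type: str, scope: str) -> str:
    """Hash the action's inputs so identical retries map to the same ID."""
    material = json.dumps([agent_id, action_type, scope])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the claim store. Pass a file path (or use a shared database) for durability."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims ("
        "request_id TEXT PRIMARY KEY, result TEXT)"
    )
    conn.commit()
    return conn


def guarded(conn, agent_id, action_type, scope, action):
    """Run action() at most once per request_id; retries get the stored result.

    action() is assumed to return JSON-serializable data.
    """
    request_id = derive_request_id(agent_id, action_type, scope)
    try:
        # The PRIMARY KEY makes the claim atomic: only one insert can win.
        conn.execute("INSERT INTO claims (request_id) VALUES (?)", (request_id,))
        conn.commit()
    except sqlite3.IntegrityError:
        conn.rollback()
        row = conn.execute(
            "SELECT result FROM claims WHERE request_id = ?", (request_id,)
        ).fetchone()
        existing = json.loads(row[0]) if row[0] is not None else None
        return {"status": "SKIP", "existing": existing}

    result = action()  # the irreversible side effect runs only after the claim
    conn.execute(
        "UPDATE claims SET result = ? WHERE request_id = ?",
        (json.dumps(result), request_id),
    )
    conn.commit()
    return {"status": "PROCEED", "result": result}
```

&lt;p&gt;In this sketch, a crash between the claim and the result update leaves a claim with no stored result, so a retry gets SKIP with &lt;code&gt;existing&lt;/code&gt; set to &lt;code&gt;None&lt;/code&gt;. A real guard would need a way to reconcile that case.&lt;/p&gt;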



&lt;h2&gt;
  
  
  Why this matters now
&lt;/h2&gt;

&lt;p&gt;Mastercard's Agent Pay for Machines (AP4M) validates the market. Agents are going to make payments at scale. The identity and spend limit problems are solved. The execution safety problem is not.&lt;/p&gt;

&lt;p&gt;This week, four independent implementations shipped byte-verifiable conformance fixtures for the complete execution safety stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;kenneives (agentgraph)&lt;/strong&gt; — verifier admission: is this agent allowed to make this payment?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;evidai (LemonCake)&lt;/strong&gt; — gated reserve: reserve funds, verify attestation, clamp to budget&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;haroldmalikfrimpong-ops (agentid)&lt;/strong&gt; — independent verifier-side check&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SafeAgent&lt;/strong&gt; — exactly-once execution guard: PROCEED on first call, SKIP on retry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;11/11 cross-implementation binding digests byte-identical. 33/33 gateway assertions pass. 30/30 verifier assertions pass. All independently verifiable — no runtime trust required.&lt;/p&gt;

&lt;p&gt;evidai said it best in the A2A RFC thread: "nonce + exactly-once guard together give replay safety; a standalone normative nonce field without the guard would not."&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it free
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;safeagent-exec-guard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or test the hosted endpoint directly — no auth required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://safeagent-production.up.railway.app/claim/test &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"agent_id":"my-agent","action_type":"payment.send","scope":"test-123"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First call: &lt;code&gt;{"status": "PROCEED"}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Same call again: &lt;code&gt;{"status": "SKIP"}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The conformance fixtures, verify scripts, and cross-impl check are at &lt;a href="https://github.com/azender1/SafeAgent" rel="noopener noreferrer"&gt;github.com/azender1/SafeAgent&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If your agent touches payments, emails, webhooks, or trades — and it retries on failure — this is the gap in your stack.&lt;/p&gt;

</description>
      <category>aiagents</category>
      <category>payments</category>
      <category>webdev</category>
      <category>python</category>
    </item>
    <item>
      <title>My Trading Bot Tried to Execute the Same Trade Twice. That Became SafeAgent.</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Sun, 31 May 2026 03:11:35 +0000</pubDate>
      <link>https://dev.to/azender1/my-trading-bot-tried-to-execute-the-same-trade-twice-that-became-safeagent-ffl</link>
      <guid>https://dev.to/azender1/my-trading-bot-tried-to-execute-the-same-trade-twice-that-became-safeagent-ffl</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-05-21"&gt;GitHub Finish-Up-A-Thon Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bug That Doubled Real Trades
&lt;/h2&gt;

&lt;p&gt;On May 21, my live trading bot generated six duplicate execution attempts in one session.&lt;/p&gt;

&lt;p&gt;SafeAgent blocked all six.&lt;/p&gt;

&lt;p&gt;Without the guard:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one duplicated a $1,350 sell&lt;/li&gt;
&lt;li&gt;another doubled a TQQQ position&lt;/li&gt;
&lt;li&gt;total duplicate transaction exposure: &lt;strong&gt;$3,653&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That session changed how I think about AI agents, retries, and execution guarantees.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;SafeAgent is an exactly-once execution guard for AI agents and SaaS applications. It prevents duplicate payments, emails, trades, and webhook processing when retries fire after a timeout or crash.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Live endpoint:&lt;/strong&gt; &lt;a href="https://safeagent-production.up.railway.app" rel="noopener noreferrer"&gt;https://safeagent-production.up.railway.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/azender1/SafeAgent" rel="noopener noreferrer"&gt;https://github.com/azender1/SafeAgent&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;code&gt;pip install safeagent-exec-guard&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Comeback Story
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How it actually started
&lt;/h3&gt;

&lt;p&gt;Six months ago I was building two things at once: PeerPlay — a patented P2P wagering exchange for skill-based video game tournaments (USPTO provisional 63/914,036) — and a live QQQ/TQQQ momentum trading bot running on Alpaca Markets.&lt;/p&gt;

&lt;p&gt;Both hit the same bug. Contest verification agent times out, retries, settlement fires twice. Bot order fills, confirmation drops, retry fires, doubled position. Same failure mode. Different domain.&lt;/p&gt;

&lt;p&gt;Different AI models pushed me toward very different architectures during development. Some were fast but overconfident. The most useful moments came when a model explained &lt;em&gt;why&lt;/em&gt; an approach was broken before I implemented it.&lt;/p&gt;

&lt;p&gt;That's part of why SafeAgent sat unfinished. Not just time — wrong turns that burned momentum.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why local idempotency fails
&lt;/h3&gt;

&lt;p&gt;Early versions used a local SQLite guard. It worked until it didn't:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;workers restart and the in-memory state is gone&lt;/li&gt;
&lt;li&gt;containers reschedule and replay from the last checkpoint&lt;/li&gt;
&lt;li&gt;retries land on a different machine entirely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Exactly-once semantics require a durable coordination boundary outside the worker itself. That's what the hosted &lt;code&gt;/claim&lt;/code&gt; endpoint provides — the claim lives on the server, not in the process.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where the project was
&lt;/h3&gt;

&lt;p&gt;I published &lt;a href="https://dev.to/azender1/i-was-building-a-live-trading-bot-and-a-patented-wagering-system-the-bug-i-found-is-now-breaking-ai-agents-everywhere"&gt;the original article in April&lt;/a&gt; after extracting the pattern from both projects. That was the before state:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local SQLite guard only&lt;/li&gt;
&lt;li&gt;Basic &lt;code&gt;/claim&lt;/code&gt; endpoint&lt;/li&gt;
&lt;li&gt;Trading bot integration example&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/audit&lt;/code&gt; marked "coming soon"&lt;/li&gt;
&lt;li&gt;No SaaS coverage&lt;/li&gt;
&lt;li&gt;No external integrations&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What I finished
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. &lt;code&gt;/audit&lt;/code&gt; endpoint — now live&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It was implemented in code but never deployed or documented. Now it's live:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"https://safeagent-production.up.railway.app/audit?agent_id=bot-1&amp;amp;status=COMMITTED"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full claim history, filterable by &lt;code&gt;agent_id&lt;/code&gt;, &lt;code&gt;action&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, and timestamp range. Every claim, every SKIP, every duplicate blocked — with timestamps.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. SaaS coverage — Stripe, webhooks, email&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The original README read like a trading tool. SafeAgent solves the same problem for any SaaS. Webhooks from providers such as Stripe, GitHub, and Twilio can arrive more than once, through automatic retries or manual redelivery, so handlers have to assume at-least-once delivery. SafeAgent turns at-least-once into exactly-once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_stripe_webhook&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://safeagent-production.up.railway.app/claim&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;saas-webhooks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stripe_event&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;scope&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SKIP&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nf"&gt;provision_subscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. WisePick integration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/w2jmoe/WisePick" rel="noopener noreferrer"&gt;WisePick&lt;/a&gt; shipped a full adapter, replay demo, and integration docs. The integration splits routing from execution — WisePick answers &lt;em&gt;what and which provider&lt;/em&gt;, SafeAgent answers &lt;em&gt;whether this already ran&lt;/em&gt;. The &lt;code&gt;decision_id&lt;/code&gt; is intentionally excluded from the &lt;code&gt;request_id&lt;/code&gt; derivation so retries that mint a new routing decision still hit the same execution slot and return SKIP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. CrewAI hosted backend&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PR &lt;a href="https://github.com/crewAIInc/crewAI/pull/5822" rel="noopener noreferrer"&gt;crewAIInc/crewAI#5822&lt;/a&gt; adds pluggable idempotency backends. I shipped a hosted &lt;code&gt;SafeAgentCacheBackend&lt;/code&gt; that implements the interface — cross-machine, crash-safe, no local SQLite required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Production proof — May 21&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Exposure blocked&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0942 ET&lt;/td&gt;
&lt;td&gt;duplicate buy TQQQ qty=6&lt;/td&gt;
&lt;td&gt;$452&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0947 ET&lt;/td&gt;
&lt;td&gt;duplicate add TQQQ qty=6&lt;/td&gt;
&lt;td&gt;$452&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0949 ET&lt;/td&gt;
&lt;td&gt;duplicate sell TQQQ qty=12&lt;/td&gt;
&lt;td&gt;$902&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1000 ET&lt;/td&gt;
&lt;td&gt;duplicate entry TQQQ qty=6&lt;/td&gt;
&lt;td&gt;$454&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1014 ET&lt;/td&gt;
&lt;td&gt;duplicate sell TQQQ qty=18&lt;/td&gt;
&lt;td&gt;$1,350&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1106 ET&lt;/td&gt;
&lt;td&gt;duplicate SQQQ add&lt;/td&gt;
&lt;td&gt;$43&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The session also surfaced a gap: the exit side has no guard. When a SQQQ exit failed with 422 Unprocessable Entity, the bot logged ENTRY BLOCKED for three hours because of a phantom position that didn't exist. That failure mode is now documented in the README rather than buried, and it is the next spec item.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Live: &lt;a href="https://safeagent-production.up.railway.app/audit" rel="noopener noreferrer"&gt;https://safeagent-production.up.railway.app/audit&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Session data: &lt;a href="https://gist.github.com/azender1/b9112b6519c935df4a75cb05cd250e26" rel="noopener noreferrer"&gt;https://gist.github.com/azender1/b9112b6519c935df4a75cb05cd250e26&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/azender1/SafeAgent" rel="noopener noreferrer"&gt;https://github.com/azender1/SafeAgent&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  My Experience with AI
&lt;/h2&gt;

&lt;p&gt;Claude has been in every meaningful step: the two-phase claim architecture, the SaaS integration examples, the WisePick README section, every GitHub PR comment before I posted it.&lt;/p&gt;

&lt;p&gt;The most useful thing it does isn't writing code — it's telling me when an approach is wrong before I build it.&lt;/p&gt;

&lt;p&gt;The model that's most confident isn't always the most correct. The one that says "this approach is broken because X" is worth more than the one that says "here's how to build the broken thing faster."&lt;/p&gt;

&lt;h2&gt;
  
  
  Update — June 5, 2026
&lt;/h2&gt;

&lt;p&gt;Since publishing, SafeAgent has become the first verified external integrator on Soma — the Mycelium verified agent catalog.&lt;br&gt;
Every production execution is now anchored on-chain via Mycelium Trails on Arbitrum and independently verifiable by any auditor without going through the operator:&lt;/p&gt;

&lt;p&gt;Soma listing (Integrator badge): &lt;a href="https://soma-api.rgiskard.xyz/catalog" rel="noopener noreferrer"&gt;https://soma-api.rgiskard.xyz/catalog&lt;/a&gt;&lt;br&gt;
Live trails: &lt;a href="https://argentum-api.rgiskard.xyz/dashboard/trails?client=safeagent-prod" rel="noopener noreferrer"&gt;https://argentum-api.rgiskard.xyz/dashboard/trails?client=safeagent-prod&lt;/a&gt;&lt;br&gt;
Mainnet tx: 0x40521db6c728e819d251bcbea68bed238a78562e4a3420c8b5cf64637c7d1f8e&lt;/p&gt;

&lt;p&gt;The guard is also now cited in active threads at Stripe (#402), CrewAI (#5802), and the A2A protocol (#1786) — all converging on the same primitive: a content-addressed claim derived before execution, verifiable independently of the runtime that produced it.&lt;br&gt;
The action_ref derivation has also been updated to JCS (RFC 8785) to align with argentum-core and ensure byte-level cross-implementation compatibility.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
    </item>
    <item>
      <title>The Execution Boundary Problem: What PocketOS Made Visible</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Wed, 29 Apr 2026 22:43:11 +0000</pubDate>
      <link>https://dev.to/azender1/the-execution-boundary-problem-what-pocketos-made-visible-4b62</link>
      <guid>https://dev.to/azender1/the-execution-boundary-problem-what-pocketos-made-visible-4b62</guid>
      <description>&lt;p&gt;The PocketOS incident last week gave the execution boundary problem a name everyone could see. But the bug was already breaking systems quietly: payments, trades, scheduled jobs, anywhere an AI agent retries a failed action without knowing whether the first attempt completed.&lt;/p&gt;

&lt;p&gt;The guardrail can't live inside the agent. It has to live outside, at the tool call boundary.&lt;/p&gt;

&lt;p&gt;That's what SafeAgent does.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;safe_execute(request_id, action, payload)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The same &lt;code&gt;request_id&lt;/code&gt; always returns the original receipt. The side effect never fires twice. Works with any MCP host — Claude, Cursor, Windsurf.&lt;/p&gt;
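
&lt;p&gt;A toy version of that call shape, to show the receipt-replay behavior. This is a sketch under assumptions, not the library's code: here &lt;code&gt;action&lt;/code&gt; is assumed to be a callable that takes the payload, and receipts live in an in-process dict, which is exactly what a real guard must avoid since the store has to sit outside the agent.&lt;/p&gt;

```python
import threading
from typing import Any, Callable, Dict

# Illustration only: a real guard keeps receipts in durable storage outside the process.
_receipts: Dict[str, Any] = {}
_lock = threading.Lock()


def safe_execute(request_id: str, action: Callable[[dict], Any], payload: dict) -> Any:
    """Run action(payload) once per request_id and replay the receipt afterwards."""
    with _lock:
        if request_id in _receipts:
            # Retry: skip the side effect and hand back the original receipt.
            return _receipts[request_id]
        receipt = action(payload)
        _receipts[request_id] = receipt
        return receipt
```

&lt;p&gt;Calling it twice with the same &lt;code&gt;request_id&lt;/code&gt; runs the action once; the second call only reads the stored receipt.&lt;/p&gt;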

&lt;p&gt;I found this pattern building a live trading bot. Duplicate execution under retry is catastrophic when money is on the line.&lt;/p&gt;

&lt;p&gt;&lt;a class="mentioned-user" href="https://dev.to/grok"&gt;@grok&lt;/a&gt; validated the OTEL exporter design on X and offered to help refine it. It shipped the same night.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install safeagent-exec-guard&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Demo: azender1.github.io/SafeAgent/demo.html&lt;br&gt;
GitHub: github.com/azender1/SafeAgent&lt;/p&gt;

</description>
      <category>agents</category>
      <category>architecture</category>
      <category>mcp</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I Was Building a Live Trading Bot and a Patented Wagering System. The Bug I Found Is Now Breaking AI Agents Everywhere.</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Sun, 26 Apr 2026 06:15:26 +0000</pubDate>
      <link>https://dev.to/azender1/i-was-building-a-live-trading-bot-and-a-patented-wagering-system-the-bug-i-found-is-now-breaking-2oeg</link>
      <guid>https://dev.to/azender1/i-was-building-a-live-trading-bot-and-a-patented-wagering-system-the-bug-i-found-is-now-breaking-2oeg</guid>
      <description>&lt;p&gt;This isn't a library I built to solve a theoretical problem.&lt;/p&gt;

&lt;p&gt;It's a fix I built because real money was at risk.&lt;/p&gt;

&lt;h2&gt;
  
  
  The trading bot
&lt;/h2&gt;

&lt;p&gt;I've been running a live QQQ/TQQQ momentum bot on Alpaca Markets. It reads 1-minute bars, scores market structure using VWAP, SMA8, SMA21, SMA34, and momentum signals, then enters leveraged positions in TQQQ (bull) or SQQQ (bear) based on that score.&lt;/p&gt;

&lt;p&gt;The bot has retry logic built in. It has to — broker ACK timeouts are real. When you submit a market order and the network drops before confirmation comes back, you don't know if it filled or not. So the bot retries.&lt;/p&gt;

&lt;p&gt;Here's the problem: if the first order actually filled but the confirmation timed out, the retry fires a second market order. On a 3x leveraged ETF, that's a doubled position you didn't intend. With real dollars on the line.&lt;/p&gt;

&lt;p&gt;The bot already had a manual execution lock (&lt;code&gt;EXECUTION_LOCK_SEC=15&lt;/code&gt;) and a JSON state machine to handle this. I built it by hand. It worked — mostly. But it was fragile, untested, and not something I'd want to hand to anyone else.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# The old pattern — retries up to 3 times
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;place_order_with_retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;side&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;last_err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EXIT_RETRY_COUNT&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;place_order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;side&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# fires twice if first timed out but filled
&lt;/span&gt;        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;last_err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EXIT_RETRY_SLEEP_SEC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;last_err&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;place_order&lt;/code&gt; call has no memory. If attempt 1 filled and attempt 2 fires, you now own twice the position. The broker doesn't know you didn't mean it.&lt;/p&gt;
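
&lt;p&gt;To make the failure concrete, here is a small simulation of that retry loop against a fake broker. &lt;code&gt;FakeBroker&lt;/code&gt;, &lt;code&gt;AckTimeout&lt;/code&gt;, and the &lt;code&gt;retries&lt;/code&gt; parameter are hypothetical stand-ins for illustration, not the Alpaca API or the bot's real code. The fake broker fills the first order but loses the acknowledgement.&lt;/p&gt;

```python
class AckTimeout(Exception):
    """Raised when the broker's confirmation never reaches the caller."""


class FakeBroker:
    """Hypothetical broker: fills every order, but loses the first ACK."""

    def __init__(self):
        self.position = 0
        self.calls = 0

    def place_order(self, symbol, qty, side):
        self.calls += 1
        self.position += qty  # the order fills at the broker regardless
        if self.calls == 1:
            raise AckTimeout("filled, but confirmation was lost")
        return {"symbol": symbol, "qty": qty, "side": side, "status": "filled"}


def place_order_with_retry(broker, symbol, qty, side, retries=3):
    """Naive retry: every attempt is a brand-new order."""
    last_err = None
    for _ in range(retries):
        try:
            return broker.place_order(symbol, qty, side)
        except AckTimeout as err:
            last_err = err
    raise last_err


broker = FakeBroker()
result = place_order_with_retry(broker, "TQQQ", 10, "buy")
print(result["status"], broker.position)  # intended 10 shares, holding 20
```

&lt;p&gt;The caller sees one clean "filled" response, yet the simulated account holds two fills. Nothing in the return value reveals the duplicate.&lt;/p&gt;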

&lt;h2&gt;
  
  
  The wagering system
&lt;/h2&gt;

&lt;p&gt;At the same time I was building the bot, I was designing &lt;strong&gt;PeerPlay&lt;/strong&gt;, a patent-pending P2P wagering exchange for skill-based video game tournaments (USPTO provisional application 63/914,036).&lt;/p&gt;

&lt;p&gt;PeerPlay has an escrow engine, a verification layer, and a settlement layer. The verification layer uses AI to confirm match results. When a verification agent times out and retries, the settlement layer can receive two confirmation signals for the same match. Two signals → two prize payouts. One tournament result, two winner transfers.&lt;/p&gt;

&lt;p&gt;The patent filing covers the architecture. Nothing in it protects you from your own execution layer firing twice.&lt;/p&gt;

&lt;p&gt;Same problem. Different domain.&lt;/p&gt;

&lt;h2&gt;
  
  
  The extraction
&lt;/h2&gt;

&lt;p&gt;I realized the trading bot and PeerPlay had identical failure modes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent/bot decides to act
    ↓
Network times out
    ↓
Agent/bot retries
    ↓
Side effect fires twice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix in both cases is the same primitive: before you execute an irreversible action, check whether it already ran. If it did, return the original result. If it didn't, run it and store the result.&lt;/p&gt;

&lt;p&gt;That's SafeAgent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;settlement.settlement_requests&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SettlementRequestRegistry&lt;/span&gt;

&lt;span class="n"&gt;registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SettlementRequestRegistry&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Same request_id on retry → returns original receipt, never re-executes
&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trade:TQQQ:buy:2026-04-26T09:47:00&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;order_buy_TQQQ&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;symbol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TQQQ&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qty&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;side&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;buy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;execute_fn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;place_order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TQQQ&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;buy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First call executes the order and stores the receipt. Any retry with the same &lt;code&gt;request_id&lt;/code&gt; returns the stored receipt — the broker is never called again.&lt;/p&gt;
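
&lt;p&gt;The behavior described above can be sketched in a few lines. This is not SafeAgent's implementation, only a minimal in-memory illustration that assumes receipts are keyed by &lt;code&gt;request_id&lt;/code&gt;. &lt;code&gt;ReceiptGuard&lt;/code&gt; and &lt;code&gt;fake_place_order&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
class ReceiptGuard:
    """Hypothetical in-memory guard: one execution per request_id."""

    def __init__(self):
        self._receipts = {}

    def execute(self, request_id, execute_fn):
        # Replay: a receipt exists, so return it without calling execute_fn.
        if request_id in self._receipts:
            return self._receipts[request_id]
        # First call: run the side effect once and store its result.
        result = execute_fn()
        self._receipts[request_id] = result
        return result


orders_sent = []


def fake_place_order():
    """Stand-in for a broker call; records each real execution."""
    orders_sent.append("TQQQ buy 10")
    return {"order_no": len(orders_sent), "status": "filled"}


guard = ReceiptGuard()
request_id = "trade:TQQQ:buy:2026-04-26T09:47:00"
first = guard.execute(request_id, fake_place_order)
retry = guard.execute(request_id, fake_place_order)
print(first == retry, len(orders_sent))
```

&lt;p&gt;A dict is lost on restart and is not safe under concurrency, so treat this as the shape of the idea only. It also stores a receipt only when &lt;code&gt;execute_fn&lt;/code&gt; returns, so an exception raised after the side effect fired still leaves the outcome unknown.&lt;/p&gt;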

&lt;h2&gt;
  
  
  Why this matters for AI agents specifically
&lt;/h2&gt;

&lt;p&gt;The trading bot and PeerPlay are deterministic systems. They have retry logic because networks are unreliable. AI agents have the same problem but worse — they also have uncertain completion signals.&lt;/p&gt;

&lt;p&gt;When Claude or any LLM agent calls a tool, it may:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get a timeout and retry the same call&lt;/li&gt;
&lt;li&gt;Receive an ambiguous response and call again to confirm&lt;/li&gt;
&lt;li&gt;Run in a loop and re-trigger the same action&lt;/li&gt;
&lt;li&gt;Get restarted mid-execution and replay from the last checkpoint&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every one of these scenarios can produce duplicate side effects. The agent frameworks (LangChain, CrewAI, n8n, OpenAI function calling) handle retries at the transport layer. None of them track whether the side effect already happened.&lt;/p&gt;

&lt;p&gt;That gap — between the agent decision and the irreversible action — is where SafeAgent lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  The state machine
&lt;/h2&gt;

&lt;p&gt;SafeAgent doesn't just deduplicate by request_id. It enforces a finality gate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPEN → RESOLVED → IN_RECONCILIATION → FINAL → SETTLED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execution is only permitted from &lt;code&gt;FINAL&lt;/code&gt;. If the agent's signals are ambiguous — conflicting tool responses, partial confirmations, uncertain outcomes — the state stays in &lt;code&gt;IN_RECONCILIATION&lt;/code&gt; and the side effect is blocked until the outcome is clear.&lt;/p&gt;

&lt;p&gt;This is what I needed for PeerPlay's verification layer. The AI model returns a confidence score. SafeAgent holds the settlement until that score clears a threshold. Below threshold: &lt;code&gt;IN_RECONCILIATION&lt;/code&gt;. Above threshold: &lt;code&gt;FINAL&lt;/code&gt;. Payout executes exactly once.&lt;/p&gt;
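
&lt;p&gt;A rough sketch of such a gate follows. The state names come from the diagram above; everything else is an assumption for illustration, including the &lt;code&gt;Settlement&lt;/code&gt; class, the 0.9 threshold, and the returned strings. The post does not say what triggers the &lt;code&gt;OPEN&lt;/code&gt; and &lt;code&gt;RESOLVED&lt;/code&gt; steps, so the sketch does not model them.&lt;/p&gt;

```python
from enum import Enum


class State(Enum):
    OPEN = "OPEN"
    RESOLVED = "RESOLVED"
    IN_RECONCILIATION = "IN_RECONCILIATION"
    FINAL = "FINAL"
    SETTLED = "SETTLED"


CONFIDENCE_THRESHOLD = 0.9  # assumed value, for illustration only


class Settlement:
    """Hypothetical settlement record guarded by a finality gate."""

    def __init__(self):
        self.state = State.OPEN
        self.payouts = 0

    def record_verification(self, confidence):
        """Reach FINAL only when the confidence clears the threshold."""
        if self.state is State.SETTLED:
            return  # already paid; later signals change nothing
        if confidence >= CONFIDENCE_THRESHOLD:
            self.state = State.FINAL
        else:
            self.state = State.IN_RECONCILIATION

    def execute_payout(self):
        """Pay out once, and only from FINAL."""
        if self.state is State.SETTLED:
            return "already_settled"
        if self.state is not State.FINAL:
            return "blocked"
        self.payouts += 1
        self.state = State.SETTLED
        return "paid"


s = Settlement()
s.record_verification(0.62)
blocked = s.execute_payout()
s.record_verification(0.97)
paid = s.execute_payout()
replay = s.execute_payout()
print(blocked, paid, replay, s.payouts)
```

&lt;p&gt;The low-confidence signal leaves the record in &lt;code&gt;IN_RECONCILIATION&lt;/code&gt;, so the first payout attempt is refused. After the high-confidence signal the payout runs once, and the replay is a no-op.&lt;/p&gt;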

&lt;h2&gt;
  
  
  Where it fits in the MCP stack
&lt;/h2&gt;

&lt;p&gt;If you're building agents on MCP, SafeAgent sits above your tool layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude / agent decision
    → SafeAgent finality gate
    → SafeAgent request-id dedup
    → MCP tool executes
    → Receipt stored (SQLite, survives restarts)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works with any MCP-capable host — Claude, Cursor, Windsurf, custom executors — without modifying the protocol.&lt;/p&gt;
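
&lt;p&gt;To show why persisting the receipt matters, here is a hedged sketch of a receipt store on a SQLite file, using only Python's standard &lt;code&gt;sqlite3&lt;/code&gt; module. The table layout, the &lt;code&gt;SqliteReceiptStore&lt;/code&gt; class, and the request id are assumptions, not SafeAgent's actual schema. Closing and reopening the store stands in for a process restart.&lt;/p&gt;

```python
import json
import os
import sqlite3
import tempfile


class SqliteReceiptStore:
    """Hypothetical durable receipt store backed by a SQLite file."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS receipts ("
            "request_id TEXT PRIMARY KEY, result TEXT NOT NULL)"
        )
        self.conn.commit()

    def execute(self, request_id, execute_fn):
        row = self.conn.execute(
            "SELECT result FROM receipts WHERE request_id = ?", (request_id,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # replay: stored receipt, no side effect
        result = execute_fn()
        self.conn.execute(
            "INSERT INTO receipts (request_id, result) VALUES (?, ?)",
            (request_id, json.dumps(result)),
        )
        self.conn.commit()
        return result

    def close(self):
        self.conn.close()


calls = []


def tool_call():
    """Stand-in for an MCP tool with a side effect."""
    calls.append(1)
    return {"status": "sent", "attempt": len(calls)}


db_path = os.path.join(tempfile.mkdtemp(), "receipts.db")

store = SqliteReceiptStore(db_path)
first = store.execute("email:welcome:user-42", tool_call)
store.close()  # simulate a process restart

store = SqliteReceiptStore(db_path)
replay = store.execute("email:welcome:user-42", tool_call)
store.close()
print(first == replay, len(calls))
```

&lt;p&gt;The second store instance has no in-memory state, yet it still returns the first receipt because the row survived on disk.&lt;/p&gt;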

&lt;p&gt;As of today (April 26, 2026) SafeAgent is officially listed in the MCP registry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;io.github.azender1/safeagent v0.1.14
registry.modelcontextprotocol.io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;safeagent-exec-guard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Python 3.10+ · Apache-2.0 · &lt;a href="https://github.com/azender1/SafeAgent" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; · &lt;a href="https://azender1.github.io/SafeAgent" rel="noopener noreferrer"&gt;Live demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The trading bot integration example is in the repo at &lt;code&gt;examples/safeagent_trading_integration.py&lt;/code&gt; — it shows the before/after pattern with real variable names from the QQQ bot.&lt;/p&gt;

&lt;h2&gt;
  
  
  The audit
&lt;/h2&gt;

&lt;p&gt;If you're running agents or bots in production and want to know where your system can execute twice, I'm offering a focused duplicate execution risk audit for $499. You get a written report covering every retry path and every side-effect boundary, plus SafeAgent integration recommendations.&lt;/p&gt;

&lt;p&gt;DM me or email &lt;a href="mailto:azender1@yahoo.com"&gt;azender1@yahoo.com&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Anthony Zender, Dayton OH. Payroll tax accountant by day, agent infrastructure builder by night. USPTO provisional 63/914,036 — Zender Gaming Technologies LLC.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>trading</category>
      <category>agents</category>
    </item>
    <item>
      <title>The Real AI Agent Failure Mode Is Uncertain Completion</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Sat, 28 Mar 2026 14:12:46 +0000</pubDate>
      <link>https://dev.to/azender1/the-real-ai-agent-failure-mode-is-uncertain-completion-447n</link>
      <guid>https://dev.to/azender1/the-real-ai-agent-failure-mode-is-uncertain-completion-447n</guid>
      <description>&lt;p&gt;A lot of AI agent discussion focuses on the wrong failure modes.&lt;/p&gt;

&lt;p&gt;People talk about:&lt;/p&gt;

&lt;p&gt;hallucinations&lt;br&gt;
prompt injection&lt;br&gt;
tool misuse&lt;br&gt;
runaway loops&lt;br&gt;
bad reasoning&lt;/p&gt;

&lt;p&gt;Those are real.&lt;/p&gt;

&lt;p&gt;But once an agent starts calling tools that affect the outside world, a different class of failure becomes much more dangerous:&lt;/p&gt;

&lt;p&gt;uncertain completion&lt;/p&gt;

&lt;p&gt;That is the moment where the system cannot confidently answer:&lt;/p&gt;

&lt;p&gt;“Did this action already happen?”&lt;/p&gt;

&lt;p&gt;And once that question becomes ambiguous, retries get dangerous very fast.&lt;/p&gt;

&lt;p&gt;What uncertain completion actually looks like&lt;/p&gt;

&lt;p&gt;A common real-world path looks like this:&lt;/p&gt;

&lt;p&gt;agent decides to call send_payment()&lt;br&gt;
→ tool sends the payment request&lt;br&gt;
→ timeout / crash / disconnect / lost response&lt;br&gt;
→ caller does not know if it succeeded&lt;br&gt;
→ retry happens&lt;br&gt;
→ payment may be sent again&lt;/p&gt;

&lt;p&gt;The same thing shows up with:&lt;/p&gt;

&lt;p&gt;order creation&lt;br&gt;
booking flows&lt;br&gt;
email sends&lt;br&gt;
CRM mutations&lt;br&gt;
support ticket creation&lt;br&gt;
browser / UI automation&lt;br&gt;
webhook-triggered workflows&lt;/p&gt;

&lt;p&gt;The model may have made the correct decision.&lt;/p&gt;

&lt;p&gt;The failure is that the system has no durable way to prove whether the side effect already happened.&lt;/p&gt;

&lt;p&gt;This is not mainly a prompting problem&lt;/p&gt;

&lt;p&gt;The agent is often not “being stupid.”&lt;/p&gt;

&lt;p&gt;The system is simply missing a clean execution boundary.&lt;/p&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;p&gt;the same logical action can be attempted multiple times&lt;br&gt;
the caller cannot distinguish “attempted” from “completed”&lt;br&gt;
retries are forced to guess&lt;/p&gt;

&lt;p&gt;And “guessing” is exactly how you get:&lt;/p&gt;

&lt;p&gt;duplicate payments&lt;br&gt;
duplicate emails&lt;br&gt;
duplicate orders&lt;br&gt;
duplicate API mutations&lt;br&gt;
duplicate irreversible actions&lt;/p&gt;

&lt;p&gt;The hidden trap: “we logged the attempt”&lt;/p&gt;

&lt;p&gt;A lot of systems record that they tried to do something.&lt;/p&gt;

&lt;p&gt;That is not the same as recording that it completed safely.&lt;/p&gt;

&lt;p&gt;This is where the distinction matters:&lt;/p&gt;

&lt;p&gt;State visibility&lt;/p&gt;

&lt;p&gt;Can your system durably see:&lt;/p&gt;

&lt;p&gt;what was requested&lt;br&gt;
what was claimed&lt;br&gt;
what actually completed&lt;br&gt;
what result should be returned on replay&lt;/p&gt;

&lt;p&gt;Result recovery&lt;/p&gt;

&lt;p&gt;If the side effect happened but the response was lost, can the system reconstruct what should happen next without re-executing the side effect?&lt;/p&gt;

&lt;p&gt;That second part is where many systems break.&lt;/p&gt;

&lt;p&gt;Because once the answer becomes:&lt;/p&gt;

&lt;p&gt;“we’re not sure, so retry it”&lt;/p&gt;

&lt;p&gt;you are already in dangerous territory.&lt;/p&gt;

&lt;p&gt;API idempotency helps — but it is not enough&lt;/p&gt;

&lt;p&gt;A common response is:&lt;/p&gt;

&lt;p&gt;“Just use idempotency keys.”&lt;/p&gt;

&lt;p&gt;That is often correct.&lt;/p&gt;

&lt;p&gt;And if the downstream API supports strong idempotency semantics, you should absolutely use them.&lt;/p&gt;

&lt;p&gt;But that still leaves hard cases:&lt;/p&gt;

&lt;p&gt;the downstream API does not support idempotency&lt;br&gt;
the key is not stable across retries&lt;br&gt;
the first call may have succeeded but the caller cannot prove it&lt;br&gt;
the side effect is happening in a browser / UI / desktop automation context&lt;br&gt;
the external system gives weak or ambiguous feedback&lt;/p&gt;
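
&lt;p&gt;The "key is not stable across retries" case is easy to create by accident. The sketch below contrasts a key generated per attempt with one derived from the logical action. The function names and payload fields are hypothetical, and sorted-key JSON is a simplification, not a full canonicalization scheme such as RFC 8785.&lt;/p&gt;

```python
import hashlib
import json
import uuid


def unstable_key():
    """Anti-pattern: a fresh random key on every attempt."""
    return str(uuid.uuid4())


def stable_key(action, payload):
    """Derive the key from the logical action, not from the attempt."""
    canonical = json.dumps(
        {"action": action, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


attempt_1 = stable_key(
    "send_payment",
    {"to": "acct_123", "amount_cents": 5000, "invoice": "INV-9"},
)
attempt_2 = stable_key(
    "send_payment",
    {"invoice": "INV-9", "amount_cents": 5000, "to": "acct_123"},
)
print(attempt_1 == attempt_2)          # same logical action, same key
print(unstable_key() == unstable_key())  # random keys never line up
```

&lt;p&gt;A content-derived key treats two identical payloads as the same action. The payload therefore needs a business identifier, such as the invoice number here, so that two payments that are meant to be separate do not collide.&lt;/p&gt;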

&lt;p&gt;In those cases, the problem is no longer just API-level idempotency.&lt;/p&gt;

&lt;p&gt;It becomes:&lt;/p&gt;

&lt;p&gt;execution-layer safety&lt;/p&gt;

&lt;p&gt;The important split: intent vs execution&lt;/p&gt;

&lt;p&gt;One of the cleanest ways to think about this is:&lt;/p&gt;

&lt;p&gt;the agent should not directly own irreversible side effects&lt;/p&gt;

&lt;p&gt;Instead, there should be a separation between:&lt;/p&gt;

&lt;p&gt;Agent intent&lt;/p&gt;

&lt;p&gt;“I think we should do X”&lt;/p&gt;

&lt;p&gt;and&lt;/p&gt;

&lt;p&gt;Execution&lt;/p&gt;

&lt;p&gt;“X is now allowed to happen exactly once”&lt;/p&gt;

&lt;p&gt;That is a very important boundary.&lt;/p&gt;

&lt;p&gt;Because once the system separates:&lt;/p&gt;

&lt;p&gt;decision&lt;br&gt;
validation&lt;br&gt;
execution&lt;br&gt;
receipt / replay&lt;/p&gt;

&lt;p&gt;…then retries stop being so dangerous.&lt;/p&gt;

&lt;p&gt;A better pattern: proposal → guard → execute&lt;/p&gt;

&lt;p&gt;A safer structure looks more like this:&lt;/p&gt;

&lt;p&gt;agent proposes action&lt;br&gt;
→ deterministic layer validates action&lt;br&gt;
→ execution guard checks durable receipt&lt;br&gt;
→ if already completed: return prior result&lt;br&gt;
→ else: execute once and persist receipt&lt;/p&gt;
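
&lt;p&gt;One way to picture that flow in code, as a sketch under stated assumptions: the &lt;code&gt;Proposal&lt;/code&gt; type, the allow-list policy, and the in-memory &lt;code&gt;receipts&lt;/code&gt; dict are all hypothetical, and a real system would persist receipts durably.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """What the agent wants to do. Holding one executes nothing."""
    request_id: str
    tool: str
    args: dict


ALLOWED_TOOLS = {"send_email"}  # hypothetical policy
receipts = {}                   # stand-in for durable storage
outbox = []


def validate(proposal):
    """Deterministic checks that do not depend on the model."""
    if proposal.tool not in ALLOWED_TOOLS:
        return "tool not allowed"
    if "to" not in proposal.args:
        return "missing recipient"
    return None


def send_email(args):
    """Stand-in side effect."""
    outbox.append(args["to"])
    return {"status": "sent", "to": args["to"]}


def run(proposal):
    problem = validate(proposal)
    if problem is not None:
        return {"status": "rejected", "reason": problem}
    if proposal.request_id in receipts:
        return receipts[proposal.request_id]  # already completed
    receipt = send_email(proposal.args)
    receipts[proposal.request_id] = receipt
    return receipt


p = Proposal("email:welcome:user-42", "send_email", {"to": "a@example.com"})
first = run(p)
second = run(p)
rejected = run(Proposal("acct:delete:user-42", "delete_account", {}))
print(first, second, len(outbox))
print(rejected)
```

&lt;p&gt;The agent only ever constructs a &lt;code&gt;Proposal&lt;/code&gt;. Whether anything happens, and how many times, is decided by &lt;code&gt;run&lt;/code&gt;.&lt;/p&gt;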

&lt;p&gt;This is a very different mental model from:&lt;/p&gt;

&lt;p&gt;agent decides&lt;br&gt;
→ immediately call side-effecting tool&lt;/p&gt;

&lt;p&gt;That second pattern is where a lot of production agent systems get into trouble.&lt;/p&gt;

&lt;p&gt;The more irreversible the action, the thicker the boundary&lt;/p&gt;

&lt;p&gt;Not all tools should be treated equally.&lt;/p&gt;

&lt;p&gt;A useful mental model is:&lt;/p&gt;

&lt;p&gt;Safe tools&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;p&gt;search&lt;br&gt;
read_file&lt;br&gt;
summarize&lt;br&gt;
fetch_status&lt;/p&gt;

&lt;p&gt;These are usually fine to retry.&lt;/p&gt;

&lt;p&gt;Side-effecting tools&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;p&gt;send_email&lt;br&gt;
create_order&lt;br&gt;
create_ticket&lt;br&gt;
update_CRM&lt;/p&gt;

&lt;p&gt;These need an execution boundary.&lt;/p&gt;

&lt;p&gt;Irreversible / high-risk tools&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;p&gt;payment&lt;br&gt;
delete&lt;br&gt;
trade execution&lt;br&gt;
account mutation&lt;/p&gt;

&lt;p&gt;These need the strongest boundary:&lt;/p&gt;

&lt;p&gt;deterministic identity&lt;br&gt;
durable receipts&lt;br&gt;
replay-safe semantics&lt;br&gt;
often confirmation / policy checks&lt;/p&gt;
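
&lt;p&gt;The tiers above could be encoded as a small policy table. The tool names are taken from the examples in the text, lightly normalized. The control names, the assignment of controls to the middle tier, and the choice to treat unknown tools as high-risk are my assumptions, not an established scheme.&lt;/p&gt;

```python
from enum import Enum


class Risk(Enum):
    SAFE = "safe"                  # usually fine to retry
    SIDE_EFFECT = "side_effect"    # needs an execution boundary
    IRREVERSIBLE = "irreversible"  # needs the strongest boundary


# Hypothetical tool registry built from the examples in the text.
TOOL_RISK = {
    "search": Risk.SAFE,
    "read_file": Risk.SAFE,
    "summarize": Risk.SAFE,
    "fetch_status": Risk.SAFE,
    "send_email": Risk.SIDE_EFFECT,
    "create_order": Risk.SIDE_EFFECT,
    "create_ticket": Risk.SIDE_EFFECT,
    "update_CRM": Risk.SIDE_EFFECT,
    "payment": Risk.IRREVERSIBLE,
    "delete": Risk.IRREVERSIBLE,
    "trade_execution": Risk.IRREVERSIBLE,
    "account_mutation": Risk.IRREVERSIBLE,
}


def requirements_for(tool_name):
    """Return the controls a call must pass through before executing."""
    # Unknown tools default to the strictest tier rather than the loosest.
    risk = TOOL_RISK.get(tool_name, Risk.IRREVERSIBLE)
    if risk is Risk.SAFE:
        return []
    controls = [
        "deterministic_request_id",
        "durable_receipt",
        "replay_returns_receipt",
    ]
    if risk is Risk.IRREVERSIBLE:
        controls.append("confirmation_or_policy_check")
    return controls


for name in ("search", "send_email", "payment", "unknown_tool"):
    print(name, requirements_for(name))
```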

&lt;p&gt;The principle is simple:&lt;/p&gt;

&lt;p&gt;the more irreversible the action, the thicker the execution boundary should be&lt;/p&gt;

&lt;p&gt;What systems actually need&lt;/p&gt;

&lt;p&gt;In practice, most systems need some combination of:&lt;/p&gt;

&lt;p&gt;stable request / operation identity&lt;br&gt;
durable receipt storage&lt;br&gt;
replay-safe execution semantics&lt;br&gt;
result recovery&lt;br&gt;
explicit separation between “propose” and “execute”&lt;/p&gt;

&lt;p&gt;That can be implemented many ways.&lt;/p&gt;

&lt;p&gt;But the important thing is the architectural boundary itself.&lt;/p&gt;

&lt;p&gt;Because once a system can confidently answer:&lt;/p&gt;

&lt;p&gt;“yes, this already happened”&lt;/p&gt;

&lt;p&gt;then retries become much safer.&lt;/p&gt;

&lt;p&gt;Why this keeps showing up in agent systems&lt;/p&gt;

&lt;p&gt;Traditional systems already had this problem.&lt;/p&gt;

&lt;p&gt;Agents just make it more visible.&lt;/p&gt;

&lt;p&gt;Why?&lt;/p&gt;

&lt;p&gt;Because agents are:&lt;/p&gt;

&lt;p&gt;retry-heavy&lt;br&gt;
tool-using&lt;br&gt;
asynchronous&lt;br&gt;
failure-prone&lt;br&gt;
often layered on top of APIs that were never designed for autonomous replay&lt;/p&gt;

&lt;p&gt;So the moment an agent starts touching:&lt;/p&gt;

&lt;p&gt;payments&lt;br&gt;
orders&lt;br&gt;
emails&lt;br&gt;
browser actions&lt;br&gt;
external systems&lt;/p&gt;

&lt;p&gt;…uncertain completion becomes one of the most important production problems in the stack.&lt;/p&gt;

&lt;p&gt;Closing thought&lt;/p&gt;

&lt;p&gt;The scariest agent failure is often not:&lt;/p&gt;

&lt;p&gt;“the model made the wrong choice”&lt;/p&gt;

&lt;p&gt;It is:&lt;/p&gt;

&lt;p&gt;“the model made the right choice twice”&lt;/p&gt;

&lt;p&gt;And the reason that happens is usually not intelligence failure.&lt;/p&gt;

&lt;p&gt;It is:&lt;/p&gt;

&lt;p&gt;missing execution boundaries under uncertain completion&lt;/p&gt;

&lt;p&gt;Related&lt;/p&gt;

&lt;p&gt;I wrote a first piece on the execution-side pattern here:&lt;/p&gt;

&lt;p&gt;The Execution Guard Pattern for AI Agents&lt;br&gt;
&lt;a href="https://dev.to/azender1/the-execution-guard-pattern-for-ai-agents-23m9"&gt;https://dev.to/azender1/the-execution-guard-pattern-for-ai-agents-23m9&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And I’m also building a Python reference implementation around this idea:&lt;/p&gt;

&lt;p&gt;GitHub&lt;br&gt;
&lt;a href="https://github.com/azender1/SafeAgent" rel="noopener noreferrer"&gt;https://github.com/azender1/SafeAgent&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>backend</category>
      <category>architecture</category>
      <category>python</category>
    </item>
    <item>
      <title>The Execution Guard Pattern for AI Agents</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Sat, 28 Mar 2026 02:04:36 +0000</pubDate>
      <link>https://dev.to/azender1/the-execution-guard-pattern-for-ai-agents-23m9</link>
      <guid>https://dev.to/azender1/the-execution-guard-pattern-for-ai-agents-23m9</guid>
      <description>&lt;p&gt;AI agents don’t just think — they execute real-world actions.&lt;/p&gt;

&lt;p&gt;Payments. Trades. Emails. API calls.&lt;/p&gt;

&lt;p&gt;And under retries, timeouts, or crashes…&lt;/p&gt;

&lt;p&gt;they can execute the same action twice.&lt;/p&gt;

&lt;p&gt;Not because the model was wrong —&lt;br&gt;
because the system has no memory of execution.&lt;/p&gt;

&lt;p&gt;The hidden failure mode&lt;/p&gt;

&lt;p&gt;A typical failure path looks like this:&lt;/p&gt;

&lt;p&gt;agent decides to call tool&lt;br&gt;
→ tool executes side effect&lt;br&gt;
→ response is lost (timeout / crash / disconnect)&lt;br&gt;
→ system retries&lt;br&gt;
→ side effect executes again&lt;/p&gt;

&lt;p&gt;Now you have:&lt;/p&gt;

&lt;p&gt;duplicate payments&lt;br&gt;
duplicate trades&lt;br&gt;
duplicate emails&lt;br&gt;
duplicate API mutations&lt;/p&gt;

&lt;p&gt;Not because the decision was wrong —&lt;br&gt;
because the execution layer has no durable receipt.&lt;/p&gt;

&lt;p&gt;Retries are correct — and still dangerous&lt;/p&gt;

&lt;p&gt;Retries are necessary for reliability.&lt;/p&gt;

&lt;p&gt;But retries + irreversible side effects without a guard = replay risk.&lt;/p&gt;

&lt;p&gt;The system cannot confidently answer:&lt;/p&gt;

&lt;p&gt;“Did this action already happen?”&lt;/p&gt;

&lt;p&gt;So it does the only thing it can:&lt;/p&gt;

&lt;p&gt;→ tries again&lt;/p&gt;

&lt;p&gt;That’s fine for reads.&lt;/p&gt;

&lt;p&gt;It’s dangerous for writes.&lt;/p&gt;

&lt;p&gt;The Execution Guard Pattern&lt;/p&gt;

&lt;p&gt;The fix is not prompt engineering.&lt;/p&gt;

&lt;p&gt;It’s an execution boundary around side effects.&lt;/p&gt;

&lt;p&gt;Pattern:&lt;br&gt;
decision&lt;br&gt;
→ deterministic request_id&lt;br&gt;
→ execution guard&lt;br&gt;
   → if receipt exists → return prior result&lt;br&gt;
   → else → execute once → store receipt&lt;/p&gt;

&lt;p&gt;Instead of asking the model to “be careful,”&lt;br&gt;
the system itself becomes replay-safe.&lt;/p&gt;

&lt;p&gt;The four required properties&lt;/p&gt;

&lt;p&gt;For this pattern to work, you need four things:&lt;/p&gt;

&lt;p&gt;1) Deterministic request identity&lt;/p&gt;

&lt;p&gt;Every logical action must map to the same request_id across retries.&lt;/p&gt;

&lt;p&gt;If the same payment, email, trade, or tool call is retried, it must resolve to the same identity.&lt;/p&gt;

&lt;p&gt;2) Durable receipt storage&lt;/p&gt;

&lt;p&gt;You need a place to persist what happened.&lt;/p&gt;

&lt;p&gt;Postgres works well for this because it gives you:&lt;/p&gt;

&lt;p&gt;durable writes&lt;br&gt;
transactional boundaries&lt;br&gt;
strong uniqueness guarantees&lt;br&gt;
queryable auditability&lt;/p&gt;

&lt;p&gt;Without durable receipts, retries are guesswork.&lt;/p&gt;

&lt;p&gt;3) Atomic claim → execute → complete boundary&lt;/p&gt;

&lt;p&gt;The system needs a clear execution boundary:&lt;/p&gt;

&lt;p&gt;claim the operation&lt;br&gt;
execute the side effect once&lt;br&gt;
persist the result / receipt&lt;/p&gt;

&lt;p&gt;That boundary is what prevents:&lt;/p&gt;

&lt;p&gt;concurrent replays&lt;br&gt;
duplicate workers&lt;br&gt;
race-condition duplicates&lt;br&gt;
“two consumers did the same thing” bugs&lt;/p&gt;
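
&lt;p&gt;A minimal sketch of that boundary, using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; as a stand-in for Postgres so the example is self-contained. The table layout, the status values, and the function names are assumptions. The primary-key constraint is what makes the claim atomic: only one insert for a given &lt;code&gt;request_id&lt;/code&gt; can succeed.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE receipts ("
    "request_id TEXT PRIMARY KEY, status TEXT NOT NULL, result TEXT)"
)


def try_claim(request_id):
    """Atomically claim an operation; only one caller can win."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO receipts (request_id, status) VALUES (?, 'claimed')",
                (request_id,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # already claimed or completed by someone else


def complete(request_id, result):
    """Persist the receipt once the side effect has finished."""
    with conn:
        conn.execute(
            "UPDATE receipts SET status = 'completed', result = ? "
            "WHERE request_id = ?",
            (result, request_id),
        )


def guarded(request_id, side_effect):
    if try_claim(request_id):
        result = side_effect()
        complete(request_id, result)
        return result
    status, result = conn.execute(
        "SELECT status, result FROM receipts WHERE request_id = ?",
        (request_id,),
    ).fetchone()
    if status == "completed":
        return result
    # Claimed but never completed: the outcome is unknown, so do not re-run.
    raise RuntimeError("outcome unknown for " + request_id + "; reconcile first")


sent = []


def send_payment():
    """Stand-in side effect."""
    sent.append("pay")
    return "receipt-001"


a = guarded("pay:INV-9", send_payment)
b = guarded("pay:INV-9", send_payment)
print(a, b, len(sent))
```

&lt;p&gt;In Postgres the same claim is commonly written as &lt;code&gt;INSERT ... ON CONFLICT DO NOTHING&lt;/code&gt; followed by a check of the affected row count. The "claimed but not completed" branch is deliberate: it is the uncertain-completion case, and the sketch refuses to guess.&lt;/p&gt;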

&lt;p&gt;4) Replay returns the prior result&lt;/p&gt;

&lt;p&gt;If the same logical action comes in again,&lt;br&gt;
you should not execute it again.&lt;/p&gt;

&lt;p&gt;You should return the prior result.&lt;/p&gt;

&lt;p&gt;That turns:&lt;/p&gt;

&lt;p&gt;retries&lt;br&gt;
redelivery&lt;br&gt;
replay&lt;br&gt;
uncertain completion&lt;/p&gt;

&lt;p&gt;into:&lt;/p&gt;

&lt;p&gt;safe re-entry instead of duplicate side effects&lt;/p&gt;

&lt;p&gt;What this is NOT&lt;/p&gt;

&lt;p&gt;This is not:&lt;/p&gt;

&lt;p&gt;moderation&lt;br&gt;
prompt safety&lt;br&gt;
RBAC&lt;br&gt;
approval workflows&lt;br&gt;
hallucination prevention&lt;/p&gt;

&lt;p&gt;It solves one thing:&lt;/p&gt;

&lt;p&gt;“Did this irreversible action already happen?”&lt;/p&gt;

&lt;p&gt;That question shows up everywhere once agents or automations start calling real tools.&lt;/p&gt;

&lt;p&gt;Where this matters most&lt;/p&gt;

&lt;p&gt;This pattern matters anywhere your system causes real-world side effects:&lt;/p&gt;

&lt;p&gt;webhook handlers&lt;br&gt;
billing / payment flows&lt;br&gt;
async workers / queues&lt;br&gt;
workflow / automation systems&lt;br&gt;
AI agent tool calls&lt;br&gt;
external API mutations&lt;br&gt;
order / booking / ticket creation&lt;br&gt;
notifications and email sends&lt;/p&gt;

&lt;p&gt;In other words:&lt;/p&gt;

&lt;p&gt;anything that should happen once, even if the system retries&lt;/p&gt;

&lt;p&gt;Why this keeps showing up&lt;/p&gt;

&lt;p&gt;Modern systems are:&lt;/p&gt;

&lt;p&gt;distributed&lt;br&gt;
async&lt;br&gt;
retry-heavy&lt;br&gt;
failure-prone&lt;br&gt;
full of uncertain completion&lt;/p&gt;

&lt;p&gt;So “exactly once” does not happen naturally.&lt;/p&gt;

&lt;p&gt;You have to build it explicitly.&lt;/p&gt;

&lt;p&gt;And once you add:&lt;/p&gt;

&lt;p&gt;AI agents&lt;br&gt;
autonomous workflows&lt;br&gt;
tool-calling systems&lt;/p&gt;

&lt;p&gt;…the need for an execution boundary gets even sharper.&lt;/p&gt;

&lt;p&gt;Because now a model can repeatedly decide to invoke something that has real-world consequences.&lt;/p&gt;

&lt;p&gt;A practical implementation direction&lt;/p&gt;

&lt;p&gt;In many systems, this can be implemented with:&lt;/p&gt;

&lt;p&gt;a Postgres-backed receipt table&lt;br&gt;
a stable operation / request ID&lt;br&gt;
a guard layer around side-effecting functions&lt;/p&gt;

&lt;p&gt;That turns:&lt;/p&gt;

&lt;p&gt;unsafe retries&lt;/p&gt;

&lt;p&gt;into:&lt;/p&gt;

&lt;p&gt;safe replays&lt;/p&gt;

&lt;p&gt;This doesn’t require rewriting your whole system.&lt;/p&gt;

&lt;p&gt;It usually means identifying the small set of functions that can cause irreversible side effects and wrapping them with a durable execution boundary.&lt;/p&gt;

&lt;p&gt;That’s where the leverage is.&lt;/p&gt;

&lt;p&gt;Closing thought&lt;/p&gt;

&lt;p&gt;If an AI agent can call tools,&lt;br&gt;
it needs more than reasoning.&lt;/p&gt;

&lt;p&gt;It needs execution memory.&lt;/p&gt;

&lt;p&gt;Otherwise:&lt;/p&gt;

&lt;p&gt;retries will eventually execute something twice.&lt;/p&gt;

&lt;p&gt;Execution Risk Audit&lt;/p&gt;

&lt;p&gt;I’m currently looking at systems where retries, webhooks, workers, workflows, or AI agents can replay irreversible actions.&lt;/p&gt;

&lt;p&gt;If your system has paths where you can’t confidently answer:&lt;/p&gt;

&lt;p&gt;“Did this action already happen?”&lt;/p&gt;

&lt;p&gt;that’s exactly the kind of problem I’m focused on.&lt;/p&gt;

&lt;p&gt;Especially interested in:&lt;/p&gt;

&lt;p&gt;duplicate webhook execution&lt;br&gt;
retry-safe billing flows&lt;br&gt;
workflow steps with uncertain completion&lt;br&gt;
AI agents calling side-effecting tools&lt;/p&gt;

</description>
      <category>ai</category>
      <category>backend</category>
      <category>python</category>
      <category>postgres</category>
    </item>
  </channel>
</rss>
