DEV Community

Anthony Zender

Mastercard just launched Agent Pay for Machines. Here's the execution gap they didn't mention.

On June 10, Mastercard launched Agent Pay for Machines with Stripe, Coinbase, Adyen, and 30 other partners. It covers agent identity, spend limits, and payment settlement.

It doesn't cover what happens when the agent crashes after the payment fires.

The gap nobody is talking about

Here's the failure mode:

  1. Agent calls create_payment_intent
  2. Stripe processes the charge
  3. Agent crashes before receiving confirmation
  4. Orchestrator retries
  5. Agent calls create_payment_intent again
  6. Stripe processes the charge again

Identity verified. Spend limit not exceeded. Two charges. One customer.
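
To make the sequence concrete, here is a small simulation of it. Everything in it is a stand-in: `FakeProcessor`, `AgentCrash`, `agent_step`, and `orchestrate` are hypothetical names for illustration, not part of Stripe, Mastercard Agent Pay, or any agent framework.

```python
class FakeProcessor:
    """Stand-in for a payment API: records every charge it processes."""

    def __init__(self):
        self.charges = []

    def create_payment_intent(self, customer_id: str, amount: int) -> str:
        self.charges.append((customer_id, amount))
        return f"pi_{len(self.charges)}"


class AgentCrash(Exception):
    """Simulates the agent process dying mid-task."""


def agent_step(processor, customer_id, amount, crash_after_charge):
    intent_id = processor.create_payment_intent(customer_id, amount)
    if crash_after_charge:
        # The charge already went through, but the confirmation is lost.
        raise AgentCrash("crashed before receiving confirmation")
    return intent_id


def orchestrate(processor, customer_id, amount):
    """Naive orchestrator: retry once if the agent crashes."""
    try:
        return agent_step(processor, customer_id, amount, crash_after_charge=True)
    except AgentCrash:
        return agent_step(processor, customer_id, amount, crash_after_charge=False)


processor = FakeProcessor()
orchestrate(processor, "cus_123", 5000)
print(len(processor.charges))  # 2: one customer, two charges
```

Nothing in the retry path knows the first charge succeeded, so the orchestrator does exactly what it was told to do.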

This has happened in production. With LangChain, it produced $47K in duplicate transactions; with LangGraph, $4.2K over a weekend. In my own live trading session, six duplicate executions were blocked, $3,653 in total exposure.

The idempotentHint annotation in MCP tells clients a tool can be safely retried. It doesn't prevent the side effect from firing twice. It's advisory, not a guard.

The fix: claim before execute

Before any irreversible action, derive a deterministic request_id from the action's inputs and claim it in durable storage outside the execution context. If the agent crashes and retries, the guard returns the cached result without re-executing.

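import requests
import stripe  # assumes stripe.api_key is configured elsewhere

def safe_payment(agent_id: str, customer_id: str, amount: int):
    # Deterministic scope: the same customer and amount always yield the same claim.
    scope = f"payment:stripe:{customer_id}:{amount}"

    # Claim the action before executing it.
    claim = requests.post(
        "https://safeagent-production.up.railway.app/claim/test",
        json={
            "agent_id": agent_id,
            "action_type": "payment.send",
            "scope": scope
        },
        timeout=10
    ).json()

    # SKIP means this scope was already claimed: return the earlier result
    # instead of charging again.
    if claim["status"] == "SKIP":
        return claim["existing"]

    result = stripe.PaymentIntent.create(
        amount=amount,
        currency="usd",
        customer=customer_id
    )

    # Settle the claim once the charge has gone through.
    requests.post(
        f"https://safeagent-production.up.railway.app/settle/{claim['request_id']}",
        timeout=10
    )

    return result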
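
If you want to see the claim step without a hosted service, here is a minimal local sketch. It is not SafeAgent's implementation: `request_id_for`, `ClaimStore`, and `guarded` are hypothetical names, and SQLite stands in for whatever durable store you use. The one thing it relies on is an atomic "insert if absent", so exactly one caller wins the claim.

```python
import hashlib
import json
import sqlite3


def request_id_for(action_type: str, inputs: dict) -> str:
    """Deterministic ID: identical action and inputs always hash to the same value."""
    canonical = json.dumps(
        {"action": action_type, "inputs": inputs},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ClaimStore:
    """Hypothetical claim store on SQLite. Pass a file path for real durability."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS claims "
            "(request_id TEXT PRIMARY KEY, result TEXT)"
        )
        self.db.commit()

    def claim(self, request_id: str) -> bool:
        """True if this call created the claim, False if it already existed."""
        cur = self.db.execute(
            "INSERT OR IGNORE INTO claims (request_id, result) VALUES (?, NULL)",
            (request_id,),
        )
        self.db.commit()
        return cur.rowcount == 1

    def settle(self, request_id: str, result) -> None:
        """Store the result (must be JSON-serializable) for later retries."""
        self.db.execute(
            "UPDATE claims SET result = ? WHERE request_id = ?",
            (json.dumps(result), request_id),
        )
        self.db.commit()

    def existing(self, request_id: str):
        row = self.db.execute(
            "SELECT result FROM claims WHERE request_id = ?", (request_id,)
        ).fetchone()
        return json.loads(row[0]) if row and row[0] is not None else None


def guarded(store: ClaimStore, action_type: str, inputs: dict, execute):
    """Claim first, execute only if the claim was won, then settle."""
    rid = request_id_for(action_type, inputs)
    if not store.claim(rid):
        return "SKIP", store.existing(rid)
    result = execute(**inputs)
    store.settle(rid, result)
    return "PROCEED", result


# Usage with a fake charge function standing in for the payment call.
charges = []

def charge(customer_id, amount):
    charges.append((customer_id, amount))
    return {"id": f"pi_{len(charges)}"}

store = ClaimStore()
inputs = {"customer_id": "cus_123", "amount": 5000}
first = guarded(store, "payment.send", inputs, charge)
retry = guarded(store, "payment.send", inputs, charge)
print(first[0], retry[0], len(charges))  # PROCEED SKIP 1
```

Two caveats with a sketch like this. If the process dies between the claim and the settle, a retry gets SKIP with no stored result, so unsettled claims need their own reconciliation path. And because the ID is derived only from the inputs, two intentional payments of the same amount to the same customer collapse into one unless the inputs also carry something like an order or invoice ID.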

Same pattern works with LangChain tools:

from langchain.tools import tool
import requests
import stripe  # assumes stripe.api_key is configured elsewhere

@tool
def create_payment(customer_id: str, amount: int) -> str:
    """Create a payment. Exactly-once guarded."""

    # Claim the action before executing it.
    claim = requests.post(
        "https://safeagent-production.up.railway.app/claim/test",
        json={
            "agent_id": "langchain-agent",
            "action_type": "payment.send",
            "scope": f"stripe:{customer_id}:{amount}"
        },
        timeout=10
    ).json()

    # SKIP means this scope was already claimed: report it instead of charging again.
    if claim["status"] == "SKIP":
        return f"Already processed: {claim['existing']}"

    result = stripe.PaymentIntent.create(
        amount=amount, currency="usd", customer=customer_id
    )

    # Settle the claim once the charge has gone through.
    requests.post(
        f"https://safeagent-production.up.railway.app/settle/{claim['request_id']}",
        timeout=10
    )

    return result.id

Why this matters now

Mastercard's Agent Pay for Machines (AP4M) validates the market. Agents are going to make payments at scale. The identity and spend-limit problems are solved. The execution safety problem is not.

This week, four independent implementations shipped byte-verifiable conformance fixtures for the complete execution safety stack:

  • kenneives (agentgraph) — verifier admission: is this agent allowed to make this payment?
  • evidai (LemonCake) — gated reserve: reserve funds, verify attestation, clamp to budget
  • haroldmalikfrimpong-ops (agentid) — independent verifier-side check
  • SafeAgent — exactly-once execution guard: PROCEED on first call, SKIP on retry

11/11 cross-implementation binding digests byte-identical. 33/33 gateway assertions pass. 30/30 verifier assertions pass. All independently verifiable — no runtime trust required.
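
For readers wondering how separate implementations can agree byte for byte, one common approach is to hash a canonical serialization of the same fields. The sketch below only illustrates that idea: `binding_digest` is a hypothetical helper, the field names are borrowed from the claim request shown in this post, and the actual fixture format in the repo may differ.

```python
import hashlib
import json


def binding_digest(binding: dict) -> str:
    """SHA-256 over canonical JSON: sorted keys, no extra whitespace, UTF-8."""
    canonical = json.dumps(
        binding, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two implementations that build the same binding in a different key order.
impl_a = {"agent_id": "my-agent", "action_type": "payment.send", "scope": "test-123"}
impl_b = {"scope": "test-123", "action_type": "payment.send", "agent_id": "my-agent"}

print(binding_digest(impl_a) == binding_digest(impl_b))  # True
```

Because the digest depends only on the bytes, anyone can recompute it from the fixture and compare, without trusting the runtime that produced it.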

evidai said it best in the A2A RFC thread: "nonce + exactly-once guard together give replay safety; a standalone normative nonce field without the guard would not."

Try it free

pip install safeagent-exec-guard

Or test the hosted endpoint directly — no auth required:

curl -X POST https://safeagent-production.up.railway.app/claim/test \
  -H "Content-Type: application/json" \
  -d '{"agent_id":"my-agent","action_type":"payment.send","scope":"test-123"}'

First call: {"status": "PROCEED"}

Same call again: {"status": "SKIP"}

The conformance fixtures, verify scripts, and cross-impl check are at github.com/azender1/SafeAgent.

If your agent touches payments, emails, webhooks, or trades — and it retries on failure — this is the gap in your stack.
