Maxim Berg
How AI Agent Payments Actually Work — And Where They Break

OpenAI spent months building Instant Checkout — "Buy it in ChatGPT" with Stripe, Etsy, a million Shopify merchants. By March 2026, they pivoted away. Couldn't onboard merchants, couldn't show accurate product data, couldn't handle multi-item carts. They retreated to dedicated retailer apps that redirect users to merchant websites for the actual purchase.

Two weeks later, Fortune asked: "What do you do when your AI agent hallucinates with your money?"

Nobody has a good answer yet. Here's the map of why.

The payment stack as it exists today

In the last 12 months, every major player shipped something. Here's what exists:

Payment rails:

  • Stripe — Agentic Commerce Suite (Dec 2025). Shared Payment Tokens: scoped, time-limited, revocable credentials for agent transactions
  • Visa — Intelligent Commerce Connect (Apr 2026). Single API for agent purchases, tokenization, spend controls. 30+ sandbox partners
  • Mastercard — Agent Pay with Agentic Tokens. First live transaction Sep 2025, all U.S. cardholders enabled by Nov
  • PayPal — Agent Ready (Oct 2025). Agentic payments for existing merchants with built-in fraud detection
  • x402 — Coinbase's open protocol for stablecoin micropayments via HTTP 402. ~97M payments on Base. The x402 Foundation launched Apr 2026 under the Linux Foundation — 22 founding members including Coinbase, Stripe, Microsoft, Google, AWS, Visa, Mastercard, American Express, Shopify

Communication protocols:

  • MCP — donated to the Linux Foundation (Dec 2025). 97M monthly SDK downloads, 10,000+ servers. Payment MCP servers from Stripe, PayPal, Worldpay, Pagos, Fipto
  • A2A — Google's agent-to-agent protocol. 22K GitHub stars, 150+ organizations, deployed in Azure AI Foundry and Amazon Bedrock

Agent frameworks: LangChain, CrewAI, AutoGen, OpenAI Agents SDK, Claude tool use, Gemini agents.

Every layer is covered except one.

Anatomy of an agent payment

When an AI agent spends money, here's what actually happens — step by step:

1. Intent       → Agent decides it needs something
2. Discovery    → Agent finds the tool/API/merchant
3. Selection    → Agent picks what to buy and from whom
4. ???????????? → ????????????????????????????????????
5. Payment      → Money moves
6. Confirmation → Receipt, audit log

Step 4 is the problem.

Between "I want to buy this" and "money sent" — there is no standard layer that asks: should this agent spend this amount on this thing right now?

What "no standard layer" means, specifically:

  • Frameworks have monitoring, not enforcement. CrewAI has iteration caps. LangChain has observability hooks. Post-hoc cost tracking exists. Pre-execution enforcement of dollar-denominated policies does not. No framework understands "$50 on food" vs "$50 on compute."

  • Payment processors handle fraud, not policy. "Your agent shouldn't spend more than $200/day on SaaS" isn't fraud — it's governance. Different problem, different layer.

  • LLM providers offer org-level caps, not per-agent controls. Your agent blowing $500 on a single API call looks identical to 500 legitimate $1 calls.

So companies reinvent Step 4 every time. Hardcoded limits. Slack approval bots. "Please don't spend too much" in the system prompt.
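The reinvented Step 4 usually looks something like this hypothetical sketch: a hardcoded cap checked inline before the payment call, with no category awareness, no shared state across agents, and no audit trail.

```python
# A hypothetical, typical ad-hoc "Step 4": a hardcoded cap checked inline.
# No categories, no persistence, no audit trail, no structured denial.

DAILY_CAP_USD = 100.0
_spent_today = 0.0  # resets whenever the process restarts


def maybe_pay(amount_usd: float, pay_fn) -> bool:
    """Call pay_fn only if the hardcoded daily cap allows it."""
    global _spent_today
    if _spent_today + amount_usd > DAILY_CAP_USD:
        return False  # silently refuse; the agent never learns why
    _spent_today += amount_usd
    pay_fn(amount_usd)
    return True
```

Every team writes a slightly different version of this, and none of them compose.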

Where policies can't live

If you accept that governance belongs at Step 4, the next question is: who runs it?

Not in the prompt

"Please limit spending to $100 per day" in a system prompt is not a spending control. It's a suggestion.

LLMs hallucinate. They reinterpret instructions. They prioritize task completion over constraints. And with prompt injection, an attacker can override your rules entirely. Security researchers have documented patterns of gradual prompt-based escalation: agents manipulated through "clarification" messages over days or weeks, each interaction nudging the spending authorization boundary until the agent operates well beyond its original constraints.

That's not a guardrail. That's a prayer.

And the tooling layer itself is under pressure. In April 2026, OX Security disclosed RCE vulnerabilities in MCP implementations — the same protocol that Stripe, PayPal, and Worldpay use for agent payments. Anthropic disputes the severity. But both sides agree that tool-level security depends on the user correctly evaluating each action. A compromised MCP server can alter transaction amounts and redirect payments. Prompt-based spending controls and tool-level trust are separate problems.

Not in the payment processor

Stripe, Visa, and Mastercard are building excellent infrastructure. But it operates at the transaction level, not the intent level.

A processor sees: "charge $47.99, category: food_delivery." It doesn't see: "this agent has a $15/person lunch budget and already spent $120 today." Hard limits on the card can't enforce contextual business rules.
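To make the gap concrete, here is an illustrative comparison (field names are hypothetical, not any real processor's schema): the processor's transaction record simply lacks the fields a contextual check needs.

```python
# What the processor sees: one isolated transaction.
processor_view = {"amount_usd": 47.99, "category": "food_delivery"}

# What contextual governance needs: intent plus accumulated state.
# (Field names are illustrative, not any real processor's schema.)
policy_view = {
    "amount_usd": 47.99,
    "category": "food_delivery",
    "per_person_lunch_budget_usd": 15.00,
    "spent_today_usd": 120.00,
    "daily_cap_usd": 150.00,
}


def contextual_ok(view: dict) -> bool:
    """Deny if this charge would push the agent past its daily cap."""
    return view["spent_today_usd"] + view["amount_usd"] <= view["daily_cap_usd"]
```

The check itself is trivial; the problem is that the budget and the running total live in a layer the processor never sees.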

Not in the agent framework

LangChain and CrewAI control tool execution. They can intercept a function call, log it, even block it. But they don't understand financial semantics. "$50 on food" and "$50 on cloud compute" trigger the same callback. The framework doesn't know your daily food budget is $30 and your compute budget is $500.

You could build this logic inside the framework. People do. That's the "writing authentication from scratch before OAuth" problem.

Where they belong: a dedicated middleware layer

The pattern that works is a separate policy layer between intent and execution.

The agent says "I want to spend X on Y." The policy layer checks rules deterministically — not with an LLM, with code — and returns approve, deny, or escalate. Then (and only then) the payment happens.

This is the same architectural pattern as:

  • OAuth — doesn't live in the browser or the database. Separate auth layer
  • OPA — doesn't live in the app or the infrastructure. Separate policy engine
  • Firewalls — don't live in the OS kernel or the application. Separate network layer

Agent spending governance is infrastructure, not application logic.
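As a sketch of that middleware pattern (all names hypothetical): the agent expresses intent, a deterministic policy layer returns approve, deny, or escalate, and the payment call only runs on approval.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    verdict: str  # "approve" | "deny" | "escalate"
    reason: str


def check_policy(agent_id: str, amount_usd: float, category: str) -> Decision:
    """Deterministic stand-in for a real policy engine. No LLM involved."""
    if category == "gambling":
        return Decision("deny", "category not allowed")
    if amount_usd > 200:
        return Decision("escalate", "above per-request limit, needs human approval")
    return Decision("approve", "within policy")


def spend(agent_id: str, amount_usd: float, category: str, pay_fn) -> Decision:
    decision = check_policy(agent_id, amount_usd, category)
    if decision.verdict == "approve":
        pay_fn(amount_usd)  # only now does money move
    return decision  # the agent gets a structured, explainable answer
```

The point of the `escalate` verdict is that not everything is approve-or-deny: some requests should pause and wait for a human, and that path needs to be a first-class outcome, not an exception.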

What governance actually checks

A policy engine for agent spending evaluates requests against declarative rules:

| Check | Question | Example |
| --- | --- | --- |
| Agent status | Is this agent active? | Disabled agents can't spend |
| Category | Is this category allowed? | "gambling" → denied |
| Per-request limit | Is this single purchase too large? | $500 request, $200 limit → denied |
| Schedule | Is spending allowed right now? | Procurement agent outside business hours → denied |
| Daily limit | Has the agent hit today's cap? | $450 spent today, $500 limit, requesting $100 → denied |
| Weekly limit | This week's cap? | Same logic, wider window |
| Monthly limit | This month's cap? | Same logic, wider window |
| Total budget | Lifetime budget remaining? | $4,800 of $5,000 spent, requesting $300 → denied |

Every check is deterministic. No LLM in the loop. The agent gets back a structured response — approved with budget remaining, or denied with a specific reason. A well-behaved agent adjusts. The enforcement must be deterministic; an LLM can translate human intent into policy JSON, but it shouldn't be in the enforcement loop.
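A minimal sketch of such an engine, implementing a few of the checks from the table as pure functions over declarative rules (rule values and field names are illustrative):

```python
# Declarative rules for one agent; values are illustrative.
RULES = {
    "active": True,
    "allowed_categories": {"food", "saas", "compute"},
    "per_request_limit_usd": 200.0,
    "daily_limit_usd": 500.0,
    "total_budget_usd": 5000.0,
}


def evaluate(rules: dict, request: dict, state: dict) -> tuple:
    """Run the deterministic checks in order; first failure wins."""
    if not rules["active"]:
        return False, "agent disabled"
    if request["category"] not in rules["allowed_categories"]:
        return False, "category not allowed"
    if request["amount_usd"] > rules["per_request_limit_usd"]:
        return False, "exceeds per-request limit"
    if state["spent_today_usd"] + request["amount_usd"] > rules["daily_limit_usd"]:
        return False, "would exceed daily limit"
    if state["spent_total_usd"] + request["amount_usd"] > rules["total_budget_usd"]:
        return False, "would exceed total budget"
    return True, "approved"
```

Note that every branch returns a specific reason, which is what lets a well-behaved agent adjust instead of retrying blindly.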

Two types of agent spending

A distinction most articles miss. There are two fundamentally different kinds of agent purchases, and they need different payment rails but the same governance layer:

Machine-consumable resources — APIs, compute, data, cloud services. High frequency, small amounts, no physical delivery. This is where x402 shines: agent hits an API, gets a 402 response with payment instructions, pays in USDC on Base, retries with proof. Sub-second. Sub-cent.
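The x402 loop can be sketched as plain request/pay/retry (a simplified in-process simulation; real implementations follow the protocol spec, carry the proof in an `X-PAYMENT` header, and sign an actual on-chain USDC transfer, all of which is stubbed out here):

```python
# Simplified simulation of the x402 request/pay/retry loop.
# Server and wallet are in-process stubs, not real HTTP or on-chain calls.

def server(request_headers: dict):
    if "X-PAYMENT" not in request_headers:
        # HTTP 402 with machine-readable payment instructions
        return 402, {"amount_usdc": "0.003", "pay_to": "0xMERCHANT"}
    return 200, "here is your data"


def wallet_pay(instructions: dict) -> str:
    """Stand-in for a signed stablecoin transfer; returns a proof token."""
    return f"paid:{instructions['amount_usdc']}:{instructions['pay_to']}"


def fetch_with_x402(headers=None) -> str:
    headers = dict(headers or {})
    status, body = server(headers)
    if status == 402:
        headers["X-PAYMENT"] = wallet_pay(body)  # pay, then retry with proof
        status, body = server(headers)
    assert status == 200
    return body
```

The whole loop is two requests and one payment, which is why it suits sub-cent, sub-second machine-to-machine purchases.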

Human-consumable goods — food delivery, SaaS subscriptions, physical products. Lower frequency, larger amounts, complex fulfillment. Stripe, Visa, Mastercard territory.

An agent ordering compute for $0.003 and ordering lunch for $15 need completely different payment rails. But the question "should this agent spend this amount right now?" is identical. A unified policy layer tracks spending across both rails in USD-equivalent and maintains one audit trail.
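One way to keep that single audit trail is to normalize every event to a USD equivalent before the policy check, whatever rail it arrived on (a sketch; a real system would use live conversion rates and exact decimal arithmetic):

```python
from collections import defaultdict

# Illustrative conversion to a common unit.
USD_EQUIVALENT = {"usd": 1.0, "usdc": 1.0}


class UnifiedLedger:
    """One audit trail across micro (x402-style) and card-rail spending."""

    def __init__(self):
        self.entries = []
        self.totals = defaultdict(float)  # per-agent USD-equivalent total

    def record(self, agent_id: str, amount: float, currency: str, rail: str) -> float:
        usd = amount * USD_EQUIVALENT[currency]
        self.entries.append(
            {"agent": agent_id, "usd": usd, "rail": rail, "currency": currency}
        )
        self.totals[agent_id] += usd
        return usd
```

With one ledger, "has this agent hit today's cap?" has the same answer whether the spend was $0.003 of compute or a $15 lunch.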

The liability question

If an agent spends $12,000 instead of $500, who pays? The platform? The user who set the rules? The card issuer? The merchant?

The EU's PSD2 requires "strong customer authentication" — a framework that doesn't account for non-human actors. An agent can't do biometric verification. It can't confirm intent through a second device. Regulatory frameworks assume a human in the loop, and agents break that assumption.

This is why compliance teams will require governance layers before agents get payment access. Without an auditable, deterministic policy check between intent and payment, there's no answer to "who approved this?" that satisfies a regulator.
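What a regulator-facing answer to "who approved this?" can look like in practice: each policy decision persisted as a structured, append-only record (field names are hypothetical):

```python
import json
from datetime import datetime, timezone


def audit_record(agent_id: str, amount_usd: float, category: str,
                 verdict: str, rule_id: str) -> str:
    """One append-only JSON line per decision: who, what, why, under which rule."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "amount_usd": amount_usd,
        "category": category,
        "verdict": verdict,   # approve / deny / escalate
        "rule_id": rule_id,   # the specific policy that decided
    })
```

Because the check is deterministic, the same record answers both the compliance question ("which rule approved this?") and the debugging question ("why was this denied?").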

What comes next

Short term (2026): Basic policy engines. Per-agent budgets, category restrictions, time limits, approval thresholds. Companies will require this the way they require SSO — because compliance demands it. FINRA already flagged agents "acting beyond the user's actual or intended scope and authority."

Medium term (2027): Contextual policies. "Max $200/request for compute, $50 for food, unlimited for pre-approved vendors." Corporate purchasing has done this for humans for decades, but agents operate at machine speed across dozens of tools, generating hundreds of transactions per hour. An agent can't be pulled into a meeting to justify a purchase. The governance layer encodes business context upfront. Multi-agent governance follows: agent A delegates budget to agent B with scoped authority.
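Delegation with scoped authority could be as simple as carving a child budget out of the parent's remaining allowance, so agent B can never spend beyond what agent A explicitly granted (a speculative sketch):

```python
class Budget:
    """A budget that can delegate a scoped slice of itself to a sub-agent."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def remaining(self) -> float:
        return self.limit_usd - self.spent_usd

    def spend(self, amount_usd: float) -> bool:
        if amount_usd > self.remaining():
            return False
        self.spent_usd += amount_usd
        return True

    def delegate(self, amount_usd: float) -> "Budget":
        """Agent A reserves part of its own budget for agent B."""
        if not self.spend(amount_usd):
            raise ValueError("cannot delegate more than remaining budget")
        return Budget(amount_usd)
```

The key property: delegation debits the parent up front, so the combined spending of A and B can never exceed A's original limit.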

Long term (2028+): Adaptive policies. Anomaly detection for waste, not just fraud. Cross-org benchmarks: "agents in your industry typically spend $X on Y."

Nava just raised $8.3M to build escrow for agent transactions. SolvaPay raised €2.4M for agentic payment infrastructure. Two funded startups in one week, both solving variations of the same problem. Market forecasts range from $547M (Sanbi.ai, 2033) to $1.5T (Juniper Research, 2030). The real number depends on trust. And trust requires governance.

The firewall moment

We've been here before. Authentication before OAuth. Authorization before OPA. Network security before firewalls. Every time: "each team builds their own" → "there's a standard layer for this."

Agent spending governance is at the "each team builds their own" stage. Vendor surveys say 80% of organizations report risky agent behaviors. Take that with a grain of salt. But the direction is clear, and every month the payment stack makes it easier for agents to spend.

The capability layer is built. The governance layer is next. Standards bodies are working on it. The question is whether it'll happen before or after the first headline-making incident.


Disclosure: I'm building an open-source approach to this at LetAgentPay — policy engine with Python/TypeScript SDKs and an MCP server — so I'm not a neutral observer. But the architectural pattern described here matters more than any single implementation. If you're building agents that spend money, I'd genuinely love to hear how you're handling governance today.
