Who decides an AI agent's trade is 'complete'? Escrow needs a judge. Atomic settlement doesn't.

#mcp #ai #cryptocurrency #blockchain

A new standard for autonomous-agent commerce now has a live implementation, and it's worth reading closely - not because it competes with atomic settlement, but because it draws the line between two settlement philosophies more clearly than anything I've seen so far.

The standard is ERC-8183, the Agentic Commerce Protocol, launched earlier this year by the Ethereum Foundation's dAI team and Virtuals Protocol. The implementation is BNB Chain's BNBAgent SDK, which the team describes as the first live build of the spec (shipped on testnet in March 2026, mainnet pending). If you build for AI agents, both are worth understanding on their own terms. They're also the clearest mirror I've found for explaining what "atomic settlement" actually means.

What ERC-8183 does

ERC-8183 models commerce as a job with an escrowed budget. There are three roles:

a Client who posts the job and funds it,
a Provider who performs the work,
an Evaluator - a designated third party who decides whether the work was completed.

The job moves through four states: Open → Funded → Submitted → Terminal, where Terminal covers the three possible endings: completed, rejected, or expired. The client funds the budget into escrow. The provider submits a deliverable. Then the evaluator - and only the evaluator - attests that the job is complete (or rejects it), and the escrow releases accordingly. If the job expires, the client gets refunded.
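To make the lifecycle concrete, here is a minimal in-memory sketch of that state machine in Python. It is not the ERC-8183 interface: the `Job` class, its method names, the integer clock, and the rule that expiry can be triggered from either Funded or Submitted are all assumptions made for illustration. The one thing it is meant to show is that the only path from Submitted to a payout runs through the evaluator.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    OPEN = "open"
    FUNDED = "funded"
    SUBMITTED = "submitted"
    TERMINAL = "terminal"


@dataclass
class Job:
    # All names here are hypothetical; this is a toy model, not the on-chain spec.
    client: str
    provider: str
    evaluator: str
    budget: int
    expires_at: int                  # assumed: a plain integer timestamp
    state: State = State.OPEN
    escrow: int = 0
    paid_to: Optional[str] = None    # who received the escrow once terminal

    def fund(self, caller: str) -> None:
        self._require(caller == self.client and self.state is State.OPEN)
        self.escrow = self.budget
        self.state = State.FUNDED

    def submit(self, caller: str) -> None:
        self._require(caller == self.provider and self.state is State.FUNDED)
        self.state = State.SUBMITTED

    def evaluate(self, caller: str, accepted: bool) -> None:
        # Only the evaluator can turn a submission into a payout or a refund.
        self._require(caller == self.evaluator and self.state is State.SUBMITTED)
        self._release(self.provider if accepted else self.client)

    def expire(self, now: int) -> None:
        # Assumption: an unfinished job can be expired from Funded or Submitted.
        active = self.state in (State.FUNDED, State.SUBMITTED)
        self._require(active and now >= self.expires_at)
        self._release(self.client)

    def _release(self, recipient: str) -> None:
        self.paid_to = recipient
        self.escrow = 0
        self.state = State.TERMINAL

    @staticmethod
    def _require(ok: bool) -> None:
        if not ok:
            raise PermissionError("not allowed for this caller or in this state")


job = Job("client", "provider", "evaluator", budget=100, expires_at=1_000)
job.fund("client")
job.submit("provider")
job.evaluate("evaluator", accepted=True)   # escrow goes to the provider
```

If the evaluator never calls `evaluate`, the funds sit in escrow until `expire` becomes callable. That waiting period is the dependency the rest of this post is about.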

This is a sensible design for a real class of problems. A lot of agent "commerce" is genuinely work-for-hire: do a task, produce a deliverable, get paid if it's acceptable. Acceptability is subjective, so you need someone to judge it. ERC-8183 makes that judge a first-class role and standardizes the lifecycle around it. BNBAgent SDK goes further and routes disputes through UMA's data-verification mechanism, adding an arbitration layer the base spec deliberately leaves out.

So far, so reasonable. The interesting part is the assumption baked into the shape of it: someone has to decide that the deal is done.

What atomic settlement removes

Now hold that model next to a hash-time-locked contract (HTLC), the primitive behind atomic cross-chain settlement.

In an HTLC swap, two parties each lock their side of the trade into a contract keyed to the hash of the same secret. Claiming one leg means publishing the secret on-chain, and that published secret is exactly what unlocks the other leg. Either both legs settle, or - if the timeout passes without a reveal - both refund. There is no end state where one party is paid and the other is left waiting on someone's decision.
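The mechanics fit in a few lines. The sketch below is a toy, in-memory model in Python, not Hashlock's contracts or any chain's actual HTLC code: the `HTLC` class, the party names, and the integer clock are hypothetical. It also assumes the usual staggered timeouts, where the leg claimed first expires earlier, so the second party still has time to use the secret once it is public.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class HTLC:
    # Hypothetical toy model of one leg of a swap.
    sender: str
    recipient: str
    amount: int
    hashlock: bytes                     # SHA-256 digest of the secret
    timeout: int                        # assumed integer clock; refund opens here
    revealed: Optional[bytes] = None    # the preimage is public once claimed
    paid_to: Optional[str] = None

    def claim(self, preimage: bytes, now: int) -> None:
        if self.paid_to is not None:
            raise RuntimeError("already settled")
        if now >= self.timeout:
            raise RuntimeError("timed out; only a refund is possible")
        if hashlib.sha256(preimage).digest() != self.hashlock:
            raise ValueError("preimage does not match the hashlock")
        self.revealed = preimage        # claiming necessarily exposes the secret
        self.paid_to = self.recipient

    def refund(self, now: int) -> None:
        if self.paid_to is not None:
            raise RuntimeError("already settled")
        if now < self.timeout:
            raise RuntimeError("timeout not reached yet")
        self.paid_to = self.sender


secret = b"only alice knows this"
lock = hashlib.sha256(secret).digest()

# Alice locks asset A for Bob; Bob locks asset B for Alice under the same hash.
# Bob's leg expires first, so Alice has to reveal while Bob can still react.
leg_a = HTLC("alice", "bob", amount=10, hashlock=lock, timeout=200)
leg_b = HTLC("bob", "alice", amount=500, hashlock=lock, timeout=100)

leg_b.claim(secret, now=50)            # Alice claims B and reveals the secret
leg_a.claim(leg_b.revealed, now=60)    # Bob reuses the revealed secret to claim A
```

Nothing in this model has an `approve` or `evaluate` method. The only inputs a leg accepts are a preimage and the clock, which is the property the next section turns on.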

The thing to notice: no one marks this trade "complete." There is no evaluator, no referee, no attestation step. "Complete" is not a verdict that a trusted party hands down. It's a consequence of the secret being revealed on-chain. The contract can't be talked into releasing funds early, and it can't be persuaded to withhold them. Completion is the cryptography.

That difference - judge vs. no judge - is the whole story, and it maps onto what kind of transaction you're doing:

|  | ERC-8183 escrow + evaluator | HTLC atomic settlement |
| --- | --- | --- |
| Best fit | Subjective work-for-hire (was the task done well?) | Objective value-for-value swap (did asset A move for asset B?) |
| Who holds funds | Escrow contract, mid-job | No one holds the other side's funds at any point |
| Who decides "done" | Designated evaluator (third party) | Nobody - secret reveal settles or timeout refunds |
| Trust assumption | Evaluator is honest/available | Hash function + timeout logic |
| Failure mode | Evaluator is wrong, captured, or offline | Timeout risk; capital locked during the window |

Neither column is "better" in the abstract. They answer different questions. If the deal is "I'll pay you to write me a report and someone has to judge the report," you want an evaluator. If the deal is "I'll give you X if you give me Y, simultaneously, no take-backs," you don't want a judge anywhere near it - a judge is just a new party who can fail, get captured, or go offline while your funds sit in escrow.

Why this matters for the agent economy

Most of the agent-payment infrastructure shipping right now - escrow protocols, payment facilitators, custodial settlement rails - is converging on a model where something in the middle holds funds and someone signals release. That's the right shape for merchant payments and for work-for-hire. It is the wrong shape for two agents swapping assets across chains, because it reintroduces exactly the counterparty and custody risk that on-chain settlement was supposed to delete.

The honest framing isn't "escrow bad, atomic good." It's: escrow holds, atomic settles, and an agent trading value-for-value with a stranger doesn't need a referee in the loop to do it. When an AI agent trades with another agent it has never met, the question that actually matters is not "who decides this is complete?" - it's "can this complete without anyone deciding?" For a clean asset swap, it can.

Where this is real today, and where it isn't

Being precise about claims is part of the point here, so: Hashlock's atomic settlement is live end-to-end on Ethereum mainnet. Sui contracts are deployed and CLI-tested with gateway wiring in progress; Bitcoin is signet-validated with mainnet still pending. The MCP server exposes six tools that let an agent run sealed-bid RFQ price discovery and HTLC settlement directly - no bridge, no custodian, no evaluator. Same discipline I'd ask of any protocol: testnet is not mainnet, and "deployed" is not "live."

If you want the mechanics, the protocol is on the web at hashlock.markets, the MCP package is hashlock-tech/mcp (scoped) on npm, and the academic write-up of the design is on SSRN.

The open question

Both models are going to ship, and both will find their users. The interesting fork is which one becomes the default mental model for "agent commerce." If the default is "post a job, fund escrow, wait for the evaluator," we've rebuilt the gig economy on-chain with a judge in every transaction. If the default for value-for-value swaps is "lock, reveal, settle - or refund," we've actually removed the middleman instead of renaming it.

When two of your agents trade value-for-value across chains, do you want a third party deciding it's complete - or do you want it to just clear with no one in the middle? Curious where people land on this.