Baris Sozen

Posted on May 5

Six tools for an agent, six hundred methods for a human: why agent-shaped APIs beat SDK-shaped APIs for trading

#mcp #ai #cryptocurrency #blockchain

The most common pushback we hear about Hashlock Markets is one sentence long.

"Six MCP tools? That feels thin."

It's a fair instinct if your reference frame is the SDK era. A serious trading SDK is a thousand-method surface. CCXT alone wraps a hundred-plus exchanges and exposes a unified interface that has been growing for years. The Bybit Python SDK has ninety-something endpoints. A modern broker SDK reads more like a textbook than a library.

When you put that mental model next to a six-tool MCP server, it feels under-engineered. The instinct is correct under one assumption — that a human developer will write the integration code. Under that assumption, more methods means fewer custom helpers, less work, more flexibility.

Here's the thing. Most of the trades inside an AI agent aren't being designed by a human developer. They're being described in natural language by a user, parsed by the model, and called by a planner that has to keep every method's behavior in its head as part of the prompt context. The SDK-era assumption doesn't hold anymore.

Once that assumption breaks, six tools stops looking thin. It starts looking like the right number.

This post lays out the argument. It's a top-of-funnel piece — no jargon-heavy walkthrough, no schema dumps. Just the case for why the trading surface an agent actually calls should be deliberately small, and why we built ours that way.

The SDK-shaped API is a human convenience

An SDK is a productivity tool for a developer who is going to write a non-trivial amount of code on top of it. It is deliberately granular: fifty methods, because some user might want to filter by exactly that field, paginate exactly that way, or stitch exactly those two endpoints together.

Granularity is correct for a human, because a human can read docs once and remember the layout. The cost of one hundred methods is amortized across years of an integration's life. The benefit is flexibility for the unknown shape of the integration.

The economics flip when the consumer of the API is itself a model.

Now every method is something the model has to either keep in its prompt or look up at runtime. Every version bump is a small regression risk. Every "undocumented edge case in method 87" becomes a silent failure mode that a non-deterministic planner can rediscover in production. The granularity that was free for a human becomes a tax on the agent.

The unit economics of an agent-callable API:

Every tool is context. Tools that the model never calls still cost prompt tokens to describe, every single turn.
Every tool is a branch the planner can pick wrong. More tools means more chances for the planner to confuse two semantically-close behaviors.
Every version pin is a regression risk. A breaking change in the sixteenth-most-used method can silently break trades.
Every edge case is a non-deterministic failure mode. Models sometimes call APIs in ways no human would. The fewer surfaces, the fewer of those.
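
To put a rough number on the first point, here is a back-of-the-envelope sketch. The per-tool token figure and the turn count are illustrative assumptions, not measurements of any real server or SDK, and the function name is made up for this example.

```python
def description_tokens_per_trade(n_tools: int, tokens_per_tool: int, turns: int) -> int:
    """Prompt tokens spent only on describing tools over one multi-turn trade.

    Assumes every tool description is resent on every turn, as the list
    above describes; a real runtime may cache or prune descriptions.
    """
    return n_tools * tokens_per_tool * turns


# Illustrative assumptions: about 150 tokens per description, 8 planner turns.
TOKENS_PER_TOOL = 150
TURNS = 8

small_surface = description_tokens_per_trade(6, TOKENS_PER_TOOL, TURNS)
large_surface = description_tokens_per_trade(600, TOKENS_PER_TOOL, TURNS)

print(small_surface)                   # 7200
print(large_surface)                   # 720000
print(large_surface // small_surface)  # 100
```

The ratio is just the ratio of tool counts, which is the point: under these assumptions the description cost scales linearly with the surface, whether or not the planner ever calls the extra tools.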

A human gets bored if they have to read more than ten pages. A model can read ten thousand pages. But it pays for every page in attention, in compute, and — if the API is being called inside a real trading loop — in latency. The SDK-shaped API was never optimized for that cost structure.

The agent-shaped API is the smallest surface that still completes the job

An agent-shaped API answers a different question: what is the minimum set of tools that lets a planning model take a trading intent from "natural language request" to "atomic settlement on chain"?

Walk through what an OTC trade actually requires:

The user wants to buy or sell an asset for another asset, with some constraints (size, deadline, optional counterparty filter). That's an intent.
The protocol needs to broadcast that intent to a set of market makers privately, so the order doesn't leak. That's a sealed-bid RFQ.
A market maker who likes the price needs to commit a price quote. That's a response.
Once a quote is picked, both sides need to lock funds in a way that lets either claim atomically with a shared secret, or refund after a deadline. That's a hash-time-locked contract with four lifecycle operations: lock, claim, refund, inspect.

That's the trade. Every piece of work an agent does end-to-end is one of those steps.
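
The first three steps can be sketched as plain data plus one selection rule. This is a minimal illustration, not the protocol's schema: every field name, the `Intent` and `Quote` classes, and `pick_quote` are hypothetical, and the sketch only models the taker choosing among quotes, not how quotes are kept sealed from other makers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Intent:
    # Hypothetical fields: the post names size, deadline and an optional
    # counterparty filter, but does not publish the real schema.
    sell_asset: str
    buy_asset: str
    sell_amount: float
    deadline_unix: int
    allowed_makers: Optional[frozenset] = None  # None means any maker


@dataclass(frozen=True)
class Quote:
    maker: str
    buy_amount: float  # amount of buy_asset offered for the full sell_amount


def pick_quote(intent: Intent, quotes: list) -> Optional[Quote]:
    """Return the eligible quote that pays the taker the most, or None."""
    eligible = [
        q for q in quotes
        if intent.allowed_makers is None or q.maker in intent.allowed_makers
    ]
    return max(eligible, key=lambda q: q.buy_amount, default=None)


intent = Intent("ETH", "USDC", 1.0, deadline_unix=1_900_000_000,
                allowed_makers=frozenset({"maker_a", "maker_b"}))
quotes = [Quote("maker_a", 2995.0), Quote("maker_b", 3001.5), Quote("maker_c", 3010.0)]

best = pick_quote(intent, quotes)
print(best)  # Quote(maker='maker_b', buy_amount=3001.5)
```

Here maker_c offers the best price but is excluded by the counterparty filter, so the taker settles with maker_b.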

Six tools is not a limit we picked; it's what falls out of writing the minimum version of the surface above:

| Tool | What it actually does |
| --- | --- |
| `create_rfq` | Taker side: post a sealed-bid intent for a trade. |
| `respond_rfq` | Maker side: commit a price quote against an intent. |
| `create_htlc` | Either side: record an on-chain lock for the leg you owe. |
| `withdraw_htlc` | Either side: claim an HTLC by revealing the preimage. |
| `refund_htlc` | Either side: reclaim a leg if the counterparty disappears past the deadline. |
| `get_htlc` | Either side: inspect the live state of any in-flight swap. |

That's it. There is no list_markets, no get_orderbook, no cancel_order, no set_leverage, no transfer_internal, no get_balance, no get_kline, no twenty other methods that exist in a CEX SDK because exchange UI screens needed them. We don't need them, because the model the protocol exposes is not a CEX. It's a private auction plus a four-state settlement primitive.

Critically, the trade either settles atomically or it doesn't. There is no half-state to inspect, no "fill ratio" to query, no "stuck order" the agent has to reason about. The HTLC's four-state lifecycle (locked, withdrawn, refunded, expired) is a state machine that fits in any reasonable system prompt without using up the planner's working memory.
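
To show how little there is to hold in memory, the four states can be written as a transition table. The state names come from the paragraph above; the event names and the exact transitions (in particular, that a refund is only possible after expiry) are assumptions made for this sketch.

```python
from enum import Enum


class HtlcState(Enum):
    LOCKED = "locked"
    WITHDRAWN = "withdrawn"
    REFUNDED = "refunded"
    EXPIRED = "expired"


# Assumed transitions. Anything not listed here is rejected.
TRANSITIONS = {
    (HtlcState.LOCKED, "withdraw"): HtlcState.WITHDRAWN,       # preimage revealed in time
    (HtlcState.LOCKED, "deadline_passed"): HtlcState.EXPIRED,  # nobody claimed
    (HtlcState.EXPIRED, "refund"): HtlcState.REFUNDED,         # sender reclaims funds
}


def step(state: HtlcState, event: str) -> HtlcState:
    """Apply one event, raising ValueError for any disallowed move."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed from {state.value}") from None


print(step(HtlcState.LOCKED, "withdraw").value)                         # withdrawn
print(step(step(HtlcState.LOCKED, "deadline_passed"), "refund").value)  # refunded
```

Three rows cover the whole lifecycle, and both `withdrawn` and `refunded` have no outgoing transitions.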

What "agent-shaped" actually means in practice

Here's the same trade in two API shapes, side by side:

SDK-shaped, on a CEX:

get_account_balance — does the user have funds?
get_market_info — is the symbol valid, what's the precision, what are the tick sizes?
get_ticker — what's the current best bid/ask?
get_orderbook(depth=20) — sanity-check the depth.
create_order(symbol, side, type, amount, price) — submit.
get_order(orderId) — poll for status.
get_order(orderId) — keep polling.
cancel_order(orderId) — if it doesn't fill in time.
get_trade_history(orderId) — confirm fills.
get_account_balance — confirm.

Ten calls. Six different resources. The planner has to know which resource each method lives on, that order IDs come back as strings on this exchange and integers on that one, that "filled" is a status string here and a boolean over there.

Agent-shaped, on Hashlock Markets:

create_rfq(...) — post intent.
respond_rfq(...) — (maker side, separate flow) commit a quote.
create_htlc(...) — taker locks one leg.
create_htlc(...) — maker locks the other leg.
withdraw_htlc(...) — taker reveals preimage, claims maker's leg.
withdraw_htlc(...) — maker uses the now-public preimage, claims taker's leg.
get_htlc(...) — either side inspects live status anytime.

Seven calls across five of the six tools (refund_htlc only appears on the failure path), each with a clear lifecycle role, and a state machine the planner can reason about step by step. No "is the order partially filled," no "did my limit price actually hit," no "do I need to also cancel the residual." Either both sides claim with the preimage or both sides refund. Those are the only two terminal outcomes.
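
The preimage mechanics in the agent-shaped flow above can be simulated in memory. This is a toy model under stated assumptions, not contract code: the `Htlc` class, SHA-256 as the hash, integer timestamps, and the choice to give the maker's leg the earlier deadline are all assumptions for illustration. Expiry is modeled implicitly by comparing `now` to the deadline.

```python
import hashlib
import secrets


class Htlc:
    """In-memory stand-in for one locked leg; not a real contract."""

    def __init__(self, sender, receiver, amount, hashlock, deadline):
        self.sender, self.receiver = sender, receiver
        self.amount = amount
        self.hashlock = hashlock
        self.deadline = deadline
        self.state = "locked"
        self.revealed_preimage = None

    def withdraw(self, preimage: bytes, now: int) -> None:
        if self.state != "locked" or now >= self.deadline:
            raise ValueError("leg is not claimable")
        if hashlib.sha256(preimage).digest() != self.hashlock:
            raise ValueError("wrong preimage")
        self.state = "withdrawn"
        self.revealed_preimage = preimage  # visible to everyone once claimed

    def refund(self, now: int) -> None:
        if self.state != "locked" or now < self.deadline:
            raise ValueError("leg is not refundable yet")
        self.state = "refunded"


# The taker picks the secret and shares only its hash.
preimage = secrets.token_bytes(32)
hashlock = hashlib.sha256(preimage).digest()

# Assumption: the maker's leg expires first, so the maker still has time
# to claim the taker's leg after the preimage becomes public.
taker_leg = Htlc("taker", "maker", 1.0, hashlock, deadline=200)
maker_leg = Htlc("maker", "taker", 3000.0, hashlock, deadline=100)

maker_leg.withdraw(preimage, now=50)                     # taker claims, revealing the secret
taker_leg.withdraw(maker_leg.revealed_preimage, now=60)  # maker reuses the revealed secret

print(maker_leg.state, taker_leg.state)  # withdrawn withdrawn
```

If the taker never reveals the preimage, neither `withdraw` can succeed and each side calls `refund` once its own deadline has passed.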

The reduction in surface area isn't a packaging trick. It comes from a different settlement primitive. Sealed-bid RFQ skips order books. HTLC skips custody. Atomic cross-chain settlement skips bridges. Each subtraction removes the tools that would have been needed to reason about that set of failure modes.

The MCP spec is what makes this auditable

The reason the six tools work natively across Claude Desktop, Cursor, Windsurf, OpenAI agent runtimes, and LangChain is that they are exposed via the Model Context Protocol — a tool-surface contract Anthropic published as an open standard.

MCP gives you three things that matter for trading:

A typed tool schema: every tool's inputs are described with JSON Schema, and every output is structured. The planner doesn't have to parse free text. It calls a tool, gets a typed response, and makes its next decision from structured fields.
A stateless transport: the same six tools are available over local stdio (run the canonical hashlock-tech/mcp scoped package with npx -y) and over Streamable HTTP at hashlock.markets/mcp. One protocol. Two transports. Same surface.
An introspectable contract: when something goes wrong in production, the trace is just a sequence of typed tool calls. Reproducing a bug means calling the same tools in the same order with the same arguments. There is no "but it worked in the SDK" mystery.
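
As a sketch of what such a trace looks like on the wire: MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`, carrying the tool name and an arguments object. The tool names below are the ones from the table earlier in this post; the argument names and values are hypothetical placeholders, not the server's published schema.

```python
import json
from itertools import count

_ids = count(1)


def tools_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical argument names, for illustration only.
trace = [
    tools_call("create_rfq", {"sell_asset": "ETH", "buy_asset": "USDC", "sell_amount": "1.0"}),
    tools_call("get_htlc", {"htlc_id": "example-id"}),
]

# A trace serializes to one JSON document per line, so replaying it means
# reading the lines back and sending them again in the same order.
log = "\n".join(json.dumps(message, sort_keys=True) for message in trace)
replayed = [json.loads(line) for line in log.splitlines()]

print(replayed == trace)  # True
```

Nothing in the log depends on library internals, which is what makes a production incident reproducible from the trace alone.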

This matters less for the planner and more for the operator. When you wire an agent into an SDK and something goes wrong at 3 AM, you debug into someone else's library. When you wire an agent into an MCP server and something goes wrong, you debug into a six-call trace. The mean time to "I understand what happened" is an order of magnitude lower.

What you actually lose by going agent-shaped

Worth being honest about the trade-offs.

You lose orderbook-style market structure. No top-of-book quotes, no Level-2 depth, no advanced order types like trailing stops or iceberg orders. If your strategy depends on reading the book, you should not use this. Use a CEX-MCP that wraps a CEX's full trading API instead, and pay the surface-area tax accordingly.

You lose free, continuous mid-quotes. In a sealed-bid RFQ, a maker has to price your specific intent. If no maker is online for your size on your pair, you wait. A streaming order book gives you a quote even when nobody's serious about your size, but that quote is not honored unless you actually fill against it.

You lose order types beyond market-and-limit. No stops, no brackets, no OCO, no conditional triggers. The intent says "I want to buy X for Y by deadline T." The maker either responds or doesn't.

In return you get:

Six tools to reason about, not six hundred.
Sealed-bid pricing instead of public order leakage.
Atomic cross-chain settlement instead of trust-the-bridge.
A fee floor of 1–2 bps, versus the 8–10 bps you effectively pay on a CEX, internalised in the spread.
A surface that the planner can hold in working memory across an entire trade lifecycle.
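
For scale, here is the fee comparison as arithmetic. The basis-point figures are the ones quoted in the list above; the 100,000 notional is an arbitrary assumption chosen to make the numbers easy to read.

```python
def fee(notional: float, bps: float) -> float:
    """Cost of trading `notional` at `bps` basis points (1 bp = 0.01%)."""
    return notional * bps / 10_000


NOTIONAL = 100_000  # assumed trade size, in quote currency

for bps in (1, 2, 8, 10):
    print(f"{bps} bps -> {fee(NOTIONAL, bps):.2f}")
# 1 bps -> 10.00
# 2 bps -> 20.00
# 8 bps -> 80.00
# 10 bps -> 100.00
```

On this assumed size, the quoted ranges work out to 10 to 20 per trade against 80 to 100.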

Most of the strategies an AI agent should run for an end-user — single-asset rebalancing, USD-out at a budgeted slippage, cross-chain hedging, paying a counterparty in a different stablecoin than they invoiced — fit cleanly into this surface. The strategies that don't fit (HFT, market-making at the venue, complex orderbook scalping) are not strategies an end-user wants their agent doing autonomously anyway.

"Six tools" is the headline. Pre-quote validation is the substrate.

The provocation works because the headline is short. The argument that holds it up is longer.

A reasonable follow-up — one that came up explicitly in our May 1 dev.to thread on single-venue CEX-MCPs versus Hashlock Markets and again in yesterday's piece on the four pre-settlement filters — is that "thin tool surface" only works if the protocol is doing the heavy lifting underneath. That's true. The four filters that quotes pass through before the agent ever sees them — counterparty KYC tier, bonded reputation, price-deviation guard, ring privacy — are part of why six tools is sufficient. The protocol absorbs the validation that would otherwise have been a dozen extra methods on the SDK side.
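
A rough sketch of what protocol-side filtering could look like follows. It is an assumption-heavy illustration, not the protocol's implementation: the `MakerQuote` fields, the thresholds, and the idea of modeling bonded reputation as a single bond amount are all hypothetical, and ring privacy is left out because this post does not describe its mechanics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MakerQuote:
    # Hypothetical fields for illustration only.
    maker: str
    kyc_tier: int
    bond: float
    price: float


def passes_filters(
    quote: MakerQuote,
    reference_price: float,
    min_kyc_tier: int = 1,
    min_bond: float = 1_000.0,
    max_deviation_bps: float = 50.0,
) -> bool:
    """Apply three of the four named filters; all thresholds are assumed."""
    deviation_bps = abs(quote.price - reference_price) / reference_price * 10_000
    return (
        quote.kyc_tier >= min_kyc_tier
        and quote.bond >= min_bond
        and deviation_bps <= max_deviation_bps
    )


REFERENCE = 3_000.0
candidates = [
    MakerQuote("maker_a", kyc_tier=2, bond=5_000.0, price=3_010.0),  # about 33 bps off
    MakerQuote("maker_b", kyc_tier=2, bond=5_000.0, price=3_100.0),  # about 333 bps off
    MakerQuote("maker_c", kyc_tier=0, bond=5_000.0, price=3_001.0),  # KYC tier too low
]

visible = [q.maker for q in candidates if passes_filters(q, REFERENCE)]
print(visible)  # ['maker_a']
```

The agent only ever sees the quotes in `visible`, which is why none of these checks needs to exist as a tool on the agent's side.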

So the full thesis isn't just "fewer tools is better." It's "fewer agent-callable tools, plus a heavier protocol-side validation layer, is what an MCP-driven trading surface should look like."

Six on top, four underneath. That's the shape.

Closing question

The agent-shaped API thesis is a young one. We're 18 months into the MCP era and the right number of tools per protocol is still being argued out across DeFi, agent-runtime, and broker categories.

If you're building or evaluating MCPs in this space, here's the question I'd ask before adding the next method:

What is the smallest tool surface you trust an AI agent with end-to-end on real money?

I'd argue it's smaller than your current SDK. Possibly much smaller. The trades you want an agent doing autonomously, on your funds, are the ones that fit in a state machine the planner can hold in working memory.

If you want to compare the surface yourself, the canonical package is hashlock-tech/mcp (scoped, on npm; run it with npx -y and the package name), and the remote endpoint is hashlock.markets/mcp. Six tools. One auth flow. Three chains today: Ethereum, Bitcoin, and Sui, with Base, Arbitrum, Solana, and TON on the roadmap.

Same surface either way.

If you build agent integrations against trading APIs and have opinions on what the right tool count is for your use case, I'd genuinely like to hear them — drop a comment. There are interesting open questions on both sides of this argument and the category is too young to be sure.