Alex Cloudstar

Posted on May 21 • Originally published at alexcloudstar.com

A2A vs MCP in 2026: How AI Agents Actually Talk to Each Other

#ai #agents #productivity #devtools

A team I have been advising spent two weeks building what they called a "multi-agent customer support platform." Five agents. A triage agent, a billing agent, a refund agent, a knowledge base agent, and an orchestrator on top. They wired everything together through MCP servers because that is the protocol they had heard about most.

It worked. Sort of. The orchestrator could call any of the four worker agents as if they were tools. The latency was terrible. The errors were impossible to trace. When the billing agent failed, the orchestrator either retried forever or gave up silently, with nothing in between. The system felt brittle in a way that nobody could quite explain.

I looked at the architecture for about ten minutes and asked a single question. "Why are you using MCP for this?" The answer was the most honest thing I heard all month. "Because we did not know A2A existed."

That is the state of agent protocols in May 2026. MCP is everywhere. A2A has been ratified, adopted by the same major vendors, and is now under the same Linux Foundation umbrella. And yet the gap between developers who can explain when to use each one and developers who pick MCP for everything is enormous.

This post is the practical breakdown. What each protocol actually does. Where each one fits. Where they overlap. Where they absolutely do not. By the end you should be able to look at any agent system on a whiteboard and pick the right protocol for each edge in under a minute.

The Short Version You Can Use Today

If you only remember one thing from this article, make it this.

MCP is how an agent talks to tools. Files, databases, APIs, repos, Slack. A single agent uses MCP to extend its reach into the outside world. The agent is the consumer. The MCP server is the provider. It is a one-way relationship from a privilege standpoint, even if the protocol itself supports back-channels.

A2A is how agents talk to each other. Two or more agents, each owned by potentially different teams or vendors, coordinating on a multi-step task. Each side has its own goals, its own context, its own underlying model. They negotiate, they delegate, they exchange artifacts. Neither side is just a tool. Both are decision-makers.

You can think of MCP as the protocol for "I need to act on the world" and A2A as the protocol for "I need someone else who can think for themselves to handle part of this for me."

I keep coming back to a different analogy because it lands better with developers I work with. MCP is like a function call. A2A is like a service-to-service API contract between two systems that each have their own state, their own deploy cycle, and their own opinions. Wrapping an agent in an MCP server is roughly equivalent to flattening a microservice into a library. Sometimes that is fine. Often it leaks abstractions you did not want to leak.

If you have not read the MCP developer guide yet, it walks through the tool-call side in detail. This article assumes you have at least the rough shape of MCP in your head.

What MCP Does, Stripped Down

MCP at its core is a JSON-RPC protocol with a server-client model.

A client (your agent) connects to a server (your tool provider). The server exposes a declared set of capabilities: tools you can call, resources you can read, and prompts you can ask for. The client lists what is available, picks one, calls it with arguments, and gets a response. There is also a streaming side for long-running operations, but the mental model stays simple.
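To make the list-then-call loop concrete, here is a minimal sketch of the two JSON-RPC exchanges involved. The method names tools/list and tools/call and the shape of the result follow my reading of the MCP spec. The get_recent_orders tool, its handler, and the in-process handle function are made up for illustration. A real server sits behind a transport such as stdio or HTTP.

```python
import json
from itertools import count

_ids = count(1)

def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg

# Hypothetical tool registry standing in for a real MCP server.
TOOLS = {
    "get_recent_orders": {
        "description": "Return recent orders for a customer.",
        "inputSchema": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
        "handler": lambda args: f"2 orders found for {args['customer_id']}",
    }
}

def handle(request):
    """Toy server: answers tools/list and tools/call, rejects everything else."""
    method = request["method"]
    params = request.get("params", {})
    reply = {"jsonrpc": "2.0", "id": request["id"]}
    if method == "tools/list":
        reply["result"] = {"tools": [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            reply["error"] = {"code": -32602, "message": "Unknown tool"}
        else:
            text = tool["handler"](params.get("arguments", {}))
            reply["result"] = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        reply["error"] = {"code": -32601, "message": "Method not found"}
    return reply

listing = handle(rpc_request("tools/list"))
call = handle(rpc_request(
    "tools/call",
    {"name": "get_recent_orders", "arguments": {"customer_id": "c-42"}},
))
print(json.dumps(call, indent=2))
```

Notice what is absent: no task identity, no lifecycle, no way for the server to ask a question back. That is fine for a tool, and it is the whole point of the sections that follow.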

What makes MCP work in practice is the standardization. Before MCP, every AI tool integration was bespoke. You wrote a custom function for each external system, baked it into your agent loop, and prayed the schema never changed. With MCP, a single agent can connect to dozens of external systems through the same interface, and a single tool can be reused across dozens of agents.

The 97 million monthly SDK downloads reported by early 2026 are not hype. They are the result of every major vendor agreeing on the same wire format. Anthropic shipped it. OpenAI adopted it. Google added it to Gemini. Cursor, Claude Code, Windsurf, Zed, and most other AI-focused editors speak it natively.

The thing MCP does not do, and was never designed to do, is coordinate between independent agents. The MCP spec itself uses the word "tool" deliberately. A tool is something an agent uses. A tool does not have its own goals. A tool does not push work back. A tool does not say "I am partly done, here is what I have, please clarify the next step." MCP can stretch in those directions, but every time you stretch it, you are working against the grain.

This is where the trouble starts, because in practice a lot of developers have been wrapping entire agents in MCP servers and calling them tools.

Why Wrapping Agents in MCP Servers Falls Apart

The pattern is everywhere. You build a specialist agent (say, a code review agent), you expose it as an MCP server, and any other agent can now "call" it like a tool. From the outside, it works. From the inside, it leaks every problem multi-agent systems were trying to solve in the first place.

Identity. This is the first thing that breaks. MCP tools are stateless from the caller's perspective. Each call is a fresh request. If the underlying agent has memory, plans, partial state, or open sessions, the MCP wrapper either hides them or replicates them awkwardly. You end up encoding session identifiers into tool arguments and treating the response like a database row. That works until two callers hit the same session and you have to invent a locking protocol on top.

Long-running tasks. MCP supports streaming, but the streaming model assumes a single tool call with progressive output. Real agent work is non-linear. The agent might pause to ask a clarifying question. It might split a task into subtasks and run them out of order. It might decide that what you asked for is impossible and propose an alternative. None of that fits into "tool with progressive output." You end up modeling negotiation as a sequence of tool calls and the calling agent has to remember which call belongs to which conversation. Congratulations, you reinvented sessions, badly.

Asymmetric authority. With MCP, the caller decides what happens. The tool either succeeds or fails. A real agent on the other side might want to say "I will not do this because it violates my safety policy" or "I will do this but I need a human in the loop first." Encoding all of that into tool response status codes is possible but it pollutes the contract. Every other tool now has to ignore those status codes, and every new agent contract has to opt in.

Discovery. MCP servers expose a list of tools. They do not expose a description of who they are, what they specialize in, what languages they support, whether they accept long-running asynchronous work, or how to authenticate to them in a structured way. You can stuff some of that into tool docstrings, but at scale you need machine-readable metadata. Without it, an orchestrator cannot decide which downstream agent to talk to. The orchestrator becomes a giant if-else statement, and you end up back where you started.
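The identity and long-running-task problems above tend to look like the sketch below once they reach code. Everything in it is hypothetical: review_code_tool, the session_id argument, and the status strings are the kind of ad hoc conventions I see teams invent, not anything MCP defines.

```python
import uuid

# Hypothetical in-memory store for an agent hidden behind a tool-shaped wrapper.
_SESSIONS = {}

def review_code_tool(arguments):
    """Tool-style entry point: each call is independent, so state rides in the arguments."""
    session_id = arguments.get("session_id") or str(uuid.uuid4())
    session = _SESSIONS.setdefault(session_id, {"turns": []})
    session["turns"].append(arguments["message"])

    if len(session["turns"]) == 1:
        # The agent wants to ask a question, but all it can do is return a result.
        return {
            "session_id": session_id,
            "status": "needs_clarification",  # invented status, not part of any contract
            "text": "Which branch should I review?",
        }
    return {
        "session_id": session_id,
        "status": "done",
        "text": f"Reviewed after {len(session['turns'])} turns",
    }

# The caller now has to remember which session_id belongs to which conversation.
first = review_code_tool({"message": "Review my PR"})
second = review_code_tool({"session_id": first["session_id"], "message": "The main branch"})
print(first["status"], "->", second["status"])
```

Nothing stops a second caller from reusing the same session_id, and nothing tells other tools what needs_clarification means. Both gaps are yours to close by hand.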

I have seen exactly these failure modes in production. Once you have seen them, you cannot unsee them. The fix is not "more MCP." The fix is a different protocol for the thing MCP was not designed to handle.

What A2A Adds That MCP Does Not

A2A was Google's first move on this gap, announced in April 2025 with more than fifty partner companies. Google later donated it to the Linux Foundation, the same umbrella that now hosts MCP through the Agentic AI Foundation. The protocol is not a competitor to MCP. It sits next to it, covering the part of the agent stack MCP intentionally left open.

The protocol has three primitives worth knowing by name.

Agent Cards. Every A2A-compliant agent publishes a JSON document at /.well-known/agent-card.json. The card describes the agent's identity, what it can do, what modalities it accepts (text, structured data, files, audio, video), how to authenticate to it, whether it supports streaming or push, and which version of the protocol it speaks. This is the discovery layer MCP does not have. If you are an orchestrator deciding which downstream agent to talk to, you fetch their card and you know what you are dealing with.
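Here is a rough sketch of what discovery through agent cards could look like on the orchestrator side. The card fields shown (name, url, capabilities, skills with tags) are an abbreviated subset based on my reading of the A2A spec. The two example agents, their URLs, and the find_agents helper are invented for illustration. The cards are held in memory here rather than fetched over HTTP.

```python
from urllib.parse import urljoin

WELL_KNOWN_PATH = "/.well-known/agent-card.json"

def agent_card_url(base_url):
    """Resolve the well-known card location for an agent's host."""
    return urljoin(base_url, WELL_KNOWN_PATH)

# Hypothetical, abbreviated cards. Real cards carry more fields (auth schemes, versions, ...).
CARDS = [
    {
        "name": "Billing Agent",
        "url": "https://billing.example.com/a2a",
        "capabilities": {"streaming": True, "pushNotifications": False},
        "skills": [{"id": "refunds", "name": "Refund handling", "tags": ["billing", "refund"]}],
    },
    {
        "name": "Knowledge Base Agent",
        "url": "https://kb.example.com/a2a",
        "capabilities": {"streaming": False, "pushNotifications": False},
        "skills": [{"id": "kb-search", "name": "Doc search", "tags": ["docs"]}],
    },
]

def find_agents(cards, tag, need_streaming=False):
    """Return names of agents that advertise a skill tag and, optionally, streaming."""
    matches = []
    for card in cards:
        tags = {t for skill in card.get("skills", []) for t in skill.get("tags", [])}
        streaming = card.get("capabilities", {}).get("streaming", False)
        if tag in tags and (streaming or not need_streaming):
            matches.append(card["name"])
    return matches

print(agent_card_url("https://billing.example.com/a2a"))
print(find_agents(CARDS, "refund", need_streaming=True))
```

The useful property is that routing becomes a query over published metadata instead of a hand-maintained if-else chain inside the orchestrator.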

Tasks. A task is the stateful unit of work between two agents. It has a lifecycle: submitted, working, completed, failed, or canceled, plus interrupted states such as input-required for the negotiation in between. The calling agent creates a task. The receiving agent owns the execution. Both sides see the same task object and can read its status. Server-Sent Events stream progress updates as the task moves through its lifecycle. If the receiving agent needs a clarification, it can move the task to input-required and ask. If it needs to spawn subtasks, the task tree is visible to both sides.

This is exactly the missing piece in the MCP failure modes above. Tasks have identity. They have state. They survive multiple round-trips. They support negotiation. They do not pretend to be function calls.
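A small state machine shows why this differs from a function call. The state names below (submitted, working, input-required, completed, failed, canceled) follow my reading of the A2A spec, which defines a few more. The Task class and the ALLOWED transition table are my own simplification, not something the spec prescribes.

```python
from dataclasses import dataclass, field
from enum import Enum

class TaskState(str, Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"

# Assumed transition table: terminal states have no outgoing edges.
ALLOWED = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED, TaskState.CANCELED},
    TaskState.WORKING: {
        TaskState.INPUT_REQUIRED, TaskState.COMPLETED,
        TaskState.FAILED, TaskState.CANCELED,
    },
    TaskState.INPUT_REQUIRED: {TaskState.WORKING, TaskState.FAILED, TaskState.CANCELED},
}

@dataclass
class Task:
    task_id: str
    state: TaskState = TaskState.SUBMITTED
    history: list = field(default_factory=list)

    def transition(self, new_state, note=""):
        """Move to new_state, or raise if the lifecycle does not allow it."""
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self.history.append((self.state.value, new_state.value, note))
        self.state = new_state

task = Task("task-1")
task.transition(TaskState.WORKING)
task.transition(TaskState.INPUT_REQUIRED, "need manager approval")
task.transition(TaskState.WORKING, "approval received")
task.transition(TaskState.COMPLETED)
for step in task.history:
    print(step)
```

The history list is the part that pays off in debugging: both sides can see how the task got to where it is, not just where it ended up.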

Artifacts. When one agent produces an output that another agent needs, that output is wrapped in an artifact. An artifact carries content type, encoding, semantics, and structured metadata. A PDF is an artifact. A code patch is an artifact. A structured JSON document is an artifact. The point is that the receiving agent can reason about the artifact, not just consume it raw. If the artifact is a Markdown report and the receiving agent prefers HTML, it can negotiate. With MCP, the response is whatever the tool returns and the caller deals with it.
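One way to picture an artifact is as a named bundle of typed parts that the receiver can inspect before consuming. The sketch below is a deliberately simplified model: the Part and Artifact classes, their field names, and the pick_part helper are my own illustration, and the real A2A schema differs in detail.

```python
from dataclasses import dataclass, field

@dataclass
class Part:
    media_type: str
    content: str

@dataclass
class Artifact:
    artifact_id: str
    name: str
    parts: list
    metadata: dict = field(default_factory=dict)

def pick_part(artifact, preferred):
    """Return the part matching the earliest entry in `preferred`, or None."""
    for media_type in preferred:
        for part in artifact.parts:
            if part.media_type == media_type:
                return part
    return None

report = Artifact(
    artifact_id="art-1",
    name="refund-report",
    parts=[
        Part("text/markdown", "# Refund\nApproved."),
        Part("text/html", "<h1>Refund</h1><p>Approved.</p>"),
    ],
    metadata={"order_id": "o-1"},
)

chosen = pick_part(report, ["text/html", "text/markdown"])
print(chosen.media_type)
```

The receiver decides based on declared types rather than sniffing a raw blob, which is the difference the paragraph above is pointing at.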

Put together, agent cards, tasks, and artifacts are roughly the same shape as a service-to-service contract in modern distributed systems. They are deliberately not function calls. That is the entire point.

A Concrete Example: Refund Workflow With Both Protocols

Going back to the customer support platform I mentioned at the start of the article, here is what the right architecture would have looked like.

The triage agent receives a customer message. It needs to read the customer's recent orders to understand the context. It uses MCP to call the orders database. The orders database is a passive system. It has no opinions. It returns data. MCP is the perfect protocol for that edge.

The triage agent determines that this is a billing question. It needs to delegate to the billing agent. The billing agent has its own model, its own prompts, its own constraints about what it can and cannot promise the customer, and its own audit log. The triage agent uses A2A to create a task on the billing agent. The billing agent accepts the task, fetches its own context (using its own MCP connection to the billing database, which the triage agent does not need to know about), and starts working.

While working, the billing agent realizes the refund request requires manager approval. It pauses the task and emits a status update over A2A asking for human-in-the-loop confirmation. The triage agent forwards the message to a real person. Eventually approval comes back. The billing agent completes the task, returns the refund confirmation as an artifact, and the triage agent translates that artifact into a customer-facing reply.
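The flow so far can be simulated in a few lines. This is an in-process sketch, not a real A2A client or server: BillingAgent, its APPROVAL_LIMIT, the billing_lookup callable (standing in for the billing agent's private MCP connection), and the triage function are all hypothetical names chosen for the example.

```python
class BillingAgent:
    APPROVAL_LIMIT = 100.0  # assumed policy: larger refunds need a human

    def __init__(self, billing_lookup):
        self._lookup = billing_lookup  # stand-in for the agent's own MCP tool call
        self._tasks = {}

    def create_task(self, task_id, order_id):
        """Accept a task, then either finish it or pause it for approval."""
        amount = self._lookup(order_id)
        task = {"id": task_id, "order_id": order_id, "amount": amount,
                "state": "working", "artifact": None}
        self._tasks[task_id] = task
        if amount > self.APPROVAL_LIMIT:
            task["state"] = "input-required"
            task["question"] = f"Approve refund of {amount:.2f} for {order_id}?"
        else:
            self._complete(task)
        return task

    def provide_input(self, task_id, approved):
        """Resume a paused task with the human's decision."""
        task = self._tasks[task_id]
        if task["state"] != "input-required":
            raise ValueError("task is not waiting for input")
        if approved:
            self._complete(task)
        else:
            task["state"] = "failed"
        return task

    def _complete(self, task):
        task["state"] = "completed"
        task["artifact"] = {"name": "refund-confirmation",
                            "data": {"order_id": task["order_id"], "refunded": task["amount"]}}

def triage(billing, order_id, ask_human):
    """Delegate to the billing agent and turn its artifact into a customer reply."""
    task = billing.create_task(f"task-{order_id}", order_id)
    if task["state"] == "input-required":
        task = billing.provide_input(task["id"], ask_human(task["question"]))
    if task["state"] == "completed":
        refunded = task["artifact"]["data"]["refunded"]
        return f"Your refund of {refunded:.2f} has been issued."
    return "We could not process this refund."

billing = BillingAgent(lambda order_id: {"o-1": 250.0, "o-2": 20.0}[order_id])
print(triage(billing, "o-1", ask_human=lambda question: True))
```

The triage side never touches the billing database. It only sees the task state, the question, and the final artifact.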

What is happening here at the protocol level is clear and clean. MCP for agent-to-tool. A2A for agent-to-agent. Each edge uses the protocol that matches its semantics. The system is debuggable because each side of every conversation has a clear identity and a clear lifecycle.

Compare that to the original architecture, where the billing agent was an MCP server exposed as a single process_refund tool. The triage agent calls the tool. The tool either returns "done" or throws an error. There is no place for "I need approval first." There is no place for "I am 60% done and waiting on a database lookup." The only escape hatch is a synchronous wait, and synchronous waits in agent systems are how you discover that your timeouts are wrong six weeks after launch.

When MCP Is Still the Right Choice

I do not want to oversell A2A. For a very large class of agent systems, MCP is the only protocol you need, and reaching for A2A would add complexity for no benefit.

Single-agent products. If your product is one well-prompted agent with a strong set of tools, MCP is the whole stack. Add A2A when you have a second agent that genuinely has its own brain. Do not add a coordination protocol before you have something to coordinate. Most of the multi-agent vs single-agent decision boils down to this exact line.

Internal tool integrations. If you are exposing your own databases, internal APIs, or microservices to your agent, those are tools. They do not have agentic behavior of their own. Wrapping them in A2A would be a category error.

Most IDE and editor workflows. Claude Code, Cursor, Windsurf, and the rest of the agentic IDE family use MCP for almost everything they do. The IDE is one agent. The MCP servers extend its reach. There is no second-agent coordination happening in those flows, and stuffing A2A in there would slow everything down without giving you anything.

Latency-sensitive paths. A2A's task lifecycle, by design, is heavier than an MCP tool call. There are more round-trips, more state to manage, more SSE streams to keep open. If you are inside a tight loop where every millisecond matters, that overhead is a tax you may not want to pay. MCP wins on raw speed.

The simple test I use: would the system on the other side of this edge benefit from having its own goals, its own memory, and its own ability to push back? If yes, that is an agent and A2A fits. If no, it is a tool and MCP fits.
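Written down as code, the test is almost embarrassingly small. The Edge fields and the choose_protocol function are hypothetical names. Treating any one of the three traits as enough to call something an agent is my own threshold, and you may reasonably set it stricter.

```python
from dataclasses import dataclass

@dataclass
class Edge:
    name: str
    needs_own_goals: bool
    needs_own_memory: bool
    needs_to_push_back: bool

def choose_protocol(edge):
    """Return "A2A" if the far side behaves like an agent, otherwise "MCP"."""
    agent_like = edge.needs_own_goals or edge.needs_own_memory or edge.needs_to_push_back
    return "A2A" if agent_like else "MCP"

edges = [
    Edge("triage -> orders database", False, False, False),
    Edge("triage -> billing agent", True, True, True),
]
for edge in edges:
    print(f"{edge.name}: {choose_protocol(edge)}")
```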

When You Actually Need A2A

The cases where A2A unlocks something MCP cannot:

Cross-vendor coordination. You have a Salesforce agent that lives inside Salesforce, a SAP agent that lives inside SAP, and your own internal agent in the middle. None of these talk to each other today through tool calls because they each have a full execution context that the other side does not see. A2A's agent cards plus structured tasks are the actual integration story that platform vendors are pushing for in 2026.

Long-running asynchronous workflows. Tasks that take minutes, hours, or days. Approval workflows. Multi-step research. Compliance pipelines. Anything where the work cannot fit into a single function call and the receiving system needs to own its own lifecycle. The task primitive in A2A was built for exactly this.

Multi-tenant agent marketplaces. If you are building a platform where third-party agents can plug in and offer services to other agents, you need a way for those agents to advertise their capabilities, authenticate, and execute work without your platform knowing the internals. Agent cards plus signed task lifecycles do this cleanly. Trying to do the same with MCP would mean every third-party agent is technically a tool and you lose the protections you want around agent identity.

Multi-step negotiation. Real coordination involves clarifications, partial results, alternative proposals, and human-in-the-loop pauses. The A2A task lifecycle gives you the states to represent all of that natively. You can build it on MCP, but you are reinventing a worse version of A2A's primitives.

How the Two Protocols Compose

The clearest mental model I have settled on is that a real agent is usually an A2A endpoint that internally consumes a bunch of MCP servers.

From the outside, the agent advertises itself with an agent card. Other agents create tasks on it via A2A. Inside, the agent uses MCP to read files, query databases, hit APIs, and call other passive systems. The two protocols compose neatly because they sit at different layers. A2A is the front door for other agents. MCP is the toolbelt the agent uses internally to actually get work done.
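The layering can be sketched as one class with two faces. SupportAgent, its card contents, and the get_recent_orders tool are invented for this example. The tools dictionary stands in for MCP tool calls, and handle_task stands in for an A2A task handler.

```python
class SupportAgent:
    def __init__(self, tools):
        # Inner layer: name -> callable, standing in for MCP tool calls.
        self._tools = tools

    def card(self):
        """Outer layer: what other agents see before they send any work."""
        return {
            "name": "Support Agent",
            "capabilities": {"streaming": False},
            "skills": [{"id": "order-status", "tags": ["orders"]}],
        }

    def handle_task(self, task):
        """Outer layer: accept a task, use internal tools, return an artifact."""
        orders = self._tools["get_recent_orders"](task["customer_id"])
        if orders:
            summary = f"{len(orders)} recent orders; latest is {orders[-1]}"
        else:
            summary = "no recent orders"
        return {
            "id": task["id"],
            "state": "completed",
            "artifact": {"name": "order-summary", "text": summary},
        }

agent = SupportAgent({"get_recent_orders": lambda customer_id: ["o-1", "o-2"]})
result = agent.handle_task({"id": "task-7", "customer_id": "c-42"})
print(result["artifact"]["text"])
```

The caller only ever sees the card and the task result. Which tools the agent used, and how many, stays an internal detail it is free to change.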

This is also where some of the protocol confusion comes from. People look at an architecture diagram, see an A2A edge between two agents, and ask "but couldn't this just be an MCP server?" The answer is yes, technically, you can put almost any agent behind an MCP server and call it a tool. You will pay for it in the failure modes from earlier in the article. The protocols are designed to compose, not to compete. Picking the right one for each edge is the difference between an architecture that scales and one that has a stress test ahead of it.

I wrote about this from a different angle in the agent frameworks comparison. Most agent frameworks now support both protocols natively. The question is no longer "which framework lets me do this." The question is "where in my system does each protocol belong."

What This Means For How You Build in 2026

The practical advice is simpler than the protocol discussion makes it sound.

Default to MCP. If you are building today, start by giving your agent the tools it needs via MCP. Get the single-agent experience right. Most products live their whole lifecycle here, and that is fine. Premature distribution is the same trap as premature optimization. The agentic coding workflows most developers benefit from on a daily basis are pure single-agent-plus-MCP flows.

Reach for A2A when you have a second brain. The moment you have a system that genuinely belongs to another team, vendor, or model, with its own constraints and its own audit story, switch that edge to A2A. Do not try to flatten it into a tool. The seam is real and the protocol is the cleanest way to model it.

Publish agent cards even before you need them. If you are building any agent that other systems might want to integrate with, exposing an agent card at the well-known URL costs nothing and signals that you take cross-agent integration seriously. It also forces you to write down what your agent actually does, which is a useful exercise in its own right.

Treat the boundary between MCP and A2A as a design decision, not a default. Every time you add a new edge to your system, ask whether the thing on the other side is a tool or an agent. The wrong choice early on is expensive to undo later. Two minutes of thinking at the start saves weeks of refactoring after the architecture has calcified.

The reason this matters is that agent systems are about to compose at a scale we have not seen yet. Internal agents talking to vendor agents talking to platform agents talking to third-party agents. The protocols are not going to be optional. They are the substrate. The developers who get fluent in both, and in the boundary between them, are the ones who will build the systems that hold up when everything else starts to fray.

I have spent the last eight months alternating between explaining MCP to teams who only knew tool calls and explaining A2A to teams who only knew MCP. The teams that come out the other side with shippable agent products all converge on roughly the same shape. MCP for the toolbelt. A2A for the seams. Both used deliberately, neither used as a hammer.

If you take one thing from this article into your next system, make it the question I asked that team at the start. Why are you using this protocol for this edge? If the honest answer is "because I did not know there was another option," you now know.
