Jubin Soni
The Agent Protocol Stack: MCP vs A2A vs AG-UI — When to Use What

If you're building AI agents in 2026, you've probably bumped into at least one of these acronyms: MCP, A2A, AG-UI. Maybe all three. And if you're anything like me, your first reaction was: "Are these competing standards? Do I need all of them? Which one do I actually use?"

Here's the short answer: they're not competing — they're complementary. Each one solves a different problem at a different layer of the agent architecture. Think of them like TCP, HTTP, and HTML — different protocols at different layers that work together to make the web function.

The long answer is the rest of this article.


The One-Sentence Version

| Protocol | Created By | What It Connects | One-Liner |
|---|---|---|---|
| MCP | Anthropic | Agent ↔ Tools & Data | "How does my agent use tools?" |
| A2A | Google (Linux Foundation) | Agent ↔ Agent | "How do agents talk to each other?" |
| AG-UI | CopilotKit | Agent ↔ User Interface | "How does my agent talk to the user?" |

That's the mental model. Now let's go deeper.


MCP: The Tool Layer

What It Solves

Your agent needs to do things — query a database, call an API, read a file, search the web. Before MCP, every integration was bespoke. You'd write custom function-calling code for each tool, each framework, each model. MCP standardizes this into a single protocol.

How It Works

MCP uses a client-server architecture over JSON-RPC 2.0. The MCP server exposes tools (functions with typed inputs/outputs), resources (data the agent can read), and prompts (reusable templates). The MCP client — typically embedded in your agent framework — discovers these capabilities and invokes them on behalf of the model.

*(Diagram: MCP client-server architecture)*

Key Concepts

Tools are the core primitive — functions the model can call. Each tool has a name, description (the LLM reads this to decide when to use it), and a typed input schema. The model sees the tool list, decides which ones to call, and the MCP client executes them.
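To make this concrete, here is a sketch of what crosses the wire: a tool definition as a server would advertise it, and the JSON-RPC 2.0 envelope a client sends to invoke it. The `tools/call` method name comes from the MCP spec; the `get_weather` tool and its fields are invented for illustration.

```python
import json

# A tool definition as an MCP server would advertise it in a
# tools/list response. The description is what the LLM reads to
# decide when to call the tool. (Illustrative example tool.)
weather_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# The JSON-RPC 2.0 envelope an MCP client sends to invoke it.
call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Berlin"}},
}

wire = json.dumps(call_request)   # what actually crosses stdio or HTTP
decoded = json.loads(wire)
print(decoded["method"], decoded["params"]["arguments"]["city"])
```

The same envelope shape works over both transports; only the byte pipe underneath changes.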

Resources let the server expose read-only data — files, database schemas, configuration — that provides context without requiring a tool call.

Transports are flexible. Local tools can use stdio (spawning a subprocess). Remote tools use Streamable HTTP, which is what you'd use for production deployments. AWS Bedrock AgentCore Runtime expects this transport.

When to Use MCP

Use MCP when your agent needs to interact with external systems: databases, APIs, monitoring tools, file systems, cloud services. If you're wrapping an existing API for agent consumption, MCP is the protocol.

AWS provides a growing library of open-source MCP servers for services like S3, DynamoDB, CloudWatch, and Cost Explorer. You can also build custom MCP servers for your own internal APIs and deploy them to AgentCore Runtime.

When NOT to Use MCP

MCP is not for agent-to-agent communication. If you have a research agent that needs to delegate a sub-task to a coding agent, MCP isn't the right fit — that's A2A territory. MCP is also not designed for frontend communication — it doesn't have event streaming primitives for UI updates.


A2A: The Agent Collaboration Layer

What It Solves

You've built multiple specialized agents. One handles research, another handles code generation, a third manages deployments. Now you need them to work together on a complex task without sharing their internal state, tools, or prompts. A2A standardizes how agents discover each other, delegate tasks, and exchange results.

How It Works

A2A follows a client-server model where agents communicate over HTTP using JSON-RPC 2.0 (and optionally gRPC as of v0.3). The key differentiator from MCP is opacity — agents don't expose their internals. They advertise what they can do, not how they do it.

*(Diagram: A2A agent-to-agent communication)*

Key Concepts

Agent Cards are JSON metadata documents hosted at /.well-known/agent.json. They describe the agent's name, capabilities (called "skills"), supported input/output types, and authentication requirements. Think of them as a machine-readable business card — any A2A client can discover what a remote agent does without prior knowledge.
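A minimal Agent Card might look like the following sketch. The field names are a plausible subset, not the full schema — consult the A2A spec for the authoritative shape.

```python
import json

# A minimal, illustrative Agent Card as served from
# /.well-known/agent.json. Field names are a plausible subset.
agent_card = {
    "name": "code-review-agent",
    "description": "Reviews pull requests for style and bugs.",
    "url": "https://agents.example.com/code-review",
    "skills": [
        {"id": "review_pr", "description": "Review a pull request diff"},
    ],
    "defaultInputModes": ["text/plain"],
    "defaultOutputModes": ["text/plain"],
}

# A client fetches the card over HTTP and inspects the advertised
# skills before deciding whether to delegate a task.
card = json.loads(json.dumps(agent_card))
skill_ids = [s["id"] for s in card["skills"]]
print(skill_ids)
```

Note what is *not* in the card: no framework, no model, no prompts — that's the opacity principle at work.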

Tasks are the unit of work. A client sends a message to a remote agent, which creates a task with a lifecycle: submitted → working → completed (or failed, canceled). Tasks can produce artifacts — the actual outputs like generated text, images, or structured data.
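The lifecycle above can be modeled as a tiny state machine. The allowed transitions here are inferred from the states named in the text, not copied from the spec:

```python
# Toy model of the A2A task lifecycle. Transition rules are an
# assumption based on the states named above.
ALLOWED = {
    "submitted": {"working", "canceled"},
    "working": {"completed", "failed", "canceled"},
}

class Task:
    def __init__(self):
        self.state = "submitted"
        self.artifacts = []   # outputs: text, images, structured data

    def transition(self, new_state):
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

task = Task()
task.transition("working")
task.artifacts.append({"type": "text", "content": "generated summary"})
task.transition("completed")
print(task.state, len(task.artifacts))
```

Terminal states (`completed`, `failed`, `canceled`) have no outgoing transitions, which is why re-running a finished task means creating a new one.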

Interaction patterns are flexible. Simple tasks complete synchronously. Long-running tasks use Server-Sent Events (SSE) for streaming updates. Truly async workflows use push notifications via webhooks.

When to Use A2A

Use A2A when you have multiple agents that need to collaborate but shouldn't share internal state. Common patterns include a supervisor agent delegating to specialists, cross-organization agent collaboration (your agent talking to a vendor's agent), and multi-framework setups (a LangGraph agent coordinating with a CrewAI agent).

A2A is especially valuable when agents are built by different teams or companies. The opacity principle means Agent A doesn't need to know that Agent B uses LangGraph internally — it just sends a task and gets results back.

AWS Bedrock AgentCore Runtime supports deploying A2A servers alongside MCP servers, with the same IAM auth, session isolation, and auto-scaling. A2A containers expose their endpoint on port 9000 with an Agent Card at /.well-known/agent-card.json.

When NOT to Use A2A

A2A adds overhead that isn't necessary for simple single-agent setups. If your agent just needs to call tools, use MCP. If you need tight coupling between agent components (shared memory, shared context), A2A's opacity model will work against you — consider an agent framework's native multi-agent patterns instead.


AG-UI: The User Interface Layer

What It Solves

Your agent is running, calling tools, maybe coordinating with other agents. But the user is staring at a loading spinner. They don't know what's happening, can't intervene when things go wrong, and can't see intermediate results. AG-UI standardizes how agents communicate with user-facing applications in real time.

How It Works

AG-UI is an event-based protocol where the agent backend emits a stream of typed events that the frontend consumes. Unlike REST (request → response) or WebSocket (unstructured bidirectional), AG-UI defines ~16 specific event types that cover the full range of agent-user interactions.

*(Diagram: AG-UI event stream between agent and frontend)*

Key Concepts

Event types are the core of AG-UI. The main ones:

  • Lifecycle events (RUN_STARTED, RUN_FINISHED, RUN_ERROR) — let the frontend show loading states and handle errors
  • Text message events (TEXT_MESSAGE_START, _CONTENT, _END) — stream generated text token by token for the "typing" effect
  • Tool events (TOOL_CALL_START, TOOL_CALL_END) — show the user what tools the agent is using and their results
  • State deltas (STATE_DELTA) — send incremental UI state changes (progress bars, form updates) without resending everything
  • Interrupts (INTERRUPT) — pause execution to ask the user for approval before a sensitive action (like deleting a resource)
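A frontend consumes this stream with a reducer: each event updates UI state. The event names below match the list above; the payload shapes (`messageId`, `delta`) are illustrative assumptions.

```python
# Toy frontend reducer over an AG-UI-style event stream.
# Payload field names are assumptions for illustration.
events = [
    {"type": "RUN_STARTED"},
    {"type": "TEXT_MESSAGE_START", "messageId": "m1"},
    {"type": "TEXT_MESSAGE_CONTENT", "messageId": "m1", "delta": "Hello, "},
    {"type": "TEXT_MESSAGE_CONTENT", "messageId": "m1", "delta": "world"},
    {"type": "TEXT_MESSAGE_END", "messageId": "m1"},
    {"type": "RUN_FINISHED"},
]

ui = {"loading": False, "messages": {}}

for ev in events:
    t = ev["type"]
    if t == "RUN_STARTED":
        ui["loading"] = True                                # show spinner
    elif t == "TEXT_MESSAGE_START":
        ui["messages"][ev["messageId"]] = ""
    elif t == "TEXT_MESSAGE_CONTENT":
        ui["messages"][ev["messageId"]] += ev["delta"]      # typing effect
    elif t == "RUN_FINISHED":
        ui["loading"] = False                               # hide spinner

print(ui["messages"]["m1"], ui["loading"])
```

Because every event is typed, the frontend never has to parse free-form text to figure out what the agent is doing.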

Shared state enables bidirectional synchronization between the agent and the application. The agent can read application state (what page the user is on, what document is open) and push state changes back (update a chart, fill a form).
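State deltas are what keep this synchronization cheap. Here is a minimal sketch of applying incremental deltas, assuming JSON-Patch-style operations (`op`/`path`/`value`); AG-UI's actual delta format may differ in detail.

```python
# Minimal sketch: apply JSON-Patch-style delta operations to a
# state dict instead of resending the whole state. An assumption
# about the delta format, for illustration only.
def apply_delta(state, ops):
    for op in ops:
        keys = [k for k in op["path"].split("/") if k]
        target = state
        for k in keys[:-1]:
            target = target[k]            # walk to the parent container
        if op["op"] in ("add", "replace"):
            target[keys[-1]] = op["value"]
        elif op["op"] == "remove":
            del target[keys[-1]]
    return state

state = {"progress": 0, "form": {"name": ""}}
apply_delta(state, [
    {"op": "replace", "path": "/progress", "value": 40},
    {"op": "replace", "path": "/form/name", "value": "Ada"},
])
print(state)
```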

Frontend tools are an interesting inversion — the agent can call functions that execute in the browser, like updating a collaborative document or rendering a visualization.

When to Use AG-UI

Use AG-UI when your agent needs to communicate with a user-facing application in real time. This includes chat interfaces that show tool execution progress, collaborative editing where the agent modifies a shared document, dashboards that update as the agent discovers information, and any workflow that requires human-in-the-loop approval.

AG-UI was born from CopilotKit's production experience and has integrations with LangGraph, CrewAI, Strands Agents, Pydantic AI, and more. AWS Bedrock AgentCore Runtime added AG-UI support in March 2026, handling auth and scaling just like MCP and A2A workloads.

When NOT to Use AG-UI

If your agent is a background job with no user interaction (batch processing, scheduled tasks), AG-UI adds unnecessary complexity. Stick with simple API responses or logging. Also, AG-UI is about communication, not UI rendering — if you need the agent to generate actual UI components, look at A2UI (a separate spec from Google for declarative UI generation that can be transported over AG-UI events).


How They Fit Together

Here's where it gets interesting. In a real production system, you're likely using all three:

*(Diagram: all three protocols composed in one system)*

The flow:

  1. The user asks a question in the frontend
  2. AG-UI streams the request to the supervisor agent and carries back real-time updates
  3. The supervisor uses MCP to call tools directly (databases, APIs, cloud services)
  4. For complex sub-tasks, the supervisor uses A2A to delegate to specialist agents
  5. Those specialist agents may themselves use MCP for their own tools
  6. Results flow back up through A2A → supervisor → AG-UI → user
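The six steps above can be sketched with plain-function stand-ins for each protocol hop. Every name here is illustrative, not a real SDK:

```python
# Stand-ins for the three protocol hops in the flow above.
# All function names and payloads are hypothetical.
def mcp_call_tool(name, args):             # step 3: MCP tool call
    return {"rows": 3} if name == "query_db" else {}

def a2a_delegate(agent_url, task_text):    # step 4: A2A delegation
    return {"state": "completed", "artifact": f"report on: {task_text}"}

def agui_emit(event, frontend_log):        # steps 2 and 6: AG-UI events
    frontend_log.append(event["type"])

def supervisor(question):
    log = []
    agui_emit({"type": "RUN_STARTED"}, log)       # tell the user we're working
    data = mcp_call_tool("query_db", {"q": question})
    result = a2a_delegate("https://agents.example.com/analyst", question)
    agui_emit({"type": "RUN_FINISHED"}, log)      # results flow back to the UI
    return log, data, result

log, data, result = supervisor("Q3 churn")
print(log, result["state"])
```

Notice that the supervisor never inspects the specialist's internals (A2A opacity) and the frontend only ever sees typed events (AG-UI).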

Each protocol handles its layer. No overlap. No conflict.


The Decision Framework

When you're designing an agent system, ask these three questions:

1. "Does my agent need to use external tools or data?"

→ Yes: Use MCP

Wrap your APIs, databases, and services as MCP servers. Use existing open-source MCP servers for common services (AWS, GitHub, Slack, etc.).

2. "Does my agent need to collaborate with other agents?"

→ Yes: Use A2A

Especially when agents are built by different teams, use different frameworks, or need to maintain privacy of their internal logic. Publish Agent Cards for discovery.

3. "Does my agent need to communicate with a user in real time?"

→ Yes: Use AG-UI

Stream progress, show tool execution, synchronize state, and handle human-in-the-loop approvals. Use AG-UI events to keep the user informed and in control.

Most production agent systems will answer "yes" to at least two of these. And that's fine — the protocols are designed to compose.
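The three questions reduce to a trivial mapping, which is worth writing down only because it shows the protocols never compete for the same "yes":

```python
# The decision framework above as a helper: answer the three
# questions and get back the protocols to adopt.
def pick_protocols(needs_tools, needs_other_agents, needs_realtime_ui):
    picks = []
    if needs_tools:
        picks.append("MCP")
    if needs_other_agents:
        picks.append("A2A")
    if needs_realtime_ui:
        picks.append("AG-UI")
    return picks

print(pick_protocols(True, False, True))
```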


Quick Comparison Table

| | MCP | A2A | AG-UI |
|---|---|---|---|
| Layer | Tool access | Agent collaboration | User interaction |
| Created by | Anthropic | Google / Linux Foundation | CopilotKit |
| Wire protocol | JSON-RPC 2.0 | JSON-RPC 2.0 + gRPC | Event stream (SSE) |
| Discovery | Tool listing via `tools/list` | Agent Card at `/.well-known/agent.json` | N/A (direct connection) |
| Key primitive | Tool (function call) | Task (lifecycle-managed work unit) | Event (~16 standard types) |
| Transport | stdio, Streamable HTTP | HTTP, SSE, gRPC, webhooks | SSE, WebSockets |
| Auth model | OAuth 2.0, IAM | OAuth 2.0, API keys, mTLS | Application-defined |
| Opacity | Transparent (tools are exposed) | Opaque (internals hidden) | N/A |
| Streaming | Yes (SSE for resources) | Yes (SSE for task updates) | Yes (core design principle) |
| AWS support | AgentCore Runtime + Gateway | AgentCore Runtime (port 9000) | AgentCore Runtime (March 2026) |
| Spec version | 2025-03-26 | v0.3 | ~16 event types, active development |

Running All Three on AWS

AWS Bedrock AgentCore Runtime is one of the few platforms that supports all three protocols natively. Here's how they deploy:

| Protocol | AgentCore Runtime Port | Container Path | Auth |
|---|---|---|---|
| MCP | 8000 | `/mcp` | IAM SigV4 or OAuth 2.0 |
| A2A | 9000 | `/` (root) | IAM SigV4 or OAuth 2.0 |
| AG-UI | Configurable | Configurable | IAM SigV4 or OAuth 2.0 |

Each protocol gets the same enterprise infrastructure: session isolation in microVMs, automatic scaling, IAM auth, and observability through AgentCore. You write the server, AgentCore handles everything else.

The AgentCore Gateway can sit in front of MCP servers to provide centralized tool discovery, routing, and policy enforcement via Cedar. For A2A, agents advertise their capabilities through Agent Cards. For AG-UI, the frontend connects directly to the AgentCore Runtime endpoint and receives streamed events.


What About A2UI?

You might have also heard about A2UI (Agent-to-UI), a separate specification from Google. It's easy to confuse with AG-UI given the similar names, but they solve different problems:

  • A2UI defines what UI to render — it's a declarative spec for describing UI components (buttons, charts, forms) that agents can generate safely without executing arbitrary code
  • AG-UI defines how agents and UIs communicate at runtime — the event stream, state synchronization, and interaction lifecycle

They're complementary. An agent can use AG-UI to stream events to the frontend, and one of those events can carry an A2UI payload that describes a UI component to render. AG-UI is the transport; A2UI is the content format.
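The transport/content split looks roughly like this sketch: an AG-UI-style carrier event whose payload is a declarative A2UI-like component description. Both the carrier event type and the component shape are illustrative assumptions, not the real specs.

```python
# Illustrative only: an AG-UI-style event carrying an A2UI-like
# declarative UI payload. Event type and payload shape are assumed.
event = {
    "type": "CUSTOM",                 # hypothetical carrier event
    "payload": {
        "format": "a2ui",
        "component": {
            "kind": "button",
            "label": "Approve deployment",
            "action": "approve",
        },
    },
}

def render(component):
    # A real frontend would map the declarative description to
    # actual widgets; here we just produce a string.
    return f"[{component['kind']}: {component['label']}]"

print(render(event["payload"]["component"]))
```

The key property: the agent never ships executable UI code, only a description the frontend chooses how to render.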


Getting Started

If you're building your first agent system, here's the practical sequence:

  1. Start with MCP. Most agents need tools first. Build an MCP server for your primary data source or API. Deploy it to AgentCore Runtime or run it locally during development.

  2. Add AG-UI when you build the frontend. Once your agent works, connect it to a user-facing app using AG-UI events. CopilotKit provides React components that handle the event stream out of the box.

  3. Introduce A2A when you need specialization. When a single agent can't handle everything, split into specialists and use A2A for delegation. This typically happens when you're at the point of multi-team or multi-framework agent development.

You don't need all three on day one. But understanding what each one does — and where it fits — saves you from building custom plumbing that a protocol already handles.


Top comments (1)

Global Chat:

The TCP/HTTP/HTML analogy is spot on, but it makes me wonder about the DNS equivalent. MCP servers publish tool schemas, A2A agents expose Agent Cards, AG-UI handles the rendering layer. But how does an agent actually discover which MCP servers or A2A peers exist for a given task? Right now there are 11+ IETF drafts competing on agent discovery (agents.txt expired its initial spec on April 10), and no consensus yet. Protocol-agnostic directories that index across MCP, A2A, and ACP registries seem like the practical stopgap. Have you looked into the discovery layer at all?