Implementing A2A Protocol for Multi-Agent Communication

#ai #agents #tutorial #python

If you've ever wired two AI agents together, you know the drill: custom JSON schemas, bespoke HTTP endpoints, and a growing pile of adapter code that nobody wants to maintain. Google's A2A (Agent-to-Agent) protocol is meant to clean up that mess, and I've been implementing it across OpenClaw and Hermes agents on Rapid Claw for the past few weeks. Here's what the implementation actually looks like.

What A2A solves (and what it doesn't)

A2A standardizes the message envelope between independent agents. Think of it as the TCP/IP of agent communication — it defines how agents discover each other, exchange structured messages, delegate tasks, and return results. It doesn't care what framework you're using internally.

The key distinction: MCP (Model Context Protocol) handles agent-to-tool communication. A2A handles agent-to-agent communication. You need both in any serious multi-agent deployment, and they compose cleanly because an A2A peer is essentially a tool with an agent on the other end.

The envelope format

Every A2A message carries the same required fields. The interesting bits go in payload:

from uuid import uuid4

envelope = {
    "a2a_version": "1.0",
    "message_id": f"msg_{uuid4().hex}",
    "correlation_id": "conv_01HZKXR7...",  # ties the conversation together
    "trace": {
        "trace_id": "4bf92f3577b34da6...",
        "span_id": "00f067aa0ba902b7",
    },
    "sender": {"agent_id": "planner-openclaw-prod-01", "framework": "openclaw"},
    "recipient": {"agent_id": "executor-hermes-prod-03", "framework": "hermes"},
    "intent": "task.delegate",
    "payload": {
        "task": "summarize_and_file",
        "inputs": {"url": "https://example.com/report.pdf"},
        "constraints": {"max_tokens": 4000, "deadline_ms": 30000}
    },
    "reply_to": "https://agents.rapidclaw.dev/a2a/planner/inbox",
    "expires_at": "2026-04-18T12:34:56Z"
}

Three fields do the heavy lifting: correlation_id threads multi-agent conversations into a single trace, trace carries OpenTelemetry-compatible span context so your existing APM stitches everything together, and intent is the verb recipients dispatch on — not a URL path.
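Before dispatching on intent, a recipient will usually want to reject envelopes that are incomplete or already expired. Here's a minimal sketch of that check. The REQUIRED_FIELDS tuple and the validate_envelope helper are my own assumptions based on the example envelope above, not something lifted from a spec, so adjust them to whatever your deployment treats as mandatory.

```python
from datetime import datetime, timezone
from typing import List, Optional

# Assumption: the field names are taken from the example envelope above.
REQUIRED_FIELDS = (
    "a2a_version", "message_id", "correlation_id", "trace",
    "sender", "recipient", "intent", "payload",
)


def validate_envelope(envelope: dict, now: Optional[datetime] = None) -> List[str]:
    """Return a list of problems; an empty list means the envelope looks usable."""
    now = now or datetime.now(timezone.utc)
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in envelope
    ]

    expires_at = envelope.get("expires_at")
    if expires_at is not None:
        # Swap the trailing "Z" for an explicit offset so fromisoformat
        # also accepts it on Python versions before 3.11.
        expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
        if expiry <= now:
            problems.append("envelope expired")
    return problems
```

Returning a list instead of raising on the first problem makes it easy to log everything that is wrong with a bad envelope in one go.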

Publishing an OpenClaw agent as an A2A endpoint

An OpenClaw agent becomes an A2A peer by exposing an inbox and registering with a platform registry. The agent doesn't need to know who will call it — only how to respond:

from fastapi import FastAPI, HTTPException
from openclaw import Agent, Task
from a2a import Envelope, verify_signature, sign

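app = FastAPI()
planner = Agent.from_config("planner.yaml")
# AGENT_ID, TRUSTED_SIGNERS, and PRIVATE_KEY are deployment-specific and assumed
# to be loaded from config or a secret store before the app starts.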

@app.post("/a2a/inbox")
async def inbox(envelope: Envelope):
    if not verify_signature(envelope, allowed=TRUSTED_SIGNERS):
        raise HTTPException(401, "signature verification failed")

    if envelope.intent == "task.delegate":
        task = Task(
            name=envelope.payload["task"],
            inputs=envelope.payload["inputs"],
            trace=envelope.trace,
        )
        result = await planner.run(task)

        reply = Envelope(
            intent="result.return",
            correlation_id=envelope.correlation_id,
            trace=envelope.trace,
            sender={"agent_id": AGENT_ID, "framework": "openclaw"},
            recipient=envelope.sender,
            payload={"status": "ok", "result": result.to_dict()},
        )
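        return sign(reply, PRIVATE_KEY).dict()

    # Reject intents this agent does not handle instead of returning an empty 200.
    raise HTTPException(400, f"unsupported intent: {envelope.intent}")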
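Once an agent handles more than one or two intents, the if-chain in the inbox gets unwieldy. One option is a small dispatch table keyed by intent. This is a framework-free sketch: the handles decorator, the HANDLERS dict, and the dispatch function are hypothetical names of mine, and the handler body is a placeholder where the real agent call would go.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Handler = Callable[[dict], Awaitable[dict]]

# Maps an intent string to the coroutine that handles it.
HANDLERS: Dict[str, Handler] = {}


def handles(intent: str) -> Callable[[Handler], Handler]:
    """Register a coroutine as the handler for one intent."""
    def register(func: Handler) -> Handler:
        HANDLERS[intent] = func
        return func
    return register


@handles("task.delegate")
async def on_delegate(payload: dict) -> dict:
    # Placeholder body: a real handler would run the agent here.
    return {"status": "ok", "task": payload["task"]}


async def dispatch(intent: str, payload: dict) -> dict:
    """Route a payload to its handler, or report an unsupported intent."""
    handler = HANDLERS.get(intent)
    if handler is None:
        return {"status": "error", "reason": f"unsupported intent: {intent}"}
    return await handler(payload)
```

The inbox then shrinks to signature verification plus a single dispatch call, and adding an intent means adding one decorated function.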

The caller discovers executors by label, not URL — this is the part A2A gets right. No hardcoded hostnames:

# lookup() is the registry client call; await it from inside a coroutine.
executor = await lookup(
    intent="task.execute",
    labels={"framework": "hermes", "env": "prod"},
)
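To make the matching rule concrete, here is a toy in-memory registry that resolves a lookup by intent and labels. Everything in it (AgentRecord, InMemoryRegistry, the inbox_url field) is a hypothetical stand-in for whatever your platform registry actually stores. The only real point is the rule itself: an agent matches when it accepts the intent and carries every requested label.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class AgentRecord:
    agent_id: str
    inbox_url: str
    intents: Set[str]
    labels: Dict[str, str] = field(default_factory=dict)


class InMemoryRegistry:
    """Toy registry; a production one would sit behind a database or service."""

    def __init__(self) -> None:
        self._records: List[AgentRecord] = []

    def register(self, record: AgentRecord) -> None:
        self._records.append(record)

    async def lookup(self, intent: str, labels: Dict[str, str]) -> Optional[AgentRecord]:
        """Return the first agent that accepts the intent and has every requested label."""
        for record in self._records:
            # The requested labels must be a subset of the agent's labels.
            if intent in record.intents and labels.items() <= record.labels.items():
                return record
        return None
```

Extra labels on the agent side (a region, say) don't prevent a match, which lets you add metadata to agents without breaking existing callers.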

Three patterns worth implementing

Request/reply is the simplest. Planner calls executor, waits for the reply envelope, acts on it. Use for sub-tasks with clear deadlines.

Fan-out/fan-in dispatches the same intent to a pool of executors in parallel, correlates replies by correlation_id, and takes the first good answer or aggregates. This is how you build research-agent ensembles.
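A rough sketch of the aggregating variant, using only asyncio. The send callable is an assumption: it stands in for whatever coroutine actually delivers an envelope to one executor and returns its reply envelope. Executors that time out, raise, or answer with the wrong correlation_id are simply dropped.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List

# Hypothetical transport: takes an executor id and an envelope, returns the reply.
Send = Callable[[str, dict], Awaitable[dict]]


async def fan_out(
    send: Send, executors: Iterable[str], envelope: dict, timeout_s: float
) -> List[dict]:
    """Send one envelope to every executor and keep the matching, successful replies."""

    async def call(executor_id: str) -> dict:
        # Bound each call so one slow executor cannot stall the whole fan-in.
        return await asyncio.wait_for(send(executor_id, envelope), timeout=timeout_s)

    # return_exceptions=True turns failures and timeouts into values we can filter.
    results = await asyncio.gather(
        *(call(executor_id) for executor_id in executors), return_exceptions=True
    )
    return [
        reply for reply in results
        if isinstance(reply, dict)
        and reply.get("correlation_id") == envelope["correlation_id"]
        and reply.get("payload", {}).get("status") == "ok"
    ]
```

For the "first good answer" variant you would iterate asyncio.as_completed instead of gathering, and cancel the remaining calls once one reply passes the same checks.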

Async with callback fires a task.delegate with a reply_to URL and returns immediately. The callee POSTs a result.return when done. You get durability without holding an HTTP connection open.
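On the caller side, the callback pattern needs somewhere to park the delegated task until the result.return arrives. Here's one way to sketch that with asyncio futures. PendingReplies is a hypothetical helper, and it assumes one outstanding task per correlation_id. It also keeps state in memory, so a real deployment would back it with a durable store to survive restarts.

```python
import asyncio
from typing import Dict


class PendingReplies:
    """Tracks delegated tasks until their result.return callback arrives."""

    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}

    def expect(self, correlation_id: str) -> asyncio.Future:
        """Call right before sending task.delegate; await the future later."""
        future = asyncio.get_running_loop().create_future()
        self._pending[correlation_id] = future
        return future

    def resolve(self, envelope: dict) -> bool:
        """Call from the inbox handler; returns False for unknown or duplicate callbacks."""
        future = self._pending.pop(envelope["correlation_id"], None)
        if future is None or future.done():
            return False
        future.set_result(envelope["payload"])
        return True
```

The False return value gives the inbox handler a clean way to answer a late or replayed callback with an error instead of silently accepting it.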

The platform layer matters

The protocol is the easy part. Production A2A needs five things at the platform layer: a registry for discovery, identity and mTLS per agent, routing with network policy, observability that stitches traces across agents, and per-agent rate limits. You can build all five yourself — Postgres registry, Vault for keys, Envoy for mTLS, OTEL collector, Redis for rate limits — or use something like Rapid Claw that ships them preconfigured.
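Of those five, per-agent rate limiting is the easiest to show in a few lines. Below is a single-process token-bucket sketch keyed by agent_id. The class names and parameters are mine, and a real deployment would keep the counters somewhere shared (Redis, as mentioned above) so that every inbox replica sees the same budget.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at rate_per_s tokens per second, up to burst tokens."""

    def __init__(self, rate_per_s: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate_per_s = rate_per_s
        self.burst = burst
        self._clock = clock
        self._tokens = float(burst)
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Add the tokens earned since the last call, capped at the burst size.
        self._tokens = min(self.burst, self._tokens + (now - self._last) * self.rate_per_s)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class PerAgentLimiter:
    """Keeps one bucket per sending agent_id."""

    def __init__(self, rate_per_s: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate_per_s = rate_per_s
        self._burst = burst
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, agent_id: str) -> bool:
        if agent_id not in self._buckets:
            self._buckets[agent_id] = TokenBucket(self._rate_per_s, self._burst, self._clock)
        return self._buckets[agent_id].allow()
```

In an inbox handler you would call allow with the sender's agent_id after signature verification and answer with HTTP 429 when it returns False.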

If you're thinking about multi-agent architectures more broadly, I wrote up the common orchestration patterns (planner/executor, supervisor, blackboard) that pair well with A2A as the transport layer.

A2A isn't revolutionary — it's the boring infrastructure piece that was missing. And boring infrastructure is exactly what you want when you're trying to ship agent systems that actually work in production.