The AI agent ecosystem is exploding with protocols. Anthropic released MCP (Model Context Protocol). Google announced A2A (Agent-to-Agent). Every week there's a new "standard" for agent communication.
But here's the thing most people miss: these protocols solve different problems at different layers. Using MCP for distributed agent orchestration is like using HTTP for job scheduling—wrong tool, wrong layer.
Let me break down the actual difference and why you probably need both.
What MCP Actually Does
MCP (Model Context Protocol) is a tool-calling protocol for a single model. It standardizes how one LLM discovers and invokes external tools—databases, APIs, file systems, etc.
┌─────────────────────────────────────┐
│              Your LLM               │
│                                     │
│  "I need to query the database"     │
│                 │                   │
│                 ▼                   │
│          ┌─────────────┐            │
│          │ MCP Client  │            │
│          └──────┬──────┘            │
└─────────────────┼───────────────────┘
                  │
                  ▼
          ┌───────────────┐
          │  MCP Server   │
          │  (tool host)  │
          └───────────────┘
                  │
                  ▼
              [Database]
MCP is great at this. It solves tool discovery, schema negotiation, and invocation for a single model context.
What MCP doesn't cover:
- How do you schedule work across multiple agents?
- How do you track job state across a cluster?
- How do you enforce safety policies before execution?
- How do you handle agent liveness and capacity?
- How do you fan out workflows with parent/child relationships?
MCP was never designed for this. It's a tool protocol, not an orchestration protocol.
Enter CAP: The Missing Layer
CAP (Cordum Agent Protocol) is a cluster-native job protocol for AI agents. It standardizes the control plane that MCP doesn't touch:
- Job lifecycle: submit → schedule → dispatch → run → complete
- Distributed routing: pool-based dispatch with competing consumers
- Safety hooks: allow/deny/throttle decisions before any job runs
- Heartbeats: worker liveness, capacity, and pool membership
- Workflows: parent/child jobs with aggregation
- Pointer architecture: keeps payloads off the bus for security and performance
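The lifecycle in the first bullet can be sketched as a tiny state machine. The state names below are illustrative; the canonical enum lives in the CAP spec:

```go
package main

import "fmt"

// JobState mirrors the CAP lifecycle: submit → schedule → dispatch → run → complete.
type JobState int

const (
	Submitted JobState = iota
	Scheduled
	Dispatched
	Running
	Completed
)

// transitions encodes the legal successor for each non-terminal state.
var transitions = map[JobState]JobState{
	Submitted:  Scheduled,
	Scheduled:  Dispatched,
	Dispatched: Running,
	Running:    Completed,
}

// advance returns the next state, or ok=false for terminal states.
func advance(s JobState) (next JobState, ok bool) {
	next, ok = transitions[s]
	return
}

func main() {
	s := Submitted
	for {
		n, ok := advance(s)
		if !ok {
			break
		}
		s = n
	}
	fmt.Println(s == Completed) // prints true: the happy path ends in Completed
}
```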
┌─────────────────────────────────────────────────────────────┐
│                      CAP Control Plane                      │
│                                                             │
│  Client ──▶ Gateway ──▶ Scheduler ──▶ Safety ──▶ Workers    │
│                             │                    │          │
│                             ▼                    ▼          │
│                        [Job State]           [Results]      │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
                      ┌──────────────┐
                      │ MCP (tools)  │
                      └──────────────┘
CAP handles:
- `BusPacket` envelopes for all messages
- `JobRequest`/`JobResult` with a full state machine
- `context_ptr`/`result_ptr` to keep blobs off the wire
- Heartbeats for worker pools
- Safety Kernel integration (policy checks before dispatch)
- Workflow orchestration with `workflow_id`, `parent_job_id`, `step_index`
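The pointer architecture in miniature, assuming a content-addressed blob store (the store and key scheme here are illustrative, not CAP's wire format):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blobStore stands in for an external object store; only the key travels on the bus.
var blobStore = map[string][]byte{}

// putBlob stores the payload and returns a content-addressed pointer,
// the kind of value a context_ptr-style field would carry.
func putBlob(payload []byte) string {
	sum := sha256.Sum256(payload)
	key := hex.EncodeToString(sum[:])
	blobStore[key] = payload
	return key
}

// getBlob dereferences a pointer back into the full payload.
func getBlob(ptr string) ([]byte, bool) {
	b, ok := blobStore[ptr]
	return b, ok
}

func main() {
	ptr := putBlob([]byte("large model context, kept off the bus"))
	fmt.Println(len(ptr)) // prints 64: a fixed-size hex digest rides the bus instead of the blob
}
```

Whatever the store, the effect is the same: the bus only ever carries small, constant-size references, which helps both throughput and payload security.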
The Key Insight: Different Layers
Think of it like the network stack:
| Layer | Protocol | What It Does |
|---|---|---|
| Tool execution | MCP | Model ↔ Tool communication |
| Agent orchestration | CAP | Job scheduling, routing, safety, state |
| Transport | NATS/Kafka | Message delivery |
In rough OSI terms, MCP is layer 7; CAP sits around layers 5-6.
You wouldn't use HTTP to schedule Kubernetes jobs. Similarly, you shouldn't use MCP to orchestrate distributed agent workloads.
How They Work Together
Here's the beautiful part: MCP and CAP complement each other perfectly.
A CAP worker receives a job, executes it (potentially using MCP to call tools), and returns a result. MCP handles the tool-calling inside the worker. CAP handles everything outside.
┌─────────────────────────────────────────────────────────────────┐
│                           CAP Cluster                           │
│                                                                 │
│  ┌──────────┐     ┌───────────┐     ┌─────────────────────────┐ │
│  │  Client  │────▶│ Scheduler │────▶│       Worker Pool       │ │
│  └──────────┘     └───────────┘     │  ┌───────────────────┐  │ │
│                         │           │  │    CAP Worker     │  │ │
│                         ▼           │  │         │         │  │ │
│                  [Safety Kernel]    │  │         ▼         │  │ │
│                                     │  │    ┌─────────┐    │  │ │
│                                     │  │    │   MCP   │    │  │ │
│                                     │  │    │ Client  │    │  │ │
│                                     │  │    └────┬────┘    │  │ │
│                                     │  └─────────┼─────────┘  │ │
│                                     └────────────┼────────────┘ │
└──────────────────────────────────────────────────┼──────────────┘
                                                   │
                                                   ▼
                                            [MCP Servers]
                                          (tools, DBs, APIs)
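That worker boundary can be sketched in a few lines of Go, with stand-in types replacing CAP's generated protobuf structs and a stub function standing in for the MCP round trip:

```go
package main

import "fmt"

// Illustrative stand-ins for CAP's generated protobuf types; field names are assumed.
type JobRequest struct{ JobID, Payload string }
type JobResult struct{ JobID, Output, Status string }

// callTool stands in for an MCP tools/call round trip to an MCP server.
func callTool(input string) string {
	return "tool-output:" + input
}

// handleJob is the worker boundary: CAP delivers the request, MCP (stubbed
// here by callTool) does the tool work, and CAP carries the result back.
func handleJob(req JobRequest) JobResult {
	return JobResult{
		JobID:  req.JobID,
		Output: callTool(req.Payload),
		Status: "SUCCEEDED",
	}
}

func main() {
	res := handleJob(JobRequest{JobID: "job-1", Payload: "SELECT 1"})
	fmt.Println(res.Status) // prints SUCCEEDED
}
```

Swap `callTool` for a real MCP client and `handleJob` for a bus subscription and you have the shape of a production worker.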
Example flow:
1. Client submits a job via CAP (`JobRequest` to `sys.job.submit`)
2. Scheduler checks the Safety Kernel → approved
3. Job is dispatched to the worker pool via CAP
4. Worker uses MCP to call tools (query DB, fetch API, etc.)
5. Worker returns the result via CAP (`JobResult` to `sys.job.result`)
6. Scheduler updates state and notifies the client
MCP never touches the bus. CAP never touches the tools. Clean separation.
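The dispatch step in that flow relies on competing consumers: several workers pull from one queue, and each job goes to exactly one of them. Here's the pattern in miniature, with goroutines standing in for workers (NATS queue groups give the same semantics on a real bus):

```go
package main

import (
	"fmt"
	"sync"
)

// run fans a batch of job IDs out to a pool of competing workers and
// records how many times each job was handled.
func run(jobIDs []string, workers int) map[string]int {
	jobs := make(chan string, len(jobIDs))
	for _, id := range jobIDs {
		jobs <- id
	}
	close(jobs)

	var mu sync.Mutex
	handled := map[string]int{}
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs { // each receive claims one job exclusively
				mu.Lock()
				handled[id]++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return handled
}

func main() {
	handled := run([]string{"j1", "j2", "j3", "j4"}, 3)
	fmt.Println(len(handled)) // prints 4: every job handled exactly once
}
```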
Why This Matters for Production
If you're building a toy demo, you don't need CAP. One model, a few tools, MCP is plenty.
But if you're building production multi-agent systems, you need:
| Requirement | MCP | CAP |
|---|---|---|
| Tool discovery & invocation | ✅ | ❌ |
| Job scheduling | ❌ | ✅ |
| Distributed worker pools | ❌ | ✅ |
| Safety policies (allow/deny/throttle) | ❌ | ✅ |
| Job state machine | ❌ | ✅ |
| Worker heartbeats & capacity | ❌ | ✅ |
| Workflow orchestration | ❌ | ✅ |
| Payload security (pointer refs) | ❌ | ✅ |
CAP gives you the control plane. MCP gives you the tool plane.
Getting Started with CAP
CAP is open source (Apache-2.0) with SDKs for Go, Python, Node/TS, and C++.
Minimal Go worker (20 lines):
nc, _ := nats.Connect("nats://127.0.0.1:4222")
nc.QueueSubscribe("job.echo", "job.echo", func(msg *nats.Msg) {
var pkt agentv1.BusPacket
proto.Unmarshal(msg.Data, &pkt)
req := pkt.GetJobRequest()
res := &agentv1.JobResult{
JobId: req.GetJobId(),
Status: agentv1.JobStatus_JOB_STATUS_SUCCEEDED,
}
out, _ := proto.Marshal(&agentv1.BusPacket{
Payload: &agentv1.BusPacket_JobResult{JobResult: res},
})
nc.Publish("sys.job.result", out)
})
Links:
- GitHub: github.com/cordum-io/cap
- Spec: github.com/cordum-io/cap/tree/main/spec
- Reference implementation: Cordum
TL;DR
- MCP = tool protocol for single-model contexts
- CAP = job protocol for distributed agent clusters
- They solve different problems at different layers
- Use both: CAP for orchestration, MCP inside workers for tools
- Stop using MCP for things it wasn't designed for
The multi-agent future needs both protocols. Now you know which one to reach for.
CAP is developed by Cordum, the AI Agent Governance Platform. Star the repo if this was useful: github.com/cordum-io/cap
Tags: #ai #agents #mcp #distributed-systems #orchestration #protocols