
yaron torgeman


# MCP vs CAP: Why Your AI Agents Need Both Protocols

The AI agent ecosystem is exploding with protocols. Anthropic released MCP (Model Context Protocol). Google announced A2A (Agent-to-Agent). Every week there's a new "standard" for agent communication.

But here's the thing most people miss: these protocols solve different problems at different layers. Using MCP for distributed agent orchestration is like using HTTP for job scheduling—wrong tool, wrong layer.

Let me break down the actual difference and why you probably need both.


What MCP Actually Does

MCP (Model Context Protocol) is a tool-calling protocol for a single model. It standardizes how one LLM discovers and invokes external tools—databases, APIs, file systems, etc.

┌─────────────────────────────────────┐
│            Your LLM                 │
│                                     │
│  "I need to query the database"     │
│              │                      │
│              ▼                      │
│     ┌─────────────┐                 │
│     │  MCP Client │                 │
│     └──────┬──────┘                 │
└────────────┼────────────────────────┘
             │
             ▼
     ┌───────────────┐
     │  MCP Server   │
     │  (tool host)  │
     └───────────────┘
             │
             ▼
        [Database]

MCP is great at this. It solves tool discovery, schema negotiation, and invocation for a single model context.

What MCP doesn't cover:

  • How do you schedule work across multiple agents?
  • How do you track job state across a cluster?
  • How do you enforce safety policies before execution?
  • How do you handle agent liveness and capacity?
  • How do you fan out workflows with parent/child relationships?

MCP was never designed for this. It's a tool protocol, not an orchestration protocol.


Enter CAP: The Missing Layer

CAP (Cordum Agent Protocol) is a cluster-native job protocol for AI agents. It standardizes the control plane that MCP doesn't touch:

  • Job lifecycle: submit → schedule → dispatch → run → complete
  • Distributed routing: pool-based dispatch with competing consumers
  • Safety hooks: allow/deny/throttle decisions before any job runs
  • Heartbeats: worker liveness, capacity, and pool membership
  • Workflows: parent/child jobs with aggregation
  • Pointer architecture: keeps payloads off the bus for security and performance
┌─────────────────────────────────────────────────────────────┐
│                     CAP Control Plane                       │
│                                                             │
│  Client ──▶ Gateway ──▶ Scheduler ──▶ Safety ──▶ Workers   │
│                              │                      │       │
│                              ▼                      ▼       │
│                         [Job State]           [Results]     │
└─────────────────────────────────────────────────────────────┘
                                                      │
                                                      ▼
                                              ┌──────────────┐
                                              │ MCP (tools)  │
                                              └──────────────┘
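The job lifecycle above is easiest to picture as a small state machine. The sketch below is only an illustration: the state names and the `Advance` function are invented here to mirror submit → schedule → dispatch → run → complete, and the real CAP state machine presumably has more states (failures, denials, timeouts) than this happy path.

```go
package main

import "fmt"

// JobState is a hypothetical enum mirroring the lifecycle in the text.
type JobState int

const (
	Submitted JobState = iota
	Scheduled
	Dispatched
	Running
	Completed
)

var stateNames = [...]string{"submitted", "scheduled", "dispatched", "running", "completed"}

// String returns a readable name, or "unknown" for out-of-range values.
func (s JobState) String() string {
	if s < Submitted || s > Completed {
		return "unknown"
	}
	return stateNames[s]
}

// Advance returns the next state on the happy path. It returns an
// error for terminal or invalid states, so a job cannot skip or
// repeat a step.
func Advance(s JobState) (JobState, error) {
	if s < Submitted || s >= Completed {
		return s, fmt.Errorf("no transition out of state %q", s.String())
	}
	return s + 1, nil
}

func main() {
	s := Submitted
	for {
		fmt.Println(s)
		next, err := Advance(s)
		if err != nil {
			break // reached the terminal state
		}
		s = next
	}
}
```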

CAP handles:

  • BusPacket envelopes for all messages
  • JobRequest / JobResult with full state machine
  • context_ptr / result_ptr to keep blobs off the wire
  • Heartbeats for worker pools
  • Safety Kernel integration (policy checks before dispatch)
  • Workflow orchestration with workflow_id, parent_job_id, step_index
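The pointer idea behind context_ptr / result_ptr can be sketched with an in-memory store: the payload goes into storage, and only a short reference travels on the bus. Everything named here (`BlobStore`, `JobMessage`, the `mem://` scheme) is a stand-in invented for this example, not CAP's actual types or pointer format.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// BlobStore is a hypothetical stand-in for whatever storage holds
// the real payloads (object store, cache, database).
type BlobStore struct {
	blobs map[string][]byte
}

func NewBlobStore() *BlobStore {
	return &BlobStore{blobs: make(map[string][]byte)}
}

// Put stores a copy of data and returns a pointer string derived
// from its SHA-256 hash. The "mem://" scheme is made up.
func (s *BlobStore) Put(data []byte) string {
	sum := sha256.Sum256(data)
	ptr := "mem://" + hex.EncodeToString(sum[:8])
	s.blobs[ptr] = append([]byte(nil), data...)
	return ptr
}

// Get resolves a pointer back to its payload.
func (s *BlobStore) Get(ptr string) ([]byte, bool) {
	data, ok := s.blobs[ptr]
	return data, ok
}

// JobMessage is what would travel on the bus: IDs and pointers only.
type JobMessage struct {
	JobID      string
	ContextPtr string
}

func main() {
	store := NewBlobStore()
	payload := []byte("a large prompt or document that should stay off the bus")

	msg := JobMessage{JobID: "job-1", ContextPtr: store.Put(payload)}
	fmt.Printf("on the bus: %+v\n", msg)

	// The worker resolves the pointer only when it actually runs the job.
	data, _ := store.Get(msg.ContextPtr)
	fmt.Printf("worker fetched %d bytes\n", len(data))
}
```

The bus message stays the same small size no matter how big the payload is, and access to the payload can be controlled at the store rather than at every hop.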

The Key Insight: Different Layers

Think of it like the network stack:

| Layer | Protocol | What It Does |
|---|---|---|
| Tool execution | MCP | Model ↔ Tool communication |
| Agent orchestration | CAP | Job scheduling, routing, safety, state |
| Transport | NATS/Kafka | Message delivery |

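In rough OSI terms, MCP sits at layer 7 and CAP at layers 5-6. The mapping is loose, but the point holds: they live at different levels of the stack.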

You wouldn't treat HTTP itself as a job scheduler for Kubernetes. Similarly, you shouldn't use MCP to orchestrate distributed agent workloads.


How They Work Together

Here's the beautiful part: MCP and CAP complement each other perfectly.

A CAP worker receives a job, executes it (potentially using MCP to call tools), and returns a result. MCP handles the tool-calling inside the worker. CAP handles everything outside.

┌─────────────────────────────────────────────────────────────────┐
│                         CAP Cluster                             │
│                                                                 │
│   ┌──────────┐    ┌───────────┐    ┌─────────────────────────┐ │
│   │  Client  │───▶│ Scheduler │───▶│      Worker Pool        │ │
│   └──────────┘    └───────────┘    │  ┌───────────────────┐  │ │
│                         │          │  │   CAP Worker      │  │ │
│                         ▼          │  │        │          │  │ │
│                   [Safety Kernel]  │  │        ▼          │  │ │
│                                    │  │   ┌─────────┐     │  │ │
│                                    │  │   │   MCP   │     │  │ │
│                                    │  │   │ Client  │     │  │ │
│                                    │  │   └────┬────┘     │  │ │
│                                    │  └────────┼──────────┘  │ │
│                                    └───────────┼─────────────┘ │
└────────────────────────────────────────────────┼───────────────┘
                                                 ▼
                                          [MCP Servers]
                                          (tools, DBs, APIs)

Example flow:

  1. Client submits job via CAP (JobRequest to sys.job.submit)
  2. Scheduler checks Safety Kernel → approved
  3. Job dispatched to worker pool via CAP
  4. Worker uses MCP to call tools (query DB, fetch API, etc.)
  5. Worker returns result via CAP (JobResult to sys.job.result)
  6. Scheduler updates state, notifies client

MCP never touches the bus. CAP never touches the tools. Clean separation.


Why This Matters for Production

If you're building a toy demo, you don't need CAP. One model, a few tools, MCP is plenty.

But if you're building production multi-agent systems, you need:

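| Requirement | MCP | CAP |
|---|---|---|
| Tool discovery & invocation | Yes | No |
| Job scheduling | No | Yes |
| Distributed worker pools | No | Yes |
| Safety policies (allow/deny/throttle) | No | Yes |
| Job state machine | No | Yes |
| Worker heartbeats & capacity | No | Yes |
| Workflow orchestration | No | Yes |
| Payload security (pointer refs) | No | Yes |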

CAP gives you the control plane. MCP gives you the tool plane.


Getting Started with CAP

CAP is open source (Apache-2.0) with SDKs for Go, Python, Node/TS, and C++.

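Minimal Go worker (a fragment from inside main, with basic error handling):

// Needs "log", "github.com/nats-io/nats.go", "google.golang.org/protobuf/proto",
// and the CAP Go SDK's agentv1 package (see the repo for its import path).
nc, err := nats.Connect("nats://127.0.0.1:4222")
if err != nil {
    log.Fatal(err)
}

// Subject and queue group are both "job.echo": workers in the same
// group compete for messages, so each job goes to exactly one of them.
_, err = nc.QueueSubscribe("job.echo", "job.echo", func(msg *nats.Msg) {
    var pkt agentv1.BusPacket
    if err := proto.Unmarshal(msg.Data, &pkt); err != nil {
        return // drop malformed packets
    }

    req := pkt.GetJobRequest()
    res := &agentv1.JobResult{
        JobId:  req.GetJobId(),
        Status: agentv1.JobStatus_JOB_STATUS_SUCCEEDED,
    }

    out, err := proto.Marshal(&agentv1.BusPacket{
        Payload: &agentv1.BusPacket_JobResult{JobResult: res},
    })
    if err != nil {
        return
    }
    nc.Publish("sys.job.result", out) // report the result back to the scheduler
})
if err != nil {
    log.Fatal(err)
}
select {} // block forever so the subscription stays alive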



TL;DR

  • MCP = tool protocol for single-model contexts
  • CAP = job protocol for distributed agent clusters
  • They solve different problems at different layers
  • Use both: CAP for orchestration, MCP inside workers for tools
  • Stop using MCP for things it wasn't designed for

The multi-agent future needs both protocols. Now you know which one to reach for.


CAP is developed by Cordum, the AI Agent Governance Platform. Star the repo if this was useful: github.com/cordum-io/cap


Tags: #ai #agents #mcp #distributed-systems #orchestration #protocols
