Remember the first time you installed Istio?
The pitch was beautiful. mTLS everywhere. Automatic retries. Distributed tracing. Traffic shifting with a YAML file. You ran istioctl install, got a coffee, came back, and discovered every pod had grown a 50MB Envoy companion. Memory usage doubled. Cold starts crawled. Your Helm chart sprouted a proxy.resources.limits block. And every six months, you'd schedule a maintenance window to upgrade the control plane and pray nothing in the data path broke.
Service meshes are extraordinary. They're also expensive — in CPU, in mental load, in operational surface area.
So here's the question we kept asking while building MCP Mesh:
What if you could have everything a service mesh gives you, but built into the agents themselves — no sidecar, no control plane, no CRDs?
Not as a thought experiment. As a thing you can pip install today.
The five jobs of a service mesh
Strip away the marketing and a service mesh does five things:
- Discovery — find the right backend without hardcoding hosts.
- mTLS — every call mutually authenticated, automatically.
- Traffic policy — retries, timeouts, circuit breaking, canaries.
- Observability — every hop traced, every error surfaced.
- Identity — workloads have provable identity that survives across orgs.
These are real production needs. They didn't go away with AI agents. If anything, they got worse: now you've got Python agents calling TypeScript agents calling Java agents calling Claude. Discovery, mTLS, retries — all the same problems, just in new clothes.
So why not just wrap MCP agents in Envoy and ship?
You can. People do. And then they're back to two binaries per workload, a control plane to babysit, and the joy of debugging "is the sidecar healthy or is the app healthy?"
Why sidecars exist
Let's be fair to the sidecar pattern. It exists for one excellent reason: the application doesn't know about the mesh, and you don't want to make it know.
If you have a hundred microservices written by ten teams across five years in three languages, asking each of them to integrate a mesh SDK is a non-starter. The sidecar lets you bolt the mesh on from the outside. It's a retrofit.
But MCP agents are new code. They're being written now, on a protocol designed now, for an ecosystem that's barely two years old. We don't have to retrofit. We can put the mesh inside the agent.
That's what MCP Mesh does.
Registry as facilitator, not proxy
The architecture is one diagram:
┌─────────┐ Heartbeat / Topology ┌──────────┐
│ Agent A │ ──────────────────────────► │ Registry │
│ │ ◄────────────────────────── │ │
└─────────┘ └──────────┘
│
│ Direct MCP call (mTLS)
▼
┌─────────┐
│ Agent B │
└─────────┘
Two things to notice.
First: the registry never sits in the data path. Agents heartbeat in every five seconds, get back the topology of who provides what, and then call each other directly. The registry is a phone book, not a telephone exchange.
Second: if the registry dies, your agents keep talking. They've got the topology cached. New agents can't join until it's back, but existing traffic flows uninterrupted. Try that with Pilot down.
This is the move that makes "no sidecar" possible. You don't need a proxy because the destination is already known to the source. The mesh isn't between agents; it is the agents.
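To make the "phone book, not telephone exchange" idea concrete, here's a minimal sketch of the agent-side pattern: refresh the topology on each heartbeat, keep serving from cache when the registry is unreachable. The names here (`TopologyCache`, `flaky_registry`) are illustrative, not MCP Mesh API.

```python
# Sketch: agent refreshes its topology on heartbeat, but the data path
# only ever reads the local cache -- so a dead registry can't stop traffic.
import time

class TopologyCache:
    def __init__(self, fetch_topology):
        self._fetch = fetch_topology       # callable -> {capability: endpoint}
        self._topology = {}

    def heartbeat(self):
        """Try to refresh from the registry; keep the cache on failure."""
        try:
            self._topology = self._fetch()
            return True
        except ConnectionError:
            return False                   # registry down: cached routes still work

    def resolve(self, capability):
        """Data-path lookup -- never touches the registry."""
        return self._topology[capability]

# Simulated registry that dies after the first heartbeat.
calls = {"n": 0}
def flaky_registry():
    calls["n"] += 1
    if calls["n"] > 1:
        raise ConnectionError("registry unreachable")
    return {"weather": "https://10.0.0.7:9443/mcp"}

cache = TopologyCache(flaky_registry)
assert cache.heartbeat() is True
assert cache.heartbeat() is False                                # registry died...
assert cache.resolve("weather") == "https://10.0.0.7:9443/mcp"   # ...traffic flows
```

The design choice worth noticing: resolution and refresh are decoupled, so registry availability only gates topology *changes*, never calls.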
Walking the service-mesh feature list
Let's go down the checklist and show what each one looks like in MCP Mesh.
Discovery → declare what you need, get it injected
@mesh.tool(
capability="plan_trip",
dependencies=[
{"capability": "weather", "tags": ["+claude"]},
{"capability": "hotels"},
{"capability": "flights"},
],
)
async def plan_trip(
destination, dates,
weather: mesh.McpMeshTool = None,
hotels: mesh.McpMeshTool = None,
flights: mesh.McpMeshTool = None,
):
forecast = await weather(destination=destination, dates=dates)
options = await hotels(destination=destination, dates=dates)
routes = await flights(destination=destination, dates=dates)
return TripPlan(forecast, options, routes)
No service host. No port. No DNS lookup. You declared what you need; the mesh handed you a callable. The proxy is in the function signature.
When a new weather provider comes online — better, faster, cheaper — your call goes there on the next heartbeat tick. Zero deploy. Zero code change.
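Under the hood, "the proxy is in the function signature" is just dependency injection keyed by capability. Here's a hypothetical sketch of the mechanism — the `inject` decorator and `TOPOLOGY` dict stand in for MCP Mesh internals and are not its actual API:

```python
# Sketch: a decorator resolves declared capabilities against a topology
# table and passes the resulting callables as keyword arguments.
import functools

# Stand-in for the mesh's resolved topology: capability -> callable proxy.
TOPOLOGY = {"weather": lambda **kw: {"forecast": "sunny", **kw}}

def inject(**deps):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            for param, capability in deps.items():
                kwargs.setdefault(param, TOPOLOGY[capability])
            return fn(*args, **kwargs)
        return inner
    return wrap

@inject(weather="weather")
def plan(destination, weather=None):
    # `weather` looks local but could route anywhere the topology points.
    return weather(destination=destination)

result = plan("Lisbon")
assert result["forecast"] == "sunny"
assert result["destination"] == "Lisbon"
```

Swap the entry in `TOPOLOGY` and the same call goes somewhere else — which is exactly the re-routing-on-heartbeat behavior described above.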
mTLS → one flag
meshctl start --registry-only --tls-auto
meshctl start my_agent.py --tls-auto
That generates a mini-CA, issues certs to every agent, and turns on mutual TLS for every call. In production, swap the file provider for SPIRE or Vault:
export MCP_MESH_TLS_PROVIDER=spire
export MCP_MESH_SPIRE_SOCKET=/run/spire/agent/sockets/agent.sock
SPIFFE-aware out of the box. URI SANs, not DNS SANs. Private keys live in /dev/shm so they never touch disk. No Citadel install. No mesh-wide rollout. Each agent picks up its identity at startup and uses its language's native TLS stack — httpx in Python, tls in Node, SSLContext in Java.
The mesh isn't a layer below your app. It's a few lines inside it.
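For a sense of what "uses its language's native TLS stack" means on the Python side, here's a sketch of a mutual-TLS `ssl.SSLContext` of the kind an HTTP client like httpx can consume directly. The file paths and helper name are illustrative; in MCP Mesh the certs are provisioned at startup by the configured provider.

```python
# Sketch: standard-library mutual TLS config -- verify the peer against
# the mesh CA, and present our own cert chain as client identity.
import ssl

def make_mtls_context(ca_file=None, cert_file=None, key_file=None):
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED            # peer MUST present a valid cert
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # trust only the mesh CA
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)  # our identity
    return ctx

ctx = make_mtls_context()                          # no files here: structure only
assert ctx.verify_mode == ssl.CERT_REQUIRED
assert ctx.minimum_version == ssl.TLSVersion.TLSv1_2
```

No sidecar terminates TLS on the agent's behalf; the agent's own client stack holds the identity.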
Multi-org trust → entities, not federation
In a real enterprise mesh you've got two orgs that need to talk. Service mesh answer: federation, trust bundles, multi-cluster. MCP Mesh answer:
meshctl entity register "partner-corp" --ca-cert /path/to/ca.pem
Done. Their agents are now trusted peers in your mesh.
Traffic policy → kwargs, not VirtualService
@mesh.tool(
dependencies=["slow_service"],
dependency_kwargs={
"slow_service": {
"timeout": 60,
"retry_count": 3,
"streaming": True,
"session_required": True,
}
},
)
async def my_tool(slow_service: mesh.McpMeshTool = None):
return await slow_service(payload=...)
That's your DestinationRule. In Python. Type-checkable. Diff-reviewable. Lives next to the code that uses it.
Want canaries? Tags do that:
dependencies=[
{"capability": "math", "tags": ["addition", ["python", "typescript"]]},
]
Try the Python provider first; if none is available, fall back to TypeScript. That's a shadow deploy, a canary, and a fallback policy in one tag expression.
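Ordered fallback over tags is a small amount of logic. Here's an illustrative sketch of the idea — `resolve_with_fallback` and the `providers` table are hypothetical, not the MCP Mesh matcher:

```python
# Sketch: match a capability against required tags, trying each preferred
# tag in order and falling back to the next if nothing matches.
providers = [
    {"capability": "math", "tags": {"addition", "typescript"}},
    {"capability": "math", "tags": {"addition", "go"}},
]

def resolve_with_fallback(capability, required, preference_order):
    for preferred in preference_order:
        for p in providers:
            if (p["capability"] == capability
                    and required <= p["tags"]
                    and preferred in p["tags"]):
                return p
    return None

# "python first, then typescript" -- no python provider exists, so the
# typescript one wins without any caller-side change.
match = resolve_with_fallback("math", {"addition"}, ["python", "typescript"])
assert "typescript" in match["tags"]
```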
Header propagation → one env var
Auth tokens, tenant IDs, correlation headers — they need to flow end-to-end through the call chain.
export MCP_MESH_PROPAGATE_HEADERS=authorization,x-tenant-id,x-request-id
Set it on every agent. Inbound headers matching the allowlist get captured into request-scoped context (Python uses contextvars for async safety) and re-injected into every downstream call automatically. No middleware. No Envoy filter. No code in your tool functions.
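The contextvars mechanism mentioned above can be sketched in a few lines — allowlisted inbound headers are captured into request-scoped context and merged into every outbound call. The function names here are illustrative, not the MCP Mesh internals:

```python
# Sketch: capture allowlisted inbound headers into a ContextVar (safe
# across concurrent async requests), then re-inject them downstream.
import contextvars

ALLOWLIST = {"authorization", "x-tenant-id", "x-request-id"}
_headers = contextvars.ContextVar("mesh_headers", default={})

def capture_inbound(headers):
    kept = {k.lower(): v for k, v in headers.items() if k.lower() in ALLOWLIST}
    _headers.set(kept)

def outbound_headers(extra=None):
    merged = dict(_headers.get())   # propagated context first
    merged.update(extra or {})      # per-call headers may override
    return merged

capture_inbound({"Authorization": "Bearer abc",
                 "X-Tenant-Id": "acme",
                 "X-Debug": "1"})                    # not allowlisted
out = outbound_headers()
assert out == {"authorization": "Bearer abc", "x-tenant-id": "acme"}
```

Because each asyncio task gets its own context, two concurrent requests never see each other's tenant IDs — the property a sidecar gives you via per-connection state, delivered in-process.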
Observability → one flag and a Helm chart
export MCP_MESH_DISTRIBUTED_TRACING_ENABLED=true
helm install mcp-core oci://ghcr.io/dhyansraj/mcp-mesh/mcp-mesh-core --version 1.3.3
That ships you a Tempo backend, a Grafana with dashboards pre-loaded, and OTLP wired up between every agent. Then:
$ meshctl trace abc123def456
└─ plan_trip (trip-planner) [245ms] ✓
├─ weather (claude-weather) [120ms] ✓
│ └─ llm:claude (claude-provider) [98ms] ✓
├─ hotels (gpt-hotels) [80ms] ✓
└─ flights (flight-agent) [40ms] ✓
Every hop. Every agent. Every LLM call. From your terminal.
The thing service meshes can't do
Here's where MCP Mesh stops looking like Istio and starts looking like something new.
A service mesh routes traffic. Bytes from one socket to another. Your code still has to know there's an HTTP client in there, construct a URL, parse a response, deserialize JSON, handle the error envelope.
MCP Mesh routes function calls.
forecast = await weather(destination=destination, dates=dates)
That weather is a Python callable. It happens to live in another process, on another machine, written in another language, possibly powered by an LLM. You can't tell from the code. You don't have to. The mesh handed you a function. You called it. The result came back typed.
We call this DDDI — Distributed Dynamic Dependency Injection. Spring gave us DI on a single JVM. Guice gave it to us in fewer lines. MCP Mesh gives it to us across the network, across runtimes, across language boundaries.
The proxy is in your function signature. That's the move.
And it composes:
weather = reasoning_weather if user.wants_explanation else api_weather
forecast = await weather(destination, dates)
That's a routing decision in Python. In an if. Not a VirtualService. Not a percentage split in YAML. The full power of your language, applied to traffic shaping.
"But it only meshes MCP agents…"
That's where the skeptic leans in. Sure, sure, but I've got Postgres, a REST API, a vector DB, and a legacy SOAP service. Your shiny mesh doesn't help me with any of those.
Hold up.
MCP isn't a wire format for AI gimmicks. MCP is a tool protocol with typed schemas, streaming, structured errors, and bidirectional communication built in. Wrap your Postgres in an MCP tool — it's now a mesh participant. Wrap your REST API — same. Your RAG pipeline, your S3 bucket, your billing system — every one of those is one @mesh.tool away from being a first-class member of the mesh, with mTLS, retries, tracing, and discovery, for free.
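To make "wrap your Postgres in an MCP tool" concrete, here's a sketch with sqlite3 standing in for Postgres and a plain decorator standing in for `@mesh.tool` — the point being that any existing system is one thin typed function away from mesh membership:

```python
# Sketch: expose a database query as a discoverable, typed tool.
# REGISTRY and the `tool` decorator are stand-ins for the mesh's
# capability registry, not MCP Mesh API.
import sqlite3

REGISTRY = {}

def tool(capability):
    def wrap(fn):
        REGISTRY[capability] = fn   # the mesh would also publish the schema
        return fn
    return wrap

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, total REAL)")
db.execute("INSERT INTO orders VALUES (1, 42.5)")

@tool(capability="order_lookup")
def order_lookup(order_id: int) -> dict:
    row = db.execute("SELECT id, total FROM orders WHERE id = ?",
                     (order_id,)).fetchone()
    return {"id": row[0], "total": row[1]}

# Any mesh participant can now discover and call it by capability --
# and it inherits mTLS, retries, and tracing from the mesh, not from
# code inside this function.
assert REGISTRY["order_lookup"](order_id=1) == {"id": 1, "total": 42.5}
```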
And here's the kicker: MCP is arguably a better default than REST for new internal services.
- Schemas, without dragging OpenAPI tooling along.
- Streaming, without SSE plumbing.
- Tool discovery, without a Swagger portal.
- Bidirectional, without WebSocket gymnastics.
If you're building a new internal API in 2026, you should at least consider exposing it over MCP. The ergonomics are better. The schemas are first-class. The tooling is finally catching up. This isn't AI hype — this is what REST should have evolved into a decade ago.
But say you're not ready to MCP-ify everything. The proxy layer in MCP Mesh — EnhancedUnifiedMCPProxy — already abstracts transport from semantics. Adding a REST adapter is an architectural extension, not a rewrite. gRPC, the same. The mesh isn't tied to MCP. MCP is just the first protocol it speaks. Multi-protocol mesh isn't a roadmap dream — it's an extension point sitting in the codebase right now.
This isn't a narrow tool for a niche protocol. This is a paradigm shift in how distributed systems get composed, and MCP happens to be the protocol that made the shift cheap enough to ship.
So here's the punchline
For the K8s crowd: you can have a mesh without the proxy tax. mTLS, discovery, retries, tracing, identity — all of it — delivered in-process, debuggable in your own language, with no second binary per pod.
For the application developer: you can call a remote agent like a local function, in any of three languages, with the proxy in your function signature.
For the platform team: you get a control plane that fits in one binary, scales to thousands of agents, and stays out of the data path.
For everyone: the mesh isn't bolted on. It is the agents.
If you searched "service mesh for MCP agents" and ended up here — congratulations. You found the only one.
And it doesn't have a sidecar.
npm install -g @mcpmesh/cli
meshctl scaffold
Welcome to the mesh.