Andrea

Posted on Apr 21

MCP security has 4 layers. Most teams have 2.

#security #ai #mcp #agents

When people talk about "securing MCP" they mean very different things. One team is scanning MCP server manifests for malicious tool definitions. Another is locking agents in Docker containers with no outbound network. A third is writing runtime policies that deny certain tool calls. A fourth is parsing audit logs after the fact to see what happened.

These aren't different solutions to the same problem. They're four different problems, at four different layers of the stack. Lump them together and you'll end up thinking one tool is enough when it isn't.

Here's the model I've landed on after building SentinelGate for the past few months.

Layer 1 — Scan: is this server safe to install?

What it does: Inspects MCP servers before you run them. Looks at tool manifests, embedded prompts, tool descriptions, for known-malicious patterns — tool poisoning, misleading descriptions, supply-chain risk.

Tools: Cisco MCP Scanner, Invariant Labs mcp-scan, BlueRock's MCP Trust Registry.

What it doesn't catch: Anything that happens at runtime. A scanned-clean server can still be misused by an agent that was tricked by prompt injection.

Layer 2 — Sandbox: can the agent reach outside?

What it does: Isolates the agent's execution environment. Network off by default, limited filesystem access, resource limits, process boundaries.

Tools: E2B, Docker, Fly.io, Firecracker, custom VMs.

What it doesn't catch: Anything inside the perimeter. Containers are walls — they can block everything or allow everything, but they have no notion of "this tool call is a read, so it's fine, but that one is a delete, so block it". They operate below the protocol.

Layer 3 — Gate: should this specific call go through?

What it does: Sits between the agent and the tools. Intercepts every MCP tool call, evaluates policy, decides allow / deny / require-approval. Content scanning on request arguments and tool responses.

Tools: SentinelGate (what we work on) and Stacklok ToolHive both sit here — with different policy models.

ToolHive evaluates each call on its own, using Cedar (stateless: principal, action, resource). SentinelGate remembers what happened earlier in the same session, so it can block sequences across calls — a read of ~/.ssh/id_rsa followed by a write to a remote endpoint, for example.

Pick based on whether your threat model needs per-call authz or cross-call pattern detection.

Take the scenario from last week's post — a malicious.txt file with a hidden prompt injection. Scan wouldn't have seen it (the MCP filesystem server is legitimate). Sandbox wouldn't have blocked it (it's a syntactically valid tool call from inside the perimeter). A Gate with content scanning on tool responses does.

What it doesn't catch: Anything that bypasses MCP. If the agent makes a raw syscall, or runs curl directly, a gate doesn't see it — which is why this layer is paired with Sandbox, not a replacement for it.

Layer 4 — Audit & Response: what happened, and what do we do now?

What it does: Not prevention — that's the Gate's job. It takes the event stream the Gate emits and turns it into long-term, queryable, cross-session data. Months of retention. Cross-agent pattern detection ("this agent has been denied 40 times today — is it compromised, or is the policy wrong?"). Alerts routed into your existing on-call. Compliance archive. Gates block in real time, SIEMs store what happened at scale.

Tools: Splunk, Elastic, Loki, your existing SIEM. SentinelGate emits structured events designed to flow into whatever you already run.

What it doesn't catch: Anything that wasn't logged in the first place. Which is why the Gate layer matters — it's the thing producing the events this layer consumes.

A quick map of the four

Layer	When it runs	Example tools	Main blind spot
1. Scan	Pre-deploy	Cisco MCP Scanner, Invariant Labs mcp-scan	Runtime behavior
2. Sandbox	Runtime, perimeter	E2B, Docker, Firecracker	Calls inside the perimeter
3. Gate	Runtime, per-call	SentinelGate, Stacklok ToolHive	Non-MCP channels
4. Audit & Response	Post-runtime	Splunk, Elastic, SIEM	Events that were never logged

The gap most teams have

The patterns I see most often in production MCP setups:

Scan + Sandbox (dev + security collaboration): they scan servers before deployment and run agents in Docker
Sandbox + Audit (platform teams): they containerize and ship logs to a SIEM

What's missing in both is the Gate. The layer that evaluates every tool call in real time and can say no before something happens, not after.

Where SentinelGate fits

SentinelGate is Layer 3: it evaluates every MCP call at runtime and decides allow / deny / require-approval. That's the reason to adopt it.

You also get the emission side of Layer 4 for free: every decision is logged as a structured event, the admin UI replays sessions, the kill switch cuts everything off in seconds. Running a SIEM already? Events stream straight into it. Not yet? SentinelGate's built-in view covers short-term triage until you plug one in.

What still lives in your SIEM: months of retention, cross-agent analytics, on-call alert routing.

If you already have Scan and Sandbox covered, Gate is the next thing to add.

Closing

If you're running MCP in production and you don't have a Gate, that's the gap. The rest is optimization.

Repo: github.com/Sentinel-Gate/Sentinelgate — star it if this framework is useful, it helps me keep writing.

DEV Community