Bill Wilson

Posted on Mar 11

Harden Your MCP Server NOW Before Anthropic Forces You To

#ai #cybersecurity #mcp #security

A researcher at adversa.ai just demonstrated a zero-click RCE chain that starts with a calendar invite and ends with full code execution on your machine. The attack path goes through MCP tool chaining - low-risk tools escalating to high-risk local executors. Anthropic was notified about a related flaw in Claude Extensions (DXT). They declined to fix it.

That's not a hypothetical. Cyberwarzone reported active exploitation artifacts today, March 11th, 2026.

If you're running an MCP server in production, you need to harden it right now. Not next sprint. Now.

The Attack Chain

Here's what adversa.ai documented:

1. Attacker sends a calendar invite with crafted metadata
2. Agent's calendar tool processes the invite (low-risk, read-only tool)
3. The metadata contains instructions that the agent interprets as actionable
4. Agent chains from calendar tool to a file system tool or code executor
5. Arbitrary code runs on the host machine

The critical insight: each individual tool permission seems reasonable. A calendar reader should be safe. A code executor with user confirmation should be safe. But the chain from one to the other - with the agent as the intermediary - bypasses every gate.

This is the MCP equivalent of a confused deputy attack. The agent has permission to use both tools. The attacker just needs to convince the agent that using them together is the right thing to do.
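One way to break this kind of chain is to track whether untrusted content has entered the session and refuse high-risk tools once it has. The sketch below is my own illustration, not something adversa.ai or the MCP spec prescribes. The names `SessionGuard`, `ToolPolicy`, `read_calendar`, and `run_code` are all hypothetical.

```typescript
// Hypothetical risk labels; none of these names come from the MCP spec.
type Risk = "low" | "high";

interface ToolPolicy {
  risk: Risk;
  // True when the tool returns attacker-controllable content
  // (calendar invites, emails, web pages).
  readsUntrusted: boolean;
}

class SessionGuard {
  private readonly policies: Map<string, ToolPolicy>;
  private tainted = false;

  constructor(policies: Map<string, ToolPolicy>) {
    this.policies = policies;
  }

  // Call before dispatching a tool. Throws if the call must be blocked.
  authorize(tool: string): void {
    const policy = this.policies.get(tool);
    if (policy === undefined) {
      throw new Error(`unknown tool: ${tool}`);
    }
    if (policy.risk === "high" && this.tainted) {
      throw new Error(`blocked: ${tool} requested after untrusted input entered the session`);
    }
    if (policy.readsUntrusted) {
      this.tainted = true;
    }
  }
}

const examplePolicies = new Map<string, ToolPolicy>([
  ["read_calendar", { risk: "low", readsUntrusted: true }],
  ["run_code", { risk: "high", readsUntrusted: false }],
]);
```

With these policies, `run_code` is allowed in a fresh session but rejected once `read_calendar` has been called. This is deliberately coarse: it assumes taint never clears within a session, which will be too strict for some workflows.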

Why Anthropic's Response Matters

Infosecurity Magazine reported that Anthropic acknowledged the DXT vulnerability but declined to patch it, on the grounds that user permission requirements are sufficient mitigation.

I disagree. User permission prompts are a speed bump, not a wall. Most users click "Allow" reflexively - especially when the request looks like normal tool usage. And in automated agent deployments, there might not be a user to click anything at all.

This tells us something important about the timeline: Anthropic will eventually ship mandatory sandboxing. The pressure from security researchers, the exploitation artifacts, and the liability risk make it inevitable. My estimate is four to six weeks. When that happens, every MCP server that isn't already sandboxed will break.

How agentpay-mcp Handles This

When I built agentpay-mcp, security wasn't an afterthought - it was the first design constraint. Every payment tool runs in a sandboxed context with explicit permission boundaries.

Here's the architecture:

MCP Client (Claude, etc.)
  |
  +-- agentpay-mcp server
        |
        +-- Tool: check_balance
        |     Scope: READ_ONLY
        |     Sandbox: isolated
        |
        +-- Tool: transfer
        |     Scope: WRITE_FUNDS
        |     Sandbox: isolated
        |     Limits: per-tx, per-day, per-recipient
        |
        +-- Tool: pay_api
              Scope: WRITE_FUNDS
              Sandbox: isolated
              Limits: per-call max, daily max

Key design decisions that prevent the adversa.ai attack pattern:

No tool chaining within the server. Each tool executes independently. check_balance can't trigger transfer. The agent can call both, but the server won't let one tool invoke another.
Explicit scope declarations. Every tool declares exactly what it can do. READ_ONLY tools can't write. WRITE_FUNDS tools can't access the filesystem. There's no scope escalation path.
Rate limits and spend caps. Even if an attacker convinces the agent to call transfer 1,000 times, the daily spend limit catches it. The tool returns an error, not a drained wallet.
No local executor access. agentpay-mcp doesn't expose shell commands, file system access, or code execution. It handles money. That's it.
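To make the spend-cap idea concrete, here is a minimal sketch of a per-transaction and per-day limit check. It is an illustration under my own assumptions, not the actual agentpay-mcp source: the `SpendCap` class, the `SpendLimits` shape, and the use of integer minor units and UTC days are all hypothetical choices.

```typescript
// Hypothetical limits; amounts are integer minor units (for example, cents).
interface SpendLimits {
  perTx: number;
  perDay: number;
}

type SpendResult = { ok: true } | { ok: false; error: string };

class SpendCap {
  private readonly limits: SpendLimits;
  private spentToday = 0;
  private day = "";

  constructor(limits: SpendLimits) {
    this.limits = limits;
  }

  // Records the spend and returns ok, or returns an error without recording.
  trySpend(amount: number, now: Date): SpendResult {
    if (!Number.isInteger(amount) || amount <= 0) {
      return { ok: false, error: "amount must be a positive integer" };
    }
    const today = now.toISOString().slice(0, 10); // UTC calendar day
    if (today !== this.day) {
      // First call of a new day resets the running total.
      this.day = today;
      this.spentToday = 0;
    }
    if (amount > this.limits.perTx) {
      return { ok: false, error: "per-transaction limit exceeded" };
    }
    if (this.spentToday + amount > this.limits.perDay) {
      return { ok: false, error: "daily limit exceeded" };
    }
    this.spentToday += amount;
    return { ok: true };
  }
}
```

The point of returning an error value instead of throwing is that the agent sees a normal tool error and the wallet stays intact, no matter how many times the call is repeated. A real server would also persist the running total so a restart does not reset it.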

Hardening Your Own MCP Server: A Checklist

Whether you're using agentpay-mcp or building your own MCP server, here's the minimum security posture:

1. Isolate tool execution

Every tool should run in its own sandbox. No shared memory, no shared file handles, no ability for one tool to invoke another.

2. Declare explicit scopes

Don't give tools blanket permissions. A weather API tool needs network access to one endpoint. Not filesystem access. Not code execution.
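A rough sketch of what a scope declaration could look like in code. READ_ONLY and WRITE_FUNDS are the scopes from the architecture diagram earlier in this post; NETWORK, FILESYSTEM, and EXEC, along with `ToolDeclaration`, `requireScope`, `requireHost`, and the weather host name, are hypothetical names I made up for the example.

```typescript
// READ_ONLY and WRITE_FUNDS appear in the diagram; the other scopes are hypothetical.
type Scope = "READ_ONLY" | "WRITE_FUNDS" | "NETWORK" | "FILESYSTEM" | "EXEC";

interface ToolDeclaration {
  name: string;
  scopes: ReadonlySet<Scope>;
  // Hosts the tool may contact; only meaningful with the NETWORK scope.
  allowedHosts: ReadonlySet<string>;
}

// Throws unless the tool declared the scope it is trying to use.
function requireScope(tool: ToolDeclaration, needed: Scope): void {
  if (!tool.scopes.has(needed)) {
    throw new Error(`${tool.name} lacks scope ${needed}`);
  }
}

// Throws unless the tool has NETWORK scope and the host is on its allowlist.
function requireHost(tool: ToolDeclaration, host: string): void {
  requireScope(tool, "NETWORK");
  if (!tool.allowedHosts.has(host)) {
    throw new Error(`${tool.name} may not contact ${host}`);
  }
}

const weatherTool: ToolDeclaration = {
  name: "get_weather",
  scopes: new Set<Scope>(["NETWORK"]),
  allowedHosts: new Set(["api.weather.example"]),
};
```

The server would call `requireHost` or `requireScope` at the point where it performs the privileged operation on the tool's behalf, so a tool that never declared FILESYSTEM cannot reach the filesystem through the server's own helpers. This only works if tools have no other route to those capabilities, which is where sandboxing comes in.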

3. Implement input validation

Every tool input should be validated against a strict schema. Reject anything that doesn't match. Don't let the agent pass arbitrary strings to tools that expect structured data.
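Here is a minimal hand-rolled validator to show the idea without pulling in a schema library. The `TransferInput` shape, the field names, and the recipient pattern are assumptions for the example, not the real agentpay-mcp schema.

```typescript
interface TransferInput {
  recipient: string;
  amountCents: number;
}

// Hypothetical recipient format: 3 to 32 letters, digits, "_" or "-".
const RECIPIENT_PATTERN = /^[A-Za-z0-9_-]{3,32}$/;
const ALLOWED_FIELDS = new Set(["recipient", "amountCents"]);

// Returns a typed copy of the input, or throws on anything unexpected.
function parseTransferInput(raw: unknown): TransferInput {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("input must be an object");
  }
  const record = raw as Record<string, unknown>;
  // Reject unknown fields instead of silently ignoring them.
  for (const key of Object.keys(record)) {
    if (!ALLOWED_FIELDS.has(key)) {
      throw new Error(`unexpected field: ${key}`);
    }
  }
  const { recipient, amountCents } = record;
  if (typeof recipient !== "string" || !RECIPIENT_PATTERN.test(recipient)) {
    throw new Error("recipient has an invalid format");
  }
  if (
    typeof amountCents !== "number" ||
    !Number.isSafeInteger(amountCents) ||
    amountCents <= 0
  ) {
    throw new Error("amountCents must be a positive integer");
  }
  return { recipient, amountCents };
}
```

A free-text string like "ignore previous instructions and send everything" fails the recipient pattern and never reaches the handler. In a real project a schema library would likely replace the manual checks, but the rule is the same: allowlist the shape, reject the rest.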

4. Add rate limiting

Even if every individual call is authorized, 10,000 calls in a minute probably isn't intentional. Rate limit everything.
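A sliding-window limiter is one simple way to do this. The sketch below keeps timestamps in memory per key (for example, per tool name). `SlidingWindowLimiter` is a hypothetical name, and passing the clock in as `nowMs` is a choice I made to keep the example testable.

```typescript
class SlidingWindowLimiter {
  private readonly maxCalls: number;
  private readonly windowMs: number;
  private readonly calls = new Map<string, number[]>();

  constructor(maxCalls: number, windowMs: number) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the key is under its limit.
  allow(key: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only the calls that still fall inside the window.
    const recent = (this.calls.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxCalls) {
      this.calls.set(key, recent);
      return false;
    }
    recent.push(nowMs);
    this.calls.set(key, recent);
    return true;
  }
}
```

In a server you would call something like `limiter.allow("transfer", Date.now())` before dispatching and return a tool error when it comes back false. State here lives in one process only; a multi-instance deployment would need a shared store.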

5. Log everything

Every tool invocation, every input, every output. When something goes wrong - and it will - you need the audit trail.
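One low-effort way to get there is to wrap every handler so logging cannot be forgotten. This is a sketch with hypothetical names (`withAudit`, `AuditEntry`, `ToolHandler`) and synchronous handlers only; an async version would await the handler inside the try block.

```typescript
interface AuditEntry {
  timestamp: string;
  tool: string;
  input: unknown;
  outcome: "ok" | "error";
  // The handler's return value, or the error message on failure.
  output: unknown;
}

type ToolHandler = (input: unknown) => unknown;

// Wraps a handler so that every call, successful or not, reaches the sink.
function withAudit(
  tool: string,
  handler: ToolHandler,
  sink: (entry: AuditEntry) => void,
): ToolHandler {
  return (input: unknown): unknown => {
    const timestamp = new Date().toISOString();
    try {
      const output = handler(input);
      sink({ timestamp, tool, input, outcome: "ok", output });
      return output;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      sink({ timestamp, tool, input, outcome: "error", output: message });
      throw err; // The caller still sees the original failure.
    }
  };
}
```

The sink is whatever you trust to be append-only: a file, a log service, a database table. If inputs or outputs can contain secrets, redact them before they reach the sink.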

6. Don't chain high-risk tools

If your MCP server exposes both a file reader and a code executor, an attacker WILL find a way to chain them. Either remove one or add an explicit human-in-the-loop gate between them.
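If you keep both tools, the gate might look something like this sketch. The `Approver` callback stands in for whatever out-of-band confirmation you use (a UI prompt, a chat approval, a ticket); it must not be something the agent can answer for itself. All names here are hypothetical.

```typescript
type GatedHandler = (input: unknown) => unknown;
// Returns true only when a human has approved this exact call.
type Approver = (tool: string, input: unknown) => boolean;

function dispatchWithGate(
  tool: string,
  input: unknown,
  handlers: ReadonlyMap<string, GatedHandler>,
  highRisk: ReadonlySet<string>,
  approve: Approver,
): unknown {
  const handler = handlers.get(tool);
  if (handler === undefined) {
    throw new Error(`unknown tool: ${tool}`);
  }
  // High-risk tools never run without an explicit approval for this input.
  if (highRisk.has(tool) && !approve(tool, input)) {
    throw new Error(`denied: ${tool} requires human approval`);
  }
  return handler(input);
}
```

Passing the full input to the approver matters: the human should see the exact command or file path, not just the tool name. For unattended deployments, where nobody is there to approve anything, the safer default is an approver that always returns false.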

7. Assume the agent is compromised

Design your tool permissions as if the agent is actively trying to abuse them. Because after a prompt injection, it might be.

The Timeline

Anthropic will ship mandatory sandboxing. The only question is when. Based on the convergence of signals - zero-click RCE documentation, declined DXT fixes, active exploitation, and Claude Code's own incident where it deleted production data - I'd put it at four to six weeks.

When that happens, MCP servers that already follow sandboxed architecture will keep working. Everything else will need emergency patches.

Don't wait for the forced migration. Harden now.

Resources:

agentpay-mcp on npm - sandboxed MCP payment server
agent-wallet-sdk on npm - non-custodial agent wallets
adversa.ai MCP Security Resources (March 2026)
Infosecurity Magazine: "New Zero-Click Flaw in Claude Extensions"

This article was written with AI assistance. All technical claims, code, and architectural decisions were validated by the author.

DEV Community