Anthropic's MCP Changelog Reads Like a Bug Bounty in Slow Motion


Book: AI Agents Pocket Guide
My project: Hermes IDE | GitHub — an IDE for developers who ship with Claude Code and other AI coding tools
Me: xgabriel.com | GitHub

On April 16, 2026, The Register ran the headline "MCP 'design flaw' puts 200k servers at risk". The story underneath was not about a single CVE; it read like a confession from the changelog. OX Security's research, as reported by The Register, counted ten high-or-critical CVEs across MCP-using projects, 7,000 publicly reachable servers, and around 200,000 vulnerable instances, all downstream of one architectural choice Anthropic still calls expected behavior.

If you have an MCP server in production, you are running infrastructure whose protocol committee has been quietly patching for a year. The fixes ship under release-note bullets that read like janitorial work. Stack them up and you get a different picture: a security narrative whose disclosure cycle has only just hit the news section.

What the changelog actually says

Pull up the November 25, 2025 spec changelog and read the bullets in order, not by category.

"Updated the Security Best Practices guidance."
"Clarify that input validation errors should be returned as Tool Execution Errors rather than Protocol Errors to enable model self-correction (SEP-1303)."
"Align OAuth 2.0 Protected Resource Metadata discovery with RFC 9728, making WWW-Authenticate header optional with fallback to .well-known endpoint (SEP-985)."
"Support polling SSE streams by allowing servers to disconnect at will (SEP-1699)."

In English: the prior spec let an injected validation failure look like a protocol-level abort to the model, which encouraged the model to retry under attacker-shaped premises. The OAuth change is a quiet acknowledgement that the original discovery flow leaked information through header timing. The SSE polling change closes a session-pinning vector. None of this is announced as a fix. All of it is a fix.
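The SEP-1303 distinction is easier to see as two payloads. A minimal sketch, assuming the JSON-RPC 2.0 envelope MCP uses and the `isError` result field from the published tools spec; the helper names and the `limit` argument are hypothetical, made up for illustration.

```python
def protocol_error(request_id, message):
    """JSON-RPC error object: the request itself is rejected."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": -32602, "message": message},  # -32602 = Invalid params
    }


def tool_execution_error(request_id, message):
    """Ordinary result flagged with isError, so the model can read it and retry."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "content": [{"type": "text", "text": message}],
            "isError": True,
        },
    }


def call_list_items(request_id, arguments):
    # Hypothetical tool that requires a positive integer "limit".
    limit = arguments.get("limit")
    valid = isinstance(limit, int) and not isinstance(limit, bool) and limit > 0
    if not valid:
        # Per SEP-1303, a bad argument is a tool-level failure, not a protocol abort.
        return tool_execution_error(
            request_id, f"limit must be a positive integer, got {limit!r}"
        )
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "content": [{"type": "text", "text": f"listing {limit} items"}],
            "isError": False,
        },
    }
```

With this shape, `call_list_items(7, {"limit": -5})` comes back as a `result` the model can inspect, and `protocol_error` stays reserved for requests that are malformed at the envelope level.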

The 2026 MCP Roadmap reads the same way. "Deeper security and authorization work" is the headline. The text underneath promises stronger isolation primitives, finer-grained scopes, and capability negotiation. Translation: the things people built MCP servers around for the last twelve months were being held together by convention, and the next spec is going to put a fence around them.

The flaw that did get a name

In January 2026, The Hacker News reported three flaws in Anthropic's MCP Git server that allowed file access and code execution. In February, marmelab published an MCP-security walkthrough that catalogued tool poisoning, prompt injection, and over-broad scopes as recurring patterns rather than implementation bugs. In April, OX Security dropped the headline finding: per the research summarized by The Register, the official MCP SDKs treat STDIO transport as a configuration-to-shell pipe. Send a malformed command, and the SDK runs it, hands you back an error, and you walk away with arbitrary OS execution on the host. According to Tom's Hardware and Computing, Anthropic's position is that the behavior is expected and that input sanitization is the developer's responsibility.
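If sanitization really is the developer's job, the cheapest place to do it is before the config ever reaches a process spawn. A rough sketch, assuming the common `{"command": ..., "args": [...]}` shape for a STDIO server entry; the allowlist contents, the blocked flags, and the function name are my assumptions, not anything the SDKs ship.

```python
import os

# Hypothetical allowlist: the only executables this host may launch as STDIO servers.
ALLOWED_COMMANDS = {"node", "python3"}
# Interpreter flags that turn an argument into inline code.
BLOCKED_FLAGS = {"-c", "-e", "--eval", "-p", "--print"}
SHELL_META = set(";&|`$<>\n")


def vet_stdio_config(entry: dict) -> list[str]:
    """Return an argv list for subprocess (no shell=True), or raise ValueError."""
    command = entry.get("command")
    args = entry.get("args", [])
    if not isinstance(command, str) or not command:
        raise ValueError("command must be a non-empty string")
    if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
        raise ValueError("args must be a list of strings")
    # Bare executable names only: no paths, and nothing outside the allowlist.
    if os.path.basename(command) != command or command not in ALLOWED_COMMANDS:
        raise ValueError(f"command not on allowlist: {command!r}")
    for arg in args:
        if arg in BLOCKED_FLAGS:
            raise ValueError(f"inline-code flag not allowed: {arg!r}")
        if SHELL_META & set(arg):
            raise ValueError(f"shell metacharacter in argument: {arg!r}")
    return [command, *args]
```

This narrows the surface rather than closing it: an allowlisted interpreter can still run a hostile script file, which is why the sandbox further down matters more than the filter.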

Stack the timeline:

January 2026: three CVE-grade bugs in Anthropic's reference Git server.
February 2026: marmelab catalogues prompt-injection-via-tool-description as endemic.
March 2026: LiteLLM (CVE-2026-30623) and Bisheng (CVE-2026-33224) patched.
April 2026: OX discloses the STDIO design flaw across the official SDKs. Windsurf (CVE-2026-30615) is reported to allow local code execution through an MCP-config edit. Per Tom's Hardware and Computing, Anthropic frames the STDIO behavior as expected, not as something to patch at the protocol level.
April 2026: Claude Code, Gemini CLI, and GitHub Copilot Agent are reported as vulnerable to comment-driven prompt injection that can be used to steal credentials. Per the cited reporting, the disclosed vendors responded with documentation updates rather than a structural fix.

That is not a changelog. That is a disclosure rhythm.

What the next three cycles probably look like

Predictions are cheap, but the surface area constrains the shape of the next disclosures.

Cycle one: marketplace poisoning. Researchers already poisoned nine of eleven MCP marketplaces in a controlled test. The next big disclosure looks like a real malicious package with thousands of installs, a sleeper command in the tool-description, and a coordinated push to remove it. Expect a "supply-chain alert" headline.

Cycle two: cross-tool exfiltration via shared context. When an agent runs two MCP servers in the same session, one can describe a tool that influences how the other's output is interpreted. The model has no isolation primitive between tools. The Slack-to-Notion-to-S3 chain that looks innocent will leak data through tool-description prompt injection. There is already a research line on indirect prompt injection in chatbot plugins, accepted to IEEE S&P 2026.

Cycle three: identity and scope leakage. OAuth scopes in MCP are inherited from whatever you wired into the connector. A server that asks for read:any because that's what the demo did will get installed across thousands of agents. The disclosure will be a tool that, by quietly being allowed read access to anything, walks out the door with an enterprise's calendar, then chat history, then files.
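One way to catch this before install is a scope audit at connector-registration time. A small sketch under loud assumptions: the server names, the scope strings, and the policy table are all hypothetical, and real scope grammars vary by provider.

```python
# Hypothetical policy: the scopes each connector is allowed to request.
ALLOWED_SCOPES = {
    "calendar-server": {"calendar:read"},
    "notes-server": {"notes:read", "notes:write"},
}
# Hypothetical markers for wildcard-style scopes such as "read:any".
BROAD_MARKERS = {"any", "all", "*"}


def audit_scopes(server: str, requested: list[str]) -> list[str]:
    """Return human-readable findings; an empty list means the request fits policy."""
    findings = []
    allowed = ALLOWED_SCOPES.get(server, set())
    for scope in requested:
        parts = scope.split(":")
        if any(part in BROAD_MARKERS for part in parts):
            findings.append(f"{server}: over-broad scope {scope!r}")
        elif scope not in allowed:
            findings.append(f"{server}: scope {scope!r} not in policy")
    return findings
```

An unknown server gets an empty policy, so everything it requests is flagged. Failing closed is the design choice here.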

None of this is paranoid. All three patterns are visible in the existing CVEs.

A defensive checklist if you already run MCP in production

If you cannot turn off the MCP servers you have, you can at least put a perimeter around them.

Sandbox the MCP server. Containers with no host network. A read-only filesystem outside an explicit data dir. Drop CAP_SYS_ADMIN, CAP_NET_RAW, everything you don't need. Run as a non-root user. The STDIO design flaw is a local-RCE primitive; the sandbox is what turns it into a contained one.
Treat tool descriptions as untrusted input. Render them in a separate context that the model sees once at session start, then freeze them. If a server can rewrite its own tool descriptions mid-session, you are one prompt injection away from a different tool surface than the user thinks.
Pin server versions and verify integrity. SHA-pin every MCP server you install. Run npm audit / pip-audit on the runtime. Subscribe to the Vulnerable MCP Project feed — it tracks CVEs, advisories, and the lag between disclosure and patch.
Egress allowlist. Outbound network from the MCP container goes through a proxy that allows only the domains that server actually needs. Most exfiltration paths fail here.
Log every tool invocation with arguments and results. Not for compliance — for the postmortem. When the disclosure lands, the only evidence you will have is the audit trail.
Disable any MCP server you don't use this week. The ones sitting idle are the ones that get owned, because nobody is watching the logs.
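Three of these items are small enough to sketch in code. Treat the following as a starting point under stated assumptions, not a hardened implementation: the image name, data directory, and UID are placeholders, the tool list is assumed to carry `name`, `description`, and `inputSchema` fields, and the function names are mine.

```python
import hashlib
import json
from urllib.parse import urlsplit


def sandbox_argv(image: str, data_dir: str) -> list[str]:
    """docker run arguments for the sandbox item; image and data_dir are placeholders."""
    return [
        "docker", "run", "--rm", "-i",
        "--network", "none",          # swap for a proxy-only network if egress is needed
        "--read-only",                # root filesystem is read-only
        "--cap-drop", "ALL",          # drop every capability, add back only what is needed
        "--security-opt", "no-new-privileges",
        "--user", "1000:1000",        # non-root; the UID is an assumption
        "-v", f"{data_dir}:/data",    # the one explicit writable data dir
        image,
    ]


def fingerprint_tools(tools: list[dict]) -> str:
    """Stable SHA-256 over tool names, descriptions, and input schemas."""
    canonical = [
        {
            "name": t.get("name"),
            "description": t.get("description"),
            "inputSchema": t.get("inputSchema"),
        }
        for t in sorted(tools, key=lambda t: str(t.get("name")))
    ]
    blob = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def egress_allowed(url: str, allowed_hosts: set[str]) -> bool:
    """Exact-match host allowlist; suffix tricks like api.example.com.evil.net fail."""
    host = (urlsplit(url).hostname or "").lower()
    return host in allowed_hosts
```

Record `fingerprint_tools` at session start and compare it whenever the server re-lists its tools; a changed hash mid-session is the rewrite the checklist warns about. The egress check belongs in the proxy, not in the server you are trying to contain.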

A 25-line monitor for anomalous MCP behavior

This is a lightweight watcher you wrap around the MCP host process. It reads the tool-call stream from stdin, one JSON object per line with a name and an arguments field, flags shell metacharacters in any string-valued argument (it does not consult the schema), and shouts when a tool is called more than 30 times within the last 60 seconds. Drop it in front of any MCP server before you debate whether to keep it running.

import json, re, sys, time
from collections import defaultdict, deque

WINDOW = 60
SHELL = re.compile(r"[;&|`$><]|\$\(|\)\s*\|")
RATES: dict[str, deque] = defaultdict(lambda: deque(maxlen=200))

def alert(reason, tool, args):
    print(json.dumps({
        "ts": time.time(), "level": "alert",
        "reason": reason, "tool": tool, "args": args,
    }), file=sys.stderr, flush=True)

def check(call: dict):
    tool = call.get("name", "?")
    args = call.get("arguments", {}) or {}
    for k, v in args.items():
        if isinstance(v, str) and SHELL.search(v):
            alert("shell-metachar", tool, {k: v[:80]})
    now = time.time()
    q = RATES[tool]
    q.append(now)
    recent = sum(1 for t in q if now - t <= WINDOW)
    if recent > 30:
        alert("rate-spike", tool, {"per_min": recent})

for line in sys.stdin:
    try: check(json.loads(line))
    except Exception: continue

Call it a smoke detector, not a WAF. The pattern that got Windsurf's CVE-2026-30615 going was a config-edit tool that accepted an unsanitized command field; this catches the metacharacters in that field on the way past. The rate spike fires when an attacker has flipped a tool into a loop — the case the Pomerium roundup was already cataloguing nearly a year ago.

The boring part

MCP isn't broken — it's a protocol that grew faster than its threat model, and the changelog is the place that growth shows up first. Read it the way you read a CVE feed for any other dependency. The bullets that say "guidance updated" are the bullets that mean someone found it, fixed it, and hopes you noticed. For the next two release cycles, the disclosures will outrun the patches, and the reference implementations will keep tightening behind them.

If you only do one thing this week, sandbox the MCP servers and turn on the egress allowlist. The next disclosure is already in someone's drafts folder.

If this was useful

Chapter 5 of the AI Agents Pocket Guide walks through tool-permissioning patterns for autonomous systems — capability scoping, sandbox boundaries, and the failure modes that show up the moment your agent gets a third tool. If you are running MCP servers in production and trying to figure out which knobs actually matter, that's the chapter.