"It's not a bug, it's spec": a zero-click RCE in AI coding agents that three vendors won''t patch

#security #mcp #ai #devsecops

TL;DR — A prompt injection can rewrite your AI IDE's mcp.json the moment you open a project, with no dialog and no click, and get arbitrary code execution. It's one of 12+ CVEs in the same class. The root cause lives in the official MCP SDK — and Anthropic, Google, and Microsoft have declined to issue CVEs for their own tools, on the grounds that rewriting the file "requires explicit user permission." In practice, that permission is usually "you installed the IDE."

In a previous post I argued that the real attack surface for AI coding agents isn't "the model goes rogue" — it's the config file. At the time, the worst case (TrustFall) still needed a human: clone a malicious repo, open it, press Enter on a trust dialog.

CVE-2026-30615 removes the Enter.

The zero-click chain

Disclosed by OX Security on 2026-04-15. On Windsurf IDE 1.9544.26, the chain is:

The attacker prepares HTML content the IDE will render — a malicious web page, a poisoned repo README, a tampered tool description.
An injected instruction silently overwrites the local mcp.json and registers an attacker-controlled STDIO server.
The MCP SDK re-reads the config and launches the registered binary.
Arbitrary command execution. CVSS 8 / High.

No approval dialog. No confirmation step. Among the IDEs OX tested, Windsurf was the only true zero-click — Cursor, Claude Code, and Gemini CLI each required at least one user action.

Codeium (Windsurf) shipped a patch. That's the part everyone agrees on. The argument starts with everyone else.

This is a class, not a bug

The same disclosure groups 12+ CVEs under one pattern — RCE via MCP STDIO:

LangFlow (CVE unassigned)
GPT Researcher (CVE-2025-65720)
LiteLLM (CVE-2026-30623)
Agent Zero (CVE-2026-30624)
Windsurf (CVE-2026-30615)
DocsGPT (CVE-2026-26015)
Flowise, Upsonic, Bisheng, Jaaz, and more

The shared root cause: the official MCP SDK passes user-controllable config values into StdioServerParameters without sanitization, and that flows straight into spawning a subprocess. OX filed this root cause under a category I haven't seen on a vuln report before — "Won't Be Patched" — because Anthropic's position is that this is spec-conformant behavior, not a defect to fix at the protocol level.

There's a known operational mitigation: allowlist the STDIO command value to known launchers, e.g. {npx, uvx, python, python3, node, docker, deno}. That closes the "point at any binary you like" path. But it's something each downstream implementation has to add itself. It is not the SDK's default.

The review surface keeps shrinking

Line up three incidents and the trend is hard to miss:

Act 1 — TrustFall: the config file is malicious from the start. Clone, open, press Enter on the trust dialog. At least the dialog appears.
Act 2 — AWS Kiro: an indirect prompt injection writes trustedCommands: ["*"]. The config changes after you've reviewed it, so you miss the moment.
Act 3 — Windsurf zero-click: opening HTML silently rewrites mcp.json. No dialog at all. The fact that a rewrite happened isn't even surfaced in the IDE.

Each act shaves off more of the surface where a human could notice something is wrong. By Act 3, the event itself is invisible.

So whose problem is it?

Here's where I'd genuinely like the comments.

Google, Microsoft, and Anthropic have declined to issue CVEs for their own tools in this class. The stated reason is reasonable on its face: modifying these files requires the user's explicit permission.

But walk through what that permission actually is. The injection rewrites a file inside a workspace the agent already has write access to — access you granted, in bulk, when you opened the project. There is no per-write prompt. So "explicit user permission" collapses into "you ran the IDE." If the threshold for not a vulnerability is "the user once consented to use the software," almost nothing involving a config file is ever a vulnerability.

I'm not claiming the vendors are acting in bad faith. A protocol-level fix is genuinely hard, and "spec-conformant" is technically true. But "technically spec" and "not the user's problem" are different claims, and the second one is the one that ends up on the user. When the people who own the protocol decline to treat it as a defect, the risk doesn't disappear — it just moves downstream to whoever's running the agent.

Which is the actual question: if the vendor won't fix it and won't even name it, where does the responsibility land?

If nobody patches, watch the config layer

My answer, for what it's worth, is that this has to be observed somewhere other than the IDE.

EDR sees the npx or python that got spawned. It does not see "a new STDIO server was added to mcp.json." By the time the subprocess starts, the config change is already seconds in the past. The interesting signal — the permission state changed while you weren't looking — happens one layer up from where most tooling is watching.

That's the layer I've been poking at. I've been building a small open-source thing (Sigil) that watches agent config files like mcp.json and .claude/settings.json, scores the risk, and emits an event to your SIEM — it doesn't block, it just tells you when the permission state changed while your hands were off the keyboard. Across a fleet of machines that shows up as triage-able alerts — the silent change, made visible:

And because it exposes that same posture as plain MCP, you can also just ask. Here's Codex doing exactly that — pulling the riskiest host in a fleet and the reasons behind it:

Notice one of the flagged reasons: an untrusted remote MCP server. That's the same class of mcpServers entry CVE-2026-30615 plants — except here it's surfaced as posture, where a human (or another agent) can actually see it after the fact. (The CVE itself is STDIO-command-based; Sigil's STDIO-command scoring is tracked in #53. The attack surface — the mcpServers key — is the same.)

That's deliberately the last thing in this post, not the point of it.

The point

Act 1 was "plant a malicious config file." Act 3 is "rewrite the config file the instant it's opened, silently." The time and surface a user has to review anything got measurably smaller in between — and the vendors who own the protocol have decided that's spec, not bug.

The attack surface is the config file. So the thing you watch should be the config file too — its state, and the moment that state changes.

How does your team handle config from untrusted repos today — sandbox the whole workspace, pin the agent's permissions, or just trust the trust dialog? I'd actually like to know.