Michael "Mike" K. Saleme

CVE-2026-40933: The allowlist was the vulnerability

On April 15, 2026, FlowiseAI published GHSA-c9gw-hvqq-f33r for CVE-2026-40933 — a CVSS 10.0 remote code execution in the Custom MCP node of Flowise ≤ 3.0.13, patched in 3.1.0. The vector is Model Context Protocol stdio transport: an authenticated user registers a local MCP server by supplying a command and args[], and Flowise spawns it.

Flowise is not a reckless project. The vulnerable path in packages/components/nodes/tools/MCP/core.ts ships three guards:

  1. validateMCPServerConfig — command must be in {node, npx, python, python3, docker}.
  2. validateCommandInjection — args must contain no shell metacharacters.
  3. validateArgsForLocalFileAccess — args must not look like paths.

Each guard does exactly what it says. None prevent the exploit. Here's the payload:

{"command": "npx", "args": ["-c", "touch /tmp/pwn"]}

npx -c invokes a shell. python -c invokes Python. node -e invokes JavaScript eval. docker run --entrypoint is arbitrary program execution. Every binary in the allowlist is itself an interpreter whose argv is a program.

The allowlist is the vulnerability. You cannot defend a spawn() call by restricting what you spawn, if what you spawn can read programs from its arguments.
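
To make the failure concrete, here is a minimal Python sketch of the three checks as described above. The function names mirror the guard names, but the bodies are simplified stand-ins written for illustration. The metacharacter set and the path heuristic are assumptions, not Flowise's actual TypeScript.

```python
import re

ALLOWED_COMMANDS = {"node", "npx", "python", "python3", "docker"}
# Assumed metacharacter set; the real guard's list may differ.
SHELL_METACHARS = re.compile(r"[;&|`$<>(){}\\\n]")


def validate_mcp_server_config(command: str) -> bool:
    """Guard 1: the command name must be on the allowlist."""
    return command in ALLOWED_COMMANDS


def validate_command_injection(args: list[str]) -> bool:
    """Guard 2: no argument may contain a shell metacharacter."""
    return not any(SHELL_METACHARS.search(a) for a in args)


def validate_args_for_local_file_access(args: list[str]) -> bool:
    """Guard 3: no argument may look like a path (assumed heuristic)."""
    path_prefixes = ("/", "./", "../", "~")
    return not any(a.startswith(path_prefixes) or ".." in a for a in args)


def passes_all_guards(config: dict) -> bool:
    """Run all three guards the way a pre-spawn check would."""
    args = config.get("args", [])
    return (
        validate_mcp_server_config(config.get("command", ""))
        and validate_command_injection(args)
        and validate_args_for_local_file_access(args)
    )


payload = {"command": "npx", "args": ["-c", "touch /tmp/pwn"]}
print(passes_all_guards(payload))  # True: every guard is satisfied
```

Under these assumptions the payload passes cleanly: `npx` is on the list, neither argument contains a metacharacter, and neither argument starts like a path. The program being executed lives inside an argument string that none of the three checks is designed to interpret.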

This is not one CVE

OX Security's writeup frames the class: products accept attacker-influenced arguments for locally-spawned MCP servers and attempt to contain blast radius with surface-level filters on command name or shell metacharacters. Expect more CVEs in this class. The MCP protocol makes it easy to register stdio-transport servers, and "register a local command" is the canonical onboarding flow. Every product that lets an authenticated user supply command + args is shipping a program loader.

What catches this

The agent-security-harness tests at the MCP protocol boundary. The specific test that maps to this class is MCP-017 (test_mcp_stdio_pre_handshake_exec):

# protocol_tests/mcp_harness.py:1509
# Excerpt; relies on module-level imports: asyncio, pathlib.Path, uuid.uuid4
async def test_mcp_stdio_pre_handshake_exec(self):
    """Server that pipes deserialized stdio fields into execution
    before handshake validation must fail closed."""
    canary = f"/tmp/mcp-stdio-canary-{uuid4().hex}"
    payload = self.build_initialize(
        client_info_name=f"X`touch {canary}`X",
    )
    await self.transport.send(payload)
    await asyncio.sleep(0.5)
    assert not Path(canary).exists(), \
        "stdio field reached execution path"

The test injects a shell-injection canary into the clientInfo.name field of the initialize message — the first JSON-RPC call over a stdio MCP transport — and asserts no canary file is created.
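
For readers who have not looked at the wire format, this is roughly what such an initialize frame could look like. The `build_initialize` below is a hypothetical stand-in, not the harness's implementation, and the `protocolVersion` and `version` values are assumptions chosen for illustration.

```python
import json
from uuid import uuid4


def build_initialize(client_info_name: str, request_id: int = 1) -> bytes:
    """Serialize an MCP initialize request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; use what the server speaks
            "capabilities": {},
            "clientInfo": {"name": client_info_name, "version": "0.0.0"},
        },
    }
    # stdio transport frames messages as single lines, so end with one newline.
    return (json.dumps(message) + "\n").encode("utf-8")


canary = f"/tmp/mcp-stdio-canary-{uuid4().hex}"
frame = build_initialize(client_info_name=f"X`touch {canary}`X")
decoded = json.loads(frame)
print(decoded["params"]["clientInfo"]["name"])
```

The backtick substitution is inert as JSON. It only becomes dangerous if the server interpolates the name into a shell command, which is the path the canary is meant to detect.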

Adjacent tests:

  • MCP-010 (test_mcp_tool_argument_injection) — fires prototype-pollution, template-expression, and command-substitution payloads. Covers the class underlying CVE-2026-25536.
  • MCP-008 (test_mcp_malformed_jsonrpc) — seven type-confused payloads.
  • MCP-001 (test_mcp_tool_list_injection) — inspects tools/list for dangerous names.

A Flowise ≤ 3.0.13 build run behind MCP-017 would surface the canary before release.

What's missing

This section is meant to be honest rather than promotional.

Harness gaps:

  1. No byte-level fuzzing of stdio framing. MCP-008 tests seven hand-written payloads — property-based fuzzing would catch edge cases no human wrote.
  2. No pickle/YAML coverage. Tests are JSON-RPC only. A vendor that swaps in pickle.loads over stdio would not trip anything.
  3. Test plane is client-to-server. Sub-agent-to-orchestrator stdio — the CVE-2026-39884 direction — is not covered.
  4. No stdin EOF / half-close / interleaved-notification race testing.
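
As a rough idea of what closing the first gap could involve, here is a stdlib-only mutation sketch. It is not part of the harness, the function names are hypothetical, and a property-based library such as Hypothesis would likely be the more thorough route. This only shows the byte-level idea.

```python
import json
import random


def mutate_frame(frame: bytes, rng: random.Random, max_edits: int = 4) -> bytes:
    """Apply a few random byte-level edits (flip, insert, delete, truncate)."""
    data = bytearray(frame)
    for _ in range(rng.randint(1, max_edits)):
        op = rng.choice(("flip", "insert", "delete", "truncate"))
        if not data:
            # Nothing left to edit; grow the frame by one random byte instead.
            data.append(rng.randrange(256))
            continue
        pos = rng.randrange(len(data))
        if op == "flip":
            data[pos] ^= 1 << rng.randrange(8)
        elif op == "insert":
            data.insert(pos, rng.randrange(256))
        elif op == "delete":
            del data[pos]
        else:  # truncate: cut the frame short, possibly before the newline
            del data[pos:]
    return bytes(data)


def fuzz_cases(seed_frame: bytes, count: int, seed: int = 0):
    """Yield `count` reproducible mutated frames derived from one valid frame."""
    rng = random.Random(seed)
    for _ in range(count):
        yield mutate_frame(seed_frame, rng)


seed_frame = (json.dumps({"jsonrpc": "2.0", "id": 1, "method": "ping"}) + "\n").encode()
cases = list(fuzz_cases(seed_frame, count=100))
```

Each case would be written to the server's stdin, with the assertion that the process either returns a JSON-RPC error or keeps running without side effects. The fixed seed keeps any failure reproducible.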

Governance gaps:

The constitutional-agent repo has no first-class hard constraint for deserialization safety or tool-trust boundaries. It catches blast radius downstream — HC-5 (no irreversible action without confirmation), HC-10 (no silent exception handlers in safety code), RiskGate (critical security events force FAIL) — but there is no HC-13 that would read something like:

No deserialization of untrusted tool or sub-agent input without schema validation and fail-closed error handling.

Missing this constraint means the governance layer catches the consequence (an RCE triggers a safety event, the agent freezes) but not the cause (the deserializer shouldn't have run at all). Roadmap item, not a win.
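
A check that enforces such a constraint might look something like the sketch below. The function and exception names are hypothetical and nothing here exists in the constitutional-agent repo. It assumes JSON-RPC 2.0 messages and an arbitrary size cap.

```python
import json


class UntrustedInputRejected(Exception):
    """Raised for any input that fails parsing or schema validation."""


def load_tool_message(raw: bytes, max_bytes: int = 65536) -> dict:
    """Parse untrusted tool or sub-agent input as JSON-RPC 2.0, failing closed."""
    if len(raw) > max_bytes:
        raise UntrustedInputRejected("message too large")
    try:
        # JSON only: never hand untrusted bytes to pickle or a YAML loader.
        message = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, ValueError, RecursionError) as exc:
        raise UntrustedInputRejected("not valid UTF-8 JSON") from exc
    if not isinstance(message, dict) or message.get("jsonrpc") != "2.0":
        raise UntrustedInputRejected("not a JSON-RPC 2.0 object")
    if "method" in message:
        if not isinstance(message["method"], str):
            raise UntrustedInputRejected("method must be a string")
        if not isinstance(message.get("params", {}), (dict, list)):
            raise UntrustedInputRejected("params must be an object or array")
    elif ("result" in message) == ("error" in message):
        raise UntrustedInputRejected("response needs exactly one of result/error")
    return message
```

The design choice is that every failure mode converges on one exception type, so a caller cannot accidentally handle a parse error and continue with partially trusted data.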

One question

For anyone running MCP stdio servers today: is your allowlist a list of binaries, or a list of (binary, arg-pattern) tuples? In every stack I've asked so far, the answer is the first. CVE-2026-40933 is what the first looks like when it fails.
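
For contrast, here is one possible shape of the second answer, sketched in Python. The entries are hypothetical (`@example/mcp-server` and `example_mcp_server` are made-up names), and the per-position regex approach is an assumption about how such a list could be expressed, not a description of any existing product.

```python
import re

# Each entry pairs a binary with one regex per argv position (hypothetical entries).
ALLOWED_INVOCATIONS = [
    ("npx", [r"-y", r"@example/mcp-server(@\d+\.\d+\.\d+)?"]),
    ("python3", [r"-m", r"example_mcp_server", r"--port", r"\d{1,5}"]),
]


def is_allowed(command: str, args: list[str]) -> bool:
    """Allow a spawn only if the whole argv matches a known template exactly."""
    for binary, patterns in ALLOWED_INVOCATIONS:
        if command != binary or len(args) != len(patterns):
            continue
        if all(re.fullmatch(p, a) for p, a in zip(patterns, args)):
            return True
    return False


print(is_allowed("npx", ["-c", "touch /tmp/pwn"]))       # False
print(is_allowed("npx", ["-y", "@example/mcp-server"]))  # True
```

Matching the full argv against a template means flags like `-c` or `-e` are rejected by default rather than by enumeration. It narrows what can be spawned, but it still trusts whatever code the pinned server itself contains.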

Top comments (2)

ArkForge

"Every binary in the allowlist is itself an interpreter whose argv is a program." That's the sentence that should be on a poster in every MCP integration team's office.

The deeper pattern here: MCP stdio transport inherits the host's execution context by design. Allowlisting the command name is checking the label on a box while ignoring what's inside. npx -c, python -c, node -e all collapse the spawn boundary into arbitrary eval.

The only viable defense for local MCP servers is capability-scoped sandboxing (seccomp/landlock on Linux, sandbox-exec on macOS), not input filtering. Filter what the process can do, not what you call it.

Michael "Mike" K. Saleme

"Filter what it can do, not what you call it." That's a keeper.

Practical wrinkle: capability-scoped sandboxing is the right prevention, but requires a per-server profile that most teams don't ship. Ten third-party MCP servers = ten sandbox profiles, kept current as servers evolve.

That leaves a real coverage gap during rollout. A pre-deployment test that injects an initialize-time shell canary (MCP-017 pattern) catches the exploit path on servers that don't yet have a profile.

Prevention is the sandbox; detection is the canary. Do you ship per-server profiles in your stack, or rely on the host OS?