Michael "Mike" K. Saleme

Posted on Apr 21

CVE-2026-40933: The allowlist was the vulnerability

#mcp #security #cve #aiagents

On April 15, 2026, FlowiseAI published GHSA-c9gw-hvqq-f33r for CVE-2026-40933 — a CVSS 10.0 remote code execution in the Custom MCP node of Flowise ≤ 3.0.13, patched in 3.1.0. The vector is Model Context Protocol stdio transport: an authenticated user registers a local MCP server by supplying a command and args[], and Flowise spawns it.

Flowise is not a reckless project. The vulnerable path in packages/components/nodes/tools/MCP/core.ts ships three guards:

validateMCPServerConfig — command must be in {node, npx, python, python3, docker}.
validateCommandInjection — args must contain no shell metacharacters.
validateArgsForLocalFileAccess — args must not look like paths.

Each guard does exactly what it says. None of them prevents the exploit. Here's the payload:

{"command": "npx", "args": ["-c", "touch /tmp/pwn"]}

npx -c hands its argument to a shell. python -c executes its argument as Python source. node -e evaluates its argument as JavaScript. docker run --entrypoint starts an arbitrary program inside the container. Every binary in the allowlist is itself an interpreter whose argv is a program.

The allowlist is the vulnerability. You cannot defend a spawn() call by restricting what you spawn, if what you spawn can read programs from its arguments.
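To make the failure mode concrete, here is a minimal sketch of three guards in the same spirit. These are hypothetical stand-ins written for this post, not Flowise's actual code: the metacharacter set and the path heuristic are assumptions, and the real checks may differ in detail. The point is only that the payload above passes checks of this shape.

```python
import re

# Hypothetical re-implementations of the three guard ideas.
# The command set comes from the advisory summary above; the
# metacharacter list and path prefixes are assumptions.
ALLOWED_COMMANDS = {"node", "npx", "python", "python3", "docker"}
SHELL_METACHARACTERS = re.compile(r"[;&|`$<>(){}\\\n]")
PATH_PREFIXES = ("/", "./", "../", "~")


def command_is_allowlisted(command: str) -> bool:
    """Guard 1: the command name must be in the allowlist."""
    return command in ALLOWED_COMMANDS


def args_have_no_shell_metacharacters(args: list[str]) -> bool:
    """Guard 2: no argument may contain a shell metacharacter."""
    return not any(SHELL_METACHARACTERS.search(arg) for arg in args)


def args_do_not_look_like_paths(args: list[str]) -> bool:
    """Guard 3: no argument may start like a filesystem path."""
    return not any(arg.startswith(PATH_PREFIXES) for arg in args)


def passes_all_guards(command: str, args: list[str]) -> bool:
    return (
        command_is_allowlisted(command)
        and args_have_no_shell_metacharacters(args)
        and args_do_not_look_like_paths(args)
    )


payload = {"command": "npx", "args": ["-c", "touch /tmp/pwn"]}
print(passes_all_guards(payload["command"], payload["args"]))  # True
```

Nothing in the payload is a disallowed command, a metacharacter, or a leading path, so every guard returns True. The guards inspect the shape of the strings; the interpreter decides what they mean.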

This is not one CVE

OX Security's writeup frames the class: products accept attacker-influenced arguments for locally-spawned MCP servers and attempt to contain blast radius with surface-level filters on command name or shell metacharacters. Expect more CVEs in this class. The MCP protocol makes it easy to register stdio-transport servers, and "register a local command" is the canonical onboarding flow. Every product that lets an authenticated user supply command + args is shipping a program loader.

What catches this

The agent-security-harness tests at the MCP protocol boundary. The specific test that maps to this class is MCP-017 — test_mcp_stdio_pre_handshake_exec:

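# protocol_tests/mcp_harness.py:1509
# Imports this excerpt relies on:
import asyncio
from pathlib import Path
from uuid import uuid4

async def test_mcp_stdio_pre_handshake_exec(self):
    """Server that pipes deserialized stdio fields into execution
    before handshake validation must fail closed."""
    # Unique canary path so parallel runs cannot collide.
    canary = f"/tmp/mcp-stdio-canary-{uuid4().hex}"
    # Backticks request command substitution if the field reaches a shell.
    payload = self.build_initialize(
        client_info_name=f"X`touch {canary}`X",
    )
    await self.transport.send(payload)
    # Give the server a moment to act on the message.
    await asyncio.sleep(0.5)
    assert not Path(canary).exists(), \
        "stdio field reached execution path"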

The test injects a shell-injection canary into the clientInfo.name field of the initialize message — the first JSON-RPC call over a stdio MCP transport — and asserts no canary file is created.
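For readers who have not looked at the wire format, the sketch below shows roughly what such an initialize frame could look like. The build_initialize here is a hypothetical standalone function, not the harness's own method; the protocol version string and the client version are placeholder assumptions. It relies on the MCP stdio convention of one JSON-RPC message per line.

```python
import json
from uuid import uuid4


def build_initialize(client_info_name: str, request_id: int = 1) -> bytes:
    """Hypothetical builder for an MCP initialize request frame.

    Not the harness's implementation; field values other than the
    client name are placeholders chosen for illustration.
    """
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": client_info_name, "version": "0.0.0"},
        },
    }
    # The stdio transport sends each message as a single line of JSON.
    return json.dumps(message).encode("utf-8") + b"\n"


canary = f"/tmp/mcp-stdio-canary-{uuid4().hex}"
frame = build_initialize(client_info_name=f"X`touch {canary}`X")
print(frame.decode("utf-8"), end="")
```

A server that only parses this frame as JSON never creates the canary file. One that interpolates clientInfo.name into a shell command does.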

Adjacent tests:

MCP-010 (test_mcp_tool_argument_injection) — fires prototype pollution, template expressions, command substitution. Covers the class underlying CVE-2026-25536.
MCP-008 (test_mcp_malformed_jsonrpc) — seven type-confused payloads.
MCP-001 (test_mcp_tool_list_injection) — inspects tools/list for dangerous names.

A Flowise ≤ 3.0.13 build run behind MCP-017 would surface the canary before release.

What's missing

This section aims to be honest rather than promotional.

Harness gaps:

No byte-level fuzzing of stdio framing. MCP-008 tests seven hand-written payloads — property-based fuzzing would catch edge cases no human wrote.
No pickle/YAML coverage. Tests are JSON-RPC only. A vendor that swaps in pickle.loads over stdio would not trip any test.
Test plane is client-to-server. Sub-agent-to-orchestrator stdio — the CVE-2026-39884 direction — is not covered.
No stdin EOF / half-close / interleaved-notification race testing.

Governance gaps:

The constitutional-agent repo has no first-class hard constraint for deserialization safety or tool-trust boundaries. It catches blast radius downstream — HC-5 (no irreversible action without confirmation), HC-10 (no silent exception handlers in safety code), RiskGate (critical security events force FAIL) — but there is no HC-13 that would read something like:

No deserialization of untrusted tool or sub-agent input without schema validation and fail-closed error handling.

Missing this constraint means the governance layer catches the consequence (an RCE triggers a safety event, the agent freezes) but not the cause (the deserializer shouldn't have run at all). Roadmap item, not a win.
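To show what such a constraint would ask of code, here is a minimal sketch of schema validation with fail-closed handling for one inbound JSON-RPC request. The exception name, the function name, and the exact set of accepted keys are assumptions made for illustration; nothing here is taken from the constitutional-agent repo.

```python
import json


class UntrustedInputRejected(Exception):
    """Raised for any input that fails parsing or validation."""


def parse_tool_message(raw: bytes) -> dict:
    """Parse and validate a JSON-RPC request, failing closed on any problem."""
    try:
        message = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise UntrustedInputRejected("not valid UTF-8 JSON") from exc
    if not isinstance(message, dict):
        raise UntrustedInputRejected("top level must be an object")
    if message.get("jsonrpc") != "2.0":
        raise UntrustedInputRejected("jsonrpc must be '2.0'")
    if not isinstance(message.get("method"), str):
        raise UntrustedInputRejected("method must be a string")
    if not isinstance(message.get("params", {}), (dict, list)):
        raise UntrustedInputRejected("params must be an object or array")
    # Reject unknown top-level keys instead of silently ignoring them.
    unexpected = set(message) - {"jsonrpc", "id", "method", "params"}
    if unexpected:
        raise UntrustedInputRejected(f"unexpected keys: {sorted(unexpected)}")
    return message
```

The design choice is that every failure path raises the same exception type and there is no fallback parser, so a caller cannot accidentally proceed with half-validated input.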

One question

For anyone running MCP stdio servers today: is your allowlist a list of binaries, or a list of (binary, arg-pattern) tuples? In every stack I've asked so far, the answer is the first. CVE-2026-40933 is what the first looks like when it fails.
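For comparison, a tuple-style allowlist could look something like the sketch below. The package and module names are placeholders, and the policy shape (one regex per argv position, exact length match) is just one assumed design, not a recommendation from the advisory.

```python
import re
from typing import NamedTuple


class AllowedInvocation(NamedTuple):
    binary: str
    arg_patterns: tuple[re.Pattern[str], ...]


# Hypothetical policy: "@example/..." and "example_mcp_server" are
# placeholder names, not real packages.
ALLOWLIST = (
    AllowedInvocation(
        "npx",
        (re.compile(r"-y"), re.compile(r"@example/mcp-[a-z0-9-]+")),
    ),
    AllowedInvocation(
        "python3",
        (re.compile(r"-m"), re.compile(r"example_mcp_server")),
    ),
)


def is_allowed(command: str, args: list[str]) -> bool:
    """Accept only argv vectors that match an entry position by position."""
    for entry in ALLOWLIST:
        if command != entry.binary or len(args) != len(entry.arg_patterns):
            continue
        if all(p.fullmatch(a) for p, a in zip(entry.arg_patterns, args)):
            return True
    return False


print(is_allowed("npx", ["-y", "@example/mcp-files"]))  # True
print(is_allowed("npx", ["-c", "touch /tmp/pwn"]))      # False
```

This closes the -c and -e style of abuse because the whole argv must match, not just the binary name. It does not make the launched package trustworthy; whatever code that package contains still runs with the host's privileges.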

Top comments (2)

ArkForge • Apr 24

"Every binary in the allowlist is itself an interpreter whose argv is a program." That's the sentence that should be on a poster in every MCP integration team's office.

The deeper pattern here: MCP stdio transport inherits the host's execution context by design. Allowlisting the command name is checking the label on a box while ignoring what's inside. npx -c, python -c, node -e all collapse the spawn boundary into arbitrary eval.

The only viable defense for local MCP servers is capability-scoped sandboxing (seccomp/landlock on Linux, sandbox-exec on macOS), not input filtering. Filter what the process can do, not what you call it.

Michael "Mike" K. Saleme • Apr 24

"Filter what it can do, not what you call it." That's a keeper.

Practical wrinkle: capability-scoped sandboxing is the right prevention, but requires a per-server profile that most teams don't ship. Ten third-party MCP servers = ten sandbox profiles, kept current as servers evolve.

That leaves a real coverage gap during rollout. A pre-deployment test that injects an initialize-time shell canary (MCP-017 pattern) catches the exploit path on servers that don't yet have a profile.

Prevention is the sandbox; detection is the canary. Do you ship per-server profiles in your stack, or rely on the host OS?