<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nick Stocks</title>
    <description>The latest articles on DEV Community by Nick Stocks (@mistaike_ai).</description>
    <link>https://dev.to/mistaike_ai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3830129%2F1b02e4cb-4d63-4510-9aef-d9c633d79a0a.jpeg</url>
      <title>DEV Community: Nick Stocks</title>
      <link>https://dev.to/mistaike_ai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mistaike_ai"/>
    <language>en</language>
    <item>
      <title>Why Local Sandboxing Isn't Enough for MCP Servers</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Sun, 05 Apr 2026 14:47:14 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/why-local-sandboxing-isnt-enough-for-mcp-servers-k6j</link>
      <guid>https://dev.to/mistaike_ai/why-local-sandboxing-isnt-enough-for-mcp-servers-k6j</guid>
      <description>&lt;p&gt;Local sandbox tools for MCP servers are solving a real problem. They solve it well on a developer's machine. That's not the same as solving it when your agent handles other people's data.&lt;/p&gt;

&lt;p&gt;This post looks at what tools like &lt;a href="https://github.com/pottekkat/sandbox-mcp" rel="noopener noreferrer"&gt;sandbox-mcp&lt;/a&gt; (by Navendu Pottekkat) and &lt;a href="https://github.com/Automata-Labs-team/code-sandbox-mcp" rel="noopener noreferrer"&gt;code-sandbox-mcp&lt;/a&gt; (by Automata Labs) actually provide, where they stop, and what that gap costs you once agents are running against real user data.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Local Sandbox Tools Do Well
&lt;/h2&gt;

&lt;p&gt;Navendu Pottekkat built sandbox-mcp to address a concrete problem: LLMs that generate code can't execute it safely. His &lt;a href="https://navendu.me/posts/sandbox-mcp/" rel="noopener noreferrer"&gt;write-up&lt;/a&gt; explains the design clearly. The tool spins up Docker containers on your local machine, routes code execution through them, and returns output to the agent — without touching your host system directly.&lt;/p&gt;

&lt;p&gt;The OS hardening in sandbox-mcp is real work. Containers drop all Linux capabilities (&lt;code&gt;capDrop: ["all"]&lt;/code&gt;), set &lt;code&gt;no-new-privileges: true&lt;/code&gt;, and run with a read-only filesystem — only specific mount points are writable. Resource limits are configurable per-sandbox (CPU, memory, process count, file quotas). These are not defaults you get from vanilla Docker; Pottekkat made deliberate choices to reduce privilege at the OS layer.&lt;/p&gt;
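&lt;p&gt;To make that concrete, here is roughly what those hardening choices look like expressed as Docker SDK for Python options. This is an illustrative sketch, not sandbox-mcp's actual configuration, and the specific limit values are examples:&lt;/p&gt;

```python
# Illustrative sketch: sandbox-mcp-style hardening expressed as keyword
# arguments for docker-py's client.containers.run(). Values are examples,
# not sandbox-mcp's real defaults.
hardened_opts = {
    "cap_drop": ["ALL"],                         # drop every Linux capability
    "security_opt": ["no-new-privileges:true"],  # block setuid-style privilege escalation
    "read_only": True,                           # read-only root filesystem
    "tmpfs": {"/tmp": "rw,size=64m"},            # only explicit mount points are writable
    "mem_limit": "256m",                         # per-sandbox memory cap
    "pids_limit": 64,                            # per-sandbox process cap
    "network_disabled": True,                    # no network for untrusted shell code
}

# Usage (requires a running Docker daemon):
#   import docker
#   client = docker.from_env()
#   client.containers.run("python:3.12-slim", "python -c 'print(1)'",
#                         remove=True, **hardened_opts)
```

&lt;p&gt;The point of the sketch is that none of these options are Docker defaults; each one is a deliberate choice the operator (or the tool) has to make.&lt;/p&gt;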

&lt;p&gt;Code-sandbox-mcp, from Automata Labs, takes a more minimal approach — Docker container lifecycle management exposed as MCP tools (&lt;code&gt;sandbox_initialize&lt;/code&gt;, &lt;code&gt;sandbox_exec&lt;/code&gt;, &lt;code&gt;copy_file&lt;/code&gt;, &lt;code&gt;sandbox_stop&lt;/code&gt;) without the same documented OS hardening. It's useful for flexible, image-driven execution, with the container configuration left to the operator.&lt;/p&gt;

&lt;p&gt;Both tools have the same scope: isolate code execution on a developer's local machine. That scope is well-defined and both tools serve it honestly.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Shared Kernel Problem
&lt;/h2&gt;

&lt;p&gt;Docker containers — including hardened ones — share the host operating system kernel. Every system call from every container on your machine is handled by the same kernel, regardless of &lt;code&gt;cap-drop&lt;/code&gt; configuration.&lt;/p&gt;

&lt;p&gt;Capability drops restrict what a process can do through legitimate operations; they don't prevent exploitation of a kernel vulnerability itself. Linux exposes roughly 350 system calls. &lt;a href="https://unit42.paloaltonetworks.com/making-containers-more-isolated-an-overview-of-sandboxed-container-technologies/" rel="noopener noreferrer"&gt;Palo Alto's Unit 42&lt;/a&gt; describes the core issue: traditional containers are "not truly sandboxed" because they share the host OS kernel, and the attack surface is the full syscall table. A container escape via a kernel vulnerability — CVE-2022-0492 (cgroups bypass), CVE-2019-5736 (runc overwrite) — gives an attacker direct access to the host, independent of how the container was configured.&lt;/p&gt;

&lt;p&gt;The mistaike.ai sandbox runs MCP servers inside a user-space kernel (gVisor runsc). Instead of passing system calls directly to the host kernel, gVisor's Sentry intercepts them and implements them in user space. The host kernel exposure drops to fewer than 20 syscalls for the entire sandbox. &lt;a href="https://github.com/google/gvisor" rel="noopener noreferrer"&gt;Google's gVisor documentation&lt;/a&gt; describes the architecture; Unit 42's research puts the result at under 10% of Linux's full syscall count.&lt;/p&gt;

&lt;p&gt;This is a qualitatively different isolation model, not a stricter configuration of the same model.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Egress Gap
&lt;/h2&gt;

&lt;p&gt;sandbox-mcp disables network access in its shell sandbox — that's a genuine safeguard for untrusted shell execution. But the Go, Java, JavaScript, Rust, network-tools, and APISIX sandboxes enable network access because they need it. Code-sandbox-mcp doesn't document egress restrictions at all.&lt;/p&gt;

&lt;p&gt;For MCP servers specifically, network access isn't optional — they need to call APIs, connect to databases, fetch data. You can't disable the network and still have a functional MCP tool.&lt;/p&gt;

&lt;p&gt;That means a sandboxed MCP server running on your machine, through either tool, can reach any external endpoint it wants. No filtering, no allowlist, no logging of what it connects to.&lt;/p&gt;

&lt;p&gt;This matters when an MCP server is compromised — whether by a supply chain attack, a malicious dependency, or a CVE in a library it uses. A compromised container with unrestricted network access can exfiltrate data over plain HTTP, reach internal network endpoints, or beacon to any external host. The container isolation protects your filesystem and process space. It doesn't protect your network.&lt;/p&gt;

&lt;p&gt;The mistaike.ai sandbox enforces default-deny egress via an Envoy proxy. MCP servers declare the external FQDNs they need (maximum 10, no wildcards). Everything else is blocked before a packet leaves the container.&lt;/p&gt;
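&lt;p&gt;In pseudocode terms, the difference between the two models is a one-line inversion. A minimal sketch, with a hypothetical allowlist:&lt;/p&gt;

```python
# Minimal sketch of the two egress models. The allowlist is hypothetical.
ALLOWED_FQDNS = {"api.example.com", "db.internal.example.com"}

def default_allow(host: str) -> bool:
    # Plain Docker: any outbound destination is reachable.
    return True

def default_deny(host: str) -> bool:
    # Allowlist model: only destinations declared at deploy time pass.
    return host in ALLOWED_FQDNS

assert default_allow("attacker.evil.example")      # plain Docker lets this out
assert not default_deny("attacker.evil.example")   # default-deny drops it
```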




&lt;h2&gt;
  
  
  No DLP on Tool Call Traffic
&lt;/h2&gt;

&lt;p&gt;When your agent calls an MCP tool, two things flow through the connection: the request (which may contain secrets, PII, or sensitive business data) and the response (which may contain the same, or injected content).&lt;/p&gt;

&lt;p&gt;Local sandbox tools operate at the process execution layer — Docker container isolation is about filesystem and process separation, not protocol inspection. They don't intercept and analyse the MCP protocol data flowing in and out.&lt;/p&gt;

&lt;p&gt;If your agent passes an AWS credential as part of a tool call parameter, it passes through. If a tool response includes a database connection string from a third-party data source, your agent processes it.&lt;/p&gt;

&lt;p&gt;Bidirectional DLP scanning on every tool call intercepts both directions in under 50ms. Outbound: credentials and PII are caught before they reach third-party code. Inbound: sensitive data patterns in responses are redacted before your agent acts on them. Every match is logged: what triggered it, which rule matched, and what was redacted.&lt;/p&gt;
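&lt;p&gt;As an illustration of the outbound half, a minimal pattern-based scan might look like the sketch below. The two patterns are simplified examples, not mistaike.ai's actual rule set:&lt;/p&gt;

```python
import re

# Simplified, illustrative DLP patterns -- real rule sets are far larger.
RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str):
    """Return (redacted_text, names of the rules that matched)."""
    matched = []
    for name, pattern in RULES.items():
        if pattern.search(text):
            matched.append(name)
            text = pattern.sub(f"[REDACTED:{name}]", text)
    return text, matched

clean, hits = redact("key=AKIAIOSFODNN7EXAMPLE contact=dev@example.com")
```

&lt;p&gt;Running the same scan on tool responses before they reach the agent's context gives you the inbound direction with the same machinery.&lt;/p&gt;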




&lt;h2&gt;
  
  
  No 0-Day CVE Protection
&lt;/h2&gt;

&lt;p&gt;When you install an MCP server locally, you're pulling in its entire dependency tree. A compromised package — whether from a supply chain attack or a newly disclosed vulnerability — runs with whatever permissions the container has. Local sandbox tools don't scan those dependencies or alert you when one goes bad.&lt;/p&gt;

&lt;p&gt;Traditional vulnerability scanners check committed code in CI. They don't continuously monitor the MCP servers you're actually running, and they don't inspect what those servers return.&lt;/p&gt;

&lt;p&gt;mistaike.ai provides 0-day CVE protection: every tool response is checked against known CVE patterns, updated daily or more frequently as new vulnerabilities are disclosed. If a registered MCP server is found to have a compromised dependency, it's quarantined immediately — pulled from service before it can process another request. This protection is free on all plans, including the gateway tier.&lt;/p&gt;

&lt;p&gt;See the full &lt;a href="https://mistaike.ai/cve-registry/" rel="noopener noreferrer"&gt;CVE registry →&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  No Team-Wide Policy
&lt;/h2&gt;

&lt;p&gt;Local sandbox tools are per-developer configurations. Each engineer maintains their own Docker setup, their own sandbox images, their own resource limits. There is no shared policy.&lt;/p&gt;

&lt;p&gt;At team scale, when the same MCP server is being called from five machines with five different sandbox configurations, the effective security posture is the weakest configuration in the group. There's no centralised audit log spanning all connections.&lt;/p&gt;

&lt;p&gt;Team-wide policy means one set of rules applies to all agent traffic from all team members. A policy change propagates immediately across all connections. The audit log shows what happened across the full team, not just one developer's local environment.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Actually Means If You're Building for Users
&lt;/h2&gt;

&lt;p&gt;The sections above describe technical gaps. But the reason those gaps matter isn't technical — it's about what you can promise the people who depend on what you build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're a freelancer building agent workflows for clients:&lt;/strong&gt; Your client is going to ask how their data is protected. "I run the MCP server in a Docker container on my machine" is an honest answer, but it doesn't give them anything to audit. With DLP scanning and an immutable audit log, you can show them exactly what data flowed through, what was caught, and what rules were applied. That's the difference between a verbal assurance and a verifiable control. For enterprise clients, this is often a prerequisite — not a nice-to-have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're a solo founder shipping your first AI product:&lt;/strong&gt; The gap between prototype and product is the gap between "it works" and "it's safe for other people's data." Local sandboxing gets you through the prototype phase. But when your first real user puts their API key into your tool, or your agent processes a customer's database query, you need to know that data isn't leaking through a compromised MCP tool response. You probably don't have time to build that scanning infrastructure yourself, and you shouldn't have to. That's the kind of thing you should be able to buy for the cost of a couple of coffees a month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're a small team and your agent connects to third-party MCP servers:&lt;/strong&gt; You control your own code, but you don't control what's in the dependency trees of community MCP servers. A compromised package in a popular tool means every agent using it is exposed. 0-day CVE protection means every response is scanned and compromised servers are quarantined automatically — you don't have to notice the advisory yourself. Without that, you're trusting every upstream maintainer to never get compromised. That's a bet that doesn't scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're building an internal tool for your team:&lt;/strong&gt; Even if the data isn't customer-facing, it's still sensitive. Internal tools often have &lt;em&gt;less&lt;/em&gt; scrutiny than production systems, which makes them better targets. A centrally managed policy that applies DLP and CVE scanning across all team members' agent connections means one engineer's local Docker misconfiguration doesn't become the weakest link.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're evaluating tools and wondering when to make the switch:&lt;/strong&gt; There's no urgency to move off local sandboxing while you're prototyping. Local tools are fine for development. The trigger point is when real data enters the picture — your users' data, client data, production API keys, anything you'd care about losing. That's when the gap between process isolation and content-aware security becomes material. The free gateway tier lets you add CVE protection to your existing setup without changing how your servers are deployed, so you can close the most critical gap first and move to managed hosting when you're ready.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where Local Tools Fit
&lt;/h2&gt;

&lt;p&gt;Local sandbox tools are the right choice for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Development and testing on your own machine, where production-grade policy enforcement is overhead&lt;/li&gt;
&lt;li&gt;Short-lived code experiments where kernel-level isolation is acceptable risk&lt;/li&gt;
&lt;li&gt;Any context where the agent isn't processing real user data or touching production systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They're good tools solving a real problem at the right scope. The issue isn't that they're inadequate — it's that the problem changes shape when you go from building for yourself to building for other people, and the tooling needs to change with it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Side by Side
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;code-sandbox-mcp&lt;/th&gt;
&lt;th&gt;sandbox-mcp (pottekkat)&lt;/th&gt;
&lt;th&gt;mistaike.ai hosted sandbox&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Process isolation&lt;/td&gt;
&lt;td&gt;Docker (defaults)&lt;/td&gt;
&lt;td&gt;Docker + cap-drop all + no-new-privs + read-only FS&lt;/td&gt;
&lt;td&gt;gVisor user-space kernel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Host kernel exposure&lt;/td&gt;
&lt;td&gt;~350 syscalls&lt;/td&gt;
&lt;td&gt;~350 syscalls&lt;/td&gt;
&lt;td&gt;&amp;lt;20 syscalls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Egress control&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None (network disabled in the shell sandbox only)&lt;/td&gt;
&lt;td&gt;Default-deny, FQDN allowlist&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DLP scanning&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Bidirectional, &amp;lt;50ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0-day CVE protection&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Daily updates, response scanning, auto-quarantine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content safety&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Prompt injection detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secrets handling&lt;/td&gt;
&lt;td&gt;Operator-defined&lt;/td&gt;
&lt;td&gt;Operator-defined&lt;/td&gt;
&lt;td&gt;Envelope-encrypted, memory-injected&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Team policy&lt;/td&gt;
&lt;td&gt;Per-developer&lt;/td&gt;
&lt;td&gt;Per-developer&lt;/td&gt;
&lt;td&gt;Org-wide, instant propagation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit log&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Per-scan, immutable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hosted option&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The &lt;a href="https://mistaike.ai/security/sandbox" rel="noopener noreferrer"&gt;security architecture page&lt;/a&gt; has a full breakdown of how each layer is implemented.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;If you're running MCP servers locally and want to close the CVE gap first: connect your existing tools through the mistaike.ai gateway. You get 0-day CVE protection on every tool response immediately, at no cost, without changing how your servers are deployed.&lt;/p&gt;

&lt;p&gt;If you're ready for managed hosting with the full isolation stack: &lt;a href="https://mistaike.ai/auth/register" rel="noopener noreferrer"&gt;start free →&lt;/a&gt; and follow the &lt;a href="https://mistaike.ai/guides/sandbox-mcp-servers" rel="noopener noreferrer"&gt;setup guide →&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://mistaike.ai/pricing" rel="noopener noreferrer"&gt;Full pricing →&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Nick Stocks is the founder of mistaike.ai.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/local-sandbox-not-enough" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>mcp</category>
      <category>sandbox</category>
      <category>dlp</category>
    </item>
    <item>
      <title>Why Default-Deny Egress Matters for MCP Server Hosting</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Sun, 05 Apr 2026 09:55:08 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/why-default-deny-egress-matters-for-mcp-server-hosting-30m3</link>
      <guid>https://dev.to/mistaike_ai/why-default-deny-egress-matters-for-mcp-server-hosting-30m3</guid>
      <description>&lt;p&gt;&lt;strong&gt;An MCP server is code running on infrastructure. By default, that code can connect to any IP address or domain on the internet — including yours.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://mistaike.ai/auth/register" rel="noopener noreferrer"&gt;Start for free →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://mistaike.ai/guides/sandbox-mcp-servers" rel="noopener noreferrer"&gt;Setup guide →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;When people think about MCP security risks, they think about what comes &lt;em&gt;in&lt;/em&gt;: malicious tool responses, prompt injection, supply chain attacks. That's the right instinct, but it's only half the picture.&lt;/p&gt;

&lt;p&gt;The other half is what goes &lt;em&gt;out&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;An MCP server running inside a container — whether on your infrastructure or a managed platform — has outbound network access by default. It can open a connection to any domain, any IP, on any port. If that server is compromised or malicious, it can use that network access to exfiltrate everything it touches: environment variables, secrets, data passed through tool calls, credentials it has been given.&lt;/p&gt;

&lt;p&gt;Default-deny egress closes that door.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Exfiltration Attack Model
&lt;/h2&gt;

&lt;p&gt;The attack is straightforward. A compromised or malicious MCP server:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Receives data through a tool call — secrets injected into the environment, a user's query, a database response, an API credential&lt;/li&gt;
&lt;li&gt;Opens an outbound connection to an attacker-controlled endpoint&lt;/li&gt;
&lt;li&gt;Sends the data out&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The entire attack happens at the network layer. No exploit chain, no persistence mechanism. Just an HTTP POST to a domain the attacker controls.&lt;/p&gt;

&lt;p&gt;This isn't theoretical. The &lt;a href="https://blog.gitguardian.com/smithery-mcp-breach/" rel="noopener noreferrer"&gt;Smithery.ai breach&lt;/a&gt; — documented by GitGuardian — demonstrated credential exfiltration from a path traversal vulnerability that exposed API keys from thousands of MCP servers. Default-deny egress would not have prevented the initial path traversal, but it would have blocked the outbound call that extracted the credentials. The blast radius shrinks dramatically when the server can't reach arbitrary endpoints.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "Default-Deny" Actually Means
&lt;/h2&gt;

&lt;p&gt;Standard Docker containers have unrestricted outbound network access. There is no egress configuration by default — a container can connect anywhere. Most MCP server management tools and hosting platforms do not change this. They isolate the container from the host but leave outbound traffic open.&lt;/p&gt;

&lt;p&gt;Default-deny egress inverts the model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All outbound connections are blocked unless explicitly allowed&lt;/li&gt;
&lt;li&gt;The allowlist is declared at deploy time — not modifiable at runtime&lt;/li&gt;
&lt;li&gt;Connections to undeclared destinations are dropped at the network layer, independent of what the server code does&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means a server that has been fully compromised — where an attacker has arbitrary code execution — still cannot reach an exfiltration endpoint that wasn't declared before the server started. The attacker controls the code but not the network rules.&lt;/p&gt;

&lt;h2&gt;
  
  
  How mistaike.ai Implements Egress Control
&lt;/h2&gt;

&lt;p&gt;On mistaike.ai, egress enforcement runs through an Envoy proxy that sits between the container and the external network. When you deploy an MCP server, you declare the FQDNs it needs to reach (max 10, no wildcards, no raw IP addresses). The proxy enforces this list; everything else is dropped.&lt;/p&gt;

&lt;p&gt;The constraints exist for security reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No wildcards&lt;/strong&gt; — &lt;code&gt;*.example.com&lt;/code&gt; allows too broad a surface, including subdomains you don't control&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No IP addresses&lt;/strong&gt; — raw IP allowlisting bypasses DNS-based controls and is difficult to audit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max 10 FQDNs&lt;/strong&gt; — keeps the allowlist reviewable; a server that needs more than 10 external endpoints is doing more than one thing and should be decomposed&lt;/li&gt;
&lt;/ul&gt;
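&lt;p&gt;A validator enforcing those three constraints is small. This sketch is illustrative, not mistaike.ai's implementation:&lt;/p&gt;

```python
import ipaddress

def validate_egress(fqdns):
    """Check a declared egress list: max 10 entries, no wildcards, no raw IPs."""
    errors = []
    if len(fqdns) > 10:
        errors.append("more than 10 FQDNs declared")
    for host in fqdns:
        if "*" in host:
            errors.append(f"wildcard not allowed: {host}")
            continue
        try:
            ipaddress.ip_address(host)  # parses only if host is an IP literal
            errors.append(f"raw IP not allowed: {host}")
        except ValueError:
            pass  # not an IP literal -- fine
    return errors

assert validate_egress(["api.example.com"]) == []
assert validate_egress(["*.example.com"]) == ["wildcard not allowed: *.example.com"]
assert validate_egress(["10.0.0.5"]) == ["raw IP not allowed: 10.0.0.5"]
```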

&lt;p&gt;The enforcement happens before traffic leaves the container. It is not firewall rules applied at the host level after the fact — it is the proxy layer that mediates every outbound connection.&lt;/p&gt;

&lt;h2&gt;
  
  
  This Is One Layer, Not a Complete Defence
&lt;/h2&gt;

&lt;p&gt;Default-deny egress addresses one specific attack vector: uncontrolled exfiltration via outbound network access. It does not:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prevent the initial compromise&lt;/li&gt;
&lt;li&gt;Stop data from being included in &lt;em&gt;declared&lt;/em&gt; outbound traffic (to an allowed domain)&lt;/li&gt;
&lt;li&gt;Replace DLP scanning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why mistaike.ai's sandbox combines egress control with DLP scanning and kernel-level isolation. Each layer addresses different failure modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;gVisor&lt;/strong&gt; (kernel isolation) — limits what system calls the server can make, containing privilege escalation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Default-deny egress&lt;/strong&gt; — blocks exfiltration to undeclared endpoints regardless of what the server code does&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DLP scanning&lt;/strong&gt; — inspects the &lt;em&gt;content&lt;/em&gt; of traffic on declared paths, catching credential and PII leakage through legitimate channels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An attacker who compromises a server faces all three controls simultaneously. Bypassing one doesn't make the others irrelevant.&lt;/p&gt;

&lt;p&gt;DLP alone is insufficient if the server can reach arbitrary endpoints — you'd have to scan every outbound byte to every possible destination. Egress control reduces the problem to scanning traffic on a declared set of paths. Combined, they catch different parts of the same attack.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Look for When Evaluating MCP Hosting
&lt;/h2&gt;

&lt;p&gt;Egress control is not complex to implement, but it requires deliberate design. A platform that has thought seriously about MCP server security will be able to answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is the default egress policy for hosted servers?&lt;/li&gt;
&lt;li&gt;At what layer is egress enforced? (Application-level filtering is weaker than network-layer enforcement)&lt;/li&gt;
&lt;li&gt;Can servers modify their own egress rules at runtime?&lt;/li&gt;
&lt;li&gt;Is the allowlist declared at deploy time and immutable during execution?&lt;/li&gt;
&lt;li&gt;What happens to outbound connections to undeclared destinations?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If a platform can't answer these questions clearly, the answer is probably "unrestricted outbound."&lt;/p&gt;




&lt;h2&gt;
  
  
  Egress Control on mistaike.ai
&lt;/h2&gt;

&lt;p&gt;Managed MCP hosting on mistaike.ai includes default-deny egress on every server, every plan. You declare the FQDNs your server needs at deploy time. Everything else is blocked.&lt;/p&gt;

&lt;p&gt;This sits alongside sandboxed builds, envelope-encrypted secrets, kernel-level isolation, DLP scanning, and CVE pattern matching — the full architecture is at &lt;a href="https://mistaike.ai/security/sandbox" rel="noopener noreferrer"&gt;/security/sandbox&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you're evaluating MCP hosting and want to understand what's in each layer, the &lt;a href="https://mistaike.ai/cve-registry/" rel="noopener noreferrer"&gt;CVE registry&lt;/a&gt; shows the patterns we track and update daily.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://mistaike.ai/auth/register" rel="noopener noreferrer"&gt;Start free →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://mistaike.ai/pricing" rel="noopener noreferrer"&gt;Full pricing →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://mistaike.ai/guides/sandbox-mcp-servers" rel="noopener noreferrer"&gt;Setup guide →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Nick Stocks is the founder of mistaike.ai.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/default-deny-egress-mcp" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>mcp</category>
      <category>sandbox</category>
      <category>networking</category>
    </item>
    <item>
      <title>When Prompt Injection Becomes Remote Code Execution</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Sat, 04 Apr 2026 12:21:51 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/when-prompt-injection-becomes-remote-code-execution-1h3n</link>
      <guid>https://dev.to/mistaike_ai/when-prompt-injection-becomes-remote-code-execution-1h3n</guid>
      <description>&lt;p&gt;&lt;strong&gt;Prompt injection is usually discussed as a text-level attack — tricking an LLM into saying something it shouldn't. Four new CVEs in CrewAI demonstrate that when agents have tools, prompt injection becomes a vehicle for remote code execution on the host system.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://mistaike.ai/auth/register" rel="noopener noreferrer"&gt;Start for free →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://mistaike.ai/blog/why-your-ai-agent-needs-dlp" rel="noopener noreferrer"&gt;Why your AI agent needs DLP →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;Most conversations about prompt injection focus on the LLM itself. An attacker crafts input that overrides system instructions, and the model does something its operator didn't intend — leaks a system prompt, ignores a guardrail, says something off-brand.&lt;/p&gt;

&lt;p&gt;That framing is incomplete. When an LLM is embedded in an agent framework with access to tools — code interpreters, file loaders, search APIs — prompt injection doesn't just change what the model &lt;em&gt;says&lt;/em&gt;. It changes what the model &lt;em&gt;does&lt;/em&gt;. And what it does runs on your system.&lt;/p&gt;

&lt;p&gt;On April 1, 2026, CERT/CC published &lt;a href="https://kb.cert.org/vuls/id/221883" rel="noopener noreferrer"&gt;VU#221883&lt;/a&gt;: four CVEs in &lt;a href="https://github.com/crewAIInc/crewAI" rel="noopener noreferrer"&gt;CrewAI&lt;/a&gt;, one of the most widely used AI agent frameworks. The vulnerabilities are individually straightforward. Chained together via prompt injection, they produce a complete attack path from untrusted input to remote code execution on the host.&lt;/p&gt;

&lt;p&gt;This is why we built mistaike.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Four CVEs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  CVE-2026-2275: The Sandbox That Isn't One
&lt;/h3&gt;

&lt;p&gt;CrewAI's Code Interpreter tool is designed to execute agent-generated code inside a Docker container. If Docker isn't available — not installed, not running, or the daemon isn't reachable — the tool silently falls back to &lt;code&gt;SandboxPython&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SandboxPython&lt;/code&gt; sounds safe. It isn't. The critical issue is that it permits &lt;code&gt;ctypes&lt;/code&gt; — Python's foreign function interface, which lets code load and call functions from any shared library on the system.&lt;/p&gt;

&lt;p&gt;In practice, this means:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ctypes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cdll&lt;/span&gt;
&lt;span class="n"&gt;libc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cdll&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LoadLibrary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;libc.so.6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;libc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;system&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That executes a shell command with the privileges of the agent process. The attacker doesn't need to escape a container. There's no container to escape from. They can call &lt;code&gt;execve&lt;/code&gt;, &lt;code&gt;fork&lt;/code&gt;, &lt;code&gt;socket&lt;/code&gt;, or any other libc function directly.&lt;/p&gt;

&lt;p&gt;A "sandbox" that allows arbitrary &lt;code&gt;ctypes&lt;/code&gt; calls is not a sandbox. It's a polite suggestion. The attacker's code runs on the host with full access to the process's memory, file descriptors, and network stack.&lt;/p&gt;

&lt;h3&gt;
  
  
  CVE-2026-2287: Silent Sandbox Degradation
&lt;/h3&gt;

&lt;p&gt;This is the enabling condition for CVE-2026-2275. CrewAI checks whether Docker is available at agent initialisation, but it doesn't re-verify at execution time.&lt;/p&gt;

&lt;p&gt;The consequence: if Docker becomes unavailable &lt;em&gt;after&lt;/em&gt; the agent starts — the daemon crashes, the socket becomes unreachable, the container runtime is killed — the Code Interpreter continues to accept execution requests, but silently routes them through &lt;code&gt;SandboxPython&lt;/code&gt; instead of Docker.&lt;/p&gt;

&lt;p&gt;No exception. No log warning. No error to the operator. The agent carries on as if sandboxing is working.&lt;/p&gt;

&lt;p&gt;This is a TOCTOU (time-of-check/time-of-use) class of failure. The state verified at initialisation is not the state present at execution. In containerised environments, cloud-hosted agents, and CI pipelines, Docker availability is often conditional — and an attacker who can influence the environment (or simply wait for an intermittent failure) can force the fallback.&lt;/p&gt;

&lt;p&gt;The operator's dashboard shows the agent running. It is. Just without the isolation they think they have.&lt;/p&gt;
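&lt;p&gt;The fix is mechanical: repeat the availability check at every execution and refuse to run if it fails. A minimal sketch of the fail-closed pattern (the names &lt;code&gt;docker_available&lt;/code&gt;, &lt;code&gt;SandboxUnavailableError&lt;/code&gt;, and &lt;code&gt;run_in_container&lt;/code&gt; are illustrative, not CrewAI's API):&lt;/p&gt;

```python
# Fail-closed execution gate: re-verify the sandbox at call time, and
# refuse loudly rather than degrade to unsandboxed execution.
# Illustrative sketch only; these names are not CrewAI's API.
import shutil
import subprocess


class SandboxUnavailableError(RuntimeError):
    """Raised instead of silently falling back to host execution."""


def docker_available() -> bool:
    # Check at every execution, not just at startup: the daemon can die
    # between initialisation and use, which is the TOCTOU window.
    if shutil.which("docker") is None:
        return False
    probe = subprocess.run(["docker", "info"], capture_output=True, timeout=5)
    return probe.returncode == 0


def run_in_container(code: str) -> str:
    raise NotImplementedError  # placeholder for the real containerised runner


def execute(code: str, check=docker_available) -> str:
    if not check():
        raise SandboxUnavailableError(
            "container runtime unreachable; refusing unsandboxed execution"
        )
    return run_in_container(code)
```

&lt;p&gt;With this shape, the operator sees an error instead of a dashboard that looks healthy while isolation is gone.&lt;/p&gt;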

&lt;h3&gt;
  
  
  CVE-2026-2286: SSRF via RAG Search Tools
&lt;/h3&gt;

&lt;p&gt;CrewAI's RAG search tools accept arbitrary URLs at runtime without validation. The attack vector is not a crafted HTTP request or a malicious API parameter — it's conversational input to the agent. The injected prompt directs the LLM to search a specific URL, and the tool fetches it.&lt;/p&gt;

&lt;p&gt;The target isn't a public website. It's the cloud metadata endpoint.&lt;/p&gt;

&lt;p&gt;On AWS, the instance metadata service is reachable from any running workload at &lt;code&gt;169.254.169.254&lt;/code&gt;. A request to &lt;code&gt;http://169.254.169.254/latest/meta-data/iam/security-credentials/&lt;/code&gt; returns the name of the attached IAM role. A second request to &lt;code&gt;http://169.254.169.254/latest/meta-data/iam/security-credentials/{role-name}&lt;/code&gt; returns a JSON blob with a live &lt;code&gt;AccessKeyId&lt;/code&gt;, &lt;code&gt;SecretAccessKey&lt;/code&gt;, and session &lt;code&gt;Token&lt;/code&gt; — valid AWS credentials, rotated automatically, with whatever permissions the instance role carries. (This describes IMDSv1; IMDSv2 requires a session token before answering, but v1 remains enabled on many instances.)&lt;/p&gt;

&lt;p&gt;GCP exposes similar data at &lt;code&gt;http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token&lt;/code&gt;. Azure at &lt;code&gt;http://169.254.169.254/metadata/identity/oauth2/token&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;An attacker who can get the agent to fetch one of these URLs has extracted cloud credentials. Those credentials can enumerate S3 buckets, describe EC2 instances, read Secrets Manager entries, or call any other AWS service the role is permitted to reach — without ever touching the host's filesystem or network directly.&lt;/p&gt;

&lt;p&gt;The RAG tool returns the response content to the agent's context window. From there, it may appear in the agent's output to the user, be passed to another tool, or be logged. Any of these paths exfiltrates the credentials.&lt;/p&gt;
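&lt;p&gt;Application-level validation can't be the only control here, but it is cheap to add. A hedged sketch of a host check a URL-fetching tool could run before any request (illustrative, not CrewAI's code): resolve the hostname and refuse link-local, loopback, and private addresses.&lt;/p&gt;

```python
# Host-level SSRF guard: resolve the target and reject addresses in
# link-local (169.254.0.0/16 covers the AWS metadata endpoint),
# loopback, and RFC 1918 private ranges. Illustrative sketch only.
import ipaddress
import socket
from urllib.parse import urlparse

BLOCKED_HOSTNAMES = {"metadata.google.internal"}


def is_fetchable(url: str) -> bool:
    host = urlparse(url).hostname
    if host is None or host in BLOCKED_HOSTNAMES:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_link_local or addr.is_loopback or addr.is_private:
            return False
    return True
```

&lt;p&gt;Two caveats: the resolved address should be pinned for the actual request, otherwise DNS rebinding re-opens the gap; and the metadata endpoints should also be blocked at the network layer, not only in application logic.&lt;/p&gt;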

&lt;h3&gt;
  
  
  CVE-2026-2285: Arbitrary File Read via JSON Loader
&lt;/h3&gt;

&lt;p&gt;CrewAI's JSON loader tool reads files from disk using paths constructed at runtime — and performs no path normalisation or validation before opening them.&lt;/p&gt;

&lt;p&gt;Without a call to &lt;code&gt;os.path.abspath()&lt;/code&gt; followed by a prefix check against an allowed directory, any path the agent constructs is valid input to the loader. Including relative paths with traversal components.&lt;/p&gt;
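&lt;p&gt;The check is a few lines. A minimal sketch of the pattern the post describes (illustrative, not the CrewAI loader):&lt;/p&gt;

```python
# Normalise, then verify the result stays under an allowed root.
# Illustrative sketch only.
import os


def safe_join(allowed_root: str, user_path: str) -> str:
    root = os.path.abspath(allowed_root)
    candidate = os.path.abspath(os.path.join(root, user_path))
    # commonpath, not startswith: "/data" must not admit "/database".
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"path escapes allowed root: {user_path!r}")
    return candidate
```

&lt;p&gt;&lt;code&gt;os.path.commonpath&lt;/code&gt; avoids the classic prefix bug where &lt;code&gt;/data&lt;/code&gt; admits &lt;code&gt;/database&lt;/code&gt;; a hardened version would also resolve symlinks with &lt;code&gt;os.path.realpath&lt;/code&gt; before comparing.&lt;/p&gt;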

&lt;p&gt;The most directly valuable targets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;/proc/self/environ&lt;/code&gt; — the running process's environment variables. If the application loaded a &lt;code&gt;.env&lt;/code&gt; file at startup, every key it contained — &lt;code&gt;DATABASE_URL&lt;/code&gt;, &lt;code&gt;OPENAI_API_KEY&lt;/code&gt;, &lt;code&gt;SECRET_KEY&lt;/code&gt;, &lt;code&gt;STRIPE_SECRET_KEY&lt;/code&gt; — is readable here in plain text.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;.env&lt;/code&gt; files — if the agent process's working directory is the application root (common in development and many container configurations), a relative path like &lt;code&gt;../../.env&lt;/code&gt; traverses to the application's dotenv file.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;~/.ssh/id_rsa&lt;/code&gt;, &lt;code&gt;~/.ssh/id_ed25519&lt;/code&gt; — private keys. If the agent process runs as a user with an SSH keypair, the attacker can read it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;/etc/shadow&lt;/code&gt; (if running as root), AWS credential files at &lt;code&gt;~/.aws/credentials&lt;/code&gt;, kubectl config at &lt;code&gt;~/.kube/config&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The contents are returned to the agent's context window. The agent doesn't need to exfiltrate them explicitly — they'll appear in its next response, get passed to another tool, or be written to a log.&lt;/p&gt;




&lt;h2&gt;
  
  
  How the Chain Works
&lt;/h2&gt;

&lt;p&gt;None of these vulnerabilities require sophisticated exploitation individually. The attack path is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prompt injection.&lt;/strong&gt; Attacker-controlled text reaches the agent's context — via direct user input, RAG retrieval from an external source, a tool response from a compromised third-party service, or any other channel the agent reads from.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tool invocation.&lt;/strong&gt; The injected text instructs the agent to use its tools: execute this code, load this file, search this URL. The LLM processes injected instructions the same way it processes legitimate ones — it has no reliable way to distinguish them. The tool call is made.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sandbox escape.&lt;/strong&gt; The Code Interpreter checks Docker availability. If Docker is down (or was never running), it falls back to &lt;code&gt;SandboxPython&lt;/code&gt;. The attacker's code runs with &lt;code&gt;ctypes&lt;/code&gt; access — arbitrary C function calls on the host.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Credential harvesting and lateral movement.&lt;/strong&gt; Parallel to the code execution path: the SSRF tool fetches cloud metadata credentials; the JSON loader reads &lt;code&gt;/proc/self/environ&lt;/code&gt; and &lt;code&gt;.env&lt;/code&gt;. The attacker now has host RCE, live cloud credentials, and application secrets. They have everything they need to move laterally — to the database, to object storage, to other services in the same VPC.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The LLM was never the target. It was the delivery mechanism.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Is the Problem We Set Out to Solve
&lt;/h2&gt;

&lt;p&gt;We started building mistaike because we kept running into a specific blind spot in how organisations think about AI agents: they treat the LLM as the security boundary.&lt;/p&gt;

&lt;p&gt;The assumption is: if we trust the model, and we've tuned it not to do bad things, we're safe. Prompt injection — when it's acknowledged at all — gets treated as a correctness problem. Make the model more instruction-following, improve the system prompt, add a content filter on inputs.&lt;/p&gt;

&lt;p&gt;The CrewAI CVEs make the flaw in that reasoning concrete. The attacker doesn't need the LLM to &lt;em&gt;want&lt;/em&gt; to do something harmful. They need it to &lt;em&gt;do&lt;/em&gt; what it's told — which is exactly what a model trained to follow instructions does best.&lt;/p&gt;

&lt;p&gt;The tools are the attack surface. Every tool call is a boundary crossing — from the LLM's context into real infrastructure. And in most agent deployments, those boundary crossings are completely unmonitored:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No validation that the arguments a tool receives are within expected ranges&lt;/li&gt;
&lt;li&gt;No inspection of what tool responses contain before they re-enter the agent's context&lt;/li&gt;
&lt;li&gt;No control over what data leaves the system through tool output channels&lt;/li&gt;
&lt;li&gt;No audit log of what was executed, fetched, or read&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the gap we built mistaike to close. DLP on the tool-call boundary catches credentials moving in either direction — whether the exfiltration is deliberate (a compromised tool returning AWS keys) or incidental (the agent summarising the contents of &lt;code&gt;/proc/self/environ&lt;/code&gt; in its response). Content safety on tool inputs catches injection payloads before they reach the LLM and trigger malicious tool calls in the first place.&lt;/p&gt;

&lt;p&gt;The CVE chain above represents a worst-case scenario for an unprotected agent deployment. With inspection at the tool boundary, several steps of it become detectable or blockable before system compromise.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Pattern Extends Beyond CrewAI
&lt;/h2&gt;

&lt;p&gt;CrewAI is not uniquely at fault. It's the framework where these specific bugs were discovered. The same conditions exist throughout the agent ecosystem — because they're not bugs so much as architectural defaults.&lt;/p&gt;

&lt;p&gt;The pattern requires three things, all of which are common:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Untrusted input reaches the agent's context.&lt;/strong&gt; This means RAG retrieval from external sources, user messages, tool outputs from third-party services, webhook payloads — anything the agent processes that an attacker can influence. In production agentic deployments, this is almost always true.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The agent has tools with system-level capabilities.&lt;/strong&gt; Code execution, file access, HTTP requests, database queries. These are not exotic — they're the reason people use agent frameworks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The tools lack independent security boundaries.&lt;/strong&gt; No sandbox for code execution, no URL allowlisting for HTTP requests, no path validation for file access, no output inspection before re-ingestion.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All three are true for the majority of agent deployments today. The attack surface is not limited to CrewAI users.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Actually Helps
&lt;/h2&gt;

&lt;p&gt;Patching CrewAI removes these specific vulnerabilities. It doesn't close the underlying pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Harden the sandbox — and fail closed, not open.&lt;/strong&gt; Docker with a restrictive seccomp profile is a meaningful improvement over unrestricted execution. Kernel-level isolation via &lt;a href="https://gvisor.dev/" rel="noopener noreferrer"&gt;gVisor&lt;/a&gt; reduces the available syscall surface from 300+ calls to approximately 20 — dramatically limiting what attacker code can do even if it executes. But the more important principle is the failure mode: a sandbox that degrades silently to unrestricted execution provides no protection in practice. If the container runtime isn't available, the Code Interpreter should refuse to execute, not fall back. Fail closed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Validate tool inputs outside the LLM.&lt;/strong&gt; The LLM's decision to call a tool with a particular URL, file path, or code payload should not be the final authority on whether that call happens. Tools should enforce their own allow lists and path restrictions independently of what the agent requested — because those restrictions need to hold even when the LLM has been manipulated. Validation that can be bypassed via prompt injection is not validation.&lt;/p&gt;
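&lt;p&gt;Concretely, that means the enforcement lives in the tool dispatch layer, not in the prompt. A hypothetical sketch (the tool name and allowlist are invented for illustration):&lt;/p&gt;

```python
# Policy enforcement in the dispatcher, outside the LLM. Whatever the
# model asked for, the validator runs last and cannot be prompted away.
# Hypothetical sketch; tool names and domains are invented.
from urllib.parse import urlparse

ALLOWED_SEARCH_DOMAINS = {"docs.example.com", "wiki.example.com"}


def validate_search_url(url: str) -> None:
    if urlparse(url).hostname not in ALLOWED_SEARCH_DOMAINS:
        raise PermissionError(f"domain not on tool allowlist: {url}")


TOOL_VALIDATORS = {"rag_search": validate_search_url}


def dispatch(tool_name: str, arg: str) -> None:
    validator = TOOL_VALIDATORS.get(tool_name)
    if validator is None:
        # Default-deny: a tool without a registered policy never runs.
        raise PermissionError(f"no policy registered for tool: {tool_name}")
    validator(arg)  # raises before any side effect
    # ...only now hand the call to the real tool implementation
```

&lt;p&gt;The point is the default-deny shape: the LLM proposes, the dispatcher disposes.&lt;/p&gt;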

&lt;p&gt;&lt;strong&gt;Control egress at the network layer.&lt;/strong&gt; Default-deny outbound network access for agent execution environments. Explicitly declare the domains each tool is permitted to reach. Block the metadata endpoints (&lt;code&gt;169.254.169.254&lt;/code&gt;, &lt;code&gt;metadata.google.internal&lt;/code&gt;) at the network level — not via application logic that an injected prompt can influence. An attacker who achieves code execution but can't reach the metadata service or exfiltrate data externally has a significantly reduced blast radius.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inspect what crosses the tool boundary.&lt;/strong&gt; Every tool call is an outbound data channel. Every tool response is an inbound data channel. DLP on both directions catches credentials and sensitive data moving in either direction — whether the cause is prompt injection, a compromised dependency, or a misconfiguration. This is particularly important for tool responses re-entering the agent's context, where sensitive content can be picked up and summarised by the LLM without any explicit exfiltration step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Treat the LLM as permanently compromised.&lt;/strong&gt; The system's security properties must hold when the model is fully controlled by an attacker. Prompt injection defences are improving, but they are probabilistic. Any security boundary that relies on the LLM correctly identifying and refusing malicious instructions is not a security boundary.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Uncomfortable Implication
&lt;/h2&gt;

&lt;p&gt;The AI agent ecosystem has spent two years treating prompt injection as a trust and safety problem — how do we stop the LLM from being rude, leaking its system prompt, or generating off-policy content?&lt;/p&gt;

&lt;p&gt;The CrewAI CVEs are a reminder that prompt injection in an agentic context is a &lt;em&gt;systems security&lt;/em&gt; problem. The attacker's goal isn't to make the LLM say something embarrassing. It's to use the LLM as an authenticated proxy — one that already has access to your infrastructure — to reach systems that would otherwise require direct compromise.&lt;/p&gt;

&lt;p&gt;Every tool an agent can invoke is an attack surface. Every data source it reads is an injection point. The security boundary isn't the LLM's instruction-following fidelity. It's the isolation between the agent and the systems it touches — and right now, for most deployments, that isolation ranges from thin to non-existent.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;CERT/CC: &lt;a href="https://kb.cert.org/vuls/id/221883" rel="noopener noreferrer"&gt;VU#221883 — CrewAI contains multiple vulnerabilities including SSRF, RCE and local file read&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cyberpress.org/crewai-vulnerabilities/" rel="noopener noreferrer"&gt;CrewAI Vulnerabilities Allow Attackers to Bypass Sandboxes and Compromise Systems&lt;/a&gt; (CyberPress, April 2026)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://gbhackers.com/crewai-hit-by-critical-vulnerabilities/" rel="noopener noreferrer"&gt;CrewAI Hit by Critical Vulnerabilities Enabling Sandbox Escape and Host Compromise&lt;/a&gt; (GBHackers, April 2026)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.securityweek.com/crewai-vulnerabilities-expose-devices-to-hacking/" rel="noopener noreferrer"&gt;CrewAI Vulnerabilities Expose Devices to Hacking&lt;/a&gt; (SecurityWeek, April 2026)&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Nick Stocks is the founder of mistaike.ai.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/crewai-prompt-injection-to-rce" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>mcp</category>
      <category>crewai</category>
      <category>promptinjection</category>
    </item>
    <item>
      <title>Docker Is Not a Sandbox: Why MCP Server Isolation Needs gVisor</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Fri, 03 Apr 2026 20:18:24 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/docker-is-not-a-sandbox-why-mcp-server-isolation-needs-gvisor-23n3</link>
      <guid>https://dev.to/mistaike_ai/docker-is-not-a-sandbox-why-mcp-server-isolation-needs-gvisor-23n3</guid>
      <description>&lt;p&gt;&lt;strong&gt;Docker containers are not sandboxes. They are isolated processes that share your host kernel. For MCP servers running third-party or user-uploaded code, that distinction is the difference between a contained incident and a full host compromise.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://mistaike.ai/auth/register" rel="noopener noreferrer"&gt;Start for free →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://mistaike.ai/guides/sandbox-mcp-servers" rel="noopener noreferrer"&gt;Setup guide →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;There's a widely held assumption in the AI infrastructure world: put your MCP server in a Docker container, and you've sandboxed it. You haven't.&lt;/p&gt;

&lt;p&gt;Docker is an excellent tool for packaging and deploying software. It is not a security boundary. If you're running untrusted MCP server code in Docker and calling it sandboxed, you've accepted a risk you may not fully understand.&lt;/p&gt;

&lt;p&gt;This post explains what Docker actually provides, what it doesn't, and why kernel-level isolation via &lt;a href="https://gvisor.dev/" rel="noopener noreferrer"&gt;gVisor&lt;/a&gt; changes the threat model for MCP server hosting.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Docker Actually Provides
&lt;/h2&gt;

&lt;p&gt;Docker uses two Linux kernel features to isolate containers: &lt;strong&gt;namespaces&lt;/strong&gt; and &lt;strong&gt;cgroups&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Namespaces give each container its own view of the system — its own process tree, network stack, filesystem mount points, and hostname. From inside the container, it looks like a separate machine.&lt;/p&gt;

&lt;p&gt;Cgroups limit how much CPU, memory, and I/O a container can consume. They prevent resource exhaustion but don't restrict what the container can &lt;em&gt;do&lt;/em&gt; at the kernel level.&lt;/p&gt;

&lt;p&gt;What both of these have in common: &lt;strong&gt;they are provided by the host kernel, and the container communicates with that same host kernel directly.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every system call a process makes inside a Docker container — &lt;code&gt;open()&lt;/code&gt;, &lt;code&gt;mmap()&lt;/code&gt;, &lt;code&gt;ptrace()&lt;/code&gt;, &lt;code&gt;socket()&lt;/code&gt; — goes to the real host kernel. There is no intermediary. There is no second kernel. The container's process talks directly to the kernel that runs your entire host.&lt;/p&gt;

&lt;p&gt;The diagram above shows this split clearly. On the left, a Docker container with standard namespace isolation: the syscall surface is wide open to the host kernel. On the right, gVisor's model: a user-space kernel sits between the container and the host.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why That Matters for MCP Servers
&lt;/h2&gt;

&lt;p&gt;An MCP server is code — often written by a third party, sometimes uploaded by a user, always executing with some degree of privilege. In the MCP ecosystem today, plugin authors range from major vendors to individual developers with varying security practices. Supply chain attacks targeting developer tooling are documented and increasing.&lt;/p&gt;

&lt;p&gt;If an MCP server process is compromised — whether through a known CVE in a dependency, a supply chain attack, or malicious code that was always malicious — the attacker controls a process that has direct access to your host kernel.&lt;/p&gt;

&lt;p&gt;The host kernel exposes over 300 system calls to a standard Docker container. A sufficiently crafted kernel exploit targeting any one of those can break out of the container entirely, gaining access to the host system. Container escapes via kernel vulnerabilities are not theoretical: &lt;a href="https://unit42.paloaltonetworks.com/cve-2022-0492-cgroups/" rel="noopener noreferrer"&gt;CVE-2022-0492&lt;/a&gt; (Palo Alto Networks Unit 42, 2022) and &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2024-1086" rel="noopener noreferrer"&gt;CVE-2024-1086&lt;/a&gt; (NIST NVD, 2024) are recent examples of Linux kernel vulnerabilities that could be exploited to escape container isolation. A kernel exploit that reaches the real host kernel doesn't stay in one container.&lt;/p&gt;




&lt;h2&gt;
  
  
  What gVisor Does Differently
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://gvisor.dev/" rel="noopener noreferrer"&gt;gVisor&lt;/a&gt; is an open-source project released by Google in 2018. It provides a &lt;strong&gt;user-space kernel&lt;/strong&gt; called the Sentry.&lt;/p&gt;

&lt;p&gt;When a containerised process makes a system call, instead of that call going to the host kernel, it goes to the Sentry — a Go process running in user space that implements the Linux system call interface. The Sentry handles the call, enforces its own security policy, and only reaches the real host kernel for a small set of operations it can't avoid.&lt;/p&gt;

&lt;p&gt;That small set is approximately 20 host syscalls, made under a tight seccomp filter; host filesystem access is delegated to a separate process called the Gofer. Standard Docker exposes 300+.&lt;/p&gt;

&lt;p&gt;The reduction looks like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Isolation model&lt;/th&gt;
&lt;th&gt;Host syscalls exposed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Standard Docker container&lt;/td&gt;
&lt;td&gt;300+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Docker + seccomp filter (restrictive)&lt;/td&gt;
&lt;td&gt;~50–100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gVisor (runsc)&lt;/td&gt;
&lt;td&gt;~20&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Even with a restrictive seccomp profile, the attack surface remains substantially larger than gVisor's model. With no seccomp policy — which is the default for most Docker deployments — a container has access to the full kernel interface.&lt;/p&gt;

&lt;p&gt;Google has documented using gVisor in production for running untrusted workloads. Their &lt;a href="https://gvisor.dev/docs/architecture_guide/security/" rel="noopener noreferrer"&gt;security overview&lt;/a&gt; covers the threat model in detail.&lt;/p&gt;




&lt;h2&gt;
  
  
  The MCP Threat Model With and Without gVisor
&lt;/h2&gt;

&lt;p&gt;Consider a concrete scenario: an MCP server has a dependency with a known remote code execution CVE. An attacker exploits it and achieves arbitrary code execution inside the container.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With standard Docker:&lt;/strong&gt;&lt;br&gt;
The attacker's code runs inside a container that shares the host kernel. They can attempt to use kernel vulnerabilities, write to &lt;code&gt;/proc&lt;/code&gt; entries that affect host state, or exploit misconfigurations in the namespace setup. If successful, they're on the host — with access to every other container, the filesystem, network, and credentials in the environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With gVisor:&lt;/strong&gt;&lt;br&gt;
The attacker's code runs inside a container whose kernel is the Sentry. Any attempt to use a host kernel vulnerability is blocked at the Sentry boundary — the host kernel syscalls the Sentry uses are a narrow, controlled set that doesn't expose the broader attack surface. The blast radius stays inside the container.&lt;/p&gt;

&lt;p&gt;The Sentry is not invulnerable. A vulnerability in the Sentry itself could, in principle, be exploited. But the Sentry is a substantially smaller and more auditable codebase than the Linux kernel, and its design deliberately limits what it can do at the host level.&lt;/p&gt;




&lt;h2&gt;
  
  
  Defence in Depth: gVisor Is One Layer
&lt;/h2&gt;

&lt;p&gt;Kernel isolation addresses one part of the threat model. It doesn't address everything.&lt;/p&gt;

&lt;p&gt;A compromised MCP server that can't exploit the host kernel can still attempt to exfiltrate data over the network, abuse the credentials injected into its environment, or send malicious responses back to the agent calling it.&lt;/p&gt;

&lt;p&gt;That's why mistaike.ai's hosted MCP sandbox combines multiple independent layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;gVisor (runsc runtime)&lt;/strong&gt; — limits kernel syscall surface to ~20 host calls. Kernel exploit attempts hit the Sentry, not your host.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Default-deny egress via Envoy&lt;/strong&gt; — each MCP server declares the external domains it requires (max 10, FQDNs only, no wildcards). All other outbound traffic is blocked. An attacker that achieves code execution can't beacon out to arbitrary infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bidirectional DLP&lt;/strong&gt; — every tool call and every response is scanned before it moves. Credentials and PII are caught in both directions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ephemeral containers&lt;/strong&gt; — five minutes idle, the container is destroyed and replaced with a fresh instance. No persistent foothold.&lt;/p&gt;

&lt;p&gt;These layers are independent. gVisor limits what the process can do at the kernel level. Envoy limits what it can reach on the network. Neither replaces the other. Both are necessary.&lt;/p&gt;

&lt;p&gt;The diagram above shows how these sit relative to each other: gVisor between the container and the host kernel, Envoy on the network path out.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters Now
&lt;/h2&gt;

&lt;p&gt;The MCP ecosystem is growing faster than its security practices. Developers are connecting agents to community-built servers, user-uploaded code, and rapidly iterated plugins. The assumption that "Docker is good enough" is widespread and understandable — Docker is genuinely excellent for most use cases.&lt;/p&gt;

&lt;p&gt;But MCP server hosting is a specific threat model: you are running code written by someone you don't fully control, on infrastructure you do control, processing data that may be sensitive. That threat model benefits from kernel-level isolation in a way that generic containerised web services don't.&lt;/p&gt;

&lt;p&gt;Running a blog on Docker? Namespace isolation is fine. Running a third-party MCP server that processes your users' financial data? The kernel attack surface is worth reducing.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Use gVisor
&lt;/h2&gt;

&lt;p&gt;gVisor is open-source and available at &lt;a href="https://gvisor.dev/" rel="noopener noreferrer"&gt;gvisor.dev&lt;/a&gt;. The &lt;code&gt;runsc&lt;/code&gt; runtime can be configured as a Docker runtime with a few lines of configuration.&lt;/p&gt;
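&lt;p&gt;For reference, the setup follows gVisor's documented Docker integration: register a &lt;code&gt;runsc&lt;/code&gt; entry in &lt;code&gt;/etc/docker/daemon.json&lt;/code&gt; pointing at the installed binary (the path shown is a common default; adjust for your host):&lt;/p&gt;

```json
{
  "runtimes": {
    "runsc": {
      "path": "/usr/local/bin/runsc"
    }
  }
}
```

&lt;p&gt;After restarting the Docker daemon, any container started with &lt;code&gt;docker run --runtime=runsc ...&lt;/code&gt; runs under the Sentry instead of directly against the host kernel.&lt;/p&gt;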

&lt;p&gt;If you want MCP servers running under gVisor without configuring it yourself, mistaike.ai's hosted sandbox does this by default. Every server you upload runs under &lt;code&gt;runsc&lt;/code&gt;, with default-deny egress, DLP on every call, and ephemeral containers. You get the isolation model without managing the infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://mistaike.ai/guides/sandbox-mcp-servers" rel="noopener noreferrer"&gt;Read the setup guide →&lt;/a&gt; | &lt;a href="https://mistaike.ai/security/sandbox" rel="noopener noreferrer"&gt;See the full security architecture →&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Docker namespaces and cgroups provide process isolation, not kernel isolation&lt;/li&gt;
&lt;li&gt;Container processes communicate directly with the host kernel — 300+ syscalls are reachable&lt;/li&gt;
&lt;li&gt;A kernel exploit from inside a standard Docker container reaches the real host kernel&lt;/li&gt;
&lt;li&gt;gVisor interposes a user-space kernel (the Sentry) between the container and the host&lt;/li&gt;
&lt;li&gt;The Sentry limits host kernel exposure to ~20 syscalls, with host filesystem access mediated by the Gofer process&lt;/li&gt;
&lt;li&gt;For MCP servers running untrusted code, this substantially reduces the blast radius of a kernel-level attack&lt;/li&gt;
&lt;li&gt;Kernel isolation is one layer — it works alongside network controls and DLP, not instead of them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://mistaike.ai/auth/register" rel="noopener noreferrer"&gt;Start for free →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://mistaike.ai/guides/sandbox-mcp-servers" rel="noopener noreferrer"&gt;Read the setup guide →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://mistaike.ai/security/sandbox" rel="noopener noreferrer"&gt;See the full security architecture →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Nick Stocks is the founder of mistaike.ai.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/docker-not-a-sandbox" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>mcp</category>
      <category>sandbox</category>
      <category>gvisor</category>
    </item>
    <item>
      <title>Axios Has 100 Million Weekly Downloads. North Korea Backdoored It in 39 Minutes.</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Thu, 02 Apr 2026 09:17:36 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/axios-has-100-million-weekly-downloads-north-korea-backdoored-it-in-39-minutes-30e5</link>
      <guid>https://dev.to/mistaike_ai/axios-has-100-million-weekly-downloads-north-korea-backdoored-it-in-39-minutes-30e5</guid>
      <description>&lt;p&gt;Yesterday — March 31, 2026 — a North Korea-linked threat actor hijacked the npm account of an Axios maintainer and published two backdoored versions of the most widely used HTTP client in the JavaScript ecosystem.&lt;/p&gt;

&lt;p&gt;Axios has over 100 million weekly downloads. It sits underneath LangChain, OpenAI's SDK, dozens of MCP clients, and virtually every Node.js application that makes an HTTP request. If you're running AI agents in production, your dependency tree almost certainly includes it — even if you never installed it directly.&lt;/p&gt;

&lt;p&gt;The malicious versions were live for approximately three hours before detection and removal. In that window, every &lt;code&gt;npm install&lt;/code&gt; that resolved to &lt;code&gt;axios@1.14.1&lt;/code&gt; or &lt;code&gt;axios@0.30.4&lt;/code&gt; silently installed a cross-platform remote access trojan.&lt;/p&gt;

&lt;p&gt;This is not a theoretical risk. This is what happened yesterday.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Happened
&lt;/h2&gt;

&lt;p&gt;At 00:21 UTC on March 31, an attacker published &lt;code&gt;axios@1.14.1&lt;/code&gt; using a compromised maintainer account (&lt;code&gt;jasonsaayman&lt;/code&gt;). Thirty-nine minutes later, they published &lt;code&gt;axios@0.30.4&lt;/code&gt; — targeting both the current and legacy version lines simultaneously.&lt;/p&gt;

&lt;p&gt;Both versions introduced a single new dependency: &lt;code&gt;plain-crypto-js@4.2.1&lt;/code&gt;. This purpose-built package contained a &lt;code&gt;postinstall&lt;/code&gt; hook that downloaded and executed platform-specific stage-2 implants from &lt;code&gt;sfrclak[.]com:8000&lt;/code&gt;.&lt;/p&gt;
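&lt;p&gt;The delivery mechanism is ordinary npm behaviour, not an exploit: any package can declare lifecycle scripts, and npm runs &lt;code&gt;postinstall&lt;/code&gt; automatically when the package is installed. A generic illustration of the shape (not the actual &lt;code&gt;plain-crypto-js&lt;/code&gt; manifest):&lt;/p&gt;

```json
{
  "name": "some-utility",
  "version": "1.0.0",
  "scripts": {
    "postinstall": "node setup.js"
  }
}
```

&lt;p&gt;Installing with &lt;code&gt;npm install --ignore-scripts&lt;/code&gt; (or setting &lt;code&gt;ignore-scripts=true&lt;/code&gt; in &lt;code&gt;.npmrc&lt;/code&gt;) disables these hooks entirely, at the cost of breaking packages that legitimately depend on them.&lt;/p&gt;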

&lt;p&gt;According to &lt;a href="https://www.aikido.dev/blog/axios-npm-compromised-maintainer-hijacked-rat" rel="noopener noreferrer"&gt;Aikido's analysis&lt;/a&gt;, the attacker deployed three parallel RAT implementations — one for Windows, one for macOS, one for Linux — all sharing an identical C2 protocol and beacon behavior.&lt;/p&gt;

&lt;p&gt;The attack was detected and the packages were removed from npm approximately two to three hours later.&lt;/p&gt;




&lt;h2&gt;
  
  
  Attribution
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package" rel="noopener noreferrer"&gt;Google's Threat Intelligence Group (GTIG)&lt;/a&gt; attributes this attack to UNC1069, a financially motivated North Korean threat actor active since at least 2018. The attribution is based on the use of WAVESHAPER.V2, an updated variant of malware previously deployed by this group.&lt;/p&gt;

&lt;p&gt;UNC1069 targets cryptocurrency platforms, fintech companies, and SaaS providers. &lt;a href="https://www.sans.org/blog/axios-npm-supply-chain-compromise-malicious-packages-remote-access-trojan" rel="noopener noreferrer"&gt;SANS confirms&lt;/a&gt; the RAT's credential harvesting behavior: it swept environment variables, &lt;code&gt;.env&lt;/code&gt; files, SSH keys, cloud provider credentials, and API tokens from compromised systems.&lt;/p&gt;

&lt;p&gt;This is a state-backed operation targeting the foundational dependency layer of the JavaScript ecosystem.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This One Is Different
&lt;/h2&gt;

&lt;p&gt;We've written about supply chain attacks &lt;a href="https://dev.to/blog/teampcp-supply-chain-cascade"&gt;three times in the past two weeks&lt;/a&gt;. Each of those targeted AI-specific infrastructure — Trivy, LiteLLM, LangChain, Telnyx. Tools that sit in the AI layer of the stack.&lt;/p&gt;

&lt;p&gt;Axios is not in the AI layer. Axios is underneath the AI layer.&lt;/p&gt;

&lt;p&gt;Every AI agent framework that makes HTTP calls depends on a library like Axios. Every MCP client that connects to a remote server sends its requests through an HTTP library. Every workflow automation tool — n8n, Zapier integrations, custom agent orchestrators — uses HTTP to talk to the world.&lt;/p&gt;

&lt;p&gt;When TeamPCP &lt;a href="https://dev.to/blog/litellm-langflow-same-week"&gt;compromised LiteLLM&lt;/a&gt;, they got access to LLM API keys. When they &lt;a href="https://dev.to/blog/your-security-scanner-just-got-hacked"&gt;hit Trivy&lt;/a&gt;, they got CI/CD secrets. Both attacks were severe, but both had a defined blast radius: organisations using those specific tools.&lt;/p&gt;

&lt;p&gt;When someone compromises Axios, the blast radius is the entire JavaScript ecosystem.&lt;/p&gt;

&lt;p&gt;Here's the uncomfortable math. The TeamPCP campaign has now demonstrated a clear pattern: compromise a popular package, harvest the credentials from everyone who installs it, use those credentials to compromise the next package. Each victim becomes the vector for the next attack.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.elastic.co/security-labs/axios-one-rat-to-rule-them-all" rel="noopener noreferrer"&gt;Elastic Security Labs&lt;/a&gt; documented the full attack chain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Compromised maintainer account → published malicious package&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;postinstall&lt;/code&gt; hook downloads platform-specific RAT&lt;/li&gt;
&lt;li&gt;RAT sweeps credentials (env vars, &lt;code&gt;.env&lt;/code&gt; files, SSH keys, cloud tokens)&lt;/li&gt;
&lt;li&gt;Stolen credentials enable lateral movement to other packages and services&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the Axios RAT harvested credentials during its three-hour window, we should expect those credentials to appear in follow-on attacks in the coming days and weeks — just as the Trivy credentials fueled the Checkmarx, LiteLLM, and Telnyx compromises.&lt;/p&gt;




&lt;h2&gt;
  
  
  The AI Agent Angle
&lt;/h2&gt;

&lt;p&gt;This matters more for AI agent deployments than for typical web applications, for two reasons.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First: AI agents aggregate credentials.&lt;/strong&gt; A typical web server might have a database URL and an API key or two. An AI agent orchestrator — or an MCP hub — might have credentials for a dozen different services: LLM providers, vector databases, code repositories, Slack, email, CRM systems, internal APIs. A credential harvester on an AI agent host has a target-rich environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second: AI agent supply chains are deeper than you think.&lt;/strong&gt; When you install an MCP server or an agent framework, you're pulling in hundreds of transitive dependencies. Most teams audit their direct dependencies. Almost nobody audits the full tree. Axios appears as a transitive dependency in packages that don't mention HTTP in their descriptions.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;npm ls axios&lt;/code&gt; in any Node.js AI project. Count the paths. That's your exposure surface.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Should Do Right Now
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Check if you installed the compromised versions.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check your lockfile for the malicious versions&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"axios@1.14.1&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;axios@0.30.4"&lt;/span&gt; package-lock.json yarn.lock pnpm-lock.yaml 2&amp;gt;/dev/null
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Safe versions: &lt;code&gt;axios@1.14.0&lt;/code&gt; (last legitimate 1.x release with SLSA provenance) and &lt;code&gt;axios@0.30.3&lt;/code&gt; (last legitimate 0.30.x release). If you see &lt;code&gt;1.14.1&lt;/code&gt; or &lt;code&gt;0.30.4&lt;/code&gt; in any lockfile, assume compromise and rotate all credentials accessible from that environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Pin your dependencies.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're still using caret ranges (&lt;code&gt;^1.14.0&lt;/code&gt;) for critical packages, this is the incident that should change that. Use exact versions or lockfile integrity checks. &lt;a href="https://arcticwolf.com/resources/blog/supply-chain-attack-impacts-widely-used-axios-npm-package/" rel="noopener noreferrer"&gt;Arctic Wolf's advisory&lt;/a&gt; recommends enabling npm's &lt;code&gt;--ignore-scripts&lt;/code&gt; flag in CI to prevent &lt;code&gt;postinstall&lt;/code&gt; hooks from executing automatically.&lt;/p&gt;
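&lt;p&gt;A minimal sketch of both mitigations, assuming an npm project (the directory is illustrative): &lt;code&gt;--save-exact&lt;/code&gt; records an exact version instead of a caret range, and an &lt;code&gt;ignore-scripts=true&lt;/code&gt; line in the project &lt;code&gt;.npmrc&lt;/code&gt; makes every local install behave like the &lt;code&gt;--ignore-scripts&lt;/code&gt; CI flag:&lt;/p&gt;

```shell
# Illustrative project directory
mkdir -p /tmp/pin-demo
cd /tmp/pin-demo

# Exact pins are written with: npm install --save-exact axios@1.14.0
# (records "1.14.0" in package.json instead of "^1.14.0")

# Project-level .npmrc entry: installs skip lifecycle hooks by default
echo "ignore-scripts=true" > .npmrc
cat .npmrc
```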

&lt;p&gt;&lt;strong&gt;3. Audit transitive dependencies, not just direct ones.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;ls &lt;/span&gt;axios          &lt;span class="c"&gt;# Show all paths to axios in your tree&lt;/span&gt;
npm audit signatures  &lt;span class="c"&gt;# Verify package provenance&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Treat credential rotation as mandatory after any supply chain incident in your dependency tree.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not "if you think you're affected." If the compromised package was anywhere in your resolved dependency graph during the attack window, rotate everything: API keys, cloud credentials, SSH keys, database passwords.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pattern
&lt;/h2&gt;

&lt;p&gt;Trivy. Checkmarx. LiteLLM. Telnyx. LangChain. Now Axios.&lt;/p&gt;

&lt;p&gt;Six major supply chain incidents in the JavaScript and Python ecosystems in the span of two weeks. Three of them are attributed to the same North Korean threat actor group. The attacks are accelerating, the targets are getting more foundational, and the credential-chaining technique means each compromise funds the next.&lt;/p&gt;

&lt;p&gt;The supply chain problem in AI infrastructure is not getting better. It is getting worse, faster, because every AI agent deployment expands the attack surface — more dependencies, more credentials, more integration points.&lt;/p&gt;

&lt;p&gt;There is no single fix. But there are practices that reduce your exposure: pinned dependencies, lockfile integrity, automated vulnerability scanning, minimal credential scoping, and — critically — runtime monitoring of what your AI agents actually send over the wire. Because when the HTTP library itself is compromised, the only thing standing between your credentials and an attacker's C2 server is whether something is watching the traffic.&lt;/p&gt;

&lt;p&gt;Three hours. That's how long the Axios backdoor was live. Long enough.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/axios-npm-supply-chain" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>supplychain</category>
      <category>npm</category>
      <category>aiinfrastructure</category>
    </item>
    <item>
      <title>What You're Installing When You Add an MCP Server</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Thu, 02 Apr 2026 03:33:12 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/what-youre-installing-when-you-add-an-mcp-server-11ij</link>
      <guid>https://dev.to/mistaike_ai/what-youre-installing-when-you-add-an-mcp-server-11ij</guid>
      <description>&lt;p&gt;There's a simple question most MCP users can't answer before installing a server:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What am I actually installing?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When you add an MCP server to your agent, you're not just adding a tool. You're inheriting its code, its dependencies, and its behaviour. In many cases that includes a large and often opaque dependency tree, along with whatever known vulnerabilities exist within it.&lt;/p&gt;

&lt;p&gt;To better understand this, we ran a large-scale analysis of MCP servers drawn from public registries. This post covers the first two phases of that work: inventory and dependency risk.&lt;/p&gt;

&lt;p&gt;We're also publishing the results as a public API so anyone can query the data directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Public API:&lt;/strong&gt; &lt;a href="https://mistaike.ai/cve-registry/" rel="noopener noreferrer"&gt;mistaike.ai/cve-registry&lt;/a&gt; — no API key required.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 1 — Inventory
&lt;/h2&gt;

&lt;p&gt;We began by collecting MCP servers from public registry sources and normalising them into a single dataset.&lt;/p&gt;

&lt;p&gt;Across sources, this produced a working indexed dataset of over 25,000 distinct MCP implementations drawn from two registries. The goal of Phase 1 was coverage, not judgement: &lt;em&gt;what exists in the ecosystem?&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 2 — Dependency and CVE Scanning
&lt;/h2&gt;

&lt;p&gt;We then analysed repositories and dependency graphs to identify known vulnerability exposure.&lt;/p&gt;

&lt;p&gt;For each server, we:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;enumerated dependencies&lt;/li&gt;
&lt;li&gt;mapped them to known CVEs and advisories&lt;/li&gt;
&lt;li&gt;tracked counts and worst-case severity&lt;/li&gt;
&lt;li&gt;recorded package footprint where available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output is a server-level view of dependency risk — something that doesn't exist in standard vulnerability databases, which index packages, not deployable tools. The live index currently covers over 6,000 servers.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Index Shows
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://mistaike.ai/cve-registry/" rel="noopener noreferrer"&gt;CVE registry&lt;/a&gt; returns entries sorted by CVE count by default. Some examples from the upper end of the distribution, described by category:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A multi-agent framework for orchestrating AI pipelines — 103 known vulnerabilities in its dependency tree, 4 rated critical. The high count is largely attributable to transitive dependencies from pulling in a broad AI ecosystem. This server is well-maintained; the vulnerabilities are in its supply chain, not its own code.&lt;/li&gt;
&lt;li&gt;A console automation server that exposes shell command execution to agents — 65 known vulnerabilities, worst severity critical.&lt;/li&gt;
&lt;li&gt;A developer CLI management server — 47 known vulnerabilities, worst severity critical.&lt;/li&gt;
&lt;li&gt;An infrastructure configuration server used for network operations — 46 known vulnerabilities, worst severity critical.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These examples come from the first page of results. The full index is queryable at &lt;a href="https://mistaike.ai/cve-registry/" rel="noopener noreferrer"&gt;mistaike.ai/cve-registry&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Stood Out
&lt;/h2&gt;

&lt;p&gt;Several patterns emerged from the data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dependency risk is widespread.&lt;/strong&gt; A meaningful portion of MCP servers carry known vulnerabilities through their dependency trees. In some cases, individual servers accumulate dozens or more CVEs — often through transitive dependencies rather than direct code. The multi-agent framework example above is a good illustration: the author's code isn't the problem; the problem is what it depends on, and what those dependencies depend on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Severity alone isn't a sufficient signal.&lt;/strong&gt; Some servers with very high CVE counts have only low-severity issues. Others with fewer total CVEs include critical-severity packages. Both dimensions matter when assessing risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dependency sprawl is common.&lt;/strong&gt; Many MCP servers pull in large numbers of packages, increasing both attack surface and maintenance burden. Combined with unpinned dependencies — which resolve to the latest version at install time — this creates non-deterministic builds and makes remediation harder.&lt;/p&gt;

&lt;p&gt;None of these patterns are unique to MCP. They reflect broader software supply chain issues. What makes MCP different is where these servers run: often locally, often with access to files, tokens, APIs, and developer workflows.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Publish This as an API
&lt;/h2&gt;

&lt;p&gt;Public CVE databases exist, but they don't map cleanly to deployable units like MCP servers.&lt;/p&gt;

&lt;p&gt;If you're deciding whether to install an MCP server, the question isn't &lt;em&gt;which CVEs exist in the ecosystem?&lt;/em&gt; It's &lt;em&gt;what known vulnerability exposure am I inheriting if I run this server?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://mistaike.ai/cve-registry/" rel="noopener noreferrer"&gt;public CVE registry&lt;/a&gt; is designed to answer that question directly.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Use It
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://mistaike.ai/cve-registry/" rel="noopener noreferrer"&gt;CVE registry&lt;/a&gt; supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;search by name or repository (&lt;code&gt;?search=&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;filter by severity (&lt;code&gt;?severity=critical|high|medium|none&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;sort by CVE count, severity, or recency (&lt;code&gt;?sort=&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;pagination (&lt;code&gt;?page=&lt;/code&gt;, &lt;code&gt;?page_size=&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;check a server before adding it to your agent configuration&lt;/li&gt;
&lt;li&gt;integrate into CI/CD to flag high-risk servers before deployment&lt;/li&gt;
&lt;li&gt;build dashboards tracking ecosystem risk over time&lt;/li&gt;
&lt;li&gt;prioritise manual review of servers with critical-severity exposure&lt;/li&gt;
&lt;/ul&gt;
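&lt;p&gt;Putting those parameters together (the endpoint is the one above; the search term is a made-up example, and the response schema isn't shown here):&lt;/p&gt;

```shell
BASE="https://mistaike.ai/cve-registry/"

# Only servers with critical-severity exposure
echo "${BASE}?severity=critical"

# Look up a specific server by name or repository
echo "${BASE}?search=filesystem"

# Fetch any of these with: curl -s "URL" (no API key required)
```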




&lt;h2&gt;
  
  
  How This Fits the Broader Research
&lt;/h2&gt;

&lt;p&gt;The CVE index is Phase 2 of a larger analysis pipeline.&lt;/p&gt;

&lt;p&gt;Later phases focus on runtime behaviour: what MCP servers actually do when executed, what network connections they make, and whether that behaviour aligns with user expectations and documentation.&lt;/p&gt;

&lt;p&gt;Initial work on a deeper behavioural analysis phase examined a subset of servers at runtime. The results were largely reassuring: 86% of servers examined showed no concerning behaviour beyond their documented purpose. A small number showed behaviours worth investigating further. The five most significant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Undisclosed telemetry on a local execution tool.&lt;/strong&gt; A server designed for local desktop automation — file operations, terminal access — silently calls Google Analytics and a first-party telemetry endpoint on every tool invocation. The server runs locally; the tracking does not.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;User queries sent over plain HTTP to a bare IP.&lt;/strong&gt; A domain research tool routes user input — including project descriptions, keywords, and repository context — over unencrypted HTTP to a server identified only by a raw IP address running an LLM. No transport security. Not mentioned in documentation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Steganographic Unicode watermarking.&lt;/strong&gt; A server embeds invisible Unicode characters into every response it produces. The characters encode a persistent machine identifier that travels with the output wherever it goes — into the AI's context, into logs, into any downstream system. Undisclosed, not opt-in, not visible.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Query logging and AI platform profiling.&lt;/strong&gt; A search-oriented server stores every query verbatim in a server-side database against the user's API key, building a 90-day history. It also inspects environment variables at startup to identify which AI client is in use and embeds this in outbound requests. The README describes two tools; the server exposes nineteen.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unredacted user inputs forwarded to third-party analytics.&lt;/strong&gt; A blockchain data server sends the full contents of each tool call — including user-supplied arguments — to a third-party analytics platform on every invocation, along with the calling client's name and version.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These findings are being manually validated and shared with the relevant maintainers before full publication. All of the servers involved are listed across the major MCP registries — the official MCP directory, Glama, and Smithery. These are not obscure or fringe listings — they are servers a developer would encounter through normal discovery.&lt;/p&gt;

&lt;p&gt;A full behavioural analysis post will follow that process.&lt;/p&gt;




&lt;h2&gt;
  
  
  Important Caveats
&lt;/h2&gt;

&lt;p&gt;This dataset should be treated as a signal, not a verdict.&lt;/p&gt;

&lt;p&gt;A CVE doesn't necessarily mean a vulnerability is exploitable in your environment. Some issues exist in unused code paths, or may already be mitigated in practice. Dependency graphs can both overstate and understate real-world risk.&lt;/p&gt;

&lt;p&gt;At the same time, known vulnerabilities are relevant. They indicate maintenance posture, upgrade cadence, and potential exposure that's worth understanding before deployment.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Missing Today
&lt;/h2&gt;

&lt;p&gt;One structural gap became clear during this work: there is currently no standard way for MCP servers to declare external network dependencies, telemetry behaviour, data categories transmitted, or whether such behaviour is optional.&lt;/p&gt;

&lt;p&gt;As a result, users often rely on documentation, source code review, or trust alone to understand what a server does. That doesn't scale as the ecosystem grows.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The MCP ecosystem is growing quickly. That growth brings a need for better visibility into what is being installed and executed.&lt;/p&gt;

&lt;p&gt;This work focuses on the first layer of that visibility: dependency risk. By publishing a public CVE index mapped directly to MCP servers, the aim is to make it easier for developers and organisations to understand what they're adopting before they run it.&lt;/p&gt;

&lt;p&gt;This is not a claim that the ecosystem is unsafe. Many servers show minimal or manageable exposure. But every MCP server brings its own supply chain. Understanding that supply chain is a necessary first step.&lt;/p&gt;




&lt;h2&gt;
  
  
  About This Data
&lt;/h2&gt;

&lt;p&gt;This API includes publicly known vulnerability data only. It does not include behavioural analysis, telemetry findings, or unverified security concerns, which are handled separately and may be subject to validation and responsible disclosure processes.&lt;/p&gt;

&lt;p&gt;This data is intended to support risk awareness and prioritisation, not to label projects as insecure or malicious. Users should review context, validate findings, and consider their own threat model before making decisions.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;CVE Registry:&lt;/strong&gt; &lt;a href="https://mistaike.ai/cve-registry/" rel="noopener noreferrer"&gt;mistaike.ai/cve-registry&lt;/a&gt; — no API key required.&lt;/p&gt;

&lt;p&gt;Feedback and questions welcome.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/mcp-cve-scan-findings" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>mcp</category>
      <category>research</category>
      <category>supplychain</category>
    </item>
    <item>
      <title>One Stolen Token. Five Ecosystems. The TeamPCP Supply Chain Attack Is Still Spreading.</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Sun, 29 Mar 2026 18:42:18 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/one-stolen-token-five-ecosystems-the-teampcp-supply-chain-attack-is-still-spreading-2pka</link>
      <guid>https://dev.to/mistaike_ai/one-stolen-token-five-ecosystems-the-teampcp-supply-chain-attack-is-still-spreading-2pka</guid>
      <description>&lt;p&gt;On March 19, a threat actor group called TeamPCP used a compromised GitHub service account to force-push malicious code to 76 of 77 version tags for Trivy — one of the most widely-used security scanners in CI/CD pipelines.&lt;/p&gt;

&lt;p&gt;Ten days later, five ecosystems are compromised — GitHub Actions, Docker Hub, npm, Open VSX, and PyPI — and the attack is still expanding.&lt;/p&gt;

&lt;p&gt;What makes this different from a typical supply chain incident is the mechanism: each compromise harvests credentials that fuel the next one. It's not a single point of failure. It's a self-propagating chain where every breached package becomes both victim and vector.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/blog/litellm-langflow-same-week"&gt;Two days ago we covered the LiteLLM incident in isolation&lt;/a&gt;. Since then, the picture has changed. This is a coordinated campaign by a single actor, and its scope is significantly wider than any individual incident suggested.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Cascade
&lt;/h2&gt;

&lt;p&gt;Here is the timeline, based on analysis from &lt;a href="https://securitylabs.datadoghq.com/articles/litellm-compromised-pypi-teampcp-supply-chain-campaign/" rel="noopener noreferrer"&gt;Datadog Security Labs&lt;/a&gt;, &lt;a href="https://www.sans.org/blog/when-security-scanner-became-weapon-inside-teampcp-supply-chain-campaign" rel="noopener noreferrer"&gt;SANS&lt;/a&gt;, &lt;a href="https://arcticwolf.com/resources/blog/teampcp-supply-chain-attack-campaign-targets-trivy-checkmarx-kics-and-litellm-potential-downstream-impact-to-additional-projects/" rel="noopener noreferrer"&gt;Arctic Wolf&lt;/a&gt;, and &lt;a href="https://www.endorlabs.com/learn/teampcp-strikes-again-telnyx-compromised-three-days-after-litellm" rel="noopener noreferrer"&gt;Endor Labs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;March 19 — Trivy (GitHub Actions, Docker Hub)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TeamPCP had compromised Aqua Security's &lt;code&gt;aqua-bot&lt;/code&gt; service account at an unknown earlier date. On March 19, they used it to force-push malicious code to 75 of 76 version tags in &lt;code&gt;aquasecurity/trivy-action&lt;/code&gt; and all 7 tags in &lt;code&gt;aquasecurity/setup-trivy&lt;/code&gt;. Any CI/CD pipeline that ran &lt;code&gt;trivy-action@v1&lt;/code&gt; or any non-pinned tag after this point executed the attacker's code instead of the real scanner.&lt;/p&gt;

&lt;p&gt;The payload was a three-stage credential stealer. It swept environment variables, &lt;code&gt;.env&lt;/code&gt; files, shell histories, SSH keys, cloud provider credentials, and Kubernetes tokens. Stolen data was encrypted with AES-256-CBC + RSA-4096 (OAEP padding) and exfiltrated as &lt;code&gt;tpcp.tar.gz&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;March 23 — Checkmarx KICS (GitHub Actions, npm, Open VSX)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using CI/CD secrets harvested from the Trivy compromise, TeamPCP pivoted to Checkmarx. They compromised two GitHub Actions repositories — &lt;code&gt;ast-github-action&lt;/code&gt; and &lt;code&gt;kics-github-action&lt;/code&gt; — along with related npm packages and Open VSX extensions. &lt;a href="https://www.sysdig.com/blog/teampcp-expands-supply-chain-compromise-spreads-from-trivy-to-checkmarx-github-actions" rel="noopener noreferrer"&gt;Sysdig documented&lt;/a&gt; that over 66 npm packages were poisoned through a self-propagating worm component they call CanisterWorm. Stolen npm tokens from compromised CI/CD environments were automatically weaponized to infect victim-maintained packages — creating new upstream compromises without attacker intervention.&lt;/p&gt;

&lt;p&gt;This is where the attack shifted from targeted to exponential.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;March 24 — LiteLLM (PyPI)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;LiteLLM's CI/CD pipeline ran unpinned Trivy. The compromised scanner injected malicious code into LiteLLM's build process. Versions 1.82.7 and 1.82.8 were pushed to PyPI through what appeared to be a legitimate maintainer account.&lt;/p&gt;

&lt;p&gt;LiteLLM processes roughly 95 million downloads per month and sits between applications and their LLM providers — OpenAI, Anthropic, AWS Bedrock, GCP Vertex AI. It has access to every API key configured in the deployment. The credential harvester did not need to be sophisticated. It was already in the room where the secrets live.&lt;/p&gt;

&lt;p&gt;Version 1.82.8 was the higher-risk package: it used a &lt;code&gt;.pth&lt;/code&gt; file that executes automatically when the Python interpreter starts, not just when the package is imported.&lt;/p&gt;

&lt;p&gt;The attack window was approximately 5.5 hours (10:39–16:00 UTC on March 24). &lt;a href="https://docs.litellm.ai/blog/security-update-march-2026" rel="noopener noreferrer"&gt;LiteLLM's post-incident report&lt;/a&gt; confirms they've engaged Mandiant for forensic analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;March 27 — Telnyx (PyPI)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Three days after LiteLLM, TeamPCP published malicious versions of the Telnyx Python SDK — versions 4.87.1 and 4.87.2. Telnyx averages 742,000 downloads per month.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.endorlabs.com/learn/teampcp-strikes-again-telnyx-compromised-three-days-after-litellm" rel="noopener noreferrer"&gt;Endor Labs attributes this&lt;/a&gt; directly to the LiteLLM compromise: TeamPCP's credential harvester swept environment variables, &lt;code&gt;.env&lt;/code&gt; files, and shell histories from every system that imported the poisoned LiteLLM packages. If any developer or CI pipeline had both LiteLLM installed and access to the Telnyx PyPI token, that token was already in TeamPCP's hands.&lt;/p&gt;

&lt;p&gt;The Telnyx compromise introduced a new technique: audio file steganography.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hiding Malware in a WAV File
&lt;/h2&gt;

&lt;p&gt;The malicious code was injected into &lt;code&gt;telnyx/_client.py&lt;/code&gt;, which runs at import time. No install hook. No postinstall script. Just &lt;code&gt;import telnyx&lt;/code&gt; and the payload executes.&lt;/p&gt;

&lt;p&gt;On execution, the malware downloads a file called &lt;code&gt;hangup.wav&lt;/code&gt; from a remote server. The file looks like a normal WAV audio file. It is not.&lt;/p&gt;

&lt;p&gt;According to &lt;a href="https://www.stepsecurity.io/blog/teampcp-plants-wav-steganography-credential-stealer-in-telnyx-pypi-package" rel="noopener noreferrer"&gt;StepSecurity's analysis&lt;/a&gt; and &lt;a href="https://safedep.io/malicious-telnyx-pypi-compromise/" rel="noopener noreferrer"&gt;SafeDep's technical breakdown&lt;/a&gt;, the WAV file contains an XOR-obfuscated executable packed into the audio frame bytes using Python's built-in &lt;code&gt;wave&lt;/code&gt; module:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The malware reads the audio frame data from the WAV file&lt;/li&gt;
&lt;li&gt;The first 8 bytes of the decoded blob are the XOR key&lt;/li&gt;
&lt;li&gt;The remaining bytes are XOR'd against that key in a repeating pattern&lt;/li&gt;
&lt;li&gt;The result is a credential-stealing executable&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;On Linux and macOS, it runs a credential harvester that encrypts stolen data with AES-256-CBC + RSA-4096 before exfiltrating it. On Windows, it drops a persistent executable disguised as &lt;code&gt;msbuild.exe&lt;/code&gt; into the Startup folder, with a 12-hour re-drop cooldown enforced by a hidden &lt;code&gt;.lock&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;The steganography serves two purposes. First, it bypasses static analysis tools that don't inspect audio files. Second, the payload is unreadable without the XOR key, which is embedded in the data itself rather than hardcoded anywhere in the package source.&lt;/p&gt;




&lt;h2&gt;
  
  
  The First Blockchain C2 in the Wild
&lt;/h2&gt;

&lt;p&gt;TeamPCP's command-and-control infrastructure is also unusual. According to SANS, the CanisterWorm component — the self-propagating npm worm — uses an Internet Computer Protocol (ICP) canister as a dead-drop C2. This is &lt;a href="https://www.sans.org/blog/when-security-scanner-became-weapon-inside-teampcp-supply-chain-campaign" rel="noopener noreferrer"&gt;the first documented abuse&lt;/a&gt; of decentralized blockchain infrastructure for supply chain C2.&lt;/p&gt;

&lt;p&gt;Traditional C2 takedown relies on domain seizure or hosting provider cooperation. A blockchain canister has no single point of takedown. The domain can't be seized because there is no domain. The hosting can't be pulled because the canister is replicated across a decentralized network.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Extortion Pivot
&lt;/h2&gt;

&lt;p&gt;As of March 25, &lt;a href="https://www.sans.org/blog/when-security-scanner-became-weapon-inside-teampcp-supply-chain-campaign" rel="noopener noreferrer"&gt;SANS reports&lt;/a&gt; that TeamPCP has pivoted from credential theft to active extortion. The group is reportedly working through approximately 300 GB of compressed stolen credentials and collaborating with LAPSUS$ to target multi-billion-dollar companies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.microsoft.com/en-us/security/blog/2026/03/24/detecting-investigating-defending-against-trivy-supply-chain-compromise/" rel="noopener noreferrer"&gt;Mandiant estimates&lt;/a&gt; that over 1,000 enterprise SaaS environments have been impacted, with the number expected to grow to 5,000–10,000 as the full downstream impact of the credential cascade becomes clearer.&lt;/p&gt;

&lt;p&gt;The campaign is not over. Each compromised package that harvested credentials created a pool of tokens that can be used to compromise the next target. The attack surface grows with every installation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Attack Worked
&lt;/h2&gt;

&lt;p&gt;Three structural factors made this cascade possible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unpinned dependencies in CI/CD.&lt;/strong&gt; LiteLLM's CI pipeline ran Trivy without pinning to a specific version or verifying checksums. When TeamPCP replaced the tag contents, LiteLLM's build pulled the malicious version automatically. Every pipeline that uses &lt;code&gt;@v1&lt;/code&gt; or &lt;code&gt;@latest&lt;/code&gt; tags for GitHub Actions has this exact exposure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CI/CD environments are credential-rich.&lt;/strong&gt; A typical CI/CD environment has access to package registry tokens, cloud provider credentials, database passwords, API keys, and deployment secrets. The credential harvester didn't need to know what it was looking for — it swept everything and let the attacker sort it out later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No ecosystem-level circuit breaker.&lt;/strong&gt; Once TeamPCP had valid PyPI tokens, they could publish new versions of any package those tokens had access to. PyPI has no mechanism to detect that a previously-legitimate maintainer's credentials are being used by an attacker. Neither does npm, Docker Hub, or the GitHub Actions marketplace. Each ecosystem trusts its own authentication, and none of them talk to each other.&lt;/p&gt;

&lt;p&gt;The result is a supply chain attack that propagates across ecosystem boundaries. Compromised GitHub Actions yield PyPI tokens. Compromised PyPI packages yield npm tokens. Compromised npm packages yield more GitHub tokens. The chain feeds itself.&lt;/p&gt;




&lt;h2&gt;
  
  
  What To Do
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you run Trivy in CI/CD:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check whether your pipelines used unpinned &lt;code&gt;trivy-action&lt;/code&gt; tags between March 19 and March 22&lt;/li&gt;
&lt;li&gt;If they did, treat the CI environment as fully compromised and rotate all secrets it had access to&lt;/li&gt;
&lt;li&gt;Pin GitHub Actions to full commit SHAs, not version tags. Tags can be force-pushed. Commit hashes cannot&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.microsoft.com/en-us/security/blog/2026/03/24/detecting-investigating-defending-against-trivy-supply-chain-compromise/" rel="noopener noreferrer"&gt;Microsoft's guidance&lt;/a&gt; has detection queries for Azure, Defender, and Sentinel&lt;/li&gt;
&lt;/ul&gt;
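&lt;p&gt;Finding mutable references is mechanical: grep your workflow files for &lt;code&gt;uses:&lt;/code&gt; lines whose ref is not a full 40-character commit SHA. A minimal Python sketch — the regex is illustrative, not a complete workflow parser:&lt;/p&gt;

```python
import re
from pathlib import Path

# Matches "uses: owner/repo@ref" lines in GitHub Actions workflow files.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([\w.-]+/[\w./-]+)@(\S+)", re.MULTILINE)
# A full commit SHA is 40 hex characters; anything else is a mutable reference.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")

def unpinned_actions(workflow_text: str) -> list:
    """Return action@ref strings whose ref is not a full commit SHA."""
    return [
        f"{action}@{ref}"
        for action, ref in USES_RE.findall(workflow_text)
        if not SHA_RE.match(ref)
    ]

if __name__ == "__main__":
    for path in Path(".github/workflows").glob("*.y*ml"):
        for finding in unpinned_actions(path.read_text()):
            print(f"{path}: {finding}")
```

&lt;p&gt;Anything the script flags — &lt;code&gt;@v1&lt;/code&gt;, &lt;code&gt;@latest&lt;/code&gt;, &lt;code&gt;@main&lt;/code&gt; — is a reference an attacker can repoint.&lt;/p&gt;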

&lt;p&gt;&lt;strong&gt;If you installed LiteLLM from PyPI:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check whether v1.82.7 or v1.82.8 was installed during the March 24 window&lt;/li&gt;
&lt;li&gt;If either version was installed on any system: full credential rotation — cloud providers, API keys, database passwords, SSH keys, Kubernetes tokens&lt;/li&gt;
&lt;li&gt;Audit for outbound connections to &lt;code&gt;models.litellm[.]cloud&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
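&lt;p&gt;The version check is scriptable with the standard library alone. A small sketch — the bad-version list comes from the incident reports above:&lt;/p&gt;

```python
from importlib import metadata

# Versions named in the incident reports as carrying the credential harvester.
MALICIOUS_LITELLM = {"1.82.7", "1.82.8"}

def litellm_status() -> str:
    """Report whether the installed litellm build is a known-bad version."""
    try:
        installed = metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return "litellm is not installed"
    if installed in MALICIOUS_LITELLM:
        return f"COMPROMISED: litellm {installed} installed, rotate all credentials"
    return f"litellm {installed} is not a known-bad version"

print(litellm_status())
```

&lt;p&gt;Run it on every host that could have installed LiteLLM during the window — not just the one you're sitting at.&lt;/p&gt;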

&lt;p&gt;&lt;strong&gt;If you use the Telnyx Python SDK:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Versions 4.87.1 and 4.87.2 are malicious. Downgrade to 4.87.0 immediately&lt;/li&gt;
&lt;li&gt;The package is currently quarantined on PyPI&lt;/li&gt;
&lt;li&gt;If either version was installed: rotate all credentials in the environment, check for &lt;code&gt;msbuild.exe&lt;/code&gt; in Windows Startup folders, check for unexpected outbound connections to &lt;code&gt;83[.]142[.]209[.]203:8080&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;For AI infrastructure generally:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pin every dependency in CI/CD to immutable references (commit SHAs for Actions, hashes for packages)&lt;/li&gt;
&lt;li&gt;Audit what credentials exist in your CI/CD environments. If your build runner has access to your PyPI token, your cloud provider keys, and your database password, a single compromised dependency can take all of them&lt;/li&gt;
&lt;li&gt;LLM proxies, agent orchestrators, and AI workflow tools sit in the execution path of your AI stack. They have elevated access because they need it. That makes them high-value targets. Treat them accordingly&lt;/li&gt;
&lt;li&gt;Monitor for unexpected outbound connections from CI/CD and development environments. The earliest signal of a credential harvester is a network call to a domain you don't recognise&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;The Trivy → Checkmarx → LiteLLM → Telnyx chain is the clearest example yet of a self-propagating supply chain attack. Each compromise created the conditions for the next one. The attack didn't need to be technically brilliant at any single step. It needed to be persistent, and it needed the ecosystem to not notice fast enough.&lt;/p&gt;

&lt;p&gt;The campaign has been running for ten days. Five ecosystems are compromised. An estimated 300 GB of stolen credentials are being actively used. And the next package in the chain could be publishing right now.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://securitylabs.datadoghq.com/articles/litellm-compromised-pypi-teampcp-supply-chain-campaign/" rel="noopener noreferrer"&gt;Datadog Security Labs: TeamPCP campaign analysis&lt;/a&gt; · &lt;a href="https://www.sans.org/blog/when-security-scanner-became-weapon-inside-teampcp-supply-chain-campaign" rel="noopener noreferrer"&gt;SANS: When the Security Scanner Became the Weapon&lt;/a&gt; · &lt;a href="https://arcticwolf.com/resources/blog/teampcp-supply-chain-attack-campaign-targets-trivy-checkmarx-kics-and-litellm-potential-downstream-impact-to-additional-projects/" rel="noopener noreferrer"&gt;Arctic Wolf: TeamPCP campaign advisory&lt;/a&gt; · &lt;a href="https://www.endorlabs.com/learn/teampcp-strikes-again-telnyx-compromised-three-days-after-litellm" rel="noopener noreferrer"&gt;Endor Labs: Telnyx compromise attribution&lt;/a&gt; · &lt;a href="https://www.microsoft.com/en-us/security/blog/2026/03/24/detecting-investigating-defending-against-trivy-supply-chain-compromise/" rel="noopener noreferrer"&gt;Microsoft: Trivy compromise detection guidance&lt;/a&gt; · &lt;a href="https://www.stepsecurity.io/blog/teampcp-plants-wav-steganography-credential-stealer-in-telnyx-pypi-package" rel="noopener noreferrer"&gt;StepSecurity: WAV steganography analysis&lt;/a&gt; · &lt;a href="https://safedep.io/malicious-telnyx-pypi-compromise/" rel="noopener noreferrer"&gt;SafeDep: Telnyx PyPI compromise&lt;/a&gt; · &lt;a href="https://docs.litellm.ai/blog/security-update-march-2026" rel="noopener noreferrer"&gt;LiteLLM: Security incident report&lt;/a&gt; · &lt;a href="https://thehackernews.com/2026/03/teampcp-pushes-malicious-telnyx.html" rel="noopener noreferrer"&gt;The Hacker News: TeamPCP Telnyx&lt;/a&gt; · &lt;a href="https://www.sysdig.com/blog/teampcp-expands-supply-chain-compromise-spreads-from-trivy-to-checkmarx-github-actions" rel="noopener noreferrer"&gt;Sysdig: Checkmarx compromise&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/teampcp-supply-chain-cascade" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>supplychain</category>
      <category>cve</category>
      <category>aiinfrastructure</category>
    </item>
    <item>
      <title>LangChain Just Got Three CVEs. The Bugs Are From 2006.</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Sat, 28 Mar 2026 14:29:47 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/langchain-just-got-three-cves-the-bugs-are-from-2006-44pb</link>
      <guid>https://dev.to/mistaike_ai/langchain-just-got-three-cves-the-bugs-are-from-2006-44pb</guid>
      <description>&lt;p&gt;On March 27, &lt;a href="https://thehackernews.com/2026/03/langchain-langgraph-flaws-expose-files.html" rel="noopener noreferrer"&gt;researchers at Cyera disclosed three security vulnerabilities&lt;/a&gt; affecting LangChain and LangGraph — two of the most widely deployed AI development frameworks in the world.&lt;/p&gt;

&lt;p&gt;LangChain-Core recorded 23 million downloads in the week before disclosure. LangChain had 52 million. LangGraph had 9 million. That's 84 million combined weekly downloads carrying at least one of these vulnerabilities into production environments.&lt;/p&gt;

&lt;p&gt;The CVEs are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CVE-2026-34070&lt;/strong&gt; (CVSS 7.5) — path traversal in prompt loading&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CVE-2025-68664&lt;/strong&gt; (CVSS 9.3) — deserialization injection that leaks API keys and environment secrets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CVE-2025-67644&lt;/strong&gt; (CVSS 7.3) — SQL injection in the LangGraph checkpoint store&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Path traversal. Deserialization of untrusted data. SQL injection. If you've been in web security for any length of time, you've seen these before. Injection and path traversal date back to the original OWASP Top 10 in 2004; insecure deserialization joined the list in 2017.&lt;/p&gt;




&lt;h2&gt;
  
  
  CVE-2026-34070: Path Traversal in Prompt Loading
&lt;/h2&gt;

&lt;p&gt;LangChain's prompt-loading API (&lt;code&gt;langchain_core/prompts/loading.py&lt;/code&gt;) accepts file paths to load prompt templates. It does not validate those paths.&lt;/p&gt;

&lt;p&gt;A specially crafted prompt template reference can escape the intended directory and read arbitrary files from the server's filesystem. Configuration files, deployment metadata, tokens, prompt templates belonging to other applications — anything the process has read access to.&lt;/p&gt;

&lt;p&gt;This is &lt;a href="https://cwe.mitre.org/data/definitions/22.html" rel="noopener noreferrer"&gt;CWE-22: Improper Limitation of a Pathname to a Restricted Directory&lt;/a&gt;. It was first catalogued in 2006. Most web frameworks have built-in protections against it. LangChain's prompt loader did not.&lt;/p&gt;
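&lt;p&gt;The standard defence is to resolve the requested path and confirm it still falls inside the allowed directory before reading. A minimal sketch of the pattern — not LangChain's actual patch:&lt;/p&gt;

```python
from pathlib import Path

def safe_load(base_dir: str, requested: str) -> str:
    """Read `requested` only if it resolves inside `base_dir`."""
    base = Path(base_dir).resolve()
    # resolve() collapses any ../ segments, so a traversal attempt
    # ends up outside base and fails the containment check.
    target = (base / requested).resolve()
    if not target.is_relative_to(base):
        raise ValueError(f"path escapes prompt directory: {requested}")
    return target.read_text()
```

&lt;p&gt;The same check rejects absolute paths, because joining an absolute path onto the base replaces it entirely, and the result is not relative to the base directory.&lt;/p&gt;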

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: upgrade &lt;code&gt;langchain-core&lt;/code&gt; to version 1.2.22 or higher.&lt;/p&gt;




&lt;h2&gt;
  
  
  CVE-2025-68664: Serialization Injection That Leaks Secrets
&lt;/h2&gt;

&lt;p&gt;This one is the most severe at CVSS 9.3.&lt;/p&gt;

&lt;p&gt;LangChain has an internal serialization format. When a dictionary contains an &lt;code&gt;lc&lt;/code&gt; key, the framework treats it as a serialized LangChain object rather than regular data. The vulnerability: &lt;code&gt;dumps()&lt;/code&gt; and &lt;code&gt;dumpd()&lt;/code&gt; did not escape user-controlled dictionaries that happened to include the reserved &lt;code&gt;lc&lt;/code&gt; key.&lt;/p&gt;

&lt;p&gt;An attacker can craft input data that the framework interprets as a serialized object. When that object is processed, it can trigger the loading of arbitrary LangChain components — including ones that expose environment variables, API keys, and other secrets.&lt;/p&gt;
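&lt;p&gt;A toy illustration of the reserved-key collision — deliberately simplified, not LangChain's real serializer:&lt;/p&gt;

```python
# Toy serializer with LangChain-style semantics: any dict carrying the
# reserved "lc" key is treated as a serialized object on the way back in.
def toy_loads(data):
    if isinstance(data, dict) and "lc" in data:
        # In the real framework this instantiates a component from a registry,
        # which is exactly what attacker-controlled data must never reach.
        return f"INSTANTIATED:{data.get('id')}"
    return data

# The fix pattern: escape user-controlled dicts on the way out, so a
# round-trip can no longer promote plain data into objects.
def toy_dumps_escaped(data):
    if isinstance(data, dict) and "lc" in data:
        return {"__escaped__": data}  # wrapped: no longer matches the sentinel
    return data

attacker_dict = {"lc": 1, "id": ["os", "environ"]}  # attacker-shaped payload
print(toy_loads(attacker_dict))                     # promoted to an "object"
print(toy_loads(toy_dumps_escaped(attacker_dict)))  # stays plain data
```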

&lt;p&gt;&lt;a href="https://cyata.ai/blog/langgrinch-langchain-core-cve-2025-68664/" rel="noopener noreferrer"&gt;Cyata documented this vulnerability in December 2025&lt;/a&gt; under the name "LangGrinch." As researcher Vladimir Tokarev noted: "Each vulnerability exposes a different class of enterprise data: filesystem files, environment secrets, and conversation history."&lt;/p&gt;

&lt;p&gt;This is &lt;a href="https://cwe.mitre.org/data/definitions/502.html" rel="noopener noreferrer"&gt;CWE-502: Deserialization of Untrusted Data&lt;/a&gt;. Java developers have been fighting this class of bug since at least 2015, when the Apache Commons Collections deserialization vulnerability became one of the most exploited flaws in enterprise software. The AI ecosystem is learning the same lesson.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: upgrade &lt;code&gt;langchain-core&lt;/code&gt; to version 0.3.81 (if on the 0.x branch) or 1.2.5+.&lt;/p&gt;




&lt;h2&gt;
  
  
  CVE-2025-67644: SQL Injection in LangGraph Checkpoints
&lt;/h2&gt;

&lt;p&gt;LangGraph uses SQLite for checkpoint storage. The &lt;code&gt;SqliteSaver&lt;/code&gt; component's &lt;code&gt;list()&lt;/code&gt; and &lt;code&gt;alist()&lt;/code&gt; methods accept metadata filter keys — and those keys are interpolated directly into SQL queries without sanitisation.&lt;/p&gt;

&lt;p&gt;An attacker who can influence the metadata filter keys can inject arbitrary SQL. The result: full bypass of any query filters and access to all checkpoint records, which contain conversation state, tool call results, and any data the agent processed during its run.&lt;/p&gt;

&lt;p&gt;This is &lt;a href="https://cwe.mitre.org/data/definitions/89.html" rel="noopener noreferrer"&gt;CWE-89: SQL Injection&lt;/a&gt;. It was the number-one vulnerability in the original OWASP Top 10 in 2004. Parameterised queries have been the standard defence for over twenty years. The LangGraph checkpoint store did not use them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: upgrade &lt;code&gt;langgraph-checkpoint-sqlite&lt;/code&gt; to version 3.0.1.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pattern
&lt;/h2&gt;

&lt;p&gt;These are not exotic AI-specific attack vectors. There is no prompt injection here, no novel adversarial technique, no research paper required to understand the attack surface. These are bread-and-butter application security bugs — the kind that automated scanners have been catching in web applications since the mid-2000s.&lt;/p&gt;

&lt;p&gt;The web security community spent two decades building defences against these vulnerability classes. Frameworks like Django, Rails, and Express have path traversal protection, parameterised queries, and safe serialization built into their core. Developers using those frameworks get these protections by default without thinking about them.&lt;/p&gt;

&lt;p&gt;The AI framework ecosystem has not inherited those protections. It has inherited the speed and ambition, but not the scar tissue.&lt;/p&gt;

&lt;p&gt;LangChain is not a small project maintained by a single developer. It has significant funding, a large team, and enterprise customers. These vulnerabilities are not the result of neglect or resource constraints. They're the result of building fast in a domain where the security patterns haven't been established yet — and where the developers building the frameworks may not have backgrounds in the web security discipline that solved these problems.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters More Than Usual
&lt;/h2&gt;

&lt;p&gt;When a web application has a path traversal bug, the blast radius is the data that application can access. When an AI orchestration framework has a path traversal bug, the blast radius includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every API key the agent has been configured with&lt;/li&gt;
&lt;li&gt;Every tool credential stored in environment variables&lt;/li&gt;
&lt;li&gt;Every prompt template, including ones that encode business logic&lt;/li&gt;
&lt;li&gt;Every conversation history checkpoint, which may contain customer data, internal documents, or credentials that users pasted into chat&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI frameworks accumulate access. They need API keys for LLM providers, database credentials for memory stores, authentication tokens for the tools they orchestrate. A single vulnerability in the framework layer exposes everything the framework touches — which, by design, is everything.&lt;/p&gt;

&lt;p&gt;The SQL injection in LangGraph's checkpoint store is a good example. Checkpoints contain the full state of agent conversations: tool calls, responses, intermediate reasoning, user inputs. An attacker who can query the checkpoint store without filters has access to the complete operational history of every agent running on that instance.&lt;/p&gt;




&lt;h2&gt;
  
  
  What To Do
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Patch immediately.&lt;/strong&gt; The fixes are straightforward version bumps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;langchain-core&lt;/code&gt; &amp;gt;= 1.2.22 (or &amp;gt;= 0.3.81 for the 0.x line)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;langgraph-checkpoint-sqlite&lt;/code&gt; &amp;gt;= 3.0.1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Audit your environment variables.&lt;/strong&gt; If you're running LangChain in an environment that has API keys, cloud credentials, or database passwords in environment variables — and you almost certainly are — assume those were accessible through CVE-2025-68664 until you patched. Rotate them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check your checkpoint stores.&lt;/strong&gt; If you use LangGraph with SQLite checkpoints and the checkpoint store was accessible to untrusted input, assume conversation history was accessible. Audit what data those conversations contained.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Run a dependency audit.&lt;/strong&gt; &lt;code&gt;pip-audit&lt;/code&gt; or &lt;code&gt;safety check&lt;/code&gt; will flag these CVEs in your lockfile. If you're not running dependency audits in CI, this is a good week to start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consider what's between your agents and the world.&lt;/strong&gt; These CVEs are in the framework layer — the code that sits between your application logic and the LLMs, databases, and tools your agents use. If you're running DLP, content scanning, or access controls, they need to cover the framework layer too, not just the agent's outbound calls.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;RSAC 2026 wrapped up this week. &lt;a href="https://newsroom.cisco.com/c/r/newsroom/en/us/a/y2026/m03/cisco-reimagines-security-for-the-agentic-workforce.html" rel="noopener noreferrer"&gt;Cisco released DefenseClaw&lt;/a&gt;, an open-source framework for scanning AI agent skills and MCP servers. Microsoft announced Agent 365. CrowdStrike launched Charlotte AI AgentWorks. SentinelOne, Check Point, Saviynt, and Teleport all shipped AI agent security products.&lt;/p&gt;

&lt;p&gt;The industry is building defences for the AI agent era. That's genuinely necessary. But the LangChain disclosure is a reminder that the most urgent vulnerabilities aren't the exotic ones. They're the ones we already know how to find and fix — in the frameworks that haven't looked for them yet.&lt;/p&gt;

&lt;p&gt;84 million weekly downloads. Path traversal, SQL injection, and deserialization. The AI industry is speed-running the web's security history, and it hasn't reached the chapter where we learned to check our inputs.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://thehackernews.com/2026/03/langchain-langgraph-flaws-expose-files.html" rel="noopener noreferrer"&gt;The Hacker News: LangChain, LangGraph Flaws Disclosure&lt;/a&gt; (March 27, 2026) · &lt;a href="https://www.techradar.com/pro/security/each-vulnerability-exposes-a-different-class-of-enterprise-data-langchain-framework-hit-by-several-worrying-security-issues-heres-what-we-know" rel="noopener noreferrer"&gt;TechRadar: LangChain framework security issues&lt;/a&gt; · &lt;a href="https://cyata.ai/blog/langgrinch-langchain-core-cve-2025-68664/" rel="noopener noreferrer"&gt;Cyata: LangGrinch CVE-2025-68664&lt;/a&gt; (December 2025) · &lt;a href="https://newsroom.cisco.com/c/r/newsroom/en/us/a/y2026/m03/cisco-reimagines-security-for-the-agentic-workforce.html" rel="noopener noreferrer"&gt;Cisco DefenseClaw announcement&lt;/a&gt; (March 2026)&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/langchain-three-cves" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>cve</category>
      <category>langchain</category>
      <category>aiinfrastructure</category>
    </item>
    <item>
      <title>29 Million Secrets Leaked on GitHub Last Year. AI Coding Tools Made It Worse.</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Sat, 28 Mar 2026 00:36:34 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/29-million-secrets-leaked-on-github-last-year-ai-coding-tools-made-it-worse-2a42</link>
      <guid>https://dev.to/mistaike_ai/29-million-secrets-leaked-on-github-last-year-ai-coding-tools-made-it-worse-2a42</guid>
      <description>&lt;p&gt;GitGuardian published the fifth edition of its &lt;a href="https://www.gitguardian.com/state-of-secrets-sprawl-report-2026" rel="noopener noreferrer"&gt;State of Secrets Sprawl&lt;/a&gt; report on March 27. It's the largest study of credential exposure on public GitHub, and this year's edition lands a finding that the AI agent ecosystem needs to sit with.&lt;/p&gt;

&lt;p&gt;AI-assisted commits leak secrets at roughly twice the rate of human-only commits. And 24,008 unique secrets were found specifically in MCP configuration files.&lt;/p&gt;

&lt;p&gt;Those aren't estimates. They're counts.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;The headline stats from the report:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;28.65 million&lt;/strong&gt; new hardcoded secrets detected in public GitHub commits in 2025. A 34% year-over-year increase and the largest single-year jump GitGuardian has recorded.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-assisted commits&lt;/strong&gt; had a 3.2% secret-leak rate, versus a 1.5% baseline across all public GitHub commits. That's roughly &lt;strong&gt;2x the baseline&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-service credentials&lt;/strong&gt; (API keys for LLM providers, embedding services, AI platforms) increased 81% year-over-year, reaching 1,275,105 detected leaks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;24,008 unique secrets&lt;/strong&gt; were found in MCP configuration files on public GitHub. Of those, &lt;strong&gt;2,117 were confirmed valid&lt;/strong&gt; — live credentials sitting in public repos.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;64% of valid secrets&lt;/strong&gt; from 2022 are still active in 2026. Four years later, not revoked.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why AI Tools Leak More
&lt;/h2&gt;

&lt;p&gt;The 2x leak rate for AI-assisted commits is not a simple "AI is bad at security" story. GitGuardian's report is careful about this, and the nuance matters.&lt;/p&gt;

&lt;p&gt;Developers remain in control of what gets accepted, edited, and pushed. AI coding tools suggest code. Humans approve it, modify it, and commit it. The leak happens through a human workflow — but the workflow has changed.&lt;/p&gt;

&lt;p&gt;Three things are different when AI is in the loop:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Speed.&lt;/strong&gt; AI-assisted development moves faster. More code reviewed per hour, more commits per day, more surface area for a secret to slip through. The cognitive load of reviewing AI-generated code for security issues sits on top of reviewing it for correctness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Confidence.&lt;/strong&gt; When a tool generates code that works, the instinct is to ship it. The review step becomes shallower. A hardcoded API key in a config block generated by an AI assistant looks the same as any other config value — unremarkable, easy to miss.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defaults.&lt;/strong&gt; AI coding tools generate what they've seen in training data. If thousands of public repositories contain hardcoded API keys in configuration files, that pattern gets learned and reproduced. The model isn't being malicious. It's being accurate — accurately reproducing the insecure patterns it was trained on.&lt;/p&gt;

&lt;p&gt;The result is not a tool failure. It's a process gap: the velocity increased, but the guardrails didn't.&lt;/p&gt;




&lt;h2&gt;
  
  
  24,008 Secrets in MCP Configs
&lt;/h2&gt;

&lt;p&gt;This finding deserves its own section because it points to something structural.&lt;/p&gt;

&lt;p&gt;MCP (Model Context Protocol) is how AI agents connect to external tools — databases, APIs, file systems, code repositories. An MCP configuration file defines which servers to connect to, what credentials to use, and how to authenticate.&lt;/p&gt;

&lt;p&gt;GitGuardian found 24,008 unique secrets across MCP-related configuration files on public GitHub. The report identifies a root cause that's uncomfortable: &lt;strong&gt;the documentation itself encourages the pattern&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Popular MCP setup guides — including official quickstarts — routinely show API keys placed directly in configuration files or command-line arguments. When the getting-started guide puts the API key inline, developers follow that pattern. When those config files get committed, the secret goes with them.&lt;/p&gt;

&lt;p&gt;This is not surprising. It's the same pattern that plagued &lt;code&gt;.env&lt;/code&gt; files, Docker Compose files, and Kubernetes manifests before tooling caught up. The difference is scale and timing: MCP adoption is accelerating fast, and the ecosystem's security tooling hasn't caught up yet.&lt;/p&gt;

&lt;p&gt;Of the 24,008 secrets found, 2,117 were confirmed valid. That means 2,117 live credentials — capable of authenticating against real services — were sitting in public GitHub repositories at the time of the scan.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Remediation Gap
&lt;/h2&gt;

&lt;p&gt;Perhaps the most alarming number in the report isn't about AI at all.&lt;/p&gt;

&lt;p&gt;64% of valid secrets detected in 2022 are still active in 2026. Four years later. Not rotated, not revoked, not expired.&lt;/p&gt;

&lt;p&gt;This isn't a detection problem. GitGuardian detected them. The problem is what happens after detection: somebody needs to identify the secret's owner, assess its blast radius, revoke it, rotate it, update every system that depends on it, and verify nothing breaks. For most organisations, that workflow either doesn't exist or stalls at "identify the owner."&lt;/p&gt;

&lt;p&gt;AI agents make this worse in a specific way. An AI coding tool that generates a config file with a hardcoded secret doesn't know who owns that secret, what it connects to, or what the rotation procedure is. It can't file the remediation ticket. It just writes the code and moves on.&lt;/p&gt;

&lt;p&gt;The gap between "secret detected" and "secret revoked" is where the real risk lives. And it's growing.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Means for Agent Infrastructure
&lt;/h2&gt;

&lt;p&gt;If you're building or operating AI agents, three things from this report should change your threat model:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. MCP config files are a credential attack surface.&lt;/strong&gt; Treat &lt;code&gt;.cursor/mcp.json&lt;/code&gt;, &lt;code&gt;claude_desktop_config.json&lt;/code&gt;, and any MCP server configuration with the same paranoia you'd apply to &lt;code&gt;.env&lt;/code&gt; files. Don't commit them. Don't share them in Slack. Don't paste them in documentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. AI-generated code needs secret scanning in the commit pipeline.&lt;/strong&gt; Pre-commit hooks that catch secrets before they hit the repository are no longer optional. Tools like GitGuardian, TruffleHog, and &lt;code&gt;detect-secrets&lt;/code&gt; belong in every pipeline that ships AI-assisted code. The 2x leak rate makes this arithmetic simple.&lt;/p&gt;
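&lt;p&gt;For a sense of what these scanners do, here is a deliberately tiny sketch — real tools ship hundreds of detectors plus entropy analysis, so treat this as illustration, not coverage:&lt;/p&gt;

```python
import re

# A few illustrative detectors. Real scanners (detect-secrets, TruffleHog,
# GitGuardian) combine many patterns with entropy and context checks.
PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "generic_api_key": re.compile(
        r"(?i)api[_-]?key['\"]?\s*[:=]\s*['\"][^'\"]{16,}['\"]"
    ),
}

def scan(text: str) -> list:
    """Return the names of detectors that fired on the given text."""
    return [name for name, pat in PATTERNS.items() if pat.search(text)]

snippet = '{"API_KEY": "sk-live-0123456789abcdef0123"}'  # hypothetical config line
print(scan(snippet))
```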

&lt;p&gt;&lt;strong&gt;3. The output side matters as much as the input side.&lt;/strong&gt; Most discussion about AI agent security focuses on what goes into the agent — prompt injection, poisoned context, malicious tool responses. This report is about what comes out: the code the agent writes, the configs it generates, the credentials it embeds. Output scanning — including DLP on tool call payloads — catches the secrets that pre-commit hooks miss, because not all agent output flows through git.&lt;/p&gt;




&lt;h2&gt;
  
  
  Honest Context
&lt;/h2&gt;

&lt;p&gt;We build DLP for MCP tool calls at mistaike.ai, so we have a stake in this conversation. We're not pretending otherwise.&lt;/p&gt;

&lt;p&gt;But the GitGuardian data stands on its own. 29 million secrets. 24,008 in MCP configs. 2x leak rate from AI-assisted code. 64% still valid after four years. These are someone else's numbers from an independent study, and they describe a problem that exists whether or not you use our product.&lt;/p&gt;

&lt;p&gt;The practical takeaway is simple: if your AI agents generate code, configs, or tool call payloads, something needs to be scanning that output for secrets. That something could be a pre-commit hook, a CI pipeline check, a runtime DLP layer, or all three. The specific tool matters less than having the coverage at all.&lt;/p&gt;

&lt;p&gt;Right now, most teams don't.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://www.gitguardian.com/state-of-secrets-sprawl-report-2026" rel="noopener noreferrer"&gt;GitGuardian State of Secrets Sprawl 2026&lt;/a&gt; (March 27, 2026) · &lt;a href="https://blog.gitguardian.com/the-state-of-secrets-sprawl-2026/" rel="noopener noreferrer"&gt;GitGuardian blog: AI-Service Leaks Surge 81%&lt;/a&gt; · &lt;a href="https://www.helpnetsecurity.com/2026/03/27/gitguardian-exposed-credentials-risk-report/" rel="noopener noreferrer"&gt;Help Net Security: AI frenzy feeds credential chaos&lt;/a&gt; · &lt;a href="https://hackernoob.tips/ai-coding-tools-double-secret-leak-rates-2026/" rel="noopener noreferrer"&gt;HackerNoob: AI Coding Tools Double Secret Leak Rates&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/ai-coding-agents-leak-secrets" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>secrets</category>
      <category>aiagents</category>
      <category>mcp</category>
    </item>
    <item>
      <title>LiteLLM Was Backdoored via Its Security Scanner. Langflow Hit CISA's Exploit Catalog. Same Week.</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Fri, 27 Mar 2026 12:40:07 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/litellm-was-backdoored-via-its-security-scanner-langflow-hit-cisas-exploit-catalog-same-week-24f3</link>
      <guid>https://dev.to/mistaike_ai/litellm-was-backdoored-via-its-security-scanner-langflow-hit-cisas-exploit-catalog-same-week-24f3</guid>
      <description>&lt;p&gt;Two tools that appear in most AI development stacks had critical security incidents within 48 hours of each other this week.&lt;/p&gt;

&lt;p&gt;On March 24, LiteLLM — the open-source LLM proxy and router used across thousands of enterprise deployments — distributed malicious packages containing a credential-stealing backdoor. The attack window was approximately 5.5 hours.&lt;/p&gt;

&lt;p&gt;On March 25, CISA added Langflow to its Known Exploited Vulnerabilities catalog. Langflow is the visual framework for building AI workflows, with 145,000 GitHub stars. CVE-2026-33017 had already been exploited in the wild within 20 hours of the advisory's publication.&lt;/p&gt;

&lt;p&gt;Neither incident was subtle. Both were preventable with steps that are not particularly difficult. Together they mark a visible shift: AI development tooling is now a primary attack target, not collateral damage.&lt;/p&gt;




&lt;h2&gt;
  
  
  The LiteLLM Attack Chain
&lt;/h2&gt;

&lt;p&gt;LiteLLM's compromise traced back to Trivy — a popular open-source container security scanner. Trivy had been compromised via a supply chain attack, and LiteLLM's CI/CD pipeline used Trivy as a dependency.&lt;/p&gt;

&lt;p&gt;The result: when LiteLLM's automated build ran, the compromised Trivy component injected malicious code into the resulting packages. Versions v1.82.7 and v1.82.8 were pushed to PyPI through what appeared to be a legitimate maintainer account, but they contained a credential harvester that LiteLLM's team had not written.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.litellm.ai/blog/security-update-march-2026" rel="noopener noreferrer"&gt;According to LiteLLM's post-incident report&lt;/a&gt;, the payload was designed to collect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Environment variables&lt;/li&gt;
&lt;li&gt;SSH keys&lt;/li&gt;
&lt;li&gt;Cloud provider credentials (AWS, GCP, Azure)&lt;/li&gt;
&lt;li&gt;Kubernetes tokens&lt;/li&gt;
&lt;li&gt;Database passwords&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stolen data was encrypted and exfiltrated to &lt;code&gt;models.litellm[.]cloud&lt;/code&gt; — a domain that looks plausible at a glance. The packages were live on PyPI from 10:39 UTC to 16:00 UTC on March 24. LiteLLM has engaged Google Mandiant for forensic analysis and paused new releases pending a full supply chain review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who was at risk&lt;/strong&gt;: anyone who installed LiteLLM without a pinned version during that window. Users running the official Docker image with pinned dependencies were not affected.&lt;/p&gt;

&lt;p&gt;This is the recursive supply chain attack in its clearest form. A security tool used to protect an AI tool became the vector that compromised it. The irony is not subtle. The lesson: the tool auditing your dependencies can also be the one that poisons them.&lt;/p&gt;

&lt;p&gt;LiteLLM sits in the call path between your application and the LLMs. It processes every prompt, every response, and has access to every API key you've configured. The malicious package didn't need to be clever to reach valuable data. It was already there.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Langflow RCE
&lt;/h2&gt;

&lt;p&gt;CVE-2026-33017 is a code injection vulnerability in Langflow carrying a CVSS score of 9.3. The vulnerable endpoint is:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /api/v1/build_public_tmp/{flow_id}/flow
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This endpoint is designed to let unauthenticated users build public flows. The vulnerability arises because it accepts attacker-supplied flow data containing arbitrary Python code in node definitions — and that code executes server-side with no sandboxing.&lt;/p&gt;

&lt;p&gt;One crafted HTTP request. Arbitrary Python execution on the host. No authentication required.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://thehackernews.com/2026/03/critical-langflow-flaw-cve-2026-33017.html" rel="noopener noreferrer"&gt;Researchers at Endor Labs documented&lt;/a&gt; that exploitation started approximately 20 hours after the advisory was published on March 19. No public proof-of-concept existed at the time. Attackers reverse-engineered the exploit directly from the advisory text.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.sysdig.com/blog/cve-2026-33017-how-attackers-compromised-langflow-ai-pipelines-in-20-hours" rel="noopener noreferrer"&gt;Sysdig's incident timeline&lt;/a&gt; shows how fast weaponisation moved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hour 0&lt;/strong&gt; — advisory published&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hour 20&lt;/strong&gt; — automated scanning begins&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hour 21&lt;/strong&gt; — Python-based exploitation observed in the wild&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hour 24&lt;/strong&gt; — data harvesting activity confirmed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CISA added CVE-2026-33017 to its &lt;a href="https://www.bleepingcomputer.com/news/security/cisa-new-langflow-flaw-actively-exploited-to-hijack-ai-workflows/" rel="noopener noreferrer"&gt;Known Exploited Vulnerabilities catalog&lt;/a&gt; on March 25. Federal agencies running Langflow have until April 8 to patch to version 1.9.0+ or cease using the product.&lt;/p&gt;

&lt;p&gt;All versions 1.8.1 and earlier are affected.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Both Incidents Share
&lt;/h2&gt;

&lt;p&gt;The obvious answer is "supply chain" — but that framing is too broad to be useful here.&lt;/p&gt;

&lt;p&gt;Look at the specifics.&lt;/p&gt;

&lt;p&gt;LiteLLM sits between your application and the LLMs. It has elevated access because it needs it. The payload targeted exactly the credentials that would give an attacker the most lateral movement: cloud provider keys, database passwords, Kubernetes tokens. Whoever built the attack understood what LiteLLM deployments look like.&lt;/p&gt;

&lt;p&gt;Langflow builds AI workflows where node definitions can contain executable code. The vulnerability wasn't in an obscure edge case. It was in the mechanism that makes the platform functional: flow definitions contain code, and the platform executes that code. The public-facing endpoint inherited all of that execution capability without any authentication guard.&lt;/p&gt;

&lt;p&gt;These aren't bugs found in peripheral features. They're consequences of what these tools are built to do.&lt;/p&gt;

&lt;p&gt;Tools that sit in the execution path of AI workflows — proxies, orchestrators, visual builders — accumulate trust and access because they need it to function. That trust is exactly what attackers are targeting.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Advisory-to-Exploit Pipeline
&lt;/h2&gt;

&lt;p&gt;The 20-hour exploitation window for Langflow deserves closer attention.&lt;/p&gt;

&lt;p&gt;The standard assumption in enterprise security is that you have somewhere between 72 hours and a few weeks between a public CVE and active exploitation. That assumption is based on a world where attackers need to understand the vulnerability, write the exploit, test it, and deploy it.&lt;/p&gt;

&lt;p&gt;CVE-2026-33017 invalidated all of that. Attackers read the advisory, extracted the endpoint name and the attack vector description, and had working exploits running before most organisations had finished their morning meetings.&lt;/p&gt;

&lt;p&gt;Modern CVE advisories are detailed. They have to be — developers need to understand what's affected to make patching decisions. But a well-written advisory for a code injection vulnerability is also an exploit blueprint. The technical description is the proof-of-concept.&lt;/p&gt;

&lt;p&gt;Security teams should treat AI workflow and AI infrastructure CVEs as having a near-zero exploitation delay, not a standard 72-hour window.&lt;/p&gt;




&lt;h2&gt;
  
  
  What To Do
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;For Langflow:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upgrade to version 1.9.0 or later immediately&lt;/li&gt;
&lt;li&gt;If you cannot patch immediately, restrict or disable the &lt;code&gt;POST /api/v1/build_public_tmp&lt;/code&gt; endpoint&lt;/li&gt;
&lt;li&gt;Do not expose Langflow instances directly to the internet&lt;/li&gt;
&lt;li&gt;Rotate any credentials the host had access to&lt;/li&gt;
&lt;/ul&gt;
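
&lt;p&gt;Triage can start with a simple version check against the patched release. A minimal sketch — the helper below is illustrative, not official Langflow tooling:&lt;/p&gt;

```python
# Flag Langflow versions affected by CVE-2026-33017 (all releases 1.8.1 and
# earlier). Illustrative triage helper, not official Langflow tooling.

def parse_version(version: str) -> tuple:
    """Turn '1.8.1' (or 'v1.8.1') into (1, 8, 1) for tuple comparison."""
    return tuple(int(part) for part in version.strip().lstrip("v").split("."))

PATCHED = parse_version("1.9.0")

def is_affected(installed: str) -> bool:
    """True when the installed Langflow release predates the 1.9.0 fix."""
    return parse_version(installed) < PATCHED

if __name__ == "__main__":
    for v in ("1.8.1", "1.9.0", "1.10.2"):
        print(v, "AFFECTED" if is_affected(v) else "patched")
```

&lt;p&gt;Tuple comparison handles double-digit components correctly — 1.10.2 sorts after 1.9.0 — which naive string comparison would get wrong.&lt;/p&gt;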

&lt;p&gt;&lt;strong&gt;For LiteLLM:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pin your LiteLLM version in all production deployments&lt;/li&gt;
&lt;li&gt;Check whether your environment installed v1.82.7 or v1.82.8 during the March 24 10:39–16:00 UTC window&lt;/li&gt;
&lt;li&gt;If it did: rotate everything that LiteLLM had access to — AWS keys, GCP service accounts, database passwords, API keys&lt;/li&gt;
&lt;li&gt;Review your PyPI dependency resolution strategy for internal deployments&lt;/li&gt;
&lt;/ul&gt;
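
&lt;p&gt;Checking your pins is mechanical. This sketch scans requirements-style text for a litellm pin on one of the two compromised releases. It is illustrative — adapt it to however your deployment pins packages, and remember that unpinned installs during the window need the install timestamp checked instead:&lt;/p&gt;

```python
# Check dependency pins for the two compromised LiteLLM releases (v1.82.7 and
# v1.82.8). Illustrative sketch, not official tooling from the LiteLLM team.
import re

COMPROMISED = {"1.82.7", "1.82.8"}

def pinned_litellm_version(requirements_text: str):
    """Return the pinned litellm version from requirements-style text, or None."""
    for line in requirements_text.splitlines():
        match = re.match(r"\s*litellm\s*==\s*v?([\d.]+)", line, re.IGNORECASE)
        if match:
            return match.group(1)
    return None

def is_compromised(requirements_text: str) -> bool:
    """True when the pin lands on a known-malicious release."""
    return pinned_litellm_version(requirements_text) in COMPROMISED

if __name__ == "__main__":
    print(is_compromised("litellm==1.82.7\nhttpx==0.27.0"))  # True
```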

&lt;p&gt;&lt;strong&gt;For AI infrastructure generally:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your LLM proxy, workflow builder, and agent orchestration layer all have elevated access. Treat them as production services with meaningful blast radii, not developer tools.&lt;/li&gt;
&lt;li&gt;Audit what credentials are available in the environments where these tools run.&lt;/li&gt;
&lt;li&gt;Pin dependencies. All of them. Especially security tooling used in CI/CD pipelines.&lt;/li&gt;
&lt;li&gt;If your AI infrastructure makes outbound network calls, log them. An unexpected call to an unrecognised domain is the earliest signal you'll get.&lt;/li&gt;
&lt;/ul&gt;
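
&lt;p&gt;The last point is the cheapest control to add. A toy version of the check — the allowlisted domains here are examples, not a recommendation:&lt;/p&gt;

```python
# Flag outbound calls to domains outside a declared allowlist -- the kind of
# signal that would have surfaced traffic to models.litellm[.]cloud early.
# Minimal sketch; in production this would run over real egress logs.

ALLOWED_DOMAINS = {"api.openai.com", "api.anthropic.com", "pypi.org"}

def unexpected_destinations(observed: list) -> list:
    """Return observed destination domains that are not on the allowlist."""
    return sorted(set(observed) - ALLOWED_DOMAINS)

if __name__ == "__main__":
    calls = ["api.openai.com", "models.litellm.cloud", "api.openai.com"]
    print(unexpected_destinations(calls))  # ['models.litellm.cloud']
```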




&lt;p&gt;Neither of these incidents required a sophisticated attacker. The LiteLLM compromise needed someone willing to invest in a supply chain attack with wide downstream reach. The Langflow exploit needed someone who could read a CVE advisory and write an HTTP request in Python.&lt;/p&gt;

&lt;p&gt;The tools we use to build AI systems are now valuable targets in their own right. That's new, and it changes the threat model for anyone operating AI infrastructure.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://docs.litellm.ai/blog/security-update-march-2026" rel="noopener noreferrer"&gt;LiteLLM security incident report&lt;/a&gt; (March 24, 2026) · &lt;a href="https://www.bleepingcomputer.com/news/security/cisa-new-langflow-flaw-actively-exploited-to-hijack-ai-workflows/" rel="noopener noreferrer"&gt;BleepingComputer: Langflow CVE-2026-33017&lt;/a&gt; · &lt;a href="https://www.sysdig.com/blog/cve-2026-33017-how-attackers-compromised-langflow-ai-pipelines-in-20-hours" rel="noopener noreferrer"&gt;Sysdig: 20-hour exploitation timeline&lt;/a&gt; · &lt;a href="https://thehackernews.com/2026/03/critical-langflow-flaw-cve-2026-33017.html" rel="noopener noreferrer"&gt;The Hacker News: Endor Labs research&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/litellm-langflow-same-week" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>supplychain</category>
      <category>cve</category>
      <category>aiinfrastructure</category>
    </item>
    <item>
      <title>We Stopped Bolting Security onto MCP. We Built It In.</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Wed, 25 Mar 2026 01:26:21 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/we-stopped-bolting-security-onto-mcp-we-built-it-in-i0n</link>
      <guid>https://dev.to/mistaike_ai/we-stopped-bolting-security-onto-mcp-we-built-it-in-i0n</guid>
<description>&lt;p&gt;&lt;strong&gt;Managed MCP hosting with Data Loss Prevention, 0-day CVE protection, and Content Safety is live on mistaike.ai. Self-service. No enterprise contract. 0-day CVE protection is free.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/auth/register"&gt;Start for free →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://dev.to/guides/sandbox-mcp-servers"&gt;Setup guide →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;Enterprise MCP security platforms have existed for a while. They cost five figures per year, require dedicated security teams, and take months to configure.&lt;/p&gt;

&lt;p&gt;We searched for something different — a managed MCP platform where Data Loss Prevention, CVE protection, and Content Safety were default features developers and small teams could actually use. We couldn't find one. So we built it.&lt;/p&gt;

&lt;p&gt;Sign up, connect your MCP tools, and every tool call is inspected from the first minute. No configuration required to turn security on. It's already on. And if you're just getting started with existing MCP tools, our gateway alone is enough — you still get 0-day CVE protection on every call, completely free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your Agent Is Flying Blind
&lt;/h2&gt;

&lt;p&gt;Every time your AI agent calls an MCP tool, it executes code written by someone else — on your infrastructure, with access to your network, your secrets, and your data. The response comes back and your agent acts on it, unquestioned.&lt;/p&gt;

&lt;p&gt;Most developers don't think about this. They connect MCP servers the way they install packages: trust the name, hope someone checked it.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://blog.gitguardian.com/smithery-mcp-breach/" rel="noopener noreferrer"&gt;Smithery.ai breach&lt;/a&gt; showed what happens when that trust is misplaced. One path traversal vulnerability exposed 3,243 MCP servers and thousands of API keys. &lt;a href="https://dev.to/razashariff/9-real-mcp-security-breaches-cves-data-leaks-and-why-the-protocol-needs-a-cryptographic-identity-ff6"&gt;82% of surveyed MCP implementations that handle file operations&lt;/a&gt; had path traversal vulnerabilities.&lt;/p&gt;

&lt;p&gt;This isn't hypothetical. It's the current state of MCP infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Security Layers. Every Tool Call.
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data Loss Prevention — Both Directions, Under 50ms
&lt;/h3&gt;

&lt;p&gt;Every tool call through mistaike.ai is scanned bidirectionally in real-time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Outbound&lt;/strong&gt; (your agent → the tool): secrets, credentials, PII, and financial data are caught before they reach third-party code. Your AWS keys don't leave. Your customer's email address doesn't get forwarded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inbound&lt;/strong&gt; (the tool → your agent): API keys, database connection strings, and personal data are stripped from responses before your agent processes them.&lt;/p&gt;

&lt;p&gt;When a scan triggers, the content is redacted. Your agent sees a clean response. The offending data never moves. Every match is written to an immutable audit log: what triggered, what was redacted, which rule matched, confidence score.&lt;/p&gt;
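
&lt;p&gt;To make the mechanics concrete, here is a deliberately tiny illustration of the pattern-match-and-redact idea. It is not our DLP engine — the real rule set is far larger and tuned for confidence scoring — it just shows the shape: match, replace, record.&lt;/p&gt;

```python
# A minimal illustration of match-redact-record. NOT the production DLP
# engine; two toy rules standing in for a much larger, tuned rule set.
import re

RULES = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
}

def redact(text: str):
    """Replace matches with a placeholder; return clean text plus rules hit."""
    hits = []
    for name, pattern in RULES.items():
        if pattern.search(text):
            hits.append(name)
            text = pattern.sub(f"[REDACTED:{name}]", text)
    return text, hits

if __name__ == "__main__":
    clean, hits = redact("key=AKIAABCDEFGHIJKLMNOP, contact alice@example.com")
    print(clean)
    print(hits)  # ['aws_access_key', 'email']
```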

&lt;h3&gt;
  
  
  0-Day CVE Protection — Free, and Always Up to Date
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;0-day CVE protection is free. For everyone. On every plan.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the one people miss.&lt;/p&gt;

&lt;p&gt;Traditional vulnerability scanners check your &lt;em&gt;committed code&lt;/em&gt;. That's table stakes. What they don't check: the code and data patterns embedded in the MCP &lt;em&gt;responses&lt;/em&gt; your agent receives and acts on.&lt;/p&gt;

&lt;p&gt;An MCP tool can return a SQL injection vector. A path traversal construct. An insecure deserialisation pattern. Your agent doesn't know — it just sees a tool response and uses it.&lt;/p&gt;

&lt;p&gt;Our pipeline cross-references every tool response against 9,527 known security vulnerability patterns drawn from CVE datasets and curated security research. If a response matches a known attack pattern, it's flagged before your agent ever processes it. This catches supply chain attacks specifically designed for the AI agent layer.&lt;/p&gt;
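
&lt;p&gt;For a sense of what "matching a response against a known pattern" means, here is a toy version with three illustrative signatures. The production set is roughly 9,500 patterns with confidence scoring; these regexes exist only to show the mechanism.&lt;/p&gt;

```python
# Toy version of cross-referencing a tool response against known attack
# patterns. Three illustrative signatures standing in for a CVE-derived set.
import re

SIGNATURES = {
    "path_traversal": re.compile(r"\.\./|\.\.\\"),
    "sql_injection": re.compile(r"(?i)\bunion\s+select\b|'\s*or\s+1=1"),
    "pickle_deserialisation": re.compile(r"(?i)pickle\.loads?\("),
}

def flag_response(body: str) -> list:
    """Return the names of any signatures the tool response matches."""
    return [name for name, sig in SIGNATURES.items() if sig.search(body)]

if __name__ == "__main__":
    print(flag_response("read file at ../../etc/passwd"))  # ['path_traversal']
    print(flag_response("hello world"))                    # []
```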

&lt;p&gt;We update our CVE lists and protections at least once a day. You don't manage updates — you're always protected against the latest known vulnerabilities, automatically.&lt;/p&gt;

&lt;p&gt;Even if you're not ready for managed hosting, connect your existing MCP tools through our gateway and you get this protection today, at no cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Content Safety — Stopping What Data Loss Prevention Doesn't Catch
&lt;/h3&gt;

&lt;p&gt;Tool responses can carry more than leaked data. They can carry instructions. Prompt injection attacks hide in tool outputs, attempting to redirect your agent's behaviour mid-task.&lt;/p&gt;

&lt;p&gt;Content Safety scanning runs on every inbound response, independently from the Data Loss Prevention pipeline. Configurable sensitivity. Per-server overrides for teams that need different thresholds on different tools. Full audit trail of every flag.&lt;/p&gt;




&lt;h2&gt;
  
  
  Managed MCP Hosting: Your Code Never Runs on Your Infrastructure
&lt;/h2&gt;

&lt;p&gt;Upload a Python MCP server. We build it, run it, and route your agents through our gateway. The untrusted code never touches your systems.&lt;/p&gt;

&lt;p&gt;Six isolation layers between your code and your infrastructure:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Kernel-level sandboxing.&lt;/strong&gt; Your server runs inside a user-space kernel that intercepts every system call, limiting what the process can see and do at the OS level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Default-deny egress.&lt;/strong&gt; Your server declares the external domains it needs (max 10, FQDNs only, no wildcards). Everything else — all outbound network access — is blocked before it leaves the container.&lt;/p&gt;
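
&lt;p&gt;The declaration rules are simple enough to sketch. This is an illustrative validator for the constraints as stated — max 10 entries, FQDNs only, no wildcards — not the platform's actual implementation:&lt;/p&gt;

```python
# Validate a server's declared egress list against the stated constraints:
# at most 10 entries, FQDNs only, no wildcards. Illustrative sketch only.
import re

FQDN = re.compile(r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}$")

def validate_egress(domains: list) -> list:
    """Return a list of human-readable violations; empty means acceptable."""
    problems = []
    if len(domains) > 10:
        problems.append(f"too many domains: {len(domains)} > 10")
    for d in domains:
        if "*" in d:
            problems.append(f"wildcard not allowed: {d}")
        elif not FQDN.match(d.lower()):
            problems.append(f"not a valid FQDN: {d}")
    return problems

if __name__ == "__main__":
    print(validate_egress(["api.stripe.com"]))  # []
    print(validate_egress(["*.stripe.com"]))    # ['wildcard not allowed: *.stripe.com']
```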

&lt;p&gt;&lt;strong&gt;3. Envelope-encrypted secrets.&lt;/strong&gt; Credentials are encrypted at rest, decrypted and injected directly into process memory at runtime, and the injection path is destroyed immediately after. No environment variables. No files on disk. Nothing to exfiltrate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Sandboxed build pipeline.&lt;/strong&gt; Dependency installation runs in its own isolated container with PyPI-only network access. Every dependency is vulnerability-scanned before the image is finalised.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Hard resource limits.&lt;/strong&gt; Fixed CPU, memory, storage, and PID limits per tier — not configurable by users. This prevents resource exhaustion attacks and fork bombs within the sandbox.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Ephemeral containers.&lt;/strong&gt; Five minutes idle, the container is destroyed. Not paused — destroyed. Every new request gets a fresh instance. No state accumulation, no persistent foothold.&lt;/p&gt;

&lt;p&gt;If a server is compromised, the blast radius is one container with no outbound network access, no persistent storage, and no path to anything outside the sandbox.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pricing starts from £10/month, with team plans available.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not ready for hosting? The gateway is free. Route your existing MCP tools through mistaike.ai and get 0-day CVE protection on every call with no subscription required.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/pricing"&gt;Full pricing →&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Built for Developers and Teams Who Shouldn't Have to Think About This
&lt;/h2&gt;

&lt;p&gt;If you're an independent developer connecting AI agents to MCP tools: start with the free gateway — you get 0-day CVE protection immediately. Add Data Loss Prevention and managed hosting when you need them.&lt;/p&gt;

&lt;p&gt;If you're a small team: each team member's agent traffic is inspected by the same rules, policy changes take effect immediately across all connections, and the audit log gives your ops team visibility without requiring a dedicated security stack.&lt;/p&gt;

&lt;p&gt;If you're a startup: when a customer asks "how do you protect data flowing through your AI integrations?" — you have a real answer backed by a real audit trail.&lt;/p&gt;

&lt;p&gt;The security doesn't scale down with smaller plans. The developer on the free tier gets the same 0-day CVE protection as a team on an enterprise plan. The limits are on compute allocation and hosted servers, not protection.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/auth/register"&gt;Start free →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://dev.to/guides/sandbox-mcp-servers"&gt;Read the setup guide →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://dev.to/security/sandbox"&gt;See the full security architecture →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Nick Stocks is the founder of mistaike.ai. The platform is built and operated using AI agents — and yes, the DLP caught credential leaks during development. That's how we knew it worked.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/mcp-sandbox-mode" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>sandbox</category>
      <category>security</category>
      <category>launch</category>
    </item>
    <item>
      <title>Your Zero-Trust Architecture Has a Blind Spot. It's Called MCP.</title>
      <dc:creator>Nick Stocks</dc:creator>
      <pubDate>Fri, 20 Mar 2026 16:52:37 +0000</pubDate>
      <link>https://dev.to/mistaike_ai/your-zero-trust-architecture-has-a-blind-spot-its-called-mcp-12ko</link>
      <guid>https://dev.to/mistaike_ai/your-zero-trust-architecture-has-a-blind-spot-its-called-mcp-12ko</guid>
<description>

&lt;p&gt;You spent years building zero-trust. Verified every user. Locked down every device. Inspected every packet. Then you connected an AI agent to your systems via the Model Context Protocol and implicitly trusted everything the agent was told.&lt;/p&gt;

&lt;p&gt;That contradiction just became the most talked-about topic in AI security.&lt;/p&gt;

&lt;p&gt;In the past 72 hours, Dark Reading &lt;a href="https://www.darkreading.com/application-security/mcp-security-patched" rel="noopener noreferrer"&gt;previewed an RSAC 2026 session&lt;/a&gt; where Netskope researcher Gianpietro Cutolo will argue that MCP's security risks are architectural — not the kind you can address via patching or configuration changes. SC Media published an essay calling MCP &lt;a href="https://www.scworld.com/perspective/mcp-is-the-backdoor-your-zero-trust-architecture-forgot-to-close" rel="noopener noreferrer"&gt;"the backdoor your zero-trust architecture forgot to close"&lt;/a&gt;. And an independent researcher scanned 900 MCP configurations on GitHub and found that &lt;a href="https://medium.com/@hrswndnc/we-scanned-900-mcp-configs-on-github-75-had-security-problems-537acb7036f7" rel="noopener noreferrer"&gt;75% had security problems&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The industry is converging on an uncomfortable conclusion: the protocol that connects your AI agents to everything isn't covered by any of the security layers you already have.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;The data from independent research groups paints a consistent picture.&lt;/p&gt;

&lt;p&gt;Security researchers catalogued approximately &lt;a href="https://www.scworld.com/perspective/mcp-is-the-backdoor-your-zero-trust-architecture-forgot-to-close" rel="noopener noreferrer"&gt;7,000 internet-exposed MCP servers&lt;/a&gt; — roughly half of all known deployments. Many operate with no authorisation controls whatsoever.&lt;/p&gt;

&lt;p&gt;Knostic researchers scanned approximately &lt;a href="http://www.descope.com/blog/post/mcp-server-security-best-practices" rel="noopener noreferrer"&gt;2,000 publicly accessible MCP servers&lt;/a&gt; and found that every single verified instance granted access to internal tool listings without any authentication.&lt;/p&gt;

&lt;p&gt;A comprehensive scan of &lt;a href="https://agent-wars.com/news/2026-03-13-mcp-security-2026-30-cves-in-60-days-what-went-wrong" rel="noopener noreferrer"&gt;2,614 MCP implementations&lt;/a&gt; found that 82% of those handling file operations were vulnerable to path traversal attacks, and 67% carried code injection risk. Between 38% and 41% of the 518 officially registered MCP servers offered no meaningful authentication at all.&lt;/p&gt;

&lt;p&gt;And the Orchesis scan of &lt;a href="https://medium.com/@hrswndnc/we-scanned-900-mcp-configs-on-github-75-had-security-problems-537acb7036f7" rel="noopener noreferrer"&gt;900+ MCP configurations&lt;/a&gt; committed to public GitHub repositories found that three out of four failed basic security checks.&lt;/p&gt;

&lt;p&gt;These aren't theoretical vulnerabilities. They're the current state of production deployments.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Attack Surface Nobody Named
&lt;/h2&gt;

&lt;p&gt;Here's the conceptual problem: the cybersecurity industry has mature defences for network-layer attacks, compromised credentials, and device posture. We have names for these threats, frameworks to address them, and tools to enforce policy.&lt;/p&gt;

&lt;p&gt;But MCP introduces something different. SC Media's Sunil Gentyala describes what researchers now call the &lt;a href="https://www.scworld.com/perspective/mcp-is-the-backdoor-your-zero-trust-architecture-forgot-to-close" rel="noopener noreferrer"&gt;"context-layer attack surface"&lt;/a&gt;: the capacity for malicious or manipulated content flowing into an AI agent's reasoning process to induce it to perform unauthorised operations — without any underlying model compromise.&lt;/p&gt;

&lt;p&gt;This is not a network attack. It's not credential theft. It's not even prompt injection in the traditional sense. It's the ability to manipulate what an agent believes about the world, and then watch it act on those manipulated beliefs using real tools with real permissions.&lt;/p&gt;

&lt;p&gt;Your zero-trust architecture verified the user. It verified the device. It inspected the network traffic. But the MCP connection sits between your agent and its tools, carrying context that nobody is inspecting, authenticating, or rate-limiting.&lt;/p&gt;

&lt;p&gt;As Security Boulevard put it in their &lt;a href="https://securityboulevard.com/2026/03/the-ultimate-guide-to-mcp-security-vulnerabilities/" rel="noopener noreferrer"&gt;comprehensive MCP vulnerability guide&lt;/a&gt;: "Unlike static APIs that process predictable, human-driven requests, MCP involves agent-driven decision-making, shifting contexts, and evolving chains of tools. Every interaction creates new risk vectors. Every context switch opens new paths for exploitation."&lt;/p&gt;

&lt;h2&gt;
  
  
  Why You Can't Patch This
&lt;/h2&gt;

&lt;p&gt;Netskope's Gianpietro Cutolo, whose &lt;a href="https://www.darkreading.com/application-security/mcp-security-patched" rel="noopener noreferrer"&gt;RSAC 2026 session&lt;/a&gt; is scheduled for next week, makes a specific claim: MCP's security risks exist at the architectural level in both LLMs and in MCP itself. They're not implementation bugs. They're design decisions.&lt;/p&gt;

&lt;p&gt;The protocol was designed for interoperability. It succeeded. Every major AI platform adopted it — Anthropic, OpenAI, Google, Microsoft, LangChain, Vercel, Pydantic AI. The standardisation worked exactly as intended.&lt;/p&gt;

&lt;p&gt;But that interoperability came with an implicit trust model: the agent trusts the server to return honest tool descriptions. The server trusts the agent to make reasonable requests. Neither party verifies the other's identity, integrity, or intent.&lt;/p&gt;

&lt;p&gt;You can patch individual CVEs. You can fix specific server implementations. But you can't patch away the fact that the protocol itself has no authentication, no message signing, no tamper detection, and no way to verify that the tools an agent sees are the tools the administrator intended.&lt;/p&gt;

&lt;p&gt;This is what Cutolo means by "architectural." The attack surface isn't in the bugs — it's in the blueprint.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Zero-Trust for the Context Layer Looks Like
&lt;/h2&gt;

&lt;p&gt;If the security industry spent a decade extending zero-trust from networks to identities to devices, the next extension is to the context layer. Here's what that means in practice:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Treat MCP connections as privileged access pathways.&lt;/strong&gt; Every connection between an agent and an MCP server is a pathway to sensitive data and operations. Inventory them. Classify them. Govern them with the same rigour as admin access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inspect every tool call.&lt;/strong&gt; The agent doesn't just send requests — it sends context. Tool names, parameters, embedded content. Every tool call is a potential exfiltration channel, and every response is a potential injection point. If you wouldn't let unaudited HTTP requests reach your database, you shouldn't let unaudited tool calls reach your tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enforce least privilege per tool, not per server.&lt;/strong&gt; Most MCP servers expose a bundle of tools with a single set of permissions. An agent that needs read access to a calendar shouldn't automatically get write access to email. Tool-level authorisation is the MCP equivalent of role-based access control.&lt;/p&gt;
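
&lt;p&gt;In code, the difference between per-server and per-tool permissions is small but decisive. A sketch with invented agent and tool names:&lt;/p&gt;

```python
# Tool-level authorisation sketch: grants are per tool, not per server.
# The grant table and all names here are invented for illustration.

GRANTS = {
    "support-agent": {"calendar.read", "tickets.read", "tickets.write"},
}

def authorise(agent: str, tool: str) -> bool:
    """Allow a tool call only if this agent was granted this specific tool."""
    return tool in GRANTS.get(agent, set())

if __name__ == "__main__":
    print(authorise("support-agent", "calendar.read"))  # True
    print(authorise("support-agent", "email.send"))     # False
```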

&lt;p&gt;&lt;strong&gt;Scan for data in transit.&lt;/strong&gt; Traditional DLP catches files leaving the perimeter. MCP DLP has to catch data leaving through tool call parameters — API keys in arguments, PII in prompts, credentials in responses. The exfiltration channel is the tool call itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log everything.&lt;/strong&gt; If your agent made a decision, you need to know what context it saw when it made it. Without audit logging at the MCP layer, you can't investigate incidents, prove compliance, or even know that something went wrong.&lt;/p&gt;
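
&lt;p&gt;What a useful record looks like is less obvious than it sounds. One minimal shape, with illustrative field names: keep the raw call, plus a digest so tampering is detectable.&lt;/p&gt;

```python
# Minimal audit record for an MCP tool call: enough to reconstruct what the
# agent saw and did, plus a digest for tamper evidence. Field names are
# illustrative, not a schema from any particular platform.
import hashlib
import json
from datetime import datetime, timezone

def audit_record(agent: str, tool: str, params: dict, decision: str) -> dict:
    """Build an append-only log entry for one tool call."""
    body = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "tool": tool,
        "params": params,
        "decision": decision,
    }
    # Digest over the canonical JSON form; recompute later to detect edits.
    canonical = json.dumps(body, sort_keys=True)
    body["sha256"] = hashlib.sha256(canonical.encode()).hexdigest()
    return body

if __name__ == "__main__":
    rec = audit_record("billing-agent", "db.query", {"q": "SELECT 1"}, "allowed")
    print(rec["tool"], rec["decision"])  # db.query allowed
```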

&lt;h2&gt;
  
  
  Where This Is Going
&lt;/h2&gt;

&lt;p&gt;The products are starting to appear. &lt;a href="https://www.enterprisenews.com/press-release/story/89892/pointguard-ai-unveils-mcp-security-gateway-to-secure-autonomous-ai-agents/" rel="noopener noreferrer"&gt;PointGuard AI shipped an MCP Security Gateway&lt;/a&gt; on March 18. &lt;a href="https://www.tmcnet.com/usubmit/-aurascape-unveils-new-zero-bypass-mcp-gateway-expands-/2026/03/17/10349323.htm" rel="noopener noreferrer"&gt;Aurascape announced a Zero-Bypass MCP Gateway&lt;/a&gt; on March 17. Open-source projects like &lt;a href="https://dev.to/razashariff/agentsign-zero-trust-for-ai-agents-cryptographic-passports-owasp-aligned-scanner-and-mcp-12ep"&gt;AgentSign&lt;/a&gt; are building cryptographic identity layers for AI agents.&lt;/p&gt;

&lt;p&gt;The message from RSAC is clear: this won't be solved by a patch to MCP v2. It'll be solved by building the same kind of security infrastructure around the context layer that we built around the network, the identity, and the device.&lt;/p&gt;

&lt;p&gt;The zero-trust architecture you already have isn't wrong. It's just incomplete. The context layer is the next perimeter to close.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post draws on reporting from &lt;a href="https://www.darkreading.com/application-security/mcp-security-patched" rel="noopener noreferrer"&gt;Dark Reading&lt;/a&gt; (March 19, 2026), &lt;a href="https://www.scworld.com/perspective/mcp-is-the-backdoor-your-zero-trust-architecture-forgot-to-close" rel="noopener noreferrer"&gt;SC Media&lt;/a&gt; (March 18, 2026), &lt;a href="https://securityboulevard.com/2026/03/the-ultimate-guide-to-mcp-security-vulnerabilities/" rel="noopener noreferrer"&gt;Security Boulevard&lt;/a&gt; (March 19, 2026), &lt;a href="https://medium.com/@hrswndnc/we-scanned-900-mcp-configs-on-github-75-had-security-problems-537acb7036f7" rel="noopener noreferrer"&gt;Orchesis&lt;/a&gt; (March 18, 2026), &lt;a href="http://www.descope.com/blog/post/mcp-server-security-best-practices" rel="noopener noreferrer"&gt;Descope&lt;/a&gt; (February 25, 2026), and &lt;a href="https://agent-wars.com/news/2026-03-13-mcp-security-2026-30-cves-in-60-days-what-went-wrong" rel="noopener noreferrer"&gt;Agent Wars&lt;/a&gt; (March 13, 2026). All statistics are from the cited sources.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://mistaike.ai/blog/mcp-zero-trust-blind-spot" rel="noopener noreferrer"&gt;mistaike.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>zerotrust</category>
      <category>securityarchitecture</category>
      <category>rsac</category>
    </item>
  </channel>
</rss>
