42,000 AI Agents Were Exposed to the Internet. Here's What We Can Learn.
In early 2026, security researcher Maor Dayan published findings about OpenClaw — the open-source AI assistant platform with 214,000+ GitHub stars that lets users self-host an AI agent with deep system access. Email, calendars, file systems, code repos, databases, APIs. Full access. On your machine. Under your control.
The pitch was compelling: sovereign AI without the surveillance trade-offs.
The execution was, in Dayan's words, "the largest security incident in sovereign AI history."
The Numbers
Researchers scanning the public internet found over 42,000 OpenClaw instances exposed with no authentication. These weren't one-off misconfigurations; the platform was insecure by default, architecturally. 93% of the exposed instances had critical authentication bypass vulnerabilities.
A single backend misconfiguration at Moltbook leaked 1.5 million API tokens, along with 35,000 user emails and full conversation histories. Within 72 hours of OpenClaw going viral in January, automated scanners had extracted 200+ API keys and found 1,000+ admin panels open to the internet.
A Snyk audit of the community plugin store found that 36.82% of all community skills had security flaws. A deeper ClawHub audit uncovered 341 malicious skills — credential theft, malware delivery, and data exfiltration, all available for one-click install.
Then came CVE-2026-25253 (CVSS 8.8): a remote code execution vulnerability where a malicious website could hijack an active agent via WebSocket and get shell access to the host machine. Visit a webpage, lose your machine.
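The class of bug behind an attack like this is well known: browsers don't apply the same-origin policy to WebSocket handshakes, so an agent listening on localhost will accept a connection opened by any webpage unless the server validates the Origin header and requires a credential. A minimal sketch of that check, in Python — the origin list and token scheme here are illustrative, not OpenClaw's actual API:

```python
import hmac

# Origins allowed to open a WebSocket to the local agent (illustrative).
ALLOWED_ORIGINS = {"http://localhost:3000", "app://openclaw"}

def authorize_handshake(headers: dict, expected_token: str) -> bool:
    """Reject cross-site WebSocket hijack attempts.

    Browsers send the page's Origin on every WebSocket handshake, so an
    unexpected Origin means some other website opened the connection.
    The bearer token defends against non-browser clients.
    """
    origin = headers.get("Origin", "")
    if origin not in ALLOWED_ORIGINS:
        return False
    auth = headers.get("Authorization", "")
    supplied = auth.removeprefix("Bearer ").strip()
    # Constant-time comparison avoids leaking the token via timing.
    return hmac.compare_digest(supplied, expected_token)

# A handshake from a malicious page is refused even with a valid token:
print(authorize_handshake(
    {"Origin": "https://evil.example", "Authorization": "Bearer s3cret"},
    "s3cret",
))  # False
```

Neither check alone is sufficient — a leaked token defeats the token check, and non-browser clients can forge Origin — which is why both are applied together.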
The publicly reported damage: $50,000+ in unauthorized API charges, with the real number almost certainly higher.
(Sources: dev.to/tiamatenity, March 7 2026; dev.to/santiagopalma12, March 16 2026; Futurism, March 15 2026)
Why This Happened
OpenClaw did something new: it gave AI agents real system access. Not a chatbot in a browser. An agent that could run scripts, manage files, call APIs, and interact with infrastructure. That's powerful. It's also exactly the kind of thing that needs security boundaries — and OpenClaw shipped without them.
Here's what was missing:
No authentication on tool endpoints. Admin panels and API routes were accessible to anyone who could find them. 93% of exposed instances had no auth at all.
No outbound scanning. When the agent called external tools, nothing checked whether it was sending API keys, database credentials, or private data in the payload.
No inbound scanning. When tool responses came back, nothing checked for prompt injection, malicious instructions, or exfiltration commands.
No plugin verification. The community skill store had no security review process. 341 malicious plugins made it in. Users installed them with one click.
No audit trail. When things went wrong, there was no log of what the agent did, what it sent, or what it received.
Every single one of these is a known problem with a known solution.
This Isn't Just an OpenClaw Problem
The broader ecosystem around MCP, the Model Context Protocol that agents use to connect to external tools, has the same vulnerabilities at the protocol level. The research has been piling up:
Checkmarx catalogued 11 emerging security risks in the Model Context Protocol, including cross-agent context abuse, schema manipulation, and poisoned data injection. Every new MCP server connection expands the trust boundary.
A Medium analysis by security researcher Ayoub Nainia (March 2026) documented five ways he accidentally broke his own AI agent through MCP — leaking SSH keys, exfiltrating private repo data, and enabling arbitrary command execution. His summary: "Nobody warned me."
Researchers at Marmelab demonstrated cross-tool hijacking and external prompt injection in live MCP environments, showing how a malicious webpage parsed by one MCP server could compromise a completely different tool.
And a scan of 8,000+ MCP servers on the public internet (reported on r/cybersecurity, February 2026) found widespread exposed admin panels, debug endpoints, and unauthenticated API routes.
The pattern is consistent: AI agents are being given deep system access, connected to external tools via MCP, and deployed with no security layer between the agent and the world.
What Actually Needs to Happen
The OpenClaw incident is a case study in what happens when you skip the security layer. But it's not unique. Any MCP-connected agent has the same fundamental exposure if you're not scanning what goes in and out.
Here's the minimum viable security for any AI agent with tool access:
1. Scan outbound tool calls for secrets and PII. Your agent doesn't know what's sensitive. It will include AWS keys, database passwords, and email addresses in tool payloads if they're in its context. Something needs to catch that before it leaves.
2. Scan inbound tool responses for prompt injection. A malicious MCP server — or a compromised one — can embed instructions in its response that hijack your agent. Inbound scanning catches injected instructions before the agent acts on them.
3. Log every tool call. When something goes wrong (and it will), you need a record of what happened. What tool was called, what was sent, what came back. Without this, you're debugging blind.
4. Don't trust plugins by default. OpenClaw's one-click install with no review process is how 341 malicious skills got distributed. Any tool your agent connects to should be treated as untrusted until proven otherwise.
5. Block before it leaves. Detection is good. Prevention is better. If a secret is in an outbound payload, don't just log it and let it through — stop it.
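Put together, the checklist above is small enough to sketch. The detectors below are illustrative stand-ins (a real scanner needs far broader pattern coverage and a real injection classifier), but the shape — scan outbound, scan inbound, log everything, block on a hit — is the whole idea:

```python
import json
import re
import time

# Illustrative detectors only; production scanners need broader coverage.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                 # AWS access key id
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),          # email address (PII)
]
INJECTION_PHRASES = ["ignore previous instructions",
                     "disregard your system prompt"]

AUDIT_LOG = []  # stand-in for an append-only JSON-lines file

def log_event(event: str, detail: str) -> None:
    AUDIT_LOG.append(json.dumps(
        {"ts": time.time(), "event": event, "detail": detail}))

def guarded_tool_call(tool, payload: str) -> str:
    """Scan outbound, call the tool, scan inbound, log every step."""
    for pattern in SECRET_PATTERNS:
        if pattern.search(payload):
            log_event("blocked_outbound", pattern.pattern)
            raise PermissionError("secret or PII in outbound payload")
    log_event("tool_call", payload[:80])
    response = tool(payload)
    lowered = response.lower()
    for phrase in INJECTION_PHRASES:
        if phrase in lowered:
            log_event("blocked_inbound", phrase)
            raise PermissionError("possible prompt injection in response")
    log_event("tool_response", response[:80])
    return response

# A leaked credential is stopped before it leaves the machine:
try:
    guarded_tool_call(lambda p: "ok", "deploy with key AKIAABCDEFGHIJKLMNOP")
except PermissionError as exc:
    print(exc)  # secret or PII in outbound payload
```

The point of routing every call through one choke point like `guarded_tool_call` is that the audit trail and the blocking live in the same place: any tool call that happened is logged, and any call that was blocked is logged with the reason.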
What We're Doing About It
This is literally why mistaike.ai exists. We sit between your agent and every MCP server it talks to. Every tool call passes through a DLP pipeline — outbound gets scanned for secrets and PII, inbound gets scanned for prompt injection and destructive commands. Everything gets logged.
We built this because we needed it. Our own AI agents build this product, and they've demonstrated — repeatedly — that they will forward production credentials to external tools if nothing stops them.
Is it perfect? No. Our prompt injection detector has false positives. The secret scanner sometimes flags high-entropy strings that aren't secrets. We're honest about that.
But the OpenClaw breach wasn't caused by sophisticated attacks that bypassed advanced security measures. It was caused by having no security measures at all. Auth bypass, credential leaks, malicious plugins, zero logging — every one of these failures had an existing, known solution that wasn't implemented.
The bar isn't perfection. The bar is not leaving 42,000 instances open to the internet with no authentication, no scanning, and no audit trail.
That bar is achievable. Today.
The incident details in this post are sourced from published security research by Maor Dayan (March 2026), Santiago Palma (March 2026), Futurism/Bloomberg (March 2026), Checkmarx, Ayoub Nainia (March 2026), and Marmelab (February 2026).
Originally published on mistaike.ai