Jahanzaib

Posted on Apr 6 • Originally published at jahanzaib.ai

OpenClaw's Security Crisis: What 346,000 Stars and 135,000 Exposed Instances Teach Us About AI Agent Security

#openclaw #aisecurity #aiagents #cybersecurity

Two weeks ago I got a message from a client asking whether OpenClaw was still safe to run. Their DevOps lead had seen the headlines about 135,000 exposed instances and nine CVEs published in four days, and they wanted to know if the system I helped them deploy was one of them. I ran a quick check, confirmed they were fine because we had set it up correctly from day one, and then spent the next hour reading every security advisory and CVE detail that had dropped in the past three months.

OpenClaw, the open source AI agent from Peter Steinberger that hit 346,000 GitHub stars faster than any project in GitHub's history, is at the center of the first major AI agent security crisis of 2026. And the technical details are not abstract. They are specific, reproducible, and relevant to anyone running AI agents in production right now. Whether you use OpenClaw, NanoClaw, or any other autonomous agent framework, this story contains things you need to know before your next deployment.

Key Takeaways

CVE-2026-25253 (CVSS 8.8) allows one-click remote code execution by exploiting OpenClaw's WebSocket origin validation gap. A victim visiting a single malicious webpage is enough to trigger full system compromise.
135,000+ OpenClaw instances were found exposed on the public internet across 82 countries. More than 15,000 were directly vulnerable to remote execution.
Nine CVEs were disclosed in four days, including command injection, path traversal, and server-side request forgery flaws. Eight vulnerabilities were classified as critical.
341 of 2,857 skills in the ClawHub marketplace were found to be malicious at time of audit. That is 12% of the entire plugin registry.
This is not an OpenClaw-specific problem. Any AI agent with persistent credentials, autonomous execution, and integrations into your digital life carries the same category of risk. The architecture itself is the attack surface.
I am still deploying OpenClaw for clients. The difference between a safe deployment and an exposed one is about four configuration choices, and I will walk through all of them.

What Actually Happened With OpenClaw

OpenClaw launched in November 2025 as an open source personal AI agent. Within 24 hours of going viral in January 2026 it had 20,000 GitHub stars. By early April it sits at 346,000, making it the fastest-growing open source project in GitHub history. That growth attracted something else too: security researchers who started looking very carefully at what the tool actually does.

The first major finding was exposure. By the time CVE-2026-25253 was publicly disclosed on February 3, 2026, security researchers had already found over 135,000 OpenClaw instances running on publicly accessible IP addresses across 82 countries. More than 15,000 of those were directly exploitable via the RCE vulnerability. Most of the rest were accessible over unencrypted HTTP.

Security researchers found over 135,000 OpenClaw instances exposed on the public internet — many running unencrypted over HTTP.

CVE-2026-25253 is the one getting the most attention, and it deserves it. The vulnerability stems from a single design decision: OpenClaw's control UI reads a gatewayUrl parameter from the query string without validating it, and auto-connects on page load. When it connects, it sends the stored gateway authentication token in the WebSocket payload. An attacker can host a malicious webpage, trick a user into visiting it, and receive that token within milliseconds. The WebSocket server does not validate the origin header, so any website can trigger this connection.

Once an attacker has the gateway token, the blast radius is enormous. They can disable user confirmation prompts by setting exec.approvals.set to off. They can escape container restrictions by switching tools.exec.host to gateway. Then they have arbitrary code execution on the host machine. The entire attack chain runs in milliseconds according to Oasis Security's disclosure.

OpenClaw patched this in version 2026.1.29, released January 30. But the issue was already in the wild and nine more CVEs followed over the next four days. These included command injection (CVE-2026-24763), SSRF in the gateway (CVE-2026-26322, CVSS 7.6), and path traversal in the browser upload component (CVE-2026-26329). In total, the initial audit turned up 512 vulnerabilities with eight classified as critical.

Why AI Agents Are a Different Security Problem

I have been building production AI systems for a few years now and have shipped 109 of them across industries. In that time, I have seen a lot of organizations treat AI agent security the same way they treat web application security. That framing misses something important.

A web application has a defined interface. It accepts specific inputs, performs specific operations, and returns outputs within a bounded scope. Its attack surface is relatively static and auditable. An AI agent is different in a fundamental way: the instructions that control its behavior arrive at runtime, from untrusted sources, through the same channel as ordinary content.

OpenClaw's runtime can ingest untrusted text, download and execute skills from external sources, and perform actions using the credentials assigned to it — without equivalent controls to static application code.

CrowdStrike published analysis of OpenClaw that put this clearly: "Indirect prompt injection attacks targeting OpenClaw have already been seen in the wild, such as an injection attempt to drain crypto wallets." The attack method involves embedding malicious instructions in data the agent ingests: emails, webpages, documents. The agent reads the content and the malicious instructions look identical to legitimate data from the model's perspective.

This is a property of how language models process information. User data and control instructions occupy the same token space. There is no hardware-level separation between what the model is told to do and what it reads in the environment. This means prompt injection is not a bug you can patch once and forget. It is an architectural reality of the current generation of AI agents.

OpenClaw specifically amplifies this risk because of its integration footprint. A single instance connects to WhatsApp, Telegram, Slack, Discord, and iMessage, while also managing email, calendars, files, and shell commands. CrowdStrike described this as "prompt injection transforming from a content manipulation issue into a full-scale breach enabler, where the blast radius extends to every system and tool the agent can reach."

If you are evaluating whether your business is ready to deploy AI agents, the AI readiness assessment on this site includes a technical readiness dimension specifically designed to surface these kinds of architectural concerns before you commit to a deployment path.

The Supply Chain Problem Nobody Was Talking About

The CVEs got the headlines, but the ClawHub marketplace finding might be the more systemic issue. Security researchers auditing the OpenClaw skill registry found 341 malicious skills out of 2,857 total. That is 12% of the entire plugin ecosystem.

These are not obviously malicious tools. They appear as useful utilities: productivity helpers, calendar integrations, file management shortcuts. Once installed, a malicious skill runs with the same permissions as OpenClaw itself, which means it can read files, execute shell commands, exfiltrate credentials, and make outbound network requests. The user has no way to distinguish a legitimate skill from a compromised one without auditing the source code.

Reco.ai's analysis of the marketplace found what they called "shadow AI with elevated privileges" — third-party code running inside an agent runtime that has persistent access to everything the agent can touch. This is a supply chain problem that mirrors what we saw with npm malware campaigns, except the blast radius per compromised package is considerably larger because the agent runtime has system-level access rather than just code-level access.

341 of 2,857 skills in the ClawHub marketplace were found to be malicious — roughly 12% of the entire plugin registry. Many appeared as ordinary productivity tools.

OpenClaw's maintainers responded by accelerating their skill review process and removing the flagged entries. But the underlying dynamic remains: any marketplace-based extension model for an AI agent creates a continuous supply chain risk that requires ongoing vigilance, not just a one-time audit.

For the clients I have deployed OpenClaw with, the rule is simple: no skills from the marketplace without code review. We use a curated allowlist of skills that I have reviewed manually or built in-house. It adds friction to the deployment process, and that friction is the point.

Microsoft and CrowdStrike Weigh In

Two weeks after the initial CVE disclosures, Microsoft published a security blog titled "Running OpenClaw safely: identity, isolation, and runtime risk." It is worth reading in full if you deploy AI agents at any scale. The core framework they articulate is three-layer: identity first, scope second, model last.

Identity first means deciding who can talk to the agent before anything else. This includes implementing DM pairing, allowlists, and explicit open access controls. The reasoning is that if you do not control who can send instructions to the agent, you cannot control what the agent does.

Scope second means deciding where the agent is allowed to act. This includes group allowlists, mention gating, tool restrictions, sandboxing, and device-level permissions. The principle of least privilege applies here exactly as it does in traditional infrastructure security: the agent should have access to exactly what it needs for the task at hand and nothing more.

Model last is the most counterintuitive piece. Microsoft's guidance is to design your deployment under the assumption that the model can be manipulated. Not might be, can be. Build your system so that successful manipulation has a limited blast radius regardless of how clever the attack is. This means isolation, not trust, is the primary defense.

CrowdStrike's analysis added an enterprise-specific dimension. They found a "growing number of internet-exposed OpenClaw instances, many accessible over unencrypted HTTP rather than HTTPS." Their recommendation for security teams: deploy Falcon Exposure Management to identify internal and external OpenClaw deployments, monitor DNS requests to openclaw.ai domains, and implement runtime guardrails to detect prompt injection attempts before execution.

The Cisco security team published a post calling personal AI agents "a security nightmare" in enterprise contexts, pointing specifically to the risk when "employees deploy OpenClaw on corporate machines and connect it to enterprise systems without IT oversight." This is the shadow AI problem in its sharpest form: the productivity tool that arrived before the governance policy.

My Actual Deployment Configuration

I am not going to stop deploying OpenClaw because of this. The tool is genuinely useful for the right use cases and the vulnerabilities I described above are, with the right configuration, mitigable. Here is what I actually do when I set up OpenClaw for a client.

Secure AI agent deployment starts at the infrastructure layer: network isolation, dedicated credentials, and container boundaries before any configuration of the agent itself.

The first thing is network binding. OpenClaw's gateway defaults to 0.0.0.0:18789, which binds to all network interfaces including public ones. I always change this to 127.0.0.1 on the first line of configuration. If remote access is required, it goes behind a VPN, never directly to the internet. This single change eliminates the primary exposure vector for CVE-2026-25253 and the mass exposure issue the researchers identified.

Second is credential isolation. The agent gets its own dedicated accounts for every integration: a dedicated email account rather than a shared corporate inbox, a dedicated calendar, dedicated messaging credentials. These accounts have the minimum permissions required. When the agent makes a mistake or gets compromised, the blast radius is contained to those accounts rather than to an executive's email archive.

Third is a containerized runtime. OpenClaw runs inside a Docker container with a non-root user, no privileged flags, restricted outbound network access (using a blocklist of ranges the agent has no legitimate reason to reach), and no host path mounts beyond what the specific use case requires. This is standard practice for any code running with elevated privileges.

Fourth is the skill allowlist. No marketplace skills without review. If a client needs a specific integration, I either review the skill's source code in detail or build a minimal version in-house. The effort is worth it given that 12% of the ClawHub registry was compromised at peak.

Fifth is OpenClaw's built-in audit command: openclaw security audit and openclaw security audit --fix. I run this after any configuration change and before any deployment that exposes the gateway to additional network surfaces. The command checks for gateway auth exposure, browser control exposure, overly permissive allowlists, and filesystem permission issues. It is not a complete security audit but it catches the most common misconfigurations quickly.

If you are working through whether to adopt AI agents at all, whether to deploy OpenClaw versus a managed alternative, or whether your current setup has exposures you are not aware of, the contact page is the right starting point. This is exactly the kind of evaluation I do before any deployment.

What This Means for Businesses Adopting AI Agents

The OpenClaw story is not really about OpenClaw. It is about what happens when autonomous systems with persistent credentials and broad integration access reach mainstream adoption before the security practices catch up. OpenClaw is the first case study because it grew faster than anything else, but the same dynamics apply to any AI agent deployment.

I see this pattern with clients regularly. A team discovers that an AI agent can automate a meaningful chunk of their operational work. They deploy it quickly because the productivity gain is real and immediate. The security review happens later, if at all. And later means after the agent has already connected to production systems, processed sensitive data, and accumulated credentials that are hard to rotate without breaking workflows.

The 77% of security professionals who told Fortune's survey they were comfortable with autonomous AI systems operating without human oversight have probably not done a detailed threat model for what "without human oversight" means when the agent can read all email, execute shell commands, and send messages on behalf of real people. Comfort without analysis is the vulnerability.

There are three questions I ask every client before we touch OpenClaw or any other AI agent framework.

First: what are the actual credentials this agent will hold, and what can someone do with them if they get compromised? Walk through the worst-case scenario for every integration before deployment. If the answer is "access to our entire customer database" or "the ability to send emails as our CEO," the deployment needs more isolation work before it goes live.

Second: what external content will this agent ingest? If the agent reads emails, web pages, or third-party documents, it is consuming untrusted content through the same channel as its operating instructions. Every piece of external content is a potential prompt injection surface. This does not mean the agent cannot read external content. It means you need explicit sandboxing and output filtering between what the agent reads and what it can do.

Third: what does the governance process look like when this agent misbehaves? Not if, when. At some point the agent will take an action you did not intend. The question is how fast you can detect it, how fast you can stop it, and how much damage it can do in the time between the error and the intervention. If the answer to any of those is "we do not know," that is the gap to close before deployment.

I cover these evaluation dimensions in depth across some of my existing work: the OpenClaw overview post explains what the tool does, and the setup guide covers the installation process. But neither of those pieces goes deep on security hardening, which is why this post exists.

If you want to understand whether your business is at a stage where AI agents make sense at all, versus simpler automation alternatives, the AI readiness assessment takes about 10 minutes and gives you a scored breakdown across eight dimensions including technical readiness and data security posture. It is the starting point I recommend before any conversation about agent deployment.

The Broader Signal: AI Agent Security Is a Real Discipline Now

One positive thing the OpenClaw crisis has done is accelerate the formalization of AI agent security as a distinct field. Microsoft published an enterprise security framework. CrowdStrike added OpenClaw-specific detection to Falcon. Kaspersky labeled current versions "unsafe for general use." The OWASP AI Security working group has been expanding its guidance specifically for agentic systems.

This is good. It means the ecosystem is treating AI agent security with the seriousness it requires rather than treating agents as just another application type. The specific risks: prompt injection, supply chain compromise, credential amplification, autonomous execution without approval gates, are real and they require real tooling.

The tools I use and recommend for securing agent deployments now include: Docker container isolation as baseline, Pangea for runtime authorization and audit logging, Prompt Security's ClawSec suite for OpenClaw specifically (the GitHub repo is at prompt-security/clawsec), and Microsoft Defender for Cloud for enterprise deployments that need centralized visibility across multiple agent instances.

What I do not use are "AI safety" tools that add prompts telling the model to "be safe" or "don't do anything harmful." Those are not security controls. They are suggestions. Security comes from architectural boundaries, not model instructions. You can combine both, but if you are relying on the latter without the former, your deployment is not secure regardless of how detailed the system prompt is.

I have spent the last few months building agent security hardening into every deployment I do through AgenticMode AI, specifically because the OpenClaw story made clear that this is not optional work anymore. The clients who came to me with existing OpenClaw setups that needed audit work all had variations of the same problem: the gateway was not locked down, the credentials were too broad, and the skill allowlist was off by default. Three configuration changes, none of them complex. But none of them happen automatically, which is why they did not happen.

AI agent security hardening comes down to four configuration choices: bind to localhost, isolate credentials, containerize the runtime, and allowlist skills manually.

The broader lesson is that AI agent adoption is moving faster than AI agent operations practice. The security infrastructure, governance frameworks, and deployment standards are being built while people are already running agents in production. That gap creates risk, and the OpenClaw crisis is the first major public demonstration of what that risk looks like when it materializes.

The right response is not to avoid AI agents. For the right use cases, they deliver real and measurable operational leverage that simpler automation cannot match. I have documented this across the case studies on this site, including production systems that handle thousands of operations per day without human intervention. The right response is to deploy them with the same engineering discipline you would apply to any system that holds credentials and executes actions on behalf of real people.

Citation Capsule: OpenClaw reached 346K GitHub stars and 3.2 million users by April 2026 (OpenClaw Statistics April 2026). CVE-2026-25253 disclosed at CVSS 8.8 with patch in v2026.1.29 (The Hacker News, Feb 2026). 135,000+ exposed instances across 82 countries, 15,000+ directly vulnerable (PBX Science, 2026). 341 of 2,857 marketplace skills found malicious (CrowdStrike, Feb 2026). Microsoft enterprise security framework for AI agents (Microsoft Security Blog, Feb 2026). 77% of security professionals comfortable with autonomous AI without oversight (Fortune, Feb 2026).

Is OpenClaw safe to use in 2026?

OpenClaw can be deployed safely with the right configuration. The key steps are binding the gateway to localhost rather than all interfaces, running the agent in an isolated container with a non-root user, using dedicated credentials with minimum permissions for each integration, and maintaining a manually reviewed skill allowlist rather than installing marketplace skills freely. OpenClaw's own built-in audit command catches the most common misconfigurations. Versions from v2026.1.29 onward include patches for the critical CVEs. The tool is not safe with default settings in any environment that has public internet exposure.

What is CVE-2026-25253 and how serious is it?

CVE-2026-25253 is a cross-site WebSocket hijacking vulnerability in OpenClaw with a CVSS score of 8.8. It allows a remote attacker to steal a user's gateway authentication token simply by getting them to visit a malicious webpage. The attack takes milliseconds. With the stolen token, the attacker can disable confirmation prompts, escape container restrictions, and execute arbitrary commands on the host machine. A patch was released in OpenClaw v2026.1.29 on January 30, 2026. If you are running an earlier version, update immediately.

What is prompt injection and why does it matter for AI agents?

Prompt injection is an attack where malicious instructions are embedded in content the AI agent reads: emails, web pages, documents, API responses. Because language models process instructions and data through the same token stream, the model cannot inherently distinguish between a legitimate instruction from you and a malicious instruction embedded in a webpage it reads. For an agent like OpenClaw that connects to email, messaging apps, and the web, any piece of external content is a potential injection surface. CrowdStrike has already observed prompt injection attacks against OpenClaw in the wild, including attempts to drain crypto wallets via injected instructions in web content.

How many OpenClaw instances were exposed to the internet?

Security researchers found over 135,000 OpenClaw instances publicly accessible on the internet across 82 countries by the time CVE-2026-25253 was disclosed in February 2026. More than 15,000 of those instances were directly vulnerable to remote code execution. Most of the rest were accessible over unencrypted HTTP. The exposure happened because OpenClaw's gateway defaults to binding on all network interfaces (0.0.0.0) rather than localhost only, and most users did not change this default before connecting the agent to the internet.

What is the ClawHub marketplace risk?

ClawHub is OpenClaw's skill marketplace, similar to an app store for the agent. A security audit found 341 malicious skills out of 2,857 total, representing roughly 12% of the entire registry at time of audit. These malicious skills run with the same permissions as OpenClaw itself, meaning they can read files, execute shell commands, and exfiltrate credentials. They are often disguised as productivity utilities. The recommendation for production deployments is to avoid the public marketplace entirely and use only manually reviewed skills or custom-built integrations.

Should businesses avoid AI agents because of the OpenClaw security issues?

No. The OpenClaw security crisis is a lesson about deployment practices, not a reason to avoid AI agents entirely. AI agents deliver real operational value for the right use cases. The answer is to deploy them with proper security controls: network isolation, credential scoping, container-based runtime isolation, and supply chain controls for any third-party plugins. The same engineering discipline that applies to any system holding credentials and executing actions on behalf of real users applies here. Businesses that want an objective assessment of whether AI agents are the right fit for their current technical readiness should start with the AI readiness assessment.

How do I check if my OpenClaw instance is vulnerable?

Run openclaw security audit from your OpenClaw installation. This built-in command checks for the most common vulnerabilities: gateway auth exposure, browser control exposure, overly permissive allowlists, and filesystem permission issues. If your gateway is binding to 0.0.0.0 instead of 127.0.0.1, that is the first thing to change. Also verify you are running v2026.1.29 or later, which includes patches for CVE-2026-25253 and related vulnerabilities. If your instance is accessible from the public internet, restrict that access immediately while you complete the rest of the hardening steps.

What is the difference between OpenClaw and NanoClaw for security purposes?

NanoClaw, built by Lazer and Gavriel Cohen, is a lightweight containerized alternative designed with isolation as a first-class concern from the start. It has a smaller integration footprint than OpenClaw and fewer surface areas for both CVE-style vulnerabilities and prompt injection. For use cases where you do not need OpenClaw's full 100+ built-in skill library, NanoClaw is often the more secure choice by default. OpenClaw has more skills and broader integrations but requires more intentional hardening to reach an equivalent security posture. For high-sensitivity deployments, NanoClaw's reduced attack surface is a meaningful advantage.

DEV Community