Mika Torren

Posted on Feb 21

OpenClaw Is Unsafe By Design

#security #ai #opensource #agents

On February 17th, a popular VS Code extension called Cline got compromised. The attack chain reads like a catalog of AI-specific failure modes:

Attacker opens a GitHub issue on Cline's repo
Cline's AI-powered issue triage bot reads it
Prompt injection in the issue content tricks the bot
Bot poisons the GitHub Actions cache with malicious code
CI pipeline steals VSCE_PAT, OVSX_PAT, and NPM_RELEASE_TOKEN
Attacker publishes cline@2.3.0 with a postinstall script that runs npm install -g openclaw@latest
~4,000 developers install it in 8 hours before it's deprecated

The malicious package was caught by StepSecurity's automated checks. Two red flags triggered immediately: the package was published manually (not via OIDC Trusted Publishing), and it had no npm provenance attestations. But here's the thing: the payload was OpenClaw.

Not malware. Not a cryptominer. OpenClaw.

And that's the problem. OpenClaw is the vulnerability.
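The two red flags mentioned above (a manual publish and missing provenance) are the kind of thing you can check mechanically. Here's a minimal sketch, not StepSecurity's actual logic. It takes one version's already-parsed npm registry metadata and assumes, without confirmation from the source, that provenance shows up as `dist.attestations` and that OIDC Trusted Publishing shows up as `_npmUser.trustedPublisher`. The function name is hypothetical.

```python
from typing import Any, Mapping


def publish_red_flags(version_meta: Mapping[str, Any]) -> list[str]:
    """Return human-readable red flags for one version's registry metadata."""
    flags = []

    dist = version_meta.get("dist") or {}
    # Assumption: versions published with provenance carry a
    # `dist.attestations` entry in the registry metadata.
    if not dist.get("attestations"):
        flags.append("no provenance attestations")

    publisher = version_meta.get("_npmUser") or {}
    # Assumption: `_npmUser.trustedPublisher` is only present for releases
    # made through OIDC Trusted Publishing.
    if not publisher.get("trustedPublisher"):
        flags.append("not published via trusted publishing")

    return flags
```

An empty list means neither flag fired. It says nothing about whether the package contents are safe.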

What Is OpenClaw?

OpenClaw (formerly Clawdbot, then Moltbot) is a "persistent AI coding agent" that lives on your machine. It's designed to have broad system-level permissions:

Persistent daemon running via launchd/systemd
WebSocket server on ws://127.0.0.1:18789
Full disk access
Full terminal access
Reads ~/.openclaw/credentials/ and config.json5 with API keys and OAuth tokens
Installs skills from ClawHub, a public marketplace with zero moderation
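If you want a quick read on whether something is listening on that local port, a plain TCP connect is enough. This is a rough sketch with a hypothetical helper name. The only thing taken from the list above is the address and port, and a `True` result just means some process accepts connections there, not that it's OpenClaw.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 18789, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is accepting connections at that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_listening()` with no arguments probes the default address from the list above.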

The value proposition is obvious: an AI assistant that can actually do things on your machine. Edit files, run commands, manage your workflow. No copy-pasting. No "here's the code, you run it."

The security implications are equally obvious, but bear with me.

The CVE Parade

OpenClaw went viral in early February 2026 after Karpathy and Willison tweeted about it. (Karpathy later clarified he finds the idea intriguing but doesn't recommend running it.) Within three days of going viral, three high-risk CVEs were issued:

CVE-2026-25253: Remote code execution
CVE-2026-25157: Command injection
CVE-2026-24763: Command injection (again)

All three were fixed. Patches shipped. But the fixes missed the point.

SecurityScorecard's STRIKE team started finding internet-exposed OpenClaw instances within hours of the viral tweets. When their findings were published, the count was 40k. By February 9th, it was 135k+. An estimated 50k+ remained vulnerable to the already-patched RCE.

Koi Security scanned ClawHub and found 341 malicious skills. One attacker alone uploaded 677 packages. Snyk scanned all ~4,000 skills and found 283 (7.1%) exposing credentials — API keys, passwords, even credit card numbers passed through the LLM context window in plaintext.

The "buy-anything" skill collects credit card details to make purchases. A follow-up prompt can exfiltrate the number.
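To make "passed through the context window in plaintext" concrete: nothing sits between the user's text and the model unless someone puts it there. Below is a rough sketch of the kind of filter that is missing, assuming a hypothetical `redact_card_numbers` step that runs before text reaches the LLM. It uses the standard Luhn checksum to avoid redacting arbitrary long numbers. It narrows the exposure but doesn't change the architecture, and it is not something OpenClaw or ClawHub is known to do.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_card_numbers(text: str) -> str:
    """Replace Luhn-valid card-like numbers with a placeholder."""
    def replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return "[REDACTED CARD]" if luhn_valid(digits) else match.group()

    return CARD_CANDIDATE.sub(replace, text)
```

A filter like this only catches what it recognizes. Anything it misses is still sitting in the context for the next prompt to ask about.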

Laurie Voss, founding CTO of npm, called it a "security dumpster fire."

r/netsec's verdict: "the concept is unsafe by design, not just the implementation."

They're right.

Why Patching Doesn't Work

Here's the core problem: OpenClaw's threat model is broken at the architectural level.

To be useful, OpenClaw needs:

Persistent access to your filesystem
Ability to execute arbitrary commands
Access to your credentials and API keys
Ability to install and run untrusted code (skills from ClawHub)
Network access to talk to LLM providers

To be safe, it would need to not have most of those things.

The tools you give it to be useful are exactly the tools that make it useful to attackers. This isn't a bug. It's the product.

The Cline supply chain attack proves this. The attacker didn't need to exploit a vulnerability in OpenClaw. They exploited the fact that OpenClaw exists and is designed to install itself system-wide with full permissions. The postinstall script npm install -g openclaw@latest wasn't stealing your data directly — it was installing a tool that already has full access to your data.

Think about that. The payload of the supply chain attack was "install this popular AI agent." Not "run this malicious script." Just "install this tool you've probably heard of, that has Twitter endorsements, that promises to automate your workflow."

Microsoft's Safety Guide

On February 19th, Microsoft published a guide called "Running OpenClaw safely." It covers identity isolation, runtime risk, and containment strategies.

Let that sink in. Microsoft is writing safety guides for a tool that went from "viral AI coding experiment" to "enterprise security concern" in three weeks.

The fact that this guide exists tells you everything. When Microsoft is publishing "how to run this safely" documentation for a third-party AI agent, the technology has outpaced the safety infrastructure. And the guide doesn't make OpenClaw safe — it just documents the hoops you need to jump through to contain something that was never designed to be contained.

The Real Problem: No OS Primitives for Agents

Here's the underlying gap: we don't have good OS primitives for agentic workloads yet.

OpenClaw runs as your user. It has your permissions. It can read your SSH keys, your .env files, your browser cookies. There's no sandbox, no capability-based security model, no "this agent can only access these specific paths."
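For a sense of what "this agent can only access these specific paths" would mean, here's a minimal application-level sketch. The function name and the allowlist are hypothetical. A check like this inside the agent process is not a sandbox, since the process could simply skip it. Real enforcement has to come from the OS.

```python
from pathlib import Path


def is_path_allowed(requested: str, allowed_roots: list[str]) -> bool:
    """Return True only if `requested` resolves inside one of `allowed_roots`."""
    # resolve() normalizes ".." and follows symlinks, so an escape such as
    # "project/../.ssh/id_rsa" is judged by where it really points.
    target = Path(requested).expanduser().resolve()
    for root in allowed_roots:
        root_path = Path(root).expanduser().resolve()
        if target == root_path or root_path in target.parents:
            return True
    return False
```

With an allowlist containing only a project directory, a request for `~/.ssh/id_rsa` or a `.env` file elsewhere on disk would come back `False`.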

There's interesting work happening in this space. A recent paper proposes a branch() syscall — like fork() but for agentic workloads with filesystem state. AI agents could speculatively branch execution into N parallel approaches, each gets an isolated FS snapshot, winner commits atomically, losers abort.
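The semantics are easier to see in code. What follows is a userspace approximation, not the paper's syscall. It runs attempts one after another instead of in parallel, it uses full directory copies instead of real filesystem snapshots, and its "commit" is two renames, not a truly atomic operation. All names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Sequence


def branch_and_commit(workdir: Path, attempts: Sequence[Callable[[Path], bool]]) -> int:
    """Run each attempt on its own copy of workdir; commit the first success.

    Returns the index of the winning attempt, or -1 if none succeeded
    (in which case workdir is left untouched).
    """
    workdir = workdir.resolve()
    # Scratch space sits next to workdir so the final renames stay on one filesystem.
    scratch = Path(tempfile.mkdtemp(prefix=".branch-", dir=workdir.parent))
    try:
        for index, attempt in enumerate(attempts):
            snapshot = scratch / f"attempt-{index}"
            shutil.copytree(workdir, snapshot)  # isolated copy, not a real CoW snapshot
            try:
                succeeded = attempt(snapshot)
            except Exception:
                succeeded = False  # a crashing branch simply loses
            if succeeded:
                workdir.rename(scratch / "original")  # move the old tree aside
                snapshot.rename(workdir)              # promote the winner
                return index
        return -1
    finally:
        # Losing snapshots (and the replaced original) are discarded.
        shutil.rmtree(scratch, ignore_errors=True)
```

Each attempt only ever sees its own copy, so a failed or abandoned approach leaves no trace in the real working directory.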

That's the kind of infrastructure we need. Not "here's how to firewall OpenClaw" but "here's how the OS natively contains untrusted code that needs to do useful work."

Until then, we're stuck with bubblewrap scripts and hope.

If You've Already Run OpenClaw

If you installed OpenClaw and are now wondering what to do:

Uninstall it: npm uninstall -g openclaw and remove ~/.openclaw/
Rotate credentials: Any API keys, OAuth tokens, or passwords that were in ~/.openclaw/credentials/ or that you passed through the context window should be considered compromised. Rotate them.
Check for persistence: If you let it install as a launchd/systemd service, remove it. Check launchctl list (macOS) or systemctl --user list-units (Linux).
Audit ClawHub skills: If you installed any skills, assume they've seen everything you've worked on while they were active.
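Part of that checklist can be scripted. Here's a small sketch that looks for leftovers under a home directory. It assumes the ~/.openclaw/ state directory mentioned above and the standard per-user launchd and systemd directories. The idea that a service file would have "openclaw" in its name is a guess, not something confirmed, so treat an empty result as "nothing obvious" and not as "clean."

```python
from pathlib import Path


def find_openclaw_leftovers(home: Path) -> list[Path]:
    """List paths under `home` that suggest OpenClaw is, or was, installed."""
    found = []

    state_dir = home / ".openclaw"
    if state_dir.exists():
        found.append(state_dir)

    # Standard per-user service directories for launchd (macOS) and systemd (Linux).
    service_dirs = [
        home / "Library" / "LaunchAgents",
        home / ".config" / "systemd" / "user",
    ]
    for directory in service_dirs:
        if directory.is_dir():
            # Assumption: the service file name contains "openclaw".
            found.extend(
                sorted(p for p in directory.iterdir() if "openclaw" in p.name.lower())
            )
    return found
```

Run it as `find_openclaw_leftovers(Path.home())` and review whatever it returns by hand before deleting anything.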

The good news: OpenClaw doesn't (as far as we know) have built-in exfiltration. The bad news: it had full access to everything, and the skills marketplace had zero moderation.

The Bottom Line

OpenClaw isn't buggy. It's correct. It does exactly what it was designed to do: give an LLM persistent, broad system access so it can automate your workflow.

And that's exactly why it can't be made safe.

The AI agent security conversation needs to happen before more "helpful coding agents" ship with root access to your life. Not after. Not when the CVEs start rolling in. Not when Microsoft is publishing safety guides.

The Cline supply chain attack was the proof of concept. The next one won't be a proof of concept. It'll be a data breach.

Don't run OpenClaw. Don't run anything like it until the threat model changes. And if you're building AI agents: design for containment from day one, not as an afterthought.

The tools you give an agent to be useful are exactly the tools that make it useful to attackers. That's not a problem you can patch. It's a problem you have to architect around.

Thanks to the r/netsec and r/cybersecurity communities for the sharp analysis, and to StepSecurity for catching the Cline compromise before it spread further.