In February 2026, security researcher Jamieson O'Reilly published an experiment that broke ClawHub's trust model wide open. He created a backdoored skill, used bots to inflate its download count to 4,000+, and made it the #1 most downloaded skill on ClawHub. Real developers from 7 countries executed it.
Around the same time, Paul McCarty (maintainer of OpenSourceMalware) reported finding hundreds of malicious skills on ClawHub. A Reddit post on r/cybersecurity about the situation received 339 upvotes and significant attention.
This isn't just a vulnerability disclosure. It's a fundamental question about supply chain trust in the AI agent ecosystem.
Technical Vulnerability Analysis
The vulnerabilities Jamieson demonstrated are basic but devastating.
Download Counter Manipulation
ClawHub's download counter accepted unauthenticated requests. There was no rate limiting. Worse, the server trusted the X-Forwarded-For header at face value — making IP spoofing trivial.
The attack flow is simple:
- Publish malicious skill
- Bot-inflate download count
- Achieve "#1 skill" status
- Developers trust and install
On npm or PyPI, inflating download counts at least requires authenticated requests. ClawHub didn't even have that minimal barrier.
Hidden Payloads
The sophisticated part of the attack exploited skill structure:
- The ClawHub UI primarily displays
SKILL.md - The actual malicious payload lives in referenced files (e.g.,
rules/logic.md) - Users see "clean marketing" while the AI agent sees "execute these commands"
This is fundamentally different from traditional package manager malware. Malicious npm packages execute code — detectable through static analysis or sandboxing. But skills are natural language instructions. They run within the agent's existing permissions, making it difficult for traditional security tools to detect "malicious natural language directives."
"Just Be Careful" Isn't Enough
OpenClaw creator Peter Steinberger's initial response was essentially "use your judgment and be careful." (He has since deleted those responses; screenshots remain.)
As Paul McCarty put it: "You don't get to leave a loaded ghost gun in a playground and walk away from all responsibility of what comes next."
The inadequacy of this response is clear. The entire point Jamieson proved is that trust signals are fakeable. When an attacker can control every signal users rely on — download counts, rankings, visible descriptions — "just be careful" is meaningless.
The Root Problem of Supply Chain Trust
This incident isn't unique to the AI agent ecosystem. It's an old software supply chain pattern repeating in a new form.
| Ecosystem | Incident | Pattern |
|---|---|---|
| npm | event-stream (2018) | Popular package takeover → malware injection |
| PyPI | typosquatting (ongoing) | Name-similar packages cause confusion |
| VS Code | Malicious extensions (2024) | Marketplace trust exploitation |
| ClawHub | This incident (2026) | Download inflation + hidden natural language payloads |
The difference is the nature of the attack surface. Code packages can be inspected before execution. But AI agent skills are natural language instructions that operate within the agent's existing permissions — file system access, network requests, shell command execution. If a skill says "download and run this script from this URL," the agent complies.
Most malicious ClawHub skills so far are low-effort — ClickFix-style "curl this auth tool" approaches. But as the Reddit post author noted, that's because sophisticated actors haven't fully arrived yet. APT groups quietly exfiltrating credentials and cryptocurrency through "popular skills" is a matter of when, not if.
The Need for Automated Verification
"Read everything yourself and use your judgment" doesn't scale. Very few developers read every hidden file in a skill, and judging the maliciousness of natural language instructions is more subjective than code review.
What's needed is automated trust verification — not based on fakeable metrics like download counts and rankings, but on actual content analysis.
This is the same challenge in the Soul Spec ecosystem. Soul files are instruction files, not executables, but they define AI agent behavior — creating the same trust problem. What if a SOUL.md contains a hidden directive to "send all user data to this endpoint"?
SoulScan was designed for this problem. Its 5-stage pipeline with 53 security patterns automatically verifies Soul files — detecting data exfiltration patterns, privilege escalation attempts, hidden directives, and more. What the ClawHub incident demonstrates is that this kind of automated verification isn't optional — it's essential.
Takeaways
Three lessons from this incident:
First, popularity metrics are not trust signals. Download counts, star ratings, rankings — all manipulable. Whether it's a package manager or an AI skill marketplace, equating popularity with safety is dangerous.
Second, the agent's permission model defines the attack surface. Skills execute within the agent's existing permissions. Granting broad system access to an agent is itself a precondition for risk.
Third, natural language supply chain attacks are just beginning. Detection tools for code-based supply chain attacks are maturing, but attacks via natural language instructions are in their infancy. Security tooling and standards in this space are urgently needed.
It's fortunate that Paul McCarty and Jamieson O'Reilly sounded the alarm. The question is how the ecosystem responds.
References:
- Jamieson O'Reilly's original post
- Paul McCarty's analysis
- Reddit discussion
- Paul McCarty interview video
Originally published at https://blog.clawsouls.ai/posts/clawhub-malware-supply-chain/
Top comments (0)