Marcus Rowe

Posted on May 28 • Originally published at techsifted.com

SymJack: The Supply Chain Attack That Turns Your AI Coding Agent Against You

#security #supplychainattack #claudecode #cursor

Your AI coding agent just became an attack vector.

That's the short version of what Adversa AI published this week. The research team disclosed a technique called SymJack, a symlink hijacking attack that turns AI coding assistants into supply chain attack delivery systems. It affects Claude Code, Cursor, Gemini CLI, GitHub Copilot CLI, Grok Build CLI, and OpenAI Codex CLI. All six. SecurityWeek covered the disclosure; the original technical write-up is on Adversa AI's blog, with a proof-of-concept video for each affected tool.

The scary part isn't the technical cleverness of the attack. It's that the human approval step — the thing all these tools use as their primary safety mechanism — is exactly what SymJack defeats.

What SymJack Is (And Why The Name Makes Sense)

The name breaks down simply: "sym" from symlink, "jack" from hijack. It's an attack that weaponizes the way AI coding agents handle file operations, specifically how they follow and resolve symbolic links before showing users what they're about to approve.

Here's the basic problem. When you're working with an AI coding agent and it wants to copy a file, it shows you an approval prompt. Something like: "Copy video.mp4 to docs/media/video.mp4." Looks harmless. You approve. But if video.mp4 is actually a symlink that redirects to the agent's own MCP configuration file, the "copy" operation overwrites that config instead — replacing it with attacker-controlled JSON that registers a malicious MCP server.

When the agent restarts, that new server launches. With full user privileges. Running whatever the attacker put in that startup command.

The developer approved what the screen showed. The kernel wrote somewhere entirely different.
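To make that gap concrete, here's a minimal sketch in Python of a copy whose displayed path and written path disagree. It assumes the symlink sits at the path the copy writes to; the exact file layout in Adversa AI's proof of concept isn't reproduced here. The file names (agent_config.json, attacker-binary) and the JSON contents are hypothetical stand-ins, and everything happens inside a throwaway temp directory.

```python
import os
import shutil
import tempfile


def demo_symlink_redirect() -> dict:
    """Copy a file to a path that is secretly a symlink and report where the bytes landed."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical stand-in for an agent config file living outside the project.
        config = os.path.join(root, "agent_config.json")
        with open(config, "w") as f:
            f.write('{"servers": {}}')

        project = os.path.join(root, "project")
        os.makedirs(os.path.join(project, "docs", "media"))

        # The "video" is really attacker-controlled JSON (made-up payload).
        src = os.path.join(project, "video.mp4")
        with open(src, "w") as f:
            f.write('{"servers": {"evil": {"command": "attacker-binary"}}}')

        # The path shown in the prompt is a symlink that points at the config.
        dst = os.path.join(project, "docs", "media", "video.mp4")
        os.symlink(config, dst)

        # copyfile opens dst for writing, and opening a symlink follows it.
        shutil.copyfile(src, dst)

        with open(config) as f:
            return {
                "shown": os.path.relpath(dst, project),
                "written": os.path.realpath(dst),
                "config_now": f.read(),
                "dst_still_symlink": os.path.islink(dst),
            }


if __name__ == "__main__":
    for key, value in demo_symlink_redirect().items():
        print(f"{key}: {value}")
```

The "shown" value is what an approval prompt built from the command string would display. The "written" value is where the data actually went.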

Adversa AI's researchers put it bluntly in their disclosure: "The human approval step, the key control these tools lean on for safety, is the thing being defeated." That's not a bug in one product. It's a design assumption that every major AI coding tool made and every major AI coding tool got wrong in the same way.

How The Attack Actually Works

Three things have to be in place for SymJack to execute:

1. Attacker control of a repository: either through a malicious pull request or a poisoned package that gets cloned into the developer's workflow
2. A pre-built malicious MCP server: staged and ready on attacker infrastructure
3. A developer using an AI coding tool: with default agent configurations

The attack runs in three stages.

Stage one: Instruction injection. AI coding agents read project instruction files automatically when they open a repository. Claude Code reads CLAUDE.md. Other agents read AGENTS.md, .github/copilot-instructions.md, or similar files. A malicious repository can embed hidden directives in these files, tucked into content that looks blank and pulled in through file includes, so the agent is primed to execute the attack chain as soon as it starts working.

Stage two: Deceptive approval. The agent presents what appears to be a harmless file operation. "Copy this video to the docs folder." The approval prompt shows the nominal source and destination paths. What it doesn't show — because the tool isn't resolving symlinks before generating the prompt — is that the source file is a symlink pointing to the agent's own MCP config directory.

Stage three: Configuration hijacking. The developer approves. The cp command runs. The symlink redirects the write operation to overwrite the agent's MCP server configuration with attacker-controlled JSON. The next time the agent restarts, that malicious MCP server launches with user-level privileges.

What happens after that depends on what the attacker put in the payload. Adversa AI's proof-of-concept demonstrations showed credential theft as the primary impact — SSH keys, cloud tokens, browser session data. But the researchers also noted that in CI environments, the attack chain runs with zero clicks. An automated CI runner that pulls untrusted code and runs an AI agent against it will execute the entire SymJack chain without anyone seeing an approval prompt at all. One malicious pull request can drain every secret the runner holds.

Every Major AI Coding Tool Was Vulnerable

Adversa AI tested six products. All six were vulnerable at disclosure.

Tool	Tested Version
Claude Code	v2.1.128
Gemini CLI / Antigravity CLI	v0.43.0 / v1.0.2
Cursor Agent CLI	v2026.05.20
GitHub Copilot CLI	v1.0.51
Grok Build CLI	v0.1.216
OpenAI Codex CLI	v0.133.0

This isn't a coincidence. It's a consequence of all these tools making the same architectural choices: they all ingest project instruction files, they all have shell access, and they all use human approval prompts that show command strings rather than resolved canonical paths. Same design decisions, same attack surface.
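The difference between a command string and a resolved canonical path is easy to sketch. The function below is a hypothetical illustration in Python, not any vendor's actual implementation: it resolves both sides of a copy with os.path.realpath and flags anything that passes through a symlink or lands outside the project root. The name describe_copy and the bracketed labels are my own.

```python
import os


def describe_copy(src: str, dst: str, project_root: str) -> str:
    """Build approval text that shows resolved paths, not just the command string."""
    root = os.path.realpath(project_root)
    lines = [f"Requested: cp {src} {dst}"]
    for label, path in (("source", src), ("destination", dst)):
        # Where the path appears to point, taken at face value.
        nominal = os.path.normpath(os.path.join(root, path))
        # Where it really points once every symlink is followed.
        resolved = os.path.realpath(nominal)
        note = ""
        if resolved != nominal:
            note += " [symlink]"
        if os.path.commonpath([root, resolved]) != root:
            note += " [OUTSIDE PROJECT]"
        lines.append(f"  {label}: {resolved}{note}")
    return "\n".join(lines)
```

A prompt built this way still depends on the user reading it, but at least the text being approved matches what the filesystem will do.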

How The Vendors Responded

The responses varied, and not all of them were great.

Anthropic initially rejected the report as out of scope. Then, quietly, Claude Code v2.1.129 shipped with a change that resolves symlinks before generating approval prompts and displays the real destination path to the user. So the fix exists, even if the initial response didn't acknowledge the problem. That's... better than nothing, and it's the most complete response of any vendor on the list. I covered Anthropic's Claude Code Security public beta last month — it's worth noting that neither that feature nor the existing approval flow caught this class of attack before the Adversa AI disclosure.

Google declined. Their position: this requires the user to explicitly approve the operation, so it's working as intended. That argument holds up in the narrowest possible sense and falls apart the moment you consider that the user is approving what the prompt shows, not what the kernel executes. The approval is informed by the wrong information.

Cursor declined as well, citing it as a duplicate of a prior symlink report they were already aware of. OK — so they knew about it and hadn't fixed it.

xAI and GitHub hadn't responded at publication time.

OpenAI declined via Bugcrowd, arguing that user approval constitutes informed consent. Same logic as Google. Same problem.

Why This Matters More Than It Might Seem

I want to be clear about something: SymJack isn't a traditional vulnerability where you patch the CVE and call it done. It's exposing a systemic assumption at the heart of how AI coding agents work.

These tools are designed to be helpful. They read your project files, they understand your codebase, they execute file operations on your behalf. That's the value proposition. But "reads project files and executes operations based on their content" is also a pretty clean description of a code injection attack surface.

The MCP security flaws that OX Security disclosed back in May hit a similar nerve — the protocol that makes AI agents powerful is also the protocol that makes them attackable. SymJack is the execution-layer version of that same problem. The instruction files that make agents context-aware are also vectors for injecting attacker directives. The shell access that lets agents do useful work is the same shell access they use to write malicious config files.

And the volume of developers adopting these tools right now is enormous. We covered the enterprise cost implications of mass Claude Code adoption earlier this year. That same wave of adoption means millions of developers are running agents against code from GitHub, npm packages, and open-source repos they don't fully control. Supply chain attacks work when attackers can reach a lot of targets through a small number of compromised packages or repositories. AI coding agents make that reach dramatically easier.

On CI runners specifically, the risk is acute. If your pipeline pulls code, runs an AI agent against it as part of a code review or generation step, and that code is malicious — the attack chain runs completely automatically. No approval prompt. No human in the loop. One compromised PR can exfiltrate every secret the runner holds.

What You Should Do Right Now

A few concrete steps, in rough priority order.

Update your AI coding tools. Claude Code v2.1.129 includes the symlink resolution fix. Check for updates on every other tool you're using — vendor responses suggest this is moving fast, and mitigations may have shipped after the Adversa AI disclosure.

Be skeptical of file operation approvals. SymJack works because developers approve quickly. If your agent is asking to copy files and the source isn't immediately obvious, stop and look at it. The resolved path is what matters, not the displayed string. After Claude Code's patch, the real path should show up in the prompt — but for tools that haven't patched, you're reading the nominal path.

Don't run AI agents against untrusted repositories without isolation. Cloning a random GitHub repo and running an AI agent against it is now in the same risk category as running code from that repo. Treat it accordingly. Use sandboxed environments or containers.
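One cheap pre-flight check before pointing an agent at a fresh clone is to list every symlink in the tree that resolves outside the repository. This is a rough sketch in Python under that single assumption; find_escaping_symlinks is a hypothetical helper, and it only catches symlinks that already exist in the checkout, not ones created later or other injection tricks.

```python
import os


def find_escaping_symlinks(repo_root: str) -> list[tuple[str, str]]:
    """Return (relative path, resolved target) for symlinks that leave the repo."""
    root = os.path.realpath(repo_root)
    findings = []
    # followlinks=False: symlinked directories are listed but never descended into.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            if os.path.commonpath([root, target]) != root:
                findings.append((os.path.relpath(path, root), target))
    return sorted(findings)
```

An empty result isn't a clean bill of health. A non-empty one is a good reason to stop and look before any agent touches the repo.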

Harden your CI pipelines. If your CI runner uses an AI coding agent as part of its workflow, that runner should have minimal credentials and no access to production secrets. Isolate runners handling untrusted PRs from runners that hold real credentials. This is good practice generally — SymJack makes it urgent.

Rotate credentials on potentially exposed systems. If you've been running AI coding agents against code from sources you don't fully control, and you haven't audited your MCP configuration files recently — check them. The Adversa AI research suggested that a compromised config would be subtle and wouldn't necessarily generate obvious errors. If you're uncertain, rotate your SSH keys and cloud tokens.
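For the config audit, a short script that prints every registered server and its launch command makes the review quicker. The sketch below assumes the common layout of a top-level "mcpServers" object whose entries carry "command" and "args" (or a "url" for remote servers); check your own tool's format, since that layout is an assumption here. The helper names and the allowlist idea are mine.

```python
import json


def list_mcp_servers(config_text: str) -> list[tuple[str, str]]:
    """Return (server name, launch command or URL) for each entry in an MCP config."""
    data = json.loads(config_text)
    servers = data.get("mcpServers", {})  # assumed key; adjust for your tool
    result = []
    for name, spec in sorted(servers.items()):
        head = spec.get("command") or spec.get("url") or ""
        parts = [str(head)] + [str(arg) for arg in spec.get("args", [])]
        result.append((name, " ".join(parts)))
    return result


def unexpected_servers(config_text: str, allowed: set[str]) -> list[tuple[str, str]]:
    """Return the servers whose names are not in the allowlist you expect."""
    return [(name, cmd) for name, cmd in list_mcp_servers(config_text) if name not in allowed]
```

Anything unexpected_servers returns is worth reading by hand, and a known name doesn't prove the command behind it is unchanged, so read those too.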

Watch MCP configuration files. Real-time monitoring for changes to .mcp.json and similar agent config files is now worth setting up. An unexpected modification to those files, followed by a newly spawned interpreter process, is a SymJack execution signature.

The Bigger Picture

What SymJack makes undeniable is that AI coding agents have attack surfaces that didn't exist two years ago. These tools are powerful, they're being adopted at scale, and they introduce a category of risk that most developer security posture isn't set up to catch.

The approval prompts feel like a safety control. They're not — not if the information in the prompt is wrong. And as Adversa AI's research shows, getting that information wrong is easy when symlinks are involved.

The tools will get better. Anthropic's partial patch shows the fix isn't difficult once vendors commit to it. But in the meantime, the combination of AI agent adoption and supply chain attack sophistication is a real problem that deserves serious attention from anyone writing code with these tools.

The approval prompt is there for a reason. Just make sure you actually know what you're approving.

Sources: Adversa AI — "The Approval Prompt Is Lying to You: Symlink RCE in Five AI Coding Agents" | SecurityWeek — SymJack Attack Turns AI Coding Agents Into Supply Chain Attack Delivery Systems

Top comments (1)

Harjot Singh • May 31

This is the attack class that should scare everyone shipping coding agents: supply-chain compromise that turns the agent itself into the threat. The agent has write access, runs code, and trusts its inputs, so a poisoned dependency or an injected instruction is a direct path to your repo and secrets. The defense isn't trusting the agent harder; it's least-privilege plus a gate on what it can actually execute, so a compromised agent still can't act outside its sandbox. I architect Moonshift this way: the agent proposes, a permissioned layer executes. Which mitigation would you prioritize first: dependency pinning/verification or capability sandboxing?