DEV Community

Etairos.ai
Etairos.ai

Posted on • Originally published at thehackernews.com

Agentjacking: AI Coding Agents Tricked Into Running Malicious Code via Sentry Injection

TL;DR

  • what: Attackers inject crafted markdown into Sentry error events that AI coding agents interpret as legitimate diagnostic instructions and execute with developer privileges.
  • impact: Exposes Git credentials, environment variables, private repository URLs, and enables arbitrary code execution on developer machines with full user privileges while bypassing all security controls.
  • fix: Sentry activated a global content filter for specific payload strings but acknowledges the architectural flaw is 'technically not defensible'; organizations should audit DSN exposure and restrict AI agent MCP connections.
  • who: Development teams using AI coding agents (Claude Code, Cursor) with Sentry integration via Model Context Protocol are at immediate risk.

Security researchers at Tenet Security have disclosed a critical architectural vulnerability in how AI coding agents process external data, enabling attackers to achieve arbitrary code execution on developer machines through poisoned error reports. The attack, dubbed Agentjacking, exploits the trust relationship between Sentry's error-tracking platform and AI agents that consume its data via Model Context Protocol (MCP).

The impact is immediate and quantifiable: Tenet identified 2,388 organizations with exposed, injectable Sentry DSNs. In controlled testing against over 100 organizations, the attack achieved an 85% exploitation success rate across widely-used AI coding assistants including Claude Code and Cursor. The attack requires no phishing, no server compromise, and leaves no detectable malicious traffic.

Attack Mechanics: Weaponizing the Trust Chain

Agentjacking exploits a fundamental flaw at the intersection of Sentry's permissive event ingestion and AI agents' implicit trust in MCP-connected services. The attack leverages Sentry Data Source Names (DSNs)—public, write-only credentials embedded in websites for error reporting—as the initial attack vector.

According to researchers Ron Bobrov, Barak Sternberg, and Nevo Poran, the vulnerability exists because AI agents cannot distinguish between legitimate error events generated by actual application crashes and attacker-injected events. When an agent queries Sentry via MCP, it treats all returned data as trusted system output, creating a direct pathway to code execution.

Six-Step Exploitation Chain

The attack unfolds through a precise sequence that exploits the automation and trust inherent in AI-assisted development workflows:

  • Attacker locates a target organization's Sentry DSN from public sources (embedded in websites, client-side code)
  • Attacker crafts a malicious error event with carefully formatted markdown in the message field and context key names
  • Attacker sends the poisoned event to Sentry's ingest endpoint via POST request using the victim's DSN
  • When the Sentry MCP server returns this event to an AI agent, it renders as structured content visually identical to Sentry's legitimate system template
  • Developer issues a routine prompt like 'fix unresolved Sentry issues' to their AI coding agent
  • Agent executes the embedded malicious code with the developer's full system privileges

⚠️ Complete Security Control Bypass — Agentjacking bypasses EDR, WAF, IAM, VPN, Cloudflare, and firewalls because every action in the chain is authorized. The attacker never touches victim infrastructure—the malicious instruction arrives disguised as legitimate error resolution guidance that the agent executes as trusted diagnostic steps.

Data Exposure and Privilege Escalation

A successful Agentjacking attack exposes the full scope of developer access without requiring credential theft. Compromised data includes environment variables containing API keys and secrets, Git credentials with repository write access, private repository URLs revealing organizational structure, and developer identities that can be used for social engineering or supply chain attacks.

The executed code runs with the developer's complete system privileges—the same access level required for legitimate development work. This makes the attack particularly dangerous in environments where developers maintain elevated permissions for deployment automation or infrastructure management.

Vendor Response and Mitigation Gaps

Sentry's response to the disclosure highlights the challenge of securing AI integration points. The company acknowledged the issue but stated it is 'technically not defensible' from an architectural standpoint. Sentry activated a global content filter that blocks specific payload strings—a signature-based approach that researchers note can be trivially bypassed with payload variations.

Root Cause: Model Context Protocol Trust Model — The vulnerability exists at the protocol level. MCP allows AI agents to connect to external services and treat their responses as authoritative system data. Without cryptographic verification or content validation, agents cannot distinguish between legitimate service responses and attacker-controlled data injected through permissive ingestion endpoints.

Broader Implications for AI Agent Security

Tenet's research demonstrates that AI coding agents now represent a distinct attack surface. The vulnerability class extends beyond Sentry—any external service that accepts arbitrary input and connects to AI agents via MCP presents similar risk. As organizations accelerate AI agent adoption for development automation, the implicit trust model becomes a systemic weakness.

The researchers emphasize that traditional security controls fail because there is nothing malicious to detect. Network traffic is legitimate API communication. The executed code arrives through authorized channels. The agent's behavior follows its design parameters. Detection requires understanding the semantic content of AI agent instructions—a capability most security tools lack.

Immediate Defensive Measures

Organizations using AI coding agents should audit all MCP server connections and restrict agents to verified, internally-controlled services. Sentry DSNs should be rotated and monitored for injection attempts, though detection remains challenging. Development teams should implement code review requirements even for AI-generated fixes and restrict agent execution permissions using operating system-level controls.

The longer-term solution requires architectural changes to AI agent platforms. MCP implementations need content verification, cryptographic signing of service responses, and sandboxed execution environments that limit agent privileges. Until these controls exist, organizations must treat AI coding agents as high-privilege automation tools that require the same security rigor as CI/CD pipelines and deployment systems.

With 2,388 organizations already identified as exposed and an 85% exploitation success rate demonstrated, Agentjacking represents an active risk to development operations. The attack's ability to bypass all traditional security controls while requiring no sophisticated infrastructure makes it accessible to mid-tier threat actors. Security teams must evaluate AI agent deployments not as productivity tools but as privileged access pathways that attackers will target.


Originally published on RedEye Threat Intelligence.

Top comments (0)