The Miasma Worm: How AI Coding Agents Became a Supply Chain Attack Surface

#security #ai #devops #appsec

Microsoft just had 73 GitHub repositories — including the Azure Functions Action — disabled after a supply chain attack that didn't target developers directly. It targeted their AI coding agents.

The Miasma worm is a new class of threat. Understanding how it propagated, and why existing defenses missed it, matters for anyone running agentic CI/CD workflows today.

What Happened

The Miasma worm executed a supply chain attack specifically targeting AI coding agents operating inside CI/CD environments. Microsoft's Azure Functions Action and 72 other repositories were disabled as a result. The attack propagated malicious code across repositories by exploiting agentic AI workflows — the automated pipelines where AI coding assistants read code, call tools, make commits, and trigger further actions.

This wasn't a misconfigured secret or a phishing link. The AI agents themselves were the attack surface.

The full technical writeup is at StepSecurity's blog.

How the Attack Class Works

Agentic coding workflows have a fundamental trust problem. When an AI agent reads a file, processes a tool result, or receives output from an MCP server or CI step, it treats that content as ground truth. It's then expected to act on it — write a file, open a PR, run a command.

The Miasma worm exploited this. By poisoning content that AI agents would consume as tool results or context, it caused agents to propagate malicious changes across connected repositories. Each infected agent became a vector into the next repository it had write access to.

The worm dynamic is what makes this severe: one compromised input → agent takes action → that action poisons another repo → another agent reads it → repeat. No human in the loop at any step.

The Detection Gap

The tools that existed to stop this were all built for the pre-agentic world:

GitHub Actions security controls watch for known-malicious actions and enforce workflow permissions. They don't inspect the semantic content of what an AI agent has been told to do or why.

SAST/DAST tools scan code for vulnerabilities. They don't analyze whether the instruction that produced the code was itself adversarial.

Secrets managers prevent credential exposure. They don't detect when an agent has been manipulated into exfiltrating or misusing those credentials through a sequence of tool calls that individually look benign.

Container scanning checks images. It has no visibility into the prompt or tool result that caused the agent to modify the Dockerfile.

The gap: nothing was sitting between the tool result and the agent, asking is this content trying to hijack what the agent does next?

Where Sentinel Would Have Intervened

Sentinel's agentic_tool_abuse detection is exactly the layer that was missing here.

When an AI coding agent makes a tool call — reads a file, fetches a URL, processes a CI artifact — Sentinel's transparent proxy intercepts the tool result before it returns to the agent. It runs that content through all three detection layers:

Layer 1 (normalization) strips invisible Unicode characters, bidi overrides, and homoglyphs. Injections hidden in source files using Unicode tag blocks (U+E0000) or right-to-left overrides — a technique increasingly used to hide payloads in code — are defanged before pattern matching even starts.

Layer 2 (fast-path regex) catches high-confidence signatures: authority hijacks (ignore previous instructions, your new system prompt is), prompt extraction attempts, and persona shifts. If a poisoned README or workflow file contains these patterns, they're caught in microseconds.

Layer 3 (vector similarity) handles the subtler cases. Sentinel computes a semantic embedding of the tool result and compares it against our library of attack signature embeddings. A tool result engineered to manipulate agent behavior without using obvious keywords still has semantic similarity to known attack patterns. In strict mode, the flag threshold drops to 0.25 cosine similarity — catching borderline adversarial content before it reaches the agent's context window.

Layer 4 (secret detection) provides a second line of defense: even if the primary threat scorer scored a poisoned tool result as clean, any API keys, tokens, or credentials embedded in that content would be redacted before the agent ever saw them.

When a tool result is blocked, Sentinel's agentic proxy doesn't surface a Sentinel error to the agent. It substitutes the blocked content with an inert placeholder. The agent continues operating — it just never receives the weaponized payload.

Illustrative Config Example

This is what a Sentinel-protected agentic coding session looks like. Point your SDK at Sentinel instead of Anthropic directly:

import anthropic

# Redirect the Anthropic SDK through Sentinel's transparent proxy.
# Tool results are scanned automatically before returning to the agent.
client = anthropic.Anthropic(
    api_key="sk_live_...",   # Your Sentinel API key
    base_url="https://sentinel.ircnet.us/v1",
)

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    system="You are a coding assistant. You have access to read_file and run_tests tools.",
    messages=[{"role": "user", "content": "Review the CI workflow and check for issues."}],
)

No SDK changes beyond base_url and api_key. Sentinel handles the rest transparently.

When a poisoned tool result hits the agentic_tool_abuse detection, this is what fires internally (illustrative — actual field values depend on content):

{
  "request_id": "f7e3a1...",
  "security": {
    "action_taken": "blocked",
    "threat_score": 0.89,
    "matched_patterns": ["authority_hijack", "tool_abuse"],
    "secret_hits": 0,
    "secret_types": []
  },
  "safe_payload": null
}

action_taken: blocked means the agent receives an inert placeholder. safe_payload: null means there is no sanitized version to pass through — the content was too hostile to rehabilitate. The worm doesn't propagate.

For CI/CD pipelines where you want to log and alert rather than hard-block while you tune:

response = httpx.post(
    "https://sentinel.ircnet.us/v1/scrub",
    json={
        "content": tool_result_content,
        "tier": "strict"   # Lower flag threshold — catches borderline manipulation
    },
    headers={"X-Sentinel-Key": "sk_live_..."},
)

result = response.json()
action = result["security"]["action_taken"]

if action in ("blocked", "neutralized"):
    # Do not pass tool_result to agent
    agent_sees = "[Tool result unavailable — security policy]"
elif action == "flagged":
    # Alert, log, and decide per your policy
    alert_security_team(result)
    agent_sees = result["safe_payload"]
else:
    agent_sees = result["safe_payload"]

The Takeaway

The Miasma worm worked because agentic systems trust what their tools return. Every repository an agent had write access to was one poisoned tool result away from compromise.

Do this today: If you're running AI coding agents in CI/CD — GitHub Actions, Claude Code, any agentic workflow that reads external content and acts on it — put a scrub layer on every tool result before it returns to the agent. Not on the user prompt. On the tool output.

That's the gap Miasma exploited. It's also the gap that's trivial to close.

Sentinel is a self-hosted or SaaS AI firewall purpose-built for this class of threat. Starter tier is free, no credit card required.

👉 sentinel-proxy.skyblue-soft.com