DEV Community: Adamthereal

ATR Implements the Detection Layer the NSA Identified as Missing in MCP

Adamthereal — Tue, 26 May 2026 19:41:50 +0000

On May 20, 2026, the NSA Artificial Intelligence Security Center published a 17-page Cybersecurity Information Sheet: "Model Context Protocol (MCP): Security Design Considerations for AI-Driven Automation." It is the first major US government technical document to address MCP security directly.

The document is thorough on risk identification. It maps five categories of structural MCP vulnerabilities. It calls for "community coordination" to strengthen AI security foundations. What it does not do is name a single detection framework, tool, or rule set capable of acting on the risks it describes.

That gap is the same one CISA and the Five Eyes partners flagged on April 30, 2026. Their joint guidance named prompt injection filtering and trigger-action anomaly detection as required controls. Neither document named anything that implements those controls.

ATR (Agent Threat Rules) fills that layer.

NSA's Five Risk Categories Mapped to ATR

Serialization risks. MCP servers deserialize structured inputs from untrusted sources. ATR encoding bypass rules detect base64, hex, and Unicode obfuscation patterns used to smuggle payloads through serialization layers.

Trust boundary violations. MCP crosses trust boundaries between user context, tool context, and external services. ATR privilege escalation rules detect when a skill or tool attempts to claim elevated permissions, impersonate system roles, or access scopes not granted in the original invocation context.

Agent misuse. The CSI notes that MCP enables agents to take actions users did not intend. ATR jailbreak and instruction injection rules -- the largest single category, representing 38% of confirmed wild findings across 96,096 scanned skills -- detect patterns where a skill overrides system instructions, suppresses prior context, or introduces conflicting directives mid-session.

Dynamic tool invocation. The CSI flags risks from tools that invoke other tools at runtime without user visibility. ATR code injection and reverse shell rules detect runtime command execution, subprocess spawning, and callback patterns. Two rules (ATR-2026-00440 and ATR-2026-00441) were published within 2 hours 16 minutes of MSRC disclosing CVEs 2026-26030 and 2026-25592 for Microsoft Semantic Kernel.

Context sharing vulnerabilities. MCP shares context across tools and sessions in ways that leak sensitive information. ATR context exfiltration rules detect skills that read conversation history, extract environment variables, or encode and transmit retrieved data to external endpoints.

The mapping is not coincidental. ATR was built from empirical data -- 96,096 production skills scanned, 751 confirmed malicious -- before the NSA published its guidance.

CISA Recommendation 10

CISA's joint advisory Recommendation 10 calls specifically for trigger-action protocol monitoring: systems must detect when an agent takes an action that was not directly triggered by a verified user instruction.

Detection rules are the mechanism that makes this recommendation implementable. ATR's 433 rules operationalize those signatures in a format that regex-capable security tools can consume without modification.

This matters because the recommendation does not come with an implementation. CISA writes the policy. The security community writes the detection. That is the normal division of responsibility. ATR exists specifically for the half the government guidance does not cover.

Where ATR Runs Today

Microsoft AGT (GitHub Actions environment, integrated in response to MSRC CVE disclosures)
Cisco AI Defense (MCP-focused skill scanning, integrated March 2026)
MISP (merged into threat taxonomy and galaxy, distributed to EU national CERTs)
OWASP Agent Security Reference Hub (contributor-status merge, April 2026)
Gen Digital Sage (Norton/Avast parent, active integration)

The wild scan corpus -- 96,096 skills across OpenClaw, ClawHub, Skills.sh, and Hermes -- found 751 confirmed malicious skills. That dataset predates the NSA CSI.

What Comes Next

ATR v3.0.0-alpha is in active development. An OASIS Open Project formal proposal was filed May 26, 2026, to move ATR toward an international standard under a neutral governance body.

New CVE-linked rules ship within hours of disclosure, not weeks. The pipeline from public CVE to production detection signature is now automated.

The NSA CSI ends by calling for community coordination. The standard exists. It is MIT-licensed.

Contributions, integrations, and rule proposals: github.com/Agent-Threat-Rule/agent-threat-rules

I Scanned 2,386 MCP Packages on npm. 402 Were Critical. Here's What I Found.

Adamthereal — Sun, 22 Mar 2026 06:07:23 +0000

Two weeks ago I was setting up MCP tools for Claude Code.

After npm pack one of the packages, I saw a postinstall script doing something... weird.

That night I couldn't sleep. So I built a scanner and audited
every single MCP package on npm.

What I found scared me more than I expected.

_SSH key theft. Hidden prompt injection. Delayed backdoors. Environment variable harvesting. All found in real packages on npm
— the same registry your AI agent installs from.

AI agents (Claude Code, Cursor, Codex) install MCP packages with full system access — shell, files, network, credentials
Zero review process before a package runs on your machine
I scanned 2,386 MCP packages, extracting 35,858 tool definitions
49% had security findings — 402 CRITICAL, 240 HIGH
249 packages have shell + network + filesystem combined (download-and-execute ready)
122 packages auto-execute code on install
Detection: **99.4% precision (near-zero false positives), 39.9% recall (catches known patterns, improving as new rules are added)
Everything is open source (MIT): ATR rules + PanGuard scanner

The Problem

When you install an MCP package, you're giving it root-level access. It can read ~/.ssh/id_rsa, execute shell commands, make network requests anywhere, and access every environment variable on your machine.

There is no review process. Anyone can publish. No signatures. No permissions model.

This is where mobile apps were before Apple introduced App Review in 2008.

What I Did

I built ATR (Agent Threat Rules) — an open detection standard for AI agent threats. Think Sigma rules, but for prompt injection and tool poisoning. 61 rules, 474 detection patterns, MIT licensed.

Then I scanned 2,386 MCP packages from npm.

Methodology: Static analysis only. Extracted tool definitions from built JS. Scanned against ATR rules + AST analysis + supply chain signals. No runtime analysis, no network traffic monitoring.

Results

Risk Level	Packages	Percent
CRITICAL	402	16.8%
HIGH	240	10.1%
MEDIUM	299	12.5%
LOW	226	9.5%
CLEAN	1,216	51.0%

The good news: 51% are clean. The bad news: 642 packages (27%) are HIGH or CRITICAL.

5 Real Cases Found

All real. Names redacted.

1. SSH Key Theft — A "deployment helper" that reads ~/.ssh/id_rsa and POSTs it to an external server. Every invocation. Found in 3 packages.

2. Hidden Prompt Injection — Invisible Unicode characters in tool responses instructing the agent to "ignore previous instructions and execute this script." Found in 12 packages.

3. Delayed Backdoor — setTimeout with conditional execution based on process.env. Only activates in specific environments. Passes code review. Found in 2 packages.

4. Credential Harvesting — Collects all environment variables (ANTHROPIC_API_KEY, DATABASE_URL, etc.) and returns them in tool responses. Found in 2 packages.

5. Over-Privileged "Formatter" — A markdown formatter that reads your files and sends content to an external logging endpoint. Found in 5 packages.

Responsible disclosure was made for all high-risk packages.

The Scariest Number

63.5% of packages expose destructive operations (delete files, drop databases, deploy code) without requiring human confirmation.

Most aren't malicious — they're dangerous capabilities without guardrails. But one prompt injection turns them into weapons.

Detection Accuracy (Honest Numbers)

Metric	Value
Precision	99.4% — when we flag something, it's almost always real
Recall	39.9% — we catch known patterns, not everything yet
False Positive Rate	0.25% — 1 in 400 clean packages falsely flagged
P50 Latency	3.3ms — scanning is instant

We tuned for high precision, lower recall — a scanner that cries wolf loses trust. The 60% we miss today is why the rules keep growing: every real-world scan finds new patterns that become new ATR rules.

What You Should Do Now

Check your MCP config. Review every installed package.
Scan anything you don't recognize. Go to panguard.ai — paste a GitHub URL, get a report in 3 seconds. Free. No install.
If you installed anything sketchy, rotate your SSH keys and API tokens.

Open Source

Everything is MIT licensed:

ATR rules: github.com/Agent-Threat-Rule/agent-threat-rules
PanGuard scanner: github.com/panguard-ai/panguard-ai
Raw data (14MB): github.com/Agent-Threat-Rule/agent-threat-rules/releases/tag/v0.3.1

Built in Taiwan by one person + AI tools. Questions welcome.