AgentGraph

Posted on Mar 30

How to Audit Your MCP Servers for Security Risks

#ai #security #agents #webdev

Transparency note: This article was generated by the AgentGraph content bot. The technical content, architecture decisions, and code examples are real — we just want you to know how it was made.

TL;DR

Model Context Protocol (MCP) servers are becoming the connective tissue of agentic systems, but most teams ship them with zero security review. mcp-security-scan is a new open-source CLI and GitHub Action that statically and dynamically audits MCP servers for credential theft vectors, data exfiltration patterns, unsafe execution, and code obfuscation — outputting a 0–100 trust score that integrates with AgentGraph's verifiable identity infrastructure. If you're running MCP servers in production, you should be scanning them.

The Problem Nobody Talks About at MCP Stand-Up

You've wired up your AI agent to a dozen MCP servers. There's one for your filesystem, one for your database, one that calls your internal APIs, maybe one that someone on the team found on GitHub and "it just works." Your agent is productive. Your demos are impressive.

And you have absolutely no idea what those MCP servers are actually doing with the data they touch.

This isn't hypothetical. The MCP ecosystem is expanding faster than anyone's security review process. Servers are being published to npm, PyPI, and GitHub with varying degrees of care. Some are well-audited. Many are not. A few are actively malicious — and the tooling to distinguish between them has, until now, been essentially nonexistent.

The broader AI agent ecosystem is already showing us what happens when identity and trust get ignored at the infrastructure layer. The Moltbook breach exposed 35,000 emails and 1.5 million API tokens because a platform with 770K agents had zero identity verification. OpenClaw's skills marketplace catalogued 512 CVEs and found malware in roughly 12% of published skills. These aren't edge cases — they're what happens at scale when trust is bolted on after the fact.

MCP is at the same inflection point right now. Which is why we built mcp-security-scan.

What MCP Servers Actually Have Access To

Before getting into the scanner, it's worth being precise about the threat surface.

An MCP server is a process that your AI agent runtime trusts implicitly. When your agent calls a tool exposed by an MCP server, that server gets access to:

The tool's input arguments — which may contain PII, credentials, or business-sensitive data
Implicit filesystem access — if the server is running locally, it can read anything the process user can read
Network egress — an MCP server can make outbound HTTP calls to arbitrary endpoints
Execution context — servers with exec-style tools can run arbitrary shell commands

The protocol itself doesn't mandate any sandboxing. Your agent's trust in an MCP server is total and implicit unless you build controls around it. Most teams don't.

The attack patterns this enables fall into four categories that mcp-security-scan specifically looks for:

Credential theft — reading .env files, ~/.aws/credentials, SSH keys, or environment variables and exfiltrating them
Data exfiltration — piping tool inputs or filesystem reads to external endpoints
Unsafe execution — eval(), exec(), subprocess calls with unsanitized input, or shell injection vectors
Code obfuscation — base64-encoded payloads, dynamic require()/import(), or minified code hiding logic

Introducing mcp-security-scan

mcp-security-scan is an open-source CLI tool and GitHub Action (MIT license) that audits MCP servers across these four categories. The repo is at github.com/agentgraph-co/mcp-security-scan.

Installation

# npm
npm install -g mcp-security-scan

# or run directly
npx mcp-security-scan audit ./path/to/your/mcp-server

Basic Usage

# Audit a local MCP server directory
mcp-security-scan audit ./my-mcp-server

# Audit a published npm package
mcp-security-scan audit --package @myorg/my-mcp-server

# Audit with verbose output and JSON report
mcp-security-scan audit ./my-mcp-server --verbose --output report.json

A typical output looks like this:

mcp-security-scan v0.4.1
Auditing: ./my-mcp-server

[PASS] Credential access patterns .............. 0 findings
[WARN] Network egress patterns ................. 2 findings
  → src/tools/fetch.ts:47 — outbound fetch() with user-controlled URL
  → src/tools/fetch.ts:89 — response body logged before sanitization
[FAIL] Unsafe execution patterns ............... 1 finding
  → src/tools/shell.ts:23 — exec() called with unsanitized tool argument
[PASS] Code obfuscation ........................ 0 findings
[PASS] Filesystem access patterns .............. 0 findings

Trust Score: 61/100
Risk Level: MEDIUM

Full report: ./mcp-security-report.json

The Scanning Architecture

Here's where it gets interesting — and where we made some deliberate trade-offs.

Static Analysis Layer

The primary analysis pass is static. The scanner parses your server's source code into an AST using @typescript-eslint/parser (for TypeScript/JavaScript) and tree-sitter bindings for Python. It then runs a set of pattern matchers against the AST.

Why AST-based rather than regex? Because regex-based security scanning has a well-documented false positive problem. Consider:

// This is fine — reading a config file the server owns
const config = fs.readFileSync('./config.json');

// This is a credential theft vector — reading the user's AWS credentials
const creds = fs.readFileSync(path.join(os.homedir(), '.aws', 'credentials'));

A regex matching readFileSync flags both. An AST matcher that resolves the argument expression catches the second one specifically. We're not at 100% precision — static analysis never is — but the false positive rate is significantly lower than string matching.
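
The difference can be sketched in a few lines. This example uses the TypeScript compiler API (the typescript npm package) rather than @typescript-eslint/parser, purely to keep it short, so it should not be read as the scanner's actual matcher. The SENSITIVE_FRAGMENTS list and both function names are assumptions made for illustration, and the check only looks at string literals inside the first argument, with no data-flow analysis.

```typescript
import * as ts from "typescript";

// Path fragments treated as sensitive in this sketch (illustrative only,
// not the scanner's real rule set).
const SENSITIVE_FRAGMENTS = [".aws", ".ssh", ".env"];

// Collect every string literal that appears anywhere inside a node.
function stringLiteralsIn(node: ts.Node): string[] {
  const found: string[] = [];
  const visit = (n: ts.Node): void => {
    if (ts.isStringLiteralLike(n)) found.push(n.text);
    ts.forEachChild(n, visit);
  };
  visit(node);
  return found;
}

// Return the 1-based line numbers of `x.readFileSync(...)` calls whose first
// argument mentions a sensitive path fragment.
function findSensitiveReads(source: string): number[] {
  const file = ts.createSourceFile("input.ts", source, ts.ScriptTarget.Latest, true);
  const lines: number[] = [];
  const visit = (node: ts.Node): void => {
    if (
      ts.isCallExpression(node) &&
      ts.isPropertyAccessExpression(node.expression) &&
      node.expression.name.text === "readFileSync" &&
      node.arguments.length > 0
    ) {
      const literals = stringLiteralsIn(node.arguments[0]);
      if (literals.some((s) => SENSITIVE_FRAGMENTS.some((f) => s.includes(f)))) {
        const pos = file.getLineAndCharacterOfPosition(node.getStart(file));
        lines.push(pos.line + 1);
      }
    }
    ts.forEachChild(node, visit);
  };
  visit(file);
  return lines;
}
```

Run against the two readFileSync calls shown above, a matcher like this reports only the second one, because the first argument expression contains no sensitive fragment.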

Trade-off: AST parsing is slower and requires language-specific parsers. We currently support TypeScript, JavaScript, and Python. Rust and Go MCP servers aren't covered yet. This is a known gap — PRs welcome.

Dynamic Analysis Layer (Experimental)

For servers that can be safely instantiated, the scanner optionally runs a dynamic analysis pass. It spins up the MCP server in a sandboxed environment (using gVisor on Linux, a restricted Docker context elsewhere), sends a set of probe inputs designed to trigger common injection patterns, and monitors:

Outbound network connections (via strace/dtrace)
Filesystem reads outside the server's working directory
Child process spawning

# Enable dynamic analysis (requires Docker)
mcp-security-scan audit ./my-mcp-server --dynamic

# Specify a custom sandbox profile
mcp-security-scan audit ./my-mcp-server --dynamic --sandbox-profile strict

Trade-off: Dynamic analysis catches things static analysis misses — particularly obfuscated payloads that decode at runtime. But it's slower (adds 30–90 seconds per audit), requires Docker, and carries a non-zero risk if the server does something the sandbox doesn't contain. We default it off for this reason. For CI pipelines scanning trusted internal servers, it's worth enabling. For scanning third-party packages before adoption, it's essential.
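
One of the monitored signals, filesystem reads outside the server's working directory, comes down to a path containment check once the accessed paths have been captured. The following is a minimal sketch of that check under the assumption that the tracer yields plain path strings; the function name is hypothetical and this is not the scanner's own code. It assumes POSIX-style paths and does not resolve symlinks.

```typescript
import * as path from "node:path";

// Returns true when `accessed` (absolute, or relative to `workdir`) resolves
// to a location outside `workdir`. Symlinks are not resolved in this sketch.
function isOutsideWorkdir(workdir: string, accessed: string): boolean {
  const root = path.resolve(workdir);
  const target = path.resolve(root, accessed);
  const rel = path.relative(root, target);
  // A relative path that climbs out of the root starts with a ".." segment.
  // Comparing segments avoids the prefix bug where "/srv/mcp-other" would
  // look like it lives inside "/srv/mcp".
  return rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
}
```

Using path.relative instead of a string prefix comparison is the important detail: a naive startsWith check on the resolved path would treat a sibling directory with a similar name as contained.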

The Trust Score Algorithm

The 0–100 trust score is a weighted composite:

Category	Weight	Scoring
Credential access patterns	35%	Binary per finding, severity-weighted
Unsafe execution	30%	Binary per finding, severity-weighted
Data exfiltration patterns	20%	Binary per finding, severity-weighted
Code obfuscation	10%	Binary per finding
Dependency audit	5%	npm/pip audit results

Scores above 80 get a green badge. Scores from 60 to 80 get a yellow badge (review recommended). Scores below 60 get a red badge (do not use in production without remediation).

Honest caveat: The weighting is opinionated and based on our threat modelling. A server that makes outbound HTTP calls to a fixed, documented endpoint might score 70 and be completely fine. A server that scores 90 might have a vulnerability our patterns don't catch. The score is a signal, not a guarantee.
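
As a rough illustration of how a weighted composite like this behaves, the sketch below combines per-category sub-scores using the weights from the table and maps the result to the badge thresholds above. How each category's sub-score is derived from its findings is not specified here, so the 0 to 100 sub-score inputs are an assumption, as are the type and function names.

```typescript
type Category =
  | "credentialAccess"
  | "unsafeExecution"
  | "dataExfiltration"
  | "codeObfuscation"
  | "dependencyAudit";

// Weights from the table, expressed as whole percentages (they sum to 100).
const WEIGHTS: Record<Category, number> = {
  credentialAccess: 35,
  unsafeExecution: 30,
  dataExfiltration: 20,
  codeObfuscation: 10,
  dependencyAudit: 5,
};

// Assumption: each category has already been reduced to a 0-100 sub-score,
// where 100 means "no findings".
function trustScore(subScores: Record<Category, number>): number {
  let weightedSum = 0;
  for (const category of Object.keys(WEIGHTS) as Category[]) {
    const clamped = Math.min(100, Math.max(0, subScores[category]));
    weightedSum += WEIGHTS[category] * clamped;
  }
  return Math.round(weightedSum / 100);
}

// Badge thresholds as described: above 80 green, 60 to 80 yellow, below 60 red.
function badge(score: number): "green" | "yellow" | "red" {
  if (score > 80) return "green";
  if (score >= 60) return "yellow";
  return "red";
}
```

Under these assumptions, a server that is clean everywhere except for a zero sub-score in unsafe execution lands at 70, which is yellow. That shows how a single heavily weighted category can dominate the result.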

GitHub Action Integration

This is where mcp-security-scan becomes part of your actual development workflow rather than a one-time audit tool.

name: MCP Security Scan

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run MCP Security Scan
        uses: agentgraph-co/mcp-security-scan@v1
        with:
          path: './src/mcp-server'
          fail-on-score-below: 70
          enable-dynamic: true
          agentgraph-api-key: ${{ secrets.AGENTGRAPH_API_KEY }}

      - name: Upload Security Report
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: mcp-security-report
          path: mcp-security-report.json

The agentgraph-api-key parameter is optional. If you provide it, scan results are published to your AgentGraph trust profile — so your MCP server gets a verifiable, on-chain trust record that other teams and agents can query. If you don't provide it, the scan runs entirely locally.

AgentGraph Trust Integration

This is the part that goes beyond a standalone security tool.

When you connect mcp-security-scan to AgentGraph, your MCP server gets a W3C Decentralized Identifier (DID), a cryptographic identity anchored on-chain. Every scan result is recorded as an auditable event in the server's evolution trail. The trust score becomes queryable by any agent runtime that respects AgentGraph trust signals.

This matters because the security problem with MCP servers isn't just "is this server safe right now." It's "was it safe when it was published, has it changed since, and who has verified it." A static badge in a README answers none of those questions. An on-chain audit trail answers all of them.

The API integration looks like this:

import { readFileSync } from 'node:fs';
import { AgentGraphClient } from '@agentgraph/sdk';

const client = new AgentGraphClient({
  apiKey: process.env.AGENTGRAPH_API_KEY,
});

// Load the JSON report written by a previous `mcp-security-scan audit` run
const scanReport = JSON.parse(readFileSync('./mcp-security-report.json', 'utf8'));

// Publish a scan result to your MCP server's trust profile
const result = await client.trust.publishScanResult({
  did: 'did:agentgraph:mcp:your-server-id',
  scanner: 'mcp-security-scan',
  version: '0.4.1',
  score: 85, // example value; use the score from your own scan
  findings: scanReport.findings,
  timestamp: new Date().toISOString(),
  commitSha: process.env.GITHUB_SHA,
});

// Query the trust score for any MCP server before using it
const trustProfile = await client.trust.getProfile({
  did: 'did:agentgraph:mcp:third-party-server-id',
});

// Refuse to proceed when the server's latest score is below your threshold
if (trustProfile.latestScore < 70) {
  throw new Error(`MCP server trust score too low: ${trustProfile.latestScore}`);
}

Your agent runtime can gate tool registration on trust score. Untrusted MCP servers don't get loaded. This is the "blackwall between your AI agent and your filesystem" that's been getting attention in the community — implemented at the identity layer rather than the OS layer.
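
Gating tool registration on trust score could look roughly like the sketch below. It assumes the latest scores have already been fetched (for example with a profile lookup like the one above) into a map keyed by DID. The interface, the function name, and the fail-closed handling of unknown servers are illustrative choices, not part of the AgentGraph SDK.

```typescript
// Hypothetical description of an MCP server the runtime is about to load.
interface McpServerCandidate {
  name: string;
  did: string;
}

// Keep only the servers whose latest known score meets the threshold.
// Servers with no recorded score are rejected (fail closed).
function selectTrustedServers(
  candidates: McpServerCandidate[],
  latestScores: Map<string, number>,
  minScore = 70,
): McpServerCandidate[] {
  return candidates.filter((server) => {
    const score = latestScores.get(server.did);
    return score !== undefined && score >= minScore;
  });
}

// Usage: only the servers returned here would be registered with the agent.
const candidates: McpServerCandidate[] = [
  { name: "filesystem", did: "did:agentgraph:mcp:filesystem" },
  { name: "shell", did: "did:agentgraph:mcp:shell" },
  { name: "found-on-github", did: "did:agentgraph:mcp:unknown" },
];
const scores = new Map<string, number>([
  ["did:agentgraph:mcp:filesystem", 85],
  ["did:agentgraph:mcp:shell", 61],
]);
const trusted = selectTrustedServers(candidates, scores);
```

Failing closed on a missing score is a deliberate choice in this sketch: a server nobody has scanned is treated the same as one that scanned badly.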

Architecture Trade-offs We're Being Honest About

What we're good at:

Catching common, well-understood vulnerability patterns in TypeScript/JavaScript and Python MCP servers
CI/CD integration that makes security review automatic rather than aspirational
Trust score continuity — tracking a server's security posture over time, not just point-in-time

What we're not good at (yet):

Novel attack patterns. Static analysis is only as good as its rule set. We're building a community rule contribution process, but right now the patterns are what they are.
Compiled or obfuscated servers. If someone ships a pre-compiled binary as an MCP server, static analysis is largely useless. The dynamic analysis layer helps here, but it's not a complete solution.
Runtime behaviour that depends on external state. A server that's clean in isolation might behave differently when connected to a specific backend.
Language coverage. Rust, Go, and C++ MCP servers aren't scanned. This matters more as the ecosystem matures.

The honest framing: mcp-security-scan raises the floor significantly. It catches the obvious stuff — the exec() with unsanitized input, the credential file read, the undisclosed outbound webhook. It won't catch a sophisticated, targeted attack by someone who knows what our patterns look for. For that, you need human review. But "human review every MCP server" isn't happening at the pace the ecosystem is moving. Automated scanning that catches 80% of the obvious problems is a meaningful improvement over the current state of "nothing."

Getting Started

The fastest path to your first scan:

# Install
npm install -g mcp-security-scan

# Scan your server
mcp-security-scan audit ./your-mcp-server

# If you like what you see, add the GitHub Action
# and connect to AgentGraph for persistent trust records

The full documentation, rule reference, and contribution guide are at github.com/agentgraph-co/mcp-security-scan. The tool is MIT licensed — use it, fork it, contribute rules.

If you want the trust badge and on-chain audit trail, register at agentgraph.co. Early access is free, and verified MCP servers get a trust badge for their README.

Conclusion

The MCP ecosystem is at the same point the npm ecosystem was circa 2016 — enormous growth, genuine utility, and a security posture that ranges from "carefully considered" to "please don't look too closely." We've seen what happens when AI agent platforms scale without identity and trust infrastructure: breaches, malware in marketplaces, and a lot of exposed API tokens.

mcp-security-scan is a practical tool for the problem in front of you right now: you have MCP servers in production, you don't know what they're doing, and you need a systematic way to find out. Run it in CI. Fail builds on scores below your threshold. Publish results to a verifiable trust record.

The agents your system runs are only as trustworthy as the tools they use. Start auditing.
