Ana Jimenez Santamaria

Posted on May 31

Building a GitHub Stats MCP Server with Security Metrics

#github #mcp #security #tutorial

👋 This is the second chapter of a series where I document what I'm learning about Model Context Protocol architecture and tool implementations.

In Chapter 1, I built a simple Calculator MCP Server. This time, I connected my MCP server to an external API, added the two other MCP primitives (Resources and Prompts), and ended up with something useful for teams evaluating open source dependencies and ecosystem health: a security risk assessment tool powered by CHAOSS metrics that helps practitioners better interpret project health.

Let's first get into the theory and new concepts.

An Introduction to MCP Resources and Prompts

In the last post, I briefly mentioned MCP's three primitives:

Tools: give the AI agency; they are functions the LLM executes (e.g., get_repo_info())
Resources: provide safe, contextual data the LLM can read, such as a URL, a file, or an API response
Prompts: structure the conversation with expert context templates (e.g., "You are a data scientist analyzing CHAOSS metrics")

For this project, I needed all three: Tools to fetch GitHub data, Resources to load the CHAOSS guide, and Prompts to give the LLM the right expert context.

What is CHAOSS, and why does it matter for building GitHub Stats?

CHAOSS is a Linux Foundation project that develops metrics and frameworks for measuring the health of open source communities.

Their Practitioner Guides are particularly useful because they take complex community health topics, such as security, contributor sustainability, and responsiveness, and translate them into actionable metrics. These metrics can help anyone interpreting open source project data develop insights to improve the project’s health.

The guide I built my server around is the CHAOSS Security Practitioner Guide, which centers on three primary metrics:

OpenSSF Best Practices Badge: whether the project follows OpenSSF security best practices.
Libyears: how outdated the project's dependencies are.

Libyears concept: if your project uses a dependency that's 2 years behind its latest release, that's 2 libyears of lag. Studies show that projects with high libyears are 4x more likely to have security vulnerabilities.

Release Frequency: how often security fixes and updates land in a release. A project that releases rarely may have fixes in the code that users can't access yet.

These three metrics together give a starting point for assessing the security posture of any open source project, which is something that an Open Source Manager in the OSPO could need when evaluating dependencies or reporting to security teams.
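
To make the arithmetic behind two of these metrics concrete, here is a minimal sketch. The package names and dates are made up for illustration, and summing libyears across dependencies is an assumption about how a project-level total is usually reported, not something taken from my server's code.

```python
from datetime import date

DAYS_PER_YEAR = 365.25


def libyears(used_release: date, latest_release: date) -> float:
    """Years between the release in use and the newest release (never negative)."""
    lag_days = (latest_release - used_release).days
    return max(lag_days, 0) / DAYS_PER_YEAR


def mean_days_between_releases(release_dates: list) -> float:
    """Average gap in days between consecutive releases."""
    ordered = sorted(release_dates)
    if len(ordered) < 2:
        raise ValueError("need at least two releases to measure cadence")
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical dependencies: (release date in use, latest release date).
dependencies = {
    "pkg-a": (date(2022, 1, 1), date(2024, 1, 1)),  # about 2 years behind
    "pkg-b": (date(2023, 7, 1), date(2024, 1, 1)),  # about 6 months behind
    "pkg-c": (date(2024, 1, 1), date(2024, 1, 1)),  # up to date
}
total = sum(libyears(used, latest) for used, latest in dependencies.values())
print(f"total libyears: {total:.2f}")

# Hypothetical release dates for the project itself.
releases = [date(2024, 1, 1), date(2024, 1, 31), date(2024, 3, 1)]
print(f"mean days between releases: {mean_days_between_releases(releases):.1f}")
```

With these made-up dates the project carries roughly 2.5 libyears and ships about once every 30 days.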

1. Setting up the project

I used the same stack as Chapter 1 (Python, uv, FastMCP) with two extra dependencies:

httpx: modern Python HTTP client for calling the GitHub API
python-dotenv: for managing the GitHub token securely via a .env file.
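
As a rough sketch of the token handling, assuming the .env file holds a single GITHUB_TOKEN=... line and is kept out of version control: the helper name build_github_headers is hypothetical, and the real server simply builds the headers inline.

```python
import os
from typing import Optional


def build_github_headers(token: Optional[str]) -> dict:
    """Return GitHub REST API headers, or raise if the token is missing."""
    if not token:
        raise RuntimeError("GITHUB_TOKEN is not set; add it to .env or the environment")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }


# In server.py, python-dotenv's load_dotenv() would run before this lookup
# so that values from .env show up in os.environ.
token = os.getenv("GITHUB_TOKEN")
if token:
    headers = build_github_headers(token)
    print("token loaded, headers ready")
else:
    print("GITHUB_TOKEN missing; unauthenticated requests get a much lower rate limit")
```

Failing early on a missing token is a design choice: it is easier to debug than a string of 401 responses later in a chat session.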

2. The MCP Tools

The key difference from Chapter 1 is that the MCP server now talks to an external API. I added five tools aligned with the CHAOSS Security Practitioner Guide:

from mcp.server.fastmcp import FastMCP
import httpx, os
from dotenv import load_dotenv

load_dotenv()
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")
HEADERS = {
    "Authorization": f"Bearer {GITHUB_TOKEN}",
    "Accept": "application/vnd.github+json"
}

mcp = FastMCP("github-stats")

@mcp.tool()
def get_repo_info(owner: str, repo: str) -> dict:
    """Get basic info and stats for a GitHub repository"""
    ...

@mcp.tool()
def get_release_frequency(owner: str, repo: str) -> dict:
    """Get release frequency info"""
    ...

@mcp.tool()
def get_security_advisories(owner: str, repo: str) -> dict:
    """Get known security advisories for a GitHub repository"""
    ...

@mcp.tool()
def get_dependency_info(owner: str, repo: str) -> dict:
    """Check what dependency files exist in a repo"""
    ...

@mcp.tool()
def get_libyears(owner: str, repo: str) -> dict:
    """Calculate CHAOSS Libyears metric"""
    ...

The last tool, get_libyears, was the most interesting to add. It uses GitHub's Dependency Graph SBOM API to list all dependencies, then queries PyPI for each package to find how far the version in use lags behind the latest release. A high libyears score means outdated dependencies, which, according to CHAOSS, correlates with higher security risk.
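
Here is a sketch of the parsing half of that flow, under a few assumptions: the SBOM payload comes from GET /repos/{owner}/{repo}/dependency-graph/sbom, PyPI packages are identified by pkg:pypi/name@version references, and upload times come from PyPI's JSON API (https://pypi.org/pypi/<name>/json). The function names and sample payloads are made up, and the sketch ignores unpinned versions and purl qualifiers.

```python
from datetime import datetime
from typing import Optional

PYPI_PURL_PREFIX = "pkg:pypi/"


def pypi_packages_from_sbom(sbom_payload: dict) -> dict:
    """Map PyPI package name -> pinned version, read from SPDX purl references."""
    found = {}
    for package in sbom_payload.get("sbom", {}).get("packages", []):
        for ref in package.get("externalRefs", []):
            locator = ref.get("referenceLocator", "")
            if locator.startswith(PYPI_PURL_PREFIX) and "@" in locator:
                name, version = locator[len(PYPI_PURL_PREFIX):].split("@", 1)
                found[name] = version
    return found


def upload_time(pypi_payload: dict, version: str) -> Optional[datetime]:
    """Earliest upload time among the files published for one version."""
    files = pypi_payload.get("releases", {}).get(version, [])
    stamps = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for f in files
    ]
    return min(stamps) if stamps else None


def package_libyears(pypi_payload: dict, used_version: str) -> Optional[float]:
    """Years between the used version and the version PyPI reports as latest."""
    used = upload_time(pypi_payload, used_version)
    latest = upload_time(pypi_payload, pypi_payload["info"]["version"])
    if used is None or latest is None:
        return None  # unknown version, or a release with no uploaded files
    return max((latest - used).days, 0) / 365.25


# Made-up payloads shaped like the two API responses.
sbom = {"sbom": {"packages": [{"name": "examplepkg", "externalRefs": [
    {"referenceType": "purl", "referenceLocator": "pkg:pypi/examplepkg@1.0.0"}]}]}}
pypi = {"info": {"version": "2.0.0"}, "releases": {
    "1.0.0": [{"upload_time_iso_8601": "2022-01-01T00:00:00.000000Z"}],
    "2.0.0": [{"upload_time_iso_8601": "2024-01-01T00:00:00.000000Z"}]}}

for name, version in pypi_packages_from_sbom(sbom).items():
    print(name, version, package_libyears(pypi, version))
```

Returning None for packages that cannot be resolved keeps the total honest: the caller can report how many dependencies were skipped instead of silently counting them as zero.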

3. Adding Resources and Prompts

This is where MCP gets powerful beyond just tool calling.

Resource: the CHAOSS Security Guide

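import re

@mcp.resource("chaoss://security-guide")
def get_chaoss_security_guide() -> str:
    """CHAOSS Security Practitioner Guide content"""
    response = httpx.get("https://chaoss.community/practitioner-guide-security/")
    response.raise_for_status()  # fail loudly instead of returning an error page
    # Crude HTML-to-text: swap tags for spaces, then collapse whitespace
    text = re.sub(r'<[^>]+>', ' ', response.text)
    text = re.sub(r'\s+', ' ', text).strip()
    return text[:8000]  # cap the amount of text handed to the model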

This fetches the CHAOSS Security Practitioner Guide and makes it available to the LLM as a readable document. The host can load it into context when needed (so the model doesn't just have your tool results, it has the actual framework for interpreting them).

Prompt: the CHAOSS Security Analyst

@mcp.prompt()
def chaoss_security_analyst() -> str:
    """Expert prompt for CHAOSS security risk assessment from an OSPO perspective"""
    return """
You are a data scientist specializing in open source security metrics for OSPOs.

Your role is to analyze GitHub repositories using CHAOSS Security Practitioner Guide 
metrics and produce clear security risk assessments that OSPOs can present to their 
security teams.

When analyzing a repository, always:

1. INTERPRET metrics using these CHAOSS thresholds:
   Libyears: 0-2 = low risk, 2-5 = moderate risk, 5+ = high risk
   Release Frequency: weekly/monthly = healthy, quarterly or less = concern
   Security Advisories: 0 = good, any critical = immediate action needed

2. CONTEXTUALIZE results considering project size, team, and ecosystem.

3. STRUCTURE your output as a risk assessment report:
   Executive Summary, Metrics Summary, Risk Level, Recommendations, Why this matters
"""

The Prompt gives the LLM a specific expert role: a data scientist building a security report for an OSPO to present to its security team.
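
The libyears bands in the prompt are applied by the model, but they could also be checked deterministically. The helper below is a hypothetical sketch, not part of my server. Because the prompt's ranges overlap at 2 and 5, it assumes that exactly 2 counts as moderate and exactly 5 counts as high.

```python
def libyears_risk(libyears: float) -> str:
    """Map a libyears total to the risk bands named in the prompt."""
    if libyears < 0:
        raise ValueError("libyears cannot be negative")
    if libyears < 2:
        return "low"
    if libyears < 5:  # assumption: the boundary value 2 falls in this band
        return "moderate"
    return "high"  # assumption: the boundary value 5 falls in this band


for value in (0.5, 2.0, 4.9, 5.0, 12.3):
    print(value, libyears_risk(value))
```

A tool could attach this label to its result so the model's reading of the thresholds can be compared against a fixed rule.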

4. Testing in the MCP Inspector

Same workflow as Chapter 1: run server.py, open the Inspector, and verify that all 5 tools, 1 resource, and 1 prompt are visible and working.

5. Connect to Goose

I decided to test on a large, well-known open source ML framework.

I connected the server to Goose exactly as in Chapter 1 and asked: "Analyze [insert open source project name] security using all available GitHub-stats tools and give me a risk assessment with recommendations."

After some model experimentation (Llama 3.3-70b had trouble with multiple tools, Qwen3-32b worked well), Goose called all 4 tools, synthesized the results with the CHAOSS context, and produced this:

Security Risk Assessment — [open source project]

Security Advisories: 5 total (including 1 critical — arbitrary code execution when loading malicious [open source project] checkpoint with weights_only=True, published January 2026)
Release Frequency: 30 releases, latest v2.12.0 — active and healthy
Open Issues: 18,496 — high volume, warrants triage
Risk Level: HIGH

Recommendations:

🚨 Patch unresolved security advisories immediately
⚠️ Review high number of open issues — 32% untriaged
✅ Release cadence is healthy — keep it up

A few considerations when reading this test report:

A "HIGH" risk level here does not mean this is a bad or insecure project. For a project of the scale I was analyzing (hundreds of thousands of users, a dedicated security team, etc.), these numbers need to be interpreted in context.
The high open issue count reflects high community activity, not neglect. The disclosed security advisories are being actively managed and publicly reported, which is actually a sign of good security hygiene (projects that disclose vulnerabilities are more trustworthy than those that don't).
CHAOSS metrics are a starting point for conversation, not a final verdict. Security risk always depends on domain context (the same metric means something very different in a healthcare system versus a research prototype).

Final security decisions should always involve human judgment, domain expertise, and a deeper understanding of how the project is used in your specific context.

Bonus point: Showing this in a web app

The transport between Goose and our server is stdio, meaning they communicate via stdin/stdout pipes. This is perfect for local development, but it means the server only exists while our machine is running and Goose is open.

So what if I want to share this as an open web dashboard rather than just a local chat interface?

Streamable HTTP is the MCP transport that lets a server be exposed over HTTP rather than only as a local process over stdio. In this scenario, the client sends its messages to the server as HTTP POST requests. When the server needs to send multiple messages over time, it responds with an SSE (Server-Sent Events) stream.
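
To show what travels over that transport, here is a rough sketch of a JSON-RPC tools/call request as it would be POSTed, plus a very small parser for an SSE reply. The endpoint URL and helper names are hypothetical, the sample response is made up, and the sketch skips the initialize handshake and any session header a real client would need.

```python
import json

MCP_ENDPOINT = "http://localhost:8000/mcp"  # hypothetical URL, for illustration only


def build_tool_call(request_id: int, tool: str, arguments: dict) -> tuple:
    """Return (headers, body) for a JSON-RPC tools/call sent as an HTTP POST."""
    headers = {
        "Content-Type": "application/json",
        # The client advertises that it accepts a single JSON reply or an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })
    return headers, body


def parse_sse_messages(stream_text: str) -> list:
    """Decode the JSON payload of each SSE event (assumes \\n line endings)."""
    messages = []
    for event in stream_text.strip().split("\n\n"):
        data_lines = [
            line[5:].lstrip()
            for line in event.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            messages.append(json.loads("\n".join(data_lines)))
    return messages


headers, body = build_tool_call(
    1, "get_repo_info", {"owner": "example-org", "repo": "example-repo"}
)
print("POST", MCP_ENDPOINT)
print(body)

# Made-up SSE body of the kind a server might stream back.
sample = 'event: message\ndata: {"jsonrpc": "2.0", "id": 1, "result": {"content": []}}\n\n'
print(parse_sse_messages(sample))
```

In practice an MCP client library handles all of this, so the sketch is only meant to demystify the wire format.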

There are a few use cases that require that transport:

MCP Remote: the server is deployed to the cloud and accessible via a public URL (e.g., this is already how the connectors in claude.ai work when you connect to Asana, Google Drive, etc.)
MCP Apps: browser UIs that connect directly to your MCP server without needing a desktop AI host at all. This is a new experimental feature in the MCP Inspector.

If you're curious about where MCP's official transport layers need to evolve for edge environments (like manufacturing, healthcare, or automotive), where latency and connectivity can't be taken for granted, this talk explores Binary MCP with Protocol Buffers for constrained networks.

What's coming in Chapter 3

We will be exploring how to build a cloud-deployed version of our MCP server while adding a security layer using OpenID Connect.

Top comments (1)

Truong Bui • Jun 1

The CHAOSS metric layer is a useful angle for OSPO conversations because Libyears and release frequency translate cleanly into risk that procurement and security teams already track. The detail I'd flag is that the MCP server you just built is itself the layer above the libraries it's scoring — an OSPO using your server to score dependencies has to trust the server's tool descriptions, the API call shape, and the resource fetcher, all of which are now part of their agent's authority layer.

There's a recursive supply-chain question hiding there. The CHAOSS metrics tell you a dependency is risky; they don't tell you whether the MCP server interpreting those metrics is safe to install. Tool descriptions are loaded directly into the model's context at session start, so a poisoned description in any MCP server has the same blast radius as the credentials it sees during the session.

Worth scanning your own server (and any others you install during the Chapter 3 build) for the same reason you're scoring the dependencies it queries. We built mcpsafe.io for exactly this gap — free pre-install scanner that takes a GitHub URL or npm/PyPI package and returns AIVSS-scored findings, line-level. Across 508 public MCP servers we've scanned, 22% had hardcoded secrets and 23% had at least one finding rated 7.0+. Those numbers map roughly to the high-libyear bucket you described — same risk shape, one layer up.

Looking forward to Chapter 3 — the OIDC layer is where the conversation about MCP server identity vs MCP server safety actually starts to bite.