GitHub tells you the "primary language" of a repo. That label is... incomplete.
I've been building a tool that pulls structured data from any GitHub repo — language breakdown, dependencies, health metrics, CI setup, security signals — and decided to point it at some of the most popular frameworks on the planet.
A few things caught me off guard.
The Data
| Framework | Stars | Open Issues | Contributors | Hidden Detail |
|---|---|---|---|---|
| Next.js | 137.6K | 3,298 | 4,037 | 13% Rust (Turbopack) |
| Go | 132.3K | — | — | 5.4% Assembly (runtime) |
| FastAPI | 94.9K | 168 | 886 | Only 5 direct dependencies |
| Django | 86.7K | 405 | 3,383 | Zero Docker config |
| Flask | 71.1K | 3 | 862 | 3 issues. That's it. |
| Express | 68.7K | 183 | 373 | 100% JavaScript |
| Rails | 58.2K | 1,457 | 6,906 | Most contributors of any framework |
| LangChain | — | 407 | 3,631 | 3,631 contributors in 2 years |
The Surprises
Next.js is 13% Rust
Not a typo. Turbopack — the Rust-based bundler — accounts for a significant chunk of the Next.js codebase. GitHub labels it "JavaScript" because that's the plurality language, but the reality is JS (56%) + TypeScript (30%) + Rust (13%).
If you're evaluating Next.js as a dependency, "it's a JavaScript framework" is technically true but practically misleading. Turbopack means Rust-built native code is part of your build chain.
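To make the arithmetic behind that breakdown concrete: GitHub's REST API reports a byte count per language for each repo (the `/repos/{owner}/{repo}/languages` endpoint), and a percentage is just each count divided by the total. The sketch below assumes that input shape. The byte counts are made-up round numbers chosen to reproduce the percentages above, not real measurements of the Next.js repo, and `language_breakdown` is a hypothetical helper, not part of any tool mentioned here.

```python
def language_breakdown(byte_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-language byte counts into percentages, largest first."""
    total = sum(byte_counts.values())
    if total == 0:
        return {}
    ranked = sorted(byte_counts.items(), key=lambda kv: kv[1], reverse=True)
    return {lang: round(100 * size / total, 1) for lang, size in ranked}


# Illustrative numbers only (not real byte counts for vercel/next.js).
sample = {"JavaScript": 5600, "TypeScript": 3000, "Rust": 1300, "Other": 100}

breakdown = language_breakdown(sample)
primary = next(iter(breakdown))  # the single label a repo page would show

for lang, pct in breakdown.items():
    print(f"{lang:<12}{pct:>5.1f}%")
print("primary language:", primary)
```

The "primary language" is simply the first entry after sorting, which is why a 13% share of anything never makes it into the label.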
Flask Has 3 Open Issues
Three. On a 71,000-star project.
Compare that to Next.js (3,298) or Rails (1,457). The Pallets team runs one of the tightest ships in open source. Some people see low issue counts and think "dead project." Flask proves that's wrong: it's just aggressively managed.
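Raw issue counts favor small projects, so one rough way to compare is open issues per thousand stars. This is only a sketch: the stars and issue counts are copied from the table above (with "K" expanded), and `issues_per_1k_stars` is a hypothetical helper, not a RepoCrunch metric.

```python
# (stars, open issues) copied from the table above.
repos = {
    "Next.js": (137_600, 3_298),
    "Rails": (58_200, 1_457),
    "Flask": (71_100, 3),
}


def issues_per_1k_stars(stars: int, open_issues: int) -> float:
    """Open issues per thousand stars, a rough size-adjusted backlog figure."""
    return round(1000 * open_issues / stars, 2)


rates = {name: issues_per_1k_stars(*row) for name, row in repos.items()}
for name, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{name:<8}{rate:>6.2f} open issues per 1K stars")
```

Flask comes out well under one open issue per thousand stars, while Next.js and Rails both land in the twenties, so the gap is not just a matter of project size.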
Express Is Still 100% JavaScript
No TypeScript migration. No Rust rewrite. No drama. Sixteen years, 68K stars, pure JavaScript.
Every new Node framework launches in TypeScript these days. Express just... doesn't. And it works fine. Sometimes boring is the feature.
Go Is 5.4% Assembly
Makes total sense for a language runtime and compiler, but GitHub's "primary language" label hides it. You'd never know from that label that a meaningful chunk of Go is hand-written assembly for the runtime and other low-level, performance-critical routines.
FastAPI Has Only 5 Direct Dependencies
For a full-featured web framework, that's remarkably lean: Starlette, Pydantic, typing-extensions, typing-inspection, annotated-doc. Minimal dependency surface = fewer supply chain risks.
Rails Has the Most Contributors by Far
6,906 contributors — nearly double LangChain (3,631) despite Rails being 18 years old. That's not just open source; that's a civilization.
How I Got This Data
I built RepoCrunch — a Python CLI that analyzes any GitHub repo and returns structured JSON. No AI, no hallucination, just deterministic data from the GitHub API.
```bash
pip install repocrunch
repocrunch analyze vercel/next.js --pretty
```
It returns everything you see above: language breakdown, dependency count, CI setup, maintenance status, security signals.
Why Structured Output Matters
The real reason I built this: I wanted my AI coding tools to have factual repo data instead of guessing. RepoCrunch has a built-in MCP server:
```bash
repocrunch mcp
```
Add it to your Claude Desktop or Cursor config, and your AI assistant can answer "is this repo safe to depend on?" with actual data instead of its training weights.
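For reference, a config entry for an MCP client might look roughly like what this sketch prints. It assumes `repocrunch mcp` starts a server that talks over stdio, and it uses the `mcpServers` layout that Claude Desktop's config file uses for such servers; the `"repocrunch"` label is arbitrary, and the project's own README is the place to confirm the exact setup.

```python
import json

# Assumption: `repocrunch mcp` starts a server that speaks MCP over stdio,
# so the client only needs a command and its arguments.
server_entry = {"command": "repocrunch", "args": ["mcp"]}

# "repocrunch" is just a label chosen for this example.
config = {"mcpServers": {"repocrunch": server_entry}}

print(json.dumps(config, indent=2))
```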
Sample output (FastAPI):
```json
{
  "summary": {
    "stars": 94912,
    "license": "MIT",
    "primary_language": "Python"
  },
  "tech_stack": {
    "framework": "Starlette",
    "package_manager": "pdm",
    "key_deps": ["starlette", "pydantic"]
  },
  "health": {
    "open_issues": 168,
    "contributors": 886,
    "commit_frequency": "daily",
    "maintenance_status": "actively_maintained"
  }
}
```
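Because the output is plain JSON, a script can act on it directly. The sketch below embeds the sample output above as a string so it runs without the CLI or network access. The `looks_healthy` rule and its 500-issue threshold are hypothetical, just one way a team might turn these fields into a yes/no check.

```python
import json

# The sample output shown above, embedded so the sketch is self-contained.
SAMPLE = """
{
  "summary": {"stars": 94912, "license": "MIT", "primary_language": "Python"},
  "tech_stack": {
    "framework": "Starlette",
    "package_manager": "pdm",
    "key_deps": ["starlette", "pydantic"]
  },
  "health": {
    "open_issues": 168,
    "contributors": 886,
    "commit_frequency": "daily",
    "maintenance_status": "actively_maintained"
  }
}
"""


def looks_healthy(report: dict, max_open_issues: int = 500) -> bool:
    """Hypothetical rule: actively maintained and a bounded issue backlog."""
    health = report.get("health", {})
    return (
        health.get("maintenance_status") == "actively_maintained"
        and health.get("open_issues", 0) <= max_open_issues
    )


report = json.loads(SAMPLE)
print(report["summary"]["primary_language"], looks_healthy(report))
```

A missing `health` section fails the check rather than raising, which is usually the safer default when the data drives an automated decision.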
What's Next
I'm thinking about running this against the top 50 Python packages and publishing the results. If there's interest, I'll do a follow-up with more repos and more surprising findings.
Curious what metrics you'd actually want to see when evaluating a dependency. Stars? Commit frequency? Something else entirely?
Links:
- GitHub: kimwwk/repocrunch
- PyPI: pip install repocrunch
- MCP mode: repocrunch mcp (for Claude/Cursor integration)