Why I Stopped Trusting PR Line Counts (And Built a Complexity Scorer)
A 200-file refactor and a 10-line bugfix look identical in GitHub's pull request list. That's a problem.
A while back, my team shipped a nasty regression. The root cause? A massive refactor PR had been sitting in the queue for three days, buried between tiny bugfixes and dependency updates. Nobody spotted it. Everyone assumed "it's just another PR."
But it wasn't. It was 2,847 lines across 56 files, touching core business logic, authentication flows, and the database layer. In GitHub's UI, it looked like this:
chore: upgrade to Node 22 LTS +2,847 −198
fix: token expiry not refreshing on 401 +45 −12
docs: update README +8 −2
Three lines. Equal visual weight. One of them is a landmine.
The Problem: Flat PR Lists Don't Scale
GitHub shows you:
- Title
- Author
- Labels
- Line count
- Last updated
That's useful for small teams. But when you're tracking 50+ open PRs across multiple repos, you need to know which ones are dangerous and which ones need your eyes right now. Line count is a terrible proxy for risk. Here's why:
A 500-line change to package-lock.json is trivial. It's generated, mechanical, and safe to approve.
A 50-line change to your auth middleware is terrifying. It's small, surgical, and high-stakes.
Yet in the PR list, the lockfile change looks "bigger." Your brain learns to ignore large numbers, so the genuinely risky PRs get lost in the noise.
What I Wanted
I wanted a dashboard that could:
- Score every PR for complexity — not by raw lines, but by what actually changed
- Show me only PRs that need my review — filter out my own PRs, drafts, and things I've already reviewed
- Surface blockers instantly — build failures, merge conflicts, changes requested, all in one view
- Work across repos — one view for everything my team owns
- Surface failing CI/CD — see which workflows are failing across branches without leaving the dashboard
So I built ReviewRadar.
The Complexity Score: How It Works
The core idea is simple: weight every changed file by its type, then combine churn, file count, and rewrite intensity into a single 0-100 score.
Step 1: File Relevance
Not all files are equal. A generated lockfile shouldn't count the same as your auth service.
| File type | Weight | Examples |
|---|---|---|
| Source code & tests | 1.0 | .ts, .tsx, .py, .go, .rs |
| Config & infra | 0.5 | .yml, .json, Dockerfile, .tf |
| Documentation | 0.1 | .md, .rst, .txt |
| Generated / binary | 0.0 | node_modules/, .lock, images, .min.js |
This means a PR that changes package-lock.json + 3 source files gets scored on those 3 source files alone. With a weight of 0.0, the 500-line lockfile contributes nothing.
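Here's a minimal sketch of how that lookup could work. The weights come from the table above; the matching rules (the regexes, the explicit `package-lock.json` check, and the 0.5 fallback for unlisted file types) are my assumptions for illustration, not ReviewRadar's actual code.

```typescript
type FileCategory = "generated" | "source" | "config" | "docs" | "other";

// Weights from the table; "other" is an assumed fallback for unlisted types.
const WEIGHTS: Record<FileCategory, number> = {
  generated: 0.0,
  source: 1.0,
  config: 0.5,
  docs: 0.1,
  other: 0.5,
};

function classifyFile(path: string): FileCategory {
  const name = path.split("/").pop() ?? path;
  // Generated and binary files are checked first, so package-lock.json
  // is not mistaken for ordinary .json config.
  if (
    /(^|\/)node_modules\//.test(path) ||
    /\.lock$|\.min\.js$|\.(png|jpe?g|gif|ico|webp)$/i.test(name) ||
    name === "package-lock.json"
  ) {
    return "generated";
  }
  if (/\.(ts|tsx|py|go|rs)$/.test(name)) return "source";
  if (/\.(ya?ml|json|tf)$/.test(name) || name === "Dockerfile") return "config";
  if (/\.(md|rst|txt)$/.test(name)) return "docs";
  return "other";
}

function fileWeight(path: string): number {
  return WEIGHTS[classifyFile(path)];
}
```

The order of the checks matters: a lockfile that ends in `.json` has to be caught by the generated rule before the config rule sees it.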
Step 2: Weighted Churn
For each relevant file, compute:
churn = max(additions, deletions) × weight
Per-file churn is capped at 500 lines to prevent a single enormous file from dominating the score.
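As a sketch, the per-file calculation might look like this. I'm assuming the 500-line cap applies to the raw line count before the weight is applied; the `FileChange` shape is hypothetical.

```typescript
// Hypothetical shape for one changed file in a PR.
interface FileChange {
  additions: number;
  deletions: number;
  weight: number; // relevance weight from Step 1
}

const PER_FILE_CAP = 500;

// churn = max(additions, deletions) × weight, with the raw count capped first
// (assumption: the cap is applied before weighting).
function fileChurn(file: FileChange): number {
  const rawLines = Math.max(file.additions, file.deletions);
  return Math.min(rawLines, PER_FILE_CAP) * file.weight;
}

// Sum the per-file churn across the whole PR.
function weightedChurn(files: FileChange[]): number {
  return files.reduce((sum, file) => sum + fileChurn(file), 0);
}
```

With this version, a 2,000-line config file (weight 0.5) tops out at 250, no matter how much bigger it gets.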
Step 3: File Spread
More files = more context to hold in your head = more complexity. We add a logarithmic file-count factor:
spread = ln(1 + relevantFiles) × 10
This means going from 5 → 15 files matters more than going from 50 → 60.
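To make the diminishing returns concrete, here's the spread formula as a small function, with the two jumps from the sentence above worked out:

```typescript
// spread = ln(1 + relevantFiles) × 10
function spread(relevantFiles: number): number {
  return Math.log(1 + relevantFiles) * 10;
}

// Both jumps add 10 files, but the logarithm flattens the second one.
const smallPrJump = spread(15) - spread(5); // 10 × ln(16 / 6), about 9.8
const largePrJump = spread(60) - spread(50); // 10 × ln(61 / 51), about 1.8

console.log(smallPrJump.toFixed(1), largePrJump.toFixed(1));
```

The same ten extra files are worth roughly five times more points on a small PR than on an already sprawling one.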
Step 4: Rewrite Intensity
A PR that deletes 1,000 lines and adds 1,000 lines is a rewrite, not a refactor. We measure this with:
intensity = ln(1 + totalChurn) × (churnRatio / (1 + churnRatio))
Where churnRatio = totalChurn / (1 + |netChange|). High ratio = high rewrite intensity = higher complexity.
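A sketch of the intensity term, under two assumptions the formulas above don't spell out: totalChurn is additions plus deletions, and netChange is additions minus deletions.

```typescript
function rewriteIntensity(additions: number, deletions: number): number {
  const totalChurn = additions + deletions; // assumption: adds + dels
  const netChange = additions - deletions; // assumption: adds - dels
  const churnRatio = totalChurn / (1 + Math.abs(netChange));
  // churnRatio / (1 + churnRatio) squashes the ratio into the range [0, 1),
  // so ln(1 + totalChurn) acts as the upper bound on intensity.
  return Math.log(1 + totalChurn) * (churnRatio / (1 + churnRatio));
}

const rewrite = rewriteIntensity(1000, 1000); // net change 0, ratio 2000
const pureAddition = rewriteIntensity(2000, 0); // same churn, ratio just under 1

console.log(rewrite.toFixed(1), pureAddition.toFixed(1));
```

Both PRs touch 2,000 lines, but the rewrite comes out at roughly 7.6 versus roughly 3.8 for the pure addition, so replacing code counts about twice as much as adding it.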
The Final Formula
score = ln(1 + weightedChurn) × 5.5
+ ln(1 + relevantFiles) × 10
+ 5 × intensity
The result is a 0-100 score with clear colour bands:
| Score | Colour | Meaning |
|---|---|---|
| < 15 | 🟢 Green | Trivial — safe to approve quickly |
| 15-29 | 🟢 Green | Small — quick review |
| 30-49 | 🔵 Cyan | Medium — standard review |
| 50-69 | 🟠 Amber | Large — careful review needed |
| 70-89 | 🔴 Red | Complex — significant risk |
| 90+ | 🔴 Red | Very complex — break it up |
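Putting the formula and the bands together, a minimal sketch could look like this. The formula itself has no upper bound, so I'm assuming the result is clamped and rounded to stay within 0-100; the `ScoreInputs` shape and function names are hypothetical.

```typescript
// Hypothetical input shape: the three quantities from Steps 2-4.
interface ScoreInputs {
  weightedChurn: number;
  relevantFiles: number;
  intensity: number;
}

function complexityScore({ weightedChurn, relevantFiles, intensity }: ScoreInputs): number {
  const raw =
    Math.log(1 + weightedChurn) * 5.5 +
    Math.log(1 + relevantFiles) * 10 +
    5 * intensity;
  // Assumption: clamp and round so the score stays in the 0-100 range.
  return Math.round(Math.min(100, Math.max(0, raw)));
}

type Band = "Trivial" | "Small" | "Medium" | "Large" | "Complex" | "Very complex";

// Thresholds taken from the colour-band table.
function bandFor(score: number): Band {
  if (score < 15) return "Trivial";
  if (score < 30) return "Small";
  if (score < 50) return "Medium";
  if (score < 70) return "Large";
  if (score < 90) return "Complex";
  return "Very complex";
}
```

Keeping the band lookup separate from the score makes it easy to tune the thresholds per team without touching the formula.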
Real-World Calibration
I tested this against real PRs from my team's repos:
| PR | Files | Lines | Score | Label |
|---|---|---|---|---|
| Bugfix: null check | 3 | +45 / −12 | 18 | Small |
| Feature: webhook listener | 8 | +124 / −18 | 42 | Medium |
| Dark mode toggle | 14 | +312 / −84 | 55 | Large |
| Node upgrade refactor | 56 | +2,847 / −198 | 73 | Complex |
| Monorepo migration | 300 | +6,200 / −1,800 | 94 | Very Complex |
The scores spread nicely across the 0-100 range without clustering at the top. A 300-file PR scores ~94. A typical 10-file, 3,000-line PR scores ~55-60. The scoring feels right.
Zero Backend Philosophy (With a Server-Side Twist)
One of my non-negotiables: I didn't want to run infrastructure — no databases, no long-running servers, no ops burden.
ReviewRadar is a Next.js app that:
- Is static-exported to Cloudflare Pages (zero server costs)
- Uses a Cloudflare Worker to proxy all GitHub API calls
- Your token is encrypted with AES-GCM and stored in an HttpOnly, Secure, SameSite=Strict cookie: never in localStorage, never visible to client-side JavaScript
- The Worker validates every request against a strict path allowlist; unapproved endpoints are rejected with a 403 before they reach GitHub
This means:
- ✅ No signup required (OAuth or PAT)
- ✅ Your token is encrypted at rest, and page scripts (including injected ones) can't read the HttpOnly cookie
- ✅ No infrastructure to manage
- ✅ Deployed to Cloudflare Pages — global edge network, zero cold starts
How it works: When you sign in, your token is sent once to the Worker (POST /api/session). The Worker encrypts it and returns the ciphertext as an HttpOnly cookie. Every subsequent API call goes through the Worker (/api/github/*), which decrypts the cookie, attaches the token, and proxies the request. After that first sign-in request, the raw token never appears in the browser's network tab again; you can verify this in DevTools yourself.
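The allowlist step of that flow can be sketched as a plain function. The patterns below are a hypothetical allowlist I made up for illustration (the real list isn't shown here), and the function assumes the pathname has already been normalised by the URL parser.

```typescript
// Hypothetical allowlist of GitHub REST paths the proxy is willing to forward.
const ALLOWED_PATHS: RegExp[] = [
  /^\/user$/,
  /^\/repos\/[^/]+\/[^/]+\/pulls$/,
  /^\/repos\/[^/]+\/[^/]+\/pulls\/\d+(\/(files|reviews))?$/,
  /^\/repos\/[^/]+\/[^/]+\/actions\/runs$/,
];

interface RouteResult {
  status: number;
  upstreamUrl?: string; // only set when the request may be proxied
}

function routeRequest(pathname: string): RouteResult {
  // Only /api/github/* is a proxy route; anything else is not handled here.
  const match = /^\/api\/github(\/.*)$/.exec(pathname);
  if (!match) return { status: 404 };

  const githubPath = match[1];
  // Reject anything off the allowlist before it can reach GitHub.
  if (!ALLOWED_PATHS.some((pattern) => pattern.test(githubPath))) {
    return { status: 403 };
  }
  return { status: 200, upstreamUrl: "https://api.github.com" + githubPath };
}
```

In a real Worker, this check would run before the cookie is decrypted, so a rejected request never gets the token attached at all.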
What's In the Dashboard
Beyond complexity scoring, ReviewRadar gives you:
- Customisable table — drag, drop, and show/hide any column (size, complexity, files, approvals, build status, etc.)
- "Needs Attention" filter — instantly shows only PRs that are not yours, not drafts, not yet reviewed by you, and have no approvals
- Blocked filter — aggregates build failures, merge conflicts, and changes requested across all repos
- Workflow dashboard — see your main branch's latest run plus all failing feature branches. Expand any card to inspect individual jobs and quality gates (check runs). Inline summaries show what broke without clicking.
- Status reports — visual breakdowns by author, label, status, and complexity spread
- Deep PR drawer — slide out any PR for full details, reviews, comments, and a complete complexity breakdown
- Auto-refresh — optional background refresh with browser notifications
- Multilingual — English, French, Polish, and Vietnamese
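The two filters above boil down to simple predicates. Here's a rough sketch; the `PullRequest` shape and its field names are hypothetical, not the GitHub API's.

```typescript
// Hypothetical, simplified PR model for illustration.
interface PullRequest {
  author: string;
  isDraft: boolean;
  reviewers: string[]; // logins that have already submitted a review
  approvals: number;
  buildFailed: boolean;
  hasConflicts: boolean;
  changesRequested: boolean;
}

// "Needs Attention": not mine, not a draft, not yet reviewed by me, no approvals.
function needsAttention(pr: PullRequest, me: string): boolean {
  const reviewedByMe = pr.reviewers.some((login) => login === me);
  return pr.author !== me && !pr.isDraft && !reviewedByMe && pr.approvals === 0;
}

// "Blocked": any one of build failure, merge conflict, or changes requested.
function isBlocked(pr: PullRequest): boolean {
  return pr.buildFailed || pr.hasConflicts || pr.changesRequested;
}
```

Usage is just `prs.filter((pr) => needsAttention(pr, myLogin))` over the combined list from every repo.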
Try It
It's free, open source, and takes 30 seconds to set up:
- Create a GitHub Personal Access Token with repo scope
- Paste it into ReviewRadar
- Add your repos and go
Or just explore the code: https://github.com/0xC0DEM4M4N/review-radar
What I'd Love Feedback On
- Does the complexity scoring feel right for your repos?
- What file types should have different weights?
- Would you use this with GitLab, Bitbucket, or Azure DevOps?
Drop a comment or open an issue. I'd love to hear what you think.
Built with Next.js 16, React, TypeScript, Zustand, and Chart.js. Deployed on Cloudflare Pages.