Hannah Redmond

Why I Stopped Trusting PR Line Counts (And Built a Complexity Scorer)

A 200-file refactor and a 10-line bugfix look identical in GitHub's pull request list. That's a problem.

A while back, my team shipped a nasty regression. The root cause? A massive refactor PR had been sitting in the queue for three days, buried between tiny bugfixes and dependency updates. Nobody spotted it. Everyone assumed "it's just another PR."

But it wasn't. It was 2,847 lines across 56 files, touching core business logic, authentication flows, and the database layer. In GitHub's UI, it looked like this:

chore: upgrade to Node 22 LTS                    +2,847 −198
fix: token expiry not refreshing on 401             +45 −12
docs: update README                                  +8 −2

Three lines. Equal visual weight. One of them is a landmine.

The Problem: Flat PR Lists Don't Scale

GitHub shows you:

  • Title
  • Author
  • Labels
  • Line count
  • Last updated

That's useful for small teams. But when you're tracking 50+ open PRs across multiple repos, you need to know which ones are dangerous and which ones need your eyes right now. Line count is a terrible proxy for risk. Here's why:

A 500-line change to package-lock.json is trivial. It's generated, mechanical, and safe to approve.

A 50-line change to your auth middleware is terrifying. It's small, surgical, and high-stakes.

Yet in the PR list, the lockfile change looks "bigger." Your brain learns to ignore large numbers, so the genuinely risky PRs get lost in the noise.

What I Wanted

I wanted a dashboard that could:

  1. Score every PR for complexity — not by raw lines, but by what actually changed
  2. Show me only PRs that need my review — filter out my own PRs, drafts, and things I've already reviewed
  3. Surface blockers instantly — build failures, merge conflicts, changes requested, all in one view
  4. Work across repos — one view for everything my team owns
  5. Surface failing CI/CD — see which workflows are failing across branches without leaving the dashboard

So I built ReviewRadar.

The Complexity Score: How It Works

The core idea is simple: weight every changed file by its type, then combine churn, file count, and rewrite intensity into a single 0-100 score.

Step 1: File Relevance

Not all files are equal. A generated lockfile shouldn't count the same as your auth service.

| File type | Weight | Examples |
| --- | --- | --- |
| Source code & tests | 1.0 | .ts, .tsx, .py, .go, .rs |
| Config & infra | 0.5 | .yml, .json, Dockerfile, .tf |
| Documentation | 0.1 | .md, .rst, .txt |
| Generated / binary | 0.0 | node_modules/, .lock, images, .min.js |

This means a PR that changes package-lock.json + 3 source files is scored entirely on those 3 source files. At weight 0.0, the 500-line lockfile contributes nothing.
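As a rough illustration, the weight lookup could be a list of pattern rules checked in order. This is a sketch rather than ReviewRadar's actual code: the weights come from the table above, but `fileWeight`, the exact patterns, the rule order (generated files first, so package-lock.json is not caught by the generic .json rule), and the 1.0 fallback for unknown extensions are all assumptions.

```typescript
// Hypothetical rule table. Weights are from the article; patterns and ordering are assumed.
const RULES: Array<{ weight: number; patterns: RegExp[] }> = [
  {
    // Generated / binary: checked first so lockfiles never fall through to ".json".
    weight: 0.0,
    patterns: [
      /(^|\/)node_modules\//,
      /\.lock$/,
      /(^|\/)package-lock\.json$/,
      /\.min\.js$/,
      /\.(png|jpe?g|gif|ico)$/i,
    ],
  },
  { weight: 0.1, patterns: [/\.(md|rst|txt)$/i] }, // documentation
  { weight: 0.5, patterns: [/\.(ya?ml|json|tf)$/i, /(^|\/)Dockerfile$/] }, // config & infra
  { weight: 1.0, patterns: [/\.(ts|tsx|py|go|rs)$/i] }, // source code & tests
];

// Returns the relevance weight for a changed file path.
function fileWeight(path: string): number {
  for (const rule of RULES) {
    if (rule.patterns.some((p) => p.test(path))) return rule.weight;
  }
  // Assumption: unknown file types are treated like source code (the cautious choice).
  return 1.0;
}
```

With these rules, `fileWeight("package-lock.json")` is 0 while `fileWeight("src/auth/middleware.ts")` is 1, which is the behaviour the lockfile example relies on.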

Step 2: Weighted Churn

For each relevant file, compute:

churn = max(additions, deletions) × weight

Per-file churn is capped at 500 lines to prevent a single enormous file from dominating the score.
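Here is a minimal sketch of that step. The article does not say whether the 500-line cap applies before or after the weight, so this version assumes the cap is applied to the raw line count first; `FileChange`, `fileChurn`, and `weightedChurn` are illustrative names.

```typescript
interface FileChange {
  additions: number;
  deletions: number;
  weight: number; // relevance weight from Step 1
}

const PER_FILE_CAP = 500;

// Churn for one file: the larger of additions/deletions, capped, then weighted.
// Assumption: the cap is applied before the weight.
function fileChurn(f: FileChange): number {
  return Math.min(Math.max(f.additions, f.deletions), PER_FILE_CAP) * f.weight;
}

// Total weighted churn for a PR.
function weightedChurn(files: FileChange[]): number {
  return files.reduce((sum, f) => sum + fileChurn(f), 0);
}

const example: FileChange[] = [
  { additions: 500, deletions: 480, weight: 0.0 }, // lockfile: 0
  { additions: 40, deletions: 10, weight: 1.0 }, // auth middleware: 40
  { additions: 20, deletions: 5, weight: 0.5 }, // CI config: 10
  { additions: 3000, deletions: 100, weight: 1.0 }, // huge source file: capped at 500
];

const total = weightedChurn(example); // 0 + 40 + 10 + 500 = 550
```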

Step 3: File Spread

More files = more context to hold in your head = more complexity. We add a logarithmic file-count factor:

spread = ln(1 + relevantFiles) × 10

This means going from 5 → 15 files matters more than going from 50 → 60.
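To make the diminishing returns concrete, here is the same formula evaluated for both jumps. The function name `spread` is only illustrative, and treating "relevant" as "weight above zero" is an assumption.

```typescript
// File-spread term: logarithmic in the number of relevant files.
function spread(relevantFiles: number): number {
  return Math.log(1 + relevantFiles) * 10;
}

// Adding 10 files to a small PR vs. adding 10 files to a large one.
const smallJump = spread(15) - spread(5); // ln(16/6) × 10, about 9.8 points
const largeJump = spread(60) - spread(50); // ln(61/51) × 10, about 1.8 points
```

The same ten extra files are worth roughly five times as many points when the PR is small.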

Step 4: Rewrite Intensity

A PR that deletes 1,000 lines and adds 1,000 lines is a rewrite, not a refactor. We measure this with:

intensity = ln(1 + totalChurn) × (churnRatio / (1 + churnRatio))

Where churnRatio = totalChurn / (1 + |netChange|). High ratio = high rewrite intensity = higher complexity.
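A small sketch shows how this separates a rewrite from a pure addition of the same size. It assumes totalChurn = additions + deletions and netChange = additions − deletions, which the formulas suggest but do not spell out; `rewriteIntensity` is an illustrative name.

```typescript
// Rewrite intensity for a PR, given its total additions and deletions.
// Assumptions: totalChurn = additions + deletions, netChange = additions - deletions.
function rewriteIntensity(additions: number, deletions: number): number {
  const totalChurn = additions + deletions;
  const netChange = additions - deletions;
  const churnRatio = totalChurn / (1 + Math.abs(netChange));
  return Math.log(1 + totalChurn) * (churnRatio / (1 + churnRatio));
}

// Same total churn (2,000 lines), very different shapes.
const rewrite = rewriteIntensity(1000, 1000); // about 7.60
const pureAddition = rewriteIntensity(2000, 0); // about 3.80
```

For a PR of this size, the ratio term runs from about 0.5 (all additions or all deletions) to nearly 1 (additions and deletions balanced), so a rewrite scores roughly twice the intensity of new code with the same churn.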

The Final Formula

score = ln(1 + weightedChurn) × 5.5
      + ln(1 + relevantFiles) × 10
      + 5 × intensity

The result is a 0-100 score with clear colour bands:

| Score | Colour | Meaning |
| --- | --- | --- |
| < 15 | 🟢 Green | Trivial: safe to approve quickly |
| 15-29 | 🟢 Green | Small: quick review |
| 30-49 | 🔵 Cyan | Medium: standard review |
| 50-69 | 🟠 Amber | Large: careful review needed |
| 70-89 | 🔴 Red | Complex: significant risk |
| 90+ | 🔴 Red | Very complex: break it up |
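Putting the formula and the bands together might look like the sketch below. The three terms and the band thresholds are taken from the article; the rounding and the clamp to 0-100 are assumptions (the raw formula has no upper bound), and `complexityScore` and `band` are illustrative names.

```typescript
interface ScoreInputs {
  weightedChurn: number; // from Step 2
  relevantFiles: number; // from Step 3
  intensity: number; // from Step 4
}

// Combines the three terms into a single score.
function complexityScore({ weightedChurn, relevantFiles, intensity }: ScoreInputs): number {
  const raw =
    Math.log(1 + weightedChurn) * 5.5 +
    Math.log(1 + relevantFiles) * 10 +
    5 * intensity;
  // Assumption: round to an integer and clamp into the advertised 0-100 range.
  return Math.min(100, Math.max(0, Math.round(raw)));
}

// Maps a score to its colour band and label.
function band(score: number): { colour: string; label: string } {
  if (score < 15) return { colour: "green", label: "Trivial" };
  if (score < 30) return { colour: "green", label: "Small" };
  if (score < 50) return { colour: "cyan", label: "Medium" };
  if (score < 70) return { colour: "amber", label: "Large" };
  if (score < 90) return { colour: "red", label: "Complex" };
  return { colour: "red", label: "Very complex" };
}
```

Keeping `band` separate from the score makes it easy to tune thresholds per team without touching the formula.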

Real World Calibration

I tested this against real PRs from my team's repos:

| PR | Files | Lines (+/−) | Score | Label |
| --- | --- | --- | --- | --- |
| Bugfix: null check | 3 | +45 / −12 | 18 | Small |
| Feature: webhook listener | 8 | +124 / −18 | 42 | Medium |
| Dark mode toggle | 14 | +312 / −84 | 55 | Large |
| Node upgrade refactor | 56 | +2,847 / −198 | 73 | Complex |
| Monorepo migration | 300 | +6,200 / −1,800 | 94 | Very complex |

The scores spread nicely across the 0-100 range without clustering at the top. A 300-file PR scores ~94. A typical 10-file, 3,000-line PR scores ~55-60. The scoring feels right.

Zero Backend Philosophy (With a Server-Side Twist)

One of my non-negotiables: I didn't want to run infrastructure — no databases, no long-running servers, no ops burden.

ReviewRadar is a Next.js app built around four choices:

  • It is statically exported to Cloudflare Pages, so there are zero server costs
  • A Cloudflare Worker proxies all GitHub API calls
  • Your token is encrypted with AES-GCM and stored in an HttpOnly, Secure, SameSite=Strict cookie; it is never in localStorage and never visible to client-side JavaScript
  • The Worker validates every request against a strict path allowlist, so unapproved endpoints are rejected with a 403 before they reach GitHub

This means:

  • ✅ No signup required (OAuth or PAT)
  • ✅ Your token is encrypted at rest: the cookie holds only ciphertext, and client-side JavaScript (for example an injected script) can't read it
  • ✅ No infrastructure to manage
  • ✅ Deployed to Cloudflare Pages — global edge network, zero cold starts
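The path allowlist from the architecture list could be as simple as a set of anchored patterns. This is a sketch, not the Worker's real configuration: the `/api/github` prefix comes from the article, but `ALLOWED_PATHS`, `checkProxyPath`, and the specific endpoints listed are assumptions chosen to match what the dashboard displays.

```typescript
const PROXY_PREFIX = "/api/github";

// Hypothetical allowlist of GitHub REST paths the dashboard needs.
// Every pattern is anchored with ^ and $ so extra path segments are rejected.
const ALLOWED_PATHS: RegExp[] = [
  /^\/user$/,
  /^\/repos\/[^/]+\/[^/]+\/pulls(\/\d+(\/(files|reviews|comments))?)?$/,
  /^\/repos\/[^/]+\/[^/]+\/actions\/runs(\/\d+(\/jobs)?)?$/,
];

type PathCheck = { ok: true; githubPath: string } | { ok: false; status: 403 };

// Decides whether a proxied request may be forwarded to GitHub.
function checkProxyPath(pathname: string): PathCheck {
  if (!pathname.startsWith(PROXY_PREFIX + "/")) return { ok: false, status: 403 };
  const githubPath = pathname.slice(PROXY_PREFIX.length);
  return ALLOWED_PATHS.some((p) => p.test(githubPath))
    ? { ok: true, githubPath }
    : { ok: false, status: 403 };
}
```

Because the check runs before any token is attached, a request such as `/api/github/user/keys` is refused without ever reaching GitHub.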

How it works: When you sign in, your token is sent once to the Worker (POST /api/session). The Worker encrypts it and returns the ciphertext as an HttpOnly cookie. Every subsequent API call goes through the Worker (/api/github/*), which decrypts the cookie, attaches the token, and proxies the request. After that initial sign-in request, the raw token never appears in the browser's network tab; you can verify this in DevTools yourself.
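The encrypt-then-cookie step can be sketched with the Web Crypto API, which is available in Cloudflare Workers and recent Node versions. Everything specific here is an assumption rather than ReviewRadar's actual code: the helper names, the cookie name `rr_session`, the base64 packing of IV plus ciphertext, and loading the key from a base64 secret.

```typescript
// Base64 helpers (atob/btoa are globals in Workers and modern Node).
function fromBase64(b64: string) {
  return Uint8Array.from(atob(b64), (c) => c.charCodeAt(0));
}
function toBase64(bytes: Uint8Array): string {
  return btoa(Array.from(bytes, (b) => String.fromCharCode(b)).join(""));
}

// Imports a 32-byte secret (base64) as a non-extractable AES-GCM key.
async function importKey(secretBase64: string) {
  return crypto.subtle.importKey("raw", fromBase64(secretBase64), "AES-GCM", false, [
    "encrypt",
    "decrypt",
  ]);
}
type AesKey = Awaited<ReturnType<typeof importKey>>;

// Encrypts the token; the output is base64(iv || ciphertext+tag).
async function sealToken(key: AesKey, token: string): Promise<string> {
  const iv = crypto.getRandomValues(new Uint8Array(12)); // fresh 96-bit IV per encryption
  const ciphertext = new Uint8Array(
    await crypto.subtle.encrypt({ name: "AES-GCM", iv }, key, new TextEncoder().encode(token)),
  );
  const packed = new Uint8Array(iv.length + ciphertext.length);
  packed.set(iv);
  packed.set(ciphertext, iv.length);
  return toBase64(packed);
}

// Reverses sealToken; decrypt() rejects if the cookie value was tampered with.
async function openToken(key: AesKey, sealed: string): Promise<string> {
  const packed = fromBase64(sealed);
  const plaintext = await crypto.subtle.decrypt(
    { name: "AES-GCM", iv: packed.slice(0, 12) },
    key,
    packed.slice(12),
  );
  return new TextDecoder().decode(plaintext);
}

// Builds the Set-Cookie header value with the attributes named in the article.
function sessionCookie(sealed: string): string {
  return `rr_session=${sealed}; HttpOnly; Secure; SameSite=Strict; Path=/api`;
}
```

In this sketch the Worker would call `sealToken` once in the POST /api/session handler and `openToken` on every /api/github/* request, so the browser only ever stores ciphertext.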

What's In the Dashboard

Beyond complexity scoring, ReviewRadar gives you:

  • Customisable table — drag, drop, and show/hide any column (size, complexity, files, approvals, build status, etc.)
  • "Needs Attention" filter — instantly shows only PRs that are not yours, not drafts, not yet reviewed by you, and have no approvals
  • Blocked filter — aggregates build failures, merge conflicts, and changes requested across all repos
  • Workflow dashboard — see your main branch's latest run plus all failing feature branches. Expand any card to inspect individual jobs and quality gates (check runs). Inline summaries show what broke without clicking.
  • Status reports — visual breakdowns by author, label, status, and complexity spread
  • Deep PR drawer — slide out any PR for full details, reviews, comments, and a complete complexity breakdown
  • Auto-refresh — optional background refresh with browser notifications
  • Multilingual — English, French, Polish, and Vietnamese
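The two filters in that list reduce to simple predicates over PR metadata. The sketch below uses a hypothetical `PullRequest` shape; the field names are illustrative and do not mirror GitHub's API response.

```typescript
// Hypothetical, simplified PR shape for illustration only.
interface PullRequest {
  author: string;
  draft: boolean;
  reviewedBy: string[]; // logins that have submitted a review
  approvals: number;
  buildFailed: boolean;
  hasConflicts: boolean;
  changesRequested: boolean;
}

// "Needs Attention": not mine, not a draft, not yet reviewed by me, no approvals.
function needsAttention(pr: PullRequest, me: string): boolean {
  return pr.author !== me && !pr.draft && !pr.reviewedBy.includes(me) && pr.approvals === 0;
}

// "Blocked": any of build failure, merge conflict, or changes requested.
function isBlocked(pr: PullRequest): boolean {
  return pr.buildFailed || pr.hasConflicts || pr.changesRequested;
}
```

Applied across every tracked repo, `prs.filter((pr) => needsAttention(pr, me))` would give the review queue and `prs.filter(isBlocked)` the blocker view.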

Try It

It's free, open source, and takes 30 seconds to set up:

  1. Create a GitHub Personal Access Token with repo scope
  2. Paste it into ReviewRadar
  3. Add your repos and go

Or just explore the code: https://github.com/0xC0DEM4M4N/review-radar

What I'd Love Feedback On

  • Does the complexity scoring feel right for your repos?
  • What file types should have different weights?
  • Would you use this with GitLab, Bitbucket, or Azure DevOps?

Drop a comment or open an issue. I'd love to hear what you think.


Built with Next.js 16, React, TypeScript, Zustand, and Chart.js. Deployed on Cloudflare Pages.
