Ash

Posted on Jan 27

What I learned analyzing 800K GitHub pull requests

#github #productivity #codereview #programming

I just finished analyzing 802,979 merged pull requests from GitHub Archive. The findings... aren't great.

The headline stat that haunts me

90% of PRs over 1,000 lines ship without any code review.

Not "light review." Not "quick approval." Zero approvals, zero change requests, zero review comments. Nothing.
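To make that definition concrete, here is a minimal sketch of the check in Python. The field names (`lines_changed`, `approvals`, `change_requests`, `review_comments`) are hypothetical and chosen for illustration; they are not the GitHub Archive schema or the pipeline used for this analysis.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    # Hypothetical fields for illustration; not the GitHub Archive schema.
    lines_changed: int
    approvals: int
    change_requests: int
    review_comments: int


def is_unreviewed(pr: PullRequest) -> bool:
    """True when the PR has no approvals, change requests, or review comments."""
    return pr.approvals == 0 and pr.change_requests == 0 and pr.review_comments == 0


def unreviewed_share(prs: list[PullRequest], min_lines: int = 1000) -> float:
    """Share of PRs over `min_lines` lines that were merged with no review activity."""
    large = [pr for pr in prs if pr.lines_changed > min_lines]
    if not large:
        return 0.0
    return sum(is_unreviewed(pr) for pr in large) / len(large)
```

A single approval, change request, or review comment is enough to move a PR out of the "unreviewed" bucket under this definition.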

And it's getting worse - up from 83% in 2024.

The self-merge reality

71% of the merged PRs in this dataset ship without a second pair of eyes. The author writes it, the author merges it. Done.

I'm not saying self-merging is always bad (solo projects exist), but this is a sample from across public GitHub. The "code review culture" many teams claim to have is increasingly fiction.
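If you want to measure this for your own repos, a minimal sketch follows. It assumes each merged PR is a dict with hypothetical `author` and `merged_by` keys, and it treats "self-merge" as the author and the merger being the same account, which may be narrower or broader than the definition behind the 71% figure.

```python
def self_merge_rate(prs: list[dict]) -> float:
    """Fraction of merged PRs where the author is also the account that merged.

    `author` and `merged_by` are hypothetical keys used for illustration.
    PRs without a `merged_by` value are treated as unmerged and skipped.
    """
    merged = [pr for pr in prs if pr.get("merged_by") is not None]
    if not merged:
        return 0.0
    self_merged = sum(pr["author"] == pr["merged_by"] for pr in merged)
    return self_merged / len(merged)
```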

The size paradox broke my brain

You'd think bigger PRs would get more scrutiny. More code = more risk, right?

The data says the opposite:

PR Size	Review comments per 100 lines
Tiny (<10 lines)	0.98
Massive (1000+ lines)	0.05

That's roughly 20x less scrutiny per line for the biggest changes. Reviewers see a wall of code and just... wave it through.
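A quick sketch of the arithmetic behind that ratio. The helper `comments_per_100_lines` is a hypothetical name used for illustration; the two densities are the figures from the table above.

```python
def comments_per_100_lines(review_comments: int, lines_changed: int) -> float:
    """Review comments normalized to 100 changed lines."""
    if lines_changed <= 0:
        raise ValueError("lines_changed must be positive")
    return 100 * review_comments / lines_changed


# Densities taken from the table above.
tiny_density = 0.98
massive_density = 0.05

ratio = tiny_density / massive_density
print(f"{ratio:.1f}x")  # prints "19.6x"
```

For scale: a 2,000-line PR at 0.05 comments per 100 lines gets about one review comment in total.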

The first-timer tax

New contributors to a repo wait 38% longer to get merged than repeat contributors (22h vs 16h median).

Good news: the gap narrowed from 53% in 2024. Teams seem to be getting better at onboarding. But the gap is still real.
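The first-timer gap is a relative difference between two medians. A minimal sketch of that calculation, using the rounded medians quoted above (the function name `relative_gap_percent` is hypothetical):

```python
def relative_gap_percent(first_timer_hours: float, repeat_hours: float) -> float:
    """How much longer first-timers wait, as a percentage of the repeat-contributor median."""
    return (first_timer_hours - repeat_hours) / repeat_hours * 100


print(f"{relative_gap_percent(22, 16):.1f}%")  # prints "37.5%"
```

The rounded medians give 37.5%, in line with the 38% quoted; the small difference presumably comes from rounding in the published hour figures.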

Wednesday is the new Monday

This surprised me. In 2024, Monday was peak merge day (the "clear the backlog" effect). In 2025, Wednesday leads with 23.5% of merges.

Teams seem to be spreading merges across the week rather than doing a Monday morning merge rush.
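To check the day-of-week pattern on your own merge history, here is a minimal sketch. It assumes a list of ISO 8601 merge timestamps and buckets each one by the weekday in the timestamp's own offset, so a team far from UTC may want to convert to local time first. The function name is hypothetical.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def merge_share_by_weekday(merged_at: list[str]) -> dict[str, float]:
    """Percentage of merges that landed on each weekday."""
    counts = Counter(
        # Replace a trailing "Z" so older Python versions can parse the timestamp.
        DAYS[datetime.fromisoformat(ts.replace("Z", "+00:00")).weekday()]
        for ts in merged_at
    )
    total = sum(counts.values())
    return {day: 100 * n / total for day, n in counts.items()}


shares = merge_share_by_weekday([
    "2025-01-01T10:00:00Z",  # Wednesday
    "2025-01-08T12:00:00Z",  # Wednesday
    "2025-01-06T09:00:00Z",  # Monday
    "2025-01-02T09:00:00Z",  # Thursday
])
print(shares["Wednesday"])  # prints 50.0
```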

The bot collapse

Remember when Dependabot PRs were everywhere?

Bot PRs went from 62% of all PRs in 2022 to just 15.5% in 2025. The automation boom looks to be over. My read: teams got tired of the noise.

What actually works (from the data)

For teams that do practice code review, the median cycle time is 3 hours. That's the benchmark to aim for.
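To compare your team against that benchmark, a minimal sketch follows. It assumes "cycle time" means the time from PR creation to merge, which is an assumption here rather than the definition confirmed by the analysis, and the function name is hypothetical.

```python
from datetime import datetime
from statistics import median


def median_cycle_hours(prs: list[tuple[datetime, datetime]]) -> float:
    """Median hours from creation to merge, given (created_at, merged_at) pairs."""
    hours = [
        (merged_at - created_at).total_seconds() / 3600
        for created_at, merged_at in prs
    ]
    return median(hours)
```

The median is the right summary here because a handful of PRs that sit open for weeks would drag a mean far above what a typical PR experiences.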

Things that correlate with healthier review practices:

Smaller PRs (the data is unambiguous here)
Branch protection rules requiring approval
CODEOWNERS files for domain expertise

The full research

I've published the complete methodology, interactive charts, and YoY comparisons here: 2025 Engineering Benchmarks: Year in Review

Includes breakdowns by language, a spotlight on how AI tool repos (Codex, Gemini CLI, Claude Code) ship code, and the raw numbers for benchmarking your team.

What's your team's self-merge rate? Curious if these numbers match what you're seeing.

DEV Community