Om Keswani
What Code Reviews Really Measure — And Why Your Best Engineers Hate Them

Riya opened her laptop to a wall of red badges: seventeen pull requests waiting for her review.

She was the staff engineer everyone trusted with “the important stuff,” which meant every risky change, every migration, every cross‑team integration had her name on it. Somewhere under that pile was the design doc she actually wanted to work on. She sighed and clicked the first PR anyway.

It was a 900‑line diff. The description said only: “Fix stuff, please review.”

Riya skimmed. The change touched a payment flow she’d helped design two years ago. There were branching conditionals stacked on top of already‑messy logic, a couple of “just in case” flags, and a new helper with a name that meant nothing.

She knew what this really needed: to be split into three smaller changes, with a conversation about the underlying assumptions. That would take a few hours. The dashboard on the TV outside, though, only cared that “Time to First Review” stayed low.

“Okay,” she muttered. “Nits it is.”

Two comments on naming. One suggestion for extracting a method. A question about a null check. She hit “Approve,” closed the tab, and opened the next PR.

By the time she got to number five, the patterns were obvious. Huge PRs from rushed teammates. Vague descriptions. Review comments from others fixated on indentation, brace style, and whether a map should be called `data` or `items`. Nobody was asking, “Is this the right thing to build?” or “Are we about to make this system harder to change for the next five years?”

It wasn’t that no one cared. It was that the whole system pushed them away from caring. Leaders watched graphs of review speed and PR throughput. Devs optimized for whatever kept those graphs happy. Plenty of teams had quietly discovered that code review, done badly, was now the slowest, most frustrating part of shipping features.

Later that afternoon, a junior engineer, Aarav, pinged her.

“Hey Riya, any chance you can look at my PR? It’s been stuck a while.”

She opened it. Another big change. Aarav had threaded a new feature into a part of the codebase everyone avoided.

“Why so big?” she asked in the comments.

He replied almost immediately.

“Last time I split it up, it took three times as long to get merged. People kept asking for ‘the full picture.’ Figured I’d just send one big thing and get it over with.”

There it was, in writing: the survival strategy the team had taught him. Don’t take risks. Don’t touch scary code unless you can sneak the change in once. Don’t argue style. Don’t propose bolder refactors. Just get past the reviewers.

For a moment, Riya thought about just doing what everyone else did: pick a few safe comments, ask him to rename a function, and move on. Instead, she DM’d him.

“Walk me through what you’re trying to do. No IDE. Just talk.”

They jumped on a call. Twenty minutes later, they had sketched a much smaller first step: one refactor to untangle the worst part of the flow, behind a feature flag. A second PR could layer the new feature on top. A third could clean up the old code path once they were confident.

“That sounds… nicer,” Aarav said. “I didn’t know we were allowed to do that.”

“Allowed?” Riya laughed. “We should be begging people to do that.”

After the call, she stared at the endless PR list again. It wasn’t just the code that needed refactoring. It was the whole way they were treating review.

The next week, at their engineering sync, she told a story instead of showing another metric chart.

She described her average day: the flood of notifications, the pressure to be “fast,” the feeling of signing off on changes she hadn’t had the time to truly understand. She told the Aarav story—how he’d learned to send huge PRs because “that’s what gets merged here.” She admitted she’d started avoiding the hardest reviews because they meant conflict, long threads, and no visible credit.

Then she asked a simple question: “If we looked only at our review data, what would we think we care about? And does that match what we say we care about?”

Silence, then a few nods. Someone joked about “the sacred graph of PR cycle time,” but nobody defended it too hard.

They made three small changes that day.

1. They agreed on a “normal” PR size and wrote it down. Anything far bigger needed a clear reason in the description.
2. They set an expectation that every author would answer four questions in their PR: what problem this solved, why this approach, what the trade‑offs were, and what reviewers should focus on.
3. They decided that style nits belonged to linters and formatters, not humans, and spent an afternoon tightening their tooling.
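If you wanted to make the first rule visible rather than aspirational, a tiny CI-style check is one way to sketch it. Everything below is hypothetical — the 400‑line threshold, the `#size-ok` override tag, and the function name are invented for illustration, not taken from the story or any real tool:

```python
# Hypothetical guardrail: flag PRs bigger than the team's agreed "normal"
# size unless the description explicitly justifies it. The threshold and
# override tag are assumptions, not real conventions.

NORMAL_PR_SIZE = 400          # lines changed; an assumed team agreement
OVERRIDE_TAG = "#size-ok"     # hypothetical marker for a justified big PR

def check_pr_size(lines_changed: int, description: str) -> str:
    """Return 'pass', 'warn', or 'fail' for a PR of the given size."""
    if lines_changed <= NORMAL_PR_SIZE:
        return "pass"
    # Oversized PRs are tolerated only with an explicit reason on record.
    if OVERRIDE_TAG in description:
        return "warn"  # allowed through, but visible in review stats
    return "fail"

print(check_pr_size(120, "Fix null check in payment flow"))              # pass
print(check_pr_size(900, "Staged refactor, full plan inside #size-ok"))  # warn
print(check_pr_size(900, "Fix stuff, please review"))                    # fail
```

The point isn’t the script; it’s that a written-down norm with an escape hatch beats an unwritten one that everyone quietly games.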

None of this was revolutionary. There was no shiny new platform, no AI assistant promising to review their code for them. But within a couple of weeks, Riya noticed that her review queue felt different. PRs were smaller. Descriptions were clearer. She could say “no” to being tagged on everything and still feel the team was safe.

The work was still there. Some days her notifications still exploded. There were still awkward conversations and disagreements. But reviews felt less like a political stage and more like the conversation they were supposed to be: a few people, looking at a small slice of change, trying to leave the code a little better than they found it.

One evening, as she closed the last PR of the day, Riya caught herself thinking something she hadn’t felt in a long time.

“This one was actually fun.”
