The Slop Cannon vs. The Slow Burn
A lot of people seem convinced that AI coding tools are best used to write low-quality code as fast as possible. Spew out barely-passable slop, open massive PRs, merge them unvetted. Ship it!
But here's the thing: LLMs are incredibly flexible. And you can use them just as effectively to write high-quality code more slowly.
This might sound obvious, yet enough developers are still treating LLMs as "slop cannons" that it's worth making the opposite case.
LLMs Are Excellent Bug Finders
If Mythos taught us anything, it's that LLM agents are really good at finding bugs. Throw them at a codebase enough times, and they will surface so many issues that you'll barely know what to do with them.
The latest public models from Anthropic and OpenAI are good enough to find plenty of bugs in an unvetted codebase. The problem isn't finding the bugs — it's prioritizing and validating them.
A Multi-Model Review Workflow That Actually Works
Here's the technique I use: run several different models against the same PR and have each one rank its findings as critical, high, medium, or low. Once they're all done, review their findings, do your own research to rule out false positives, and write a final report.
Concretely, I run:
- A Claude sub-agent
- Codex
- Cursor Bugbot
...all in parallel, each looking for different kinds of bugs.
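To make the "in parallel" part concrete, here is a minimal sketch of the fan-out. The reviewer functions are hypothetical stand-ins (the real tools each have their own interface, which I'm not reproducing here); the only thing the sketch shows is sending the same diff to every reviewer at once and collecting the results by name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Assumed shape for illustration: a reviewer takes a diff and returns finding strings.
Reviewer = Callable[[str], list[str]]


def run_reviewers(diff: str, reviewers: dict[str, Reviewer]) -> dict[str, list[str]]:
    """Run every reviewer on the same diff concurrently; return findings keyed by reviewer name."""
    with ThreadPoolExecutor(max_workers=max(1, len(reviewers))) as pool:
        futures = {name: pool.submit(fn, diff) for name, fn in reviewers.items()}
        # result() blocks until that reviewer finishes and re-raises any error it hit.
        return {name: future.result() for name, future in futures.items()}


# Stand-in reviewers; real ones would call out to each tool.
def fake_claude(diff: str) -> list[str]:
    return ["high: unchecked None in parse()"] if "parse" in diff else []


def fake_codex(diff: str) -> list[str]:
    return ["medium: missing test for parse()"] if "parse" in diff else []


results = run_reviewers(
    "def parse(x): return x.y",
    {"claude": fake_claude, "codex": fake_codex},
)
```

Threads are enough here because the work is waiting on external tools, not crunching numbers in Python.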
The key insight: the more different models you throw at a PR, the less likely a hallucinated or bogus bug report is to slip through. Each model has different blind spots, so a finding that several of them report independently is one you can trust more.
In my experience, this workflow reliably finds tons of bugs, and once the findings are cross-checked, the false positive rate is near zero. It finds so many that you can't tackle them all, which brings us to the next step.
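The overlap idea can be sketched in a few lines. This assumes each reviewer's output has already been parsed into a hypothetical `Finding` record, and it uses a deliberately crude match (same file and line) to decide that two reviewers are talking about the same bug. Agreement here is a ranking hint, not proof; single-reviewer findings still need checking by hand.

```python
from collections import defaultdict
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    reviewer: str   # which tool reported it
    file: str
    line: int
    severity: str   # one of the SEVERITY_ORDER keys
    summary: str


def group_by_location(findings: list[Finding]) -> list[dict]:
    """Group findings that point at the same file and line, counting distinct reviewers."""
    groups: dict[tuple[str, int], list[Finding]] = defaultdict(list)
    for finding in findings:
        groups[(finding.file, finding.line)].append(finding)

    merged = []
    for (file, line), items in groups.items():
        merged.append({
            "file": file,
            "line": line,
            # Keep the most severe rating any reviewer gave this location.
            "severity": min((i.severity for i in items), key=SEVERITY_ORDER.__getitem__),
            "reviewers": sorted({i.reviewer for i in items}),
            "summaries": [i.summary for i in items],
        })
    # Most severe first; within a severity, more agreement first.
    merged.sort(key=lambda m: (SEVERITY_ORDER[m["severity"]], -len(m["reviewers"])))
    return merged
```

The sorted output is a reasonable starting point for the final report: the top entries are the ones worth validating first.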
The Prioritization Loop
My typical workflow after running the reviewers:
- Fix all criticals and highs first (with my guidance on the proper solution), then repeat until no criticals/highs remain
- Skip highs/mediums where the juice isn't worth the squeeze — e.g., 100 lines of code to fix a narrow edge case
- Abandon the PR if it has so many criticals that the whole approach turns out to be misguided
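As a rough sketch, the three rules above could be written down like this. The `Issue` fields and both thresholds are assumptions made up for illustration; in practice "too many criticals" and "not worth the squeeze" are judgment calls, not constants.

```python
from dataclasses import dataclass

# Hypothetical thresholds, not from any tool; tune them per project.
ABANDON_AT_CRITICALS = 5     # this many criticals suggests the approach itself is wrong
TOO_EXPENSIVE_LINES = 100    # a fix this large for a narrow edge case gets skipped


@dataclass
class Issue:
    severity: str            # "critical", "high", "medium", or "low"
    est_fix_lines: int       # rough size of the fix
    narrow_edge_case: bool   # does it only affect a rare path?


def triage(issues: list[Issue]) -> dict:
    """Sort issues into fix-now, skip, and later buckets, or flag the PR for abandonment."""
    criticals = [i for i in issues if i.severity == "critical"]
    plan = {
        "abandon": len(criticals) >= ABANDON_AT_CRITICALS,
        "fix_now": [],
        "skip": [],
        "later": [],
    }
    if plan["abandon"]:
        return plan
    for issue in issues:
        poor_value = issue.narrow_edge_case and issue.est_fix_lines >= TOO_EXPENSIVE_LINES
        if issue.severity == "critical":
            plan["fix_now"].append(issue)
        elif issue.severity in ("high", "medium") and poor_value:
            plan["skip"].append(issue)
        elif issue.severity == "high":
            plan["fix_now"].append(issue)
        else:
            plan["later"].append(issue)   # remaining mediums and lows wait for a later pass
    return plan
```

After fixing everything in `fix_now`, you would rerun the reviewers and triage again until that bucket comes back empty.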
This process doesn't necessarily make me faster in terms of raw output. If anything, the review process often surfaces pre-existing bugs, so I'm on a tangential quest writing unit tests and fixing subtle flaws that pre-date the PR.
And you know what? I find that extremely satisfying.
The Developer You Were Before LLMs
This "slow" approach is actually how many thoughtful developers already worked pre-LLMs: careful, methodical, quality-obsessed, focused on making things better for the next coder.
The happy path of a complex architecture is less interesting than its failure modes. Pre-LLMs, getting familiar with a codebase meant understanding where the assumptions broke down, then getting your hands dirty to fix them. LLMs just accelerate that process.
Practical Tips for the "Slower" Approach
If you're currently using agents to write multi-hundred-line PRs you barely understand yourself, try this:
- Ask an agent how your PR works and how it might fail before merging
- Use Matt Pocock's /grill-me skill until you understand the entire PR front-to-back
- Generate Mermaid diagrams to visualize the architecture
- Wipe context between review sweeps — clearing context really helps avoid being anchored by the first result
- Split reviewers into different archetypes (frontend, backend, infra) for PRs that span domains
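For the archetype split in the last tip, one simple option is to route changed files to reviewers by filename pattern. The pattern table below is purely an assumption for illustration; a real repo would need its own mapping, and a file is allowed to land in more than one archetype.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from reviewer archetype to filename patterns.
ARCHETYPE_PATTERNS = {
    "frontend": ["*.tsx", "*.jsx", "*.css"],
    "backend": ["*.py", "*.go", "*.sql"],
    "infra": ["*.tf", "*.yaml", "*.yml", "Dockerfile"],
}


def archetypes_for(changed_files: list[str]) -> dict[str, list[str]]:
    """Return, for each archetype with at least one match, the changed files it should review."""
    routed: dict[str, list[str]] = {}
    for path in changed_files:
        name = path.rsplit("/", 1)[-1]   # match on the filename only
        for archetype, patterns in ARCHETYPE_PATTERNS.items():
            if any(fnmatchcase(name, pattern) for pattern in patterns):
                routed.setdefault(archetype, []).append(path)
    return routed
```

Each archetype then gets its own reviewer run with only its files in scope, which also keeps each context small.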
You might not be "10x more productive." You might burn a ton of tokens just to find out your entire plan was wrongheaded. But I'd argue that's a feature, not a bug — better to discover a flawed approach in review than in production.
The Bottom Line
The LLM-assisted coding debate is often framed as "speed vs. quality." But that's a false dichotomy. LLMs give you the ability to be both faster and more thorough — you just have to be intentional about how you use them.
So take a deep breath, slow down, and try this technique. You might find that writing better code more slowly is the most underrated superpower in your toolkit.
What workflow have you found most effective for AI-assisted code review? Do you run multiple models in parallel, or rely on a single review agent?