I've been using AI-assisted code review for about eight months now across five production websites. Not evaluating tools for a listicle — actually shipping code with them every day.
Here's what stuck, what didn't, and what surprised me.
Claude Code (CLI)
This is my primary tool now. I run it in the terminal alongside my editor, and it handles everything from writing components to reviewing diffs before commits.
What works: Context awareness across an entire codebase. I can say "check this Next.js page for SEO issues" and it understands the project structure, checks meta tags, validates JSON-LD schemas, and catches missing alt attributes — all without me pointing it to specific files.
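To make "validates JSON-LD schemas" concrete, here's the kind of structural check involved. This is a hypothetical sketch of my own, not how Claude Code works internally, and the required-field list is an assumption rather than full Schema.org validation:

```typescript
// Minimal JSON-LD sanity check, the sort of thing an AI review flags.
// Hypothetical sketch; the required fields are assumptions, not a full
// Schema.org rule set.
interface JsonLdIssue {
  field: string;
  message: string;
}

function checkJsonLd(raw: string): JsonLdIssue[] {
  let data: Record<string, unknown>;
  try {
    data = JSON.parse(raw);
  } catch {
    return [{ field: "(root)", message: "not valid JSON" }];
  }
  const issues: JsonLdIssue[] = [];
  // Every JSON-LD block should declare a vocabulary and a type.
  if (data["@context"] !== "https://schema.org") {
    issues.push({ field: "@context", message: "expected https://schema.org" });
  }
  if (typeof data["@type"] !== "string") {
    issues.push({ field: "@type", message: "missing @type" });
  }
  return issues;
}

// An Article block that forgot its @type gets flagged.
const broken = JSON.stringify({ "@context": "https://schema.org", headline: "Hi" });
console.log(checkJsonLd(broken)); // one issue: missing @type
```

The point isn't that you need this script — it's that these checks are mechanical, which is exactly why a codebase-aware tool runs them reliably.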
What doesn't: Long sessions fill the context window. After about 45 minutes of back-and-forth, responses get less precise. I've learned to start fresh sessions for new tasks.
Cost reality: With Opus, a heavy coding day runs maybe $15-20 in API costs. Light days are $3-5. Not cheap, but the time savings are real.
GitHub Copilot
Still installed, still useful for autocomplete. But I've stopped using it for anything beyond single-line completions.
What works: Tab-completing boilerplate. Import statements, function signatures, repetitive patterns. It saves maybe 30 seconds per completion, hundreds of times a day.
What doesn't: Copilot Chat never clicked for me. The suggestions feel generic compared to having a full codebase-aware tool. It's great at guessing what you want to type next, less great at understanding why.
What I Tried and Dropped
Cursor: Good product, but I couldn't justify the subscription when Claude Code does the same thing in my existing terminal workflow. If you prefer an IDE-integrated experience, it's solid.
Codeium/Continue.dev: Free alternatives to Copilot. Continue.dev is surprisingly capable for being open source. I'd recommend it if you're cost-sensitive.
Amazon CodeWhisperer (now Amazon Q Developer): Tried it for two weeks. Autocomplete quality was noticeably worse than Copilot for my TypeScript/React stack.
The Workflow That Actually Works
Morning: Open Claude Code, load the project, review yesterday's changes.
During coding: Copilot handles autocomplete. When I hit something complex — a tricky database query, a performance optimization, an unfamiliar API — I switch to Claude Code for a deeper conversation.
Before commits: I ask Claude Code to review the diff. It catches things I miss: unused imports, inconsistent error handling, missing edge cases.
Weekly: I run a broader review of new code across all five sites. This is where AI shines — scanning hundreds of files for patterns you'd never manually check.
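That weekly sweep is the part most worth scripting even without an AI in the loop. A crude missing-alt check, for example, fits in a few lines — the regex here is a rough heuristic of my own, not a real JSX parser:

```typescript
// Count <img> / <Image> tags with no alt attribute.
// Heuristic sketch: a production check should parse the JSX properly.
function findMissingAlt(source: string): number {
  const tags = source.match(/<(?:img|Image)\b[^>]*>/g) ?? [];
  return tags.filter((tag) => !/\balt\s*=/.test(tag)).length;
}

const page = `
  <Image src="/hero.png" alt="Hero banner" />
  <img src="/logo.svg">
`;
console.log(findMissingAlt(page)); // 1: the plain <img> has no alt
```

In practice I'd run something like this across all five repos first, then hand the hits to the AI review to triage.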
Honest Assessment
AI code review tools have genuinely improved my code quality. But not in the way I expected.
The biggest gain isn't catching bugs. It's having a conversation partner that knows your codebase. Rubber duck debugging, but the duck actually talks back with useful suggestions.
The biggest risk is over-reliance. I've caught myself accepting AI suggestions without fully understanding them. That's dangerous. Every change still needs human review — AI review is additive, not a replacement.
Numbers
Across my five websites (Next.js 14-16, TypeScript, various deployment targets):
- ~40% fewer post-deploy bugs since adopting AI review
- Code review time per PR: down from ~25 minutes to ~10 minutes
- False positive rate on AI suggestions: roughly 15-20% (things it flags that aren't actually issues)
Not revolutionary. But consistently better.
I'm building five websites with AI tools. Previous posts cover subscription savings and streaming comparisons.