
Jonescodes

Originally published at willjonesdev.com

Building Pattern-Aware Code Review with Cloudflare and Claude

Nobody loves code review. You open a PR, scan the diff, catch a typo, approve it, move on. The deeper stuff, like pattern consistency and architectural alignment, takes real time and real focus. Most of us don't have enough of either.

I wanted to build something that catches pattern drift. Not a linter. Not static analysis. Something that understands how your codebase handles a problem and tells you when new code does it differently.

Why existing AI reviewers fall short

Most AI code review tools see only the diff. They give generic advice like "consider adding error handling" without knowing that your project already has a handleApiError() wrapper used in every API method. The feedback sounds smart but isn't grounded in your codebase.

The difference between a generic suggestion and a useful one:

Generic: "Consider adding error handling"

Pattern-aware: "This bypasses the handleApiError() wrapper
used in every other API method. See src/utils/errors.ts:12"

The second one is only possible if the reviewer has seen your code.

Starting as an MCP tool

The first version of DiffPrism was a Claude Code MCP tool. MCP (Model Context Protocol) lets you build tools that AI agents can call. DiffPrism gave Claude Code a browser UI for reviewing diffs. You could open a review, see the changes, and Claude would post inline comments.

It worked. But it had too much friction. You had to be in a Claude Code session. You had to prompt it. You had to have the MCP server running. Every step was manual.

Real code review happens on GitHub. PRs live there. Discussions live there. Merges live there. A separate tool will always be an extra step. So I pivoted.

Moving to a GitHub App

A GitHub App receives webhooks when things happen on a repo. PR opened, code pushed, comment posted. No manual triggering needed.

The new DiffPrism is a Cloudflare Worker that listens for GitHub webhooks. Someone comments /review on a PR. DiffPrism fetches the diff, gathers codebase context, sends everything to Claude, and posts inline comments on the PR. The developer never leaves GitHub.
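The entry point is a small decision: does this webhook event start a review? A sketch, using field names from GitHub's issue_comment event payload (the helper name is mine, not DiffPrism's actual code):

```javascript
// Sketch: deciding whether an incoming webhook should start a review.
// Payload fields follow GitHub's issue_comment event; the helper name
// is hypothetical.
function shouldTriggerReview(event, payload) {
  return (
    event === "issue_comment" &&
    payload.action === "created" &&
    // Comments on PRs carry an issue.pull_request key; plain issues don't.
    payload.issue?.pull_request !== undefined &&
    payload.comment.body.trim().startsWith("/review")
  );
}
```

Everything after that check is queued work, so the webhook handler itself returns in milliseconds.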

Two Cloudflare Workers talk to each other through a service binding:

GitHub webhook
  → DiffPrism Worker (receives event, queues review job)
  → Queue consumer (fetches diff, gets context, calls Claude, posts comments)
  → Repo Context Service (semantic search, related files, conventions)

Service bindings let workers call each other directly without going through the public internet. The context service stays private but the GitHub App can still reach it.
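Declaring a binding is a few lines of Wrangler config. A sketch, with binding and service names that are my assumptions rather than DiffPrism's actual config:

```toml
# In the GitHub App worker's wrangler.toml: expose the private context
# service as env.CONTEXT. Names here are illustrative.
[[services]]
binding = "CONTEXT"
service = "repo-context-service"
```

The app worker can then call `env.CONTEXT.fetch(...)` and the request goes worker-to-worker, never touching a public route.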

The repo context service

This is where I spent the most time. The context service indexes your entire repository and makes it searchable. This is what makes pattern-aware review possible.

When you install DiffPrism on a repo, it triggers indexing. The service walks the repo tree, splits code into semantic chunks at function and class boundaries, generates vector embeddings with Workers AI, and stores everything in Cloudflare Vectorize. It also tracks imports and exports to build an import graph.
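A heavily simplified sketch of boundary-based chunking, assuming JavaScript sources (the real indexer handles more languages and edge cases, and feeds each chunk to Workers AI for embedding):

```javascript
// Simplified sketch: split a JS file wherever a top-level function or
// class declaration starts, so each chunk maps to one semantic unit
// instead of an arbitrary window of lines.
function chunkAtBoundaries(source) {
  const boundary = /^(?:export\s+)?(?:async\s+)?(?:function|class)\s+\w+/;
  const chunks = [];
  let current = [];
  for (const line of source.split("\n")) {
    if (boundary.test(line) && current.length > 0) {
      chunks.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}
```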

When a review happens, DiffPrism asks the context service two questions:

  1. What code is semantically similar to what changed?
  2. What files connect to the changed files through imports?

These answers get fed to Claude alongside the diff. Now Claude can reference your actual patterns instead of guessing.
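Those two questions map to two calls over the service binding. A sketch, where the endpoint paths and payload shapes are my assumptions:

```javascript
// Sketch: gather the two kinds of context before calling Claude.
// Endpoint paths and response shapes are illustrative, not DiffPrism's API.
async function gatherContext(changedFiles, env) {
  // 1. What code is semantically similar to what changed?
  const searchRes = await env.CONTEXT.fetch("https://context/search/batch", {
    method: "POST",
    body: JSON.stringify({ queries: changedFiles.map((f) => f.patch) }),
  });
  const similar = await searchRes.json();

  // 2. What files connect to the changed files through imports?
  const relatedRes = await env.CONTEXT.fetch("https://context/related-files", {
    method: "POST",
    body: JSON.stringify({ paths: changedFiles.map((f) => f.path) }),
  });
  const related = await relatedRes.json();

  return { similar, related };
}
```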

The whole stack runs on Cloudflare. D1 for the database, R2 for storage, Vectorize for vector search, Workers AI for embeddings, Queues for async processing. $5/month covers everything at my current scale.

Quality problems I had to solve

Getting context retrieval right took iteration.

Combined query problem. I was embedding the entire PR diff as a single search query. A diff that touches routing, auth, and tests produces a diluted embedding that matches nothing well. I added a batch search endpoint that accepts multiple queries. Each changed file gets its own focused embedding. Three queries across express.js returned six unique relevant results. One combined query returned noise.
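The fix, roughly (the helper name is hypothetical):

```javascript
// Sketch: one focused query per changed file instead of one embedding of
// the whole diff. A PR touching routing, auth, and tests then produces
// three sharp vectors rather than one blurry one.
function buildBatchQueries(filesInDiff) {
  return filesInDiff.map((file) => ({
    path: file.path,
    query: `${file.path}\n${file.patch}`,
  }));
}
```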

Path-only embeddings. The related files tool was embedding "File: src/router/index.js" as the vector query. Almost nothing to work with. I changed it to look up the file's exports and first chunk of content. Related file scores jumped from 0.6 to 0.91. lib/request.js now finds lib/response.js as highly related. Obviously correct.
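The before-and-after, sketched with illustrative field names:

```javascript
// Sketch: build the text that gets embedded when looking up related files.
// Before, only the "File: ..." line was embedded -- almost no signal.
// After, the file's exports and first chunk of content come along too.
function relatedFileQueryText(file) {
  return [
    `File: ${file.path}`,
    `Exports: ${file.exports.join(", ")}`,
    file.firstChunk,
  ].join("\n");
}
```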

Missing conventions. DiffPrism could search code and find related files but could not ask about naming conventions, test patterns, or import style. I exposed architecture and conventions detection as JSON API endpoints so the review prompt can include project standards.
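The responses from those endpoints look roughly like this (field names are illustrative; the detected values are from the express.js run described below):

```json
{
  "architecture": {
    "primaryLanguage": "JavaScript",
    "directories": ["lib/", "test/"],
    "entryPoint": "index.js",
    "tooling": ["ESLint", "Prettier"]
  },
  "conventions": {
    "naming": "kebab-case",
    "imports": "CommonJS"
  }
}
```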

Indexing timeouts. Indexing a 200-file repo timed out because the entire pipeline ran synchronously. The embedding step alone takes minutes for hundreds of chunks. I split it into two phases. The HTTP request stores pre-walked data in R2 and queues a job. The queue consumer reads from R2 and runs the pipeline async. Express.js (209 files, 308 chunks) indexed without issues.
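The two phases, sketched with assumed binding and function names:

```javascript
// Phase 1 (HTTP handler): stash the pre-walked file list in R2, queue a
// job, and return immediately instead of embedding inline.
async function startIndexing(repo, files, env) {
  const key = `index-jobs/${repo}.json`;
  await env.BUCKET.put(key, JSON.stringify(files)); // R2: stash walked tree
  await env.INDEX_QUEUE.send({ repo, key });        // Queues: defer heavy work
  return { queued: true, fileCount: files.length };
}

// Phase 2 (queue consumer): read from R2 and run the slow pipeline with
// no HTTP timeout looming.
async function consumeIndexJob(message, env) {
  const obj = await env.BUCKET.get(message.key);
  const files = JSON.parse(await obj.text());
  // ...the real consumer chunks each file, embeds the chunks with
  // Workers AI, and upserts the vectors into Vectorize...
  return files.length;
}
```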

Testing at scale

I indexed express.js to validate beyond small repos. 209 files, 308 chunks, all processed async through the queue.

Search for "middleware error handling" returned the error handling example file at 0.78 relevance. "Routing and route matching" returned router files. Architecture detection correctly identified JavaScript as the primary language, lib/ and test/ as the directory structure, index.js as the entry point, ESLint and Prettier as tooling. Conventions detected kebab-case naming and CommonJS imports. All correct.

What's next

The core review loop works. Comment /review, get inline comments that reference your codebase in about 10 seconds. Next is making it production-ready: real multi-tenant auth, Stripe billing, and upgrading from Haiku to Sonnet for better review quality.

The end goal is a code review tool that knows your codebase well enough to catch what humans miss when reviewing at speed. Not replacing human review. Making it better.

If you want to try it, DiffPrism is free for up to 10 reviews per month. Install the GitHub App and comment /review on any PR.
