PADMANABHA DAS

How I Review Pull Requests with Claude (and Actually Merge Them)

Last month, I merged a critical PR that touched 14 files across 11 services. The usual process — opening GitHub, switching between tabs, mentally tracking what changed where — would've taken me an hour. I did it in 12 minutes.

The secret? I stopped treating Claude as a chatbot and started treating it as a GitHub client.

The Problem with Traditional PR Reviews

Here's my typical PR review workflow before I changed things:

  1. Open GitHub, read the PR description
  2. Click through each file, lose context switching between them
  3. Open the codebase locally to understand the broader picture
  4. Go back to GitHub, forget what I just read
  5. Check the commit history to understand how the PR evolved
  6. Repeat steps 2-5 until frustrated enough to just approve

The mental overhead is exhausting. And when you're reviewing your own PRs (solo developers, you know the drill), the context-switching kills whatever flow you had while coding.

I tried various solutions — GitHub CLI, VS Code extensions, even custom scripts. Nothing stuck because they all added friction in different places. What I wanted was simple: review PRs the way I think about them, not the way GitHub's UI organizes them.

Enter MCP: Making Claude Actually Useful for GitHub

MCP (Model Context Protocol) is Anthropic's open standard that lets AI assistants connect to external tools and data sources. Instead of copy-pasting code into Claude and asking "what does this do?", MCP lets Claude directly interact with services — including GitHub.

Think of it like this: without MCP, Claude is smart but blind. It can analyze code you show it, but it can't see your repositories, can't check your PR status, can't read your latest commits. With MCP, Claude gets eyes and hands.

The protocol works by running a local server that exposes "tools" — specific actions Claude can take. When I ask Claude about a PR, it doesn't scrape GitHub's website. It calls structured APIs through the MCP server, gets clean JSON responses, and reasons about them just like it reasons about any other information.
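
Under the hood this is ordinary JSON-RPC as defined by the MCP spec. The exact framing depends on the client, but a tool call from Claude to the server looks roughly like this (the values shown are illustrative, matching the PR reviewed later in this post):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get-pull-request-details",
    "arguments": { "owner": "chayan-1906", "repository": "PulsePress-Node.js", "prNumber": 9 }
  }
}

The server replies with a content array of text blocks, which is exactly what the tool handler shown later in this article returns.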

I built a GitHub MCP server with 44+ tools that gives Claude full access to repositories, branches, commits, issues, pull requests, and releases. The tools are organized by what they do:

  • Repository tools: Create, update, delete repos; manage collaborators; change visibility
  • Branch tools: Create branches, set defaults, view branch details
  • Commit tools: List commits, see what changed in each commit
  • Issue tools: Full issue lifecycle — create, update, assign, close
  • PR tools: The stars of this article — everything from listing PRs to merging them
  • Release tools: Create and manage releases

It's not a wrapper around the GitHub CLI. It's a proper integration that lets Claude understand my codebase the way I do.

Repository: github.com/chayan-1906/GitHub-MCP

How the MCP Server Works Under the Hood

When I ask Claude about a PR, here's what actually happens:

  1. Claude recognizes it needs GitHub data and identifies the appropriate tool
  2. It calls the MCP server with structured parameters (owner, repo, PR number)
  3. The server authenticates with GitHub using my stored OAuth token
  4. GitHub's API returns JSON data
  5. The server passes that data back to Claude
  6. Claude reasons about the data and responds in natural language

Each tool in my MCP server follows a consistent pattern. Here's a simplified look at how get-pull-request-details is registered:

server.tool(
  'get-pull-request-details',
  'Fetches detailed information about a specific GitHub pull request',
  {
    // Zod schemas validate the parameters and double as documentation for Claude
    owner: z.string().describe('GitHub username or organization'),
    repository: z.string().describe('Repository name'),
    prNumber: z.number().describe('Pull request number'),
  },
  async ({ owner, repository, prNumber }) => {
    // Authenticate with the stored OAuth token, call GitHub's API, hand the raw JSON back to Claude
    const accessToken = await getGitHubAccessToken();
    const prData = await fetchPullRequest(accessToken, owner, repository, prNumber);
    return { content: [{ type: 'text', text: JSON.stringify(prData, null, 2) }] };
  }
);

The z.string() and z.number() are Zod validators — they ensure Claude passes the right parameter types. The tool description helps Claude understand when to use each tool.
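
If a call arrives with a badly typed argument, the Zod schema rejects it before any GitHub request is made. A minimal standalone sketch of the same schema, outside the MCP server, just to show the validation behavior:

import { z } from 'zod';

// Same shape as the tool's parameter schema above
const paramsSchema = z.object({
  owner: z.string(),
  repository: z.string(),
  prNumber: z.number(),
});

paramsSchema.parse({ owner: 'chayan-1906', repository: 'PulsePress-Node.js', prNumber: 9 });   // ok
paramsSchema.parse({ owner: 'chayan-1906', repository: 'PulsePress-Node.js', prNumber: '9' }); // throws ZodError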

My PR Review Workflow (With a Real Example)

Let me walk you through how I actually review PRs now, using a real example from my news aggregation platform, PulsePress. This was a critical fix for my AI model fallback system — the kind of PR that would normally require careful, tedious review.

Step 1: Get the Big Picture

I start by asking Claude to fetch the PR details. No need to open GitHub at all:

Show me PR #9 from PulsePress-Node.js repository

Claude calls the get-pull-request-details tool and returns:

{
  "number": 9,
  "title": "fix: implement centralized AI model fallback system",
  "state": "closed",
  "merged": true,
  "commits": 6,
  "additions": 492,
  "deletions": 473,
  "changedFiles": 14,
  "headBranch": "fix/ai-fallback-system",
  "baseBranch": "master",
  "labels": ["critical-fix", "refactor", "system-wide", "reliability", "no-breaking-changes"]
}

Right away, I know this is a significant change — 14 files, nearly 1000 lines touched, labeled as critical. Claude also fetches the full PR description, which explains the problem: AI services were failing completely when a model became unavailable (say, Google deprecates gemini-2.3-flash-lite) instead of gracefully falling back to alternatives.

That's a critical vulnerability. If one AI model goes down, all 11 AI features in my app break. Not good.

Step 2: Understand What Changed

Next, I need to see which files were modified and how significant each change is:

List all files changed in this PR with their stats

Claude calls list-pull-request-files:

[
  { "filename": "src/services/QuotaService.ts", "status": "modified", "additions": 61, "deletions": 1 },
  { "filename": "src/services/SentimentAnalysisService.ts", "status": "modified", "additions": 32, "deletions": 31 },
  { "filename": "src/services/SummarizationService.ts", "status": "modified", "additions": 43, "deletions": 39 },
  { "filename": "src/services/ArticleEnhancementService.ts", "status": "modified", "additions": 46, "deletions": 50 },
  { "filename": "src/services/QuestionAnswerService.ts", "status": "modified", "additions": 57, "deletions": 55 },
  { "filename": "src/types/quota.ts", "status": "modified", "additions": 18, "deletions": 3 }
]

The pattern jumps out immediately: QuotaService.ts got 61 new lines with only 1 deletion — that's where the core fix lives. The service files all have roughly equal additions and deletions — they're being migrated to use the new pattern. And quota.ts has new type definitions to support the change.

This is a classic refactor structure: introduce abstraction in one place, migrate consumers everywhere else.

Step 3: Review the Core Changes

Now I dig into the actual implementation. I ask Claude to show me the new code:

Show me the new executeWithModelFallback method in QuotaService.ts

Claude uses get-file-content to read the file, and I see the heart of the fix:

static async executeWithModelFallback<T>({
  primaryModel,
  fallbackModels,
  executeAICall,
  count = 1
}: IExecuteWithModelFallbackParams<T>): Promise<IExecuteWithModelFallbackResponse<T>> {

  const modelsToTry: string[] = [primaryModel, ...fallbackModels];
  const attemptedModels: string[] = [];

  for (const modelName of modelsToTry) {
    attemptedModels.push(modelName);

    // Skip invalid models
    if (!GEMINI_MODELS.includes(modelName as TGeminiModel)) {
      console.warn('Unknown Gemini model, skipping', { modelName });
      continue;
    }

    // Try to reserve quota
    const quotaResult = await this.reserveQuotaForGeminiModel({
      modelName: modelName as TGeminiModel,
      count
    });

    if (!quotaResult.allowed) continue;

    // Quota reserved, now try the actual AI call
    try {
      const result = await executeAICall(modelName);
      return { success: true, result, selectedModel: modelName, attemptedModels };
    } catch (error: any) {
      // AI call failed, try next model
      console.warn('AI call failed, trying next model', { modelName, error: error.message });
    }
  }

  // All models exhausted
  return { success: false, error: 'ALL_AI_CALLS_FAILED', selectedModel: '', attemptedModels };
}

This is elegant. Previously, the code had a 2-step pattern: reserve quota first, then make the AI call separately. If the AI call failed (model deprecated, 404 error, rate limit), there was no retry mechanism — just complete failure. Now quota reservation and AI execution happen together inside a retry loop. If one model fails, it automatically tries the next.

Step 4: Verify the Migration Pattern

I want to make sure all 11 services migrated correctly. Consistency matters — if one service handles errors differently, that's a bug waiting to happen.

I pick one service and check:

Show me how SentimentAnalysisService.ts uses the new fallback method

The old pattern was:

// OLD: 2-step pattern (problematic)
const quotaReservation = await QuotaService.reserveQuotaForModelFallback({
  primaryModel: AI_SENTIMENT_MODELS[0],
  fallbackModels: AI_SENTIMENT_MODELS.slice(1),
  count: 1,
});
if (!quotaReservation.allowed) return { error: 'QUOTA_EXHAUSTED' };

const result = await this.analyzeWithGemini(quotaReservation.selectedModel, content);
// If this fails, no retry. Just failure.

The new pattern:

// NEW: 1-step pattern with automatic fallback
const fallbackResult = await QuotaService.executeWithModelFallback({
  primaryModel: AI_SENTIMENT_ANALYSIS_MODELS[0],
  fallbackModels: AI_SENTIMENT_ANALYSIS_MODELS.slice(1),
  executeAICall: (modelName: string) => this.analyzeWithGemini(modelName, content),
  count: 1,
});

if (!fallbackResult.success) {
  if (fallbackResult.error === 'QUOTA_EXHAUSTED') {
    return { error: 'GEMINI_DAILY_LIMIT_REACHED' };
  }
  return { error: 'SENTIMENT_ANALYSIS_FAILED' };
}

const result = fallbackResult.result!;

Clean. The AI call is now passed as a callback, and the retry logic is handled centrally. I spot-check two more services — same pattern. Consistent, reviewable, maintainable.

Step 5: Examine the Type Definitions

Good TypeScript code needs good types. I ask Claude to show me the new interfaces:

Show me the new types added in quota.ts

Claude reads the file with get-file-content and returns the new interfaces:

export interface IExecuteWithModelFallbackParams<T> {
  primaryModel: string;
  fallbackModels: string[];
  executeAICall: (modelName: string) => Promise<T>;
  count?: number;
}

export interface IExecuteWithModelFallbackResponse<T> {
  success: boolean;
  result?: T;
  error?: string;
  selectedModel: string;
  attemptedModels: string[];
}

The generic <T> is important — it means the same fallback method works for any AI service, whether it returns sentiment data, summaries, tags, or anything else. The attemptedModels array is clever too; it gives visibility into which models were tried before success (or failure). Great for debugging production issues.
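
For illustration, here's what the same call could look like from a hypothetical summarization service (ISummaryResult, AI_SUMMARIZATION_MODELS, summarizeWithGemini, and articleText are stand-in names, not the actual PulsePress code):

interface ISummaryResult {
  summary: string;
  keyPoints: string[];
}

// T is inferred as ISummaryResult from the executeAICall callback's return type
const fallbackResult = await QuotaService.executeWithModelFallback<ISummaryResult>({
  primaryModel: AI_SUMMARIZATION_MODELS[0],
  fallbackModels: AI_SUMMARIZATION_MODELS.slice(1),
  executeAICall: (modelName: string) => summarizeWithGemini(modelName, articleText),
});

// fallbackResult.result is typed as ISummaryResult | undefined, no casting needed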

Step 6: Check the Commit History

Sometimes the commit history reveals important context about how a solution evolved:

Show me the commits in this PR

Claude calls list-pull-request-commits and returns 6 commits. This tells me the solution was iterated, not rushed. The developer (me, weeks ago) made incremental progress rather than one massive "fix everything" commit.

Step 7: The Merge Decision

At this point, I have everything I need:

  • ✅ Understood the problem (AI model failures with no fallback)
  • ✅ Reviewed the core solution (centralized executeWithModelFallback)
  • ✅ Verified migration consistency across all 11 services
  • ✅ Examined type definitions for proper generics and error handling
  • ✅ Confirmed no breaking changes (same API responses, just more reliable)
  • ✅ Checked commit history for incremental development

The PR is solid. Ready to merge.

Claude has merge-pull-request available with configurable merge strategies (merge commit, squash, rebase). For this PR, I'd use squash merge to keep the main branch history clean. But for critical changes like this, I still click the merge button in GitHub's UI — old habits die hard, and there's something reassuring about that final manual confirmation.
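
I haven't shown merge-pull-request's schema here, but it follows the same registration pattern as the earlier tool. A sketch of how the strategy parameter might be exposed (the actual schema in the repo may differ, and mergePullRequest is a placeholder helper):

server.tool(
  'merge-pull-request',
  'Merges a GitHub pull request using the chosen strategy',
  {
    owner: z.string().describe('GitHub username or organization'),
    repository: z.string().describe('Repository name'),
    prNumber: z.number().describe('Pull request number'),
    mergeMethod: z.enum(['merge', 'squash', 'rebase']).describe('Merge strategy'),
  },
  async ({ owner, repository, prNumber, mergeMethod }) => {
    const accessToken = await getGitHubAccessToken();
    // GitHub's REST merge endpoint accepts merge_method: merge, squash, or rebase
    const result = await mergePullRequest(accessToken, owner, repository, prNumber, mergeMethod);
    return { content: [{ type: 'text', text: JSON.stringify(result, null, 2) }] };
  }
);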

The Tools That Make This Work

Here's a quick reference of the GitHub MCP tools I use most for PR reviews:

  • get-pull-request-details: Fetches PR metadata, description, labels, merge status
  • list-pull-request-files: Lists all changed files with additions/deletions
  • list-pull-request-commits: Shows commit history within the PR
  • get-file-content: Reads any file from a specific branch
  • get-commit-modifications: Shows exactly what changed in a specific commit
  • create-pull-request-review: Submits a review: approve, request changes, or comment
  • merge-pull-request: Merges with a configurable strategy (merge/squash/rebase)

The server has 44+ tools total covering repositories, branches, issues, releases, and more. But for PR reviews, these seven are my daily drivers.

Why This Actually Works

Three reasons this workflow beats traditional PR reviews:

1. No context switching. Everything happens in one conversation. I don't lose track of what I read when I switch tabs because I never switch tabs. The entire review happens in Claude.

2. Claude remembers the conversation. When I ask "does the error handling in ArticleEnhancementService follow the same pattern we saw earlier?", Claude already knows what pattern I'm referring to. No need to re-explain context.

3. I can ask questions about the code. "Why did we change from return-based errors to exceptions in the Gemini methods?" Claude analyzes the diff and explains that exceptions allow the fallback loop to continue execution, while return { error } would exit immediately without trying other models.
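
That explanation lines up with how executeWithModelFallback is written: it only advances to the next model when the callback throws, so the Gemini helpers need to signal failure with an exception. A sketch of the idea (callGemini is a hypothetical helper, not the actual PulsePress code):

// Inside one of the AI services
static async analyzeWithGemini(modelName: string, content: string) {
  const response = await callGemini(modelName, content);
  if (!response.ok) {
    // Throwing lets the fallback loop catch the failure and move on to the next model
    throw new Error(`Gemini call failed for ${modelName}: ${response.status}`);
  }
  // Returning { error } here instead would look like a success to executeWithModelFallback,
  // so the remaining fallback models would never be attempted
  return response.data;
}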

Setting It Up

If you want to try this workflow:

  1. Download the executable from the GitHub releases page (macOS and Windows available)
  2. Run it once — it auto-configures Claude Desktop
  3. Launch Claude Desktop
  4. Start asking about your repositories

The server handles GitHub OAuth automatically. You authenticate once in your browser, and the token persists. No manual API token management.
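
For reference, the auto-configuration just adds an entry to Claude Desktop's claude_desktop_config.json. It ends up looking roughly like this (the server name and binary path are illustrative and depend on where the executable lives):

{
  "mcpServers": {
    "github-mcp": {
      "command": "/path/to/GitHub-MCP",
      "args": []
    }
  }
}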

The Honest Limitations

This isn't perfect. A few things I've learned:

  • Large diffs are still hard. If a PR touches 50+ files, even Claude struggles with that much context. I break those into focused questions about specific parts rather than trying to review everything at once.
  • It's not a replacement for running the code. I still pull branches locally for anything that needs actual testing or debugging. Claude can tell me what the code does, but it can't tell me if it actually works.
  • Rate limits exist. Heavy usage hits GitHub's API limits. The server handles rate limiting gracefully, but be aware during intense review sessions. I've never actually hit the limit during normal use, but batch operations (like reviewing 10 PRs in a row) might get throttled.
  • Private repos need proper OAuth scopes. If you're working with organization repositories or private repos, make sure your OAuth app has the right permissions. The server prompts for this during setup, but it's worth double-checking if you run into access issues.

Tips for Getting the Most Out of This Workflow

After using this setup for a few months, here's what I've learned:

Ask specific questions. "Review this PR" is too vague. "Show me all files that modify error handling" gives Claude something concrete to work with.

Start with the high-level view. I always get PR details first, then file list, then dive into specific files. This top-down approach matches how I think about code changes.

Use Claude's memory. Once you've discussed a pattern or decision earlier in the conversation, you can reference it. "Does this file follow the same pattern?" works because Claude remembers what pattern you discussed.

Don't be afraid to ask "why." "Why did we change from returns to exceptions here?" — Claude can often infer the reasoning from the surrounding code and explain it back to you.

What Changed For Me

I used to dread PR reviews. The clicking, the tab switching, the constant loss of mental context. Now I actually enjoy them. The friction is gone. I can focus on what matters — understanding the change, evaluating the design, catching potential bugs — instead of fighting with the interface.

If you're spending more time navigating GitHub than actually reviewing code, give this a shot. Setup takes five minutes. The productivity gain is permanent.
