# Building a Code Review Bot with Claude API and GitHub Actions

*Automate your code reviews with AI-powered analysis that catches real bugs, security issues, and style violations — directly in your pull requests.*
Every engineering team faces the same bottleneck: code reviews take time, reviewers get fatigued, and important issues slip through. What if an AI assistant could handle the first pass — flagging security vulnerabilities, catching logic errors, and enforcing style conventions — before a human even opens the PR?
In this article, you'll build a production-ready code review bot that integrates Claude's API with GitHub Actions. It will automatically analyze pull request diffs, post structured comments, and help your team ship higher-quality code faster.
## What We're Building

By the end of this tutorial, you'll have:
- A GitHub Action that triggers on every pull request
- A Node.js script that fetches the PR diff and sends it to Claude for analysis
- Structured review output posted as GitHub PR comments
- Rate limiting, error handling, and cost controls built in
The bot will analyze code for:
- Security vulnerabilities (SQL injection, XSS, hardcoded secrets)
- Logic bugs and edge cases
- Performance antipatterns
- Missing error handling
- Style and maintainability issues
## Prerequisites
- A GitHub repository
- An Anthropic API key (get one at console.anthropic.com)
- Node.js 18+ (for local testing)
- Basic familiarity with GitHub Actions
## Step 1: Setting Up the GitHub Action

Create `.github/workflows/ai-code-review.yml` in your repository:

```yaml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]
    # Only run on PRs targeting main/master
    branches:
      - main
      - master

# Limit concurrent runs to avoid API rate limits
concurrency:
  group: ai-review-${{ github.event.pull_request.number }}
  cancel-in-progress: true

jobs:
  ai-review:
    name: Claude Code Review
    runs-on: ubuntu-latest
    # Skip draft PRs
    if: github.event.pull_request.draft == false
    permissions:
      contents: read
      pull-requests: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          # Full history so we can diff base against head
          fetch-depth: 0

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm install @anthropic-ai/sdk@latest @octokit/rest@latest

      - name: Run AI Code Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO_OWNER: ${{ github.repository_owner }}
          REPO_NAME: ${{ github.event.repository.name }}
          BASE_SHA: ${{ github.event.pull_request.base.sha }}
          HEAD_SHA: ${{ github.event.pull_request.head.sha }}
        run: node .github/scripts/ai-review.js

      - name: Upload review artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: ai-review-${{ github.event.pull_request.number }}
          path: /tmp/ai-review-*.json
          retention-days: 7
```

Note that `setup-node`'s `cache: 'npm'` option is deliberately omitted: it fails when the repository has no lockfile, and here we install the two dependencies ad hoc.
This action:
- Triggers on PR open, update, and reopen events
- Skips draft PRs to avoid wasting API calls
- Uses concurrency controls to cancel stale runs
- Grants write access to post PR comments
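If your repository mixes source code with docs or assets, you can optionally narrow the trigger with a `paths` filter so the workflow only runs when reviewable files change. A sketch (the globs are illustrative; adjust them to your repo layout):

```yaml
on:
  pull_request:
    types: [opened, synchronize, reopened]
    branches: [main, master]
    # Illustrative globs; tune for your repo
    paths:
      - 'src/**'
      - '.github/**'
      - '!**/*.md'
```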
## Step 2: The Review Script

Create `.github/scripts/ai-review.js`:

```javascript
import Anthropic from '@anthropic-ai/sdk';
import { Octokit } from '@octokit/rest';
import { execSync } from 'child_process';
import fs from 'fs';

// Configuration
const MAX_DIFF_LINES = 500;    // Truncate massive per-file diffs
const MAX_FILE_SIZE = 50_000;  // Skip files whose diff exceeds 50KB
const REVIEW_COMMENT_TAG = '<!-- AI-CODE-REVIEW -->';

// File extensions to review (skip binary, generated files)
const REVIEWABLE_EXTENSIONS = new Set([
  '.js', '.jsx', '.ts', '.tsx', '.mjs', '.cjs',
  '.py', '.rb', '.go', '.rs', '.java', '.kt',
  '.c', '.cpp', '.h', '.hpp', '.cs',
  '.php', '.swift', '.scala',
  '.sh', '.bash', '.zsh',
  '.yml', '.yaml', '.json', '.toml',
  '.sql', '.graphql',
]);

// Files/patterns to skip
const SKIP_PATTERNS = [
  /node_modules/,
  /\.min\.(js|css)$/,
  /package-lock\.json$/,
  /yarn\.lock$/,
  /pnpm-lock\.yaml$/,
  /dist\//,
  /build\//,
  /\.generated\./,
  /__generated__/,
  /migrations\/\d+/,
];

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const octokit = new Octokit({
  auth: process.env.GITHUB_TOKEN,
});

/**
 * Fetch the PR diff using git. The three-dot form diffs against the
 * merge base, so commits already on the base branch aren't re-reviewed.
 */
function getPRDiff(baseSha, headSha) {
  try {
    return execSync(
      `git diff ${baseSha}...${headSha} --unified=5 --no-color`,
      { maxBuffer: 10 * 1024 * 1024 } // 10MB buffer
    ).toString();
  } catch (error) {
    console.error('Failed to get diff:', error.message);
    throw error;
  }
}

/**
 * Parse the raw diff into per-file chunks
 */
function parseDiff(rawDiff) {
  const files = [];
  const parts = rawDiff.split(/^diff --git /m).slice(1);

  for (const part of parts) {
    const lines = part.split('\n');
    const headerLine = lines[0]; // "a/path b/path"
    const match = headerLine.match(/a\/(.*?) b\/(.*)/);
    if (!match) continue;

    const filePath = match[2];
    const ext = '.' + filePath.split('.').pop();

    // Skip non-reviewable files
    if (!REVIEWABLE_EXTENSIONS.has(ext)) continue;
    if (SKIP_PATTERNS.some(p => p.test(filePath))) continue;

    const diffContent = 'diff --git ' + part;
    if (diffContent.length > MAX_FILE_SIZE) continue; // Skip oversized diffs

    // Don't count the '+++'/'---' file headers as changed lines
    const addedLines = lines.filter(l => l.startsWith('+') && !l.startsWith('+++')).length;
    const removedLines = lines.filter(l => l.startsWith('-') && !l.startsWith('---')).length;

    files.push({
      path: filePath,
      diff: diffContent,
      addedLines,
      removedLines,
      lineCount: lines.length,
    });
  }

  return files;
}

/**
 * Build the review prompt for Claude
 */
function buildReviewPrompt(files) {
  const fileSection = files.map(f => {
    const truncatedDiff = f.diff.split('\n').slice(0, MAX_DIFF_LINES).join('\n');
    const wasTruncated = f.diff.split('\n').length > MAX_DIFF_LINES;
    return `### File: ${f.path}
${wasTruncated ? `*(diff truncated at ${MAX_DIFF_LINES} lines)*\n` : ''}
\`\`\`diff
${truncatedDiff}
\`\`\``;
  }).join('\n\n');

  return `You are an expert code reviewer performing a thorough analysis of a pull request diff.

Review the following code changes and identify:

1. **Security vulnerabilities** — SQL injection, XSS, CSRF, path traversal, hardcoded credentials, insecure dependencies
2. **Logic bugs** — off-by-one errors, null/undefined dereferences, race conditions, incorrect conditionals
3. **Error handling gaps** — unhandled promise rejections, missing try/catch, unchecked return values
4. **Performance issues** — N+1 queries, unnecessary re-renders, missing memoization, memory leaks
5. **Maintainability** — overly complex functions, missing comments on non-obvious logic, magic numbers

For each issue found:
- Specify the **file path** and approximate **line number**
- Give a **severity**: CRITICAL, HIGH, MEDIUM, LOW
- Provide a **clear explanation** of the problem
- Suggest a **concrete fix** with example code when applicable

If the code is well-written with no significant issues, say so clearly.

Format your response as JSON matching this schema:

{
  "summary": "2-3 sentence overall assessment",
  "score": 1-10,
  "issues": [
    {
      "file": "path/to/file.js",
      "line": 42,
      "severity": "HIGH",
      "category": "security|logic|error-handling|performance|style",
      "title": "Short issue title",
      "description": "Detailed explanation",
      "suggestion": "How to fix it",
      "codeExample": "Optional corrected code snippet"
    }
  ],
  "positives": ["List of things done well"],
  "blockers": true/false
}

## Changes to Review

${fileSection}`;
}

/**
 * Call the Claude API to review the code
 */
async function reviewWithClaude(prompt) {
  console.log('Sending diff to Claude for review...');

  const message = await anthropic.messages.create({
    model: 'claude-opus-4-5',
    max_tokens: 4096,
    temperature: 0.1, // Low temperature for consistent, precise analysis
    system: `You are a senior software engineer performing code review. You are thorough, precise, and constructive.
You prioritize security and correctness above all else. You always respond with valid JSON.`,
    messages: [
      {
        role: 'user',
        content: prompt,
      },
    ],
  });

  const responseText = message.content[0].text;

  // Extract JSON from the response (Claude sometimes wraps it in markdown)
  const jsonMatch = responseText.match(/\{[\s\S]*\}/);
  if (!jsonMatch) {
    throw new Error('Claude did not return valid JSON');
  }

  return JSON.parse(jsonMatch[0]);
}

/**
 * Format review results as a markdown comment
 */
function formatReviewComment(review, filesReviewed) {
  const severityEmoji = {
    CRITICAL: '🚨',
    HIGH: '⚠️',
    MEDIUM: '💡',
    LOW: '📝',
  };

  const scoreEmoji = review.score >= 8 ? '✅' : review.score >= 6 ? '🟡' : '🔴';

  let comment = `${REVIEW_COMMENT_TAG}
## 🤖 AI Code Review

${scoreEmoji} **Score: ${review.score}/10** | **Files reviewed: ${filesReviewed}** | **Blockers: ${review.blockers ? 'YES ⛔' : 'None ✅'}**

### Summary
${review.summary}

`;

  if (review.issues && review.issues.length > 0) {
    // Group issues by severity
    const bySeverity = { CRITICAL: [], HIGH: [], MEDIUM: [], LOW: [] };
    for (const issue of review.issues) {
      bySeverity[issue.severity]?.push(issue);
    }

    comment += `### Issues Found (${review.issues.length})\n\n`;

    for (const [severity, issues] of Object.entries(bySeverity)) {
      if (issues.length === 0) continue;
      comment += `#### ${severityEmoji[severity]} ${severity} (${issues.length})\n\n`;

      for (const issue of issues) {
        comment += `**${issue.title}** — \`${issue.file}:${issue.line}\`\n`;
        comment += `${issue.description}\n\n`;
        if (issue.suggestion) {
          comment += `> **Fix:** ${issue.suggestion}\n\n`;
        }
        if (issue.codeExample) {
          comment += `\`\`\`\n${issue.codeExample}\n\`\`\`\n\n`;
        }
      }
    }
  } else {
    comment += `### ✅ No Issues Found\n\nThis change looks good! No significant issues detected.\n\n`;
  }

  if (review.positives && review.positives.length > 0) {
    comment += `### 👍 What's Good\n`;
    for (const positive of review.positives) {
      comment += `- ${positive}\n`;
    }
    comment += '\n';
  }

  comment += `---\n*Reviewed by Claude (claude-opus-4-5) · [Report an issue](https://github.com/anthropics/anthropic-sdk-python/issues)*`;

  return comment;
}

/**
 * Post or update the review comment on the PR
 */
async function postReviewComment(owner, repo, prNumber, commentBody) {
  // Find an existing AI review comment to update
  const { data: comments } = await octokit.rest.issues.listComments({
    owner,
    repo,
    issue_number: prNumber,
    per_page: 100,
  });

  const existingComment = comments.find(c => c.body.includes(REVIEW_COMMENT_TAG));

  if (existingComment) {
    console.log(`Updating existing review comment ${existingComment.id}`);
    await octokit.rest.issues.updateComment({
      owner,
      repo,
      comment_id: existingComment.id,
      body: commentBody,
    });
  } else {
    console.log('Creating new review comment');
    await octokit.rest.issues.createComment({
      owner,
      repo,
      issue_number: prNumber,
      body: commentBody,
    });
  }
}

/**
 * Main execution
 */
async function main() {
  const {
    REPO_OWNER: owner,
    REPO_NAME: repo,
    PR_NUMBER,
    BASE_SHA,
    HEAD_SHA,
  } = process.env;
  const prNumber = parseInt(PR_NUMBER, 10);

  console.log(`\n=== AI Code Review ===`);
  console.log(`Repository: ${owner}/${repo}`);
  console.log(`PR #${prNumber}: ${BASE_SHA}...${HEAD_SHA}\n`);

  try {
    // Step 1: Get the diff
    const rawDiff = getPRDiff(BASE_SHA, HEAD_SHA);
    if (!rawDiff.trim()) {
      console.log('No diff found — skipping review');
      return;
    }

    // Step 2: Parse it into files
    const files = parseDiff(rawDiff);
    console.log(`Found ${files.length} reviewable files`);
    if (files.length === 0) {
      console.log('No reviewable files in this PR (all binary, generated, or lock files)');
      return;
    }

    // Step 3: Build the prompt and review
    const prompt = buildReviewPrompt(files);
    const review = await reviewWithClaude(prompt);
    console.log(`Review complete: score=${review.score}, issues=${review.issues?.length ?? 0}`);

    // Step 4: Save artifacts
    const artifactPath = `/tmp/ai-review-${prNumber}.json`;
    fs.writeFileSync(artifactPath, JSON.stringify({ review, files: files.map(f => f.path) }, null, 2));

    // Step 5: Post the comment
    const comment = formatReviewComment(review, files.length);
    await postReviewComment(owner, repo, prNumber, comment);
    console.log('Review comment posted successfully');

    // Step 6: Fail the check if there are critical blockers
    if (review.blockers && review.issues?.some(i => i.severity === 'CRITICAL')) {
      console.error('\n❌ Critical issues found — failing the check');
      process.exit(1);
    }
  } catch (error) {
    console.error('Review failed:', error);
    // Post an error comment so developers know the bot failed
    try {
      await postReviewComment(owner, repo, prNumber,
        `${REVIEW_COMMENT_TAG}\n## 🤖 AI Code Review\n\n❌ Review failed: ${error.message}\n\nThe AI review bot encountered an error. Please review manually.`
      );
    } catch (commentError) {
      console.error('Failed to post error comment:', commentError.message);
    }
    process.exit(1);
  }
}

main();
```

One gotcha: the script uses ES module syntax (`import`). If your repository's `package.json` doesn't set `"type": "module"`, rename the file to `ai-review.mjs` (and update the workflow path accordingly), or convert the imports to `require()`.
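The script above makes a single API call and fails hard on any error. In practice, transient failures (rate limits, temporary overload responses) are common enough that wrapping `reviewWithClaude` in a small retry helper pays off quickly. A sketch with exponential backoff; the `withRetry` name and the delay values are illustrative, not part of any SDK:

```javascript
// Retry an async operation with exponential backoff.
// Illustrative helper; tune attempts and delays for your rate limits.
async function withRetry(fn, { attempts = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (i === attempts - 1) break; // Out of attempts; rethrow below
      const delay = baseDelayMs * 2 ** i;
      console.warn(`Attempt ${i + 1} failed (${error.message}); retrying in ${delay}ms`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Usage in `main()` would then be `const review = await withRetry(() => reviewWithClaude(prompt));`.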
## Step 3: Adding Secrets to Your Repository

Before the action can run, add your Anthropic API key as a GitHub secret:

```bash
# Using the GitHub CLI
gh secret set ANTHROPIC_API_KEY --body "sk-ant-..." --repo your-org/your-repo

# Or through the web UI:
# Settings → Secrets and variables → Actions → New repository secret
```

The `GITHUB_TOKEN` is automatically available in every workflow run — no setup needed.
## Step 4: Crafting Effective Review Prompts

The quality of your code reviews depends heavily on prompt design. Here are patterns that work well.

### Pattern 1: Chain-of-Thought for Security Analysis

```javascript
const securityPrompt = `
Analyze this code for security vulnerabilities.
Think through each of these attack vectors step by step:

1. Input validation: Is user input sanitized before use?
2. Authentication: Are auth checks applied consistently?
3. Authorization: Can a user access another user's data?
4. Injection: Any SQL, command, or template injection possibilities?
5. Secrets: Any hardcoded credentials or API keys?

For each vector, explain your reasoning before concluding.
`;
```
### Pattern 2: Contextual Reviews with File History

```javascript
// execSync is synchronous, so no async/await is needed here
function getFileContext(filePath, baseSha) {
  // Get the recent git log for this file
  const log = execSync(
    `git log --oneline -10 ${baseSha} -- "${filePath}"`
  ).toString();
  return `Recent changes to ${filePath}:\n${log}`;
}

// Inject the context into the prompt
const contextualPrompt = `
${getFileContext(file.path, BASE_SHA)}

Given this history, review the following change carefully for
regressions and consistency with the existing codebase:

${file.diff}
`;
```
### Pattern 3: Language-Specific Rules

```javascript
function getLanguageGuidelines(ext) {
  const guidelines = {
    '.ts': `
- Check for 'any' types that should be more specific
- Verify async/await error handling
- Look for missing return types on exported functions
- Check for proper null/undefined handling with optional chaining
`,
    '.py': `
- Check for mutable default arguments (def foo(x=[]))
- Verify exception types are specific, not bare 'except:'
- Look for string formatting using % instead of f-strings
- Check for proper resource cleanup (context managers)
`,
    '.go': `
- Verify all errors are explicitly handled, not ignored with _
- Check for goroutine leaks (channels that are never closed)
- Look for race conditions in concurrent code
- Verify defer statements are used correctly
`,
  };
  return guidelines[ext] || '';
}
```
## Step 5: Handling Large Pull Requests

Large PRs need special handling to stay within API token limits:

```javascript
const MAX_TOKENS_PER_REQUEST = 100_000; // Keep each request well under the model's context window
const CHARS_PER_TOKEN_ESTIMATE = 4;
const MAX_CHARS = MAX_TOKENS_PER_REQUEST * CHARS_PER_TOKEN_ESTIMATE;

async function reviewLargePR(files) {
  // Sort by priority: security-sensitive files first.
  // Copy the array so the caller's order isn't mutated.
  const prioritized = [...files].sort((a, b) => {
    const securityFiles = /auth|security|crypto|token|password|secret/i;
    const aIsSecurity = securityFiles.test(a.path);
    const bIsSecurity = securityFiles.test(b.path);
    if (aIsSecurity && !bIsSecurity) return -1;
    if (!aIsSecurity && bIsSecurity) return 1;
    return b.addedLines - a.addedLines; // Then by size
  });

  const batches = [];
  let currentBatch = [];
  let currentSize = 0;

  for (const file of prioritized) {
    const fileSize = file.diff.length;
    if (currentSize + fileSize > MAX_CHARS && currentBatch.length > 0) {
      batches.push(currentBatch);
      currentBatch = [file];
      currentSize = fileSize;
    } else {
      currentBatch.push(file);
      currentSize += fileSize;
    }
  }
  if (currentBatch.length > 0) batches.push(currentBatch);

  console.log(`Splitting ${files.length} files into ${batches.length} review batches`);

  // Review batches sequentially to avoid tripping API rate limits
  const reviews = [];
  for (const [i, batch] of batches.entries()) {
    console.log(`Reviewing batch ${i + 1}/${batches.length} (${batch.length} files)`);
    reviews.push(await reviewWithClaude(buildReviewPrompt(batch)));
  }

  // Merge the results
  return mergeReviews(reviews);
}

function mergeReviews(reviews) {
  const allIssues = reviews.flatMap(r => r.issues || []);
  const avgScore = reviews.reduce((sum, r) => sum + r.score, 0) / reviews.length;
  const hasBlockers = reviews.some(r => r.blockers);

  return {
    summary: reviews.map(r => r.summary).join(' '),
    score: Math.round(avgScore),
    issues: allIssues,
    positives: reviews.flatMap(r => r.positives || []),
    blockers: hasBlockers,
  };
}
```
## Step 6: Inline Review Comments

Instead of (or in addition to) a summary comment, you can post inline comments on specific lines:

```javascript
async function postInlineComments(owner, repo, prNumber, commitSha, issues) {
  // Create a review with inline comments
  const reviewComments = issues
    .filter(issue => issue.file && issue.line)
    .map(issue => ({
      path: issue.file,
      line: issue.line,
      side: 'RIGHT',
      body: `**${issue.severity}: ${issue.title}**\n\n${issue.description}\n\n${
        issue.suggestion ? `**Fix:** ${issue.suggestion}` : ''
      }`,
    }));

  if (reviewComments.length === 0) return;

  await octokit.rest.pulls.createReview({
    owner,
    repo,
    pull_number: prNumber,
    commit_id: commitSha,
    event: 'COMMENT', // 'APPROVE' or 'REQUEST_CHANGES' for stronger feedback
    comments: reviewComments,
  });
}
```
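One caveat with `createReview`: GitHub rejects the entire review if any comment targets a line that isn't part of the diff, and model-reported line numbers are only approximate. A defensive filter helps. The sketch below (helper names are mine, not GitHub's) extracts the valid new-file line numbers from each hunk header of a unified diff and drops issues that fall outside them:

```javascript
// Collect the new-file line numbers covered by each hunk in a unified diff.
// GitHub only accepts RIGHT-side inline comments on these lines.
function validRightLines(diff) {
  const valid = new Set();
  const hunkHeader = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/gm;
  let match;
  while ((match = hunkHeader.exec(diff)) !== null) {
    const start = parseInt(match[1], 10);
    const count = match[2] === undefined ? 1 : parseInt(match[2], 10);
    for (let line = start; line < start + count; line++) valid.add(line);
  }
  return valid;
}

// Keep only issues whose reported line actually appears in that file's diff
function filterCommentableIssues(issues, fileDiffs /* Map<path, diff> */) {
  return issues.filter(issue => {
    const diff = fileDiffs.get(issue.file);
    return diff !== undefined && validRightLines(diff).has(issue.line);
  });
}
```

Run the model's issues through `filterCommentableIssues` before calling `postInlineComments`; anything filtered out can still go into the summary comment.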
## Step 7: Cost Controls

Claude API calls cost money. Here's how to keep costs predictable:

```javascript
// Estimate tokens before sending
function estimateTokens(text) {
  // Rough estimate: 1 token ≈ 4 characters
  return Math.ceil(text.length / 4);
}

// Illustrative per-1M-token rates; check Anthropic's pricing page
// for the current numbers for your chosen model
const COST_PER_1M_INPUT = 15.00;
const COST_PER_1M_OUTPUT = 75.00;

function estimateCost(inputTokens, outputTokens = 2000) {
  const inputCost = (inputTokens / 1_000_000) * COST_PER_1M_INPUT;
  const outputCost = (outputTokens / 1_000_000) * COST_PER_1M_OUTPUT;
  return (inputCost + outputCost).toFixed(4);
}

// Skip the AI review if the diff is too large (likely auto-generated)
const MAX_REVIEW_COST_USD = 0.50;

function shouldReview(files) {
  const totalChars = files.reduce((sum, f) => sum + f.diff.length, 0);
  const estimatedTokens = Math.ceil(totalChars / 4);
  const estimatedCost = parseFloat(estimateCost(estimatedTokens));

  if (estimatedCost > MAX_REVIEW_COST_USD) {
    console.warn(`Estimated cost $${estimatedCost} exceeds limit $${MAX_REVIEW_COST_USD}`);
    console.warn('Skipping AI review; this PR will need a human-only first pass');
    return false;
  }

  console.log(`Estimated review cost: $${estimatedCost}`);
  return true;
}
```
## Advanced: Caching Reviews for Unchanged Files

If a PR is updated with only minor changes, avoid re-reviewing files that haven't changed:

```javascript
import crypto from 'crypto';

function hashDiff(diff) {
  return crypto.createHash('sha256').update(diff).digest('hex').slice(0, 16);
}

// An in-memory map is shown for simplicity; persist it via the GitHub
// Actions cache or an external store to survive across workflow runs
const reviewCache = new Map();

async function getCachedOrReview(file) {
  const cacheKey = hashDiff(file.diff);
  if (reviewCache.has(cacheKey)) {
    console.log(`Cache hit for ${file.path}`);
    return reviewCache.get(cacheKey);
  }
  const review = await reviewWithClaude(buildReviewPrompt([file]));
  reviewCache.set(cacheKey, review);
  return review;
}
```
## Putting It All Together: The Complete Workflow

Here's the full picture of what we've built:

```text
PR opened/updated
        ↓
GitHub Actions triggers
        ↓
Checkout code + install deps
        ↓
git diff baseSha...headSha
        ↓
Parse diff → filter files (skip binaries, lock files, generated)
        ↓
Prioritize files (security-sensitive first)
        ↓
Batch if too large → send to Claude API
        ↓
Parse JSON response
        ↓
Format markdown comment
        ↓
Post/update PR comment via GitHub API
        ↓
Fail check if CRITICAL blockers found
```
## Real-World Prompt That Catches Real Bugs

After months of refinement, this is the prompt structure that catches the most issues:

```javascript
const SYSTEM_PROMPT = `You are a senior software engineer with 15 years of experience.
You have deep expertise in security, performance, and system design.
You are thorough but not pedantic — you focus on issues that actually matter.
You always provide actionable feedback with concrete examples.
You respond only with valid JSON, never markdown prose outside the JSON.`;

// buildFileSection assembles the same per-file markdown used in buildReviewPrompt
const diffPrompt = (files) => `
Review the following code changes for a production system serving millions of users.

Critical areas to check:
- Any change to authentication or authorization logic
- Database queries (especially with user-supplied data)
- File system operations
- External API calls and their error handling
- Cryptographic operations
- Race conditions in async code

${buildFileSection(files)}

Remember: A false negative (missing a real bug) is far worse than a false positive.
When in doubt, flag it.
`;
```
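Since the whole pipeline depends on Claude returning JSON in the expected shape, it's worth validating the parsed object before formatting and posting it; malformed output then produces one clear error instead of a garbled comment. A minimal hand-rolled check (a sketch; you could also use a schema library such as Ajv):

```javascript
const SEVERITIES = new Set(['CRITICAL', 'HIGH', 'MEDIUM', 'LOW']);

// Return a list of problems with a parsed review object; empty means valid
function validateReview(review) {
  const errors = [];
  if (typeof review?.summary !== 'string') errors.push('summary must be a string');
  if (!Number.isInteger(review?.score) || review.score < 1 || review.score > 10) {
    errors.push('score must be an integer 1-10');
  }
  if (!Array.isArray(review?.issues)) {
    errors.push('issues must be an array');
  } else {
    review.issues.forEach((issue, i) => {
      if (typeof issue.file !== 'string') errors.push(`issues[${i}].file must be a string`);
      if (!SEVERITIES.has(issue.severity)) errors.push(`issues[${i}].severity is invalid`);
    });
  }
  if (typeof review?.blockers !== 'boolean') errors.push('blockers must be a boolean');
  return errors;
}
```

Calling `validateReview` right after `JSON.parse` in `reviewWithClaude`, and throwing if the error list is non-empty, is a natural place to hook this in.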
## Conclusion

You now have a fully functional AI code review bot that:

- Automatically reviews every pull request, typically within a couple of minutes
- Catches security vulnerabilities, logic bugs, and performance issues
- Posts structured, actionable feedback directly in GitHub
- Handles large PRs through intelligent batching
- Controls costs with token estimation and caching

The key to making this bot useful rather than noisy is prompt engineering. Start with broad analysis, measure which issue types are most actionable for your team, and iteratively refine your prompts.
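That measurement loop can start very simply: the artifacts the bot already uploads contain every issue it raised, so a short script can tally categories across saved reviews and show where the bot is noisiest. A sketch over the review shape defined earlier (loading the `/tmp/ai-review-*.json` files is left to you):

```javascript
// Count issues per category across a set of saved review objects,
// e.g. parsed from the uploaded ai-review-*.json artifacts
function tallyCategories(reviews) {
  const counts = {};
  for (const review of reviews) {
    for (const issue of review.issues ?? []) {
      counts[issue.category] = (counts[issue.category] ?? 0) + 1;
    }
  }
  return counts;
}
```

Categories your team routinely dismisses are candidates to drop from the prompt; categories that keep surfacing real bugs deserve more emphasis.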
The complete code is available as a GitHub template repository — fork it and have reviews running in minutes.
Wilson Xu is a software engineer specializing in developer tooling and AI integrations. Find him on GitHub at @chengyixu.