<h2>Why AI Code Review Matters</h2>
<p>Manual code review is a bottleneck. Senior developers spend <strong>20-30% of their time</strong> reviewing pull requests, yet code-inspection studies suggest human reviewers miss roughly half of the defects in code under review. AI-assisted code review doesn't replace humans — it augments them by catching mechanical issues so reviewers can focus on architecture and design decisions.</p>
<p>The key insight: the quality of AI code review is <strong>entirely determined by the prompt</strong>. A generic "review this code" instruction produces generic, surface-level feedback. A well-engineered prompt produces specific, actionable, priority-ranked issues that match your team's standards.</p>
<h2>The Three-Layer Review Architecture</h2>
<p>Production AI code review should operate in three distinct layers, each with a specialised prompt:</p>
<ol>
<li><strong>Security Layer</strong> — Scans for vulnerabilities: injection attacks, auth bypasses, data exposure, insecure dependencies</li>
<li><strong>Quality Layer</strong> — Evaluates code quality: logic errors, edge cases, error handling, type safety, test coverage</li>
<li><strong>Style Layer</strong> — Enforces consistency: naming conventions, documentation, architectural patterns, team standards</li>
</ol>
<p>Running these as separate prompts is more effective than a single "review everything" prompt because each layer has different evaluation criteria and severity scales.</p>
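<p>The three-layer split can be sketched as one request per layer, each carrying its own system prompt. A minimal sketch (the <code>LAYER_PROMPTS</code> dict and <code>build_review_requests</code> helper are illustrative, not a fixed API):</p>

```python
# Three-layer review: one specialised request per layer,
# so each layer keeps its own criteria and severity scale.
LAYER_PROMPTS = {
    "security": "You are a senior application security engineer...",
    "quality": "You are a principal software engineer reviewing for production readiness...",
    "style": "You are a code-style reviewer enforcing team conventions...",
}

def build_review_requests(code: str, language: str) -> list[dict]:
    """Build one chat request per review layer; findings are merged downstream."""
    return [
        {
            "layer": layer,
            "system": system_prompt,
            "user": f"Language: {language}\n\nCode to review:\n{code}",
        }
        for layer, system_prompt in LAYER_PROMPTS.items()
    ]
```

<p>The three requests can then be dispatched concurrently and their findings merged into a single severity-ranked report.</p>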
<h2>Security Review Prompt</h2>
<pre><code>System: You are a senior application security engineer performing a security-focused code review.
Context
- Language: {language}
- Framework: {framework}
- This code handles: {description}
Security Checklist
Evaluate the code against these categories:
- INJECTION: SQL injection, XSS, command injection, LDAP injection, template injection
- AUTHENTICATION: Broken auth flows, session management, credential handling
- AUTHORISATION: Missing access controls, IDOR, privilege escalation
- DATA EXPOSURE: Sensitive data in logs, hardcoded secrets, PII leakage
- CRYPTOGRAPHY: Weak algorithms, improper key management, predictable tokens
- INPUT VALIDATION: Missing sanitisation, type coercion, boundary checks
- DEPENDENCIES: Known CVEs, outdated packages, supply chain risks
Output Format
For each finding:
- SEVERITY: CRITICAL | HIGH | MEDIUM | LOW
- CWE: The relevant CWE identifier
- LOCATION: File and line number
- DESCRIPTION: What the vulnerability is
- EXPLOIT: How an attacker could exploit it
- FIX: The specific code change needed
If no security issues are found, state "No security issues identified" and explain which security measures are correctly implemented.</code></pre>
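<p>Because the requested output is line-structured, findings can be parsed mechanically before they reach the pull request. A minimal parser sketch (field names follow the template above; the resulting dict schema is otherwise an assumption):</p>

```python
def parse_findings(review_text: str) -> list[dict]:
    """Split 'KEY: value' lines into one dict per finding.

    A new finding starts at each SEVERITY line, matching the
    output format requested by the security prompt.
    """
    findings, current = [], None
    for line in review_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "SEVERITY":
            current = {"severity": value}
            findings.append(current)
        elif current is not None and key in {"CWE", "LOCATION", "DESCRIPTION", "EXPLOIT", "FIX"}:
            current[key.lower()] = value
    return findings
```

<p>Parsed findings can then be filtered by severity threshold before being posted as comments.</p>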
<h2>Quality Review Prompt</h2>
<pre><code>System: You are a principal software engineer reviewing code for production readiness.
Review Criteria
- CORRECTNESS: Logic errors, off-by-one errors, race conditions, null handling
- EDGE CASES: Empty inputs, boundary values, concurrent access, network failures
- ERROR HANDLING: Uncaught exceptions, error propagation, user-facing error messages
- PERFORMANCE: N+1 queries, unnecessary re-renders, memory leaks, algorithmic complexity
- TESTABILITY: Tight coupling, hidden dependencies, untestable side effects
- MAINTAINABILITY: Complex conditionals, deep nesting, duplicate logic, magic numbers
Constraints
- Focus on substantive issues, not nitpicks
- Every issue must include a concrete fix
- Rate each issue: MUST_FIX | SHOULD_FIX | CONSIDER
- If the code is well-written, say so and explain what makes it good
Output
Provide your review as a structured list, ordered by severity.</code></pre>
<h2>Integrating AI Review into CI/CD</h2>
<p>The most effective pattern integrates AI review directly into your pull request workflow. Here's a production architecture:</p>
<pre><code># .github/workflows/ai-review.yml
name: AI Code Review
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Get changed files
        id: changed
        run: |
          echo "files=$(git diff --name-only origin/main...HEAD | tr '\n' ' ')" >> "$GITHUB_OUTPUT"
      - name: Run AI Security Review
        run: |
          for file in ${{ steps.changed.outputs.files }}; do
            # Send each file to your AI review API; jq builds valid JSON
            # (interpolating file contents into a double-quoted string would break quoting)
            jq -n --arg file "$(cat "$file")" '{file: $file, layer: "security"}' |
              curl -X POST https://your-api/review \
                -H "Authorization: Bearer ${{ secrets.AI_API_KEY }}" \
                -H "Content-Type: application/json" \
                -d @-
          done</code></pre>
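<p>The review step above only calls the API; to close the loop, findings are usually posted back to the pull request as review comments. A sketch of the payload builder (the finding dict schema is an assumption; the payload fields match GitHub's pull-request review-comment API, with <code>commit_id</code> added at send time):</p>

```python
def build_pr_comment(finding: dict) -> dict:
    """Map one AI finding onto a GitHub review-comment payload.

    commit_id must be added before sending; the finding schema
    here is an illustrative assumption, not a fixed contract.
    """
    return {
        "path": finding["file"],
        "line": finding["line"],
        "side": "RIGHT",  # comment on the new version of the file
        "body": (
            f"**{finding['severity']}**: {finding['description']}\n\n"
            f"Suggested fix: {finding['fix']}"
        ),
    }
```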
<h2>Handling False Positives</h2>
<p>AI code reviewers produce false positives. Managing them is critical for developer trust:</p>
<ul>
<li><strong>Calibrate severity thresholds</strong> — Start with CRITICAL and HIGH only; add lower severities once trust is established</li>
<li><strong>Provide context</strong> — Include the project's tech stack, coding standards, and known patterns in the prompt</li>
<li><strong>Use suppress comments</strong> — Allow developers to mark false positives with <code>// ai-review-ignore: reason</code></li>
<li><strong>Track accuracy</strong> — Log accept/reject rates per issue category and use this data to refine your prompts</li>
<li><strong>Feedback loop</strong> — Feed dismissed issues back into the prompt as "do not flag" examples</li>
</ul>
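<p>The suppress-comment convention above can be enforced with a small post-filter: drop any finding whose flagged line carries an ignore marker, and log the reason for the accuracy-tracking loop. A minimal sketch:</p>

```python
import re

# Matches the team convention: // ai-review-ignore: reason
IGNORE_RE = re.compile(r"//\s*ai-review-ignore:\s*(?P<reason>.+)")

def filter_suppressed(findings: list[dict], source_lines: list[str]) -> list[dict]:
    """Drop findings whose flagged line is marked '// ai-review-ignore: reason'."""
    kept = []
    for finding in findings:
        idx = finding["line"] - 1  # findings use 1-based line numbers
        line = source_lines[idx] if 0 <= idx < len(source_lines) else ""
        if IGNORE_RE.search(line):
            continue  # suppressed; the captured reason could be logged here
        kept.append(finding)
    return kept
```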
<h2>Diff-Based vs Full-File Review</h2>
<p>A common mistake is sending entire files for review. For pull requests, <strong>diff-based review is superior</strong>:</p>
<ul>
<li><strong>Token efficiency</strong> — You pay for input tokens. Sending only the diff can reduce costs by 80%+</li>
<li><strong>Focused feedback</strong> — The model focuses on what changed rather than re-reviewing existing code</li>
<li><strong>Context window</strong> — Large files may exceed the model's context window</li>
</ul>
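<p>The bullet points above argue for reviewing each changed file separately. A minimal sketch that splits standard unified-diff output (e.g. from <code>git diff</code>) into per-file chunks, so each chunk can become its own review request (<code>split_diff_by_file</code> is an illustrative helper):</p>

```python
def split_diff_by_file(diff_text: str) -> dict[str, str]:
    """Split unified-diff output into per-file chunks, keyed by new-file path.

    Deleted files (+++ /dev/null) are skipped in this sketch.
    """
    chunks: dict[str, str] = {}
    path, body = None, []
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            if path is not None:
                chunks[path] = "\n".join(body)  # flush previous file
            path, body = None, [line]
        else:
            body.append(line)
            if line.startswith("+++ b/"):
                path = line[len("+++ b/"):]
    if path is not None:
        chunks[path] = "\n".join(body)
    return chunks
```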
<p>However, include surrounding context (10-20 lines above and below each change) so the model understands the code's environment. The optimal format:</p>
<pre><code>## Changed File: src/auth/login.ts
Change Type: Modified
Context (lines 45-85, changed lines marked with +/-)

  async function handleLogin(req: Request) {
    const { email, password } = req.body;
-   const user = await db.query('SELECT * FROM users WHERE email = ' + email);
+   const user = await db.query('SELECT * FROM users WHERE email = $1', [email]);
    if (!user) {
      return res.status(401).json({ error: 'Invalid credentials' });
    }</code></pre>
<h2>Multi-Model Review Strategy</h2>
<p>Different models have different strengths for code review:</p>
<table>
<tr><th>Model</th><th>Best For</th><th>Weakness</th></tr>
<tr><td>GPT-4</td><td>Security analysis, complex logic</td><td>Can be verbose; higher cost</td></tr>
<tr><td>Claude 3.5 Sonnet</td><td>Code quality, refactoring suggestions</td><td>May over-suggest abstractions</td></tr>
<tr><td>Gemini Pro</td><td>Documentation review, API consistency</td><td>Less reliable on security edge cases</td></tr>
</table>
<p>A production system can route different review layers to different models, optimising for both quality and cost.</p>
<h2>How AI Prompt Architect Helps</h2>
<p>AI Prompt Architect provides pre-built <strong>code review prompt templates</strong> that are battle-tested across hundreds of repositories. Use the <strong>Generate</strong> workflow with "code review" as your task to get a structured review prompt tailored to your stack. The <strong>Refine</strong> workflow can then customise it with your team's specific coding standards and common pitfalls.</p>