DEV Community

Dragon Ha

GemmaDiff: I Built a Local AI Code Reviewer with Gemma 4 That Never Sends Your Code to the Cloud

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

GemmaDiff is a command-line tool that reviews your git diffs using Google's Gemma 4 model, running entirely on your local machine. No cloud APIs, no data leaving your laptop, no monthly subscriptions.

$ git add src/auth.py
$ gemmadiff

🔍 GemmaDiff - Local AI Code Review

📝 Reviewing staged changes
📁 Files: src/auth.py
📊 Changes: +23 -5
🤖 Analyzing code...
⏱️  Analysis time: 4.2s

⚠️  Found 1 issue
🟠 #1 [HIGH] security
   📍 src/auth.py:23
   Hardcoded JWT secret key in source code
   💡 Move to environment variable: os.getenv('JWT_SECRET')

The Problem

Every developer knows the drill: you write code, push it, and wait for a cloud-based code review tool to analyze it. But here's the friction I kept hitting:

  1. Privacy: I can't send my client's proprietary code to GitHub Copilot or CodeRabbit
  2. Latency: Waiting 10-30 seconds for a cloud API response breaks my coding flow
  3. Cost: $10-20/month adds up when you're freelancing
  4. Offline: Planes, trains, and terrible WiFi at coffee shops

I wanted something that:

  • Reviews code as fast as I can type
  • Works completely offline
  • Costs nothing
  • Actually catches real issues (not just style nits)

How I Used Gemma 4

Gemma 4 fits this use case well. Here's why:

The 256K Context Window is a Game Changer

Code reviews require understanding context. A security vulnerability in auth.py might depend on how config.py handles secrets. With Gemma 4's 256K context window, I can feed in entire diffs, even large PRs with 50+ files, and the model understands the relationships between changes.

# Gemma 4 handles large diffs, but cap them at 100,000 characters to keep the prompt bounded
if len(diff) > 100000:
    diff = diff[:100000] + "\n\n[... diff truncated ...]"
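A plain character cut can slice a hunk in half. One possible refinement, sketched here on the assumption that the diff uses standard `diff --git` file headers, is to keep whole per-file sections until the budget runs out. The names `truncate_diff` and `max_chars` are hypothetical and not part of GemmaDiff.

```python
import re


def truncate_diff(diff: str, max_chars: int = 100_000) -> str:
    """Keep whole per-file sections of a git diff until max_chars is reached."""
    if len(diff) <= max_chars:
        return diff
    # Split before each "diff --git" header so every chunk is one file's changes.
    sections = re.split(r"(?m)^(?=diff --git )", diff)
    kept, used = [], 0
    for section in sections:
        if not section:
            continue
        if used + len(section) > max_chars:
            break
        kept.append(section)
        used += len(section)
    if not kept:
        # The first file alone exceeds the budget: fall back to a character cut.
        return diff[:max_chars] + "\n\n[... diff truncated ...]"
    return "".join(kept) + "\n[... diff truncated ...]"
```

The trade-off is that files after the cut are dropped entirely, so the model never sees a half-finished hunk but may miss later files.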

The 26B MoE Model Hits the Sweet Spot

I chose the Gemma 4 26B MoE model because:

  • It only activates 3.8B parameters during inference (fast!)
  • But it has the knowledge of a 26B parameter model (smart!)
  • On my MacBook Pro M3, it reviews a typical diff in ~5 seconds

Structured Output with System Prompts

The key to making this work is a carefully crafted system prompt that instructs Gemma 4 to respond with structured JSON:

REVIEW_SYSTEM_PROMPT = """You are a senior code reviewer. Analyze the git diff and provide a structured review.

Respond in JSON format:
{
  "summary": "One-line summary",
  "risk_level": "low|medium|high|critical",
  "issues": [{
    "severity": "critical|high|medium|low|info",
    "category": "security|bug|performance|style|maintainability",
    "file": "filename.py",
    "line": 42,
    "description": "What the issue is",
    "suggestion": "How to fix it"
  }],
  "positive": ["Good practices you noticed"],
  "suggestions": ["General improvement suggestions"]
}"""

This gives me predictable, parseable output that I can format into beautiful terminal output or pipe into CI/CD systems.
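A prompt alone cannot guarantee that every reply matches the schema, so it may help to normalize the parsed object before formatting it. The sketch below assumes the field names from the prompt above. `normalize_review` and its defaulting rules (unknown severity becomes "info", unknown risk level becomes "low") are illustrative choices, not GemmaDiff's actual behavior.

```python
SEVERITIES = ("critical", "high", "medium", "low", "info")
RISK_LEVELS = ("low", "medium", "high", "critical")


def normalize_review(raw: dict) -> dict:
    """Coerce a parsed model reply into the shape the prompt asks for."""
    risk = str(raw.get("risk_level", "")).lower()
    review = {
        "summary": str(raw.get("summary", "")),
        "risk_level": risk if risk in RISK_LEVELS else "low",  # assumed default
        "issues": [],
        "positive": [str(p) for p in raw.get("positive") or []],
        "suggestions": [str(s) for s in raw.get("suggestions") or []],
    }
    for item in raw.get("issues") or []:
        if not isinstance(item, dict):
            continue  # skip anything that is not an issue object
        severity = str(item.get("severity", "")).lower()
        line = item.get("line")
        review["issues"].append({
            "severity": severity if severity in SEVERITIES else "info",
            "category": str(item.get("category", "maintainability")),
            "file": str(item.get("file", "")),
            "line": line if isinstance(line, int) else None,
            "description": str(item.get("description", "")),
            "suggestion": str(item.get("suggestion", "")),
        })
    return review
```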

Demo

Basic Usage

# Review staged changes (most common workflow)
python gemmadiff.py

# Review all unstaged changes
python gemmadiff.py --all

# Review a specific commit
python gemmadiff.py --commit abc123

# Review changes vs main branch (for PRs)
python gemmadiff.py --pr

# Use smaller model for faster review
python gemmadiff.py --model gemma4:4b

# Output as JSON for CI/CD integration
python gemmadiff.py --json
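To make the flags concrete, here is a rough sketch of how they could map onto git commands. The flag names come from the usage above, but the git invocations (`git diff --cached`, `git show`, `git diff main...HEAD`) are my assumption about a reasonable mapping, and `parse_args` and `build_git_command` are hypothetical helpers rather than GemmaDiff's real code.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(prog="gemmadiff")
    parser.add_argument("--all", action="store_true", help="review unstaged changes")
    parser.add_argument("--commit", metavar="SHA", help="review a specific commit")
    parser.add_argument("--pr", action="store_true", help="review changes vs main")
    parser.add_argument("--model", default="gemma4:26b", help="Ollama model tag")
    parser.add_argument("--json", action="store_true", help="print raw JSON")
    return parser.parse_args(argv)


def build_git_command(args: argparse.Namespace) -> list[str]:
    """Pick the git command that produces the diff to review (assumed mapping)."""
    if args.commit:
        return ["git", "show", args.commit]
    if args.pr:
        return ["git", "diff", "main...HEAD"]
    if args.all:
        return ["git", "diff"]
    return ["git", "diff", "--cached"]  # default: staged changes
```

The resulting list can be passed to `subprocess.run(..., capture_output=True, text=True)` to obtain the diff text.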

Real Example Output

I tested GemmaDiff on a real PR that added JWT authentication:

============================================================
📋 GemmaDiff Code Review
============================================================

📊 Change statistics
   Files: 2
   Added: +45
   Deleted: -12

📝 Summary
   Added JWT authentication with refresh token support
   Risk level: MEDIUM

⚠️  Found 2 issues
------------------------------------------------------------

  🟠 #1 [HIGH] security
     📍 src/auth.py:23
     Hardcoded JWT secret key in source code
     💡 Move to environment variable: os.getenv('JWT_SECRET')

  🟡 #2 [MEDIUM] performance
     📍 src/auth.py:45
     Database query in loop (N+1 problem)
     💡 Use batch query: User.query.filter(User.id.in_(user_ids))

👍 Well done
   ✨ Good use of bcrypt for password hashing
   ✨ Proper token expiration handling

💡 Improvement suggestions
   • Add rate limiting for login endpoint
   • Add unit tests for token refresh logic
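Turning the parsed `issues` list into lines like the ones above could be sketched as follows. The orange and yellow markers match the sample output. The icons for the other severities and the `format_issues` helper itself are assumptions for illustration.

```python
# Icons for "high" and "medium" follow the sample output; the rest are assumed.
SEVERITY_ICONS = {
    "critical": "🔴",
    "high": "🟠",
    "medium": "🟡",
    "low": "🔵",
    "info": "⚪",
}


def format_issues(issues: list[dict]) -> str:
    """Render review issues as indented terminal lines."""
    lines = []
    for number, issue in enumerate(issues, start=1):
        severity = issue.get("severity", "info")
        icon = SEVERITY_ICONS.get(severity, "⚪")
        lines.append(f"  {icon} #{number} [{severity.upper()}] {issue.get('category', '')}")
        lines.append(f"     📍 {issue.get('file', '?')}:{issue.get('line', '?')}")
        lines.append(f"     {issue.get('description', '')}")
        lines.append(f"     💡 {issue.get('suggestion', '')}")
        lines.append("")  # blank line between issues
    return "\n".join(lines).rstrip()
```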

CI/CD Integration

With the --json flag, GemmaDiff outputs JSON, making it easy to integrate into GitHub Actions:

name: Code Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, so the diff against main is available
      - uses: ollama/ollama-action@v1
        with:
          model: gemma4:26b
      - run: |
          pip install ollama
          python gemmadiff.py --pr --json > review.json
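The workflow writes review.json but does not act on it. A follow-up step could fail the job when the reported risk is too high. This is a minimal sketch, assuming the `risk_level` field from the system prompt. The `should_fail` and `exit_code_for` helpers and the default "high" threshold are hypothetical.

```python
import json

RISK_ORDER = ["low", "medium", "high", "critical"]


def should_fail(review: dict, threshold: str = "high") -> bool:
    """Return True when the review's risk_level is at or above the threshold."""
    risk = str(review.get("risk_level", "low")).lower()
    if risk not in RISK_ORDER:
        return False  # unknown values do not block the build in this sketch
    return RISK_ORDER.index(risk) >= RISK_ORDER.index(threshold)


def exit_code_for(path: str = "review.json", threshold: str = "high") -> int:
    """Load a saved review and translate it into a process exit code."""
    with open(path, encoding="utf-8") as handle:
        return 1 if should_fail(json.load(handle), threshold) else 0
```

Passing the result of `exit_code_for()` to `sys.exit` in a small script run after the review step would make the job fail on high or critical risk.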

Code

The full source is available on GitHub, but here's the core logic:

import json

import ollama


def review_diff(diff: str, model: str = 'gemma4:26b') -> dict:
    """Send diff to Gemma 4 for review and return the parsed JSON reply."""

    response = ollama.chat(
        model=model,
        messages=[
            {
                'role': 'system',
                'content': REVIEW_SYSTEM_PROMPT
            },
            {
                'role': 'user',
                # The diff is wrapped in a fenced diff block inside the prompt
                'content': f"Review this git diff:\n\n```diff\n{diff}\n```"
            }
        ],
        options={
            'temperature': 0.1,  # Low temp for consistent output
            'num_predict': 4096  # Cap on generated tokens
        }
    )

    # Raises json.JSONDecodeError if the reply is not pure JSON
    return json.loads(response['message']['content'])
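Calling `json.loads` directly on the reply fails if the model wraps its JSON in a code fence or adds a sentence around it. A more forgiving parser might look like the sketch below. `parse_review` is a hypothetical helper, and the regex fallback is a simple heuristic rather than a full JSON extractor.

```python
import json
import re


def parse_review(text: str) -> dict:
    """Parse a model reply that may wrap its JSON in a code fence or extra prose."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span, which also drops any code fence.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    return json.loads(match.group(0))
```

The Ollama Python client's `chat` function also accepts a `format` argument, which may reduce how often this fallback is needed.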

The entire tool is ~400 lines of Python. No frameworks, no dependencies beyond ollama.

Why This Matters

For Individual Developers

  • Review your own code before committing
  • Catch issues early (before they hit CI/CD)
  • Learn from the AI's suggestions

For Teams

  • Integrate into CI/CD for automated reviews
  • Consistent review standards across the team
  • No code leaves your infrastructure

For Security-Sensitive Industries

  • Healthcare, finance, government: code never touches external servers
  • Compliance-friendly (HIPAA, SOC 2, etc.)
  • Full audit trail with JSON output

Performance Benchmarks

Tested on MacBook Pro M3 (36GB RAM):

Diff Size | Lines Changed | Review Time | Memory
Small     | ~50 lines     | 2.1s        | 18.4GB
Medium    | ~200 lines    | 4.2s        | 18.4GB
Large     | ~1000 lines   | 8.7s        | 18.4GB
Huge      | ~5000 lines   | 18.3s       | 18.4GB

What I Learned

  1. System prompts are everything: The quality of the review depends more on the prompt than the model. I spent 80% of my time refining the system prompt.

  2. Structured output > free-form: Forcing JSON output makes the tool actually usable in real workflows. Free-form text is pretty but useless for automation.

  3. MoE is perfect for this: The 26B MoE model gives me 26B-level intelligence at 3.8B-level speed. It's the ideal trade-off for a code review tool.

  4. Local AI is production-ready: I was surprised by how well Gemma 4 performs on real-world code. It catches 80% of what cloud tools catch, and it's getting better every month.

Try It Yourself

# 1. Install Ollama
brew install ollama

# 2. Pull Gemma 4
ollama pull gemma4:26b

# 3. Clone and run
git clone https://github.com/DragonHa-XIA/gemmadiff
cd gemmadiff
pip install ollama
python gemmadiff.py

What's Next

  • [ ] VS Code extension
  • [ ] Pre-commit hook integration
  • [ ] Support for more languages (Go, Rust, Java)
  • [ ] Custom review rules via config file
  • [ ] GitHub Action (ready-to-use)

Built for the Gemma 4 Challenge. All code runs locally using Google's open-source Gemma 4 model.

What would you build with Gemma 4? Drop a comment below!
