This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
GemmaDiff is a command-line tool that reviews your git diffs using Google's Gemma 4 model, running entirely on your local machine. No cloud APIs, no data leaving your laptop, no monthly subscriptions.
$ git add src/auth.py
$ gemmadiff
🔍 GemmaDiff - Local AI Code Review
📋 Reviewing staged changes
📄 File: src/auth.py
📊 Changes: +23 -5
🤖 Analyzing code...
⏱️ Analysis time: 4.2s
⚠️ Found 1 issue
🔴 #1 [HIGH] security
📍 src/auth.py:23
Hardcoded JWT secret key in source code
💡 Move to environment variable: os.getenv('JWT_SECRET')
The Problem
Every developer knows the drill: you write code, push it, and wait for a cloud-based code review tool to analyze it. But here's the friction I kept hitting:
- Privacy: I can't send my client's proprietary code to GitHub Copilot or CodeRabbit
- Latency: Waiting 10-30 seconds for a cloud API response breaks my coding flow
- Cost: $10-20/month adds up when you're freelancing
- Offline: Planes, trains, and terrible WiFi at coffee shops
I wanted something that:
- Reviews code as fast as I can type
- Works completely offline
- Costs nothing
- Actually catches real issues (not just style nits)
How I Used Gemma 4
Gemma 4 is the perfect model for this use case. Here's why:
The 256K Context Window is a Game Changer
Code reviews require understanding context. A security vulnerability in auth.py might depend on how config.py handles secrets. With Gemma 4's 256K context window, I can feed in entire diffs (even large PRs with 50+ files) and the model understands the relationships between changes.
# The diff can be massive - Gemma 4 handles it
if len(diff) > 100000:
    diff = diff[:100000] + "\n\n[... diff truncated ...]"
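The hard-coded character cap above can also be expressed as a rough token budget. This is a sketch, not GemmaDiff's actual code: `fit_to_context` is a name I made up, and the ~4 characters-per-token ratio is a common heuristic for source code, not a Gemma-specific constant.

```python
# Token-aware truncation sketch (hypothetical helper, not part of GemmaDiff).
# Assumes roughly 4 characters per token, a common heuristic for code.
MAX_TOKENS = 250_000
CHARS_PER_TOKEN = 4

def fit_to_context(diff: str, max_tokens: int = MAX_TOKENS) -> str:
    """Truncate a diff that would overflow the model's context window."""
    budget = max_tokens * CHARS_PER_TOKEN
    if len(diff) <= budget:
        return diff
    # Keep the head of the diff and mark the cut so the model knows.
    return diff[:budget] + "\n\n[... diff truncated ...]"
```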
The 26B MoE Model Hits the Sweet Spot
I chose the Gemma 4 26B MoE model because:
- It only activates 3.8B parameters during inference (fast!)
- But it has the knowledge of a 26B parameter model (smart!)
- On my MacBook Pro M3, it reviews a typical diff in ~5 seconds
Structured Output with System Prompts
The key to making this work is a carefully crafted system prompt that forces Gemma 4 to output structured JSON:
REVIEW_SYSTEM_PROMPT = """You are a senior code reviewer. Analyze the git diff and provide a structured review.

Respond in JSON format:
{
  "summary": "One-line summary",
  "risk_level": "low|medium|high|critical",
  "issues": [{
    "severity": "critical|high|medium|low|info",
    "category": "security|bug|performance|style|maintainability",
    "file": "filename.py",
    "line": 42,
    "description": "What the issue is",
    "suggestion": "How to fix it"
  }],
  "positive": ["Good practices you noticed"],
  "suggestions": ["General improvement suggestions"]
}"""
This gives me predictable, parseable output that I can format into beautiful terminal output or pipe into CI/CD systems.
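In practice, local models occasionally wrap the JSON in a markdown fence or add stray text around it. A small tolerant parser keeps the pipeline from crashing on those replies; this is a sketch, and `parse_review` is my name for it, not part of GemmaDiff.

```python
import json
import re

def parse_review(raw: str) -> dict:
    """Parse a model reply into a review dict (sketch).

    Tries strict JSON first, then falls back to extracting the
    outermost {...} span, which handles replies wrapped in a
    ```json fence or surrounded by commentary.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if not match:
            raise ValueError("no JSON object found in model output")
        return json.loads(match.group(0))
```

The field names in the resulting dict follow the schema from the system prompt above.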
Demo
Basic Usage
# Review staged changes (most common workflow)
python gemmadiff.py
# Review all unstaged changes
python gemmadiff.py --all
# Review a specific commit
python gemmadiff.py --commit abc123
# Review changes vs main branch (for PRs)
python gemmadiff.py --pr
# Use smaller model for faster review
python gemmadiff.py --model gemma4:4b
# Output as JSON for CI/CD integration
python gemmadiff.py --json
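The flag surface above maps naturally onto argparse. Here's a minimal sketch of what the parser might look like; the flag names match the documented usage, but `build_parser` and the help strings are illustrative, not GemmaDiff's actual code.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser (sketch of the documented flags)."""
    parser = argparse.ArgumentParser(
        prog="gemmadiff",
        description="Local AI code review for git diffs",
    )
    parser.add_argument("--all", action="store_true",
                        help="review all unstaged changes")
    parser.add_argument("--commit", metavar="SHA",
                        help="review a specific commit")
    parser.add_argument("--pr", action="store_true",
                        help="review changes vs the main branch")
    parser.add_argument("--model", default="gemma4:26b",
                        help="Ollama model tag to use")
    parser.add_argument("--json", action="store_true",
                        help="emit raw JSON for CI/CD integration")
    return parser
```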
Real Example Output
I tested GemmaDiff on a real PR that added JWT authentication:
============================================================
🔍 GemmaDiff Code Review
============================================================
📊 Change statistics
Files: 2
Additions: +45
Deletions: -12
📝 Summary
Added JWT authentication with refresh token support
Risk level: MEDIUM
⚠️ Found 2 issues
------------------------------------------------------------
🔴 #1 [HIGH] security
📍 src/auth.py:23
Hardcoded JWT secret key in source code
💡 Move to environment variable: os.getenv('JWT_SECRET')
🟡 #2 [MEDIUM] performance
📍 src/auth.py:45
Database query in loop (N+1 problem)
💡 Use batch query: User.query.filter(User.id.in_(user_ids))
👍 What went well
✨ Good use of bcrypt for password hashing
✨ Proper token expiration handling
💡 Improvement suggestions
• Add rate limiting for login endpoint
• Add unit tests for token refresh logic
CI/CD Integration
GemmaDiff outputs JSON, making it easy to integrate into GitHub Actions:
name: Code Review
on: [pull_request]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ollama/ollama-action@v1
        with:
          model: gemma4:26b
      - run: |
          pip install ollama
          python gemmadiff.py --pr --json > review.json
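Once review.json exists, a follow-up step can gate the merge on the reported risk. This is a hypothetical gate script, not part of GemmaDiff; it assumes the JSON follows the schema from the system prompt.

```python
# gate.py - sketch of a CI gate on GemmaDiff's JSON output.
# Assumes review.json matches the REVIEW_SYSTEM_PROMPT schema;
# the file name and script name are illustrative.
import json

def gate(path: str = "review.json") -> int:
    """Return a process exit code: 1 blocks the merge, 0 passes."""
    with open(path) as fh:
        review = json.load(fh)
    risk = review.get("risk_level", "low")
    if risk in ("high", "critical"):
        print(f"Blocking merge: review risk level is {risk}")
        return 1
    return 0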
Code
The full source is available on GitHub, but here's the core logic:
import json
import ollama

def review_diff(diff: str, model: str = 'gemma4:26b') -> dict:
    """Send diff to Gemma 4 for review."""
    response = ollama.chat(
        model=model,
        messages=[
            {'role': 'system', 'content': REVIEW_SYSTEM_PROMPT},
            {'role': 'user', 'content': f"Review this git diff:\n\n```diff\n{diff}\n```"}
        ],
        options={
            'temperature': 0.1,  # low temperature for consistent output
            'num_predict': 4096
        }
    )
    return json.loads(response['message']['content'])
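The post doesn't show the git plumbing, but collecting the staged diff is a single subprocess call. A sketch, assuming git is on PATH; `staged_diff` is my name for the helper, not necessarily GemmaDiff's.

```python
import subprocess

def staged_diff() -> str:
    """Return the staged diff, as printed by `git diff --cached` (sketch)."""
    result = subprocess.run(
        ["git", "diff", "--cached"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

The `--all`, `--commit`, and `--pr` modes would swap in `git diff`, `git show <sha>`, and `git diff main...HEAD` respectively.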
The entire tool is ~400 lines of Python. No frameworks, no dependencies beyond ollama.
Why This Matters
For Individual Developers
- Review your own code before committing
- Catch issues early (before they hit CI/CD)
- Learn from the AI's suggestions
For Teams
- Integrate into CI/CD for automated reviews
- Consistent review standards across the team
- No code leaves your infrastructure
For Security-Sensitive Industries
- Healthcare, finance, government โ code never touches external servers
- Compliance-friendly (HIPAA, SOC2, etc.)
- Full audit trail with JSON output
Performance Benchmarks
Tested on MacBook Pro M3 (36GB RAM):
| Diff Size | Lines Changed | Review Time | Memory |
|---|---|---|---|
| Small | ~50 lines | 2.1s | 18.4GB |
| Medium | ~200 lines | 4.2s | 18.4GB |
| Large | ~1000 lines | 8.7s | 18.4GB |
| Huge | ~5000 lines | 18.3s | 18.4GB |
What I Learned
System prompts are everything: The quality of the review depends more on the prompt than the model. I spent 80% of my time refining the system prompt.
Structured output > free-form: Forcing JSON output makes the tool actually usable in real workflows. Free-form text is pretty but useless for automation.
MoE is perfect for this: The 26B MoE model gives me 26B-level intelligence at 3.8B-level speed. It's the ideal trade-off for a code review tool.
Local AI is production-ready: I was surprised by how well Gemma 4 performs on real-world code. It catches 80% of what cloud tools catch, and it's getting better every month.
Try It Yourself
# 1. Install Ollama
brew install ollama
# 2. Pull Gemma 4
ollama pull gemma4:26b
# 3. Clone and run
git clone https://github.com/DragonHa-XIA/gemmadiff
cd gemmadiff
pip install ollama
python gemmadiff.py
What's Next
- [ ] VS Code extension
- [ ] Pre-commit hook integration
- [ ] Support for more languages (Go, Rust, Java)
- [ ] Custom review rules via config file
- [ ] GitHub Action (ready-to-use)
Built for the Gemma 4 Challenge. All code runs locally using Google's open-source Gemma 4 model.
What would you build with Gemma 4? Drop a comment below!