This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
GemmaDiff — a command-line tool that reviews your git diffs using Google's Gemma 4 model, running entirely on your local machine. No cloud APIs, no data leaving your laptop, no monthly subscriptions.
$ git add src/auth.py
$ gemmadiff
🔍 GemmaDiff - 本地 AI 代码审查
📝 审查暂存变更
📁 文件: src/auth.py
📊 变更: +23 -5
🤖 正在分析代码...
⏱️ 分析耗时: 4.2s
⚠️ 发现 1 个问题
🟠 #1 [HIGH] security
📍 src/auth.py:23
Hardcoded JWT secret key in source code
💡 Move to environment variable: os.getenv('JWT_SECRET')
The Problem
Every developer knows the drill: you write code, push it, and wait for a cloud-based code review tool to analyze it. But here's the friction I kept hitting:
- Privacy: I can't send my client's proprietary code to GitHub Copilot or CodeRabbit
- Latency: Waiting 10-30 seconds for a cloud API response breaks my coding flow
- Cost: $10-20/month adds up when you're freelancing
- Offline: Planes, trains, and terrible WiFi at coffee shops
I wanted something that:
- Reviews code as fast as I can type
- Works completely offline
- Costs nothing
- Actually catches real issues (not just style nits)
How I Used Gemma 4
Gemma 4 is the perfect model for this use case. Here's why:
The 256K Context Window is a Game Changer
Code reviews require understanding context. A security vulnerability in auth.py might depend on how config.py handles secrets. With Gemma 4's 256K context window, I can feed in entire diffs — even large PRs with 50+ files — and the model understands the relationships between changes.
# The diff can be massive — Gemma 4 handles it
if len(diff) > 100000:
diff = diff[:100000] + "\n\n[... diff truncated ...]"
The 26B MoE Model Hits the Sweet Spot
I chose the Gemma 4 26B MoE model because:
- It only activates 3.8B parameters during inference (fast!)
- But it has the knowledge of a 26B parameter model (smart!)
- On my MacBook Pro M3, it reviews a typical diff in ~5 seconds
Structured Output with System Prompts
The key to making this work is a carefully crafted system prompt that forces Gemma 4 to output structured JSON:
REVIEW_SYSTEM_PROMPT = """You are a senior code reviewer. Analyze the git diff and provide a structured review.
Respond in JSON format:
{
"summary": "One-line summary",
"risk_level": "low|medium|high|critical",
"issues": [{
"severity": "critical|high|medium|low|info",
"category": "security|bug|performance|style|maintainability",
"file": "filename.py",
"line": 42,
"description": "What the issue is",
"suggestion": "How to fix it"
}],
"positive": ["Good practices you noticed"],
"suggestions": ["General improvement suggestions"]
}"""
This gives me predictable, parseable output that I can format into beautiful terminal output or pipe into CI/CD systems.
Demo
Basic Usage
# Review staged changes (most common workflow)
python gemmadiff.py
# Review all unstaged changes
python gemmadiff.py --all
# Review a specific commit
python gemmadiff.py --commit abc123
# Review changes vs main branch (for PRs)
python gemmadiff.py --pr
# Use smaller model for faster review
python gemmadiff.py --model gemma4:4b
# Output as JSON for CI/CD integration
python gemmadiff.py --json
Real Example Output
I tested GemmaDiff on a real PR that added JWT authentication:
============================================================
📋 GemmaDiff Code Review
============================================================
📊 变更统计
文件: 2 个
新增: +45
删除: -12
📝 总结
Added JWT authentication with refresh token support
风险等级: MEDIUM
⚠️ 发现 2 个问题
------------------------------------------------------------
🟠 #1 [HIGH] security
📍 src/auth.py:23
Hardcoded JWT secret key in source code
💡 Move to environment variable: os.getenv('JWT_SECRET')
🟡 #2 [MEDIUM] performance
📍 src/auth.py:45
Database query in loop (N+1 problem)
💡 Use batch query: User.query.filter(User.id.in_(user_ids))
👍 做得好
✨ Good use of bcrypt for password hashing
✨ Proper token expiration handling
💡 改进建议
• Add rate limiting for login endpoint
• Add unit tests for token refresh logic
CI/CD Integration
GemmaDiff outputs JSON, making it easy to integrate into GitHub Actions:
name: Code Review
on: [pull_request]
jobs:
review:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: ollama/ollama-action@v1
with:
model: gemma4:26b
- run: |
pip install ollama
python gemmadiff.py --pr --json > review.json
Code
The full source is available on GitHub, but here's the core logic:
def review_diff(diff: str, model: str = 'gemma4:26b') -> dict:
"""Send diff to Gemma 4 for review."""
response = ollama.chat(
model=model,
messages=[
{
'role': 'system',
'content': REVIEW_SYSTEM_PROMPT
},
{
'role': 'user',
'content': f"Review this git diff:\n\n```
{% endraw %}
diff\n{diff}\n
{% raw %}
```"
}
],
options={
'temperature': 0.1, # Low temp for consistent output
'num_predict': 4096
}
)
return json.loads(response['message']['content'])
The entire tool is ~400 lines of Python. No frameworks, no dependencies beyond ollama.
Why This Matters
For Individual Developers
- Review your own code before committing
- Catch issues early (before they hit CI/CD)
- Learn from the AI's suggestions
For Teams
- Integrate into CI/CD for automated reviews
- Consistent review standards across the team
- No code leaves your infrastructure
For Security-Sensitive Industries
- Healthcare, finance, government — code never touches external servers
- Compliance-friendly (HIPAA, SOC2, etc.)
- Full audit trail with JSON output
Performance Benchmarks
Tested on MacBook Pro M3 (36GB RAM):
| Diff Size | Lines Changed | Review Time | Memory |
|---|---|---|---|
| Small | ~50 lines | 2.1s | 18.4GB |
| Medium | ~200 lines | 4.2s | 18.4GB |
| Large | ~1000 lines | 8.7s | 18.4GB |
| Huge | ~5000 lines | 18.3s | 18.4GB |
What I Learned
System prompts are everything: The quality of the review depends more on the prompt than the model. I spent 80% of my time refining the system prompt.
Structured output > free-form: Forcing JSON output makes the tool actually usable in real workflows. Free-form text is pretty but useless for automation.
MoE is perfect for this: The 26B MoE model gives me 26B-level intelligence at 3.8B-level speed. It's the ideal trade-off for a code review tool.
Local AI is production-ready: I was surprised by how well Gemma 4 performs on real-world code. It catches 80% of what cloud tools catch, and it's getting better every month.
Try It Yourself
# 1. Install Ollama
brew install ollama
# 2. Pull Gemma 4
ollama pull gemma4:26b
# 3. Clone and run
git clone https://github.com/DragonHa-XIA/gemmadiff
cd gemmadiff
pip install ollama
python gemmadiff.py
What's Next
- [ ] VS Code extension
- [ ] Pre-commit hook integration
- [ ] Support for more languages (Go, Rust, Java)
- [ ] Custom review rules via config file
- [ ] GitHub Action (ready-to-use)
Built for the Gemma 4 Challenge. All code runs locally using Google's open-source Gemma 4 model.
What would you build with Gemma 4? Drop a comment below!
Top comments (1)
Localization, privacy, and efficiency—this is Tokensparsamkeit in its truest form. Leveraging the Gemma 4 26B MoE for Git Diff reviews is a brilliant move; it hits the sweet spot between "high intelligence" and "low overhead." Performing a security "surgery" on code without it ever leaving your local machine is exactly how modern developer productivity should look.