This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
DiffWhisperer is a professional-grade CLI tool that transforms cryptic git diff outputs into high-level architectural narratives using Gemma 4 31B.
Every developer knows the pain of staring at a massive pull request with hundreds of changed lines, trying to figure out the broader impact. DiffWhisperer bridges the gap between "what changed" and "why it matters" — acting as a virtual Senior Architect on your team.
Here's what it feels like to use:
# Standard narration
python main.py narrate
# Deep 3-stage chain-of-thought analysis
python main.py narrate --deep
# Persona-based review
python main.py narrate --persona senior
python main.py narrate --persona mentor
python main.py narrate --persona pirate
# Check what gets redacted before any API call
python main.py narrate --dry-run
# Interactive chat session about your diff
python main.py chat --persona senior
Key Features:
🕵️ Pre-Flight Privacy Shield — A local regex-based scanner detects and redacts API keys, secrets, internal IPs, and PII before any data ever leaves your machine. Includes a custom Interval Merging Algorithm to handle overlapping patterns without index corruption. Run --dry-run to inspect redactions before making any AI call.
🧠 Multi-Stage Reasoning Pipeline — Instead of one prompt, DiffWhisperer uses a 3-stage chain-of-thought process:
- Technical Extraction — Summarizes the core logic shifts
- Security Audit — Self-critiques for risks and blind spots
- Persona Synthesis — Combines findings into a tailored narrative
💬 Interactive Git-Chat REPL — After the narration, drop into a stateful chat session and ask follow-up questions about your diff. Ask for unit tests, refactoring suggestions, or plain-English explanations — all in your terminal.
🎭 Persona-Based Reviews — Switch between Senior Architect, Mentor, or Pirate mode depending on your audience.
🛡️ Zero-Crash Philosophy — Universal exponential backoff (5 retries), dual-model fallback (31B → 26B MoE), Pydantic validation, bulletproof JSON parsing, and Windows UTF-8 fix — built to never crash in a real workflow.
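The interval merging mentioned under the Privacy Shield can be sketched roughly as follows. This is a minimal illustration, not DiffWhisperer's actual code: the three regex patterns, the `[REDACTED]` placeholder, and the function names `find_spans`, `merge_spans`, and `redact` are all assumptions made for the example. The idea is to collect every match span first, merge the overlapping ones, and only then rebuild the text, so no replacement ever shifts the indices of another match.

```python
import re

# Hypothetical patterns for illustration only; the tool's real rules may differ.
PATTERNS = {
    "SECRET_ASSIGNMENT": re.compile(r"(?i)\b(?:api_key|secret|token|password)\s*=\s*\S+"),
    "GOOGLE_STYLE_KEY": re.compile(r"AIza[0-9A-Za-z_-]{35}"),
    "PRIVATE_IP": re.compile(
        r"\b(?:10\.\d{1,3}\.\d{1,3}\.\d{1,3}|192\.168\.\d{1,3}\.\d{1,3})\b"
    ),
}


def find_spans(text):
    """Collect (start, end) spans for every pattern match, overlaps included."""
    spans = []
    for pattern in PATTERNS.values():
        spans.extend(match.span() for match in pattern.finditer(text))
    return spans


def merge_spans(spans):
    """Merge overlapping or touching spans into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous span: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def redact(text, placeholder="[REDACTED]"):
    """Rebuild the text left to right, replacing each merged span once."""
    pieces, cursor = [], 0
    for start, end in merge_spans(find_spans(text)):
        pieces.append(text[cursor:start])
        pieces.append(placeholder)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

In this sketch a line such as `API_KEY=AIza...` matches two patterns at once, but the merged span is replaced a single time, which is what keeps the output free of half-redacted fragments.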
Demo
🎬 Watch the Full Demo on YouTube
Code
🔗 GitHub Repository: github.com/Neo-0013/diff-whisperer
# 1. Clone the repo
git clone https://github.com/Neo-0013/diff-whisperer.git
cd diff-whisperer
# 2. Set up virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure your API key
cp .env.example .env
# Open .env and add: GEMMA_API_KEY=your_key_here
# 5. Run the one-command demo
python test.py
For judges: Just run python test.py. It automatically runs the full test suite, simulates a diff with a mock API key, demonstrates the Privacy Shield dry-run, and runs a live AI narration end-to-end. Cleans up after itself completely. No setup headaches.
Get your free API key at Google AI Studio — no credit card required.
How I Used Gemma 4
I chose Gemma 4 31B Dense as the primary model after evaluating the entire Gemma 4 family:
E2B / E4B (Small): Well suited to edge and mobile deployments, but code review requires multi-step reasoning across large, multi-file diffs that regularly exceed 15,000 tokens. The small models struggle with cascading logic.
26B MoE (Mixture-of-Experts): Highly efficient with great throughput, so I use it as my automatic fallback model. For the primary reasoning task, understanding architectural intent across a full PR, the dense architecture gives more reliable deep reasoning.
31B Dense ✅: The sweet spot for DiffWhisperer. The 128K context window lets me pass an entire pull request in a single call without chunking, and the instruction-tuned model handles my 3-stage chain-of-thought pipeline reliably. Because the model is dense, every parameter is active for every token instead of a routed subset of experts, which matters when reasoning about cascading dependencies across files.
Real example from development: During testing, Gemma 4 31B identified a binary file misnamed with a .py extension committed alongside source code — flagging it as a critical "blind merge" risk. Smaller models missed it entirely. That's the reasoning density only the 31B delivers.
The Multi-Stage Pipeline specifically exploits Gemma 4's strengths:
- Stage 1 — Technical extraction across the full 128K context
- Stage 2 — Self-critique security and architectural audit
- Stage 3 — Persona-tailored narrative synthesis
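The three stages above can be sketched as a chain of prompts in which each stage consumes the previous stage's output. This is only an illustration under stated assumptions: the prompt wording, the `PERSONAS` descriptions, and the `narrate_deep` function are hypothetical, and `generate` stands in for whatever function actually sends a prompt to the model and returns its text.

```python
from typing import Callable

# Hypothetical persona descriptions; the real prompts are not shown in this post.
PERSONAS = {
    "senior": "a senior software architect writing for experienced engineers",
    "mentor": "a patient mentor explaining the change to a junior developer",
    "pirate": "a pirate captain narrating the change in pirate slang",
}


def narrate_deep(diff: str, persona: str, generate: Callable[[str], str]) -> str:
    """Run a three-stage chain: extraction, self-critique, persona synthesis."""
    # Stage 1: technical extraction over the whole (already redacted) diff.
    extraction = generate(
        "Summarize the core logic changes in this diff:\n\n" + diff
    )

    # Stage 2: the model audits its own summary against the diff.
    audit = generate(
        "Review the summary below for security risks and blind spots.\n\n"
        f"Summary:\n{extraction}\n\nDiff:\n{diff}"
    )

    # Stage 3: combine both findings in the requested persona's voice.
    return generate(
        f"You are {PERSONAS[persona]}. Combine the summary and the audit "
        "into one narrative.\n\n"
        f"Summary:\n{extraction}\n\nAudit:\n{audit}"
    )
```

Passing `generate` in as a parameter keeps the chain testable without a network call: a stub function can stand in for the model while the prompt flow is checked.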
DiffWhisperer also implements a Dual-Model Fallback System — if the 31B is overloaded after 5 retries, it automatically downgrades to the 26B MoE model. You always get your code story, no matter what.
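A retry-then-fallback loop of this kind might look like the sketch below. It assumes a caller-supplied `call_model(model, prompt)` function; the model identifiers, the delay values, and the name `call_with_fallback` are placeholders for illustration, not the project's real configuration.

```python
import random
import time

# Placeholder identifiers, not real API model names.
DEFAULT_MODELS = ("primary-31b-dense", "fallback-26b-moe")


def call_with_fallback(call_model, prompt, models=DEFAULT_MODELS,
                       retries=5, base_delay=1.0, sleep=time.sleep):
    """Try each model in order, retrying with exponential backoff and jitter."""
    last_error = None
    for model in models:
        for attempt in range(retries):
            try:
                return call_model(model, prompt)
            except Exception as exc:
                # Broad catch for brevity; real code should retry only
                # transient errors such as rate limits or timeouts.
                last_error = exc
                if attempt < retries - 1:
                    # Delay doubles each attempt: 1s, 2s, 4s, 8s plus jitter.
                    sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
    raise RuntimeError("all models failed") from last_error
```

The `sleep` parameter exists so the backoff can be replaced with a no-op in tests. If every model exhausts its retries, the sketch raises instead of returning an empty result, leaving it to the CLI layer to present the failure cleanly.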
Built with ❤️ for the Google Gemma 4 Challenge on DEV.to
Stop reading dry diffs. Start reading stories.