Surgically Cleaning Code: A Gemma 4 AI Tool for Repairing Corrupted Snippets

#devchallenge #gemmachallenge #gemma #javascript

This is a submission for the Gemma 4 Challenge: Build with Gemma 4.

What I Built

I built Code Cleaner, a professional desktop utility designed to solve a common developer pain point: the "dirty" code we get when copying snippets from terminals, PDF guides, or legacy documentation.

While many tools can strip line numbers using simple patterns, they often fail when the code is actually corrupted, such as when a terminal window wraps a single HTML tag or a keyword across two lines.

Code Cleaner solves this with a hybrid sanitization pipeline. It first uses deterministic regular expressions for fast line-number removal, then uses Gemma 4 to "heal" the structural corruption so that the result is syntactically valid and ready to run.
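The deterministic first stage could look roughly like the sketch below. The function name `stripLineNumbers` and the exact pattern are assumptions made for illustration, not code taken from the repository. The sketch only strips when every non-blank line starts with a number, which is one simple way to avoid eating a legitimate leading digit.

```javascript
// Hypothetical sketch of the deterministic stage; not taken from the project.
// Matches optional indentation, a run of digits, an optional separator
// (".", ":", ")" or "|"), and at most one following space or tab.
const LINE_NUMBER = /^\s*\d+[.:)|]?[ \t]?/;

function stripLineNumbers(text) {
  const lines = text.split(/\r?\n/);
  const nonBlank = lines.filter((line) => line.trim() !== "");

  // Only strip when every non-blank line starts with a number; otherwise a
  // leading digit is more likely real code than a copied line number.
  const allNumbered =
    nonBlank.length > 0 && nonBlank.every((line) => /^\s*\d+/.test(line));
  if (!allNumbered) return text;

  return lines.map((line) => line.replace(LINE_NUMBER, "")).join("\n");
}
```

A pattern like this cannot tell a line number from a numeric literal that happens to start every line, which is the kind of ambiguity the model-based stage is meant to handle.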

Key highlights include:

Visual Analysis Layer: A transparent view that highlights exactly what the AI detected as corruption.
Interactive Diff View: A professional side-by-side comparison allowing developers to verify every single character change before applying it.
Developer-Centric UI: A high-contrast "Cobalt" theme with full Light/Dark mode support, built with Electron.
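To illustrate the idea behind a diff view, here is a minimal line-level diff based on a longest-common-subsequence table. It is a sketch under assumptions: `diffLines` and the `same`/`removed`/`added` labels are hypothetical names, the project may well use a different algorithm or a diff library, and a character-level view would need a finer-grained comparison than this.

```javascript
// Hypothetical line-level diff via a longest-common-subsequence (LCS) table.
function diffLines(before, after) {
  const a = before.split("\n");
  const b = after.split("\n");

  // lcs[i][j] = length of the LCS of a[i..] and b[j..].
  const lcs = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both inputs, emitting one operation per line.
  const ops = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ type: "same", text: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ type: "removed", text: a[i++] });
    } else {
      ops.push({ type: "added", text: b[j++] });
    }
  }
  while (i < a.length) ops.push({ type: "removed", text: a[i++] });
  while (j < b.length) ops.push({ type: "added", text: b[j++] });
  return ops;
}
```

A side-by-side view could render `removed` entries in the left pane and `added` entries in the right, with `same` entries aligned across both.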

Demo

The demo shows the AI in action, transitioning from corrupted input to a healed output.

Code

The full source code is available on GitHub: https://github.com/odsantos/gemma4-code-cleaner

How I Used Gemma 4

For this project, I implemented an intentional, two-tier model strategy to balance precision and performance:

1. The Powerhouse: Gemma 4 31B (Dense)
The core of the "Repair" feature is powered by the cloud-hosted gemma-4-31b-it model. I chose the 31B Dense model because structural repair is a reasoning task, not a pattern-matching task: to fix a fragmented keyword or a split tag, the model must understand the context of the programming language. The 31B model's reasoning also lets Code Cleaner distinguish between a line number and a legitimate numeric constant, which reduces the risk of accidental data loss.

2. The Speedster: Gemma 4 26B (MoE)
To provide a flexible user experience, I included the gemma-4-26b-a4b-it model as an option for users who need high-throughput, lightweight cleaning for simpler snippets where deep reasoning is less critical.
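The two-tier choice can be pictured as a small lookup from a user-facing mode to a model ID. The model IDs below are the ones named above. The mode names `precise` and `fast` and the `pickModel` helper are hypothetical, chosen only for this sketch.

```javascript
// Model IDs come from the article; the mode names are hypothetical.
const MODELS = new Map([
  ["precise", "gemma-4-31b-it"], // dense model for structural repair
  ["fast", "gemma-4-26b-a4b-it"], // MoE model for simpler snippets
]);

function pickModel(mode = "precise") {
  // Fail loudly on an unknown mode rather than silently picking a default.
  if (!MODELS.has(mode)) {
    throw new Error(`Unknown mode: ${mode}`);
  }
  return MODELS.get(mode);
}
```

Defaulting to the dense model in this sketch reflects the article's framing of it as the core of the repair feature, with the MoE model as an opt-in alternative.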

The Hybrid Approach:
I deliberately avoided sending the "easy" work (line number stripping) to the AI. Instead, I built a pipeline where Regex handles the deterministic cleanup, and Gemma 4 is reserved for the probabilistic "surgical repair." This minimizes latency and maximizes the model's effectiveness on the most difficult parts of the task.
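As a rough sketch of that ordering, the pipeline below runs a regex pass first and only then hands the result to a repair step. Everything here is an assumption for illustration: `cleanSnippet` is a hypothetical name, and `repair` is an injected async function standing in for the Gemma 4 call, so no particular client API is implied.

```javascript
// Hypothetical two-stage pipeline. `repair` stands in for the Gemma 4 call
// and is injected, so no specific client API is assumed here.
async function cleanSnippet(raw, repair) {
  // Stage 1 (deterministic): drop a leading line number from each line.
  const stripped = raw
    .split("\n")
    .map((line) => line.replace(/^\s*\d+[.:)|]?[ \t]?/, ""))
    .join("\n");

  // Stage 2 (probabilistic): hand only the pre-cleaned text to the model.
  const repaired = await repair(stripped);

  // Fall back to the deterministic result if the model returns nothing usable.
  return typeof repaired === "string" && repaired.trim() !== ""
    ? repaired
    : stripped;
}
```

Injecting `repair` keeps the deterministic stage testable without network access, and the model only ever sees text that has already had its line numbers removed.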
