Every time I asked an AI to fix a bug, it rewrote my entire file.
I'd ask: "fix the empty list check on line 47."
The AI would return: 300 lines of "improved" code.
Half my carefully tuned logic — gone.
This happened to me one too many times. So I built something different.
## The Problem with AI Code Tools
Most AI coding assistants work like this:
- You send the whole file
- AI rewrites the whole file
- You diff 300 lines to find the 2 that changed
- Something else broke that wasn't broken before
Copilot, Cursor, and friends are great — but they all have this problem when you're working on existing code you care about.
## The Solution: Surgical SEARCH/REPLACE Patches
I built AI Code Sherlock around one core idea:
**The AI should only touch the exact block that needs changing.**
Every response from the AI looks like this:
```
[SEARCH_BLOCK]
result = data[0]["value"]
[REPLACE_BLOCK]
if not data:
    return None
result = data[0]["value"]
[END_PATCH]
```
The engine finds this exact string in your file, validates it appears exactly once, creates a backup, and replaces only that block. Nothing else is touched.
If the search block isn't found — or matches more than once — the patch is rejected. No silent corruption.
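The core of that validation fits in a few lines. Here is a minimal sketch (illustrative only, not the actual engine; the real one also handles backups and syntax checks):

```python
def apply_patch(source: str, search: str, replace: str) -> str:
    """Replace `search` with `replace` only if it occurs exactly once."""
    count = source.count(search)
    if count == 0:
        raise ValueError("patch rejected: search block not found")
    if count > 1:
        raise ValueError(f"patch rejected: search block matches {count} times")
    # Exactly one match: replace only that block, leave everything else alone
    return source.replace(search, replace, 1)
```

The uniqueness check is what makes the patch surgical: an ambiguous match is a rejection, never a guess.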
## The Auto-Improve Pipeline
This is where it gets interesting.
You configure:
- Goal: "achieve f1 > 0.85 on validation set"
- Script: train_model.py
- Strategy: Safe Ratchet
- Max iterations: 20
Then press Run and walk away.
The pipeline:
- Runs your script
- Reads stdout/stderr
- Extracts metrics
- Builds a prompt with context + history
- Gets a patch from the AI
- Validates syntax
- Applies the patch
- Re-runs the script
- Checks if metrics improved (Safe Ratchet mode)
- Repeats until goal is reached or iterations exhausted
Real terminal output looks like this:
```
[PIPELINE] Iteration 3/10 strategy=SAFE_RATCHET goal="f1 > 0.85"
[RUN] python train_model.py → exit 0 (14.3s)
precision=0.71 recall=0.68 f1=0.69
[AI] Analysing metrics... building patch...
[APPLY] ✓ Patch applied · syntax OK · backup created
[RUN] python train_model.py → exit 0 (13.8s)
f1=0.73 ↑+0.04
[RATCHET] metrics improved — continuing to iteration 4...
```
## 8 AI Strategies
Different problems need different approaches:
| Strategy | When to use |
|---|---|
| 🛡️ Conservative | Only fix explicit errors |
| ⚖️ Balanced | Fix + moderate improvements (default) |
| 🔥 Aggressive | Maximum changes, refactor logic |
| 🔒 Safe Ratchet | Apply only if metrics improve |
| 🧭 Explorer | Different approach every iteration |
| 🔬 Hypothesis | Form hypothesis → test → validate |
| 🎭 Ensemble | Generate 3 variants, pick best |
| 📈 Exploit | Double down on what worked |
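Safe Ratchet, for example, reduces to a single accept-or-rollback rule (a sketch; the real strategy logic considers more than one metric):

```python
def ratchet_accept(before: float, after: float, min_delta: float = 0.0) -> bool:
    """Keep a patch only if the tracked metric strictly improved."""
    return after > before + min_delta
```

If this returns `False`, the patch is discarded and the backup is restored, so a bad iteration can never make the script worse than the best version seen so far.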
## Works 100% Offline with Ollama
No API key required. Just run:
```bash
ollama serve
ollama pull deepseek-coder-v2
```
And point AI Code Sherlock at localhost. Your code never leaves your machine.
It also supports OpenAI, Gemini, Groq, Mistral — any OpenAI-compatible endpoint.
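Ollama exposes an OpenAI-compatible API at `http://localhost:11434/v1`, so a minimal client needs no SDK at all. A generic sketch (not Sherlock's actual client; the model name is whatever you pulled):

```python
import json
import urllib.request

def build_payload(prompt: str, model: str = "deepseek-coder-v2") -> dict:
    """Build a standard OpenAI-style chat completion payload."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt: str, base_url: str = "http://localhost:11434/v1") -> str:
    """POST to an OpenAI-compatible endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the payload shape is the same everywhere, swapping `base_url` (and an API key header) is all it takes to move between local and hosted providers.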
## Consensus Engine
Query multiple models at once and let them vote:
- Vote mode: patches agreed on by ≥N models win
- Best-of-N: pick response with most valid patches
- Merge: combine unique patches from all models
- Judge: one model evaluates all responses and picks the best
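Vote mode is the simplest of the four to picture (a sketch, assuming patches are compared as exact strings; the real engine may normalize them first):

```python
from collections import Counter

def vote(patches_per_model: list[list[str]], min_agree: int) -> list[str]:
    """Keep patches proposed by at least `min_agree` models (Vote mode)."""
    counts = Counter()
    for patches in patches_per_model:
        for patch in set(patches):  # each model votes at most once per patch
            counts[patch] += 1
    return [patch for patch, c in counts.items() if c >= min_agree]
```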
## The Error Map
Every error gets stored with its confirmed solution. When the same error appears again, the AI sees: "this exact problem was fixed before — here's what worked."
Avoid-patterns prevent the AI from repeating approaches that already failed.
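In essence this is a key-value store keyed by a normalized error signature. A minimal sketch (class and method names are illustrative, not the actual implementation):

```python
import hashlib

class ErrorMap:
    """Store confirmed fixes and failed approaches per error signature."""

    def __init__(self):
        self.fixes = {}   # signature -> solution that worked
        self.avoid = {}   # signature -> approaches that already failed

    @staticmethod
    def signature(error_text: str) -> str:
        # Collapse whitespace so the same error always hashes identically
        normalized = " ".join(error_text.split())
        return hashlib.sha256(normalized.encode()).hexdigest()[:16]

    def record_fix(self, error_text: str, solution: str):
        self.fixes[self.signature(error_text)] = solution

    def record_failure(self, error_text: str, approach: str):
        self.avoid.setdefault(self.signature(error_text), []).append(approach)

    def lookup(self, error_text: str):
        sig = self.signature(error_text)
        return self.fixes.get(sig), self.avoid.get(sig, [])
```

On each iteration, the known fix and the avoid-patterns for the current error can be injected straight into the prompt.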
## Key Technical Details
- Built with Python 3.11 + PyQt6
- Async subprocess runner with real-time stdout/stderr streaming
- AST-based context compression (120k tokens → 4k without losing signal)
- Unicode sanitizer strips zero-width spaces, BOM, smart quotes from AI responses
- Atomic settings save (corruption-proof)
- Version control: every patch backed up, full diff viewer, one-click restore
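The Unicode sanitizer matters more than it sounds: LLMs routinely emit smart quotes and zero-width characters that break exact-match patching. A minimal sketch of the idea (the character table here is a small subset, chosen for illustration):

```python
# Invisible characters that silently break exact string matching
ZERO_WIDTH = {"\u200b": "", "\u200c": "", "\u200d": "", "\ufeff": ""}
# Typographic quotes that are not valid Python string delimiters
SMART_QUOTES = {"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"}

def sanitize(text: str) -> str:
    """Strip zero-width characters / BOM and normalize smart quotes."""
    for bad, good in {**ZERO_WIDTH, **SMART_QUOTES}.items():
        text = text.replace(bad, good)
    return text
```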
## Try It
GitHub: https://github.com/signupss/ai-code-sherlock
Website: https://codesherlock.dev
```bash
git clone https://github.com/signupss/ai-code-sherlock.git
cd ai-code-sherlock
pip install -r requirements.txt
python main.py
```
Free and open source under MIT License.
Would love to hear what you think — especially if you've tried to build something similar or have ideas for the pipeline. What would make this useful for your workflow?