Kamaumbugua-dev
I Built an AI That Finds Your Bugs and Rewrites Your Code to Fix Them.

How I built CodeLens — a Groq-powered code review tool that detects SQL injection, memory leaks, and O(n²) algorithms, then rewrites your entire file with all issues resolved. Full breakdown of the architecture, prompt engineering tricks, and the LLM hallucination problem I had to solve.

Every developer has shipped a bug they should have caught.

Not because they were careless. Because code review is expensive. You're scanning hundreds of lines for subtle patterns: a missing conn.close(), an f-string wired directly into a SQL query, a nested loop that looks innocent at n = 10 but detonates at n = 10,000.

I wanted to build a tool that never gets tired, never misses a pattern, and can tell you exactly what will go wrong in production — before you push.

That's CodeLens.


What It Does

Paste any code. In seconds you get:

  • A health score (0–100) with an animated gauge
  • Every vulnerability categorized by severity: CRITICAL, WARNING, INFO
  • Exact line numbers, descriptions, fix suggestions, and predicted production impact
  • A "Rework Code" button that rewrites your entire file with every issue resolved, with inline # FIX: comments explaining each change

Here's what it catches on a simple Python file:

CRITICAL  SQL Injection            L7     f-string in cursor.execute()
CRITICAL  Hardcoded Credentials    L27    password = "admin123"
CRITICAL  Unsafe eval()            L29    eval(open("config.txt").read())
CRITICAL  Plaintext Card Numbers   L15    print(f"...card {card_number}")
WARNING   Resource Leak            L16    file handle never closed
WARNING   Resource Leak            L42    db connection never closed
WARNING   O(n²) Complexity         L46    nested loop over same list
WARNING   Unbounded Cache          L38    dict with no eviction policy
INFO      Division by Zero Risk    L50    len(transactions) unchecked

Health score: 28 / 100.

One click later, the LLM rewrites the file. Every issue fixed. Every change commented.


The Stack

Deliberately lean:

React 19 (Vercel)  →  FastAPI (Render)  →  Groq API (llama-3.3-70b)

No database. No auth. No queue. Every request is stateless — code goes in, analysis comes out.

The frontend is a three-panel layout:

  1. Code editor — line numbers highlight affected lines in red
  2. Analysis dashboard — health gauge, metric bars, issue list with severity filters
  3. Vulnerability slides — right panel with CSS scroll-snap, one full-height card per vulnerability

The backend has three endpoints worth talking about: /analyze, /fix, and /github/analyze.


The Hard Part: Getting the LLM to Return Valid JSON Every Time

The analysis response needs to be machine-parseable. Every time. Across any language, any code quality, any edge case.

This is harder than it sounds. By default, models wrap JSON in markdown fences, add explanatory preamble, or truncate responses mid-object when they hit a token limit. Any of these breaks the frontend.

My system prompt ends with:

Return ONLY valid JSON. No markdown, no code fences, no explanation outside the JSON.

And I strip artifacts post-response with:

import json
import re

raw_text = re.sub(r"^```(?:json)?\s*", "", raw_text)
raw_text = re.sub(r"\s*```$", "", raw_text)
analysis = json.loads(raw_text)

This handles 99% of cases. The remaining 1% raises a json.JSONDecodeError that returns a structured 500 to the client.
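Put together, the strip-then-parse step can live in one helper. A minimal sketch — the function name parse_llm_json is mine, not the actual CodeLens code:

```python
import json
import re


def parse_llm_json(raw_text: str) -> dict:
    """Strip the markdown fences models sometimes add, then parse.

    Hypothetical helper mirroring the cleanup above; the remaining
    ~1% of bad responses still raise json.JSONDecodeError, which the
    endpoint converts into a structured 500.
    """
    cleaned = raw_text.strip()
    cleaned = re.sub(r"^```(?:json)?\s*", "", cleaned)
    cleaned = re.sub(r"\s*```$", "", cleaned)
    return json.loads(cleaned)
```

A fence-wrapped response and bare JSON both come back as the same parsed dict, so the frontend never sees the difference.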


The Line Number Hallucination Problem

This was the most interesting bug I fixed.

Early versions of CodeLens would confidently report issues on lines that didn't exist. A 50-line file would get issues flagged at lines 73, 91, 108. The model was pattern-matching against training data — it recognized the type of bug and estimated a line number based on where it typically appears in codebases it had seen, not in the code you gave it.

The fix is obvious in hindsight: give the model line numbers to reference.

Instead of sending:

import sqlite3

def get_user(username):
    query = f"SELECT * FROM users WHERE username = '{username}'"

I send:

1 | import sqlite3
2 |
3 | def get_user(username):
4 |     query = f"SELECT * FROM users WHERE username = '{username}'"

And I add an explicit constraint to the prompt:

The code has 50 lines total. You MUST only reference line numbers
that actually exist (1 to 50).

The implementation:

def add_line_numbers(code: str) -> str:
    lines = code.splitlines()
    width = len(str(len(lines)))
    return "\n".join(
        f"{str(i+1).rjust(width)} | {line}"
        for i, line in enumerate(lines)
    )

Hallucinated line numbers dropped to near zero. The model now has a concrete anchor instead of a floating reference.
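A cheap belt-and-suspenders check on top of the prompt constraint is to validate the model's output against the real file length before it reaches the UI. A sketch — the "line" field name and the drop-invalid policy are my assumptions, not necessarily what CodeLens ships:

```python
def clamp_issue_lines(issues: list[dict], total_lines: int) -> list[dict]:
    # Discard any issue whose reported line falls outside the file,
    # so a hallucinated "line 73" in a 50-line file never renders.
    return [
        issue for issue in issues
        if 1 <= issue.get("line", 0) <= total_lines
    ]
```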


The Rework Pipeline

The "Rework Code" feature is a second LLM call chained to the first.

After analysis, the frontend sends the original code + the full issue list to /fix:

from typing import Any, List

from pydantic import BaseModel


class FixRequest(BaseModel):
    code: str
    language: str
    issues: List[Any]

The fix prompt encodes every issue as a line-referenced instruction:

Fix ALL of the following issues in this python code:

ISSUES TO FIX:
  - [Line 7] [CRITICAL] SQL Injection: Use parameterized queries
  - [Line 27] [CRITICAL] Hardcoded Credentials: Use os.environ.get(...)
  - [Line 29] [CRITICAL] Unsafe eval(): Use json.load() instead
  ...

ORIGINAL CODE:
{code}

Return the complete fixed code with inline FIX comments.
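That encoding step can be sketched as a small prompt builder. The issue field names ("line", "severity", "title", "fix") are my assumptions, not CodeLens's exact schema:

```python
def build_fix_prompt(code: str, language: str, issues: list[dict]) -> str:
    # Hypothetical builder mirroring the prompt above: each issue
    # becomes one line-referenced instruction the model must act on.
    issue_lines = "\n".join(
        f'  - [Line {i["line"]}] [{i["severity"]}] {i["title"]}: {i["fix"]}'
        for i in issues
    )
    return (
        f"Fix ALL of the following issues in this {language} code:\n\n"
        f"ISSUES TO FIX:\n{issue_lines}\n\n"
        f"ORIGINAL CODE:\n{code}\n\n"
        "Return the complete fixed code with inline FIX comments."
    )
```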

The system prompt is strict:

Return RAW CODE ONLY. No markdown fences, no explanation, no preamble.
Add inline comments prefixed with # FIX: explaining each change.

The result gets placed back into the editor. The user sees their fixed file immediately.


The CORS Bug That Burned Two Hours

Deploying to Vercel + Render exposed something I'd glossed over: the combination of allow_origins=["*"] and allow_credentials=True is invalid per the CORS specification.

Browsers enforce this at the preflight stage. Your OPTIONS request returns 200, but the browser rejects the response because the spec says wildcard origins cannot coexist with credentials. You get a cryptic console error and a silent failure in the UI.

The fix is one line:

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=False,  # must be False with wildcard origin
    allow_methods=["*"],
    allow_headers=["*"],
)

Worth knowing before you spend two hours debugging network tab preflight responses.


The Vulnerability Slides

The right panel uses CSS scroll snapping. The container sets:

scroll-snap-type: y mandatory;

and each vulnerability gets its own full-height card:

<div style={{ height: "100%", scrollSnapAlign: "start" }}>
  <VulnSlide issue={issue} />
</div>

There's a dot navigation sidebar that syncs with the scroll position:

onScroll={(e) => {
  const idx = Math.round(
    e.target.scrollTop / e.target.clientHeight
  );
  setActiveSlide(idx);
}}

Rounding (not flooring) prevents the active dot from flickering during the snap animation — the snap always settles on an integer, but scrollTop passes through fractional values mid-animation.
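The arithmetic is easy to check. Assuming an illustrative card height of 600px (the real value is whatever the panel height is), a mid-animation scrollTop of 595 should already light up dot 1, but flooring snaps it back to 0:

```python
import math

card_height = 600  # illustrative; the actual card height is the panel height

# scrollTop passes through fractional slide indices during the snap animation
for scroll_top in (595, 600, 605):
    frac = scroll_top / card_height
    print(scroll_top, "floor:", math.floor(frac), "round:", round(frac))
```

floor reports 0, 1, 1 across those three values while round reports 1, 1, 1 — rounding settles on the destination slide early instead of flickering through the previous one.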

Each slide has a "SLIDE" button in the issue list that calls:

slidesRef.current.scrollTo({
  top: idx * slidesRef.current.clientHeight,
  behavior: "smooth"
});

Bi-directional sync between the list and the slides, no state management library needed.


Deployment Notes

A few things that bit me:

Render cold starts. The free tier sleeps services after 15 minutes of inactivity. The first request after sleep takes 30–50 seconds. I added a loading state with an explanation so users wait instead of leaving.

Vite bakes env vars at build time. VITE_API_BASE is injected into the bundle when Vercel builds — not at runtime. Old preview deployment URLs serve old bundles permanently. The production domain always reflects the latest build. If your frontend is still hitting the wrong backend, you're on an old preview URL.

Railway port mismatch. I originally deployed on Railway. The dashboard had the networking port set to 8000, but the $PORT environment variable was 8080. Internal healthchecks passed (Railway probed the container directly), but external traffic failed at the edge with persistent 502s. Moved to Render, problem gone.
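If you stay on a platform that injects $PORT, the safe pattern is to read the variable instead of hardcoding a port. A sketch — the helper name is my own invention:

```python
import os
from typing import Mapping


def resolve_port(env: Mapping[str, str], default: int = 8000) -> int:
    """Bind to the platform-injected $PORT, falling back for local dev.

    Hypothetical helper: hardcoding 8000 while the platform routes
    traffic to 8080 is exactly the edge-502 failure described above.
    """
    return int(env.get("PORT", default))


# e.g. uvicorn.run("main:app", host="0.0.0.0", port=resolve_port(os.environ))
```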


Try It

Live: codelens-new.vercel.app

Source: github.com/Kamaumbugua-dev/CODELENS

Paste the worst code you can find. The demo loads a Python file with SQL injection, hardcoded secrets, unsafe eval(), and an O(n²) algorithm. Hit Analyze, then Rework. The whole thing takes about 10 seconds on a warm backend.


Built by Steven K. — Head of AXON LATTICE LABS™

CodeLens™ — See your code's future before it ships.
