DEV Community

Nova

LLM Debugging Playbook: From Repro to Patch Without Guessing

Debugging with an LLM is like debugging with a junior dev who’s very fast and very confident.

If you ask “why is this failing?”, you’ll often get a plausible story… and waste 30 minutes.

If you instead give the model a tight process, it becomes a legitimately useful debugging partner.

Here’s the playbook I use: Repro → Hypothesis → Instrument → Minimal fix → Regression test.


Step 0: make a minimal repro (or admit you don’t have one)

LLMs are great at pattern matching and terrible at guessing your hidden context.

Before you paste anything, do this:

  • write the exact command to reproduce
  • include the exact error output
  • state expected vs actual behavior
  • mention environment (OS/runtime/version)

If you can’t reproduce consistently, say so. That’s information.


Step 1: force the model to ask questions first

In my experience, this alone cuts hallucinated fixes dramatically.

You are debugging with me like a senior engineer.
Before proposing fixes, ask up to 6 targeted questions that reduce uncertainty.
Then propose 2-3 hypotheses ranked by likelihood.
Output:
1) Questions
2) Hypotheses (with confidence %)

Answer the questions. Don’t skip.


Step 2: constrain the hypothesis space

Once you have hypotheses, you want the model to commit.

Given the hypotheses, choose the single most likely one.
Explain the reasoning briefly.
Then propose the minimal experiment that would confirm/deny it.
Output as:
- Hypothesis
- Experiment
- Expected observations if true/false

This prevents “try 12 random things”.


Step 3: instrument (logs/tests) instead of guessing

Most bugs die the moment you add the right logging.

Prompt:

Propose instrumentation changes only (no functional changes).
Goal: confirm the hypothesis.
Constraints:
- Keep changes under 30 lines.
- Add logs/metrics in the narrowest location.
Output a unified diff.

Run it. Paste the output back.
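To make "narrowest location" concrete, here's a sketch of what an instrumentation-only change might look like. The pagination helper below is hypothetical (not from any real codebase); the `console.log` line is the only change you'd ship:

```typescript
// Hypothetical pagination helper — assume the hypothesis is that the
// wrong start index is computed from the cursor.
type User = { id: number; createdAt: string };

function nextPage(users: User[], cursor: number, limit: number): User[] {
  const start = users.findIndex((u) => u.id > cursor);
  // Instrumentation only: log the inputs and the resolved start index,
  // exactly where the hypothesis says the bug lives. No behavior change.
  console.log(`nextPage cursor=${cursor} start=${start} total=${users.length}`);
  return start === -1 ? [] : users.slice(start, start + limit);
}
```

One log line at the decision point beats ten scattered ones: the output either confirms the hypothesis or kills it.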


Step 4: request a minimal patch

Now you’re allowed to fix.

Now write the minimal fix.
Rules:
- Smallest diff that addresses the root cause
- Include error handling
- No refactors
- Output unified diff

If the model tries to redesign your whole module, you didn’t constrain it enough.
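For a sense of scale, a minimal patch under these rules might look like the following. The bug and the helper are hypothetical (a cursor arriving as a string from query params), but the shape is the point: one coercion, one guard, no refactor:

```typescript
// Hypothetical root cause: the cursor arrives as a string from the
// query params, so a number-vs-string comparison silently misbehaved.
type User = { id: number; createdAt: string };

function nextPage(users: User[], rawCursor: string, limit: number): User[] {
  // Minimal fix: coerce the cursor once, and fail loudly on garbage input.
  const cursor = Number(rawCursor);
  if (Number.isNaN(cursor)) {
    throw new Error(`invalid cursor: ${rawCursor}`);
  }
  const start = users.findIndex((u) => u.id > cursor);
  return start === -1 ? [] : users.slice(start, start + limit);
}
```

Note what's absent: no renamed variables, no restructured module, no "while we're here" cleanup. That's what "smallest diff" buys you in review time.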


Step 5: lock it with a regression test

This is where LLMs shine, because tests are structured.

Write a regression test that fails before the fix and passes after.
Rules:
- Use our existing test framework
- Make the test name describe the bug
- Keep it deterministic
- Output code only

Then ask for one more thing: edge-case tests.


A concrete example (template)

Here’s the “debug ticket” I paste when I’m stuck:

Repro:
1) npm test -- users
2) Fails with: <paste>
Expected: <…>
Actual: <…>

Environment:
- Node 20.11
- PostgreSQL 15
- macOS 14

Recent change:
- We switched cursor pagination from createdAt to id

Relevant code:
<paste the one function + call site>

You’ll notice it’s boring. That’s the point.


Two anti-patterns to avoid

1) “Fix my code” with no repro

That becomes creative writing.

2) Pasting the whole repo

The model's analysis gets shallow. Paste the smallest slice that can still explain the bug.


If you want more “real world” templates like this (code review, spec-writing, prompt chaining, test generation), I’m building a Prompt Engineering Cheatsheet at Nova Press.

Free sample: https://getnovapress.gumroad.com
