Ken Imoto

I Converted 10 Debugging Techniques into AI Prompts — Here's the Template

I asked AI to fix a bug. It confidently returned a modified file. I ran it. A different bug appeared.

Sound familiar? It's like asking a confident stranger for directions in an unfamiliar city. The intent is genuine. The accuracy is a separate question.

The Stack Overflow Developer Survey (2025) found that 66% of developers say AI-generated code is "almost right, but not quite," and 45% report that debugging AI-generated code takes more time. AI excels at producing plausible code. It does not excel at asking "under what conditions will this code break?"

So what if we gave AI the thinking patterns that human debuggers use? That's what this article does: 10 debugging techniques, compressed into 5 prompt blocks you can copy into CLAUDE.md or any agent skill definition.

Why AI Jumps to "Plausible Fixes"

Tell AI "the API returns a 500 error." Most of the time, it adds a try-catch or null check. Sometimes the symptom disappears. But if the real cause was connection pool exhaustion, that try-catch just hid the problem. Hours later, the same failure resurfaces elsewhere.
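
Here is a runnable sketch of that failure mode. The toy pool and handlers are hypothetical, purely to make the anti-pattern concrete:

```python
import queue

# Toy connection pool: 2 connections, and the handler never returns them.
pool = queue.Queue(maxsize=2)
for _ in range(2):
    pool.put(object())

def handle_request():
    conn = pool.get(timeout=0.1)  # raises queue.Empty once the pool is drained
    return "200 OK"               # real bug: conn is never put back

def handle_request_bandaged():
    try:
        return handle_request()
    except queue.Empty:
        return "200 OK (fallback)"  # the 500 disappears; the leak does not

for i in range(4):
    print(i, handle_request_bandaged())  # succeeds twice, then silently degrades
```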

LLMs predict the most likely next token. "Error handling patterns" exist abundantly in training data. So pattern-matching a fix is easier than investigating a root cause. The human debugger's judgment — "I don't know the cause yet; keep investigating" — doesn't happen unless you explicitly instruct it.

10 Techniques in 5 Prompt Blocks

```
Block 1          Block 2        Block 3         Block 4         Block 5
Question    →    Boundary   →   Timeline    →   Observe     →   Stop
assumptions      & diff         & control       & simplify      signal
```

Block 1: Question Assumptions + Reproduce (Techniques 1-2)

```
Before attempting any fix:
- Are logs complete? Could there be gaps?
- Is monitoring data trustworthy?
- Does the health check verify "working correctly" or just "responding"?

Reproduce the bug first. Show minimal reproduction steps.
If you cannot reproduce it, report that fact.
Do not fix based on guesses.
```

"Do not fix based on guesses" is the quiet MVP. Without it, AI skips reproduction and jumps to "probably this is the cause."

Block 2: Boundary & Diff (Techniques 3-4)

```
Identify the boundary where the problem occurs:
- Which component is still working correctly?
- Where does the behavior diverge?
- Check git log for recent changes
```

"Which component is still working correctly?" forces the AI into a binary search instead of trying to analyze everything at once.

Block 3: Timeline & Control Logic (Techniques 5-7)

```
Organize by timeline:
- When did this problem start?
- Sudden change, or gradual degradation?
- Check retry, cache, and timeout configurations
- Is there a path where small errors get amplified?
```

"Sudden or gradual?" is a classification filter. Sudden = event-triggered. Gradual = resource exhaustion. That one question cuts the investigation scope in half.

Block 4: Observe & Simplify (Techniques 8-10)

```
If observation points are insufficient, propose adding logs or traces.
If removing components can simplify the problem, show the steps.
Consider intentionally breaking something to test a hypothesis.
```
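
When observation points are missing, the cheapest addition is usually a structured log line at the suspected boundary. A sketch with a hypothetical `transfer` function:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("debug")

def transfer(account_id: str, amount: int) -> None:
    # Added observation points: capture inputs and completion at the boundary
    log.info("transfer start account=%s amount=%d", account_id, amount)
    ...  # the suspect code path
    log.info("transfer done account=%s", account_id)
```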

Block 5: Stop Signal (3-Strike Rule)

```
If the same test fails 3 times in a row, stop fixing.
Organize and report to the human:
- What fixes were attempted and their results
- Current hypothesis about root cause
- Possible structural issues (architecture, spec ambiguity)
- What needs human judgment
```

In my experience, AI will attempt a 4th fix if you don't stop it. It keeps digging the same hole. An explicit stop signal also saves you token costs.
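
If you orchestrate the agent yourself, you can enforce the rule outside the prompt too. A sketch with hypothetical `attempt_fix` and `run_test` callables:

```python
MAX_STRIKES = 3

def debug_loop(attempt_fix, run_test):
    attempts = []
    for strike in range(1, MAX_STRIKES + 1):
        fix = attempt_fix(history=attempts)  # agent proposes a fix
        passed = run_test()                  # re-run the same failing test
        attempts.append({"strike": strike, "fix": fix, "passed": passed})
        if passed:
            return {"status": "fixed", "attempts": attempts}
    # Three strikes: stop fixing and hand the report to a human
    return {"status": "needs_human_judgment", "attempts": attempts}
```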

The One Rule That Matters Most

No fixes without root cause first.

Claude Code's best practices include this as an explicit rule: "NO FIXES WITHOUT ROOT CAUSE FIRST." It enforces a 4-phase sequence:

| Phase | Action | Why AI skips it |
| --- | --- | --- |
| 1. Root Cause Investigation | Logs, traces, code analysis | "I've seen this pattern" — jumps ahead |
| 2. Pattern Analysis | Check if the same bug exists elsewhere | Only fixes the one spot |
| 3. Hypothesis Testing | Write a test to verify the cause | "Fixing is faster than testing" |
| 4. Implementation | Fix the verified cause | Wants to start here |

Prohibiting the jump from Phase 1 to Phase 4 — as an explicit prompt constraint — noticeably changes AI debugging accuracy.
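
Rendered in the same block style as the five above (a sketch, not Claude Code's official wording):

```
NO FIXES WITHOUT ROOT CAUSE FIRST.
Work through the phases in order. Do not skip ahead:
1. Investigate the root cause: logs, traces, code analysis.
2. Analyze the pattern: check whether the same bug exists elsewhere.
3. Test the hypothesis: write a test that verifies the suspected cause.
4. Implement: fix only the verified cause.
```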

Test-Driven Debugging: Give AI a Goal

The most effective way to have AI debug: make the goal unambiguous.

"Fix this bug" → success criteria are vague.
"Make this test pass" → success criteria are exact.

  1. Write a test that reproduces the bug (Red)
  2. Confirm the test fails
  3. Ask AI: "Make this test pass"
  4. Confirm it passes (Green)
  5. Confirm all existing tests still pass
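
Step 1 in practice, as a minimal pytest sketch. The `parse_price` function and its bug are hypothetical:

```python
# Single-file sketch; in a real project the test lives in your suite.
def parse_price(raw: str) -> float:
    return float(raw.strip("$"))  # hypothetical bug: chokes on "1,299.99"

def test_parse_price_with_thousands_separator():
    # Red: fails today with ValueError, which makes the goal exact
    assert parse_price("$1,299.99") == 1299.99
```

Run pytest, watch it fail, then hand over the literal instruction: "Make test_parse_price_with_thousands_separator pass without breaking existing tests."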

"I don't have time to write tests." I hear this. But without tests, AI fixes tend to create new bugs. You end up spending more time, not less.

Cross-Model Debugging

When one model fails on the same bug three times, it's stuck in its own blind spot. Hand the problem to a different model:

```
The previous agent attempted 3 fixes for this bug.
All failed. Here are the attempts:
[Failed fix 1, 2, 3]

Analyze the root cause using a different approach.
Do not repeat the previous agent's fixes.
```

Pair debugging works between humans. It works between AIs too.
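
Automating the handoff is mostly string templating. A sketch, where `call_model(name, prompt)` is a hypothetical wrapper around whichever SDKs you use:

```python
HANDOFF = """The previous agent attempted {n} fixes for this bug.
All failed. Here are the attempts:
{attempts}

Analyze the root cause using a different approach.
Do not repeat the previous agent's fixes."""

def hand_off(bug_report: str, failed_fixes: list[str], call_model) -> str:
    prompt = HANDOFF.format(
        n=len(failed_fixes),
        attempts="\n".join(f"- {f}" for f in failed_fixes),
    )
    return call_model("different-model", f"{bug_report}\n\n{prompt}")
```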


For the complete set of AI debugging patterns, CLAUDE.md design, and context engineering practices:

📖 Practical Claude Code: Context Engineering for Modern Development

References

  • David Agans, Debugging: The 9 Indispensable Rules (2002)
  • Stack Overflow, "2025 Developer Survey — AI" (2025)
  • Kent Beck, Test Driven Development: By Example (2002)
