If you’ve ever asked an assistant “why does this crash?” and gotten a confident-but-wrong answer, you’ve seen the real bottleneck:
Most bug reports aren’t runnable.
Humans can often “fill in the blanks” from intuition and experience. Assistants can’t. They need a concrete executable situation that produces the failure.
That’s the idea behind what I call the Repro Script Pattern:
Before you ask for a fix, ask for a minimal, runnable reproduction script + exact expected/actual output.
Done well, this turns fuzzy “it’s broken” conversations into a tight loop that looks a lot like test-driven debugging.
Why this works
When you provide (or force the assistant to create) a reproducible setup, you get three benefits immediately:
- You anchor the conversation to reality. No more guessing about versions, hidden state, or “maybe the API changed.”
- You make assumptions visible. The assistant has to commit to inputs, environment, and observable outcomes.
- You create a durable artifact. The repro script becomes documentation, a regression test, or a snippet you can share with teammates.
In practice, the Repro Script Pattern is a scope reducer and a quality booster.
The pattern (5 steps)
1) Define the target environment
Pick the smallest environment that can still show the bug:
- language/runtime version (Node 22, Python 3.12, Go 1.22)
- OS assumptions (Linux/macOS/Windows)
- package manager (npm/pnpm/poetry)
- required deps
If you don’t specify this, the assistant will silently pick one, and you’ll debug the mismatch.
2) Force a minimal input
A good repro has one path to failure.
- one API call
- one file
- one function
- one dataset row
If the assistant asks for “more context,” it’s usually a hint that the current input isn’t minimal or isn’t fully specified.
3) Require a single command to run
The rule is simple:
If a teammate can’t run it in 60 seconds, it’s not a repro.
“Single command” doesn’t mean “single file.” It means there’s a clean entry point:
```
node repro.mjs
# or
python repro.py
# or
go test ./... -run TestRepro
```
4) Capture expected vs actual output
This is the most underrated piece.
If you only describe the actual output (“it throws”), you force the assistant to guess your intent. Make intent explicit:
- expected: what should happen
- actual: what does happen
- delta: what’s wrong about it
5) Only then: ask for diagnosis + fix
Once the repro exists, the assistant can:
- reason about one concrete scenario
- propose a patch that’s constrained by the repro
- verify the patch against expected output
A prompt template you can reuse
Copy/paste this into your workflow (and tweak the stack):
```
You are helping me debug a bug.

First, do NOT propose a fix.

Step 1: Ask up to 5 clarifying questions ONLY if strictly necessary to create a minimal runnable repro.
Step 2: Produce a minimal reproduction as:
- files (with exact contents)
- install commands
- run command
- expected output
- actual output
Step 3: List the most likely root causes ranked (with reasoning).
Step 4: Propose a fix as a patch (diff) and explain why it resolves the repro.
Step 5: Extend the repro into a regression test if possible.

Constraints:
- Prefer smallest possible dependency set.
- If any assumption is made, label it as ASSUMPTION.
- Keep the repro under 60 lines if possible.
```
That “do NOT propose a fix” line is doing a lot of work. It prevents the assistant from jumping to solution mode before the problem is pinned down.
Concrete example: a Node fetch timeout that “doesn’t work”
Imagine you have code like this:
- “I’m using `AbortController`.”
- “I still get hung requests sometimes.”
- “Can you fix it?”
That’s not runnable, and the assistant will guess.
Here’s what the Repro Script Pattern looks like.
Repro script (repro.mjs)
```javascript
// repro.mjs
import http from "node:http";

// Start a server that never responds.
const server = http.createServer((_req, _res) => {
  // Intentionally do nothing.
});

server.listen(0, async () => {
  const { port } = server.address();
  const url = `http://127.0.0.1:${port}`;

  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), 100);

  const start = Date.now();
  try {
    await fetch(url, { signal: controller.signal });
    console.log("UNEXPECTED: request resolved");
  } catch (err) {
    console.log("elapsed_ms", Date.now() - start);
    console.log("error_name", err?.name);
  } finally {
    clearTimeout(timeout);
    server.close();
  }
});
```
Run command
```
node repro.mjs
```
Expected output
- `elapsed_ms` is close to ~100–200ms
- `error_name` is `AbortError`
Actual output
On some setups, you’ll see:
- long hangs
- or the process doesn’t exit cleanly
Now we can debug something specific:
- Is `fetch` the built-in one or polyfilled?
- Is the abort signal wired correctly?
- Are we closing the server?
- Are we leaking handles that keep Node alive?
Notice how the repro forces the conversation into checkable claims.
Turning the repro into a fix request
Now your request becomes:
“Given this repro, why might Node stay alive / hang, and what change makes the script consistently exit within 300ms?”
That’s answerable.
What to do when the assistant can’t run code
Sometimes the assistant can’t execute your environment. The pattern still works because you’re not asking it to run, you’re asking it to describe a runnable artifact.
Two tips:
- Ask it to include a “verification section”: “If you run the repro and see X, it supports hypothesis A; if you see Y, it supports hypothesis B.”
- Ask it to surface uncertainty explicitly: “List the top 3 unknowns that could invalidate your diagnosis.”
This keeps the output honest and makes your next action obvious.
Repro quality checklist
Before you accept a repro script, check:
- [ ] One command to run
- [ ] Minimal dependencies
- [ ] Exact versions listed (or explicitly assumed)
- [ ] Expected vs actual output included
- [ ] No unrelated code paths
- [ ] A teammate could run it quickly
If it fails any of these, tighten it.
The payoff: faster debugging, fewer hallucinations
The biggest upgrade you can make to AI-assisted debugging isn’t a better model.
It’s better artifacts.
The Repro Script Pattern turns debugging into a disciplined loop:
1) pin down reality → 2) propose hypothesis → 3) patch → 4) verify → 5) keep the repro as a regression test.
Once you start doing this consistently, you’ll notice something surprising: even when the assistant is wrong, it’s wrong in a useful way—because the failure is now reproducible, inspectable, and fixable.
If you try this pattern, I’d love to hear what stack you use it with (frontend? infra? data pipelines?) and what you had to tweak.