Debugging Is a Search Problem: Bisect Everything

#debugging #productivity #programming #career

The problem

Two engineers get the same bug: a value that's correct when it enters the system and wrong when it leaves. One fixes it in fifteen minutes. The other burns the afternoon. Same tools, same access, same intelligence.

The difference isn't knowledge of the codebase. It's that one of them is running a binary search and the other is poking around.

Most engineers debug linearly: start where the error showed up, read the nearby code, change something, re-run, repeat. At best that's O(n) in the size of the problem. In practice it's worse, because most of those changes produce no information.

The senior move is to treat every bug as a search over a space of possible causes, and to make each observation cut that space roughly in half. Ten halvings cover a thousand candidates. That's the whole trick, and it generalizes far past git bisect.

Why it happens

A bug lives somewhere in a space you can't see all at once. That space has several axes:

Time — which change introduced it.
Code — which module, function, or line.
Layers — at which stage of the pipeline the data went bad.
Input — which part of the payload triggers it.

Linear debugging searches one axis by hand — usually the code axis, starting at the symptom. But the symptom is where the bug surfaced, which is rarely where it lives. So you read the wrong neighborhood and mutate code hoping to get lucky. Each speculative edit that "might fix it" gives you almost nothing when it doesn't.

Binary search wins because information compounds. If you can find one observation that's true on one half of the space and false on the other, a single measurement deletes half your candidates, whichever way it comes out.
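To put a number on that compounding, here is a tiny shell sketch that counts how many halvings it takes to get from n candidates down to one. The function name halvings is made up for illustration.

```shell
#!/usr/bin/env bash
# Count how many times a candidate set of size n must be halved
# before a single candidate remains.
# `halvings` is a hypothetical helper name, not a standard tool.
halvings() {
  local n="$1" steps=0
  while [ "$n" -gt 1 ]; do
    n=$(( (n + 1) / 2 ))     # worst case: the bug is in the larger half
    steps=$(( steps + 1 ))
  done
  echo "$steps"
}

halvings 1000      # prints 10
halvings 1000000   # prints 20
```

A thousand times more suspects costs ten more observations, which is why finding a splitting observation is worth more effort than a clever guess.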

What to do about it

Pick the axis with the cleanest before/after boundary and halve it.

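Time: automate git bisect. If it worked last release and breaks now, you don't need to read the diff. Write a script that exits 0 when the commit is good and 1 when it's bad (git bisect run also reads exit code 125 as "can't test this commit, skip it"), and let git run the search: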

git bisect start
git bisect bad                 # current commit is broken
git bisect good v2.3.0         # this release was fine
git bisect run ./repro.sh      # git binary-searches for you

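Across 1,000 commits that's about 10 runs, not 1,000. Git reports the first bad commit, which is often the entire investigation, as long as the repro script gives the same answer every time it runs on the same commit. Finish with git bisect reset to get back to where you started.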
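What goes inside repro.sh depends on your project, so treat the following as a sketch under assumptions rather than a recipe. The function name check_commit and the make targets in the closing comment are placeholders. The exit codes are the part that matters: git bisect run reads 0 as good, 125 as "skip this commit", and any other code from 1 to 127 as bad.

```shell
#!/usr/bin/env bash
# Core of a repro.sh for `git bisect run`.
# check_commit is a hypothetical helper: pass it your build command
# and the one test that demonstrates the bug.
check_commit() {
  local build_cmd="$1" test_cmd="$2"
  # A commit that doesn't build says nothing about the bug: skip it.
  if ! sh -c "$build_cmd"; then
    return 125
  fi
  # Test passes: this commit is good.
  if sh -c "$test_cmd"; then
    return 0
  fi
  # Test fails: this commit is bad.
  return 1
}

# As the last line of a real repro.sh (placeholder commands, swap in your own):
#   check_commit "make build" "make test"
```

Because the call is the last command in the script, its return status becomes the script's exit status, which is what git bisect run reads. Returning 125 for unbuildable commits keeps a broken build from being blamed for the bug.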

Layers: check the middle of the pipeline. Data flows A → B → C → D → E and comes out wrong at E. Don't start at A or E. Log the value as it leaves C. If it's already wrong there, the bug is in A, B, or C; if it's fine, it's in D or E. One print statement halved the system.
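If the stages can be run separately, the midpoint check can be mechanized. The sketch below uses an invented five-stage pipeline (stage_1 through stage_5 standing in for A through E) with a bug planted in stage_4, and an invented looks_good check. None of these names come from a real tool.

```shell
#!/usr/bin/env bash
# Hypothetical pipeline: each stage reads stdin and writes stdout.
stage_1() { cat; }                  # A
stage_2() { tr 'a-z' 'A-Z'; }       # B
stage_3() { cat; }                  # C
stage_4() { sed 's/42/41/'; }       # D: planted bug for the demo
stage_5() { cat; }                  # E
STAGE_COUNT=5

# Feed the input through the first k stages and print the result.
run_through() {
  local k="$1" data="$2" i=1
  while [ "$i" -le "$k" ]; do
    data=$(printf '%s\n' "$data" | "stage_$i")
    i=$(( i + 1 ))
  done
  printf '%s\n' "$data"
}

# Hypothetical check: the value is still good while it contains 42.
looks_good() {
  case "$1" in *42*) return 0 ;; *) return 1 ;; esac
}

# Invariant: output after `lo` stages is good, after `hi` stages is bad.
first_bad_stage() {
  local input="$1" lo=0 hi="$STAGE_COUNT" mid
  while [ $(( hi - lo )) -gt 1 ]; do
    mid=$(( (lo + hi) / 2 ))
    if looks_good "$(run_through "$mid" "$input")"; then
      lo=$mid
    else
      hi=$mid
    fi
  done
  echo "stage_$hi"
}

first_bad_stage "total=42"   # prints stage_4
```

The search only works because the two ends are pinned down first: the input is known good and the final output is known bad. Without both facts there is no boundary to look for.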

Input: shrink the repro. A 4,000-line request fails. Delete half. Still fails? Delete half of what's left. Passes now? Put it back and cut the other half. If neither half fails on its own, the bug needs pieces of both, so cut in smaller chunks. In a dozen steps a 4,000-line payload becomes the three fields that actually matter. This is the core idea of delta debugging, and it works on config files, CSS, and feature flags: anything you can chop.
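The halving loop is easy to script when the input is line-oriented. This is a simplified sketch, not full delta debugging: it only tries the two halves and stops as soon as the failure needs lines from both. The names shrink and still_fails and the sample payload are invented for the demo.

```shell
#!/usr/bin/env bash
# Usage: shrink FILE CHECK
# CHECK is a command that takes a file path and exits 0 when the bug
# still reproduces with that file as input.
shrink() {
  local file="$1" check="$2" work half n
  work=$(mktemp)
  half=$(mktemp)
  cp "$file" "$work"
  while :; do
    n=$(( $(wc -l < "$work") ))
    if [ "$n" -le 1 ]; then break; fi
    # Try the first half on its own.
    head -n "$(( n / 2 ))" "$work" > "$half"
    if "$check" "$half"; then cp "$half" "$work"; continue; fi
    # Otherwise try the second half.
    tail -n "+$(( n / 2 + 1 ))" "$work" > "$half"
    if "$check" "$half"; then cp "$half" "$work"; continue; fi
    break   # failure needs lines from both halves; this sketch stops here
  done
  cat "$work"
  rm -f "$work" "$half"
}

# Hypothetical check: the "bug" reproduces whenever a currency field is present.
still_fails() { grep -q '^currency=' "$1"; }

payload=$(mktemp)
printf '%s\n' 'id=7' 'name=x' 'qty=3' 'currency=XX' 'note=a' 'tag=b' 'sku=c' 'ref=d' > "$payload"
shrink "$payload" still_fails   # prints currency=XX
rm -f "$payload"
```

In practice still_fails would replay the candidate file against the real system. The one requirement is that it answers the same question every time: does the bug still reproduce with this input?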

Code: disable half. Comment out half the handlers, half the middleware, half the plugins. Bug gone? It was in the half you removed. Still there? It's in the half you kept. Either way, repeat on the guilty half. This feels crude; it is faster than reading.
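When the suspects can be toggled by name, the same loop runs without touching source files. The sketch below assumes exactly one plugin causes the bug on its own. bug_present_with is a made-up stand-in for "start the app with only these plugins enabled and run the repro", and the plugin names are invented.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for launching the app with only the listed plugins
# and checking for the bug. The planted culprit here is "cache".
bug_present_with() {
  case " $* " in *" cache "*) return 0 ;; *) return 1 ;; esac
}

# Narrow a list of suspects to one by enabling only half at a time.
# Assumes a single plugin triggers the bug by itself.
find_culprit() {
  local half
  while [ "$#" -gt 1 ]; do
    half=$(( $# / 2 ))
    if bug_present_with "${@:1:half}"; then
      set -- "${@:1:half}"    # bug shows with the first half: keep it
    else
      shift "$half"           # bug gone: the culprit is in the rest
    fi
  done
  echo "$1"
}

find_culprit auth cache logging metrics ratelimit   # prints cache
```

If the bug comes from two plugins interacting, neither half reproduces it alone and this loop will point at the wrong suspect, so confirm the result by disabling only the reported plugin.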

The discipline underneath all four: each step's job is information, not a fix. Before you touch anything, ask "what's the one observation that splits this in half?" If a change can't tell you which half the bug is in, you're not searching — you're guessing.

Key takeaways

A bug is a point in a search space, not a line of code. Your job is to shrink the space, fast.
Every good observation halves what's left. log2(n), not n — ten steps cover a thousand suspects.
Bisect on whichever axis has the cleanest boundary: time (git bisect), layers (check the midpoint value), input (shrink the repro), or code (disable half).
Optimize each step for information, not for a fix. "Might this fix it?" is the wrong question. "Which half does this rule out?" is the right one.
The symptom is where the bug surfaced, not where it lives. Stop starting there.

DEV Community
