
The Debugging Checklist I Run Before Asking Anyone for Help

Last Tuesday I spent 40 minutes convinced my API was broken. Requests were timing out, the logs looked fine, and I was drafting a message to a friend who knows the codebase when I realized the bug was in my .env file. I had a trailing space after the database URL. A trailing space.

Forty minutes. For a space character.

I should have caught it in under five. I have a process for this. I just skipped it because I was tired and figured this one would be obvious. It was not obvious. They never are when you skip the process.

So here is the actual checklist I run through before I message a coworker, open a Stack Overflow tab, or do anything that involves another human looking at my dumb mistake. This is not theory. This is the stuff that lives in my hands at this point.


Step 1: Confirm the Bug Exists Where You Think It Does

Most debugging advice is useless because it assumes you already know where the bug is. That is a massive assumption. Half the time, the bug is not even in the file you are staring at.

Before doing anything else, prove to yourself that the system is actually broken in the way you think it is. Not "it feels broken" or "the output looks wrong." Run a request manually. Check the actual response body. Look at the actual database row. Hit the actual endpoint with curl.

I cannot tell you how many times I have been debugging the wrong thing entirely. The frontend is showing stale cached data and the API is fine. The test is failing because the test setup is wrong, not because the code is wrong. The deploy did not actually go out yet.

Stop assuming. Verify.
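
If the thing in question is an HTTP API, this is a 30-second check from a REPL. A minimal sketch, assuming the requests library is installed and using a made-up endpoint; the point is to look at the raw response instead of the UI's rendering of it:

```python
import requests  # assumption: requests is installed

# Hypothetical endpoint -- substitute the one you think is broken.
resp = requests.get("https://api.example.com/users/42", timeout=5)

print(resp.status_code)                   # the actual status, not "it feels broken"
print(resp.headers.get("content-type"))   # are you even getting JSON back?
print(resp.text[:500])                    # the raw body, before any frontend caching
```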


Step 2: Check What Changed

This is the highest-value question in all of debugging, and most people skip it.

Was this working before? When did it stop? What changed between then and now?

```bash
git log --oneline -20
git diff HEAD~5
```

Look at your recent commits. Check if someone else merged something. Look at your environment variables, your config files, your dependencies. Did a package update? Did someone change a shared resource? Did your database get migrated?

If you can narrow it down to "it broke after this specific change," you have just eliminated 95% of the codebase from your investigation. That is not a small thing.
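
Dependencies deserve the same question. Here is a quick sketch using the standard library's importlib.metadata to print what is actually installed in the environment you are debugging; the package names are placeholders for whatever your project uses:

```python
from importlib.metadata import version, PackageNotFoundError

# Placeholder package names -- swap in your project's real dependencies.
for pkg in ("requests", "sqlalchemy", "fastapi"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```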


Step 3: Read the Actual Error Message

I know this sounds condescending. I promise it is not.

I have watched developers, including myself, glance at an error message, get the gist of it, and then go off debugging based on their interpretation of what the error probably means. Then they spend an hour chasing the wrong thing because the error message was quite specific and they just did not read the whole thing.

Read it. The whole thing. Including the stack trace. Including the part after the first line. Including the "caused by" section at the bottom that everyone scrolls past.

If there is a line number, go to that line. If there is a file path, open that file. If there is an error code, look up that specific code. Not the category. The specific code.
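
One wrinkle worth knowing: Python prints chained exceptions in the opposite order from Java's "Caused by". The root cause comes first, then "The above exception was the direct cause of the following exception", then the wrapper that everyone reads. A toy example of why the last line alone will mislead you:

```python
def load_config(path):
    try:
        raw = open(path).read()
    except OSError as exc:
        # Wrap with context, but keep the original error chained.
        raise RuntimeError(f"could not load config from {path}") from exc
    return raw

# The final line of the traceback says RuntimeError. The actual problem
# (FileNotFoundError, and the exact bad path) prints ABOVE it.
load_config("/nonexistent/app.toml")
```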


Step 4: Reproduce It Reliably

If you cannot make the bug happen on command, you are not ready to debug it. You are guessing.

Strip away everything you can. If the bug happens in a web app, can you trigger it from a unit test? From a curl command? From a REPL? The fewer moving parts between you and the bug, the faster you will find it.

Here is the thing though. If you cannot reproduce it, that IS the clue. Intermittent bugs are almost always one of three things: race conditions, state leaking between tests, or something environment-specific. That narrows your search enormously.
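
When the intermittent bug smells like a race, you can often force it to reproduce on command by widening the race window on purpose. A contrived check-then-act example: the sleep pushes both threads into the gap between the check and the decrement, so the oversell happens every run instead of one run in a thousand:

```python
import threading
import time

stock = {"widget": 1}
sold = []

def buy():
    if stock["widget"] > 0:    # check
        time.sleep(0.01)       # widen the race window on purpose
        stock["widget"] -= 1   # act
        sold.append(1)

threads = [threading.Thread(target=buy) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"sold={len(sold)}, stock={stock['widget']}")  # sold=2, stock=-1
```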


Step 5: Add Logging at the Boundaries

Not everywhere. At the boundaries.

Log what goes into the function and what comes out. Log what the database query returns. Log what the external API sends back. You are trying to find the exact point where the data goes from correct to incorrect.

```python
print(f"INPUT:  {user_id}, {payload}")
result = do_the_thing(user_id, payload)
print(f"OUTPUT: {result}")
```

Yeah, print debugging. I know. I have a debugger. I know how to use breakpoints. But nine times out of ten, a few strategic print statements find the bug faster than stepping through code because they let me see the data flow across multiple function calls at once instead of one frame at a time.

I am not saying debuggers are bad. I am saying the fastest path to the bug is usually the one with the least setup.
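
When the prints need to outlive a single session, the same boundary idea works as a five-line decorator. A sketch using the logging module and the hypothetical do_the_thing from above:

```python
import functools
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger(__name__)

def trace(fn):
    """Log everything crossing this function's boundary."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        log.debug("INPUT  %s args=%r kwargs=%r", fn.__name__, args, kwargs)
        result = fn(*args, **kwargs)
        log.debug("OUTPUT %s -> %r", fn.__name__, result)
        return result
    return wrapper

@trace
def do_the_thing(user_id, payload):  # hypothetical stand-in
    return {"user_id": user_id, **payload}

do_the_thing(42, {"qty": 1})
```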


Step 6: Question Your Assumptions

This is the hard one.

By the time you have been staring at a bug for 20 minutes, you have built a mental model of what is happening. That mental model is probably wrong in at least one way, and that wrong assumption is probably why you have not found the bug yet.

Force yourself to question the things you "know":

  • Are you sure this code is actually being executed? Add a log and check.
  • Are you sure that variable contains what you think it does? Print it.
  • Are you sure the database has the data you expect? Query it directly.
  • Are you sure you are hitting the right server? Check the URL. Check it again.
  • Are you sure the config file you are editing is the one the app is reading?

That last one has bitten me at least four times. Editing a .env file in one directory while the process is reading from a completely different one.
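
That particular failure is cheap to rule out. A sketch, assuming you load config with python-dotenv; find_dotenv() reports the path the library will actually use, and the raw-line scan catches stray whitespace like the one from the intro:

```python
from pathlib import Path

from dotenv import find_dotenv  # assumption: config loaded via python-dotenv

# Which file is the process actually reading?
path = find_dotenv()
print(f"loading: {path or '(no .env found)'}")

# Flag values with leading/trailing whitespace -- the 40-minute bug.
if path:
    for lineno, line in enumerate(Path(path).read_text().splitlines(), start=1):
        if "=" not in line or line.lstrip().startswith("#"):
            continue
        key, _, value = line.partition("=")
        if value != value.strip():
            print(f"line {lineno}: {key.strip()} has stray whitespace: {value!r}")
```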


Step 7: Isolate With Binary Search

If you have a big chunk of code and you know the bug is in there somewhere, do not read every line hoping to spot it. That works sometimes. It is slow every time.

Comment out half the code. Does the bug still happen? If yes, it is in the other half. If no, it is in the half you commented out. Repeat.

This works for more than just code. It works for config files, for SQL queries, for CSS stylesheets, for request payloads. Anything where you have a blob of stuff and the problem is somewhere inside it.
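
If the blob is a list of anything (headers, config keys, test files), the same loop can be mechanized. A sketch, assuming exactly one culprit and a hypothetical fails() predicate that runs your repro against a subset:

```python
def bisect_culprit(items, fails):
    """Narrow a list of suspects down to one, assuming exactly one of
    them triggers the failure and fails(subset) detects it."""
    while len(items) > 1:
        mid = len(items) // 2
        first_half = items[:mid]
        items = first_half if fails(first_half) else items[mid:]
    return items[0]

# Toy usage: which request header breaks the call? (stand-in predicate)
headers = ["Accept", "Authorization", "X-Debug", "X-Trace-Id"]
print(bisect_culprit(headers, fails=lambda subset: "X-Debug" in subset))
# -> X-Debug
```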


Step 8: Search for the Exact Error

I do this later in the process than most people recommend, and here is why: if you Google the error message before understanding your own code, you will find a Stack Overflow answer that looks relevant, apply the fix blindly, and either introduce a new bug or mask the original one.

But once you have done steps 1 through 7 and you actually understand the shape of the problem? Then searching is incredibly effective because you can evaluate whether a solution actually applies to your situation.

Put the error in quotes. Include the specific library or framework version. Filter to results from the last year or two. Old answers for old versions will ruin your afternoon.


Step 9: Rubber Duck It (Seriously)

I used to think this was silly advice from people who had never debugged anything serious. I was wrong.

The act of explaining the problem out loud, or writing it out as if you are going to post it somewhere, forces you to organize your understanding. And the gaps in your understanding become obvious when you try to articulate them.

I write the Stack Overflow question. Title, description, what I have tried, what I expected, what actually happened. I would estimate 30% of the time, I find the answer while writing the question. The other 70%, I at least have a well-structured question ready to post.


Step 10: Let a Second Brain Look at It

I was pretty skeptical about AI coding tools for a while. Thought it was all hype. Then sometime in mid-2024 I pasted a module into Claude on a whim, mostly to prove to a friend it would not find anything useful. It caught a race condition I had been missing for weeks. A subtle one too, where two async handlers were writing to the same object and the outcome depended on which one resolved first. I had been staring at that code for days.

Now this is a regular step in my process. Not the first step. Not a replacement for actually understanding the problem. But once I have done the work above and I am genuinely stuck, I will paste in the relevant code with a specific description of what is going wrong. Not "review this code" but "this function returns the wrong total when called with concurrent requests and I suspect it is a state mutation issue, here is my evidence."

The key word there is specific. Vague questions get vague answers, from humans and AI alike.


The Whole List, Condensed

  1. Confirm the bug is real and is where you think it is
  2. Check what changed recently
  3. Read the complete error message
  4. Reproduce it reliably
  5. Log at the boundaries
  6. Question your assumptions
  7. Binary search to isolate
  8. Search with context
  9. Rubber duck it
  10. Let a second brain look

I do not always do all ten. Sometimes step 2 solves it instantly. Sometimes I jump straight to step 5 because I have a hunch. The point is not rigid adherence. The point is that when I am stuck, I have a concrete next action instead of staring at the screen hoping for insight.

The developers I respect most are not the ones who never get stuck. They are the ones who get unstuck fast. And they get unstuck fast because they have a process, not because they are smarter.


I ended up documenting a lot of my debugging prompts and workflow patterns for AI-assisted development in a Claude Code Prompt Pack I put together for myself. It includes the exact prompt structures I use for bug hunting, code review, and the kind of targeted analysis that actually surfaces real issues. Figured I might as well share it since I was already maintaining it anyway.
