Dat Nguyen
How we learned to log bugs properly

We weren’t slow at fixing bugs — we were slow at understanding them

In our previous post, we talked about a simple realization:

We weren’t slow at fixing bugs.
We were slow at understanding them.

That led us to a deeper question:

What does a “good” bug log actually look like?

Most bug reports fail before debugging even starts

A typical report looks like this:

  • “It’s broken”
  • “Webhook not working”
  • “App doesn’t respond”

From a user’s perspective, that’s completely reasonable.

But from a developer’s perspective, it’s missing everything we need:

  • Where did it happen?
  • What did the user do before that?
  • What failed exactly?
  • Can we reproduce it?

So the real process becomes:

  1. Support asks follow-up questions
  2. Devs try to guess
  3. Time gets lost before any fix even begins

The problem isn’t debugging.

It’s the input we start with.

A useful bug is not a message — it’s a reconstructable event

After running into this repeatedly, we started thinking differently.

A bug report shouldn’t be something you read.
It should be something you can reconstruct.

At minimum, a useful log should answer (see the sketch after this list):

  • Where did this happen? (URL / screen)
  • What failed? (request, error, response)
  • What led to it? (user actions, event sequence)
  • Under what conditions? (device, browser, network)
  • When did it happen? (timestamp)
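
To make that concrete, here's a minimal sketch of the shape such a record could take. This is a hand-written TypeScript illustration, not a fixed schema; every field name here is invented:

```ts
// A minimal sketch of a reconstructable bug record.
// Field names are illustrative, not a fixed schema.
interface BugRecord {
  // Where did this happen?
  pageUrl: string;
  screen?: string;

  // What failed?
  failure: {
    requestUrl: string;
    method: string;
    status: number;
    responseBody?: string;
  };

  // What led to it?
  events: Array<{ type: "navigation" | "click" | "input"; target: string; at: string }>;

  // Under what conditions?
  environment: {
    browser: string;
    os: string;
    deviceType: "desktop" | "mobile" | "tablet";
    network?: string;
  };

  // When did it happen?
  timestamp: string; // ISO 8601
}
```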

Once you have that, debugging changes.

You’re no longer asking:

“What might have happened?”

You’re asking:

“Why did this specific sequence lead to failure?”

Why user-reported steps are not enough

One of the most fragile parts of debugging is reproduction.

We usually rely on users or support teams to describe the steps.

But by the time that happens:

  • details are forgotten
  • steps are incomplete or slightly inaccurate

Even small differences can make a bug impossible to reproduce.

Reproduction should come from the system, not memory

So we stopped asking users for steps.

Instead, we derive them from the actual session.

We look at what really happened (a capture sketch follows this list):

  • page navigation
  • clicks and interactions
  • network requests
  • state changes
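
Here's a minimal sketch of what that capture could look like in a browser app. It is illustrative only: the buffer size, event names, and fetch wrapper are assumptions, not our production code:

```ts
// Sketch: keep a small rolling buffer of session events in the browser.
type SessionEvent = { type: string; detail: string; at: number };

const MAX_EVENTS = 50;
const events: SessionEvent[] = [];

function record(type: string, detail: string) {
  events.push({ type, detail, at: Date.now() });
  if (events.length > MAX_EVENTS) events.shift(); // keep only recent history
}

// Page navigation (an SPA would also hook its router;
// state changes would be app-specific, e.g. a store subscription)
window.addEventListener("popstate", () => record("navigation", location.pathname));

// Clicks and interactions
document.addEventListener("click", (e) => {
  const el = e.target as HTMLElement;
  record("click", el.tagName + (el.id ? `#${el.id}` : ""));
});

// Network requests: wrap fetch so a failure report can carry
// the events that preceded it
const originalFetch = window.fetch;
window.fetch = async (...args: Parameters<typeof fetch>) => {
  const res = await originalFetch(...args);
  const url = args[0] instanceof Request ? args[0].url : String(args[0]);
  record("request", `${res.status} ${url}`);
  return res;
};
```

A small rolling buffer keeps the overhead low while still preserving the last few steps before a failure, which is usually all you need to reconstruct the path.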

From that, we reconstruct a simplified sequence of events leading up to the issue.

Not a perfect script — but usually enough to trigger the same failure again.

That turns reproduction from guesswork into something much closer to replay.

Bugs are rarely universal

Another thing we kept seeing:

The same issue doesn’t affect everyone.

Sometimes it only happens:

  • on a specific browser
  • on a specific device
  • for a specific user
  • under certain network conditions

Without that context, bugs feel random.

With it, patterns start to emerge.

That’s why environment data matters just as much as the error itself.
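
As a rough sketch, capturing that context in the browser can be as small as this. Note that navigator.connection (the Network Information API) is non-standard and missing in some browsers, so the network field is best-effort:

```ts
// Sketch: snapshot the environment at report time.
function captureEnvironment() {
  return {
    userAgent: navigator.userAgent, // encodes browser + OS
    language: navigator.language,
    viewport: `${window.innerWidth}x${window.innerHeight}`,
    // Non-standard API, feature-detected; e.g. "4g", "slow-2g"
    network: (navigator as any).connection?.effectiveType ?? "unknown",
  };
}
```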

What we actually log now

Over time, our logs evolved into something closer to a structured issue.

For each bug, we capture (an example record follows the list):

  1. The failure itself
     • request URL
     • method
     • status code
     • response body
  2. Where it happened
     • page URL
     • screen / feature
  3. What led to it
     • sequence of user actions
     • navigation flow
  4. The environment
     • browser
     • OS
     • device type
     • network conditions
  5. When it happened
     • precise timestamp
  6. Reproduction context
     • a reconstructed path to trigger the issue again
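
Put together, a single captured bug might read like this. Every value below is invented for illustration:

```ts
// An invented example of one captured bug, in the shape sketched earlier.
const exampleBug = {
  pageUrl: "https://example.com/checkout",
  screen: "checkout",
  failure: {
    requestUrl: "https://api.example.com/orders",
    method: "POST",
    status: 500,
    responseBody: '{"error":"inventory service timeout"}',
  },
  events: [
    { type: "navigation", target: "/cart", at: "2024-01-01T10:14:55Z" },
    { type: "click", target: "BUTTON#checkout", at: "2024-01-01T10:15:02Z" },
  ],
  environment: {
    browser: "Safari 17",
    os: "iOS 17",
    deviceType: "mobile",
    network: "4g",
  },
  timestamp: "2024-01-01T10:15:03Z",
};
```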

At that point, a bug stops being a vague report
and becomes something you can actually work with immediately.

What changed for us

The biggest shift wasn’t logging more.

It was logging the right things.

Before:

  • we had errors
  • but no context

Now:

  • we have context
  • and the error becomes obvious

We spend less time asking:

“What happened?”

And more time on:

“How do we fix it?”

What we’re still figuring out

Even with all of this, it still feels incomplete.

There are still cases where:

  • everything looks technically correct
  • but the outcome is wrong for the user

No error. No exception.

Just a mismatch between what the system did and what the user expected.

Those are harder to capture.

And that raises a bigger question:

What does a truly complete bug log look like?

If you’ve run into similar cases, I’d really appreciate your perspective.

We’re still building this and learning from real-world usage:
👉 https://flashlog.app
