DEV Community

Abhishek Singh

I built a dataset of 50,000 debugging sessions — and what I found surprised me

Most bug datasets only tell you: "Bug was fixed in 23 minutes."

They don't tell you what happened during those 23 minutes.

So I built one that does.


What is DebugTraj-50K?

It is a dataset of 50,000 developer debugging sessions containing 665,364 step-by-step behavioral events (about 13 per session on average), recorded across 8 programming languages and 71 error types.

Every session captures what a developer actually did while fixing a bug:

  • How many times they searched Google
  • How many compile attempts they made
  • Which files they opened
  • Whether they used an AI tool
  • Whether they asked a colleague
  • What strategy finally worked
  • And whether they succeeded or gave up

This is not a dataset about bugs. It is a dataset about human behavior under pressure.
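
To make the relationship between sessions and events concrete, here is a minimal sketch of rolling step-level events up into per-session counts. The event names and the `(session_id, event_type)` layout are hypothetical and only for illustration; the actual schema of DebugTraj-50K may differ.

```python
from collections import Counter, defaultdict

# Hypothetical step-level events: (session_id, event_type).
# The real column and event names in the dataset may differ.
events = [
    ("s1", "web_search"),
    ("s1", "compile"),
    ("s1", "compile"),
    ("s1", "open_file"),
    ("s2", "web_search"),
    ("s2", "ask_colleague"),
    ("s2", "compile"),
]

def count_events_per_session(events):
    """Return {session_id: Counter of event types} for a list of events."""
    per_session = defaultdict(Counter)
    for session_id, event_type in events:
        per_session[session_id][event_type] += 1
    return dict(per_session)

summary = count_events_per_session(events)
for session_id, counts in summary.items():
    print(session_id, dict(counts))
```

Counts like these (searches, compile attempts, files opened) are the kind of per-session features the rest of this post relies on.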


Who is this useful for?

AI researchers and companies
Tools like GitHub Copilot and Cursor need behavioral data to train models that understand how developers think, not just how code looks. This dataset aims to fill that gap.

ML practitioners
You can build models to predict:

  • Will this debugging session succeed?
  • How long will it take?
  • What will the developer do next?
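
As a starting point for the first question, here is a rough sketch of a baseline that uses the historical fix rate of each (language, experience) group as the predicted probability of success. The column names follow the example later in this post; the toy rows, the `'Junior'`/`'Senior'` labels, and the `'abandoned'` outcome value are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the sessions table. Rows and labels are invented;
# only the column names and the 'fixed' outcome come from this post.
sessions = pd.DataFrame({
    "programming_language": ["Python", "Python", "Python", "Rust", "Rust", "Rust"],
    "experience_level": ["Junior", "Junior", "Senior", "Junior", "Senior", "Senior"],
    "outcome": ["fixed", "abandoned", "fixed", "abandoned", "fixed", "fixed"],
})

def fit_fix_rate_baseline(df):
    """Return ({(language, experience): fix rate}, overall fix rate)."""
    fixed = df["outcome"].eq("fixed")
    rates = fixed.groupby(
        [df["programming_language"], df["experience_level"]]
    ).mean()
    return rates.to_dict(), float(fixed.mean())

def predict_fix_probability(rates, overall, language, experience):
    """Look up the group's fix rate, falling back to the overall rate."""
    return float(rates.get((language, experience), overall))

rates, overall = fit_fix_rate_baseline(sessions)
p_known = predict_fix_probability(rates, overall, "Python", "Junior")
p_unseen = predict_fix_probability(rates, overall, "Go", "Junior")
print(f"Python/Junior: {p_known:.2f}, unseen group: {p_unseen:.2f}")
```

Any trained classifier should beat this lookup table on held-out sessions before it is worth the extra complexity.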

CS educators
See exactly where students struggle, how long they take, and what makes senior developers faster.

DevTool companies
Understand real developer pain points when building IDEs, debuggers, and productivity tools.

Individual developers
Compare your own debugging habits with 50,000 others.


The Daily Coding Companion

On top of the dataset I built a practical notebook called the Daily Coding Companion.

It answers questions every developer asks themselves while debugging:

  • How long will MY bug take to fix?
  • Am I taking too long compared to my peers?
  • What should I try next when I am stuck?
  • When should I stop and ask for help?
  • Which language is actually hardest to debug?

You just fill in your language, your error type, and how long you have been stuck — and it gives you answers based on real data from 50,000 sessions.

No ML knowledge needed.

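# Example: estimate fix time for your current bug
import pandas as pd

# Assumption: the session table has been downloaded from Kaggle as a CSV.
# 'sessions.csv' is a placeholder; use the actual file name from the dataset.
sessions = pd.read_csv('sessions.csv')

my_language   = 'Python'
my_experience = 'Mid-level (2-5 years)'
my_severity   = 3

# Keep only sessions that match your language, experience level, and severity
similar = sessions[
    (sessions['programming_language'] == my_language) &
    (sessions['experience_level']     == my_experience) &
    (sessions['error_severity']       == my_severity)
]

print(f"Similar sessions  : {len(similar)}")
print(f"Expected fix time : {similar['resolution_time_minutes'].mean():.0f} minutes")
print(f"Chance of fixing  : {(similar['outcome']=='fixed').mean()*100:.1f}%")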
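
The "am I taking too long compared to my peers?" question can be approached the same way. The sketch below computes the share of same-language sessions that finished in less time than you have already spent. It is one possible approach, not necessarily how the notebook does it, and the toy values are made up.

```python
import pandas as pd

# Toy data; column names follow the example above, values are invented.
sessions = pd.DataFrame({
    "programming_language": ["Python"] * 5 + ["Rust"] * 3,
    "resolution_time_minutes": [10, 20, 30, 40, 50, 60, 90, 120],
})

def share_of_peers_faster(df, language, minutes_stuck):
    """Fraction of same-language sessions that ended in under minutes_stuck."""
    peers = df.loc[
        df["programming_language"] == language, "resolution_time_minutes"
    ]
    if peers.empty:
        return None  # no comparable sessions to compare against
    return float((peers < minutes_stuck).mean())

share = share_of_peers_faster(sessions, "Python", 35)
print(f"{share * 100:.0f}% of comparable sessions were done by now")
```

A high share suggests it may be time to change strategy or ask for help. Note that if the time column also covers sessions that were given up on, it may make sense to filter on `outcome == 'fixed'` first.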

Key findings from the data

  • Senior developers fix bugs 3.7x faster than junior developers on average
  • Rust and C++ take 1.5x longer to debug than Python or JavaScript
  • Sessions in which the developer took a break had a higher fix rate than sessions without one
  • AI tool usage is highest among mid-level developers, not juniors
  • Among sessions with 8 or more searches and 12 or more compile attempts, those in which the developer asked a colleague have a much higher fix rate
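
Comparisons like the first finding come down to a group-by and a ratio. Here is a minimal sketch on toy data; the `'Junior'` and `'Senior'` labels and the numbers are placeholders, so check the dataset for the exact `experience_level` values before running it on the real table.

```python
import pandas as pd

# Toy data with placeholder labels and invented times.
sessions = pd.DataFrame({
    "experience_level": ["Junior", "Junior", "Senior", "Senior"],
    "resolution_time_minutes": [60, 100, 20, 20],
})

# Average resolution time per experience level
mean_time = sessions.groupby("experience_level")["resolution_time_minutes"].mean()

# How many times longer junior sessions take compared with senior sessions
ratio = float(mean_time["Junior"] / mean_time["Senior"])
print(f"Junior sessions take {ratio:.1f}x as long as senior sessions")
```

Swapping `experience_level` for `programming_language` gives the same kind of comparison across languages.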

Links

Dataset: https://www.kaggle.com/datasets/abhisheksingh016/debugtraj-50k

Daily Coding Companion notebook: https://www.kaggle.com/code/abhisheksingh016/daily-coding-companion

Both are free and open under CC BY 4.0.


If you find it useful, an upvote on Kaggle goes a long way. And if you build something with this dataset, I would love to see it in the comments.
