AI coding assistants have changed how we write software. Autocomplete, boilerplate, refactors — all faster than ever.
But debugging? Research groups have found something counterintuitive: AI speeds up writing code… but slows down fixing it.
This isn’t speculation. It shows up in data from METR, Microsoft Research, and Google’s developer studies. And the explanation reveals something fundamental about how debugging actually works.
METR: AI reduces typing but increases debugging
METR’s 2025 study looked at professional developers solving real GitHub issues in large codebases.
The surprising result:
- Coding time decreased
- Debugging time increased ~20%
Why?
Developers spent more time:
- reviewing AI-generated code
- hunting down logic errors in code that looked correct
- debugging AI-suggested fixes that didn't match actual runtime behavior
- refactoring output that didn't fit the codebase's existing patterns
One example METR cited:
const filtered = items
.filter(i => i.active)
.map(i => i.data)
.sort((a, b) => a.priority - b.priority);
Looks perfect. At runtime, though, it fails instantly if data is null for even one item.
And that’s the point: AI doesn’t see what actually happened at runtime. It only sees static code.
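To make that concrete, here is a minimal, self-contained sketch of the same pipeline with hypothetical data (the item values below are illustrative, not from the METR study). Read statically, the chain looks fine; at runtime, one null data field is enough to throw.

```javascript
// Hypothetical items shaped like the example above; one data field is null.
const items = [
  { active: true,  data: { priority: 2 } },
  { active: true,  data: null },            // the runtime surprise
  { active: false, data: { priority: 1 } },
];

try {
  const filtered = items
    .filter(i => i.active)
    .map(i => i.data)
    .sort((a, b) => a.priority - b.priority);
  console.log(filtered);
} catch (err) {
  // TypeError: reading 'priority' of null, visible only when the code runs
  console.error(err.message);
}
```

Nothing in the source text reveals that one of those data fields is null. That fact exists only in the running program.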
Microsoft Research: AI struggles with root cause debugging
Microsoft tested AI tools on 200 real enterprise bugs.
Accuracy on root causes:
- 37% for complex bugs
- 23% for frontend issues
- 31% for async/concurrency bugs
And:
- 22% of AI-generated fixes introduced regressions.
Almost every failure traced back to the same reason: LLMs were inferring behavior from static text rather than reasoning from runtime state.
Debugging is about timing, lifecycle, state, ordering, and environment — none of which live in source code.
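Here is a small, hypothetical sketch of the kind of async bug those numbers describe (the simulated fetch and the function names are illustrative, not from the study). As text, the handler looks correct; whether it misbehaves depends entirely on network timing.

```javascript
// Simulated search endpoint with variable latency (illustrative only).
const fetchResults = (query) =>
  new Promise((resolve) =>
    setTimeout(() => resolve(`results for "${query}"`), Math.random() * 200)
  );

let shown = null;

async function onSearchInput(query) {
  const results = await fetchResults(query); // network timing decides which await finishes last
  shown = results;                           // a stale response can overwrite a fresher one
}

// Two keystrokes in quick succession: sometimes the older query wins.
onSearchInput("re");
onSearchInput("react");
setTimeout(() => console.log(shown), 500);   // may log results for "re"
```

The ordering of those two responses never appears in the source. It only exists at runtime, which is exactly the information a static model never sees.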
Google Developer Research: The “runtime reconstruction tax”
Google studied how developers use AI during debugging. The majority of time was spent not fixing bugs, but reconstructing the situation for the AI:
- 15% → explaining context
- 23% → recreating what happened in DevTools
- 18% → trying AI fixes that didn’t work
- 12% → rolling back bad patches
Total:
- ~70% of debugging time was spent compensating for the AI’s lack of runtime visibility.
This is the paradox in its clearest form.
The core issue: AI has zero access to runtime context
When debugging, here’s what developers use:
- actual variable values
- DOM snapshots
- event sequences
- network timing
- component state
- the moment the UI broke
AI assistants see none of that unless you manually describe it.
They receive:
- a stack trace
- a file
- the error message
- whatever context you type up
They do not receive:
- the DOM
- state history
- render order
- unresolved promises
- timing edges
- user input sequence
- network conditions
Debugging fundamentally lives in runtime behavior. LLMs fundamentally operate on static text. That mismatch is the entire paradox.
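Today, closing that gap is manual work. A common workaround is to capture a small runtime snapshot at the failure point and paste it into the prompt by hand. The helper below is a hypothetical sketch of that habit, not a tool from any of the studies.

```javascript
// Hypothetical helper: serialize the runtime facts an assistant never sees.
function captureDebugSnapshot(label, values) {
  const snapshot = {
    label,
    timestamp: new Date().toISOString(),
    url: typeof location !== "undefined" ? location.href : "n/a",
    values, // actual variable values at this moment, not the source text
  };
  console.log("DEBUG SNAPSHOT", JSON.stringify(snapshot, null, 2));
  return snapshot;
}

// Usage at a suspected failure site:
// captureDebugSnapshot("before sort", { filteredLength: filtered.length });
```

It works, but it is tedious, which is precisely the "runtime reconstruction tax" Google measured.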
Addressing the problem
As part of the theORQL team, we focus on runtime-aware debugging, specifically on understanding why AI struggles without the context developers get automatically from DevTools. We're not claiming AI is inherently bad at debugging. The research shows something much simpler: AI debugs poorly because it can't see what actually happened.
A frontend developer on Dev.to recently wrote about how having access to runtime context dramatically reduced her debugging time (link to post). That kind of real-world experience aligns closely with what these research studies reveal — visibility is the difference maker.
Our goal is to explore how tools might bridge the divide between runtime context and editor context. We don’t plan to replace debugging, but to give AI the situational awareness humans already rely on.
What do you think?
- Have you seen debugging get slower with AI?
- Do you feel the “visibility gap” when using AI to fix bugs?
- What does an ideal debugging workflow look like to you?
This paradox affects nearly everyone writing frontend code. As debugging becomes increasingly entangled with AI, the community's perspective will shape what the next decade of tooling looks like.