
The 66% Problem

Evan Lausier on January 23, 2026

I spent three hours last Tuesday chasing a bug that didn't exist. The code looked perfect. It was syntactically correct, followed best practices, ...
 
Sylwia Laskowska

Oh yes, I remember when my UX designer wrote his very first piece of Python code in ChatGPT 😄
It worked at first, but once he asked for “optimization”, something broke. In the end he asked me to take a look. I removed about 80% of the code because it was adding things he didn’t actually need, tweaked the main function a bit - and it worked.
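
It was something like this (a made-up miniature, not his actual code): the model wrapped a one-line need in layers nobody asked for.

```python
# Invented miniature of the pattern. The task: read a number from a
# config dict with a default. What the model produced (condensed):
import logging

class ConfigValueResolver:
    def __init__(self, config: dict) -> None:
        self._config = config
        self._cache: dict = {}
        self._logger = logging.getLogger(__name__)

    def resolve(self, key: str, default: int = 0) -> int:
        # Caching and logging for a dict lookup nobody asked to optimize.
        if key in self._cache:
            return self._cache[key]
        value = int(self._config.get(key, default))
        self._logger.debug("resolved %s=%s", key, value)
        self._cache[key] = value
        return value

# What was actually needed:
def get_number(config: dict, key: str, default: int = 0) -> int:
    return int(config.get(key, default))
```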
He decided I was a genius after that 😄

Pascal CESCATO

You're a genius anyway 😉 – that said, I've written code with ChatGPT, Claude, and many others… for simple cases, it's fine. But for production-ready code… hmmm, you understand. And debugging is the same. Asking it to find the error in simple code, sure – but if there's functional logic behind it that isn't spelled out… of course, it won't find it. Worse, it might rewrite the code for you with terrible self-assurance, explaining that it found the error… and your code will become an ocean of nonsense.

Shitij Bhatnagar

Agree. And after AI code generation, you can keep telling the AI about its mistakes; it will apologize politely and show some crappy fix. For me, the amount of time lost getting AI to produce better code could have been better used to train a human or to fix the issue myself.

Sylwia Laskowska

Haha 😄 definitely not a genius — unless we’re talking about a genius of chaos 😂

But yes, totally agree. For simple cases, quick prototypes, or a first pass at debugging, it can be really helpful. For larger projects with real business logic and context… well, that’s a whole different story — and a pretty risky one 😅

Pascal CESCATO

Hey! A genius of chaos is still a genius 😄

Evan Lausier • Edited

LOL I dunno @sylwia-lask .... I've read a lot of your stuff... you might be.. 😊 The strict equality one was really good.

I don't think I realized how widespread this was... I thought it was just me for a while. 😂

Sounds like we're all sharing in the fun haha

Sylwia Laskowska

Haha 😄 careful, you’re raising expectations now!

Don’t worry - I’ll balance it out soon. Tomorrow I’ll probably publish a post about how bad I am at CSS 😂

Evan Lausier

HA! That's so funny! 😂

Fred Brooker

I feel your pain

I spent 5 hours debugging nonexistent bugs until it became clear that the unit tests themselves were flawed 😂

Simply put: Gemini created bad test cases and could not solve the problem afterwards - something like the sketch below.
😂 😂 🍸 💩
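
A made-up miniature of that failure mode (not the actual code): the implementation is correct, and the generated test is the real bug.

```python
# Hypothetical sketch: the function is correct, but the generated test
# encodes a wrong expectation, so "debugging" chases a bug that
# does not exist.

def dedupe(items: list[int]) -> list[int]:
    """Remove duplicates while preserving first-seen order."""
    seen: set[int] = set()
    return [x for x in items if not (x in seen or seen.add(x))]

def test_dedupe() -> None:
    # Flawed expectation: the test assumes the output is also sorted.
    # The function is fine; this assertion is the actual bug.
    assert dedupe([3, 1, 3, 2]) == [1, 2, 3]  # correct output: [3, 1, 2]
```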

Evan Lausier

Oh my!! 😂😂 That is like the AI version of "The Good Idea Fairy" 😂

Web Developer Hyper

I always check AI outputs carefully and ask follow-up questions about the code. Sometimes I also go back to the official documentation to verify whether the AI’s output is really correct. Even so, I still miss bugs that I didn’t anticipate. 😭
However, AI is improving very quickly and getting better day by day. So I believe that as my skills improve and AI coding improves as well, the number of bugs will decrease in the future.

Evan Lausier

It really is. I find myself using it more and more for quick analysis when I'm strapped for time. More often than not it points me in the right direction, but it's not quite there on detailed root causes.

Richard Pascoe

Thank you, Evan. A truly thought-provoking post. I've often wondered about the productivity claims surrounding AI, as expressed in a recent discussion post - AI Productivity Gains? - but your words put the reality of the situation into sharp focus, and I really appreciate that.

KC

"Microsoft Research published a study earlier this year that quantified this. They tested nine different AI models on SWE-bench Lite, a benchmark of 300 real-world debugging tasks."

@evanlausier could you share a link or reading reference for that research study? I'm interested in which factors they based the study on.

"Some skills don't need to be automated. They need to be sharpened."

Agree with this. Moreover, we need to set up metrics for how those skills can be improved. It would be better if we could set up a benchmark for the specific skills to match the market's expectations.

Evan Lausier • Edited

That is going to be the really challenging part: how do we set the benchmarks? But yes, the blog is below. A colleague came across it and shared it with me after a discussion on leveraging the technology at work.

microsoft.com/en-us/research/blog/...

Marina Eremina

It’s great to see these Stack Overflow surveys, right? I also find it reassuring when my own opinion about a tool aligns with what a large part of the community thinks. It looks like many of us try it out and reach similar conclusions :)

Ben Santora

Your comment made me realize that I haven't been to the Stack Overflow site in years. I need to visit it again.

Evan Lausier

LOL right?

Evan Lausier

I know right!? I thought I was the only one too!!

Shitij Bhatnagar • Edited

I fully agree with the intent of the article and its findings, especially 'debugging isn't pattern completion. Debugging is hypothesis testing'. Debugging isn't discussed much, but every engineer knows it can be very frustrating at times, even with code you wrote or reviewed/approved yourself. Debugging skills are 'non-negotiable', and they can be a differentiator as well.
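
To make that concrete, here's a toy sketch of my own (not from the article): a hypothesis about the failure becomes an executable check that can confirm or refute it, instead of a pattern-matched "fix".

```python
# Toy sketch (my illustration): debugging as hypothesis testing.

def average(values: list[float]) -> float:
    return sum(values) / len(values)

# Hypothesis: the crash we keep seeing happens because an empty list
# reaches average(). Reproduce the suspected condition and let the
# outcome decide, instead of blindly wrapping things in try/except.
def test_hypothesis_empty_input() -> None:
    try:
        average([])
    except ZeroDivisionError:
        print("hypothesis confirmed: empty input triggers the crash")
    else:
        print("hypothesis refuted: look elsewhere")

if __name__ == "__main__":
    test_hypothesis_empty_input()
```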

The whole narrative around AI code generation is broken because, at the end of the day, if I were a production manager I would never trust AI-generated code. When something breaks, a bot will not fix it; I need a person, and that person needs to be confident in the code itself. That link is missing.

AI code can be unverified assistance at most, not the main actor. I have also noticed how context-free the remarks from AI code review bots can be on MRs (take the online git management tools), and that's because they are just following rules fed to them, not because they found a real bug in your code. I feel FindBugs and PMD were more predictable than these AI code review bots.

Thanks for the article.

Evan Lausier

Right? Thank you so much!! I'm really glad it resonated. It probably helped the article that I spent half the morning debugging something one of my junior resources got from an AI code tool 😊

Alois Sečkár

"almost right, but not quite" - every single time I try to generate an AI image...

Hariprasad

I have been struggling with this problem for months now. One of our junior developers wrote 6,000+ lines of testing code in a single PR. While reviewing it, I wondered whether he had actually read it all before sending it to me for review. In a highly productive month I might write 1,000 to 2,000 lines max, so seeing such a number in a single PR is insane.

Evan Lausier

Omg, yeah that's crazy!

Vasu Ghanta

Nailed the "66% problem"—AI's polite, lurking bugs are stealthier than honest failures; treat it like overconfident juniors and double-test to sharpen those irreplaceable debugging instincts!

Evan Lausier

Love it! "overconfident juniors" 😂

Vadim

The dopamine hit of instant completion masks the debugging debt accumulating behind us.

Well said, I feel the same

Derek Cheng

LLMs are great at writing code, but definitely need to be supervised. There's a ton that needs to be built in tooling and workflow to make that process super efficient.

Evan Lausier

oh, 100%

leob

So if you need to meticulously code-review every line, are you still more efficient, I wonder? :-)

Evan Lausier

That might be a "depends on who's asking" kind of response 😂😂