DEV Community

AI makes writing code easier. It doesn't make engineering easier.

Dimitris Kyrkos on June 19, 2026

The narrative is backwards There's a narrative going around that AI is making software engineering easier. I think it's getting the dire...
 
xulingfeng

"The gap between code that exists and software that works in context" — that line hit hard. Just wrapped up my 15th story on the same gap. The team in that one also had AI hitting 97.2% coverage, but the client had 14 external dependencies and a 24-hour CI pipeline. Turns out coverage report exists ≠ production won't blow up 😅

Dimitris Kyrkos

The 97.2% coverage with 14 external dependencies is the perfect example, because coverage measures "did the code run," not "did the code handle what production will actually throw at it." You can hit 100% coverage with tests that all mock the external dependencies, which means you've thoroughly tested your code's behavior in a world that doesn't exist. The 24-hour CI pipeline detail is telling too, because that's usually a symptom of the same problem: the team is running a massive test suite that gives them confidence numbers without reducing risk proportionally. The gap I keep seeing is that AI makes it trivially easy to generate tests that boost coverage metrics without anyone asking, "What does this test actually prove about production behavior?" Coverage went from a useful signal to a vanity metric the moment generating tests became cheaper than thinking about what to test.
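
To make the mocked-coverage point concrete, here is a minimal sketch in Python. Everything in it (`PaymentGateway`, `checkout`, the timeout scenario) is a hypothetical illustration, not taken from the story being discussed. The first test runs every line of `checkout()` against a mock, so a line-coverage report would look perfect, while the second check shows what happens when the dependency behaves the way a real network service sometimes does.

```python
from unittest import mock


class PaymentGateway:
    """Hypothetical external dependency; a real one would call a remote service."""

    def charge(self, amount_cents: int) -> dict:
        raise NotImplementedError("would call a remote service")


def checkout(gateway, amount_cents: int) -> str:
    """Charge the customer and return a status string (hypothetical code under test)."""
    response = gateway.charge(amount_cents)
    return "paid" if response["status"] == "ok" else "declined"


def test_checkout_happy_path() -> None:
    # Runs every line of checkout(), so line coverage for it reads 100%.
    gateway = mock.Mock(spec=PaymentGateway)
    gateway.charge.return_value = {"status": "ok"}
    assert checkout(gateway, 1000) == "paid"


def checkout_survives_timeout() -> bool:
    # Simulate something production can actually throw: the dependency times out.
    gateway = mock.Mock(spec=PaymentGateway)
    gateway.charge.side_effect = TimeoutError("gateway timed out")
    try:
        checkout(gateway, 1000)
    except TimeoutError:
        return False  # unhandled: the error escapes to the caller
    return True
```

The happy-path test passes and the coverage number does not move when the timeout check is added, yet only the second one says anything about how `checkout()` behaves when the dependency fails.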

xulingfeng

"Coverage turned into a vanity metric when generating tests got cheaper than thinking about what to test" — that's the whole thing in one sentence. I'm honestly tempted to repost your reply as a comment under my own article so more people see it 😂

Dimitris Kyrkos

Ha, go for it, good ideas should travel. And honestly, that framing only clicked for me because of your 97.2% coverage example; the specific number makes the absurdity concrete in a way that "coverage doesn't equal quality" never does. Drop a link to your article if you want. I'm curious to read the full story behind the 14 external dependencies and the 24-hour pipeline.

xulingfeng

(Comment deleted)
 
Dimitris Kyrkos

Just read through it. The RFP framing is what makes it land, because now the 97.2% isn't just a bad metric, it's a sales pitch that won the room while the actual product fell apart behind it. That's the part nobody wants to say out loud: half these numbers exist to close deals, not to protect production. Good stuff, gonna follow the series.

Jake Lundberg

agreed on the core...writing code was always the cheap part. but I'd push on the "doesn't make it easier" framing, because I think it's worse than that. when building was slow, the build itself was a brake on bad design. you'd get halfway in and feel the friction...this is dragging, the abstraction's wrong, time to back up. you caught the mistake before you'd fully paid for it. AI took that brake off. now you can build the wrong thing all the way and fast, and the design flaw stays invisible until it's big and load-bearing. so the part of the job that didn't get easier is now the part that's most expensive to get wrong

Kartik N V J K

The widening-gap framing is right, and it shows up most in the parts that never make it into the prompt: error paths, edge cases, and behavior under load. When generation was slow, that thinking got forced on you line by line; now it has to be a deliberate separate step that's easy to skip when the diff already looks done. The only reliable counter I've found is deciding the failure cases and the checks up front, before any code gets generated.
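
One way to picture "deciding the failure cases and the checks up front" is a small table of cases written before any implementation exists, plus a checker that any candidate implementation has to satisfy. This is only a sketch under assumed names: `FAILURE_CASES`, `parse_quantity`, `unmet_cases`, and the 10,000-character limit are all hypothetical.

```python
from typing import Callable, List, Tuple, Type

# Failure cases agreed on first: (name, raw input, exception the code must raise).
# The specific cases and the length limit are illustrative assumptions.
FAILURE_CASES: List[Tuple[str, str, Type[Exception]]] = [
    ("empty input", "", ValueError),
    ("oversized input", "x" * 10_001, ValueError),
    ("non-numeric quantity", "abc", ValueError),
]


def parse_quantity(raw: str) -> int:
    """Hypothetical implementation written (or generated) after the cases above."""
    if not raw or len(raw) > 10_000:
        raise ValueError("quantity must be between 1 and 10000 characters")
    return int(raw)  # int() raises ValueError for non-numeric text


def unmet_cases(func: Callable[[str], int]) -> List[str]:
    """Return the names of failure cases that func does not handle as specified."""
    unmet = []
    for name, raw, expected in FAILURE_CASES:
        try:
            func(raw)
        except expected:
            continue  # failed in the agreed way
        except Exception:
            pass  # failed, but with the wrong exception type
        unmet.append(name)
    return unmet
```

Because the table exists before the code does, a diff that "looks done" can still be checked against it: `unmet_cases(parse_quantity)` returns an empty list here, while a shortcut such as `lambda raw: int(raw or 0)` is reported as missing the "empty input" case.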

Agoro, Adegbenga. B

Software design and engineering principles have never been more important than they are today.

One thing worth remembering is that code generation was never the game. Developers who were exceptionally good at rapidly producing code were often referred to as “code monkeys.” They weren’t the people invited into the room to define how the system should work or how complex business problems should be solved.

The real value has always been in understanding the problem, designing the system, defining the boundaries, and making sound architectural decisions. Writing the code was simply the implementation of that thinking.

Working with AI makes this even more important. Before we let an AI agent generate code, we need to clearly define what we’re building, why we’re building it, and the constraints it needs to operate within.

Many teams that initially evaluated AI based purely on code generation speed are now discovering that without strong architectural guidance, coding standards, and quality guardrails, a significant portion of that generated code ends up needing to be rewritten.

Code generation speed is not the same as production value.

And production value is not the same as customer value.

The teams that will get the most out of AI won’t be the ones generating the most code. They’ll be the ones designing the best systems for AI to build.

Nazar Boyko

The line about slow generation forcing deliberation is the part that rings true. When every line cost effort, you were basically rubber-ducking the design as you typed it. Now that step is free, so the thinking has to move somewhere, and for most teams it just doesn't. Where I feel it most is review. Reviewing code a human wrote, you can usually guess the intent behind it. Reviewing generated code, there's no intent to read, just plausible output, so you end up reverse-engineering what it was even trying to do, which is slower than people admit. Has your team actually carved out time for that, or does review still get squeezed the way it did before?

anhmtk

This hits the nail on the head. "The gap between code that exists and software that works in context is widening" is probably the most accurate description of the current AI era.

As a non-tech founder who literally self-taught and built a web platform for AI agents entirely alongside LLMs, I live this paradox every single day. AI makes me feel like a wizard who can materialize features in minutes. But the moment real traffic hits, or when I have to reason about edge cases, rate-limiting, and structural maintenance, the "wizardry" fades, and the sheer necessity of true engineering judgment becomes glaringly obvious.

Building a product with AI has actually made me respect senior engineers and architects infinitely more. AI can write the functions, but it doesn't possess the empathy to understand user behavior, nor the historical judgment to prevent architectural decay.

The tools changed, but the ultimate bottleneck is still—and will always be—human engineering discipline. Thanks for writing this!

Elmar Chavez

There will always be a people bottleneck. Software can't possibly be accelerated overnight. Decisions always lie with an actual person orchestrating the AI. Today these decisions need higher-quality verification than ever before. This is a good reminder, thank you for this.