
Todd Linnertz

Posted on • Originally published at devopsdiary.blog

AI Doesn't Fix Your Development Problems. It Accelerates Them.

I've watched the same failure pattern play out across every technology wave of my career.

Team gets a new tool that promises to change everything. Productivity numbers go up. Everyone celebrates. Six months later, they're drowning in the same late-stage rework they were drowning in before. Just more of it, arriving faster.

I saw it with CASE tools in the nineties. With offshore development in the 2000s. With Agile transformations in the 2010s. With DevOps automation in the 2020s.

AI code generation is the most powerful version of this pattern I've ever seen. And most engineering organizations are walking straight into it.

The Illusion Looks Like This

Your team adopts GitHub Copilot or a similar tool. A developer asks it to implement a user authentication module. In forty seconds, it produces three hundred lines of code, complete with error handling, tests and documentation comments.

It looks like progress. It genuinely feels like the future.

Most teams never stop to ask whether the spec for that authentication module was unambiguous.

Because if the acceptance criteria were vague, if the security requirements weren't spelled out, if the integration assumptions weren't documented, you didn't just get a module in forty seconds. You got a module built on a foundation of ambiguity in forty seconds. The rework that's coming is exactly the same size it would have been without AI, compressed into a shorter timeline, with more generated code to sort through.

This is what I mean when I say AI accelerates the appearance of progress while the underlying causes of late-stage rework remain unchanged.

The Real Source of the Problem

Late-stage rework has never been caused by slow typing.

After five companies and more failed projects than I can count, I can say this with confidence: rework happens because of process failures, not speed deficits.

The real culprits are consistent:

Ambiguous specifications that leave developers filling in the blanks with assumptions that won't survive contact with the product team.

Unstable upstream artifacts. The architecture document that's still being revised while the engineering team is implementing against it.

No separation between generation and judgment. The same person (or tool) that produces the artifact is asked to validate it. The result is rationalization, not evaluation.

Missing governance at handoff points. Work flows from planning to design to implementation with no formal freeze points and no immutable record of what was decided and when.

These process failures predate AI by decades. I saw every one of them long before anyone had a code assistant. What AI does is make them faster, and worse. When a developer could only produce two hundred lines of code per day, bad process produced two hundred lines of rework per day. When AI can produce two thousand lines of code per day, bad process produces two thousand lines of rework per day.

The throughput multiplied. The problem did not diminish.

What Most Teams Do About It

Most teams respond to this by trying to write better prompts.

That's the wrong level of the problem. Better prompts improve the quality of AI output within a session. They do nothing about the structural issues that make that output drift, conflict with upstream decisions, or fail validation three weeks later.

Some teams add code review. That helps at the implementation level, but it doesn't address the artifact chain. AI-generated architecture documents, PRDs, and design specifications have the same ambiguity problem as AI-generated code, and often create it earlier in the cycle where the blast radius is larger.

The instinct to treat AI governance as a prompt engineering problem is understandable. Prompt engineering is visible and immediate. The structural failures that cause rework aren't. They hide until you're already underwater.

What Actually Fixes It

After watching the same failure patterns repeat, and then watching them accelerate as my teams started adopting AI tooling, I concluded that the fix requires three structural changes, none of which are about prompting.

Treat AI as a generation engine, not a decision-maker. AI is extraordinarily good at producing artifacts: code, documentation, architecture drafts, test plans. It is not good at determining whether those artifacts are correct relative to upstream decisions it may not fully understand. The organizations that get this right separate generation (what AI does) from judgment (what humans and structured validators do). These are different activities and they need different infrastructure.

Freeze artifacts before downstream work begins. An architecture document that can change while engineering is implementing against it is a liability, plain and simple. Frozen artifacts create an immutable record of what was decided. When something downstream breaks, you know whether the upstream artifact shifted or whether the implementation deviated. Without freeze semantics, this is guesswork.
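Freeze semantics can be as lightweight as a content hash recorded at the moment of the freeze. This is a minimal sketch under my own assumptions (a file-based artifact and a JSON ledger; the names here are illustrative, not AIEOS internals): once the hash is on record, any later divergence between the frozen artifact and what engineering implemented against is detectable rather than debatable.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

LEDGER = Path("frozen_artifacts.json")  # hypothetical immutable record

def freeze(artifact_path: str) -> str:
    """Record an artifact's content hash so later drift is detectable."""
    digest = hashlib.sha256(Path(artifact_path).read_bytes()).hexdigest()
    ledger = json.loads(LEDGER.read_text()) if LEDGER.exists() else {}
    ledger[artifact_path] = {
        "sha256": digest,
        "frozen_at": datetime.now(timezone.utc).isoformat(),
    }
    LEDGER.write_text(json.dumps(ledger, indent=2))
    return digest

def verify(artifact_path: str) -> bool:
    """True only if the artifact still matches its frozen hash."""
    ledger = json.loads(LEDGER.read_text())
    digest = hashlib.sha256(Path(artifact_path).read_bytes()).hexdigest()
    return ledger[artifact_path]["sha256"] == digest
```

If `verify` returns false, you know the upstream artifact shifted after the freeze; if it returns true and the implementation still broke, the deviation is downstream. That is the guesswork this mechanism removes.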

Make validation produce verdicts, not suggestions. When you ask an AI to review its own output, it will find ways to explain why what it generated is reasonable. That's rationalization, not validation. Real validation produces a binary result: the artifact meets the required criteria, or it doesn't. Anything softer than that is a governance gap dressed up as a process.
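One way to make that concrete: express each acceptance criterion as a predicate, and fail the whole artifact if any predicate fails. The criteria below are hypothetical examples for an auth-module spec, not a real AIEOS validator, but the shape is the point: the result is a pass/fail verdict with a record of what failed, never a paragraph of soft suggestions.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Verdict:
    passed: bool          # binary: the artifact meets criteria or it doesn't
    failures: list[str]   # which criteria failed, for the audit record

def validate(artifact: str, criteria: dict[str, Callable[[str], bool]]) -> Verdict:
    """Run every criterion; any single failure fails the artifact."""
    failures = [name for name, check in criteria.items() if not check(artifact)]
    return Verdict(passed=not failures, failures=failures)

# Illustrative criteria for a hypothetical auth-module spec.
criteria = {
    "mentions token expiry": lambda a: "expiry" in a.lower(),
    "names a hashing algorithm": lambda a: "bcrypt" in a.lower() or "argon2" in a.lower(),
}

verdict = validate(
    "Passwords hashed with bcrypt; tokens have 15-minute expiry.", criteria
)
```

The checks themselves can be as simple or as sophisticated as you need. What matters is that the output is a verdict, so a failed artifact cannot quietly proceed to the next stage.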

At a previous company, I inherited four operational workflows where the same rework patterns were burning cycles everywhere. We didn't buy new tools or speed anything up. We restructured the handoffs and built validation into each transition point. Defect rates dropped 50 percent. Throughput improved between 35 and 57 percent across all four areas. None of that came from faster tooling. All of it came from fixing the process around the work.

These aren't novel ideas. They're the same principles that make CI/CD pipelines reliable: automated gates, immutable artifacts, clear separation of build and deploy. The insight is that they apply just as well to AI-assisted software delivery as they do to code deployment pipelines. Maybe more so.
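Wired together, the three principles form a gated handoff, the same shape as a CI/CD stage that refuses to promote a failed build. This is a deliberately simplified sketch with stand-in functions of my own invention (a real pipeline would call an actual generator and real criteria):

```python
import hashlib

def generate() -> str:
    """Stand-in for an AI generation step (hypothetical)."""
    return "Architecture v1: service boundaries, auth via OIDC, token expiry 15m."

def freeze(artifact: str) -> tuple[str, str]:
    """Pair the artifact with its content hash: the immutable record."""
    return artifact, hashlib.sha256(artifact.encode()).hexdigest()

def validate(frozen: tuple[str, str]) -> bool:
    """Binary gate: required criteria are met, or the handoff stops."""
    artifact, _ = frozen
    return "auth" in artifact.lower() and "expiry" in artifact.lower()

def handoff() -> tuple[str, str]:
    """Generate, freeze, then gate. Nothing moves downstream ungoverned."""
    frozen = freeze(generate())
    if not validate(frozen):
        raise RuntimeError("artifact failed validation gate; no downstream work")
    return frozen
```

Generation, freeze, and judgment are separate steps with a hard stop between them, which is exactly what a build-test-deploy pipeline does for code.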

[Figure: AI governance flow. The difference is structure around the generation.]

The Framework I Built

When I led GitOps adoption at my current company, the technology was the easy part. Getting architecture review board approval, building deployment standards and creating the governance structure that let teams adopt safely took ten times longer. The teams that tried to skip the governance stalled out. The ones that went through it shipped to production. That experience confirmed something I already suspected: the structure around adoption matters more than the tool being adopted.

In early 2026, I formalized these ideas into an open-source framework called AIEOS (AI-Enabled Operating System).

AIEOS structures how engineering artifacts are produced, validated and connected across the full software development lifecycle when AI is involved in generating them. It's built across 24 repositories: an 8-layer model covering the full value-delivery cycle from strategic direction through operational diagnostics, a multi-agent orchestration harness and a guided console for running governance workflows.

The design reflects a simple premise: when AI generates engineering artifacts, the quality of the output depends on the quality of the structure around it. Better prompts help. Better governance infrastructure is what makes the results repeatable, auditable and trustworthy at scale.

The repositories are at github.com/wtlinnertz. They're open source, and the rest of this series will dig into how AIEOS is designed and why.

What's Coming in This Series

Over the next six posts, I'll cover:

  • The eight questions every AI-assisted engineering team must be able to answer and how they map to a governance architecture
  • The three non-negotiable rules for trustworthy AI-generated code
  • What DevOps taught me about AI governance (and why that background is an advantage)
  • Inside AIEOS: how multi-agent orchestration runs governance workflows
  • AI governance in financial services and why the compliance context changes everything

If you've been watching AI tooling arrive in your organization and wondering why the rework isn't going away, this series is for you.


Todd Linnertz is a Senior Technology Leader with deep experience in enterprise architecture and DevOps. He is the creator of AIEOS, an open-source AI governance system for software delivery teams. Find him at devopsdiary.blog and github.com/wtlinnertz.
