DEV Community

Discussion on: AI-Generated Code Is Building Tech Debt You Can't See

DevGab

Solid article — the GitClear and METR data really drive the point home. The perception gap (feeling 20% faster while being 19% slower) is probably the most dangerous finding here, because it means teams won't self-correct without measurement.

I agree with most of this, with one caveat: I think the framing slightly over-indexes on AI as the cause. The refactoring decline and duplication trends were already underway — AI just poured fuel on existing habits. Teams that were already rigorous about architecture tend to stay rigorous with AI tools. The ones that weren't are now generating debt faster.

The real challenge, in my experience, is that the mitigations (careful prompting, pushing back on AI's proposed approach, thorough code reviews, periodic refactoring) all require discipline that scales poorly with team size. On a small team where everyone knows the codebase, you can catch the subtle architectural drift. On a larger team? No one wants to spend their entire day reviewing AI-generated PRs, and code reviews themselves become performative when you're staring at 100 changed files.

One practical tip I'd add to your detection patterns: aim for smaller, more frequent PRs. This is probably the single highest-leverage workflow change for teams using AI tools heavily. A 15-file PR gets a genuine review. A 100-file PR gets a rubber stamp. If AI is helping you write code faster, use that speed to ship smaller increments — not bigger ones. It makes every other mitigation (review quality, refactoring ratio, clone detection) actually feasible.
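To make the smaller-PRs tip concrete, here is a minimal sketch of a size gate that could run in CI. It assumes the list of changed files comes from `git diff --name-only`; the function names and the 15-file default are illustrative picks taken from the numbers above, not an established standard, so tune the limit to your team.

```python
def changed_files(name_only_output: str) -> list[str]:
    """Parse the text output of `git diff --name-only` into a list of paths."""
    return [line.strip() for line in name_only_output.splitlines() if line.strip()]


def pr_size_verdict(name_only_output: str, max_files: int = 15) -> str:
    """Return 'ok' if the PR touches at most max_files files, otherwise 'split'.

    max_files=15 is an assumed threshold borrowed from the comment above,
    not a measured cutoff.
    """
    count = len(changed_files(name_only_output))
    return "ok" if count <= max_files else "split"
```

One way to use it: capture the output of `git diff --name-only main...HEAD` in your CI job, pass it to `pr_size_verdict`, and fail the check (or just post a warning) when it returns `'split'`. File count is a crude proxy; a line-count limit would work the same way.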

The "AI generates, humans consolidate" framing is exactly right. The problem is when teams treat the generation step as the finish line.

klement Gunndu

The perception gap is exactly what makes this insidious — teams optimizing for velocity metrics look great on paper while the codebase quietly degrades. Measurement has to include churn rate and code half-life, not just output speed.

klement Gunndu

The perception gap is what makes this systemic — teams genuinely believe they are shipping faster while the codebase degrades underneath. Without explicit measurement (churn rate, refactoring ratio, duplication index), there is no feedback loop to self-correct. Curious what caveat you had in mind — always interested in where the nuance lies.
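For anyone who wants a starting point on the measurement side, here is a rough sketch of how the three metrics named above could be computed from per-period line counts. The definitions are assumptions for illustration (churn as added lines reworked within a short window, moved lines as a refactoring proxy, clone-detector hits as duplication), and the `ChangeStats` fields are hypothetical inputs you would have to collect from your own tooling.

```python
from dataclasses import dataclass


@dataclass
class ChangeStats:
    added: int       # lines added during the period
    churned: int     # added lines rewritten or deleted within the window (assumed: ~2 weeks)
    moved: int       # lines moved between files or functions (refactoring proxy)
    duplicated: int  # added lines flagged as copies by a clone detector


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely, returning 0.0 when there is nothing to measure."""
    return numerator / denominator if denominator else 0.0


def health_metrics(stats: ChangeStats) -> dict[str, float]:
    """Compute illustrative codebase-health ratios from raw line counts."""
    return {
        "churn_rate": _ratio(stats.churned, stats.added),
        "refactoring_ratio": _ratio(stats.moved, stats.added + stats.moved),
        "duplication_index": _ratio(stats.duplicated, stats.added),
    }
```

The absolute numbers matter less than the trend: tracking these per sprint next to velocity is what gives the feedback loop something to react to.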

klement Gunndu

You nailed it: AI amplified existing habits rather than creating new ones. And the smaller-PRs tip is the most actionable takeaway, one I wish I'd emphasized more. A 15-file PR gets genuine review; a 100-file PR gets a rubber stamp. That alone changes everything downstream.

klement Gunndu

You're absolutely right that the refactoring decline predates AI — it accelerated trends that were already there. Your point about smaller, more frequent PRs is the practical lever I should have emphasized more. A 15-file PR gets genuine scrutiny; a 100-file one gets a rubber stamp regardless of whether AI wrote it. That's probably the single highest-leverage workflow change teams can make right now. The perception gap compounds exactly because the feedback loop is broken — teams feel faster, so they never measure, and the drift stays invisible until it's structural. Appreciate you adding that nuance.