Accountability Debt: The Hidden Cost in AI Code Generation

#ai #softwareengineering #leadership

It's Monday morning, so being the "good" tech leader you are, you look at the pending PRs.

Branch name: codex/something.
Files changed: 200+.
Lines of new code: 40,000.
Description: "bugfix."

You haven't even peeked at the diff yet, and you already know exactly what you're looking at.

The branch name tells you. The author, a co-founder who is not technical, tells you. The bug bot, already leaving comments while you watch, tells you. You know because you've seen this pattern before. Everyone has seen this pattern before. It just usually comes from a junior engineer with a "free" weekend, not from a co-founder on a Monday.

I want to name something that's been building quietly in engineering organizations, and I don't think we have a clean term for it yet.

Accountability Debt

I've started calling it accountability debt: the gap between who gets the efficiency win from AI-generated code and who actually owns that code when something goes wrong. Those are rarely the same person.

And honestly, this problem doesn't stop at non-technical founders. Experienced engineers can create accountability debt too. Some of the worst versions of it will come from smart people moving too fast with tools that let them generate organizational complexity faster than anyone else can absorb it.

The co-founder who spent a weekend with Cursor, Codex, or maybe Claude... he got the win. He shipped something. He solved a problem, or pointed an AI at one. That efficiency was real, and it was captured by him, on Sunday afternoon, before your team knew any of this existed.

But weeks or maybe months later, the cost gets paid by the engineer on call when something breaks. Or maybe by the person trying to debug 40,000 lines of generated code with no documentation, no explanation of why any particular function is written the way it is, and no trail to follow. Or perhaps by the tech lead explaining to the next hire why this thing works the way it does, when the honest answer is: nobody knows.

Technical debt has always worked this way. Shortcuts have always transferred cost into the future. What AI changes is the scale and the speed. What used to take months of accumulated shortcuts can now happen over a single weekend. The debt can be generated faster than the organization can recognize it's accumulating.

Why this specific PR is actually three problems

I am going to be concrete here, because "this is a bad PR" is not useful to anyone.

It's unmaintainable. No documentation. No inline explanation of decisions. No context for why a function is written the way it is. When something breaks, and something will break, the engineer debugging it at 2am has no trail. They're reading code written by a model that's no longer available to explain itself, in a PR opened by someone who doesn't fully understand it, in a system that has now been fundamentally changed.

It's not reviewable. A good PR tells a story. You can read it and understand what changed and why. A 200-file, 40k-line PR is not a story. It's an archaeological dig. No reviewer can meaningfully approve this. A PR that nobody can meaningfully review is not ready to merge, regardless of whether the tests pass.

It skips the actual work. The right path is to understand what problems exist, write tickets, and let the engineers who maintain this system solve them with full context. That's slower. It's also the only way to end up with a system anyone can actually support.

The governor is gone

There's a thing that used to constrain how much a single person could accidentally introduce into a codebase: human typing speed.

A governor, in an engine, is a physical limiter. It doesn't ask the driver to slow down. It just doesn't let the engine go faster than a certain point. That is constraint as infrastructure.

Typing speed was that for software. One person working alone on a weekend had a natural ceiling on the size of the problem they could create, not because anyone told them to stop, but because their hands ran out of hours. It was an accidental governor. Nobody designed it. It just existed.

That's gone now. Someone with an AI tool and a free weekend can generate more code than a full team used to ship in a sprint. And the thing that used to keep the scope of one person's decisions bounded isn't there anymore. So the question becomes: what replaces it? The answer has to be judgment. And judgment, sadly, is a lot harder to rely on than physics.

The hard part we need to say out loud

Here's where it gets hard. What do you do when the person holding the 40k-line PR is the person who signs the checks?

Let's start with what doesn't work.

"This doesn't follow our process." Correct. Lands nowhere when the person you're saying it to owns the process.

"This needs more documentation." Gets dismissed as perfectionism.

"This should be broken into smaller PRs." Sounds like bureaucracy to someone looking at a working feature and wondering why engineers make everything complicated.

What works better is reframing the risk in business terms, not engineering terms.

The question isn't "does this code meet our standards." The question is: who owns this system after it merges? That's a support cost question. An on-call burden question. A "what does incident response look like when the person who generated this code can't explain what it does" question.

If you can help the person holding the PR see that it creates a liability that lands on the engineering team rather than on them, that's a more productive conversation than any appeal to process.

The second thing that works better: offer a concrete alternative. Not "this is wrong." "Here's what right looks like." Walk them through the ticket flow. Show them what a reviewable PR actually looks like. Give them somewhere to go that isn't just a wall.

Because there is an efficiency win and the productivity unlock is real. The goal isn't to stop people from using AI tools. The goal is to build organizations where the person capturing the win and the person carrying the cost are in conversation before the PR opens, not after something breaks.

That's a structural problem. So how do you solve a structural problem? With structure, not with individual acts of courage in code review.
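
One small piece of that structure could be an automated size gate that runs before a human is asked to review anything. What follows is a minimal sketch, not a recommendation of specific limits: the thresholds, the function names (parse_numstat, review_gate), and the has_description flag are all hypothetical. The only outside fact it relies on is the output format of `git diff --numstat`, which prints added lines, deleted lines, and path separated by tabs, with "-" in place of the counts for binary files.

```python
from dataclasses import dataclass

# Hypothetical thresholds; tune them to what your reviewers can actually absorb.
MAX_FILES = 30
MAX_ADDED_LINES = 800


@dataclass
class DiffSize:
    files: int
    added: int
    deleted: int


def parse_numstat(numstat: str) -> DiffSize:
    """Sum the output of `git diff --numstat` (added<TAB>deleted<TAB>path per line)."""
    files = added = deleted = 0
    for line in numstat.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        files += 1
        # Binary files report "-" for both counts: count the file, not the lines.
        if parts[0] != "-":
            added += int(parts[0])
        if parts[1] != "-":
            deleted += int(parts[1])
    return DiffSize(files, added, deleted)


def review_gate(size: DiffSize, has_description: bool) -> list[str]:
    """Return reasons the PR needs a conversation first; an empty list means proceed."""
    reasons = []
    if size.files > MAX_FILES:
        reasons.append(f"{size.files} files changed (limit {MAX_FILES})")
    if size.added > MAX_ADDED_LINES:
        reasons.append(f"{size.added} lines added (limit {MAX_ADDED_LINES})")
    if not has_description:
        reasons.append("description does not explain what changed and why")
    return reasons
```

In CI you would feed parse_numstat the output of `git diff --numstat` against your base branch and fail the check when review_gate returns anything. The point of the design is that the gate applies to everyone, including whoever owns the process, so the conversation happens because of a rule rather than because one reviewer decided to push back.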

What accountability debt actually costs

The cost of accountability debt shows up in measurable ways:

Longer incident resolution times, because nobody has the context to debug fast
Higher on-call burden on engineers who didn't generate the code and weren't consulted about it
Slower future development, because the codebase becomes harder to reason about
Attrition, because good engineers leave environments where they're handed systems they can't understand and told to own them

But none of that shows up on the dashboard next to "shipped a feature over the weekend." And that's the whole problem.

A note on where this is going

This isn't a temporary problem. AI tools are not going to become less capable or less accessible. The organizations that figure out how to capture the efficiency win without silently transferring the cost to the people least able to push back are going to have an advantage, in retention, in system reliability, and in the kind of engineering "culture" that can actually sustain growth.

The ones that don't are accumulating a debt that doesn't appear on any balance sheet until something breaks badly enough to make it visible.

By then, the person who captured the win is usually onto the next thing.

Prefer listening? I cover this in depth in episode 5 of Chaotic Commits, "merge conflict: when the diff is 40k lines and the author is the owner." It includes what actually happened with the PR, and what it feels like to have that conversation when the person holding it signs the checks. Find it wherever you get podcasts.