Nuclear submarine crews operate reactors they can't fully comprehend in real time. Microprocessor engineers design chips with billions of transistors no single person understands. Air traffic controllers manage thousands of aircraft simultaneously. Medical specialists treat patients whose full condition exceeds any one doctor's knowledge. Legal systems maintain millions of laws no lawyer has read entirely. Google engineers work in a monorepo of 2 billion lines nobody holds in their head.
Each domain faced the same crisis software development is facing now: the system became too complex for human comprehension, and the consequences of misunderstanding became catastrophic. Each domain solved it. Each arrived at the same three-layer structure independently.
Software development is the last domain to hit this wall — because AI-assisted code generation is the forcing function that made the complexity exceed human comprehension in months rather than decades. The solution already exists. It's been proven across six domains. The only thing missing is the correct vocabulary to carry it over.
That vocabulary requires separating three debts that the industry currently conflates.
Three debts, three locations
Technical debt:
Location: The codebase
What it is: Code that works but is structured poorly —
shortcuts, missing abstractions, duplicated logic
Who named it: Ward Cunningham (1992)
Key property: DELIBERATE — the team chose the shortcut
knowingly, intending to pay it back later
Cognitive debt:
Location: The developer's mind
What it is: Loss of comprehension — the developer cannot
reason about the system, cannot predict its behavior,
cannot modify it safely
Who named it: Margaret-Anne Storey, building on Peter Naur's
"Programming as Theory Building" (1985)
Key property: The debt lives in PEOPLE, not in code.
The code can be well-structured. The developer still
doesn't understand it.
Intent debt:
Location: The system's enforcement layer
What it is: The gap between what the system SHOULD preserve
and what is actually expressed as an executable
constraint that the system CAN enforce
Who named it: Margaret-Anne Storey — formalized in "From Technical
Debt to Cognitive and Intent Debt: Rethinking Software
Health in the Age of AI" (arXiv 2603.22106, March 2026).
Storey defines intent debt as the absence of externalized
rationale that developers and AI agents need to work
safely with code.
Key property: The debt lives in the ABSENCE of a mechanism.
The intent exists (someone decided something)
but the system has no way to enforce it.
Note: Storey's definition emphasizes externalized KNOWLEDGE
broadly. This article narrows the focus to externalized
EXECUTABLE CONSTRAINTS specifically — the subset of intent
that can be mechanically enforced. The distinction matters:
an ADR that explains WHY is externalized knowledge (reduces
intent debt in Storey's framing). A CI check that ENFORCES
the why is an executable constraint (reduces intent debt in
this article's framing). Both are necessary. The executable
constraint is what makes the intent durable at machine speed.
These are independent. You can have any combination of the three. A system can have zero technical debt (well-structured code), high cognitive debt (nobody understands it), and high intent debt (no constraints are expressed). Or well-structured code, full comprehension, but no mechanical enforcement. Each combination produces different failures and requires different solutions.
Why the distinction matters
When the terms are conflated, the solutions are wrong:
If you think cognitive debt IS intent debt:
You try to fix comprehension by writing specifications.
But specifications don't make the developer understand the code.
They make the SYSTEM enforce constraints. The developer's mental
model is still missing. The specification gate catches violations —
but the developer can't debug, modify, or extend the code because
they never built the theory of how it works.
If you think intent debt IS cognitive debt:
You try to fix enforcement by improving developer understanding.
Pair programming, code reviews, documentation. The developer
understands the system — but the understanding lives in their head.
When they leave, the intent leaves with them. When AI generates
code at 3 AM, no human is reviewing. The understanding exists.
The enforcement doesn't.
If you think either IS technical debt:
You try to fix both by refactoring code.
But the code might already be well-structured. Cognitive debt doesn't live
in the code — it lives in the developer's mind. Intent debt doesn't
live in the code — it lives in the absence of a constraint.
Refactoring addresses neither.
Each debt has a different solution because each lives in a different place.
The Parnas root
David Parnas's 1972 paper "On the Criteria To Be Used in Decomposing Systems into Modules" is the common root of both cognitive debt and intent debt — though Parnas didn't use either term.
His insight: humans can't hold entire systems in their heads. His solution: information hiding. Each module exposes an interface (what it does) and hides its implementation (how it does it).
The module interface is a single artifact that addresses BOTH debts simultaneously:
As a comprehension mechanism (addresses cognitive debt): the developer only needs to understand what the module promises, not how it works internally. The interface is the boundary of comprehension. The implementation behind the boundary is irrelevant to anyone outside the module. This is how Boeing engineers work on a 787 — each engineer understands their subsystem's interface contracts, not the entire aircraft.
As an intent mechanism (addresses intent debt): the interface IS the declaration of what must be true at this boundary. The function signature, the API contract, the type constraint — each is an expressed intent. The system can enforce it. The compiler checks type signatures. The contract test checks API conformance. The database engine checks schema constraints.
Parnas didn't need to distinguish cognitive debt from intent debt because in 1972, one act produced both: the developer DESIGNED the module interface, which simultaneously built their mental model (comprehension) and declared the contract (intent). The act of writing code was the mechanism that produced both outputs.
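A minimal sketch of one interface doing both jobs, assuming a hypothetical `RateLimiter` module (all names here are illustrative and not taken from any real library). The `Protocol` is the only part a caller has to read, which is the comprehension side; the `conforms` check is one way the same declaration could be enforced, which is the intent side.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class RateLimiter(Protocol):
    """Hypothetical module interface: the promise, not the implementation."""

    def allow(self, client_id: str) -> bool:
        """Return True if the client may make another request."""
        ...


class FixedBudgetLimiter:
    """One possible implementation; callers never need to read this body."""

    def __init__(self, budget: int) -> None:
        self._remaining: dict[str, int] = {}
        self._budget = budget

    def allow(self, client_id: str) -> bool:
        left = self._remaining.get(client_id, self._budget)
        if left <= 0:
            return False
        self._remaining[client_id] = left - 1
        return True


class BrokenLimiter:
    """Lacks allow(), so it does not satisfy the interface."""

    def permit(self, client_id: str) -> bool:
        return True


def conforms(candidate: object) -> bool:
    # isinstance() on a runtime_checkable Protocol checks that the
    # required methods exist; it does not check their signatures.
    return isinstance(candidate, RateLimiter)
```

The runtime check is deliberately shallow. A static type checker would also compare method signatures, which is closer to the compiler-enforced contract the paragraph describes.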
What AI broke
AI-assisted development decoupled comprehension from intent expression. The Su-Field model from TRIZ makes the decoupling visible.
Before AI: one field produces two useful outputs
Su-Field Model — Manual Development
            F (writing code)
               /        \
              /          \
             /            \
           S1              S2
        Developer       Codebase
Writing code is the FIELD that connects the developer
to the codebase. This single field produces TWO outputs:
Output 1: Comprehension (cognitive debt ↓)
The developer builds a mental model WHILE writing.
Typing, debugging, rewriting — each forces internalization
of the logic, the edge cases, the constraints.
Output 2: Expressed intent (intent debt ↓)
The code itself captures decisions. The function signature
IS the interface contract. The module boundary IS the
architectural intent. The test IS the specification.
One field. Two outputs. Both debts managed by a single act.
The system is COMPLETE — no missing interactions.
In TRIZ terms, this is a complete Su-Field system: two substances (developer, codebase) connected by one field (writing code) that produces the desired effects (comprehension + expressed intent). The system works because the field simultaneously acts on both the developer (building understanding) and the codebase (encoding intent).
With AI: the field changes, both outputs disappear
Su-Field Model — AI-Assisted Development
            F' (prompting)
               /        \
              /          \
             /            \
           S1              S3
        Developer       AI Agent
                           |
                           | generates code
                           ↓
                           S2
                        Codebase
The field changed from WRITING CODE to PROMPTING.
A new substance (S3: AI agent) was inserted between
the developer and the codebase.
Output 1: Comprehension → LOST
The developer prompts. The AI writes. The developer
never types the code, never debugs the logic, never
struggles with the edge cases. The mental model is
never formed. Cognitive debt accumulates.
Output 2: Expressed intent → LOST
The AI generates code from patterns, not from the
developer's architectural decisions. No explicit
interface contract was declared. No module boundary
was designed. The AI's output captures PATTERNS
from training data, not INTENT from the developer.
Intent debt accumulates.
The field (prompting) doesn't produce either output.
The system is INCOMPLETE — both useful effects are missing.
In TRIZ terms, the field substitution (writing → prompting) eliminated both useful outputs. The system went from complete (one field, two outputs) to incomplete (one field, zero outputs). The inserted substance (AI agent) produces a new output (code, faster) but doesn't produce the two outputs the old field provided (comprehension, expressed intent).
The TRIZ resolution: add fields, don't restore the old one
TRIZ says: don't try to restore the original field (that would mean going back to manual coding). Instead, add new fields that independently produce the missing outputs.
Su-Field Model — Resolved System
Read top to bottom. The AI generates code into the codebase.
Two additional fields act on the same codebase independently.
              S1 Developer
                   |
                   | F' (prompting)
                   ↓
              S3 AI Agent
                   |
                   | generates code
                   ↓
              S2 Codebase
                /       \
               /         \
              /           \
             /             \
            ↓               ↓
           F2               F3
   Parnas boundaries   Mechanical enforcement
           |                |
           ↓                ↓
     S1 Developer      S4 Specification gate
     COMPREHENDS       ENFORCES intent
     at interface      on every change
     level             mechanically
     (cognitive        (intent
     debt ↓)           debt ↓)
F' produces code (fast, AI-generated)
F2 produces comprehension (developer understands interfaces,
not the AI-generated implementation behind them)
F3 produces enforced intent (the gate checks constraints
regardless of who generated the code)
Three fields. One codebase. Two restored outputs.
The system is COMPLETE again.
The key: F' (prompting) still works — you keep the speed gains from AI generation. F2 and F3 are ADDED alongside it, not instead of it. The developer doesn't go back to writing all the code. They understand the system through its interfaces (F2) and the system enforces their intent through mechanical checks (F3). The AI generates freely within the boundaries that F2 and F3 maintain.
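As a rough illustration of F3, here is a sketch of a specification gate, assuming a hypothetical `Change` summary and two made-up constraints. The point it tries to show is that the checks never look at who produced the change; the `author` field is carried along but no predicate reads it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Change:
    """Hypothetical summary of a proposed change."""
    author: str                        # "human" or "ai-agent"; unused by the checks
    touched_modules: frozenset[str]
    public_api_changed: bool
    has_contract_tests: bool


Check = Callable[[Change], bool]

# Each entry is (constraint name, predicate that must hold).
# Both constraints are invented for illustration.
CHECKS: list[tuple[str, Check]] = [
    ("api changes need contract tests",
     lambda c: (not c.public_api_changed) or c.has_contract_tests),
    ("billing is not edited together with auth",
     lambda c: not {"billing", "auth"} <= c.touched_modules),
]


def gate(change: Change) -> list[str]:
    """Return the names of violated constraints; an empty list means pass."""
    return [name for name, holds in CHECKS if not holds(change)]
```

In a real pipeline the `Change` would presumably be derived from a diff, and a non-empty result would block the merge.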
The same Su-Field structure across all six domains
The reason the solution carries over from nuclear submarines to software is not analogy. It's structural identity. The Su-Field model of the problem is the SAME in every domain. When the problem structure is the same, the solution structure is the same.
Every domain that solved this had the same Su-Field problem:
One field (F) previously produced two outputs:
Output 1: Human comprehension of the system
Output 2: Expressed constraints the system enforces
The field was replaced or overwhelmed:
Nuclear: Reactor dynamics exceed operator comprehension speed
Chips: Transistor count exceeds designer comprehension capacity
Aviation: Aircraft speed exceeds pilot reaction time
Medicine: Patient complexity exceeds single-doctor knowledge
Law: Legal corpus exceeds single-lawyer memory
Google: Codebase size exceeds single-engineer understanding
Software/AI: Code generation speed exceeds developer comprehension
Both outputs were lost:
Comprehension → operators/designers/pilots/doctors can't hold
the full system in their heads anymore
Enforcement → the intent isn't mechanically checked because
it relied on the human who could no longer keep up
Every domain added the same two fields:
F2: Comprehension through bounded interfaces
Nuclear: Operator understands PROCEDURES, not reactor physics
Chips: Designer understands INTERFACE SPECS, not transistors
Aviation: Pilot understands FLIGHT PLAN, not aerodynamics
Medicine: Specialist understands THEIR DOMAIN, not all medicine
Law: Lawyer understands THEIR JURISDICTION, not all law
Google: Engineer understands THEIR MODULE'S API, not 2B lines
Software: Developer understands PARNAS BOUNDARIES, not AI code
F3: Enforcement through mechanical gates
Nuclear: Physical interlocks enforce parameter limits
Chips: Formal verification tools check every interface
Aviation: Fly-by-wire enforces flight envelope
Medicine: Lab equipment produces deterministic diagnostics
Law: Statutory databases flag conflicts automatically
Google: CI gates block contract violations on every commit
Software: Specification gate checks constraints on every change
Six established domains, plus software as the seventh. Same problem structure. Same solution structure. The solution didn't transfer by metaphor. It transferred because the Su-Field model is structurally identical: one field replaced → two outputs lost → two fields added to restore each output independently.
This is why the three-layer model works for software: it's not an adaptation of what nuclear submarines do. It IS what nuclear submarines do — the same architectural resolution of the same structural problem, expressed in software-native mechanisms (Parnas boundaries instead of operating procedures, CI gates instead of physical interlocks, ADRs instead of basis documents).
The three-layer model IS this resolved Su-Field system:
Layer 1 (Parnas boundaries) = F2 — restores comprehension
Layer 2 (Mechanical enforcement) = F3 — restores expressed intent
Layer 3 (Rationale preservation) = connects F2 and F3 —
the WHY links the boundary
to the constraint so both
persist together
The act that produced both outputs was replaced. The outputs must now be produced independently. That's why cognitive debt and intent debt require separate solutions — they were coupled through a field that no longer exists.
The eight states: every combination of three debts
Each debt is either HIGH or LOW. Three independent debts produce eight possible states. Each state has a different consequence and a different fix.
┌────┬──────────┬───────────┬────────┬─────────────────────────────────────────┐
│ #  │Technical │ Cognitive │ Intent │ Consequence                             │
│    │   Debt   │   Debt    │  Debt  │                                         │
├────┼──────────┼───────────┼────────┼─────────────────────────────────────────┤
│ 1  │   LOW    │    LOW    │  LOW   │ BEST CASE. Code is well-structured.     │
│    │          │           │        │ Team understands the system. Intent     │
│    │          │           │        │ is enforced mechanically. Safe to       │
│    │          │           │        │ change, safe to scale, safe to hand     │
│    │          │           │        │ to AI agents.                           │
├────┼──────────┼───────────┼────────┼─────────────────────────────────────────┤
│ 2  │   LOW    │    LOW    │  HIGH  │ FRAGILE. Code is well-structured.       │
│    │          │           │        │ Team understands it. But intent lives   │
│    │          │           │        │ in people's heads, not in constraints.  │
│    │          │           │        │ One departure or one AI agent that      │
│    │          │           │        │ doesn't know the rules → violations.    │
│    │          │           │        │ FIX: Add enforcement to existing specs. │
├────┼──────────┼───────────┼────────┼─────────────────────────────────────────┤
│ 3  │   LOW    │   HIGH    │  LOW   │ GOVERNED DESPITE IGNORANCE. Code is     │
│    │          │           │        │ well-structured. Nobody understands it. │
│    │          │           │        │ But constraints catch violations        │
│    │          │           │        │ mechanically. Safe to run, hard to      │
│    │          │           │        │ modify. The AI-era state: the gate      │
│    │          │           │        │ compensates for missing comprehension.  │
│    │          │           │        │ FIX: Invest in Parnas boundaries and    │
│    │          │           │        │ rationale (Layers 1 + 3).               │
├────┼──────────┼───────────┼────────┼─────────────────────────────────────────┤
│ 4  │   LOW    │   HIGH    │  HIGH  │ TIME BOMB. Code is well-structured      │
│    │          │           │        │ but nobody understands it and nothing   │
│    │          │           │        │ enforces the intent. Working by         │
│    │          │           │        │ accident. Any change can break it.      │
│    │          │           │        │ Incidents are undiagnosable.            │
│    │          │           │        │ FIX: Add enforcement first (Layer 2),   │
│    │          │           │        │ then boundaries (Layer 1).              │
├────┼──────────┼───────────┼────────┼─────────────────────────────────────────┤
│ 5  │   HIGH   │    LOW    │  LOW   │ MANAGEABLE MESS. Code has shortcuts     │
│    │          │           │        │ and duplication. But the team           │
│    │          │           │        │ understands it and constraints enforce  │
│    │          │           │        │ safety. The team can refactor safely    │
│    │          │           │        │ because they know what to change and    │
│    │          │           │        │ the gate verifies they didn't break     │
│    │          │           │        │ anything.                               │
│    │          │           │        │ FIX: Refactor the code (standard).      │
├────┼──────────┼───────────┼────────┼─────────────────────────────────────────┤
│ 6  │   HIGH   │    LOW    │  HIGH  │ LEGACY SYSTEM. Code has shortcuts.      │
│    │          │           │        │ Team understands it (the one person     │
│    │          │           │        │ who's been here 10 years). No           │
│    │          │           │        │ mechanical enforcement. Everything      │
│    │          │           │        │ depends on that person. Classic legacy  │
│    │          │           │        │ pattern.                                │
│    │          │           │        │ FIX: Capture the person's knowledge as  │
│    │          │           │        │ constraints before they leave.          │
├────┼──────────┼───────────┼────────┼─────────────────────────────────────────┤
│ 7  │   HIGH   │   HIGH    │  LOW   │ SAFE BUT FROZEN. Code has shortcuts.    │
│    │          │           │        │ Nobody understands it. But constraints  │
│    │          │           │        │ enforce safety. The system runs but     │
│    │          │           │        │ can't evolve: any refactoring attempt   │
│    │          │           │        │ is blocked by incomprehension.          │
│    │          │           │        │ FIX: Invest in comprehension (Parnas    │
│    │          │           │        │ boundaries + rationale) to enable       │
│    │          │           │        │ safe refactoring.                       │
├────┼──────────┼───────────┼────────┼─────────────────────────────────────────┤
│ 8  │   HIGH   │   HIGH    │  HIGH  │ WORST CASE. Code has shortcuts.         │
│    │          │           │        │ Nobody understands it. Nothing          │
│    │          │           │        │ enforces the intent. The system is      │
│    │          │           │        │ unmaintainable, unsafe, and cannot be   │
│    │          │           │        │ modified without risk of outage.        │
│    │          │           │        │ This is Month 12 of the Fallacies       │
│    │          │           │        │ timeline without intervention.          │
│    │          │           │        │ FIX: Triage. Add enforcement for the    │
│    │          │           │        │ highest-risk properties first.          │
└────┴──────────┴───────────┴────────┴─────────────────────────────────────────┘
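The table can be read as the Cartesian product of three binary axes. A small sketch, assuming each debt is simply labelled "LOW" or "HIGH" (the function names are illustrative), that reproduces the row numbering and labels:

```python
from itertools import product

# Labels copied from the table, keyed by (technical, cognitive, intent).
LABELS = {
    ("LOW", "LOW", "LOW"): "BEST CASE",
    ("LOW", "LOW", "HIGH"): "FRAGILE",
    ("LOW", "HIGH", "LOW"): "GOVERNED DESPITE IGNORANCE",
    ("LOW", "HIGH", "HIGH"): "TIME BOMB",
    ("HIGH", "LOW", "LOW"): "MANAGEABLE MESS",
    ("HIGH", "LOW", "HIGH"): "LEGACY SYSTEM",
    ("HIGH", "HIGH", "LOW"): "SAFE BUT FROZEN",
    ("HIGH", "HIGH", "HIGH"): "WORST CASE",
}

# product() yields the combinations in the same order as the table rows.
STATES = list(product(("LOW", "HIGH"), repeat=3))


def state_number(technical: str, cognitive: str, intent: str) -> int:
    """Return the 1-based row number of a debt combination."""
    return STATES.index((technical, cognitive, intent)) + 1


def diagnose(technical: str, cognitive: str, intent: str) -> str:
    """Format a combination as 'State N: LABEL'."""
    key = (technical, cognitive, intent)
    return f"State {state_number(*key)}: {LABELS[key]}"
```

For example, `diagnose("LOW", "HIGH", "HIGH")` gives the TIME BOMB row, state 4.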
Most teams building with AI are in State 4 (TIME BOMB) or moving toward State 8 (WORST CASE). The code is well-structured because the AI generates conventional-looking code. But nobody understands it (cognitive debt HIGH — the developer never wrote it) and nothing enforces the intent (intent debt HIGH — no constraints were declared).
The fastest path from State 4 to State 3 (GOVERNED DESPITE IGNORANCE): add mechanical enforcement (Layer 2). This doesn't restore comprehension — the developer still doesn't understand the AI-generated code. But the gate catches violations regardless. The system is safe to run even when it's hard to modify. Layer 1 (Parnas boundaries) and Layer 3 (rationale) can follow — they make the system modifiable, not just safe.
The fastest path from State 8 to safety: triage. Identify the three highest-risk properties. Add enforcement for those three. You're not in State 1 (best case). You're in a partial State 7 (safe but frozen for the governed properties). That's dramatically better than State 8. Expand from there.
Why the current mitigations don't resolve the debts
The emerging responses to cognitive debt in AI-assisted development — consolidated by Storey from practitioners including Simon Willison, Martin Fowler, Steve Yegge, and discussions across Hacker News and LinkedIn — cluster around five practices:
- More rigorous code review
- Writing tests that capture intent
- Updating design documents continuously
- Treating prototypes as disposable
- Using AI to support cognitive tracking
Each practice is reasonable. None resolves the structural problem. Each is a braking mechanism — it manages the debt by slowing down, not by changing where the debt accumulates.
More rigorous review is human-speed enforcement. It addresses cognitive debt (the reviewer builds understanding while reviewing) and partially addresses intent debt (the reviewer catches violations of unwritten constraints). But it doesn't scale to AI-speed generation. When the AI produces 10x more changes, the reviewer becomes the bottleneck — or cuts corners, which means the debt isn't being managed at all.
Tests that capture intent behave like an L2 cache: they catch violations only for the cases someone thought to test. The property that wasn't tested is the property that gets violated. Tests address intent debt for KNOWN intents. They don't address the intents nobody wrote tests for.
Updating design documents is rationale preservation — but only if the documents are connected to enforcement. A design document that says "all external calls must have timeouts" doesn't prevent an AI agent from generating a call without a timeout. The document exists. The enforcement doesn't. This is intent debt: the intent is expressed (in a document) but not executable (no CI check verifies it).
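To make the timeout example concrete, here is a sketch of what an executable version of that rule might look like, using Python's standard `ast` module. It assumes calls are written as `something.get(...)` or `something.post(...)` and matches only on the method name, so it is a heuristic rather than a complete analysis; `HTTP_CALLS` and `SAMPLE` are illustrative.

```python
import ast


def calls_missing_timeout(source: str, functions: frozenset[str]) -> list[int]:
    """Return line numbers of calls to the named methods that pass no timeout=.

    Assumption: calls look like `module.get(...)`; only the attribute name
    is matched, so unrelated methods with the same name are flagged too.
    """
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and func.attr in functions):
            continue
        if not any(kw.arg == "timeout" for kw in node.keywords):
            missing.append(node.lineno)
    return sorted(missing)


# Sample input is parsed as text only; nothing here is imported or executed.
SAMPLE = '''
import requests
a = requests.get("https://example.com", timeout=5)
b = requests.get("https://example.com")
c = requests.post("https://example.com", json={})
'''

HTTP_CALLS = frozenset({"get", "post", "put", "delete"})
```

Run over `SAMPLE`, the check reports the two calls without a `timeout=` argument. A production rule would need to resolve which objects are actually HTTP clients, since `dict.get` would otherwise be flagged as well.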
Treating prototypes as disposable is correct but narrow. It addresses one source of cognitive debt (prototype code that was never meant to be understood). It doesn't address production code that was AI-generated and never understood by anyone.
Using AI to support cognitive tracking introduces the failure mode from Fallacy #3: the AI that tracks cognitive state has the same probabilistic limitations as the AI that generated the code. The tracker can miss the same patterns the generator introduced.
Where the theory lives matters
Storey correctly identifies that the "theory of the system" is distributed across people, documentation, tests, conversations, tooling, and agents. This is a precise observation. The next step is recognizing that some of these locations are FRAGILE and some are DURABLE:
FRAGILE locations (theory is lost when conditions change):
People → leave the company, change roles, forget
Conversations → ephemeral, unrecorded, unreproducible
AI agents → stateless, no memory across sessions
DURABLE locations (theory persists independently of conditions):
Type signatures → enforced by the compiler on every build
API contracts → enforced by contract tests on every change
Database schemas → enforced by the engine on every write
CI checks → enforced on every merge, regardless of author
Tests → enforced on every run
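A small sketch of one durable location, using the standard `sqlite3` module with an in-memory database. The `payments` table and its rule are invented for illustration; the point is that the engine rejects the bad write no matter which caller issued it.

```python
import sqlite3


def try_insert(conn: sqlite3.Connection, amount_cents: int) -> bool:
    """Attempt a write; return False if the engine rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO payments (amount_cents) VALUES (?)",
                (amount_cents,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
# The intent "payments are never negative" lives in the schema itself.
conn.execute(
    "CREATE TABLE payments ("
    " id INTEGER PRIMARY KEY,"
    " amount_cents INTEGER NOT NULL CHECK (amount_cents >= 0))"
)
```

Here `try_insert(conn, 500)` succeeds and `try_insert(conn, -1)` is refused by the `CHECK` constraint, without any application code having to remember the rule.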
The resolution isn't to maintain all locations equally. It's to move the critical pieces from fragile locations to durable ones. The intent that lives in a person's head (fragile) must be expressed as a CI check (durable). The rationale that lives in a conversation (fragile) must be recorded as an ADR connected to the constraint it explains (durable).
This is the answer to Storey's open question: "How will teams externalize intent and sustain shared understanding?" By moving the theory from locations that erode to locations that persist, and by enforcing it mechanically, so that the theory holds regardless of who is on the team, which AI agent is generating code, or how fast the codebase is changing.
The three debts, the three layers, the three solutions
Because the debts are independent and live in different places, each requires its own intervention — not a braking mechanism, but a structural change that moves the theory from fragile locations to durable ones:
To reduce cognitive debt (restore comprehension):
Parnas boundaries are the primary mechanism. The developer understands the system at the interface level — what each module promises, what it accepts, what it returns. The implementation behind the interface is irrelevant to comprehension. The developer doesn't need to understand 10,000 lines of AI-generated code. They need to understand 50 interface contracts.
Supporting mechanisms: Architecture Decision Records (ADRs) explain WHY the boundaries are shaped the way they are. Worked examples show HOW the modules compose. Both help future developers build the mental model the original developer had.
To reduce intent debt (express intent as executable constraints):
Mechanical enforcement is the primary mechanism. The intent is expressed as a typed predicate, a contract test, a schema constraint, a linter rule, a CI check. The system can evaluate it on every change, at machine speed, deterministically.
The insight from Fallacy #7: the constraints usually ALREADY EXIST. Type signatures, API contracts, database schemas, module boundaries — these are expressed intent. The debt isn't in the absence of constraints. It's in the absence of ENFORCEMENT of constraints that already exist.
To reduce both simultaneously:
The module boundary addresses both — as a comprehension mechanism (the developer only needs to understand the interface) and as an intent mechanism (the interface is the enforceable contract). Adding a third layer — rationale preservation — prevents both debts from recurring by recording WHY the boundary exists and connecting the rationale to the constraint.
Layer 1 (Parnas boundaries): Reduces cognitive debt
Module interfaces make the system comprehensible
at the interface level.
Layer 2 (Mechanical enforcement): Reduces intent debt
Constraints make the intent executable by the system.
Catches violations regardless of developer comprehension.
Layer 3 (Rationale preservation): Reduces both
Records WHY (addresses cognitive debt for future developers)
AND connects the why to the constraint (ensures the intent
survives personnel changes and AI-driven modifications).
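One way Layer 3 could be wired to Layer 2 is to have every executable constraint carry a pointer to its rationale. The sketch below assumes constraints are predicates over a plain config dict; the ADR identifiers and config keys are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Constraint:
    """An executable check that carries a pointer to its rationale."""
    name: str
    adr: str                          # hypothetical ADR identifier
    holds: Callable[[dict], bool]     # predicate over a config dict


# Both constraints and their ADR numbers are invented for illustration.
CONSTRAINTS = [
    Constraint(
        name="external calls set a timeout",
        adr="ADR-0012 (hypothetical)",
        holds=lambda cfg: cfg.get("http_timeout_s", 0) > 0,
    ),
    Constraint(
        name="retries are bounded",
        adr="ADR-0019 (hypothetical)",
        holds=lambda cfg: 0 <= cfg.get("max_retries", -1) <= 5,
    ),
]


def report(cfg: dict) -> list[str]:
    """Return one message per violated constraint, each citing its ADR."""
    return [f"{c.name}: see {c.adr}" for c in CONSTRAINTS if not c.holds(cfg)]
```

When a check fails, the message points the developer (or the agent) at the WHY, so the rationale travels with the enforcement rather than sitting in a separate document.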
Why "specification debt" is not precise
Some discussions use "specification debt" as a catch-all. This term is imprecise because it conflates two different absences:
Absence of a specification (the interface or constraint doesn't exist): This is a Parnas boundary problem — the module was never decomposed with a well-defined interface. The solution is to create the boundary.
Absence of enforcement (the specification exists but isn't checked mechanically): This is an enforcement problem — the API contract exists but no contract test verifies it. The type signature exists but the language allows unsafe casts. The database schema has constraints but the application bypasses them. The solution isn't a new specification. It's a CI check on the existing one.
Calling both "specification debt" loses the distinction between "we don't have the spec" and "we have the spec but don't enforce it." The second is cheaper and faster to fix — because the artifact already exists. Conflating them makes teams think they need to write new specifications when they actually need to enforce existing ones.
The terminology, settled
Term                Location            What it is                   Solution
──────────────────  ──────────────────  ───────────────────────────  ─────────────────────────
Technical debt      Codebase            Poor structure, shortcuts    Refactor the code
Cognitive debt      Developer's mind    Loss of comprehension        Parnas boundaries
                                                                     (understand interfaces,
                                                                     not implementations)
Intent debt         System's            Gap between what SHOULD      Mechanical enforcement
                    enforcement         be preserved and what IS     of existing constraints
                    layer               expressed as an executable   (CI checks, contract
                                        constraint                   tests, linter rules)
Specification       Ambiguous;          Either missing spec          Split into:
debt                avoid               OR missing enforcement       → missing boundary
                                                                       (create it)
                                                                     → missing enforcement
                                                                       (enforce it)
Technical debt lives in the code. Cognitive debt lives in people. Intent debt lives in the absence of enforcement. Each has a different location, a different cause, and a different fix. Conflating them leads to refactoring when you should be enforcing, enforcing when you should be teaching, or teaching when you should be building boundaries.
The terms are not synonyms. They are independent axes. Getting them right changes the diagnosis. The diagnosis changes the architecture. The architecture changes the outcome.
This terminology clarification is relevant to The Fallacies of GenAI Development, where several fallacies involve cognitive debt (Fallacies #1, #2, #4) and others involve intent debt (Fallacies #5, #7, #8). Distinguishing them clarifies which fallacy produces which debt and which layer of the resolution addresses it.
The three-layer model (Parnas boundaries + mechanical enforcement + rationale preservation) is the architectural pattern that addresses both debts independently through a shared mechanism: the module boundary strengthened with enforcement and rationale. Parnas (1972) is the root. The three layers are the completion.