DEV Community

Nhường Phan
We've Been Thinking About AI Code Wrong. Code Isn't Flat — It's Layers.

I've been vibe coding almost every day for the past year. Claude, GPT, Copilot — they're all incredible. I can go from idea to working app in hours instead of weeks.

But I keep hitting the same wall.

And I think the wall isn't AI. The wall is code itself.


The $200 Bug

Last month I was building a payment feature. Minute 1, I told AI:

"Balance can never go negative."

AI wrote the check. Clean code. Looked perfect.

30 minutes later, I asked AI to add a bulk transfer feature. The context window was getting full. AI generated new code — and somewhere in that new code, it created a path where balance could go to -$200.

AI didn't hallucinate. It didn't write bad code. It literally didn't have my constraint in context anymore. The constraint existed in my prompt from 30 minutes ago — which had already scrolled out of the window.

The constraint lived in 1 line of English. Then it disappeared.

I fixed it. Added a test. Moved on. But it kept bugging me.

Then I realized: this isn't an AI problem. This is a code problem.


Why Code Is Broken (For AI)

Think about how your brain works when someone says "build a money transfer feature":

"Transfer money between two users"
  → ok, first validate, then transfer, then save
    → validate means: check users are active, check balance
      → check balance means: sender has enough money
        → that means: if sender.balance < amount, reject

You think in layers. From vague to specific. Each layer only has 3-5 concepts. Your brain naturally decomposes the problem.

But when you write code? All layers collapse into one flat file.

# This is what your brain thought:
#   Layer 0: Transfer money safely
#   Layer 1: Validate → Execute → Save
#   Layer 2: Check sender active, check balance
#   Layer 3: if sender.balance < amount: raise Error
#
# This is what survived:

def transfer_money(sender, receiver, amount):
    if sender.status != "active":
        raise InvalidUserError(user_id=sender.id)
    if sender.balance < amount:
        raise InsufficientBalanceError(...)
    new_balance = sender.balance - amount
    # ... 50 more lines

The why disappeared. Only the how survived.

Layer 0 ("transfer money safely") became a function name.
Layer 1 ("validate, then execute, then save") became... nothing. Maybe a comment if you're lucky.
Layers 2 and 3 became the actual code.

When AI works on line 50, it has no idea what you decided at "layer 0." The architectural intent is gone. The constraints are gone. The reasoning is gone.

Code is a lossy compression of thought.


What If the Layers Don't Disappear?

This is the idea I can't stop thinking about.

What if you write exactly how you think — starting broad, getting more specific — and every layer IS the code?

"Transfer money safely"

  "Validate both users"
    "Sender must be active, otherwise reject"
    "Balance must be sufficient, otherwise reject"

  "Execute the transfer"
    "New sender balance = sender balance minus amount"
    "New receiver balance = receiver balance plus amount"

  "Save and return result"
    "Atomically save both balances to database"
    "Return result with updated users"

Read that again.

  • "Transfer money safely" is what a product owner writes in a ticket.
  • "Sender must be active" and "Balance must be sufficient" are what a business analyst writes in a spec.
  • "New sender balance = sender balance minus amount" is what a developer writes in code.
  • "Atomically save both balances to database" is what a DBA cares about.

Same file. Same syntax. Same language. Every line is English. Every line is code.

The depth is unlimited. Simple feature? 2 layers. Banking compliance feature? 10 layers. Each layer only adds the detail that layer needs.
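To make "every layer IS the code" concrete, here's a toy sketch of how a tool might store a layered description as a tree. Everything here (`Step`, `refine`, `depth`) is my own illustration, not an API from any real system:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    """One English sentence at one layer of the description."""
    text: str
    children: list["Step"] = field(default_factory=list)

    def refine(self, *texts: str) -> "Step":
        # Add more-specific child steps one layer down
        self.children.extend(Step(t) for t in texts)
        return self

    def depth(self) -> int:
        # A simple feature stays shallow; depth grows only where detail is needed
        return 1 + max((c.depth() for c in self.children), default=0)

spec = Step("Transfer money safely")
spec.children.append(
    Step("Validate both users").refine(
        "Sender must be active, otherwise reject",
        "Balance must be sufficient, otherwise reject",
    )
)

print(spec.depth())  # → 3
```

The point of the tree shape: nothing forces all branches to the same depth, so a trivial step can stop at layer 1 while a compliance-heavy step keeps refining.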


The Key Insight: Constraints Flow Downward

Here's where it gets interesting.

When you write "balance must never be negative" at layer 0 — what happens to that constraint at layer 3, where the arithmetic actually happens? In normal code, it disappears. Someone has to remember it. AI has to have it in context.

But what if the system remembers it?

Layer 0: "Transfer money safely"
         constraint: "balance must never be negative"     ← written once

Layer 1: "Execute the transfer"
         constraint: "balance must never be negative"     ← auto-inherited

Layer 2: "Calculate new sender balance"
         constraint: "balance must never be negative"     ← auto-inherited

Layer 3: new_balance = sender.balance - amount
         system check: can this violate "balance >= 0"?
         → YES, if sender.balance < amount
         → ERROR: "Constraint 'balance never negative' may be violated.
                   Counterexample: balance=100, amount=150 → new_balance=-50"

The constraint flows DOWN through every layer automatically. The system catches violations mathematically — not through tests, not through AI memory, but through structural inheritance.
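A toy Python version of that downward flow might look like this. The names (`Node`, `inherited`, `find_counterexample`) are mine, and the brute-force search is a stand-in for the real mathematical check:

```python
from dataclasses import dataclass, field
from itertools import product

@dataclass
class Node:
    text: str
    constraints: list = field(default_factory=list)

    def inherited(self, parent_constraints=()):
        # DOWN: every ancestor constraint applies here too
        return list(parent_constraints) + self.constraints

never_negative = lambda state: state["balance"] >= 0

root = Node("Transfer money safely", [never_negative])  # written once
leaf = Node("Calculate new sender balance")             # nothing written here
active = leaf.inherited(root.inherited())               # auto-inherited

def find_counterexample(step, constraints, values):
    # Try small concrete inputs; return the first pair that breaks a constraint
    for balance, amount in product(values, repeat=2):
        state = step({"balance": balance, "amount": amount})
        if not all(c(state) for c in constraints):
            return balance, amount
    return None

subtract = lambda s: {**s, "balance": s["balance"] - s["amount"]}
print(find_counterexample(subtract, active, [0, 50, 100, 150]))  # → (0, 50)
```

The leaf never declared anything, yet the check at the leaf still fires, because the constraint list it runs against was assembled from its ancestors.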

I call this Constraint Inheritance Chain (CIC):

  1. DOWN: Parent constraints automatically apply to ALL children at any depth
  2. UP: When a child step is verified correct, its result becomes a fact the parent can use
  3. ACROSS: Output from step 1 is automatically available to step 2
  4. DIAGONAL: Type definitions inject constraints everywhere they're used

That last one is powerful. Watch:

"WalletBalance is a number that is never negative"    ← define once

"User has a balance (WalletBalance)"                  ← use the type

"Transfer money from sender (User)"                   ← use User

The system automatically knows: sender.balance >= 0. Nobody wrote that constraint for transfer_money. It was inferred from the type definition, through the record field, into the function scope.

In my prototype, developers write ~5 explicit constraints, and the system verifies ~16 total — because 69% are auto-inferred from type definitions.
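Here's a minimal sketch of that DIAGONAL inference: a constraint declared on a type once, then collected automatically wherever the type appears through record fields. The registry shape (`TYPE_CONSTRAINTS`, `FIELDS`, `constraints_for`) is purely illustrative:

```python
TYPE_CONSTRAINTS = {}  # type name -> list of constraint descriptions
FIELDS = {}            # record type -> {field name: field type}

def constrained(type_name, rule):
    # "WalletBalance is a number that is never negative" — define once
    TYPE_CONSTRAINTS[type_name] = [rule]

constrained("WalletBalance", "value >= 0")
FIELDS["User"] = {"balance": "WalletBalance"}  # "User has a balance (WalletBalance)"

def constraints_for(param_type):
    """Walk record fields and pull in every type-level constraint."""
    out = list(TYPE_CONSTRAINTS.get(param_type, []))
    for field_name, field_type in FIELDS.get(param_type, {}).items():
        out += [f"{field_name}.{c}" for c in constraints_for(field_type)]
    return out

# transfer_money(sender: User) inherits the rule with no extra annotation:
print(constraints_for("User"))  # → ['balance.value >= 0']
```

Nobody wrote a constraint on `User` or on `transfer_money`; it rode in through the field type, which is exactly the claimed 69% of auto-inferred constraints.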


Same Logic, Three Styles

One thing I'm exploring: what if the same code can be written in multiple styles? Product owners, developers, and AI all think differently. But the underlying logic is the same.

Document style (product owner writes this):

transfer money safely
  given sender is a User, receiver is a User, amount is a PositiveAmount
  producing TransferResult or InsufficientBalanceError

  requires that sender is active
  requires that receiver is active
  must always satisfy balance is non-negative
  guarantees that sender balance decreases by exactly the amount

Code style (developer writes this):

do transfer money safely
  from sender:User, receiver:User, amount:PositiveAmount
  -> TransferResult or InsufficientBalanceError

  promise before: sender.status is "active"
  promise before: receiver.status is "active"
  promise always: sender.balance >= 0
  promise after: result.sender.balance is old(sender.balance) - amount

Same parse tree. Same verification. Same output. The system doesn't care which style you use. AI tends to use document style naturally (because that's how language models think). Developers can use code style if they prefer.
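"Same parse tree" is easy to sketch: both surface styles can normalize to the same internal node before verification ever runs. This `parse_line` is a hypothetical fragment of such a front end, not the prototype's actual parser:

```python
import re

def parse_line(line: str):
    """Map either style's constraint line to a (phase, condition) pair."""
    doc_keywords = {
        "requires that": "before",
        "guarantees that": "after",
        "must always satisfy": "always",
    }
    # Document style: "requires that sender is active"
    for kw, phase in doc_keywords.items():
        if line.startswith(kw):
            return (phase, line[len(kw):].strip())
    # Code style: "promise before: sender.status is 'active'"
    m = re.match(r"promise (before|after|always):\s*(.+)", line)
    if m:
        return (m.group(1), m.group(2))
    return None

a = parse_line("requires that sender is active")
b = parse_line("promise before: sender is active")
print(a == b)  # → True: both yield ('before', 'sender is active')
```

Once both styles collapse to the same `(phase, condition)` node, everything downstream — inheritance, verification, codegen — is style-blind by construction.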


What Gets Generated

When all layers are detailed enough, the system compiles to real, runnable code:

# Auto-generated from the layered description above
# (assumes `db`, `user_repo`, and the domain types are already in scope)
from dataclasses import replace

async def transfer_money_safely(
    sender: User,
    receiver: User,
    amount: PositiveAmount,
) -> TransferResult:
    """Transfer money safely between two users."""

    # --- contracts ---
    assert sender.status == "active", "sender must be active"
    assert receiver.status == "active", "receiver must be active"
    _old_sender_balance = sender.balance

    # --- validate both users ---
    if sender.status != "active":
        raise InvalidUserError(user_id=sender.id)
    if receiver.status != "active":
        raise InvalidUserError(user_id=receiver.id)
    if sender.balance < amount:
        raise InsufficientBalanceError(
            current_balance=sender.balance,
            requested_amount=amount)

    # --- execute the transfer ---
    new_sender_balance = sender.balance - amount
    new_receiver_balance = receiver.balance + amount

    # --- save and return result ---
    async with db.transaction():
        await user_repo.update(sender.id, balance=new_sender_balance)
        await user_repo.update(receiver.id, balance=new_receiver_balance)

    result = TransferResult(
        sender=replace(sender, balance=new_sender_balance),
        receiver=replace(receiver, balance=new_receiver_balance),
        amount=amount)

    assert result.sender.balance == _old_sender_balance - amount
    return result

Plus auto-generated tests:

@pytest.mark.asyncio
async def test_transfer_rejects_inactive_sender():
    """Violate: requires that sender is active"""
    sender = User(status="suspended", balance=1000)
    receiver = User(status="active", balance=0)
    with pytest.raises(InvalidUserError):
        await transfer_money_safely(sender, receiver, 100)

@pytest.mark.asyncio
async def test_transfer_rejects_insufficient_balance():
    """Violate: balance must be sufficient"""
    sender = User(status="active", balance=50)
    receiver = User(status="active", balance=0)
    with pytest.raises(InsufficientBalanceError):
        await transfer_money_safely(sender, receiver, 100)

@pytest.mark.asyncio
async def test_transfer_happy_path():
    """Verify: sender balance decreases by amount"""
    sender = User(status="active", balance=1000)
    receiver = User(status="active", balance=0)
    result = await transfer_money_safely(sender, receiver, 200)
    assert result.sender.balance == 800

From layers → verified correct → Python + tests. Automatically.


Why This Matters For AI Coding

Current AI coding flow:

idea → AI writes code → hope it's correct → test → find bugs → fix → repeat

With layers:

idea → describe in layers → system verifies consistency → generate code

The difference:

  • AI never sees more than 3-5 concepts per layer. No context window overflow.
  • Constraints are structural, not contextual. System remembers them even when AI's context window is full.
  • Verification happens before code generation. Bugs caught at description time, not runtime.
  • Every layer is readable English. Code review = reading a document.

The Uncomfortable Question

Here's what I keep coming back to:

Do we actually want to "write code"?

Or do we want to describe behavior at increasing levels of detail until it's precise enough to run?

Because if it's the second one... then maybe the right tool isn't a better code editor, or a better AI model, or better prompting.

Maybe the right tool is a language where description IS code. Where the document you write first is the source of truth. Where layers don't collapse. Where constraints don't disappear. Where AI and humans work at whatever level of detail they need.

I've been prototyping this — a language where you write structured English that decomposes layer by layer until it's precise enough to compile. Where the system mathematically verifies that every layer is consistent with every other layer. Where the generated code is guaranteed to match your description.

The tagline I keep coming back to:

Write a document. Run it as code. It can't be wrong.

Still early. Still rough. But every time I use it, I can't go back to flat code.

Curious if this resonates — or if I'm overthinking it. Have you hit the "flat code" wall with AI? How do you deal with constraints disappearing mid-session?


If you're interested in following this project, I'll be sharing updates as the prototype matures. Drop a comment or follow — I'd love to hear your experiences with AI coding pain points.
