You described what you wanted, the AI wrote the code, and it worked. That felt amazing. Then six weeks later you tried to add a feature and nothing made sense.
That's the vibe coding tax, and most people discover it the hard way.
## What Vibe Coding Actually Is
Andrej Karpathy coined the term in early 2025: you describe what you want in plain language, the AI generates code, you accept it and move on. No deep review of the diff. No worrying about how it's structured. "Fully give in to the vibes."
For throwaway scripts and weekend prototypes, it's genuinely excellent. Fast, frictionless, fun.
For production software, taken literally, it's how you build a codebase nobody can maintain, including you.
Here's the thing: most developers aren't doing pure vibe coding. They're doing something in between. They describe what they want, review some of the output, push back on the bad parts, and accept the rest. That middle ground is where the interesting problems live and where the real skill is developing.
This series is about that middle ground.
## The Vibe Coding Spectrum
It helps to think about this as a spectrum, not a binary.
On one end: you write everything yourself. Full control. Slow. The ceiling is your own knowledge and typing speed.
On the other end: pure vibe coding. You describe, AI generates, you accept. Fast. The ceiling is the AI's judgment, which is often surprisingly high for isolated tasks and surprisingly bad for anything requiring architectural coherence.
Most useful work happens somewhere between those poles. The question is: where should you be on that spectrum for the work you're doing right now?
```
← Manual                                Full Vibe →
|---------|---------|---------|---------|---------|
write      pair       prompt     describe   accept
yourself   with AI    and edit   and skim   blindly
(slow,     (good      (good      (fast,     (fast,
full       for        for most   risky at   dangerous
control)   known      things)    scale)     always)
           patterns)
```
For experienced developers, the practical sweet spot is somewhere in the "prompt and edit" to "describe and skim" range depending on how well-contained the task is.
The problem: without architectural discipline, even "prompt and edit" at scale produces code that's hard to reason about. Every file is a bit different. Every AI session introduced slightly different conventions. The codebase accumulates context-free snippets that work individually but don't fit together.
## The Two Ways Vibe Coding Goes Wrong
### 1. Architectural drift
AI coding assistants are very good at generating code that works locally. They're not good at maintaining global coherence because they don't see the whole codebase at once and have no stake in what accumulates over time.
Ask Claude to add a new endpoint and it'll write a clean route handler. Ask it again next week and it might choose a slightly different structure for the same kind of thing. Neither is wrong. Together, they're inconsistent. Over months, the codebase has seventeen slightly different ways of doing the same thing.
This isn't a hypothetical. It's what you get when you vibe code without giving the AI architectural rails to follow.
### 2. Boundary blindness
AI will put code wherever it fits. Ask it to add validation logic and it might add it in the route handler. Ask it to add a calculation and it might add it in the service. Ask it to format a response and it might reach into a domain object.
Each individual decision is defensible. The aggregate is a codebase where business rules are scattered across every layer, infrastructure concerns leak into the domain, and changing one thing requires understanding everything.
Sound familiar? It's exactly what hexagonal architecture was designed to prevent, and vibe coding can recreate it faster than any other workflow.
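For contrast, here's a minimal sketch of the boundary hexagonal architecture draws. The names (`Order`, `OrderRepository`, `place_order`) are illustrative, not from any real codebase: the domain owns the rules, a port declares what the domain needs from the outside world, and the application layer only orchestrates.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Order:
    """Domain object: the business rule lives here, nowhere else."""
    total: float

    def is_valid(self) -> bool:
        # One home for the rule, so neither you nor the AI has to
        # guess which layer owns it.
        return self.total > 0


class OrderRepository(Protocol):
    """Port: the domain declares the capability it needs, not the tech."""
    def save(self, order: Order) -> None: ...


def place_order(order: Order, repo: OrderRepository) -> None:
    """Application layer: orchestrates, holds no business rules."""
    if not order.is_valid():
        raise ValueError("invalid order")
    repo.save(order)
```

With this shape, validation can't drift into a route handler: the handler has nothing to validate with except the domain object's own method.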
## Why Architecture Makes Vibe Coding Usable
Here's the core insight of this series: clear architectural boundaries are the best context you can give an AI coding assistant.
When the AI knows where things go, when every piece of code has an obvious home, it generates code that fits. When it doesn't know, it improvises. And AI improvisation, repeated over hundreds of sessions, produces the architectural drift described above.
Think about what changes when you tell the AI:
> "Add a `discount_service.py` in the `application/` layer. It should only import from `domain/`. It takes a `CustomerRepository` port as a constructor argument. No FastAPI, no SQLAlchemy."
That's a constrained prompt. The AI has three rules to follow. It can focus entirely on the logic because you've handled the architecture. The output is almost always good.
Compare that to:
> "Add a discount service."
That one's a guess. The AI will add something that works, but you'll find it reached into the database directly, or attached itself to the wrong class, or introduced a new pattern that doesn't match anything else in the codebase.
The more structure you give, the less cleanup you do.
## The Practical Workflow
Here's what actually works for vibe coding with architectural discipline. We'll spend the whole series on these ideas, but the mental model is simple:
**Before generating:** Set the rails. Define the layer, the constraints, the contracts. Tell the AI what it's not allowed to do, not just what it should do.
**While reviewing:** Check the boundaries first, logic second. A correctly structured piece of code with a small logic bug is easy to fix. A correctly logical piece of code in the wrong layer is expensive to fix.
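The boundary-first check can even be partially automated. Here's a minimal sketch using only the standard library; the forbidden-module list and the idea of scanning a `domain/` directory are assumptions about your layout, not a prescription:

```python
import ast
from pathlib import Path

# Modules the domain layer must never import (assumed layering rule).
FORBIDDEN_IN_DOMAIN = {"fastapi", "sqlalchemy", "requests"}


def boundary_violations(source: str) -> list[str]:
    """Return the forbidden top-level modules imported by this source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            if name.split(".")[0] in FORBIDDEN_IN_DOMAIN:
                hits.append(name)
    return hits


def check_layer(layer_dir: str) -> dict[str, list[str]]:
    """Map each file in the layer to its boundary violations, if any."""
    return {
        str(path): found
        for path in Path(layer_dir).rglob("*.py")
        if (found := boundary_violations(path.read_text()))
    }
```

Run in CI against `domain/` and a wrong-layer import becomes a failing check instead of a review-time judgment call. Tools like import-linter do this more thoroughly; the sketch just shows the principle.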
**After accepting:** Run the tests. Not just the unit tests: the integration tests that verify the full path. AI-generated code is good at passing unit tests it wrote alongside the code. It's less reliable at passing pre-existing tests that check real behavior.
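As a miniature sketch of what "verify the full path" means here (all names are hypothetical): the test drives the service through an in-memory stand-in for the database adapter and asserts only on behavior a caller can observe, not on internals the AI wrote alongside the code.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryOrders:
    """Test double standing in for the database adapter."""
    rows: list = field(default_factory=list)

    def save(self, row: dict) -> None:
        self.rows.append(row)


def place_order_service(items: list, repo: InMemoryOrders) -> int:
    """Application service: totals the order in cents and persists it."""
    total = sum(price_cents * qty for price_cents, qty in items)
    repo.save({"total_cents": total, "status": "pending"})
    return total


def test_place_order_full_path() -> None:
    repo = InMemoryOrders()
    total = place_order_service([(999, 2), (500, 1)], repo)
    # Behavioral assertions: the total and the persisted row,
    # nothing about how the service computed them.
    assert total == 2498
    assert repo.rows == [{"total_cents": 2498, "status": "pending"}]
```

A test at this level keeps passing or failing for the right reasons even when the AI restructures the code underneath it.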
## A Real Example of Both Worlds
Here's the session that makes vibe coding feel broken. You have a working FastAPI service and ask Claude: "Add a loyalty points system: customers earn 10 points per dollar spent and can redeem 100 points for $5 off."
Claude generates this in the route handler:
```python
# In api/orders.py — the route handler
@router.post("/")
def place_order(body: PlaceOrderRequest, db: Session = Depends(get_db)):
    customer = db.query(Customer).filter_by(id=body.customer_id).first()
    total = sum(item.price * item.quantity for item in body.items)

    # Apply loyalty redemption if requested
    if body.redeem_points and customer.loyalty_points >= 100:
        points_to_redeem = (customer.loyalty_points // 100) * 100
        discount = (points_to_redeem / 100) * 5
        total -= discount
        customer.loyalty_points -= points_to_redeem

    order = Order(customer_id=body.customer_id, total=total, status="pending")
    db.add(order)
    customer.loyalty_points += int(total * 10)
    db.commit()
    return {"order_id": order.id, "total": total}
```
It works. You ship it. Then you add a batch order endpoint. It needs loyalty points too. Claude copies the loyalty logic into the new route. Different but similar. Now you have two implementations. Add a mobile API. Third copy.
Then the rate changes from 10 points per dollar to 15. You need to find all three places but there's no obvious way to know there are three, and they're slightly different, so a search isn't reliable.
Here's the same feature with architectural rails:
```python
# In domain/models.py — where it belongs
from dataclasses import dataclass
from typing import ClassVar


@dataclass
class LoyaltyAccount:
    points: int = 0

    POINTS_PER_DOLLAR: ClassVar[int] = 10
    REDEMPTION_UNIT: ClassVar[int] = 100
    REDEMPTION_VALUE: ClassVar[float] = 5.0

    def earn(self, spend_amount: float) -> "LoyaltyAccount":
        earned = int(spend_amount * self.POINTS_PER_DOLLAR)
        return LoyaltyAccount(points=self.points + earned)

    def redeem(self, requested: bool) -> tuple["LoyaltyAccount", float]:
        if not requested or self.points < self.REDEMPTION_UNIT:
            return self, 0.0
        redeemable = (self.points // self.REDEMPTION_UNIT) * self.REDEMPTION_UNIT
        discount = (redeemable / self.REDEMPTION_UNIT) * self.REDEMPTION_VALUE
        return LoyaltyAccount(points=self.points - redeemable), discount
```
The rules live in one place. Every route calls the same domain method. The rate changes in one place. The prompt that produced this:
> "Add a `LoyaltyAccount` value object to `domain/models.py`. It should hold `points: int` and have two methods: `earn(spend_amount: float)`, which returns a new `LoyaltyAccount` with earned points added, and `redeem(requested: bool)`, which returns a tuple of the updated account and the discount amount. Keep the rate and redemption value as class-level constants. No database calls, no imports outside the standard library."
Constrained prompt. Clear layer. Clear interface. The AI generated exactly that, with no drift.
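To make the payoff concrete, here's a sketch of the handler logic that falls out (the value object is repeated so the snippet runs standalone, and `checkout` is an illustrative name, not from the article's codebase): every endpoint delegates to the same two domain methods, so a rate change touches exactly one file.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass
class LoyaltyAccount:
    """Same value object as above, repeated for a self-contained sketch."""
    points: int = 0

    POINTS_PER_DOLLAR: ClassVar[int] = 10
    REDEMPTION_UNIT: ClassVar[int] = 100
    REDEMPTION_VALUE: ClassVar[float] = 5.0

    def earn(self, spend_amount: float) -> "LoyaltyAccount":
        earned = int(spend_amount * self.POINTS_PER_DOLLAR)
        return LoyaltyAccount(points=self.points + earned)

    def redeem(self, requested: bool) -> tuple["LoyaltyAccount", float]:
        if not requested or self.points < self.REDEMPTION_UNIT:
            return self, 0.0
        redeemable = (self.points // self.REDEMPTION_UNIT) * self.REDEMPTION_UNIT
        discount = (redeemable / self.REDEMPTION_UNIT) * self.REDEMPTION_VALUE
        return LoyaltyAccount(points=self.points - redeemable), discount


def checkout(
    account: LoyaltyAccount, total: float, redeem_points: bool
) -> tuple[LoyaltyAccount, float]:
    """Thin handler logic: every loyalty rule is delegated to the domain."""
    account, discount = account.redeem(redeem_points)
    final_total = total - discount
    account = account.earn(final_total)
    return account, final_total
```

The batch endpoint and the mobile API call the same `checkout` path, so there is never a second copy of the rules to hunt down.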
## The Shift in Mindset
You stop asking "does this code work?" and start asking "does this code belong here?"
Both questions matter. But the second one is the harder one and the one the AI won't ask for you.
The AI is an excellent executor. It can take a well-specified task and produce solid code fast. What it can't do is maintain global architectural intent across sessions. That's your job. And it turns out that job, specifying intent clearly enough for an AI to execute correctly, is a genuinely useful engineering skill.
Not a lesser skill than writing everything yourself. A different one.