Back to Code | Ep 03: The Lost Craft — TDD and False Confidence

#ai #programming #testing #tdd

The 15-week technical battle of LogiFlow — a company waking up from the illusion created by artificial intelligence and returning to real engineering.

The Story

Victory was in the air in the War Room. The previous day's Hexagonal Architecture surgery had been a success. Emre ran npm run test in the terminal. The screen turned green within seconds: 142 passed, 0 failed.

"See, Defne?" said Emre. "AI didn't just write the code — it also wrote tests with 98% coverage. Our codebase is bulletproof right now."

Defne looked at the green checkmarks on the screen. There was no smile on her face.

"Emre, a P0 ticket just came in from the QA team. In the staging environment, a truck with a 5-hour hazmat route had its ETA entered into the system as 300 minutes. The system didn't account for the legally mandated 2-hour rest break."

Emre lurched forward in shock. "Impossible! The test just went green!"

Technical Autopsy: Testing the Echo

Emre projected the AI-generated test file on screen:

// "Flawless" But Fake Test Generated by AI
describe('CalculateHazmatRouteUseCase', () => {
  it('should calculate ETA for hazmat truck correctly',
    async () => {
      mockTruckRepo.findById.mockResolvedValue({
        id: 'truck-1', isHazmat: true
      });
      mockMapService.getRoute.mockResolvedValue({
        durationInMinutes: 300
      });

      const result = await useCase.execute(
        'truck-1', { lat: 0, lng: 0 }
      );

      // AI's assertion — ADAPTED TO MATCH THE CODE!
      expect(result).toBe(300);
    }
  );
});

"Here is AI's most dangerous illusion in engineering: Tautological Testing and False Confidence."

"When you tell AI to 'write tests for this function,' AI doesn't understand the business rule. AI looks at what the current code returns and writes that value into the expect line. If the code incorrectly returns 1000, AI writes the test to expect 1000! You're just testing the code's own echo."

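The echo effect can be reproduced in a few lines. This is a minimal sketch, not LogiFlow's actual code: buggyEta is a hypothetical stand-in for the use case, carrying the same bug from the story. The only difference between the two expectations is where the expected number comes from.

```typescript
// Hypothetical stand-in for the use case under test. It carries the bug from
// the story: the hazmat rest break is ignored and the map duration is echoed.
interface Truck {
  id: string;
  isHazmat: boolean;
}

function buggyEta(_truck: Truck, routeMinutes: number): number {
  return routeMinutes; // bug: the 120-minute break is never added
}

const truck: Truck = { id: "truck-1", isHazmat: true };

// Tautological expectation: obtained by running the code and copying its output.
const echoedExpectation = buggyEta(truck, 300);

// Intent expectation: derived from the business rule, independent of the code.
const intendedExpectation = 300 + 120;

const actual = buggyEta(truck, 300);

const echoTestPasses = actual === echoedExpectation; // true, but proves nothing
const intentTestPasses = actual === intendedExpectation; // false, exposes the bug

console.log({ echoTestPasses, intentTestPasses });
```

The echoed expectation can never disagree with the code, so it stays green no matter how wrong the code is.
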
The Lost Craft: TDD Is Not a "Testing" Tool

TDD's sacred cycle: Red - Green - Refactor

Red (Intent): Before the code exists, you write a test that states how a business rule should behave. It fails because nothing implements the rule yet.
Green (Implementation): You write the simplest code that makes that test pass.
Refactor: You clean up the code while the tests stay green.

The Solution: BDD and Human-Written "Intent"

// HUMAN-WRITTEN "INTENT" TEST
describe('Hazmat Routing Domain Policy', () => {
  const hazmatTruck: TruckProfile = { isHazmat: true };
  const longRoute: RouteCalculation = {
    durationInMinutes: 300
  };

  it('should add legally mandated 2-hour break on routes exceeding 4 hours', () => {
    const finalEta = calculateFinalEta(longRoute, hazmatTruck);
    // 300 min driving + 120 min legal break = 420 min
    expect(finalEta).toBe(420);
  });

  it('should NOT add break on exact 4-hour (240 min) routes (Edge Case)', () => {
    const exactRoute: RouteCalculation = {
      durationInMinutes: 240
    };
    expect(
      calculateFinalEta(exactRoute, hazmatTruck)
    ).toBe(240);
  });

  it('should NEVER add break for non-hazmat trucks', () => {
    const standardTruck: TruckProfile = { isHazmat: false };
    expect(
      calculateFinalEta(longRoute, standardTruck)
    ).toBe(300);
  });
});

Defne hit npm run test. The screen went RED. Expected: 420, Received: 300.

At that moment, everyone in the room took a deep breath. The red light wasn't a bug — it was a truth being revealed.
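
For readers who want to see the Green step, here is one possible implementation that would satisfy the three tests above. It is a sketch under assumptions: the field shapes of TruckProfile and RouteCalculation are inferred from the test, and the constant names are made up for illustration. The tests only pin down a single 2-hour break, so nothing here says what should happen on much longer routes.

```typescript
// Type names follow the intent test; their exact fields are assumed.
interface TruckProfile {
  isHazmat: boolean;
}
interface RouteCalculation {
  durationInMinutes: number;
}

// Hypothetical constant names; the values come from the rule stated in the tests.
const BREAK_THRESHOLD_MINUTES = 240; // break applies only to routes longer than 4 hours
const MANDATORY_BREAK_MINUTES = 120; // legally mandated 2-hour rest

function calculateFinalEta(
  route: RouteCalculation,
  truck: TruckProfile
): number {
  // Strictly greater than: an exact 4-hour route gets no break.
  const needsBreak =
    truck.isHazmat && route.durationInMinutes > BREAK_THRESHOLD_MINUTES;
  return route.durationInMinutes + (needsBreak ? MANDATORY_BREAK_MINUTES : 0);
}

// The three cases from the intent test:
console.log(calculateFinalEta({ durationInMinutes: 300 }, { isHazmat: true })); // 420
console.log(calculateFinalEta({ durationInMinutes: 240 }, { isHazmat: true })); // 240
console.log(calculateFinalEta({ durationInMinutes: 300 }, { isHazmat: false })); // 300
```

Naming the threshold and the break length is already a small Refactor step: the legal rule is now readable in the code instead of hiding in magic numbers.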

Lessons from Episode 3

1. Tautological Tests: AI reads the current code's output and writes that value into the expect line. Coverage looks near-perfect (98% at LogiFlow), yet the tests only confirm that the code does what it already does.

2. TDD Is a Design Tool, Not Validation: The purpose of Test-Driven Development isn't "checking for bugs" — it's "binding how the system should behave into a contract."

3. Edge Case Blindness: AI excels at writing "happy path" tests. But it doesn't skeptically probe boundary values and legal requirements the way a human who knows the rule does.

4. New Workflow (Red-Green-AI): Humans write tests first and make them Red. Only then is AI asked for the implementation to make the red tests pass.
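
Lessons 3 and 4 can be combined into a small harness: humans own a table of boundary cases, and any candidate implementation (AI-written or not) is run against it. The sketch below is illustrative only. The names EtaFn, cases, and failingCases are hypothetical, and the 239 and 241 minute rows are extrapolated from the "exceeding 4 hours" wording rather than taken from the story.

```typescript
interface TruckProfile {
  isHazmat: boolean;
}
interface RouteCalculation {
  durationInMinutes: number;
}
type EtaFn = (route: RouteCalculation, truck: TruckProfile) => number;

// Human-owned cases: [route minutes, isHazmat, expected ETA in minutes].
const cases: Array<[number, boolean, number]> = [
  [239, true, 239], // just under the threshold
  [240, true, 240], // exactly 4 hours: no break
  [241, true, 361], // just over: 241 + 120
  [300, true, 420], // the case from the P0 ticket
  [300, false, 300], // non-hazmat trucks never get the break
];

// Returns a description of every case the candidate gets wrong.
function failingCases(candidate: EtaFn): string[] {
  return cases
    .filter(
      ([minutes, isHazmat, expected]) =>
        candidate({ durationInMinutes: minutes }, { isHazmat }) !== expected
    )
    .map(([minutes, isHazmat]) => `${minutes} min, hazmat=${isHazmat}`);
}

// Candidate 1: the "echo" behavior that shipped to staging.
const echoCandidate: EtaFn = (route) => route.durationInMinutes;

// Candidate 2: an implementation that follows the rule.
const ruleCandidate: EtaFn = (route, truck) =>
  route.durationInMinutes +
  (truck.isHazmat && route.durationInMinutes > 240 ? 120 : 0);

console.log(failingCases(echoCandidate)); // the two hazmat rows above 240 minutes
console.log(failingCases(ruleCandidate)); // []
```

The table is the contract. Whoever writes the implementation, it is accepted only when the list of failing cases is empty.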

This is Episode 3 of the "Back to Code" series. Next up: Episode 4 — Forgetting the Machine: Big O Notation and the Performance Tax.

Series: back.to.code · 2026