The 15-week technical battle of LogiFlow — a company waking up from the illusion created by artificial intelligence and returning to real engineering.
The Story
The test suite took 45 minutes to run, so developers started merging without running it. Some of the AI-written tests passed on one run and timed out on the next. Flaky tests were born.
It started innocently. One test that relied on setTimeout would pass locally but fail in CI. Then another test that depended on database insertion order started flickering. Within weeks, the team had a Pavlovian response to red CI: just re-run it.
The test suite became a slot machine. Pull the lever, hope for green.
Technical Autopsy: Injecting Time
// ❌ AI-written flaky test
it('token should expire after 1 hour', async () => {
  const token = createToken();
  // Waits 1 real hour, far beyond Jest's default 5-second test timeout
  await new Promise(r => setTimeout(r, 3600000));
  expect(token.isExpired()).toBe(true);
});
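// ✅ Human-written deterministic test
it('token should expire after 1 hour', () => {
  // Pin the clock to a known instant (modern fake timers also mock Date)
  jest.useFakeTimers().setSystemTime(new Date('2026-01-01T10:00:00Z'));
  try {
    const token = createToken();
    jest.advanceTimersByTime(3600000); // move the fake clock forward 1 hour
    expect(token.isExpired()).toBe(true);
  } finally {
    jest.useRealTimers(); // do not leak fake timers into other tests
  }
});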
The difference is philosophical. The flaky test asks: "What happens if we literally wait an hour?" The deterministic test asks: "What happens when the system believes an hour has passed?"
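The "system believes" idea can also live in the code under test instead of the test runner. The following is a minimal sketch, assuming a hypothetical `createToken` that accepts a clock function. The article does not show the real implementation, so the `now` and `ttlMs` parameters, the `createFakeClock` helper, and the expiry rule (`>=` the TTL) are illustrative assumptions.

```javascript
// Hypothetical token factory: `now` is an injected clock that defaults to the real one.
// Parameter names and the expiry rule (elapsed >= ttlMs) are assumptions for illustration.
const ONE_HOUR_MS = 60 * 60 * 1000;

function createToken({ now = Date.now, ttlMs = ONE_HOUR_MS } = {}) {
  const issuedAt = now();
  return {
    issuedAt,
    // Expired once the injected clock reports that ttlMs has elapsed.
    isExpired: () => now() - issuedAt >= ttlMs,
  };
}

// A controllable clock for tests: time moves only when the test says so.
function createFakeClock(startMs) {
  let current = startMs;
  return {
    now: () => current,
    advance: (ms) => { current += ms; },
  };
}

const clock = createFakeClock(Date.parse('2026-01-01T10:00:00Z'));
const token = createToken({ now: clock.now });
const expiredBefore = token.isExpired(); // checked before any time has passed
clock.advance(ONE_HOUR_MS);
const expiredAfter = token.isExpired();  // checked after exactly one simulated hour
```

With the clock passed in, the test needs no timer mocking at all, and production code keeps using `Date.now` through the default argument.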
A flaky test is more dangerous than no test at all. CI/CD must be fast and deterministic.
The Cost of Flaky Tests
| Symptom | Impact |
|---|---|
| Developers re-run CI "just in case" | Wasted compute, wasted time |
| Red builds get ignored | Real bugs slip through |
| Test suite takes 45 min | Developers skip tests before merge |
| Random timeouts in CI | No one trusts the pipeline |
When no one trusts the tests, you effectively have no tests.
Lessons from Episode 9
1. Flaky Test = Bad Test: Tests that sometimes pass and sometimes fail destroy developer confidence.
2. Determinism: Inject time (time injection). Tests should control the clock instead of waiting on real timers or reading the real Date.now().
3. Fast CI: Test suites longer than 10 minutes kill developer experience. Parallelize, shard, and prune.
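Determinism covers more than clocks. The insertion-order flakiness from the story usually goes away once the test stops assuming an order the database never promised. The following is a minimal sketch with made-up row data and a hypothetical `sortById` helper, neither of which comes from the LogiFlow codebase.

```javascript
// Hypothetical rows as two runs might return them: same data, different order.
const runA = [{ id: 2, status: 'shipped' }, { id: 1, status: 'pending' }];
const runB = [{ id: 1, status: 'pending' }, { id: 2, status: 'shipped' }];

// Sort a copy by a stable key so the comparison ignores return order.
function sortById(rows) {
  return [...rows].sort((a, b) => a.id - b.id);
}

const normalizedA = JSON.stringify(sortById(runA));
const normalizedB = JSON.stringify(sortById(runB));
const sameContent = normalizedA === normalizedB;
```

In a Jest test the same idea would read `expect(sortById(rows)).toEqual(sortById(expected))`. The alternative is to make the query itself deterministic with an explicit `ORDER BY`.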
This is Episode 9 of the "Back to Code" series. Next up: Episode 10 — The Security Vulnerability Factory.
Series: back.to.code · 2026