Darya Belaya

Posted on • Originally published at habr.com

Why green CI doesn't mean your system works

A case study: how a TypeScript migration doubled my test runs — with zero failures


CI was green. Tests were passing. PRs were merging.

The system was broken. And nothing in the logs showed it.

After migrating my test project from JavaScript to TypeScript, I noticed something odd: CI started taking almost twice as long to run. No failures. No errors. Just... slower.

I assumed it was normal. TypeScript compilation overhead, probably. I moved on.


How I found it

By accident.

Near the end of the migration, when I started deleting the original .js files, the test count dropped by almost half:

  • Before: ~240 tests
  • After: ~120 tests

That number didn't make sense. Not slightly off — structurally wrong.

I hadn't removed any tests. I only deleted old JavaScript files that were supposed to be gone already.

That's when I stopped debugging performance. I was debugging duplicated reality.


What was actually happening

Playwright was picking up both .spec.js and .spec.ts files at the same time.

Every test in the suite was running twice. The same assertions, the same setup, the same teardown — duplicated silently, without a single warning.

The worst part wasn't the wasted time. It was that CI made it look like things were improving. Runtime crept up gradually, which read as "normal post-migration slowdown." I had a plausible story for the symptom, so I stopped looking.
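A duplication like this can be spotted without running anything by looking for spec files that exist under both extensions. The following is a minimal sketch, not part of the original project: `findDuplicateSpecs` is a hypothetical helper, and the file paths are made up for illustration.

```typescript
// Hypothetical helper: group spec files by their path without the final
// extension and report every stem that exists as both .spec.js and .spec.ts.
function findDuplicateSpecs(paths: string[]): string[] {
  const extensionsByStem = new Map<string, Set<string>>();

  for (const path of paths) {
    const match = /^(.*\.spec)\.(js|ts)$/.exec(path);
    if (!match) continue; // not a .spec.js / .spec.ts file

    const stem = match[1];
    const ext = match[2];
    const seen = extensionsByStem.get(stem) || new Set<string>();
    seen.add(ext);
    extensionsByStem.set(stem, seen);
  }

  // Keep only stems seen with more than one extension.
  return Array.from(extensionsByStem.entries())
    .filter(([, exts]) => exts.size > 1)
    .map(([stem]) => stem)
    .sort();
}

// Illustrative input: one spec migrated but not deleted, one fully migrated.
const duplicated = findDuplicateSpecs([
  'tests/login.spec.js',
  'tests/login.spec.ts',
  'tests/cart.spec.ts',
]);
console.log(duplicated);
```

Any non-empty result means a test file is present twice and, with a default pattern, will run twice.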


Root cause: one missing line

playwright.config.ts had no explicit testMatch. Playwright's default pattern matches both .js and .ts spec files, so while the old and new versions coexisted it picked up both copies of every test.

The fix was one line:

testMatch: ['**/*.spec.ts']

Getting to that line took a lot longer.
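For context, here is roughly where that line sits in a config file. This is a minimal sketch rather than the project's actual configuration: everything except the testMatch line is an assumption, and a real config would carry more settings.

```typescript
// playwright.config.ts (sketch; only testMatch comes from the article)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Restrict discovery to TypeScript specs so leftover .spec.js files
  // are no longer collected alongside their migrated counterparts.
  testMatch: ['**/*.spec.ts'],
});
```

Pinning the pattern explicitly makes discovery independent of the runner's defaults, which is the property that was missing during the migration.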


What this taught me

CI does not validate correctness. It validates execution.

Green CI only means: whatever ran finished without a failure.

It doesn't mean: the right tests ran, in the right quantity, with the right assumptions about the environment.

In my case, the problem could have been caught with a simple discovered-tests counter in CI: if the count deviates from the expected value, fail the build explicitly instead of staying silent.

That counter is now part of the pipeline. The buggy branch (intentionally broken config) is part of the portfolio — so anyone working through it can reproduce, diagnose, and fix it themselves.
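One way such a counter could look is sketched below. It is not the pipeline's actual implementation. It assumes the runner's listing output (for Playwright, `npx playwright test --list`) ends with a summary line of the form "Total: N tests in M files"; check that against your runner version. The function names and the sample output are hypothetical.

```typescript
// Assumption: the listing output contains a summary line such as
// "Total: 120 tests in 14 files". Returns null if no such line is found.
function parseDiscoveredCount(listOutput: string): number | null {
  const match = /Total:\s+(\d+)\s+tests?\s+in\s+\d+\s+files?/.exec(listOutput);
  return match ? Number(match[1]) : null;
}

// Throws when the discovered count is missing or differs from the expected
// value, so the CI step fails loudly instead of staying green.
function assertTestCount(listOutput: string, expected: number): void {
  const found = parseDiscoveredCount(listOutput);
  if (found === null) {
    throw new Error('Could not find a test count in the runner output');
  }
  if (found !== expected) {
    throw new Error(`Expected ${expected} discovered tests, found ${found}`);
  }
}

// Illustrative output in the assumed format.
const sampleOutput = [
  'Listing tests:',
  '  [chromium] › login.spec.ts:3:5 › logs in',
  'Total: 120 tests in 14 files',
].join('\n');

assertTestCount(sampleOutput, 120); // passes silently
```

An exact match is the strictest choice and means updating the expected number whenever tests are added. A tolerance band would be looser, but a doubled count should still fall well outside it.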


The broader pattern

Most problems in test systems don't show up as failures. They show up as:

  • duplicated execution
  • silent performance degradation
  • runner behaviour changes with no test changes

And none of them have alerts — because we don't design for them.


Failure signature

  • CI green
  • runtime doubled
  • test count doubled
  • zero warnings

The hidden assumption

"I assumed a slower CI run meant normal post-migration overhead. The runner had been doing twice the work for weeks, silently, without a single warning."


Part of the Silent Failures in Test Automation series.

Full project (API + UI + E2E + CI + AI endpoint): GitHub

