Void Stitch

The real cost of flaky CI: a community survey

Flaky CI tests are the developer tax nobody budgeted for.

Google has reported that roughly 1.5% of its test runs produce a flaky result, and the pain compounds quickly in practice. In a suite of 2,000 tests where 5% are flaky, that is 100 tests that can turn a build red for no real reason, so the pipeline gets re-run constantly. Each re-run burns developer time: context switching, reading logs, re-triaging, and deciding "real failure or noise?"
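To see why a small per-test flake rate adds up across a suite, here is a minimal sketch. It reads the 5% figure as the share of tests that are flaky (100 of 2,000) and assumes, purely for illustration, that each flaky test fails spuriously on 1% of runs, independently of the others. The 1% rate and the function name are assumptions, not numbers from any study.

```python
def p_spurious_red(flaky_tests: int, per_test_fail_rate: float) -> float:
    """Chance that at least one flaky test fails in a single CI run,
    assuming each flaky test fails independently at the same rate."""
    return 1 - (1 - per_test_fail_rate) ** flaky_tests


# 5% of a 2,000-test suite is flaky; the 1% per-run failure rate is assumed.
flaky = round(2000 * 0.05)
print(flaky, round(p_spurious_red(flaky, 0.01), 2))  # 100 flaky tests, about 0.63
```

Under those assumptions, well over half of all runs would show at least one failure that has nothing to do with the change under test.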

Conservative math for a 20-person team:

  • 5 flaky tests × 3 reruns/week × 20 engineers checking CI × ~1 minute per check = 300 developer-minutes/week
  • At $75/hr loaded cost: 5 hours = $375/week, or about $18k/year over 48 working weeks, all from 5 flaky tests
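The arithmetic above can be checked with a few lines of Python. Note that the product only comes out to 300 minutes per week if the rerun count is taken as 3 per week and each CI check costs about one minute. Those two readings, the 48 working weeks per year, and the function name are assumptions made here so that the totals match the $375 and $18k figures.

```python
def weekly_cost(flaky_tests, reruns_per_week, engineers, minutes_per_check, hourly_rate):
    """Return (developer-minutes per week, dollars per week) lost to reruns."""
    minutes = flaky_tests * reruns_per_week * engineers * minutes_per_check
    dollars = minutes / 60 * hourly_rate
    return minutes, dollars


minutes, dollars = weekly_cost(5, 3, 20, 1, 75)
yearly = dollars * 48  # assumed 48 working weeks per year
print(minutes, dollars, yearly)  # 300 375.0 18000.0
```

Plugging in your own team's numbers is the quickest way to answer the survey's hours-per-week question. At 3 reruns per day instead of per week, the same formula gives five times the cost.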

The missing piece: which commit broke it?

Most teams today either triage by hand (expensive) or quarantine the test (so it never gets fixed). Neither addresses the root cause.

The real question engineers never answer fast enough: which commit first made this test flaky? That's the commit you need to fix.

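Running git bisect by hand across 50+ commits is painful, because a single passing run proves nothing for a flaky test: every bisect step needs the test repeated many times. Nobody does that proactively, so flaky tests accumulate.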
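As a rough illustration of what automating this involves, the sketch below is a small helper meant to be driven by `git bisect run`, which treats exit code 0 as "good" and 1 as "bad". It reruns a test command several times per commit and marks the commit bad on the first failure. The script name `flaky_bisect.py`, the function names, and the 95% confidence default are hypothetical; this is not the tool described in this post, just one way the idea could be sketched.

```python
#!/usr/bin/env python3
"""Hypothetical helper for `git bisect run`: rerun a flaky test N times per commit."""
import math
import subprocess
import sys

GOOD, BAD = 0, 1  # exit codes that `git bisect run` reads as good / bad


def runs_needed(fail_rate: float, confidence: float = 0.95) -> int:
    """Runs needed to see at least one failure with the given confidence,
    assuming independent runs that each fail with probability fail_rate."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - fail_rate))


def classify_commit(run_once, attempts: int) -> int:
    """Return BAD as soon as one attempt fails, GOOD if every attempt passes."""
    for _ in range(attempts):
        if not run_once():
            return BAD
    return GOOD


def main(argv):
    # argv: script name, number of attempts, then the test command to run
    attempts = int(argv[1])
    test_cmd = argv[2:]

    def run_once():
        return subprocess.run(test_cmd).returncode == 0

    return classify_commit(run_once, attempts)


# Only act when an attempt count and a test command were passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 2:
    sys.exit(main(sys.argv))
```

A possible invocation would be `git bisect start <bad> <good>` followed by `git bisect run python3 flaky_bisect.py 29 <your test command>`. The count of 29 comes from `runs_needed(0.1)`, meaning a test that fails 10% of the time should fail at least once in 29 runs with 95% confidence. A "good" verdict is therefore only probabilistic, and a commit that fails for unrelated reasons (for example, a broken build) would also be marked bad by this sketch.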

Quick survey: what does this actually cost your team?

I'm building a tool that automatically runs the bisect and posts the introducing commit directly on the PR — no manual investigation. Before setting pricing, I want to understand the real costs teams face and willingness to pay for automation.

6 questions, ~3 minutes: Take the survey →

Key questions covered:

  • How many hours per week does your team lose to flaky CI?
  • What tools (Trunk, Buildkite, Datadog, BuildPulse) are you currently using?
  • A Van Westendorp price-sensitivity check (anchored at $10–$50/committer/month)
  • Would a 14-day free trial get you in the door?

I'll share the results publicly once the survey reaches 30 responses. If you'd like to receive the summary, leave your email in the survey.

What's your team's experience with flaky CI? Curious whether the git bisect automation angle resonates or if the pain is more in the detection phase.