Flaky tests are the silent tax on every engineering team.
You run your CI suite. Three tests fail. You re-run. They pass. Your engineers shrug and move on — until the next time. And the next. Each re-run is 10–20 minutes of CI time and at least one context switch. Multiply by every engineer, every week.
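To make the "multiply by every engineer, every week" arithmetic concrete, here is a back-of-the-envelope sketch. The function name, the team size, and the re-run frequency are illustrative assumptions, not survey data; only the 10–20 minute re-run range comes from the paragraph above.

```python
def weekly_rerun_hours(engineers: int,
                       reruns_per_engineer_per_week: float,
                       minutes_per_rerun: float) -> float:
    """Estimate CI time spent on re-runs per week, in hours.

    Counts only the CI minutes of each re-run; the cost of context
    switching is not modeled here.
    """
    total_minutes = engineers * reruns_per_engineer_per_week * minutes_per_rerun
    return total_minutes / 60


# Hypothetical team: 10 engineers, 3 flaky re-runs each per week,
# 15 minutes per re-run (midpoint of the 10-20 minute range).
print(weekly_rerun_hours(10, 3, 15))  # 7.5
```

Plugging in your own team's numbers gives a rough lower bound, since investigation time and lost focus come on top.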
The research on this is surprisingly thin. Most tooling vendors cite "industry averages", but there's little real data on what teams actually lose or what they'd pay to get it back.
I'm running a quick survey: 6 questions, ~3 minutes
I'm building a tool that identifies the exact commit that introduced a flaky test (automated git bisect, posted as a PR comment). Before locking in pricing, I want to understand real engineering workflows.
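For readers unfamiliar with the idea, bisecting for a flaky test can be sketched roughly as below. This is a minimal illustration, not the tool's actual implementation: `first_flaky_commit`, `run_test`, and `runs_per_commit` are hypothetical names, and the sketch assumes that commits are ordered oldest to newest and that every commit after the introducing one is also flaky.

```python
from typing import Callable, Optional, Sequence


def first_flaky_commit(commits: Sequence[str],
                       run_test: Callable[[str], bool],
                       runs_per_commit: int = 20) -> Optional[str]:
    """Binary-search for the earliest commit where the test fails at least once.

    run_test(commit) is a caller-supplied function that checks out the
    commit, runs the test once, and returns True on pass.
    """
    def is_flaky(commit: str) -> bool:
        # A flake may pass on any single run, so repeat the test and
        # treat one failure as evidence of flakiness.
        return any(not run_test(commit) for _ in range(runs_per_commit))

    if not commits or not is_flaky(commits[-1]):
        return None  # nothing to bisect: the newest commit looks stable

    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is flaky
    while lo < hi:
        mid = (lo + hi) // 2
        if is_flaky(commits[mid]):
            hi = mid       # flake already present at mid
        else:
            lo = mid + 1   # flake introduced after mid
    return commits[lo]
```

Note that the result is probabilistic: a commit that passes `runs_per_commit` times may still be flaky at a lower failure rate, so a higher repeat count trades CI time for confidence.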
Please answer in the comments — or email culprit@megaloop.app if you prefer a private response.
Q1: What best describes your role?
- Software Engineer / Senior Engineer
- Staff Engineer / Principal Engineer
- Engineering Manager / Director of Engineering
- VP of Engineering / CTO / Head of Engineering
- Other
Q2: How many engineers regularly push code through your main CI pipeline each week?
- 1–3
- 4–10
- 11–25
- 26–50
- 51+
Q3: Roughly how many hours per week does your team collectively lose to flaky CI tests (reruns, investigations, context-switching)?
- <1 hour
- 1–3 hours
- 4–8 hours
- 9–15 hours
- More than 15 hours
Q4: Which tools do you currently use to detect or manage flaky tests? (mention all that apply)
- Trunk
- Buildkite Test Engine
- Datadog CI Visibility
- BuildPulse
- Homegrown solution
- None (we handle it manually)
- Other
Q5: Imagine a tool that automatically identifies the exact commit that first made a test flaky and posts it as a PR comment — no manual git bisect. At what monthly price per committer would it feel:
- (a) Too cheap to trust the quality?
- (b) A fair bargain?
- (c) Getting expensive, but you'd still consider it?
- (d) Too expensive — you'd walk away?
Reference anchors: $10 / $18 / $30 / $50 per committer/month — feel free to name your own numbers.
Q6: If this tool offered a 14-day free trial (no credit card required), how likely would you be to sign up?
- Very likely
- Somewhat likely
- Neutral
- Unlikely
- Very unlikely
Why this matters
Most teams don't have a clear number for the cost of their flaky tests — they just know it's annoying. I'm trying to put a real number on it, and understand whether the "identify the introducing commit" feature is genuinely worth paying for, or whether teams are happy quarantining flakes and moving on.
Three minutes of your time helps a lot. Happy to share the aggregate results once we hit 20+ responses.
Building something in this space? Reach out at culprit@megaloop.app.