Why Flaky Tests Are Worse Than You Think
Flaky tests aren’t just annoying. They’re destructive. They break trust in your CI pipeline, slow down engineering teams, and hide real bugs under the noise of random failures.
The worst part? Developers start ignoring all test failures, assuming they’re “just flaky.” At that point, your test suite is worse than useless — it’s lying to you.
This guide is about fixing flakiness at the root. Not band-aiding it. Not retrying endlessly. Actually understanding why it happens and how to prevent it — from local dev to cloud CI.
What Causes Flaky Tests (And What to Do About It)
Flakiness sneaks into every layer of testing, but it wears different disguises depending on what you’re testing.
UI Testing
When you test user interfaces, you’re testing against the most asynchronous, unpredictable layer of your stack.
Imagine this: Your test navigates to a page and clicks “Submit” — but the button’s disabled for 200ms after load to allow animations. Sometimes the click happens too soon, sometimes not. Your test randomly fails.
Code Example: Stable Clicking
```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.time.Duration;

import org.junit.jupiter.api.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class FormTest {

    @Test
    public void submitButtonShouldBeClickable() {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://example.com/form");

            // Explicit wait: proceed only once the button is actually clickable
            WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
            WebElement submitButton = wait.until(
                ExpectedConditions.elementToBeClickable(By.id("submit-button"))
            );
            submitButton.click();

            // Wait for navigation to finish before asserting on the URL
            wait.until(ExpectedConditions.urlToBe("https://example.com/success"));
            assertEquals("https://example.com/success", driver.getCurrentUrl());
        } finally {
            driver.quit(); // always release the browser, even on failure
        }
    }
}
```
Notice: no fixed sleeps. Only event-based waits. That’s how you remove race conditions.
API Testing (REST and GraphQL)
APIs are supposed to be deterministic — but when tests hit real servers, all bets are off.
Network spikes, caching issues, or asynchronous database replication can cause tests to fail randomly.
Code Example: Mocked REST API Test
```java
import static com.github.tomakehurst.wiremock.client.WireMock.*;
import static io.restassured.RestAssured.given;
import static org.junit.jupiter.api.Assertions.assertEquals;

import com.github.tomakehurst.wiremock.WireMockServer;
import io.restassured.response.Response;

WireMockServer wireMockServer = new WireMockServer(8080);
wireMockServer.start();
try {
    // Stub the endpoint so the "server" behaves identically on every run
    stubFor(post(urlEqualTo("/users"))
        .willReturn(aResponse()
            .withStatus(201)
            .withBody("{\"id\":\"123\", \"name\":\"Alice\"}")));

    Response response = given()
        .baseUri("http://localhost:8080")
        .contentType("application/json")
        .body("{\"name\":\"Alice\"}")
        .post("/users");

    response.then().statusCode(201);
    assertEquals("Alice", response.jsonPath().getString("name"));
} finally {
    wireMockServer.stop(); // free the port even if an assertion fails
}
```
Here, you control every byte of the server response. Zero external dependencies. Zero flakiness.
Performance Testing
Performance is inherently noisy — but that doesn’t mean your tests have to be flaky.
Imagine your load tests show 1s latency on Monday and 3s latency on Wednesday — but no code has changed. That’s not a flaky test — that’s a flaky environment.
To fix it, performance tests must run in controlled conditions.
Key Tip: Always run performance tests multiple times and report the median and 95th percentile, not the average. Averages lie when there's noise.
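The tip above can be sketched in plain Java. `LatencyStats` and its nearest-rank percentile helper are illustrative names for this article, not a standard API:

```java
import java.util.Arrays;

// Illustrative helper: summarize latency samples with the median and p95
// instead of the mean, which a single outlier can drag arbitrarily high.
public class LatencyStats {

    // Nearest-rank percentile: p in [0, 100], computed over a sorted copy.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        // Nine healthy runs plus one 10x outlier, e.g. a noisy CI node.
        double[] latenciesMs = {210, 205, 198, 202, 215, 207, 201, 199, 204, 2100};

        double mean = Arrays.stream(latenciesMs).average().orElse(0);
        double median = percentile(latenciesMs, 50);
        double p95 = percentile(latenciesMs, 95);

        // One bad run nearly doubles the mean; the median stays honest,
        // and p95 still surfaces the tail explicitly.
        System.out.printf("mean=%.0f median=%.0f p95=%.0f%n", mean, median, p95);
        // prints: mean=394 median=204 p95=2100
    }
}
```

Note how the three numbers tell different stories: the median says "typical runs are fine," p95 says "something in the tail is wrong," and the mean blurs both into one misleading figure.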
Security Testing
Security tests like fuzzing or vulnerability scanning can become flaky if you don’t control randomness.
If your fuzzing engine seeds randomly every run, you’ll get different results, making failures look random.
Solution:
- Fix random seeds in fuzzers
- Run against snapshots of systems
- Log all input payloads for reproducibility
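As a minimal sketch of the first two points (`SeededFuzzer` is a hypothetical class, not a real fuzzing library), fixing the RNG seed makes every run generate the exact same payload sequence, so a failure found on payload #3 can always be replayed:

```java
import java.util.Random;

// Hypothetical deterministic payload generator: a fixed seed guarantees
// every run produces the same sequence, so any failing input can be
// reproduced from its index in the logs.
public class SeededFuzzer {
    private static final String ALPHABET = "abcABC019<>\"'%{};";
    private final Random rng;

    public SeededFuzzer(long seed) {
        this.rng = new Random(seed); // fixed seed => reproducible runs
    }

    public String nextPayload(int maxLen) {
        int len = 1 + rng.nextInt(maxLen);
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            sb.append(ALPHABET.charAt(rng.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SeededFuzzer run1 = new SeededFuzzer(42L);
        SeededFuzzer run2 = new SeededFuzzer(42L);
        for (int i = 0; i < 5; i++) {
            String payload = run1.nextPayload(16);
            // Log every payload so a failing input is recoverable from the log
            System.out.println("payload[" + i + "] = " + payload);
            if (!payload.equals(run2.nextPayload(16))) {
                throw new AssertionError("same seed must produce same payloads");
            }
        }
        System.out.println("two runs with seed 42 produced identical payloads");
    }
}
```

Most real fuzzing engines expose an equivalent seed flag; the point is to set it explicitly in CI rather than letting each run pick its own.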
Chaos Engineering
Chaos tests intentionally cause system failures. But without tight control, they make tests untrustworthy instead of resilient.
The goal is targeted chaos, not blind chaos.
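Here is a minimal sketch of what "targeted" means in code (all class and dependency names are hypothetical): faults are injected only into explicitly listed dependencies, and a seeded RNG makes even the chaos replayable:

```java
import java.util.Random;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical fault-injection wrapper: failures hit only the dependencies
// you name, and the seeded RNG makes a chaotic run reproducible.
public class TargetedChaos {
    private final Set<String> targets;  // only these dependencies may fail
    private final double failureRate;   // probability of an injected failure
    private final Random rng;           // seeded for replayable chaos

    public TargetedChaos(Set<String> targets, double failureRate, long seed) {
        this.targets = targets;
        this.failureRate = failureRate;
        this.rng = new Random(seed);
    }

    public <T> T call(String dependency, Supplier<T> realCall) {
        if (targets.contains(dependency) && rng.nextDouble() < failureRate) {
            throw new RuntimeException("chaos: injected failure in " + dependency);
        }
        return realCall.get(); // untargeted dependencies are never touched
    }

    public static void main(String[] args) {
        TargetedChaos chaos = new TargetedChaos(Set.of("payments"), 0.5, 42L);
        int injected = 0;
        for (int i = 0; i < 10; i++) {
            try {
                chaos.call("payments", () -> "ok"); // targeted: may fail
            } catch (RuntimeException e) {
                injected++;
            }
            chaos.call("catalog", () -> "ok"); // never targeted: always succeeds
        }
        System.out.println("injected failures: " + injected + "/10 payments calls");
    }
}
```

Dedicated tools (Chaos Monkey, Gremlin, Litmus and friends) operate at the infrastructure level, but the same two controls apply: an explicit blast radius and a reproducible schedule of faults.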
Why Environments Matter More Than Your Test Code
99% of articles about flaky tests focus only on “bad test code.”
Reality: environment stability is just as important.
Tests that hit unstable environments are doomed from the start.
Cloud Computing Considerations
- Use ephemeral test environments spun up per PR (e.g., with Terraform)
- Create immutable infrastructure — never “fix” a test env by hand
- Control resource auto-scaling and instance types for tests
- Snapshot entire DBs or services pre-test to ensure known states
Observability in Testing
Good engineers log and monitor not just their app — but their tests too.
✅ Track test start/end times
✅ Monitor resource usage during tests
✅ Correlate test failures with infrastructure anomalies
Tools to Add Observability:
- Grafana dashboards from test results
- Prometheus metrics from CI pipelines
- OpenTelemetry traces through tests
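To make the idea concrete, here is an illustrative sketch in plain Java (the `TestMetrics` harness is hypothetical; in practice you would use a JUnit 5 extension or CI plugin and export the numbers to the tools above). It records each test's duration and outcome so failures can later be correlated with infrastructure anomalies:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical test-observability harness: record each test's wall-clock
// duration and pass/fail outcome. A real setup would ship these metrics
// to Prometheus/Grafana instead of keeping them in memory.
public class TestMetrics {
    public record Result(boolean passed, long durationMs) {}

    private final Map<String, Result> results = new LinkedHashMap<>();

    public void run(String name, Runnable test) {
        long start = System.nanoTime();
        boolean passed = true;
        try {
            test.run();
        } catch (AssertionError | RuntimeException e) {
            passed = false; // record the failure instead of hiding it
        }
        long ms = (System.nanoTime() - start) / 1_000_000;
        results.put(name, new Result(passed, ms));
    }

    public Map<String, Result> results() { return results; }

    public static void main(String[] args) {
        TestMetrics metrics = new TestMetrics();
        metrics.run("fastTest", () -> {});
        metrics.run("failingTest", () -> { throw new AssertionError("boom"); });
        metrics.results().forEach((name, r) ->
            System.out.printf("[test-metrics] %s passed=%b duration=%dms%n",
                name, r.passed(), r.durationMs()));
    }
}
```

Once durations and outcomes are captured per test, a flaky test stops being an anecdote ("it fails sometimes") and becomes a queryable time series ("it fails on the small CI runners between 2am and 4am").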
Stable Testing Pyramid: Best Practice Design
A rough distribution for a stable suite follows the classic testing pyramid: a broad base of fast, isolated unit tests (roughly 70%), a middle layer of integration and API tests (roughly 20%), and a thin top of end-to-end UI tests (roughly 10%). The slowest, most flakiness-prone layer is deliberately kept the smallest.
Industry Case Studies
Tech giants have published openly on how they tackle flakiness at scale. Google, for example, has reported that roughly one in six of its tests exhibits some level of flakiness, and tracks offenders so they can be quarantined and fixed; Microsoft and Facebook have likewise published research and tooling for detecting and root-causing flaky tests.
Quick Pro-Tips
- Never Retry Blindly: Retrying flaky tests without understanding the cause just masks real problems.
- Build Test Observability First: Know exactly where your tests fail, not just that they failed.
- Cloud is Your Friend (if used right): Use ephemeral cloud environments spun up per PR, and tear them down afterward.
- Always Prefer Mocks for External Services: You don't control Google's API or AWS outages. Mock them aggressively.
- Prioritize Test Stability as a Feature: Test stability is not “extra work.” It’s a product quality feature.
Final Thoughts: Flaky Tests Are a Systemic Issue, Not a “Test Code” Issue
Flaky tests point to flaky systems:
- Fragile environments
- Bad assumptions about timing
- Poor infrastructure control
- Missing observability
Fixing flaky tests makes your product better, your systems more resilient, and your team much faster.
If you’re passionate about software testing, infrastructure, and creating high-quality solutions in a dynamic, knowledge-sharing environment, we invite you to explore our job opportunities at Agile Actors. Here, your personal growth is a key part of our collective development, especially as we tackle the ever-evolving challenges in software engineering and testing. Join us and be a part of our journey to shape the future of testing and infrastructure excellence.