DEV Community

Dhiraj Das

Posted on • Originally published at dhirajdas.dev

Why Your Selenium Tests Are Flaky (And How to Fix Them Forever)

🎯

What This Article Covers

  • The Flakiness Problem: Why time.sleep() and WebDriverWait aren't enough
  • What Causes Flaky Tests: Racing against UI state changes
  • The Stability Solution: Monitoring DOM, network, animations, and layout shifts
  • One-Line Integration: Wrap your driver with stabilize(), with zero test rewrites
  • Full Diagnostics: Know exactly why tests are blocked

If you've worked with Selenium for more than a week, you've written code like this:

driver.get("https://myapp.com/dashboard")
time.sleep(2)  # Wait for page to load
driver.find_element(By.ID, "submit-btn").click()
time.sleep(1)  # Wait for AJAX

And you've felt the shame of knowing it's wrong, but also the relief of "it works." Until it doesn't. Until the CI server is 10% slower than your machine, and suddenly your tests fail 20% of the time.

This is the story of flaky tests, why they happen, and how I built a library called waitless to eliminate them.

⚠️

The Flakiness Problem

Let me show you a real scenario. You have a React dashboard. User clicks a button. The button triggers an API call. The API returns data. React re-renders the component. A spinner disappears. A table appears.

This entire sequence takes maybe 400ms. But your test does this:

button = driver.find_element(By.ID, "load-data")
button.click()
table = driver.find_element(By.ID, "data-table")  # 💥 BOOM

The table doesn't exist yet. React is still fetching. Selenium throws NoSuchElementException.

So you "fix" it:

button.click()
time.sleep(2)
table = driver.find_element(By.ID, "data-table")  # Works... usually

The Problem with time.sleep()


Congratulations. You've just made your test (1) 2 seconds slower than necessary, (2) still flaky when the API takes 2.5 seconds, and (3) impossible to debug when it fails.

❌

Why Traditional Solutions Don't Work

time.sleep(): The Naive Approach

Sleep for a fixed duration and hope the UI is ready. Problems: Too short → test fails. Too long → test suite takes forever. No feedback on what's actually happening.
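To make the contrast concrete, here is a minimal sketch of condition polling, the idea that explicit waits build on. `wait_until` is a hypothetical helper written for this example (it is not part of Selenium or waitless), and the "UI" is simulated with a timestamp so the snippet runs without a browser.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


# Simulated UI: the "table" becomes available about 0.2s after the click.
ready_at = time.monotonic() + 0.2
start = time.monotonic()
wait_until(lambda: time.monotonic() >= ready_at, timeout=2.0)
elapsed = time.monotonic() - start  # close to 0.2s, not a fixed 2s
print(f"waited {elapsed:.2f}s")
```

Unlike a fixed sleep, the wait ends as soon as the condition holds, and a timeout produces an explicit error instead of a confusing failure further down the test.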

WebDriverWait: The "Correct" Approach

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for this one element to become clickable
WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, "submit-btn"))
)

This is better. You're waiting for a specific condition. But here's the dirty secret: it only checks one element.

  • What about the modal that's still animating into view?
  • What about the AJAX request that hasn't finished?
  • What about the React re-render that's about to move your button?

WebDriverWait says "the button is clickable." Reality says "there's an invisible overlay from an animation that will intercept your click."

Retry Decorators: The Denial Approach

@retry(tries=3, delay=1)
def test_dashboard():
    driver.find_element(By.ID, "submit-btn").click()

This is the equivalent of saying "I know my code is broken, but if I run it enough times, it'll eventually work." Retries don't fix flakiness. They hide it.

🔍

What Actually Causes Flaky Tests?

After debugging hundreds of flaky tests, I found they all come down to racing against the UI:

| What You Do              | What's Actually Happening           |
|--------------------------|-------------------------------------|
| Click a button           | DOM is being mutated by framework   |
| Assert text content      | AJAX response still in flight       |
| Interact with modal      | CSS transition still animating      |
| Click navigation link    | Layout shift moves element          |

The Real Question

The question isn't "is this element clickable?" The question is: "Is the entire page stable and ready for interaction?" That's what I set out to answer with waitless.

✨

Defining "Stability"

What does it mean for a UI to be "stable"? I identified four key signals:

1. DOM Stability

The DOM structure has stopped changing. No elements being added, removed, or modified. How to detect: MutationObserver watching the document root. Track time since last mutation.

2. Network Idle

All AJAX requests have completed. No pending API calls. How to detect: Intercept fetch() and XMLHttpRequest. Count pending requests.

3. Animation Complete

All CSS animations and transitions have finished. How to detect: Listen for animationstart, animationend, transitionstart, transitionend events.

4. Layout Stable

Elements have stopped moving. No more layout shifts. How to detect: Track bounding box positions of interactive elements. Compare over time.
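As a rough illustration of how the four signals combine, here is a small Python sketch. `PageSnapshot` and `blocking_signals` are hypothetical names invented for this example, not waitless internals. The 100 ms quiet window is borrowed from the instrumentation snippet later in this article, and the snapshot values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class PageSnapshot:
    """Hypothetical snapshot of browser state, as injected JS might report it."""
    ms_since_last_mutation: float
    pending_requests: int
    active_animations: int
    # element id -> (x, y, width, height)
    rects: dict = field(default_factory=dict)


def blocking_signals(prev, curr, quiet_ms=100):
    """Return the names of the signals that are not yet satisfied."""
    blockers = []
    if curr.ms_since_last_mutation < quiet_ms:
        blockers.append("dom")
    if curr.pending_requests > 0:
        blockers.append("network")
    if curr.active_animations > 0:
        blockers.append("animation")
    # Layout needs two snapshots: any moved or resized element is a shift.
    if prev is not None and prev.rects != curr.rects:
        blockers.append("layout")
    return blockers


before = PageSnapshot(20, 1, 0, {"submit-btn": (10, 200, 80, 32)})
after = PageSnapshot(450, 0, 0, {"submit-btn": (10, 240, 80, 32)})
settled = PageSnapshot(900, 0, 0, {"submit-btn": (10, 240, 80, 32)})

print(blocking_signals(None, before))    # ['dom', 'network']
print(blocking_signals(before, after))   # ['layout']
print(blocking_signals(after, settled))  # []
```

The page counts as stable only when the list is empty. Returning the blockers rather than a bare boolean also makes it easy to report why a wait is stuck.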

🏗️

The Architecture

Waitless has two parts:

JavaScript Instrumentation (runs in browser)

window.__waitless__ = {
    pendingRequests: 0,
    lastMutationTime: Date.now(),
    activeAnimations: 0,

    isStable() {
        if (this.pendingRequests > 0) return false;
        if (Date.now() - this.lastMutationTime < 100) return false;
        return true;
    }
};

This script is injected into the page via execute_script(). It tracks pending requests, DOM mutations, and animations from inside the browser.

Python Engine (evaluates stability)

class StabilizationEngine:
    def wait_for_stability(self):
        """Waits until all stability signals are satisfied."""
        # Checks performed automatically:
        # ✓ DOM mutations have settled
        # ✓ Network requests completed
        # ✓ Animations finished
        # ✓ Layout is stable

The Python engine continuously evaluates browser state until all configured stability signals indicate the page is ready for interaction.
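One way such an evaluation loop could look is sketched below. This is an assumption about the general shape, not the library's actual code: `PollingEngine` and `FakeDriver` are hypothetical names, and the fake driver stands in for a real WebDriver so the example runs without a browser. The only driver call used is `execute_script()`, and the script string refers to the `window.__waitless__.isStable()` function shown above.

```python
import time


class FakeDriver:
    """Stand-in for a WebDriver: reports 'unstable' for the first few polls."""

    def __init__(self, unstable_polls):
        self.remaining = unstable_polls
        self.calls = 0

    def execute_script(self, script):
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            return False
        return True


class PollingEngine:
    """Hypothetical engine: polls the in-page isStable() until it holds."""

    def __init__(self, driver, timeout=10.0, interval=0.05):
        self.driver = driver
        self.timeout = timeout
        self.interval = interval

    def wait_for_stability(self):
        deadline = time.monotonic() + self.timeout
        while True:
            if self.driver.execute_script("return window.__waitless__.isStable();"):
                return
            if time.monotonic() >= deadline:
                raise TimeoutError(f"page not stable after {self.timeout}s")
            time.sleep(self.interval)


driver = FakeDriver(unstable_polls=3)
PollingEngine(driver, timeout=2.0, interval=0.01).wait_for_stability()
print(driver.calls)  # 4: three unstable polls, then one stable
```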

🪄

The Magic: One-Line Integration

The key design goal was zero test modifications. Adding stability detection should require changing ONE line:

from selenium import webdriver
from selenium.webdriver.common.by import By
from waitless import stabilize

driver = webdriver.Chrome()
driver = stabilize(driver)  # ← This is the only change

# All your existing tests work as-is
driver.find_element(By.ID, "button").click()  # Now auto-waits!

How does this work? The stabilize() function wraps the driver in a StabilizedWebDriver that intercepts find_element() calls. Retrieved elements are wrapped in StabilizedWebElement. When you call .click(), it first waits for stability, then clicks.

class StabilizedWebElement:
    def click(self):
        self._engine.wait_for_stability()  # Auto-wait!
        return self._element.click()  # Then click

Zero Rewrites Required

Your tests don't know they're waiting. They just... stop failing.
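The wrapping idea can be sketched with plain Python delegation. Everything here is illustrative: `StabilizedDriver`, `StabilizedElement`, and the fake classes are hypothetical stand-ins so the snippet runs without Selenium, and the set of methods that trigger a wait is an assumption rather than the library's real list.

```python
class StabilizedElement:
    """Wraps an element and waits for stability before interacting with it."""

    _WAIT_BEFORE = {"click", "send_keys", "clear", "submit"}  # assumed set

    def __init__(self, element, engine):
        self._element = element
        self._engine = engine

    def __getattr__(self, name):
        # Only called for attributes not found on the wrapper itself.
        attr = getattr(self._element, name)
        if name in self._WAIT_BEFORE and callable(attr):
            def waited(*args, **kwargs):
                self._engine.wait_for_stability()
                return attr(*args, **kwargs)
            return waited
        return attr


class StabilizedDriver:
    """Wraps a driver so that located elements come back wrapped."""

    def __init__(self, driver, engine):
        self._driver = driver
        self._engine = engine

    def find_element(self, *args, **kwargs):
        found = self._driver.find_element(*args, **kwargs)
        return StabilizedElement(found, self._engine)

    def __getattr__(self, name):
        return getattr(self._driver, name)  # everything else passes through


# Fakes that record the order of calls.
log = []


class FakeEngine:
    def wait_for_stability(self):
        log.append("wait")


class FakeElement:
    text = "Submit"

    def click(self):
        log.append("click")


class FakeDriver:
    title = "Dashboard"

    def find_element(self, by, value):
        return FakeElement()


driver = StabilizedDriver(FakeDriver(), FakeEngine())
driver.find_element("id", "submit-btn").click()
print(log)           # ['wait', 'click']
print(driver.title)  # Dashboard
```

Because unknown attributes are forwarded to the wrapped object, existing test code keeps calling the same methods and only the interaction methods gain a wait.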

🔧

Handling Edge Cases

Real apps aren't simple. Here's how waitless handles the messy reality:

Problem: Infinite Animations

Some apps have spinners that rotate forever. Analytics scripts that poll constantly. WebSocket heartbeats that never stop.

Solution: Configurable thresholds

from waitless import StabilizationConfig

config = StabilizationConfig(
    network_idle_threshold=2,  # Allow 2 pending requests
    animation_detection=False,  # Ignore spinners
    strictness='relaxed'        # Only check DOM mutations
)

driver = stabilize(driver, config=config)
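To show how thresholds like these could relax the checks, here is a standalone sketch. `SketchConfig` and `is_stable` are hypothetical: the field names echo the options above, but the behavior is assumed for illustration and is not taken from the waitless source.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical config: names echo the article, behavior is assumed."""
    network_idle_threshold: int = 0   # pending requests tolerated
    animation_detection: bool = True  # whether animations block stability
    dom_quiet_ms: int = 100           # required quiet time since last mutation


def is_stable(state, config):
    if state["ms_since_last_mutation"] < config.dom_quiet_ms:
        return False
    if state["pending_requests"] > config.network_idle_threshold:
        return False
    if config.animation_detection and state["active_animations"] > 0:
        return False
    return True


# A page with a forever-spinning loader and two polling analytics calls.
state = {"ms_since_last_mutation": 800, "pending_requests": 2, "active_animations": 1}

strict = SketchConfig()
relaxed = SketchConfig(network_idle_threshold=2, animation_detection=False)

print(is_stable(state, strict))   # False
print(is_stable(state, relaxed))  # True
```

With strict defaults this page would never be considered stable. Tolerating two background requests and ignoring animations lets the wait finish once the DOM goes quiet.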

Problem: Navigation Destroys Instrumentation

Full page navigations replace the document, and single-page apps may rebuild the DOM on route changes. Either way, the injected JavaScript can disappear.

Solution: Re-validation before each wait

def wait_for_stability(self):
    if not self._is_instrumentation_alive():
        self._inject_instrumentation()  # Re-inject if gone
    # Then wait...
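The re-validation step above could be sketched like this. `ensure_instrumented`, the two script strings, and `FakeDriver` are hypothetical names for this example. The liveness check assumes the instrumentation lives on `window.__waitless__`, as in the earlier snippet, and the fake driver simulates navigation wiping page state.

```python
# Placeholder for the real instrumentation script.
INSTRUMENTATION_JS = "window.__waitless__ = window.__waitless__ || {};"
ALIVE_CHECK_JS = "return typeof window.__waitless__ !== 'undefined';"


def ensure_instrumented(driver):
    """Re-inject the script if the page lost it. Returns True if it injected."""
    if driver.execute_script(ALIVE_CHECK_JS):
        return False
    driver.execute_script(INSTRUMENTATION_JS)
    return True


class FakeDriver:
    """Simulates a browser whose page state is wiped by navigation."""

    def __init__(self):
        self.instrumented = False

    def get(self, url):
        self.instrumented = False  # a new document drops injected globals

    def execute_script(self, script):
        if script == ALIVE_CHECK_JS:
            return self.instrumented
        self.instrumented = True
        return None


driver = FakeDriver()
driver.get("https://myapp.com/dashboard")
print(ensure_instrumented(driver))  # True: injected after navigation
print(ensure_instrumented(driver))  # False: already present
driver.get("https://myapp.com/settings")
print(ensure_instrumented(driver))  # True: injected again
```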

📊

Diagnostics: The Secret Weapon

When tests still fail, understanding why is half the battle. Waitless includes a diagnostic system that explains exactly what's blocking stability:

╔═════════════════════════════════════════════════════════════╗
║              WAITLESS STABILITY REPORT                      ║
╠═════════════════════════════════════════════════════════════╣
║ Timeout: 10.0s                                              ║
║                                                             ║
║ BLOCKING FACTORS:                                           ║
║   ⚠ NETWORK: 2 request(s) still pending                    ║
║   → GET /api/users (started 2.3s ago)                       ║
║   → POST /analytics (started 1.1s ago)                      ║
║                                                             ║
║   ⚠ ANIMATIONS: 1 active animation(s)                      ║
║   → .spinner { animation: rotate 1s infinite }              ║
║                                                             ║
╠═════════════════════════════════════════════════════════════╣
║ SUGGESTIONS:                                                ║
║   1. /api/users is slow. Consider mocking in tests.         ║
║   2. Spinner has infinite animation. Set                    ║
║      animation_detection=False                              ║
╚═════════════════════════════════════════════════════════════╝

This isn't just "test failed." It's "test failed because your analytics endpoint is slow, and here's exactly how to fix it."
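For a sense of how a report like this might be assembled, here is a simplified sketch that turns a state snapshot into a list of blocking factors. The `blocking_report` function and the shape of the `state` dictionary are assumptions made for this example. The real report format and data model may differ.

```python
def blocking_report(state, timeout):
    """Build a plain-text list of what is still blocking stability."""
    lines = [f"Timeout: {timeout:.1f}s", "BLOCKING FACTORS:"]

    requests = state.get("pending_requests", [])
    if requests:
        lines.append(f"  NETWORK: {len(requests)} request(s) still pending")
        for req in requests:
            lines.append(
                f"    -> {req['method']} {req['url']} (started {req['age_s']:.1f}s ago)"
            )

    animations = state.get("active_animations", [])
    if animations:
        lines.append(f"  ANIMATIONS: {len(animations)} active animation(s)")
        for selector in animations:
            lines.append(f"    -> {selector}")

    if not requests and not animations:
        lines.append("  (none)")
    return "\n".join(lines)


state = {
    "pending_requests": [
        {"method": "GET", "url": "/api/users", "age_s": 2.3},
        {"method": "POST", "url": "/analytics", "age_s": 1.1},
    ],
    "active_animations": [".spinner"],
}
report = blocking_report(state, timeout=10.0)
print(report)
```

The useful part is less the formatting than the data behind it: keeping the URL and start time of each pending request is what lets a timeout message name the slow endpoint.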

📈

The Results

Here's what changes when you adopt waitless:

Before

driver.get("https://myapp.com")
time.sleep(2)
WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, "login-btn"))
)
driver.find_element(By.ID, "login-btn").click()
time.sleep(1)
driver.find_element(By.ID, "username").send_keys("user")

After

driver = stabilize(driver)
driver.get("https://myapp.com")
driver.find_element(By.ID, "login-btn").click()
driver.find_element(By.ID, "username").send_keys("user")

| Metric             | Before              | After                 |
|--------------------|---------------------|-----------------------|
| Lines of wait code | 4+ per test         | 1 total               |
| Arbitrary delays   | 3+ seconds          | 0                     |
| Flaky failures     | Common              | Rare                  |
| Debug information  | "Element not found" | Full stability report |

🎭

Why Not Just Use Playwright?

Playwright has auto-waiting built in. It's great! But:

  • Migration cost: You have 10,000 Selenium tests. Rewriting isn't an option.
  • Framework lock-in: Playwright auto-wait is Playwright-only.
  • Different approach: Playwright waits for element actionability. Waitless waits for page-wide stability.

The Best of Both Worlds

Waitless gives Selenium users the reliability of Playwright without the rewrite.

⚠️

Current Limitations (v0.2.0)

Being honest about what doesn't work yet:

  • Selenium only: Playwright integration planned for v1
  • Sync only: No async/await support
  • Main frame only: iframes not monitored
  • No Shadow DOM: MutationObserver can't see shadow roots
  • Chrome-focused: Tested primarily on Chromium

These will be addressed in future versions. Contributions welcome!

🚀

Try It Yourself

pip install waitless

from selenium import webdriver
from selenium.webdriver.common.by import By
from waitless import stabilize

driver = webdriver.Chrome()
driver = stabilize(driver)

# Your tests are now stable
driver.get("https://your-app.com")
driver.find_element(By.ID, "button").click()

One line. Zero test rewrites. Far fewer flaky failures.

✅

Conclusion

Flaky tests are a symptom of racing against UI state. The solution isn't longer sleeps or more retries. It's understanding when the UI is truly stable.

Waitless monitors DOM mutations, network requests, animations, and layout shifts to answer one question: "Is this page ready for interaction?"

The Bottom Line

Your tests should be deterministic. Your CI should be green. And you should never write time.sleep() again.

Built by Dhiraj Das

Automation Architect. Making Selenium tests deterministic, one at a time.
