Pratik Patel
A Simple Guide to Fixing Flaky Playwright Tests

If you're new to Playwright or test automation, you've probably heard about "flaky tests." Maybe you're even dealing with them now. Simply put, these are tests that pass sometimes and fail other times, even though nothing in your code changed.

Flaky tests are frustrating for everyone. Experienced QAs hate them. New learners get confused by them. Developers lose trust in automation because of them.

The good news? Flaky tests aren't mysterious. They follow predictable patterns, and there's a clear path to fixing them.

What Makes Tests Flaky?

Before we fix anything, let's understand what actually causes flakiness. It's usually one of three things:

  1. Your test moves faster than your app. Imagine clicking a button before it's fully loaded. In real life, humans wait instinctively. In automated tests, the script doesn't wait unless you tell it to.

  2. Your test relies on things that change. If your test looks for a button by its CSS class name, and developers rename that class during a redesign, your test breaks. The button still works for users, but your test can't find it.

  3. Your tests interfere with each other. One test logs in a user. The next test assumes no one is logged in. They step on each other's toes when running together.

The solution? Address each of these causes systematically. That's exactly what the three levels below do.

Want this framework as a printable checklist? Download the complete Playwright Automation Checklist

Level 1: Write Stable Test Code

This is where most problems start, and luckily, it's also the easiest to fix.

Use smart waiting, not fixed delays

When beginners write tests, they often add delays like this:

```typescript
await page.locator('#submit-button').click();
await page.waitForTimeout(3000); // Wait 3 seconds
await expect(page.locator('#success')).toBeVisible();
```

The problem: 3 seconds might be too long (wasted time) or too short (the test fails anyway). Either way, you're guessing.

Playwright has smart waiting built in. It automatically waits for elements to be ready:

```typescript
await page.locator('#submit-button').click();
await expect(page.locator('#success')).toBeVisible(); // Waits automatically
```

Much better. No guessing.

Find elements the way users do

Instead of technical selectors like CSS classes:

```typescript
// Breaks easily
page.locator('.btn-primary-lg')
```

Use what users see:

```typescript
// Stable
page.getByRole('button', { name: 'Sign In' })
```

If a user can find it, your test can too. And it won't break when designers change styles.
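getByRole isn't the only user-facing locator Playwright ships. A few others are worth knowing; the element names below are hypothetical examples, not from a real app:

```typescript
// Form field located by its visible label
await page.getByLabel('Email').fill('user@example.com');

// Input located by its placeholder text
await page.getByPlaceholder('Search').fill('playwright');

// Element located by the text a user reads
await page.getByText('Welcome back').click();

// Last resort when nothing user-visible is unique;
// requires a data-testid attribute in the markup
await page.getByTestId('checkout-button').click();
```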

Let assertions retry automatically

Modern websites load data asynchronously. Playwright's web-first assertions handle this by re-checking the condition until it passes or the timeout (5 seconds by default) expires:

```typescript
// Checks until true or 5 seconds pass
await expect(page.locator('#count')).toHaveText('10');
```

This handles loading delays without you doing anything special.
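If one particular operation is genuinely slow, you can extend the timeout for just that assertion instead of raising it globally. A small sketch, reusing the locator from above:

```typescript
// One-off override for a known-slow operation;
// every other assertion keeps the 5-second default
await expect(page.locator('#count')).toHaveText('10', { timeout: 15_000 });
```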

Level 2: Configure Your CI Pipeline

Even good test code needs proper CI support.

Use controlled retries

Random failures happen—network hiccups, slow servers. Instead of treating every failure as critical, let tests retry automatically:

In your playwright.config.ts:

```typescript
retries: 2, // Test runs up to 3 times total
```

If a test fails once then passes, Playwright labels it "flaky"—telling you it needs investigation even though it didn't block the build.
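A common refinement is to retry only in CI, so a flaky test still fails loudly on your machine while you're writing it. A minimal sketch of the full config file:

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retry in CI only; locally, a failure should fail immediately
  retries: process.env.CI ? 2 : 0,
});
```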

Capture debugging information

When tests fail in CI, you need to see what happened. Configure Playwright to save:

  • Screenshots of the failure moment
  • Videos of the entire test
  • Traces showing everything (DOM state, network calls, console logs)

In playwright.config.ts:

```typescript
use: {
  screenshot: 'only-on-failure',
  video: 'retain-on-failure',
  trace: 'on-first-retry',
},
```

This only captures data when needed, keeping CI fast.
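When a failure does produce a trace, you can inspect it locally with Playwright's built-in viewer: `npx playwright show-trace trace.zip` opens the DOM snapshots, network calls, and console output for every step of the failed test.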

Keep tests isolated

Tests should never affect each other. Each test gets:

  • Its own fresh browser context (like opening a new incognito window)
  • Its own data (either mocked or freshly created)
  • Its own environment settings

Start with `workers: 1` in CI (tests run one at a time), then increase once everything's stable.
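Putting those isolation settings together, a conservative starting point for playwright.config.ts might look like this; the exact worker count is something to tune for your own suite:

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // One worker in CI until the suite is stable; full parallelism locally
  workers: process.env.CI ? 1 : undefined,
  // Run tests within a file in parallel too; each test still gets
  // its own fresh browser context by default
  fullyParallel: true,
});
```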

Level 3: Build Team Habits

Technical fixes don't stick without good team habits.

Handle flaky tests properly

Found a flaky test? Don't just disable it and forget. Instead:

  1. Tag it @flaky so it's tracked (see the sketch after this list)
  2. Remove it from the main pipeline (won't block releases)
  3. Run it in a separate job (still monitor if it improves)
  4. Assign someone to fix it
  5. Create a ticket right away
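Here's one way the tagging step can look. Recent Playwright versions (1.42+) support a dedicated tag option; on older versions, putting @flaky in the test title works with the same grep commands. The test itself is a made-up placeholder:

```typescript
import { test, expect } from '@playwright/test';

// Quarantined: tracked in a ticket, runs only in the separate flaky job
test('user can update profile picture', { tag: '@flaky' }, async ({ page }) => {
  // ... test body ...
});
```

The main pipeline then runs `npx playwright test --grep-invert "@flaky"`, while the monitoring job runs `npx playwright test --grep "@flaky"`.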

Track patterns over time

After each test run, you get a report showing what passed and failed. That's helpful for today.

But you also need to know:

  • Is this test always flaky or was this a one-time thing?
  • Are our tests getting better or worse?
  • Which part of our app has the flakiest tests?

Test reporting and analytics platforms collect all your test results over time, showing you these patterns. When you can see "login tests fail 3× more than average," you know where to focus your improvement efforts.

Start Small

Don't try to fix everything at once. Here's a simple plan:

Week 1: Find your 5 flakiest tests. Apply the code-level fixes from Level 1.

Week 2: Set up the CI configuration from Level 2. Enable retries and artifact collection.

Week 3: Establish team habits from Level 3. Create your quarantine process.

Each week, you'll see improvement. Your tests become more reliable. Your team starts trusting automation again.

Why This Matters

Flaky tests aren't just technical annoyances. They're trust killers. When tests fail randomly:

  • Developers stop checking test results before deploying
  • QAs spend hours investigating false alarms
  • Teams lose confidence in automation entirely

But when tests are reliable:

  • Everyone trusts the results
  • Real bugs get caught immediately
  • Deployments happen with confidence

That's what this three-level approach gives you.

Ready to implement this in your team? Get the complete step-by-step checklist with code examples. Click here.
