When flaky tests stop being "random" and start being a pattern, it is time to stop guessing and start auditing. This is the story of how to make peace with Cypress’ Dual-Verse... and actually win.
ACT 1: EXPOSITION
At some point in every Cypress project, a test fails.
You re-run it.
It passes.
You shrug, blame CI, and move on with your life.
Then it happens again. And again. And somehow only on Tuesdays, only in CI, or only when Mercury is in retrograde.
Welcome back to the Cypress Dual-Verse: the place where synchronous JavaScript and Cypress’ asynchronous command queue coexist peacefully… until they absolutely don’t.
If this sounds familiar, you are not alone. In one of my previous articles, I went deep (very deep) into why Cypress behaves this way and how mixing sync JS with async commands can quietly sabotage your tests:
👉 The Async Nature of Cypress: Don’t Mess With the Timelines
I had been meaning to write an article about how this asynchronous command queue works, but I wanted to approach it in a way that is easier to understand and more intuitive, without sacrificing depth or the practical knowledge you actually need to tackle it effectively.
That is when I came up with the idea of presenting it as a quirky 'dual-universe' concept, using a very visual diagram so you can clearly see what is going on behind the scenes.
For a long time after publishing that article, a (somewhat crazy) idea kept spinning in my head… What if I could see a visual representation while implementing and running those rebellious tests that keep failing or behaving flakily? The ones where I can’t quite put my finger on why, but I strongly suspect (and I’m usually right) that race conditions are to blame.
But man! This is not an easy task. It would mean diving deep into low-level test details and command interactions, then somehow turning all that chaos into something clear, understandable, and fully portable to a CI/CD pipeline.
And why stop there? Why not visualize it all with intuitive graphs (like in the 'dual-verse' article) that the human brain can actually enjoy digesting?
After all… you know the saying: a 🖼️ is worth more than a thousand long 📜📜📜. 😅
That is where the cypress-flaky-test-audit plugin comes in.
ACT 2: CONFRONTATION
Flaky tests rarely announce themselves politely.
They don’t fail everywhere.
They don’t fail consistently.
And they definitely don’t fail when you are watching.
Most of the time, the root cause is subtle:
- Cypress commands mixed with sync control flow
- State captured too early
- Variables that look populated… but aren’t (yet)
- Assertions running outside the Cypress command chain
In other words: Dual-Verse violations.
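Here is a minimal sketch of what one of those violations looks like in practice (the selector and expected value are made up for illustration):

```js
// A classic Dual-Verse violation: sync code reads a value before
// the async Cypress command that populates it has actually run.
it('captures state too early', () => {
  let username = '';

  cy.visit('/profile');
  cy.get('[data-cy="username"]').then(($el) => {
    username = $el.text(); // runs later, when the queue reaches it
  });

  // BUG: this line runs immediately, while username is still ''
  expect(username).to.equal('sebastian');
});

// The fix: keep the assertion inside the Cypress command chain
it('asserts inside the chain', () => {
  cy.visit('/profile');
  cy.get('[data-cy="username"]').should('have.text', 'sebastian');
});
```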
The problem? These patterns are easy to write and painfully hard to spot, especially once your test suite grows beyond a handful of specs, or when you are staring at code written by someone else.
Presenting cypress-flaky-test-audit!
This plugin does one simple but powerful thing:
It scans your Cypress tests as they run and shows you (clearly and intuitively, no kidding!) what is running at every moment and exactly where things go wrong, making flakiness-prone patterns much easier to spot.
Not hypotheticals.
Not style opinions.
Real "this will bite you later" code smells.
And as someone wise once said…
"No gimmicks. No AI. Just the information you need to understand why a test is flaky". 🙂
It doesn’t rewrite your tests.
It doesn’t block your CI.
It simply shines a flashlight into the dark corners of your test suite.
You can run it locally, in CI, or as part of a quality gate, and decide how strict you want to be.
What the plugin actually does (and how that helps you)
Unlike a static pattern detector or flake predictor, cypress-flaky-test-audit gives you runtime insights into what your tests *actually did*, command by command.
In other words, it does not guess where flakiness might be; it shows you exactly what happened when your tests ran.
At a practical level, it provides:
🧠 Per-command execution details
For every Cypress command in a run, you get:
- the exact execution order
- timing information
- how many retries happened
- whether each command passed or failed
You can immediately see which commands took longer than expected, retried often, or didn’t execute at all. This is much more actionable than a static linter.
Why it helps:
When a test flops in CI but works locally, you can compare these metrics and see what changed, instead of guessing.
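If "exact execution order" sounds like a trivial metric, remember that in the Dual-Verse it is anything but. A quick sketch of why:

```js
// cy.* calls only enqueue commands; plain sync code runs first,
// no matter where it sits in the test body.
it('enqueues commands instead of running them', () => {
  console.log('A: runs immediately');

  cy.log('B: runs when the queue reaches it').then(() => {
    console.log('C: runs after B');
  });

  console.log('D: also runs immediately, before B and C');
});
```

Seeing the order things *actually* ran in, rather than the order they appear in the file, is exactly the point.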
⏱ Timing and latency observations
The plugin highlights command durations, which can be telling when:
- network responses are slow
- UI takes time to load
- intermittent server delays impact your tests
Why it helps:
Instead of "this fails sometimes", you see "this command took 4× longer in CI than locally."
🔁 Retry tracking
Cypress retries certain commands automatically, and you can now see:
- which commands retried
- how many times
- whether retries eventually succeeded or still failed
This is especially useful for flaky assertions or slow elements.
Why it helps:
You will stop writing defensive cy.wait() hacks and start understanding when and why Cypress is retrying under the hood.
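For reference, here is the pattern that retry awareness should push you toward (selector and timeout are illustrative):

```js
it('lets Cypress retry instead of sleeping', () => {
  cy.visit('/dashboard');

  // Bad: a blind sleep that is always too long or too short
  // cy.wait(5000);

  // Good: the query and assertion re-run together for up to 10s,
  // and the test moves on the moment they pass
  cy.get('[data-cy="results"]', { timeout: 10000 })
    .should('be.visible')
    .and('contain.text', 'Loaded');
});
```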
🚦 Clear pass/fail status per command
Every command is annotated with its success/failure state in the audit output, giving you:
- confidence this test truly passed
- insight into hidden failures that were masked by retries
Why it helps:
No more "it failed once but passed twice so I think it’s fine". You actually see the results, not just the final green light.
🔍 Comparing retries within the same test run
And this is my personal favorite!
One of the most underrated features is the ability to compare what happened between retries of the same test, especially when the first attempt fails and the retry somehow passes (ah yes… CI/CD, we meet again). 😅
Instead of seeing a single green checkmark and moving on, the audit lets you inspect:
- what commands ran in the failed attempt
- what commands ran in the successful retry
- timing differences between both executions
- commands that retried more aggressively in one attempt than the other
Why it helps:
Because "it passed on retry" is not a root cause.
By comparing both attempts side by side, you can spot subtle timing shifts, delayed elements, or network-dependent behavior that only shows up under certain conditions, all without re-running the test or adding guesswork.
This is where flaky tests stop being mysterious and start being debuggable.
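One practical prerequisite: to have retries to compare at all, test retries must be enabled in your Cypress configuration. A minimal sketch (the numbers are examples, not recommendations):

```js
// cypress.config.js
const { defineConfig } = require('cypress');

module.exports = defineConfig({
  retries: {
    runMode: 2,  // retry failing tests up to 2 times in `cypress run` (CI)
    openMode: 0, // no retries in interactive `cypress open` mode
  },
});
```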
In short, cypress-flaky-test-audit turns your test run into an observable timeline rather than a black box. Instead of hoping something is flaky, you get real data on what Cypress did and where your timing assumptions broke down.
How this actually helps you (and where you see the results)
All this data would be pointless if it stayed hidden.
CYPRESS-FLAKY-TEST-AUDIT is intentionally noisy in the right places and visual where it matters, so you can consume the results in whatever way fits your workflow.
Here is how.
🖥 Browser console output
During test execution, the plugin logs detailed audit information directly in the browser console.
You can see:
- each Cypress command as it executes (in the precise order)
- timing and retry information
- which test retry you are currently looking at
- subtle differences between attempts
Why this helps:
When debugging locally, you don’t need to leave the test runner. You can watch the Dual-Verse unfold in real time.
🧾 Terminal output (CI-friendly)
The same audit information is also available in the terminal output, making it CI-friendly by default.
This means:
- no screenshots required
- no guesswork when a test flakes in CI
- actionable logs even when the test eventually passes on retry
Why this helps:
Because flaky tests don’t stop being flaky just because the pipeline turned green.
📊 HTML test audit report (where everything clicks)
This is where things really come together.
The plugin generates an HTML audit report that visualizes, for your whole test suite:
- every test
- every retry
- every command
- execution timelines per attempt
- pass/fail status and duration
- what gets executed, and what is never reached
Each retry is displayed individually, and when a test fails first and passes on retry, you can compare both executions side by side.
And yes, if this looks familiar, that is intentional.
The graphs purposely resemble the ones used in The Async Nature of Cypress: Don’t Mess With the Timelines and its 'Dual-Verse' diagrams, because they are designed to expose exactly the same thing: what Cypress did vs. when you thought it did it.
Why this helps:
Seeing both timelines next to each other turns flakiness from a feeling into evidence. Timing gaps, delayed commands, and retry-heavy steps stop hiding behind a green checkmark.
In short, the plugin does not just tell you that a test retried. It shows you how each retry actually behaved, across the browser, the terminal, and a visual report that makes the Dual-Verse impossible to ignore.
Who this is for (and who it is not)
This is for you if:
- You maintain a medium-to-large Cypress test suite.
- You’ve seen tests fail in CI but magically pass locally.
- You’ve ever said "just re-run it" and felt a tiny bit guilty.
- You want guardrails, not lectures, around Cypress async behavior.
- You’re a new QA engineer getting started with Cypress or a seasoned professional, because we all write flaky tests at some point, especially after we forget (again) about the Cypress 'Dual-Verse' gotcha.
This is probably not for you if:
- You’re just getting started with Cypress and only have a handful of tests.
- You’re intentionally ignoring Cypress’ command queue (no judgment… okay, maybe a little).
- Your tests are already rock-solid and boring in the best possible way.
ACT 3: RESOLUTION
The goal here is not to shame your tests.
We have all written code that works perfectly… until it doesn’t.
The real win is shifting from:
"Why is this flaky?"
to
"We know exactly where flakiness comes from."
Once you internalize and fully understand the Cypress execution model and back it up with automated auditing, something nice happens:
- Tests become predictable
- CI becomes boring (the good kind)
- Failures start meaning something again
That is what I mean by thriving in the Cypress Dual-Verse.
Not fighting Cypress.
Not outsmarting it.
Just respecting the timeline, and letting a tool point out when you don’t.
Where to find it?
- npm: https://www.npmjs.com/package/cypress-flaky-test-audit
- GitHub: https://github.com/sclavijosuero/cypress-flaky-test-audit
I'd love to hear from you! Please don't forget to follow me, leave a comment, or a reaction if you found this article useful or insightful. ❤️ 🦄 🤯 🙌 🔥
You can also connect with me on my new YouTube channel: https://www.youtube.com/@SebastianClavijoSuero
If you'd like to support my work, consider buying me a coffee or contributing to a training session, so I can keep learning and sharing cool stuff with all of you.
Thank you for your support!