DEV Community: Cypress

Meet the CypressConf 2026 Keynote: Bas Dijkstra

Ronald Williams — Mon, 13 Apr 2026 16:32:16 +0000

As CypressConf 2026 gets closer, the focus is simple.

How do you move fast and still trust your test results?

With more automation and AI in the mix, teams are shipping faster than ever. But speed has never been the hard part. Confidence is.

Across teams, the same patterns keep showing up. Test suites grow, pipelines get faster, and more signals are generated at every stage of development. But when it comes time to make a decision, teams are still asking the same question.

Can we trust this?

Quality issues rarely belong to one person or one role anymore. Confidence in what ships depends on how well teams interpret and act on the signals coming from their tests, pipelines, and production systems. That is the shift happening across the industry, and it is what this year’s theme is built around.

To kick things off, we are excited to announce our keynote speaker, Bas Dijkstra.

Bas is a global test automation consultant, trainer, and speaker who has spent years working directly with engineering teams to improve how they approach quality. His work is grounded in real systems, real constraints, and the realities of modern development. He is known for helping teams cut through noise, reduce flake, and build testing strategies they can actually rely on.

More importantly, he understands that testing is no longer a single function. It is a shared responsibility that spans developers, QA, and engineering leadership. The teams that succeed are the ones that align around clear signals and act on them with confidence.

That perspective is exactly what makes this keynote the right fit for CypressConf this year.

The industry is not short on tools. It is not short on automation. What teams are working through now is how to make sense of everything those systems produce. AI is increasing the volume of signals. Pipelines are accelerating feedback. But without clarity, more signal can just as easily create more noise.

The challenge is no longer generating results. It is knowing what matters.

At CypressConf 2026, you will hear from teams who are working through this in real time. They are improving how they collaborate across roles, finding ways to respond earlier without introducing risk, and turning test results into decisions they can stand behind. They are not slowing down. They are getting better at understanding the systems they have already built.

The keynote sets the tone for that conversation.

With Bas Dijkstra, expect a session grounded in experience, focused on what teams are actually dealing with today, and clear about where things are going next.

This is the first announcement for CypressConf 2026. More speakers, sessions, and content are on the way.

The signal is up.

Agent-Driven E2E Testing with Cypress: A Practical Guide to Harness Engineering with Cursor Subagents

Darpan Shah — Tue, 07 Apr 2026 22:13:28 +0000

Teams have done end-to-end testing deliberately for years: exploring the app, writing tests from what they see, fixing failures in focused sessions. That's skilled work, not guesswork.

The hard part is usually organizational. Knowledge sits in people's heads or scattered across chat histories and tickets. What you see on a live screen is tough to describe clearly to whoever writes the automated test. Each new flow forces everyone to reload the same context from scratch.

Agent-driven development doesn't replace that judgment. It packages skilled work into narrow roles (explore, implement, execute, repair) with clear inputs and outputs. Quality builds over time instead of starting from zero every sprint.

This approach mirrors harness engineering: the system around the agents that makes them reliable, not just capable.

What Is a Harness, and Why Does It Matter?

The term "harness" has emerged as shorthand for everything in an AI agent system except the model itself. Put simply: Agent = Model + Harness. Anthropic's engineering research frames the problem this way: "The core challenge of long-running agents is that they must work in discrete sessions, and each new session begins with no memory of what came before." Imagine a software project staffed by engineers working in shifts, where each new engineer arrives with no memory of what happened on the previous shift. Without structure, agents drift, repeat work, or declare victory too early.

Their solution? A two-fold approach: an initializer agent that sets up the environment on the first run, and a coding agent that makes incremental progress in every session, while leaving clear artifacts for the next session.

For coding agents specifically, Martin Fowler's team breaks harness engineering into key components:

Context engineering: Provides us with the means to make guides and sensors available to the agent.
Architectural constraints: Rules that mechanically enforce quality (not just suggestions).
Feedback loops: The human's job is to steer the agent by iterating on the harness. Whenever an issue happens multiple times, the feedforward and feedback controls should be improved to make the issue less probable or even prevent it.

Here's the counterintuitive insight: increasing trust and reliability in AI-generated code requires constraining the solution space rather than expanding it. Narrow roles, explicit handoffs, and clear boundaries make agents more productive, not less.

How This Applies to E2E Testing with Cypress

This article describes four agents specialized for E2E testing using Cypress and how they form a closed loop:

cypress-browser-explorer: Maps UI flows with live browser tooling
cypress-builder: Implements specs per team conventions
cypress-runner: Executes tests consistently
cypress-debugger: Classifies failures and applies fixes

Each agent produces a structured artifact (exploration report, spec file, run summary, debug notes) that becomes the input for the next agent. This is the harness in action: each step creates a plan that keeps the next agent on track.

In Cursor, each of these agents maps directly to a custom subagent -- a markdown file in .cursor/agents/ with a name, description, and focused prompt. The explorer subagent leverages Cursor's built-in browser tool to navigate your app, take snapshots, read the live DOM, and capture network activity without leaving the IDE. That means the exploration report isn't hand-written -- it's generated from real page state.

It seems reasonable that specialized agents like a testing agent, a quality assurance agent, or a code cleanup agent could do an even better job at sub-tasks across the software development lifecycle. That's exactly what this workflow does for E2E automation when Cypress is your tool.

Evidence from the real UI flows into code. Code gets verified by a standard test run. Failures get handled with clear escalation rules instead of improvisation.

The Feedback Loop

The loop in one sentence: Explore → build → run; on failure, debug and re-run; if the UI changed, explore again and rebuild.

This closed loop is where the efficiency gains come from:

Less rework: Selectors and URLs come from live exploration, not memory
Faster green builds: Runner standardizes execution; debugger applies evidence-based fixes
Clear escalation: Stale DOM leads to re-explore; flaky patterns get documented
Single-test discipline: Fix one failure, re-run, then move on
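
As a rough illustration, the loop above can be written as a small transition function. This is only a sketch: the step names and the shape of the outcome object (passed, staleDom) are assumptions made for this example, not part of Cursor or Cypress.

```javascript
// Hypothetical step names; they mirror the four agent roles in this article.
const STEPS = Object.freeze({
  EXPLORE: "explore",
  BUILD: "build",
  RUN: "run",
  DEBUG: "debug",
  DONE: "done",
});

/**
 * Decide which agent should run next.
 * @param {string} current - the step that just finished
 * @param {{passed?: boolean, staleDom?: boolean}} outcome - result of that step
 * @returns {string} the next step
 */
function nextStep(current, outcome = {}) {
  switch (current) {
    case STEPS.EXPLORE:
      return STEPS.BUILD; // the exploration report goes to the builder
    case STEPS.BUILD:
      return STEPS.RUN; // a new or updated spec always gets verified
    case STEPS.RUN:
      return outcome.passed ? STEPS.DONE : STEPS.DEBUG;
    case STEPS.DEBUG:
      // Stale selectors mean the UI changed: return to live exploration.
      return outcome.staleDom ? STEPS.EXPLORE : STEPS.RUN;
    default:
      throw new Error(`Unknown step: ${current}`);
  }
}
```

Keeping the routing this explicit is the point: the debugger never rebuilds from memory, and the runner never fixes anything.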

The Four Agents at a Glance

Agent	Role	Primary Inputs	Primary Outputs	Must Not
cypress-browser-explorer	Map scoped UI flows using Cursor's browser tool	URL, steps, ticket scope	Exploration report with selectors, network map	Wander outside scope; invent selectors without proof
cypress-builder	Implement specs per team rules	Exploration report	Spec and support code; handoff to runner	Skip exploration for unfamiliar pages
cypress-runner	Execute tests consistently	Spec path, tags/env	Pass/fail summary with failure context	Fix failing tests (send to debugger)
cypress-debugger	Classify failures, apply fixes	Failure output, artifacts	Code changes; handoff to runner or explorer	Invent selectors when DOM has changed

Important: These agents are blueprints, not universal standards. Your stack, auth flow, and naming conventions will differ. Expect to:

Edit agent instructions to reference your scripts and config
Pair agents with project rules (lint, selector policy, test ID format)
Add or trim steps where your org needs tighter guardrails

The value is the shape of the workflow and clean handoffs, not a one-size-fits-all prompt.

Handoff Templates: Structured Artifacts That Bridge Context

The key insight here was finding a way for agents to quickly understand the state of work when starting with a fresh context window. Structured handoffs are what prevent "context amnesia" between agents.

Explorer → Builder

## Handoff to cypress-builder

Prompt: "Create cypress/e2e/[feature].cy.js using this exploration report:
- Scope source: [quote from ticket/steps]
- URL map: [ordered list]
- Selector inventory: [element, purpose, selector, stability]
- Network map: [method, pattern, suggested alias]"

Builder → Runner

## Handoff to cypress-runner

Prompt: "Run <spec path> to verify the new/updated spec."

Runner → Debugger (on failure)

## Handoff to cypress-debugger

Prompt: "Triage these E2E test failures (Cypress):

**Failing specs:** cypress/e2e/<spec>.cy.js

**Failures:**
1. [TEST-ID] <describe> > <it>
   Error: <message>
   Screenshot: cypress/screenshots/<path>

**Notes:** <auth errors, timeouts, etc.>"

Debugger → Runner (after fix)

## Handoff to cypress-runner

Prompt: "Re-run <spec path> to verify the fix for [TEST-ID]."

Debugger → Explorer (stale DOM)

## Handoff to cypress-browser-explorer

Prompt: "Re-explore <URL/flow> because selectors are stale for <spec>. Return updated report to builder."

Explorer Report Checklist

When using the explorer agent, require a report that includes:

Scope source: Ticket, pasted steps, or URL/feature
Flow summary: Scoped path, completion or blocked state
URL map: Ordered URLs visited
Selector inventory: Element, purpose, selector, stability rating
Network map: Method, pattern, suggested intercept alias
Test strategy: E2E vs shift-left rationale per scenario
Notes: Gaps, fragile selectors, missing test hooks
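
One way to enforce this checklist mechanically is to validate the report before the builder ever sees it. The sketch below assumes the report has been parsed into a plain object; the camelCase field names are invented for this example, and it treats every checklist item as required (a report with nothing to flag can say so in its notes).

```javascript
// Field names are assumptions derived from the checklist above.
const REQUIRED_FIELDS = [
  "scopeSource",
  "flowSummary",
  "urlMap",
  "selectorInventory",
  "networkMap",
  "testStrategy",
  "notes",
];

/**
 * Return the checklist fields that are absent or empty in a report.
 * @param {Record<string, unknown>} report - parsed exploration report
 * @returns {string[]} names of missing fields (empty array when complete)
 */
function missingReportFields(report) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = report[field];
    if (Array.isArray(value)) return value.length === 0; // empty list counts as missing
    return value === undefined || value === null || String(value).trim() === "";
  });
}
```

If the returned list is not empty, the handoff can be rejected and sent back to the explorer instead of letting the builder guess.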

Steering the Harness: How to Keep Agents Aligned

Rather than personally inspecting what the agents produce, we can make them better at producing it. The collection of specifications, quality checks, and workflow guidance that control different levels of loops inside the how loop is the agent's harness. The emerging practice of building and maintaining these harnesses, Harness Engineering, is how humans work on the loop.

This is working "on the loop" rather than just "in the loop." You're not micromanaging every output. You're improving the harness so agents naturally produce better results.

Practical steps:

Scope every request: URL, role, numbered steps, or ticket excerpt. The explorer especially needs to know what path to follow.
Encode standards in the repo: Lint rules, skills files, and agent instructions should match. Otherwise the model follows whatever file it read most recently.
Use explicit handoffs: Paste the structured blocks so the next agent gets data, not a summary.
Review diffs like any PR: Generated specs need scrutiny, especially auth, network mocks, and assertions on money or permissions.
Keep secrets out of chat: Credentials belong in .env or your secret manager.
Turn fixes into constraints: When an agent makes a mistake, you take the time to engineer a solution such that the agent never makes that mistake again. Add a lint rule, update the instructions, or create a check.
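
To make "turn fixes into constraints" concrete, here is a minimal sketch of a check that scans spec source for selectors the team has banned. It is a plain regex scan, not a real ESLint rule, and it only looks at the first string argument of cy.get(); the function name and the exact ban list are assumptions for this example.

```javascript
// Matches the first string argument of cy.get(...) in spec source text.
const CY_GET = /cy\.get\(\s*(['"`])(.*?)\1/g;

/**
 * Find cy.get() selectors that rely on CSS classes or positional pseudo-classes.
 * @param {string} specSource - contents of a spec file
 * @returns {string[]} offending selectors, in order of appearance
 */
function findBannedSelectors(specSource) {
  const findings = [];
  for (const match of specSource.matchAll(CY_GET)) {
    const selector = match[2];
    // Ignore dots inside attribute values such as [data-cy="a.b"].
    const withoutAttributes = selector.replace(/\[[^\]]*\]/g, "");
    const usesClass = /\.[A-Za-z_-]/.test(withoutAttributes);
    const usesPosition = /:nth-(child|of-type)\(/.test(selector);
    if (usesClass || usesPosition) findings.push(selector);
  }
  return findings;
}
```

A check like this can run in CI or as a pre-commit step, so the same selector mistake gets caught mechanically the second time.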

Review Gates: Keeping Humans on the Loop

Agents execute evaluations automatically, but human oversight remains important for initial calibration and quality validation. Keep humans at judgment points:

After build: Review spec structure, selector quality, and assertion coverage before treating a run as final.
After green: Quick coverage and risk check before merge.
After repeated debug failure: If the same failure persists after three fix attempts, escalate to a person.

The agents handle the repetitive cycle. Engineers keep the judgment calls.
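
The third gate is easy to encode. In the sketch below, runSpec and applyFix are hypothetical callbacks standing in for the cypress-runner and cypress-debugger subagents; the only rule taken from this article is the limit of three fix attempts.

```javascript
/**
 * Run a spec, let the debugger try to fix it, and stop after a fixed budget.
 * @param {() => {passed: boolean}} runSpec - hypothetical runner callback
 * @param {(result: {passed: boolean}) => void} applyFix - hypothetical debugger callback
 * @param {number} maxFixAttempts - fix attempts allowed before a human steps in
 */
function runWithFixBudget(runSpec, applyFix, maxFixAttempts = 3) {
  let result = runSpec();
  let attempts = 0;
  while (!result.passed && attempts < maxFixAttempts) {
    applyFix(result); // one focused fix per failure
    attempts += 1;
    result = runSpec(); // always verify the fix with a fresh run
  }
  return {
    status: result.passed ? "green" : "escalate-to-human",
    fixAttempts: attempts,
  };
}
```

The budget is what keeps the loop from retrying forever: once it is spent, the outcome is a request for a person, not another patch.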

Team-Owned Content

The harness above doesn't define these items. Your team documents them in skills, rules, or extended agent files:

Authentication flows, secrets file layout, and custom command patterns
Exact run commands (local vs Docker), CI script names
Tag/grep filters, base URLs per environment
Selector policy beyond "prefer stable hooks" (data-*, roles, aria)
Test ID formats, coverage scripts, lint and Cypress config conventions

Why This Approach Works

The principles from Anthropic and Martin Fowler's research explain why the four-agent pattern is effective:

Constraints as multipliers: Paradoxically, constraining the solution space makes agents more productive, not less. When an agent can generate anything, it wastes tokens exploring dead ends.
Structured artifacts bridge context: Structured progress files and feature lists let a new agent quickly understand the state of work, analogous to a shift handoff between engineers who've never met.
Feedback loops catch issues early: Run follows build. Debug follows failure. Re-explore only when needed. This order cuts rework.
Clean escalation prevents endless retries: If the DOM is wrong, hand to explorer. If three fixes fail, hand to a human. No guessing.
The harness evolves: Coding agents make it much cheaper to build more custom controls and more custom static analysis. Agents can help write structural tests, generate draft rules from observed patterns, scaffold custom linters, or create how-to guides from codebase archaeology.

Implementing This in Cursor with Subagents and Browser

The four-agent workflow maps to four Cursor subagents: one markdown file per role under .cursor/agents/, each with YAML frontmatter (name, description, model, and any optional fields you need) plus a focused instruction body. How you create them is always the same; only the name, description, and instructions change to match explorer, builder, runner, or debugger.

Below is one example (the browser explorer). The other three files use the identical shape; plug in the responsibilities from the agent table and handoff templates earlier in this article instead of pasting four full prompts here.

---
name: cypress-browser-explorer
model: inherit
description: Explores the application UI using browser tools to discover selectors, network calls, and page flows for Cypress test development. Use when exploring a new feature, finding selectors, mapping user flows, building new tests, or when the user says to explore a page. ALWAYS launch the browser - never assume selectors without navigating and snapshotting.
---

You are a browser exploration specialist for E2E tests using Cypress.

When invoked:

1. **Authenticate** if the target page requires login (see Authentication below)
2. **Navigate** to the target URL or flow entry point
3. **Take a snapshot** to capture the page structure
4. **Follow the exploration checklist** below for every flow

## Exploration Checklist

### Page URLs
- Record the entry page URL
- Navigate through each step of the flow, recording intermediate URLs
- Record the confirmation/success page URL

### Selectors (capture in priority order)
Priority order:
1. `[data-cy]`, `[data-test]`, `[data-testid]` -- purpose-built for testing
2. Any other `[data-*]` attribute -- stable, not styling-dependent
3. Any `[test-*]` attribute (e.g. `test-auto`, `test-id`) -- also for testing
4. `[role="..."]`, `[aria-label="..."]`, `[aria-labelledby]` -- semantic/accessible
5. `label[for="..."]` + associated input -- form elements
6. Stable visible text via `cy.contains()` -- only when text itself is the assertion
7. Tag + attribute combos (e.g. `input[name="email"]`) -- last resort

**Never use**: CSS classes, generated IDs, tag names alone, XPath, positional selectors

### Network Calls
- Monitor network requests during the flow using browser tools
- For each significant API call, record:
  - HTTP method and URL pattern
  - Suggested intercept alias (e.g., `get:cart-items`, `post:place-order`)
  - Whether the response contains data needed for assertions
- Pay attention to: auth calls, data fetching, form submissions, redirects

## Authentication

When the target page requires login (e.g. `/dashboard`, `/account`, any page that
redirects to `/login`), authenticate **before** exploring. Never ask the user
for credentials -- resolve them from project files.

### Credential Resolution (priority order)

1. **`.env`** file in the project root -- parse `KEY=VALUE` lines.
2. **`cypress.env.json`** in the project root -- parse JSON object.

## Handoff to cypress-builder

Prompt: "Create cypress/e2e/[feature].cy.js using this exploration report:
- Scope source: [quote from ticket/steps]
- URL map: [ordered list]
- Selector inventory: [element, purpose, selector, stability]
- Network map: [method, pattern, suggested alias]
- Draft spec: [snippet if applicable]
"

## Output Format

Return a structured report:
1. **Scope source:** Ticket, pasted steps, or URL/feature
2. **Flow summary:** Scoped path, completion or blocked state
3. **URL map:** Ordered URLs visited
4. **Selector inventory:** Element, purpose, selector, stability rating
5. **Network map:** Method, pattern, suggested intercept alias
6. **Test strategy:** E2E vs shift-left rationale per scenario
7. **Notes:** Gaps, fragile selectors, missing test hooks

Save as .cursor/agents/cypress-browser-explorer.md. Add cypress-builder.md, cypress-runner.md, and cypress-debugger.md the same way, then invoke with /cypress-browser-explorer (and so on) or let the parent agent delegate from each file’s description.

Cursor's browser tool powers the explorer

The explorer subagent is where Cursor's built-in browser tool becomes essential. Rather than asking an engineer to describe what's on screen, the agent:

Navigates directly to URLs and follows multi-step flows
Takes snapshots of live DOM state, capturing element structure, attributes, and text
Reads selectors from the actual page -- data-testid, ARIA roles, form labels -- instead of guessing
Captures network activity to identify API calls that need cy.intercept() aliases

This means the exploration report is evidence-based from the start. Selectors come from the real DOM, not from memory or other sources that may be out of date. When the debugger detects stale selectors and hands back to the explorer, the browser tool re-navigates and captures the current state -- closing the feedback loop with live data.
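
The "stability rating" in the selector inventory can be derived from the priority order in the explorer prompt. The following is a rough sketch of that ranking, assuming simple single-attribute selectors; the function name is invented here, and priority 6 (visible text via cy.contains()) is left out because it is not a CSS selector.

```javascript
/**
 * Rank a CSS selector using the explorer's priority order (1 is best).
 * Returns null for anything on the "never use" list.
 * @param {string} selector
 * @returns {number|null}
 */
function selectorPriority(selector) {
  // 1. Purpose-built test hooks
  if (/\[data-(cy|testid|test)(=|\])/.test(selector)) return 1;
  // 2. Any other data-* attribute
  if (/\[data-[\w-]+/.test(selector)) return 2;
  // 3. test-* attributes such as test-id
  if (/\[test-[\w-]+/.test(selector)) return 3;
  // 4. Semantic and accessibility attributes
  if (/\[(role|aria-label|aria-labelledby)(=|\])/.test(selector)) return 4;
  // 5. Label associated with a form input
  if (/^label\[for=/.test(selector)) return 5;
  // 7. Tag plus attribute combination, the last resort
  if (/^[a-z]+\[[\w-]+=/.test(selector)) return 7;
  // CSS classes, generated IDs, bare tags, positional selectors
  return null;
}
```

A null result is a useful signal for the report's Notes section: it usually means the page is missing a test hook rather than that the explorer should settle for a class name.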

Why subagents fit this workflow

Cursor subagents provide three properties that align with the harness model:

Context isolation: Each subagent gets its own context window. The explorer's noisy DOM snapshots and network logs don't pollute the builder's context. The debugger's stack traces don't crowd the runner. This is the same isolation principle the harness pattern demands.
Parallel execution: Multiple subagents run simultaneously, cutting wall-clock time on multi-spec work.
Structured handoffs: A subagent returns a final message to the parent agent. That message is the handoff artifact -- the exploration report, the run summary, the debug notes. The templates in this article become the return format each subagent follows.

The Orchestration Pattern

The parent agent acts as an orchestrator, coordinating the four subagents in sequence:

1. Invoke /cypress-browser-explorer with URL and steps -- get exploration report
2. Pass the report to /cypress-builder -- get spec files
3. Hand spec paths to /cypress-runner -- get pass/fail summary
4. On failure, send details to /cypress-debugger -- get fixes, then back to step 3

Each handoff uses the structured templates from earlier in this article. The parent agent doesn't need deep knowledge of Cypress APIs—it routes data between specialists. This is the same orchestrator pattern Cursor's documentation recommends for complex workflows.

If you use Cypress MCP, you can also point /cypress-debugger at MCP tools to fetch failures from Cypress Cloud. The debugger triages, patches the spec or support code, then uses the Debugger → Runner handoff to re-run and stays in that loop until failures are addressed. That keeps run, fail, fetch, fix, re-run inside one workflow.

Closing

Treating exploration, implementation, execution, and repair as separate agent roles mirrors how strong teams already work. The harness makes this pattern repeatable and easy to hand off inside the IDE.

The largest efficiency win is the closed loop: run follows build, debug follows failure, re-explore only when the page structure actually changed.

The most effective harnesses don't just constrain the agent. They create an environment where the agent naturally produces better output with less correction needed. This is a critical insight. The best harnesses aren't restrictive. They're enabling.

Since shipping these specialized Cypress agents, I have hardly written tests by hand. The agents produce specs; I review them, merge when they are right, and when something drifts or misfires I adjust the agent definitions, skills, or prompts so the next run is better. The work shifts from typing cy.* to curating the harness -- continuous improvement on the automation itself, not just on individual tests.

The loop is sequential, but each step stays small: one subagent, one job, less noise in context than doing it all in a single chat.

Agent-driven development pays off when agents are blueprints you maintain. With Cursor subagents, those blueprints live in your repo as markdown files -- versioned, reviewable, and shared across the team. The browser tool gives the explorer agent direct access to your running app, so the entire loop from live UI to green test stays inside the IDE. Tighten instructions as your app and pipeline evolve. Keep guidance in the loop so automation stays trustworthy, not just clever.

References

Anthropic: Effective Harnesses for Long-Running Agents
Martin Fowler: Harness engineering for coding agent users
Cursor: Subagents and Browser Tool

How I stopped declaring login in each of my 5k tests

Marcelo C. — Fri, 27 Feb 2026 19:26:44 +0000

Have you ever encountered a testing codebase where many portions are repeated over and over? We all have! Of course, I could be talking about DRY principles (Don't Repeat Yourself), but let's set that aside for now and focus on a trick Cypress has up its sleeve that can go unnoticed even by many senior devs: global hooks.

And what do I mean by "global" hooks? They're called that because you declare them only once and they apply to all your tests instantly.

So let's get to the point: when you install Cypress, it already comes with a support file created at cypress/support/e2e.<ts|js>. This is usually where you import the custom commands that your E2E tests need in order to run properly.

But it can also hold before, beforeEach, after, or afterEach hooks that apply to all your tests. These can handle login, database clean-up after the tests run, screenshot resolution configuration: a million possibilities.

I guess the name "global hooks" makes more sense now, right?

The challenge

At my workplace, I found the login command, cy.login, declared in each of the 5,634 tests that we have. It bothered me for a long time, because I wanted to remove those lines of code and make my life easier with a single e2e.ts file that handled the login logic for all existing and future tests.

But if I knew how it worked, why didn't I just do it already?

Because, for the first time (for me), I was dealing with a really complex system: a big chunk of the legacy tests ran with testIsolation: false. That meant they log in only once, and the it blocks don't reload the baseUrl after each one finishes. They do it all in one session.

Why? Well, that would be another story, so let's just accept they're done like that because of "system requirements" at the time.

Ok, so I basically needed:

before() → cy.login()
beforeEach() → cy.login() only if testIsolation is true

Easy, right? Not yet. The complexity comes from different spec families having different needs: different credentials, different environments, different session strategies, and different testIsolation settings.

There are two questions to answer about login:

Which login method?

cy.login(): Default specs for the standard app with default credentials.
cy.loginDemo(): Presentation specs with a different environment, likely SSO or different credentials.
Custom (own login): Specs that need specific credentials or run E2E tests entirely outside the environment.

When to login? (before vs beforeEach)

This is driven by Cypress's testIsolation setting:

testIsolation: true — Cypress clears browser state (cookies, storage) between each test. Session is lost → must re-login before each test.
testIsolation: false — Cookies persist across tests in the same spec → login once is enough.

Now, here is where the plot thickens. I had to declare which folders, tests, and paths needed to be skipped (or not), because the same folder could hold specs with testIsolation: true and specs with testIsolation: false. So, follow along:

1. Legacy specs (/legacy/ folder)

before() → skip entirely
beforeEach() → only call cy.login() if testIsolation is true

Legacy specs use specific credentials. If the global before() called cy.login() first, cy.session() would cache the wrong credentials, causing 500 errors when the spec then tries to login with different ones. So global before() is completely skipped. In beforeEach(), it only re-validates when state is actually cleared (testIsolation: true).

2. Training portal specs (/training/ folder) + Custom login specs

before() → skip entirely
beforeEach() → skip entirely

These specs manage their own login end-to-end. The global hooks stay completely out of the way. Some tests likely test login flows themselves or use role-specific credentials.

3. BEFORE_LOGIN_DEMO_SPECS

before() → cy.loginDemo() + cy.goToHome()
beforeEach() → skip (already logged in)

Login once per spec, reused across all tests. These are demo/presentation specs where re-logging in per test would be slow and unnecessary.

4. BEFORE_EACH_LOGIN_DEMO_SPECS

before() → nothing
beforeEach() → cy.loginDemo() + cy.goToHome()

Similar to the above, but login is repeated per test: with testIsolation: true, the state is cleared between tests, so they must log in again each time.

5. LOGIN_DEMO_SPECIAL_SPECS

before() → nothing
beforeEach() → override baseUrl + timeout + idp_active env, then cy.loginDemo()

This spec runs against a completely different server, with an IDP/SSO active flag and a much longer timeout. The config must be set before each test because test isolation may reset Cypress config state.

6. Default specs (everything else, A.K.A the MOST important logic!)

before() → cy.login()
beforeEach() → cy.login() only if testIsolation is true

Standard app, default credentials. Login once in before(), then beforeEach() only re-validates the session if Cypress actually cleared it (testIsolation: true). If testIsolation is false, cookies persist and calling cy.login() again would navigate back to home, breaking any test that expects to be on a specific page.
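
To see the six cases side by side, here is a minimal sketch of the decision as a pure function. It is not the author's e2e.ts: the three spec arrays hold made-up paths, "loginDemoSpecial" is a label invented here for case 5, and specs with their own custom login would need a list of their own.

```javascript
// Hypothetical spec lists; the real arrays live in the author's codebase.
const BEFORE_LOGIN_DEMO_SPECS = ["cypress/e2e/demo/tour.cy.ts"];
const BEFORE_EACH_LOGIN_DEMO_SPECS = ["cypress/e2e/demo/checkout.cy.ts"];
const LOGIN_DEMO_SPECIAL_SPECS = ["cypress/e2e/demo/sso.cy.ts"];

/**
 * Decide which login (if any) the global hooks should perform.
 * @param {string} specPath - relative path of the running spec
 * @param {boolean} testIsolation - effective testIsolation setting
 * @returns {{before: string|null, beforeEach: string|null}}
 */
function resolveLoginPlan(specPath, testIsolation) {
  // Case 2: training specs own their login, so the global hooks stay out.
  if (specPath.includes("/training/")) return { before: null, beforeEach: null };
  // Case 1: legacy specs skip before(); beforeEach() only when state is cleared.
  if (specPath.includes("/legacy/")) {
    return { before: null, beforeEach: testIsolation ? "login" : null };
  }
  // Case 3: demo login once per spec.
  if (BEFORE_LOGIN_DEMO_SPECS.includes(specPath)) {
    return { before: "loginDemo", beforeEach: null };
  }
  // Case 4: demo login before every test.
  if (BEFORE_EACH_LOGIN_DEMO_SPECS.includes(specPath)) {
    return { before: null, beforeEach: "loginDemo" };
  }
  // Case 5: config overrides plus demo login before every test.
  if (LOGIN_DEMO_SPECIAL_SPECS.includes(specPath)) {
    return { before: null, beforeEach: "loginDemoSpecial" };
  }
  // Case 6: default specs.
  return { before: "login", beforeEach: testIsolation ? "login" : null };
}
```

In a support file, the global before() and beforeEach() hooks could call this with Cypress.spec.relative and Cypress.config('testIsolation'), assuming those values reflect the spec and suite that are currently running, and then dispatch to cy.login() or cy.loginDemo() accordingly.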

The key insight: why before AND beforeEach?

You may be asking "are you mental? Why on earth would you declare login TWICE?" Basically because:

before() → establishes the initial session (runs once)
beforeEach() → re-validates/restores the session if testIsolation cleared it

For specs with testIsolation: false, beforeEach() is a no-op (or returns early) because the session is still alive. For specs with testIsolation: true, beforeEach() must re-run the login to restore the cleared session — but it uses cy.session() internally which caches credentials.

Here's a sneak peek of how my e2e.ts file looks in its (hopefully) final form:

And I basically removed the cy.login call from more than 1k files in my codebase, so neither I nor any other engineer has to worry about declaring login at the test level anymore. The support file handles everything now, as it should.

What about you? Have you ever encountered a challenging and complex test codebase to deal with? What was the most difficult change you had to make? Leave me a comment, I would love to hear!

Cypress in the Age of AI Agents: Orchestration, Trust, and the Tests That Run Themselves

Vladimir Mikhalev — Thu, 26 Feb 2026 11:33:21 +0000

Last year, I wrote about Docker and Cypress for this blog. It covered containers, layer caching, and parallel runners. Good stuff. Useful stuff.

But I'm not writing that article again.

Here's why.

I could write a perfect container config in my sleep. So could Claude. So could GPT. So could any intern with a prompt. Syntax has become a commodity. The Dockerfile isn't the hard part anymore.

The hard part?

Orchestration and trust when AI agents run the tests.

Let me explain.

The Shift Nobody Talks About

In 2025, Cypress shipped cy.prompt(). Write tests in plain English. The AI figures out the selectors. It even self-heals when your UI changes.

That's powerful. And that's dangerous.

Not because the tool is bad. It's genuinely impressive. But because it changes who is making decisions in your pipeline. And most teams haven't thought about that.

Before cy.prompt(), the chain of trust was simple:

A human wrote the test
A human reviewed it
CI ran it
If it failed, a human fixed it

Every link in that chain had a name attached.

Now?

An AI writes the test
An AI picks the selectors
An AI heals the test when it breaks
The human sees green checkmarks
Everybody ships

Until something goes wrong. And nobody knows why.

Autonomy vs. Augmentation: The Framework That Matters

The industry keeps confusing two very different things.

Autonomy means the agent acts for you. You find out later what happened.
Think: self-driving car. You're the passenger. The AI makes every turn.

Augmentation means the agent helps you decide. You still make the call.
Think: GPS navigation. It suggests the route. You drive.

Most AI testing tools sell autonomy:

"Never write a test again!"
"Self-healing pipelines!"
"Zero maintenance!"

That sounds great in a demo.

It falls apart in production.

Google's testing team found that 1.5% of all test runs were flaky (2016 study). Nearly 16% of tests showed some flakiness over time. Microsoft reported 49,000 flaky tests across 100+ product teams (2022). These numbers haven't gotten better. Now imagine those tests were written by AI.

You don't have a testing problem.

You have a trust problem.

What Actually Happens When AI Writes Your Cypress Tests

I've watched AI code assistants generate test suites. Here's the pattern I see every time:

Day one: Beautiful. High coverage numbers. Clean syntax. The PR merges fast. Everyone celebrates.

Week two: A UI change breaks three tests. The self-healing kicks in. Tests pass again. Nobody checks what changed.

Month two: The self-healed selectors are now targeting the wrong elements. The tests pass. But they're testing the wrong things. Your coverage number says 90%. Your real coverage is closer to 40%.

Quarter end: A production bug ships. The test suite was green. The post-mortem reveals the AI "healed" a critical login test. It now clicks a decorative button instead of the submit button. Both are blue. Both say "Continue."

The AI didn't fail.

The architecture failed.

Nobody designed a system where AI decisions get verified.

The Architecture Cypress Teams Actually Need

Here's the playbook I'd build for any team using Cypress with AI in 2026.

Layer 1: AI Generates, Humans Gate

Use cy.prompt() (or any AI tool) to draft tests. That's the accelerator.

But treat AI-generated tests like pull requests from a junior developer.

// cy.prompt() generates the test
cy.prompt([
  'Visit the login page',
  'Type admin@company.com into the email field',
  'Type the password into the password field',
  'Click the sign in button',
  'Verify the dashboard loads'
])

Then eject that code. Review the selectors. Commit the explicit version.

// The reviewed, committed version
cy.visit('/login')
cy.get('[data-cy=email]').type('admin@company.com')
cy.get('[data-cy=password]').type(Cypress.env('TEST_PASSWORD'))
cy.get('[data-cy=submit-login]').click()
cy.url().should('include', '/dashboard')
cy.get('[data-cy=welcome-banner]').should('be.visible')

The AI got you there faster. A human verified the result.

That's augmentation.

Layer 2: The Trust Boundary in CI

Your pipeline needs a clear line:

On one side: things AI can do alone
On the other: things that need human eyes

# GitHub Actions - Trust Architecture
jobs:
  ai-generated-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - name: Run AI-assisted test suite
        run: |
          docker compose -f docker-compose.cypress.yml up \
            --abort-on-container-exit \
            --exit-code-from cypress

      - name: Validate no self-healed selectors
        run: |
          # Flag any tests that healed since last commit
          # Note: Requires a custom script to parse
          # Cypress Cloud API or stdout logs
          node ./scripts/check-healed-tests.js
          # If selectors changed, block the merge
          # Force a human review

      - name: Screenshot diff on healed tests
        if: failure()
        run: |
          # Capture what the AI "fixed"
          # Attach to PR for human review
          npx cypress run --spec "healed-tests/**" \
            --config screenshotOnRunFailure=true

The key:

Self-healed tests don't auto-merge. They create a review request. A human looks at what changed. Then decides.
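The workflow above calls scripts/check-healed-tests.js without showing it. Here is a minimal sketch of what such a script could look like. It assumes healed commands appear in the captured Cypress output as lines containing the text "Self-Healed"; that marker, the sample log, and the function names are illustrative assumptions, not a documented Cypress log format.

```javascript
// Hypothetical sketch of scripts/check-healed-tests.js.
// ASSUMPTION: healed commands show up in the captured Cypress output as
// lines containing the marker "Self-Healed"; adjust this to your real logs.
const HEALED_MARKER = 'Self-Healed';

// Return every log line that mentions a healed selector, with its line number.
function findHealedLines(logText, marker = HEALED_MARKER) {
  return logText
    .split(/\r?\n/)
    .map((text, index) => ({ line: index + 1, text: text.trim() }))
    .filter((entry) => entry.text.includes(marker));
}

// Turn the findings into a CI verdict: any healed line blocks the merge.
function verdict(healedLines) {
  const count = healedLines.length;
  return {
    exitCode: count > 0 ? 1 : 0,
    message:
      count > 0
        ? `${count} self-healed selector(s) need human review`
        : 'No self-healed selectors found',
  };
}

// Example with a made-up log excerpt.
const sampleLog = [
  'Running: login.cy.js',
  '  get [data-cy=submit-login] (Self-Healed: .btn-primary -> .btn-action)',
  '  1 passing',
].join('\n');

const result = verdict(findHealedLines(sampleLog));
console.log(result.message);
// In CI you would finish with: process.exitCode = result.exitCode
```

A non-zero exit code fails the step, which is what lets the pipeline route the change to a human instead of merging it.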

Layer 3: The Accountability Layer

Every AI decision in your pipeline needs a log.

Not just "test passed."

But: "test healed selector from .btn-primary to .btn-action on Feb 15."

// cypress.config.js
module.exports = {
  e2e: {
    experimentalPromptCommand: true,
    setupNodeEvents(on, config) {
      on('after:spec', (spec, results) => {
        // Parse Cypress stdout or Cloud API for healing events.
        // Self-healing data appears in the Command Log
        // but isn't yet exposed in results.stats.
        //
        // Option A: Parse terminal output for "Self-Healed" tags
        // Option B: Query Cypress Cloud API for spec run details
        // Option C: Build a custom Cypress plugin that listens
        //           to command events during the run

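        // parseHealingFromLogs() and logToAuditTrail() are placeholders
        // for helpers you implement with one of the options above.
        const healingEvents = parseHealingFromLogs(spec.name)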

        if (healingEvents.length > 0) {
          logToAuditTrail({
            spec: spec.name,
            healed: healingEvents.length,
            timestamp: new Date().toISOString(),
            details: healingEvents
          })
        }
      })
    }
  }
}

When something breaks in production, you can trace it back:

"The AI changed this selector on this date. Nobody reviewed it. That's the gap."

Without this layer, your pipeline is a black box.

Green doesn't mean correct. It means unchallenged.
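The config above leaves logToAuditTrail() undefined. One possible shape for it is an append-only JSON Lines file, sketched below. The file name, the { from, to } event shape, and the helper names are assumptions made for illustration; the source does not prescribe a storage format.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Build one audit record. `healingEvents` is whatever your own parser returns;
// the { from, to } shape used in the example below is an assumption.
function buildAuditEntry(specName, healingEvents, now = new Date()) {
  return {
    spec: specName,
    healed: healingEvents.length,
    timestamp: now.toISOString(),
    details: healingEvents,
  };
}

// Append the record as one JSON line so the file stays easy to grep and diff.
function logToAuditTrail(entry, filePath) {
  fs.appendFileSync(filePath, JSON.stringify(entry) + '\n', 'utf8');
}

// Read the trail back as an array of records.
function readAuditTrail(filePath) {
  return fs
    .readFileSync(filePath, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Example run against a throwaway file in the OS temp directory.
const auditDir = fs.mkdtempSync(path.join(os.tmpdir(), 'audit-'));
const auditFile = path.join(auditDir, 'healing-audit.jsonl');

logToAuditTrail(
  buildAuditEntry(
    'login.cy.js',
    [{ from: '.btn-primary', to: '.btn-action' }],
    new Date('2026-02-15T00:00:00Z')
  ),
  auditFile
);
console.log(readAuditTrail(auditFile));
```

An append-only file is the simplest option that still gives you a dated record per healing event; a database or a CI artifact would serve the same purpose.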

Layer 4: Docker as the Trust Container

Docker isn't just for consistency anymore.

It's your isolation boundary for AI-generated tests.

# docker-compose.cypress.yml
services:
  cypress-human-authored:
    build:
      context: .
      dockerfile: Dockerfile.cypress
    command: >
      npx cypress run
      --spec "cypress/e2e/human-authored/**"
    volumes:
      - ./results/human:/results

  cypress-ai-generated:
    build:
      context: .
      dockerfile: Dockerfile.cypress
    command: >
      npx cypress run
      --spec "cypress/e2e/ai-generated/**"
    volumes:
      - ./results/ai:/results
    # AI tests run in a separate container
    # Different reporting, different trust level

Separate the results. Report them differently.

Human-authored tests are your source of truth
AI-generated tests are your early warning system

When both agree: high confidence.
When they disagree: investigate.
When only AI tests pass: be suspicious.
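The three rules above can be written down as a tiny decision function. This is only a sketch: the function name and the returned labels are made up for illustration, and it assumes each container's outcome has already been reduced to a single pass/fail boolean.

```javascript
// Combine the outcomes of the two containers into one trust signal.
// Inputs are plain booleans: did that suite pass?
function trustSignal(humanPassed, aiPassed) {
  // Both agree (both green, or both red): the signal itself is trustworthy.
  if (humanPassed === aiPassed) return 'high-confidence';
  // Only the AI-generated suite is green: the least trustworthy combination.
  if (aiPassed && !humanPassed) return 'suspicious';
  // Human-authored suite is green but the AI suite is red: look into it.
  return 'investigate';
}

console.log(trustSignal(true, true));   // high-confidence
console.log(trustSignal(true, false));  // investigate
console.log(trustSignal(false, true));  // suspicious
```

Note that "both agree" includes the case where both suites fail: that is a high-confidence signal that something is really broken.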

The Uncomfortable Question

Here's where I need to be honest.

I've been in tech for 20 years, and spent the last 15 building delivery pipelines. I can debug a failing Docker container at 2 AM with my eyes half closed. I've configured CI/CD systems that run thousands of tests across dozens of services.

And I'm watching AI tools do parts of that job faster than I can.

That's not a threat.

That's a signal.

The value isn't in writing the cy.get() selector anymore.

The value is in designing the system where:

AI-generated selectors get verified
Self-healing gets audited
Trust has a paper trail

The Executor writes the test.

The Architect designs the trust system.

Most teams are building AI-powered testing without building AI-accountable testing. They're adding speed without adding trust.

That's technical debt with a new name.

What I'd Do This Week

If I ran a Cypress team today, here's my Monday morning plan:

Separate your test suites. Human-authored in one folder. AI-generated in another. Track them separately.
Add an audit log for self-healing. Every time cy.prompt() (or any AI tool) changes a selector, log it. Make it visible.
Block auto-merge on healed tests. Self-healed tests go into a review queue. A human approves. Every time.
Run AI tests in a separate Docker container. Different reporting pipeline. Compare results against human-authored tests.
Measure real coverage. Not line coverage. Not selector coverage. "Does this test actually verify the behavior we care about?" AI can inflate coverage numbers without testing anything meaningful.

None of this is anti-AI.

All of this is pro-trust.

The Bottom Line

Cypress + AI is the future. I believe that. cy.prompt() is a genuine leap forward.

The ability to write tests in plain English, the self-healing, the lower barrier to entry — all of it matters.

But the teams that win won't be the ones who automate the most.

They'll be the ones who trust the right things and verify everything else.

The bot that ships the wrong build doesn't get fired. You do.

Design accordingly.

Valdemar is a Docker Captain and Cypress Ambassador based in Canada. He builds CI/CD pipelines that don't lie to you. Find him at valdemar.ai.

Leading Quality Through Change: Balancing Speed, AI, and the Fundamentals That Matter

Ronald Williams — Thu, 29 Jan 2026 16:41:46 +0000

As software delivery accelerates and AI driven tooling reshapes how teams approach testing, many QA leaders are facing the same challenge: how to evolve quality practices without losing the fundamentals that keep teams effective, scalable, and trusted.

This tension shows up in real leadership decisions every day: framework selection, automation trade-offs, skill development, and responsible adoption of emerging tools. The conversations are rarely just about tools. They are about judgment, mindset, and how to guide teams through constant change without sacrificing long-term quality.

In this reflection, Lyle Smart, Director of Quality Assurance and Test Automation (SDET) at Continued, shares perspective shaped by real world leadership decisions. His experience offers practical guidance for QA and engineering leaders navigating speed, complexity, and sustainability in today’s delivery landscape.

When a technical decision becomes a leadership one

One defining moment for Lyle came during a new platform build, when he was faced with choosing a test automation framework.

On the surface, the decision appeared technical. In reality, it carried significant leadership weight.

That choice would influence how the team collaborated, how quickly engineers could onboard, and how quality practices would scale over time. For Lyle, the decision was less about picking the “best” framework and more about setting the foundation for how the team would work and grow together.

Framework decisions shape culture. They signal what a team values, how approachable quality is for new contributors, and whether testing becomes a shared responsibility or a bottleneck.

Excitement, innovation, and the need for discipline

Lyle describes QA leadership today as genuinely exciting. The pace of innovation, especially with AI, has opened up new possibilities for how teams think about testing and quality engineering.

At the same time, strong fundamentals still matter.

From his perspective, leading QA requires balancing innovation with discipline. Skilled QA professionals remain essential to guide quality decisions, apply context, and ensure tools are used intentionally rather than for novelty. AI can accelerate workflows, but it cannot replace judgment.

The role of QA leadership is increasingly about knowing when to lean into new capabilities and when to slow down and ask harder questions.

What really keeps QA leaders up at night

For leaders like Lyle, the biggest concern is not adopting the next tool. It is ensuring the department has the right mix of skills, both technical and interpersonal, to succeed in the future. This matters at a leadership level because it directly affects sustainability.

Teams need more than expertise with tools. They need communication skills, critical thinking, and the ability to adapt as systems, products, and expectations evolve. Without those skills, even the most advanced tooling can become a liability rather than an advantage.

Fast adoption, thoughtful use

Lyle points to real world examples where new capabilities required careful leadership, not just enthusiasm. When new features such as cy.prompt were introduced, adoption needed to be fast but thoughtful. The challenge was ensuring teams understood not only how the feature worked, but when it should be used and when it should not.
As a leader, he felt responsible for helping the team avoid unnecessary complexity or misuse that could reduce effectiveness instead of improving it. Clear guidance, shared standards, and open conversations became just as important as documentation.

Slowing down to move forward

These experiences shaped how Lyle approaches leadership today. Pressure can push teams toward fast solutions, but rushed quality decisions often create more work later. He now places greater emphasis on evaluating broader impact and long term consequences, especially when introducing new tools or practices.
Slowing down is not resistance to progress. It is a way to protect teams from churn, burnout, and fragile systems.

A final reflection for QA and engineering leaders

If Lyle could leave other QA or engineering leaders with one reflection, it would be this:

Keep learning. The future is exciting, and new tools and skills are essential. Just do not lose sight of the core principles of quality that make those tools effective in the first place.

If you are looking for leadership content that keeps you and your team ahead of the curve, register for CypressConf 2026 and learn from industry leaders defining success in modern software development.

🚀 Enhancing Cypress Test Stability and Retry Capabilities

Laerte Neto — Tue, 13 Jan 2026 15:31:18 +0000

Introduction

The cypress-retry-after-run plugin brings a smart way to rerun only the tests that failed in a previous Cypress run, saving pipeline time, infrastructure resources, and frustration when dealing with flaky tests. It was designed mainly for real-world teams that run large suites in CI/CD and do not want to pay the price of re-executing the entire test set just because a handful of tests were unstable.

The problem this plugin solves

In modern QA pipelines, it is very common to have:

Large suites that take minutes (or hours) to run and consume a lot of CI resources.
Intermittent tests (flaky tests) that fail occasionally due to environment instability, networking, data issues, and so on.
A real need to isolate and rerun only what failed, instead of manually rerunning everything or rerunning on demand, since the data can be corrupted at runtime.

Native Cypress retries will retry the test within the same execution, but that is not always what you want. In many pipelines, you first want to run the full suite, then do something (deploy fresh data, restart a service, clean up the environment), and only then trigger a new execution focused exclusively on the failures.

Core idea of `cypress-retry-after-run`

The plugin implements a two-step flow:

During the normal run, it listens to test execution and records failed tests into a .cypress-failures.json file at the project root.
Then you trigger a CLI command (cypress-retry) that reads this file, uses @cypress/grep under the hood, and starts a new Cypress run that executes only the tests that failed before.

In practice, the plugin's flow boils down to “run once, record failures, and rerun only what matters”.

Concrete benefits for CI/CD

For CI/CD pipelines, cypress-retry-after-run delivers very tangible advantages:

Time savings: instead of running the entire suite twice, the second run will usually be much smaller and focused only on the failed specs/cases.
Lower infrastructure cost: fewer runner minutes, fewer containers, less CPU and memory usage on shared environments.
More focused feedback: you quickly get a clean “retry run” that shows only the behavior of the failed tests. That helps distinguish real bugs from pure flakiness (caused by the tests themselves, the environment, bad data, or something else) and makes debugging much faster, especially in larger suites.

This pattern is especially useful for:

Pipelines that run multiple times a day.
Monorepos with dozens or hundreds of specs (which was my case, by the way).
Quality gates that only block merges if failures persist even after a dedicated retry run.

How the plugin works under the hood

The internal logic is simple but powerful:

Execution hook: the plugin plugs into Cypress via setupNodeEvents and listens to information about failed tests as the run progresses.
Persistence: at the end of the run, it writes a .cypress-failures.json file with identifiers of the tests/specs that failed.
Smart CLI: the cypress-retry command reads this file, builds the proper filters, and starts a new Cypress execution using @cypress/grep to run only the relevant tests.

Effectively, you get a selective replay of the failing tests, fully automated and integrated into your normal Cypress workflow.
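To make the record-then-filter idea concrete, here is a rough sketch of the two halves. It is not the plugin's actual code or file format. It assumes a results object shaped like the one Cypress passes to the after:run event (tests with a `title` array and a `state` string), and it assumes @cypress/grep treats semicolon-separated title substrings as "match any of these". Check both assumptions against the versions you use.

```javascript
// ASSUMPTION: `results` mimics the object Cypress passes to after:run,
// where each test has a `title` array and a `state` string.
function collectFailedTitles(results) {
  const titles = [];
  for (const run of results.runs || []) {
    for (const test of run.tests || []) {
      if (test.state === 'failed') {
        // The last element of the title array is the test's own name.
        titles.push(test.title[test.title.length - 1]);
      }
    }
  }
  // Remove duplicates while keeping the original order.
  return [...new Set(titles)];
}

// ASSUMPTION: @cypress/grep accepts ";"-separated substrings as "match any".
function buildGrepValue(titles) {
  return titles.join('; ');
}

// Made-up results from a first run: one passing test, two failing ones.
const sampleResults = {
  runs: [
    {
      tests: [
        { title: ['Login', 'logs in with valid credentials'], state: 'passed' },
        { title: ['Login', 'rejects a wrong password'], state: 'failed' },
      ],
    },
    { tests: [{ title: ['Export', 'exports the table'], state: 'failed' }] },
  ],
};

const failedTitles = collectFailedTitles(sampleResults);
console.log(buildGrepValue(failedTitles));
// prints "rejects a wrong password; exports the table"
```

The resulting string is the kind of value you would hand to the grep filter on the second run. A title that itself contains a semicolon would need extra handling in this sketch.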

Easy installation

Installation follows the standard pattern for modern Cypress plugins, with no extra friction.

With npm:
- npm install --save-dev cypress-retry-after-run @cypress/grep
With yarn:
- yarn add -D cypress-retry-after-run @cypress/grep

The only additional requirement is @cypress/grep, which the plugin uses to filter tests on the retry run, so it is installed alongside the plugin in a single command.

JavaScript and TypeScript configuration

This plugin can be used in both JS and TS projects. Refer to the plugin's official npm link to get the full instructions on how to set it up and how to run it:

https://www.npmjs.com/package/cypress-retry-after-run

Pipeline and automation integration

cypress-retry-after-run fits naturally into any CI/CD pipeline design.

Step 1 – Main run:
- A standard job that runs the full suite (if you want) and generates .cypress-failures.json if there are failures.
Step 2 (Optional) – Do anything, like cleaning a database, or any operation in the environment you want, if necessary.
Step 3 – Automated retry:
- A second job/step that executes npm run retry (or an equivalent command you created) only if the failures file exists and/or if the previous job failed.

This design enables:

Conditional pipelines (retry only if there were actual failures).
Richer monitoring (separate dashboards for the full run and the retry run).
Smarter alerting (a test that still fails even after a dedicated retry can trigger a stronger alert or block a merge).
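The "smarter alerting" idea can be sketched as a small comparison between the failures of the main run and the failures of the retry run. The function name and the input shape (plain arrays of test titles) are assumptions for illustration; how you extract those titles depends on your reporter.

```javascript
// Compare the failures of the first run with the failures of the retry run.
// Both inputs are arrays of test titles.
function classifyAfterRetry(firstRunFailures, retryFailures) {
  const stillFailing = new Set(retryFailures);
  // Failed twice: most likely a real bug (or a broken test).
  const persistent = firstRunFailures.filter((title) => stillFailing.has(title));
  // Failed once, passed on retry: most likely flaky.
  const flaky = firstRunFailures.filter((title) => !stillFailing.has(title));
  return { persistent, flaky, blockMerge: persistent.length > 0 };
}

const outcome = classifyAfterRetry(
  ['rejects a wrong password', 'exports the table'],
  ['exports the table']
);
console.log(outcome);
```

With this split, a quality gate can block the merge only on `persistent` failures while still reporting the `flaky` ones for follow-up.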

Why this plugin stands out

A few things make cypress-retry-after-run stand out in the Cypress ecosystem:

Built from real QA pain points: the “run everything, fix the environment, then rerun only failures” flow comes directly from production CI/CD needs.
Native integration with @cypress/grep: instead of reinventing filtering, it relies on a widely used community library, staying aligned with the Cypress ecosystem.
Minimal configuration: just a few lines in cypress.config and in the support file are enough to adopt it in both new and legacy projects.
Lightweight and focused: small package, no unnecessary dependencies, easy to drop into any repository without bloating your project.

For teams that already care deeply about automation quality and are tired of flakiness and wasted CI resources, cypress-retry-after-run is a strong ally to make pipelines more efficient, predictable, and truly professional.

Triggering Cypress End-to-End Tests Manually on Different Browsers with GitHub Actions

Talking About Testing — Fri, 19 Dec 2025 19:29:32 +0000

A Practical Guide to Cross-Browser Testing

One of the most practical features of GitHub Actions is the ability to manually trigger workflows and pass parameters at runtime. This is especially useful for end-to-end (E2E) testing, where you may want to select which browser to run the tests against rather than hard-coding that choice.

In this post, I'll walk you through a GitHub Actions workflow written in YAML that allows you to manually trigger Cypress tests and select the target browser directly from the GitHub UI.

The Full Workflow

Here's the workflow I'll be explaining:

name: End-to-end tests 🧪
on:
  workflow_dispatch:
    inputs:
      browser:
        description: "Browser to run tests"
        type: choice
        required: true
        default: chrome
        options:
          - chrome
          - edge
          - electron
          - firefox
          - safari
jobs:
  cypress-run:
    runs-on: ubuntu-24.04
    steps:
      - name: Checkout
        uses: actions/checkout@v6

      - name: Install WebKit system deps (Safari)
        if: ${{ github.event.inputs.browser == 'safari' }}
        run: npx playwright install-deps webkit

      - name: Cypress run
        uses: cypress-io/github-action@v6
        with:
          command: npm run test:${{ github.event.inputs.browser }}

      - name: Upload screenshots (selected browser, on failure)
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: screenshots-${{ github.event.inputs.browser }}
          path: cypress/screenshots
          if-no-files-found: ignore

Naming the Workflow

name: End-to-end tests 🧪

This is the friendly name shown in the GitHub Actions UI. Adding an emoji is optional, but it makes workflows easier to scan—especially when you have many of them.

Manual Trigger with `workflow_dispatch`

on:
  workflow_dispatch:

The workflow_dispatch event enables manual execution of the workflow. This means:

The workflow won't run automatically on push or pull_request
A "Run workflow" button will appear in the Actions tab

This is ideal for:

Ad-hoc test runs
Debugging browser-specific issues
Running tests before a release

Defining Input Parameters

inputs:
  browser:
    description: "Browser to run tests"
    type: choice
    required: true
    default: chrome
    options:
      - chrome
      - edge
      - electron
      - firefox
      - safari

This is the heart of the workflow.

What's happening here?

browser is a required input
The user must select one value from a predefined list
The default option is chrome

On GitHub, this renders as a dropdown selector in the UI.

This prevents invalid values and makes the workflow safer and more user-friendly.

Defining the Job

jobs:
  cypress-run:
    runs-on: ubuntu-24.04

The workflow has a single job called cypress-run
It runs on the ubuntu-24.04 GitHub-hosted runner

Ubuntu runners are commonly used for Cypress because they're fast, stable, and well supported by the Cypress GitHub Action.

Step 1: Checking Out the Code

- name: Checkout
  uses: actions/checkout@v6

This step pulls your repository code into the runner so that:

Cypress tests
package.json
Configuration files

are available during execution.

This step is required in almost every CI workflow.

Step 2: Installing Safari (WebKit) Dependencies Conditionally

- name: Install WebKit system deps (Safari)
  if: ${{ github.event.inputs.browser == 'safari' }}
  run: npx playwright install-deps webkit

This is an excellent example of conditional execution in GitHub Actions.

Why is this needed?

Cypress runs Safari tests via WebKit
WebKit requires additional system dependencies on Linux
These dependencies are unnecessary for other browsers

What the condition does

The if expression ensures that this step:

Runs only when safari is selected
Is skipped for all other browsers

This keeps the workflow:

Faster
Cleaner
Easier to maintain

Notes:

It's worth mentioning that for Cypress to work with WebKit:

The experimentalWebKitSupport property has to be set to true in the cypress.config.js file
playwright-webkit has to be installed as a dev dependency (e.g., npm i playwright-webkit -D)

Pro tip: Make sure to version not only the package.json file, but also the package-lock.json.
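For reference, a minimal sketch of the configuration the first note describes is shown below. It uses a plain exported object to stay self-contained; many projects wrap the same object in defineConfig from the cypress package. The empty e2e block is a placeholder, not a recommendation.

```javascript
// cypress.config.js (sketch)
const config = {
  // Required before "cypress run --browser webkit" will work.
  experimentalWebKitSupport: true,
  e2e: {
    // ASSUMPTION: placeholder; keep your existing e2e options here.
  },
};

module.exports = config;
```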

Step 3: Running Cypress with the Selected Browser

- name: Cypress run
  uses: cypress-io/github-action@v6
  with:
    command: npm run test:${{ github.event.inputs.browser }}

This step uses the official Cypress GitHub Action.

Key idea

The browser input is injected dynamically into the command:

npm run test:chrome
npm run test:firefox
npm run test:safari

This implies that your package.json contains scripts like:

{
  "scripts": {
    "test:chrome": "cypress run --browser chrome",
    "test:firefox": "cypress run --browser firefox",
    "test:safari": "cypress run --browser webkit"
  }
}

This pattern keeps:

The workflow generic
Browser-specific logic inside your project configuration
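The package.json excerpt above lists three of the five dropdown options. As a sketch, the full set of scripts could be derived from a single mapping between the workflow input and the name Cypress expects after --browser. The helper name and the idea of generating the scripts are illustrative assumptions; writing the five entries by hand works just as well.

```javascript
// Workflow dropdown value -> browser name passed to "cypress run --browser".
const CYPRESS_BROWSER = {
  chrome: 'chrome',
  edge: 'edge',
  electron: 'electron',
  firefox: 'firefox',
  safari: 'webkit', // Cypress runs "Safari" tests through WebKit
};

// Build the "scripts" section that the workflow's npm run test:<browser> expects.
function buildTestScripts(browserMap = CYPRESS_BROWSER) {
  const scripts = {};
  for (const [input, cypressName] of Object.entries(browserMap)) {
    scripts[`test:${input}`] = `cypress run --browser ${cypressName}`;
  }
  return scripts;
}

console.log(JSON.stringify({ scripts: buildTestScripts() }, null, 2));
```

Every option in the dropdown needs a matching script; otherwise the job fails with a missing-script error for that browser.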

Step 4: Uploading Screenshots on Failure

- name: Upload screenshots (selected browser, on failure)
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: screenshots-${{ github.event.inputs.browser }}
    path: cypress/screenshots
    if-no-files-found: ignore

This step runs only if a previous step in the job has failed.

What it does

Uploads Cypress screenshots as workflow artifacts
Names the artifact based on the selected browser
Avoids failing the workflow if no screenshots exist

Why this matters

When an E2E test fails:

Screenshots provide visual context
Browser-specific issues are easier to diagnose
Artifacts are preserved for later analysis

Final Thoughts

This workflow demonstrates a clean and scalable way to:

Manually trigger Cypress tests
Select the target browser at runtime
Handle browser-specific dependencies
Collect meaningful artifacts on failure

It's a powerful pattern for teams that care about cross-browser confidence without overloading their CI pipelines with unnecessary runs.

More than just automation, this approach puts control and observability back in the team's hands—exactly where quality belongs.

For the complete implementation, take a look at the cross-browser-testing-gha GitHub repository.

Would you like to learn E2E Testing with Cypress from scratch until your tests are running on GitHub Actions and integrated with the Cypress Cloud?

Consider subscribing to my course: "Cypress, from Zero to the Cloud."

How to validate tables, rows or any content of an Excel file using Cypress

Marcelo C. — Fri, 31 Oct 2025 11:43:24 +0000

At the company I work for, we already have many test cases that validate a key behavior of our SaaS: the user downloads a table as an Excel file containing the information they need. But we also needed to cover some edge cases in which the file's content had to match what the table showed.

This means that Cypress needs to deterministically validate rows, numbers, names and even colors inside the Excel file produced by our user flows. After some research, we settled on two Node.js libs: @e965/xlsx and exceljs.

@e965/xlsx is mostly used for data content validation, as in validating JSON rows straight from the sheet, while exceljs is more focused on style assertions, meaning checks like “is A1 light-green?”. Splitting the work between the two keeps tests readable and fast.

Configuring @e965/xlsx library

First, wire up the configuration in Cypress. Head off to Node with cy.task(). It’s the official way to run filesystem code from Cypress tests: register tasks in setupNodeEvents and they’ll return values back to your spec.

Remember to also import the package on the config file:

//cypress.config.ts

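const fs = require('fs');
const path = require('path');
const xlsx = require('@e965/xlsx');

// Small helper used by the task below to pause between polling attempts.
const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms));
// Note: the task also expects a `downloadsDir` variable pointing to the
// Cypress downloads folder; it must be defined in the elided part of this file.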
...
...
...
      on('task', {
        async readExcelByPattern(pattern: string, timeoutMs = 15000) {
          const re = new RegExp(pattern);
          const end = Date.now() + timeoutMs;

          while (Date.now() < end) {
            const files = fs.readdirSync(downloadsDir).filter(f => re.test(f) && !f.endsWith('.crdownload') && !f.endsWith('.tmp'));

            if (files.length) {
              const { fullPath } = files
                .map(f => {
                  const fullPath = path.join(downloadsDir, f);
                  return { fullPath, mtime: fs.statSync(fullPath).mtimeMs };
                })
                .sort((a, b) => b.mtime - a.mtime)[0];

              await sleep(200);

              const wb = xlsx.readFile(fullPath);
              const sheet = wb.Sheets[wb.SheetNames[0]];
              const data = xlsx.utils.sheet_to_json(sheet);
              return { fileName: path.basename(fullPath), data };
            }

            await sleep(300);
          }

          throw new Error(`No .xlsx file matching /${pattern}/ found in "${downloadsDir}" within ${timeoutMs}ms`);
        },

You can see that readExcelByPattern is the task we should call to validate the content like rows, tables and any information inside the Excel file. You can then define it inside your test context and methods (or define it globally over commands.ts if you plan to use it in many tests), but for a single test it should look something like this:

//my-testing-context.ts

  @step('Read downloaded excel values')
  readExcelDownloadedFile(
    pathToFile: string = 'excel_export/bugs/',
    fixture: string,
    fileName: string
  ): ExportPrintableReportContext<TParent> {
    cy.fixture(pathToFile + fixture).then((expected: any[]) => {
      cy.task('readExcelByPattern', fileName).then(({ data }: { data: any[] }) => {
        expect(data.length, 'Table length').to.equal(expected.length);
        expected.forEach((expectedRow, i) => {
          const actualRow = data[i];
          Object.entries(expectedRow).forEach(([key, expectedValue]) => {
            const actualValue = actualRow[key] === undefined ? null : actualRow[key];
            expect(actualValue, `Row ${i} - Column "${key}"`).to.equal(expectedValue);
          });
        });
      });
    });
    return this;
  }

As you can see, it's pretty straightforward. It loads a JSON file from 'fixtures/excel_export/bugs/' that already contains the values you expect to find in the Excel file. It first checks that the table length matches, then loops over each expected row and compares every column against the value read from the sheet.

And this is how it would look inside a test:

import { testContext } from '@/my-testing-context'

const testingContext = new testContext();

describe('example of reading excel files', () => {
  it('case 1', () => {
    // `tk` holds the fixture file name and must be defined elsewhere in the spec
    testingContext.readExcelDownloadedFile('excel_export/bugs/', tk, 'Excel.xlsx');
  });
});

Basically it checked 171 rows of the file content and succeeded in 40 seconds.
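The row-by-row comparison inside readExcelDownloadedFile can also be sketched as a pure function, which makes the logic easy to try outside Cypress. The function name and the sample rows are made up for illustration. Like the original, it treats a missing column as null, presumably because empty cells are left out of the parsed rows, so the fixture marks blanks with null.

```javascript
// Compare expected fixture rows with rows parsed from the sheet.
// Returns a list of mismatches instead of throwing on the first one.
function diffRows(expected, actual) {
  const mismatches = [];
  if (expected.length !== actual.length) {
    mismatches.push({
      reason: 'row count',
      expected: expected.length,
      actual: actual.length,
    });
  }
  expected.forEach((expectedRow, i) => {
    const actualRow = actual[i] || {};
    for (const [column, expectedValue] of Object.entries(expectedRow)) {
      // A column missing from the parsed row is treated as null.
      const actualValue = actualRow[column] === undefined ? null : actualRow[column];
      if (actualValue !== expectedValue) {
        mismatches.push({
          reason: 'value',
          row: i,
          column,
          expected: expectedValue,
          actual: actualValue,
        });
      }
    }
  });
  return mismatches;
}

// Made-up data: the second row has a wrong Total, Note is blank in both rows.
const expectedRows = [
  { Name: 'Ana', Total: 3, Note: null },
  { Name: 'Bo', Total: 5, Note: null },
];
const sheetRows = [
  { Name: 'Ana', Total: 3 },
  { Name: 'Bo', Total: 6 },
];

console.log(diffRows(expectedRows, sheetRows));
```

Collecting all mismatches first, then asserting that the list is empty, gives a fuller failure message than stopping at the first differing cell.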

For the second part of this tutorial, I'll expand on how to validate Excel colors as well. Happy testing!

How to get most out of cy.prompt() - 6 tips and tricks for your new AI tool!

Marcelo C. — Thu, 09 Oct 2025 13:54:06 +0000

I know, I know, Cypress has just announced a game changing feature with cy.prompt() that is going to change the way we test - or at least approach how we think of it. You're going to use Natural Language all the way to test your new app? Read through my recommendations then!

As a Cypress Ambassador, I was lucky enough to use cy.prompt over the past few weeks, and here are a few tips to make your testing go a bit more smoothly.

1) Start your phrase with the action or assertion you want

Instead of giving it an instruction like:

When the page loads, check that the header is seen and then click on Create button

It would be better to write:

Wait 8 seconds for the page to load
Assert that the header is visible
Click on create button

Now Cypress will translate your instructions more easily - a hardcoded wait, followed by an assertion, followed by a click.

2) Try to separate instructions

The previous step gave it away already! cy.prompt works like any other LLM-backed tool: give clear instructions about what you want to do and it will have a better chance of executing them.

Do not mix assertions, forced clicks, and reloads in the same instruction! Each step has to be understood to go through, so act a bit like a prompt engineer and, step by step, you'll get there.

3) You can have up to 20 steps for each prompt you execute

20 is the limit, OK. But that doesn't mean you need 20 steps in every prompt. Also, the more steps you add, the more prone it is to ask for clarifications or make mistakes.

Think of it this way: each plain English line you introduce is an abstraction layer, right? Do you want an over-complicated test, or an easy-to-read, understandable test (especially for non-developers)?

Less is more in some cases!
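Tips 1 to 3 can be turned into a small pre-flight check on the steps array before it reaches cy.prompt(). This is only a sketch: the helper name and the chaining-word heuristic are my own assumptions, and the 20-step limit is the one quoted above.

```javascript
const MAX_STEPS = 20; // limit quoted in tip 3

// Heuristic: words that usually signal two instructions squeezed into one step.
const CHAINING_WORDS = /\b(and then|then|after that)\b/i;

// Return human-readable warnings for a list of prompt steps.
function reviewPromptSteps(steps) {
  const warnings = [];
  if (steps.length > MAX_STEPS) {
    warnings.push(`Too many steps: ${steps.length} (limit ${MAX_STEPS})`);
  }
  steps.forEach((step, i) => {
    if (CHAINING_WORDS.test(step)) {
      warnings.push(`Step ${i + 1} may combine several actions: "${step}"`);
    }
  });
  return warnings;
}

const mixed = [
  'When the page loads, check that the header is seen and then click on Create button',
];
const separated = [
  'Wait 8 seconds for the page to load',
  'Assert that the header is visible',
  'Click on create button',
];

console.log(reviewPromptSteps(mixed));     // one warning about step 1
console.log(reviewPromptSteps(separated)); // []

// Inside a Cypress spec you would then call: cy.prompt(separated)
```

The heuristic is deliberately crude; its only job is to nudge authors toward one action or assertion per line.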

4) Leave some tests in prompt in order to validate flaky behavior

Got a new feature? Want to avoid brittleness in your E2E tests? Want to check for any weird behavior here and there? Then cy.prompt is the way to go! You can always leave your tests in plain English to see if the BDD/TDD behavior stays the same.

Remember: it works both locally and in CI, but it only supports Chrome and Chromium-based browsers (Edge/Electron). Any others are out (sorry Firefox!). Leave the test in prompt form in your CI for a few days or weeks and see what happens.

5) Portability first?

For me, any test written in plain English has a natural advantage over the others: it doesn't need to be refactored into another programming language. So if you already know that the application you're working on is being ported to another modern framework, leave your tests in prompt format. Your devs will appreciate it!

6) It's always cached

Another advantage with cy.prompt is that once it runs, it will cache the steps in order to avoid LLM interaction. But if you change one line in your prompt - wait 15 seconds instead of 8, for example - it will execute all over again.

Remember this to focus on speed and reliability in your tests!

How Cypress will revolutionize the use of AI in testing with cy.prompt()

Marcelo C. — Thu, 09 Oct 2025 10:28:22 +0000

Cypress has become the go-to testing framework for SDETs and QA engineers to validate modern web apps. It’s fast, reliable, and backed by a mature ecosystem—both in software updates and excellent documentation. Add to that the vibrant community building powerful plugins and extensions, and it’s clear why Cypress dominates the testing landscape.

Cypress is taking a bold step into AI-powered testing with the upcoming cy.prompt(). Unlike typical AI integrations that act as external copilots or rely on general-purpose MCP-style assistants, cy.prompt() builds the intent (what we want) directly into the testing workflow.

This means no context switching, no juggling between an IDE plugin and your test runner. Instead, Cypress allows you to describe your intent in plain English, and the AI automatically generates selectors, actions, and assertions right inside your test.

It’s a shift from writing tests line by line to guiding your tests conversationally. Think less about cy.get() or cy.click() and more about telling Cypress what you want verified, letting the framework translate that into executable code.
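As a rough sketch of what "describing intent" can look like, the snippet below keeps a scenario as plain-English steps and hands them to `cy.prompt()`. The URL, the wording of the steps, and the names `PromptCapable`, `loginSteps`, and `runLoginIntent` are all hypothetical; the only thing assumed about Cypress is that `cy.prompt()` accepts an array of natural-language steps.

```typescript
// Only the slice of the `cy` object this sketch relies on, so the example
// can be read (and type-checked) without Cypress installed.
interface PromptCapable {
  prompt(steps: string[]): unknown;
}

// Hypothetical login scenario, written as intent rather than selectors.
const loginSteps: string[] = [
  "visit https://example.com/login",
  'type "demo-user" in the username field',
  'type "demo-password" in the password field',
  "click the login button",
  "verify the dashboard heading is visible",
];

// Hands the plain-English steps to cy.prompt(); no cy.get() or cy.click().
function runLoginIntent(cy: PromptCapable): unknown {
  return cy.prompt(loginSteps);
}
```

In a real spec this would sit inside an `it()` block as `runLoginIntent(cy)`, or simply `cy.prompt(loginSteps)`. Depending on your Cypress version, the command may first need to be enabled as an experimental feature.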

Here’s a video demonstration of cy.prompt() in action:

This is the code that I used in the validation:

And here are the code locators the prompt suggests right after it is executed:

Also, you can leave the prompt as it is and push to your CI/CD pipeline:

With cy.prompt(), Cypress is no longer “just” a testing framework—it’s stepping into the AI-assisted development era. For SDETs and QA engineers, this means faster authoring, smarter locator handling, and easier onboarding for teams.

The possibilities of cy.prompt()

cy.prompt focuses on the intent: what we want, not how to do it. It's a great tool for non-developers, or anyone who doesn’t want to dive deep into app implementation.
Imagine writing the BDD (Behavior-Driven Development) acceptance criteria directly into the test. You'll have the best of both worlds here: BDD criteria that are understood by all stakeholders (Project Managers, Product Owners), and the code being executed in the background.
TDD (Test-Driven Development) is also covered for the developers. Imagine developing a feature while its test, step by step (word by word, line by line), starts to pass, until the feature is ready for deployment.
Portability is here: Need to refactor your project? Move from one programming language to another? You don't need to change a thing in your tests written in plain English; they can be easily shared, exported, or integrated across different systems.
Also, another great benefit here is self-healing tests: they’re more resilient to changes in the DOM or selectors. This feature could fundamentally change how we approach automation.

The future of QA is not just code—it’s collaboration between AI and testers.

What are your thoughts?

Share, comment, or connect with me directly on LinkedIn!

Six Technical Sessions That Will Change How You Think About Testing

Ronald Williams — Tue, 02 Sep 2025 15:04:00 +0000

Tired of expensive learning materials that take weeks to complete but teach you nothing you can't Google? Fed up with content that assumes you're still figuring out basic assertions when you're managing complex test architectures?

CypressConf 2025 workshops solve the learning problem that plagues experienced developers: finding advanced, practical content that matches your skill level and your specific setup, without the cost barriers or wasted time. Over two exclusive workshop days (October 23–24), global industry practitioners will teach you competitive skills you can implement Monday morning. These workshops were designed based on years of attendee feedback from our global community, focusing on the real problems you've told us you're solving right now.

Slack 'n' Roll: CI/CD Pipelines with GHA
Led by: Tanya Sahni, Software Developer in Test at Fashion Cloud

Your CI/CD pipeline should work for you, not against you. Tanya will show how to integrate Cypress into GitHub Actions so your tests run automatically and notify your team intelligently. This intermediate to advanced workshop assumes you're already comfortable with CI/CD concepts and want to build pipelines that scale reliably, eliminating the manual check dance that wastes hours every sprint and positioning you as the developer who builds systems that move as fast as your code.

Advocate for Quality Within Your Company
Led by: Péter Földházi, Quality Architect at EPAM Systems

Quality isn't just a QA responsibility. It's an organizational capability. This senior-level workshop teaches experienced QA professionals how to become effective advocates for quality across engineering teams. You'll learn to speak business language while maintaining technical standards, turning quality from a cost center into a competitive advantage and transforming how your organization views testing so you lead change instead of reacting to it.

Operationalizing Quality with Data That Matters
Presented by: Dan Johansen, Senior Product Manager at Cypress.io

Raw testing data is noise. Actionable insights are signal. Dan demonstrates how to turn test results into meaningful metrics that improve release decisions. This intermediate to senior workshop teaches which data points actually matter and how to present them in ways that influence engineering strategy, enabling you to make data-driven quality decisions that leadership understands and supports.

Data Driven Testing with Cypress
Delivered by: Marko Kolasinac, CEO at Assert QA, and Dejan Živković, QA Automation Engineer at Assert QA

Hardcoded test data creates maintenance nightmares. This intermediate workshop shows how to design resilient, data-powered testing strategies that don't interfere with production systems. Marko and Dejan assume you understand testing fundamentals and focus on building migration approaches that scale without the complexity overhead that kills productivity.

Simplifying Cypress Testing
Led by: Walmyr Filho, Instructor and Founder at Talking About Testing

Writing Cypress tests isn't just writing JavaScript. It requires different thinking. Whether you're new to Cypress or have been using it for years, Walmyr shares practical techniques for writing maintainable tests that grow with your product complexity. You'll learn patterns that prevent technical debt before it accumulates, building tests that remain valuable as your codebase evolves rather than becoming liabilities.

Authentication Workflows with Cypress & Mailosaur
Led by: Filip Hric, Developer Educator at filiphric.com

Authentication testing is notoriously brittle. Filip walks through testing authentication flows so your login and access systems remain reliable across environments. This intermediate to advanced workshop handles real-world scenarios including email verification, multi-factor authentication, and role-based access, helping you secure reliable user experiences without compromising test stability.

What Makes These Workshops Different

These sessions were built from years of global community feedback. Developers told us they needed advanced content that respects their experience level, practical sessions they could apply immediately, and learning opportunities that didn't require expensive course subscriptions or weeks of commitment.

Each workshop delivers concentrated expertise from practitioners who've built testing systems at scale. No generic tutorials. No basic concepts you already know. Just advanced techniques that solve real problems you're encountering in production environments.

Workshop seats are intentionally limited and fill fast. You must be registered for CypressConf 2025 to access workshops, and early registrants get first access before standby lists open.

More workshops are coming soon.

Cypress v15: A Better User Experience

Talking About Testing — Tue, 19 Aug 2025 23:45:47 +0000

Streamlined Features and Improvements for Modern Testing

Cypress v15 is just around the corner, and one of the exciting changes is an improved user experience in the command logs of the test runner. If you’ve spent time debugging tests in v14, you’ll immediately notice how v15 makes test execution logs easier to scan, parse, and act upon.

In this post, we’ll walk through the key UX improvements.

At the end of this post, screenshots will illustrate the differences between v14 and v15.

Hierarchical Grouping

In v15, the test log is segmented into clear sections with borders to distinguish between: SESSIONS, BEFORE EACH, and TEST BODY.
In v14, SESSIONS, BEFORE EACH, and TEST BODY also appear, but there are no borders to separate and isolate them.

Cleaner Visual Density

The v15 layout introduces tighter grouping and spacing, so related blocks are easier to scan. The result is less eye travel compared to v14’s list.

Final Thoughts

Cypress v15 isn’t just a version bump—it’s a thoughtful step forward in how developers and testers interact with their test runner. By rethinking the command log experience with hierarchical grouping and cleaner visual density, Cypress reduces cognitive load and makes debugging more intuitive.

If you’ve ever found yourself lost in the flat logs of v14, v15 will feel like a breath of fresh air. These changes may seem subtle, but they directly improve day-to-day productivity and test clarity. And when it comes to testing, minor UX improvements often translate into big wins for speed, focus, and confidence.

As Cypress continues to evolve, v15 is a reminder that great tooling isn’t just about raw features—it’s about delivering a user experience that helps teams ship quality software faster.

Illustrations comparing Cypress versions 14 and 15

v14 session collapsed

v15 session collapsed

v14 session expanded

v15 session expanded

Would you like to learn more about web testing with Cypress?
Check out the "Cypress, from Zero to the Cloud" course from the Talking About Testing online school, and happy testing!
