AI-generated Cypress tests are promising, but by default, the AI has never seen your app.
Out of the box, cy.prompt() has no knowledge of your specific app: it does not know your selectors, your valid usernames, or your known failure scenarios. It works from general knowledge, not your actual codebase.
That is where Retrieval-Augmented Generation (RAG) comes in. Instead of relying on a generic AI, you feed it your own documentation: your API spec, your component library, your bug history. When a test is being generated, it pulls in whatever is relevant and uses that as its foundation.
I tried this locally using Sauce Demo, a free e-commerce app built for testing practice. I created three simple docs:
- An API spec covering login, inventory, cart, and checkout
- A component doc with exact CSS selectors for every page
- A bug history doc with known failure scenarios
I indexed these into ChromaDB using Google Gemini embeddings. When I queried "user login with valid credentials", it retrieved exactly the right context: the login API spec, the correct selectors, and the locked-out-user bug. No guessing.
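To see the retrieval step in isolation, here is a toy stand-in that runs with no dependencies. A crude word-overlap score replaces the Gemini embeddings and cosine similarity that ChromaDB actually computes, and the doc contents are invented one-liners, but the shape of the step is the same: rank the stored chunks against the query and keep the top matches.

```javascript
// Toy retrieval sketch. In the real setup, ChromaDB stores Gemini
// embeddings and ranks by cosine similarity; here a word-overlap
// score stands in so the example is self-contained.
const docs = [
  { id: "api-spec", text: "POST /login accepts username and password; valid users include standard_user" },
  { id: "components", text: "Login page selectors: #user-name, #password, #login-button" },
  { id: "bug-history", text: "locked_out_user login fails with error: user has been locked out" },
];

// Count how many query words appear in the document text.
function score(query, text) {
  const q = new Set(query.toLowerCase().split(/\W+/));
  return text.toLowerCase().split(/\W+/).filter((w) => q.has(w)).length;
}

// Return the k best-scoring docs for a query, highest score first.
function retrieve(query, k = 2) {
  return [...docs]
    .sort((a, b) => score(query, b.text) - score(query, a.text))
    .slice(0, k);
}

const hits = retrieve("user login with valid credentials", 3);
console.log(hits.map((d) => d.id));
```

The real pipeline swaps `score` for an embedding model, which is what lets "valid credentials" match a chunk that says "username and password" without any shared words.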
The difference was immediate. Instead of guessing, cy.prompt() had real context to work from: it resolved the correct selectors, mapped the prompt to the right flows, and the test passed.
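The glue between retrieval and the test is just string assembly. The sketch below shows one way to stitch retrieved chunks into the prompt; `buildPrompt` is a hypothetical helper, not part of Cypress, and the chunk strings mirror Sauce Demo's actual login selectors.

```javascript
// buildPrompt is a hypothetical helper: it prepends the retrieved doc
// chunks to the task description so the model works from real
// selectors and known bugs instead of general knowledge.
function buildPrompt(chunks, task) {
  return [
    "Context from project docs:",
    ...chunks.map((c, i) => `[${i + 1}] ${c}`),
    "",
    `Task: ${task}`,
  ].join("\n");
}

const prompt = buildPrompt(
  [
    "Login selectors: #user-name, #password, #login-button",
    "Known bug: locked_out_user cannot log in",
  ],
  "Log in as a valid user and verify the inventory page loads."
);
console.log(prompt);
```

Inside a spec, the result would be handed straight to the experimental command, e.g. `cy.prompt(prompt)`, assuming it accepts a plain string as it does in the current Cypress preview.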
That said, it is not a replacement. You will always need a human to write better assertions. You will always need a human to cover intent. And any context that never made it into your docs will not show up in your tests either.
I am curious. If you have tried this, did the AI surprise you with what it caught, or what it missed?