Every E2E test suite I've worked on has the same lifecycle: build it, trust it for a few sprints, then watch it slowly become the thing that slows you down instead of speeding you up.
The root cause isn't bad tests. It's a bad mental model. We treat CSS selectors and XPath expressions as the definition of our tests. They're not. They're a performance optimization.
## The problem in one sentence
When a button moves from the sidebar to the header, the test fails. But the user journey — "click Submit" — didn't change. The test broke because the how changed, not the what.
Traditional test automation conflates these two things:
```js
// This is "how" — breaks when the DOM changes
await page.click('.sidebar > .btn-primary-submit');
```
What if we separated intent from implementation?
## The mental model: locators are a cache
Think of locators like a database cache:
- Cache hit: The locator resolves. Test runs at full speed (~1ms per step). Deterministic.
- Cache miss: The locator is stale. Fall back to the intent ("click the Submit button") and use AI to re-resolve the element. Update the cache.
This is exactly how a CDN works: serve from cache when possible, fetch from origin when needed.
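The cache-hit/cache-miss flow above can be sketched in a few lines of TypeScript. This is a minimal illustration, not a real implementation: `tryCachedLocator` and `resolveByIntent` are hypothetical stand-ins for a locator engine and an AI resolver, simulated here with plain functions.

```typescript
type Step = { intent: string; locator?: string };

// Hypothetical fast path: treat any stored locator as resolvable.
function tryCachedLocator(locator?: string): string | null {
  return locator ?? null; // non-empty locator = cache hit
}

// Hypothetical slow path: stand-in for an AI re-resolving by intent.
function resolveByIntent(intent: string): string {
  return `getByText('${intent}')`; // freshly resolved locator
}

function resolveStep(step: Step): { locator: string; hit: boolean } {
  const cached = tryCachedLocator(step.locator);
  if (cached) return { locator: cached, hit: true }; // cache hit: full speed
  const fresh = resolveByIntent(step.intent);        // cache miss: fall back to intent
  step.locator = fresh;                              // write back: update the cache
  return { locator: fresh, hit: false };
}
```

The first run of a step with a missing locator is a cache miss that writes the resolved locator back; every later run is a hit.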
In practice, this looks like:

```yaml
goal: Verify checkout completes
statements:
  - intent: Click the Submit button
    action: click
    locator: "getByRole('button', { name: 'Submit' })"
  - VERIFY: Order confirmation is displayed
```
The locator is the cache. The intent is the source of truth. When the locator works, you get Playwright speed. When it breaks, the AI resolves by intent and updates the cache.
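At run time this becomes an error-driven fallback: try the cached locator, and only on failure re-resolve from intent and retry. The sketch below simulates that loop; `clickByLocator`, `aiResolve`, and the `dom` set are hypothetical stand-ins, not a real browser or AI call.

```typescript
// Fake "DOM": the set of locators that currently resolve.
const dom = new Set(["getByRole('button', { name: 'Submit' })"]);

function clickByLocator(locator: string): void {
  if (!dom.has(locator)) throw new Error(`stale locator: ${locator}`);
}

// Stand-in for AI resolution by intent; returns the locator our fake DOM knows.
function aiResolve(intent: string): string {
  return "getByRole('button', { name: 'Submit' })";
}

// Returns true on a cache hit, false when the step had to heal.
function runStep(step: { intent: string; locator: string }): boolean {
  try {
    clickByLocator(step.locator);          // fast path: cached locator
    return true;                           // cache hit
  } catch {
    step.locator = aiResolve(step.intent); // heal: re-resolve by intent
    clickByLocator(step.locator);          // retry with the fresh locator
    return false;                          // cache miss, now healed
  }
}
```

A step whose cached locator went stale heals once, updates its cache, and hits on every run after that.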
## Why this matters for AI-generated code
If you're using AI coding agents (Cursor, Claude Code, Codex), your UI changes more frequently than ever. Features ship in minutes. Components get refactored aggressively.
Traditional Playwright scripts can't keep up. You end up spending more time fixing tests than fixing bugs.
The intent-cache-heal pattern fixes this by design: the test describes what should happen, not how to find the element. UI changes don't break unrelated tests.
## How we implement this at Shiplight
Full disclosure: I work on Shiplight AI. We built this pattern into our testing platform:
- AI coding agent connects via our plugin
- Agent opens a real browser, verifies the UI change
- Verification is saved as a YAML test in the repo
- Locators are cached for speed, intent is the fallback
We run on Playwright under the hood — so you get the same browser engine reliability. The YAML tests are reviewable in PRs, and the locator cache auto-updates when the UI changes.
But the mental model is useful regardless of what tool you use. Stop treating selectors as the test. Start treating them as a cache.
## Try it yourself
If you want to experiment with this approach:
- Shiplight Plugin — free, no account needed
- YAML Test Format spec
- Full write-up on the intent-cache-heal pattern
What's your team's approach to E2E test maintenance? I'd love to hear what's working (or not working) in the comments.