charlieww

How I Stopped Writing Fragile E2E Tests and Let AI Handle It

Last month I spent 4 hours debugging a Playwright test that broke because someone renamed a CSS class. Sound familiar?

I decided to try a different approach: what if the test framework could see the app like a human does, instead of relying on brittle selectors?

What I Built

An MCP (Model Context Protocol) server that gives AI agents — Claude, GPT, Cursor, Copilot — direct access to running applications. The AI can:

  • Launch and connect to apps via CDP
  • Tap elements, fill forms, scroll, navigate
  • Take screenshots and analyze UI snapshots
  • Run assertions in natural language
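Under the hood, each of those capabilities is exposed as an MCP tool, and the agent drives them via JSON-RPC `tools/call` requests. Here is a minimal sketch of what such a request looks like — the tool name `tap` and its `label` argument are illustrative, not necessarily the exact flutter-skill API:

```typescript
// Build a JSON-RPC 2.0 "tools/call" request, the shape MCP clients use
// to invoke a server-side tool. Tool name and arguments are illustrative.
function toolCall(id: number, name: string, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The agent decides what to do ("tap the Log in button") and emits:
const req = toolCall(1, "tap", { label: "Log in" });
```

The agent never sees selectors; it names a target semantically and the server resolves it against the live UI.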

The Key Trick: Semantic Snapshots

Instead of sending full screenshots (expensive in tokens), I built a snapshot system that extracts the UI's semantic structure — interactive elements, their positions, labels, states. The AI gets a complete picture of the UI in ~2ms and a few hundred tokens.

Compare that to a screenshot: ~100KB of base64, thousands of tokens, and the AI still has to "guess" where buttons are.
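To make that concrete, here is a rough sketch of what a semantic snapshot might contain and how an agent could resolve a natural-language target against it. The field names (`role`, `label`, `rect`) and the `findByLabel` helper are my illustration of the idea, not the actual flutter-skill format:

```typescript
// Hypothetical shape of one element in a semantic UI snapshot.
interface SnapshotElement {
  id: string;      // stable semantic id, not a CSS class
  role: string;    // "button", "textfield", ...
  label: string;   // accessible label the AI reasons about
  rect: { x: number; y: number; w: number; h: number };
  enabled: boolean;
}

// A few hundred tokens of structured data instead of ~100KB of pixels.
const snapshot: SnapshotElement[] = [
  { id: "email", role: "textfield", label: "Email",
    rect: { x: 40, y: 200, w: 360, h: 48 }, enabled: true },
  { id: "login-btn", role: "button", label: "Log in",
    rect: { x: 120, y: 400, w: 200, h: 48 }, enabled: true },
];

// Resolve a target by meaning, not by a brittle selector
// like ".btn-primary > span".
function findByLabel(els: SnapshotElement[], label: string) {
  return els.find((e) => e.label.toLowerCase() === label.toLowerCase());
}

console.log(findByLabel(snapshot, "log in")?.id); // → "login-btn"
```

Rename a CSS class and this still works; the accessible label is the contract, and labels change far less often than markup.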

Real Numbers

| Metric | Traditional | AI-Driven |
| --- | --- | --- |
| Tap latency | 50–200 ms | 1 ms |
| UI analysis | 500 ms–2 s (screenshot) | 2 ms (snapshot) |
| Test brittleness | High (selector-dependent) | Low (semantic) |
| Platforms supported | Usually 1–2 | 10 |

Supported Platforms

Flutter, React Native, iOS, Android, Web (Chrome/Firefox/Safari), Electron, Tauri, KMP, .NET MAUI.

253 MCP tools total. Video recording, API testing, mock responses, parallel multi-device — it's all there.

Getting Started

```bash
npx flutter-skill@latest
```

Add to your MCP config and your AI assistant can start testing immediately.
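For Claude Desktop-style clients, the config entry would look something like this (the `flutter-skill` server key is just a name you choose; the command mirrors the npx invocation above):

```json
{
  "mcpServers": {
    "flutter-skill": {
      "command": "npx",
      "args": ["flutter-skill@latest"]
    }
  }
}
```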

GitHub: ai-dashboad/flutter-skill
npm: flutter-skill

Would love to hear from anyone doing E2E testing — what's the most annoying part of your current setup?
