Last month I spent 4 hours debugging a Playwright test that broke because someone renamed a CSS class. Sound familiar?
I decided to try a different approach: what if the test framework could see the app like a human does, instead of relying on brittle selectors?
## What I Built
An MCP (Model Context Protocol) server that gives AI agents — Claude, GPT, Cursor, Copilot — direct access to running applications. The AI can:
- Launch and connect to apps via CDP
- Tap elements, fill forms, scroll, navigate
- Take screenshots and analyze UI snapshots
- Run assertions in natural language
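To make the list above concrete, here is a sketch of the tool calls an agent might issue during one test run. The tool names (`app_launch`, `ui_snapshot`, `ui_tap`, `assert`) and their arguments are illustrative placeholders, not the server's actual API:

```typescript
// Hypothetical MCP tool-call sequence for a login test.
// Names and argument shapes are assumptions for illustration only.
type ToolCall = { tool: string; args: Record<string, unknown> };

const session: ToolCall[] = [
  { tool: "app_launch", args: { target: "chrome", url: "http://localhost:3000" } },
  { tool: "ui_snapshot", args: {} },                      // get the semantic UI snapshot
  { tool: "ui_tap", args: { label: "Sign in" } },         // tap by accessible label, not CSS selector
  { tool: "assert", args: { claim: "a dashboard is visible" } }, // natural-language assertion
];

console.log(session.map(c => c.tool).join(" -> "));
```

The point is that the agent targets elements by semantic labels, so a renamed CSS class never breaks the flow.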
## The Key Trick: Semantic Snapshots
Instead of sending full screenshots (expensive in tokens), I built a snapshot system that extracts the UI's semantic structure — interactive elements, their positions, labels, states. The AI gets a complete picture of the UI in ~2ms and a few hundred tokens.
Compare that to a screenshot: ~100KB of base64, thousands of tokens, and the AI still has to "guess" where buttons are.
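A minimal sketch of what such a snapshot could look like, assuming a simplified node shape (the real format in the project may differ): each interactive element is reduced to a role, an accessible label, a bounding box, and optional state, then serialized one element per line so the model reads it directly.

```typescript
// Hypothetical semantic-snapshot shape: interactive elements only, no pixels.
interface SnapshotNode {
  role: string;                            // "button", "textfield", ...
  label: string;                           // accessible name
  rect: [number, number, number, number];  // x, y, width, height
  state?: string;                          // "enabled", "checked", ...
}

// Serialize to a compact line-per-element form -- a few hundred tokens
// instead of ~100KB of base64 screenshot.
function serializeSnapshot(nodes: SnapshotNode[]): string {
  return nodes
    .map(n => `${n.role} "${n.label}" @${n.rect.join(",")}${n.state ? ` [${n.state}]` : ""}`)
    .join("\n");
}

const snapshot: SnapshotNode[] = [
  { role: "textfield", label: "Email",    rect: [16, 120, 328, 48] },
  { role: "textfield", label: "Password", rect: [16, 180, 328, 48] },
  { role: "button",    label: "Sign in",  rect: [16, 248, 328, 48], state: "enabled" },
];

console.log(serializeSnapshot(snapshot));
```

Because positions and labels are explicit, the AI never has to guess where a button is; it taps the coordinates the snapshot reports.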
## Real Numbers
| Metric | Traditional | AI-Driven |
|---|---|---|
| Tap latency | 50-200ms | 1ms |
| UI analysis | 500ms-2s (screenshot) | 2ms (snapshot) |
| Test brittleness | High (selector-dependent) | Low (semantic) |
| Platforms supported | Usually 1-2 | 10 |
## Supported Platforms
Flutter, React Native, iOS, Android, Web (Chrome/Firefox/Safari), Electron, Tauri, KMP, .NET MAUI.
There are 253 MCP tools in total, covering video recording, API testing, mock responses, and parallel multi-device runs.
## Getting Started
```bash
npx flutter-skill@latest
```
Add to your MCP config and your AI assistant can start testing immediately.
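The exact entry depends on your MCP client; a typical Claude Desktop-style `mcpServers` config might look like the following (the `flutter-skill` key name here is my own choice, not a required value):

```json
{
  "mcpServers": {
    "flutter-skill": {
      "command": "npx",
      "args": ["flutter-skill@latest"]
    }
  }
}
```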
GitHub: ai-dashboad/flutter-skill
npm: flutter-skill
Would love to hear from anyone doing E2E testing — what's the most annoying part of your current setup?