MCP Apps are hard to test

#mcp #chatgpt #react #webdev

MCP Apps are hard to test.

They run inside ChatGPT and Claude, so every code change means deploying to a real host, burning AI credits, and waiting through non-deterministic LLM responses. If you're building for both hosts, double everything.

We built the sunpeak Inspector to fix this.

It replicates the ChatGPT and Claude MCP App runtimes on localhost. Your app renders exactly as it would inside the real hosts, with accurate display modes, themes, safe areas, and conversation chrome. One command to start:

sunpeak inspect --server URL

Works with any MCP server. Python, TypeScript, Go, whatever. No sunpeak project required.

For development: Switch between ChatGPT and Claude from the sidebar. Toggle light/dark themes, mobile/tablet/desktop widths, and display modes. Edit tool input and output live. Changes appear instantly with HMR.

For testing: The inspector doubles as the test runtime for Playwright E2E tests. Define tool states with simulation files (JSON fixtures), load them via URL, and assert against the rendered output. Test every host, theme, and display mode combination in CI/CD. No paid accounts, no API keys, no credits on your CI runners.
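As a rough sketch of what "every host, theme, and display mode combination" could look like in practice, the TypeScript below enumerates the matrix and builds one URL per combination for a given simulation fixture. The base URL, the query-parameter names (`simulation`, `host`, `theme`, `displayMode`), the display mode values, and the fixture name `show-weather` are all assumptions made for illustration, not the inspector's documented interface; check the sunpeak docs for the real URL format.

```typescript
// Sketch only: parameter names and values are hypothetical placeholders,
// not the inspector's documented URL format.
type Host = "chatgpt" | "claude";
type Theme = "light" | "dark";
type DisplayMode = "inline" | "fullscreen" | "pip"; // assumed set of modes

interface Scenario {
  host: Host;
  theme: Theme;
  displayMode: DisplayMode;
}

const HOSTS: Host[] = ["chatgpt", "claude"];
const THEMES: Theme[] = ["light", "dark"];
const DISPLAY_MODES: DisplayMode[] = ["inline", "fullscreen", "pip"];

// Cartesian product of every host, theme, and display mode.
function buildScenarios(): Scenario[] {
  const scenarios: Scenario[] = [];
  for (const host of HOSTS) {
    for (const theme of THEMES) {
      for (const displayMode of DISPLAY_MODES) {
        scenarios.push({ host, theme, displayMode });
      }
    }
  }
  return scenarios;
}

// Build the URL a test would open for one scenario and one simulation fixture.
function scenarioUrl(baseUrl: string, simulation: string, s: Scenario): string {
  const params: Array<[string, string]> = [
    ["simulation", simulation],
    ["host", s.host],
    ["theme", s.theme],
    ["displayMode", s.displayMode],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  // Drop trailing slashes so the path separator is not doubled.
  return `${baseUrl.replace(/\/+$/, "")}/?${query}`;
}
```

In a Playwright suite, you could loop over `buildScenarios()` and register one test per entry, opening `scenarioUrl(...)` with `page.goto` and asserting against the rendered output. Generating the matrix in code keeps the combinations in one place instead of hand-writing each case.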

For coding agents: Claude Code, Codex, and Cursor can run the inspector and execute Playwright tests programmatically, so they can iterate on MCP Apps without a human manually testing in a real host.

sunpeak is MIT licensed and open source.
