The Problem
Building a Chrome extension that modifies a third-party web app is a unique challenge. The DOM structure is opaque, class names are minified and change between deployments, and there's no official API to hook into. Traditional extension development looks like this:
- Inspect the DOM manually in DevTools
- Write selectors and content scripts
- Reload the extension
- Check if it works
- Repeat
This cycle is slow. I wanted an AI coding agent that could see the actual browser state and verify its own changes — not just generate code blindly.
That's how I arrived at this stack: WXT for the extension framework, Chrome DevTools MCP for giving the AI agent browser access, and Cursor as the IDE tying it all together.
The Stack
| Tool | Role |
|---|---|
| WXT | Chrome extension framework (TypeScript, hot reload, Manifest V3) |
| Chrome DevTools MCP | MCP server that exposes Chrome DevTools Protocol to AI agents |
| Cursor | AI-powered IDE with native MCP support |
Step 1: WXT with a Fixed CDP Port
WXT is a framework that wraps Chrome extension development with file-based routing, hot reload, and TypeScript support out of the box. The key insight is that WXT's runner can launch Chrome with custom Chromium args — including --remote-debugging-port.
// wxt.config.ts
import { defineConfig } from 'wxt';

export default defineConfig({
  manifest: {
    name: 'My Extension',
    version: '1.0',
    permissions: ['storage'],
  },
  extensionApi: 'chrome',
  runner: {
    chromiumArgs: [
      '--remote-debugging-port=9222',
      `--user-data-dir=${process.cwd()}/.chrome-debug-profile`,
      '--exclude-switches=enable-automation',
    ],
    startUrls: ['https://example.com'],
  },
});
Three things to note:
- --remote-debugging-port=9222: Exposes the Chrome DevTools Protocol on a fixed port. The MCP server connects here.
- --user-data-dir: A dedicated profile directory, separate from your daily Chrome. Login sessions persist across dev restarts. Add this to .gitignore; it contains cookies and session tokens that must not be pushed to a repository.
- --exclude-switches=enable-automation: Without this, some sites detect the "automated" browser and block sign-in.
When you run wxt (or npm run dev), WXT launches Chrome with these args, loads your extension, and watches for file changes — all in one command.
Step 2: Chrome DevTools MCP Configuration
MCP (Model Context Protocol) lets AI agents call external tools. Chrome DevTools MCP is an MCP server that wraps the Chrome DevTools Protocol — giving your AI agent the ability to navigate pages, evaluate JavaScript, take screenshots, and inspect the DOM.
Configuration lives in .cursor/mcp.json:
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": [
        "-y",
        "chrome-devtools-mcp@latest",
        "--browserUrl=http://127.0.0.1:9222"
      ]
    }
  }
}
That's it. When Cursor starts, it spins up the MCP server, which connects to Chrome on port 9222. The AI agent can now see and interact with your browser.
Step 3: The Dev Script (Optional but Recommended)
A small shell script wrapping the two main workflows keeps things ergonomic:
./dev.sh dev # WXT dev server + Chrome (hot reload + MCP)
./dev.sh start # Load built extension (no hot reload, MCP only)
./dev.sh stop # Kill debug Chrome
./dev.sh status # Check CDP connection
dev is the primary mode — WXT handles everything and you get hot reload. start is for testing production builds with MCP inspection. The script handles edge cases like port conflicts, PID management, and connection verification via curl http://localhost:9222/json/version.
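The connection check that the script does with curl can also be written in TypeScript, which is handy if you want the same check inside a Node script or a test. This is a minimal sketch, not the actual dev.sh logic: checkCdp, FetchLike, and CdpStatus are hypothetical names, and the fetch function is passed in as a parameter so the check can be exercised without a running browser. It assumes the /json/version response carries Browser and webSocketDebuggerUrl fields.

```typescript
// Minimal shape of the fetch function this sketch needs (hypothetical type).
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

interface CdpStatus {
  reachable: boolean;
  browser: string | null;
  webSocketDebuggerUrl: string | null;
}

const DOWN: CdpStatus = { reachable: false, browser: null, webSocketDebuggerUrl: null };

// Hypothetical helper: queries the same endpoint as the curl check.
async function checkCdp(fetchFn: FetchLike, port = 9222): Promise<CdpStatus> {
  try {
    const res = await fetchFn(`http://localhost:${port}/json/version`);
    if (!res.ok) return DOWN;
    const info = (await res.json()) as Record<string, unknown>;
    const browser = info["Browser"];
    const wsUrl = info["webSocketDebuggerUrl"];
    return {
      reachable: true,
      browser: typeof browser === "string" ? browser : null,
      webSocketDebuggerUrl: typeof wsUrl === "string" ? wsUrl : null,
    };
  } catch {
    // Connection refused or a malformed body: treat both as "not running".
    return DOWN;
  }
}
```

In Node 18+ the global fetch should be usable directly, for example: const status = await checkCdp(fetch).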
The Workflow in Practice
1. Start the environment
npm run dev # or ./dev.sh dev
Chrome opens automatically with the extension loaded. WXT watches for file changes. The MCP server connects to port 9222. Cursor's AI agent can now see the browser.
2. AI inspects current state
The AI agent evaluates JavaScript in the browser context to understand the current DOM:
// The agent runs this via MCP
evaluate_script({
  function: `() => ({
    injectedStyle: !!document.getElementById('my-extension-styles'),
    buttonCount: document.querySelectorAll('.my-custom-button').length,
    panelVisible: !!document.getElementById('my-panel'),
  })`
})
The agent can verify its own changes without you switching context. It writes code, WXT hot-reloads, and the agent checks if the DOM updated correctly.
3. AI verifies changes through the browser
Here's the key difference from normal AI-assisted coding. Instead of:
"I've added the panel. Please refresh and check if it works."
The AI does this:
"I've added the panel. Let me verify... [evaluates script via MCP] ... The #my-panel element exists, has 5 child entries, and is positioned correctly. Rendering looks good."
Text-based DOM verification is preferred over screenshots — it's faster, cheaper, and more precise:
// Good: structured verification
evaluate_script(() => {
  const buttons = document.querySelectorAll('.my-button');
  return {
    count: buttons.length,
    firstButton: buttons[0]?.outerHTML.substring(0, 200),
  };
});

// Screenshots only when you need visual layout confirmation
take_screenshot()
Tips and Gotchas
Content Script Isolation
Chrome extension content scripts run in an isolated world. Variables set on window in the content script are invisible to evaluate_script via MCP, because MCP evaluates in the page context.
The workaround: verify through DOM side effects, not global variables.
// Won't work: window globals are in the isolated world
evaluate_script(() => window.myExtensionState) // → undefined

// Works: check the DOM changes the extension made
evaluate_script(() => ({
  styleInjected: !!document.getElementById('my-extension-styles'),
  panelExists: !!document.getElementById('my-panel'),
}))
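If you need the agent to see more than "does this element exist", one option is to have the content script mirror a small piece of state onto a DOM attribute, since the DOM is shared between the isolated world and the page. The sketch below is an assumption on my part, not something WXT or Chrome DevTools MCP provides: publishState, readState, AttrHost, and the data-my-extension-state attribute are all hypothetical names.

```typescript
// Smallest element shape needed here; a real HTMLElement should satisfy it.
interface AttrHost {
  setAttribute(name: string, value: string): void;
  getAttribute(name: string): string | null;
}

const STATE_ATTR = "data-my-extension-state"; // hypothetical attribute name

// Content script side: serialize state onto a shared DOM node.
function publishState(host: AttrHost, state: Record<string, unknown>): void {
  host.setAttribute(STATE_ATTR, JSON.stringify(state));
}

// Page side (what an evaluate_script call could run): read it back.
function readState(host: AttrHost): Record<string, unknown> | null {
  const raw = host.getAttribute(STATE_ATTR);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as Record<string, unknown>;
  } catch {
    return null; // attribute was overwritten with something that is not JSON
  }
}
```

In the content script this would be called as publishState(document.documentElement, { panelOpen: true }). Keep in mind that the host page can read the attribute too, so it is not a place for anything sensitive.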
Selector Strategy for Third-Party UIs
When building extensions for sites you don't control, selectors break frequently. A fallback chain helps:
const el =
  document.querySelector('[aria-label*="Submit"]') ||
  document.querySelector('[data-test-id="submit"]') ||
  document.querySelector('.submit-btn');
Priority:
- ARIA attributes (aria-label, role): most stable across updates
- Semantic attributes (data-test-id): moderately stable
- Class names: last resort, always provide as fallback
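The fallback chain can be wrapped in a small helper that also reports which selector matched, so you notice when the extension has quietly dropped down to a lower-priority fallback. This is a sketch under my own assumptions: queryFirst, QueryRoot, and Match are hypothetical names, and the root is typed structurally so the helper can be tried without a browser.

```typescript
// Smallest shape needed; a real Document or Element is expected to satisfy it.
interface QueryRoot<T> {
  querySelector(selector: string): T | null;
}

interface Match<T> {
  element: T;
  selector: string; // which selector in the chain matched
  index: number; // its position in the chain (0 = most preferred)
}

// Hypothetical helper: tries selectors in priority order, first hit wins.
function queryFirst<T>(root: QueryRoot<T>, selectors: readonly string[]): Match<T> | null {
  let index = 0;
  for (const selector of selectors) {
    const element = root.querySelector(selector);
    if (element !== null) return { element, selector, index };
    index++;
  }
  return null; // nothing in the chain matched
}
```

A call such as queryFirst<Element>(document, ['[aria-label*="Submit"]', '[data-test-id="submit"]', '.submit-btn']) that returns index 2 is a hint that the more stable selectors have stopped matching and the chain needs attention.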
You can even build a DOM analyzer shortcut (Ctrl+Shift+D) that exports the page structure in a format the AI agent can consume. When selectors break, press the shortcut, paste the output into Cursor, and the agent updates the fallback selectors.
Async DOM Waiting
SPA elements appear asynchronously. Rather than fragile setTimeout chains, use polling with bounded retries:
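const MAX_RETRIES = 10;
let retries = 0;
const interval = setInterval(() => {
  const el = document.querySelector(selector);
  // Stop once the element is found or the retry budget is spent.
  if (el || ++retries >= MAX_RETRIES) {
    clearInterval(interval);
    if (el) callback(el); // on timeout, do nothing
  }
}, 500);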
If the element never appears, fail silently — no console spam.
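The same bounded polling can be wrapped in a Promise so it composes with async/await. This is a sketch of one way to do it, not the code from my extension: pollFor is a hypothetical name, and the probe is passed in as a function so the logic does not depend on document.

```typescript
// Hypothetical helper: polls `probe` until it returns a non-null value or
// `maxAttempts` checks have been made. Resolves to null on timeout, without
// throwing or logging.
function pollFor<T>(
  probe: () => T | null,
  maxAttempts = 10,
  intervalMs = 500,
): Promise<T | null> {
  return new Promise<T | null>((resolve) => {
    let attempts = 0;
    // Runs one check; returns true when polling should stop.
    const check = (): boolean => {
      attempts++;
      const found = probe();
      if (found !== null) {
        resolve(found);
        return true;
      }
      if (attempts >= maxAttempts) {
        resolve(null);
        return true;
      }
      return false;
    };
    if (check()) return; // first check runs immediately
    const timer = setInterval(() => {
      if (check()) clearInterval(timer);
    }, intervalMs);
  });
}
```

In a content script this would look like const el = await pollFor(() => document.querySelector(selector)), followed by an early return when el is null.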
Google Login Gotcha
When Chrome launches with --remote-debugging-port, Google sometimes detects it as an "unsafe browser" and blocks sign-in. The --exclude-switches=enable-automation flag helps, but if it's not enough:
- Launch Chrome with the dedicated profile (without WXT)
- Sign in manually
- Close Chrome
- Now run npm run dev. WXT reuses the same profile with the valid session
The dedicated --user-data-dir persists your login across dev sessions.
--user-data-dir and Security
The dedicated profile serves two purposes:
- Isolation from your daily Chrome: The dev browser doesn't touch your bookmarks, extensions, or sessions — and your personal credentials don't leak into the dev environment.
- Minimal credentials: Only log into what you need for development. Don't sign into personal Gmail or other unrelated accounts.
Keep in mind:
- Always add the profile directory to .gitignore. It contains cookies, session tokens, and LocalStorage.
  .chrome-debug-profile/
- CDP port 9222 is accessible from localhost. --remote-debugging-port binds to 127.0.0.1 by default, but any process on your machine can access all open tabs. Only run it during active development.
- Don't use this on shared machines. While CDP is open, anyone on the same machine can control the browser session.
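To make the localhost exposure concrete, here is roughly how little a local process needs in order to enumerate your open tabs while the port is up. Treat it as an illustrative sketch: listOpenPages, FetchLike, and PageInfo are hypothetical names, the fetch function is injected so the snippet can be tried without a browser, and it assumes /json/list returns an array of targets with type, title, and url fields.

```typescript
// Minimal shape of the fetch function this sketch needs (hypothetical type).
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

interface PageInfo {
  title: string;
  url: string;
}

// Hypothetical helper: lists regular tabs exposed by the CDP HTTP endpoint.
async function listOpenPages(fetchFn: FetchLike, port = 9222): Promise<PageInfo[]> {
  const res = await fetchFn(`http://127.0.0.1:${port}/json/list`);
  if (!res.ok) return [];
  const targets = await res.json();
  if (!Array.isArray(targets)) return [];
  const pages: PageInfo[] = [];
  for (const t of targets) {
    // Keep only regular tabs; skip service workers and other target types.
    if (t && t.type === "page" && typeof t.title === "string" && typeof t.url === "string") {
      pages.push({ title: t.title, url: t.url });
    }
  }
  return pages;
}
```

No authentication step appears anywhere in that function, which is the whole point: being on the same machine is enough.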
Getting Started
If you want to try this workflow:
1. Create a WXT project
npm create wxt@latest my-extension
cd my-extension
2. Add CDP port to wxt.config.ts
runner: {
  chromiumArgs: [
    '--remote-debugging-port=9222',
    `--user-data-dir=${process.cwd()}/.chrome-debug-profile`,
    '--exclude-switches=enable-automation',
  ],
},
3. Create .cursor/mcp.json
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest", "--browserUrl=http://127.0.0.1:9222"]
    }
  }
}
4. Run
npm run dev
Chrome launches with CDP enabled, and Cursor's agent connects to it. Your AI agent can now see your browser.
Why This Workflow Matters
The traditional Chrome extension development loop is write → reload → manually check → repeat. With WXT + Chrome DevTools MCP, it becomes write → auto-reload → AI verifies → iterate — and the AI agent can do the first and last steps too.
- Debugging goes from "read console logs, set breakpoints, manually reproduce" to "AI evaluates scripts in the live browser and reports what's happening."
- Selector maintenance goes from "open DevTools, inspect element, copy selector, paste into code" to "AI reads the DOM and updates fallback selectors."
- Feature development goes from "code blind, test manually" to "AI writes code, checks DOM state, fixes issues — all in one turn."
This doesn't replace understanding your own extension. But it dramatically shortens the feedback loop, especially for the tedious parts of third-party DOM manipulation.