אחיה כהן

Posted on Mar 28

Every MCP Browser Tool Uses Chromium. That's a Problem.

#mcp #safari #automation #discuss

The Model Context Protocol has a browser monoculture problem, and nobody's talking about it.

I just searched the MCP server registry. There are at least 13 browser automation servers listed. Every single one requires Chromium -- Chrome DevTools Protocol, Puppeteer, Playwright with Chromium, or some wrapper around them. If your AI agent needs to interact with a web page, your only option has been "launch Chrome."

This matters more than you might think. The issue isn't some abstract browser-diversity argument: in my experience, Chrome is the single biggest resource drain in an MCP setup, and for roughly 95% of the browser automation tasks I run day to day it isn't needed at all.

The Realization That Started This

I run an automation business. My daily workflow involves Claude Code connected to 6-7 MCP servers simultaneously. One day I noticed my M2 MacBook Pro fans spinning up during what should have been a simple task -- the AI agent was just reading a dashboard and filling a form.

I opened Activity Monitor. Chrome (spawned by the browser MCP server) was using 38% CPU. Just sitting there. With a debug port open. Waiting for commands.

Meanwhile, Safari -- the browser where I was already logged into everything I needed -- was using 0.3% CPU with 11 tabs open.

That's when it clicked: why am I running two browsers when the one I already have open does everything I need?

Introducing Safari MCP

Safari MCP is a native MCP server that controls Safari directly through AppleScript and JavaScript. No Chrome. No Puppeteer. No WebDriver. No headless browser process.

# Install from npm
npm install -g safari-mcp

# Or clone the repo
git clone https://github.com/achiya-automation/safari-mcp.git
cd safari-mcp
npm install

Enable Safari's JavaScript bridge (one-time setup):

Safari -> Settings -> Advanced -> Show features for web developers
Safari -> Develop -> Allow JavaScript from Apple Events

Add to your MCP client config (~/.mcp.json for Claude Code, .cursor/mcp.json for Cursor):

{
  "mcpServers": {
    "safari": {
      "command": "npx",
      "args": ["-y", "safari-mcp"]
    }
  }
}

That's it. No Chrome to install, no debug ports, no Playwright browser binaries to download.

The Dual-Engine Architecture

Most people look at this project and assume it's "just AppleScript." It's not. Safari MCP has a dual-engine architecture: two completely different execution paths reach the page, and for each command the server uses whichever one is available and best suited to it:

AI Agent (Claude Code / Cursor / Windsurf)
        |
        v  MCP Protocol (stdio)
   Safari MCP Server (Node.js)
        |                    |
   Safari Extension      AppleScript + Swift daemon
   (HTTP, ~5-20ms)       (~5ms, always available)
        |                    |
   Content Script        do JavaScript in tab N
        |                    |
        +---> Page DOM <-----+

Engine 1: AppleScript + Swift Daemon (always available)

This is the foundation. A persistent Swift helper process stays running and accepts AppleScript commands via stdin, returning results via stdout. Instead of spawning a new osascript process for every command (~80ms overhead each), the daemon keeps one process alive. Result: ~5ms per command.
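To make the persistent-process idea concrete, here is a minimal sketch of what the Node.js side of such a bridge could look like. It is not the project's actual code: the class name LineDaemonClient, the one-JSON-string-per-line wire format, and the ./safari-helper path are all assumptions made for illustration. The point it shows is that once the helper stays alive, each command costs one write and one read instead of a fresh process spawn.

```javascript
// Hypothetical sketch: match replies from a long-lived helper process to the
// commands that caused them. Assumes (not confirmed by the project) that the
// helper reads one command per line and writes exactly one reply line each.
class LineDaemonClient {
  constructor(write) {
    this.write = write; // function that sends a string to the helper's stdin
    this.pending = [];  // callbacks awaiting a reply, oldest first
    this.buffer = "";   // stdout data received so far without a closing "\n"
  }

  // Queue one command; `callback` later receives the matching reply line.
  send(command, callback) {
    this.pending.push(callback);
    // JSON.stringify escapes embedded newlines, so one command stays one line.
    this.write(JSON.stringify(command) + "\n");
  }

  // Feed raw stdout chunks; each complete line answers the oldest command.
  feed(chunk) {
    this.buffer += chunk;
    let newline;
    while ((newline = this.buffer.indexOf("\n")) !== -1) {
      const line = this.buffer.slice(0, newline);
      this.buffer = this.buffer.slice(newline + 1);
      const callback = this.pending.shift();
      if (callback) callback(line);
    }
  }
}

// Wiring it to a real child process might look like this
// (the helper binary path is hypothetical):
//   const { spawn } = require("node:child_process");
//   const child = spawn("./safari-helper");
//   const client = new LineDaemonClient((text) => child.stdin.write(text));
//   child.stdout.setEncoding("utf8");
//   child.stdout.on("data", (chunk) => client.feed(chunk));
```

Buffering partial lines matters here because stdout chunks do not have to line up with reply boundaries; a reply can arrive split across two data events.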

Engine 2: Safari Extension (optional, for advanced scenarios)

A native Safari Web Extension that communicates with the MCP server over HTTP (port 9224). This engine handles things AppleScript fundamentally cannot:

Closed Shadow DOM -- Reddit, Web Components, Shoelace UI. AppleScript's do JavaScript runs in the page context and can't pierce shadow boundaries. The extension's content script runs in an isolated world with access to element.shadowRoot.
Strict CSP sites -- A restrictive script-src Content Security Policy can block injected JavaScript. When needed, the extension can also run code in the page's MAIN world, where the page's CSP does not stop it.
Deep framework state -- React Fiber tree traversal, ProseMirror editor manipulation, Vue reactivity system hooks. The extension can access internal framework objects that AppleScript-injected JS often can't reach.

The server automatically uses the Extension when it's connected, falling back to AppleScript seamlessly. You don't configure anything -- it just works.
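The selection logic described above can be sketched in a few lines. This is an illustration under assumptions, not the server's real implementation: the engine objects, their isAvailable/run methods, and the choice to also fall back when the preferred engine throws are all hypothetical. It is written synchronously for brevity; real engine calls would be asynchronous.

```javascript
// Hypothetical sketch of "prefer the extension, fall back to AppleScript".
// Engines are tried in order; unavailable or failing ones are skipped.
function runCommand(engines, command) {
  const errors = [];
  for (const engine of engines) {
    if (!engine.isAvailable()) continue; // e.g. extension not connected
    try {
      return { engine: engine.name, result: engine.run(command) };
    } catch (error) {
      errors.push(`${engine.name}: ${error.message}`); // remember, try the next one
    }
  }
  const reason = errors.join("; ") || "no engine available";
  throw new Error(`Could not run "${command}" (${reason})`);
}

// Stand-in engines for demonstration only.
const extension = { name: "extension", isAvailable: () => false, run: (c) => `ext:${c}` };
const appleScript = { name: "applescript", isAvailable: () => true, run: (c) => `as:${c}` };

console.log(runCommand([extension, appleScript], "read_page"));
// prints { engine: 'applescript', result: 'as:read_page' }
```

Ordering the list by preference keeps the policy in one place: adding a third engine would only mean adding another entry to the array.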

80 Tools Across 20 Categories

Safari MCP isn't a proof of concept with 5 navigation tools. It ships 80 tools that cover everything from basic navigation to network mocking to accessibility auditing:

Navigation & Reading -- safari_navigate, safari_read_page, safari_navigate_and_read (combines both in one round-trip), safari_go_back, safari_go_forward, safari_reload

Interaction -- safari_click (CSS selector, visible text, or coordinates), safari_double_click, safari_right_click, safari_hover, safari_drag, safari_click_and_wait

Forms -- safari_fill (React/Vue/Angular compatible), safari_fill_form (batch), safari_fill_and_submit, safari_select_option, safari_type_text, safari_press_key, safari_clear_field

Screenshots & PDF -- safari_screenshot (viewport or full page), safari_screenshot_element, safari_save_pdf

Network -- safari_start_network_capture, safari_network_details (headers, timing, bodies), safari_mock_route (intercept fetch/XHR), safari_throttle_network (simulate 3G/4G/offline)

Storage -- Full cookie, localStorage, sessionStorage, and IndexedDB access. Plus safari_export_storage / safari_import_storage for backing up and restoring entire browser sessions as JSON.

Accessibility -- safari_accessibility_snapshot returns the full a11y tree with roles, ARIA attributes, and focusable elements.

Data Extraction -- safari_extract_tables (structured JSON), safari_extract_meta (OG, Twitter, JSON-LD), safari_extract_links, safari_extract_images

Advanced -- safari_css_coverage (find unused CSS), safari_analyze_page (full analysis in one call), safari_emulate (device emulation), safari_run_script (batch multiple actions)
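For readers who haven't looked at what an MCP client actually sends, here is a rough sketch of a tool invocation. The tools/call method and the name/arguments shape come from the MCP specification; the argument names selector and value are my assumption for illustration, so check the tool's published input schema before relying on them. A real client would also complete the initialize handshake before calling any tool.

```javascript
// Sketch: build the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
let nextId = 1;

function buildToolCall(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++,          // lets the client match the response to this request
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names below are hypothetical, not taken from the tool's schema.
const request = buildToolCall("safari_fill", {
  selector: "#email",
  value: "test@example.com",
});

// Over the stdio transport, each message is a single line of JSON.
const wire = JSON.stringify(request) + "\n";
process.stdout.write(wire);
```

In practice your MCP client (Claude Code, Cursor, and so on) builds these messages for you; seeing the raw shape is mostly useful when debugging a server.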

The React Form Problem Everyone Hits

If you've automated browser forms with any tool, you've probably hit this wall:

// This doesn't work in React
document.querySelector('#email').value = 'test@example.com';

React's synthetic event system doesn't see direct value assignments. The component state doesn't update. Validation doesn't run. The submit button stays disabled.

Safari MCP handles this correctly by default using native property setters:

// What safari_fill actually does under the hood
const nativeSetter = Object.getOwnPropertyDescriptor(
  window.HTMLInputElement.prototype, 'value'
).set;
nativeSetter.call(element, value);
element.dispatchEvent(new Event('input', { bubbles: true }));
element.dispatchEvent(new Event('change', { bubbles: true }));

This approach works with React, Vue, Angular, Svelte, Solid -- any framework that uses synthetic event listeners or intercepted property setters. When the Extension is active, it goes even deeper: it can reset React's internal _valueTracker to ensure the framework truly sees the new value.
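Here is a slightly fuller sketch of the same technique as a reusable function, including the tracker reset mentioned above. Treat it as an illustration rather than Safari MCP's source: the function name fillField and the win parameter are mine, and _valueTracker is an undocumented React internal that could change between versions.

```javascript
// Hypothetical helper: set a field's value so frameworks notice the change.
// `win` defaults to the global object; passing it in makes the function testable.
function fillField(element, value, win = globalThis) {
  // Textareas have their own prototype, so pick the matching native setter.
  const proto = element instanceof win.HTMLTextAreaElement
    ? win.HTMLTextAreaElement.prototype
    : win.HTMLInputElement.prototype;
  const nativeSetter = Object.getOwnPropertyDescriptor(proto, "value").set;

  const previous = element.value;
  nativeSetter.call(element, value);

  // React internal (undocumented): point the tracker at the old value so the
  // upcoming input event is treated as a real change.
  if (element._valueTracker) {
    element._valueTracker.setValue(previous);
  }

  element.dispatchEvent(new win.Event("input", { bubbles: true }));
  element.dispatchEvent(new win.Event("change", { bubbles: true }));
}

// In a page: fillField(document.querySelector("#email"), "test@example.com");
```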

You don't think about any of this. You just call safari_fill and it works.

Head-to-Head: Safari MCP vs The Alternatives

Here's an honest comparison. I use all three of these regularly and each has legitimate strengths:

Feature	Safari MCP	Chrome DevTools MCP	Playwright MCP
Engine	WebKit (native)	Chromium (CDP)	Chromium/WebKit/FF
CPU idle	~0.1%	~8-15% (observed, M2)	~3-8%
CPU active	~2-5%	~25-40% (observed, M2)	~10-25%
Memory	~30MB (Node only)	~200-400MB (Chrome+Node)	~150-300MB
Command latency	~5ms	~15-30ms	~20-50ms
Startup	<1s	3-5s	2-4s
Your logins/cookies	Yes (real Safari)	Yes (your Chrome)	No (clean profile)
Tool count	80	~30	~25
Network mocking	Yes	No	Yes
Lighthouse	No	Yes	No
Performance traces	No	Yes	No
Cross-platform	macOS only	Any OS	Any OS
Headless mode	No	Yes	Yes
Shadow DOM	Yes (with Extension)	Yes	Yes
Dependencies	Node.js only	Chrome + debug port	Playwright runtime

Where each tool wins:

Safari MCP: Daily workflow automation, anything involving your existing browser sessions, long-running agent tasks where CPU/battery matters, Mac-native development
Chrome DevTools MCP: Lighthouse audits, Performance traces, CPU profiling, memory snapshots -- the debugging tools Chrome pioneered
Playwright MCP: Cross-platform CI/CD, testing across multiple browser engines, headless server environments

The CPU/memory numbers above are what I observed on my M2 MacBook Pro running each server. Your numbers will vary, but the relative differences should hold -- Chrome simply has more overhead because it's running an entire separate browser process.
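If you want to check the numbers on your own machine rather than trust mine, a small script can total CPU and memory per browser. This is a sketch I'm adding for that purpose, not part of Safari MCP: the function name summarizeProcesses is mine, and it assumes ps output with three columns (resident memory in KB, CPU percent, command).

```javascript
// Sketch: total CPU and resident memory for processes whose command contains
// a keyword. Expects `ps` output with columns: RSS (KB), %CPU, command.
function summarizeProcesses(psOutput, keywords) {
  const totals = {};
  for (const keyword of keywords) {
    totals[keyword] = { cpuPercent: 0, memoryMB: 0, processes: 0 };
  }
  for (const line of psOutput.split("\n")) {
    // The command may contain spaces, so capture it as "everything else".
    const match = line.match(/^\s*(\d+)\s+([\d.,]+)\s+(.+)$/);
    if (!match) continue;
    const [, rssKB, cpu, command] = match;
    for (const keyword of keywords) {
      if (!command.toLowerCase().includes(keyword.toLowerCase())) continue;
      totals[keyword].cpuPercent += Number(cpu.replace(",", "."));
      totals[keyword].memoryMB += Number(rssKB) / 1024;
      totals[keyword].processes += 1;
    }
  }
  return totals;
}

// Live usage on macOS or Linux:
//   const { execFileSync } = require("node:child_process");
//   const out = execFileSync(
//     "ps", ["-ax", "-o", "rss=", "-o", "pcpu=", "-o", "comm="],
//     { encoding: "utf8" }
//   );
//   console.log(summarizeProcesses(out, ["Chrome", "Safari"]));
```

Keyword matching is deliberately loose so that helper and renderer processes count toward their parent browser; this is a snapshot, so run it a few times while the agent is idle and while it is working.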

Real-World Agent Workflow: Why Sessions Matter

Here's something the comparison table doesn't capture: authenticated sessions.

When I ask Claude to "check my Google Search Console rankings," Safari MCP just... does it. Because I'm already logged into GSC in Safari. The agent navigates to the page, reads the data, done.

With Playwright MCP, I'd need to either:

Store credentials and log in every time (security concern)
Export cookies and import them (fragile, expires)
Use a persistent browser context (more complexity)

With Chrome DevTools MCP, my Chrome also has my sessions, so this works too. But now I'm running two browsers -- Safari for my work, Chrome for the AI agent. By the numbers I measured above, that's up to 400MB of RAM and 8-15% CPU at idle for the privilege.

Safari MCP sidesteps all of this. One browser. One set of sessions. Zero extra overhead.

Setup for Every MCP Client

Claude Code (~/.mcp.json):

{
  "mcpServers": {
    "safari": {
      "command": "npx",
      "args": ["-y", "safari-mcp"]
    }
  }
}

Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "safari": {
      "command": "npx",
      "args": ["-y", "safari-mcp"]
    }
  }
}

Cursor (.cursor/mcp.json):

{
  "mcpServers": {
    "safari": {
      "command": "npx",
      "args": ["-y", "safari-mcp"]
    }
  }
}

All of these work identically. The server communicates over stdio using the standard MCP protocol.

The Honest Tradeoffs

Safari MCP is not a drop-in replacement for everything:

macOS only -- If you're on Linux or Windows, this doesn't exist for you. Safari is a macOS application.
No headless mode -- Safari is always "real." You can't run it on a CI server without a display.
No Lighthouse -- Performance auditing is Chrome's crown jewel. I still use Chrome DevTools MCP for that.
No CDP -- We use AppleScript, not Chrome DevTools Protocol. This is both a limitation and the reason it's so lightweight.
Extension requires Xcode -- The optional Safari Extension needs a one-time Xcode build. AppleScript mode works without it.

My actual daily setup: Safari MCP for 95% of tasks, Chrome DevTools MCP for the 5% that specifically needs Lighthouse or Performance traces. They coexist peacefully.

What's Next

The project is MIT-licensed, open source, and actively maintained. The codebase is deliberately small -- two main files (index.js for MCP tool definitions, safari.js for the AppleScript bridge) totaling about 5,000 lines.

Some things I'm working on:

Better tab management for multi-window workflows
Improved network capture with response body access
More device emulation presets

PRs are welcome. The architecture is simple enough that most contributors can get productive within an hour of reading the source.

GitHub: github.com/achiya-automation/safari-mcp
npm: npmjs.com/package/safari-mcp

The Question I Actually Want Answered

I built Safari MCP because Chrome's overhead drove me crazy. But I'm genuinely uncertain about something:

How many MCP servers are you running simultaneously in your daily workflow, and what's the total memory footprint?

I run 6-7 and it adds up fast. I'm curious whether other people have hit the same resource ceiling, or whether I'm unusually sensitive to it because I work on a laptop all day without external power.

If you're running Chrome DevTools MCP or Playwright MCP alongside 4+ other MCP servers, I'd love to know: what does your Activity Monitor (or Task Manager) look like right now? Screenshot it. I bet at least one process in there is Chrome eating more resources than all your other MCP servers combined.
