
אחיה כהן

I Replaced Chrome with Safari for AI Browser Automation. Here's What Broke (and What Finally Worked)

Or: why every browser-automation MCP uses Chromium, and why that's the wrong default on macOS.


The problem I kept hitting

Every browser automation MCP server I tried on my Mac — chrome-devtools-mcp, playwright-mcp, browsermcp, puppeteer-mcp — did the same thing: spin up a fresh Chromium instance with nothing in it. No logins, no cookies, no session state. Then my AI agent would spend the first 5 minutes of every task navigating Cloudflare, solving reCAPTCHA, or explaining to me that it couldn't log into Gmail.

Which is weird, because I was already logged into Gmail. In Safari. In the window right next to me.

The disconnect bothered me enough that I started reading Chromium-MCP source code. And what I found is that the entire ecosystem is built on an assumption that quietly doesn't hold for macOS users: "just spin up Chromium, it'll be fine."

It isn't fine.

What Chromium costs on Apple Silicon

Every Chromium process on M1/M2/M3 Macs pays a non-trivial tax:

  • Multiple helper processes per tab (GPU, renderer, network, storage)
  • A second rendering engine (Blink) loaded next to the WebKit that Safari already has running, duplicating what Safari gives you for free
  • RAM spike on tab open, and fans audibly spinning up
  • No access to the user's existing Safari extensions, iCloud Keychain, Apple Pay, or Apple Pay-linked banking sessions

When you have a laptop on your lap, you feel every one of these.

The headless-browser fallacy

The first thing people say is: "use headless mode, it's lighter." Sort of. Headless Chromium is still Chromium — you've just hidden the window. More importantly, headless mode is what gets you blocked. Cloudflare, reCAPTCHA v3, Akamai, DataDome — they all fingerprint headless browsers within seconds. Your agent's first action on 30% of the real web becomes "prove you're human."

A headful browser running on your actual machine, with your actual fingerprint, doesn't have this problem. But headful Chromium-MCP means now you have two browsers open — Safari (which you're using) and Chromium (which your agent is using). That's a fan-melting setup.

The alternative no one was building

What I wanted was obvious once I said it out loud:

Drive the Safari the user already has open. Inherit their logins, cookies, extensions, Apple Pay session. Use the WebKit process that's already running. Don't spin up a second browser.

What I found out when I tried to build it: macOS has made this weirdly hard, and I think that's why nobody had done it.

The three things that kept breaking

1. React's _valueTracker.
You can't just set input.value = "hello" and call dispatchEvent(new Event("input", { bubbles: true })). React keeps an internal _valueTracker on every controlled input and uses it to decide whether your "input" event reflects a real change. If the tracker thinks the value didn't change, React ignores you. Fixing this means going around the tracker: look up the native value setter on the input's prototype, call it as setter.call(input, value), and only then dispatch the event, so the tracker's remembered value no longer matches the DOM and React treats the change as real. It works, but it's the kind of code you don't write until you've spent an afternoon wondering why your form submission silently fails on every SPA.
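Here is a minimal sketch of that workaround in TypeScript. It assumes the commonly described behavior that React installs its tracking property on the element instance while the native setter stays on the prototype. The names findPrototypeValueSetter and setValueLikeAUser are hypothetical helpers for illustration, not part of React or Safari MCP.

```typescript
// The minimal shape we need; a real HTMLInputElement satisfies it.
interface ValueTarget {
  value: string;
  dispatchEvent(event: Event): boolean;
}

// Walk the prototype chain (never the instance itself) looking for a
// `value` setter. Skipping the instance is the point: that is where the
// tracking override is assumed to live.
function findPrototypeValueSetter(
  el: object
): ((v: string) => void) | undefined {
  let proto = Object.getPrototypeOf(el);
  while (proto) {
    const desc = Object.getOwnPropertyDescriptor(proto, "value");
    if (desc && desc.set) return desc.set;
    proto = Object.getPrototypeOf(proto);
  }
  return undefined;
}

// Set the value through the prototype setter, then fire a bubbling
// "input" event so framework listeners higher up the tree can see it.
function setValueLikeAUser(el: ValueTarget, value: string): void {
  const setter = findPrototypeValueSetter(el);
  if (setter) {
    setter.call(el, value);
  } else {
    el.value = value; // fallback: no prototype setter found
  }
  el.dispatchEvent(new Event("input", { bubbles: true }));
}
```

In a page you would call something like setValueLikeAUser(document.querySelector("input")!, "hello"). A textarea or select would need the same treatment against its own prototype, which the prototype walk above handles without special cases.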

2. Shadow DOM traversal.
Modern web components hide everything behind shadowRoot. document.querySelector stops at the shadow boundary. You need a recursive walker with a MutationObserver cache, because otherwise traversing a single YouTube page costs you 200ms. And if you get the cache invalidation wrong, clicks land on stale element refs.
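A minimal sketch of the recursive walker, without the MutationObserver cache described above. NodeLike and deepQueryAll are hypothetical names for illustration. Only open shadow roots are reachable this way, because a closed root reports shadowRoot as null.

```typescript
// The minimal shape we need; real Element and ShadowRoot objects satisfy it.
interface NodeLike {
  children: ArrayLike<NodeLike>;
  shadowRoot?: NodeLike | null;
  matches?(selector: string): boolean;
}

// Depth-first search that also descends into open shadow roots.
// Results are returned in traversal order: for each element, its shadow
// tree is visited before its light-DOM children.
function deepQueryAll(
  root: NodeLike,
  selector: string,
  out: NodeLike[] = []
): NodeLike[] {
  for (const child of Array.from(root.children)) {
    if (child.matches?.(selector)) out.push(child);
    if (child.shadowRoot) deepQueryAll(child.shadowRoot, selector, out);
    deepQueryAll(child, selector, out);
  }
  return out;
}
```

In a page, deepQueryAll(document.documentElement, "button") would collect buttons on both sides of every open shadow boundary. A cache on top of this has to be invalidated whenever a subtree changes, which is where the stale-reference problem comes from.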

3. CSP.
About 30% of high-value pages (Google Search Console, LinkedIn, Gmail's admin console, many banks) block eval() and new Function() via a strict Content Security Policy, so pure JavaScript injection fails silently. The workaround is a 4-strategy fallback chain: try regular JS → try document.evaluate → try AppleScript do JavaScript → try an injected content script via a Safari extension. Each one has its own failure modes, and you only find out which one applies by trial.
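The shape of such a fallback chain can be sketched generically. This is an illustration of the pattern only, not Safari MCP's actual code: Strategy and firstSuccessful are hypothetical names, and treating an undefined result as a silent failure is an assumption made for this sketch.

```typescript
// One attempt at running a script in the page. `run` may reject, or
// resolve to undefined when the attempt failed without reporting an error.
interface Strategy<T> {
  name: string;
  run: () => Promise<T | undefined>;
}

// Try each strategy in order and return the first real result, along
// with the name of the strategy that produced it. Later strategies are
// not started once one succeeds.
async function firstSuccessful<T>(
  strategies: Strategy<T>[]
): Promise<{ name: string; value: T }> {
  const failures: string[] = [];
  for (const strategy of strategies) {
    try {
      const value = await strategy.run();
      if (value !== undefined) return { name: strategy.name, value };
      failures.push(`${strategy.name}: no result`);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${strategy.name}: ${reason}`);
    }
  }
  throw new Error(`all strategies failed (${failures.join("; ")})`);
}
```

Returning the winning strategy's name alongside the value makes it possible to log which path a given site needed, so the "by trial" step can at least be remembered per domain.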

I ended up writing this out on HackerNoon last week, because the reverse-engineering took long enough that it felt worth sharing: the three hardest problems.

The unintentional side effects

After a couple of months of using Safari-backed MCP instead of Chrome-backed MCP, I noticed a few things I wasn't expecting:

  • My battery lasted measurably longer on coding-agent-heavy days. No surprise in retrospect — one browser instead of two.
  • My agent's success rate on "just book this for me" tasks went up. It was already logged into the calendar, the banking app, the booking portal.
  • I stopped having to re-authenticate everything every time I rebooted. Because the agent uses the browser I was already using.
  • Safari stays in the background. MCP calls run via AppleScript + a persistent Swift daemon. The window doesn't steal focus, so I can keep working while an agent finishes a long task.

Boring outcomes, maybe. But they compound over a workday.

Why this doesn't generalize

A caveat: this approach only makes sense on macOS. On Linux or Windows, Chromium is the right default — there's no equivalent "browser the user is already using" with the same automation surface. And you give up Chrome DevTools' performance traces and Lighthouse, which don't have Safari equivalents. I still keep Chrome DevTools MCP installed for those specific audits.

But "daily browsing tasks" — navigate, click, fill a form, extract some data, take a screenshot — those are 95% of what AI agents do with browsers. And for that 95%, on macOS, it's worth reconsidering the default.

If you want to try it

The project is called Safari MCP. It's MIT-licensed, one npx command to install, and works with Claude Code, Claude Desktop, Cursor, Windsurf, and VS Code:

npx safari-mcp

80 tools covering the full MCP surface — navigation, clicks, forms, screenshots, network mocking, cookies, accessibility snapshots, performance metrics. The README covers setup for each MCP client.

If you've been feeling the Chromium tax on Apple Silicon, maybe give this a try. And if it works for you, a star on GitHub helps other macOS developers find it.


Written after a few months of running Safari MCP as my primary browser automation tool on an M3 MacBook Air. Your mileage will vary — I'd love to hear what breaks for you.

Tags: mcp, claude, macos, webautomation, webdev
