You're typing a message in Slack. An AI agent is doing something in the background: reading a page, taking a screenshot, navigating a tab. Mid-sentence, Safari jumps to the foreground. Three of your keystrokes land in Safari's address bar instead of your message. You lose your train of thought. You Cmd-Tab back. A few seconds later, it happens again.
That's focus theft, and it's the single most enraging thing a "background" automation tool can do. The whole promise of Safari MCP is that your agent drives the real Safari you're already logged into — while you keep working in another app. If it yanks the foreground every few calls, that promise is broken.
Here's the part that stings: I already had a focus guard. Save the frontmost app before touching Safari, restore it after. I shipped that months ago. On macOS Tahoe it quietly stopped being enough — and when I finally sat down to fix it, it wasn't one bug. It was three independent race windows in the same pipeline, plus a fourth bug where my own code was fighting the user.
Why Tahoe is different
The save/restore guard was written against an assumption: Safari only comes forward when I tell it to. That was roughly true on older macOS. On Tahoe it isn't.
Safari now implicitly activates itself whenever an AppleScript mutates one of its windows — set URL, set bounds, set current tab. None of those say "come to the front." Tahoe brings Safari forward anyway. And screencapture -l<windowID> flashes the target window forward for the capture itself.
So every safari_navigate, every safari_snapshot, every screenshot had a brand-new way to steal focus that my guard was never designed to catch. The guard wasn't wrong. The ground had shifted underneath it.
Gap 1: the hot path had no guard at all
The first thing I found is the kind of bug that makes you put your head on the desk.
Safari MCP has two ways to run AppleScript. There's osascript — the subprocess wrapper, slow, and it did save and restore the frontmost app. And there's osascriptFast — a persistent Swift daemon, about 5ms per call, roughly 18× faster than spawning a subprocess. It's the hot path. safari_navigate uses it. safari_snapshot uses it. The tab-resolution layer, the profile-window detector, and dozens of internal helpers all funnel through it.
osascriptFast had no focus guard whatsoever.
So the slow, rarely-hit path was protected, and the fast path that runs on nearly every single tool call was wide open. The protection existed exactly where it wasn't needed and was missing exactly where it was. I added a mirror guard to osascriptFast, gated on a !_focusGuardActive flag so nested calls (a runJSLarge that internally calls back through the daemon) don't redundantly save-and-restore on top of an outer guard that already did it.
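The shape of that nested-call check is easier to see in code. This is a minimal sketch, not the actual Safari MCP source: only the _focusGuardActive flag name comes from the real code, while withFocusGuard and the injected deps.getFrontmostApp / deps.restoreFocus helpers are hypothetical stand-ins.

```javascript
// Sketch only: apart from _focusGuardActive, every name here is hypothetical.
let _focusGuardActive = false;

// deps.getFrontmostApp() and deps.restoreFocus(prev) are injected stand-ins
// for however the real server reads and restores the frontmost app.
async function withFocusGuard(deps, fn) {
  // Nested call: an outer guard already saved the frontmost app,
  // so run the work without a second save/restore.
  if (_focusGuardActive) return fn();

  _focusGuardActive = true;
  try {
    const prev = await deps.getFrontmostApp();
    try {
      return await fn();
    } finally {
      await deps.restoreFocus(prev);
    }
  } finally {
    // Always clear the flag, even if the save, the work, or the restore throws.
    _focusGuardActive = false;
  }
}
```

One caveat with a single module-level flag: it assumes guarded calls are effectively serialized. If two unrelated calls overlapped, the second would see the flag set and skip its own save and restore.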
Gap 2: the restores weren't actually waited for
With the hot path guarded, focus theft got better. Not gone. Still flickering every so often.
The save side was synchronous and fine. The restore side called _helperActivateApp(prev).catch(() => {}) — fire-and-forget — in three separate places: the osascript subprocess path, runJSLarge, and the screenshot's screencapture path.
Here's the trap. Re-activating an app is NSRunningApplication.activate() under the hood, and that is asynchronous at the OS level. Calling it doesn't mean focus is back. It means focus will come back, eventually, on the window server's schedule. The fire-and-forget .catch() returned control to my code while Safari was still frontmost. For the next 5–50ms there was a window where the user's keystrokes landed in Safari — and then the restore completed and the foreground snapped back, which is exactly what makes it feel like a flicker rather than a freeze.
All three sites now await restoreFocusIfStolen(prev). You cannot fire-and-forget a focus restore. The whole point is the timing.
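To make the race concrete, here is a small simulation rather than the real code. It assumes a fake desktop where activation takes effect 20ms after it is requested, and that the helper's promise resolves only once the switch has happened. makeFakeDesktop, fireAndForgetRestore, and awaitedRestore are hypothetical names for illustration.

```javascript
// Simulated desktop: an activation request takes effect ~20ms later.
function makeFakeDesktop() {
  const state = { frontmost: "com.apple.Safari" };
  return {
    state,
    // Hypothetical stand-in for _helperActivateApp: the promise resolves
    // once the simulated focus switch has actually happened.
    activateApp(bundleId) {
      return new Promise((resolve) => {
        setTimeout(() => {
          state.frontmost = bundleId;
          resolve();
        }, 20);
      });
    },
  };
}

// Old behavior: kick off the restore and return immediately.
async function fireAndForgetRestore(desktop, prev) {
  desktop.activateApp(prev).catch(() => {});
  return desktop.state.frontmost; // what the next keystroke would hit
}

// New behavior: errors are still swallowed, but the caller waits.
async function awaitedRestore(desktop, prev) {
  await desktop.activateApp(prev).catch(() => {});
  return desktop.state.frontmost;
}
```

In this simulation the fire-and-forget version reports Safari as still frontmost when it returns, while the awaited version reports the saved app. That gap is the window where keystrokes leak.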
Gap 3: Tahoe can silently refuse to activate
Even awaited, it still wasn't perfect — and this one took the longest to believe.
On Tahoe, the window-server policy can silently block NSRunningApplication.activate(). You call it, it doesn't throw, it doesn't return an error, and the app simply doesn't come forward. Your saved app stays stuck behind Safari and there's no signal that anything failed.
So restoring focus can't be a single call you trust. restoreFocusIfStolen now:
- Activates the saved bundle.
- Settles 5ms — Tahoe needs a beat to honor the activate.
- Re-reads the frontmost app to check whether it actually worked.
- Falls back to _helperHideSafari() only if Safari is still on top.
The fallback is the nice trick. Instead of fighting to push the right app forward, hide Safari — and the OS auto-picks the next app in the z-order, which is the one we saved. "Make the right app frontmost" is unreliable on Tahoe. "Make Safari not-frontmost" is reliable. Same outcome, opposite verb.
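The four steps above can be sketched roughly as follows. This is a sketch under assumptions, not the shipped implementation: the deps object (activateApp, getFrontmostApp, hideSafari) is a hypothetical injection point standing in for the real helpers, the string return values are purely illustrative, and the early return when the saved app is Safari itself is my assumption about how to avoid hiding Safari from a user who was already in it.

```javascript
const SAFARI_BUNDLE_ID = "com.apple.Safari";
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// deps is hypothetical: { activateApp(bundleId), getFrontmostApp(), hideSafari() },
// each returning a promise. Return values are illustrative labels only.
async function restoreFocusIfStolen(prev, deps, settleMs = 5) {
  // Assumption: if nothing was saved, or the user was already in Safari,
  // there is nothing to restore.
  if (!prev || prev === SAFARI_BUNDLE_ID) return "skipped";

  try {
    await deps.activateApp(prev); // 1. activate the saved bundle
  } catch {
    // An activation error is not fatal: the verification below decides what to do.
  }

  await sleep(settleMs); // 2. give the OS a beat to honor the activate

  const front = await deps.getFrontmostApp(); // 3. re-read, don't trust
  if (front !== SAFARI_BUNDLE_ID) return "activated";

  await deps.hideSafari(); // 4. fallback: make Safari not-frontmost
  return "hid-safari";
}
```

The design choice worth noticing is that step 3 checks for "Safari is still on top" rather than "the saved app is on top". The fallback only fires when the one outcome we care about avoiding has actually happened.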
The bonus bug: my own poll was fighting the user
Then I caused a new bug with the fix.
Every 3 seconds, the server polls tell application "Safari" to return name of window N to notice when the user switches Safari profile windows. The moment osascriptFast became focus-guarded, that read-only poll started running the full save → detect → restore dance every 3 seconds.
Picture the user deliberately clicking over to Safari to read something. Within 3 seconds my poll fires, sees Safari is now frontmost, decides "Safari stole focus," and runs the hide fallback — against the user, who wanted to be in Safari. My focus protection had become focus sabotage on a 3-second timer.
The poll is read-only. It reads a window name; it provably cannot activate Safari. So it now passes noFocusGuard: true and opts out of the guard entirely. The guard belongs on calls that mutate Safari, never on a passive read.
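A minimal sketch of that opt-out, assuming a per-call options object. Only the noFocusGuard option name comes from the real code; runScriptGuarded and the injected deps (runScript, getFrontmostApp, restoreFocus) are hypothetical and do not reflect osascriptFast's actual signature.

```javascript
// Hypothetical wrapper: only the noFocusGuard option name is from the post.
// deps = { runScript(script), getFrontmostApp(), restoreFocus(prev) }.
async function runScriptGuarded(script, deps, options = {}) {
  const { noFocusGuard = false } = options;

  // Passive reads opt out: no save, no detect, no restore.
  if (noFocusGuard) return deps.runScript(script);

  // Mutating calls get the full guard.
  const prev = await deps.getFrontmostApp();
  try {
    return await deps.runScript(script);
  } finally {
    await deps.restoreFocus(prev);
  }
}
```

With this shape, the 3-second profile poll would call something like runScriptGuarded(pollScript, deps, { noFocusGuard: true }), while a navigate or set-bounds call would omit the option and stay guarded. The default is deliberately "guarded", so a caller has to opt out explicitly.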
The lesson
I went in thinking "add a focus guard." I came out understanding that focus preservation on modern macOS isn't a boolean you bolt on — it's a pipeline, and every stage has its own race window:
- Coverage — the guard has to be on the path that actually runs (the fast one), not just the obvious one.
- Timing — async OS calls have to be awaited, or "restored" is just a hopeful word.
- Verification — on Tahoe you can't trust that activate worked; you re-read and fall back.
- Scope — a guard on a read-only poll doesn't protect the user, it ambushes them.
Four properties, four separate failure modes, one feature. Miss any one and the foreground still flickers — and the user can't tell you which of the four broke, only that "it keeps stealing focus." Which is exactly the bug report I'd been staring at.
The full writeup is in the v2.11.7 changelog. Safari MCP is MIT-licensed and on npm — npx safari-mcp. If you're building anything that drives a GUI app on the user's behalf while they keep working, I'd budget for all four of these, not just the first.