My safety guard protected 2 tools and trusted the other 20

#ai #javascript #security #opensource

I maintain an MCP server that lets a coding agent drive your real, logged-in Safari — the same browser where your bank, your email, and your half-written Slack messages live. The whole premise only works if there's one ironclad rule:

The agent may only touch tabs it opened. Never yours.

I wrote that guard early. It was a small function: before any page-mutating action, check that the target tab is one the agent owns. I dropped it into the wrapper that safari_click and safari_fill both flow through, watched the two of them refuse to act on an unowned tab, and moved on feeling responsible.

It took three separate audits to discover that the guard was, for most of its life, decorative.

Round 1: the guard was in a place almost nothing went through

The wrapper I'd guarded — call it extensionOrFallback — was the path for click and fill. It was not the path for the other twenty-ish tools that mutate a page.

safari_set_cookie. safari_delete_cookies. Every local-storage and session-storage writer. safari_import_storage. safari_drag. safari_upload_file. safari_paste_image. safari_select_option. safari_mock_route. safari_throttle_network. safari_override_geolocation. safari_handle_dialog. safari_resize.

Every one of them called the engine directly. None of them touched the wrapper. So the ownership check, the entire safety story of the project, applied to two tools and was silently absent from the rest. An agent that got confused about which tab was in front could write a cookie or import localStorage data into your logged-in session, and nothing would stop it.

The fix was conceptually trivial (extract the check into _assertTabOwnership() and call it first in all of them) and that triviality is exactly the point. A guard living in one convenient wrapper is not a guard. It's a guard-shaped object that happens to sit on the two code paths you tested. The real invariant is "every path that mutates a page calls the check," and a wrapper can only ever promise that for the paths that go through the wrapper.
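One way to make "every path that mutates a page calls the check" structural rather than a matter of discipline is to register mutating tools through a single helper that runs the check before the handler. The following is a minimal sketch of that idea, not the project's actual code: the `ownedTabs` set, the `tools` map, the `defineMutatingTool` helper, and the stand-in handlers are all hypothetical. Only the `_assertTabOwnership` name and the tool names come from the post.

```javascript
// Hypothetical sketch: the data structures and the registration helper are
// assumptions for illustration, not the project's real implementation.
const ownedTabs = new Set(); // ids of tabs the agent opened

function _assertTabOwnership(tabId) {
  if (!ownedTabs.has(tabId)) {
    throw new Error(`Refusing to act on tab ${tabId}: not opened by the agent`);
  }
}

const tools = new Map();

// Every page-mutating tool is registered through this helper, so the
// ownership check runs before the handler regardless of which engine
// path the handler uses internally.
function defineMutatingTool(name, handler) {
  tools.set(name, (args) => {
    _assertTabOwnership(args.tabId);
    return handler(args);
  });
}

// Stand-in handlers; real ones would call the engine.
defineMutatingTool("safari_set_cookie", ({ tabId, name, value }) => {
  return `set ${name}=${value} on tab ${tabId}`;
});
defineMutatingTool("safari_resize", ({ tabId, width, height }) => {
  return `resized tab ${tabId} to ${width}x${height}`;
});

ownedTabs.add(1); // the agent opened tab 1
console.log(tools.get("safari_set_cookie")({ tabId: 1, name: "a", value: "b" }));
```

With this shape, a new tool cannot be registered as mutating without passing through the check. Calling `_assertTabOwnership()` by hand at the top of each tool gives the same guarantee only if something (a test, a lint rule) verifies that no call site was missed.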

Round 2: the batch tool had an escape hatch

safari_run_script runs a batch of steps — navigate, click, fill, evaluate — in one call. It did check ownership… once, pre-flight, before the batch started.

A single pre-flight check on a batch that can move between tabs is not a check. Two holes:

evaluate was exempt. You could run arbitrary JavaScript in an unowned tab simply by wrapping it in a batch instead of calling it standalone.
A switchTab or navigate step mid-batch could walk onto your tab, and the next step — a click, say — would land there, because the only gate was at the door, not on each step.

Now ownership is enforced per step while the batch runs. A navigate step registers ownership of its destination exactly like the standalone tool does, and any refused step aborts the whole batch instead of letting the rest proceed on freshly-stolen ground.
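Per-step enforcement can be sketched roughly as follows. This is an illustration under stated assumptions, not the real `safari_run_script`: the step shapes, the `state` object, and URL-keyed ownership are all hypothetical, and the sketch assumes that `switchTab` only moves focus while every other step acts on the current tab.

```javascript
// Hypothetical sketch: step shapes and the state model are assumptions.
// state = { currentTab, tabUrls: Map<tabId, url>, ownedUrls: Set<url> }
function runBatch(steps, state) {
  const results = [];
  for (const step of steps) {
    if (step.type === "switchTab") {
      // Changing focus mutates nothing; whatever acts next is gated below.
      state.currentTab = step.tabId;
      results.push({ step: step.type, ok: true });
      continue;
    }

    // Gate every acting step, evaluate included, at the moment it runs.
    const url = state.tabUrls.get(state.currentTab);
    if (!state.ownedUrls.has(url)) {
      results.push({ step: step.type, ok: false, error: `not owned: ${url}` });
      return { aborted: true, results }; // a refused step ends the batch
    }

    if (step.type === "navigate") {
      // Register the destination so later steps in the batch stay owned.
      state.tabUrls.set(state.currentTab, step.url);
      state.ownedUrls.add(step.url);
    }
    results.push({ step: step.type, ok: true });
  }
  return { aborted: false, results };
}

const state = {
  currentTab: 1,
  tabUrls: new Map([
    [1, "https://example.com/a"],
    [2, "https://bank.example/account"],
  ]),
  ownedUrls: new Set(["https://example.com/a"]),
};

const outcome = runBatch(
  [
    { type: "click", selector: "#go" },
    { type: "switchTab", tabId: 2 },
    { type: "evaluate", code: "document.title" },
    { type: "click", selector: "#transfer" },
  ],
  state
);
console.log(outcome.aborted, outcome.results);
```

In this run the first click succeeds, the `switchTab` moves focus to the unowned tab, and the `evaluate` step is refused, so the final click never executes.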

The lesson rhymes with Round 1: I'd put the check at the start of the operation when the dangerous thing was the transition between steps. Guards belong on the state change, not on the entrance.

Round 3: the state itself was lying

The third audit was the unsettling one, because the guard was now called everywhere — and could still be wrong, because the data it consulted was rotten in three different ways.

/org owned /org-evil. Ownership was matched with a path-prefix test and no segment boundary. If the agent legitimately owned https://site.com/org, my matcher cheerfully concluded it also owned https://site.com/org-evil on the same origin. A string prefix is not a security boundary; I had quietly treated it as one. Matching now requires a real / boundary, and — because this is precisely the kind of rule that rots silently — I pulled the whole matching/TTL semantics out into a pure ownership-match.js module with a unit suite that locks the /org vs /org-evil case in place forever.
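For concreteness, a segment-aware matcher might look like the sketch below. The `ownsUrl` name and its exact rules are assumptions for illustration, not the contents of `ownership-match.js`. It uses only the standard `URL` class.

```javascript
// Hypothetical sketch of segment-aware ownership matching.
function ownsUrl(ownedUrl, candidateUrl) {
  let owned;
  let candidate;
  try {
    owned = new URL(ownedUrl);
    candidate = new URL(candidateUrl);
  } catch {
    return false; // unparseable input never matches
  }

  if (owned.origin !== candidate.origin) return false;

  // Strip trailing slashes so "/org" and "/org/" describe the same prefix.
  // Note: owning "/" yields an empty base, which matches the whole origin.
  const base = owned.pathname.replace(/\/+$/, "");
  const path = candidate.pathname;

  // Match the exact path, or a path that continues after a real "/" boundary.
  return path === base || path.startsWith(base + "/");
}

console.log(ownsUrl("https://site.com/org", "https://site.com/org/settings"));
console.log(ownsUrl("https://site.com/org", "https://site.com/org-evil"));
```

The first call matches because `/org/settings` continues past a `/` boundary; the second does not, because `/org-evil` merely shares a string prefix.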

The TTL leaked ownership across sessions. Every time the ownership file was saved, it rewrote each entry's timestamp to now. The 30-minute expiry could therefore never fire on a long-lived session — owned URLs accumulated indefinitely, potentially carrying a stale claim onto a tab that was now yours. Original timestamps are preserved now, and expiry is enforced on use, not just at load.
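The two halves of that fix, saving timestamps untouched and checking expiry at the moment of use, can be sketched like this. The function names and the `Map` of URL to claim time are hypothetical; only the 30-minute TTL comes from the post.

```javascript
// Hypothetical sketch: names and storage format are assumptions.
const TTL_MS = 30 * 60 * 1000; // the 30-minute expiry

// entries: Map<url, claimedAtMs>
function claim(entries, url, now) {
  entries.set(url, now);
}

// Saving writes each timestamp exactly as stored. It never resets it to
// "now", which is what kept entries alive indefinitely before the fix.
function save(entries) {
  return JSON.stringify([...entries]);
}

function load(json) {
  return new Map(JSON.parse(json));
}

// Expiry is checked on every use, not only when the file is loaded.
function isOwned(entries, url, now) {
  const claimedAt = entries.get(url);
  if (claimedAt === undefined) return false;
  if (now - claimedAt >= TTL_MS) {
    entries.delete(url); // drop the stale claim
    return false;
  }
  return true;
}

const MINUTE = 60 * 1000;
let entries = new Map();
claim(entries, "https://example.com/app", 0);
entries = load(save(entries)); // a save/load cycle keeps the original time
console.log(isOwned(entries, "https://example.com/app", 29 * MINUTE));
console.log(isOwned(entries, "https://example.com/app", 31 * MINUTE));
```

Passing `now` in as a parameter, instead of calling `Date.now()` inside, is what makes the expiry rule testable without waiting half an hour.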

A worker restart failed open. The browser extension is MV3, so its service worker is killed and restarted at the platform's whim. When it restarted, the in-memory map of owned tabs was wiped — and the code interpreted "no tabs owned yet" as the startup compatibility path, which is permissive by design. So a routine worker restart silently disabled the guard entirely until the next new_tab. The owned-tab state now survives restarts in storage.session. State that resets must fail closed. Mine had been failing open, in the one direction that mattered.
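A fail-closed store that survives a restart can be sketched as below. This is an illustration, not the extension's code: `createOwnershipStore` and the `ownedTabs` key are hypothetical, and `storage` stands for any object with promise-based `get`/`set` in the style of the WebExtensions storage areas. In a real MV3 worker that would presumably be `browser.storage.session` or `chrome.storage.session`; here an in-memory fake is used so the sketch runs under Node.

```javascript
// Hypothetical sketch: store shape and key name are assumptions.
function createOwnershipStore(storage) {
  let cache = null; // null means "not loaded yet", which differs from "empty"

  async function load() {
    if (cache === null) {
      const stored = await storage.get("ownedTabs");
      cache = new Set(stored.ownedTabs ?? []);
    }
    return cache;
  }

  return {
    async add(tabId) {
      const owned = await load();
      owned.add(tabId);
      await storage.set({ ownedTabs: [...owned] });
    },
    // Fail closed: an empty set owns nothing, and a storage error refuses.
    async isOwned(tabId) {
      try {
        return (await load()).has(tabId);
      } catch {
        return false;
      }
    },
  };
}

// In-memory stand-in that mimics get(key) -> { [key]: value } and set(items).
function fakeSessionStorage() {
  const data = {};
  return {
    async get(key) {
      return key in data ? { [key]: data[key] } : {};
    },
    async set(items) {
      Object.assign(data, items);
    },
  };
}

async function demo() {
  const storage = fakeSessionStorage();
  const beforeRestart = createOwnershipStore(storage);
  await beforeRestart.add(7);

  // A new store over the same storage simulates the worker being restarted:
  // the in-memory cache is gone, the persisted state is not.
  const afterRestart = createOwnershipStore(storage);
  return [await afterRestart.isOwned(7), await afterRestart.isOwned(8)];
}

demo().then((result) => console.log(result));
```

The important design choice is that "nothing loaded" and "nothing owned" both lead to a refusal. There is no branch where an empty map is read as permission.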

What I actually changed about how I work

The code fixes are in the changelog. The habits are the takeaway:

Guard the chokepoint, or guard every call site — never the convenient wrapper. If you can't put the check on a single line every path provably crosses, you have to put it on every path. There's no third option that's safe.
"Owns" is a security boundary; startsWith is not. Any time a string comparison stands in for an authorization decision, assume it's wrong at the boundary and write the test that proves it.
Decide what your state does when it resets. Caches get cleared, workers get killed, files get reloaded. The only question is whether that reset opens the gate or closes it. Pick on purpose.
Extract the security-relevant logic into something a test can pin. The matching rules lived inline, tangled with I/O, untestable — which is why the boundary bug survived. Pure module + unit suite turned an invisible invariant into a red build.

None of these are clever. They're the boring discipline that the exciting "give your AI agent a real browser" headline quietly depends on. The guard isn't the feature. The guard is what lets the feature be allowed to exist.

Safari MCP is open source (MIT) — a Safari automation server for AI coding agents on macOS: native WebKit, your real logins, no Chrome, no headless. The full three-pass hardening shipped in v2.13.0; the blow-by-blow is in the changelog. Repo: github.com/achiya-automation/safari-mcp. I build automation like this at achiya-automation.com.

Have you found one of these in your own code — a guard on the wrapper, a prefix standing in for a boundary, state that fails open on restart? I'd genuinely like to hear which of the three bit you.