My AI agent had a checkbox to tick. A multi-step form: tick two boxes, hit Next. It ticked the boxes. It hit Next. The form rejected it and snapped back to step one — every time.
The boxes were visibly checked. The DOM said checked = true. The agent had done everything right. The form still didn't believe it.
Over the next day I shipped four patch releases of safari-mcp — v2.10.6 through v2.10.9 — and still didn't completely win. Here's what each layer of the stack taught me about why a programmatic click isn't a click.
The tell: isTrusted
When software clicks, it calls element.click() or dispatches a MouseEvent. The handler runs — but the event carries isTrusted: false. That flag is the browser stating, on the record, that no human did this.
Most code never checks. But the modern stack has at least four layers that do, and each rejects a forged click in its own way. I met all four.
Layer 1 — the component library
The form used react-select. My tool opened the dropdown by clicking the chevron, then clicked the option. Fine — for the first few rows. Past row four, clicking the chevron did nothing. No menu. The element still had a live React fiber; its pointer handler had simply, silently, stopped responding.
So I stopped driving the UI. The v2.10.6 fix walks the React fiber up from the target node, finds the Select component, and calls its onChange directly — with the same { action: 'select-option', option, name } payload react-select dispatches internally. No menu, no chevron, no click.
Lesson: when a component's visible affordance gets flaky, its internal API usually isn't. Reach for the fiber.
Layer 2 — the framework
Next layer: Vue 3. My tool clicked the checkbox. The DOM .checked flipped to true. Vue's reactive v-model proxy did not.
So the box looked checked, but Vue's internal state still held the old value — and the next form submission read Vue's state, not the DOM. That was the snap-back.
The v2.10.7 fix is belt-and-suspenders: after the click, redispatch input and change with composed: true so they cross Shadow DOM and Vue Teleport portals, and reset React's _valueTracker for the shared React-checkbox case. Now the reactive layer hears the change — not just the DOM.
Lesson: flipping a DOM property is not the same as telling the framework you flipped it.
Layer 3 — the browser's own geometry
Safari MCP can also do real clicks — actual OS-level CGEvent mouse events at screen coordinates — for cases synthetic clicks can't reach.
To turn a page coordinate into a screen coordinate, you need the height of everything above the web content: title bar, toolbar, tab bar. I had hardcoded it at 74 px.
On modern Safari the chrome above the content is closer to 90 px. Every native click landed ~16 px high. Often that's still inside the target row, so it sort of worked — but for a button near whitespace, 16 px is a hit versus a silent miss.
The v2.10.8 fix: stop guessing. Compute outerHeight - innerHeight in JavaScript at click time, with a sanity range and a fallback. The browser already knows how tall its own chrome is. Ask it.
Lesson: never hardcode a number the platform will hand you for free.
Layer 4 — the operating system
Those OS-level clicks need macOS Accessibility permission. macOS stores that grant in its TCC database, keyed to the code-signing identifier of the binary asking.
My helper binary was ad-hoc signed with a hash-based identifier — a new string on every rebuild. So every npm install produced a binary macOS had never seen. The Accessibility grant from yesterday was bound to yesterday's identifier; today's binary inherited nothing.
The symptom was maddening: the helper reported success, no clicks reached the page, and System Settings showed the permission as granted — for the stale identifier.
The v2.10.9 fix: postinstall re-signs the helper with a stable identifier so the grant survives upgrades.
Lesson: if a permission keeps "randomly" resetting, check whether the thing requesting it has a stable identity.
Layer 5 — the one I haven't beaten
Four releases, four layers, one day. And then, on macOS 26, a click still didn't land.
With everything above fixed — right coordinates, stable permission, valid target — CGEvent.postToPid reports a successful injection and the page receives nothing. No isTrusted event at all. The private window-targeting fields the call needs are still present in the macOS 26 SDK; the event simply never crosses into Safari's sandboxed WebContent process.
I can't yet prove it's an OS change rather than something I'm still missing — so it's tracked in the open as issue #29, with the full repro and everything ruled out. If you've automated macOS UI and have a theory, that thread's the place.
"Just click the button"
A click looks atomic. It isn't. Before a real finger reaches a checkbox, an event has to satisfy a component library, a reactive framework, the browser's coordinate math, and the OS permission model — all in one motion — and on a new OS release, the OS can quietly change the rules underneath all of it.
element.click() skips the finger and asks four contracts to take its word for it. Some of them won't. If you're building automation for AI agents, budget for every layer — and keep your release numbers cheap. Some days you'll spend four.
safari-mcp is open source — native Safari automation for AI agents on macOS, no Chrome, no headless. github.com/achiya-automation/safari-mcp
Top comments (0)