אחיה כהן

Posted on Jun 25

My tool said "clicked." Safari never saw it. macOS 26 quietly broke a system API.

#macos #debugging #opensource #devtools

I maintain an open-source MCP server that lets AI coding agents drive real Safari on macOS. One of its tools sends a native mouse click, meaning an OS-level CGEvent rather than a JavaScript element.click(), because some forms (Vue/React with anti-bot checks, OAuth consent screens) reject any event whose isTrusted property is not true. For two years that tool worked.

Then a user on macOS 26 filed a bug, and it took me an embarrassingly long time to believe what I was reading:

The MCP returns Native clicked: BUTTON "Next" at screen (x, y). But the click listener on the page never fires. window.__clicks stays empty.

The tool said it clicked. The page swears nothing happened. Both were telling the truth.
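
The reporter's window.__clicks probe is not shown in the bug report, so what follows is only a guess at its shape: a listener that logs every click the page actually receives, together with its isTrusted flag. The name installClickProbe is hypothetical. Run under Node, which provides EventTarget and Event as globals, the sketch also shows that a script-dispatched event is delivered but untrusted, which is the reason the tool reaches for OS-level events on trust-gated forms.

```javascript
// Hypothetical page-side probe: records every click the target actually receives.
function installClickProbe(target, log = []) {
  target.addEventListener("click", (event) => {
    log.push({ isTrusted: event.isTrusted, at: Date.now() });
  });
  return log;
}

// In a browser this might be: window.__clicks = installClickProbe(document);
// Outside a browser, a plain EventTarget stands in for the document.
const probeTarget = new EventTarget();
const clicks = installClickProbe(probeTarget);

// A synthetic event created from script reaches the listener,
// but it is not marked as trusted.
probeTarget.dispatchEvent(new Event("click"));
console.log(clicks.length, clicks[0].isTrusted);
```

An empty log after the tool reports a click means the event never reached the page at all, which is a different failure from an untrusted event being rejected by the form.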

The most expensive kind of bug: the one that succeeds

Here's the failure mode that cost me a weekend. The API call returned success. No exception, no error code, no permission dialog. CGEvent.postToPid(safariPID) took my event, said "sure," and dropped it on the floor.

A bug that throws is a gift — it points at itself. A bug that silently succeeds sends you hunting everywhere except the actual cause. So I hunted.

Accessibility permission? Granted. Verified auth_value=2 in the TCC database for the exact helper binary.
Code-signing identity stable? Yes — signed with a fixed identifier so the grant survives reinstalls. (An earlier macOS bug had silently revoked it; I'd already fixed that.)
Coordinates wrong? No. document.elementFromPoint(x, y) returned the exact <button> I was aiming at, to within a pixel.
Did Apple remove the window-targeting event fields? No. kCGMouseEventWindowUnderMousePointer and its sibling kCGMouseEventWindowUnderMousePointerThatCanHandleThisEvent are still declared in the macOS 26.5 SDK headers.

Every single thing that's supposed to make a synthetic click land was correct. And the click still didn't land.
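
The coordinate check from the list above can be sketched as a small helper. This is an illustration, not the tool's actual code: describeHitTarget and the stub document are hypothetical, and the sketch assumes the screen coordinates have already been converted to viewport coordinates, since document.elementFromPoint takes viewport-relative CSS pixels.

```javascript
// Hypothetical helper: summarizes the element at a viewport point.
// `doc` is anything with elementFromPoint(x, y), e.g. `document` in a page.
function describeHitTarget(doc, x, y) {
  const el = doc.elementFromPoint(x, y);
  if (!el) return null; // nothing at that point (for example, outside the viewport)
  return {
    tag: el.tagName,
    text: String(el.textContent ?? "").trim().slice(0, 40),
  };
}

// Stub document so the sketch runs outside a browser:
// a single "Next" button occupying x 100..180, y 200..230.
const stubDoc = {
  elementFromPoint: (x, y) =>
    x >= 100 && x <= 180 && y >= 200 && y <= 230
      ? { tagName: "BUTTON", textContent: " Next " }
      : null,
};

const hit = describeHitTarget(stubDoc, 140, 215);
const miss = describeHitTarget(stubDoc, 5, 5);
console.log(hit, miss);
```

A matching element here only proves the aim was right. It says nothing about whether the event was delivered, which is why this check passed while the click still failed.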

What actually changed

macOS 26 (Tahoe) tightened the delivery semantics of CGEvent.postToPid for processes that render sandboxed WebKit content. The window-targeting fields are still accepted at the API surface, which is why there's no error, but the event never crosses the boundary into Safari's WebContent process. It's authorized, it's well-formed, and it goes nowhere.

This is the gap that breaks debugging: the API contract ("post this event to that PID") still holds, while the behavioral contract ("and the target will receive it") quietly does not. Your code is correct against the documentation. The documentation is correct about the API. Neither is correct about reality on this OS version.

And nothing in the stack tells you which macOS you're on, because for two years it never mattered.

The fix wasn't a permission. It was a fact.

My first instinct was wrong: keep chasing the grant. Try a different event tap. Re-sign the binary again. That's the trap — treating an OS behavior change as a misconfiguration you can fix with one more checkbox.

The real fix was to stop pretending the environment is uniform and surface the one fact that disambiguates the whole bug class: the macOS version itself.

My server has a doctor command — run it first when "clicks don't work even with permissions granted." It checked Safari, Apple Events, the helper daemon, Accessibility, Screen Recording, codesigning… and never printed the OS version. The single most relevant number for a "native input silently fails" report was missing.

So I added a small, pure function — no I/O, unit-tested directly — that classifies the version and flags the risky range:

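export function macosCompatNote(rawVersion) {
  // Trim once: sw_vers output ends with a newline that would otherwise leak into the message.
  const productVersion = String(rawVersion ?? "").trim();
  const major = parseInt(productVersion.split(".")[0], 10);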
  if (!Number.isFinite(major)) {
    return { risky: false, line: "macOS version: unknown" };
  }
  const risky = major >= 26;
  const line = risky
    ? `macOS ${productVersion} ⚠ CGEvent native clicks/keys may silently no-op on ` +
      `macOS 26+ even with Accessibility granted (issue #29) — for trust-gated ` +
      `forms prefer JS evaluation or extension-based clicks.`
    : `macOS ${productVersion} — CGEvent native input supported.`;
  return { version: productVersion, major, risky, line };
}

Then doctor calls it as best-effort — sw_vers is macOS-only and absent in CI sandboxes, so it's wrapped in a try/catch that can never block the rest of the diagnostics:

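// Assumed module-level setup (not shown in the original snippet):
import { execFile } from "node:child_process";
import { promisify } from "node:util";
const execFileAsync = promisify(execFile);
let osLine; // stays undefined when sw_vers is unavailable
try {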
  const { stdout } = await execFileAsync("sw_vers", ["-productVersion"], { timeout: 2000 });
  osLine = macosCompatNote(stdout).line;
} catch { /* sw_vers unavailable — skip the line, the other checks still stand */ }

That's the entire change. It doesn't fix the regression — I can't patch Apple's event delivery. What it does is convert a multi-hour phantom-permission hunt into a single line at the top of the diagnostic output: you're on a version where this API path is known to no-op; reach for the JavaScript or extension path instead.
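
As a rough sketch of what "a single line at the top of the diagnostic output" could look like, the formatter below puts the OS line first and simply omits it when sw_vers was unavailable. formatDoctorReport and the shape of the checks array are hypothetical; the real doctor command's output format is not shown in this post.

```javascript
// Hypothetical report formatter; check names mirror the ones listed in the post.
function formatDoctorReport(osLine, checks) {
  const lines = [];
  if (osLine) lines.push(osLine); // environment first; skipped if sw_vers failed
  for (const { name, ok } of checks) {
    lines.push(`${ok ? "ok  " : "FAIL"} ${name}`);
  }
  return lines.join("\n");
}

const report = formatDoctorReport(
  "macOS 26.0 ⚠ CGEvent native clicks/keys may silently no-op on macOS 26+",
  [
    { name: "Accessibility", ok: true },
    { name: "Screen Recording", ok: true },
  ],
);

// With no OS line (e.g. in a CI sandbox) the remaining checks still print.
const reportWithoutOs = formatDoctorReport(undefined, [
  { name: "Accessibility", ok: true },
]);
console.log(report);
```

Putting the version line ahead of the permission checks is a design choice: a reader who sees every check pass still reads the warning before concluding the setup is healthy.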

The lesson I keep relearning

Three things stuck:

A success that does nothing is worse than a failure that screams. When you wrap a platform API, the dangerous case isn't the one that errors — it's the one that returns OK and silently misbehaves. Assume your dependencies can lie politely.
Put the environment in the diagnostics. Every "works on my machine" bug is really "my machine differs from yours in a way neither of us is looking at." The cheapest fix is to make your tool print the difference. The OS version cost me a weekend precisely because nothing surfaced it.
Detect-and-warn beats assume-and-fail. I can't make postToPid work on Tahoe. I can make sure nobody else spends a weekend re-deriving why it doesn't.
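
The first lesson can be made concrete by comparing what the API acknowledged with what the page observed. The classifier below is a minimal sketch under assumed inputs: classifyClickOutcome is a hypothetical name, and clicksBefore/clicksAfter stand for any effect counter you can read from the page, such as the length of a click log.

```javascript
// Hypothetical classifier: compares the tool's acknowledgment with an observed effect.
function classifyClickOutcome({ acknowledged, clicksBefore, clicksAfter }) {
  if (!acknowledged) return "failed"; // the loud case: the API itself reported a problem
  if (clicksAfter > clicksBefore) return "landed"; // the page saw a new click
  return "silent-no-op"; // the API said OK, the page saw nothing
}

const outcomes = [
  classifyClickOutcome({ acknowledged: true, clicksBefore: 0, clicksAfter: 1 }),
  classifyClickOutcome({ acknowledged: true, clicksBefore: 0, clicksAfter: 0 }),
  classifyClickOutcome({ acknowledged: false, clicksBefore: 0, clicksAfter: 0 }),
];
console.log(outcomes.join(", "));
```

Only the middle case is the one this post is about, and it is invisible unless something reads the effect back from the page.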

The native click still doesn't land on macOS 26 — that's Apple's to change, and I'm tracking it. But now the very first thing the tool tells you is the truth about where you're standing. Sometimes the best you can ship isn't a fix. It's an honest map.

This is from Safari MCP, an open-source MCP server for native Safari automation on macOS (no Chrome, no WebDriver). The full macosCompatNote + doctor change is on main. I write about the unglamorous edges of browser automation and indie automation work at achiya-automation.com.

What's the worst "the API said success but did nothing" bug you've hit — and how long before you stopped blaming your own code?

Top comments (2)

Brittany Joiner • Jun 25

"A success that does nothing is worse than a failure that screams" is going on a sticky note. Love how honest this is. We're working the same untrusted-page problem from the security side (hidden-text prompt injection, PII leaking into the model) and it's the same lesson every time: the page lies politely and nothing in the stack surfaces it. Respect for shipping the honest map instead of pretending you fixed Apple!

אחיה כהן • Jun 28

"The page lies politely" is exactly it — and it's the same root on both our sides: the rendered surface and the thing your code (or your model) actually consumes are two different artifacts, and nobody wired a tripwire between them. Mine was a click the DOM cheerfully acknowledged but WebKit never dispatched; yours is text the human eye never sees but the model dutifully ingests. Same gap, opposite directions.

The fix that's holding for me is to stop trusting the acknowledgment and assert the effect — did the page actually change? I'd bet your equivalent is asserting on what the model received, not on what the page claimed to render. Would genuinely love to read how you're surfacing the hidden-text case — that's the screaming-failure version of a problem that loves to stay silent.