<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: אחיה כהן</title>
    <description>The latest articles on DEV Community by אחיה כהן (@achiya-automation).</description>
    <link>https://dev.to/achiya-automation</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3810102%2Fefb43e59-992c-4f8b-91df-ee602c7c853f.jpg</url>
      <title>DEV Community: אחיה כהן</title>
      <link>https://dev.to/achiya-automation</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/achiya-automation"/>
    <language>en</language>
    <item>
      <title>An AI agent overwrote two of my browser tabs. The fix took three releases.</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Thu, 07 May 2026 12:45:23 +0000</pubDate>
      <link>https://dev.to/achiya-automation/an-ai-agent-overwrote-two-of-my-browser-tabs-the-fix-took-three-releases-l2l</link>
      <guid>https://dev.to/achiya-automation/an-ai-agent-overwrote-two-of-my-browser-tabs-the-fix-took-three-releases-l2l</guid>
      <description>&lt;p&gt;I was eating dinner when my AI agent ate my tabs.&lt;/p&gt;

&lt;p&gt;I had Safari open with a Chatwoot Meta dashboard in one tab and an n8n executions view in another, both with unsaved state and both in the middle of real work. In a third tab, its own tab, my agent was supposed to be testing a new feature in the MCP server I maintain (&lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;Safari MCP&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;I came back to the laptop and both real-work tabs had been navigated to URLs the agent picked. The Chatwoot tab was now showing some test page. The n8n tab was on a Reddit comment thread the agent had been debugging an unrelated module against.&lt;/p&gt;

&lt;p&gt;The agent hadn't gone rogue. The MCP server had a state-tracking bug — and instead of failing loudly, it had silently fallen back to "use whatever tab the user is on."&lt;/p&gt;

&lt;p&gt;This is a postmortem. The fix took three releases — &lt;code&gt;v2.10.0&lt;/code&gt;, &lt;code&gt;v2.10.1&lt;/code&gt;, and &lt;code&gt;v2.10.3&lt;/code&gt; — and the iteration is the interesting part.&lt;/p&gt;




&lt;h2&gt;
  
  
  The shape of the bug
&lt;/h2&gt;

&lt;p&gt;Safari MCP exposes a &lt;code&gt;safari_new_tab(url)&lt;/code&gt; tool. Internally, it tracks "the tab MCP owns" via:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A tab index (&lt;code&gt;_activeTabIndex&lt;/code&gt;) — Safari's positional handle.&lt;/li&gt;
&lt;li&gt;A DOM marker (&lt;code&gt;window.__mcpTabMarker&lt;/code&gt;) — injected JS that lets future calls verify "yes, this is still our tab."&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every subsequent &lt;code&gt;safari_navigate&lt;/code&gt;, &lt;code&gt;safari_click&lt;/code&gt;, &lt;code&gt;safari_fill&lt;/code&gt; etc. resolves "where to act" by:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if marker still in current tab → use it
else if _activeTabIndex still valid → switch to it, re-verify
else → fall back to "front document of front window"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
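
&lt;p&gt;The resolution order above can be sketched as a pure function, which makes the dangerous branch easy to see in a test. This is a minimal illustration, not Safari MCP's actual code: &lt;code&gt;resolveTarget&lt;/code&gt; and the shape of its &lt;code&gt;state&lt;/code&gt; argument are hypothetical names introduced here, and the "re-verify" step is left out for brevity.&lt;/p&gt;

```javascript
// Hypothetical sketch of the three-step resolution order described above.
// `state` is an assumed shape, not Safari MCP's real internals:
//   markerInCurrentTab: boolean  (is window.__mcpTabMarker present?)
//   activeTabIndex:     number or null
//   tabCount:           number of tabs Safari currently reports
function resolveTarget(state) {
  // Branch 1: the marker is still in the current tab, so it is ours.
  if (state.markerInCurrentTab) {
    return { kind: "owned-tab", via: "marker" };
  }

  // Branch 2: the stored index still points at an existing tab.
  // Assumes 1-based tab indexes, as AppleScript uses for Safari.
  const idx = state.activeTabIndex;
  const indexValid = [
    Number.isInteger(idx),
    idx >= 1,
    state.tabCount >= idx,
  ].every(Boolean);
  if (indexValid) {
    return { kind: "owned-tab", via: "index", index: idx };
  }

  // Branch 3: nothing ties this target to MCP. It is whatever the user sees.
  return { kind: "front-document", via: "fallback" };
}

console.log(resolveTarget({ markerInCurrentTab: false, activeTabIndex: null, tabCount: 3 }).kind);
```

&lt;p&gt;Any state where both the marker and the index are lost lands in the third branch, which is why the rest of this post is about making that branch refuse to run.&lt;/p&gt;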



&lt;p&gt;That last branch is the catastrophe. &lt;em&gt;"Front document of front window"&lt;/em&gt; is, by definition, &lt;strong&gt;whatever the user is looking at right now&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So why did the fallback fire? Three different reasons, across three releases.&lt;/p&gt;




&lt;h2&gt;
  
  
  v2.10.0 — the original failure mode
&lt;/h2&gt;

&lt;p&gt;The original &lt;code&gt;safari_new_tab(url)&lt;/code&gt; did exactly what its name said: open a new tab and navigate it to &lt;code&gt;url&lt;/code&gt; in one call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;newTab&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;openBlankTab&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;     &lt;span class="c1"&gt;// creates blank tab&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;navigate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;              &lt;span class="c1"&gt;// navigates immediately&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;injectMarker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;               &lt;span class="c1"&gt;// marker for future calls&lt;/span&gt;
  &lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Spot the bug? It's about what happens when &lt;code&gt;navigate(idx, url)&lt;/code&gt; &lt;em&gt;fails to load&lt;/em&gt; — &lt;code&gt;file://&lt;/code&gt; blocked by Safari, network error, an &lt;code&gt;app://&lt;/code&gt; scheme that Safari doesn't understand. The new tab stays at &lt;code&gt;about:blank&lt;/code&gt;. The marker injection runs, but then the next user-driven navigation in any tab can wipe it. By the time the next &lt;code&gt;safari_navigate&lt;/code&gt; arrives, our marker check fails. Our &lt;code&gt;_activeTabIndex&lt;/code&gt; still points at a tab, but Safari's real DOM in that tab has been replaced.&lt;/p&gt;

&lt;p&gt;The "front document" fallback fires. We navigate the user's current tab.&lt;/p&gt;

&lt;p&gt;I shipped this. I tested it on a clean Safari with one window. I never hit the bug because in clean state, the user's tab &lt;em&gt;is&lt;/em&gt; my tab.&lt;/p&gt;




&lt;h2&gt;
  
  
  v2.10.1 — the grace window (almost-fix)
&lt;/h2&gt;

&lt;p&gt;The first fix was a &lt;code&gt;NEW_TAB_GRACE_MS = 30_000&lt;/code&gt; window. For 30 seconds after &lt;code&gt;safari_new_tab&lt;/code&gt;, ANY mutating operation that &lt;em&gt;would&lt;/em&gt; fall back to the user's tab now throws a clear error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;_lastNewTabAt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;NEW_TAB_GRACE_MS&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;markerOk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tab tracking lost shortly after new_tab — call safari_new_tab again instead of letting MCP touch your active tab&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Plus a fix for the marker wipe — &lt;code&gt;safari_navigate&lt;/code&gt; now re-injects &lt;code&gt;window.__mcpTabMarker&lt;/code&gt; after every successful navigation, so JS-context resets don't lose tracking.&lt;/p&gt;

&lt;p&gt;This passed all my tests. It also worked correctly for ~95% of real sessions.&lt;/p&gt;

&lt;p&gt;The 5% it missed: &lt;strong&gt;sessions longer than 30 seconds where the tab-ghost recovery path nullified &lt;code&gt;_activeTabIndex&lt;/code&gt; mid-session.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  v2.10.3 — the permanent guard
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;runJS&lt;/code&gt; (the workhorse for every JS-driven tool) has a tab-ghost recovery path. If a JavaScript &lt;code&gt;evaluate&lt;/code&gt; fails because the tab Safari thinks is at index N has been closed/replaced, &lt;code&gt;runJS&lt;/code&gt; nullifies &lt;code&gt;_activeTabIndex&lt;/code&gt; so the next call resolves cleanly.&lt;/p&gt;

&lt;p&gt;The intent: avoid using a stale index after Safari shuffles tabs.&lt;/p&gt;

&lt;p&gt;The unintended consequence: 30+ minutes into a session, after a routine ghost-recovery, &lt;code&gt;_activeTabIndex&lt;/code&gt; is &lt;code&gt;null&lt;/code&gt;. The grace window from v2.10.1 has long expired. The marker check fails (the agent has navigated several times since). Fallback fires. User's current tab gets clobbered.&lt;/p&gt;

&lt;p&gt;The bug pattern: &lt;strong&gt;a "safe" recovery path created the exact failure mode the grace window was designed to prevent.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The permanent fix is a one-line change in spirit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;_hasOwnedTab&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// session-scoped, sticky&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;newTab&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ... existing logic ...&lt;/span&gt;
  &lt;span class="nx"&gt;_hasOwnedTab&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;// ← set once, never reset&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;_assertNotFallingBackToUserTab&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_hasOwnedTab&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MCP previously owned a tab in this session, but tracking was lost. &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Refusing to fall back to the user's current tab. Call safari_new_tab to re-establish.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;// sessions that never called new_tab can still use front-document fallback&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The flag is set the first time &lt;code&gt;safari_new_tab&lt;/code&gt; succeeds, and &lt;strong&gt;it never resets for the lifetime of the MCP process.&lt;/strong&gt; The four entry points that can target a tab (&lt;code&gt;navigate&lt;/code&gt;, &lt;code&gt;navigateAndRead&lt;/code&gt;, &lt;code&gt;runJS&lt;/code&gt;'s tab-ghost fallback path, and &lt;code&gt;runJSLarge&lt;/code&gt;) all call &lt;code&gt;_assertNotFallingBackToUserTab&lt;/code&gt; before falling back to the user's current tab.&lt;/p&gt;

&lt;p&gt;If the assertion throws, the agent gets a clear error pointing back at &lt;code&gt;safari_new_tab&lt;/code&gt;. The user's tab is untouched.&lt;/p&gt;

&lt;p&gt;Sessions that &lt;em&gt;never&lt;/em&gt; call &lt;code&gt;safari_new_tab&lt;/code&gt; (e.g. tools that explicitly read the user's current tab) are unaffected — &lt;code&gt;_hasOwnedTab&lt;/code&gt; stays false, and the front-document fallback still works for them.&lt;/p&gt;
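
&lt;p&gt;To make the difference between the v2.10.1 and v2.10.3 guards concrete, here is a small simulation that treats each guard as a predicate over session state. It is a sketch under assumptions: only &lt;code&gt;NEW_TAB_GRACE_MS&lt;/code&gt; and the idea of a sticky owned-tab flag come from the releases described above; the &lt;code&gt;session&lt;/code&gt; object and both function names are hypothetical, and "tracking lost" is taken as given.&lt;/p&gt;

```javascript
// Hypothetical simulation of the two guards. Both answer one question:
// "tracking is lost; should the front-document fallback be blocked?"
const NEW_TAB_GRACE_MS = 30_000;

// v2.10.1 style: blocks the fallback only shortly after safari_new_tab.
function graceWindowBlocks(session, now) {
  if (session.lastNewTabAt === null) return false;
  return NEW_TAB_GRACE_MS > now - session.lastNewTabAt;
}

// v2.10.3 style: blocks the fallback for the rest of the session.
function stickyFlagBlocks(session) {
  return session.hasOwnedTab;
}

// A session that opened its own tab at t=0 and lost tracking later.
const session = { lastNewTabAt: 0, hasOwnedTab: true };
const thirtyMinutes = 30 * 60 * 1000;

console.log(graceWindowBlocks(session, 5_000));         // true: inside the window
console.log(graceWindowBlocks(session, thirtyMinutes)); // false: fallback would fire
console.log(stickyFlagBlocks(session));                 // true at any time
```

&lt;p&gt;The second line is the v2.10.1 gap: the time-bounded guard stops protecting the user exactly when a long session needs it.&lt;/p&gt;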




&lt;h2&gt;
  
  
  What I'd take away if I were writing my own MCP server
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. The fallback you don't notice is the fallback that bites.&lt;/strong&gt;&lt;br&gt;
"Use the user's current tab" looks like a reasonable degraded mode in isolation. In context — an autonomous agent acting on the user's real, logged-in browser — it's the worst possible default. The fix wasn't "make the fallback work better." It was "the fallback should not exist in this branch of the state machine."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. State-tracking bugs aren't subtle. They're catastrophic.&lt;/strong&gt;&lt;br&gt;
A misidentified tab is a misidentified action. The class of bug here — &lt;em&gt;I think I'm acting on X but I'm actually acting on Y&lt;/em&gt; — is the same class as a deployment script targeting prod instead of staging, or a Git rebase rewriting the wrong branch. There's no "minor version" of this bug. Engineering effort should be priced accordingly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. "Sticky" flags beat "windowed" flags for invariants you actually need.&lt;/strong&gt;&lt;br&gt;
The v2.10.1 grace window was time-bounded. That made sense for the failure mode I'd seen. But sessions are unbounded. &lt;em&gt;Anything that can happen during the session can happen after the grace window expires.&lt;/em&gt; If the property "MCP has owned a tab in this session" is the actual thing protecting the user, that property must hold for the whole session — not 30 seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Tests on clean state miss the bugs that matter.&lt;/strong&gt;&lt;br&gt;
I tested v2.10.0 on a fresh Safari with no other tabs. The user-tab-clobber bug is &lt;em&gt;invisible&lt;/em&gt; in that environment, because the user's tab and MCP's tab are the same tab. Real users have eight tabs open and were just clicking around in tab six. If your tool drives a user's real browser, your test environment must have &lt;em&gt;unrelated, in-progress tabs&lt;/em&gt; — and your failure modes must be loud when you brush against them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Errors are a feature.&lt;/strong&gt;&lt;br&gt;
The replacement for "silent fallback to user tab" is a thrown error with a remediation message: &lt;em&gt;"Call &lt;code&gt;safari_new_tab&lt;/code&gt; to re-establish."&lt;/em&gt; That error is &lt;em&gt;better than the original happy path&lt;/em&gt; — because the original happy path was sometimes a disaster. A loud, fixable error is always better than a quiet, irreversible mistake.&lt;/p&gt;




&lt;h2&gt;
  
  
  The diff that mattered
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt;&lt;span class="gi"&gt;+ let _hasOwnedTab = false;
&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;  async function newTab(url) {
    const idx = await openBlankTab();
    await navigate(idx, url);
    await injectMarker(idx);
    _activeTabIndex = idx;
&lt;span class="gi"&gt;+   _hasOwnedTab = true;
&lt;/span&gt;  }
&lt;span class="err"&gt;
&lt;/span&gt;  function getFallbackTarget() {
&lt;span class="gi"&gt;+   if (_hasOwnedTab) {
+     throw new Error("…re-establish via safari_new_tab");
+   }
&lt;/span&gt;    return frontDocumentOfFrontWindow();
  }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the load-bearing part. Everything else in v2.10.3 is plumbing.&lt;/p&gt;

&lt;p&gt;If you're building an MCP server (or any tool that drives a user's real browser/editor/database), the question I'd ask in code review is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"What's the worst thing this fallback can do, and does the fallback's existence buy enough to be worth that worst case?"&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For tab fallback in Safari MCP, the answer was: &lt;strong&gt;no, it doesn't.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;github.com/achiya-automation/safari-mcp&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Install:&lt;/strong&gt; &lt;code&gt;npx safari-mcp&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Releases discussed:&lt;/strong&gt; &lt;a href="https://github.com/achiya-automation/safari-mcp/releases/tag/v2.10.0" rel="noopener noreferrer"&gt;v2.10.0&lt;/a&gt;, &lt;a href="https://github.com/achiya-automation/safari-mcp/releases/tag/v2.10.1" rel="noopener noreferrer"&gt;v2.10.1&lt;/a&gt;, &lt;a href="https://github.com/achiya-automation/safari-mcp/releases/tag/v2.10.3" rel="noopener noreferrer"&gt;v2.10.3&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have you shipped a state-tracking bug that ate user data? What was the failure mode, and what flag/invariant ended up being the real fix?&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>debugging</category>
      <category>javascript</category>
    </item>
    <item>
      <title>LinkedIn Quietly Migrated From ProseMirror to Quill — and Broke Every Browser Automation Tool That Touched the Composer</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Sun, 03 May 2026 07:34:07 +0000</pubDate>
      <link>https://dev.to/achiya-automation/linkedin-quietly-migrated-from-prosemirror-to-quill-and-broke-every-browser-automation-tool-that-4927</link>
      <guid>https://dev.to/achiya-automation/linkedin-quietly-migrated-from-prosemirror-to-quill-and-broke-every-browser-automation-tool-that-4927</guid>
      <description>&lt;p&gt;I shipped a fix to my MCP server last week for LinkedIn's ProseMirror composer. It worked. Two days later, every LinkedIn post automation broke.&lt;/p&gt;

&lt;p&gt;This is the post-mortem of what changed, how I figured it out, and why "automate the platform" stories almost always end this way.&lt;/p&gt;

&lt;h2&gt;
  
  
  The crash
&lt;/h2&gt;

&lt;p&gt;The symptom was specific. My MCP server's &lt;code&gt;safari_fill&lt;/code&gt; tool — which dutifully filled ProseMirror by walking React Fiber and calling &lt;code&gt;editor.commands.setContent(html)&lt;/code&gt; — was now crashing the helper daemon and dismissing the composer dialog the instant it touched the contenteditable.&lt;/p&gt;

&lt;p&gt;Same composer URL. Same DOM tree at first glance. Same selectors. Different editor underneath.&lt;/p&gt;

&lt;h2&gt;
  
  
  The DOM tells the truth
&lt;/h2&gt;

&lt;p&gt;I dropped into the browser console and ran the usual probe:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[contenteditable="true"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;editor&lt;/span&gt; &lt;span class="c1"&gt;// -&amp;gt; undefined&lt;/span&gt;
&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;closest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.ProseMirror&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// -&amp;gt; null&lt;/span&gt;
&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;closest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.ql-editor&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// -&amp;gt; &amp;lt;div class="ql-editor"&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There it was. &lt;code&gt;.ql-editor&lt;/code&gt; is the canonical Quill class name. LinkedIn had swapped the post composer from ProseMirror to Quill at some point in early 2026 with no announcement I can find.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why it was crashing
&lt;/h2&gt;

&lt;p&gt;Quill, like ProseMirror, doesn't let you "just" stuff text into the contenteditable. Both editors hold an internal model — Quill calls it a Delta — and the DOM is downstream of that model.&lt;/p&gt;

&lt;p&gt;If you bypass the model and write to the DOM directly, two things happen:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The model and DOM disagree.&lt;/li&gt;
&lt;li&gt;The next user-driven event (a keystroke, a save) triggers a re-render that throws because the diff is incoherent.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's what was killing the composer. My fill was writing to &lt;code&gt;innerText&lt;/code&gt;, the Delta state thought the editor was still empty, the React tree tried to reconcile, and the dialog evaporated. The Swift daemon caught the cascading exception and crashed itself for good measure.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: drive Quill the way it expects to be driven
&lt;/h2&gt;

&lt;p&gt;Quill exposes a programmatic API. You just need a reference to the instance. The lookup order I landed on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Walk up to find an ancestor with class &lt;code&gt;.ql-container&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Try &lt;code&gt;.__quill&lt;/code&gt; on that container: Quill 1.x attaches the instance there directly.&lt;/li&gt;
&lt;li&gt;Fall back to React Fiber: walk up the fiber chain looking for &lt;code&gt;memoizedProps.quill&lt;/code&gt; or &lt;code&gt;stateNode.quill&lt;/code&gt; (LinkedIn wraps Quill in a React component that holds the instance in props).&lt;/li&gt;
&lt;li&gt;If still nothing, fall back to a real CGEvent &lt;code&gt;Cmd+V&lt;/code&gt; paste — Quill respects clipboard events with &lt;code&gt;isTrusted: true&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
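
&lt;p&gt;Steps 1 to 3 of that lookup can be sketched roughly as follows. This is an illustration, not the code that shipped: &lt;code&gt;findQuillInstance&lt;/code&gt; is a hypothetical name, and the &lt;code&gt;__reactFiber$&lt;/code&gt; key prefix is a React internal that can change between versions. It returns &lt;code&gt;null&lt;/code&gt; when nothing is found, so the caller can move on to the paste fallback in step 4.&lt;/p&gt;

```javascript
// Hypothetical helper: find a Quill instance starting from the contenteditable.
// Returns null when steps 1-3 fail, so the caller can fall back to a real paste.
function findQuillInstance(editable) {
  // Step 1: walk up to the nearest ancestor carrying the ql-container class.
  let container = editable;
  while (container) {
    if (container.classList?.contains("ql-container")) break;
    container = container.parentElement;
  }
  if (!container) return null;

  // Step 2: instance attached directly to the container (the .__quill property).
  if (container.__quill) return container.__quill;

  // Step 3: React keeps its fiber on the DOM node under a key that starts with
  // "__reactFiber$" (an internal detail, assumed here). Walk up via `return`.
  const fiberKey = Object.keys(container).find((k) => k.startsWith("__reactFiber$"));
  let fiber = fiberKey ? container[fiberKey] : null;
  while (fiber) {
    const found = fiber.memoizedProps?.quill || fiber.stateNode?.quill;
    if (found) return found;
    fiber = fiber.return;
  }
  return null;
}
```

&lt;p&gt;In a browser console this would be called as &lt;code&gt;findQuillInstance(document.querySelector('.ql-editor'))&lt;/code&gt;.&lt;/p&gt;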

&lt;p&gt;Once you have the instance, the actual fill is one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;quill&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setContents&lt;/span&gt;&lt;span class="p"&gt;([{&lt;/span&gt; &lt;span class="na"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;'api'&lt;/code&gt; source flag is the part that matters. It tells Quill "this came from your own API, update your model and the DOM together." The text commits, the Delta stays consistent, and the React parent doesn't try to reconcile against a corrupted model.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this taught me about platform automation
&lt;/h2&gt;

&lt;p&gt;Two lessons, both old, both worth re-learning:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Editors aren't a stable interface.&lt;/strong&gt; ProseMirror and Quill have different APIs, different state models, and different rules for "what counts as a real edit." Targeting one of them only works until the platform decides it doesn't anymore. LinkedIn made this swap with zero changelog. The only way I knew was that my code broke.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The DOM is the lowest common denominator. The editor model is the actual one.&lt;/strong&gt; Every automation tool that synthesizes events on the contenteditable is operating one layer below the truth. Sometimes that works (because the editor reconciles). Sometimes it doesn't (because the editor crashes or silently discards the input). The robust path is always to find the editor instance and call its API.&lt;/p&gt;

&lt;p&gt;There's a third lesson, which is more uncomfortable: I couldn't fully verify my fix on LinkedIn, because LinkedIn's modal-opening behavior in headless contexts is independently broken right now. The composer button accepts clicks, the dialog DOM materializes, but it never visually opens. So the Quill detection is in place — and verified on test pages — but the LinkedIn-specific live path is still gated on a separate modal issue I haven't cracked.&lt;/p&gt;

&lt;p&gt;This is the texture of platform automation. Two unrelated bugs, same week, same target. Each one looks like the other. You ship a fix for one and the other one masquerades as a regression.&lt;/p&gt;

&lt;h2&gt;
  
  
  The takeaway
&lt;/h2&gt;

&lt;p&gt;If you're building anything that types into a third-party rich text editor — Slack, LinkedIn, Discord, Medium, Notion — the editor identity is part of your contract with the platform, and the platform doesn't owe you stability there. Detect the editor type at runtime. Have a fallback for the unknown case (real clipboard events, ideally). Log what you found, so when it changes you find out from your own telemetry instead of from a Slack message at 11pm.&lt;/p&gt;

&lt;p&gt;And read the contenteditable's class list before you touch it. ProseMirror and Quill have different class signatures and the DOM will tell you what you're dealing with — if you ask.&lt;/p&gt;
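
&lt;p&gt;A runtime check along those lines might look like the sketch below. &lt;code&gt;detectEditor&lt;/code&gt; is a hypothetical name; the two selectors are the ones used in the console probe earlier, and anything else is reported as unknown rather than guessed at.&lt;/p&gt;

```javascript
// Hypothetical detector: classify the editor behind a contenteditable by the
// class names probed earlier. `el` only needs a DOM-style closest() method.
function detectEditor(el) {
  if (el.closest(".ProseMirror")) return "prosemirror";
  if (el.closest(".ql-editor")) return "quill";
  // Unknown editor: the caller should log this and use the clipboard fallback.
  return "unknown";
}
```

&lt;p&gt;In a browser console: &lt;code&gt;detectEditor(document.querySelector('[contenteditable="true"]'))&lt;/code&gt;. Logging the result on every fill is what turns a silent editor swap into a visible change in your own telemetry.&lt;/p&gt;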

&lt;p&gt;The fix shipped in &lt;a href="https://www.npmjs.com/package/safari-mcp" rel="noopener noreferrer"&gt;safari-mcp@2.10.2&lt;/a&gt;. Source on &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>mcp</category>
      <category>ai</category>
      <category>automation</category>
    </item>
    <item>
      <title>When GitHub Actions Goes Silent: The Pending-Forever Bug I Hit Shipping My MCP Server to npm</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Tue, 28 Apr 2026 19:22:06 +0000</pubDate>
      <link>https://dev.to/achiya-automation/when-github-actions-goes-silent-the-pending-forever-bug-i-hit-shipping-my-mcp-server-to-npm-229m</link>
      <guid>https://dev.to/achiya-automation/when-github-actions-goes-silent-the-pending-forever-bug-i-hit-shipping-my-mcp-server-to-npm-229m</guid>
      <description>&lt;p&gt;I have an &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;open-source MCP server&lt;/a&gt;. I tag a release, push, GitHub Actions builds, npm publishes, MCP Registry updates. That's the contract. It worked for v2.7.6 through v2.8.4.&lt;/p&gt;

&lt;p&gt;Then v2.8.5 didn't publish. Neither did v2.8.6. Or v2.9.0. Or v2.9.1. Or v2.9.2. Or v2.9.3.&lt;/p&gt;

&lt;p&gt;Six releases stuck. Not failing — &lt;strong&gt;stuck&lt;/strong&gt;. Yellow dot. Forever.&lt;/p&gt;

&lt;p&gt;Here's what was actually happening. And how I got the releases out without GitHub Actions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The symptom that doesn't match any docs
&lt;/h2&gt;

&lt;p&gt;Every release event triggered the workflow. Every workflow showed up in the runs list. None of them ever started a job.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gh run view 25001890100 &lt;span class="nt"&gt;--json&lt;/span&gt; status,conclusion,jobs
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"status"&lt;/span&gt;: &lt;span class="s2"&gt;"queued"&lt;/span&gt;,
  &lt;span class="s2"&gt;"conclusion"&lt;/span&gt;: null,
  &lt;span class="s2"&gt;"jobs"&lt;/span&gt;: &lt;span class="o"&gt;[]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No conclusion. No jobs. Empty &lt;code&gt;pending_deployments&lt;/code&gt;. Not "waiting for approval". Not "in_progress". Not "failure". Just &lt;strong&gt;pending&lt;/strong&gt; (the API reports it as &lt;code&gt;queued&lt;/code&gt;) with no work scheduled, for 125 hours.&lt;/p&gt;
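
&lt;p&gt;If you want to spot this state from a script rather than by eye, a check over that JSON could look like the sketch below. &lt;code&gt;looksSilentlyStuck&lt;/code&gt; is a hypothetical helper; it only reads the three fields shown in the output above and treats a missing or empty conclusion the same way.&lt;/p&gt;

```javascript
// Hypothetical check over the JSON printed by
// `gh run view RUN_ID --json status,conclusion,jobs`.
// Queued, no conclusion, and no jobs is the symptom described above.
function looksSilentlyStuck(run) {
  const noJobs = Array.isArray(run.jobs) ? run.jobs.length === 0 : true;
  return [run.status === "queued", !run.conclusion, noJobs].every(Boolean);
}

const sample = JSON.parse('{"status":"queued","conclusion":null,"jobs":[]}');
console.log(looksSilentlyStuck(sample)); // true
```

&lt;p&gt;A run that was queued a few seconds ago matches too, so in practice this only means something when combined with the run's age.&lt;/p&gt;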

&lt;p&gt;If you search "GitHub Actions stuck pending", you'll find a hundred forum posts. Every answer assumes one of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You hit the runner concurrency limit (3 for free-tier macOS)&lt;/li&gt;
&lt;li&gt;You have a deployment environment requiring approval&lt;/li&gt;
&lt;li&gt;Your &lt;code&gt;runs-on:&lt;/code&gt; label is unreachable&lt;/li&gt;
&lt;li&gt;You're using self-hosted runners with no online agents&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;None of those applied. My workflow was simple, no environments with required reviewers, &lt;code&gt;runs-on: macos-latest&lt;/code&gt;, no self-hosted runners.&lt;/p&gt;

&lt;h2&gt;
  
  
  The thing GitHub doesn't tell you in the run UI
&lt;/h2&gt;

&lt;p&gt;The runs list shows pending. The run detail page shows pending. The job list shows nothing. The "deployment" tab shows nothing.&lt;/p&gt;

&lt;p&gt;But if you look at your &lt;strong&gt;billing dashboard&lt;/strong&gt;, there's a different story:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Your account has used 100% of included macOS minutes for this billing period.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's it. That's the entire diagnostic. There is no banner on the run page. The workflow doesn't fail with a clear error. It just sits in the queue forever — because the runner that would pick it up doesn't exist, and the queue doesn't time out events.&lt;/p&gt;

&lt;p&gt;The minutes counter resets monthly. Until it does, every release event becomes another silent pending row.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two facts that surprised me
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fact 1: macOS runners cost 10x more than Linux runners.&lt;/strong&gt; Both &lt;code&gt;runs-on: macos-latest&lt;/code&gt; and &lt;code&gt;runs-on: macos-13&lt;/code&gt; charge against your Actions minutes at a 10x multiplier. The free 2,000 minutes/month gets you 200 minutes of macOS — about 20 release builds if each takes 10 minutes.&lt;/p&gt;
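
&lt;p&gt;The arithmetic behind that estimate, written out. The inputs are the figures quoted above (2,000 included minutes, a 10x multiplier) plus an assumed 10-minute build; &lt;code&gt;macosBuildsPerMonth&lt;/code&gt; is just an illustrative name.&lt;/p&gt;

```javascript
// Worked arithmetic for the figures quoted above. Nothing here is fetched
// from GitHub; all three inputs are assumptions taken from the text.
function macosBuildsPerMonth(includedMinutes, multiplier, minutesPerBuild) {
  const effectiveMinutes = includedMinutes / multiplier; // 2000 / 10 = 200
  return Math.floor(effectiveMinutes / minutesPerBuild); // 200 / 10 = 20
}

console.log(macosBuildsPerMonth(2000, 10, 10)); // 20
```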

&lt;p&gt;&lt;strong&gt;Fact 2: Switching to Linux didn't fix it.&lt;/strong&gt; I changed &lt;code&gt;runs-on: macos-latest&lt;/code&gt; to &lt;code&gt;runs-on: ubuntu-latest&lt;/code&gt;. Same symptom. 0 jobs queued, status "pending". Why?&lt;/p&gt;

&lt;p&gt;The macOS minutes meter is one bucket. The Linux meter is another. When the macOS bucket emptied, my pending macOS runs were still in the queue, blocking new runs. Even after switching the workflow to ubuntu, the &lt;em&gt;concurrency group&lt;/em&gt; in the YAML serialized everything:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;concurrency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;publish&lt;/span&gt;
  &lt;span class="na"&gt;cancel-in-progress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So new ubuntu runs queued behind old stuck macOS runs and never started.&lt;/p&gt;

&lt;h2&gt;
  
  
  The two-part fix
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Part 1: workflow_dispatch with tag input
&lt;/h3&gt;

&lt;p&gt;Adding a manual trigger lets you re-publish a tag whose release-event run got stuck, without deleting and recreating the GitHub Release:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;release&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;published&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;tag&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tag&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;publish&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(e.g.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;v2.9.3)"&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In every step that needs the tag, fall back through both event types:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v6&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ github.event.inputs.tag || github.ref_name }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
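&lt;p&gt;To make the fallback expression above concrete, here is a small model of it in JavaScript. &lt;code&gt;resolveRef&lt;/code&gt; and the shape of &lt;code&gt;context&lt;/code&gt; are assumptions for illustration; they only mimic the two fields the workflow reads.&lt;/p&gt;

```javascript
// Hypothetical model of `github.event.inputs.tag || github.ref_name`.
// `context` is a stand-in object, not the real Actions context.
function resolveRef(context) {
  const inputs = (context.event || {}).inputs || {};
  return inputs.tag || context.ref_name;
}

// Manual dispatch: the typed tag wins over the branch the run started from.
console.log(resolveRef({ event: { inputs: { tag: "v2.9.3" } }, ref_name: "main" })); // v2.9.3

// Release event: there are no inputs, so the tag name from the ref is used.
console.log(resolveRef({ event: {}, ref_name: "v2.9.3" })); // v2.9.3
```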



&lt;p&gt;That alone isn't enough — if the runner pool is still empty, the dispatched run also stalls. But it gives you a clean re-trigger path the moment runners are back.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 2: portable runner OS
&lt;/h3&gt;

&lt;p&gt;The workflow downloaded &lt;code&gt;mcp-publisher_darwin_${ARCH}.tar.gz&lt;/code&gt; — hardcoded "darwin". Switching to ubuntu broke that step. Generalize:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Download mcp-publisher&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;OS=$(uname -s | tr '[:upper:]' '[:lower:]')&lt;/span&gt;
    &lt;span class="s"&gt;ARCH=$(uname -m)&lt;/span&gt;
    &lt;span class="s"&gt;if [ "$ARCH" = "x86_64" ]; then ARCH=amd64; fi&lt;/span&gt;
    &lt;span class="s"&gt;if [ "$ARCH" = "aarch64" ]; then ARCH=arm64; fi&lt;/span&gt;
    &lt;span class="s"&gt;curl -fsSL "https://github.com/modelcontextprotocol/registry/releases/latest/download/mcp-publisher_${OS}_${ARCH}.tar.gz" -o mcp-publisher.tar.gz&lt;/span&gt;
    &lt;span class="s"&gt;tar -xzf mcp-publisher.tar.gz mcp-publisher&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the same step works on macOS-arm64, macOS-x86_64, ubuntu-x86_64, and ubuntu-arm64, as long as the release publishes an asset for that OS and architecture.&lt;/p&gt;
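&lt;p&gt;The mapping in that shell step is easy to get subtly wrong, so here is the same logic as a testable function. It is a sketch: &lt;code&gt;publisherAsset&lt;/code&gt; is a hypothetical name, and the asset pattern is copied from the download URL above rather than from any registry documentation.&lt;/p&gt;

```javascript
// Hypothetical helper mirroring the shell step: map `uname -s` / `uname -m`
// output to the asset name pattern used in the download URL above.
function publisherAsset(unameS, unameM) {
  const os = unameS.toLowerCase();
  const archMap = new Map([
    ["x86_64", "amd64"],
    ["aarch64", "arm64"],
  ]);
  // Values such as "arm64" (what macOS reports) pass through unchanged.
  const arch = archMap.get(unameM) || unameM;
  return `mcp-publisher_${os}_${arch}.tar.gz`;
}

console.log(publisherAsset("Darwin", "arm64"));  // mcp-publisher_darwin_arm64.tar.gz
console.log(publisherAsset("Linux", "x86_64"));  // mcp-publisher_linux_amd64.tar.gz
console.log(publisherAsset("Linux", "aarch64")); // mcp-publisher_linux_arm64.tar.gz
```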

&lt;h2&gt;
  
  
  The manual workaround that actually shipped the release
&lt;/h2&gt;

&lt;p&gt;While the workflow was stuck, here's how I got v2.9.3 to npm and the MCP Registry from my laptop:&lt;/p&gt;

&lt;h3&gt;
  
  
  npm: the easy part
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout v2.9.3
npm publish &lt;span class="nt"&gt;--provenance&lt;/span&gt; &lt;span class="nt"&gt;--access&lt;/span&gt; public
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--provenance&lt;/code&gt; needs an OIDC token from a supported CI provider such as GitHub Actions, so it fails on a laptop. Skip it locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm publish &lt;span class="nt"&gt;--access&lt;/span&gt; public
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You lose the provenance attestation, but the package ships. Provenance is a nice-to-have, not a publish blocker.&lt;/p&gt;

&lt;h3&gt;
  
  
  MCP Registry: the trickier part
&lt;/h3&gt;

&lt;p&gt;The MCP Registry's CLI authenticates interactively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mcp-publisher login github
&lt;span class="c"&gt;# Opens a browser, asks you to paste a code, etc.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's fine for humans. For a script — or for a Claude session running headless — you need non-interactive auth. The &lt;code&gt;mcp-publisher&lt;/code&gt; binary accepts &lt;code&gt;-token&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;GH_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;gh auth token&lt;span class="si"&gt;)&lt;/span&gt;
mcp-publisher login github &lt;span class="nt"&gt;-token&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$GH_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
mcp-publisher publish
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;gh&lt;/code&gt; CLI you already use for everything else? Its token works as your GitHub PAT for &lt;code&gt;mcp-publisher&lt;/code&gt;. No browser, no copy-paste.&lt;/p&gt;

&lt;p&gt;After running these, the MCP Registry entry for &lt;code&gt;io.github.achiya-automation/safari-mcp&lt;/code&gt; went from being stuck on v2.7.6 for 3 weeks to listing v2.9.3 with &lt;code&gt;isLatest: true&lt;/code&gt; in about 15 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell past-me
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Check the billing dashboard &lt;em&gt;first&lt;/em&gt;&lt;/strong&gt; when an Actions run sits pending with no error. The run UI does not surface "you're out of minutes". The billing page does.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't trust &lt;code&gt;runs-on: ubuntu-latest&lt;/code&gt; to "just be cheaper"&lt;/strong&gt; — it is, but if you've burned your macOS minutes on stalled runs, the queue can still serialize new ones behind dead ones via your &lt;code&gt;concurrency:&lt;/code&gt; group.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep a manual publish path documented.&lt;/strong&gt; Both npm and the MCP Registry have non-interactive auth options. Write the bash one-liners somewhere your future self can find them at 2am.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;workflow_dispatch&lt;/code&gt; with a tag input is cheap insurance.&lt;/strong&gt; It costs you 6 lines of YAML and saves you from needing to delete-and-recreate GitHub Releases when the release-event run gets stuck.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why didn't a &lt;code&gt;timeout-minutes:&lt;/code&gt; rescue me?&lt;/strong&gt;&lt;br&gt;
That's a job-level timeout. It applies once a job &lt;em&gt;starts&lt;/em&gt;. A run that never starts a job has nothing to time out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Couldn't I have used a self-hosted runner?&lt;/strong&gt;&lt;br&gt;
Yes — and that's the right answer for high-volume projects. For an OSS hobby project, self-hosted is operationally heavier than the manual publish path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Doesn't &lt;code&gt;--provenance&lt;/code&gt; matter for supply-chain security?&lt;/strong&gt;&lt;br&gt;
For widely-installed packages, yes. For an OSS project's own emergency-publish workaround, the trade-off is "ship the release without provenance" vs "ship nothing". Pick the first one and re-publish with provenance on the next clean release.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Could I have known about the billing limit before hitting it?&lt;/strong&gt;&lt;br&gt;
GitHub does send an email when you cross 75% of your minutes. The email goes to the address on your billing account, which may not be the address you watch. Worth setting up a filter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What about Actions minutes for OSS public repos?&lt;/strong&gt;&lt;br&gt;
GitHub's documentation says standard GitHub-hosted runners are free for public repositories; private repositories and larger runners draw on your plan's included minutes. If a run is being metered anyway, check the actual numbers under Settings → Billing → Plans for your specific account type.&lt;/p&gt;




&lt;p&gt;If you've hit a similar stuck-pending pattern with no error in the run UI — that's the bug. Check your minutes. Then ship from your laptop.&lt;/p&gt;

&lt;p&gt;The repo (with the workflow that handles all this now) is &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;safari-mcp on GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>github</category>
      <category>ci</category>
      <category>npm</category>
      <category>devops</category>
    </item>
    <item>
      <title>The 3 isTrusted:false Bugs That Made LinkedIn Posts Impossible From My MCP Server</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Wed, 22 Apr 2026 14:36:56 +0000</pubDate>
      <link>https://dev.to/achiya-automation/the-3-istrustedfalse-bugs-that-made-linkedin-posts-impossible-from-my-mcp-server-102f</link>
      <guid>https://dev.to/achiya-automation/the-3-istrustedfalse-bugs-that-made-linkedin-posts-impossible-from-my-mcp-server-102f</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;I couldn't post to LinkedIn from my MCP server. Not "sometimes fails" — &lt;em&gt;never works&lt;/em&gt;. I assumed one bug. I was wrong. I found three, stacked, and each one looked like success to every automation tool I tried. Here is the anatomy of why your agent's "I posted it!" lies to you when a rich-text editor sits inside a dialog.&lt;/p&gt;




&lt;h2&gt;
  
  
  The symptom
&lt;/h2&gt;

&lt;p&gt;I ship Safari MCP — an MCP server that drives the Safari you are already logged into. 80 tools. &lt;code&gt;safari_fill&lt;/code&gt; is the most-used one. For three months it worked everywhere — Gmail, GitHub, Ahrefs, Google Docs, Shopify admin.&lt;/p&gt;

&lt;p&gt;Then I tried posting to LinkedIn from an agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;safari_fill&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Shipping v2.9.0 — modal detection in snapshot!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Filled. 46 chars.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Except the LinkedIn composer was empty. And closed. And I had a cursor in my address bar.&lt;/p&gt;

&lt;p&gt;Three hours later I had a list. Three separate boundaries, each silently sabotaging the one before it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Boundary 1: &lt;code&gt;focusout&lt;/code&gt; dismisses the dialog
&lt;/h2&gt;

&lt;p&gt;LinkedIn's composer is a &lt;code&gt;&amp;lt;div role="dialog"&amp;gt;&lt;/code&gt;. Specifically, its share composer listens for &lt;code&gt;focusout&lt;/code&gt; on any descendant and closes the modal — the UX intent is "clicked outside → close."&lt;/p&gt;

&lt;p&gt;My fill path did this at the end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Old: "polite" contenteditable fill (pseudo-code)&lt;/span&gt;
&lt;span class="nf"&gt;setEditableContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;editableEl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;editableEl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispatchEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;bubbles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="nx"&gt;editableEl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;blur&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;  &lt;span class="c1"&gt;// ← here's the assassin&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;blur()&lt;/code&gt; call was there for a reason: some React form libraries only commit state on blur. Perfectly reasonable on a standalone textarea. Inside a dialog? The &lt;code&gt;focusout&lt;/code&gt; listener takes the blur, concludes the user clicked away, and runs the dismiss animation.&lt;/p&gt;

&lt;p&gt;My fill &lt;em&gt;worked&lt;/em&gt;. For ~40ms. Then the dialog DOM disappeared and the text with it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Remove the &lt;code&gt;blur()&lt;/code&gt;. Modern React-based contenteditable editors generally commit state from &lt;code&gt;input&lt;/code&gt; alone. If a site truly requires &lt;code&gt;blur&lt;/code&gt; to persist, it is broken for keyboard users anyway.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But removing blur was not enough.&lt;/strong&gt; The next run showed the text finally landing, the Post button enabling — and then the button click did nothing. Why?&lt;/p&gt;




&lt;h2&gt;
  
  
  Boundary 2: ProseMirror's &lt;code&gt;isTrusted:false&lt;/code&gt; paste rejection
&lt;/h2&gt;

&lt;p&gt;LinkedIn's composer &lt;em&gt;was&lt;/em&gt; ProseMirror when I started debugging. (They have since migrated to Lexical. We will get there.) ProseMirror has a paste handler. That handler is strict:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Paste guard, paraphrased (not verbatim ProseMirror source)&lt;/span&gt;
&lt;span class="nf"&gt;handlePaste&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;view&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isTrusted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Synthetic paste events don't reflect real user intent.&lt;/span&gt;
    &lt;span class="c1"&gt;// Reject them — the editor state must only change from real input.&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;// ... normal paste handling continues here&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a &lt;em&gt;security&lt;/em&gt; decision, not a UX one. &lt;code&gt;event.isTrusted&lt;/code&gt; is only &lt;code&gt;true&lt;/code&gt; when the browser itself dispatches the event — a real keystroke, a real paste, a real click. JavaScript &lt;code&gt;new Event()&lt;/code&gt; or &lt;code&gt;dispatchEvent()&lt;/code&gt; produces &lt;code&gt;isTrusted:false&lt;/code&gt; every time.&lt;/p&gt;
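&lt;p&gt;You can see the flag for yourself in any runtime that has the DOM &lt;code&gt;Event&lt;/code&gt; class (browsers, or recent Node.js). The snippet below is a minimal sketch; &lt;code&gt;acceptIfTrusted&lt;/code&gt; is a hypothetical guard written for illustration, not code from any editor.&lt;/p&gt;

```javascript
// Any event constructed from script is untrusted, whatever its type.
const synthetic = new Event("paste", { bubbles: true });
console.log(synthetic.isTrusted); // false

// Hypothetical guard in the style described above: ignore script-made events,
// apply the change only when the browser itself dispatched the event.
function acceptIfTrusted(event, apply) {
  if (!event.isTrusted) return false;
  apply(event);
  return true;
}

let applied = 0;
console.log(acceptIfTrusted(synthetic, () => { applied += 1; })); // false
```

&lt;p&gt;Nothing a page script sets on the event changes the outcome, because &lt;code&gt;isTrusted&lt;/code&gt; is read-only.&lt;/p&gt;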

&lt;p&gt;My fill was dispatching &lt;code&gt;new ClipboardEvent('paste', { clipboardData: ... })&lt;/code&gt;. The editor reached its paste handler, saw &lt;code&gt;isTrusted:false&lt;/code&gt;, and bailed. The &lt;code&gt;execCommand('insertText')&lt;/code&gt; fallback went the same way.&lt;/p&gt;

&lt;p&gt;The character-by-character &lt;code&gt;beforeinput&lt;/code&gt; dispatch? Also &lt;code&gt;isTrusted:false&lt;/code&gt;. Also rejected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix that worked (and broke in Boundary 3):&lt;/strong&gt; Route through a real OS paste. I already had a &lt;code&gt;_nativeTypeViaClipboard&lt;/code&gt; path, which uses AppleScript to set the system clipboard and then dispatches a real Cmd+V via macOS CGEvent. The browser sees it as a real user paste. &lt;code&gt;isTrusted&lt;/code&gt; is &lt;code&gt;true&lt;/code&gt;. Editor accepts it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Boundary 3: CGEvent Cmd+V steals focus, triggers Boundary 1
&lt;/h2&gt;

&lt;p&gt;Remember that Boundary 1 was "focusout dismisses the dialog"? Well...&lt;/p&gt;

&lt;p&gt;The CGEvent Cmd+V path delivers the keystroke to the frontmost window. To &lt;em&gt;be&lt;/em&gt; the frontmost window, Safari has to be active. When I programmatically activate Safari via &lt;code&gt;NSApplication activateIgnoringOtherApps&lt;/code&gt;, the previously focused window loses focus for a brief moment. Chrome's "focus stealing" behavior is a documented pet peeve of every automation tool; Safari is no different.&lt;/p&gt;

&lt;p&gt;So the sequence was:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;CGEvent fires Cmd+V&lt;/li&gt;
&lt;li&gt;Safari gets activated (taking focus briefly)&lt;/li&gt;
&lt;li&gt;The composer editor sees &lt;code&gt;focusout&lt;/code&gt; during the ~10ms activation window&lt;/li&gt;
&lt;li&gt;Dialog dismisses&lt;/li&gt;
&lt;li&gt;Paste lands — but on the &lt;em&gt;feed&lt;/em&gt; underneath the now-closed dialog&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Cool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First fix attempt:&lt;/strong&gt; Use a background-activation variant that does not foreground Safari. This worked but required the user's Safari to &lt;em&gt;already&lt;/em&gt; be the active app (fragile — the point of MCP is the user is doing other work).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second fix attempt — the one that stuck:&lt;/strong&gt; Bypass the OS keyboard entirely. Drive the editor through its own internal API.&lt;/p&gt;




&lt;h2&gt;
  
  
  The actual fix: editor-native API access
&lt;/h2&gt;

&lt;p&gt;LinkedIn's composer (as of 2026-04) is Lexical, not ProseMirror. Lexical is Meta's rich-text editor framework, the successor to Draft.js. It is also used in Shopify admin, some Meta apps, and newer Notion surfaces.&lt;/p&gt;

&lt;p&gt;Lexical exposes the editor instance on its DOM root element:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;editorEl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[data-lexical-editor="true"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;editor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;editorEl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;__lexicalEditor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// the actual LexicalEditor instance&lt;/span&gt;

&lt;span class="c1"&gt;// Build a minimal root → paragraph → text document&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;editor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parseEditorState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;root&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;children&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
      &lt;span class="na"&gt;children&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;detail&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;normal&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
      &lt;span class="na"&gt;direction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ltr&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;indent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;paragraph&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="na"&gt;direction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ltr&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;indent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;root&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="nx"&gt;editor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setEditorState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;newState&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zero synthetic events. Zero focus shift. Zero clipboard. The editor updates its own state directly. Lexical's internal invariants hold. React re-renders the contenteditable tree through its normal diff path. The Post button observes the state change and enables itself.&lt;/p&gt;

&lt;p&gt;For ProseMirror (which LinkedIn used to use), the equivalent is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pmView&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;editorEl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pmViewDesc&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;view&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// ProseMirror's EditorView&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pmView&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pmView&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;selection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;pmView&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same principle: do not pretend to be a user. Be a caller.&lt;/p&gt;
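&lt;p&gt;One way to choose between the two paths is to probe the element for the properties used above. This is a rough sketch, not the Safari MCP implementation: &lt;code&gt;detectEditorKind&lt;/code&gt; is a hypothetical name, and it assumes those properties sit on the element you pass in.&lt;/p&gt;

```javascript
// Hypothetical dispatcher: pick an editor-native path based on the
// properties named above (`__lexicalEditor`, `pmViewDesc`).
function detectEditorKind(el) {
  if (el.__lexicalEditor) return "lexical";
  if (el.pmViewDesc) return "prosemirror";
  return "unknown"; // caller falls back to a DOM-level fill
}

// Plain objects stand in for DOM elements here.
console.log(detectEditorKind({ __lexicalEditor: {} })); // lexical
console.log(detectEditorKind({ pmViewDesc: {} }));      // prosemirror
console.log(detectEditorKind({}));                      // unknown
```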




&lt;h2&gt;
  
  
  The cascade of falsified "success"
&lt;/h2&gt;

&lt;p&gt;Here is what is unsettling: &lt;strong&gt;every stage of every failed attempt returned &lt;code&gt;success&lt;/code&gt; to my agent.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Setting the editable element's content → value set, DOM mutation event fires, "success"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;dispatchEvent(new ClipboardEvent('paste'))&lt;/code&gt; → handler called, &lt;code&gt;preventDefault&lt;/code&gt; returned, "looks like paste fired, success"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;_nativeTypeViaClipboard&lt;/code&gt; → Cmd+V fired, clipboard had the content, "success"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only honest verification is: did the editor &lt;em&gt;state&lt;/em&gt; update? Not the DOM. Not the visible text. Not the event log. The editor's own source of truth.&lt;/p&gt;

&lt;p&gt;For Lexical: &lt;code&gt;editor.getEditorState().toJSON()&lt;/code&gt;. Compare to what you expected. Now you know.&lt;/p&gt;
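&lt;p&gt;A minimal sketch of that comparison, assuming the serialized state has the root → paragraph → text shape shown earlier. &lt;code&gt;lexicalStateText&lt;/code&gt; and &lt;code&gt;fillLanded&lt;/code&gt; are hypothetical helpers; node types other than &lt;code&gt;text&lt;/code&gt; (line breaks, mentions, and so on) are simply treated as containers here.&lt;/p&gt;

```javascript
// Collect the text of every text node in a serialized Lexical state,
// joining top-level blocks (children of root) with newlines.
function lexicalStateText(stateJson) {
  const textOf = (node) =>
    node.type === "text"
      ? node.text
      : (node.children || []).map(textOf).join("");
  return stateJson.root.children.map(textOf).join("\n");
}

// True only when the editor's own state holds exactly the expected text.
function fillLanded(stateJson, expected) {
  return lexicalStateText(stateJson) === expected;
}

// Stand-in for what `editor.getEditorState().toJSON()` would return.
const sampleState = {
  root: {
    type: "root",
    children: [
      { type: "paragraph", children: [{ type: "text", text: "Hello" }] },
      { type: "paragraph", children: [{ type: "text", text: "World" }] },
    ],
  },
};

console.log(fillLanded(sampleState, "Hello\nWorld")); // true
console.log(fillLanded(sampleState, "Hello"));        // false
```

&lt;p&gt;In a page context the call would be &lt;code&gt;fillLanded(editor.getEditorState().toJSON(), value)&lt;/code&gt;, run after the fill and before reporting success to the agent.&lt;/p&gt;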

&lt;p&gt;This is why your agent's "I posted it" lies. Every layer of the automation stack reports local success. None of them verified the editor's internal state matched the intent.&lt;/p&gt;




&lt;h2&gt;
  
  
  Generalizations
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Blur is radioactive in dialogs.&lt;/strong&gt; Audit every automation tool's fill path. If it calls &lt;code&gt;.blur()&lt;/code&gt;, it will close some modal somewhere.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;isTrusted:false&lt;/code&gt; is a one-way door.&lt;/strong&gt; Real-world rich-text editors audit it. Your synthetic paste/input/keydown will not cross. Either use a native OS path (Cmd+V via CGEvent/winuser) or drive the editor API directly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Native OS paste moves focus.&lt;/strong&gt; Which is fine — unless the target is inside a dialog that listens for focus loss. In that case, drive the editor API directly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Editor API access is undocumented but stable.&lt;/strong&gt; &lt;code&gt;__lexicalEditor&lt;/code&gt;, &lt;code&gt;pmViewDesc.view&lt;/code&gt;, Draft.js's internal store: these have all been in production for years because the editors are &lt;em&gt;themselves&lt;/em&gt; stable. They are not public, but they are not moving.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Trust nothing downstream of the editor.&lt;/strong&gt; The rendered DOM, text content, visible interface — any of these can be right while the editor's internal state is wrong. Verify editor state, not DOM state.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What this means if you use or build MCP servers
&lt;/h2&gt;

&lt;p&gt;Most MCP browser tools today use &lt;code&gt;page.type()&lt;/code&gt; or &lt;code&gt;element.fill()&lt;/code&gt; — thin wrappers over DOM events. They will work for 80% of forms and silently fail for rich editors inside dialogs (which is roughly: every post/comment/share UI on every major social site, Notion, Google Docs, JIRA, Salesforce rich notes, Shopify description fields).&lt;/p&gt;

&lt;p&gt;If you are evaluating browser-automation MCP servers for agent workflows that involve content creation, test this specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can it post to LinkedIn?&lt;/li&gt;
&lt;li&gt;Can it type a multi-line comment on GitHub?&lt;/li&gt;
&lt;li&gt;Can it fill a Notion page with formatted text?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If any of those fail silently (returns "success" but the target app shows nothing), the tool has one of these three bugs.&lt;/p&gt;




&lt;p&gt;Safari MCP v2.9.4 ships the Lexical-native path. If you are on macOS and want to try it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx safari-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MIT, 80 tools, &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;github.com/achiya-automation/safari-mcp&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Is there a fourth boundary I missed? Drop a comment — I will buy the bug report with a merch sticker if it forces a v2.9.5.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>browserautomation</category>
      <category>webdev</category>
    </item>
    <item>
      <title>WhatsApp Bot for Business 2026 — $1K-$4K (50+ Real Builds)</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Mon, 20 Apr 2026 19:54:12 +0000</pubDate>
      <link>https://dev.to/achiya-automation/whatsapp-bot-for-business-2026-1k-4k-50-real-builds-ba4</link>
      <guid>https://dev.to/achiya-automation/whatsapp-bot-for-business-2026-1k-4k-50-real-builds-ba4</guid>
      <description>&lt;p&gt;WhatsApp has over 2 billion users worldwide. If your customers are on WhatsApp (and they probably are), a bot can handle inquiries 24/7, book appointments, and qualify leads — while you sleep.&lt;/p&gt;

&lt;p&gt;But there's a catch: do it wrong, and Meta will restrict or ban your number. I've seen businesses lose their primary WhatsApp number because they used the wrong tool, sent messages to people who didn't opt in, or scaled too aggressively.&lt;/p&gt;

&lt;p&gt;This guide covers how to build a WhatsApp bot properly — which API to use, how to avoid bans, and what a realistic setup looks like.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Official API&lt;/strong&gt; (via BSP) — safe, verified, but costs $50-100/month + per-message fees. Best for established businesses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WAHA&lt;/strong&gt; (unofficial, open-source) — free, flexible, but not endorsed by Meta. Risk of account restrictions. Best for small businesses starting out&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ban prevention&lt;/strong&gt; — get opt-in before messaging, don't send bulk unsolicited messages, respond to conversations (don't just broadcast)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Realistic cost&lt;/strong&gt; — $1,000-4,000 setup + $5-100/month ongoing, depending on your approach&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Two Ways to Connect: Official API vs. WAHA
&lt;/h2&gt;

&lt;p&gt;This is the first decision you'll make, and it affects everything else — cost, reliability, features, and risk.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 1: Official WhatsApp Business API
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://business.whatsapp.com/" rel="noopener noreferrer"&gt;WhatsApp Business API&lt;/a&gt; is Meta's official solution for businesses. You access it through a Business Solution Provider (BSP) like Twilio, 360dialog, or MessageBird. (If you want a full breakdown of BSPs, fees, and the onboarding process, see our &lt;a href="https://dev.to/en/blog/whatsapp-business-api-guide/"&gt;WhatsApp Business API guide&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Sign up with a BSP&lt;/li&gt;
&lt;li&gt;Verify your business with Meta&lt;/li&gt;
&lt;li&gt;Get a dedicated phone number (or use an existing one)&lt;/li&gt;
&lt;li&gt;Send and receive messages through the API&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Pricing (as of March 2026):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;BSP monthly fee: $50-100/month (varies by provider)&lt;/li&gt;
&lt;li&gt;Per-conversation fees (set by Meta):

&lt;ul&gt;
&lt;li&gt;Marketing conversations: ~$0.035/conversation (varies by country)&lt;/li&gt;
&lt;li&gt;Utility conversations (order updates, etc.): ~$0.005/conversation&lt;/li&gt;
&lt;li&gt;Service conversations (customer-initiated): &lt;strong&gt;free&lt;/strong&gt; (Meta removed the earlier 1,000/month cap in November 2024)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Template messages must be pre-approved by Meta&lt;/li&gt;

&lt;/ul&gt;
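
&lt;p&gt;To see how those fees combine, here is a minimal Python sketch. It assumes the approximate rates listed above; the function name and the sample volumes are made up for illustration, and real rates vary by country and provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Approximate per-conversation rates quoted above (USD); real rates vary by country.
MARKETING_RATE = 0.035
UTILITY_RATE = 0.005


def estimate_monthly_cost(bsp_fee, marketing_conversations, utility_conversations):
    """Rough monthly total: BSP fee plus Meta's per-conversation charges."""
    meta_fees = (
        marketing_conversations * MARKETING_RATE
        + utility_conversations * UTILITY_RATE
    )
    return round(bsp_fee + meta_fees, 2)


if __name__ == "__main__":
    # Hypothetical month: $50 BSP plan, 1,000 marketing and 2,000 utility conversations.
    print(estimate_monthly_cost(50, 1000, 2000))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With those sample numbers the total comes to roughly $95 for the month, and most of the Meta portion comes from marketing conversations.&lt;/p&gt;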

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Officially supported — no risk of account bans for API usage&lt;/li&gt;
&lt;li&gt;Green checkmark verification available&lt;/li&gt;
&lt;li&gt;Template messages for outbound messaging&lt;/li&gt;
&lt;li&gt;Higher rate limits&lt;/li&gt;
&lt;li&gt;Multi-device support built in&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Per-message costs add up at scale&lt;/li&gt;
&lt;li&gt;BSP adds another vendor and monthly cost&lt;/li&gt;
&lt;li&gt;Template approval process can be slow (24-72 hours)&lt;/li&gt;
&lt;li&gt;Less flexibility — you can only do what the API allows&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Option 2: WAHA (Unofficial WhatsApp API)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Important: WAHA is NOT an official WhatsApp product.&lt;/strong&gt; It's an &lt;a href="https://github.com/devlikeapro/waha" rel="noopener noreferrer"&gt;open-source project&lt;/a&gt; that provides API access to WhatsApp by connecting through WhatsApp Web's protocol. Meta does not endorse or support it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Self-host WAHA on your server (Docker)&lt;/li&gt;
&lt;li&gt;Scan a QR code with your WhatsApp number (like WhatsApp Web)&lt;/li&gt;
&lt;li&gt;Send and receive messages through WAHA's REST API&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;WAHA Core: free and open-source&lt;/li&gt;
&lt;li&gt;WAHA Plus: paid version with additional features&lt;/li&gt;
&lt;li&gt;With WAHA Core, your only cost is server hosting ($5-20/month)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No per-message fees&lt;/li&gt;
&lt;li&gt;No template approval process&lt;/li&gt;
&lt;li&gt;Full flexibility — send any message type&lt;/li&gt;
&lt;li&gt;Open-source — you can inspect and modify the code&lt;/li&gt;
&lt;li&gt;No BSP middleman&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Not endorsed by Meta&lt;/strong&gt; — using it technically violates WhatsApp's Terms of Service&lt;/li&gt;
&lt;li&gt;Risk of account restrictions if you trigger spam detection&lt;/li&gt;
&lt;li&gt;Relies on WhatsApp Web protocol — can break when WhatsApp updates&lt;/li&gt;
&lt;li&gt;No green checkmark&lt;/li&gt;
&lt;li&gt;Phone must stay connected (though WAHA handles multi-device well)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Which Should You Choose?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Established business, customer communications&lt;/td&gt;
&lt;td&gt;Official API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Marketing campaigns and broadcasts&lt;/td&gt;
&lt;td&gt;Official API (with opt-in)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Small business, responding to incoming messages&lt;/td&gt;
&lt;td&gt;WAHA can work well&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testing and prototyping&lt;/td&gt;
&lt;td&gt;WAHA (lower cost to experiment)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Highly regulated industry (healthcare, finance)&lt;/td&gt;
&lt;td&gt;Official API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Budget-conscious startup&lt;/td&gt;
&lt;td&gt;WAHA to start, migrate to official later&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In our experience, many small businesses start with WAHA because the barrier to entry is lower, then migrate to the official API as they scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to NOT Get Banned (Critical)
&lt;/h2&gt;

&lt;p&gt;Whether you use the official API or WAHA, these rules apply:&lt;/p&gt;

&lt;h3&gt;
  
  
  The #1 Rule: Get Opt-In First
&lt;/h3&gt;

&lt;p&gt;Never send the first message to someone who hasn't explicitly asked to hear from you. This is both a WhatsApp policy requirement and common sense.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Good:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer fills out a form and checks "Contact me on WhatsApp"&lt;/li&gt;
&lt;li&gt;Customer sends you a message first and you respond&lt;/li&gt;
&lt;li&gt;Customer explicitly asks to receive updates via WhatsApp&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Bad:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You buy a list of phone numbers and blast them all&lt;/li&gt;
&lt;li&gt;You scrape numbers from websites and send cold messages&lt;/li&gt;
&lt;li&gt;You add everyone in your phone contacts to a broadcast list&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Rate Limiting
&lt;/h3&gt;

&lt;p&gt;Don't send hundreds of messages per minute. WhatsApp's detection algorithms look for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High volume in short time&lt;/strong&gt; — sending 500 messages in 5 minutes is a red flag&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identical messages&lt;/strong&gt; — sending the exact same text to many numbers looks like spam&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High block rate&lt;/strong&gt; — if many recipients block you, your quality rating drops fast&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Messaging numbers that don't have you saved&lt;/strong&gt; — this is a strong spam signal, especially at volume&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a deeper breakdown of the exact thresholds and how WhatsApp's four-layer detection system works in 2026, see the &lt;a href="https://dev.to/en/blog/whatsapp-spam-detection-2026/"&gt;WhatsApp spam detection guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Safe practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Space out bulk messages (add 2-5 second delays between sends)&lt;/li&gt;
&lt;li&gt;Personalize messages (use the recipient's name, reference their specific inquiry)&lt;/li&gt;
&lt;li&gt;Keep your block rate under 2-3%&lt;/li&gt;
&lt;li&gt;Start slow — send to 50 people first, monitor for blocks, then scale gradually&lt;/li&gt;
&lt;/ul&gt;
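
&lt;p&gt;The safe practices above can be sketched in a few lines of Python. This is an illustration under assumptions, not a drop-in sender: &lt;code&gt;plan_batch&lt;/code&gt;, &lt;code&gt;run_batch&lt;/code&gt;, the sample contact, and the &lt;code&gt;send&lt;/code&gt; callback are hypothetical names, and you would plug in whatever actually delivers the message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import random
import time


def plan_batch(recipients, template, batch_size=50, min_delay=2.0, max_delay=5.0, rng=None):
    """Return (delay_seconds, chat_id, text) tuples for the first batch only."""
    rng = rng or random.Random()
    plan = []
    for person in recipients[:batch_size]:  # start slow: cap the first batch
        text = template.format(name=person["name"])  # personalize each message
        delay = rng.uniform(min_delay, max_delay)  # random 2-5 second gap
        plan.append((delay, person["chat_id"], text))
    return plan


def run_batch(plan, send, sleep=time.sleep):
    """Send each planned message, pausing before every send."""
    for delay, chat_id, text in plan:
        sleep(delay)
        send(chat_id, text)


if __name__ == "__main__":
    # Hypothetical opted-in contact; the print-based send stands in for a real API call.
    people = [{"name": "Dana", "chat_id": "15550000001@c.us"}]
    plan = plan_batch(people, "Hi {name}, your appointment is confirmed.")
    run_batch(plan, send=lambda chat_id, text: print(chat_id, text), sleep=lambda s: None)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the plan separate from the sending loop lets you review the batch, and check your block rate, before anything goes out.&lt;/p&gt;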

&lt;h3&gt;
  
  
  WAHA-Specific Precautions
&lt;/h3&gt;

&lt;p&gt;If you're using WAHA (unofficial API):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Use a dedicated number&lt;/strong&gt; — don't risk your primary business number&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't blast&lt;/strong&gt; — WAHA is best for responding to incoming messages, not mass outbound campaigns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor your quality&lt;/strong&gt; — if you notice messages not delivering, stop and investigate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Have a backup plan&lt;/strong&gt; — if the number gets restricted, you need to be able to switch to the official API or a new number&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep sessions stable&lt;/strong&gt; — frequent disconnections/reconnections can trigger flags&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Building Your Bot: A Practical Walkthrough
&lt;/h2&gt;

&lt;p&gt;Here's how a typical WhatsApp bot setup looks using &lt;a href="https://n8n.io/get-started/?ref=achiya" rel="noopener noreferrer"&gt;n8n&lt;/a&gt; (our preferred automation platform) and WAHA.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture Overview
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Customer sends WhatsApp message
        ↓
    WAHA (receives message via WhatsApp Web)
        ↓
    Webhook → n8n (processes the message)
        ↓
    Logic: FAQ? Appointment? Lead? → Route accordingly
        ↓
    Response sent back through WAHA
        ↓
    Customer receives reply on WhatsApp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 1: Set Up WAHA
&lt;/h3&gt;

&lt;p&gt;WAHA runs as a Docker container. Basic setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# docker-compose.yml&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;waha&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;devlikeapro/waha&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3000:3000"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;WHATSAPP_DEFAULT_ENGINE=WEBJS&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;WAHA_DASHBOARD_ENABLED=true&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;waha_data:/app/.sessions&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;waha_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After starting it (&lt;code&gt;docker compose up -d&lt;/code&gt;), open &lt;code&gt;http://your-server:3000/dashboard&lt;/code&gt;, start a session, and scan the QR code with your phone.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Connect to n8n
&lt;/h3&gt;

&lt;p&gt;In &lt;a href="https://n8n.io/get-started/?ref=achiya" rel="noopener noreferrer"&gt;n8n&lt;/a&gt;, create a webhook node that WAHA will call when messages arrive:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add a &lt;strong&gt;Webhook&lt;/strong&gt; node — this receives incoming messages&lt;/li&gt;
&lt;li&gt;Configure WAHA to send webhooks to your n8n webhook URL&lt;/li&gt;
&lt;li&gt;Add a &lt;strong&gt;Switch&lt;/strong&gt; node to route messages based on content&lt;/li&gt;
&lt;li&gt;Add response nodes to send replies back through WAHA's API&lt;/li&gt;
&lt;/ol&gt;
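
&lt;p&gt;If you want to see what those nodes do under the hood, here is a rough Python equivalent. It assumes WAHA's webhook body carries &lt;code&gt;event&lt;/code&gt;, &lt;code&gt;payload.from&lt;/code&gt; and &lt;code&gt;payload.body&lt;/code&gt;, and that replies go to &lt;code&gt;POST /api/sendText&lt;/code&gt;; check both against the docs for your WAHA version. The helper names and the sample phone number are hypothetical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import urllib.request


def extract_message(event):
    """Pull sender and text out of a WAHA "message" webhook body; None for other events."""
    if event.get("event") != "message":
        return None
    payload = event.get("payload", {})
    return payload.get("from"), payload.get("body", "")


def build_reply_request(base_url, chat_id, text, session="default"):
    """Build the POST that asks WAHA to send a text message."""
    body = json.dumps({"session": session, "chatId": chat_id, "text": text})
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/sendText",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Sample webhook body with a made-up sender.
    incoming = {
        "event": "message",
        "session": "default",
        "payload": {"from": "15550000001@c.us", "body": "What are your hours?"},
    }
    sender, text = extract_message(incoming)
    request = build_reply_request("http://localhost:3000", sender, "We are open 9-17.")
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would send it to a running WAHA instance.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In n8n, the same request is an HTTP Request node pointed at your WAHA URL. If you protect WAHA with an API key, the request also needs that key in a header.&lt;/p&gt;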

&lt;h3&gt;
  
  
  Step 3: Build Your Logic
&lt;/h3&gt;

&lt;p&gt;A basic FAQ bot might look like this in n8n:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Webhook (incoming message)
    → Switch node:
        - Contains "hours" or "open" → Send business hours
        - Contains "price" or "cost" → Send pricing info
        - Contains "appointment" or "book" → Start booking flow
        - Default → "Thanks for reaching out! A team member will reply shortly."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
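
&lt;p&gt;The same routing logic, written out as a small Python sketch. The branch names are invented for illustration; in n8n each one would be an output of the Switch node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Each entry pairs trigger keywords with the branch that handles them.
ROUTES = [
    (("hours", "open"), "business_hours"),
    (("price", "cost"), "pricing"),
    (("appointment", "book"), "booking"),
]


def route(message):
    """Return the first branch whose keyword appears in the message, else "default"."""
    text = message.lower()
    for keywords, branch in ROUTES:
        if any(word in text for word in keywords):
            return branch
    return "default"


if __name__ == "__main__":
    for sample in ["What are your hours?", "How much does it cost?", "Hello there"]:
        print(sample, route(sample))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Plain substring matching is easy to read but blunt: "book" also matches "Facebook", and the first matching rule wins. That bluntness is one reason to consider the AI step below.&lt;/p&gt;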



&lt;h3&gt;
  
  
  Step 4: Add AI (Optional)
&lt;/h3&gt;

&lt;p&gt;To make your bot smarter, add an AI node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Webhook (incoming message)
    → OpenAI/Claude node:
        System prompt: "You are a helpful assistant for [Business Name].
        You know: [business hours, services, pricing, FAQ].
        If you can't answer, say you'll connect them with a human."
    → Send AI response via WAHA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This turns your bot from a rigid keyword-matcher into a conversational agent that understands natural language.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Use Cases
&lt;/h2&gt;

&lt;p&gt;These are use cases we've implemented (described in general terms):&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Appointment Scheduling
&lt;/h3&gt;

&lt;p&gt;The bot asks what service the customer needs, checks available time slots from Google Calendar, proposes options, and books the appointment — all within WhatsApp. Confirmation and reminder messages are automated.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Lead Qualification
&lt;/h3&gt;

&lt;p&gt;When a new lead messages, the bot asks 3-4 qualifying questions (budget, timeline, requirements). Qualified leads get forwarded to a human agent immediately. Unqualified leads get a helpful resource and are added to a follow-up sequence.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Order Status Updates
&lt;/h3&gt;

&lt;p&gt;Connected to the business's order management system, the bot responds to "Where's my order?" with real-time tracking information. No human intervention needed for 80%+ of status inquiries.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. FAQ + Human Handoff
&lt;/h3&gt;

&lt;p&gt;The bot handles common questions (pricing, hours, location, services). When it can't answer or the customer asks for a human, the conversation is routed to a support agent in &lt;a href="https://www.chatwoot.com/?via=achiya-automation" rel="noopener noreferrer"&gt;Chatwoot&lt;/a&gt; (open-source customer support platform; &lt;strong&gt;5% off Cloud with code &lt;code&gt;UJR5GXWK&lt;/code&gt;&lt;/strong&gt;) — with full conversation history preserved.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Actually Costs
&lt;/h2&gt;

&lt;p&gt;Here's a realistic breakdown for a small business:&lt;/p&gt;

&lt;h3&gt;
  
  
  DIY with WAHA + n8n (self-hosted)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;VPS (2GB RAM)&lt;/td&gt;
&lt;td&gt;$5-20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WAHA&lt;/td&gt;
&lt;td&gt;Free (Core)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;n8n&lt;/td&gt;
&lt;td&gt;Free (self-hosted)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI API (if using AI)&lt;/td&gt;
&lt;td&gt;$5-50 (depends on volume)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$10-70/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Professional Setup (someone builds it for you)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Bot development&lt;/td&gt;
&lt;td&gt;$1,000-4,000 (one-time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hosting + maintenance&lt;/td&gt;
&lt;td&gt;$25-75/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI API costs&lt;/td&gt;
&lt;td&gt;$5-50/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1,000-4,000 setup + $30-125/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Official API Route
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;BSP subscription&lt;/td&gt;
&lt;td&gt;$50-100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WhatsApp conversation fees&lt;/td&gt;
&lt;td&gt;$20-200 (depends on volume)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;n8n or automation platform&lt;/td&gt;
&lt;td&gt;$0-25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$70-325/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Common Mistakes I See
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Going straight to mass messaging.&lt;/strong&gt; Build a bot that responds to incoming messages first. Get that working well. Then — and only then — consider outbound campaigns, and always with opt-in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Not planning for human handoff.&lt;/strong&gt; No bot handles 100% of conversations. You need a clear path for escalating to a human agent. We use &lt;a href="https://www.chatwoot.com/?via=achiya-automation" rel="noopener noreferrer"&gt;Chatwoot&lt;/a&gt; for this: the bot handles routine questions, and complex issues are seamlessly transferred to a person. &lt;em&gt;Reader perk: &lt;strong&gt;5% off Chatwoot Cloud with code &lt;code&gt;UJR5GXWK&lt;/code&gt;&lt;/strong&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Ignoring the conversation window.&lt;/strong&gt; With the official API, you have a 24-hour window to respond to a customer's message for free. After that, you need to use a pre-approved template (which costs money). Design your bot to respond instantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Overcomplicating the bot.&lt;/strong&gt; Start with 5-10 common questions. Get those right. Then expand. A bot that handles 10 things well is better than one that handles 50 things poorly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Not testing with real users.&lt;/strong&gt; Your team will use the bot differently than your customers. Test with actual customers (or friends who can pretend to be customers) before going live.&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond the Bot: Scaling Into Full Automation
&lt;/h2&gt;

&lt;p&gt;A WhatsApp bot is usually the first automation businesses deploy — but it's rarely the last. Once you have conversations flowing in, the natural next steps are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/en/blog/ai-agents-for-business/"&gt;AI agents for business&lt;/a&gt;&lt;/strong&gt; — move from scripted replies to autonomous agents that handle multi-step tasks (lookup orders, escalate tickets, schedule appointments) without hand-written flows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/en/blog/business-automation-guide/"&gt;Broader business automation&lt;/a&gt;&lt;/strong&gt; — the same n8n instance that powers your bot can automate invoicing, CRM updates, lead routing, and inventory sync. One workflow engine, many business processes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/en/blog/chatbot-customer-service/"&gt;Dedicated customer service chatbots&lt;/a&gt;&lt;/strong&gt; — once your WhatsApp flow is stable, the same stack can power an omnichannel support bot (web chat + Messenger + email) with ticket routing and SLA tracking.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most of our clients start with a WhatsApp bot and expand outward as they see ROI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Building a WhatsApp bot doesn't have to be complicated or expensive. Start with a clear goal ("I want to handle appointment bookings automatically"), choose your API approach, and build from there.&lt;/p&gt;

&lt;p&gt;If you want help building your WhatsApp bot — whether it's a simple FAQ responder or a full AI-powered agent — &lt;a href="https://achiya-automation.com/en/contact/" rel="noopener noreferrer"&gt;reach out to us&lt;/a&gt;. At &lt;a href="https://achiya-automation.com/en/" rel="noopener noreferrer"&gt;Achiya Automation&lt;/a&gt;, we specialize in WhatsApp bots, business automation, and CRM integration using open-source tools.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://achiya-automation.com/en/contact/" rel="noopener noreferrer"&gt;Contact us&lt;/a&gt; or message us directly on &lt;a href="https://wa.me/972504197060" rel="noopener noreferrer"&gt;WhatsApp&lt;/a&gt; — we practice what we preach.&lt;/p&gt;

</description>
      <category>whatsapp</category>
      <category>chatbots</category>
      <category>automation</category>
      <category>node</category>
    </item>
    <item>
      <title>I Replaced Chrome with Safari for AI Browser Automation. Here's What Broke (and What Finally Worked)</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Sun, 19 Apr 2026 18:48:16 +0000</pubDate>
      <link>https://dev.to/achiya-automation/i-replaced-chrome-with-safari-for-ai-browser-automation-heres-what-broke-and-what-finally-worked-15ep</link>
      <guid>https://dev.to/achiya-automation/i-replaced-chrome-with-safari-for-ai-browser-automation-heres-what-broke-and-what-finally-worked-15ep</guid>
      <description>&lt;p&gt;&lt;em&gt;Or: why every browser-automation MCP uses Chromium, and why that's the wrong default on macOS.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem I kept hitting
&lt;/h2&gt;

&lt;p&gt;Every browser automation MCP server I tried on my Mac — &lt;code&gt;chrome-devtools-mcp&lt;/code&gt;, &lt;code&gt;playwright-mcp&lt;/code&gt;, &lt;code&gt;browsermcp&lt;/code&gt;, &lt;code&gt;puppeteer-mcp&lt;/code&gt; — did the same thing: spin up a fresh Chromium instance with nothing in it. No logins, no cookies, no session state. Then my AI agent would spend the first 5 minutes of every task navigating Cloudflare, solving reCAPTCHA, or explaining to me that it couldn't log into Gmail.&lt;/p&gt;

&lt;p&gt;Which is weird, because I was &lt;em&gt;already&lt;/em&gt; logged into Gmail. In Safari. In the window right next to me.&lt;/p&gt;

&lt;p&gt;The disconnect bothered me enough that I started reading Chromium-MCP source code. And what I found is that the entire ecosystem is built on an assumption that quietly doesn't hold for macOS users: &lt;strong&gt;"just spin up Chromium, it'll be fine."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It isn't fine.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Chromium costs on Apple Silicon
&lt;/h2&gt;

&lt;p&gt;Every Chromium process on M1/M2/M3 Macs pays a non-trivial tax:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple helper processes per tab (GPU, renderer, network, storage)&lt;/li&gt;
&lt;li&gt;A second rendering engine (Blink) running alongside the WebKit that Safari already keeps loaded&lt;/li&gt;
&lt;li&gt;RAM spike on tab open, and fans audibly spinning up&lt;/li&gt;
&lt;li&gt;No access to the user's existing Safari extensions, iCloud Keychain, Apple Pay, or Apple Pay-linked banking session&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you have a laptop on your lap, you feel every one of these.&lt;/p&gt;

&lt;h2&gt;
  
  
  The headless-browser fallacy
&lt;/h2&gt;

&lt;p&gt;The first thing people say is: "use headless mode, it's lighter." Sort of. Headless Chromium is still Chromium — you've just hidden the window. More importantly, headless mode is what gets you blocked. Cloudflare, reCAPTCHA v3, Akamai, DataDome — they all fingerprint headless browsers within seconds. Your agent's first action on 30% of the real web becomes "prove you're human."&lt;/p&gt;

&lt;p&gt;A headful browser running on your actual machine, with your actual fingerprint, doesn't have this problem. But headful Chromium-MCP means now you have &lt;em&gt;two&lt;/em&gt; browsers open — Safari (which you're using) and Chromium (which your agent is using). That's a fan-melting setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  The alternative no one was building
&lt;/h2&gt;

&lt;p&gt;What I wanted was obvious once I said it out loud:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Drive the Safari the user already has open. Inherit their logins, cookies, extensions, Apple Pay session. Use the WebKit process that's already running. Don't spin up a second browser.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What I found out when I tried to build it: &lt;strong&gt;macOS has made this weirdly hard&lt;/strong&gt;, and I think that's why nobody had done it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The three things that kept breaking
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. React's &lt;code&gt;_valueTracker&lt;/code&gt;.&lt;/strong&gt;&lt;br&gt;
You can't just set &lt;code&gt;input.value = "hello"&lt;/code&gt; and call &lt;code&gt;dispatchEvent(new Event("input"))&lt;/code&gt;. React attaches an internal &lt;code&gt;_valueTracker&lt;/code&gt; to every controlled input and uses it to decide whether your "input" event is real. If the tracker thinks the value didn't change, React ignores you. Fixing this means bypassing React's patched property: grab the native &lt;code&gt;value&lt;/code&gt; setter from &lt;code&gt;HTMLInputElement.prototype&lt;/code&gt;, call &lt;code&gt;setter.call(input, value)&lt;/code&gt;, then dispatch a bubbling &lt;code&gt;input&lt;/code&gt; event. It works, but it's the kind of code you don't write until you've spent an afternoon wondering why your form submission silently fails on every SPA.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Shadow DOM traversal.&lt;/strong&gt;&lt;br&gt;
Modern web components hide everything behind &lt;code&gt;shadowRoot&lt;/code&gt;. &lt;code&gt;document.querySelector&lt;/code&gt; stops at the shadow boundary. You need a recursive walker with a &lt;code&gt;MutationObserver&lt;/code&gt; cache, because otherwise traversing a single YouTube page costs you 200ms. And if you get the cache invalidation wrong, clicks land on stale element refs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. CSP.&lt;/strong&gt;&lt;br&gt;
About 30% of high-value pages (Google Search Console, LinkedIn, Gmail's admin console, many banks) block inline &lt;code&gt;eval&lt;/code&gt; and &lt;code&gt;Function()&lt;/code&gt; via strict Content Security Policy. Pure JavaScript injection fails silently. The workaround is a 4-strategy fallback chain: try regular JS → try &lt;code&gt;document.evaluate&lt;/code&gt; → try AppleScript &lt;code&gt;do JavaScript&lt;/code&gt; → try an injected content script via a Safari extension. Each one has its own failure modes and you only know which applies by trial.&lt;/p&gt;

&lt;p&gt;I ended up writing this out on HackerNoon last week, because the reverse-engineering took long enough that it felt worth sharing: &lt;a href="https://hackernoon.com/i-had-to-reverse-engineer-react-shadow-dom-and-csp-to-automate-safari-without-chrome" rel="noopener noreferrer"&gt;the three hardest problems&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  The unintentional side effects
&lt;/h2&gt;

&lt;p&gt;After a couple of months of using Safari-backed MCP instead of Chrome-backed MCP, I noticed a few things I wasn't expecting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;My battery lasted measurably longer on coding-agent-heavy days.&lt;/strong&gt; No surprise in retrospect — one browser instead of two.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;My agent's success rate on "just book this for me" tasks went up.&lt;/strong&gt; It was already logged into the calendar, the banking app, the booking portal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I stopped having to re-authenticate everything every time I rebooted.&lt;/strong&gt; Because the agent uses the browser I was already using.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Safari stays in the background.&lt;/strong&gt; MCP calls run via AppleScript + a persistent Swift daemon. The window doesn't steal focus, so I can keep working while an agent finishes a long task.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Boring outcomes, maybe. But they compound over a workday.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why this doesn't generalize
&lt;/h2&gt;

&lt;p&gt;A caveat: this approach only makes sense on macOS. On Linux or Windows, Chromium is the right default — there's no equivalent "browser the user is already using" with the same automation surface. And you give up Chrome DevTools' performance traces and Lighthouse, which don't have Safari equivalents. I still keep Chrome DevTools MCP installed for those specific audits.&lt;/p&gt;

&lt;p&gt;But "daily browsing tasks" — navigate, click, fill a form, extract some data, take a screenshot — those are 95% of what AI agents do with browsers. And for that 95%, on macOS, it's worth reconsidering the default.&lt;/p&gt;
&lt;h2&gt;
  
  
  If you want to try it
&lt;/h2&gt;

&lt;p&gt;The project is called &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;Safari MCP&lt;/a&gt;. It's MIT-licensed, one &lt;code&gt;npx&lt;/code&gt; command to install, and works with Claude Code, Claude Desktop, Cursor, Windsurf, and VS Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx safari-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;80 tools covering the full MCP surface — navigation, clicks, forms, screenshots, network mocking, cookies, accessibility snapshots, performance metrics. The README covers setup for each MCP client.&lt;/p&gt;

&lt;p&gt;If you've been feeling the Chromium tax on Apple Silicon, maybe give this a try. And if it works for you, a star on GitHub helps other macOS developers find it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Written after a few months of running Safari MCP as my primary browser automation tool on an M3 MacBook Air. Your mileage will vary — I'd love to hear what breaks for you.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; mcp, claude, macos, webautomation, webdev&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>claude</category>
      <category>macos</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I Tried to Auto-Launch My MCP Server Using My MCP Server. It Found Its Own Bug.</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Tue, 14 Apr 2026 20:04:02 +0000</pubDate>
      <link>https://dev.to/achiya-automation/i-tried-to-auto-launch-my-mcp-server-using-my-mcp-server-it-found-its-own-bug-494n</link>
      <guid>https://dev.to/achiya-automation/i-tried-to-auto-launch-my-mcp-server-using-my-mcp-server-it-found-its-own-bug-494n</guid>
      <description>&lt;h2&gt;
  
  
  TLDR
&lt;/h2&gt;

&lt;p&gt;I built &lt;strong&gt;safari-mcp&lt;/strong&gt;, an MCP server that lets AI agents drive Safari natively on macOS. This week I ran a discoverability push for it: posting the launch announcement to Hacker News, X, LinkedIn, and Reddit. Naturally, I tried to automate the campaign &lt;strong&gt;using safari-mcp itself&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It worked for HN. It worked for X. Then LinkedIn started running clicks on a completely different tab — Catchpoint Internet Performance Monitoring, which I'd never visited. Three windows, a URL prefix match, and a 500 ms cache TTL conspired to teach me a lesson about tab identity.&lt;/p&gt;

&lt;p&gt;Here's the detective story, the root cause, and the fix that ships in &lt;strong&gt;v2.8.3&lt;/strong&gt; today.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Setup: Eating My Own Dog Food
&lt;/h2&gt;

&lt;p&gt;I had four launch targets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Show HN&lt;/strong&gt; — submit the link, post a first comment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X (Twitter)&lt;/strong&gt; — a single thread that quotes the article&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LinkedIn&lt;/strong&gt; — a Hebrew-English bilingual long-form post&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reddit r/ClaudeAI&lt;/strong&gt; — a tool-launch-with-context post&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'd just shipped a &lt;a href="https://hackernoon.com/i-had-to-reverse-engineer-react-shadow-dom-and-csp-to-automate-safari-without-chrome" rel="noopener noreferrer"&gt;HackerNoon technical deep-dive&lt;/a&gt; about how I built browser automation for a browser that has no Chrome DevTools Protocol. The launch was the natural follow-on. And of course I was going to drive it through safari-mcp — what's the point of building a Safari automation tool if you don't use it for your own launch?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Eat your own dog food at launch — bugs surface fast." — me, after this incident.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Round 1: HN and X Worked Beautifully
&lt;/h2&gt;

&lt;p&gt;The HN submission flow was textbook. Open &lt;code&gt;news.ycombinator.com/submit&lt;/code&gt;, fill the title and URL inputs, call &lt;code&gt;form.submit()&lt;/code&gt; via injected JS, follow the redirect, find the new item ID via &lt;code&gt;submitted?id=&amp;lt;user&amp;gt;&lt;/code&gt;. About 8 seconds end-to-end.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Verify the form is real, not some other tab&lt;/span&gt;
&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;href&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;hasTitleInput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;!!&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input[name="title"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;hasUrlInput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;!!&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input[name="url"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="c1"&gt;// → {"url":"https://news.ycombinator.com/submit","hasTitleInput":true,"hasUrlInput":true}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Filled both inputs. Called &lt;code&gt;form.submit()&lt;/code&gt;. Got redirected to &lt;code&gt;/newest&lt;/code&gt;. Walked back to &lt;code&gt;/submitted?id=Achiyacohen&lt;/code&gt; and confirmed the new post sat at #1 with 1 point. &lt;strong&gt;Live.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;X was even smoother. The compose textbox in &lt;code&gt;x.com/home&lt;/code&gt; is a contenteditable with &lt;code&gt;aria-label="Post text"&lt;/code&gt;. I filled it with the thread text, found the &lt;code&gt;button[data-testid="tweetButtonInline"]&lt;/code&gt;, dispatched a React-aware pointer event sequence (mousedown → mouseup → click), and watched the textbox empty itself. Verified by reading the user's profile timeline 30 seconds later: the tweet was there, with my exact text and a fresh &lt;code&gt;status/2044134672683110740&lt;/code&gt; URL. &lt;strong&gt;Live.&lt;/strong&gt;&lt;/p&gt;
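&lt;p&gt;For reference, a minimal sketch of that click sequence. The name &lt;code&gt;dispatchClickSequence&lt;/code&gt; is hypothetical and this is not safari-mcp's actual helper; it only assumes the target implements the standard &lt;code&gt;dispatchEvent&lt;/code&gt; method:&lt;/p&gt;

```javascript
// Hypothetical helper (not taken from safari-mcp): fire mousedown, mouseup,
// and click on a target, in that order.
function dispatchClickSequence(target) {
  // MouseEvent exists in browsers; fall back to the generic Event elsewhere.
  const EventCtor = typeof MouseEvent === "function" ? MouseEvent : Event;
  const fired = [];
  for (const type of ["mousedown", "mouseup", "click"]) {
    // bubbles: true matters because React attaches its listeners to a
    // container node, not to the element being clicked.
    target.dispatchEvent(new EventCtor(type, { bubbles: true, cancelable: true }));
    fired.push(type);
  }
  return fired;
}
```

&lt;p&gt;In a page you would pass the element returned by &lt;code&gt;document.querySelector&lt;/code&gt;. Whether a given site needs all three events is an assumption worth testing per site.&lt;/p&gt;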

&lt;p&gt;Two for two. I was feeling good.&lt;/p&gt;

&lt;h2&gt;
  
  
  Round 2: Then LinkedIn Got Weird
&lt;/h2&gt;

&lt;p&gt;LinkedIn's "Start a post" button (in Hebrew: "כתבו פוסט") is a &lt;code&gt;div&lt;/code&gt; with class names like &lt;code&gt;_73dfa4c8 ed6e5932 _1d1c97a4&lt;/code&gt;. I found it, dispatched the same React-aware click sequence, and waited for the compose modal to appear.&lt;/p&gt;

&lt;p&gt;It didn't.&lt;/p&gt;

&lt;p&gt;I called &lt;code&gt;safari_evaluate&lt;/code&gt; to check whether &lt;code&gt;[contenteditable="true"]&lt;/code&gt; had appeared anywhere on the page. The result came back &lt;strong&gt;empty&lt;/strong&gt; — zero contenteditable elements. That was strange. Even the LinkedIn feed itself has search inputs and other interactive elements. So I asked the page for its URL and title to make sure I was in the right place.&lt;/p&gt;

&lt;p&gt;The response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"API Monitoring | Catchpoint Internet Performance Monitoring"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.catchpoint.com/application-experience/api-monitoring?utm_campaign=Hackernoon-TOFU-billboard"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Catchpoint. &lt;strong&gt;I'd never visited Catchpoint.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Suspicion: Tab Tracking
&lt;/h2&gt;

&lt;p&gt;The first hypothesis was that safari-mcp's tab tracking had drifted. The MCP keeps a cached &lt;code&gt;_activeTabIndex&lt;/code&gt; in memory and uses it for all subsequent operations on a tab it opened. The cache has a TTL of 500 ms, after which &lt;code&gt;resolveActiveTab&lt;/code&gt; re-verifies by URL prefix matching.&lt;/p&gt;
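&lt;p&gt;To make the TTL behavior concrete, here is a minimal sketch of a time-limited cache. The names (&lt;code&gt;createTtlCache&lt;/code&gt;, the injectable clock) are assumptions for illustration and are not safari-mcp's internals:&lt;/p&gt;

```javascript
// Hypothetical TTL cache; names are illustrative, not safari-mcp's code.
function createTtlCache(ttlMs, now = Date.now) {
  let value;
  let storedAt = -Infinity; // nothing cached yet, so the first read is stale
  return {
    set(v) {
      value = v;
      storedAt = now();
    },
    // Returns the cached value while it is fresh, otherwise undefined
    // to tell the caller it must re-verify.
    get() {
      return now() - storedAt >= ttlMs ? undefined : value;
    },
  };
}

// Usage with a fake clock: a 500 ms TTL serves reads up to 499 ms.
let clock = 0;
const cache = createTtlCache(500, () => clock);
cache.set(12);
clock = 499;
console.log(cache.get()); // 12
clock = 500;
console.log(cache.get()); // undefined
```

&lt;p&gt;The point of the sketch: any read inside the TTL window trusts the stored index without looking at the browser at all.&lt;/p&gt;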

&lt;p&gt;I called &lt;code&gt;safari_list_tabs&lt;/code&gt; and got 12 tabs in the profile window, with the LinkedIn tab right where I expected it. So the cache and the actual tab layout agreed: tab 12 was LinkedIn.&lt;/p&gt;

&lt;p&gt;Then why was &lt;code&gt;safari_evaluate&lt;/code&gt; returning Catchpoint?&lt;/p&gt;

&lt;h2&gt;
  
  
  Detective Work: There Are Three Windows
&lt;/h2&gt;

&lt;p&gt;I dropped down to raw AppleScript to bypass the MCP layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight applescript"&gt;&lt;code&gt;&lt;span class="k"&gt;tell&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;application&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Safari"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;wCount&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;windows&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Total windows: "&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;wCount&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;linefeed&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="k"&gt;repeat&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;w&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;wCount&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Window "&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;w&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;": "&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;tabs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="na"&gt;window&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;" tabs"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;linefeed&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"  name: "&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="na"&gt;window&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;linefeed&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"  tab1: "&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;URL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;tab&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="na"&gt;window&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;linefeed&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="k"&gt;end&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;repeat&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nb"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;end&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;tell&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total windows: 3
Window 1: 2 tabs
  name: אישי — Documenso
  tab1: https://mail.google.com/mail/u/0/#starred/...
Window 2: 12 tabs
  name: אוטומציות — API Monitoring | Catchpoint Internet Performance Monitoring
  tab1: https://hackernoon.com/login?redirect=app
Window 3: 3 tabs
  name: אישי — תוכנה קלה לשליחה למחשב מרחוק - Claude
  tab1: https://claude.ai/recents
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three windows. Two profiles ("אישי" / Personal and "אוטומציות" / Automation). Safari MCP was correctly targeting &lt;strong&gt;Window 2&lt;/strong&gt; ("אוטומציות"), where my LinkedIn tab actually lived as tab 12. So far so good.&lt;/p&gt;

&lt;p&gt;The Catchpoint URL? It was tab 5 of Window 2 — a tab the user (me) had clicked open earlier from a HackerNoon ad without thinking. It was sitting there idle. And somehow &lt;code&gt;safari_evaluate&lt;/code&gt; was hitting it instead of tab 12.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Bug: Resolve Cache + URL Prefix
&lt;/h2&gt;

&lt;p&gt;I traced through &lt;code&gt;resolveActiveTab&lt;/code&gt; line by line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;resolveActiveTab&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;_activeTabURL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;safeUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;_activeTabURL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/"/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s1"&gt;"&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;domain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;_activeTabURL&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/^https&lt;/span&gt;&lt;span class="se"&gt;?&lt;/span&gt;&lt;span class="sr"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\/\/&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;osascriptFast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`
    tell application "Safari"
      set w to &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;getTargetWindowRef&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;
      set tabCount to count of tabs of w

      -- Strategy 1: verify cached index still matches URL
      try
        if tabCount &amp;gt;= &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; then
          if URL of tab &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; of w starts with "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;safeUrl&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" then
            return &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
          end if
        end if
      end try

      -- Strategy 2: search all tabs by URL prefix
      repeat with i from tabCount to 1 by -1
        if URL of tab i of w starts with "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;safeUrl&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" then return i
      end repeat

      -- Strategy 3: search by domain (returns negative for a partial match)
      repeat with i from tabCount to 1 by -1
        if URL of tab i of w contains "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" then return -(i)
      end repeat

      return "0:" &amp;amp; tabCount
    end tell
  `&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The bug was right there in the strategies. When I navigated LinkedIn to &lt;code&gt;https://www.linkedin.com/feed/&lt;/code&gt;, that became &lt;code&gt;_activeTabURL&lt;/code&gt;. Then LinkedIn's React router silently rewrote the URL to &lt;code&gt;https://www.linkedin.com/feed/?shareActive=true&lt;/code&gt; because of the query parameter I'd passed. Strategy 1 — the fast path — failed because &lt;code&gt;URL of tab 12 starts with "https://www.linkedin.com/feed/"&lt;/code&gt;... wait, that should still match. The new URL starts with the old prefix.&lt;/p&gt;

&lt;p&gt;So why did it fail?&lt;/p&gt;

&lt;p&gt;The actual cause was even more subtle: a &lt;strong&gt;different&lt;/strong&gt; Safari instance, in a &lt;strong&gt;different&lt;/strong&gt; profile window, had completed an HTTP redirect that rewrote the URL to a &lt;em&gt;shorter&lt;/em&gt; form. AppleScript's &lt;code&gt;URL of tab&lt;/code&gt; was returning the post-redirect URL, which &lt;strong&gt;did not start with&lt;/strong&gt; my saved &lt;code&gt;_activeTabURL&lt;/code&gt; because &lt;code&gt;_activeTabURL&lt;/code&gt; had query parameters that the post-redirect URL didn't.&lt;/p&gt;
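&lt;p&gt;The direction of a prefix test is easy to get backwards, so here is a minimal sketch in plain JavaScript. The URLs are illustrative values, not the exact ones from the incident, and &lt;code&gt;tabMatchesSaved&lt;/code&gt; is a hypothetical stand-in for the AppleScript &lt;code&gt;starts with&lt;/code&gt; check:&lt;/p&gt;

```javascript
// Illustrative values only; not the exact URLs from the incident.
const savedUrl = "https://www.linkedin.com/feed/?shareActive=true";
const currentUrl = "https://www.linkedin.com/feed/";

// Hypothetical stand-in for: URL of tab starts with savedUrl
function tabMatchesSaved(current, saved) {
  return current.startsWith(saved);
}

console.log(tabMatchesSaved(currentUrl, savedUrl)); // false: the live URL is shorter than the saved one
console.log(tabMatchesSaved(savedUrl, currentUrl)); // true: only the reverse direction matches
```

&lt;p&gt;A saved URL that is longer than the live one can never pass this check, no matter how similar the two look.&lt;/p&gt;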

&lt;p&gt;Strategy 1 fell through. Strategy 2 (full URL search across all tabs) also fell through for the same reason. Strategy 3 (domain search) found... a tab in the wrong profile window? No — it found Catchpoint. &lt;strong&gt;Why?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because of how I'd extracted the domain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;domain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;_activeTabURL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/^https&lt;/span&gt;&lt;span class="se"&gt;?&lt;/span&gt;&lt;span class="sr"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\/\/&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="c1"&gt;// "www.linkedin.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the AppleScript:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight applescript"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;URL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;tab&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;w&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;contains&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${domain}"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;contains&lt;/code&gt; is a substring match. &lt;code&gt;Catchpoint&lt;/code&gt;'s ad URL was &lt;code&gt;https://www.catchpoint.com/.../?utm_campaign=Hackernoon-TOFU-billboard&amp;amp;utm_source=hackernoon&amp;amp;utm_medium=paidsocial&lt;/code&gt;. Did it contain &lt;code&gt;www.linkedin.com&lt;/code&gt;? No.&lt;/p&gt;

&lt;p&gt;Wait, then how did it match?&lt;/p&gt;

&lt;p&gt;After two more hours of tracing, I found the actual cause. The MCP server runs as a singleton, but Claude Code occasionally spawns a second instance for ~40 ms during connection negotiation. That second instance had its own &lt;code&gt;_activeTabIndex&lt;/code&gt; state, and &lt;strong&gt;it had set the index to point at Catchpoint&lt;/strong&gt; because it saw Catchpoint as the active tab when it briefly took over. When the original instance came back, it read the wrong index from a stale cache check that hadn't yet been invalidated by the singleton kill code.&lt;/p&gt;

&lt;p&gt;The 500 ms cache window was just long enough for that race.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fix: window.__mcpTabMarker
&lt;/h2&gt;

&lt;p&gt;URL prefix matching is fragile. Domain matching is fragile. Cached indices are fragile. What's not fragile?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A unique identifier injected into the page's JavaScript context.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The new fix: every &lt;code&gt;safari_new_tab&lt;/code&gt; writes a unique marker into &lt;code&gt;window.__mcpTabMarker&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tabMarker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`MCP_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;SESSION_ID&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;36&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;36&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;osascriptFast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s2"&gt;`tell application "Safari" to do JavaScript "window.__mcpTabMarker='&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tabMarker&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;'" in tab &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; of &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;getTargetWindowRef&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;_activeTabMarker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tabMarker&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The marker survives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Same-document navigation&lt;/strong&gt;: &lt;code&gt;window.__mcpTabMarker&lt;/code&gt; lives on the page's global object, which persists for as long as the tab keeps the same document. Any navigation that loads a new document, such as &lt;code&gt;location.href = ...&lt;/code&gt; to another page or any cross-origin navigation, wipes it, which is fine because that's a deliberate context boundary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash changes&lt;/strong&gt;: &lt;code&gt;location.hash = "#x"&lt;/code&gt; doesn't reload the JS context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;pushState&lt;/code&gt; and &lt;code&gt;replaceState&lt;/code&gt;&lt;/strong&gt;: single-page-app routers don't reset the realm.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query string mutations&lt;/strong&gt;: same as above, as long as the router makes them through &lt;code&gt;pushState&lt;/code&gt; or &lt;code&gt;replaceState&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client-side redirects within the same origin&lt;/strong&gt;: still in the same realm when the router performs them; a server redirect loads a new document and resets the marker.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;resolveActiveTab&lt;/code&gt; now tries the marker &lt;strong&gt;first&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;resolveActiveTab&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Strategy 1: window.__mcpTabMarker (bulletproof)&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_activeTabMarker&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;checkScript&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`(function(){return window.__mcpTabMarker==='&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;safeMarker&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;'?'1':'0'})()`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Check cached index first (fast path)&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;matchAtCached&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;osascriptFast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`tell application "Safari" to do JavaScript "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;checkScript&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" in tab &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; of &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;getTargetWindowRef&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;matchAtCached&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Cached index doesn't match — scan all tabs in profile window&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tabCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;osascriptFast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`tell application "Safari" to return count of tabs of &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;getTargetWindowRef&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
    &lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tabCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;osascriptFast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s2"&gt;`tell application "Safari" to do JavaScript "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;checkScript&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" in tab &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; of &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;getTargetWindowRef&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;_activeTabIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Strategy 2: URL prefix (fallback for tabs created before the marker was set)&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The marker check costs about 5 ms per tab via the persistent &lt;code&gt;osascriptFast&lt;/code&gt; daemon. With 12 tabs open, a full scan costs about 60 ms in the worst case (12 × 5 ms). That is slower than the previous "check cached index" path, but &lt;strong&gt;correct&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I also cut the resolve cache's TTL from 500 ms to 100 ms. The check is cheap enough that the shorter TTL buys correctness without measurable added latency.&lt;/p&gt;
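
&lt;p&gt;A minimal sketch of how a short TTL and a cheap revalidation check can fit together. Every name here (&lt;code&gt;makeTabResolver&lt;/code&gt;, &lt;code&gt;hasMarker&lt;/code&gt;, &lt;code&gt;countTabs&lt;/code&gt;, &lt;code&gt;now&lt;/code&gt;) is hypothetical and synchronous for brevity; it is not safari-mcp's actual code, which goes through &lt;code&gt;osascriptFast&lt;/code&gt;.&lt;/p&gt;

```javascript
// Hypothetical sketch: these names are illustrative, not safari-mcp's API.
function makeTabResolver(opts) {
  var cachedIndex = null;
  var cachedAt = 0;

  return function resolveTab() {
    if (cachedIndex !== null) {
      var age = opts.now() - cachedAt;
      // Inside the TTL: trust the cache without touching Safari.
      if (opts.ttlMs > age) return cachedIndex;
      // TTL expired: one cheap marker check on the cached index (fast path).
      if (opts.hasMarker(cachedIndex)) {
        cachedAt = opts.now();
        return cachedIndex;
      }
    }
    // Slow path: scan every tab for the marker, highest index first.
    for (var i = opts.countTabs(); i >= 1; i--) {
      if (opts.hasMarker(i)) {
        cachedIndex = i;
        cachedAt = opts.now();
        return i;
      }
    }
    cachedIndex = null;
    return null;
  };
}
```

&lt;p&gt;Passing the clock in as &lt;code&gt;now&lt;/code&gt; keeps the TTL logic testable without real timers. The slow path is the one that costs roughly 5 ms per tab.&lt;/p&gt;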

&lt;h2&gt;
  
  
  The Bypass Tool I Built While Debugging
&lt;/h2&gt;

&lt;p&gt;While I was tracing the bug, I needed a way to test changes against Safari without restarting the MCP server (which would require restarting the Claude Code session). So I wrote a &lt;strong&gt;Python wrapper&lt;/strong&gt; that calls &lt;code&gt;osascript&lt;/code&gt; directly, with one job: find a tab by URL prefix in a specific window, then run JS in that exact tab.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_js&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url_prefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;js_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;js_clean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;strip_line_comments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;js_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;js_escaped&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;js_clean&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\\\\&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="sh"&gt;"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;osascript&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="s"&gt;
tell application &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Safari&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
  set tCount to count of tabs of window &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
  set foundIdx to 0
  repeat with i from 1 to tCount
    if URL of tab i of window &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; starts with &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url_prefix&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; then
      set foundIdx to i
      exit repeat
    end if
  end repeat
  if foundIdx = 0 then return &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ERROR_NO_TAB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
  set jsOut to do JavaScript &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;js_escaped&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; in tab foundIdx of window &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
  return &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tab:w&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &amp;amp; foundIdx &amp;amp; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &amp;amp; jsOut
end tell
&lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This bypassed every layer of the MCP and gave me direct, predictable access to whichever tab I wanted in whichever window I wanted. Three rules I learned writing it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;result&lt;/code&gt; is a predefined AppleScript variable.&lt;/strong&gt; It holds the value of the last command, so don't use it as your own variable name. Use &lt;code&gt;jsOut&lt;/code&gt; or &lt;code&gt;output&lt;/code&gt; or anything else. The error message you get is "המשתנה result אינו מוגדר" ("The variable result is not defined") if your system locale is Hebrew, which is unhelpful unless you happen to know that &lt;code&gt;result&lt;/code&gt; is taken.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;do JavaScript&lt;/code&gt; does not wait for asynchronous work.&lt;/strong&gt; It hands back only what the expression evaluates to synchronously. A Promise (which is what any call to an async function produces) reaches AppleScript as "missing value", which then triggers "המשתנה X אינו מוגדר" ("The variable X is not defined") downstream. Workaround: write the result to &lt;code&gt;window.__myResult&lt;/code&gt; from a &lt;code&gt;.then()&lt;/code&gt; callback, then poll for it with a second &lt;code&gt;do JavaScript&lt;/code&gt; call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hebrew text in shell variables breaks AppleScript.&lt;/strong&gt; When you run &lt;code&gt;bash -c "osascript -e '...$VAR...'"&lt;/code&gt;, the UTF-8 round-trip through shell substitution can corrupt Hebrew bytes. The fix is to call &lt;code&gt;osascript -&lt;/code&gt; with the script on stdin, from Python or Ruby or any language that handles UTF-8 natively.&lt;/li&gt;
&lt;/ol&gt;
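
&lt;p&gt;The store-then-poll workaround from rule 2 can be sketched as two generated snippets, one per &lt;code&gt;do JavaScript&lt;/code&gt; call. The helper names &lt;code&gt;buildStartScript&lt;/code&gt; and &lt;code&gt;buildPollScript&lt;/code&gt; are hypothetical; only the &lt;code&gt;window.__myResult&lt;/code&gt; pattern comes from the text above. The generated source contains double quotes, so it would still need the same escaping as in the Python wrapper before being embedded in AppleScript.&lt;/p&gt;

```javascript
// Hypothetical helpers: they only build JS source strings for do JavaScript.
function buildStartScript(promiseExpr, key) {
  var slot = "window[" + JSON.stringify(key) + "]";
  return (
    "(function(){" +
    slot + "=undefined;" +
    // Park the settled value (or the error) on the window as a JSON string.
    "(" + promiseExpr + ").then(" +
    "function(v){" + slot + "=JSON.stringify({ok:true,value:v});}," +
    "function(e){" + slot + "=JSON.stringify({ok:false,error:String(e)});}" +
    ");" +
    // Return a plain string so AppleScript always receives a value.
    "return 'started';" +
    "})()"
  );
}

function buildPollScript(key) {
  var slot = "window[" + JSON.stringify(key) + "]";
  // Empty string means "not settled yet"; anything else is the JSON result.
  return "(function(){var r=" + slot + ";return typeof r==='string'?r:'';})()";
}
```

&lt;p&gt;Run the start script once, then repeat the poll script until it returns a non-empty string or a timeout of your choosing expires.&lt;/p&gt;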

&lt;h2&gt;
  
  
  How LinkedIn Was Actually Posted
&lt;/h2&gt;

&lt;p&gt;After all that, I still couldn't get LinkedIn's compose modal to open via clicks, even with the bypass tool. LinkedIn's React event handlers check &lt;code&gt;event.isTrusted&lt;/code&gt;, which is &lt;code&gt;false&lt;/code&gt; for any event dispatched from script rather than generated by a real user action. Synthetic clicks just get dropped on the floor.&lt;/p&gt;
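
&lt;p&gt;The mechanism is easy to reproduce in isolation. The handler below is a hypothetical stand-in, not LinkedIn's code; it only shows that an event created and dispatched from script always carries &lt;code&gt;isTrusted === false&lt;/code&gt;. It uses the standard &lt;code&gt;EventTarget&lt;/code&gt; and &lt;code&gt;Event&lt;/code&gt; globals, which exist in browsers and in recent Node.js versions.&lt;/p&gt;

```javascript
// Illustration only: a handler that ignores untrusted events.
var target = new EventTarget();
var opened = false;

target.addEventListener("click", function (event) {
  if (!event.isTrusted) return; // script-dispatched events stop here
  opened = true;
});

// An event dispatched from script has isTrusted === false.
target.dispatchEvent(new Event("click"));
console.log(opened); // false
```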

&lt;p&gt;So I gave up on the modal entirely and used &lt;strong&gt;LinkedIn's own voyager API&lt;/strong&gt; directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cookie&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/JSESSIONID="&lt;/span&gt;&lt;span class="se"&gt;?([^&lt;/span&gt;&lt;span class="sr"&gt;";&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="sr"&gt;"&lt;/span&gt;&lt;span class="se"&gt;?&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;csrf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;match&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://www.linkedin.com/voyager/api/contentcreation/normShares&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;credentials&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;include&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;csrf-token&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;csrf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;content-type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json; charset=UTF-8&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accept&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/vnd.linkedin.normalized+json+2.1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;x-restli-protocol-version&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2.0.0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;visibleToConnectionsOnly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;commentaryV2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;postBody&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;FEED&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;allowedCommentersScope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ALL&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;postState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;PUBLISHED&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;media&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;
    &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;__mcpLinkedinResult&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;substring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)});&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;csrf-token&lt;/code&gt; header is just the value of the &lt;code&gt;JSESSIONID&lt;/code&gt; cookie that LinkedIn sets during login. Once you're authenticated, the API accepts the request, and the result stored in &lt;code&gt;window.__mcpLinkedinResult&lt;/code&gt; reads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;status&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;urn&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;urn:li:share:7449905229468274688&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;toastCtaText&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;צפייה בפוסט&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;mainToastText&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;פרסום הפוסט הצליח.&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}}"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;"פרסום הפוסט הצליח"&lt;/strong&gt; — "Post published successfully". The bypass worked. LinkedIn was live.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Reddit Taught Me
&lt;/h2&gt;

&lt;p&gt;Reddit was my one failure. The user account in window 1 (Personal profile) was logged in. The form on &lt;code&gt;old.reddit.com/r/ClaudeAI/submit&lt;/code&gt; filled correctly. The CSRF token (&lt;code&gt;uh&lt;/code&gt; field) was present. I built a &lt;code&gt;FormData&lt;/code&gt; POST to &lt;code&gt;/api/submit&lt;/code&gt;, included all the required fields, and fired it.&lt;/p&gt;

&lt;p&gt;Response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"json"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"errors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"BAD_CAPTCHA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"That was a tricky one. Why don't you try that again."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"captcha"&lt;/span&gt;&lt;span class="p"&gt;]]}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this session, Reddit's &lt;code&gt;/api/submit&lt;/code&gt; endpoint demanded a solved reCAPTCHA token even though the account was fully authenticated. I found no path through the cookie-authenticated web endpoint that skips it, and there's no honor-system "I'm a real human" header. The only ways through are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pay a CAPTCHA-solving service ($1-2 per 1000 captchas, with all the ethical and TOS implications you'd expect)&lt;/li&gt;
&lt;li&gt;Have a human solve it&lt;/li&gt;
&lt;li&gt;Don't post to Reddit&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I picked option 3. I respect the captcha as a clearly-stated boundary.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Eat your own dog food at launch.&lt;/strong&gt; I'd been running safari-mcp for daily browser automation tasks for weeks and never hit this bug. It took the specific combination of "rapid sequence of operations across multiple Safari windows with same-domain tabs and React-driven URL rewrites" to surface it. A launch campaign happens to involve exactly that combination.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-window/multi-profile is a forgotten edge case in browser automation.&lt;/strong&gt; Most automation tools assume one window or have a strict "first window" convention. Safari's profile feature (introduced in Safari 17, which shipped with macOS Sonoma) makes multi-window the default for power users. If you write a Safari automation tool, test with three profile windows open from day one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;URL matching is fragile; identity markers in the JS context are far sturdier.&lt;/strong&gt; This is the takeaway I wish someone had told me three weeks ago. Don't track tabs by URL or title or any other property the page can mutate. Inject a marker into the page's JS realm and check for it, keeping a fallback for tabs that don't carry the marker yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cache TTL is a knife edge.&lt;/strong&gt; 500 ms felt safe. It wasn't. 100 ms with a cheap revalidation check is the sweet spot for this workload. Your sweet spot may differ — measure it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When debugging, build a bypass tool.&lt;/strong&gt; Don't fight the bug from inside the affected layer. Route around it. The 60 lines of Python I wrote in the middle of this incident saved me hours of MCP restart cycles, and I get to keep them as a permanent low-level escape hatch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Some platforms genuinely don't want automation.&lt;/strong&gt; That's their right. Respect it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Status
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;safari-mcp v2.8.3&lt;/strong&gt; ships the marker fix today. &lt;a href="https://www.npmjs.com/package/safari-mcp" rel="noopener noreferrer"&gt;npm&lt;/a&gt;, &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, &lt;a href="https://registry.modelcontextprotocol.io/" rel="noopener noreferrer"&gt;MCP Registry&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The launch campaign worked: HN post live, X tweet live, LinkedIn post live (via the API bypass), Reddit deferred.&lt;/li&gt;
&lt;li&gt;The bug-find-fix loop took about 90 minutes. The article you're reading took longer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you build MCP servers, automation tools, or anything that touches a multi-window browser, I'd love to hear how you've solved tab identity. Drop a comment or open an issue on &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;achiya-automation/safari-mcp&lt;/a&gt;. I learn from every reply.&lt;/p&gt;

&lt;p&gt;And if you're considering using your own tool to launch your own tool — do it. The bugs you'll find are the bugs your users would have hit first.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>mcp</category>
      <category>browserautomation</category>
      <category>debugging</category>
    </item>
    <item>
      <title>I've Deployed 50+ WhatsApp Bots — Here's How the Spam Detection Algorithm Actually Works in 2026</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Sun, 12 Apr 2026 19:20:59 +0000</pubDate>
      <link>https://dev.to/achiya-automation/ive-deployed-50-whatsapp-bots-heres-how-the-spam-detection-algorithm-actually-works-in-2026-69a</link>
      <guid>https://dev.to/achiya-automation/ive-deployed-50-whatsapp-bots-heres-how-the-spam-detection-algorithm-actually-works-in-2026-69a</guid>
      <description>&lt;p&gt;After deploying 50+ WhatsApp bots for businesses, I've learned the hard way how WhatsApp's spam detection works. Not from documentation — from watching accounts get restricted and figuring out why.&lt;/p&gt;

&lt;p&gt;Here's the real picture in 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 4-Layer Detection System
&lt;/h2&gt;

&lt;p&gt;WhatsApp doesn't use a single algorithm. It's a pipeline:&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Registration Fingerprinting
&lt;/h3&gt;

&lt;p&gt;Before you send a message, WhatsApp analyzes your registration signal — device metadata, IP clusters, phone number patterns, registration velocity. Bulk-registered numbers on VPS servers get flagged immediately.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 2: Behavioral Analysis (Where Bots Get Caught)
&lt;/h3&gt;

&lt;p&gt;This is the critical layer. WhatsApp monitors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Send velocity&lt;/strong&gt; — messages per minute/hour/day&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reply-to-send ratio&lt;/strong&gt; — if you send 100 messages and get 5 replies, that's a 5% ratio = spam signal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message timing patterns&lt;/strong&gt; — bots send at precise intervals; humans don't&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contact interaction history&lt;/strong&gt; — messages to contacts who never messaged you weigh more heavily&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From our deployments, here are the thresholds I've observed:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Safe&lt;/th&gt;
&lt;th&gt;Warning&lt;/th&gt;
&lt;th&gt;Danger&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Messages/hour&lt;/td&gt;
&lt;td&gt;&amp;lt; 30&lt;/td&gt;
&lt;td&gt;30-60&lt;/td&gt;
&lt;td&gt;&amp;gt; 60&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reply rate&lt;/td&gt;
&lt;td&gt;&amp;gt; 30%&lt;/td&gt;
&lt;td&gt;15-30%&lt;/td&gt;
&lt;td&gt;&amp;lt; 15%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New contacts/day&lt;/td&gt;
&lt;td&gt;&amp;lt; 20&lt;/td&gt;
&lt;td&gt;20-50&lt;/td&gt;
&lt;td&gt;&amp;gt; 50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Identical messages&lt;/td&gt;
&lt;td&gt;&amp;lt; 5/hr&lt;/td&gt;
&lt;td&gt;5-15/hr&lt;/td&gt;
&lt;td&gt;&amp;gt; 15/hr&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Based on observations across 50+ deployments, not official Meta docs.&lt;/em&gt;&lt;/p&gt;
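
&lt;p&gt;If you want to alert on these numbers in your own monitoring, the table can be encoded as a small lookup. This is a sketch: the metric keys and the &lt;code&gt;classify&lt;/code&gt; function are hypothetical names, and the thresholds are the observed values from the table, not anything published by Meta.&lt;/p&gt;

```javascript
// Thresholds copied from the table above (field observations, not official).
// higherIsWorse: a larger value is riskier; reply rate is the reverse.
var THRESHOLDS = {
  messagesPerHour: { warn: 30, danger: 60, higherIsWorse: true },
  replyRatePercent: { warn: 30, danger: 15, higherIsWorse: false },
  newContactsPerDay: { warn: 20, danger: 50, higherIsWorse: true },
  identicalPerHour: { warn: 5, danger: 15, higherIsWorse: true }
};

function classify(metric, value) {
  var t = THRESHOLDS[metric];
  if (!t) throw new Error("unknown metric: " + metric);
  if (t.higherIsWorse) {
    if (value > t.danger) return "danger";
    if (value >= t.warn) return "warning";
    return "safe";
  }
  // Lower is worse (reply rate): below 15 is danger, 15 to 30 is warning.
  if (t.danger > value) return "danger";
  if (t.warn >= value) return "warning";
  return "safe";
}

console.log(classify("messagesPerHour", 45));  // warning
console.log(classify("replyRatePercent", 5));  // danger
```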

&lt;h3&gt;
  
  
  Layer 3: User Reports
&lt;/h3&gt;

&lt;p&gt;Every block or spam report adds negative signal. Block rate &amp;gt; 2% = quality rating drops to "Low". Multiple reports in 24 hours = temporary restriction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 4: Content Pattern Matching
&lt;/h3&gt;

&lt;p&gt;WhatsApp analyzes message metadata (length, media, links), forward patterns, and template similarity — without reading encrypted content.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Big 2026 Change: Unanswered Message Counter
&lt;/h2&gt;

&lt;p&gt;The most significant change this year: WhatsApp now tracks &lt;strong&gt;messages sent that received no reply within 48 hours&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This counter is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cumulative&lt;/strong&gt; — counts across all conversations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time-bounded&lt;/strong&gt; — rolling 30-day window&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Universal&lt;/strong&gt; — affects both official and unofficial API&lt;/li&gt;
&lt;/ol&gt;
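
&lt;p&gt;You can approximate this counter on your own side to see trouble coming. The sketch below assumes a hypothetical message log of &lt;code&gt;{ sentAt, repliedAt }&lt;/code&gt; records in milliseconds; the 48-hour and 30-day figures are the ones described above, and how WhatsApp actually computes its counter is not public.&lt;/p&gt;

```javascript
// Hypothetical local monitor, not WhatsApp's implementation.
var HOUR_MS = 60 * 60 * 1000;
var DAY_MS = 24 * HOUR_MS;

function countUnanswered(messages, nowMs) {
  var count = 0;
  messages.forEach(function (m) {
    var age = nowMs - m.sentAt;
    if (age > 30 * DAY_MS) return;  // outside the rolling 30-day window
    if (48 * HOUR_MS > age) return; // recipient still has time to reply
    // A missing reply, or one that came after 48 hours, counts as unanswered.
    var replyDelay = m.repliedAt === null ? Infinity : m.repliedAt - m.sentAt;
    if (replyDelay > 48 * HOUR_MS) count += 1;
  });
  return count;
}
```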

&lt;p&gt;We saw this hit a dental clinic client running appointment reminders via the official API. Fully compliant, template-approved, opt-in collected. But 40% of patients confirmed by showing up, not replying to WhatsApp.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix&lt;/strong&gt;: We added "Reply 1 to confirm, 2 to reschedule" to every reminder. Reply rate jumped from 60% to 89%. Quality rating recovered in two weeks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Official vs Unofficial API: Risk Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Official API&lt;/th&gt;
&lt;th&gt;Unofficial (WAHA/Baileys)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Registration ban&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Behavioral ban&lt;/td&gt;
&lt;td&gt;Low (templates enforce limits)&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User report ban&lt;/td&gt;
&lt;td&gt;Low (warnings first)&lt;/td&gt;
&lt;td&gt;High (direct ban)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recovery&lt;/td&gt;
&lt;td&gt;Appeal through Meta&lt;/td&gt;
&lt;td&gt;Permanent, no appeal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;BSP $50-100/mo + per-msg&lt;/td&gt;
&lt;td&gt;Server $5-20/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key insight&lt;/strong&gt;: Unofficial API bots that only &lt;strong&gt;respond&lt;/strong&gt; to incoming messages have &amp;lt;2% ban rate over 12 months. Bots that &lt;strong&gt;proactively message&lt;/strong&gt; new contacts see 15-30% ban rates.&lt;/p&gt;

&lt;h2&gt;
  
  
  7 Rules We Follow for Every Bot
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Official API for proactive messaging&lt;/strong&gt; — templates exist to keep you compliant&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explicit opt-in&lt;/strong&gt; — not buried in ToS. Real: "I want reminders via WhatsApp"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design for replies&lt;/strong&gt; — quick-reply buttons, yes/no questions. Reply rate = trust signal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate-limit sending&lt;/strong&gt; — 50-100/batch for marketing, 5-min gaps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor quality rating&lt;/strong&gt; weekly — Meta Business Suite → Phone Numbers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Segment audience&lt;/strong&gt; — don't message contacts silent for 90+ days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human escalation&lt;/strong&gt; after 2 failed bot responses — frustrated users report + block&lt;/li&gt;
&lt;/ol&gt;
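
&lt;p&gt;Rule 4 is the easiest one to turn into code. Below is a minimal sketch of a send plan that splits a recipient list into batches and spaces them out. The function name &lt;code&gt;plan_batches&lt;/code&gt; and its parameters are hypothetical and not part of any WhatsApp SDK; the defaults simply mirror the 50-per-batch, 5-minute-gap numbers above.&lt;/p&gt;

```python
from datetime import datetime, timedelta


def plan_batches(recipients, batch_size=50, gap_minutes=5, start=None):
    """Split recipients into batches and give each batch a send time.

    Hypothetical helper: it only plans the schedule, it does not send anything.
    """
    start = start or datetime.now()
    plan = []
    for index in range(0, len(recipients), batch_size):
        batch = recipients[index:index + batch_size]
        # Batch number k is sent k * gap_minutes after the first one.
        send_at = start + timedelta(minutes=gap_minutes * (index // batch_size))
        plan.append((send_at, batch))
    return plan


contacts = [f"contact-{i}" for i in range(120)]
for send_at, batch in plan_batches(contacts, start=datetime(2026, 1, 1, 9, 0)):
    print(send_at.strftime("%H:%M"), len(batch))
# 09:00 50
# 09:05 50
# 09:10 20
```

&lt;p&gt;Keeping the plan separate from the sending step makes it easy to log or review a campaign before a single message goes out.&lt;/p&gt;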

&lt;h2&gt;
  
  
  What If You're Already Restricted?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Official API&lt;/strong&gt;: Pause marketing templates, improve reply rates, wait 7 days for quality re-evaluation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unofficial API&lt;/strong&gt;: Stop proactive messaging immediately. If banned, the number is gone. Migrate to official API.&lt;/p&gt;




&lt;p&gt;The algorithm isn't adversarial toward legitimate businesses. The formula:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Official API + Opt-in + Relevant Messages + Reply-Encouraging Design = Minimal Risk&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Full deep-dive with all technical details: &lt;a href="https://achiya-automation.com/en/blog/whatsapp-spam-detection-2026/" rel="noopener noreferrer"&gt;WhatsApp Spam Detection Algorithm 2026&lt;/a&gt;&lt;/p&gt;

</description>
      <category>whatsapp</category>
      <category>bots</category>
      <category>automation</category>
      <category>security</category>
    </item>
    <item>
      <title>MCP vs CLI for Browser Automation: I Benchmarked Both and the Results Surprised Me</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Sat, 11 Apr 2026 23:10:53 +0000</pubDate>
      <link>https://dev.to/achiya-automation/mcp-vs-cli-for-browser-automation-i-benchmarked-both-and-the-results-surprised-me-4cog</link>
      <guid>https://dev.to/achiya-automation/mcp-vs-cli-for-browser-automation-i-benchmarked-both-and-the-results-surprised-me-4cog</guid>
      <description>&lt;p&gt;Three weeks ago I published &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;safari-mcp&lt;/a&gt; — a macOS-native Safari automation server that speaks the Model Context Protocol. 84 tools, AppleScript + optional extension for speed, keeps Safari logins, zero Chrome overhead. Today it's in the VS Code and Cursor marketplaces.&lt;/p&gt;

&lt;p&gt;Then I saw &lt;a href="https://github.com/HKUDS/CLI-Anything" rel="noopener noreferrer"&gt;HKUDS/CLI-Anything&lt;/a&gt; — a 29k-star project that auto-wraps open-source software as agent-ready CLIs. Their pitch: "Make ALL software agent-native." Their main example is &lt;a href="https://github.com/apireno/DOMShell" rel="noopener noreferrer"&gt;DOMShell&lt;/a&gt; wrapped as &lt;code&gt;cli-anything-browser&lt;/code&gt; — a shell-pipeable interface for Chrome automation.&lt;/p&gt;

&lt;p&gt;I wanted to know: &lt;strong&gt;is wrapping safari-mcp as a CLI actually worth it?&lt;/strong&gt; Or is it pure theater — re-exposing a working MCP server as a strictly worse interface?&lt;/p&gt;

&lt;p&gt;So I built the harness (&lt;a href="https://github.com/HKUDS/CLI-Anything/pull/212" rel="noopener noreferrer"&gt;PR #212&lt;/a&gt;) and benchmarked it live against the direct MCP path. Real Safari, real macOS, measured on 2026-04-10.&lt;/p&gt;

&lt;p&gt;Here's what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;MCP (direct stdio)&lt;/th&gt;
&lt;th&gt;CLI (subprocess per call)&lt;/th&gt;
&lt;th&gt;Winner&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Per-call latency&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;119ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3,023ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MCP, 25×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5-op workflow&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.7s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;15.2s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MCP, 5.6×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokens per API call (tool defs)&lt;/td&gt;
&lt;td&gt;7,986&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;95&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;CLI, 84×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output accuracy&lt;/td&gt;
&lt;td&gt;identical&lt;/td&gt;
&lt;td&gt;identical&lt;/td&gt;
&lt;td&gt;tie&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;If your agent speaks MCP&lt;/strong&gt; (Claude Code, Cursor, Cline, Windsurf, Continue, OpenClaw, any MCP-aware client) — &lt;strong&gt;use the MCP directly&lt;/strong&gt;. The CLI is strictly slower.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If you need to drive it from bash, CI, cron, or an agent that doesn't speak MCP&lt;/strong&gt; — use the CLI. The token savings compound; at Claude Opus pricing, a 100-turn session saves ~$12 in tool-definition overhead alone.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the whole story. If you only wanted the numbers, you can stop here. If you want the methodology, the edge cases, and the bugs I hit along the way, read on.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I actually built
&lt;/h2&gt;

&lt;p&gt;The harness (&lt;a href="https://github.com/HKUDS/CLI-Anything/tree/main/safari/agent-harness" rel="noopener noreferrer"&gt;&lt;code&gt;safari/agent-harness/&lt;/code&gt;&lt;/a&gt;) is a schema-driven CLI generator:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Offline Zod parser&lt;/strong&gt; (&lt;code&gt;scripts/extract_tools.py&lt;/code&gt;) reads safari-mcp's source and emits &lt;code&gt;resources/tools.json&lt;/code&gt; — the full schema for all 84 tools. Depth-aware, handles nested &lt;code&gt;z.array(z.object({...})).describe("outer")&lt;/code&gt; correctly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Runtime Click generator&lt;/strong&gt; (&lt;code&gt;safari_cli.py&lt;/code&gt;) loads the registry at import time and builds one Click subcommand per MCP tool. Argument names, types, enum choices, required flags, and descriptions are all pulled from the schema. Zero manual mapping.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Parity test suite&lt;/strong&gt; (&lt;code&gt;test_parity.py&lt;/code&gt;) iterates the registry and verifies every tool is reachable, every param is wired correctly, every enum matches. If the registry and the CLI ever drift, the tests scream.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
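
&lt;p&gt;To make the second step concrete, here is a rough sketch of building one subcommand per tool from a schema registry. It is not the harness code: it swaps Click for the standard library's &lt;code&gt;argparse&lt;/code&gt; so it runs without dependencies, and the &lt;code&gt;REGISTRY&lt;/code&gt; layout is an assumption, since the real &lt;code&gt;tools.json&lt;/code&gt; structure may differ.&lt;/p&gt;

```python
import argparse

# Assumed registry shape for illustration; the real tools.json may differ.
REGISTRY = [
    {
        "name": "safari_click",
        "description": "Click element.",
        "params": [
            {"name": "selector", "type": "string", "required": False},
            {"name": "x", "type": "number", "required": False},
        ],
    },
    {
        "name": "safari_evaluate",
        "description": "Run JavaScript in the page.",
        "params": [{"name": "script", "type": "string", "required": True}],
    },
]

TYPE_MAP = {"string": str, "number": float}


def build_parser(registry):
    parser = argparse.ArgumentParser(prog="cli-sketch")
    subcommands = parser.add_subparsers(dest="command", required=True)
    for tool in registry:
        # safari_list_tabs becomes list-tabs, matching the CLI naming shown below.
        command = tool["name"].removeprefix("safari_").replace("_", "-")
        sub = subcommands.add_parser(command, help=tool["description"])
        sub.set_defaults(mcp_tool=tool["name"])
        for param in tool["params"]:
            flag = "--" + param["name"]
            if param["type"] == "boolean":
                sub.add_argument(flag, action="store_true")
                continue
            sub.add_argument(
                flag,
                type=TYPE_MAP[param["type"]],
                choices=param.get("enum"),
                required=param.get("required", False),
            )
    return parser


args = build_parser(REGISTRY).parse_args(["click", "--selector", "#submit", "--x", "10"])
print(args.mcp_tool, args.selector, args.x)  # safari_click #submit 10.0
```

&lt;p&gt;Because every flag comes from the registry, a schema change shows up in the CLI without any hand-written mapping, which is the property the parity tests check.&lt;/p&gt;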

&lt;p&gt;The CLI surface ends up looking like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;cli-anything-safari tools count
84

&lt;span class="nv"&gt;$ &lt;/span&gt;cli-anything-safari tools describe safari_click
Name:        safari_click
CLI &lt;span class="nb"&gt;command&lt;/span&gt;: tool click
Description: Click element. Use ref &lt;span class="o"&gt;(&lt;/span&gt;from snapshot&lt;span class="o"&gt;)&lt;/span&gt;, selector, text, or x/y...
Parameters:
  &lt;span class="nt"&gt;--ref&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;string, optional&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="nt"&gt;--selector&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;string, optional&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="nt"&gt;--text&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;string, optional&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="nt"&gt;--x&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;number, optional&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="nt"&gt;--y&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;number, optional&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;$ &lt;/span&gt;cli-anything-safari &lt;span class="nt"&gt;--json&lt;/span&gt; tool snapshot
&lt;span class="s2"&gt;"ref=0_0 body&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;ref=0_1 div&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;ref=0_2 navigation &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;Sidebar&lt;/span&gt;&lt;span class="se"&gt;\"\n&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same interface as the upstream MCP, just behind &lt;code&gt;click.command(...)&lt;/code&gt; calls.&lt;/p&gt;

&lt;h2&gt;
  
  
  The benchmark setup
&lt;/h2&gt;

&lt;p&gt;Both paths hit the &lt;strong&gt;same&lt;/strong&gt; &lt;code&gt;safari-mcp&lt;/code&gt; server in the end. The difference is the connection model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MCP direct:  Python → stdio (persistent) → safari-mcp → Safari
CLI:         Python → subprocess → npx → Node → safari-mcp → Safari
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For MCP I used &lt;code&gt;mcp.ClientSession&lt;/code&gt; with a persistent stdio connection, measuring only the &lt;code&gt;call_tool()&lt;/code&gt; round-trip (initialization amortized). For CLI I measured &lt;code&gt;subprocess.run([...])&lt;/code&gt; wall time. Both had one warmup call that I discarded.&lt;/p&gt;

&lt;p&gt;The benchmark script is at &lt;code&gt;/tmp/benchmark_cli_vs_mcp.py&lt;/code&gt; (not committed because it's scratch); the key loop is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# MCP: persistent session, N calls
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;stdio_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;as &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# warmup
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;t0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;times&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;t0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# CLI: spawn per call
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;t0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CLI&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_short_name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;args_list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;times&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;t0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Latency — MCP wins by 25×
&lt;/h2&gt;

&lt;p&gt;Ten calls of &lt;code&gt;safari_list_tabs&lt;/code&gt; (warm cache, same Safari state):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                  MCP (ms)    CLI (ms)       ratio
  min                113.3      2970.2       26.2×
  median             119.5      3026.1       25.3×
  mean               119.3      3022.7       25.3×
  max                123.7      3097.2       25.0×
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CLI calls land at &lt;strong&gt;~3 seconds&lt;/strong&gt; every single time, with almost no variance. That consistency is the giveaway: the bottleneck is not &lt;code&gt;safari_list_tabs&lt;/code&gt; itself — it's the ~2.9 seconds that go into &lt;code&gt;npx&lt;/code&gt; resolution, Node.js startup, &lt;code&gt;safari-mcp&lt;/code&gt; initialization, and MCP handshake for every fresh subprocess.&lt;/p&gt;

&lt;p&gt;MCP amortizes all of that across a single persistent session. Once the session is up, each additional tool call costs &lt;strong&gt;just&lt;/strong&gt; the ~120ms AppleScript operation.&lt;/p&gt;
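
&lt;p&gt;The fixed cost falls straight out of the mean row of the table. A quick check, using only the two measured means:&lt;/p&gt;

```python
# Mean latencies from the table above, in milliseconds.
MCP_MEAN_MS = 119.3
CLI_MEAN_MS = 3022.7

# Both paths run the same tool, so the difference is per-spawn overhead.
spawn_overhead_ms = CLI_MEAN_MS - MCP_MEAN_MS
ratio = CLI_MEAN_MS / MCP_MEAN_MS

print(f"fixed cost per CLI call: {spawn_overhead_ms:.0f} ms")  # 2903 ms
print(f"CLI / MCP ratio: {ratio:.1f}x")  # 25.3x
```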

&lt;p&gt;&lt;strong&gt;For interactive reactive workflows — agents that take each result and decide the next step — MCP is the obvious choice.&lt;/strong&gt; Every round-trip matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Workflow — MCP still wins on reactive sequences
&lt;/h2&gt;

&lt;p&gt;I ran a 5-op workflow (&lt;code&gt;snapshot → read_page → list_tabs → snapshot → read_page&lt;/code&gt;) three ways:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  MCP (persistent, 5 ops)           2,714 ms
  CLI (5 sequential spawns)        15,285 ms
  CLI (1 shell pipeline, 5 ops)    15,153 ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Chaining the calls in one shell command (the "1 shell pipeline" row), as in &lt;code&gt;cli-anything-safari tool X &amp;amp;&amp;amp; cli-anything-safari tool Y&lt;/code&gt;, does &lt;strong&gt;not&lt;/strong&gt; help. Each command after &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt; is still a separate process that spawns its own &lt;code&gt;npx&lt;/code&gt; subprocess, so the overhead per step is unchanged.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The only way to amortize the cost is to drive the Python API directly&lt;/strong&gt; (&lt;code&gt;from cli_anything.safari.utils.safari_backend import call&lt;/code&gt;). If you do that, you're back to roughly MCP-class numbers because you're just using the MCP Python SDK under a different name.&lt;/p&gt;

&lt;p&gt;The CLI's per-call cost is structural. You cannot pipeline your way out of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tokens — CLI wins by 84×
&lt;/h2&gt;

&lt;p&gt;This is where the picture inverts. When an LLM uses MCP tools, &lt;strong&gt;every API call includes the full tool definitions in the request&lt;/strong&gt;. For safari-mcp that's 84 tools × ~95 tokens each = &lt;strong&gt;~7,986 tokens&lt;/strong&gt; on every turn.&lt;/p&gt;

&lt;p&gt;I measured this with the real tools.json and the &lt;code&gt;cl100k_base&lt;/code&gt; tokenizer (&lt;code&gt;tiktoken&lt;/code&gt;). That is an OpenAI encoding, so for Claude models the counts are an approximation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tiktoken&lt;/span&gt;
&lt;span class="n"&gt;enc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tiktoken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_encoding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cl100k_base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;mcp_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputSchema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputSchema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;
&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mcp_response&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="c1"&gt;# 7,986 tokens for 84 tools
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI path sends ~95 tokens — just the &lt;code&gt;bash&lt;/code&gt; tool definition. The agent learns the CLI surface by running &lt;code&gt;cli-anything-safari tools list --json&lt;/code&gt; once (5,236 tokens, one-time) and the info sits in the conversation context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;At Claude Opus pricing ($15/MTok input, no caching):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Session length&lt;/th&gt;
&lt;th&gt;MCP overhead&lt;/th&gt;
&lt;th&gt;CLI overhead&lt;/th&gt;
&lt;th&gt;Savings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10 turns&lt;/td&gt;
&lt;td&gt;$1.20&lt;/td&gt;
&lt;td&gt;$0.09&lt;/td&gt;
&lt;td&gt;$1.11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100 turns&lt;/td&gt;
&lt;td&gt;$11.98&lt;/td&gt;
&lt;td&gt;$0.22&lt;/td&gt;
&lt;td&gt;$11.76&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000 turns&lt;/td&gt;
&lt;td&gt;$119.79&lt;/td&gt;
&lt;td&gt;$1.50&lt;/td&gt;
&lt;td&gt;$118.29&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
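
&lt;p&gt;The 10- and 100-turn rows can be reproduced with a few lines of arithmetic. This sketch assumes the same accounting as the table: tool definitions are billed on every turn, and the one-time &lt;code&gt;tools list --json&lt;/code&gt; output is counted once. The constant and function names are mine, for illustration only.&lt;/p&gt;

```python
INPUT_PRICE_PER_MTOK = 15.00  # USD, uncached input, as quoted above
MCP_TOOL_DEF_TOKENS = 7_986   # 84 tool definitions, sent every turn
CLI_TOOL_DEF_TOKENS = 95      # single bash tool definition, sent every turn
CLI_DISCOVERY_TOKENS = 5_236  # one-time `tools list --json` output


def overhead_usd(turns, per_turn_tokens, one_time_tokens=0):
    """Tool-definition overhead in USD for a session of `turns` turns."""
    tokens = turns * per_turn_tokens + one_time_tokens
    return tokens * INPUT_PRICE_PER_MTOK / 1_000_000


for turns in (10, 100):
    mcp = overhead_usd(turns, MCP_TOOL_DEF_TOKENS)
    cli = overhead_usd(turns, CLI_TOOL_DEF_TOKENS, CLI_DISCOVERY_TOKENS)
    print(f"{turns} turns: MCP ${mcp:.2f}, CLI ${cli:.2f}")
# 10 turns: MCP $1.20, CLI $0.09
# 100 turns: MCP $11.98, CLI $0.22
```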

&lt;p&gt;Prompt caching narrows this considerably. Anthropic bills cached tool definitions at 1.25× the base input price on the first write ($18.75/MTok at the $15 rate used here) and 0.1× on reads ($1.50/MTok), a 10× discount on every turn after the first. With caching, the MCP cost drops from ~$12 to roughly $1.35 per 100-turn session, assuming every turn after the first hits the cache. Still more expensive than CLI, but the gap is smaller.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The takeaway&lt;/strong&gt;: for &lt;strong&gt;short, reactive&lt;/strong&gt; sessions (where you care about UX and per-call latency), MCP wins hands-down. For &lt;strong&gt;long, scripted&lt;/strong&gt; sessions at scale (where tool-definition overhead becomes a real line item), the CLI's token efficiency is genuine and measurable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Accuracy — tie
&lt;/h2&gt;

&lt;p&gt;Both paths call the same &lt;code&gt;safari-mcp&lt;/code&gt; server. Both go through the same AppleScript → Safari chain. The CLI is a thin subprocess wrapper that serializes the MCP &lt;code&gt;CallToolResult.content&lt;/code&gt; into stdout via a small &lt;code&gt;_unwrap()&lt;/code&gt; helper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_unwrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="c1"&gt;# ImageContent: returned by screenshot tools
&lt;/span&gt;        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                          &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mimeType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mimeType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/octet-stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;parts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Byte-identical output verified live: the Unicode tab titles returned by &lt;code&gt;cli-anything-safari --json tool list-tabs&lt;/code&gt; match the direct MCP output character-for-character, including right-to-left Hebrew.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bugs that took 5 review rounds to find
&lt;/h2&gt;

&lt;p&gt;I'm not going to pretend the first draft was clean. The schema-driven generator had real bugs that five passes of review (two my own, three by an adversarial code-reviewer agent) surfaced one by one:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Nested &lt;code&gt;.describe()&lt;/code&gt; leaked&lt;/strong&gt;. For &lt;code&gt;z.array(z.object({selector: z.string().describe("CSS selector")})).describe("Array of {selector, value} pairs")&lt;/code&gt;, the naive regex picked the inner &lt;code&gt;"CSS selector"&lt;/code&gt; as the outer field's description. Four tools had wrong help text. Fixed by walking modifier chains at depth 0 only.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Nested &lt;code&gt;.optional()&lt;/code&gt; leaked&lt;/strong&gt;. Same root cause, different effect — &lt;code&gt;safari_mock_route.response&lt;/code&gt; and &lt;code&gt;safari_run_script.steps&lt;/code&gt; were marked optional because an inner field had &lt;code&gt;.optional()&lt;/code&gt;. The actual MCP schema marks them required. This one silently produced wrong JSON schemas; the fix was depth-aware modifier detection everywhere.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;_unwrap()&lt;/code&gt; silently dropped screenshot output&lt;/strong&gt;. It only handled &lt;code&gt;TextContent&lt;/code&gt;, not &lt;code&gt;ImageContent&lt;/code&gt;. For two tools (&lt;code&gt;safari_screenshot&lt;/code&gt;, &lt;code&gt;safari_screenshot_element&lt;/code&gt;), the CLI returned &lt;code&gt;null&lt;/code&gt; with exit code 0 instead of the base64 JPEG. Caught on the fourth review round after I'd already declared "100% compliance."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;safari_evaluate&lt;/code&gt; parameter name is &lt;code&gt;script&lt;/code&gt;, not &lt;code&gt;code&lt;/code&gt;&lt;/strong&gt;. The tool description said "JavaScript code to execute" so I wrote every documentation example as &lt;code&gt;--code "document.title"&lt;/code&gt;. The parser auto-generated the CLI correctly from the schema (&lt;code&gt;--script&lt;/code&gt;), so the CLI worked, but every doc example in SKILL.md, README.md, and my test file was wrong. Caught on the fourth review round when the reviewer cross-referenced docs against the schema.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;doubleClick: z.boolean().default(false)&lt;/code&gt; serialized the default as the string &lt;code&gt;"false"&lt;/code&gt;&lt;/strong&gt;. Not broken at runtime (Click ignores it) but wrong in the bundled JSON schema. Fixed by adding a &lt;code&gt;_coerce_default()&lt;/code&gt; step that parses JS barewords into their Python equivalents.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
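
&lt;p&gt;To make the last fix concrete, here is a minimal sketch of what a default-coercion step could look like. It is not the harness's actual &lt;code&gt;_coerce_default()&lt;/code&gt;: the function name &lt;code&gt;coerce_default_sketch&lt;/code&gt; and the bareword table are assumptions for illustration.&lt;/p&gt;

```python
import json

# Hypothetical bareword table (an assumption, not the harness's real mapping).
_JS_BAREWORDS = {"true": True, "false": False, "null": None, "undefined": None}


def coerce_default_sketch(raw: str):
    """Turn the source text of a zod .default(...) argument into a Python value."""
    text = raw.strip()
    if text in _JS_BAREWORDS:
        return _JS_BAREWORDS[text]
    try:
        # Numbers and double-quoted strings are already valid JSON literals.
        return json.loads(text)
    except json.JSONDecodeError:
        # Anything else (for example a single-quoted string) stays as text.
        return text.strip("'")


# The default now serializes as a JSON boolean instead of the string "false".
schema_fragment = json.dumps({"default": coerce_default_sketch("false")})
```

&lt;p&gt;With this in place, &lt;code&gt;z.boolean().default(false)&lt;/code&gt; would end up in the bundled schema as a real boolean rather than a string.&lt;/p&gt;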

&lt;p&gt;Every bug except #5 got a corresponding &lt;strong&gt;regression test&lt;/strong&gt; in &lt;code&gt;test_parity.py&lt;/code&gt; after the fix. The file now has 24 tests, including explicit assertions like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_evaluate_param_is_script_not_code&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Regression: prior versions used &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; by mistake.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;tool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;safari_evaluate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;script&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The lesson I kept re-learning: &lt;strong&gt;if you wrote the code, you can't review it yourself&lt;/strong&gt;. You read your own docs through your own mental model of what the code does. You need an adversary — either a human with fresh eyes or an agent with no context — to catch the bugs your mental model papers over.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to use which
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Decision tree (read left to right):

Does your agent speak MCP natively?
├── Yes → Use safari-mcp directly. 25× faster, better UX.
└── No
    ├── Is this a one-off / interactive script?
    │   └── Yes → Use cli-anything-safari. jq-pipeable.
    ├── Long-running automation, cost matters?
    │   └── Yes → cli-anything-safari. Token savings compound.
    ├── CI / cron / non-interactive automation?
    │   └── Yes → cli-anything-safari. Subprocess-friendly.
    └── Everything else → try MCP first, fall back to CLI if needed.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For &lt;strong&gt;Claude Code, Cursor, Cline, Windsurf, Continue, OpenClaw, VS Code MCP&lt;/strong&gt; — all MCP-native. Use &lt;code&gt;safari-mcp&lt;/code&gt; directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; safari-mcp
&lt;span class="c"&gt;# Then add to your MCP client config and restart it.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For &lt;strong&gt;Codex CLI, GitHub Copilot CLI, older agent frameworks, shell scripts, cron jobs&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# After the CLI-Anything PR merges&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;cli-anything-safari
cli-anything-safari tools list
cli-anything-safari tool navigate &lt;span class="nt"&gt;--url&lt;/span&gt; https://example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What I'd do differently next time
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Write the benchmark first.&lt;/strong&gt; I built the CLI, shipped it, and &lt;em&gt;then&lt;/em&gt; benchmarked it. If I'd measured first, I would have avoided ~3 review rounds of "is this even useful?" angst. The answer is nuanced (MCP for latency, CLI for tokens and reach), but I couldn't see that without the numbers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Schema-driven from day one.&lt;/strong&gt; The original plan was to hand-wrap 20 curated tools with a &lt;code&gt;raw&lt;/code&gt; escape hatch for the rest. That would have been ~1,500 lines of code I'd be maintaining forever. The schema-driven approach is ~300 lines and maintains itself.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Spin up an adversarial reviewer earlier.&lt;/strong&gt; I used an independent code-reviewer agent on review rounds 2–4. It caught bugs I'd read past a dozen times. Should have used it on round 1.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Token cost is a first-class metric for MCP design.&lt;/strong&gt; I was thinking about MCP vs CLI in pure latency terms. The token-cost-at-scale axis is genuinely the more important one for long agent sessions, and I should have been measuring it from the start.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;safari-mcp repo&lt;/strong&gt;: &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;https://github.com/achiya-automation/safari-mcp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLI-Anything PR #212&lt;/strong&gt;: &lt;a href="https://github.com/HKUDS/CLI-Anything/pull/212" rel="noopener noreferrer"&gt;https://github.com/HKUDS/CLI-Anything/pull/212&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct harness path&lt;/strong&gt; (after merge): &lt;a href="https://github.com/HKUDS/CLI-Anything/tree/main/safari/agent-harness" rel="noopener noreferrer"&gt;https://github.com/HKUDS/CLI-Anything/tree/main/safari/agent-harness&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you run into edge cases — or have a better benchmark setup I should run — open an issue on the repo or reply here.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post is part of my &lt;a href="https://dev.to/achiya-automation"&gt;safari-mcp series&lt;/a&gt;. Previous posts: &lt;a href="https://dev.to/achiya-automation/i-built-an-mcp-server-for-safari-because-chrome-was-melting-my-macbook"&gt;Why I built an MCP server for Safari&lt;/a&gt;, &lt;a href="https://dev.to/achiya-automation/i-replaced-chrome-devtools-mcp-with-safari-on-my-mac-heres-what-happened"&gt;Chrome DevTools MCP vs Safari&lt;/a&gt;, &lt;a href="https://dev.to/achiya-automation/7-things-i-learned-building-a-safari-browser-automation-tool-that-chrome-cant-do"&gt;7 things I learned building Safari automation&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>cli</category>
      <category>automation</category>
      <category>ai</category>
    </item>
    <item>
      <title>I just hardened my OSS release pipeline to 11 layers of security — here's the playbook</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Sat, 11 Apr 2026 21:58:18 +0000</pubDate>
      <link>https://dev.to/achiya-automation/i-just-hardened-my-oss-release-pipeline-to-11-layers-of-security-heres-the-playbook-4267</link>
      <guid>https://dev.to/achiya-automation/i-just-hardened-my-oss-release-pipeline-to-11-layers-of-security-heres-the-playbook-4267</guid>
      <description>&lt;h1&gt;
  
  
  I just hardened my OSS release pipeline to 11 layers of security — here's the playbook
&lt;/h1&gt;

&lt;p&gt;This weekend I shipped &lt;a href="https://github.com/achiya-automation/safari-mcp/releases/tag/v2.7.9" rel="noopener noreferrer"&gt;safari-mcp v2.7.9&lt;/a&gt;, a minor release on the surface but a complete overhaul of how the project gets published. Along the way I went from "NPM_TOKEN in a workflow secret" to 11 layers of supply-chain defense, all driven by one realistic question: &lt;em&gt;if my GitHub account gets phished tomorrow, how much damage can an attacker do before somebody notices?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you're a solo maintainer of a JavaScript package, the playbook below is for you. Every step is something I actually did today, with links to the commits, and most of them took under 10 minutes each.&lt;/p&gt;

&lt;h2&gt;
  
  
  The starting point
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Package:&lt;/strong&gt; &lt;code&gt;safari-mcp&lt;/code&gt; on npm (~2000 monthly downloads, 27 stars, MIT)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Release flow:&lt;/strong&gt; GitHub Actions workflow triggered by release, authenticating with a long-lived &lt;code&gt;NPM_TOKEN&lt;/code&gt; stored as a repo secret&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; That token could publish &lt;em&gt;anything&lt;/em&gt; on my npm account until it expired. If my GitHub got compromised, step one for an attacker would be grabbing the token and uploading a malicious &lt;code&gt;safari-mcp@2.7.10&lt;/code&gt; within minutes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal: make that attack path as hard as possible without turning my release flow into a 20-minute bureaucracy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 1: npm OIDC Trusted Publisher (no more long-lived token)
&lt;/h2&gt;

&lt;p&gt;npm now supports &lt;a href="https://docs.npmjs.com/trusted-publishers" rel="noopener noreferrer"&gt;Trusted Publishers via OIDC&lt;/a&gt;. Instead of storing a token, you configure npm to accept short-lived OIDC tokens issued by GitHub Actions — tokens that are cryptographically bound to a specific repository + workflow + environment and expire within minutes.&lt;/p&gt;

&lt;p&gt;Setting this up on npm takes three fields (org, repo, workflow filename) and one click. After that, the workflow no longer needs &lt;code&gt;NPM_TOKEN&lt;/code&gt;; it just needs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;  &lt;span class="c1"&gt;# required for OIDC&lt;/span&gt;

&lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm publish --provenance --access public&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;--provenance&lt;/code&gt; flag adds a &lt;a href="https://slsa.dev/" rel="noopener noreferrer"&gt;SLSA build attestation&lt;/a&gt; to the published package, so anyone downloading it can cryptographically verify that &lt;code&gt;safari-mcp@2.7.9&lt;/code&gt; was built by the specific GitHub Actions run I claim it was.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Upgrade path:&lt;/strong&gt; After configuring Trusted Publisher, &lt;em&gt;delete the old &lt;code&gt;NPM_TOKEN&lt;/code&gt; secret&lt;/em&gt;. Otherwise it's still a liability.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Layer 2: manual-approval environment gate
&lt;/h2&gt;

&lt;p&gt;OIDC stops token theft. It doesn't stop a compromised workflow file from publishing malware. So I added a GitHub &lt;a href="https://docs.github.com/en/actions/concepts/deployments/deployment-environments" rel="noopener noreferrer"&gt;deployment environment&lt;/a&gt; called &lt;code&gt;npm-publish&lt;/code&gt; with &lt;strong&gt;required reviewers&lt;/strong&gt; (just me) and &lt;code&gt;can_admins_bypass: false&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now every release pauses in GitHub Actions until I explicitly click "Approve" in the UI. I see the exact SHA, the commit message, and the tag before I let the publish proceed. Even if an attacker has the ability to push to &lt;code&gt;main&lt;/code&gt; &lt;em&gt;and&lt;/em&gt; has my repo admin privileges, they still can't skip the approval.&lt;/p&gt;

&lt;p&gt;Important detail: by default, deployment environments allow admins to bypass. I set it to &lt;code&gt;false&lt;/code&gt; because the entire point was to defend against &lt;em&gt;my own&lt;/em&gt; compromised credentials. If I get locked out I can always re-enable it from the Settings page, which itself requires passkey reauth.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 3: custom branch policy (tags + main only)
&lt;/h2&gt;

&lt;p&gt;First time I tried to release, the publish workflow failed with:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tag "v2.7.9" is not allowed to deploy to npm-publish due to environment protection rules.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The default &lt;code&gt;deployment_branch_policy&lt;/code&gt; is "protected branches only" — tags don't count as branches. The fix was switching to &lt;code&gt;custom_branch_policies: true&lt;/code&gt; and adding two rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;branch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;main&lt;/span&gt;
&lt;span class="na"&gt;tag&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now both push-to-main deployments and version tag deployments work, but only those. Feature branches can't deploy.&lt;/p&gt;
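
&lt;p&gt;A rough way to reason about which refs the two rules admit is sketched below. This is only an approximation: GitHub evaluates these patterns server-side with its own matcher, while the sketch uses Python's &lt;code&gt;fnmatch&lt;/code&gt;, whose wildcard handling is not identical. The &lt;code&gt;may_deploy&lt;/code&gt; helper is hypothetical.&lt;/p&gt;

```python
from fnmatch import fnmatchcase

# The two rules from the policy above, as (ref type, name pattern) pairs.
RULES = [("branch", "main"), ("tag", "v*")]
PREFIXES = {"branch": "refs/heads/", "tag": "refs/tags/"}


def may_deploy(ref: str) -> bool:
    """Return True if a full git ref matches one of the rules (approximation)."""
    for kind, pattern in RULES:
        prefix = PREFIXES[kind]
        if ref.startswith(prefix) and fnmatchcase(ref[len(prefix):], pattern):
            return True
    return False
```

&lt;p&gt;Under these assumptions, &lt;code&gt;refs/heads/main&lt;/code&gt; and &lt;code&gt;refs/tags/v2.7.9&lt;/code&gt; pass, while a feature branch or a tag that does not start with &lt;code&gt;v&lt;/code&gt; is refused.&lt;/p&gt;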

&lt;h2&gt;
  
  
  Layer 4: SHA pinning required for all actions
&lt;/h2&gt;

&lt;p&gt;The single biggest OSS supply-chain incident of 2025 was &lt;a href="https://github.com/tj-actions/changed-files/issues/2464" rel="noopener noreferrer"&gt;tj-actions/changed-files&lt;/a&gt; — thousands of secrets leaked because workflows used &lt;code&gt;@v1&lt;/code&gt; tags that the attacker retagged. The fix is simple but painful: pin every action to a full commit SHA instead of a version tag.&lt;/p&gt;

&lt;p&gt;GitHub recently added a repo-level setting to &lt;em&gt;require&lt;/em&gt; this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gh api &lt;span class="nt"&gt;--method&lt;/span&gt; PUT /repos/OWNER/REPO/actions/permissions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--input&lt;/span&gt; - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
{"enabled": true, "allowed_actions": "all", "sha_pinning_required": true}
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After I enabled this, every CI run failed because my workflows used version tags such as &lt;code&gt;actions/checkout@v4&lt;/code&gt;. I replaced each tag with a full commit SHA:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd&lt;/span&gt; &lt;span class="c1"&gt;# v6.0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The comment is important — it's how Dependabot auto-updates the SHA while keeping the version readable.&lt;/p&gt;
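
&lt;p&gt;Before flipping the setting on, it can help to find every unpinned action locally. The sketch below is a simple line-based scan, not a YAML parser, so treat it as a first pass; the function name and the regular expressions are my own assumptions, not part of any GitHub tooling.&lt;/p&gt;

```python
import re

# Matches "uses: owner/repo@ref" with an optional leading "- ".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#@]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action@ref strings whose ref is not a 40-character commit SHA."""
    found = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match is None:
            continue
        action, ref = match.groups()
        # Container references are pinned by digest, not by commit SHA.
        if action.startswith("docker://"):
            continue
        if FULL_SHA_RE.fullmatch(ref) is None:
            found.append(f"{action}@{ref}")
    return found
```

&lt;p&gt;Running it over the text of each file in &lt;code&gt;.github/workflows/&lt;/code&gt; gives a list of the references that still need a SHA.&lt;/p&gt;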

&lt;h2&gt;
  
  
  Layer 5: branch protection on main (signatures required, no force push)
&lt;/h2&gt;

&lt;p&gt;Branch protection for solo maintainers is usually counterproductive (it blocks your own pushes), but three rules are pure upside:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gh api &lt;span class="nt"&gt;--method&lt;/span&gt; PUT /repos/OWNER/REPO/branches/main/protection &lt;span class="nt"&gt;--input&lt;/span&gt; - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
{
  "required_status_checks": null,
  "enforce_admins": false,
  "required_pull_request_reviews": null,
  "restrictions": null,
  "allow_force_pushes": false,
  "allow_deletions": false,
  "required_conversation_resolution": true
}
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;allow_force_pushes: false&lt;/code&gt;&lt;/strong&gt; — stops history rewriting, essential with signed commits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;allow_deletions: false&lt;/code&gt;&lt;/strong&gt; — stops accidental &lt;code&gt;git push --delete origin main&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;required_conversation_resolution: true&lt;/code&gt;&lt;/strong&gt; — every PR comment must be resolved before merge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then, separately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gh api &lt;span class="nt"&gt;--method&lt;/span&gt; POST /repos/OWNER/REPO/branches/main/protection/required_signatures
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every new commit on &lt;code&gt;main&lt;/code&gt; must carry a verified signature. Unsigned commits, for example from a laptop without the signing key configured, get rejected at push time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 6: SSH commit signing without losing your mind
&lt;/h2&gt;

&lt;p&gt;Setting up commit signing always feels like yak-shaving. Here's the short version for macOS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate a dedicated signing key (no passphrase, used ONLY for git sign)&lt;/span&gt;
ssh-keygen &lt;span class="nt"&gt;-t&lt;/span&gt; ed25519 &lt;span class="nt"&gt;-f&lt;/span&gt; ~/.ssh/git_signing_ed25519 &lt;span class="nt"&gt;-N&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt; &lt;span class="nt"&gt;-C&lt;/span&gt; &lt;span class="s2"&gt;"git-signing"&lt;/span&gt;

&lt;span class="c"&gt;# Point git at it&lt;/span&gt;
git config &lt;span class="nt"&gt;--local&lt;/span&gt; gpg.format ssh
git config &lt;span class="nt"&gt;--local&lt;/span&gt; user.signingkey ~/.ssh/git_signing_ed25519.pub
git config &lt;span class="nt"&gt;--local&lt;/span&gt; commit.gpgsign &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Upload to GitHub as a *signing* key (not auth!)&lt;/span&gt;
gh ssh-key add ~/.ssh/git_signing_ed25519.pub &lt;span class="nt"&gt;--type&lt;/span&gt; signing &lt;span class="nt"&gt;--title&lt;/span&gt; &lt;span class="s2"&gt;"git signing"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gotcha: if your primary &lt;code&gt;~/.ssh/id_ed25519&lt;/code&gt; has a passphrase, &lt;code&gt;git commit -S&lt;/code&gt; will hang forever trying to unlock it without prompting (in a non-interactive shell). The dedicated no-passphrase key avoids that, and since it's only usable for signing — not auth — the security trade-off is small.&lt;/p&gt;

&lt;p&gt;Second gotcha: commits aren't shown as "Verified" on GitHub unless the committer email matches a verified email on your GitHub account. If yours doesn't, switch to the &lt;code&gt;&amp;lt;user-id&amp;gt;+&amp;lt;username&amp;gt;@users.noreply.github.com&lt;/code&gt; format, which GitHub recognizes automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 7: Approval for outside-collaborator workflows
&lt;/h2&gt;

&lt;p&gt;The default GitHub setting is "Require approval for first-time contributors". That means a malicious user who had a single PR merged five years ago can open a pull request that runs workflows in your repo without approval. Ramp it up:&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;Settings → Actions → General → Fork pull request workflows from outside collaborators&lt;/code&gt;, pick &lt;strong&gt;"Require approval for all external contributors"&lt;/strong&gt;. Every outside PR now waits for me to click "Approve and run" before its workflow touches any of my secrets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layers 8–11: the boring-but-essential ones
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CODEOWNERS&lt;/strong&gt; — auto-request my review on any PR that touches &lt;code&gt;.github/**&lt;/code&gt;, &lt;code&gt;package.json&lt;/code&gt;, or native code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dependabot for GitHub Actions&lt;/strong&gt; — by default Dependabot only monitors &lt;code&gt;npm&lt;/code&gt; / &lt;code&gt;pip&lt;/code&gt; / etc. Add &lt;code&gt;github-actions&lt;/code&gt; to &lt;code&gt;.github/dependabot.yml&lt;/code&gt; so action pins get security updates too&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;npm WebAuthn 2FA&lt;/strong&gt; — not new, but confirm your npm account uses a hardware-backed second factor, not just TOTP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;package.json overrides&lt;/strong&gt; — when a transitive dep has a security advisory that the parent hasn't fixed, force the patched version with &lt;code&gt;overrides&lt;/code&gt; instead of waiting&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What the attack surface looks like now
&lt;/h2&gt;

&lt;p&gt;For an attacker to publish malicious &lt;code&gt;safari-mcp@X.Y.Z&lt;/code&gt; to npm, they need to &lt;em&gt;simultaneously&lt;/em&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Compromise my GitHub account (phishing-resistant passkey)&lt;/li&gt;
&lt;li&gt;Bypass signed commits on &lt;code&gt;main&lt;/code&gt; (signing key on a different laptop)&lt;/li&gt;
&lt;li&gt;Either compromise the code review process or commit directly (blocked by required_signatures)&lt;/li&gt;
&lt;li&gt;Bypass the &lt;code&gt;npm-publish&lt;/code&gt; environment approval (can_admins_bypass: false — not even I can skip it)&lt;/li&gt;
&lt;li&gt;Either compromise my local keychain (for the approval click) or the npm OIDC signing key at GitHub Actions runtime&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Pre-playbook, the attacker needed: (1) my GitHub token, or (2) the NPM_TOKEN secret. One step.&lt;/p&gt;

&lt;p&gt;Post-playbook: five independent, cryptographically-bounded steps — four of which require a different device or a human decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 30-minute version
&lt;/h2&gt;

&lt;p&gt;If you're reading this and thinking "I'll do this later", here's the minimum viable version you can copy-paste &lt;em&gt;right now&lt;/em&gt; (replace &lt;code&gt;OWNER/REPO&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. SHA pinning (10s)&lt;/span&gt;
gh api &lt;span class="nt"&gt;--method&lt;/span&gt; PUT /repos/OWNER/REPO/actions/permissions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--input&lt;/span&gt; - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
{"enabled": true, "allowed_actions": "all", "sha_pinning_required": true}
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="c"&gt;# 2. Branch protection (10s)&lt;/span&gt;
gh api &lt;span class="nt"&gt;--method&lt;/span&gt; PUT /repos/OWNER/REPO/branches/main/protection &lt;span class="nt"&gt;--input&lt;/span&gt; - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
{"required_status_checks":null,"enforce_admins":false,"required_pull_request_reviews":null,"restrictions":null,"allow_force_pushes":false,"allow_deletions":false,"required_conversation_resolution":true}
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="c"&gt;# 3. Add github-actions to Dependabot (edit .github/dependabot.yml)&lt;/span&gt;
&lt;span class="c"&gt;# 4. Enable npm Trusted Publisher on npmjs.com for your package&lt;/span&gt;
&lt;span class="c"&gt;# 5. Delete NPM_TOKEN secret&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Five steps. Zero downtime. You're 70% of the way there.&lt;/p&gt;




&lt;h2&gt;
  
  
  Plug
&lt;/h2&gt;

&lt;p&gt;I do this on a project called &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;Safari MCP&lt;/a&gt; — a macOS-only MCP server that lets AI agents drive your real Safari (with all your existing logins) instead of spawning Chrome. It's MIT, runs on &lt;code&gt;npx safari-mcp&lt;/code&gt;, and has 80 tools for navigation, form fill, screenshots, and everything in between.&lt;/p&gt;

&lt;p&gt;If you're tired of Chrome DevTools MCP eating your M-series battery, give it a spin. And if you have opinions about any of the security layers above, or know one I missed, hit me up on &lt;a href="https://github.com/achiya-automation/safari-mcp/issues" rel="noopener noreferrer"&gt;GitHub Issues&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Sources and links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/achiya-automation/safari-mcp/releases/tag/v2.7.9" rel="noopener noreferrer"&gt;safari-mcp v2.7.9 release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.npmjs.com/trusted-publishers" rel="noopener noreferrer"&gt;npm Trusted Publishers docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.github.com/en/actions/concepts/deployments/deployment-environments" rel="noopener noreferrer"&gt;GitHub Actions deployment environments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://slsa.dev/" rel="noopener noreferrer"&gt;SLSA provenance v1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/tj-actions/changed-files/issues/2464" rel="noopener noreferrer"&gt;tj-actions/changed-files incident post-mortem&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>security</category>
      <category>github</category>
      <category>npm</category>
    </item>
    <item>
      <title>Why 60 seconds beats a perfect message: an automation latency study</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Sat, 11 Apr 2026 21:54:54 +0000</pubDate>
      <link>https://dev.to/achiya-automation/why-60-seconds-beats-a-perfect-message-an-automation-latency-study-3gde</link>
      <guid>https://dev.to/achiya-automation/why-60-seconds-beats-a-perfect-message-an-automation-latency-study-3gde</guid>
      <description>&lt;p&gt;A few months ago I built what looked like a textbook lead-capture workflow for a real estate company in Israel: Facebook ad → lead form → CRM → agent follow-up. Standard pipeline. The thing that surprised me about how it performed had nothing to do with the agents, the message templates, or the offer. It came down to a single variable I had not been measuring: &lt;strong&gt;time between trigger and first touch.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This post is about what that variable does and why I now think it's the first thing to optimize in any marketing automation, not the last.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;Real estate, Hebrew market, Facebook lead ads. The original pre-automation flow looked like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Prospect fills out a Facebook lead form&lt;/li&gt;
&lt;li&gt;Lead lands in the company's Facebook Lead Center&lt;/li&gt;
&lt;li&gt;Once or twice a day, an admin exports it&lt;/li&gt;
&lt;li&gt;Lead gets called by an available agent&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;End-to-end latency: &lt;strong&gt;3 to 6 hours&lt;/strong&gt; on a good day. The pattern was depressingly familiar to anyone who's looked at lead funnels — by the time the agent called, the prospect had already messaged 2-3 competing brokers, locked in a viewing with whoever responded first, and stopped picking up unknown numbers.&lt;/p&gt;

&lt;p&gt;The instinct (mine and theirs) was to fix this with better human routing: rotate agents, set up paging, train people to check more often. I've seen this approach a hundred times. It does not work. Humans are not the right tier for sub-minute response.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;The whole thing is small. Three pieces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Facebook Lead Form
       │
       ▼
[Webhook → n8n]
       │
       ├─→ WhatsApp Business API: send acknowledgment
       │
       ├─→ Monday CRM: create item with full lead data
       │
       └─→ Notify available agent (email + push)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The n8n flow has six nodes: Webhook trigger, a Set node to normalize the lead payload, a WhatsApp Cloud API node for the acknowledgment, an HTTP Request node to Monday's API, a second HTTP Request to send the agent notification, and an If node to handle the rare malformed payload.&lt;/p&gt;

&lt;p&gt;The acknowledgment template is intentionally boring:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hi {{firstName}}, thanks for your inquiry about {{property}}.
One of our agents will reach out shortly to schedule a viewing.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No personalization beyond the first name and the property they asked about. No emoji. No marketing copy. No upsell.&lt;/p&gt;
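
&lt;p&gt;For readers who think in code rather than n8n nodes, the normalize, validate, and render steps could be sketched like this. The payload shape (a &lt;code&gt;field_data&lt;/code&gt; list of name/values pairs) and the field names &lt;code&gt;first_name&lt;/code&gt;, &lt;code&gt;phone_number&lt;/code&gt;, and &lt;code&gt;property&lt;/code&gt; are assumptions for illustration; the real form fields will differ.&lt;/p&gt;

```python
REQUIRED = ("firstName", "phone", "property")


def normalize_lead(payload: dict) -> dict:
    """Flatten a list of name/values fields into one flat dict (the Set node's job)."""
    fields = {}
    for item in payload.get("field_data", []):
        values = item.get("values") or [""]
        fields[item.get("name", "")] = str(values[0]).strip()
    # Hypothetical field names; map them to the template variables.
    return {
        "firstName": fields.get("first_name", ""),
        "phone": fields.get("phone_number", ""),
        "property": fields.get("property", ""),
    }


def is_well_formed(lead: dict) -> bool:
    """The If node's check: every required field must be non-empty."""
    return all(lead.get(key) for key in REQUIRED)


def render_ack(lead: dict) -> str:
    """Fill the acknowledgment template with the two variables it uses."""
    return (
        f"Hi {lead['firstName']}, thanks for your inquiry about {lead['property']}.\n"
        "One of our agents will reach out shortly to schedule a viewing."
    )
```

&lt;p&gt;A malformed payload fails &lt;code&gt;is_well_formed&lt;/code&gt; and never reaches the send step, which mirrors the role the If node plays in the flow.&lt;/p&gt;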

&lt;h2&gt;
  
  
  What I expected vs. what happened
&lt;/h2&gt;

&lt;p&gt;I expected the automation to mostly help on the operations side — fewer dropped leads, less manual data entry, agents stop chasing stale leads. Conversion impact, I figured, would come from the human follow-up still being good.&lt;/p&gt;

&lt;p&gt;What actually happened: the response time dropped from hours to under one minute, and conversion improved significantly. What surprised me was how much of that improvement seemed to come from the automated acknowledgment alone. Prospects who got the WhatsApp ping within 60 seconds were warmer when the human agent eventually called back, and "warmer" mostly meant "still expecting our call instead of three competitors'."&lt;/p&gt;

&lt;p&gt;The acknowledgment was not a bridge to the human call. It was the moment the prospect committed to talking to &lt;em&gt;us&lt;/em&gt; instead of shopping around.&lt;/p&gt;

&lt;h2&gt;
  
  
  The model that explains it
&lt;/h2&gt;

&lt;p&gt;Here's what I now believe is happening, and it's pretty simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lead intent has a half-life of minutes, not hours.&lt;/strong&gt; When someone fills out a form, their attention is on the problem they're trying to solve right now. Every second after submission, that attention is leaking — to a competitor, to a phone notification, to dinner. By the 15-minute mark, you're competing with an entirely different mental state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An acknowledgment is not a placeholder. It's the conversion event.&lt;/strong&gt; Most marketers treat the first-touch message as "we got it, real reply coming." Prospects don't read it that way. They read it as "this business is alive and responsive." That signal — "alive and responsive" — is what makes them stop shopping. The actual conversation that follows is downstream of that decision, not the cause of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The exact words matter far less than the timing.&lt;/strong&gt; Across the workflows I've built, copy tweaks (better wording, personalization, emoji, social proof) tend to produce small incremental conversion changes. Cutting response latency from hours to minutes consistently produces much larger jumps. The latency variable seems to dwarf the copy variable in almost every test I've watched.&lt;/p&gt;

&lt;h2&gt;
  
  
  The lesson
&lt;/h2&gt;

&lt;p&gt;If you're building a marketing automation workflow, the order of operations I'd recommend is the opposite of how most teams approach it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;First&lt;/strong&gt;, measure your current trigger-to-first-response time. Not your trigger-to-human-call time. The time until &lt;em&gt;anything&lt;/em&gt; reaches the prospect. If that number is over 5 minutes, you have a latency problem and no amount of better copy will fix it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Second&lt;/strong&gt;, get a barebones acknowledgment out the door. Plain language. No personalization beyond what you can parse from the trigger payload. The goal is sub-60-second delivery.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Third&lt;/strong&gt;, instrument the gap between acknowledgment and human follow-up so you can see what conversion looks like with and without human touch. You will probably find, like I did, that the acknowledgment is doing more work than you thought.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Then&lt;/strong&gt;, and only then, start optimizing copy, personalization, and follow-up sequences.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This ordering matters because most marketing automation post-mortems end up with a list of recommendations like "rewrite the email," "add more drip steps," "personalize the subject line." Those are real levers but they're second-order. The first-order lever is latency, and almost no one is measuring it.&lt;/p&gt;

&lt;h2&gt;
  
  
  A few practical notes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Use WhatsApp Business API or another channel with native push delivery, not email. Email acknowledgments can be delayed by spam filtering or land in the junk folder, and you lose your sub-minute window.&lt;/li&gt;
&lt;li&gt;Make the trigger push, not poll. Webhooks beat scheduled exports every time.&lt;/li&gt;
&lt;li&gt;Don't put your acknowledgment behind a manual approval step. The agent does not need to review the auto-message. Trust the template.&lt;/li&gt;
&lt;li&gt;Log the trigger and acknowledgment timestamps, and the latency between them, so you can actually see when it drifts. If it ever goes above 90 seconds you have an outage even if every node is "green."&lt;/li&gt;
&lt;li&gt;If you need a CRM, use one with a public API and a real webhook surface. Most of the time spent on these workflows is fighting CRM integrations, not building the automation logic.&lt;/li&gt;
&lt;/ul&gt;
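
&lt;p&gt;To make the latency logging concrete, here is a minimal sketch of the check. The function name, the constant, and the idea of passing two ISO-8601 timestamps are my assumptions for illustration; your workflow tool will expose these values in its own way.&lt;/p&gt;

```javascript
// Hypothetical helper: the names and input shape are assumptions for illustration.
const MAX_ACK_LATENCY_MS = 90 * 1000; // the 90-second ceiling from the note above

// Both arguments are ISO-8601 timestamps: when the form fired, and when the
// acknowledgment was handed to the messaging channel.
function ackLatency(triggeredAt, acknowledgedAt) {
  const latencyMs = Date.parse(acknowledgedAt) - Date.parse(triggeredAt);
  if (Number.isNaN(latencyMs)) {
    throw new Error("Both timestamps must be valid ISO-8601 strings");
  }
  return { latencyMs, breached: latencyMs > MAX_ACK_LATENCY_MS };
}

const sample = ackLatency("2026-01-01T10:00:00Z", "2026-01-01T10:00:42Z");
console.log(sample); // { latencyMs: 42000, breached: false }
```

&lt;p&gt;Store the result with every lead, and alert on &lt;code&gt;breached&lt;/code&gt; rather than on node status, since a workflow can be fully "green" and still slow.&lt;/p&gt;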

&lt;p&gt;If you've built something similar and seen the same effect — or if you've measured the opposite and the human touch matters more than I think — I'd love to hear about it in the comments.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is one of about 50 automation projects I've built for Israeli SMBs. If you want to see more case studies, including the full pricing breakdown for similar workflows, &lt;a href="https://achiya-automation.com/blog/automatic-lead-collection-whatsapp/" rel="noopener noreferrer"&gt;there's a longer write-up on my site&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>automation</category>
      <category>n8n</category>
      <category>whatsapp</category>
      <category>marketing</category>
    </item>
    <item>
      <title>7 Things I Learned Building a Safari Browser Automation Tool That Chrome Can't Do</title>
      <dc:creator>אחיה כהן</dc:creator>
      <pubDate>Wed, 01 Apr 2026 10:42:21 +0000</pubDate>
      <link>https://dev.to/achiya-automation/7-things-i-learned-building-a-safari-browser-automation-tool-that-chrome-cant-do-2i6n</link>
      <guid>https://dev.to/achiya-automation/7-things-i-learned-building-a-safari-browser-automation-tool-that-chrome-cant-do-2i6n</guid>
      <description>&lt;p&gt;Every browser automation tool assumes you're using Chrome.&lt;/p&gt;

&lt;p&gt;Playwright? Chrome. Puppeteer? Chrome. Selenium? &lt;em&gt;Technically&lt;/em&gt; supports others, but let's be real -- Chrome. Even the new wave of AI-powered browser tools (Chrome DevTools MCP, Browserbase) are all Chromium under the hood.&lt;/p&gt;

&lt;p&gt;I use Safari as my daily browser. I have 47 tabs open right now with active sessions -- Gmail, GitHub, Ahrefs, my hosting dashboards. When I started building AI agents that needed to interact with web pages, every tool told me the same thing: "Just use Chrome."&lt;/p&gt;

&lt;p&gt;So I spent the last two weeks building &lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;Safari MCP&lt;/a&gt; -- a native Safari automation server with 80 tools, running entirely through AppleScript and JavaScript injection. No Chrome. No Puppeteer. No headless browser.&lt;/p&gt;

&lt;p&gt;Here are 7 things I learned that surprised me.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. WebKit on macOS Is Not What You Think It Is
&lt;/h2&gt;

&lt;p&gt;When Playwright says it supports WebKit, it's running a &lt;em&gt;custom build&lt;/em&gt; of WebKit in a separate process. It's WebKit the engine, not Safari the application.&lt;/p&gt;

&lt;p&gt;The real Safari on macOS runs inside the operating system's rendering pipeline. It shares resources with the window server, uses the system's DNS resolver, benefits from Apple's Intelligent Tracking Prevention, and -- this is the part that matters for automation -- it has access to your actual cookies, sessions, and logins.&lt;/p&gt;

&lt;p&gt;The practical difference: when my AI agent needs to check Google Search Console, it just... opens it. No login flow. No stored credentials. No OAuth dance. Safari already has my Google session from this morning.&lt;/p&gt;

&lt;p&gt;This is what "native" actually means. Not "runs on macOS" -- but "runs &lt;em&gt;as&lt;/em&gt; macOS."&lt;/p&gt;




&lt;h2&gt;
  
  
  2. The CPU Difference Is Real: ~60% Less Than Chrome
&lt;/h2&gt;

&lt;p&gt;I wasn't expecting this to be significant. I was wrong.&lt;/p&gt;

&lt;p&gt;Running Chrome DevTools Protocol (via Chrome DevTools MCP) on my M2 MacBook Pro, Activity Monitor showed Chrome Helper processes eating 30-45% CPU during automation tasks. The fans spun up. My laptop got hot.&lt;/p&gt;

&lt;p&gt;Safari MCP doing the same tasks: 10-15% CPU. No fan noise. The reason isn't that Safari is "more efficient" in some abstract sense -- it's that Safari is already running as my daily browser. Automating it doesn't launch a second browser alongside it, warm up a second JavaScript engine, or add DevTools protocol overhead.&lt;/p&gt;

&lt;p&gt;For AI agents that run for hours -- scraping data, filling forms, monitoring dashboards -- this isn't a nice-to-have. It's the difference between a usable laptop and a space heater.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Apple's Private Entitlement Wall (The Thing That Almost Killed the Project)
&lt;/h2&gt;

&lt;p&gt;My first approach was obvious: use Safari's Web Inspector Protocol. Safari &lt;em&gt;has&lt;/em&gt; a remote debugging protocol -- you can see it in the Develop menu. It's how Safari's built-in DevTools work.&lt;/p&gt;

&lt;p&gt;I spent days trying to connect to it programmatically. Here's what I found:&lt;/p&gt;

&lt;p&gt;Safari's Web Inspector uses an XPC service (&lt;code&gt;com.apple.WebKit.WebContent&lt;/code&gt;) that requires a &lt;strong&gt;private Apple entitlement&lt;/strong&gt; to connect. This entitlement is only granted to Apple-signed binaries -- Safari itself and Xcode's instruments.&lt;/p&gt;

&lt;p&gt;You cannot get this entitlement as a third-party developer. There's no API to request it. No workaround. Apple has deliberately locked down programmatic access to Safari's debugging protocol.&lt;/p&gt;

&lt;p&gt;This is the wall that stops every "Safari automation" attempt. It's why Selenium's Safari driver is perpetually limited. It's why no one has built a Puppeteer-for-Safari.&lt;/p&gt;

&lt;p&gt;I had to find another way in.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. AppleScript + A Swift Daemon Gets You ~5ms Per Command
&lt;/h2&gt;

&lt;p&gt;The "other way in" turned out to be hiding in plain sight: AppleScript's &lt;code&gt;do JavaScript&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight applescript"&gt;&lt;code&gt;&lt;span class="k"&gt;tell&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;application&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Safari"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;do&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;JavaScript&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"document.title"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;tab&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="na"&gt;window&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs arbitrary JavaScript in any Safari tab. It's been in macOS for over a decade. The catch: spawning &lt;code&gt;osascript&lt;/code&gt; as a subprocess takes ~80ms per call. For a single command that's fine. For an AI agent issuing hundreds of commands to fill a form, it's painfully slow.&lt;/p&gt;

&lt;p&gt;The solution: a persistent Swift daemon (&lt;code&gt;safari-helper.swift&lt;/code&gt; -- 301 lines) that keeps an &lt;code&gt;NSAppleScript&lt;/code&gt; instance alive in-process. The AI agent sends JSON lines over stdin, the daemon executes them and returns results.&lt;/p&gt;

&lt;p&gt;Result: &lt;strong&gt;~5ms per command&lt;/strong&gt; instead of ~80ms. A 16x speedup from a 301-line Swift file.&lt;/p&gt;
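
&lt;p&gt;The JSON-lines framing on the Node side can be sketched roughly like this. The message fields (&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;script&lt;/code&gt;, &lt;code&gt;result&lt;/code&gt;) and both function names are assumptions for illustration, not the daemon's real wire format. The point is the buffering: a chunk read from a pipe can end in the middle of a message, so replies are only parsed once a full line has arrived.&lt;/p&gt;

```javascript
// Hypothetical sketch: the message fields ("id", "script", "result") are
// assumptions, not the daemon's real wire format.

// Encode one command as a single newline-terminated JSON line.
function encodeCommand(id, script) {
  return JSON.stringify({ id, script }) + "\n";
}

// Build a decoder that buffers output chunks and emits one parsed object per
// complete line, since a chunk can end in the middle of a message.
function createLineDecoder(onMessage) {
  let buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    let newlineAt = buffer.indexOf("\n");
    while (newlineAt !== -1) {
      const line = buffer.slice(0, newlineAt).trim();
      buffer = buffer.slice(newlineAt + 1);
      if (line !== "") onMessage(JSON.parse(line));
      newlineAt = buffer.indexOf("\n");
    }
  };
}

const replies = [];
const feed = createLineDecoder((message) => replies.push(message));
feed('{"id":1,"result":"Exam'); // first half of a reply
feed('ple Domain"}\n{"id":2,"result":null}\n'); // the rest, plus a second reply
console.log(encodeCommand(3, "document.title"), replies);
```

&lt;p&gt;In a real setup, &lt;code&gt;feed&lt;/code&gt; would be attached to the long-lived child process's stdout and the output of &lt;code&gt;encodeCommand&lt;/code&gt; written to its stdin, so the process is spawned once instead of once per command.&lt;/p&gt;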

&lt;p&gt;The entire codebase is ~6,000 lines across 4 files. Two production dependencies (&lt;code&gt;@modelcontextprotocol/sdk&lt;/code&gt; for the MCP protocol, &lt;code&gt;ws&lt;/code&gt; for the optional Extension WebSocket). That's it.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. The Tab Ownership Problem (AI Agents Will Destroy Your Work)
&lt;/h2&gt;

&lt;p&gt;This one cost me a full day of lost work before I solved it.&lt;/p&gt;

&lt;p&gt;Here's the scenario: you're writing an email in Safari tab 3. Your AI agent is automating something in tab 5. The agent needs to navigate somewhere -- and it navigates &lt;em&gt;in your tab 3&lt;/em&gt; instead. Your half-written email is gone. The form state is destroyed. There's no undo.&lt;/p&gt;

&lt;p&gt;This happens because tab indices shift. You close a tab, every index after it changes. The agent cached "my tab is index 5" but now it's index 4, and your email tab is 5.&lt;/p&gt;

&lt;p&gt;My solution: &lt;strong&gt;tab tracking by URL&lt;/strong&gt;. Every command resolves the target tab by its URL, not its index. If the URL doesn't match, the command fails safely instead of hitting the wrong tab. The agent maintains a list of tabs it opened (via &lt;code&gt;safari_new_tab&lt;/code&gt;) and refuses to touch any tab it didn't create.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Before EVERY command:
1. Resolve tab by URL, not cached index
2. Verify this is a tab the agent opened
3. If mismatch -&amp;gt; fail safely, never navigate in user's tab
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
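
&lt;p&gt;In JavaScript, the rule above might look something like the following sketch. The tab shape (&lt;code&gt;{ index, url }&lt;/code&gt;) and the names &lt;code&gt;createTabGuard&lt;/code&gt;, &lt;code&gt;claim&lt;/code&gt;, and &lt;code&gt;resolve&lt;/code&gt; are hypothetical, chosen for illustration rather than taken from Safari MCP's internals.&lt;/p&gt;

```javascript
// Hypothetical sketch: the tab shape ({ index, url }) and function names are
// assumptions, not Safari MCP's actual internals.
function createTabGuard() {
  const ownedUrls = new Set(); // URLs of tabs the agent opened itself

  return {
    // Call when the agent opens a tab, or after it navigates its own tab.
    claim(url) {
      ownedUrls.add(url);
    },
    // Resolve the target from a fresh tab listing on every command.
    resolve(currentTabs, targetUrl) {
      if (!ownedUrls.has(targetUrl)) {
        throw new Error("Refusing to touch a tab the agent did not open: " + targetUrl);
      }
      const match = currentTabs.find((tab) => tab.url === targetUrl);
      if (match === undefined) {
        throw new Error("Owned tab not found, failing safely: " + targetUrl);
      }
      return match.index; // only valid for this one command
    },
  };
}

const guard = createTabGuard();
guard.claim("https://example.com/agent");

// The user closed a tab, so the agent's tab moved from index 5 to index 4.
const tabsNow = [
  { index: 4, url: "https://example.com/agent" },
  { index: 5, url: "https://mail.example.com/draft" },
];
console.log(guard.resolve(tabsNow, "https://example.com/agent")); // 4
```

&lt;p&gt;One limitation of this sketch: URL matching alone cannot tell apart two tabs that are open on the same URL, so a production version needs an extra way to disambiguate them.&lt;/p&gt;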



&lt;p&gt;I added this as a hard rule in my MCP configuration: the AI agent must call &lt;code&gt;safari_list_tabs&lt;/code&gt; at the start of every session, track which tabs it opens, and verify ownership before every interaction.&lt;/p&gt;

&lt;p&gt;It sounds paranoid. It is paranoid. And it's the only way to safely share a browser between a human and an AI agent.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Shadow DOM, React State, and CSP -- Why I Had to Build a Safari Extension
&lt;/h2&gt;

&lt;p&gt;AppleScript's &lt;code&gt;do JavaScript&lt;/code&gt; is powerful, but it runs in the page's JavaScript context. Three things break it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Closed Shadow DOM.&lt;/strong&gt; Reddit, many web components, and design systems use &lt;code&gt;mode: 'closed'&lt;/code&gt; shadow roots. JavaScript running in the page context literally cannot see inside them -- &lt;code&gt;element.shadowRoot&lt;/code&gt; returns &lt;code&gt;null&lt;/code&gt;. The only way in is through a browser extension's content script, which has access to the internal shadow tree.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;React's internal value tracker.&lt;/strong&gt; If you set &lt;code&gt;input.value = 'hello'&lt;/code&gt; on a React-controlled input, React ignores it. React tracks the "last known value" via an internal &lt;code&gt;_valueTracker&lt;/code&gt; property on the DOM element. You have to reset this tracker &lt;em&gt;before&lt;/em&gt; dispatching the input event, or React's synthetic event system thinks nothing changed. I learned this the hard way on LinkedIn, where every form is React-controlled.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// The hack that makes React forms work:&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tracker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_valueTracker&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tracker&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;tracker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hello&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispatchEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;bubbles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Content Security Policy.&lt;/strong&gt; Strict CSP headers block dynamic code execution and inline scripts. Some sites (banking, enterprise tools) restrict even more aggressively. The Extension runs in the &lt;code&gt;MAIN&lt;/code&gt; world with elevated privileges, bypassing CSP restrictions that would block AppleScript-injected JavaScript.&lt;/p&gt;

&lt;p&gt;This led to the &lt;strong&gt;dual-engine architecture&lt;/strong&gt;: the Safari Extension handles modern SPAs and CSP-strict sites (5-20ms per command via HTTP polling), while AppleScript handles everything else (~5ms via the Swift daemon). The system automatically falls back between them.&lt;/p&gt;
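
&lt;p&gt;The fallback idea can be sketched as a small dispatcher. Everything here is hypothetical: &lt;code&gt;runWithFallback&lt;/code&gt; and the two stand-in engine objects are illustrations, not Safari MCP's real API, and the order in which engines are tried is simply whatever order the caller passes in.&lt;/p&gt;

```javascript
// Hypothetical sketch: both engine objects are stand-ins, not real Safari MCP APIs.
async function runWithFallback(command, engines) {
  const failures = [];
  for (const engine of engines) {
    try {
      return { engine: engine.name, result: await engine.run(command) };
    } catch (error) {
      failures.push(engine.name + ": " + error.message); // try the next engine
    }
  }
  throw new Error("All engines failed: " + failures.join("; "));
}

// Stand-in engines for the demo: the extension is "disconnected", AppleScript works.
const extensionEngine = {
  name: "extension",
  run: async () => {
    throw new Error("extension not connected");
  },
};
const appleScriptEngine = {
  name: "applescript",
  run: async (command) => "ran: " + command,
};

runWithFallback("document.title", [extensionEngine, appleScriptEngine])
  .then((outcome) => console.log(outcome));
```

&lt;p&gt;Collecting every failure message before giving up makes it easier to see later why a command ended up on the slower path.&lt;/p&gt;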




&lt;h2&gt;
  
  
  7. CGEvent Window Targeting -- Clicking Without Stealing Focus
&lt;/h2&gt;

&lt;p&gt;The final boss: some sites (Airtable, complex React apps) don't respond to synthetic JavaScript clicks. They check &lt;code&gt;event.isTrusted&lt;/code&gt; -- a read-only property that's &lt;code&gt;true&lt;/code&gt; only for events generated by the OS, not by JavaScript.&lt;/p&gt;
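
&lt;p&gt;Here is a small illustration of why synthetic clicks get ignored. It uses a bare &lt;code&gt;EventTarget&lt;/code&gt; as a stand-in for a real button so it runs in Node as well as in a browser; the handler is my guess at the kind of check such sites perform, not code taken from any of them.&lt;/p&gt;

```javascript
// Node 15+ and browsers both ship Event and EventTarget as globals.
const button = new EventTarget(); // stand-in for a real DOM button
const handled = [];

// A handler of the kind described above: ignore anything script-generated.
button.addEventListener("click", (event) => {
  handled.push(event.isTrusted ? "accepted" : "ignored");
});

// An event constructed and dispatched from JavaScript is never trusted.
button.dispatchEvent(new Event("click"));
console.log(handled); // [ 'ignored' ]
```

&lt;p&gt;Because &lt;code&gt;isTrusted&lt;/code&gt; is read-only, page-level JavaScript has no way to flip it, which is why the click has to come from outside the page.&lt;/p&gt;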

&lt;p&gt;The obvious solution -- simulate a real mouse click via macOS accessibility APIs -- has a nasty side effect: it moves your physical cursor and brings Safari to the foreground. If you're typing in VS Code while your agent works, suddenly your cursor jumps and Safari appears on top.&lt;/p&gt;

&lt;p&gt;The fix lives in &lt;code&gt;safari-helper.swift&lt;/code&gt; and uses a largely undocumented CGEvent feature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="c1"&gt;// CGEventField 91 = kCGMouseEventWindowUnderMousePointer&lt;/span&gt;
&lt;span class="c1"&gt;// (not in Apple's public headers)&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;kWindowField&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;CGEventField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;rawValue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;91&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;
&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setIntegerValueField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kWindowField&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;windowId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By setting the window ID on the CGEvent, the click is delivered directly to Safari's window -- without moving the mouse cursor, without activating the window, without stealing focus. The event registers as &lt;code&gt;isTrusted: true&lt;/code&gt; in the browser.&lt;/p&gt;

&lt;p&gt;This field isn't in Apple's public CGEvent documentation. I found it by reading Chromium's source code (they use the same trick for their own window targeting) and then confirmed the raw field numbers work on macOS Sequoia.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Result
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/achiya-automation/safari-mcp" rel="noopener noreferrer"&gt;Safari MCP&lt;/a&gt; is open source (MIT), installs via &lt;code&gt;npm install -g safari-mcp&lt;/code&gt; or Homebrew, and works with Claude Code, Claude Desktop, Cursor, Windsurf, and any MCP-compatible client.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;By the numbers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;80 tools (navigation, forms, screenshots, network mocking, cookies, accessibility, and more)&lt;/li&gt;
&lt;li&gt;~5ms per command via the persistent Swift daemon&lt;/li&gt;
&lt;li&gt;~6,000 lines of code across 4 files&lt;/li&gt;
&lt;li&gt;2 production dependencies&lt;/li&gt;
&lt;li&gt;~60% less CPU than Chrome-based alternatives&lt;/li&gt;
&lt;li&gt;MIT license&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I use it daily at &lt;a href="https://achiya-automation.com" rel="noopener noreferrer"&gt;Achiya Automation&lt;/a&gt; for everything from filling client forms to monitoring dashboards to running SEO audits -- all without Chrome ever touching my system.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;If I started over, I'd skip the XPC/Web Inspector rabbit hole entirely and go straight to AppleScript + Extension. I lost three days on the private entitlement wall before accepting it wasn't going to work.&lt;/p&gt;

&lt;p&gt;I'd also build tab tracking from day one. The "navigate in the wrong tab" disaster happened because I treated it as an edge case. It's not an edge case -- it's the &lt;em&gt;default&lt;/em&gt; failure mode when an AI agent shares a browser with a human.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Question I Keep Coming Back To
&lt;/h2&gt;

&lt;p&gt;Every MCP server I've seen for browser automation is Chrome-first. Playwright MCP, Chrome DevTools MCP, Browserbase -- all Chromium.&lt;/p&gt;

&lt;p&gt;But most Mac developers I know use Safari as their daily browser. And the AI agent use case is fundamentally different from testing: you're not running in CI, you're running on &lt;em&gt;your&lt;/em&gt; machine, with &lt;em&gt;your&lt;/em&gt; sessions, while &lt;em&gt;you're&lt;/em&gt; actively working.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For those of you building AI agents that interact with browsers: what's the biggest pain point you've hit with focus stealing, session management, or CPU overhead -- and did you solve it, or just live with Chrome eating your battery?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I'm genuinely curious whether the "just use headless Chrome" consensus holds when the agent runs on your personal laptop for 8 hours a day.&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>javascript</category>
      <category>webdev</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
