<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Michael Rakutko</title>
    <description>The latest articles on DEV Community by Michael Rakutko (@michael_rakutko).</description>
    <link>https://dev.to/michael_rakutko</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3776252%2Ffaac5186-0fc9-4cf8-80bf-be5f515edec3.png</url>
      <title>DEV Community: Michael Rakutko</title>
      <link>https://dev.to/michael_rakutko</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/michael_rakutko"/>
    <language>en</language>
    <item>
      <title>Building in n8n with Claude</title>
      <dc:creator>Michael Rakutko</dc:creator>
      <pubDate>Sat, 04 Apr 2026 12:37:41 +0000</pubDate>
      <link>https://dev.to/michael_rakutko/building-in-n8n-with-claude-l54</link>
      <guid>https://dev.to/michael_rakutko/building-in-n8n-with-claude-l54</guid>
      <description>&lt;p&gt;n8n raised $180M at a $2.5B valuation last October. Their pitch calls it an "AI-first automation platform," and founder Jan Oberhauser describes it as "the Excel of AI."&lt;/p&gt;

&lt;p&gt;I've always been a "code-first" guy. But with the ecosystem shifting toward n8n as the "brain" for AI automations, I wanted to see if it's a legitimate production tool or just a fancy playground for drawing boxes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub: &lt;a href="https://github.com/r-ms/n8n-mcp" rel="noopener noreferrer"&gt;r-ms/n8n-mcp&lt;/a&gt;&lt;/strong&gt; | 20 tools | MIT | Claude Code / Desktop / Cursor&lt;/p&gt;

&lt;h2&gt;The use case&lt;/h2&gt;

&lt;p&gt;I follow ~30 YouTube channels on AI research and engineering. 90% of uploads are fluff. I needed a system that monitors channels, extracts transcripts, scores relevance with an LLM, and delivers a 30-second brief to Telegram every morning.&lt;/p&gt;

&lt;h2&gt;Why n8n and not a Python script?&lt;/h2&gt;

&lt;p&gt;Sure, Claude writes the script in 20 minutes. But then you need monitoring, alerting, logging, restart logic, state management. Claude can write all of that too — but now you're spending hours on infrastructure instead of the product.&lt;/p&gt;

&lt;p&gt;n8n solves the ops around the code: visual execution traces (what went into the LLM, what came out), OAuth/retry/state management out of the box.&lt;/p&gt;

&lt;h2&gt;The "UI Gap"&lt;/h2&gt;

&lt;p&gt;I wanted Claude to build for me via n8n's API. But n8n's UI does a massive amount of invisible heavy lifting the API doesn't:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Missing Defaults:&lt;/strong&gt; Code Node v2 requires a &lt;code&gt;language&lt;/code&gt; parameter the UI sets automatically. Omit it via API — silent break.&lt;/p&gt;
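&lt;p&gt;For illustration, a minimal Code node payload with the parameter filled in. Field names follow n8n's workflow JSON as I understand it; treat the exact values as an assumption:&lt;/p&gt;

```json
{
  "type": "n8n-nodes-base.code",
  "typeVersion": 2,
  "parameters": {
    "language": "javaScript",
    "jsCode": "return items;"
  }
}
```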

&lt;p&gt;&lt;strong&gt;Version Drift:&lt;/strong&gt; &lt;code&gt;SplitInBatches&lt;/code&gt; v3 swapped its output ports. Wrong version = infinite loop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Trailing Space:&lt;/strong&gt; Two hours debugging a 404. The API had created a webhook URL with a trailing space from my prompt. The UI would have trimmed it.&lt;/p&gt;
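&lt;p&gt;The fix is trivial once you know it's needed. A hypothetical pre-flight guard (the helper name is mine, not n8n's):&lt;/p&gt;

```python
def normalize_webhook_path(path: str) -> str:
    """Trim whitespace and slashes the n8n UI strips for you but the raw API keeps."""
    return path.strip().strip("/")

# The trailing space that cost two hours:
cleaned = normalize_webhook_path("youtube-monitor ")
```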

&lt;h2&gt;The MCP&lt;/h2&gt;

&lt;p&gt;I built an MCP to bridge the gap. 20 tools for n8n, plus a know-how database and auto-fix rules. When Claude creates a workflow, the MCP intercepts it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Validates node versions&lt;/li&gt;
&lt;li&gt;Auto-injects missing UI-default parameters&lt;/li&gt;
&lt;li&gt;Fixes naming conventions before they break webhook routing&lt;/li&gt;
&lt;/ul&gt;
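&lt;p&gt;As a sketch, the defaults-injection step is just a table of (node type, version) pairs mapped to the parameters the UI would have set. Everything here is illustrative; the real rules live in the repo:&lt;/p&gt;

```python
# Illustrative defaults table; the real auto-fix rules live in the n8n-mcp repo.
UI_DEFAULTS = {
    ("n8n-nodes-base.code", 2): {"language": "javaScript"},
}

def inject_ui_defaults(workflow: dict) -> dict:
    """Add any UI-default parameter the caller omitted, in place."""
    for node in workflow.get("nodes", []):
        key = (node.get("type"), node.get("typeVersion"))
        for param, value in UI_DEFAULTS.get(key, {}).items():
            node.setdefault("parameters", {}).setdefault(param, value)
    return workflow

wf = {"nodes": [{"type": "n8n-nodes-base.code", "typeVersion": 2, "parameters": {}}]}
fixed = inject_ui_defaults(wf)
```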

&lt;p&gt;Claude gets the full context of the n8n instance and debugs execution errors by reading JSON logs directly.&lt;/p&gt;

&lt;p&gt;The full list of 7 auto-fix rules and 11 know-how entries is in the &lt;a href="https://github.com/r-ms/n8n-mcp#pre-flight-validation" rel="noopener noreferrer"&gt;README&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;From Low-Code to Agentic-Code&lt;/h2&gt;

&lt;p&gt;The canvas is becoming a debugger, not an editor.&lt;/p&gt;

&lt;p&gt;Drawing lines between boxes is just another way of manually managing complexity — with a mouse instead of a keyboard. In the Agentic era, humans manage the intent. The agent manages the structural complexity. n8n's role shifts from "design tool" to "reliable runtime" — the engine that ensures intent is executed and logged.&lt;/p&gt;

&lt;p&gt;The visual canvas remains, but as an audit trail. It's where you verify what the agent built, not where you build.&lt;/p&gt;

&lt;h2&gt;Try it&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/r-ms/n8n-mcp.git
&lt;span class="nb"&gt;cd &lt;/span&gt;n8n-mcp &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm run build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"n8n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/path/to/n8n-mcp/dist/index.js"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"N8N_API_URL"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:5678"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"N8N_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-api-key"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tell Claude: "Build me a YouTube monitor in n8n." If you've spent hours debugging an n8n quirk that should have just worked — &lt;a href="https://github.com/r-ms/n8n-mcp/issues" rel="noopener noreferrer"&gt;open an issue&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>claude</category>
      <category>n8n</category>
      <category>ai</category>
      <category>mcp</category>
    </item>
    <item>
      <title>How Claude Code tracks your coding sessions</title>
      <dc:creator>Michael Rakutko</dc:creator>
      <pubDate>Wed, 01 Apr 2026 19:45:01 +0000</pubDate>
      <link>https://dev.to/michael_rakutko/how-claude-code-tracks-your-coding-sessions-30l5</link>
      <guid>https://dev.to/michael_rakutko/how-claude-code-tracks-your-coding-sessions-30l5</guid>
      <description>&lt;p&gt;As a Head of Analytics, I build tracking systems for a living. So at some point the obvious question hit me: how does my own tool track me?&lt;/p&gt;

&lt;p&gt;I decompiled the Claude Code CLI binary and cross-checked it against &lt;a href="https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked-heres-what-we-know" rel="noopener noreferrer"&gt;source code Anthropic accidentally leaked via npm&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Your prompts aren't being exfiltrated. Your code stays local. But there's a regex that flags when you swear, 40 background LLM calls you never see, a remote flag that can change what gets collected without asking, and &lt;code&gt;DO_NOT_TRACK=1&lt;/code&gt; is silently ignored.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Add to ~/.zshrc if you want to opt out:&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;4 services your CLI talks to&lt;/h2&gt;

&lt;p&gt;Every time you run &lt;code&gt;claude&lt;/code&gt;, your terminal opens connections to four external services:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;Can you turn it off?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;GrowthBook&lt;/strong&gt; (via api.anthropic.com)&lt;/td&gt;
&lt;td&gt;Feature flags, A/B tests&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Datadog&lt;/strong&gt; (datadoghq.com)&lt;/td&gt;
&lt;td&gt;Ops monitoring. ~44 whitelisted events, feature-flagged off by default&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Anthropic OTEL&lt;/strong&gt; (api.anthropic.com)&lt;/td&gt;
&lt;td&gt;First-party OpenTelemetry logs — this is where almost everything goes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Anthropic Metrics&lt;/strong&gt; (api.anthropic.com)&lt;/td&gt;
&lt;td&gt;OTEL counters and histograms for BigQuery&lt;/td&gt;
&lt;td&gt;Org-level opt-in only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Three of the four endpoints are Anthropic's own servers. The only third-party service is Datadog, and it's gated behind a feature flag that's off by default. Anthropic can flip it on server-side for any user or cohort through GrowthBook targeting — there's no &lt;code&gt;@anthropic.com&lt;/code&gt; check in the code; the restriction is purely server-side.&lt;/p&gt;

&lt;h2&gt;What gets tracked: 838 event types&lt;/h2&gt;

&lt;p&gt;All events go to Anthropic's OTEL endpoint (service #3 above). ~44 of them also go to Datadog if the feature flag is on. Every event is prefixed with &lt;code&gt;tengu_&lt;/code&gt; — probably an internal codename. 838 distinct event types, covering every interaction you have with the tool. The number is high because each flow is tracked at every step — OAuth token refresh alone is 7 separate events (&lt;code&gt;_starting&lt;/code&gt;, &lt;code&gt;_lock_acquiring&lt;/code&gt;, &lt;code&gt;_acquired&lt;/code&gt;, &lt;code&gt;_completed&lt;/code&gt;, &lt;code&gt;_success&lt;/code&gt;, &lt;code&gt;_failure&lt;/code&gt;, &lt;code&gt;_released&lt;/code&gt;). Multiply that by every feature and it adds up fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API &amp;amp; Model&lt;/strong&gt; — every request to Claude: model, tokens, cost in USD, latency, fallbacks, refusals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User input&lt;/strong&gt; — every prompt fires &lt;code&gt;tengu_input_prompt&lt;/code&gt;. Not the text itself (more on that below), but metadata: was it negative? Was it "keep going"? Single word?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tools&lt;/strong&gt; — every tool call: name, duration, result size. For bash commands, the first word of your command is sent raw — &lt;code&gt;./deploy-prod.sh&lt;/code&gt; goes as-is, not sanitized to "bash" or "other".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Files&lt;/strong&gt; — &lt;code&gt;tengu_file_operation&lt;/code&gt; on every read/write/edit. SHA256 hash of the file path (first 16 chars) and SHA256 of the content. Not the actual path or content. But the hashes are deterministic — same file, same hash. They can tell you keep editing the same file without knowing which one.&lt;/p&gt;
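&lt;p&gt;A minimal reimplementation of that fingerprinting scheme (the exact encoding and any path normalization in the binary are assumptions):&lt;/p&gt;

```python
import hashlib

def hash_path(path: str) -> str:
    """SHA256 of the path, truncated to the first 16 hex characters."""
    return hashlib.sha256(path.encode()).hexdigest()[:16]

# Deterministic: the same file always produces the same fingerprint,
# so repeat edits are correlatable without revealing the path itself.
a = hash_path("/home/me/project/main.py")
b = hash_path("/home/me/project/main.py")
```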

&lt;p&gt;&lt;strong&gt;MCP&lt;/strong&gt; — server connections, tool calls, errors. MCP server URLs are sent in cleartext. I'll come back to this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sessions&lt;/strong&gt; — init, exit, resume, fork, compact, memory access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remote sessions&lt;/strong&gt; — ~40 &lt;code&gt;tengu_bridge_*&lt;/code&gt; events for WebSocket infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Voice&lt;/strong&gt; — recording start/stop, transcription metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Team memory&lt;/strong&gt; — sync push/pull, secret skipping, entry limits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auto-dream&lt;/strong&gt; — background memory consolidation events.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scheduled tasks&lt;/strong&gt; — &lt;code&gt;tengu_kairos_*&lt;/code&gt; for cron-based agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agents&lt;/strong&gt; — creation, model used, prompt length, response length, tool uses, duration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Permissions&lt;/strong&gt; — every dialog: shown, accepted, rejected, escaped. Every config change: setting name and value.&lt;/p&gt;

&lt;p&gt;At exit, &lt;code&gt;tengu_exit&lt;/code&gt; sends a session summary: cost in USD, lines added/removed, total tokens, duration, UI performance metrics. No conversation content.&lt;/p&gt;

&lt;h2&gt;The swearing detector&lt;/h2&gt;

&lt;p&gt;Every prompt you type gets run through this regex:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;QaK&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\b(&lt;/span&gt;&lt;span class="sr"&gt;wtf|wth|ffs|omfg|shit&lt;/span&gt;&lt;span class="se"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;ty|tiest&lt;/span&gt;&lt;span class="se"&gt;)?&lt;/span&gt;&lt;span class="sr"&gt;|dumbass|horrible|awful&lt;/span&gt;&lt;span class="err"&gt;|
&lt;/span&gt;    &lt;span class="nf"&gt;piss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ed&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;ing&lt;/span&gt;&lt;span class="p"&gt;)?&lt;/span&gt; &lt;span class="nx"&gt;off&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;piece&lt;/span&gt; &lt;span class="k"&gt;of &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;shit&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;crap&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;junk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;what&lt;/span&gt; &lt;span class="nf"&gt;the &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fuck&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;hell&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="nx"&gt;fucking&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;broken&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;useless&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;terrible&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;awful&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;horrible&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;fuck&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="nf"&gt;screw &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;you&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;so&lt;/span&gt; &lt;span class="nx"&gt;frustrating&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt; &lt;span class="nx"&gt;sucks&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;damn&lt;/span&gt; &lt;span class="nx"&gt;it&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Result: &lt;code&gt;is_negative: true&lt;/code&gt; in &lt;code&gt;tengu_input_prompt&lt;/code&gt;. Just the boolean, not your words. There's also a "keep going" detector — fires &lt;code&gt;is_keep_going: true&lt;/code&gt; when you type "continue", "keep going", or "go on".&lt;/p&gt;

&lt;p&gt;If users are swearing, something's broken. If users keep saying "continue", the model stops too early. Proxy metrics for product quality. I've built similar things myself.&lt;/p&gt;
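&lt;p&gt;On the analytics side, those booleans roll up into simple proxy metrics. A sketch of the kind of aggregation I'd run over &lt;code&gt;tengu_input_prompt&lt;/code&gt; events (hypothetical, not Anthropic's code):&lt;/p&gt;

```python
def frustration_rate(events: list) -> float:
    """Share of prompts flagged is_negative: a proxy for 'something is broken'."""
    if not events:
        return 0.0
    flagged = sum(1 for e in events if e.get("is_negative"))
    return flagged / len(events)

sample = [{"is_negative": True}, {"is_negative": False},
          {"is_negative": False}, {"is_negative": False}]
```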

&lt;h2&gt;Facet extraction: local session analysis&lt;/h2&gt;

&lt;p&gt;After a session ends, Claude Code can run a full LLM-based analysis and extract structured "facets":&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;What it measures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Goal (13 types)&lt;/td&gt;
&lt;td&gt;debug, implement feature, fix bug, write tests, deploy, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Satisfaction (8 levels)&lt;/td&gt;
&lt;td&gt;frustrated → dissatisfied → neutral → ... → delighted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Friction (11 types)&lt;/td&gt;
&lt;td&gt;misunderstood request, wrong approach, buggy code, user rejected action, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Outcome (5 levels)&lt;/td&gt;
&lt;td&gt;fully achieved → not achieved&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Helpfulness (5 levels)&lt;/td&gt;
&lt;td&gt;unhelpful → essential&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Plus &lt;code&gt;underlying_goal&lt;/code&gt;, &lt;code&gt;brief_summary&lt;/code&gt;, &lt;code&gt;primary_success&lt;/code&gt;, &lt;code&gt;primary_friction&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This only runs when you type &lt;code&gt;/insights&lt;/code&gt;, not automatically. Facets are saved locally to &lt;code&gt;~/.claude/usage-data/session-meta/{session_id}.json&lt;/code&gt; and are not sent anywhere. There are no &lt;code&gt;tengu_facet*&lt;/code&gt; or &lt;code&gt;tengu_insights*&lt;/code&gt; events in the codebase. The data stays on your machine.&lt;/p&gt;

&lt;h2&gt;40 hidden LLM calls you never asked for&lt;/h2&gt;

&lt;p&gt;Besides the main model, Claude Code has 40 different types of background LLM calls — mostly to &lt;code&gt;claude-haiku-4-5&lt;/code&gt; — for things like extracting bash command prefixes, generating terminal titles, compressing context, and auto-extracting memories. Which ones fire depends on what you're doing. Not tracking per se, but your content goes to Anthropic's API either way.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;What it sends to Haiku&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Bash prefix extraction&lt;/td&gt;
&lt;td&gt;Your full bash command&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Tool use summary (status bar)&lt;/td&gt;
&lt;td&gt;Tool inputs/outputs (300 chars)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Web fetch processing&lt;/td&gt;
&lt;td&gt;Web page content (up to 100K chars)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Worktree title generation&lt;/td&gt;
&lt;td&gt;Task description&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Bug report formatting&lt;/td&gt;
&lt;td&gt;Your bug report text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Prompt suggestion&lt;/td&gt;
&lt;td&gt;Full conversation context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Compact (context compression)&lt;/td&gt;
&lt;td&gt;Your full conversation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Side question (/btw)&lt;/td&gt;
&lt;td&gt;Your question&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Session memory&lt;/td&gt;
&lt;td&gt;Full conversation + MEMORY.md&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Hook evaluation&lt;/td&gt;
&lt;td&gt;Conversation + hook condition&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Speculation (pre-computation)&lt;/td&gt;
&lt;td&gt;Full context. Ant-only (disabled for external users)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Magic docs generation&lt;/td&gt;
&lt;td&gt;File path + content&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Agent creation&lt;/td&gt;
&lt;td&gt;Agent description&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Agent summary&lt;/td&gt;
&lt;td&gt;Agent work results&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Custom agent&lt;/td&gt;
&lt;td&gt;Custom agent context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Auto-dream&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Session transcript&lt;/strong&gt; — background memory consolidation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Auto-mode classifier&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Tool call + user messages only&lt;/strong&gt; — decides whether to auto-approve. Uses the &lt;em&gt;main model&lt;/em&gt;, not Haiku&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;Auto-mode critique&lt;/td&gt;
&lt;td&gt;Auto-mode rules analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;Buddy companion&lt;/td&gt;
&lt;td&gt;Generates a virtual terminal pet (name, species, personality). temperature=1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;Extract memories&lt;/td&gt;
&lt;td&gt;Full conversation — background auto-extraction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt;Generate session title&lt;/td&gt;
&lt;td&gt;Your prompt text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;td&gt;Hook agent&lt;/td&gt;
&lt;td&gt;Context + hook config (up to 50 turns)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Insights&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Multiple session transcripts&lt;/strong&gt; — facet extraction, report generation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;MCP datetime parse&lt;/td&gt;
&lt;td&gt;Datetime string&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;25&lt;/td&gt;
&lt;td&gt;Memory directory relevance&lt;/td&gt;
&lt;td&gt;Memory metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;Model validation&lt;/td&gt;
&lt;td&gt;Model info&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;Permission explainer&lt;/td&gt;
&lt;td&gt;Command + context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;28&lt;/td&gt;
&lt;td&gt;Rename generation&lt;/td&gt;
&lt;td&gt;Context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;td&gt;SDK&lt;/td&gt;
&lt;td&gt;SDK/programmatic API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;Session search&lt;/td&gt;
&lt;td&gt;Session metadata (titles, first 300 chars)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;31&lt;/td&gt;
&lt;td&gt;Skill improvement&lt;/td&gt;
&lt;td&gt;Skill data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;32&lt;/td&gt;
&lt;td&gt;Web search&lt;/td&gt;
&lt;td&gt;Search query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;33&lt;/td&gt;
&lt;td&gt;Away summary&lt;/td&gt;
&lt;td&gt;Last 30 messages + session memory — "while you were away" recap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;td&gt;Chrome MCP&lt;/td&gt;
&lt;td&gt;Chrome bridge tool calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;35&lt;/td&gt;
&lt;td&gt;Fork agent&lt;/td&gt;
&lt;td&gt;Worktree agent context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;36&lt;/td&gt;
&lt;td&gt;Session notes&lt;/td&gt;
&lt;td&gt;Session-level memory (separate from extract_memories)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;37&lt;/td&gt;
&lt;td&gt;REPL main thread&lt;/td&gt;
&lt;td&gt;Main REPL loop context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;38&lt;/td&gt;
&lt;td&gt;Auto-mode critique (user rules)&lt;/td&gt;
&lt;td&gt;Validation of user-defined auto-mode rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;39&lt;/td&gt;
&lt;td&gt;Teleport title&lt;/td&gt;
&lt;td&gt;Teleport title generation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;td&gt;Rename&lt;/td&gt;
&lt;td&gt;Session rename context&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A few of these are worth pausing on. Auto-dream runs in the background, reads your session transcripts, and synthesizes durable memories through four phases: Orient → Review → Consolidate → Housekeep. The auto-mode classifier is interesting for a different reason: it deliberately excludes model responses from the transcript it analyzes. A comment in the source reads &lt;em&gt;"assistant text is model-authored and could be crafted to influence the classifier's decision"&lt;/em&gt; — anti-prompt-injection by design. And yes, there's a side-call that generates a virtual terminal pet with a random personality.&lt;/p&gt;

&lt;p&gt;Some side-calls are restricted to Anthropic employees (&lt;code&gt;USER_TYPE === 'ant'&lt;/code&gt;): speculation (pre-computing responses with a copy-on-write filesystem overlay) and frustration-triggered transcript sharing. For external users, those code paths are replaced with no-ops.&lt;/p&gt;

&lt;p&gt;You can override the model with &lt;code&gt;ANTHROPIC_SMALL_FAST_MODEL&lt;/code&gt;, but you can't turn these calls off without losing the features they power.&lt;/p&gt;

&lt;h2&gt;The data flow&lt;/h2&gt;

&lt;p&gt;What happens when you type a prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You type a prompt
        |
        |-- regex QaK() --&amp;gt; is_negative: bool --------+
        |-- regex daK() --&amp;gt; is_keep_going: bool ------+
        |-- prompt length --&amp;gt; prompt_length -----------+
        |-- r_1(prompt) --&amp;gt; "&amp;lt;REDACTED&amp;gt;" (default) ---+
        |                                              |
        |   +------------------------------------------+
        |   |
        |   v
        |   tengu_input_prompt event
        |   |
        |   |-- OTEL 1P  --&amp;gt; api.anthropic.com/api/event_logging/batch
        |   +-- Datadog   --&amp;gt; datadoghq.com      [if flag on + whitelist]
        |
        |-- Anthropic API (main Claude request)
        |   |
        |   |-- LLM side-calls (Haiku): 40 calls
        |   |   |-- bash_extract_prefix
        |   |   |-- auto_mode (auto-approve, uses main model)
        |   |   |-- extract_memories (auto-memory)
        |   |   |-- auto_dream (memory consolidation)
        |   |   +-- ... 28 others
        |   |
        |   +-- Model response
        |
        |-- After session ends
        |   +-- Facet Extraction (LLM, local only)
        |       |-- goal, satisfaction, friction, outcome
        |       +-- saved to ~/.claude/usage-data/session-meta/
        |
        +-- Local storage
            |-- ~/.claude/projects/{cwd}/{session}.jsonl  (full transcript)
            |-- ~/.claude/telemetry/                        (retry queue)
            |-- ~/.claude/usage-data/facets/                (facet cache)
            +-- ~/.claude/debug/                            (debug logs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your prompt text is redacted by default in OTEL spans (replaced with &lt;code&gt;"&amp;lt;REDACTED&amp;gt;"&lt;/code&gt;). File paths are always hashed. If you set &lt;code&gt;OTEL_LOG_USER_PROMPTS=true&lt;/code&gt;, your full prompt text goes to the OTEL endpoint — off by default, but enterprise deployments might flip it. Same for &lt;code&gt;OTEL_LOG_TOOL_CONTENT=true&lt;/code&gt; (file contents, bash output, diffs).&lt;/p&gt;

&lt;h2&gt;What leaks (and what doesn't)&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Error messages&lt;/strong&gt; go through a sanitizer that maps known error types to safe messages and truncates unknown ones to 60 chars of class name only. Stack traces don't leave your machine. But validation errors can still contain up to 2,000 characters, and API errors are unlimited, so fragments of paths and commands can slip through.&lt;/p&gt;
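&lt;p&gt;In sketch form, the sanitizer behaves like this (the allowlist and limits here are illustrative, not the binary's actual tables):&lt;/p&gt;

```python
# Illustrative allowlist; the real mapping lives in the decompiled binary.
SAFE_MESSAGES = {"ConnectionError": "network error", "TimeoutError": "request timed out"}

def sanitize_error(exc: Exception) -> str:
    name = type(exc).__name__
    if name in SAFE_MESSAGES:
        return SAFE_MESSAGES[name]
    return name[:60]  # unknown errors: truncated class name only, no message

msg = sanitize_error(ValueError("stack trace with /home/me/.ssh paths"))
```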

&lt;p&gt;&lt;strong&gt;MCP server URLs&lt;/strong&gt; leak in cleartext. &lt;code&gt;mcpServerBaseUrl&lt;/code&gt; is spread into telemetry events without any allowlist check. If you connect to &lt;code&gt;https://internal-corp-api.company.com/mcp&lt;/code&gt;, that URL goes to OTEL. MCP tool &lt;em&gt;names&lt;/em&gt; get anonymized to &lt;code&gt;"mcp_tool"&lt;/code&gt;, but the server URL doesn't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt;&lt;/strong&gt; also leaks in plaintext. If you use a custom proxy, the full URL goes into &lt;code&gt;tengu_api_query&lt;/code&gt;, &lt;code&gt;tengu_api_success&lt;/code&gt;, and &lt;code&gt;tengu_api_error&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repo hash&lt;/strong&gt; — a field &lt;code&gt;rh&lt;/code&gt; sends SHA256[0:16] of your normalized git remote URL with events. Not the URL itself, but a deterministic hash that allows correlating all sessions on the same repo.&lt;/p&gt;
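&lt;p&gt;Reproducing the &lt;code&gt;rh&lt;/code&gt; field is straightforward; the normalization steps below (lowercase, strip &lt;code&gt;.git&lt;/code&gt;) are my guess at what "normalized" means:&lt;/p&gt;

```python
import hashlib

def repo_hash(remote_url: str) -> str:
    """SHA256[0:16] of a normalized git remote URL."""
    normalized = remote_url.strip().lower().removesuffix(".git")
    return hashlib.sha256(normalized.encode()).hexdigest()[:16]

# Both spellings of the same remote collapse to one fingerprint:
h1 = repo_hash("https://github.com/r-ms/n8n-mcp.git")
h2 = repo_hash("https://github.com/r-ms/n8n-mcp")
```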

&lt;p&gt;&lt;strong&gt;MCP proxy for claude.ai connectors&lt;/strong&gt; — if you connect Gmail, Google Calendar, Slack, etc. through claude.ai, all tool call inputs and outputs route through &lt;code&gt;mcp-proxy.anthropic.com&lt;/code&gt;. Anthropic sees the contents of your emails, calendar entries, Slack messages going through those connectors. This only applies to the &lt;code&gt;claudeai-proxy&lt;/code&gt; type; stdio/sse/http MCP servers connect directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Team memory&lt;/strong&gt; syncs automatically when files change. Pushes full file contents to &lt;code&gt;api.anthropic.com/api/claude_code/team_memory&lt;/code&gt;. Files containing secrets are skipped (regex filter), max 250KB per file, max 200 files. Disable with &lt;code&gt;CLAUDE_CODE_DISABLE_AUTO_MEMORY=1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session transcripts&lt;/strong&gt; are only shared if you explicitly consent. Four gates: give feedback → probability check → explicit dialog asking "Can Anthropic look at your transcript?" → click "Yes". You can permanently dismiss it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grove opt-out doesn't affect tracking.&lt;/strong&gt; The privacy toggle in &lt;code&gt;/privacy-settings&lt;/code&gt; only controls whether your data is used for model training. Tracking runs the same either way.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you can and can't turn off
&lt;/h2&gt;

&lt;p&gt;One environment variable kills almost everything:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This disables GrowthBook, Datadog, OTEL, auto-updates, and connectivity checks. To set it persistently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude config &lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;--global&lt;/span&gt; env.CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;DO_NOT_TRACK=1&lt;/code&gt; — the standard convention — is completely ignored. Zero references in the source.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Env var&lt;/th&gt;
&lt;th&gt;GrowthBook&lt;/th&gt;
&lt;th&gt;Datadog&lt;/th&gt;
&lt;th&gt;OTEL 1P&lt;/th&gt;
&lt;th&gt;Metrics&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DISABLE_TELEMETRY=1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DO_NOT_TRACK=1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Ignored&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Ignored&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Ignored&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Ignored&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For more granular control:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;CLAUDE_CODE_DISABLE_AUTO_MEMORY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1      &lt;span class="c"&gt;# stop auto-memory extraction&lt;/span&gt;
&lt;span class="nv"&gt;CLAUDE_CODE_DISABLE_TERMINAL_TITLE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1   &lt;span class="c"&gt;# stop LLM title generation&lt;/span&gt;
&lt;span class="nv"&gt;CLAUDE_CODE_DISABLE_BACKGROUND_TASKS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="c"&gt;# stop background tasks&lt;/span&gt;
&lt;span class="nv"&gt;CLAUDE_CODE_DISABLE_CRON&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1             &lt;span class="c"&gt;# stop scheduled tasks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you cannot turn off:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;API calls to Claude. The product itself. Anthropic logs requests server-side.&lt;/li&gt;
&lt;li&gt;40 LLM side-calls. Features, not tracking. Your content goes to Anthropic's API.&lt;/li&gt;
&lt;li&gt;Facet extraction. LLM analysis of sessions. Data stays local.&lt;/li&gt;
&lt;li&gt;Auto-dream. Background memory consolidation. Only numbers leave your machine (hours_since, sessions_reviewed), not your content.&lt;/li&gt;
&lt;li&gt;Remote session events. Full message content when using Claude Code remotely.&lt;/li&gt;
&lt;li&gt;WebFetch domain check. The domain name is sent to &lt;code&gt;api.anthropic.com/api/web/domain_info&lt;/code&gt;. The one partial exception on this list: disable it with &lt;code&gt;skipWebFetchPreflight&lt;/code&gt; in config.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The remote flag problem
&lt;/h2&gt;

&lt;p&gt;This was the most uncomfortable finding. Anthropic can remotely enable enhanced tracking through a GrowthBook feature flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;XQ1&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CLAUDE_CODE_ENHANCED_TELEMETRY_BETA&lt;/span&gt; 
       &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ENABLE_ENHANCED_TELEMETRY_BETA&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Q6&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;    &lt;span class="c1"&gt;// env var ON → enabled&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;A_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// env var OFF → disabled&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;enhanced_telemetry_beta&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// ← REMOTE FLAG&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you haven't explicitly set the env var, the decision falls through to a remote flag. Anthropic could flip this on for any user or cohort through GrowthBook targeting. In practice, &lt;code&gt;DISABLE_TELEMETRY=1&lt;/code&gt; blocks all backends so the data wouldn't go anywhere. But for enterprise/team setups with their own OTEL infrastructure, this is a real consideration.&lt;/p&gt;

&lt;p&gt;Other things GrowthBook can change remotely: enable Datadog for your account, change event sampling rates to 100%, adjust batch parameters. It cannot remotely enable &lt;code&gt;OTEL_LOG_USER_PROMPTS&lt;/code&gt; (your actual prompt text); that is strictly env-var controlled.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I think about all this
&lt;/h2&gt;

&lt;p&gt;I've spent a career building product analytics.&lt;/p&gt;

&lt;p&gt;The architecture is clean. Three of four tracking endpoints are Anthropic's own. Datadog is the only third party, and it's flagged off by default. Prompts redacted. File paths hashed. Content logging opt-in. Transcript sharing behind four consent gates.&lt;/p&gt;

&lt;p&gt;The source code confirms they take this seriously at the engineering level, not just the policy level. The TypeScript type system enforces PII safety at compile time — &lt;code&gt;LogEventMetadata&lt;/code&gt; only accepts &lt;code&gt;boolean | number | undefined&lt;/code&gt;, and adding a string requires an explicit cast through a type named &lt;code&gt;AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS&lt;/code&gt;. Plugin names go into restricted &lt;code&gt;_PROTO_*&lt;/code&gt; BigQuery columns that get stripped before forwarding to Datadog. The team memory secret scanner has 30+ gitleaks-based regex patterns. A source code comment in &lt;code&gt;sink.ts&lt;/code&gt; reads: &lt;em&gt;"With Segment removed the two remaining sinks are fire-and-forget."&lt;/em&gt; They're actively simplifying.&lt;/p&gt;
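
&lt;p&gt;The same idea, sketched in Python for readers who don't live in TypeScript — a runtime check rather than a compile-time one, and the names are mine, but the shape matches the pattern described above:&lt;/p&gt;

```python
# Metadata values restricted to types that cannot smuggle prompts,
# code, or file paths. In the real TypeScript this is enforced by the
# type system at compile time; here it's a runtime sketch.
ALLOWED = (bool, int, float, type(None))

def safe_metadata(**fields):
    for key, value in fields.items():
        if not isinstance(value, ALLOWED):
            raise TypeError(
                f"{key}: string values require an explicit "
                "'I verified this is not code or filepaths' cast"
            )
    return fields

print(safe_metadata(duration_ms=412, success=True))
# safe_metadata(path="/home/me/project")  would raise TypeError
```

&lt;p&gt;The point of the loudly-named escape hatch is social, not technical: nobody casually types &lt;code&gt;AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS&lt;/code&gt; without thinking about what they're sending.&lt;/p&gt;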

&lt;p&gt;What bothers me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP server URLs and &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; leak in plaintext. Internal infrastructure ends up in Anthropic's pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DO_NOT_TRACK=1&lt;/code&gt; is silently ignored. Either support the standard or say you don't.&lt;/li&gt;
&lt;li&gt;The remote flag for enhanced tracking can change what gets collected without asking. Make it env-var-only.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;838 event types, 40 background LLM calls, and a remote flag — all in a tool that has full access to your source code. The tracking itself is well-designed: prompts redacted, file paths hashed, session analysis stays local, the kill switch works. But that's a lot of metadata about how you work, when you work, and how your team collaborates. I'd want to know about that. Now you do.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Add to ~/.zshrc if you want to opt out:&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;em&gt;Full technical report with all 838 event names and source code references: &lt;a href="https://github.com/r-ms/blog/blob/main/research/claude-code-telemetry-report-en.md" rel="noopener noreferrer"&gt;link&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>security</category>
      <category>privacy</category>
    </item>
    <item>
      <title>Airflow vs n8n: what's the difference in 2026?</title>
      <dc:creator>Michael Rakutko</dc:creator>
      <pubDate>Sat, 21 Mar 2026 15:26:42 +0000</pubDate>
      <link>https://dev.to/michael_rakutko/airflow-vs-n8n-what-to-choose-in-2026-11dd</link>
      <guid>https://dev.to/michael_rakutko/airflow-vs-n8n-what-to-choose-in-2026-11dd</guid>
      <description>&lt;p&gt;I can spin up an Airflow DAG with Claude Code in the same time it takes me to build an n8n workflow. Describe what I want in English, get working Python in minutes, deploy, done.&lt;/p&gt;

&lt;p&gt;So why do I still use both?&lt;/p&gt;

&lt;p&gt;Because the "visual vs code" framing is dead. AI killed it. The real question in 2026 is what each tool gives you &lt;em&gt;after&lt;/em&gt; the workflow is built — in production, at 3 AM, when Slack has silently changed its rate limits and your pipeline is on fire.&lt;/p&gt;

&lt;h2&gt;
  
  
  The tools, briefly
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;n8n&lt;/strong&gt; is open-source workflow automation with a visual canvas. 500+ integrations, self-hosted or cloud. In 2025, n8n pivoted hard into AI: LangChain nodes, MCP support, AI agent builder, human-in-the-loop approvals. The market responded — $2.5B valuation, 180K GitHub stars, 700K+ developers, 75% of customers using AI features. It's no longer "that Zapier alternative." It's a platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apache Airflow&lt;/strong&gt; is code-first DAG orchestration in Python. The de facto standard for data engineering. Kubernetes executor, Celery workers, battle-tested at companies running millions of DAG executions daily. If your data team exists, they're probably using Airflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 2026 twist: AI coding changed the equation
&lt;/h2&gt;

&lt;p&gt;In 2024, the comparison was simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Can your team write Python? → Airflow. Can't? → n8n.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In 2026, that logic collapsed. Claude Code, Cursor, GitHub Copilot — they write Python for you. Here's my actual workflow: I open a terminal, describe a pipeline in plain English, and get deployable code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Create an Airflow DAG that runs daily at 6 AM UTC.
Pull new rows from our Postgres orders table since yesterday,
calculate revenue per region,
load into BigQuery,
send a Slack summary.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three minutes later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.decorators&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.postgres.hooks.postgres&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PostgresHook&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.google.cloud.hooks.bigquery&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BigQueryHook&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.slack.hooks.slack_webhook&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SlackWebhookHook&lt;/span&gt;

&lt;span class="nd"&gt;@dag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;schedule&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0 6 * * *&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2026&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;catchup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retry_delay&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minutes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;daily_revenue_pipeline&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;

    &lt;span class="nd"&gt;@task&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_orders&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;hook&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PostgresHook&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;orders_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_pandas_df&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT region, amount FROM orders &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WHERE created_at &amp;gt;= NOW() - INTERVAL &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;1 day&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;
        &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;orient&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@task&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;
        &lt;span class="n"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;region&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@task&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_to_bigquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nc"&gt;BigQueryHook&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bigquery_conn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;insert_rows&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analytics.daily_revenue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;region&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;revenue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;())}&lt;/span&gt;
             &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;

    &lt;span class="nd"&gt;@task&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;notify_slack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nc"&gt;SlackWebhookHook&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slack_conn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Daily revenue:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_orders&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;load_to_bigquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;notify_slack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;daily_revenue_pipeline&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's real, deployable code. AI wrote it in minutes.&lt;/p&gt;

&lt;p&gt;But here's what I've learned running both tools in production: &lt;strong&gt;writing the code was never the hard part. Maintaining it was.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/" rel="noopener noreferrer"&gt;METR study&lt;/a&gt; found that experienced developers using AI tools actually took &lt;strong&gt;19% longer&lt;/strong&gt; on real-world tasks — despite believing they were faster. The bottleneck isn't writing code. It's everything around the code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where n8n wins
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Managed auth for 500+ APIs
&lt;/h3&gt;

&lt;p&gt;This is n8n's deepest moat, and most people underestimate it.&lt;/p&gt;

&lt;p&gt;Every API has quirks. Slack requires bot scopes and socket mode tokens. Google expires refresh tokens after 7 days for apps in "testing" mode. Salesforce routes requests to instance-specific URLs. HubSpot deprecated API keys entirely in 2022, breaking thousands of integrations overnight.&lt;/p&gt;

&lt;p&gt;n8n handles all of this. Click "Connect," authenticate via OAuth, done. Token refresh, retry-on-401, scope management — built in for 500+ services.&lt;/p&gt;

&lt;p&gt;Claude Code generates a generic OAuth flow. It works on day one. It breaks on day eight when Google revokes your token. In my experience, maintaining auth logic for even 5 SaaS APIs is a part-time job.&lt;/p&gt;
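
&lt;p&gt;Here's what "maintaining auth logic" actually means. The endpoint and field names below are hypothetical, and the refresh itself is stubbed — because the stub is exactly where the per-provider pain lives:&lt;/p&gt;

```python
import time

# Sketch of the bookkeeping a hand-rolled OAuth integration needs.
# n8n's credential manager does all of this per provider.
class TokenManager:
    def __init__(self, access_token: str, expires_in: int):
        self.access_token = access_token
        self.expires_at = time.time() + expires_in

    def needs_refresh(self, margin: int = 60) -> bool:
        # Refresh slightly early so in-flight requests don't 401.
        return time.time() + margin >= self.expires_at

    def refresh(self):
        # Placeholder: POST the refresh_token to the provider's token
        # endpoint, handle invalid_grant (token revoked -> full re-auth),
        # persist the new pair atomically. Every provider differs here,
        # and this is the code that breaks on day eight.
        raise NotImplementedError

mgr = TokenManager("ya29.fake-token", expires_in=3600)
print(mgr.needs_refresh())  # False: token is fresh for an hour
```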

&lt;h3&gt;
  
  
  2. Visual debugging in production
&lt;/h3&gt;

&lt;p&gt;When step 7 of a 15-step workflow fails in n8n, you open the execution, see the exact node that failed, inspect the input data, inspect the output, and retry that single step. No redeployment. No re-running the entire pipeline.&lt;/p&gt;

&lt;p&gt;With Airflow: check the scheduler logs, find the task instance, read the log output, maybe add debug logging, commit, push, wait for the scheduler to pick up the new DAG, trigger a manual run, check logs again. It works — but it's 15 minutes where n8n takes 30 seconds.&lt;/p&gt;

&lt;p&gt;For engineers, this is acceptable overhead. For anyone else, it's a wall.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The maintenance argument
&lt;/h3&gt;

&lt;p&gt;This is the one nobody talks about.&lt;/p&gt;

&lt;p&gt;AI writes code fast. But after the code exists, someone needs to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deploy it&lt;/strong&gt; — to a server, with a scheduler, with health checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor it&lt;/strong&gt; — set up alerting for failures&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manage secrets&lt;/strong&gt; — store API keys, rotate credentials&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update dependencies&lt;/strong&gt; — when a library releases a breaking change&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fix it at 3 AM&lt;/strong&gt; — when the upstream API changed their response format&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;n8n abstracts all of this into the platform. Slack changed their API? n8n updates the node — your workflow keeps running. OAuth token expired? n8n rotates via credential manager. Workflow failed? Visual retry with one click.&lt;/p&gt;

&lt;p&gt;With AI-generated code, every single one of these is your problem.&lt;/p&gt;

&lt;p&gt;The analogy I keep coming back to: AI writes you a Dockerfile, but n8n is Heroku. Both get your app running. Only one of them handles ops.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. AI agent orchestration
&lt;/h3&gt;

&lt;p&gt;n8n's biggest bet — and it's paying off. In 2025-2026, n8n shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LangChain nodes&lt;/strong&gt; — connect any LLM as a first-class workflow step&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool nodes&lt;/strong&gt; — any n8n workflow becomes a callable tool for an AI agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-in-the-loop&lt;/strong&gt; — pause execution, wait for human approval, resume&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Guardrails&lt;/strong&gt; — jailbreak detection, NSFW filtering, custom rules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP support&lt;/strong&gt; — the emerging standard for AI-tool integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Building an AI agent that reads emails, classifies intent, drafts a response, asks a human for approval, then sends — that's a 20-minute drag-and-drop job in n8n.&lt;/p&gt;

&lt;p&gt;In Airflow, you'd write custom operators, manage conversation state via XCom, and build your own approval mechanism. Possible? Yes. Worth it? Almost never.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Time-to-first-workflow
&lt;/h3&gt;

&lt;p&gt;10 minutes from signup to a working, deployed workflow. That's n8n cloud. Self-hosted: a single &lt;code&gt;docker run&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;Airflow: install, configure metadata DB, set up connections, write a DAG file, place it in the dags folder, wait for the scheduler to parse it, test, fix, redeploy. Even with AI writing the code, the infrastructure overhead is real.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Airflow wins
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Unlimited customization
&lt;/h3&gt;

&lt;p&gt;When you hit n8n's ceiling — and for complex data transforms, you will — there's no clean escape hatch. You can write JavaScript in a Function node or Python in a Code node, but you're still inside n8n's execution model.&lt;/p&gt;

&lt;p&gt;What I've found is that workflows that start simple in n8n tend to accumulate Code nodes until 60% of the logic is hand-written JavaScript. At that point, you've lost the visual advantage and you'd be better off in Airflow where the entire thing is code you can test, lint, and version-control properly.&lt;/p&gt;

&lt;p&gt;Airflow is Python. Custom operators, dynamic DAG generation, conditional branching, complex dependency graphs — no ceiling.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Scale
&lt;/h3&gt;

&lt;p&gt;n8n is a Node.js process. Even in queue mode with multiple workers, there's a limit. For TB-scale ETL, thousands of concurrent tasks, or long-running compute jobs — Airflow with the Kubernetes executor spins up isolated pods per task:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Process 1000 files in parallel, each in its own K8s pod
&lt;/span&gt;&lt;span class="nd"&gt;@task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;executor_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;KubernetesExecutor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request_memory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2Gi&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request_cpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}})&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Heavy processing — isolated pod, dedicated resources
&lt;/span&gt;    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="nd"&gt;@dag&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;batch_pipeline&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list_files&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;process_file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Dynamic task mapping
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;n8n can't do this. If your pipeline processes terabytes, Airflow is the only serious option.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Data engineering ecosystem
&lt;/h3&gt;

&lt;p&gt;dbt, Spark, BigQuery, Snowflake, Databricks, Great Expectations — all have first-class Airflow providers. The &lt;code&gt;apache-airflow-providers-*&lt;/code&gt; ecosystem has 80+ packages.&lt;/p&gt;

&lt;p&gt;n8n has basic database nodes, but if your pipeline involves dbt model runs → data quality checks → Spark jobs → warehouse loading — Airflow is where that ecosystem lives.&lt;/p&gt;
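&lt;p&gt;A minimal sketch of what that chain looks like as a DAG, assuming the dbt Cloud and Spark provider packages are installed; the job ID, file path, and schedule are illustrative, so check the provider docs for your versions:&lt;/p&gt;

```python
# Illustrative only: a DAG chaining provider operators. job_id and the
# Spark application path are placeholders, not real resources.
from datetime import datetime

from airflow.decorators import dag
from airflow.models.baseoperator import chain
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

@dag(schedule="@daily", start_date=datetime(2026, 1, 1), catchup=False)
def warehouse_pipeline():
    run_models = DbtCloudRunJobOperator(task_id="dbt_run", job_id=1234)
    score = SparkSubmitOperator(task_id="spark_score", application="/jobs/score.py")
    chain(run_models, score)  # dbt models first, then the Spark job

warehouse_pipeline()
```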

&lt;h3&gt;
  
  
  4. Production-grade reliability
&lt;/h3&gt;

&lt;p&gt;SLAs with automatic alerting. Task-level retries with configurable exponential backoff. Sensor patterns that wait for external conditions. XCom for cross-task data passing. Pool-based concurrency limits. Priority weights for scheduling.&lt;/p&gt;

&lt;p&gt;These matter when you're running hundreds of DAGs and need to explain to your VP of Finance exactly why Tuesday's revenue numbers were 4 hours late.&lt;/p&gt;
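&lt;p&gt;For a sense of how the retry settings behave: Airflow configures them declaratively on the task (the kwargs in the comment are the actual knobs), and the function below just computes the resulting wait schedule. Airflow's real backoff also adds jitter; this is the idealized shape:&lt;/p&gt;

```python
# Airflow configures retries on the task itself, roughly:
#   @task(retries=4, retry_delay=timedelta(seconds=30),
#         retry_exponential_backoff=True, max_retry_delay=timedelta(minutes=10))
# The idealized schedule of waits that produces:

def backoff_schedule(retries, base_s, cap_s):
    """Delay before each retry: base * 2**attempt, capped at cap_s."""
    return [min(base_s * 2 ** attempt, cap_s) for attempt in range(retries)]

print(backoff_schedule(4, 30, 600))  # [30, 60, 120, 240]
```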

&lt;h3&gt;
  
  
  5. No vendor lock-in
&lt;/h3&gt;

&lt;p&gt;Airflow DAGs are &lt;code&gt;.py&lt;/code&gt; files. Move them to any Airflow instance — self-hosted, Google Cloud Composer, AWS MWAA, Astronomer. Or strip out the decorators and run the logic as plain Python.&lt;/p&gt;

&lt;p&gt;n8n workflows are JSON tied to n8n's runtime. Exportable, sure. Portable? Only to another n8n instance.&lt;/p&gt;

&lt;h2&gt;
  
  
  The decision matrix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Choose&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Connect SaaS tools (Slack + Sheets + CRM)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;n8n&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;500+ managed connectors with OAuth&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ETL pipeline (extract → transform → load)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Airflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Python flexibility, scale, ecosystem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI agent with human-in-the-loop&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;n8n&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Visual agent builder, guardrails, MCP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML pipeline (train → evaluate → deploy)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Airflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Native Python, GPU scheduling, K8s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Business process automation&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;n8n&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Non-technical users, visual canvas&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data quality monitoring&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Airflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sensors, SLAs, Great Expectations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Webhook-triggered actions&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;n8n&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in webhook node, instant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch processing at scale&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Airflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;K8s executor, dynamic task mapping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prototype / MVP&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;n8n&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;10 min to working workflow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mission-critical data pipeline&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Airflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Battle-tested, horizontal scaling&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The plot twist: use both
&lt;/h2&gt;

&lt;p&gt;Here's what I actually run in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;n8n&lt;/strong&gt; handles event-driven work: SaaS integrations, AI agent chains, Slack bots, webhook receivers, anything that talks to external APIs with OAuth.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Airflow&lt;/strong&gt; handles data work: batch ETL, scheduled processing, anything that needs scale or touches the data warehouse.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They connect trivially. n8n fires a webhook that triggers an Airflow DAG. Airflow calls n8n via HTTP when it needs to notify humans or interact with SaaS tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;notify_via_n8n&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
    &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://n8n.example.com/webhook/pipeline-complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rows_processed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not architecturally beautiful. But pragmatic — each tool does what it's best at.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AI actually changes
&lt;/h2&gt;

&lt;p&gt;Let me be specific about what AI coding tools change in this equation:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What AI accelerates:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writing DAG boilerplate (extract/transform/load patterns)&lt;/li&gt;
&lt;li&gt;Writing SQL transformations and dbt models&lt;/li&gt;
&lt;li&gt;Creating custom operators for new data sources&lt;/li&gt;
&lt;li&gt;Debugging failed tasks from log output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What AI doesn't help with:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Setting up infrastructure (servers, Docker, networking)&lt;/li&gt;
&lt;li&gt;Managing credentials and OAuth flows long-term&lt;/li&gt;
&lt;li&gt;Debugging intermittent production failures&lt;/li&gt;
&lt;li&gt;Tuning sensor timeouts when upstream data arrives late&lt;/li&gt;
&lt;li&gt;Capacity planning when your DAG count grows from 10 to 100&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI shrinks the &lt;em&gt;development&lt;/em&gt; cost of Airflow dramatically. But the &lt;em&gt;operational&lt;/em&gt; cost — the infra, the on-call, the credential rotation, the monitoring — stays the same.&lt;/p&gt;

&lt;p&gt;n8n's real value proposition in 2026 isn't "you don't need to code" (AI handles that). It's &lt;strong&gt;"you don't need to operate."&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The real question
&lt;/h2&gt;

&lt;p&gt;The question isn't "n8n or Airflow?" It's: &lt;strong&gt;who is operating this in production?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data engineer who lives in the terminal&lt;/strong&gt; → Airflow. You'll appreciate the control when you're debugging a sensor timeout at 3 AM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business user who needs automation&lt;/strong&gt; → n8n. They'll appreciate fixing things without filing a Jira ticket.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developer prototyping an AI agent&lt;/strong&gt; → n8n first. Migrate to code if it outgrows the platform.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team with mixed technical skills&lt;/strong&gt; → Both. Engineers own Airflow, business users own n8n.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;n8n's CEO Jan Oberhauser put it well: &lt;em&gt;"n8n allows you to combine humans, AI, and code."&lt;/em&gt; Airflow gives you full code and full control. Both are right — for different problems, for different teams.&lt;/p&gt;

&lt;p&gt;AI didn't make n8n obsolete. It didn't make Airflow unnecessary. What it did is kill "we can't write code" as a reason to choose n8n, and sharpen &lt;strong&gt;"we don't want to operate code"&lt;/strong&gt; as the real reason.&lt;/p&gt;

&lt;p&gt;Before you choose, ask one question: &lt;strong&gt;will this workflow be maintained by an engineer or a business user?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That single question answers which tool you need.&lt;/p&gt;

</description>
      <category>airflow</category>
      <category>n8n</category>
      <category>etl</category>
      <category>claude</category>
    </item>
    <item>
      <title>How we cut MCP context by 95% and stopped wasting the team's time</title>
      <dc:creator>Michael Rakutko</dc:creator>
      <pubDate>Wed, 18 Mar 2026 16:31:20 +0000</pubDate>
      <link>https://dev.to/michael_rakutko/how-we-cut-mcp-context-by-95-and-stopped-wasting-the-teams-time-7hg</link>
      <guid>https://dev.to/michael_rakutko/how-we-cut-mcp-context-by-95-and-stopped-wasting-the-teams-time-7hg</guid>
      <description>&lt;p&gt;We've all been there. Claude hits an MCP error, tries a different approach, hits another one, tries again — and eventually figures it out. You wait a minute, maybe two, it's fine.&lt;/p&gt;

&lt;p&gt;But here's the thing nobody talks about: when a whole team uses Claude Code daily, those minutes stack. One person watches the agent spin through three wrong SQL dialects before landing on the right one. Another waits while it retries the same failed tool call four times. Someone else loses the thread of a complex session because query results flooded the context. Multiply that by four people, every day, and you're not talking about a minor inconvenience anymore — you're talking about hours of engineering time burned while Claude figures out things it should have known from the start.&lt;/p&gt;

&lt;p&gt;I started tracking it. Two weeks, 23+ of these incidents across my analytics team. And the worst part? The agent wasn't being stupid. It was being sabotaged by the server we gave it.&lt;/p&gt;




&lt;p&gt;We were using the official &lt;code&gt;aws-dataprocessing&lt;/code&gt; MCP server from AWS Labs. Good project, well-maintained, 34 tools covering Glue, EMR, Athena, IAM, S3. We needed Athena. That's it. Five tools out of thirty-four.&lt;/p&gt;

&lt;p&gt;I should've noticed sooner.&lt;/p&gt;

&lt;p&gt;The thing is, it worked — sometimes. When it worked it was great. But when it didn't, it failed in the most demoralizing way possible: the agent would try something, get an error, try a variation, get a different error, try again, get the first error back. You'd watch it spin for ten minutes on something a junior analyst could fix in thirty seconds.&lt;/p&gt;

&lt;p&gt;After a while I started looking at &lt;em&gt;why&lt;/em&gt; it failed. Not the specific errors — the underlying reasons. And it turned out there were five of them, all happening at once.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The first thing I noticed was the metadata blindspot.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Athena is strict about types. If your column is &lt;code&gt;varchar&lt;/code&gt; and you write a query treating it like a &lt;code&gt;date&lt;/code&gt;, you get an error. Simple. But Claude didn't know our column types — there was no mechanism to tell it upfront. So it guessed. And when it guessed wrong and got an error, it... guessed again. I watched it try three different &lt;code&gt;CAST&lt;/code&gt; approaches on the same column, each one wrong in a different way, before I just typed &lt;code&gt;event_date is STRING&lt;/code&gt; in the chat and it immediately fixed everything. The information was always there — we just never gave it to the agent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The second thing was the context bill.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;34 tools × ~600 tokens each = roughly 20K tokens just for tool descriptions. Before the agent runs a single query, a fifth of its context window is gone. On a normal session that's annoying. With parallel sub-agents — which we use constantly — it's a disaster. Each sub-agent gets the full 34-tool payload. When someone on the team was running parallel agents on a complex analysis task, each one was starting with almost no usable context. Dozens of failures in a single session — and now it made sense.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Third: dialect hallucinations.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Claude kept writing &lt;code&gt;TIMESTAMP_SUB&lt;/code&gt;. That's BigQuery syntax. Athena runs Presto/Trino, which uses &lt;code&gt;DATE_ADD('day', -N, CURRENT_DATE)&lt;/code&gt;. Every single time someone ran a date filter, the agent defaulted to what it knew best. Because nothing in the tool description said "hey, this isn't BigQuery."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fourth: the DROP VIEW trap.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The server has a built-in SQL analyzer that blocks write operations. Makes sense as a safety feature. Except it classified &lt;code&gt;DROP VIEW&lt;/code&gt; as a write operation and blocked it — even though we had &lt;code&gt;--allow-write&lt;/code&gt; enabled. Drop a view before recreating it, like you do in any dbt workflow, and you get a permissions error that makes zero sense. Eight attempts. Eight times the same wall.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fifth: the data dump problem.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No default row limit on query results. &lt;code&gt;SELECT *&lt;/code&gt; on a large table returns 1,000 rows of JSON into context. One query. Multiply by parallel agents running simultaneously, and the useful context just disappears.&lt;/p&gt;




&lt;p&gt;So. 160 lines of Python later, here's what we changed — and why each thing actually matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We cut 33 tools.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This sounds obvious but it isn't. You can't just disable tools in the config. We wrote a new entry point, &lt;code&gt;server_athena.py&lt;/code&gt;, that imports the original handler but only registers one tool instead of thirty-four. Same codebase, same logic, different surface area. Context overhead dropped from ~20K tokens to ~1K. That single change made parallel sub-agents viable again.&lt;/p&gt;
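&lt;p&gt;The shape of the fix, reduced to a toy (these names are stand-ins, not the actual &lt;code&gt;server_athena.py&lt;/code&gt;, which imports the AWS Labs handler instead):&lt;/p&gt;

```python
# Toy version of the pattern: reuse the upstream handlers unchanged,
# but register only the tools you need. Names here are invented.

UPSTREAM_TOOLS = {f"tool_{i:02d}": object() for i in range(34)}  # 34 upstream handlers
KEEP = {"tool_00"}  # e.g. just the Athena query tool

def build_registry(upstream, keep):
    """New entry point: same handler code, smaller registered surface."""
    return {name: fn for name, fn in upstream.items() if name in keep}

print(len(build_registry(UPSTREAM_TOOLS, KEEP)))  # 1
```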

&lt;p&gt;&lt;strong&gt;We merged three calls into one.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The previous flow was: &lt;code&gt;start-query-execution&lt;/code&gt; → poll &lt;code&gt;get-query-execution&lt;/code&gt; until it finishes → &lt;code&gt;get-query-results&lt;/code&gt;. Three tool calls minimum, each one a decision point where the agent could drift. We added an &lt;code&gt;execute-query&lt;/code&gt; operation that handles all of it internally: 500 ms polling, a 30-second timeout, results returned directly. For 95% of queries, it's one call. For slow queries, it returns the execution ID so you can check back.&lt;/p&gt;

&lt;p&gt;The agent stops forgetting what it was doing mid-query.&lt;/p&gt;
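&lt;p&gt;A simplified sketch of that merged flow, written against the boto3 Athena client calls (&lt;code&gt;start_query_execution&lt;/code&gt;, &lt;code&gt;get_query_execution&lt;/code&gt;, &lt;code&gt;get_query_results&lt;/code&gt;); the limits and names are illustrative:&lt;/p&gt;

```python
import time

# Sketch of the merged execute-query flow. The Athena client is passed in,
# so this works against a real boto3 client or any stub with the same calls.

def execute_query(athena, sql, output_s3, poll_s=0.5, max_polls=60):
    qid = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_s3},
    )["QueryExecutionId"]
    for _ in range(max_polls):  # poll_s * max_polls ≈ the 30 s budget
        state = athena.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            return athena.get_query_results(QueryExecutionId=qid)
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"query {qid} ended in {state}")
        time.sleep(poll_s)
    return {"QueryExecutionId": qid}  # still running: hand back the ID
```

&lt;p&gt;Passing the client in keeps the polling logic testable without AWS credentials.&lt;/p&gt;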

&lt;p&gt;&lt;strong&gt;We preloaded the schema.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At startup, the server scans all databases and caches every table, every column, every type. It also downloads our dbt manifest from S3 — so it knows which model owns which table, who's responsible for it, how fresh the data is supposed to be. This all gets baked into the server's system instructions before Claude sees anything.&lt;/p&gt;

&lt;p&gt;Then we added &lt;code&gt;get-all-schemas&lt;/code&gt; with two modes: a cheap compact mode that returns table names and descriptions (so the agent can orient), and a deep mode where you ask for specific tables and get full column types plus dbt lineage. The agent orients cheaply, drills when it needs to.&lt;/p&gt;

&lt;p&gt;No more guessing column types. The metadata is just there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We made errors actually helpful.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This was the biggest change in practice. The old behavior: query fails, agent gets a raw error string, guesses what went wrong. New behavior: the server parses the error type and returns the specific context needed to fix it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TABLE_NOT_FOUND      → full list of available tables
COLUMN_NOT_FOUND     → all columns for tables in the query
TYPE_MISMATCH        → column types with correct CAST suggestion
PARTITION_MISMATCH   → partition keys and their actual types
QUERY_EXHAUSTED      → hints: add LIMIT, use APPROX_DISTINCT
SYNTAX_ERROR         → "this is Presto, use DATE_ADD not DATE_SUB"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Most errors now resolve in one retry. The spiraling stopped almost immediately.&lt;/p&gt;
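&lt;p&gt;In code, the idea is a lookup from error class to the schema context that resolves it. A toy version, with an invented schema (the real server pulls this from its startup cache):&lt;/p&gt;

```python
# Toy error-to-context mapping; schema and payload shapes are invented.

SCHEMA = {"events": {"event_date": "varchar", "user_id": "bigint"}}

def enrich_error(error_type, table=None):
    if error_type == "TABLE_NOT_FOUND":
        return {"hint": "available tables", "tables": sorted(SCHEMA)}
    if error_type == "COLUMN_NOT_FOUND":
        return {"hint": f"columns of {table}", "columns": SCHEMA[table]}
    if error_type == "SYNTAX_ERROR":
        return {"hint": "Athena is Presto/Trino: use DATE_ADD('day', -7, CURRENT_DATE)"}
    return {"hint": "raw error", "context": None}

print(enrich_error("COLUMN_NOT_FOUND", table="events")["columns"])
# {'event_date': 'varchar', 'user_id': 'bigint'}
```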

&lt;p&gt;&lt;strong&gt;We compressed the output.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;JSON query results are verbose. We switched to &lt;a href="https://github.com/toon-format/toon" rel="noopener noreferrer"&gt;TOON (Token-Oriented Object Notation)&lt;/a&gt; — pipe-separated tabular format. Same data, 74% fewer tokens on our actual queries. Sounds like micro-optimization. At 50 queries deep into a complex analysis session, it's the difference between the agent remembering what it was asked and losing the thread.&lt;/p&gt;
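&lt;p&gt;The core idea is easy to demonstrate. This is not the TOON spec, just the underlying principle of emitting the header once and then one row per line:&lt;/p&gt;

```python
import json

# Illustration of why tabular beats JSON for result sets: the keys are
# repeated per row in JSON but appear once in the tabular form.
rows = [{"day": f"2026-03-{d:02d}", "users": 100 + d} for d in range(1, 8)]

as_json = json.dumps(rows)
header = list(rows[0])
as_table = "\n".join(["|".join(header)] +
                     ["|".join(str(r[k]) for k in header) for r in rows])

saved = 1 - len(as_table) / len(as_json)
print(f"{saved:.0%} fewer characters")
```

&lt;p&gt;The savings grow with row count and column-name length; the repeated keys dominate the JSON payload.&lt;/p&gt;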




&lt;p&gt;The results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tools in context&lt;/td&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context overhead&lt;/td&gt;
&lt;td&gt;~20K tokens&lt;/td&gt;
&lt;td&gt;~1K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calls per query&lt;/td&gt;
&lt;td&gt;3+&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Default row limit&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Error response&lt;/td&gt;
&lt;td&gt;raw string&lt;/td&gt;
&lt;td&gt;schema + hint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Result format&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;TOON (-74%)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The wasted cycles stopped. The team runs parallel agents on complex tasks without babysitting them. The agent gets it right on the first or second try.&lt;/p&gt;




&lt;p&gt;The thing I keep coming back to is that none of this required a smarter model. Claude was fine the whole time. We handed it a server built for maximum coverage — 34 tools, every AWS data service, zero assumptions about your use case — and expected it to perform like a specialist. AWS Labs built that server for everyone. We needed it to work for us.&lt;/p&gt;

&lt;p&gt;If your agent keeps spinning in circles, before blaming the model, check what you're loading into its context:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;How many tools does it see? Do you actually use all of them?&lt;/li&gt;
&lt;li&gt;What does it know about your data model before the first call?&lt;/li&gt;
&lt;li&gt;When it gets an error, does it get a hint or a wall?&lt;/li&gt;
&lt;li&gt;How big are the responses coming back?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We write MCP servers as access layers. But they're also the agent's cognitive environment. Design them badly and even a great model will look stupid.&lt;/p&gt;




&lt;p&gt;If you're just starting to think about MCP servers — what they're for, when they make sense, how to structure them — I wrote about that &lt;a href="https://dev.to/michael_rakutko/why-build-mcp-4-levels-of-adoption-from-api-access-to-company-wide-semantic-layer-im6"&gt;here&lt;/a&gt;. This post is what happens two levels deeper, when the theory meets production.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>Why Build MCP? 4 Levels of Adoption — From API Access to Company-Wide Semantic Layer</title>
      <dc:creator>Michael Rakutko</dc:creator>
      <pubDate>Mon, 16 Feb 2026 18:10:19 +0000</pubDate>
      <link>https://dev.to/michael_rakutko/why-build-mcp-4-levels-of-adoption-from-api-access-to-company-wide-semantic-layer-im6</link>
      <guid>https://dev.to/michael_rakutko/why-build-mcp-4-levels-of-adoption-from-api-access-to-company-wide-semantic-layer-im6</guid>
      <description>&lt;p&gt;Our team builds a lot of MCPs — for ourselves and for external users. Over time, recurring patterns have emerged. Here are the key use cases we see over and over again, organized by complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Level 0. Give the agent access to APIs
&lt;/h2&gt;

&lt;p&gt;The simplest and most obvious use case. You ask the agent: "analyze the Telegram channel @llm_under_hood, identify topics and popular posts" — it calls the Telegram API, fetches posts, calculates metrics, and returns the analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Level 1. Automate routine by raising abstraction
&lt;/h2&gt;

&lt;p&gt;AI frequently makes mistakes: it forgets where servers and data live and fumbles syntax, even when everything is spelled out in context. MCP solves this by raising the abstraction level.&lt;/p&gt;

&lt;p&gt;For example, I have 3 MCP servers written for a specific project. Each is 200-300 lines of TypeScript:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;infra&lt;/strong&gt; — &lt;code&gt;vm_health&lt;/code&gt; generates a health report (12+ threshold alerts), &lt;code&gt;container_logs&lt;/code&gt; returns logs, &lt;code&gt;redis_query&lt;/code&gt; runs queries.&lt;/p&gt;

&lt;p&gt;Sure, the agent can compose a long SSH command on its own, but it fails every other time. With MCP we remove the cognitive load:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Without MCP: agent composes this and often gets it wrong
ssh user@server "docker exec redis redis-cli -a $PASS INFO memory | grep used_memory_human"

// With MCP: one tool call
redis_query({ server: "audioserver", command: "INFO memory" })
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;deps&lt;/strong&gt; — &lt;code&gt;dep_versions&lt;/code&gt; across 5 repositories, &lt;code&gt;tag_api_types&lt;/code&gt;, &lt;code&gt;update_consumer&lt;/code&gt;. Checking dependency versions, syncing API types between services — scripted and automatic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;s3&lt;/strong&gt; — S3 navigation: &lt;code&gt;s3_org_tree&lt;/code&gt;, &lt;code&gt;s3_device_files&lt;/code&gt;, &lt;code&gt;s3_cat&lt;/code&gt;. Instead of &lt;code&gt;aws s3 ls&lt;/code&gt; with endless paths — "show files for device X from yesterday".&lt;/p&gt;

&lt;h2&gt;
  
  
  Level 2. Semantic layer for data
&lt;/h2&gt;

&lt;p&gt;An MCP server can wrap more than an API: it can expose a whole semantic layer. The data is already prepared and labeled, so the agent doesn't need to know the database schema; it operates on business concepts.&lt;/p&gt;

&lt;p&gt;Yes, you can connect an MCP for GA4. But how do you account for all the custom tagging rules and complex logic of merging data from different sources?&lt;/p&gt;

&lt;p&gt;That's what ETL is for — it handles the processing. The MCP server wraps the result as a semantic layer, and then anyone in the company can ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"show traffic insights for yesterday"&lt;/li&gt;
&lt;li&gt;"which ASNs should we block?"&lt;/li&gt;
&lt;li&gt;"which users generated the most revenue?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent doesn't need to know table names, join logic, or filtering rules. The MCP server encapsulates all of that.&lt;/p&gt;

&lt;p&gt;This changes who can use the tool. An analyst builds the semantic layer once — then the entire team uses it, including managers who don't know SQL.&lt;/p&gt;
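&lt;p&gt;A toy example of what such a tool looks like under the hood; the table, columns, and metric here are invented to show the shape:&lt;/p&gt;

```python
from datetime import date, timedelta

# Hypothetical semantic-layer tool: one business question, compiled to the
# SQL the team agreed on. Table and column names are invented.

def traffic_insights(day=None):
    """'Show traffic insights for yesterday' as a single parameterized query."""
    day = day or (date.today() - timedelta(days=1)).isoformat()
    return (
        "SELECT channel, COUNT(DISTINCT session_id) AS sessions "
        "FROM analytics.sessions_enriched "
        f"WHERE event_date = DATE '{day}' "
        "GROUP BY channel ORDER BY sessions DESC"
    )
```

&lt;p&gt;The join logic, tagging rules, and filters live in the function, not in the agent's head.&lt;/p&gt;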

&lt;h2&gt;
  
  
  Level 3. Shared authorization and access control
&lt;/h2&gt;

&lt;p&gt;One MCP server can serve the entire company.&lt;/p&gt;

&lt;p&gt;Example: Google Search Console. Instead of handing out credentials to everyone — one internal OAuth. Connect to the MCP server, authenticate via corporate SSO, get access based on your role.&lt;/p&gt;

&lt;p&gt;Or an MCP that gives some people access to yesterday's revenue while hiding it from others. Role-based access at the tool level.&lt;/p&gt;

&lt;p&gt;This is already the industry standard. Sentry, Stripe, GitHub, Atlassian — all offer remote MCP servers with OAuth. Zero-config for the user: add a URL, log in via browser, start working.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building MCP servers: a skill with best practices
&lt;/h2&gt;

&lt;p&gt;We analyzed the source code and documentation of 50 production MCP servers from Stripe, Sentry, GitHub, Cloudflare, Supabase, Linear, Grafana, Playwright, AWS, Terraform, MongoDB, and others.&lt;/p&gt;

&lt;p&gt;Packaged it as a Claude Code skill — 23 sections covering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Architecture&lt;/strong&gt;: transport choice (STDIO vs StreamableHTTP), deployment models, OAuth 2.1&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool design&lt;/strong&gt;: naming conventions, writing descriptions for LLMs, managing tool count (1 to 1400+)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implementation&lt;/strong&gt;: error handling, security, prompt injection protection, token optimization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operations&lt;/strong&gt;: debugging with MCP Inspector, LLM-based eval testing, Docker deployment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Industry patterns&lt;/strong&gt;: top 35 patterns from production, pre-release checklist&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Drop it into your &lt;code&gt;.claude/skills/&lt;/code&gt; directory and run &lt;code&gt;/mcp-guide&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://gitlab.com/mskrx/skills/-/tree/main/claude/mcp-building-guide" rel="noopener noreferrer"&gt;MCP Building Guide Skill on GitLab&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent will use these best practices automatically when planning, developing, or reviewing MCP servers.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>claudecode</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
