<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ankit Dhiman</title>
    <description>The latest articles on DEV Community by Ankit Dhiman (@ankitdhiman).</description>
    <link>https://dev.to/ankitdhiman</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3951090%2Fe86e1391-112e-4f70-9bed-eac16ea0a32f.png</url>
      <title>DEV Community: Ankit Dhiman</title>
      <link>https://dev.to/ankitdhiman</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ankitdhiman"/>
    <language>en</language>
    <item>
      <title>Building Production Multi-Agent Workflows in n8n: What 50 Deployments Taught Us</title>
      <dc:creator>Ankit Dhiman</dc:creator>
      <pubDate>Mon, 25 May 2026 17:06:44 +0000</pubDate>
      <link>https://dev.to/ankitdhiman/building-production-multi-agent-workflows-in-n8n-what-50-deployments-taught-us-foi</link>
      <guid>https://dev.to/ankitdhiman/building-production-multi-agent-workflows-in-n8n-what-50-deployments-taught-us-foi</guid>
      <description>&lt;p&gt;Most n8n AI workflow tutorials end at "it worked in testing." The gap between a demo and a production system handling 10,000 items/day with real money on the line is where the interesting problems live.&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://chronexa.io" rel="noopener noreferrer"&gt;Chronexa&lt;/a&gt;, we've built 50+ multi-agent workflows for fintech compliance teams, legal document processing, AI SDR engines, and RAG-powered research assistants. Here's what we've learned about making them reliable.&lt;/p&gt;




&lt;h2&gt;1. Design Failure as a First-Class Concern&lt;/h2&gt;

&lt;p&gt;Most n8n tutorials wire only &lt;code&gt;main[0]&lt;/code&gt;. Production workflows wire &lt;code&gt;main[0]&lt;/code&gt; &lt;strong&gt;and&lt;/strong&gt; &lt;code&gt;main[1]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;HTTP Request nodes and AI nodes in n8n can expose two outputs: success (&lt;code&gt;main[0]&lt;/code&gt;) and, once the node's On Error setting routes failures to the error output, error (&lt;code&gt;main[1]&lt;/code&gt;). If the error branch is missing or unwired, failures are easy to miss, and you only find out when a client notices something is wrong three days later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The pattern we use on every deployment:&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;HTTP Request → main[0] → continue workflow
             → main[1] → DLQ Sheet + Slack Alert
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Set &lt;code&gt;onError: 'continueErrorOutput'&lt;/code&gt; on every AI and HTTP node. Wire &lt;code&gt;main[1]&lt;/code&gt; to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;Dead Letter Queue (DLQ)&lt;/strong&gt; Google Sheet or Baserow table with the failed item, timestamp, and error message&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;Slack alert&lt;/strong&gt; to the ops channel with the item ID and a link to the DLQ row&lt;/li&gt;
&lt;/ul&gt;
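
&lt;p&gt;As a rough sketch of what the DLQ step might look like inside a Code node, the functions below flatten a failed item into a sheet row and build the matching Slack message. The item shape (&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;error.message&lt;/code&gt;, &lt;code&gt;error.httpCode&lt;/code&gt;), the column names, and both helper names are assumptions for illustration, not part of n8n's API.&lt;/p&gt;

```javascript
// Build a Dead Letter Queue row from an item that arrived on the error output.
// Field names (id, error.message, error.httpCode) are hypothetical.
function toDlqRow(item, nodeName, now = new Date()) {
  const error = item.error ?? {};
  return {
    itemId: String(item.id ?? "unknown"),
    failedNode: nodeName,
    failedAt: now.toISOString(),
    // The error may arrive as a plain string or as an object with a message.
    errorMessage: typeof error === "string" ? error : error.message ?? "no error message",
    httpCode: error.httpCode ?? null,
    // Keep the full item so the failure can be replayed later.
    payload: JSON.stringify(item),
  };
}

// Format the Slack alert text, linking back to the DLQ row.
function toSlackAlert(row, dlqRowUrl) {
  return `:warning: ${row.failedNode} failed for item ${row.itemId}: ${row.errorMessage} (${dlqRowUrl})`;
}
```

&lt;p&gt;Storing the serialized item alongside the error is what makes the failure recoverable: the row carries everything needed to re-run that one item.&lt;/p&gt;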

&lt;p&gt;Never rely on a global workflow-level error trigger as a substitute for node-level error routing. The global trigger fires when the whole workflow crashes — but you want to capture partial failures item-by-item, not lose an entire batch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt; On one fintech client's AML monitoring workflow, we caught 847 failed enrichment calls in the first week that would have silently dropped cases. The DLQ made every failure visible and recoverable.&lt;/p&gt;




&lt;h2&gt;2. HITL: The Pattern That Makes AI Output Trustworthy&lt;/h2&gt;

&lt;p&gt;Fully automated AI workflows fail silently in high-stakes contexts. Claude occasionally generates wrong company names, incorrect figures, or fabricated URLs. Without a human checkpoint, those errors reach customers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The HITL (Human-in-the-Loop) pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;AI Node → Append to Review Sheet (status: "Pending Review")
        → Wait for Webhook
        → [Human reviews, sets status to "Approved" or "Rejected"]
        → Approved: continue workflow
        → Rejected: route to revision sub-workflow
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Implementation in n8n:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;After AI generation, write output to a Google Sheet / Baserow row with a "Status" column set to "Pending Review"&lt;/li&gt;
&lt;li&gt;Use a &lt;strong&gt;Wait node&lt;/strong&gt; configured to resume on webhook&lt;/li&gt;
&lt;li&gt;Set up a sheet trigger or webhook that calls the Wait node's resume URL (&lt;code&gt;$execution.resumeUrl&lt;/code&gt;) when Status changes to "Approved" or "Rejected"&lt;/li&gt;
&lt;li&gt;Add a 24-hour timeout check — if a row sits Pending too long, Slack-alert the reviewer&lt;/li&gt;
&lt;/ol&gt;
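
&lt;p&gt;The 24-hour timeout check in the last step could be sketched roughly as follows, for example in a Code node on a scheduled workflow that reads the review sheet. The row shape (&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;createdAt&lt;/code&gt; as an ISO timestamp) and the function name are assumptions, not something n8n provides.&lt;/p&gt;

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Return the review rows that are still pending after maxAgeMs.
// Row shape (id, status, createdAt) is hypothetical.
function findStaleReviews(rows, now = new Date(), maxAgeMs = DAY_MS) {
  return rows.filter((row) => {
    const isPending = row.status === "Pending Review";
    const ageMs = now.getTime() - new Date(row.createdAt).getTime();
    return isPending ? ageMs > maxAgeMs : false;
  });
}
```

&lt;p&gt;Each returned row would then feed the Slack alert to the reviewer.&lt;/p&gt;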

&lt;p&gt;&lt;strong&gt;When to use HITL:&lt;/strong&gt; Any workflow where AI output is customer-facing, regulatory, or financial. Skip it for internal data transformation pipelines where errors are low-stakes.&lt;/p&gt;

&lt;p&gt;Our AI SDR engine uses HITL for outbound email review. SDRs spend 45 minutes/day approving emails instead of 6 hours writing them — the workflow does the research and drafting, a human does the final check. Reply rates went from 2.1% to 6.8%.&lt;/p&gt;




&lt;h2&gt;3. Memory Management for Long-Running Agents&lt;/h2&gt;

&lt;h3&gt;Window Buffer Memory&lt;/h3&gt;

&lt;p&gt;Best for conversational agents where recency matters. Set window size to &lt;strong&gt;10–20 messages&lt;/strong&gt; — beyond 20, you're paying for context that rarely helps.&lt;/p&gt;
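
&lt;p&gt;To make the windowing idea concrete, here is a minimal sketch of a sliding window over chat history. It only illustrates the trimming behavior; it is not how n8n's memory node is implemented, and the message shape and function name are assumptions.&lt;/p&gt;

```javascript
// Keep only the most recent `windowSize` messages (a sliding window over history).
function windowBuffer(messages, windowSize = 10) {
  if (!(Number.isInteger(windowSize) ? windowSize >= 1 : false)) {
    throw new RangeError("windowSize must be a positive integer");
  }
  // slice with a negative start counts from the end of the array;
  // shorter histories are returned unchanged.
  return messages.slice(-windowSize);
}

// Hypothetical 25-message history, trimmed to the last 10.
const history = Array.from({ length: 25 }, (_, i) => ({ role: "user", text: `message ${i + 1}` }));
const context = windowBuffer(history, 10);
console.log(context.length, context[0].text);
```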

&lt;h3&gt;RAG over Static Documents&lt;/h3&gt;

&lt;p&gt;When your agent needs to reference a knowledge base (contracts, policies, product docs), vector retrieval beats pumping the full document into context every time.&lt;/p&gt;

&lt;p&gt;Setup: Pinecone or pgvector + n8n's Embeddings node + Information Retrieval chain. Cost difference at scale: a 50-page policy document passed to every query costs ~$0.08/query at Claude Sonnet pricing. RAG retrieval of 3 relevant chunks costs ~$0.004/query — 20x cheaper at volume.&lt;/p&gt;
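
&lt;p&gt;The arithmetic behind those two figures can be reproduced with a back-of-the-envelope calculation. Every number here is an assumption chosen to match the costs quoted above: an input price of $3 per million tokens, about 534 tokens per page, and chunks of about 445 tokens. Output tokens and embedding costs are ignored.&lt;/p&gt;

```javascript
// Rough per-query input cost. All constants are assumptions, not measured values.
const USD_PER_MILLION_INPUT_TOKENS = 3; // assumed Sonnet-class input price
const FULL_DOC_TOKENS = 26_700;         // about 50 pages at about 534 tokens per page
const RAG_TOKENS = 3 * 445;             // 3 retrieved chunks of about 445 tokens each

function inputCostUsd(tokens, usdPerMillion = USD_PER_MILLION_INPUT_TOKENS) {
  return (tokens / 1_000_000) * usdPerMillion;
}

const fullDocCost = inputCostUsd(FULL_DOC_TOKENS); // roughly $0.08 per query
const ragCost = inputCostUsd(RAG_TOKENS);          // roughly $0.004 per query
const ratio = fullDocCost / ragCost;               // how many times cheaper RAG is

console.log(fullDocCost.toFixed(3), ragCost.toFixed(4), Math.round(ratio));
```

&lt;p&gt;Swap in your own token counts and current pricing; the ratio depends mostly on how much of the document each query really needs.&lt;/p&gt;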

&lt;h3&gt;Session Keys for Multi-User Deployments&lt;/h3&gt;

&lt;p&gt;This is the one that bites people most often. If the same workflow handles multiple concurrent users with the default session ID, memory from User A bleeds into User B's conversation.&lt;/p&gt;

&lt;p&gt;Fix: scope the session ID to a user identifier from the webhook payload (the Webhook node nests the request body under &lt;code&gt;body&lt;/code&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sessionId: {{ $('Webhook').item.json.body.userId }}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We've seen this misconfiguration cause a support bot to answer one user's question with another user's account details.&lt;/p&gt;
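
&lt;p&gt;One way to guard against that failure mode is to refuse to run without a user identifier instead of falling back to a shared session. The sketch below assumes the identifier arrives as &lt;code&gt;body.userId&lt;/code&gt;; the function name and the &lt;code&gt;support-bot&lt;/code&gt; prefix are hypothetical.&lt;/p&gt;

```javascript
// Derive a per-user memory session key from a webhook payload.
// Assumes the user identifier arrives as body.userId; adjust to your payload.
function sessionKeyFor(payload, workflowName = "support-bot") {
  const userId = payload?.body?.userId;
  if (userId === undefined || userId === null || String(userId).trim() === "") {
    // Failing loudly beats silently sharing one default session across users.
    throw new Error("Missing userId: refusing to fall back to a shared session");
  }
  // Prefix with the workflow name so two agents never share a key either.
  return `${workflowName}:${String(userId).trim()}`;
}
```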




&lt;h2&gt;4. Rate Limiting, Backoff, and Concurrency&lt;/h2&gt;

&lt;p&gt;Three failure modes that will bite you in production:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. API Rate Limits (OpenAI/Anthropic)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For bulk workflows processing hundreds of items, rate limits hit fast. Use n8n's built-in &lt;strong&gt;Retry on Fail&lt;/strong&gt; with Max Tries set to 3. Its Wait Between Tries setting is a fixed interval, so exponential backoff needs a small retry loop whose wait doubles on each attempt. For sustained bulk processing, add a Wait node between AI calls.&lt;/p&gt;
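
&lt;p&gt;For reference, exponential backoff itself is simple to express. The sketch below is generic JavaScript, not an n8n feature; the function names, the 1-second base delay, and the 30-second cap are assumptions.&lt;/p&gt;

```javascript
// Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}

// Call fn, retrying up to maxRetries times with exponential backoff between tries.
// `sleep` can be injected (for tests); by default it waits with setTimeout.
async function retryWithBackoff(fn, { maxRetries = 3, baseMs = 1000, sleep } = {}) {
  const wait = sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err; // retries exhausted, surface the error
      await wait(backoffDelayMs(attempt + 1, baseMs));
    }
  }
}
```

&lt;p&gt;With the defaults, the waits are 1, 2, and 4 seconds. In production you would usually add random jitter so parallel executions do not retry in lockstep.&lt;/p&gt;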

&lt;p&gt;&lt;strong&gt;2. Webhook Concurrency&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;n8n's default webhook concurrency is 5 simultaneous executions. For AI workflows where each execution makes multiple LLM calls, that multiplies quickly: at roughly 10 LLM calls per execution, 5 concurrent executions can spike to 50 simultaneous API calls.&lt;/p&gt;

&lt;p&gt;Fix: set &lt;code&gt;maxConcurrency: 2&lt;/code&gt; on webhook triggers for AI-heavy workflows. It creates a queue rather than dropping requests.&lt;/p&gt;
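
&lt;p&gt;The "queue rather than drop" behavior is easiest to see in a small standalone limiter. This is a generic sketch of the idea, not n8n's internal implementation; &lt;code&gt;createLimiter&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```javascript
// Minimal concurrency limiter: at most `max` tasks run at once,
// the rest wait in a FIFO queue instead of being rejected.
function createLimiter(max) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= max || waiting.length === 0) return;
    active += 1;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        next(); // a slot freed up, start the next queued task
      });
  };

  // Returns a promise that settles with the task's own result or error.
  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
}
```

&lt;p&gt;Usage would look like &lt;code&gt;const limit = createLimiter(2)&lt;/code&gt; followed by &lt;code&gt;limit(() =&gt; callModel(item))&lt;/code&gt;, where &lt;code&gt;callModel&lt;/code&gt; stands in for whatever async call you are protecting.&lt;/p&gt;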

&lt;p&gt;&lt;strong&gt;3. Downstream API Timeouts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;HTTP Request nodes have a 30-second default timeout. If your workflow calls slow external APIs, you'll see phantom failures. Set explicit &lt;code&gt;"timeout": 60000&lt;/code&gt; on slow-API nodes, and wire the error output so timeouts go to the DLQ.&lt;/p&gt;
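
&lt;p&gt;If you call a slow API from a Code node rather than an HTTP Request node, the same principle applies: make the deadline explicit and give the resulting error a recognizable message. A minimal sketch, with &lt;code&gt;withTimeout&lt;/code&gt; as a hypothetical helper:&lt;/p&gt;

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
// `label` only makes the error message easier to find in the DLQ.
function withTimeout(promise, ms, label = "request") {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}
```

&lt;p&gt;Note that this only stops waiting; it does not cancel the underlying request.&lt;/p&gt;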




&lt;h2&gt;5. The Production Checklist We Use Before Every Deployment&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Error output (&lt;code&gt;main[1]&lt;/code&gt;) wired on every HTTP Request and AI node&lt;/li&gt;
&lt;li&gt;[ ] DLQ sheet created and connected to error outputs&lt;/li&gt;
&lt;li&gt;[ ] Slack alert configured on failure with item ID and error details&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;saveDataSuccessExecution: 'none'&lt;/code&gt; set for high-volume workflows (prevents DB bloat)&lt;/li&gt;
&lt;li&gt;[ ] HITL step added for any customer-facing or regulatory output&lt;/li&gt;
&lt;li&gt;[ ] Session ID scoped to user/item (not default) for multi-user agents&lt;/li&gt;
&lt;li&gt;[ ] Rate limit buffer added — Wait node or Retry on Fail with backoff&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;maxConcurrency&lt;/code&gt; set to 2 on webhook triggers for AI workflows&lt;/li&gt;
&lt;li&gt;[ ] Tested with 10× expected volume before go-live&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;errorWorkflow&lt;/code&gt; field set to centralized error handler&lt;/li&gt;
&lt;/ul&gt;
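
&lt;p&gt;Parts of this checklist can be checked mechanically against an exported workflow JSON. The sketch below covers only the error-handling items. The node type strings and the layout of &lt;code&gt;nodes&lt;/code&gt;, &lt;code&gt;connections&lt;/code&gt;, and &lt;code&gt;settings&lt;/code&gt; are assumptions about the export format, so verify them against one of your own exports before relying on it.&lt;/p&gt;

```javascript
// Node types that must have an error branch. These strings are assumptions;
// extend the list with the node types you actually use.
const GUARDED_TYPES = ["n8n-nodes-base.httpRequest", "@n8n/n8n-nodes-langchain.agent"];

// Return a list of human-readable problems found in an exported workflow object.
function auditWorkflow(workflow) {
  const problems = [];
  for (const node of workflow.nodes ?? []) {
    if (!GUARDED_TYPES.includes(node.type)) continue;
    if (node.onError !== "continueErrorOutput") {
      problems.push(`${node.name}: onError is not 'continueErrorOutput'`);
    }
    // main[1] is the error output; it must connect to at least one node.
    const errorBranch = workflow.connections?.[node.name]?.main?.[1] ?? [];
    if (errorBranch.length === 0) {
      problems.push(`${node.name}: error output (main[1]) is not wired`);
    }
  }
  if (!workflow.settings?.errorWorkflow) {
    problems.push("settings.errorWorkflow is not set");
  }
  return problems;
}
```

&lt;p&gt;An empty array means the workflow passes these checks; the remaining checklist items still need a human.&lt;/p&gt;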




&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The difference between an n8n demo and a production system lies in how you handle the 10% of cases that don't go right. Designing failure handling as a first-class architectural concern, adding HITL for trust, and managing memory and concurrency carefully are what separate a reliable automation from a liability.&lt;/p&gt;

&lt;p&gt;If you're building multi-agent workflows for real business use cases, start with the error output. Everything else follows from there.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Ankit Dhiman is the founder of &lt;a href="https://chronexa.io" rel="noopener noreferrer"&gt;Chronexa&lt;/a&gt;, an AI automation agency that builds custom n8n workflows for mid-market B2B companies. We've open-sourced our workflow templates at &lt;a href="https://github.com/Chronexa/chronexa-n8n-workflows" rel="noopener noreferrer"&gt;github.com/Chronexa/chronexa-n8n-workflows&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>n8n</category>
      <category>automation</category>
      <category>ai</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
