Alex Retana

Optimizing Multi-Agent Workflows in n8n: A Context-Aware Approach to Agent Handoffs

When working with multi-agent systems like the BMAD (Breakthrough Method of Agile AI-Driven Development) pattern, context window management becomes critical for model performance and cost efficiency. While you could dump an entire agent bundle into a Claude Project and let it figure things out, you'll quickly burn through tokens on instruction sets that may never be relevant to the current task.

This tutorial demonstrates how to build intelligent agent routing in n8n—the popular node-based automation tool—that maintains tight control over context and enables direct user-to-subagent communication without wasteful token processing.

Why This Matters

Traditional approaches to multi-agent orchestration often suffer from two key problems:

  1. Context bloat: Loading all agent instructions upfront wastes tokens on irrelevant context
  2. Indirect communication: Routing everything through a master agent doubles processing costs and adds latency

While Claude Projects offers solutions like separating master instructions from agent definitions and using RAG for knowledge retrieval, building a custom workflow in n8n gives you explicit control over data flow and context management. This pattern extends beyond chatbots—use it anywhere you need task-specific agents with optimized context windows.

The Build: Two Demonstration Workflows

I've created two n8n workflows that progressively demonstrate agent handoff patterns. Both use intentionally simple agent instructions to focus on the routing mechanics, but these patterns scale to complex production systems.

You can copy the templates to import into your own n8n instance from my GitHub repo: N8n Multi Agent Handoff Templates

Demo 1: Sequential Agent Pass-Through

Sequential agent handoff workflow

This workflow demonstrates the fundamental pattern: how to pass control from one agent to another.

Flow breakdown:

  1. Chat Trigger receives the user message
  2. AI Agent 1 processes the input with access to:
    • OpenAI GPT-4.1-mini (shared language model)
    • Simple Memory (conversation history)
  3. Agent 1 outputs to two destinations simultaneously:
    • "Respond to Chat" node (user feedback)
    • AI Agent 2 (next agent in chain)

  4. AI Agent 2 receives Agent 1's output via the prompt template:
   {{ $json.output || $json.chatInput }}

This expression handles both the initial user input and subsequent agent outputs: on the first pass there is no output field, so the fallback picks up the user's chatInput; on later passes the upstream agent's output takes precedence.

  5. Agent 2 responds back to the user through "Respond to Chat1"

  6. The loop continues: Agent 2's response feeds back into itself. As you can see below, when I ask which agent it is, it answers that it's Agent 2 without routing through Agent 1 (Agent 1 is never messaged again).

Key architectural decisions:

  • Shared memory: Both agents use the same Simple Memory node to maintain conversation continuity
  • Shared model: Single OpenAI connection reduces configuration overhead
  • Branching output: Agent 1 uses n8n's multiple output connections to respond AND handoff simultaneously

Code reference (from Demonstrate Agent Pass Off.json):

{
  "parameters": {
    "promptType": "define",
    "text": "={{ $json.output ||  $json.chatInput}}",
    "options": {
      "systemMessage": "You are Agent 2. If you're asked to respond to the chat with what agent you are, just say \"Yes, I'm Agent 2\""
    }
  },
  "type": "@n8n/n8n-nodes-langchain.agent",
  "name": "AI Agent 2"
}
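
One detail worth calling out: promptType: "define" corresponds to the agent node's "Define below" prompt source, which is what lets Agent 2 build its prompt from the text expression above instead of automatically pulling chatInput from the chat trigger.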

Demo 2: Dynamic Routing with Master Agent

Master agent with switch routing

This workflow adds intelligent routing: a master agent decides which specialized agent should handle each request.

Flow breakdown:

  1. Chat Trigger forwards the user message to the AI Master Agent
  2. Master Agent analyzes the request and outputs structured JSON:
   {
     "direct_response_to_user": "I'm routing you to Agent 2",
     "agent_to_route_to": "Agent 2",
     "forwarded_message": "User asked about X. Routing because Y."
   }
  3. Map Master Agent's Response extracts these fields using n8n expressions:
   $json.output.parseJson().agent_to_route_to
   $json.output.parseJson().forwarded_message
  4. Data splits into two paths:
    • Master Agent Responds To Chat: Sends the routing explanation to the user immediately (no waiting on the subagent)
    • Switch Node: Routes to Agent 1, 2, or 3 based on the agent_to_route_to value

  5. The selected agent receives a contextualized prompt:
   User's Original Message:
   ${$('When chat message received').item.json.chatInput}

   Master Agent's message to you:
   ${$json.forwarded_message}
  6. The agent responds through its dedicated chat node
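
The $('When chat message received') selector is doing real work here: n8n lets any downstream node read another node's output by name, so the subagent gets the user's original, unmodified message without the master having to repeat it in full.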

Critical differences from Demo 1:

  • Isolated memory: Each agent (including Master) has separate memory nodes (Simple Memory1/2/3)
  • Context preservation: The forwarded message includes both the original user input AND the master's routing rationale
  • Parallel execution: User gets immediate feedback while the selected agent processes in parallel

Master Agent system prompt (edited for clarity):

You are the Master Agent. You route user requests to the correct agent.

IMPORTANT: Output only valid JSON in this format:

{
  "direct_response_to_user": "I'm routing you to Agent 1",
  "agent_to_route_to": "Agent 1",
  "forwarded_message": "**Summary of user request and routing rationale**"
}

Why structured output matters: The JSON format enables programmatic routing via the Switch node. In production, you'd add validation to handle malformed responses.
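
Here's a minimal sketch of what that validation could look like: a Code node (in "Run Once for All Items" mode) wired between the Master Agent and the Switch node. The node itself, its placement, and the fallback behavior are my additions, not part of the demo workflows:

// Hypothetical "Validate Routing JSON" Code node.
const VALID_AGENTS = ['Agent 1', 'Agent 2', 'Agent 3'];

return $input.all().map((item) => {
  let parsed;
  try {
    // The Master Agent's raw text output should be a JSON string
    parsed = JSON.parse(item.json.output);
  } catch (err) {
    // Malformed JSON: substitute a safe default instead of failing the run
    parsed = {
      direct_response_to_user: "Sorry, I couldn't route that request.",
      agent_to_route_to: 'Agent 1',
      forwarded_message: String(item.json.output ?? ''),
    };
  }
  // Unknown agent names also get a default route
  if (!VALID_AGENTS.includes(parsed.agent_to_route_to)) {
    parsed.agent_to_route_to = 'Agent 1';
  }
  return { json: parsed };
});

With a guard like this in place, the Switch node only ever sees one of the three expected values, and the downstream Map node no longer needs to call parseJson() on raw model output.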

Implementation Details You Need to Know

Context Window Optimization

Each agent only loads:

  • Its own system prompt (~100-500 tokens)
  • Relevant conversation history (window-buffered)
  • The forwarded context from the master agent

Compare this to loading all 3 agent instruction sets upfront—you'd waste thousands of tokens per request.
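
As rough arithmetic: three agents at the top of that range is ~1,500 tokens of instructions; replay that on every turn of a multi-turn conversation and the waste compounds, while routing keeps each request down to the one active agent's share.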

The Switch Node Configuration

The Switch node uses n8n's rule-based routing:

{
  "conditions": {
    "conditions": [
      {
        "leftValue": "={{ $json.agent_to_route_to }}",
        "rightValue": "Agent 2",
        "operator": {
          "type": "string",
          "operation": "equals"
        }
      }
    ]
  }
}

Three rules match "Agent 1", "Agent 2", or "Agent 3" exactly. Unmatched requests fall through (you'd want error handling in production).
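
A lighter-weight safety net, if you'd rather not build full validation: a small Code node just before the Switch that normalizes whatever the model produced. This is a sketch of one possible approach, not part of the demo workflows:

// Hypothetical pre-Switch normalization ("Run Once for All Items" mode).
// Tolerates variants like "agent 2", "Agent2", or trailing whitespace.
return $input.all().map((item) => {
  const raw = String(item.json.agent_to_route_to ?? '').toLowerCase();
  const match = raw.match(/agent\s*([123])/);
  return {
    json: {
      ...item.json,
      // Anything unrecognized becomes "unmatched", so a fourth Switch rule
      // can route it to an error-handling branch
      agent_to_route_to: match ? 'Agent ' + match[1] : 'unmatched',
    },
  };
});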

Memory Architecture Trade-offs

Demo 1: Shared memory allows agents to reference each other's outputs naturally, but blurs agent boundaries.

Demo 2: Isolated memory per agent creates cleaner separation but requires explicit context passing via forwarded_message. This scales better for specialized agents with distinct conversation contexts.

Running These Workflows

  1. Import the JSON files into n8n (both available in the GitHub repo linked above)
  2. Configure your OpenAI API credentials in the "OpenAI: gpt-4.1-mini" node
  3. Activate the workflow
  4. Open the chat interface via the webhook URL

Test prompts:

  • "Who are you?" (tests agent self-identification)
  • "Pass me to Agent 2" (tests routing logic)
  • "What did Agent 1 say?" (tests memory persistence)

What's Next?

The natural evolution is bidirectional routing: subagents should be able to return control to the master when they complete their task. This creates a true orchestration layer where:

  • Master Agent delegates to specialists
  • Specialists execute and report back
  • Master Agent synthesizes results or delegates further

Challenge for you: Can you modify Demo 2 to:

  1. Let each subagent indicate completion in its output (maybe via JSON like the master's; one possible shape is sketched after this list)?
  2. Route completed tasks back to the Master Agent?
  3. Have the Master Agent decide whether to route again or provide a final response?
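
As a starting point, a subagent could emit something like this (the field names are purely illustrative, not from the demo workflows):

{
  "task_complete": true,
  "agent": "Agent 2",
  "result": "Summary of what this agent produced",
  "return_to_master": true
}

A second Switch node keyed on task_complete could then send finished work back to the Master Agent and everything else straight to the user.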

This pattern mirrors how tools like LangGraph handle cyclic agent flows, but with explicit control over every transition.


Conclusion

TL;DR: Multi-agent systems in n8n benefit from explicit routing and context management. Use sequential pass-through (Demo 1) for simple pipelines; use master-agent routing with structured output (Demo 2) for dynamic task distribution. Both patterns dramatically reduce token waste compared to loading all agent instructions upfront. Next step: implement agent-to-master return logic for full orchestration loops.

The workflows demonstrated here show that intelligent agent handoffs aren't magic—they're just careful data flow management. n8n's visual interface makes the logic transparent, which is invaluable when debugging complex agent interactions or optimizing for cost.

Try implementing the return-to-master pattern yourself, and share your solution in the comments. What other agent routing patterns would be useful for your projects?
