Alex Retana

Optimizing Multi-Agent Workflows in n8n: A Context-Aware Approach to Agent Handoffs

When working with multi-agent systems like the BMAD (Breakthrough Method of Agile AI-Driven Development) pattern, context window management becomes critical for model performance and cost efficiency. While you could dump an entire agent bundle into a Claude Project and let it figure things out, you'll quickly burn through tokens on instruction sets that may never be relevant to the current task.

This tutorial demonstrates how to build intelligent agent routing in n8n—the popular node-based automation tool—that maintains tight control over context and enables direct user-to-subagent communication without wasteful token processing.

Why This Matters

Traditional approaches to multi-agent orchestration often suffer from two key problems:

  1. Context bloat: Loading all agent instructions upfront wastes tokens on irrelevant context
  2. Indirect communication: Routing everything through a master agent doubles processing costs and adds latency

While Claude Projects offers solutions like separating master instructions from agent definitions and using RAG for knowledge retrieval, building a custom workflow in n8n gives you explicit control over data flow and context management. This pattern extends beyond chatbots—use it anywhere you need task-specific agents with optimized context windows.

The Build: Two Demonstration Workflows

I've created two n8n workflows that progressively demonstrate agent handoff patterns. Both use intentionally simple agent instructions to focus on the routing mechanics, but these patterns scale to complex production systems.

You can copy the templates to import into your own n8n instance from my GitHub repo: N8n Multi Agent Handoff Templates

Demo 1: Sequential Agent Pass-Through

Sequential agent handoff workflow

This workflow demonstrates the fundamental pattern: how to pass control from one agent to another.

Flow breakdown:

  1. Chat Trigger receives the user message
  2. AI Agent 1 processes the input with access to:
    • OpenAI GPT-4.1-mini (shared language model)
    • Simple Memory (conversation history)
  3. Agent 1 outputs to two destinations simultaneously:
    • "Respond to Chat" node (user feedback)
    • AI Agent 2 (next agent in chain)

  4. AI Agent 2 receives Agent 1's output via the prompt template:
   {{ $json.output || $json.chatInput }}

This expression handles both the initial user input and subsequent agent outputs: on the first pass there is no output field, so the fallback picks up the user's chatInput; on later passes the upstream agent's output takes precedence.

  5. Agent 2 responds back to the user through "Respond to Chat1"

  6. The loop continues: Agent 2's response feeds back into itself. As you can see below, when I ask which agent it is, it answers that it's Agent 2 without routing through Agent 1 (Agent 1 is never messaged again).

Key architectural decisions:

  • Shared memory: Both agents use the same Simple Memory node to maintain conversation continuity
  • Shared model: Single OpenAI connection reduces configuration overhead
  • Branching output: Agent 1 uses n8n's multiple output connections to respond AND handoff simultaneously

Code reference (from Demonstrate Agent Pass Off.json):

{
  "parameters": {
    "promptType": "define",
    "text": "={{ $json.output ||  $json.chatInput}}",
    "options": {
      "systemMessage": "You are Agent 2. If you're asked to respond to the chat with what agent you are, just say \"Yes, I'm Agent 2\""
    }
  },
  "type": "@n8n/n8n-nodes-langchain.agent",
  "name": "AI Agent 2"
}
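
One detail worth calling out: promptType: "define" corresponds to the agent node's "Define below" prompt source, which is what lets Agent 2 build its prompt from the text expression above instead of automatically pulling chatInput from the chat trigger.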

Demo 2: Dynamic Routing with Master Agent

Master agent with switch routing

This workflow adds intelligent routing: a master agent decides which specialized agent should handle each request.

Flow breakdown:

  1. Chat Trigger forwards the user message to the AI Master Agent
  2. Master Agent analyzes the request and outputs structured JSON:
   {
     "direct_response_to_user": "I'm routing you to Agent 2",
     "agent_to_route_to": "Agent 2",
     "forwarded_message": "User asked about X. Routing because Y."
   }
  3. Map Master Agent's Response extracts these fields using n8n expressions:
   $json.output.parseJson().agent_to_route_to
   $json.output.parseJson().forwarded_message
  4. Data splits into two paths:
    • Master Agent Responds To Chat: Sends the routing explanation to the user immediately (no waiting on the subagent)
    • Switch Node: Routes to Agent 1, 2, or 3 based on the agent_to_route_to value

  5. The selected agent receives a contextualized prompt:
   User's Original Message:
   ${$('When chat message received').item.json.chatInput}

   Master Agent's message to you:
   ${$json.forwarded_message}
  6. The agent responds through its dedicated chat node
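
The $('When chat message received') selector is doing real work here: n8n lets any downstream node read another node's output by name, so the subagent gets the user's original, unmodified message without the master having to repeat it in full.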

Critical differences from Demo 1:

  • Isolated memory: Each agent (including Master) has separate memory nodes (Simple Memory1/2/3)
  • Context preservation: The forwarded message includes both the original user input AND the master's routing rationale
  • Parallel execution: User gets immediate feedback while the selected agent processes in parallel

Master Agent system prompt (edited for clarity):

You are the Master Agent. You route user requests to the correct agent.

IMPORTANT: Output only valid JSON in this format:

{
  "direct_response_to_user": "I'm routing you to Agent 1",
  "agent_to_route_to": "Agent 1",
  "forwarded_message": "**Summary of user request and routing rationale**"
}

Why structured output matters: The JSON format enables programmatic routing via the Switch node. In production, you'd add validation to handle malformed responses.
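
Here's a minimal sketch of what that validation could look like: a Code node (in "Run Once for All Items" mode) wired between the Master Agent and the Switch node. The node itself, its placement, and the fallback behavior are my additions, not part of the demo workflows:

// Hypothetical "Validate Routing JSON" Code node.
const VALID_AGENTS = ['Agent 1', 'Agent 2', 'Agent 3'];

return $input.all().map((item) => {
  let parsed;
  try {
    // The Master Agent's raw text output should be a JSON string
    parsed = JSON.parse(item.json.output);
  } catch (err) {
    // Malformed JSON: substitute a safe default instead of failing the run
    parsed = {
      direct_response_to_user: "Sorry, I couldn't route that request.",
      agent_to_route_to: 'Agent 1',
      forwarded_message: String(item.json.output ?? ''),
    };
  }
  // Unknown agent names also get a default route
  if (!VALID_AGENTS.includes(parsed.agent_to_route_to)) {
    parsed.agent_to_route_to = 'Agent 1';
  }
  return { json: parsed };
});

With a guard like this in place, the Switch node only ever sees one of the three expected values, and the downstream Map node no longer needs to call parseJson() on raw model output.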

Implementation Details You Need to Know

Context Window Optimization

Each agent only loads:

  • Its own system prompt (~100-500 tokens)
  • Relevant conversation history (window-buffered)
  • The forwarded context from the master agent

Compare this to loading all 3 agent instruction sets upfront—you'd waste thousands of tokens per request.
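
As rough arithmetic: three agents at the top of that range is ~1,500 tokens of instructions; replay that on every turn of a multi-turn conversation and the waste compounds, while routing keeps each request down to the one active agent's share.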

The Switch Node Configuration

The Switch node uses n8n's rule-based routing:

{
  "conditions": {
    "conditions": [
      {
        "leftValue": "={{ $json.agent_to_route_to }}",
        "rightValue": "Agent 2",
        "operator": {
          "type": "string",
          "operation": "equals"
        }
      }
    ]
  }
}

Three rules match "Agent 1", "Agent 2", or "Agent 3" exactly. Unmatched requests fall through (you'd want error handling in production).
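
A lighter-weight safety net, if you'd rather not build full validation: a small Code node just before the Switch that normalizes whatever the model produced. This is a sketch of one possible approach, not part of the demo workflows:

// Hypothetical pre-Switch normalization ("Run Once for All Items" mode).
// Tolerates variants like "agent 2", "Agent2", or trailing whitespace.
return $input.all().map((item) => {
  const raw = String(item.json.agent_to_route_to ?? '').toLowerCase();
  const match = raw.match(/agent\s*([123])/);
  return {
    json: {
      ...item.json,
      // Anything unrecognized becomes "unmatched", so a fourth Switch rule
      // can route it to an error-handling branch
      agent_to_route_to: match ? 'Agent ' + match[1] : 'unmatched',
    },
  };
});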

Memory Architecture Trade-offs

Demo 1: Shared memory allows agents to reference each other's outputs naturally, but blurs agent boundaries.

Demo 2: Isolated memory per agent creates cleaner separation but requires explicit context passing via forwarded_message. This scales better for specialized agents with distinct conversation contexts.

Running These Workflows

  1. Import the JSON files into n8n (both available in the GitHub repo linked above)
  2. Configure your OpenAI API credentials in the "OpenAI: gpt-4.1-mini" node
  3. Activate the workflow
  4. Open the chat interface via the webhook URL

Test prompts:

  • "Who are you?" (tests agent self-identification)
  • "Pass me to Agent 2" (tests routing logic)
  • "What did Agent 1 say?" (tests memory persistence)

What's Next?

The natural evolution is bidirectional routing: subagents should be able to return control to the master when they complete their task. This creates a true orchestration layer where:

  • Master Agent delegates to specialists
  • Specialists execute and report back
  • Master Agent synthesizes results or delegates further

Challenge for you: Can you modify Demo 2 to:

  1. Let each subagent indicate completion in its output (maybe via JSON like the master's; one possible shape is sketched after this list)?
  2. Route completed tasks back to the Master Agent?
  3. Have the Master Agent decide whether to route again or provide a final response?
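
As a starting point, a subagent could emit something like this (the field names are purely illustrative, not from the demo workflows):

{
  "task_complete": true,
  "agent": "Agent 2",
  "result": "Summary of what this agent produced",
  "return_to_master": true
}

A second Switch node keyed on task_complete could then send finished work back to the Master Agent and everything else straight to the user.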

This pattern mirrors how tools like LangGraph handle cyclic agent flows, but with explicit control over every transition.


Conclusion

TL;DR: Multi-agent systems in n8n benefit from explicit routing and context management. Use sequential pass-through (Demo 1) for simple pipelines; use master-agent routing with structured output (Demo 2) for dynamic task distribution. Both patterns dramatically reduce token waste compared to loading all agent instructions upfront. Next step: implement agent-to-master return logic for full orchestration loops.

The workflows demonstrated here show that intelligent agent handoffs aren't magic—they're just careful data flow management. n8n's visual interface makes the logic transparent, which is invaluable when debugging complex agent interactions or optimizing for cost.

Try implementing the return-to-master pattern yourself, and share your solution in the comments. What other agent routing patterns would be useful for your projects?
