
Programming Central

Posted on • Originally published at programmingcentral.hashnode.dev

Why Single Agents Fail: Building Scalable AI Teams with the Manager-Worker Pattern

If you've ever built an AI agent using a simple ReAct loop, you know the pain: it works great for simple tasks, but throw a complex, multi-step problem at it, and the whole system buckles. The agent gets lost in its own context window, forgets earlier constraints, or gets stuck in infinite loops. It’s like hiring a single "full-stack developer" to build an entire enterprise platform from scratch—it’s inefficient and prone to failure.

The solution? Hierarchical Agent Teams. This architectural pattern, inspired by microservices in software engineering, introduces a Manager-Worker structure that scales, modularizes, and stabilizes your AI applications. In this post, we’ll dive deep into the theory, explore the analogy to modern software architecture, and walk through a practical TypeScript implementation using LangGraph.js and Zod.

The Core Concept: The Manager-Worker Pattern

In the previous chapter, we explored the ReAct Loop as a foundational agentic design pattern. This pattern creates a cyclical graph structure where an agent alternates between generating a Thought (internal reasoning), selecting an Action (tool call), and processing an Observation (tool result). While powerful for single-agent tasks, the ReAct loop represents a single, monolithic unit of intelligence. When a problem becomes complex—requiring multiple distinct skill sets, parallel processing, or sequential dependency management—relying on a single agent to handle every step leads to inefficiency, context overload, and a lack of modularity.

Hierarchical Agent Teams solve this by introducing a Manager-Worker pattern. This is a structural architecture where a "Manager" agent (often called a Supervisor or Orchestrator) delegates specific subtasks to specialized "Worker" agents. The Manager does not perform the actual work; instead, it focuses on task decomposition and routing. It analyzes the high-level objective, breaks it down into discrete, manageable units, and assigns each unit to the most competent Worker.

To understand this via a web development analogy, imagine building a complex e-commerce platform. You do not hire a single "Full Stack Developer" to write the entire application from the database schema to the CSS styling in one massive code file. Instead, you assemble a team:

  • The Manager: The Project Manager or Tech Lead. They don't write the code; they read the requirements (the prompt), break them into tickets (subtasks), and assign tickets to the appropriate specialists.
  • The Workers: The Backend Developer (database logic), the Frontend Developer (UI/UX), and the DevOps Engineer (deployment). Each is an expert in their domain.

In LangGraph.js, this hierarchy is not just a conceptual grouping; it is a literal graph structure. The Manager is a node in the graph, and the Workers are sub-graphs or individual nodes that the Manager routes to. This creates a scalable, modular workflow where the failure of one component (e.g., a tool call by a Worker) can be isolated and handled without collapsing the entire system.

Why Hierarchical Architectures are Necessary

The transition from single-agent ReAct loops to multi-agent hierarchies is driven by the limitations of context windows and the complexity of reasoning.

  1. Context Management: A single agent attempting to solve a complex problem (e.g., "Plan a vacation itinerary for 5 days in Tokyo, book flights, and reserve hotels") must hold the entire plan, current status, tool outputs, and future steps in its immediate memory (context window). As the conversation grows, the model risks "forgetting" earlier constraints or hallucinating details. By delegating, the Manager only needs to track the high-level state, while Workers handle the granular details of their specific domain.
  2. Specialization and Consistency: Just as a database engineer writes better SQL than a generalist, a specialized agent can be tuned (via prompt engineering and tool selection) for a specific task. A "Researcher" agent can be optimized for browsing and summarizing, while a "Coder" agent is optimized for writing TypeScript. This prevents the "jack of all trades, master of none" problem.
  3. Parallelism and Efficiency: In a linear ReAct loop, steps are sequential. In a hierarchical system, the Manager can dispatch independent tasks to multiple Workers simultaneously. For example, while one Worker searches for flight prices, another can check hotel availability. The Manager awaits the results of all before synthesizing the final answer.
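The parallelism point can be sketched in a few lines of plain TypeScript. This is a minimal, dependency-free illustration, not LangGraph.js code: `searchFlights` and `searchHotels` are hypothetical stand-ins for LLM-backed Workers, and the Manager simply awaits both before synthesizing.

```typescript
// Hypothetical Workers standing in for LLM-backed agents.
async function searchFlights(dest: string): Promise<string> {
  return `Flights to ${dest}: from $450`;
}
async function searchHotels(dest: string): Promise<string> {
  return `Hotels in ${dest}: from $120/night`;
}

// The Manager dispatches independent subtasks concurrently and
// synthesizes the results once all Workers have reported back.
async function manager(dest: string): Promise<string> {
  const [flights, hotels] = await Promise.all([
    searchFlights(dest),
    searchHotels(dest),
  ]);
  return `Itinerary draft — ${flights}; ${hotels}`;
}
```

In a real graph the fan-out would be expressed as parallel branches, but the principle is the same: independent subtasks should never be serialized behind one another.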

The Manager: The Orchestrator of Logic

The Manager agent is the brain of the operation. Its primary function is State Management and Routing. In LangGraph.js, the Manager is typically implemented as a node that utilizes an LLM (Large Language Model) to decide the next step based on the current graph state.

The Manager operates on a "plan" or a "mental model" of the available Workers. It doesn't know how a Worker performs a task, only what the Worker is capable of. This is analogous to an API Gateway in microservices architecture. The Gateway (Manager) knows that POST /users creates a user, but it doesn't know the internal implementation details of the User Service (Worker).

The decision-making process of the Manager often involves a Router mechanism. This can be a deterministic function (e.g., if the task involves math, route to the Calculator Worker) or a dynamic LLM call (e.g., "Based on the user's request, which of the following agents is best suited to handle the next step?"). This router ensures that the flow of control follows the most efficient path through the graph.
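A deterministic router of the kind described above can be a plain function. The Worker names here ("calculator", "researcher", "writer") are illustrative, not part of any library:

```typescript
type WorkerName = "calculator" | "researcher" | "writer";

// Deterministic router: inspect the task description and return the
// name of the Worker node best suited to handle it.
function routeTask(task: string): WorkerName {
  const t = task.toLowerCase();
  if (/\d+\s*[-+*\/]\s*\d+|calculate|sum/.test(t)) return "calculator";
  if (/search|find|look up/.test(t)) return "researcher";
  return "writer"; // default path prevents dead ends in the graph
}
```

Swapping this function for an LLM call with structured output turns the same edge into a dynamic router without changing the graph's shape.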

The Workers: Specialized Execution Units

Workers are the execution engines of the hierarchy. In LangGraph.js, a Worker is defined as a node that possesses a specific set of tools and a specific system prompt. A Worker does not necessarily need to know about the existence of other Workers; its world is limited to its assigned tasks and the tools it can access.

Consider the Tool Use Reflection concept defined earlier. This is crucial for Workers. A Worker might call a tool (e.g., a search API), receive an observation, and then use that observation to refine its internal thought process before making another tool call or handing control back to the Manager. This internal ReAct loop allows the Worker to be autonomous within its domain.

However, unlike a standalone agent, a Worker in a hierarchical team is often "stateless" regarding the overall goal. It processes a specific input (a subtask from the Manager) and produces a specific output (the result). It does not retain memory of the conversation history beyond what is passed in the current state update.
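The Worker's bounded internal loop can be sketched as follows. `callTool` is a hypothetical stand-in for a real tool (e.g., a search API), rigged to succeed on the second attempt so the loop is visible:

```typescript
type StepResult = { done: boolean; observation: string };

// Hypothetical tool: fails on the first attempt, succeeds on the second.
function callTool(query: string, attempt: number): StepResult {
  return attempt >= 2
    ? { done: true, observation: `Results for "${query}"` }
    : { done: false, observation: "No results, refining query..." };
}

// A Worker's internal ReAct loop: act, observe, refine — but always
// with a step budget, so control eventually returns to the Manager.
function runWorker(subtask: string, maxSteps = 5): string {
  for (let step = 1; step <= maxSteps; step++) {
    const { done, observation } = callTool(subtask, step); // Action -> Observation
    if (done) return observation; // hand control back to the Manager
  }
  return "ERROR: max steps exceeded"; // bounded autonomy: never loop forever
}
```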

Visualizing the Hierarchy

To visualize this flow, we can look at the graph structure. The Manager acts as a central hub, dispatching tasks to specialized nodes. The Workers process these tasks and return results, which the Manager then synthesizes.

*Figure: The Manager node orchestrates the workflow by delegating tasks to specialized Worker nodes, which process the assigned work and return their results for final synthesis.*

The Web Development Analogy: Microservices vs. Monolith

To deeply understand the "Why" of Hierarchical Agent Teams, let's expand on the web development analogy, specifically comparing Monolithic Architecture vs. Microservices Architecture.

The Monolith (Single ReAct Agent):
In a monolithic web application, the frontend, backend, and database logic are tightly coupled in one codebase. If you need to update the search algorithm, you might have to redeploy the entire application.

  • In Agents: A single ReAct agent handles everything. If the agent gets stuck in a loop trying to format data perfectly while also trying to search for it, the entire "application" freezes. The context window is shared for everything, leading to congestion.

The Microservices (Hierarchical Agents):
In a microservices architecture, the application is broken down into small, independent services that communicate over a network (like HTTP or gRPC). Each service owns its own data and logic.

  • In Agents: The Manager is the API Gateway. It receives a request (User Prompt). It routes the request to the Search Service (Researcher Worker). The Search Service queries its own database (Vector Store) and returns a JSON response. The Manager then routes that data to the Processing Service (Analyst Worker) to summarize it. Finally, the Manager formats the response for the user.

The "Under the Hood" Connection:
Just as microservices use standardized protocols (REST/GraphQL) to ensure different services can talk to each other, Hierarchical Agents use a standardized State Schema. In LangGraph.js, the State object is the shared language between the Manager and the Workers. If a Worker expects a query string in the state and the Manager provides a query object, the system breaks—just as if a microservice expected JSON but received XML.

Task Decomposition and State Propagation

The magic of the Manager-Worker pattern lies in how information flows. It is not a simple linear chain; it is a recursive or iterative flow.

  1. Decomposition: The Manager looks at the initial state (User Request). It uses an LLM to generate a list of subtasks. For example, "Write a report on AI trends" becomes:
    • Subtask 1: Search for recent AI news.
    • Subtask 2: Summarize the key findings.
    • Subtask 3: Draft the report.
  2. Dispatch: The Manager updates the graph state with the first subtask and routes control to the Researcher Worker.
  3. Execution: The Researcher Worker executes its internal ReAct loop (Thought -> Action -> Observation) using its specific tools (e.g., search_web).
  4. Aggregation: The Researcher Worker writes its findings back to the shared state (e.g., state.research_data = [...]). Control returns to the Manager.
  5. Iteration: The Manager reviews the updated state. It sees that research_data is populated but summary is empty. It routes control to the Analyst Worker.

This propagation ensures that the "knowledge" gained by one specialized agent is available to the next, without the agents needing to communicate directly with one another. They communicate indirectly through the shared state managed by the Manager.
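The Manager's iteration step (point 5 above) reduces to inspecting which state fields are still empty and routing to the Worker responsible for filling them. Field and Worker names below are illustrative:

```typescript
interface ReportState {
  researchData: string[] | null;
  summary: string | null;
  draft: string | null;
}

// The Manager's review: route to the first Worker whose output slot
// is still empty; when every slot is filled, the workflow is done.
function nextWorker(state: ReportState): "researcher" | "analyst" | "writer" | "END" {
  if (state.researchData === null) return "researcher";
  if (state.summary === null) return "analyst";
  if (state.draft === null) return "writer";
  return "END"; // all fields populated: synthesis complete
}
```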

Scalability and Error Handling

Hierarchical systems offer superior scalability. If a specific task requires heavy computation (like analyzing a large PDF), we can scale that specific Worker node independently. In LangGraph.js, this can be implemented by offloading a Worker node to a separate server or queue system, while the Manager remains lightweight.

Furthermore, error handling becomes granular. If the Researcher Worker fails to find a result (e.g., the search tool returns an error), the Manager can detect this state change (e.g., state.error = true) and route to a "Fallback Worker" or attempt the task with a different tool. In a monolithic agent, an error in a tool call often requires restarting the entire reasoning process from scratch.
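This fallback logic is just another conditional edge. A minimal sketch, with hypothetical node names and a retry budget chosen for illustration:

```typescript
interface ErrState {
  error: boolean;
  attempts: number;
}

// Granular error handling: on failure, retry within budget, then
// degrade to a fallback — never restart the whole reasoning process.
function routeAfterWorker(state: ErrState, maxRetries = 2): string {
  if (!state.error) return "manager"; // success: return control upward
  if (state.attempts < maxRetries) return "researcher_retry";
  return "fallback_worker"; // retries exhausted: degrade gracefully
}
```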

Summary of Theoretical Foundations

The Hierarchical Agent Team is not merely a way to organize prompts; it is a structural paradigm for managing complexity. By separating the concerns of Orchestration (Manager) from Execution (Workers), we create systems that are:

  • Modular: Components can be swapped or updated without affecting the whole.
  • Scalable: Heavy tasks can be isolated and parallelized.
  • Robust: Errors are contained within specific nodes.
  • Context-Efficient: Memory is distributed across the graph, reducing the load on the LLM's context window.

This architecture mirrors the evolution of software engineering from monoliths to distributed systems, applying the same proven principles of separation of concerns to the domain of artificial intelligence.

Practical Implementation: A Customer Support SaaS

In a hierarchical multi-agent system, a Supervisor Node acts as a traffic controller. It doesn't perform the specialized work itself; instead, it analyzes the current state of the application (e.g., a user request in a SaaS dashboard) and decides which Worker Agent is best suited to handle the next step. The Supervisor uses structured output (often JSON) to delegate tasks clearly, ensuring the receiving worker knows exactly what to do.

This example simulates a simple Customer Support SaaS Application. We have a Supervisor that routes user queries to either a BillingWorker (for payment issues) or a TechnicalWorker (for bugs). We will use zod for schema validation to ensure the Supervisor's output is strictly typed and reliable.

Visualizing the Flow

The following diagram illustrates the control flow. The Supervisor receives the initial request, makes a decision, and invokes the appropriate worker node. The worker then updates the state, and the process terminates.

*Figure: The Supervisor receives the initial request, makes a decision, and invokes the appropriate worker node, which then updates the state and terminates the process.*

Implementation

This code is fully self-contained. It simulates the LLM calls using mock functions to ensure it runs without external API keys, but the structure mirrors a real production environment using LangGraph.js and Zod.

import { StateGraph, END } from "@langchain/langgraph";
import { z } from "zod";

// ==========================================
// 1. Define State and Schemas
// ==========================================

/**
 * The shared Graph State. This object is passed between nodes.
 * In a real app, this would contain user session data, conversation history, etc.
 */
type GraphState = {
  userRequest: string;
  route: string | null; // The decision made by the Supervisor
  finalResponse: string | null; // The result from the worker
};

// Schema for the Supervisor's decision.
// The Supervisor MUST output JSON matching this schema.
const SupervisorDecisionSchema = z.object({
  route: z.enum(["billing", "technical"]).describe("The worker to delegate the task to."),
  reasoning: z.string().describe("Why this route was chosen."),
});

// ==========================================
// 2. Define Agent Nodes
// ==========================================

/**
 * Supervisor Node: Analyzes the state and decides which worker to invoke.
 * In a real scenario, this would call an LLM (e.g., GPT-4) with a structured output prompt.
 * Here, we mock the logic for clarity and reliability.
 */
async function supervisorNode(state: GraphState): Promise<Partial<GraphState>> {
  console.log(`[Supervisor] Analyzing: "${state.userRequest}"`);

  // Mock LLM decision logic based on keywords
  let decision: z.infer<typeof SupervisorDecisionSchema>;

  if (state.userRequest.toLowerCase().includes("bill") || state.userRequest.toLowerCase().includes("charge")) {
    decision = { route: "billing", reasoning: "User mentioned billing terms." };
  } else if (state.userRequest.toLowerCase().includes("bug") || state.userRequest.toLowerCase().includes("error")) {
    decision = { route: "technical", reasoning: "User mentioned technical issues." };
  } else {
    // Default fallback
    decision = { route: "technical", reasoning: "Unclear request, defaulting to technical support." };
  }

  // Validate the output against the schema (Defensive Programming)
  const validatedDecision = SupervisorDecisionSchema.parse(decision);

  console.log(`[Supervisor] Decision: Route to ${validatedDecision.route}`);

  // Update state with the decision
  return { route: validatedDecision.route };
}

/**
 * Worker Node: Billing Specialist.
 * Handles specific logic related to invoices and payments.
 */
async function billingWorker(state: GraphState): Promise<Partial<GraphState>> {
  console.log(`[Billing Worker] Processing request...`);
  // Simulate database lookup or API call
  const response = `Billing Report: Your last invoice #12345 was paid successfully. No issues found regarding "${state.userRequest}".`;
  return { finalResponse: response };
}

/**
 * Worker Node: Technical Support.
 * Handles bugs, errors, and system functionality.
 */
async function technicalWorker(state: GraphState): Promise<Partial<GraphState>> {
  console.log(`[Technical Worker] Processing request...`);
  // Simulate debugging logic
  const response = `Technical Analysis: We investigated the error regarding "${state.userRequest}". A patch has been deployed.`;
  return { finalResponse: response };
}

// ==========================================
// 3. Define Routing Logic (Edges)
// ==========================================

/**
 * Conditional Edge: Determines the next step based on the Supervisor's decision.
 * This is the "Delegation Strategy".
 */
function routeDecision(state: GraphState): string {
  if (!state.route) {
    // If the supervisor hasn't decided yet, stay on the supervisor (loop prevention)
    return "supervisor";
  }

  // Route to the specific worker node based on the 'route' field in state
  if (state.route === "billing") {
    return "billing_worker";
  } else if (state.route === "technical") {
    return "technical_worker";
  }

  // If we reach here, the state is invalid or unknown
  throw new Error("Invalid routing decision detected.");
}

// ==========================================
// 4. Construct the Graph
// ==========================================

/**
 * Initialize the State Graph.
 * Each channel defines how a node's partial update is merged into the
 * shared state (here: "last write wins", unless the update is nullish).
 */
const workflow = new StateGraph<GraphState>({
  channels: {
    userRequest: { value: (prev, next) => next ?? prev, default: () => "" },
    route: { value: (prev, next) => next ?? prev, default: () => null },
    finalResponse: { value: (prev, next) => next ?? prev, default: () => null },
  },
});

// Add nodes to the graph
workflow.addNode("supervisor", supervisorNode);
workflow.addNode("billing_worker", billingWorker);
workflow.addNode("technical_worker", technicalWorker);

// Define the entry point
workflow.setEntryPoint("supervisor");

// Define conditional edges from the supervisor
// "supervisor" node -> checks routeDecision -> goes to "billing_worker" or "technical_worker"
workflow.addConditionalEdges("supervisor", routeDecision);

// Define terminal edges (workers go to END)
workflow.addEdge("billing_worker", END);
workflow.addEdge("technical_worker", END);

// Compile the graph
const app = workflow.compile();

// ==========================================
// 5. Execution
// ==========================================

/**
 * Helper to run the graph and log results.
 */
async function runTest(request: string) {
  console.log("\n----------------------------------------");
  console.log(`Starting Execution: "${request}"`);
  console.log("----------------------------------------");

  const initialState: GraphState = {
    userRequest: request,
    route: null,
    finalResponse: null,
  };

  // Stream events to see the flow in real-time
  const stream = await app.stream(initialState);

  for await (const event of stream) {
    // The stream yields updates from nodes
    const nodeName = Object.keys(event)[0];
    const nodeState = event[nodeName];

    if (nodeState.finalResponse) {
      console.log(`\n>>> FINAL OUTPUT: ${nodeState.finalResponse}`);
    }
  }
}

// Run simulations
(async () => {
  // Test Case 1: Billing Issue
  await runTest("I think there is a problem with my invoice charge.");

  // Test Case 2: Technical Issue
  await runTest("The dashboard is throwing a 500 error.");
})();

Detailed Line-by-Line Explanation

Here is the breakdown of the logic into a numbered list for clarity.

1. State and Schema Definition

  • GraphState Type: Defines the shape of the data flowing through the graph. It tracks the user's request, the routing decision (route), and the final output. In a real SaaS app, this might also include userId, sessionId, or authToken.
  • SupervisorDecisionSchema (Zod): This is critical for reliability. Instead of asking the LLM to output free text, we enforce a JSON structure. Zod ensures that the Supervisor's output is validated at runtime. If the LLM hallucinates a format, Zod throws an error, preventing downstream crashes.

2. Agent Nodes

  • supervisorNode: This is the "Brain" of the hierarchy.
    • It receives the current state.
    • Under the Hood: In a production environment, you would pass state.userRequest to an LLM prompt like: "Analyze the user request and output JSON: { route: 'billing' | 'technical' }".
    • Mock Logic: For this "Hello World" example, we use simple string matching (includes) to simulate the LLM's decision-making process.
    • Validation: SupervisorDecisionSchema.parse(decision) validates the decision. This is a best practice to handle LLM non-determinism.
  • billingWorker & technicalWorker: These are the "Hands". They only care about their specific domain. They receive the state (which now includes the route decision) and perform the actual business logic (e.g., querying a database, calling an API). They return an updated state containing the finalResponse.

3. Routing Logic (Edges)

  • routeDecision Function: This function acts as the router. It looks at the state.route field populated by the Supervisor.
    • It returns a string matching the name of the next node to execute (e.g., "billing_worker").
    • This separation of logic (Node) and flow control (Edge) is a core strength of LangGraph. It allows you to visualize the graph flow independently of the node code.

4. Graph Construction

  • new StateGraph: Initializes the graph builder. We pass the TypeScript type GraphState for full type safety.
  • workflow.addNode: Registers the functions we defined as nodes in the graph.
  • workflow.setEntryPoint: Defines where the graph starts (always the Supervisor in this pattern).
  • workflow.addConditionalEdges: This is the dynamic part. Instead of a fixed path (A -> B -> C), the graph asks routeDecision where to go after the Supervisor runs.
  • workflow.compile: Turns the declarative definition into an executable runtime object.

5. Execution

  • app.stream: This method executes the graph. It returns an async iterator, allowing us to "watch" the graph execute step-by-step. This is useful for real-time UI updates in a web app (e.g., showing a loading spinner while the Supervisor decides, then showing the worker's result).

Common Pitfalls

When building hierarchical agents in TypeScript/LangGraph, watch out for these specific issues:

  1. LLM Hallucination & JSON Parsing Errors

    • The Issue: LLMs often output invalid JSON or add conversational text (e.g., "Sure, here is the JSON: { ... }") which breaks JSON.parse().
    • The Fix: Never rely on JSON.parse directly on raw LLM output. Use a schema validator like Zod (as shown in the code) or LangChain's withStructuredOutput. This forces the LLM to adhere to a strict schema and handles parsing errors gracefully.
  2. State Mutation & Reference Issues

    • The Issue: In JavaScript/TypeScript, objects are passed by reference. If you mutate the state object directly inside a node (e.g., state.route = 'billing'), you might cause side effects or race conditions in concurrent streams.
    • The Fix: Always return a new partial state object (e.g., { route: 'billing' }). LangGraph handles merging this partial object into the main state immutably. Avoid mutating the state argument directly.
  3. Infinite Loops

    • The Issue: If the Supervisor fails to make a decision or the routing logic fails, the graph might loop back to the Supervisor indefinitely, consuming expensive LLM tokens.
    • The Fix: Implement a max_iterations counter in your graph state or use a "fallback" node. In the routeDecision function, ensure there is a default path or an error state that terminates the graph.
  4. Vercel/AWS Lambda Timeouts

    • The Issue: Serverless functions have strict timeouts (e.g., 10 seconds on Vercel Hobby plans). Hierarchical graphs involve multiple LLM calls and processing steps, which can easily exceed this limit.
    • The Fix:
      • Streaming: Use app.stream() instead of app.invoke() to send incremental updates to the client, keeping the connection alive.
      • Background Execution: For long workflows, trigger the graph execution via a background job (e.g., Vercel Background Functions, Inngest, or AWS SQS) and notify the frontend via WebSockets or polling when the result is ready.
  5. Async/Await Loops in Streams

    • The Issue: When iterating over app.stream(), failing to await properly or mixing synchronous logic with async streams can block the event loop, causing performance degradation in Node.js.
    • The Fix: Always use `for await (const event of stream)` to iterate over the stream asynchronously, ensuring non-blocking execution.
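The loop guard from pitfall 3 can be sketched as a small extension of the routing function: track supervisor visits in the state and force termination once a budget is exhausted. The visit counter, threshold, and node names are illustrative:

```typescript
interface GuardedState {
  route: string | null;
  supervisorVisits: number;
}

// Loop guard: cap how many times control may return to the supervisor,
// so an undecided router cannot burn tokens indefinitely.
function guardedRoute(state: GuardedState, maxVisits = 3): string {
  if (state.supervisorVisits >= maxVisits) {
    return "END"; // hard stop: no more LLM calls
  }
  return state.route ?? "supervisor"; // undecided: one more supervisor pass
}
```

Each node that routes back to the supervisor would increment `supervisorVisits` in its returned partial state, so the counter travels with the graph state rather than living in module scope.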

Conclusion

The transition from monolithic single-agent systems to hierarchical multi-agent teams is a necessary evolution for building complex, production-ready AI applications. By adopting the Manager-Worker pattern, you gain modularity, scalability, and robustness. The Manager handles the high-level orchestration and state management, while specialized Workers execute domain-specific tasks with precision.

This architecture not only mirrors proven software engineering principles like microservices but also addresses the unique challenges of AI, such as context window limitations and reasoning complexity. By implementing this pattern with LangGraph.js and Zod, you can create systems that are both powerful and maintainable, ready to handle the demands of real-world SaaS applications.

The concepts and code demonstrated here are drawn directly from the comprehensive roadmap laid out in the book Autonomous Agents. Building Multi-Agent Systems and Workflows with LangGraph.js Amazon Link of the AI with JavaScript & TypeScript Series.
The ebook is also on Leanpub.com: https://leanpub.com/JSTypescriptAutonomousAgents.
