<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: JackChen</title>
    <description>The latest articles on DEV Community by JackChen (@jackchenme).</description>
    <link>https://dev.to/jackchenme</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3880820%2F2c27f98e-f812-41c5-894d-7011ab2903e9.png</url>
      <title>DEV Community: JackChen</title>
      <link>https://dev.to/jackchenme</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jackchenme"/>
    <language>en</language>
    <item>
      <title>Goal-Driven Agent Orchestration vs Explicit Graphs: A TypeScript Framework Taxonomy</title>
      <dc:creator>JackChen</dc:creator>
      <pubDate>Wed, 03 Jun 2026 12:57:35 +0000</pubDate>
      <link>https://dev.to/jackchenme/goal-driven-agent-orchestration-vs-explicit-graphs-a-typescript-framework-taxonomy-6i3</link>
      <guid>https://dev.to/jackchenme/goal-driven-agent-orchestration-vs-explicit-graphs-a-typescript-framework-taxonomy-6i3</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Most multi-agent framework reviews compare features. This post argues you should compare on a different axis first: where the framework places the decomposition cost. Goal-first frameworks pay it at runtime in tokens; graph-first frameworks pay it at design time in code. The right default depends on what kind of work your team actually has.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you spent any time in 2025 evaluating TypeScript agent frameworks, you probably hit the same wall I did. The product pages do not distinguish themselves on the axes that matter once you ship anything. They all promise "multi-agent". They all show a cooperating-agents diagram. They all link to half a dozen integrations. None of them tell you what the framework is going to make you write at 2am when a customer reports a regression.&lt;/p&gt;

&lt;p&gt;The thing the pages do not tell you is who decides the topology. That is the central design choice of every multi-agent framework, and it sorts cleanly into two camps. Graph-first frameworks make you decide. Goal-first frameworks let a coordinator agent decide for you at runtime. Each has costs the other does not.&lt;/p&gt;

&lt;p&gt;I am the maintainer of one of the goal-first frameworks (&lt;a href="https://github.com/open-multi-agent/open-multi-agent" rel="noopener noreferrer"&gt;open-multi-agent&lt;/a&gt;, the TypeScript-ecosystem answer to CrewAI), so my bias is going to show. I am going to try to be honest about what goal-first costs you, because I think the choice between paradigms is more interesting than which framework wins a feature checklist.&lt;/p&gt;

&lt;h2&gt;
  
  
  The two-axis problem with most comparison posts
&lt;/h2&gt;

&lt;p&gt;Read any "Top N TypeScript Agent Frameworks 2026" listicle and you will get a feature grid: which framework supports streaming, structured output, retries, observability, MCP, Zod schemas, lifecycle hooks, agent handoffs, and so on. Every framework gets a check or a partial mark.&lt;/p&gt;

&lt;p&gt;The grid is not useless, but it conceals the design choice that determines whether you will be productive on the framework six months in. That choice is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Who decides which agent runs next, and when?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Two answers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You do, at design time.&lt;/strong&gt; The framework gives you primitives for nodes, edges, conditions, and you wire them. The execution path is whatever your code says it is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A coordinator does, at runtime.&lt;/strong&gt; You declare the agents and the goal. The framework makes an LLM call to plan a task DAG, then executes it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Call these &lt;strong&gt;graph-first&lt;/strong&gt; and &lt;strong&gt;goal-first&lt;/strong&gt; respectively. They are not feature sets. They are paradigms with different cost shapes, different failure modes, and different right-fit use cases.&lt;/p&gt;

&lt;p&gt;This post is the taxonomy. The same two-agent task appears inline below, implemented in four TypeScript frameworks (LangGraph.js, Mastra workflows, KaibanJS, open-multi-agent), so you can read each surface and judge for yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "graph-first" and "goal-first" mean
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Graph-first
&lt;/h3&gt;

&lt;p&gt;You declare the topology yourself. Nodes are agents (or steps). Edges declare execution order or conditions. The compiled object is a deterministic state machine. The framework executes it; it does not change it.&lt;/p&gt;

&lt;p&gt;Canonical examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LangGraph.js&lt;/strong&gt;: &lt;code&gt;StateGraph&lt;/code&gt; with explicit &lt;code&gt;Annotation.Root&lt;/code&gt; schema, &lt;code&gt;addNode&lt;/code&gt; per agent, &lt;code&gt;addEdge&lt;/code&gt; for transitions, &lt;code&gt;addConditionalEdges&lt;/code&gt; for routing. You compile the graph and invoke it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mastra workflows&lt;/strong&gt;: &lt;code&gt;createWorkflow&lt;/code&gt; with typed &lt;code&gt;inputSchema&lt;/code&gt; and &lt;code&gt;outputSchema&lt;/code&gt;, &lt;code&gt;createStep&lt;/code&gt; per stage, &lt;code&gt;.then()&lt;/code&gt; to chain. A linear-graph DSL on top of typed steps.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Goal-first
&lt;/h3&gt;

&lt;p&gt;You declare agents (role, model, system prompt). You hand the orchestrator a sentence-level goal. At runtime, a coordinator agent (an LLM call you do not write) decomposes the goal into a task DAG, assigns tasks to agents, runs them in dependency order, and synthesizes a final answer.&lt;/p&gt;

&lt;p&gt;Canonical examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CrewAI&lt;/strong&gt; (Python): the prototype. &lt;code&gt;Crew.kickoff()&lt;/code&gt; on a list of agents plus tasks. CrewAI uses the role/goal/backstory metaphor to seed agent behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;open-multi-agent&lt;/strong&gt;: &lt;code&gt;OpenMultiAgent&lt;/code&gt; orchestrator plus &lt;code&gt;runTeam(team, goal)&lt;/code&gt;. The coordinator pattern lives in &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/src/orchestrator/orchestrator.ts" rel="noopener noreferrer"&gt;&lt;code&gt;src/orchestrator/orchestrator.ts&lt;/code&gt;&lt;/a&gt; (&lt;code&gt;runTeam&lt;/code&gt; method) and is described inline in the JSDoc as six steps: decompose, queue, schedule, execute with parallelism, persist results, synthesize.&lt;/li&gt;
&lt;/ul&gt;
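
&lt;p&gt;The scheduling step in that list can be pictured with a small sketch. This is not open-multi-agent's implementation: &lt;code&gt;PlannedTask&lt;/code&gt; and &lt;code&gt;scheduleBatches&lt;/code&gt; are hypothetical names invented for illustration, and the plan is hard-coded where a real coordinator would produce it from an LLM call. It only shows how a task DAG becomes batches that run in dependency order, with tasks in the same batch free to run in parallel.&lt;/p&gt;

```typescript
// Hypothetical shape for illustration; not open-multi-agent's real types.
type PlannedTask = { id: string; agent: string; dependsOn: string[] }

// Groups task ids into batches. Every task in a batch has all of its
// dependencies completed by earlier batches, so a batch could run in parallel.
function scheduleBatches(tasks: PlannedTask[]): string[][] {
  const done: { [id: string]: boolean } = {}
  const batches: string[][] = []
  let remaining = tasks.slice()

  while (remaining.length !== 0) {
    // A task is ready once each of its dependencies has been marked done.
    const ready = remaining.filter(function (t) {
      return t.dependsOn.every(function (dep) { return done[dep] === true })
    })
    if (ready.length === 0) {
      throw new Error('Cycle or unknown dependency in planned tasks')
    }
    for (const t of ready) done[t.id] = true
    batches.push(ready.map(function (t) { return t.id }))
    remaining = remaining.filter(function (t) { return done[t.id] !== true })
  }
  return batches
}

// Stand-in for what a coordinator might plan for the researcher/writer goal.
const plan: PlannedTask[] = [
  { id: 'research', agent: 'researcher', dependsOn: [] },
  { id: 'write', agent: 'writer', dependsOn: ['research'] },
]
const batches = scheduleBatches(plan)
console.log(JSON.stringify(batches)) // [["research"],["write"]]
```

&lt;p&gt;In a graph-first framework you would write the equivalent of &lt;code&gt;plan&lt;/code&gt; by hand; in a goal-first one the coordinator produces it on each run.&lt;/p&gt;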

&lt;h3&gt;
  
  
  What about KaibanJS
&lt;/h3&gt;

&lt;p&gt;KaibanJS lives between the two. You define &lt;code&gt;Agent&lt;/code&gt; and &lt;code&gt;Task&lt;/code&gt; objects in an ordered task list; dependencies are carried by convention (the writer's task description refers to the researcher's output) rather than by declared edges, and the framework's Kanban board moves tasks across columns. Topology is mostly explicit in your task list, but the state machine that drives execution is hidden inside the board abstraction. Call it a &lt;strong&gt;hybrid&lt;/strong&gt;, closer to graph-first than goal-first in practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Four-way side by side: the same task in four frameworks
&lt;/h2&gt;

&lt;p&gt;The task is small on purpose: a &lt;code&gt;researcher&lt;/code&gt; agent gathers a brief on a topic, and a &lt;code&gt;writer&lt;/code&gt; agent turns the brief into a 400-word summary. The writer depends on the researcher. The snippets below are the relevant core of each implementation, written to show the API surface side by side rather than as a packaged runnable project.&lt;/p&gt;

&lt;h3&gt;
  
  
  LangGraph.js
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;State&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Annotation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Root&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="nx"&gt;Annotation&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;brief&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="nx"&gt;Annotation&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Annotation&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;State&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;State&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Topic: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;brief&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;State&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;State&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Brief:\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;brief&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;State&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;researcher&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;writer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="nx"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEdge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;__start__&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;researcher&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEdge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;researcher&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;writer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEdge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;writer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;__end__&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you see in the code: a state schema, two node functions, three edges, a compile-and-invoke. The dependency between researcher and writer is the edge &lt;code&gt;addEdge('researcher', 'writer')&lt;/code&gt;. The writer reads &lt;code&gt;state.brief&lt;/code&gt; because the researcher wrote to it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mastra workflows
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;researchStep&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createStep&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;research&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;outputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;brief&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;inputData&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;brief&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;callLLM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;writeStep&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createStep&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;write&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;brief&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;outputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;inputData&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;callLLM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;brief&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;wf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createWorkflow&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;r+w&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;...,&lt;/span&gt; &lt;span class="na"&gt;outputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;researchStep&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;writeStep&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you see: Zod-typed steps, a &lt;code&gt;.then()&lt;/code&gt; chain. The dependency between research and write is positional in the chain. Mastra workflows expose more types than LangGraph at this complexity level, and trade graph generality for linear DSL clarity.&lt;/p&gt;

&lt;h3&gt;
  
  
  KaibanJS
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;researchTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Research the topic {topic} and produce a brief.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;writeTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Using the brief produced previously, write a 400-word summary.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;team&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Team&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Research and Write&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;researchTask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;writeTask&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;team&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you see: agents and tasks declared, dependency implicit in task order and prose references. The board state machine drives execution under the hood.&lt;/p&gt;

&lt;h3&gt;
  
  
  open-multi-agent
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenMultiAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;defaultModel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;defaultProvider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;anthropic&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;team&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createTeam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;research-and-write&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;research-and-write&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;   &lt;span class="c1"&gt;// researcher + writer AgentConfig&lt;/span&gt;
  &lt;span class="na"&gt;sharedMemory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;goal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Research "Multi-agent orchestration tradeoffs in TypeScript" and write a 400-word summary.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;runTeam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;team&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;goal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you see: agents declared, a goal sentence, one call. On a goal complex enough to need it, &lt;code&gt;runTeam()&lt;/code&gt; runs a coordinator planning pass that decomposes the goal into a DAG (here a &lt;code&gt;researcher → writer&lt;/code&gt; dependency), runs the tasks in dependency order, and synthesizes the final summary. One honest caveat, expanded in the cost section below: this particular goal is simple enough that OMA short-circuits and skips the coordinator entirely, dispatching straight to one agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the cost actually lives
&lt;/h2&gt;

&lt;p&gt;These four snippets look like a "fewer lines = better" comparison. They are not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Graph-first frameworks make you pay the decomposition cost in code, at design time, in your file.&lt;/strong&gt; You write the State schema, you write the edges, you decide the routing. The cost is visible, version-controlled, diffable, and stable: the graph behaves the same on Tuesday as it did on Monday because nothing about it has changed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Goal-first frameworks make you pay the decomposition cost at runtime, in tokens, in a coordinator LLM call.&lt;/strong&gt; The cost is invisible in your source code but visible in your bill and your trace. A non-trivial &lt;code&gt;runTeam()&lt;/code&gt; call spends a coordinator turn to plan the DAG, plus a synthesis turn to combine results, before any of your agents do their actual work. Those two extra turns are the overhead: proportionally heavy on a tiny job, shrinking toward noise as the real work grows. Simple goals skip it entirely, through the short-circuit described below.&lt;/p&gt;
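
&lt;p&gt;A rough way to see that shrinking overhead is to count turns. The sketch below is a back-of-envelope model, not a measurement: it assumes exactly one planning turn and one synthesis turn per run, and it counts turns rather than tokens, so real bills will differ. &lt;code&gt;coordinationShare&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```typescript
// Fraction of LLM turns spent on coordination rather than agent work.
// Assumption: one planning turn plus one synthesis turn per run.
function coordinationShare(agentTurns: number): number {
  const overheadTurns = 2
  return overheadTurns / (overheadTurns + agentTurns)
}

console.log(coordinationShare(2))  // 0.5: half the turns are overhead
console.log(coordinationShare(38)) // 0.05: overhead is one turn in twenty
```

&lt;p&gt;Two agent turns put half the run into coordination; thirty-eight put it at five percent.&lt;/p&gt;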

&lt;p&gt;The graph-first model is more predictable. The goal-first model is more compressive: the line count is lower because the work moved into a place you cannot see. Neither is free.&lt;/p&gt;

&lt;p&gt;This is the part most comparisons skip. Picking a framework is not picking a feature set. It is picking where you want the topology cost to live: in your repo or in your API bill.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which paradigm fits your work
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Graph-first when the shape of the work is fixed.&lt;/strong&gt; A pipeline that runs the same five steps on every record is worth writing once as an explicit graph; goal-first would just rediscover that DAG every run and bill you a coordinator turn for it. Same when you need an audit trail (compliance, legal, medical, finance), where "the coordinator decided to skip step 4 this run" is not an answer and an explicit edge from node 3 to node 5 is. Same, doubly, when the problem is a real state machine: cycles, retries that mutate state, interrupt-and-resume across long pauses; LangGraph.js is the most mature option there. And the most common reason of all, less technical than the rest: if your senior engineers do not yet trust an LLM making the routing call, build graph-first and earn that trust before you hand it over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Goal-first when the shape of the work varies.&lt;/strong&gt; "Summarize this 3-page contract" and "summarize this 80-page master services agreement" want different sub-tasks; hard-coding the maximum graph and gating it with conditionals is doable but ugly, and a coordinator handles it naturally. It is also the faster paradigm while you are still discovering the decomposition, since changing a sentence-level goal beats redrawing a graph after every user conversation. Once you have several similarly-capable agents, letting a coordinator route between them beats encoding a soft "who does what" call as hard edges. And if your product's value is that it adapts to the user's goal rather than running a fixed automation, you want that planning surface visible, not buried in a graph DSL.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Coordinator: what the framework writes for you in goal-first
&lt;/h2&gt;

&lt;p&gt;Goal-first sounds magical when you describe it from outside. From inside, the trick is mundane: it is one well-prompted LLM call.&lt;/p&gt;

&lt;p&gt;When you call &lt;code&gt;runTeam(team, goal)&lt;/code&gt; in open-multi-agent, the orchestrator does this in the &lt;code&gt;runTeam&lt;/code&gt; method (full source in &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/src/orchestrator/orchestrator.ts" rel="noopener noreferrer"&gt;&lt;code&gt;src/orchestrator/orchestrator.ts&lt;/code&gt;&lt;/a&gt;, see the JSDoc on the method):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A temporary coordinator agent receives the goal and the list of agents on the team, with their names, models, and system prompts.&lt;/li&gt;
&lt;li&gt;The coordinator is asked to output a JSON array of tasks. Each task has a title, description, assignee (one of the agent names), and an optional &lt;code&gt;dependsOn&lt;/code&gt; field listing which earlier tasks must complete first.&lt;/li&gt;
&lt;li&gt;Title-based dependency tokens are resolved to task IDs and the array is loaded into a &lt;code&gt;TaskQueue&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;A scheduler assigns each ready task to its named agent. The queue's topological dependency resolution (&lt;code&gt;src/task/queue.ts&lt;/code&gt;) figures out which tasks are unblocked.&lt;/li&gt;
&lt;li&gt;Independent tasks run in parallel up to &lt;code&gt;maxConcurrency&lt;/code&gt;. Results are written to shared memory after each task completes, so downstream tasks can read them.&lt;/li&gt;
&lt;li&gt;After all tasks complete, the coordinator runs once more to synthesize a final answer from the collected outputs.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;TeamRunResult&lt;/code&gt; is returned with per-agent token usage and a total.&lt;/li&gt;
&lt;/ol&gt;
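
&lt;p&gt;To make the dependency-resolution step concrete, here is a minimal sketch of grouping tasks into parallel waves. It is not the open-multi-agent implementation: the &lt;code&gt;QueuedTask&lt;/code&gt; type and the &lt;code&gt;planWaves&lt;/code&gt; function are hypothetical names for illustration, and the sketch ignores &lt;code&gt;maxConcurrency&lt;/code&gt;, shared memory, and the LLM calls themselves.&lt;/p&gt;

```typescript
// Hypothetical task shape, loosely following the coordinator output described above.
type QueuedTask = {
  id: string
  assignee: string
  dependsOn: string[]
}

// Group tasks into waves: every task in a wave has all of its dependencies
// completed by earlier waves, so the tasks inside one wave could run in parallel.
function planWaves(tasks: QueuedTask[]): string[][] {
  const done = new Set([] as string[])
  let pending = [...tasks]
  const waves: string[][] = []

  while (pending.length > 0) {
    const ready = pending.filter((task) => task.dependsOn.every((dep) => done.has(dep)))
    if (ready.length === 0) {
      // Work remains but nothing is unblocked: a cycle or an unknown dependency.
      throw new Error("unresolvable dependencies: " + pending.map((t) => t.id).join(", "))
    }
    waves.push(ready.map((task) => task.id))
    for (const task of ready) done.add(task.id)
    pending = pending.filter((task) => !done.has(task.id))
  }
  return waves
}

const exampleWaves = planWaves([
  { id: "research-a", assignee: "researcher", dependsOn: [] },
  { id: "research-b", assignee: "researcher", dependsOn: [] },
  { id: "draft", assignee: "writer", dependsOn: ["research-a", "research-b"] },
])
console.log(JSON.stringify(exampleWaves))
```

&lt;p&gt;The two research tasks land in the first wave and the draft in the second. A real scheduler would likely start each task as soon as its own dependencies finish rather than waiting for a whole wave, but the unblocking rule is the same.&lt;/p&gt;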

&lt;p&gt;There is a short-circuit: if the goal is short and contains no multi-step signals, the orchestrator skips the coordinator and dispatches the goal directly to the best-matching agent. That keeps trivial goals from paying the planning tax.&lt;/p&gt;

&lt;p&gt;The cost is real, and the example here has a sharp edge worth knowing. The two-agent goal in the OMA example above is simple enough that the short-circuit fires: the coordinator never runs, the goal is dispatched straight to a single agent, and for this goal the tax is zero. You pay the planning-plus-synthesis overhead only once a goal is genuinely multi-step, and when you do, it is proportionally heavy on small jobs and minor on large ones. Do not trust a generic number for it: measure your own workload from &lt;code&gt;result.totalTokenUsage&lt;/code&gt;, because the ratio depends entirely on how large your real agent work is next to the two coordinator turns.&lt;/p&gt;
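
&lt;p&gt;The arithmetic behind "heavy on small jobs, minor on large ones" is simple enough to sketch. The numbers below are made up for illustration, not measurements, and the &lt;code&gt;RunTokens&lt;/code&gt; fields are hypothetical: I am assuming you can split your own trace into coordinator tokens and agent tokens, which is not something this post claims &lt;code&gt;totalTokenUsage&lt;/code&gt; does for you.&lt;/p&gt;

```typescript
// Hypothetical token breakdown for one run; substitute numbers from your own traces.
type RunTokens = {
  coordinatorPlan: number      // tokens spent on the planning turn
  coordinatorSynthesis: number // tokens spent on the synthesis turn
  agentWork: number            // tokens the agents spent on the actual tasks
}

// Fraction of the total token bill that went to the two coordinator turns.
function coordinatorOverhead(run: RunTokens): number {
  const overhead = run.coordinatorPlan + run.coordinatorSynthesis
  const total = overhead + run.agentWork
  return total === 0 ? 0 : overhead / total
}

// Same coordinator cost, very different share of the bill.
const smallJobShare = coordinatorOverhead({
  coordinatorPlan: 1500,
  coordinatorSynthesis: 1500,
  agentWork: 3000,
})
const largeJobShare = coordinatorOverhead({
  coordinatorPlan: 1500,
  coordinatorSynthesis: 1500,
  agentWork: 97000,
})
console.log(smallJobShare, largeJobShare)
```

&lt;p&gt;With these illustrative inputs the coordinator is half the bill on the small job and 3% on the large one, which is why a single generic overhead figure tells you very little.&lt;/p&gt;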

&lt;h2&gt;
  
  
  What goal-first loses on
&lt;/h2&gt;

&lt;p&gt;Now the honest counterweight: where goal-first is the weaker choice.&lt;/p&gt;

&lt;p&gt;The clearest cost is tokens. Whenever a goal is complex enough that the coordinator runs, you pay for that planning turn plus a synthesis turn on top of the actual agent work. Putting the coordinator on a cheap model softens that but does not remove it. There is also a subtler cost that grows with the team: every dependent task carries upstream results forward as context, so agents spend tokens re-reading state they did not produce. Ken W Alger aptly named that the &lt;a href="https://dev.to/kenwalger/comment/38e55"&gt;"Prose Tax"&lt;/a&gt;, and it is the argument for passing explicit, typed results between agents rather than free-form prose.&lt;/p&gt;

&lt;p&gt;Then there is control flow. A flat DAG is the natural unit, so anything that is really a state machine (cycles, retry-with-mutation, interrupt-and-resume across long pauses, deeply nested conditionals) can be bent to fit but should not be. If your problem is a state machine, do not pretend it is a DAG.&lt;/p&gt;

&lt;p&gt;Debugging is less predictable too. The coordinator can pick a slightly different decomposition between runs, which shifts the output shape. Low temperature, a pinned prompt, and reviewing the plan through the &lt;code&gt;onPlanReady&lt;/code&gt; hook all help, but none of them are as deterministic as reading a graph file you wrote yourself.&lt;/p&gt;

&lt;p&gt;And there is social proof. Graph-first frameworks have shipped at LinkedIn, Klarna, and J.P. Morgan; goal-first is younger in production at that tier, so it is the harder sell in a board-level review today. For a documented case of an LLM-driven routing system hitting these walls and the engineering response, see &lt;a href="https://dev.to/jackchenme/5-walls-multi-agent-frameworks-hit-receipts-from-mastras-year-of-network-to-supervisor-3am3"&gt;Mastra's year of network-to-supervisor&lt;/a&gt;, traced through its own issue tracker.&lt;/p&gt;

&lt;h2&gt;
  
  
  Human-in-the-loop: the bridge to production, already shipped
&lt;/h2&gt;

&lt;p&gt;The path from goal-first as a fast-prototyping paradigm to goal-first as a production paradigm runs through human-in-the-loop, and in open-multi-agent that path is already in place. Two primitives, both shipped, let you put a human between the plan and its execution: the &lt;code&gt;onPlanReady&lt;/code&gt; hook (a callback that receives the coordinator's task list and returns approve or reject before anything runs) and PlanOnly mode (&lt;code&gt;runTeam(team, goal, { planOnly: true })&lt;/code&gt;, which returns the plan without executing it, so you can inspect or edit it first).&lt;/p&gt;

&lt;p&gt;This is the move that brings goal-first into the same audit story graph-first has had since day one: the planning is still LLM-driven, but the executed plan is human-approved. You get the goal-first benefit (you did not write the topology yourself) and the graph-first guarantee (a human signed off on it).&lt;/p&gt;
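
&lt;p&gt;What a human (or a policy standing in for one) checks at that gate can be sketched without touching the framework. Everything below is hypothetical: &lt;code&gt;ReviewTask&lt;/code&gt;, &lt;code&gt;PlanDecision&lt;/code&gt;, and &lt;code&gt;reviewPlan&lt;/code&gt; are illustrative names, not the open-multi-agent types or the actual &lt;code&gt;onPlanReady&lt;/code&gt; signature. The idea is only that the plan is plain data you can inspect before anything runs.&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration; these are not the open-multi-agent types.
type ReviewTask = { title: string; assignee: string }
type PlanDecision = { approved: boolean; reason: string }

// A simple review policy: approve only plans that stay within a task budget
// and assign every task to an agent the team actually declared.
function reviewPlan(plan: ReviewTask[], knownAgents: string[], maxTasks: number): PlanDecision {
  if (plan.length > maxTasks) {
    return { approved: false, reason: "plan has " + plan.length + " tasks, budget is " + maxTasks }
  }
  const stranger = plan.find((task) => knownAgents.indexOf(task.assignee) === -1)
  if (stranger !== undefined) {
    return { approved: false, reason: "unknown assignee: " + stranger.assignee }
  }
  return { approved: true, reason: "within budget, all assignees known" }
}

const decision = reviewPlan(
  [
    { title: "Research the topic", assignee: "researcher" },
    { title: "Draft the summary", assignee: "intern" },
  ],
  ["researcher", "writer"],
  5,
)
console.log(decision.approved, decision.reason)
```

&lt;p&gt;Logging each decision alongside the plan it covered is what turns this from a safety check into an audit record.&lt;/p&gt;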

&lt;p&gt;That is the piece that turns goal-first from a prototyping convenience into something you can defend in a production review. The ergonomics will keep improving, but the load-bearing primitives are available today.&lt;/p&gt;

&lt;h2&gt;
  
  
  Decision rubric
&lt;/h2&gt;

&lt;p&gt;Take this as a starting point, not gospel:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;Lean graph-first&lt;/th&gt;
&lt;th&gt;Lean goal-first&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pipeline shape varies per input&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit trail required&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prototyping a new product&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Team unfamiliar with LLM planning&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3+ agents with overlapping skills&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cycles or interrupts needed&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Goals expressed in user language&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token cost per run is the bottleneck&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If your signals split, build the first version graph-first. Graph-first is the safer starting place when you cannot tell which way the pendulum goes. You can always swap to goal-first later when you have learned enough to trust the planning. The reverse migration is harder because you have to invent the explicit topology you never wrote down.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd build first
&lt;/h2&gt;

&lt;p&gt;If you are starting a new TypeScript agent project today and have not picked a framework, I would advise this sequence:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build the first three weeks graph-first in &lt;strong&gt;Mastra workflows&lt;/strong&gt;. It is the cleanest TypeScript graph DSL right now and gets you to running code with the least surface area.&lt;/li&gt;
&lt;li&gt;When you find that you are spending most of your time editing the graph rather than the agent prompts, that is the signal your task shape is varying. Try the same task in &lt;strong&gt;open-multi-agent&lt;/strong&gt; with a goal sentence and see whether the coordinator's plan matches what you would have written.&lt;/li&gt;
&lt;li&gt;If the coordinator plan is consistently good, switch your prototype to goal-first. If it is consistently off, you are in graph-first territory and you have just saved yourself a six-month detour.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;None of this is provably right; it is what I would tell a friend. The frameworks are not enemies, and which paradigm fits your work matters more than which one wins a feature checklist. Pick the paradigm, and the framework choice mostly follows.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About open-multi-agent.&lt;/strong&gt; TypeScript-native multi-agent orchestration framework, MIT-licensed. Goal-first by design, with a coordinator pattern that decomposes a sentence-level goal into a parallel task DAG at runtime. Three runtime dependencies (&lt;code&gt;@anthropic-ai/sdk&lt;/code&gt;, &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;zod&lt;/code&gt;). The TypeScript-ecosystem answer to CrewAI's role/goal/crew pattern.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/open-multi-agent/open-multi-agent" rel="noopener noreferrer"&gt;https://github.com/open-multi-agent/open-multi-agent&lt;/a&gt;. Coordinator implementation: &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/src/orchestrator/orchestrator.ts" rel="noopener noreferrer"&gt;&lt;code&gt;src/orchestrator/orchestrator.ts&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related posts.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/jackchenme/5-walls-multi-agent-frameworks-hit-receipts-from-mastras-year-of-network-to-supervisor-3am3"&gt;5 Walls Multi-Agent Frameworks Hit&lt;/a&gt;: the empirical companion to this taxonomy. One TypeScript framework's year-long migration from LLM-driven routing to a supervisor tree, traced through its issue tracker, is the production evidence behind the goal-first / graph-first split.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>typescript</category>
      <category>ai</category>
    </item>
    <item>
      <title>5 walls multi-agent frameworks hit: receipts from Mastra's year of .network() to Supervisor migration</title>
      <dc:creator>JackChen</dc:creator>
      <pubDate>Thu, 21 May 2026 16:28:50 +0000</pubDate>
      <link>https://dev.to/jackchenme/5-walls-multi-agent-frameworks-hit-receipts-from-mastras-year-of-network-to-supervisor-3am3</link>
      <guid>https://dev.to/jackchenme/5-walls-multi-agent-frameworks-hit-receipts-from-mastras-year-of-network-to-supervisor-3am3</guid>
      <description>&lt;p&gt;Multi-agent in TypeScript is engineering-hard. Context propagation between agents, routing quality across providers, observability inside LLM-driven decisions, nesting depth, performance under concurrency: each of these has bitten Mastra over the past year, with public GitHub issues to prove it.&lt;/p&gt;

&lt;p&gt;This post pulls 5 engineering walls out of Mastra's year-long migration from &lt;code&gt;.network()&lt;/code&gt; to the Supervisor pattern. I searched the Mastra GitHub repo for &lt;code&gt;AgentNetwork&lt;/code&gt;, &lt;code&gt;multi-agent&lt;/code&gt;, &lt;code&gt;supervisor&lt;/code&gt;, and &lt;code&gt;network&lt;/code&gt;, got 32 relevant issues spanning May 2025 to May 2026, and cite 18 representative ones below.&lt;/p&gt;

&lt;p&gt;Why Mastra specifically? Because they are the most public case study. On April 9, 2026, Mastra raised a $22M Series A led by Spark Capital, bringing total funding to $35M. Same day, they launched Mastra Platform. I read the Series A post, the Platform announcement, and the pricing page end to end: the exact phrase "multi-agent" appears zero times across all three. They still mention &lt;code&gt;subagents&lt;/code&gt; once in the Series A post, so multi-agent coordination has not vanished as a capability. But "multi-agent" as a &lt;em&gt;positioning word&lt;/em&gt; is gone.&lt;/p&gt;

&lt;p&gt;This is a shift. Nine months earlier, in July 2025, Mastra published "Beyond Workflows: Introducing Agent Network" and positioned automatic LLM-driven multi-agent routing as a step beyond workflows. Nine months later, the Series A narrative is Studio + Server + Memory Gateway. It is "agent infrastructure platform." It is "framework gives you primitives, platform gives you tools to run at scale."&lt;/p&gt;

&lt;p&gt;What happened in between? The 32 issues are far more honest than any blog post. They cover Mastra's full arc: every iteration, every transition, every shift in positioning.&lt;/p&gt;

&lt;p&gt;The five walls are below. Each one Mastra hit, with issue receipts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context: everyone was racing on multi-agent
&lt;/h2&gt;

&lt;p&gt;To understand the weight of this shift, look at the field. Through 2024 and 2025, LangGraph, CrewAI, AutoGen, and Mastra all pushed multi-agent as a core narrative. The Microsoft AutoGen paper kept emphasizing "multiple agents collaborating outperform a single agent." LangChain promoted LangGraph to a top-line product. CrewAI grew to tens of thousands of stars in a year.&lt;/p&gt;

&lt;p&gt;In the TypeScript world, Mastra was the standard bearer. Founded October 2024 by Gatsby co-founder Sam Bhagwat and team. YC W25. $13M seed round announced October 2025, from 100+ investors (the post headline says "120+ others"), including YC, Paul Graham, Guillermo Rauch, Amjad Masad, and Balaji Srinivasan. Three founders from a framework used by hundreds of thousands of developers.&lt;/p&gt;

&lt;p&gt;They had every advantage: TS ecosystem, Gatsby pedigree, YC, top-tier VCs, marquee angels, a $22M Series A, and customers including Replit, Brex, Sanity, Factorial, Indeed, Marsh McLennan, MongoDB, Workday, and Salesforce.&lt;/p&gt;

&lt;p&gt;And they still moved &lt;code&gt;.network()&lt;/code&gt; out of the multi-agent headline.&lt;/p&gt;

&lt;h2&gt;
  
  
  The full timeline of Mastra's multi-agent narrative
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Position of "multi-agent" in their story&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Oct 2024&lt;/td&gt;
&lt;td&gt;Team formed&lt;/td&gt;
&lt;td&gt;None. Pitch was "TS framework for the next million AI developers"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;H1 2025&lt;/td&gt;
&lt;td&gt;AgentNetwork v1 (experimental)&lt;/td&gt;
&lt;td&gt;Present, but they later admitted v1 was "pretty whack"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Jul 3, 2025&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Blog: "Beyond Workflows: Introducing Agent Network (vNext)"&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Peak&lt;/strong&gt;. Original wording: "intelligent AI orchestration that automatically routes and executes complex multi-agent tasks without predetermined workflows"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aug 26, 2025&lt;/td&gt;
&lt;td&gt;Blog: "Improved agent orchestration with AI SDK v5"&lt;/td&gt;
&lt;td&gt;The "orchestration" in the title quietly downgrades to single-agent tool orchestration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Oct 8, 2025&lt;/td&gt;
&lt;td&gt;$13M seed round announced&lt;/td&gt;
&lt;td&gt;Not a core funding narrative&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Oct 10, 2025&lt;/td&gt;
&lt;td&gt;Blog: "The evolution of AgentNetwork." &lt;code&gt;.network()&lt;/code&gt; API consolidates&lt;/td&gt;
&lt;td&gt;Multi-agent still featured, but the API is simplifying&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Nov 2025&lt;/td&gt;
&lt;td&gt;v1 Beta&lt;/td&gt;
&lt;td&gt;Still mentions &lt;code&gt;.network()&lt;/code&gt; for agent networks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jan 2026&lt;/td&gt;
&lt;td&gt;v1.0 stable&lt;/td&gt;
&lt;td&gt;Multi-agent no longer a top-line feature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Feb 26, 2026&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Supervisor pattern launches as the first-class primitive for multi-agent orchestration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;.network()&lt;/code&gt; later marked deprecated in the migration guide&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Apr 9, 2026&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Mastra Platform + Series A $22M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The exact phrase "multi-agent" appears 0 times across the Series A, Platform, and Pricing pages. &lt;code&gt;subagents&lt;/code&gt; still mentioned once&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;May 19, 2026&lt;/td&gt;
&lt;td&gt;"Introducing A2A support"&lt;/td&gt;
&lt;td&gt;Cross-framework interop protocol. Multi-agent capability continues, but now framed as agent-to-agent interop rather than internal orchestration&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Multi-agent as the primary headline ran from July 2025 to November 2025, roughly 4 to 5 months. The first quiet downgrade came as early as August 2025, followed by an API migration to Supervisor in February 2026 and a vocabulary switch in the Series A by April 2026.&lt;/p&gt;

&lt;p&gt;Nine months of repositioning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Blogs are written for press. Issues are real.
&lt;/h2&gt;

&lt;p&gt;Announcement posts have communications teams. GitHub issues do not.&lt;/p&gt;

&lt;p&gt;The Mastra repo currently has about 24.1k stars, 2.1k forks, and 200+ open issues (checked 2026-05-21). Of the 32 issues matching my search, this post cites 18 representative ones. Five themes show up repeatedly. These are not feature requests. They are not typos or doc errors. They are the actual hard problems of running multi-agent systems in production.&lt;/p&gt;

&lt;p&gt;Mastra spent a year on them and chose to migrate to a structurally simpler design (Supervisor tree) that sidesteps some, but not all, of them.&lt;/p&gt;

&lt;p&gt;Here are the five walls.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wall 1: Memory, context propagation, and persistence between agents
&lt;/h2&gt;

&lt;p&gt;This is the deepest wall, and the one Mastra hit longest.&lt;/p&gt;

&lt;p&gt;Issue #11468, titled simply "Agent Network," was filed December 29, 2025 from their Discord. The original text:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Using agent.network() I found something that when an orchestrating agent decides which secondary agent to call, the message history is not transferred to the secondary agent, making it difficult for it to understand the context for action. Please, can you help me with this? I haven't found in any documentation how to pass the memory to this flow in the final agent."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Translated to product language: &lt;strong&gt;the coordinator decides who to call, but the agent being called does not know why it is being called.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This problem persisted in Mastra's tracker for at least six months. Issue #5381 ("Memory for Networks?") was filed June 23, 2025. Adjacent memory/storage/persistence issues continued after the Supervisor migration, including #15336 ("LibSQL Storage/Memory Error with supervisor agent and sub agents") and #14583 ("Supervisor/Subagent Persistence Duplication"). These are not strictly the same "message history not propagated" bug as #11468, but they share a root: state management between a coordinator and its sub-agents is hard, in multiple ways.&lt;/p&gt;

&lt;p&gt;The real engineering hardness: you cannot dump the entire conversation history into every sub-agent (token explosion, privacy, signal-to-noise), but you cannot leave them blind either (they need task context). The tradeoff is an open problem, not a few-months problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wall 2: Routing quality and prompt fragility
&lt;/h2&gt;

&lt;p&gt;Automatic LLM-driven routing depends on prompt robustness across models. Cross-provider, the same routing prompt behaves very differently.&lt;/p&gt;

&lt;p&gt;The receipts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;#9873&lt;/strong&gt; (2025-11-07) "Network Agents does not forward the request to sub agents inside the network." Routing literally does not work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#12468&lt;/strong&gt; (2026-01-29) "Agent Network Routing Latency." Slow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#12955&lt;/strong&gt; (2026-02-11) "The sub agents are returning empty output inside network." Sub-agents return empty&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#13621&lt;/strong&gt; (2026-02-28) "Agent Network routing prompt has trailing whitespace, causing failures with Bedrock-backed Claude models." &lt;strong&gt;A trailing whitespace in a routing prompt breaks the entire routing chain on Bedrock Claude.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The last one is the most diagnostic. A trailing whitespace, undebuggable across providers. This is not a user mistake. This is the brittleness of LLM-driven routing as a paradigm. Switch providers and your routing behavior may need a full re-tune.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wall 3: AgentNetwork routing observability gap
&lt;/h2&gt;

&lt;p&gt;LLMs do the routing inside AgentNetwork. Users couldn't see why.&lt;/p&gt;

&lt;p&gt;Issue &lt;strong&gt;#12277&lt;/strong&gt; (2026-01-24) "Missing Observability for Routing and Validation LLM Calls in Agent Networks" pointed this out directly. The scope is narrow: it's specifically about tracing for AgentNetwork's internal routing and validation LLM calls, not framework-wide observability. By that date, &lt;code&gt;.network()&lt;/code&gt; had been live for roughly three months. Production users of &lt;code&gt;.network()&lt;/code&gt; had been flying blind on the routing layer that whole time.&lt;/p&gt;

&lt;p&gt;Observability for AgentNetwork's routing and validation had to be a day-one design decision. Adding it months in means months of users hitting "why did the coordinator pick this agent" without an answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wall 4: Three-level nesting already breaks
&lt;/h2&gt;

&lt;p&gt;Issue &lt;strong&gt;#15013&lt;/strong&gt; (2026-04-03) "3-level sub-agent delegation: no progressive streaming to client."&lt;/p&gt;

&lt;p&gt;Three levels of sub-agent delegation is enough to break streaming.&lt;/p&gt;

&lt;p&gt;This matters because multi-agent frameworks that aspire to be an "agent OS" or "agent operating system" need to support deep organizational structures. Mastra cracks at three levels of sub-agent delegation. I haven't found public benchmarks for any framework at four levels of agent delegation, and Mastra hasn't disclosed the topology depth of their customer workloads (Brex, Indeed, Marsh McLennan), so I can't claim anything about that ceiling. What I can say is: three-level streaming broke for at least one user of Mastra, and the "deep agent organization" pitch deserves a higher evidence bar than it usually gets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wall 5: Performance collapse
&lt;/h2&gt;

&lt;p&gt;Issue &lt;strong&gt;#15478&lt;/strong&gt; (2026-04-17, closed 2026-05-20), "[RFC] Agent Performance Optimization (Slow Responses)."&lt;/p&gt;

&lt;p&gt;This is an RFC, not a bug. Mastra opened a public RFC acknowledging slow agent responses had reached the level of a systemic issue. The RFC was closed on May 20, 2026, the day before this post, after a maintainer commented that it had been taken care of.&lt;/p&gt;

&lt;p&gt;The diagnosis comes via #15677 (2026-04-23):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"ObservationalMemoryProcessor.processInputStep blocks every agentic loop step with DB reads and token counting even when far below thresholds."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Translated: &lt;strong&gt;every agent loop iteration triggers a database read and token counting.&lt;/strong&gt; Tolerable on a single agent. Catastrophic when amplified across a supervisor with multiple concurrent sub-agents.&lt;/p&gt;

&lt;p&gt;The hidden cost of multi-agent is consistently underestimated. Each agent is one LLM call. Each call needs context handling, plus observability, tracing, token counting, memory I/O. Mastra is exposing the real cost of these "lightweight" operations once they are stacked on multi-agent topologies.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happened after migrating to Supervisor
&lt;/h2&gt;

&lt;p&gt;On February 26, 2026, Mastra officially launched the Supervisor pattern. The changelog described it as "a first-class supervisor pattern, exposed through the same primitives you already use, &lt;code&gt;stream()&lt;/code&gt; and &lt;code&gt;generate()&lt;/code&gt;." &lt;code&gt;.network()&lt;/code&gt; was later marked deprecated in the migration guide: "will be removed in a future release. While existing code will continue to work until then, no new features will be added to it."&lt;/p&gt;

&lt;p&gt;The logic of the migration: shift from LLM-driven routing to manually configured sub-agents in a tree. Simpler structure, more predictable decisions, fewer bug surfaces.&lt;/p&gt;

&lt;p&gt;The issue data tells a different story.&lt;/p&gt;

&lt;p&gt;From March to May 2026, Supervisor-related issues clustered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;#14583&lt;/strong&gt; (2026-03-23) Supervisor/Subagent persistence duplication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#14723&lt;/strong&gt; (2026-03-26) Supervisor and sub-agent interactions stored as Supervisor-and-User interactions (history pollution)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#14820&lt;/strong&gt; (2026-03-29) No way to abort sub-agent execution in supervisor mode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#15013&lt;/strong&gt; (2026-04-03) Three-level sub-agent delegation streaming broken&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#15336&lt;/strong&gt; (2026-04-14) LibSQL storage + sub-agent throws&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#15436&lt;/strong&gt; (2026-04-16) No control over sub-agent tool results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#15734&lt;/strong&gt; (2026-04-24) Suspend/resume breaks when sub-agent owns a workflow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#15887&lt;/strong&gt; (2026-04-28) Sub-agent calls serialized under approval mode (concurrency dies)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#16422&lt;/strong&gt; (2026-05-11) &lt;code&gt;transformAgent&lt;/code&gt; drops sub-agent tool input streaming chunks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#15478&lt;/strong&gt; (performance RFC, closed 2026-05-20)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Supervisor is not the destination. It simplified some problems (no more LLM auto-routing), but the core multi-agent challenges (context propagation, persistence consistency, nesting depth, concurrency control, streaming) did not disappear. They got new issue numbers and resurfaced.&lt;/p&gt;

&lt;p&gt;"Simplified design" is the story told to the community and to investors. The engineering reality is that they are still patching.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this means
&lt;/h2&gt;

&lt;p&gt;Three takeaways from a year of Mastra's public behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One. The migration is not a concept failure. It is an engineering hardness.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Multi-agent as a concept is validated in research and product. LangChain, AutoGen, CrewAI are all doing it. The gap is between "concept works" and "production-stable." Crossing it took Mastra a year, dozens of issues, one major API rewrite, and a vocabulary switch in the Series A. This is not a "pick it up and ship" direction. It is a real-engineering domain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Two. Multi-agent depth beyond two levels remains a hard, undersolved problem.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Mastra's 3-level streaming bug (#15013) suggests this isn't a Mastra-only ceiling, but I can't speak for frameworks I haven't tested. What I can say from the receipts: prompt robustness, context propagation, streaming, token accounting, and error recovery each barely held at two levels in Mastra's case and got shaky at three, and I haven't found reliable public demos at four levels for any framework. If you have counterexamples, I'd genuinely like to see them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Three. "Agent OS" as a buzzword does not match engineering reality.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An operating system implies stability, predictability, and deep process nesting. A system that breaks streaming at three levels of delegation, needs an RFC for performance, and took six months to figure out context propagation is at best a framework. Calling it an OS is writing a check that current technology cannot cash.&lt;/p&gt;

&lt;p&gt;Mastra clearly knows this. Their funding announcements do not use "OS." They use "Platform." They use "framework." They use "infrastructure." These are bounded words.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I am sitting
&lt;/h2&gt;

&lt;p&gt;I have been working on an open-source TypeScript multi-agent framework since April 2026. We call it open-multi-agent. The repo lives at github.com/open-multi-agent/open-multi-agent.&lt;/p&gt;

&lt;p&gt;Watching Mastra's migration over the past year did not convince me the road is dead. It convinced me of something else: &lt;strong&gt;Mastra chose to migrate &lt;code&gt;.network()&lt;/code&gt; into Supervisor and shift the headline vocabulary. We are choosing to walk directly into these five walls and call them out by name.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Where we are today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Wall 1 (context propagation)&lt;/strong&gt;: &lt;code&gt;SharedMemory&lt;/code&gt; as a namespaced key-value store, injected into prompts as markdown summaries, plus &lt;code&gt;MessageBus&lt;/code&gt; for point-to-point and broadcast messages between agents. Different mechanism from Mastra's "pass message history." We sidestep the problem rather than solving it directly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wall 2 (routing)&lt;/strong&gt;: Coordinator decisions are constrained by Zod schema with one automatic retry on validation failure. Local model fallback parses raw JSON and markdown-fenced JSON output formats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wall 3 (observability)&lt;/strong&gt;: Built-in post-run task DAG dashboard (pure HTML render, no I/O dependency). Every team run renders the task DAG, assignee per task, status, timing, and token usage. Day-one design&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wall 4 (nesting)&lt;/strong&gt;: &lt;code&gt;maxDelegationDepth&lt;/code&gt; cap, plus cycle detection (target already in &lt;code&gt;delegationChain&lt;/code&gt; is rejected) and agent pool deadlock detection (rejected when &lt;code&gt;availableRunSlots &amp;lt; 1&lt;/code&gt;). Three guards from day one&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wall 5 (performance)&lt;/strong&gt;: Three runtime dependencies only (&lt;code&gt;@anthropic-ai/sdk&lt;/code&gt;, &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;zod&lt;/code&gt;). &lt;code&gt;SharedMemory&lt;/code&gt; is in-process by default with no per-step DB I/O. Three-layer Semaphore concurrency control (agent pool, per-agent, tool execution)&lt;/li&gt;
&lt;/ul&gt;
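
&lt;p&gt;To make the three nesting guards from Wall 4 concrete, here is a minimal sketch of how such checks could compose. It is not the framework's implementation: the &lt;code&gt;checkDelegation&lt;/code&gt; function, the request and result shapes, and the choice to treat the chain length as the current depth are assumptions for illustration. Only the names &lt;code&gt;maxDelegationDepth&lt;/code&gt;, &lt;code&gt;delegationChain&lt;/code&gt;, and &lt;code&gt;availableRunSlots&lt;/code&gt; come from the list above.&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration; not the framework's actual types.
type DelegationRequest = {
  target: string            // agent being delegated to
  delegationChain: string[] // agents already on the current delegation path
  availableRunSlots: number // free slots left in the agent pool
}

type GuardResult = { ok: true } | { ok: false; reason: string }

function checkDelegation(req: DelegationRequest, maxDelegationDepth: number): GuardResult {
  // Guard 1: depth cap. ASSUMPTION: chain length stands in for nesting depth.
  if (req.delegationChain.length >= maxDelegationDepth) {
    return { ok: false, reason: 'max delegation depth reached' }
  }
  // Guard 2: cycle detection. A target already on the path is rejected.
  if (req.delegationChain.includes(req.target)) {
    return { ok: false, reason: 'cycle: target already in delegation chain' }
  }
  // Guard 3: pool deadlock. With no free slot, the child could never start
  // while its parent holds a slot waiting for it.
  if (1 > req.availableRunSlots) {
    return { ok: false, reason: 'no available run slots' }
  }
  return { ok: true }
}
```

&lt;p&gt;Returning a reason string instead of throwing is one possible design: it lets the delegating agent see why the request was refused and continue on its own.&lt;/p&gt;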

&lt;h2&gt;
  
  
  What we have not solved
&lt;/h2&gt;

&lt;p&gt;None of the five walls is "we figured it out with a clever design." This is industry-hard. Not a single-team intelligence problem.&lt;/p&gt;

&lt;p&gt;I should be explicit about where we are short:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nesting depth&lt;/strong&gt;: &lt;code&gt;maxDelegationDepth&lt;/code&gt; defaults to 3. &lt;strong&gt;That is exactly the depth at which Mastra cracked.&lt;/strong&gt; We have not done serious engineering tests beyond four. Open problem for us too&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: We have not systematically load-tested 100 tasks × 10 agents. Mastra hit performance walls in customer production, and our sample size is not comparable yet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context propagation&lt;/strong&gt;: &lt;code&gt;SharedMemory&lt;/code&gt; + &lt;code&gt;MessageBus&lt;/code&gt; is directionally correct, but the policy of "what a sub-agent sees by default" is still iterating. We have not reproduced every Mastra failure case&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-provider robustness&lt;/strong&gt;: We run basic routing-consistency tests across providers. Edge cases like "trailing whitespace breaks Bedrock Claude" have not been systematically swept&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post is not announcing that we solved what Mastra did not.&lt;/p&gt;

&lt;p&gt;It is an invitation. Multi-agent is a real direction with real engineering value and real engineering difficulty. We are continuing to push on it. If you also believe this is worth doing, come help.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to get involved
&lt;/h2&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/open-multi-agent/open-multi-agent" rel="noopener noreferrer"&gt;github.com/open-multi-agent/open-multi-agent&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;PRs welcome. Counterarguments welcome in issues. Failure cases that break our claims especially welcome.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frequently asked questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Did Mastra abandon multi-agent?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. Mastra continues to support multi-agent coordination through the Supervisor pattern (launched February 26, 2026), the &lt;code&gt;subagents&lt;/code&gt; primitive, and the A2A cross-framework interop protocol (May 19, 2026). What changed is the headline vocabulary: their Series A and Platform announcements (April 9, 2026) no longer use "multi-agent" as a positioning word. The capability stayed; the API and the marketing both shifted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If I'm starting a multi-agent project in TypeScript today, what should I use?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It depends on what you need. Mastra's Supervisor pattern is backed by a well-funded company with enterprise customers (Replit, Brex, MongoDB, Workday, Salesforce, Indeed), and it fits well when you want manually configured sub-agents with predictable execution. If you want LLM-driven goal-to-DAG decomposition (one goal in, an auto-generated task DAG executed across multiple agents in dependency order, rather than manually configuring a supervisor tree), &lt;a href="https://github.com/open-multi-agent/open-multi-agent" rel="noopener noreferrer"&gt;open-multi-agent&lt;/a&gt; takes that approach with 3 runtime dependencies and a Coordinator pattern. LangGraph is also worth a look if you don't mind Python.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is open-multi-agent and how does it differ from Mastra?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;open-multi-agent is an open-source TypeScript framework for multi-agent orchestration, launched April 2026 (&lt;a href="https://github.com/open-multi-agent/open-multi-agent" rel="noopener noreferrer"&gt;github.com/open-multi-agent/open-multi-agent&lt;/a&gt;). The core difference: open-multi-agent uses a Coordinator that decomposes a goal into a task DAG via LLM, then executes tasks in dependency order across multiple agents in parallel, whereas Mastra's current Supervisor pattern uses manually configured sub-agents with the supervisor delegating at runtime. Other design choices include 3 runtime dependencies (&lt;code&gt;@anthropic-ai/sdk&lt;/code&gt;, &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;zod&lt;/code&gt;), in-process &lt;code&gt;SharedMemory&lt;/code&gt; with no per-step DB I/O by default, and built-in delegation depth caps with cycle detection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does open-multi-agent solve the 5 walls Mastra hit?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Honestly, partially. We have explicit design choices for each wall (SharedMemory + MessageBus for context, Zod-constrained Coordinator decisions for routing, post-run task DAG dashboard for observability, depth caps + cycle detection for nesting, in-process state for performance). But we have not load-tested at 100 tasks × 10 agents, our default &lt;code&gt;maxDelegationDepth&lt;/code&gt; is 3 (exactly where Mastra cracked), and cross-provider routing edge cases like "trailing whitespace breaks Bedrock Claude" have not been systematically swept. This is an open invitation, not a solved problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Mastra posts and announcements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/blog/series-a" rel="noopener noreferrer"&gt;Mastra Series A (2026-04-09)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/blog/announcing-mastra-platform" rel="noopener noreferrer"&gt;Announcing Mastra Platform (2026-04-09)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/blog/vnext-agent-network" rel="noopener noreferrer"&gt;Beyond Workflows: Introducing Agent Network vNext (2025-07-03)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/blog/announcing-mastra-improved-agent-orchestration-ai-sdk-v5-support" rel="noopener noreferrer"&gt;Announcing improved agent orchestration with AI SDK v5 (2025-08-26)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/blog/agent-network" rel="noopener noreferrer"&gt;The evolution of AgentNetwork (2025-10-10)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/blog/changelog-2026-02-26" rel="noopener noreferrer"&gt;Mastra Changelog (Supervisor pattern launch, 2026-02-26)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/guides/migrations/network-to-supervisor" rel="noopener noreferrer"&gt;Network to Supervisor migration guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/blog/introducing-agent-to-agent-support" rel="noopener noreferrer"&gt;Introducing A2A support (2026-05-19)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/blog/seed-round" rel="noopener noreferrer"&gt;Seed round announcement (2025-10-08)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/pricing" rel="noopener noreferrer"&gt;Mastra pricing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;GitHub issues (in order of appearance):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context propagation: &lt;a href="https://github.com/mastra-ai/mastra/issues/11468" rel="noopener noreferrer"&gt;#11468&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/5381" rel="noopener noreferrer"&gt;#5381&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/15336" rel="noopener noreferrer"&gt;#15336&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/14583" rel="noopener noreferrer"&gt;#14583&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Routing quality: &lt;a href="https://github.com/mastra-ai/mastra/issues/9873" rel="noopener noreferrer"&gt;#9873&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/12468" rel="noopener noreferrer"&gt;#12468&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/12955" rel="noopener noreferrer"&gt;#12955&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/13621" rel="noopener noreferrer"&gt;#13621&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Observability: &lt;a href="https://github.com/mastra-ai/mastra/issues/12277" rel="noopener noreferrer"&gt;#12277&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Nesting: &lt;a href="https://github.com/mastra-ai/mastra/issues/15013" rel="noopener noreferrer"&gt;#15013&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Performance: &lt;a href="https://github.com/mastra-ai/mastra/issues/15478" rel="noopener noreferrer"&gt;#15478&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/15677" rel="noopener noreferrer"&gt;#15677&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Supervisor era: &lt;a href="https://github.com/mastra-ai/mastra/issues/14723" rel="noopener noreferrer"&gt;#14723&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/14820" rel="noopener noreferrer"&gt;#14820&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/15436" rel="noopener noreferrer"&gt;#15436&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/15734" rel="noopener noreferrer"&gt;#15734&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/15887" rel="noopener noreferrer"&gt;#15887&lt;/a&gt; &lt;a href="https://github.com/mastra-ai/mastra/issues/16422" rel="noopener noreferrer"&gt;#16422&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;I work on open-multi-agent, an open-source TypeScript multi-agent framework. Comments, counterarguments, and failure cases welcome at the repo.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mastra</category>
      <category>typescript</category>
      <category>agents</category>
      <category>opensource</category>
    </item>
    <item>
      <title>How to Run a Mixed-Model AI Agent Team in TypeScript?</title>
      <dc:creator>JackChen</dc:creator>
      <pubDate>Sat, 16 May 2026 08:05:41 +0000</pubDate>
      <link>https://dev.to/jackchenme/how-to-run-a-mixed-model-ai-agent-team-in-typescript-1569</link>
      <guid>https://dev.to/jackchenme/how-to-run-a-mixed-model-ai-agent-team-in-typescript-1569</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;A practical walkthrough that takes you from a single-model team baseline to a mixed-provider production setup with live cost and latency monitoring, using &lt;a href="https://github.com/open-multi-agent/open-multi-agent" rel="noopener noreferrer"&gt;open-multi-agent&lt;/a&gt;, the TypeScript-ecosystem answer to CrewAI.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you have ever priced out a multi-agent system that runs on a single frontier model, you already know the trap. You wire up three agents (plan, build, review), and the monthly bill can hit hundreds of dollars at a modest cadence (100 runs/day) and climb past four figures at production volume, because the architect, the developer, and the reviewer are all eating Opus tokens to argue about a one-line bug.&lt;/p&gt;

&lt;p&gt;Most TypeScript agent frameworks assume one provider, one model. You can swap the model, but only globally. That single knob makes the cost-quality tradeoff harder than it needs to be. The frontier models are right for a small fraction of the team's turns. The rest is wasted spend.&lt;/p&gt;

&lt;p&gt;This post shows the alternative. Three agents, three different model tiers, one &lt;code&gt;runTeam()&lt;/code&gt; call. Architect on Claude Opus 4.7, developer on a cheaper hosted OpenAI model, reviewer on a local model running through Ollama. The mix is configured per agent in &lt;code&gt;AgentConfig&lt;/code&gt;, no provider lock-in, no glue code. By the end you will have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A concrete way to combine the existing OMA examples for multi-model teams, local models, and cost-tiered execution.&lt;/li&gt;
&lt;li&gt;A cost ledger that gets you per-agent dollar numbers from a single &lt;code&gt;onProgress&lt;/code&gt; callback.&lt;/li&gt;
&lt;li&gt;A current pricing table for the providers used here, snapshot 2026-05-16, so you can do back-of-the-envelope monthly forecasts before you commit.&lt;/li&gt;
&lt;li&gt;An honest list of what mixed-model teams cost you (because they do cost something).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The repository examples this post connects are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/basics/multi-model-team.ts" rel="noopener noreferrer"&gt;&lt;code&gt;examples/basics/multi-model-team.ts&lt;/code&gt;&lt;/a&gt;: different hosted models per agent.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/providers/ollama.ts" rel="noopener noreferrer"&gt;&lt;code&gt;examples/providers/ollama.ts&lt;/code&gt;&lt;/a&gt;: Claude plus a local Ollama reviewer through an OpenAI-compatible &lt;code&gt;baseURL&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/patterns/cost-tiered-pipeline.ts" rel="noopener noreferrer"&gt;&lt;code&gt;examples/patterns/cost-tiered-pipeline.ts&lt;/code&gt;&lt;/a&gt;: token usage and cost comparison across model tiers.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/providers/gemini.ts" rel="noopener noreferrer"&gt;&lt;code&gt;examples/providers/gemini.ts&lt;/code&gt;&lt;/a&gt;: Pro/Flash tiering inside one provider.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is no separate companion repo for this post. The point is the pattern: per-agent model assignment is already a first-class field on &lt;code&gt;AgentConfig&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "mixed-model" actually means here
&lt;/h2&gt;

&lt;p&gt;A few definitions before we start, because the term is overloaded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single-model team.&lt;/strong&gt; Every agent runs on the same provider and the same model. Easiest to reason about, easiest to debug, most expensive when the model is a frontier one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-provider team.&lt;/strong&gt; Different agents run on different providers, but each agent's model is still picked at design time, not at runtime. This is what we mean by "mixed-model" in this post.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamic model routing.&lt;/strong&gt; The framework decides which model to use per turn based on cost, latency, or quality signals. Powerful, but adds a layer of indirection and a new failure mode. Out of scope here. Static per-agent assignment gets you 80% of the value with 5% of the complexity.&lt;/p&gt;

&lt;p&gt;The framework primitive we lean on is &lt;code&gt;AgentConfig&lt;/code&gt;. open-multi-agent's &lt;code&gt;AgentConfig&lt;/code&gt; carries &lt;code&gt;provider&lt;/code&gt;, &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;baseURL&lt;/code&gt;, and &lt;code&gt;apiKey&lt;/code&gt; on each agent. The orchestrator instantiates the right LLM adapter (Anthropic, OpenAI, Gemini, Grok, Bedrock, Copilot, and OpenAI-compatible local servers via &lt;code&gt;baseURL&lt;/code&gt;) lazily, so you only install the SDKs you actually use.&lt;/p&gt;
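
&lt;p&gt;One way to picture the lazy part is a factory table plus a cache: nothing is constructed for a provider until some agent asks for it. The sketch below is an assumption about the general shape, not the framework's code. &lt;code&gt;resolveAdapter&lt;/code&gt;, &lt;code&gt;AgentConfigLike&lt;/code&gt;, and the factory table are hypothetical names, and the factories only record that they ran where a real adapter would load its SDK.&lt;/p&gt;

```typescript
// Hypothetical types; the real AgentConfig has more fields than shown here.
type AgentConfigLike = {
  name: string
  provider: string
  model: string
  baseURL?: string
  apiKey?: string
}
type Adapter = { provider: string; baseURL?: string }
type AdapterFactory = (cfg: AgentConfigLike) => Adapter

// Records which factories actually ran, to make the laziness visible.
const created: string[] = []

const factories: { [provider: string]: AdapterFactory } = {
  anthropic: (cfg) => {
    created.push('anthropic')
    return { provider: 'anthropic', baseURL: cfg.baseURL }
  },
  openai: (cfg) => {
    created.push('openai')
    return { provider: 'openai', baseURL: cfg.baseURL }
  },
}

const cache: { [key: string]: Adapter } = {}

function resolveAdapter(cfg: AgentConfigLike): Adapter {
  // Same provider with a different baseURL (for example a local server)
  // gets its own adapter. A fuller version would also key on credentials.
  const key = `${cfg.provider}|${cfg.baseURL ?? ''}`
  const cached = cache[key]
  if (cached) return cached
  const factory = factories[cfg.provider]
  if (!factory) throw new Error(`unknown provider: ${cfg.provider}`)
  const adapter = factory(cfg)
  cache[key] = adapter
  return adapter
}
```

&lt;p&gt;In this sketch a team that never names &lt;code&gt;anthropic&lt;/code&gt; never runs the Anthropic factory, which is the property that makes installing only the SDKs you use workable.&lt;/p&gt;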

&lt;h2&gt;
  
  
  The four pieces, in order
&lt;/h2&gt;

&lt;p&gt;The OMA repo already has the building blocks. Read them in this order; each one isolates a different part of the mixed-model pattern.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: All-Opus baseline (the cost ceiling)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OpenMultiAgent&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@open-multi-agent/core&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@open-multi-agent/core&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;architect&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;architect&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;anthropic&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You are a senior software architect...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;developer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;architect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;developer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviewer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;architect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reviewer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenMultiAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;defaultModel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;defaultProvider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;anthropic&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;team&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createTeam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;design-build-review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;design-build-review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;architect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;developer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reviewer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;sharedMemory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;runTeam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;team&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Implement retryWithBackoff&amp;lt;T&amp;gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;totalTokenUsage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the entire team. Three agents, one model, one orchestrator. The coordinator that gets spawned by &lt;code&gt;runTeam()&lt;/code&gt; decomposes the goal into tasks, assigns each task to an agent, runs them with dependency-aware parallelism, and returns a &lt;code&gt;TeamRunResult&lt;/code&gt; with per-agent token usage in &lt;code&gt;agentResults&lt;/code&gt; and a total in &lt;code&gt;totalTokenUsage&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We use it as the cost ceiling. Everything that follows will be measured against this baseline.&lt;/p&gt;
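
&lt;p&gt;If "dependency-aware parallelism" sounds abstract, the sketch below shows one simple way to read it: group tasks into waves, where every task in a wave has all of its dependencies finished and could run in parallel with its neighbors. This illustrates the idea only. The &lt;code&gt;Task&lt;/code&gt; shape and &lt;code&gt;planWaves&lt;/code&gt; are hypothetical, and a real scheduler may start each task as soon as its own dependencies finish rather than waiting for a whole wave.&lt;/p&gt;

```typescript
// Hypothetical task shape for illustration.
type Task = { id: string; dependsOn: string[] }

function planWaves(tasks: Task[]): string[][] {
  const done: { [id: string]: boolean } = {}
  const waves: string[][] = []
  let remaining = tasks.slice()

  while (remaining.length > 0) {
    // A task is ready once every dependency has completed in an earlier wave.
    const ready = remaining.filter((t) => t.dependsOn.every((dep) => done[dep] === true))
    if (ready.length === 0) {
      // Nothing can make progress: a cycle or a dependency on an unknown task.
      throw new Error('cycle or missing dependency among: ' + remaining.map((t) => t.id).join(', '))
    }
    waves.push(ready.map((t) => t.id))
    for (const t of ready) done[t.id] = true
    remaining = remaining.filter((t) => done[t.id] !== true)
  }
  return waves
}

const waves = planWaves([
  { id: 'design', dependsOn: [] },
  { id: 'build', dependsOn: ['design'] },
  { id: 'review', dependsOn: ['build'] },
  { id: 'docs', dependsOn: ['design'] },
])
console.log(waves) // [['design'], ['build', 'docs'], ['review']]
```

&lt;p&gt;Here &lt;code&gt;build&lt;/code&gt; and &lt;code&gt;docs&lt;/code&gt; share a wave because both only need &lt;code&gt;design&lt;/code&gt;, while &lt;code&gt;review&lt;/code&gt; has to wait for &lt;code&gt;build&lt;/code&gt;.&lt;/p&gt;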

&lt;h3&gt;
  
  
  Step 2: Dual-model (Opus plans, OpenAI executes)
&lt;/h3&gt;

&lt;p&gt;The architect's job is to look at the goal once, decide what to build and what to skip, and lay out an API shape. It is called rarely, but each call is high leverage. Opus is right for that.&lt;/p&gt;

&lt;p&gt;The developer and the reviewer fire many times on smaller prompts. The reviewer in particular is mostly running short checklist passes. Spending Opus tokens on those is wasteful. A mini-tier OpenAI model, for example GPT-5.4 mini at $0.75 input and $4.50 output per million tokens as of 2026-05-16, is roughly 5.5x cheaper than Opus on the output side. You can absorb the quality delta on simpler turns.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;architect&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;architect&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;anthropic&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;developer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;developer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-5.4-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviewer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reviewer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-5.4-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same &lt;code&gt;createTeam()&lt;/code&gt;, same &lt;code&gt;runTeam()&lt;/code&gt;. The only thing that changes is &lt;code&gt;provider&lt;/code&gt; and &lt;code&gt;model&lt;/code&gt; on two of the three agents. The orchestrator handles the adapter switching internally; you do not write a single line of provider-specific code in your team setup.&lt;/p&gt;
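
&lt;p&gt;To turn the Step 2 pricing into a per-run number, a back-of-the-envelope helper like the one below is enough. The GPT-5.4 mini prices are the ones quoted above. Everything else is an assumption for illustration: the &lt;code&gt;costPerAgent&lt;/code&gt; helper, the row shape, and especially the token counts, which are invented placeholders rather than measurements. The architect is left out because this helper needs both an input and an output price for each model.&lt;/p&gt;

```typescript
// USD per million tokens.
type Price = { inputPerMTok: number; outputPerMTok: number }
// Hypothetical usage row; not the framework's result type.
type AgentUsage = { agent: string; model: string; inputTokens: number; outputTokens: number }

function costPerAgent(
  rows: AgentUsage[],
  prices: { [model: string]: Price },
): { [agent: string]: number } {
  const ledger: { [agent: string]: number } = {}
  for (const row of rows) {
    const price = prices[row.model]
    if (!price) throw new Error('no price for model: ' + row.model)
    const usd =
      (row.inputTokens * price.inputPerMTok + row.outputTokens * price.outputPerMTok) / 1e6
    // Accumulate, so an agent that runs several tasks gets one total.
    ledger[row.agent] = (ledger[row.agent] ?? 0) + usd
  }
  return ledger
}

// Prices from the post (snapshot 2026-05-16).
const prices = { 'gpt-5.4-mini': { inputPerMTok: 0.75, outputPerMTok: 4.5 } }

// ASSUMPTION: placeholder token counts for a single run.
const perRun = costPerAgent(
  [
    { agent: 'developer', model: 'gpt-5.4-mini', inputTokens: 200000, outputTokens: 50000 },
    { agent: 'reviewer', model: 'gpt-5.4-mini', inputTokens: 40000, outputTokens: 10000 },
  ],
  prices,
)
console.log(perRun)
```

&lt;p&gt;With these placeholder counts the developer comes to about $0.375 per run and the reviewer to about $0.075. Multiply by your own run cadence to get a monthly figure before you commit.&lt;/p&gt;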

&lt;h3&gt;
  
  
  Step 3: Triple-model with a local reviewer
&lt;/h3&gt;

&lt;p&gt;Now we push the reviewer onto a local model. The argument is the same as Step 2, pushed harder: the reviewer does style checks, edge-case prompts, and short-form QA. A local Ollama model is often good enough for that, and the marginal cloud cost is zero.&lt;/p&gt;

&lt;p&gt;The trick is that Ollama exposes an OpenAI-compatible endpoint, so you reuse the &lt;code&gt;openai&lt;/code&gt; adapter and override &lt;code&gt;baseURL&lt;/code&gt; and &lt;code&gt;apiKey&lt;/code&gt; on the agent:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviewer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reviewer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;llama3.1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:11434/v1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ollama&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A note that catches people: even when the local server ignores the key, the OpenAI SDK validates that it is non-empty. Pass any placeholder. &lt;code&gt;'ollama'&lt;/code&gt; works.&lt;/p&gt;

&lt;p&gt;Prerequisites for this step: &lt;code&gt;ollama pull llama3.1&lt;/code&gt; once, then &lt;code&gt;ollama serve&lt;/code&gt; in a separate terminal. Swap &lt;code&gt;llama3.1&lt;/code&gt; for Gemma, Qwen, or whatever local model you actually use. Same setup works with vLLM, LM Studio, llama-server, or any OpenAI-compatible inference server.&lt;/p&gt;
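
&lt;p&gt;Since only &lt;code&gt;baseURL&lt;/code&gt; and &lt;code&gt;model&lt;/code&gt; change between OpenAI-compatible servers, a small helper can keep the swap to one line. This is a sketch, not part of open-multi-agent: &lt;code&gt;LocalAgentConfig&lt;/code&gt; and &lt;code&gt;localAgent&lt;/code&gt; are hypothetical names that mirror the fields in the reviewer config above, and the LM Studio address is an assumed default that you should check against your own server.&lt;/p&gt;

```typescript
// Hypothetical local type mirroring the fields used in the reviewer config above.
// It is not imported from open-multi-agent; use the library's AgentConfig if it fits.
type LocalAgentConfig = {
  name: string
  provider: 'openai'
  model: string
  baseURL: string
  apiKey: string
  systemPrompt: string
}

// Build a config for any OpenAI-compatible local server.
// The apiKey is a placeholder: it only has to be non-empty.
function localAgent(
  name: string,
  model: string,
  systemPrompt: string,
  baseURL = 'http://localhost:11434/v1', // Ollama's OpenAI-compatible endpoint
): LocalAgentConfig {
  return { name, provider: 'openai', model, baseURL, apiKey: 'local-placeholder', systemPrompt }
}

const viaOllama = localAgent('reviewer', 'llama3.1', 'Review the code.')
// Assumed LM Studio address and a made-up model name; adjust both to your server.
const viaLmStudio = localAgent('reviewer', 'my-local-model', 'Review the code.', 'http://localhost:1234/v1')

console.log(viaOllama.baseURL, viaLmStudio.baseURL)
```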

&lt;h3&gt;
  
  
  Step 4: Live cost and latency monitoring
&lt;/h3&gt;

&lt;p&gt;The mixed-model team is now functionally complete. The problem is you do not yet know what it costs. In production you want to find out fast when an agent hot-loops on Opus and burns $50 in an afternoon.&lt;/p&gt;

&lt;p&gt;open-multi-agent's &lt;code&gt;OrchestratorConfig&lt;/code&gt; accepts an &lt;code&gt;onProgress&lt;/code&gt; callback that fires for &lt;code&gt;agent_start&lt;/code&gt;, &lt;code&gt;agent_complete&lt;/code&gt;, &lt;code&gt;task_start&lt;/code&gt;, &lt;code&gt;task_complete&lt;/code&gt;, &lt;code&gt;error&lt;/code&gt;, and a few others. We use the start and complete events to build a per-agent ledger.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ledger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;startedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;finishedAt&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;modelForAgent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;architect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;developer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reviewer&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;coordinator&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;unknown&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenMultiAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;defaultModel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;defaultProvider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;anthropic&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;onProgress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;agent_start&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;ledger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;startedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;modelForAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;agent_complete&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ledger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;finishedAt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After &lt;code&gt;runTeam()&lt;/code&gt; finishes, you walk &lt;code&gt;result.agentResults&lt;/code&gt;, pull &lt;code&gt;tokenUsage&lt;/code&gt; for each agent, look up the per-million pricing for its model, and compute a dollar number. The same ledger pattern used by the cost-tiered example produces output like this (representative shape; for a real recorded run with measured numbers, see the &lt;strong&gt;Update 2026-05-22&lt;/strong&gt; section near the end of this post):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agent       | model              | latency  | tokens in/out  | cost
------------+--------------------+----------+----------------+--------
coordinator | claude-opus-4-7    |    3.4s  |   1102/   612  | $0.0208
architect   | claude-opus-4-7    |    6.1s  |   1580/  1140  | $0.0364
developer   | gpt-5.4-mini       |    8.7s  |   2240/  2106  | $0.0112
reviewer    | llama3.1           |   12.0s  |   2680/   480  | $0.0000

Grand total: $0.0684 USD
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The reviewer is the slowest agent in the table because a local model on a consumer machine is usually slower than a hosted frontier model. That is the local-cost trade-off in concrete terms: zero marginal cloud dollars, more wall-clock seconds. Whether that is the right call depends on whether your team blocks on the reviewer or runs it asynchronously.&lt;/p&gt;
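
&lt;p&gt;The arithmetic behind each row is small enough to sketch. The block below assumes a hypothetical &lt;code&gt;UsageRow&lt;/code&gt; shape (the real fields on &lt;code&gt;result.agentResults&lt;/code&gt; may be named differently) and uses the per-million prices from this post's pricing snapshot with the token counts from the representative table above.&lt;/p&gt;

```typescript
// Per-million-token prices in USD, copied from this post's pricing snapshot.
type Price = { inputPerM: number; outputPerM: number }
const PRICES: { [model: string]: Price } = {
  'claude-opus-4-7': { inputPerM: 5.0, outputPerM: 25.0 },
  'gpt-5.4-mini': { inputPerM: 0.75, outputPerM: 4.5 },
  'llama3.1': { inputPerM: 0, outputPerM: 0 }, // local: zero marginal cloud cost
}

// Hypothetical row shape; adapt it to however your run result exposes token usage.
type UsageRow = { agent: string; model: string; inputTokens: number; outputTokens: number }

// Cost of one agent's usage in USD: tokens times price per million, divided by a million.
function costUSD(row: UsageRow): number {
  const price = PRICES[row.model]
  if (price === undefined) throw new Error(`no price for model ${row.model}`)
  return (row.inputTokens * price.inputPerM + row.outputTokens * price.outputPerM) / 1e6
}

// Token counts from the representative table above.
const rows: UsageRow[] = [
  { agent: 'coordinator', model: 'claude-opus-4-7', inputTokens: 1102, outputTokens: 612 },
  { agent: 'architect', model: 'claude-opus-4-7', inputTokens: 1580, outputTokens: 1140 },
  { agent: 'developer', model: 'gpt-5.4-mini', inputTokens: 2240, outputTokens: 2106 },
  { agent: 'reviewer', model: 'llama3.1', inputTokens: 2680, outputTokens: 480 },
]

let total = 0
for (const row of rows) {
  const cost = costUSD(row)
  total += cost
  console.log(`${row.agent.padEnd(11)} | ${row.model.padEnd(18)} | $${cost.toFixed(4)}`)
}
console.log(`Grand total: $${total.toFixed(4)} USD`)
```

&lt;p&gt;With these inputs the per-agent figures and the grand total should line up with the table above. Swap in your own token counts and prices to get your own ledger.&lt;/p&gt;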

&lt;h2&gt;
  
  
  Pricing snapshot (2026-05-16)
&lt;/h2&gt;

&lt;p&gt;These are the numbers used by the examples and estimates in this post. They are also the numbers you want for any cost back-of-the-envelope before you ship a mixed-model team to production. Verify on the official pages before you commit to a forecast.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input ($/1M)&lt;/th&gt;
&lt;th&gt;Output ($/1M)&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Opus 4.7&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;td&gt;$25.00&lt;/td&gt;
&lt;td&gt;&lt;a href="https://platform.claude.com/docs/en/about-claude/pricing" rel="noopener noreferrer"&gt;platform.claude.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Sonnet 4.6&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;td&gt;$15.00&lt;/td&gt;
&lt;td&gt;same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Haiku 4.5&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;td&gt;same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.5&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;td&gt;$30.00&lt;/td&gt;
&lt;td&gt;&lt;a href="https://openai.com/api/pricing/" rel="noopener noreferrer"&gt;openai.com/api/pricing&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.4&lt;/td&gt;
&lt;td&gt;$2.50&lt;/td&gt;
&lt;td&gt;$15.00&lt;/td&gt;
&lt;td&gt;same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.4 mini&lt;/td&gt;
&lt;td&gt;$0.75&lt;/td&gt;
&lt;td&gt;$4.50&lt;/td&gt;
&lt;td&gt;same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.5 Flash&lt;/td&gt;
&lt;td&gt;$0.30&lt;/td&gt;
&lt;td&gt;$2.50&lt;/td&gt;
&lt;td&gt;&lt;a href="https://ai.google.dev/gemini-api/docs/pricing" rel="noopener noreferrer"&gt;ai.google.dev/gemini-api/docs/pricing&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local model via Ollama&lt;/td&gt;
&lt;td&gt;$0 marginal&lt;/td&gt;
&lt;td&gt;$0 marginal&lt;/td&gt;
&lt;td&gt;electricity + amortized hardware&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A reasonable starting reading of the table: every hosted model listed charges 5x to roughly 8x more for output tokens than for input tokens. That is the asymmetry you want to design around. Push input-heavy agents (research, summarization, retrieval grounding) onto the cheaper models. Reserve the expensive models for agents that produce a lot of high-stakes output.&lt;/p&gt;

&lt;h2&gt;
  
  
  A worked cost comparison on a recurring workload
&lt;/h2&gt;

&lt;p&gt;Suppose your team runs the same three-agent task 100 times a day (real-world cadence for an automation that fires on inbound webhooks, scheduled batches, or per-customer pipelines). A representative run uses roughly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Coordinator: 1.1K input, 0.6K output tokens (Opus 4.7 in all variants)&lt;/li&gt;
&lt;li&gt;Architect: 1.6K input, 1.1K output&lt;/li&gt;
&lt;li&gt;Developer: 2.2K input, 2.1K output&lt;/li&gt;
&lt;li&gt;Reviewer: 2.7K input, 0.5K output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use this as a representative shape, not a benchmark. Your numbers will differ; the math below shows how to do it. If you want to measure your own workload, start with &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/patterns/cost-tiered-pipeline.ts" rel="noopener noreferrer"&gt;&lt;code&gt;examples/patterns/cost-tiered-pipeline.ts&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;All-Opus run (Step 1 baseline)&lt;/strong&gt; runs to about $0.15 per execution on the token shape above. At 100/day that is roughly $450/month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dual-model (Step 2): Opus on coordinator + architect, GPT-5.4 mini on developer + reviewer.&lt;/strong&gt; About $0.073 per run. Roughly $219/month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Triple-model (Step 3): Opus on coordinator + architect, GPT-5.4 mini on developer, local Ollama model on reviewer.&lt;/strong&gt; About $0.068 per run on the cloud side. Roughly $205/month, plus the local model's wall-clock overhead on a machine you already own.&lt;/p&gt;

&lt;p&gt;That is a little over 50% monthly savings against the all-Opus baseline on this token shape, with the quality drop contained to lower-risk roles. The savings shape changes by workload: research-heavy, summarization-heavy, or retrieval-grounded teams can gain more by pushing input-heavy roles to cheaper models. Reasoning-heavy teams gain less and may be wrong to mix at all.&lt;/p&gt;
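
&lt;p&gt;The three per-run figures can be re-derived in a few lines. This sketch assumes the exact token counts from the representative ledger (1102/612, 1580/1140, 2240/2106, 2680/480) rather than the rounded bullets, a 30-day month, and the prices from the snapshot table. All type and variable names are illustrative.&lt;/p&gt;

```typescript
type Price = { inputPerM: number; outputPerM: number }
const OPUS: Price = { inputPerM: 5.0, outputPerM: 25.0 }
const MINI: Price = { inputPerM: 0.75, outputPerM: 4.5 }
const LOCAL: Price = { inputPerM: 0, outputPerM: 0 }

type Role = 'coordinator' | 'architect' | 'developer' | 'reviewer'
type Usage = { input: number; output: number }
type Assignment = { [R in Role]: Price }

// Token counts per role, taken from the representative ledger earlier in the post.
const usage: { [R in Role]: Usage } = {
  coordinator: { input: 1102, output: 612 },
  architect: { input: 1580, output: 1140 },
  developer: { input: 2240, output: 2106 },
  reviewer: { input: 2680, output: 480 },
}
const roles: Role[] = ['coordinator', 'architect', 'developer', 'reviewer']

// Which price each role pays under each variant.
const allOpus: Assignment = { coordinator: OPUS, architect: OPUS, developer: OPUS, reviewer: OPUS }
const dualModel: Assignment = { coordinator: OPUS, architect: OPUS, developer: MINI, reviewer: MINI }
const tripleModel: Assignment = { coordinator: OPUS, architect: OPUS, developer: MINI, reviewer: LOCAL }

// Sum of (tokens x price per million) across all four roles, in USD.
function costPerRun(assignment: Assignment): number {
  let total = 0
  for (const role of roles) {
    const u = usage[role]
    const p = assignment[role]
    total += (u.input * p.inputPerM + u.output * p.outputPerM) / 1e6
  }
  return total
}

const RUNS_PER_MONTH = 100 * 30 // 100 runs a day, 30-day month (assumption)
const baseline = costPerRun(allOpus)
const variants = [
  { name: 'all-opus', assignment: allOpus },
  { name: 'dual-model', assignment: dualModel },
  { name: 'triple-model', assignment: tripleModel },
]
for (const v of variants) {
  const perRun = costPerRun(v.assignment)
  const savedPct = (1 - perRun / baseline) * 100
  console.log(
    `${v.name}: $${perRun.toFixed(3)}/run, $${(perRun * RUNS_PER_MONTH).toFixed(0)}/month, ${savedPct.toFixed(0)}% saved`,
  )
}
```

&lt;p&gt;Unrounded, the all-Opus baseline comes out near $439/month; the $450 quoted above comes from rounding the per-run cost up to $0.15 first.&lt;/p&gt;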

&lt;h2&gt;
  
  
  When mixed-model is the wrong call
&lt;/h2&gt;

&lt;p&gt;The post would be useless if it told you to always mix. Here is the honest list of when not to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single-agent tasks.&lt;/strong&gt; If your goal is small enough that one agent can finish it, do not split into a team just to mix models. The coordinator overhead and inter-agent context shuffling will dwarf the savings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;High-consistency tasks.&lt;/strong&gt; Models from different providers disagree on edge cases. If your output is graded against a strict rubric (legal review, medical triage, anything with a regulator-facing audit trail), the variance from mixing providers will bite you. Stay on one model and pay the price.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tight feedback loops with human reviewers.&lt;/strong&gt; When you are still iterating on prompts and the team's behavior is unstable, mixing models adds a degree of freedom you do not want. Get the team to converge on a single model first, then push pieces out to cheaper models once each role's prompt is stable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cheap models with cheap output.&lt;/strong&gt; If every agent in your team already fits a single Haiku or GPT-5.4 mini run, the absolute savings from mixing are pennies and the operational complexity is real. Mixed-model pays off when at least one role's monthly cost is meaningful enough to justify the variance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Provider failure handling that is not in place.&lt;/strong&gt; A mixed-model team has more failure surfaces than a single-model team. Three providers means three rate-limit ceilings, three auth flows, three latency profiles, and three ways your pipeline can stall on a 503. Decide your retry, fallback, and circuit-breaker strategy before you ship.&lt;/p&gt;
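
&lt;p&gt;A minimal sketch of the retry and fallback half of that strategy, independent of any framework. &lt;code&gt;makeProvider&lt;/code&gt; fakes a provider call so the example runs offline; &lt;code&gt;withRetry&lt;/code&gt; and &lt;code&gt;withFallback&lt;/code&gt; are hypothetical helpers, not open-multi-agent APIs, and a circuit breaker is left out for brevity.&lt;/p&gt;

```typescript
// Fake provider for illustration: fails with a 503-style error a fixed number
// of times, then succeeds. Replace it with a real agent or SDK call.
function makeProvider(name: string, failuresBeforeSuccess: number) {
  let calls = 0
  return async () => {
    calls += 1
    if (failuresBeforeSuccess >= calls) throw new Error(`503 from ${name}`)
    return `ok from ${name}`
  }
}

// Call is the type of a zero-argument async function that resolves to a string.
const sample = makeProvider('sample', 0)
type Call = typeof sample

// Retry with exponential backoff: baseDelayMs, then 2x, 4x, ... between attempts.
async function withRetry(call: Call, maxAttempts: number, baseDelayMs: number) {
  let lastError: unknown = new Error('maxAttempts must be at least 1')
  for (let attempt = 1; maxAttempts >= attempt; attempt += 1) {
    try {
      return await call()
    } catch (err) {
      lastError = err
      if (maxAttempts > attempt) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1)))
      }
    }
  }
  throw lastError
}

// Try the primary provider with retries; if it stays down, retry the fallback.
async function withFallback(primary: Call, fallback: Call) {
  try {
    return await withRetry(primary, 3, 10)
  } catch {
    return await withRetry(fallback, 3, 10)
  }
}

const flaky = makeProvider('primary', Number.POSITIVE_INFINITY) // never recovers
const backup = makeProvider('backup', 1) // fails once, then succeeds

withFallback(flaky, backup).then((result) => console.log(result))
```

&lt;p&gt;The short delays are only there to keep the demo fast. Real limits depend on each provider's rate-limit and latency profile.&lt;/p&gt;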

&lt;h2&gt;
  
  
  What the cookbook example looks like in mixed-model form
&lt;/h2&gt;

&lt;p&gt;open-multi-agent ships a &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/cookbook/personalized-interview-simulator.ts" rel="noopener noreferrer"&gt;personalized interview simulator cookbook&lt;/a&gt; that runs three agents on Claude Sonnet 4.6: an interviewer, an observer, and a reporter. It is a nice match for the mixed-model pattern.&lt;/p&gt;

&lt;p&gt;The interviewer does deep, candidate-specific question generation across many turns. That role earns Opus.&lt;/p&gt;

&lt;p&gt;The observer reads the transcript after each turn and writes 3-6 short flags. The role is short-output, repeatable, and structurally simple. Push it to a cheaper hosted model or even a local model.&lt;/p&gt;

&lt;p&gt;The reporter runs once at the end of the session against a strict Zod schema (&lt;code&gt;recommendation: 'strong-hire' | 'hire' | ...&lt;/code&gt;, plus structured arrays). Structured-output agents are sensitive to the underlying model's JSON adherence. Keep that on a frontier model.&lt;/p&gt;

&lt;p&gt;The migration is a &lt;code&gt;provider&lt;/code&gt; and &lt;code&gt;model&lt;/code&gt; edit in each of two &lt;code&gt;AgentConfig&lt;/code&gt; blocks: the interviewer and the observer. You do not touch the orchestration logic. You do not refactor the prompts. You read the schema and decide where the consistency requirements actually live.&lt;/p&gt;
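
&lt;p&gt;As a rough illustration of that split, the three configs might end up looking like the sketch below. It uses a hypothetical local &lt;code&gt;AgentConfigSketch&lt;/code&gt; type rather than the library's &lt;code&gt;AgentConfig&lt;/code&gt;, the prompts are elided, and the Sonnet model ID string is an assumption; copy the one the cookbook actually uses.&lt;/p&gt;

```typescript
// Hypothetical local type with the AgentConfig fields this post uses;
// the real cookbook imports AgentConfig from open-multi-agent.
type AgentConfigSketch = {
  name: string
  provider: string
  model: string
  systemPrompt: string
  baseURL?: string
  apiKey?: string
}

// Deep, multi-turn question generation: moved up to Opus.
const interviewer: AgentConfigSketch = {
  name: 'interviewer',
  provider: 'anthropic',
  model: 'claude-opus-4-7',
  systemPrompt: '...',
}

// Short, repeatable flags: moved down to a local model behind Ollama.
const observer: AgentConfigSketch = {
  name: 'observer',
  provider: 'openai',
  model: 'llama3.1',
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // placeholder; only has to be non-empty
  systemPrompt: '...',
}

// Strict schema output: left on the frontier model it already used.
const reporter: AgentConfigSketch = {
  name: 'reporter',
  provider: 'anthropic',
  model: 'claude-sonnet-4-6', // assumed ID string, not confirmed by the cookbook
  systemPrompt: '...',
}

const team = [interviewer, observer, reporter]
console.log(team.map((a) => `${a.name}: ${a.provider}/${a.model}`).join('\n'))
```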

&lt;h2&gt;
  
  
  Why this lives in the TypeScript ecosystem
&lt;/h2&gt;

&lt;p&gt;A small note on positioning since this is the question I get asked most.&lt;/p&gt;

&lt;p&gt;CrewAI established the team-of-agents shape that this post leans on: an agent has a role, agents form crews, a crew has a goal, and the framework orchestrates the goal into work. CrewAI is Python-only, and the TypeScript options for the same pattern have been thin until recently. open-multi-agent treats the TypeScript ecosystem as a first-class target: 100% TypeScript runtime, three runtime dependencies (&lt;code&gt;@anthropic-ai/sdk&lt;/code&gt;, &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;zod&lt;/code&gt;), and the same Goal → Result one-call surface (&lt;code&gt;runTeam&lt;/code&gt;) that you would get from CrewAI's &lt;code&gt;Crew.kickoff()&lt;/code&gt;. The mixed-model team is, by design, a first-class pattern rather than a custom adapter you write yourself.&lt;/p&gt;

&lt;p&gt;If you are coming from CrewAI and looking for the team-of-roles model in TypeScript, the examples above are the migration target.&lt;/p&gt;

&lt;h2&gt;
  
  
  Update 2026-05-22: Real run data + thinking-mode cost
&lt;/h2&gt;

&lt;p&gt;The original post (2026-05-16) uses Anthropic + OpenAI as the canonical example because that's the setup most TypeScript readers will recognize. After publishing, I ran the same &lt;code&gt;runTeam&lt;/code&gt;-shaped pipeline with &lt;strong&gt;DeepSeek + a local Qwen model&lt;/strong&gt; on an M1 16GB MacBook because those were the API credentials I had at hand. Same code path, different provider/model strings. That's the model-agnostic point of the post, applied honestly to my own constraints.&lt;/p&gt;

&lt;p&gt;This section adds (1) the actual recorded ledger so anyone reproducing can work from measured data, (2) a side experiment on thinking-mode cost using Qwen 3.5 9B with and without thinking, and (3) a known limitation of the OMA + OpenAI-compatible path for thinking-mode models.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setup for the real run
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;coordinator / default: DeepSeek &lt;code&gt;deepseek-chat&lt;/code&gt; (non-thinking, $0.14 / $0.28 per MTok)&lt;/li&gt;
&lt;li&gt;architect: DeepSeek &lt;code&gt;deepseek-reasoner&lt;/code&gt; (thinking mode, same pricing)&lt;/li&gt;
&lt;li&gt;developer: DeepSeek &lt;code&gt;deepseek-chat&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;reviewer: Ollama &lt;code&gt;qwen3:8b&lt;/code&gt; (local, accessed via Ollama's OpenAI-compatible endpoint)&lt;/li&gt;
&lt;li&gt;DAG: &lt;code&gt;runTasks(team, tasks)&lt;/code&gt; with explicit per-task &lt;code&gt;assignee&lt;/code&gt; (see "Why &lt;code&gt;runTasks&lt;/code&gt;" below)&lt;/li&gt;
&lt;li&gt;Hardware: MacBook M1, 16GB unified memory, Ollama 0.20.2&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Ledger 1: real run
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agent       | model              | latency  | tokens in/out  | cost
------------+--------------------+----------+----------------+--------
architect   | deepseek-reasoner  |    25.3s |   1612/  2450  | $0.0009
developer   | deepseek-chat      |    68.1s | 108219/ 10408  | $0.0181
reviewer    | qwen3:8b           |   208.5s |   1432/   696  | $0 (local)

Grand total: $0.0190 USD
Wall total : 5:03
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Per-run cost is $0.0190. At 100 runs/day that is roughly $57/month, against the original all-Opus baseline of $450/month (about 87% lower). That lands outside the post's "40-70% savings" range because the DeepSeek pricing floor is much lower than the OpenAI mini tier the original example used. Use whichever pricing matches your stack.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why &lt;code&gt;runTasks&lt;/code&gt; over &lt;code&gt;runTeam(goal)&lt;/code&gt; for mixed cloud + local
&lt;/h3&gt;

&lt;p&gt;I tried &lt;code&gt;runTeam(goal)&lt;/code&gt; three times before switching. Two failures worth recording, because they will hit anyone trying to pin a role to local inference:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Short goal (under 200 chars, no complexity keywords).&lt;/strong&gt; OMA v1.1.0 introduced "Skip Coordinator for Simple Goals" — when &lt;code&gt;isSimpleGoal(goal)&lt;/code&gt; returns true, the coordinator skips DAG decomposition entirely and routes the whole goal to the best-matching agent. The reviewer was never invoked.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-step goal that did trigger the coordinator.&lt;/strong&gt; The coordinator generated a DAG but folded the review work into the developer's task instead of dispatching to the reviewer agent. Even adding "the reviewer agent must independently audit" to the goal text didn't move the dispatch — the coordinator's routing decision wins over goal text hints.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For mixed cloud+local pipelines where you have intentionally pinned a role to local inference, this matters: a coordinator that optimizes your local agent away costs you both the architectural intent and the cost saving. &lt;code&gt;runTasks(team, tasks)&lt;/code&gt; is the path that guarantees per-agent dispatch with compile-time &lt;code&gt;assignee&lt;/code&gt; typing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;runTasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;team&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;design-api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;assignee&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;architect&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;implement&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="na"&gt;assignee&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;developer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;dependsOn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;design-api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="na"&gt;assignee&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reviewer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;dependsOn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;implement&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For goal-driven workloads where you genuinely want the coordinator to decide, &lt;code&gt;runTeam(goal)&lt;/code&gt; is still the right call. The two APIs are complementary; pick by whether the dispatch decision is yours or the framework's.&lt;/p&gt;
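
&lt;p&gt;One way to catch a silently skipped agent is to record which agents started and compare that against the roles you expected. The sketch below assumes only the two event fields the earlier &lt;code&gt;onProgress&lt;/code&gt; example reads (&lt;code&gt;type&lt;/code&gt; and &lt;code&gt;agent&lt;/code&gt;); &lt;code&gt;ProgressEvent&lt;/code&gt; and &lt;code&gt;missingAgents&lt;/code&gt; are hypothetical names, and the events are simulated so it runs without the framework.&lt;/p&gt;

```typescript
// Hypothetical event shape: just the two fields the onProgress example reads.
type ProgressEvent = { type: string; agent?: string }

// Names of agents that emitted agent_start during the run.
const started: string[] = []
const onProgress = (event: ProgressEvent) => {
  if (event.type === 'agent_start') {
    if (event.agent) started.push(event.agent)
  }
}

// Returns the expected agents that never started.
function missingAgents(expected: string[], ranAgents: string[]): string[] {
  const ran = new Set(ranAgents)
  return expected.filter((name) => !ran.has(name))
}

// Simulated run in which the review work was folded into the developer task,
// so the reviewer never receives a dispatch.
onProgress({ type: 'agent_start', agent: 'architect' })
onProgress({ type: 'agent_complete', agent: 'architect' })
onProgress({ type: 'agent_start', agent: 'developer' })
onProgress({ type: 'agent_complete', agent: 'developer' })

const missing = missingAgents(['architect', 'developer', 'reviewer'], started)
if (missing.length > 0) {
  console.warn(`agents never dispatched: ${missing.join(', ')}`)
}
```

&lt;p&gt;In a real pipeline you would pass the same callback as &lt;code&gt;onProgress&lt;/code&gt; and run the check after the call returns.&lt;/p&gt;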

&lt;h3&gt;
  
  
  Thinking-mode cost: same model, estimated 4-5x latency
&lt;/h3&gt;

&lt;p&gt;Side experiment. I ran the same DAG two more times, only swapping the reviewer model. First with &lt;code&gt;qwen3.5:9b-mlx&lt;/code&gt; (Qwen 3.5 9B, MLX-optimized for Apple Silicon, 8.9GB on disk) with thinking mode left at its default (on). Then with the same model + &lt;code&gt;/no_think&lt;/code&gt; appended to the reviewer task description (the prompt-level switch the Qwen 3 series ships with).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Ledger&lt;/th&gt;
&lt;th&gt;reviewer config&lt;/th&gt;
&lt;th&gt;reviewer latency&lt;/th&gt;
&lt;th&gt;reviewer in / out tokens&lt;/th&gt;
&lt;th&gt;wall total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;qwen3:8b&lt;/code&gt; (no thinking by design)&lt;/td&gt;
&lt;td&gt;208s&lt;/td&gt;
&lt;td&gt;1432 / 696&lt;/td&gt;
&lt;td&gt;5:03&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;qwen3.5:9b-mlx&lt;/code&gt; (thinking default ON)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1347s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;21014 / 1566&lt;/td&gt;
&lt;td&gt;24:30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;qwen3.5:9b-mlx&lt;/code&gt; (&lt;code&gt;/no_think&lt;/code&gt; workaround)&lt;/td&gt;
&lt;td&gt;554s&lt;/td&gt;
&lt;td&gt;38342 / 568&lt;/td&gt;
&lt;td&gt;10:44&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Ledger 2 vs Ledger 3 is the cleanest control: same model, same hardware, only thinking mode varies. Output tokens dropped 64% (1566 → 568) with &lt;code&gt;/no_think&lt;/code&gt;, and reviewer latency dropped 59% (1347s → 554s), despite Ledger 3's reviewer receiving 82% &lt;em&gt;more&lt;/em&gt; input (38K vs 21K) because the developer happened to emit more in that run. Normalizing for the input difference (about 64 ms of reviewer latency per input token with thinking on, versus about 14 ms with &lt;code&gt;/no_think&lt;/code&gt;), the pure thinking-mode cost on this M1 16GB is an estimated 4-5x reviewer latency.&lt;/p&gt;

&lt;p&gt;For short code-review-shaped tasks (~200-word output), thinking mode is overkill. Match the local model + thinking setting to the role, not to "biggest thing I can pull".&lt;/p&gt;
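
&lt;p&gt;The 4-5x estimate can be reproduced from the table with a crude per-input-token normalization. This is a sketch under a simplifying assumption (latency scaling roughly linearly with input size); the &lt;code&gt;ReviewerRun&lt;/code&gt; type and variable names are illustrative, and the numbers are the Ledger 2 and Ledger 3 reviewer rows.&lt;/p&gt;

```typescript
type ReviewerRun = { label: string; latencySec: number; inputTokens: number; outputTokens: number }

// Reviewer rows from Ledger 2 (thinking on) and Ledger 3 (/no_think).
const thinkingOn: ReviewerRun = { label: 'thinking on', latencySec: 1347, inputTokens: 21014, outputTokens: 1566 }
const noThink: ReviewerRun = { label: '/no_think', latencySec: 554, inputTokens: 38342, outputTokens: 568 }

// Milliseconds of reviewer latency per input token.
// Assumes latency grows roughly linearly with input size, which is a simplification.
const msPerInputToken = (run: ReviewerRun) => (run.latencySec * 1000) / run.inputTokens

// How much slower the thinking run is once input size is factored out.
const ratio = msPerInputToken(thinkingOn) / msPerInputToken(noThink)

for (const run of [thinkingOn, noThink]) {
  console.log(`${run.label.padEnd(12)}: ${msPerInputToken(run).toFixed(1)} ms per input token`)
}
console.log(`normalized slowdown: ${ratio.toFixed(1)}x`)
```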

&lt;h3&gt;
  
  
  Caveat: OMA + OpenAI-compatible can't fully kill Qwen 3.5 thinking
&lt;/h3&gt;

&lt;p&gt;I looked at three ways to disable thinking from OMA's side. One failed, one is untested but likely fails for the same reason, and one partially works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ollama's native &lt;code&gt;think: false&lt;/code&gt; field via the OpenAI-compatible endpoint.&lt;/strong&gt; Did not work. Adding &lt;code&gt;"think": false&lt;/code&gt; to a request body sent to &lt;code&gt;localhost:11434/v1/chat/completions&lt;/code&gt; kept the model thinking for 60+ seconds, with &lt;code&gt;reasoning&lt;/code&gt; length still 758 chars. The OpenAI Chat Completions schema doesn't define this field, and Ollama's compatibility layer drops it silently.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OMA's &lt;code&gt;AgentConfig.extraBody&lt;/code&gt;&lt;/strong&gt; is designed exactly for passing provider-specific params like GPT-5.5's &lt;code&gt;reasoning_effort: 'xhigh'&lt;/code&gt;. It hits the same OpenAI-compatible endpoint, so the same outcome is likely (I did not test it separately; if you do, please report back).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;/no_think&lt;/code&gt; placed cleanly at the end of the user message / task description.&lt;/strong&gt; Partially works. &lt;code&gt;reasoning&lt;/code&gt; length drops, output tokens drop, latency drops (Ledger 3 above). Putting &lt;code&gt;/no_think&lt;/code&gt; in &lt;code&gt;systemPrompt&lt;/code&gt; with extra reinforcement ("skip the &amp;lt;think&amp;gt; block entirely") made it strictly worse: the model pushed 1337 chars into &lt;code&gt;reasoning&lt;/code&gt; and produced empty &lt;code&gt;content&lt;/code&gt; because the &lt;code&gt;max_tokens&lt;/code&gt; budget was consumed by the reasoning phase.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The fully-off path goes through Ollama's native &lt;code&gt;/api/chat&lt;/code&gt; endpoint with &lt;code&gt;"think": false&lt;/code&gt;. On the same review prompt, that endpoint completes in &lt;strong&gt;6.7 seconds&lt;/strong&gt;. Treat that as the best case I measured for this model + hardware on this prompt. OMA's OpenAI-compatible path leaves measurable performance on the table for thinking-mode models. If you have found a way to pass Ollama's &lt;code&gt;think&lt;/code&gt; field through an OpenAI-compatible client without bypassing the framework, please open an issue on the OMA repo. I would like to be wrong about this limitation.&lt;/p&gt;
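
&lt;p&gt;For reference, a direct call to the native endpoint looks roughly like this. It bypasses OMA entirely, so it is a measurement aid rather than a fix. The sketch assumes Ollama on its default port, Node 18+ for the global &lt;code&gt;fetch&lt;/code&gt;, and a native chat response that carries the reply in &lt;code&gt;message.content&lt;/code&gt;; &lt;code&gt;buildNativeChatBody&lt;/code&gt; and &lt;code&gt;reviewWithoutThinking&lt;/code&gt; are hypothetical helper names.&lt;/p&gt;

```typescript
// Build the JSON body for Ollama's native chat endpoint with thinking disabled.
function buildNativeChatBody(model: string, prompt: string) {
  return {
    model,
    messages: [{ role: 'user', content: prompt }],
    think: false, // the field that had no effect via the OpenAI-compatible endpoint
    stream: false, // ask for a single JSON response instead of streamed chunks
  }
}

// Calls the native endpoint directly, bypassing the framework.
// Assumes Ollama is listening on its default port (11434).
async function reviewWithoutThinking(model: string, prompt: string) {
  const response = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildNativeChatBody(model, prompt)),
  })
  if (!response.ok) throw new Error(`Ollama returned HTTP ${response.status}`)
  // Assumed response shape: only the field this sketch reads.
  const data = (await response.json()) as { message: { content: string } }
  return data.message.content
}

// Printing the body needs no running server; call reviewWithoutThinking(...)
// yourself once Ollama is up and the model has been pulled.
console.log(JSON.stringify(buildNativeChatBody('qwen3.5:9b-mlx', 'Review this diff.'), null, 2))
```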

&lt;h3&gt;
  
  
  What this changes about the post above
&lt;/h3&gt;

&lt;p&gt;Nothing about the architectural argument changes — per-agent model assignment through &lt;code&gt;AgentConfig&lt;/code&gt; is still the lever, &lt;code&gt;runTasks&lt;/code&gt; vs &lt;code&gt;runTeam(goal)&lt;/code&gt; is still the dispatch choice, and the cost-tiered framing is still the design pattern. The cost numbers in the original worked example use a representative Anthropic + OpenAI shape; the real-run ledger here is a different provider mix with a different cost floor. Both are valid reference points, and being able to swap between them without rewriting orchestration code is exactly what the post argues for.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up: what to take from here
&lt;/h2&gt;

&lt;p&gt;Mixed-model agent teams are not a clever trick. They are the right default once your team grows beyond two agents and the workload starts running on a real cadence. The savings can be material, often 40-70% against an all-frontier baseline depending on token shape. The operational cost is real (more failure modes, more variance). The design choice that matters most is which agent gets the expensive model.&lt;/p&gt;

&lt;p&gt;Three takeaways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Per-agent model assignment is a design lever, not an optimization.&lt;/strong&gt; Decide it when you decide the team. Retrofitting it later means rewriting prompts that have already drifted to match the wrong model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start with two providers, then add local.&lt;/strong&gt; Step 2 captures most of the savings with two API keys and zero infrastructure. Step 3 is incremental and depends on whether you can spare the local-model latency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;onProgress&lt;/code&gt; is the cheapest insurance you can buy.&lt;/strong&gt; Twenty lines of TypeScript turn token counts into dollar numbers per run. Without it, mixed-model teams silently regress and you find out from the bill.&lt;/li&gt;
&lt;/ol&gt;
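&lt;p&gt;To make the third takeaway concrete, here is a rough sketch of the arithmetic behind a per-agent ledger. It takes plain usage records as input rather than &lt;code&gt;onProgress&lt;/code&gt; events, because the event shape is not reproduced here; mapping events to these records is left to the caller. The type names, model names, and prices are all placeholders invented for the example, not real price-sheet values.&lt;/p&gt;

```typescript
// Sketch only: turn per-agent token counts into a dollar figure per run.
// Model names and prices below are placeholders, not real price-sheet values.
type Usage = { agent: string; model: string; inputTokens: number; outputTokens: number }
type Price = { inputPerMTok: number; outputPerMTok: number } // USD per million tokens

const PRICES: { [model: string]: Price } = {
  'frontier-model': { inputPerMTok: 3, outputPerMTok: 15 },
  'budget-model': { inputPerMTok: 0.25, outputPerMTok: 1 },
}

// Cost of a single usage record in dollars.
function costOf(u: Usage, prices: { [model: string]: Price }): number {
  const p = prices[u.model]
  if (!p) throw new Error(`No price configured for model ${u.model}`)
  return (u.inputTokens * p.inputPerMTok + u.outputTokens * p.outputPerMTok) / 1_000_000
}

// Aggregate records into a per-agent breakdown plus a run total.
function ledger(usages: Usage[], prices = PRICES) {
  const perAgent: { [agent: string]: number } = {}
  let total = 0
  for (const u of usages) {
    const cost = costOf(u, prices)
    perAgent[u.agent] = (perAgent[u.agent] ?? 0) + cost
    total += cost
  }
  return { perAgent, total }
}
```

&lt;p&gt;Throwing on an unknown model is deliberate: a renamed model should fail loudly instead of silently counting as zero dollars.&lt;/p&gt;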

&lt;p&gt;Start with the existing repo examples: &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/basics/multi-model-team.ts" rel="noopener noreferrer"&gt;&lt;code&gt;multi-model-team&lt;/code&gt;&lt;/a&gt;, &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/providers/ollama.ts" rel="noopener noreferrer"&gt;&lt;code&gt;providers/ollama&lt;/code&gt;&lt;/a&gt;, and &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/patterns/cost-tiered-pipeline.ts" rel="noopener noreferrer"&gt;&lt;code&gt;cost-tiered-pipeline&lt;/code&gt;&lt;/a&gt;. Run the one closest to your workload, then add the per-agent ledger before you scale it up. If you push this pattern to production, I would like to hear what your real cost shape looks like. The reasonable model split is probably different from the examples, and the right answer is workload-specific.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About open-multi-agent.&lt;/strong&gt; TypeScript-native multi-agent orchestration framework, MIT-licensed. Goal → Result in one &lt;code&gt;runTeam()&lt;/code&gt; call. Three runtime dependencies. Repo: &lt;a href="https://github.com/open-multi-agent/open-multi-agent" rel="noopener noreferrer"&gt;https://github.com/open-multi-agent/open-multi-agent&lt;/a&gt;. The framework treats the TypeScript ecosystem as a first-class target rather than a secondary port from Python.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Edits and corrections.&lt;/strong&gt; If a price has moved since 2026-05-16 or a model has been renamed, please open an issue against the OMA repo and I will refresh the constants in the examples.&lt;/p&gt;

</description>
      <category>typescript</category>
      <category>agents</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Adding Multi-Agent Orchestration to a Vercel AI SDK App</title>
      <dc:creator>JackChen</dc:creator>
      <pubDate>Wed, 15 Apr 2026 17:20:14 +0000</pubDate>
      <link>https://dev.to/jackchenme/adding-multi-agent-orchestration-to-a-vercel-ai-sdk-app-4536</link>
      <guid>https://dev.to/jackchenme/adding-multi-agent-orchestration-to-a-vercel-ai-sdk-app-4536</guid>
      <description>&lt;p&gt;I hit a wall recently. I had a working AI SDK app -- &lt;code&gt;streamText&lt;/code&gt;, &lt;code&gt;useChat&lt;/code&gt;, the whole thing -- and then I needed it to do something that a single agent can't: research a topic with one agent, then hand that research to a second agent for writing.&lt;/p&gt;

&lt;p&gt;You can do this manually. Glue two &lt;code&gt;generateText&lt;/code&gt; calls together, pass context around, handle the error cases. But once you want a coordinator that figures out which tasks to run in what order, or three agents sharing state, you're writing orchestration infrastructure. I didn't want to write orchestration infrastructure.&lt;/p&gt;

&lt;p&gt;So I wired &lt;a href="https://github.com/open-multi-agent/open-multi-agent" rel="noopener noreferrer"&gt;open-multi-agent&lt;/a&gt; (OMA) into a Next.js API route next to the AI SDK, and the two libraries turned out to work well together. This is how.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where each library sits
&lt;/h2&gt;

&lt;p&gt;AI SDK and OMA do different jobs. They don't overlap much.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Vercel AI SDK&lt;/th&gt;
&lt;th&gt;open-multi-agent&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;What it is&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LLM call layer + streaming UI&lt;/td&gt;
&lt;td&gt;Multi-agent orchestration framework&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Core strength&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unified API for 60+ providers, &lt;code&gt;useChat&lt;/code&gt;, &lt;code&gt;streamText&lt;/code&gt;, structured outputs&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;runTeam()&lt;/code&gt; -- auto task decomposition, parallel execution, shared memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single agent with tool loop (&lt;code&gt;ToolLoopAgent&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Team of agents with coordinator pattern&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Streaming&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;First-class (&lt;code&gt;toUIMessageStreamResponse&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Not streaming-native (batch results)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ecosystem&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;23,400+ GitHub stars, 10M+ weekly downloads&lt;/td&gt;
&lt;td&gt;5,700+ GitHub stars, 3 runtime deps&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;AI SDK talks to models and streams tokens. OMA sits above that: given a goal and a roster of agents, it breaks the goal into tasks, runs them in dependency order, and collects the results. The two can share the same API route.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we're building
&lt;/h2&gt;

&lt;p&gt;A Next.js chat app. User types a topic, two agents collaborate on a researched article, the result streams back through &lt;code&gt;useChat&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser (useChat)
    |
    v
POST /api/chat
    |
    +-- Phase 1: OMA runTeam()
    |     coordinator decomposes goal
    |     -&amp;gt; researcher agent gathers info
    |     -&amp;gt; writer agent drafts article
    |     (shared memory passes context between agents)
    |
    +-- Phase 2: AI SDK streamText()
    |     streams the team's output to the browser
    |
    v
useChat renders streamed response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Phase 1: OMA runs the team. A coordinator agent (created automatically by &lt;code&gt;runTeam&lt;/code&gt;) analyzes the goal, produces a task plan, and executes it. The researcher's output lands in shared memory so the writer can reference it.&lt;/p&gt;

&lt;p&gt;Phase 2: the coordinator's final output gets piped into AI SDK's &lt;code&gt;streamText&lt;/code&gt;, which streams it to the browser through &lt;code&gt;useChat&lt;/code&gt;. This is the bridge between OMA's batch output and AI SDK's streaming protocol.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Project setup
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;with-vercel-ai-sdk &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;with-vercel-ai-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;package.json&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"private"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"dev"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"next dev"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"build"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"next build"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@ai-sdk/openai-compatible"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^2.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@ai-sdk/react"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^3.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@open-multi-agent/open-multi-agent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^1.1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^6.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"next"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^16.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"react"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^19.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"react-dom"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^19.0.0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We're using &lt;code&gt;@ai-sdk/openai-compatible&lt;/code&gt; here because the demo points at DeepSeek. If you use Anthropic or OpenAI directly, swap in their provider package instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: The backend
&lt;/h2&gt;

&lt;p&gt;One API route, two phases. The interesting part is how little glue code the integration needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;app/api/chat/route.ts&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;streamText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;convertToModelMessages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;UIMessage&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createOpenAICompatible&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@ai-sdk/openai-compatible&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OpenMultiAgent&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@open-multi-agent/open-multi-agent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@open-multi-agent/open-multi-agent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;maxDuration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;

&lt;span class="c1"&gt;// --- Provider setup (swap this for your preferred LLM) ---&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;BASE_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.deepseek.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;deepseek-chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createOpenAICompatible&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;deepseek&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;BASE_URL&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/v1`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DEEPSEEK_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// --- Agent definitions ---&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;researcher&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BASE_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DEEPSEEK_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`You are a research specialist. Given a topic, provide thorough,
factual research with key findings, relevant data points, and important context.
Be concise but comprehensive. Output structured notes, not prose.`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;maxTurns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;writer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BASE_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DEEPSEEK_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`You are an expert writer. Using research from team members
(available in shared memory), write a well-structured, engaging article
with clear headings and concise paragraphs.`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;maxTurns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OMA's &lt;code&gt;provider: 'openai'&lt;/code&gt; means "use the OpenAI-compatible chat completions API." It works with DeepSeek, Ollama, Together, or anything that speaks that protocol.&lt;/p&gt;

&lt;p&gt;Now the request handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;extractText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;UIMessage&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;parts&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;UIMessage&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lastText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extractText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;at&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="c1"&gt;// --- Phase 1: OMA multi-agent orchestration ---&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenMultiAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;defaultModel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;defaultProvider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;defaultBaseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BASE_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;defaultApiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DEEPSEEK_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;team&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createTeam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;research-writing&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;research-writing&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;sharedMemory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;teamResult&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;runTeam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;team&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;`Research and write an article about: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lastText&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;teamOutput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="nx"&gt;teamResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agentResults&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;coordinator&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;

  &lt;span class="c1"&gt;// --- Phase 2: Stream result via Vercel AI SDK ---&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;streamText&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`You are presenting research from a multi-agent team.
The team has already done the work. Relay their output faithfully
in a well-formatted way.

## Team Output
&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;teamOutput&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;convertToModelMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toUIMessageStreamResponse&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What &lt;code&gt;runTeam()&lt;/code&gt; does internally:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;strong&gt;coordinator&lt;/strong&gt; agent receives the goal plus the agent roster&lt;/li&gt;
&lt;li&gt;It produces a JSON task plan -- tasks, assignments, dependency edges&lt;/li&gt;
&lt;li&gt;OMA's &lt;code&gt;TaskQueue&lt;/code&gt; topologically sorts the plan. Independent tasks run in parallel; dependent tasks wait.&lt;/li&gt;
&lt;li&gt;Each agent writes its output to &lt;code&gt;SharedMemory&lt;/code&gt;, so the writer can see what the researcher found&lt;/li&gt;
&lt;li&gt;The coordinator synthesizes everything into a final output&lt;/li&gt;
&lt;/ol&gt;
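&lt;p&gt;Step 3 is the easiest one to picture in code. The sketch below groups a task plan into waves, where every task in a wave has all of its dependencies satisfied by earlier waves and can therefore run in parallel. The &lt;code&gt;Task&lt;/code&gt; shape and the &lt;code&gt;planWaves&lt;/code&gt; function are hypothetical and written for this illustration; OMA's actual &lt;code&gt;TaskQueue&lt;/code&gt; may be structured differently.&lt;/p&gt;

```typescript
// Sketch only: group a task plan into waves that can run in parallel.
// The Task shape is hypothetical and is not OMA's TaskQueue type.
type Task = { id: string; dependsOn: string[] }

function planWaves(tasks: Task[]): string[][] {
  const done = new Set() // ids of tasks scheduled in earlier waves
  let pending = tasks.slice()
  const waves: string[][] = []

  while (pending.length !== 0) {
    // A task is ready once every dependency has already been scheduled.
    const ready = pending.filter((t) => t.dependsOn.every((d) => done.has(d)))
    if (ready.length === 0) {
      throw new Error('Cycle or missing dependency in task plan')
    }
    waves.push(ready.map((t) => t.id))
    for (const t of ready) done.add(t.id)
    pending = pending.filter((t) => !done.has(t.id))
  }
  return waves
}
```

&lt;p&gt;For a plan where a draft depends on two independent research tasks, this yields one wave with both research tasks followed by a wave with the draft. A real scheduler would typically start a task as soon as its own dependencies finish and not wait for the whole wave, but the ordering constraint is the same.&lt;/p&gt;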

&lt;p&gt;You define agents and a goal. The coordinator decides the task graph.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: The frontend
&lt;/h2&gt;

&lt;p&gt;AI SDK v6's &lt;code&gt;useChat&lt;/code&gt; handles streaming. A few things changed from v3 that tripped me up: there's no built-in &lt;code&gt;handleSubmit&lt;/code&gt; or &lt;code&gt;input&lt;/code&gt; state anymore, and messages use &lt;code&gt;parts&lt;/code&gt; instead of a &lt;code&gt;content&lt;/code&gt; string. The &lt;code&gt;isLoading&lt;/code&gt; boolean is gone too -- replaced by a &lt;code&gt;status&lt;/code&gt; field with four states (&lt;code&gt;'ready'&lt;/code&gt;, &lt;code&gt;'submitted'&lt;/code&gt;, &lt;code&gt;'streaming'&lt;/code&gt;, &lt;code&gt;'error'&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;app/page.tsx&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;use client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useChat&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@ai-sdk/react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;Home&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sendMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useChat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setInput&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;submitted&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;streaming&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handleSubmit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;React&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;FormEvent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;preventDefault&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;
    &lt;span class="nf"&gt;setInput&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sendMessage&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;main&lt;/span&gt; &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;maxWidth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;720&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0 auto&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;32px 16px&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;h1&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Research Team&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;h1&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;marginBottom&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;strong&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Research Team&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;strong&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;whiteSpace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pre-wrap&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;parts&lt;/span&gt;
              &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
                  &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;

      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;submitted&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Agents are collaborating -- this may take a minute...&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;

      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt; &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;red&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Error: &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;

      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;form&lt;/span&gt; &lt;span class="na"&gt;onSubmit&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;handleSubmit&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;display&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;gap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;input&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="na"&gt;onChange&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setInput&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="na"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Enter a topic to research..."&lt;/span&gt;
          &lt;span class="na"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;flex&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;10px 14px&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"submit"&lt;/span&gt; &lt;span class="na"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          Send
        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;form&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;main&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Run it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DEEPSEEK_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-...
npm run dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt; and try a topic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp8l0e2hcuhl1xoqia0i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp8l0e2hcuhl1xoqia0i.png" alt="Entering a topic in the chat UI -- agents are collaborating" width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The OMA orchestration phase takes 30-60 seconds (coordinator planning + two agents running sequentially), then the streaming phase kicks in and you get the article token by token.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa41u2pzrruxshf0bxsgy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa41u2pzrruxshf0bxsgy.png" alt="The streamed article output produced by the researcher and writer agents" width="800" height="409"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwkzo5gk6c6ufbtz30zg0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwkzo5gk6c6ufbtz30zg0.png" alt="The streamed article output produced by the researcher and writer agents" width="800" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One gotcha: &lt;code&gt;@ai-sdk/openai&lt;/code&gt; v2 defaults to OpenAI's new Responses API (&lt;code&gt;/responses&lt;/code&gt; endpoint). If your provider doesn't support it (most don't yet), use &lt;code&gt;@ai-sdk/openai-compatible&lt;/code&gt; instead, or call &lt;code&gt;provider.chat('model-name')&lt;/code&gt; explicitly rather than &lt;code&gt;provider('model-name')&lt;/code&gt;. Burned about 20 minutes on this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Under the hood
&lt;/h2&gt;

&lt;p&gt;The full request lifecycle:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;useChat&lt;/code&gt; POSTs to &lt;code&gt;/api/chat&lt;/code&gt; with the message history&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;runTeam()&lt;/code&gt; starts. Coordinator agent receives the goal.&lt;/li&gt;
&lt;li&gt;Coordinator produces a task plan via LLM call (JSON with tasks, assignments, dependencies)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;TaskQueue&lt;/code&gt; topologically sorts the tasks&lt;/li&gt;
&lt;li&gt;Researcher agent runs, output goes to &lt;code&gt;SharedMemory&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Writer agent runs (reads researcher's output from shared memory), produces the article&lt;/li&gt;
&lt;li&gt;Coordinator synthesizes the final output&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;streamText()&lt;/code&gt; takes that output and streams it through AI SDK's wire protocol&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;useChat&lt;/code&gt; renders the tokens in the browser&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Steps 2-7 happen inside &lt;code&gt;runTeam()&lt;/code&gt;. That's where OMA earns its keep -- you declare agents and a goal, and it handles decomposition, ordering, and state passing.&lt;/p&gt;
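
&lt;p&gt;To make the ordering step concrete, here is a minimal sketch of a topological sort over a task plan, using Kahn's algorithm. It is not OMA's actual &lt;code&gt;TaskQueue&lt;/code&gt; code: the &lt;code&gt;Task&lt;/code&gt; shape and the &lt;code&gt;topoSort&lt;/code&gt; name are assumptions made up for this illustration, and the real plan format may look different.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical task shape for illustration; OMA's real plan format may differ.
type Task = { id: string; dependsOn: string[] }

// Kahn's algorithm: repeatedly emit tasks whose dependencies are all done.
function topoSort(tasks: Task[]): string[] {
  const pending: { [id: string]: number } = {} // unmet dependency count per task
  const dependents: { [id: string]: string[] } = {} // tasks waiting on each task

  for (const t of tasks) {
    pending[t.id] = t.dependsOn.length
    dependents[t.id] = []
  }
  for (const t of tasks) {
    for (const dep of t.dependsOn) {
      if (dependents[dep] === undefined) {
        throw new Error('Unknown dependency: ' + dep)
      }
      dependents[dep].push(t.id)
    }
  }

  // Start with every task that has nothing to wait for.
  const ready: string[] = []
  for (const t of tasks) {
    if (pending[t.id] === 0) ready.push(t.id)
  }

  const order: string[] = []
  while (ready.length !== 0) {
    const id = ready.shift() as string
    order.push(id)
    // Finishing this task may unblock the tasks that depend on it.
    for (const next of dependents[id]) {
      pending[next] -= 1
      if (pending[next] === 0) ready.push(next)
    }
  }

  // Anything left unscheduled sits on a dependency cycle.
  if (order.length !== tasks.length) {
    throw new Error('Dependency cycle detected')
  }
  return order
}

// The research + writing plan from this post, listed out of order on purpose.
const plan: Task[] = [
  { id: 'write', dependsOn: ['research'] },
  { id: 'research', dependsOn: [] },
]
console.log(topoSort(plan)) // research first, then write
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With two agents this is trivial, but the same idea lets a coordinator hand back tasks in any order and still have the writer wait for the researcher. A cycle in the plan fails loudly instead of hanging.&lt;/p&gt;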

&lt;h2&gt;
  
  
  When to use what
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AI SDK alone&lt;/strong&gt; handles most single-agent work: chatbots, RAG, tool-calling agents, structured extraction. If one agent can finish the job in a single conversation loop, adding OMA would just be extra complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add OMA when&lt;/strong&gt; you need agents collaborating -- research + writing teams, multi-perspective code review, fan-out data collection, anything where one agent's output feeds into another and the dependency graph isn't something you want to hardcode.&lt;/p&gt;

&lt;p&gt;Trade-offs, since every library has them:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;AI SDK&lt;/th&gt;
&lt;th&gt;OMA&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Provider support&lt;/td&gt;
&lt;td&gt;60+ (official + community)&lt;/td&gt;
&lt;td&gt;Anthropic, OpenAI-compatible, Gemini, Grok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DevTools&lt;/td&gt;
&lt;td&gt;Built-in DevTools, Telemetry integration&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;onProgress&lt;/code&gt; / &lt;code&gt;onTrace&lt;/code&gt; callbacks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Community&lt;/td&gt;
&lt;td&gt;Massive (10M+ weekly downloads)&lt;/td&gt;
&lt;td&gt;Smaller (5,700+ stars)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maturity&lt;/td&gt;
&lt;td&gt;Years of production use&lt;/td&gt;
&lt;td&gt;Newer, iterating fast&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;OMA's strengths are orchestration-specific: automatic task decomposition, dependency DAGs, shared memory, concurrency control with semaphores. Its provider coverage and tooling ecosystem are thinner. Whether that matters depends on your project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full example
&lt;/h2&gt;

&lt;p&gt;The working code is in the open-multi-agent repo:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/open-multi-agent/open-multi-agent/tree/main/examples/with-vercel-ai-sdk" rel="noopener noreferrer"&gt;github.com/open-multi-agent/open-multi-agent/tree/main/examples/with-vercel-ai-sdk&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Clone it, set your API key, &lt;code&gt;npm install &amp;amp;&amp;amp; npm run dev&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If multi-agent orchestration is new to you, the &lt;a href="https://github.com/open-multi-agent/open-multi-agent/blob/main/examples/01-single-agent.ts" rel="noopener noreferrer"&gt;single-agent example&lt;/a&gt; might be a better starting point.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>nextjs</category>
      <category>webdev</category>
      <category>typescript</category>
    </item>
  </channel>
</rss>
