<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Johnny Z</title>
    <description>The latest articles on DEV Community by Johnny Z (@stormhub).</description>
    <link>https://dev.to/stormhub</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2583804%2F5eaa8bd5-5b97-465d-a00e-89a397f5505a.jpeg</url>
      <title>DEV Community: Johnny Z</title>
      <link>https://dev.to/stormhub</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stormhub"/>
    <language>en</language>
    <item>
      <title>Agent with Vercel's Eve Framework</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Mon, 22 Jun 2026 04:51:54 +0000</pubDate>
      <link>https://dev.to/stormhub/agent-with-vercels-eve-framework-3c2l</link>
      <guid>https://dev.to/stormhub/agent-with-vercels-eve-framework-3c2l</guid>
      <description>&lt;p&gt;Vercel recently open-sourced &lt;a href="https://github.com/vercel/eve" rel="noopener noreferrer"&gt;Eve&lt;/a&gt;, a framework for building durable AI agents. It takes an opinionated, filesystem-first approach: instead of wiring up model loops, tool dispatch, and session persistence yourself, you author a directory of files and Eve handles the rest.&lt;/p&gt;

&lt;p&gt;I took it for a spin by building a shopping assistant — an agent that can search a product catalog, check inventory, compare prices, read reviews, and place orders. Here's what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Separation: Agent vs Channel
&lt;/h2&gt;

&lt;p&gt;Eve draws a hard line between what the agent &lt;em&gt;is&lt;/em&gt; and how it &lt;em&gt;communicates&lt;/em&gt;. The agent is the reasoning core — model, tools, instructions. It doesn't know or care how users reach it. A channel is just comms — it handles inbound transport, auth, message format, and delivery for a specific platform.&lt;/p&gt;

&lt;p&gt;This means the same agent can simultaneously serve a browser chat widget, a Slack bot, a CLI, and a custom webhook — without any conditional logic in the agent itself. You add surfaces by adding channel files, not by changing agent code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Channels: How Users Reach the Agent
&lt;/h2&gt;

&lt;p&gt;In Eve, a &lt;strong&gt;channel&lt;/strong&gt; is the edge adapter between a platform and your agent. It normalizes inbound messages, owns the conversation resume handle (&lt;code&gt;continuationToken&lt;/code&gt;), and decides how responses get delivered back.&lt;/p&gt;

&lt;p&gt;The default channel is &lt;code&gt;eve.ts&lt;/code&gt; — the HTTP session API that the dev TUI, browser clients, and &lt;code&gt;curl&lt;/code&gt; all talk to. But channels aren't limited to HTTP. Eve ships integrations for Slack, Discord, Teams, Telegram, Twilio (SMS/voice), GitHub, and Linear. You can also write custom channels for any surface (webhooks, WebSockets, internal systems).&lt;/p&gt;

&lt;p&gt;Each channel is a file under &lt;code&gt;agent/channels/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agent/channels/
├── eve.ts      # HTTP API (always present, even without a file)
├── slack.ts    # Slack DMs, mentions, buttons
└── intake.ts   # Your custom webhook channel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: your agent logic (instructions + tools) stays the same regardless of channel. You write the agent once, then expose it through multiple surfaces by adding channel files. The channel handles platform-specific concerns (auth, message format, delivery) while the agent handles reasoning.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Default Eve Channel
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;eve.ts&lt;/code&gt; channel is always present — it provides a session-based HTTP API that the dev TUI, browser clients (&lt;code&gt;useEveAgent&lt;/code&gt;), and &lt;code&gt;curl&lt;/code&gt; all use. The key concept is &lt;strong&gt;durable sessions&lt;/strong&gt;: you &lt;code&gt;POST&lt;/code&gt; to create a session, stream events from it via NDJSON, and continue it with a &lt;code&gt;continuationToken&lt;/code&gt;. Sessions survive server restarts and support reconnection at any event index (&lt;code&gt;?startIndex=N&lt;/code&gt;). This is fundamentally different from stateless request/response — Eve owns the conversation state server-side.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Sessions Are Durable by Default
&lt;/h3&gt;

&lt;p&gt;Eve sessions aren't just "kept in memory" — they're backed by a workflow engine. Under the hood, every turn runs as a durable workflow built on the open-source &lt;a href="https://workflow-sdk.dev/" rel="noopener noreferrer"&gt;Workflow SDK&lt;/a&gt;. Eve checkpoints progress at each &lt;strong&gt;step&lt;/strong&gt; boundary (one model call + its tool calls = one step). If the process crashes mid-turn, it resumes from the last completed step rather than replaying everything.&lt;/p&gt;

&lt;p&gt;Locally, this is just files on disk. Run your agent and you'll find a &lt;code&gt;.workflow-data/&lt;/code&gt; directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.workflow-data/
├── runs/       # one JSON file per session (workflow state)
├── steps/      # checkpoint per completed step
├── streams/    # event streams (what clients read via /stream)
├── hooks/      # parked continuation tokens (waiting for input)
└── events/     # workflow lifecycle events
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means you can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start a conversation, ask the agent something&lt;/li&gt;
&lt;li&gt;Kill the server (&lt;code&gt;Ctrl+C&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Restart it&lt;/li&gt;
&lt;li&gt;Continue the conversation with the same &lt;code&gt;sessionId&lt;/code&gt; and &lt;code&gt;continuationToken&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The session picks up exactly where it left off — including mid-turn recovery if a step was already completed before the crash.&lt;/p&gt;

&lt;p&gt;Obviously, local files aren't scalable for production. Eve's durability is pluggable via the &lt;a href="https://workflow-sdk.dev/" rel="noopener noreferrer"&gt;Workflow SDK&lt;/a&gt;'s "World" abstraction — the storage/queue/streaming backend. You pick a world package and Eve uses it for all session persistence:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;World&lt;/th&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@workflow/world-local&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filesystem (&lt;code&gt;.workflow-data/&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Local dev (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@workflow/world-postgres&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;PostgreSQL + graphile-worker&lt;/td&gt;
&lt;td&gt;Self-hosted production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@workflow/world-vercel&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Vercel Workflow (managed)&lt;/td&gt;
&lt;td&gt;Vercel deployments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@workflow-worlds/redis&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Redis + BullMQ&lt;/td&gt;
&lt;td&gt;Self-hosted, high throughput&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@workflow-worlds/mongodb&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;MongoDB&lt;/td&gt;
&lt;td&gt;Self-hosted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@workflow-worlds/turso&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Turso/libSQL (embedded SQLite)&lt;/td&gt;
&lt;td&gt;Edge/embedded&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;To switch, you set it in &lt;code&gt;agent.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4o&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;modelContextWindowTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;experimental&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;workflow&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;world&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@workflow/world-postgres&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The World interface has three responsibilities: &lt;strong&gt;Storage&lt;/strong&gt; (persisting runs, steps, hooks via an append-only event log), &lt;strong&gt;Queue&lt;/strong&gt; (dispatching workflow/step invocations with at-least-once delivery), and &lt;strong&gt;Streams&lt;/strong&gt; (real-time event delivery to clients). You can also &lt;a href="https://workflow-sdk.dev/docs/deploying/building-a-world" rel="noopener noreferrer"&gt;build your own&lt;/a&gt; if none of the existing options fit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Exposing the Agent as AG-UI
&lt;/h3&gt;

&lt;p&gt;To make this concrete — I built a custom channel that exposes the Eve agent over the &lt;a href="https://github.com/ag-ui-protocol/ag-ui" rel="noopener noreferrer"&gt;AG-UI protocol&lt;/a&gt; (SSE-based, used by CopilotKit and similar frameworks). The channel translates Eve's internal event stream into AG-UI's event vocabulary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// agent/channels/agui.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineChannel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Session&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;eve/channels&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;EventEncoder&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@ag-ui/encoder&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;EventType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;BaseEvent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;RunAgentInput&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@ag-ui/core&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineChannel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nc"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/agui&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;send&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;RunAgentInput&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;threadId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Missing 'threadId'.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;threadId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;runId&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="nf"&gt;randomUUID&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

      &lt;span class="c1"&gt;// AG-UI is stateless per request — clients send full message history.&lt;/span&gt;
      &lt;span class="c1"&gt;// Pass prior messages as context so Eve's agent sees the conversation.&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lastUserIdx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findLastIndex&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;lastUserIdx&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="s2"&gt;`[&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;]: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;lastUserIdx&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;continuationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`agui:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;threadId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;randomUUID&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="c1"&gt;// Read Eve's event stream and translate to AG-UI SSE&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;eveStream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getEventStream&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;EventEncoder&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="c1"&gt;// ... event mapping loop (see full source)&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The event mapping is mostly mechanical — &lt;code&gt;actions.requested&lt;/code&gt; → &lt;code&gt;TOOL_CALL_START&lt;/code&gt;/&lt;code&gt;ARGS&lt;/code&gt;/&lt;code&gt;END&lt;/code&gt;, &lt;code&gt;action.result&lt;/code&gt; → &lt;code&gt;TOOL_CALL_RESULT&lt;/code&gt;, &lt;code&gt;message.appended&lt;/code&gt; → &lt;code&gt;TEXT_MESSAGE_CONTENT&lt;/code&gt;, &lt;code&gt;turn.completed&lt;/code&gt; → &lt;code&gt;RUN_FINISHED&lt;/code&gt;. But there were a few non-obvious gotchas:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Eve sessions are durable; AG-UI runs are not.&lt;/strong&gt; Eve's event stream never closes after a turn — it waits for the next message. You must close the response stream yourself after emitting &lt;code&gt;RUN_FINISHED&lt;/code&gt;. If you don't, the client hangs forever waiting for more data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Eve emits both &lt;code&gt;turn.completed&lt;/code&gt; and &lt;code&gt;session.waiting&lt;/code&gt; at turn boundaries.&lt;/strong&gt; If you naively emit &lt;code&gt;RUN_FINISHED&lt;/code&gt; for both, the AG-UI client throws: &lt;em&gt;"Cannot send event type 'RUN_FINISHED': The run has already finished."&lt;/em&gt; Guard with a flag and only emit once.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. AG-UI is stateless; Eve is stateful.&lt;/strong&gt; Each AG-UI request carries the full message history. Since Eve's &lt;code&gt;send()&lt;/code&gt; creates/continues sessions via &lt;code&gt;continuationToken&lt;/code&gt;, you need a fresh token per request (otherwise Eve tries to continue a stale session). The conversation history goes through &lt;code&gt;context&lt;/code&gt; so the agent sees prior turns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Eve actions are typed unions.&lt;/strong&gt; &lt;code&gt;actions.requested&lt;/code&gt; contains a &lt;code&gt;RuntimeActionRequest[]&lt;/code&gt; that can be &lt;code&gt;tool-call&lt;/code&gt;, &lt;code&gt;subagent-call&lt;/code&gt;, or &lt;code&gt;load-skill&lt;/code&gt;. You need to filter for &lt;code&gt;action.kind === "tool-call"&lt;/code&gt; and use &lt;code&gt;action.toolName&lt;/code&gt; / &lt;code&gt;action.callId&lt;/code&gt; (not &lt;code&gt;.name&lt;/code&gt; / &lt;code&gt;.id&lt;/code&gt; which don't exist on the type).&lt;/p&gt;

&lt;p&gt;Same agent, same tools, same instructions — but now it speaks AG-UI over SSE at &lt;code&gt;POST /agui&lt;/code&gt;. You could have the Eve HTTP channel, a Slack channel, and this AG-UI channel all running simultaneously, each talking to the same underlying agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Developer Experience
&lt;/h2&gt;

&lt;p&gt;Run &lt;code&gt;pnpm dev&lt;/code&gt; (or &lt;code&gt;npx eve dev&lt;/code&gt;) and you get an interactive terminal UI that speaks the eve channel protocol locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;☰eve  shopping-agent-orchestrator

&amp;gt; show me laptops under $1000

✓ search_products  query="laptop" maxPrice=1000 → 1 result
✓ get_pricing  product="Dell XPS 13" → $899.10 (Summer Sale)
✓ check_stock  product="Dell XPS 13" → 12 units

I found the Dell XPS 13 for $899.10 (10% off with the Summer Sale).
It's in stock with 12 units available. Would you like to place an order?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The TUI shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tool calls as they happen (collapsed to one-line summaries)&lt;/li&gt;
&lt;li&gt;Streaming text as the model generates it&lt;/li&gt;
&lt;li&gt;Token usage stats&lt;/li&gt;
&lt;li&gt;Slash commands (&lt;code&gt;/model&lt;/code&gt; to switch models, &lt;code&gt;/new&lt;/code&gt; for a fresh session)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;File changes trigger a hot rebuild — edit a tool, and it's live on the next message.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Gotcha: Custom Models Need &lt;code&gt;modelContextWindowTokens&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;If you're using a non-gateway model (custom &lt;code&gt;baseURL&lt;/code&gt;, direct provider), Eve's build will fail with a cryptic error about compaction metadata. You need to explicitly tell it the context window size:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// agent/agent.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createOpenAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@ai-sdk/openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineAgent&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;eve&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createOpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_BASE_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// custom endpoint&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4o&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;modelContextWindowTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// required for direct providers&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What is &lt;code&gt;modelContextWindowTokens&lt;/code&gt;?&lt;/strong&gt; Eve has a built-in "compaction" system that prevents long conversations from overflowing the model's context window. When the conversation reaches ~90% of the window, Eve automatically summarizes older turns into a compressed form so the session can keep going indefinitely. To do this, Eve needs to know how big the window is. Gateway models (like &lt;code&gt;"openai/gpt-4o"&lt;/code&gt;) have this metadata in Vercel's catalog. But when you bring your own provider via &lt;code&gt;createOpenAI()&lt;/code&gt;, Eve has no way to look it up — so you tell it explicitly.&lt;/p&gt;

&lt;p&gt;Without this field, the build fails at compile time rather than silently skipping compaction at runtime. The error message (&lt;code&gt;"does not have known AI Gateway context window metadata"&lt;/code&gt;) doesn't make the fix obvious.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero orchestration code.&lt;/strong&gt; I didn't write a single line of routing, tool dispatch, or streaming logic. The model handles multi-step reasoning natively — search → check stock → order — guided by the markdown instructions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The filesystem convention removes boilerplate.&lt;/strong&gt; Adding a new capability is "create a file." No registration, no imports to wire up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Durable sessions out of the box.&lt;/strong&gt; Multi-turn conversations, reconnection, human-in-the-loop approval — it's all built in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The dev TUI is genuinely useful.&lt;/strong&gt; Seeing tool calls execute in real-time while developing makes the feedback loop fast.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2026-06-22" rel="noopener noreferrer"&gt;Complete sample code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>typescript</category>
      <category>agents</category>
    </item>
    <item>
      <title>Beyond the Agentic Loop, in TypeScript: building a shopping agent with the Orchestrator pattern</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Wed, 17 Jun 2026 00:26:20 +0000</pubDate>
      <link>https://dev.to/stormhub/beyond-the-agentic-loop-in-typescript-building-a-shopping-agent-with-the-orchestrator-pattern-7ka</link>
      <guid>https://dev.to/stormhub/beyond-the-agentic-loop-in-typescript-building-a-shopping-agent-with-the-orchestrator-pattern-7ka</guid>
      <description>&lt;p&gt;This post is a TypeScript implementation of the pattern described in &lt;a href="https://stackademic.com/blog/beyond-the-agentic-loop-the-orchestrator-pattern-for-multi-agent-systems" rel="noopener noreferrer"&gt;&lt;strong&gt;"Beyond the Agentic Loop: The Orchestrator Pattern for Multi-Agent Systems"&lt;/strong&gt;&lt;/a&gt; by &lt;strong&gt;Amogh Ubale&lt;/strong&gt; (Stackademic). The original is Python with generic agents; here we keep the idea intact and re-theme it as a &lt;strong&gt;shopping assistant&lt;/strong&gt; so the three execution modes have something concrete to chew on. All the design credit goes to that article — go read it first.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cast: a handful of shopping agents
&lt;/h2&gt;

&lt;p&gt;Before the pattern, the scene. The demo is a small storefront assistant backed by a&lt;br&gt;
few single-purpose agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Catalog&lt;/strong&gt; — list the categories on offer, or search products by keyword and price.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inventory&lt;/strong&gt; — check stock and availability for a product.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pricing&lt;/strong&gt; — look up the current price and any active promotions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reviews&lt;/strong&gt; — fetch a product's rating and review highlights.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Order&lt;/strong&gt; — place an order for a product.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A customer request might need just one of these, several of them at once, or a few in a&lt;br&gt;
strict order — and deciding &lt;em&gt;which&lt;/em&gt; of those shapes a request calls for is exactly what&lt;br&gt;
the orchestrator is for.&lt;/p&gt;
&lt;h2&gt;
  
  
  The problem: the LLM as a &lt;code&gt;while&lt;/code&gt; loop
&lt;/h2&gt;

&lt;p&gt;The default way to build a multi-agent system is the &lt;strong&gt;agentic loop&lt;/strong&gt;: you hand the&lt;br&gt;
model a bag of tools and let it drive.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;think → call a tool → observe the result → think again → call another tool → …
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM is both the brain &lt;em&gt;and&lt;/em&gt; the control flow. That's wonderfully flexible, and&lt;br&gt;
it's the right tool when the task is open-ended and you genuinely don't know the&lt;br&gt;
steps in advance. But in production it has three nasty properties:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unpredictable shape.&lt;/strong&gt; Every "think" step is another LLM round-trip, so a
three-agent task might take 3 calls or 9 — you don't know until it runs, and latency
swings with it. (The article clocks a representative three-agent query at ~7 calls
through the loop; the wall-clock and spend follow, but the unpredictability is the
part that actually bites.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-determinism.&lt;/strong&gt; The same question can take a different path each time, which
makes behavior hard to reason about and hard to trust with side effects — like
&lt;em&gt;placing an order&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Poor observability.&lt;/strong&gt; "Why did it do that?" means replaying a transcript of
intermingled reasoning and tool calls. There's no single place where the &lt;em&gt;plan&lt;/em&gt;
lives.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you already know which agents exist and what they do, an open-ended reasoning loop&lt;br&gt;
on every request is more freedom than the job needs.&lt;/p&gt;
&lt;h2&gt;
  
  
  The pattern: decide once, execute deterministically
&lt;/h2&gt;

&lt;p&gt;The orchestrator's move is to &lt;strong&gt;separate the decision from the execution&lt;/strong&gt;. Instead&lt;br&gt;
of letting the model loop, you make exactly &lt;strong&gt;two&lt;/strong&gt; LLM calls with plain,&lt;br&gt;
deterministic code in between:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query ──▶ [ROUTE: LLM #1] ──▶ [EXECUTE: agents, no LLM] ──▶ [SYNTHESIZE: LLM #2] ──▶ answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Route&lt;/strong&gt; — one LLM call whose &lt;em&gt;only&lt;/em&gt; job is to pick which agent(s) to run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execute&lt;/strong&gt; — ordinary application code runs those agents. No LLM here.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synthesize&lt;/strong&gt; — one LLM call turns the structured results into prose.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Two calls, every time, no matter how many agents run. That fixed shape is the whole&lt;br&gt;
point: a plan you can inspect before anything happens, latency that doesn't depend on&lt;br&gt;
the model's mood, and independent work you can fan out. (It's cheaper too — the article&lt;br&gt;
puts the same query at ~2 calls instead of ~7 — but the cost isn't the headline; the&lt;br&gt;
&lt;em&gt;outcomes&lt;/em&gt; are.)&lt;/p&gt;
&lt;h2&gt;
  
  
  1. The registry: agents are just functions
&lt;/h2&gt;

&lt;p&gt;An agent is a name, a description (for the router), a JSON-Schema for its arguments,&lt;br&gt;
and an &lt;code&gt;execute&lt;/code&gt; function. Nothing more.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/server/orchestrator/types.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;ExecuteFn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentArgs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;AgentResult&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AgentDefinition&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// human name, e.g. "Catalog Agent"&lt;/span&gt;
  &lt;span class="nl"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// shown to the router LLM so it can choose this tool&lt;/span&gt;
  &lt;span class="nl"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// JSON Schema for the args&lt;/span&gt;
  &lt;span class="nl"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ExecuteFn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The "registry" is a plain in-process object — agents are &lt;strong&gt;registered by hand&lt;/strong&gt;.&lt;br&gt;
There's deliberately no Redis, no database, no HTTP self-registration. That keeps the&lt;br&gt;
whole thing runnable and testable with zero infrastructure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/server/orchestrator/registry.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;REGISTRY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;AgentDefinition&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;catalog_agent__list_categories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;catalogCategoriesAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;catalog_agent__search_products&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;catalogAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;inventory_agent__check_stock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;inventoryAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pricing_agent__get_deals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;pricingAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;reviews_agent__get_reviews&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;reviewsAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;order_agent__place_order&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;orderAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;toolDefinitions()&lt;/code&gt; projects that map into the OpenAI tool format the router sees —&lt;br&gt;
each agent becomes one function tool, plus one &lt;strong&gt;meta-tool&lt;/strong&gt; we'll meet shortly.&lt;/p&gt;
&lt;h2&gt;
  
  
  2. Route: the one decision-making LLM call
&lt;/h2&gt;

&lt;p&gt;The router is given a blunt system prompt: &lt;em&gt;pick tools, do not answer.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/server/orchestrator/router.ts&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`You are a query router. Your ONLY job is to decide which tool(s) to call.
Rules:
- If the query needs ONE agent, call that one tool.
- If the query needs MULTIPLE INDEPENDENT agents, call all of them.
- If the query needs steps IN ORDER (a later step depends on an earlier one), call plan_execution and provide the ordered steps.
Do NOT answer the user's question — just pick tools.`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We call the model at &lt;code&gt;temperature: 0&lt;/code&gt; with &lt;code&gt;tool_choice: "auto"&lt;/code&gt;, then read its tool&lt;br&gt;
calls back out. The shape of that tool-call list &lt;em&gt;is&lt;/em&gt; the execution plan — we never&lt;br&gt;
ask the model to "answer," only to choose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/server/orchestrator/router.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RouteDecision&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getOpenAIClient&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;getConfig&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;ROUTER_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;toolDefinitions&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;tool_choice&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;auto&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;toolCalls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tool_calls&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

  &lt;span class="c1"&gt;// plan_execution present -&amp;gt; sequential. Take its ordered steps.&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;planCall&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toolCalls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;PLAN_EXECUTION_TOOL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;planCall&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;safeParseArgs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;planCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;args&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;AgentArgs&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;steps&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[]).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reason&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sequential&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;steps&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toolCalls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;safeParseArgs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;parallel&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;single&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;steps&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the router collapses to three outcomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;one&lt;/strong&gt; tool → &lt;code&gt;single&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;several&lt;/strong&gt; tools → &lt;code&gt;parallel&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;&lt;code&gt;plan_execution&lt;/code&gt;&lt;/strong&gt; meta-tool → &lt;code&gt;sequential&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Execute: the heart of the pattern (no LLM)
&lt;/h2&gt;

&lt;p&gt;This is where parallel and sequential actually diverge — and it's pure TypeScript,&lt;br&gt;
no model involved.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/server/orchestrator/executor.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;executeStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Mode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PlanStep&lt;/span&gt;&lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="nx"&gt;AsyncGenerator&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;ExecEvent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;AgentContext&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentContext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mode&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;parallel&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;agent_start&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;settled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nx"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;settled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;agent_result&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// single + sequential: ordered; each step sees prior results as context.&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;agent_start&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;agent_result&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read the two branches side by side:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Parallel&lt;/strong&gt; is &lt;code&gt;Promise.all&lt;/code&gt;. The agents are independent, so they all fire at
once and you pay for the slowest one, not the sum. &lt;em&gt;"What's the price, rating, and
stock of the iPhone 15?"&lt;/em&gt; becomes three lookups that have nothing to say to each
other — run them together.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sequential&lt;/strong&gt; is an ordered &lt;code&gt;for&lt;/code&gt; loop where each step receives the accumulated
&lt;code&gt;results&lt;/code&gt; as its &lt;code&gt;context&lt;/code&gt;. That's how a later agent consumes an earlier one's
output. &lt;em&gt;"Find a laptop under $1000, check it's in stock, then order it"&lt;/em&gt; can't be
parallel — the order step needs the product the search produced.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(The generator &lt;code&gt;yield&lt;/code&gt;s a small event before and after each agent. That's only so a&lt;br&gt;
transport can show progress; it doesn't change the logic.)&lt;/p&gt;
&lt;h2&gt;
  
  
  4. &lt;code&gt;plan_execution&lt;/code&gt;: a signal, not an agent
&lt;/h2&gt;

&lt;p&gt;How does the router say "do these in order"? With a meta-tool that runs no code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/server/orchestrator/registry.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PLAN_EXECUTION_TOOL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;plan_execution&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;// ...its tool schema asks for { reason, steps: [{ tool, args, reason }] }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the router selects &lt;code&gt;plan_execution&lt;/code&gt;, the orchestrator switches to sequential&lt;br&gt;
mode. The original article treats it purely as a &lt;em&gt;signal&lt;/em&gt; and leaves the ordering and&lt;br&gt;
data-passing unspecified. This repo makes one deliberate addition so the demo&lt;br&gt;
actually works end-to-end: &lt;strong&gt;&lt;code&gt;plan_execution&lt;/code&gt; returns the ordered &lt;code&gt;steps&lt;/code&gt;&lt;/strong&gt;, and the&lt;br&gt;
executor threads &lt;code&gt;results&lt;/code&gt; forward as context. The order agent then resolves the&lt;br&gt;
product the search found (see &lt;code&gt;resolveTargetProduct&lt;/code&gt; in&lt;br&gt;
&lt;code&gt;src/server/lib/resolve-product.ts&lt;/code&gt;). That's the difference between a pattern diagram&lt;br&gt;
and a thing you can run.&lt;/p&gt;
&lt;h2&gt;
  
  
  5. Synthesize: the only creative call
&lt;/h2&gt;

&lt;p&gt;Once the agents have produced structured data, a second LLM call turns it into an&lt;br&gt;
answer. This is the only step with any "writing" to do, so it runs warmer and streams&lt;br&gt;
its tokens out.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/server/orchestrator/synthesizer.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;synthesizeStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;AgentContext&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;AsyncGenerator&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getOpenAIClient&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;getConfig&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;SYNTH_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Summarize the agent results into a clear, helpful answer.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`User asked: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\nResults: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;delta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What this buys you
&lt;/h2&gt;

&lt;p&gt;Putting the three phases together, the payoff is exactly the inverse of the loop's&lt;br&gt;
pain points — and these &lt;em&gt;enablements&lt;/em&gt;, not the price tag, are the real reason to reach&lt;br&gt;
for it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A plan you can trust.&lt;/strong&gt; The decision is a single inspectable object — the
&lt;code&gt;RouteDecision&lt;/code&gt; — produced &lt;em&gt;before&lt;/em&gt; any agent runs. You can log it, assert on it,
gate it, replay it. That's what makes it safe to let an agent actually place an
order.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debuggability.&lt;/strong&gt; The execute phase is deterministic, so a bug there reproduces
every time instead of hiding in a different transcript on each run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallelism for free.&lt;/strong&gt; Independent work is a &lt;code&gt;Promise.all&lt;/code&gt;; you didn't have to
teach the model to be concurrent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A testable core.&lt;/strong&gt; Because the middle phase has no LLM in it, &lt;code&gt;executeStream&lt;/code&gt; is
an ordinary async function you can unit-test with a stub registry — no API key, no
flakiness.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable runs&lt;/strong&gt; (the boring-but-nice one). Always two LLM calls, whether the
request touches one agent or five — so latency is something you can put a number on,
and the bill is lower as a side effect.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Sample queries → how they route
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Agents&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;what do you have?&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;single&lt;/td&gt;
&lt;td&gt;&lt;code&gt;catalog_agent__list_categories&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;what's the price, rating and availability of the iPhone 15?&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;parallel&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;pricing&lt;/code&gt; + &lt;code&gt;reviews&lt;/code&gt; + &lt;code&gt;inventory&lt;/code&gt; (at once)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;find a laptop under $1000, make sure it's in stock, then order it&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;sequential&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;search&lt;/code&gt; → &lt;code&gt;check stock&lt;/code&gt; → &lt;code&gt;order&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Same agents, same data — the router decides the &lt;strong&gt;shape&lt;/strong&gt; of the run.&lt;/p&gt;

&lt;h2&gt;
  
  
  When the loop still wins
&lt;/h2&gt;

&lt;p&gt;This isn't "orchestrator good, loop bad." The agentic loop is the right tool when the&lt;br&gt;
task is genuinely exploratory: you don't know the steps ahead of time, the toolset is&lt;br&gt;
open-ended, or the agent needs to re-plan mid-flight based on what it discovers. The&lt;br&gt;
orchestrator trades that adaptability for predictability — and it assumes you can&lt;br&gt;
enumerate your agents up front. Note too that the router here is itself a single LLM&lt;br&gt;
call, so a truly novel multi-hop plan it has never seen is out of scope by design.&lt;/p&gt;

&lt;p&gt;The article's framing is the one to keep: &lt;strong&gt;loop for exploration, orchestrator for&lt;br&gt;
production.&lt;/strong&gt; If you already know your agents and you need bounded latency, parallel&lt;br&gt;
execution, and debuggable runs — ask the model once, execute, synthesize. Two calls,&lt;br&gt;
done.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2026-06-17/agent-orchestrator" rel="noopener noreferrer"&gt;Complete sample code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Agent Skills in Microsoft Agent Framework</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Thu, 04 Jun 2026 05:52:25 +0000</pubDate>
      <link>https://dev.to/stormhub/agent-skills-in-microsoft-agent-framework-2gof</link>
      <guid>https://dev.to/stormhub/agent-skills-in-microsoft-agent-framework-2gof</guid>
      <description>&lt;p&gt;The &lt;a href="https://learn.microsoft.com/en-us/agent-framework/overview/agent-framework-overview" rel="noopener noreferrer"&gt;Microsoft Agent Framework&lt;/a&gt; recently added &lt;strong&gt;skills&lt;/strong&gt; support, built around &lt;em&gt;progressive disclosure&lt;/em&gt; (still in beta). The &lt;a href="https://devblogs.microsoft.com/agent-framework/give-your-agents-domain-expertise-with-agent-skills-in-microsoft-agent-framework/" rel="noopener noreferrer"&gt;Give Your Agents Domain Expertise with Agent Skills&lt;/a&gt; devblog is an excellent introduction, so I won't re-tread the basics here.&lt;/p&gt;

&lt;p&gt;If you've used skills in a coding agent, the idea is familiar: a skill is just a folder — a &lt;code&gt;SKILL.md&lt;/code&gt; manifest plus reference documents and scripts — that the agent discovers and pulls in &lt;em&gt;only when it needs to&lt;/em&gt;. Instead of stuffing every capability into the system prompt, the agent sees a lightweight catalog of skill names and descriptions, and loads the full content on demand. That's the whole point of progressive disclosure: an agent's context is a budget, and skills are a way to spend it lazily.&lt;/p&gt;

&lt;p&gt;In practice that part just works: when a request matches a skill, the model is nudged to call the built-in &lt;code&gt;load_skill&lt;/code&gt; tool, and the framework returns the skill's full content for the model to use. Triggering and loading behave exactly as advertised.&lt;/p&gt;

&lt;p&gt;But spending the budget is only half the story. Once a skill's content is loaded, &lt;em&gt;where does it actually live — and is it ever dropped?&lt;/em&gt; The docs are silent on this, and it's the question the rest of this post digs into.&lt;/p&gt;

&lt;p&gt;The short answer: within a session, it isn't dropped at all. Starting a new session drops everything, of course — that much is obvious. The part worth knowing is what happens &lt;em&gt;inside&lt;/em&gt; a single session: once loaded, a skill's full content stays in the conversation for the entire life of that session. There's no budget, no sliding window, no eviction. The rest of the post shows how I confirmed this, and why it matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Watching a skill get triggered
&lt;/h2&gt;

&lt;p&gt;The sample is a tiny console app running entirely against a local &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; model — no cloud keys, and every HTTP call is traced so I can see exactly what goes over the wire (&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2026-06-02/AgentSkillsDemo" rel="noopener noreferrer"&gt;complete sample code&lt;/a&gt;). There's a single skill on disk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;skills/unit-converter/
├── SKILL.md                        # name + description + usage steps
└── references/conversion-table.md  # the actual conversion factors
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wiring it into the agent is one line — &lt;code&gt;AgentSkillsProvider&lt;/code&gt; is just an &lt;code&gt;AIContextProvider&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agentOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;ChatClientAgentOptions&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"UnitConverterAgent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ChatOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;ChatOptions&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Instructions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"You are a helpful assistant that can convert units. ..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Tools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Convert&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;AIContextProviders&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;skillsProvider&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;   &lt;span class="c1"&gt;// &amp;lt;-- skills plug in here&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On every request, that provider does two things. First, it injects a &lt;strong&gt;catalog&lt;/strong&gt; of skills — names and descriptions only — into the system prompt. That's the entire "advertisement" the model sees up front; no factors, no usage steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;available_skills&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;skill&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;name&amp;gt;&lt;/span&gt;unit-converter&lt;span class="nt"&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;description&amp;gt;&lt;/span&gt;Convert between common units using a multiplication factor.
      Use when asked to convert miles, kilometers, pounds, or kilograms.&lt;span class="nt"&gt;&amp;lt;/description&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/skill&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/available_skills&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Second, it registers three tools the model can call to pull in more on demand: &lt;code&gt;load_skill&lt;/code&gt;, &lt;code&gt;read_skill_resource&lt;/code&gt;, and &lt;code&gt;run_skill_script&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Intercepting the tool calls
&lt;/h3&gt;

&lt;p&gt;To watch the triggering happen, I don't need to read the trace — the framework lets you intercept every tool call with function-invocation middleware. &lt;code&gt;AIAgentBuilder.Use(...)&lt;/code&gt; wraps the agent and hands you each call before it runs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agentOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AIAgentBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="s"&gt;"load_skill"&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="s"&gt;"read_skill_resource"&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="s"&gt;"run_skill_script"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;$"Skill triggered: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;(&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Arguments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetValueOrDefault&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"skillName"&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s"&gt;)"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The three skill tools are supplied by the provider, but they flow through the same function-invoking pipeline as my own &lt;code&gt;Convert&lt;/code&gt; tool — so this one interceptor sees them all, and I just filter by name.&lt;/p&gt;

&lt;p&gt;Now I ask a question that needs the skill:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How many kilometers is a marathon (26.2 miles)? And how many pounds is 75 kilograms?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;and the triggering shows up live:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;Skill triggered: load_skill(unit-converter)
Skill triggered: read_skill_resource(unit-converter)
Agent: A marathon of 26.2 miles is approximately 42.16 kilometers, and 75 kilograms is approximately 165.35 pounds.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the disclosure unfolds in stages, exactly as designed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The model sees only the catalog, decides &lt;code&gt;unit-converter&lt;/code&gt; is relevant, and calls &lt;code&gt;load_skill("unit-converter")&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The framework returns the full &lt;code&gt;SKILL.md&lt;/code&gt; as the tool result. Its usage steps tell the model to consult &lt;code&gt;references/conversion-table.md&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The model calls &lt;code&gt;read_skill_resource&lt;/code&gt; to pull that reference, then runs the actual conversion.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each step pulls in a little more context, only when it is needed. This is progressive disclosure working as promised — the part the docs cover well. The interesting question is what happens to all that loaded content next.&lt;/p&gt;

&lt;h2&gt;
  
  
  Loaded once, kept for the whole session
&lt;/h2&gt;

&lt;p&gt;So where does that loaded content go? Straight into the session history — and it stays. After the run I read the history back and tagged the skill messages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;===== Session history after run: 8 messages =====
  [ 1] [SKILL] assistant call -&amp;gt; load_skill
  [ 2] [SKILL] tool      tool result          ← the full SKILL.md body
  [ 3] [SKILL] assistant call -&amp;gt; read_skill_resource
  [ 4] [SKILL] tool      tool result          ← the reference content
  ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;load_skill&lt;/code&gt; body and the reference are sitting right there as ordinary tool messages, and nothing removes them. That's the part to take away: within a session, loaded skill content lives &lt;strong&gt;forever&lt;/strong&gt;. It's not the skills provider holding on to it — &lt;code&gt;load_skill&lt;/code&gt; just returns a normal tool message, and a tool message is history like any other. So every subsequent turn on that session re-sends the whole thing. No budget, no sliding window, no eviction; the only thing that clears it is starting a new session.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compact automatically — but only when skills are in play
&lt;/h2&gt;

&lt;p&gt;Skills can be large, so on a long-lived session this adds up fast: you can't keep carrying every loaded skill forward. The fix is compaction, and the framework ships it out of the box. &lt;code&gt;CompactionProvider&lt;/code&gt; is just another &lt;code&gt;AIContextProvider&lt;/code&gt; you add alongside the skills provider, and &lt;code&gt;SummarizationCompactionStrategy&lt;/code&gt; &lt;em&gt;summarizes&lt;/em&gt; older history instead of dropping it — and it groups messages so a &lt;code&gt;load_skill&lt;/code&gt; call is never split from its result.&lt;/p&gt;

&lt;p&gt;I don't want to compact on &lt;em&gt;every&lt;/em&gt; turn, though — only when there's actually skill content to reclaim. A &lt;code&gt;CompactionTrigger&lt;/code&gt; is just a predicate over the message groups, so I gate it on whether a skill tool was called:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;CompactionTrigger&lt;/span&gt; &lt;span class="n"&gt;skillsTriggered&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Groups&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SelectMany&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Messages&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;History&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MentionsSkillTool&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;AIContextProviders&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;skillsProvider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;CompactionProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;SummarizationCompactionStrategy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skillsTriggered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;minimumPreservedGroups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compaction runs before each turn. On a fresh first turn there's nothing to compact; once a skill has been loaded, the next turn triggers a one-off summarization call and the bulky &lt;code&gt;SKILL.md&lt;/code&gt; body drops out of what's sent to the model — replaced by a short summary, while the conversation keeps going. Spend the budget lazily on the way in, reclaim it automatically on the way out.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2026-06-02/AgentSkillsDemo" rel="noopener noreferrer"&gt;Complete sample code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>microsoft</category>
      <category>ai</category>
      <category>dotnet</category>
    </item>
    <item>
      <title>Building Autonomous Agent Coding Harness</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Thu, 16 Apr 2026 05:32:27 +0000</pubDate>
      <link>https://dev.to/stormhub/building-autonomous-agent-coding-harness-4g0n</link>
      <guid>https://dev.to/stormhub/building-autonomous-agent-coding-harness-4g0n</guid>
      <description>&lt;p&gt;This is a personal experiment in autonomous coding &lt;a href="https://github.com/StormHub/Agents.Code" rel="noopener noreferrer"&gt;source&lt;/a&gt;, built with the &lt;a href="https://platform.claude.com/docs/en/agent-sdk/overview" rel="noopener noreferrer"&gt;Claude Agent SDK&lt;/a&gt;. It takes a spec (markdown or text) and builds a full-stack application using three specialized agents, as described in this &lt;a href="https://www.anthropic.com/engineering/harness-design-long-running-apps" rel="noopener noreferrer"&gt;Anthropic post&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirement
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Build a full-stack application (Next.js + .NET) weather chat. &lt;/li&gt;
&lt;li&gt;I have manually created an "ideal target solution" &lt;a href="https://github.com/StormHub/Agents.Resources" rel="noopener noreferrer"&gt;reference implementation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why This Project Is Hard
&lt;/h2&gt;

&lt;p&gt;While building a weather chat app sounds straightforward, this implementation intentionally introduces architectural challenges that test whether coding agents can work with unfamiliar, cutting-edge libraries — or whether they fall back to well-known patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Backend (Agent Construction &amp;amp; Local LLM Integration):&lt;/strong&gt; The .NET API utilizes the &lt;a href="https://github.com/microsoft/agent-framework" rel="noopener noreferrer"&gt;Microsoft Agent Framework&lt;/a&gt; and exposes the agent via the relatively new &lt;a href="https://docs.ag-ui.com/introduction" rel="noopener noreferrer"&gt;AG-UI&lt;/a&gt; protocol. A key challenge lies in the underlying &lt;a href="https://learn.microsoft.com/en-us/dotnet/ai/microsoft-extensions-ai" rel="noopener noreferrer"&gt;Microsoft.Extensions.AI&lt;/a&gt; pipeline: coding agents must understand how to connect a local Ollama server, correctly register it as an &lt;code&gt;IChatClient&lt;/code&gt;, configure the agent with tools, and seamlessly wire everything into the .NET dependency injection container.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Schema-Driven UI Rendering (The Catalyst):&lt;/strong&gt; To achieve the visual "Generative UI" component, the application utilizes &lt;a href="https://github.com/vercel-labs/json-render" rel="noopener noreferrer"&gt;@vercel-labs/json-render&lt;/a&gt;. This introduces a profound layer of abstraction. Rather than passing generic data to props, coding agents must grasp an indirect, specification-based rendering model. The frontend strictly expects tool outputs to be converted into a structured UI spec tree (e.g., &lt;code&gt;Container&lt;/code&gt; -&amp;gt; &lt;code&gt;WeatherCard&lt;/code&gt; -&amp;gt; &lt;code&gt;ForecastGrid&lt;/code&gt;), mapped dynamically to concrete React components via a component catalog. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Full-Stack Tool Coupling &amp;amp; Protocol Bridging:&lt;/strong&gt; Driven by the strict schema requirements of the UI, tool execution becomes a highly coupled, full-stack concern. The backend emits raw AG-UI Server-Sent Events (SSE), which the Next.js server must manually parse and map to the Vercel &lt;a href="https://github.com/vercel/ai" rel="noopener noreferrer"&gt;AI SDK&lt;/a&gt; 'UIMessage' types. Crucially, because the AG-UI protocol exposes tool execution results directly to the client stream as JSON payloads, coding agents must explicitly co-design the C# backend tool call result types to satisfy the frontend's schema-driven expectations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Custom Generative UI Transport &amp;amp; State:&lt;/strong&gt; Because these tightly-coupled tool outputs stream directly to the client, standard AI SDK hooks aren't enough out-of-the-box. The frontend requires configuring &lt;code&gt;useChat&lt;/code&gt; with a custom &lt;code&gt;DefaultChatTransport&lt;/code&gt;. Agents must design the UI interface such that the incoming JSON payloads seamlessly inject complex parts into the &lt;code&gt;ChatMessage&lt;/code&gt; state. They must deeply understand multi-part message trees—accurately inspecting &lt;code&gt;part.type&lt;/code&gt; and &lt;code&gt;part.state === "output-available"&lt;/code&gt; to interrupt typical text rendering and conditionally mount the generated JSON UI spec.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  First Round Result
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/StormHub/Agents.Code/results/round-1/features.md" rel="noopener noreferrer"&gt;Feature requirement file&lt;/a&gt; only — intentionally instructed to use simulated/mock weather data to reduce complexity&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/StormHub/Agents.Code/results/round-1/output" rel="noopener noreferrer"&gt;Result output&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Gap Analysis
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Reference (Target)&lt;/th&gt;
&lt;th&gt;Generated (Round 1)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;.NET version&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;.NET 10&lt;/td&gt;
&lt;td&gt;.NET 8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Backend framework&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Microsoft Agent Framework (&lt;code&gt;Microsoft.Agents.AI&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Plain ASP.NET Core MVC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Streaming protocol&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AG-UI via SSE&lt;/td&gt;
&lt;td&gt;Standard JSON REST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ollama via &lt;code&gt;OllamaSharp&lt;/code&gt; + &lt;code&gt;IChatClient&lt;/code&gt; DI&lt;/td&gt;
&lt;td&gt;None — rule-based string matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Frontend AI SDK&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@ai-sdk/react&lt;/code&gt; &lt;code&gt;useChat&lt;/code&gt; + &lt;code&gt;DefaultChatTransport&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Raw &lt;code&gt;fetch()&lt;/code&gt; + &lt;code&gt;useState&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;UI rendering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@json-render&lt;/code&gt; (schema-driven spec tree)&lt;/td&gt;
&lt;td&gt;Direct hardcoded React components&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every architectural constraint specified in the feature requirements — AG-UI, Microsoft Agent Framework, Ollama, json-render — was ignored. The agents built a conventional CRUD-style app instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it got right:&lt;/strong&gt; The app is functional end-to-end with good visual design (glassmorphic cards, dynamic backgrounds, custom SVG icons), responsive layout, and clean code structure. About 7 of 16 features work partially or fully.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it missed:&lt;/strong&gt; No SSE streaming, no LLM tool calling (just regex location extraction), no schema-driven UI rendering, no AI SDK hooks. The &lt;code&gt;ai&lt;/code&gt; npm package was even installed but never imported.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Given only a feature spec, coding agents gravitate toward familiar patterns from training data. The novel integration requirements (AG-UI, json-render, Agent Framework) — which are the architecturally interesting parts — were completely bypassed in favor of well-known alternatives.&lt;/p&gt;

&lt;h2&gt;
  
  
  Second Round Result
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Enhanced &lt;a href="https://github.com/StormHub/Agents.Code/results/round-2/features.md" rel="noopener noreferrer"&gt;feature requirements&lt;/a&gt; with explicit architectural instructions — specifying &lt;code&gt;MapAGUI&lt;/code&gt;, &lt;code&gt;ChatClientAgent&lt;/code&gt;, &lt;code&gt;defineCatalog&lt;/code&gt;/&lt;code&gt;defineRegistry&lt;/code&gt;, &lt;code&gt;useChat&lt;/code&gt; with transport, etc.&lt;/li&gt;
&lt;li&gt;After round 1's results, custom skills created for json-render and Microsoft Agent Framework, and installed official Vercel Next.js and AI SDK skills to give agents better guidance&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/StormHub/Agents.Code/results/round-2/output" rel="noopener noreferrer"&gt;Result output&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Gap Analysis
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Reference (Target)&lt;/th&gt;
&lt;th&gt;Generated (Round 2)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;.NET version&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;.NET 10&lt;/td&gt;
&lt;td&gt;.NET 10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Backend framework&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Microsoft Agent Framework (&lt;code&gt;MapAGUI&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Packages installed but &lt;strong&gt;not used&lt;/strong&gt; — plain REST API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Streaming protocol&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AG-UI via SSE&lt;/td&gt;
&lt;td&gt;Standard JSON REST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ollama via &lt;code&gt;OllamaSharp&lt;/code&gt; + &lt;code&gt;IChatClient&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Package installed, only checks if Ollama is running — &lt;strong&gt;never calls it&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Frontend AI SDK&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@ai-sdk/react&lt;/code&gt; &lt;code&gt;useChat&lt;/code&gt; + &lt;code&gt;DefaultChatTransport&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Package installed but uses raw &lt;code&gt;fetch()&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;UI rendering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@json-render/react&lt;/code&gt; (real package)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Fake shim&lt;/strong&gt; — hand-written &lt;code&gt;json-render-compat.ts&lt;/code&gt; reimplements &lt;code&gt;defineCatalog&lt;/code&gt;/&lt;code&gt;defineRegistry&lt;/code&gt; as simple wrappers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Progress from round 1:&lt;/strong&gt; The agents now &lt;em&gt;acknowledge&lt;/em&gt; the required technologies — correct .NET version, right NuGet packages installed, catalog/registry file structure present. The feature requirements with explicit API names clearly helped.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's still wrong:&lt;/strong&gt; The acknowledgment is superficial. The agents installed &lt;code&gt;Microsoft.Agents.AI&lt;/code&gt; and &lt;code&gt;OllamaSharp&lt;/code&gt; but never called &lt;code&gt;MapAGUI()&lt;/code&gt; or created a &lt;code&gt;ChatClientAgent&lt;/code&gt;. Instead of installing &lt;code&gt;@json-render/react&lt;/code&gt;, they wrote a 40-line compatibility shim that mimics the API surface but does nothing — the &lt;code&gt;&amp;lt;Renderer&amp;gt;&lt;/code&gt; component from json-render is never used. The backend is still hardcoded pattern matching over 6 cities with no LLM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Adding skills and explicit architectural instructions moved agents from "completely ignore" to "install the packages and create the right file names." But the actual wiring — the hard part — was still substituted with familiar patterns. The agents created a &lt;em&gt;cargo cult&lt;/em&gt; of the architecture: the right shape, with none of the substance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The progression across rounds tells a clear story. Round 1 completely ignored the architectural requirements. Round 2 acknowledged them superficially — installing the right packages, creating files with the right names — but never actually wired anything up. The hand-written json-render shim and the unused NuGet packages are the most telling evidence.&lt;/p&gt;

&lt;p&gt;None of this is entirely surprising. These are integration challenges that even experienced engineers would need to research and iterate on — connecting unfamiliar frameworks across a full-stack boundary is genuinely hard. The deeper issue is that even with upfront planning enforced (preventing agents from "one-shotting" the app), intrinsic technical challenges in the implementation details cause coding agents to silently fall back to what they know.&lt;/p&gt;

&lt;p&gt;What these experiments suggest is that producing quality implementations with coding agents requires highly detailed, step-by-step plans — not just feature specs or architectural diagrams, but concrete wiring instructions that leave little room for substitution. Simply adding skills as supplementary context does not bridge the gap when the core integration patterns are unfamiliar to the model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;The experiments above point to a clear gap: the planning agent produces plans that are too high-level for the coding agent to follow faithfully when unfamiliar technologies are involved. The next iteration of the harness will focus on two changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Interactive upfront planning:&lt;/strong&gt; Rather than generating a plan in one shot and handing it off, the planning agent will produce a detailed, step-by-step implementation plan that can be reviewed and refined before any code is written. Each step should be concrete enough that the coding agent knows exactly which API to call, which package to import, and how to wire it — leaving no room for silent substitution.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Step-by-step execution with verification:&lt;/strong&gt; Instead of letting the coding agent execute the entire plan autonomously, the harness will execute one step at a time, verifying the output of each step (builds, tests, correct imports) before proceeding to the next. This catches drift early — if the agent installs a package but doesn't use it, or writes a shim instead of using the real library, the verification step surfaces the problem immediately rather than letting it compound.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This follows the approach outlined in the &lt;a href="https://github.com/anthropics/claude-quickstarts/tree/main/autonomous-coding" rel="noopener noreferrer"&gt;autonomous coding quickstart&lt;/a&gt;, adapted to the multi-agent harness architecture described in this project.**&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>automation</category>
      <category>claude</category>
    </item>
    <item>
      <title>Building End-to-End Local AI Agents with Microsoft Agent Framework and AG-UI</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Sun, 23 Nov 2025 06:06:37 +0000</pubDate>
      <link>https://dev.to/stormhub/building-end-to-end-local-ai-agents-with-microsoft-agent-framework-and-ag-ui-nbi</link>
      <guid>https://dev.to/stormhub/building-end-to-end-local-ai-agents-with-microsoft-agent-framework-and-ag-ui-nbi</guid>
      <description>&lt;p&gt;The &lt;a href="https://learn.microsoft.com/en-us/agent-framework/overview/agent-framework-overview" rel="noopener noreferrer"&gt;Microsoft Agent Framework&lt;/a&gt; significantly elevates AI agent orchestration. A standout feature is its implementation of the &lt;strong&gt;&lt;a href="https://docs.ag-ui.com/introduction" rel="noopener noreferrer"&gt;Agent–User Interaction (AG-UI) Protocol&lt;/a&gt;&lt;/strong&gt;, which standardizes how AI agents connect to user-facing applications.&lt;/p&gt;

&lt;p&gt;Below is a quick-start guide to connecting these components into a fully end-to-end solution using local &lt;strong&gt;Ollama&lt;/strong&gt; models.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Service Configuration
&lt;/h2&gt;

&lt;p&gt;First, configure the dependency injection container. The &lt;code&gt;ChatClientAgent&lt;/code&gt; is based on the &lt;code&gt;IChatClient&lt;/code&gt; abstraction from &lt;code&gt;Microsoft.Extensions.AI&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: We register the agent as a Keyed Service to allow for multiple distinct agents within the same host.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WebApplication&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// 1. Register the Ollama Client&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddTransient&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IChatClient&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;factory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IHttpClientFactory&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// Ensure you use a wrapper that handles standard formatting &lt;/span&gt;
    &lt;span class="c1"&gt;// (see Implementation Note below)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;OllamaApiClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"OllamaClient"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s"&gt;"phi4"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Register the AI Agent&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddKeyedTransient&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ChatClientAgent&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"local-ollama-agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;ChatClientAgentOptions&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Local Assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"An AI agent running on local Ollama."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;ChatOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;ChatOptions&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;Temperature&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IChatClient&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ILoggerFactory&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;());&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Expose the AG-UI Endpoint
&lt;/h2&gt;

&lt;p&gt;Once configured, map the agent instance directly to an HTTP route. This exposes the agent via the standard AG-UI protocol.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredKeyedService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ChatClientAgent&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="s"&gt;"local-ollama-agent"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Expose the agent on the root path&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapAGUI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3. Connect a Client
&lt;/h2&gt;

&lt;p&gt;To consume the agent programmatically, the framework provides the &lt;code&gt;AGUIChatClient&lt;/code&gt;. This allows .NET applications to communicate with your agent over HTTP seamlessly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AGUIChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"http://localhost:5000"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ILoggerFactory&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;());&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;clientAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"local-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"AG-UI Client Agent"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Frontend Integration:&lt;/strong&gt; The &lt;a href="https://docs.ag-ui.com/quickstart/applications" rel="noopener noreferrer"&gt;AG-UI Protocol&lt;/a&gt; also offers ready-made libraries for TypeScript and Python, allowing you to spin up frontend interfaces in minutes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Implementation Note: Protocol Compliance
&lt;/h2&gt;

&lt;p&gt;The AG-UI protocol mandates that all messages contain a &lt;code&gt;messageId&lt;/code&gt; property. Native Ollama responses do not currently provide this. To ensure compatibility, I created a &lt;a href="https://github.com/StormHub/stormhub/blob/main/resources/2025-11-23/ConsoleApp/WebApi/OllamaChatClient.cs" rel="noopener noreferrer"&gt;simple wrapper class&lt;/a&gt; to inject the required IDs into the Ollama response stream.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2025-11-23/ConsoleApp" rel="noopener noreferrer"&gt;Complete sample code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>microsoft</category>
      <category>ai</category>
      <category>dotnet</category>
    </item>
    <item>
      <title>Model context protocol server prompts with microsoft semantic kernel</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Wed, 23 Apr 2025 22:34:37 +0000</pubDate>
      <link>https://dev.to/stormhub/model-context-protocol-server-prompts-with-microsoft-semantic-kernel-p0j</link>
      <guid>https://dev.to/stormhub/model-context-protocol-server-prompts-with-microsoft-semantic-kernel-p0j</guid>
      <description>&lt;p&gt;This post focuses on implementing server prompts, a key feature of the &lt;a href="https://modelcontextprotocol.io/introduction" rel="noopener noreferrer"&gt;Model Context Protocol (MCP)&lt;/a&gt; designed for reusable template definitions. We will explore how to implement these server prompts using both the &lt;a href="https://github.com/modelcontextprotocol/csharp-sdk" rel="noopener noreferrer"&gt;MCP C# SDK&lt;/a&gt; and &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; for enhanced templating capabilities. Further details on MCP server prompts can be found in the &lt;a href="https://modelcontextprotocol.io/docs/concepts/prompts" rel="noopener noreferrer"&gt;MCP documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP Server Prompts via MCP C# SDK Attributes
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/modelcontextprotocol/csharp-sdk" rel="noopener noreferrer"&gt;MCP C# SDK&lt;/a&gt; allows for defining prompts through attributes. This method offers a direct implementation without requiring Semantic Kernel for basic string manipulation as the following example shows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;McpServerPromptType&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;internal&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StringFormatPrompt&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;_format&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;ILogger&lt;/span&gt; &lt;span class="n"&gt;_logger&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;StringFormatPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ILogger&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;StringFormatPrompt&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_logger&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;_format&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Tell a joke about {0}."&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;McpServerPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Joke"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Tell a joke about a topic."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;IReadOnlyCollection&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;Format&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"The topic of the joke."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogInformation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Generating prompt with topic: {Topic}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CultureInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvariantCulture&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ChatRole&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;    

 &lt;span class="c1"&gt;// Register for the prompt&lt;/span&gt;
 &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;serverBuilder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddMcpServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithHttpTransport&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithPrompts&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;StringFormatPrompt&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Semantic Kernel Templates as MCP Server Prompts
&lt;/h2&gt;

&lt;p&gt;Semantic Kernel provides templating capabilities through JSON/YAML, Handlebars, and Liquid formats, along with plugin support. These templates can be exposed as MCP prompts using the MCP C# SDK.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Prompt Templates in Semantic Kernel&lt;/strong&gt;&lt;br&gt;
Semantic Kernel templates are configured with PromptTemplateConfig, created by IPromptTemplateFactory implementations, and can be easily rendered with input variables for dynamic prompt generation.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;templateConfig&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;PromptTemplateConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Tell a joke about {{$topic}}."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;IPromptTemplateFactory&lt;/span&gt; &lt;span class="n"&gt;templateFactory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;KernelPromptTemplateFactory&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;template&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;templateFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;templateConfig&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RenderAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;KernelArguments&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"topic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"cats"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Expose prompts as McpServerPrompt&lt;/strong&gt;&lt;br&gt;
McpServerPrompt is the abstract base class that represents an MCP prompt we can implement.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;
&lt;span class="k"&gt;internal&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TemplateServerPrompt&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;McpServerPrompt&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;TemplateServerPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PromptTemplateConfig&lt;/span&gt; &lt;span class="n"&gt;promptTemplateConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IPromptTemplateFactory&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;promptTemplateFactory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ILoggerFactory&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;loggerFactory&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;promptTemplateFactory&lt;/span&gt; &lt;span class="p"&gt;??=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;KernelPromptTemplateFactory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loggerFactory&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="n"&gt;NullLoggerFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Instance&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;_template&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;promptTemplateFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;promptTemplateConfig&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// MCP prompt&lt;/span&gt;
        &lt;span class="n"&gt;ProtocolPrompt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;promptTemplateConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="n"&gt;_template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetType&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;promptTemplateConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Arguments&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;promptTemplateConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputVariables&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputVariable&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;PromptArgument&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputVariable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;Description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputVariable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;Required&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputVariable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsRequired&lt;/span&gt;
                    &lt;span class="p"&gt;})&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToList&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;GetPromptResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GetAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RequestContext&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;GetPromptRequestParams&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;KernelArguments&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;arguments&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;dictionary&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Arguments&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dictionary&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;arguments&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;dictionary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;GetService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RenderAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; 
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;GetPromptResult&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;Messages&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;PromptMessage&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;Content&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;Content&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt; 
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Register for the prompt with DI and MCP server&lt;/span&gt;
&lt;span class="c1"&gt;// builder.Services.AddSingleton&amp;lt;TemplateAIFunction&amp;gt;(...)&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;serverBuilder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddMcpServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithHttpTransport&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;serverBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddSingleton&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;McpServerPrompt&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; 
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;TemplateServerPrompt&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;());&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Exposing AIFunction as McpServerPrompt&lt;/strong&gt;&lt;br&gt;
The McpServerPrompt class provides a Create method to expose a Microsoft.Extensions.AI.AIFunction as an MCP server prompt.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;
&lt;span class="k"&gt;internal&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TemplateAIFunction&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AIFunction&lt;/span&gt; 
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;//...&lt;/span&gt;

    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;object&lt;/span&gt;&lt;span class="p"&gt;?&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;InvokeCoreAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AIFunctionArguments&lt;/span&gt; &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;KernelArguments&lt;/span&gt; &lt;span class="n"&gt;kernelArguments&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

        &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;argument&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;kernelArguments&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;argument&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argument&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;GetService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RenderAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernelArguments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Register for the prompt with DI and MCP server&lt;/span&gt;
&lt;span class="c1"&gt;// builder.Services.AddSingleton&amp;lt;TemplateAIFunction&amp;gt;(...)&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;serverBuilder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddMcpServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithHttpTransport&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;serverBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddSingleton&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;McpServerPrompt&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; 
    &lt;span class="n"&gt;McpServerPrompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;TemplateServerPrompt&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()));&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2025-04-16/ConsoleApp" rel="noopener noreferrer"&gt;Complete sample code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>dotnet</category>
      <category>semantickernel</category>
      <category>mcp</category>
    </item>
    <item>
      <title>AWS Bedrock anthropic claude tool call integration with microsoft semantic kernel</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Mon, 14 Apr 2025 23:34:57 +0000</pubDate>
      <link>https://dev.to/stormhub/aws-bedrock-anthropic-claude-tool-call-integration-with-microsoft-semantic-kernel-29g3</link>
      <guid>https://dev.to/stormhub/aws-bedrock-anthropic-claude-tool-call-integration-with-microsoft-semantic-kernel-29g3</guid>
      <description>&lt;p&gt;As of April 2025, the official Microsoft &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; connector for Amazon &lt;a href="https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.Amazon/1.36.1-alpha" rel="noopener noreferrer"&gt;Microsoft.SemanticKernel.Connectors.Amazon&lt;/a&gt; does not natively support tool/function calls. Apparently, Semantic Kernel is shifting its approach towards an LLM abstraction layer based on &lt;a href="https://github.com/dotnet/extensions/tree/main/src/Libraries/Microsoft.Extensions.AI" rel="noopener noreferrer"&gt;Microsoft.Extensions.AI&lt;/a&gt;, aiming for a more unified and extensible architecture. Currently, only OpenAI and Ollama implementations are available within this new abstraction. It is anticipated that an implementation for AWS Bedrock Anthropic Claude based on Microsoft.Extensions.AI will become available in the future. Therefore, in the interim, I implemented a custom solution. The approach leverages the existing &lt;code&gt;IChatClient&lt;/code&gt; &lt;a href="https://github.com/dotnet/extensions/blob/68b25aeb2d752273e1d5621b38a7869ce63970c3/src/Libraries/Microsoft.Extensions.AI.Abstractions/ChatCompletion/IChatClient.cs" rel="noopener noreferrer"&gt;interface&lt;/a&gt;, making the implementation relatively straightforward. Since function calls are supported by this interface, the solution involves implementing it on top of the &lt;a href="https://www.nuget.org/packages/AWSSDK.BedrockRuntime/4.0.0-preview.13" rel="noopener noreferrer"&gt;AWS Bedrock Runtime SDK&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implement IChatClient with AWS Bedrock Runtime
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;IChatClient&lt;/code&gt; interface essentially contains two methods: one for standard chat responses and another for streamed responses. The implementation involves mapping these two methods to the &lt;code&gt;IAmazonBedrockRuntime.ConverseAsync&lt;/code&gt; and &lt;code&gt;ConverseStreamAsync&lt;/code&gt; methods, as demonstrated in the full implementation of the &lt;code&gt;AnthropicChatClient&lt;/code&gt; &lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2025-04-02/ConsoleApp/AnthropicChatClient.cs" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up Function Calls with Semantic Kernel
&lt;/h2&gt;

&lt;p&gt;Here's how to set up function calls with Semantic Kernel using our custom &lt;code&gt;AnthropicChatClient&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Set up kernel and functions&lt;/strong&gt;&lt;br&gt;
This step configures the chat completion service with function invocation capabilities and registers it with the Semantic Kernel.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Set up chat completion service&lt;/span&gt;
&lt;span class="n"&gt;IChatClient&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;...;&lt;/span&gt;
&lt;span class="n"&gt;IChatCompletionService&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="n"&gt;chatClient&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsBuilder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseFunctionInvocation&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;// Enables function call functionality&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsChatCompletionService&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Register the Bedrock chat completion service&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddKeyedSingleton&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bedrock"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Add plugins/functions&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Plugins&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddFromType&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;MenuPlugin&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use automatically tool calls&lt;/strong&gt;&lt;br&gt;
This code demonstrates how to use the configured chat completion service to automatically invoke functions based on the user's input.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Set up bedrock&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;runtimeClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AmazonBedrockRuntimeClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RegionEndpoint&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APSoutheast2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;IChatClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AnthropicChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;runtimeClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"anthropic.claude-3-5-sonnet-20241022-v2:0"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Configure the chat client as shown in step 1.&lt;/span&gt;
&lt;span class="n"&gt;IChatCompletionService&lt;/span&gt; &lt;span class="n"&gt;chatCompletionService&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsBuilder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseFunctionInvocation&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsChatCompletionService&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;chatHistory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ChatHistory&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;chatHistory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddUserMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"What is the special soup and its price?"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;promptExecutionSettings&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;PromptExecutionSettings&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;FunctionChoiceBehavior&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FunctionChoiceBehavior&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Auto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;RetainArgumentTypes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="n"&gt;ExtensionData&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;Dictionary&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;object&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"temperature"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; 
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"max_tokens_to_sample"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;// Required parameter for Anthropic models&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;messageContent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatCompletionService&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetChatMessageContentAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatHistory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;promptExecutionSettings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messageContent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Expected output : Today's special soup is Clam Chowder and it costs $9.99.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2025-04-02/ConsoleApp" rel="noopener noreferrer"&gt;Complete sample code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>dotnet</category>
      <category>semantickernel</category>
      <category>anthropic</category>
    </item>
    <item>
      <title>Model context protocol integration with microsoft semantic kernel</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Sat, 05 Apr 2025 05:00:09 +0000</pubDate>
      <link>https://dev.to/stormhub/model-context-protocol-integration-with-microsoft-semantic-kernel-1cgb</link>
      <guid>https://dev.to/stormhub/model-context-protocol-integration-with-microsoft-semantic-kernel-1cgb</guid>
      <description>&lt;p&gt;The &lt;a href="https://modelcontextprotocol.io/introduction" rel="noopener noreferrer"&gt;Model Context Protocol (MCP)&lt;/a&gt; aims to standardize connections between AI systems and data sources. This post demonstrates integrating &lt;a href="https://github.com/executeautomation/mcp-playwright" rel="noopener noreferrer"&gt;mcp-playwright&lt;/a&gt; with &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; and &lt;a href="https://ollama.com/library/phi4-mini" rel="noopener noreferrer"&gt;phi4-mini&lt;/a&gt; (via Ollama) for browser automation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up the Playwright MCP Server
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install the MCP Playwright package:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @playwright/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Add a script to &lt;code&gt;package.json&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx @playwright/mcp --port 8931"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Start the server:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run server
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;This will launch the Playwright MCP server, displaying the port and endpoints in the console.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Running phi4-mini with Ollama for Function Calling
&lt;/h2&gt;

&lt;p&gt;For reliable function calling, &lt;a href="https://ollama.com/library/phi4-mini" rel="noopener noreferrer"&gt;phi4-mini:latest&lt;/a&gt; (as of March 27, 2025) requires a custom Modelfile.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Create a custom Modelfile:&lt;/strong&gt; (See &lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2025-03-27/Modelfile" rel="noopener noreferrer"&gt;example&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create the model in Ollama:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama create phi4-mini:latest &lt;span class="nt"&gt;-f&lt;/span&gt; &amp;lt;path/to/Modelfile&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Implementing the MCP Client in Semantic Kernel
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install the MCP client NuGet package:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet add package ModelContextProtocol &lt;span class="nt"&gt;--prerelease&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Connect to the Playwright MCP server and retrieve tools:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;mcpClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;McpClientFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;McpServerConfig&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"playwright"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Playwright"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;TransportType&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TransportTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"http://localhost:8931"&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;mcpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ListToolsAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Configure Semantic Kernel with the MCP tools:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;kernelBuilder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;kernelBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddOllamaChatCompletion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"phi4-mini"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;kernelBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Plugins&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddFromFunctions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;pluginName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"playwright"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;functions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsKernelFunction&lt;/span&gt;&lt;span class="p"&gt;()));&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernelBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;executionSettings&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;PromptExecutionSettings&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;FunctionChoiceBehavior&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FunctionChoiceBehavior&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Auto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;RetainArgumentTypes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;
        &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="n"&gt;ExtensionData&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;Dictionary&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;object&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"temperature"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;InvokePromptAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"open browser and navigate to https://www.google.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;KernelArguments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;executionSettings&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This code snippet connects to the MCP server, retrieves available tools, and integrates them into Semantic Kernel as functions. The prompt instructs the model to open a browser and navigate to Google, demonstrating the integration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2025-03-27/ConsoleApp" rel="noopener noreferrer"&gt;Complete sample code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>dotnet</category>
      <category>semantickernel</category>
    </item>
    <item>
      <title>Azure OpenAI Error Handling in Semantic Kernel</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Wed, 08 Jan 2025 06:27:40 +0000</pubDate>
      <link>https://dev.to/stormhub/azure-openai-error-handling-in-semantic-kernel-21j5</link>
      <guid>https://dev.to/stormhub/azure-openai-error-handling-in-semantic-kernel-21j5</guid>
      <description>&lt;p&gt;In real-world systems, it's crucial to handle HTTP errors effectively, especially when interacting with Large Language Models (LLMs) like Azure OpenAI. Rate limit exceeded errors (tokens per minute or requests per minute) always happen at some point, resulting in 429 errors. This blog post explores different approaches to HTTP error handling with semantic kernel and Azure OpenAI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Default
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddAzureOpenAIChatCompletion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;deploymentName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gpt-4o-2024-08-06"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"https://resource-name.openai.azure.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"api-key"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Or DefaultAzureCredential&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default setup for &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; with Azure OpenAI by AddAzureOpenAIChatCompletion. This approach offers a built-in retry policy that automatically retries requests up to three times with exponential backoff. Additionally, it can detect specific HTTP headers like 'retry-after' to implement more tailored retries.&lt;/p&gt;

&lt;h2&gt;
  
  
  HttpClient
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;factory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IHttpClientFactory&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"auzre:gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddAzureOpenAIChatCompletion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;deploymentName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gpt-4o-2024-08-06"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"https://resource-name.openai.azure.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"api-key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Or DefaultAzureCredential&lt;/span&gt;
    &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By configuring an HttpClient instance, you can gain more control over HTTP error handling. &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; disables the default retry policy when HttpClient is provided. This allows you to implement custom retry logic using the Microsoft.Extensions.Http.Resilience library. With this approach, you can define the number of retry attempts, timeouts, and how to handle specific error codes like 429 (rate limit exceeded). &lt;strong&gt;It is strongly recommended to add retry policies to handle transient errors with HttpClient&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddHttpClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"auzre:gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;// 'standard' automatically handle transient errors inlcuding '429'&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddStandardResilienceHandler&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Configure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Options for attempts and time out etc&lt;/span&gt;
            &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Retry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MaxRetryAttempts&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An important benefit of using HttpClient is that it's not limited to Azure OpenAI. This approach works with other AI connectors like OpenAI as well.&lt;/p&gt;

&lt;h2&gt;
  
  
  AzureOpenAIClient
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;azureOpenAIClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AzureOpenAIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"https://resource-name.openai.azure.com"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ApiKeyCredential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"api-key"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;// Or DefaultAzureCredential&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AzureOpenAIClientOptions&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddAzureOpenAIChatCompletion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;deploymentName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gpt-4o-2024-08-06"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;azureOpenAIClient&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach offers similar functionality to the default setup with the built-in retry policy. In addition, AzureOpenAIClient provides more flexibility from AzureOpenAIClientOptions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;clientOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;AzureOpenAIClientOptions&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Transport&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;HttpClientPipelineTransport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;RetryPolicy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ClientRetryPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxRetries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This configuration enables you to combine HTTP retry policies from HttpClient with custom pipeline policy-based retries from the Azure OpenAI SDK.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommendations
&lt;/h2&gt;

&lt;p&gt;The default setup might not be suitable for scenarios where you frequently encounter token limit issues.&lt;br&gt;
If you already have AzureOpenAIClient registered and require maximum control, this approach allows you to leverage both HTTP client policies and Azure OpenAI pipeline policy-based retries.&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>dotnet</category>
      <category>semantickernel</category>
      <category>openai</category>
    </item>
    <item>
      <title>Working with multiple language models in Semantic Kernel</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Sat, 28 Dec 2024 07:33:07 +0000</pubDate>
      <link>https://dev.to/stormhub/working-with-multiple-language-models-in-semantic-kernel-31gk</link>
      <guid>https://dev.to/stormhub/working-with-multiple-language-models-in-semantic-kernel-31gk</guid>
      <description>&lt;p&gt;It is common to work with multiple large language models (LLMs) simultaneously, especially when running evaluations or tests. &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; supports registering multiple text generation and embedding services using serviceId and modelId.&lt;/p&gt;

&lt;h2&gt;
  
  
  Register 'serviceId' and 'modelId'
&lt;/h2&gt;

&lt;p&gt;Suppose we have the following setup&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddAzureOpenAIChatCompletion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;deploymentName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gpt-4-1106-Preview"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"https://resource-name.openai.azure.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"api-key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gpt-4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;serviceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"azure:gpt-4"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddAzureOpenAIChatCompletion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;deploymentName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gpt-4o-2024-08-06"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"https://resource-name.openai.azure.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"api-key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;serviceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"azure:gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

 &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddOllamaChatCompletion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"phi3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"http://localhost:11434"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;serviceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"local:phi3"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When execute kernel functions or prompts, 'serviceId' and 'modelId' can be passed into 'PromptExecutionSettings' like the following shows&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;promptExecutionSettings&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;PromptExecutionSettings&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ServiceId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"local:phi3"&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="c1"&gt;// &lt;/span&gt;
&lt;span class="c1"&gt;// or just modelId &lt;/span&gt;
&lt;span class="c1"&gt;//    new PromptExecutionSettings&lt;/span&gt;
&lt;span class="c1"&gt;//     {&lt;/span&gt;
&lt;span class="c1"&gt;//         ModelId = "gpt-4o"&lt;/span&gt;
&lt;span class="c1"&gt;//     }&lt;/span&gt;
&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;InvokePromptAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;Answer&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;given&lt;/span&gt; &lt;span class="n"&gt;fact&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;Sky&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="n"&gt;blue&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;violets&lt;/span&gt; &lt;span class="n"&gt;are&lt;/span&gt; &lt;span class="n"&gt;purple&lt;/span&gt;

    &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;What&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="n"&gt;sky&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;
    &lt;span class="s"&gt;""", 
&lt;/span&gt;    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;KernelArguments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;promptExecutionSettings&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When registering chat completion services, if serviceId is provided, &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; also registers chat completion services as keyed. With the above registration, the following would work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;chatCompletionService&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredKeyedService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IChatCompletionService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="s"&gt;"azure:gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  IAIService and IAIServiceSelector
&lt;/h2&gt;

&lt;p&gt;All AI-related services, including chat completion and text embedding, implement the IAIService interface, which defines a metadata property. This metadata contains attributes specific to the service implementation. For instance, the AzureOpenAIChatCompletionService includes the deployment name and model name. The default IAIServiceSelector resolves services by serviceId first, and then by modelId to match the IAIService metadata. To gain full control over AI service selection, you can implement a custom IAIServiceSelector and register it as a service with &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2024-12-27/ConsoleApp" rel="noopener noreferrer"&gt;Sample code here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>dotnet</category>
      <category>semantickernel</category>
      <category>openai</category>
    </item>
    <item>
      <title>OpenAI chat completion with Json output format</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Fri, 20 Dec 2024 01:31:15 +0000</pubDate>
      <link>https://dev.to/stormhub/openai-chat-completion-with-json-output-format-l5i</link>
      <guid>https://dev.to/stormhub/openai-chat-completion-with-json-output-format-l5i</guid>
      <description>&lt;p&gt;I can't recall how many times I've tried to convince an LLM to return JSON so that I could perform API calls based on natural language inputs from users. Recently, I discovered that this functionality is natively supported by the  &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; and Microsoft AI Extension Library. It is officially documented by the OpenAI API &lt;a href="https://platform.openai.com/docs/guides/structured-outputs" rel="noopener noreferrer"&gt;here&lt;/a&gt;. Note that this feature is only available in the latest large language models from GPT-4o/o1 and later. If you are using Azure OpenAI, ensure you have the supported versions when deploying models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chat completion
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; supports JSON output formatting in the ResponseFormat property from PromptExecutionSettings, as shown in the code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Configure Azure/OpenAI and semantic kernel first.&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;chatCompletionService&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IChatCompletionService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ChatHistory&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddSystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Extract the event information."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddUserMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Alice and Bob are going to a science fair on Friday."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;jsonSerializerOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;JsonSerializerOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;JsonSerializerOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;PropertyNamingPolicy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;JsonNamingPolicy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CamelCase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;UnmappedMemberHandling&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;JsonUnmappedMemberHandling&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Disallow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;responseFormat&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CalendarEvent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;JsonResponseSchema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jsonSerializerOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatCompletionService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetChatMessageContentAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;AzureOpenAIPromptExecutionSettings&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;ResponseFormat&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;responseFormat&lt;/span&gt; &lt;span class="c1"&gt;// Json schema&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// Json result    &lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;JsonSerializer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deserialize&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;CalendarEvent&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;jsonSerializerOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Generate Json schema from types
&lt;/h2&gt;

&lt;p&gt;JSON schema can be automatically generated using Microsoft.Extensions.AI.AIJsonUtilities, which is referenced from &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CalendarEvent&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Name of the event"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;init&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Day of the event"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Day&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;init&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"List of participants of the event"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;Participants&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;init&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;ChatResponseFormat&lt;/span&gt; &lt;span class="nf"&gt;JsonResponseSchema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;JsonSerializerOptions&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;jsonSerializerOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;inferenceOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;AIJsonSchemaCreateOptions&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;IncludeSchemaKeyword&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;DisallowAdditionalProperties&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;

        &lt;span class="c1"&gt;// Json schema from types with descriptions on properties&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;jsonElement&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AIJsonUtilities&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateJsonSchema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;typeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CalendarEvent&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Calendar event result"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;serializerOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;jsonSerializerOptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;inferenceOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;inferenceOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;kernelJsonSchema&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KernelJsonSchema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jsonElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetRawText&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;jsonSchemaData&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BinaryData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromObjectAsJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kernelJsonSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;jsonSerializerOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ChatResponseFormat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateJsonSchemaFormat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;nameof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CalendarEvent&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;ToLowerInvariant&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;jsonSchemaData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;jsonSchemaIsStrict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://github.com/StormHub/stormhub/tree/main/resources/2024-12-19/ConsoleApp" rel="noopener noreferrer"&gt;Sample code here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>semantickernel</category>
      <category>dotnet</category>
      <category>openai</category>
    </item>
    <item>
      <title>Lightweight AI Evaluation with SemanticKernel</title>
      <dc:creator>Johnny Z</dc:creator>
      <pubDate>Tue, 17 Dec 2024 23:28:50 +0000</pubDate>
      <link>https://dev.to/stormhub/lightweight-ai-evaluation-with-semantickernel-2bgj</link>
      <guid>https://dev.to/stormhub/lightweight-ai-evaluation-with-semantickernel-2bgj</guid>
      <description>&lt;p&gt;For quick and easy evaluation or comparison of AI responses in .NET applications, particularly tests. We can leverage &lt;a href="https://github.com/braintrustdata/autoevals" rel="noopener noreferrer"&gt;autoevals&lt;/a&gt; excellent 'LLM-as-a-Judge' prompts with the help of &lt;a href="https://github.com/microsoft/semantic-kernel" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sample code
&lt;/h2&gt;

&lt;p&gt;Note that you need to setup semantic kernel with chat completion first. It is also recommended to set 'Temperature' to 0.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; 
    &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;"humor"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;"output"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"this maybe funny"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="s"&gt;""";
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; 
        &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;executionSettings&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;executionSettings&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;$"[&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]: result: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Item1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, score: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Item2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://github.com/StormHub/TinyToolBox.AI" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While &lt;a href="https://devblogs.microsoft.com/dotnet/evaluate-the-quality-of-your-ai-applications-with-ease/" rel="noopener noreferrer"&gt;Microsoft.Extensions.AI.Evaluation&lt;/a&gt; is in the making, it currently involves  a little too much 'ceremonies' for simple use cases.&lt;/p&gt;

&lt;p&gt;Please feel free to reach out on twitter &lt;a href="https://twitter.com/roamingcode" rel="noopener noreferrer"&gt;@roamingcode&lt;/a&gt;&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>ai</category>
      <category>semantickernel</category>
      <category>openai</category>
    </item>
  </channel>
</rss>
