<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lukas Walter </title>
    <description>The latest articles on DEV Community by Lukas Walter  (@lukaswalter).</description>
    <link>https://dev.to/lukaswalter</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3783973%2F8171c4c5-d69c-4059-b5d9-7b7af32a8962.png</url>
      <title>DEV Community: Lukas Walter </title>
      <link>https://dev.to/lukaswalter</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lukaswalter"/>
    <language>en</language>
    <item>
      <title>Build and deploy an MCP server with .NET and Azure Container Apps</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Mon, 15 Jun 2026 15:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/build-and-deploy-an-mcp-server-with-net-and-azure-container-apps-183m</link>
      <guid>https://dev.to/lukaswalter/build-and-deploy-an-mcp-server-with-net-and-azure-container-apps-183m</guid>
      <description>&lt;p&gt;MCP servers do not have to be local stdio processes. If you want a tool surface that can run as a normal HTTP service, scale like an app, and be reachable from a separate client, Streamable HTTP is the practical transport to look at.&lt;/p&gt;

&lt;p&gt;Terminology note: current MCP specs call this transport Streamable HTTP. It replaced the older HTTP+SSE transport, but it can still use Server-Sent Events (SSE) when the server streams messages back to the client.&lt;/p&gt;

&lt;p&gt;We will build a small MCP server in C# with the official MCP .NET SDK, run it locally, containerize it, deploy it to Azure Container Apps, and call it from a C# client.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Repository: &lt;a href="https://github.com/ovnecron/mcp-azure-container-apps-demo" rel="noopener noreferrer"&gt;MCP Server on Azure Container Apps with C# and .NET&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The sample exposes three tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;echo&lt;/code&gt;: returns a message&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;add&lt;/code&gt;: adds two integers&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;server_time&lt;/code&gt;: returns the server time for a supplied time zone&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The sample is intentionally small. It does not try to become a production agent platform. It gets you from zero to a deployed MCP endpoint, with enough structure that you can replace the demo tools with your own application logic.&lt;/p&gt;

&lt;h2&gt;Prerequisites&lt;/h2&gt;

&lt;p&gt;You need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;.NET 10 SDK&lt;/li&gt;
&lt;li&gt;Docker&lt;/li&gt;
&lt;li&gt;Azure CLI&lt;/li&gt;
&lt;li&gt;An Azure subscription&lt;/li&gt;
&lt;li&gt;Optional: Node.js if you want to test with the MCP Inspector&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The project uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ModelContextProtocol.AspNetCore&lt;/code&gt; for the server&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ModelContextProtocol&lt;/code&gt; for the client&lt;/li&gt;
&lt;li&gt;Streamable HTTP transport&lt;/li&gt;
&lt;li&gt;Azure Container Apps for hosting&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Project structure&lt;/h2&gt;

&lt;p&gt;The repository is split into a server, a client, and a small test project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;McpAzureContainerAppsDemo/
  McpAzureContainerAppsDemo.slnx
  Directory.Build.props
  Directory.Packages.props
  src/
    McpAzureContainerAppsDemo.Server/
      Program.cs
      Dockerfile
      Configuration/
      Security/
      Services/
      Tools/
    McpAzureContainerAppsDemo.Client/
      Program.cs
      Configuration/
      Rendering/
  tests/
    McpAzureContainerAppsDemo.Server.Tests/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The MCP-specific code stays small. Most of the actual behavior lives in normal C# services.&lt;/p&gt;

&lt;p&gt;That is the pattern I prefer for demos like this: keep the protocol wiring thin and put the business logic somewhere testable.&lt;/p&gt;

&lt;h2&gt;Create the projects&lt;/h2&gt;

&lt;p&gt;Start with a solution, a web project for the server, a console project for the client, and an xUnit test project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;McpAzureContainerAppsDemo
&lt;span class="nb"&gt;cd &lt;/span&gt;McpAzureContainerAppsDemo

dotnet new sln &lt;span class="nt"&gt;-n&lt;/span&gt; McpAzureContainerAppsDemo

&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; src tests
dotnet new web &lt;span class="nt"&gt;-n&lt;/span&gt; McpAzureContainerAppsDemo.Server &lt;span class="nt"&gt;-o&lt;/span&gt; src/McpAzureContainerAppsDemo.Server &lt;span class="nt"&gt;--framework&lt;/span&gt; net10.0
dotnet new console &lt;span class="nt"&gt;-n&lt;/span&gt; McpAzureContainerAppsDemo.Client &lt;span class="nt"&gt;-o&lt;/span&gt; src/McpAzureContainerAppsDemo.Client &lt;span class="nt"&gt;--framework&lt;/span&gt; net10.0
dotnet new xunit &lt;span class="nt"&gt;-n&lt;/span&gt; McpAzureContainerAppsDemo.Server.Tests &lt;span class="nt"&gt;-o&lt;/span&gt; tests/McpAzureContainerAppsDemo.Server.Tests &lt;span class="nt"&gt;--framework&lt;/span&gt; net10.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the projects to the solution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet sln add &lt;span class="se"&gt;\&lt;/span&gt;
  src/McpAzureContainerAppsDemo.Server/McpAzureContainerAppsDemo.Server.csproj &lt;span class="se"&gt;\&lt;/span&gt;
  src/McpAzureContainerAppsDemo.Client/McpAzureContainerAppsDemo.Client.csproj &lt;span class="se"&gt;\&lt;/span&gt;
  tests/McpAzureContainerAppsDemo.Server.Tests/McpAzureContainerAppsDemo.Server.Tests.csproj
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then add the MCP packages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet add src/McpAzureContainerAppsDemo.Server/McpAzureContainerAppsDemo.Server.csproj package ModelContextProtocol.AspNetCore
dotnet add src/McpAzureContainerAppsDemo.Client/McpAzureContainerAppsDemo.Client.csproj package ModelContextProtocol
dotnet add tests/McpAzureContainerAppsDemo.Server.Tests/McpAzureContainerAppsDemo.Server.Tests.csproj reference src/McpAzureContainerAppsDemo.Server/McpAzureContainerAppsDemo.Server.csproj
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you cloned the repository, you can skip this setup. The packages are already referenced centrally in &lt;code&gt;Directory.Packages.props&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;Register the MCP server&lt;/h2&gt;

&lt;p&gt;The server is a regular ASP.NET Core app.&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;Program.cs&lt;/code&gt;, the MCP server is added to the service collection, configured with Streamable HTTP, and mapped to &lt;code&gt;/mcp&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddMcpServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServerInfo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"mcp-azure-container-apps-demo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.0.0"&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithHttpTransport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stateless&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithToolsFromAssembly&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapMcp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/mcp"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



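&lt;p&gt;Setting &lt;code&gt;Stateless = true&lt;/code&gt; tells the transport not to keep per-session state in server memory, so any replica can handle any request. That suits a host that scales out across containers.&lt;/p&gt;

&lt;p&gt;The other useful detail is &lt;code&gt;WithToolsFromAssembly()&lt;/code&gt;.&lt;/p&gt;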

&lt;p&gt;Instead of manually registering every tool, the SDK discovers tool classes and methods through attributes. That keeps the server startup small and moves tool definitions into their own files.&lt;/p&gt;

&lt;p&gt;The sample also adds two regular HTTP endpoints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IOptions&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;McpServerSettings&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;service&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mcpEndpoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/mcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;health&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/health"&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/health"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;utc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetUtcNow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;/health&lt;/code&gt; is useful for Container Apps health checks and basic smoke testing.&lt;/p&gt;

&lt;h2&gt;Add tools with attributes&lt;/h2&gt;

&lt;p&gt;The tools live in &lt;code&gt;Tools/DemoTools.cs&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The class is marked with &lt;code&gt;[McpServerToolType]&lt;/code&gt;, and each exposed method is marked with &lt;code&gt;[McpServerTool]&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;McpServerToolType&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DemoTools&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;CalculatorService&lt;/span&gt; &lt;span class="n"&gt;calculator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ServerTimeService&lt;/span&gt; &lt;span class="n"&gt;serverTime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ILogger&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;DemoTools&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;McpServerTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"echo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ReadOnly&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Idempotent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Returns the message supplied by the caller."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;Echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogInformation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Echo tool invoked."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;McpServerTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"add"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ReadOnly&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Idempotent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Adds two 32-bit integers and returns the sum."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="n"&gt;calculator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tool methods can still call normal injected services.&lt;/p&gt;

&lt;p&gt;For example, the &lt;code&gt;add&lt;/code&gt; tool does not do arithmetic directly in the tool method. It calls a &lt;code&gt;CalculatorService&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CalculatorService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;checked&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That makes the behavior easy to test without going through MCP at all.&lt;/p&gt;
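
&lt;p&gt;As a rough sketch of what such a test could look like, the xUnit class below exercises &lt;code&gt;CalculatorService&lt;/code&gt; directly. The service is repeated so the snippet stands alone; the test class and method names are made up for illustration and are not taken from the repository.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using Xunit;

// Copied from the article so the snippet compiles on its own.
public sealed class CalculatorService
{
    public int Add(int left, int right) =&gt; checked(left + right);
}

// Hypothetical test class; the names are illustrative only.
public sealed class CalculatorServiceTests
{
    [Fact]
    public void Add_ReturnsSum()
    {
        var calculator = new CalculatorService();

        Assert.Equal(42, calculator.Add(40, 2));
    }

    [Fact]
    public void Add_ThrowsOnOverflow()
    {
        var calculator = new CalculatorService();

        // checked(...) turns silent wrap-around into an OverflowException.
        Assert.Throws&lt;OverflowException&gt;(() =&gt; { calculator.Add(int.MaxValue, 1); });
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;No MCP host, transport, or client is involved, so tests like these stay fast and only fail when the arithmetic itself changes.&lt;/p&gt;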

&lt;p&gt;For the &lt;code&gt;server_time&lt;/code&gt; tool, the service uses &lt;code&gt;TimeProvider&lt;/code&gt; so tests can supply a fixed clock:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ServerTimeService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TimeProvider&lt;/span&gt; &lt;span class="n"&gt;timeProvider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ILogger&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ServerTimeService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;ServerTimeResult&lt;/span&gt; &lt;span class="nf"&gt;GetTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;timeZoneId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;resolvedTimeZoneId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrWhiteSpace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeZoneId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s"&gt;"UTC"&lt;/span&gt;
            &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;timeZoneId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="n"&gt;TimeZoneInfo&lt;/span&gt; &lt;span class="n"&gt;timeZone&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeZoneInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FindSystemTimeZoneById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resolvedTimeZoneId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;DateTimeOffset&lt;/span&gt; &lt;span class="n"&gt;utcNow&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;timeProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetUtcNow&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;DateTimeOffset&lt;/span&gt; &lt;span class="n"&gt;localTime&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeZoneInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ConvertTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;utcNow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeZone&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogDebug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Resolved server time for timezone {TimeZoneId}."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeZone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeZone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;utcNow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;localTime&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a useful split: tools are the integration layer, services are the behavior.&lt;/p&gt;
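
&lt;p&gt;A fixed clock can be as small as a subclass that overrides &lt;code&gt;GetUtcNow()&lt;/code&gt;. The sketch below is one way a test might do that, not the repository's actual test code: &lt;code&gt;FixedTimeProvider&lt;/code&gt; and the test names are hypothetical, and the test repeats the two conversion calls from the service instead of constructing the service itself. It also assumes the IANA ID &lt;code&gt;Europe/Berlin&lt;/code&gt; resolves on the machine running the test.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using Xunit;

// Hypothetical helper: a TimeProvider that always reports the same instant.
public sealed class FixedTimeProvider(DateTimeOffset utcNow) : TimeProvider
{
    public override DateTimeOffset GetUtcNow() =&gt; utcNow;
}

public sealed class FixedClockTests
{
    [Fact]
    public void ConvertsFixedUtcInstantToBerlinWinterTime()
    {
        // 15 January is outside daylight saving time, so Berlin is UTC+1.
        var clock = new FixedTimeProvider(
            new DateTimeOffset(2026, 1, 15, 12, 0, 0, TimeSpan.Zero));

        // The same two calls the service makes, but against the fixed clock.
        TimeZoneInfo timeZone = TimeZoneInfo.FindSystemTimeZoneById("Europe/Berlin");
        DateTimeOffset localTime = TimeZoneInfo.ConvertTime(clock.GetUtcNow(), timeZone);

        Assert.Equal(TimeSpan.FromHours(1), localTime.Offset);
        Assert.Equal(13, localTime.Hour);

        // The instant is unchanged; only the offset differs.
        Assert.Equal(clock.GetUtcNow(), localTime);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you would rather not write the helper yourself, the &lt;code&gt;Microsoft.Extensions.TimeProvider.Testing&lt;/code&gt; package ships a &lt;code&gt;FakeTimeProvider&lt;/code&gt; that serves the same purpose.&lt;/p&gt;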

&lt;h2&gt;Run the server locally&lt;/h2&gt;

&lt;p&gt;Start the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet run &lt;span class="nt"&gt;--project&lt;/span&gt; src/McpAzureContainerAppsDemo.Server/McpAzureContainerAppsDemo.Server.csproj &lt;span class="nt"&gt;--launch-profile&lt;/span&gt; http
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sample listens on:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check the health endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8080/health
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get a small JSON response with &lt;code&gt;status&lt;/code&gt; and &lt;code&gt;utc&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;Call the MCP server from C#&lt;/h2&gt;

&lt;p&gt;The client uses the same MCP SDK, but with the client package.&lt;/p&gt;

&lt;p&gt;The transport is configured for Streamable HTTP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;transportOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;HttpClientTransportOptions&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"mcp-azure-container-apps-demo-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Endpoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServerUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;TransportMode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;HttpTransportMode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StreamableHttp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AdditionalHeaders&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateHeaders&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;transport&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;HttpClientTransport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transportOptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loggerFactory&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;McpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;McpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loggerFactory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;loggerFactory&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the client lists available tools and calls two of them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;IList&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;McpClientTool&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ListToolsAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;CallToolResult&lt;/span&gt; &lt;span class="n"&gt;addResult&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CallToolAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"add"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;Dictionary&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;object&lt;/span&gt;&lt;span class="p"&gt;?&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"left"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"right"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it against the local server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet run &lt;span class="nt"&gt;--project&lt;/span&gt; src/McpAzureContainerAppsDemo.Client/McpAzureContainerAppsDemo.Client.csproj &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; http://localhost:8080/mcp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--time-zone&lt;/span&gt; Europe/Berlin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see the tool list, the result of &lt;code&gt;add&lt;/code&gt;, and a structured result from &lt;code&gt;server_time&lt;/code&gt;.&lt;/p&gt;
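
&lt;p&gt;To print a result yourself, you have to pull the text out of the returned content blocks. The helper below is a hypothetical sketch, not the repository's rendering code. It assumes a recent SDK version in which &lt;code&gt;CallToolResult.Content&lt;/code&gt; is a list of content blocks and text results arrive as &lt;code&gt;TextContentBlock&lt;/code&gt; items in the &lt;code&gt;ModelContextProtocol.Protocol&lt;/code&gt; namespace; check the types in the SDK version you reference.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Linq;
using ModelContextProtocol.Protocol;

// Hypothetical helper; the name is illustrative only.
public static class ToolResultText
{
    // Joins every text block of a tool result into one string,
    // ignoring non-text blocks such as images.
    public static string Extract(CallToolResult result) =&gt;
        string.Join(
            Environment.NewLine,
            result.Content.OfType&lt;TextContentBlock&gt;().Select(block =&gt; block.Text));
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the client code above, &lt;code&gt;Console.WriteLine(ToolResultText.Extract(addResult));&lt;/code&gt; would print whatever text the &lt;code&gt;add&lt;/code&gt; tool returned.&lt;/p&gt;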

&lt;h2&gt;Add simple API-key protection&lt;/h2&gt;

&lt;p&gt;Even a demo should not expose a public MCP endpoint with no protection at all.&lt;/p&gt;

&lt;p&gt;This sample uses a very small API-key middleware. It protects &lt;code&gt;/mcp&lt;/code&gt; when &lt;code&gt;McpServer__ApiKey&lt;/code&gt; is configured and leaves &lt;code&gt;/health&lt;/code&gt; unauthenticated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;RequiresApiKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PathString&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;configuredApiKey&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;StartsWithSegments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/mcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StringComparison&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrdinalIgnoreCase&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;!&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrWhiteSpace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;configuredApiKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
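
&lt;p&gt;To show where a predicate like this plugs in, here is a minimal sketch of an inline middleware built around it. It is an assumption about the overall shape, not the repository's middleware: &lt;code&gt;KeysMatch&lt;/code&gt; is a hypothetical helper, the &lt;code&gt;401&lt;/code&gt; response is one possible choice, and the snippet is a complete &lt;code&gt;Program.cs&lt;/code&gt; for an empty web project so that it can run on its own.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Security.Cryptography;
using System.Text;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// The environment variable McpServer__ApiKey surfaces as the
// configuration key McpServer:ApiKey.
string? configuredApiKey = app.Configuration["McpServer:ApiKey"];

app.Use(async (context, next) =&gt;
{
    if (RequiresApiKey(context.Request.Path, configuredApiKey))
    {
        // A missing header yields an empty string, which never matches.
        string suppliedApiKey = context.Request.Headers["X-Api-Key"].ToString();
        if (!KeysMatch(suppliedApiKey, configuredApiKey!))
        {
            // Reject before the request reaches the MCP endpoint.
            context.Response.StatusCode = StatusCodes.Status401Unauthorized;
            return;
        }
    }

    await next(context);
});

app.MapGet("/health", () =&gt; Results.Ok(new { status = "ok" }));
app.Run();

// Same predicate as in the article.
static bool RequiresApiKey(PathString path, string? configuredApiKey) =&gt;
    path.StartsWithSegments("/mcp", StringComparison.OrdinalIgnoreCase)
    &amp;&amp; !string.IsNullOrWhiteSpace(configuredApiKey);

// Hypothetical helper: compares the keys in constant time.
static bool KeysMatch(string supplied, string expected) =&gt;
    CryptographicOperations.FixedTimeEquals(
        Encoding.UTF8.GetBytes(supplied),
        Encoding.UTF8.GetBytes(expected));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note the consequence of the predicate: when no key is configured it returns &lt;code&gt;false&lt;/code&gt;, so every request passes through, including requests to &lt;code&gt;/mcp&lt;/code&gt;.&lt;/p&gt;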



&lt;p&gt;Run the server with a local key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;McpServer__ApiKey&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;local-dev-key &lt;span class="se"&gt;\&lt;/span&gt;
dotnet run &lt;span class="nt"&gt;--project&lt;/span&gt; src/McpAzureContainerAppsDemo.Server/McpAzureContainerAppsDemo.Server.csproj &lt;span class="nt"&gt;--launch-profile&lt;/span&gt; http
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then pass the same key to the client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet run &lt;span class="nt"&gt;--project&lt;/span&gt; src/McpAzureContainerAppsDemo.Client/McpAzureContainerAppsDemo.Client.csproj &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; http://localhost:8080/mcp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--api-key&lt;/span&gt; local-dev-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The client sends the key as an &lt;code&gt;X-Api-Key&lt;/code&gt; header.&lt;/p&gt;
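
&lt;p&gt;To see the header on the wire without the C# client, the same request can be sketched with &lt;code&gt;curl&lt;/code&gt;. This is a minimal sketch and not part of the sample repository: the helper name &lt;code&gt;mcp_initialize&lt;/code&gt;, the &lt;code&gt;clientInfo&lt;/code&gt; values, and the &lt;code&gt;protocolVersion&lt;/code&gt; string are assumptions. The &lt;code&gt;Content-Type&lt;/code&gt; and &lt;code&gt;Accept&lt;/code&gt; headers follow the Streamable HTTP transport, where a client accepts both JSON and SSE responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: POST a JSON-RPC "initialize" request to an MCP endpoint.
# Usage: mcp_initialize URL [API_KEY]
mcp_initialize() {
  url="$1"
  api_key="${2:-}"

  # Assumed values: protocolVersion and clientInfo are placeholders for a smoke test.
  body='{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"curl-smoke-test","version":"0.0.1"}}}'

  # Build the curl arguments; -i prints the status line and response headers.
  set -- -sS -i -X POST "$url" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json, text/event-stream" \
    -d "$body"

  # Only add the key header when a key was supplied.
  if [ -n "$api_key" ]; then
    set -- "$@" -H "X-Api-Key: $api_key"
  fi

  curl "$@"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the server from the previous step running, &lt;code&gt;mcp_initialize http://localhost:8080/mcp local-dev-key&lt;/code&gt; prints the status line, headers, and body. Calling it without the second argument sends no key, which is a quick way to check how the server responds to an unauthenticated request.&lt;/p&gt;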

&lt;p&gt;For production, I would not stop here. I would look at Microsoft Entra ID, managed identity, stricter network boundaries, logging, and proper secret management. For a minimal deployable sample, an environment-driven key keeps the flow understandable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Inspect with MCP Inspector
&lt;/h2&gt;

&lt;p&gt;You can also test the server with the MCP Inspector:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @modelcontextprotocol/inspector
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Select Streamable HTTP as the transport and connect to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:8080/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If API-key protection is enabled, add this header:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X-Api-Key: local-dev-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is useful when you want to inspect the tool schema without writing client code first.&lt;/p&gt;

&lt;h2&gt;
  
  
  Containerize the server
&lt;/h2&gt;

&lt;p&gt;One repo-specific detail before the Dockerfile: the sample repository uses &lt;code&gt;Directory.Build.props&lt;/code&gt; and &lt;code&gt;Directory.Packages.props&lt;/code&gt;, and the Dockerfile copies both. If you followed the manual &lt;code&gt;dotnet new&lt;/code&gt; steps and kept package versions in the project files, remove the &lt;code&gt;COPY Directory.Build.props Directory.Packages.props ./&lt;/code&gt; line. Do not replace those files with empty placeholders, because MSBuild expects valid XML.&lt;/p&gt;

&lt;p&gt;The Dockerfile uses a normal multi-stage .NET build:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;mcr.microsoft.com/dotnet/sdk:10.0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;build&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /src&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; Directory.Build.props Directory.Packages.props ./&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; src/McpAzureContainerAppsDemo.Server/McpAzureContainerAppsDemo.Server.csproj src/McpAzureContainerAppsDemo.Server/&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;dotnet restore src/McpAzureContainerAppsDemo.Server/McpAzureContainerAppsDemo.Server.csproj

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; src/McpAzureContainerAppsDemo.Server/ src/McpAzureContainerAppsDemo.Server/&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;dotnet publish src/McpAzureContainerAppsDemo.Server/McpAzureContainerAppsDemo.Server.csproj &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nt"&gt;--configuration&lt;/span&gt; Release &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nt"&gt;--no-restore&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nt"&gt;--output&lt;/span&gt; /app/publish

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;mcr.microsoft.com/dotnet/aspnet:10.0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;final&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; ASPNETCORE_URLS=http://0.0.0.0:8080&lt;/span&gt;
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8080&lt;/span&gt;
&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; $APP_UID&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /app/publish .&lt;/span&gt;
&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["dotnet", "McpAzureContainerAppsDemo.Server.dll"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Build the image from the repository root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;-f&lt;/span&gt; src/McpAzureContainerAppsDemo.Server/Dockerfile &lt;span class="nt"&gt;-t&lt;/span&gt; mcp-aca-demo &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 8080:8080 mcp-aca-demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or run it with API-key protection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 8080:8080 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;McpServer__ApiKey&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;local-dev-key &lt;span class="se"&gt;\&lt;/span&gt;
  mcp-aca-demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, the server is a containerized ASP.NET Core app exposing &lt;code&gt;/mcp&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That is why Azure Container Apps fits this kind of demo. You do not need a special hosting model for MCP. You can deploy a regular container and let the platform handle ingress, revisions, scaling, and environment variables.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy to Azure Container Apps
&lt;/h2&gt;

&lt;p&gt;Log in and set a few variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az login

&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;RESOURCE_GROUP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mcp-aca-demo-rg
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;LOCATION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;westeurope
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;APP_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mcp-aca-demo-&lt;span class="nv"&gt;$RANDOM&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ENVIRONMENT_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mcp-aca-demo-env
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MCP_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;openssl rand &lt;span class="nt"&gt;-base64&lt;/span&gt; 32&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this is the first time your subscription uses these services, register the required resource providers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az provider register &lt;span class="nt"&gt;--namespace&lt;/span&gt; Microsoft.App
az provider register &lt;span class="nt"&gt;--namespace&lt;/span&gt; Microsoft.OperationalInsights
az provider register &lt;span class="nt"&gt;--namespace&lt;/span&gt; Microsoft.ContainerRegistry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Registration is asynchronous, so wait until the &lt;code&gt;Microsoft.ContainerRegistry&lt;/code&gt; provider reports that it is registered:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az provider show &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; Microsoft.ContainerRegistry &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; registrationState &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; tsv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Continue when the command prints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Registered
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
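
&lt;p&gt;Instead of re-running that check by hand, it can be wrapped in a small polling loop. This is a sketch under stated assumptions: the function name &lt;code&gt;wait_for_provider&lt;/code&gt;, the default interval, and the attempt limit are made up for illustration, and only the &lt;code&gt;az provider show&lt;/code&gt; call comes from the step above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: poll a resource provider until it reports "Registered".
# Usage: wait_for_provider NAMESPACE [INTERVAL_SECONDS] [MAX_ATTEMPTS]
wait_for_provider() {
  namespace="$1"
  interval="${2:-10}"
  max_attempts="${3:-30}"
  attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    # Same query as the manual check above.
    state="$(az provider show --namespace "$namespace" --query registrationState --output tsv)"
    if [ "$state" = "Registered" ]; then
      echo "$namespace: $state"
      return 0
    fi
    echo "$namespace: $state (attempt $attempt of $max_attempts)"
    sleep "$interval"
    attempt=$((attempt + 1))
  done

  # Give up with a non-zero exit code so a calling script can stop.
  echo "$namespace did not reach Registered in time"
  return 1
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;wait_for_provider Microsoft.ContainerRegistry&lt;/code&gt; returns once the provider is registered, and the same call works for &lt;code&gt;Microsoft.App&lt;/code&gt; and &lt;code&gt;Microsoft.OperationalInsights&lt;/code&gt;.&lt;/p&gt;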



&lt;p&gt;Create the resource group:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az group create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RESOURCE_GROUP&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--location&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCATION&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Deploy from local source:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# This builds the image and creates an Azure Container Registry (ACR)&lt;/span&gt;
&lt;span class="c"&gt;# in the resource group if one is needed.&lt;/span&gt;
az containerapp up &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$APP_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RESOURCE_GROUP&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--location&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCATION&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--environment&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ENVIRONMENT_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--source&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--ingress&lt;/span&gt; external &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--target-port&lt;/span&gt; 8080 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env-vars&lt;/span&gt; &lt;span class="nv"&gt;McpServer__ApiKey&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$MCP_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The settings that matter here are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--source .&lt;/code&gt;: build and deploy from the local repository&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--ingress external&lt;/code&gt;: expose the app publicly&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--target-port 8080&lt;/code&gt;: match the port used by the container&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--env-vars McpServer__ApiKey=...&lt;/code&gt;: pass the API key into the app configuration&lt;/li&gt;
&lt;/ul&gt;
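
&lt;p&gt;With &lt;code&gt;--env-vars&lt;/code&gt;, the key is stored as a plain value in the app configuration. As a hedged sketch of the secret management mentioned earlier, the key could instead live in a Container Apps secret that the environment variable references with &lt;code&gt;secretref:&lt;/code&gt;. The helper name &lt;code&gt;use_secret_api_key&lt;/code&gt; and the secret name &lt;code&gt;mcp-api-key&lt;/code&gt; are assumptions for illustration and are not part of the sample repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: move the API key from a plain env var to a Container Apps secret.
# Usage: use_secret_api_key APP_NAME RESOURCE_GROUP API_KEY
use_secret_api_key() {
  app_name="$1"
  resource_group="$2"
  api_key="$3"

  # Store the key as a named secret on the container app.
  az containerapp secret set \
    --name "$app_name" \
    --resource-group "$resource_group" \
    --secrets "mcp-api-key=$api_key"

  # Point the env var at the secret instead of a literal value.
  az containerapp update \
    --name "$app_name" \
    --resource-group "$resource_group" \
    --set-env-vars "McpServer__ApiKey=secretref:mcp-api-key"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After the first deployment, it could be called as &lt;code&gt;use_secret_api_key "$APP_NAME" "$RESOURCE_GROUP" "$MCP_API_KEY"&lt;/code&gt;. The server still reads &lt;code&gt;McpServer__ApiKey&lt;/code&gt;, so no application code has to change.&lt;/p&gt;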

&lt;p&gt;Get the public URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MCP_FQDN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;az containerapp show &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$APP_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RESOURCE_GROUP&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; properties.configuration.ingress.fqdn &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; tsv&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"https://&lt;/span&gt;&lt;span class="nv"&gt;$MCP_FQDN&lt;/span&gt;&lt;span class="s2"&gt;/mcp"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then call the deployed MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet run &lt;span class="nt"&gt;--project&lt;/span&gt; src/McpAzureContainerAppsDemo.Client/McpAzureContainerAppsDemo.Client.csproj &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; &lt;span class="s2"&gt;"https://&lt;/span&gt;&lt;span class="nv"&gt;$MCP_FQDN&lt;/span&gt;&lt;span class="s2"&gt;/mcp"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--api-key&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$MCP_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--time-zone&lt;/span&gt; Europe/Berlin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything is wired correctly, the client behaves the same way as it did locally. It lists tools, calls &lt;code&gt;add&lt;/code&gt;, and calls &lt;code&gt;server_time&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That is the practical value of this setup: once the MCP server is exposed over Streamable HTTP, the local and deployed client flows look almost identical.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cleanup
&lt;/h2&gt;

&lt;p&gt;When you are done, remove the resource group:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az group delete &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RESOURCE_GROUP&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--yes&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-wait&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
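
&lt;p&gt;Because of &lt;code&gt;--no-wait&lt;/code&gt;, the command returns before the deletion has finished. If you want to confirm that the resources are gone, one option is to poll &lt;code&gt;az group exists&lt;/code&gt;. This is a sketch: the function name &lt;code&gt;wait_for_group_deletion&lt;/code&gt;, the interval, and the attempt limit are assumptions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: poll until the resource group no longer exists.
# Usage: wait_for_group_deletion RESOURCE_GROUP [INTERVAL_SECONDS] [MAX_ATTEMPTS]
wait_for_group_deletion() {
  group="$1"
  interval="${2:-15}"
  max_attempts="${3:-40}"
  attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    # az group exists prints "true" or "false".
    if [ "$(az group exists --name "$group")" = "false" ]; then
      echo "$group deleted"
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done

  echo "$group still exists after $max_attempts checks"
  return 1
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;wait_for_group_deletion "$RESOURCE_GROUP"&lt;/code&gt; returns with exit code 0 once the group is gone.&lt;/p&gt;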



&lt;h2&gt;
  
  
  What I would change for a real system
&lt;/h2&gt;

&lt;p&gt;This repository is a demo. I would change a few things before using the same shape in production.&lt;/p&gt;

&lt;p&gt;First, I would replace the static API key with a stronger authentication and authorization model. MCP is a tool boundary. If a tool can read or mutate real data, access control matters.&lt;/p&gt;

&lt;p&gt;Second, I would add proper observability. At minimum, I would want logs around tool calls, failures, request IDs, latency, and downstream dependencies. For real operations, OpenTelemetry and the Aspire dashboard are useful places to start.&lt;/p&gt;

&lt;p&gt;Third, I would make the tool contract boring and explicit. Clear names, narrow parameters, validation, and predictable errors matter more than clever tool descriptions.&lt;/p&gt;

&lt;p&gt;Fourth, I would add integration tests around the HTTP surface. Unit tests are enough for the arithmetic and time services, but the &lt;code&gt;/mcp&lt;/code&gt; endpoint, headers, and deployment configuration need their own checks once the sample grows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;You can build and deploy a remote MCP server in .NET without much ceremony.&lt;/p&gt;

&lt;p&gt;The flow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a normal ASP.NET Core app.&lt;/li&gt;
&lt;li&gt;Add &lt;code&gt;ModelContextProtocol.AspNetCore&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Register MCP with Streamable HTTP.&lt;/li&gt;
&lt;li&gt;Expose tools with attributes.&lt;/li&gt;
&lt;li&gt;Keep tool behavior in regular services.&lt;/li&gt;
&lt;li&gt;Containerize the server.&lt;/li&gt;
&lt;li&gt;Deploy it to Azure Container Apps.&lt;/li&gt;
&lt;li&gt;Call it from a C# MCP client.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;code&gt;add&lt;/code&gt; tool is just there to prove the path works. The useful part is the hosting model.&lt;/p&gt;

&lt;p&gt;Once MCP runs as a regular HTTP service, it fits naturally into the same deployment path many .NET teams already use: container image, environment variables, ingress, health endpoint, and a small client that points at &lt;code&gt;/mcp&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Use this shape when you need a remote MCP tool service that behaves like a normal .NET web app and can be called from clients outside the host machine.&lt;/p&gt;

&lt;p&gt;Do not use the sample as-is when tools can access production data, mutate state, or need user-level authorization. Add real authentication, tighter network boundaries, observability, and integration tests first.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/modelcontextprotocol/csharp-sdk" rel="noopener noreferrer"&gt;Official C# SDK for Model Context Protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelcontextprotocol.io/specification/2025-11-25/basic/transports" rel="noopener noreferrer"&gt;MCP transport specification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/azure/container-apps/environment-variables" rel="noopener noreferrer"&gt;Azure Container Apps environment variables&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/cli/azure/containerapp" rel="noopener noreferrer"&gt;Azure CLI &lt;code&gt;containerapp&lt;/code&gt; reference&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>mcp</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Microsoft Agent Framework Workflows: When to Use Them and When to Stay in C#</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Thu, 11 Jun 2026 15:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/microsoft-agent-framework-workflows-when-to-use-them-and-when-to-stay-in-c-33jd</link>
      <guid>https://dev.to/lukaswalter/microsoft-agent-framework-workflows-when-to-use-them-and-when-to-stay-in-c-33jd</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 12 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_12/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_11/" rel="noopener noreferrer"&gt;previous article&lt;/a&gt;, we looked at agents as tools.&lt;br&gt;
A coordinator agent could delegate work to focused specialist agents without exposing every low-level tool directly.&lt;/p&gt;

&lt;p&gt;Before that, we looked at &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_10/" rel="noopener noreferrer"&gt;manual multi-agent routing&lt;/a&gt;.&lt;br&gt;
A small intent agent returned structured output, and normal C# code decided which specialist agent should run.&lt;/p&gt;

&lt;p&gt;Both patterns are useful because they keep orchestration close to the application.&lt;br&gt;
But Microsoft Agent Framework also includes a real workflow engine.&lt;br&gt;
It has executors, edges, graph-based orchestration, streaming events, handoffs, checkpointing, and human-in-the-loop scenarios.&lt;/p&gt;

&lt;p&gt;That is powerful.&lt;br&gt;
It is also not free.&lt;/p&gt;

&lt;p&gt;Just because the framework provides a workflow engine does not mean every multi-agent interaction should become a workflow graph.&lt;br&gt;
In production systems, the best architecture is often the smallest abstraction that makes the system understandable and reliable.&lt;/p&gt;

&lt;p&gt;My rule is simple:&lt;/p&gt;

&lt;p&gt;Microsoft Agent Framework workflows are useful when the process itself needs to be explicit, observable, resumable, and long-running.&lt;br&gt;
For simple orchestration, normal C# code is often the better choice.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why Workflows Exist
&lt;/h2&gt;

&lt;p&gt;A workflow makes orchestration explicit.&lt;/p&gt;

&lt;p&gt;Instead of hiding the process inside ad-hoc method calls, the process becomes a model.&lt;br&gt;
The model is a graph.&lt;br&gt;
The graph contains units of work and connections between them.&lt;/p&gt;

&lt;p&gt;That graph can represent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sequential execution&lt;/li&gt;
&lt;li&gt;branching&lt;/li&gt;
&lt;li&gt;parallel execution&lt;/li&gt;
&lt;li&gt;fan-out and fan-in&lt;/li&gt;
&lt;li&gt;handoffs between agents&lt;/li&gt;
&lt;li&gt;human input&lt;/li&gt;
&lt;li&gt;long-running processes&lt;/li&gt;
&lt;li&gt;events emitted while the process runs&lt;/li&gt;
&lt;li&gt;checkpointing and resumption&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_12_1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_12_1.png" title="Workflow process model" alt="Workflow process model with agents, code steps, human approval, events, and checkpoint state" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a different abstraction from a single agent call.&lt;/p&gt;

&lt;p&gt;An agent decides dynamically what to do based on its instructions, tools, and conversation context.&lt;br&gt;
A workflow defines a process outside the model.&lt;br&gt;
Agents can be steps inside that process, but the process itself is controlled by the workflow.&lt;/p&gt;

&lt;p&gt;That distinction matters.&lt;/p&gt;

&lt;p&gt;If you are building a document review flow, a support escalation flow, or an approval process that may pause and resume later, the workflow is not just plumbing.&lt;br&gt;
It is part of the system.&lt;br&gt;
You may need to inspect it, persist it, test it, version it, and explain it to other people.&lt;/p&gt;

&lt;p&gt;That is where a workflow engine becomes useful.&lt;/p&gt;

&lt;p&gt;But there is a cost.&lt;br&gt;
A workflow graph introduces more setup code, more types, more event handling, and another layer between the application code and what actually runs.&lt;br&gt;
For complex systems, that cost can be worth paying.&lt;br&gt;
For small flows, it can make the system harder to read.&lt;/p&gt;
&lt;h2&gt;
  
  
  Executors and Edges
&lt;/h2&gt;

&lt;p&gt;The two basic building blocks are executors and edges.&lt;/p&gt;

&lt;p&gt;An executor is a unit of work.&lt;br&gt;
It receives input, runs logic, and produces output or events.&lt;/p&gt;

&lt;p&gt;An executor can wrap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deterministic C# code&lt;/li&gt;
&lt;li&gt;an agent call&lt;/li&gt;
&lt;li&gt;a tool call&lt;/li&gt;
&lt;li&gt;validation logic&lt;/li&gt;
&lt;li&gt;a transformation step&lt;/li&gt;
&lt;li&gt;a human approval request&lt;/li&gt;
&lt;li&gt;another piece of process logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In agentic workflows, many executors are either agents or agent-driven steps.&lt;br&gt;
For example, one executor might call a document summarizer agent.&lt;br&gt;
Another executor might validate the generated summary.&lt;br&gt;
Another executor might wait for a human reviewer.&lt;/p&gt;

&lt;p&gt;Edges define how data moves between executors.&lt;br&gt;
They are the connections in the graph.&lt;/p&gt;

&lt;p&gt;An edge can represent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a direct connection from one step to the next&lt;/li&gt;
&lt;li&gt;conditional routing&lt;/li&gt;
&lt;li&gt;switch-like branching&lt;/li&gt;
&lt;li&gt;fan-out to multiple steps&lt;/li&gt;
&lt;li&gt;fan-in back into an aggregation step&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_12_2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_12_2.png" title="Executors and edges" alt="Executors connected by direct and conditional edges in a workflow graph" width="800" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is where the workflow engine adds value.&lt;br&gt;
The control flow becomes visible in the workflow definition.&lt;br&gt;
You can look at the graph to see which steps can occur and which transitions are allowed.&lt;/p&gt;

&lt;p&gt;Executors and edges are powerful because they turn orchestration into a model.&lt;br&gt;
But once orchestration becomes a model, you also pay the cost of maintaining that model.&lt;/p&gt;
&lt;h2&gt;
  
  
  Four Workflow-Related Patterns
&lt;/h2&gt;

&lt;p&gt;Several workflow-related patterns show up quickly in real applications.&lt;br&gt;
Some are orchestration patterns.&lt;br&gt;
Others, such as human-in-the-loop, are process capabilities.&lt;/p&gt;

&lt;p&gt;The important part is not the label.&lt;br&gt;
The important part is whether the process benefits from being modeled explicitly.&lt;/p&gt;

&lt;p&gt;The examples are intentionally simplified.&lt;br&gt;
The point is the orchestration shape, not the exact API surface.&lt;/p&gt;
&lt;h3&gt;
  
  
  Sequential Workflows
&lt;/h3&gt;

&lt;p&gt;A sequential workflow passes the output of one step to the next.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;One agent summarizes a legal document.&lt;/li&gt;
&lt;li&gt;Another agent translates the summary into French.&lt;/li&gt;
&lt;li&gt;A final step formats the result for a user.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As a workflow, this is a pipeline.&lt;br&gt;
Each step depends on the previous step.&lt;br&gt;
In a simple pipeline, the result of one executor can become the input for the next executor. In Agent Framework sequential agent orchestration, the next agent may also receive conversation context depending on how the orchestration is configured.&lt;/p&gt;

&lt;p&gt;That is a valid workflow shape.&lt;br&gt;
But for two or three steps that always run in the same order, normal C# may be clearer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AgentResponse&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;summarizerAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documentText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;AgentResponse&lt;/span&gt; &lt;span class="n"&gt;translation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;translatorAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;$"Translate this summary to French:\n\n&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;formatted&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;translation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is nothing wrong with this.&lt;br&gt;
It is readable from top to bottom.&lt;br&gt;
It is easy to test.&lt;br&gt;
It is easy to debug.&lt;/p&gt;

&lt;p&gt;The workflow version starts to pay off when the sequence itself matters operationally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need to show progress for each step&lt;/li&gt;
&lt;li&gt;You need to retry or resume from a later step&lt;/li&gt;
&lt;li&gt;You need to audit which step produced which output&lt;/li&gt;
&lt;li&gt;You expect the pipeline to grow&lt;/li&gt;
&lt;li&gt;You expect non-trivial branches to appear between steps&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Concurrent Workflows
&lt;/h3&gt;

&lt;p&gt;A concurrent workflow sends the same input to multiple agents or steps at the same time.&lt;/p&gt;

&lt;p&gt;For example, a legal document could be processed by three agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One checks legal risk&lt;/li&gt;
&lt;li&gt;One checks spelling and grammar&lt;/li&gt;
&lt;li&gt;One extracts key obligations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This can reduce latency because the work happens in parallel.&lt;br&gt;
It also creates a useful structure when each branch has its own state, events, retry behavior, or result type.&lt;/p&gt;

&lt;p&gt;But again, C# already has a good concurrency primitive:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;riskTask&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;legalRiskAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documentText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;grammarTask&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;grammarAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documentText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;obligationsTask&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;obligationsAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documentText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WhenAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;riskTask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;grammarTask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obligationsTask&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;AgentResponse&lt;/span&gt; &lt;span class="n"&gt;risk&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;riskTask&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;AgentResponse&lt;/span&gt; &lt;span class="n"&gt;grammar&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;grammarTask&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;AgentResponse&lt;/span&gt; &lt;span class="n"&gt;obligations&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;obligationsTask&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;DocumentReviewResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;LegalRisk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;risk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;GrammarNotes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;grammar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Obligations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;obligations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For simple concurrent orchestration, &lt;code&gt;Task.WhenAll&lt;/code&gt; is often easier to read and test than a workflow graph.&lt;/p&gt;

&lt;p&gt;The operational concerns are the same either way:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parallel agent calls multiply token usage&lt;/li&gt;
&lt;li&gt;They can hit provider rate limits faster&lt;/li&gt;
&lt;li&gt;They increase cost&lt;/li&gt;
&lt;li&gt;They make partial failure handling more important&lt;/li&gt;
&lt;li&gt;Concurrency should be bounded and intentional&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The workflow engine does not make those concerns disappear.&lt;br&gt;
It provides a structured way to model and observe concurrent branches when that structure matters.&lt;/p&gt;
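
&lt;p&gt;To make "bounded and intentional" concrete, here is a minimal sketch that caps the number of in-flight agent calls with a &lt;code&gt;SemaphoreSlim&lt;/code&gt;.&lt;br&gt;
The helper name &lt;code&gt;RunBoundedAsync&lt;/code&gt; and its parameters are assumptions for illustration. The agent calls are passed in as plain delegates, so nothing here depends on the Agent Framework API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedConcurrency
{
    // Hypothetical helper: runs every call, but at most maxConcurrency at a time.
    public static async Task&amp;lt;T[]&amp;gt; RunBoundedAsync&amp;lt;T&amp;gt;(
        IEnumerable&amp;lt;Func&amp;lt;CancellationToken, Task&amp;lt;T&amp;gt;&amp;gt;&amp;gt; calls,
        int maxConcurrency,
        CancellationToken cancellationToken)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        async Task&amp;lt;T&amp;gt; RunOneAsync(Func&amp;lt;CancellationToken, Task&amp;lt;T&amp;gt;&amp;gt; call)
        {
            // Wait for a free slot before calling the provider.
            await gate.WaitAsync(cancellationToken);
            try
            {
                return await call(cancellationToken);
            }
            finally
            {
                gate.Release();
            }
        }

        // ToArray starts every wrapper immediately; the semaphore decides
        // how many of them are actually running their call at once.
        Task&amp;lt;T&amp;gt;[] tasks = calls.Select(call =&amp;gt; RunOneAsync(call)).ToArray();
        return await Task.WhenAll(tasks);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each agent call from the example above could be passed as a delegate such as &lt;code&gt;ct =&amp;gt; legalRiskAgent.RunAsync(documentText, cancellationToken: ct)&lt;/code&gt;.&lt;br&gt;
Results come back in the order of the input delegates. &lt;code&gt;Task.WhenAll&lt;/code&gt; still throws if any call fails, so partial-failure handling remains a separate decision.&lt;/p&gt;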
&lt;h3&gt;
  
  
  Handoff Workflows
&lt;/h3&gt;

&lt;p&gt;A handoff workflow allows one agent or step to decide which other agent should continue the task.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;An intent agent receives a user request.&lt;/li&gt;
&lt;li&gt;It decides whether the task belongs to a movie expert, music expert, legal expert, or support agent.&lt;/li&gt;
&lt;li&gt;The workflow controls which handoffs are allowed.&lt;/li&gt;
&lt;li&gt;The selected agent continues the process.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is useful when the route can continue over multiple steps and allowed transitions matter.&lt;br&gt;
The workflow can make it clear that the legal expert may hand off to compliance, but not directly to billing.&lt;br&gt;
Or that support can escalate to engineering, but only after collecting certain information.&lt;/p&gt;

&lt;p&gt;For simple routing, I would usually not start here.&lt;br&gt;
The pattern from the manual routing article is often enough:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IntentResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;intentResponse&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;intentAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IntentResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;
        &lt;span class="s"&gt;$"""
&lt;/span&gt;        &lt;span class="n"&gt;Classify&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

        &lt;span class="n"&gt;User&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="s"&gt;""",
&lt;/span&gt;        &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;IntentResult&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;intentResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Intent&lt;/span&gt; &lt;span class="k"&gt;switch&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;UserIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Movies&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;movieAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;UserIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Music&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;musicAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;UserIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Legal&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;legalAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;"I can help with movies, music, or legal questions."&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps routing explicit in application code.&lt;br&gt;
The classifier returns structured output.&lt;br&gt;
The application validates it.&lt;br&gt;
The switch expression owns the route.&lt;/p&gt;

&lt;p&gt;That is boring in the best way.&lt;/p&gt;
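
&lt;p&gt;The validation step is not shown in the snippet above. A minimal sketch of what it could look like, assuming &lt;code&gt;IntentResult&lt;/code&gt; is a record with an &lt;code&gt;Intent&lt;/code&gt; property and &lt;code&gt;UserIntent&lt;/code&gt; has an &lt;code&gt;Unknown&lt;/code&gt; member (both shapes are assumptions, not taken from the series code):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

// Assumed shapes for illustration only.
public enum UserIntent { Unknown, Movies, Music, Legal }

public sealed record IntentResult(UserIntent Intent);

public static class IntentValidation
{
    // Treats a missing result or an out-of-range enum value as Unknown,
    // so the switch expression only ever sees values the application knows.
    public static UserIntent Normalize(IntentResult? result)
    {
        if (result is null)
        {
            return UserIntent.Unknown;
        }

        return Enum.IsDefined(typeof(UserIntent), result.Intent)
            ? result.Intent
            : UserIntent.Unknown;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With something like this in place, the switch could run over &lt;code&gt;IntentValidation.Normalize(intentResponse.Result)&lt;/code&gt;, and the discard arm handles &lt;code&gt;Unknown&lt;/code&gt;.&lt;/p&gt;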

&lt;p&gt;Handoff workflows help when routing is no longer just a single switch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allowed transitions are part of the domain&lt;/li&gt;
&lt;li&gt;Agents may hand off repeatedly&lt;/li&gt;
&lt;li&gt;The path needs to be inspected later&lt;/li&gt;
&lt;li&gt;Intermediate state must survive between handoffs&lt;/li&gt;
&lt;li&gt;Handoffs need events, checkpoints, or human approval&lt;/li&gt;
&lt;/ul&gt;
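
&lt;p&gt;Whichever side enforces it, an allowed-transition rule is ultimately a small table. Here is a minimal sketch in plain C#, with hypothetical role names. It is not the Agent Framework handoff API, and it does not model preconditions such as "only after collecting certain information":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Collections.Generic;

// Hypothetical roles for illustration.
public enum AgentRole { Intent, Legal, Compliance, Support, Engineering, Billing }

public static class HandoffPolicy
{
    // Each role lists the roles it may hand off to. Roles without an entry
    // may not hand off at all.
    private static readonly Dictionary&amp;lt;AgentRole, HashSet&amp;lt;AgentRole&amp;gt;&amp;gt; Allowed = new()
    {
        [AgentRole.Intent] = new HashSet&amp;lt;AgentRole&amp;gt; { AgentRole.Legal, AgentRole.Support },
        [AgentRole.Legal] = new HashSet&amp;lt;AgentRole&amp;gt; { AgentRole.Compliance },
        [AgentRole.Support] = new HashSet&amp;lt;AgentRole&amp;gt; { AgentRole.Engineering },
    };

    public static bool CanHandOff(AgentRole from, AgentRole to)
    {
        if (!Allowed.TryGetValue(from, out var targets))
        {
            return false;
        }

        return targets.Contains(to);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here &lt;code&gt;CanHandOff(AgentRole.Legal, AgentRole.Compliance)&lt;/code&gt; is allowed, while &lt;code&gt;CanHandOff(AgentRole.Legal, AgentRole.Billing)&lt;/code&gt; is not.&lt;br&gt;
Once this table grows, needs history, or needs approval steps, that is the point where a handoff workflow starts to earn its setup cost.&lt;/p&gt;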
&lt;h3&gt;
  
  
  Human-in-the-Loop Workflows
&lt;/h3&gt;

&lt;p&gt;Human-in-the-loop is less about routing and more about process state.&lt;br&gt;
It allows the system to pause and wait for human input.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;An agent drafts an answer.&lt;/li&gt;
&lt;li&gt;The workflow pauses for human approval.&lt;/li&gt;
&lt;li&gt;A human edits or approves the answer.&lt;/li&gt;
&lt;li&gt;The workflow continues with publishing or sending the result.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is one of the strongest reasons to use a workflow engine.&lt;br&gt;
The pause is not a UI detail.&lt;br&gt;
It is part of the process.&lt;/p&gt;

&lt;p&gt;Still, not every user interaction needs a workflow.&lt;/p&gt;

&lt;p&gt;For a local chat UI, this may be enough:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AgentResponse&lt;/span&gt; &lt;span class="n"&gt;draft&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;ApprovalDecision&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;approvalUi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RequestApprovalAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;draft&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Approved&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Draft rejected."&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;publisher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;PublishAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EditedText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is simple.&lt;br&gt;
The user is present.&lt;br&gt;
The interaction is immediate.&lt;br&gt;
The state belongs naturally in the UI or controller flow.&lt;/p&gt;

&lt;p&gt;Workflows become useful when the human interaction is part of a long-running business process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;approval chains&lt;/li&gt;
&lt;li&gt;compliance checks&lt;/li&gt;
&lt;li&gt;ticket escalation&lt;/li&gt;
&lt;li&gt;delayed review&lt;/li&gt;
&lt;li&gt;multi-day processes&lt;/li&gt;
&lt;li&gt;persisted state&lt;/li&gt;
&lt;li&gt;audit requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the process may pause today and resume tomorrow, a workflow engine is a much better fit than pretending everything still lives inside one request handler.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Guitar Order Example
&lt;/h2&gt;

&lt;p&gt;Let's make the trade-off concrete.&lt;/p&gt;

&lt;p&gt;A customer orders a guitar from an online music store.&lt;/p&gt;

&lt;p&gt;The system should:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;parse the customer's guitar order&lt;/li&gt;
&lt;li&gt;check inventory&lt;/li&gt;
&lt;li&gt;decide whether the guitar can be reserved&lt;/li&gt;
&lt;li&gt;return either a confirmation, an alternative recommendation, or a human review request&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The agents are not the hard part.&lt;br&gt;
The design decision is where orchestration belongs.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Workflow-Heavy Version
&lt;/h3&gt;

&lt;p&gt;In a workflow-heavy design, you might model the process like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;GuitarOrderParserExecutor&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;InventoryCheckExecutor&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ReserveInventoryExecutor&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AlternativeRecommendationExecutor&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;HumanReviewExecutor&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;OrderConfirmationExecutor&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;an edge from parsing to inventory&lt;/li&gt;
&lt;li&gt;a conditional edge from inventory to alternative recommendations&lt;/li&gt;
&lt;li&gt;a conditional edge from inventory to reservation&lt;/li&gt;
&lt;li&gt;a conditional edge from reservation to human review&lt;/li&gt;
&lt;li&gt;a conditional edge from reservation to confirmation&lt;/li&gt;
&lt;li&gt;an edge from approved human review to confirmation&lt;/li&gt;
&lt;li&gt;events emitted when parsing, inventory check, reservation, review, and final decision complete&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Conceptually, the graph looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_12_3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_12_3.png" title="Guitar order workflow" alt="Guitar order workflow with inventory check, reservation, alternative recommendation, human review, and confirmation" width="800" height="852"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This can be a good design.&lt;/p&gt;

&lt;p&gt;If the guitar order process is part of a larger business process, the workflow graph may be exactly what you want.&lt;br&gt;
Maybe high-value guitars require manual review.&lt;br&gt;
Maybe custom instruments need setup confirmation.&lt;br&gt;
Maybe inventory reservation and payment authorization happen asynchronously.&lt;br&gt;
Maybe fulfillment is delayed.&lt;br&gt;
Maybe customer support needs to inspect where the order stopped.&lt;/p&gt;

&lt;p&gt;In that world, the workflow is not over-engineering.&lt;br&gt;
It is the process model.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Normal C# Version
&lt;/h3&gt;

&lt;p&gt;For the simple version, I would probably write something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;GuitarOrderResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;HandleAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;GuitarOrder&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;guitarOrderParserAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;GuitarOrder&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;
            &lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;InventoryResult&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;inventoryService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CheckAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;inventory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsInStock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;AlternativeProduct&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;alternative&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;recommendationService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FindAlternativeAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;GuitarOrderResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AlternativeRecommendation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;alternative&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;Reservation&lt;/span&gt; &lt;span class="n"&gt;reservation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;inventoryService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ReserveAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequiresManualReview&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;GuitarOrderResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HumanReviewRequired&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;GuitarOrderResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Confirmed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the plain C# version, the method just returns a &lt;code&gt;HumanReviewRequired&lt;/code&gt; result and lets the surrounding application handle the next step.&lt;br&gt;
In the workflow version, the process itself could pause, persist its state, wait for the review, and continue later.&lt;/p&gt;

&lt;p&gt;Every developer can follow this flow from top to bottom.&lt;br&gt;
There is no workflow setup to inspect.&lt;br&gt;
There are no edges to mentally resolve.&lt;br&gt;
The control flow is not hidden behind the workflow definition.&lt;/p&gt;

&lt;p&gt;For this simple case, the C# version may be easier to understand six months later.&lt;/p&gt;
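
&lt;p&gt;The method above relies on a &lt;code&gt;GuitarOrderResult&lt;/code&gt; type that is not shown. One possible shape is sketched below. The &lt;code&gt;GuitarOrderStatus&lt;/code&gt; enum and the properties of &lt;code&gt;GuitarOrder&lt;/code&gt;, &lt;code&gt;AlternativeProduct&lt;/code&gt;, and &lt;code&gt;Reservation&lt;/code&gt; are assumptions for illustration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;public enum GuitarOrderStatus { Confirmed, AlternativeRecommendation, HumanReviewRequired }

// Hypothetical domain types; their real shape is not part of the article.
public sealed record GuitarOrder(string Model, int Quantity);
public sealed record AlternativeProduct(string Model);
public sealed record Reservation(string ReservationId, bool RequiresManualReview);

// One result type with a status, so callers can branch on the outcome
// without catching exceptions or inspecting workflow events.
public sealed record GuitarOrderResult(
    GuitarOrderStatus Status,
    GuitarOrder Order,
    Reservation? Reservation = null,
    AlternativeProduct? Alternative = null)
{
    public static GuitarOrderResult Confirmed(GuitarOrder order, Reservation reservation) =&amp;gt;
        new(GuitarOrderStatus.Confirmed, order, Reservation: reservation);

    // The alternative may be null when no comparable product was found.
    public static GuitarOrderResult AlternativeRecommendation(
        GuitarOrder order,
        AlternativeProduct? alternative) =&amp;gt;
        new(GuitarOrderStatus.AlternativeRecommendation, order, Alternative: alternative);

    public static GuitarOrderResult HumanReviewRequired(GuitarOrder order, Reservation reservation) =&amp;gt;
        new(GuitarOrderStatus.HumanReviewRequired, order, Reservation: reservation);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The surrounding application can then switch on &lt;code&gt;Status&lt;/code&gt; to show a confirmation, offer the alternative, or queue the order for review.&lt;/p&gt;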

&lt;p&gt;The same agents can be used in both designs.&lt;br&gt;
The design decision is not whether agents are useful.&lt;br&gt;
The design decision is where orchestration belongs.&lt;/p&gt;

&lt;p&gt;That is the trap with workflow engines.&lt;br&gt;
They can make complex processes easier to inspect, but they can also make simple processes harder to understand.&lt;/p&gt;

&lt;p&gt;For small flows, workflow graphs can introduce more setup code, more indirection, more event handling, and more distance between cause and effect.&lt;/p&gt;

&lt;p&gt;Ceremony is not architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Decision Matrix: Workflow Engine vs. Normal C#
&lt;/h2&gt;

&lt;p&gt;Use this as a quick filter before reaching for a workflow graph.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Start with C# when...&lt;/th&gt;
&lt;th&gt;Use workflows when...&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sequential chain&lt;/td&gt;
&lt;td&gt;The flow has two or three deterministic steps&lt;/td&gt;
&lt;td&gt;Steps need progress events, checkpoints, retries, or resumption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrent calls&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Task.WhenAll&lt;/code&gt; is enough&lt;/td&gt;
&lt;td&gt;Branches need independent state, observability, or fan-in logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Routing / handoff&lt;/td&gt;
&lt;td&gt;Structured output + &lt;code&gt;switch&lt;/code&gt; is clear&lt;/td&gt;
&lt;td&gt;Allowed transitions are part of the domain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Human input&lt;/td&gt;
&lt;td&gt;The user is present in the UI&lt;/td&gt;
&lt;td&gt;Approval is delayed, auditable, or long-running&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The table is intentionally biased toward starting simple. A workflow engine should not be used because orchestration exists. It should be used because modeling orchestration improves the system.&lt;/p&gt;

&lt;h2&gt;
  
  
  When I Would Use Workflows
&lt;/h2&gt;

&lt;p&gt;I reach for workflows when the orchestration is important enough to become a first-class part of the system.&lt;/p&gt;

&lt;p&gt;That usually means one or more of these are true:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the process has many steps and branches&lt;/li&gt;
&lt;li&gt;the process itself needs to be visible&lt;/li&gt;
&lt;li&gt;the workflow needs to pause and resume&lt;/li&gt;
&lt;li&gt;the state must be checkpointed or persisted&lt;/li&gt;
&lt;li&gt;humans approve or modify intermediate results&lt;/li&gt;
&lt;li&gt;the process may run for minutes, hours, or days&lt;/li&gt;
&lt;li&gt;the orchestration path matters for auditing&lt;/li&gt;
&lt;li&gt;teams need a shared model of the process&lt;/li&gt;
&lt;li&gt;the graph is easier to reason about than scattered application code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the process is part of the product behavior, model it deliberately.&lt;br&gt;
If users, operators, or auditors care about the process, hiding it in a method call may not be enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  When I Would Not Use Workflows
&lt;/h2&gt;

&lt;p&gt;I avoid workflows when the graph adds more code than it removes.&lt;/p&gt;

&lt;p&gt;That is often the case when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;there are only two or three steps&lt;/li&gt;
&lt;li&gt;the process is linear&lt;/li&gt;
&lt;li&gt;routing is just a simple switch&lt;/li&gt;
&lt;li&gt;parallelism is just &lt;code&gt;Task.WhenAll&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;user input belongs naturally in the UI flow&lt;/li&gt;
&lt;li&gt;the workflow graph adds more setup than clarity&lt;/li&gt;
&lt;li&gt;debugging the workflow is harder than reading the code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If normal C# makes the flow obvious, testable, and maintainable, that is not a workaround.&lt;br&gt;
That is good engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  External Workflow Engines
&lt;/h2&gt;

&lt;p&gt;Microsoft Agent Framework workflows are useful inside an agent application.&lt;br&gt;
They do not automatically replace every existing workflow platform.&lt;/p&gt;

&lt;p&gt;If the workflow is a company-level business process with many integrations, approvals, retries, timers, dashboards, and non-developer stakeholders, the agent framework should probably not become the central enterprise workflow platform.&lt;/p&gt;

&lt;p&gt;Examples of external workflow engines include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Azure Logic Apps&lt;/li&gt;
&lt;li&gt;Durable Functions&lt;/li&gt;
&lt;li&gt;n8n&lt;/li&gt;
&lt;li&gt;Make&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In those systems, agents can be exposed as API steps or services.&lt;br&gt;
The external workflow engine owns the broader business process.&lt;br&gt;
The agent application owns the agent behavior.&lt;/p&gt;

&lt;p&gt;Use the right workflow engine at the right level.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Practical Rule of Thumb
&lt;/h2&gt;

&lt;p&gt;Start with C#.&lt;br&gt;
Move to workflows when the process needs to outlive the method call.&lt;/p&gt;

&lt;p&gt;More specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the orchestration is local, short-lived, and deterministic, keep it in code.&lt;/li&gt;
&lt;li&gt;If the orchestration is long-running and has to be inspectable, resumable, or operationally visible, model it as a workflow.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another way to say it:&lt;/p&gt;

&lt;p&gt;Use code when you need control flow.&lt;br&gt;
Use workflows when control flow becomes product behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The workflow engine is a powerful part of Microsoft Agent Framework.&lt;br&gt;
It is a good fit when the process itself has architectural weight: state, checkpoints, events, approvals, resumption, and operational visibility.&lt;/p&gt;

&lt;p&gt;It is not the default answer to every multi-agent design.&lt;br&gt;
Simple orchestration still often belongs in normal C# code: method calls, switch statements, loops, &lt;code&gt;Task.WhenAll&lt;/code&gt;, validation logic, and explicit application services.&lt;/p&gt;

&lt;p&gt;The goal is not to use the most advanced abstraction.&lt;br&gt;
The goal is to make the system easier to understand, operate, and change.&lt;/p&gt;

&lt;p&gt;We can now orchestrate agents, route tasks, expose agents as tools, and decide when workflows are worth the complexity.&lt;br&gt;
Next, the series moves to knowledge: how agents answer from documents, databases, PDFs, and internal systems through retrieval-augmented generation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/workflows/" rel="noopener noreferrer"&gt;Microsoft Agent Framework Workflows&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/workflows/workflows" rel="noopener noreferrer"&gt;Workflow Builder and Execution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/workflows/orchestrations/sequential" rel="noopener noreferrer"&gt;Sequential Orchestration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/workflows/orchestrations/concurrent" rel="noopener noreferrer"&gt;Concurrent Orchestration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/workflows/orchestrations/handoff" rel="noopener noreferrer"&gt;Handoff Orchestration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/workflows/events" rel="noopener noreferrer"&gt;Workflow Events&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Agents as Tools in Microsoft Agent Framework</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Thu, 04 Jun 2026 15:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/agents-as-tools-in-microsoft-agent-framework-n27</link>
      <guid>https://dev.to/lukaswalter/agents-as-tools-in-microsoft-agent-framework-n27</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 11 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_11/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the previous article, we looked at manual multi-agent routing.&lt;br&gt;
A small intent agent classified the request.&lt;br&gt;
Then normal C# decided which specialist agent should run.&lt;/p&gt;

&lt;p&gt;That pattern is useful when your application should retain control over the route.&lt;br&gt;
But there is another pattern that is worth knowing: exposing an agent as a tool.&lt;/p&gt;

&lt;p&gt;Instead of routing from C#, you give a coordinator agent access to other agents as function tools.&lt;br&gt;
The coordinator can then decide when to delegate work to a focused inner agent.&lt;/p&gt;

&lt;p&gt;This is not a magic cost reducer.&lt;br&gt;
It is not a replacement for workflows.&lt;br&gt;
It is a lightweight composition pattern for cases where one agent has become too broad, but a full workflow graph would be too much.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Problem with One Tool-Heavy Agent
&lt;/h2&gt;

&lt;p&gt;Let's first discuss why a single, tool-heavy agent is a problem. &lt;br&gt;
Adding tools to an agent is easy.&lt;br&gt;
That is also why agents tend to grow too quickly.&lt;/p&gt;

&lt;p&gt;A first version might only answer questions.&lt;br&gt;
Then it gets a few string utilities.&lt;br&gt;
Then some number tools.&lt;br&gt;
Then a documentation search tool.&lt;br&gt;
Then a ticketing tool.&lt;br&gt;
Then a deployment helper.&lt;/p&gt;

&lt;p&gt;At some point, every request carries too much baggage.&lt;br&gt;
The model receives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;broad instructions&lt;/li&gt;
&lt;li&gt;many tool names&lt;/li&gt;
&lt;li&gt;many tool descriptions&lt;/li&gt;
&lt;li&gt;parameter schemas for tools it does not need&lt;/li&gt;
&lt;li&gt;rules for domains that are irrelevant to the current request&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This creates the same issue we saw in manual routing.&lt;br&gt;
The agent has more context to process, more tools to choose from, and more ways to choose the wrong behavior.&lt;/p&gt;

&lt;p&gt;For small agents, this does not matter much.&lt;br&gt;
For agents with many detailed tools, the overhead becomes visible quickly because tool definitions are part of the model context.&lt;/p&gt;
&lt;h2&gt;
  
  
  Agents as Tools
&lt;/h2&gt;

&lt;p&gt;Agent Framework lets you convert an &lt;code&gt;AIAgent&lt;/code&gt; into a function tool using &lt;code&gt;AsAIFunction()&lt;/code&gt;.&lt;br&gt;
That tool can then be passed to another agent.&lt;/p&gt;

&lt;p&gt;This assumes you are using an agent type that supports function tools.&lt;br&gt;
In practice, check agent and provider support before designing around this pattern.&lt;/p&gt;

&lt;p&gt;The outer agent sees the inner agent like any other tool.&lt;br&gt;
The inner agent still has its own instructions, tools, model calls, and behavior.&lt;/p&gt;

&lt;p&gt;Conceptually, the flow looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_1.png" title="Coordinator" alt="Coordinator" width="796" height="41"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The important part is isolation.&lt;br&gt;
The coordinator does not need to know every low-level tool behind every specialist agent.&lt;br&gt;
It only needs to know which specialist agents are available and what each one is good at.&lt;/p&gt;
&lt;h2&gt;
  
  
  A Small Example
&lt;/h2&gt;

&lt;p&gt;Imagine a simple assistant with two specialist areas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;string transformations&lt;/li&gt;
&lt;li&gt;numeric calculations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You could put all tools on one agent.&lt;br&gt;
For a small demo, that is fine.&lt;br&gt;
But the structure becomes clearer when each specialist owns its own tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
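&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System.ComponentModel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System.Linq&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// needed for text.Reverse() when implicit usings are disabled&lt;/span&gt;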
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Converts text to uppercase."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;ToUppercase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"The text to convert."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToUpperInvariant&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Reverses text."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;ReverseText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"The text to reverse."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Reverse&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;ToArray&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Multiplies two numbers."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="nf"&gt;Multiply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"The first number."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"The second number."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;left&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now create the inner agents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;stringAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"string-agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Handles string transformations such as uppercase and reverse text."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt; &lt;span class="n"&gt;strings&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;the&lt;/span&gt; &lt;span class="n"&gt;provided&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Keep&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;transformed&lt;/span&gt; &lt;span class="k"&gt;value&lt;/span&gt; &lt;span class="n"&gt;clearly&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s"&gt;""",
&lt;/span&gt;    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ToUppercase&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ReverseText&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;numberAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"number-agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Handles small deterministic numeric calculations."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;perform&lt;/span&gt; &lt;span class="n"&gt;numeric&lt;/span&gt; &lt;span class="n"&gt;calculations&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;the&lt;/span&gt; &lt;span class="n"&gt;provided&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Keep&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="kt"&gt;short&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;include&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;calculated&lt;/span&gt; &lt;span class="k"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s"&gt;""",
&lt;/span&gt;    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Multiply&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; matter because they become part of the coordinator model's routing surface once the agents are exposed as tools.&lt;br&gt;
A vague description gives the coordinator too little information to choose the right specialist reliably.&lt;/p&gt;

&lt;p&gt;Now expose those agents as tools to the coordinator.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;coordinatorAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"coordinator-agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;decide&lt;/span&gt; &lt;span class="n"&gt;whether&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="n"&gt;should&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="n"&gt;handled&lt;/span&gt; &lt;span class="n"&gt;directly&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="n"&gt;delegated&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

    &lt;span class="n"&gt;Use&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="n"&gt;transformation&lt;/span&gt; &lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Use&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;deterministic&lt;/span&gt; &lt;span class="n"&gt;numeric&lt;/span&gt; &lt;span class="n"&gt;calculations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

    &lt;span class="n"&gt;If&lt;/span&gt; &lt;span class="n"&gt;no&lt;/span&gt; &lt;span class="n"&gt;specialist&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="n"&gt;fits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="n"&gt;briefly&lt;/span&gt; &lt;span class="n"&gt;without&lt;/span&gt; &lt;span class="n"&gt;inventing&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s"&gt;""",
&lt;/span&gt;    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;stringAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIFunction&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;numberAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIFunction&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="n"&gt;AgentResponse&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;coordinatorAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"Uppercase 'hello world' and multiply 12 by 4."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The coordinator receives two tool options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Call the string agent&lt;/li&gt;
&lt;li&gt;Call the number agent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It does not receive the raw schemas for &lt;code&gt;ToUppercase&lt;/code&gt;, &lt;code&gt;ReverseText&lt;/code&gt;, and &lt;code&gt;Multiply&lt;/code&gt; as direct tools.&lt;br&gt;
Those belong to the specialist agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Saves Tokens
&lt;/h2&gt;

&lt;p&gt;The token benefit comes from narrowing the visible tool surface.&lt;/p&gt;

&lt;p&gt;Without delegation, one large agent receives every tool schema on every call:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_2.png" title="WithoutDelegation" alt="WithoutDelegation" width="800" height="689"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With agents as tools, the coordinator sees a smaller set of higher-level tools:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_3.png" title="WithAgentsAsTools" alt="WithAgentsAsTools" width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That can reduce input tokens for the coordinator call.&lt;br&gt;
But delegation also adds extra agent invocations.&lt;/p&gt;

&lt;p&gt;If the coordinator calls one inner agent, you pay for the coordinator call and the inner agent call.&lt;br&gt;
If it calls three inner agents, you pay for all of them.&lt;/p&gt;

&lt;p&gt;This pattern pays off when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Most requests only need one specialist area&lt;/li&gt;
&lt;li&gt;Inner agents have many tools that should not be visible all the time&lt;/li&gt;
&lt;li&gt;The coordinator can use a smaller model than the specialists&lt;/li&gt;
&lt;li&gt;Specialist agents can stay focused and short&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It may not pay off when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every request needs most of the inner agents anyway&lt;/li&gt;
&lt;li&gt;The tool lists are already small&lt;/li&gt;
&lt;li&gt;The coordinator needs long context just to decide what to do&lt;/li&gt;
&lt;li&gt;Latency matters more than token cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Measure it with your prompts, tools, models, and traffic shape.&lt;/p&gt;
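
&lt;p&gt;As a rough way to reason about that trade-off before measuring, the following sketch compares input tokens for one large agent against a coordinator plus specialists.&lt;br&gt;
Every number in it is a made-up placeholder, and the model is deliberately simplistic: it counts one input pass per agent call and ignores output tokens, conversation history, and the coordinator's follow-up call after a tool result.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

// All values are hypothetical placeholders. Replace them with numbers measured from your own traces.
const int InstructionTokens = 300;   // instructions sent with every agent call
const int TokensPerToolSchema = 150; // average size of one tool definition
const int RawToolCount = 20;         // raw tools on the single large agent
const int SpecialistCount = 4;       // inner agents the coordinator sees as tools
const int ToolsPerSpecialist = 5;    // raw tools owned by each specialist

int singleAgent = InputTokens(RawToolCount);
int coordinator = InputTokens(SpecialistCount);
int specialist = InputTokens(ToolsPerSpecialist);

Console.WriteLine($"Single large agent:          {singleAgent}");
Console.WriteLine($"Coordinator + 1 specialist:  {coordinator + specialist}");
Console.WriteLine($"Coordinator + 4 specialists: {coordinator + SpecialistCount * specialist}");

// One input pass: instructions plus every tool definition visible to that agent.
int InputTokens(int visibleToolCount)
{
    return InstructionTokens + visibleToolCount * TokensPerToolSchema;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these placeholder values, delegating to one specialist costs 1950 input tokens instead of 3300, while calling all four specialists costs 5100.&lt;br&gt;
That is the same trade-off as in the lists above: the pattern helps when most requests touch only one specialist area.&lt;/p&gt;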

&lt;h2&gt;
  
  
  Delegation Is Model-Driven
&lt;/h2&gt;

&lt;p&gt;This is the main difference from the previous article.&lt;/p&gt;

&lt;p&gt;With manual routing, the application decides:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_4.png" title="Manual" alt="Manual" width="796" height="59"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With agents as tools, the coordinator model decides:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_11_5.png" title="Coordinator" alt="Coordinator" width="796" height="76"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That makes agents as tools more flexible.&lt;br&gt;
The coordinator can decide to call one specialist, multiple specialists, or none.&lt;br&gt;
It can also combine results before answering the user.&lt;/p&gt;

&lt;p&gt;But it is less explicit than a C# switch.&lt;br&gt;
The model may skip a useful specialist.&lt;br&gt;
It may call the wrong one.&lt;br&gt;
It may delegate when a direct answer would have been enough.&lt;/p&gt;

&lt;p&gt;Good agent descriptions help, but they do not make delegation deterministic.&lt;/p&gt;

&lt;p&gt;Use manual routing when you need predictable application-owned control.&lt;br&gt;
Use agents as tools when the coordinator should reason about whether delegation is useful.&lt;/p&gt;
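
&lt;p&gt;For contrast, application-owned routing can be as small as the switch below.&lt;br&gt;
The &lt;code&gt;Intent&lt;/code&gt; enum, the &lt;code&gt;Router&lt;/code&gt; class, and the returned route names are hypothetical stand-ins, and the intent value is assumed to come from an earlier classification step.&lt;br&gt;
The same intent always takes the same route, which a coordinator model cannot guarantee.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

Console.WriteLine(Router.Route(Intent.StringTransformation)); // string-agent
Console.WriteLine(Router.Route(Intent.Calculation));          // number-agent
Console.WriteLine(Router.Route(Intent.Unknown));              // direct-answer

// Hypothetical intent labels produced by an earlier classification step.
enum Intent { StringTransformation, Calculation, Unknown }

static class Router
{
    // Deterministic: the application, not the model, owns this decision.
    public static string Route(Intent intent)
    {
        switch (intent)
        {
            case Intent.StringTransformation: return "string-agent";
            case Intent.Calculation: return "number-agent";
            default: return "direct-answer";
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;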

&lt;h2&gt;
  
  
  Context Does Not Automatically Transfer
&lt;/h2&gt;

&lt;p&gt;An inner agent is not just a continuation of the outer agent.&lt;br&gt;
It runs as a tool call.&lt;br&gt;
That means the inner agent has its own:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Instructions&lt;/li&gt;
&lt;li&gt;Tools&lt;/li&gt;
&lt;li&gt;Model invocation&lt;/li&gt;
&lt;li&gt;Context providers&lt;/li&gt;
&lt;li&gt;Session behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It does not automatically inherit the full outer conversation history or every piece of outer context.&lt;br&gt;
The coordinator must pass the relevant task information through the tool call.&lt;/p&gt;

&lt;p&gt;If you pass a shared &lt;code&gt;AgentSession&lt;/code&gt; to &lt;code&gt;AsAIFunction()&lt;/code&gt;, treat it carefully.&lt;br&gt;
The resulting function is stateful, so reusing it concurrently across conversations can create unpredictable behavior.&lt;/p&gt;

&lt;p&gt;This is a good boundary, but it can surprise you. Consider the coffee agent from my previous examples.&lt;/p&gt;

&lt;p&gt;If the user stated their preferred coffee ratio five turns ago, the inner coffee agent may not know it unless the coordinator passes it in or the inner agent can load it through its own memory or context setup.&lt;/p&gt;

&lt;p&gt;So, a practical rule is:&lt;/p&gt;

&lt;p&gt;Do not assume shared context.&lt;br&gt;
Deliberately pass the necessary task context, or give the inner agent its own context provider.&lt;/p&gt;
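
&lt;p&gt;One way to make that explicit, assuming your application wraps the inner agent call in its own function instead of relying on the coordinator to restate earlier turns, is to attach the relevant facts to the task text.&lt;br&gt;
&lt;code&gt;DelegatedTask.Build&lt;/code&gt; and the 1:16 ratio are hypothetical and not part of Agent Framework.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

string task = DelegatedTask.Build(
    "Suggest a recipe for 500 ml of water.",
    "Preferred coffee ratio: 1:16");

Console.WriteLine(task);

static class DelegatedTask
{
    // Hypothetical helper: appends facts the inner agent cannot see on its own.
    public static string Build(string task, params string[] knownFacts)
    {
        if (knownFacts.Length == 0)
        {
            return task; // nothing to add, pass the task through unchanged
        }

        string separator = Environment.NewLine + "- ";
        return task + Environment.NewLine
            + "Known context:" + separator
            + string.Join(separator, knownFacts);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The inner agent then receives the task together with a short list of known facts, rather than depending on history it never saw.&lt;/p&gt;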

&lt;h2&gt;
  
  
  Keep Inner Agents Focused
&lt;/h2&gt;

&lt;p&gt;The point of this pattern is not to hide complexity inside another agent.&lt;br&gt;
The point is to create a smaller, clearer capability boundary.&lt;/p&gt;

&lt;p&gt;A useful inner agent should have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a narrow responsibility&lt;/li&gt;
&lt;li&gt;a clear name&lt;/li&gt;
&lt;li&gt;a concrete description&lt;/li&gt;
&lt;li&gt;only the tools it needs&lt;/li&gt;
&lt;li&gt;short instructions&lt;/li&gt;
&lt;li&gt;predictable output expectations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the inner agent becomes another general-purpose assistant, you have only moved the problem one level down.&lt;/p&gt;

&lt;p&gt;For example, &lt;code&gt;string-agent&lt;/code&gt; is a useful boundary.&lt;br&gt;
It transforms strings.&lt;br&gt;
It has string tools.&lt;br&gt;
It does not need documentation search, ticket creation, or deployment access.&lt;/p&gt;

&lt;p&gt;By contrast, &lt;code&gt;operations-agent-that-can-do-everything&lt;/code&gt; is not a useful boundary.&lt;br&gt;
It is just the original overloaded agent with a different name.&lt;/p&gt;

&lt;h2&gt;
  
  
  Watch the Side Effects
&lt;/h2&gt;

&lt;p&gt;An agent tool can hide several lower-level tool calls behind one coordinator-visible call.&lt;br&gt;
That is useful for composition.&lt;br&gt;
It also means you need observability.&lt;/p&gt;

&lt;p&gt;The outer agent receives the inner agent's result.&lt;br&gt;
That result should not be your only audit surface.&lt;br&gt;
If the inner agent can call tools, log those tool calls where they happen.&lt;br&gt;
Use tracing or middleware so you can inspect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;which inner agent was called&lt;/li&gt;
&lt;li&gt;what input it received&lt;/li&gt;
&lt;li&gt;which tools it called&lt;/li&gt;
&lt;li&gt;whether approval was required&lt;/li&gt;
&lt;li&gt;what result it returned&lt;/li&gt;
&lt;li&gt;how long each step took&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: I will dive into observability in a later post, but for now I just want to mention that Aspire Dashboard includes a GenAI visualizer that is really helpful when debugging multi-agent setups. For background, see the OpenTelemetry article &lt;a href="https://opentelemetry.io/blog/2026/genai-observability/" rel="noopener noreferrer"&gt;GenAI observability with OpenTelemetry and Aspire Dashboard&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The same safety rule from local tools still applies:&lt;/p&gt;

&lt;p&gt;If an inner agent can spend money, change data, send messages, or trigger deployments, do not rely on the coordinator prompt as the safety boundary.&lt;br&gt;
Put approval, authorization, validation, and logging behind the actual tool that performs the action.&lt;/p&gt;

&lt;p&gt;The inner agent is a composition boundary.&lt;br&gt;
It is not a security boundary by itself.&lt;/p&gt;
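
&lt;p&gt;A minimal sketch of that rule, using hypothetical &lt;code&gt;ApprovalStore&lt;/code&gt; and &lt;code&gt;DeploymentTool&lt;/code&gt; types that are not part of Agent Framework: the approval record lives in application code, and the tool checks it no matter which agent or prompt triggered the call.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

var approvals = new ApprovalStore("staging");
var tool = new DeploymentTool(approvals);

Console.WriteLine(tool.TriggerDeployment("staging"));    // Deployment to staging started.
Console.WriteLine(tool.TriggerDeployment("production")); // Rejected: no approval recorded for production.
Console.WriteLine(tool.TriggerDeployment(" "));          // Rejected: environment is required.

// Hypothetical application-owned approval record. No model can write to it.
sealed class ApprovalStore
{
    private readonly string[] _approvedEnvironments;

    public ApprovalStore(params string[] approvedEnvironments)
    {
        _approvedEnvironments = approvedEnvironments;
    }

    public bool IsApproved(string environment)
    {
        return Array.IndexOf(_approvedEnvironments, environment) != -1;
    }
}

sealed class DeploymentTool
{
    private readonly ApprovalStore _approvals;

    public DeploymentTool(ApprovalStore approvals)
    {
        _approvals = approvals;
    }

    // Only this method would be exposed to an agent.
    // Validation, approval, and audit logging all happen here, behind the tool.
    public string TriggerDeployment(string environment)
    {
        if (string.IsNullOrWhiteSpace(environment))
        {
            return "Rejected: environment is required.";
        }

        if (!_approvals.IsApproved(environment))
        {
            Console.Error.WriteLine($"[audit] blocked deployment to {environment}");
            return $"Rejected: no approval recorded for {environment}.";
        }

        Console.Error.WriteLine($"[audit] deployment started for {environment}");
        return $"Deployment to {environment} started.";
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the approval state is not a tool parameter, neither the coordinator nor the inner agent can talk its way past the check.&lt;/p&gt;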

&lt;h2&gt;
  
  
  Agents as Tools vs. Workflows
&lt;/h2&gt;

&lt;p&gt;Agents as tools are useful for lightweight delegation.&lt;br&gt;
They are not the same as workflows.&lt;/p&gt;

&lt;p&gt;Use agents as tools when the coordinator can decide what to call and the task can complete in normal tool-calling flow.&lt;/p&gt;

&lt;p&gt;Use a workflow when the orchestration itself needs to be explicit.&lt;br&gt;
For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fixed execution order&lt;/li&gt;
&lt;li&gt;checkpoints&lt;/li&gt;
&lt;li&gt;resumability&lt;/li&gt;
&lt;li&gt;human approval between steps&lt;/li&gt;
&lt;li&gt;typed edges between stages&lt;/li&gt;
&lt;li&gt;long-running processes&lt;/li&gt;
&lt;li&gt;clearer operational visibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A coordinator agent can be enough for a simple flow like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the user request.&lt;/li&gt;
&lt;li&gt;Call the right specialist.&lt;/li&gt;
&lt;li&gt;Combine the result.&lt;/li&gt;
&lt;li&gt;Answer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A workflow is a better fit when the process itself is the product.&lt;br&gt;
For example, triage a support case, create a remediation plan, ask for approval, execute a change, and persist the result.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Agents as Tools
&lt;/h2&gt;

&lt;p&gt;Use agents as tools when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one agent is accumulating too many unrelated tools&lt;/li&gt;
&lt;li&gt;specialist agents can own clear domains&lt;/li&gt;
&lt;li&gt;the coordinator should decide whether delegation is needed&lt;/li&gt;
&lt;li&gt;different specialists should use different tools, prompts, or models&lt;/li&gt;
&lt;li&gt;the inner task can be represented as a tool call and final result&lt;/li&gt;
&lt;li&gt;you can observe and test the delegation path&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do not use agents as tools when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a simple C# function is enough&lt;/li&gt;
&lt;li&gt;a C# switch gives better control&lt;/li&gt;
&lt;li&gt;every request needs every specialist anyway&lt;/li&gt;
&lt;li&gt;the inner agent needs hidden context that cannot be passed reliably&lt;/li&gt;
&lt;li&gt;the side effects are too risky without explicit approval&lt;/li&gt;
&lt;li&gt;the process needs workflow-level checkpoints and state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This pattern is best treated as a pragmatic middle step.&lt;br&gt;
It sits between manual routing and full workflow orchestration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Agents as tools let you compose focused agents without giving one agent every raw tool.&lt;br&gt;
The coordinator sees a smaller set of specialist capabilities.&lt;br&gt;
Each inner agent keeps its own instructions and tools.&lt;/p&gt;

&lt;p&gt;That can improve focus by reducing the tool surface visible to the coordinator.&lt;br&gt;
But it also adds model-driven delegation, extra invocations, context isolation, and observability requirements.&lt;/p&gt;

&lt;p&gt;Use this pattern when delegation should be flexible and specialist boundaries are clear.&lt;br&gt;
Use manual routing when application code should decide.&lt;br&gt;
Use workflows when the orchestration needs explicit state, checkpoints, and control.&lt;/p&gt;

&lt;p&gt;Next, we will move from lightweight delegation to Agent Framework workflows: when the workflow engine is worth using, and when normal C# is still the cleaner option.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/journey/agents-as-tools" rel="noopener noreferrer"&gt;Agents as Tools&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/tools/" rel="noopener noreferrer"&gt;Tools Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/tools/function-tools" rel="noopener noreferrer"&gt;Using function tools with an agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/observability" rel="noopener noreferrer"&gt;Observability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/workflows/" rel="noopener noreferrer"&gt;Microsoft Agent Framework Workflows&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Why CancellationToken Matters More in .NET AI Systems</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Mon, 01 Jun 2026 15:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/why-cancellationtoken-matters-more-in-net-ai-systems-5271</link>
      <guid>https://dev.to/lukaswalter/why-cancellationtoken-matters-more-in-net-ai-systems-5271</guid>
      <description>&lt;p&gt;&lt;code&gt;CancellationToken&lt;/code&gt; is one of the most underrated AI engineering features in .NET.&lt;br&gt;
Not because it is new.&lt;br&gt;
Because AI workloads have a different runtime profile.&lt;/p&gt;

&lt;p&gt;A normal application call might take milliseconds.&lt;br&gt;
An LLM call might take seconds.&lt;br&gt;
A streaming response might keep running while tokens are generated.&lt;br&gt;
An embedding pipeline might process thousands of chunks.&lt;br&gt;
A tool call might trigger another slow network request.&lt;br&gt;
And sometimes, the user is already gone.&lt;br&gt;
They closed the tab.&lt;br&gt;
They navigated away.&lt;br&gt;
The HTTP request timed out.&lt;br&gt;
The background job was stopped.&lt;br&gt;
The deployment is shutting down.&lt;/p&gt;

&lt;p&gt;Without cancellation, your application may keep doing expensive work nobody needs anymore.&lt;br&gt;
That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;wasted tokens&lt;/li&gt;
&lt;li&gt;wasted compute&lt;/li&gt;
&lt;li&gt;unnecessary tool calls&lt;/li&gt;
&lt;li&gt;slower shutdowns&lt;/li&gt;
&lt;li&gt;noisy traces&lt;/li&gt;
&lt;li&gt;worse resource usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In .NET, this is not a special AI problem.&lt;br&gt;
It is a normal engineering problem that becomes much more visible in AI systems.&lt;/p&gt;

&lt;p&gt;Pass the &lt;code&gt;CancellationToken&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;From your ASP.NET Core endpoint.&lt;br&gt;
From &lt;code&gt;HttpContext.RequestAborted&lt;/code&gt;.&lt;br&gt;
Into your agent call.&lt;br&gt;
Into your &lt;code&gt;IChatClient&lt;/code&gt; call.&lt;br&gt;
Into your embedding generation.&lt;br&gt;
Into your retrieval layer.&lt;br&gt;
Into your database query.&lt;br&gt;
Into your tool execution.&lt;/p&gt;

&lt;p&gt;Especially when streaming responses with &lt;code&gt;IAsyncEnumerable&lt;/code&gt;, because the UI might stop listening long before your backend stops generating.&lt;/p&gt;

&lt;p&gt;AI engineering is not only about better prompts, better models, or better frameworks.&lt;br&gt;
It is also about respecting the lifecycle of the request.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why AI Workloads Expose This Problem
&lt;/h2&gt;

&lt;p&gt;Cancellation exists in .NET because long-running work needs a cooperative way to stop.&lt;br&gt;
That has always mattered.&lt;br&gt;
But classic CRUD workloads often hide the problem.&lt;br&gt;
If a request reads one row from a database and returns in 30 milliseconds, the cost of ignoring cancellation is small.&lt;br&gt;
It is still wrong, but it is rarely dramatic.&lt;/p&gt;

&lt;p&gt;AI workloads change that.&lt;br&gt;
An LLM call can hold an outbound HTTP connection open for seconds.&lt;br&gt;
A streaming chat endpoint can continue producing tokens even after the browser tab is closed.&lt;br&gt;
A RAG request can do retrieval, reranking, prompt construction, and model generation before the user sees anything useful.&lt;br&gt;
An ingestion job can generate embeddings for thousands of chunks.&lt;br&gt;
An agent can call tools that call other APIs.&lt;/p&gt;

&lt;p&gt;The runtime profile is wider, slower, and more expensive.&lt;br&gt;
That is why cancellation stops being a small cleanup detail.&lt;br&gt;
It becomes part of the cost and reliability model.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Mental Model
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flukaswalter.dev%2Fimages%2Fcancellation-tokens_1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flukaswalter.dev%2Fimages%2Fcancellation-tokens_1.png" title="Sequence" alt="Sequence" width="800" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;CancellationToken&lt;/code&gt; is not a timeout button.&lt;br&gt;
It is not a thread abort.&lt;br&gt;
It does not magically undo side effects.&lt;br&gt;
It also does not guarantee that a remote provider stops work or billing instantly.&lt;br&gt;
It is a cooperative signal.&lt;br&gt;
The caller says: "This work is no longer needed."&lt;br&gt;
The callee decides where it can stop safely.&lt;/p&gt;

&lt;p&gt;That distinction matters in AI systems because work often crosses several boundaries:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;HTTP request&lt;/li&gt;
&lt;li&gt;retrieval&lt;/li&gt;
&lt;li&gt;embedding or reranking&lt;/li&gt;
&lt;li&gt;model call&lt;/li&gt;
&lt;li&gt;streaming response&lt;/li&gt;
&lt;li&gt;tool execution&lt;/li&gt;
&lt;li&gt;logging and tracing&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the token is dropped at any of these boundaries, everything downstream of that point may keep running after the caller has given up.&lt;/p&gt;

&lt;p&gt;The common failure is not that developers forgot cancellation exists.&lt;br&gt;
The common failure is that cancellation only exists at the first method signature.&lt;/p&gt;
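
&lt;p&gt;A minimal sketch of that failure, assuming a hypothetical &lt;code&gt;SlowLookupAsync&lt;/code&gt; that stands in for retrieval or a model call.&lt;br&gt;
None of these method names come from a real library.&lt;br&gt;
One wrapper accepts the token and drops it, the other forwards it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical slow dependency, standing in for retrieval or a model call.
static async Task&amp;lt;string&amp;gt; SlowLookupAsync(CancellationToken cancellationToken = default)
{
    await Task.Delay(TimeSpan.FromMilliseconds(300), cancellationToken);
    return "result";
}

// Accepts a token but never forwards it, so the lookup cannot be stopped.
static Task&amp;lt;string&amp;gt; AnswerDroppingTokenAsync(CancellationToken cancellationToken)
    =&amp;gt; SlowLookupAsync();

// Forwards the token, so cancellation reaches the slow dependency.
static Task&amp;lt;string&amp;gt; AnswerForwardingTokenAsync(CancellationToken cancellationToken)
    =&amp;gt; SlowLookupAsync(cancellationToken);

using var cts = new CancellationTokenSource();
var dropped = AnswerDroppingTokenAsync(cts.Token);
var forwarded = AnswerForwardingTokenAsync(cts.Token);

// The caller signals that the result is no longer needed.
cts.Cancel();

try
{
    await Task.WhenAll(dropped, forwarded);
}
catch (OperationCanceledException)
{
    // Expected: at least one of the two tasks observed the cancellation.
}

Console.WriteLine($"Dropped token:   {dropped.Status}");
Console.WriteLine($"Forwarded token: {forwarded.Status}");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under these assumptions, the wrapper that drops the token should end as &lt;code&gt;RanToCompletion&lt;/code&gt;, because nothing ever told the lookup to stop.&lt;br&gt;
The wrapper that forwards it should end as &lt;code&gt;Canceled&lt;/code&gt;.&lt;br&gt;
Both signatures look equally correct from the outside.&lt;/p&gt;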
&lt;h2&gt;
  
  
  Start at the HTTP Boundary
&lt;/h2&gt;

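&lt;p&gt;In ASP.NET Core, a &lt;code&gt;CancellationToken&lt;/code&gt; parameter on a minimal API endpoint or controller action is bound to &lt;code&gt;HttpContext.RequestAborted&lt;/code&gt;, which is signaled when the client disconnects or the request is aborted.&lt;br&gt;
That is usually the first token you want.&lt;/p&gt;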

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapPost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/ask"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AskRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IChatClient&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ChatRole&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Answer briefly and use the provided context."&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ChatRole&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important part is not the endpoint syntax.&lt;br&gt;
The important part is that the token crosses the model boundary.&lt;/p&gt;

&lt;p&gt;If the user disconnects while the model is generating, your application should not keep waiting for the answer just so it can throw it away.&lt;/p&gt;

&lt;p&gt;In a larger application, I usually pass the token into an application service rather than calling the model directly in the endpoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapPost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/ask"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AskRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AssistantService&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AnswerAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the service owns the AI workflow, but the request still owns the lifecycle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AssistantService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;IRetrievalService&lt;/span&gt; &lt;span class="n"&gt;retrieval&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IChatClient&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;AnswerAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;retrieval&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SearchAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;BuildMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;accept the token at the boundary&lt;/li&gt;
&lt;li&gt;pass it to every async operation&lt;/li&gt;
&lt;li&gt;do not replace it with &lt;code&gt;CancellationToken.None&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;do not stop passing it once the code enters the AI layer&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Streaming Makes Cancellation More Important
&lt;/h2&gt;

&lt;p&gt;Streaming is where ignored cancellation becomes easiest to miss.&lt;br&gt;
The backend can keep generating tokens even after the browser, mobile app, or frontend stream reader is gone.&lt;br&gt;
From the user's perspective, the conversation ended.&lt;br&gt;
From the backend's perspective, the model may still be working.&lt;br&gt;
That is wasted work.&lt;br&gt;
For streaming endpoints, pass the token into the model call, and optionally into response writes when it helps the write loop exit cleanly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/ask/stream"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IChatClient&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;HttpResponse&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ContentType&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"text/plain; charset=utf-8"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ChatRole&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;update&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetStreamingResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrEmpty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FlushAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are two useful details here.&lt;br&gt;
First, the model streaming call receives the token.&lt;br&gt;
Second, each response write receives the same token.&lt;br&gt;
That matters because streaming is not one operation.&lt;br&gt;
It is a sequence.&lt;/p&gt;

&lt;p&gt;Each token chunk is another opportunity to stop.&lt;/p&gt;

&lt;p&gt;If the user interface stops listening, the backend should notice.&lt;br&gt;
If the request is aborted, the model stream should stop.&lt;br&gt;
If the deployment is shutting down, the endpoint should not keep a long stream alive just because it already started.&lt;/p&gt;

&lt;p&gt;Passing the token to &lt;code&gt;WriteAsync&lt;/code&gt; and &lt;code&gt;FlushAsync&lt;/code&gt; is fine, but upstream cancellation usually matters more.&lt;br&gt;
Once the client disconnects, ASP.NET Core may already stop or fail the response writes on its own.&lt;br&gt;
The expensive part is the model call, retrieval, tool execution, or embedding generation that keeps producing data for a response nobody will read.&lt;/p&gt;

&lt;p&gt;If you consume a streaming API that does not expose a cancellation token parameter, use &lt;code&gt;.WithCancellation(cancellationToken)&lt;/code&gt; at the enumeration site instead, which hands the token to &lt;code&gt;GetAsyncEnumerator&lt;/code&gt;.&lt;br&gt;
Do not pass the same token through both mechanisms unless the API documentation explicitly expects that pattern.&lt;/p&gt;
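
&lt;p&gt;Here is a small sketch of the &lt;code&gt;.WithCancellation&lt;/code&gt; route.&lt;br&gt;
&lt;code&gt;WordStream.StreamWordsAsync&lt;/code&gt; is a made-up producer that stands in for a streaming model response.&lt;br&gt;
The call site passes no token to it, so the token arrives only through &lt;code&gt;WithCancellation&lt;/code&gt;, and &lt;code&gt;[EnumeratorCancellation]&lt;/code&gt; is what lets a compiler-generated async iterator observe it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

using var cts = new CancellationTokenSource();
var received = 0;

try
{
    // The producer call passes no token, so WithCancellation supplies it.
    await foreach (var word in WordStream.StreamWordsAsync().WithCancellation(cts.Token))
    {
        received++;
        if (received == 3)
        {
            // Simulates the client going away mid-stream.
            cts.Cancel();
        }
    }
}
catch (OperationCanceledException)
{
    Console.WriteLine($"Stream stopped after {received} items.");
}

// Hypothetical producer, standing in for a streaming model response.
static class WordStream
{
    // [EnumeratorCancellation] routes the token given to WithCancellation
    // into this parameter.
    public static async IAsyncEnumerable&amp;lt;string&amp;gt; StreamWordsAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        for (var i = 0; ; i++)
        {
            await Task.Delay(20, cancellationToken);
            yield return $"word{i}";
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this sketch the loop receives three items, cancels, and the next &lt;code&gt;MoveNextAsync&lt;/code&gt; call ends with an &lt;code&gt;OperationCanceledException&lt;/code&gt;.&lt;br&gt;
Without the attribute, the producer would never see the token and the stream would keep going.&lt;/p&gt;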
&lt;h2&gt;
  
  
  RAG Pipelines Need the Same Discipline
&lt;/h2&gt;

&lt;p&gt;RAG systems often hide several expensive operations behind one "ask" endpoint.&lt;/p&gt;

&lt;p&gt;A single user question might do this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;rewrite the query&lt;/li&gt;
&lt;li&gt;generate an embedding&lt;/li&gt;
&lt;li&gt;search a vector index&lt;/li&gt;
&lt;li&gt;fetch source documents&lt;/li&gt;
&lt;li&gt;rerank results&lt;/li&gt;
&lt;li&gt;build the prompt&lt;/li&gt;
&lt;li&gt;call the model&lt;/li&gt;
&lt;li&gt;stream the answer&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Cancellation needs to travel through that whole chain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RagAssistant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;IQueryRewriter&lt;/span&gt; &lt;span class="n"&gt;rewriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IRetriever&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IChatClient&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;AnswerAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;rewrittenQuery&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;rewriter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RewriteAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SearchAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;rewrittenQuery&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;BuildPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This looks boring.&lt;br&gt;
That is the point.&lt;/p&gt;

&lt;p&gt;Cancellation in AI systems should not require a clever framework.&lt;br&gt;
It should be part of the ordinary method contract.&lt;/p&gt;

&lt;p&gt;If retrieval is backed by EF Core, pass the token there too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;DocumentChunk&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;LoadChunksAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;IReadOnlyCollection&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;chunkIds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DocumentChunks&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;chunkIds&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If retrieval is backed by a vector database or search service, the same rule applies.&lt;br&gt;
The outbound SDK call should receive the token.&lt;/p&gt;
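
&lt;p&gt;&lt;code&gt;HttpClient&lt;/code&gt; is the simplest concrete case, since many search and vector SDKs sit on top of it.&lt;br&gt;
The sketch below is only an illustration: the &lt;code&gt;search.example&lt;/code&gt; address and the &lt;code&gt;SlowSearchHandler&lt;/code&gt; test double are assumptions, and no real network call is made.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical search endpoint; the fake handler below means no network is used.
using var httpClient = new HttpClient(new SlowSearchHandler())
{
    BaseAddress = new Uri("https://search.example/")
};

using var cts = new CancellationTokenSource();

// The token travels with the outbound request.
var search = httpClient.GetAsync("search?q=cancellation", cts.Token);

// The caller gives up while the request is still in flight.
cts.Cancel();

var searchCanceled = false;
try
{
    await search;
}
catch (OperationCanceledException)
{
    searchCanceled = true;
}

Console.WriteLine($"Search canceled: {searchCanceled}");

// Test double that simulates a slow search service and honors the token.
sealed class SlowSearchHandler : HttpMessageHandler
{
    protected override async Task&amp;lt;HttpResponseMessage&amp;gt; SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
        return new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent("[]")
        };
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The request ends as soon as the token is canceled instead of waiting out the five-second delay.&lt;br&gt;
Whether the remote service also stops its own work is a separate question, and the token cannot answer it.&lt;/p&gt;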
&lt;h2&gt;
  
  
  Embedding Jobs Are Cancellation Hotspots
&lt;/h2&gt;

&lt;p&gt;Embedding generation is another place where cancellation gets ignored.&lt;/p&gt;

&lt;p&gt;It often runs outside the request path, so developers treat it as batch work that can simply run until it finishes.&lt;br&gt;
Sometimes that is fine.&lt;br&gt;
But ingestion jobs still need to stop cleanly during deployment, shutdown, or operational intervention.&lt;/p&gt;

&lt;p&gt;If you process thousands of chunks, check cancellation between batches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt; &lt;span class="nf"&gt;IndexDocumentsAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;IEnumerable&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;DocumentChunk&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IEmbeddingGenerator&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Embedding&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IVectorIndex&lt;/span&gt; &lt;span class="n"&gt;vectorIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;64&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ThrowIfCancellationRequested&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GenerateAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;vectorIndex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UpsertAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The batch boundary is a natural safe point.&lt;/p&gt;

&lt;p&gt;You do not want to stop halfway through a local in-memory projection just because a token was canceled.&lt;br&gt;
But you do want to stop before generating the next expensive batch of embeddings.&lt;/p&gt;
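
&lt;p&gt;To make the safe point visible, here is a small simulation.&lt;br&gt;
&lt;code&gt;EmbedBatchAsync&lt;/code&gt; is a hypothetical stand-in for the embedding call, and it cancels the source after the second batch to imitate a shutdown signal arriving mid-job.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

using var cts = new CancellationTokenSource();
var embeddedBatches = 0;

// Hypothetical embedding call. It cancels the source after the second batch
// to simulate a shutdown signal arriving in the middle of the job.
async Task EmbedBatchAsync(string[] batch, CancellationToken cancellationToken)
{
    await Task.Delay(10, cancellationToken);
    embeddedBatches++;
    if (embeddedBatches == 2)
    {
        cts.Cancel();
    }
}

// Ten chunks in batches of four gives three batches in total.
var chunks = Enumerable.Range(0, 10).Select(i =&amp;gt; $"chunk {i}");

try
{
    foreach (var batch in chunks.Chunk(4))
    {
        // Safe point: the previous batch is complete, the next has not started.
        cts.Token.ThrowIfCancellationRequested();
        await EmbedBatchAsync(batch, cts.Token);
    }
}
catch (OperationCanceledException)
{
    Console.WriteLine($"Stopped after {embeddedBatches} completed batches.");
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Two batches finish, the third never starts.&lt;br&gt;
Nothing is left half-processed, which is what makes the batch boundary a good place to check.&lt;/p&gt;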

&lt;p&gt;For ingestion pipelines, cancellation is also useful for shutdown behavior.&lt;br&gt;
A background service that ignores its stopping token can make deployments slower and less predictable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingWorker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;EmbeddingQueue&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DocumentIndexer&lt;/span&gt; &lt;span class="n"&gt;indexer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BackgroundService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt; &lt;span class="nf"&gt;ExecuteAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;stoppingToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;job&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ReadAllAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stoppingToken&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;indexer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IndexAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stoppingToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;stoppingToken&lt;/code&gt; is not decoration.&lt;br&gt;
It is the host telling your worker that the process is trying to stop.&lt;/p&gt;
&lt;h2&gt;
  
  
  Tools Need Cancellation Too
&lt;/h2&gt;

&lt;p&gt;Tool calling makes this more important, not less.&lt;/p&gt;

&lt;p&gt;An agent tool is still application code.&lt;br&gt;
It might query a database, call an internal API, invoke a search service, read files, or trigger another model call.&lt;br&gt;
If the parent request is canceled, the tool should not keep doing unnecessary work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Searches internal documentation for relevant snippets."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IReadOnlyList&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;SearchResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;SearchDocsAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IServiceProvider&lt;/span&gt; &lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IDocumentSearch&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SearchAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model does not need to know about the token.&lt;br&gt;
Your application does.&lt;/p&gt;

&lt;p&gt;This is an important boundary.&lt;br&gt;
Model-supplied arguments are untrusted input.&lt;br&gt;
The &lt;code&gt;CancellationToken&lt;/code&gt; comes from your runtime.&lt;br&gt;
Do not let agent abstractions make you forget that tool execution still belongs to your application lifecycle.&lt;/p&gt;

&lt;p&gt;This assumes your tool framework treats &lt;code&gt;IServiceProvider&lt;/code&gt; and &lt;code&gt;CancellationToken&lt;/code&gt; as runtime-supplied parameters, not model-supplied parameters.&lt;br&gt;
If a framework exposes every method parameter to the model schema, do not expose application services or lifecycle tokens that way.&lt;/p&gt;
&lt;h2&gt;
  
  
  Timeouts Are Policy, Cancellation Is Plumbing
&lt;/h2&gt;

&lt;p&gt;Cancellation tokens are often used to implement timeouts, but they are not the same thing.&lt;/p&gt;

&lt;p&gt;A timeout is a policy decision.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;this chat endpoint should stop after 30 seconds&lt;/li&gt;
&lt;li&gt;this retrieval call should stop after 2 seconds&lt;/li&gt;
&lt;li&gt;this embedding batch should stop after 5 minutes&lt;/li&gt;
&lt;li&gt;this background worker should stop when the host shuts down&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The token is how that decision travels through the code.&lt;/p&gt;

&lt;p&gt;If you need a request cancellation token and an internal timeout, link them at the edge where the policy is visible.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapPost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/ask"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AskRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AssistantService&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;requestAborted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;timeoutCts&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;CancellationTokenSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromSeconds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;30&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;linkedCts&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CancellationTokenSource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateLinkedTokenSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;requestAborted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;timeoutCts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Token&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AnswerAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;linkedCts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Token&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OperationCanceledException&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;requestAborted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsCancellationRequested&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// 499 Client Closed Request is a non-standard nginx convention;&lt;/span&gt;
        &lt;span class="c1"&gt;// .NET 8+ also exposes it as StatusCodes.Status499ClientClosedRequest.&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;499&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OperationCanceledException&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeoutCts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsCancellationRequested&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;StatusCodes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Status504GatewayTimeout&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flukaswalter.dev%2Fimages%2Fcancellation-tokens_2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flukaswalter.dev%2Fimages%2Fcancellation-tokens_2.png" title="Flowchart" alt="Flowchart" width="800" height="1154"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This keeps two cases separate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the user went away&lt;/li&gt;
&lt;li&gt;your system decided the operation took too long&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those should not be logged, alerted on, or retried in the same way.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Not To Do
&lt;/h2&gt;

&lt;p&gt;The mistake I look for first is &lt;code&gt;CancellationToken.None&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;CancellationToken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;None&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That says: "Even if the caller is gone, keep going."&lt;/p&gt;

&lt;p&gt;Sometimes that is intentional.&lt;br&gt;
Most of the time, it is accidental.&lt;/p&gt;
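
&lt;p&gt;One case where not forwarding the caller's token can be intentional is cleanup that must still run after cancellation. The sketch below is an assumption-laden illustration: &lt;code&gt;LeasedJob&lt;/code&gt; and its lease are hypothetical. It uses a fresh, short-lived token instead of &lt;code&gt;CancellationToken.None&lt;/code&gt;, so the cleanup ignores the caller's cancellation but stays bounded.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class LeasedJob
{
    public bool LeaseReleased { get; private set; }

    // Hypothetical stand-in for releasing a remote lease or lock.
    private async Task ReleaseLeaseAsync(CancellationToken cancellationToken)
    {
        await Task.Delay(10, cancellationToken);
        LeaseReleased = true;
    }

    public async Task RunAsync(CancellationToken cancellationToken)
    {
        try
        {
            // Simulated long-running work that honors the caller's token.
            await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);
        }
        finally
        {
            // Deliberately not the caller's token: it may already be canceled,
            // and the lease still has to be released. The timeout keeps the
            // cleanup from waiting indefinitely.
            using var cleanupCts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
            await ReleaseLeaseAsync(cleanupCts.Token);
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The comment next to the token is part of the pattern. It tells the next reviewer that dropping the caller's token was a decision, not an oversight.&lt;/p&gt;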

&lt;p&gt;Another mistake is accepting a token but only using it in the first call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;AnswerAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SearchAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;BuildPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The retrieval call can cancel.&lt;br&gt;
The model call cannot.&lt;/p&gt;

&lt;p&gt;That is exactly the wrong place to lose the token, because the model call is often the slowest and most expensive part of the operation.&lt;/p&gt;
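
&lt;p&gt;A corrected version might look like the sketch below. To keep it self-contained, &lt;code&gt;IRetriever&lt;/code&gt; and &lt;code&gt;IAnswerModel&lt;/code&gt; are hypothetical stand-ins for the retriever and chat client above, not real library interfaces. The only change that matters is that the same token reaches both calls.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins for the retriever and chat client used above.
public interface IRetriever
{
    Task&amp;lt;IReadOnlyList&amp;lt;string&amp;gt;&amp;gt; SearchAsync(
        string question,
        CancellationToken cancellationToken);
}

public interface IAnswerModel
{
    Task&amp;lt;string&amp;gt; GetResponseAsync(
        string prompt,
        CancellationToken cancellationToken);
}

public sealed class AssistantService(IRetriever retriever, IAnswerModel model)
{
    public async Task&amp;lt;string&amp;gt; AnswerAsync(
        string question,
        CancellationToken cancellationToken)
    {
        var documents = await retriever.SearchAsync(question, cancellationToken);

        var context = string.Join("\n", documents);
        var prompt = $"{question}\n\nContext:\n{context}";

        // Same token as the retrieval call, so the model call can be canceled too.
        return await model.GetResponseAsync(prompt, cancellationToken);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In interfaces you own, declaring the token without a default value is one way to turn a forgotten token into a compile error instead of a silent &lt;code&gt;default&lt;/code&gt;.&lt;/p&gt;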
&lt;h2&gt;
  
  
  Logging Cancellation Like a Failure Creates Noise
&lt;/h2&gt;

&lt;p&gt;Expected cancellation is not the same as failure.&lt;/p&gt;

&lt;p&gt;If a user closes a browser tab while a streaming answer is being generated, that is not a model outage.&lt;br&gt;
If the host is shutting down and a background embedding worker stops between batches, that is not an ingestion error.&lt;/p&gt;

&lt;p&gt;Log cancellation separately.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AnswerAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OperationCanceledException&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsCancellationRequested&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogInformation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"AI request canceled before completion."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;throw&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;throw&lt;/code&gt; is intentional.&lt;br&gt;
The method can add local context, but the boundary should decide how cancellation maps to the transport response, trace status, or job state.&lt;/p&gt;

&lt;p&gt;For AI workloads, this matters because traces can get noisy quickly.&lt;br&gt;
If every user-aborted stream looks like an application error, your observability gets worse instead of better.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Checklist for .NET AI Code
&lt;/h2&gt;

&lt;p&gt;When I review AI code in .NET, I check this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does the ASP.NET Core endpoint accept a &lt;code&gt;CancellationToken&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;Is &lt;code&gt;HttpContext.RequestAborted&lt;/code&gt; used when the token is not injected directly?&lt;/li&gt;
&lt;li&gt;Does the token reach the agent or &lt;code&gt;IChatClient&lt;/code&gt; call?&lt;/li&gt;
&lt;li&gt;Does streaming use cancellation while consuming &lt;code&gt;IAsyncEnumerable&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;Does the token reach embedding generation?&lt;/li&gt;
&lt;li&gt;Does retrieval pass the token into search, vector, and database calls?&lt;/li&gt;
&lt;li&gt;Do tools accept and pass the token?&lt;/li&gt;
&lt;li&gt;Do background services use &lt;code&gt;stoppingToken&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;Are timeout policies explicit and close to the boundary?&lt;/li&gt;
&lt;li&gt;Are linked token sources disposed?&lt;/li&gt;
&lt;li&gt;Is &lt;code&gt;CancellationToken.None&lt;/code&gt; used only where ignoring cancellation is intentional?&lt;/li&gt;
&lt;li&gt;Are expected cancellations logged differently from real failures?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most of this is not AI-specific syntax.&lt;br&gt;
It is ordinary .NET discipline applied to AI runtime behavior.&lt;/p&gt;
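
&lt;p&gt;The streaming item in the checklist is the easiest one to miss, so here is a small sketch of it. &lt;code&gt;StreamingDemo&lt;/code&gt; and its fake token stream are hypothetical; a real streaming model call would take the producer's place. &lt;code&gt;WithCancellation&lt;/code&gt; and &lt;code&gt;[EnumeratorCancellation]&lt;/code&gt; are the standard .NET pieces that carry the token from the consumer into the producer.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class StreamingDemo
{
    // Hypothetical token stream standing in for a streaming model response.
    public static async IAsyncEnumerable&amp;lt;string&amp;gt; StreamAnswerAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        foreach (var piece in new[] { "Pass ", "the ", "token." })
        {
            // Observes cancellation between chunks.
            await Task.Delay(20, cancellationToken);
            yield return piece;
        }
    }

    public static async Task&amp;lt;string&amp;gt; ReadAllAsync(CancellationToken cancellationToken)
    {
        var text = new StringBuilder();

        // WithCancellation flows the caller's token into the enumerator, where
        // [EnumeratorCancellation] binds it to the producer's parameter.
        await foreach (var piece in StreamAnswerAsync().WithCancellation(cancellationToken))
        {
            text.Append(piece);
        }

        return text.ToString();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Passing the token directly as an argument to &lt;code&gt;StreamAnswerAsync&lt;/code&gt; works as well. &lt;code&gt;WithCancellation&lt;/code&gt; is mainly useful when the consumer receives an &lt;code&gt;IAsyncEnumerable&lt;/code&gt; that someone else created.&lt;/p&gt;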

&lt;h2&gt;
  
  
  When To Use Cancellation Tokens
&lt;/h2&gt;

&lt;p&gt;Use cancellation tokens in AI systems when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a user request may be aborted&lt;/li&gt;
&lt;li&gt;a streaming response may be disconnected&lt;/li&gt;
&lt;li&gt;a model call may run longer than the user is willing to wait&lt;/li&gt;
&lt;li&gt;retrieval or reranking has a request-level deadline&lt;/li&gt;
&lt;li&gt;embedding generation runs in batches&lt;/li&gt;
&lt;li&gt;tools call databases, APIs, or other slow dependencies&lt;/li&gt;
&lt;li&gt;background workers need to stop cleanly during deployment&lt;/li&gt;
&lt;li&gt;retries should stop when the parent operation is no longer relevant&lt;/li&gt;
&lt;/ul&gt;
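
&lt;p&gt;The last item is worth a sketch, because retry loops are where canceled work often keeps running. The &lt;code&gt;Retry&lt;/code&gt; helper below is hypothetical and deliberately small; a resilience library would normally fill this role. The token is checked before each attempt and also cuts the backoff delay short.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class Retry
{
    public static async Task&amp;lt;T&amp;gt; WithBackoffAsync&amp;lt;T&amp;gt;(
        Func&amp;lt;CancellationToken, Task&amp;lt;T&amp;gt;&amp;gt; operation,
        int maxAttempts,
        TimeSpan baseDelay,
        CancellationToken cancellationToken)
    {
        for (var attempt = 1; ; attempt++)
        {
            // No further attempts once the parent operation is canceled.
            cancellationToken.ThrowIfCancellationRequested();

            try
            {
                return await operation(cancellationToken);
            }
            catch (HttpRequestException) when (attempt &amp;lt; maxAttempts)
            {
                // Linear backoff; the token also cuts the wait short.
                await Task.Delay(baseDelay * attempt, cancellationToken);
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;OperationCanceledException&lt;/code&gt; is not caught by the filter, so cancellation propagates immediately instead of being retried.&lt;/p&gt;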

&lt;h2&gt;
  
  
  When Not To Rely On Cancellation Tokens
&lt;/h2&gt;

&lt;p&gt;Do not use cancellation tokens as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a substitute for idempotency&lt;/li&gt;
&lt;li&gt;a substitute for transactions&lt;/li&gt;
&lt;li&gt;a replacement for retry and circuit-breaker policy&lt;/li&gt;
&lt;li&gt;a guarantee that remote providers stopped billing instantly&lt;/li&gt;
&lt;li&gt;a way to pretend side effects can be undone&lt;/li&gt;
&lt;li&gt;a reason to swallow &lt;code&gt;OperationCanceledException&lt;/code&gt; and return success&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cancellation is not the whole reliability story.&lt;br&gt;
It is the lifecycle signal that lets the rest of the story behave correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The forgotten AI feature in .NET is not flashy.&lt;br&gt;
It is probably already in your method signature: &lt;code&gt;CancellationToken&lt;/code&gt;.&lt;br&gt;
In small CRUD paths, ignoring it might only waste a few milliseconds.&lt;br&gt;
In AI systems, ignoring it can waste model calls, tokens, tool executions, embedding batches, and shutdown time.&lt;/p&gt;

&lt;p&gt;Better prompts and better models matter.&lt;br&gt;
But so does respecting the lifecycle of the request.&lt;br&gt;
Pass the token.&lt;/p&gt;

&lt;h2&gt;
  
  
  Official References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Use the &lt;code&gt;IChatClient&lt;/code&gt; interface: &lt;a href="https://learn.microsoft.com/en-us/dotnet/ai/ichatclient" rel="noopener noreferrer"&gt;https://learn.microsoft.com/en-us/dotnet/ai/ichatclient&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Cancellation in managed threads: &lt;a href="https://learn.microsoft.com/dotnet/standard/threading/cancellation-in-managed-threads" rel="noopener noreferrer"&gt;https://learn.microsoft.com/dotnet/standard/threading/cancellation-in-managed-threads&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CancellationToken&lt;/code&gt;: &lt;a href="https://learn.microsoft.com/dotnet/api/system.threading.cancellationtoken" rel="noopener noreferrer"&gt;https://learn.microsoft.com/dotnet/api/system.threading.cancellationtoken&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;HttpContext.RequestAborted&lt;/code&gt;: &lt;a href="https://learn.microsoft.com/dotnet/api/microsoft.aspnetcore.http.httpcontext.requestaborted" rel="noopener noreferrer"&gt;https://learn.microsoft.com/dotnet/api/microsoft.aspnetcore.http.httpcontext.requestaborted&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>ai</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Manual Multi-Agent Routing in .NET</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Thu, 28 May 2026 15:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/manual-multi-agent-routing-in-net-30cg</link>
      <guid>https://dev.to/lukaswalter/manual-multi-agent-routing-in-net-30cg</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 10 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_10/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So far, our agent can answer questions, stream responses, remember history, reduce context, use tools, return structured output, connect to MCP and load Agent Skills.&lt;br&gt;
That is already a lot.&lt;br&gt;
And that is exactly where the next problem starts.&lt;/p&gt;

&lt;p&gt;It is tempting to keep adding instructions and tools to one large agent.&lt;br&gt;
Make it a coding assistant.&lt;br&gt;
Also make it a music expert.&lt;br&gt;
Also make it answer questions about coffee.&lt;br&gt;
Also give it support tools, documentation tools and internal process rules.&lt;/p&gt;

&lt;p&gt;At some point, the agent becomes a jack of all trades.&lt;br&gt;
The prompt grows.&lt;br&gt;
The tool list grows.&lt;br&gt;
The model has more irrelevant instructions to ignore.&lt;br&gt;
And you pay for unnecessary input tokens on every request, even when the user only asks a simple question.&lt;/p&gt;

&lt;p&gt;One practical way out is manual routing.&lt;br&gt;
Instead of giving one agent every responsibility, we split the system into smaller specialized agents and put a cheap intent agent in front of them.&lt;br&gt;
The intent agent does not answer the user.&lt;br&gt;
It only decides where the request should go.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Problem with One Large Agent
&lt;/h2&gt;

&lt;p&gt;A single large agent looks simple at first. You just have one entry point for everything. But the simplicity is misleading.&lt;br&gt;
If the same agent knows about coffee, music, support tickets, code review and internal documentation, each request carries baggage.&lt;br&gt;
A question about a guitar amp still pays for coffee instructions.&lt;br&gt;
A coffee question still carries music instructions.&lt;br&gt;
A small talk message still sees tool descriptions it will never need.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_10_1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_10_1.png" title="SingleAgent" alt="SingleAgent" width="800" height="179"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This creates three problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More input tokens per request&lt;/li&gt;
&lt;li&gt;More room for the model to pick the wrong behavior&lt;/li&gt;
&lt;li&gt;Prompts that are harder to maintain and test&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is not to create a fancy multi-agent architecture.&lt;br&gt;
The goal is to keep each model call focused.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Intent Agent
&lt;/h2&gt;

&lt;p&gt;The intent agent is a dispatcher.&lt;br&gt;
It sits at the front of the system and classifies the user request.&lt;br&gt;
For this example, we keep the domain intentionally small:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Coffee questions go to a coffee expert agent&lt;/li&gt;
&lt;li&gt;Music questions go to a music expert agent&lt;/li&gt;
&lt;li&gt;Everything else gets a controlled fallback&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The intent agent has one strict rule:&lt;/p&gt;

&lt;p&gt;It classifies the request.&lt;br&gt;
It does not answer the request.&lt;/p&gt;

&lt;p&gt;That rule matters.&lt;br&gt;
If the router starts answering directly, it becomes another general-purpose assistant.&lt;br&gt;
Then the system has two problems instead of one.&lt;/p&gt;

&lt;p&gt;Because the intent agent only classifies, it can usually use a smaller and cheaper model than the specialist agents.&lt;br&gt;
Choosing a route rarely needs deep reasoning.&lt;br&gt;
You need a reliable category.&lt;/p&gt;
&lt;h2&gt;
  
  
  Use Structured Output for the Routing Decision
&lt;/h2&gt;

&lt;p&gt;Do not let the intent agent return plain text like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;This is probably a music question.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That would force your application to parse generated text.&lt;br&gt;
String parsing is a weak boundary.&lt;br&gt;
We already looked at this in the structured output article.&lt;/p&gt;

&lt;p&gt;For routing, define a small C# contract instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;UserIntent&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Coffee&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Music&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Other&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;IntentResult&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;UserIntent&lt;/span&gt; &lt;span class="n"&gt;Intent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;Confidence&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Reason&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the router returns data your application can use directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;intentAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;smallChatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"intent-router"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;classify&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;routing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

    &lt;span class="n"&gt;Return&lt;/span&gt; &lt;span class="n"&gt;Coffee&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="n"&gt;asks&lt;/span&gt; &lt;span class="n"&gt;about&lt;/span&gt; &lt;span class="n"&gt;brewing&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;beans&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;grind&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;extraction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;espresso&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="n"&gt;coffee&lt;/span&gt; &lt;span class="n"&gt;gear&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

    &lt;span class="n"&gt;Return&lt;/span&gt; &lt;span class="n"&gt;Music&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="n"&gt;asks&lt;/span&gt; &lt;span class="n"&gt;about&lt;/span&gt; &lt;span class="n"&gt;songs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;albums&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;artists&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;instruments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;music&lt;/span&gt; &lt;span class="n"&gt;theory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="n"&gt;tone&lt;/span&gt; &lt;span class="n"&gt;recommendations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

    &lt;span class="n"&gt;Return&lt;/span&gt; &lt;span class="n"&gt;Other&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="n"&gt;does&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="n"&gt;clearly&lt;/span&gt; &lt;span class="n"&gt;belong&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;Coffee&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="n"&gt;Music&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

    &lt;span class="n"&gt;Do&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Only&lt;/span&gt; &lt;span class="n"&gt;classify&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s"&gt;""");
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then call it with &lt;code&gt;RunAsync&amp;lt;T&amp;gt;&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;userMessage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"How do I get a dirty Hendrix tone on my Strat?"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IntentResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;intentResponse&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;intentAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IntentResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;
        &lt;span class="s"&gt;$"""
&lt;/span&gt;        &lt;span class="n"&gt;Classify&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

        &lt;span class="n"&gt;User&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="s"&gt;""");
&lt;/span&gt;
&lt;span class="n"&gt;IntentResult&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;intentResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The useful part is the boundary.&lt;br&gt;
Your application does not receive a sentence that it still has to interpret.&lt;br&gt;
It receives an &lt;code&gt;IntentResult&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Structured output support still depends on the agent type, provider, model and chat client.&lt;br&gt;
With &lt;code&gt;ChatClientAgent&lt;/code&gt; and compatible chat clients, &lt;code&gt;RunAsync&amp;lt;T&amp;gt;&lt;/code&gt; is the cleanest option when the output type is known at compile time.&lt;br&gt;
If your provider does not support this reliably, use an explicit JSON schema via response format or add a retry and validation layer.&lt;/p&gt;
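
&lt;p&gt;A retry and validation layer can stay small.&lt;br&gt;
The following is a minimal sketch in plain C#, not an Agent Framework API: &lt;code&gt;RunValidatedAsync&lt;/code&gt;, its parameters and the stand-in delegate are hypothetical names used for illustration.&lt;br&gt;
It assumes you pass in the model call as a delegate, plus a predicate that decides whether the result is usable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Threading.Tasks;

// Stand-in for a model call: the first attempt returns an unusable empty value.
int calls = 0;
string? value = await RunValidatedAsync(
    () =&amp;gt; Task.FromResult(++calls &amp;lt; 2 ? "" : "Music"),
    s =&amp;gt; s.Length &amp;gt; 0);

Console.WriteLine($"{value} after {calls} calls"); // prints "Music after 2 calls"

// Hypothetical helper, not part of Agent Framework.
// Repeats the call until the validator accepts the result or attempts run out.
static async Task&amp;lt;T?&amp;gt; RunValidatedAsync&amp;lt;T&amp;gt;(
    Func&amp;lt;Task&amp;lt;T&amp;gt;&amp;gt; call,
    Func&amp;lt;T, bool&amp;gt; isValid,
    int maxAttempts = 3) where T : class
{
    for (int attempt = 1; attempt &amp;lt;= maxAttempts; attempt++)
    {
        try
        {
            T result = await call();
            if (result is not null &amp;amp;&amp;amp; isValid(result))
            {
                return result;
            }
        }
        catch (Exception) when (attempt &amp;lt; maxAttempts)
        {
            // Assumption: a failed attempt is worth retrying.
            // On the last attempt the exception propagates to the caller.
        }
    }

    // No valid result: the caller decides on a fallback.
    return null;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In the routing example, the delegate would wrap the &lt;code&gt;RunAsync&amp;lt;IntentResult&amp;gt;&lt;/code&gt; call, and the predicate could check that the confidence lies between 0 and 1.&lt;br&gt;
A &lt;code&gt;null&lt;/code&gt; result then maps naturally to the same fallback path as an unclear intent.&lt;/p&gt;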
&lt;h2&gt;
  
  
  C# Takes the Wheel
&lt;/h2&gt;

&lt;p&gt;Once you have an &lt;code&gt;IntentResult&lt;/code&gt;, stop asking the model to orchestrate.&lt;br&gt;
Use normal C#.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_10_2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_10_2.png" title="IntentAgent" alt="IntentAgent" width="800" height="864"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;coffeeAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;coffeeChatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"coffee-expert"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="n"&gt;coffee&lt;/span&gt; &lt;span class="n"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Be&lt;/span&gt; &lt;span class="n"&gt;practical&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;specific&lt;/span&gt; &lt;span class="n"&gt;about&lt;/span&gt; &lt;span class="n"&gt;brewing&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;beans&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ratios&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;equipment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s"&gt;""");
&lt;/span&gt;
&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;musicAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;musicChatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"music-expert"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="n"&gt;music&lt;/span&gt; &lt;span class="n"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Be&lt;/span&gt; &lt;span class="n"&gt;practical&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;specific&lt;/span&gt; &lt;span class="n"&gt;about&lt;/span&gt; &lt;span class="n"&gt;instruments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tone&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;artists&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;recordings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s"&gt;""");
&lt;/span&gt;
&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;finalAnswer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Intent&lt;/span&gt; &lt;span class="k"&gt;switch&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;UserIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Coffee&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;coffeeAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="n"&gt;UserIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Music&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;musicAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="s"&gt;"I can help with coffee or music questions. Please rephrase the request."&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the core pattern:&lt;/p&gt;

&lt;p&gt;The intent agent classifies.&lt;br&gt;
C# routes.&lt;br&gt;
The specialist agent answers.&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;Other&lt;/code&gt;, no second model call is required.&lt;br&gt;
You can return a fixed message, ask a clarification question, show supported topics, or route to a generic fallback agent if that makes sense for your application.&lt;/p&gt;

&lt;p&gt;The important point is control.&lt;br&gt;
The model does not decide which expensive agent gets called.&lt;br&gt;
Your application does.&lt;/p&gt;
&lt;h2&gt;
  
  
  Add Confidence Before Routing Expensive Work
&lt;/h2&gt;

&lt;p&gt;Do not blindly trust the router.&lt;br&gt;
The router is still an LLM call.&lt;br&gt;
It can be wrong.&lt;br&gt;
It can be uncertain.&lt;br&gt;
It can over-classify vague requests.&lt;/p&gt;

&lt;p&gt;That is why the &lt;code&gt;IntentResult&lt;/code&gt; includes a confidence score.&lt;br&gt;
But treat that confidence as a routing signal, not as truth.&lt;/p&gt;

&lt;p&gt;Model-generated confidence is not automatically calibrated.&lt;br&gt;
A result with &lt;code&gt;0.9&lt;/code&gt; does not necessarily mean the route is correct 90% of the time.&lt;br&gt;
It only means the router expressed high confidence.&lt;/p&gt;

&lt;p&gt;You can still use it as a practical gate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Confidence&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;0.75&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Is this about coffee or music?"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;finalAnswer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Intent&lt;/span&gt; &lt;span class="k"&gt;switch&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;UserIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Coffee&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;coffeeAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="n"&gt;UserIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Music&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;musicAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="s"&gt;"I can help with coffee or music questions."&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exact threshold depends on your application.&lt;br&gt;
For a casual assistant, a wrong route may not matter much.&lt;br&gt;
For support automation, routing errors can waste time or trigger the wrong downstream process.&lt;/p&gt;

&lt;p&gt;Measure this with real examples.&lt;br&gt;
Do not tune the threshold from intuition alone.&lt;/p&gt;

&lt;p&gt;Create a small labeled dataset of representative user requests.&lt;br&gt;
Run the intent agent against it.&lt;br&gt;
Track how often each intent is classified correctly.&lt;br&gt;
Then decide where the confidence threshold should sit.&lt;/p&gt;
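
&lt;p&gt;The measurement steps above can be sketched as a small evaluation loop.&lt;br&gt;
This is a sketch under stated assumptions: the samples are invented placeholder data, and in practice each row would come from running the intent agent against one labeled request.&lt;br&gt;
For each candidate threshold, it reports how many requests would be routed and how many of those routes are correct:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Linq;

// Hypothetical evaluation rows: expected label, predicted label, router confidence.
var samples = new (string Expected, string Predicted, double Confidence)[]
{
    ("Coffee", "Coffee", 0.95),
    ("Music",  "Music",  0.90),
    ("Music",  "Coffee", 0.60),
    ("Other",  "Music",  0.70),
    ("Coffee", "Coffee", 0.80),
};

foreach (double threshold in new[] { 0.5, 0.75, 0.9 })
{
    // Requests at or above the threshold are routed; the rest go to the fallback.
    var routed = samples.Where(s =&amp;gt; s.Confidence &amp;gt;= threshold).ToArray();
    int correct = routed.Count(s =&amp;gt; s.Expected == s.Predicted);

    // Coverage: share of all requests that get routed at this threshold.
    double coverage = (double)routed.Length / samples.Length;

    // Routed accuracy: share of routed requests that went to the right place.
    double accuracy = routed.Length == 0 ? 0 : (double)correct / routed.Length;

    Console.WriteLine(
        $"threshold={threshold:F2} coverage={coverage:P0} routedAccuracy={accuracy:P0}");
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Coverage and routed accuracy usually pull in opposite directions.&lt;br&gt;
The threshold is therefore a tradeoff between how often the fallback triggers and how many wrong routes you accept.&lt;/p&gt;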

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_10_4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_10_4.png" title="Confidence" alt="Confidence" width="800" height="1249"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Confidence is useful.&lt;br&gt;
But only after you have checked how it behaves in your actual domain.&lt;/p&gt;
&lt;h2&gt;
  
  
  Make Routing Observable
&lt;/h2&gt;

&lt;p&gt;Manual routing also gives you a clean evaluation point.&lt;br&gt;
Because the router returns a typed result before any specialist agent runs, you can log the routing decision separately from the final answer.&lt;br&gt;
For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogInformation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"Intent routed. Intent={Intent}, Confidence={Confidence}, SelectedAgent={SelectedAgent}, FallbackUsed={FallbackUsed}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;selectedAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;fallbackUsed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Useful fields include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;userMessage&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;predictedIntent&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;confidence&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;selectedAgent&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;fallbackUsed&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a real system, you may not want to log the full user message directly.&lt;br&gt;
Depending on your privacy and compliance requirements, you might log a request id, a redacted message or a hashed reference instead.&lt;/p&gt;
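
&lt;p&gt;A hashed reference can be produced with the standard .NET cryptography APIs.&lt;br&gt;
This is a minimal sketch: &lt;code&gt;ToLogReference&lt;/code&gt; is a hypothetical helper name, and truncating to 16 hex characters is an arbitrary choice for illustration.&lt;br&gt;
Note that an unsalted hash of a short message can still be guessed by hashing likely inputs, so a request id or a keyed hash may fit stricter requirements better:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Security.Cryptography;
using System.Text;

string reference = ToLogReference("How do I get a dirty Hendrix tone on my Strat?");
Console.WriteLine(reference.Length); // prints 16

// Hypothetical helper: returns a short, stable reference for a message
// so log entries can be correlated without storing the text itself.
static string ToLogReference(string userMessage)
{
    byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(userMessage));

    // SHA-256 yields 64 hex characters; keep the first 16 for this sketch.
    return Convert.ToHexString(hash)[..16];
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The same message always maps to the same reference, which keeps routing logs comparable across runs.&lt;/p&gt;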

&lt;p&gt;The important part is that routing becomes measurable.&lt;br&gt;
You can now answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which intents are confused most often?&lt;/li&gt;
&lt;li&gt;How often does the fallback trigger?&lt;/li&gt;
&lt;li&gt;Which confidence ranges produce the most wrong routes?&lt;/li&gt;
&lt;li&gt;Are some user request types consistently misclassified?&lt;/li&gt;
&lt;li&gt;Did a model upgrade improve or damage routing quality?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is one of the underrated benefits of manual routing.&lt;br&gt;
You are not only saving tokens.&lt;br&gt;
You are creating a small, testable control point in front of the rest of the system.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why This Saves Tokens
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_10_3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_10_3.png" title="WhyRoutingWorks" alt="WhyRoutingWorks" width="800" height="1216"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Manual routing helps because each agent receives only the context it needs.&lt;br&gt;
But keep in mind that routing is not free.&lt;br&gt;
It adds one model call before the specialist call.&lt;/p&gt;

&lt;p&gt;This only pays off when the routing call is cheaper than the irrelevant context you avoid.&lt;br&gt;
If your specialist prompts are tiny, the extra router call may not be worth it.&lt;br&gt;
But once prompts, tools and model sizes start to diverge, routing becomes useful quickly.&lt;/p&gt;
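
&lt;p&gt;The break-even point is simple arithmetic.&lt;br&gt;
The sketch below uses hypothetical input token counts chosen only for illustration, and it ignores the user message, output tokens and price differences between models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

// All numbers are hypothetical input token counts.
const int routerPrompt = 150;   // small routing instructions
const int coffeePrompt = 1200;  // coffee instructions + coffee tool descriptions
const int musicPrompt = 1500;   // music instructions + music tool descriptions

// One overloaded agent sends everything on every request.
int combinedPerRequest = coffeePrompt + musicPrompt;

// With routing, a music request pays for the router plus the music specialist only.
int routedMusicRequest = routerPrompt + musicPrompt;

Console.WriteLine($"Combined agent: {combinedPerRequest} input tokens");       // 2700
Console.WriteLine($"Routed music request: {routedMusicRequest} input tokens"); // 1650
Console.WriteLine($"Saved: {combinedPerRequest - routedMusicRequest} input tokens"); // 1050
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this sketch the router costs 150 tokens and avoids 1200 tokens of coffee context.&lt;br&gt;
If the avoided context were smaller than the router prompt, the extra call would cost more than it saves.&lt;/p&gt;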

&lt;p&gt;The intent agent can stay small:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;short instructions&lt;/li&gt;
&lt;li&gt;no domain tools&lt;/li&gt;
&lt;li&gt;no long specialist prompt&lt;/li&gt;
&lt;li&gt;cheap model&lt;/li&gt;
&lt;li&gt;structured output only&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The specialist agents can stay focused:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;coffee instructions only for coffee questions&lt;/li&gt;
&lt;li&gt;music instructions only for music questions&lt;/li&gt;
&lt;li&gt;domain tools only where they are useful&lt;/li&gt;
&lt;li&gt;stronger models only when the request deserves them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This does not make token limits disappear.&lt;br&gt;
It changes which instructions, tools and context are sent to which model call.&lt;/p&gt;

&lt;p&gt;Instead of this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Every request
  -&amp;gt; coffee prompt + music prompt + all tools + all rules
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You get this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Every request
  -&amp;gt; small routing prompt

Only music requests
  -&amp;gt; music prompt + music tools

Only coffee requests
  -&amp;gt; coffee prompt + coffee tools
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the real win.&lt;br&gt;
You avoid paying for irrelevant context on every request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Manual Routing vs. Workflow Engine
&lt;/h2&gt;

&lt;p&gt;This article uses plain C# routing on purpose.&lt;br&gt;
You could model routing as a workflow later.&lt;br&gt;
Agent Framework has a workflow engine for explicit orchestration, checkpoints, handoffs and human-in-the-loop scenarios.&lt;br&gt;
But this example does not need that yet.&lt;br&gt;
The flow is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Classify intent&lt;/li&gt;
&lt;li&gt;Switch on the result&lt;/li&gt;
&lt;li&gt;Call one specialist&lt;/li&gt;
&lt;li&gt;Return the answer&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A C# switch is easier to read, easier to test and easier to debug than a workflow graph for this case.&lt;br&gt;
This is a useful design rule:&lt;/p&gt;

&lt;p&gt;Use the smallest orchestration mechanism that gives you enough control.&lt;/p&gt;

&lt;p&gt;For simple routing, that is often normal C#.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Manual Intent Routing
&lt;/h2&gt;

&lt;p&gt;Use manual intent routing when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have clearly separated domains&lt;/li&gt;
&lt;li&gt;One large prompt is becoming too expensive or unfocused&lt;/li&gt;
&lt;li&gt;Different requests need different tools&lt;/li&gt;
&lt;li&gt;Different requests deserve different model sizes&lt;/li&gt;
&lt;li&gt;You want predictable routing logic in application code&lt;/li&gt;
&lt;li&gt;You can evaluate routing quality with realistic examples&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do not use manual intent routing when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single small agent is already enough&lt;/li&gt;
&lt;li&gt;The categories are vague and constantly overlapping&lt;/li&gt;
&lt;li&gt;A wrong route would be expensive and you have no validation&lt;/li&gt;
&lt;li&gt;The router becomes as complex as the system it replaces&lt;/li&gt;
&lt;li&gt;You actually need checkpoints, long-running execution or human approval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This pattern is not a universal architecture.&lt;br&gt;
It is a cheap and practical first step into multi-agent systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Manual multi-agent routing is simple.&lt;br&gt;
Use a small intent agent to classify the request.&lt;br&gt;
Return a typed &lt;code&gt;IntentResult&lt;/code&gt;.&lt;br&gt;
Let C# route to the right specialist agent.&lt;/p&gt;

&lt;p&gt;This keeps the expensive agents focused and avoids sending every instruction and every tool to every model call.&lt;br&gt;
It also gives your application a clean control point for fallbacks, confidence thresholds, logging and tests.&lt;/p&gt;

&lt;p&gt;The main limitation is that routing quality becomes part of your system quality.&lt;br&gt;
You need examples, thresholds and fallback behavior.&lt;br&gt;
But that is still easier to reason about than one overloaded agent that tries to be everything at once.&lt;/p&gt;

&lt;p&gt;Next, we can take this idea further.&lt;br&gt;
Instead of routing to agents from C#, we can expose agents as tools and let one coordinator delegate work deliberately.&lt;br&gt;
That gives more flexibility, but also brings back token, cost and control tradeoffs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/structured-output" rel="noopener noreferrer"&gt;Producing Structured Output with Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/running-agents" rel="noopener noreferrer"&gt;Running Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/" rel="noopener noreferrer"&gt;Microsoft Agent Framework Agents overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/tools/" rel="noopener noreferrer"&gt;Microsoft Agent Framework Tools overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.agents.ai.chatclientagent?view=agent-framework-dotnet-latest" rel="noopener noreferrer"&gt;ChatClientAgent Class&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Extending .NET Agents with MCP and Agent Skills</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Mon, 18 May 2026 13:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/extending-net-agents-with-mcp-and-agent-skills-2lcg</link>
      <guid>https://dev.to/lukaswalter/extending-net-agents-with-mcp-and-agent-skills-2lcg</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 9 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_9/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the previous articles, we moved from simple agents to tools, dependency injection, and structured output.&lt;br&gt;
Now, the agent can already do useful work inside an application.&lt;br&gt;
But it still depends on the capabilities you explicitly wire into that application.&lt;br&gt;
That is fine for domain logic you own.&lt;br&gt;
It's less attractive when the capability already exists somewhere else: a file system, a documentation server, an internal workflow system, or a set of reusable team procedures.&lt;/p&gt;

&lt;p&gt;This is where two extension models become interesting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP tools connect an agent to external capabilities through the Model Context Protocol.&lt;/li&gt;
&lt;li&gt;Agent Skills package reusable instructions, resources, and scripts that an agent can load only when needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both models extend agent capabilities, but they solve different problems.&lt;/p&gt;
&lt;h2&gt;
  
  
  MCP Is for External Capabilities
&lt;/h2&gt;

&lt;p&gt;Model Context Protocol is useful when the agent needs to interact with a system outside your application boundary.&lt;br&gt;
Instead of writing a custom C# wrapper for every external API, you connect to an MCP server and expose the tools from that server to the agent.&lt;br&gt;
The server owns the integration logic.&lt;br&gt;
Your application decides whether that server is allowed into the agent runtime.&lt;/p&gt;

&lt;p&gt;MCP can expose more than tools, including resources and prompts, depending on the server and client.&lt;br&gt;
This article focuses on MCP tools because they are the most direct fit for agent tool calling in this part of the series.&lt;br&gt;
Conceptually, the flow looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_9_1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_9_1.png" title="MCPBoundaryFlow" alt="MCPBoundaryFlow" width="800" height="177"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Microsoft Agent Framework can work with the official MCP C# SDK.&lt;br&gt;
A typical setup starts by creating an MCP client, listing the tools exposed by the server, and passing those tools to the agent.&lt;br&gt;
For example, imagine a local read-only documentation MCP server that exposes project docs and an engineering handbook.&lt;br&gt;
The exact command depends on the MCP server you use or build.&lt;br&gt;
The important part is that the server is configured as read-only before its tools are exposed to the model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;ModelContextProtocol.Client&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;mcpClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;McpClientFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;StdioClientTransport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"EngineeringDocs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"dotnet"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Arguments&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s"&gt;"run"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"--project"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"./tools/DocsMcpServer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"--"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"--root"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"./docs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"--root"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"./engineering-handbook"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"--read-only"&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;mcpTools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;mcpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ListToolsAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="n"&gt;questions&lt;/span&gt; &lt;span class="n"&gt;about&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt; &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;engineering&lt;/span&gt; &lt;span class="n"&gt;documentation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Use&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;documentation&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="n"&gt;project&lt;/span&gt; &lt;span class="n"&gt;guidance&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="n"&gt;needed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s"&gt;""",
&lt;/span&gt;    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[..&lt;/span&gt; &lt;span class="n"&gt;mcpTools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cast&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AITool&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;DocsMcpServer&lt;/code&gt; name in this example is intentionally generic.&lt;br&gt;
It could be your own small MCP server, an approved internal server, or a stable server package your team has reviewed.&lt;br&gt;
The point is the MCP boundary.&lt;/p&gt;

&lt;p&gt;The agent does not know how to read the documentation store directly.&lt;br&gt;
It sees a set of tool definitions exposed by the MCP server.&lt;br&gt;
When the model chooses one of those tools, your application routes the call through the MCP client.&lt;/p&gt;

&lt;p&gt;MCP avoids hand-written custom API wrappers for every external system, but it does not remove application code.&lt;br&gt;
You still own client setup, authentication, tool discovery, tool selection, logging, and safety boundaries.&lt;/p&gt;
&lt;h2&gt;
  
  
  MCP Is Still a Trust Boundary
&lt;/h2&gt;

&lt;p&gt;MCP makes integrations easier.&lt;br&gt;
It does not make them automatically safe.&lt;br&gt;
An MCP server can expose powerful operations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;searching document stores&lt;/li&gt;
&lt;li&gt;reading internal handbooks&lt;/li&gt;
&lt;li&gt;creating tickets&lt;/li&gt;
&lt;li&gt;changing files&lt;/li&gt;
&lt;li&gt;querying databases&lt;/li&gt;
&lt;li&gt;calling internal APIs&lt;/li&gt;
&lt;li&gt;accessing local files&lt;/li&gt;
&lt;li&gt;executing commands, depending on the server&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So treat an MCP server like any other integration dependency.&lt;br&gt;
Do not connect random servers to production agents, and do not assume that the protocol itself is the safety layer.&lt;br&gt;
At minimum, check:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Who maintains the server&lt;/li&gt;
&lt;li&gt;What tools it exposes&lt;/li&gt;
&lt;li&gt;What credentials it receives&lt;/li&gt;
&lt;li&gt;What data leaves your application&lt;/li&gt;
&lt;li&gt;Whether tool calls are logged&lt;/li&gt;
&lt;li&gt;Whether write operations need approval&lt;/li&gt;
&lt;li&gt;Whether the server runs locally, remotely, or inside a sandbox&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Authentication also belongs outside the prompt.&lt;br&gt;
Do not put API keys, personal access tokens, or OAuth tokens into agent instructions.&lt;br&gt;
Use the authentication mechanism expected by the MCP server and transport.&lt;br&gt;
For remote HTTP servers, prefer per-run headers or runtime credential providers when available, so secrets are not baked into a shared client or accidentally persisted.&lt;/p&gt;
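
&lt;p&gt;As a minimal sketch of that idea, the handler below asks a credential callback for a token on every outgoing request and attaches it as a header, so the secret never appears in the agent instructions.&lt;br&gt;
The &lt;code&gt;BearerTokenHandler&lt;/code&gt; class and the &lt;code&gt;DOCS_MCP_TOKEN&lt;/code&gt; variable name are hypothetical, and whether your MCP client transport accepts a custom &lt;code&gt;HttpClient&lt;/code&gt; depends on the SDK version you use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical token source: an environment variable read at call time.
// In production this could be a managed identity or a secret store client.
var handler = new BearerTokenHandler(_ =&amp;gt;
    Task.FromResult(
        Environment.GetEnvironmentVariable("DOCS_MCP_TOKEN")
        ?? throw new InvalidOperationException("DOCS_MCP_TOKEN is not set.")));

// Assumption: this client is then handed to the MCP client transport.
using var httpClient = new HttpClient(handler);

// Adds a bearer token to each request instead of baking it into the client.
public sealed class BearerTokenHandler : DelegatingHandler
{
    private readonly Func&amp;lt;CancellationToken, Task&amp;lt;string&amp;gt;&amp;gt; _getToken;

    public BearerTokenHandler(Func&amp;lt;CancellationToken, Task&amp;lt;string&amp;gt;&amp;gt; getToken)
        : base(new HttpClientHandler())
    {
        _getToken = getToken;
    }

    protected override async Task&amp;lt;HttpResponseMessage&amp;gt; SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Resolve the credential at request time so rotation is picked up.
        string token = await _getToken(cancellationToken);
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
        return await base.SendAsync(request, cancellationToken);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the token is resolved per request, nothing secret has to be stored in the prompt, in source control, or in a long-lived shared header.&lt;/p&gt;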
&lt;h2&gt;
  
  
  Keep the MCP Tool Surface Small
&lt;/h2&gt;

&lt;p&gt;MCP servers can expose many tools.&lt;br&gt;
That does not mean every agent should receive all of them.&lt;br&gt;
A large tool surface creates three practical problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More tool descriptions are sent to the model, increasing token usage.&lt;/li&gt;
&lt;li&gt;The model has more opportunities to choose the wrong tool.&lt;/li&gt;
&lt;li&gt;Your review surface grows because every exposed tool becomes callable through the agent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same rule from local function tools applies here:&lt;/p&gt;

&lt;p&gt;Expose the narrowest capability set that solves the task.&lt;/p&gt;

&lt;p&gt;Prefer filtering before the tools ever reach the model.&lt;br&gt;
Client-side filtering is useful, but it should not be the only safety boundary.&lt;br&gt;
If the MCP server supports read-only mode, toolsets, scopes, explicit tool configuration, or server-side restrictions, use those first.&lt;br&gt;
That keeps dangerous operations entirely out of the advertised tool list.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_9_2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_9_2.png" title="MCPToolSurface" alt="MCPToolSurface" width="800" height="758"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, a documentation assistant may need tools that search docs, read files under approved directories, or return links to handbook pages.&lt;br&gt;
It should not receive tools that write files, shell out to the host, or read arbitrary paths outside the configured documentation roots.&lt;/p&gt;

&lt;p&gt;Client-side allow-listing can be a second boundary after the server has already been configured safely.&lt;br&gt;
Use explicit configuration or metadata when possible.&lt;/p&gt;

&lt;p&gt;This name-based example is illustrative only:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;allowedToolNames&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;HashSet&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;StringComparer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrdinalIgnoreCase&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"search_docs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"read_doc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"list_doc_sections"&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;selectedTools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mcpTools&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;allowedToolNames&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cast&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AITool&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;docsAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Use only the approved documentation tools."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;selectedTools&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do not treat naming conventions as a production security model.&lt;br&gt;
They are too easy to get wrong.&lt;br&gt;
In a real system, prefer server-side restrictions, explicit allow-lists, scopes, policy configuration, and audit logs.&lt;br&gt;
Expose each tool deliberately so that agent access stays controlled.&lt;/p&gt;
&lt;h2&gt;
  
  
  Agent Skills Are for Reusable Knowledge and Procedures
&lt;/h2&gt;

&lt;p&gt;MCP is a good fit when the agent needs to call an external system.&lt;br&gt;
Agent Skills are a better fit when the agent needs reusable knowledge or a repeatable procedure.&lt;/p&gt;

&lt;p&gt;Agent Framework can support file-based skills and other authoring styles, such as code-defined or class-based skills.&lt;br&gt;
This article focuses on the file-based Agent Skills format because it maps well to reusable instructions, reference material, and scripts.&lt;/p&gt;

&lt;p&gt;In the file-based Agent Skills format, a skill is a folder with a &lt;code&gt;SKILL.md&lt;/code&gt; file and optional resources.&lt;br&gt;
For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;skills/
  incident-triage/
    SKILL.md
    references/
      severity-levels.md
      escalation-policy.md
    scripts/
      summarize-logs.py
  pull-request-review/
    SKILL.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;SKILL.md&lt;/code&gt; file contains front matter and instructions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;incident-triage&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Guides&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;incident&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;triage,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;classification,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;escalation,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;concise&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;incident&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;summaries."&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Incident Triage&lt;/span&gt;

Use this skill when the user asks for help with a production incident,
alert investigation, severity classification, escalation, or incident summary.

When triaging an incident:
&lt;span class="p"&gt;
1.&lt;/span&gt; Identify affected services, users, and time window.
&lt;span class="p"&gt;2.&lt;/span&gt; Classify severity using &lt;span class="sb"&gt;`references/severity-levels.md`&lt;/span&gt;.
&lt;span class="p"&gt;3.&lt;/span&gt; Check escalation rules in &lt;span class="sb"&gt;`references/escalation-policy.md`&lt;/span&gt;.
&lt;span class="p"&gt;4.&lt;/span&gt; Summarize known facts, unknowns, impact, and next actions.
&lt;span class="p"&gt;5.&lt;/span&gt; Use &lt;span class="sb"&gt;`scripts/summarize-logs.py`&lt;/span&gt; only when log excerpts need deterministic preprocessing.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is different from putting all of that text into the agent’s system prompt.&lt;br&gt;
The skill can be advertised by name and description first.&lt;br&gt;
The full instructions and reference files are loaded only when the task needs them.&lt;br&gt;
That keeps the base prompt smaller while still giving the agent access to deeper domain knowledge.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_9_4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_9_4.png" title="MCPvsAgent" alt="MCPvsAgent" width="800" height="2030"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Loading Skills in Agent Framework
&lt;/h2&gt;

&lt;p&gt;Agent Framework exposes Agent Skills through an &lt;code&gt;AgentSkillsProvider&lt;/code&gt;.&lt;br&gt;
It acts as an &lt;code&gt;AIContextProvider&lt;/code&gt;, so skills become part of the agent invocation pipeline rather than a one-off prompt trick.&lt;br&gt;
A simple file-based setup looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;skillsProvider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AgentSkillsProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Combine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AppContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BaseDirectory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"skills"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agentOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;ChatClientAgentOptions&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ChatOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Instructions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;        &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt; &lt;span class="n"&gt;engineers&lt;/span&gt; &lt;span class="n"&gt;triage&lt;/span&gt; &lt;span class="n"&gt;incidents&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;follow&lt;/span&gt; &lt;span class="k"&gt;internal&lt;/span&gt; &lt;span class="n"&gt;procedures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;Use&lt;/span&gt; &lt;span class="n"&gt;available&lt;/span&gt; &lt;span class="n"&gt;skills&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;they&lt;/span&gt; &lt;span class="n"&gt;are&lt;/span&gt; &lt;span class="n"&gt;relevant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;AIContextProviders&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;skillsProvider&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agentOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The provider discovers skills from the configured directory and exposes skill-related tools to the agent.&lt;br&gt;
The model can then load the right skill when a user request matches the skill description.&lt;/p&gt;

&lt;p&gt;This gives you a useful separation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent instructions define the general behavior.&lt;/li&gt;
&lt;li&gt;Skills provide specialized procedures and checklists.&lt;/li&gt;
&lt;li&gt;Reference files hold longer policy or domain material.&lt;/li&gt;
&lt;li&gt;Scripts can automate deterministic helper steps when you explicitly enable them.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Scripts Need Even More Care
&lt;/h2&gt;

&lt;p&gt;Skills can include scripts.&lt;br&gt;
That is useful, but it changes the risk profile.&lt;br&gt;
Reading a markdown reference file is one thing.&lt;br&gt;
Executing a script is another.&lt;br&gt;
If you enable file-based script execution, do it explicitly and treat scripts as code that runs in your environment.&lt;br&gt;
For example, if your application provides a subprocess runner:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;skillsProvider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AgentSkillsProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Combine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AppContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BaseDirectory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"skills"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;SubprocessScriptRunner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That makes script execution possible.&lt;br&gt;
It does not make every script acceptable for production.&lt;/p&gt;

&lt;p&gt;Before enabling scripts, decide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which script extensions are allowed&lt;/li&gt;
&lt;li&gt;Whether scripts need human approval&lt;/li&gt;
&lt;li&gt;What filesystem paths they can access&lt;/li&gt;
&lt;li&gt;Whether they can use the network&lt;/li&gt;
&lt;li&gt;How long they can run&lt;/li&gt;
&lt;li&gt;Where stdout, stderr, and exit codes are logged&lt;/li&gt;
&lt;li&gt;How arguments are validated before execution&lt;/li&gt;
&lt;/ul&gt;
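
&lt;p&gt;Several of these decisions can be enforced in code rather than by convention.&lt;br&gt;
The following is a sketch only: &lt;code&gt;GuardedScriptRunner&lt;/code&gt; is a hypothetical helper built on &lt;code&gt;System.Diagnostics.Process&lt;/code&gt;, not an Agent Framework API, and it covers the extension allow-list, path check, timeout, and logging, but not network isolation, symlink handling, or human approval:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class GuardedScriptRunner
{
    // Only these extensions may run; everything else is rejected up front.
    private static readonly HashSet&amp;lt;string&amp;gt; AllowedExtensions =
        new(StringComparer.OrdinalIgnoreCase) { ".py" };

    public static async Task&amp;lt;int&amp;gt; RunAsync(
        string skillsRoot, string scriptPath, string interpreter,
        IReadOnlyList&amp;lt;string&amp;gt; args, TimeSpan timeout)
    {
        // Resolve the path and refuse anything outside the skills directory.
        string root = Path.GetFullPath(skillsRoot) + Path.DirectorySeparatorChar;
        string fullPath = Path.GetFullPath(scriptPath);
        if (!fullPath.StartsWith(root, StringComparison.Ordinal))
            throw new UnauthorizedAccessException("Script is outside the skills root.");
        if (!AllowedExtensions.Contains(Path.GetExtension(fullPath)))
            throw new InvalidOperationException("Script extension is not allowed.");

        var startInfo = new ProcessStartInfo(interpreter)
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false
        };
        // ArgumentList passes each value as one argument, with no shell parsing.
        startInfo.ArgumentList.Add(fullPath);
        foreach (string arg in args) startInfo.ArgumentList.Add(arg);

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Process could not be started.");
        using var cts = new CancellationTokenSource(timeout);

        // Start reading both streams before waiting to avoid a full-pipe deadlock.
        Task&amp;lt;string&amp;gt; stdout = process.StandardOutput.ReadToEndAsync();
        Task&amp;lt;string&amp;gt; stderr = process.StandardError.ReadToEndAsync();
        try
        {
            await process.WaitForExitAsync(cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Enforce the time limit by killing the script and its children.
            process.Kill(entireProcessTree: true);
            throw new TimeoutException($"Script exceeded {timeout}.");
        }

        // Replace Console with your real logger in an application.
        Console.WriteLine($"exit={process.ExitCode} stdout={await stdout} stderr={await stderr}");
        return process.ExitCode;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The interpreter is passed in explicitly, so the caller decides which runtime is allowed instead of trusting whatever the script file declares.&lt;/p&gt;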

&lt;p&gt;For internal skills, store them in version control and review them as you would application code.&lt;br&gt;
For third-party skills, treat them like dependencies that can inject instructions and run code.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP vs. Agent Skills
&lt;/h2&gt;

&lt;p&gt;MCP gives the agent access to an external system or live state.&lt;br&gt;
Agent Skills are reusable procedures, guidance, and packaged expertise.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_9_3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.lukaswalter.dev%2Fimages%2FAgentFramework_1_9_3.png" title="AgentSkills" alt="AgentSkills" width="800" height="111"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use MCP when&lt;/th&gt;
&lt;th&gt;Use Agent Skills when&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;The agent needs to call an external system&lt;/td&gt;
&lt;td&gt;The agent needs reusable instructions or procedures&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The capability already exists behind an API, service, or local server&lt;/td&gt;
&lt;td&gt;The capability is mostly knowledge, process, examples, or local resources&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The tool result should come from current external state&lt;/td&gt;
&lt;td&gt;The agent should load guidance only when relevant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authentication, permissions, and transport matter&lt;/td&gt;
&lt;td&gt;Packaging, reuse, and progressive disclosure matter&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There is overlap.&lt;br&gt;
You can use both.&lt;br&gt;
For example, an HR assistant might use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP tools to query the HR system&lt;/li&gt;
&lt;li&gt;Agent Skills to load the company’s parental leave procedure&lt;/li&gt;
&lt;li&gt;structured output to return a validated case summary&lt;/li&gt;
&lt;li&gt;approval tools before submitting a request&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A combination is often more useful than trying to make one abstraction do everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Them
&lt;/h2&gt;

&lt;p&gt;Use MCP when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent needs live data from another system&lt;/li&gt;
&lt;li&gt;An existing MCP server already covers the integration&lt;/li&gt;
&lt;li&gt;You can restrict credentials and tool permissions clearly&lt;/li&gt;
&lt;li&gt;You need a standardized integration surface across agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do not use MCP when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A simple C# function is enough&lt;/li&gt;
&lt;li&gt;The server exposes broad write operations you cannot control&lt;/li&gt;
&lt;li&gt;You cannot audit what data leaves your application&lt;/li&gt;
&lt;li&gt;The integration would bypass your existing authorization model&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use Agent Skills when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Domain knowledge is too large for the base prompt&lt;/li&gt;
&lt;li&gt;Multiple agents or teams should reuse the same procedure&lt;/li&gt;
&lt;li&gt;The agent should load detailed guidance only when needed&lt;/li&gt;
&lt;li&gt;Instructions, examples, templates, and scripts should live together&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do not use Agent Skills when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The content is really application state that should come from a database&lt;/li&gt;
&lt;li&gt;The procedure changes on every request&lt;/li&gt;
&lt;li&gt;The skill would hide risky automation inside a markdown folder&lt;/li&gt;
&lt;li&gt;The same result is better expressed as normal tested C# code&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;MCP and Agent Skills both extend an agent, but in different directions.&lt;br&gt;
MCP connects the agent to external capabilities.&lt;br&gt;
Agent Skills give the agent reusable expertise and procedures.&lt;br&gt;
The problem is not about giving the agent more power.&lt;br&gt;
It is about deciding which power belongs in the runtime, which belongs in application code, which belongs in a skill, and which needs approval before it runs.&lt;/p&gt;

&lt;p&gt;At this point in the series, we have an agent that can keep state, manage context, call tools, return structured output, connect through MCP, and load reusable skills.&lt;br&gt;
The next step is orchestration.&lt;br&gt;
Some tasks are too large or too explicit for a single agent call.&lt;br&gt;
In the next article, we will look at multi-agent systems and workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/tools/local-mcp-tools" rel="noopener noreferrer"&gt;Using MCP tools with Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/skills" rel="noopener noreferrer"&gt;Agent Skills in Microsoft Agent Framework&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://agentskills.io/specification" rel="noopener noreferrer"&gt;Agent Skills specification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelcontextprotocol.io/docs/tutorials/security/security_best_practices" rel="noopener noreferrer"&gt;MCP security best practices&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/tools/" rel="noopener noreferrer"&gt;Tools Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener noreferrer"&gt;What is the Model Context Protocol?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Structured Output in .NET Agents</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Thu, 14 May 2026 13:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/structured-output-in-net-agents-26fo</link>
      <guid>https://dev.to/lukaswalter/structured-output-in-net-agents-26fo</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 8 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_8/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;LLMs are good at generating text. But text is a weak boundary for application code.&lt;/p&gt;

&lt;p&gt;Ask a model for, say, a specific coffee recipe, and the response might look slightly different every time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a markdown list&lt;/li&gt;
&lt;li&gt;a numbered list&lt;/li&gt;
&lt;li&gt;bold section titles&lt;/li&gt;
&lt;li&gt;missing fields&lt;/li&gt;
&lt;li&gt;additional explanations&lt;/li&gt;
&lt;li&gt;a disclaimer at the end&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is fine for a chat interface.&lt;br&gt;
It is not fine when your application needs to save the result, display it in a UI, route a workflow, or pass the output into another system.&lt;/p&gt;

&lt;p&gt;At that point, you do not want “some text”.&lt;br&gt;
You want data with a known shape.&lt;/p&gt;
&lt;h2&gt;
  
  
  Raw LLM Text Is Hard to Automate
&lt;/h2&gt;

&lt;p&gt;The problem with unstructured output is not that it looks messy.&lt;br&gt;
The problem is that your application has to guess what the model meant.&lt;/p&gt;

&lt;p&gt;For example, if the model returns a coffee recipe as plain text, your code may need to extract:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;brew method&lt;/li&gt;
&lt;li&gt;coffee dose&lt;/li&gt;
&lt;li&gt;water amount&lt;/li&gt;
&lt;li&gt;grind size&lt;/li&gt;
&lt;li&gt;water temperature&lt;/li&gt;
&lt;li&gt;brewing steps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That usually means parsing strings.&lt;br&gt;
And string parsing breaks easily.&lt;/p&gt;

&lt;p&gt;One response might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. V60 recipe: Use 20g coffee and 320g water at 94°C. Grind medium-fine.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next response might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;### V60 Pour-Over

- Coffee: 20 grams
- Water: 320 grams
- Temperature: 94°C
- Grind: medium-fine
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both are readable for humans.&lt;br&gt;
But for software, they are different formats.&lt;br&gt;
This is why raw LLM text is a fragile integration boundary.&lt;/p&gt;
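
&lt;p&gt;To make the fragility concrete, here is a small sketch with a hypothetical regular expression written against the first response format.&lt;br&gt;
It extracts the dose from the first response and silently finds nothing in the second:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Text.RegularExpressions;

// Hypothetical parser tuned to the first response format ("20g coffee").
var dosePattern = new Regex(@"(\d+)g coffee");

string first = "1. V60 recipe: Use 20g coffee and 320g water at 94°C. Grind medium-fine.";
string second = "- Coffee: 20 grams";

Console.WriteLine(dosePattern.Match(first).Success);   // True
Console.WriteLine(dosePattern.Match(second).Success);  // False
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Every new phrasing the model produces needs another pattern, and a miss looks like missing data rather than an error.&lt;/p&gt;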
&lt;h2&gt;
  
  
  Define the Output Shape in C#
&lt;/h2&gt;

&lt;p&gt;Instead of asking the model to return free-form text, you can define the shape you expect in C#.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BrewRecipeSuggestion&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;BrewMethod&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;CoffeeGrams&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;WaterGrams&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;GrindSize&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;WaterTemperatureCelsius&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Steps&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want multiple results, you can wrap the list in a response type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BrewRecipeResult&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;BrewRecipeSuggestion&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Recipes&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now your application has a contract.&lt;br&gt;
The model is no longer just asked to “write an answer”.&lt;br&gt;
It is asked to produce something that can be represented as a known C# type.&lt;/p&gt;
&lt;h2&gt;
  
  
  Using &lt;code&gt;RunAsync&amp;lt;T&amp;gt;&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;With .NET agents, this becomes much cleaner.&lt;br&gt;
Instead of calling the agent and receiving plain text:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"Give me three pour-over coffee recipes."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can request a typed result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;BrewRecipeResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;BrewRecipeResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"""
&lt;/span&gt;        &lt;span class="n"&gt;Give&lt;/span&gt; &lt;span class="n"&gt;me&lt;/span&gt; &lt;span class="n"&gt;three&lt;/span&gt; &lt;span class="n"&gt;pour&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="n"&gt;over&lt;/span&gt; &lt;span class="n"&gt;coffee&lt;/span&gt; &lt;span class="n"&gt;recipes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

        &lt;span class="n"&gt;Include&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;brew&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;coffee&lt;/span&gt; &lt;span class="n"&gt;dose&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grams&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;water&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grams&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;grind&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;water&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Celsius&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;brewing&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt;
        &lt;span class="s"&gt;""");
&lt;/span&gt;
&lt;span class="n"&gt;BrewRecipeResult&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important difference is the boundary.&lt;br&gt;
Your application does not receive a string that it still has to interpret.&lt;br&gt;
It receives an &lt;code&gt;AgentResponse&amp;lt;T&amp;gt;&lt;/code&gt;, and the typed result is available through &lt;code&gt;response.Result&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That means you can work with the result directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;recipe&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Recipes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;$"&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BrewMethod&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CoffeeGrams&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;g coffee, "&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt;
        &lt;span class="s"&gt;$"&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WaterGrams&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;g water"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is much easier to use in normal application code.&lt;br&gt;
You can render it in a UI, store it in a database, pass it to another service, validate it, and even test it.&lt;/p&gt;
&lt;h2&gt;
  
  
  What the Framework Does for You
&lt;/h2&gt;

&lt;p&gt;When you call &lt;code&gt;RunAsync&amp;lt;T&amp;gt;&lt;/code&gt;, the framework can use the target C# type to describe the expected response shape.&lt;br&gt;
The model is guided toward returning data that matches that structure.&lt;br&gt;
The framework then converts the response into the requested C# type.&lt;/p&gt;

&lt;p&gt;Conceptually, the flow looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;C# type
   ↓
Expected response shape
   ↓
Model response
   ↓
Deserialization
   ↓
Typed C# object
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That removes a lot of boilerplate.&lt;br&gt;
You do not have to manually inspect markdown, split strings, or search for labels in generated text.&lt;br&gt;
You get a typed result that fits into the rest of your .NET code.&lt;/p&gt;
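
&lt;p&gt;To make the flow above more concrete, here is a minimal sketch that uses only &lt;code&gt;System.Text.Json&lt;/code&gt;, not the Agent Framework itself. It assumes .NET 9 or later for &lt;code&gt;JsonSchemaExporter&lt;/code&gt;. The &lt;code&gt;TastingNote&lt;/code&gt; type, the &lt;code&gt;StructuredFlowSketch&lt;/code&gt; class, and the hard-coded JSON string are hypothetical stand-ins for a real target type and a real model response. How the framework describes the shape internally may differ.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Text.Json;
using System.Text.Json.Schema;

// Hypothetical target type, used only for this sketch.
public sealed class TastingNote
{
    public string Coffee { get; set; } = string.Empty;
    public int Score { get; set; }
}

public static class StructuredFlowSketch
{
    // C# type -&amp;gt; expected response shape, expressed as a JSON schema.
    public static string DescribeShape() =&amp;gt;
        JsonSerializerOptions.Default.GetJsonSchemaAsNode(typeof(TastingNote)).ToString();

    // Model response -&amp;gt; deserialization -&amp;gt; typed C# object.
    public static TastingNote? Parse(string modelResponse) =&amp;gt;
        JsonSerializer.Deserialize&amp;lt;TastingNote&amp;gt;(modelResponse);

    public static void Main()
    {
        Console.WriteLine(DescribeShape());

        // Stand-in for a model response that follows the schema.
        string modelResponse = """{"Coffee":"Ethiopia Guji","Score":87}""";

        TastingNote? note = Parse(modelResponse);
        Console.WriteLine($"{note?.Coffee}: {note?.Score}");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The point of the sketch is only the direction of the data: the type comes first, and the text is converted into it at the boundary.&lt;/p&gt;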

&lt;p&gt;Still, keep one thing in mind:&lt;/p&gt;

&lt;p&gt;Structured output support can vary by agent type, provider, model, and underlying chat client.&lt;br&gt;
So this is not a reason to stop thinking about validation, fallbacks, and testing.&lt;/p&gt;

&lt;p&gt;It is a better application boundary.&lt;br&gt;
Not a replacement for engineering discipline.&lt;/p&gt;
&lt;h2&gt;
  
  
  What Structured Output Does Not Solve
&lt;/h2&gt;

&lt;p&gt;Structured output solves the shape problem.&lt;br&gt;
It does not solve the truth problem.&lt;/p&gt;

&lt;p&gt;A model can return a valid &lt;code&gt;BrewRecipeSuggestion&lt;/code&gt; object and still be wrong.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;BrewRecipeSuggestion&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;BrewMethod&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"V60"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CoffeeGrams&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;WaterGrams&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;GrindSize&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"very fine"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;WaterTemperatureCelsius&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;120&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Steps&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s"&gt;"Add coffee."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"Pour all water at once."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"Wait 30 seconds."&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This object may be structurally valid.&lt;/p&gt;

&lt;p&gt;It has the expected fields.&lt;br&gt;
It can be deserialized.&lt;/p&gt;

&lt;p&gt;Your application can work with it as an object.&lt;br&gt;
But that does not mean it is a good recipe. (&lt;em&gt;The ratio is unrealistic.&lt;br&gt;
The water temperature is impossible for normal brewing.&lt;br&gt;
The steps are questionable.&lt;/em&gt;)&lt;/p&gt;

&lt;p&gt;Structured output can tell you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The response has the expected fields&lt;/li&gt;
&lt;li&gt;The values can be deserialized&lt;/li&gt;
&lt;li&gt;The application can work with the result as an object&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It does not guarantee:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The facts are correct&lt;/li&gt;
&lt;li&gt;The recommendation is useful&lt;/li&gt;
&lt;li&gt;The values are reasonable&lt;/li&gt;
&lt;li&gt;The user is allowed to perform the action&lt;/li&gt;
&lt;li&gt;The result satisfies your business rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So keep in mind: typed output should usually be the first gate, not the final gate.&lt;/p&gt;

&lt;p&gt;A more robust flow looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Model output
   ↓
Deserialize into known type
   ↓
Validate required fields
   ↓
Validate ranges and enums
   ↓
Check business rules
   ↓
Accept, reject, retry, or escalate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this coffee example, you might still check the generated recipe:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BrewRecipeSuggestion&lt;/span&gt; &lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrWhiteSpace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BrewMethod&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InvalidOperationException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Brew method is required."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CoffeeGrams&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InvalidOperationException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Coffee dose must be greater than zero."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

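    &lt;span class="c1"&gt;// Cast so the division stays floating-point even if the gram properties are integers.&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WaterGrams&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt; &lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CoffeeGrams&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;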

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratio&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;12&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InvalidOperationException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"Brew ratio is outside the supported range."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WaterTemperatureCelsius&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;85&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InvalidOperationException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"Water temperature must be between 85°C and 100°C."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Steps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InvalidOperationException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"At least one brewing step is required."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Structured output makes validation easier.&lt;br&gt;
It does not remove the need for validation.&lt;/p&gt;
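
&lt;p&gt;The last step of the flow above, "accept, reject, retry, or escalate", could be sketched like this. It is a minimal, framework-independent helper: &lt;code&gt;ValidatedOutput&lt;/code&gt;, &lt;code&gt;GetAsync&lt;/code&gt;, and the default of three attempts are assumptions made for illustration, and the model call is passed in as a delegate so the sketch does not depend on any agent API.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Threading.Tasks;

public static class ValidatedOutput
{
    // Runs produce() up to maxAttempts times and returns the first result
    // that passes validate(). validate() signals a rejection by throwing
    // InvalidOperationException, like the Validate method above.
    // Any other exception from produce() or validate() is not caught here.
    //
    // Hypothetical usage (agent and prompt are assumed to exist):
    //   var result = await ValidatedOutput.GetAsync(
    //       async () =&amp;gt; (await agent.RunAsync&amp;lt;BrewRecipeResult&amp;gt;(prompt)).Result,
    //       r =&amp;gt; r.Recipes.ForEach(Validate));
    public static async Task&amp;lt;T&amp;gt; GetAsync&amp;lt;T&amp;gt;(
        Func&amp;lt;Task&amp;lt;T&amp;gt;&amp;gt; produce,
        Action&amp;lt;T&amp;gt; validate,
        int maxAttempts = 3)
    {
        if (maxAttempts &amp;lt; 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        InvalidOperationException? lastError = null;

        for (int attempt = 1; attempt &amp;lt;= maxAttempts; attempt++)
        {
            T candidate = await produce();

            try
            {
                validate(candidate);
                return candidate; // accept
            }
            catch (InvalidOperationException ex)
            {
                lastError = ex; // reject this attempt and retry
            }
        }

        // escalate: every attempt failed validation
        throw new InvalidOperationException(
            $"No valid result after {maxAttempts} attempts.", lastError);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Whether a retry is worth the extra model call depends on the use case. For some applications, rejecting immediately and showing an error is the better choice.&lt;/p&gt;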
&lt;h2&gt;
  
  
  Practical Example: Intent Routing
&lt;/h2&gt;

&lt;p&gt;One useful pattern is intent routing.&lt;/p&gt;

&lt;p&gt;Imagine an assistant that can answer questions about coffee brewing and guitar tone.&lt;br&gt;
A user might ask:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;How do I get a dirty Hendrix tone on my Strat?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Can you give me a V60 recipe for 18g of coffee?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You could first send the user request to a small routing agent.&lt;br&gt;
That agent should not answer the question.&lt;br&gt;
It should only classify the intent.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;AssistantIntent&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;CoffeeBrewing&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;GuitarTone&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Unknown&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;IntentResult&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;AssistantIntent&lt;/span&gt; &lt;span class="n"&gt;Intent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;Confidence&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Reason&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then you can request a typed result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;userMessage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"How do I get a dirty Hendrix tone on my Strat?"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IntentResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;intentResponse&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;intentAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IntentResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;
        &lt;span class="s"&gt;$"""
&lt;/span&gt;        &lt;span class="n"&gt;Classify&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

        &lt;span class="n"&gt;Return&lt;/span&gt; &lt;span class="n"&gt;only&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;intent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;Do&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

        &lt;span class="n"&gt;Supported&lt;/span&gt; &lt;span class="n"&gt;intents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;CoffeeBrewing&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;GuitarTone&lt;/span&gt;
        &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="n"&gt;Unknown&lt;/span&gt;

        &lt;span class="n"&gt;User&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="s"&gt;""");
&lt;/span&gt;
&lt;span class="n"&gt;IntentResult&lt;/span&gt; &lt;span class="n"&gt;intent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;intentResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, your C# code stays simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;intent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Intent&lt;/span&gt; &lt;span class="k"&gt;switch&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;AssistantIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CoffeeBrewing&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;coffeeAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;AssistantIntent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GuitarTone&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;guitarAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;fallbackAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is much cleaner than asking the model to return text like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The user is probably asking about guitar tone.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and then trying to parse that sentence.&lt;/p&gt;

&lt;p&gt;The routing decision becomes a typed value.&lt;br&gt;
Your application code can switch on it.&lt;br&gt;
You can log it, test it and add validation around it.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;intent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Confidence&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InvalidOperationException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"Intent confidence must be between 0 and 1."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;intent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Confidence&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;fallbackAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Again, the typed object does not make the model perfect.&lt;br&gt;
But it gives your application a reliable shape to work with.&lt;/p&gt;
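
&lt;p&gt;One possible way to combine the two snippets above is to resolve the effective intent first and only then call an agent, so that a low-confidence classification does not trigger two agent calls. The following is a sketch under assumptions: &lt;code&gt;RouteResolver&lt;/code&gt; and its &lt;code&gt;threshold&lt;/code&gt; parameter are hypothetical names, and the enum and result type are repeated from above only to keep the example self-contained.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

public enum AssistantIntent
{
    CoffeeBrewing,
    GuitarTone,
    Unknown
}

public sealed class IntentResult
{
    public AssistantIntent Intent { get; set; }
    public double Confidence { get; set; }
    public string Reason { get; set; } = string.Empty;
}

public static class RouteResolver
{
    // Returns the intent to route on.
    // A confidence outside 0..1 (or NaN) is treated as a contract violation.
    // Low confidence and undefined enum values fall back to Unknown.
    public static AssistantIntent Resolve(IntentResult intent, double threshold = 0.7)
    {
        if (double.IsNaN(intent.Confidence) || intent.Confidence is &amp;lt; 0 or &amp;gt; 1)
        {
            throw new InvalidOperationException(
                "Intent confidence must be between 0 and 1.");
        }

        if (intent.Confidence &amp;lt; threshold || !Enum.IsDefined(intent.Intent))
        {
            return AssistantIntent.Unknown;
        }

        return intent.Intent;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The routing code can then switch on &lt;code&gt;RouteResolver.Resolve(intent)&lt;/code&gt; instead of &lt;code&gt;intent.Intent&lt;/code&gt;, and the decision itself can be unit tested without calling a model.&lt;/p&gt;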

&lt;h2&gt;
  
  
  Where Structured Output Fits
&lt;/h2&gt;

&lt;p&gt;Structured output is useful whenever the model response has to cross into application logic.&lt;/p&gt;

&lt;p&gt;Common examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;extracting fields from user input&lt;/li&gt;
&lt;li&gt;classifying intent&lt;/li&gt;
&lt;li&gt;routing workflows&lt;/li&gt;
&lt;li&gt;generating UI-ready data&lt;/li&gt;
&lt;li&gt;creating database records&lt;/li&gt;
&lt;li&gt;preparing tool arguments&lt;/li&gt;
&lt;li&gt;returning validation results&lt;/li&gt;
&lt;li&gt;producing evaluation summaries&lt;/li&gt;
&lt;li&gt;generating configuration-like output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern is always the same:&lt;/p&gt;

&lt;p&gt;Do not let free-form text leak into places where your application expects structured data.&lt;br&gt;
Use a typed boundary.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Structured output is one of the most important patterns when combining LLMs with traditional software systems.&lt;br&gt;
Not because it makes the model perfect.&lt;br&gt;
But because it gives your application a clear contract.&lt;/p&gt;

&lt;p&gt;Instead of parsing unstable text, your .NET code can work with known types, which makes the system easier to build, test, and reason about.&lt;/p&gt;

&lt;p&gt;LLM output should not be treated as a string once it enters your application boundary.&lt;br&gt;
It should become a typed object.&lt;br&gt;
And from there, normal engineering practices apply again.&lt;/p&gt;

&lt;p&gt;We now know that structured output defines how an agent answers.&lt;br&gt;
But useful agents also need ways to access capabilities beyond the current prompt.&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_7/" rel="noopener noreferrer"&gt;previous post&lt;/a&gt;, we looked at local C# function tools: methods exposed directly from your .NET application.&lt;/p&gt;

&lt;p&gt;Next, we will move one step further and look at MCP tools and Agent Skills.&lt;br&gt;
MCP tools expose capabilities from external systems through the Model Context Protocol.&lt;br&gt;
Agent Skills package reusable instructions, domain knowledge, scripts, and procedures that can be loaded when needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/structured-outputs" rel="noopener noreferrer"&gt;Producing Structured Outputs with agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/running-agents" rel="noopener noreferrer"&gt;Running Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/overview/" rel="noopener noreferrer"&gt;Microsoft Agent Framework Overview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Tools and Dependency Injection in Microsoft Agent Framework</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Mon, 11 May 2026 13:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/tools-and-dependency-injection-in-microsoft-agent-framework-5cm0</link>
      <guid>https://dev.to/lukaswalter/tools-and-dependency-injection-in-microsoft-agent-framework-5cm0</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 7 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_7/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  When words are not enough
&lt;/h2&gt;

&lt;p&gt;So far, our agent can answer questions, stream responses, remember conversations, reduce chat history, and receive dynamic context.&lt;/p&gt;

&lt;p&gt;But it still has one major limitation: it can only talk.&lt;/p&gt;

&lt;p&gt;An LLM does not know your current application state by default. It cannot query your database, calculate values from your domain model, or place an order unless your application exposes that capability.&lt;/p&gt;

&lt;p&gt;This is where tools come in.&lt;/p&gt;

&lt;p&gt;A tool is a controlled C# function that the model can request during a run. The model does not execute arbitrary code. It can only call the functions you explicitly provide.&lt;/p&gt;

&lt;p&gt;The flow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The user asks something that requires application logic.&lt;/li&gt;
&lt;li&gt;The model requests a tool call instead of producing a final answer.&lt;/li&gt;
&lt;li&gt;The framework invokes the matching C# method.&lt;/li&gt;
&lt;li&gt;The result is passed back to the model.&lt;/li&gt;
&lt;li&gt;The model uses that result to answer the user.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  A small tool
&lt;/h2&gt;

&lt;p&gt;Let's stay with the barista agent from the &lt;a href="http://lukaswalter.dev/posts/agentframework_1_6/" rel="noopener noreferrer"&gt;previous article&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;One useful tool is a brew recipe calculator. This does not need a database or external service. It is deterministic domain logic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System.ComponentModel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;record&lt;/span&gt; &lt;span class="nc"&gt;BrewRecipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;CoffeeGrams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;WaterGrams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Ratio&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Calculates water amount for a pour-over coffee recipe."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;BrewRecipe&lt;/span&gt; &lt;span class="nf"&gt;CalculatePourOverRecipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Coffee dose in grams."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; 
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;coffeeGrams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Water per gram of coffee. Use 16 for a 1:16 ratio."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;waterPerGram&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BrewRecipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;CoffeeGrams&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;coffeeGrams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;WaterGrams&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coffeeGrams&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;waterPerGram&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;Ratio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$"1:&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;waterPerGram&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"You are a barista assistant."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CalculatePourOverRecipe&lt;/span&gt;&lt;span class="p"&gt;)]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;Description&lt;/code&gt; attributes are important. They become part of the function schema that the model sees when deciding whether and how to call the tool.&lt;/p&gt;

&lt;p&gt;Keep in mind that descriptions also cost tokens, so keep them short and concrete. The goal is not to document your whole domain. The goal is to help the model make the correct choice.&lt;/p&gt;

&lt;p&gt;Now, a prompt like this can trigger the tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"I want to brew 18 grams of coffee at 1:16. How much water should I use?"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model can call &lt;code&gt;CalculatePourOverRecipe&lt;/code&gt;, receive the &lt;code&gt;BrewRecipe&lt;/code&gt;, and then explain the result in normal language. For 18 grams at 1:16, the tool returns 288 grams of water.&lt;/p&gt;

&lt;h2&gt;
  
  
  Registering multiple tools
&lt;/h2&gt;

&lt;p&gt;One tool is easy to register manually.&lt;br&gt;
More tools get noisy quickly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Calculate a pour-over recipe&lt;/li&gt;
&lt;li&gt;Calculate espresso yield&lt;/li&gt;
&lt;li&gt;Convert a ratio into grams&lt;/li&gt;
&lt;li&gt;Suggest a grind adjustment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reflection can reduce boilerplate, but do not register every public method on a class. That would make every public method callable by the model, including ones you never meant to expose.&lt;br&gt;
Use an explicit marker attribute or an allow-list instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;AttributeUsage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AttributeTargets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Method&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;sealed&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BaristaToolAttribute&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Attribute&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;brewTools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BrewTools&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;IList&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AITool&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BrewTools&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetMethods&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BindingFlags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Instance&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="n"&gt;BindingFlags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Public&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetCustomAttribute&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;BaristaToolAttribute&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AITool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;brewTools&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToList&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps registration convenient without exposing the whole class as an execution surface.&lt;/p&gt;
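
&lt;p&gt;The snippet above references a &lt;code&gt;BrewTools&lt;/code&gt; class without showing it. As a minimal sketch, assuming the &lt;code&gt;BaristaToolAttribute&lt;/code&gt; defined above, it could look like this. The method names and the espresso formula are hypothetical and only illustrate how the marker separates exposed methods from internal ones:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.ComponentModel;

// Hypothetical tool class for illustration; relies on BaristaToolAttribute from above.
public sealed class BrewTools
{
    // Marked with [BaristaTool], so the reflection filter registers it as a tool.
    [BaristaTool]
    [Description("Calculates espresso yield in grams for a dose and brew ratio.")]
    public double CalculateEspressoYield(
        [Description("Coffee dose in grams.")] double doseGrams,
        [Description("Yield per gram of coffee. Use 2 for a 1:2 ratio.")] double yieldPerGram)
        =&amp;gt; Math.Round(doseGrams * yieldPerGram, 1);

    // Public, but not marked, so the reflection filter skips it
    // and the model never sees it.
    public void ResetCalibration()
    {
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Only &lt;code&gt;CalculateEspressoYield&lt;/code&gt; passes the &lt;code&gt;GetCustomAttribute&lt;/code&gt; check, so adding a new public helper later does not silently widen what the agent can call.&lt;/p&gt;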

&lt;h2&gt;
  
  
  Tools with dependency injection
&lt;/h2&gt;

&lt;p&gt;Calculation tools are useful, but real applications usually need services.&lt;/p&gt;

&lt;p&gt;For example, the barista agent might need to check which beans are currently available. That data belongs within your application boundary, perhaps in a repository, an API client, or a database context.&lt;/p&gt;

&lt;p&gt;To bridge this gap, add an &lt;code&gt;IServiceProvider&lt;/code&gt; parameter to your tool method. The framework fills it in from the services you pass to the agent when the tool is invoked, and it does not appear in the tool schema the model sees.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Finds available coffee beans by roast level and flavor note."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IReadOnlyList&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;CoffeeBean&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;FindBeansAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Roast level, for example light, medium, or dark."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;roast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Flavor note, for example chocolate, citrus, or nutty."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;flavorNote&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IServiceProvider&lt;/span&gt; &lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ICoffeeInventory&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FindBeansAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;roast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;flavorNote&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then pass the service provider when creating the agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"You help users choose coffee beans based on taste and brew method."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FindBeansAsync&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model only supplies &lt;code&gt;roast&lt;/code&gt; and &lt;code&gt;flavorNote&lt;/code&gt;. The inventory service still comes from your application.&lt;br&gt;
This distinction is important because model-supplied arguments are untrusted input. Services resolved from DI are trusted application dependencies.&lt;/p&gt;
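
&lt;p&gt;Because model-supplied arguments are untrusted, it can help to normalize and check them before they reach a service. The following is a minimal sketch, not part of the framework: &lt;code&gt;RoastLevels&lt;/code&gt; and its allow-list are hypothetical names, and the accepted values are simply the ones from the parameter description above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the agent framework.
public static class RoastLevels
{
    // Allow-list of values the inventory query is expected to understand.
    private static readonly HashSet&amp;lt;string&amp;gt; Allowed =
        new(StringComparer.OrdinalIgnoreCase) { "light", "medium", "dark" };

    // Returns the trimmed, lower-cased roast level, or throws if it is not on the allow-list.
    public static string Normalize(string roast)
    {
        var trimmed = roast?.Trim() ?? string.Empty;

        if (!Allowed.Contains(trimmed))
        {
            throw new ArgumentException($"Unsupported roast level: '{trimmed}'.", nameof(roast));
        }

        return trimmed.ToLowerInvariant();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A tool such as &lt;code&gt;FindBeansAsync&lt;/code&gt; could call &lt;code&gt;RoastLevels.Normalize(roast)&lt;/code&gt; before querying the inventory, so only known values reach the trusted side of the boundary.&lt;/p&gt;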
&lt;h2&gt;
  
  
  Side effects need approval
&lt;/h2&gt;

&lt;p&gt;Reading inventory is one thing. Placing an order is different.&lt;/p&gt;

&lt;p&gt;Tools that spend money, delete data, send messages, or affect users should not run silently. For those cases, wrap the function in an approval-required tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Orders coffee beans from the supplier. Use only after explicit confirmation."&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;OrderBeansAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;productCode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;bags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IServiceProvider&lt;/span&gt; &lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ICoffeeOrderService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;PlaceOrderAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;productCode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;AIFunction&lt;/span&gt; &lt;span class="n"&gt;orderBeans&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OrderBeansAsync&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;AIFunction&lt;/span&gt; &lt;span class="n"&gt;approvalRequiredOrderBeans&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ApprovalRequiredAIFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;orderBeans&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With approval enabled, the agent can return a &lt;code&gt;FunctionApprovalRequestContent&lt;/code&gt; instead of running the tool immediately. Your application then shows the function name and arguments to the user and sends the approval or rejection back into the same session.&lt;/p&gt;

&lt;p&gt;The exact support depends on the provider and client type. Function tools are broadly supported, but approval is not universal across every provider.&lt;/p&gt;

&lt;h2&gt;
  
  
  Middleware and monitoring
&lt;/h2&gt;

&lt;p&gt;Approval is for high-risk actions.&lt;br&gt;
Middleware is for cross-cutting concerns such as logging, validation, metrics, or blocking suspicious arguments.&lt;/p&gt;

&lt;p&gt;Function calling middleware lets you inspect the function name and arguments before the method runs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;object&lt;/span&gt;&lt;span class="p"&gt;?&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;LogToolCallAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;FunctionInvocationContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Func&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;FunctionInvocationContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CancellationToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;object&lt;/span&gt;&lt;span class="p"&gt;?&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;$"Tool call: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do not treat logging as authorization. Middleware is useful for observing and validating calls, but normal application permissions still need to exist behind the tool. Additionally, for standard logging and metrics, consider using the framework's native &lt;code&gt;.UseOpenTelemetry()&lt;/code&gt; extension rather than writing custom logging middleware from scratch.&lt;/p&gt;
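
&lt;p&gt;The same middleware shape can also block suspicious arguments. This is a minimal sketch that reuses the signature shown above. The limit of 10 bags, the class name, and the rejection message are assumptions for illustration, and the tool name is matched by prefix because the exposed name may differ from the C# method name.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

// Hypothetical guard for illustration; the limit is an assumed business rule.
public static class ToolGuards
{
    private const int MaxBagsPerOrder = 10;

    public static async ValueTask&amp;lt;object?&amp;gt; GuardOrderSizeAsync(
        AIAgent agent,
        FunctionInvocationContext context,
        Func&amp;lt;FunctionInvocationContext, CancellationToken, ValueTask&amp;lt;object?&amp;gt;&amp;gt; next,
        CancellationToken cancellationToken)
    {
        // Only inspect the ordering tool; every other call passes through unchanged.
        if (context.Function.Name.StartsWith("OrderBeans", StringComparison.Ordinal)
            &amp;amp;&amp;amp; context.Arguments.TryGetValue("bags", out var raw)
            &amp;amp;&amp;amp; ReadInt(raw) is not (&amp;gt;= 1 and &amp;lt;= MaxBagsPerOrder))
        {
            // Returned as the tool result; the real method does not run.
            return $"Order rejected: bags must be between 1 and {MaxBagsPerOrder}.";
        }

        return await next(context, cancellationToken);
    }

    // Arguments may arrive as JSON elements or as already-converted values.
    private static int? ReadInt(object? value) =&amp;gt; value switch
    {
        int number =&amp;gt; number,
        JsonElement { ValueKind: JsonValueKind.Number } json when json.TryGetInt32(out var parsed) =&amp;gt; parsed,
        _ =&amp;gt; null,
    };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A check like this is a second line of defense. The order service should still enforce the same limit itself, because the tool is not the only way to reach it.&lt;/p&gt;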

&lt;h2&gt;
  
  
  When to use tools
&lt;/h2&gt;

&lt;p&gt;Use tools when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent needs current application state&lt;/li&gt;
&lt;li&gt;The answer depends on deterministic business logic&lt;/li&gt;
&lt;li&gt;The result must come from your database, API, or domain model&lt;/li&gt;
&lt;li&gt;The action is narrow, describable, and easy to validate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do not use tools when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A normal model answer is enough&lt;/li&gt;
&lt;li&gt;The function would become a broad "do anything" escape hatch&lt;/li&gt;
&lt;li&gt;The action cannot be validated before execution&lt;/li&gt;
&lt;li&gt;The tool would bypass existing authorization or business rules&lt;/li&gt;
&lt;li&gt;The side effect is too risky to run without approval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tools should expose controlled capabilities, not bypass application design.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Tools turn an agent from a text generator into part of an application workflow.&lt;/p&gt;

&lt;p&gt;For simple logic, &lt;code&gt;AIFunctionFactory.Create&lt;/code&gt; is enough. For application behavior, pass services into the agent and keep dependencies behind your existing DI boundary. For state-changing actions, add approval and monitoring before letting the agent execute anything important.&lt;/p&gt;

&lt;p&gt;Our agent can now use tools and request approval before executing them. But sometimes we do not want a conversational answer at all. We want a reliable C# object that can be validated, stored, or passed to the next workflow step.&lt;/p&gt;

&lt;p&gt;In the next article, we will look at Structured Output.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/tools/function-tools" rel="noopener noreferrer"&gt;Using function tools with an agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/tools/tool-approval" rel="noopener noreferrer"&gt;Using function tools with human in the loop approvals&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/tools/" rel="noopener noreferrer"&gt;Tools Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/middleware/" rel="noopener noreferrer"&gt;Agent Middleware&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai.aifunctionfactory.create" rel="noopener noreferrer"&gt;AIFunctionFactory.Create Method&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai.chatclientextensions.asaiagent?view=agent-framework-dotnet-latest" rel="noopener noreferrer"&gt;ChatClientExtensions.AsAIAgent Method&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>RAG with EF Core and pgvector</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Thu, 07 May 2026 13:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/rag-with-ef-core-and-pgvector-fge</link>
      <guid>https://dev.to/lukaswalter/rag-with-ef-core-and-pgvector-fge</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/rag-efcore-pgvector/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Developers often start RAG apps using tutorials that recommend dedicated vector databases. &lt;/p&gt;

&lt;p&gt;&lt;code&gt;Step 1: Sign up for a vector database like Pinecone or Qdrant.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This adds a costly SaaS service to your architecture or requires you to manage it yourself.&lt;/p&gt;

&lt;p&gt;And if you are building line-of-business applications in .NET, dedicated vector databases often introduce another problem: Data Synchronization.&lt;/p&gt;

&lt;p&gt;If core entities like Products, Customers, or SupportTickets exist in a relational database and vector embeddings reside in a specialized vector DB, you face a distributed systems challenge. What if a product is deleted or its description updated? Synchronizing datastores becomes daunting.&lt;/p&gt;

&lt;p&gt;A pragmatic solution? Store your vectors alongside your relational data.&lt;/p&gt;

&lt;p&gt;Using PostgreSQL, the pgvector extension transforms your relational database into a powerful vector search engine. Better yet, it integrates seamlessly with Entity Framework Core.&lt;/p&gt;

&lt;p&gt;You can build a RAG application without adding any new infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Install the Required Packages
&lt;/h2&gt;

&lt;p&gt;Start by adding the pgvector EF Core integration package.&lt;br&gt;
Run the following command in your project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet add package Pgvector.EntityFrameworkCore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: The pgvector extension must be available in your PostgreSQL installation and enabled in the database you use. If you use the pgvector/pgvector Docker image, the extension is already installed, but it still needs to be enabled per database.&lt;/p&gt;

&lt;p&gt;You can enable it manually with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or let EF Core handle it through:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;modelBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HasPostgresExtension&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Define Your Entity
&lt;/h2&gt;

&lt;p&gt;Suppose you’re developing an internal knowledge base with a Document entity. Add a Vector property to store the embedding generated by an embedding model, for example OpenAI’s text-embedding-3-small, which produces 1536-dimensional vectors by default.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Pgvector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System.ComponentModel.DataAnnotations.Schema&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;Id&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Title&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Content&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// 1536 is the default dimension for OpenAI text-embedding-3-small.&lt;/span&gt;
    &lt;span class="c1"&gt;// Match this dimension to the embedding model you actually use.&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypeName&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"vector(1536)"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Vector&lt;/span&gt; &lt;span class="n"&gt;Embedding&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// We can still have standard relational data!&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;TenantId&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; 
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: &lt;code&gt;text-embedding-3-small&lt;/code&gt; produces 1536-dimensional embeddings by default.&lt;br&gt;
&lt;code&gt;text-embedding-3-large&lt;/code&gt; produces 3072-dimensional embeddings by default. pgvector can store vectors larger than 2000 dimensions, but HNSW/IVFFlat indexes for the regular &lt;code&gt;vector&lt;/code&gt; type support up to 2000 dimensions. If you use &lt;code&gt;text-embedding-3-large&lt;/code&gt;, either request fewer dimensions from the embedding API or evaluate &lt;code&gt;halfvec&lt;/code&gt;/&lt;code&gt;HalfVector&lt;/code&gt; for indexed search.&lt;/p&gt;
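
&lt;p&gt;Since the column is declared as &lt;code&gt;vector(1536)&lt;/code&gt;, an embedding of a different length fails at insert time. A small guard can surface that earlier. This is a sketch under assumptions: &lt;code&gt;EmbeddingGuard&lt;/code&gt; is a hypothetical helper, and the constant must match the dimension in your column definition.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using Pgvector;

// Hypothetical helper for illustration; keep the constant in sync with vector(1536).
public static class EmbeddingGuard
{
    public const int ExpectedDimensions = 1536;

    // Wraps raw embedding values in a pgvector Vector after checking their length.
    public static Vector ToVector(float[] values)
    {
        ArgumentNullException.ThrowIfNull(values);

        if (values.Length != ExpectedDimensions)
        {
            throw new ArgumentException(
                $"Expected {ExpectedDimensions} dimensions but got {values.Length}.",
                nameof(values));
        }

        return new Vector(values);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Assigning &lt;code&gt;document.Embedding = EmbeddingGuard.ToVector(values)&lt;/code&gt; then fails fast with a clear message if the embedding model or its configured dimensions change.&lt;/p&gt;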
&lt;h2&gt;
  
  
  Step 3: Configure the DbContext
&lt;/h2&gt;

&lt;p&gt;Configure Entity Framework Core to enable the vector extension in PostgreSQL, and add an HNSW (Hierarchical Navigable Small World) index to the embedding column.&lt;br&gt;
For small datasets, exact search without an index can be fine. As the number of vectors grows, an approximate index such as HNSW often becomes important for latency.&lt;/p&gt;

&lt;p&gt;pgvector can handle larger datasets efficiently, but HNSW is not magic. It is an approximate nearest-neighbor index with trade-offs between recall, speed, memory usage, and build time.&lt;/p&gt;

&lt;p&gt;For HNSW indexes, tune &lt;code&gt;m&lt;/code&gt; and &lt;code&gt;ef_construction&lt;/code&gt; during index creation. At query time, tune &lt;code&gt;hnsw.ef_search&lt;/code&gt; if you need better recall. Higher values usually improve recall, but increase query cost. For filtered vector search, also index your relational filter columns, for example &lt;code&gt;TenantId&lt;/code&gt;, and test the query plan with realistic data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.EntityFrameworkCore&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Pgvector.EntityFrameworkCore&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AppDbContext&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DbContext&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;DbSet&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Documents&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;AppDbContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DbContextOptions&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AppDbContext&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;OnModelCreating&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ModelBuilder&lt;/span&gt; &lt;span class="n"&gt;modelBuilder&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;modelBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HasPostgresExtension&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;modelBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Entity&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HasIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TenantId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;modelBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Entity&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HasIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HasMethod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hnsw"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HasOperators&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"vector_cosine_ops"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HasStorageParameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HasStorageParameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ef_construction"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;64&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make sure you register the vector types in your Program.cs when configuring the DbContext:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddDbContext&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AppDbContext&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseNpgsql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetConnectionString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"DefaultConnection"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseVector&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;// &amp;lt;-- Don't forget this!&lt;/span&gt;
    &lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Querying with LINQ
&lt;/h2&gt;

&lt;p&gt;Because our vectors live in the same database as our relational data, we can combine semantic vector search with traditional SQL filtering in a single LINQ query.&lt;/p&gt;

&lt;p&gt;Dedicated vector databases also support metadata filtering. Qdrant and Pinecone, for example, both provide filtered vector search. The difference is not that filtering is impossible elsewhere. The difference is architectural: if your source of truth already lives in PostgreSQL, keeping vectors, metadata, deletes, updates, permissions, and document versions in sync across another datastore adds system complexity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;SearchKnowledgeBaseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;currentTenantId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;userQuestion&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 1. Turn the user's question into a vector using your preferred AI library &lt;/span&gt;
    &lt;span class="c1"&gt;// (e.g., Microsoft.Extensions.AI)&lt;/span&gt;
    &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;embeddingArray&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_aiService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GenerateEmbeddingAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userQuestion&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;queryVector&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingArray&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 2. Combine vector search with relational filters&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;relevantDocs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_dbContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Documents&lt;/span&gt;
        &lt;span class="c1"&gt;// Relational filter: scope results to the current tenant&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TenantId&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;currentTenantId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;// Vector Search: Order by semantic similarity using Cosine Distance&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;OrderBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CosineDistance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queryVector&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;relevantDocs&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Combining Relational Filters and Vector Search
&lt;/h2&gt;

&lt;p&gt;When you call &lt;code&gt;ToListAsync()&lt;/code&gt;, EF Core translates the &lt;code&gt;CosineDistance()&lt;/code&gt; method directly into pgvector’s native &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; operator.&lt;/p&gt;
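
&lt;p&gt;To check the translation yourself, one option is EF Core's &lt;code&gt;ToQueryString()&lt;/code&gt;, which returns the SQL for a query without executing it. The sketch below reuses the &lt;code&gt;AppDbContext&lt;/code&gt; and &lt;code&gt;Document&lt;/code&gt; types from this post and assumes &lt;code&gt;TenantId&lt;/code&gt; is an &lt;code&gt;int&lt;/code&gt;; the &lt;code&gt;QueryInspector&lt;/code&gt; class is a made-up name for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Linq;
using Microsoft.EntityFrameworkCore;
using Pgvector;
using Pgvector.EntityFrameworkCore;

// Hypothetical helper, not part of the sample repository.
public static class QueryInspector
{
    // Builds the tenant-scoped similarity query and returns its SQL text.
    public static string DescribeSearch(AppDbContext db, int tenantId, Vector queryVector)
    {
        var query = db.Documents
            .Where(d =&amp;gt; d.TenantId == tenantId)
            .OrderBy(d =&amp;gt; d.Embedding.CosineDistance(queryVector))
            .Take(5);

        // ToQueryString() renders the SQL without executing the query.
        return query.ToQueryString();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The returned text should contain an &lt;code&gt;ORDER BY&lt;/code&gt; clause that uses &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;. The exact SQL depends on your EF Core and provider versions.&lt;/p&gt;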

&lt;p&gt;PostgreSQL can combine relational filters and vector ordering in one query. For approximate HNSW indexes, filtered search still needs proper indexing and tuning, especially for selective tenant filters.&lt;/p&gt;
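
&lt;p&gt;One way to tune &lt;code&gt;hnsw.ef_search&lt;/code&gt; for a single query is to set it with &lt;code&gt;SET LOCAL&lt;/code&gt; inside a transaction, so the value does not leak to other users of a pooled connection. This is only a sketch and not part of the sample repository: the &lt;code&gt;TunedSearch&lt;/code&gt; class and the value 100 are assumptions, so measure recall and latency on your own data.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Pgvector;
using Pgvector.EntityFrameworkCore;

// Hypothetical wrapper around the AppDbContext from this post.
public class TunedSearch
{
    private readonly AppDbContext _dbContext;

    public TunedSearch(AppDbContext dbContext) =&amp;gt; _dbContext = dbContext;

    public async Task&amp;lt;List&amp;lt;Document&amp;gt;&amp;gt; SearchAsync(
        int tenantId, Vector queryVector, CancellationToken ct = default)
    {
        // SET LOCAL only lasts until the end of the current transaction.
        await using var tx = await _dbContext.Database.BeginTransactionAsync(ct);

        // 100 is an illustrative value; pgvector's default is 40.
        await _dbContext.Database.ExecuteSqlRawAsync("SET LOCAL hnsw.ef_search = 100", ct);

        var docs = await _dbContext.Documents
            .Where(d =&amp;gt; d.TenantId == tenantId)
            .OrderBy(d =&amp;gt; d.Embedding.CosineDistance(queryVector))
            .Take(5)
            .ToListAsync(ct);

        await tx.CommitAsync(ct);
        return docs;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;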

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;You don’t always need a dedicated vector database to build useful RAG features.&lt;/p&gt;

&lt;p&gt;If your application already uses PostgreSQL and your retrieval data is tightly coupled with relational business data, pgvector can be a very pragmatic starting point.&lt;/p&gt;

&lt;p&gt;You keep embeddings, metadata, permissions, and source records close together. You can query them through EF Core. And you avoid introducing a second datastore until you actually need one.&lt;/p&gt;

&lt;p&gt;Dedicated vector databases still have their place, especially at a larger scale or when vector search becomes a standalone platform concern. But for many .NET applications, PostgreSQL with pgvector is enough to start.&lt;/p&gt;

&lt;h2&gt;
  
  
  Runnable Sample
&lt;/h2&gt;

&lt;p&gt;I also created a small runnable sample repository for this post. &lt;/p&gt;

&lt;p&gt;Repository: &lt;a href="https://github.com/ovnecron/rag-efcore-pgvector" rel="noopener noreferrer"&gt;GitHub Repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The sample uses a deterministic embedding service so it can run locally without an OpenAI or Azure OpenAI API key.&lt;br&gt;
That service is only there to make the demo reproducible. It is not meant to produce production-quality semantic embeddings. For real applications, replace it with embeddings from your actual embedding model, for example &lt;code&gt;text-embedding-3-small&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.npgsql.org/" rel="noopener noreferrer"&gt;Npgsql - .NET Access to PostgreSQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;Vector Search in PostgreSQL: pgvector Official GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/ai/" rel="noopener noreferrer"&gt;Building AI Apps with .NET&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>csharp</category>
      <category>dotnet</category>
      <category>postgressql</category>
      <category>rag</category>
    </item>
    <item>
      <title>Dynamic Agent Context with AIContextProvider</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Wed, 06 May 2026 13:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/dynamic-agent-context-with-aicontextprovider-16i7</link>
      <guid>https://dev.to/lukaswalter/dynamic-agent-context-with-aicontextprovider-16i7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 6 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_6/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  When static prompts are no longer enough
&lt;/h2&gt;

&lt;p&gt;Most agents are created with a fixed system prompt and a fixed set of tools. But as systems become more capable, we sometimes need to adapt the prompt and tools to the situation, the user, or the time.&lt;/p&gt;

&lt;p&gt;The framework offers &lt;code&gt;AIContextProviders&lt;/code&gt; for this purpose. &lt;/p&gt;

&lt;p&gt;These provide context to AI agents and can be chained together to connect multiple sources.&lt;/p&gt;

&lt;p&gt;Providers are executed in the order they are registered, allowing you to layer multiple context modifications in a predictable way. You can configure the sequence in your agent's setup, ensuring that context from earlier providers is available to those that run later in the chain. This lets you hook into the pipeline before and after the LLM call, helping avoid unexpected behavior by keeping the flow transparent.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture of Context Providers
&lt;/h2&gt;

&lt;p&gt;To create a custom provider, we inherit from the &lt;code&gt;AIContextProvider&lt;/code&gt; class. The Microsoft Agent Framework handles the routing and pipeline management behind the scenes, leaving us with just two key methods to override for our custom logic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ProvideAIContextAsync&lt;/code&gt; (Pre-Call): This method is called just before the request is sent. Here we have full access to the current session, the previous instructions, and the pending message.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;StoreAIContextAsync&lt;/code&gt; (Post-Call): This method fires after the LLM has generated the response, but before it is returned to the user. Here, we can analyze the final response or any errors that might have occurred.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Memory
&lt;/h3&gt;

&lt;p&gt;Let's say we are building a barista agent for the coffee junkies among us.&lt;/p&gt;

&lt;p&gt;We want the AI to remember the user's brewing habits and gear, for example when the user says "I just bought a V60 pour-over" or "I really don't like acidic coffees."&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ProvideAIContextAsync&lt;/code&gt; fetches user facts from the database and appends them as context to the instructions for the call, for example: "User brews with a V60, prefers a 1:15 ratio, and loves dark, chocolatey roasts."&lt;/p&gt;

&lt;p&gt;&lt;code&gt;StoreAIContextAsync&lt;/code&gt; passes the user request to a cheap extractor agent, which finds new facts to save for future use, enabling the barista to learn over time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BaristaMemoryProvider&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AIContextProvider&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;UserIdStateKey&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"UserId"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;ICoffeeDatabase&lt;/span&gt; &lt;span class="n"&gt;_db&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IExtractorAgent&lt;/span&gt; &lt;span class="n"&gt;_extractor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;BaristaMemoryProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ICoffeeDatabase&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IExtractorAgent&lt;/span&gt; &lt;span class="n"&gt;extractor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_db&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;_extractor&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extractor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AIContext&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;ProvideAIContextAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;AIContextProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvokingContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;GetUserId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;userPrefs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetPreferencesAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userPrefs&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AIContext&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;AIContext&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Instructions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
                &lt;span class="s"&gt;$"User Coffee Profile: Brewer: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;userPrefs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Brewer&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, "&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt;
                &lt;span class="s"&gt;$"Ratio: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;userPrefs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Ratio&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, Roast: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;userPrefs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RoastType&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;."&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt; &lt;span class="nf"&gt;StoreAIContextAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;AIContextProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvokedContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;lastUserMessage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestMessages&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LastOrDefault&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Role&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;ChatRole&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;)?&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrWhiteSpace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastUserMessage&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;extractedFact&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_extractor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ExtractNewFactsAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastUserMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;extractedFact&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;GetUserId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SaveNewPreferenceAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;extractedFact&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;GetUserId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentSession&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;StateBag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TryGetValue&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;UserIdStateKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;
            &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;
            &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"anonymous"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Optimize Tokens
&lt;/h3&gt;

&lt;p&gt;Let's now imagine a virtual Guitar Tech agent. This agent is equipped with many tools (ScaleGenerator, TabFetcher, AmpEQDialer, PedalBoardRouter, Metronome, etc.). &lt;/p&gt;

&lt;p&gt;With a static setup, the schema for every tool is sent to the LLM with each request, even if the user just says "Hey man". This wastes hundreds or thousands of tokens per call.&lt;/p&gt;

&lt;p&gt;This time, we use &lt;code&gt;ProvideAIContextAsync&lt;/code&gt; to quickly pass the incoming user message to a fast, efficient agent whose primary task is to evaluate user intent. (Is this request about music theory, finding tabs, or dialing in a tone?)&lt;/p&gt;

&lt;p&gt;If the user asks, "How do I get a dirty Hendrix tone on my Strat?", the provider injects only the AmpEQDialer and PedalBoardRouter tools into the context just before the main LLM call. &lt;/p&gt;

&lt;p&gt;The main agent receives a tailored and lean toolset. This approach saves input tokens and reduces the risk of the AI making unnecessary tool calls.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GuitarTechToolProvider&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AIContextProvider&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IRoadieAgent&lt;/span&gt; &lt;span class="n"&gt;_roadieRouter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IToolRegistry&lt;/span&gt; &lt;span class="n"&gt;_tools&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;GuitarTechToolProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IRoadieAgent&lt;/span&gt; &lt;span class="n"&gt;roadieRouter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IToolRegistry&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_roadieRouter&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;roadieRouter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;_tools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AIContext&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;ProvideAIContextAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;AIContextProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvokingContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;lastMsg&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestMessages&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LastOrDefault&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Role&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;ChatRole&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;)?&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
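        &lt;span class="c1"&gt;// No user text to classify: skip the router call and add no tools&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrWhiteSpace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastMsg&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AIContext&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;intent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_roadieRouter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;DetermineIntentAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lastMsg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;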
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;selectedTools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AITool&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;
        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;intent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToneAndGear&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;selectedTools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"AmpEQDialer"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
                &lt;span class="n"&gt;selectedTools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"PedalBoardRouter"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MusicTheory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;selectedTools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ScaleGenerator"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;AIContext&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Tools&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;selectedTools&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Guardrails &amp;amp; Validation
&lt;/h3&gt;

&lt;p&gt;For this example, we will use an agent that helps us build Lego models. Let's ask it for a creative way to connect two Lego plates at a strange 45-degree angle. LLMs are eager to please and sometimes ignore existing rules, so the agent might confidently suggest using superglue. Obviously, we need a strict safety net to avoid ruining our Lego set because of a wrong answer.&lt;/p&gt;

&lt;p&gt;Via &lt;code&gt;ProvideAIContextAsync&lt;/code&gt;, we inject a strict boundary condition right alongside the user's prompt: "Constraint: You are a purist Lego Master Builder. Only reference legal, official connection techniques. Do not suggest modifying bricks, cutting, or using adhesives." &lt;/p&gt;

&lt;p&gt;But even with strict boundaries, the agent could give us the wrong answer.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;StoreAIContextAsync&lt;/code&gt; grabs the generated response before it is returned to the user. &lt;br&gt;
Again, we run the response through a fast, lightweight agent that looks for out-of-bounds keywords such as "glue", "stress", and "cut". &lt;/p&gt;

&lt;p&gt;If the validator detects an illegal technique, we can log the error immediately, strip the offending paragraph from the answer, or throw an exception to trigger a silent, automatic retry.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LegoGuardrailProvider&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AIContextProvider&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IValidatorAgent&lt;/span&gt; &lt;span class="n"&gt;_validator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;LegoGuardrailProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IValidatorAgent&lt;/span&gt; &lt;span class="n"&gt;validator&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_validator&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;validator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AIContext&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;ProvideAIContextAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;AIContextProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvokingContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;AIContext&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Instructions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Constraint: Only reference legal Lego connection techniques."&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;ValueTask&lt;/span&gt; &lt;span class="nf"&gt;StoreAIContextAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;AIContextProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvokedContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;lastAssistantMsg&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseMessages&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LastOrDefault&lt;/span&gt;&lt;span class="p"&gt;()?&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;validation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_validator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CheckForIllegalTechniquesAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;lastAssistantMsg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;validation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsSafe&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AIValidationException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;$"Safety violation: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;validation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reason&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
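
&lt;p&gt;The validator itself is not shown above. As a rough illustration of the keyword check, the following is a minimal, framework-free sketch. The names &lt;code&gt;KeywordValidator&lt;/code&gt; and &lt;code&gt;ValidationResult&lt;/code&gt; and the keyword list are assumptions made for this example and are not part of the Agent Framework; a real validator would more likely call a small model than match words.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Text.RegularExpressions;

// Hypothetical result type, mirroring the IsSafe/Reason shape used above.
public sealed record ValidationResult(bool IsSafe, string? Reason);

public static class KeywordValidator
{
    // Illustrative list only; tune it for your own domain.
    private static readonly string[] Forbidden =
        { "glue", "superglue", "cut", "cutting", "stress" };

    public static ValidationResult Check(string? text)
    {
        // Nothing to validate, so treat it as safe.
        if (string.IsNullOrWhiteSpace(text))
        {
            return new ValidationResult(true, null);
        }

        foreach (var word in Forbidden)
        {
            // \b keeps "cut" from matching inside words like "shortcut".
            var pattern = $@"\b{Regex.Escape(word)}\b";
            if (Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase))
            {
                return new ValidationResult(false, $"Forbidden keyword: {word}");
            }
        }

        return new ValidationResult(true, null);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Whole-word matching is a deliberate trade-off here: it avoids false positives on harmless words, but every variant you care about has to be listed explicitly.&lt;/p&gt;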



&lt;h2&gt;
  
  
  Alternatives
&lt;/h2&gt;

&lt;p&gt;In addition to the &lt;code&gt;AIContextProvider&lt;/code&gt;, the framework also offers the &lt;code&gt;MessageAIContextProvider&lt;/code&gt;. Instead of adjusting system instructions or tools in the background, this provider injects actual chat messages into the conversation.&lt;/p&gt;

&lt;p&gt;You can register the &lt;code&gt;MessageAIContextProvider&lt;/code&gt; as middleware. This is extremely helpful when working with agents we haven't created ourselves and whose parameters we cannot directly configure (such as remote agents connected via the A2A (Agent-to-Agent) protocol). By using it as middleware, we can still dynamically inject additional messages into them without needing access to their internal configuration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Context Providers are helpful in many situations, whether you need dynamic on-the-fly prompts, an intelligent background memory, or massive token optimization through tool injection.&lt;/p&gt;

&lt;p&gt;We now know how to tame our chat histories, dynamically inject memory, and optimize our token budgets. But what happens when words are no longer enough, and our AI needs to interact with the real world? &lt;/p&gt;

&lt;p&gt;In the next part of this series, we will explore Tools and Dependency Injection, and learn how to teach your AI to execute actual actions!&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.agents.ai.aicontextprovider?view=agent-framework-dotnet-latest" rel="noopener noreferrer"&gt;AIContextProvider Class&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.agents.ai.messageaicontextprovider?view=agent-framework-dotnet-latest" rel="noopener noreferrer"&gt;MessageAIContextProvider Class&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/conversations/context-providers?pivots=programming-language-csharp" rel="noopener noreferrer"&gt;Context Providers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/agent-pipeline?pivots=programming-language-csharp" rel="noopener noreferrer"&gt;Agent pipeline architecture&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>csharp</category>
      <category>dotnet</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Controlling Token Growth with Chat Reducers</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Mon, 04 May 2026 13:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/controlling-token-growth-with-chat-reducers-4do8</link>
      <guid>https://dev.to/lukaswalter/controlling-token-growth-with-chat-reducers-4do8</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 5 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_5/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Token Trap in Long Chats
&lt;/h2&gt;

&lt;p&gt;As we have seen in previous articles, stateless LLMs require us to continuously send the entire previous chat history so the AI can retain context.&lt;/p&gt;

&lt;p&gt;As each message is added to ongoing chats, input tokens accumulate. Even after many previous interactions, asking a simple question like “What is 1+1?” still results in the entire conversation history being sent.&lt;br&gt;
This brings its own problems, such as a full context window and rising costs.&lt;br&gt;
To address this, the framework introduces Chat Reducers.&lt;/p&gt;
&lt;h2&gt;
  
  
  Message Counting
&lt;/h2&gt;

&lt;p&gt;The simplest form of a Chat Reducer is “Message Counting”. &lt;br&gt;
Here, you define a target count. The reducer keeps the most recent messages up to that count, while preserving the first system message if present.&lt;/p&gt;

&lt;p&gt;To use this with an agent, configure a &lt;code&gt;ChatHistoryProvider&lt;/code&gt;, such as &lt;code&gt;InMemoryChatHistoryProvider&lt;/code&gt;, in &lt;code&gt;ChatClientAgentOptions&lt;/code&gt; and pass the reducer through &lt;code&gt;InMemoryChatHistoryProviderOptions&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. Define an IChatReducer that keeps the latest 10 non-system messages&lt;/span&gt;
&lt;span class="n"&gt;IChatReducer&lt;/span&gt; &lt;span class="n"&gt;messageCountReducer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;MessageCountingChatReducer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Configure the agent options with an in-memory chat history provider&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agentOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;ChatClientAgentOptions&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ChatHistoryProvider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InMemoryChatHistoryProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;InMemoryChatHistoryProviderOptions&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;ChatReducer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;messageCountReducer&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// 3. Create your agent from an IChatClient&lt;/span&gt;
&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agentOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The major advantage is that the token count and latency drop drastically the moment the limit takes effect. &lt;/p&gt;

&lt;p&gt;A limitation is that earlier context information is no longer available. If you share your name at the start of the conversation and refer to it after messages have been removed, the AI cannot recall it.&lt;/p&gt;
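
&lt;p&gt;To make the truncation behavior concrete, the following sketch re-implements the idea in plain C#: keep the first system message, if any, plus the most recent non-system messages up to the target count. The &lt;code&gt;Msg&lt;/code&gt; record and the &lt;code&gt;Truncate&lt;/code&gt; helper are hypothetical stand-ins for illustration. This is not the source of &lt;code&gt;MessageCountingChatReducer&lt;/code&gt;, whose details may differ.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a chat message.
public sealed record Msg(string Role, string Text);

public static class TruncationSketch
{
    public static List&amp;lt;Msg&amp;gt; Truncate(IReadOnlyList&amp;lt;Msg&amp;gt; history, int targetCount)
    {
        var result = new List&amp;lt;Msg&amp;gt;();

        // Preserve the first system message, if present.
        var system = history.FirstOrDefault(m =&amp;gt; m.Role == "system");
        if (system is not null)
        {
            result.Add(system);
        }

        // Keep only the most recent targetCount non-system messages.
        var rest = history.Where(m =&amp;gt; m.Role != "system").ToList();
        result.AddRange(rest.Skip(Math.Max(0, rest.Count - targetCount)));
        return result;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a target count of 2, a history consisting of a system message and three later messages is reduced to the system message and the last two messages. Anything said in the dropped message, such as your name, is gone.&lt;/p&gt;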

&lt;h2&gt;
  
  
  Summarization
&lt;/h2&gt;

&lt;p&gt;A more sophisticated approach is the &lt;code&gt;SummarizingChatReducer&lt;/code&gt;. &lt;br&gt;
This method uses an &lt;code&gt;IChatClient&lt;/code&gt; to summarize older messages during reduction.&lt;/p&gt;

&lt;p&gt;To set it up, you define the target count and an optional threshold. The target count is the number of recent messages that should remain after the reduction. The threshold controls how many messages beyond that target count are allowed before summarization is triggered.&lt;/p&gt;

&lt;p&gt;When the conversation grows beyond &lt;code&gt;targetCount + threshold&lt;/code&gt;, the reducer summarizes older messages. This summary replaces the old messages, while the most recent chat messages remain unchanged. &lt;/p&gt;
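
&lt;p&gt;The trigger condition can be written down as a one-line check. This is only a sketch of the rule as described here, assuming the reducer compares the total message count against &lt;code&gt;targetCount + threshold&lt;/code&gt;; the helper name &lt;code&gt;ShouldSummarize&lt;/code&gt; is made up for this example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;public static class SummarizationTrigger
{
    // Assumption: reduction starts once the message count
    // exceeds targetCount + threshold.
    public static bool ShouldSummarize(int messageCount, int targetCount, int threshold)
        =&amp;gt; messageCount &amp;gt; targetCount + threshold;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a target count of 1 and a threshold of 9, a history of 10 messages is left alone, while the 11th message triggers summarization.&lt;/p&gt;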

&lt;p&gt;A key feature for advanced scenarios is prompt customization. You can tailor the summarization prompt via the &lt;code&gt;SummarizationPrompt&lt;/code&gt; property to fit your application's domain, highlight specific information, or enforce a particular writing style, resulting in summaries that are more useful and relevant for your use case.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. You need a base IChatClient to perform the summarization calls&lt;/span&gt;
&lt;span class="n"&gt;IChatClient&lt;/span&gt; &lt;span class="n"&gt;innerChatClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// e.g., Azure OpenAI, OpenAI, or Ollama&lt;/span&gt;
&lt;span class="c1"&gt;// 2. Configure the reducer&lt;/span&gt;
&lt;span class="c1"&gt;// This keeps 1 recent message after summarization.&lt;/span&gt;
&lt;span class="c1"&gt;// threshold is "messages allowed beyond targetCount", so 9 means summarization&lt;/span&gt;
&lt;span class="c1"&gt;// starts once the history grows beyond 10.&lt;/span&gt;
&lt;span class="n"&gt;IChatReducer&lt;/span&gt; &lt;span class="n"&gt;summaryReducer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;SummarizingChatReducer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;innerChatClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;targetCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;SummarizationPrompt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
        &lt;span class="s"&gt;"Summarize the following conversation while keeping technical specs and user names."&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="c1"&gt;// 3. Configure the agent options with the reducer&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;summaryAgentOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;ChatClientAgentOptions&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ChatHistoryProvider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InMemoryChatHistoryProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;InMemoryChatHistoryProviderOptions&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;ChatReducer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;summaryReducer&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="c1"&gt;// 4. Create the agent&lt;/span&gt;
&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;smartAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;summaryAgentOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A significant benefit is that details from earlier in the conversation, such as your name or instructions, are included in the summary, allowing the AI to retain relevant information. &lt;/p&gt;

&lt;p&gt;The disadvantage is that generating this summary with the LLM also costs some tokens. Additionally, summarization introduces a slight performance impact, as the agent must pause and wait for the model to process and return the summary before proceeding. This can temporarily increase the latency for a user's next message each time summarization is triggered. In high-traffic scenarios, frequent summarizations may also affect overall throughput. You should consider these trade-offs and test the reducer settings under expected workloads to ensure that performance remains within acceptable limits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tip&lt;/strong&gt;: To keep costs and latency low, you don't have to use your powerful main model for summarization. You can pass a smaller, faster model as the innerChatClient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The framework doesn't provide an automatic fallback if summarization fails. A robust implementation should include a retry policy (via the IChatClient pipeline) or a custom mechanism to retain recent messages, ensuring the conversation remains fluid even in the event of, e.g., an API error.&lt;/p&gt;
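
&lt;p&gt;One possible shape for such a custom fallback is sketched below. It is framework-free on purpose: the &lt;code&gt;ReduceWithFallbackAsync&lt;/code&gt; helper and its parameters are hypothetical, and the primary delegate stands in for whatever summarization call you use. If that call throws, the helper falls back to keeping only the most recent messages.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ReducerFallback
{
    // Tries the primary reduction (for example, summarization).
    // If it fails, keeps only the most recent keepLast messages instead.
    public static async Task&amp;lt;IReadOnlyList&amp;lt;T&amp;gt;&amp;gt; ReduceWithFallbackAsync&amp;lt;T&amp;gt;(
        IReadOnlyList&amp;lt;T&amp;gt; messages,
        Func&amp;lt;IReadOnlyList&amp;lt;T&amp;gt;, Task&amp;lt;IReadOnlyList&amp;lt;T&amp;gt;&amp;gt;&amp;gt; primary,
        int keepLast)
    {
        try
        {
            return await primary(messages);
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Hypothetical policy: fall back to plain truncation.
            // Cancellation is not swallowed, so callers can still abort.
            return messages.Skip(Math.Max(0, messages.Count - keepLast)).ToList();
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The fallback loses older context for that one turn, but the conversation keeps moving, and the next reduction can attempt summarization again.&lt;/p&gt;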

&lt;h2&gt;
  
  
  Practical Comparison
&lt;/h2&gt;

&lt;p&gt;Which reducer you choose depends heavily on your specific use case. &lt;/p&gt;

&lt;p&gt;It is always a balancing act between the value of retaining old messages, the cost of tokens, and the model's maximum context size.&lt;/p&gt;

&lt;p&gt;Use pure truncation (Message Counting) for simple use cases, where old topics quickly become irrelevant. &lt;/p&gt;

&lt;p&gt;Use Summarization for complex, in-depth agents, where the user might still want to refer back to earlier facts even after 15 minutes of chatting.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Message Counting (Truncation)&lt;/th&gt;
&lt;th&gt;Summarization&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simple bots, high-volume support&lt;/td&gt;
&lt;td&gt;Complex assistants, deep analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lost once it drops off the list&lt;/td&gt;
&lt;td&gt;Retained in condensed form&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Token Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lowest (zero cost for reduction)&lt;/td&gt;
&lt;td&gt;Moderate (costs tokens to summarize)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Set and forget&lt;/td&gt;
&lt;td&gt;Requires custom prompts &amp;amp; error handling&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Chat Reducers let us control conversation length and token costs efficiently.&lt;/p&gt;

&lt;p&gt;Next, we'll explore &lt;code&gt;AIContextProviders&lt;/code&gt;, which allow agents to dynamically inject context and extract new memories, providing persistent memory while optimizing token usage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai.summarizingchatreducer?view=net-10.0-pp" rel="noopener noreferrer"&gt;SummarizingChatReducer Class&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai.messagecountingchatreducer?view=net-10.0-pp" rel="noopener noreferrer"&gt;MessageCountingChatReducer Class&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>csharp</category>
      <category>dotnet</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>State Management and Chat History</title>
      <dc:creator>Lukas Walter </dc:creator>
      <pubDate>Fri, 01 May 2026 14:30:00 +0000</pubDate>
      <link>https://dev.to/lukaswalter/state-management-and-chat-history-5a7g</link>
      <guid>https://dev.to/lukaswalter/state-management-and-chat-history-5a7g</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is Part 4 of my series on the Microsoft Agent Framework. You can read the original post over on &lt;a href="https://www.lukaswalter.dev/posts/agentframework_1_4/" rel="noopener noreferrer"&gt;lukaswalter.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction: Why AIs are stateless
&lt;/h2&gt;

&lt;p&gt;Large Language Models (LLMs) are stateless. Ask, “How many levels are in Super Mario 64?” and you’ll get an answer. Ask, “How many stars are there?” right after, and the AI often won’t recognize you mean the game. It may return an unrelated number.&lt;/p&gt;

&lt;p&gt;Each LLM request is isolated. For the AI to understand context, you must send the entire conversation history each time.&lt;/p&gt;

&lt;p&gt;With every additional chat question, the number of input tokens rises. You pay for the entire historical text sent back and forth.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Basic Approach: Agent Sessions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;In-Memory Storage:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To solve this, the Agent Framework provides the concept of Agent Sessions.&lt;br&gt;
Instead of just calling &lt;code&gt;agent.RunAsync("Question")&lt;/code&gt;, you create a session and include it with each call.&lt;br&gt;
The framework then automatically appends the new messages to a list in the background and sends them with the next call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Creating an Agent Session to store short-term context&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetNewSessionAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; 

&lt;span class="c1"&gt;// Passing the session with each request&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response1&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"How many levels are in Super Mario 64?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response2&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"How many stars are there?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; 
&lt;span class="c1"&gt;// The AI now understands you are still talking about the game!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By default, storage is in-memory only. If the app closes or the server restarts, the AI’s memory is wiped.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution for Long-Term Memory: The ChatHistoryProvider
&lt;/h2&gt;

&lt;p&gt;To offer features like ChatGPT’s left sidebar, where past chats resume, persistence is needed. This is where ChatHistoryProvider helps.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The StateBag Concept&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each session has a StateBag, a flexible key-value store. Store a unique session ID (e.g., a GUID) as a reference for your database or file system. By keeping the ID separate from the chat history, you can securely reference and restore sessions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Implementation: Saving and Restoring
&lt;/h2&gt;

&lt;p&gt;To build a provider, inherit from the ChatHistoryProvider class and override two main methods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyDatabaseChatHistoryProvider&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ChatHistoryProvider&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Step 1 - Saving&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt; &lt;span class="nf"&gt;StoreChatHistoryAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ChatHistoryContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Retrieve our Session ID from the StateBag&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;sessionId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StateBag&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"SessionId"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="c1"&gt;// Grab the newest messages from the context&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;newRequest&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestMessages&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;newResponse&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseMessages&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;// Serialize and save the context to disk or a database record&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;SaveMessagesToDatabaseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; 

    &lt;span class="c1"&gt;// Step 2 - Restoring&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IReadOnlyList&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;ProvideChatHistoryAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ChatHistoryContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Check if the StateBag already has a Session ID&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StateBag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;TryGetValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SessionId"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;sessionIdObj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// It's a new session, create a unique ID and store it in the StateBag&lt;/span&gt;
            &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StateBag&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"SessionId"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Guid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;NewGuid&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt; &lt;span class="c1"&gt;// No history to load yet&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// If the ID exists, read the previous chat messages from your database&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;sessionId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sessionIdObj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;historicalMessages&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;LoadMessagesFromDatabaseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;historicalMessages&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; 
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 1 - Saving (StoreChatHistoryAsync):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The framework calls this method after the AI responds, but before the user sees it. Here you can serialize the new request and response messages and store them, for example as JSON on disk or as a database record.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 - Restoring (ProvideChatHistoryAsync):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This method runs before the agent processes the user's new prompt. As the code above shows, a session whose StateBag has no ID yet gets a fresh one and an empty history. When a user returns and you pass a session with an existing StateBag ID, the method reads the saved file or database record, deserializes the text into chat messages, and returns them to the agent. The AI is caught up and ready to continue.&lt;/p&gt;
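
&lt;p&gt;The "serialize and store" step from both methods can be sketched with System.Text.Json. This is only a minimal sketch under stated assumptions, not the helpers used in the article: &lt;code&gt;StoredMessage&lt;/code&gt;, &lt;code&gt;JsonFileHistoryStore&lt;/code&gt;, and the temp-folder location are hypothetical names chosen for illustration. It uses a simplified role/text record instead of the framework's ChatMessage type, and synchronous file I/O instead of the async database calls shown earlier.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.IO;
using System.Linq;
using System.Text.Json;

// Hypothetical storage shape: one role/text pair per chat message.
public sealed record StoredMessage(string Role, string Text);

// Hypothetical helper for illustration; not part of the Agent Framework.
public static class JsonFileHistoryStore
{
    // Assumption: one JSON file per session in a temp folder.
    private static readonly string Folder =
        Path.Combine(Path.GetTempPath(), "chat-history");

    private static string PathFor(string sessionId)
    {
        return Path.Combine(Folder, sessionId + ".json");
    }

    // Saving: append the newest messages to whatever is already stored.
    public static void Append(string sessionId, StoredMessage[] newMessages)
    {
        var all = Load(sessionId).Concat(newMessages).ToArray();
        Directory.CreateDirectory(Folder);
        var json = JsonSerializer.Serialize(all, typeof(StoredMessage[]));
        File.WriteAllText(PathFor(sessionId), json);
    }

    // Restoring: an unknown session yields an empty history.
    public static StoredMessage[] Load(string sessionId)
    {
        var path = PathFor(sessionId);
        if (!File.Exists(path))
        {
            return new StoredMessage[0];
        }

        var parsed = JsonSerializer.Deserialize(
            File.ReadAllText(path), typeof(StoredMessage[]));
        return parsed as StoredMessage[] ?? new StoredMessage[0];
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Inside the provider, StoreChatHistoryAsync would map the turn's messages to &lt;code&gt;StoredMessage&lt;/code&gt; values and call &lt;code&gt;Append&lt;/code&gt;, while ProvideChatHistoryAsync would call &lt;code&gt;Load&lt;/code&gt; and map the result back to chat messages. A real database could replace the file calls without changing that shape.&lt;/p&gt;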

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With ChatHistoryProvider, you control where and how chat history is stored, so the AI can pick up a conversation with the user even after a long break.&lt;/p&gt;

&lt;p&gt;Now our AI remembers whole conversations. But what happens when the history grows so large that it hits token limits and drives up costs? Next, we’ll explore Chat Reducers: tools for summarizing or trimming old messages to save tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/conversations/?pivots=programming-language-csharp" rel="noopener noreferrer"&gt;Conversations &amp;amp; Memory overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/agent-framework/agents/conversations/storage?pivots=programming-language-csharp" rel="noopener noreferrer"&gt;Storage&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.agents.ai.agentsession?view=agent-framework-dotnet-latest" rel="noopener noreferrer"&gt;AgentSession Class&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>csharp</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
