<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: alinabi19</title>
    <description>The latest articles on DEV Community by alinabi19 (@alinabi19).</description>
    <link>https://dev.to/alinabi19</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F830106%2F68ef085c-3d41-4a7c-abc4-fbf6ecd35f59.jpeg</url>
      <title>DEV Community: alinabi19</title>
      <link>https://dev.to/alinabi19</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alinabi19"/>
    <language>en</language>
    <item>
      <title>Running Local AI Models in .NET with Ollama (Step-by-Step Guide)</title>
      <dc:creator>alinabi19</dc:creator>
      <pubDate>Fri, 13 Mar 2026 15:59:27 +0000</pubDate>
      <link>https://dev.to/alinabi19/running-local-ai-models-in-net-with-ollama-step-by-step-guide-4die</link>
      <guid>https://dev.to/alinabi19/running-local-ai-models-in-net-with-ollama-step-by-step-guide-4die</guid>
      <description>&lt;p&gt;Most developers who start experimenting with AI tend to follow the same path.&lt;/p&gt;

&lt;p&gt;You integrate a cloud AI API into your application. The prototype works beautifully. Responses are fast, integration is simple, and everything feels almost magical.&lt;/p&gt;

&lt;p&gt;Then the production questions start appearing.&lt;/p&gt;

&lt;p&gt;How much will this cost at scale?&lt;/p&gt;

&lt;p&gt;Do we really want sensitive data leaving our infrastructure?&lt;/p&gt;

&lt;p&gt;What happens if the API rate limits us?&lt;/p&gt;

&lt;p&gt;And the big one many developers eventually ask:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can we run AI models locally instead?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The answer is yes. And tools like &lt;strong&gt;Ollama&lt;/strong&gt; make it much easier than most developers expect.&lt;/p&gt;

&lt;p&gt;Ollama allows you to run powerful language models directly on your machine and access them through a simple HTTP API. This means you can integrate local AI into &lt;strong&gt;ASP.NET Core APIs, background services, or internal tools&lt;/strong&gt; without relying on external providers.&lt;/p&gt;

&lt;p&gt;In this guide we will walk through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;why local AI models are becoming popular&lt;/li&gt;
&lt;li&gt;how Ollama works&lt;/li&gt;
&lt;li&gt;how to run models locally&lt;/li&gt;
&lt;li&gt;how to call Ollama from a .NET application&lt;/li&gt;
&lt;li&gt;how to build a simple AI-powered ASP.NET Core endpoint&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are a .NET developer curious about integrating AI without depending entirely on cloud APIs, this is a great place to start.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Run AI Models Locally?
&lt;/h2&gt;

&lt;p&gt;Cloud AI APIs are extremely powerful, but they are not always the best solution for every scenario.&lt;/p&gt;

&lt;p&gt;Running models locally offers a few advantages that become very attractive in production environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  No API Usage Costs
&lt;/h3&gt;

&lt;p&gt;Most cloud AI providers charge based on &lt;strong&gt;tokens or requests&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That works well for prototypes. But once usage grows, costs can scale quickly.&lt;/p&gt;

&lt;p&gt;Running models locally removes &lt;strong&gt;per-request API fees&lt;/strong&gt;. You still pay for hardware and power, but the cost no longer grows with every call, which makes a big difference for internal tools or heavy workloads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Full Data Privacy
&lt;/h3&gt;

&lt;p&gt;Many enterprise systems process sensitive information such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;internal documentation&lt;/li&gt;
&lt;li&gt;support tickets&lt;/li&gt;
&lt;li&gt;logs&lt;/li&gt;
&lt;li&gt;customer records&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sending this data to external AI APIs can raise &lt;strong&gt;security and compliance concerns&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Local models keep everything &lt;strong&gt;inside your infrastructure&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lower Latency
&lt;/h3&gt;

&lt;p&gt;Cloud inference requires a network round trip.&lt;/p&gt;

&lt;p&gt;Local inference removes that round trip.&lt;/p&gt;

&lt;p&gt;For internal assistants, dashboards, or developer tooling, this can result in noticeably faster responses, provided the machine can run the model quickly enough (see Resource Considerations below).&lt;/p&gt;

&lt;h3&gt;
  
  
  Offline AI Capabilities
&lt;/h3&gt;

&lt;p&gt;Local models can run without internet access.&lt;/p&gt;

&lt;p&gt;This is useful in environments like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;secure enterprise networks&lt;/li&gt;
&lt;li&gt;air-gapped systems&lt;/li&gt;
&lt;li&gt;developer tools running locally&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Ideal for Internal Tools
&lt;/h3&gt;

&lt;p&gt;Local AI is especially useful for building tools such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;internal chat assistants&lt;/li&gt;
&lt;li&gt;log summarization tools&lt;/li&gt;
&lt;li&gt;documentation search&lt;/li&gt;
&lt;li&gt;developer copilots&lt;/li&gt;
&lt;li&gt;AI-powered dashboards&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is exactly the kind of scenario where &lt;strong&gt;Ollama shines&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Ollama?
&lt;/h2&gt;

&lt;p&gt;Ollama is a tool that allows developers to run &lt;strong&gt;LLMs (large language models) locally&lt;/strong&gt; with minimal setup.&lt;/p&gt;

&lt;p&gt;Instead of manually managing model weights, runtime environments, and inference servers, Ollama handles the heavy lifting.&lt;/p&gt;

&lt;p&gt;It manages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;model downloads&lt;/li&gt;
&lt;li&gt;model execution&lt;/li&gt;
&lt;li&gt;memory handling&lt;/li&gt;
&lt;li&gt;inference APIs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once installed, Ollama exposes a &lt;strong&gt;local HTTP API&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That means any language capable of making HTTP requests can interact with it, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;C#&lt;/li&gt;
&lt;li&gt;Python&lt;/li&gt;
&lt;li&gt;JavaScript&lt;/li&gt;
&lt;li&gt;Go&lt;/li&gt;
&lt;li&gt;Java&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For backend developers, this makes integration extremely straightforward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Supported Models
&lt;/h2&gt;

&lt;p&gt;Ollama supports many popular open-source models, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Llama 3&lt;/li&gt;
&lt;li&gt;Mistral&lt;/li&gt;
&lt;li&gt;Gemma&lt;/li&gt;
&lt;li&gt;Code Llama&lt;/li&gt;
&lt;li&gt;various other community models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Different models are optimized for different tasks.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Best Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;llama3&lt;/td&gt;
&lt;td&gt;General AI tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mistral&lt;/td&gt;
&lt;td&gt;Fast responses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;codellama&lt;/td&gt;
&lt;td&gt;Code generation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gemma&lt;/td&gt;
&lt;td&gt;Lightweight inference&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;One of the biggest advantages of Ollama is how easy it is to &lt;strong&gt;switch between models&lt;/strong&gt;.&lt;br&gt;
Before choosing a model, it's worth understanding the hardware requirements involved in running these models locally.&lt;/p&gt;
&lt;h2&gt;
  
  
  Resource Considerations
&lt;/h2&gt;

&lt;p&gt;Before running models locally, it is important to understand that LLMs still require system resources.&lt;/p&gt;

&lt;p&gt;One thing you'll quickly notice when running local models is that inference latency can vary significantly depending on hardware.&lt;/p&gt;

&lt;p&gt;During development, it’s common for responses to take several seconds when running on CPU-only machines.&lt;/p&gt;

&lt;p&gt;For internal tools this is usually acceptable, but it’s worth keeping in mind when designing user-facing APIs.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Llama 3 8B models typically require &lt;strong&gt;8–16GB RAM&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;CPU inference works but may be slower&lt;/li&gt;
&lt;li&gt;GPUs significantly improve performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For many internal tools, smaller or quantized models provide the best balance between performance and resource usage.&lt;/p&gt;
&lt;h2&gt;
  
  
  Architecture of a .NET Application Using Ollama
&lt;/h2&gt;

&lt;p&gt;Before writing code, it helps to understand where Ollama fits into the architecture.&lt;/p&gt;

&lt;p&gt;A typical integration looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client
   ↓
ASP.NET Core API
   ↓
AI Service Layer
   ↓
Ollama Local API
   ↓
Local LLM Model
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Request flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client sends a request to the ASP.NET Core API&lt;/li&gt;
&lt;li&gt;The API calls an AI service layer&lt;/li&gt;
&lt;li&gt;The service sends a prompt to Ollama&lt;/li&gt;
&lt;li&gt;Ollama runs the model locally&lt;/li&gt;
&lt;li&gt;The generated response returns to the client&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This separation keeps the architecture clean and maintainable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Installing Ollama
&lt;/h2&gt;

&lt;p&gt;First, install Ollama on your machine.&lt;/p&gt;

&lt;p&gt;Go to the official website:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;https://ollama.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Download the installer for your operating system. Ollama currently supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;macOS&lt;/li&gt;
&lt;li&gt;Linux&lt;/li&gt;
&lt;li&gt;Windows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Run the installer and complete the setup.&lt;/p&gt;

&lt;p&gt;After installation, &lt;strong&gt;open a terminal or command prompt&lt;/strong&gt; and verify that Ollama is installed correctly by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If Ollama is installed properly, you should see the installed version printed in the terminal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Download a Model
&lt;/h3&gt;

&lt;p&gt;Next, download a language model that Ollama will run locally.&lt;/p&gt;

&lt;p&gt;For this guide, we will use &lt;strong&gt;Llama 3&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull llama3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command downloads the model weights and prepares them for local inference.&lt;/p&gt;

&lt;p&gt;Depending on your internet speed, the download may take a few minutes because LLMs are typically several gigabytes in size.&lt;/p&gt;

&lt;h3&gt;
  
  
  Run the Model
&lt;/h3&gt;

&lt;p&gt;Once the model is downloaded, you can start it using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama run llama3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ollama will load the model and open an interactive prompt.&lt;/p&gt;

&lt;p&gt;Try entering a simple question:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Explain what REST APIs are
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the model responds with an answer, your local AI environment is working correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Local API Endpoint
&lt;/h3&gt;

&lt;p&gt;Behind the scenes, Ollama also exposes an HTTP API that applications can call.&lt;/p&gt;

&lt;p&gt;By default, the API runs at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;http://localhost:11434
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the endpoint your &lt;strong&gt;ASP.NET Core application will communicate with&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Creating a .NET 8 Web API
&lt;/h2&gt;

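&lt;p&gt;Next, create a new ASP.NET Core API project. With the .NET 8 SDK, the &lt;code&gt;webapi&lt;/code&gt; template generates a minimal API by default, so pass &lt;code&gt;--use-controllers&lt;/code&gt; to get the controller-based layout used in this guide.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet new webapi &lt;span class="nt"&gt;--use-controllers&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; LocalAIApi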
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A simple project structure might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Controllers
Services
Models
Program.cs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keeping AI logic separated into services helps maintain clean architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Calling the Ollama API from .NET
&lt;/h2&gt;

&lt;p&gt;Ollama exposes a simple endpoint for generating responses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST http://localhost:11434/api/generate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example request payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"model"&lt;/span&gt;: &lt;span class="s2"&gt;"llama3"&lt;/span&gt;,
  &lt;span class="s2"&gt;"prompt"&lt;/span&gt;: &lt;span class="s2"&gt;"Explain dependency injection in ASP.NET Core"&lt;/span&gt;,
  &lt;span class="s2"&gt;"stream"&lt;/span&gt;: &lt;span class="nb"&gt;false&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example .NET Call
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GenerateAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"llama3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;PostAsJsonAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"api/generate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EnsureSuccessStatusCode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadFromJsonAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;OllamaResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OllamaResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using a typed model instead of &lt;code&gt;dynamic&lt;/code&gt; makes the code safer and easier to maintain.&lt;/p&gt;
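
&lt;p&gt;To see why the typed model binds correctly, it helps to look at the JSON that comes back. The sketch below deserializes a hand-written sample shaped like a non-streaming &lt;code&gt;/api/generate&lt;/code&gt; reply (real replies carry additional fields), using the same web defaults that &lt;code&gt;ReadFromJsonAsync&lt;/code&gt; applies, so the lowercase &lt;code&gt;response&lt;/code&gt; key binds to the &lt;code&gt;Response&lt;/code&gt; property. The &lt;code&gt;ResponseMappingDemo&lt;/code&gt; class and the sample text are made up for illustration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Text.Json;

// Repeated here so the snippet compiles on its own.
public class OllamaResponse
{
    public string Response { get; set; } = string.Empty;
}

// Hypothetical helper, not part of the guide's project.
public static class ResponseMappingDemo
{
    // Hand-written sample for illustration; not output from a real model run.
    public const string SampleJson =
        "{\"model\":\"llama3\",\"response\":\"Hello from the model\",\"done\":true}";

    public static string ExtractText(string json)
    {
        // JsonSerializerDefaults.Web turns on case-insensitive property matching.
        // JSON fields with no matching property (model, done) are ignored.
        var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);
        var parsed = JsonSerializer.Deserialize&amp;lt;OllamaResponse&amp;gt;(json, options);
        return parsed?.Response ?? string.Empty;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calling &lt;code&gt;ResponseMappingDemo.ExtractText(ResponseMappingDemo.SampleJson)&lt;/code&gt; returns only the generated text, which is all the service layer needs.&lt;/p&gt;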

&lt;h2&gt;
  
  
  Step 4: Creating an AI Service Layer
&lt;/h2&gt;

&lt;p&gt;One design rule worth following:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Avoid putting AI logic directly inside controllers.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead, isolate it inside a service layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Service Interface
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;IAiService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GenerateAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Implementation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OllamaService&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IAiService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;OllamaService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_httpClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GenerateAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"llama3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;PostAsJsonAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"api/generate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;EnsureSuccessStatusCode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadFromJsonAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;OllamaResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
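
&lt;p&gt;One practical payoff of hiding Ollama behind &lt;code&gt;IAiService&lt;/code&gt; is that callers can be exercised without a running model. A minimal sketch of a test double follows; &lt;code&gt;StubAiService&lt;/code&gt; is a hypothetical name and is not part of the guide's project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Threading;
using System.Threading.Tasks;

// Repeated here so the snippet compiles on its own.
public interface IAiService
{
    Task&amp;lt;string&amp;gt; GenerateAsync(string prompt, CancellationToken cancellationToken);
}

// Hypothetical test double: echoes the prompt instead of calling Ollama.
public sealed class StubAiService : IAiService
{
    public Task&amp;lt;string&amp;gt; GenerateAsync(string prompt, CancellationToken cancellationToken)
    {
        // Honor cancellation the same way the real service would.
        cancellationToken.ThrowIfCancellationRequested();
        return Task.FromResult($"stub reply to: {prompt}");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Anything that accepts an &lt;code&gt;IAiService&lt;/code&gt; can be handed this stub in a unit test, which keeps tests fast and independent of local hardware.&lt;/p&gt;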



&lt;h2&gt;
  
  
  Step 5: Registering the Service
&lt;/h2&gt;

&lt;p&gt;In &lt;code&gt;Program.cs&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddHttpClient&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IAiService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OllamaService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BaseAddress&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"http://localhost:11434"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Timeout&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



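&lt;p&gt;Registering the service with &lt;code&gt;AddHttpClient&lt;/code&gt; creates a typed client backed by &lt;code&gt;IHttpClientFactory&lt;/code&gt;, which pools and recycles the underlying message handlers. That avoids the socket exhaustion and stale DNS problems that come from creating &lt;code&gt;HttpClient&lt;/code&gt; instances by hand.&lt;/p&gt;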

&lt;h2&gt;
  
  
  Step 6: Building an AI Endpoint
&lt;/h2&gt;

&lt;p&gt;Now expose the AI functionality through a controller.&lt;/p&gt;

&lt;p&gt;Request model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiRequest&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Prompt&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Controller:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ApiController&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"api/ai"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiController&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ControllerBase&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IAiService&lt;/span&gt; &lt;span class="n"&gt;_aiService&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;AiController&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IAiService&lt;/span&gt; &lt;span class="n"&gt;aiService&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_aiService&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aiService&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;HttpPost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"generate"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IActionResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;Generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;AiRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrWhiteSpace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;BadRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Prompt is required."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_aiService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GenerateAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your ASP.NET Core API now exposes a &lt;strong&gt;local AI-powered endpoint&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 7: Switching Models
&lt;/h2&gt;

&lt;p&gt;One of the nicest things about Ollama is how easy it is to switch between models.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull mistral
ollama pull codellama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



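&lt;p&gt;Then change the &lt;code&gt;model&lt;/code&gt; value in the request. In &lt;code&gt;OllamaService&lt;/code&gt; that means replacing the hard-coded &lt;code&gt;"llama3"&lt;/code&gt; string; the other fields stay the same:&lt;br&gt;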
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"model"&lt;/span&gt;: &lt;span class="s2"&gt;"mistral"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, trying a few models against your own prompts usually serves you better than defaulting to the largest one.&lt;/p&gt;
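
&lt;p&gt;Switching is easier still if the model name lives in configuration instead of code. The sketch below is one possible shape, assuming a configuration section named &lt;code&gt;Ollama&lt;/code&gt;; the &lt;code&gt;OllamaOptions&lt;/code&gt; and &lt;code&gt;OllamaOptionsLoader&lt;/code&gt; names are hypothetical.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.Extensions.Configuration;

// Hypothetical options class; the defaults mirror the values used in this guide.
public sealed class OllamaOptions
{
    public string BaseUrl { get; set; } = "http://localhost:11434";
    public string Model { get; set; } = "llama3";
}

public static class OllamaOptionsLoader
{
    // Reads the assumed "Ollama" section, e.g. { "Ollama": { "Model": "mistral" } }
    // in appsettings.json, and falls back to the defaults when it is missing.
    public static OllamaOptions Load(IConfiguration configuration)
    {
        return configuration.GetSection("Ollama").Get&amp;lt;OllamaOptions&amp;gt;()
            ?? new OllamaOptions();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In &lt;code&gt;Program.cs&lt;/code&gt;, &lt;code&gt;OllamaOptionsLoader.Load(builder.Configuration)&lt;/code&gt; would supply the values, and the service could send &lt;code&gt;options.Model&lt;/code&gt; in the payload so that trying another model becomes a configuration change.&lt;/p&gt;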

&lt;h2&gt;
  
  
  Step 8: Improving Performance
&lt;/h2&gt;

&lt;p&gt;Local models can still be resource intensive.&lt;/p&gt;

&lt;p&gt;A few practical optimizations can help significantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Streaming&lt;/strong&gt;&lt;br&gt;
Streaming responses improves perceived latency for longer outputs.&lt;/p&gt;
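
&lt;p&gt;When &lt;code&gt;stream&lt;/code&gt; is set to &lt;code&gt;true&lt;/code&gt;, Ollama sends the reply as newline-delimited JSON objects, each carrying a text fragment in &lt;code&gt;response&lt;/code&gt; and a &lt;code&gt;done&lt;/code&gt; flag on the last one. A minimal sketch of reading such a body follows; &lt;code&gt;OllamaStreamReader&lt;/code&gt; is a hypothetical helper name, and error handling is left out.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Text.Json;
using System.Threading;

// Hypothetical helper: turns a streamed reply body into text fragments.
public static class OllamaStreamReader
{
    public static async IAsyncEnumerable&amp;lt;string&amp;gt; ReadFragmentsAsync(
        Stream body,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        using var reader = new StreamReader(body);

        // Each non-empty line is one complete JSON object.
        while (await reader.ReadLineAsync(cancellationToken) is { } line)
        {
            if (string.IsNullOrWhiteSpace(line))
                continue;

            using var chunk = JsonDocument.Parse(line);
            var root = chunk.RootElement;

            if (root.TryGetProperty("response", out var text))
                yield return text.GetString() ?? string.Empty;

            // Stop once the final chunk reports done = true.
            if (root.TryGetProperty("done", out var done))
            {
                if (done.ValueKind == JsonValueKind.True)
                    yield break;
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To feed it a live reply, send the request through &lt;code&gt;HttpClient.SendAsync&lt;/code&gt; with &lt;code&gt;HttpCompletionOption.ResponseHeadersRead&lt;/code&gt; and pass the stream from &lt;code&gt;response.Content.ReadAsStreamAsync&lt;/code&gt; to the reader. &lt;code&gt;PostAsJsonAsync&lt;/code&gt; buffers the whole body first, which would defeat the purpose.&lt;/p&gt;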

&lt;p&gt;&lt;strong&gt;Reduce Prompt Size&lt;/strong&gt;&lt;br&gt;
Large prompts increase inference time.&lt;br&gt;
Send only the context that the model truly needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cache Repeated Requests&lt;/strong&gt;&lt;br&gt;
If prompts repeat frequently, caching responses can reduce compute usage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Keep Calls Asynchronous&lt;/strong&gt;&lt;br&gt;
Always use async APIs when calling models to keep your backend scalable.&lt;/p&gt;
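
&lt;p&gt;The caching idea above could be sketched with &lt;code&gt;IMemoryCache&lt;/code&gt;. This is a minimal illustration rather than a tuned implementation: &lt;code&gt;CachedAiService&lt;/code&gt; is a hypothetical class, the &lt;code&gt;generate&lt;/code&gt; delegate stands in for whatever method actually calls the model, and the ten-minute lifetime is an assumed value.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.Extensions.Caching.Memory;

// Hypothetical wrapper that caches responses for identical prompts.
public sealed class CachedAiService
{
    private readonly IMemoryCache _cache;
    private readonly Func&amp;lt;string, CancellationToken, Task&amp;lt;string&amp;gt;&amp;gt; _generate;

    // "generate" stands in for whatever method actually calls the model.
    public CachedAiService(
        IMemoryCache cache,
        Func&amp;lt;string, CancellationToken, Task&amp;lt;string&amp;gt;&amp;gt; generate)
    {
        _cache = cache;
        _generate = generate;
    }

    public async Task&amp;lt;string&amp;gt; GenerateAsync(
        string prompt, CancellationToken cancellationToken = default)
    {
        // Only an exact prompt match counts as a cache hit.
        var result = await _cache.GetOrCreateAsync(prompt, entry =&amp;gt;
        {
            // Assumed lifetime; tune it to how quickly answers go stale.
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            return _generate(prompt, cancellationToken);
        });

        return result ?? string.Empty;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The cache itself is registered with &lt;code&gt;builder.Services.AddMemoryCache()&lt;/code&gt;. If the application switches between models, the model name should be part of the cache key as well.&lt;/p&gt;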

&lt;h2&gt;
  
  
  When Local Models Are Not the Best Choice
&lt;/h2&gt;

&lt;p&gt;Local models are powerful, but they are not always the right solution.&lt;/p&gt;

&lt;p&gt;For example, cloud AI services may still be better when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you need extremely large models&lt;/li&gt;
&lt;li&gt;you require massive scaling&lt;/li&gt;
&lt;li&gt;GPU infrastructure is unavailable&lt;/li&gt;
&lt;li&gt;inference workloads are very high&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, many teams use a &lt;strong&gt;hybrid approach&lt;/strong&gt;, combining cloud models and local models depending on the use case.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The local AI ecosystem is moving incredibly fast right now.&lt;/p&gt;

&lt;p&gt;Just a few years ago, running large language models required specialized machine learning environments. Today, tools like Ollama make it possible for everyday backend developers to experiment with local LLMs using familiar technologies.&lt;/p&gt;

&lt;p&gt;From a .NET perspective, integrating Ollama is actually much simpler than it looks at first.&lt;/p&gt;

&lt;p&gt;Instead of relying entirely on external APIs, you can build AI-powered systems that are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;private&lt;/li&gt;
&lt;li&gt;cost-efficient&lt;/li&gt;
&lt;li&gt;low latency&lt;/li&gt;
&lt;li&gt;fully controlled by your infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For internal tools, developer assistants, and AI-powered APIs, local models are quickly becoming a practical and powerful alternative to cloud AI services.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Few Things Worth Remembering
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Ollama makes running local LLMs simple&lt;/li&gt;
&lt;li&gt;.NET applications can interact with Ollama using HTTP APIs&lt;/li&gt;
&lt;li&gt;Use a dedicated AI service layer to keep architecture clean&lt;/li&gt;
&lt;li&gt;Choose models based on your use case, not just size&lt;/li&gt;
&lt;li&gt;Optimize prompts and responses for better performance&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dotnet</category>
      <category>ai</category>
      <category>ollama</category>
      <category>backend</category>
    </item>
    <item>
      <title>Fixing CORS Errors in ASP.NET Core APIs (The Real Reasons)</title>
      <dc:creator>alinabi19</dc:creator>
      <pubDate>Thu, 12 Mar 2026 11:38:04 +0000</pubDate>
      <link>https://dev.to/alinabi19/fixing-cors-errors-in-aspnet-core-apis-the-real-reasons-583c</link>
      <guid>https://dev.to/alinabi19/fixing-cors-errors-in-aspnet-core-apis-the-real-reasons-583c</guid>
      <description>&lt;p&gt;If you've worked on APIs for a while, you've probably run into this situation.&lt;/p&gt;

&lt;p&gt;Your frontend calls your ASP.NET Core API.&lt;/p&gt;

&lt;p&gt;Everything looks correct.&lt;/p&gt;

&lt;p&gt;But the browser console suddenly throws this error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Access to fetch at &lt;span class="s1"&gt;'https://api.example.com'&lt;/span&gt; from origin &lt;span class="s1"&gt;'http://localhost:3000'&lt;/span&gt; has been blocked by CORS policy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So you try the exact same request in Postman.&lt;/p&gt;

&lt;p&gt;It works perfectly.&lt;/p&gt;

&lt;p&gt;Now you're confused.&lt;/p&gt;

&lt;p&gt;You check the API.&lt;br&gt;
You check the frontend.&lt;br&gt;
Eventually you start adding random CORS settings hoping something finally works.&lt;/p&gt;

&lt;p&gt;Before long, your &lt;code&gt;Program.cs&lt;/code&gt; starts looking like a CORS experiment lab.&lt;/p&gt;

&lt;p&gt;Almost every backend developer hits this problem at some point.&lt;/p&gt;

&lt;p&gt;What makes CORS errors frustrating is that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They only appear in browsers&lt;/li&gt;
&lt;li&gt;API testing tools like Postman work fine&lt;/li&gt;
&lt;li&gt;Small configuration mistakes can break everything&lt;/li&gt;
&lt;li&gt;Middleware order matters in ASP.NET Core&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After debugging this issue across multiple APIs, the same root causes show up again and again. Let’s walk through what’s actually happening and how to fix it properly.&lt;/p&gt;
&lt;h2&gt;
  
  
  What CORS Actually Is (In Simple Terms)
&lt;/h2&gt;

&lt;p&gt;CORS stands for &lt;strong&gt;Cross-Origin Resource Sharing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Browsers enforce a security rule called the Same-Origin Policy.&lt;/p&gt;

&lt;p&gt;It means scripts on a web page can normally only read responses from the same origin that served the page.&lt;/p&gt;

&lt;p&gt;An origin is defined as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Protocol + Domain + Port
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:3000
https://api.myapp.com
https://app.myapp.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even small differences create a different origin.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:3000
http://localhost:5000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same machine, different port - still considered &lt;strong&gt;cross-origin&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When a frontend application calls an API on another origin, the browser checks whether the API explicitly allows it.&lt;/p&gt;

&lt;p&gt;If the response doesn't include the correct CORS headers, &lt;strong&gt;the browser blocks the request&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Important point many developers miss:&lt;/p&gt;

&lt;p&gt;Your API is usually &lt;strong&gt;not rejecting the request&lt;/strong&gt;.&lt;br&gt;
The &lt;strong&gt;browser is refusing to expose the response&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is also why the request works in Postman - Postman doesn't enforce browser security rules.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why CORS Errors Happen in ASP.NET Core APIs
&lt;/h2&gt;

&lt;p&gt;After debugging CORS issues in several projects, these are the most common causes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Missing CORS middleware&lt;/strong&gt;&lt;br&gt;
CORS support was never enabled in the API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrong middleware order&lt;/strong&gt;&lt;br&gt;
CORS middleware must run before endpoints execute.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Incorrect origin configuration&lt;/strong&gt;&lt;br&gt;
The frontend origin isn't included in allowed origins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Preflight request failure&lt;/strong&gt;&lt;br&gt;
Browsers sometimes send an &lt;code&gt;OPTIONS&lt;/code&gt; request before the real request.&lt;/p&gt;

&lt;p&gt;If this fails, the real request never happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Credential configuration mistakes&lt;/strong&gt;&lt;br&gt;
Using cookies or authorization headers incorrectly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using &lt;code&gt;AllowAnyOrigin()&lt;/code&gt; with credentials&lt;/strong&gt;&lt;br&gt;
Current versions of ASP.NET Core throw an exception for this combination, and browsers reject a wildcard origin on credentialed requests.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 1: The Correct Way to Enable CORS in ASP.NET Core
&lt;/h2&gt;

&lt;p&gt;CORS configuration starts in &lt;code&gt;Program.cs&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Register the CORS policy&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WebApplication&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddCors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"FrontendPolicy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithOrigins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"http://localhost:3000"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyHeader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
              &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyMethod&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a &lt;strong&gt;named CORS policy&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enable the middleware&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseCors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"FrontendPolicy"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapControllers&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now requests from &lt;code&gt;http://localhost:3000&lt;/code&gt; will be allowed.&lt;/p&gt;

&lt;p&gt;A common mistake is &lt;strong&gt;registering CORS but forgetting to enable the middleware.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Understanding Preflight Requests (OPTIONS)
&lt;/h2&gt;

&lt;p&gt;Sometimes browsers send a &lt;strong&gt;preflight request&lt;/strong&gt; before the real request.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPTIONS /api/orders
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This usually happens when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The request uses custom headers, including &lt;code&gt;Authorization&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The request method is &lt;code&gt;PUT&lt;/code&gt;, &lt;code&gt;PATCH&lt;/code&gt;, or &lt;code&gt;DELETE&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The request body uses a &lt;code&gt;Content-Type&lt;/code&gt; such as &lt;code&gt;application/json&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The browser is essentially asking:&lt;/p&gt;

&lt;p&gt;"Is it okay if I send this request?"&lt;/p&gt;

&lt;p&gt;If the server doesn't respond with the correct CORS headers, the browser blocks the request before the real API call even happens.&lt;/p&gt;

&lt;p&gt;The good news is that &lt;strong&gt;ASP.NET Core handles preflight requests automatically&lt;/strong&gt; when the CORS middleware is configured correctly.&lt;/p&gt;

&lt;p&gt;You normally don't need to manually implement &lt;code&gt;OPTIONS&lt;/code&gt; endpoints.&lt;/p&gt;
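
&lt;p&gt;To see what a preflight exchange looks like without a browser, a small console program can send the &lt;code&gt;OPTIONS&lt;/code&gt; request itself. This is only a sketch: the URL, origin, and method below are placeholder values borrowed from the examples in this article, so replace them with your own.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Net.Http;

// Placeholder values; point these at your own API and frontend origin.
const string apiUrl = "https://api.example.com/api/orders";
const string frontendOrigin = "http://localhost:3000";

using var http = new HttpClient();
using var request = new HttpRequestMessage(HttpMethod.Options, apiUrl);

// These are the headers a browser adds to a preflight request.
request.Headers.Add("Origin", frontendOrigin);
request.Headers.Add("Access-Control-Request-Method", "PUT");
request.Headers.Add("Access-Control-Request-Headers", "content-type");

using var response = await http.SendAsync(request);

Console.WriteLine($"Status: {(int)response.StatusCode}");

// Print only the CORS-related response headers.
foreach (var header in response.Headers)
{
    if (header.Key.StartsWith("Access-Control-", StringComparison.OrdinalIgnoreCase))
    {
        Console.WriteLine($"{header.Key}: {string.Join(", ", header.Value)}");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When the policy matches, the response typically carries &lt;code&gt;Access-Control-Allow-Origin&lt;/code&gt;, &lt;code&gt;Access-Control-Allow-Methods&lt;/code&gt;, and &lt;code&gt;Access-Control-Allow-Headers&lt;/code&gt;. If none of them are printed, the browser would reject the preflight too.&lt;/p&gt;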

&lt;h2&gt;
  
  
  Step 3: The Most Common CORS Configuration Mistakes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Mistake 1: Using &lt;code&gt;AllowAnyOrigin()&lt;/code&gt; With Credentials
&lt;/h3&gt;

&lt;p&gt;This is extremely common.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyOrigin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowCredentials&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will not work.&lt;/p&gt;

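&lt;p&gt;Current versions of ASP.NET Core throw an &lt;code&gt;InvalidOperationException&lt;/code&gt; when this policy is built, and browsers reject a wildcard &lt;code&gt;Access-Control-Allow-Origin&lt;/code&gt; on credentialed requests, because allowing credentials from any origin is a security risk.&lt;/p&gt;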

&lt;p&gt;Correct approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithOrigins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"https://app.example.com"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowCredentials&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyHeader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyMethod&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Always specify explicit origins when credentials are involved.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mistake 2: Forgetting to Call &lt;code&gt;UseCors()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Developers often configure CORS but forget to enable the middleware.&lt;/p&gt;

&lt;p&gt;Incomplete (the services are registered, but the middleware never runs):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddCors&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Complete (keep the registration and also add the middleware to the pipeline):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseCors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"FrontendPolicy"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without the middleware, the policy never runs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mistake 3: Wrong Middleware Order
&lt;/h3&gt;

&lt;p&gt;Middleware order matters in ASP.NET Core.&lt;/p&gt;

&lt;p&gt;Correct order:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseRouting&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseCors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"FrontendPolicy"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthentication&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthorization&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapControllers&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



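&lt;p&gt;&lt;code&gt;UseCors()&lt;/code&gt; must come after &lt;code&gt;UseRouting()&lt;/code&gt; and before &lt;code&gt;UseAuthorization()&lt;/code&gt;. If it runs later in the pipeline, an earlier middleware can end the request (for example with a 401) before any CORS headers are added.&lt;/p&gt;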

&lt;h2&gt;
  
  
  Step 4: Handling CORS for Frontend Frameworks
&lt;/h2&gt;

&lt;p&gt;During development, frontend frameworks usually run on different ports.&lt;/p&gt;

&lt;p&gt;Examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;React   → http://localhost:3000
Vite    → http://localhost:5173
Angular → http://localhost:4200
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your CORS policy must allow these origins.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithOrigins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"http://localhost:3000"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"http://localhost:5173"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"http://localhost:4200"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyHeader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyMethod&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For production environments, you should only allow trusted domains.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://app.mycompany.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Avoid using &lt;code&gt;AllowAnyOrigin()&lt;/code&gt; in production APIs.&lt;/p&gt;
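
&lt;p&gt;One way to keep the development ports out of production is to branch on the hosting environment when the policy is built. The fragment below is a sketch that reuses the &lt;code&gt;FrontendPolicy&lt;/code&gt; name and the example origins from this article. The production domain is a placeholder.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;var builder = WebApplication.CreateBuilder(args);

builder.Services.AddCors(options =&amp;gt;
{
    options.AddPolicy("FrontendPolicy", policy =&amp;gt;
    {
        if (builder.Environment.IsDevelopment())
        {
            // Local dev servers only.
            policy.WithOrigins(
                "http://localhost:3000",
                "http://localhost:5173",
                "http://localhost:4200");
        }
        else
        {
            // Placeholder production origin.
            policy.WithOrigins("https://app.mycompany.com");
        }

        policy.AllowAnyHeader().AllowAnyMethod();
    });
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;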

&lt;h2&gt;
  
  
  Step 5: Debugging CORS Errors Like a Pro
&lt;/h2&gt;

&lt;p&gt;When a CORS error appears, guessing usually wastes time. It's better to check a few things first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Check the Browser Console&lt;/strong&gt;&lt;br&gt;
Look for errors like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;No 'Access-Control-Allow-Origin' header present
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means the response didn't include the required CORS headers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Inspect the Network Tab&lt;/strong&gt;&lt;br&gt;
Open DevTools and inspect the &lt;code&gt;OPTIONS&lt;/code&gt; request.&lt;/p&gt;

&lt;p&gt;If the preflight request fails, the real request will never be sent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Compare Browser vs Postman&lt;/strong&gt;&lt;br&gt;
If Postman works but the browser fails, it's almost always a CORS issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Check Response Headers&lt;/strong&gt;&lt;br&gt;
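You can also inspect the response headers from the command line. Include an &lt;code&gt;Origin&lt;/code&gt; header, because the CORS middleware only adds its headers to requests that carry one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Origin: http://localhost:3000"&lt;/span&gt; https://api.example.com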
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look for:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Access-Control-Allow-Origin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If it's missing, the browser will refuse to expose the response to your frontend code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6: Production Best Practices
&lt;/h2&gt;

&lt;p&gt;A few simple practices make CORS configuration easier to manage in real systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Avoid Hardcoding Origins&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead, load them from configuration.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;allowedOrigins&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Configuration&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetSection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Cors:AllowedOrigins"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;]&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="n"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;();&lt;/span&gt;

&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WithOrigins&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;allowedOrigins&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyHeader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AllowAnyMethod&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configuration example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="s"&gt;"Cors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="s"&gt;"AllowedOrigins"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s"&gt;"http://localhost:3000"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"https://app.mycompany.com"&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps deployments clean across development, staging, and production.&lt;/p&gt;
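
&lt;p&gt;Put together, the two fragments above might sit in &lt;code&gt;Program.cs&lt;/code&gt; roughly like this. It is a minimal sketch that assumes a controller-based API and the &lt;code&gt;Cors:AllowedOrigins&lt;/code&gt; section shown above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

// Read the origin list from configuration; fall back to an empty list.
var allowedOrigins = builder.Configuration
    .GetSection("Cors:AllowedOrigins")
    .Get&amp;lt;string[]&amp;gt;() ?? Array.Empty&amp;lt;string&amp;gt;();

builder.Services.AddCors(options =&amp;gt;
{
    options.AddPolicy("FrontendPolicy", policy =&amp;gt;
    {
        policy.WithOrigins(allowedOrigins)
              .AllowAnyHeader()
              .AllowAnyMethod();
    });
});

var app = builder.Build();

// Enable the policy before the endpoints are mapped.
app.UseCors("FrontendPolicy");
app.MapControllers();

app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With an empty list no origin is allowed, so a missing configuration section fails closed rather than opening the API to everyone.&lt;/p&gt;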

&lt;p&gt;&lt;strong&gt;Be Aware of Reverse Proxies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In production environments your API may sit behind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nginx&lt;/li&gt;
&lt;li&gt;Cloudflare&lt;/li&gt;
&lt;li&gt;Azure Application Gateway&lt;/li&gt;
&lt;li&gt;Kubernetes ingress&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sometimes these proxies &lt;strong&gt;strip or override headers&lt;/strong&gt;, which can make CORS appear broken even when the API is configured correctly.&lt;/p&gt;

&lt;p&gt;If CORS works locally but fails in production, checking the proxy configuration is often the next step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Pitfall
&lt;/h2&gt;

&lt;p&gt;One subtle issue happens when the frontend port changes.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;p&gt;Frontend running on:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;localhost:5173
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But the API only allows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;localhost:3000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Different port = different origin.&lt;/p&gt;

&lt;p&gt;Even though it's the same machine, the browser treats it as cross-origin.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick CORS Debugging Checklist
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;Cause&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Browser CORS error&lt;/td&gt;
&lt;td&gt;Missing middleware&lt;/td&gt;
&lt;td&gt;Add &lt;code&gt;AddCors()&lt;/code&gt; and &lt;code&gt;UseCors()&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Preflight request fails&lt;/td&gt;
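&lt;td&gt;&lt;code&gt;OPTIONS&lt;/code&gt; request rejected before CORS runs&lt;/td&gt;
&lt;td&gt;Place &lt;code&gt;UseCors()&lt;/code&gt; before authentication and authorization&lt;/td&gt;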
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credentials error&lt;/td&gt;
&lt;td&gt;Using &lt;code&gt;AllowAnyOrigin()&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Specify allowed origins&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Works in Postman only&lt;/td&gt;
&lt;td&gt;Browser enforcing CORS&lt;/td&gt;
&lt;td&gt;Allow the frontend origin in the CORS policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production requests blocked&lt;/td&gt;
&lt;td&gt;Missing frontend domain&lt;/td&gt;
&lt;td&gt;Add production origin&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;CORS errors can feel confusing at first because the problem usually shows up far away from the real cause.&lt;/p&gt;

&lt;p&gt;The key thing to remember is this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Browsers enforce CORS. Your API usually doesn't.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most issues come down to a few predictable causes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Missing CORS middleware&lt;/li&gt;
&lt;li&gt;Wrong middleware order&lt;/li&gt;
&lt;li&gt;Incorrect origin configuration&lt;/li&gt;
&lt;li&gt;Preflight request failures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once you understand how browsers handle cross-origin requests and how ASP.NET Core middleware works, debugging CORS becomes much more straightforward.&lt;/p&gt;

&lt;p&gt;And if you've ever spent hours chasing a CORS error before a deployment, you're definitely not alone.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>webdev</category>
      <category>backend</category>
      <category>api</category>
    </item>
    <item>
      <title>Building AI-Powered APIs with ASP.NET Core and OpenAI (.NET 8 Guide)</title>
      <dc:creator>alinabi19</dc:creator>
      <pubDate>Thu, 12 Mar 2026 07:43:16 +0000</pubDate>
      <link>https://dev.to/alinabi19/building-ai-powered-apis-with-aspnet-core-and-openai-net-8-guide-2nam</link>
      <guid>https://dev.to/alinabi19/building-ai-powered-apis-with-aspnet-core-and-openai-net-8-guide-2nam</guid>
      <description>&lt;p&gt;AI features are slowly becoming part of normal backend work.&lt;/p&gt;

&lt;p&gt;A few years ago, most APIs were simple CRUD endpoints. They fetched data, updated records, and returned JSON. Today it is common to see requests like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Can we add a chatbot endpoint?”&lt;/li&gt;
&lt;li&gt;“Can the API summarize user feedback?”&lt;/li&gt;
&lt;li&gt;“Can we classify support tickets automatically?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At that point your backend suddenly needs to talk to an AI model.&lt;/p&gt;

&lt;p&gt;If you're already comfortable building APIs with ASP.NET Core, the first instinct is usually simple: just call the OpenAI API from an endpoint and return the result.&lt;/p&gt;

&lt;p&gt;That works for a quick prototype. But once the system starts growing, a few questions appear pretty quickly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Where should the AI logic live?&lt;/li&gt;
&lt;li&gt;Should controllers call OpenAI directly?&lt;/li&gt;
&lt;li&gt;How do we protect API keys?&lt;/li&gt;
&lt;li&gt;What happens if the AI call takes several seconds?&lt;/li&gt;
&lt;li&gt;How do we deal with rate limits or retries?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI integrations behave like any other &lt;strong&gt;external service dependency&lt;/strong&gt; in your backend architecture. Treat them that way and the system stays clean and maintainable.&lt;/p&gt;

&lt;p&gt;In this article we’ll build a simple &lt;strong&gt;AI-powered API using ASP.NET Core and OpenAI&lt;/strong&gt;, while following patterns that hold up well in real applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Expose AI Through an API
&lt;/h2&gt;

&lt;p&gt;Most production systems expose AI features through backend APIs rather than directly from the frontend.&lt;/p&gt;

&lt;p&gt;A typical setup might look like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A web app calls an endpoint to generate content&lt;/li&gt;
&lt;li&gt;A mobile app sends text to be summarized&lt;/li&gt;
&lt;li&gt;An internal tool calls an API for classification&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Centralizing AI logic inside your API gives you a few advantages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;br&gt;
Your OpenAI key stays in the backend. The client never sees it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reuse&lt;/strong&gt;&lt;br&gt;
Multiple clients (web, mobile, internal tools) can use the same AI capability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost control&lt;/strong&gt;&lt;br&gt;
Since AI calls cost money, the API layer can enforce limits and validation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consistency&lt;/strong&gt;&lt;br&gt;
Prompts, models, and safety rules live in one place.&lt;/p&gt;

&lt;p&gt;Think of the API as the &lt;strong&gt;gateway between your application and AI services&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  A Simple Architecture That Works Well
&lt;/h2&gt;

&lt;p&gt;When adding AI to a backend, separating responsibilities helps a lot.&lt;/p&gt;

&lt;p&gt;A typical flow looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client
   ↓
API Endpoint
   ↓
AI Service
   ↓
OpenAI API
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each layer does a specific job.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Responsibility&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API&lt;/td&gt;
&lt;td&gt;Handles HTTP requests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Service Layer&lt;/td&gt;
&lt;td&gt;Contains AI logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HTTP Client&lt;/td&gt;
&lt;td&gt;Calls OpenAI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Configuration&lt;/td&gt;
&lt;td&gt;Stores API keys&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A mistake I’ve seen more than once is calling OpenAI &lt;strong&gt;directly inside controllers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That approach usually leads to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;duplicated logic&lt;/li&gt;
&lt;li&gt;hard-to-test endpoints&lt;/li&gt;
&lt;li&gt;messy controllers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Moving AI logic into a service keeps things cleaner and easier to maintain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Create the ASP.NET Core API
&lt;/h2&gt;

&lt;p&gt;Start by creating a standard .NET 8 Web API project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;dotnet&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;webapi&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="n"&gt;AiApiDemo&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ASP.NET Core supports both controllers and Minimal APIs.&lt;/p&gt;

&lt;p&gt;For simple AI endpoints, &lt;strong&gt;Minimal APIs work nicely&lt;/strong&gt; because the code stays compact.&lt;/p&gt;

&lt;p&gt;Example setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WebApplication&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddEndpointsApiExplorer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddSwaggerGen&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseSwagger&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseSwaggerUI&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we have a basic API ready to host endpoints.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Store the OpenAI API Key
&lt;/h2&gt;

&lt;p&gt;Never hardcode API keys directly in code.&lt;/p&gt;

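&lt;p&gt;A simple approach is to read it from configuration. For local development that can be a placeholder entry in &lt;code&gt;appsettings.json&lt;/code&gt;; the real key should come from a source that is not committed, such as user secrets or an environment variable.&lt;/p&gt;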

&lt;p&gt;&lt;strong&gt;appsettings.json&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "OpenAI": {
    "ApiKey": "YOUR_API_KEY"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In production environments you would usually use something like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;environment variables&lt;/li&gt;
&lt;li&gt;Azure Key Vault&lt;/li&gt;
&lt;li&gt;AWS Secrets Manager&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is simple: &lt;strong&gt;keep secrets outside source control&lt;/strong&gt;.&lt;/p&gt;
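
&lt;p&gt;As a rough sketch of how that looks in &lt;code&gt;Program.cs&lt;/code&gt;: the default configuration providers already layer environment variables over &lt;code&gt;appsettings.json&lt;/code&gt;, so an environment variable named &lt;code&gt;OpenAI__ApiKey&lt;/code&gt; (double underscore) supplies the &lt;code&gt;OpenAI:ApiKey&lt;/code&gt; value without extra code. The startup check and its error message are illustrative additions, not a required part of the setup.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;var builder = WebApplication.CreateBuilder(args);

// Resolved from appsettings.json, user secrets (in Development),
// or the OpenAI__ApiKey environment variable; later sources override earlier ones.
var apiKey = builder.Configuration["OpenAI:ApiKey"];

// Illustrative guard: fail at startup instead of on the first AI request.
if (string.IsNullOrWhiteSpace(apiKey) || apiKey == "YOUR_API_KEY")
{
    throw new InvalidOperationException(
        "OpenAI:ApiKey is not configured. Set it via user secrets or the OpenAI__ApiKey environment variable.");
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;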

&lt;h2&gt;
  
  
  Step 3: Register an HTTP Client
&lt;/h2&gt;

&lt;p&gt;ASP.NET Core includes &lt;strong&gt;HttpClientFactory&lt;/strong&gt;, which is the recommended way to make HTTP calls.&lt;/p&gt;

&lt;p&gt;It helps avoid common issues like socket exhaustion and centralizes configuration.&lt;/p&gt;

&lt;p&gt;Register a typed client in &lt;code&gt;Program.cs&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddHttpClient&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IAiService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OpenAiService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BaseAddress&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"https://api.openai.com/"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Timeout&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromSeconds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;30&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



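&lt;p&gt;Typed clients also work well with dependency injection: the container creates &lt;code&gt;OpenAiService&lt;/code&gt; and passes it an &lt;code&gt;HttpClient&lt;/code&gt; that already has the base address and timeout configured above.&lt;/p&gt;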

&lt;h2&gt;
  
  
  Step 4: Create the AI Service
&lt;/h2&gt;

&lt;p&gt;Instead of calling OpenAI from endpoints, create a dedicated service.&lt;/p&gt;

&lt;p&gt;This keeps the API layer thin and makes the integration easier to test.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service Interface&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;IAiService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GenerateResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Service Implementation&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OpenAiService&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IAiService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IConfiguration&lt;/span&gt; &lt;span class="n"&gt;_config&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;ILogger&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;OpenAiService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;_logger&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;OpenAiService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;IConfiguration&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;ILogger&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;OpenAiService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_httpClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;_config&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;_logger&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GenerateResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"OpenAI:ApiKey"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

        &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultRequestHeaders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Authorization&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Net&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AuthenticationHeaderValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Bearer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"gpt-4o-mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;PostAsJsonAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"v1/responses"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsSuccessStatusCode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ReadAsStringAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="n"&gt;_logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"OpenAI request failed: {Error}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ApplicationException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;$"OpenAI request failed: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadFromJsonAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;OpenAiResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Null-safe: Output or Content may be missing or empty in the response.&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Output&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;FirstOrDefault&lt;/span&gt;&lt;span class="p"&gt;()?.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;FirstOrDefault&lt;/span&gt;&lt;span class="p"&gt;()?.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
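
&lt;p&gt;The service above sets the &lt;code&gt;Authorization&lt;/code&gt; header on every call. One alternative, sketched here as a variation on the Step 3 registration, is to set it once when the client is configured, so the service no longer needs to read the key itself. This relies on the &lt;code&gt;AddHttpClient&lt;/code&gt; overload whose callback also receives the &lt;code&gt;IServiceProvider&lt;/code&gt;; treat it as one option rather than the required approach.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Net.Http.Headers;

builder.Services.AddHttpClient&amp;lt;IAiService, OpenAiService&amp;gt;((services, client) =&amp;gt;
{
    var config = services.GetRequiredService&amp;lt;IConfiguration&amp;gt;();

    client.BaseAddress = new Uri("https://api.openai.com/");
    client.Timeout = TimeSpan.FromSeconds(30);

    // Set once per configured client instead of inside GenerateResponseAsync.
    client.DefaultRequestHeaders.Authorization =
        new AuthenticationHeaderValue("Bearer", config["OpenAI:ApiKey"]);
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;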



&lt;p&gt;&lt;strong&gt;Response Model&lt;/strong&gt;&lt;/p&gt;

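&lt;p&gt;Using strongly typed models is safer than parsing dynamic JSON. The classes below model only the fields this example reads, and &lt;code&gt;ReadFromJsonAsync&lt;/code&gt; matches property names case-insensitively by default, so &lt;code&gt;Output&lt;/code&gt; binds to the &lt;code&gt;output&lt;/code&gt; field in the response.&lt;br&gt;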
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OpenAiResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Output&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Output&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Output&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Content&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Content&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 5: Create an AI Endpoint
&lt;/h2&gt;

&lt;p&gt;Now we expose an endpoint that clients can call.&lt;/p&gt;

&lt;p&gt;Request model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatRequest&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Prompt&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Minimal API endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapPost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/api/ai/chat"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ChatRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IAiService&lt;/span&gt; &lt;span class="n"&gt;aiService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;IsNullOrWhiteSpace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;BadRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Prompt cannot be empty."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;BadRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Prompt too large."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;aiService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GenerateResponseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;cancellationToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clients can now send prompts and receive AI-generated responses.&lt;/p&gt;
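
&lt;p&gt;To try the endpoint from another .NET program, a minimal console client could look like the sketch below. The &lt;code&gt;http://localhost:5000&lt;/code&gt; address and the sample prompt are assumptions; use whatever URL &lt;code&gt;dotnet run&lt;/code&gt; prints for your project.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Net.Http.Json;
using System.Text.Json;

// Assumed local address; check the URL printed by `dotnet run`.
using var client = new HttpClient { BaseAddress = new Uri("http://localhost:5000") };

var response = await client.PostAsJsonAsync(
    "/api/ai/chat",
    new { prompt = "Explain dependency injection in one sentence." });

// Throws for 400 (empty or oversized prompt) and other non-success codes.
response.EnsureSuccessStatusCode();

// The endpoint returns a JSON object of the form { "result": "..." }.
var body = await response.Content.ReadFromJsonAsync&amp;lt;JsonElement&amp;gt;();
Console.WriteLine(body.GetProperty("result").GetString());
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;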

&lt;h2&gt;
  
  
  Handling Failures and Rate Limits
&lt;/h2&gt;

&lt;p&gt;AI services are external dependencies, so failures are normal.&lt;/p&gt;

&lt;p&gt;Some common issues include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;network timeouts&lt;/li&gt;
&lt;li&gt;rate limits (429 responses)&lt;/li&gt;
&lt;li&gt;temporary service errors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In production systems it’s usually worth adding retry policies.&lt;/p&gt;

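&lt;p&gt;Libraries like &lt;strong&gt;Polly&lt;/strong&gt; integrate well with &lt;code&gt;HttpClientFactory&lt;/code&gt;; the &lt;code&gt;Microsoft.Extensions.Http.Polly&lt;/code&gt; package provides the extension methods used below.&lt;/p&gt;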

&lt;p&gt;Example retry setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddHttpClient&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IAiService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OpenAiService&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddTransientHttpErrorPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WaitAndRetryAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retry&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
            &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromSeconds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Pow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retry&lt;/span&gt;&lt;span class="p"&gt;))));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



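&lt;p&gt;This helps smooth over temporary failures such as network errors, 5xx responses, and 408 timeouts, which are the cases &lt;code&gt;AddTransientHttpErrorPolicy&lt;/code&gt; handles. A 429 response is not in that set, so retries for rate limits have to be added explicitly.&lt;/p&gt;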
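
&lt;p&gt;To also retry on 429, the policy builder can be extended with &lt;code&gt;OrResult&lt;/code&gt;. The sketch below chains the policy onto the Step 3 registration and assumes the &lt;code&gt;Microsoft.Extensions.Http.Polly&lt;/code&gt; package is installed; the retry count and delays are example values.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Net;
using Polly;

builder.Services.AddHttpClient&amp;lt;IAiService, OpenAiService&amp;gt;(client =&amp;gt;
{
    client.BaseAddress = new Uri("https://api.openai.com/");
    client.Timeout = TimeSpan.FromSeconds(30);
})
.AddTransientHttpErrorPolicy(policy =&amp;gt; policy
    // Extend the default transient set (network errors, 5xx, 408) with 429.
    .OrResult(response =&amp;gt; response.StatusCode == HttpStatusCode.TooManyRequests)
    // Up to three retries, waiting 2s, 4s, then 8s.
    .WaitAndRetryAsync(3, attempt =&amp;gt; TimeSpan.FromSeconds(Math.Pow(2, attempt))));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One thing to keep in mind: &lt;code&gt;HttpClient.Timeout&lt;/code&gt; applies to the whole call, including retries and the waits between them, so the 14 seconds of back-off here count against the 30-second limit.&lt;/p&gt;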

&lt;h2&gt;
  
  
  A Few Performance Considerations
&lt;/h2&gt;

&lt;p&gt;AI requests are typically slower than database queries.&lt;/p&gt;

&lt;p&gt;A few small changes can improve responsiveness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use async calls&lt;/strong&gt;&lt;br&gt;
Blocking threads during AI calls will hurt scalability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Validate prompt size&lt;/strong&gt;&lt;br&gt;
Large prompts increase both latency and cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cache repeated responses&lt;/strong&gt;&lt;br&gt;
If users frequently ask the same question, caching results can reduce API calls.&lt;/p&gt;
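
&lt;p&gt;One way to sketch this is a small decorator around &lt;code&gt;IAiService&lt;/code&gt; that uses &lt;code&gt;IMemoryCache&lt;/code&gt;. The &lt;code&gt;CachedAiService&lt;/code&gt; name, the key format, and the ten-minute lifetime are all hypothetical choices for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.Extensions.Caching.Memory;

// Hypothetical decorator: wraps any IAiService and caches answers by prompt.
public class CachedAiService : IAiService
{
    private readonly IAiService _inner;
    private readonly IMemoryCache _cache;

    public CachedAiService(IAiService inner, IMemoryCache cache)
    {
        _inner = inner;
        _cache = cache;
    }

    public async Task&amp;lt;string&amp;gt; GenerateResponseAsync(
        string prompt,
        CancellationToken cancellationToken)
    {
        // Trim so prompts differing only in surrounding whitespace share an entry.
        var key = "ai:" + prompt.Trim();

        var cached = await _cache.GetOrCreateAsync(key, entry =&amp;gt;
        {
            // Assumed lifetime; tune to how quickly answers go stale.
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            return _inner.GenerateResponseAsync(prompt, cancellationToken);
        });

        return cached ?? "";
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This needs &lt;code&gt;builder.Services.AddMemoryCache()&lt;/code&gt;, and how the decorator is wired up depends on how &lt;code&gt;IAiService&lt;/code&gt; is registered. Caching only makes sense where an identical prompt should produce the same answer for every user.&lt;/p&gt;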

&lt;p&gt;&lt;strong&gt;Consider streaming responses&lt;/strong&gt;&lt;br&gt;
Streaming works well for chat-style applications where users expect gradual output.&lt;/p&gt;

&lt;h2&gt;
  
  
  Securing AI Endpoints
&lt;/h2&gt;

&lt;p&gt;AI endpoints can easily become expensive if left unprotected.&lt;/p&gt;

&lt;p&gt;A few safeguards help prevent abuse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication&lt;/strong&gt;&lt;br&gt;
Use JWT or API keys to restrict access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rate limiting&lt;/strong&gt;&lt;br&gt;
Limit how frequently a client can call the AI endpoint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt validation&lt;/strong&gt;&lt;br&gt;
Always validate user input before sending it to a model.&lt;/p&gt;
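
&lt;p&gt;For rate limiting, ASP.NET Core ships a built-in middleware from .NET 7 onward. The following is a minimal sketch, assuming a policy named &lt;code&gt;"ai"&lt;/code&gt; (a hypothetical name), a limit of 10 requests per minute, and the client IP as the partition key.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =&amp;gt;
{
    // The default rejection status is 503; 429 is clearer for clients.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddPolicy("ai", httpContext =&amp;gt;
        RateLimitPartition.GetFixedWindowLimiter(
            // One counter per client IP so a single caller cannot use up the shared quota.
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ =&amp;gt; new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10,
                Window = TimeSpan.FromMinutes(1)
            }));
});

var app = builder.Build();

app.UseRateLimiter();

app.MapPost("/api/ai/chat", async (
        ChatRequest request,
        IAiService aiService,
        CancellationToken cancellationToken) =&amp;gt;
    Results.Ok(new
    {
        result = await aiService.GenerateResponseAsync(request.Prompt, cancellationToken)
    }))
   .RequireRateLimiting("ai");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Behind a reverse proxy the remote IP may be the proxy's address unless forwarded headers are configured, so an authenticated user or API key identifier is often a better partition key.&lt;/p&gt;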

&lt;h2&gt;
  
  
  One Small Tip That Reduces AI Costs
&lt;/h2&gt;

&lt;p&gt;AI pricing usually depends on the number of tokens processed.&lt;/p&gt;

&lt;p&gt;Better prompts often produce &lt;strong&gt;better responses with fewer tokens&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of sending large context blocks, try using structured prompts with clear instructions.&lt;/p&gt;

&lt;p&gt;It improves both response quality and cost efficiency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons from Building AI APIs
&lt;/h2&gt;

&lt;p&gt;If you're planning to add AI features to your backend, a few patterns make life easier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep AI logic inside a dedicated service layer&lt;/li&gt;
&lt;li&gt;Treat AI like any other external dependency&lt;/li&gt;
&lt;li&gt;Add retries, validation, and logging early&lt;/li&gt;
&lt;li&gt;Protect API keys and enforce usage limits&lt;/li&gt;
&lt;li&gt;Monitor token usage to avoid unexpected costs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once the architecture is set up properly, adding new AI capabilities becomes much simpler.&lt;/p&gt;

&lt;p&gt;Chatbots, summarization, and classification all become just &lt;strong&gt;another API endpoint&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>dotnet</category>
      <category>webdev</category>
      <category>openai</category>
    </item>
    <item>
      <title>ASP.NET Core Request Pipeline Explained: What Happens When an API Receives a Request</title>
      <dc:creator>alinabi19</dc:creator>
      <pubDate>Wed, 11 Mar 2026 18:49:24 +0000</pubDate>
      <link>https://dev.to/alinabi19/aspnet-core-request-pipeline-explained-what-happens-when-an-api-receives-a-request-7p1</link>
      <guid>https://dev.to/alinabi19/aspnet-core-request-pipeline-explained-what-happens-when-an-api-receives-a-request-7p1</guid>
      <description>&lt;p&gt;You send a request to an API endpoint.&lt;/p&gt;

&lt;p&gt;Milliseconds later, a response comes back.&lt;/p&gt;

&lt;p&gt;Most of the time, we don’t think much about what happens in between. We write controllers, configure middleware, run the application, and everything works.&lt;/p&gt;

&lt;p&gt;Until it doesn’t.&lt;/p&gt;

&lt;p&gt;Maybe authentication suddenly stops working.&lt;br&gt;
Maybe a middleware behaves differently than expected.&lt;br&gt;
Maybe performance drops under load.&lt;br&gt;
Or routing starts sending requests to the wrong endpoint.&lt;/p&gt;

&lt;p&gt;When that happens, the question becomes unavoidable:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What actually happens inside ASP.NET Core when a request hits your API?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Understanding the request pipeline is what turns ASP.NET Core from a black box into something you can actually debug and optimize.&lt;/p&gt;

&lt;p&gt;In this article, we'll walk through the lifecycle of a request in ASP.NET Core—from the moment it reaches your server to the moment the response is sent back.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Big Picture: The ASP.NET Core Request Flow
&lt;/h2&gt;

&lt;p&gt;If you trace a request from the network all the way to your controller or endpoint, it roughly goes through this path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client
   ↓
Reverse Proxy (optional)
   ↓
Kestrel Web Server
   ↓
ASP.NET Core Hosting Layer
   ↓
Middleware Pipeline
   ↓
Endpoint Routing
   ↓
Endpoint Execution (Controller / Minimal API)
   ↓
Middleware (Response Flow)
   ↓
Kestrel
   ↓
Client Response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each stage gets a chance to process the request before it reaches your application logic.&lt;/p&gt;

&lt;p&gt;Once you understand this flow, debugging strange behavior becomes much easier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: The Request Reaches Kestrel
&lt;/h2&gt;

&lt;p&gt;The first component inside your application that receives the request is &lt;strong&gt;Kestrel&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Kestrel is the default high-performance web server used by ASP.NET Core. Its job is to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Listen for incoming HTTP requests&lt;/li&gt;
&lt;li&gt;Parse HTTP messages&lt;/li&gt;
&lt;li&gt;Forward the request into the ASP.NET Core application pipeline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Kestrel is designed for high throughput and low latency. It uses asynchronous I/O and efficient networking primitives to handle thousands of concurrent connections.&lt;/p&gt;

&lt;p&gt;In production environments, Kestrel usually sits behind a reverse proxy such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nginx&lt;/li&gt;
&lt;li&gt;Apache&lt;/li&gt;
&lt;li&gt;IIS&lt;/li&gt;
&lt;li&gt;Azure App Service infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The reverse proxy handles things like TLS termination, load balancing, and security filtering, while &lt;strong&gt;Kestrel still processes the request inside the application.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once Kestrel receives the request, it passes it into the ASP.NET Core pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: The ASP.NET Core Hosting Layer
&lt;/h2&gt;

&lt;p&gt;Before the request reaches middleware, ASP.NET Core’s hosting layer has already done some important work.&lt;/p&gt;

&lt;p&gt;When the application starts, the hosting layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Builds the &lt;strong&gt;dependency injection container&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configures &lt;strong&gt;logging&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Loads &lt;strong&gt;configuration&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Constructs the &lt;strong&gt;middleware pipeline&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This setup happens during application startup in &lt;code&gt;Program.cs&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;By the time a request arrives, the middleware pipeline has already been assembled and is ready to process incoming requests.&lt;/p&gt;
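
&lt;p&gt;The startup steps above map roughly onto a minimal &lt;code&gt;Program.cs&lt;/code&gt; like this. It is a simplified sketch; the pass-through middleware and the &lt;code&gt;/&lt;/code&gt; endpoint are placeholders.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;// Creates the builder: configuration sources (appsettings.json, environment
// variables, ...) are loaded and default logging providers are set up.
var builder = WebApplication.CreateBuilder(args);

// Service registrations are collected for the dependency injection container.
builder.Services.AddEndpointsApiExplorer();

// Build() finalizes the container and creates the application.
var app = builder.Build();

// Middleware and endpoints are added to the pipeline in registration order.
app.Use(async (context, next) =&amp;gt; await next()); // placeholder middleware
app.MapGet("/", () =&amp;gt; "Hello");

// Run() starts the server (Kestrel by default) and begins accepting requests.
app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;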

&lt;h2&gt;
  
  
  Step 3: The Request Enters the Middleware Pipeline
&lt;/h2&gt;

&lt;p&gt;Most of the interesting work in ASP.NET Core happens inside the &lt;strong&gt;middleware pipeline&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Middleware components are small units of code that can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inspect the request&lt;/li&gt;
&lt;li&gt;Modify the request&lt;/li&gt;
&lt;li&gt;Stop the request from continuing&lt;/li&gt;
&lt;li&gt;Pass the request to the next component&lt;/li&gt;
&lt;li&gt;Modify the response before it leaves&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Middleware is configured in &lt;code&gt;Program.cs&lt;/code&gt; and runs in the order it is registered.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestServices&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ILoggerFactory&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"RequestLogger"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogInformation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Request started: {Path}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogInformation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Response finished with status {StatusCode}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s what happens during execution:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The request enters the middleware&lt;/li&gt;
&lt;li&gt;Code before &lt;code&gt;await next()&lt;/code&gt; runs&lt;/li&gt;
&lt;li&gt;The request moves to the next middleware&lt;/li&gt;
&lt;li&gt;Eventually an endpoint executes&lt;/li&gt;
&lt;li&gt;The response travels back through middleware&lt;/li&gt;
&lt;li&gt;Code after &lt;code&gt;await next()&lt;/code&gt; runs&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This creates a &lt;strong&gt;two-way pipeline&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request → Middleware → Endpoint
Response ← Middleware ← Endpoint
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing that surprises many developers when debugging middleware is that responses travel back through the pipeline in reverse order.&lt;/p&gt;
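
&lt;p&gt;To see why the order reverses, it can help to model the pipeline as nested delegates. The following is a framework-free sketch, not ASP.NET Core’s actual implementation: &lt;code&gt;Handler&lt;/code&gt;, &lt;code&gt;Wrap&lt;/code&gt;, and &lt;code&gt;PipelineSketch&lt;/code&gt; are made-up names, and a plain list stands in for &lt;code&gt;HttpContext&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var log = await PipelineSketch.RunAsync();
Console.WriteLine(string.Join(" | ", log));
// A: before | B: before | endpoint | B: after | A: after

// Simplified stand-in for RequestDelegate: takes a trace list instead of HttpContext.
delegate Task Handler(List&amp;lt;string&amp;gt; trace);

static class PipelineSketch
{
    // Wraps 'next' so code runs before and after it, similar in spirit to app.Use(...).
    static Handler Wrap(string name, Handler next) =&amp;gt; async trace =&amp;gt;
    {
        trace.Add($"{name}: before");
        await next(trace);
        trace.Add($"{name}: after");
    };

    public static async Task&amp;lt;List&amp;lt;string&amp;gt;&amp;gt; RunAsync()
    {
        Handler endpoint = trace =&amp;gt;
        {
            trace.Add("endpoint");
            return Task.CompletedTask;
        };

        // Built from the inside out: the first-registered middleware ("A") ends up outermost.
        Handler pipeline = Wrap("A", Wrap("B", endpoint));

        var result = new List&amp;lt;string&amp;gt;();
        await pipeline(result);
        return result;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because each middleware wraps the next one, the "after" code of the outermost middleware is always the last thing to run.&lt;/p&gt;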

&lt;h2&gt;
  
  
  Middleware Can Short-Circuit the Pipeline
&lt;/h2&gt;

&lt;p&gt;Middleware can also &lt;strong&gt;stop the pipeline entirely.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Identity&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;IsAuthenticated&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatusCode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;StatusCodes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Status401Unauthorized&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, the request never reaches later middleware or the endpoint.&lt;/p&gt;

&lt;p&gt;This behavior is commonly used for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;authentication checks&lt;/li&gt;
&lt;li&gt;rate limiting&lt;/li&gt;
&lt;li&gt;request filtering&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 4: Built-in Middleware Components
&lt;/h2&gt;

&lt;p&gt;ASP.NET Core provides several built-in middleware components that most applications rely on.&lt;/p&gt;

&lt;p&gt;Common examples include:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Routing Middleware&lt;/strong&gt;&lt;br&gt;
Determines which endpoint matches the request.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseRouting&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Authentication Middleware&lt;/strong&gt;&lt;br&gt;
Validates the user identity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthentication&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Authorization Middleware&lt;/strong&gt;&lt;br&gt;
Checks whether the authenticated user has permission.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthorization&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Exception Handling Middleware&lt;/strong&gt;&lt;br&gt;
Handles unhandled exceptions globally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseExceptionHandler&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Other Common Production Middleware&lt;/strong&gt;&lt;br&gt;
Real-world APIs often include additional middleware such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CORS (&lt;code&gt;UseCors&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Response compression (&lt;code&gt;UseResponseCompression&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;HTTPS redirection (&lt;code&gt;UseHttpsRedirection&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Rate limiting (&lt;code&gt;UseRateLimiter&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each middleware adds a delegate to the request pipeline. Individually they’re lightweight, but extremely long middleware chains can introduce small overhead in very high-throughput systems.&lt;/p&gt;
&lt;h2&gt;
  
  
  Middleware Order Matters
&lt;/h2&gt;

&lt;p&gt;One of the most common sources of bugs in ASP.NET Core applications is &lt;strong&gt;incorrect middleware ordering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Consider this configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthorization&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthentication&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



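&lt;p&gt;This breaks authorization: the authorization middleware runs before authentication has established the user identity, so it typically evaluates an anonymous user even when the request carries valid credentials.&lt;/p&gt;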

&lt;p&gt;The correct order is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthentication&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthorization&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When debugging strange authentication behavior, middleware order is often the first thing worth checking.&lt;/p&gt;
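
&lt;p&gt;For reference, here is a sketch of one commonly used ordering in a minimal &lt;code&gt;Program.cs&lt;/code&gt;. It assumes a web project with the default implicit usings; the &lt;code&gt;/error&lt;/code&gt; route is a hypothetical choice, and a real application would also register an authentication scheme (JWT bearer, cookies, and so on).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;var builder = WebApplication.CreateBuilder(args);

// Services required by the authentication and authorization middleware.
builder.Services.AddAuthentication(); // no scheme configured in this sketch
builder.Services.AddAuthorization();

var app = builder.Build();

app.UseExceptionHandler("/error"); // outermost, so it can catch exceptions from everything below
app.UseHttpsRedirection();
app.UseRouting();                  // selects the endpoint
app.UseAuthentication();           // establishes context.User
app.UseAuthorization();            // reads context.User, so it must come after authentication

app.MapGet("/error", () =&amp;gt; Results.Problem("An unexpected error occurred."));
app.MapGet("/", () =&amp;gt; "Hello");

app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The exact list varies per application, but the relative order of exception handling, routing, authentication, and authorization is usually worth keeping.&lt;/p&gt;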

&lt;h2&gt;
  
  
  Step 5: Endpoint Routing
&lt;/h2&gt;

&lt;p&gt;As the request moves through the pipeline, ASP.NET Core needs to determine &lt;strong&gt;which endpoint should handle the request.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is handled by &lt;strong&gt;endpoint routing.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Routing examines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTP method (GET, POST, etc.)&lt;/li&gt;
&lt;li&gt;request path&lt;/li&gt;
&lt;li&gt;route parameters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example Minimal API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/products/{id}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;$"Product &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the request is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;GET&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Routing selects this endpoint and prepares it for execution.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;UseRouting()&lt;/code&gt; identifies the matching endpoint, while the endpoint delegate itself executes later in the pipeline.&lt;/p&gt;

&lt;p&gt;ASP.NET Core’s routing system is highly optimized and capable of efficiently matching large numbers of routes.&lt;/p&gt;
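
&lt;p&gt;The split between selecting and executing an endpoint can be observed from a middleware placed after &lt;code&gt;UseRouting()&lt;/code&gt;. The sketch below assumes a web project with the default implicit usings and simply prints the selected endpoint’s display name; the console output is only for illustration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.UseRouting(); // endpoint selection happens here

app.Use(async (context, next) =&amp;gt;
{
    // Routing has already picked an endpoint (null if nothing matched),
    // but the endpoint's handler has not run yet.
    var endpoint = context.GetEndpoint();
    var name = endpoint?.DisplayName ?? "(none)";
    Console.WriteLine($"Selected endpoint: {name}");

    await next();
});

app.MapGet("/products/{id}", (int id) =&amp;gt; Results.Ok($"Product {id}"));

app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is also how middleware such as authorization can make decisions based on endpoint metadata before the handler executes.&lt;/p&gt;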

&lt;h2&gt;
  
  
  Step 6: Endpoint Execution
&lt;/h2&gt;

&lt;p&gt;Once routing selects the correct endpoint, ASP.NET Core executes the endpoint logic.&lt;/p&gt;

&lt;p&gt;This could be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a controller action&lt;/li&gt;
&lt;li&gt;a minimal API handler&lt;/li&gt;
&lt;li&gt;a Razor page&lt;/li&gt;
&lt;li&gt;a gRPC service&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For controller-based APIs, ASP.NET Core performs several additional steps automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Binding
&lt;/h2&gt;

&lt;p&gt;ASP.NET Core maps incoming request data into method parameters.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;HttpPost&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;IActionResult&lt;/span&gt; &lt;span class="nf"&gt;CreateOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OrderDto&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Data can be bound from multiple sources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;request body&lt;/li&gt;
&lt;li&gt;route values&lt;/li&gt;
&lt;li&gt;query parameters&lt;/li&gt;
&lt;li&gt;headers&lt;/li&gt;
&lt;li&gt;form data&lt;/li&gt;
&lt;/ul&gt;
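
&lt;p&gt;To make the sources concrete, here is a sketch that binds one parameter from each of four sources. It uses a minimal API rather than a controller to stay self-contained; the route, the &lt;code&gt;expedite&lt;/code&gt; query parameter, and the &lt;code&gt;X-Request-Id&lt;/code&gt; header are hypothetical names, and form data is left out.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.AspNetCore.Mvc; // [FromRoute], [FromQuery], [FromHeader], [FromBody]

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Hypothetical route, query parameter, and header name, chosen only for illustration.
app.MapPost("/customers/{customerId}/orders", (
    [FromRoute] int customerId,                             // route value
    [FromQuery] bool? expedite,                             // query string: ?expedite=true
    [FromHeader(Name = "X-Request-Id")] string? requestId,  // request header
    [FromBody] OrderDto order) =&amp;gt;                        // JSON request body
{
    // Echo the bound values back so each source is visible in the response.
    return Results.Ok(new { customerId, expedite, requestId, order.CustomerEmail });
});

app.Run();

public class OrderDto
{
    public string CustomerEmail { get; set; } = "";
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The attributes are often optional because the framework infers the source, but spelling them out makes the binding explicit while debugging.&lt;/p&gt;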

&lt;h2&gt;
  
  
  Validation
&lt;/h2&gt;

&lt;p&gt;If validation attributes are used, ASP.NET Core validates the model automatically.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OrderDto&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Required&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;CustomerEmail&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



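&lt;p&gt;In controllers marked with &lt;code&gt;[ApiController]&lt;/code&gt;, an invalid model automatically produces a &lt;strong&gt;400 Bad Request response&lt;/strong&gt;; without that attribute, the action has to check &lt;code&gt;ModelState.IsValid&lt;/code&gt; itself.&lt;/p&gt;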
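
&lt;p&gt;The attribute check itself can be reproduced outside of a request with the &lt;code&gt;Validator&lt;/code&gt; class from &lt;code&gt;System.ComponentModel.DataAnnotations&lt;/code&gt;. This is only a sketch of what the attributes express; the framework runs its own validation pipeline, and the empty email value is made up for the example.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

var order = new OrderDto { CustomerEmail = "" }; // an empty string fails [Required]
var errors = new List&amp;lt;ValidationResult&amp;gt;();

bool isValid = Validator.TryValidateObject(
    order, new ValidationContext(order), errors, validateAllProperties: true);

Console.WriteLine(isValid); // False
foreach (var error in errors)
{
    Console.WriteLine(error.ErrorMessage);
}

public class OrderDto
{
    [Required]
    public string CustomerEmail { get; set; } = "";
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;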

&lt;h2&gt;
  
  
  Business Logic
&lt;/h2&gt;

&lt;p&gt;This is where your application code runs.&lt;/p&gt;

&lt;p&gt;Typical tasks include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;database queries&lt;/li&gt;
&lt;li&gt;calling services&lt;/li&gt;
&lt;li&gt;performing calculations&lt;/li&gt;
&lt;li&gt;invoking external APIs&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Returning a Result
&lt;/h2&gt;

&lt;p&gt;The endpoint returns a result such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ASP.NET Core then converts this result into an HTTP response.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;objects → JSON&lt;/li&gt;
&lt;li&gt;status codes → HTTP response codes&lt;/li&gt;
&lt;li&gt;headers → HTTP headers&lt;/li&gt;
&lt;/ul&gt;
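
&lt;p&gt;The object-to-JSON step can be approximated with &lt;code&gt;System.Text.Json&lt;/code&gt; and its web defaults, which use camelCase property names. The &lt;code&gt;OrderSummary&lt;/code&gt; type below is a hypothetical stand-in, and this is a sketch of the serialization step only, not of the full result-execution pipeline.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Text.Json;

var order = new OrderSummary { Id = 42, CustomerName = "Ada" };

// JsonSerializerDefaults.Web applies camelCase naming to property names.
var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);
string json = JsonSerializer.Serialize(order, options);

Console.WriteLine(json); // {"id":42,"customerName":"Ada"}

public class OrderSummary
{
    public int Id { get; set; }
    public string CustomerName { get; set; } = "";
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;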

&lt;h2&gt;
  
  
  Step 7: The Response Travels Back Through Middleware
&lt;/h2&gt;

&lt;p&gt;Once the endpoint finishes execution, the response begins its return journey.&lt;/p&gt;

&lt;p&gt;The response flows &lt;strong&gt;back through the middleware pipeline in reverse order.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This allows middleware to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;modify response headers&lt;/li&gt;
&lt;li&gt;compress responses&lt;/li&gt;
&lt;li&gt;log execution time&lt;/li&gt;
&lt;li&gt;transform output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, the response reaches &lt;strong&gt;Kestrel&lt;/strong&gt;, which sends it back to the client.&lt;/p&gt;
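
&lt;p&gt;One caveat when modifying headers on the way back: once the response has started, headers are read-only, so setting them after &lt;code&gt;await next()&lt;/code&gt; can throw. A common workaround is to register the change with &lt;code&gt;Response.OnStarting&lt;/code&gt;. The sketch below assumes a web project with the default implicit usings; &lt;code&gt;X-Elapsed-Ms&lt;/code&gt; is a made-up header name.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Diagnostics;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.Use(async (context, next) =&amp;gt;
{
    var stopwatch = Stopwatch.StartNew();

    // Runs just before the response headers are sent, while they are still writable.
    context.Response.OnStarting(() =&amp;gt;
    {
        context.Response.Headers["X-Elapsed-Ms"] =
            stopwatch.ElapsedMilliseconds.ToString();
        return Task.CompletedTask;
    });

    await next();
});

app.MapGet("/", () =&amp;gt; "Hello");

app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;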

&lt;h2&gt;
  
  
  A Simple Performance Debugging Trick
&lt;/h2&gt;

&lt;p&gt;When diagnosing slow requests, a small timing middleware can quickly identify bottlenecks.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;stopwatch&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Stopwatch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;StartNew&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="n"&gt;stopwatch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestServices&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetRequiredService&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ILoggerFactory&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Performance"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogInformation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Request completed in {Elapsed} ms"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;stopwatch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ElapsedMilliseconds&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



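&lt;p&gt;Registered early in the pipeline, this middleware times everything that runs after it, so slow requests show up in the logs right away. Moving it later in the pipeline helps narrow down which part is slow.&lt;/p&gt;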

&lt;h2&gt;
  
  
  Visual Summary of the Request Flow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request Flow Summary

Client
  ↓
Kestrel
  ↓
Middleware Pipeline
  ↓
Routing
  ↓
Endpoint Execution
  ↓
Middleware (response)
  ↓
Client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;ASP.NET Core processes requests through a &lt;strong&gt;middleware pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Kestrel is the &lt;strong&gt;web server that receives HTTP requests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Middleware can &lt;strong&gt;inspect, modify, or terminate requests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Middleware order &lt;strong&gt;directly affects application behavior&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Endpoint routing determines &lt;strong&gt;which API logic executes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Responses travel back through the &lt;strong&gt;same middleware pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once you understand this flow, ASP.NET Core stops feeling like a black box. Debugging becomes easier, middleware behavior makes more sense, and performance issues are much easier to track down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Have you ever spent hours debugging an ASP.NET Core API only to realize the issue was caused by middleware order?&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>backend</category>
      <category>dotnet</category>
    </item>
    <item>
      <title>10 ASP.NET Core API Performance Mistakes That Hurt Scalability</title>
      <dc:creator>alinabi19</dc:creator>
      <pubDate>Wed, 11 Mar 2026 09:53:21 +0000</pubDate>
      <link>https://dev.to/alinabi19/10-aspnet-core-api-performance-mistakes-that-hurt-scalability-5h04</link>
      <guid>https://dev.to/alinabi19/10-aspnet-core-api-performance-mistakes-that-hurt-scalability-5h04</guid>
      <description>&lt;p&gt;ASP.NET Core is one of the fastest web frameworks available today.&lt;/p&gt;

&lt;p&gt;Benchmarks regularly show it outperforming many other platforms. Yet in real production systems, I’ve seen ASP.NET Core APIs struggle under load — even when the infrastructure was solid.&lt;/p&gt;

&lt;p&gt;Response times that were &lt;strong&gt;50–100ms during development suddenly climb to 800ms or more&lt;/strong&gt; in production. CPU usage spikes, database calls slow down, and suddenly everyone is asking the same question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“Why is our API so slow?”&lt;/strong&gt;&lt;br&gt;
In most cases, the problem isn’t ASP.NET Core.&lt;/p&gt;

&lt;p&gt;It’s small design decisions made during development that quietly add overhead over time.&lt;/p&gt;

&lt;p&gt;Things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Returning too much data&lt;/li&gt;
&lt;li&gt;Inefficient database queries&lt;/li&gt;
&lt;li&gt;Blocking threads&lt;/li&gt;
&lt;li&gt;Missing caching&lt;/li&gt;
&lt;li&gt;Large payloads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Individually these issues may seem harmless. Combined, they can dramatically reduce API performance and scalability.&lt;/p&gt;

&lt;p&gt;After working on several production APIs, I’ve noticed the same performance issues appear again and again.&lt;/p&gt;

&lt;p&gt;Let’s walk through &lt;strong&gt;10 of the most common mistakes developers make when building ASP.NET Core APIs - and how to fix them.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Why API Performance Matters
&lt;/h2&gt;

&lt;p&gt;API performance affects far more than just response time.&lt;/p&gt;

&lt;p&gt;Slow APIs lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Higher CPU and memory consumption&lt;/li&gt;
&lt;li&gt;Increased cloud infrastructure costs&lt;/li&gt;
&lt;li&gt;Poor scalability under load&lt;/li&gt;
&lt;li&gt;Frustrated users waiting for responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A well-designed ASP.NET Core API can handle thousands of requests per second with minimal infrastructure.&lt;/p&gt;

&lt;p&gt;But that only happens when performance is considered early in the design.&lt;/p&gt;
&lt;h2&gt;
  
  
  1. Returning Too Much Data
&lt;/h2&gt;

&lt;p&gt;One of the most common API performance issues is &lt;strong&gt;returning entire database entities instead of only the fields needed by the client.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Large payloads increase:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;serialization time&lt;/li&gt;
&lt;li&gt;network transfer time&lt;/li&gt;
&lt;li&gt;client parsing time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Bad Example&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/users"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AppDbContext&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your &lt;code&gt;Users&lt;/code&gt; table has 20 columns but the frontend only needs three, you're wasting bandwidth and compute.&lt;/p&gt;

&lt;h2&gt;
  
  
  Better Approach: Use DTO Projection
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/users"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AppDbContext&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Users&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsNoTracking&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;UserDto&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Email&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Email&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;narrower SQL queries (only the selected columns are fetched)&lt;/li&gt;
&lt;li&gt;less memory usage&lt;/li&gt;
&lt;li&gt;faster serialization&lt;/li&gt;
&lt;li&gt;smaller responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This simple change can reduce payload sizes dramatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Blocking Threads Instead of Using Async
&lt;/h2&gt;

&lt;p&gt;ASP.NET Core is designed for &lt;strong&gt;asynchronous I/O.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When database calls or external requests are made synchronously, threads become blocked while waiting for results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Blocking Example&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToList&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under load, blocked threads reduce throughput and can cause &lt;strong&gt;thread pool starvation.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Correct Approach&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Async operations allow ASP.NET Core to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;free threads while waiting for I/O&lt;/li&gt;
&lt;li&gt;handle more concurrent requests&lt;/li&gt;
&lt;li&gt;scale efficiently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A common mistake I still see in production code is calling &lt;code&gt;.Result&lt;/code&gt; or &lt;code&gt;.Wait()&lt;/code&gt;. These should almost always be avoided in ASP.NET Core APIs.&lt;/p&gt;
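
&lt;p&gt;The effect of not blocking can be sketched without a database. Here &lt;code&gt;Task.Delay&lt;/code&gt; stands in for an I/O call, and &lt;code&gt;IoSketch.FetchAsync&lt;/code&gt; is a hypothetical helper; the 20 simulated calls wait concurrently without each holding a thread.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var stopwatch = Stopwatch.StartNew();

// Start 20 simulated I/O calls and await them together.
int[] results = await Task.WhenAll(
    Enumerable.Range(1, 20)
        .Select(id =&amp;gt; IoSketch.FetchAsync(id, CancellationToken.None)));

stopwatch.Stop();
Console.WriteLine($"{results.Length} calls finished in {stopwatch.ElapsedMilliseconds} ms");

static class IoSketch
{
    // Simulates a 100 ms I/O wait; no thread is blocked while the delay is pending.
    public static async Task&amp;lt;int&amp;gt; FetchAsync(int id, CancellationToken ct)
    {
        await Task.Delay(100, ct);
        return id * 2;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The total time should land close to a single delay rather than twenty of them. Replacing the &lt;code&gt;await&lt;/code&gt; with &lt;code&gt;.Result&lt;/code&gt; inside a request handler would instead pin a thread for the full wait.&lt;/p&gt;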

&lt;h2&gt;
  
  
  3. Inefficient Database Queries
&lt;/h2&gt;

&lt;p&gt;In most real-world APIs, &lt;strong&gt;the database is the primary performance bottleneck.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Common issues include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;N+1 queries&lt;/li&gt;
&lt;li&gt;missing indexes&lt;/li&gt;
&lt;li&gt;unnecessary joins&lt;/li&gt;
&lt;li&gt;over-fetching data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;N+1 Query Problem&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderItems&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderId&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If there are 100 orders, this produces &lt;strong&gt;101 database queries.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Better Approach&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Orders&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Include&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsNoTracking&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Always review your generated SQL queries. Small EF query mistakes can cause massive database load.&lt;/p&gt;
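
&lt;p&gt;One way to do that review is to print the SQL before the query runs. The snippet below is a minimal sketch that assumes EF Core 5 or later, where &lt;code&gt;ToQueryString()&lt;/code&gt; is available. It reuses the &lt;code&gt;db&lt;/code&gt; context and &lt;code&gt;ct&lt;/code&gt; token from the endpoint examples in this article, so it is meant to sit inside such an endpoint rather than run on its own.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.EntityFrameworkCore;

// Build the query first; nothing is sent to the database yet
var query = db.Orders
    .Include(o =&amp;gt; o.Items)
    .AsNoTracking();

// ToQueryString() returns the SQL EF Core would generate for this query
Console.WriteLine(query.ToQueryString());

// Execute only after the SQL looks reasonable
var orders = await query.ToListAsync(ct);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For a broader view during development, configuring &lt;code&gt;LogTo&lt;/code&gt; on the context options is another way to see every command EF Core executes.&lt;/p&gt;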

&lt;h2&gt;
  
  
  4. Ignoring Caching
&lt;/h2&gt;

&lt;p&gt;If the same data is requested repeatedly, hitting the database every time is wasteful.&lt;/p&gt;

&lt;p&gt;Caching can reduce response times dramatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-Memory Cache Example&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AppDbContext&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IMemoryCache&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;TryGetValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Products&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsNoTracking&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;MemoryCacheEntryOptions&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;AbsoluteExpirationRelativeToNow&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;SlidingExpiration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Caching is particularly effective for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;configuration data&lt;/li&gt;
&lt;li&gt;product catalogs&lt;/li&gt;
&lt;li&gt;reference tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For multi-instance deployments, a &lt;strong&gt;distributed cache like Redis&lt;/strong&gt; is usually a better choice.&lt;/p&gt;
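
&lt;p&gt;The check-then-set pattern above can also be written with &lt;code&gt;GetOrCreateAsync&lt;/code&gt;, which folds the lookup and the population into one call. The console sketch below assumes the &lt;code&gt;Microsoft.Extensions.Caching.Memory&lt;/code&gt; package; &lt;code&gt;LoadProductsAsync&lt;/code&gt; is a hypothetical stand-in for the database query, and the counter exists only to show how often it runs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

var cache = new MemoryCache(new MemoryCacheOptions());
var loaderCalls = 0;

// Hypothetical stand-in for the database query
async Task&amp;lt;List&amp;lt;string&amp;gt;&amp;gt; LoadProductsAsync()
{
    loaderCalls++;
    await Task.Delay(50);
    return new List&amp;lt;string&amp;gt; { "Laptop", "Keyboard", "Mouse" };
}

// Returns the cached list, or runs the loader and stores its result
Task&amp;lt;List&amp;lt;string&amp;gt;?&amp;gt; GetProductsAsync() =&amp;gt;
    cache.GetOrCreateAsync("products", entry =&amp;gt;
    {
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
        entry.SlidingExpiration = TimeSpan.FromMinutes(2);
        return LoadProductsAsync();
    });

await GetProductsAsync(); // cache miss: the loader runs
await GetProductsAsync(); // cache hit: served from memory

// Prints "Loader calls: 1" for these two sequential calls
Console.WriteLine($"Loader calls: {loaderCalls}");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that &lt;code&gt;GetOrCreateAsync&lt;/code&gt; does not lock per key, so several concurrent requests that miss at the same moment may each run the loader once.&lt;/p&gt;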

&lt;h2&gt;
  
  
  5. Missing Pagination on Large Endpoints
&lt;/h2&gt;

&lt;p&gt;Returning thousands of rows in a single API response can create serious performance issues.&lt;/p&gt;

&lt;p&gt;Problems include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;large payload sizes&lt;/li&gt;
&lt;li&gt;high memory consumption&lt;/li&gt;
&lt;li&gt;slow serialization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Proper Pagination&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/orders"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AppDbContext&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pageSize&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Clamp inputs so zero, negative, or oversized values never reach the query&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;pageSize&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Clamp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pageSize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Orders&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsNoTracking&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="c1"&gt;// A stable sort order keeps pages deterministic across requests&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;OrderBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Skip&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pageSize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pageSize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToListAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Always enforce a &lt;strong&gt;maximum page size&lt;/strong&gt; to prevent abuse.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Overloading the Middleware Pipeline
&lt;/h2&gt;

&lt;p&gt;Middleware runs on &lt;strong&gt;every request&lt;/strong&gt;, so unnecessary middleware can add latency.&lt;/p&gt;

&lt;p&gt;Common mistakes include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;heavy logging middleware&lt;/li&gt;
&lt;li&gt;redundant request parsing&lt;/li&gt;
&lt;li&gt;complex logic inside middleware&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A clean pipeline might look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseRouting&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthentication&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseAuthorization&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each additional middleware adds overhead, so keep the pipeline intentional and minimal.&lt;/p&gt;
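
&lt;p&gt;When a piece of middleware is only needed for some routes, one option is to branch the pipeline with &lt;code&gt;UseWhen&lt;/code&gt; so the extra work is skipped everywhere else. The sketch below assumes a .NET 6 or later web project with implicit usings; the &lt;code&gt;/admin&lt;/code&gt; path and the &lt;code&gt;X-Audited&lt;/code&gt; header are hypothetical placeholders.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Run the extra middleware only for requests under /admin (hypothetical path)
app.UseWhen(
    context =&amp;gt; context.Request.Path.StartsWithSegments("/admin"),
    branch =&amp;gt; branch.Use(async (context, next) =&amp;gt;
    {
        // Placeholder for work that only these routes need
        context.Response.Headers["X-Audited"] = "true";
        await next(context);
    }));

// Requests outside /admin never pay for the branch above
app.MapGet("/", () =&amp;gt; "ok");

app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;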

&lt;h2&gt;
  
  
  7. Not Using Response Compression
&lt;/h2&gt;

&lt;p&gt;Large JSON responses can significantly increase network latency.&lt;/p&gt;

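&lt;p&gt;ASP.NET Core provides built-in response compression middleware. By default it does not compress responses sent over HTTPS; that requires setting &lt;code&gt;EnableForHttps&lt;/code&gt; explicitly.&lt;/p&gt;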

&lt;p&gt;&lt;strong&gt;Enable Compression&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddResponseCompression&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseResponseCompression&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compression is especially helpful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;large JSON responses&lt;/li&gt;
&lt;li&gt;mobile networks&lt;/li&gt;
&lt;li&gt;APIs returning lists or datasets&lt;/li&gt;
&lt;/ul&gt;
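
&lt;p&gt;A slightly fuller configuration might look like the sketch below. It assumes a .NET 6 or later web project; enabling compression over HTTPS and choosing the fastest Brotli level are illustrative choices, not recommendations for every API, and compressing HTTPS responses has known security trade-offs that are worth reading about first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.IO.Compression;
using Microsoft.AspNetCore.ResponseCompression;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddResponseCompression(options =&amp;gt;
{
    // Off by default; opt in only after weighing the security trade-offs
    options.EnableForHttps = true;

    // Providers are tried in the order they are added
    options.Providers.Add&amp;lt;BrotliCompressionProvider&amp;gt;();
    options.Providers.Add&amp;lt;GzipCompressionProvider&amp;gt;();
});

// Favor speed over compression ratio for dynamic responses (assumed preference)
builder.Services.Configure&amp;lt;BrotliCompressionProviderOptions&amp;gt;(
    o =&amp;gt; o.Level = CompressionLevel.Fastest);

var app = builder.Build();

// Register before the endpoints whose responses should be compressed
app.UseResponseCompression();

app.MapGet("/", () =&amp;gt; "ok");

app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;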

&lt;h2&gt;
  
  
  8. Excessive Logging in Production
&lt;/h2&gt;

&lt;p&gt;Logging is essential, but logging too much can hurt performance.&lt;/p&gt;

&lt;p&gt;Common mistakes include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;logging entire request bodies&lt;/li&gt;
&lt;li&gt;debug-level logs in production&lt;/li&gt;
&lt;li&gt;logging inside loops&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Better Logging Approach&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogInformation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"Order created for user {UserId}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;userId&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



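&lt;p&gt;Structured logging with message templates keeps logs searchable by field (here, &lt;code&gt;UserId&lt;/code&gt;) and avoids formatting the message string when the log level is disabled.&lt;/p&gt;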
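
&lt;p&gt;On hot paths, the &lt;code&gt;LoggerMessage&lt;/code&gt; source generator (available from .NET 6) can reduce logging overhead further by generating the logging method at compile time. This is a minimal sketch: the &lt;code&gt;OrderLog&lt;/code&gt; class name, the event ID, and the &lt;code&gt;int&lt;/code&gt; type for &lt;code&gt;userId&lt;/code&gt; are assumptions made for illustration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.Extensions.Logging;

public static partial class OrderLog
{
    // The source generator emits the body of this partial method at compile time
    [LoggerMessage(
        EventId = 1001,
        Level = LogLevel.Information,
        Message = "Order created for user {UserId}")]
    public static partial void OrderCreated(ILogger logger, int userId);
}

// Usage from a handler or service:
// OrderLog.OrderCreated(logger, userId);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;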

&lt;h2&gt;
  
  
  9. Mismanaging Database Connections
&lt;/h2&gt;

&lt;p&gt;Creating database connections is expensive, which is why &lt;strong&gt;connection pooling exists.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;However, performance issues still occur when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DbContext lifetimes are misconfigured&lt;/li&gt;
&lt;li&gt;long-running queries hold connections&lt;/li&gt;
&lt;li&gt;transactions stay open too long&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Best practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use &lt;code&gt;DbContext&lt;/code&gt; with &lt;strong&gt;scoped lifetime&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;avoid long transactions&lt;/li&gt;
&lt;li&gt;keep queries efficient&lt;/li&gt;
&lt;/ul&gt;
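
&lt;p&gt;For reference, &lt;code&gt;AddDbContext&lt;/code&gt; registers the context with a scoped lifetime by default, which in a web app means one instance per request. The sketch below assumes the SQL Server provider package and a connection string named &lt;code&gt;Default&lt;/code&gt;; both are placeholders for whatever the application actually uses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Microsoft.EntityFrameworkCore;

var builder = WebApplication.CreateBuilder(args);

// Scoped by default: each request gets its own context,
// and the underlying connection goes back to the pool when the request ends
builder.Services.AddDbContext&amp;lt;AppDbContext&amp;gt;(options =&amp;gt;
    options.UseSqlServer(builder.Configuration.GetConnectionString("Default")));

var app = builder.Build();

app.Run();

// Minimal context so the sketch compiles; real entity sets are omitted
public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions&amp;lt;AppDbContext&amp;gt; options) : base(options) { }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;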

&lt;h2&gt;
  
  
  10. Skipping Load Testing
&lt;/h2&gt;

&lt;p&gt;Many APIs perform well during development but fail under real traffic.&lt;/p&gt;

&lt;p&gt;Performance problems often appear only when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hundreds of requests run concurrently&lt;/li&gt;
&lt;li&gt;database contention increases&lt;/li&gt;
&lt;li&gt;thread pools become saturated&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good load testing tools include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;k6&lt;/li&gt;
&lt;li&gt;Apache JMeter&lt;/li&gt;
&lt;li&gt;NBomber&lt;/li&gt;
&lt;li&gt;Azure Load Testing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Testing under realistic traffic conditions helps reveal bottlenecks before production users experience them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pro Tip: Measure Before Optimizing
&lt;/h2&gt;

&lt;p&gt;Optimization without measurement is mostly guesswork.&lt;/p&gt;

&lt;p&gt;Useful performance tools include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MiniProfiler&lt;/li&gt;
&lt;li&gt;Application Insights&lt;/li&gt;
&lt;li&gt;OpenTelemetry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These tools help identify the &lt;strong&gt;real bottleneck&lt;/strong&gt; instead of optimizing blindly.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Common Performance Trap
&lt;/h2&gt;

&lt;p&gt;One mistake I frequently see is developers trying to optimize application code first while ignoring the database.&lt;/p&gt;

&lt;p&gt;In many real systems, &lt;strong&gt;database queries account for 70–90% of total API response time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start your investigation there.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;To keep ASP.NET Core APIs fast and scalable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Return only the data clients actually need&lt;/li&gt;
&lt;li&gt;Use asynchronous I/O consistently&lt;/li&gt;
&lt;li&gt;Optimize database queries early&lt;/li&gt;
&lt;li&gt;Cache frequently requested data&lt;/li&gt;
&lt;li&gt;Implement pagination on large endpoints&lt;/li&gt;
&lt;li&gt;Keep middleware pipelines lean&lt;/li&gt;
&lt;li&gt;Enable response compression&lt;/li&gt;
&lt;li&gt;Avoid excessive logging in production&lt;/li&gt;
&lt;li&gt;Use proper DbContext lifetimes&lt;/li&gt;
&lt;li&gt;Load test before production traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Performance rarely comes from one big optimization.&lt;/p&gt;

&lt;p&gt;It usually comes from &lt;strong&gt;avoiding dozens of small mistakes that accumulate over time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're building high-traffic APIs, these small decisions can make the difference between an API that struggles under load and one that scales effortlessly.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>aspnetcore</category>
      <category>webapi</category>
      <category>performance</category>
    </item>
    <item>
      <title>ASP.NET Core Caching Explained: In-Memory, Redis, and Response Caching for High-Performance APIs</title>
      <dc:creator>alinabi19</dc:creator>
      <pubDate>Wed, 04 Mar 2026 08:24:57 +0000</pubDate>
      <link>https://dev.to/alinabi19/aspnet-core-caching-explained-in-memory-redis-and-response-caching-for-high-performance-apis-hm3</link>
      <guid>https://dev.to/alinabi19/aspnet-core-caching-explained-in-memory-redis-and-response-caching-for-high-performance-apis-hm3</guid>
      <description>&lt;p&gt;Modern APIs rarely fail because of logic.&lt;br&gt;
They fail because of &lt;strong&gt;performance&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Picture this.&lt;/p&gt;

&lt;p&gt;You deploy a clean ASP.NET Core API. Everything works perfectly during testing. But once real traffic arrives, things start getting ugly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Database CPU spikes&lt;/li&gt;
&lt;li&gt;Queries that took &lt;strong&gt;30 ms&lt;/strong&gt; now take &lt;strong&gt;600 ms&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Your API latency slowly creeps past &lt;strong&gt;1 second&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The problem usually isn’t the database itself.&lt;/p&gt;

&lt;p&gt;It’s that your API is &lt;strong&gt;doing the same expensive work repeatedly.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The same product list.&lt;br&gt;
The same configuration settings.&lt;br&gt;
The same reference data.&lt;/p&gt;

&lt;p&gt;Over and over again.&lt;/p&gt;

&lt;p&gt;This is where &lt;strong&gt;caching&lt;/strong&gt; becomes one of the most powerful tools in backend engineering.&lt;/p&gt;

&lt;p&gt;Yet many developers either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Avoid caching because it feels complicated&lt;/li&gt;
&lt;li&gt;Or implement it incorrectly and create stale data bugs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this guide, you'll learn &lt;strong&gt;how caching actually works in ASP.NET Core,&lt;/strong&gt; when to use each type, and how to implement it properly in production APIs.&lt;/p&gt;

&lt;p&gt;By the end, you'll know how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Improve API performance dramatically&lt;/li&gt;
&lt;li&gt;Reduce database load&lt;/li&gt;
&lt;li&gt;Build APIs that scale without infrastructure explosions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s start with the fundamentals.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why API Performance Matters
&lt;/h2&gt;

&lt;p&gt;API performance directly impacts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;User experience&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Infrastructure cost&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;System stability&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Consider this example:&lt;/p&gt;

&lt;p&gt;Without caching:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;10,000 requests/min
→ Each request hits the database
→ 10,000 database queries/min
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With caching:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;10,000 requests/min
→ Only 200 database queries
→ Remaining requests served from cache
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Faster responses&lt;/li&gt;
&lt;li&gt;Lower DB load&lt;/li&gt;
&lt;li&gt;Better scalability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Caching is often the &lt;strong&gt;highest ROI optimization&lt;/strong&gt; you can implement.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Caching
&lt;/h2&gt;

&lt;p&gt;Caching means &lt;strong&gt;storing expensive data temporarily so it can be reused.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of recomputing or querying the database repeatedly, the API retrieves the result from a &lt;strong&gt;fast storage layer.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Typical cache flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request arrives
      ↓
Check Cache
      ↓
Cache Hit → return data immediately
Cache Miss → fetch from DB → store in cache → return
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Cache the &lt;strong&gt;result of expensive operations&lt;/strong&gt;, not everything.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Types of Caching in ASP.NET Core
&lt;/h2&gt;

&lt;p&gt;ASP.NET Core provides several caching mechanisms:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. In-Memory Caching&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;2. Distributed Caching&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;3. Response Caching&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each solves a different problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-Memory Caching&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In-memory caching stores data &lt;strong&gt;inside the application server's memory.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extremely fast&lt;/li&gt;
&lt;li&gt;Easy to implement&lt;/li&gt;
&lt;li&gt;Best for &lt;strong&gt;single-instance APIs&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, it has a limitation.&lt;/p&gt;

&lt;p&gt;If you run multiple API instances (load balancing), each instance has its &lt;strong&gt;own separate cache.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product catalogs&lt;/li&gt;
&lt;li&gt;Configuration values&lt;/li&gt;
&lt;li&gt;Reference data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Distributed Caching (Redis / SQL Server)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Distributed caching stores data in &lt;strong&gt;external cache systems&lt;/strong&gt; like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Redis&lt;/li&gt;
&lt;li&gt;SQL Server&lt;/li&gt;
&lt;li&gt;NCache&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All application instances share the same cache.&lt;/p&gt;

&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Works with &lt;strong&gt;multiple API instances&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Ideal for &lt;strong&gt;cloud and microservices architectures&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Handles large-scale traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the &lt;strong&gt;standard choice for production systems&lt;/strong&gt; that run more than one instance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Response Caching&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Response caching stores &lt;strong&gt;entire HTTP responses.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of executing the controller again, ASP.NET Core returns the &lt;strong&gt;cached HTTP response.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Public APIs&lt;/li&gt;
&lt;li&gt;GET endpoints&lt;/li&gt;
&lt;li&gt;Static data endpoints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, it only works when responses are &lt;strong&gt;safe to cache.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  When to Use Each Type
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Cache Type&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Best Use Case&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Speed&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;In-Memory Cache&lt;/td&gt;
&lt;td&gt;Single instance apps&lt;/td&gt;
&lt;td&gt;Very fast&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distributed Cache&lt;/td&gt;
&lt;td&gt;Multi-server production APIs&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Response Cache&lt;/td&gt;
&lt;td&gt;Cache entire HTTP responses&lt;/td&gt;
&lt;td&gt;Very fast&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Rule of thumb:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Small apps → &lt;strong&gt;In-Memory&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Scalable APIs → &lt;strong&gt;Redis Distributed Cache&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Static GET responses → &lt;strong&gt;Response Caching&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Step-by-Step Implementation
&lt;/h2&gt;

&lt;p&gt;Let's implement each caching strategy.&lt;/p&gt;
&lt;h2&gt;
  
  
  In-Memory Cache Example
&lt;/h2&gt;

&lt;p&gt;First, register the memory cache service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Program.cs (.NET 8)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WebApplication&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddMemoryCache&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddControllers&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;MapControllers&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Using IMemoryCache&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
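&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.AspNetCore.Mvc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.Caching.Memory&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ApiController&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;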
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"api/products"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductsController&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ControllerBase&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IMemoryCache&lt;/span&gt; &lt;span class="n"&gt;_cache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;ProductsController&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IMemoryCache&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_cache&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;HttpGet&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IActionResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GetProducts&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;cacheKey&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"product_list"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;_cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;TryGetValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Simulate expensive DB call&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Delay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;500&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="s"&gt;"Laptop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"Keyboard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"Mouse"&lt;/span&gt;
            &lt;span class="p"&gt;};&lt;/span&gt;

            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;cacheOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;MemoryCacheEntryOptions&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;AbsoluteExpirationRelativeToNow&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;SlidingExpiration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;};&lt;/span&gt;

            &lt;span class="n"&gt;_cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cacheOptions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expiration Strategies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Absolute Expiration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Cache expires after a fixed time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
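&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;AbsoluteExpirationRelativeToNow&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;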
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Sliding Expiration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The expiration window resets every time the cached entry is accessed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
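&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;SlidingExpiration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;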
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



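&lt;p&gt;Combining both bounds how stale the data can get: the sliding window evicts entries that sit idle, while the absolute limit guarantees that even a frequently accessed entry is refreshed once its fixed lifetime ends.&lt;/p&gt;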

&lt;h2&gt;
  
  
  Cache Invalidation Example
&lt;/h2&gt;

&lt;p&gt;Cache invalidation is crucial when data changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;HttpPost&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;IActionResult&lt;/span&gt; &lt;span class="nf"&gt;AddProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Save to database&lt;/span&gt;

    &lt;span class="n"&gt;_cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"product_list"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures fresh data is fetched on the next request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Redis Distributed Cache Example
&lt;/h2&gt;

&lt;p&gt;First install the package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;dotnet add package Microsoft.Extensions.Caching.StackExchangeRedis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Configure Redis&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddStackExchangeRedisCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Configuration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"localhost:6379"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InstanceName&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"MyApiCache"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Using IDistributedCache&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
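&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System.Text.Json&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.AspNetCore.Mvc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.Caching.Distributed&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductsController&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ControllerBase&lt;/span&gt;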
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IDistributedCache&lt;/span&gt; &lt;span class="n"&gt;_cache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;ProductsController&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IDistributedCache&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_cache&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;HttpGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"redis"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IActionResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;GetProductsRedis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;cacheKey&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;cachedData&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetStringAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cachedData&lt;/span&gt; &lt;span class="p"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;JsonSerializer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deserialize&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;cachedData&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;"Laptop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"Keyboard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"Mouse"&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;DistributedCacheEntryOptions&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;AbsoluteExpirationRelativeToNow&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromMinutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;15&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SetStringAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;JsonSerializer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Serialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;options&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now &lt;strong&gt;all API instances share the same cache.&lt;/strong&gt;&lt;/p&gt;
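
&lt;p&gt;The read, deserialize, load, and write steps above tend to repeat in every endpoint. One way to factor them out is a small extension method. This is only a sketch: &lt;code&gt;CacheAsideExtensions&lt;/code&gt; and &lt;code&gt;GetOrSetAsync&lt;/code&gt; are hypothetical names, not part of &lt;code&gt;IDistributedCache&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;

public static class CacheAsideExtensions
{
    // Hypothetical helper: returns the cached value, or runs the factory and caches its result
    public static async Task&amp;lt;T&amp;gt; GetOrSetAsync&amp;lt;T&amp;gt;(
        this IDistributedCache cache,
        string key,
        Func&amp;lt;Task&amp;lt;T&amp;gt;&amp;gt; factory,
        TimeSpan ttl,
        CancellationToken ct = default)
    {
        var cached = await cache.GetStringAsync(key, ct);
        if (cached is not null)
        {
            var value = JsonSerializer.Deserialize&amp;lt;T&amp;gt;(cached);
            if (value is not null)
            {
                return value;
            }
        }

        // Cache miss (or an unreadable entry): load the data and store it
        var fresh = await factory();

        await cache.SetStringAsync(
            key,
            JsonSerializer.Serialize(fresh),
            new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = ttl },
            ct);

        return fresh;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With this in place, the action body could shrink to something like &lt;code&gt;await _cache.GetOrSetAsync("products", () =&amp;gt; LoadProductsAsync(), TimeSpan.FromMinutes(15))&lt;/code&gt;, where &lt;code&gt;LoadProductsAsync&lt;/code&gt; stands for whatever query loads the data.&lt;/p&gt;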

&lt;h2&gt;
  
  
  Response Caching Example
&lt;/h2&gt;

&lt;p&gt;First enable response caching.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Program.cs&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Services&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddResponseCaching&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;UseResponseCaching&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Controller Example&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;HttpGet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"catalog"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;ResponseCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;IActionResult&lt;/span&gt; &lt;span class="nf"&gt;GetCatalog&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Message&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Cached Response"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Time&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DateTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UtcNow&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



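&lt;p&gt;The response now carries a &lt;code&gt;Cache-Control: public,max-age=60&lt;/code&gt; header, so the response caching middleware and other HTTP caches may reuse it for &lt;strong&gt;60 seconds&lt;/strong&gt;.&lt;/p&gt;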
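
&lt;p&gt;When the response depends on a query string value, the cached copy has to vary on it, otherwise every caller would get the first response that was stored. A minimal sketch, assuming a hypothetical &lt;code&gt;CatalogController&lt;/code&gt; with a &lt;code&gt;category&lt;/code&gt; parameter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/[controller]")]
public class CatalogController : ControllerBase
{
    // Hypothetical endpoint: one cached response per distinct "category" value
    [HttpGet("search")]
    [ResponseCache(Duration = 60, VaryByQueryKeys = new[] { "category" })]
    public IActionResult Search([FromQuery] string category)
    {
        return Ok(new { Category = category, Time = DateTime.UtcNow });
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;VaryByQueryKeys&lt;/code&gt; only works when the response caching middleware is enabled, as in the &lt;code&gt;Program.cs&lt;/code&gt; setup above.&lt;/p&gt;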

&lt;h2&gt;
  
  
  Production Insight: Cache Stampede
&lt;/h2&gt;

&lt;p&gt;A common issue in high-traffic systems is &lt;strong&gt;cache stampede.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a cache entry expires, many concurrent requests may attempt to recompute the same value simultaneously.&lt;/p&gt;

&lt;p&gt;This can overwhelm your database.&lt;/p&gt;

&lt;p&gt;Typical mitigation strategies include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using &lt;strong&gt;SemaphoreSlim locking&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Using &lt;code&gt;Lazy&amp;lt;T&amp;gt;&lt;/code&gt; caching&lt;/li&gt;
&lt;li&gt;Refreshing cache &lt;strong&gt;in the background&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Using &lt;strong&gt;stale-while-revalidate patterns&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without protection, a popular endpoint can trigger thousands of database queries the moment a cache expires.&lt;/p&gt;
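
&lt;p&gt;The &lt;code&gt;SemaphoreSlim&lt;/code&gt; approach can be sketched with nothing but the base class library. &lt;code&gt;SingleFlightCache&lt;/code&gt; is a hypothetical name, and the sketch makes two simplifying assumptions: entries never expire, and a single lock guards all keys.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class SingleFlightCache
{
    private readonly ConcurrentDictionary&amp;lt;string, string&amp;gt; _values = new();
    private readonly SemaphoreSlim _gate = new(1, 1);

    public async Task&amp;lt;string&amp;gt; GetOrAddAsync(string key, Func&amp;lt;Task&amp;lt;string&amp;gt;&amp;gt; factory)
    {
        // Fast path: no locking when the value is already cached
        if (_values.TryGetValue(key, out var cached))
        {
            return cached;
        }

        await _gate.WaitAsync();
        try
        {
            // Re-check: another request may have filled the cache while we waited
            if (_values.TryGetValue(key, out cached))
            {
                return cached;
            }

            // Only one caller at a time reaches the expensive factory
            var fresh = await factory();
            _values[key] = fresh;
            return fresh;
        }
        finally
        {
            _gate.Release();
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The second lookup inside the lock is what stops the stampede: requests that queued behind the first one find the value already cached and skip the database. A per-key lock would reduce contention between unrelated keys, at the cost of more bookkeeping.&lt;/p&gt;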

&lt;h2&gt;
  
  
  Real-World Use Cases
&lt;/h2&gt;

&lt;p&gt;Caching shines in scenarios like:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Product catalog APIs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Product data changes rarely but is requested constantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Configuration endpoints&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Feature flags, settings, metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dashboard APIs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Expensive aggregations or analytics queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference data&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Countries, currencies, categories.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Mistakes Developers Make
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Caching everything&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not all data should be cached. Cache results that are expensive to produce and read far more often than they change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Forgetting cache invalidation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Stale data bugs happen when the cache isn't cleared after updates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using in-memory cache in scaled systems&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Multiple servers = multiple caches = inconsistent data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using very long expiration times&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Leads to stale responses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance &amp;amp; Scalability Considerations
&lt;/h2&gt;

&lt;p&gt;When designing caching strategies, think about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Cache eviction policies&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory consumption&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cache hit ratio&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Invalidation strategy&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Monitor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cache hits vs misses&lt;/li&gt;
&lt;li&gt;Redis latency&lt;/li&gt;
&lt;li&gt;Database query reductions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A good cache system should &lt;strong&gt;dramatically reduce database load.&lt;/strong&gt;&lt;/p&gt;
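
&lt;p&gt;Hits and misses can be counted with a few lines of thread-safe code. The following is a sketch only; &lt;code&gt;CacheStats&lt;/code&gt; is a hypothetical class, and a production system would more likely export these numbers through its existing metrics tooling.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Threading;

public class CacheStats
{
    private long _hits;
    private long _misses;

    // Interlocked keeps the counters correct under concurrent requests
    public void RecordHit() =&amp;gt; Interlocked.Increment(ref _hits);
    public void RecordMiss() =&amp;gt; Interlocked.Increment(ref _misses);

    // Fraction of lookups served from the cache; 0 when nothing has been recorded yet
    public double HitRatio
    {
        get
        {
            long hits = Interlocked.Read(ref _hits);
            long misses = Interlocked.Read(ref _misses);
            long total = hits + misses;
            return total == 0 ? 0 : (double)hits / total;
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Call &lt;code&gt;RecordHit()&lt;/code&gt; in the branch that returns cached data and &lt;code&gt;RecordMiss()&lt;/code&gt; in the branch that loads from the database. A ratio that stays low suggests the keys or expiration times need another look.&lt;/p&gt;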
&lt;h2&gt;
  
  
  Best Practices for Production APIs
&lt;/h2&gt;

&lt;p&gt;Follow these principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cache &lt;strong&gt;read-heavy endpoints&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Redis for distributed systems&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Always define &lt;strong&gt;expiration policies&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Implement &lt;strong&gt;cache invalidation&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Monitor cache performance&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Pro Tip
&lt;/h2&gt;

&lt;p&gt;Cache &lt;strong&gt;DTO results&lt;/strong&gt;, not raw entities.&lt;/p&gt;

&lt;p&gt;DTOs carry only the fields the API returns, which keeps serialized payloads small, and an immutable DTO cannot be modified by accident after it has been cached.&lt;/p&gt;
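
&lt;p&gt;A minimal sketch of the idea, using a hypothetical &lt;code&gt;Product&lt;/code&gt; entity and &lt;code&gt;ProductDto&lt;/code&gt; record (neither comes from the examples above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, standing in for an ORM-mapped type
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public decimal Cost { get; set; } // internal field, not meant for API consumers
}

// Immutable DTO: only the fields the API returns
public record ProductDto(int Id, string Name);

public static class ProductMapping
{
    // Map entities to DTOs before the result goes into the cache
    public static IReadOnlyList&amp;lt;ProductDto&amp;gt; ToDtos(IEnumerable&amp;lt;Product&amp;gt; products) =&amp;gt;
        products.Select(p =&amp;gt; new ProductDto(p.Id, p.Name)).ToList();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The cache then stores the output of &lt;code&gt;ToDtos&lt;/code&gt; rather than the entities themselves.&lt;/p&gt;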
&lt;h2&gt;
  
  
  Common Pitfall
&lt;/h2&gt;

&lt;p&gt;Never cache &lt;strong&gt;user-specific data globally&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;GET&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="n"&gt;orders&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If cached improperly, users might receive &lt;strong&gt;other users’ data.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Always include &lt;strong&gt;user context in cache keys&lt;/strong&gt; when needed.&lt;/p&gt;
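
&lt;p&gt;One way to make this hard to get wrong is to build user-scoped keys in a single place. The sketch below assumes a hypothetical &lt;code&gt;CacheKeys&lt;/code&gt; helper and an &lt;code&gt;orders:{userId}&lt;/code&gt; key format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

public static class CacheKeys
{
    // Hypothetical key format: "orders:{userId}"
    public static string OrdersFor(string userId)
    {
        if (string.IsNullOrWhiteSpace(userId))
        {
            // Refuse to build a shared key when the user is unknown
            throw new ArgumentException("A user id is required for user-scoped cache keys.", nameof(userId));
        }

        return $"orders:{userId}";
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The user id would typically come from the authenticated request, so two users never read or overwrite the same entry.&lt;/p&gt;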

&lt;h2&gt;
  
  
  Why Caching Is a Game Changer for APIs
&lt;/h2&gt;

&lt;p&gt;Caching is one of the &lt;strong&gt;simplest ways to improve API performance.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A well-designed caching layer can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduce database load dramatically&lt;/li&gt;
&lt;li&gt;Improve API latency&lt;/li&gt;
&lt;li&gt;Increase scalability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And the best part?&lt;/p&gt;

&lt;p&gt;On read-heavy endpoints you can often get &lt;strong&gt;large performance improvements, sometimes close to 10x, with minimal code changes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add &lt;strong&gt;in-memory caching&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Move to &lt;strong&gt;Redis when scaling&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;response caching for public endpoints&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Small optimizations like these are what separate &lt;strong&gt;average APIs from high-performance systems.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Recap
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Caching reduces repeated expensive operations&lt;/li&gt;
&lt;li&gt;ASP.NET Core supports &lt;strong&gt;multiple caching strategies&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Redis enables &lt;strong&gt;scalable distributed caching&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Cache invalidation is critical&lt;/li&gt;
&lt;li&gt;Always implement &lt;strong&gt;expiration policies&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  One Question for You
&lt;/h2&gt;

&lt;p&gt;What caching strategy are you currently using in your ASP.NET Core APIs?&lt;/p&gt;

&lt;p&gt;In-memory? Redis? Something else?&lt;/p&gt;

&lt;p&gt;I'd love to hear your experience.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>webdev</category>
      <category>backend</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
