Building LLM-powered applications in C# has never been more accessible—or more fragmented. With Azure OpenAI, OpenAI, Anthropic, Google, Ollama, and countless other providers, developers face a familiar problem: vendor lock-in at the SDK level.
You start with OpenAI. Things go well. Then your team decides to try Azure OpenAI for compliance reasons. Or you want to run Llama 3 locally during development to save costs. Suddenly, you're refactoring service classes, updating DI registrations, and maintaining multiple code paths.
This is exactly the problem Microsoft.Extensions.AI was designed to solve.
What is Microsoft.Extensions.AI?
Microsoft.Extensions.AI (which reached its first stable release in 2025) provides a unified abstraction layer for AI services in .NET. Think of it as what ILogger did for logging or IDistributedCache did for caching—but for AI.
At its core, the library defines two primary interfaces:
- IChatClient: for conversational AI (chat completions)
- IEmbeddingGenerator<TInput, TEmbedding>: for generating embeddings
These interfaces are provider-agnostic. Your application code depends on the abstraction, while the concrete implementation can be swapped at configuration time.
public interface IChatClient : IDisposable
{
    Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<StreamingChatCompletionUpdate> CompleteStreamingAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    ChatClientMetadata Metadata { get; }

    TService? GetService<TService>(object? key = null) where TService : class;
}
The Problem with Provider-Specific SDKs
Let's look at what happens without an abstraction layer. Here's typical code using the Azure OpenAI SDK directly:
public class ContentService
{
    private readonly AzureOpenAIClient _client;
    private readonly string _deploymentName;

    public ContentService(AzureOpenAIClient client, string deploymentName)
    {
        _client = client;
        _deploymentName = deploymentName;
    }

    public async Task<string> SummarizeAsync(string content)
    {
        var chatClient = _client.GetChatClient(_deploymentName);

        var messages = new List<ChatMessage>
        {
            new SystemChatMessage("You are a helpful summarizer."),
            new UserChatMessage($"Summarize this:\n\n{content}")
        };

        ChatCompletion completion = await chatClient.CompleteChatAsync(messages);
        return completion.Content[0].Text;
    }
}
This works fine—until requirements change:
- Testing becomes painful. AzureOpenAIClient is a concrete class, so you either mock HTTP responses or rely exclusively on integration tests.
- Local development is expensive. Every test run and every debugging session hits the Azure API and costs money.
- Switching providers requires code changes. Want to try Claude or Gemini? Time to refactor.
Building with Microsoft.Extensions.AI
Here's the same service written against the abstraction:
public class ContentService
{
    private readonly IChatClient _chatClient;

    public ContentService(IChatClient chatClient)
    {
        _chatClient = chatClient;
    }

    public async Task<string> SummarizeAsync(string content)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are a helpful summarizer."),
            new(ChatRole.User, $"Summarize this:\n\n{content}")
        };

        var result = await _chatClient.CompleteAsync(messages);
        return result.Message.Text ?? string.Empty;
    }
}
Notice what's not in this code:
- No Azure-specific types
- No deployment names
- No provider-specific configuration
The ContentService now depends only on IChatClient. How that interface is implemented is a concern of the composition root—not the service.
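Because ContentService depends only on an interface, unit tests can substitute a hand-rolled fake instead of mocking HTTP. Here's a minimal sketch against the IChatClient surface shown earlier; FakeChatClient is a hypothetical test double, and exact member signatures vary between package versions:

```csharp
// A test double that returns a canned reply, so ContentService
// can be unit-tested without any network calls.
public sealed class FakeChatClient : IChatClient
{
    public ChatClientMetadata Metadata { get; } = new("fake");

    public Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => Task.FromResult(new ChatCompletion(
            new ChatMessage(ChatRole.Assistant, "A canned summary.")));

    public IAsyncEnumerable<StreamingChatCompletionUpdate> CompleteStreamingAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotSupportedException();

    public TService? GetService<TService>(object? key = null)
        where TService : class => null;

    public void Dispose() { }
}

// Usage in a test body:
// var service = new ContentService(new FakeChatClient());
// var summary = await service.SummarizeAsync("long text");
```

No mocking framework, no HTTP interception: the fake is a few lines of plain C#.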
Provider Registration Patterns
The real power emerges in how you configure providers. Let's look at several registration patterns.
Azure OpenAI
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;

builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    var client = new AzureOpenAIClient(
        new Uri(config["AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential());
    return client.AsChatClient(config["AzureOpenAI:DeploymentName"]!);
});
OpenAI
using OpenAI;
using Microsoft.Extensions.AI;
builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    var client = new OpenAIClient(config["OpenAI:ApiKey"]!);
    return client.AsChatClient("gpt-4o");
});
Ollama (Local Development)
using Microsoft.Extensions.AI.Ollama;
builder.Services.AddChatClient(sp =>
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.2"));
GitHub Models (Great for Testing)
using OpenAI;
using System.ClientModel;
using Microsoft.Extensions.AI;

builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();

    // GitHub Models exposes an OpenAI-compatible API
    var client = new OpenAIClient(
        new ApiKeyCredential(config["GitHub:Token"]!),
        new OpenAIClientOptions
        {
            Endpoint = new Uri("https://models.inference.ai.azure.com")
        });

    return client.AsChatClient("gpt-4o");
});
Configuration-Driven Provider Selection
For maximum flexibility, drive provider selection from configuration:
// appsettings.json
{
  "AI": {
    "Provider": "AzureOpenAI",
    "AzureOpenAI": {
      "Endpoint": "https://myresource.openai.azure.com",
      "DeploymentName": "gpt-4o"
    },
    "OpenAI": {
      "ApiKey": "sk-..."
    },
    "Ollama": {
      "Endpoint": "http://localhost:11434",
      "ModelId": "llama3.2"
    }
  }
}
builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    var provider = config["AI:Provider"];

    return provider switch
    {
        "AzureOpenAI" => CreateAzureOpenAIClient(config),
        "OpenAI" => CreateOpenAIClient(config),
        "Ollama" => CreateOllamaClient(config),
        "GitHubModels" => CreateGitHubModelsClient(config), // see the GitHub Models registration above
        _ => throw new InvalidOperationException($"Unknown AI provider: {provider}")
    };
});
IChatClient CreateAzureOpenAIClient(IConfiguration config)
{
    var section = config.GetSection("AI:AzureOpenAI");
    var client = new AzureOpenAIClient(
        new Uri(section["Endpoint"]!),
        new DefaultAzureCredential());
    return client.AsChatClient(section["DeploymentName"]!);
}

IChatClient CreateOpenAIClient(IConfiguration config)
{
    var section = config.GetSection("AI:OpenAI");
    var client = new OpenAIClient(section["ApiKey"]!);
    return client.AsChatClient(section["ModelId"] ?? "gpt-4o");
}

IChatClient CreateOllamaClient(IConfiguration config)
{
    var section = config.GetSection("AI:Ollama");
    return new OllamaChatClient(
        new Uri(section["Endpoint"] ?? "http://localhost:11434"),
        section["ModelId"] ?? "llama3.2");
}
This pattern lets you:
- Use Ollama locally (free, fast, private)
- Use OpenAI in staging
- Use Azure OpenAI in production
- Switch with a single config change
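ASP.NET Core's environment-specific configuration files make this switch automatic: settings in appsettings.{Environment}.json override the base file. A sketch, assuming the AI:Provider key from the example above:

```json
// appsettings.Development.json — overrides appsettings.json so
// local runs hit Ollama instead of a paid cloud endpoint.
{
  "AI": {
    "Provider": "Ollama"
  }
}
```

With this in place, developers get free local inference by default, while staging and production keep their own provider settings.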
Middleware and Decorators
One of the most powerful features of Extensions.AI is the middleware pattern. Middleware wraps your chat client to add cross-cutting concerns without modifying your application code.
Using the Builder Pattern
builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();

    return new AzureOpenAIClient(
        new Uri(config["AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential())
        .AsChatClient(config["AzureOpenAI:DeploymentName"]!)
        .AsBuilder()
        .UseLogging(sp.GetRequiredService<ILoggerFactory>())
        .UseDistributedCache(sp.GetRequiredService<IDistributedCache>())
        .Build(sp);
});
Built-in Middleware
Extensions.AI includes several middleware components:
Logging Middleware logs all completions with structured data:
.UseLogging(loggerFactory)
Distributed Caching caches responses to reduce costs for repeated queries:
.UseDistributedCache(distributedCache, options =>
{
    options.CacheExpiration = TimeSpan.FromHours(1);
})
OpenTelemetry adds tracing spans for observability:
.UseOpenTelemetry(loggerFactory, sourceName: "AI.Chat")
Custom Middleware
You can create custom middleware for any cross-cutting concern:
public class RateLimitingMiddleware : DelegatingChatClient
{
    private readonly RateLimiter _limiter;

    public RateLimitingMiddleware(IChatClient inner, RateLimiter limiter)
        : base(inner)
    {
        _limiter = limiter;
    }

    public override async Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        using var lease = await _limiter.AcquireAsync(1, cancellationToken);
        if (!lease.IsAcquired)
            throw new InvalidOperationException("Rate limit exceeded");

        return await base.CompleteAsync(chatMessages, options, cancellationToken);
    }
}

// Extension method for fluent registration
public static class RateLimitingExtensions
{
    public static ChatClientBuilder UseRateLimiting(
        this ChatClientBuilder builder,
        RateLimiter limiter)
    {
        return builder.Use((inner, sp) =>
            new RateLimitingMiddleware(inner, limiter));
    }
}
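Wiring the custom middleware into the pipeline then looks like any other builder call. A sketch using System.Threading.RateLimiting and the CreateAzureOpenAIClient helper defined later in this post; the limit values are purely illustrative:

```csharp
using System.Threading.RateLimiting;

// Allow at most 10 chat completions per 10-second window;
// excess requests fail immediately rather than queueing.
var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 10,
    Window = TimeSpan.FromSeconds(10),
    QueueLimit = 0
});

builder.Services.AddChatClient(sp =>
    CreateAzureOpenAIClient(sp.GetRequiredService<IConfiguration>())
        .AsBuilder()
        .UseRateLimiting(limiter)
        .Build(sp));
```

Because the limiter wraps the client rather than the endpoint, every caller that resolves IChatClient shares the same budget.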
Embedding Generation
The same patterns apply to embeddings. Here's how to register an embedding generator:
builder.Services.AddEmbeddingGenerator<string, Embedding<float>>(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    var client = new AzureOpenAIClient(
        new Uri(config["AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential());
    return client.AsEmbeddingGenerator(config["AzureOpenAI:EmbeddingModel"]!);
});
Usage is straightforward:
public class SearchService
{
    private readonly IEmbeddingGenerator<string, Embedding<float>> _embedder;

    public SearchService(IEmbeddingGenerator<string, Embedding<float>> embedder)
    {
        _embedder = embedder;
    }

    public async Task<float[]> GetEmbeddingAsync(string text)
    {
        var embedding = await _embedder.GenerateAsync(text);
        return embedding.Vector.ToArray();
    }

    public async Task<IReadOnlyList<Embedding<float>>> GetEmbeddingsAsync(
        IEnumerable<string> texts)
    {
        return await _embedder.GenerateAsync(texts.ToList());
    }
}
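Once you have vectors, similarity ranking is plain math. A minimal cosine-similarity helper, independent of any provider (shown with raw float[] for clarity):

```csharp
public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), ranging over [-1, 1].
    // Higher values mean the texts behind the vectors are more similar.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, magA = 0, magB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            magA += a[i] * a[i];
            magB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
    }
}

// Identical vectors score 1.0; orthogonal vectors score 0.0:
// VectorMath.CosineSimilarity(new[] { 1f, 0f }, new[] { 1f, 0f })  // 1.0
// VectorMath.CosineSimilarity(new[] { 1f, 0f }, new[] { 0f, 1f })  // 0.0
```

Pair this with GetEmbeddingAsync above to rank documents against a query vector; for large corpora you'd hand this off to a vector store instead.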
When to Use Extensions.AI vs. Raw SDKs
The abstraction isn't always the right choice. Here's a decision framework:
Use Microsoft.Extensions.AI when:
- You want provider flexibility
- You're building a library or reusable component
- You want middleware for logging, caching, rate limiting
- You prioritize testability
- Your application may run in multiple environments
Consider raw SDKs when:
- You need provider-specific features not in the abstraction
- You're doing low-level optimizations (custom HTTP handlers, etc.)
- You're prototyping and speed matters more than architecture
- You're absolutely certain you'll never switch providers
Hybrid approach: You can always escape the abstraction using GetService<T>():
public async Task DoProviderSpecificThingAsync()
{
    // Get the underlying provider client if needed
    var azureClient = _chatClient.GetService<AzureOpenAIClient>();
    if (azureClient != null)
    {
        // Do Azure-specific thing
    }
}
NuGet Packages
Here are the packages you'll need:
<!-- Core abstractions -->
<PackageReference Include="Microsoft.Extensions.AI.Abstractions" Version="9.0.0" />
<PackageReference Include="Microsoft.Extensions.AI" Version="9.0.0" />
<!-- Provider implementations -->
<PackageReference Include="Microsoft.Extensions.AI.OpenAI" Version="9.0.0" />
<PackageReference Include="Microsoft.Extensions.AI.Ollama" Version="9.0.0" />
<!-- Azure OpenAI (uses OpenAI implementation) -->
<PackageReference Include="Azure.AI.OpenAI" Version="2.1.0" />
Putting It All Together
Here's a complete Program.cs showing a production-ready setup:
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;
using OpenAI;

var builder = WebApplication.CreateBuilder(args);

// Register chat client with middleware stack
builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    var provider = config["AI:Provider"] ?? "Ollama";

    IChatClient baseClient = provider switch
    {
        "AzureOpenAI" => CreateAzureClient(config),
        "OpenAI" => CreateOpenAIClient(config),
        _ => new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.2")
    };

    return baseClient
        .AsBuilder()
        .UseLogging(sp.GetRequiredService<ILoggerFactory>())
        .UseDistributedCache(sp.GetRequiredService<IDistributedCache>())
        .UseOpenTelemetry(sourceName: "App.AI")
        .Build(sp);
});

// Register application services
builder.Services.AddScoped<ContentService>();
builder.Services.AddDistributedMemoryCache();

var app = builder.Build();

// Simple endpoint using the abstraction
app.MapPost("/summarize", async (string content, ContentService svc) =>
{
    var summary = await svc.SummarizeAsync(content);
    return Results.Ok(new { summary });
});

app.Run();

IChatClient CreateAzureClient(IConfiguration config)
{
    var client = new AzureOpenAIClient(
        new Uri(config["AI:AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential());
    return client.AsChatClient(config["AI:AzureOpenAI:DeploymentName"]!);
}

IChatClient CreateOpenAIClient(IConfiguration config)
{
    var client = new OpenAIClient(config["AI:OpenAI:ApiKey"]!);
    return client.AsChatClient("gpt-4o");
}
What's Next
In Part 2, we'll explore function calling and structured outputs—turning LLMs from text generators into reliable decision engines that return structured data you can actually use.
We'll cover:
- Defining functions with attributes
- The function-calling loop
- JSON mode and schema generation
- Validation and retry patterns
- Token-efficient function definitions
This is Part 1 of the "Generative AI Patterns in C#" series. Subscribe to follow along as we build production-ready AI applications.