Brian Spann

Generative AI in C#: Building Provider-Agnostic LLM Applications with Microsoft.Extensions.AI

Building LLM-powered applications in C# has never been more accessible—or more fragmented. With Azure OpenAI, OpenAI, Anthropic, Google, Ollama, and countless other providers, developers face a familiar problem: vendor lock-in at the SDK level.

You start with OpenAI. Things go well. Then your team decides to try Azure OpenAI for compliance reasons. Or you want to run Llama 3 locally during development to save costs. Suddenly, you're refactoring service classes, updating DI registrations, and maintaining multiple code paths.

This is exactly the problem Microsoft.Extensions.AI was designed to solve.

What is Microsoft.Extensions.AI?

Microsoft.Extensions.AI (introduced in preview in late 2024, with a stable release following in 2025) provides a unified abstraction layer for AI services in .NET. Think of it as doing for AI what ILogger did for logging and IDistributedCache did for caching.

At its core, the library defines two primary interfaces:

  • IChatClient: For conversational AI (chat completions)
  • IEmbeddingGenerator<TInput, TEmbedding>: For generating embeddings

These interfaces are provider-agnostic. Your application code depends on the abstraction, while the concrete implementation can be swapped at configuration time.

// Core shape of the abstraction (simplified; member names have shifted
// between preview and stable releases)
public interface IChatClient : IDisposable
{
    Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<StreamingChatCompletionUpdate> CompleteStreamingAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    ChatClientMetadata Metadata { get; }
    TService? GetService<TService>(object? key = null) where TService : class;
}
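The streaming half of the interface is consumed with await foreach. Here is a minimal sketch against the surface above (PrintStreamAsync is a hypothetical helper, not part of the library):

```csharp
// Streams each chunk of generated text to the console as it arrives.
static async Task PrintStreamAsync(IChatClient client, string prompt)
{
    var messages = new List<ChatMessage> { new(ChatRole.User, prompt) };

    await foreach (var update in client.CompleteStreamingAsync(messages))
    {
        // Each update carries the next fragment of the response text
        Console.Write(update.Text);
    }
}
```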

The Problem with Provider-Specific SDKs

Let's look at what happens without an abstraction layer. Here's typical code using the Azure OpenAI SDK directly:

using Azure.AI.OpenAI;
using OpenAI.Chat;

public class ContentService
{
    private readonly AzureOpenAIClient _client;
    private readonly string _deploymentName;

    public ContentService(AzureOpenAIClient client, string deploymentName)
    {
        _client = client;
        _deploymentName = deploymentName;
    }

    public async Task<string> SummarizeAsync(string content)
    {
        var chatClient = _client.GetChatClient(_deploymentName);

        var messages = new List<ChatMessage>
        {
            new SystemChatMessage("You are a helpful summarizer."),
            new UserChatMessage($"Summarize this:\n\n{content}")
        };

        ChatCompletion completion = await chatClient.CompleteChatAsync(messages);
        return completion.Content[0].Text;
    }
}

This works fine—until requirements change:

  1. Testing becomes painful. The AzureOpenAIClient is a concrete class. You need to mock HTTP responses or use integration tests exclusively.
  2. Local development is expensive. Every test run, every debugging session hits the Azure API and costs money.
  3. Switching providers requires code changes. Want to try Claude or Gemini? Time to refactor.

Building with Microsoft.Extensions.AI

Here's the same service written against the abstraction:

using Microsoft.Extensions.AI;

public class ContentService
{
    private readonly IChatClient _chatClient;

    public ContentService(IChatClient chatClient)
    {
        _chatClient = chatClient;
    }

    public async Task<string> SummarizeAsync(string content)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are a helpful summarizer."),
            new(ChatRole.User, $"Summarize this:\n\n{content}")
        };

        var result = await _chatClient.CompleteAsync(messages);
        return result.Message.Text ?? string.Empty;
    }
}

Notice what's not in this code:

  • No Azure-specific types
  • No deployment names
  • No provider-specific configuration

The ContentService now depends only on IChatClient. How that interface is implemented is a concern of the composition root—not the service.
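Testability now falls out for free: a hand-rolled fake can stand in for any provider in unit tests. A sketch, assuming the interface shown earlier (FakeChatClient is hypothetical test code; exact constructor shapes vary slightly by package version):

```csharp
public sealed class FakeChatClient : IChatClient
{
    // Always returns the same canned assistant message
    public Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => Task.FromResult(new ChatCompletion(
            new ChatMessage(ChatRole.Assistant, "canned summary")));

    public IAsyncEnumerable<StreamingChatCompletionUpdate> CompleteStreamingAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotSupportedException();

    public ChatClientMetadata Metadata { get; } = new("fake");
    public TService? GetService<TService>(object? key = null) where TService : class => null;
    public void Dispose() { }
}

// No network, no API keys, no cost:
var service = new ContentService(new FakeChatClient());
var summary = await service.SummarizeAsync("some text"); // "canned summary"
```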

Provider Registration Patterns

The real power emerges in how you configure providers. Let's look at several registration patterns.

Azure OpenAI

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;

builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();

    var client = new AzureOpenAIClient(
        new Uri(config["AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential());

    return client.AsChatClient(config["AzureOpenAI:DeploymentName"]!);
});

OpenAI

using OpenAI;
using Microsoft.Extensions.AI;

builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();

    var client = new OpenAIClient(config["OpenAI:ApiKey"]!);
    return client.AsChatClient("gpt-4o");
});

Ollama (Local Development)

using Microsoft.Extensions.AI.Ollama;

builder.Services.AddChatClient(sp =>
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.2"));

GitHub Models (Great for Testing)

using System.ClientModel;
using OpenAI;
using Microsoft.Extensions.AI;

builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();

    // GitHub Models uses OpenAI-compatible API
    var client = new OpenAIClient(
        new ApiKeyCredential(config["GitHub:Token"]!),
        new OpenAIClientOptions 
        { 
            Endpoint = new Uri("https://models.inference.ai.azure.com") 
        });

    return client.AsChatClient("gpt-4o");
});

Configuration-Driven Provider Selection

For maximum flexibility, drive provider selection from configuration:

// appsettings.json
{
  "AI": {
    "Provider": "AzureOpenAI",
    "AzureOpenAI": {
      "Endpoint": "https://myresource.openai.azure.com",
      "DeploymentName": "gpt-4o"
    },
    "OpenAI": {
      "ApiKey": "sk-..." // real keys belong in user secrets or environment variables
    },
    "Ollama": {
      "Endpoint": "http://localhost:11434",
      "ModelId": "llama3.2"
    }
  }
}
builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    var provider = config["AI:Provider"];

    return provider switch
    {
        "AzureOpenAI" => CreateAzureOpenAIClient(config),
        "OpenAI" => CreateOpenAIClient(config),
        "Ollama" => CreateOllamaClient(config),
        "GitHubModels" => CreateGitHubModelsClient(config),
        _ => throw new InvalidOperationException($"Unknown AI provider: {provider}")
    };
});

IChatClient CreateAzureOpenAIClient(IConfiguration config)
{
    var section = config.GetSection("AI:AzureOpenAI");
    var client = new AzureOpenAIClient(
        new Uri(section["Endpoint"]!),
        new DefaultAzureCredential());
    return client.AsChatClient(section["DeploymentName"]!);
}

IChatClient CreateOpenAIClient(IConfiguration config)
{
    var section = config.GetSection("AI:OpenAI");
    var client = new OpenAIClient(section["ApiKey"]!);
    return client.AsChatClient(section["ModelId"] ?? "gpt-4o");
}

IChatClient CreateOllamaClient(IConfiguration config)
{
    var section = config.GetSection("AI:Ollama");
    return new OllamaChatClient(
        new Uri(section["Endpoint"] ?? "http://localhost:11434"),
        section["ModelId"] ?? "llama3.2");
}

IChatClient CreateGitHubModelsClient(IConfiguration config)
{
    // Same OpenAI-compatible endpoint as the GitHub Models example above
    var client = new OpenAIClient(
        new ApiKeyCredential(config["GitHub:Token"]!),
        new OpenAIClientOptions
        {
            Endpoint = new Uri("https://models.inference.ai.azure.com")
        });
    return client.AsChatClient("gpt-4o");
}

This pattern lets you:

  • Use Ollama locally (free, fast, private)
  • Use OpenAI in staging
  • Use Azure OpenAI in production
  • Switch with a single config change
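A concrete example of that last point: a checked-in appsettings.Development.json that overrides only the provider key keeps every developer on the free local path by default, while the rest of the AI section is inherited from appsettings.json (standard ASP.NET Core configuration layering):

```json
{
  "AI": {
    "Provider": "Ollama"
  }
}
```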

Middleware and Decorators

One of the most powerful features of Extensions.AI is the middleware pattern. Middleware wraps your chat client to add cross-cutting concerns without modifying your application code.

Using the Builder Pattern

builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();

    return new AzureOpenAIClient(
            new Uri(config["AzureOpenAI:Endpoint"]!),
            new DefaultAzureCredential())
        .AsChatClient(config["AzureOpenAI:DeploymentName"]!)
        .AsBuilder()
        .UseLogging(sp.GetRequiredService<ILoggerFactory>())
        .UseDistributedCache(sp.GetRequiredService<IDistributedCache>())
        .Build(sp);
});

Built-in Middleware

Extensions.AI includes several middleware components:

Logging Middleware logs all completions with structured data:

.UseLogging(loggerFactory)

Distributed Caching caches responses to reduce costs for repeated queries:

.UseDistributedCache(distributedCache, options =>
{
    options.CacheExpiration = TimeSpan.FromHours(1);
})

OpenTelemetry adds tracing spans for observability:

.UseOpenTelemetry(loggerFactory, sourceName: "AI.Chat")

Custom Middleware

You can create custom middleware for any cross-cutting concern:

using System.Threading.RateLimiting;

public class RateLimitingMiddleware : DelegatingChatClient
{
    private readonly RateLimiter _limiter;

    public RateLimitingMiddleware(IChatClient inner, RateLimiter limiter) 
        : base(inner)
    {
        _limiter = limiter;
    }

    public override async Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        using var lease = await _limiter.AcquireAsync(1, cancellationToken);

        if (!lease.IsAcquired)
            throw new RateLimitExceededException("Rate limit exceeded"); // app-defined exception

        return await base.CompleteAsync(chatMessages, options, cancellationToken);
    }
}

// Extension method for fluent registration
public static class RateLimitingExtensions
{
    public static ChatClientBuilder UseRateLimiting(
        this ChatClientBuilder builder, 
        RateLimiter limiter)
    {
        return builder.Use((inner, sp) => 
            new RateLimitingMiddleware(inner, limiter));
    }
}
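Wiring the custom middleware into a pipeline is then a single extra call in the builder chain. A sketch using a ConcurrencyLimiter from System.Threading.RateLimiting (any RateLimiter works; the Ollama endpoint and model are illustrative, as earlier):

```csharp
using System.Threading.RateLimiting;

// At most 5 concurrent in-flight completions; up to 100 callers queue behind them
var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 5,
    QueueLimit = 100,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

builder.Services.AddChatClient(sp =>
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.2")
        .AsBuilder()
        .UseRateLimiting(limiter) // the custom extension defined above
        .Build(sp));
```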

Embedding Generation

The same patterns apply to embeddings. Here's how to register an embedding generator:

builder.Services.AddEmbeddingGenerator<string, Embedding<float>>(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();

    var client = new AzureOpenAIClient(
        new Uri(config["AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential());

    return client.AsEmbeddingGenerator(config["AzureOpenAI:EmbeddingModel"]!);
});

Usage is straightforward:

public class SearchService
{
    private readonly IEmbeddingGenerator<string, Embedding<float>> _embedder;

    public SearchService(IEmbeddingGenerator<string, Embedding<float>> embedder)
    {
        _embedder = embedder;
    }

    public async Task<float[]> GetEmbeddingAsync(string text)
    {
        var embedding = await _embedder.GenerateAsync(text);
        return embedding.Vector.ToArray();
    }

    public async Task<IReadOnlyList<Embedding<float>>> GetEmbeddingsAsync(
        IEnumerable<string> texts)
    {
        return await _embedder.GenerateAsync(texts.ToList());
    }
}
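Once you have vectors, search is a matter of ranking by similarity. A minimal cosine-similarity helper in plain C# (VectorMath is a hypothetical utility, not part of the library) shows how two embeddings from GetEmbeddingAsync might be compared:

```csharp
public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1
    public static double CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}

// Usage: rank a document vector against a query vector
// var queryVec = await searchService.GetEmbeddingAsync("refund policy");
// var score = VectorMath.CosineSimilarity(queryVec, docVec);
```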

When to Use Extensions.AI vs. Raw SDKs

The abstraction isn't always the right choice. Here's a decision framework:

Use Microsoft.Extensions.AI when:

  • You want provider flexibility
  • You're building a library or reusable component
  • You want middleware for logging, caching, rate limiting
  • You prioritize testability
  • Your application may run in multiple environments

Consider raw SDKs when:

  • You need provider-specific features not in the abstraction
  • You're doing low-level optimizations (custom HTTP handlers, etc.)
  • You're prototyping and speed matters more than architecture
  • You're absolutely certain you'll never switch providers

Hybrid approach: You can always escape the abstraction using GetService<T>():

public async Task DoProviderSpecificThingAsync()
{
    // Get the underlying provider if needed
    var azureClient = _chatClient.GetService<AzureOpenAIClient>();

    if (azureClient != null)
    {
        // Do Azure-specific thing
    }
}

NuGet Packages

Here are the packages you'll need:

<!-- Core abstractions -->
<PackageReference Include="Microsoft.Extensions.AI.Abstractions" Version="9.0.0" />
<PackageReference Include="Microsoft.Extensions.AI" Version="9.0.0" />

<!-- Provider implementations -->
<PackageReference Include="Microsoft.Extensions.AI.OpenAI" Version="9.0.0" />
<PackageReference Include="Microsoft.Extensions.AI.Ollama" Version="9.0.0" />

<!-- Azure OpenAI (uses OpenAI implementation) -->
<PackageReference Include="Azure.AI.OpenAI" Version="2.1.0" />

Putting It All Together

Here's a complete Program.cs showing a production-ready setup:

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;
using OpenAI;

var builder = WebApplication.CreateBuilder(args);

// Register chat client with middleware stack
builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    var provider = config["AI:Provider"] ?? "Ollama";

    IChatClient baseClient = provider switch
    {
        "AzureOpenAI" => CreateAzureClient(config),
        "OpenAI" => CreateOpenAIClient(config),
        _ => new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.2")
    };

    return baseClient
        .AsBuilder()
        .UseLogging(sp.GetRequiredService<ILoggerFactory>())
        .UseDistributedCache(sp.GetRequiredService<IDistributedCache>())
        .UseOpenTelemetry(sourceName: "App.AI")
        .Build(sp);
});

// Register application services
builder.Services.AddScoped<ContentService>();
builder.Services.AddDistributedMemoryCache();

var app = builder.Build();

// Simple endpoint using the abstraction
app.MapPost("/summarize", async (string content, ContentService svc) =>
{
    var summary = await svc.SummarizeAsync(content);
    return Results.Ok(new { summary });
});

app.Run();

IChatClient CreateAzureClient(IConfiguration config)
{
    var client = new AzureOpenAIClient(
        new Uri(config["AI:AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential());
    return client.AsChatClient(config["AI:AzureOpenAI:DeploymentName"]!);
}

IChatClient CreateOpenAIClient(IConfiguration config)
{
    var client = new OpenAIClient(config["AI:OpenAI:ApiKey"]!);
    return client.AsChatClient("gpt-4o");
}

What's Next

In Part 2, we'll explore function calling and structured outputs—turning LLMs from text generators into reliable decision engines that return structured data you can actually use.

We'll cover:

  • Defining functions with attributes
  • The function-calling loop
  • JSON mode and schema generation
  • Validation and retry patterns
  • Token-efficient function definitions

This is Part 1 of the "Generative AI Patterns in C#" series. Subscribe to follow along as we build production-ready AI applications.
