DEV Community

alinabi19
Building AI-Powered APIs with ASP.NET Core and OpenAI (.NET 8 Guide)

AI features are slowly becoming part of normal backend work.

A few years ago, most APIs were simple CRUD endpoints. They fetched data, updated records, and returned JSON. Today it is common to see requests like:

  • “Can we add a chatbot endpoint?”
  • “Can the API summarize user feedback?”
  • “Can we classify support tickets automatically?”

At that point your backend suddenly needs to talk to an AI model.

If you're already comfortable building APIs with ASP.NET Core, the first instinct is usually simple: just call the OpenAI API from an endpoint and return the result.

That works for a quick prototype. But once the system starts growing, a few questions appear pretty quickly:

  • Where should the AI logic live?
  • Should controllers call OpenAI directly?
  • How do we protect API keys?
  • What happens if the AI call takes several seconds?
  • How do we deal with rate limits or retries?

AI integrations behave like any other external service dependency in your backend architecture. Treat them that way and the system stays clean and maintainable.

In this article we’ll build a simple AI-powered API using ASP.NET Core and OpenAI, while following patterns that hold up well in real applications.

Why Expose AI Through an API

Most production systems expose AI features through backend APIs rather than directly from the frontend.

A typical setup might look like this:

  • A web app calls an endpoint to generate content
  • A mobile app sends text to be summarized
  • An internal tool calls an API for classification

Centralizing AI logic inside your API gives you a few advantages.

Security
Your OpenAI key stays in the backend. The client never sees it.

Reuse
Multiple clients (web, mobile, internal tools) can use the same AI capability.

Cost control
Since AI calls cost money, the API layer can enforce limits and validation.

Consistency
Prompts, models, and safety rules live in one place.

Think of the API as the gateway between your application and AI services.

A Simple Architecture That Works Well

When adding AI to a backend, separating responsibilities helps a lot.

A typical flow looks like this:

Client
   ↓
API Endpoint
   ↓
AI Service
   ↓
OpenAI API

Each layer does a specific job.

  • API: handles HTTP requests
  • Service layer: contains the AI logic
  • HTTP client: calls OpenAI
  • Configuration: stores API keys

A mistake I’ve seen more than once is calling OpenAI directly inside controllers.

That approach usually leads to:

  • duplicated logic
  • hard-to-test endpoints
  • messy controllers

Moving AI logic into a service keeps things cleaner and easier to maintain.

Step 1: Create the ASP.NET Core API

Start by creating a standard .NET 8 Web API project.

dotnet new webapi -n AiApiDemo

ASP.NET Core supports both controllers and Minimal APIs.

For simple AI endpoints, Minimal APIs work nicely because the code stays compact.

Example setup:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

var app = builder.Build();

app.UseSwagger();
app.UseSwaggerUI();

app.Run();

Now we have a basic API ready to host endpoints.

Step 2: Store the OpenAI API Key

Never hardcode API keys directly in code.

A simple approach is to store it in configuration.

appsettings.json

{
  "OpenAI": {
    "ApiKey": "YOUR_API_KEY"
  }
}

In production environments you would usually use something like:

  • environment variables
  • Azure Key Vault
  • AWS Secrets Manager

The goal is simple: keep secrets outside source control.
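For local development, an environment variable is often the easiest option. .NET's configuration system maps a double underscore in a variable name to the ":" separator, so a variable named OpenAI__ApiKey overrides the OpenAI:ApiKey setting from appsettings.json (the key value below is a placeholder):

```shell
# Double underscores map to ":" in .NET configuration keys,
# so this overrides "OpenAI:ApiKey" from appsettings.json.
export OpenAI__ApiKey="sk-your-key-here"

# Print only the prefix to confirm it is set; never log the full secret.
echo "${OpenAI__ApiKey:0:3}"
```

The same variable name works in Docker, CI pipelines, and hosting platforms, which is why this mapping is worth remembering.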

Step 3: Register an HTTP Client

ASP.NET Core includes IHttpClientFactory, which is the recommended way to make HTTP calls.

It helps avoid common issues like socket exhaustion and centralizes configuration.

Register a typed client in Program.cs.

builder.Services.AddHttpClient<IAiService, OpenAiService>(client =>
{
    client.BaseAddress = new Uri("https://api.openai.com/");
    client.Timeout = TimeSpan.FromSeconds(30);
});

Typed clients also work well with dependency injection.

Step 4: Create the AI Service

Instead of calling OpenAI from endpoints, create a dedicated service.

This keeps the API layer thin and makes the integration easier to test.

Service Interface

public interface IAiService
{
    Task<string> GenerateResponseAsync(
        string prompt,
        CancellationToken cancellationToken);
}

Service Implementation

public class OpenAiService : IAiService
{
    private readonly HttpClient _httpClient;
    private readonly IConfiguration _config;
    private readonly ILogger<OpenAiService> _logger;

    public OpenAiService(
        HttpClient httpClient,
        IConfiguration config,
        ILogger<OpenAiService> logger)
    {
        _httpClient = httpClient;
        _config = config;
        _logger = logger;
    }

    public async Task<string> GenerateResponseAsync(
        string prompt,
        CancellationToken cancellationToken)
    {
        var apiKey = _config["OpenAI:ApiKey"];

        _httpClient.DefaultRequestHeaders.Authorization =
            new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", apiKey);

        var request = new
        {
            model = "gpt-4o-mini",
            input = prompt
        };

        var response = await _httpClient.PostAsJsonAsync(
            "v1/responses",
            request,
            cancellationToken);

        if (!response.IsSuccessStatusCode)
        {
            var error = await response.Content.ReadAsStringAsync(cancellationToken);

            _logger.LogError("OpenAI request failed: {Error}", error);

            throw new ApplicationException($"OpenAI request failed: {error}");
        }

        var result = await response.Content.ReadFromJsonAsync<OpenAiResponse>(cancellationToken);

        // Guard against empty output arrays instead of indexing directly.
        return result?.Output?.FirstOrDefault()?.Content?.FirstOrDefault()?.Text
            ?? string.Empty;
    }
}

Response Model

Using strongly typed models is safer than parsing dynamic JSON.

public class OpenAiResponse
{
    public List<Output> Output { get; set; } = new();
}

public class Output
{
    public List<Content> Content { get; set; } = new();
}

public class Content
{
    public string Text { get; set; } = string.Empty;
}

Step 5: Create an AI Endpoint

Now we expose an endpoint that clients can call.

Request model:

public class ChatRequest
{
    public string Prompt { get; set; } = string.Empty;
}

Minimal API endpoint:

app.MapPost("/api/ai/chat", async (
    ChatRequest request,
    IAiService aiService,
    CancellationToken cancellationToken) =>
{
    if (string.IsNullOrWhiteSpace(request.Prompt))
        return Results.BadRequest("Prompt cannot be empty.");

    if (request.Prompt.Length > 2000)
        return Results.BadRequest("Prompt too large.");

    var response = await aiService.GenerateResponseAsync(
        request.Prompt,
        cancellationToken);

    return Results.Ok(new { result = response });
});

Clients can now send prompts and receive AI-generated responses.

Handling Failures and Rate Limits

AI services are external dependencies, so failures are normal.

Some common issues include:

  • network timeouts
  • rate limits (429 responses)
  • temporary service errors

In production systems it’s usually worth adding retry policies.

Libraries like Polly integrate well with IHttpClientFactory (via the Microsoft.Extensions.Http.Polly package).

Example retry setup, chained onto the existing typed client registration (registering the client a second time would override the base address configured earlier):

builder.Services.AddHttpClient<IAiService, OpenAiService>(client =>
    {
        client.BaseAddress = new Uri("https://api.openai.com/");
        client.Timeout = TimeSpan.FromSeconds(30);
    })
    .AddTransientHttpErrorPolicy(policy =>
        policy.WaitAndRetryAsync(3, retry =>
            TimeSpan.FromSeconds(Math.Pow(2, retry))));

This helps smooth over temporary failures.

A Few Performance Considerations

AI requests are typically slower than database queries.

A few small changes can improve responsiveness.

Use async calls
Blocking threads during AI calls will hurt scalability.

Validate prompt size
Large prompts increase both latency and cost.

Cache repeated responses
If users frequently ask the same question, caching results can reduce API calls.

Consider streaming responses
Streaming works well for chat-style applications where users expect gradual output.
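The caching idea can be sketched as a decorator around the service from Step 4. CachedAiService is a hypothetical name, and the sketch assumes builder.Services.AddMemoryCache() has been registered:

using Microsoft.Extensions.Caching.Memory;

// Sketch only: wraps any IAiService and caches responses by prompt.
public class CachedAiService : IAiService
{
    private readonly IAiService _inner;
    private readonly IMemoryCache _cache;

    public CachedAiService(IAiService inner, IMemoryCache cache)
    {
        _inner = inner;
        _cache = cache;
    }

    public async Task<string> GenerateResponseAsync(
        string prompt, CancellationToken cancellationToken)
    {
        // Identical prompts return the cached answer instead of a new API call.
        if (_cache.TryGetValue(prompt, out string? cached) && cached is not null)
            return cached;

        var response = await _inner.GenerateResponseAsync(prompt, cancellationToken);

        // Expire after 10 minutes so answers don't go stale (tune to your use case).
        _cache.Set(prompt, response, TimeSpan.FromMinutes(10));
        return response;
    }
}

Whether caching is safe depends on the feature: it works well for deterministic tasks like classification, less so for chat where users expect fresh responses.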

Securing AI Endpoints

AI endpoints can easily become expensive if left unprotected.

A few safeguards help prevent abuse.

Authentication
Use JWT or API keys to restrict access.

Rate limiting
Limit how frequently a client can call the AI endpoint.

Prompt validation
Always validate user input before sending it to a model.
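For rate limiting, .NET 8 ships middleware out of the box. As a sketch, the "ai" policy name and the limits below are illustrative choices, not fixed values:

using System.Threading.RateLimiting;

// Allow at most 10 AI requests per client IP per minute.
builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.AddPolicy("ai", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            httpContext.Connection.RemoteIpAddress?.ToString() ?? "anonymous",
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10,
                Window = TimeSpan.FromMinutes(1)
            }));
});

// After building the app:
app.UseRateLimiter();

// Apply the policy only to the AI endpoint.
app.MapPost("/api/ai/chat", () => Results.Ok())
   .RequireRateLimiting("ai");

Partitioning by IP is the simplest option; if your endpoints require authentication, partitioning by user ID is usually more accurate.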

One Small Tip That Reduces AI Costs

AI pricing usually depends on the number of tokens processed.

Better prompts often produce better responses with fewer tokens.

Instead of sending large context blocks, try using structured prompts with clear instructions.

It improves both response quality and cost efficiency.
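As a hypothetical example, a compact instruction-style prompt tends to beat pasting a large raw block (feedback here stands in for the user-provided text):

// A structured prompt: explicit task, constraints, and delimited input.
var prompt = $"""
    Task: Summarize the customer feedback below in 3 bullet points.
    Tone: neutral. Keep it under 80 words.

    Feedback:
    {feedback}
    """;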

Lessons from Building AI APIs

If you're planning to add AI features to your backend, a few patterns make life easier:

  • Keep AI logic inside a dedicated service layer
  • Treat AI like any other external dependency
  • Add retries, validation, and logging early
  • Protect API keys and enforce usage limits
  • Monitor token usage to avoid unexpected costs

Once the architecture is set up properly, adding new AI capabilities becomes much simpler.

Chatbots, summarization, and classification all become just another API endpoint.
