AI features are slowly becoming part of normal backend work.
A few years ago, most APIs were simple CRUD endpoints. They fetched data, updated records, and returned JSON. Today it is common to see requests like:
- “Can we add a chatbot endpoint?”
- “Can the API summarize user feedback?”
- “Can we classify support tickets automatically?”
At that point your backend suddenly needs to talk to an AI model.
If you're already comfortable building APIs with ASP.NET Core, the first instinct is usually simple: just call the OpenAI API from an endpoint and return the result.
That works for a quick prototype. But once the system starts growing, a few questions appear pretty quickly:
- Where should the AI logic live?
- Should controllers call OpenAI directly?
- How do we protect API keys?
- What happens if the AI call takes several seconds?
- How do we deal with rate limits or retries?
AI integrations behave like any other external service dependency in your backend architecture. Treat them that way and the system stays clean and maintainable.
In this article we’ll build a simple AI-powered API using ASP.NET Core and OpenAI, while following patterns that hold up well in real applications.
## Why Expose AI Through an API
Most production systems expose AI features through backend APIs rather than directly from the frontend.
A typical setup might look like this:
- A web app calls an endpoint to generate content
- A mobile app sends text to be summarized
- An internal tool calls an API for classification
Centralizing AI logic inside your API gives you a few advantages.
**Security**
Your OpenAI key stays in the backend. The client never sees it.

**Reuse**
Multiple clients (web, mobile, internal tools) can use the same AI capability.

**Cost control**
Since AI calls cost money, the API layer can enforce limits and validation.

**Consistency**
Prompts, models, and safety rules live in one place.
Think of the API as the gateway between your application and AI services.
## A Simple Architecture That Works Well
When adding AI to a backend, separating responsibilities helps a lot.
A typical flow looks like this:
```text
Client
  ↓
API Endpoint
  ↓
AI Service
  ↓
OpenAI API
```
Each layer does a specific job.
| Layer | Responsibility |
|---|---|
| API | Handles HTTP requests |
| Service Layer | Contains AI logic |
| HTTP Client | Calls OpenAI |
| Configuration | Stores API keys |
A mistake I’ve seen more than once is calling OpenAI directly inside controllers.
That approach usually leads to:
- duplicated logic
- hard-to-test endpoints
- messy controllers
Moving AI logic into a service keeps things cleaner and easier to maintain.
## Step 1: Create the ASP.NET Core API
Start by creating a standard .NET 8 Web API project.
```shell
dotnet new webapi -n AiApiDemo
```
ASP.NET Core supports both controllers and Minimal APIs.
For simple AI endpoints, Minimal APIs work nicely because the code stays compact.
Example setup:
```csharp
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

var app = builder.Build();

app.UseSwagger();
app.UseSwaggerUI();

app.Run();
```
Now we have a basic API ready to host endpoints.
## Step 2: Store the OpenAI API Key
Never hardcode API keys directly in code.
A simple approach is to store it in configuration.
`appsettings.json`:

```json
{
  "OpenAI": {
    "ApiKey": "YOUR_API_KEY"
  }
}
```
In production environments you would usually use something like:
- environment variables
- Azure Key Vault
- AWS Secrets Manager
The goal is simple: keep secrets outside source control.
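For local development, the .NET user-secrets tool keeps the key out of the repository, and in hosted environments an environment variable works the same way. The commands below assume the `AiApiDemo` project from Step 1:

```shell
# Store the key with user-secrets during local development
cd AiApiDemo
dotnet user-secrets init
dotnet user-secrets set "OpenAI:ApiKey" "YOUR_API_KEY"

# Or supply it through an environment variable; the double underscore
# maps to the "OpenAI:ApiKey" configuration path
export OpenAI__ApiKey="YOUR_API_KEY"
```

Both sources flow through `IConfiguration` transparently, so the service code never has to know where the key came from.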
## Step 3: Register an HTTP Client
ASP.NET Core includes `IHttpClientFactory`, which is the recommended way to make outbound HTTP calls.
It helps avoid common issues like socket exhaustion and centralizes configuration.
Register a typed client in Program.cs.
```csharp
builder.Services.AddHttpClient<IAiService, OpenAiService>(client =>
{
    client.BaseAddress = new Uri("https://api.openai.com/");
    client.Timeout = TimeSpan.FromSeconds(30);
});
```
Typed clients also work well with dependency injection.
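A variation worth knowing: since the typed client is configured once at registration, the Authorization header can be attached there instead of inside the service. This is a sketch, assuming the key is stored under `OpenAI:ApiKey` as in Step 2:

```csharp
builder.Services.AddHttpClient<IAiService, OpenAiService>(client =>
{
    client.BaseAddress = new Uri("https://api.openai.com/");
    client.Timeout = TimeSpan.FromSeconds(30);

    // Attach the key once here rather than on every request
    var apiKey = builder.Configuration["OpenAI:ApiKey"];
    client.DefaultRequestHeaders.Authorization =
        new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", apiKey);
});
```

This keeps the service itself free of credential handling, at the cost of tying the registration to configuration.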
## Step 4: Create the AI Service
Instead of calling OpenAI from endpoints, create a dedicated service.
This keeps the API layer thin and makes the integration easier to test.
### Service Interface
```csharp
public interface IAiService
{
    Task<string> GenerateResponseAsync(
        string prompt,
        CancellationToken cancellationToken);
}
```
### Service Implementation
```csharp
public class OpenAiService : IAiService
{
    private readonly HttpClient _httpClient;
    private readonly IConfiguration _config;
    private readonly ILogger<OpenAiService> _logger;

    public OpenAiService(
        HttpClient httpClient,
        IConfiguration config,
        ILogger<OpenAiService> logger)
    {
        _httpClient = httpClient;
        _config = config;
        _logger = logger;
    }

    public async Task<string> GenerateResponseAsync(
        string prompt,
        CancellationToken cancellationToken)
    {
        var apiKey = _config["OpenAI:ApiKey"];
        _httpClient.DefaultRequestHeaders.Authorization =
            new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", apiKey);

        var request = new
        {
            model = "gpt-4o-mini",
            input = prompt
        };

        var response = await _httpClient.PostAsJsonAsync(
            "v1/responses",
            request,
            cancellationToken);

        if (!response.IsSuccessStatusCode)
        {
            var error = await response.Content.ReadAsStringAsync(cancellationToken);
            _logger.LogError("OpenAI request failed: {Error}", error);
            throw new HttpRequestException($"OpenAI request failed: {error}");
        }

        var result = await response.Content.ReadFromJsonAsync<OpenAiResponse>(cancellationToken);

        // Guard against an empty output array instead of indexing blindly
        return result?.Output.FirstOrDefault()?.Content.FirstOrDefault()?.Text ?? "";
    }
}
```
### Response Model
Using strongly typed models is safer than parsing dynamic JSON.
```csharp
public class OpenAiResponse
{
    public List<Output> Output { get; set; } = new();
}

public class Output
{
    public List<Content> Content { get; set; } = new();
}

public class Content
{
    public string Text { get; set; } = "";
}
```
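These models map onto the relevant slice of the Responses API payload, which looks roughly like this (trimmed to the fields used here, not the full response):

```json
{
  "output": [
    {
      "type": "message",
      "content": [
        { "type": "output_text", "text": "..." }
      ]
    }
  ]
}
```

`ReadFromJsonAsync` uses case-insensitive web defaults, so the lowercase JSON names bind to the Pascal-cased C# properties without extra attributes.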
## Step 5: Create an AI Endpoint
Now we expose an endpoint that clients can call.
Request model:
```csharp
public class ChatRequest
{
    public string Prompt { get; set; } = "";
}
```
Minimal API endpoint:
```csharp
app.MapPost("/api/ai/chat", async (
    ChatRequest request,
    IAiService aiService,
    CancellationToken cancellationToken) =>
{
    if (string.IsNullOrWhiteSpace(request.Prompt))
        return Results.BadRequest("Prompt cannot be empty.");

    if (request.Prompt.Length > 2000)
        return Results.BadRequest("Prompt too large.");

    var response = await aiService.GenerateResponseAsync(
        request.Prompt,
        cancellationToken);

    return Results.Ok(new { result = response });
});
```
Clients can now send prompts and receive AI-generated responses.
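A quick way to try the endpoint from the command line (the port depends on your launch profile; request properties bind case-insensitively):

```shell
curl -s -X POST http://localhost:5000/api/ai/chat \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Summarize: great app, but the dashboard is slow"}'
```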
## Handling Failures and Rate Limits
AI services are external dependencies, so failures are normal.
Some common issues include:
- network timeouts
- rate limits (429 responses)
- temporary service errors
In production systems it’s usually worth adding retry policies.
Libraries like Polly integrate well with HttpClientFactory.
Example retry setup (`AddTransientHttpErrorPolicy` comes from the Microsoft.Extensions.Http.Polly package):

```csharp
builder.Services.AddHttpClient<IAiService, OpenAiService>()
    .AddTransientHttpErrorPolicy(policy =>
        policy.OrResult(r => (int)r.StatusCode == 429) // retry on rate limits too
              .WaitAndRetryAsync(3, retry =>
                  TimeSpan.FromSeconds(Math.Pow(2, retry))));
```

Note that the transient-error policy only covers 5xx and 408 responses by default, which is why the 429 case is added explicitly.
This helps smooth over temporary failures.
## A Few Performance Considerations
AI requests are typically slower than database queries.
A few small changes can improve responsiveness.
**Use async calls**
Blocking threads during AI calls will hurt scalability.

**Validate prompt size**
Large prompts increase both latency and cost.

**Cache repeated responses**
If users frequently ask the same question, caching results can reduce API calls.

**Consider streaming responses**
Streaming works well for chat-style applications where users expect gradual output.
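The caching idea can be sketched as a decorator over the service interface using `IMemoryCache` (`CachedAiService` is an illustrative name, not part of the code so far; it assumes `AddMemoryCache()` has been registered):

```csharp
public class CachedAiService : IAiService
{
    private readonly IAiService _inner;
    private readonly IMemoryCache _cache;

    public CachedAiService(IAiService inner, IMemoryCache cache)
    {
        _inner = inner;
        _cache = cache;
    }

    public async Task<string> GenerateResponseAsync(
        string prompt,
        CancellationToken cancellationToken)
    {
        // Identical prompts reuse the cached response for a short window
        return await _cache.GetOrCreateAsync(prompt, async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            return await _inner.GenerateResponseAsync(prompt, cancellationToken);
        }) ?? "";
    }
}
```

Caching keyed on the raw prompt only helps when users send literally identical text, so it suits FAQ-style features better than free-form chat.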
## Securing AI Endpoints
AI endpoints can easily become expensive if left unprotected.
A few safeguards help prevent abuse.
**Authentication**
Use JWT or API keys to restrict access.

**Rate limiting**
Limit how frequently a client can call the AI endpoint.

**Prompt validation**
Always validate user input before sending it to a model.
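The rate-limiting piece can be handled by ASP.NET Core's built-in middleware (.NET 7+, via `Microsoft.AspNetCore.RateLimiting`); the policy name "ai" and the limits below are illustrative values, not recommendations:

```csharp
builder.Services.AddRateLimiter(options =>
{
    // Allow 10 requests per one-minute window
    options.AddFixedWindowLimiter("ai", limiter =>
    {
        limiter.PermitLimit = 10;
        limiter.Window = TimeSpan.FromMinutes(1);
    });
});

// After building the app:
app.UseRateLimiter();

// And attach the policy to the AI endpoint:
// app.MapPost("/api/ai/chat", ...).RequireRateLimiting("ai");
```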
## One Small Tip That Reduces AI Costs
AI pricing usually depends on the number of tokens processed.
Better prompts often produce better responses with fewer tokens.
Instead of sending large context blocks, try using structured prompts with clear instructions.
It improves both response quality and cost efficiency.
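As a small illustration (the `feedback` variable is a placeholder), a compact, structured prompt built with a C# raw string literal usually beats a long free-form one:

```csharp
var feedback = "Great app overall, but the dashboard takes ages to load.";

// Clear instructions up front, minimal surrounding context
var prompt = $"""
    Task: Summarize the customer feedback below in two sentences.
    Tone: neutral, factual.
    Feedback: {feedback}
    """;
```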
## Lessons from Building AI APIs
If you're planning to add AI features to your backend, a few patterns make life easier:
- Keep AI logic inside a dedicated service layer
- Treat AI like any other external dependency
- Add retries, validation, and logging early
- Protect API keys and enforce usage limits
- Monitor token usage to avoid unexpected costs
Once the architecture is set up properly, adding new AI capabilities becomes much simpler.
Chatbots, summarization, and classification all become just another API endpoint.