Build AI-powered ASP.NET Core applications without relying entirely on cloud providers.
Introduction
If you've been experimenting with AI recently, your journey probably looked something like this:
- Sign up for an AI provider
- Generate an API key
- Send a prompt
- Get a response
It's simple, fast, and honestly, it's amazing how quickly you can build something useful.
But once AI moves beyond a proof of concept, a different set of questions starts to appear.
- How much will this cost at scale?
- Do we really want sensitive data leaving our infrastructure?
- What happens if we hit API limits?
- Do we want our internal tools to depend entirely on an external service?
This is exactly where local AI becomes interesting.
Recently, I started experimenting with Ollama and was surprised by how easy it has become to run AI models locally and integrate them into a regular ASP.NET Core application.
The best part?
From a .NET developer's perspective, it mostly feels like integrating another HTTP service.
In this guide, we'll:
- Install Ollama
- Download and run a local model
- Integrate it into ASP.NET Core
- Build a simple AI-powered endpoint
Why Run AI Models Locally?
Cloud AI is fantastic.
I'm not here to replace it.
But local AI solves a different set of problems.
Better Privacy
Many business applications work with sensitive information.
Things like:
- Internal documentation
- Support tickets
- Customer records
- Application logs
Sending that information to an external AI provider may introduce security, compliance, or governance concerns.
Running models locally keeps everything inside your own infrastructure.
Predictable Costs
Cloud providers charge based on usage.
That works well initially, but costs can grow quickly as applications scale.
Local AI replaces per-request pricing with a fixed cost: the hardware you run the model on and the power it draws.
Fewer Dependencies
Your application no longer depends on:
- Internet connectivity
- External outages
- API rate limits
Once a model has been downloaded, everything runs on your own machine or server.
What Is Ollama?
Ollama is a tool that allows developers to run Large Language Models (LLMs) locally.
Instead of manually configuring machine learning environments, Ollama handles:
- Model downloads
- Runtime management
- Memory handling
- HTTP APIs
Once installed, interacting with AI becomes as simple as making an HTTP request.
Popular models include:
| Model | Best Use Case |
|---|---|
| llama3 | General AI tasks |
| mistral | Fast responses |
| codellama | Code generation |
| gemma | Lightweight applications |
Step 1: Install Ollama
Download Ollama from:
https://ollama.com
Verify the installation.
```bash
ollama --version
```
Download a model.
```bash
ollama pull llama3
```
Run it.
```bash
ollama run llama3
```
Ask it something simple.
```text
Explain dependency injection in ASP.NET Core
```
If you get a response, you're ready to go.
By default, Ollama exposes a local API at:
http://localhost:11434
This is what our ASP.NET Core application will communicate with.
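Before wiring anything into ASP.NET Core, it can help to confirm that the API is reachable from .NET. The sketch below calls Ollama's `GET /api/tags` endpoint, which lists the models you have pulled. `OllamaProbe`, `ParseModelNames`, and `ListModelsAsync` are names I made up for illustration, and the default URL assumes Ollama is running on its default port.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, not part of Ollama or ASP.NET Core.
public static class OllamaProbe
{
    // Pulls the "name" of every entry in the "models" array
    // returned by GET /api/tags. Returns an empty array if the
    // "models" property is missing or is not an array.
    public static string[] ParseModelNames(string json)
    {
        using var doc = JsonDocument.Parse(json);

        if (!doc.RootElement.TryGetProperty("models", out var models) ||
            models.ValueKind != JsonValueKind.Array)
        {
            return Array.Empty<string>();
        }

        return models.EnumerateArray()
            .Select(m => m.TryGetProperty("name", out var n) ? n.GetString() : null)
            .Where(n => !string.IsNullOrEmpty(n))
            .Select(n => n!)
            .ToArray();
    }

    // Assumes Ollama is listening on its default local address.
    public static async Task<string[]> ListModelsAsync(
        string baseUrl = "http://localhost:11434")
    {
        using var http = new HttpClient { BaseAddress = new Uri(baseUrl) };
        var json = await http.GetStringAsync("api/tags");
        return ParseModelNames(json);
    }
}
```

If `ListModelsAsync` throws an `HttpRequestException`, Ollama is most likely not running. If it returns an empty array, no model has been pulled yet.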
Step 2: Create an ASP.NET Core API
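Create a new project.
```bash
dotnet new webapi -n LocalAIApi
```
On .NET 8 or later, append `--use-controllers` to the command. The webapi template defaults to minimal APIs there, and the controller in Step 5 needs the `AddControllers()` and `MapControllers()` calls that the controller template adds to Program.cs.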
Recommended structure:
```text
Controllers
Services
Models
Program.cs
```
Keep your AI logic separate from controllers.
It'll make your application easier to maintain.
Step 3: Register HttpClient
Inside Program.cs:
```csharp
builder.Services.AddHttpClient<IAiService, OllamaService>(client =>
{
    client.BaseAddress = new Uri("http://localhost:11434");
    client.Timeout = TimeSpan.FromMinutes(2);
});
```
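Hard-coding the address is fine on a laptop, but you'll probably want it configurable once Ollama runs on another machine. Here is a rough sketch of Program.cs that reads the values from configuration. The keys `Ollama:BaseUrl` and `Ollama:TimeoutSeconds` are names I picked for this example, not something Ollama or ASP.NET Core defines, and the fallbacks mirror the values used above.

```csharp
// Program.cs (sketch). Relies on the IAiService and OllamaService
// types from this guide and on the web SDK's implicit usings.
var builder = WebApplication.CreateBuilder(args);

// Hypothetical configuration keys; fall back to the local defaults.
var baseUrl = builder.Configuration["Ollama:BaseUrl"]
              ?? "http://localhost:11434";
var timeoutSeconds = builder.Configuration.GetValue("Ollama:TimeoutSeconds", 120);

builder.Services.AddControllers();

builder.Services.AddHttpClient<IAiService, OllamaService>(client =>
{
    client.BaseAddress = new Uri(baseUrl);
    // Local models can be slow on the first request while they load.
    client.Timeout = TimeSpan.FromSeconds(timeoutSeconds);
});

var app = builder.Build();
app.MapControllers();
app.Run();
```

With this in place, the same build can point at a different Ollama host by setting those keys in appsettings.json or in environment variables.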
Step 4: Create an AI Service
Interface:
```csharp
public interface IAiService
{
    Task<string> GenerateAsync(
        string prompt,
        CancellationToken cancellationToken);
}
```
Implementation:
```csharp
public class OllamaService : IAiService
{
    private readonly HttpClient _httpClient;

    public OllamaService(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<string> GenerateAsync(
        string prompt,
        CancellationToken cancellationToken)
    {
        // stream = false asks Ollama for one JSON object
        // instead of a sequence of partial chunks.
        var request = new
        {
            model = "llama3",
            prompt,
            stream = false
        };

        // Relative to the BaseAddress registered in Program.cs.
        var response = await _httpClient.PostAsJsonAsync(
            "api/generate",
            request,
            cancellationToken);

        response.EnsureSuccessStatusCode();

        var result = await response.Content
            .ReadFromJsonAsync<OllamaResponse>(
                cancellationToken: cancellationToken);

        return result?.Response ?? string.Empty;
    }
}
```
Response model:
```csharp
public class OllamaResponse
{
    public string Response { get; set; } = string.Empty;
}
```
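The `Response` property binds because `ReadFromJsonAsync` matches property names case-insensitively by default. Ollama's final JSON object also carries some timing metadata in snake_case fields, and those need explicit `[JsonPropertyName]` attributes. The sketch below is one possible richer model. `OllamaGenerateResponse` and `TokensPerSecond` are names I invented, and I only map a handful of the fields Ollama returns.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical extended model; the article's OllamaResponse is enough
// if you only need the generated text.
public sealed class OllamaGenerateResponse
{
    [JsonPropertyName("model")]
    public string Model { get; set; } = string.Empty;

    [JsonPropertyName("response")]
    public string Response { get; set; } = string.Empty;

    [JsonPropertyName("done")]
    public bool Done { get; set; }

    // Number of tokens generated for the response.
    [JsonPropertyName("eval_count")]
    public int? EvalCount { get; set; }

    // Time spent generating, reported by Ollama in nanoseconds.
    [JsonPropertyName("eval_duration")]
    public long? EvalDuration { get; set; }

    // Derived value: rough generation speed, or null if the
    // metadata is missing.
    [JsonIgnore]
    public double? TokensPerSecond =>
        EvalCount is > 0 && EvalDuration is > 0
            ? EvalCount.Value / (EvalDuration.Value / 1e9)
            : (double?)null;
}
```

Logging `TokensPerSecond` per request is a cheap way to see whether a model is fast enough on your hardware before you commit to it.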
At this point, AI integration starts looking very similar to integrating any third-party API.
That's probably the biggest surprise when working with Ollama for the first time.
Most of the complexity is already handled for you.
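One thing the service above skips is streaming. With `stream = true`, Ollama sends newline-delimited JSON, one object per line, each carrying a fragment of text in `response` and a `done` flag on the last one. Here is a minimal sketch of reading that, assuming an `HttpClient` whose `BaseAddress` already points at Ollama. `OllamaStreaming`, `ParseChunk`, and `StreamAsync` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Net.Http.Json;
using System.Runtime.CompilerServices;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class OllamaStreaming
{
    // Parses one line of the streamed body into its text fragment
    // and its "done" flag.
    public static (string Text, bool Done) ParseChunk(string line)
    {
        using var doc = JsonDocument.Parse(line);
        var root = doc.RootElement;

        var text = root.TryGetProperty("response", out var r)
            ? r.GetString() ?? string.Empty
            : string.Empty;
        var done = root.TryGetProperty("done", out var d)
                   && d.ValueKind == JsonValueKind.True;

        return (text, done);
    }

    // Yields text fragments as Ollama produces them.
    public static async IAsyncEnumerable<string> StreamAsync(
        HttpClient http,
        string prompt,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        using var request = new HttpRequestMessage(HttpMethod.Post, "api/generate")
        {
            Content = JsonContent.Create(new { model = "llama3", prompt, stream = true })
        };

        // ResponseHeadersRead lets us start reading before the body completes.
        using var response = await http.SendAsync(
            request, HttpCompletionOption.ResponseHeadersRead, ct);
        response.EnsureSuccessStatusCode();

        using var stream = await response.Content.ReadAsStreamAsync(ct);
        using var reader = new StreamReader(stream);

        while (await reader.ReadLineAsync() is { } line)
        {
            ct.ThrowIfCancellationRequested();
            if (line.Length == 0) continue;

            var (text, done) = ParseChunk(line);
            if (text.Length > 0) yield return text;
            if (done) yield break;
        }
    }
}
```

You would consume it with `await foreach (var piece in OllamaStreaming.StreamAsync(http, prompt, ct))`, which is handy when you want to show text to the user as it arrives instead of waiting for the whole answer.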
Step 5: Expose an AI Endpoint
Request model:
```csharp
public class AiRequest
{
    public string Prompt { get; set; } = string.Empty;
}
```
Controller:
```csharp
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/ai")]
public class AiController : ControllerBase
{
    private readonly IAiService _aiService;

    public AiController(IAiService aiService)
    {
        _aiService = aiService;
    }

    // POST api/ai/generate
    [HttpPost("generate")]
    public async Task<IActionResult> Generate(
        AiRequest request,
        CancellationToken cancellationToken)
    {
        if (string.IsNullOrWhiteSpace(request.Prompt))
        {
            return BadRequest("Prompt is required.");
        }

        var result = await _aiService.GenerateAsync(
            request.Prompt,
            cancellationToken);

        return Ok(new
        {
            response = result
        });
    }
}
```
Your API can now expose AI functionality without relying on a cloud provider.
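One gap in the controller is what happens when Ollama is not running or the model takes longer than the configured timeout. As written, both surface as a 500. A small sketch of one way to map those failures to clearer status codes follows. `AiFailureMapper` is a hypothetical helper, and the choice of 503 and 504 is my own convention, not anything ASP.NET Core prescribes.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class AiFailureMapper
{
    // Returns the HTTP status code to send for a failed AI call,
    // or null when the exception should simply propagate.
    public static int? ToStatusCode(Exception ex, bool callerCancelled) => ex switch
    {
        // Connection refused, DNS failure, or a non-success status
        // thrown by EnsureSuccessStatusCode.
        HttpRequestException => (int?)503,

        // HttpClient signals its own timeout with TaskCanceledException.
        // If the caller cancelled, it is not a timeout, so let it bubble.
        TaskCanceledException when !callerCancelled => 504,

        _ => null
    };
}

// Possible use inside the Generate action:
//
// try
// {
//     var result = await _aiService.GenerateAsync(request.Prompt, cancellationToken);
//     return Ok(new { response = result });
// }
// catch (Exception ex) when (
//     AiFailureMapper.ToStatusCode(ex, cancellationToken.IsCancellationRequested) is int code)
// {
//     return StatusCode(code, "AI backend unavailable.");
// }
```

Keeping the mapping in a plain static method means it can be unit tested without spinning up the web host.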
Architecture Overview
```text
Client
↓
ASP.NET Core API
↓
AI Service Layer
↓
Ollama HTTP API
↓
Local LLM
```
Keeping AI behind a dedicated service layer makes it easier to swap providers later if needed.
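To make that concrete, here is a rough sketch of what the abstraction buys you: a wrapper that tries one `IAiService` and falls back to another if the first is unreachable. `FallbackAiService` and `DelegateAiService` are hypothetical names, and the interface is repeated only so the snippet compiles on its own. In the real project you would reuse the one from Step 4.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Same shape as the IAiService from Step 4.
public interface IAiService
{
    Task<string> GenerateAsync(string prompt, CancellationToken cancellationToken);
}

// Hypothetical adapter that turns a delegate into an IAiService.
// Useful for tests or for wrapping another provider's client.
public sealed class DelegateAiService : IAiService
{
    private readonly Func<string, CancellationToken, Task<string>> _handler;

    public DelegateAiService(Func<string, CancellationToken, Task<string>> handler)
    {
        _handler = handler;
    }

    public Task<string> GenerateAsync(string prompt, CancellationToken cancellationToken)
        => _handler(prompt, cancellationToken);
}

// Hypothetical wrapper: use the primary provider, and switch to the
// secondary one only when the primary cannot be reached.
public sealed class FallbackAiService : IAiService
{
    private readonly IAiService _primary;
    private readonly IAiService _secondary;

    public FallbackAiService(IAiService primary, IAiService secondary)
    {
        _primary = primary;
        _secondary = secondary;
    }

    public async Task<string> GenerateAsync(string prompt, CancellationToken cancellationToken)
    {
        try
        {
            return await _primary.GenerateAsync(prompt, cancellationToken);
        }
        catch (HttpRequestException)
        {
            return await _secondary.GenerateAsync(prompt, cancellationToken);
        }
    }
}
```

The controller never changes. It keeps asking for an `IAiService` and has no idea which provider answered.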
Is Local AI Always Better?
No.
Cloud AI is still the better choice in many scenarios.
Use cloud AI when:
- You need state-of-the-art reasoning models
- You have thousands of concurrent users
- You need globally distributed infrastructure
Use local AI when:
- Privacy matters
- Cost control matters
- You're building internal business tools
In reality, many teams will likely use a hybrid approach.
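A hybrid setup can be as small as a routing decision per prompt. The sketch below is only an illustration of the idea: `HybridAiRouter` is a hypothetical class, the two backends are passed in as delegates, and the `mustStayLocal` policy is whatever rule your team defines. The keyword check in the usage note is a deliberately naive placeholder.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical router: sends each prompt to the local or the cloud
// backend depending on a caller-supplied policy.
public sealed class HybridAiRouter
{
    private readonly Func<string, CancellationToken, Task<string>> _local;
    private readonly Func<string, CancellationToken, Task<string>> _cloud;
    private readonly Func<string, bool> _mustStayLocal;

    public HybridAiRouter(
        Func<string, CancellationToken, Task<string>> local,
        Func<string, CancellationToken, Task<string>> cloud,
        Func<string, bool> mustStayLocal)
    {
        _local = local;
        _cloud = cloud;
        _mustStayLocal = mustStayLocal;
    }

    // Prompts flagged by the policy never leave your infrastructure;
    // everything else goes to the cloud backend.
    public Task<string> GenerateAsync(string prompt, CancellationToken cancellationToken)
        => _mustStayLocal(prompt)
            ? _local(prompt, cancellationToken)
            : _cloud(prompt, cancellationToken);
}
```

For example, `new HybridAiRouter(ollama.GenerateAsync, cloud.GenerateAsync, p => p.Contains("customer"))` would keep anything mentioning customers on the local model, where `ollama` and `cloud` are two `IAiService` instances. A real policy would more likely look at where the data came from than at the words in the prompt.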
Final Thoughts
A few years ago, running large language models felt like something only machine learning engineers could do.
Today, tools like Ollama have changed that.
As a .NET developer, integrating local AI is often just another HTTP integration.
And that's what makes this so exciting.
Local AI isn't replacing cloud AI.
It's simply giving developers another architectural option.
For internal tools, private assistants, documentation search, and AI-powered business applications, that option is becoming increasingly practical every day.