AI development is moving fast — but one thing that still slows developers down is relying on cloud APIs.
What if you could run powerful LLMs (Large Language Models) locally, just like spinning up a Docker container?
In this blog, I’ll show you how to:
- Run LLMs locally using Docker’s Model Runner
- Pull and run the Qwen 3 model
- Connect it to a .NET console app with Semantic Kernel
Let’s get started! 🚀
For a more in-depth walk-through, be sure to watch my YouTube video, where I demonstrate everything step by step!
🔹 Why Run LLMs Inside Docker?
Docker is already the standard for packaging and running applications. Now, with Docker Model Runner, you can:
- Pull AI models just like Docker images
- Run models locally without cloud APIs
- Expose OpenAI-compatible APIs to integrate seamlessly into apps
This means you can test, prototype, and even deploy AI workloads locally — cost-free and private.
🔹 Step 1: Enable Docker Model Runner
Open Docker Desktop → Settings → AI Features and enable:
- ✅ Docker Model Runner
- ✅ Host-side TCP support (on port 12434)
This allows models to expose REST APIs you can call from your apps.
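Before moving on, you can verify the runner is active. A quick check, assuming a recent Docker Desktop with the Model Runner plugin installed:
# Check that Docker Model Runner is running
docker model status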
🔹 Step 2: Pull and Run Qwen 3 Model
Open your terminal and run the following (Model Runner pulls models from Docker Hub's ai/ namespace):
# Pull the model
docker model pull ai/qwen3
# Run the model interactively
docker model run ai/qwen3
You’ll see an interactive chat session where you can ask questions.
Example:
> Hello
< Hi! How can I assist you today?
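You can also confirm the download from the CLI, using the same docker model plugin as above:
# List models available locally
docker model list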
🔹 Step 3: Use the REST API
Docker Model Runner exposes an OpenAI-compatible API at:
http://localhost:12434/engines/v1
For example, to list available models:
curl http://localhost:12434/engines/v1/models
You’ll see details of Qwen 3 and any other models you’ve pulled.
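Since the API is OpenAI-compatible, you can also call the chat completions endpoint directly with curl. A minimal sketch, assuming the model name from Step 2 (the exact path prefix may differ slightly across Docker versions):
curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/qwen3",
        "messages": [
          { "role": "user", "content": "Explain Docker in one sentence." }
        ]
      }'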
🔹 Step 4: Connect with Semantic Kernel (.NET)
The best part: you don't need to rewrite your app.
Semantic Kernel already speaks the OpenAI API, and since Docker Model Runner exposes the same OpenAI-compatible endpoints, all you change is the base URL.
Install the Semantic Kernel package:
dotnet add package Microsoft.SemanticKernel
Create a Console App
#pragma warning disable SKEXP0010 // the custom-endpoint overload may be marked experimental in your SK version

using Microsoft.SemanticKernel;

var builder = Kernel.CreateBuilder();

// Point the OpenAI connector at the local Docker Model Runner instead of the cloud
builder.AddOpenAIChatCompletion(
    modelId: "ai/qwen3",
    apiKey: "dummy", // not validated locally, any value works
    endpoint: new Uri("http://localhost:12434/engines/v1")); // Docker endpoint

var kernel = builder.Build();

var result = await kernel.InvokePromptAsync("Explain Docker in simple terms.");
Console.WriteLine(result);
That’s it! Your .NET app is now talking to Qwen 3 inside Docker, through Semantic Kernel.
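If you'd rather have a back-and-forth chat than a one-shot prompt, you can resolve the chat service from the kernel and stream tokens as they arrive. A minimal sketch building on the kernel created above:
using Microsoft.SemanticKernel.ChatCompletion;

var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();

while (true)
{
    Console.Write("You: ");
    var input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input)) break;
    history.AddUserMessage(input);

    Console.Write("Qwen: ");
    var reply = "";
    // Stream the response token by token from the local model
    await foreach (var chunk in chat.GetStreamingChatMessageContentsAsync(history))
    {
        Console.Write(chunk.Content);
        reply += chunk.Content;
    }
    Console.WriteLine();
    history.AddAssistantMessage(reply);
}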
🔹 Why This Matters
- 🖥️ Local-first AI → Run LLMs without internet or billing
- 🔒 Privacy → Your data never leaves your machine
- ⚡ Developer-friendly → Same SDKs, same APIs, just a different base URL
- 🔗 Drop-in replacement → Move between local and cloud seamlessly (see the sketch below)
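For example, a single environment variable can flip the same app between the local runner and the cloud. A hedged sketch (LOCAL_LLM and the cloud model id are my own example choices, not part of any SDK; same pragma note as in Step 4):
using Microsoft.SemanticKernel;

// LOCAL_LLM is a hypothetical switch for this example
var useLocal = Environment.GetEnvironmentVariable("LOCAL_LLM") == "1";

var builder = Kernel.CreateBuilder();
if (useLocal)
    builder.AddOpenAIChatCompletion(
        modelId: "ai/qwen3",
        apiKey: "dummy",
        endpoint: new Uri("http://localhost:12434/engines/v1")); // local Docker endpoint
else
    builder.AddOpenAIChatCompletion(
        modelId: "gpt-4o-mini", // example cloud model
        apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!);

var kernel = builder.Build();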
✅ Final Thoughts
Running AI models is now as easy as running containers.
With Docker Model Runner, you can:
- Pull Qwen 3
- Run it locally
- Connect it to .NET apps with Semantic Kernel
This opens up endless possibilities for building intelligent apps without cloud dependencies.
👉 What model are you planning to run first with Docker?
Drop a comment — I’d love to know!