Build an AI coding assistant that runs locally, protects your code, and avoids surprise API bills.
AI coding assistants are everywhere now—helping developers write code faster, refactor confidently, and ship features quicker.
But there’s a growing concern behind the hype:
✅ Where is my code going?
✅ Who can access it?
✅ Is it being stored or used for training?
✅ How much will this cost at scale?
If your team works on private repositories, client code, or enterprise products, sending sensitive context to third-party APIs can be risky (and sometimes not allowed at all).
That’s where a privacy-first local AI setup becomes incredibly valuable.
In this article, you’ll learn:
- What OpenCode and Docker Model Runner are
- Why combining them creates a private AI coding workflow
- How to configure OpenCode to talk to Docker Model Runner
- Which coding models work best and why context size matters
- How to package models for team-wide consistency
- A hands-on CLI walkthrough you can follow today
This guide is beginner-friendly, but still practical for developers who care about real workflows.
What Are OpenCode and Docker Model Runner?

Before we build anything, let’s quickly define both tools.
What is OpenCode?
OpenCode is an open-source coding assistant that fits into developer workflows.
It supports multiple AI providers and lets you configure:
- which provider to use
- which models to expose
- how the assistant connects to the model API
Think of OpenCode as the chat + coding assistant UI/CLI that can talk to different model backends.
📌 Learn more (official): https://opencode.ai/
What is Docker Model Runner (DMR)?
Docker Model Runner (DMR) lets you run and manage large language models (LLMs) using Docker.
The most important part:
✅ It exposes an OpenAI-compatible API.
That means tools that already support OpenAI-style endpoints (like OpenCode) can integrate with it easily.
📌 Learn more about OpenAI API compatibility:
https://platform.openai.com/docs/api-reference
📌 Learn more about Docker:
https://www.docker.com/
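To make that concrete, here is a rough sketch of what an OpenAI-style chat completion request to DMR could look like. The port and model name are assumptions based on the setup later in this article (qwen3-coder served at localhost:12434); adjust them to match your environment:
curl http://localhost:12434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder",
    "messages": [
      {"role": "user", "content": "Write a hello world function in Go"}
    ]
  }'
If this returns a JSON response with a choices array, any OpenAI-compatible client should be able to talk to the same endpoint.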
Why Use OpenCode + Docker Model Runner Together?
When you combine OpenCode with Docker Model Runner, you get:
✅ A familiar AI coding assistant experience
✅ Models running entirely in infrastructure you control
✅ A setup that can work locally or within your team network
The 2 biggest wins: Privacy + Cost Control
Benefit 1: Privacy by Design 🔒
If you run the model locally (or on your private infra), your code stays inside your environment.
Here’s what happens:
- OpenCode sends prompts + context to your configured provider
- Docker Model Runner receives that request locally
- The model generates a response
- Nothing gets sent to external AI vendors
That means:
✅ No third-party AI providers involved
✅ No external data sharing
✅ No vendor-side retention risks
✅ No training on your code by external services (because nothing leaves your infra)
If you’re working in security-sensitive environments, this can be the difference between:
✅ “We can use AI”
and
❌ “AI is blocked by policy”
Benefit 2: Cost Control 💰
Hosted AI coding tools can become expensive quickly because costs often scale with:
- repository size
- long context windows (more tokens)
- frequent iterative prompting (debug → fix → refactor → repeat)
But with Docker Model Runner:
✅ Inference runs on your own hardware
✅ Once you pull a model, there are no per-token fees
✅ No surprise bills during heavy development cycles
You pay in compute—not usage.
How OpenCode Configuration Works

OpenCode uses a configuration file called opencode.json to define providers and models.
You can configure it in two ways:
✅ Option 1: Global configuration
Location:
~/.config/opencode/opencode.json
Use this when you want the same setup across all projects.
✅ Option 2: Project-specific configuration
Create a file in your project root:
opencode.json
Important note:
✅ A project-level config overrides the global config.
This is great for teams because each repository can “pin” its own setup.
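For example, you can quickly check which configuration files are in play (the project-level file only exists if you created one):
# Global config, shared across all projects
cat ~/.config/opencode/opencode.json
# Project config, overrides the global one when present in the repo root
cat ./opencode.json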
Step 1: Run Docker Model Runner Locally
This guide assumes Docker Model Runner is available at:
http://localhost:12434/v1
That /v1 path matters because it matches OpenAI-style API versioning.
⚠️ Docker Desktop Users: Enable TCP Access
If you’re running Docker Model Runner via Docker Desktop, OpenCode connects over HTTP—which requires exposing the TCP port.
Run:
docker desktop enable model-runner --tcp
Once enabled, Docker Model Runner will be accessible at:
http://localhost:12434/v1
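A quick sanity check is to list the models the endpoint exposes. The /v1/models route is part of the standard OpenAI-style API surface, so it should work here too, but confirm against the Docker Model Runner docs for your version:
curl http://localhost:12434/v1/models
If you get a JSON list back, OpenCode will be able to reach the same endpoint.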
Step 2: Configure OpenCode to Use Docker Model Runner

Now let’s connect OpenCode to Docker Model Runner using opencode.json.
Create (or edit) your config file:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "dmr": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Docker Model Runner",
      "options": {
        "baseURL": "http://localhost:12434/v1"
      },
      "models": {
        "qwen3-coder": {
          "name": "qwen3-coder"
        },
        "devstral-small-2": {
          "name": "devstral-small-2"
        }
      }
    }
  }
}
What this config does (simple explanation)
- Declares a provider called dmr
- Uses an OpenAI-compatible adapter (@ai-sdk/openai-compatible)
- Points to your local model runner with baseURL
- Exposes models that are available locally
Now OpenCode can send requests to DMR just like it would to any hosted OpenAI-compatible API.
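As an optional smoke test, once you have pulled a model (covered in the next section), you can run a one-off prompt against it directly from the CLI. The exact syntax may vary between DMR versions, so treat this as a sketch and check docker model --help:
docker model run qwen3-coder "Write a hello world program in Go"
If that returns a sensible answer, the model is being served correctly and OpenCode should work against it too.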
Recommended Models for Coding 🧠
Not all models are equally good for coding.
For code-heavy workflows, you usually want:
- long context windows
- code-aware reasoning
- good performance on refactoring, debugging, repo understanding
Great options for coding workflows include:
✅ qwen3-coder
✅ devstral-small-2
✅ gpt-oss
Pull a model locally
For example:
docker model pull qwen3-coder
Once pulled, it’s available for Docker Model Runner to serve.
Pulling Models from Docker Hub and Hugging Face
One powerful feature of Docker Model Runner is that it can pull models from:
- Docker Hub
- Hugging Face
…and convert them into OCI artifacts (Docker-friendly packaging format).
Example:
docker model pull huggingface.co/unsloth/Ministral-3-14B-Instruct-2512-GGUF
This gives you access to the larger open model ecosystem while keeping consistency and portability.
📌 Learn more about Hugging Face:
https://huggingface.co/
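After pulling a model from Hugging Face, it is worth checking the exact name Docker Model Runner assigned to it, since that is the identifier you would reference in opencode.json:
docker model ls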
Why Context Length Matters (More Than Model Size)

A common beginner mistake is thinking:
“Bigger model = always better results”
But for coding assistants, context length often matters more.
Why?
Because real coding tasks require understanding:
- multiple files
- project structure
- existing conventions
- longer back-and-forth debugging chats
Default context sizes vary from model to model, and the default window is often much smaller than the maximum the model actually supports.
What does “128K context” mean?
It means the model can “see” a much larger input in one request—like:
- many files from your repo
- long conversations
- big refactors
If your model can’t hold enough context, it might:
❌ miss important imports
❌ break code conventions
❌ forget earlier details
❌ hallucinate APIs that don’t exist in your repo
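As a rough, hedged estimate: a token is typically around 3 to 4 characters of source code, so a 128K-token window corresponds to very roughly 400 to 500 KB of text, enough for dozens of average-sized source files plus the ongoing conversation. The exact numbers depend on the tokenizer and the programming language.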
Increasing Context Size for GPT-OSS
If you like gpt-oss but want a larger context window, Docker Model Runner lets you repackage it.
Step 1: Pull the model (if needed)
docker model pull gpt-oss
Step 2: Package it with a bigger context window
docker model package --from gpt-oss --context-size 128000 gpt-oss:128K
✅ This creates a new model artifact called gpt-oss:128K
Step 3: Reference it in opencode.json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "dmr": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Docker Model Runner",
      "options": {
        "baseURL": "http://localhost:12434/v1"
      },
      "models": {
        "gpt-oss:128K": {
          "name": "gpt-oss (128K)"
        }
      }
    }
  }
}
Now OpenCode can use gpt-oss:128K like any other model.
Sharing Models Across Your Team (Underrated Win)
Packaging models as OCI artifacts means you can share them like you share Docker images.
Teams can:
- standardize model variants (including context size)
- avoid “works on my machine” behavior differences
- roll back model changes safely
- store models in Docker Hub or private registries
This turns models into versioned infrastructure artifacts, not random personal preferences.
That’s a huge win for real teams.
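In practice, sharing a repackaged model could look roughly like this. The registry name below is an illustrative placeholder, and you should confirm the exact tag/push commands in the Docker Model Runner CLI reference for your version:
# Tag the repackaged model for your team's registry (example name)
docker model tag gpt-oss:128K yourorg/gpt-oss:128K
# Push it so teammates can pull the exact same artifact
docker model push yourorg/gpt-oss:128K
# On a teammate's machine
docker model pull yourorg/gpt-oss:128K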
Putting It All Together: Use OpenCode + DMR from the CLI
Let’s run a real workflow example end-to-end.
Step 1: Verify the model exists locally
docker model ls
You should see your model listed, for example:
gpt-oss:128K
Step 2: Ensure your project config includes the model
Example:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "dmr": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Docker Model Runner",
      "options": {
        "baseURL": "http://localhost:12434/v1"
      },
      "models": {
        "gpt-oss:128K": {
          "name": "gpt-oss (128K)"
        }
      }
    }
  }
}
This makes the model available under the dmr provider.
Step 3: Start OpenCode in your project
From your project root:
opencode
Step 4: Select a model inside OpenCode
Run:
/models
Pick the model you want to use.
Step 5: Generate an agents.md file
Now you can prompt OpenCode like this:
Generate an agents.md file in the project root following the agents.md specification and examples.
Use this repository as context and include sections that help an AI agent work effectively with this project, including:
- Project overview
- Build and test commands
- Code style guidelines
- Testing instructions
- Security considerations
Base the content on the actual structure, tooling, and conventions used in this repository.
Keep the file concise, practical, and actionable for an AI agent contributing to the project.
Why this works so well
Because OpenCode is connected to Docker Model Runner:
✅ Your repository context stays local
✅ Your prompts stay inside your environment
✅ Your model can reason over a large portion of the repo (especially with 128K context)
Step 6: Review and commit the output
Inspect the file:
cat agents.md
Then commit it like any normal artifact:
git add agents.md
git commit -m "Add agents documentation"
Common Mistakes (And How to Avoid Them)
Here are a few things that often trip up first-time setups:
❌ 1) Forgetting to expose TCP access (Docker Desktop)
✅ Fix: Enable it using:
docker desktop enable model-runner --tcp
❌ 2) Pointing OpenCode to the wrong base URL
✅ Fix: Use the correct endpoint:
http://localhost:12434/v1
❌ 3) Using a short-context model for big repos
✅ Fix: Prefer coding-focused models with long context:
qwen3-coder or devstral-small-2
Or repackage like:
gpt-oss:128K
❌ 4) Treating models like personal settings
✅ Fix: Package + share models for consistency across the team.
When This Setup Is a Great Fit
OpenCode + Docker Model Runner is especially helpful when you want:
- private AI coding for enterprise repositories
- internal AI tools for teams
- predictable AI costs
- local-first workflows
- reusable model artifacts across developers
Final Thoughts

AI coding assistants can be amazing—but not every team is comfortable sending private code to external providers.
By combining OpenCode with Docker Model Runner, you get:
✅ modern AI coding workflows
✅ privacy by design
✅ cost control
✅ team-wide consistency
✅ access to open-source models
If you’re serious about using AI in real engineering environments, this setup is one of the most practical ways to do it.
Extra Resources 📚
Here are some useful links to explore further:
- OpenCode: https://opencode.ai/
- Docker: https://www.docker.com/
- Hugging Face: https://huggingface.co/
- OpenAI API reference (for the OpenAI-compatible format): https://platform.openai.com/docs/api-reference
Disclaimer: Images Generated with the help of AI

