Parth G

OpenCode with Docker Model Runner for Private AI Coding 🐳🤖

Build an AI coding assistant that runs locally, protects your code, and avoids surprise API bills.

AI coding assistants are everywhere now—helping developers write code faster, refactor confidently, and ship features quicker.

But there’s a growing concern behind the hype:

Where is my code going?
Who can access it?
Is it being stored or used for training?
How much will this cost at scale?

If your team works on private repositories, client code, or enterprise products, sending sensitive context to third-party APIs can be risky (and sometimes not allowed at all).

That’s where a privacy-first local AI setup becomes incredibly valuable.

In this article, you’ll learn:

  • What OpenCode and Docker Model Runner are
  • Why combining them creates a private AI coding workflow
  • How to configure OpenCode to talk to Docker Model Runner
  • Which coding models work best and why context size matters
  • How to package models for team-wide consistency
  • A hands-on CLI walkthrough you can follow today

This guide is beginner-friendly, but still practical for developers who care about real workflows.


What Are OpenCode and Docker Model Runner?


Before we build anything, let’s quickly define both tools.

What is OpenCode?

OpenCode is an open-source coding assistant that fits into developer workflows.
It supports multiple AI providers and lets you configure:

  • which provider to use
  • which models to expose
  • how the assistant connects to the model API

Think of OpenCode as the chat + coding assistant UI/CLI that can talk to different model backends.

📌 Learn more (official): https://opencode.ai/


What is Docker Model Runner (DMR)?

Docker Model Runner (DMR) lets you run and manage large language models (LLMs) using Docker.

The most important part:

✅ It exposes an OpenAI-compatible API.

That means tools that already support OpenAI-style endpoints (like OpenCode) can integrate with it easily.

📌 Learn more about OpenAI API compatibility:
https://platform.openai.com/docs/api-reference

📌 Learn more about Docker:
https://www.docker.com/


Why Use OpenCode + Docker Model Runner Together?

When you combine OpenCode with Docker Model Runner, you get:

✅ A familiar AI coding assistant experience
✅ Models running entirely in infrastructure you control
✅ A setup that can work locally or within your team network

The 2 biggest wins: Privacy + Cost Control


Benefit 1: Privacy by Design 🔒

If you run the model locally (or on your private infra), your code stays inside your environment.

Here’s what happens:

  • OpenCode sends prompts + context to your configured provider
  • Docker Model Runner receives that request locally
  • The model generates a response
  • Nothing gets sent to external AI vendors

That means:

✅ No third-party AI providers involved
✅ No external data sharing
✅ No vendor-side retention risks
✅ No training on your code by external services (because nothing leaves your infra)

If you’re working in security-sensitive environments, this can be the difference between:

✅ “We can use AI”
and
❌ “AI is blocked by policy”


Benefit 2: Cost Control 💰

Hosted AI coding tools can become expensive quickly because costs often scale with:

  • repository size
  • long context windows (more tokens)
  • frequent iterative prompting (debug → fix → refactor → repeat)

But with Docker Model Runner:

✅ Inference runs on your own hardware
✅ Once you pull a model, there are no per-token fees
✅ No surprise bills during heavy development cycles

You pay in compute—not usage.


How OpenCode Configuration Works


OpenCode uses a configuration file called opencode.json to define providers and models.

You can configure it in two ways:

✅ Option 1: Global configuration

Location:

~/.config/opencode/opencode.json


Use this when you want the same setup across all projects.


✅ Option 2: Project-specific configuration

Create a file in your project root:

opencode.json


Important note:
✅ A project-level config overrides the global config.

This is great for teams because each repository can “pin” its own setup.
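
For example, one simple way to “pin” a repo to your current setup (assuming you already have a global config) is to copy it into the project root and adjust it there:

cp ~/.config/opencode/opencode.json ./opencode.json

From then on, OpenCode uses the project-level file whenever you run it inside that repository.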


Step 1: Run Docker Model Runner Locally

This guide assumes Docker Model Runner is available at:

http://localhost:12434/v1


That /v1 path matters because it matches OpenAI-style API versioning.


⚠️ Docker Desktop Users: Enable TCP Access

If you’re running Docker Model Runner via Docker Desktop, OpenCode connects over HTTP—which requires exposing the TCP port.

Run:

docker desktop enable model-runner --tcp


Once enabled, Docker Model Runner will be accessible at:

http://localhost:12434/v1

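Before wiring up OpenCode, it’s worth a quick sanity check that the endpoint actually responds. Because the API is OpenAI-compatible, a standard model-listing request should work (this assumes Docker Model Runner is running and serving on that port):

curl http://localhost:12434/v1/models

If you get back a JSON list of models rather than a connection error, you’re ready for the next step.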

Step 2: Configure OpenCode to Use Docker Model Runner


Now let’s connect OpenCode to Docker Model Runner using opencode.json.

Create (or edit) your config file:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "dmr": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Docker Model Runner",
      "options": {
        "baseURL": "http://localhost:12434/v1"
      },
      "models": {
        "qwen-coder3": {
          "name": "qwen-coder3"
        },
        "devstral-small-2": {
          "name": "devstral-small-2"
        }
      }
    }
  }
}


What this config does (simple explanation)

  • Declares a provider called dmr
  • Uses an OpenAI-compatible adapter (@ai-sdk/openai-compatible)
  • Points to your local model runner with baseURL
  • Exposes models that are available locally

Now OpenCode can send requests to DMR just like it would to any hosted OpenAI-compatible API.
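
If you want to see the same round trip without OpenCode in the loop, you can call the endpoint directly with an OpenAI-style chat completion request (a minimal sketch; it assumes the qwen3-coder model has already been pulled locally):

curl http://localhost:12434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder",
    "messages": [{"role": "user", "content": "Write a function that reverses a string."}]
  }'

OpenCode does essentially the same thing under the hood, with your repository context attached to the prompt.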


Recommended Models for Coding 🧠

Not all models are equally good for coding.

For code-heavy workflows, you usually want:

  • long context windows
  • code-aware reasoning
  • good performance on refactoring, debugging, repo understanding

Some good options for coding:

  • qwen3-coder
  • devstral-small-2
  • gpt-oss


Pull a model locally

For example:

docker model pull qwen3-coder


Once pulled, it’s available for Docker Model Runner to serve.
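
To confirm the pull succeeded, list the models Docker Model Runner knows about (the same command used later in this guide):

docker model ls

You should see qwen3-coder in the output.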


Pulling Models from Docker Hub and Hugging Face

One powerful feature of Docker Model Runner is that it can pull models from:

  • Docker Hub
  • Hugging Face

…and convert them into OCI artifacts (a Docker-friendly packaging format).

Example:

docker model pull huggingface.co/unsloth/Ministral-3-14B-Instruct-2512-GGUF


This gives you access to the larger open model ecosystem while keeping consistency and portability.

📌 Learn more about Hugging Face:
https://huggingface.co/


Why Context Length Matters (More Than Model Size)


A common beginner mistake is thinking:

“Bigger model = always better results”

But for coding assistants, context length often matters more.

Why?

Because real coding tasks require understanding:

  • multiple files
  • project structure
  • existing conventions
  • longer back-and-forth debugging chats

Default context sizes vary widely between models, and many ship with fairly small windows unless you explicitly increase them.

What does “128K context” mean?

It means the model can “see” a much larger input in one request—like:

  • many files from your repo
  • long conversations
  • big refactors
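
As a rough back-of-the-envelope estimate (using the common heuristic of ~4 characters per token, which varies by tokenizer and language):

# 128,000 tokens × ~4 characters per token ≈ 512,000 characters, i.e. roughly 500 KB of source text
echo $((128000 * 4))   # 512000

That’s enough room for a meaningful slice of many repositories plus the ongoing conversation.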

If your model can’t hold enough context, it might:

❌ miss important imports
❌ break code conventions
❌ forget earlier details
❌ hallucinate APIs that don’t exist in your repo


Increasing Context Size for GPT-OSS

If you like gpt-oss but want a larger context window, Docker Model Runner lets you repackage it.

Step 1: Pull the model (if needed)

docker model pull gpt-oss


Step 2: Package it with a bigger context window

docker model package --from gpt-oss --context-size 128000 gpt-oss:128K


✅ This creates a new model artifact called gpt-oss:128K
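
To confirm the new artifact exists (assuming the package step completed without errors), list your local models again:

docker model ls

Both gpt-oss and gpt-oss:128K should now appear.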

Step 3: Reference it in opencode.json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "dmr": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Docker Model Runner",
      "options": {
        "baseURL": "http://localhost:12434/v1"
      },
      "models": {
        "gpt-oss:128K": {
          "name": "gpt-oss (128K)"
        }
      }
    }
  }
}


Now OpenCode can use gpt-oss:128K like any other model.


Sharing Models Across Your Team (Underrated Win)

Packaging models as OCI artifacts means you can share them like you share Docker images.

Teams can:

  • standardize model variants (including context size)
  • avoid “works on my machine” behavior differences
  • roll back model changes safely
  • store models in Docker Hub or private registries

This turns models into versioned infrastructure artifacts, not random personal preferences.

That’s a huge win for real teams.
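
As a sketch of what sharing can look like (the registry path yourorg/gpt-oss is a placeholder, and this assumes your Docker Model Runner version supports tagging and pushing model artifacts):

# Tag the locally packaged model under your registry namespace (placeholder name)
docker model tag gpt-oss:128K yourorg/gpt-oss:128K

# Push it so teammates can pull the exact same artifact
docker model push yourorg/gpt-oss:128K

Teammates then pull yourorg/gpt-oss:128K and reference it in their own opencode.json, so everyone runs the same model with the same context size.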

Putting It All Together: Use OpenCode + DMR from the CLI

Let’s run a real workflow example end-to-end.

Step 1: Verify the model exists locally

docker model ls


You should see your model listed, for example:

gpt-oss:128K

Step 2: Ensure your project config includes the model

Example:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "dmr": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Docker Model Runner",
      "options": {
        "baseURL": "http://localhost:12434/v1"
      },
      "models": {
        "gpt-oss": {
          "name": "gpt-oss:128K"
        }
      }
    }
  }
}


This makes the model available under the dmr provider.


Step 3: Start OpenCode in your project

From your project root:

opencode


Step 4: Select a model inside OpenCode

Run:

/models


Pick the model you want to use.


Step 5: Generate an agents.md file

Now you can prompt OpenCode like this:

Generate an agents.md file in the project root following the agents.md specification and examples.

Use this repository as context and include sections that help an AI agent work effectively with this project, including:
- Project overview
- Build and test commands
- Code style guidelines
- Testing instructions
- Security considerations

Base the content on the actual structure, tooling, and conventions used in this repository.
Keep the file concise, practical, and actionable for an AI agent contributing to the project.


Why this works so well

Because OpenCode is connected to Docker Model Runner:

✅ Your repository context stays local
✅ Your prompts stay inside your environment
✅ Your model can reason over a large portion of the repo (especially with 128K context)


Step 6: Review and commit the output

Inspect the file:

cat agents.md


Then commit it like any normal artifact:

git add agents.md
git commit -m "Add agents documentation"


Common Mistakes (And How to Avoid Them)

Here are a few things that often trip up first-time setups:

❌ 1) Forgetting to expose TCP access (Docker Desktop)

✅ Fix: Enable it using:

docker desktop enable model-runner --tcp


❌ 2) Pointing OpenCode to the wrong base URL

✅ Fix: Use the correct endpoint:

http://localhost:12434/v1


❌ 3) Using a short-context model for big repos

✅ Fix: Prefer coding-focused models with long context:

  • qwen3-coder
  • devstral-small-2

Or repackage like:

  • gpt-oss:128K

❌ 4) Treating models like personal settings

✅ Fix: Package + share models for consistency across the team.

When This Setup Is a Great Fit

OpenCode + Docker Model Runner is especially helpful when you want:

  • private AI coding for enterprise repositories
  • internal AI tools for teams
  • predictable AI costs
  • local-first workflows
  • reusable model artifacts across developers

Final Thoughts


AI coding assistants can be amazing—but not every team is comfortable sending private code to external providers.

By combining OpenCode with Docker Model Runner, you get:

✅ modern AI coding workflows
✅ privacy by design
✅ cost control
✅ team-wide consistency
✅ access to open-source models

If you’re serious about using AI in real engineering environments, this setup is one of the most practical ways to do it.


Extra Resources 📚

Here are some useful links to explore further:

  • OpenCode: https://opencode.ai/
  • Docker: https://www.docker.com/
  • OpenAI API reference: https://platform.openai.com/docs/api-reference
  • Hugging Face: https://huggingface.co/

Disclaimer: Images Generated with the help of AI

Top comments (1)

leob

Fantastic writeup, and I can see the pros - but, do you need a fat and powerful machine (hardware) to run it?