Md Imran

🔥 Introducing Docker Model Runner – Bring AI Inference to Your Local Dev Environment

Docker Model Runner
Imagine running LLMs and GenAI models with a single Docker command: locally, seamlessly, and without the GPU fuss. That future is here.

🚢 Docker Just Changed the AI Dev Game

Docker has officially launched Docker Model Runner, and it's a game-changer for developers working with AI and machine learning. If you've ever dreamed of running language models, generating embeddings, or building AI apps right on your laptop, without setting up complex environments, Docker has your back.

Docker Model Runner enables local inference of AI models through a clean, simple CLI: no CUDA drivers, complicated APIs, or heavy ML stacks required. It brings the power of containers to the world of AI like never before.


✅ TL;DR - What Can You Do With It?

  • Pull prebuilt models like llama3, smollm, deepseek directly from Docker Hub
  • Run them locally via docker model run
  • Use the OpenAI-compatible API from containers or the host
  • Build full-fledged GenAI apps with Docker Compose
  • All this on your MacBook with Apple Silicon, with Windows support coming soon

🧪 Hands-on: How It Works

Docker's approach is dead simple, just the way we like it.

🧰 Install the Latest Docker Desktop

Make sure you're using a build that supports Model Runner.

โš™๏ธ Enable Model Runner

Install Docker Desktop 4.40 or later, then navigate to Docker Desktop → Settings → Features in Development → Enable Model Runner → Apply & Restart.


🚀 Try It Out in 5 Steps

docker model status         # Check it's running
docker model list           # See available models
docker model pull ai/llama3.2:1B-Q8_0
docker model run ai/llama3.2:1B-Q8_0 "Hello"
# Instantly receive inference results:
# Hello! How can I assist you today?
docker model rm ai/llama3.2:1B-Q8_0

It feels almost magical. The first response? Instant. No server spin-up. No API latency. Just raw, local AI magic.
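For scripting, the same CLI flow can be driven from Python. Here's a minimal sketch using `subprocess`; it assumes Docker Desktop 4.40+ with Model Runner enabled, and the model/prompt names are just the ones from the steps above:

```python
import subprocess

def build_model_run_cmd(model: str, prompt: str) -> list[str]:
    """Assemble the `docker model run` invocation shown above."""
    return ["docker", "model", "run", model, prompt]

def run_model(model: str, prompt: str) -> str:
    """Invoke the CLI and return the model's reply.

    Requires Docker Desktop with Model Runner enabled.
    """
    result = subprocess.run(
        build_model_run_cmd(model, prompt),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

# Example (requires a local Model Runner):
# print(run_model("ai/llama3.2:1B-Q8_0", "Hello"))
```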


🔌 OpenAI API Compatibility = Integration Bliss

Model Runner exposes OpenAI-compatible endpoints, so you can plug in your existing tools (LangChain, LlamaIndex, etc.) with zero code changes.

Use it:

  • Inside containers: http://model-runner.docker.internal/
  • From host (via socket): --unix-socket ~/.docker/run/docker.sock
  • From host (via TCP): reverse proxy to port 8080
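Because the endpoints follow the OpenAI shape, plain stdlib HTTP is enough to talk to them. A sketch, assuming the host-side reverse proxy on port 8080 mentioned above (adjust `BASE_URL` and the path to match your setup):

```python
import json
import urllib.request

# Assumed host-side address; adjust to your reverse-proxy or socket setup.
BASE_URL = "http://localhost:8080"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """POST to the OpenAI-compatible endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires a running Model Runner):
# print(chat("ai/llama3.2:1B-Q8_0", "Hello"))
```

Since the request/response shape matches OpenAI's, any client library that lets you override the base URL should work the same way.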

🤖 Supported Models (So Far)

Here are a few gems you can run today:

  • llama3.2:1b
  • smollm2:135m
  • mxbai-embed-large-v1
  • deepseek-r1-distill
  • …and more public pre-trained models on Docker Hub
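The embedding models in the list above are reachable the same way, via an OpenAI-style `/v1/embeddings` request. A minimal stdlib sketch; the base URL and exact model tag are assumptions, so check `docker model list` for the tag on your machine:

```python
import json
import urllib.request

# Assumed host-side address; adjust to your setup.
BASE_URL = "http://localhost:8080"

def build_embedding_request(model: str, texts: list[str]) -> dict:
    """Build an OpenAI-style embeddings payload."""
    return {"model": model, "input": texts}

def embed(model: str, texts: list[str]) -> list[list[float]]:
    """POST to the OpenAI-compatible embeddings endpoint; one vector per text."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/embeddings",
        data=json.dumps(build_embedding_request(model, texts)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]

# Example (requires a running embedding model; tag is hypothetical):
# vectors = embed("ai/mxbai-embed-large-v1", ["Docker Model Runner", "local inference"])
```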

💬 Dev-Friendly, Community-Driven

What makes this release truly exciting is how Docker involved its community of Captains and early testers. From the Customer Zero Release to the final launch, feedback was the fuel behind the polish.


🔮 What's Next?

  • ✅ Windows support (coming soon)
  • ✅ CI/CD integration
  • ✅ GPU acceleration in future updates
  • 🧠 More curated models on Docker Hub

🚨 Final Thoughts

Docker Model Runner is not just a feature; it's a shift. It's the bridge between AI and DevOps, between local dev and cloud inference.

No more juggling APIs. No more GPU headaches. Just type, pull, run.

AI, meet Dev Experience. Powered by Docker.


🚀 Try it today


