Alain Airom

Pros and Cons of Using Containerized Ollama vs. a Local Setup for Generative AI Applications

Reflections on the pros and cons of using containerized Ollama versus a local setup

Introduction

It’s been a while since I last used Ollama on my local machine. I utilize it primarily for model testing and for developing applications — specifically, to prototype professional demonstrations without relying on cloud-based services. I am currently evaluating the best path forward: is it more advantageous to continue with a local Ollama setup or transition to using it via a container image?

Regardless of the chosen path, the initial setup is remarkably straightforward. For the local installation on my MacBook, it was nearly a two-click process. Similarly, for the containerized approach, integrating with my existing Podman environment on the MacBook requires just a single command line entry. Given this ease of access for both methods, I’m left wondering: What are the key differences in long-term maintenance, resource utilization, or portability that should sway my decision?

🚀 Key Advantages of Ollama in a Container

Containers, such as those provided by Docker, package an application and all its dependencies into a single unit, ensuring it runs reliably regardless of the underlying environment.

1. Isolation and Dependency Management

  • Avoids Conflicts: The container isolates Ollama and its models from your host operating system. This is crucial for avoiding conflicts with other software or system dependencies (like specific Python or Node versions) on your local machine.
  • Clean Environment: The container provides a consistent, clean environment specifically tailored for Ollama, which simplifies troubleshooting. If it works on one machine, it will work the same way on any other machine that runs Docker.

2. Portability and Reproducibility

  • Run Anywhere: A container image can be easily moved and run across different operating systems (Linux, macOS, Windows) and environments (development, testing, production) without complex re-installation or configuration. This makes it highly portable.
  • Reproducible Setup: It ensures that every developer or system using the same container image has the exact same Ollama environment, which is vital for reproducible results and collaborative development.
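One way to push reproducibility a step further is to pin the image to an explicit tag or digest instead of relying on the implicit latest. A minimal sketch (the tag and digest below are placeholders, check the registry for real values):

# pull a specific, pinned version instead of the implicit :latest
podman pull docker.io/ollama/ollama:<version-tag>

# or pin by immutable digest for an exactly reproducible image
podman pull docker.io/ollama/ollama@sha256:<digest>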

3. Ease of Deployment and Management

  • Simplified Setup: Deploying and starting Ollama is reduced to a single docker run or docker-compose up command, which handles all setup, networking, and volume mounting.
  • Easy Updates and Rollbacks: Updating to a new version is as simple as pulling a new image and restarting the container. If an update causes issues, rolling back to a previous image version is straightforward.
  • 🔥🔥🔥 Running Multiple Instances: Containers make it easy to run multiple, isolated instances of Ollama simultaneously, perhaps using different configurations or models, without them interfering with each other (see the sketch below).
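For example, here is a minimal sketch of two isolated instances side by side, each with its own volume and host port (the names and the second port are arbitrary choices for illustration):

# instance 1: default port, its own named volume
podman run -d -v ollama_a:/root/.ollama -p 11434:11434 --name ollama_a ollama/ollama

# instance 2: different host port and volume, completely isolated from the first
podman run -d -v ollama_b:/root/.ollama -p 11435:11434 --name ollama_b ollama/ollama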

🛠️ The Local Installation Trade-off

While a container offers the benefits above, a local installation might be preferred by users who:

  • Prioritize Performance: In some setups, a local/native installation can offer slightly better performance or more direct access to system resources (like a GPU), without the overhead introduced by a container virtualization layer, although modern container runtimes have minimized this gap significantly. On Apple Silicon in particular, a native install can use the GPU through Metal, while a containerized Ollama runs inside a Linux VM and typically falls back to CPU-only inference.
  • Prefer Simplicity: For a single user on a single machine who doesn’t mind managing dependencies, the initial setup might feel simpler than installing and configuring a container runtime (like Docker Desktop) first.
  • Need Deeper System Integration: A native install might be necessary if you require Ollama to integrate with a specific, complex setup on your host machine that is difficult to configure within a container.

In summary, the container image is generally the recommended choice for development and production due to the superior consistency, isolation, and portability it provides, especially when integrating Ollama with other applications (like a web UI).
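As a concrete illustration of that integration story, a docker-compose.yml can declare Ollama as a service that other containers reach by service name. This is only a sketch: the companion app, its image, and the OLLAMA_BASE_URL variable are placeholders for whatever frontend or application you actually use.

services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama

  # hypothetical companion app (e.g. a web UI) that reaches Ollama over the Compose network
  my-app:
    image: my-app-image:latest                 # placeholder image
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434    # placeholder variable name, depends on the app
    depends_on:
      - ollama

volumes:
  ollama: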

Containerized setup

🐳 Ollama Podman or Docker run Commands

docker run -d \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
| **Flag**                  | **Description**                                              |
| ------------------------- | ------------------------------------------------------------ |
| `-d`                      | Runs the container in **detached** (background) mode.        |
| `-v ollama:/root/.ollama` | Creates a Docker **volume** named `ollama` and mounts it to the model storage directory inside the container for **persistent data**. |
| `-p 11434:11434`          | **Maps** the container's default port (`11434`) to the host machine's port `11434`. |
| `--name ollama`           | Assigns the name `ollama` to the container for easy reference. |
| `ollama/ollama`           | The name of the official Ollama Docker image.                |

Here is what I actually did on my laptop 💻

podman run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Resolving "ollama/ollama" using unqualified-search registries (/etc/containers/registries.conf.d/999-podman-machine.conf)
Trying to pull docker.io/ollama/ollama:latest...
Getting image source signatures
Copying blob sha256:e8627297e570041681ee6d0e26dc87f37f7e4ba0c20fd450bda1ffa56fc8cad5
Copying blob sha256:7bdf644cff2e9be580c17c3db8d5fc564ad093513bf0fbebebc392c17fa925e5
Copying blob sha256:fec7a622848d5dc122df3cad0584cf55388324c568120a1dc3e117f8efb49b15
Copying blob sha256:f3a90fbaa0a7594a04c7defb3c09b01d6b7b42bb592a5326bc4db944047ec33e
Copying config sha256:b5d17ef015a877cf489acb96259e5bf33ea154d83ba523ea2acd117f4c29f58f
Writing manifest to image destination
9f3b3b63b5861e62610f7840d21f2dd4c9c776ab5dd0a34b0bb0e410ba53a205

====

podman exec -it ollama ollama run granite4
pulling manifest
pulling 5c7ac4aead1b: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████▏ 2.1 GB
pulling 9fa3d9413163: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████▏ 6.8 KB
pulling cfc7749b96f6: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████▏  11 KB
pulling c2d801f60914: 100% ▕███████████████████████████████████████████████████████████████████████████████████████████▏  417 B
verifying sha256 digest
writing manifest
success
>>> Send a message (/? for help)

If you have a GPU, the command would be:

docker run -d \
  --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

# --gpus=all passes all available GPUs to the container (requires the NVIDIA Container Toolkit)
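A quick sanity check after starting the GPU-enabled container, assuming a Linux host with NVIDIA drivers and the Container Toolkit installed (I have not run this on my MacBook, it is NVIDIA-specific):

# the startup logs should mention the detected GPU / CUDA libraries
docker logs ollama

# if the toolkit injects the utility into the container, this shows the GPU from inside it
docker exec -it ollama nvidia-smi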

🏃 Running a Model After Container Start

Once the container is running, you interact with the Ollama CLI inside the container to pull and run models:

# from the documentation
docker exec -it ollama ollama run llama3

# what I ran, using Podman
podman exec -it ollama ollama run granite4
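Because port 11434 is mapped to the host, you can also talk to the server through its REST API instead of the interactive CLI. For example, with the granite4 model already pulled:

curl http://localhost:11434/api/generate -d '{
  "model": "granite4",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'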

Storage

When running Ollama from a container image, a primary consideration is data persistence, which directly affects your available storage space. Specifically, you must decide how to handle model files and configuration data: should they live in the container’s writable layer (and risk being deleted along with the container), or be mounted externally via a volume?

Containerized — Default Location (Without Persistence)

By default, if you do not specify a volume or bind mount, the models are stored inside the container’s isolated filesystem at:

/root/.ollama    # inside the container
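You can confirm where the models actually live by listing that directory from inside the running container:

podman exec -it ollama ls /root/.ollama/models
# typically shows the blobs and manifests directories Ollama uses for model storage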

Warning: If the container is removed, all downloaded models will be lost.

Persistent Location

To ensure your models survive container updates or removal, you must map the container’s model directory to a persistent location on your host machine using a Docker volume or a bind mount.

  • Using a Named Volume (Recommended Method): The models are stored in the Docker-managed volume named ollama. Docker decides where this volume lives on your host system, usually under /var/lib/docker/volumes/ollama/_data on Linux machines, but it's best to let Docker handle it via the volume name (you can inspect the volume to confirm the location, as sketched after this list).
docker run -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
  • Using a Bind Mount: If you use a bind mount (mapping a specific host directory), you explicitly define the location. For example, if your docker-compose.yml or docker run command specifies the following, the models are stored in /path/on/your/host on your machine (I have not tested this myself):
volumes:
  - /path/on/your/host:/root/.ollama
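For the named-volume approach, you can also ask the engine directly where it keeps the data:

# prints the volume metadata, including its Mountpoint
podman volume inspect ollama
# note: with Podman or Docker on macOS, that path lives inside the Linux VM, not on the macOS filesystem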

Local Installation

All downloaded models remain on your machine until you remove them.

> ollama list
NAME                        ID              SIZE      MODIFIED
mistral:7b                  6577803aa9a0    4.4 GB    4 days ago
ibm/granite4:micro          89962fcc7523    2.1 GB    4 days ago
mxbai-embed-large:latest    468836162de7    669 MB    2 weeks ago
all-minilm:latest           1b226e2802db    45 MB     2 weeks ago
embeddinggemma:latest       85462619ee72    621 MB    2 weeks ago
granite-embedding:latest    eb4c533ba6f7    62 MB     3 weeks ago
qwen3-vl:235b-cloud         7fc468f95411    -         3 weeks ago
granite4:micro-h            ba791654cc27    1.9 GB    5 weeks ago
granite4:latest             4235724a127c    2.1 GB    5 weeks ago
granite-embedding:278m      1a37926bf842    562 MB    5 weeks ago
nomic-embed-text:latest     0a109f422b47    274 MB    3 months ago

###
### In /Users/xyz/.ollama
ls
history        id_ed25519     id_ed25519.pub logs           models

❓ Can we have both versions running on one machine? Yes, absolutely!

💡 The Port Conflict Issue

  • Ollama’s Default Port: The Ollama server, whether running natively or inside a Docker container, defaults to listening on port 11434.
  • The Problem: Only one process can bind to a specific port on a host machine at a time. If your native (local) Ollama is running and occupying port 11434, your Docker container will fail to start if it tries to map its internal port 11434 to the host’s 11434.
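On macOS or Linux you can quickly check whether something is already listening on that port before starting the container:

# shows the process currently bound to 11434, if any
lsof -nP -iTCP:11434 -sTCP:LISTEN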

✅ The Solution: Change the Port Mapping

You need to tell Docker to map the container’s internal port to a different, unused port on your host machine.

docker run -d \
  -v ollama_docker:/root/.ollama \
  -p 11435:11434 \
  --name ollama_container \
  ollama/ollama

# -p 11435:11434 maps the host's port 11435 to the container's internal port 11434

# Local (Native) Ollama  -> http://localhost:11434
# Podman/Docker Ollama   -> http://localhost:11435
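With both servers up, the native ollama CLI can be pointed at either instance through the OLLAMA_HOST environment variable:

# talk to the native install (default port)
ollama list

# talk to the containerized instance on the remapped port
OLLAMA_HOST=127.0.0.1:11435 ollama list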

Conclusion

By running both the native local Ollama and the containerized Ollama concurrently — each on its own dedicated port (e.g., 11434 and 11435, respectively) — you gain significant flexibility. You can test different tasks in parallel, dedicating one instance (perhaps the native one) to a specific library integration or hardware path while reserving the containerized version for general testing or other application workflows, which maximizes productivity and lets you compare environments without interference.

That’s what I would do from now on!

Hope this is useful and thanks for reading 🤗
