Easily serve LLaMA models using Ollama inside a Docker container and expose the server for external access on port 11434.
🧰 Prerequisites
- Docker installed
- Sufficient RAM and disk for LLaMA models
- (Optional) The Ollama CLI installed locally for testing outside the container
🛠️ Step 1: Create a Dockerfile
FROM ollama/ollama:latest
# Pull the LLaMA model at build time so it is baked into the image.
# `ollama pull` needs a running server, so start one briefly in the background.
RUN ollama serve & sleep 5 && ollama pull llama3
# Listen on all interfaces so the API is reachable from outside the container
ENV OLLAMA_HOST=0.0.0.0:11434
# Expose the default Ollama port
EXPOSE 11434
# The base image's ENTRYPOINT is already the ollama binary, so CMD only supplies the subcommand
CMD ["serve"]
Save this as Dockerfile.
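If you would rather not bake the model into the image, one common alternative (a sketch that replaces Steps 2 and 3 below) is to run the stock ollama/ollama image and pull the model into the running container:
# Run the stock image, then pull the model into the live container
docker run -d -p 11434:11434 --name ollama-llama-container ollama/ollama:latest
docker exec -it ollama-llama-container ollama pull llama3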
🏗️ Step 2: Build the Docker Image
docker build -t ollama-llama .
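Once the build finishes, you can optionally confirm the image was created and check its size; baking a model into the image typically adds several gigabytes:
docker images ollama-llama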
🚀 Step 3: Run the Container
docker run -d -p 11434:11434 --name ollama-llama-container ollama-llama
This exposes the Ollama LLaMA server at http://localhost:11434.
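To quickly verify that the server is reachable and the model was baked into the image, list the locally available models via Ollama's REST API:
curl http://localhost:11434/api/tags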
💬 Step 4: Interact with the Model
Send requests over HTTP or use the ollama CLI:
curl http://localhost:11434
This returns "Ollama is running" and confirms the server is reachable. You can also use ollama run llama3 if the CLI is installed locally and pointed at this container (it talks to http://localhost:11434 by default; set OLLAMA_HOST to override).
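For an actual generation request, here is a minimal sketch against Ollama's /api/generate endpoint, assuming the llama3 model pulled in the Dockerfile above; "stream": false returns a single JSON response instead of a stream:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'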
📝 Notes
- Replace llama3 with llama2 or any other model name from the Ollama library as needed.
- Use volume mounting if you want persistent model caching (see the example after this list).
- Monitor container logs with:
docker logs -f ollama-llama-container
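For persistent model caching, a minimal sketch (assuming a named volume called ollama-data) mounts the volume over /root/.ollama, the directory where Ollama stores models inside the official image:
docker run -d -p 11434:11434 -v ollama-data:/root/.ollama --name ollama-llama-container ollama-llama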
📚 Resources
- 🔗 Ollama Official Site
- 🔗 Docker CLI Docs
- 🧠 LLaMA Paper