LocalAI is an open-source platform for running Large Language Models locally with an OpenAI-compatible API, so you can swap it in behind existing OpenAI client code without paying per-token or sending data off-server. This guide deploys LocalAI using Docker Compose with Traefik handling automatic HTTPS, persistent model and cache directories, and a working chat-completion test. By the end, you'll have LocalAI serving an OpenAI-compatible API securely at your domain.
Set Up the Directory Structure
1. Create the project directories:
$ mkdir -p ~/localai/{models,cache}
$ cd ~/localai
models/ holds downloaded model files and cache/ holds cached data. Both are bind-mounted into the LocalAI container, so their contents persist across restarts.
2. Create the environment file:
$ nano .env
DOMAIN=localai.example.com
LETSENCRYPT_EMAIL=admin@example.com
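Both variables are interpolated into the Compose file later, so an empty or missing value only shows up as a routing or certificate failure. A minimal sketch of a pre-flight check follows; check_env_file is a hypothetical helper name, not part of Docker or LocalAI.

```shell
# Sketch: verify that an env file defines every key the Compose file references.
# check_env_file is a hypothetical helper, not a Docker or LocalAI command.
check_env_file() {
  env_file="$1"
  shift
  missing=0
  for key in "$@"; do
    # A key counts as set only if it appears as KEY=<non-empty value>.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run it as `check_env_file .env DOMAIN LETSENCRYPT_EMAIL && echo ".env looks complete"`.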
Deploy with Docker Compose
1. Add your user to the Docker group:
$ sudo usermod -aG docker $USER
$ newgrp docker
2. Create the Compose manifest:
$ nano docker-compose.yaml
services:
  traefik:
    image: traefik:v3.6
    container_name: traefik
    restart: unless-stopped
    environment:
      DOCKER_API_VERSION: "1.44"
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--certificatesresolvers.le.acme.httpchallenge=true"
      - "--certificatesresolvers.le.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.le.acme.email=${LETSENCRYPT_EMAIL}"
      - "--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./letsencrypt:/letsencrypt
  localai:
    image: localai/localai:latest-aio-cpu
    container_name: localai
    restart: unless-stopped
    volumes:
      - ./models:/models:cached
      - ./cache:/cache:cached
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
      interval: 1m
      timeout: 20s
      retries: 5
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.localai.rule=Host(`${DOMAIN}`)"
      - "traefik.http.routers.localai.entrypoints=websecure"
      - "traefik.http.routers.localai.tls=true"
      - "traefik.http.routers.localai.tls.certresolver=le"
      - "traefik.http.services.localai.loadbalancer.server.port=8080"
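Swap localai/localai:latest-aio-cpu for a GPU variant (latest-aio-gpu-nvidia-cuda-12) if the host has an NVIDIA GPU. The container also needs access to the GPU, which typically means installing the NVIDIA Container Toolkit on the host and adding a GPU device reservation to the localai service.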
3. Set the models directory permissions and start the stack:
$ sudo chmod -R 755 ~/localai/models
$ docker compose up -d
$ docker compose ps
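The healthcheck in the manifest runs once a minute, so docker compose ps can show the localai container as "starting" for a while, and the first start may take longer if the image has to fetch models. A minimal sketch of a wait loop follows; wait_healthy is a hypothetical helper, and the container name localai is taken from the manifest.

```shell
# Sketch: poll Docker until a container's healthcheck reports "healthy".
# wait_healthy is a hypothetical helper; arguments are name, attempts, delay.
wait_healthy() {
  name="$1"
  attempts="${2:-30}"   # how many times to poll
  delay="${3:-10}"      # seconds to sleep between polls
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # .State.Health.Status is "starting", "healthy", or "unhealthy".
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      echo "$name is healthy"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$name did not become healthy (last status: ${status:-unknown})" >&2
  return 1
}
```

Call it as `wait_healthy localai`, and check `docker compose logs localai` if it times out.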
Verify the API
1. Check readiness:
$ curl -i https://localai.example.com/readyz
A 200 OK confirms Traefik is routing to LocalAI.
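If the response is not 200, the status code usually narrows down where the problem is. The sketch below maps a few codes to likely causes; probe_status and explain_status are hypothetical helper names, and the explanations are rules of thumb for this setup rather than guarantees.

```shell
# Sketch: fetch only the HTTP status code and map it to a likely cause.
# probe_status and explain_status are hypothetical helper names.
probe_status() {
  # -s hides progress, -o discards the body, -w prints the status code.
  # curl prints 000 when no HTTP response was received at all.
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

explain_status() {
  case "$1" in
    200) echo "ready: Traefik is routing to LocalAI" ;;
    404) echo "no matching router: check DOMAIN, or the container is not healthy yet" ;;
    502|503) echo "router matched but LocalAI is not answering yet" ;;
    000) echo "connection failed: check DNS, ports 80/443, and the certificate" ;;
    *) echo "unexpected status $1" ;;
  esac
}
```

For example: `explain_status "$(probe_status https://localai.example.com/readyz)"`.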
2. List the available models:
$ curl https://localai.example.com/v1/models
3. Run a chat completion:
$ curl -X POST https://localai.example.com/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-4",
      "messages": [
        {"role": "user", "content": "Explain what LocalAI does in one sentence."}
      ],
      "max_tokens": 60
    }'
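LocalAI returns a JSON response in the OpenAI chat-completion format, with the generated text under choices[0].message.content. If the request fails because the model is unknown, use a model ID returned by /v1/models instead.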
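For repeated testing, the request can be wrapped in a small function. This is a sketch only: LOCALAI_URL, build_payload, and chat are hypothetical names, and the payload builder assumes the model name and prompt contain no double quotes, backslashes, or newlines.

```shell
# Sketch: wrap the chat-completion request in a reusable function.
# LOCALAI_URL, build_payload, and chat are hypothetical names for illustration.
LOCALAI_URL="${LOCALAI_URL:-https://localai.example.com}"

build_payload() {
  # $1 = model, $2 = prompt, $3 = max_tokens (defaults to 60).
  # Assumes $1 and $2 need no JSON escaping; use a JSON tool for arbitrary input.
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": %d}' \
    "$1" "$2" "${3:-60}"
}

chat() {
  # Sends the payload to the same endpoint used in the curl example.
  curl -sS "$LOCALAI_URL/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$@")"
}
```

For example: `chat gpt-4 "Explain what LocalAI does in one sentence." 60`.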
Access the Dashboard
Open https://localai.example.com in a browser to browse the model gallery, install new models, and run inference from the UI.
Next Steps
LocalAI is running and served securely over HTTPS. From here you can:
- Install additional models from the gallery for domain-specific tasks
- Point any OpenAI SDK at the LocalAI base URL by setting OPENAI_BASE_URL (OPENAI_API_BASE in older SDK versions)
- Run a GPU variant for image generation, embeddings, and faster LLM inference
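As a rough sketch of the SDK step, the base URL can be supplied through environment variables before starting the client application. The key value below is a placeholder, on the assumption that this deployment configures no API key while the SDKs still expect a non-empty one. Which variable is read depends on the SDK and its version, so both names are set here.

```shell
# Sketch: point OpenAI SDKs at LocalAI through environment variables.
# The /v1 suffix matches the endpoints used in the curl examples.
export OPENAI_BASE_URL="https://localai.example.com/v1"   # name read by current official SDKs
export OPENAI_API_BASE="$OPENAI_BASE_URL"                 # legacy name used by older SDK versions
export OPENAI_API_KEY="sk-local-placeholder"              # placeholder value (assumption: no key is enforced)
```

Client code started from the same shell then sends its requests to LocalAI instead of api.openai.com.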
For the full guide with additional tips, visit the original article on Vultr Docs.