You've probably automated a webhook or two with n8n. But have you connected it to a local LLM that runs entirely on your own hardware — no OpenAI bill, no data leaving your network, no rate limits?
This is that article.
By the end you'll have a Docker Compose stack running:
- n8n — the visual workflow engine
- Ollama — local LLM runtime (runs Llama 3, Mistral, Qwen, etc.)
- A working n8n workflow that uses a local AI model to summarize text, classify data, or answer questions
Total setup time: ~20 minutes on any machine with 8+ GB RAM.
## Why This Combo Works
n8n's HTTP Request node can call any REST API. Ollama exposes a clean REST API on localhost:11434. So connecting them is just a matter of pointing n8n at Ollama's endpoint — no plugin, no special node needed.
The result: you get AI-powered automation that:
- Runs offline
- Has no per-request cost
- Never sends your data to a third party
- Can run 24/7 on a homelab box or spare PC
## Step 1: Docker Compose Setup
Create a `docker-compose.yml`:

```yaml
version: "3.8"

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped
    # For GPU acceleration (optional):
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=changeme
      - N8N_HOST=localhost
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
      - WEBHOOK_URL=http://localhost:5678/
    ports:
      - "5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama_data:
  n8n_data:
```
Start it:

```shell
docker compose up -d
```
## Step 2: Pull a Model into Ollama
Once the ollama container is running, pull a model. For most homelab hardware, `llama3.2:3b` is a good balance of speed and capability:

```shell
docker exec -it ollama ollama pull llama3.2:3b
```
For beefier machines (16+ GB RAM), try `mistral:7b` or `qwen2.5:7b`.
Verify it works:
```shell
curl http://localhost:11434/api/generate \
  -d '{
    "model": "llama3.2:3b",
    "prompt": "Summarize this in one sentence: The sky is blue because of Rayleigh scattering.",
    "stream": false
  }'
```
You should get back a JSON object with a `response` field. That's Ollama talking.
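With `"stream": false`, Ollama returns one JSON object rather than a stream of chunks. A trimmed example of its shape (the values here are illustrative; timing fields are in nanoseconds and vary per run):

```json
{
  "model": "llama3.2:3b",
  "response": "The sky appears blue because shorter wavelengths of sunlight scatter more in the atmosphere.",
  "done": true,
  "total_duration": 1500000000,
  "eval_count": 21
}
```

The `response` field is the generated text — it's the field the n8n workflow extracts in Step 3.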
## Step 3: Build the n8n Workflow
Open n8n at `http://localhost:5678` (login: `admin` / `changeme`).
Create a new workflow with these nodes:
Node 1: Webhook (trigger)
- Method: POST
- Path: `/summarize`
- Respond: Using 'Respond to Webhook' node (since the workflow ends with one)
This gives you a URL like `http://localhost:5678/webhook/summarize` to POST text to.
Node 2: HTTP Request (calls Ollama)
- Method: POST
- URL: `http://ollama:11434/api/generate`

Note: Inside Docker Compose, containers reach each other by service name. So n8n calls Ollama at `http://ollama:11434`, not `localhost`.
- Body (JSON):

```json
{
  "model": "llama3.2:3b",
  "prompt": "Summarize the following in 2-3 sentences:\n\n{{ $json.body.text }}",
  "stream": false
}
```
Node 3: Respond to Webhook
- Response Body:

```json
{
  "summary": "{{ $json.response }}"
}
```
Activate the workflow. Now test it:
```shell
curl -X POST http://localhost:5678/webhook/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Docker is an open-source platform that automates the deployment of applications inside lightweight, portable containers. It has become the standard tool for packaging software in modern development pipelines."}'
```
Response:

```json
{
  "summary": "Docker is an open-source containerization platform that packages applications for portable deployment. It is widely adopted in modern software development pipelines."
}
```
Local AI. Zero cloud. Done.
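One practical wrinkle when testing from the shell: raw quotes and newlines in your input will break hand-written JSON like the example above. A small sketch that JSON-escapes arbitrary text first (the `python3` one-liner is just for safe escaping; the curl call assumes the workflow above is active, so it's left commented out):

```shell
# Build a JSON-safe payload from arbitrary text — quoting and newlines
# are handled by json.dumps, not by hand.
payload=$(printf '%s' "Docker packages apps into portable containers." \
  | python3 -c 'import json, sys; print(json.dumps({"text": sys.stdin.read()}))')
echo "$payload"

# Send it (assumes the Step 3 workflow is active):
# curl -s -X POST http://localhost:5678/webhook/summarize \
#   -H "Content-Type: application/json" -d "$payload"
```

This way you can pipe in whole files (`cat notes.txt | python3 -c …`) without worrying about escaping.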
## Step 4: Practical Workflow Ideas
Once the plumbing works, here's what you can actually build:
- **Email triage agent** — Webhook receives email body → Ollama classifies it as urgent/normal/spam → n8n routes accordingly (write to Notion, send Slack alert, etc.)
- **RSS summarizer** — Schedule node fetches an RSS feed → Loop over items → Ollama writes a one-line summary → Append to a daily digest file or send it to your phone
- **Code review helper** — GitHub webhook sends a PR diff → Ollama reviews it → n8n posts a comment back to the PR via the GitHub API
- **Log anomaly detector** — n8n reads from a log file on a schedule → Sends batches to Ollama with "flag anything unusual" → Notifies you only when something interesting shows up
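For the triage idea, the only change from the summarizer is the prompt in the HTTP Request node's body. A sketch (the `$json.body.email` field is a placeholder for wherever your workflow stores the incoming email text):

```json
{
  "model": "llama3.2:3b",
  "prompt": "Classify the following email as exactly one of: urgent, normal, spam. Reply with only that single word.\n\n{{ $json.body.email }}",
  "stream": false
}
```

Downstream, an n8n IF or Switch node can branch on the one-word response to do the actual routing.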
All of these run locally, handle your private data, and cost $0/month beyond electricity.
## Performance Notes
On a Mac mini M4 (16 GB unified memory), `llama3.2:3b` responds in ~1-2 seconds per request; `mistral:7b` takes ~4-6 seconds. For async workflows that don't need a real-time response, even slower is fine — n8n will just wait.
If you're on a GPU-equipped machine, uncomment the `deploy` block in the Compose file; depending on the model, you can see roughly 3-5x faster responses.
For production use, add a proper reverse proxy (Traefik or Caddy) in front of n8n and don't expose port 5678 directly.
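As a sketch of that hardening step, you could add a Caddy service to the same Compose file (the domain and the Caddyfile are placeholders; this assumes a `Caddyfile` sitting next to `docker-compose.yml`):

```yaml
  caddy:
    image: caddy:latest
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
    restart: unless-stopped
  # Caddyfile contents (placeholder domain):
  #   n8n.example.com {
  #       reverse_proxy n8n:5678
  #   }
```

With this in place, remove the `ports` mapping from the n8n service so it is only reachable through the proxy.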
## The Takeaway
n8n + Ollama is one of those combinations that sounds more complicated than it is. Two Docker containers, one HTTP Request node, and you have a fully local AI automation engine.
No subscriptions. No terms of service to worry about. No vendor throttling your requests at 3 AM when your workflow actually runs.
The hard part isn't the setup — it's resisting the urge to automate everything once you realize how easy it is.
Want more homelab AI setups and builder-first automation guides? Follow SIGNAL on Dev.to — practical pieces, three times a week.