You've probably automated a webhook or two with n8n. But have you connected it to a local LLM that runs entirely on your own hardware — no OpenAI bill, no data leaving your network, no rate limits?
This is that article.
By the end you'll have a Docker Compose stack running:
- n8n — the visual workflow engine
- Ollama — local LLM runtime (runs Llama 3, Mistral, Qwen, etc.)
- A working n8n workflow that uses a local AI model to summarize text, classify data, or answer questions
Total setup time: ~20 minutes on any machine with 8+ GB RAM.
## Why This Combo Works
n8n's HTTP Request node can call any REST API. Ollama exposes a clean REST API on localhost:11434. So connecting them is just a matter of pointing n8n at Ollama's endpoint — no plugin, no special node needed.
The result: you get AI-powered automation that:
- Runs offline
- Has no per-request cost
- Never sends your data to a third party
- Can run 24/7 on a homelab box or spare PC
## Step 1: Docker Compose Setup
Create a `docker-compose.yml`:

```yaml
version: "3.8"

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped
    # For GPU acceleration (optional):
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=changeme
      - N8N_HOST=localhost
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
      - WEBHOOK_URL=http://localhost:5678/
    ports:
      - "5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama_data:
  n8n_data:
```
Start it:

```shell
docker compose up -d
```
## Step 2: Pull a Model into Ollama
Once the ollama container is running, pull a model. For most homelab hardware, `llama3.2:3b` is a good balance of speed and capability:

```shell
docker exec -it ollama ollama pull llama3.2:3b
```
For beefier machines (16+ GB RAM), try `mistral:7b` or `qwen2.5:7b`.
Verify it works:
```shell
curl http://localhost:11434/api/generate \
  -d '{
    "model": "llama3.2:3b",
    "prompt": "Summarize this in one sentence: The sky is blue because of Rayleigh scattering.",
    "stream": false
  }'
```
You should get back a JSON object with a `response` field. That's Ollama talking.
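With `"stream": false`, Ollama returns one JSON object rather than a stream of chunks. A trimmed example of its shape (the values here are illustrative; timing fields are in nanoseconds and vary per run):

```json
{
  "model": "llama3.2:3b",
  "response": "The sky appears blue because shorter wavelengths of sunlight scatter more in the atmosphere.",
  "done": true,
  "total_duration": 1500000000,
  "eval_count": 21
}
```

The `response` field is the generated text — it's the field the n8n workflow extracts in Step 3.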
## Step 3: Build the n8n Workflow
Open n8n at `http://localhost:5678` (login: `admin` / `changeme`).
Create a new workflow with these nodes:
Node 1: Webhook (trigger)
- Method: POST
- Path: `/summarize`
- Respond: Using 'Respond to Webhook' node (since the workflow ends with one)
This gives you a URL like `http://localhost:5678/webhook/summarize` to POST text to.
Node 2: HTTP Request (calls Ollama)
- Method: POST
- URL: `http://ollama:11434/api/generate`

Note: Inside Docker Compose, containers reach each other by service name. So n8n calls Ollama at `http://ollama:11434`, not `localhost`.
- Body (JSON):

```json
{
  "model": "llama3.2:3b",
  "prompt": "Summarize the following in 2-3 sentences:\n\n{{ $json.body.text }}",
  "stream": false
}
```
Node 3: Respond to Webhook
- Response Body:

```json
{
  "summary": "{{ $json.response }}"
}
```
Activate the workflow. Now test it:
```shell
curl -X POST http://localhost:5678/webhook/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Docker is an open-source platform that automates the deployment of applications inside lightweight, portable containers. It has become the standard tool for packaging software in modern development pipelines."}'
```
Response:

```json
{
  "summary": "Docker is an open-source containerization platform that packages applications for portable deployment. It is widely adopted in modern software development pipelines."
}
```
Local AI. Zero cloud. Done.
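One practical wrinkle when testing from the shell: raw quotes and newlines in your input will break hand-written JSON like the example above. A small sketch that JSON-escapes arbitrary text first (the `python3` one-liner is just for safe escaping; the curl call assumes the workflow above is active, so it's left commented out):

```shell
# Build a JSON-safe payload from arbitrary text — quoting and newlines
# are handled by json.dumps, not by hand.
payload=$(printf '%s' "Docker packages apps into portable containers." \
  | python3 -c 'import json, sys; print(json.dumps({"text": sys.stdin.read()}))')
echo "$payload"

# Send it (assumes the Step 3 workflow is active):
# curl -s -X POST http://localhost:5678/webhook/summarize \
#   -H "Content-Type: application/json" -d "$payload"
```

This way you can pipe in whole files (`cat notes.txt | python3 -c …`) without worrying about escaping.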
## Step 4: Practical Workflow Ideas
Once the plumbing works, here's what you can actually build:
- **Email triage agent** — Webhook receives email body → Ollama classifies it as urgent/normal/spam → n8n routes accordingly (write to Notion, send Slack alert, etc.)
- **RSS summarizer** — Schedule node fetches an RSS feed → Loop over items → Ollama writes a one-line summary → Append to a daily digest file or send it to your phone
- **Code review helper** — GitHub webhook sends a PR diff → Ollama reviews it → n8n posts a comment back to the PR via the GitHub API
- **Log anomaly detector** — n8n reads from a log file on a schedule → Sends batches to Ollama with "flag anything unusual" → Notifies you only when something interesting shows up
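For the triage idea, the only change from the summarizer is the prompt in the HTTP Request node's body. A sketch (the `$json.body.email` field is a placeholder for wherever your workflow stores the incoming email text):

```json
{
  "model": "llama3.2:3b",
  "prompt": "Classify the following email as exactly one of: urgent, normal, spam. Reply with only that single word.\n\n{{ $json.body.email }}",
  "stream": false
}
```

Downstream, an n8n IF or Switch node can branch on the one-word response to do the actual routing.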
All of these run locally, handle your private data, and cost $0/month beyond electricity.
## Performance Notes
On a Mac mini M4 (16 GB unified memory), `llama3.2:3b` responds in ~1-2 seconds per request; `mistral:7b` takes ~4-6 seconds. For async workflows that don't need a real-time response, even slower is fine — n8n will just wait.
If you're on a GPU-equipped machine, uncomment the `deploy` block in the Compose file; depending on the model, you can see roughly 3-5x faster responses.
For production use, add a proper reverse proxy (Traefik or Caddy) in front of n8n and don't expose port 5678 directly.
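As a sketch of that hardening step, you could add a Caddy service to the same Compose file (the domain and the Caddyfile are placeholders; this assumes a `Caddyfile` sitting next to `docker-compose.yml`):

```yaml
  caddy:
    image: caddy:latest
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
    restart: unless-stopped
  # Caddyfile contents (placeholder domain):
  #   n8n.example.com {
  #       reverse_proxy n8n:5678
  #   }
```

With this in place, remove the `ports` mapping from the n8n service so it is only reachable through the proxy.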
## The Takeaway
n8n + Ollama is one of those combinations that sounds more complicated than it is. Two Docker containers, one HTTP Request node, and you have a fully local AI automation engine.
No subscriptions. No terms of service to worry about. No vendor throttling your requests at 3 AM when your workflow actually runs.
The hard part isn't the setup — it's resisting the urge to automate everything once you realize how easy it is.
Want more homelab AI setups and builder-first automation guides? Follow SIGNAL on Dev.to — practical pieces, three times a week.