Every AI workflow tutorial assumes you are paying OpenAI $0.03 per 1K tokens. But what if you could run the same workflows locally, with zero API costs, and keep your data on your own machine?
You can. Here is how to connect n8n with Ollama to build local AI workflows.
What You Need
- n8n (self-hosted or desktop) -- see the official install guide
- Ollama -- local LLM runner, dead simple to install
- A machine with at least 8GB RAM (16GB recommended for larger models)
That is it. No API keys, no billing dashboards, no usage limits.
Step 1: Install Ollama
Head to ollama.com and download the installer for your OS. On Mac and Windows it is a standard installer. On Linux:
curl -fsSL https://ollama.com/install.sh | sh
Verify the install:
ollama --version
Ollama runs a local API server on http://localhost:11434 by default. This is what n8n will talk to.
Step 2: Pull a Model
Ollama supports dozens of open-source models. For workflow automation, I recommend starting with one of these:
# Fast and lightweight (3.8B params) -- good for summarization and extraction
ollama pull phi3
# More capable (7B params) -- good for content generation
ollama pull mistral
phi3 runs comfortably on 8GB RAM. mistral needs about 8GB free and runs better with 16GB.
Test it works:
ollama run phi3 "Summarize this in one sentence: n8n is an open-source workflow automation tool."
You should see a response in a few seconds. The model is loaded and ready.
Step 3: Connect n8n to Ollama
Ollama exposes a simple local HTTP API. In n8n, you connect to it using the HTTP Request node -- no special plugin needed.
The endpoint you will call:
POST http://localhost:11434/api/generate
The request body:
{
  "model": "phi3",
  "prompt": "Your prompt here",
  "stream": false
}
Setting stream: false is important -- it makes Ollama return the complete response in one JSON object instead of streaming chunks.
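As a sketch of what the HTTP Request node does under the hood, here is the same call using only Python's standard library. The model name (phi3) and the default port are the assumptions from the steps above:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def build_payload(prompt, model="phi3"):
    """Build the non-streaming request body for /api/generate."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def generate(prompt, model="phi3"):
    """POST to Ollama and return the 'response' field of the single JSON reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With stream=false, the whole reply arrives as one JSON object
        return json.loads(resp.read())["response"]
```

With `stream` left at its default of `true`, the body would instead be a sequence of newline-delimited JSON chunks, which is why the single-object mode is easier to wire into n8n.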
The response looks like:
{
  "model": "phi3",
  "response": "The generated text appears here...",
  "done": true
}
You grab {{ $json.response }} in the next node and use it however you want.
Step 4: Build a Text Summarizer Workflow
Let us build a practical example: a webhook that accepts text and returns an AI-generated summary.
The workflow (4 nodes):
1. Webhook node (trigger)
- Method: POST
- Path: /summarize -- this receives the text to summarize
2. HTTP Request node (Ollama call)
- Method: POST
- URL: http://localhost:11434/api/generate
- Body (JSON):
  {
    "model": "phi3",
    "prompt": "Summarize the following text in 2-3 sentences. Be concise and capture the key points.\n\nText: {{ $json.body.text }}",
    "stream": false
  }
3. Set node (format response)
- Set a field summary to {{ $json.response }}
4. Respond to Webhook node
- Returns the summary to the caller
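The transformation steps above can be sketched in plain Python, which is handy for iterating on the prompt outside n8n. The template mirrors the n8n expression (with `{{ $json.body.text }}` becoming an ordinary substitution), and the model name phi3 is the assumption from Step 2:

```python
PROMPT_TEMPLATE = (
    "Summarize the following text in 2-3 sentences. "
    "Be concise and capture the key points.\n\nText: {text}"
)

def build_request_body(text, model="phi3"):
    """What the HTTP Request node sends to Ollama."""
    return {
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(text=text),
        "stream": False,
    }

def format_response(ollama_json):
    """What the Set node does: pull 'response' into a 'summary' field."""
    return {"summary": ollama_json["response"]}
```

Feeding a sample Ollama reply through `format_response` gives exactly the object the Respond to Webhook node returns to the caller.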
Test it:
curl -X POST http://localhost:5678/webhook/summarize \
-H "Content-Type: application/json" \
-d '{"text": "n8n is a workflow automation tool that allows users to connect various services and automate tasks. It supports over 400 integrations and can be self-hosted for complete data privacy. Unlike SaaS alternatives, n8n has no per-task pricing, making it cost-effective for high-volume automation."}'
You get back a clean summary, generated locally, with zero API costs.
Going Further: Chat Completions API
Ollama also supports the OpenAI-compatible chat completions endpoint:
POST http://localhost:11434/v1/chat/completions
This means you can use n8n's built-in OpenAI node by pointing it at http://localhost:11434/v1 as a custom base URL. Same node, same interface, but the model runs on your hardware.
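If you would rather script against the OpenAI-compatible endpoint directly, here is a minimal sketch using only Python's standard library (no client package required; the phi3 model name is again an assumption):

```python
import json
import urllib.request

CHAT_URL = "http://localhost:11434/v1/chat/completions"  # OpenAI-compatible endpoint

def build_chat_payload(user_message, model="phi3"):
    """OpenAI-style chat body: a list of role/content messages."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

def chat(user_message, model="phi3"):
    req = urllib.request.Request(
        CHAT_URL,
        data=json.dumps(build_chat_payload(user_message, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # OpenAI-compatible replies nest the text under choices[0].message.content
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Note the different response shape: `/api/generate` returns a flat `response` field, while the chat completions endpoint nests the text inside a `choices` array, just like OpenAI's API.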
When to Use Local vs Cloud AI
| Use Case | Local (Ollama) | Cloud (OpenAI/Claude) |
|---|---|---|
| Summarization | Great | Overkill |
| Text extraction | Great | Overkill |
| Content generation | Good (7B+ models) | Better quality |
| Complex reasoning | Limited | Much better |
| Data privacy | Full control | Data leaves your machine |
| Cost at scale | Free | Adds up fast |
For most automation tasks -- summarizing, extracting, classifying, reformatting -- a local 7B model is more than enough. Save the cloud APIs for tasks that genuinely need GPT-4 level reasoning.
What We Are Building
At FlowYantra, we are working on n8n templates that use local LLMs for privacy-first automation. If you need a cloud-based AI workflow right now, check out our Blog to Social AI template -- it uses OpenAI today but the architecture is model-agnostic, so swapping to Ollama is straightforward.
All our templates (free and paid) are on our Gumroad store.
Local AI is not the future. It is already here. And with n8n, it takes about 20 minutes to set up.