How to Run Ollama on Mac Mini: A Complete Local AI Setup Guide
If you've been looking into how to run Ollama on Mac Mini, you've probably already figured out that the M-series chips make it one of the best local AI hosts money can buy. I set mine up a few weeks ago and it's been running 24/7 without a hiccup — silent, fast, and completely private. Here's exactly what I did.
Why Mac Mini?
The M2 and M4 Mac Minis use a unified memory architecture, meaning the CPU and GPU share the same RAM pool. For local AI workloads this matters a lot: a 16GB M2 Mac Mini runs Llama 3.1 8B comfortably, and a 24GB machine handles Mistral, Gemma 2, and even some 32B quantized models without breaking a sweat.
They're also quiet, energy-efficient (roughly 6-8W at idle), and small enough to sit behind a monitor. For a home AI server, there's not much competition.
Installing Ollama
First, grab the installer from ollama.com. It's a straightforward Mac app install — drag to Applications, done.
Once installed, open Terminal and verify it's running:
ollama --version
You should see something like ollama version 0.3.x. Ollama runs as a background service automatically after installation.
Pulling Your First Model
To run Ollama on Mac Mini effectively, you want to match the model to your RAM. Here's a quick guide:
| RAM | Recommended Models |
|---|---|
| 8GB | Llama 3.2 3B, Phi-3 Mini |
| 16GB | Llama 3.1 8B, Mistral 7B, Gemma 2 9B |
| 24GB+ | Gemma 2 27B (Q4), Mixtral 8x7B |
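As a rough sketch, the table above maps onto a small lookup helper. The tier floors and model names come from the table; the function itself is purely illustrative:

```python
# Illustrative helper mirroring the RAM table above; the tiers and
# model names are from the article, the function is hypothetical.
RECOMMENDED_MODELS = {
    8: ["Llama 3.2 3B", "Phi-3 Mini"],
    16: ["Llama 3.1 8B", "Mistral 7B", "Gemma 2 9B"],
    24: ["Mixtral 8x7B"],  # plus a quantized ~30B model if you have headroom
}

def models_for_ram(ram_gb: int) -> list[str]:
    """Return recommendations from the largest tier that fits the machine."""
    fitting = [tier for tier in RECOMMENDED_MODELS if tier <= ram_gb]
    return RECOMMENDED_MODELS[max(fitting)] if fitting else []
```

A 12GB machine, for instance, falls back to the 8GB tier rather than rounding up.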
Pull a model like this:
ollama pull llama3.1
It downloads to ~/.ollama/models. First pull takes a few minutes depending on model size and your connection.
Test it immediately:
ollama run llama3.1 "Summarise what Ollama is in two sentences."
If it responds, you're up and running.
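The CLI is backed by a REST API on port 11434, so the same prompt can be sent from a script. This sketch uses Ollama's documented `/api/generate` endpoint; `generate` and `build_payload` are hypothetical helper names, and actually calling `generate` assumes the Ollama service is running:

```python
import json
import urllib.request

def build_payload(model: str, prompt: str) -> dict:
    # stream=False makes Ollama return a single JSON object
    # instead of newline-delimited partial responses.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "llama3.1",
             host: str = "http://localhost:11434") -> str:
    """Send one prompt to a running Ollama service and return the reply."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

Every integration later in this post (n8n included) is ultimately just a variation on this one HTTP call.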
Exposing Ollama on Your Local Network
By default, Ollama only listens on localhost:11434. To reach it from other devices on your network (or from n8n running in Docker), you need to change the bind address.
On macOS, you do this by setting an environment variable in launchd, which the Ollama app reads when it starts:
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
Then restart Ollama from the menu bar icon (quit and reopen). You can verify it's listening on all interfaces:
lsof -i :11434
Now any device on your local network can reach Ollama at http://[your-mac-mini-ip]:11434.
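One caveat: `launchctl setenv` only lasts until the next reboot. If you want the binding to survive a restart, one option is a small LaunchAgent that re-applies it at login. This is a sketch; the label and the filename (`~/Library/LaunchAgents/com.example.ollama-host.plist`) are placeholders you'd choose yourself:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <!-- Placeholder label; any unique reverse-DNS name works. -->
  <key>Label</key>
  <string>com.example.ollama-host</string>
  <!-- Re-apply the bind address every time the session starts. -->
  <key>ProgramArguments</key>
  <array>
    <string>/bin/launchctl</string>
    <string>setenv</string>
    <string>OLLAMA_HOST</string>
    <string>0.0.0.0:11434</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
```

After saving it, load it once with `launchctl load ~/Library/LaunchAgents/com.example.ollama-host.plist` (or just reboot), then make sure Ollama starts after it.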
Integrating Ollama with n8n
This is where things get genuinely useful. n8n is a self-hosted workflow automation tool, and it has a native Ollama node. Once your Mac Mini is running Ollama on the local network, you can:
- Trigger workflows from email, webhooks, or schedules
- Pass content to Ollama for summarisation, classification, or drafting
- Route outputs to Notion, Gmail, Slack, or anywhere else
To connect n8n to Ollama, use the "Ollama" credential type and set the base URL to http://[mac-mini-ip]:11434. That's it. No API keys, no rate limits, no cloud costs.
A simple workflow might look like: Gmail trigger → extract email body → Ollama summarise → append to Notion database. Takes about 10 minutes to build.
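If you prefer n8n's generic HTTP Request node over the dedicated Ollama node, the summarise step reduces to a POST to Ollama's `/api/chat` endpoint. The field names below are Ollama's; the system prompt is illustrative, and `emailBody` is a hypothetical field from the previous node, referenced with n8n's `{{ }}` expression syntax:

```json
{
  "model": "llama3.1",
  "stream": false,
  "messages": [
    {
      "role": "system",
      "content": "Summarise the following email in three bullet points."
    },
    {
      "role": "user",
      "content": "{{ $json.emailBody }}"
    }
  ]
}
```

The reply comes back as a single JSON object whose `message.content` field holds the summary, ready to pass to the Notion node.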
Keeping Ollama Running After Restarts
The Ollama app should auto-start on login by default. Double-check by going to System Settings → General → Login Items and confirming Ollama is listed.
If you're running the Mac Mini headless (no monitor), make sure automatic login is enabled so the session starts after a power cycle: System Settings → Users & Groups → Automatic Login. Note that macOS disables automatic login while FileVault disk encryption is turned on.
Performance Tips
- Use quantized models: Q4_K_M variants are the sweet spot — nearly full quality, half the RAM.
- Close memory-hungry apps: Safari with 40 tabs will compete with your model. On a dedicated server this isn't an issue.
- Monitor with Activity Monitor: Check "Memory Pressure" under the Memory tab. Green means you have headroom.
- Concurrent requests: Ollama handles one request at a time by default. For multi-user setups, look into OLLAMA_NUM_PARALLEL.
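For example, parallelism is set the same way as the bind address. The values here are illustrative, not tuned recommendations, and each parallel slot costs extra RAM because the context allocation is multiplied:

```shell
# Allow up to 4 requests to be processed concurrently.
launchctl setenv OLLAMA_NUM_PARALLEL 4

# Optionally keep more than one model resident in memory at once.
launchctl setenv OLLAMA_MAX_LOADED_MODELS 2
```

As with OLLAMA_HOST, quit and reopen Ollama from the menu bar for the change to take effect.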
What I Actually Use This For
Day-to-day, my Mac Mini Ollama setup handles: summarising long emails before I read them, drafting replies, tagging and categorising RSS feeds, and running nightly document processing jobs through n8n. It's become infrastructure I'd miss if it went away.
The n8n integration specifically unlocked a lot — you stop thinking "I'll ask ChatGPT about this" and start thinking "I'll build a workflow for this." Different mental model, much more powerful.
Key Takeaways
- Mac Mini M-series is ideal for local AI: unified memory, low power, always-on
- Ollama installs in minutes — one app, no configuration needed for basic use
- Expose port 11434 on all interfaces to reach Ollama from other local devices
- n8n integration turns your local model into a full automation backend
- Quantized models (Q4_K_M) give near-full quality at half the memory cost
- For headless use, enable automatic login so Ollama survives power cycles
If you want to take this further — including the full n8n workflow setup, model selection guide, and automation templates — I documented the whole stack in a guide here: The Home AI Agent Blueprint.