How to Run Ollama on Mac Mini: A Complete Local AI Setup Guide
If you've been looking into how to run Ollama on Mac Mini, you've probably already figured out that the M-series chips make it one of the best local AI hosts money can buy. I set mine up a few weeks ago and it's been running 24/7 without a hiccup — silent, fast, and completely private. Here's exactly what I did.
Why Mac Mini?
The M2 and M4 Mac Minis use a unified memory architecture, meaning the CPU and GPU share the same RAM pool. For local AI workloads this matters a lot: a 16GB M2 Mac Mini runs Llama 3.1 8B comfortably, and a 24GB machine handles Mistral, Gemma 2, and even some 32B quantized models without breaking a sweat.
They're also quiet, energy-efficient (roughly 6-8W at idle), and small enough to sit behind a monitor. For a home AI server, there's not much competition.
Installing Ollama
First, grab the installer from ollama.com. It's a straightforward Mac app install — drag to Applications, done.
Once installed, open Terminal and verify it's running:
ollama --version
You should see something like ollama version 0.3.x. Ollama runs as a background service automatically after installation.
Pulling Your First Model
To run Ollama on Mac Mini effectively, you want to match the model to your RAM. Here's a quick guide:
| RAM | Recommended Models |
|---|---|
| 8GB | Llama 3.2 3B, Phi-3 Mini |
| 16GB | Llama 3.1 8B, Mistral 7B, Gemma 2 9B |
| 24GB+ | Gemma 2 27B (Q4), Mixtral 8x7B |
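As a rough sketch, the table above maps onto a small lookup helper. The tier floors and model names come from the table; the function itself is purely illustrative:

```python
# Illustrative helper mirroring the RAM table above; the tiers and
# model names are from the article, the function is hypothetical.
RECOMMENDED_MODELS = {
    8: ["Llama 3.2 3B", "Phi-3 Mini"],
    16: ["Llama 3.1 8B", "Mistral 7B", "Gemma 2 9B"],
    24: ["Mixtral 8x7B"],  # plus a quantized ~30B model if you have headroom
}

def models_for_ram(ram_gb: int) -> list[str]:
    """Return recommendations from the largest tier that fits the machine."""
    fitting = [tier for tier in RECOMMENDED_MODELS if tier <= ram_gb]
    return RECOMMENDED_MODELS[max(fitting)] if fitting else []
```

A 12GB machine, for instance, falls back to the 8GB tier rather than rounding up.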
Pull a model like this:
ollama pull llama3.1
It downloads to ~/.ollama/models. First pull takes a few minutes depending on model size and your connection.
Test it immediately:
ollama run llama3.1 "Summarise what Ollama is in two sentences."
If it responds, you're up and running.
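The CLI is backed by a REST API on port 11434, so the same prompt can be sent from a script. This sketch uses Ollama's documented `/api/generate` endpoint; `generate` and `build_payload` are hypothetical helper names, and actually calling `generate` assumes the Ollama service is running:

```python
import json
import urllib.request

def build_payload(model: str, prompt: str) -> dict:
    # stream=False makes Ollama return a single JSON object
    # instead of newline-delimited partial responses.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "llama3.1",
             host: str = "http://localhost:11434") -> str:
    """Send one prompt to a running Ollama service and return the reply."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

Every integration later in this post (n8n included) is ultimately just a variation on this one HTTP call.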
Exposing Ollama on Your Local Network
By default, Ollama only listens on localhost:11434. To reach it from other devices on your network (or from n8n running in Docker), you need to change the bind address.
On macOS, you do this by setting an environment variable in launchd, which the Ollama app reads when it starts:
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
Then restart Ollama from the menu bar icon (quit and reopen). You can verify it's listening on all interfaces:
lsof -i :11434
Now any device on your local network can reach Ollama at http://[your-mac-mini-ip]:11434.
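One caveat: `launchctl setenv` only lasts until the next reboot. If you want the binding to survive a restart, one option is a small LaunchAgent that re-applies it at login. This is a sketch; the label and the filename (`~/Library/LaunchAgents/com.example.ollama-host.plist`) are placeholders you'd choose yourself:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <!-- Placeholder label; any unique reverse-DNS name works. -->
  <key>Label</key>
  <string>com.example.ollama-host</string>
  <!-- Re-apply the bind address every time the session starts. -->
  <key>ProgramArguments</key>
  <array>
    <string>/bin/launchctl</string>
    <string>setenv</string>
    <string>OLLAMA_HOST</string>
    <string>0.0.0.0:11434</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
```

After saving it, load it once with `launchctl load ~/Library/LaunchAgents/com.example.ollama-host.plist` (or just reboot), then make sure Ollama starts after it.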
Integrating Ollama with n8n
This is where things get genuinely useful. n8n is a self-hosted workflow automation tool, and it has a native Ollama node. Once your Mac Mini is running Ollama on the local network, you can:
- Trigger workflows from email, webhooks, or schedules
- Pass content to Ollama for summarisation, classification, or drafting
- Route outputs to Notion, Gmail, Slack, or anywhere else
To connect n8n to Ollama, use the "Ollama" credential type and set the base URL to http://[mac-mini-ip]:11434. That's it. No API keys, no rate limits, no cloud costs.
A simple workflow might look like: Gmail trigger → extract email body → Ollama summarise → append to Notion database. Takes about 10 minutes to build.
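If you prefer n8n's generic HTTP Request node over the dedicated Ollama node, the summarise step reduces to a POST to Ollama's `/api/chat` endpoint. The field names below are Ollama's; the system prompt is illustrative, and `emailBody` is a hypothetical field from the previous node, referenced with n8n's `{{ }}` expression syntax:

```json
{
  "model": "llama3.1",
  "stream": false,
  "messages": [
    {
      "role": "system",
      "content": "Summarise the following email in three bullet points."
    },
    {
      "role": "user",
      "content": "{{ $json.emailBody }}"
    }
  ]
}
```

The reply comes back as a single JSON object whose `message.content` field holds the summary, ready to pass to the Notion node.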
Keeping Ollama Running After Restarts
The Ollama app should auto-start on login by default. Double-check by going to System Settings → General → Login Items and confirming Ollama is listed.
If you're running the Mac Mini headless (no monitor), make sure automatic login is enabled so the session starts after a power cycle: System Settings → Users & Groups → Automatic Login. Note that macOS disables automatic login while FileVault disk encryption is turned on.
Performance Tips
- Use quantized models: Q4_K_M variants are the sweet spot — nearly full quality, half the RAM.
- Close memory-hungry apps: Safari with 40 tabs will compete with your model. On a dedicated server this isn't an issue.
- Monitor with Activity Monitor: Check "Memory Pressure" under the Memory tab. Green means you have headroom.
- Concurrent requests: Ollama handles one request at a time by default. For multi-user setups, look into OLLAMA_NUM_PARALLEL.
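For example, parallelism is set the same way as the bind address. The values here are illustrative, not tuned recommendations, and each parallel slot costs extra RAM because the context allocation is multiplied:

```shell
# Allow up to 4 requests to be processed concurrently.
launchctl setenv OLLAMA_NUM_PARALLEL 4

# Optionally keep more than one model resident in memory at once.
launchctl setenv OLLAMA_MAX_LOADED_MODELS 2
```

As with OLLAMA_HOST, quit and reopen Ollama from the menu bar for the change to take effect.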
What I Actually Use This For
Day-to-day, my Mac Mini Ollama setup handles: summarising long emails before I read them, drafting replies, tagging and categorising RSS feeds, and running nightly document processing jobs through n8n. It's become infrastructure I'd miss if it went away.
The n8n integration specifically unlocked a lot — you stop thinking "I'll ask ChatGPT about this" and start thinking "I'll build a workflow for this." Different mental model, much more powerful.
Key Takeaways
- Mac Mini M-series is ideal for local AI: unified memory, low power, always-on
- Ollama installs in minutes — one app, no configuration needed for basic use
- Expose port 11434 on all interfaces to reach Ollama from other local devices
- n8n integration turns your local model into a full automation backend
- Quantized models (Q4_K_M) give near-full quality at half the memory cost
- For headless use, enable automatic login so Ollama survives power cycles
If you want to take this further — including the full n8n workflow setup, model selection guide, and automation templates — I documented the whole stack in a guide here: The Home AI Agent Blueprint.