DEV Community

Sam Hartley
Sam Hartley

Posted on

Run Your Own AI Server for $0/month with Ollama

You don't need OpenAI. You don't need a $200/month API bill. You can run powerful AI models on hardware you already own — for free.

Here's exactly how I set this up, and why I haven't paid for API credits in months.

Why Local AI?

  • Zero API costs — no per-token billing, no surprise invoices
  • Full privacy — your data never leaves your network
  • No rate limits — run as many queries as your hardware allows
  • Works offline — no internet? No problem
  • No vendor lock-in — switch models, change configs, own your stack

What You Need

Any modern computer works. Here's what different setups can handle:

Hardware RAM Best Models Speed
MacBook M1/M2/M3/M4 8-16GB Qwen 3.5 9B, Llama 3.1 8B Fast ⚡
Gaming PC (RTX 3060+) 16GB+ Qwen 3 Coder 30B, DeepSeek R1 Very Fast 🚀
Old laptop/desktop 8GB+ Phi-3 Mini, Gemma 2B Usable 🐢
Raspberry Pi 5 8GB Tiny models only Slow 🐌

The sweet spot: A used gaming GPU (RTX 3060 12GB) costs ~$150 on eBay and runs 30B parameter models comfortably.

Step 1: Install Ollama (2 minutes)

# macOS or Linux — one command
curl -fsSL https://ollama.com/install.sh | sh

# Windows — download from ollama.com
Enter fullscreen mode Exit fullscreen mode

That's it. No Docker, no Python environments, no dependency hell.

Step 2: Download a Model (5 minutes)

# Fast & capable (recommended starter)
ollama pull qwen3.5:9b

# Code specialist
ollama pull qwen3-coder:30b

# Reasoning powerhouse
ollama pull deepseek-r1:8b
Enter fullscreen mode Exit fullscreen mode

Models download once and run locally forever.

Step 3: Start Using It

Interactive Chat

ollama run qwen3.5:9b

>>> What's the fastest sorting algorithm for nearly-sorted data?
Enter fullscreen mode Exit fullscreen mode

API Access (OpenAI-compatible!)

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5:9b",
    "messages": [{"role": "user", "content": "Explain Docker in 3 sentences"}]
  }'
Enter fullscreen mode Exit fullscreen mode

Yes — it's OpenAI API compatible. Any tool that works with GPT works with Ollama. Just change the base URL.

Step 4: Make It a Server

Want other devices on your network to access it?

# Start Ollama with network access
OLLAMA_HOST=0.0.0.0 ollama serve
Enter fullscreen mode Exit fullscreen mode

Now any device on your network can query http://YOUR_IP:11434.

What I've Built With This:

  • Telegram Bot running 24/7 on a Mac Mini, answering questions via local Qwen 3.5
  • Code Review Agent using Qwen 3 Coder 30B — reviews PRs in ~12 seconds
  • Document Q&A with RAG pipeline — load PDFs, ask questions, get cited answers
  • Garmin Watch Face that fetches stock data (the background service uses local AI for formatting)

The Cost Comparison

Solution Monthly Cost Privacy Speed
OpenAI GPT-4o $20-200+ ❌ Cloud Fast
Anthropic Claude $20-100+ ❌ Cloud Fast
Google Gemini $0-25+ ❌ Cloud Fast
Ollama (Local) $0 ✅ Private Fast

The only cost is electricity — roughly $5-15/month if running 24/7 on a desktop PC.

Pro Tips

  1. Use GPU, not CPU — A $150 used RTX 3060 is 10-15x faster than any CPU
  2. Start with 7-9B models — They're surprisingly capable and fast
  3. Try different models for different tasks — coding, reasoning, and chat each have specialists
  4. Enable the OpenAI-compatible API — instant compatibility with thousands of tools
  5. Set up auto-startsystemctl enable ollama on Linux, launchd on macOS
  6. Run multiple models — I keep 3-4 models loaded and switch based on the task

My Current Setup

I run a 3-machine lab:

Machine Role Model
Mac Mini M4 Quick chat, orchestration Qwen 3 4B
Windows PC (RTX 3060) Heavy inference, coding Qwen 3 Coder 30B
Ubuntu box Fallback, background tasks minicpm-v

Total monthly API cost: $0. Total hardware cost: one $150 used GPU.

Want to Go Deeper?

I write about running AI locally, home lab setups, and turning hardware into income. If you want more of this, drop a comment — I read every one.

Other posts in this series:

Top comments (0)