Ollama is a local LLM runner — download and run open-source AI models on your machine with one command.
## What You Get for Free
- One command — `ollama run llama3` downloads and runs a model
- Many models — Llama 3, Mistral, Gemma, Phi, CodeLlama, and more
- OpenAI-compatible API — drop-in replacement for GPT API calls
- Custom models — create Modelfiles with custom system prompts
- GPU support — NVIDIA, AMD, Apple Silicon acceleration
- Embedding models — run embedding models locally
- Multi-model — run multiple models simultaneously
- Offline — works without internet after download
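
The custom-models bullet above refers to Ollama's Modelfile format. A minimal sketch: the model name `code-reviewer`, the system prompt, and the temperature value are invented for illustration; `FROM`, `SYSTEM`, and `PARAMETER` are standard Modelfile directives.

```shell
# Write a Modelfile: base model plus a custom system prompt
cat > Modelfile <<'EOF'
FROM llama3
SYSTEM "You are a terse code reviewer. Reply in bullet points."
PARAMETER temperature 0.3
EOF

# Build the custom model and chat with it
ollama create code-reviewer -f Modelfile
ollama run code-reviewer
```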
## Quick Start
```bash
# Install
curl -fsSL https://ollama.ai/install.sh | sh

# Run a model (starts an interactive chat)
ollama run llama3

# Use as an API (OpenAI-compatible)
curl http://localhost:11434/v1/chat/completions \
  -d '{"model":"llama3","messages":[{"role":"user","content":"Hello"}]}'
```
```python
# Use with the OpenAI Python SDK
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Write a function to sort a list"}],
)
print(response.choices[0].message.content)
```
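
Besides the OpenAI-compatible endpoint, Ollama's native `/api/chat` endpoint streams its reply as newline-delimited JSON chunks. A minimal sketch of reassembling such a stream — the sample chunks below are fabricated for illustration; the field shape (`message.content`, `done`) follows Ollama's streaming responses:

```python
import json

def collect_stream(lines):
    """Concatenate assistant text from Ollama-style streaming NDJSON chunks."""
    parts = []
    for line in lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        if chunk.get("done"):  # final chunk carries no content
            break
        parts.append(chunk.get("message", {}).get("content", ""))
    return "".join(parts)

# Simulated stream (shape follows Ollama's /api/chat responses)
stream = [
    '{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    '{"message": {"role": "assistant", "content": "lo!"}, "done": false}',
    '{"done": true}',
]
print(collect_stream(stream))  # prints Hello!
```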
## Why Developers Switch from API-Only LLMs
Sending data to OpenAI/Anthropic means your data leaves your machine:
- Privacy — all data stays local, never sent to cloud
- $0 cost — no per-token charges, unlimited usage
- Offline — works without internet
- OpenAI-compatible — swap the `base_url`, keep your code
A developer spent $200/month on GPT-4 API calls for code review. After switching to Ollama + CodeLlama: comparable quality on code-review tasks, $0/month, and the data never leaves the laptop.
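
A $200/month figure like this is easy to sanity-check. A rough back-of-envelope sketch, assuming original GPT-4 pricing of $30 per million input tokens and $60 per million output tokens — the monthly token volumes below are invented for illustration:

```python
def monthly_api_cost(input_tokens, output_tokens,
                     in_price_per_m=30.0, out_price_per_m=60.0):
    """Estimate monthly API spend in USD, given prices per million tokens."""
    return (input_tokens / 1e6) * in_price_per_m \
         + (output_tokens / 1e6) * out_price_per_m

# e.g. ~5M input + 0.8M output tokens of code review per month
cost = monthly_api_cost(5_000_000, 800_000)
print(f"${cost:.0f}/month")  # prints $198/month
```

With a local model the marginal per-token cost is zero; the trade-off moves to hardware and electricity instead.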
## Need Custom Data Solutions?
I build production-grade scrapers and data pipelines for startups, agencies, and research teams.
Browse 88+ ready-made scrapers on Apify — Reddit, HN, LinkedIn, Google, Amazon, and more.
Custom project? Email me: spinov001@gmail.com — fast turnaround, fair pricing.