DEV Community

Alex Spinov
Ollama Has a Free API That Lets You Run LLMs Locally With Zero Cloud Costs

Ollama runs LLMs on your laptop: Llama 3, Mistral, Phi-3, Gemma, all locally, with no API keys, no cloud costs, and no data leaving your machine.

Quick Start

curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3.1
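Once the server is running you can check which models are installed via Ollama's /api/tags endpoint (the same data `ollama list` prints). A minimal TypeScript sketch; the fetch call assumes a local server on the default port, and the response shape below is a subset of the documented fields:

```typescript
// Subset of the /api/tags response shape.
interface TagsResponse {
  models: { name: string; size: number }[];
}

// Pure helper: pull the model names out of a /api/tags payload.
export function modelNames(tags: TagsResponse): string[] {
  return tags.models.map((m) => m.name);
}

// Only works while `ollama serve` is running locally.
export async function listLocalModels(): Promise<string[]> {
  const res = await fetch("http://localhost:11434/api/tags");
  return modelNames((await res.json()) as TagsResponse);
}
```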

REST API (OpenAI-Compatible)

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
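Ollama also has a native API at /api/chat, which streams newline-delimited JSON by default: one object per line carrying a message.content fragment, plus a final record with done: true. A sketch of a parser for that format; the field names follow Ollama's documented response shape, but verify against your server version:

```typescript
// One line of Ollama's native /api/chat stream, e.g.
// {"message":{"role":"assistant","content":"Hel"},"done":false}
interface ChatStreamLine {
  message?: { role: string; content: string };
  done: boolean;
}

// Pure helper: concatenate content fragments from a chunk of
// newline-delimited JSON, skipping blank lines and the final done record.
export function collectStream(ndjson: string): string {
  let out = "";
  for (const line of ndjson.split("\n")) {
    if (!line.trim()) continue;
    const parsed = JSON.parse(line) as ChatStreamLine;
    if (parsed.message) out += parsed.message.content;
  }
  return out;
}

// Usage against a running server (sketch, not run here):
// const res = await fetch("http://localhost:11434/api/chat", {
//   method: "POST",
//   body: JSON.stringify({ model: "llama3.1",
//     messages: [{ role: "user", content: "Hello!" }] }),
// });
// then feed each decoded chunk of res.body through collectStream.
```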

Use it with the OpenAI SDK by pointing baseURL at the local server (any non-empty string works as the API key):

import OpenAI from 'openai'
const client = new OpenAI({ baseURL: 'http://localhost:11434/v1', apiKey: 'ollama' })
const response = await client.chat.completions.create({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Write a haiku about coding' }]
})
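The SDK returns the full completion object, but most of the time you only want the text. A small helper for that; the choices[0].message.content path is the standard OpenAI response shape, which Ollama mirrors:

```typescript
// Minimal slice of the OpenAI chat-completion response shape.
interface ChatCompletionLike {
  choices: { message: { role: string; content: string | null } }[];
}

// Pull the assistant text out of a completion, with a safe fallback
// for an empty choice list or null content.
export function completionText(res: ChatCompletionLike): string {
  return res.choices[0]?.message.content ?? "";
}
```

With the snippet above, `completionText(response)` gives you the haiku as a plain string.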

Popular Models

Model          Size    Use Case
llama3.1:8b    4.7GB   General purpose
mistral        4.1GB   Fast, good quality
codellama      3.8GB   Code generation
gemma2:9b      5.4GB   Google's model

Custom Models (Modelfile)

FROM llama3.1
PARAMETER temperature 0.7
SYSTEM You are a senior software engineer.
ollama create code-assistant -f Modelfile
ollama run code-assistant
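The created model is addressable by name through every endpoint shown above. A tiny helper that builds the request body for it; the model name `code-assistant` comes from the `ollama create` step, and `stream: false` (a single blocking response) is an assumption for this sketch:

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the JSON body for a non-streaming /api/chat call
// to a custom model created from a Modelfile.
export function chatBody(model: string, messages: ChatMessage[]): string {
  return JSON.stringify({ model, messages, stream: false });
}
```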

The Bottom Line

Free, private, offline AI. The OpenAI-compatible API means switching between local and cloud is just a base-URL swap.


Need to automate data collection or build custom scrapers? Check out my Apify actors for ready-made tools, or email spinov001@gmail.com for custom solutions.
