Ollama lets you run LLMs locally with a single command. Llama 3, Mistral, Gemma, Phi — all running on your machine with a REST API that's compatible with OpenAI's format. No cloud, no API keys, no usage fees.
## What Makes Ollama Special?
- **One command** — `ollama run llama3` and you're running AI locally
- **OpenAI-compatible API** — drop-in replacement for GPT
- **Free forever** — runs on your hardware
- **Privacy** — data never leaves your machine
- **Model library** — 100+ models available
## The Hidden API: OpenAI-Compatible Endpoint
```typescript
// Drop-in replacement for the OpenAI SDK
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // required by the SDK but ignored by Ollama
});

// Works exactly like the OpenAI chat API
const response = await openai.chat.completions.create({
  model: 'llama3',
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'Write a Python function to merge two sorted lists.' },
  ],
  temperature: 0.7,
  stream: true,
});

for await (const chunk of response) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```
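If you'd rather not pull in the SDK, the same `/v1/chat/completions` endpoint can be called with plain `fetch`. A minimal sketch follows; `buildChatRequest` and `chatOnce` are illustrative helper names, not part of any library, and the body fields follow the OpenAI chat-completions format shown above.

```typescript
// Sketch: calling Ollama's OpenAI-compatible endpoint without the SDK.
// buildChatRequest and chatOnce are illustrative helpers, not library APIs.

type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Assemble a request body in the OpenAI chat-completions format.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  temperature = 0.7,
): { model: string; messages: ChatMessage[]; temperature: number; stream: false } {
  return { model, messages, temperature, stream: false };
}

// Send one non-streaming request and return the assistant's reply.
async function chatOnce(model: string, messages: ChatMessage[]): Promise<string> {
  const res = await fetch('http://localhost:11434/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatRequest(model, messages)),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

With an Ollama server running, `await chatOnce('llama3', [{ role: 'user', content: 'Hello' }])` resolves to the model's reply as a string.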
## REST API
```bash
# Generate completion
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Explain quantum computing in one paragraph",
  "stream": false
}'

# Chat API
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [{"role": "user", "content": "Hello!"}]
}'

# Embeddings
curl http://localhost:11434/api/embed -d '{
  "model": "nomic-embed-text",
  "input": "What is machine learning?"
}'

# List models
curl http://localhost:11434/api/tags
```
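Unless you pass `"stream": false`, the native endpoints stream newline-delimited JSON: one object per line, with a `done` flag on the final one. A sketch of reassembling a streamed `/api/chat` reply; `collectStream` is an illustrative name, and the `message.content`/`done` field layout assumed here follows Ollama's chat response format.

```typescript
// Sketch: reassembling a streamed /api/chat response.
// Each line of the stream is a JSON object such as:
//   {"message":{"role":"assistant","content":"Hel"},"done":false}
// The final line carries "done": true.

type ChatChunk = { message?: { role: string; content: string }; done: boolean };

// Concatenate the content fragments from raw newline-delimited JSON text.
function collectStream(ndjson: string): string {
  let out = '';
  for (const line of ndjson.split('\n')) {
    if (!line.trim()) continue; // skip blank lines
    const chunk: ChatChunk = JSON.parse(line);
    if (chunk.message) out += chunk.message.content;
    if (chunk.done) break; // last chunk
  }
  return out;
}
```

The same line-by-line parsing applies to `/api/generate`, which streams `response` fragments instead of `message` objects.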
## Model Management API
```bash
# Pull models
ollama pull llama3
ollama pull codellama
ollama pull mistral

# Create custom model
cat > Modelfile << EOF
FROM llama3
SYSTEM "You are a senior Python developer. Always provide type hints and docstrings."
PARAMETER temperature 0.3
PARAMETER num_ctx 8192
EOF

ollama create python-expert -f Modelfile
ollama run python-expert
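Because Modelfiles are plain text, they are easy to generate from configuration. A small sketch, assuming only the `FROM`, `SYSTEM`, and `PARAMETER` directives shown above; `renderModelfile` is an illustrative helper, not part of the Ollama CLI.

```typescript
// Sketch: generating a Modelfile from a config object.
// Uses only the directives shown above: FROM, SYSTEM, PARAMETER.

interface ModelConfig {
  from: string;                                  // base model, e.g. 'llama3'
  system?: string;                               // optional system prompt
  parameters?: Record<string, string | number>;  // e.g. { temperature: 0.3 }
}

function renderModelfile(cfg: ModelConfig): string {
  const lines = [`FROM ${cfg.from}`];
  if (cfg.system) lines.push(`SYSTEM "${cfg.system}"`);
  for (const [key, value] of Object.entries(cfg.parameters ?? {})) {
    lines.push(`PARAMETER ${key} ${value}`);
  }
  return lines.join('\n') + '\n';
}
```

Write the result to a file named `Modelfile`, then run `ollama create python-expert -f Modelfile` as above.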
## Quick Start
```bash
# Install
curl -fsSL https://ollama.com/install.sh | sh

# Run a model
ollama run llama3

# API is ready at localhost:11434
```
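Once the server is up, a quick way to verify it from code is to hit `/api/tags` and read back the installed model names. The sketch below assumes the list endpoint returns a top-level `models` array with `name` fields, as shown by `curl http://localhost:11434/api/tags`; `modelNames` and `installedModels` are illustrative helper names.

```typescript
// Sketch: health-checking a local Ollama server by listing installed models.
// Assumes /api/tags returns a shape like { "models": [{ "name": "llama3:latest" }, ...] }.

type TagsResponse = { models: { name: string }[] };

// Pure helper: pull the model names out of a parsed /api/tags payload.
function modelNames(tags: TagsResponse): string[] {
  return tags.models.map((m) => m.name);
}

// Query the local server (requires Ollama to be running).
async function installedModels(base = 'http://localhost:11434'): Promise<string[]> {
  const res = await fetch(`${base}/api/tags`);
  if (!res.ok) throw new Error(`Ollama not reachable: ${res.status}`);
  return modelNames(await res.json());
}
```

If `installedModels()` throws, the server isn't running; if it returns an empty array, no models have been pulled yet.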
## Why Developers Run Ollama
A developer shared: "Our company can't send code to OpenAI for compliance reasons. Ollama lets us run Code Llama locally with the same API our tools expect. We switched the base URL in our config and everything just worked. Zero cloud dependency, full privacy."
Building AI-powered tools? Email spinov001@gmail.com or check my AI solutions.
Running LLMs locally? How do you weigh Ollama vs LM Studio vs llama.cpp?