Ollama runs open-source LLMs locally with a simple API. Run Llama 3, Mistral, Gemma, and more on your machine — no API keys, no cloud costs, no data leaving your network.
Setup
# Install
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model
ollama pull llama3.1
ollama pull mistral
ollama pull codellama
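The installer normally starts Ollama as a background service listening on localhost:11434. If you need a different bind address or model storage location, the server reads a couple of environment variables (set them before starting it):

```
# Optional server settings (defaults shown)
export OLLAMA_HOST=127.0.0.1:11434     # API bind address
export OLLAMA_MODELS=~/.ollama/models  # where pulled models are stored
ollama serve                           # only if Ollama isn't already running as a service
```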
REST API
# Chat completion
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [{"role": "user", "content": "Explain Docker in 3 sentences"}],
  "stream": false
}'
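Without "stream": false, the chat endpoint streams its response as newline-delimited JSON: one object per line, with done: true on the final line. A minimal sketch of collecting a streamed body into text — the sample payload here is hypothetical, but it matches the shape of the real chunks:

```javascript
// Parse an NDJSON response body from /api/chat into an array of chunk objects
function parseChatStream(ndjson) {
  return ndjson
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}

// Hypothetical sample of a streamed response body
const sample = [
  '{"model":"llama3.1","message":{"role":"assistant","content":"Hel"},"done":false}',
  '{"model":"llama3.1","message":{"role":"assistant","content":"lo"},"done":false}',
  '{"model":"llama3.1","message":{"role":"assistant","content":""},"done":true}',
].join('\n');

const chunks = parseChatStream(sample);
const text = chunks.map((c) => c.message.content).join('');
// text is the full assistant reply assembled from the chunks
```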
# Generate (simple completion)
curl http://localhost:11434/api/generate -d '{
  "model": "codellama",
  "prompt": "Write a Python function to merge two sorted lists",
  "stream": false
}'
# Embeddings (a dedicated embedding model such as nomic-embed-text
# usually produces better vectors than a chat model)
curl http://localhost:11434/api/embeddings -d '{
  "model": "llama3.1",
  "prompt": "Machine learning is a subset of AI"
}'
JavaScript Client
import { Ollama } from 'ollama';
const ollama = new Ollama();
// Chat
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'How do I handle errors in async/await?' }
  ]
});
console.log(response.message.content);
// Streaming
const stream = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Write a haiku about programming' }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.message.content);
}
// Embeddings for RAG
const embedding = await ollama.embeddings({
  model: 'llama3.1',
  prompt: 'What is vector search?'
});
// embedding.embedding = [0.123, -0.456, ...]
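Once you have embeddings, retrieval for RAG is usually just "rank chunks by cosine similarity to the query vector." A minimal sketch with toy 3-dimensional vectors — real embeddings from the API have thousands of dimensions:

```javascript
// Cosine similarity between two embedding vectors — the core of
// "find the most relevant chunk" in a RAG pipeline.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors for illustration
const query = [1, 0, 1];
const docA = [1, 0, 1]; // same direction as the query -> similarity 1
const docB = [0, 1, 0]; // orthogonal to the query     -> similarity 0
```

Store each document chunk's embedding once, then at query time embed the question and pick the chunks with the highest similarity.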
OpenAI-Compatible Endpoint
// Works with any OpenAI SDK — just change the base URL
import OpenAI from 'openai';
const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama' // Required but unused
});

const completion = await openai.chat.completions.create({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Hello!' }]
});
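Because only the base URL and model name differ, you can make local-vs-cloud a runtime switch. A sketch — the env-var names and model choices here are my own conventions, not part of either SDK:

```javascript
// Resolve connection settings for the OpenAI SDK from the environment.
// USE_LOCAL_LLM=1 routes traffic to the local Ollama server instead of the cloud.
function resolveLLMConfig(env) {
  if (env.USE_LOCAL_LLM === '1') {
    return {
      baseURL: 'http://localhost:11434/v1',
      apiKey: 'ollama', // required by the SDK but ignored by Ollama
      model: 'llama3.1',
    };
  }
  return {
    baseURL: 'https://api.openai.com/v1',
    apiKey: env.OPENAI_API_KEY,
    model: 'gpt-4o-mini',
  };
}

const config = resolveLLMConfig({ USE_LOCAL_LLM: '1' });
// Pass config.baseURL and config.apiKey to `new OpenAI(...)` as above
```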
Model Management
ollama list # Show downloaded models
ollama show llama3.1 # Model details
ollama rm mistral # Remove a model
ollama cp llama3.1 my-model # Copy/customize
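Customization goes through a Modelfile, which layers parameters and a system prompt on top of a base model. A minimal example (the code-reviewer name is just an illustration):

```
FROM llama3.1
PARAMETER temperature 0.3
SYSTEM You are a concise code reviewer. Point out bugs before style issues.
```

Save it as Modelfile, then build and run it:

```
ollama create code-reviewer -f Modelfile
ollama run code-reviewer
```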
Why This Matters
- Privacy: No data leaves your machine
- Free: No API costs, no rate limits
- Fast: GPU-accelerated inference
- OpenAI compatible: Swap cloud AI for local with one URL change
- Offline: Works without internet after model download
Need custom AI tools or local LLM integrations? I build developer tools. Check out my web scraping actors on Apify or reach out at spinov001@gmail.com for custom solutions.