DEV Community

Alex Spinov

Ollama Has a Free API — Run LLMs Locally Without OpenAI or Cloud Costs

Run GPT-Level Models on Your Laptop

Ollama lets you run open-source LLMs (Llama 3, Mistral, Gemma, Phi) locally with a simple API. Free forever — no API keys, no rate limits, no cloud.

Setup

# Install
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2  # 2GB, runs on most laptops
ollama pull mistral   # 4GB, great for coding
ollama pull phi3      # 1.7GB, fastest

Native Generate API

import requests

def chat(prompt, model="llama3.2"):
    # Ollama's native one-shot endpoint; stream=False returns a single JSON object
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": model, "prompt": prompt, "stream": False},
                      timeout=120)
    r.raise_for_status()
    return r.json()["response"]

print(chat("Write a Python function to check if a number is prime"))
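With `stream: True` (the endpoint's default), Ollama returns newline-delimited JSON chunks instead of one object, so you can print tokens as they arrive. A minimal sketch; `stream_generate` and `join_chunks` are my own helper names, not part of Ollama:

```python
import json
import requests

def stream_generate(prompt, model="llama3.2"):
    """Yield response fragments as Ollama streams NDJSON chunks."""
    with requests.post("http://localhost:11434/api/generate",
                       json={"model": model, "prompt": prompt, "stream": True},
                       stream=True, timeout=120) as r:
        r.raise_for_status()
        for line in r.iter_lines():
            if not line:
                continue
            chunk = json.loads(line)
            yield chunk.get("response", "")
            if chunk.get("done"):
                break

def join_chunks(ndjson_lines):
    """Assemble a full reply from raw NDJSON lines (pure helper, works offline)."""
    out = []
    for line in ndjson_lines:
        chunk = json.loads(line)
        out.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(out)
```

`for token in stream_generate("Tell me a joke"): print(token, end="", flush=True)` gives a live typing effect in the terminal.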

Chat Conversations

def chat_conversation(messages, model="llama3.2"):
    # Native chat endpoint: takes the full message history, returns one message
    r = requests.post("http://localhost:11434/api/chat",
                      json={"model": model, "messages": messages, "stream": False},
                      timeout=120)
    r.raise_for_status()
    return r.json()["message"]["content"]

response = chat_conversation([
    {"role": "system", "content": "You are a Python expert."},
    {"role": "user", "content": "How do I read a CSV file?"}
])
print(response)
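`/api/chat` is stateless: the model only remembers what you resend, so a multi-turn chat means appending each reply to the history before the next call. A sketch of that loop plus a trimming helper for long chats; `ask` and `trim_history` are hypothetical names of mine:

```python
import requests

def ask(history, user_msg, model="llama3.2"):
    """Append the user turn, call /api/chat, record and return the reply."""
    history.append({"role": "user", "content": user_msg})
    r = requests.post("http://localhost:11434/api/chat",
                      json={"model": model, "messages": history, "stream": False},
                      timeout=120)
    r.raise_for_status()
    reply = r.json()["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

def trim_history(history, max_msgs=10):
    """Keep system prompts plus the last max_msgs other messages.
    Context windows are finite, so long chats need trimming."""
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    return system + rest[-max_msgs:]

history = [{"role": "system", "content": "You are a Python expert."}]
# ask(history, "How do I read a CSV file?")
# ask(history, "And how do I write one?")   # the model sees the first exchange
```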

Use With OpenAI SDK

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Explain Docker in one paragraph"}]
)
print(response.choices[0].message.content)

Switch from OpenAI to a local LLM by pointing the official SDK at Ollama's `base_url`; the `api_key` can be any placeholder string, since the local server ignores it.
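If you stay on the native API, sampling parameters go in an `options` object: `temperature` and `num_predict` (max tokens to generate) are real Ollama option names. A sketch with the request body split into a separately testable helper; the function names are mine:

```python
import requests

def build_payload(prompt, model="llama3.2", temperature=0.8, num_predict=256):
    """Request body for /api/generate; sampling knobs live under `options`."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature, "num_predict": num_predict},
    }

def generate(prompt, **kwargs):
    """POST the payload and return the model's text."""
    r = requests.post("http://localhost:11434/api/generate",
                      json=build_payload(prompt, **kwargs), timeout=120)
    r.raise_for_status()
    return r.json()["response"]

# generate("Summarize Docker in one line", temperature=0.2, num_predict=64)
```

Lower temperatures (0.1-0.3) suit code generation; capping `num_predict` keeps responses fast on modest hardware.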

List Models

def list_models():
    r = requests.get("http://localhost:11434/api/tags")
    # size is reported in bytes; convert to GB with one decimal place
    return [{"name": m["name"], "size": round(m["size"] / 1024**3, 1)}
            for m in r.json()["models"]]

for m in list_models():
    print(f"{m['name']}: {m['size']}GB")

Ollama vs Cloud LLMs

| Feature  | Ollama         | OpenAI        | Claude        |
|----------|----------------|---------------|---------------|
| Cost     | $0             | $0.01-0.06/1K | $0.01-0.08/1K |
| Privacy  | 100% local     | Cloud         | Cloud         |
| Speed    | Depends on GPU | Fast          | Fast          |
| Internet | Not needed     | Required      | Required      |
| Models   | Open source    | Proprietary   | Proprietary   |

Ollama for privacy and cost. Cloud for quality and speed.
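To make the cost row concrete, here is a back-of-the-envelope calculation. The 5M-token monthly volume and the $0.03/1K price are illustrative assumptions, not quotes:

```python
def monthly_cost(tokens_per_month, usd_per_1k_tokens):
    """Cloud spend for a given monthly token volume; the Ollama column is always $0."""
    return tokens_per_month / 1000 * usd_per_1k_tokens

# A team pushing 5M tokens/month at $0.03/1K pays $150/month in the cloud.
print(monthly_cost(5_000_000, 0.03))  # → 150.0
```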




Need custom dev tools, scrapers, or API integrations? I build automation for dev teams. Email spinov001@gmail.com — or explore awesome-web-scraping.


More from me: 10 Dev Tools I Use Daily | 77 Scrapers on a Schedule | 150+ Free APIs
Also: Neon Free Postgres | Vercel Free API | Hetzner 4x More Server
