
Alex Spinov

LocalAI Has a Free API — Run AI Models Locally Without GPU

LocalAI is a free, open-source OpenAI-compatible API that runs AI models locally. LLMs, image generation, audio transcription — all without a GPU.

What Is LocalAI?

LocalAI is a drop-in replacement for the OpenAI API that runs entirely on your own hardware. It supports text generation, embeddings, image generation, audio, and more.

Features:

  • OpenAI API compatible
  • Runs on CPU (GPU optional)
  • Text, images, audio, embeddings
  • GGUF model support
  • No cloud, no API keys, no costs

Quick Start

docker run -p 8080:8080 localai/localai:latest
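Once the container is up, you can sanity-check it from Python before wiring anything else to it. A minimal sketch using only the standard library — the `/v1/models` endpoint is part of the OpenAI-compatible surface, and the host and port assume the docker command above:

```python
import urllib.request

def localai_url(path, host="localhost", port=8080):
    """Build a URL for the local LocalAI instance (port from the docker run above)."""
    return f"http://{host}:{port}{path}"

def list_models(timeout=5):
    """Query the OpenAI-compatible /v1/models endpoint to confirm the server is up."""
    with urllib.request.urlopen(localai_url("/v1/models"), timeout=timeout) as resp:
        return resp.read().decode()
```

If the server is running, `list_models()` returns a JSON string describing the models LocalAI currently has available.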

OpenAI-Compatible API

# Chat completion
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello!"}]}'

# Embeddings
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-ada-002","input":"Hello world"}'

# Image generation
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"prompt":"A sunset over mountains","size":"512x512"}'

# Audio transcription
curl http://localhost:8080/v1/audio/transcriptions \
  -F file=@audio.mp3 -F model=whisper-1
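The same requests are easy to script. Here is a hedged sketch of a generic JSON POST helper plus the chat payload from the curl example above, using only the standard library — note the model name just has to match a model you have configured in LocalAI:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default port from the docker command above

def chat_payload(prompt, model="gpt-4"):
    """Build the same JSON body as the chat-completion curl example."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def post_json(path, payload):
    """POST a JSON payload to an OpenAI-compatible endpoint and decode the reply."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# e.g. post_json("/v1/chat/completions", chat_payload("Hello!"))
```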

Use with OpenAI SDK

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is Docker?"}]
)
print(response.choices[0].message.content)

Use Cases

  1. Privacy — data never leaves your machine
  2. Cost savings — no API bills
  3. Offline AI — works without internet
  4. Development — mock OpenAI API locally
  5. Edge deployment — AI on embedded devices
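The development use case usually comes down to a single switch in configuration: point the client at LocalAI in development and at the cloud in production. A minimal sketch — the `AI_BACKEND` variable name here is my own, not a LocalAI convention:

```python
import os
from dataclasses import dataclass

@dataclass
class AIConfig:
    base_url: str
    api_key: str

def load_ai_config():
    """Select LocalAI for development, the OpenAI cloud otherwise.

    AI_BACKEND is a hypothetical env var for this sketch, not a LocalAI setting.
    """
    if os.environ.get("AI_BACKEND", "local") == "local":
        return AIConfig(base_url="http://localhost:8080/v1", api_key="not-needed")
    return AIConfig(
        base_url="https://api.openai.com/v1",
        api_key=os.environ.get("OPENAI_API_KEY", ""),
    )
```

Pass `base_url` and `api_key` straight into the OpenAI client constructor shown earlier; nothing else in the calling code changes.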

Need web data at scale? Check out my scraping tools on Apify or email spinov001@gmail.com for custom solutions.
