LocalAI is a self-hosted, OpenAI-compatible API that runs LLMs, image generation, and audio transcription entirely on your hardware. No GPU required — it runs on CPU too.
## What Makes LocalAI Special?
- OpenAI API compatible — drop-in replacement
- No GPU needed — runs on CPU (GPU optional)
- Multi-modal — text, images, audio, embeddings
- Model gallery — one-click model downloads
- Free — open source, self-hosted
## The Hidden API: Full OpenAI Compatibility

```python
from openai import OpenAI

client = OpenAI(base_url='http://localhost:8080/v1', api_key='not-needed')

# Chat completions
response = client.chat.completions.create(
    model='llama3',
    messages=[
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': 'Explain Docker in simple terms.'}
    ],
    temperature=0.7
)
print(response.choices[0].message.content)
```
```python
# Image generation (Stable Diffusion)
image = client.images.generate(
    model='stablediffusion',
    prompt='A futuristic city at sunset, cyberpunk style',
    size='512x512'
)
```
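Depending on configuration, the image endpoint returns either a URL or a base64 payload (`b64_json` in the response data). A minimal sketch of saving a base64 payload to disk; the payload below is a hand-made stand-in, not real model output:

```python
import base64

# Stand-in for image.data[0].b64_json (a real payload would be full PNG bytes).
b64_payload = base64.b64encode(b"\x89PNG fake image bytes").decode("ascii")

def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a base64 image payload, write it to disk, return bytes written."""
    raw = base64.b64decode(b64_data)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)

written = save_b64_image(b64_payload, "city.png")
print(f"wrote {written} bytes to city.png")
```

With a live response, pass `response_format='b64_json'` to `images.generate` and feed `image.data[0].b64_json` into the helper.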
```python
# Audio transcription (Whisper)
transcription = client.audio.transcriptions.create(
    model='whisper-1',
    file=open('meeting.mp3', 'rb')
)
print(transcription.text)
```
```python
# Embeddings
embedding = client.embeddings.create(
    model='text-embedding-ada-002',
    input='What is machine learning?'
)
# Vector length depends on the backing model (e.g. 384 for MiniLM-based models)
print(len(embedding.data[0].embedding))
```
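Once embeddings come back, the usual next step is similarity search. A dependency-free cosine-similarity sketch; the vectors below are toy values standing in for `embedding.data[i].embedding`, not real model output:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings
doc = [0.1, 0.3, 0.5]
query = [0.1, 0.3, 0.5]
unrelated = [0.5, -0.3, 0.1]

print(cosine_similarity(doc, query))       # identical direction -> ~1.0
print(cosine_similarity(doc, unrelated))   # lower score for unrelated text
```

For more than a handful of documents you would precompute and index the vectors, but the scoring math is exactly this.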
## Text-to-Speech API

```python
response = client.audio.speech.create(
    model='tts-1',
    voice='alloy',
    input='Hello! This is generated speech running locally.'
)
response.stream_to_file('output.mp3')
```
## Model Gallery API

```bash
# Browse available models
curl http://localhost:8080/models/available | jq '.[] | .name'

# Install a model
curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{
    "url": "github:go-skynet/model-gallery/llama3.yaml"
  }'
```
## Quick Start

```bash
docker run -p 8080:8080 localai/localai:latest-cpu

# GPU version:
# docker run --gpus all -p 8080:8080 localai/localai:latest-gpu-nvidia-cuda-12
```
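Container startup is not instant, especially while a large model loads, so scripts should wait for the API before sending traffic. A sketch that polls the standard `/v1/models` endpoint until it answers; the tiny stub server exists only so this example runs without a real LocalAI instance (a real deployment would be at `http://localhost:8080` per the docker command above):

```python
import http.server
import threading
import time
import urllib.error
import urllib.request

def wait_for_api(base_url: str, timeout: float = 60.0) -> bool:
    """Poll {base_url}/v1/models until it returns HTTP 200 or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/v1/models", timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; keep polling
        time.sleep(0.2)
    return False

# Stub standing in for a LocalAI container (bound to a free port for the demo).
class _Stub(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"object": "list", "data": []}')

    def log_message(self, *args):  # keep the demo quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), _Stub)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

ready = wait_for_api(f"http://127.0.0.1:{port}", timeout=5)
print("API ready:", ready)
server.shutdown()
```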
## Why Teams Self-Host LocalAI
A CTO shared: "Compliance says no data to external APIs. LocalAI runs on our servers with the same OpenAI SDK our developers already know. We switched the base URL in our config and everything works — chat, embeddings, image generation, all local."
Building AI-powered tools? Email spinov001@gmail.com or check my AI solutions.
Self-hosting AI? LocalAI vs Ollama vs vLLM — what's your choice?