LiteLLM is a unified API gateway that lets you call 100+ LLM providers using the same OpenAI-compatible format. Switch between OpenAI, Anthropic, Bedrock, Vertex AI, Ollama, and more — without changing your code.
Free, open source, Python-native. Used by thousands of companies for LLM routing.
## Why Use LiteLLM?
- One interface, 100+ providers — OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure, Cohere, Replicate, Ollama, and more
- OpenAI-compatible proxy — deploy as a server, use with any OpenAI SDK
- Cost tracking — track spend per model, per user, per team
- Load balancing — route between multiple API keys/deployments
- Fallbacks — automatic retry with different providers
- Rate limiting — per-user and per-model rate limits
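The fallback behavior above can be sketched in plain Python. This is a minimal illustration of the retry-with-fallback pattern that LiteLLM automates, not LiteLLM internals; `call_model` is a hypothetical stand-in for a provider call.

```python
def call_with_fallbacks(prompt, models, call_model):
    """Try each model in order; return the first successful response."""
    last_error = None
    for model in models:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # a real router would only retry transient errors
            last_error = exc
    raise RuntimeError(f"All models failed: {last_error}")


# Hypothetical provider call: the primary model is rate limited, the fallback works.
def flaky(model, prompt):
    if model == "gpt-4o":
        raise TimeoutError("rate limited")
    return f"{model}: ok"


print(call_with_fallbacks("Hi", ["gpt-4o", "claude"], flaky))
```

LiteLLM does the equivalent for you when you configure `fallbacks` in the router settings (see the proxy config below).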
## Quick Setup

### 1. Install

```bash
# Quotes protect the extra from shells like zsh that expand [...]
pip install 'litellm[proxy]'

# Start the proxy server
litellm --model gpt-4o --port 4000
```
### 2. Use as a Python Library

```python
import os

from litellm import completion

os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

# Same function, different providers
response = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is web scraping?"}],
)
print(response.choices[0].message.content)

# Switch to Claude — same code!
response = completion(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "What is web scraping?"}],
)
print(response.choices[0].message.content)

# Use Ollama (local)
response = completion(
    model="ollama/llama3.1",
    messages=[{"role": "user", "content": "What is web scraping?"}],
    api_base="http://localhost:11434",
)
print(response.choices[0].message.content)
```
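Note the `ollama/` prefix: LiteLLM routes on the provider prefix in the model string. A minimal sketch of that convention, assuming an `openai` default when no prefix is given (LiteLLM's actual provider inference is more involved):

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model).

    Assumes bare model names belong to OpenAI, which is only a
    simplification for this sketch.
    """
    if "/" in model:
        provider, _, name = model.partition("/")
        return provider, name
    return "openai", model


print(split_model_string("ollama/llama3.1"))
print(split_model_string("gpt-4o"))
```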
### 3. Proxy Server with Config

```yaml
# config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: gpt-4o
      api_key: sk-...
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_base: https://my-azure.openai.azure.com
      api_key: ...
  - model_name: claude
    litellm_params:
      model: claude-3-5-sonnet-20241022
      api_key: sk-ant-...
  - model_name: local
    litellm_params:
      model: ollama/llama3.1
      api_base: http://localhost:11434

router_settings:
  routing_strategy: least-busy
  num_retries: 3
  fallbacks: [{"gpt-4": ["claude"]}]
```

Note that two entries share the `model_name` `gpt-4` — the router load-balances requests for that alias across both deployments.

```bash
litellm --config config.yaml --port 4000
```
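The `least-busy` strategy sends each request to the deployment with the fewest in-flight requests. A minimal sketch of that selection logic, with request counts held in a plain dict (this is an illustration, not LiteLLM's internal data structure):

```python
def pick_least_busy(in_flight: dict[str, int]) -> str:
    """Return the deployment name with the fewest in-flight requests."""
    return min(in_flight, key=in_flight.get)


# Two deployments behind the 'gpt-4' alias, as in the config above
counts = {"openai/gpt-4o": 4, "azure/gpt-4": 1}
print(pick_least_busy(counts))
```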
### 4. Query the Proxy

```bash
# Uses the OpenAI request format
curl -s http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-litellm-master-key" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Compare Playwright vs Puppeteer"}]
  }' | jq '.choices[0].message.content'

# Switch to Claude — same endpoint
curl -s http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-litellm-master-key" \
  -d '{"model": "claude", "messages": [{"role": "user", "content": "Hello"}]}' \
  | jq '.choices[0].message.content'
```
### 5. Cost Tracking

```python
from litellm import completion, completion_cost

response = completion(model="gpt-4o", messages=[{"role": "user", "content": "Hi"}])
print(f"Cost: ${completion_cost(completion_response=response):.6f}")
print(f"Tokens: {response.usage.total_tokens}")
```
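Under the hood, cost tracking is per-token price arithmetic: input and output tokens are billed at different rates. A sketch with hypothetical prices (real per-model prices vary by provider and change over time; LiteLLM ships its own maintained price table):

```python
# Hypothetical USD prices per million tokens — illustration only
PRICES = {"gpt-4o": {"input": 2.50, "output": 10.00}}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD from token counts and per-million prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


print(f"${estimate_cost('gpt-4o', 1000, 500):.6f}")
```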
## Key Proxy Endpoints
| Endpoint | Description |
|---|---|
| /v1/chat/completions | Chat completions |
| /v1/completions | Text completion |
| /v1/embeddings | Embeddings |
| /v1/models | List available models |
| /v1/images/generations | Image generation |
| /spend/logs | Spending logs |
| /model/info | Model configuration |
| /health | Health check |
## Supported Providers (100+)
| Provider | Model Format |
|---|---|
| OpenAI | gpt-4o, gpt-4o-mini |
| Anthropic | claude-3-5-sonnet-... |
| AWS Bedrock | bedrock/anthropic.claude-v2 |
| Google Vertex | vertex_ai/gemini-pro |
| Azure | azure/gpt-4 |
| Ollama | ollama/llama3.1 |
| Replicate | replicate/model-name |
| Cohere | command-r-plus |