Your app uses OpenAI. OpenAI goes down at 2am. Your users get errors. You could add Anthropic as a fallback, but that means refactoring every LLM call with retry logic, model mapping, and response normalization. Portkey is a unified AI gateway that handles routing, fallbacks, load balancing, and caching across 200+ LLMs through one API.
What Portkey Actually Does
Portkey is an AI gateway that sits between your application and LLM providers. You make one API call to Portkey, and it routes to OpenAI, Anthropic, Google, Mistral, Llama, or any of 200+ models. If one provider fails, it automatically falls back to another. If the same prompt is sent twice, it returns a cached response.
Portkey provides: unified API (same format for all providers), automatic fallbacks, load balancing, semantic caching, request/response logging, rate limiting, budget controls, and a prompt playground.
Open-source gateway (Apache 2.0). Portkey Cloud free tier: 10K requests/month.
Quick Start
Drop-in OpenAI SDK replacement:
from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

client = OpenAI(
    api_key="sk-your-openai-key",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="your-portkey-key",
        provider="openai"
    )
)

# Same OpenAI SDK usage; Portkey proxies transparently
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
Automatic fallback configuration:
from portkey_ai import Portkey

portkey = Portkey(
    api_key="your-portkey-key",
    config={
        "strategy": {"mode": "fallback"},
        "targets": [
            {"provider": "openai", "api_key": "sk-..."},
            {"provider": "anthropic", "api_key": "sk-ant-..."},
            {"provider": "google", "api_key": "AIza..."}
        ]
    }
)

# Tries OpenAI first, falls back to Anthropic, then Google
response = portkey.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Summarize this document"}]
)
3 Practical Use Cases
1. Load Balance Across Providers
portkey = Portkey(
    api_key="your-portkey-key",
    config={
        "strategy": {"mode": "loadbalance"},
        "targets": [
            {"provider": "openai", "weight": 0.7, "api_key": "sk-..."},
            {"provider": "anthropic", "weight": 0.3, "api_key": "sk-ant-..."}
        ]
    }
)
# 70% of requests go to OpenAI, 30% to Anthropic
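The weights translate to a proportional split across requests. A quick client-side simulation makes that concrete (this is a generic sketch of weighted routing, not Portkey's internal algorithm):

```python
import random

def pick_target(targets, rng):
    """Weighted random choice over targets, mirroring how a load
    balancer might route a single request (illustrative only)."""
    return rng.choices(targets, weights=[t["weight"] for t in targets], k=1)[0]

targets = [
    {"provider": "openai", "weight": 0.7},
    {"provider": "anthropic", "weight": 0.3},
]
rng = random.Random(42)
counts = {"openai": 0, "anthropic": 0}
for _ in range(10_000):
    counts[pick_target(targets, rng)["provider"]] += 1
# counts lands near a 7,000 / 3,000 split
```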
2. Semantic Caching
portkey = Portkey(
    api_key="your-portkey-key",
    config={
        "cache": {"mode": "semantic", "max_age": 3600},
        "targets": [{"provider": "openai", "api_key": "sk-..."}]
    }
)
# "What is Python?" and "Tell me about Python" hit the same cache
# Saves tokens and reduces latency to <50ms
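Caching is just another key in the config object, so it composes with routing strategies. Here is a sketch that layers the semantic cache on top of the fallback setup from earlier (assuming, as the config schema suggests, that `cache` and `strategy` can coexist in one config):

```python
# Semantic cache layered on top of fallback routing.
# max_age is in seconds (3600 = 1 hour).
config = {
    "cache": {"mode": "semantic", "max_age": 3600},
    "strategy": {"mode": "fallback"},
    "targets": [
        {"provider": "openai", "api_key": "sk-..."},
        {"provider": "anthropic", "api_key": "sk-ant-..."},
    ],
}
```

A cache hit is served directly by the gateway, so the fallback chain is only consulted on misses.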
3. Budget Controls
portkey = Portkey(
    api_key="your-portkey-key",
    config={
        "targets": [{"provider": "openai", "api_key": "sk-..."}],
        "budget": {"max_cost_per_day": 50.0}  # $50/day cap
    }
)
# Requests after the budget is hit return an error instead of draining your account
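Portkey enforces the cap at the gateway, but the behavior is easy to picture with a client-side equivalent. This is an illustrative sketch of the same idea, not how Portkey implements it:

```python
from datetime import date

class DailyBudget:
    """Client-side daily spend guard (illustrative; Portkey does
    this server-side at the gateway)."""

    def __init__(self, max_cost_per_day):
        self.max_cost = max_cost_per_day
        self.spent = 0.0
        self.day = date.today()

    def charge(self, cost):
        # Reset the counter when the day rolls over
        if date.today() != self.day:
            self.day, self.spent = date.today(), 0.0
        if self.spent + cost > self.max_cost:
            raise RuntimeError("daily budget exceeded")
        self.spent += cost

budget = DailyBudget(50.0)
budget.charge(49.0)       # accepted
try:
    budget.charge(2.0)    # would cross the $50 cap
except RuntimeError:
    print("request rejected, budget intact")
```

The key property is that a rejected request leaves the spend counter untouched, so the cap is a hard ceiling rather than a soft warning.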
Why This Matters
Portkey solves the multi-provider LLM problem every AI team eventually faces. Instead of vendor lock-in to OpenAI, you get automatic failover, cost optimization through caching and routing, and observability across all providers. Because the API is unified, switching or adding providers is a config change rather than a code change.
Need custom data extraction or web scraping solutions? I build production-grade scrapers and data pipelines. Check out my Apify actors or email me at spinov001@gmail.com for custom projects.
Follow me for more free API discoveries every week!