Your AI feature is live. Suddenly, your primary model starts failing.
❌ Rate limited
❌ Timeout
❌ 500 error
What happens to your users?
The solution is fallback logic: automatically switch to a backup model when the primary fails.
The Problem
import openai

# What happens when this fails?
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": user_input}]
)
# If this times out, the exception propagates and your app crashes 💥
The Solution: Fallback Chain
With AIBridge, you can define a fallback chain: try Model A; if it fails, try Model B, then Model C.
from openai import APIError, OpenAI

client = OpenAI(
    api_key="mb_your_key",
    base_url="https://aibridge-api.com/v1"
)

# Define fallback chain
FALLBACK_CHAIN = [
    "deepseek-v4-pro",    # Primary: Best quality
    "qwen3-235b-a22b",    # Fallback 1: Still high quality
    "glm-4-plus",         # Fallback 2: Reliable alternative
    "deepseek-v4-flash"   # Fallback 3: Fast & cheap
]

def call_with_fallback(messages, fallback_chain=FALLBACK_CHAIN):
    """Try models in order until one succeeds."""
    if not fallback_chain:
        raise ValueError("fallback_chain must contain at least one model")
    last_error = None
    for model in fallback_chain:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                timeout=10  # Per-request timeout in seconds, to prevent hanging
            )
            print(f"✅ Success with {model}")
            return response
        except APIError as e:
            # APIError is the base class for the openai client's API errors
            # (timeouts, connection errors, rate limits, 5xx responses, ...)
            print(f"❌ {model} failed: {e}")
            last_error = e  # Remember the error and try the next model
    # All models failed: surface the last error to the caller
    raise last_error

# Usage
response = call_with_fallback([
    {"role": "user", "content": "Explain quantum computing"}
])
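Not every failure is worth a fallback. The failure modes named at the top (rate limits, timeouts, 500 errors) are transient, whereas a malformed request or a bad API key will usually fail the same way on every model in the chain. Here is a minimal sketch of that distinction, assuming you have the HTTP status code of the failed call (or None when there was no response at all). The name `should_fall_back` and the exact set of codes are illustrative choices, not part of AIBridge or the openai client:

```python
from typing import Optional

# Status codes treated as transient in this sketch (an assumption; tune for your provider).
TRANSIENT_STATUS_CODES = {408, 429}


def should_fall_back(status_code: Optional[int]) -> bool:
    """Return True when trying the next model is likely to help."""
    if status_code is None:
        # No HTTP response at all: timeout or connection failure.
        return True
    if status_code in TRANSIENT_STATUS_CODES:
        return True
    # Any 5xx means the server side failed, not the request itself.
    return 500 <= status_code <= 599


for code in (None, 429, 500, 400, 401):
    print(code, should_fall_back(code))
```

A check like this could sit inside the `except` branch of `call_with_fallback`, re-raising immediately when the error is not one a different model can fix.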
Advanced: Smart Retry with Exponential Backoff
For production, add retry logic with exponential backoff:
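from tenacity import retry, stop_after_attempt, wait_exponential

# Reuses the `client` defined in the fallback chain example
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10),
    reraise=True  # Re-raise the original error instead of tenacity's RetryError
)
def call_ai_with_retry(model, messages):
    return client.chat.completions.create(
        model=model,
        messages=messages,
        timeout=10
    )

# Now automatically retries on failure (up to 3 attempts in total)
messages = [{"role": "user", "content": "Explain quantum computing"}]
response = call_ai_with_retry("deepseek-v4-pro", messages)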
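The two techniques compose: retry each model a few times with backoff, and only then move down the chain. The sketch below uses only the standard library so the delay schedule is visible. It assumes the wait after failed attempt n is `multiplier * 2**(n-1)`, clamped to `[min, max]`, which mirrors the `wait_exponential` parameters above (the exact exponent offset inside tenacity may differ by version). `backoff_delay`, `call_with_retry_and_fallback`, and the `call_model` argument are hypothetical names; `call_model(model, messages)` stands in for the `client.chat.completions.create` call.

```python
import time
from typing import Any, Callable, Sequence


def backoff_delay(attempt: int, multiplier: float = 1,
                  min_s: float = 4, max_s: float = 10) -> float:
    """Seconds to wait after failed attempt `attempt` (1-based), clamped to [min_s, max_s]."""
    raw = multiplier * 2 ** (attempt - 1)
    return max(min_s, min(raw, max_s))


def call_with_retry_and_fallback(
    call_model: Callable[[str, Any], Any],
    models: Sequence[str],
    messages: Any,
    attempts_per_model: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Retry each model with backoff; move to the next model when retries run out."""
    if not models:
        raise ValueError("models must contain at least one entry")
    if attempts_per_model < 1:
        raise ValueError("attempts_per_model must be at least 1")
    last_error = None
    for model in models:
        for attempt in range(1, attempts_per_model + 1):
            try:
                return call_model(model, messages)
            except Exception as exc:  # narrow this to your client's error types in real code
                last_error = exc
                if attempt < attempts_per_model:
                    # Wait before retrying the same model; no wait before switching models.
                    sleep(backoff_delay(attempt))
    raise last_error


# Delays after failed attempts 1..5 with the parameters used above
print([backoff_delay(n) for n in range(1, 6)])  # → [4, 4, 4, 8, 10]
```

Under this assumed formula, three attempts produce only two waits, and both are 4 seconds because `min=4` dominates the early terms. If you want the delay to grow within three attempts, a lower `min` is one option.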
AIBridge Advantage
With direct APIs, fallback means:
❌ Multiple API clients
❌ Multiple base URLs
❌ Different error formats
With AIBridge:
✅ One client
✅ One base URL
✅ Same error format for all models
✅ Switch models instantly
Production Checklist
✅ Fallback chain (primary + at least 2 backups)
✅ Retry logic (exponential backoff)
✅ Timeout handling (prevent hanging)
✅ Error logging (know which model failed)
✅ Graceful degradation (return cached response if all models fail)
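The last two checklist items can be sketched together. This example assumes an in-memory dict keyed by prompt as the cache, and a `generate` callable that wraps the fallback chain and raises once every model has failed. `answer_with_cache`, `generate`, and the logger name are hypothetical; a real deployment would likely want a shared cache with expiry.

```python
import logging
from typing import Callable, Dict, Optional

logger = logging.getLogger("ai_fallback")


def answer_with_cache(
    prompt: str,
    generate: Callable[[str], str],
    cache: Dict[str, str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Return a fresh answer if possible, else the last cached answer for this prompt."""
    try:
        answer = generate(prompt)
    except Exception as exc:
        # Log the failure so the degradation is visible, then serve the cached answer.
        logger.warning("generation failed for %r: %s", prompt, exc)
        return cache.get(prompt, default)
    cache[prompt] = answer  # refresh the cache on every success
    return answer


def broken(prompt: str) -> str:
    """Stand-in for a fallback chain where every model failed."""
    raise RuntimeError("all models failed")


cache: Dict[str, str] = {}
print(answer_with_cache("hi", lambda p: "hello", cache))  # fresh answer, now cached
print(answer_with_cache("hi", broken, cache))             # served from the cache
```

Passing a `default` string gives users a readable message when the prompt has never been answered before, instead of `None`.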
Try it: https://aibridge-api.com
5M free tokens. Build resilient AI features. 🛡️



