How to Handle AI API Errors (So Your App Doesn't Crash)

#llm #ai #api #webdev

AI APIs fail. Rate limits, timeouts, 500 errors: it's not if they fail, it's when.

Here's the error handling and retry logic I use in production to keep my app running when AI APIs go down.

The Problem

# What most people do (please don't)
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": prompt}]
)
# ← If this raises (rate limit, timeout, 500), the exception propagates and your app crashes

One failed request = angry users.

The Fix: Retry + Fallback

import time
from openai import OpenAI, RateLimitError, APITimeoutError, InternalServerError

# Assumption: an OpenAI-compatible endpoint that serves all three models below.
# Set base_url / api_key for your provider; OpenAI() reads OPENAI_API_KEY by default.
client = OpenAI()

def ask_ai_with_retry(prompt, max_retries=3):
    """Call AI API with automatic retry + fallback."""

    # Tried in order; each model gets up to max_retries attempts
    models = ["deepseek-v4-pro", "qwen3-235b-a22b", "glm-4-plus"]

    for model in models:
        for attempt in range(max_retries):
            try:
                response = client.chat.completions.create(
                    model=model,
                    messages=[{"role": "user", "content": prompt}]
                )
                return response.choices[0].message.content

            except RateLimitError:
                wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s
                time.sleep(wait_time)

            except APITimeoutError:
                continue  # Retry immediately

            except InternalServerError:
                break  # Try next model

    # Every model failed or ran out of attempts
    return "Sorry, AI service is temporarily unavailable. Please try again."

What This Does

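✅ Automatic retry (up to 3 attempts per model)
✅ Exponential backoff on rate limits (wait 1s, 2s, 4s)
✅ Model fallback (if deepseek-v4-pro fails, try qwen3-235b-a22b, then glm-4-plus)
✅ Graceful degradation (returns a message instead of crashing on rate limits, timeouts, and 500 errors; other exceptions, such as auth errors, still propagate)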
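
Want to see the retry and fallback behavior without hitting a real API? Here's a rough offline sketch of the same control flow. Everything in it is a stand-in I made up for illustration: the `Fake*` exception classes replace the SDK errors, `ask_with_retry` takes the API call as a parameter, and `make_stub` replays scripted failures. Treat it as a way to poke at the logic, not as the production code.

```python
import time

# Hypothetical stand-ins for the SDK exceptions, so this runs with no network.
class FakeRateLimitError(Exception):
    pass

class FakeTimeoutError(Exception):
    pass

class FakeServerError(Exception):
    pass

FALLBACK_MESSAGE = "Sorry, AI service is temporarily unavailable. Please try again."

def ask_with_retry(call_model, prompt, models, max_retries=3, sleep=time.sleep):
    """Same retry + fallback flow as above, with the API call injected."""
    for model in models:
        for attempt in range(max_retries):
            try:
                return call_model(model, prompt)
            except FakeRateLimitError:
                sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s
            except FakeTimeoutError:
                continue  # Retry immediately
            except FakeServerError:
                break  # Try next model
    return FALLBACK_MESSAGE

def make_stub(script):
    """Build a fake API call that replays scripted outcomes per model."""
    calls = []  # Records which model each call went to
    remaining = {model: list(outcomes) for model, outcomes in script.items()}

    def call_model(model, prompt):
        calls.append(model)
        outcome = remaining[model].pop(0)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    return call_model, calls

# First model: rate-limited twice, then a server error.
# Second model: one timeout, then a successful answer.
sleeps = []  # Capture backoff waits instead of actually sleeping
call_model, calls = make_stub({
    "deepseek-v4-pro": [FakeRateLimitError(), FakeRateLimitError(), FakeServerError()],
    "qwen3-235b-a22b": [FakeTimeoutError(), "hello from the fallback model"],
})

answer = ask_with_retry(
    call_model, "Hi", ["deepseek-v4-pro", "qwen3-235b-a22b"], sleep=sleeps.append
)
print(answer)  # hello from the fallback model
print(calls)   # three calls to the first model, then two to the second
print(sleeps)  # [1, 2]
```

Passing `sleep` and `call_model` in as arguments is just a convenience for testing: you can check the backoff schedule and the fallback order in milliseconds instead of waiting for real rate limits.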