DEV Community

Akash Raidas

10 Frustrating API Errors & What They Actually Mean

AI APIs power everything now—chatbots, code assistants, image generators, data analyzers. You send a request, the model processes it, you get a response. Simple, until it breaks.

The error messages are vague by design. Security reasons, mostly. But every error code exists because someone anticipated that failure mode. The problem? The error tells you what broke, not why or how to fix it.

Most errors aren't bugs. They're guardrails. Rate limits protect infrastructure. Token limits manage compute costs. Timeouts prevent runaway processes. You hit these because you're pushing the system—which is normal when building.

Here's what those codes actually mean and how to fix them.


1. Error 429: Rate Limit Exceeded

What it means: You're sending too many requests too fast. Most APIs have request limits per minute or hour.

The fix: Implement exponential backoff. Add a delay between requests that increases with each retry.

import time

from openai import RateLimitError  # or your provider's equivalent exception

def api_call_with_backoff(func, max_retries=5):
    for i in range(max_retries):
        try:
            return func()
        except RateLimitError:
            wait = 2 ** i  # 1s, 2s, 4s, 8s, 16s
            time.sleep(wait)
    raise Exception("Max retries exceeded")

Check your provider's rate limits: OpenAI Rate Limits, Anthropic Rate Limits
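Plain exponential backoff retries on a fixed 1-2-4-8 schedule, so many clients that got rate-limited together all retry together. Adding random jitter spreads the retries out. A minimal sketch, with a flaky stub standing in for the real API call:

```python
import random
import time

def backoff_with_jitter(func, max_retries=5, base=1.0, cap=30.0):
    """Retry func with exponential backoff plus random jitter.

    Jitter spreads retries out so clients rate-limited at the same
    moment don't all hammer the API again at the same instant.
    """
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise
            wait = min(cap, base * 2 ** attempt)
            time.sleep(random.uniform(0, wait))

# Exercised with a stub that fails twice, then succeeds
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise Exception("rate limited")
    return "ok"

result = backoff_with_jitter(flaky, base=0.01)
print(result, calls["n"])  # ok 3
```

In production you'd catch only the provider's rate-limit exception rather than bare `Exception`, so genuine bugs still surface immediately.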


2. Error 401: Unauthorized

What it means: Your API key is invalid, expired, or not loaded correctly.

The fix:

  • Verify your .env file exists and the key is spelled correctly
  • Check if you're calling load_dotenv() before using the key
  • Regenerate the key if it's old—some expire
from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    raise ValueError("API key not found")

3. Error 400: Bad Request (Context Window Overflow)

What it means: You've exceeded the model's context window. Your prompt + conversation history is too large.

The fix: Count your tokens before sending. Trim old messages or summarize them.

import tiktoken

def count_tokens(text, model="gpt-4"):
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

# Keep context under the limit: truncate on token boundaries,
# not characters (slicing the string would mix the two units)
encoding = tiktoken.encoding_for_model("gpt-4")
tokens = encoding.encode(prompt)
if len(tokens) > 8000:
    prompt = encoding.decode(tokens[-6000:])  # keep the most recent 6000 tokens

Token limits by model: OpenAI Models, Anthropic Models
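For chat apps, the cleaner fix is to drop the oldest messages until the history fits a token budget, keeping the system prompt intact. A sketch using a crude characters-per-token stand-in (swap in tiktoken for real counts):

```python
def rough_token_count(text):
    # Crude stand-in: ~4 characters per token; use tiktoken for real counts
    return max(1, len(text) // 4)

def trim_history(messages, budget=8000):
    """Drop oldest non-system messages until the history fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(rough_token_count(m["content"]) for m in system + rest) > budget:
        rest.pop(0)  # drop the oldest message first
    return system + rest

history = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "x" * 40000},   # ~10,000 tokens, blows the budget
    {"role": "user", "content": "latest question"},
]
trimmed = trim_history(history, budget=8000)
print([m["content"][:20] for m in trimmed])
```

Summarizing the dropped messages into a single short message is the next step up from deleting them outright.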


4. Timeout Error

What it means: The model is taking longer than your client allows. Complex prompts or long outputs can trigger this.

The fix: Increase the timeout parameter in your HTTP client.

from openai import OpenAI

# Current SDK: set the timeout on the client, in seconds
client = OpenAI(timeout=120)

# Or with raw requests
import requests

response = requests.post(url, json=data, timeout=120)

If timeouts persist, simplify your prompt or reduce max_tokens.
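If you'd rather retry than just fail, a small wrapper can escalate the timeout on each attempt. A sketch assuming the callable accepts a `timeout` argument (the stub simulates a server that needs at least 60 seconds):

```python
def call_with_growing_timeout(func, timeouts=(30, 60, 120)):
    """Try the call with each timeout in turn, escalating on TimeoutError."""
    last_err = None
    for t in timeouts:
        try:
            return func(timeout=t)
        except TimeoutError as e:
            last_err = e
    raise last_err

# Stub: "succeeds" only when given at least 60 seconds
def slow_api(timeout):
    if timeout < 60:
        raise TimeoutError(f"timed out after {timeout}s")
    return "response"

result = call_with_growing_timeout(slow_api)
print(result)  # response
```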


5. Invalid JSON Response

What it means: You asked for structured output, but the model returned plain text or malformed JSON.

The fix: Use JSON mode or structured outputs. Most modern APIs support forcing JSON responses.

response = client.chat.completions.create(
    model="gpt-4o",  # JSON mode requires a model that supports it
    messages=[{"role": "user", "content": "List 3 colors as JSON"}],
    response_format={"type": "json_object"}
)

Docs: OpenAI JSON Mode, Anthropic Tool Use
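Even with JSON mode, a defensive parse is worth having: models without it sometimes wrap valid JSON in markdown code fences. A small fallback sketch:

```python
import json
import re

def parse_model_json(text):
    """Parse model output as JSON, stripping markdown code fences if present."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Strip leading ```json and trailing ``` fences, then retry
        stripped = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
        return json.loads(stripped)

result = parse_model_json('```json\n{"colors": ["red", "green", "blue"]}\n```')
print(result["colors"])  # ['red', 'green', 'blue']
```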


6. Error 500: Internal Server Error

What it means: The API provider's server failed. This is on their end, not yours.

The fix: Retry with exponential backoff. If it persists, check the provider's status page.

Implement retry logic:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[500, 502, 503],
    allowed_methods=None,  # retry POST too; by default only idempotent methods retry
)
adapter = HTTPAdapter(max_retries=retry)
session.mount('https://', adapter)

7. Error 413: Payload Too Large

What it means: Your request body is too big. Usually happens when uploading large files or sending huge prompts.

The fix: Compress images, chunk large files, or paginate your data.

from PIL import Image

# Compress image before sending
img = Image.open("large_image.jpg")
img.save("compressed.jpg", quality=85, optimize=True)
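For text payloads, the equivalent fix is chunking: split the input and send one piece per request. A minimal sketch that splits on a character budget without breaking words (it assumes chunks can be processed independently):

```python
def chunk_text(text, max_chars=1000):
    """Split text into chunks under max_chars, breaking on whitespace."""
    chunks, current, length = [], [], 0
    for word in text.split():
        # +1 accounts for the joining space
        if current and length + len(word) + 1 > max_chars:
            chunks.append(" ".join(current))
            current, length = [], 0
        current.append(word)
        length += len(word) + (1 if length else 0)
    if current:
        chunks.append(" ".join(current))
    return chunks

chunks = chunk_text("word " * 1000, max_chars=100)
print(len(chunks), max(len(c) for c in chunks))  # 50 99
```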

8. Error 503: Service Unavailable

What it means: The API is temporarily down or overloaded. High traffic or maintenance.

The fix: Implement retry logic with exponential backoff. Check status pages (linked in #6).

Add circuit breaker pattern for production:

class CircuitBreaker:
    def __init__(self, failure_threshold=5):
        self.failure_count = 0
        self.threshold = failure_threshold
        self.is_open = False

    def call(self, func):
        if self.is_open:
            raise Exception("Circuit breaker is open")
        try:
            result = func()
            self.failure_count = 0  # a success resets the count
            return result
        except Exception:  # not a bare except: don't swallow KeyboardInterrupt
            self.failure_count += 1
            if self.failure_count >= self.threshold:
                self.is_open = True
            raise

9. Connection Reset / EOF Error

What it means: The connection dropped mid-response. Network instability or server-side issue.

The fix: Use streaming for long responses. Reconnect and resume if possible.

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Streaming docs: OpenAI Streaming
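Streaming also means a dropped connection loses only the tail of the response, not all of it, if you accumulate chunks as they arrive. Sketched with a stub generator in place of the real API stream:

```python
def stream_with_accumulator(chunks):
    """Collect streamed chunks so partial output survives a dropped connection."""
    received = []
    try:
        for chunk in chunks:
            received.append(chunk)
    except ConnectionError:
        pass  # keep whatever arrived before the drop
    return "".join(received)

# Stub stream that dies mid-response
def fake_stream():
    yield "Hello, "
    yield "world"
    raise ConnectionError("connection reset")

text = stream_with_accumulator(fake_stream())
print(text)  # Hello, world
```

In a real client you could feed the partial text back as context in a follow-up request asking the model to continue.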


10. Model Not Found (404)

What it means: You're calling a model that doesn't exist or you don't have access to.

The fix: Check the model name spelling. Verify you have access (some models require waitlist approval).

# Common typos
# ❌ "gpt4"
# ❌ "claude-3-opus"
# ✅ "gpt-4"
# ✅ "claude-3-opus-20240229"

List available models:

# OpenAI (current SDK)
from openai import OpenAI

client = OpenAI()
for model in client.models.list():
    print(model.id)

Model availability: OpenAI, Anthropic


Those are the ten API errors you'll actually hit. Each one exists because someone anticipated that failure mode. Now you know what triggers them and how to work around them.

For more updates, follow me here on DEV.

