Sangmin Lee

Posted on • Originally published at claudeguide.io

Claude API Error Handling: Rate Limits, Retries, Patterns

Originally published at claudeguide.io/claude-api-error-handling

Claude API Error Handling: Rate Limits, Retries, and Production Patterns

The Anthropic API returns structured errors with specific HTTP status codes. Knowing which errors to retry, which to log and surface to users, and which indicate bugs in your code is the difference between a production-ready integration and one that silently fails. For general Claude API concepts, see the Claude Agent SDK Guide in 2026.

Error code reference

Each row links to a dedicated troubleshooting page with Python + TypeScript code examples (Korean):

| HTTP Status | Error type | Meaning | Action |
|---|---|---|---|
| 400 | invalid_request_error | Malformed request: bad JSON, unsupported parameters, exceeded context window | Fix the request; do not retry |
| 401 | authentication_error | Invalid API key | Check key validity; do not retry |
| 403 | permission_error | Valid key but insufficient permissions (e.g. model not enabled) | Check account permissions; do not retry |
| 404 | not_found_error | Endpoint or model doesn't exist | Fix model name or endpoint; do not retry |
| 413 | request_too_large | Request body exceeds 32MB limit | Use Files API for large attachments |
| 422 | unprocessable_entity | Request valid but semantically wrong (e.g. invalid tool schema) | Fix the schema; do not retry |
| 429 | rate_limit_error | Too many requests or tokens per minute | Retry with exponential backoff |
| 500 | api_error | Internal server error | Retry with backoff, max 3 attempts |
| 529 | overloaded_error | API overloaded | Retry with longer backoff |

Additional HTTP status codes

| Status | Type | Quick fix |
|---|---|---|
| 502 | bad_gateway | Retry after 3, 10, 30, 60, 120 s |
| 503 | service_unavailable | Check status.anthropic.com, then retry with backoff |
| 504 | gateway_timeout | Switch to streaming for long outputs |
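The fixed delay schedule listed for 502 can be sketched as a small helper. This is a minimal illustration, not the author's implementation: `retry_on_schedule`, `TransientError`, and the injectable `sleep` parameter are hypothetical names introduced here, and `TransientError` stands in for whatever exception your HTTP client raises on a 502/503/504.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a 502/503/504 response."""


def retry_on_schedule(
    fn: Callable[[], T],
    delays: Sequence[float] = (3, 10, 30, 60, 120),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn; on TransientError wait for the next delay and try again."""
    for delay in delays:
        try:
            return fn()
        except TransientError:
            sleep(delay)
    # Final attempt once the schedule is exhausted; any error propagates.
    return fn()
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production the default `time.sleep` applies.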

Error subtype deep-dives (Korean, code samples)

The critical distinction: 4xx errors (except 429) indicate a problem with your request and should not be retried. 429 and 5xx errors are transient and should be retried. To reduce 400-class errors from oversized contexts, see Claude 1M Context Window for truncation and caching strategies.
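The retry rule above fits in a single predicate. The function name `should_retry` is an assumption made for this sketch; it encodes only the rule stated in the text (429 and 5xx are retryable, everything else is not).

```python
def should_retry(status_code: int) -> bool:
    """Return True for 429 and any 5xx status, False for every other code."""
    return status_code == 429 or 500 <= status_code <= 599
```

With the `anthropic` Python SDK, the status is available as `status_code` on `anthropic.APIStatusError`, so a handler can call `should_retry(err.status_code)` before deciding whether to back off or re-raise.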


Rate limit errors (429)

429 is the most common error in production. Rate limits are enforced on three dimensions:

  • Requests per minute (RPM): number of API calls
  • Input tokens per minute (ITPM): total input tokens
  • Output tokens per minute (OTPM): total output tokens

The Retry-After header in the 429 response tells you exactly how many seconds to wait.
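One way to turn that header into a wait time is sketched below, with exponential backoff plus jitter as a fallback when the header is missing or unparseable. The helper name `retry_delay` and the `max_delay` cap are assumptions for illustration. The sketch also assumes `headers` is a mapping with lowercase keys; a plain `dict` lookup is case-sensitive, whereas HTTP client header objects are usually case-insensitive.

```python
import random
from typing import Mapping


def retry_delay(
    headers: Mapping[str, str],
    attempt: int,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    raw = headers.get("retry-after")
    if raw is not None:
        try:
            # Honor the server's hint when it is a plain number of seconds.
            return max(0.0, float(raw))
        except ValueError:
            pass  # Not numeric; fall through to computed backoff.
    backoff = min(max_delay, base_delay * 2 ** attempt)
    # Full jitter: a random wait in [0, backoff] avoids synchronized retries.
    return random.uniform(0, backoff)
```

For example, `retry_delay({"retry-after": "7"}, attempt=0)` honors the server hint, while `retry_delay({}, attempt=3)` falls back to a random wait of at most 8 seconds.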

Python:


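```python
import anthropic
import time

client = anthropic.Anthropic()

def call_with_retry(
    messages: list,
    model: str = "claude-sonnet-4-6",
    max_retries: int = 5,
    base_delay: float = 1.0,
) -> anthropic.types.Message:
    # Source listing is cut off after the signature; return type and body are an assumed sketch.
    for attempt in range(max_retries + 1):
        try:
            return client.messages.create(model=model, max_tokens=1024, messages=messages)  # max_tokens is illustrative
        except anthropic.RateLimitError as err:
            if attempt == max_retries:
                raise
            # Prefer the server's Retry-After hint; otherwise back off exponentially.
            retry_after = err.response.headers.get("retry-after")
            delay = float(retry_after) if retry_after else base_delay * 2 ** attempt
            time.sleep(delay)
```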
