Claude API Error Handling: Rate Limits, Retries, Patterns

#retries #production

Originally published at claudeguide.io/claude-api-error-handling

Claude API Error Handling: Rate Limits, Retries, and Production Patterns

The Anthropic API returns structured errors with specific HTTP status codes. Knowing which errors to retry, which to log and surface to users, and which indicate bugs in your code is the difference between a production-ready integration and one that silently fails. For general Claude API concepts, see the Claude Agent SDK Guide in 2026.

Error code reference

Each row links to a dedicated troubleshooting page with Python + TypeScript code examples (Korean):

| HTTP status | Error type | Meaning | Action |
| --- | --- | --- | --- |
| 400 | `invalid_request_error` | Malformed request: bad JSON, unsupported parameters, or exceeded context window | Fix the request; do not retry |
| 401 | `authentication_error` | Invalid API key | Check key validity; do not retry |
| 403 | `permission_error` | Valid key but insufficient permissions (e.g. model not enabled) | Check account permissions; do not retry |
| 404 | `not_found_error` | Endpoint or model doesn't exist | Fix model name or endpoint; do not retry |
| 413 | `request_too_large` | Request body exceeds 32MB limit | Use Files API for large attachments |
| 422 | `unprocessable_entity` | Request valid but semantically wrong (e.g. invalid tool schema) | Fix the schema; do not retry |
| 429 | `rate_limit_error` | Too many requests or tokens per minute | Retry with exponential backoff |
| 500 | `api_error` | Internal server error | Retry with backoff, max 3 attempts |
| 529 | `overloaded_error` | API overloaded | Retry with longer backoff |

Additional HTTP status codes

| Status | Type | Quick fix |
| --- | --- | --- |
| 502 | `bad_gateway` | Retry after delays of 3, 10, 30, 60, and 120 seconds |
| 503 | `service_unavailable` | Check status.anthropic.com, then retry with backoff |
| 504 | `gateway_timeout` | Switch to streaming for long outputs |
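The fixed delay schedule in the 502 row can be sketched as a small retry wrapper. This is a minimal illustration, not the article's own code: `retry_on_schedule`, `GATEWAY_RETRY_DELAYS`, and `TransientGatewayError` are hypothetical names, and the exception class stands in for whatever your HTTP client raises on a 502/503/504.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

# Delay schedule in seconds, taken from the 502 row of the table.
GATEWAY_RETRY_DELAYS = (3, 10, 30, 60, 120)


class TransientGatewayError(Exception):
    """Hypothetical stand-in for a 502/503/504 response from the API."""


def retry_on_schedule(
    call: Callable[[], T],
    delays: Sequence[float] = GATEWAY_RETRY_DELAYS,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, waiting for each delay in `delays` after a transient error.

    Makes at most len(delays) + 1 attempts; the final failure propagates.
    """
    for delay in delays:
        try:
            return call()
        except TransientGatewayError:
            sleep(delay)  # wait before the next attempt
    return call()  # last attempt: let any error reach the caller
```

Passing `sleep` as a parameter is a design choice for testability: production code uses the default `time.sleep`, while tests can record the waits instead of actually sleeping.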

Error subtype deep-dives (Korean, code samples)

context_length_exceeded: trimming when the context window is exceeded
invalid_api_key: key format validation + trimming the environment variable
max_tokens: capping at the per-model 8192 limit
model_not_found: latest model identifiers
prompt_too_long: automatic trimming of the accumulated conversation
streaming_error: resume pattern for dropped SSE connections
tool_use_error: validating tool_use ↔ tool_result pairing
vision_error: automatic normalization of image format/size
file_upload_error: Files API + beta header
batch_error: validating the Batch 10K/250MB limits
cache_error: placement of cache_control for Prompt Caching
billing_error: alerts for payment failures or insufficient credit

The critical distinction: 4xx errors (except 429) indicate a problem with your request and should not be retried. 429 and 5xx errors are transient and should be retried. To reduce 400-class errors from oversized contexts, see Claude 1M Context Window for truncation and caching strategies.
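The retry rule above is small enough to express as a single predicate. The sketch below assumes you already have the HTTP status code of the failed response; `should_retry` is a hypothetical helper name, not part of any SDK.

```python
def should_retry(status_code: int) -> bool:
    """Return True if a failed request with this HTTP status is worth retrying.

    Follows the rule stated in the text: 429 and 5xx (including 529) are
    treated as transient; every other 4xx points at the request itself.
    """
    if status_code == 429:
        return True
    return 500 <= status_code <= 599
```

For example, `should_retry(529)` is true while `should_retry(413)` is false, so an oversized request fails fast instead of looping.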

Rate limit errors (429)

Rate limit errors are the most common error in production. Limits are enforced on:

Requests per minute (RPM): number of API calls
Input tokens per minute (ITPM): total input tokens
Output tokens per minute (OTPM): total output tokens

The Retry-After header in the 429 response tells you exactly how many seconds to wait.
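One way to turn that header into a wait time is sketched below, using only the standard library. It assumes the response headers are available as a string mapping. `compute_delay` is a hypothetical helper, and the fallback (exponential backoff with full jitter, capped at `max_delay`) is an assumption for the case where the header is missing or not a plain number, not something the API prescribes.

```python
import random
from typing import Mapping


def compute_delay(
    attempt: int,
    headers: Mapping[str, str],
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; fall back to backoff
    # Assumed fallback: exponential backoff with full jitter, capped.
    ceiling = min(max_delay, base_delay * (2 ** attempt))
    return random.uniform(0.0, ceiling)
```

Honoring the server's hint first avoids retrying before the limit window resets, and the jitter keeps many clients that were throttled together from retrying at the same instant.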

Python:


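```python
import anthropic
import time

client = anthropic.Anthropic()

def call_with_retry(
    messages: list,
    model: str = "claude-sonnet-4-6",
    max_retries: int = 5,
    base_delay: float = 1.0,
) -> anthropic.types.Message:
    # The source listing ends at the signature; this body is a sketch of the
    # pattern described above. max_tokens=1024 is an assumed value.
    for attempt in range(max_retries + 1):
        try:
            return client.messages.create(model=model, max_tokens=1024, messages=messages)
        except anthropic.RateLimitError as err:
            if attempt == max_retries:
                raise  # out of retries: surface the 429 to the caller
            # Prefer the server's Retry-After hint, else back off exponentially.
            hint = err.response.headers.get("retry-after")
            time.sleep(float(hint) if hint else base_delay * 2**attempt)
```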
