Your Python API Calls Will Fail. Here's How to Handle It.

#python #api #resilience #opensource

Every HTTP call you make to a third-party API will eventually fail. The endpoint will time out. The service will rate-limit you. The server will return a 503 for twelve minutes on a Tuesday afternoon.

Most Python codebases handle this with a bare try/except and a hope. That works until it doesn't — and when it doesn't, you get cascading failures, silent data loss, and a service that goes down without telling you why.

I built APIGuard to fix this. It's a small Python library (905 lines of source) that gives you three production resilience patterns — without pulling in a heavy framework.

The Three Patterns

1. Token Bucket Rate Limiting

Rate limits are contracts. Break them and the API cuts you off. A token bucket enforces the contract on your side before you hit the remote server.

import httpx
from apiguard import TokenBucket

# 100 requests allowed, refills at 10/second
bucket = TokenBucket(capacity=100, refill_rate=10.0)

if bucket.acquire(tokens=1):
    response = httpx.get("https://api.example.com/data")
else:
    # Back off — you're at the limit
    pass

The implementation uses time.monotonic(), so refills are computed with sub-second precision and are unaffected by system clock adjustments. It is thread-safe. No background threads, no timers, just math.
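To make the "just math" part concrete, here is a minimal sketch of a lazily refilled token bucket. It is illustrative only and not APIGuard's actual source: the class name SketchTokenBucket and the injectable clock parameter are assumptions made for this example.

```python
import threading
import time


class SketchTokenBucket:
    """Illustrative token bucket (hypothetical; not APIGuard's source)."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate    # tokens added per second
        self._clock = clock               # injectable so tests can fake time
        self._tokens = float(capacity)    # start with a full bucket
        self._last = clock()
        self._lock = threading.Lock()

    def acquire(self, tokens=1.0):
        with self._lock:
            now = self._clock()
            # Lazy refill: credit tokens earned since the last call, capped at capacity.
            elapsed = now - self._last
            self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
            self._last = now
            if self._tokens >= tokens:
                self._tokens -= tokens
                return True
            return False
```

Because the refill happens inside acquire(), nothing runs between calls. The bucket only needs the time of the previous call and the current token count.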

2. Circuit Breaker

A circuit breaker tracks failures. After enough consecutive failures, it stops sending requests entirely — "opening" the circuit. This prevents your app from hammering a dead service and wasting resources (yours and theirs).

It follows a three-state machine: CLOSED (normal) → OPEN (blocking requests) → HALF_OPEN (testing recovery) → back to CLOSED.

from apiguard import CircuitBreaker, CircuitOpenError

breaker = CircuitBreaker(
    failure_threshold=5,     # Open after 5 failures
    recovery_timeout=60.0,   # Try again after 60 seconds
    success_threshold=2      # Need 2 successes to fully close
)

def fetch_with_fallback():
    try:
        with breaker:
            return call_external_api()
    except CircuitOpenError:
        # Circuit is open: skip the call and serve the fallback
        return cached_fallback()

The key detail: the success_threshold in HALF_OPEN state. One lucky success doesn't mean the service is back. You need consecutive successes before the breaker fully closes again.
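The three-state machine can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not APIGuard's implementation: the names SketchBreaker, allow_request, record_success, and record_failure are hypothetical, and locking is left out to keep the transitions readable.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class SketchBreaker:
    """Illustrative three-state breaker (hypothetical; no locking)."""

    def __init__(self, failure_threshold=5, recovery_timeout=60.0,
                 success_threshold=2, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.success_threshold = success_threshold
        self._clock = clock
        self.state = State.CLOSED
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def allow_request(self):
        if self.state is State.OPEN:
            # After the recovery timeout, let trial requests through.
            if self._clock() - self._opened_at >= self.recovery_timeout:
                self.state = State.HALF_OPEN
                self._successes = 0
                return True
            return False
        return True

    def record_success(self):
        if self.state is State.HALF_OPEN:
            self._successes += 1
            if self._successes >= self.success_threshold:
                self.state = State.CLOSED
                self._failures = 0
        else:
            self._failures = 0    # a success resets the consecutive-failure count

    def record_failure(self):
        if self.state is State.HALF_OPEN:
            self._trip()          # any failure during a trial reopens the circuit
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._trip()

    def _trip(self):
        self.state = State.OPEN
        self._opened_at = self._clock()
        self._failures = 0
```

In this sketch a single failure in HALF_OPEN sends the breaker straight back to OPEN, which is one common design choice. Whether APIGuard does the same is not stated in this post.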

3. Retry with Exponential Backoff

Retry sounds simple. It isn't. Naive retry (same delay, unlimited attempts) turns your client into a DDoS tool. Exponential backoff with jitter spreads retry load, and honoring Retry-After headers lets the server set the pace when it asks for one.

from apiguard import RetryHandler

handler = RetryHandler(
    max_retries=3,
    base_delay=1.0,
    jitter=0.5,                              # Randomize delay by up to 50%
    retryable_status_codes={429, 500, 502, 503, 504}
)

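async def fetch(client):
    # client is an httpx.AsyncClient (or compatible) created elsewhere
    async for attempt in handler:
        response = await client.get("/endpoint")
        if response.status_code not in handler.retryable_status_codes:
            break
        await handler.wait(attempt, response)    # Respects Retry-After header
    return response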

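The delay formula: base_delay * 2^(attempt - 1) * (1 + random * jitter), with random a value between 0 and 1. The third attempt at base_delay=1.0 and jitter=0.5 waits between 4 and 6 seconds. Note that RFC 7231 (since superseded by RFC 9110) defines the Retry-After header but does not prescribe a backoff schedule; exponential backoff with jitter is a widely used convention, not part of that standard.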
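The delay formula above translates directly into code. This is a sketch of the arithmetic only: the function name backoff_delay and the rng parameter are assumptions for illustration, and it assumes random is drawn uniformly from [0, 1).

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base_delay: float = 1.0, jitter: float = 0.5,
                  rng: Optional[random.Random] = None) -> float:
    """Delay in seconds before retry number `attempt` (1-based)."""
    rng = rng or random
    return base_delay * 2 ** (attempt - 1) * (1 + rng.random() * jitter)


# With jitter switched off, the schedule is the plain doubling sequence.
print([backoff_delay(a, jitter=0.0) for a in (1, 2, 3)])  # [1.0, 2.0, 4.0]

# With jitter=0.5, the third attempt lands somewhere in [4.0, 6.0).
print(backoff_delay(3))
```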

Composing All Three

The patterns work independently, but the real value is composition. APIGuard ships a client that wires all three together:

from apiguard.adapters.httpx import AsyncRateLimitedClient

async with AsyncRateLimitedClient(
    capacity=50,          # 50 requests in the bucket
    refill_rate=5.0,      # Refill at 5/second
    max_retries=3,
    failure_threshold=5,
    recovery_timeout=30.0
) as client:
    response = await client.get("https://api.stripe.com/v1/charges")

One client. Rate limiting, retry, and circuit breaking. The request either succeeds, retries intelligently, or fails fast with a clear error. No silent failures.
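One plausible way to order the three checks on each attempt is sketched below. This is an assumption about how such a client could be wired, not a description of AsyncRateLimitedClient's internals. It is synchronous for brevity, omits jitter, and every name in it (guarded_call, CircuitOpen, and the callable parameters) is hypothetical.

```python
import time
from typing import Callable

RETRYABLE = {429, 500, 502, 503, 504}


class CircuitOpen(Exception):
    """Hypothetical stand-in for the breaker's 'circuit is open' error."""


def guarded_call(send: Callable[[], int],
                 allow: Callable[[], bool],
                 take_token: Callable[[], bool],
                 record: Callable[[bool], None],
                 max_retries: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Run send() behind a breaker check, a token check, and backoff retries."""
    attempt = 1
    while True:
        if not allow():                          # 1. breaker: fail fast while open
            raise CircuitOpen("circuit is open")
        while not take_token():                  # 2. bucket: wait for a token
            sleep(0.05)
        status = send()                          # 3. the actual request
        ok = status not in RETRYABLE
        record(ok)                               # 4. feed the outcome to the breaker
        if ok or attempt > max_retries:
            return status
        sleep(base_delay * 2 ** (attempt - 1))   # 5. back off before the next try
        attempt += 1
```

Checking the breaker before taking a token means an open circuit never burns rate-limit budget, and recording every outcome lets repeated retryable failures trip the breaker mid-retry.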

Real Integration: Replacing 623 Lines

I built APIGuard while working on GRID, a 190k-line Python AI framework. GRID had a custom circuit breaker implementation — 623 lines of hand-rolled state management, no rate limiting, no retry coordination.

After integrating APIGuard, that 623-line file became a 40-line adapter:

# GRID's FastAPI middleware using APIGuard
from apiguard import CircuitBreaker

class APIGuardCircuitBreakerMiddleware:
    def __init__(self, app, failure_threshold=5, recovery_timeout=60.0):
        self.app = app
        self.breaker = CircuitBreaker(
            failure_threshold=failure_threshold,
            recovery_timeout=recovery_timeout
        )

    async def __call__(self, scope, receive, send):
        with self.breaker:
            await self.app(scope, receive, send)

Same behavior. Fewer lines. Tested independently (106 tests, 100% coverage) instead of tangled into the framework.

What I Learned Building It

Thread safety matters even in async code. The token bucket and circuit breaker both use locks. In production, the same bucket is often shared by many concurrent requests, and not only from coroutines on one event loop: worker threads, thread-pool executors, and synchronous callers can all reach it. Without locks, concurrent updates race and silently over-consume your rate limit.

Jitter is not optional. Without jitter, clients that failed at the same moment all retry at the same moment (the "thundering herd" problem). Even a jitter factor of 0.3 spreads the load enough to matter.
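A quick simulation shows the effect. It assumes 100 clients that all failed at the same instant and are scheduling their second retry with the delay formula from earlier in this post. The function name retry_times and the fixed seed are illustrative choices, not part of APIGuard.

```python
import random


def retry_times(clients, attempt, base_delay, jitter, seed=0):
    """Seconds after a shared failure at which each client fires retry `attempt`."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return [base_delay * 2 ** (attempt - 1) * (1 + rng.random() * jitter)
            for _ in range(clients)]


no_jitter = retry_times(clients=100, attempt=2, base_delay=1.0, jitter=0.0)
with_jitter = retry_times(clients=100, attempt=2, base_delay=1.0, jitter=0.3)

# Without jitter there is exactly one distinct retry instant: everyone at 2.0 s.
print(len(set(no_jitter)))

# With jitter=0.3 the same retries are spread across the 2.0 to 2.6 s window.
print(min(with_jitter), max(with_jitter))
```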

Retry-After headers are a gift. Most rate-limited APIs tell you exactly when to come back. Ignoring this header and using your own backoff schedule means you'll either wait too long or retry too soon. APIGuard checks for it on every retry.
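Retry-After comes in two forms: a number of seconds, or an HTTP date. A minimal parser for both, using only the standard library, might look like the sketch below. The function name retry_after_seconds is hypothetical, and this is not APIGuard's own parsing code.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds requested by a Retry-After value, or None if unusable."""
    if not value:
        return None
    value = value.strip()
    if value.isascii() and value.isdigit():      # delay-seconds form, e.g. "120"
        return float(value)
    try:
        when = parsedate_to_datetime(value)      # HTTP-date form
    except (TypeError, ValueError):
        return None                              # unparseable: caller falls back to backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())  # dates in the past mean "retry now"
```

Returning None for a missing or malformed header lets the caller fall back to its own exponential backoff schedule instead of guessing.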

Keep the dependency surface small. APIGuard's only direct runtime dependency is httpx. No framework opinions, no configuration files, no sprawling dependency tree. pip install grid-apiguard and you're done.

The Numbers

Metric	Value
Source lines	905
Test functions	106
Test coverage	100%
Core patterns	3
Runtime dependencies	1 (httpx)
Python versions	3.11, 3.12, 3.13
License	MIT

When You Should Use This

You call third-party APIs and don't have resilience patterns in place
Your retry logic is a while True loop with time.sleep(1)
You've been rate-limited and your solution was "catch the 429 and retry"
You want circuit breaking without importing a framework

When You Shouldn't

You're already using Tenacity or Stamina and they work fine
Your API calls are internal, low-latency, and rarely fail
You need distributed circuit breaking across multiple instances (APIGuard is single-process)

APIGuard is on PyPI: pip install grid-apiguard

Source is available on request. Built solo, tested end-to-end, used in production in a 190k-line framework.

If you need API integrations built with this level of failure handling, I do this work professionally — Upwork profile.