Every HTTP call you make to a third-party API will eventually fail. The endpoint will time out. The service will rate-limit you. The server will return a 503 for twelve minutes on a Tuesday afternoon.
Most Python codebases handle this with a bare try/except and a hope. That works until it doesn't — and when it doesn't, you get cascading failures, silent data loss, and a service that goes down without telling you why.
I built APIGuard to fix this. It's a small Python library (905 lines of source) that gives you three production resilience patterns — without pulling in a heavy framework.
The Three Patterns
1. Token Bucket Rate Limiting
Rate limits are contracts. Break them and the API cuts you off. A token bucket enforces the contract on your side before you hit the remote server.
    import httpx
    from apiguard import TokenBucket

    # 100 requests allowed, refills at 10/second
    bucket = TokenBucket(capacity=100, refill_rate=10.0)

    if bucket.acquire(tokens=1):
        response = httpx.get("https://api.example.com/data")
    else:
        # Back off: you're at the limit
        pass
The implementation uses time.monotonic() for sub-second precision and is thread-safe. No background threads, no timers — just math.
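To make "just math" concrete, here is a minimal sketch of a lazily refilled token bucket. It is an illustration of the technique, not APIGuard's actual source: the class name SketchTokenBucket and the injectable clock parameter are assumptions added so the example can be tested without sleeping.

```python
import threading
import time


class SketchTokenBucket:
    """Illustrative token bucket (hypothetical; not APIGuard's source)."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock             # injectable so tests can fake time
        self._tokens = capacity         # start with a full bucket
        self._last = clock()
        self._lock = threading.Lock()

    def acquire(self, tokens: float = 1) -> bool:
        """Take `tokens` and return True if available, else return False."""
        with self._lock:
            now = self._clock()
            # Lazy refill: credit whatever was earned since the last call,
            # capped at capacity. No background thread or timer is needed.
            elapsed = now - self._last
            self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
            self._last = now
            if self._tokens >= tokens:
                self._tokens -= tokens
                return True
            return False
```

The refill happens inside acquire(), so an idle bucket costs nothing. The monotonic clock matters because wall-clock time can jump backwards (NTP adjustments, manual changes), which would otherwise produce negative elapsed values.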
2. Circuit Breaker
A circuit breaker tracks failures. After enough consecutive failures, it stops sending requests entirely — "opening" the circuit. This prevents your app from hammering a dead service and wasting resources (yours and theirs).
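It follows a three-state machine: CLOSED (normal) → OPEN (blocking requests) → HALF_OPEN (testing recovery) → back to CLOSED. In the standard pattern, a failed trial request during HALF_OPEN sends the breaker straight back to OPEN.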
    from apiguard import CircuitBreaker, CircuitOpenError

    breaker = CircuitBreaker(
        failure_threshold=5,    # Open after 5 failures
        recovery_timeout=60.0,  # Try again after 60 seconds
        success_threshold=2,    # Need 2 successes to fully close
    )

    try:
        with breaker:
            result = call_external_api()
    except CircuitOpenError:
        # Circuit is open: don't even try, serve the fallback instead
        result = cached_fallback()
The key detail: the success_threshold in HALF_OPEN state. One lucky success doesn't mean the service is back. You need consecutive successes before the breaker fully closes again.
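The state transitions are easier to see in code. The following is a minimal sketch of a three-state breaker with a success threshold, written for illustration only. The class SketchBreaker, its method names (allow_request, record_success, record_failure), and the injectable clock are all hypothetical and are not APIGuard's API. Locking is left out to keep the transitions readable.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class SketchBreaker:
    """Illustrative three-state breaker (hypothetical; not APIGuard's source)."""

    def __init__(self, failure_threshold=5, recovery_timeout=60.0,
                 success_threshold=2, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.success_threshold = success_threshold
        self._clock = clock
        self.state = State.CLOSED
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state is State.OPEN:
            # Once the recovery timeout has passed, let trial requests through.
            if self._clock() - self._opened_at >= self.recovery_timeout:
                self.state = State.HALF_OPEN
                self._successes = 0
            else:
                return False
        return True

    def record_success(self) -> None:
        if self.state is State.HALF_OPEN:
            self._successes += 1
            # One lucky success is not enough; wait for the full threshold.
            if self._successes >= self.success_threshold:
                self.state = State.CLOSED
                self._failures = 0
        else:
            self._failures = 0  # a success resets the consecutive-failure count

    def record_failure(self) -> None:
        if self.state is State.HALF_OPEN:
            self._trip()  # any failure while probing reopens the circuit
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._trip()

    def _trip(self) -> None:
        self.state = State.OPEN
        self._opened_at = self._clock()
        self._failures = 0
```

A caller would check allow_request() before each call and report the outcome afterwards. A context manager like the one shown earlier is a natural wrapper around those three steps.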
3. Retry with Exponential Backoff
Retry sounds simple. It isn't. Naive retry (same delay, unlimited attempts) turns your client into a DDoS tool. Exponential backoff with jitter spreads retry load, and honoring Retry-After headers lets the server tell you when to come back.
    from apiguard import RetryHandler

    handler = RetryHandler(
        max_retries=3,
        base_delay=1.0,
        jitter=0.5,  # Randomize delay by up to 50%
        retryable_status_codes={429, 500, 502, 503, 504}
    )

    async for attempt in handler:
        response = await client.get("/endpoint")
        if response.status_code not in handler.retryable_status_codes:
            break
        await handler.wait(attempt, response)  # Respects Retry-After header
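The delay formula: base_delay * 2^(attempt - 1) * (1 + random * jitter), where random is a value between 0 and 1. The third attempt at base_delay=1.0 waits 4 seconds plus jitter, so between 4 and 6 seconds with jitter=0.5. The Retry-After header is defined in RFC 7231 (since superseded by RFC 9110); the exponential schedule itself is a widely used convention, not something the RFC prescribes.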
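The formula fits in a few lines. This is a sketch under two assumptions: that random is drawn uniformly from [0, 1), and that attempts are numbered from 1. The function name backoff_delay and the rng parameter are hypothetical, added so the bounds can be checked deterministically.

```python
import random


def backoff_delay(attempt: int, base_delay: float = 1.0, jitter: float = 0.5,
                  rng=random.random) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Doubles on every attempt: base, 2*base, 4*base, ...
    exponential = base_delay * 2 ** (attempt - 1)
    # Jitter stretches the delay by a random factor in [1, 1 + jitter).
    return exponential * (1 + rng() * jitter)
```

With base_delay=1.0 and jitter=0.5, the delay falls in [1.0, 1.5) for attempt 1, [2.0, 3.0) for attempt 2, and [4.0, 6.0) for attempt 3.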
Composing All Three
The patterns work independently, but the real value is composition. APIGuard ships a client that wires all three together:
    from apiguard.adapters.httpx import AsyncRateLimitedClient

    async with AsyncRateLimitedClient(
        capacity=50,       # 50 requests in the bucket
        refill_rate=5.0,   # Refill at 5/second
        max_retries=3,
        failure_threshold=5,
        recovery_timeout=30.0
    ) as client:
        response = await client.get("https://api.stripe.com/v1/charges")
One client. Rate limiting, retry, and circuit breaking. The request either succeeds, retries intelligently, or fails fast with a clear error. No silent failures.
Real Integration: Replacing 623 Lines
I built APIGuard while working on GRID, a 190k-line Python AI framework. GRID had a custom circuit breaker implementation — 623 lines of hand-rolled state management, no rate limiting, no retry coordination.
After integrating APIGuard, that 623-line file became a 40-line adapter:
    # GRID's FastAPI middleware using APIGuard
    from apiguard import CircuitBreaker, BucketRegistry

    class APIGuardCircuitBreakerMiddleware:
        def __init__(self, app, failure_threshold=5, recovery_timeout=60.0):
            self.app = app
            self.breaker = CircuitBreaker(
                failure_threshold=failure_threshold,
                recovery_timeout=recovery_timeout
            )

        async def __call__(self, scope, receive, send):
            with self.breaker:
                await self.app(scope, receive, send)
Same behavior. Fewer lines. Tested independently (106 tests, 100% coverage) instead of tangled into the framework.
What I Learned Building It
Thread safety matters even in async code. The token bucket and circuit breaker both use locks. Coroutines on a single event loop only switch at await points, but in production the same bucket often ends up shared with worker threads too (thread-pool executors, sync code paths, threaded servers). Without locks, you get race conditions that silently over-consume your rate limit.
Jitter is not optional. Without jitter, every client that failed at the same moment retries at the same moment (the "thundering herd" problem). Even a jitter factor of 0.3 spreads the load enough to matter.
Retry-After headers are a gift. Most rate-limited APIs tell you exactly when to come back. Ignoring this header and using your own backoff schedule means you'll either wait too long or retry too soon. APIGuard checks for it on every retry.
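For reference, Retry-After comes in two forms: a number of seconds ("120") or an HTTP date ("Wed, 21 Oct 2015 07:28:00 GMT"). Here is a minimal sketch of parsing both with the standard library. The function name retry_after_seconds and its now parameter are hypothetical, and this is not necessarily how APIGuard does it.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Return the wait in seconds requested by a Retry-After value, or None."""
    if not header_value:
        return None
    value = header_value.strip()
    # Form 1: delay in whole seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        return float(value)
    # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable: the caller falls back to its own backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", never a negative wait.
    return max(0.0, (when - now).total_seconds())
```

A retry loop would typically use this value when it is present and fall back to the computed backoff delay when it returns None.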
Keep the dependency surface small. APIGuard's only runtime dependency is httpx. No framework opinions, no configuration files, no sprawling dependency tree. pip install grid-apiguard and you're done.
The Numbers
| Metric | Value |
|---|---|
| Source lines | 905 |
| Test functions | 106 |
| Test coverage | 100% |
| Core patterns | 3 |
| Runtime dependencies | 1 (httpx) |
| Python versions | 3.11, 3.12, 3.13 |
| License | MIT |
When You Should Use This
- You call third-party APIs and don't have resilience patterns in place
- Your retry logic is a while True loop with time.sleep(1)
- You've been rate-limited and your solution was "catch the 429 and retry"
- You want circuit breaking without importing a framework
When You Shouldn't
- You're already using Tenacity or Stamina and they work fine
- Your API calls are internal, low-latency, and rarely fail
- You need distributed circuit breaking across multiple instances (APIGuard is single-process)
APIGuard is on PyPI: pip install grid-apiguard
Source is available on request. Built solo, tested end-to-end, used in production in a 190k-line framework.
If you need API integrations built with this level of failure handling, I do this work professionally — Upwork profile.