If you've hit 429 Too Many Requests on the Polymarket APIs, the fix isn't just "slow down" — it's understanding how the limits are shaped and giving your client the right tooling. Here are the traps and a small open-source limiter.
Trap 1: the window is 10 seconds, not 1
Polymarket's published limits use a 10-second window (trading also has a 10-minute sustained cap):
| Endpoint | Limit |
|---|---|
| /book, /price, /midpoint | 1,500 / 10s |
| /books, /prices | 500 / 10s |
| POST/DELETE /order | 5,000 / 10s burst (120,000 / 10 min) |
| DELETE /cancel-all | 250 / 10s |
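So /book is 150 req/s sustained with a burst of 1,500, not "1500/s" and not "150/s with no burst". A token bucket models this closely: refill at max / window tokens per second, with capacity max. One caveat: a full bucket plus ten seconds of refill can put up to 2x max into a single 10-second span, so leave headroom if the server counts strictly per window.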
rate, burst = 1500 / 10, 1500 # 150 tokens/sec, cap 1500
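To make the one-liner above concrete, here is a minimal token-bucket sketch. It is an illustration under stated assumptions, not the packaged module's implementation: the `TokenBucket` name, its `acquire()` contract, and the injectable `clock` parameter are all made up for this example.

```python
import time


class TokenBucket:
    """Refills at max_requests / window_s tokens per second, capped at max_requests."""

    def __init__(self, max_requests, window_s, clock=time.monotonic):
        self.capacity = float(max_requests)
        self.rate = max_requests / window_s  # tokens added per second
        self.tokens = self.capacity          # start full: the whole burst is available
        self.clock = clock                   # injectable so tests need not sleep
        self.last = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self):
        """Take one token and return 0.0, or return the seconds until one is available."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.rate  # nothing consumed; caller sleeps and retries


book = TokenBucket(1500, 10)  # the /book row: 150 tokens/sec, cap 1,500
```

When `acquire()` returns a non-zero wait, no token has been taken, so the caller sleeps for that long and calls `acquire()` again before sending.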
Trap 2: authenticated GETs throttle much earlier
The published numbers are for public market-data endpoints. Authenticated reads (e.g. polling a single order) get throttled far below that: roughly 10/s once you're also placing and cancelling. Budget authed reads conservatively; don't assume the 1,500/10s ceiling applies.
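One way to budget authed reads conservatively is to space them evenly with no burst at all. The sketch below assumes a budget of 5/s, which is simply half of the roughly 10/s observed above and not a documented limit; the `Pacer` class and its `reserve()` method are hypothetical names for this example.

```python
import time


class Pacer:
    """Evenly spaces calls: at most per_second calls per second, with no burst."""

    def __init__(self, per_second, clock=time.monotonic):
        self.interval = 1.0 / per_second
        self.clock = clock
        self.next_ok = clock()  # earliest time the next call may go out

    def reserve(self):
        """Reserve the next slot and return how many seconds to sleep before using it."""
        now = self.clock()
        slot = max(now, self.next_ok)
        self.next_ok = slot + self.interval
        return slot - now


# Assumption: 5/s, half of the observed ~10/s, to leave headroom.
authed_reads = Pacer(per_second=5)
```

Before each authenticated GET, call `time.sleep(authed_reads.reserve())`. Unlike a token bucket, an idle period does not build up a burst allowance here, which is the point when the real ceiling is unknown.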
Trap 3: a fresh connection per request
Every call that opens a new TCP+TLS connection is slow and leans harder on the limiter. Reuse one keep-alive connection (or a small pool). TCP_NODELAY and GC control are the rest of the latency story.
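Connection reuse can be sketched with the standard library alone. This is a rough illustration, not the packaged speed tooling: `KeepAliveClient` and its `conn_factory` parameter are hypothetical, and the host name in the last line is an assumption, so substitute whichever API host you actually call.

```python
import http.client


class KeepAliveClient:
    """Sends every GET over one persistent connection, reconnecting once if it went stale."""

    def __init__(self, host, timeout=5.0, conn_factory=http.client.HTTPSConnection):
        self.host = host
        self.timeout = timeout
        self.conn_factory = conn_factory  # injectable so tests can avoid the network
        self.conn = None                  # opened lazily on the first request

    def get(self, path):
        for attempt in (1, 2):
            if self.conn is None:
                self.conn = self.conn_factory(self.host, timeout=self.timeout)
            try:
                self.conn.request("GET", path)
                resp = self.conn.getresponse()
                body = resp.read()  # drain fully, or the socket cannot be reused
                return resp.status, resp.headers, body
            except (http.client.HTTPException, OSError):
                # Likely a stale or broken connection: drop it and retry once on a fresh one.
                self.conn.close()
                self.conn = None
                if attempt == 2:
                    raise


client = KeepAliveClient("clob.polymarket.com")  # assumed host; nothing connects yet
```

Only the first `client.get(...)` pays for the TCP and TLS handshakes; later calls reuse the same socket until the server closes it. The single retry is safe here because GET is idempotent; the same shortcut would not be appropriate for order placement.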
Trap 4: no Retry-After / backoff
The docs say over-limit requests are "throttled rather than rejected" — but in practice you do see 429. Honor Retry-After when present (it can be an integer or an HTTP date), and otherwise back off exponentially with jitter:
import time
from polymarket_rate_limit import parse_retry_after, should_retry, backoff

if should_retry(resp.status):  # 429 or 5xx
    time.sleep(backoff(attempt, retry_after=parse_retry_after(resp.headers)))
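For readers who want to see what such helpers might do internally, here is a standard-library sketch. It is not the module's code: `retry_after_seconds` and `backoff_delay` are hypothetical names, and the `base` and `cap` defaults are arbitrary choices for illustration.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Parse a Retry-After value (delta-seconds or HTTP-date) into seconds, or None."""
    if value is None:
        return None
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable: caller falls back to plain exponential backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # HTTP dates are expressed in GMT
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(attempt, retry_after=None, base=0.5, cap=30.0, rng=random.random):
    """Full-jitter exponential backoff; a server-supplied Retry-After acts as a floor."""
    ceiling = min(cap, base * (2 ** attempt))
    delay = rng() * ceiling  # uniform over [0, ceiling)
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay
```

Full jitter draws the delay uniformly between zero and the exponential ceiling, so many clients that were throttled at the same moment do not all retry at the same moment. Treating Retry-After as a floor means the client never comes back earlier than the server asked.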
The limiter
I packaged the documented limits, a per-endpoint token bucket, the Retry-After parser, and correct backoff into a zero-dependency MIT module (injectable clock, fully tested):
import time
from polymarket_rate_limit import RateLimiter

lim = RateLimiter()
wait = lim.acquire("/price")  # 0 if you may send now, else seconds to wait
if wait:
    time.sleep(wait)
Repo (free, MIT, 15 tests):
https://github.com/BlueWhale-Quant-Lab/polymarket-api-rate-limit-429-handler
The speed side (keep-alive pooling, TCP_NODELAY, GC control, TLS prewarm, and a p50/p99 latency benchmark) is covered by a separate complete version, but the limiter above stands on its own.
Takeaway
Model the 10-second window as a token bucket, rate-limit authed GETs conservatively, reuse connections, and honor Retry-After.