Polymarket rate limits are per 10 seconds, not per second (and 429 traps)

#python #api #ratelimiting #trading

If you've hit 429 Too Many Requests on the Polymarket APIs, the fix isn't just "slow down" — it's understanding how the limits are shaped and giving your client the right tooling. Here are the traps and a small open-source limiter.

Trap 1: the window is 10 seconds, not 1

Polymarket's published limits use a 10-second window (trading also has a 10-minute sustained cap):

Endpoint	Limit
`/book`, `/price`, `/midpoint`	1,500 / 10s
`/books`, `/prices`	500 / 10s
POST/DELETE `/order`	5,000 / 10s burst (120,000 / 10 min)
DELETE `/cancel-all`	250 / 10s

So `/book` allows 150 req/s sustained with bursts of up to 1,500: not "1,500/s", and not "150/s with no burst". A token bucket models this well: refill at max / window tokens per second, with a capacity of max.

rate, burst = 1500 / 10, 1500   # 150 tokens/sec, cap 1500
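To make the refill and capacity arithmetic concrete, here is a minimal token bucket sketch. It is an illustration under stated assumptions, not the code from the repo: the class name `TokenBucket`, its parameters, and the reservation-style `acquire` (the token is always taken and the caller is told how long to wait) are all choices made for this example.

```python
import time


class TokenBucket:
    """Illustrative token bucket: refills at max/window tokens per second."""

    def __init__(self, max_requests, window_s, clock=time.monotonic):
        self.rate = max_requests / window_s   # e.g. 1500 / 10 = 150 tokens/sec
        self.capacity = float(max_requests)   # burst ceiling
        self.tokens = self.capacity           # assumption: the bucket starts full
        self.clock = clock                    # injectable so tests need not sleep
        self.last = clock()

    def acquire(self):
        """Reserve one token. Return 0.0 to send now, else seconds to wait first."""
        now = self.clock()
        # Credit the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        # Always take the token; a negative balance is debt still to be repaid.
        self.tokens -= 1.0
        if self.tokens >= 0.0:
            return 0.0
        return -self.tokens / self.rate
```

One design caveat: a bucket that starts full can admit the burst plus a window's worth of refill during the first window, so a more cautious client might start with fewer tokens or use a lower capacity.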

Trap 2: authenticated GETs throttle much earlier

The published numbers are for public market-data endpoints. Authenticated reads (e.g. polling a single order) get throttled far below that: roughly 10/s once you're also placing and cancelling. Budget authed reads conservatively; don't assume the 1,500/10s ceiling applies.
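As a rough way to budget, you can work out how often a polling loop may run. The sketch below assumes the ~10/s figure above and a hypothetical `share` parameter (the fraction of that budget you give to polling); both are inputs to tune, not documented limits.

```python
def min_poll_interval(n_orders, budget_per_s=10.0, share=0.5):
    """Seconds between polling rounds so that n_orders authenticated reads
    per round stay within share * budget_per_s requests per second."""
    if n_orders <= 0:
        return 0.0
    return n_orders / (budget_per_s * share)
```

For example, polling 20 open orders while reserving half of a 10/s budget for other authenticated calls means one full round every 4 seconds at most.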

Trap 3: a fresh connection per request

Every call that opens a new TCP+TLS connection is slow and leans harder on the limiter. Reuse one keep-alive connection (or a small pool). TCP_NODELAY and GC control are the rest of the latency story.
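A small local demonstration of connection reuse, using only the standard library. It starts a throwaway HTTP/1.1 server on localhost (a stand-in, not the Polymarket API) and sends five requests over a single `http.client.HTTPConnection`; the server records how many distinct client sockets it saw.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_peers = set()  # distinct (host, port) client sockets seen by the server


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections alive

    def do_GET(self):
        seen_peers.add(self.client_address)
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

# One connection object, reused for every request.
conn = http.client.HTTPConnection(host, port)
for _ in range(5):
    conn.request("GET", "/book")
    conn.getresponse().read()  # drain the body so the socket can be reused
reused_peers = len(seen_peers)  # number of sockets the five requests used

conn.close()
server.shutdown()
server.server_close()
```

Against a real HTTPS endpoint the same pattern applies with `http.client.HTTPSConnection` or a `requests.Session`, where reuse also avoids repeating the TLS handshake.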

Trap 4: no Retry-After / backoff

The docs say over-limit requests are "throttled rather than rejected" — but in practice you do see 429. Honor Retry-After when present (it can be an integer or an HTTP date), and otherwise back off exponentially with jitter:

import time
from polymarket_rate_limit import parse_retry_after, should_retry, backoff

# resp and attempt come from your own request loop
if should_retry(resp.status):                       # 429 or 5xx
    time.sleep(backoff(attempt, retry_after=parse_retry_after(resp.headers)))
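For readers who want to see what the two pieces involve, here is a standalone sketch of Retry-After parsing and jittered backoff. It is not the module's implementation: the function names `retry_after_seconds` and `backoff_delay`, and the `base` and `cap` defaults, are assumptions made for this example.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Parse a Retry-After value (delta-seconds or HTTP date) into seconds.
    Returns None when the header is missing or unparseable."""
    if value is None:
        return None
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):  # exception type depends on Python version
        return None
    if when.tzinfo is None:          # treat a zone-less date as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(attempt, retry_after=None, base=0.5, cap=30.0, rng=random.random):
    """Use the server's hint if there is one, else exponential backoff
    with full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    if retry_after is not None:
        return retry_after
    return rng() * min(cap, base * 2 ** attempt)
```

Full jitter spreads retries from many clients across the whole interval, so they do not all return at the same instant after a throttling event.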

The limiter

I packaged the documented limits, a per-endpoint token bucket, the Retry-After parser, and correct backoff into a zero-dependency MIT module (injectable clock, fully tested):

import time
from polymarket_rate_limit import RateLimiter

lim = RateLimiter()
wait = lim.acquire("/price")     # 0 if you may send now, else seconds to wait
if wait:
    time.sleep(wait)
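One way the limiter and the retry logic might fit together in a request loop is sketched below. The only thing it assumes about the limiter is the `acquire(endpoint)` call shown above; `call_with_limit`, the `send` callable (returning a status code and a headers dict), and the backoff constants are hypothetical names and values for illustration.

```python
import random
import time


def call_with_limit(limiter, endpoint, send, max_attempts=5, sleep=time.sleep):
    """Wait for the limiter, send, and retry on 429/5xx. Returns the last status."""
    status = None
    for attempt in range(max_attempts):
        wait = limiter.acquire(endpoint)  # 0 means send now, else seconds to wait
        if wait:
            sleep(wait)
        status, headers = send()
        if status != 429 and status < 500:
            return status                 # success or a non-retryable error
        if attempt == max_attempts - 1:
            break                         # out of attempts, do not sleep again
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)    # integer form only in this sketch
        else:
            delay = random.random() * min(30.0, 0.5 * 2 ** attempt)
        sleep(delay)
    return status
```

Passing `sleep` as a parameter keeps the loop testable without real delays, in the same spirit as the injectable clock.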

Repo (free, MIT, 15 tests):
https://github.com/BlueWhale-Quant-Lab/polymarket-api-rate-limit-429-handler

The speed side (keep-alive pooling, TCP_NODELAY, GC control, TLS prewarm, and a p50/p99 latency benchmark) is part of a more complete version, but the limiter above stands on its own.