BlueWhale-Quant-Lab

Posted on • Originally published at github.com

Polymarket rate limits are per 10 seconds, not per second (and 429 traps)

If you've hit 429 Too Many Requests on the Polymarket APIs, the fix isn't just "slow down" — it's understanding how the limits are shaped and giving your client the right tooling. Here are the traps and a small open-source limiter.

Trap 1: the window is 10 seconds, not 1

Polymarket's published limits use a 10-second window (trading also has a 10-minute sustained cap):

| Endpoint | Limit |
| --- | --- |
| /book, /price, /midpoint | 1,500 / 10s |
| /books, /prices | 500 / 10s |
| POST/DELETE /order | 5,000 / 10s burst (120,000 / 10 min) |
| DELETE /cancel-all | 250 / 10s |

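So /book is 150 req/s sustained with a burst of 1,500, not "1,500/s" and not "150/s with no burst". A token bucket models this closely: refill at max / window tokens per second, with capacity max. One caveat: a bucket that starts full can admit up to roughly 2 × max within a single 10-second span (the initial burst plus ten seconds of refill), so if the server counts strictly per window, leave some headroom below the published number.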

rate, burst = 1500 / 10, 1500   # 150 tokens/sec, cap 1500
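
To make the refill arithmetic concrete, here is a minimal token-bucket sketch. It is not the code from the repo: the class name `TokenBucket`, the `clock` parameter, and the reservation-style `acquire` (the token is spent immediately and the return value is how long to wait before sending) are assumptions chosen for illustration.

```python
import time


class TokenBucket:
    """Illustrative token bucket: capacity = max requests, refill = max / window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.capacity = float(max_requests)
        self.rate = max_requests / window_seconds  # tokens added per second
        self.clock = clock                         # injectable for tests
        self.tokens = self.capacity                # start full
        self.updated = clock()

    def acquire(self, cost=1.0):
        """Reserve `cost` tokens; return 0.0 to send now, else seconds to wait."""
        now = self.clock()
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        # Spend the tokens even if that drives the balance negative: the
        # caller is expected to sleep for the returned time and then send.
        self.tokens -= cost
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate


book = TokenBucket(1500, 10)  # 150 tokens/sec, cap 1500
```

Because the clock is injected, the refill behaviour can be checked in a unit test without real sleeps.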

Trap 2: authenticated GETs throttle much earlier

The published numbers are for public market-data endpoints. Authenticated reads (e.g. polling a single order) get throttled far below that: roughly 10/s once you're also placing and cancelling. Budget authed reads conservatively; don't assume the 1,500/10s ceiling applies.
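
As a rough illustration of what "conservatively" can mean, the sketch below turns a read budget into a polling interval. The function name `poll_interval` and the default of 5 req/s (half of the roughly 10/s observed above) are assumptions, not documented values.

```python
def poll_interval(n_orders: int, authed_budget_per_sec: float = 5.0) -> float:
    """Seconds between polls of the same order when n_orders are polled
    round-robin and total authed reads must stay within the budget."""
    if n_orders <= 0:
        return 0.0
    return n_orders / authed_budget_per_sec


print(poll_interval(20))  # 20 orders at 5 reads/s: each order every 4.0 s
```

If that interval is too slow for your strategy, the usual options are polling fewer orders or batching reads, rather than raising the budget.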

Trap 3: a fresh connection per request

Every call that opens a new TCP+TLS connection is slow and leans harder on the limiter. Reuse one keep-alive connection (or a small pool). TCP_NODELAY and GC control are the rest of the latency story.
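
A minimal sketch of connection reuse with only the standard library follows. The class name `KeepAliveClient` and the `connect` hook (a factory that lets tests swap in a fake connection) are hypothetical, and the host you pass in is up to you. GC control is not shown.

```python
import http.client
import socket


class KeepAliveClient:
    """Reuses one HTTPS connection; reconnects once if it has gone stale."""

    def __init__(self, host, timeout=5.0, connect=None):
        self.host = host
        self.timeout = timeout
        self._connect = connect or self._open  # hypothetical test hook
        self._conn = None

    def _open(self):
        conn = http.client.HTTPSConnection(self.host, timeout=self.timeout)
        conn.connect()  # the TCP + TLS handshake happens once, here
        # Disable Nagle's algorithm so small requests are sent immediately.
        conn.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        return conn

    def get(self, path):
        for attempt in range(2):
            if self._conn is None:
                self._conn = self._connect()
            try:
                self._conn.request("GET", path)
                resp = self._conn.getresponse()
                # Read the whole body so the connection can be reused.
                return resp.status, resp.read()
            except (http.client.HTTPException, OSError):
                # The server may have closed an idle connection: drop it
                # and retry once on a fresh one (GET is idempotent).
                self._conn.close()
                self._conn = None
                if attempt == 1:
                    raise
```

Creating one client per host and calling `get()` repeatedly pays for the handshake only on the first call and after a reconnect.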

Trap 4: no Retry-After / backoff

The docs say over-limit requests are "throttled rather than rejected" — but in practice you do see 429. Honor Retry-After when present (it can be an integer or an HTTP date), and otherwise back off exponentially with jitter:

import time
from polymarket_rate_limit import parse_retry_after, should_retry, backoff

# resp is the response just received; attempt is the current retry count
if should_retry(resp.status):                       # 429 or 5xx
    time.sleep(backoff(attempt, retry_after=parse_retry_after(resp.headers)))
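
For readers who want to see what such helpers involve, here is a standalone sketch. It is not the module's implementation: the names `retry_after_seconds` and `backoff_delay`, the 0.5 s base, the 30 s cap, and the choice of "full jitter" are all assumptions made for illustration.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Parse a Retry-After header value (integer seconds or HTTP date).

    Returns seconds to wait, or None if the value is missing or unparseable.
    """
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):  # exception type depends on Python version
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(attempt, retry_after=None, base=0.5, cap=30.0, rng=random.random):
    """Use the server's hint when there is one, else exponential backoff with full jitter."""
    if retry_after is not None:
        return retry_after
    return rng() * min(cap, base * 2 ** attempt)
```

Full jitter draws the delay uniformly from zero up to the exponential ceiling, which spreads out clients that were throttled at the same moment.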

The limiter

I packaged the documented limits, a per-endpoint token bucket, the Retry-After parser, and correct backoff into a zero-dependency MIT module (injectable clock, fully tested):

import time
from polymarket_rate_limit import RateLimiter

lim = RateLimiter()
wait = lim.acquire("/price")     # 0 if you may send now, else seconds to wait
if wait:
    time.sleep(wait)
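
One way to wire a limiter into a request loop is sketched below. The helper `send_with_limit` is hypothetical: it assumes only that `limiter.acquire(path)` returns seconds to wait (0 to send now), as in the snippet above, and that `send()` is your own function returning `(status, body)`. The backoff constants are placeholders.

```python
import random
import time


def send_with_limit(limiter, path, send, max_attempts=5, sleep=time.sleep):
    """Wait for the limiter, send, and retry on 429 or 5xx with jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        wait = limiter.acquire(path)
        if wait:
            sleep(wait)
        status, body = send()
        if status != 429 and status < 500:
            return status, body
        if attempt + 1 < max_attempts:
            # Exponential backoff with full jitter, capped at 30 seconds.
            sleep(random.uniform(0, min(30.0, 0.5 * 2 ** attempt)))
    return status, body  # still failing after the last attempt
```

Passing `sleep` as a parameter keeps the loop testable without real delays.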

Repo (free, MIT, 15 tests):
https://github.com/BlueWhale-Quant-Lab/polymarket-api-rate-limit-429-handler

The speed side (keep-alive pooling, TCP_NODELAY, GC control, TLS prewarm, and a p50/p99 latency benchmark) is part of a separate complete version, but the limiter above stands on its own.

Takeaway

Model the 10-second window as a token bucket, rate-limit authed GETs conservatively, reuse connections, and honor Retry-After.
