If you scrape anything at scale, you know the drill: proxies die, sites start returning 403/429, and your run grinds to a halt. The classic Scrapy answer, scrapy-rotating-proxies, hasn't seen a real update in years — and it's Scrapy-only, so the moment you reach for Playwright or plain requests, you're rebuilding rotation from scratch.
I wanted one small proxy pool I could share across all three. So I wrote proxyspin: a rotating pool with health tracking, ban detection and sticky sessions, and thin adapters for Scrapy, Playwright and requests. Zero required dependencies, pure standard library.
pip install proxyspin
The pool
Everything is built around one object. Load proxies from a list, a file, or a URL:
from proxyspin import ProxyPool
pool = ProxyPool.from_file("proxies.txt", strategy="round_robin")
# or inline / from your provider's export endpoint:
pool = ProxyPool(["http://user:pass@gate1.example.com:8000", "10.0.0.2:8000"])
pool = ProxyPool.from_url("https://example.com/api/my-list.txt")
proxy = pool.get() # -> Proxy; proxy.url is ready to use
pool.mark_failed(proxy) # bench it after repeated failures
pool.mark_ok(proxy) # reset its failure streak
It parses every common list format — host:port, host:port:user:pass, user:pass@host:port, scheme://user:pass@host:port — for HTTP, HTTPS, SOCKS4 and SOCKS5.
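To make those four formats concrete, here is a minimal sketch of how such a line could be normalised using only the standard library. It is not proxyspin's actual parser: `ParsedProxy` and `parse_proxy_line` are hypothetical names, and the sketch assumes passwords contain no colon and hosts are not bare IPv6 addresses.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class ParsedProxy(NamedTuple):
    """Hypothetical result type, used only for this illustration."""
    scheme: str
    host: str
    port: int
    username: Optional[str]
    password: Optional[str]


def parse_proxy_line(line: str, default_scheme: str = "http") -> ParsedProxy:
    """Parse one proxy-list entry written in any of the four formats."""
    text = line.strip()
    if "://" not in text:
        parts = text.split(":")
        if "@" not in text and len(parts) == 4:
            # host:port:user:pass -> reorder into user:pass@host:port
            host, port, user, password = parts
            text = f"{user}:{password}@{host}:{port}"
        # No scheme given, so assume the default one.
        text = f"{default_scheme}://{text}"
    url = urlsplit(text)
    if not url.hostname or url.port is None:
        raise ValueError(f"unrecognised proxy entry: {line!r}")
    return ParsedProxy(url.scheme, url.hostname, url.port, url.username, url.password)
```

Every format ends up as the same tuple, so the rest of a pool only ever deals with one shape.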
Rotation strategies
- round_robin: cycle through healthy proxies in order (default)
- random: pick one at random
- sticky: keep returning the same proxy for a given key (a target domain, an account id, a worker name) until it goes unhealthy
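If you want to see the three strategies side by side, the following is a rough sketch of the selection logic. It is an illustration and not the library's internals: `StrategySketch`, its method names, and the `is_healthy` callback are all hypothetical, and the choice to fall back to round robin when a sticky proxy goes unhealthy is my assumption.

```python
import random
from typing import Callable, Dict, List


class StrategySketch:
    """Hypothetical illustration of round-robin, random and sticky selection."""

    def __init__(self, proxies: List[str],
                 is_healthy: Callable[[str], bool] = lambda proxy: True):
        self.proxies = list(proxies)
        self.is_healthy = is_healthy
        self._cursor = 0                   # next round-robin position
        self._sticky: Dict[str, str] = {}  # key -> proxy currently pinned to it

    def round_robin(self) -> str:
        # Walk the list at most once, skipping unhealthy entries.
        for _ in range(len(self.proxies)):
            proxy = self.proxies[self._cursor % len(self.proxies)]
            self._cursor += 1
            if self.is_healthy(proxy):
                return proxy
        raise LookupError("no healthy proxies")

    def pick_random(self) -> str:
        healthy = [p for p in self.proxies if self.is_healthy(p)]
        if not healthy:
            raise LookupError("no healthy proxies")
        return random.choice(healthy)

    def sticky(self, key: str) -> str:
        # Reuse the pinned proxy while it is healthy; otherwise pin a new one.
        current = self._sticky.get(key)
        if current is None or not self.is_healthy(current):
            current = self.round_robin()
            self._sticky[key] = current
        return current
```

The sticky case is the interesting one: the key-to-proxy mapping is only rewritten when the pinned proxy stops being healthy, which is what keeps one account or domain on one IP.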
The health model
Every proxy starts healthy. mark_failed bumps its failure streak; when the streak hits max_failures (default 2) the proxy is benched with exponential backoff (cooldown * 2**overshoot, base 60s, capped at 1h), then automatically rejoins rotation. mark_ok resets the streak. So dead proxies quietly drop out and recover on their own — you never manually prune the list.
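The backoff arithmetic is easier to follow in code. This sketch models one proxy's health record using the defaults quoted above (max_failures 2, base 60s, cap 1h). It is not proxyspin's implementation: `HealthSketch` is a hypothetical name, and I am assuming "overshoot" means the number of failures beyond `max_failures`.

```python
import time
from typing import Callable


class HealthSketch:
    """Hypothetical per-proxy health record with capped exponential backoff."""

    def __init__(self, max_failures: int = 2, cooldown: float = 60.0,
                 max_cooldown: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.max_cooldown = max_cooldown
        self.clock = clock          # injectable so the logic can be tested
        self.failures = 0           # current failure streak
        self.benched_until = 0.0    # clock value at which the proxy rejoins

    def mark_failed(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            # Assumption: overshoot counts failures past the threshold.
            overshoot = self.failures - self.max_failures
            delay = min(self.cooldown * 2 ** overshoot, self.max_cooldown)
            self.benched_until = self.clock() + delay

    def mark_ok(self) -> None:
        self.failures = 0
        self.benched_until = 0.0

    def is_healthy(self) -> bool:
        # A benched proxy rejoins rotation once its cooldown has elapsed.
        return self.clock() >= self.benched_until
```

Under these assumptions the bench times run 60s, 120s, 240s and so on until they hit the one-hour cap, and a single `mark_ok` clears the streak.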
Scrapy
Enable the middleware and point it at your proxies. Ban detection and retry-through-the-next-proxy are automatic:
# settings.py
DOWNLOADER_MIDDLEWARES = {
    "proxyspin.scrapy_middleware.ProxySpinMiddleware": 610,
}
PROXYSPIN_FILE = "proxies.txt"
PROXYSPIN_STRATEGY = "sticky" # one proxy per target host
PROXYSPIN_BAN_CODES = [403, 429] # these responses rotate the proxy
PROXYSPIN_MAX_RETRIES = 3
Any 403/429 (configurable) marks the proxy as failed and retries the request through the next healthy one. No spider code changes.
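Stripped of the Scrapy plumbing, the ban-detection loop looks roughly like the sketch below. This is a framework-free illustration, not the middleware's code: `fetch_with_rotation` and its callback parameters are hypothetical, and I am assuming `max_retries` counts retries after the first attempt.

```python
from typing import Callable, Iterable, Tuple

Response = Tuple[int, bytes]  # (status code, body)


def fetch_with_rotation(
    url: str,
    get_proxy: Callable[[], str],
    fetch: Callable[[str, str], Response],
    mark_failed: Callable[[str], None],
    mark_ok: Callable[[str], None],
    ban_codes: Iterable[int] = (403, 429),
    max_retries: int = 3,
) -> Response:
    """Retry a request through fresh proxies while the response looks like a ban."""
    banned = set(ban_codes)
    status, body = 0, b""
    for _ in range(max_retries + 1):      # first attempt plus the retries
        proxy = get_proxy()
        status, body = fetch(url, proxy)
        if status not in banned:
            mark_ok(proxy)                # success resets the failure streak
            return status, body
        mark_failed(proxy)                # ban response counts against this proxy
    return status, body                   # still banned after every retry
```

If every attempt is banned, the last response is returned as-is so the caller can decide what to do with it.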
Playwright
Playwright takes a proxy per browser context, which is the natural rotation unit — great for one-IP-per-account flows:
from playwright.sync_api import sync_playwright
from proxyspin import ProxyPool
from proxyspin.playwright_helper import proxy_settings
pool = ProxyPool.from_file("proxies.txt", strategy="sticky")
with sync_playwright() as p:
    browser = p.chromium.launch()
    for account in accounts:
        context = browser.new_context(proxy=proxy_settings(pool, key=account.id))
        # each account keeps its own IP for the whole session
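For reference, Playwright's `proxy` option is a dict with a `server` key and optional `username` and `password` keys. A helper along these lines could turn a proxy URL into that shape. This is a sketch of the conversion only: `to_playwright_proxy` is a hypothetical name, and it is not necessarily what `proxy_settings` does internally.

```python
from typing import Dict
from urllib.parse import unquote, urlsplit


def to_playwright_proxy(proxy_url: str) -> Dict[str, str]:
    """Convert scheme://user:pass@host:port into Playwright's proxy dict."""
    url = urlsplit(proxy_url)
    if not url.scheme or not url.hostname or url.port is None:
        raise ValueError(f"expected scheme://host:port, got {proxy_url!r}")
    # Playwright wants credentials as separate keys, not embedded in the URL.
    settings = {"server": f"{url.scheme}://{url.hostname}:{url.port}"}
    if url.username:
        settings["username"] = unquote(url.username)
    if url.password:
        settings["password"] = unquote(url.password)
    return settings
```

The resulting dict can be passed directly as `browser.new_context(proxy=...)`.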
requests
A drop-in Session that rotates on every call and retries failures through another proxy:
from proxyspin import ProxyPool
from proxyspin.requests_adapter import RotatingSession
session = RotatingSession(ProxyPool.from_file("proxies.txt"))
print(session.get("https://httpbin.org/ip").json()) # new IP per call
Check a list first
Bad proxies waste run time. The bundled CLI tests a whole list concurrently and writes out the survivors:
proxyspin check proxies.txt --workers 100 --alive-out alive.txt
OK 45.155.10.4:8000 612 ms HTTP 200
DEAD 91.10.77.2:3128 TimeoutError
...
118/200 alive
wrote 118 proxies to alive.txt
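If you would rather run a check from Python, the concurrency pattern is simple to sketch with a thread pool. This is not the CLI's implementation: `check_all` and `tcp_probe` are hypothetical names, and the default probe only tests that the host:port accepts a TCP connection, which is weaker than the HTTP round trip shown in the output above.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def tcp_probe(entry: str, timeout: float = 5.0) -> bool:
    """Return True if host:port accepts a TCP connection (a weak liveness test)."""
    host, _, port = entry.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        return False


def check_all(entries: Iterable[str],
              probe: Callable[[str], bool] = tcp_probe,
              workers: int = 100) -> List[str]:
    """Probe every entry concurrently and return the survivors in input order."""
    items = list(entries)
    if not items:
        return []
    with ThreadPoolExecutor(max_workers=min(workers, len(items))) as executor:
        results = list(executor.map(probe, items))  # map preserves input order
    return [entry for entry, alive in zip(items, results) if alive]
```

Passing your own `probe` lets you swap in a real request through the proxy without touching the concurrency code.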
Getting proxies to test with
Want to try it right now without your own proxies? Bootstrap straight from a live public list:
pool = ProxyPool.from_url("https://raw.githubusercontent.com/gproxynet/free-proxy-list/main/all.txt")
Fair warning: public proxies are unreliable by nature. They're shared, slow, and often die within minutes. They're fine for kicking the tires, not for a real crawl. For production you'll want dedicated proxies (residential, mobile or datacenter). If your provider exposes rotating gateways, one pool entry per gateway endpoint is all proxyspin needs, since the provider rotates the exit IPs server-side.
Wrapping up
One pool, the same health model everywhere, and you stop babysitting dead proxies. The code is MIT-licensed and on GitHub — issues and PRs welcome. If you've been limping along on an unmaintained rotation middleware, give it a spin.