Python Web Scraper with Rate Limiting and Retry Logic
Building reliable web scrapers means handling rate limits gracefully.
The Problem
Most beginner scrapers hammer servers with back-to-back requests and no delays, and get their IP blocked almost immediately.
Production-Ready Solution
```python
import random
import time
from typing import Optional

import requests


class RateLimitedScraper:
    def __init__(self, min_delay=1.5, max_delay=4.0, max_retries=3):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.max_retries = max_retries
        self.session = requests.Session()

    def get(self, url: str) -> Optional[requests.Response]:
        for attempt in range(self.max_retries):
            try:
                # Random pause before every request so traffic looks human
                time.sleep(random.uniform(self.min_delay, self.max_delay))
                response = self.session.get(url, timeout=30)

                # 429: honor the server's Retry-After hint, then try again
                if response.status_code == 429:
                    retry_after = int(response.headers.get('Retry-After', 60))
                    time.sleep(retry_after)
                    continue

                response.raise_for_status()
                return response
            except requests.RequestException:
                # Exponential backoff (1s, 2s, 4s, ...) before the next attempt
                if attempt < self.max_retries - 1:
                    time.sleep(2 ** attempt)
        return None
```
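Here is a short usage sketch. The URLs are placeholders, not from the article, and the constructor arguments just override the defaults shown above:

```python
scraper = RateLimitedScraper(min_delay=1.0, max_delay=3.0, max_retries=4)

urls = [
    "https://example.com/page/1",  # placeholder URLs
    "https://example.com/page/2",
]

for url in urls:
    response = scraper.get(url)
    if response is None:
        print(f"Giving up on {url}")
        continue
    print(url, response.status_code, len(response.text))
```

Because `get()` returns `None` once retries are exhausted, the calling loop can skip a dead URL and keep going instead of crashing mid-run.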
Key Features
- Random delays: a 1.5-4 s pause between requests mimics human browsing
- Exponential backoff: waits of 1s → 2s → 4s between failed attempts
- Retry-After: respects the server's rate-limit header on 429 responses (see the sketch below for a more robust parser)
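The scraper above reads Retry-After as an integer number of seconds, which covers the common case, but the header is also allowed to be an HTTP date. A minimal sketch of a parser that handles both forms; the helper name `parse_retry_after` is mine, not part of the class above:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=60.0):
    """Return a wait time in seconds from a Retry-After header value.

    The header may be an integer ("120") or an HTTP date
    ("Wed, 21 Oct 2025 07:28:00 GMT"). Falls back to `default`
    when the value is missing or unparseable.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        retry_at = parsedate_to_datetime(value)
        delta = (retry_at - datetime.now(timezone.utc)).total_seconds()
        return max(delta, 0.0)
    except (TypeError, ValueError):
        return default
```

Inside `get()`, this would replace the `int(response.headers.get('Retry-After', 60))` line with `parse_retry_after(response.headers.get('Retry-After'))`.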
Get the Full Toolkit
Want 47 production-ready Python scripts including scrapers with proxy rotation and anti-detection?
What are you scraping? Share in the comments!