Brad

Python Web Scraper with Rate Limiting and Retry Logic

Building reliable web scrapers means handling rate limits gracefully.

The Problem

Most beginners write scrapers that hammer a server with back-to-back requests and get their IP blocked almost immediately.
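
The anti-pattern looks roughly like this (the URL list here is a made-up placeholder):

import requests

# Hypothetical target: a thousand pages requested back to back
urls = [f"https://example.com/page/{i}" for i in range(1000)]

for url in urls:
    # No delay, no retry handling: hundreds of requests per minute
    # from a single IP is exactly what rate limiters are built to catch
    response = requests.get(url)
    print(response.status_code)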

Production-Ready Solution

import random
import time
from typing import Optional

import requests


class RateLimitedScraper:
    def __init__(self, min_delay=1.5, max_delay=4.0, max_retries=3):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.max_retries = max_retries
        # One Session reuses TCP connections and cookies across requests
        self.session = requests.Session()

    def get(self, url: str) -> Optional[requests.Response]:
        for attempt in range(self.max_retries):
            try:
                # Random delay before every request to avoid a detectable cadence
                time.sleep(random.uniform(self.min_delay, self.max_delay))
                response = self.session.get(url, timeout=30)

                if response.status_code == 429:
                    # Retry-After may also be an HTTP date, which int() can't
                    # parse; fall back to 60 seconds in that case
                    try:
                        retry_after = int(response.headers.get('Retry-After', 60))
                    except ValueError:
                        retry_after = 60
                    time.sleep(retry_after)
                    continue

                response.raise_for_status()
                return response

            except requests.RequestException:
                # Exponential backoff before the next attempt: 1s, 2s, 4s, ...
                if attempt < self.max_retries - 1:
                    time.sleep(2 ** attempt)
        return None
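
Usage is one call per page; here's a quick sketch (the URL is a placeholder):

scraper = RateLimitedScraper(min_delay=1.0, max_delay=3.0, max_retries=5)

response = scraper.get("https://example.com/products")
# get() returns None once all retries are exhausted, so check explicitly
# (requests.Response objects are falsy for 4xx/5xx, so avoid a bare truth test)
if response is not None:
    print(response.text[:200])
else:
    print("Gave up after all retries")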

Key Features

  • Random delays: 1.5-4s between requests mimics human browsing
  • Exponential backoff: 2**attempt seconds after each failure, i.e. 1s then 2s with the default max_retries=3 (a jittered variant is sketched below)
  • Retry-After: honors the server's rate-limit header on 429 responses
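
One refinement worth noting, not implemented in the class above: add jitter and a cap to the backoff so that many clients failing at the same moment don't all retry in lockstep. A minimal sketch of the idea:

import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    # "Full jitter": sleep a random amount up to the capped exponential bound,
    # which de-synchronizes retries across clients
    return random.uniform(0, min(cap, base * 2 ** attempt))

Swapping this in for the plain time.sleep(2 ** attempt) call keeps the same growth rate while spreading retries over time.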

Get the Full Toolkit

Want 47 production-ready Python scripts, including scrapers with proxy rotation and anti-detection features?

👉 Python Automation Toolkit

What are you scraping? Share in the comments!
