DEV Community

Brad


Python Web Scraper with Rate Limiting and Retry Logic

Building reliable web scrapers means handling rate limits gracefully.

The Problem

A common beginner mistake is to send requests in a tight loop with no delay between them. Hammering a server like that quickly gets the scraper's IP address blocked.

Production-Ready Solution

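import random
import time
from typing import Optional

import requests


class RateLimitedScraper:
    """Session-based GET helper with random delays, retries, and 429 handling."""

    def __init__(self, min_delay=1.5, max_delay=4.0, max_retries=3):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.max_retries = max_retries
        self.session = requests.Session()

    def get(self, url: str) -> Optional[requests.Response]:
        for attempt in range(self.max_retries):
            try:
                # Random pause before every request, including retries
                time.sleep(random.uniform(self.min_delay, self.max_delay))
                response = self.session.get(url, timeout=30)

                if response.status_code == 429:
                    # Retry-After may be seconds or an HTTP date; fall back to
                    # 60s when it is missing or not a plain integer
                    try:
                        retry_after = int(response.headers.get('Retry-After', 60))
                    except ValueError:
                        retry_after = 60
                    time.sleep(retry_after)
                    continue

                # Raise on other 4xx/5xx so they go through the retry path
                response.raise_for_status()
                return response

            except requests.RequestException:
                # Exponential backoff (1s, 2s, ...), skipped after the final attempt
                if attempt < self.max_retries - 1:
                    time.sleep(2 ** attempt)
        # Every attempt failed or was rate limited
        return None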
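
The `get` method reads `Retry-After` as an integer number of seconds, but the header may also carry an HTTP date. Below is a minimal sketch of a parser that covers both forms. The name `parse_retry_after` and its `default` and `now` parameters are hypothetical and not part of the class above; treat it as one possible approach rather than the definitive one.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], default: float = 60.0,
                      now: Optional[datetime] = None) -> float:
    """Return how many seconds to wait for a Retry-After header value.

    Hypothetical helper: accepts delta-seconds ("120") or an HTTP date
    ("Wed, 21 Oct 2015 07:28:00 GMT"); falls back to `default` otherwise.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        target = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable header: use the fallback wait
        return default
    if target.tzinfo is None:
        # Treat dates without a usable offset as UTC
        target = target.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative wait if the date is already in the past
    return max(0.0, (target - now).total_seconds())
```

Inside the 429 branch, the result could be passed to `time.sleep` in place of the plain `int(...)` conversion, for example `time.sleep(parse_retry_after(response.headers.get('Retry-After')))`.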

Key Features

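  • Random delays: a random 1.5-4s pause before every request (including retries) to mimic human browsing
  • Exponential backoff: sleeps 2 ** attempt seconds after each failed attempt except the last, so 1s → 2s with the default max_retries=3 (a 4s step only appears with max_retries=4 or higher)
  • Retry-After: on a 429 response, waits the number of seconds given in the Retry-After header, or 60s if the header is missing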
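
The backoff in `get` sleeps `2 ** attempt` seconds after each failed attempt except the last, so the schedule depends on `max_retries`. The sketch below works out that schedule and a rough upper bound on total sleep time when every attempt raises an exception. The function names `backoff_schedule` and `worst_case_sleep` are hypothetical helpers for illustration, and the bound ignores request time and any 429 waits.

```python
from typing import List


def backoff_schedule(max_retries: int) -> List[int]:
    """Sleeps (in seconds) applied after each failed attempt except the last."""
    return [2 ** attempt for attempt in range(max_retries - 1)]


def worst_case_sleep(max_delay: float, max_retries: int) -> float:
    """Upper bound on total sleep time when every attempt raises an exception."""
    # One random pre-request delay per attempt, each at most max_delay
    random_delays = max_retries * max_delay
    return random_delays + sum(backoff_schedule(max_retries))


print(backoff_schedule(3))       # [1, 2]
print(backoff_schedule(4))       # [1, 2, 4]
print(worst_case_sleep(4.0, 3))  # 15.0
```

With the defaults, a URL that keeps failing costs at most about 15 seconds of sleeping before `get` returns `None`, plus up to 30 seconds per attempt if requests run into the timeout.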

Get the Full Toolkit

Want 47 production-ready Python scripts including scrapers with proxy rotation and anti-detection?

👉 Python Automation Toolkit

What are you scraping? Share in the comments!


