Build a Production-Ready Web Scraper in Python (Anti-Detection Included)
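Web scraping is one of the most useful Python skills for automation. Here is a compact scraper you can build on for production use, with basic anti-detection: a randomized User-Agent, a persistent session, and jittered delays between requests.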
The Complete Web Scraper
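import random
import time

import pandas as pd
import requests
from bs4 import BeautifulSoup


class SmartScraper:
    # Pool of User-Agent strings; one is picked at random per scraper instance.
    USER_AGENTS = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Safari/605.1",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/121.0",
    ]

    def __init__(self):
        # A Session reuses connections and keeps cookies across requests.
        self.session = requests.Session()
        self.session.headers["User-Agent"] = random.choice(self.USER_AGENTS)

    def scrape(self, url, delay=2):
        # Wait `delay` seconds plus up to one second of random jitter.
        time.sleep(delay + random.random())
        r = self.session.get(url, timeout=30)
        r.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
        return r.text

    def scrape_to_csv(self, urls, selector, filename):
        # Collect the stripped text of every element matching the CSS selector.
        results = []
        for url in urls:
            soup = BeautifulSoup(self.scrape(url), "html.parser")
            for el in soup.select(selector):
                results.append({"url": url, "text": el.text.strip()})
        # Fixed columns keep the CSV header even when nothing matched.
        pd.DataFrame(results, columns=["url", "text"]).to_csv(filename, index=False)
        return len(results)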
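To see what the parsing and export steps do without touching the network, here is a minimal offline sketch. The `SAMPLE_HTML` string, the `extract_records` helper, and the `https://example.com/blog` URL are illustrative assumptions, not part of the scraper above. The helper mirrors the inner loop of `scrape_to_csv`.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical sample page, standing in for a downloaded response body.
SAMPLE_HTML = """
<html><body>
  <h2 class="title"> First post </h2>
  <h2 class="title">Second post</h2>
  <p>Not selected</p>
</body></html>
"""


def extract_records(html, selector, url):
    """Return one {"url", "text"} record per element matching the CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [{"url": url, "text": el.text.strip()} for el in soup.select(selector)]


records = extract_records(SAMPLE_HTML, "h2.title", "https://example.com/blog")
frame = pd.DataFrame(records, columns=["url", "text"])

# With no path argument, pandas returns the serialized text instead of writing a file.
csv_text = frame.to_csv(index=False)         # same layout scrape_to_csv writes
json_text = frame.to_json(orient="records")  # JSON alternative to CSV
print(csv_text)
```

Only the two `h2.title` elements are selected, and `strip()` removes the padding around "First post". Swapping `to_csv` for `to_json` is all it takes to change the output format.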
Features
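- Random User-Agent picked from a pool for each scraper instance
- Configurable delay with random jitter between requests
- CSV export through pandas (JSON or Excel output only needs a different DataFrame writer, such as to_json or to_excel)
- 30-second request timeout; retries are not built into the class and have to be wrapped around scrape
- Cookie/session management through requests.Session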
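Retries and per-request User-Agent rotation can be layered on top of the session. The sketch below is one possible approach, not part of the class above: `fetch_with_retry` and its `retries` and `backoff` parameters are hypothetical names, and the backoff schedule (exponential with jitter) is an assumption you may want to tune.

```python
import random
import time

import requests

# Same pool as SmartScraper.USER_AGENTS, repeated so the sketch is self-contained.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Safari/605.1",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/121.0",
]


def fetch_with_retry(session, url, retries=3, backoff=1.0, timeout=30):
    """GET `url`, choosing a fresh User-Agent per attempt and retrying on failure."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        # Per-request header overrides the session-wide User-Agent.
        headers = {"User-Agent": random.choice(USER_AGENTS)}
        try:
            response = session.get(url, headers=headers, timeout=timeout)
            response.raise_for_status()  # treat 4xx/5xx as failures
            return response.text
        except requests.RequestException as exc:
            last_error = exc
            if attempt < retries - 1:
                # Exponential backoff with jitter, scaled by `backoff` seconds.
                time.sleep(backoff * (2 ** attempt + random.random()))
    raise last_error
```

A call such as `fetch_with_retry(requests.Session(), url)` could replace the `self.session.get(...)` line inside `scrape`. Note that this sketch retries every HTTP error, including 404s. Retrying only on status codes such as 429 or 503 would be a reasonable refinement.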
Follow for more Python automation!