After building 77 scrapers, here's my methodology:
Rule 1: Check for a JSON API before writing HTML selectors.
Rule 2: Check for RSS feeds before rendering JavaScript.
Rule 3: Check for JSON-LD before parsing the DOM.
Rule 4: Use headless browsers only as a last resort.
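The order of these rules can be sketched as a small decision function. This is only an illustration, not the code behind the 77 scrapers: `choose_strategy`, `_HintParser`, and the returned strategy labels are hypothetical names, and the sketch assumes you already have the response's `Content-Type` header and body in hand. It uses only the Python standard library.

```python
import json
from html.parser import HTMLParser


class _HintParser(HTMLParser):
    """Collects feed links and JSON-LD blocks from an HTML page (hypothetical helper)."""

    FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

    def __init__(self):
        super().__init__()
        self.feeds = []       # href values of advertised RSS/Atom feeds
        self.json_ld = []     # successfully parsed JSON-LD objects
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        mime = (attrs.get("type") or "").lower()
        if tag == "link" and mime in self.FEED_TYPES and attrs.get("href"):
            self.feeds.append(attrs["href"])
        elif tag == "script" and mime == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD rather than fail the whole page


def choose_strategy(content_type, body):
    """Return a strategy label, checking the cheapest option first."""
    # Rule 1: the endpoint already speaks JSON.
    if "json" in content_type.lower():
        return "json_api"
    parser = _HintParser()
    parser.feed(body)
    parser.close()
    # Rule 2: the page advertises an RSS or Atom feed.
    if parser.feeds:
        return "rss"
    # Rule 3: the page embeds structured data as JSON-LD.
    if parser.json_ld:
        return "json_ld"
    # Rule 4: nothing cheaper was found; fall back to DOM parsing or a headless browser.
    return "dom_or_headless"
```

In practice, finding a JSON API usually also means watching the browser's network tab for XHR requests, which a header check like this cannot replace.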
Result: 100% uptime across all 77 scrapers.
Reddit = JSON API.
Google News = RSS.
Trustpilot = JSON-LD.
YouTube = Innertube API.
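As one concrete case, here is a minimal sketch of the Reddit route. It assumes the commonly used convention that appending `.json` to a Reddit listing URL returns that listing as JSON, with posts under `data.children[].data`. The helper names `reddit_json_url` and `titles_from_listing` are made up for this example, and fetching the URL (with a descriptive User-Agent) is left out.

```python
from urllib.parse import urlsplit, urlunsplit


def reddit_json_url(page_url):
    """Turn a Reddit page URL into its assumed .json equivalent."""
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    # Keep the query string (e.g. ?limit=5) and drop any fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def titles_from_listing(payload):
    """Extract post titles from an already decoded listing payload."""
    children = payload.get("data", {}).get("children", [])
    return [c["data"]["title"] for c in children if "title" in c.get("data", {})]
```

No HTML selectors are involved, so a page redesign should not break the scraper unless the JSON shape itself changes.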
All 77: GitHub
Custom scraping — $20: Payoneer