Alex Spinov

Posted on Mar 23

I Built 77 Free Web Scrapers in One Week — Here's the Complete List

#webdev #opensource #productivity #javascript

Over the past week, I built and deployed 77 web scrapers on Apify Store. All are free, and they cover social media, e-commerce, SEO, news, developer tools, and more.

Here's what I learned, what worked, and what I'd do differently.

The Full List

Social Media (10 scrapers)

Reddit — JSON API, 20+ fields, comments, scores
YouTube Comments — Innertube API, no API key needed
YouTube Channel/Search/Transcript — full YouTube toolkit
Bluesky — AT Protocol (profiles, posts, feeds, hashtags)
Hacker News — Firebase + Algolia APIs
Threads — Meta's new platform

Reviews & Intelligence (6)

Trustpilot — JSON-LD extraction, most reliable method
Amazon Reviews — product review data
Glassdoor — company reviews and ratings
Yelp — local business reviews
IMDb — movie/TV ratings
Product Hunt — launch data and upvotes

SEO & Web Analysis (8)

SEO Audit — 50+ factors, score 0-100
Tech Stack Detector — identify frameworks and tools
Broken Links Checker — find 404s
PageSpeed — Core Web Vitals
Meta Tags Extractor — OpenGraph, Schema.org
Sitemap Scraper — parse XML sitemaps
Robots.txt Analyzer — crawl rules
SSL Checker — certificate details

Lead Generation (3)

Email Extractor — find emails on any website
Email Validator — verify email deliverability
WHOIS Lookup — domain ownership data

News & Content (4)

Google News — RSS-based, never breaks
RSS Feed Scraper — any RSS feed
Wikipedia — article data extraction
Podcast Scraper — episode data

MCP Servers for AI Agents (15)

Market Research, Company Researcher, Lead Finder, SEO Analyzer, Competitor Tracker, Content Analyzer, Social Monitor, Price Monitor, Trend Detector, Screenshot, Web Search, Email Enrichment, Data Enrichment, Keyword Research, Startup Validator

Utilities (31 more)

Crypto prices, weather, IP geolocation, URL expander, JSON formatter, exchange rates, and more.

What I Learned

1. API-first beats HTML scraping almost every time. Reddit's JSON API, YouTube's Innertube, Bluesky's AT Protocol: some of these endpoints are documented and some are not, but all of them return structured data and are faster and more reliable than parsing markup. HTML scraping should be your last resort.
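To make the API-first idea concrete, here is a minimal sketch, not the code from the actual scrapers. It assumes Reddit's public listing format, where appending `.json` to a listing URL returns posts under `data.children[].data`. The function names, the sample payload, and the User-Agent string are made up for illustration.

```python
import json
import urllib.request

# Hypothetical, trimmed-down payload in the shape of a Reddit listing.
SAMPLE_LISTING = {
    "kind": "Listing",
    "data": {
        "children": [
            {"kind": "t3", "data": {
                "title": "Show: my first scraper",
                "author": "example_user",
                "score": 42,
                "num_comments": 7,
                "permalink": "/r/webdev/comments/abc123/show_my_first_scraper/",
            }},
        ]
    },
}


def parse_listing(payload: dict) -> list[dict]:
    """Flatten a listing payload into one small dict per post."""
    posts = []
    for child in payload.get("data", {}).get("children", []):
        post = child.get("data", {})
        posts.append({
            "title": post.get("title"),
            "author": post.get("author"),
            "score": post.get("score"),
            "num_comments": post.get("num_comments"),
            "permalink": post.get("permalink"),
        })
    return posts


def fetch_listing(subreddit: str, limit: int = 25) -> list[dict]:
    """Fetch the newest posts of a subreddit as JSON (makes a network call)."""
    url = f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"
    # A descriptive User-Agent is expected; this value is a placeholder.
    request = urllib.request.Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_listing(json.load(response))


posts = parse_listing(SAMPLE_LISTING)
```

There are no CSS selectors anywhere in this: the parser only depends on key names, which tend to change far less often than page layout.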

2. JSON-LD is an underused goldmine. Sites like Trustpilot embed structured review data in their HTML for search engines such as Google. Parsing it is trivial, and because it lives in a script tag rather than in the visual markup, it usually survives redesigns.
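A rough sketch of JSON-LD extraction using only the standard library. The HTML snippet is invented for illustration and is not Trustpilot's real markup; the class and function names are hypothetical too. The only thing the code relies on is the convention that structured data sits in `<script type="application/ld+json">` tags.

```python
import json
from html.parser import HTMLParser

# Invented sample page; real sites will carry different fields.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Example Shop",
 "aggregateRating": {"@type": "AggregateRating",
                     "ratingValue": "4.6", "reviewCount": "128"}}
</script>
</head><body><h1>Example Shop</h1></body></html>
"""


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of every JSON-LD script tag."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._inside:
            self._inside = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_json_ld(html: str) -> list:
    collector = JsonLdCollector()
    collector.feed(html)
    return collector.blocks


blocks = extract_json_ld(SAMPLE_HTML)
rating = blocks[0]["aggregateRating"]["ratingValue"]
```

Nothing here touches class names or page layout, which is why this approach tends to keep working when a site changes its design.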

3. RSS is not dead. Google News RSS feeds are the most reliable way to monitor news. No API key, no JavaScript rendering, just clean XML.
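For the RSS point, a minimal sketch of reading a feed with the standard library. It assumes a plain RSS 2.0 layout (`channel` > `item` with `title`, `link`, `pubDate`). The search URL pattern in `fetch_news` is my assumption about how Google News exposes RSS, so verify it before relying on it. The sample feed and function names are invented.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Invented sample feed in RSS 2.0 shape.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example search feed</title>
    <item>
      <title>Open-source scraper toolkit released</title>
      <link>https://example.com/articles/1</link>
      <pubDate>Sat, 23 Mar 2024 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline without a date</title>
      <link>https://example.com/articles/2</link>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text) -> list[dict]:
    """Turn RSS 2.0 XML (str or bytes) into a list of item dicts."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iterfind("./channel/item"):
        published = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # pubDate uses the RFC 822 date format; it may be missing.
            "published": parsedate_to_datetime(published) if published else None,
        })
    return items


def fetch_news(query: str) -> list[dict]:
    """Fetch a news search feed (makes a network call; URL pattern is assumed)."""
    url = "https://news.google.com/rss/search?" + urllib.parse.urlencode({"q": query})
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_rss(response.read())


items = parse_rss(SAMPLE_FEED)
```

The whole pipeline is one HTTP GET and an XML parse, with no API key, headless browser, or JavaScript rendering involved.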

4. Quality > quantity. Top Apify developers have 3-5 excellent scrapers with 100K+ users each. I went for breadth — 77 scrapers — but the data shows focused quality wins.

Try Any of Them

All 77 are free on Apify Store. Browse the full list on GitHub.

Need custom scraping? $20 for any website, delivered in 24h: Order via Payoneer | Services

Which scraper would be most useful for your work? I'm curious what data problems people are solving.
