DEV Community

Alex Spinov


Unpopular Opinion: Stop Scraping HTML — Use These Free APIs Instead

I've been building web scrapers for years. Here's my controversial take: most web scraping tutorials teach you the wrong thing.

They teach you to parse HTML. To fight with selectors. To handle dynamic JavaScript rendering.

But in my experience, roughly 80% of the data you need is available through free public APIs that rarely get mentioned.

The APIs Nobody Knows About

  • PyPI has a JSON API. https://pypi.org/pypi/{package}/json — no key, no auth.
  • YouTube has Innertube, an internal and undocumented API. No registration and no official quota, but it can change without notice.
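  • arXiv has a free search API. 2M+ papers, returned as an Atom XML feed.
  • PubMed (via NCBI E-utilities) returns medical research data as XML by default, with JSON available for search and summary calls.
  • GitHub gives you public repo data without a token, rate-limited to 60 requests per hour per IP.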
  • Crossref searches 130M+ research papers for free.
  • RDAP, the JSON-based successor to WHOIS, returns domain registration data over HTTPS.
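
To make the first item concrete, here is a minimal sketch of calling the PyPI JSON endpoint with only the standard library. The helper names (pypi_json_url, summarize, fetch_package) are my own for illustration, and the code assumes the response carries an "info" object with "name", "version" and "summary" keys.

```python
import json
import urllib.request

# Endpoint pattern quoted in the list above.
PYPI_URL = "https://pypi.org/pypi/{package}/json"


def pypi_json_url(package):
    """Build the JSON API URL for a package name."""
    return PYPI_URL.format(package=package)


def summarize(payload):
    """Pick a few fields out of the decoded response.

    Assumption: the payload has an "info" object holding "name",
    "version" and "summary". Missing keys come back as None.
    """
    info = payload.get("info", {})
    return {key: info.get(key) for key in ("name", "version", "summary")}


def fetch_package(package, timeout=10.0):
    """Fetch and decode the JSON document for one package (needs network)."""
    with urllib.request.urlopen(pypi_json_url(package), timeout=timeout) as resp:
        return json.load(resp)


# Example call, commented out because it needs network access:
# print(summarize(fetch_package("requests")))
```

There is no selector and no HTML parsing here: the response is already a dictionary, so the "scraper" is a single lookup.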

I documented all of them in my free APIs list — 200+ APIs that need zero registration.

Why This Matters

Every time you write a BeautifulSoup selector, you're:

  1. Building something fragile (one HTML change = broken scraper)
  2. Fighting anti-bot systems unnecessarily
  3. Ignoring structured data that's already there

APIs don't change their response format every week. HTML does.

My Rule

Before scraping ANY website, I spend 5 minutes checking:

  1. Does it have a public API? (check /api, /graphql, or docs)
  2. Does it expose JSON in page source? (ytInitialData, __NEXT_DATA__)
  3. Does it have RSS/Atom feeds?

Only if all three fail do I touch the HTML.
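
Checks 2 and 3 can be roughly automated. The sketch below uses the standard library's html.parser to look for a __NEXT_DATA__ script and for RSS/Atom link tags in a page you have already downloaded. PageProbe and probe are hypothetical names of mine, and the code does not handle ytInitialData, which is assigned inside a JavaScript statement and needs separate extraction.

```python
import json
from html.parser import HTMLParser

# MIME types commonly used to advertise feeds in <link> tags.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class PageProbe(HTMLParser):
    """Collect the __NEXT_DATA__ script body and any RSS/Atom feed links."""

    def __init__(self):
        super().__init__()
        self.feeds = []
        self._in_next_data = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script" and attr.get("id") == "__NEXT_DATA__":
            self._in_next_data = True
        elif tag == "link" and attr.get("type") in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_next_data = False

    def handle_data(self, data):
        # Script contents arrive here as raw text.
        if self._in_next_data:
            self._chunks.append(data)

    @property
    def next_data(self):
        """Decoded __NEXT_DATA__ payload, or None if the page has none."""
        raw = "".join(self._chunks).strip()
        return json.loads(raw) if raw else None


def probe(html):
    """Parse a page once and return the populated PageProbe."""
    parser = PageProbe()
    parser.feed(html)
    parser.close()
    return parser
```

If probe(html).next_data or probe(html).feeds comes back non-empty, there is structured data to read before any CSS selector gets written.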


What's your approach? Do you default to HTML scraping or APIs first? Have you discovered any hidden APIs that saved you hours of work?

I'm genuinely curious — drop your experience in the comments.



Need custom dev tools, scrapers, or API integrations? I build automation for dev teams. Email spinov001@gmail.com — or explore awesome-web-scraping.




