I've been building web scrapers for years. Here's my controversial take: most web scraping tutorials teach you the wrong thing.
They teach you to parse HTML. To fight with selectors. To handle dynamic JavaScript rendering.
But 80% of the data you need is available through free public APIs that nobody talks about.
The APIs Nobody Knows About
- PyPI has a JSON API: `https://pypi.org/pypi/{package}/json`. No key, no auth.
- YouTube has Innertube, its internal API. No quotas, no key.
- arXiv has a free search API. 2M+ papers, structured XML.
- PubMed returns medical research data in JSON.
- GitHub gives you repo data without a token.
- Crossref searches 130M+ research papers for free.
- WHOIS/RDAP returns domain registration data via REST.
I documented all of them in my free APIs list — 200+ APIs that need zero registration.
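To show how little code the API route takes, here's a minimal sketch against the PyPI JSON endpoint mentioned above. Only the standard library is used; the helper names are mine, not part of any official client:

```python
import json
import urllib.request

PYPI_JSON = "https://pypi.org/pypi/{package}/json"

def pypi_url(package: str) -> str:
    """Build the PyPI JSON API URL for a package: no key, no auth."""
    return PYPI_JSON.format(package=package)

def pypi_info(package: str) -> dict:
    """One GET, structured JSON back; no HTML parsing involved."""
    with urllib.request.urlopen(pypi_url(package)) as resp:
        return json.load(resp)

# Usage (hits the network):
# meta = pypi_info("requests")
# meta["info"]["version"]  -> latest release version
# meta["info"]["summary"]  -> one-line description
```

Compare that to the selector gymnastics needed to scrape the same fields off the project page.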
Why This Matters
Every time you write a BeautifulSoup selector, you're:
- Building something fragile (one HTML change = broken scraper)
- Fighting anti-bot systems unnecessarily
- Ignoring structured data that's already there
APIs don't change their response format every week. HTML does.
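A concrete case of "structured data that's already there": Next.js sites embed their entire page payload in a `<script id="__NEXT_DATA__">` tag. The tag id is a real Next.js convention; the sample payload below is made up for illustration:

```python
import json
import re

# Matches the JSON blob Next.js embeds in every server-rendered page.
NEXT_DATA_RE = re.compile(
    r'<script id="__NEXT_DATA__" type="application/json">(.*?)</script>',
    re.DOTALL,
)

def extract_next_data(html: str) -> dict:
    """Return the embedded JSON payload, or raise if it's absent."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))

sample = ('<script id="__NEXT_DATA__" type="application/json">'
          '{"props": {"page": 1}}</script>')
print(extract_next_data(sample)["props"]["page"])  # 1
```

One regex and `json.loads` instead of a tree of selectors, and it survives every CSS redesign.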
My Rule
Before scraping ANY website, I spend 5 minutes checking:
- Does it have a public API? (check `/api`, `/graphql`, or the docs)
- Does it expose JSON in the page source? (`ytInitialData`, `__NEXT_DATA__`)
- Does it have RSS/Atom feeds?
Only if all three fail do I touch the HTML.
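The three checks above can be sketched as one function. This is my own rough sketch, not a robust detector: the marker strings and path list are assumptions you'd tune per site:

```python
from typing import Optional
import urllib.error
import urllib.request

# Heuristics from the checklist; extend these per target site.
COMMON_API_PATHS = ("/api", "/graphql")
EMBEDDED_JSON_MARKERS = ("ytInitialData", "__NEXT_DATA__")
FEED_MARKERS = ('type="application/rss+xml"', 'type="application/atom+xml"')

def fetch(url: str) -> Optional[str]:
    """Return the response body, or None if the URL doesn't answer."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, ValueError):
        return None

def classify_source(html: str) -> str:
    """Checks 2 and 3: embedded JSON, then feeds, then HTML last."""
    if any(m in html for m in EMBEDDED_JSON_MARKERS):
        return "embedded-json"
    if any(m in html for m in FEED_MARKERS):
        return "feed"
    return "html"

def scrape_strategy(base_url: str) -> str:
    """Check 1 first: does a conventional API path answer at all?"""
    for path in COMMON_API_PATHS:
        if fetch(base_url.rstrip("/") + path) is not None:
            return "api"
    return classify_source(fetch(base_url) or "")
```

Only when `scrape_strategy` comes back `"html"` do I reach for a parser.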
What's your approach? Do you default to HTML scraping or APIs first? Have you discovered any hidden APIs that saved you hours of work?
I'm genuinely curious — drop your experience in the comments.