Stop Parsing HTML — 7 Websites That Give You JSON If You Ask Nicely

#beginners #tutorial #webdev #python

Most web scraping tutorials start with BeautifulSoup or Cheerio. But many popular websites already return structured JSON — you just need to know how to ask.

1. Reddit

Append .json to almost any Reddit URL (subreddits, sorted listings, comment threads):

https://www.reddit.com/r/webdev.json
https://www.reddit.com/r/python/top.json?t=week
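
A minimal Python sketch of the Reddit pattern, using only the standard library. The helper names (`reddit_listing_url`, `fetch_json`, `post_titles`) and the User-Agent string are my own placeholders, not part of any Reddit client. Reddit's API rules ask for a descriptive User-Agent, and unauthenticated requests may still be rate-limited or refused, so treat this as a starting point.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical value; replace with something that identifies your script.
USER_AGENT = "json-endpoints-demo/0.1"


def reddit_listing_url(subreddit, sort=None, **params):
    """Build a .json listing URL, e.g. /r/python/top.json?t=week."""
    path = f"/r/{subreddit}" + (f"/{sort}" if sort else "") + ".json"
    query = "?" + urllib.parse.urlencode(params) if params else ""
    return "https://www.reddit.com" + path + query


def fetch_json(url):
    """GET a URL with a custom User-Agent and parse the JSON body."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def post_titles(listing):
    """Pull (title, score) pairs out of a Reddit Listing object."""
    return [(c["data"]["title"], c["data"]["score"])
            for c in listing["data"]["children"]]
```

Typical use would be `post_titles(fetch_json(reddit_listing_url("python", "top", t="week")))`.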

2. YouTube (Innertube API)

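// Innertube is YouTube's internal, undocumented API; the payload shape may change.
const res = await fetch("https://www.youtube.com/youtubei/v1/search", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    context: { client: { clientName: "WEB", clientVersion: "2.20240101" } },
    query: "python tutorial"
  })
});
const data = await res.json(); // search results as deeply nested JSON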
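
For Python readers, here is a rough equivalent that builds the same POST request with the standard library. `build_search_request` and `INNERTUBE_SEARCH` are names I made up for this sketch. Because Innertube is undocumented, I am assuming the payload from the fetch() call above is still accepted, and I deliberately do not parse the response, since its structure is not something to rely on.

```python
import json
import urllib.request

INNERTUBE_SEARCH = "https://www.youtube.com/youtubei/v1/search"


def build_search_request(query, client_version="2.20240101"):
    """Build (but do not send) a POST request mirroring the fetch() call above."""
    payload = {
        "context": {"client": {"clientName": "WEB", "clientVersion": client_version}},
        "query": query,
    }
    return urllib.request.Request(
        INNERTUBE_SEARCH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To send it:
# urllib.request.urlopen(build_search_request("python tutorial"), timeout=10)
```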

3. Hacker News (Algolia)

https://hn.algolia.com/api/v1/search?query=python&tags=story
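
A small sketch of paging through that endpoint and reading the results. The function names are illustrative; the fields used (`hits`, `objectID`, `page`, `nbPages`) are what the Algolia HN search response returns, but check a live response before depending on any field being present.

```python
import urllib.parse


def hn_search_url(query, tags="story", page=0, hits_per_page=20):
    """Build a search URL with explicit paging parameters."""
    params = {"query": query, "tags": tags, "page": page, "hitsPerPage": hits_per_page}
    return "https://hn.algolia.com/api/v1/search?" + urllib.parse.urlencode(params)


def summarize_hits(response):
    """Reduce each hit to a title, its points, and a link to the HN discussion."""
    return [
        {
            "title": hit.get("title"),
            "points": hit.get("points"),
            "discussion": f"https://news.ycombinator.com/item?id={hit['objectID']}",
        }
        for hit in response["hits"]
    ]


def has_next_page(response):
    """Pages are zero-indexed, so the last page is nbPages - 1."""
    return response["page"] + 1 < response["nbPages"]
```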

4. Wikipedia

https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=machine+learning&format=json
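
One catch with this endpoint: each result's `snippet` field contains HTML (matches are wrapped in `<span class="searchmatch">`). A sketch of building the URL and cleaning the snippets, with helper names of my own choosing and a deliberately simple regex that only aims to handle these snippet tags:

```python
import html
import re
import urllib.parse


def wiki_search_url(term, limit=10, lang="en"):
    """Build a MediaWiki search URL; srlimit caps the number of results."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srlimit": limit,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urllib.parse.urlencode(params)


def clean_snippet(snippet):
    """Strip the highlight tags and decode HTML entities."""
    return html.unescape(re.sub(r"<[^>]+>", "", snippet))


def search_results(response):
    """Return (title, plain-text snippet) pairs from a search response."""
    return [(r["title"], clean_snippet(r["snippet"]))
            for r in response["query"]["search"]]
```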

5. GitHub

https://api.github.com/search/repositories?q=web+scraping&sort=stars
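
A sketch of the same search as a prepared request. GitHub's REST API expects a User-Agent header, and unauthenticated search requests get a low rate limit, so for anything beyond a quick test you would likely add a token. The function names and the User-Agent value here are placeholders.

```python
import urllib.parse
import urllib.request


def github_search_request(query, sort="stars", per_page=5):
    """Build (but do not send) a repository search request."""
    params = {"q": query, "sort": sort, "per_page": per_page}
    url = "https://api.github.com/search/repositories?" + urllib.parse.urlencode(params)
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "json-endpoints-demo",  # hypothetical app name
    }
    return urllib.request.Request(url, headers=headers)


def top_repos(response):
    """Return (full_name, stars) pairs from a search response."""
    return [(item["full_name"], item["stargazers_count"])
            for item in response["items"]]
```

Send it with `urllib.request.urlopen(github_search_request("web scraping"), timeout=10)` and pass the parsed JSON to `top_repos`.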

6. npm Registry

https://registry.npmjs.org/-/v1/search?text=scraping&size=10
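
The registry wraps each result in an `objects` list with the useful metadata under `package`. A short sketch with illustrative helper names; `from` is the registry's offset parameter for paging, and I am assuming `description` can be missing on some packages.

```python
import urllib.parse


def npm_search_url(text, size=10, offset=0):
    """Build a registry search URL; 'from' is the result offset."""
    params = {"text": text, "size": size, "from": offset}
    return "https://registry.npmjs.org/-/v1/search?" + urllib.parse.urlencode(params)


def package_rows(response):
    """Return (name, version, description) tuples from a search response."""
    rows = []
    for obj in response["objects"]:
        pkg = obj["package"]
        rows.append((pkg["name"], pkg["version"], pkg.get("description", "")))
    return rows
```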

7. Stack Overflow

https://api.stackexchange.com/2.3/search?order=desc&sort=votes&intitle=web+scraping&site=stackoverflow
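
Two things tend to trip people up here: the Stack Exchange API compresses its responses, which `urllib` does not undo for you, and question titles come back HTML-escaped. A sketch that handles both, with helper names of my own and a gzip check based on the magic bytes (it does not cover DEFLATE-encoded responses):

```python
import gzip
import html
import json
import urllib.parse


def so_search_url(intitle, site="stackoverflow", sort="votes", order="desc"):
    """Build a /2.3/search URL matching the one above."""
    params = {"order": order, "sort": sort, "intitle": intitle, "site": site}
    return "https://api.stackexchange.com/2.3/search?" + urllib.parse.urlencode(params)


def decode_body(raw):
    """Parse a response body, gunzipping it first if it starts with gzip magic bytes."""
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return json.loads(raw)


def questions(response):
    """Return (title, score, link) tuples with HTML entities decoded in titles."""
    return [(html.unescape(q["title"]), q["score"], q["link"])
            for q in response["items"]]
```

With `urllib`, that would look like `questions(decode_body(urllib.request.urlopen(url, timeout=10).read()))`.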

Why This Matters

Approach	Speed	Stability	Data Quality
HTML Parsing	Slower (may need JS rendering)	Breaks on redesign	Messy, needs cleaning
JSON API	Fast (direct)	Rarely breaks	Clean, structured

After building 77 scrapers, I use HTML parsing as a last resort, not the default.

More Resources

Need data from any website? $20 flat rate. JSON/CSV/Excel. 24-hour delivery. Email: Spinov001@gmail.com

Need data from the web without writing scrapers? Check my Apify actors: ready-made scrapers for HN, Reddit, LinkedIn, and 75+ more sites. Or email me: spinov001@gmail.com
