DEV Community

Алексей Спинов

Reddit Has a Public JSON API Most Scrapers Ignore

Most Reddit scrapers parse HTML and break on every redesign. But Reddit has a public JSON API hiding in plain sight.

The Secret: Just Add .json

Append .json to the path of a Reddit URL (before any query string):

https://www.reddit.com/r/programming/hot.json
https://www.reddit.com/search.json?q=web+scraping
https://www.reddit.com/r/programming/comments/abc123/title.json
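
Here is a minimal sketch of that trick in Python, using only the standard library. The helper names (to_json_url, fetch_json) and the User-Agent string are placeholders I made up for illustration, not part of any Reddit client.

```python
import json
import urllib.request
from urllib.parse import urlsplit, urlunsplit

# Hypothetical User-Agent; replace it with one that identifies your own script.
USER_AGENT = "example-script/0.1 (contact: you@example.com)"


def to_json_url(url: str) -> str:
    """Append .json to the path of a Reddit URL, keeping any query string."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def fetch_json(url: str, timeout: float = 10.0):
    """Fetch the JSON version of a Reddit page and return the parsed result."""
    request = urllib.request.Request(
        to_json_url(url), headers={"User-Agent": USER_AGENT}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_json("https://www.reddit.com/r/programming/hot") would request the first URL above. Note that the suffix goes on the path, so a search URL keeps its ?q=... part after .json.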

What You Get

Structured JSON with 20+ fields per post, including:

  • title, author, score, upvote_ratio
  • num_comments, flair, awards
  • selftext (full post body)
  • url, domain, is_video, thumbnail
  • created_utc, permalink
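
To show how those fields are reached, here is a rough sketch that pulls a few of them out of a listing response. It assumes the usual listing shape, where posts sit under data.children and each post has kind "t3". The function name extract_posts is my own, and it only reads a handful of the fields listed above.

```python
from datetime import datetime, timezone


def extract_posts(listing: dict) -> list[dict]:
    """Return a simplified dict for every post in a parsed listing response."""
    posts = []
    for child in listing.get("data", {}).get("children", []):
        if child.get("kind") != "t3":  # t3 marks a post; skip anything else
            continue
        data = child["data"]
        posts.append({
            "title": data.get("title"),
            "author": data.get("author"),
            "score": data.get("score"),
            "num_comments": data.get("num_comments"),
            # created_utc is a Unix timestamp in seconds
            "created": datetime.fromtimestamp(data.get("created_utc", 0), tz=timezone.utc),
            # permalink is relative to the site root, so prefix the host
            "permalink": "https://www.reddit.com" + data.get("permalink", ""),
        })
    return posts
```

Using .get everywhere is a deliberate choice: fields can be missing or null on some posts, and a scraper should not crash on one odd entry.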

Plus full comment trees with nested replies.
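
Walking that tree is a short recursive function. The sketch below assumes the shape I have seen in comment-page responses: a two-element list (post listing, then comment listing), comments of kind "t1", and a replies field that is either another listing or an empty string. The name iter_comments is hypothetical.

```python
from typing import Iterator


def iter_comments(listing, depth: int = 0) -> Iterator[tuple[int, str, str]]:
    """Yield (depth, author, body) for every comment in a comment listing."""
    if not isinstance(listing, dict):  # replies is "" when there are no replies
        return
    for child in listing.get("data", {}).get("children", []):
        if child.get("kind") != "t1":  # skip "more" placeholders and other kinds
            continue
        data = child["data"]
        yield depth, data.get("author"), data.get("body", "")
        # Recurse into nested replies, one level deeper
        yield from iter_comments(data.get("replies"), depth + 1)
```

With a parsed comment page in a variable called page, iter_comments(page[1]) would walk the comments. Entries of kind "more" stand in for collapsed replies and are skipped here, so very large threads will come back incomplete.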

Why It's Better

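  1. Survives UI redesigns: the JSON output is separate from the page markup
  2. More complete data: includes fields not visible in the UI
  3. Faster: JSON responses are lighter than HTML pages
  4. Pagination: pass the after value from one response as the after parameter of the next request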
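
Pagination with after can be sketched like this. The fetcher is passed in as a function so the loop stays independent of how you make requests. Both function names are made up for this example, and the limit parameter is an addition of mine that the list above does not mention.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def page_url(base_url: str, after: Optional[str] = None, limit: int = 25) -> str:
    """Build a listing URL for one page; `after` is the cursor from the previous page."""
    params = {"limit": limit}
    if after:
        params["after"] = after
    return f"{base_url}?{urlencode(params)}"


def iter_listing_pages(
    fetch_page: Callable[[Optional[str]], dict], max_pages: int = 5
) -> Iterator[dict]:
    """Yield parsed listing pages, following data.after until it runs out."""
    after = None
    for _ in range(max_pages):
        page = fetch_page(after)
        yield page
        after = page.get("data", {}).get("after")
        if not after:  # a missing or null cursor means there are no more pages
            break
```

The max_pages cap is there on purpose, so a bug in the fetcher cannot turn into an endless stream of requests.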

Caveats

  • Send a proper, descriptive User-Agent header with every request
  • Rate limit: stay at or below 1 request per second
  • Scraping from cloud servers needs a residential proxy (Reddit blocks datacenter IPs)
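
For the rate limit, a small throttle is usually enough. This is a sketch under the 1 req/sec assumption above. The Throttle class is hypothetical, and the clock and sleep arguments exist only so it can be tested without actually waiting.

```python
import time


class Throttle:
    """Block so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(self, min_interval: float = 1.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Create one Throttle per process and call wait() right before each request. Combined with a User-Agent header on every request, that covers the first two caveats. The proxy question depends on where you run the code.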

I built a Reddit scraper based on this approach — free on Apify Store (search knotless_cadence). But the API is simple enough to use directly.
