DEV Community

Алексей Спинов

Posted on Mar 18

Reddit Has a Public JSON API Most Scrapers Ignore

#webdev #api #javascript #webscraping

Most Reddit scrapers parse HTML and break on every redesign. But Reddit has a public JSON API hiding in plain sight.

The Secret: Just Add .json

Append .json to the path of a Reddit URL (before any query string):

https://www.reddit.com/r/programming/hot.json
https://www.reddit.com/search.json?q=web+scraping
https://www.reddit.com/r/programming/comments/abc123/title.json
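The rewrite above can be sketched as a small helper. This is only an illustration: `to_json_url` is a hypothetical name, and it assumes the suffix belongs on the URL path, ahead of the query string, as in the search example.

```python
from urllib.parse import urlsplit, urlunsplit


def to_json_url(url: str) -> str:
    """Return the .json variant of a Reddit URL, keeping the query string."""
    parts = urlsplit(url)
    # Drop a trailing slash so we get /hot.json rather than /hot/.json
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(to_json_url("https://www.reddit.com/r/programming/hot/"))
# -> https://www.reddit.com/r/programming/hot.json
print(to_json_url("https://www.reddit.com/search?q=web+scraping"))
# -> https://www.reddit.com/search.json?q=web+scraping
```

The helper leaves a URL alone if it already ends in .json, so it is safe to call twice.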

What You Get

Structured JSON with 20+ fields per post:

title, author, score, upvote_ratio
num_comments, flair (link_flair_text), awards (total_awards_received)
selftext (full post body)
url, domain, is_video, thumbnail
created_utc, permalink
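Here is a minimal sketch of pulling those fields out of a listing response. It assumes the usual listing envelope, where posts sit under data.children and each child has kind "t3". The `extract_posts` helper and the sample payload are made up for illustration; real responses carry many more keys.

```python
from typing import Any

# Field names taken from the list above
FIELDS = (
    "title", "author", "score", "upvote_ratio", "num_comments",
    "selftext", "url", "domain", "is_video", "thumbnail",
    "created_utc", "permalink",
)


def extract_posts(listing: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten a listing into a list of dicts holding only FIELDS."""
    posts = []
    for child in listing["data"]["children"]:
        if child.get("kind") != "t3":  # assumption: t3 marks a post
            continue
        data = child["data"]
        # Missing keys become None instead of raising KeyError
        posts.append({name: data.get(name) for name in FIELDS})
    return posts


# Hand-written sample in the shape of a listing, not real Reddit data
sample = {
    "kind": "Listing",
    "data": {
        "after": "t3_abc123",
        "children": [
            {"kind": "t3", "data": {
                "title": "Example post", "author": "someone",
                "score": 42, "num_comments": 7,
                "permalink": "/r/programming/comments/abc123/title/",
            }},
        ],
    },
}

posts = extract_posts(sample)
print(posts[0]["title"], posts[0]["score"])  # -> Example post 42
```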

Plus full comment trees with nested replies.
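Walking a comment tree could look roughly like this. The sketch assumes a thread URL returns a two-element array (the post listing, then the comment listing), that comments have kind "t1", and that replies is either a nested listing or an empty string. `walk_comments` and the sample thread are illustrative, not taken from a real response.

```python
from typing import Any, Iterator, Optional, Tuple


def walk_comments(listing: dict[str, Any], depth: int = 0) -> Iterator[Tuple[int, Optional[str], Optional[str]]]:
    """Yield (depth, author, body) for each comment, depth-first."""
    for child in listing["data"]["children"]:
        if child["kind"] != "t1":  # skips "more" placeholders for collapsed replies
            continue
        data = child["data"]
        yield depth, data.get("author"), data.get("body")
        replies = data.get("replies")
        if isinstance(replies, dict):  # an empty string means no replies
            yield from walk_comments(replies, depth + 1)


# Hand-written sample: [post listing, comment listing]
thread = [
    {"kind": "Listing", "data": {"children": [
        {"kind": "t3", "data": {"title": "Example post"}},
    ]}},
    {"kind": "Listing", "data": {"children": [
        {"kind": "t1", "data": {"author": "alice", "body": "Top-level", "replies": {
            "kind": "Listing", "data": {"children": [
                {"kind": "t1", "data": {"author": "bob", "body": "Nested reply", "replies": ""}},
                {"kind": "more", "data": {"children": ["xyz"]}},
            ]},
        }}},
        {"kind": "t1", "data": {"author": "carol", "body": "Another", "replies": ""}},
    ]}},
]

rows = list(walk_comments(thread[1]))
for depth, author, body in rows:
    print("  " * depth + f"{author}: {body}")
```

The "more" entries stand in for replies that were not included in the response; this sketch simply skips them.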

Why It's Better

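Survives UI redesigns: the JSON output is separate from the rendered page
More complete data: includes fields that the UI does not display
Faster: JSON responses are typically lighter than the full HTML page
Pagination: pass the after value from one response as the after parameter of the next request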
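The pagination loop can be sketched without touching the network. `iter_pages` is a hypothetical helper that takes any `fetch_page(after)` callable. A real one would request something like hot.json?after=<token>, but here a fake fetcher with invented tokens stands in so the control flow is easy to follow.

```python
from typing import Any, Callable, Iterator, Optional


def iter_pages(
    fetch_page: Callable[[Optional[str]], dict[str, Any]],
    max_pages: int = 5,
) -> Iterator[dict[str, Any]]:
    """Yield post data page by page, following the `after` cursor."""
    after: Optional[str] = None
    for _ in range(max_pages):
        listing = fetch_page(after)
        for child in listing["data"]["children"]:
            yield child["data"]
        after = listing["data"].get("after")
        if not after:  # a null cursor means there are no more pages
            break


# Fake two-page feed keyed by cursor (illustrative data only)
FAKE_PAGES = {
    None: {"data": {"children": [{"data": {"title": "a"}}, {"data": {"title": "b"}}],
                    "after": "t3_b"}},
    "t3_b": {"data": {"children": [{"data": {"title": "c"}}], "after": None}},
}
calls = []


def fake_fetch(after: Optional[str]) -> dict[str, Any]:
    calls.append(after)
    return FAKE_PAGES[after]


titles = [post["title"] for post in iter_pages(fake_fetch)]
print(titles)  # -> ['a', 'b', 'c']
```

The `max_pages` cap is a design choice to keep a buggy cursor from looping forever.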

Caveats

Send a descriptive User-Agent header; generic library defaults tend to get throttled or blocked
Rate limit: don't exceed 1 req/sec
Scraping from cloud servers usually needs a residential proxy (Reddit blocks many datacenter IPs)
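The first two caveats can be handled with the standard library alone. In this rough sketch, `Throttle`, `build_request`, `fetch_json` and the User-Agent string are all placeholder names and values, and the 1-second interval comes from the limit above. Proxy setup is left out.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Optional

# Placeholder value: describe your own script and contact here
USER_AGENT = "script:example-reddit-reader:v0.1 (by /u/your_username)"


class Throttle:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def build_request(url: str) -> urllib.request.Request:
    """Attach the User-Agent header to a GET request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_json(url: str, throttle: Throttle, timeout: float = 10.0) -> Any:
    """Wait for the throttle, then fetch and decode one JSON document."""
    throttle.wait()
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.load(resp)
```

Typical use is to create one `Throttle()` and pass it to every `fetch_json(...)` call so all requests share the same pacing.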

I built a Reddit scraper based on this approach — free on Apify Store (search knotless_cadence). But the API is simple enough to use directly.
