Reddit is a goldmine for market research, trend analysis, and product feedback — but scraping it in 2026 is harder than it used to be. In this guide, I'll show you why the public JSON API breaks on most servers, the residential proxy workaround, and a ready-made Apify actor that handles it all.
Why Scrape Reddit?
Reddit hosts 50M+ daily active users discussing every topic imaginable. Common use cases:
- Trend analysis — spot emerging topics before they go mainstream
- Market research — understand what your target audience cares about
- Product feedback — find unfiltered opinions about your product (or competitors')
- Sentiment analysis — gauge community mood around brands, events, or releases
- Competitor monitoring — track mentions and comparisons in relevant subreddits
Reddit's Public JSON API — And Why It Breaks
Reddit exposes a free, no-auth JSON API. Just append .json to any URL:
https://www.reddit.com/r/python/hot.json
https://www.reddit.com/r/webdev/search.json?q=scraping&sort=new
https://www.reddit.com/r/datascience/comments/abc123.json
This returns structured JSON with posts, scores, comments, flairs — everything you need.
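As a rough sketch of what that JSON looks like once parsed, the snippet below pulls a few fields out of a listing response. It assumes Reddit's usual listing envelope (`data.children[].data`, plus an `after` cursor for pagination). The helper name `extract_posts` and the sample payload are illustrative only.

```python
import json


def extract_posts(listing: dict) -> list[dict]:
    """Pull a few common fields out of a Reddit listing payload."""
    posts = []
    for child in listing.get("data", {}).get("children", []):
        data = child.get("data", {})
        posts.append({
            "title": data.get("title"),
            "author": data.get("author"),
            "score": data.get("score"),
            "num_comments": data.get("num_comments"),
            "flair": data.get("link_flair_text"),
        })
    return posts


# Hand-written sample shaped like a listing response (not real data).
sample = json.loads("""
{"kind": "Listing",
 "data": {"after": "t3_xyz789",
          "children": [
            {"kind": "t3",
             "data": {"title": "Example post", "author": "example_user",
                      "score": 42, "num_comments": 7,
                      "link_flair_text": "Discussion"}}]}}
""")

posts = extract_posts(sample)
# The `after` cursor is passed back as ?after=... to request the next page.
next_page_cursor = sample["data"]["after"]
```

Using `.get()` throughout keeps the parser tolerant of posts that lack a flair or have a deleted author.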
The catch: Reddit blocks datacenter IPs. If you run this from AWS, GCP, Azure, or any cloud VPS, you'll get a 403 Forbidden response. Reddit fingerprints datacenter IP ranges and rejects automated requests from them.
This means your local machine works fine, but the moment you deploy a scraper to production — it breaks.
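If you want to confirm the block from your own server, a minimal standard-library check might look like the sketch below. The `USER_AGENT` value and both function names are placeholders I made up, and the status mapping is a coarse guess at what is useful to log, not a description of Reddit's full behavior.

```python
import json
import urllib.error
import urllib.request

# Hypothetical User-Agent string; replace it with one describing your own script.
USER_AGENT = "example-research-script/0.1"


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse outcome for logging or retries."""
    if status == 200:
        return "ok"
    if status == 403:
        return "blocked"        # the datacenter-IP case described above
    if status == 429:
        return "rate_limited"
    return "error"


def fetch_listing(url: str, timeout: float = 10.0):
    """Return (outcome, parsed_json_or_None) for a Reddit .json URL."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status), json.load(response)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive here; keep the status instead of crashing.
        return classify_status(err.code), None
```

Calling `fetch_listing("https://www.reddit.com/r/python/hot.json")` from a laptop and then from a cloud VM is a quick way to see whether your deployment target is affected.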
The Residential Proxy Solution
The fix is residential proxies — IP addresses assigned to real ISP customers. Reddit can't easily distinguish these from normal user traffic.
But managing residential proxies yourself is expensive and complex. You need:
- A proxy provider subscription ($50–200+/month)
- Rotation logic to avoid rate limits
- Error handling for dead proxies
- Session management for paginated requests
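To give a feel for the rotation and dead-proxy pieces of that list, here is a minimal sketch. `ProxyPool` and `fetch_with_rotation` are names I invented for illustration, the `fetch` callable stands in for whatever HTTP client you use, and sticky sessions for pagination are left out entirely.

```python
from itertools import cycle
from typing import Callable, Iterable


class ProxyPool:
    """Round-robin over proxies, skipping any that have failed too often."""

    def __init__(self, proxies: Iterable[str], max_failures: int = 2):
        self.failures = {proxy: 0 for proxy in proxies}
        self.max_failures = max_failures
        self._cycle = cycle(list(self.failures))

    def alive(self) -> list[str]:
        return [p for p, n in self.failures.items() if n < self.max_failures]

    def next_proxy(self) -> str:
        if not self.alive():
            raise RuntimeError("no live proxies left")
        # At least one proxy is alive, so this loop always returns.
        for proxy in self._cycle:
            if self.failures[proxy] < self.max_failures:
                return proxy

    def report_failure(self, proxy: str) -> None:
        self.failures[proxy] += 1


def fetch_with_rotation(url: str, pool: ProxyPool,
                        fetch: Callable[[str, str], str], attempts: int = 5):
    """Call fetch(url, proxy) on successive proxies until one succeeds."""
    last_error = None
    for _ in range(attempts):
        proxy = pool.next_proxy()
        try:
            return fetch(url, proxy)
        except OSError as err:  # urllib and socket errors derive from OSError
            pool.report_failure(proxy)
            last_error = err
    raise RuntimeError(f"all {attempts} attempts failed") from last_error
```

Even this toy version hints at the maintenance cost: failure thresholds, retry budgets, and provider-specific quirks all need tuning over time.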
Or you can use a tool that bundles all of this.
Reddit Scraper on Apify — 4 Modes
I built a Reddit Scraper actor on Apify that handles the proxy layer, pagination, and data extraction. It has 4 modes:
1. Subreddit Mode
Scrape hot, new, or top posts from any subreddit.
Input:
{
  "mode": "subreddit",
  "subreddit": "python",
  "sort": "hot",
  "limit": 50
}
Returns: Post title, author, score, comment count, flair, URL, timestamp, selftext, and more.
2. Search Mode
Find all posts matching a keyword across Reddit or within a specific subreddit.
Input:
{
  "mode": "search",
  "query": "apify scraper",
  "subreddit": "webdev",
  "sort": "relevance",
  "limit": 25
}
Great for monitoring brand mentions or tracking discussions about a specific topic.
3. Comments Mode
Get an entire comment thread with nested replies.
Input:
{
  "mode": "comments",
  "postUrl": "https://www.reddit.com/r/python/comments/abc123/my_post/"
}
Returns the full comment tree — author, score, body, depth level, and reply chains.
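For analysis you will often want that tree as flat rows. The sketch below shows one way to walk it, assuming each comment is a dict with `author`, `score`, and `body` keys and a nested `replies` list. Those key names, especially `replies`, are my assumption for illustration, so check them against a real run's output.

```python
def flatten_comments(comments, depth=0):
    """Yield (depth, author, score, body) for each comment, parents before replies."""
    for comment in comments:
        yield depth, comment.get("author"), comment.get("score"), comment.get("body")
        # `replies` is an assumed key name for the nested reply list.
        yield from flatten_comments(comment.get("replies") or [], depth + 1)


# Hand-written sample tree, not real actor output.
thread = [
    {"author": "alice", "score": 10, "body": "Top-level comment", "replies": [
        {"author": "bob", "score": 3, "body": "First reply", "replies": [
            {"author": "carol", "score": 1, "body": "Nested reply"},
        ]},
    ]},
    {"author": "dave", "score": 5, "body": "Another top-level comment"},
]

rows = list(flatten_comments(thread))
for depth, author, score, body in rows:
    print(f"{'  ' * depth}{author} ({score}): {body}")
```

Keeping the depth alongside each row lets you rebuild the indentation later or filter to top-level comments only.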
4. User Profile Mode
Scrape a user's recent posts and comment history.
Input:
{
  "mode": "user-profile",
  "username": "spez",
  "limit": 100
}
Returns posts, comments, karma breakdown, and activity timeline.
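If you generate inputs from code, a small builder can catch typos before a run is started. This is only a sketch: the mode names and keys come from the examples above, but which fields each mode strictly requires is my assumption, and `build_run_input` is a hypothetical helper, not part of the actor or the Apify client.

```python
# Assumed minimum fields per mode, inferred from the input examples above.
REQUIRED_FIELDS = {
    "subreddit": ("subreddit",),
    "search": ("query",),
    "comments": ("postUrl",),
    "user-profile": ("username",),
}


def build_run_input(mode: str, **fields) -> dict:
    """Return an input dict for the given mode, failing fast on obvious mistakes."""
    if mode not in REQUIRED_FIELDS:
        raise ValueError(f"unknown mode: {mode!r}")
    missing = [name for name in REQUIRED_FIELDS[mode] if not fields.get(name)]
    if missing:
        raise ValueError(f"mode {mode!r} is missing: {', '.join(missing)}")
    return {"mode": mode, **fields}


search_input = build_run_input("search", query="apify scraper",
                               subreddit="webdev", sort="relevance", limit=25)
```

Failing locally on a misspelled mode is cheaper than discovering it after a paid run has started.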
Python Code Example
Here's how to run the actor programmatically with the apify-client package:
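from apify_client import ApifyClient

# Authenticate with your personal API token from the Apify console.
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("cryptosignals/reddit-scraper").call(
    run_input={
        "mode": "subreddit",
        "subreddit": "python",
        "sort": "hot",
        "limit": 25,
    }
)

# Stream the scraped posts from the run's default dataset.
for post in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{post['score']:>5} {post['title'][:80]}")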
That's about a dozen lines to get structured Reddit data in Python. No proxies to configure, no 403 errors.
Install the client:
pip install apify-client
Use Case: Reddit Sentiment Dashboard
Here's a practical example — building a sentiment tracker for your product:
- Schedule the actor to run daily with search mode, querying your product name
- Export results to a dataset or webhook
- Run sentiment analysis on post titles and comments (TextBlob, VADER, or an LLM)
- Track over time — plot sentiment score by day to catch PR crises early
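from textblob import TextBlob

# `posts` is the list of items fetched from the actor's dataset (see the Python example above).
for post in posts:
    # Polarity ranges from -1.0 (negative) to +1.0 (positive).
    sentiment = TextBlob(post["title"]).sentiment.polarity
    print(f"{sentiment:+.2f} {post['title'][:60]}")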
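To turn per-post scores into a trend line, you can bucket them by day. The sketch below assumes each post carries a Unix timestamp under `created_utc`. That is the field name in Reddit's own JSON, but I have not confirmed the actor's output uses it, so it is a parameter here. The scorer is passed in so any sentiment library can be swapped in; the toy scorer and sample posts are made up.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def daily_sentiment(posts, score_text, timestamp_key="created_utc"):
    """Average a per-title score by UTC calendar day."""
    by_day = defaultdict(list)
    for post in posts:
        day = datetime.fromtimestamp(post[timestamp_key], tz=timezone.utc).date()
        by_day[day.isoformat()].append(score_text(post["title"]))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


def toy_scorer(text: str) -> float:
    """Stand-in scorer for the example; swap in a real sentiment model."""
    lowered = text.lower()
    if "love" in lowered:
        return 1.0
    if "hate" in lowered:
        return -1.0
    return 0.0


# Hand-written sample posts, not real data.
sample_posts = [
    {"title": "I love it", "created_utc": 1767225600},  # 2026-01-01 00:00 UTC
    {"title": "meh", "created_utc": 1767229200},        # 2026-01-01 01:00 UTC
    {"title": "I hate it", "created_utc": 1767312000},  # 2026-01-02 00:00 UTC
]

trend = daily_sentiment(sample_posts, toy_scorer)
```

With TextBlob installed, passing `lambda text: TextBlob(text).sentiment.polarity` as `score_text` gives the same daily averages using real polarity scores.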
You can also:
- Compare sentiment across competing products
- Alert on sudden negative spikes
- Track which subreddits mention you most
- Find your most vocal advocates (and critics)
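The spike alert in particular is easy to prototype. One simple approach, sketched below, compares each day's average sentiment with the mean of the preceding few days. The function name, the window of 3 days, and the 0.3 drop threshold are arbitrary illustrative choices, and the scores are invented.

```python
from statistics import mean


def negative_spikes(daily_scores: dict, window: int = 3, drop: float = 0.3) -> list:
    """Return days scoring at least `drop` below the mean of the prior `window` days."""
    days = sorted(daily_scores)  # ISO dates sort chronologically as strings
    flagged = []
    for i in range(window, len(days)):
        baseline = mean(daily_scores[d] for d in days[i - window:i])
        if baseline - daily_scores[days[i]] >= drop:
            flagged.append(days[i])
    return flagged


# Invented daily averages for illustration.
scores = {
    "2026-01-01": 0.2,
    "2026-01-02": 0.3,
    "2026-01-03": 0.25,
    "2026-01-04": -0.4,
    "2026-01-05": 0.2,
}

alerts = negative_spikes(scores)
```

A trailing baseline like this adapts to products whose normal sentiment sits above or below zero, though a noisy subreddit may need a longer window.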
Pricing
The actor is $4.99/month (starting April 3, 2026) — that includes residential proxy usage. Compare that to $50–200/month for a standalone residential proxy subscription, plus the time spent building and maintaining your own scraper.
Try the Reddit Scraper on Apify →
Have questions or feature requests? Drop a comment below or open an issue on the actor's Apify page.