The Problem
I needed structured Hacker News data for a side project: trending stories, scores, and comment counts. The official HN API exists, but it only hands back lists of story IDs and one item per request, so you end up writing the batch fetching, filtering, and paging logic yourself.
So I built an Apify Actor that handles all of this and published it for free.
What It Does
HN Top Stories Scraper lets you:
- Scrape top, new, best, ask, and show stories
- Filter by minimum score, comment count, or keyword
- Get up to 500 stories per run
- Export as JSON or CSV, or connect the output to Google Sheets, Slack, or Zapier
It uses the official HN Firebase API — no scraping, no proxies needed.
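For anyone curious what talking to that API involves, here is a minimal sketch using only the Python standard library. It is not the Actor's source code. The endpoints are the public HN Firebase ones; the function names, the thread-pool size, and the mapping of the API's `descendants` and `by` fields to `comments` and `author` are my assumptions, chosen to match the output shape in this post.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

BASE = "https://hacker-news.firebaseio.com/v0"
# Public list endpoints of the HN API, keyed by the story types listed above.
LISTS = {"top": "topstories", "new": "newstories", "best": "beststories",
         "ask": "askstories", "show": "showstories"}


def list_url(story_type: str) -> str:
    return f"{BASE}/{LISTS[story_type]}.json"


def item_url(item_id: int) -> str:
    return f"{BASE}/item/{item_id}.json"


def get_json(url: str):
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def to_record(item: dict) -> dict:
    """Map a raw HN item to a flat record (field mapping is an assumption)."""
    return {
        "id": item["id"],
        "title": item.get("title", ""),
        "url": item.get("url"),  # missing on text-only posts such as Ask HN
        "score": item.get("score", 0),
        "comments": item.get("descendants", 0),
        "author": item.get("by"),
        "hn_url": f"https://news.ycombinator.com/item?id={item['id']}",
    }


def fetch_stories(story_type: str = "top", count: int = 50, workers: int = 10) -> list:
    # The list endpoint returns only IDs, so every story needs its own request.
    ids = get_json(list_url(story_type))[:count]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        items = list(pool.map(lambda i: get_json(item_url(i)), ids))
    # Deleted items can come back as null; skip them.
    return [to_record(item) for item in items if item]
```

Calling `fetch_stories("show", count=20)` would make 21 HTTP requests: one for the ID list and one per story. That per-item round trip is the batch-fetching work the Actor takes off your hands.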
Example
Get the top 50 AI stories with 100+ upvotes:
{
  "count": 50,
  "type": "top",
  "minScore": 100,
  "keyword": "AI"
}
Each matching story comes back as a record like this:
{
  "id": 12345678,
  "title": "Show HN: AI tool that does X",
  "url": "https://example.com",
  "score": 342,
  "comments": 89,
  "author": "username",
  "hn_url": "https://news.ycombinator.com/item?id=12345678"
}
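To make the filter semantics concrete, here is a rough sketch of how an input like the one above could be applied to records of this shape. The exact matching rules of the Actor are not spelled out in this post, so treat the details as assumptions: a case-insensitive whole-word match on the title, and the `min_comments` parameter name, are mine.

```python
import re


def matches(story: dict, min_score: int = 0, min_comments: int = 0, keyword: str = "") -> bool:
    """Return True if a story record passes all three filters."""
    if story.get("score", 0) < min_score:
        return False
    if story.get("comments", 0) < min_comments:
        return False
    if keyword:
        # Whole-word, case-insensitive match, so "AI" does not hit "email".
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if not re.search(pattern, story.get("title", ""), re.IGNORECASE):
            return False
    return True


def apply_input(stories: list, run_input: dict) -> list:
    """Filter first, then cap the result at `count` stories."""
    kept = [
        s for s in stories
        if matches(s, min_score=run_input.get("minScore", 0),
                   keyword=run_input.get("keyword", ""))
    ]
    return kept[: run_input.get("count", len(kept))]
```

A plain substring test would be simpler, but short keywords like "AI" would then match inside unrelated words, which is why the sketch uses word boundaries.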
Use Cases
- RSS replacement: Schedule runs to get stories as structured data
- Competitor monitoring: Filter by your company name
- Content curation: Feed into newsletters or Slack
- Trend analysis: Track what gets high scores over time
- Job monitoring: Scrape Who is Hiring threads
Pricing
Pay-per-result: ~$0.01 per 1,000 stories. Free tier available — no credit card needed.
Compare that to the $5-19/month that flat-rate competitors charge.
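A quick back-of-the-envelope check of that comparison, as a sketch. It assumes the approximate $0.01 per 1,000 rate is the only charge, ignores the free tier, and uses a hypothetical schedule of hourly runs at the 500-story maximum.

```python
PRICE_PER_1000 = 0.01  # USD, the approximate rate quoted above
MAX_PER_RUN = 500      # stories per run, from the feature list


def monthly_cost(runs_per_day: int, stories_per_run: int = MAX_PER_RUN, days: int = 30) -> float:
    """Pay-per-result cost in USD for a month of scheduled runs."""
    stories = runs_per_day * stories_per_run * days
    return round(stories / 1000 * PRICE_PER_1000, 2)


def break_even_stories(flat_fee: float) -> int:
    """Stories per month at which pay-per-result equals a flat monthly fee."""
    return round(flat_fee / PRICE_PER_1000 * 1000)


print(monthly_cost(24))       # hourly runs: 360,000 stories -> 3.6
print(break_even_stories(5))  # 500000
```

Under those assumptions, even an hourly schedule at the maximum batch size stays under the cheapest flat-rate plan, and you would need about 500,000 stories a month to reach $5.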
Try It
https://apify.com/cryptosignals/hn-top-stories
Feedback welcome — this is my first published Actor.
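If you would rather call it from code than from the Apify console, something like the following should work as a starting point. This is a sketch, not official usage: it assumes the Actor's API ID is the store URL path with `/` written as `~`, and it uses Apify's generic `run-sync-get-dataset-items` endpoint with a bearer token. The function names are mine.

```python
import json
from urllib.request import Request, urlopen

# Assumed API ID, derived from the store URL above ("/" becomes "~").
ACTOR_ID = "cryptosignals~hn-top-stories"


def build_request(run_input: dict, token: str) -> Request:
    """Build a POST that runs the Actor and returns its dataset items."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(run_input: dict, token: str) -> list:
    # Synchronous run: the response body is the list of story records.
    with urlopen(build_request(run_input, token), timeout=300) as resp:
        return json.load(resp)
```

For example, `run_actor({"count": 50, "type": "top", "minScore": 100, "keyword": "AI"}, token)` with your own Apify API token would send the same input as the example earlier in this post.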
Recommended Tools for Web Scraping
If you're building scrapers at scale, these tools can save you hours of dealing with proxies, CAPTCHAs, and rate limits:
ScraperAPI — Handles proxy rotation, browser rendering, and CAPTCHAs automatically. Great if you don't want to manage your own proxy infrastructure. Comes with 5,000 free API credits to get started.
ScrapeOps — A proxy aggregator that routes your requests through 20+ proxy providers and picks the best one for each target site. Useful when you need reliability across different domains.