Hacker News is one of the richest data sources in tech. Every day, thousands of stories, comments, job posts, and community discussions flow through it. If you're building a trend tracker, a newsletter curator, a job aggregator, or just experimenting with social data — HN is a natural starting point.
The good news: you have several practical options. The bad news: each comes with trade-offs you should know before you start. This article walks through every real approach in 2026, with working Python examples.
Option 1: The Official Hacker News Firebase API
HN has an official API provided by Firebase. It's free, rate-limit-friendly, and gives you raw item data.
What it covers:
- Stories, comments, jobs, polls, and Ask HN posts
- Live "Top Stories", "New Stories", "Best Stories" feeds
- User profiles
- The "Updates" feed (real-time changes)
The catch: There's no full-text search. You can fetch the top 500 items by ID and then fetch each one individually — but that's a lot of requests if you just want the titles.
Getting the top stories
import requests

BASE = "https://hacker-news.firebaseio.com/v0"

def get_top_stories(limit=10):
    ids = requests.get(f"{BASE}/topstories.json").json()
    stories = []
    for item_id in ids[:limit]:
        item = requests.get(f"{BASE}/item/{item_id}.json").json()
        stories.append(item)
    return stories

for story in get_top_stories(5):
    print(story.get("title"), "—", story.get("score"), "pts")
Output:
Show HN: I built a local-first AI pair programmer — 1204 pts
Ask HN: What are you working on? (April 2026) — 832 pts
...
This works reliably. The downside is that fetching 100 stories means 101 HTTP requests (one for the list, one per item). Fine for small scripts; painful at scale.
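One way to soften that overhead is to fetch items concurrently. This is a sketch using a thread pool, not an official pattern; keep max_workers modest so you stay polite to the API:

```python
import concurrent.futures

import requests

BASE = "https://hacker-news.firebaseio.com/v0"

def fetch_item(item_id):
    # One GET per item; the Firebase API has no batch endpoint
    return requests.get(f"{BASE}/item/{item_id}.json", timeout=10).json()

def get_stories_concurrent(ids, max_workers=8):
    # pool.map preserves input order, so results line up with ids
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_item, ids))
```

With eight workers, 100 items finish in roughly the time a dozen sequential requests would take, without raising your total request count.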
Rate limits and etiquette
The Firebase API doesn't publish official rate limits, but HN's guidelines ask you to be respectful. Add a small delay between requests:
import time

for item_id in ids[:50]:
    item = requests.get(f"{BASE}/item/{item_id}.json").json()
    time.sleep(0.05)  # 50 ms between calls
Option 2: The JSON Trick — Append .json to Any Firebase Path
This is one of the more useful HN data tricks. Every path in HN's underlying Firebase database can be read directly by appending a .json suffix. Note that this works on the hacker-news.firebaseio.com host, not on news.ycombinator.com page URLs, which don't serve JSON.
import requests

# Get a specific item by its numeric ID
item = requests.get("https://hacker-news.firebaseio.com/v0/item/39789234.json").json()

# Get a user profile
user = requests.get("https://hacker-news.firebaseio.com/v0/user/pg.json").json()
This is the same Firebase API as Option 1. The real power comes from using Firebase's REST query syntax directly:
# Get the first 25 IDs from newstories using Firebase REST query params
resp = requests.get(
    "https://hacker-news.firebaseio.com/v0/newstories.json",
    params={"limitToFirst": "25", "orderBy": '"$key"'},
)
ids = resp.json()
Firebase supports limitToFirst, limitToLast, startAt, endAt, and orderBy — useful for paginating through large result sets.
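As a concrete illustration, here's one way to slice /newstories.json with those parameters. This is a sketch: orderBy and startAt values must be JSON-encoded (hence the extra quotes), and in my experience sliced array reads can come back as an index-to-ID mapping rather than a plain list, so verify against your own responses.

```python
import requests

BASE = "https://hacker-news.firebaseio.com/v0"

def slice_params(start, page_size):
    # Firebase REST filter values must be JSON-encoded,
    # so "$key" and the start key both carry literal quotes
    return {
        "orderBy": '"$key"',
        "startAt": f'"{start}"',
        "limitToFirst": page_size,
    }

def fetch_id_slice(start=0, page_size=100):
    resp = requests.get(
        f"{BASE}/newstories.json",
        params=slice_params(start, page_size),
        timeout=10,
    )
    data = resp.json()
    # Sliced array reads may arrive as {index: id}; normalize to a list
    if isinstance(data, dict):
        return [data[k] for k in sorted(data, key=int)]
    return data or []
```

Stepping `start` by `page_size` walks the full 500-ID list in chunks.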
Option 3: The Algolia Search API (Best for Most Use Cases)
Algolia powers HN's official search at hn.algolia.com. This is the approach you want if you need full-text search, date filtering, or keyword monitoring.
What it covers:
- Full-text search across stories and comments
- Filter by date range, score, author
- Tag filters: story, comment, show_hn, ask_hn, job
- Pagination
No API key required. It's free for reasonable use.
Searching stories by keyword
import requests

def search_hn(query, tags="story", limit=10):
    resp = requests.get(
        "https://hn.algolia.com/api/v1/search",
        params={
            "query": query,
            "tags": tags,
            "hitsPerPage": limit,
        },
    )
    data = resp.json()
    for hit in data["hits"]:
        print(f"[{hit.get('points', 0)} pts] {hit['title']} — {hit.get('url', '')}")
search_hn("LLM agents 2026")
Filtering by date and score
from datetime import datetime, timedelta

# Stories from the last 7 days with at least 100 points
week_ago = int((datetime.now() - timedelta(days=7)).timestamp())

resp = requests.get(
    "https://hn.algolia.com/api/v1/search",
    params={
        "query": "startup",
        "tags": "story",
        "numericFilters": f"created_at_i>{week_ago},points>=100",
        "hitsPerPage": 20,
    },
)
for hit in resp.json()["hits"]:
    print(hit["title"])
Getting recent stories (not by relevance)
Algolia exposes a search_by_date endpoint that returns items sorted chronologically rather than by relevance:
resp = requests.get(
    "https://hn.algolia.com/api/v1/search_by_date",
    params={"tags": "story", "hitsPerPage": 30},
)
Use search for relevance-ranked results, search_by_date for chronological. For monitoring pipelines, search_by_date is usually what you want.
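If you need more than one page of history, Algolia's page parameter works but is limited in how deep it can go. A common workaround is to cursor on created_at_i, repeatedly asking for items strictly older than the last one seen. A sketch (the get parameter is injectable purely to make it testable):

```python
import requests

ALGOLIA = "https://hn.algolia.com/api/v1/search_by_date"

def iter_hits(query, tags="story", per_page=100, get=requests.get):
    """Walk backwards in time, using the oldest created_at_i seen as a cursor."""
    cursor = None
    while True:
        params = {"query": query, "tags": tags, "hitsPerPage": per_page}
        if cursor is not None:
            # Only items strictly older than the last batch
            params["numericFilters"] = f"created_at_i<{cursor}"
        hits = get(ALGOLIA, params=params).json().get("hits", [])
        if not hits:
            return
        yield from hits
        cursor = hits[-1]["created_at_i"]
```

One caveat: the strict < comparison can skip items that share a timestamp with the cursor. That's usually acceptable for monitoring; handle it explicitly if you need completeness.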
Rate limits
Algolia's public HN API has undocumented limits but is generally tolerant of a few requests per second. For a monitoring script that runs every few minutes, you'll have no issues.
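If you do hit a 429, backing off and retrying beats hammering the endpoint. A minimal sketch (the get parameter is injectable for testing; tune the retry count and status codes to your needs):

```python
import time

import requests

def get_with_backoff(url, params=None, retries=4, base_delay=1.0, get=requests.get):
    """Retry transient failures with exponential backoff: 1s, 2s, 4s, ..."""
    resp = None
    for attempt in range(retries):
        resp = get(url, params=params, timeout=10)
        if resp.status_code not in (429, 500, 502, 503):
            return resp
        time.sleep(base_delay * (2 ** attempt))
    return resp  # caller decides what to do with the final failure
```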
Option 4: A Free Hosted Endpoint (No Scraper Required)
If you want HN data without managing the above yourself — no Firebase polling loop, no Algolia rate-limit concerns, no infrastructure overhead — there's a hosted option: The Data Collector API at https://frog03-20494.wykr.es.
It offers:
- /api/hackernews/search — keyword search across HN stories
- /api/hackernews/trending — current trending/front-page stories
- 100 free calls with no credit card required
Getting a key takes one curl command:
curl -X POST https://frog03-20494.wykr.es/api/register \
-H "Content-Type: application/json" \
-d '{"email": "you@example.com"}'
Then use it in Python:
import requests

API_KEY = "your-key-here"
BASE = "https://frog03-20494.wykr.es/api"

# Search HN stories
resp = requests.get(
    f"{BASE}/hackernews/search",
    params={"q": "AI agents", "limit": 10},
    headers={"X-API-Key": API_KEY},
)
for story in resp.json().get("results", []):
    print(f"[{story.get('score', 0)} pts] {story['title']}")

# Get trending stories
trending = requests.get(
    f"{BASE}/hackernews/trending",
    headers={"X-API-Key": API_KEY},
).json()
for story in trending.get("results", []):
    print(story["title"])
Good fit if you're prototyping quickly or don't want to maintain scraping infrastructure long-term.
Putting It Together: A Minimal HN Monitor
Here's a complete monitoring script that checks for new HN stories matching a keyword and prints anything that appeared in the last hour:
import requests
from datetime import datetime, timedelta

KEYWORD = "python"
ALGOLIA_SEARCH = "https://hn.algolia.com/api/v1/search_by_date"

def get_recent_stories(keyword, hours=1):
    cutoff = int((datetime.now() - timedelta(hours=hours)).timestamp())
    resp = requests.get(ALGOLIA_SEARCH, params={
        "query": keyword,
        "tags": "story",
        "numericFilters": f"created_at_i>{cutoff}",
        "hitsPerPage": 50,
    })
    return resp.json().get("hits", [])

if __name__ == "__main__":
    stories = get_recent_stories(KEYWORD)
    if not stories:
        print(f"No new stories about '{KEYWORD}' in the last hour.")
    else:
        print(f"Found {len(stories)} new stories about '{KEYWORD}':")
        for s in stories:
            print(f"  [{s.get('points', 0)} pts] {s['title']}")
            url = s.get("url") or f"https://news.ycombinator.com/item?id={s['objectID']}"
            print(f"  {url}")
Run this on a cron schedule, a GitHub Actions workflow, or any simple loop and you have a free real-time HN monitor.
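If you run it as a loop rather than a one-shot cron job, deduplicate across polls so the same story isn't reported twice. A small helper for that, plus an illustrative loop that assumes KEYWORD and get_recent_stories from the script above are in scope (and never terminates on its own):

```python
import time

def new_hits(hits, seen):
    """Return only hits whose objectID hasn't been seen; updates `seen` in place."""
    fresh = [h for h in hits if h["objectID"] not in seen]
    seen.update(h["objectID"] for h in fresh)
    return fresh

def run_monitor(poll_seconds=300):
    # Illustrative loop around get_recent_stories() from the script above
    seen = set()
    while True:
        for s in new_hits(get_recent_stories(KEYWORD), seen):
            print(f"[{s.get('points', 0)} pts] {s['title']}")
        time.sleep(poll_seconds)
```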
Which Option Should You Use?
| Use case | Best option |
|---|---|
| Get live front-page stories | Firebase API (/topstories.json) |
| Full-text keyword search | Algolia API |
| Monitor a topic over time | Algolia search_by_date + cron |
| Item-level detail (comments, metadata) | Firebase item API |
| Don't want to write/maintain a scraper | The Data Collector API |
| High-volume pipeline | Algolia + Firebase, or hosted API |
Final Notes
HN data is publicly available and widely used for research, trend monitoring, and tooling. The Algolia API is extremely well-designed and is usually the first option to reach for — full-text search with date and score filtering covers most use cases.
The Firebase API is reliable but chatty for bulk fetches. For anything beyond basic item lookups, Algolia saves you significant request overhead.
If you're prototyping and don't want to manage infrastructure, the hosted endpoint at https://frog03-20494.wykr.es gets you 100 free calls with an instant API key. Useful for quick experiments or low-volume pipelines.
Happy hacking — and remember to be kind to public APIs.