Every day, thousands of developers discuss emerging technologies on Hacker News. What if you could turn that firehose of discussion into a structured trend dashboard?
In this tutorial, we build a Python script that analyzes HN stories and comments to detect trending topics and visualize what the developer community cares about right now.
## The Data Pipeline
We need stories and their comments. The official HN API gives you individual items, but fetching hundreds of stories plus nested comments is slow since each comment requires a separate HTTP call.
For bulk collection, I use the HN Stories + Comments Scraper on Apify, which grabs full story metadata and complete comment trees in one run. Everything below, though, works with any data source that gives you titles, scores, and comment text.
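To make the per-item cost concrete, here's a minimal sketch against the official Firebase-backed API (the endpoints are real; the `limit` cap is my own addition to keep the example cheap):

```python
import requests

HN_API = "https://hacker-news.firebaseio.com/v0"

def item_url(item_id):
    return f"{HN_API}/item/{item_id}.json"

def fetch_story_with_comments(story_id, limit=5):
    """One request for the story, then one more request per comment.
    A 150-comment thread therefore costs 150+ round trips."""
    story = requests.get(item_url(story_id), timeout=10).json()
    kids = (story.get("kids") or [])[:limit]  # top-level comment IDs
    story["comments"] = [requests.get(item_url(k), timeout=10).json()
                         for k in kids]
    return story
```

That fan-out of one HTTP call per item is exactly why bulk collection through the official API is slow.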
Assume we have our data as JSON:
```python
stories = [
    {
        "title": "Show HN: I built a Rust web framework",
        "score": 342,
        "num_comments": 156,
        "time": 1730000000,  # Unix timestamp (used later for momentum)
        "comments": [
            {"text": "Really fast compared to Actix...", "score": 45},
            {"text": "How does this handle async?", "score": 23},
        ]
    },
]
```
## Step 1: Extract Tech Keywords
We start with simple keyword extraction, weighting every mention by the story's engagement (score plus comment count):
```python
import re
from collections import Counter

TECH_TERMS = {
    'rust', 'python', 'go', 'typescript', 'zig', 'kotlin',
    'react', 'vue', 'svelte', 'htmx', 'nextjs',
    'llm', 'gpt', 'claude', 'openai', 'transformer',
    'kubernetes', 'docker', 'wasm', 'sqlite', 'postgres',
}
```
```python
def extract_trends(stories):
    """Count tech-term mentions, weighted by each story's engagement."""
    weighted_counts = Counter()
    for story in stories:
        # Pool the title and all comment text for matching
        text = story['title'].lower()
        for c in story.get('comments', []):
            text += ' ' + c.get('text', '').lower()
        for term in TECH_TERMS:
            if re.search(r'\b' + re.escape(term) + r'\b', text):
                weight = story['score'] + story['num_comments']
                weighted_counts[term] += weight
    return weighted_counts.most_common(15)
```
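Two matching pitfalls worth knowing before you extend `TECH_TERMS`: short names like "go" also match the ordinary English verb, inflating counts, and terms containing regex metacharacters (say, "node.js") must be escaped. A small hypothetical helper demonstrating both:

```python
import re

def mentions(term, text):
    """Whole-word match with the term's regex metacharacters escaped."""
    return re.search(r'\b' + re.escape(term) + r'\b', text.lower()) is not None

# mentions('go', 'Rewritten in Go for speed')  -> True (intended hit)
# mentions('go', "Let's go over the results")  -> True (false positive!)
# mentions('rust', 'trust the process')        -> False (word boundary holds)
```

There's no clean regex fix for the "go" problem; either accept the noise, or require co-occurring context terms (e.g. "golang").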
## Step 2: Detect Rising vs. Falling Topics
Compare recent mentions against a baseline period:
```python
from datetime import datetime, timedelta

def trend_momentum(stories, days_recent=7, days_baseline=30):
    """Compare each term's per-day mention rate in the recent window
    against its rate in the prior baseline window."""
    now = datetime.now()
    recent_cutoff = now - timedelta(days=days_recent)
    baseline_cutoff = now - timedelta(days=days_baseline)
    recent, baseline = Counter(), Counter()

    for story in stories:
        ts = datetime.fromtimestamp(story.get('time', 0))
        terms = {t for t in TECH_TERMS
                 if re.search(r'\b' + re.escape(t) + r'\b', story['title'].lower())}
        if ts >= recent_cutoff:
            for t in terms:
                recent[t] += 1
        elif ts >= baseline_cutoff:
            for t in terms:
                baseline[t] += 1

    momentum = {}
    for term in TECH_TERMS:
        r = recent[term] / days_recent
        # `or 0.001` guards against a zero baseline (division by zero below)
        b = baseline[term] / (days_baseline - days_recent) or 0.001
        momentum[term] = r / b

    rising = sorted(momentum.items(), key=lambda x: -x[1])[:5]
    falling = sorted(momentum.items(), key=lambda x: x[1])[:5]
    return rising, falling
```
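The ratio at the heart of this is easier to sanity-check in isolation. A self-contained sketch of just the math (the counts below are made up):

```python
def momentum_ratio(recent_count, baseline_count,
                   days_recent=7, days_baseline=30):
    """Per-day mention rate in the recent window divided by the per-day
    rate in the baseline window. >1 means heating up, <1 means cooling."""
    r = recent_count / days_recent
    b = baseline_count / (days_baseline - days_recent) or 0.001  # zero-baseline guard
    return r / b

# 14 mentions this week vs. 23 over the prior 23 days:
# recent rate = 2/day, baseline rate = 1/day -> 2.0x momentum
```

Note that a term with zero baseline mentions gets an enormous ratio from the 0.001 fallback, so brand-new terms will dominate the "rising" list by construction.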
## Step 3: Build the Dashboard
```python
def print_dashboard(stories):
    trends = extract_trends(stories)
    rising, falling = trend_momentum(stories)

    print("=" * 50)
    print("  HN TECH TREND DASHBOARD")
    print("=" * 50)

    print("\nTOP TECHNOLOGIES (by weighted mentions):")
    for i, (term, score) in enumerate(trends[:10], 1):
        bar = "#" * min(score // 100, 30)  # one '#' per 100 weight, capped at 30
        print(f"  {i:2d}. {term:12s} {bar} ({score})")

    print("\nRISING:")
    for term, ratio in rising:
        if ratio > 1.2:
            print(f"  ^ {term} ({ratio:.1f}x baseline)")

    print("\nCOOLING:")
    for term, ratio in falling:
        if ratio < 0.8:
            print(f"  v {term} ({ratio:.1f}x baseline)")
```
## Collecting Real Data
For production, here are your options:
- HN Official API - Free, but slow for bulk. Good for under 50 stories.
- HN Algolia API - Search-oriented, great for keyword queries.
- Web scraping - Fragile, breaks when HN changes markup.
- Pre-built scrapers - Tools like the Apify HN scraper handle pagination, rate limits, and comment tree traversal.
The Algolia option takes just a few lines (note: use HTTPS, and fail loudly on errors):

```python
import requests

def fetch_hn_stories(query="python", hits=100):
    """Search HN stories via the Algolia API; returns a list of hit dicts."""
    url = "https://hn.algolia.com/api/v1/search"
    params = {"query": query, "tags": "story", "hitsPerPage": hits}
    resp = requests.get(url, params=params, timeout=10)
    resp.raise_for_status()  # surface rate-limit and server errors
    return resp.json()["hits"]
```
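Algolia's hits use different field names than the schema this script expects (`points` instead of `score`, `created_at_i` for the Unix timestamp), so you need a small adapter. A sketch, assuming the standard Algolia response shape; note that the search endpoint returns no comment text, so `comments` starts empty:

```python
def to_story(hit):
    """Map one Algolia search hit onto the story schema used above."""
    return {
        "title": hit.get("title") or "",
        "score": hit.get("points") or 0,
        "num_comments": hit.get("num_comments") or 0,
        "time": hit.get("created_at_i", 0),
        "comments": [],  # fetch the thread separately for comment-weighted trends
    }
```

With this in place, `print_dashboard([to_story(h) for h in fetch_hn_stories()])` runs end to end on title-only data.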
## What You Can Build From Here
- Weekly email digest of trending tech topics
- Investment signal - track which technologies gain developer mindshare
- Content planning tool - write about what devs are actively discussing
The code runs in under a second on a few hundred stories. For continuous monitoring, throw it in a cron job with a SQLite database. Happy trend hunting!
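For the cron-job idea, the SQLite side can be as small as one append-only table (the table and column names here are my own choice):

```python
import sqlite3
import time

def save_snapshot(db_path, trends):
    """Append today's weighted counts; momentum can then be computed
    across snapshots instead of re-fetching old stories."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS trend_snapshots (
        captured_at INTEGER, term TEXT, weighted_count INTEGER)""")
    now = int(time.time())
    conn.executemany(
        "INSERT INTO trend_snapshots VALUES (?, ?, ?)",
        [(now, term, count) for term, count in trends])
    conn.commit()
    conn.close()
```

A daily cron entry then just calls `save_snapshot(db, extract_trends(stories))` after each collection run.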