DEV Community

FairPrice

Building a Tech Trend Dashboard from Hacker News Data with Python

Every day, thousands of developers discuss emerging technologies on Hacker News. What if you could turn that firehose of discussion into a structured trend dashboard?

In this tutorial, we build a Python script that analyzes HN stories and comments to detect trending topics and visualize what the developer community cares about right now.

The Data Pipeline

We need stories and their comments. The official HN API gives you individual items, but fetching hundreds of stories plus nested comments is slow since each comment requires a separate HTTP call.
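To make that cost concrete, here is a minimal sketch against the official Firebase-hosted API. The `/v0/item/{id}.json` endpoint is real; the helper names, the injectable `fetch` parameter, and the comment cap are my own choices for illustration. Every story and every comment is its own round trip:

```python
import json
from urllib.request import urlopen

HN_API = "https://hacker-news.firebaseio.com/v0"

def _get_item(item_id, fetch=None):
    """Fetch one HN item as a dict; `fetch` is injectable for testing."""
    fetch = fetch or (lambda url: json.load(urlopen(url, timeout=10)))
    return fetch(f"{HN_API}/item/{item_id}.json")

def fetch_story_with_comments(story_id, max_comments=20, fetch=None):
    """One HTTP call for the story, then one more per comment id in `kids`."""
    story = _get_item(story_id, fetch)
    comments = []
    for kid in (story.get("kids") or [])[:max_comments]:
        c = _get_item(kid, fetch)
        if c and not c.get("deleted"):
            comments.append({"text": c.get("text", ""), "score": 0})
    story["comments"] = comments
    return story
```

Fetching 200 stories with 100 comments each means roughly 20,000 sequential requests, which is why a bulk source pays off.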

For bulk collection, I use the HN Stories + Comments Scraper on Apify, which grabs full story metadata and comment trees in one run. But you can adapt this pipeline to any data source.

Assume we have our data as JSON:

stories = [
    {
        "title": "Show HN: I built a Rust web framework",
        "score": 342,
        "num_comments": 156,
        "comments": [
            {"text": "Really fast compared to Actix...", "score": 45},
            {"text": "How does this handle async?", "score": 23},
        ]
    },
]

Step 1: Extract Tech Keywords

Simple keyword extraction weighted by engagement:

import re
from collections import Counter

TECH_TERMS = {
    'rust', 'python', 'go', 'typescript', 'zig', 'kotlin',
    'react', 'vue', 'svelte', 'htmx', 'nextjs',
    'llm', 'gpt', 'claude', 'openai', 'transformer',
    'kubernetes', 'docker', 'wasm', 'sqlite', 'postgres',
}

def extract_trends(stories):
    """Count tech terms across titles and comments, weighted by engagement."""
    weighted_counts = Counter()
    for story in stories:
        # Pool the title and all comment text into one searchable string.
        text = story['title'].lower()
        for c in story.get('comments', []):
            text += ' ' + c.get('text', '').lower()
        # Weight each mention by the story's score plus comment count.
        weight = story.get('score', 0) + story.get('num_comments', 0)
        for term in TECH_TERMS:
            if re.search(r'\b' + re.escape(term) + r'\b', text):
                weighted_counts[term] += weight
    return weighted_counts.most_common(15)
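A quick sanity check on why the `\b` word boundaries matter: short language names like go appear inside longer words, and plain substring search would over-count them.

```python
import re

# Plain substring search over-counts: "go" is a substring of "django".
assert "go" in "django users"

# Word boundaries only match the standalone term.
assert re.search(r"\bgo\b", "we rewrote the service in go")
assert not re.search(r"\bgo\b", "django users")
```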

Step 2: Detect Rising vs Falling Topics

Compare recent mentions against a baseline period:

from datetime import datetime, timedelta

def trend_momentum(stories, days_recent=7, days_baseline=30):
    """Compare mention rates in the recent window against the prior baseline."""
    now = datetime.now()
    recent_cutoff = now - timedelta(days=days_recent)
    baseline_cutoff = now - timedelta(days=days_baseline)
    recent, baseline = Counter(), Counter()

    for story in stories:
        ts = datetime.fromtimestamp(story.get('time', 0))
        title = story['title'].lower()
        terms = {t for t in TECH_TERMS
                 if re.search(r'\b' + re.escape(t) + r'\b', title)}
        if ts >= recent_cutoff:
            for t in terms:
                recent[t] += 1
        elif ts >= baseline_cutoff:
            for t in terms:
                baseline[t] += 1

    momentum = {}
    baseline_days = days_baseline - days_recent
    for term in TECH_TERMS:
        r = recent[term] / days_recent
        # Guard against division by zero for terms with no baseline mentions.
        b = baseline[term] / baseline_days if baseline[term] else 0.001
        momentum[term] = r / b

    rising = sorted(momentum.items(), key=lambda x: -x[1])[:5]
    falling = sorted(momentum.items(), key=lambda x: x[1])[:5]
    return rising, falling
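To sanity-check the ratio with made-up numbers: suppose a term shows up in 14 titles over the last 7 days and in 23 titles over the 23 baseline days before that.

```python
days_recent, days_baseline = 7, 30

recent_mentions = 14    # hypothetical count in the last 7 days
baseline_mentions = 23  # hypothetical count in the prior 23 days

r = recent_mentions / days_recent                      # 2.0 mentions/day
b = baseline_mentions / (days_baseline - days_recent)  # 1.0 mention/day
momentum = r / b

print(momentum)  # 2.0 -> twice the baseline mention rate
```

A momentum above 1.0 means the term is being mentioned more often per day than it was during the baseline period; below 1.0 means it is cooling off.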

Step 3: Build the Dashboard

def print_dashboard(stories):
    trends = extract_trends(stories)
    rising, falling = trend_momentum(stories)

    print("=" * 50)
    print("  HN TECH TREND DASHBOARD")
    print("=" * 50)

    print("\nTOP TECHNOLOGIES (by weighted mentions):")
    for i, (term, score) in enumerate(trends[:10], 1):
        bar = "#" * min(score // 100, 30)
        print(f"  {i:2d}. {term:12s} {bar} ({score})")

    print("\nRISING:")
    for term, ratio in rising:
        if ratio > 1.2:
            print(f"  ^ {term} ({ratio:.1f}x baseline)")

    print("\nCOOLING:")
    for term, ratio in falling:
        if ratio < 0.8:
            print(f"  v {term} ({ratio:.1f}x baseline)")

Collecting Real Data

For production, here are your options:

  1. HN Official API - Free, but slow for bulk. Good for under 50 stories.
  2. HN Algolia API - Search-oriented, great for keyword queries.
  3. Web scraping - Fragile, breaks when HN changes markup.
  4. Pre-built scrapers - Tools like the Apify HN scraper handle pagination, rate limits, and comment tree traversal.

import requests

def fetch_hn_stories(query="python", hits=100):
    """Search HN stories via the Algolia API (no auth required)."""
    url = "https://hn.algolia.com/api/v1/search"
    params = {"query": query, "tags": "story", "hitsPerPage": hits}
    resp = requests.get(url, params=params, timeout=10)
    resp.raise_for_status()
    return resp.json()["hits"]
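The Algolia response uses slightly different field names than our story schema (`points` instead of `score`, `created_at_i` for the Unix timestamp), so a small adapter bridges the gap. Note that search hits don't include comment bodies, so `comments` starts empty; this sketch only maps the fields our pipeline reads:

```python
def algolia_hit_to_story(hit):
    """Map an Algolia HN search hit onto the story schema used above."""
    return {
        "title": hit.get("title") or "",
        "score": hit.get("points") or 0,
        "num_comments": hit.get("num_comments") or 0,
        "time": hit.get("created_at_i", 0),
        "comments": [],  # search hits carry no comment text
    }
```

With this in place, `extract_trends([algolia_hit_to_story(h) for h in hits])` works unchanged on Algolia results.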

What You Can Build From Here

  • Weekly email digest of trending tech topics
  • Investment signal - track which technologies gain developer mindshare
  • Content planning tool - write about what devs are actively discussing

The code runs in under a second on a few hundred stories. For continuous monitoring, throw it in a cron job with a SQLite database. Happy trend hunting!
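For that cron-plus-SQLite setup, here is a minimal sketch of persisting each run's weighted counts; the table name and schema are my own choice, not part of the pipeline above:

```python
import sqlite3
import time

def save_snapshot(db_path, counts):
    """Append one timestamped row per term so later runs can diff trends."""
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS trend_snapshots
                   (taken_at INTEGER, term TEXT, weighted_count INTEGER)""")
    now = int(time.time())
    con.executemany(
        "INSERT INTO trend_snapshots VALUES (?, ?, ?)",
        [(now, term, count) for term, count in counts],
    )
    con.commit()
    con.close()
```

Each cron run calls `save_snapshot(db_path, extract_trends(stories))`, and week-over-week comparisons become a simple SQL query over `taken_at`.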
