Oaida Adrian

Posted on • Originally published at console.apify.com

Tracking Tech Sentiment in Real-Time with VADER and Python

What does the developer community feel about your product? Not what they say in reviews — what do they actually feel when they mention it on Hacker News or Reddit?

I built a Sentiment Analyzer that fetches posts from HN and Reddit, runs VADER sentiment analysis, and outputs structured scores. Here's how it works.

What It Does

The tool pulls posts from two sources:

  • Hacker News: Top or new stories via the official Firebase API
  • Reddit: Any subreddit, sorted by hot, new, top, or rising
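
The two sources return differently shaped JSON, so some normalization step is needed before scoring. Here's a minimal sketch of what that could look like, assuming the public HN Firebase endpoints (`topstories.json`, `newstories.json`, `item/<id>.json`) and Reddit's public listing JSON (`/r/<subreddit>/<sort>.json`). The function names and the `"reddit"` source label are my own placeholders for illustration, not the actor's actual code.

```python
from typing import Any

HN_BASE = "https://hacker-news.firebaseio.com/v0"
REDDIT_SORTS = ("hot", "new", "top", "rising")


def hn_list_url(kind: str = "top") -> str:
    """URL of the JSON array of story IDs; kind is 'top' or 'new'."""
    if kind not in ("top", "new"):
        raise ValueError(f"unsupported HN list: {kind!r}")
    return f"{HN_BASE}/{kind}stories.json"


def hn_item_url(item_id: int) -> str:
    """URL of a single HN item (a story, in our case)."""
    return f"{HN_BASE}/item/{item_id}.json"


def reddit_list_url(subreddit: str, sort: str = "hot", limit: int = 25) -> str:
    """URL of a subreddit listing in Reddit's public JSON format."""
    if sort not in REDDIT_SORTS:
        raise ValueError(f"unsupported Reddit sort: {sort!r}")
    return f"https://www.reddit.com/r/{subreddit}/{sort}.json?limit={limit}"


def from_hn_item(item: dict[str, Any]) -> dict[str, Any]:
    """Map an HN item to a common record; 'descendants' is HN's comment count."""
    return {
        "source": "hackernews",
        "title": item.get("title", ""),
        "score": item.get("score", 0),
        "commentsCount": item.get("descendants", 0),
    }


def from_reddit_child(child: dict[str, Any]) -> dict[str, Any]:
    """Map one entry of a Reddit listing's data.children to the same record."""
    data = child["data"]
    return {
        "source": "reddit",  # assumed label; only "hackernews" appears in the sample output
        "title": data.get("title", ""),
        "score": data.get("score", 0),
        "commentsCount": data.get("num_comments", 0),
    }
```

With both sources mapped to one shape, everything downstream (scoring, keywords, filtering) only has to deal with a single record type.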

Each post gets analysed with VADER (Valence Aware Dictionary and sEntiment Reasoner), a rule-based model tuned for social media text. Scoring runs locally on the CPU: no GPU, no API keys, and no network round-trip per post.
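
To make the scoring step concrete, here's a rough sketch using NLTK's `SentimentIntensityAnalyzer`. NLTK returns the keys `pos`, `neu`, `neg`, and `compound`; renaming them to the longer names in the output below is my assumption about how the mapping might be done. `build_analyzer` and `sentiment_scores` are hypothetical helper names.

```python
from typing import Protocol


class Analyzer(Protocol):
    """Anything exposing VADER's polarity_scores(text) method."""

    def polarity_scores(self, text: str) -> dict[str, float]: ...


def build_analyzer() -> Analyzer:
    """Create NLTK's VADER analyzer (requires `pip install nltk`)."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    # Fetches the lexicon on first use; does nothing if it is already present.
    nltk.download("vader_lexicon", quiet=True)
    return SentimentIntensityAnalyzer()


def sentiment_scores(analyzer: Analyzer, text: str) -> dict[str, float]:
    """Score one text and rename VADER's keys to the longer output names."""
    raw = analyzer.polarity_scores(text)
    return {
        "positive": raw["pos"],
        "neutral": raw["neu"],
        "negative": raw["neg"],
        "compound": raw["compound"],
    }
```

Typical usage would be to call `build_analyzer()` once at startup and reuse the analyzer for every post, e.g. `sentiment_scores(analyzer, post["title"])`.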

What You Get

Each analysed post includes:

{
  "source": "hackernews",
  "title": "Shadcn/UI now defaults to Base UI instead of Radix",
  "sentiment": "neutral",
  "sentimentScores": {
    "positive": 0.0,
    "neutral": 1.0,
    "negative": 0.0,
    "compound": 0.0
  },
  "keywords": ["shadcn", "ui", "defaults", "base", "radix"],
  "score": 43,
  "commentsCount": 3
}

The compound score ranges from -1 (very negative) to +1 (very positive). Anything below -0.05 is classified negative, above 0.05 is positive, and in between is neutral.
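
The thresholding rule is small enough to write out. This sketch follows the wording above (strictly below -0.05 and strictly above 0.05); note that the convention commonly quoted for VADER treats the boundary values as inclusive, so exactly ±0.05 may be labelled differently elsewhere. The function name is mine.

```python
def classify(compound: float, threshold: float = 0.05) -> str:
    """Turn a VADER compound score into a sentiment label."""
    if compound > threshold:
        return "positive"
    if compound < -threshold:
        return "negative"
    return "neutral"
```

The Shadcn/UI example above has a compound score of 0.0, which falls inside the neutral band.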

Why VADER Instead of an LLM?

Three reasons:

  1. Speed: VADER is a lexicon lookup plus a handful of rules, so it can score on the order of 10,000 short posts per second. An LLM call typically takes 1-2 seconds per post.
  2. Cost: VADER is free and runs locally. LLM sentiment analysis costs per token.
  3. Consistency: VADER is deterministic, so the same text always gets the same scores. LLM outputs can vary across runs.

For high-volume monitoring tasks — like tracking every HN post mentioning your product — VADER is the right tool.

Real-World Use Cases

Brand Monitoring

Point the analyzer at r/yourproduct and at HN stories that mention your brand name. Get daily sentiment reports and catch negative sentiment before it escalates.
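
A daily report can be as simple as counting labels and averaging the compound score over one run's output. Here's a sketch that assumes each record has the `sentiment` and `sentimentScores.compound` fields shown earlier; the report keys (`total`, `negativeShare`, `meanCompound`) are names I made up for the example.

```python
from collections import Counter
from statistics import mean


def summarize(posts: list[dict]) -> dict:
    """Aggregate one run's analysed posts into a small report."""
    labels = Counter(post["sentiment"] for post in posts)
    compounds = [post["sentimentScores"]["compound"] for post in posts]
    total = len(posts)
    return {
        "total": total,
        "positive": labels.get("positive", 0),
        "neutral": labels.get("neutral", 0),
        "negative": labels.get("negative", 0),
        # Share of negative posts; a rising value is the thing to watch.
        "negativeShare": round(labels.get("negative", 0) / total, 4) if total else 0.0,
        "meanCompound": round(mean(compounds), 4) if compounds else 0.0,
    }
```

Comparing `negativeShare` from one day to the next gives a cheap early-warning signal without reading every post.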

Trend Detection

Track sentiment around technologies like "AI agents", "Rust", or "WebAssembly" across both platforms. Spot which technologies are gaining positive momentum.
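
One way to approach this is to group analysed posts by the terms you care about and average the compound score per term. The sketch below assumes records shaped like the sample output and matches terms as whole words in the title, so "Rust" does not match "trust". The function name is hypothetical.

```python
import re
from collections import defaultdict
from statistics import mean


def sentiment_by_term(posts: list[dict], terms: list[str]) -> dict[str, float]:
    """Mean compound score per tracked term, matched as whole words in titles."""
    patterns = {
        term: re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE) for term in terms
    }
    buckets: dict[str, list[float]] = defaultdict(list)
    for post in posts:
        for term, pattern in patterns.items():
            if pattern.search(post["title"]):
                buckets[term].append(post["sentimentScores"]["compound"])
    # Terms with no matching posts are left out instead of reported as 0.
    return {term: mean(scores) for term, scores in buckets.items()}
```

A single run only gives a snapshot. To see momentum, you'd store these per-term averages for each run and compare them over time.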

Content Curation

Filter for only positive or negative posts. Share positive community feedback in your newsletter. Investigate negative feedback before it becomes a crisis.

How to Use It

Apify Store

The actor is deployed on Apify — configure your sources and run:

{
  "sources": "both",
  "subreddit": "technology,programming",
  "maxPosts": 100,
  "sentimentFilter": "negative"
}
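
For a sense of how input like this might be interpreted, here's a small sketch: the comma-separated `subreddit` string is split into names, and `sentimentFilter` keeps only posts with the matching label. Both helpers are hypothetical, and treating `None` as "no filter" is my assumption; the actor's real handling may differ.

```python
from typing import Optional


def parse_subreddits(value: str) -> list[str]:
    """Split a comma-separated subreddit string, dropping blanks and whitespace."""
    return [name.strip() for name in value.split(",") if name.strip()]


def filter_by_sentiment(posts: list[dict], sentiment_filter: Optional[str] = None) -> list[dict]:
    """Keep only posts whose label matches; None (assumed) means keep everything."""
    if sentiment_filter is None:
        return list(posts)
    return [post for post in posts if post["sentiment"] == sentiment_filter]
```

With the input above, `parse_subreddits` would yield the two subreddits to fetch, and the filter would drop every post not labelled negative.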

RapidAPI

The same engine powers the Multi-Tool Content API on RapidAPI.

Self-Host

Clone the GitHub repo, then install the dependencies and run it:

pip install httpx nltk
python main.py

The Technical Stack

  • httpx: Async HTTP client for concurrent fetching from the HN Firebase API and Reddit JSON endpoints
  • NLTK VADER: Pre-downloaded lexicon, so scoring runs offline with no external services
  • asyncio: Semaphore-limited concurrency (10 concurrent requests) for speed without rate-limit issues
  • Keyword extraction: Custom stop-word filter with frequency counting for top-N keywords per post
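
The last two items can be sketched in a few lines. This is an illustration under assumptions, not the actor's code: `fetch_all` takes any async `fetch(url)` callable (with httpx that could be a function wrapping `client.get` on an `httpx.AsyncClient`), and the stop-word list is a tiny made-up sample.

```python
import asyncio
import re
from collections import Counter
from typing import Any, Awaitable, Callable, Iterable

# Illustrative stop words only; a real list would be much longer.
STOP_WORDS = {
    "a", "an", "the", "to", "of", "in", "on", "for", "and", "or",
    "is", "are", "with", "now", "instead",
}


async def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[Any]],
    limit: int = 10,
) -> list[Any]:
    """Run fetch(url) for every URL with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> Any:
        async with semaphore:  # waits here while `limit` fetches are active
            return await fetch(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(bounded(url) for url in urls))


def top_keywords(text: str, n: int = 5) -> list[str]:
    """Top-n words by frequency after lowercasing and dropping stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 1)
    return [word for word, _ in counts.most_common(n)]
```

The semaphore caps in-flight requests without serializing them, which is what keeps the run fast while staying under rate limits. The keyword function orders by frequency, so its ordering may differ from the sample output shown earlier.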

Part of a Larger Toolkit

This is actor #6 in my content extraction suite:

  1. RSS Feed Aggregator — full article extraction
  2. llms.txt Generator — AI-readable website summaries
  3. RO Business Scraper — Romanian business directory
  4. Sitemap Content Extractor — bulk page extraction
  5. Product Hunt Scraper — launch tracking
  6. HN/Reddit Sentiment Analyzer — this tool

All are available via RapidAPI and GitHub.


Found this useful? Star the repo and follow along as I build more tools.
