DEV Community

Jose Marquez Alberti

Posted on • Originally published at asof.app

What I Learned From Scraping 100,000 Tech Signals

I spent the past month scraping Hacker News, Reddit, Product Hunt, and GitHub every 30 minutes. The goal was simple: understand what makes tech content go viral.

1,095 snapshots later, I've analyzed over 100,000 individual signals. Here's what the data actually shows.

The Saturday 17:00 UTC Effect

This was the most surprising finding. Saturday at 17:00 UTC consistently produces the highest-scoring posts across all platforms.

Average score: 75.4 vs 48.1 overall (about 57% higher)
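For anyone who wants to check the arithmetic, the percentage is the relative lift of the Saturday average over the overall average. Here is a minimal sketch; the function name `lift_pct` is made up for illustration and is not part of ASOF.

```python
def lift_pct(value: float, baseline: float) -> float:
    """Relative improvement of `value` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


saturday_17 = 75.4  # average score for Saturday 17:00 UTC, quoted above
overall = 48.1      # overall average score, quoted above

print(round(lift_pct(saturday_17, overall), 1))  # 56.8
```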

Why? My theory: Saturday is when founders, developers, and creators have time to actually engage. They're not in meetings. They're not fighting fires. They're exploring.

17:00 UTC hits multiple timezones perfectly:

  • 9 AM PST - West Coast coffee time
  • 12 PM EST - East Coast lunch break
  • 5 PM GMT - Europe winding down
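The conversions in this list can be reproduced with the standard library. This sketch assumes fixed standard-time offsets (PST = UTC-8, EST = UTC-5, GMT = UTC+0), so it ignores daylight saving time, which shifts the local times by an hour in summer. The date is only an illustrative Saturday; the year is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time offsets (assumption: no daylight saving in effect).
ZONES = {
    "PST": timezone(timedelta(hours=-8)),
    "EST": timezone(timedelta(hours=-5)),
    "GMT": timezone.utc,
}


def local_times(utc_moment: datetime) -> dict[str, str]:
    """Render one UTC moment as a local wall-clock time in each zone."""
    return {
        name: utc_moment.astimezone(tz).strftime("%H:%M")
        for name, tz in ZONES.items()
    }


# An illustrative Saturday at 17:00 UTC (the year is an assumption).
post_time = datetime(2025, 1, 4, 17, 0, tzinfo=timezone.utc)
print(local_times(post_time))
# {'PST': '09:00', 'EST': '12:00', 'GMT': '17:00'}
```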

The next best times? Sunday 1:00 UTC (70.4) and Thursday 11:00 UTC (69.8). Weekday mornings consistently underperform.
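A ranking like this comes from bucketing posts by weekday and hour in UTC, then averaging the score in each bucket. The sketch below shows one way to do that. The record shape (ISO timestamp, score) and the sample values are invented for illustration; they are not the real dataset or the actual ASOF pipeline.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def average_by_slot(posts):
    """Average score per (weekday name, hour) bucket, with times normalized to UTC."""
    buckets = defaultdict(list)
    for iso_timestamp, score in posts:
        moment = datetime.fromisoformat(iso_timestamp).astimezone(timezone.utc)
        buckets[(moment.strftime("%A"), moment.hour)].append(score)
    return {slot: mean(scores) for slot, scores in buckets.items()}


# Hypothetical records: (ISO 8601 timestamp, score).
sample_posts = [
    ("2025-01-04T17:05:00+00:00", 80.0),
    ("2025-01-04T17:40:00+00:00", 70.0),
    ("2025-01-06T09:10:00+00:00", 40.0),
]

# Print the time slots from best to worst average score.
ranked = sorted(average_by_slot(sample_posts).items(), key=lambda kv: kv[1], reverse=True)
for slot, avg in ranked:
    print(slot, avg)
```

With real data, each bucket should hold enough posts for the average to mean something; a slot with only a handful of posts can top the ranking by chance.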

The 6-Word Title Rule

I analyzed 16,578 post titles. The pattern was clear: 6-word titles significantly outperform everything else.

Examples that crushed it:

  • "Show HN: Time travel through snapshots" (6 words)
  • "I built X in 48 hours" (6 words)
  • "Open source alternative to Notion" (5 words)

Too short (3-4 words)? Not enough context. Too long (10+ words)? People's eyes glaze over.
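How you count words matters at this scale. This sketch assumes the simplest rule, splitting on whitespace, so tokens like "HN:" and "48" each count as one word. The exact tokenization behind the 16,578-title analysis may differ.

```python
def word_count(title: str) -> int:
    """Count whitespace-separated tokens in a title."""
    return len(title.split())


titles = [
    "Show HN: Time travel through snapshots",
    "I built X in 48 hours",
    "Open source alternative to Notion",
]

for title in titles:
    print(word_count(title), title)
# 6 Show HN: Time travel through snapshots
# 6 I built X in 48 hours
# 5 Open source alternative to Notion
```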

Databases Are Having a Moment

This completely surprised me. "Databases" as a category is up 185.6% while "Mobile" is down 17.3%.

Other rising categories:

  • Gaming: +12.5%
  • Blockchain: +10.7%
  • Data Science: +9.1%

Falling categories:

  • Mobile: -17.3%
  • SaaS: -8.5%
  • AI & ML: -3.1% (yes, really)

The AI decline isn't because people stopped caring. It's saturation. Everyone's building AI tools now, so the signal-to-noise ratio has collapsed.
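One plausible way to get figures like these is to compare post counts per category across two time windows. That is an assumption here: the numbers above could equally be based on scores or share of total posts. The category labels and counts below are made up for illustration.

```python
from collections import Counter


def category_change(previous, current):
    """Percent change in post count per category between two windows.

    Categories absent from the earlier window are skipped, because a
    change from zero has no defined percentage.
    """
    before, after = Counter(previous), Counter(current)
    return {
        category: (after[category] - before[category]) / before[category] * 100.0
        for category in before
    }


# Hypothetical category labels for posts in two consecutive windows.
earlier_window = ["databases"] * 2 + ["mobile"] * 4
later_window = ["databases"] * 5 + ["mobile"] * 3

print(category_change(earlier_window, later_window))
# {'databases': 150.0, 'mobile': -25.0}
```

Small base counts inflate percentages: going from 2 posts to 5 is +150%, which is worth keeping in mind when a category shows a triple-digit jump.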

Neutral Beats Excited Every Time

I ran sentiment analysis on every post. Neutral sentiment posts have a 95.5% success rate.

Excited/salesy posts? Way lower.

The HN crowd can smell marketing from a mile away. Just state facts. Let your work speak.

Best Sources, Ranked

Not all platforms are equal. Here's the average score by source:

  • github_trending: 88.9
  • hacker_news: 67.6
  • reddit_SaaS: 53.5
  • reddit_sideproject: 53.2
  • reddit_Entrepreneur: 50.0

GitHub trending posts perform 32% better than even Hacker News. Why? Selection bias. Only truly interesting projects make it to GitHub trending in the first place.

Topic Correlations Nobody Talks About

Some topics move together. Some move opposite. This matters for timing.

Topics that trend together:

  • Web Dev ↔ Mobile (0.58 correlation)
  • Web Dev ↔ Databases (0.476)
  • General ↔ Blockchain (0.53)

Topics that move opposite:

  • Remote Work ↔ SaaS (-0.573)
  • Security ↔ Mobile (-0.524)
  • General ↔ Design (-0.54)

This means: If you're seeing a lot of SaaS posts, it's probably NOT the time to post about remote work. The audience attention is elsewhere.
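Correlations like these are typically Pearson coefficients between per-snapshot counts for two topics. The sketch below computes one by hand, assuming that is the measure used. The two series are invented numbers, not the real snapshot data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# Hypothetical posts-per-snapshot counts for two topics.
saas_counts = [10, 12, 15, 11, 9]
remote_work_counts = [8, 7, 4, 7, 9]

r = pearson(saas_counts, remote_work_counts)
print(f"{r:.3f}")  # negative: the two series move in opposite directions
```

A coefficient only says the two series moved together or apart over the sampled period; it does not show that one topic crowds out the other.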

What This Means For You

If you're launching something:

  • Post Saturday at 17:00 UTC
  • Keep your title to 6 words
  • Stay neutral in tone
  • Check what's trending in your category first
  • GitHub trending > everything else

If you're a content creator:

  • Databases content is hot right now
  • Mobile is cooling off
  • AI is oversaturated (differentiate or avoid)

If you're an investor:

  • Watch correlation patterns for emerging trends
  • Developer sentiment is shifting toward databases/infrastructure
  • Gaming is quietly accelerating

The Meta Experiment

I'm posting this on Hacker News Saturday at 17:00 UTC. I'm using a 6-word title. I'm keeping the tone neutral.

Let's see if the data holds up.


I built ASOF to explore this data. Pick any moment in the past month and see what was viral. Compare periods. Export insights.

Try it free: https://asof.app

Discussion on Hacker News: [will update Saturday, Jan 4]

Top comments (1)

OnlineProxy

I gotta break down the data by platform to untangle the timing from the algorithm magic. My gut says Hacker News blows up around 5 PM UTC, but GitHub Trending probably gets all the love way earlier, like 4-6 hours before that, which totally vibes with your EMEA theory. And yeah, the 6-word thing... I tried to control for category and recency, but you're right, I missed author cred and whether there was a pic; those could totally be boosting things. The whole AI/ML thing being down by 3.1%? That's probably just noise, my bad - I should've looked at how sentiment shifted, like from stoked to meh, instead of just volume. That's a way better way to see if something's getting played out. Anyway, I'm gonna update everything with those three new layers - platform splits, how things play off each other, and sentiment changes - and I'll hit you back with the revised stuff by next week. Seriously, thanks for pushing me on this; that kind of sharp eye is exactly what helps me cut through the BS and find what really matters in the data.