DEV Community

agenthustler

Scraping Product Hunt in 2026: Daily Launches, Trending Products & Maker Data

Product Hunt is the launchpad for startups. Every day, hundreds of products compete for upvotes, and the data behind those launches — makers, categories, upvote trajectories, taglines — is pure gold for startup researchers, investors, and founders doing competitive intelligence.

But scraping it? That's where things get interesting.

Why Scrape Product Hunt?

Here are the top use cases I've seen:

  • Launch intelligence: Track what's launching daily in your niche. Get alerts when a competitor drops a new product.
  • Startup research: Build databases of startups by category, founding date, and traction signals.
  • Competitor monitoring: Watch how competing products perform over time — upvotes, comments, maker activity.
  • Investor research: Spot trending categories early. See which makers are serial launchers.
  • Content curation: Auto-generate "Top AI tools launched this week" roundups.

The Challenge: It's Not a Simple Website

Product Hunt is a single-page application powered by Apollo GraphQL. If you curl a Product Hunt page, you won't get clean HTML with product data in neat <div> tags. Instead, the data lives in a JavaScript variable called __APOLLO_STATE__ that gets hydrated on the client side.

There's no public REST API anymore (the old v1/v2 APIs were deprecated). The GraphQL endpoint exists but isn't documented for public use and requires authentication tokens that rotate.

So you're left with two options:

  1. Reverse-engineer the GraphQL queries (fragile, breaks often)
  2. Extract data from the server-side rendered Apollo state (more stable)

How Apollo SSR Extraction Works

When Product Hunt renders a page on the server, it embeds the full Apollo cache in the HTML as a __APOLLO_STATE__ JavaScript object. This contains all the data needed to render the page — product names, descriptions, upvote counts, maker profiles, comments, and more.

The extraction approach:

  1. Fetch the HTML page (with proper headers to get the SSR version)
  2. Parse out the __APOLLO_STATE__ variable from the <script> tags
  3. Deserialize the JSON object
  4. Walk the Apollo cache structure to extract normalized entities (posts, users, topics)

This is more resilient than hitting GraphQL directly because the SSR output format changes less frequently than internal API schemas.
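The steps above can be sketched in a few lines of Python. The regex and the `Post:` cache-key prefix are assumptions about how the state is embedded (Apollo normalizes entities under `Typename:id` keys), so treat this as a starting point rather than a guaranteed selector:

```python
import json
import re

def extract_apollo_state(html: str) -> dict:
    """Pull the serialized Apollo cache out of a server-rendered page.

    Assumes the state is embedded as `window.__APOLLO_STATE__ = {...};`
    inside a <script> tag. The lazy match works when the object is
    followed directly by the closing tag; deeply unusual embeddings
    may need a proper JS parser instead.
    """
    match = re.search(
        r"__APOLLO_STATE__\s*=\s*(\{.*?\})\s*;?\s*</script>",
        html,
        re.DOTALL,
    )
    if not match:
        raise ValueError("No __APOLLO_STATE__ found; page may not be server-rendered")
    return json.loads(match.group(1))

def posts_from_state(state: dict) -> list[dict]:
    """Walk the normalized cache and keep entries that look like posts."""
    return [value for key, value in state.items() if key.startswith("Post:")]
```

Once you have the dict, everything else is plain data wrangling: the cache keys tell you the entity type, and the values are the fields the page needed to render.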

A Ready-Made Solution: ProductHunt Scraper on Apify

I built an Apify actor that handles all of this complexity. It supports four modes:

Mode 1: Today's Launches

Fetches all products launched today with full metadata:

{
  "mode": "today"
}

Returns: product name, tagline, description, URL, upvote count, comment count, maker profiles, topics, and thumbnail URLs.

Mode 2: Date-Specific Launches

Want launches from a specific date? Maybe you're building a historical dataset:

{
  "mode": "date",
  "date": "2026-03-15"
}

Mode 3: Search

Find products by keyword across all of Product Hunt:

{
  "mode": "search",
  "searchQuery": "AI writing assistant"
}

Mode 4: Product Details

Get full details for a specific product by URL:

{
  "mode": "product",
  "productUrl": "https://www.producthunt.com/posts/chatgpt"
}

Python Code Example

Here's how to use it with the apify-client Python package:

from apify_client import ApifyClient
import json

# Initialize the client
client = ApifyClient("YOUR_APIFY_TOKEN")

# Get today's Product Hunt launches
run_input = {
    "mode": "today"
}

# Run the actor
run = client.actor("cryptosignals/producthunt-scraper").call(run_input=run_input)

# Fetch results
products = []
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    products.append(item)

# Show top 10 by upvotes
products.sort(key=lambda x: x.get("votesCount", 0), reverse=True)

for p in products[:10]:
    print(f"🚀 {p['name']} — {p.get('tagline', '')}")
    print(f"   Upvotes: {p.get('votesCount', 0)} | Comments: {p.get('commentsCount', 0)}")
    print(f"   URL: {p.get('url', '')}")
    print()

Output looks like:

🚀 CoolStartup — AI-powered everything for everyone
   Upvotes: 847 | Comments: 123
   URL: https://www.producthunt.com/posts/coolstartup

🚀 DevTool Pro — Ship code 10x faster
   Upvotes: 612 | Comments: 89
   URL: https://www.producthunt.com/posts/devtool-pro

Use Case: Daily Digest of AI Tools on Product Hunt

Here's a practical example — a script that runs daily and builds a digest of AI-related launches:

from apify_client import ApifyClient
from datetime import datetime

client = ApifyClient("YOUR_APIFY_TOKEN")

# Get today's launches
run = client.actor("cryptosignals/producthunt-scraper").call(
    run_input={"mode": "today"}
)

# Filter for AI-related products
ai_keywords = ["ai", "gpt", "llm", "machine learning", "artificial intelligence", 
               "copilot", "agent", "automation"]

ai_products = []
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    text = f"{item.get('name', '')} {item.get('tagline', '')} {item.get('description', '')}".lower()
    if any(kw in text for kw in ai_keywords):
        ai_products.append(item)

# Sort by upvotes
ai_products.sort(key=lambda x: x.get("votesCount", 0), reverse=True)

# Generate digest
date_str = datetime.now().strftime("%B %d, %Y")
print(f"# AI Tools Launched on Product Hunt — {date_str}\n")

for i, p in enumerate(ai_products, 1):
    makers = ", ".join([m.get("name", "Unknown") for m in p.get("makers", [])])
    print(f"## {i}. {p['name']}")
    print(f"**{p.get('tagline', '')}**\n")
    print(f"- Upvotes: {p.get('votesCount', 0)}")
    print(f"- Makers: {makers}")
    print(f"- Topics: {', '.join(p.get('topics', []))}")
    print(f"- [View on PH]({p.get('url', '')})\n")

Schedule this with a cron job or Apify's built-in scheduler, pipe the output to email or Slack, and you've got an automated AI launch radar.
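For the Slack leg, one way to do it is to shape the sorted products into a Slack Block Kit payload and POST it to an incoming webhook. This is a sketch: the webhook URL is a placeholder you'd generate in Slack, and the field names (`name`, `tagline`, `votesCount`, `url`) are the ones the actor returns:

```python
import json
from urllib.request import Request, urlopen

def build_slack_digest(products: list[dict], date_str: str) -> dict:
    """Turn scraped products into a Slack Block Kit message payload."""
    blocks = [{
        "type": "header",
        "text": {"type": "plain_text",
                 "text": f"AI launches on Product Hunt - {date_str}"},
    }]
    for p in products[:10]:  # Slack allows at most 50 blocks per message
        blocks.append({
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": (f"*{p.get('name', '?')}* - {p.get('tagline', '')}\n"
                         f":arrow_up: {p.get('votesCount', 0)} | "
                         f"<{p.get('url', '')}|View on PH>"),
            },
        })
    return {"blocks": blocks}

def post_to_slack(webhook_url: str, payload: dict) -> None:
    """POST the payload to a Slack incoming webhook (placeholder URL)."""
    req = Request(webhook_url,
                  data=json.dumps(payload).encode(),
                  headers={"Content-Type": "application/json"})
    urlopen(req)  # raises on HTTP errors
```

Call `build_slack_digest(ai_products, date_str)` at the end of the digest script and hand the result to `post_to_slack` with your webhook URL.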

Data Fields You Get

Each product comes with:

| Field | Description |
| --- | --- |
| name | Product name |
| tagline | One-line description |
| description | Full description |
| url | Product Hunt URL |
| website | Product's own website |
| votesCount | Total upvotes |
| commentsCount | Total comments |
| makers | Array of maker profiles (name, headline, URL) |
| topics | Category tags |
| thumbnail | Product image URL |
| createdAt | Launch timestamp |
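If you want typed results in Python, the table above maps directly onto a `TypedDict`. The top-level field names come from the table; the nested maker shape is my assumption based on the "makers" row:

```python
from typing import TypedDict

class Maker(TypedDict, total=False):
    """Maker profile shape, assumed from the fields listed above."""
    name: str
    headline: str
    url: str

class Product(TypedDict, total=False):
    """One scraped Product Hunt launch; total=False since fields can be absent."""
    name: str
    tagline: str
    description: str
    url: str
    website: str
    votesCount: int
    commentsCount: int
    makers: list[Maker]
    topics: list[str]
    thumbnail: str
    createdAt: str
```

With these in place, type checkers like mypy will flag typos such as `p["voteCount"]` before they silently return a default at runtime.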

Get Started

The ProductHunt Scraper is available on Apify at $4.99/month starting April 3, 2026. Until then, you can try it for free.

No Product Hunt API keys needed. No GraphQL reverse-engineering. Just pick a mode, run it, and get clean JSON data.

Whether you're building a startup database, monitoring competitors, or curating daily launch digests — this handles the hard part so you can focus on what you do with the data.


Have questions about scraping Product Hunt or other startup platforms? Drop a comment below.
