DEV Community

agenthustler

Best ProductHunt Scrapers in 2026: Get Launch Data Without the API Hassle

If you track startup launches, scout for tools, or do competitive intelligence, ProductHunt is one of the richest data sources out there. Thousands of products launch every month with upvotes, descriptions, maker info, and community reactions.

The problem? Getting that data programmatically is harder than it should be.

Why Scrape ProductHunt?

A few common reasons people pull PH data:

  • Competitive intel — monitor what's launching in your category
  • VC/investor research — spot trending products and makers early
  • Tool discovery — find the best tools for a specific niche
  • Content research — track what's gaining traction for articles, newsletters, or social posts
  • Market analysis — understand launch frequency, category trends, and seasonal patterns

The API Problem

ProductHunt has an official API, but it comes with friction:

  • OAuth tokens required — you need to register an app and handle token flows
  • Rate limits — strict limits that make bulk collection painful
  • Incomplete data — some fields available on the website aren't exposed via the API
  • Maintenance burden — API versions change, tokens expire, scopes shift

For a one-off query, the API works. For ongoing monitoring or bulk collection, you'll spend more time fighting authentication than analyzing data.

Scraper Options Compared

Here's what's available on Apify right now for ProductHunt scraping:

| Feature | ProductHunt Scraper (cryptosignals) | Competitor Actor |
| --- | --- | --- |
| Today's launches | ✅ | ✅ |
| Search by keyword | ✅ | ❌ |
| Date-specific results | ✅ | ❌ |
| Product detail pages | ✅ | Partial |
| Apollo SSR parsing | ✅ | ❌ |
| Users | New | 777 |
| Reviews | 2 (5.0★) | — |
| Modes | 4 (today, search, date, product) | 1 |
The established actor has a user base, but it covers a single use case: today's launches. If you need search, historical dates, or detailed product pages, it doesn't handle those.

Full disclosure: I built the cryptosignals actor. I'm biased, but the feature comparison is accurate — you can verify on both actor pages.

Deep Dive: ProductHunt Scraper (cryptosignals)

The actor has four modes:

1. Today's Launches

Pulls everything on the current front page — product name, tagline, description, vote count, topics, makers, links.

2. Search

Pass a keyword and get matching products. Useful for "find me all AI writing tools launched on PH."

3. Date-Specific

Pull launches from a specific date. Great for historical analysis — "what launched the same week as our competitor?"

4. Product Details

Pass a product URL or slug and get the full page data — description, media, maker profiles, related products.
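The date and product modes take run inputs similar to the ones shown in the Python examples below. Here's a hedged sketch of what those inputs might look like — the field names `date` and `productUrl` are assumptions based on the mode names, so check the actor's input schema on its Apify page before relying on them:

```python
# Hypothetical run inputs for the date and product modes.
# Field names ("date", "productUrl") are assumptions -- verify
# against the actor's input schema on its Apify page.
date_input = {
    "mode": "date",
    "date": "2026-01-15",   # launches from a specific day
    "maxProducts": 50,
}

product_input = {
    "mode": "product",
    "productUrl": "https://www.producthunt.com/posts/example-product",
}

# Either dict would be passed as run_input to
# client.actor("cryptosignals/producthunt-scraper").call(run_input=...)
print(date_input["mode"], product_input["mode"])
```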

How It Works

Instead of hitting the REST API, the actor parses ProductHunt's Apollo SSR state — the server-rendered GraphQL cache embedded in the page HTML. This gives you richer data than the public API, including fields that aren't in the official endpoints.

In testing, it extracted 53 products in 4.7 seconds. That's the full front page with complete metadata per product.
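To illustrate the general technique (this is not the actor's actual code), here's a minimal sketch of pulling a server-rendered Apollo cache out of page HTML. The `__APOLLO_STATE__` variable name and the script-tag shape are assumptions — real ProductHunt markup may differ and can change at any time:

```python
import json
import re

def extract_apollo_state(html: str) -> dict:
    """Pull an embedded Apollo cache (JSON) out of server-rendered HTML.

    Assumes the cache is assigned to a JS variable inside a <script> tag,
    e.g. window.__APOLLO_STATE__ = {...}; -- the exact variable name and
    structure on ProductHunt may differ.
    """
    match = re.search(
        r'__APOLLO_STATE__\s*=\s*(\{.*?\})\s*;?\s*</script>',
        html,
        re.DOTALL,
    )
    if not match:
        return {}
    return json.loads(match.group(1))

# Demo on a synthetic page -- real ProductHunt HTML is far larger.
sample_html = (
    '<html><script>window.__APOLLO_STATE__ = '
    '{"Post:1": {"name": "ExampleTool", "votesCount": 123}};'
    '</script></html>'
)
state = extract_apollo_state(sample_html)
print(state["Post:1"]["name"])  # ExampleTool
```

The upside of this approach is that you read whatever the GraphQL layer already resolved for the page, rather than being limited to the fields the public API chooses to expose; the downside is that the embedded structure is an implementation detail and can break without notice.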

Code Example: Python

Here's how to use it with the Apify Python client:

from apify_client import ApifyClient

client = ApifyClient("your-apify-token")

# Get today's launches
run = client.actor("cryptosignals/producthunt-scraper").call(
    run_input={
        "mode": "today",
        "maxProducts": 50
    }
)

items = list(client.dataset(run["defaultDatasetId"]).iterate_items())

for product in items[:5]:
    print(f"{product['name']}: {product['tagline']}")
    print(f"  Votes: {product.get('votesCount', 'N/A')}")
    print(f"  Topics: {', '.join(product.get('topics', []))}")
    print()

Search by keyword:

run = client.actor("cryptosignals/producthunt-scraper").call(
    run_input={
        "mode": "search",
        "query": "developer tools",
        "maxProducts": 20
    }
)

results = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(f"Found {len(results)} products matching 'developer tools'")

Use Case: Daily Competitor Monitoring

One practical setup: schedule the actor to run daily and track new launches in your category.

  1. Create a scheduled run on Apify with mode: "today" — runs every morning
  2. Filter results by topic or keyword in a post-processing step
  3. Push to Slack/email using Apify's webhook integrations
  4. Store in a dataset for trend analysis over time

This way you get a daily feed of "what launched in my space" without manual checking. The Apify scheduler handles retries and failures, so you don't babysit a cron job.
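Step 2 above can be sketched as a small post-processing function. The item fields (`name`, `tagline`, `topics`, `votesCount`) mirror the actor output used in the earlier examples, but you should confirm the exact field names against a real run of the actor:

```python
# Sketch of the post-processing filter step: keep only launches that
# match your category keywords, then format a digest line per product.
# Field names are assumed to match the actor's output; verify on a real run.
KEYWORDS = {"ai", "developer tools", "api"}

def matches_category(item: dict) -> bool:
    # Search the name, tagline, and topics for any configured keyword.
    haystack = " ".join([
        item.get("name", ""),
        item.get("tagline", ""),
        " ".join(item.get("topics", [])),
    ]).lower()
    return any(kw in haystack for kw in KEYWORDS)

def digest_line(item: dict) -> str:
    return f"{item['name']}: {item.get('tagline', '')} ({item.get('votesCount', 0)} votes)"

# Sample data standing in for a daily dataset pull.
launches = [
    {"name": "CodeBot", "tagline": "AI pair programmer",
     "topics": ["Developer Tools"], "votesCount": 212},
    {"name": "PlantPal", "tagline": "Water your plants on time",
     "topics": ["Home"], "votesCount": 95},
]
hits = [digest_line(x) for x in launches if matches_category(x)]
print("\n".join(hits))  # only CodeBot matches the keyword set
```

The resulting digest lines are what you'd push to Slack or email via a webhook in step 3.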

When Free curl Works (and When It Doesn't)

For simple, one-off checks you can curl ProductHunt directly:

curl -s 'https://www.producthunt.com' | grep -o '"name":"[^"]*"' | head -10

This breaks constantly. PH's HTML structure changes, JavaScript rendering hides data, and you get rate-limited fast.

Use curl when: you need one quick check and don't care about reliability.

Use a scraper actor when: you need structured data, monitoring over time, multiple search modes, or you're pulling more than a handful of products. The actor handles rendering, parsing, pagination, and retries — you just get clean JSON.

Conclusion

If you're doing anything beyond casual ProductHunt browsing, a dedicated scraper saves significant time. The API's auth overhead and limitations make it impractical for most data collection workflows.

I'd recommend the ProductHunt Scraper for anything involving search, date ranges, or scheduled monitoring — it covers use cases that the alternatives don't. For basic "grab today's front page," the established actor works too, though you'll miss the richer Apollo-parsed data.

Pick based on your use case. Both are on the Apify platform, so switching costs are minimal.
