DEV Community

FairPrice

3 Free Apify Actors for Scraping Bluesky, Substack, and Hacker News (No API Keys Needed)

I built three scrapers for platforms that researchers and developers commonly need data from. All three use pay-per-event pricing, are free to run until March 21, and require no API keys.

If you've ever needed to pull data from Bluesky, Substack, or Hacker News, you know the drill: write a custom script, handle pagination, deal with rate limits, parse HTML. These three Apify Actors handle all of that out of the box.


1. Bluesky Scraper

Link: Bluesky Scraper on Apify Store

What it does: Scrapes posts, user profiles, and search results from Bluesky via the AT Protocol.

Why Bluesky: The AT Protocol is fully open — no authentication tokens needed for public data. With 30M+ users and growing, Bluesky is becoming a primary data source for social media researchers and trend analysts.
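To make the "no authentication for public data" point concrete, here is a minimal sketch of cursor-based pagination against a public AT Protocol endpoint. It is not the Actor's actual implementation. It assumes the public AppView host `public.api.bsky.app` and the `app.bsky.feed.getAuthorFeed` method; the helper names (`build_url`, `fetch_json`, `iter_author_posts`) are made up for illustration, so check the current Bluesky docs before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public AppView host; verify against the current Bluesky documentation.
BASE = "https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed"


def build_url(actor, limit=50, cursor=None):
    """Build a getAuthorFeed URL for one page of results."""
    params = {"actor": actor, "limit": limit}
    if cursor:
        params["cursor"] = cursor
    return f"{BASE}?{urlencode(params)}"


def fetch_json(url):
    """Fetch a URL and decode the JSON body (no auth header is sent)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_author_posts(actor, max_posts=100, fetch=fetch_json):
    """Yield up to max_posts posts, following the cursor page by page."""
    seen = 0
    cursor = None
    while seen < max_posts:
        # The endpoint caps limit at 100 per request.
        page = fetch(build_url(actor, min(100, max_posts - seen), cursor))
        feed = page.get("feed", [])
        for entry in feed:
            yield entry["post"]
            seen += 1
            if seen >= max_posts:
                return
        cursor = page.get("cursor")
        if not cursor or not feed:
            return  # no more pages
```

The `fetch` parameter is injectable so the pagination logic can be exercised offline with a stub before pointing it at the live network.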

Example input:

{
  "searchTerms": ["web scraping", "data extraction"],
  "maxPosts": 100,
  "includeReplies": false
}

This pulls up to 100 posts matching your search terms. You can also scrape specific user profiles or full thread conversations.
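If you would rather start a run from code than from the Apify Console, the example input above can be posted to Apify's `run-sync-get-dataset-items` endpoint. The sketch below only builds the request and does not send it. The Actor ID `someuser/bluesky-scraper` is a placeholder, not the real ID (copy that from the Actor's Store page), and you need your own Apify API token.

```python
import json
from urllib.request import Request

APIFY_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, run_input, token):
    """Build a POST request that runs an Actor and returns its dataset items."""
    # In URL paths, Apify expects "username~actor-name" rather than "username/actor-name".
    path_id = actor_id.replace("/", "~")
    url = f"{APIFY_BASE}/acts/{path_id}/run-sync-get-dataset-items"
    body = json.dumps(run_input).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    return Request(url, data=body, headers=headers, method="POST")


# Placeholder Actor ID and token, for illustration only.
request = build_run_request(
    "someuser/bluesky-scraper",
    {"searchTerms": ["web scraping", "data extraction"], "maxPosts": 100, "includeReplies": False},
    "YOUR_APIFY_TOKEN",
)
```

Passing `request` to `urllib.request.urlopen` would start the run and, once it finishes, return the dataset items as a JSON array.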


2. Substack Scraper

Link: Substack Scraper on Apify Store

What it does: Scrapes newsletter posts, author metadata, and publication details from any public Substack.

Why Substack: Substack exposes an unofficial JSON API for public content — no auth required. This makes it straightforward to collect article text, subscriber counts, and publication metadata at scale.
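As a rough illustration of what offset-based paging over that unofficial API can look like, the sketch below generates page URLs for a publication's archive. The `/api/v1/archive` path and its `sort`, `offset`, and `limit` parameters are assumptions based on commonly observed behavior, not a documented contract, and may change without notice. `archive_page_urls` is a hypothetical helper, not part of the Actor.

```python
from urllib.parse import urlencode, urljoin


def archive_page_urls(publication_url, max_posts, page_size=12):
    """Return the archive URLs needed to cover max_posts, newest first."""
    base = urljoin(publication_url.rstrip("/") + "/", "api/v1/archive")
    urls = []
    for offset in range(0, max_posts, page_size):
        # The final page may need fewer than page_size posts.
        limit = min(page_size, max_posts - offset)
        query = urlencode({"sort": "new", "offset": offset, "limit": limit})
        urls.append(f"{base}?{query}")
    return urls


for url in archive_page_urls("https://platformer.news", max_posts=50, page_size=20):
    print(url)
```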

Example input:

{
  "publicationUrls": [
    "https://platformer.news",
    "https://www.lennysnewsletter.com"
  ],
  "maxPostsPerPublication": 50
}

This scrapes the 50 most recent posts from each publication, including full article text, dates, likes, and author info.


3. Hacker News Scraper

Link: Hacker News Scraper on Apify Store

What it does: Scrapes stories, comments, and user profiles from Hacker News.

Why HN: Hacker News has an official Firebase-backed API that requires no authentication and currently enforces no rate limit. The scraper wraps this into a structured output with filtering, sorting, and comment threading built in.
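For a sense of what comment threading involves, here is a minimal sketch that walks the `kids` IDs returned by the official item endpoint and nests replies under their parent. It shows the general technique, not the Actor's actual code. The output shape (`id`, `by`, `text`, `replies`) and the `max_depth` cutoff are choices made for this example.

```python
import json
from urllib.request import urlopen

HN_ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_item(item_id):
    """Fetch one story or comment; the API returns null for unknown IDs."""
    with urlopen(HN_ITEM_URL.format(id=item_id), timeout=30) as resp:
        return json.load(resp)


def build_thread(item_id, fetch=fetch_item, max_depth=5):
    """Recursively build a nested comment tree rooted at item_id."""
    item = fetch(item_id)
    # Skip missing, deleted, and dead items along with their subtrees.
    if item is None or item.get("deleted") or item.get("dead"):
        return None
    node = {
        "id": item["id"],
        "by": item.get("by"),
        "text": item.get("text") or item.get("title"),
        "replies": [],
    }
    if max_depth > 0:
        for kid_id in item.get("kids", []):
            child = build_thread(kid_id, fetch, max_depth - 1)
            if child is not None:
                node["replies"].append(child)
    return node
```

Each item costs one HTTP request, so a large thread means many sequential calls. A production scraper would likely fetch children concurrently.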

Example input:

{
  "scrapeType": "search",
  "searchQuery": "LLM fine-tuning",
  "maxItems": 200,
  "includeComments": true
}

This searches HN for stories about LLM fine-tuning and includes the full comment trees — useful for sentiment analysis or finding expert opinions.


Why Use These vs. Building Your Own?

| | DIY Script | Apify Actor |
|---|---|---|
| Setup time | Hours to days | Minutes |
| Pagination | You handle it | Built-in |
| Output format | Whatever you code | JSON, CSV, Excel, or direct to your DB |
| Scheduling | Cron jobs on your server | Built-in scheduler on Apify |
| Proxy rotation | You manage it | Handled automatically |
| Maintenance | You fix it when the site changes | Actor updates handle it |

If you need a one-off data pull, a DIY script works. If you need recurring scrapes, structured output, or you just don't want to spend a day writing pagination logic, these Actors save real time.


Try Them Out

All three are live on the Apify Store with free trials: Bluesky Scraper, Substack Scraper, and Hacker News Scraper.

Each Actor runs on pay-per-event pricing. You get results as structured JSON, ready for analysis, storage, or piping into your data pipeline.
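Once you have the JSON items, flattening them for a spreadsheet takes only a few lines. The sketch below assumes each item is a flat dictionary. The field names `title` and `points` are hypothetical, so substitute whatever fields the Actor's dataset actually contains.

```python
import csv
import io


def items_to_csv(items, fields):
    """Serialize selected fields of each item to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells; extra fields are dropped.
        writer.writerow({field: item.get(field, "") for field in fields})
    return buf.getvalue()


# Hypothetical items, for illustration only.
sample = [{"title": "A", "points": 3, "extra": 1}, {"title": "B, C"}]
print(items_to_csv(sample, ["title", "points"]))
```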

If you have questions or feature requests, drop a comment or open an issue on the Actor page. Happy scraping.
