Posted by miccho27
Web Scraping Without Infrastructure — How I Built and Monetized 14 Apify Actors

Web scraping has an infrastructure problem: you need proxies, headless browsers, rate limiting, scheduling, result storage, and billing. Building all that yourself is a week of work before you've scraped a single page.

Apify solves this. And their store model lets you publish scraping actors that other developers pay to use — without you managing any of that infrastructure.

I've published 14 actors on Apify Store. Here's what each does, the pricing model, and what I'd do differently.

The 14 Actors

| Actor | What It Scrapes | Price |
| --- | --- | --- |
| Social Video Downloader | Download links from social video platforms | $2.00 / 1K results |
| Keyword Research (Google Suggest) | Google autocomplete suggestions for any seed keyword | $3.00 / 1K results |
| Trends Aggregator | Google Trends + Reddit + HN + GitHub trending in one call | $2.00 / 1K results |
| Company Data Enricher | Company info, tech stack, contacts from domain | $5.00 / 1K results |
| SEO Analyzer | On-page SEO metrics for any URL | $3.00 / 1K results |
| Website Tech Detector | Detect CMS, frameworks, analytics tools on any site | $3.00 / 1K results |
| Amazon Product Scraper | Product data, pricing, reviews from Amazon | $3.00 / 1K results |
| Email Finder | Extract emails from any webpage | $4.00 / 1K results |
| Google Maps Scraper | Business listings, ratings, addresses from Maps | $3.00 / 1K results |
| Google Maps Reviews Scraper | Detailed reviews from Google Maps listings | $3.00 / 1K results |
| YouTube Channel Analytics | Channel stats, top videos, engagement data | $3.00 / 1K results |

Plus 3 more in the portfolio at various price points.

The Architecture

Each actor follows the same pattern:

import { Actor } from 'apify';
import { PuppeteerCrawler } from 'crawlee';

await Actor.init();

const input = await Actor.getInput();
const { url, maxItems = 100 } = input ?? {};

const crawler = new PuppeteerCrawler({
  async requestHandler({ page, request }) {
    // Extract data from the rendered DOM
    const data = await page.evaluate(() => ({
      url: location.href,
      title: document.title,
      // ...site-specific scraping logic
    }));

    await Actor.pushData(data);
  },
  maxRequestsPerCrawl: maxItems,
});

await crawler.run([url]);
await Actor.exit();

Apify handles:

  • Proxy rotation
  • Browser fingerprinting
  • Retry logic
  • Result storage
  • Billing
  • API endpoints for each actor

You write the scraping logic. They handle everything else.
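That last point is easy to overlook: every published actor automatically becomes a REST API. As a rough sketch (the actor ID and token below are placeholders, and the helper functions are my own illustration, not SDK code), a caller can run an actor synchronously and get the dataset back in one request:

```javascript
// Build Apify's "run synchronously and return dataset items" endpoint.
// In URLs, actor IDs use the "username~actor-name" form.
function buildRunSyncUrl(actorId, token) {
  const base = 'https://api.apify.com/v2/acts';
  return `${base}/${actorId}/run-sync-get-dataset-items?token=${encodeURIComponent(token)}`;
}

// Hypothetical caller: POST the actor input as JSON, receive scraped items.
async function runActor(actorId, token, input) {
  const res = await fetch(buildRunSyncUrl(actorId, token), {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(input),
  });
  // The response body is an array of results, one per Actor.pushData() call.
  return res.json();
}
```

So a user who never opens the Apify Console can still wire an actor into their own pipeline with a single HTTP call.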

Input Schema (Critical for Discoverability)

Every actor needs a well-defined input schema. This is what shows up in the Apify Console UI and determines whether users can figure out how to use your actor:

{
  "title": "Google Maps Scraper",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "searchQuery": {
      "title": "Search Query",
      "type": "string",
      "description": "What to search for (e.g., 'coffee shops in Tokyo')",
      "prefill": "coffee shops in Tokyo",
      "editor": "textfield"
    },
    "maxResults": {
      "title": "Max Results",
      "type": "integer",
      "description": "Maximum number of results to return",
      "default": 20,
      "prefill": 20
    }
  },
  "required": ["searchQuery"]
}

Pro tip: Add prefill values. Users are more likely to run your actor immediately if there's a working example pre-loaded.
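To make the `default` field concrete: at run time, fields the user leaves out get filled in from the schema. The merge helper below is my own illustration of that behavior (Apify applies defaults for you; you don't write this yourself):

```javascript
// Illustration only: fill missing input fields from a schema's "default" values.
function applyDefaults(schema, userInput) {
  const out = { ...userInput };
  for (const [key, prop] of Object.entries(schema.properties)) {
    if (out[key] === undefined && prop.default !== undefined) {
      out[key] = prop.default;
    }
  }
  return out;
}

const schema = {
  properties: {
    searchQuery: { type: 'string' },
    maxResults: { type: 'integer', default: 20 },
  },
};

// User only supplies a query; maxResults falls back to 20.
const input = applyDefaults(schema, { searchQuery: 'coffee shops in Tokyo' });
```

`prefill`, by contrast, only seeds the Console form so the first run works out of the box; it doesn't affect API calls.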

Pricing Strategy

I landed on $2–$5 per 1,000 results after researching competitor actors on the store.

The rule I used:

  • Simple scraping (listings, basic data): $2–$3
  • Data enrichment (multiple sources, cleaning): $4–$5
  • High-value B2B data (company intel, email finding): $5+

The platform takes a cut, and Apify charges for compute. The economics work because the marginal cost per run is small and the pricing is usage-based.
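To make that concrete, here's a back-of-the-envelope margin calculation. The 20% platform fee and the per-run compute cost are assumptions for illustration, not Apify's actual numbers:

```javascript
// All figures are assumptions for illustration only.
const pricePer1k = 3.0;         // what the user pays per 1,000 results
const platformFeeRate = 0.2;    // assumed store commission
const computeCostPerRun = 0.05; // assumed compute cost of a 1,000-result run

const payout = pricePer1k * (1 - platformFeeRate); // 2.40
const margin = payout - computeCostPerRun;         // 2.35 per 1,000 results
```

Under these assumptions, roughly $2.35 of every $3.00 run is kept, which is why usage-based pricing works even at small volumes.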

First Review

The Trends Aggregator actor got the first review: 5.0/5 stars, comment: "useful data." That single review increased trial runs noticeably — social proof matters even at this small scale.

What I'd Do Differently

1. Better README from day one. Apify discovery depends heavily on the actor's README quality. I went back and rewrote all 14 READMEs after the fact. Do it upfront.

2. More prefill examples. Users abandon actors they can't figure out in 60 seconds. Good prefill values with realistic examples make the difference.

3. Build on RapidAPI APIs first, then Apify. I have 43 APIs on RapidAPI that power much of the underlying data. The architecture stacks well: RapidAPI API → Apify Actor → end user. Two revenue streams from the same data infrastructure.

The Portfolio Play

14 actors on Apify + 43 APIs on RapidAPI = two monetized interfaces to the same underlying data capabilities. Neither requires much ongoing work to maintain, and both compound as more users discover them.

It's not passive income — there's maintenance, broken scrapers when sites update, support questions. But it's close.

Browse the actors: Search "miccho27" on Apify Store or visit apify.com/miccho27.


Have a scraping use case that isn't covered? Drop it in the comments, and I might build it as actor #15.
