DEV Community

miccho27

How to Build and Monetize Apify Actors — A Practical Guide from Shipping 10 Actors


I published 10 Apify Actors in two days. Here's everything I learned about building, pricing, and listing web scraping tools on the Apify marketplace.


What Are Apify Actors?

Apify Actors are serverless programs that run in the cloud. Think of them as Lambda functions specifically designed for web scraping and automation. You write the code, Apify handles the infrastructure — proxy rotation, scheduling, data storage, and scaling.

The marketplace lets you sell (or give away) your Actors to other users. Think RapidAPI, but for scraping.


My 10 Actors

  • Google SERP Scraper: Extracts search results for any query
  • Amazon Product Scraper: Gets product details, prices, reviews
  • YouTube Channel Analyzer: Channel stats, video list, engagement
  • Instagram Profile Scraper: Public profile data and post metrics
  • Website Tech Detector: Identifies CMS, frameworks, analytics
  • Email Finder: Extracts emails from any website
  • SEO Audit Tool: On-page SEO analysis
  • Sitemap Extractor: Parses and analyzes XML sitemaps
  • Social Media Bio Scraper: Cross-platform profile data
  • News Article Extractor: Clean article text from news sites

Actor Architecture

Every Actor follows this structure:

import { Actor } from 'apify';

await Actor.init();

// 1. Get input
const input = await Actor.getInput();
const { url, maxResults = 10 } = input ?? {}; // getInput() can return null

// 2. Validate
if (!url) throw new Error('URL is required');

// 3. Do the work
const results = await scrapeData(url, maxResults);

// 4. Store results
await Actor.pushData(results);

// 5. Cleanup
await Actor.exit();

That's it. Five steps. The framework handles everything else.
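
Alongside the main script, each Actor ships a small config file that tells the platform where everything lives. A minimal .actor/actor.json might look like this (the name and file paths are illustrative):

```json
{
    "actorSpecification": 1,
    "name": "google-serp-scraper",
    "version": "0.1",
    "input": "./input_schema.json",
    "dockerfile": "./Dockerfile"
}
```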


Key Technical Decisions

Use Cheerio, Not Puppeteer (When Possible)

Puppeteer (headless Chrome) is powerful but expensive:

  • ~256MB memory per browser instance
  • Slower startup
  • Higher compute costs on Apify

Cheerio (HTML parser) is 10x cheaper:

  • ~50MB memory
  • Instant parsing
  • No browser overhead

Rule of thumb: If the data is in the initial HTML response, use Cheerio. Only use Puppeteer for JavaScript-rendered content.
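
One way to apply that rule of thumb before committing to a crawler type: fetch the page once and check whether the data you need is already in the server-rendered HTML. A hypothetical helper (the marker strings are whatever unique text you expect to find in the data):

```javascript
// Hypothetical heuristic: decide Cheerio vs. Puppeteer by probing the raw
// HTML. If none of the expected markers appear in the server response, the
// content is likely rendered client-side and needs a real browser.
function needsBrowser(rawHtml, markers) {
  return !markers.some((marker) => rawHtml.includes(marker));
}

// A bare SPA shell: nothing useful in the HTML, so reach for Puppeteer.
needsBrowser('<div id="root"></div>', ['product-title']); // true

// Server-rendered markup: the data is already there, so Cheerio suffices.
needsBrowser('<h1 class="product-title">Widget</h1>', ['product-title']); // false
```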

Handle Anti-Scraping Gracefully

// Retry with exponential backoff through Apify residential proxies
import { gotScraping } from 'got-scraping';

async function fetchWithRetry(url, retries = 3) {
  const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL'],
  });
  for (let i = 0; i < retries; i++) {
    try {
      const response = await gotScraping({
        url,
        proxyUrl: await proxyConfiguration.newUrl(),
      });
      if (response.statusCode === 200) return response;
    } catch (e) {
      if (i === retries - 1) throw e;
    }
    // Back off 1s, 2s, 4s, ... before the next attempt
    await new Promise((r) => setTimeout(r, 1000 * 2 ** i));
  }
  throw new Error(`All ${retries} attempts failed for ${url}`);
}

Browser-like request fingerprints combined with residential proxies solve 90% of blocking issues.

Define a Clear Input Schema

{
  "title": "Google SERP Scraper",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "query": {
      "title": "Search Query",
      "type": "string",
      "description": "The search term to look up",
      "editor": "textfield"
    },
    "maxResults": {
      "title": "Max Results",
      "type": "integer",
      "description": "Maximum number of results to return",
      "default": 10,
      "maximum": 100
    },
    "country": {
      "title": "Country Code",
      "type": "string",
      "description": "Country to localize the search results for",
      "default": "us",
      "enum": ["us", "uk", "jp", "de", "fr"],
      "editor": "select"
    }
  },
  "required": ["query"]
}

A good input schema makes your Actor self-documenting and creates a nice form UI in the Apify console.


Pricing Strategy

I tested three approaches:

  • Free only: Downloads but no revenue
  • Pay per result ($0.01/result): Low adoption; users can't estimate cost up front
  • Free tier + usage pricing: Best of both worlds

What works: Give away 100 free results/month, then charge per result above that. Users try it risk-free, then pay when they need volume.
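
The free-tier-plus-usage model is easy to reason about in code. Here's a sketch of the metering logic; the tier size and per-result price are illustrative numbers, not Apify's actual billing API:

```javascript
// Illustrative free-tier metering: the first FREE_TIER results each month
// cost nothing, and everything above the tier is billed per result.
const FREE_TIER = 100;          // free results per user per month (example)
const PRICE_PER_RESULT = 0.002; // USD per result above the tier (example)

function monthlyCharge(resultsAlreadyUsed, newResults) {
  // How much of the free allowance is left this month
  const freeRemaining = Math.max(0, FREE_TIER - resultsAlreadyUsed);
  // Only results beyond the remaining allowance are billable
  const billable = Math.max(0, newResults - freeRemaining);
  return { billable, charge: billable * PRICE_PER_RESULT };
}

// Fully inside the free tier: nothing to bill.
monthlyCharge(0, 50);  // { billable: 0, charge: 0 }

// 80 results already used, 50 more requested: 20 are free, 30 are billed.
monthlyCharge(80, 50); // billable: 30
```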


Publishing Checklist

Before publishing any Actor:

  • [ ] Input schema with descriptions and defaults
  • [ ] README with usage examples and sample output
  • [ ] Error handling for all edge cases
  • [ ] Rate limiting to respect target sites
  • [ ] Proxy configuration for blocked sites
  • [ ] Output schema documented
  • [ ] At least 3 test runs with different inputs
  • [ ] SEO-optimized title and description

Lessons Learned

1. Scraping is a cat-and-mouse game. Sites change their HTML structure without warning. Build your selectors to be resilient — use data attributes and semantic selectors over brittle class names.
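
One way to make selectors resilient is to try a prioritized list, from most stable to least. A hypothetical helper that works with any Cheerio-style $ function:

```javascript
// Hypothetical fallback chain: try selectors from most stable (data
// attributes, semantic markup) to least stable (generated class names),
// returning the first non-empty match.
function selectFirst($, selectors) {
  for (const selector of selectors) {
    const match = $(selector);
    if (match.length > 0) return match;
  }
  return null;
}

// Usage with a Cheerio document (selectors here are illustrative):
// const title = selectFirst($, [
//   '[data-testid="product-title"]', // most stable
//   'h1[itemprop="name"]',           // semantic fallback
//   'h1',                            // generic fallback
//   '.css-1x7f2kq',                  // brittle last resort
// ]);
```

When the site ships a redesign, only the last entries in the chain tend to break, and the Actor keeps working off the stable ones.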

2. Documentation sells more than features. Actors with clear READMEs and example outputs get 3x more adoption than feature-rich but poorly documented ones.

3. The Apify SDK handles 80% of the complexity. Proxy rotation, request queuing, data storage — all built in. Focus on your scraping logic, not infrastructure.

4. Start with Cheerio, upgrade to Puppeteer only when needed. You'll save money and your Actors will run faster.


Get Started

My Apify profile: apify.com/miccho27

All 10 Actors have free tiers. Try them out.


Solo developer shipping from Paraguay. Follow for more on web scraping, APIs, and indie products.
