How to Build and Monetize Apify Actors — A Practical Guide from Shipping 10 Actors
I published 10 Apify Actors in two days. Here's everything I learned about building, pricing, and listing web scraping tools on the Apify marketplace.
What Are Apify Actors?
Apify Actors are serverless programs that run in the cloud. Think of them as Lambda functions specifically designed for web scraping and automation. You write the code, Apify handles the infrastructure — proxy rotation, scheduling, data storage, and scaling.
The marketplace lets you sell (or give away) your Actors to other users. Think RapidAPI, but for scraping.
My 10 Actors
| Actor | What It Does |
|---|---|
| Google SERP Scraper | Extracts search results for any query |
| Amazon Product Scraper | Gets product details, prices, reviews |
| YouTube Channel Analyzer | Channel stats, video list, engagement |
| Instagram Profile Scraper | Public profile data and post metrics |
| Website Tech Detector | Identifies CMS, frameworks, analytics |
| Email Finder | Extracts emails from any website |
| SEO Audit Tool | On-page SEO analysis |
| Sitemap Extractor | Parses and analyzes XML sitemaps |
| Social Media Bio Scraper | Cross-platform profile data |
| News Article Extractor | Clean article text from news sites |
Actor Architecture
Every Actor follows this structure:
```javascript
import { Actor } from 'apify';

await Actor.init();

// 1. Get input (fall back to an empty object so destructuring can't throw)
const input = (await Actor.getInput()) ?? {};
const { url, maxResults = 10 } = input;

// 2. Validate
if (!url) throw new Error('URL is required');

// 3. Do the work
const results = await scrapeData(url, maxResults);

// 4. Store results
await Actor.pushData(results);

// 5. Cleanup
await Actor.exit();
```
That's it. Five steps. The framework handles everything else.
Key Technical Decisions
Use Cheerio, Not Puppeteer (When Possible)
Puppeteer (headless Chrome) is powerful but expensive:
- ~256MB memory per browser instance
- Slower startup
- Higher compute costs on Apify
Cheerio (HTML parser) is 10x cheaper:
- ~50MB memory
- Instant parsing
- No browser overhead
Rule of thumb: If the data is in the initial HTML response, use Cheerio. Only use Puppeteer for JavaScript-rendered content.
Handle Anti-Scraping Gracefully
```javascript
import { Actor } from 'apify';
import { gotScraping } from 'got-scraping';

// Retry with exponential backoff
async function fetchWithRetry(url, retries = 3) {
    const proxyConfiguration = await Actor.createProxyConfiguration({
        groups: ['RESIDENTIAL'],
    });
    for (let i = 0; i < retries; i++) {
        try {
            const response = await gotScraping({
                url,
                proxyUrl: await proxyConfiguration.newUrl(),
            });
            if (response.statusCode === 200) return response;
        } catch (e) {
            if (i === retries - 1) throw e;
        }
        // Wait 1s, 2s, 4s, ... before the next attempt
        await new Promise((r) => setTimeout(r, 1000 * 2 ** i));
    }
    throw new Error(`Failed to fetch ${url} after ${retries} attempts`);
}
```
Sending browser-like requests through residential proxies solves the vast majority of blocking issues.
Define a Clear Input Schema
```json
{
    "title": "Google SERP Scraper",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "query": {
            "title": "Search Query",
            "type": "string",
            "editor": "textfield",
            "description": "The search term to look up"
        },
        "maxResults": {
            "title": "Max Results",
            "type": "integer",
            "description": "Maximum number of results to return",
            "default": 10,
            "maximum": 100
        },
        "country": {
            "title": "Country Code",
            "type": "string",
            "editor": "select",
            "description": "Country to localize results for",
            "default": "us",
            "enum": ["us", "uk", "jp", "de", "fr"]
        }
    },
    "required": ["query"]
}
```
A good input schema makes your Actor self-documenting and creates a nice form UI in the Apify console.
Pricing Strategy
I tested three approaches:
| Model | Result |
|---|---|
| Free only | Downloads but no revenue |
| Pay per result ($0.01/result) | Low adoption — users can't estimate cost |
| Free tier + usage pricing | Best of both worlds |
What works: Give away 100 free results/month, then charge per result above that. Users try it risk-free, then pay when they need volume.
Publishing Checklist
Before publishing any Actor:
- [ ] Input schema with descriptions and defaults
- [ ] README with usage examples and sample output
- [ ] Error handling for all edge cases
- [ ] Rate limiting to respect target sites
- [ ] Proxy configuration for blocked sites
- [ ] Output schema documented
- [ ] At least 3 test runs with different inputs
- [ ] SEO-optimized title and description
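On the rate-limiting item: Crawlee's crawlers expose `maxConcurrency` and `maxRequestsPerMinute` options for exactly this, but the idea is simple enough to hand-roll. A framework-free sketch (the `createRateLimiter` helper is illustrative, not part of any SDK):

```javascript
// Cap outbound requests at `perMinute` by spacing calls evenly.
function createRateLimiter(perMinute) {
    const minGapMs = 60_000 / perMinute;
    let nextSlot = 0;
    return async function wait() {
        const now = Date.now();
        const delay = Math.max(0, nextSlot - now);
        nextSlot = Math.max(now, nextSlot) + minGapMs;
        if (delay > 0) await new Promise((r) => setTimeout(r, delay));
    };
}

const limit = createRateLimiter(600); // at most ~10 requests/second
const start = Date.now();
for (let i = 0; i < 5; i++) {
    await limit();
    // fetch(targetUrl) would go here
}
const elapsed = Date.now() - start;
console.log(`5 calls took ${elapsed}ms`); // roughly 400ms: four 100ms gaps
```

In a real Actor you would call `await limit()` once before each request to the target site.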
Lessons Learned
1. Scraping is a cat-and-mouse game. Sites change their HTML structure without warning. Build your selectors to be resilient — use data attributes and semantic selectors over brittle class names.
2. Documentation sells more than features. Actors with clear READMEs and example outputs get 3x more adoption than feature-rich but poorly documented ones.
3. The Apify SDK handles 80% of the complexity. Proxy rotation, request queuing, data storage — all built in. Focus on your scraping logic, not infrastructure.
4. Start with Cheerio, upgrade to Puppeteer only when needed. You'll save money and your Actors will run faster.
Get Started
My Apify profile: apify.com/miccho27
All 10 Actors have free tiers. Try them out.
Other tools I've built:
- 24 REST APIs on RapidAPI — Free tier available
- ListingAI — AI product description generator
- 468 Calculator Tools
Solo developer shipping from Paraguay. Follow for more on web scraping, APIs, and indie products.