Crawlee Is a Free Web Scraping Framework: Build Reliable Scrapers with Auto-Retry and Proxy Rotation


Why Crawlee?

Crawlee (by Apify) is an open-source web scraping library for Node.js. It provides automatic retries, proxy rotation, and request queuing, and it supports both HTTP and browser-based scraping.

Scaffold and start a new project with the CLI:

npx crawlee create my-scraper
cd my-scraper && npm start
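
To make "request queuing" a little more concrete, here is a minimal sketch of the idea: a FIFO list of URLs plus a set that filters out duplicates. The `SimpleRequestQueue` class is a hypothetical illustration written for this post, not Crawlee's actual `RequestQueue`, which also persists its state and tracks retries.

```javascript
// Simplified illustration only; Crawlee's real RequestQueue is persistent and more capable.
class SimpleRequestQueue {
  constructor() {
    this.pending = []      // URLs waiting to be processed, in FIFO order
    this.seen = new Set()  // every URL ever added, used for de-duplication
  }

  // Add a URL unless it was already queued; returns true if it was added.
  add(url) {
    const key = new URL(url).href // parse and serialize so equivalent URLs compare equal
    if (this.seen.has(key)) return false
    this.seen.add(key)
    this.pending.push(key)
    return true
  }

  // Take the next URL to process, or undefined when the queue is empty.
  next() {
    return this.pending.shift()
  }
}

const queue = new SimpleRequestQueue()
queue.add('https://example.com/product/1')
queue.add('https://example.com/product/2')
queue.add('https://example.com/product/1') // duplicate, ignored
console.log(queue.pending.length) // 2
```

De-duplication matters most once a crawler starts following links, because the same page is usually reachable from many places.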

HTTP Scraping (Fast)

import { CheerioCrawler } from 'crawlee'

const crawler = new CheerioCrawler({
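  // Called once per page; $ is a Cheerio handle on the downloaded HTML
  async requestHandler({ request, $ }) {
    // trim() strips the whitespace that often surrounds element text
    const title = $('h1').text().trim()
    const price = $('.price').text().trim()
    console.log({ url: request.url, title, price })
  },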
})

await crawler.run(['https://example.com/product/1', 'https://example.com/product/2'])
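
The `price` above is still raw text such as "$1,299.00". If you need a number, a small helper can normalize it before you log or store it. This is a sketch: `parsePrice` is a hypothetical helper, not part of Crawlee, and it assumes "." is the decimal separator and "," is a thousands separator.

```javascript
// Hypothetical helper (not part of Crawlee): turn scraped price text into a number.
// Assumes "." is the decimal separator and "," is a thousands separator.
function parsePrice(rawText) {
  // Keep only digits and the decimal point; drops currency symbols, spaces, commas.
  const cleaned = rawText.replace(/[^0-9.]/g, '')
  const value = Number.parseFloat(cleaned)
  // Return null instead of NaN so missing prices are easy to filter out later.
  return Number.isNaN(value) ? null : value
}

console.log(parsePrice(' $1,299.00 ')) // 1299
console.log(parsePrice('Sold out'))    // null
```

Sites that use other number formats (for example "1.299,00") would need a different cleaning rule.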

Browser Scraping (JavaScript-Heavy Sites)

// Requires Playwright alongside Crawlee: npm install crawlee playwright
import { PlaywrightCrawler } from 'crawlee'

const crawler = new PlaywrightCrawler({
  async requestHandler({ page, request }) {
    await page.waitForSelector('.product-list')
    const products = await page.$$eval('.product', (els) =>
      els.map((el) => ({
        name: el.querySelector('.name')?.textContent,
        price: el.querySelector('.price')?.textContent,
      }))
    )
    console.log(products)
  },
})

await crawler.run(['https://spa-site.com/products'])

Auto-Retry + Proxy Rotation

import { CheerioCrawler, ProxyConfiguration } from 'crawlee'

const proxyConfig = new ProxyConfiguration({
  proxyUrls: ['http://proxy1:8080', 'http://proxy2:8080'],
})

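const crawler = new CheerioCrawler({
  proxyConfiguration: proxyConfig,
  maxRequestRetries: 3, // retry a failed request up to 3 times
  requestHandlerTimeoutSecs: 30, // treat the handler as failed after 30 seconds
  async requestHandler({ request, $ }) {
    // Requests rotate through the proxies above; if this handler throws
    // or times out, Crawlee retries the request automatically
    console.log({ url: request.url, title: $('h1').text() })
  },
})

await crawler.run(['https://example.com/product/1'])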
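
To show roughly what these two options do for you, here is a minimal sketch of retrying with round-robin proxy rotation in plain JavaScript. It is an illustration under stated assumptions, not Crawlee's implementation: `fetchWithRetry` and the injected `fetchFn(url, proxyUrl)` are hypothetical names, and Crawlee's real rotation can also take sessions and proxy health into account.

```javascript
// Hypothetical sketch: fetchFn(url, proxyUrl) is an assumed function supplied by the caller.
async function fetchWithRetry(url, { proxyUrls, maxRequestRetries = 3, fetchFn }) {
  let lastError
  // One initial attempt plus up to maxRequestRetries retries.
  for (let attempt = 0; attempt <= maxRequestRetries; attempt++) {
    // Round-robin: each attempt uses the next proxy in the list.
    const proxyUrl = proxyUrls[attempt % proxyUrls.length]
    try {
      return await fetchFn(url, proxyUrl)
    } catch (err) {
      lastError = err // remember the failure and try again with the next proxy
    }
  }
  // Every attempt failed; surface the last error to the caller.
  throw lastError
}

// Demo with a fake fetchFn that fails on the first proxy and succeeds on the second.
const fakeFetch = async (url, proxyUrl) => {
  if (proxyUrl === 'http://proxy1:8080') throw new Error('blocked')
  return `fetched ${url} via ${proxyUrl}`
}

fetchWithRetry('https://example.com/product/1', {
  proxyUrls: ['http://proxy1:8080', 'http://proxy2:8080'],
  fetchFn: fakeFetch,
}).then(console.log)
// fetched https://example.com/product/1 via http://proxy2:8080
```

Writing and maintaining this loop by hand for every scraper is the boilerplate that a framework takes off your plate.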

Save to Dataset

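import { Dataset } from 'crawlee'

// Call this inside a requestHandler, where title, price, and request are in scope
await Dataset.pushData({ title, price, url: request.url })
// Each record is saved as a JSON file under ./storage/datasets/default/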

Crawlee vs Puppeteer vs Playwright

Feature	Crawlee	Puppeteer	Playwright
Auto-retry	Yes	No	No
Proxy rotation	Yes	Manual	Manual
Request queue	Yes	No	No
Dataset storage	Yes	No	No
Scraping modes	HTTP and browser	Browser only	Browser only
