roberto degani
How I Built a Web Scraping API That Handles 100K Requests/Day for Free

I needed to scrape product prices across 50+ competitor sites. Hiring a scraping service? $500/month minimum. Building it myself on traditional infrastructure? The bandwidth costs alone would kill me.

Then I discovered Cloudflare Workers. Three months later, I'm handling 100K+ requests daily, all within the free tier.

The Problem I Solved

My e-commerce startup needed real-time price monitoring across competitors. Manual checks? Impossible. Existing APIs? Either overpriced or inconsistent. So I built the Degani Web Scraper API—deployed on Cloudflare Workers—and it's been running lean ever since.

The wins:

  • Extracts structured data from any webpage in milliseconds
  • Handles 100K daily requests within Cloudflare's free tier
  • Zero infrastructure to manage
  • Rate limiting built-in
  • Returns clean JSON—not HTML soup

What This API Actually Does

The endpoint structure is simple. You POST to:

https://degani-web-scraper.deganiagency.workers.dev

And it gives you clean data back. Here's what you can extract:

POST /extract - Full DOM extraction with CSS selectors
POST /meta - Meta tags, titles, descriptions
POST /links - All anchor tags with href validation
POST /images - Image URLs with alt text
POST /text - Plain text content, cleaned

Real Use Cases (And Why They Matter)

1. Price Monitoring for E-commerce

Monitor competitor pricing in real-time. I use this daily to catch pricing wars before they start.

curl -X POST https://degani-web-scraper.deganiagency.workers.dev/extract \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://competitor.com/products",
    "selectors": {
      "price": ".product-price",
      "title": ".product-title",
      "stock": ".stock-status"
    }
  }'

Response:

{
  "price": "$29.99",
  "title": "Premium Widget",
  "stock": "In Stock"
}
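To actually catch a pricing war, I diff each scraped price against the last one I stored. Here's a minimal sketch of that comparison; parsing `"$29.99"`-style strings is an assumption based on the example response above, and `parsePrice`/`priceChanged` are my own helper names, not part of the API:

```javascript
// Parse a price string like "$29.99" or "$1,299.00" into a number.
function parsePrice(text) {
  const match = text.replace(/,/g, '').match(/-?\d+(\.\d+)?/);
  return match ? Number(match[0]) : null;
}

// True when the newly scraped price differs from the stored one
// beyond a small tolerance (guards against float noise).
function priceChanged(previous, current, tolerance = 0.001) {
  const a = parsePrice(previous);
  const b = parsePrice(current);
  if (a === null || b === null) return false;
  return Math.abs(a - b) > tolerance;
}
```

Run the scrape on a schedule, call `priceChanged` against yesterday's value, and alert only when it returns true.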

2. Lead Generation & B2B Prospecting

Extract company info from directories—contact names, emails, phone numbers. Perfect for building prospect lists.

const response = await fetch(
  'https://degani-web-scraper.deganiagency.workers.dev/extract',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      url: 'https://directory.com/companies',
      selectors: {
        company: '.company-name',
        contact: '.contact-email',
        phone: '.phone-number'
      }
    })
  }
);

const data = await response.json();
console.log(data); // { company: "Acme Inc", contact: "...", phone: "..." }

3. SEO Audit Data Collection

Extract meta tags, headers, and structured data. Feed it into your SEO tools.

curl -X POST https://degani-web-scraper.deganiagency.workers.dev/meta \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

The response includes:

{
  "title": "Page Title",
  "description": "Meta description",
  "og_image": "https://...",
  "canonical": "https://...",
  "h1": ["Main Heading"],
  "language": "en"
}

Why Cloudflare Workers?

When I started, I was going to run this on a VPS. Then I calculated:

  • Bandwidth for 100K requests/day = ~$40-100/month
  • VPS cost = $10-20/month
  • Maintenance headaches = priceless

Cloudflare Workers changed the equation:

  • Distributed globally (sub-100ms response times)
  • No servers to manage
  • Auto-scales
  • Free tier handles 100K requests/day

It's not magic—it's smart infrastructure.
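Since rate limiting is built in, a polite client should back off when it gets throttled. Here's a minimal retry sketch; the assumption that throttled requests come back as HTTP 429 is mine, and `doFetch` is injectable so the logic can be exercised without hitting the network:

```javascript
// Retry with exponential backoff on 429 (throttled) and 5xx responses.
// `doFetch` is a zero-arg function returning a Promise of a Response-like object.
async function fetchWithBackoff(doFetch, { retries = 3, baseDelayMs = 500 } = {}) {
  let res;
  for (let attempt = 0; attempt <= retries; attempt++) {
    res = await doFetch();
    // Success or a non-retryable client error: return immediately.
    if (res.status !== 429 && res.status < 500) return res;
    if (attempt === retries) break;
    // Delays of 500ms, 1s, 2s, ... between attempts.
    await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
  }
  return res;
}
```

Call it as `fetchWithBackoff(() => fetch(endpoint, opts))` and your scraper degrades gracefully instead of hammering the API during a spike.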

Extraction in Action

Let's say you're scraping a product listing page:

const scrapeProducts = async (pageUrl) => {
  const response = await fetch(
    'https://degani-web-scraper.deganiagency.workers.dev/extract',
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        url: pageUrl,
        selectors: {
          products: {
            name: '.product-name',
            price: '.price',
            rating: '.stars',
            url: 'a.product-link'
          }
        }
      })
    }
  );

  return response.json();
};

// Usage
const products = await scrapeProducts('https://example.com/products');
// { products: [...] }

No parsing HTML by hand. No fighting regex. Just clean JSON.

Getting Started

  1. Hit the API endpoint with your target URL
  2. Define CSS selectors for what you need
  3. Get back structured JSON
  4. Build something awesome
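If you want those four steps wrapped in code, here's a minimal client helper. It mirrors the `/extract` examples above; the error handling and the injectable `fetchImpl` (handy for testing) are my additions, not part of the API:

```javascript
const ENDPOINT = 'https://degani-web-scraper.deganiagency.workers.dev/extract';

// POST a URL plus CSS selectors, get structured JSON back.
// `fetchImpl` defaults to the global fetch but can be stubbed in tests.
async function extract(url, selectors, fetchImpl = fetch) {
  const res = await fetchImpl(ENDPOINT, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, selectors }),
  });
  if (!res.ok) throw new Error(`extract failed: HTTP ${res.status}`);
  return res.json();
}
```

Usage: `const data = await extract('https://example.com/products', { price: '.price' });`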

The full API docs are available at https://rapidapi.com/deganiagency/api/web-scraper-extractor

What I Learned Building This

Lesson 1: Cloudflare Workers handle concurrency beautifully. I was worried about request spikes. Never happened.

Lesson 2: CSS selectors are powerful. Most sites use consistent class names; in my experience, the same selectors kept working on roughly 90% of the pages I targeted.

Lesson 3: Respect robots.txt and rate limits. The API enforces sensible defaults, but always check target sites' ToS.
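For Lesson 3, here's what a minimal robots.txt check can look like. This is a sketch, not a full RFC 9309 parser: it only honors `Disallow:` prefix rules in the `User-agent: *` group, and `isAllowed` is a name I made up for illustration:

```javascript
// Minimal robots.txt check: returns false when `path` matches a
// Disallow prefix in the `User-agent: *` group. Ignores Allow rules,
// wildcards, and per-bot groups. A sketch, not a spec-complete parser.
function isAllowed(robotsTxt, path) {
  let inStarGroup = false;
  const disallowed = [];
  for (const rawLine of robotsTxt.split('\n')) {
    const [key, ...rest] = rawLine.trim().split(':');
    const value = rest.join(':').trim();
    if (/^user-agent$/i.test(key.trim())) {
      inStarGroup = value === '*';
    } else if (inStarGroup && /^disallow$/i.test(key.trim()) && value) {
      disallowed.push(value);
    }
  }
  return !disallowed.some((prefix) => path.startsWith(prefix));
}
```

Fetch `https://target.com/robots.txt` once, cache it, and gate your scrape calls on `isAllowed` before firing.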

What's Next

I'm adding:

  • Screenshot capture (headless browser support)
  • JavaScript rendering (for SPA content)
  • Automatic selector optimization

Already using the web scraper API? Drop a comment—I'd love to hear what you're building.

Got alternative solutions you prefer? Let's discuss. This is a tool that solves real problems for people who can't afford $500/month scrapers.


Try it free: https://rapidapi.com/deganiagency/api/web-scraper-extractor
Source: https://degani-web-scraper.deganiagency.workers.dev
Perfect for: Price monitoring, lead gen, SEO audits, competitive analysis
