A scraping job that looks cheap in development can get expensive in production. The usual surprise is not the price of one request. It is the number of billable attempts required to finish one logical job, especially when Cloudflare or another bot protection layer forces you into browser rendering.
Per-request billing is easy to reason about
Per-request pricing charges for each URL fetch. If you scrape one page and get one response, the model is clean.
For example, assume a provider charges $0.065 per rendered request:
function perRequestCost({ pages, pricePerRequest }) {
  return pages * pricePerRequest;
}
console.log(perRequestCost({
  pages: 10_000,
  pricePerRequest: 0.065
}));
// 650
That is predictable. It also scales linearly with page count.
This works well for jobs like:
- Fetch one article URL
- Render one search result page
- Take one screenshot
- Extract one public profile
It works less well when one job contains many pages from the same site. If you need 50 pages behind the same Cloudflare session, billing each page independently can waste the session state you already paid to establish.
Browser sessions change the unit of work
A browser session has a startup cost. It needs to launch, get through the protection layer, store cookies, run JavaScript, and keep state. Once that session exists, the next page in the same workflow may be much cheaper than starting over.
Session-duration billing charges for time instead of page count. For example, imagine a browser provider bills one credit per two-minute interval, rounded up.
function sessionCost({ sessions, minutesPerSession, creditsPerInterval, intervalMinutes }) {
  const intervalsPerSession = Math.ceil(minutesPerSession / intervalMinutes);
  return sessions * intervalsPerSession * creditsPerInterval;
}
const cost = sessionCost({
  sessions: 200,
  minutesPerSession: 10,
  creditsPerInterval: 1,
  intervalMinutes: 2
});
console.log(cost);
// 1000 credits
If each 10-minute session scrapes 50 pages, those 200 sessions cover 10,000 pages for 1,000 credits, or 0.1 credits per page. The important metric becomes pages per session, not just price per request.
That model rewards batching. It punishes slow navigation, unnecessary waits, and workflows that open a new browser for every URL.
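To make the batching effect concrete, here is a minimal sketch that reuses the hypothetical two-minute-interval pricing from the example above and reports credits per page. The helper name and its parameters are illustrative, not any provider's API, and the one-minute figure for a single-page session is an assumption.

```javascript
// Hypothetical helper: credits spent per page under interval-based session billing.
function creditsPerPage({ minutesPerSession, pagesPerSession, creditsPerInterval, intervalMinutes }) {
  if (pagesPerSession <= 0) return Infinity;
  // Sessions are billed in whole intervals, rounded up.
  const intervals = Math.ceil(minutesPerSession / intervalMinutes);
  return (intervals * creditsPerInterval) / pagesPerSession;
}

// One 10-minute session that scrapes 50 pages.
const batched = creditsPerPage({
  minutesPerSession: 10,
  pagesPerSession: 50,
  creditsPerInterval: 1,
  intervalMinutes: 2
});

// A fresh browser per URL, assumed to take one minute each.
const onePerSession = creditsPerPage({
  minutesPerSession: 1,
  pagesPerSession: 1,
  creditsPerInterval: 1,
  intervalMinutes: 2
});

console.log(batched);       // 0.1
console.log(onePerSession); // 1
```

Under these assumptions, opening a new browser for every URL costs ten times as much per page, because each short session is still rounded up to a full interval.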
Wire uses browser-session billing for browser jobs and exposes the work as async jobs, which fits long-running Cloudflare workflows where you care about job completion, retries, and session state rather than a single HTTP response.
Failed attempts are part of the cost
Pricing pages often show the happy path. Production scrapers spend money on failures too.
You need to know whether the provider charges for:
- 403 responses
- Cloudflare challenge pages
- Browser timeouts
- CAPTCHA failures
- Retries
- Empty responses
- Pages where extraction fails after rendering succeeds
A failed scrape is not always a failed HTTP request. This response is technically successful from an HTTP perspective:
HTTP/2 200
content-type: text/html

<html>
  <title>Just a moment...</title>
  <script src="/cdn-cgi/challenge-platform/..."></script>
</html>
If your billing model charges per response, you may pay for that. If your extraction pipeline treats it as success, you may also store bad data.
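One way to keep challenge pages out of your data is to classify the body before extraction. The sketch below is a heuristic that only looks for the two markers visible in the sample response above, the "Just a moment..." title and the `/cdn-cgi/challenge-platform/` script path. The function names and the three labels are assumptions for illustration, and real challenge pages may look different.

```javascript
// Heuristic only: the markers come from the sample response above.
// Treat a match as "probably blocked", not as proof.
const CHALLENGE_MARKERS = [
  /<title>\s*Just a moment\.\.\.\s*<\/title>/i,
  /\/cdn-cgi\/challenge-platform\//i
];

function looksLikeChallengePage(html) {
  return CHALLENGE_MARKERS.some((pattern) => pattern.test(html));
}

// Hypothetical labels: "http_error", "challenge", "ok".
function classifyResponse({ status, body }) {
  if (status >= 400) return "http_error";
  if (looksLikeChallengePage(body)) return "challenge";
  return "ok";
}

const sample = `<html>
<title>Just a moment...</title>
<script src="/cdn-cgi/challenge-platform/..."></script>
</html>`;

console.log(classifyResponse({ status: 200, body: sample }));
// challenge
console.log(classifyResponse({ status: 200, body: "<html><title>Article</title></html>" }));
// ok
```

Counting "challenge" results separately from HTTP errors lets you see how many billed 200 responses produced nothing usable.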
Track cost per successful extraction, not cost per request.
function costPerSuccessfulExtraction({ totalCost, successfulExtractions }) {
  if (successfulExtractions === 0) return Infinity;
  return totalCost / successfulExtractions;
}
console.log(costPerSuccessfulExtraction({
  totalCost: 650,
  successfulExtractions: 8200
}));
// 0.07926829268292683
That number is more useful than the advertised per-request price because it includes blocks, retries, and parsing failures.
What to measure before choosing a model
Run the same small benchmark against each provider you are considering. Use real target URLs, not simple test pages.
For each run, record:
- Number of attempted pages
- Number of pages with expected content
- Number of challenge or block pages
- Median and p95 latency
- Total billed units
- Cost per successful extraction
- Whether cookies and local storage persisted across pages
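If your benchmark harness does not already report percentiles, a nearest-rank calculation is enough for a small run. This is a sketch under that assumption. The latency values are made up for illustration, and the nearest-rank median can differ slightly from the average-of-middle-values median that other tools report.

```javascript
// Nearest-rank percentile: sort ascending, then take the value at
// position ceil(p / 100 * n), counting from 1.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Made-up per-page latencies in milliseconds.
const latenciesMs = [1200, 900, 15000, 2100, 1800, 950, 3000, 1100, 2500, 9400];

console.log(percentile(latenciesMs, 50)); // 1800
console.log(percentile(latenciesMs, 95)); // 15000
```

With only ten samples the p95 is simply the slowest page, which is one reason to benchmark with 100 pages or more.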
A simple CSV is enough:
provider,mode,pages,successes,blocks,p95_ms,billed_units,total_cost
provider_a,per_request,100,83,17,9400,100,6.50
provider_b,browser_session,100,96,4,18200,12,unknown
The p95 matters because browser rendering can be much slower than proxy-only scraping. If your job runs inside a queue with timeouts, a technically successful browser scrape may still fail your workflow.
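A short script can turn that CSV into the numbers that matter. The sketch below parses the two rows above with a naive comma split (no quoted fields), computes success rate and cost per successful extraction, and flags runs whose p95 exceeds a queue timeout. The `summarize` name and the 15-second timeout are assumptions, and an `unknown` cost is reported as `null` rather than guessed.

```javascript
const csv = `provider,mode,pages,successes,blocks,p95_ms,billed_units,total_cost
provider_a,per_request,100,83,17,9400,100,6.50
provider_b,browser_session,100,96,4,18200,12,unknown`;

function summarize(csvText, queueTimeoutMs) {
  const [headerLine, ...rows] = csvText.trim().split("\n");
  const headers = headerLine.split(",");
  return rows.map((line) => {
    const cells = line.split(",");
    const row = Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
    const successes = Number(row.successes);
    const totalCost = Number(row.total_cost); // "unknown" becomes NaN
    const costKnown = !Number.isNaN(totalCost) && successes > 0;
    return {
      provider: row.provider,
      successRate: successes / Number(row.pages),
      costPerSuccess: costKnown ? totalCost / successes : null,
      p95ExceedsTimeout: Number(row.p95_ms) > queueTimeoutMs
    };
  });
}

// 15000 ms is a hypothetical queue timeout.
const summary = summarize(csv, 15000);
console.log(summary);
```

For these rows, provider_a comes out at roughly 0.078 per successful extraction, while provider_b has the higher success rate but an unknown cost and a p95 that would exceed the assumed timeout.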
Choosing the cheaper option
Use per-request billing when each page stands alone and protection is light. It is simple, and simple is good when the workload fits.
Use browser-session billing when you need to keep state across many pages: authenticated dashboards, carts, search pagination, product catalogs, or sites that challenge new sessions aggressively.
Use proxy bandwidth or enterprise proxy pricing when your team already manages crawling infrastructure and mainly needs IP diversity. The tradeoff is that bandwidth pricing can be harder to predict because page size, compression, retries, and geography all affect the bill.
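To see why bandwidth bills are harder to predict, here is a rough estimator. Every input is a placeholder: the price per GB, page sizes, and retry factor are invented for illustration and should be replaced with measurements from your own benchmark. It also assumes decimal gigabytes, which may not match how a given provider meters traffic.

```javascript
// All inputs are hypothetical; substitute measured values.
function estimateBandwidthCost({ pages, avgPageBytes, attemptsPerPage, pricePerGb }) {
  const totalBytes = pages * avgPageBytes * attemptsPerPage;
  const gigabytes = totalBytes / 1e9; // decimal GB, an assumption
  return gigabytes * pricePerGb;
}

// Same page count, different page weight and retry rate.
const lean = estimateBandwidthCost({
  pages: 10_000,
  avgPageBytes: 500_000,
  attemptsPerPage: 1,
  pricePerGb: 8
});
const heavy = estimateBandwidthCost({
  pages: 10_000,
  avgPageBytes: 2_000_000,
  attemptsPerPage: 1.5,
  pricePerGb: 8
});

console.log(lean);  // 40
console.log(heavy); // 240
```

The page count is identical in both cases, yet the estimate changes sixfold once page size and retries move, which is the unpredictability the paragraph above describes.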
Before you commit, run a 100-page test and calculate cost per successful extraction. That one number will tell you more than the pricing table.