Most "just use Puppeteer" advice falls apart the moment the target sits behind Cloudflare's JS challenge. You end up maintaining a headless fleet, rotating fingerprints, and babysitting timeouts, all for what should be a one-line fetch.
I wrapped a challenge-solving backend behind a tiny REST API: you POST a URL and get back rendered HTML, plain text, or just the fields you want via CSS selectors. The challenge gets solved server-side, so your code stays a single HTTP call.
curl --request POST \
  --url 'https://web-scraping-api-cloudflare-bypass.p.rapidapi.com/api/v1/extract' \
  --header 'x-rapidapi-key: YOUR_RAPIDAPI_KEY' \
  --header 'x-rapidapi-host: web-scraping-api-cloudflare-bypass.p.rapidapi.com' \
  --header 'content-type: application/json' \
  --data '{"url":"https://example.com","selectors":{"title":"h1","price":".price"}}'
The response gives you { "title": "...", "price": "..." }, one key per selector you sent, so there is no browser, no proxy rotation, and no DOM parsing on your side. There's also /api/v1/scrape if you just want the raw HTML or text of a page.
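If you'd rather call it from Python than curl, here is a minimal sketch of the same /api/v1/extract request using only the standard library. The endpoint, headers, and body mirror the curl call above; the function names (build_extract_request, extract) and the 60-second timeout are my own placeholders, and it assumes the response body is the JSON object shown above.

```python
import json
import urllib.request

API_HOST = "web-scraping-api-cloudflare-bypass.p.rapidapi.com"
EXTRACT_URL = f"https://{API_HOST}/api/v1/extract"


def build_extract_request(target_url, selectors, api_key):
    """Build the POST request with the same headers and body as the curl call.

    Hypothetical helper name; kept separate so the request can be inspected
    without touching the network.
    """
    body = json.dumps({"url": target_url, "selectors": selectors}).encode("utf-8")
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        method="POST",
        headers={
            "x-rapidapi-key": api_key,
            "x-rapidapi-host": API_HOST,
            "content-type": "application/json",
        },
    )


def extract(target_url, selectors, api_key, timeout=60):
    """Send the request and decode the JSON response.

    The 60 s timeout is an assumption, not a documented limit. Non-2xx
    responses raise urllib.error.HTTPError, which the caller should handle.
    """
    request = build_extract_request(target_url, selectors, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (needs a real RapidAPI key, so it is left commented out):
# fields = extract("https://example.com", {"title": "h1", "price": ".price"}, "YOUR_RAPIDAPI_KEY")
# print(fields)
```

Splitting request construction from sending keeps the HTTP details in one place and makes the payload easy to check before you spend quota on a real call.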
Free tier to try it on RapidAPI: https://rapidapi.com/danieligel/api/web-scraping-api-cloudflare-bypass
I built this because I was tired of running a headless cluster for occasional scrapes. Happy to answer questions about the challenge-solving part or selector edge cases.