Yelp has 4.7 million business listings. All publicly visible. None exportable. After 100,000+ extraction tests across methods, here's what the data shows.
Why Python Fails on Yelp
Yelp runs two layers of protection that stop most Python scrapers before they see a single listing.
Layer 1 — Cloudflare TLS fingerprinting. Python's requests library produces a distinct TLS handshake — different cipher suites, different ALPN protocols — from any real browser. Cloudflare identifies it in the first packet and returns a 403 before you reach any HTML.
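To make the fingerprinting idea concrete, here is a minimal sketch that hashes the ordered cipher-suite list a Python TLS context has enabled. It is only an illustration: real fingerprints such as JA3 also cover the TLS version, extensions, elliptic curves, and point formats from the ClientHello, and the function name `cipher_fingerprint` is hypothetical. It uses the standard library's default context as a stand-in for whatever context requests ends up with, and assumes nothing about how Cloudflare actually scores a handshake.

```python
import hashlib
import ssl


def cipher_fingerprint(ctx: ssl.SSLContext) -> str:
    """Hash the ordered cipher-suite IDs enabled on a TLS context.

    Simplified illustration only; this is not a real JA3 hash.
    """
    # get_ciphers() returns one dict per suite; the low 16 bits of "id"
    # are the IANA cipher-suite code point.
    suite_ids = [str(c["id"] & 0xFFFF) for c in ctx.get_ciphers()]
    return hashlib.sha256("-".join(suite_ids).encode("ascii")).hexdigest()


ctx = ssl.create_default_context()
fingerprint = cipher_fingerprint(ctx)

print(len(ctx.get_ciphers()), "cipher suites enabled")
print("simplified fingerprint:", fingerprint)
```

Because the list and its order come from the TLS library rather than from anything the scraper sends at the HTTP level, changing the User-Agent header leaves this value untouched.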
Layer 2 — JavaScript rendering. Even if you bypass Cloudflare, Yelp renders listing cards via JavaScript 300–600ms after the initial HTML loads. requests fetches empty container divs. The business name, phone, and address are injected client-side.
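The effect of client-side rendering can be sketched with the standard library alone. The markup below is hypothetical (the `listing-card` class and both HTML snippets are assumptions, not Yelp's real page structure); it only shows why parsing the initial HTML finds the containers but no business data.

```python
from html.parser import HTMLParser

# Hypothetical initial HTML, before any client-side script has run.
INITIAL_HTML = """
<div id="search-results">
  <div class="listing-card"></div>
  <div class="listing-card"></div>
</div>
"""

# Hypothetical HTML after the browser has executed the page's JavaScript.
RENDERED_HTML = """
<div id="search-results">
  <div class="listing-card"><h3>Example HVAC Co</h3><p>(555) 010-0000</p></div>
</div>
"""


class CardTextCollector(HTMLParser):
    """Count listing-card divs and collect the text found inside them."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # div nesting depth inside the current card
        self.cards = 0
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1
        elif ("class", "listing-card") in attrs:
            self.cards += 1
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.texts.append(data.strip())


def card_texts(html):
    """Return (number of cards, list of text fragments inside cards)."""
    parser = CardTextCollector()
    parser.feed(html)
    parser.close()
    return parser.cards, parser.texts


print(card_texts(INITIAL_HTML))
print(card_texts(RENDERED_HTML))
```

The first call finds two cards and no text, which is the situation a plain HTTP client is in; the second shows what a real browser sees once the scripts have run.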
Block rate breakdown from 100k+ extraction tests:
| Method | Block Rate |
|---|---|
| Chrome extension (real browser) | ~4% |
| Playwright + residential proxies | ~28% |
| Apify actor | ~22% |
| Python requests / Scrapy | ~65% |
Why Chrome Extensions Win
A Chrome extension runs inside your real browser — your TLS fingerprint, your cookies, your browsing history. Cloudflare cannot distinguish it from you manually browsing Yelp. That's the entire reason block rate drops from 65% to 4%.
On a 500-record scrape: Python gets you ~175 records before blocking. A Chrome extension gets you ~480.
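The record counts above follow from a simple model: expected records = target × (1 − block rate). A short sketch using the block rates from the table (the function name `expected_records` is just for illustration):

```python
# Block rates taken from the table above, as fractions.
BLOCK_RATES = {
    "Chrome extension (real browser)": 0.04,
    "Playwright + residential proxies": 0.28,
    "Apify actor": 0.22,
    "Python requests / Scrapy": 0.65,
}


def expected_records(target: int, block_rate: float) -> int:
    """Estimate how many of `target` records survive a given block rate."""
    return round(target * (1 - block_rate))


for method, rate in BLOCK_RATES.items():
    print(f"{method}: ~{expected_records(500, rate)} of 500 records")
```

This treats every record as equally likely to be blocked, which is a simplification; in practice blocks tend to arrive in runs.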
What Data Is Actually Extractable
Business listings: name, phone number, address, website URL, star rating, review count, category, price tier.
Reviews: reviewer name, star rating, full review text, date, reaction counts, owner response.
Not extractable: reviewer emails (never shown on Yelp), filtered reviews (separate hidden section), anything behind login.
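One way to hold the business-listing fields is a small record type that exports straight to CSV. This is a sketch, not any tool's actual schema: the class name, field names, and the sample values are placeholders chosen to mirror the list above, and every field except the name is optional because a listing may omit it.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class BusinessListing:
    name: str
    phone: Optional[str] = None
    address: Optional[str] = None
    website_url: Optional[str] = None
    star_rating: Optional[float] = None
    review_count: Optional[int] = None
    category: Optional[str] = None
    price_tier: Optional[str] = None


def to_csv(listings) -> str:
    """Serialize listings to CSV text with one column per field."""
    buf = io.StringIO()
    columns = [f.name for f in fields(BusinessListing)]
    writer = csv.DictWriter(buf, fieldnames=columns)
    writer.writeheader()
    for listing in listings:
        writer.writerow(asdict(listing))  # None is written as an empty cell
    return buf.getvalue()


# Placeholder record for illustration; not real scraped data.
sample = BusinessListing(
    name="Example HVAC Co",
    phone="(555) 010-0000",
    star_rating=4.5,
    review_count=120,
    category="HVAC",
)
csv_text = to_csv([sample])
print(csv_text)
```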
When Playwright Makes Sense
Playwright is the right call when you need scheduled nightly runs at high volume, or a fully automated pipeline with custom output. Pair it with residential proxy rotation, which measured a ~28% block rate in the tests above. Budget $50–200/mo for proxies.
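A minimal sketch of proxy rotation with Playwright's sync API might look like the following. The proxy endpoints are hypothetical placeholders, the helper names are illustrative, and the CSS selector you wait for is an assumption you would need to replace after inspecting the page. Playwright is a third-party package, so it is imported inside the function and the rest of the module runs without it.

```python
from itertools import cycle

# Hypothetical residential proxy endpoints; substitute your provider's.
PROXY_SERVERS = [
    "http://proxy-1.example.com:8000",
    "http://proxy-2.example.com:8000",
    "http://proxy-3.example.com:8000",
]


def proxy_rotation(servers):
    """Yield Playwright-style proxy settings in round-robin order, forever."""
    return cycle([{"server": server} for server in servers])


def fetch_rendered_html(url, proxy, selector, timeout_ms=15000):
    """Load `url` through `proxy` and return the HTML once `selector` appears."""
    from playwright.sync_api import sync_playwright  # pip install playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(proxy=proxy)
        try:
            page = browser.new_page()
            page.goto(url)
            # Wait for client-side rendering to inject the listing cards.
            page.wait_for_selector(selector, timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()


rotation = proxy_rotation(PROXY_SERVERS)
first_four = [next(rotation)["server"] for _ in range(4)]
print(first_four)
```

Each call to `next(rotation)` supplies the `proxy` argument for one `fetch_rendered_html` call, so consecutive page loads leave from different addresses.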
For on-demand lead list building (a category + city search, 200–500 records), a Chrome extension is faster, cheaper, and has roughly one-seventh the block rate of Playwright (~4% versus ~28% in the tests above).
The Lead Generation Use Case
A single Yelp search for "HVAC contractors Houston TX" returns 240 listings. Category filters (HVAC, plumbing, legal, dental, restaurants) mean every record matches your ideal customer profile (ICP). Phone number accuracy on freshly scraped Yelp data: ~91%, versus ~61% on purchased vendor lists.
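A category + city search can be expressed as a URL. The sketch below assumes Yelp's public search page takes the query in `find_desc` and the location in `find_loc`; treat both parameter names, and the split of the example search into a description and a location, as assumptions to verify in your own browser's address bar.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def yelp_search_url(description: str, location: str) -> str:
    """Build a search URL from a category description and a location.

    The find_desc / find_loc parameter names are assumed, not confirmed here.
    """
    query = urlencode({"find_desc": description, "find_loc": location})
    return f"https://www.yelp.com/search?{query}"


url = yelp_search_url("HVAC contractors", "Houston, TX")
print(url)

# Round-trip check: decode the query string back into its parts.
parsed = parse_qs(urlsplit(url).query)
print(parsed)
```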
Full step-by-step workflow, comparison table, and review scraping guide: Yelp Scraper: Extract Business Listings in 2026
Published by Clura — AI web scraper for Chrome.