Disclosure: I run basedonb, the API I'm using in this tutorial. The technique works with any Google Maps scraping API — the trade-offs section is honest. If you'd rather build the scraper yourself, the "Why DIY scraping fails" section explains what you're signing up for.
Last week a friend asked me for 500 dentist leads in Manhattan for a cold email campaign. I sent him a CSV in 90 seconds. This post shows how to do that yourself in ~12 lines of code, and the failure modes you avoid by not writing the scraper yourself.
Why building your own Google Maps scraper is harder than it looks
Every "I'll just write a quick Puppeteer script" plan dies on the same five rocks:
- IP blocking. Google flags datacenter IPs after ~50–100 requests. You need a residential proxy pool (~$50–$200/mo) just to start.
- CAPTCHAs. Once flagged, every request becomes a reCAPTCHA challenge. Solving services exist but cost ~$2 per 1k solves.
- Layout drift. Google ships UI changes constantly. Your selectors break and you don't notice until your CSV is empty.
- Rate limits + soft bans. Even the official Places API caps you, and the unofficial Maps frontend will silently degrade results when it thinks you're a bot.
- Legal exposure. Scraping Maps directly is a grey area. Going through an API that has commercial terms shifts that risk off your laptop.
If your goal is "ship a list this week," skip the scraper and just call an API. Below is a full example using basedonb because that's what I built, but the same shape applies to Outscraper, Apify's Google Maps actor, etc.
Mental model: submit → poll → fetch
Most Google Maps scrapers are async — a 500-result query takes 30–120 seconds because Maps doesn't actually return 500 results to a single search; the scraper has to grid the area and dedupe. So the API is three endpoints:
- POST /scrapes — you submit a job. If results are already cached, you get 200 with the data inline. Otherwise you get 202 with a job id.
- GET /scrapes/{id} — you poll status until it's done or failed.
- GET /scrapes/{id}/results — you fetch the leads.
That's the whole API surface. Auth is a bearer token.
Tutorial: cURL
curl -X POST https://www.basedonb.com/api/v1/scrapes \
  -H "Authorization: Bearer bdb_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "dentists",
    "country": "US",
    "city": "New York",
    "target_leads": 500
  }'
Response when results are cached (200):
{
  "id": null,
  "status": "done",
  "leads_found": 500,
  "results": [
    {
      "place_id": "ChIJ...",
      "title": "Smile NYC Dental",
      "category": "Dental clinic",
      "address": "123 W 23rd St, New York, NY 10011",
      "phone": "+12125550101",
      "website": "https://example.com",
      "rating": 4.8,
      "reviews_count": 412,
      "latitude": 40.7445,
      "longitude": -74.0010,
      "business_status": "OPERATIONAL",
      "price_level": "$$"
    }
  ]
}
Response when a fresh scrape is needed (202):
{ "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "status": "submitted", "leads_found": 0 }
Then you poll:
curl https://www.basedonb.com/api/v1/scrapes/3fa85f64-5717-4562-b3fc-2c963f66afa6 \
-H "Authorization: Bearer bdb_live_YOUR_KEY"
When status becomes done, fetch results:
curl https://www.basedonb.com/api/v1/scrapes/3fa85f64-5717-4562-b3fc-2c963f66afa6/results \
-H "Authorization: Bearer bdb_live_YOUR_KEY"
That's it. Three endpoints.
Tutorial: Node.js (12 lines, no dependencies)
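// Node 18+ (global fetch); run as an ES module (.mjs or "type": "module") so top-level await works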
const KEY = process.env.BASEDONB_KEY;
const BASE = "https://www.basedonb.com/api/v1";
const headers = { "Authorization": `Bearer ${KEY}`, "Content-Type": "application/json" };
const submit = await fetch(`${BASE}/scrapes`, {
  method: "POST",
  headers,
  body: JSON.stringify({ query: "dentists", country: "US", city: "New York", target_leads: 500 }),
}).then(r => r.json());

let job = submit;
while (job.status !== "done" && job.status !== "failed") {
  await new Promise(r => setTimeout(r, 3000));
  job = await fetch(`${BASE}/scrapes/${submit.id}`, { headers }).then(r => r.json());
  console.log(`progress ${Math.round((job.progress ?? 0) * 100)}% — ${job.leads_found} leads`);
}

const { results } = await fetch(`${BASE}/scrapes/${submit.id}/results`, { headers }).then(r => r.json());
console.log(`done — ${results.length} leads`);
Real production code should add a poll budget (don't loop forever) and handle the 200-with-inline-results case for cache hits. It's a handful of extra lines.
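Here's a sketch of both, written as a drop-in replacement for everything from let job = submit; down in the script above. MAX_POLLS is an arbitrary budget I picked, not an API constant:
const MAX_POLLS = 60; // poll budget: ~3 minutes at one poll per 3s (arbitrary, tune to taste)
let job = submit;
let polls = 0;
while (job.status !== "done" && job.status !== "failed") {
  if (++polls > MAX_POLLS) throw new Error(`gave up after ${MAX_POLLS} polls`);
  await new Promise(r => setTimeout(r, 3000));
  job = await fetch(`${BASE}/scrapes/${submit.id}`, { headers }).then(r => r.json());
}
if (job.status === "failed") throw new Error("scrape failed");
// Cache hits come back as 200 with results inline and never enter the loop,
// so only call /results when a job actually ran.
const results = job.results
  ?? (await fetch(`${BASE}/scrapes/${submit.id}/results`, { headers }).then(r => r.json())).results;
console.log(`done — ${results.length} leads`);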
Tutorial: Python
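# Python 3; one dependency: pip install requests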
import os, time, requests
KEY = os.environ["BASEDONB_KEY"]
BASE = "https://www.basedonb.com/api/v1"
H = {"Authorization": f"Bearer {KEY}"}
submit = requests.post(f"{BASE}/scrapes", headers=H, json={
    "query": "dentists",
    "country": "US",
    "city": "New York",
    "target_leads": 500,
}).json()

if submit["status"] == "done":
    # Cache hit: 200 with results inline, nothing to poll.
    results = submit["results"]
else:
    job_id = submit["id"]
    while True:
        job = requests.get(f"{BASE}/scrapes/{job_id}", headers=H).json()
        if job["status"] in ("done", "failed"): break
        print(f"{round((job.get('progress') or 0) * 100)}% — {job['leads_found']} leads")
        time.sleep(3)
    if job["status"] == "failed":
        raise SystemExit(f"scrape failed: {job}")
    results = requests.get(f"{BASE}/scrapes/{job_id}/results", headers=H).json()["results"]
print(f"got {len(results)} leads")
Dump to CSV in a few more lines:
import csv

with open("dentists_nyc.csv", "w", newline="") as f:
    w = csv.DictWriter(f, fieldnames=["title", "phone", "website", "address", "rating", "reviews_count"])
    w.writeheader()
    for r in results:
        w.writerow({k: r.get(k) for k in w.fieldnames})
Error handling that actually matters
You'll see four error shapes in practice:
- 401 unauthorized — bad key or wrong header. Make sure you're sending Authorization: Bearer bdb_live_... (or the equivalent X-API-Key header).
- 400 bad_request — usually missing target_leads or an invalid state code. US state codes are GeoNames-dotted (US.TX, not TX).
- 402 insufficient_credits — your balance can't cover target_leads. The error body tells you the gap.
- 503 scraper_unavailable — backend is briefly offline. Retry with backoff. This is rare but you should plan for it.
Don't paper over 503 with infinite retries. Cap at 3 attempts with exponential backoff (3s, 9s, 27s).
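In code, that's a small wrapper around fetch. fetchWithRetry is my name for it, nothing official; swap it in wherever the Node tutorial calls fetch against the API:
// Capped exponential backoff for 503s. Anything else (including the 4xx errors
// above) is returned to the caller untouched, since retrying a 400 or 402 is pointless.
async function fetchWithRetry(url, options, retries = 3) {
  let delay = 3000; // grows 3s, 9s, 27s
  for (let i = 0; ; i++) {
    const res = await fetch(url, options);
    if (res.status !== 503 || i === retries) return res;
    await new Promise(r => setTimeout(r, delay));
    delay *= 3;
  }
}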
Pricing reality check
This is the part most "API tutorial" posts skip and you find out later.
- 1 credit = 1 lead. No subscription, credits don't expire.
- Starter is $10 per 1,000 leads ($0.01/lead). It scales down: $9, $7, $6 per 1k at $40 / $150 / $500 top-ups.
- New accounts get 50 free credits — enough to run the full tutorial above, just lower target_leads to 50.
For a 500-lead Manhattan dentists job: ~$5 at the lowest tier, less at volume. Compare that to the Google Places API at $17 per 1,000 Place Details calls (without enrichment), or Outscraper at around $3 per 1k for Maps Search, though their UI/API trade-off is different.
What the API does not return (be honest)
The fields you get are the ones in the response above: title, category, address, phone, website, rating, reviews_count, latitude, longitude, business_status, price_level. There is no email field — Maps doesn't expose business emails, and any "Maps scraper" that hands you emails is doing a separate enrichment step, usually crawling the website's contact page. You can do that yourself with one more API call to a tool like Hunter.io, or by parsing mailto: links from the lead's website.
Don't pay for a "leads API with email" that's secretly doing email scraping you didn't authorize. The right shape is: pull leads here, enrich emails separately, keep the steps observable.
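If you want the mailto: route, it really is a few lines. A naive sketch (findEmail is a hypothetical helper, and the hit rate on real small-business sites is modest):
// Naive DIY enrichment: fetch a lead's homepage and grab the first mailto: link.
// Expect misses: many sites hide email behind contact forms or obfuscate it.
async function findEmail(website) {
  if (!website) return null;
  try {
    const html = await fetch(website, { redirect: "follow" }).then(r => r.text());
    const m = html.match(/mailto:([^"'?\s>]+)/i);
    return m ? decodeURIComponent(m[1]) : null;
  } catch {
    return null; // dead site, TLS error, timeout: skip it
  }
}
// Usage: for (const lead of results) lead.email = await findEmail(lead.website);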
Wrapping up
Three endpoints, two minutes of code, ~$5 for 500 verified business leads. The cost/effort ratio of writing your own scraper does not pencil out unless you're doing this at >100k leads/month.
Full code in the snippets above is runnable as-is — drop your key in and go. If you find a better Maps API, I'd genuinely like to know in the comments.
— I'm building basedonb. Honest feedback welcome.