
Valentina Skakun for HasData


Simple Google Maps Scraper Using Playwright

To scrape data from Google Maps, you’ll need a headless browser. We won’t go deep into why: Google blocks bots aggressively, pages are rendered dynamically, and so on. Instead, let’s get straight to it. In this guide, we’ll walk through building a Google Maps scraper in Python using the Playwright library.


Step 0. Full Code of Google Maps Scraper

If you're not really interested in how to build a scraper and just want the code – here it is, with a few short comments:

from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync
import time
import pandas as pd
import re

query = "restaurants in New York"
max_scrolls = 10
scroll_pause = 2

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()

    # Enable stealth mode
    stealth_sync(page)

    # Open Google Maps
    page.goto("https://www.google.com/maps")
    time.sleep(5)

    # Search input and enter query
    search = page.locator("#searchboxinput")
    search.fill(query)
    search.press("Enter")
    time.sleep(5)

    # Scroll the results feed
    scrollable = page.locator('div[role="feed"]')
    for _ in range(max_scrolls):
        scrollable.evaluate('(el) => el.scrollTop = el.scrollHeight')
        time.sleep(scroll_pause)

    # Get all result cards
    feed_container = page.locator('div.m6QErb.DxyBCb.kA9KIf.dS8AEf.XiKgde.ecceSd[role="feed"]')
    cards = feed_container.locator("div.Nv2PK.THOPZb.CpccDe")
    count = cards.count()

    data = []

    for i in range(count):
        card = cards.nth(i)

        name = ""
        rating = ""
        reviews = ""
        category = ""
        services = ""
        image_url = ""
        detail_url = ""

        # Name
        name_el = card.locator(".qBF1Pd")
        if name_el.count() > 0:
            name = name_el.nth(0).inner_text()

        # Rating
        rating_el = card.locator('span[aria-label*="stars"]')
        if rating_el.count() > 0:
            aria_label = rating_el.nth(0).get_attribute("aria-label")
            match = re.search(r"([\d.]+)", aria_label)
            if match:
                rating = match.group(1)

        # Reviews
        reviews_el = card.locator(".UY7F9")
        if reviews_el.count() > 0:
            text = reviews_el.nth(0).inner_text()
            match = re.search(r"([\d,]+)", text)
            if match:
                reviews = match.group(1).replace(",", "")

        # Category
        category_el = card.locator('div.W4Efsd > span')
        if category_el.count() > 0:
            category = category_el.nth(0).inner_text()

        # Services
        services_el = card.locator('div.ah5Ghc > span')
        if services_el.count() > 0:
            services = ", ".join([services_el.nth(j).inner_text() for j in range(services_el.count())])

        # Image URL
        image_el = card.locator('img[src*="googleusercontent"]')
        if image_el.count() > 0:
            image_url = image_el.nth(0).get_attribute("src")

        # Detail URL
        link_el = card.locator('a.hfpxzc')
        if link_el.count() > 0:
            detail_url = link_el.nth(0).get_attribute("href")

        data.append({
            "Name": name,
            "Rating": rating,
            "Reviews": reviews,
            "Category": category,
            "Services": services,
            "Image": image_url,
            "Detail URL": detail_url
        })

    # Save to CSV with pandas
    df = pd.DataFrame(data)
    df.to_csv("maps_data_playwright.csv", index=False)
    print(f"Saved {len(df)} records to maps_data_playwright.csv")

    browser.close()

Just make sure the selectors in the code are still valid. Google changes class names and other stuff pretty often.

Step 1. Setup Environment

To avoid library version conflicts between projects, we use a virtual environment. Open your terminal and run:

python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate

Next, install the required libraries (time and re are part of the Python standard library, so only the third-party packages need installing):

pip install playwright playwright-stealth pandas

As mentioned in the Playwright scraping guide, you also need to install browsers for it to work:

playwright install

Now you can import these libraries into your project:

from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync
import time
import pandas as pd
import re

Step 2. Launch Browser with Stealth Mode

Initialize the browser and open a new window:

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()

If possible, use Stealth mode, which helps avoid getting blocked:

    stealth_sync(page)
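The stealth_sync import comes from the separate playwright-stealth package installed in Step 1. If you want the script to keep working even when that package isn't available, a small import guard (our addition, not part of the original script) does the trick:

try:
    from playwright_stealth import stealth_sync
except ImportError:
    # Fallback: define a no-op so the rest of the script still runs without stealth
    def stealth_sync(page):
        pass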

Step 3. Navigate to Google Maps

Now navigate to Google Maps. You can either build a search URL with your query or type it into the search bar. We'll go with the second option.
Open the Google Maps homepage:

    page.goto("https://www.google.com/maps")

Wait for the page to load:

    time.sleep(5)
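A fixed five-second sleep works, but it's brittle on slow connections. As an alternative (our tweak rather than the original approach), you can wait for the search box itself to show up before moving on:

    # Wait until the search box is present instead of sleeping for a fixed time
    page.wait_for_selector("#searchboxinput", timeout=15000)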

Step 4. Perform Search

Let’s find the search input on the map:

    search = page.locator("#searchboxinput")

Enter a keyword:

    search.fill(query)

Hit Enter to go to the results page:

    search.press("Enter")
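In the full code above, another time.sleep(5) follows the Enter press so the results can render. If you prefer an explicit wait, you can instead wait for the results feed that the next step scrolls (a small sketch using the same selector as Step 5):

    # Wait for the results panel to appear after the search
    page.wait_for_selector('div[role="feed"]', timeout=15000)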

Step 5. Scroll Results Panel

By default, the page shows only five results. Since the data loads dynamically, we need to scroll to load more places from Google Maps.
Here’s the panel we’ll scroll:

    scrollable = page.locator('div[role="feed"]')

Now let’s add a scroll loop:

    for _ in range(max_scrolls):
        scrollable.evaluate('(el) => el.scrollTop = el.scrollHeight')
        time.sleep(scroll_pause)

In this case, we’ve set a fixed number of scrolls. If you want endless scrolling, we’ve explained how to do that in this article.
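As a rough sketch of that idea, you can keep scrolling until the number of result cards stops growing (the div.Nv2PK card class is the same one used in Step 6; the stop condition is our assumption, so tune it as needed):

    previous_count = 0
    while True:
        scrollable.evaluate('(el) => el.scrollTop = el.scrollHeight')
        time.sleep(scroll_pause)
        current_count = page.locator("div.Nv2PK").count()
        if current_count == previous_count:
            break  # nothing new loaded, stop scrolling
        previous_count = current_count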

Step 6. Extract Data from Result Cards

Scrape all the result cards from the page:

    feed_container = page.locator('div.m6QErb.DxyBCb.kA9KIf.dS8AEf.XiKgde.ecceSd[role="feed"]')
    cards = feed_container.locator("div.Nv2PK.THOPZb.CpccDe")
    count = cards.count()

Create an empty list for the results, then loop through each place to collect the info we need:

    data = []

    for i in range(count):
        card = cards.nth(i)

        name = ""
        rating = ""
        reviews = ""
        category = ""
        services = ""
        image_url = ""
        detail_url = ""

Get the place name:

        name_el = card.locator(".qBF1Pd")
        if name_el.count() > 0:
            name = name_el.nth(0).inner_text()

Then scrape its rating:

        rating_el = card.locator('span[aria-label*="stars"]')
        if rating_el.count() > 0:
            aria_label = rating_el.nth(0).get_attribute("aria-label")
            match = re.search(r"([\d.]+)", aria_label)
            if match:
                rating = match.group(1)

Also, extract the number of reviews:

        reviews_el = card.locator(".UY7F9")
        if reviews_el.count() > 0:
            text = reviews_el.nth(0).inner_text()
            match = re.search(r"([\d,]+)", text)
            if match:
                reviews = match.group(1).replace(",", "")

Do the same for categories, images, and any other useful details:

        # Category
        category_el = card.locator('div.W4Efsd > span')
        if category_el.count() > 0:
            category = category_el.nth(0).inner_text()

        # Services
        services_el = card.locator('div.ah5Ghc > span')
        if services_el.count() > 0:
            services = ", ".join([services_el.nth(j).inner_text() for j in range(services_el.count())])

        # Image URL
        image_el = card.locator('img[src*="googleusercontent"]')
        if image_el.count() > 0:
            image_url = image_el.nth(0).get_attribute("src")

        # Detail URL
        link_el = card.locator('a.hfpxzc')
        if link_el.count() > 0:
            detail_url = link_el.nth(0).get_attribute("href")
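Each of these fields follows the same pattern: check count(), then read the first match. If that feels repetitive, you can wrap it in a small helper; safe_text below is a hypothetical function of ours (define it once near the top of the file), not part of the original script:

def safe_text(locator):
    # Return the first match's inner text, or "" if nothing matched (hypothetical helper)
    return locator.nth(0).inner_text() if locator.count() > 0 else ""

With it, the category lookup, for example, collapses to category = safe_text(card.locator('div.W4Efsd > span')).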

Append everything to the data list so it's easy to work with later:

        data.append({
            "Name": name,
            "Rating": rating,
            "Reviews": reviews,
            "Category": category,
            "Services": services,
            "Image": image_url,
            "Detail URL": detail_url
        })

Step 7. Save Data to CSV and JSON

Using the list we just filled, let's save all the data to CSV:

    df = pd.DataFrame(data)
    df.to_csv("maps_data_playwright.csv", index=False)
    print(f"Saved {len(df)} records to maps_data_playwright.csv")
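To also get JSON, as the step title promises, pandas can write it from the same DataFrame (the file name and orient value here are just one reasonable choice):

    df.to_json("maps_data_playwright.json", orient="records", force_ascii=False)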

Step 8. Close Browser

Don’t forget to close the browser when you're done:

    browser.close()
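Since everything runs inside with sync_playwright() as p:, Playwright also cleans up when the block exits. If you want the browser closed even when a selector throws mid-run, a try/finally sketch (our addition) looks like this:

    try:
        ...  # search, scroll, and extraction steps go here
    finally:
        context.close()
        browser.close()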

Additional Resources

How to Scrape Google Maps Data Using Python
GitHub repo with examples in Python and NodeJS
Join our Discord
