Table of Contents
- How to Build a Google Maps Lead Scraper in Python (Step-by-Step)
- Why Google Maps Data Matters for Lead Generation
- The Challenge: Scraping Google Maps at Scale
- Building the Scraper: Architecture Overview
- Step 1: Set Up Your Python Environment
- Step 2: Initialize the Browser and Navigate to Google Maps
- Step 3: Extract Business Listings
- Step 4: Click Into Each Listing and Extract Details
- Step 5: Complete the Scraper with Data Validation and Export
- Advanced Tips: Production-Ready Improvements
- Real-World Use Cases
- Save Time with the Local Leads Pack
How to Build a Google Maps Lead Scraper in Python (Step-by-Step)
Lead generation is the lifeblood of any sales-driven business. Whether you're running a local service business, real estate firm, or B2B agency, finding qualified leads quickly and efficiently can make or break your revenue. Google Maps contains millions of business listings with contact information, reviews, hours, and location data, but manually collecting this data is tedious and error-prone.
In this guide, I'll show you how to build a working Google Maps lead scraper in Python that automatically extracts business names, phone numbers, websites, addresses, hours, and review ratings. Google Maps listings don't expose email addresses, so those need a separate lookup against each business's website. This is the approach I use in the Local Leads Pack, which has helped hundreds of agencies and entrepreneurs compile targeted prospect lists in hours instead of days.
Why Google Maps Data Matters for Lead Generation
Google Maps is one of the most valuable data sources for B2B and local business prospecting. Here's why:
- Real-time business data: Business owners keep Google Maps listings current because customers see them first.
- Contact information: Phone numbers and websites are directly available, eliminating the need for secondary lookups.
- Review signals: Rating and review counts tell you about business health and customer satisfaction.
- Operational data: Hours, photos, and services reveal exactly what businesses do and how they operate.
- Competitor insights: See who's ranking for specific keywords and how they're positioned.
Instead of cold outreach to random businesses, Google Maps data lets you target high-intent prospects: recently active businesses, highly rated competitors, newly opened locations, and businesses in your geographic focus area.
The Challenge: Scraping Google Maps at Scale
Before diving into the code, let's be clear about the technical challenges:
- JavaScript rendering: Google Maps loads results dynamically via JavaScript, so simple HTTP requests won't work.
- Rate limiting: Aggressive scraping gets blocked quickly. You need delays and rotating proxies.
- Data structure complexity: Business data is scattered across multiple DOM elements and requires careful parsing.
- Geolocation sensitivity: Results vary by location, requiring proper lat/long coordinates and viewport handling.
- Legal considerations: Always respect Google's ToS and use data ethically for legitimate business purposes.
The solution is to use a headless browser (Selenium or Playwright) combined with proper request handling, data validation, and rate limiting.
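To make the geolocation point concrete, here is a minimal sketch of pinning a search to a map viewport. It assumes the commonly seen `/@lat,lng,zoomz` URL suffix; that shape is not a documented API and may change. `build_search_url` is a hypothetical helper, not part of any library.

```python
from urllib.parse import quote


def build_search_url(query: str, lat: float, lng: float, zoom: int = 14) -> str:
    """Build a Maps search URL centred on a viewport (assumed URL shape)."""
    if not -90 <= lat <= 90 or not -180 <= lng <= 180:
        raise ValueError("latitude/longitude out of range")
    # quote() percent-encodes spaces and punctuation in the query text
    return (
        "https://www.google.com/maps/search/"
        f"{quote(query)}/@{lat:.4f},{lng:.4f},{zoom}z"
    )


print(build_search_url("plumbers near me", 40.7128, -74.0060))
```

Running the same query against several viewport centres is one way to cover a metro area, since each search only returns a limited slice of results.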
Building the Scraper: Architecture Overview
Here's the architecture we'll build:
- Search module: Handles Google Maps search initialization and result pagination.
- Parser module: Extracts business data from DOM elements and text.
- Validator module: Cleans and validates extracted data (phone, email, hours).
- Storage module: Saves results to CSV, JSON, or database.
- Rate limiter: Implements backoff strategies to avoid blocking.
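As a rough sketch of how these modules fit together, the pipeline below wires a validator, a storage callback, and a pause hook around a stream of raw records. The `Lead` dataclass and `run_pipeline` are illustrative names I'm assuming here; the scraper built in the steps below uses plain dicts instead.

```python
from dataclasses import dataclass, asdict


@dataclass
class Lead:
    name: str
    phone: str = ""
    website: str = ""
    rating: str = ""


def run_pipeline(raw_records, validate, store, pause):
    """Validate each raw record, store the survivors, pause between items."""
    saved = 0
    for raw in raw_records:
        lead = validate(Lead(**raw))
        if lead is not None:
            store(lead)
            saved += 1
        pause()  # rate-limiter hook, e.g. a time.sleep wrapper
    return saved


def keep_if_named(lead):
    """Example validator: drop records without a usable name."""
    return lead if lead.name.strip() else None


sink = []
count = run_pipeline(
    [{"name": "Acme Plumbing", "phone": "2125550100"}, {"name": " "}],
    validate=keep_if_named,
    store=sink.append,
    pause=lambda: None,
)
print(count, [asdict(lead) for lead in sink])
```

Keeping the stages as plain callables makes it easy to swap CSV storage for a database, or a fixed delay for backoff, without touching the loop.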
Step 1: Set Up Your Python Environment
First, install the required dependencies:
pip install selenium webdriver-manager pandas requests lxml beautifulsoup4
We're using Selenium for browser automation because Google Maps requires JavaScript execution. WebDriver Manager automatically downloads the correct ChromeDriver version.
Step 2: Initialize the Browser and Navigate to Google Maps
import time
from urllib.parse import quote_plus

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager


class GoogleMapsLeadScraper:
    def __init__(self, search_query, location, max_results=100):
        self.search_query = search_query
        self.location = location
        self.max_results = max_results
        self.results = []
        self.driver = None

    def initialize_driver(self):
        """Set up Chrome WebDriver with options for scraping."""
        chrome_options = webdriver.ChromeOptions()
        chrome_options.add_argument('--disable-blink-features=AutomationControlled')
        chrome_options.add_argument('user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36')
        chrome_options.add_argument('--disable-gpu')
        chrome_options.add_argument('--no-sandbox')
        service = Service(ChromeDriverManager().install())
        self.driver = webdriver.Chrome(service=service, options=chrome_options)

    def navigate_to_search(self):
        """Navigate to Google Maps and perform search."""
        # URL-encode the query so spaces and commas ("Los Angeles, CA") survive
        query = quote_plus(f"{self.search_query} {self.location}")
        search_url = f"https://www.google.com/maps/search/{query}"
        self.driver.get(search_url)
        time.sleep(3)  # Crude wait for the initial page load


# Example usage
scraper = GoogleMapsLeadScraper(
    search_query="plumbers",
    location="New York",
    max_results=100
)
scraper.initialize_driver()
scraper.navigate_to_search()
Step 3: Extract Business Listings
Once the search results load, we need to extract each business listing. Google Maps renders results in a scrollable sidebar; each result is clickable and contains the business name and snippet information.
# Add this method to the GoogleMapsLeadScraper class
    def extract_listings(self):
        """Extract business names and their result elements from the sidebar."""
        wait = WebDriverWait(self.driver, 10)
        # Wait until the scrollable results feed is present
        feed = wait.until(
            EC.presence_of_element_located((By.XPATH, '//div[@role="feed"]'))
        )
        listings = []
        seen_names = set()  # O(1) duplicate check
        previous_height = 0
        while len(listings) < self.max_results:
            try:
                # Result cards; these class names are obfuscated and change often
                result_divs = self.driver.find_elements(
                    By.XPATH,
                    '//div[@data-index]//div[@class="Nv2PK THOPZb"]'
                )
                for div in result_divs:
                    if len(listings) >= self.max_results:
                        break
                    try:
                        name_element = div.find_element(By.XPATH, './/div[@role="button"]//div')
                        business_name = name_element.text
                    except Exception:
                        # Card without a name, or one that went stale mid-scroll
                        continue
                    if business_name and business_name not in seen_names:
                        seen_names.add(business_name)
                        listings.append({'name': business_name, 'element': div})
                # Scroll the feed to load more results
                self.driver.execute_script('arguments[0].scrollTop = arguments[0].scrollHeight', feed)
                time.sleep(2)
                # Stop once the feed height no longer grows
                new_height = self.driver.execute_script('return arguments[0].scrollHeight', feed)
                if new_height == previous_height:
                    break
                previous_height = new_height
            except Exception as e:
                print(f"Error extracting listings: {e}")
                break
        self.results = listings[:self.max_results]
        return self.results
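The scroll-and-collect loop is easier to reason about (and test) when it's separated from Selenium. The sketch below is one way to do that, assuming two hypothetical callables: `read_visible` returns the names currently rendered and `scroll_once` triggers the next batch. It also tolerates a couple of empty rounds before giving up, since a slow network can make a single unchanged read misleading.

```python
def collect_until_stable(read_visible, scroll_once, max_results, max_idle_rounds=2):
    """Collect unique names until the cap is hit or nothing new appears."""
    seen = set()
    ordered = []
    idle = 0
    while len(ordered) < max_results and idle < max_idle_rounds:
        before = len(ordered)
        for name in read_visible():
            if name and name not in seen and len(ordered) < max_results:
                seen.add(name)
                ordered.append(name)
        # Count consecutive rounds that produced nothing new
        idle = idle + 1 if len(ordered) == before else 0
        scroll_once()
    return ordered


# Fake feed standing in for the browser: each scroll reveals the next page
pages = [["A", "B"], ["A", "B", "C"], ["A", "B", "C"], ["A", "B", "C"]]
state = {"i": 0}


def read_visible():
    return pages[min(state["i"], len(pages) - 1)]


def scroll_once():
    state["i"] += 1


print(collect_until_stable(read_visible, scroll_once, max_results=10))
```

In the real scraper, `read_visible` would wrap the `find_elements` call and `scroll_once` would wrap the `execute_script` scroll plus a short wait.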
Step 4: Click Into Each Listing and Extract Details
Now for each business, we click on it to open the detail panel and extract phone number, website, address, reviews, and other metadata.
# Add this method to the GoogleMapsLeadScraper class
    def extract_business_details(self, business_element):
        """Extract detailed information from a business listing."""
        try:
            # Click the business to open detail panel
            business_element.click()
            time.sleep(2)  # Wait for detail panel to load
            details = {'name': ''}
            # Extract business name (match the class as a substring, not exactly)
            try:
                name = self.driver.find_element(By.XPATH, '//h1[contains(@class, "fontHeadingLarge")]').text
                details['name'] = name
            except Exception:
                pass
            # Extract rating and review count.
            # XPath has no *= operator (that is CSS syntax); use contains().
            try:
                rating_element = self.driver.find_element(By.XPATH, '//div[contains(@aria-label, "star")]')
                rating_text = rating_element.get_attribute('aria-label')
                details['rating'] = rating_text
            except Exception:
                details['rating'] = ''
            # Extract phone number
            try:
                phone_element = self.driver.find_element(
                    By.XPATH,
                    '//button//span[contains(text(), "+") or contains(text(), "(")]'
                )
                details['phone'] = phone_element.text
            except Exception:
                details['phone'] = ''
            # Extract website (contains() is case-sensitive)
            try:
                website_element = self.driver.find_element(By.XPATH, '//a[@data-url][contains(@aria-label, "website")]')
                details['website'] = website_element.get_attribute('href')
            except Exception:
                details['website'] = ''
            # Extract address
            try:
                address_element = self.driver.find_element(
                    By.XPATH,
                    '//button//div[contains(@class, "fontBodyMedium")]'
                )
                details['address'] = address_element.text
            except Exception:
                details['address'] = ''
            # Extract hours
            try:
                hours_element = self.driver.find_element(
                    By.XPATH,
                    '//div[contains(text(), "Open") or contains(text(), "Closed")]'
                )
                details['hours'] = hours_element.text
            except Exception:
                details['hours'] = ''
            return details
        except Exception as e:
            print(f"Error extracting business details: {e}")
            return {}
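The rating comes back as raw `aria-label` text, which is awkward to sort or filter on. Here's a small sketch of turning it into numbers. I'm assuming the label looks roughly like "4.5 stars 1,234 Reviews"; the exact wording varies by locale and page version, so `parse_rating` (a hypothetical helper) returns `None` for anything it can't find.

```python
import re


def parse_rating(label):
    """Return (rating, review_count) from an aria-label style string."""
    if not label:
        return None, None
    # Accept "4.5" or "4,5" followed by "star"/"stars"
    rating_match = re.search(r"(\d+(?:[.,]\d+)?)\s*stars?", label, re.IGNORECASE)
    # Accept "1,234" or "87" followed by "review"/"reviews"
    reviews_match = re.search(r"(\d[\d,]*)\s*reviews?", label, re.IGNORECASE)
    rating = float(rating_match.group(1).replace(",", ".")) if rating_match else None
    reviews = int(reviews_match.group(1).replace(",", "")) if reviews_match else None
    return rating, reviews


print(parse_rating("4.5 stars 1,234 Reviews"))
print(parse_rating("Open now"))
```

With numeric values you can filter prospects directly, for example keeping only businesses rated below 4.0 with more than 50 reviews.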
Step 5: Complete the Scraper with Data Validation and Export
import json
import re


def validate_phone(phone):
    """Clean and validate phone number."""
    if not phone:
        return ''
    # Remove common formatting, keep only digits
    cleaned = re.sub(r'\D', '', phone)
    return cleaned if len(cleaned) >= 10 else ''


def validate_url(url):
    """Return the URL with a scheme, or '' if there is none."""
    if not url:
        return ''
    return url if url.startswith('http') else f'https://{url}'


# Add the methods below to the GoogleMapsLeadScraper class
    def scrape_all_businesses(self):
        """Main scraping loop - extract all listings and details."""
        self.initialize_driver()
        business_data = []
        try:
            self.navigate_to_search()
            self.extract_listings()
            for idx, listing in enumerate(self.results):
                print(f"Processing {idx + 1}/{len(self.results)}...")
                try:
                    details = self.extract_business_details(listing['element'])
                    if not details:
                        continue  # Extraction failed; skip the empty record
                    # Validate data
                    details['phone'] = validate_phone(details.get('phone', ''))
                    details['website'] = validate_url(details.get('website', ''))
                    business_data.append(details)
                    # Rate limiting - add delay between requests
                    time.sleep(1.5)
                except Exception as e:
                    print(f"Error processing business {idx}: {e}")
                    continue
        finally:
            # Always close the browser, even if scraping fails midway
            self.driver.quit()
        return business_data

    def export_to_csv(self, data, filename='google_maps_leads.csv'):
        """Export collected data to CSV."""
        df = pd.DataFrame(data)
        df.to_csv(filename, index=False, encoding='utf-8')
        print(f"Exported {len(df)} leads to {filename}")
        return filename

    def export_to_json(self, data, filename='google_maps_leads.json'):
        """Export collected data to JSON."""
        with open(filename, 'w', encoding='utf-8') as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
        print(f"Exported {len(data)} leads to {filename}")
        return filename


# Full execution
if __name__ == "__main__":
    scraper = GoogleMapsLeadScraper(
        search_query="electrical contractors",
        location="Los Angeles, CA",
        max_results=50
    )
    leads = scraper.scrape_all_businesses()
    # Export in both formats
    scraper.export_to_csv(leads)
    scraper.export_to_json(leads)
    print(f"\nScraped {len(leads)} leads successfully!")
    for lead in leads[:5]:
        # .get() avoids a KeyError when a field was not found
        print(f"\n{lead.get('name', '')}")
        print(f" Phone: {lead.get('phone', '')}")
        print(f" Website: {lead.get('website', '')}")
        print(f" Rating: {lead.get('rating', '')}")
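The same business can show up more than once across overlapping searches, so it can help to dedupe before exporting. This is a minimal sketch, assuming leads are dicts shaped like the ones above; `dedupe_leads` is a hypothetical helper that keys on the digits of the phone number and falls back to the lowercased name when no phone was found.

```python
import re


def dedupe_leads(leads):
    """Keep the first lead per phone number, or per name when phone is empty."""
    seen = set()
    unique = []
    for lead in leads:
        phone = re.sub(r"\D", "", lead.get("phone", "") or "")
        name = (lead.get("name", "") or "").strip().lower()
        key = ("phone", phone) if phone else ("name", name)
        if key == ("name", ""):
            continue  # nothing to identify this record by
        if key not in seen:
            seen.add(key)
            unique.append(lead)
    return unique


sample = [
    {"name": "Acme", "phone": "(212) 555-0100"},
    {"name": "ACME Inc", "phone": "212-555-0100"},
    {"name": "Bolt", "phone": ""},
    {"name": "bolt ", "phone": ""},
    {},
]
print([lead["name"] for lead in dedupe_leads(sample)])
```

Keying on phone first is a judgment call: it merges listings that share a number under different names, which is usually what you want for outreach lists.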
Advanced Tips: Production-Ready Improvements
To run this scraper at scale without getting blocked, implement these best practices:
1. Implement Proxy Rotation
import random

# Add rotating proxies to avoid IP blocking (placeholder addresses)
proxies = [
    'http://proxy1.com:8080',
    'http://proxy2.com:8080',
    'http://proxy3.com:8080',
]
proxy = random.choice(proxies)
# chrome_options is the ChromeOptions object built in initialize_driver()
chrome_options.add_argument(f'--proxy-server={proxy}')
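Because `--proxy-server` is a launch flag, a proxy picked this way stays fixed for the whole browser session. Rotating therefore means starting a new driver with a different flag. The sketch below shows a round-robin alternative to `random.choice`; `ProxyPool` is a hypothetical helper and the addresses are placeholders.

```python
from itertools import cycle


class ProxyPool:
    """Hand out proxies in round-robin order, one per browser session."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = cycle(proxies)

    def next_argument(self):
        """Return the Chrome launch flag for the next proxy in the rotation."""
        return f"--proxy-server={next(self._cycle)}"


pool = ProxyPool(["http://proxy1.com:8080", "http://proxy2.com:8080"])
for _ in range(3):
    print(pool.next_argument())
```

Round-robin spreads sessions evenly across the pool, whereas random selection can hit the same proxy several times in a row.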
2. Add Exponential Backoff for Rate Limiting
import time
from functools import wraps


def retry_with_backoff(max_retries=3, backoff_factor=2):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_retries - 1:
                        raise  # Out of retries; surface the last error
                    wait_time = backoff_factor ** attempt
                    print(f"Retry in {wait_time}s...")
                    time.sleep(wait_time)
        return wrapper
    return decorator


@retry_with_backoff(max_retries=3)
def extract_business_details(self, element):
    # ... extraction logic ...
    pass
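To see what wait schedule the decorator produces, and how adding jitter changes it, here's a small sketch that computes the delays without sleeping. The `base` and `jitter` parameters and the `backoff_delays` function are my own additions for illustration; the decorator above uses `backoff_factor ** attempt` with no jitter.

```python
import random


def backoff_delays(max_retries=3, backoff_factor=2, base=1.0, jitter=0.5, rng=random.random):
    """Return the waits between attempts: base * factor**attempt plus jitter."""
    delays = []
    # No wait follows the final attempt, so there are max_retries - 1 delays
    for attempt in range(max_retries - 1):
        delay = base * backoff_factor ** attempt
        delays.append(delay + rng() * jitter)
    return delays


# With jitter disabled this matches the decorator's schedule
print(backoff_delays(max_retries=4, rng=lambda: 0.0))
# With jitter, each wait gets up to 0.5s of random extra time
print(backoff_delays(max_retries=4))
```

Jitter keeps several scraper processes from retrying at exactly the same moments after a shared block.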
3. Add Database Storage
import sqlite3


# Add this method to the GoogleMapsLeadScraper class
    def save_to_database(self, data, db_name='leads.db'):
        """Save leads to SQLite database."""
        conn = sqlite3.connect(db_name)
        c = conn.cursor()
        c.execute('''CREATE TABLE IF NOT EXISTS leads
                     (id INTEGER PRIMARY KEY, name TEXT, phone TEXT,
                      website TEXT, address TEXT, rating TEXT, hours TEXT)''')
        for lead in data:
            c.execute('''INSERT INTO leads (name, phone, website, address, rating, hours)
                         VALUES (?, ?, ?, ?, ?, ?)''',
                      (lead.get('name'), lead.get('phone'), lead.get('website'),
                       lead.get('address'), lead.get('rating'), lead.get('hours')))
        conn.commit()
        conn.close()
        print(f"Saved {len(data)} leads to {db_name}")
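One thing to watch with the table above: re-running the scraper inserts the same businesses again. A possible fix, sketched below, is a `UNIQUE` constraint combined with `INSERT OR IGNORE`. The reduced column set and the choice of `(name, phone)` as the uniqueness key are assumptions for the example, and `save_unique_leads` is a hypothetical helper.

```python
import sqlite3


def save_unique_leads(conn, leads):
    """Insert leads, skipping rows whose (name, phone) pair already exists."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS leads (
               id INTEGER PRIMARY KEY,
               name TEXT NOT NULL,
               phone TEXT NOT NULL DEFAULT '',
               website TEXT,
               UNIQUE (name, phone)
           )"""
    )
    rows = [
        (lead.get("name", ""), lead.get("phone", ""), lead.get("website", ""))
        for lead in leads
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO leads (name, phone, website) VALUES (?, ?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM leads").fetchone()[0]


conn = sqlite3.connect(":memory:")
batch = [
    {"name": "Acme Plumbing", "phone": "2125550100"},
    {"name": "Bolt Electric", "phone": "3105550199"},
]
print(save_unique_leads(conn, batch))
print(save_unique_leads(conn, batch))  # second run adds no new rows
```

Empty strings rather than NULLs are used for missing phones because SQLite treats NULLs as distinct in a `UNIQUE` constraint, which would let duplicates through.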
Real-World Use Cases
- Local service businesses: Plumbers, electricians, HVAC contractors scraping competitors in their service area.
- Real estate agents: Collecting contractor and service provider leads for referral partnerships.
- B2B sales teams: Building prospect lists for specific industries (restaurants, retail, healthcare).
- Market research: Analyzing competitor density, pricing, and reviews across geographies.
- Lead validation: Enriching existing CRM data with current phone numbers and websites from Google Maps.
Save Time with the Local Leads Pack
Building and maintaining a Google Maps scraper requires continuous updates as Google changes its DOM structure. Instead of managing this yourself, the Local Leads Pack ($29) provides:
- Pre-built, production-ready scraper (updated monthly)
- Support for 50+ business categories
- Automatic proxy rotation and rate limiting
- CSV export with dedupe and validation
- Video tutorials and API documentation
Get the Local Leads Pack
About the Author
The Next Gen Nexus covers AI agents, automation, and web data: practical guides for developers, analysts, and businesses working with data at scale.