Intro
In this blog post, we'll go through the process of extracting data from Google Maps Place results using Python. You can look at the complete code in the online IDE (Replit).
To extract Google Maps Place Results, you need to pass the data parameter, which identifies a specific place. You can extract this parameter from local results. Have a look at the Using Google Maps Local Results API from SerpApi blog post, in which I described in detail how to extract all the needed data.
If you prefer video format, we have a dedicated video that shows how to do that: Web Scraping Google Maps Place Result with SerpApi and Python.
What will be scraped
Why use an API?
There are a couple of reasons to use an API, ours in particular:
- No need to create a parser from scratch and maintain it.
- Bypass blocks from Google: CAPTCHAs and IP blocks are solved for you.
- No need to pay for proxies and CAPTCHA solvers.
- No need to use browser automation.
SerpApi handles everything on the backend, with response times under ~2.5 seconds (~1.2 seconds with Ludicrous speed) per request. Because no browser automation is involved, requests complete much faster. Response times and status rates are shown on the SerpApi Status page.
Full Code
If you just need to extract all available data about a place, you can create an empty list and then append the extracted data to it:
from serpapi import GoogleSearch
import os, json

params = {
    # https://docs.python.org/3/library/os.html#os.getenv
    'api_key': os.getenv('API_KEY'),        # your SerpApi API key
    'engine': 'google_maps',                # SerpApi search engine
    'q': 'Starbucks',                       # query
    'll': '@40.7455096,-74.0083012,15.1z',  # GPS coordinates
    'type': 'search',                       # list of results for the query
    'hl': 'en',                             # language
    'start': 0,                             # pagination
}

search = GoogleSearch(params)   # where data extraction happens on the backend
results = search.get_dict()     # JSON -> Python dict

local_data_results = [
    {
        'data_id': str(result['data_id']),
        'latitude': str(result['gps_coordinates']['latitude']),
        'longitude': str(result['gps_coordinates']['longitude'])
    }
    for result in results['local_results']
]

place_results = []

for result in local_data_results:
    data = '!4m5!3m4!1s' + result['data_id'] + '!8m2!3d' + result['latitude'] + '!4d' + result['longitude']

    params = {
        # https://docs.python.org/3/library/os.html#os.getenv
        'api_key': os.getenv('API_KEY'),    # your SerpApi API key
        'engine': 'google_maps',            # SerpApi search engine
        'type': 'place',                    # results for a specific place
        'data': data                        # identifies the place
    }

    search = GoogleSearch(params)
    results = search.get_dict()

    place_results.append(results['place_results'])

print(json.dumps(place_results, indent=2, ensure_ascii=False))
Preparation
Install library:
pip install google-search-results
google-search-results is a SerpApi API package.
Code Explanation
Import libraries:
from serpapi import GoogleSearch
import os, json
Library | Purpose |
---|---|
GoogleSearch | to scrape and parse Google results using SerpApi web scraping library. |
os | to return environment variable (SerpApi API key) value. |
json | to convert extracted data to a JSON object. |
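Because os.getenv returns None when a variable is not set, it can help to fail early with a clear message instead of sending a request without a key. A minimal sketch, assuming the key lives in an environment variable named API_KEY as in the code above; require_env is a hypothetical helper name and is not part of the serpapi package:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


try:
    api_key = require_env("API_KEY")
except RuntimeError as error:
    # Reached when API_KEY has not been exported in the current shell.
    print(error)
```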
At the beginning of the code, you make a request to get local results. Place results are then extracted from them.
The parameters are defined for generating the URL. If you want to pass other parameters to the URL, you can do so using the params dictionary:
params = {
    # https://docs.python.org/3/library/os.html#os.getenv
    'api_key': os.getenv('API_KEY'),        # your SerpApi API key
    'engine': 'google_maps',                # SerpApi search engine
    'q': 'Starbucks',                       # query
    'll': '@40.7455096,-74.0083012,15.1z',  # GPS coordinates
    'type': 'search',                       # list of results for the query
    'hl': 'en',                             # language
    'start': 0,                             # pagination
}
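If several searches share most of their parameters, the dictionary can be built by a small function. This is only a sketch: build_search_params is a hypothetical helper, and any extra keyword arguments are simply merged into the dictionary, overriding the defaults.

```python
import os


def build_search_params(query: str, ll: str, start: int = 0, **extra) -> dict:
    """Build the parameter dictionary for a Google Maps search request."""
    params = {
        'api_key': os.getenv('API_KEY'),  # SerpApi API key from the environment
        'engine': 'google_maps',          # SerpApi search engine
        'q': query,                       # query
        'll': ll,                         # GPS coordinates
        'type': 'search',                 # list of results for the query
        'hl': 'en',                       # language
        'start': start,                   # pagination offset
    }
    params.update(extra)  # caller-supplied parameters override the defaults
    return params


params = build_search_params('Starbucks', '@40.7455096,-74.0083012,15.1z', start=20)
print(params['q'], params['start'])
```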
Then, we create a search object, through which the data is retrieved from the SerpApi backend. The returned JSON is converted into the results dictionary:
search = GoogleSearch(params) # where data extraction happens on the backend
results = search.get_dict() # JSON -> Python dict
At the moment, the first 20 local results are stored in the results dictionary. If you are interested in all local results with pagination, then check out the Using Google Maps Local Results API from SerpApi blog post.
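For orientation, pagination could be sketched roughly as follows. It assumes each page holds up to 20 results, so start advances in steps of 20, and that a missing or empty local_results list marks the last page. fetch_page is a hypothetical callable supplied by the caller; in real use it would wrap GoogleSearch(params).get_dict() with the given start value. Here a fake function stands in so the sketch runs without network access.

```python
from typing import Callable, Iterator


def iter_local_results(fetch_page: Callable[[int], dict],
                       page_size: int = 20,
                       max_pages: int = 5) -> Iterator[dict]:
    """Yield local results page by page until a page comes back empty."""
    for page in range(max_pages):
        results = fetch_page(page * page_size)  # start = 0, 20, 40, ...
        local_results = results.get('local_results', [])
        if not local_results:
            break
        yield from local_results


def fake_fetch(start: int) -> dict:
    """Stand-in for a real request: two pages of fake results, then nothing."""
    if start >= 40:
        return {}
    return {'local_results': [{'position': start + i + 1} for i in range(20)]}


all_results = list(iter_local_results(fake_fetch))
print(len(all_results))  # 40
```

The max_pages cap is a design choice to keep the number of requests bounded.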
The data_id, latitude and longitude values are extracted from each local result. They are needed later to form the data parameter:
local_data_results = [
    {
        'data_id': str(result['data_id']),
        'latitude': str(result['gps_coordinates']['latitude']),
        'longitude': str(result['gps_coordinates']['longitude'])
    }
    for result in results['local_results']
]
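The list comprehension above raises a KeyError if a response has no local_results or if an entry lacks one of the keys. A more defensive variant might look like the sketch below; extract_place_keys is a hypothetical helper name, and the sample dictionary only imitates the shape of a response, using values from the output shown later in this post.

```python
def extract_place_keys(results: dict) -> list:
    """Collect data_id, latitude and longitude from each local result as strings."""
    extracted = []
    for result in results.get('local_results', []):
        coordinates = result.get('gps_coordinates', {})
        data_id = result.get('data_id')
        latitude = coordinates.get('latitude')
        longitude = coordinates.get('longitude')
        if data_id is None or latitude is None or longitude is None:
            continue  # skip incomplete entries instead of raising KeyError
        extracted.append({
            'data_id': str(data_id),
            'latitude': str(latitude),
            'longitude': str(longitude),
        })
    return extracted


sample = {
    'local_results': [
        {
            'data_id': '0x89c259d9e1cfef61:0x31eb2f7e2ccd6f8',
            'gps_coordinates': {'latitude': 40.7512294, 'longitude': -74.0252988},
        },
        {'title': 'entry without data_id or coordinates'},
    ]
}
local_data_results = extract_place_keys(sample)
print(local_data_results)
```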
Declare the place_results list, to which the extracted data will be added:
place_results = []
Next, you need to access each place separately by iterating over the local_data_results list:
for result in local_data_results:
    # data extraction will be here
These parameters are defined for generating the URL for place results:
params = {
    # https://docs.python.org/3/library/os.html#os.getenv
    'api_key': os.getenv('API_KEY'),    # your SerpApi API key
    'engine': 'google_maps',            # SerpApi search engine
    'type': 'place',                    # results for a specific place
    'data': data                        # identifies the place
}
Parameters | Explanation |
---|---|
api_key | Parameter defines the SerpApi private key to use. |
engine | Set parameter to google_maps to use the Google Maps API engine. |
type | Parameter defines the type of search you want to make. place returns results for a specific place when the data parameter is set. |
data | Parameter must be set in the following format: !4m5!3m4!1s + data_id + !8m2!3d + latitude + !4d + longitude. |
Additionally, I want to talk about the value that is written to the data parameter, which is only required if type is set to place. In that case, it defines a search for a specific place. It must be built in the following sequence:
data = (
    "!4m5!3m4!1s"
    + result["data_id"]
    + "!8m2!3d"
    + result["latitude"]
    + "!4d"
    + result["longitude"]
)
This will form a string that looks like this:
!4m5!3m4!1s0x89c259b7abdd4769:0xc385876db174521a!8m2!3d40.750231!4d-74
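The same concatenation can be wrapped in a function. The sketch below uses the data_id and coordinates of the first place in the Output section of this post; build_place_data is a hypothetical helper name. URL-encoding the string with urllib.parse.quote from the standard library gives the value as it appears in the data query parameter of that place's place_id_search link.

```python
from urllib.parse import quote


def build_place_data(data_id: str, latitude, longitude) -> str:
    """Build the data parameter for a place search from a local result's fields."""
    return f'!4m5!3m4!1s{data_id}!8m2!3d{latitude}!4d{longitude}'


data = build_place_data('0x89c259d9e1cfef61:0x31eb2f7e2ccd6f8', 40.7512294, -74.0252988)
print(data)
# !4m5!3m4!1s0x89c259d9e1cfef61:0x31eb2f7e2ccd6f8!8m2!3d40.7512294!4d-74.0252988

encoded = quote(data, safe='')  # percent-encode '!' and ':' for use in a URL
print(encoded)
# %214m5%213m4%211s0x89c259d9e1cfef61%3A0x31eb2f7e2ccd6f8%218m2%213d40.7512294%214d-74.0252988
```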
The data parameter can also be used to filter the search results. You can visit the Google Maps website, set the filters you want, and copy the data value from its URL into the SerpApi URL.
Then, we create a search object, through which the data is retrieved from the SerpApi backend. The returned JSON is converted into the results dictionary:
search = GoogleSearch(params)
results = search.get_dict()
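A request for a single place can fail, in which case indexing results['place_results'] directly would raise a KeyError and stop the whole loop. One way to guard against that is sketched below. It assumes a failed response is a dictionary without place_results that may carry a message under an error key; get_place_results is a hypothetical helper name, and the sample dictionaries are made up for the demonstration.

```python
from typing import Optional


def get_place_results(results: dict) -> Optional[dict]:
    """Return the place_results dictionary, or None if the response has none."""
    place = results.get('place_results')
    if place is None:
        # Assumption: an 'error' key, if present, explains what went wrong.
        message = results.get('error', 'no place_results in response')
        print(f'Skipping place: {message}')
    return place


# Made-up responses that only imitate the shape of real ones.
found = get_place_results({'place_results': {'title': 'Starbucks'}})
missing = get_place_results({'error': 'made-up error message'})
```

In the loop, a place would then be appended only when the returned value is not None.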
Append the data for this place to the place_results list:
place_results.append(results['place_results'])
# title = results['place_results']['title']
# rating = results['place_results']['rating']
# reviews = results['place_results']['reviews']
Note: If you only want specific fields, the commented-out lines above show how to access them.
After all the data is retrieved, it is printed in JSON format:
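Picking specific fields can also be done with a small function that tolerates missing keys, since not every place necessarily has every field. This is a sketch: pick_fields is a hypothetical helper name, and the sample dictionary reuses values from the output shown in this post.

```python
def pick_fields(place: dict, fields=('title', 'rating', 'reviews')) -> dict:
    """Return only the requested fields; missing ones are set to None."""
    return {field: place.get(field) for field in fields}


sample_place = {'title': 'Starbucks', 'rating': 3.9, 'reviews': 514, 'price': '$$'}
print(pick_fields(sample_place))
# {'title': 'Starbucks', 'rating': 3.9, 'reviews': 514}
```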
print(json.dumps(place_results, indent=2, ensure_ascii=False))
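The ensure_ascii=False argument keeps non-ASCII characters readable in the output. The short example below shows the difference on a title taken from the output in this post:

```python
import json

place = {'title': 'bwè kafe'}

readable = json.dumps(place, ensure_ascii=False)
escaped = json.dumps(place)  # ensure_ascii defaults to True

print(readable)  # {"title": "bwè kafe"}
print(escaped)   # {"title": "bw\u00e8 kafe"}
```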
Output
[
{
"title": "Starbucks",
"place_id": "ChIJYe_P4dlZwokR-NbM4veyHgM",
"data_id": "0x89c259d9e1cfef61:0x31eb2f7e2ccd6f8",
"data_cid": "224813809146844920",
"reviews_link": "https://serpapi.com/search.json?data_id=0x89c259d9e1cfef61%3A0x31eb2f7e2ccd6f8&engine=google_maps_reviews&hl=en",
"photos_link": "https://serpapi.com/search.json?data_id=0x89c259d9e1cfef61%3A0x31eb2f7e2ccd6f8&engine=google_maps_photos&hl=en",
"gps_coordinates": {
"latitude": 40.7512294,
"longitude": -74.0252988
},
"place_id_search": "https://serpapi.com/search.json?data=%214m5%213m4%211s0x89c259d9e1cfef61%3A0x31eb2f7e2ccd6f8%218m2%213d40.7512294%214d-74.0252988&engine=google_maps&google_domain=google.com&hl=en&type=place",
"thumbnail": "https://lh5.googleusercontent.com/p/AF1QipOuxvgz55TI_eqJCrv_vrsZQs-YcRy2tIf_x97l=w152-h86-k-no",
"rating": 3.9,
"reviews": 514,
"price": "$$",
"type": [
"Coffee shop",
"Breakfast restaurant",
"Cafe",
"Coffee store",
"Espresso bar",
"Internet cafe"
],
"description": "Seattle-based coffeehouse chain known for its signature roasts, light bites and WiFi availability.",
"menu": {
"link": "http://www.starbucks.com/menu",
"source": "starbucks.com"
},
"service_options": {
"dine_in": true,
"takeout": true,
"delivery": true
},
"extensions": [
{
"highlights": [
"Great coffee",
"Great tea selection"
]
},
... other extensions
],
"address": "The Shipyards at 12th & Hudson, 1205 Hudson St, Hoboken, NJ 07030",
"website": "https://www.starbucks.com/store-locator/store/16852/",
"phone": "(201) 792-5400",
"open_state": "Closed ⋅ Opens 6:30AM Wed",
"plus_code": "QX2F+FV Hoboken, New Jersey",
"hours": [
{
"tuesday": "6:30AM–7PM"
},
... other days and hours
],
"images": [
{
"title": "All",
"thumbnail": "https://lh5.googleusercontent.com/p/AF1QipOuxvgz55TI_eqJCrv_vrsZQs-YcRy2tIf_x97l=w529-h298-k-no"
},
... other images
],
"user_reviews": {
"summary": [
{
"snippet": "\"Great service....great atmosphere...great coffee!\""
},
{
"snippet": "\"Waited an hour for a croissant that I mobile ordered.\""
},
{
"snippet": "\"Although they seem understaffed at times the staff is great and work hard.\""
}
],
"most_relevant": [
{
"username": "Miri Castro",
"rating": 5,
"description": "Xavier is a great manager and is very hospitable. I come to this location to work and it remains my favorite place to get things done. The service is great and the seating is plenty. I would recommend this location to anyone.",
"images": [
{
"thumbnail": "https://lh5.googleusercontent.com/p/AF1QipNAN4IVks0fHHv5LBWiP5pLUjoxLUGv-wlOr_u4=w150-h150-k-no-p"
}
],
"date": "2 weeks ago"
},
... other user reviews
]
},
"people_also_search_for": [
{
"search_term": "Coffee shops",
"local_results": [
{
"position": 1,
"title": "bwè kafe",
"data_id": "0x0:0x3e5c0ada2bb26735",
"data_cid": "4493478460361172789",
"reviews_link": "https://serpapi.com/search.json?data_id=0x0%3A0x3e5c0ada2bb26735&engine=google_maps_reviews&hl=en",
"photos_link": "https://serpapi.com/search.json?data_id=0x0%3A0x3e5c0ada2bb26735&engine=google_maps_photos&hl=en",
"gps_coordinates": {
"latitude": 40.7487777,
"longitude": -74.0277306
},
"place_id_search": "https://serpapi.com/search.json?data=%214m5%213m4%211s0x0%3A0x3e5c0ada2bb26735%218m2%213d40.7487777%214d-74.0277306&engine=google_maps&google_domain=google.com&hl=en&type=place",
"rating": 4.6,
"reviews": 435,
"thumbnail": "https://lh5.googleusercontent.com/p/AF1QipMXujpd2EQuMAiyYOtQP0WcIopq0MDkhr_ERyb2=w156-h156-n-k-no",
"type": [
"Coffee shop",
"Cafe",
"Event venue",
"Tea house",
"Wi-Fi spot"
]
},
... other local results
]
},
... other people also search for
],
"popular_times": {
"graph_results": {
"sunday": [
{
"time": "5 AM",
"busyness_score": 0
},
... other hours
],
... other days
},
"live_hash": {
"info": null,
"time_spent": "People typically spend 15 min here"
}
}
},
... other places
]
Note: You can open the playground or check the output to see which keys are available in this JSON structure for the data you need.
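As an illustration of navigating this structure, the sketch below works on a trimmed, hand-copied part of the output above rather than on a live response. The keys used are the ones visible in that output.

```python
place = {
    'title': 'Starbucks',
    'gps_coordinates': {'latitude': 40.7512294, 'longitude': -74.0252988},
    'hours': [{'tuesday': '6:30AM–7PM'}],
    'user_reviews': {
        'most_relevant': [{'username': 'Miri Castro', 'rating': 5}],
    },
}

# Nested dictionaries are accessed key by key.
latitude = place['gps_coordinates']['latitude']

# hours is a list of one-key dictionaries; merge them into a single dictionary.
hours = {day: opening for entry in place.get('hours', []) for day, opening in entry.items()}

# Collect the ratings of the most relevant user reviews, if any are present.
ratings = [
    review['rating']
    for review in place.get('user_reviews', {}).get('most_relevant', [])
]

print(latitude, hours, ratings)
```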