Applied Data Science Capstone by IBM/Coursera
1. Introduction: COFFEE ETHNIC IN DA NANG
In a large city with a rich coffee culture like Da Nang, Viet Nam, starting a coffee business is competitive. In this case my client is a humble Vietnamese man who contacted me for advice and to draw up the essential lines of business prediction and back-up plans (in this part we will only discuss predicting the hot spot).
2. Orientation
First of all we need to collect data on all coffee shops in Da Nang, including their name, id, and location (address, latitude, longitude), then pick the "hot" neighborhood where most of the venues are located. To acquire the data we use Foursquare, and we apply folium to visualize a particular neighborhood, in which we will observe customer "traffic" and predict an appropriate location for a new coffee shop in town. You will find its temporary name on the folium map: "O Day Roi!" (meaning "Here It Is!" in Vietnamese).
3. Execution steps
We import all the tools we need.
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np
!pip install folium
import folium
Enter your Foursquare credentials.
CLIENT_ID = 'your Foursquare ID' # your Foursquare ID
CLIENT_SECRET = 'your Foursquare Secret' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 40
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)
Request coffee venues near Da Nang city.
import requests
request_parameters = {
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    "v": '20180605',
    "section": "coffee",
    "near": "Da Nang",
    "radius": 1000,
    "limit": 50}
data = requests.get("https://api.foursquare.com/v2/venues/explore", params=request_parameters)
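Before parsing, it can help to confirm that the call actually succeeded. The sketch below assumes the usual Foursquare v2 envelope, a `meta` object carrying a status `code` next to the `response` object; the helper name `extract_response`, the `errorDetail` fallback text, and the two sample payloads are made up for illustration and are not real API output.

```python
def extract_response(payload):
    """Return the 'response' part of a Foursquare-style payload, or raise.

    Assumes the envelope looks like {"meta": {"code": ...}, "response": {...}}.
    """
    meta = payload.get("meta", {})
    code = meta.get("code")
    if code != 200:
        # 'errorDetail' is assumed to be present on failures; fall back if not.
        detail = meta.get("errorDetail", "no detail provided")
        raise RuntimeError("API call failed with code %s: %s" % (code, detail))
    return payload["response"]


# Hypothetical payloads for an offline check (not real API output)
ok = {"meta": {"code": 200}, "response": {"totalResults": 41}}
bad = {"meta": {"code": 400, "errorDetail": "Missing credentials"}}

print(extract_response(ok)["totalResults"])
try:
    extract_response(bad)
except RuntimeError as err:
    print(err)
```

In the notebook this would be called as `extract_response(data.json())`, so a bad credential or a typo in the parameters fails early with a readable message instead of a `KeyError` further down.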
Parse the response as JSON, then inspect the geocode.
d = data.json()["response"]
d.keys()
dict_keys(['suggestedFilters', 'geocode', 'headerLocation', 'headerFullLocation', 'headerLocationGranularity', 'query', 'totalResults', 'suggestedBounds', 'groups'])
d["headerLocationGranularity"], d["headerLocation"], d["headerFullLocation"]
('city', 'Da Nang', 'Da Nang')
d["suggestedBounds"], d["totalResults"]
({'ne': {'lat': 16.076950003892705, 'lng': 108.22982675100442},
'sw': {'lat': 16.059727142671775, 'lng': 108.21333222890732}},
41)
d["geocode"]
{'cc': 'VN',
'center': {'lat': 16.06778, 'lng': 108.22083},
'displayString': 'Da Nang, Thành Phố Đà Nẵng, Vietnam',
'geometry': {'bounds': {'ne': {'lat': 16.11072, 'lng': 108.276871},
'sw': {'lat': 15.982205, 'lng': 108.141861}}},
'longId': '72057594039511928',
'slug': 'turan-vietnam',
'what': '',
'where': 'da nang'}
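The `bounds` entry in the geocode describes the box around the whole city. As a small sketch, the hypothetical helpers below (`bounds_to_corners` and `bounds_center` are names introduced here, not part of the notebook) turn that dict into a south-west / north-east corner list and compute the midpoint of the box.

```python
def bounds_to_corners(bounds):
    """Convert {'ne': {...}, 'sw': {...}} into [[south, west], [north, east]]."""
    sw, ne = bounds["sw"], bounds["ne"]
    return [[sw["lat"], sw["lng"]], [ne["lat"], ne["lng"]]]


def bounds_center(bounds):
    """Return the (lat, lng) midpoint of the bounding box."""
    (south, west), (north, east) = bounds_to_corners(bounds)
    return ((south + north) / 2, (west + east) / 2)


# Values copied from the geocode output above
geometry_bounds = {"ne": {"lat": 16.11072, "lng": 108.276871},
                   "sw": {"lat": 15.982205, "lng": 108.141861}}

corners = bounds_to_corners(geometry_bounds)
center = bounds_center(geometry_bounds)
print(corners)
print(center)
```

A corner list in this shape is what folium's `Map.fit_bounds` accepts, so it could be used to frame the whole city on the map; the notebook itself does not do this and centers on the geocode `center` instead.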
Next we look at the group that holds the recommended venues.
d["groups"][0].keys()
dict_keys(['type', 'name', 'items'])
d["groups"][0]["type"], d["groups"][0]["name"]
('Recommended Places', 'recommended')
Extract the coffee shop items and their attributes (id, address, name, etc.).
items = d["groups"][0]["items"]
print("number of items: %i" % len(items))
items[0]
number of items: 41
{'reasons': {'count': 0,
'items': [{'reasonName': 'globalInteractionReason',
'summary': 'This spot is popular',
'type': 'general'}]},
'referralId': 'e-5-5a26a41a31ac6c676705e94c-0',
'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/coffeeshop_',
'suffix': '.png'},
'id': '4bf58dd8d48988d1e0931735',
'name': 'Coffee Shop',
'pluralName': 'Coffee Shops',
'primary': True,
'shortName': 'Coffee Shop'}],
'id': '5a26a41a31ac6c676705e94c',
'location': {'address': '39',
'cc': 'VN',
'city': 'Đà Nẵng',
'country': 'Việt Nam',
'crossStreet': 'Nguyễn Thái Học',
'formattedAddress': ['39 (Nguyễn Thái Học)',
'Đà Nẵng',
'Thành Phố Đà Nẵng',
'Việt Nam'],
'labeledLatLngs': [{'label': 'display',
'lat': 16.068062879455486,
'lng': 108.22351222071603}],
'lat': 16.068062879455486,
'lng': 108.22351222071603,
'postalCode': '551105',
'state': 'Thành Phố Đà Nẵng'},
'name': 'cộng cà phê',
'photos': {'count': 0, 'groups': []}}}
From the output we can identify the fields we will use later to assess where to launch our upcoming location.
Based on that, we start organizing what we have.
df_raw = []
for item in items:
    venue = item["venue"]
    categories, uid, name, location = venue["categories"], venue["id"], venue["name"], venue["location"]
    print(location)
    assert len(categories) == 1
    shortname = categories[0]["shortName"]
    # location is a dict, so look the address up by key; default to '' when it is missing
    address = location.get("address", "")
    # skip venues that have no postal code
    if "postalCode" not in location:
        continue
    postalcode = location["postalCode"]
    lat = location["lat"]
    lng = location["lng"]
    datarow = (uid, name, shortname, address, postalcode, lat, lng)
    df_raw.append(datarow)
df = pd.DataFrame(df_raw, columns=["uid", "name", "shortname", "address", "postalcode", "lat", "lng"])
print("found %i cafes" % len(df))
df.head()
Many coffee shops come without an address. Because location is a dict, we look the address up with location.get() and fall back to an empty string when the key is missing.
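Since `location` is a plain dict, its fields are keys rather than attributes, which matters for how a missing address is detected. A minimal sketch of the difference, using two made-up sample records (the helper name `get_address` is hypothetical):

```python
def get_address(location):
    """Return the address stored in a location dict, or '' if there is none."""
    return location.get("address", "")


# Made-up sample records, shaped like the 'location' dicts above
with_address = {"address": "39", "lat": 16.068, "lng": 108.2235}
without_address = {"lat": 16.07, "lng": 108.22}

# hasattr() checks object attributes, not dict keys
print(hasattr(with_address, "address"))   # False
# Membership tests and .get() look at the keys
print("address" in with_address)          # True
print(repr(get_address(with_address)))    # '39'
print(repr(get_address(without_address))) # ''
```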
The next step is an important one: we get the coordinates of Da Nang and create a folium map to help visualize the data we have collected.
DaNang_center = d["geocode"]["center"]
DaNang_center
{'lat': 16.06778, 'lng': 108.22083}
from folium import plugins
map_DaNang = folium.Map(location=[16.06778, 108.22083], zoom_start=14)
def add_markers(df):
    for (j, row) in df.iterrows():
        label = folium.Popup(row["name"], parse_html=True)
        folium.CircleMarker(
            [row["lat"], row["lng"]],
            radius=5,
            popup=label,
            color='red',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(map_DaNang)
add_markers(df)
hm_data = df[["lat", "lng"]].to_numpy().tolist()
map_DaNang.add_child(plugins.HeatMap(hm_data))
map_DaNang
Here is our beautiful Da Nang, with little red dots representing the coffee shops.
By spotting the clusters of dots we can see which neighborhood has a high density of coffee businesses.
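The clusters can also be read off numerically rather than by eye. The sketch below is one simple way to do that, not the method used in the notebook: it buckets points into a square grid and reports the fullest cell. The function name `densest_cell`, the cell size of 0.002 degrees (an arbitrary choice, roughly 220 m north to south), and the sample points are all assumptions made for illustration.

```python
from collections import Counter


def densest_cell(points, cell_size=0.002):
    """Bucket (lat, lng) points into a grid and return the fullest cell.

    Returns ((lat, lng) of the cell's south-west corner, number of points).
    cell_size is in degrees and is an arbitrary choice.
    """
    counts = Counter(
        (int(lat // cell_size), int(lng // cell_size)) for lat, lng in points
    )
    (row, col), n = counts.most_common(1)[0]
    return (row * cell_size, col * cell_size), n


# Made-up coordinates: three shops close together, one farther away
points = [
    (16.0681, 108.2235),
    (16.0685, 108.2238),
    (16.0689, 108.2231),
    (16.0750, 108.2100),
]

corner, n = densest_cell(points)
print(corner, n)
```

With the notebook's data, the input would be the same list that feeds the heat map, `df[["lat", "lng"]].to_numpy().tolist()`.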
4. Conclusion
We need a location where we can draw customers from the "hot" area we picked from the map, while staying at a certain distance from it so as to lessen the competition.
lat = 16.06778
lng = 108.22083
map_DaNang = folium.Map(location=[lat, lng], zoom_start=17)
add_markers(df)
folium.CircleMarker(
    [lat, lng],
    radius=15,
    popup="O Day Roi!",
    color='green',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_DaNang)
map_DaNang
Look at the blue bubble: it sits near the intersection of Pham Hong Thai street and Nguyen Chi Thanh street. As far as I know, this neighborhood is safe and right in the center; it is a short way off from the nightlife hot spot and on the route that high school students in the area pass by.
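The idea of staying "a certain distance" from competitors can be checked with a distance calculation. The sketch below uses the haversine formula, which the notebook does not use; the helper names `haversine_m` and `shops_within` and the 200 m and 300 m radii are assumptions chosen only for illustration. The candidate is the map center used above and the single shop is the first venue from the earlier output.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def shops_within(candidate, shops, radius_m):
    """Return the shops that lie within radius_m meters of the candidate."""
    lat, lng = candidate
    return [s for s in shops if haversine_m(lat, lng, s[0], s[1]) <= radius_m]


candidate = (16.06778, 108.22083)                   # map center used above
shops = [(16.068062879455486, 108.22351222071603)]  # first venue in the output

distance = haversine_m(candidate[0], candidate[1], shops[0][0], shops[0][1])
print(round(distance))
print(len(shops_within(candidate, shops, 200)))
print(len(shops_within(candidate, shops, 300)))
```

Running `shops_within` over every row of the data frame for a few candidate points would give a rough competitor count per candidate, which is one way to compare locations.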
Here you can find the full notebook to try it yourself: Link to the notebook