
Jade Tran


Capstone Project - The Battle of Neighborhoods

Applied Data Science Capstone by IBM/Coursera


1. Introduction: COFFEE CULTURE IN DA NANG

In a large city with a rich coffee culture like Da Nang, Viet Nam, starting a coffee business is a competitive undertaking. In this case my client is a humble Vietnamese man who contacted me for advice and for the essential lines of a business forecast and back-up plans (though in this post we will only discuss predicting a hot spot).


2. Orientation

First, we need to collect data on all coffee shops in Da Nang, including their name, id, and location (address, latitude, longitude), then pick the "hot" neighborhood where most of the venues are located. To gather the data we use Foursquare, and we use folium to visualize a particular neighborhood, in which we will observe customer "traffic" and predict an appropriate location for a new coffee shop in town. You will find its temporary name on the folium map: "O Day Roi!" (meaning "Here It Is!" in Vietnamese).


3. Execution steps

We import all the tools we need.

import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np
!pip install folium
import folium 

Enter your Foursquare API credentials.

CLIENT_ID = 'your Foursquare ID' # your Foursquare ID
CLIENT_SECRET = 'your Foursquare Secret' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 40
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)
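If you plan to share the notebook, one option is to read the credentials from environment variables instead of typing them into a cell. The following is only a minimal sketch: the variable names FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET and the helpers load_credentials and mask are hypothetical names chosen for this example, not part of Foursquare's tooling.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) from a mapping of environment variables.

    The variable names are assumptions made for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


def mask(secret, visible=4):
    """Keep only the first few characters of a secret so it is safe to print."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

With this in place, CLIENT_ID, CLIENT_SECRET = load_credentials() would set the two names used in the rest of the notebook, and print(mask(CLIENT_SECRET)) confirms a value was loaded without showing all of it.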

Send a request for coffee venues near Da Nang city.

import requests

request_parameters = {
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    "v": '20180605',
    "section": "coffee",
    "near": "Da Nang",
    "radius": 1000,
    "limit": 50}

data = requests.get("https://api.foursquare.com/v2/venues/explore", params=request_parameters)
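Before reading the payload, it can help to confirm that the request actually succeeded. The sketch below assumes the usual Foursquare v2 envelope, in which the JSON body carries a "meta" object with a numeric "code" (and an "errorDetail" on failure) next to "response". The helper name extract_response is hypothetical.

```python
def extract_response(payload):
    """Return payload["response"] when the meta code is 200, otherwise raise.

    Assumes a v2-style envelope: {"meta": {"code": ...}, "response": {...}}.
    """
    meta = payload.get("meta", {})
    code = meta.get("code")
    if code != 200:
        detail = meta.get("errorDetail", "no detail provided")
        raise ValueError(f"Foursquare request failed (code {code}): {detail}")
    return payload["response"]
```

In the notebook this could be used as d = extract_response(data.json()), so that a bad credential or a rejected parameter fails with a readable message rather than a KeyError further down.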

Parse the response as JSON, then inspect its keys and the geocode.

d = data.json()["response"]
d.keys()

dict_keys(['suggestedFilters', 'geocode', 'headerLocation', 'headerFullLocation', 'headerLocationGranularity', 'query', 'totalResults', 'suggestedBounds', 'groups'])

d["headerLocationGranularity"], d["headerLocation"], d["headerFullLocation"]

('city', 'Da Nang', 'Da Nang')

d["suggestedBounds"], d["totalResults"]

({'ne': {'lat': 16.076950003892705, 'lng': 108.22982675100442},
  'sw': {'lat': 16.059727142671775, 'lng': 108.21333222890732}},
 41)

d["geocode"]

{'cc': 'VN',
 'center': {'lat': 16.06778, 'lng': 108.22083},
 'displayString': 'Da Nang, Thành Phố Đà Nẵng, Vietnam',
 'geometry': {'bounds': {'ne': {'lat': 16.11072, 'lng': 108.276871},
   'sw': {'lat': 15.982205, 'lng': 108.141861}}},
 'longId': '72057594039511928',
 'slug': 'turan-vietnam',
 'what': '',
 'where': 'da nang'}

Next, we look at the group of recommended places in the response.

d["groups"][0].keys()

dict_keys(['type', 'name', 'items'])

d["groups"][0]["type"], d["groups"][0]["name"]

('Recommended Places', 'recommended')

We extract the items: the coffee shop objects and their attributes (id, address, name, etc.).

items = d["groups"][0]["items"]
print("number of items: %i" % len(items))
items[0]

number of items: 41
{'reasons': {'count': 0,
  'items': [{'reasonName': 'globalInteractionReason',
    'summary': 'This spot is popular',
    'type': 'general'}]},
 'referralId': 'e-5-5a26a41a31ac6c676705e94c-0',
 'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/coffeeshop_',
     'suffix': '.png'},
    'id': '4bf58dd8d48988d1e0931735',
    'name': 'Coffee Shop',
    'pluralName': 'Coffee Shops',
    'primary': True,
    'shortName': 'Coffee Shop'}],
  'id': '5a26a41a31ac6c676705e94c',
  'location': {'address': '39',
   'cc': 'VN',
   'city': 'Đà Nẵng',
   'country': 'Việt Nam',
   'crossStreet': 'Nguyễn Thái Học',
   'formattedAddress': ['39 (Nguyễn Thái Học)',
    'Đà Nẵng',
    'Thành Phố Đà Nẵng',
    'Việt Nam'],
   'labeledLatLngs': [{'label': 'display',
     'lat': 16.068062879455486,
     'lng': 108.22351222071603}],
   'lat': 16.068062879455486,
   'lng': 108.22351222071603,
   'postalCode': '551105',
   'state': 'Thành Phố Đà Nẵng'},
  'name': 'cộng cà phê',
  'photos': {'count': 0, 'groups': []}}}

We test by displaying a single item.

From the output we can identify the fields we will use later to assess where to open our upcoming location.

Based on that we start to organize what we have got.

df_raw = []
for item in items:
    venue = item["venue"]
    categories, uid, name, location = venue["categories"], venue["id"], venue["name"], venue["location"]
    print(location)
    assert len(categories) == 1
    shortname = categories[0]["shortName"]
    # location is a dict, so read the key with .get(); hasattr() checks attributes, not keys
    address = location.get("address", "")
    # skip venues that have no postal code
    if "postalCode" not in location:
        continue
    postalcode = location["postalCode"]
    lat = location["lat"]
    lng = location["lng"]
    datarow = (uid, name, shortname, address, postalcode, lat, lng)
    df_raw.append(datarow)
df = pd.DataFrame(df_raw, columns=["uid", "name", "shortname", "address", "postalcode", "lat", "lng"])
print("found %i cafes" % len(df))
df.head()

Here is the output.

Many coffee shops have no address, so we read it with location.get("address", ""), which falls back to an empty string when the key is missing. (hasattr() would not work here: location is a dictionary, and hasattr() checks object attributes, not dictionary keys.)
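The difference between an attribute check and a key check can be seen on a small dictionary shaped like the location field in the item shown earlier. This is just an illustration; the variable names are made up for the example.

```python
# A trimmed-down copy of the "location" dictionary from the first item
location = {"address": "39", "lat": 16.068062879455486, "lng": 108.22351222071603}

# hasattr() looks for an attribute named "address" on the dict object itself
has_attribute = hasattr(location, "address")

# the "in" operator looks for a key named "address"
has_key = "address" in location

# .get() returns the value when the key exists and the default otherwise
address = location.get("address", "")
missing_address = {"lat": 16.0, "lng": 108.2}.get("address", "")

print(has_attribute, has_key, repr(address), repr(missing_address))
# prints: False True '39' ''
```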
The next step is a very important one: we get the coordinates of Da Nang and create a folium map to help visualize the data we have collected.

DaNang_center = d["geocode"]["center"]
DaNang_center

{'lat': 16.06778, 'lng': 108.22083}

from folium import plugins

# center the map on the coordinates returned in the geocode
map_DaNang = folium.Map(location=[DaNang_center["lat"], DaNang_center["lng"]], zoom_start=14)

def add_markers(df):
    for (j, row) in df.iterrows():
        label = folium.Popup(row["name"], parse_html=True)
        folium.CircleMarker(
            [row["lat"], row["lng"]],
            radius=5,
            popup=label,
            color='red',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(map_DaNang)

add_markers(df)
hm_data = df[["lat", "lng"]].to_numpy().tolist()
map_DaNang.add_child(plugins.HeatMap(hm_data))

map_DaNang

Here is our beautiful Da Nang, with little red-ringed dots representing the different venues.

By spotting the clusters of venues, we can see which neighborhoods have a high density of coffee businesses.
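The clusters can also be estimated numerically rather than by eye. One simple approach, sketched here as an assumption and not as the method used in the notebook, is to bin the coordinates into a square grid and count venues per cell. The function name densest_cell and the cell size of 0.002 degrees (roughly 200 m at this latitude) are hypothetical choices for the example.

```python
from collections import Counter


def densest_cell(points, cell_size=0.002):
    """Bin (lat, lng) pairs into square cells; return the fullest cell's center and count."""
    counts = Counter(
        (int(lat // cell_size), int(lng // cell_size)) for lat, lng in points
    )
    (row, col), n = counts.most_common(1)[0]
    # the center of a cell is half a cell past its lower-left corner
    center = ((row + 0.5) * cell_size, (col + 0.5) * cell_size)
    return center, n
```

It accepts the same list that feeds the heat map, for example densest_cell(df[["lat", "lng"]].to_numpy().tolist()). A fixed grid is crude, since a cluster that straddles a cell border gets split, so the result is best read alongside the map.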

4. Conclusion

We need a location that can draw customers from the "hot" area we picked out on the map, while staying a certain distance away from it to lessen the competition.

lat = 16.06778
lng = 108.22083
map_DaNang = folium.Map(location=[lat, lng], zoom_start=17)
add_markers(df)
folium.CircleMarker(
    [lat, lng],
    radius=15,
    popup="O Day Roi!",
    color='green',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_DaNang)
map_DaNang

Look at the blue bubble: it sits near the intersection of Pham Hong Thai street and Nguyen Chi Thanh street. As far as I know, this neighborhood is safe and right in the center; it is just off the nightlife hot spot and on the route that high school students in the area pass along.
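The rule of thumb used here (close to the hot spot, but not on top of an existing shop) can be written down as a small check. This is a sketch under stated assumptions: the helper names haversine_m and is_suitable are hypothetical, and the 500 m and 100 m thresholds are placeholder values, not figures taken from the analysis.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))  # 6371000 m is the mean Earth radius


def is_suitable(candidate, hot_spot, competitors,
                max_to_hot_spot=500, min_to_competitor=100):
    """True if candidate is within reach of the hot spot and clear of every competitor."""
    if haversine_m(*candidate, *hot_spot) > max_to_hot_spot:
        return False
    return all(
        haversine_m(*candidate, *shop) >= min_to_competitor for shop in competitors
    )
```

For instance, is_suitable((lat, lng), hot_spot, df[["lat", "lng"]].to_numpy().tolist()) would test the marked location against every cafe in the table, where hot_spot is whatever point you take as the center of the cluster.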

Here you can find the full notebook to try it yourself: Link to the notebook
