Ryan Maquiling for UP Mindanao SPARCS

Hello World, Literally: Mapping Real-World Data using Geospatial Python

This article was co-authored by @shin462.

In programming, the first task given to newcomers is almost always a script that prints the phrase "Hello, World!" on the screen. It's just a single line of code, but for many it marks the beginning of an exciting programming journey: the first time they see their own code working.

Now, imagine taking that excitement a step further. Instead of printing text, what if your code could show you a piece of the actual world in the form of a map? With geospatial programming, you can plot a dataset on a map using nothing more than its raw coordinates. This guide is a "Hello, World" reimagined: we move beyond spreadsheets and visually present real-world data on a map using Python!

Geospatial Python

Geospatial Python is the use of the Python programming language to work with maps and geographic information. It builds upon Geographic Information System (GIS) technologies, computer-based frameworks that manage and analyze data and connect them to specific locations on a map. Through Python, spatial analysis is translated into programs that process, automate, and visualize geographic data directly through code.

In simple words, it allows computers to know where things are in the real world and how certain data relate to those locations. Most data tell us what happened or how many, but Geospatial Python adds an important question: where? Instead of merely looking at numbers in a table, we can place them on a map using coordinates (longitude, latitude), turning ordinary data into something we can see and visually explore.


What you need to start

  • An IDE (preferably Visual Studio Code) - this is where you type and manage your code.
  • Jupyter Notebook - lets you run small blocks of code (cells) one at a time instead of the whole program, so you can immediately see the output of each block. It can be installed as an extension in Visual Studio Code.
  • Python - the main programming language used, which can be downloaded from https://www.python.org/downloads/. Python alone can't work with maps, so we equip it with some add-ons (called libraries):
    • Geopandas - allows Python to understand geometry and geography. Instead of treating longitude and latitude as plain float numbers, Python can treat them as coordinates that can be placed on a map as points.
    • Folium - generates maps with interactive features (like pan and zoom), just like in Google Maps.
    • Leafmap - a higher-level library that bundles many mapping features and lets you build multi-layered maps.
    • Ipywidgets - adds more interactive controls (sliders, dropdown menus, and buttons).

Let's Get Started!

Step 1: Create a Python Virtual Environment, Make a Jupyter Notebook File, and Install the Libraries

Make a folder, open it in VS Code, then open the Command Palette (Ctrl+Shift+P on Windows/Linux, or Cmd+Shift+P on Mac) and type >Python: Create Environment. Select 'venv' and then your Python version. This ensures that the libraries we install stay inside this folder, which prevents them from interfering with your other Python projects.

CTRL + SHIFT + P
> Python: Create Environment

Make a Jupyter Notebook file inside your folder using the format [filename].ipynb. Once it is created, click 'Select Kernel' in the top-right corner of the screen and select the 'venv' Python environment you just created. Next, add a code cell using the +Code button, paste the installation commands below, and run the cell.


# The % sign ensures these install directly into your active Jupyter kernel.

%pip install geopandas
%pip install folium -q
%pip install leafmap -q
%pip install -U ipywidgets -q
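
Before moving on, it can help to confirm that the installs actually landed in the kernel you selected. The cell below is a minimal sketch using only Python's standard library; the helper name missing_libraries is ours and is not part of any of the libraries above.

```python
from importlib.util import find_spec

# Import names of the libraries installed in Step 1
REQUIRED = ["geopandas", "folium", "leafmap", "ipywidgets"]


def missing_libraries(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_libraries(REQUIRED)
print("Missing libraries:", missing if missing else "none")
```

If a name shows up as missing, the most likely cause is that the notebook is running on a different kernel than the 'venv' you created, so re-check 'Select Kernel' before reinstalling.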

Step 2: Import the Libraries

To make the installed libraries easier to use, import them with aliases. You can then call each library by its alias instead of its full name, which saves some typing (here, instead of typing geopandas, we can use 'gpd').

import geopandas as gpd # used aliases for the libraries to be easily called and used
import pandas as pd
import leafmap as lm

Step 3: Create a Dictionary for the Dataset

Before we start mapping, we need some real-world data. For this guide, we are going to write our own dataset inside the code using a Python dictionary. Our example dataset lists some cities in the Philippines with their names, populations (from World Population Review), and coordinates (from LatLong). You can think of the Python dictionary as a table: each key in quotes (like "Name") is a column header, and the list attached to it holds the values in that column (the specific names of the cities).

# Data for some cities in the Philippines
data = {
    "Name": ["Butuan City", "Quezon City", "Lapu-Lapu City", "General Santos City", "Iloilo City"], #City names
    "Population": [407731, 3359534, 544425, 734828, 518106],  # Approximate populations
    "Latitude": [8.951549, 14.648731, 10.266182, 6.116243, 10.720321], #Latitude
    "Longitude": [125.527725, 121.047806, 123.997292, 125.171738, 122.562019] #Longitude
}
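
Since every list in the dictionary acts as a column, the lists must all have the same length, and it is easy to swap latitude and longitude by accident. The cell below is a small sanity check written in plain Python; the function name check_columns is our own and the checks are only a sketch, not a full validation.

```python
# Same dataset as above, repeated so this cell runs on its own
data = {
    "Name": ["Butuan City", "Quezon City", "Lapu-Lapu City", "General Santos City", "Iloilo City"],
    "Population": [407731, 3359534, 544425, 734828, 518106],
    "Latitude": [8.951549, 14.648731, 10.266182, 6.116243, 10.720321],
    "Longitude": [125.527725, 121.047806, 123.997292, 125.171738, 122.562019],
}


def check_columns(table):
    """Return the number of rows, or raise ValueError if the table looks wrong."""
    lengths = {column: len(values) for column, values in table.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"Columns have different lengths: {lengths}")
    for lat, lon in zip(table["Latitude"], table["Longitude"]):
        # Latitude must be within [-90, 90] and longitude within [-180, 180]
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"Coordinate out of range: lat={lat}, lon={lon}")
    return next(iter(lengths.values()))


n_rows = check_columns(data)
print(n_rows, "rows")

# Reading the columns side by side gives one tuple per city (one table row)
for row in zip(*data.values()):
    print(row)
```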

Step 4: Make a DataFrame using the dictionary

While dictionaries are great for storing data, they are not very easy to analyze or manipulate. So, we will transform the dictionary into a spreadsheet-like table with rows and columns using a DataFrame from Pandas, the library that GeoPandas is built on.

# Convert the dictionary into a Pandas DataFrame (a standard data table)
cities_df = pd.DataFrame(data) # call the alias of Pandas
cities_df # display the DataFrame

DataFrame:

Index  Name                 Population   Latitude   Longitude
0      Butuan City              407731   8.951549  125.527725
1      Quezon City             3359534  14.648731  121.047806
2      Lapu-Lapu City           544425  10.266182  123.997292
3      General Santos City      734828   6.116243  125.171738
4      Iloilo City              518106  10.720321  122.562019
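
Once the data is in a DataFrame, ordinary Pandas operations apply to it. The cell below is a short sketch that rebuilds the same table so it can run on its own; the variable names largest_first and big_cities and the 500,000 threshold are ours, chosen only for illustration.

```python
import pandas as pd

# Same dataset as in Step 3, repeated so this cell is self-contained
data = {
    "Name": ["Butuan City", "Quezon City", "Lapu-Lapu City", "General Santos City", "Iloilo City"],
    "Population": [407731, 3359534, 544425, 734828, 518106],
    "Latitude": [8.951549, 14.648731, 10.266182, 6.116243, 10.720321],
    "Longitude": [125.527725, 121.047806, 123.997292, 125.171738, 122.562019],
}
cities_df = pd.DataFrame(data)

# Sort the rows from the most to the least populous city
largest_first = cities_df.sort_values("Population", ascending=False)

# Keep only the cities above an arbitrary population threshold
big_cities = cities_df[cities_df["Population"] > 500_000]

print(largest_first[["Name", "Population"]])
print(len(big_cities), "cities above the threshold")
```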

Tip: Create a new column in the DataFrame that combines the city names and population to be used as point labels later on. Amazingly, you can format these according to your style using HTML!

cities_df['Point_Labels'] = (  # add a new column holding the label of each point
    cities_df['Name'] +
    "<br>" +  # HTML line break so the population sits below the city name
    "<span style='color: yellow; font-size: 8pt;'>" +  # style for the population text
    "(" + cities_df['Population'].apply(lambda x: f"{x:,}") + ")" +  # adds a comma every three digits (thousands separator)
    "</span>"
)
cities_df # display the new DataFrame

DataFrame:

Index  Name                 Population   Latitude   Longitude  Point_Labels
0      Butuan City              407731   8.951549  125.527725  Butuan City<br><span style='color: yellow; fon...
1      Quezon City             3359534  14.648731  121.047806  Quezon City<br><span style='color: yellow; fon...
2      Lapu-Lapu City           544425  10.266182  123.997292  Lapu-Lapu City<br><span style='color: yellow; ...
3      General Santos City      734828   6.116243  125.171738  General Santos City<br><span style='color: yel...
4      Iloilo City              518106  10.720321  122.562019  Iloilo City<br><span style='color: yellow; fon...

The values under the "Point_Labels" column may look odd in the DataFrame, but don't worry: the HTML styling is applied once the labels are plotted on the map!
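
To see what one of those label strings looks like in full, the same formatting can be reproduced for a single city in plain Python. This is only a sketch of the string-building logic above; the function name build_label is hypothetical and not part of Pandas.

```python
def build_label(name, population):
    """Build the HTML label for one city: name on top, population below it."""
    return (
        name
        + "<br>"  # line break between the name and the population
        + "<span style='color: yellow; font-size: 8pt;'>"
        + f"({population:,})"  # the ',' format inserts thousands separators
        + "</span>"
    )


print(build_label("Butuan City", 407731))
```

The `:,` format specifier is the same one used inside the lambda in the DataFrame version, so 407731 is rendered as 407,731.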

Step 5: Convert the DataFrame into a GeoDataFrame

Right now, when Python reads the DataFrame, it interprets the longitude and latitude only as float numbers and not as specific locations on Earth. That's why we need to convert our Pandas DataFrame into a GeoPandas GeoDataFrame. The conversion adds a new 'geometry' column that holds spatial points built from the longitude and latitude in the dataset.

To do this, we need to tell Python these two things:

  • The columns that hold the raw coordinates - specify that longitude is the X-axis and latitude is the Y-axis.
  • The Coordinate Reference System (CRS) - the framework that defines how these coordinates are interpreted (EPSG:4326 is a geographic system that locates points on the globe using latitude and longitude in degrees, EPSG:3857 is a projected system that flattens the globe onto a 2D plane and uses Cartesian coordinates in meters, etc.).

# Create a 'geometry' column that consists of the (x, y) points using GeoPandas
gdf = gpd.GeoDataFrame(  # call the alias of GeoPandas
    cities_df,
    geometry=gpd.points_from_xy(
        cities_df['Longitude'],  # Longitude as the X axis
        cities_df['Latitude'],   # Latitude as the Y axis
    ),
)

# Set the Coordinate Reference System to EPSG:4326 since our coordinates are longitude and latitude in degrees
gdf = gdf.set_crs(epsg=4326)
gdf # display the new GeoDataFrame

GeoDataFrame:

Index  Name                 Population   Latitude   Longitude  Point_Labels                                       geometry
0      Butuan City              407731   8.951549  125.527725  Butuan City<br><span style='color: yellow; fon...  POINT (125.52772 8.95155)
1      Quezon City             3359534  14.648731  121.047806  Quezon City<br><span style='color: yellow; fon...  POINT (121.04781 14.64873)
2      Lapu-Lapu City           544425  10.266182  123.997292  Lapu-Lapu City<br><span style='color: yellow; ...  POINT (123.99729 10.26618)
3      General Santos City      734828   6.116243  125.171738  General Santos City<br><span style='color: yel...  POINT (125.17174 6.11624)
4      Iloilo City              518106  10.720321  122.562019  Iloilo City<br><span style='color: yellow; fon...  POINT (122.56202 10.72032)
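
To get a feel for what a CRS changes, the cell below converts one longitude/latitude pair (EPSG:4326, in degrees) into Web Mercator coordinates (EPSG:3857, in meters) with the standard spherical Mercator formulas. It is a hand-written sketch for illustration only: the function name lonlat_to_web_mercator is ours, and in a real project GeoPandas can do the reprojection for you with gdf.to_crs(epsg=3857).

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator, in meters


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Butuan City, using the coordinates from Step 3 (longitude first, then latitude)
x, y = lonlat_to_web_mercator(125.527725, 8.951549)
print(f"x = {x:.1f} m, y = {y:.1f} m")
```

The same place is described by small numbers in degrees in one system and by numbers in the millions of meters in the other, which is why the CRS has to be stated explicitly.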

Step 6: Plot the GeoDataFrame on a Map using LeafMap

Now that our data is stored in a GeoDataFrame (the gdf variable we created in Step 5), we can plot these points on a dynamic map. LeafMap is a good fit for this because it draws the data as a layer on top of a background map (basemap). We can also add other basemap layers that provide more geographical context. In this guide, we added Esri.WorldImagery, which gives high-resolution satellite imagery (like in Google Earth), and OpenStreetMap, which shows a standard street map (like in Google Maps).

# Call the alias for LeafMap (lm)
plotted_map = lm.Map(center=[9.5, 121.774], zoom=6) # Centers the map near the coordinates of the Philippines, with a default zoom of 6
plotted_map.add_gdf(gdf, layer_name="Cities") # Add the GeoDataFrame to the map

plotted_map.add_basemap("Esri.WorldImagery")  # Satellite imagery
plotted_map.add_basemap("OpenStreetMap")      # OpenStreetMap layer

plotted_map # Displays the map

Map:

[Image: map with plotted points]

We can then use the "Point_Labels" column (created in Step 4) to add labels on top of the plotted spatial points.

plotted_map.add_labels(
    gdf,                   # The GeoDataFrame where the spatial points are found
    "Point_Labels",        # The column whose values are placed as labels
    font_size="10pt",      # Default text size of the labels
    font_color="white",    # Default text color of the labels
    font_weight="bold",    # Makes all labels bold
)

plotted_map # Displays the map

Map:

[Image: map with plotted points and their labels]

As seen in the displayed map, all spatial points are plotted at their coordinates on Earth. All points are also labeled and styled using HTML, showing their city names (white font) and their estimated populations (yellow font). The map is also interactive, with controls for the following features:

  • Zoom and Pan - the map is rendered with Leaflet (the same engine that Folium wraps), so you can move around the world by dragging the screen and zoom in to see where the points are plotted.
  • Real-Time Control (Ipywidgets) - you can change the opacity of the layers using sliders and buttons.
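
One detail from Step 6 worth revisiting: the map center was typed in by hand. As a rough alternative, the center and the bounding box can be derived from the data itself. The cell below is a plain-Python sketch; the function name center_and_bounds is ours, and averaging coordinates is only a reasonable shortcut for points that are fairly close together, like these cities.

```python
# Coordinates from Step 3
latitudes = [8.951549, 14.648731, 10.266182, 6.116243, 10.720321]
longitudes = [125.527725, 121.047806, 123.997292, 125.171738, 122.562019]


def center_and_bounds(lats, lons):
    """Return ([lat, lon] center, [[south, west], [north, east]] bounds)."""
    center = [sum(lats) / len(lats), sum(lons) / len(lons)]
    bounds = [[min(lats), min(lons)], [max(lats), max(lons)]]
    return center, bounds


center, bounds = center_and_bounds(latitudes, longitudes)
print("center:", center)
print("bounds:", bounds)
```

The center list is in [latitude, longitude] order, the same order passed to lm.Map(center=...) in Step 6, so it could be used there in place of the hand-typed values.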

Congratulations! You've officially taken your "Hello, World" quite literally. You've now plotted real-world data onto a map using geospatial Python!


Geospatial Python FAQ

What are its advantages and uses?

Looking at data through rows of spreadsheets can be confusing and also boring. But with Geospatial Python, numbers can come to life, turning raw data into living maps and providing a powerful way to analyze and visualize location-based information. By combining libraries like Geopandas, Folium, Leafmap, and Ipywidgets, Python allows users to interact directly with geographical data, making it easier to measure distances or identify spatial patterns. Unlike simple static charts, these tools make interactivity possible, as users can click, zoom, and hover to make sense of spatial relationships.
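
As a taste of what "measuring distances" means in code, the cell below estimates the straight-line (great-circle) distance between two of the cities from Step 3 using the haversine formula. It is a simplified sketch that assumes a perfectly spherical Earth with a mean radius of 6371 km; the function name haversine_km is ours, and dedicated geospatial libraries offer more accurate methods.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Butuan City to Quezon City, using the coordinates from Step 3
distance_km = haversine_km(8.951549, 125.527725, 14.648731, 121.047806)
print(f"Butuan City to Quezon City: about {distance_km:.0f} km")
```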

Real World Applications:

  • Disaster Response - Locating hospitals and evacuation centers.
  • Disaster Management - Mapping flood-prone areas and seismic faults.
  • Urban Planning - Identifying population density and analyzing road networks.
  • Business & Retail - Identifying area demographics and selecting optimal branch locations.
  • Weather Forecasting - Visualizing temperature, rainfall, wind speed, etc.

From finding the nearest Starbucks branch to satisfy your coffee addiction to locating hospitals when it comes to emergencies, Geospatial Python is a reliable companion in mapping out exactly what you need.

What are its limitations?

In this article, we focused primarily on interactive mapping and exploration, but the reach of Geospatial Python extends well beyond these visualizations. With other libraries such as Matplotlib and Shapely, we can also represent geographic features as points, lines, and polygons using x, y coordinates, allowing us to create static maps and charts. Advanced spatial analysis, network modeling, geostatistics, and machine learning integrations are also key components of the field. Interactive data mapping simply serves as an entry point, demonstrating how code can interact directly with real-world geography.
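
To make "points, lines, and polygons using x, y coordinates" concrete without installing anything new, the cell below writes the three shapes as GeoJSON-style dictionaries in plain Python. Note that this swaps in GeoJSON for Shapely purely for illustration; the shapes themselves (a line from Butuan City to General Santos City, a triangle between three cities) are made up from the Step 3 coordinates.

```python
import json

# (longitude, latitude) pairs, i.e. (x, y), taken from the dataset in Step 3
butuan = [125.527725, 8.951549]
gensan = [125.171738, 6.116243]
lapu_lapu = [123.997292, 10.266182]
iloilo = [122.562019, 10.720321]

# A point is a single position
point = {"type": "Point", "coordinates": butuan}

# A line is an ordered list of positions
line = {"type": "LineString", "coordinates": [butuan, gensan]}

# A polygon is a list of rings; each ring ends on the position where it starts
polygon = {"type": "Polygon", "coordinates": [[butuan, lapu_lapu, iloilo, butuan]]}

for geometry in (point, line, polygon):
    print(json.dumps(geometry))
```

In all three shapes the longitude comes first and the latitude second, matching the X and Y roles the two columns were given in Step 5.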

Geospatial Python and GIS technologies as a whole offer exceptional analytical and visualization capabilities. But working with geographic data can be taxing on your computer, especially when handling large datasets such as nationwide road networks or high-resolution satellite imagery, which can result in slow processing and high memory usage. On top of that, it requires managing multiple libraries, which can sometimes cause version-compatibility issues when installing additional ones. While mastering it requires both technical and geographic knowledge, its flexibility and automation make it a great tool for research, disaster response, urban planning, and business decision-making.

Despite some challenges, the power of Geospatial Python cannot be denied. It gives us an entirely different perspective on how we look at and interact with data, connecting lines of code to the real world. In this way, Geospatial Python turns the classic “Hello, World” into something far more meaningful: hello, world, literally.
