DEV Community

Cover image for Python Geospatial Analysis Guide: From Data Loading to Interactive Maps and Spatial Operations
Aarav Joshi
Aarav Joshi

Posted on

Python Geospatial Analysis Guide: From Data Loading to Interactive Maps and Spatial Operations

As a best-selling author, I invite you to explore my books on Amazon. Don't forget to follow me on Medium and show your support. Thank you! Your support means the world!

Geospatial analysis has become an essential part of modern data work, and I find Python to be one of the most accessible tools for handling location-based information. Whether you're looking at city planning, environmental changes, or delivery routes, understanding where things are and how they relate can reveal powerful insights. I want to walk through some practical methods I use regularly, with clear examples to help you get started. We will cover reading data, transforming coordinates, querying spatial relationships, performing geometric operations, creating maps, joining datasets, working with raster images, and converting addresses to coordinates. Each section includes code you can run yourself to see how it works.

I often begin by loading spatial data into Python. GeoPandas makes this straightforward by extending the familiar pandas library to handle geographic information. You can read various file formats like shapefiles, GeoJSON, or even data from web services. Once loaded, the data behaves like a DataFrame but with an extra geometry column that holds the shapes.

import geopandas as gpd

# Reading a shapefile of city parks
parks_gdf = gpd.read_file('city_parks.shp')

# Checking what's inside
print(f"Number of parks: {len(parks_gdf)}")
print(f"Columns available: {parks_gdf.columns.tolist()}")
print(f"First few park names: {parks_gdf['park_name'].head()}")

# You can also load data from a URL
# url = "https://raw.githubusercontent.com/username/data/main/parks.geojson"
# parks_online = gpd.read_file(url)
Enter fullscreen mode Exit fullscreen mode

After loading data, it is crucial to understand its coordinate reference system. This system defines how the coordinates relate to real-world positions on Earth. I have learned that ignoring this can lead to inaccurate measurements. For instance, calculating areas in degrees instead of meters doesn't make much sense. Reprojecting data ensures all your layers align correctly.

# Check the current coordinate system
print(f"Original system: {parks_gdf.crs}")

# If it's in geographic coordinates (like EPSG:4326), reproject to a projected system
parks_projected = parks_gdf.to_crs('EPSG:32610')  # UTM zone 10N

# Now areas are in square meters
parks_projected['area_sq_m'] = parks_projected.geometry.area
print(f"Largest park area: {parks_projected['area_sq_m'].max():.2f} square meters")

# You can also transform to Web Mercator for web mapping
parks_web = parks_gdf.to_crs('EPSG:3857')
print(f"New system: {parks_web.crs}")
Enter fullscreen mode Exit fullscreen mode

Spatial queries allow me to filter data based on location. For example, I might want to find all parks within a certain distance of a school or identify which district a specific point falls into. These operations are like asking questions about where things are in relation to each other.

from shapely.geometry import Point

# Define a point, say a new cafe location
cafe_point = Point(-122.4312, 37.7739)

# Find parks that contain this point
parks_with_cafe = parks_gdf[parks_gdf.geometry.contains(cafe_point)]
print(f"Parks containing the cafe: {len(parks_with_cafe)}")

# Create a buffer around the point to find nearby parks
search_radius = cafe_point.buffer(0.005)  # Roughly 500 meters
nearby_parks = parks_gdf[parks_gdf.geometry.intersects(search_radius)]
print(f"Parks within 500 meters: {len(nearby_parks)}")

# You can also check if geometries touch or overlap
touching_parks = parks_gdf[parks_gdf.geometry.touches(search_radius)]
Enter fullscreen mode Exit fullscreen mode

Geometric operations help modify or combine shapes. I use these to create buffers, find intersections, or merge multiple features. For instance, buffering a road can help analyze its impact zone, while merging polygons can simplify complex boundaries.

# Buffer all park boundaries by 50 meters
parks_gdf['buffered'] = parks_gdf.geometry.buffer(50)

# Find the intersection between two specific parks
park1 = parks_gdf.geometry.iloc[0]
park2 = parks_gdf.geometry.iloc[1]
shared_space = park1.intersection(park2)
if not shared_space.is_empty:
    print(f"Shared area size: {shared_space.area}")
else:
    print("No overlap between these parks")

# Dissolve all parks into one multipolygon
all_parks_combined = parks_gdf.geometry.unary_union
print(f"Combined into one geometry with {len(all_parks_combined.geoms)} parts")

# You can also compute differences, like what part of one park isn't in another
difference = park1.difference(park2)
Enter fullscreen mode Exit fullscreen mode

Visualizing data on maps makes patterns obvious. I prefer using Folium for interactive web maps because it is easy to share and explore. You can add layers, markers, and popups to make the map informative.

import folium

# Center the map on the average location of parks
avg_lat = parks_gdf.geometry.centroid.y.mean()
avg_lon = parks_gdf.geometry.centroid.x.mean()
m = folium.Map(location=[avg_lat, avg_lon], zoom_start=13)

# Add the parks as a GeoJSON layer
folium.GeoJson(parks_gdf.to_json(), name='Parks').add_to(m)

# Mark a specific point
folium.Marker(
    [37.7739, -122.4312],
    popup='New Cafe',
    icon=folium.Icon(color='green', icon='coffee')
).add_to(m)

# Add layer control to toggle visibility
folium.LayerControl().add_to(m)

# Save and open in a browser
m.save('parks_map.html')
Enter fullscreen mode Exit fullscreen mode

Spatial joins combine attributes from different datasets based on location. This is useful when you have, say, property parcels and want to know which school district they are in. I use this to enrich data with contextual information.

# Load additional datasets
parcels_gdf = gpd.read_file('property_parcels.shp')
districts_gdf = gpd.read_file('school_districts.shp')

# Join parcels to districts based on which district contains each parcel
parcels_with_districts = gpd.sjoin(parcels_gdf, districts_gdf, how='left', predicate='within')
print(f"Parcels matched to districts: {parcels_with_districts['district_name'].notna().sum()}")

# Find the nearest district for parcels not within any
nearest_districts = gpd.sjoin_nearest(parcels_gdf, districts_gdf, max_distance=2000)
print(f"Parcels with nearest district within 2 km: {len(nearest_districts)}")

# You can also join based on other relationships, like intersects or touches
Enter fullscreen mode Exit fullscreen mode

Raster data, such as satellite images or elevation models, requires different handling. I use Rasterio to read and process these grid-based datasets. This lets me analyze continuous surfaces like temperature or elevation.

import rasterio
from rasterio.plot import show
import matplotlib.pyplot as plt

# Open a digital elevation model
with rasterio.open('elevation.tif') as src:
    elevation = src.read(1)
    profile = src.profile.copy()

# Visualize the raster
plt.figure(figsize=(10, 8))
show(elevation, cmap='terrain')
plt.colorbar(label='Elevation (meters)')
plt.title('Elevation Map')
plt.savefig('elevation_plot.png')
plt.close()

# Calculate slope or other derivatives
from rasterio import features
# Example: Mask out water bodies (assuming value 0 is water)
water_mask = elevation == 0
elevation_land = elevation.copy()
elevation_land[water_mask] = np.nan

print(f"Land elevation stats - Min: {np.nanmin(elevation_land)}, Max: {np.nanmax(elevation_land)}")

# You can reproject rasters too, but it's more complex
# from rasterio.warp import calculate_default_transform, reproject
Enter fullscreen mode Exit fullscreen mode

Geocoding converts addresses to coordinates and back, which I use frequently to plot locations or find addresses from points. Geopy supports multiple services, but I stick with free options like Nominatim for small projects.

from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

# Set up the geocoder with a user agent
geolocator = Nominatim(user_agent="my_geospatial_app")
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)  # Avoid overloading the service

# Geocode a single address
address = "Golden Gate Park, San Francisco"
location = geocode(address)
if location:
    print(f"Coordinates: {location.latitude}, {location.longitude}")
    print(f"Full address: {location.address}")

# Reverse geocode coordinates back to an address
reverse = geolocator.reverse("37.7699, -122.4865")
print(f"Reverse result: {reverse.address}")

# For multiple addresses, use a loop or apply
addresses = ["100 Market St, San Francisco", "1 Telegraph Hill, San Francisco"]
coordinates = []
for addr in addresses:
    loc = geocode(addr)
    if loc:
        coordinates.append((loc.latitude, loc.longitude))
    else:
        coordinates.append(None)

print(f"Geocoded coordinates: {coordinates}")
Enter fullscreen mode Exit fullscreen mode

Throughout my projects, I have seen how these techniques build on each other. Starting with data loading and ensuring proper coordinates sets a strong foundation. Spatial queries and geometric operations let you dig into relationships, while visualization communicates findings effectively. Joins and raster handling expand what you can analyze, and geocoding bridges the gap between addresses and maps.

I remember one project where I had to find optimal locations for new public facilities. By combining parcel data with population density rasters and performing spatial joins, I could identify underserved areas. The interactive maps made it easy to present to stakeholders, and geocoding helped translate community input into actionable points.

Another time, I worked with environmental data, using raster operations to monitor changes in vegetation over time. Reprojecting ensured all layers matched, and geometric operations helped calculate affected areas. The ability to buffer and intersect made it simple to assess impact zones around protected regions.

In logistics, spatial queries streamlined route planning by identifying points within certain distances. Folium maps provided live updates, and geocoding converted customer addresses into routes. These practical applications show how versatile Python can be for spatial tasks.

When working with code, I always test small pieces first. For example, when reprojecting, I check a few entries to confirm the transformation. With spatial joins, I validate results by spot-checking on a map. This iterative approach saves time and reduces errors.

I encourage you to start with sample datasets, like those available from government open data portals. Practice loading them, changing coordinate systems, and making simple maps. As you grow comfortable, add more complex operations. The key is to build gradually, ensuring each step makes sense before moving forward.

Python's geospatial libraries continue to evolve, making previously complex tasks accessible. By mastering these core techniques, you can tackle a wide range of problems, from urban analytics to conservation efforts. The community support and extensive documentation mean you are never alone in solving issues.

I hope these examples and explanations provide a solid starting point. Geospatial analysis might seem daunting at first, but with hands-on practice, it becomes an intuitive part of your toolkit. Feel free to adapt the code to your needs, and don't hesitate to explore additional libraries for specialized tasks.

📘 Checkout my latest ebook for free on my channel!

Be sure to like, share, comment, and subscribe to the channel!


101 Books

101 Books is an AI-driven publishing company co-founded by author Aarav Joshi. By leveraging advanced AI technology, we keep our publishing costs incredibly low—some books are priced as low as $4—making quality knowledge accessible to everyone.

Check out our book Golang Clean Code available on Amazon.

Stay tuned for updates and exciting news. When shopping for books, search for Aarav Joshi to find more of our titles. Use the provided link to enjoy special discounts!

Our Creations

Be sure to check out our creations:

Investor Central | Investor Central Spanish | Investor Central German | Smart Living | Epochs & Echoes | Puzzling Mysteries | Hindutva | Elite Dev | Java Elite Dev | Golang Elite Dev | Python Elite Dev | JS Elite Dev | JS Schools


We are on Medium

Tech Koala Insights | Epochs & Echoes World | Investor Central Medium | Puzzling Mysteries Medium | Science & Epochs Medium | Modern Hindutva

Top comments (0)