Akmal Chaudhri for SingleStore

Quick tip: Using Uber's H3 to visualise British Transport Police crime data

Abstract

Creating visualisations can often be a great way to present data. In this short article, we'll apply Uber's H3 Hexagonal Hierarchical Spatial Index to British Transport Police (BTP) crime data, and then visualise the results. We'll use Deepnote as our development environment.

Introduction

In a previous article, we discussed how to map crimes and visualise hot routes. We'll extend that work in this article by obtaining the latest BTP crime data for the UK and applying Uber's H3 library to it.

Obtain BTP crime data

The file we need is 2022-06-btp-street.csv. This can be generated from the Data Downloads page. On that page, we'll select the following:

  • Date range: June 2022 to June 2022.
  • Forces: Check (✔) British Transport Police.
  • Data sets: Check (✔) Include crime data.
  • Generate file.

The download will be a zip file, and the CSV file we need can be extracted from that.
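
The extraction step can also be scripted. The following is a minimal sketch using Python's standard zipfile module. The helper name extract_csv and the archive name download.zip are hypothetical, and since we have not inspected how the archive is laid out, the code simply looks for a member whose name ends with the CSV filename.

```python
import zipfile
from pathlib import Path


def extract_csv(zip_path, csv_suffix, dest_dir="data"):
    """Extract the first archive member whose name ends with csv_suffix.

    The file is written directly into dest_dir (folders inside the
    archive are ignored) and the path of the extracted file is returned.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith(csv_suffix):
                target = dest / Path(name).name
                target.write_bytes(archive.read(name))
                return target
    raise FileNotFoundError(
        f"No member ending with {csv_suffix!r} in {zip_path}"
    )
```

For example, extract_csv("download.zip", "2022-06-btp-street.csv") would write the file into the data folder, assuming the download was saved as download.zip.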

Create a Deepnote account

We'll create a free account on the Deepnote website. Once logged in, we'll create a new Deepnote project to give us a new notebook. We'll also need to create a data folder, where we'll store the CSV file.

Deepnote notebook

Let's now start to fill out our notebook.

First, we'll import some libraries:

import pandas as pd
import geopandas as gpd

Next, we'll read the CSV file into a pandas DataFrame, keep the column we need and create a GeoPandas GeoDataFrame with the correct coordinate reference system (EPSG:4326, that is, WGS 84 longitude and latitude).

df = pd.read_csv("data/2022-06-btp-street.csv")

crimes = gpd.GeoDataFrame(
    df["Crime type"],
    geometry = gpd.points_from_xy(df.Longitude, df.Latitude),
    crs = "EPSG:4326"
)

print(crimes.head(5))

The output should be similar to the following:

      Crime type                   geometry
0  Bicycle theft  POINT (-0.27160 50.83440)
1  Bicycle theft  POINT (-0.27160 50.83440)
2  Bicycle theft  POINT (-0.27160 50.83440)
3  Bicycle theft  POINT (-0.27160 50.83440)
4  Bicycle theft  POINT (-0.27160 50.83440)
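
gpd.points_from_xy needs both a longitude and a latitude for each row. We have not verified whether every row in the BTP file has them, so as a precaution the following sketch counts the rows where either field is empty. The function name count_missing_coordinates is hypothetical, and the code assumes that a missing coordinate appears as an empty field.

```python
import csv


def count_missing_coordinates(csv_path):
    """Return (total_rows, rows_missing_a_coordinate) for a street-level CSV."""
    total = 0
    missing = 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            total += 1
            # An absent column and an empty string both count as missing
            lon = (row.get("Longitude") or "").strip()
            lat = (row.get("Latitude") or "").strip()
            if not lon or not lat:
                missing += 1
    return total, missing
```

If the second number is not zero, one option is to call df.dropna(subset = ["Longitude", "Latitude"]) and use the result when building the GeoDataFrame.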

We'll now convert our geometry to H3 cell indexes using code from an excellent article. Initially, we'll set h3_level to 5 and then try a smaller value, which gives larger hexagons.

# Written against the h3-py 3.x API; 4.x renames geo_to_h3 to latlng_to_cell
from h3 import h3

h3_level = 5

# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/

def lat_lng_to_h3(row):
    return h3.geo_to_h3(
        row.geometry.y, row.geometry.x, h3_level
    )

crimes["h3"] = crimes.apply(lat_lng_to_h3, axis = 1)

print(crimes.head(5))

The output should be similar to the following:

      Crime type                   geometry               h3
0  Bicycle theft  POINT (-0.27160 50.83440)  85194a73fffffff
1  Bicycle theft  POINT (-0.27160 50.83440)  85194a73fffffff
2  Bicycle theft  POINT (-0.27160 50.83440)  85194a73fffffff
3  Bicycle theft  POINT (-0.27160 50.83440)  85194a73fffffff
4  Bicycle theft  POINT (-0.27160 50.83440)  85194a73fffffff

Next, we'll aggregate the number of crimes:

# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/

counts = (crimes.groupby(["h3"])
                .h3.agg("count")
                .to_frame("count")
                .reset_index()
)

print(counts.head(5))

The output should be similar to the following:

                h3  count
0  851870d3fffffff      4
1  851870dbfffffff      2
2  8518743bfffffff      2
3  85187463fffffff      1
4  8518746bfffffff      1
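
The aggregation amounts to counting how many rows share each H3 index. To make that concrete, here is a plain Python sketch of the same idea using collections.Counter. The function name count_per_cell is hypothetical, and the sketch is an illustration rather than a replacement for the pandas code.

```python
from collections import Counter


def count_per_cell(cell_ids):
    """Return (cell, count) pairs sorted by cell index."""
    # Counter tallies identical values; sorting mirrors the ordered
    # index that groupby produces by default
    return sorted(Counter(cell_ids).items())
```

Passing the values of the h3 column, for example count_per_cell(crimes["h3"]), should give the same pairs as the counts DataFrame.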

Now, we'll convert H3 to polygons that can be visualised:

# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/

from shapely.geometry import Polygon

def add_geometry(row):
    # True requests GeoJSON order: (lng, lat) pairs forming a closed ring
    points = h3.h3_to_geo_boundary(
        row["h3"], True
    )
    return Polygon(points)

counts["geometry"] = counts.apply(add_geometry, axis = 1)

print(counts.head(5))

The output should be similar to the following:

                h3  count                                 geometry
0  851870d3fffffff      4  POLYGON ((-5.3895450275705175 50.238...
1  851870dbfffffff      2  POLYGON ((-5.618770916484385 50.2157...
2  8518743bfffffff      2  POLYGON ((-4.6312146745728935 50.443...
3  85187463fffffff      1  POLYGON ((-5.0895660767735675 50.401...
4  8518746bfffffff      1  POLYGON ((-5.319050950902007 50.3800...
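
Note that each polygon starts with a longitude followed by a latitude. Shapely reads every pair as (x, y), and passing True to h3_to_geo_boundary requests that order, with the first vertex repeated at the end to close the ring. The hypothetical helper below imitates that conversion in plain Python for a list of (lat, lng) pairs. It is only a sketch of the idea, not the library's implementation.

```python
def to_geojson_ring(lat_lng_points):
    """Convert (lat, lng) pairs into a closed ring of (lng, lat) pairs."""
    # Swap each pair so that x = longitude and y = latitude
    ring = [(lng, lat) for lat, lng in lat_lng_points]
    # Close the ring by repeating the first vertex, unless already closed
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    return ring
```

Without the swap, the hexagons would be drawn with latitude on the horizontal axis and would land in the wrong place on the chart.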

We'll also ensure that we have the correct coordinate system:

crimes_h3 = gpd.GeoDataFrame(counts, crs = "EPSG:4326")

print(crimes_h3.head(5))

The output should be similar to the following:

                h3  count                                 geometry
0  851870d3fffffff      4  POLYGON ((-5.38955 50.23837, -5.4893...
1  851870dbfffffff      2  POLYGON ((-5.61877 50.21578, -5.7184...
2  8518743bfffffff      2  POLYGON ((-4.63121 50.44398, -4.7314...
3  85187463fffffff      1  POLYGON ((-5.08957 50.40182, -5.1896...
4  8518746bfffffff      1  POLYGON ((-5.31905 50.38001, -5.4190...

Finally, we'll plot the data:

btp_crimes = crimes_h3.plot(
    column = "count",
    cmap = "OrRd",
    edgecolor = "black",
    figsize = (7, 7),
    legend = True,
    legend_kwds = {
        "label" : "Number of crimes",
        "orientation" : "vertical"
    }
)

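# Hide the axis frame and ticks; the notebook then displays the figure.
# (Calling btp_crimes.plot() with no arguments would draw nothing extra.)
btp_crimes.set_axis_off()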

With h3_level set to 5, the code renders the chart shown in Figure 1.

Figure 1. h3_level = 5.

Changing the value of h3_level to 3 and re-running the code will render the chart shown in Figure 2.

Figure 2. h3_level = 3.
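
The hexagons are larger at level 3 because each step down in H3 resolution covers the globe with roughly seven times fewer cells. As a rough sketch of the scale involved, the code below uses the cell-count formula 2 + 120 × 7^resolution and divides an approximate Earth surface area by it. The function names and the surface-area constant are our own assumptions, and the result is an average rather than the size of any particular hexagon.

```python
EARTH_SURFACE_KM2 = 510_065_622  # approximate, treating the Earth as a sphere


def h3_cell_count(resolution):
    """Number of H3 cells covering the globe at a resolution from 0 to 15."""
    if not 0 <= resolution <= 15:
        raise ValueError("H3 resolutions run from 0 to 15")
    return 2 + 120 * 7 ** resolution


def average_cell_area_km2(resolution):
    """Average cell area in square kilometres at the given resolution."""
    return EARTH_SURFACE_KM2 / h3_cell_count(resolution)


# Compare the two levels used in this article
for level in (3, 5):
    print(level, h3_cell_count(level), round(average_cell_area_km2(level)))
```

This works out at an average of about 12,400 square kilometres per cell at level 3, against about 250 at level 5, which is why Figure 2 looks so much coarser than Figure 1.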

In both charts, the hexagons covering London and the South East record more BTP crimes than those covering other parts of the United Kingdom.

Summary

Using Uber's H3, we aggregated BTP crime data into hexagonal cells and charted the counts at two resolutions. H3 could be used in many different application domains. Feel free to experiment with different h3_level settings and also try your own dataset.
