DEV Community

Akmal Chaudhri for SingleStore

Posted on • Updated on

Quick tip: Using a SingleStoreDB Recursive CTE with London Underground data

Abstract

SingleStoreDB now supports Recursive CTEs. In this short article, we'll see an example of a Recursive CTE to find routes between stations on the London Underground Northern Line and rank the routes according to distance.

The SQL code used in this article is available in a GitHub Gist.

Introduction

In a great article about Recursive CTEs, there is an example of finding routes between different cities and ranking them according to distance. Let's apply that example to the London Underground. The London Underground network is extensive, and we'll focus our example on the Northern Line.

Create a SingleStoreDB Cloud account

A previous article showed the steps required to create a free SingleStoreDB Cloud account. We'll use RCTE Demo Group as our Workspace Group Name and rcte-demo as our Workspace Name.

Visualise the Northern Line

First, let's create a quick visualisation of the Northern Line. To do this, we can obtain the Station and Line data in GeoJSON format from an excellent website. We'll save the files as northern_line_stations.geojson and northern_line_connections.geojson, respectively.

Using Deepnote, we'll create a Python notebook and upload the two GeoJSON files into a data directory.

First, we'll import geopandas:

import geopandas as gpd
Enter fullscreen mode Exit fullscreen mode

and now read the files:

connections_df = gpd.GeoDataFrame.from_file(
  "data/northern_line_connections.geojson"
)

stations_df = gpd.GeoDataFrame.from_file(
  "data/northern_line_stations.geojson"
)
Enter fullscreen mode Exit fullscreen mode

Next, we'll import folium:

import folium

from folium import plugins
Enter fullscreen mode Exit fullscreen mode

and now create a map:

London = [51.509865, -0.118092]

m = folium.Map(location = London, tiles = "Stamen Terrain", zoom_start = 12)

folium.GeoJson(
  connections_df,
  name = "Northern Line",
  style_function = lambda feature: {
    "color" : "black",
    "weight" : 3
  }
).add_to(m)

folium.GeoJson(
  stations_df,
  name = "Northern Line Stations",
  popup = folium.GeoJsonPopup(fields = ["name"]),
  marker = folium.Marker(
    icon = folium.Icon(
    color = "black",
    icon = "train",
    icon_color = "white",
    prefix = "fa"
    )
  )
).add_to(m)

plugins.Fullscreen(
  position = "topright",
  title = "Fullscreen",
  title_cancel = "Exit",
  force_separate_button = True).add_to(m)
m
Enter fullscreen mode Exit fullscreen mode

We can also save the map:

m.save("northernline.html")
Enter fullscreen mode Exit fullscreen mode

The zoomed-out map should be similar to Figure 1.

Figure 1. Northern Line.

Figure 1. Northern Line.

If we zoom in on the central area, we can see that at Kennington, there are two northbound branches (Charing Cross Branch on the left and City Branch on the right), as shown in Figure 2.

Figure 2. Kennington.

Figure 2. Kennington.

Further north, these lines pass through Euston, as shown in Figure 3.

Figure 3. Euston.

Figure 3. Euston.

Therefore, we have two routes of different lengths from Kennington to Euston.

Northern Line Inter-Station Distance

We need data on the distance between pairs of stations to determine the lengths of the two routes. Under the UK Freedom of Information Act and Transport for London's (TfL's) information access policy, the distance between pairs of stations was provided in a spreadsheet (Inter Station Train Times.xls) in 2013. The data are extensive and missing some changes to the Underground Network since 2013. However, we can extract the subset of the data for the Northbound Northern Line suitable for our example. This is available in a GitHub Gist.

After executing our SQL code to create our table and insert the values, we'll modify the Recursive CTE example from the article mentioned earlier, as follows:

WITH RECURSIVE possible_route AS (
    SELECT sr.station_to,
           CONCAT (sr.station_from, '->', sr.station_to) AS route,
           sr.distance
    FROM stations_route sr
    WHERE sr.station_from = 'MORDEN'
UNION ALL
    SELECT sr.station_to,
           CONCAT (pr.route, '->', sr.station_to) AS route,
           pr.distance + sr.distance
    FROM possible_route pr
INNER JOIN stations_route sr
    ON sr.station_from = pr.station_to
)
SELECT pr.route,
       pr.distance
FROM possible_route pr
WHERE pr.station_to = 'EUSTON'
ORDER BY pr.distance;
Enter fullscreen mode Exit fullscreen mode

In our code, we want to find all routes from Morden (the most southern station on the Northern Line) to Euston. This will give us the following output:

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| route                                                                                                                                                                                                                                                                       | distance |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| MORDEN->SOUTH WIMBLEDON->COLLIERS WOOD->TOOTING BROADWAY->TOOTING BEC->BALHAM->CLAPHAM SOUTH->CLAPHAM COMMON->CLAPHAM NORTH->STOCKWELL->OVAL->KENNINGTON->WATERLOO->EMBANKMENT->CHARING CROSS->LEICESTER SQUARE->TOTTENHAM COURT ROAD->GOODGE STREET->WARREN STREET->EUSTON |    17.39 |
| MORDEN->SOUTH WIMBLEDON->COLLIERS WOOD->TOOTING BROADWAY->TOOTING BEC->BALHAM->CLAPHAM SOUTH->CLAPHAM COMMON->CLAPHAM NORTH->STOCKWELL->OVAL->KENNINGTON->ELEPHANT & CASTLE->BOROUGH->LONDON BRIDGE->BANK->MOORGATE->OLD STREET->ANGEL->KINGS CROSS->EUSTON                 |    20.03 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
Enter fullscreen mode Exit fullscreen mode

Another interesting query is to find routes from Morden to Camden Town, as two branches of the Northern Line also meet there. Substituting EUSTON with CAMDEN TOWN in our Recursive CTE will give us the following output:

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| route                                                                                                                                                                                                                                                                                                         | distance |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| MORDEN->SOUTH WIMBLEDON->COLLIERS WOOD->TOOTING BROADWAY->TOOTING BEC->BALHAM->CLAPHAM SOUTH->CLAPHAM COMMON->CLAPHAM NORTH->STOCKWELL->OVAL->KENNINGTON->WATERLOO->EMBANKMENT->CHARING CROSS->LEICESTER SQUARE->TOTTENHAM COURT ROAD->GOODGE STREET->WARREN STREET->EUSTON->MORNINGTON CRESCENT->CAMDEN TOWN |    18.85 |
| MORDEN->SOUTH WIMBLEDON->COLLIERS WOOD->TOOTING BROADWAY->TOOTING BEC->BALHAM->CLAPHAM SOUTH->CLAPHAM COMMON->CLAPHAM NORTH->STOCKWELL->OVAL->KENNINGTON->WATERLOO->EMBANKMENT->CHARING CROSS->LEICESTER SQUARE->TOTTENHAM COURT ROAD->GOODGE STREET->WARREN STREET->EUSTON->CAMDEN TOWN                      |    19.11 |
| MORDEN->SOUTH WIMBLEDON->COLLIERS WOOD->TOOTING BROADWAY->TOOTING BEC->BALHAM->CLAPHAM SOUTH->CLAPHAM COMMON->CLAPHAM NORTH->STOCKWELL->OVAL->KENNINGTON->ELEPHANT & CASTLE->BOROUGH->LONDON BRIDGE->BANK->MOORGATE->OLD STREET->ANGEL->KINGS CROSS->EUSTON->MORNINGTON CRESCENT->CAMDEN TOWN                 |    21.49 |
| MORDEN->SOUTH WIMBLEDON->COLLIERS WOOD->TOOTING BROADWAY->TOOTING BEC->BALHAM->CLAPHAM SOUTH->CLAPHAM COMMON->CLAPHAM NORTH->STOCKWELL->OVAL->KENNINGTON->ELEPHANT & CASTLE->BOROUGH->LONDON BRIDGE->BANK->MOORGATE->OLD STREET->ANGEL->KINGS CROSS->EUSTON->CAMDEN TOWN                                      |    21.75 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
Enter fullscreen mode Exit fullscreen mode

The results show the possible routes, which may also require changing trains.

Summary

This short article showed an example of a Recursive CTE that could be applied to a practical, real-world problem. The documentation contains further examples and discussion.

Top comments (0)