DEV Community

Max

Tube-torial: Mastering the London Underground API

I've visited London multiple times over the past few years, and every time I return, I’m struck by how deeply integrated the transit data is into the city's pulse. Between the pinpoint accuracy of apps like Citymapper and a custom-built LED arrival board tucked away in a local cafe, showing live timings for the nearest station, I started to wonder: how are people building these?

Until then, I’d never really looked at or cared much about transport APIs. But seeing those creative builds—especially the physical ones—sparked a genuine curiosity. I wanted to experiment and see if I could tap into that same stream of "city intelligence."

That curiosity led me to the TfL Unified API.

The Transport for London (TfL) API is the central nervous system of London’s infrastructure. It provides a massive, unified gateway into real-time data for the Tube, Buses, DLR, Overground, and even the Elizabeth Line. From live arrival predictions and station facilities to journey planning and network status, it’s a goldmine for developers.

In this tutorial, we’re going to stop wondering and start building. We will be using Python throughout this guide to interface with the API.

A Note on API Keys & Security:
Throughout this tutorial, we will write our Python scripts to function without hardcoding an API key. This is a best practice that allows you to share or publish your code (like on GitHub) without accidentally leaking your credentials. The TfL API allows a limited number of requests without a key, which is perfect for this tutorial. We will structure our code to use a key if provided, but default to safe, unauthenticated requests.
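
One way to keep a key out of the source file entirely is to read it from an environment variable. The sketch below is only an illustration of that idea: the helper name build_params and the variable name TFL_KEY are our own choices, not something the TfL API mandates. It builds the query-parameter dictionary and makes no network calls.

```python
import os

def build_params(extra=None, env_var="TFL_KEY"):
    """Return query parameters, adding app_key only when the env var is set.

    build_params and the TFL_KEY variable name are illustrative assumptions.
    """
    params = dict(extra or {})
    key = os.getenv(env_var)
    if key:  # unset or empty: stay unauthenticated
        params["app_key"] = key
    return params

if __name__ == "__main__":
    # Prints only the extra parameters unless TFL_KEY is set in your shell
    print(build_params({"modes": "tube"}))
```

With this approach the script behaves the same for everyone who clones it, and only people who export their own key get the higher allowance.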

Context: The 2024 Cyber Incident

It wasn't always smooth sailing. In September 2024, TfL was hit by a sophisticated cyber attack that forced them to isolate many internal systems to contain the breach. While the trains kept running and contactless payments continued to work, the digital nervous system—including the Oyster card back-end and, crucially for us, the live API feeds—went dark.

For weeks, the "live" aspect of the network vanished. I happened to visit London shortly after the incident, and the impact was impossible to ignore. Some of my favorite tools for navigating the city were suddenly unreliable. Features I usually take for granted—like live-tracking a bus route to decide whether to wait at the stop or find a spot to hide from the rain—simply didn't work at all.

It was a chaotic period that highlighted a critical lesson for developers: external APIs are a dependency, not a guarantee.

As of early 2026, the systems have been fully restored and hardened, and are running smoother than ever. However, the outage serves as a permanent reminder to always design your applications with resilience in mind: handling empty responses or downtime gracefully is just as important as parsing the data itself.
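
To make that lesson concrete, here is a minimal sketch of a retry-then-fallback wrapper. It is not part of the TfL API or of any library: fetch_with_fallback is a hypothetical helper, and fetch stands for any function of yours that returns parsed data or raises an error. Treating an empty result as a failure is an assumption that suits endpoints like line status; for arrivals, an empty list can be perfectly legitimate.

```python
import time

def fetch_with_fallback(fetch, retries=3, delay=1.0, fallback=None):
    """Call fetch() up to `retries` times; return `fallback` if every attempt fails.

    `fetch` is any zero-argument callable that returns parsed data or raises.
    An empty result (None, [] or {}) is treated as a failed attempt.
    """
    for attempt in range(retries):
        try:
            data = fetch()
            if data:
                return data
        except Exception as exc:  # deliberately broad for a sketch
            print(f"Attempt {attempt + 1} failed: {exc}")
        if attempt < retries - 1:
            time.sleep(delay * (2 ** attempt))  # simple exponential backoff
    return fallback

if __name__ == "__main__":
    def always_down():
        raise ConnectionError("API unreachable")

    # Falls back to an empty list after two failed attempts
    print(fetch_with_fallback(always_down, retries=2, delay=0, fallback=[]))
```

The fallback could be the last response you cached, so a display keeps showing slightly stale data instead of crashing.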

Getting Started: Accessing the API

To start pulling data, you can register on the TfL API Portal. The API is free and registration is optional for this tutorial, but unauthenticated requests are strictly rate-limited. Registering gets you an app_key, which increases your allowance significantly.

1. The Base URL

Every request we make will start with:
https://api.tfl.gov.uk

2. Your First Request

We’ll use the requests library in Python to talk to the API. If you haven't installed it yet, run:

pip install requests

To test if everything is working, we’ll hit the "Line Status" endpoint. Let's break this down step-by-step.

Step 2.1: Setup and Configuration

First, we import our library and define our constants. We set APP_KEY to None by default so the script runs safely without credentials, but the code is ready to accept one if you have it.

import requests

# We leave this empty for safety. If you have a key, put it here.
APP_KEY = None
BASE_URL = 'https://api.tfl.gov.uk'

Step 2.2: Making the Request

We define a function check_network_health that constructs the URL. We use a dictionary for params so we can conditionally add the API key only if it exists.

def check_network_health():
    url = f"{BASE_URL}/Line/Mode/tube/Status"

    # We only attach the key if it exists
    params = {}
    if APP_KEY:
        params['app_key'] = APP_KEY

    # A timeout stops the script hanging forever if the API is unresponsive
    response = requests.get(url, params=params, timeout=10)

Step 2.3: Handling the Response

The API returns a list of JSON objects, one for each line.

Sample JSON Snippet:

[
  {
    "name": "Bakerloo",
    "lineStatuses": [
      {
        "statusSeverity": 10,
        "statusSeverityDescription": "Good Service"
      }
    ]
  },
  ...
]

We iterate through them to find the lineStatuses.

Note: The API can return multiple statuses for a single line (e.g., "Minor Delays" AND "Part Suspended"). For simplicity, we grab the first one ([0]) which is usually the primary status.

    if response.status_code == 200:
        data = response.json()
        for line in data:
            # We grab the description of the first status in the list
            status = line['lineStatuses'][0]['statusSeverityDescription']
            print(f"{line['name']}: {status}")
    else:
        print(f"Error: {response.status_code}")
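
If you would rather show every status instead of only the first, a small helper can join them. This is a sketch under the assumption that each entry in lineStatuses carries a statusSeverityDescription, as in the sample above; summarize_statuses is a hypothetical name, and the sample data is made up for illustration.

```python
def summarize_statuses(line):
    """Join every distinct status description for one line object."""
    seen = []
    for entry in line.get("lineStatuses", []):
        desc = entry.get("statusSeverityDescription", "Unknown")
        if desc not in seen:  # keep first-seen order, drop duplicates
            seen.append(desc)
    return " / ".join(seen) if seen else "No status reported"

if __name__ == "__main__":
    # Made-up sample shaped like one element of the API response
    sample = {
        "name": "Central",
        "lineStatuses": [
            {"statusSeverityDescription": "Minor Delays"},
            {"statusSeverityDescription": "Part Suspended"},
        ],
    }
    print(f"{sample['name']}: {summarize_statuses(sample)}")
    # prints "Central: Minor Delays / Part Suspended"
```

It also copes with an empty lineStatuses list, which the [0] shortcut would not.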

Full Code

Putting it all together, here is your first working script:

File: tube_status.py

import requests

# For this tutorial, we leave this empty to remain safe for version control.
# If you have a key, you can add it here (e.g. 'YOUR_KEY') or use os.getenv('TFL_KEY')
APP_KEY = None
BASE_URL = 'https://api.tfl.gov.uk'

def check_network_health():
    url = f"{BASE_URL}/Line/Mode/tube/Status"

    # We only attach the key if it exists
    params = {}
    if APP_KEY:
        params['app_key'] = APP_KEY

    response = requests.get(url, params=params, timeout=10)

    if response.status_code == 200:
        data = response.json()
        for line in data:
            status = line['lineStatuses'][0]['statusSeverityDescription']
            print(f"{line['name']}: {status}")
    else:
        print(f"Error: {response.status_code}")

if __name__ == "__main__":
    check_network_health()

When you run this, you should see a list of all Tube lines and their current status (hopefully "Good Service").

The Lay of the Land: API Structure

Before we dive deeper, it helps to understand how TfL organizes its massive dataset. The API isn't just a flat list of commands; it's structured around a few key "Controllers" that represent physical or logical parts of the network.

Understanding these four pillars will help you navigate the documentation:

1. Line (/Line)

The Arteries. This controller handles everything related to routes.

  • What it does: Status (delays), Route sequences (ordered list of stations), Timetables, and Disruptions.
  • Key ID: lineId (e.g., bakerloo, northern, victoria).

2. StopPoint (/StopPoint)

The Nodes. This is arguably the most important controller for location-based apps.

  • What it does: Represents physical locations (Stations, Bus stops, Piers). It handles Live Arrivals, station facilities (lifts, toilets), and location search.
  • Key ID: naptanId (e.g., 940GZZLUOXC for Oxford Circus).
  • Note: NaPTAN stands for "National Public Transport Access Nodes". It's a UK-wide standard.

3. Journey (/Journey)

The Navigator. The brain behind the "Plan my route" feature.

  • What it does: Calculates the best path between two points (coordinates or NaPTAN IDs), considering current disruptions and walking times.

4. Mode (/Mode)

The Filter. High-level categorization.

  • What it does: Lists active modes (tube, bus, river-bus, cable-car) and their overall status.

In the next steps, we will focus on StopPoints: first finding a station's ID, then asking for its live arrivals.
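
Because every controller hangs off the same base URL, a tiny URL builder covers all of them. The helper below is a hypothetical convenience, not something the API provides; it only joins and percent-encodes path segments, leaving commas intact so that several modes can be passed at once. DEST_NAPTAN_ID is a placeholder, not a real station ID.

```python
from urllib.parse import quote

BASE_URL = "https://api.tfl.gov.uk"

def endpoint(*segments):
    """Join path segments onto BASE_URL, percent-encoding each one."""
    path = "/".join(quote(str(s), safe=",") for s in segments)
    return f"{BASE_URL}/{path}"

if __name__ == "__main__":
    # Line controller: status of a single line
    print(endpoint("Line", "victoria", "Status"))
    # Line controller: status of several modes at once
    print(endpoint("Line", "Mode", "tube,dlr", "Status"))
    # StopPoint controller: live arrivals for Oxford Circus
    print(endpoint("StopPoint", "940GZZLUOXC", "Arrivals"))
    # Journey controller: DEST_NAPTAN_ID is a placeholder for a second station
    print(endpoint("Journey", "JourneyResults", "940GZZLUOXC", "to", "DEST_NAPTAN_ID"))
```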

3. Finding Your Way: Station Search

To get useful live data (like when the next train is coming), you can't just ask for "Oxford Circus". The API relies on unique identifiers called NaPTAN IDs.

We need a way to translate a human name into this ID. The /StopPoint/Search endpoint does exactly this.

Step 3.1: The Search Request

We filter our search by modes='tube' to avoid getting results for bus stops or river piers that might share a similar name.

def search_station(query):
    url = f"{BASE_URL}/StopPoint/Search"

    params = {
        'query': query,
        'modes': 'tube'
    }
    # Attach the key only if one was provided
    if APP_KEY:
        params['app_key'] = APP_KEY

    response = requests.get(url, params=params, timeout=10)

Step 3.2: Parsing Matches

The API returns a JSON object containing a matches list.

Sample JSON Snippet:

{
  "matches": [
    {
      "name": "Oxford Circus Underground Station",
      "id": "940GZZLUOXC",
      "modes": ["tube"]
    }
  ]
}

Each match provides the name (human readable) and the id (machine readable).

    if response.status_code == 200:
        data = response.json()
        matches = data.get('matches', [])

        for match in matches:
            print(f"Name: {match['name']}")
            print(f"ID:   {match['id']}")
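
A search can return several matches, so a script that needs a single ID has to choose one. The following is one possible heuristic rather than anything the API prescribes: pick_station_id is a hypothetical helper that prefers an exact name, then a name starting with the query, then simply the first result. ID_A and ID_B are placeholder IDs for the example.

```python
def pick_station_id(matches, query):
    """Return the id of the best match for `query`, or None if there is none.

    Preference order: exact name, name starting with the query, first result.
    """
    if not matches:
        return None
    wanted = query.strip().lower()
    for match in matches:
        if match.get("name", "").lower() == wanted:
            return match["id"]
    for match in matches:
        if match.get("name", "").lower().startswith(wanted):
            return match["id"]
    return matches[0]["id"]

if __name__ == "__main__":
    # Placeholder data shaped like the 'matches' list
    sample = [
        {"name": "Some Other Station", "id": "ID_A"},
        {"name": "Oxford Circus Underground Station", "id": "ID_B"},
    ]
    print(pick_station_id(sample, "oxford circus"))  # prints "ID_B"
```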

Full Code

Here is a script that asks for a name and returns the ID.

File: station_search.py

import requests

APP_KEY = None
BASE_URL = 'https://api.tfl.gov.uk'

def search_station(query):
    url = f"{BASE_URL}/StopPoint/Search"

    params = {
        'query': query,
        'modes': 'tube'
    }
    if APP_KEY:
        params['app_key'] = APP_KEY

    response = requests.get(url, params=params, timeout=10)

    if response.status_code == 200:
        data = response.json()
        matches = data.get('matches', [])

        if not matches:
            print(f"No stations found for '{query}'")
            return

        print(f"Found {len(matches)} results:")
        for match in matches:
            print(f"  Name: {match['name']}")
            print(f"  ID:   {match['id']}")
            print("-" * 20)
    else:
        print(f"Error: {response.status_code}")

if __name__ == "__main__":
    user_input = input("Enter station name to search: ")
    search_station(user_input)

Try it out: Run the script and type Oxford Circus. You should see the ID 940GZZLUOXC. Save this ID; we'll need it for the next step.

4. The Main Event: Live Arrivals

This is what we came for. Now that we have a Station ID (like 940GZZLUOXC for Oxford Circus), we can tap into the live signaling data to see every train heading towards that station.

Endpoint: /StopPoint/{id}/Arrivals

Step 4.1: The Request

Unlike previous endpoints, this one is specific to a StopPoint.

def get_live_arrivals(station_id):
    url = f"{BASE_URL}/StopPoint/{station_id}/Arrivals"
    # Same optional-key pattern as before
    params = {'app_key': APP_KEY} if APP_KEY else {}
    response = requests.get(url, params=params, timeout=10)

Step 4.2: Understanding the Response

The API returns a list of JSON objects. Each object represents a single train currently in the signaling system heading toward your station. Here are the most important fields:

  • lineName: The human name of the line (e.g., "Victoria").
  • destinationName: Where the train is going (e.g., "Brixton").
  • timeToStation: The most critical field—seconds until arrival.
  • currentLocation: A string describing where the train is now (e.g., "At Green Park").

Sample JSON Snippet:

{
    "lineName": "Victoria",
    "destinationName": "Brixton",
    "timeToStation": 124,
    "currentLocation": "At Green Park",
    "platformName": "Southbound - Platform 4"
}
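
To see how those fields combine, here is a small sketch that turns one arrival object into a display line. describe_arrival is a hypothetical helper, and the defaults for missing fields are an assumption: we treat destinationName and currentLocation as optional, to be on the safe side.

```python
def describe_arrival(arrival):
    """Turn one arrival object into a single display line."""
    minutes = arrival["timeToStation"] // 60  # seconds to whole minutes
    when = "Due" if minutes == 0 else f"{minutes} min"
    destination = arrival.get("destinationName", "Unknown destination")
    location = arrival.get("currentLocation") or "location unknown"
    return f"{arrival['lineName']} to {destination}: {when} ({location})"

if __name__ == "__main__":
    sample = {
        "lineName": "Victoria",
        "destinationName": "Brixton",
        "timeToStation": 124,
        "currentLocation": "At Green Park",
        "platformName": "Southbound - Platform 4",
    }
    print(describe_arrival(sample))
    # prints "Victoria to Brixton: 2 min (At Green Park)"
```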

Step 4.3: Sorting the Chaos

The API returns all predicted arrivals for that station in no guaranteed order. To make a useful board, we must sort them by timeToStation (which is in seconds).

    arrivals = response.json()
    # Sort by time (ascending)
    sorted_arrivals = sorted(arrivals, key=lambda x: x['timeToStation'])
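
A busy interchange mixes several lines and directions in one list, so a real board usually groups trains by platform as well as sorting them. The sketch below is one way to do that, assuming each arrival has the platformName field shown in the sample; next_trains_by_platform is a hypothetical helper, and the platform names in the demo are shortened placeholders.

```python
from collections import defaultdict

def next_trains_by_platform(arrivals, limit=3):
    """Group arrivals by platformName, soonest first, keeping `limit` per platform."""
    boards = defaultdict(list)
    for arrival in sorted(arrivals, key=lambda a: a["timeToStation"]):
        platform = arrival.get("platformName", "Unknown platform")
        if len(boards[platform]) < limit:
            boards[platform].append(arrival)
    return dict(boards)

if __name__ == "__main__":
    # Placeholder arrivals with only the fields this helper reads
    sample = [
        {"platformName": "P1", "timeToStation": 300},
        {"platformName": "P2", "timeToStation": 60},
        {"platformName": "P1", "timeToStation": 120},
    ]
    for platform, trains in next_trains_by_platform(sample).items():
        times = [t["timeToStation"] for t in trains]
        print(platform, times)
```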

Full Code

This script hardcodes the ID for Oxford Circus, but you can swap it with any ID you found in the previous step.

File: live_arrivals.py

import requests

APP_KEY = None
BASE_URL = 'https://api.tfl.gov.uk'

def get_live_arrivals(station_id):
    # Endpoint: /StopPoint/{id}/Arrivals
    url = f"{BASE_URL}/StopPoint/{station_id}/Arrivals"

    params = {}
    if APP_KEY:
        params['app_key'] = APP_KEY

    response = requests.get(url, params=params, timeout=10)

    if response.status_code == 200:
        arrivals = response.json()

        if not arrivals:
            print("No arrivals predicted at this moment.")
            return

        # Sort arrivals by time (ascending)
        # 'timeToStation' is in seconds
        sorted_arrivals = sorted(arrivals, key=lambda x: x['timeToStation'])

        print(f"Live Arrivals for Station ID: {station_id}")
        print(f"{'Line':<15} {'Destination':<25} {'Time':<10}")
        print("-" * 50)

        for train in sorted_arrivals:
            line_name = train['lineName']
            destination = train['destinationName']
            # Convert seconds to minutes for display
            minutes = train['timeToStation'] // 60

            # Formatting "0 mins" as "Due" is a nice touch
            time_str = "Due" if minutes == 0 else f"{minutes} min"

            print(f"{line_name:<15} {destination:<25} {time_str:<10}")

    else:
        print(f"Error: {response.status_code}")

if __name__ == "__main__":
    # Oxford Circus ID (from our previous step)
    STATION_ID = "940GZZLUOXC"
    get_live_arrivals(STATION_ID)

What you see:
When you run this, you'll get a real-time board showing trains on the Central, Victoria, and Bakerloo lines arriving at Oxford Circus.

5. Beyond the Basics: Advanced Capabilities

Once you’ve mastered line status, station searching, and arrivals, you’ve really only scratched the surface. The TfL Unified API is massive, covering everything from the environmental health of the city to the micro-movements of rental bikes.

Showcase: Crowding & Occupancy

One of the most powerful "hidden" features is the ability to predict how busy a station will be. While public data is often based on historical profiles rather than real-time camera feeds, it is incredibly useful for building applications that help users avoid the rush.

Here is a script that fetches the typical crowding profile for a station. We use the /Crowding endpoint to see the passenger load throughout the day.

File: check_crowds.py

import requests

APP_KEY = None
BASE_URL = 'https://api.tfl.gov.uk'

def check_station_crowds(station_id):
    url = f"{BASE_URL}/Crowding/{station_id}"
    params = {'app_key': APP_KEY} if APP_KEY else {}

    response = requests.get(url, params=params, timeout=10)

    if response.status_code == 200:
        data = response.json()
        print(f"Crowding Profile for Station: {station_id}")

        if 'daysOfWeek' in data:
            day_data = data['daysOfWeek'][0]
            print(f"Typical Crowding on {day_data['dayOfWeek']}:")
            for slot in day_data['timeBands'][:5]:
                busy_percent = slot['percentageOfBaseLine'] * 100
                print(f"  {slot['timeBand']}: {busy_percent:.1f}% load")
        else:
            print("No crowd profile data available.")
    else:
        print(f"Error: {response.status_code}")

if __name__ == "__main__":
    STATION_ID = "940GZZLUOXC" # Oxford Circus
    check_station_crowds(STATION_ID)
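
Printing the first five slots is a start, but the number most people want is the peak. Here is a minimal sketch that finds the busiest time band in one day's profile, assuming the same timeBands and percentageOfBaseLine fields the script above reads. busiest_time_band is a hypothetical helper, and the figures in the demo are made up.

```python
def busiest_time_band(day_data):
    """Return (timeBand, percentage) for the busiest slot of one day, or None."""
    bands = day_data.get("timeBands", [])
    if not bands:
        return None
    peak = max(bands, key=lambda slot: slot["percentageOfBaseLine"])
    return peak["timeBand"], peak["percentageOfBaseLine"] * 100

if __name__ == "__main__":
    # Made-up figures shaped like one entry of 'daysOfWeek'
    sample_day = {
        "dayOfWeek": "MON",
        "timeBands": [
            {"timeBand": "08:00-08:15", "percentageOfBaseLine": 0.5},
            {"timeBand": "17:30-17:45", "percentageOfBaseLine": 0.75},
            {"timeBand": "22:00-22:15", "percentageOfBaseLine": 0.25},
        ],
    }
    band, percent = busiest_time_band(sample_day)
    print(f"Busiest on {sample_day['dayOfWeek']}: {band} ({percent:.1f}% load)")
```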

Other Realms to Explore

The API spans far beyond just the Underground. If you want to take your project further, look into these controllers:

  • AirQuality (/AirQuality): Live data on pollution levels across the city.
  • BikePoint (/BikePoint): Real-time availability of "Boris Bikes" and their docking stations.
  • Journey (/Journey): The full routing engine. Ask for a path from A to B, and it will calculate every leg, including walking times and transfers.
  • Road (/Road): Status and disruptions on major London roads and traffic corridors.
  • Vehicle (/Vehicle): Tracking specific buses or trains as they move across the map.

The unified nature of the API means that once you understand how to call /Line or /StopPoint, you already know how to call these as well.

6. Conclusion: Build Something Beautiful

The goal of this tutorial wasn't just to print text to a console; it was to hand you the keys to the city. You now have the ability to read the pulse of London in real-time.

Where to go from here?

Don't let the code sit in a folder. Use it to build something tangible. Here are a few ideas to get you started:

  1. The "Commute Mirror": A Raspberry Pi project that displays the status of your specific line on a smart mirror while you brush your teeth.
  2. The "Leave Now" Button: A physical button on your desk that, when pressed, checks the next 3 trains and flashes green if you need to run, or yellow if you have time for another coffee.
  3. The "Rainy Day" Router: A journey planner that prioritizes underground routes (Tube) over surface ones (Bus/Walk) when the weather API reports rain.
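
As a starting point for the "Leave Now" button, the decision logic can be sketched as a pure function over the arrivals list. Everything here is an assumption made for illustration: the helper name leave_now_signal, the walking time to the platform, and the 60-second slack that separates "run" from "relax". The hardware side (button, LEDs) is left out.

```python
def leave_now_signal(arrivals, walk_seconds, slack_seconds=60, check=3):
    """Return 'green' (leave now), 'yellow' (time to spare) or 'none'.

    Looks at the next `check` trains and finds the first one you can still
    catch given `walk_seconds` to reach the platform.
    """
    next_times = sorted(a["timeToStation"] for a in arrivals)[:check]
    catchable = [t for t in next_times if t >= walk_seconds]
    if not catchable:
        return "none"  # none of the next trains is reachable
    margin = catchable[0] - walk_seconds
    return "green" if margin <= slack_seconds else "yellow"

if __name__ == "__main__":
    # Placeholder arrivals: trains in 100 s, 400 s and 900 s
    sample = [{"timeToStation": 400}, {"timeToStation": 100}, {"timeToStation": 900}]
    print(leave_now_signal(sample, walk_seconds=300))  # prints "yellow"
    print(leave_now_signal(sample, walk_seconds=360))  # prints "green"
```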

The API is free, the data is live, and the possibilities are endless.

Go forth and code.


This tutorial is for educational purposes. Data provided by Transport for London.
