There is a small, very specific kind of joy in the first time you pull a live vessel position into a Python script. You type print(response), and there it is — a container ship, somewhere off Singapore, moving at 12.4 knots on a heading of 087. You can see it. You didn't have to be there. You didn't have to own a radio, or climb a tower, or charter a boat. You just asked, politely, over HTTPS.
It feels like cheating. It isn't. But it is built on top of something genuinely strange, and the more you work with AIS data, the stranger it gets.
What you think AIS is
Most people, if they've heard of AIS at all, have heard of it through MarineTraffic or a similar map. They think of it as "ship GPS" — vessels reporting where they are, the way your phone reports its location to Google. Reasonable assumption. Largely wrong.
AIS — the Automatic Identification System — is a VHF radio protocol. Ships shout, on two specific frequencies (161.975 and 162.025 MHz), short binary messages containing their position, speed, heading, and identity. They do this every few seconds when underway, less often when anchored. Anyone within radio range — other ships, coastal stations, satellites passing overhead — can listen. There is no central server. There is no authentication. There is no encryption. It is, in spirit, closer to ham radio than to a database.
The global "AIS feed" you eventually consume in Python is what happens when thousands of those listening stations pool what they hear into something that looks, from the outside, like a single source of truth. It isn't. It's a noisy, overlapping, partially-redundant patchwork. Once you know that, a lot of the weird edges of working with AIS data start to make sense.
It's also worth knowing why the big ships are reliably visible: SOLAS Chapter V requires Class A AIS on all ships of 300 gross tonnage and upwards on international voyages, on cargo ships of 500 gross tonnage and upwards on domestic ones, and on all passenger ships regardless of size. Coverage of the world's commercial fleet isn't a happy accident; it's a treaty obligation.
The shape of an AIS message
AIS (per ITU-R M.1371) defines message types numbered up to 27, though only around two dozen are in active use. You'll spend almost all your time with a handful:
- Type 1, 2, 3 — Position reports from Class A transceivers (cargo, tankers, passenger ships).
- Type 5 — Static and voyage data: ship name, IMO number, destination, ETA, dimensions.
- Type 18, 19 — Position reports from Class B transceivers (smaller commercial and pleasure craft).
- Type 24 — Static data for Class B vessels.
The split matters because static data is transmitted separately and far less frequently than position data. A Class A vessel underway broadcasts a position report as often as every 2 seconds (when manoeuvring at high speed) and as rarely as every 3 minutes (at anchor); Type 5 static data is on a fixed 6-minute cycle. When you query an API and get back a single record with both position and name, someone — usually the data provider — has done the work of joining them by MMSI, the nine-digit Maritime Mobile Service Identity that uniquely identifies each transmitter.
This is the first lesson: MMSI is the join key for the entire maritime world. Not the ship's name (which can change), not the IMO number (which not every vessel has), not the call sign. MMSI. Memorize this and a lot of code suddenly writes itself.
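To make the join concrete, here is a minimal sketch of what a provider is doing on your behalf. The record shapes are invented for illustration (plain dicts with hypothetical keys and a made-up ship name), not any particular API's schema.

```python
# Hypothetical, already-decoded messages; field names are illustrative only.
static_msgs = [
    {"mmsi": 477553000, "name": "EXAMPLE TRADER", "destination": "ROTTERDAM"},
]
position_msgs = [
    {"mmsi": 477553000, "lat": 47.5828, "lon": -122.3458, "sog": 0.0},
    {"mmsi": 235000001, "lat": 51.95, "lon": 4.05, "sog": 8.2},
]


def join_by_mmsi(positions, statics):
    """Attach the most recent static record to each position report."""
    static_by_mmsi = {s["mmsi"]: s for s in statics}  # later entries win
    joined = []
    for pos in positions:
        # Static data may be missing if no Type 5/24 message has been
        # heard for this MMSI yet, so fall back to an empty dict.
        static = static_by_mmsi.get(pos["mmsi"], {})
        joined.append({**static, **pos})
    return joined


records = join_by_mmsi(position_msgs, static_msgs)
for rec in records:
    print(rec["mmsi"], rec.get("name", "<no static data yet>"), rec["lat"], rec["lon"])
```

The second record has a position but no name. That is the normal state of affairs for a vessel you have only just started hearing, because the static message arrives on its own slower schedule.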
There's a footnote here that has cost me hours: not every MMSI belongs to a ship. For vessel MMSIs, the first three digits (the MID, or Maritime Identification Digits) encode the country of the radio licence authority. But certain prefixes reserve the MMSI for non-vessel use: 970 is an AIS-SART (search and rescue transmitter), 972 is a man-overboard beacon, 974 is an EPIRB, and 99X is a navigational aid such as a lighthouse or a buoy. If you don't filter these out, your "vessel list" will quietly include lighthouses claiming to be ships from country 99X, and distress beacons from country 970. Ask me how I know.
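A filter for this can be very small. The sketch below uses the prefixes listed above, plus one fact I'm adding: real MIDs start with a digit from 2 to 7, so anything else is not an ordinary ship station. The function name and the category labels are my own, and the "other" bucket lumps together several things (coast stations, aircraft and so on) that you may want to split out.

```python
def mmsi_kind(mmsi):
    """Rough classification of an MMSI by its leading digits."""
    s = str(mmsi).zfill(9)  # restore leading zeros lost by int conversion
    if len(s) != 9 or not s.isdigit():
        return "invalid"
    if s.startswith("970"):
        return "ais-sart"
    if s.startswith("972"):
        return "man-overboard"
    if s.startswith("974"):
        return "epirb"
    if s.startswith("99"):
        return "aid-to-navigation"
    if s[0] in "234567":  # MIDs run from 2xx to 7xx
        return "vessel"
    return "other"


candidates = [477553000, "970012345", 992351000, "12ab"]
vessels_only = [m for m in candidates if mmsi_kind(m) == "vessel"]
print(vessels_only)
```

Run it over whatever your data source returns before you count anything as a ship.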
Getting data into Python
You have three realistic options. You can run your own VHF receiver (charming, limited range, mostly a hobby project). You can buy a raw NMEA feed from an aggregator and parse it yourself with a library like pyais (powerful, but you're now in the business of decoding 6-bit ASCII-armoured binary). Or you can hit a REST API that has done all of the above for you.
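For a sense of what "6-bit ASCII-armoured binary" means in practice, here is a hand-rolled sketch that decodes one single-fragment Class A position report. It is deliberately incomplete: it ignores the checksum, multi-fragment messages, and the "not available" sentinel values, all of which a real library such as pyais handles for you. The bit offsets follow the Type 1/2/3 layout in ITU-R M.1371, and the sentence is a widely circulated sample, not live data.

```python
def decode_payload_bits(payload):
    """Undo the 6-bit ASCII armouring: each character carries 6 bits."""
    bits = []
    for ch in payload:
        value = ord(ch) - 48
        if value > 40:
            value -= 8
        bits.append(format(value, "06b"))
    return "".join(bits)


def uint(bits, start, length):
    """Unsigned integer from a slice of the bit string."""
    return int(bits[start:start + length], 2)


def sint(bits, start, length):
    """Two's-complement signed integer from a slice of the bit string."""
    raw = uint(bits, start, length)
    if raw >= 1 << (length - 1):
        raw -= 1 << length
    return raw


def decode_position_report(sentence):
    """Decode a single-fragment Type 1/2/3 AIVDM sentence (no checksum check)."""
    fields = sentence.split(",")
    if int(fields[1]) != 1:
        raise ValueError("multi-fragment sentences are not handled here")
    bits = decode_payload_bits(fields[5])
    msg_type = uint(bits, 0, 6)
    if msg_type not in (1, 2, 3):
        raise ValueError(f"not a Class A position report: type {msg_type}")
    return {
        "type": msg_type,
        "mmsi": uint(bits, 8, 30),
        "nav_status": uint(bits, 38, 4),
        "sog_knots": uint(bits, 50, 10) / 10,    # tenths of a knot
        "lon": sint(bits, 61, 28) / 600000,      # 1/10000 minute units
        "lat": sint(bits, 89, 27) / 600000,
        "cog_deg": uint(bits, 116, 12) / 10,     # tenths of a degree
        "heading_deg": uint(bits, 128, 9),
    }


sample = "!AIVDM,1,1,,B,177KQJ5000G?tO`K>RA1wUbN0TKH,0*5C"
report = decode_position_report(sample)
print(report)
```

The sample decodes to MMSI 477553000, moored (navigation status 5), at roughly 47.5828 N, 122.3458 W. Forty lines for one message type out of 27 is a fair indication of how much work an API is saving you.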
Unless you're specifically studying the protocol, hit the API. Life's too short. Here's what a minimal session looks like with the VesselAPI Python SDK, installed with pip install vessel-api-python (verify method signatures against the current docs, because SDKs drift):
from vessel_api_python import VesselClient

client = VesselClient(api_key="your_key_here")

# Find a specific vessel by MMSI (this one is illustrative; use your own)
vessel = client.vessels.get("111111111", filter_id_type="mmsi").vessel
print(vessel.name, vessel.vessel_type, vessel.imo)

# Static and position data are separate AIS messages, so they are
# separate calls here too
pos = client.vessels.position("111111111", filter_id_type="mmsi").vessel_position
print(pos.latitude, pos.longitude, pos.sog)

# Or search by name
results = client.search.vessels(filter_name="EVER GIVEN")
for v in results.vessels or []:
    print(v.mmsi, v.name, v.country)
If you'd rather see the raw wire format, the equivalent in plain requests is about six lines, and worth writing once:
import requests

r = requests.get(
    "https://api.vesselapi.com/v1/vessel/111111111",
    params={"filter.idType": "mmsi"},
    headers={"Authorization": "Bearer your_key_here"},
)
print(r.json())  # static data; the position lives at /v1/vessel/{id}/position
The SDK gives you typed objects, retries on 5xx and 429s, and pagination handled for you. Use whichever feels less like work.
The things that will surprise you
Positions are stale more often than you think. A vessel mid-Pacific can be out of range of every terrestrial AIS receiver. Satellite coverage helps but isn't continuous. A single polar-orbiting satellite completes an orbit every 90-odd minutes, but any given patch of ocean is only in its narrow listening window for a few minutes per pass. Modern constellations — Spire alone operates around 100 satellites — have squeezed typical ocean revisit times down to 15–30 minutes, but congested waters introduce a new problem: slot collision at the satellite, where so many ships are transmitting simultaneously that the receiver loses messages. When an API returns timestamp: 2024-03-14T07:12:00Z on a position, that field is doing real work. Always check it. A "current" position can legitimately be six hours old.
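Checking the timestamp is a two-function job. This sketch assumes the timestamp arrives as an ISO 8601 string in UTC, like the example above. The function names and the 15-minute default are my own choices, and a timestamp with no timezone at all would need handling that isn't shown here.

```python
from datetime import datetime, timedelta, timezone


def position_age(timestamp, now=None):
    """Return how old an ISO 8601 UTC timestamp is, as a timedelta."""
    now = now or datetime.now(timezone.utc)
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 onwards,
    # so normalise it to an explicit offset first.
    reported = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return now - reported


def is_fresh(timestamp, max_age=timedelta(minutes=15), now=None):
    """True if the position is no older than max_age."""
    return position_age(timestamp, now) <= max_age


# A fixed "now" keeps the example reproducible.
fixed_now = datetime(2024, 3, 14, 13, 12, tzinfo=timezone.utc)
print(position_age("2024-03-14T07:12:00Z", now=fixed_now))  # 6:00:00
print(is_fresh("2024-03-14T07:12:00Z", now=fixed_now))      # False
```

What counts as fresh depends on where the ship is. Fifteen minutes is strict for mid-ocean and generous for a busy harbour.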
Destinations are a free-text field. The destination shown in a Type 5 message is whatever the crew typed into the transceiver. It might be "ROTTERDAM." It might be "RTM." It might be "FOR ORDERS" or "TBN" (to be nominated) or, occasionally, profanity. Do not parse it as structured data. Treat it as a hint.
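MMSIs change, and get reused. When a ship is re-flagged to a new country, it gets a new MMSI, and a retired MMSI can later be assigned to a different vessel. MMSI is the right key for joining live messages, but if you're building a long-term tracking system, you cannot assume it is stable across a vessel's lifetime. Where a vessel has an IMO number, use that as the long-lived identity. IMO numbers are stable. MMSIs are not.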
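If you are going to lean on the IMO number as the long-lived key, it helps to validate it first, because it is also a crew-entered field. An IMO number is seven digits where the last is a check digit: multiply the first six by 7, 6, 5, 4, 3, 2, sum them, and take the final digit of the total. The sketch below does that and falls back to MMSI when no valid IMO is present. The record layout and the function names are hypothetical.

```python
def is_valid_imo(imo):
    """Check the seven-digit IMO number check digit."""
    text = str(imo).upper().replace("IMO", "").strip()
    if len(text) != 7 or not text.isdigit():
        return False
    # Weights 7, 6, 5, 4, 3, 2 applied to the first six digits.
    weighted = sum(int(d) * w for d, w in zip(text[:6], range(7, 1, -1)))
    return weighted % 10 == int(text[6])


def vessel_key(record):
    """Prefer a valid IMO number as the identity; otherwise use the MMSI."""
    imo = record.get("imo")
    if imo and is_valid_imo(imo):
        return ("imo", int(str(imo).upper().replace("IMO", "").strip()))
    return ("mmsi", int(record["mmsi"]))


print(is_valid_imo(9074729))   # True: 63+0+35+16+21+4 = 139, last digit 9
print(is_valid_imo(9074728))   # False
print(vessel_key({"imo": "IMO 9074729", "mmsi": 235000001}))
print(vessel_key({"imo": None, "mmsi": "235000001"}))
```

Keying history on the tuple rather than on a bare number keeps IMO-keyed and MMSI-keyed vessels from colliding in the same dictionary.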
Class B is sparser than Class A. Standard Class B transmits every 30 seconds when making way (≥2 knots) and every 3 minutes when slow or stopped — compared to Class A's 2–10 second cadence underway. Don't build interpolation logic that assumes the same data density as a container ship. And note: many larger fishing vessels are actually required to carry Class A, so "small boat = Class B" isn't reliable.
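One way to keep interpolation honest is to scale the allowed gap to the transmitter class. This sketch uses only the intervals quoted above, simplified to the slowest underway cadence for Class A. The function names and the tolerance factor of 4 are arbitrary choices of mine, and you need to know the class from elsewhere (for example from the message type).

```python
from datetime import timedelta


def nominal_interval(ais_class, sog_knots, at_anchor=False):
    """Longest nominal gap between position reports, per the figures above."""
    if ais_class == "A":
        return timedelta(minutes=3) if at_anchor else timedelta(seconds=10)
    if ais_class == "B":
        return timedelta(seconds=30) if sog_knots >= 2 else timedelta(minutes=3)
    raise ValueError(f"unknown AIS class: {ais_class!r}")


def safe_to_interpolate(gap, ais_class, sog_knots, at_anchor=False, tolerance=4):
    """True if the gap between two fixes is within `tolerance` nominal intervals."""
    return gap <= tolerance * nominal_interval(ais_class, sog_knots, at_anchor)


gap = timedelta(seconds=90)
print(safe_to_interpolate(gap, "B", sog_knots=8.0))  # True: 90 s <= 4 x 30 s
print(safe_to_interpolate(gap, "A", sog_knots=8.0))  # False: 90 s > 4 x 10 s
```

The same 90-second hole is routine for a Class B yacht and a sign of lost messages for a Class A ship under way.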
Ships sometimes lie. AIS is self-reported and unauthenticated. Sanctions-evading tankers go dark off Iran. Fishing vessels falsify positions to hide IUU (illegal, unreported and unregulated) activity in protected waters. There was a stretch in 2021 when AIS tracks for warships in the Black Sea showed positions that were demonstrably wrong by hundreds of kilometres. If you're making real decisions from this data, treat it as a signal to be corroborated, not as ground truth.
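The cheapest corroboration is internal consistency: no ship covers a hundred nautical miles in an hour. Here is a sketch of an implied-speed check between two consecutive fixes using the haversine formula. The tuple layout and the 40-knot ceiling are assumptions of mine, and this catches crude jumps only, not a carefully faked track.

```python
import math
from datetime import datetime, timedelta, timezone

EARTH_RADIUS_NM = 3440.065  # mean Earth radius (6371 km) in nautical miles


def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def implied_speed_knots(fix_a, fix_b):
    """Speed needed to get from fix_a to fix_b; each fix is (datetime, lat, lon)."""
    hours = (fix_b[0] - fix_a[0]).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("fix_b must be later than fix_a")
    return haversine_nm(fix_a[1], fix_a[2], fix_b[1], fix_b[2]) / hours


def is_plausible(fix_a, fix_b, max_knots=40):
    """False if the vessel would have had to exceed max_knots."""
    return implied_speed_knots(fix_a, fix_b) <= max_knots


t0 = datetime(2024, 3, 14, 7, 0, tzinfo=timezone.utc)
t1 = t0 + timedelta(hours=1)
print(is_plausible((t0, 51.95, 3.90), (t1, 51.98, 4.10)))  # a few knots: True
print(is_plausible((t0, 44.60, 33.50), (t1, 46.50, 30.70)))  # far too fast: False
```

A failed check doesn't tell you which of the two fixes is wrong, only that they can't both be right.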
A small thing you can build today
Here's a useful exercise: a script that watches a port and tells you when a vessel of interest arrives. Once you have it working, you'll have internalized most of what AIS can and can't do.
from vessel_api_python import VesselClient
from datetime import datetime, timedelta, timezone
import time

client = VesselClient(api_key="your_key_here")

# Assumes the SDK returns MMSIs as ints; cast if yours come back as strings
WATCHLIST = {111111111, 222222222}  # MMSIs you care about
PORT_BBOX = (51.85, 3.80, 52.05, 4.55)  # Rotterdam: Maasvlakte to centre
seen = set()

while True:
    result = client.location.vessels_bounding_box(
        lat_min=PORT_BBOX[0], lon_min=PORT_BBOX[1],
        lat_max=PORT_BBOX[2], lon_max=PORT_BBOX[3],
    )
    now = datetime.now(timezone.utc)
    for v in result.vessels or []:
        if v.mmsi in WATCHLIST and v.mmsi not in seen:
            # Staleness check: ignore positions older than 15 minutes
            last_seen = datetime.fromisoformat(v.timestamp.replace("Z", "+00:00"))
            if now - last_seen < timedelta(minutes=15):
                print(f"ARRIVED: {v.vessel_name} ({v.mmsi}) at {v.timestamp}")
                seen.add(v.mmsi)
    time.sleep(60)  # poll once a minute
That's a working port arrival monitor in about 20 lines. It will also, occasionally, lie to you. The bounding box bit me first: a vessel "in Rotterdam" turned out to be a barge that had drifted into a corner of the box well out in the North Sea. The staleness check I added second, after a "current" position from six hours earlier triggered a false arrival alert. And seen is a one-way set — if a ship arrives, departs, and returns next week, you'll never hear about it. I'm not sure I've solved that one properly yet.
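One possible direction for the re-arrival problem, offered as a sketch and not a solved design: replace the one-way set with a record of when each vessel was last seen inside the box, and treat a long enough absence as a departure. The class name, the method name, and the two-hour threshold are all my own inventions, and a brief coverage dropout longer than the threshold would still produce a false second arrival.

```python
from datetime import datetime, timedelta, timezone


class ArrivalTracker:
    """Emit one 'arrived' event per visit, re-arming after a long absence."""

    def __init__(self, absent_for=timedelta(hours=2)):
        self.absent_for = absent_for
        self.last_inside = {}  # mmsi -> time of the last fresh fix inside the box

    def update(self, mmsis_inside, now):
        """Feed the watched MMSIs currently inside the box; return new arrivals."""
        # Anything not seen inside for long enough counts as departed.
        for mmsi, seen_at in list(self.last_inside.items()):
            if now - seen_at >= self.absent_for:
                del self.last_inside[mmsi]
        arrivals = sorted(m for m in mmsis_inside if m not in self.last_inside)
        for mmsi in mmsis_inside:
            self.last_inside[mmsi] = now
        return arrivals


tracker = ArrivalTracker()
t0 = datetime(2024, 3, 14, 7, 0, tzinfo=timezone.utc)
events = [
    tracker.update({111111111}, t0),                          # first sighting
    tracker.update({111111111}, t0 + timedelta(minutes=1)),   # still there
    tracker.update(set(), t0 + timedelta(minutes=30)),        # brief dropout
    tracker.update({111111111}, t0 + timedelta(minutes=40)),  # same visit
    tracker.update({111111111}, t0 + timedelta(days=7)),      # back next week
]
print(events)
```

In the polling loop, you would build the set from the vessels that pass both the watchlist and staleness checks, then print whatever update() returns.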
But the core idea, that you can ask the world where its ships are and the world will tell you, in JSON, from a Python REPL, is still the bit that should feel a little impossible. It's built on a 1990s VHF protocol that was never designed to be heard from orbit, satellites that listen to it anyway, and an informal global agreement that ships will mostly tell the truth about who they are.
Mostly. That's another post.


