José Catalá

What 1,000 Airports Reveal About the World's Airlines

We just shipped the Airlines tab on MyAirports. You pick any airport, click Airlines, and you get every carrier that operates there: name, IATA and ICAO codes, and hub airports.

Building it meant aggregating airline profile data across more than 1,000 airports. And like every other time I've done large-scale aggregation on this dataset, the results were not what I expected.

Here's what the data actually looks like.


643 airlines. Most of them you've never heard of.

Across 1,000+ airports, the MyAirports dataset contains 643 distinct airlines (normalized to IATA codes where one exists).

The top 10 appear at 200+ airports each. Ryanair leads in Europe, showing up at 230 airports. American, Delta, and United between them cover nearly every U.S. airport in the dataset. Emirates appears at 89 airports across 6 continents.

Below the top 50, the curve drops fast.

Airlines by airport coverage (unique airports served):

200+  airports:   10 airlines   (~1.5%)
50–199 airports:  43 airlines   (~6.7%)
10–49  airports:  128 airlines  (~20%)
2–9    airports:  241 airlines  (~37%)
1      airport:   221 airlines  (~34%)

More than a third of airlines in the dataset serve exactly one airport. These are ultra-regional carriers, charter operators, and cargo airlines whose entire operation lives at a single hub. They have real IATA codes. They have real flights. Most people who've never missed a connection in Ulaanbaatar have never encountered them.

The long tail is long.
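
The bucketing behind that table can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the `airportsByAirline` map (airline code to the set of airports where it was observed) and the `coverageHistogram` name are assumptions made for the example.

```javascript
// Sketch only: `airportsByAirline` is a hypothetical Map of
// airline code -> Set of airport codes where that airline was observed.
const BUCKETS = [
  { label: '200+', min: 200 },
  { label: '50-199', min: 50 },
  { label: '10-49', min: 10 },
  { label: '2-9', min: 2 },
  { label: '1', min: 1 },
];

function coverageHistogram(airportsByAirline) {
  const counts = BUCKETS.map((b) => ({ ...b, airlines: 0 }));
  for (const airports of airportsByAirline.values()) {
    // Buckets are ordered from largest to smallest, so the first match wins.
    const bucket = counts.find((b) => airports.size >= b.min);
    if (bucket) bucket.airlines += 1;
  }
  const total = airportsByAirline.size;
  return counts.map(({ label, airlines }) => ({
    label,
    airlines,
    share: total === 0 ? 0 : airlines / total,
  }));
}
```

Each returned row carries the bucket label, the number of airlines in it, and that bucket's share of all airlines, which is how the percentages in the table are derived.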


Hub concentration is more extreme than you'd think

For any given airport, I was curious: how concentrated is traffic around the dominant carrier?

I computed a simple metric — the share of tracked airlines at each airport that belong to the single largest carrier group (including its regional subsidiaries and codeshare partners). The results:

  • Median airport: top carrier group accounts for 42% of airlines serving it
  • Small regional airports (< 5 airlines): top carrier group accounts for 78% on average
  • Major hubs (20+ airlines): top carrier group accounts for 31% on average
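
For readers who want to reproduce the metric, here is a rough sketch of how it could be computed per airport. It assumes each airline object carries the `group` field described later in this post, and it treats carriers without a group as a group of one. `topGroupShare` and `median` are illustrative helpers, not part of the MyAirports API.

```javascript
// Share of an airport's airlines that belong to its largest carrier group.
// Assumption: airlines without a `group` are treated as a group of one.
function topGroupShare(airlines) {
  if (airlines.length === 0) return 0;
  const sizes = new Map();
  for (const airline of airlines) {
    const key = airline.group ?? airline.iataCode ?? airline.icaoCode;
    sizes.set(key, (sizes.get(key) ?? 0) + 1);
  }
  return Math.max(...sizes.values()) / airlines.length;
}

// Median of a list of numbers (used to summarize shares across airports).
function median(values) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

Running `topGroupShare` over every airport and passing the results to `median` gives a figure comparable to the median quoted above.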

The extreme end is striking. There are 89 airports in the dataset where a single airline accounts for 100% of observed carriers — monopoly airports where one carrier operates every tracked flight. Some are unsurprising (remote island airports with a single national carrier), but several are medium-sized regional airports in competitive markets that happen to be completely dominated by one low-cost carrier.

The other end is also interesting. The most diversified airport in the dataset — by carrier count — is Frankfurt (FRA) with 113 distinct airlines. London Heathrow (LHR) is second at 108. But even at FRA, the top 5 carriers account for 51% of all flights. Diversity of presence doesn't mean diversity of volume.


The IATA/ICAO code problem is worse than I thought

Every airline has (or should have) two codes: a 2-character IATA code (the one on your boarding pass; it can include a digit, as in easyJet's U2) and a 3-letter ICAO code (used in ATC and flight plans). In practice, the data is messier.

From the 643 airlines in the dataset:

  • 71% have both a valid IATA and ICAO code
  • 18% have ICAO only — they appear in airport board data encoded in ICAO format, but have no IATA mapping in any public reference
  • 8% have IATA only — common for charter and wet-lease carriers that never filed ICAO designators
  • 3% have neither — codes that appear in scraped data but don't match any known airline in any reference database

That last bucket — 19 airlines with unrecognized codes — is the interesting one. Some are historical: airlines that merged or went bankrupt but whose codes still appear in airport systems referencing old bookings. Some are data errors in the source airport systems (a transposed character, a regional variant). And a small number are genuinely mysterious — codes used consistently across multiple airports that don't map to any known carrier.
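
One way to assign an airline record to those four buckets is a format check combined with a reference lookup. The sketch below is an assumption for illustration: it treats IATA designators as two letters or digits and ICAO designators as three letters, and `knownIata` / `knownIcao` are hypothetical sets built from whatever reference list you trust.

```javascript
// Assumed formats: IATA = 2 letters/digits, ICAO = 3 letters.
const IATA_RE = /^[A-Z0-9]{2}$/;
const ICAO_RE = /^[A-Z]{3}$/;

// `knownIata` and `knownIcao` are hypothetical Sets of reference codes.
function classifyCodes(airline, knownIata, knownIcao) {
  const iata = (airline.iataCode ?? '').trim().toUpperCase();
  const icao = (airline.icaoCode ?? '').trim().toUpperCase();
  const hasIata = IATA_RE.test(iata) && knownIata.has(iata);
  const hasIcao = ICAO_RE.test(icao) && knownIcao.has(icao);
  if (hasIata && hasIcao) return 'both';
  if (hasIcao) return 'icao_only';
  if (hasIata) return 'iata_only';
  return 'neither';
}
```

Counting the return values across all records yields a breakdown of the same shape as the list above.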

The normalization pipeline handles this with a fallback chain:

// airline-resolver.js

// Assumes `db` is an initialized Prisma client (the query syntax below is Prisma's).
async function resolveAirline(rawCode) {
  // Normalize once so the lookups and the unknown-code log share the same key
  const code = String(rawCode ?? '').trim().toUpperCase();
  if (!code) return null;

  // 1. Try IATA direct lookup
  const byIata = await db.airline.findFirst({ where: { iataCode: code } });
  if (byIata) return { ...byIata, matchType: 'iata' };

  // 2. Try ICAO lookup, return the canonical airline record
  const byIcao = await db.airline.findFirst({ where: { icaoCode: code } });
  if (byIcao) return { ...byIcao, matchType: 'icao_normalized' };

  // 3. Try a case-insensitive substring match on the name (for display-name
  //    variants like "RYANAIR DAC"). Skipped for 2-3 character inputs, which
  //    would otherwise match as substrings of unrelated airline names.
  if (code.length > 3) {
    const byName = await db.airline.findFirst({
      where: { name: { contains: code, mode: 'insensitive' } }
    });
    if (byName) return { ...byName, matchType: 'name_fuzzy' };
  }

  // 4. Unknown: flag for review
  await db.unknownCode.upsert({
    where: { code },
    update: { seenCount: { increment: 1 } },
    create: { code, seenCount: 1, firstSeen: new Date() }
  });

  return null;
}

The unknownCode table is how I discovered the 19 mystery airlines — they kept accumulating count increments from multiple airports, which meant they weren't random noise.
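
The table in the snippet above only stores a running count, so telling recurring codes apart from one-off noise needs a threshold, and ideally the set of airports that reported each code. The in-memory sketch below shows the idea. The `UnknownCodeTracker` class, the per-airport dimension, and the default threshold of three airports are all assumptions for illustration, not how the MyAirports pipeline stores this.

```javascript
// Hypothetical in-memory tracker for unresolved airline codes.
class UnknownCodeTracker {
  constructor() {
    this.seen = new Map(); // code -> { count, airports: Set }
  }

  // Record one sighting of an unresolved code at a given airport.
  record(code, airportIata) {
    const key = code.trim().toUpperCase();
    const entry = this.seen.get(key) ?? { count: 0, airports: new Set() };
    entry.count += 1;
    entry.airports.add(airportIata);
    this.seen.set(key, entry);
  }

  // Codes reported by at least `minAirports` distinct airports,
  // ordered from most widespread to least.
  recurring(minAirports = 3) {
    return [...this.seen.entries()]
      .filter(([, entry]) => entry.airports.size >= minAirports)
      .map(([code, entry]) => ({
        code,
        count: entry.count,
        airports: entry.airports.size,
      }))
      .sort((a, b) => b.airports - a.airports);
  }
}
```

A code that shows up many times at a single airport is more likely a local data error. One that shows up at several airports is worth a manual look.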


Low-cost vs legacy: different geographies, different patterns

One thing that stands out when you plot airline profiles by region: the low-cost carrier (LCC) story is almost entirely a European and Southeast Asian phenomenon in the data.

In Europe, LCCs (Ryanair, easyJet, Wizz Air, Vueling) appear at a combined 680+ airports. Legacy carriers (Lufthansa, British Airways, Air France, KLM) appear at fewer than 200 combined — concentrated at major hubs and secondary capitals.

In North America, the pattern inverts. The three legacy mega-carriers (American, Delta, United) appear at more airports than all LCCs combined. Southwest is the exception: it appears at 97 airports, more than any other North American LCC, but it's almost the only LCC with that kind of geographic reach on the continent.

Asia-Pacific is the most fragmented. There are 143 distinct carriers across the region's airports in the dataset, with no single carrier appearing at more than 60. The region has both the densest LCC ecosystem (AirAsia alone spawned 9 separate national subsidiaries) and the most state-owned legacy carriers that haven't meaningfully faced low-cost competition.

What this means for the Airlines tab: the "typical" airport looks very different depending on geography. A European regional airport's airline list looks like a Ryanair schedule. A U.S. regional airport's looks like a United Express feed. An Asian secondary city airport might have 4 carriers you've never heard of in the West.


The subsidiary problem

One thing I had to decide early: do codeshare partners and regional subsidiaries count as separate airlines?

My answer: yes, always. Lufthansa and Lufthansa CityLine have different IATA codes (LH and CL), different fleets, different crew contracts, and separate delay profiles. Merging them would make the airline-level stats meaningless.

But this creates a presentation problem. When you look at the Airlines tab for Frankfurt Airport, you see:

  • Lufthansa (LH)
  • Lufthansa CityLine (CL)
  • Lufthansa Cargo (LH — same code, different entity)
  • Air Dolomiti (EN — Lufthansa Group)
  • Swiss (LX — Lufthansa Group)
  • Austrian (OS — Lufthansa Group)
  • Brussels Airlines (SN — Lufthansa Group)

Seven distinct carriers with seven distinct operational profiles, all presenting as "Lufthansa" to most passengers. The Airlines tab surfaces them as separate entries because that's what the data is. But the group field on the airline object lets callers roll them up if they want:

{
  "iataCode": "CL",
  "name": "Lufthansa CityLine",
  "icaoCode": "CLH",
  "group": "Lufthansa Group",
  "hubAirports": ["MUC", "FRA"],
  "type": "regional"
}

If you call the airlines endpoint and group the results by the group field, you get the commercial story. If you don't, you get the operational truth. Both are useful depending on what you're building.
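
A caller-side rollup might look like the sketch below. It assumes the airline objects have the shape shown above and that carriers with no `group` should stand alone under their own name. `rollUpByGroup` is a hypothetical helper, and the sample data is made up for the example.

```javascript
// Group airline objects by their `group` field.
// Assumption: carriers without a group are keyed by their own name.
function rollUpByGroup(airlines) {
  const groups = new Map();
  for (const airline of airlines) {
    const key = airline.group ?? airline.name;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(airline);
  }
  return groups;
}

// Illustrative sample, not real API output.
const sampleAirlines = [
  { iataCode: 'LH', name: 'Lufthansa', group: 'Lufthansa Group' },
  { iataCode: 'CL', name: 'Lufthansa CityLine', group: 'Lufthansa Group' },
  { iataCode: 'OS', name: 'Austrian', group: 'Lufthansa Group' },
  { iataCode: 'EK', name: 'Emirates' },
];

for (const [group, members] of rollUpByGroup(sampleAirlines)) {
  console.log(group, members.map((a) => a.iataCode).join(', '));
}
```

Returning a `Map` keeps the groups in first-seen order, so the rollup lists groups in the same sequence as the original response.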


What's actually in the Airlines tab

Each airline profile in the tab (and in the API) includes:

  • iataCode / icaoCode: normalized identifiers
  • name: canonical name (English)
  • type: full_service, low_cost, regional, charter, or cargo
  • group: parent group name if applicable (e.g., "IAG", "Lufthansa Group", "Air France-KLM")
  • hubAirports: array of IATA codes for the carrier's primary hub(s)
  • country: ISO country code of the carrier's home country

The airlines endpoint is available at:

GET https://myairports.online/api/v1/airports/{iata}/airlines

It returns all carriers currently operating at that airport based on recent flight board data — not a static schedule, but live-derived from actual observed flights over the past 30 days.
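
A minimal call from Node 18+ (which ships a global `fetch`) could look like this. The URL pattern comes from the line above. The response shape (a JSON array of airline objects), the uppercase airport code, and the absence of an auth header are assumptions, so check the developer docs before relying on any of them.

```javascript
const BASE_URL = 'https://myairports.online/api/v1';

// Build the airlines URL for an airport; uppercasing the code is an assumption.
function airlinesUrl(iata) {
  if (!/^[A-Za-z]{3}$/.test(iata)) {
    throw new Error(`Expected a 3-letter IATA airport code, got "${iata}"`);
  }
  return `${BASE_URL}/airports/${iata.toUpperCase()}/airlines`;
}

// `fetchImpl` is injectable so the function can be exercised without network access.
async function fetchAirlines(iata, fetchImpl = fetch) {
  const response = await fetchImpl(airlinesUrl(iata));
  if (!response.ok) {
    throw new Error(`Airlines request failed with HTTP ${response.status}`);
  }
  return response.json(); // assumed: a JSON array of airline objects
}
```

Usage would be along the lines of `const airlines = await fetchAirlines('FRA');` inside an async function.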


One thing I didn't expect

The single most surprising finding from building this: hub airports are not where most airlines are based.

Of the 643 airlines in the dataset, only 38 list a major international hub (top 50 airports globally) as their sole hub. The majority are based at smaller airports — often the only major carrier at a regional airport — and operate point-to-point or spoke-to-hub routes.
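
That count can be reproduced with a small filter over the airline profiles. The sketch assumes the `hubAirports` array from the profile format and a caller-supplied set of major hubs. The post does not say which top-50 ranking was used, so `majorHubs` is left as an input, and `soleMajorHubAirlines` is a hypothetical name.

```javascript
// Airlines whose only listed hub is one of the given major airports.
// `majorHubs` is a Set of IATA airport codes supplied by the caller.
function soleMajorHubAirlines(airlines, majorHubs) {
  return airlines.filter((airline) => {
    const hubs = airline.hubAirports ?? [];
    return hubs.length === 1 && majorHubs.has(hubs[0]);
  });
}
```

Airlines with several hubs are excluded even if all of them are major airports, which matches the "sole hub" wording above.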

The implication: when you're looking at a big airport's airline list, you're mostly seeing visitors — carriers whose operational home is somewhere else, routing passengers through. The airport's "own" airline, if it has one at all, is often the one with the smallest presence there.

Frankfurt is a Lufthansa hub, but Lufthansa Group carriers account for 38% of the Airlines tab entries. The other 62% are carriers for whom FRA is a stopover, a codeshare point, or a cargo hub.

The big airport is everyone's guest house. The airline data makes that visible.


The Airlines tab is live at myairports.online — search any airport and click Airlines. The underlying data is available via the API at myairports.online/developers. Free tier: 100 requests/day, no card required.
