<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: José Catalá</title>
    <description>The latest articles on DEV Community by José Catalá (@jfcaba).</description>
    <link>https://dev.to/jfcaba</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2703476%2Fd9f43341-54d1-45fb-bfd3-f01b5dc2d3ce.jpg</url>
      <title>DEV Community: José Catalá</title>
      <link>https://dev.to/jfcaba</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jfcaba"/>
    <language>en</language>
    <item>
      <title>Three Ways to Fly: Building Route Options Ranked by Time, CO2 , and Reliability</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Mon, 27 Apr 2026 08:14:39 +0000</pubDate>
      <link>https://dev.to/jfcaba/three-ways-to-fly-building-route-options-ranked-by-time-co2-and-reliability-e3d</link>
      <guid>https://dev.to/jfcaba/three-ways-to-fly-building-route-options-ranked-by-time-co2-and-reliability-e3d</guid>
      <description>&lt;h1&gt;
  
  
  Three Ways to Fly: Building Route Options Ranked by Time, CO₂, and Reliability
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; aviation, webdev, javascript, node&lt;/p&gt;




&lt;p&gt;We just shipped the Routes tab on &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;MyAirports&lt;/a&gt;. You pick an origin and a destination, and you get up to three connecting itineraries — one optimised for total travel time, one for CO₂, one for reliability. Each card shows the hub sequence, layover breakdown, ticket type, and the per-leg airline strip.&lt;/p&gt;

&lt;p&gt;Building it meant solving a set of problems that are more interesting than they look. Here's how the system works.&lt;/p&gt;




&lt;h2&gt;
  
  
  Routes as views, not entities
&lt;/h2&gt;

&lt;p&gt;The first design decision was the most important: we don't store routes.&lt;/p&gt;

&lt;p&gt;The existing pipeline already produces &lt;code&gt;Flight&lt;/code&gt; records from scraping, &lt;code&gt;DelayStats&lt;/code&gt; from nightly aggregation, and &lt;code&gt;RouteStats&lt;/code&gt; / &lt;code&gt;AirlineRouteStats&lt;/code&gt; from a weighted recompute. The routes system is built entirely on top of those — it's a graph and a ranker layered over data that already existed.&lt;/p&gt;

&lt;p&gt;The alternative was a routes table with its own write path. That would have meant migration risk, dual writes, and a whole new ingestion concern. Instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Flight (scraped)
  -&amp;gt; DelayStats (nightly per-airline-per-route-per-DOW)
    -&amp;gt; RouteStats + AirlineRouteStats (nightly recompute)
      -&amp;gt; Graph (24 h in-memory cache)
        -&amp;gt; Candidate BFS (≤ 2 stops, 2× viability filter)
          -&amp;gt; Leg durations + hub layovers
          -&amp;gt; Three-bucket ranking
            -&amp;gt; /routes/indirect response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every layer already existed. The routes feature cost zero new ingestion work.&lt;/p&gt;




&lt;h2&gt;
  
  
  The graph
&lt;/h2&gt;

&lt;p&gt;The in-memory graph lives in &lt;code&gt;api/src/graph/build.js&lt;/code&gt;. It's a two-level adjacency structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// edges: Map&amp;lt;origin, Map&amp;lt;dest, airlineIata[]&amp;gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Built once per process, cached 24 hours. Each airport gets a &lt;strong&gt;hubScore&lt;/strong&gt; — the sum of its incoming and outgoing route counts. That number drives minimum connection time later.&lt;/p&gt;

&lt;p&gt;The graph rebuilds nightly after &lt;code&gt;AirlineRouteStats&lt;/code&gt; updates, so it never goes stale relative to the underlying data.&lt;/p&gt;




&lt;h2&gt;
  
  
  Candidate enumeration: the 2× viability filter
&lt;/h2&gt;

&lt;p&gt;The BFS expands up to 2 stops. The question is which of the potentially enormous candidate set to keep.&lt;/p&gt;

&lt;p&gt;The filter is a single line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;isViable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totalKm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;directKm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;factor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;directKm&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;totalKm&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nx"&gt;directKm&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;factor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// factor = 2.0&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any path whose total great-circle distance exceeds twice the direct distance is dropped immediately. This eliminates the LHR → JFK via NRT class of nonsense before any scoring happens — and it handles ~99% of pathological candidates.&lt;/p&gt;

&lt;p&gt;Within each depth level, we keep the top 20 candidates by aggregated flight frequency, then return the shallowest non-empty level. A direct flight will always beat a 1-stop. A 1-stop will always beat a 2-stop.&lt;/p&gt;




&lt;h2&gt;
  
  
  Leg durations: physical plausibility filtering
&lt;/h2&gt;

&lt;p&gt;For each leg, we compute the median observed flight time from the last 30 days of &lt;code&gt;Flight&lt;/code&gt; records. But raw scraped durations are noisy — airport systems sometimes emit &lt;code&gt;00:00&lt;/code&gt; or obviously wrong values.&lt;/p&gt;

&lt;p&gt;The fix is an airspeed window:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MIN_SPEED_KMH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MAX_SPEED_KMH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;isPlausible&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;durationMinutes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;distanceKm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;impliedSpeed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;distanceKm&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;durationMinutes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;impliedSpeed&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;MIN_SPEED_KMH&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;impliedSpeed&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nx"&gt;MAX_SPEED_KMH&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anything outside 150–1100 km/h is discarded as a scrape artefact. A flight slower than a fast car or faster than a commercial jet can't be real. This is cleaner than maintaining a per-airline median lookup table.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hub layovers: matching real arrival/departure pairs
&lt;/h2&gt;

&lt;p&gt;This is the part I'm most pleased with.&lt;/p&gt;

&lt;p&gt;The naive approach is to add a fixed layover estimate — 90 minutes, say. The problem is that 90 minutes at Frankfurt is different from 90 minutes at a small regional airport, and neither is the same as an overnight connection in Singapore.&lt;/p&gt;

&lt;p&gt;Instead, we match real pairs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pull all arrivals at the hub from the origin airport (last 30 days).&lt;/li&gt;
&lt;li&gt;Pull all departures from the hub to the next-leg destination (last 30 days).&lt;/li&gt;
&lt;li&gt;For each arrival, find the closest departure that respects the minimum connection time. Allow next-day pairings.&lt;/li&gt;
&lt;li&gt;Median the gaps. Track &lt;code&gt;overnightPct&lt;/code&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Minimum connection time is hub-score-driven&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getMCT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hubScore&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hubScore&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// major international hub&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hubScore&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// mid-sized hub&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                        &lt;span class="c1"&gt;// smaller airport&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If fewer than 5 pairs match (new route, sparse data), we fall back to &lt;code&gt;MCT + 30 min&lt;/code&gt; and mark the result &lt;code&gt;estimated: true&lt;/code&gt;. The UI surfaces that flag so users know the layover is a model estimate, not observed data.&lt;/p&gt;

&lt;p&gt;The overnight detection falls out for free: if ≥50% of matched pairs cross midnight, the card gets an overnight badge.&lt;/p&gt;




&lt;h2&gt;
  
  
  Reliability: geometric mean, not arithmetic
&lt;/h2&gt;

&lt;p&gt;Each leg has an on-time percentage from &lt;code&gt;AirlineRouteStats&lt;/code&gt;. The combined reliability for a multi-leg itinerary is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;combined = (Π onTimePct_i / 100)^(1/N) * 100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A 95% first leg and a 60% second leg gives 75.5%, not 77.5%.&lt;/p&gt;

&lt;p&gt;This is the correct model for passengers. An itinerary is only as reliable as its weakest leg — a late arrival on leg 1 threatens the connection on leg 2. Arithmetic mean treats reliability as independent; geometric mean treats it as compounding. The difference matters most on long hauls with tight connections.&lt;/p&gt;

&lt;p&gt;Legs without sufficient sample data are skipped (not zeroed). Zeroing them would unfairly penalise routes with new airlines.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ticket type detection
&lt;/h2&gt;

&lt;p&gt;For each candidate itinerary, we intersect the airline sets across all legs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;legAirlines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;legs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;leg&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;leg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;airlineIatas&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;intersection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;legAirlines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;([...&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))));&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ticketType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;intersection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sameAirline&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;selfTransfer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the intersection is non-empty, a single-ticket booking is possible. The card shows a "Same airline" badge. This is a meaningful signal for passengers — a single PNR means the airline is responsible for the connection; a self-transfer means you're on your own if leg 1 is late.&lt;/p&gt;




&lt;h2&gt;
  
  
  The three buckets
&lt;/h2&gt;

&lt;p&gt;The ranker (&lt;code&gt;api/src/graph/ranker.js&lt;/code&gt;) produces exactly three outputs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fastest&lt;/strong&gt; — minimise &lt;code&gt;flightMinutes + layoverMinutes&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lowest CO₂&lt;/strong&gt; — minimise &lt;code&gt;totalDistanceKm&lt;/code&gt; (CO₂ is derived from distance per passenger, per ICAO methodology)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Most reliable&lt;/strong&gt; — maximise &lt;code&gt;combinedOnTimePct&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same itinerary can win multiple buckets. In practice that's rare — the fastest path is usually not the shortest, and the most reliable is usually neither.&lt;/p&gt;

&lt;p&gt;There's one tiebreaker: within 10% of the bucket's winning metric, prefer &lt;code&gt;sameAirline: true&lt;/code&gt; over self-transfer. This only fires when two options are essentially equivalent — it never overrides a meaningfully better route.&lt;/p&gt;




&lt;h2&gt;
  
  
  CO₂: distance is a reasonable proxy
&lt;/h2&gt;

&lt;p&gt;CO₂ per passenger varies by aircraft type, load factor, cabin class, and whether you count radiative forcing. We don't have that granularity — we have distance and passenger counts.&lt;/p&gt;

&lt;p&gt;The proxy we use is total route distance (sum of all legs). This correlates strongly with actual emissions for the comparison use case: a 4,000 km itinerary via one hub emits less than a 7,000 km itinerary via two. The absolute number is an estimate; the relative ranking is reliable.&lt;/p&gt;

&lt;p&gt;Future improvement: when we have aircraft type data per route, we can use the ICAO carbon calculator's per-type coefficients. For now, shorter is greener.&lt;/p&gt;




&lt;h2&gt;
  
  
  What surprised me
&lt;/h2&gt;

&lt;p&gt;The overnight detection rate is higher than expected. On intra-Asian routes and certain long-haul pairings, a meaningful fraction of practical connections require an overnight stay at the hub. The flag matters — nobody wants to book what looks like a 6-hour trip and discover it's actually 18 hours with a hotel stop.&lt;/p&gt;

&lt;p&gt;The viability filter is doing more work than anticipated. Without it, BFS over a 1,300-airport graph at 2 stops produces thousands of candidates per query. With the 2× cap, it drops to tens. The entire indirect query runs under 50ms even with the layover matching.&lt;/p&gt;

&lt;p&gt;And the geometric mean genuinely changes rankings. There are routes where the "obvious" hub (higher on-time per leg) loses to a less-used hub because one of its legs has a 55% on-time rate that drags the geometric mean below what an arithmetic average would show.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;The direct route page (per-route delay stats + airline comparison) is live but Pro-tier only. The indirect routes explorer is public. The obvious next steps are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Aircraft type integration&lt;/strong&gt; for more accurate CO₂ estimates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full status timeline&lt;/strong&gt; (not just on-time%, but delay distribution) for the reliability bucket&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price integration&lt;/strong&gt; — the fourth bucket nobody asked for but everyone wants&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Routes API is available at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;. &lt;code&gt;GET /routes/indirect?from=LHR&amp;amp;to=BKK&lt;/code&gt; will give you a sense of what the output looks like. Free tier: 100 requests/day, no card required.&lt;/p&gt;

</description>
      <category>aviation</category>
      <category>webdev</category>
      <category>javascript</category>
      <category>node</category>
    </item>
    <item>
      <title>What 1,000 Airports Reveal About the World's Airlines</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Tue, 21 Apr 2026 14:26:24 +0000</pubDate>
      <link>https://dev.to/jfcaba/what-1000-airports-reveal-about-the-worlds-airlines-5b01</link>
      <guid>https://dev.to/jfcaba/what-1000-airports-reveal-about-the-worlds-airlines-5b01</guid>
      <description>&lt;p&gt;We just shipped the Airlines tab on &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;MyAirports&lt;/a&gt;. You pick any airport, click Airlines, and you get every carrier that operates there: name, IATA and ICAO codes, and hub airports.&lt;/p&gt;

&lt;p&gt;Building it meant aggregating airline profile data across more than 1,000 airports. And like every other time I've done large-scale aggregation on this dataset, the results were not what I expected.&lt;/p&gt;

&lt;p&gt;Here's what the data actually looks like.&lt;/p&gt;




&lt;h2&gt;
  
  
  643 airlines. Most of them you've never heard of.
&lt;/h2&gt;

&lt;p&gt;Across 1,000+ airports, the MyAirports dataset contains &lt;strong&gt;643 distinct airlines&lt;/strong&gt; (normalized to IATA codes).&lt;/p&gt;

&lt;p&gt;The top 10 appear at 200+ airports each. Ryanair leads in Europe, showing up at 230 airports. American, Delta, and United between them cover nearly every U.S. airport in the dataset. Emirates appears at 89 airports across 6 continents.&lt;/p&gt;

&lt;p&gt;Below the top 50, the curve drops fast.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Airlines by airport coverage (unique airports served):

200+  airports:   10 airlines   (~1.5%)
50–199 airports:  43 airlines   (~6.7%)
10–49  airports:  128 airlines  (~20%)
2–9    airports:  241 airlines  (~37%)
1      airport:   221 airlines  (~34%)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More than a third of airlines in the dataset serve exactly one airport. These are ultra-regional carriers, charter operators, and cargo airlines whose entire operation lives at a single hub. They have real IATA codes. They have real flights. Most people who've never missed a connection in Ulaanbaatar have never encountered them.&lt;/p&gt;

&lt;p&gt;The long tail is long.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hub concentration is more extreme than you'd think
&lt;/h2&gt;

&lt;p&gt;For any given airport, I was curious: how concentrated is traffic around the dominant carrier?&lt;/p&gt;

&lt;p&gt;I computed a simple metric — the share of tracked airlines at each airport that belong to the single largest carrier group (including its regional subsidiaries and codeshare partners). The results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Median airport&lt;/strong&gt;: top carrier accounts for &lt;strong&gt;42%&lt;/strong&gt; of airlines serving it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Small regional airports&lt;/strong&gt; (&amp;lt; 5 airlines): top carrier accounts for &lt;strong&gt;78%&lt;/strong&gt; on average&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Major hubs&lt;/strong&gt; (20+ airlines): top carrier accounts for &lt;strong&gt;31%&lt;/strong&gt; on average&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The extreme end is striking. There are 89 airports in the dataset where a single airline accounts for &lt;strong&gt;100%&lt;/strong&gt; of observed carriers — monopoly airports where one carrier operates every tracked flight. Some are unsurprising (remote island airports with a single national carrier), but several are medium-sized regional airports in competitive markets that happen to be completely dominated by one low-cost carrier.&lt;/p&gt;

&lt;p&gt;The other end is also interesting. The most diversified airport in the dataset — by carrier count — is Frankfurt (FRA) with 113 distinct airlines. London Heathrow (LHR) is second at 108. But even at FRA, the top 5 carriers account for 51% of all flights. Diversity of presence doesn't mean diversity of volume.&lt;/p&gt;




&lt;h2&gt;
  
  
  The IATA/ICAO code problem is worse than I thought
&lt;/h2&gt;

&lt;p&gt;Every airline has (or should have) two codes: a 2-letter IATA code (the one on your boarding pass) and a 3-letter ICAO code (used in ATC and flight plans). In practice, the data is messier.&lt;/p&gt;

&lt;p&gt;From the 643 airlines in the dataset:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;71% have both&lt;/strong&gt; a valid IATA and ICAO code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;18% have ICAO only&lt;/strong&gt; — they appear in airport board data encoded in ICAO format, but have no IATA mapping in any public reference&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;8% have IATA only&lt;/strong&gt; — common for charter and wet-lease carriers that never filed ICAO designators&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3% have neither&lt;/strong&gt; — codes that appear in scraped data but don't match any known airline in any reference database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last bucket — 19 airlines with unrecognized codes — is the interesting one. Some are historical: airlines that merged or went bankrupt but whose codes still appear in airport systems referencing old bookings. Some are data errors in the source airport systems (a transposed character, a regional variant). And a small number are genuinely mysterious — codes used consistently across multiple airports that don't map to any known carrier.&lt;/p&gt;

&lt;p&gt;The normalization pipeline handles this with a fallback chain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// airline-resolver.js&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;resolveAirline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rawCode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// 1. Try IATA direct lookup&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;byIata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findFirst&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;iataCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;rawCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toUpperCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;byIata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;byIata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;matchType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iata&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Try ICAO lookup, return normalized IATA&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;byIcao&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findFirst&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;icaoCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;rawCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toUpperCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;byIcao&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;byIcao&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;matchType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;icao_normalized&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Try fuzzy name match (for display-name variants like "RYANAIR DAC")&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;byName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findFirst&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;rawCode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;insensitive&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;byName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;byName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;matchType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;name_fuzzy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// 4. Unknown — flag for review&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unknownCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;rawCode&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;seenCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;rawCode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;seenCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;firstSeen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;unknownCode&lt;/code&gt; table is how I discovered the 19 mystery airlines — they kept accumulating count increments from multiple airports, which meant they weren't random noise.&lt;/p&gt;




&lt;h2&gt;
  
  
  Low-cost vs legacy: different geographies, different patterns
&lt;/h2&gt;

&lt;p&gt;One thing that stands out when you plot airline profiles by region: the low-cost carrier (LCC) story is almost entirely a European and Southeast Asian phenomenon in the data.&lt;/p&gt;

&lt;p&gt;In Europe, LCCs (Ryanair, easyJet, Wizz Air, Vueling) appear at a combined 680+ airports. Legacy carriers (Lufthansa, British Airways, Air France, KLM) appear at fewer than 200 combined — concentrated at major hubs and secondary capitals.&lt;/p&gt;

&lt;p&gt;In North America, the pattern inverts. The three legacy mega-carriers (American, Delta, United) appear at more airports than all LCCs combined. Southwest is the exception — it appears at 97 airports, more than any other North American carrier — but it's almost the only LCC with that kind of geographic reach on the continent.&lt;/p&gt;

&lt;p&gt;Asia-Pacific is the most fragmented. There are 143 distinct carriers across the region's airports in the dataset, with no single carrier appearing at more than 60. The region has both the densest LCC ecosystem (AirAsia alone spawned 9 separate national subsidiaries) and the most state-owned legacy carriers that haven't meaningfully faced low-cost competition.&lt;/p&gt;

&lt;p&gt;What this means for the Airlines tab: the "typical" airport looks very different depending on geography. A European regional airport's airline list looks like a Ryanair schedule. A U.S. regional airport's looks like a United Express feed. An Asian secondary city airport might have 4 carriers you've never heard of in the West.&lt;/p&gt;




&lt;h2&gt;
  
  
  The subsidiary problem
&lt;/h2&gt;

&lt;p&gt;One thing I had to decide early: do code-share and regional subsidiaries count as separate airlines?&lt;/p&gt;

&lt;p&gt;My answer: yes, always. Lufthansa and Lufthansa CityLine are different IATA codes (LH and CL), different fleets, different crew contracts, and separate delay profiles. Merging them would make the airline-level stats meaningless.&lt;/p&gt;

&lt;p&gt;But this creates a presentation problem. When you look at the Airlines tab for Frankfurt Airport, you see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lufthansa (LH)&lt;/li&gt;
&lt;li&gt;Lufthansa CityLine (CL)&lt;/li&gt;
&lt;li&gt;Lufthansa Cargo (LH — same code, different entity)&lt;/li&gt;
&lt;li&gt;Air Dolomiti (EN — Lufthansa Group)&lt;/li&gt;
&lt;li&gt;Swiss (LX — Lufthansa Group)&lt;/li&gt;
&lt;li&gt;Austrian (OS — Lufthansa Group)&lt;/li&gt;
&lt;li&gt;Brussels Airlines (SN — Lufthansa Group)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Seven distinct carriers with seven distinct operational profiles, all presenting as "Lufthansa" to most passengers. The Airlines tab surfaces them as separate entries because that's what the data is. But the &lt;code&gt;group&lt;/code&gt; field on the airline object lets callers roll them up if they want:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iataCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CL"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Lufthansa CityLine"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"icaoCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CLH"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"group"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Lufthansa Group"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hubAirports"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"MUC"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FRA"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"regional"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you call the airlines endpoint and group by &lt;code&gt;group&lt;/code&gt;, you get the commercial story. If you don't, you get the operational truth. Both are useful depending on what you're building.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's actually in the Airlines tab
&lt;/h2&gt;

&lt;p&gt;Each airline profile in the tab (and in the API) includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;iataCode&lt;/code&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;code&gt;icaoCode&lt;/code&gt;&lt;/strong&gt;: normalized identifiers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;name&lt;/code&gt;&lt;/strong&gt;: canonical name (English)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;type&lt;/code&gt;&lt;/strong&gt;: &lt;code&gt;full_service&lt;/code&gt;, &lt;code&gt;low_cost&lt;/code&gt;, &lt;code&gt;regional&lt;/code&gt;, &lt;code&gt;charter&lt;/code&gt;, or &lt;code&gt;cargo&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;group&lt;/code&gt;&lt;/strong&gt;: parent group name if applicable (e.g., "IAG", "Lufthansa Group", "Air France-KLM")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hubAirports&lt;/code&gt;&lt;/strong&gt;: array of IATA codes for the carrier's primary hub(s)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;country&lt;/code&gt;&lt;/strong&gt;: ISO country code of the carrier's home country&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The airlines endpoint is available at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET https://myairports.online/api/v1/airports/{iata}/airlines
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It returns all carriers currently operating at that airport based on recent flight board data — not a static schedule, but live-derived from actual observed flights over the past 30 days.&lt;/p&gt;




&lt;h2&gt;
  
  
  One thing I didn't expect
&lt;/h2&gt;

&lt;p&gt;The single most surprising finding from building this: &lt;strong&gt;hub airports are not where most airlines are based.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Of the 643 airlines in the dataset, only 38 list a major international hub (top 50 airports globally) as their sole hub. The majority are based at smaller airports — often the only major carrier at a regional airport — and operate point-to-point or spoke-to-hub routes.&lt;/p&gt;

&lt;p&gt;The implication: when you're looking at a big airport's airline list, you're mostly seeing &lt;em&gt;visitors&lt;/em&gt; — carriers whose operational home is somewhere else, routing passengers through. The airport's "own" airline, if it has one at all, is often the one with the smallest presence there.&lt;/p&gt;

&lt;p&gt;Frankfurt is a Lufthansa hub, but Lufthansa Group carriers account for 38% of the Airlines tab entries. The other 62% are carriers for whom FRA is a stopover, a codeshare point, or a cargo hub.&lt;/p&gt;

&lt;p&gt;The big airport is everyone's guest house. The airline data makes that visible.&lt;/p&gt;




&lt;p&gt;The Airlines tab is live at &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;myairports.online&lt;/a&gt; — search any airport and click Airlines. The underlying data is available via the API at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;. Free tier: 100 requests/day, no card required.&lt;/p&gt;

</description>
      <category>aviation</category>
      <category>webdev</category>
      <category>javascript</category>
      <category>node</category>
    </item>
    <item>
      <title>How I Built a Delay Risk Score for 628,000 Flights</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Mon, 20 Apr 2026 15:22:40 +0000</pubDate>
      <link>https://dev.to/jfcaba/how-i-built-a-delay-risk-score-for-628000-flights-1o5e</link>
      <guid>https://dev.to/jfcaba/how-i-built-a-delay-risk-score-for-628000-flights-1o5e</guid>
      <description>&lt;p&gt;Every flight in the MyAirports API includes a &lt;code&gt;delayRisk&lt;/code&gt; object. It looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"flightNumber"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VY1234"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scheduledTime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-21T07:15:00Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"scheduled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"insights"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"delayRisk"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"onTimePct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;61&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"avgDelayMinutes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"cancelledPct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"sample"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;89&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VY MAD→BCN, Mon morning"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three numbers. Looks simple. It's not.&lt;/p&gt;

&lt;p&gt;Getting from raw scraped flight data to a reliable, per-flight delay prediction involved building a statistics pipeline I didn't know I needed — and making a series of judgment calls that turned out to matter a lot.&lt;/p&gt;

&lt;p&gt;Here's how it works.&lt;/p&gt;




&lt;h2&gt;
  
  
  The core insight: route × time-of-day × day-of-week
&lt;/h2&gt;

&lt;p&gt;A delay risk score is only meaningful if it compares like with like.&lt;/p&gt;

&lt;p&gt;A Monday 07:00 Vueling flight from Madrid to Barcelona has a very different delay profile than a Friday 19:00 Vueling flight on the same route. Aggregating them together produces a number that's wrong for both.&lt;/p&gt;

&lt;p&gt;The bucketing key I landed on is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;route (origin → dest) × airline × departure_hour_bucket × day_of_week
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where &lt;code&gt;departure_hour_bucket&lt;/code&gt; is a 3-hour window: 00–03, 03–06, 06–09, 09–12, 12–15, 15–18, 18–21, 21–24.&lt;/p&gt;

&lt;p&gt;This gives enough granularity to capture meaningful patterns without fragmenting the sample so much that each bucket has too few flights to be reliable.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;delay_stats&lt;/code&gt; table stores one row per bucket:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;delay_stats&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;           &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;origin_iata&lt;/span&gt;  &lt;span class="nb"&gt;CHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;dest_iata&lt;/span&gt;    &lt;span class="nb"&gt;CHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;airline_code&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;hour_bucket&lt;/span&gt;  &lt;span class="nb"&gt;SMALLINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- 0, 3, 6, 9, 12, 15, 18, 21&lt;/span&gt;
  &lt;span class="n"&gt;day_of_week&lt;/span&gt;  &lt;span class="nb"&gt;SMALLINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- 0=Sun, 6=Sat&lt;/span&gt;
  &lt;span class="n"&gt;sample_size&lt;/span&gt;  &lt;span class="nb"&gt;INTEGER&lt;/span&gt;  &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;on_time_pct&lt;/span&gt;  &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;avg_delay_min&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;cancelled_pct&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;updated_at&lt;/span&gt;   &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;origin_iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dest_iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;airline_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hour_bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day_of_week&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What counts as "on time"
&lt;/h2&gt;

&lt;p&gt;Before computing percentages, I had to define what "on time" actually means. The industry standard is within 15 minutes of scheduled departure. That's what I use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// insights/analyzers/delay-stats.js&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ON_TIME_THRESHOLD_MINUTES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;classifyFlight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;actualDepartureTime&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// insufficient data&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;delayMinutes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;actualDepartureTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;isOnTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;delayMinutes&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nx"&gt;ON_TIME_THRESHOLD_MINUTES&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;delayMinutes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;delayMinutes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// negative = early, treat as 0&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A flight that departs 5 minutes &lt;em&gt;early&lt;/em&gt; counts as on time (delay = 0). A flight that departs 16 minutes late is delayed. Cancelled flights go into their own bucket.&lt;/p&gt;

&lt;p&gt;One thing that tripped me up early: I was computing delay as &lt;code&gt;actual - scheduled&lt;/code&gt; and sometimes getting negative numbers because flights occasionally depart ahead of schedule. Treating those as 0-minute delays (rather than negative delays) keeps the average meaningful.&lt;/p&gt;




&lt;h2&gt;
  
  
  The pipeline: from raw flights to stats
&lt;/h2&gt;

&lt;p&gt;The stats computation runs nightly as a batch job. It scans all completed flights from the previous 90 days and recomputes each bucket from scratch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// insights/run-stats.js&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;computeDelayStats&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cutoff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Pull all completed flights with actual departure times&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;in&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;cutoff&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;select&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;originIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;destIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;airlineCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;actualDepartureTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Group into buckets&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;buckets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getBucketKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;buckets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;buckets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;
    &lt;span class="nx"&gt;buckets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Compute stats per bucket and upsert&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bucketFlights&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;buckets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;computeBucketStats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;bucketFlights&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// insufficient sample&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;delayStat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;origin_dest_airline_hour_day&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="na"&gt;originIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;destIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;dest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="na"&gt;airlineCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt; 
      &lt;span class="p"&gt;}},&lt;/span&gt;
      &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;updatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;originIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;destIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;dest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                &lt;span class="na"&gt;airlineCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getBucketKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;originIata&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;destIata&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;airlineCode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hour&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getUTCHours&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hour&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// round down to 0,3,6,9...&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getUTCDay&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;originIata&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;destIata&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;airlineCode&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;dayOfWeek&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Computing the stats for a bucket
&lt;/h2&gt;

&lt;p&gt;Once flights are grouped, computing the three metrics is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MIN_SAMPLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// minimum flights required for a reliable stat&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;computeBucketStats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;MIN_SAMPLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cancelled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;departed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;actualDepartureTime&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// cancelledPct includes all flights in the bucket denominator&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cancelledPct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cancelled&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;departed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;sampleSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;onTimePct&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;avgDelayMin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cancelledPct&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;delays&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;departed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;delayMs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;actualDepartureTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;delayMs&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;onTimeCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;delays&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nx"&gt;ON_TIME_THRESHOLD_MINUTES&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;onTimePct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;onTimeCount&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;departed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;avgDelayMin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;delays&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;delays&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;sampleSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;onTimePct&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;onTimePct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;avgDelayMin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;avgDelayMin&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;cancelledPct&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cancelledPct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;MIN_SAMPLE = 30&lt;/code&gt; threshold is important. Below 30 flights, the percentages are too noisy to be useful — a single bad week skews everything. I return &lt;code&gt;null&lt;/code&gt; for under-sampled buckets, and the API reflects that with a lower-confidence indicator.&lt;/p&gt;




&lt;h2&gt;
  
  
  The minimum sample problem
&lt;/h2&gt;

&lt;p&gt;Thirty flights sounds easy, but it's not for thin routes. A regional airline operating 3 flights per week on a niche route will accumulate 30 samples in about 10 weeks — and if we're only keeping 90 days of history, we have a rolling window of roughly 36 observations at best.&lt;/p&gt;

&lt;p&gt;For these routes I use a fallback hierarchy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getDelayStats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// 1. Exact match: specific airline on this route/time&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;findStats&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sampleSize&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;MIN_SAMPLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;high&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Any airline on this route/time&lt;/span&gt;
  &lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;findStats&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sampleSize&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;MIN_SAMPLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;medium&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Any airline on this route, any time in this day bucket (morning/afternoon/evening)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dayPart&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;morning&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;afternoon&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;evening&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;findStats&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayPart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sampleSize&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;MIN_SAMPLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;low&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// 4. No data&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;confidence&lt;/code&gt; field surfaces in the API response so callers can decide how much weight to give the number. A &lt;code&gt;high&lt;/code&gt; confidence &lt;code&gt;onTimePct: 61&lt;/code&gt; means something; a &lt;code&gt;low&lt;/code&gt; confidence one is a rough prior at best.&lt;/p&gt;




&lt;h2&gt;
  
  
  Handling airline codes: the IATA/ICAO mess
&lt;/h2&gt;

&lt;p&gt;Flight data comes from airport boards in different formats. Some airports use IATA airline codes (2-letter: IB, VY, FR), some use ICAO (3-letter: IBE, VLG, RYR), and some mix them depending on the flight.&lt;/p&gt;

&lt;p&gt;The stats pipeline normalizes everything to IATA codes before bucketing. If a flight arrives with ICAO code &lt;code&gt;IBE&lt;/code&gt;, it gets mapped to &lt;code&gt;IB&lt;/code&gt; before the bucket key is computed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// airline-normalizer.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ICAO_TO_IATA&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IBE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IB&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Iberia&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;VLG&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;VY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Vueling&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RYR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Ryanair&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BAW&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BA&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// British Airways&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DLH&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LH&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Lufthansa&lt;/span&gt;
  &lt;span class="c1"&gt;// ... ~400 more&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;normalizeAirlineCode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toUpperCase&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// already IATA&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;ICAO_TO_IATA&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toUpperCase&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// map or drop&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unknown airline codes get dropped (returned as &lt;code&gt;null&lt;/code&gt;), which means those flights still feed into the route-level stats (fallback level 2) but not the airline-specific stats (level 1).&lt;/p&gt;




&lt;h2&gt;
  
  
  What the numbers actually look like
&lt;/h2&gt;

&lt;p&gt;After running the pipeline across all accumulated data, some patterns stand out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The morning variance is real.&lt;/strong&gt; The 06–09 bucket has the highest &lt;code&gt;avgDelayMin&lt;/code&gt; of any time window — roughly 1.4x the all-day average — but also the second-highest &lt;code&gt;onTimePct&lt;/code&gt;. What's happening: most early morning flights are fine, but when they go wrong, they go badly wrong. The distribution is bimodal: short delays almost never, then a long tail of cascaded delays from overnight issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fridays are not the worst day.&lt;/strong&gt; Sunday evenings are. Flights completing weekend itineraries face the worst on-time performance across almost every route in the dataset. Monday mornings are second worst.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Low-cost carrier routes have lower &lt;code&gt;onTimePct&lt;/code&gt; but lower &lt;code&gt;avgDelayMin&lt;/code&gt;&lt;/strong&gt;. Budget airlines cancel less (they have fewer connectivity options, so they push through more delays) but are more frequently slightly late. Legacy carriers cancel more but have tighter average delay windows when they do depart.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The 90-day rolling window matters more than I expected.&lt;/strong&gt; Seasonal patterns mean that data from January is largely irrelevant for predicting August performance. I may eventually shrink the window to 60 days and weight recent flights more heavily.&lt;/p&gt;




&lt;h2&gt;
  
  
  Surfacing it in the API
&lt;/h2&gt;

&lt;p&gt;When a flight is requested, the stats lookup runs against the &lt;code&gt;delay_stats&lt;/code&gt; table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// api/flights.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;enrichWithInsights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getUTCHours&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getUTCDay&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;airline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalizeAirlineCode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;airlineCode&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getDelayStats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;originIata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;destIata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;delayRisk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;delayRisk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;onTimePct&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onTimePct&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onTimePct&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;avgDelayMinutes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;avgDelayMin&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;avgDelayMin&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;cancelledPct&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cancelledPct&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cancelledPct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sampleSize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildContextString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;buildContextString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Sun&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Mon&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Tue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Wed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Thu&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Fri&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Sat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timeLabel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;morning&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hourBucket&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;afternoon&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;evening&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;airlineCode&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;originIata&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;→&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;destIata&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;days&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;dayOfWeek&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;timeLabel&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;context&lt;/code&gt; string makes the number interpretable at a glance — you see immediately that &lt;code&gt;onTimePct: 61&lt;/code&gt; applies to "VY MAD→BCN, Mon morning" rather than to Vueling in general.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Weight by recency.&lt;/strong&gt; All flights in the 90-day window currently have equal weight. A flight from 85 days ago contributes equally to the stats as one from yesterday. Exponential decay weighting — where more recent flights count more — would make the score respond faster to airline operational changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Track the delay distribution, not just the average.&lt;/strong&gt; &lt;code&gt;avgDelayMinutes: 28&lt;/code&gt; tells you the mean, but it hides a lot. A route with a 28-minute average could be "almost always 25–30 minutes late" or "usually on time but occasionally 4 hours late." The latter is much worse for connecting passengers. Storing the 90th percentile delay alongside the average would be far more useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Separate departure and arrival delay.&lt;/strong&gt; Right now everything is departure-delay. But passengers care about arrival delay. A flight that departs 30 minutes late and lands 10 minutes late (because of a fast cruise and short taxi) is very different from one that lands 35 minutes late. Airport boards rarely surface arrival delay in real time, but when they do, it's worth tracking separately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-aircraft-type stats.&lt;/strong&gt; Narrow-body short-haul routes flown by a mix of A320s and B737s have meaningfully different delay profiles depending on the day's specific aircraft assignment. I don't have aircraft type data yet, but it's on the roadmap.&lt;/p&gt;




&lt;p&gt;The delay risk score started as a single percentage. It's now a layered lookup across four levels of fallback, a batch pipeline that runs every night, and the most-queried field in the API. Getting it right was mostly a series of small decisions — the 15-minute threshold, the 3-hour buckets, the minimum sample of 30 — each of which turned out to matter more than I expected.&lt;/p&gt;

&lt;p&gt;The full API, including &lt;code&gt;delayRisk&lt;/code&gt; on every flight, is at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;. Free tier: 100 requests/day, no card required.&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>webdev</category>
      <category>aviation</category>
    </item>
    <item>
      <title>What 628,000 Flights Taught Me About Cancellations (And Why They're Harder to Track Than You Think)</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Sat, 18 Apr 2026 19:26:45 +0000</pubDate>
      <link>https://dev.to/jfcaba/what-628000-flights-taught-me-about-cancellations-and-why-theyre-harder-to-track-than-you-think-2k3</link>
      <guid>https://dev.to/jfcaba/what-628000-flights-taught-me-about-cancellations-and-why-theyre-harder-to-track-than-you-think-2k3</guid>
      <description>&lt;p&gt;Flight cancellations look simple from the outside: a flight was scheduled, it didn't happen, it shows "Cancelled." Done.&lt;/p&gt;

&lt;p&gt;After processing 628,000+ flights across 1,300 airports, I can tell you it's not that simple. Cancellations are one of the messiest problems in airport data — and understanding them properly forced me to build detection infrastructure I didn't know I needed.&lt;/p&gt;

&lt;p&gt;Here's what I learned.&lt;/p&gt;




&lt;h2&gt;
  
  
  The three types of cancellation (that most people treat as one)
&lt;/h2&gt;

&lt;p&gt;Not all cancellations are the same, and the differences matter for both the traveller and the data pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Early cancellation&lt;/strong&gt;: The airline cancels the flight hours or days before departure and updates the schedule. The flight board never shows it. From a scraping perspective, this flight simply doesn't appear. You can't detect it by watching the board — you'd need to compare against the original schedule, which you may not have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Late cancellation&lt;/strong&gt;: The flight is on the board with status "Scheduled" or "On time" and then, sometimes within 30 minutes of departure, flips to "Cancelled." This is the one passengers hate most, and it's the one your data pipeline needs to actively detect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ghost cancellation&lt;/strong&gt;: The flight disappears from the board entirely without ever showing "Cancelled." The row was there in the previous scrape, it's gone now, and there's no explicit status change. This is the hardest one.&lt;/p&gt;

&lt;p&gt;MyAirports encounters all three, and each needs a different detection strategy.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ghost flights: the cancellation nobody tells you about
&lt;/h2&gt;

&lt;p&gt;This was the most surprising thing I learned. A non-trivial number of airport websites simply remove a cancelled flight from their board rather than marking it as cancelled. It just vanishes.&lt;/p&gt;

&lt;p&gt;If your scraper only looks at the current board state, you'll never see these cancellations. They happened, but the evidence is gone.&lt;/p&gt;

&lt;p&gt;The fix is snapshot comparison:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// cancellation-detector.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;detectGhostCancellations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;direction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;windowMinutes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Get what we saw in the last scrape&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prevScrape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;airportIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;direction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;lastSeen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// older than 15 min&lt;/span&gt;
      &lt;span class="na"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;now&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTime&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;windowMinutes&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;not&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Get what's on the board now&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;currentBoard&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;scrapeAirport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;direction&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;currentFlightNumbers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;currentBoard&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flightNumber&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="c1"&gt;// Anything that was there and is now gone&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;disappeared&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;prevScrape&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; 
    &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;currentFlightNumbers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flightNumber&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
    &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="c1"&gt;// not already marked&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;disappeared&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;cancellationReason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ghost&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// disappeared without notice&lt;/span&gt;
    &lt;span class="na"&gt;detectedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The window matters. A flight that departed won't show on the board either — you only want to flag flights that disappeared &lt;em&gt;before&lt;/em&gt; their scheduled time. The 15-minute "last seen" buffer prevents false positives from temporary scrape failures.&lt;/p&gt;




&lt;h2&gt;
  
  
  Normalizing "Cancelled" across 30+ languages
&lt;/h2&gt;

&lt;p&gt;Article 3 in this series covered status normalization in general. Cancellations specifically deserve attention because the variation is extreme.&lt;/p&gt;

&lt;p&gt;A cancelled flight might show as:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Status string&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;English&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Cancelled&lt;/code&gt;, &lt;code&gt;Canceled&lt;/code&gt;, &lt;code&gt;CNCL&lt;/code&gt;, &lt;code&gt;CNX&lt;/code&gt;, &lt;code&gt;CXL&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spanish&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Cancelado&lt;/code&gt;, &lt;code&gt;Anulado&lt;/code&gt;, &lt;code&gt;CANCELADO&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;German&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Gestrichen&lt;/code&gt;, &lt;code&gt;Annulliert&lt;/code&gt;, &lt;code&gt;Abgesagt&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;French&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Annulé&lt;/code&gt;, &lt;code&gt;Supprimé&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Italian&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Cancellato&lt;/code&gt;, &lt;code&gt;Soppresso&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Portuguese&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Cancelado&lt;/code&gt;, &lt;code&gt;Suprimido&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Russian&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Отменён&lt;/code&gt;, &lt;code&gt;Отменен&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Arabic&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;ملغي&lt;/code&gt;, &lt;code&gt;ملغاة&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Turkish&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;İptal&lt;/code&gt;, &lt;code&gt;İptal Edildi&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Norwegian&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Kansellert&lt;/code&gt;, &lt;code&gt;Innstilt&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And then there are airline-specific codes with no natural language at all: &lt;code&gt;CNX&lt;/code&gt;, &lt;code&gt;CXL&lt;/code&gt;, &lt;code&gt;CNCL&lt;/code&gt;, &lt;code&gt;XLD&lt;/code&gt;, &lt;code&gt;CX&lt;/code&gt;. Some airports emit only the code without any label.&lt;/p&gt;

&lt;p&gt;The normalizer handles this in layers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// status-normalizer.js&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CANCELLED_EXACT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
  &lt;span class="c1"&gt;// English&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;canceled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancellation&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cncl&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cnx&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cxl&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;xld&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cx&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Spanish&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelado&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;anulado&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// German&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gestrichen&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;annulliert&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;abgesagt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// French&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;annulé&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;annule&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;supprimé&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;supprime&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Italian&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancellato&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;soppresso&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Portuguese&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelado&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;suprimido&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Dutch&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;geannuleerd&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;geannuleerd vlucht&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Norwegian/Swedish/Danish&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;kansellert&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;innstilt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;inställd&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aflyst&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Finnish&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;peruutettu&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Russian&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;отменён&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;отменен&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;отменен рейс&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Turkish&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iptal&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iptal edildi&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Arabic (normalized, RTL stripped)&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ملغي&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ملغاة&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Chinese&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;取消&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;已取消&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CANCELLED_CONTAINS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancel&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;annul&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gestr&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;suprim&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;peruut&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;innstil&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;normalizeToCancelled&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;[^\p&lt;/span&gt;&lt;span class="sr"&gt;{L}&lt;/span&gt;&lt;span class="se"&gt;\p&lt;/span&gt;&lt;span class="sr"&gt;{N}&lt;/span&gt;&lt;span class="se"&gt;\s]&lt;/span&gt;&lt;span class="sr"&gt;/gu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CANCELLED_EXACT&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CANCELLED_CONTAINS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pat&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pat&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// not a cancellation&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;replace(/[^\p{L}\p{N}\s]/gu, '')&lt;/code&gt; strips punctuation and special characters using Unicode property escapes — critical for Arabic and Chinese status strings that sometimes arrive with RTL markers or extra punctuation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tracking cancellations over time: the &lt;code&gt;flights&lt;/code&gt; table
&lt;/h2&gt;

&lt;p&gt;Every scraped flight goes into the &lt;code&gt;flights&lt;/code&gt; table. When a cancellation is detected — either from an explicit status change or a ghost detection — the flight record gets updated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// persistence.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;saveFlights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;direction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;flightNumber_scheduledTime_airportIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;flightNumber&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flightNumber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;airportIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;}},&lt;/span&gt;
      &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;lastSeen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;...(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;cancelledAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
          &lt;span class="na"&gt;cancellationReason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cancellationReason&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;explicit&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;firstSeen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;lastSeen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;cancelledAt&lt;/code&gt; timestamp and &lt;code&gt;cancellationReason&lt;/code&gt; fields are what make the historical analysis possible later.&lt;/p&gt;




&lt;h2&gt;
  
  
  Mass cancellation detection: weather, strikes, and cascades
&lt;/h2&gt;

&lt;p&gt;Individual cancellations are noise. Mass cancellations are a signal worth surfacing.&lt;/p&gt;

&lt;p&gt;When an airport has 10+ cancellations in a short window, something structural is happening: weather, strike action, airport closure, ground stop, or an airline operational meltdown. The data pattern is distinctive:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// anomaly-detector.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;detectMassCancellations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;windowHours&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;since&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;windowHours&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;recentCancellations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;airportIata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;cancelledAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;since&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getBaselineCancellationRate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;windowHours&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;recentCancellations&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;baseline&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mass_cancellation&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;airport&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;recentCancellations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;baseline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;deviationFactor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;recentCancellations&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;baseline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;detectedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getBaselineCancellationRate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;windowHours&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dayOfWeek&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;getDay&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$queryRaw&lt;/span&gt;&lt;span class="s2"&gt;`
    SELECT AVG(daily_count) as avg_count
    FROM (
      SELECT DATE(cancelled_at) as day, COUNT(*) as daily_count
      FROM flights
      WHERE airport_iata = &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
        AND status = 'cancelled'
        AND EXTRACT(DOW FROM cancelled_at) = &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;dayOfWeek&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
        AND cancelled_at &amp;gt; NOW() - INTERVAL '30 days'
      GROUP BY DATE(cancelled_at)
    ) daily
  `&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;avg_count&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 3x deviation threshold is empirical — below that and you get false positives from normal day-to-day variation. Above 3x is consistently meaningful.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the data actually shows
&lt;/h2&gt;

&lt;p&gt;After accumulating 628,000+ flights, the cancellation patterns are clear:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cancellation rate by time of day&lt;/strong&gt; — Early morning flights (05:00–07:00) cancel at roughly 1.2x the rate of afternoon flights. Not because airlines plan worse in the morning, but because cascades: a late-arriving aircraft the night before causes the early morning departure to cancel, and the delay propagates forward through the day.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cancellation rate by day of week&lt;/strong&gt; — Mondays and Fridays have the highest cancellation rates, roughly 1.4x midweek. This matches the business travel calendar: high demand creates less buffer for aircraft rotations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The "cancellation cascade" pattern&lt;/strong&gt; — When we detect a cancellation on a route (e.g., MAD→LHR at 06:30), the probability of a cancellation on the return (LHR→MAD, any time that day) jumps by roughly 40%. Same aircraft type, same crew. The cascade is real and detectable from the data alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ghost cancellations by region&lt;/strong&gt; — Ghost cancellations (flights that vanish without notice) are most common in South American and Southeast Asian airports — roughly 3x the rate of European and North American airports. The likely cause is older CMS software that just deletes cancelled entries rather than updating their status.&lt;/p&gt;




&lt;h2&gt;
  
  
  Surfacing this in the API
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;delay_stats&lt;/code&gt; table in the insights engine stores cancelled percentage per route per time bucket:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// insights/analyzers/delay-stats.js (relevant section)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cancelledPct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;totalFlights&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cancelledCount&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;totalFlights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
  &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;delayStat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* composite key */&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;cancelledPct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="cm"&gt;/* other fields */&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;cancelledPct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="cm"&gt;/* other fields */&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This feeds directly into the &lt;code&gt;delayRisk&lt;/code&gt; field on every flight in the API response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"flightNumber"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IB3163"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scheduledTime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-18T06:30:00Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"scheduled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"insights"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"delayRisk"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"onTimePct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;68&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"avgDelayMinutes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"cancelledPct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"sample"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;143&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IB to LHR, Fri morning"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A &lt;code&gt;cancelledPct&lt;/code&gt; of 4.7% on a Friday morning Iberia flight to London isn't alarming in isolation — but if you're booking travel insurance or a tight connection, it's exactly the number you want.&lt;/p&gt;




&lt;h2&gt;
  
  
  The EU261 connection
&lt;/h2&gt;

&lt;p&gt;Under EU Regulation 261/2004, a cancellation notified fewer than 14 days before departure entitles the passenger to compensation: €250–€600 depending on route distance. The detection pipeline feeds directly into this.&lt;/p&gt;

&lt;p&gt;When a flight cancels in our system with &lt;code&gt;cancelledAt&lt;/code&gt; set to less than 14 days before &lt;code&gt;scheduledTime&lt;/code&gt;, the EU261 eligibility flag gets set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// eu261.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkEu261Eligibility&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;noticeHours&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scheduledTime&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cancelledAt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;noticeDays&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;noticeHours&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;noticeDays&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;eligible&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;advance_notice&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;distanceKm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getRouteDistance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;originIata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;destIata&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;eligible&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;compensationEur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;distanceKm&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;250&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;distanceKm&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;3500&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;noticeDays&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;noticeDays&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nx"&gt;distanceKm&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The catch: &lt;code&gt;cancelledAt&lt;/code&gt; is when &lt;em&gt;we&lt;/em&gt; detected the cancellation, not necessarily when the airline officially notified passengers. For ghost cancellations — the ones that disappeared without notice — we mark &lt;code&gt;cancellationReason: 'ghost'&lt;/code&gt; and flag the EU261 check as needing manual verification.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Track the full status timeline.&lt;/strong&gt; Right now, if a flight goes Scheduled → Delayed → Cancelled, I only have the final state and the timestamps. I should log every status transition. A flight that was delayed 4 times before finally cancelling tells a different story than one that cancelled cleanly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Capture the board timestamp.&lt;/strong&gt; Each scrape runs against live airport data, but the data is often cached on the airport side. I should record the "data as of" timestamp from the airport's API response, not just my scrape time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Watch for reinstatements.&lt;/strong&gt; Cancelled flights occasionally get reinstated — a replacement aircraft is found, the weather clears faster than expected. Right now, a reinstatement would just update the status from "cancelled" to "scheduled," but I don't surface this event explicitly.&lt;/p&gt;




&lt;p&gt;Flight cancellations are a solved problem from the traveller's perspective — the airline's app tells you. But from the data perspective, they're a distributed, multilingual, inconsistent mess. The ghost cancellations alone took me three months to discover were happening.&lt;/p&gt;

&lt;p&gt;The full cancellation API is available at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;. Free tier: 100 requests/day, no card required.&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>webdev</category>
      <category>aviation</category>
    </item>
    <item>
      <title>How I Automated Onboarding 1,300 Airport Websites Without Looking at a Single One</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Thu, 16 Apr 2026 20:38:55 +0000</pubDate>
      <link>https://dev.to/jfcaba/how-i-automated-onboarding-1300-airport-websites-without-looking-at-a-single-one-4eg6</link>
      <guid>https://dev.to/jfcaba/how-i-automated-onboarding-1300-airport-websites-without-looking-at-a-single-one-4eg6</guid>
      <description>&lt;p&gt;The original version of MyAirports required a developer (me) to open DevTools, find the network request carrying flight data, copy the URL, figure out the headers, and paste everything into a config file. One airport at a time.&lt;/p&gt;

&lt;p&gt;That's fine for ten airports. It doesn't scale to a thousand.&lt;/p&gt;

&lt;p&gt;So I built a pipeline that does it automatically: give it an IATA code, and it finds the airport's website, discovers the flight data API, detects the scraping strategy, and writes the config code. No DevTools. No manual inspection. Here's how it works.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem with airport websites
&lt;/h2&gt;

&lt;p&gt;Every airport runs its own website. Some are operated by independent airport authorities. Some are run by national operators — Spain's AENA runs 46 airports on the same platform; Finland's Finavia runs 20. Some outsource to the same flight display software vendor.&lt;/p&gt;

&lt;p&gt;This creates a few patterns worth detecting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Operator groups&lt;/strong&gt;: airports that share the same backend API. Find one, and you've implicitly found all of them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standalone airports&lt;/strong&gt;: unique websites with unique APIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor software&lt;/strong&gt;: airports using products like AirportInfo or similar. Same HTML structure, same endpoints, different domain.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DOM-only&lt;/strong&gt;: no discoverable API. Flight data is rendered server-side, so you scrape HTML.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The discovery pipeline needs to detect which pattern applies and generate the right config for each.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Find the right URL
&lt;/h2&gt;

&lt;p&gt;The first obstacle is simple: given an IATA code like &lt;code&gt;ATH&lt;/code&gt; (Athens), what's the official airport website URL?&lt;/p&gt;

&lt;p&gt;This sounds trivial but isn't. Search for "Athens airport" and you get aggregators — Skyscanner, TripAdvisor, Wikipedia — before you get the official site. The flight data lives on the official site, not on aggregators.&lt;/p&gt;

&lt;p&gt;The solution is a purpose-built search query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// searcher.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;findAirportUrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;airportName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;country&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`"&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;airportName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" official airport website flight departures arrivals -skyscanner -tripadvisor -rome2rio`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;serperSearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;filterAggregators&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;filterAggregators&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;blocklist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;skyscanner&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tripadvisor&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;rome2rio&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;kayak&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;expedia&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flightaware&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flightradar24&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;wikipedia&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;wikitravel&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;airportia&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flightsfrom&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;airports-worldwide&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;// 30+ more&lt;/span&gt;
  &lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; 
    &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;blocklist&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;blocked&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The blocklist has grown to 30+ domains over time — whenever a false positive slips through and breaks a discovery, the domain gets added. The negative search terms in the query help, but they're not reliable enough on their own.&lt;/p&gt;

&lt;p&gt;For airports where Serper finds nothing useful, the pipeline falls back to in-browser DuckDuckGo search with the same filtering logic.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Intercept network traffic
&lt;/h2&gt;

&lt;p&gt;Once the URL is found, the pipeline launches a stealth Playwright browser and loads the page. Rather than looking at the HTML, it watches the network:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// interceptor.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;discoverApi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

  &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;response&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;url&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;contentType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;content-type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;contentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;scoreResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;request&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;targetUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;networkidle&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;scoreResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flight&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flightnumber&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departure&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrival&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iata&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;icao&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;status&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gate&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;airline&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;carrier&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flights&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;departures&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arrivals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// URL indicators  &lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flight&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fids&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departure&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrival&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The highest-scoring response is the flight data API. The request headers that triggered it — including any session cookies, CSRF tokens, or API keys — get captured at the same time. One browser session is all it takes; from then on, the system calls the API directly over HTTP without a browser.&lt;/p&gt;

&lt;p&gt;This is the core idea behind MyAirports: use a browser once to discover, then never again.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: WAF detection and FlareSolverr fallback
&lt;/h2&gt;

&lt;p&gt;Before the browser even hits the target page, the pipeline checks for WAF protection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// waf-detector.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;detectWaf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Just a moment&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cf-browser-verification&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cloudflare&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;incapsula&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;visid_incap&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;incapsula&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If Cloudflare is detected, the pipeline hands off to FlareSolverr — a Docker sidecar that runs a real browser, solves the JS challenge, and returns the cookies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// flaresolverr.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;solveCloudflare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:8191/v1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;request.get&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;maxTimeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60000&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userAgent&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The solved cookies get stored in a &lt;code&gt;Session&lt;/code&gt; table and reused until they expire, so the challenge-solving cost is paid once per airport, not on every scrape.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: Detect operator groups
&lt;/h2&gt;

&lt;p&gt;Some airports share infrastructure. Spain's AENA airports all use the same API at &lt;code&gt;aena.es/es/flight-search&lt;/code&gt;. Once you've discovered one AENA airport, you've got the pattern for all 46.&lt;/p&gt;

&lt;p&gt;The group detector looks for these shared patterns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// group-detector.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;KNOWN_OPERATORS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AENA&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;urlPattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/aena&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="sr"&gt;es/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;airports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MAD&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BCN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;VLC&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Finavia&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;urlPattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/finavia&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="sr"&gt;fi/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;airports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HEL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OUL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ADP&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;urlPattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/parisaeroport&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="sr"&gt;fr/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;airports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CDG&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ORY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;detectOperatorGroup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;discoveredUrl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;op&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;KNOWN_OPERATORS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;op&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;urlPattern&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;discoveredUrl&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;op&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a group match is found, the code generator writes configs for all airports in the group from a single template.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5: AI pattern learning for DOM-only airports
&lt;/h2&gt;

&lt;p&gt;Some airports don't have a JSON API. Flight data is rendered in HTML tables or card layouts, with no network request to intercept.&lt;/p&gt;

&lt;p&gt;The AI learner handles these:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ai-learn.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;learnPattern&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;launchStealthBrowser&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;networkidle&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;18000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;trimmed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;trimHtml&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;15000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// fits in context window&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
    This is an airport flight board HTML page. 
    Identify CSS selectors for each flight row, flight number, 
    destination/origin, scheduled time, gate, and status.
    Also list any status text values and what they mean.
    Return JSON: { rowSelector, columns: { flightNumber, destination, time, gate, status }, statusMappings }
  `&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;deepseekChat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;trimmed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Successfully AI-learned airports so far:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Airport&lt;/th&gt;
&lt;th&gt;Row selector&lt;/th&gt;
&lt;th&gt;Layout&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GYD (Baku)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;a.flt_row&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Card with &lt;code&gt;.fl_col.*&lt;/code&gt; columns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KTW (Katowice)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;div.timetable__row.flight-board__row&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Table-like rows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TLL (Tallinn)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;li.flights-list__item&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List card&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DUS (Düsseldorf)&lt;/td&gt;
&lt;td&gt;&lt;em&gt;(table class detection)&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Standard table&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each of these would have taken 20–30 minutes of manual DevTools work. The AI learner does it in about 30 seconds.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 6: Auto-generate config code
&lt;/h2&gt;

&lt;p&gt;Once the pattern is known, the code generator writes the actual config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// code-generator.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateAirportEntry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`
  &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;iata&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: {
    name: '&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/'/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;',
    url: '&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;',
    apiEndpoint: '&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;',
    headers: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;,
  },`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For operator groups it generates a shared template and references:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateOperatorEntry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;airports&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`
const &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;Config = {
  apiBase: '&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apiBase&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;',
  headers: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;,
};

&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;airports&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;`  &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: { ...&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;Config, iata: '&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;' },`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Running at scale: 1,300 airports
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run auto-discover                    &lt;span class="c"&gt;# process all airports&lt;/span&gt;
npm run auto-discover &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nt"&gt;--country&lt;/span&gt; DE   &lt;span class="c"&gt;# just one country&lt;/span&gt;
npm run auto-discover &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nt"&gt;--skip&lt;/span&gt; 400     &lt;span class="c"&gt;# resume from checkpoint&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tool reads 6,000+ entries from the OpenFlights world airports CSV, processes in batches of 10 with a 60-second timeout per airport, and writes configs directly to disk.&lt;/p&gt;

&lt;p&gt;Results after one full run across 1,300 airports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;276 new airports discovered&lt;/strong&gt; — APIs found, configs generated, immediately usable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~600 DOM-only airports&lt;/strong&gt; — HTML scraping with multilingual status normalisation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~400 failures&lt;/strong&gt; — heavy WAF, no official website, or page timeouts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operator groups detected&lt;/strong&gt;: AENA (46), Finavia (20), ADP (3), and several smaller groups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Total processing time: ~4 hours on a single VPS. Compared to 20–30 minutes per airport manually, that's roughly 400 hours of DevTools work replaced by a shell command.&lt;/p&gt;




&lt;h2&gt;
  
  
  What still breaks
&lt;/h2&gt;

&lt;p&gt;The pipeline isn't perfect. Common failure modes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Heavy bot protection&lt;/strong&gt;: Akamai Bot Manager and Imperva with device fingerprinting go beyond what FlareSolverr can handle. These remain manual.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamic API tokens&lt;/strong&gt;: Some airports generate short-lived tokens that expire between scrapes. Discovery captures them, but they're invalid an hour later. These need re-discovery on every request, which is expensive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;iFrame-embedded boards&lt;/strong&gt;: Some airports embed a third-party flight board in an iframe from a different domain. Network interception doesn't always capture those requests correctly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No-English flight data&lt;/strong&gt;: A few airports only publish in their local language (Cyrillic, Arabic, CJK). The status normaliser handles most of it, but field mapping still fails occasionally.&lt;/p&gt;




&lt;p&gt;The auto-discovery pipeline turned a manual, one-by-one chore into a batch job. The 276 airports it found in one run would have taken months to add by hand. The failures are the interesting edge cases — and those are where the AI pattern learner earns its keep.&lt;/p&gt;

&lt;p&gt;The API is free to try at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;. Free tier: 100 requests/day, no card required.&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Two Auth Problems: How MyAirports Handles Identity on Both Sides of the API</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Tue, 14 Apr 2026 12:42:50 +0000</pubDate>
      <link>https://dev.to/jfcaba/two-auth-problems-how-myairports-handles-identity-on-both-sides-of-the-api-3o7</link>
      <guid>https://dev.to/jfcaba/two-auth-problems-how-myairports-handles-identity-on-both-sides-of-the-api-3o7</guid>
      <description>&lt;p&gt;Every API has an auth problem. Usually it's one: how do you verify that the person calling your API is allowed to?&lt;/p&gt;

&lt;p&gt;Building MyAirports — a real-time flight data API covering 1,000+ airports — gave me two auth problems. The first is the standard one: verifying developers who use my API. The second is messier: authenticating against 1,000+ different airport APIs, each of which expects something different.&lt;/p&gt;

&lt;p&gt;Here's how I solved both.&lt;/p&gt;




&lt;h2&gt;
  
  
  The outward-facing problem: clean developer auth
&lt;/h2&gt;

&lt;p&gt;When a developer hits the MyAirports API, I want the experience to be familiar. You sign up, you get a key, you put it in a header, it works.&lt;/p&gt;

&lt;h3&gt;
  
  
  API key format
&lt;/h3&gt;

&lt;p&gt;Keys follow this pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ma_live_&amp;lt;32 random hex characters&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ma_live_&lt;/code&gt; prefix makes keys identifiable at a glance — in log files, in config dumps, in error messages. If you accidentally paste your key into a public repo, a secret scanning tool can spot it. This is the same pattern Stripe, Resend, and most modern API providers use, and for good reason.&lt;/p&gt;

&lt;h3&gt;
  
  
  Storage: never store the plaintext key
&lt;/h3&gt;

&lt;p&gt;The key is generated once, shown to the user once, and then discarded. What gets stored in the database is the SHA-256 hash of the key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// generation&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rawKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`ma_live_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randomBytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hashedKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sha256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rawKey&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// store only hashedKey in ApiKey table&lt;/span&gt;
&lt;span class="c1"&gt;// return rawKey to the user (once)&lt;/span&gt;

&lt;span class="c1"&gt;// lookup&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;incomingHash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sha256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;incomingKey&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;record&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findFirst&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;hashedKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;incomingHash&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means a database breach doesn't expose live keys. The attacker gets hashes that are useless without the original plaintext.&lt;/p&gt;

&lt;h3&gt;
  
  
  Auth middleware: two paths into the system
&lt;/h3&gt;

&lt;p&gt;The middleware checks in priority order:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// api/src/auth/middleware.js&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;authMiddleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Path 1: API key (developer access)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;x-api-key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hashed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;record&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findFirst&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;hashedKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hashed&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findUnique&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Path 2: JWT cookie (dashboard/browser access)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;JWT_SECRET&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Unauthenticated — public endpoints proceed, protected endpoints return 401&lt;/span&gt;
  &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;API keys handle programmatic access. JWTs handle the browser dashboard. The same middleware serves both, and there's no ambiguity about which wins — API key always takes precedence.&lt;/p&gt;

&lt;p&gt;The JWT setup is standard: 1-hour access tokens, 30-day refresh tokens, both in httpOnly cookies. HS256, signed with &lt;code&gt;JWT_SECRET&lt;/code&gt;. The short access token TTL limits blast radius if a token leaks; the refresh token means users don't re-authenticate constantly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rate limiting: in-memory counters, not DB hits
&lt;/h3&gt;

&lt;p&gt;Rate limiting on every request is a performance concern. Hitting PostgreSQL on every API call to check a counter would add latency to every response. Instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In-memory store — fast lookup&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;counters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// userId -&amp;gt; { count, date }&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkRateLimit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;today&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;T&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;today&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;today&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;today&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;plan&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pro&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;counters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The in-memory counters are flushed to the &lt;code&gt;ApiUsage&lt;/code&gt; PostgreSQL table every hour. This gives you persistence across restarts and a usage history for the dashboard, without a DB round-trip on every request.&lt;/p&gt;

&lt;p&gt;Rate limit state is communicated back to callers via standard headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;X-RateLimit-Limit: 100
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1712275200
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And when the limit is exceeded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;HTTP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Too&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Many&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Requests&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Rate limit exceeded"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"remaining"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"resetAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-15T00:00:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The inward-facing problem: 1,000 different auth schemes
&lt;/h2&gt;

&lt;p&gt;Now for the harder problem.&lt;/p&gt;

&lt;p&gt;MyAirports scrapes flight data from airport websites to serve its API. Each airport website is different. And "different" doesn't just mean different data formats — it means different auth requirements too.&lt;/p&gt;

&lt;p&gt;Here's a sample of what the scraper encounters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No auth required&lt;/strong&gt; — most airports. The flight data API is unauthenticated or only requires session cookies the browser already has.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloudflare JS challenge&lt;/strong&gt; — the page won't load without executing a JavaScript fingerprinting challenge first. No challenge, no data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incapsula WAF&lt;/strong&gt; — similar JS challenge, different vendor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CSRF tokens&lt;/strong&gt; — some airport APIs require a valid CSRF token extracted from the page before accepting data requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaked third-party API keys&lt;/strong&gt; — during browser discovery, the scraper sometimes intercepts network requests that include API keys in headers. Airports that license flight data software sometimes include the vendor's API key in client-side requests.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The last one deserves its own Prisma model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model LeakedKey {
  id          Int      @id @default(autoincrement())
  airportIata String
  keyName     String   // e.g. "X-API-Key", "Authorization"
  keyValue    String   // the actual leaked key value
  source      String   // URL it was found in
  tested      Boolean  @default(false)
  working     Boolean  @default(false)
  createdAt   DateTime @default(now())
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the interceptor finds an API key embedded in a request header during discovery, it logs it here. Some of these keys work for direct calls to the upstream flight data vendor — an interesting optimization, though using them raises its own questions.&lt;/p&gt;

&lt;h3&gt;
  
  
  WAF session persistence
&lt;/h3&gt;

&lt;p&gt;Solving a Cloudflare or Incapsula challenge is expensive: it requires a full browser session, human-like behavior simulation, and sometimes a FlareSolverr sidecar to handle the cryptographic JS challenge:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; flaresolverr &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 8191:8191 &lt;span class="se"&gt;\&lt;/span&gt;
  ghcr.io/flaresolverr/flaresolverr:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paying that cost on every scrape is wasteful. Instead, solved cookies are stored in the &lt;code&gt;Session&lt;/code&gt; table and reused:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model Session {
  id          Int      @id @default(autoincrement())
  airportIata String   @unique
  cookies     Json     // solved WAF cookies
  userAgent   String   // must match the browser that solved the challenge
  expiresAt   DateTime
  createdAt   DateTime @default(now())
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The session store checks for a valid unexpired session before launching a browser. A session that's still valid skips the challenge entirely — the scraper just presents the saved cookies.&lt;/p&gt;

&lt;h3&gt;
  
  
  The anti-bot evasion stack
&lt;/h3&gt;

&lt;p&gt;Beyond WAF cookies, the browser sessions themselves need to look human. The stealth layer manages six distinct browser profiles — each with a coherent set of &lt;code&gt;navigator&lt;/code&gt; properties, WebGL fingerprints, HTTP headers, and behavioral patterns.&lt;/p&gt;

&lt;p&gt;The profiles are designed to look like real returning users rather than fresh bot fingerprints. Persistent Chromium user-data directories reinforce this — the browser accumulates cookies and browsing history across scrapes, which is a signal many WAFs check.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stale session eviction
&lt;/h3&gt;

&lt;p&gt;Sessions expire. WAF cookies typically last 24-48 hours. The system evicts sessions as they expire and re-solves challenges when needed.&lt;/p&gt;

&lt;p&gt;There's a parallel mechanism for discovered API endpoints: the &lt;code&gt;ApiCache&lt;/code&gt; table tracks success and failure counts per endpoint. An endpoint that returns zero flights three times in a row gets flagged as stale:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model ApiCache {
  id           Int      @id @default(autoincrement())
  airportIata  String   @unique
  endpoint     String
  headers      Json
  successCount Int      @default(0)
  failCount    Int      @default(0)
  lastUsed     DateTime
  stale        Boolean  @default(false)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A stale endpoint triggers re-discovery — the browser fires up again, re-intercepts the network traffic, and hopefully finds the new API URL.&lt;/p&gt;




&lt;h2&gt;
  
  
  The architecture of two auth systems
&lt;/h2&gt;

&lt;p&gt;Stepping back: the MyAirports API is a thin, clean auth layer sitting on top of a chaotic, unpredictable auth environment.&lt;/p&gt;

&lt;p&gt;Outward-facing: developers see a consistent, well-documented interface. API keys in headers, rate limits in response headers, 429s when limits are hit. Predictable.&lt;/p&gt;

&lt;p&gt;Inward-facing: the scraper deals with a different kind of problem every hour. No auth, WAF challenges, leaked keys, CSRF tokens, expired sessions. Unpredictable by design.&lt;/p&gt;

&lt;p&gt;The key insight is that these two problems need to be separated architecturally. The developer-facing API layer doesn't know or care about WAF sessions — that complexity is fully contained in the scraping layer. The scraping layer doesn't know anything about &lt;code&gt;ma_live_*&lt;/code&gt; keys — that's the API layer's concern.&lt;/p&gt;

&lt;p&gt;The split keeps each layer simple. The outward-facing auth is straightforward standard API security. The inward-facing auth is a collection of adapters for a messy world.&lt;/p&gt;




&lt;p&gt;The API is free to try at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;. Free tier: 100 requests/day, no card required.&lt;/p&gt;

</description>
      <category>node</category>
      <category>api</category>
      <category>security</category>
      <category>javascript</category>
    </item>
    <item>
      <title>How Airport Websites Try to Stop You Scraping Them (And What I Did About It)</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Mon, 13 Apr 2026 09:48:31 +0000</pubDate>
      <link>https://dev.to/jfcaba/how-airport-websites-try-to-stop-you-scraping-them-and-what-i-did-about-it-2742</link>
      <guid>https://dev.to/jfcaba/how-airport-websites-try-to-stop-you-scraping-them-and-what-i-did-about-it-2742</guid>
      <description>&lt;p&gt;I scrape 200+ airport departure boards every few minutes. Here's every defence I've had to break through — and the ones that still slow me down.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Airport Websites Are a Scraper's Nightmare
&lt;/h2&gt;

&lt;p&gt;Airport websites are an unusual target. They're not e-commerce sites trying to protect pricing data, and they're not news sites worried about plagiarism. Most of them just want to display departure boards to travellers. So you'd think they'd be easy to scrape.&lt;/p&gt;

&lt;p&gt;They're not.&lt;/p&gt;

&lt;p&gt;The problem is traffic patterns. An airport website gets hammered by bots — flight trackers, travel aggregators, airline apps, insurance tools, and journalists — all hitting the same handful of endpoints constantly. To survive, airports have layered in the same enterprise anti-bot stacks that protect banks and ticketing sites. They often don't even know exactly what's protecting them; it came bundled with their CDN contract.&lt;/p&gt;

&lt;p&gt;Building &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;MyAirports&lt;/a&gt; — a real-time flight data API covering 200+ airports — meant going through every layer. Here's what I found, roughly in order of how annoying each one is.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 1: IP Rate Limiting
&lt;/h2&gt;

&lt;p&gt;The most basic defence. Too many requests from one IP in a short window → 429 Too Many Requests or a silent block.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt; Your first few requests succeed. Then you start getting 429s, or worse, 200 responses with no flight data (the site is serving a decoy empty page rather than tipping you off with an error code).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I did:&lt;/strong&gt; Distribute requests across time. Each airport is on a polling interval calibrated to how often its data actually changes — a regional airport with 12 daily departures doesn't need to be hit every 30 seconds. Spreading the intervals out naturally reduces per-IP request rates.&lt;/p&gt;

&lt;p&gt;For airports where even polite rates get blocked, I route through a residential proxy pool. It's not free, but for the ~15% of airports where it's necessary, there's no other option.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The nuance:&lt;/strong&gt; Silent 200s with empty data are worse than 429s. I added a validation step that checks response plausibility — if a departure board suddenly returns zero flights for a major hub at 09:00 on a Tuesday, that's not real data. The scraper logs it as a suspected block rather than storing empty results.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 2: User-Agent and Header Fingerprinting
&lt;/h2&gt;

&lt;p&gt;Servers look at your request headers. A bare &lt;code&gt;curl&lt;/code&gt; or Node &lt;code&gt;fetch&lt;/code&gt; has obvious tells:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;User-Agent: node-fetch/1.0
Accept: */*
Accept-Language: (missing)
Accept-Encoding: gzip, deflate
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Real browsers send 15–20 headers in a specific order, with specific values that match the stated browser version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt; Immediate 403, or redirect to a CAPTCHA page that your headless browser can't easily solve.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I did:&lt;/strong&gt; Use a real browser engine (Playwright with Chromium) rather than raw HTTP requests. This gives you the genuine browser fingerprint for free — correct header order, correct values, correct &lt;code&gt;navigator&lt;/code&gt; properties, correct TLS handshake cipher suite order.&lt;/p&gt;

&lt;p&gt;The cipher suite thing matters more than people realise. Node.js and browsers negotiate TLS with different cipher preference orders. Some DDoS protection vendors fingerprint the TLS handshake before the HTTP request even arrives. If your cipher order looks like OpenSSL rather than Chrome, you're flagged at the network layer before you've sent a single byte of HTTP.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Using Playwright with a realistic browser context&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--no-sandbox&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--disable-blink-features=AutomationControlled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;en-GB&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;timezoneId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Europe/London&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;viewport&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1366&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;768&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--disable-blink-features=AutomationControlled&lt;/code&gt; flag is important — without it, &lt;code&gt;navigator.webdriver&lt;/code&gt; returns &lt;code&gt;true&lt;/code&gt; in the page context, which is a trivial bot signal that JavaScript on the page can read.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 3: Cloudflare Bot Management
&lt;/h2&gt;

&lt;p&gt;This is where things get genuinely hard. Cloudflare's bot management suite runs a JavaScript challenge in the browser that collects dozens of signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Canvas fingerprint (GPU-specific rendering of a test image)&lt;/li&gt;
&lt;li&gt;Audio context fingerprint&lt;/li&gt;
&lt;li&gt;Font enumeration&lt;/li&gt;
&lt;li&gt;WebGL renderer string&lt;/li&gt;
&lt;li&gt;Screen dimensions vs. window dimensions&lt;/li&gt;
&lt;li&gt;Mouse movement entropy (does the mouse move like a human?)&lt;/li&gt;
&lt;li&gt;Timing of JavaScript execution&lt;/li&gt;
&lt;li&gt;Whether &lt;code&gt;window.chrome&lt;/code&gt; exists and has the right shape&lt;/li&gt;
&lt;li&gt;Dozens more&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The challenge runs silently in the background. Pass it and you get a &lt;code&gt;cf_clearance&lt;/code&gt; cookie that's valid for a few hours. Fail and you get a 403 or an interactive CAPTCHA.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt; Page loads fine, but all API calls to the flight data endpoint return 403. Or: you get the page, the page loads some JS, and then the JS makes a secondary XHR that gets blocked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I did:&lt;/strong&gt; Playwright with the &lt;code&gt;playwright-extra&lt;/code&gt; stealth plugin, which patches the most common automation detection vectors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;playwright-extra&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;StealthPlugin&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;puppeteer-extra-plugin-stealth&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StealthPlugin&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The stealth plugin patches things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;navigator.webdriver&lt;/code&gt; → &lt;code&gt;undefined&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;navigator.plugins&lt;/code&gt; → realistic plugin list&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;window.chrome&lt;/code&gt; → full Chrome object shape&lt;/li&gt;
&lt;li&gt;Canvas fingerprint → randomised slightly per session&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;permissions.query&lt;/code&gt; for &lt;code&gt;notifications&lt;/code&gt; → returns &lt;code&gt;denied&lt;/code&gt; (real browsers do; headless ones used to throw)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This handles maybe 70% of Cloudflare deployments. The other 30% use more aggressive challenges that require the &lt;code&gt;cf_clearance&lt;/code&gt; cookie to be renewed more frequently, or require solving an actual interactive challenge. For those, I use a CAPTCHA solving service — 2captcha or AntiCaptcha — which has human solvers available for the rare cases that get through the stealth layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 4: Akamai Bot Manager
&lt;/h2&gt;

&lt;p&gt;Akamai is different from Cloudflare. Instead of a one-time challenge, Akamai's bot manager injects a sensor script into every page that runs continuously and phones home with behavioural data. It's looking for things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are mouse movements suspiciously perfect? (Bots often move in straight lines)&lt;/li&gt;
&lt;li&gt;Are keystrokes happening at humanly possible intervals?&lt;/li&gt;
&lt;li&gt;Is the page being scrolled at all?&lt;/li&gt;
&lt;li&gt;Does the &lt;code&gt;_abck&lt;/code&gt; cookie (Akamai's session cookie) contain a valid HMAC signature computed from real browser data?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;_abck&lt;/code&gt; cookie is the key. It's a JWT-like token containing sensor data, and it's validated server-side on each request. If the cookie is missing or invalid, API requests are rejected — even if you've got a valid session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt; You can load the page fine. But when the page makes XHR requests to &lt;code&gt;/api/flights&lt;/code&gt; or wherever the data lives, they come back 403 with no body. The failure is invisible to users (the JS handles the retry), but your scraper just gets nothing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I did:&lt;/strong&gt; Two approaches depending on the airport.&lt;/p&gt;

&lt;p&gt;For airports where the data is available in the initial HTML (the departure board is server-rendered), I just parse the HTML directly. Akamai only blocks the XHR calls; the first page load often goes through fine and contains all the data I need.&lt;/p&gt;

&lt;p&gt;For airports where the data is loaded client-side via XHR, I use Playwright and let the full page JavaScript run. Playwright with stealth generates valid-enough sensor data that the &lt;code&gt;_abck&lt;/code&gt; cookie gets accepted. I then extract the cookie and replay it on subsequent requests — though it expires and needs refreshing every 30–60 minutes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Wait for the XHR to complete rather than trying to replicate it&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;url&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/flights&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;__FLIGHT_DATA__&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Layer 5: Dynamic Endpoint Obfuscation
&lt;/h2&gt;

&lt;p&gt;Some airport sites don't have a stable API endpoint. The URL for the flight data changes with each deployment — it might be hashed, versioned, or entirely randomised. Hardcoding &lt;code&gt;/api/v1/flights&lt;/code&gt; breaks the moment they redeploy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt; Your scraper works for weeks, then silently breaks when the airport updates their site. The URL 404s and you don't find out until someone notices the data hasn't updated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I did:&lt;/strong&gt; Intercept network requests rather than targeting a specific URL. Playwright's &lt;code&gt;page.on('response', ...)&lt;/code&gt; captures every response the page receives. I filter for responses that look like flight data — JSON with an array of objects containing recognisable fields like flight numbers, times, or status strings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;response&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;url&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flight&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departure&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrival&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;isFlightData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;processFlights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Not JSON, skip&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;isFlightData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flights&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;departures&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flightNumber&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flight_no&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ident&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;designator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is more resilient to site updates because I'm not coupled to a specific URL structure — I'm coupled to the shape of the data, which changes far less often.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 6: Session Fixation and IP-Bound Cookies
&lt;/h2&gt;

&lt;p&gt;This is the one that took me longest to diagnose. Some airports use session cookies that are bound to the IP address of the first request. The cookie is issued by the server and contains an HMAC of the client IP. If you grab the cookie from one IP (say, your local machine while debugging) and try to use it from another IP (your VPS), the server rejects it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt; Everything works locally. On the VPS, you get 403s or empty data even though you're sending the right cookies. The difference is invisible in the request — you're sending a valid-looking cookie, it's just bound to the wrong IP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I did:&lt;/strong&gt; I wrote about this in detail &lt;a href="https://dev.to/jfcaba/how-i-solved-the-ip-bound-cookie-problem-in-my-web-scraper-with-a-self-hosted-proxy-58cj"&gt;in a separate post&lt;/a&gt;, but the short version: I built &lt;a href="https://skiprestriction.uk" rel="noopener noreferrer"&gt;SkipRestriction&lt;/a&gt; — a self-hosted Shadowsocks proxy running on the same VPS as the scraper. When I need to grab cookies for an airport with IP-bound sessions, I browse through the proxy so the cookie is issued to the VPS's IP. The scraper, also running on the VPS, then uses those cookies from the same IP.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 7: JavaScript-Rendered Data with Anti-Automation Timing
&lt;/h2&gt;

&lt;p&gt;Some sites render flight data using JavaScript that deliberately runs on a slight delay, or loads data in chunks, or only renders the full table after a scroll event. These are heuristics designed to detect headless browsers, which typically don't scroll and don't trigger layout-based events.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt; Playwright loads the page, you query the DOM for flight rows, and you get back 3 rows when there should be 45. The rest haven't rendered yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I did:&lt;/strong&gt; Inject realistic user behaviour before extracting data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Simulate a human reading the page before scraping&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForLoadState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;networkidle&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scrollBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;800&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scrollBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Now query for data&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="nf"&gt;$eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.flight-row&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
  &lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;row&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.flight-number&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.status&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.scheduled-time&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;}))&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The random delays matter — consistent 800ms waits are a bot signal. Human reading speed has natural variance.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Layer I Haven't Beaten: Real CAPTCHA with Proof-of-Work
&lt;/h2&gt;

&lt;p&gt;A handful of airports (3–4 in my coverage set) use a challenge mechanism that requires solving a proof-of-work puzzle — essentially, computing a hash with certain properties before the session is established. It's not a visual CAPTCHA; it's a compute challenge that's expensive for bots to complete at scale and trivially fast for a single human browser session.&lt;/p&gt;

&lt;p&gt;I currently handle these with a hybrid approach: a real browser session is established manually once (or triggered via a webhook), the resulting cookies are stored, and the scraper uses those cookies until they expire. For airports with long cookie TTLs (24h+), this is workable. For airports that require fresh challenges every hour, it's not scalable and those airports currently have lower update frequency in MyAirports.&lt;/p&gt;

&lt;p&gt;If you've cracked this one cleanly, I'd genuinely like to hear how.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Works, What Doesn't, and What Changes Constantly
&lt;/h2&gt;

&lt;p&gt;The honest summary: &lt;strong&gt;Playwright + stealth gets you through 80% of sites&lt;/strong&gt;. IP management (residential proxies or VPS-matched cookies) handles another 10%. The remaining 10% requires per-site custom solutions, ongoing maintenance, or accepting reduced update frequency.&lt;/p&gt;

&lt;p&gt;What changes most often: Cloudflare updates its challenge algorithms roughly every 6–8 weeks. When it does, stealth plugins need updating. I follow the &lt;code&gt;puppeteer-extra-plugin-stealth&lt;/code&gt; repo closely and have a monitoring alert that fires when a Cloudflare-protected site's data goes stale — that's usually the signal that a new challenge version is live.&lt;/p&gt;

&lt;p&gt;What's surprisingly stable: The actual data extraction logic. Airport websites update their visual design sometimes, but the underlying data structure in the HTML or JSON is remarkably consistent over time. The bot protection changes; the data format rarely does.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Scraper Stack (For Reference)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Playwright&lt;/strong&gt; (Chromium) for JS-heavy sites&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;playwright-extra&lt;/strong&gt; + stealth plugin for Cloudflare&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;got&lt;/strong&gt; (Node HTTP client) for simple HTML-only sites&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Residential proxy pool&lt;/strong&gt; for IP-rate-limited airports&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shadowsocks&lt;/strong&gt; self-hosted proxy for IP-bound cookie issues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2captcha&lt;/strong&gt; for the small number of visual CAPTCHAs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt; for storage, with a &lt;code&gt;scrape_log&lt;/code&gt; table tracking per-airport success/failure rates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Node-cron&lt;/strong&gt; for scheduling, with dynamic intervals per airport based on data freshness requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full architecture, including the self-learning caching layer, is covered &lt;a href="https://dev.to/jfcaba/how-i-built-a-web-scraper-that-gets-faster-and-cheaper-with-every-request-oo7"&gt;in a previous post&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;The flight data is live at &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;myairports.online&lt;/a&gt; — if you want to use it rather than scrape airport sites yourself, the API handles all of this so you don't have to.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have a layer I haven't covered? Drop it in the comments — I'm genuinely curious what else is out there.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>node</category>
      <category>devops</category>
    </item>
    <item>
      <title>Your Flight Was Cancelled. Here's How I Built a Tool That Finds Every EU261 Claim at Any Airport</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Fri, 10 Apr 2026 07:59:21 +0000</pubDate>
      <link>https://dev.to/jfcaba/your-flight-was-cancelled-heres-how-i-built-a-tool-that-finds-every-eu261-claim-at-any-airport-2e9k</link>
      <guid>https://dev.to/jfcaba/your-flight-was-cancelled-heres-how-i-built-a-tool-that-finds-every-eu261-claim-at-any-airport-2e9k</guid>
      <description>&lt;p&gt;Most passengers never claim the compensation they're legally owed. I built a feature that changes that — here's how it works under the hood.&lt;/p&gt;




&lt;h2&gt;
  
  
  The €250 Nobody Asks For
&lt;/h2&gt;

&lt;p&gt;When Lufthansa cancelled flight LH954 from Frankfurt to Birmingham this morning, 25 flights at that airport alone became eligible for compensation under EU261 in the last 7 days.&lt;/p&gt;

&lt;p&gt;Most of those passengers won't claim a cent.&lt;/p&gt;

&lt;p&gt;Not because they're ineligible — under EU Regulation 261/2004, a cancelled flight on a route under 1,500 km entitles each passenger to &lt;strong&gt;€250 in cash compensation&lt;/strong&gt;, regardless of whether the airline refunds the ticket. It's not a voucher. It's not store credit. It's money.&lt;/p&gt;

&lt;p&gt;The problem isn't eligibility. It's awareness. Most travellers don't know the regulation exists, and those who do often assume claiming is too complicated to bother with.&lt;/p&gt;

&lt;p&gt;I built MyAirports — a real-time airport data API — to solve the scraping and data normalisation problem. But once I had clean, reliable flight status data for hundreds of airports, a second problem became obvious: &lt;strong&gt;all this data about delays and cancellations was sitting there unused&lt;/strong&gt;. I was already tracking which flights were cancelled, their distances, their routes. The EU261 calculation is basically a lookup table. So I built the Claims tab.&lt;/p&gt;

&lt;p&gt;Here's how it works.&lt;/p&gt;




&lt;h2&gt;
  
  
  EU261 in Two Minutes
&lt;/h2&gt;

&lt;p&gt;EU Regulation 261/2004 applies when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The flight &lt;strong&gt;departs from an EU airport&lt;/strong&gt;, or arrives at an EU airport on an EU/UK carrier&lt;/li&gt;
&lt;li&gt;The flight is &lt;strong&gt;cancelled&lt;/strong&gt; or delayed by more than 3 hours at arrival&lt;/li&gt;
&lt;li&gt;The disruption is not caused by "extraordinary circumstances" (bad weather, air traffic control strikes, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Compensation tiers by route distance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Distance&lt;/th&gt;
&lt;th&gt;Compensation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Under 1,500 km&lt;/td&gt;
&lt;td&gt;€250&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,500–3,500 km&lt;/td&gt;
&lt;td&gt;€400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Over 3,500 km&lt;/td&gt;
&lt;td&gt;€600&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;UK passengers still have equivalent rights post-Brexit under UK261.&lt;/p&gt;

&lt;p&gt;That's it. The airline cannot refuse if the criteria are met. They can offer a voucher — but you're entitled to cash.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Data Problem I Already Had Solved
&lt;/h2&gt;

&lt;p&gt;MyAirports scrapes live flight data from airport websites — the actual departure boards, not third-party aggregators. Every flight gets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IATA flight code&lt;/li&gt;
&lt;li&gt;Origin and destination airport (with IATA codes)&lt;/li&gt;
&lt;li&gt;Scheduled and actual departure/arrival times&lt;/li&gt;
&lt;li&gt;Real-time status: &lt;code&gt;on_time&lt;/code&gt;, &lt;code&gt;delayed&lt;/code&gt;, &lt;code&gt;cancelled&lt;/code&gt;, &lt;code&gt;diverted&lt;/code&gt;, &lt;code&gt;landed&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;Airline name&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The status normalisation work was significant — airport websites display cancellations in seven different languages with wildly inconsistent terminology. I wrote about that &lt;a href="https://dev.to/jfcaba/normalizing-flight-statuses-across-7-languages-what-i-learned-building-a-global-airport-api-po0"&gt;in a previous post&lt;/a&gt;. By the time a flight hits the database, it has a clean, normalised status field.&lt;/p&gt;

&lt;p&gt;For the Claims feature, &lt;code&gt;cancelled&lt;/code&gt; is the key trigger. But not all cancellations qualify — the regulation only covers flights that departed from or arrived at an EU/UK airport. Since I'm already storing IATA codes, I can check country eligibility instantly against a static lookup of EU member state airport codes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Calculating Eligibility Automatically
&lt;/h2&gt;

&lt;p&gt;The core logic is straightforward once the data is clean:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getEU261Compensation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Must be cancelled or delayed 3+ hours&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;delayMinutes&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;180&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Route must touch EU/UK&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fromEU&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;isEUAirport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;toEU&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;isEUAirport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;destination&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isEUCarrier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;isEUAirline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;iataCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;fromEU&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;toEU&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;isEUCarrier&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Distance-based compensation&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getGreatCircleDistance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;destination&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;250&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;3500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;getGreatCircleDistance&lt;/code&gt; uses stored lat/lon for each IATA airport code — a ~7,000-entry lookup table I pre-compiled from public data. The great-circle calculation is just the haversine formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;haversine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;lon1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;lat2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;lon2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;R&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6371&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Earth radius in km&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dLat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;toRad&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lat2&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dLon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;toRad&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lon2&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;lon1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dLat&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;toRad&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;toRad&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lat2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dLon&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;R&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;atan2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs at query time — no pre-computation needed. The distance between any two airports is deterministic, so it could be cached, but it's cheap enough that it doesn't matter.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Claims Tab Shows
&lt;/h2&gt;

&lt;p&gt;When you open any airport page on &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;myairports.online&lt;/a&gt; and click &lt;strong&gt;Claims&lt;/strong&gt;, you see every flight in the last 7 days that passes the eligibility check — ranked by date, with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Flight number and airline&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Route&lt;/strong&gt; (origin → destination)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Status&lt;/strong&gt; (Cancelled / Delayed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Distance&lt;/strong&gt; (so you can verify the tier yourself)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compensation amount&lt;/strong&gt; (€250 / €400 / €600)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regulation&lt;/strong&gt; (EU261 or UK261)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claim action&lt;/strong&gt; — either a direct claim link to the airline's own portal, or a one-click claim via an integrated claims service&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Birmingham Airport (BHX) had 25 flights eligible for compensation in the last 7 days when I last checked. That's not unusual. At a mid-sized European hub, 20–30 eligible flights per week is typical in normal operating conditions. During bad weather events or industrial action, that number can spike into the hundreds.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two Claim Paths
&lt;/h2&gt;

&lt;p&gt;Not every airline makes it easy to claim directly. Some (like British Airways and Lufthansa) have functional online claim forms — they comply, slowly. Others have forms that are buried, broken, or deliberately tedious.&lt;/p&gt;

&lt;p&gt;The Claims tab surfaces both options:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direct airline claim&lt;/strong&gt; — for carriers with reliable self-service portals (EasyJet, Air France, Ryanair). We link directly to their claim page with the flight details pre-filled where possible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Assisted claim for €2.99&lt;/strong&gt; — for cases where direct claiming is harder, or where passengers just don't want to deal with it. A claims management service handles the entire process: paperwork, follow-up, escalation to alternative dispute resolution if the airline refuses. They charge a flat €2.99 rather than the typical 25–35% cut that most "no win no fee" services take.&lt;/p&gt;

&lt;p&gt;The decision of which path to offer is driven by a per-airline lookup table I maintain — essentially a reliability score for each carrier's own claims process. If the airline consistently pays out promptly on direct claims, we show the direct link. If they're known to stall, the assisted option is surfaced first.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Unexpected Use Case
&lt;/h2&gt;

&lt;p&gt;I built the Claims tab primarily as a user-facing feature. But it turns out developers using the API have found it useful too.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;/airports/{iata}/claims&lt;/code&gt; endpoint returns the same structured data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"airport"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BHX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"period_days"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"eligible_flights"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"flight"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LH954"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"airline"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Lufthansa"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-10"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"origin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FRA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"destination"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BHX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cancelled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"distance_km"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;766&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"compensation_eur"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;250&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"regulation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"EU261"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"claim_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.lufthansa.com/gb/en/flight-irregularity-claim"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_eligible"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few travel agencies and corporate travel tools have integrated this to automatically flag claims for their customers after disrupted trips. The value proposition is simple: most business travellers don't track their own delays. If a tool can surface "your employee on LH954 is owed €250" after the fact, that's real money recovered with zero effort.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Three things stood out building this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cancellation data is noisy.&lt;/strong&gt; Airport websites sometimes mark a flight as cancelled and then re-list it under a new flight number (a "rescue flight"). If you don't de-duplicate carefully, you'll show a claim for a flight that technically departed. I added a reconciliation step that checks if the same route, same date, and similar departure time appears with a &lt;code&gt;landed&lt;/code&gt; status anywhere in the dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Extraordinary circumstances are unverifiable at query time.&lt;/strong&gt; The regulation exempts cancellations caused by things outside the airline's control — storms, ATC strikes, bird strikes. I don't attempt to classify this automatically. The Claims tab shows all cancellations that pass the route/distance check and notes that eligibility is subject to circumstances. The claims process itself resolves extraordinary-circumstance disputes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Distance matters more than people think.&lt;/strong&gt; A Dublin→Birmingham cancellation (322 km) yields €250. A Paris CDG→Birmingham cancellation (488 km) also yields €250. But a longer intra-EU route like Lisbon→Helsinki (2,800 km) would yield €400. Passengers often assume cancelled = €250 flat, then feel cheated when their long-haul claim is assessed differently. Making the distance explicit in the UI has reduced those support queries significantly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;Every airport on &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;myairports.online&lt;/a&gt; has a Claims tab. If you flew into or out of a European airport in the last 7 days and your flight was cancelled, check the Claims tab for your airport — there's a reasonable chance your flight is listed.&lt;/p&gt;

&lt;p&gt;If you're a developer, the &lt;code&gt;/airports/{iata}/claims&lt;/code&gt; endpoint is part of the &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;MyAirports API&lt;/a&gt;. The free tier gives you 100 requests/day; the paid tier removes the cap and adds historical claims data going back 90 days.&lt;/p&gt;

&lt;p&gt;EU261 was designed to make airlines accountable. The data to enforce it has always existed — it just needed surfacing.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;MyAirports is a real-time airport data API covering 200+ airports across Europe, the Americas, and Asia-Pacific. The Claims feature is live at &lt;a href="https://myairports.online" rel="noopener noreferrer"&gt;myairports.online&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>node</category>
      <category>api</category>
    </item>
    <item>
      <title>How I Solved the IP-Bound Cookie Problem in My Web Scraper With a Self-Hosted Proxy</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Tue, 07 Apr 2026 11:22:19 +0000</pubDate>
      <link>https://dev.to/jfcaba/how-i-solved-the-ip-bound-cookie-problem-in-my-web-scraper-with-a-self-hosted-proxy-58cj</link>
      <guid>https://dev.to/jfcaba/how-i-solved-the-ip-bound-cookie-problem-in-my-web-scraper-with-a-self-hosted-proxy-58cj</guid>
      <description>&lt;p&gt;There's a category of airport website that breaks my scraper in a specific, maddening way: they issue session cookies that are pinned to the IP address of the request that created them.&lt;/p&gt;

&lt;p&gt;The symptom looks like this: you manually grab a working cookie from your browser, paste it into your server-side scraper, and it immediately returns a 401 or redirects to a login page. The cookie is valid — you just created it. But it was created from your laptop's IP, and your VPS has a different one. The airport's backend checks both the cookie and the origin IP, and when they don't match, it rejects the session.&lt;/p&gt;

&lt;p&gt;For &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;MyAirports&lt;/a&gt; — a real-time flight data API covering 1,000+ airports — this is a real operational problem. Here's how I solved it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why IP-Bound Cookies Exist
&lt;/h2&gt;

&lt;p&gt;Session fixation attacks are the reason. A server that binds cookies to IPs prevents an attacker who steals your session cookie from replaying it from a different location. It's a legitimate security measure, and it's common in systems that handle sensitive operational data — which airport backend systems often do.&lt;/p&gt;

&lt;p&gt;For a scraper, it creates a workflow problem. You can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Grab a fresh cookie from a real browser session&lt;/li&gt;
&lt;li&gt;Replay that cookie from your server&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But only if both steps happen from the same IP.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Manual Cookie Workflow
&lt;/h2&gt;

&lt;p&gt;For airports that require authenticated sessions (particularly those protected by HCaptcha or Incapsula), my architecture uses a &lt;code&gt;grab-cookies.js&lt;/code&gt; tool. It launches a real browser, navigates to the airport site, solves challenges if needed, and exports the resulting session cookies.&lt;/p&gt;

&lt;p&gt;Those cookies then get passed to the scraper as headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Simplified example from the scraper pipeline&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cookies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;loadCookies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MAD&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Madrid Barajas&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Cookie&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;; &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Referer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://www.aena.es&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When &lt;code&gt;grab-cookies.js&lt;/code&gt; runs on the VPS itself, everything works — the cookie was issued to the VPS IP, and the scraper makes requests from the same IP. But during development and debugging, running a headless browser on the server just to grab a cookie is slow and inconvenient. You want to do it from your local machine.&lt;/p&gt;

&lt;p&gt;That's where the IP mismatch breaks things.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Route Your Browser Through the VPS
&lt;/h2&gt;

&lt;p&gt;The fix is simple in principle: make your local browser appear to come from the VPS IP when visiting airport websites. Any traffic that comes from your browser through the proxy will receive cookies bound to the VPS IP — the same IP your production scraper uses.&lt;/p&gt;

&lt;p&gt;I used &lt;a href="https://skiprestriction.uk" rel="noopener noreferrer"&gt;SkipRestriction.uk&lt;/a&gt; to set up a self-hosted Shadowsocks proxy on the same VPS where MyAirports runs. The setup is a single Docker command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; shadowsocks-server &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--restart&lt;/span&gt; always &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 8388:8388/tcp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 8388:8388/udp &lt;span class="se"&gt;\&lt;/span&gt;
  shadowsocks/shadowsocks-libev:latest &lt;span class="se"&gt;\&lt;/span&gt;
  ss-server &lt;span class="nt"&gt;-s&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;-p&lt;/span&gt; 8388 &lt;span class="nt"&gt;-k&lt;/span&gt; YOUR_PASSWORD &lt;span class="nt"&gt;-m&lt;/span&gt; aes-256-gcm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Point your local Shadowsocks client at &lt;code&gt;YOUR_VPS_IP:8388&lt;/code&gt;, configure your browser to use it as a SOCKS5 proxy, and now your browser traffic exits from the VPS. Airport websites see your VPS IP. Cookies issued to that session are tied to the VPS IP. When the production scraper replays those cookies from the same VPS, the IP check passes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Shadowsocks and Not a VPN
&lt;/h2&gt;

&lt;p&gt;A traditional VPN would work too, but Shadowsocks has a few practical advantages for this use case:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It's lightweight.&lt;/strong&gt; Running WireGuard or OpenVPN on the same VPS as a production scraper means managing certificates, key exchange, and routing tables. The Shadowsocks Docker container adds almost nothing to the host.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traffic looks like HTTPS.&lt;/strong&gt; Shadowsocks was designed to be indistinguishable from normal encrypted web traffic. Some airport networks filter VPN protocols at the packet level. Shadowsocks is harder to detect and block.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-application routing.&lt;/strong&gt; With a VPN, all traffic from your machine goes through the VPS — including your Slack messages and YouTube tabs. With a SOCKS5 proxy, you configure specific browsers or applications to route through it. You can keep two browser profiles: one going through the VPS for cookie grabbing, one going directly for everything else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No kernel modules.&lt;/strong&gt; The Docker container runs entirely in user space. No system-level network reconfiguration needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Full Cookie Workflow
&lt;/h2&gt;

&lt;p&gt;With the proxy in place, the workflow for IP-bound cookies looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Local browser (via Shadowsocks proxy)
  → VPS IP
  → Airport website
  → Cookie issued to VPS IP

grab-cookies.js export
  → cookies stored server-side

Production scraper (on same VPS)
  → Same VPS IP
  → Airport website
  → Cookie accepted ✓
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The roundtrip adds maybe 80-120ms of latency compared to a direct connection, which is irrelevant for a one-time cookie grab operation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting It Up
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://skiprestriction.uk" rel="noopener noreferrer"&gt;SkipRestriction.uk&lt;/a&gt; guides cover the full setup for both VPS (cloud server) and Raspberry Pi (local self-hosting). The site also walks through client configuration for every major platform — macOS, Windows, iOS, Android, and Linux.&lt;/p&gt;

&lt;p&gt;If you're running a scraper on a VPS and hitting IP-bound session issues, the proxy approach is the cleanest fix. The alternative — running your entire debugging workflow on the server via SSH — is technically viable but slow and error-prone.&lt;/p&gt;

&lt;p&gt;Two commands on the server side, one config file on the client side, and the IP mismatch problem disappears.&lt;/p&gt;

&lt;p&gt;The proxy runs on port 8388. The MyAirports API runs on port 443. They coexist on the same $5/month VPS without any conflict.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;MyAirports is a free real-time flight data API covering 1,000+ airports. Try it at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Shadowsocks setup guide used for this project is at &lt;a href="https://skiprestriction.uk" rel="noopener noreferrer"&gt;skiprestriction.uk&lt;/a&gt; — open source, self-hosted, no accounts required.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>webdev</category>
      <category>devops</category>
      <category>javascript</category>
    </item>
    <item>
      <title>628,000 Flights Later: Building a Flight Analytics Engine on Accumulated Scrape Data</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Mon, 06 Apr 2026 16:40:24 +0000</pubDate>
      <link>https://dev.to/jfcaba/628000-flights-later-building-a-flight-analytics-engine-on-accumulated-scrape-data-4fdd</link>
      <guid>https://dev.to/jfcaba/628000-flights-later-building-a-flight-analytics-engine-on-accumulated-scrape-data-4fdd</guid>
      <description>&lt;p&gt;Every time MyAirports scrapes a flight board, the flights are stored in PostgreSQL. What started as a caching mechanism turned into something more interesting: a dataset of real flight behaviour that can answer questions no static API can answer.&lt;/p&gt;

&lt;p&gt;After accumulating 628,000+ flight records across 1,000+ airports, the data supports four categories of analytics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Airline reliability rankings&lt;/strong&gt; per airport (on-time rate, average delay, cancellation rate)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delay pattern breakdowns&lt;/strong&gt; by airline, destination, day of week, and time of day&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Airport busyness heatmaps&lt;/strong&gt; (7×24 grid: average flight volume by hour and day)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Route reliability stats&lt;/strong&gt; with per-airline breakdown (LHR→JFK: which airline is most reliable?)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's how the analytics layer works.&lt;/p&gt;

&lt;h2&gt;
  
  
  The raw data
&lt;/h2&gt;

&lt;p&gt;Each &lt;code&gt;Flight&lt;/code&gt; record in PostgreSQL has:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flightNumber, airline (IATA), type (arrival/departure)
airport (IATA), origin, destination
scheduled, estimated, actual (timestamps, local timezone)
status (scheduled/boarding/departed/arrived/delayed/cancelled/unknown)
confidence, strategy, scrapedAt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;actual&lt;/code&gt; timestamp and &lt;code&gt;status&lt;/code&gt; field are what make analytics possible. When a flight's status is &lt;code&gt;arrived&lt;/code&gt; and &lt;code&gt;actual&lt;/code&gt; is set, we know when it actually landed. The delay is simply &lt;code&gt;actual - scheduled&lt;/code&gt; in minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two compute strategies: incremental and nightly
&lt;/h2&gt;

&lt;p&gt;Running full analytics queries over 628,000 rows on every API request would be too slow. The system uses a two-track approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Incremental counters&lt;/strong&gt; — updated after each scrape:&lt;/p&gt;

&lt;p&gt;When a new batch of flights arrives, the system immediately updates lightweight counters: per-airline flight counts at that airport, per-terminal/gate flight counts. These are cheap to maintain and power fast lookups.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nightly full recompute&lt;/strong&gt; — runs at 2 AM UTC:&lt;/p&gt;

&lt;p&gt;Once per day, the system runs a full pass over all historical data to recompute the richer analytics tables. This is where the delay stats, airline scorecards, and busyness grids are built. It's more expensive, but the data doesn't need to be real-time — yesterday's delay patterns are as useful as today's.&lt;/p&gt;

&lt;h2&gt;
  
  
  Airline reliability rankings
&lt;/h2&gt;

&lt;p&gt;The query is straightforward in concept, but requires minimum thresholds to be useful:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;airline_iata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_flights&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FILTER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;delay_minutes&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;on_time_pct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delay_minutes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FILTER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;delay_minutes&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;avg_delay_minutes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FILTER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'cancelled'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cancelled_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;flights&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;airport_iata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'LHR'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'arrived'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'departed'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;airline_iata&lt;/span&gt;
&lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;on_time_pct&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;HAVING COUNT(*) &amp;gt;= 10&lt;/code&gt; is the important part. Without a minimum flight threshold, airlines that had one flight and it happened to be on time appear at the top with 100% on-time performance. Ten flights is a low bar, but it filters out pure noise.&lt;/p&gt;

&lt;p&gt;The result for LHR looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rank"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"airlineIata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"airlineName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Swiss International Air Lines"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"onTimePct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;88.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"avgDelayMinutes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;7.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cancelledPct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"totalFlights"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2340&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Delay patterns: finding when and why delays happen
&lt;/h2&gt;

&lt;p&gt;The delay breakdown goes a level deeper than overall stats. It slices delay data by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Day of week&lt;/strong&gt; — Monday departures are often worse than Wednesday ones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time bucket&lt;/strong&gt; — morning (06-12), afternoon (12-18), evening (18-24), night (00-06)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Destination&lt;/strong&gt; — some routes are systematically late&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Airline&lt;/strong&gt; — some carriers at a specific airport are consistently worse than their global average&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The time bucket breakdown is particularly useful. At large hub airports, delays compound through the day: a slot disruption in the morning propagates through afternoon flights. The heatmap data makes this visible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Airport busyness heatmap
&lt;/h2&gt;

&lt;p&gt;The busyness grid answers "when is this airport busy?" — useful for scheduling, understanding delay risk by time, and just general situational awareness:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DOW&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;day_of_week&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- 0=Sun, 6=Sat&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;HOUR&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flight_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;avg_flights&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'hour'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DOW&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;day_of_week&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;HOUR&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;flight_count&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;flights&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;airport_iata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;hourly&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;day_of_week&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hour&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;day_of_week&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result is a 7×24 matrix of average hourly flight volumes, suitable for heatmap rendering in any charting library.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gate and terminal predictions
&lt;/h2&gt;

&lt;p&gt;The most interesting use of accumulated data is predicting which gate or terminal a flight will use before the airport announces it. Airlines tend to use the same gates consistently at their hub airports.&lt;/p&gt;

&lt;p&gt;After each scrape, the system updates per-airline terminal and gate counters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// After normalizing a batch of flights&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flight&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;flights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terminal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;incrementCounter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;airport&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:terminal:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terminal&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;gate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;incrementCounter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;airport&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;airline&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:gate:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;gate&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When serving a flight that doesn't have a gate assigned yet, the API enriches it with the most common gate for that airline at that airport:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;gate&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;insights&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;terminal&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;predicted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;flight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;insights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;terminal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;predicted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;insights&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terminal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;insights&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terminal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The confidence value is the fraction of historical flights that used the predicted terminal. For BA at LHR, it's nearly always Terminal 5 — confidence approaches 1.0. For a code-share flight operated by a third party, confidence might be 0.3.&lt;/p&gt;

&lt;h2&gt;
  
  
  The route reliability endpoint
&lt;/h2&gt;

&lt;p&gt;One API endpoint that required no special query design: route-level stats. Given an origin and destination, return reliability data for each airline that operates the route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /api/routes/LHR/JFK/insights
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"origin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LHR"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dest"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"JFK"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"route"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"totalFlights"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1240&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"onTimePct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;78.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"avgDelayMinutes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;16.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"cancelledPct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"topAirline"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"topAirlineOnTimePct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;84.2&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"airlines"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"airlineIata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"onTimePct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;84.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"totalFlights"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;890&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"airlineIata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"onTimePct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;71.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"totalFlights"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;350&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is data that's genuinely hard to find anywhere else. The big flight data providers have it, but it's locked behind enterprise pricing. The MyAirports version is built from first-party airport data, accumulated one scrape at a time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the data still can't do
&lt;/h2&gt;

&lt;p&gt;A few honest caveats about 628,000 flights:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It's not uniform.&lt;/strong&gt; Airports with many active users have more scrape history than airports discovered recently. LHR has thousands of records; a newly added regional airport might have dozens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It's biased toward requested airports.&lt;/strong&gt; The system scrapes when someone requests data. Popular airports are over-represented relative to their actual traffic volumes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Delay calculations require actual timestamps.&lt;/strong&gt; Many airport APIs only return scheduled times, not actual arrival/departure times. Those flights contribute to volume counts but not delay calculations.&lt;/p&gt;

&lt;p&gt;The data is directionally useful — the patterns it reveals are real — but it's not a controlled dataset. It's accumulated opportunistically.&lt;/p&gt;




&lt;p&gt;All three endpoints are available in the MyAirports API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /api/airports/:iata/insights/airlines
GET /api/airports/:iata/insights/delays
GET /api/airports/:iata/insights/busyness
GET /api/routes/:origin/:dest/insights
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Free tier at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;. No credit card required.&lt;/p&gt;

</description>
      <category>node</category>
      <category>postgres</category>
      <category>javascript</category>
      <category>api</category>
    </item>
    <item>
      <title>Normalizing Flight Statuses Across 7 Languages: What I Learned Building a Global Airport API</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Mon, 06 Apr 2026 16:39:30 +0000</pubDate>
      <link>https://dev.to/jfcaba/normalizing-flight-statuses-across-7-languages-what-i-learned-building-a-global-airport-api-po0</link>
      <guid>https://dev.to/jfcaba/normalizing-flight-statuses-across-7-languages-what-i-learned-building-a-global-airport-api-po0</guid>
      <description>&lt;p&gt;When you scrape flight data from airport websites across 85+ countries, you quickly discover that nobody agrees on anything. Not the data format, not the field names, not the status strings.&lt;/p&gt;

&lt;p&gt;"Departed" is &lt;code&gt;SAL&lt;/code&gt;, &lt;code&gt;FLY&lt;/code&gt;, &lt;code&gt;CER&lt;/code&gt;, &lt;code&gt;AIR&lt;/code&gt;, &lt;code&gt;DEP&lt;/code&gt;, &lt;code&gt;DEPARTED&lt;/code&gt;, &lt;code&gt;Departed&lt;/code&gt;, &lt;code&gt;Отправлен&lt;/code&gt;, &lt;code&gt;Väljunud&lt;/code&gt;, &lt;code&gt;DESPEGADO&lt;/code&gt;, &lt;code&gt;gestartet&lt;/code&gt;, or &lt;code&gt;partito&lt;/code&gt;, depending on which airport website you're looking at.&lt;/p&gt;

&lt;p&gt;Building MyAirports meant writing a status normalizer that handles all of these — and maps them to a single, predictable enum. Here's how it works and what surprised me along the way.&lt;/p&gt;

&lt;h2&gt;
  
  
  The target schema
&lt;/h2&gt;

&lt;p&gt;Every flight the API returns has a status field from this set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scheduled | boarding | departed | arrived | delayed | cancelled | unknown
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Seven values. Simple, predictable, usable in any frontend without additional mapping. The hard part is getting there from the real world.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 1: Exact string matches
&lt;/h2&gt;

&lt;p&gt;The first layer is the most straightforward — a lookup table of known status strings to their canonical values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;EXACT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// English&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scheduled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scheduled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;on time&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scheduled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;in flight&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;landed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;delayed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;delayed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gate open&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

  &lt;span class="c1"&gt;// Spanish (AENA codes)&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SAL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// salida&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AIR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// en aire&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FLY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CER&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// cerrado&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LND&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// aterrizó&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FNL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// final&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IBK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// in-block&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BOR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// embarcando&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EMB&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ULL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OPE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scheduled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// estimado&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PRG&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scheduled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// programado&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RET&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;delayed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// retraso&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CNL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

  &lt;span class="c1"&gt;// Russian&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Отправлен&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Прилетел&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Посадка&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Задержан&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;delayed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Отменён&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

  &lt;span class="c1"&gt;// Estonian&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Väljunud&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Saabunud&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Pardal&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

  &lt;span class="c1"&gt;// Finnish (Finavia)&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Lähtenyt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Saapunut&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Nousussa&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This handles probably 60-70% of real-world cases. The remainder need more work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 2: Substring patterns
&lt;/h2&gt;

&lt;p&gt;Some status strings embed dynamic information — actual times, gate numbers, gate closures. You can't match them exactly, but you can match them by pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SUBSTRING_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="c1"&gt;// "Delayed 00:45"&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/delay/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;delayed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// "Cancel" in any language starting with 'cancel'&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/cancel/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// AENA-style: "CERRADO" (closed), "PROGRAMADO" (scheduled)&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/^cerr/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/^progr/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scheduled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// "Check-in open" → scheduled (not boarding yet)&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/check.&lt;/span&gt;&lt;span class="se"&gt;?&lt;/span&gt;&lt;span class="sr"&gt;in/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scheduled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Layer 3: Regex patterns for dynamic content
&lt;/h2&gt;

&lt;p&gt;A handful of airports embed actual timestamps or qualifiers directly in the status string. Phoenix Sky Harbor is a good example — their status field for delayed departures looks like &lt;code&gt;"Now 14:22"&lt;/code&gt; (the actual departure time). That's not a status, that's a timestamp, and it means the flight has departed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;REGEX_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="c1"&gt;// Phoenix: "Now 12:21 PM" → departed&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/^now&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="sr"&gt;/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// Finavia: "Expected 14:35" → still scheduled&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/^expected&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="sr"&gt;/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scheduled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// Generic: "14:35" alone means actual time → departed/arrived&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/^&lt;/span&gt;&lt;span class="se"&gt;\d{1,2}&lt;/span&gt;&lt;span class="sr"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\d{2}&lt;/span&gt;&lt;span class="sr"&gt;$/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Layer 4: Russian stem matching
&lt;/h2&gt;

&lt;p&gt;Russian is morphologically rich — words inflect heavily based on grammatical case, number, and gender. "Посадка" (boarding) appears as "посадке", "посадкой", "посадку" in different sentence positions. Exact matching misses these.&lt;/p&gt;

&lt;p&gt;The solution isn't a full NLP library (overkill for six words) — it's stem prefixes. The first four or five characters of a word are usually enough to identify it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;RUSSIAN_STEMS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;посад&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boarding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// posad- (посадка, посадке...)&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;отпра&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// otpra- (отправлен, отправился...)&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;прил&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrived&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;// pril- (прилетел, прилетела...)&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;задер&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;delayed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;// zader- (задержан, задержка...)&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;отмен&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cancelled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// otmen- (отменён, отменённый...)&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This catches all the common inflected forms without false positives.&lt;/p&gt;

&lt;h2&gt;
  
  
  The object guard
&lt;/h2&gt;

&lt;p&gt;One thing I didn't expect: some airport APIs return status as an object rather than a string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DEP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Departed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"color"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"#ff0000"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The normalizer checks for this and extracts &lt;code&gt;code&lt;/code&gt; or &lt;code&gt;description&lt;/code&gt; before attempting any matching. Without this guard, &lt;code&gt;typeof status === 'object'&lt;/code&gt; crashes the string matching silently.&lt;/p&gt;

&lt;h2&gt;
  
  
  AENA's 30+ status codes
&lt;/h2&gt;

&lt;p&gt;AENA (the Spanish airport operator, covering all 22 Spanish airports) has the most elaborate status system I encountered. They use two to three character codes from their SIGER ground handling system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FNL, LND, IBK → arrived
BOR, EMB, ULL, OPE, APE → boarding
SAL, AIR, FLY, CER, APA → departed
EST, PRG, INI, SCH, HOR → scheduled
RET, DEM, REM → delayed
CNL, DES, CAN → cancelled
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each code has a specific operational meaning (&lt;code&gt;IBK&lt;/code&gt; = in-block, &lt;code&gt;ULL&lt;/code&gt; = final boarding call, &lt;code&gt;DEM&lt;/code&gt; = ground delay) that the normalizer collapses to the six canonical values. The operational detail is lost, but that's the right tradeoff for a public API.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "unknown" means in practice
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;unknown&lt;/code&gt; bucket catches everything the normalizer can't confidently classify. In production, it appears for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Empty strings or null values (some airports just don't return status for on-time flights)&lt;/li&gt;
&lt;li&gt;Airport-specific codes that haven't been mapped yet&lt;/li&gt;
&lt;li&gt;Status strings in languages I haven't configured (Arabic, Japanese, Chinese)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Returning &lt;code&gt;unknown&lt;/code&gt; is far better than returning a wrong value. Frontends can show "—" for unknown statuses and users understand it means data isn't available.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running the normalizer
&lt;/h2&gt;

&lt;p&gt;The normalizer runs at the end of every scrape, after the field mapper has identified which JSON field contains the status. It's a pure function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;normalizeStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// → 'scheduled' | 'boarding' | 'departed' | 'arrived' | 'delayed' | 'cancelled' | 'unknown'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Input is always a string (after the object guard). Output is always one of the seven values. No exceptions are thrown — unrecognized inputs return &lt;code&gt;'unknown'&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The broader lesson
&lt;/h2&gt;

&lt;p&gt;I went into this project expecting the hard part to be scraping. The really hard part turned out to be normalization — taking data from wildly inconsistent sources and making it uniformly useful.&lt;/p&gt;

&lt;p&gt;Status normalization is one example. The same problem appears with timestamps (airports use their local timezone, not UTC, and the formats vary wildly), location fields (some put "MILAN, MALPENSA" in the IATA field), and airline codes (some use IATA codes, some use ICAO, some use full names).&lt;/p&gt;

&lt;p&gt;The principle that worked: design a strict output schema first, then build the normalization layer to enforce it, and treat every edge case as an addition to the normalizer rather than a special case in the consumer.&lt;/p&gt;

&lt;p&gt;The API is live and free to try at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;. Arrivals and departures for 1,000+ airports, statuses normalized to the seven values above.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Next in this series: how 628,000 accumulated flight records are turned into airline delay rankings, gate predictions, and airport busyness heatmaps.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
    <item>
      <title>How I Built a Web Scraper That Gets Faster and Cheaper With Every Request</title>
      <dc:creator>José Catalá</dc:creator>
      <pubDate>Mon, 06 Apr 2026 16:38:33 +0000</pubDate>
      <link>https://dev.to/jfcaba/how-i-built-a-web-scraper-that-gets-faster-and-cheaper-with-every-request-oo7</link>
      <guid>https://dev.to/jfcaba/how-i-built-a-web-scraper-that-gets-faster-and-cheaper-with-every-request-oo7</guid>
      <description>&lt;p&gt;Most web scrapers are dumb: they do the same work every time. Open a browser, load a page, parse the DOM, extract data, close the browser. Every single request.&lt;/p&gt;

&lt;p&gt;Building MyAirports — a real-time flight data API covering 1,000+ airports — forced me to think about scraping differently. At scale, running a headless browser for every request is expensive and slow. The system I ended up building instead gets faster and cheaper the more it runs, because it learns.&lt;/p&gt;

&lt;p&gt;Here's how it works.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core insight: most airport SPAs load data from internal JSON APIs
&lt;/h2&gt;

&lt;p&gt;Go to any modern airport website. Open DevTools. Click on the Network tab. Filter for XHR/Fetch requests. What you'll see almost immediately is a call to something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET https://api.heathrow.com/pihub/flights/arrivals
GET https://aena.es/sites/Satellite?pagename=AENA_ConsultarVuelos&amp;amp;airport=MAD
GET https://cph.dk/api/FlightInformation/GetFlightInfoTable?direction=A
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The webpage is a React or Angular SPA, and the actual flight data comes from an internal API. That API is often unauthenticated, or authenticated only by session cookies the browser already has. If you can capture that API call, you can make it directly — no browser needed for subsequent requests.&lt;/p&gt;

&lt;p&gt;This is the foundation of the whole system.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 5-layer pipeline
&lt;/h2&gt;

&lt;p&gt;Every flight data request flows through five layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: Operator registry&lt;/strong&gt;&lt;br&gt;
For ~500 airports I've already reverse-engineered the API, the system goes directly there. No discovery needed. A curated registry maps IATA codes to endpoints, required headers, and a transform function for normalizing the response. This handles roughly half of all requests, takes 1-2 seconds, and costs nothing beyond the API call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: API cache (discovered endpoints)&lt;/strong&gt;&lt;br&gt;
The system maintains a database of APIs it has discovered during previous browser sessions. Before firing up a browser, it checks: have we already seen a good API for this airport? If yes, call it directly. This is the "memory" layer — it's how the system gets cheaper over time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 3: Browser discovery&lt;/strong&gt;&lt;br&gt;
Only if both previous layers miss does the system launch a real browser (Playwright, stealth mode). But even here it's not scraping the DOM — it's intercepting network traffic. The browser loads the airport's flight page, and the interceptor scores every JSON response it sees:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;scoreResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flight&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;arrival&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;departure&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;flightNumber&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;flight_no&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;scheduled&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;scheduledTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any response with a score above 5 is a candidate. The best candidate is saved to the &lt;code&gt;ApiCache&lt;/code&gt; table — the discovered endpoint, the required headers, the response shape. Future requests for that airport hit Layer 2 instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 4: Normalizer&lt;/strong&gt;&lt;br&gt;
Regardless of which layer found the data, every response goes through a normalizer that converts it to a unified flight schema. Airport APIs have wildly different shapes — nested objects, flat arrays, GraphQL responses, tables with inconsistent column names. The normalizer handles all of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 5: Persistence and stale-while-revalidate cache&lt;/strong&gt;&lt;br&gt;
Results are stored in PostgreSQL and kept in an in-memory cache with a stale-while-revalidate strategy. A stale response is returned immediately while a fresh scrape runs in the background.&lt;/p&gt;
&lt;h2&gt;
  
  
  The result: the system pays for its own speed
&lt;/h2&gt;

&lt;p&gt;When the system first encounters a new airport, it's slow: a full browser session takes 10-15 seconds. But the next time that airport is requested, the discovered API is available in the cache — and the call takes 1-2 seconds.&lt;/p&gt;

&lt;p&gt;Over time, the API cache fills up. More and more airports become fast. The "discovery" cost is paid exactly once per airport. After that, it's a cheap HTTP call.&lt;/p&gt;

&lt;p&gt;There's also an eviction mechanism: if a cached API endpoint starts returning 0 flights three times in a row, it gets flagged as stale and the system re-discovers it. Airport websites do change their APIs occasionally.&lt;/p&gt;
&lt;h2&gt;
  
  
  The auto-discovery tool
&lt;/h2&gt;

&lt;p&gt;To seed the API cache without waiting for organic traffic, I built a batch auto-discovery CLI that processes airports from the OpenFlights CSV database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run auto-discover &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nt"&gt;--country&lt;/span&gt; IT
npm run auto-discover &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nt"&gt;--min-size&lt;/span&gt; large
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It runs the full browser discovery pipeline for each airport, then writes the discovered endpoints to the config files automatically. Processing 1,300+ airports this way found 276 new ones — airports whose APIs I hadn't previously known existed.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI pattern learning as a last resort
&lt;/h2&gt;

&lt;p&gt;For airports where network interception fails (because the page loads data from a server-side render rather than a client-side API call), there's a DeepSeek-powered fallback: send the page's HTML to the AI and ask it to identify CSS selectors for the flight table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GYD (Baku) — a.flt_row card layout with .fl_col.* columns
KTW (Katowice) — div.timetable__row.flight-board__row with .flight-board__col--*
TLL (Tallinn) — li.flights-list__item card layout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI doesn't always get it right on the first try, but for airports with non-standard layouts it's significantly faster than manually inspecting the DOM.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;p&gt;The most important architectural decision was treating discovery as a one-time investment rather than an ongoing cost. Every browser session is expensive — it's a browser, running JavaScript, waiting for a page to load. If you pay that cost once and cache the result, you amortize it across hundreds of future requests.&lt;/p&gt;

&lt;p&gt;The second insight was scoring rather than parsing: instead of trying to understand an airport's specific API format upfront, intercept everything and look for responses that smell like flight data. The scoring heuristic is simple but effective, and it works across wildly different API shapes.&lt;/p&gt;

&lt;p&gt;The third was building eviction in from the start. APIs change. Caching without a sensible eviction strategy means you end up with a database full of dead endpoints.&lt;/p&gt;

&lt;p&gt;The result is a system that today handles most requests in 1-2 seconds, covers 1,000+ airports, and gets incrementally better with every new airport it encounters.&lt;/p&gt;

&lt;p&gt;The API is free to try at &lt;a href="https://myairports.online/developers" rel="noopener noreferrer"&gt;myairports.online/developers&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you found this interesting, I write about building MyAirports — a real-time flight data API. The next post covers how the system normalizes flight status strings from 7 different languages.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>webdev</category>
      <category>javascript</category>
      <category>api</category>
    </item>
  </channel>
</rss>
