The NDC Revolution and What It Means for Data Engineers
The airline industry is in the midst of its most significant distribution transformation in decades, and most data engineers working in travel tech are only beginning to grasp the magnitude of the shift. IATA's New Distribution Capability (NDC) standard isn't just another API specification—it's a fundamental reimagining of how airline products move from inventory systems to customer screens, and it's creating challenges and opportunities that will reshape our discipline for years to come.
I've spent the better part of the last five years watching this transformation unfold, and I can tell you that the technical debt accumulated from decades of GDS-centric distribution is now coming due. The question isn't whether your data infrastructure can handle NDC; it's whether you're ready to rebuild large portions of it from the ground up.
From PNR to Offer: A Paradigm Shift in Data Modelling
The traditional airline distribution model centred on the Passenger Name Record—a flat, text-based format that evolved from the teletype era. Every data engineer who's parsed EDIFACT messages or wrangled Amadeus cryptic entries knows the pain of extracting structured information from what is essentially a formatted string with decades of accumulated quirks.
NDC replaces this with an offer and order model built on modern XML schemas. An offer represents a specific combination of flights, ancillaries, and pricing valid for a limited time. An order represents a confirmed purchase with all its associated services and fulfilment obligations. This sounds straightforward until you realise that a single shopping request might generate hundreds of dynamic offers, each with its own validity window, and that these offers don't correspond to traditional fare classes or booking codes.
I've had to completely rethink how we model availability in data warehouses. The old approach of storing fare classes, booking codes, and inventory counts doesn't map cleanly to a world where pricing is algorithmically generated in response to specific shopping requests. You're no longer dealing with relatively static fare tables that change a few times per day; you're dealing with ephemeral offers that exist only in the context of a specific shopping session.
The schema complexity alone is substantial. An NDC OrderViewRS response can contain nested structures for travellers, service associations, payment information, and fulfilment statuses that require careful normalisation. I've found that graph database patterns often make more sense than traditional relational schemas for representing the relationships between offers, orders, service definitions, and traveller profiles.
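To make the offer-and-order distinction concrete, here is a minimal sketch of how the two entities might be modelled internally. The class and field names are my own illustrative choices, not taken from the IATA schemas—the real OrderViewRS structures are far deeper.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

@dataclass
class ServiceDefinition:
    service_id: str
    name: str          # e.g. "Checked bag 23kg"
    price: float

@dataclass
class Offer:
    offer_id: str
    segments: list[str]                  # flight segment references
    services: list[ServiceDefinition]
    total_price: float
    expires_at: datetime                 # offers are ephemeral by design

    def is_valid(self, now: datetime) -> bool:
        # An offer only exists within its validity window
        return now < self.expires_at

@dataclass
class Order:
    order_id: str
    source_offer_id: str                 # the offer that was accepted
    travellers: list[str]
    services: list[ServiceDefinition] = field(default_factory=list)

now = datetime.now(timezone.utc)
offer = Offer("OF-1", ["LHR-JFK"], [], 423.50, now + timedelta(minutes=20))
order = Order("ORD-9", offer.offer_id, ["T1"])
```

The key design point is the `expires_at` field: unlike a fare-class row, an offer carries its own lifetime, so any table or cache that stores offers must also store (and honour) that window.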
The API-First Reality and Its Infrastructure Demands
NDC is fundamentally an API-first standard, which means the asynchronous, message-based patterns that dominated GDS integration are giving way to synchronous REST and SOAP interactions. This shift has profound implications for how we architect data pipelines.
Traditional airline distribution relied heavily on queue processing—reservations were created, modified, and cancelled through queue messages that could be processed in batch. NDC shopping and ordering happen in real-time API calls with strict response time requirements. If your infrastructure can't return a shopping response in under two seconds, you've lost the customer.
I've learned that the data engineering challenges here extend far beyond simply calling APIs. You need sophisticated caching layers to avoid redundant shopping requests, circuit breakers to handle airline API failures gracefully, and rate limiting to manage quota consumption across multiple airline partners. The observability requirements are also completely different—you can't wait for batch job logs to investigate issues when every millisecond of API latency affects conversion.
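A circuit breaker for airline API calls can be sketched in a few lines. This is a deliberately minimal version—the thresholds, timings, and error handling are illustrative, and a production setup would add per-partner state and half-open probing policies.

```python
import time

class CircuitBreaker:
    """Stop calling a failing airline API until a cool-down has passed."""

    def __init__(self, failure_threshold=5, reset_after=30.0):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after   # seconds before a retry is allowed
        self.failures = 0
        self.opened_at = None            # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: skipping airline API call")
            self.opened_at = None        # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                # success closes the circuit
        return result
```

Failing fast like this matters precisely because of the latency budget above: a shopping request that waits out a timeout against a dead partner API has already lost the customer.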
The volume characteristics change dramatically as well. A single customer shopping session might generate dozens of API calls as they refine search parameters, compare options, and explore ancillary services. Each call produces detailed offer data that needs to be captured for analytics, even though most offers will never be purchased. I've seen shopping-to-booking ratios of 100:1 or higher, which means your data infrastructure needs to handle two orders of magnitude more traffic than your actual transaction volume would suggest.
Ancillary Services and the Unbundling Problem
One of NDC's core promises is rich merchandising of ancillary services—seats, bags, meals, lounge access, and increasingly creative product bundles. For data engineers, this unbundling creates a many-to-many relationship problem that legacy systems were never designed to handle.
In the GDS world, ancillaries were often bolted on as special service requests or stored as cryptic codes in free-text fields. NDC makes ancillaries first-class entities with their own pricing, availability, and fulfilment rules. A single order might contain base fares for three passengers, each with different cabin selections, baggage allowances, meal preferences, and entertainment packages.
I've found that modelling this effectively requires treating services as independent entities that can be associated with specific travellers and flight segments through a flexible association layer. And the challenge is that these associations have their own business rules—certain ancillaries are only valid in combination with specific fare families, some have dependencies on traveller status or frequent flyer tier, and others have complex rebooking or refund policies that differ from the base fare.
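A sketch of that association layer, under simplified assumptions: each association links one service to one traveller on one segment, and validity rules are looked up per service. The service codes, fare families, and rule table are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceAssociation:
    service_id: str
    traveller_id: str
    segment_id: str

# Hypothetical rule: some services are only valid with certain fare
# families — a stand-in for the much richer rules in real NDC data.
VALID_FARE_FAMILIES = {"SEAT-EXTRA-LEGROOM": {"FLEX", "BUSINESS"}}

def association_allowed(assoc, fare_family_by_segment):
    allowed = VALID_FARE_FAMILIES.get(assoc.service_id)
    if allowed is None:
        return True   # no restriction recorded for this service
    return fare_family_by_segment[assoc.segment_id] in allowed

assoc = ServiceAssociation("SEAT-EXTRA-LEGROOM", "T1", "SEG1")
print(association_allowed(assoc, {"SEG1": "BASIC"}))  # False: wrong fare family
```

Keeping the associations as their own entities (rather than columns on the order) is what makes the many-to-many relationships tractable: one traveller can hold many services, and one service definition can apply across many travellers and segments.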
The analytics implications are equally complex. Traditional metrics like average fare or load factor become less meaningful when significant revenue comes from unbundled services. I've had to develop new frameworks for measuring ancillary attachment rates, bundle take-up, and service-level profitability that account for the dynamic nature of NDC offers.
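As a small example of one such metric, an attachment rate can be computed as the share of orders carrying at least one unit of a given service. The order records here are a simplified stand-in for real order data.

```python
orders = [
    {"order_id": "O1", "services": ["BAG", "SEAT"]},
    {"order_id": "O2", "services": []},
    {"order_id": "O3", "services": ["BAG"]},
    {"order_id": "O4", "services": ["MEAL", "BAG", "SEAT"]},
]

def attachment_rate(orders, service_id):
    """Share of orders that include at least one unit of the service."""
    with_service = sum(1 for o in orders if service_id in o["services"])
    return with_service / len(orders)

print(f"BAG attachment: {attachment_rate(orders, 'BAG'):.0%}")  # 75%
```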
The Schema Versioning Nightmare
IATA releases new versions of the NDC schema regularly, and individual airlines often implement airline-specific extensions or interpretations. This creates a versioning problem that makes traditional API versioning strategies look simple by comparison.
I've encountered situations where we needed to support three different NDC schema versions simultaneously because different airline partners were at different stages of their implementation journey. The data models need to accommodate this heterogeneity without creating separate pipelines for each version, which means building abstraction layers that can map different schema versions to a canonical internal representation.
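The abstraction layer can be sketched as a per-version normaliser that maps each payload shape onto one canonical internal representation. The field paths below are invented for illustration—real NDC versions differ in far more than a renamed element—but the pattern of routing on version and emitting one shape holds.

```python
def normalise_offer(payload: dict, version: str) -> dict:
    """Map a version-specific offer payload to a canonical internal shape."""
    if version == "17.2":
        return {
            "offer_id": payload["OfferID"],
            "total_amount": float(payload["TotalPrice"]["Amount"]),
            "currency": payload["TotalPrice"]["Code"],
        }
    if version == "21.3":
        return {
            "offer_id": payload["Offer"]["OfferRefID"],
            "total_amount": float(payload["Offer"]["Price"]["TotalAmount"]),
            "currency": payload["Offer"]["Price"]["CurCode"],
        }
    raise ValueError(f"unsupported NDC schema version: {version}")

v17 = {"OfferID": "OF-1", "TotalPrice": {"Amount": "423.50", "Code": "GBP"}}
print(normalise_offer(v17, "17.2"))
```

Everything downstream of `normalise_offer` then works against a single schema, so adding a fourth airline version means adding one mapping branch rather than a fourth pipeline.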
The testing burden is substantial. You can't simply mock API responses because the schema variations between airlines mean that a response structure that works perfectly for one carrier might be invalid for another. I've invested heavily in contract testing and schema validation frameworks that can catch incompatibilities before they reach production.
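Even a lightweight structural check catches many per-carrier variations before production. A real pipeline would validate against each airline's XSD, but this standard-library sketch shows the contract-testing idea: assert that the elements your parser depends on are actually present.

```python
import xml.etree.ElementTree as ET

# Element paths this consumer depends on — the "contract" with the airline.
REQUIRED_PATHS = ["./Offer", "./Offer/TotalPrice"]

def check_contract(xml_text: str) -> list[str]:
    """Return the list of required paths missing from the response."""
    root = ET.fromstring(xml_text)
    return [p for p in REQUIRED_PATHS if root.find(p) is None]

response = "<AirShoppingRS><Offer><TotalPrice>423.50</TotalPrice></Offer></AirShoppingRS>"
print(check_contract(response))  # [] — all required elements present
```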
Real-Time Analytics and the Death of Batch Processing
The shift to API-first distribution means that many analytics use cases that were previously satisfied by overnight batch processes now require near-real-time data streams. Revenue management needs current shopping data to adjust pricing algorithms. Customer service needs immediate access to order status across multiple airline systems. Marketing needs to track offer presentation and conversion in real-time to optimise merchandising strategies.
I've found that traditional data warehouse architectures struggle with this requirement. Loading NDC transaction data through nightly ETL jobs means your analytics are always at least 24 hours stale, which is unacceptable when pricing decisions need to respond to demand signals within hours or even minutes.
This has pushed me toward streaming architectures using tools like Apache Kafka and real-time processing frameworks. The challenge is that airline APIs don't emit events—you have to poll for updates or implement webhook listeners, then transform those API responses into event streams that can feed real-time analytics pipelines.
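The poll-to-event transformation reduces to diffing successive snapshots and emitting one event per change. A minimal sketch, with illustrative field names and the Kafka producer omitted:

```python
def diff_to_events(previous: dict, current: dict) -> list[dict]:
    """Compare two order snapshots and emit one event per changed field."""
    events = []
    for key, new_value in current.items():
        old_value = previous.get(key)
        if old_value != new_value:
            events.append({"field": key, "old": old_value, "new": new_value})
    return events

before = {"status": "CONFIRMED", "seat": "14A"}
after = {"status": "CONFIRMED", "seat": "2C"}
print(diff_to_events(before, after))  # one event: seat 14A -> 2C
```

In practice each emitted event would be keyed by order ID and published to a topic, giving downstream consumers a change stream even though the source API only ever answered polls.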
The state management becomes particularly complex. An order might be modified multiple times—seats changed, ancillaries added, traveller details updated—and you need to maintain both the current state and the full history of changes for compliance and analytics purposes. I've experimented with event sourcing patterns where each API interaction is captured as an immutable event, allowing you to reconstruct state at any point in time.
My View on Where This Is Heading
Looking ahead, I believe we're still in the early stages of understanding what NDC means for data infrastructure. The standard itself will continue to evolve, but the more fundamental shift is in how airlines think about distribution as a data-driven, algorithmically optimised process rather than a static inventory management problem.
The data engineering skills required for travel technology are converging with those needed in e-commerce and digital platforms more broadly. We need to think like product engineers building real-time systems, not just data engineers building analytical pipelines. The distinction between transactional and analytical systems is blurring as the feedback loops between pricing, merchandising, and customer behaviour tighten to near-instantaneous timescales.
I'm particularly excited about the potential for machine learning in this new paradigm. When offers are dynamically generated and every shopping interaction is captured in detail, you have the raw material for sophisticated personalisation and optimisation that was impossible with legacy distribution. The challenge is building data platforms that can support both the operational demands of NDC distribution and the experimental needs of ML model development.
The NDC revolution is fundamentally about moving airline distribution from a batch-oriented, message-based architecture to a real-time, API-first platform. For those of us building the data infrastructure that powers this transformation, it's an opportunity to apply modern engineering practices to an industry that desperately needs them—and to prove that travel technology can be every bit as sophisticated as what we see in other digital-native sectors.
About Martin Tuncaydin
Martin Tuncaydin is an AI and Data executive in the travel industry, with deep expertise spanning machine learning, data engineering, and the application of emerging AI technologies across travel platforms. Follow Martin Tuncaydin for more insights on NDC and data engineering.