Saheer K

Posted on Mar 23 • Edited on Mar 29

How We Built One of Europe's First Large-Scale MAP-T Deployments on RDK-B

#architecture #networking #systemdesign #systems

The Problem: IPv4 Exhaustion at ISP Scale

By late 2019, IPv4 address exhaustion had moved from a theoretical concern to an operational reality. On 25 November 2019, RIPE NCC, the regional internet registry for Europe, made its final IPv4 allocation from the last remaining addresses in its available pool. From then on, new requests went onto a waiting list served only from recovered address space. For an ISP like SKY, serving millions of broadband subscribers across the UK and Italy, this created a concrete engineering challenge: how do you keep growing your subscriber base when you cannot acquire new IPv4 addresses at reasonable cost?

The long-term answer is IPv6. But the internet in 2019 was still overwhelmingly IPv4: content providers, gaming platforms, and enterprise services all expected IPv4 reachability. A pure IPv6 deployment with no IPv4 service would have cut customers off from every IPv4-only destination overnight.

ISPs needed a transition mechanism — a way to run a pure IPv6 core network while still delivering working IPv4 connectivity to every subscriber. Several approaches existed:

DS-Lite: tunnels IPv4 over IPv6, but requires stateful NAT in the ISP's network
NAT64: lets IPv6-only clients reach IPv4 servers, but breaks IPv4-only applications and IPv4 address literals on the client side
464XLAT: combines CLAT on the device with PLAT at the ISP, but the PLAT is still a stateful NAT64 in the ISP's network
MAP-T: stateless, deterministic, scales without per-subscriber state in the ISP's network

SKY chose MAP-T. Here is why.

Why MAP-T?

MAP-T (Mapping of Address and Port using Translation) is defined in RFC 7599. The key distinction from the alternatives is that the ISP's network holds no per-session state.

With DS-Lite or carrier-grade NAT, the ISP's border routers maintain a per-subscriber NAT table — tracking every active TCP/UDP session for every customer. At SKY's scale, that means hundreds of millions of concurrent state entries, requiring expensive specialised hardware and creating a single point of failure.

MAP-T eliminates this entirely. Instead of maintaining state, it uses a deterministic mathematical mapping:

IPv4 address + port range = f(IPv6 prefix, MAP-T rule)

A subscriber's IPv6 prefix, combined with operator-provisioned MAP-T rules delivered via DHCPv6, deterministically maps to a shared IPv4 address and a specific port range. Every MAP-T capable device — including the CPE in the customer's home — can independently compute this mapping without any central coordination.
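To make the mapping concrete, here is a minimal Python sketch of the address and port-set derivation described in RFC 7597 section 5, which MAP-T reuses. It is not code from the SKY implementation. The `MapRule` class and both function names are made up for illustration, and the sketch only covers the case where the rule assigns a full or shared IPv4 address (not an IPv4 prefix). The sample values are the documentation-prefix example used in the RFCs.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class MapRule:
    """Hypothetical container for one Basic Mapping Rule."""
    ipv6_prefix: ipaddress.IPv6Network  # Rule IPv6 prefix
    ipv4_prefix: ipaddress.IPv4Network  # Rule IPv4 prefix
    ea_len: int                         # number of EA bits
    psid_offset: int = 6                # RFC 7597 default offset


def derive_mapping(rule, end_user_prefix):
    """Return (IPv4 address, PSID, PSID length) for one end-user IPv6 prefix."""
    if not end_user_prefix.subnet_of(rule.ipv6_prefix):
        raise ValueError("end-user prefix is outside the rule IPv6 prefix")
    ea_end = rule.ipv6_prefix.prefixlen + rule.ea_len
    if end_user_prefix.prefixlen < ea_end:
        raise ValueError("end-user prefix is too short to carry the EA bits")
    suffix_len = 32 - rule.ipv4_prefix.prefixlen
    psid_len = rule.ea_len - suffix_len
    if psid_len < 0:
        raise ValueError("rule assigns an IPv4 prefix; not covered by this sketch")
    # EA bits sit directly after the rule IPv6 prefix.
    ea_bits = (int(end_user_prefix.network_address) >> (128 - ea_end)) & ((1 << rule.ea_len) - 1)
    # High EA bits complete the IPv4 address, low EA bits are the PSID.
    ipv4 = ipaddress.IPv4Address(int(rule.ipv4_prefix.network_address) | (ea_bits >> psid_len))
    psid = ea_bits & ((1 << psid_len) - 1)
    return ipv4, psid, psid_len


def port_ranges(psid, psid_len, psid_offset):
    """Return the (first, last) port pairs that belong to this PSID."""
    m_len = 16 - psid_offset - psid_len
    if m_len < 0:
        raise ValueError("PSID offset plus PSID length exceeds 16 bits")
    first_a = 1 if psid_offset > 0 else 0  # A = 0 is skipped to exclude low ports
    ranges = []
    for a in range(first_a, 1 << psid_offset):
        start = (a << (16 - psid_offset)) | (psid << m_len)
        ranges.append((start, start + (1 << m_len) - 1))
    return ranges


rule = MapRule(ipaddress.IPv6Network("2001:db8::/40"),
               ipaddress.IPv4Network("192.0.2.0/24"), ea_len=16)
ipv4, psid, psid_len = derive_mapping(rule, ipaddress.IPv6Network("2001:db8:12:3400::/56"))
ranges = port_ranges(psid, psid_len, rule.psid_offset)
print(ipv4, hex(psid), ranges[0], ranges[-1], len(ranges))
# prints: 192.0.2.18 0x34 (1232, 1235) (64720, 64723) 63
```

With a PSID length of 0 the same functions return ranges that together cover ports 1024 to 65535, which corresponds to the 1:1 case where a subscriber has an IPv4 address to themselves.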

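The ISP's core network becomes stateless. The stateful NAT44 step happens at the edge, in the customer's gateway, and the ISP's border relay only applies the same deterministic mapping to translate between IPv6 and IPv4. No session tables in the core. No expensive state synchronisation. Scales linearly with subscribers.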

The Implementation: RDK-B Stack

SKY's broadband gateways ran on RDK-B (Reference Design Kit for Broadband) — an open-source, Linux-based platform used by ISPs and gateway manufacturers worldwide. RDK-B is a layered architecture:

┌─────────────────────────────────────┐
│     TR-069 / WebPA / ACS            │  Remote management
├─────────────────────────────────────┤
│     CcspPandM (TR-181 datamodel)    │  Device management
├─────────────────────────────────────┤
│     wan-manager                     │  WAN state machine
├─────────────────────────────────────┤
│     Platform HAL                    │  Hardware abstraction
├─────────────────────────────────────┤
│     iptables / netfilter            │  Packet filtering
├─────────────────────────────────────┤
│     Linux kernel + nat46 module     │  Packet translation
└─────────────────────────────────────┘

MAP-T integration required changes at every layer. Here is what each piece involved.

1. nat46 Kernel Module

The actual IPv4-to-IPv6 packet translation happens in the Linux kernel using the nat46 kernel module. This module inserts itself into the netfilter pipeline and performs the stateless address and port translation according to the MAP-T rules.

Key challenges here:

Loading and configuring nat46 correctly with the right MAP-T parameters
Handling the 1:1 port ratio case — when a subscriber gets an entire IPv4 address to themselves, the configuration differs significantly from the shared-address case
Ensuring the module is torn down cleanly when the WAN interface goes down, so stale translation state does not affect subsequent connections
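The article does not include SKY's actual nat46 configuration, so the following is only a rough sketch of the load and teardown steps. It assumes the control-file interface of the open-source nat46 module as it is used in OpenWrt, where text commands are written to /proc/net/nat46/control. The interface name `map-wan0`, the DMR prefix, and the helper names are placeholders, and a vendor build of the module may expose different parameters.

```python
from typing import Dict, List

NAT46_CONTROL = "/proc/net/nat46/control"  # assumed control file of the nat46 module


def nat46_commands(ifname: str, rule_v4: str, rule_v6: str,
                   ea_len: int, psid_offset: int, dmr_prefix: str) -> Dict[str, List[str]]:
    """Build the control-file commands for bring-up and teardown."""
    config = (
        f"config {ifname} "
        f"local.style MAP local.v4 {rule_v4} local.v6 {rule_v6} "
        f"local.ea-len {ea_len} local.psid-offset {psid_offset} "
        # Everything outside the MAP domain is reached via the default mapping rule.
        f"remote.v4 0.0.0.0/0 remote.v6 {dmr_prefix} remote.style RFC6052"
    )
    return {
        "up": [f"add {ifname}", config],
        # Deleting the device on link down avoids leaving a stale translator behind.
        "down": [f"del {ifname}"],
    }


def apply(commands: List[str], control_path: str = NAT46_CONTROL) -> None:
    """Write each command to the control file, one write per command."""
    for command in commands:
        with open(control_path, "w") as control:
            control.write(command + "\n")


cmds = nat46_commands("map-wan0", "192.0.2.0/24", "2001:db8::/40",
                      ea_len=16, psid_offset=6, dmr_prefix="2001:db8:ffff::/64")
for line in cmds["up"] + cmds["down"]:
    print(line)
```

Keeping command construction separate from `apply` means the generated strings can be checked on a development machine that does not have the module loaded.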

2. iptables / netfilter Rules

Loading the nat46 module is not enough — traffic must be explicitly steered into it. This required adding rules in the PREROUTING and POSTROUTING chains to intercept IPv4 packets destined for the WAN and redirect them through the nat46 translation path.

This layer had several non-trivial edge cases:

Interface binding — rules had to be correctly scoped to the MAP-T virtual interface, not applied globally across all interfaces. Getting this wrong causes translation to run on traffic it should not touch.

Rule ordering — SKY's production gateways already had a complex iptables ruleset covering firewall rules, port forwarding, QoS marking, and connection tracking. MAP-T rules had to be inserted at precisely the right position in the chain. Wrong ordering causes subtle breakage — some traffic works, some does not — which is extremely difficult to diagnose in the field.

Teardown on link down — when the physical WAN link goes down, all MAP-T iptables rules had to be flushed in the correct sequence. Partial teardown leaves stale rules that affect the next connection attempt.

1:1 port ratio — when a subscriber maps to an entire IPv4 address (port ratio 1:1), the iptables rules differ from the shared-address scenario and required separate handling.
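SKY's rules were not published, so the sketch below only illustrates one common way to keep interface scoping, rule position, and teardown order under control: put the MAP-T rules in a dedicated chain, hook that chain into POSTROUTING at an explicit position, and unhook it before flushing and deleting it. The chain name `MAPT_POSTROUTING`, the interface name, the position, and the single SNAT rule are all illustrative assumptions. The functions only build command lines and do not run them.

```python
from typing import List, Sequence

Cmd = List[str]


def bring_up(iface: str, chain: str, position: int,
             rule_specs: Sequence[Sequence[str]]) -> List[Cmd]:
    """Commands to create the MAP-T chain, fill it, then hook it in."""
    cmds: List[Cmd] = [["iptables", "-t", "nat", "-N", chain]]
    for spec in rule_specs:
        cmds.append(["iptables", "-t", "nat", "-A", chain, *spec])
    # Hook the chain in last, scoped to the MAP-T interface, so packets
    # never traverse a half-populated chain.
    cmds.append(["iptables", "-t", "nat", "-I", "POSTROUTING", str(position),
                 "-o", iface, "-j", chain])
    return cmds


def tear_down(iface: str, chain: str) -> List[Cmd]:
    """Commands to remove the MAP-T rules in a safe order."""
    return [
        # 1. Unhook first: a chain that is still referenced cannot be deleted.
        ["iptables", "-t", "nat", "-D", "POSTROUTING", "-o", iface, "-j", chain],
        # 2. Flush the rules inside the chain.
        ["iptables", "-t", "nat", "-F", chain],
        # 3. Delete the now-empty chain.
        ["iptables", "-t", "nat", "-X", chain],
    ]


# 1:1 case: the subscriber owns the whole address, so no port restriction is needed.
one_to_one = [["-j", "SNAT", "--to-source", "192.0.2.18"]]
for cmd in bring_up("map-wan0", "MAPT_POSTROUTING", 1, one_to_one):
    print(" ".join(cmd))
for cmd in tear_down("map-wan0", "MAPT_POSTROUTING"):
    print(" ".join(cmd))
```

The shared-address case would replace the single SNAT rule with rules restricted to the subscriber's port ranges. How flows are spread across those ranges is deliberately left out here.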

This work was part of SKY's internal platform layer and was not open sourced, but it was essential to production correctness.

3. wan-manager

The wan-manager is the core WAN state machine in RDK-B. It manages which physical interfaces are used for WAN, runs the Layer 3 protocols to establish internet connectivity, and handles connection lifecycle including bring-up, monitoring, failover, and teardown.

MAP-T required adding a new WAN mode alongside the existing DHCP, PPPoE, and static IP modes. Key work included:

DHCPv6 option parsing — MAP-T rules are delivered by the ISP's DHCPv6 server as DHCPv6 options. The parser had to handle missing options, malformed values, and differences between SKY UK and SKY Italia network configurations gracefully.

Interface lifecycle management — bringing up the MAP-T virtual interface, monitoring it, and tearing it down correctly on link events. A key issue we hit early: on physical link down, both MAP-T and IPv6 had to be torn down together. Tearing down one without the other left the gateway in a broken intermediate state.

Route monitoring — ensuring the correct IPv6 default route was maintained under MAP-T, and recovering correctly after network outages where route configuration could be lost.

Crash recovery — wan-manager crashes in MAP-T setup required careful handling to ensure the gateway recovered to a clean state rather than getting stuck.
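The lifecycle rules above can be sketched as a small state machine. This is not wan-manager code: the class, state names, and callbacks are hypothetical, and the teardown callbacks are assumed to be idempotent. The point it illustrates is that link down always runs both teardown steps, MAP-T first and then IPv6, even if one of them fails, and that the same path can be reused after a crash to get back to a known clean state.

```python
from enum import Enum, auto
from typing import Callable, List


class State(Enum):
    DOWN = auto()
    IPV6_UP = auto()
    MAPT_UP = auto()


class MaptWanLink:
    """Hypothetical MAP-T WAN lifecycle with coupled teardown."""

    def __init__(self, mapt_down: Callable[[], None], ipv6_down: Callable[[], None]):
        self.state = State.DOWN
        # MAP-T runs on top of IPv6, so it is torn down first.
        self._teardown_steps = [mapt_down, ipv6_down]

    def on_ipv6_up(self) -> None:
        if self.state is State.DOWN:
            self.state = State.IPV6_UP

    def on_mapt_configured(self) -> None:
        # MAP-T can only come up once IPv6 is established.
        if self.state is State.IPV6_UP:
            self.state = State.MAPT_UP

    def on_link_down(self) -> List[Exception]:
        """Run every teardown step, collecting failures instead of stopping."""
        errors: List[Exception] = []
        for step in self._teardown_steps:
            try:
                step()
            except Exception as exc:  # keep going so no layer is left half up
                errors.append(exc)
        self.state = State.DOWN
        return errors

    def recover_after_crash(self) -> List[Exception]:
        """In-memory state is unknown after a crash, so tear everything down."""
        return self.on_link_down()


calls: List[str] = []
link = MaptWanLink(lambda: calls.append("mapt"), lambda: calls.append("ipv6"))
link.on_ipv6_up()
link.on_mapt_configured()
link.on_link_down()
print(calls, link.state.name)  # prints: ['mapt', 'ipv6'] DOWN
```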

Open source contributions in rdkcentral/wan-manager:

cd2c1ac — Initial MAP-T integration in wan-manager
9d79cb5 — RFC data model implementation for MAP-T
2f3f077 — MAP-T IPv6 default route management
c9254e1 — MAP-T and IPv6 teardown on physical link down
854451c — Crash recovery in MAP-T setup
de21f93 — Connection recovery after network outage

4. CcspPandM — TR-181 Data Model

CcspPandM is the CCSP component responsible for exposing the TR-181 device data model over TR-069 and WebPA for remote management by SKY's ACS (Auto Configuration Server).

MAP-T required implementing new TR-181 data model objects, the Device.MAP.Domain.{i} table and its Device.MAP.Domain.{i}.Rule.{i} sub-table, to expose MAP-T configuration and status for remote management and diagnostics.

This included integrating MAP-T support into the Dibbler DHCPv6 client, so that MAP-T provisioning rules received via DHCPv6 were correctly parsed, stored in the data model, and made available to wan-manager.
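For readers unfamiliar with the wire format, here is a hedged Python sketch of parsing one OPTION_S46_RULE payload as laid out in RFC 7598, including the optional OPTION_S46_PORTPARAMS sub-option. It is not the Dibbler or CcspPandM code. The class and function names are invented for illustration, and the input is assumed to be the option body with the outer option code and length already stripped.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Optional

OPTION_S46_PORTPARAMS = 93  # sub-option code from RFC 7598


class S46ParseError(ValueError):
    """Raised when the option is truncated or carries invalid values."""


@dataclass
class S46Rule:
    is_fmr: bool
    ea_len: int
    ipv4_prefix: ipaddress.IPv4Network
    ipv6_prefix: ipaddress.IPv6Network
    psid_offset: Optional[int] = None
    psid_len: Optional[int] = None
    psid: Optional[int] = None


def parse_s46_rule(payload: bytes) -> S46Rule:
    if len(payload) < 8:
        raise S46ParseError("rule option shorter than its fixed header")
    flags, ea_len, prefix4_len = payload[0], payload[1], payload[2]
    prefix6_len = payload[7]
    if ea_len > 48 or prefix4_len > 32 or prefix6_len > 128:
        raise S46ParseError("length field out of range")
    n6 = (prefix6_len + 7) // 8  # IPv6 prefix is padded to whole octets
    if len(payload) < 8 + n6:
        raise S46ParseError("truncated IPv6 prefix")
    try:
        v4 = ipaddress.IPv4Network((payload[3:7], prefix4_len))
        v6 = ipaddress.IPv6Network((payload[8:8 + n6].ljust(16, b"\x00"), prefix6_len))
    except ValueError as exc:  # e.g. bits set beyond the prefix length
        raise S46ParseError(str(exc)) from exc
    rule = S46Rule(bool(flags & 0x01), ea_len, v4, v6)

    pos = 8 + n6
    while pos < len(payload):  # walk the encapsulated sub-options
        if len(payload) - pos < 4:
            raise S46ParseError("truncated sub-option header")
        code, length = struct.unpack_from("!HH", payload, pos)
        pos += 4
        if len(payload) - pos < length:
            raise S46ParseError("truncated sub-option body")
        body = payload[pos:pos + length]
        pos += length
        if code == OPTION_S46_PORTPARAMS:
            if length != 4:
                raise S46ParseError("bad port-parameters length")
            offset, psid_len, raw = struct.unpack("!BBH", body)
            if offset > 15 or psid_len > 16:
                raise S46ParseError("port parameters out of range")
            rule.psid_offset, rule.psid_len = offset, psid_len
            # The PSID occupies the leftmost psid_len bits of the 16-bit field.
            rule.psid = raw >> (16 - psid_len) if psid_len else 0
        # Unknown sub-options are skipped rather than treated as errors.
    return rule


sample = bytes([0, 16, 24, 192, 0, 2, 0, 40]) + bytes.fromhex("20010db800")
print(parse_s46_rule(sample))
```

Raising a dedicated exception for every malformed case lets the caller treat a bad rule as "no MAP-T configuration" and fall back cleanly, rather than acting on partially parsed values.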

Open source contributions in lgirdk/ccsp-p-and-m:

74323b21f — MAP-T support in Dibbler DHCPv6 client
d8a8a24ac — RFC data model implementation for MAP-T
b3436b4d7 — MAP-T data model implementation
c20bc1833 — MAP-T data model prefix fix

5. Supporting Components

rdkb-RdkVlanBridgingManager — VLAN and bridging management, updated to handle the MAP-T virtual interface correctly within the gateway's bridging topology.

3239296 — Pointer initialisation fix in rbus API calls

GitHub: rdkcmf/rdkb-RdkVlanBridgingManager

rdk-rdk_logger — log management component, updated to route MAP-T diagnostic logs to non-volatile memory. On a field-deployed gateway, volatile log storage means diagnostic logs are lost on reboot. Moving MAP-T logs to non-volatile storage was essential for diagnosing production issues where attaching a serial console is not an option.

03cded2 — MAP-T logs moved to non-volatile memory

GitHub: rdkcmf/rdk-rdk_logger

Deployment and Open Source

The implementation was deployed across SKY UK and SKY Italia broadband gateways — one of the earliest large-scale commercial MAP-T deployments in Europe, serving millions of broadband subscribers across both markets.

Following successful production deployment, the MAP-T work was contributed upstream to the open-source RDK-B repositories on GitHub. RDK-B is used by ISPs and gateway manufacturers worldwide — including operators across Europe, North America, and Asia. The implementation is now available to any organisation building on the RDK-B platform.

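Open source repos containing contributions from this project:

rdkcentral/wan-manager
lgirdk/ccsp-p-and-m
rdkcmf/rdkb-RdkVlanBridgingManager
rdkcmf/rdk-rdk_logger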

Key Lessons

Stateless does not mean simple. MAP-T eliminates per-session state on the ISP side, but the CPE implementation has to be very precise. A wrong port range calculation means a subscriber gets connectivity for some ports but not others — which manifests as random application failures that are extremely difficult to diagnose in the field.

DHCPv6 option parsing is fragile at scale. MAP-T rules are delivered as DHCPv6 options, but the format and presence of these options varies between network configurations. The parser has to handle missing options, malformed values, and mid-session rule changes gracefully. Production failures here are hard to reproduce in a lab.

iptables rule ordering is load-bearing. On a production gateway with firewall rules, QoS marking, and port forwarding all active simultaneously, inserting MAP-T rules in the wrong position in the chain causes subtle breakage. Getting the ordering right — and ensuring teardown flushes rules in the correct sequence — required significant iteration.

Log storage matters more than you think. The decision to move MAP-T logs to non-volatile memory came after early field issues where reboots were wiping the diagnostic data needed to understand what had gone wrong. Field debugging without logs is painful — especially when the failure is intermittent.

Teardown is as important as bring-up. Most of the difficult bugs we hit were in teardown paths — link down events, crash recovery, partial failure scenarios. It is easy to focus engineering effort on the happy path and underinvest in cleanup. Do not.

References

RFC 7599: Mapping of Address and Port using Translation (MAP-T)
RFC 7598: DHCPv6 Options for Configuration of Softwire Address and Port-Mapped Clients
RIPE NCC: The RIPE NCC has run out of IPv4 Addresses (25 Nov 2019)
RDK-B WAN Manager
GitHub: Opensource Commit
