GitHub Repository: https://github.com/hejhdiss/lkm-ndm-tcp
Previous Article: NDM-TCP: Correction, Clarification and Real Performance Results
Introduction: Testing Against BBR (Assumed v1)
After the initial testing of NDM-TCP against TCP Reno and Cubic (documented in the previous article), a comparison against BBR (Bottleneck Bandwidth and RTT) was conducted. This article presents the test results comparing NDM-TCP standard version (v1.0) against BBR under identical network conditions.
BBR Version Note: The exact BBR version is unknown as the source code does not contain explicit version information. Based on the implementation characteristics (classic STARTUP → DRAIN → PROBE_BW → PROBE_RTT state machine), it is assumed to be BBR v1, not the newer BBR v2/v3 variants.
Important Context: This is NOT production-grade or standardized TCP congestion control testing. These are experimental implementations tested in a simulated environment for educational and research purposes.
Test Environment
Hardware/Virtualization:
- Xubuntu 24.04
- VMware 17 (Virtual Machine)
- Linux Kernel 6.11.0
Testing Tools:
- tc (traffic control) for network emulation
- iperf3 for throughput measurement
- Localhost loopback interface (127.0.0.1)
Network Conditions Applied:
sudo tc qdisc add dev lo root handle 1: netem delay 20ms 5ms distribution normal loss 0.5%
sudo tc qdisc add dev lo parent 1:1 handle 10: tbf rate 50mbit burst 5000kbit latency 50ms
Network Parameters:
- Base delay: 20ms
- Delay variation: ±5ms (normal distribution)
- Packet loss: 0.5%
- Rate limit: 50 Mbps
- Burst size: 5000 kbit
- TBF queue latency: 50ms
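For context, these parameters imply a rough bandwidth-delay product (BDP). A minimal sketch, assuming the 20ms netem delay on `lo` applies in each direction, so the effective base RTT is near 40ms:

```python
# Back-of-the-envelope bandwidth-delay product for these test conditions.
# Assumption: loopback traffic traverses the netem delay in both directions,
# giving a base RTT of roughly 2 x 20 ms = 40 ms (before jitter).
rate_bps = 50_000_000   # tbf rate: 50 Mbit/s
rtt_s = 2 * 0.020       # ~40 ms assumed round trip

bdp_bytes = rate_bps * rtt_s / 8
print(f"BDP ~= {bdp_bytes / 1024:.0f} KiB")
```

At 50 Mbit/s and an assumed ~40ms RTT this works out to roughly 250 KB, which is useful context when reading the congestion-window columns in the result tables.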
BBR Compilation and Setup
Since the Linux kernel 6.11.0 used in testing did not have BBR available by default, BBR was compiled as a kernel module from source.
Source: The BBR source code was obtained from the Linux kernel source browser at https://codebrowser.dev/linux/linux/net/ipv4/tcp_bbr.c.html and saved as bbr.c.
Version Note: The source code did not include explicit version information (no MODULE_VERSION or version strings in the code). Based on the implementation details and the fact that it was obtained from the mainline Linux kernel source, it is assumed to be BBR v1. The code contains the classic BBR state machine (STARTUP → DRAIN → PROBE_BW → PROBE_RTT) characteristic of the original BBR design, not the BBR v2 modifications.
Makefile used for compilation:
obj-m += bbr.o
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

all:
	make -C $(KDIR) M=$(PWD) modules

clean:
	make -C $(KDIR) M=$(PWD) clean
Compilation and loading:
make
sudo insmod bbr.ko
Test Methodology
Each congestion control algorithm was tested using iperf3 with a 20-second test duration under identical network conditions:
iperf3 -c localhost -t 20 -C [algorithm_name]
The same network constraints (delay, jitter, loss, rate limit) were maintained for both tests to ensure fair comparison.
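The per-interval figures below were taken from iperf3's human-readable output. For replication, iperf3 can also emit JSON with the `-J` flag; a hedged sketch of extracting the sender-side summary (the field paths follow iperf3's JSON report format, and the embedded sample simply reuses this article's NDM-TCP totals):

```python
import json

def summarize(report: dict) -> tuple[float, int]:
    """Pull sender-side average bitrate (Mbps) and total retransmits
    from an iperf3 -J report."""
    sent = report["end"]["sum_sent"]
    return sent["bits_per_second"] / 1e6, sent["retransmits"]

# Minimal stand-in for `iperf3 -c localhost -t 20 -C ndm_tcp -J` output,
# populated with the sender-side totals reported in this article.
sample = json.loads("""
{"end": {"sum_sent": {"bits_per_second": 50500000.0, "retransmits": 26}}}
""")

mbps, retrans = summarize(sample)
print(f"{mbps:.1f} Mbps, {retrans} retransmissions")
```

Parsing JSON rather than scraping the text output makes multi-run statistical comparisons much easier to automate.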
Detailed Test Results
NDM-TCP Standard v1.0 Results
Test Command: iperf3 -c localhost -t 20 -C ndm_tcp
Connection: 127.0.0.1 port 60344 → 127.0.0.1 port 5201
| Interval (sec) | Transfer (MB) | Bitrate (Mbps) | Retransmissions | Cwnd (KB) |
|---|---|---|---|---|
| 0.00-1.00 | 8.38 | 70.2 | 4 | 959 |
| 1.00-2.00 | 5.50 | 46.1 | 1 | 1,120 |
| 2.00-3.00 | 6.62 | 55.6 | 5 | 639 |
| 3.00-4.00 | 5.38 | 45.1 | 2 | 384 |
| 4.00-5.00 | 6.00 | 50.3 | 0 | 831 |
| 5.00-6.00 | 6.50 | 54.5 | 0 | 1,060 |
| 6.00-7.00 | 4.88 | 40.9 | 2 | 1,060 |
| 7.00-8.00 | 6.38 | 53.5 | 0 | 767 |
| 8.00-9.00 | 6.75 | 56.6 | 3 | 576 |
| 9.00-10.00 | 6.00 | 50.3 | 0 | 959 |
| 10.00-11.00 | 4.62 | 38.8 | 0 | 1,190 |
| 11.00-12.00 | 7.00 | 58.7 | 2 | 192 |
| 12.00-13.00 | 5.75 | 48.2 | 0 | 703 |
| 13.00-14.00 | 6.12 | 51.4 | 0 | 1,023 |
| 14.00-15.00 | 5.75 | 48.2 | 0 | 1,190 |
| 15.00-16.00 | 6.25 | 52.4 | 3 | 512 |
| 16.00-17.00 | 6.12 | 51.4 | 1 | 895 |
| 17.00-18.00 | 4.50 | 37.8 | 0 | 1,120 |
| 18.00-19.00 | 6.00 | 50.3 | 0 | 1,310 |
| 19.00-20.00 | 6.00 | 50.3 | 3 | 639 |
Summary:
- Total Transfer (Sender): 120 MB
- Average Bitrate (Sender): 50.5 Mbps
- Total Retransmissions: 26
- Total Transfer (Receiver): 118 MB
- Average Bitrate (Receiver): 49.4 Mbps
- Test Duration: 20.08 seconds
BBR (Assumed v1) Results
Test Command: iperf3 -c localhost -t 20 -C bbr
Connection: 127.0.0.1 port 33254 → 127.0.0.1 port 5201
| Interval (sec) | Transfer (MB) | Bitrate (Mbps) | Retransmissions | Cwnd (MB) |
|---|---|---|---|---|
| 0.00-1.00 | 7.12 | 59.7 | 25 | 2.87 |
| 1.00-2.00 | 5.50 | 46.1 | 92 | 1.12 |
| 2.00-3.00 | 4.25 | 35.6 | 53 | 1.37 |
| 3.00-4.00 | 5.75 | 48.3 | 2 | 1.62 |
| 4.00-5.00 | 6.12 | 51.4 | 2 | 1.62 |
| 5.00-6.00 | 6.62 | 55.5 | 4 | 1.37 |
| 6.00-7.00 | 6.00 | 50.3 | 0 | 1.25 |
| 7.00-8.00 | 4.50 | 37.8 | 0 | 1.37 |
| 8.00-9.00 | 7.00 | 58.7 | 0 | 1.37 |
| 9.00-10.00 | 5.62 | 47.2 | 0 | 1.37 |
| 10.00-11.00 | 5.75 | 48.2 | 2 | 1.37 |
| 11.00-12.00 | 5.50 | 46.2 | 0 | 1.12 |
| 12.00-13.00 | 7.00 | 58.7 | 0 | 1.25 |
| 13.00-14.00 | 5.50 | 46.1 | 0 | 1.50 |
| 14.00-15.00 | 5.62 | 47.2 | 0 | 1.50 |
| 15.00-16.00 | 6.88 | 57.7 | 0 | 1.25 |
| 16.00-17.00 | 5.38 | 45.1 | 0 | 1.25 |
| 17.00-18.00 | 5.62 | 47.2 | 0 | 1.37 |
| 18.00-19.00 | 5.50 | 46.1 | 0 | 1.25 |
| 19.00-20.00 | 7.00 | 58.6 | 0 | 1.37 |
Summary:
- Total Transfer (Sender): 118 MB
- Average Bitrate (Sender): 49.6 Mbps
- Total Retransmissions: 180
- Total Transfer (Receiver): 116 MB
- Average Bitrate (Receiver): 48.2 Mbps
- Test Duration: 20.11 seconds
Comparative Analysis
Overall Performance Comparison
| Metric | NDM-TCP v1.0 | BBR (Assumed v1) | Winner |
|---|---|---|---|
| Total Transfer (Sender) | 120 MB | 118 MB | NDM-TCP (+2 MB) |
| Average Bitrate (Sender) | 50.5 Mbps | 49.6 Mbps | NDM-TCP (+0.9 Mbps) |
| Total Retransmissions | 26 | 180 | NDM-TCP (-154 retrans) |
| Total Transfer (Receiver) | 118 MB | 116 MB | NDM-TCP (+2 MB) |
| Average Bitrate (Receiver) | 49.4 Mbps | 48.2 Mbps | NDM-TCP (+1.2 Mbps) |
Key Findings
1. Retransmission Rate (Most Critical Difference):
- NDM-TCP: 26 retransmissions over 20 seconds
- BBR: 180 retransmissions over 20 seconds
- NDM-TCP achieved 85.6% fewer retransmissions than BBR
This is the most significant difference: BBR's retransmission count is nearly seven times that of NDM-TCP, indicating that BBR probed far more aggressively under these conditions, either inducing additional loss at the rate limiter or triggering spurious retransmissions.
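The quoted figures follow directly from the two retransmission totals; a quick arithmetic check:

```python
ndm, bbr = 26, 180  # total retransmissions over the 20 s runs

reduction_pct = (bbr - ndm) / bbr * 100  # how many fewer NDM-TCP had
ratio = bbr / ndm                        # how many times higher BBR was

print(f"{reduction_pct:.1f}% fewer, {ratio:.1f}x higher")
```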
2. Throughput Stability:
NDM-TCP Standard Deviation Analysis:
- Bitrate range: 37.8 - 70.2 Mbps
- More consistent performance with fewer extreme variations
- Conservative approach maintains stability
BBR Standard Deviation Analysis:
- Bitrate range: 35.6 - 59.7 Mbps
- Initial burst causes massive retransmissions (117 in first 2 seconds)
- More aggressive early probing leads to network instability
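The analysis above quotes only min-max ranges; a short sketch that computes the mean and standard deviation from the per-interval sender bitrates (values copied from the two result tables) makes the variability comparison reproducible:

```python
from statistics import mean, pstdev

# Per-interval sender bitrates (Mbps) copied from the tables above.
ndm = [70.2, 46.1, 55.6, 45.1, 50.3, 54.5, 40.9, 53.5, 56.6, 50.3,
       38.8, 58.7, 48.2, 51.4, 48.2, 52.4, 51.4, 37.8, 50.3, 50.3]
bbr = [59.7, 46.1, 35.6, 48.3, 51.4, 55.5, 50.3, 37.8, 58.7, 47.2,
       48.2, 46.2, 58.7, 46.1, 47.2, 57.7, 45.1, 47.2, 46.1, 58.6]

for name, series in (("NDM-TCP", ndm), ("BBR", bbr)):
    print(f"{name}: mean {mean(series):.1f} Mbps, "
          f"range {min(series):.1f}-{max(series):.1f}, "
          f"stddev {pstdev(series):.1f}")
```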
3. Congestion Window Behavior:
NDM-TCP:
- Cwnd range: 192 KB - 1,310 KB
- More conservative window sizing
- Adaptive adjustment based on entropy detection
- Fewer dramatic cwnd reductions
BBR:
- Cwnd range: 1.12 MB - 2.87 MB
- Larger initial window (2.87 MB) in first interval
- Aggressive window sizing leads to excessive retransmissions
- Maintains larger windows throughout but at cost of stability
4. Average Throughput:
- NDM-TCP: 50.5 Mbps (sender) / 49.4 Mbps (receiver)
- BBR: 49.6 Mbps (sender) / 48.2 Mbps (receiver)
- NDM-TCP achieved slightly higher throughput despite being more conservative
Comparison with Previous Results (Reno and Cubic)
As documented in the previous article, NDM-TCP was also tested against TCP Reno and TCP Cubic.
Important Note: BBR performed worse under these conditions than both Reno and Cubic did: the Reno and Cubic results reported in the previous article were better than the BBR results shown here.
This suggests that:
- BBR's aggressive probing strategy may not be well-suited to the specific network conditions tested (localhost with artificial delay/loss)
- BBR's model-based approach may struggle more in high-loss environments compared to traditional loss-based algorithms
- The simulated environment may not represent the type of networks where BBR excels (e.g., bufferbloat-heavy real-world networks)
Analysis: Why Did BBR Underperform?
BBR's Design Philosophy vs Test Conditions
BBR is optimized for:
- Real-world internet connections with bufferbloat
- Networks where queue delay is the primary issue
- Scenarios where bandwidth and RTT can be accurately measured
Test conditions presented:
- Artificial packet loss (0.5%) on localhost
- Token bucket rate limiting
- Simulated delay variation
- Virtual machine environment
Possible reasons for poor BBR performance:
Aggressive Initial Probing: BBR's startup phase probed with a 2.87 MB window, immediately triggering 25 retransmissions in the first second, followed by 92 more in the second interval. This aggressive probing doesn't work well with artificial packet loss.
Model Mismatch: BBR models bottleneck bandwidth and RTT, but in a localhost environment with artificial constraints, these measurements may not accurately reflect the "real" network state.
Loss-Based vs Model-Based: In networks with random packet loss (as simulated here), loss-based algorithms (like NDM-TCP, Reno, Cubic) that treat loss as a signal may perform better than model-based algorithms that try to probe bandwidth.
Localhost Limitations: Testing on localhost may not provide the same network dynamics as real network interfaces, potentially disadvantaging BBR's bandwidth probing mechanism.
NDM-TCP's Advantages in This Environment
1. Entropy-Based Noise Detection: NDM-TCP's entropy calculation helps distinguish between:
   - Random packet loss (high entropy = noise)
   - Congestion-based loss (low entropy = real congestion)
2. Conservative Growth: NDM-TCP prioritizes stability over aggressive throughput maximization, resulting in:
   - Fewer retransmissions (26 vs 180)
   - More stable congestion window management
   - Better adaptation to loss conditions
3. Pattern Recognition: The neural network can learn patterns in RTT variations and adapt accordingly, even in artificial test conditions.
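As an illustration only (not the module's actual kernel code, which lives in the repository), the entropy idea can be sketched with Shannon entropy over binned inter-loss gaps; all function names and sample values here are hypothetical:

```python
import math
from collections import Counter

def shannon_entropy(samples, bins=8):
    """Shannon entropy (bits) of a sequence after coarse binning.
    Illustrative sketch only -- not the NDM-TCP kernel implementation."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins or 1.0          # avoid zero-width bins
    counts = Counter(min(int((s - lo) / width), bins - 1) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical inter-loss gaps (in packets): random loss scatters widely
# (high entropy), while congestion loss clusters tightly (low entropy).
random_loss_gaps = [3, 41, 7, 88, 15, 190, 2, 64, 29, 120]
congestion_gaps  = [4, 5, 4, 6, 5, 4, 5, 6, 4, 5]

print(shannon_entropy(random_loss_gaps))   # higher: treat as noise
print(shannon_entropy(congestion_gaps))    # lower: treat as congestion
```

Under this kind of heuristic, high-entropy loss would be ignored (hold the window) while low-entropy loss would trigger backoff, which is one plausible reading of the behavior described above.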
Important Disclaimers and Limitations
This is NOT conclusive evidence that NDM-TCP is "better" than BBR. Several critical points must be understood:
1. Test Environment Limitations:
   - Localhost testing doesn't represent real network behavior
   - Virtual machine adds additional variability
   - Artificial delay/loss simulation may not reflect real-world conditions
   - BBR is designed for real internet connections, not localhost loops
2. BBR's Real-World Strengths:
   - BBR has been extensively validated in production at Google
   - BBR excels in bufferbloat scenarios not tested here
   - BBR's benefits are most apparent on real-world WAN connections
   - The test conditions may specifically disadvantage BBR's design
3. NDM-TCP Status:
   - NOT standardized or production-ready
   - Experimental implementation for research purposes
   - Simplified from the real NDM-TCP architecture
   - Has not undergone the extensive validation BBR has received
4. Statistical Significance:
   - This is a single test run, not multiple trials
   - No statistical analysis of variance
   - Results may vary significantly with different test runs
   - Network conditions were artificially controlled
5. Comparison Context:
   - BBR performed worse than Reno and Cubic in these tests
   - This suggests the test environment may not suit BBR's design
   - Real-world performance on actual networks would likely differ significantly
Conclusion
In this specific simulated test environment (localhost with artificial delay, jitter, and packet loss), NDM-TCP standard v1.0 achieved:
- 85.6% fewer retransmissions than BBR (26 vs 180)
- Slightly higher throughput (50.5 vs 49.6 Mbps sender)
- More stable performance with conservative window management
Based on these test results, in this type of situation (simulated network with packet loss and delay), NDM-TCP demonstrates a more stable connection compared to BBR, as evidenced by the dramatically lower retransmission count and more consistent throughput delivery.
Important: This was an untrained test. The NDM-TCP kernel module was freshly loaded without any prior training or warm-up period. NDM-TCP's zero-training behavior means it adapts from the first packet, but its adaptive capabilities may improve with longer runtime as the neural network learns network patterns. This test represents the algorithm's behavior in a completely cold-start scenario.
CRITICAL DISCLAIMER - DO NOT MISUNDERSTAND THESE RESULTS:
I am explicitly stating that this may NOT be the best situation for BBR to handle. This appears to be a design philosophy problem rather than proof of superiority. BBR is designed for different network conditions (real-world WAN connections with bufferbloat), and these artificial test conditions may fundamentally conflict with how BBR operates.
I am NOT saying NDM-TCP has "beaten" BBR or is a "winner." I am NOT claiming that NDM-TCP is better than Cubic, Reno, or BBR based on these results. The previous article showed NDM-TCP performing well against Reno and Cubic, and this article shows good results against BBR—but all of these tests were conducted in localhost, artificial, simulation-based environments.
The fundamental issue: Even though NDM-TCP shows good results in simulation, we don't know what happens on real hardware in real network conditions. Simulation success does NOT guarantee real-world success. Many algorithms perform well in controlled simulations but fail or behave differently when deployed on actual network infrastructure with real traffic patterns, real hardware constraints, and real-world variability.
This is where community involvement becomes absolutely critical. If the community can test and prove that NDM-TCP's performance is not just a simulation artifact—that it actually works well on real hardware, real networks, and real production environments—then and only then can we say it's genuinely good for NDM-TCP. Until that validation happens, these results remain interesting research data from a limited, artificial test environment, nothing more.
The test environment (localhost with artificial constraints) does not represent real-world network conditions where BBR has proven its value. BBR's poor performance compared to even Reno and Cubic suggests the test conditions may be fundamentally unsuited to BBR's design philosophy, not that NDM-TCP is superior.
Critical Need for Community Collaboration:
These results are from a single, limited test scenario. To properly evaluate NDM-TCP's real-world viability and compare it fairly against established algorithms like BBR, extensive community collaboration is essential:
- Real Hardware Testing Needed: Tests on actual network hardware (not virtual machines or localhost) with real NICs, switches, and routers
- Variety of Test Scenarios Required:
  - Different network topologies (LAN, WAN, data center, wireless)
  - Various bandwidth conditions (1 Gbps, 10 Gbps, 40 Gbps, 100 Gbps+)
  - Different loss patterns (random, burst, correlated)
  - Multiple RTT ranges (sub-ms to hundreds of ms)
  - Real internet connections with actual bufferbloat
  - Production workload patterns
  - Scenarios specifically suited to BBR's design (to fairly test all algorithms)
- Long-term Testing: Extended runs to observe NDM-TCP's learning behavior over time
- Statistical Validation: Multiple test runs with proper statistical analysis
- Cross-validation: Testing by independent researchers with different setups
For the networking community: If you have access to real network infrastructure, proper testing facilities, production environments, or research labs, your collaboration in conducting more rigorous comparisons would be invaluable. The community can help determine whether NDM-TCP's promising simulation results translate to real-world benefits, or if they're merely artifacts of the artificial test environment. Only through real hardware testing across diverse scenarios can we know if NDM-TCP is genuinely viable outside of simulations.
Recommendations:
- NDM-TCP shows potential in simulated loss-heavy environments with stable connection characteristics
- These simulation results prove nothing about real-world performance
- Extensive real hardware testing with variety of scenarios is critically needed
- Community validation essential before any claims of effectiveness can be made
- BBR should be tested in its intended use case (real WAN connections with bufferbloat)
- Multiple test runs with statistical analysis needed for meaningful conclusions
- Longer-term testing required to observe NDM-TCP's adaptive learning over time
- Success in simulation ≠ success in reality; real hardware testing is the only way to know
All code, test configurations, and raw data are available in the GitHub repository for community review and replication.