GitHub Repository: https://github.com/hejhdiss/lkm-ndm-tcp
Previous Article: NDM-TCP: Correction, Clarification and Real Performance Results
Introduction: Testing Against BBR (Assumed v1)
After the initial testing of NDM-TCP against TCP Reno and Cubic (documented in the previous article), a comparison against BBR (Bottleneck Bandwidth and RTT) was conducted. This article presents the test results comparing NDM-TCP standard version (v1.0) against BBR under identical network conditions.
BBR Version Note: The exact BBR version is unknown as the source code does not contain explicit version information. Based on the implementation characteristics (classic STARTUP → DRAIN → PROBE_BW → PROBE_RTT state machine), it is assumed to be BBR v1, not the newer BBR v2/v3 variants.
Important Context: This is NOT production-grade or standardized TCP congestion control testing. These are experimental implementations tested in a simulated environment for educational and research purposes.
Test Environment
Hardware/Virtualization:
- Xubuntu 24.04
- VMware 17 (Virtual Machine)
- Linux Kernel 6.11.0
Testing Tools:
- tc (traffic control) for network emulation
- iperf3 for throughput measurement
- Localhost loopback interface (127.0.0.1)
Network Conditions Applied:
sudo tc qdisc add dev lo root handle 1: netem delay 20ms 5ms distribution normal loss 0.5%
sudo tc qdisc add dev lo parent 1:1 handle 10: tbf rate 50mbit burst 5000kbit latency 50ms
Network Parameters:
- Base delay: 20ms
- Delay variation: ±5ms (normal distribution)
- Packet loss: 0.5%
- Rate limit: 50 Mbps
- Burst size: 5000 kbit
- TBF queue latency: 50ms
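For context, these parameters imply a rough bandwidth-delay product (BDP). A minimal sketch, assuming the 20ms netem delay on `lo` applies in each direction, so the effective base RTT is near 40ms:

```python
# Back-of-the-envelope bandwidth-delay product for these test conditions.
# Assumption: loopback traffic traverses the netem delay in both directions,
# giving a base RTT of roughly 2 x 20 ms = 40 ms (before jitter).
rate_bps = 50_000_000   # tbf rate: 50 Mbit/s
rtt_s = 2 * 0.020       # ~40 ms assumed round trip

bdp_bytes = rate_bps * rtt_s / 8
print(f"BDP ~= {bdp_bytes / 1024:.0f} KiB")
```

At 50 Mbit/s and an assumed ~40ms RTT this works out to roughly 250 KB, which is useful context when reading the congestion-window columns in the result tables.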
BBR Compilation and Setup
Since the Linux kernel 6.11.0 used in testing did not have BBR available by default, BBR was compiled as a kernel module from source.
Source: The BBR source code was obtained from the Linux kernel source browser at https://codebrowser.dev/linux/linux/net/ipv4/tcp_bbr.c.html and saved as bbr.c.
Version Note: The source code did not include explicit version information (no MODULE_VERSION or version strings in the code). Based on the implementation details and the fact that it was obtained from the mainline Linux kernel source, it is assumed to be BBR v1. The code contains the classic BBR state machine (STARTUP → DRAIN → PROBE_BW → PROBE_RTT) characteristic of the original BBR design, not the BBR v2 modifications.
Makefile used for compilation:
obj-m += bbr.o
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

all:
	make -C $(KDIR) M=$(PWD) modules

clean:
	make -C $(KDIR) M=$(PWD) clean
Compilation and loading:
make
sudo insmod bbr.ko
Test Methodology
Each congestion control algorithm was tested using iperf3 with a 20-second test duration under identical network conditions:
iperf3 -c localhost -t 20 -C [algorithm_name]
The same network constraints (delay, jitter, loss, rate limit) were maintained for both tests to ensure fair comparison.
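The per-interval figures below were taken from iperf3's human-readable output. For replication, iperf3 can also emit JSON with the `-J` flag; a hedged sketch of extracting the sender-side summary (the field paths follow iperf3's JSON report format, and the embedded sample simply reuses this article's NDM-TCP totals):

```python
import json

def summarize(report: dict) -> tuple[float, int]:
    """Pull sender-side average bitrate (Mbps) and total retransmits
    from an iperf3 -J report."""
    sent = report["end"]["sum_sent"]
    return sent["bits_per_second"] / 1e6, sent["retransmits"]

# Minimal stand-in for `iperf3 -c localhost -t 20 -C ndm_tcp -J` output,
# populated with the sender-side totals reported in this article.
sample = json.loads("""
{"end": {"sum_sent": {"bits_per_second": 50500000.0, "retransmits": 26}}}
""")

mbps, retrans = summarize(sample)
print(f"{mbps:.1f} Mbps, {retrans} retransmissions")
```

Parsing JSON rather than scraping the text output makes multi-run statistical comparisons much easier to automate.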
Detailed Test Results
NDM-TCP Standard v1.0 Results
Test Command: iperf3 -c localhost -t 20 -C ndm_tcp
Connection: 127.0.0.1 port 60344 → 127.0.0.1 port 5201
| Interval (sec) | Transfer (MB) | Bitrate (Mbps) | Retransmissions | Cwnd (KB) |
|---|---|---|---|---|
| 0.00-1.00 | 8.38 | 70.2 | 4 | 959 |
| 1.00-2.00 | 5.50 | 46.1 | 1 | 1,120 |
| 2.00-3.00 | 6.62 | 55.6 | 5 | 639 |
| 3.00-4.00 | 5.38 | 45.1 | 2 | 384 |
| 4.00-5.00 | 6.00 | 50.3 | 0 | 831 |
| 5.00-6.00 | 6.50 | 54.5 | 0 | 1,060 |
| 6.00-7.00 | 4.88 | 40.9 | 2 | 1,060 |
| 7.00-8.00 | 6.38 | 53.5 | 0 | 767 |
| 8.00-9.00 | 6.75 | 56.6 | 3 | 576 |
| 9.00-10.00 | 6.00 | 50.3 | 0 | 959 |
| 10.00-11.00 | 4.62 | 38.8 | 0 | 1,190 |
| 11.00-12.00 | 7.00 | 58.7 | 2 | 192 |
| 12.00-13.00 | 5.75 | 48.2 | 0 | 703 |
| 13.00-14.00 | 6.12 | 51.4 | 0 | 1,023 |
| 14.00-15.00 | 5.75 | 48.2 | 0 | 1,190 |
| 15.00-16.00 | 6.25 | 52.4 | 3 | 512 |
| 16.00-17.00 | 6.12 | 51.4 | 1 | 895 |
| 17.00-18.00 | 4.50 | 37.8 | 0 | 1,120 |
| 18.00-19.00 | 6.00 | 50.3 | 0 | 1,310 |
| 19.00-20.00 | 6.00 | 50.3 | 3 | 639 |
Summary:
- Total Transfer (Sender): 120 MB
- Average Bitrate (Sender): 50.5 Mbps
- Total Retransmissions: 26
- Total Transfer (Receiver): 118 MB
- Average Bitrate (Receiver): 49.4 Mbps
- Test Duration: 20.08 seconds
BBR (Assumed v1) Results
Test Command: iperf3 -c localhost -t 20 -C bbr
Connection: 127.0.0.1 port 33254 → 127.0.0.1 port 5201
| Interval (sec) | Transfer (MB) | Bitrate (Mbps) | Retransmissions | Cwnd (MB) |
|---|---|---|---|---|
| 0.00-1.00 | 7.12 | 59.7 | 25 | 2.87 |
| 1.00-2.00 | 5.50 | 46.1 | 92 | 1.12 |
| 2.00-3.00 | 4.25 | 35.6 | 53 | 1.37 |
| 3.00-4.00 | 5.75 | 48.3 | 2 | 1.62 |
| 4.00-5.00 | 6.12 | 51.4 | 2 | 1.62 |
| 5.00-6.00 | 6.62 | 55.5 | 4 | 1.37 |
| 6.00-7.00 | 6.00 | 50.3 | 0 | 1.25 |
| 7.00-8.00 | 4.50 | 37.8 | 0 | 1.37 |
| 8.00-9.00 | 7.00 | 58.7 | 0 | 1.37 |
| 9.00-10.00 | 5.62 | 47.2 | 0 | 1.37 |
| 10.00-11.00 | 5.75 | 48.2 | 2 | 1.37 |
| 11.00-12.00 | 5.50 | 46.2 | 0 | 1.12 |
| 12.00-13.00 | 7.00 | 58.7 | 0 | 1.25 |
| 13.00-14.00 | 5.50 | 46.1 | 0 | 1.50 |
| 14.00-15.00 | 5.62 | 47.2 | 0 | 1.50 |
| 15.00-16.00 | 6.88 | 57.7 | 0 | 1.25 |
| 16.00-17.00 | 5.38 | 45.1 | 0 | 1.25 |
| 17.00-18.00 | 5.62 | 47.2 | 0 | 1.37 |
| 18.00-19.00 | 5.50 | 46.1 | 0 | 1.25 |
| 19.00-20.00 | 7.00 | 58.6 | 0 | 1.37 |
Summary:
- Total Transfer (Sender): 118 MB
- Average Bitrate (Sender): 49.6 Mbps
- Total Retransmissions: 180
- Total Transfer (Receiver): 116 MB
- Average Bitrate (Receiver): 48.2 Mbps
- Test Duration: 20.11 seconds
Comparative Analysis
Overall Performance Comparison
| Metric | NDM-TCP v1.0 | BBR (Assumed v1) | Winner |
|---|---|---|---|
| Total Transfer (Sender) | 120 MB | 118 MB | NDM-TCP (+2 MB) |
| Average Bitrate (Sender) | 50.5 Mbps | 49.6 Mbps | NDM-TCP (+0.9 Mbps) |
| Total Retransmissions | 26 | 180 | NDM-TCP (-154 retrans) |
| Total Transfer (Receiver) | 118 MB | 116 MB | NDM-TCP (+2 MB) |
| Average Bitrate (Receiver) | 49.4 Mbps | 48.2 Mbps | NDM-TCP (+1.2 Mbps) |
Key Findings
1. Retransmission Rate (Most Critical Difference):
- NDM-TCP: 26 retransmissions over 20 seconds
- BBR: 180 retransmissions over 20 seconds
- NDM-TCP achieved 85.6% fewer retransmissions than BBR
This is the most significant difference: BBR's retransmission count is nearly seven times that of NDM-TCP, indicating that BBR probed far more aggressively under these conditions, either inducing additional loss at the rate limiter or triggering spurious retransmissions.
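The quoted figures follow directly from the two retransmission totals; a quick arithmetic check:

```python
ndm, bbr = 26, 180  # total retransmissions over the 20 s runs

reduction_pct = (bbr - ndm) / bbr * 100  # how many fewer NDM-TCP had
ratio = bbr / ndm                        # how many times higher BBR was

print(f"{reduction_pct:.1f}% fewer, {ratio:.1f}x higher")
```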
2. Throughput Stability:
NDM-TCP Standard Deviation Analysis:
- Bitrate range: 37.8 - 70.2 Mbps
- More consistent performance with fewer extreme variations
- Conservative approach maintains stability
BBR Standard Deviation Analysis:
- Bitrate range: 35.6 - 59.7 Mbps
- Initial burst causes massive retransmissions (117 in first 2 seconds)
- More aggressive early probing leads to network instability
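The analysis above quotes only min-max ranges; a short sketch that computes the mean and standard deviation from the per-interval sender bitrates (values copied from the two result tables) makes the variability comparison reproducible:

```python
from statistics import mean, pstdev

# Per-interval sender bitrates (Mbps) copied from the tables above.
ndm = [70.2, 46.1, 55.6, 45.1, 50.3, 54.5, 40.9, 53.5, 56.6, 50.3,
       38.8, 58.7, 48.2, 51.4, 48.2, 52.4, 51.4, 37.8, 50.3, 50.3]
bbr = [59.7, 46.1, 35.6, 48.3, 51.4, 55.5, 50.3, 37.8, 58.7, 47.2,
       48.2, 46.2, 58.7, 46.1, 47.2, 57.7, 45.1, 47.2, 46.1, 58.6]

for name, series in (("NDM-TCP", ndm), ("BBR", bbr)):
    print(f"{name}: mean {mean(series):.1f} Mbps, "
          f"range {min(series):.1f}-{max(series):.1f}, "
          f"stddev {pstdev(series):.1f}")
```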
3. Congestion Window Behavior:
NDM-TCP:
- Cwnd range: 192 KB - 1,310 KB
- More conservative window sizing
- Adaptive adjustment based on entropy detection
- Fewer dramatic cwnd reductions
BBR:
- Cwnd range: 1.12 MB - 2.87 MB
- Larger initial window (2.87 MB) in first interval
- Aggressive window sizing leads to excessive retransmissions
- Maintains larger windows throughout but at cost of stability
4. Average Throughput:
- NDM-TCP: 50.5 Mbps (sender) / 49.4 Mbps (receiver)
- BBR: 49.6 Mbps (sender) / 48.2 Mbps (receiver)
- NDM-TCP achieved slightly higher throughput despite being more conservative
Comparison with Previous Results (Reno and Cubic)
As documented in the previous article, NDM-TCP was also tested against TCP Reno and TCP Cubic.
Important Note: BBR performed worse under these conditions than both Reno and Cubic did: the Reno and Cubic results reported in the previous article were better than the BBR results shown here.
This suggests that:
- BBR's aggressive probing strategy may not be well-suited to the specific network conditions tested (localhost with artificial delay/loss)
- BBR's model-based approach may struggle more in high-loss environments compared to traditional loss-based algorithms
- The simulated environment may not represent the type of networks where BBR excels (e.g., bufferbloat-heavy real-world networks)
Analysis: Why Did BBR Underperform?
BBR's Design Philosophy vs Test Conditions
BBR is optimized for:
- Real-world internet connections with bufferbloat
- Networks where queue delay is the primary issue
- Scenarios where bandwidth and RTT can be accurately measured
Test conditions presented:
- Artificial packet loss (0.5%) on localhost
- Token bucket rate limiting
- Simulated delay variation
- Virtual machine environment
Possible reasons for poor BBR performance:
Aggressive Initial Probing: BBR's startup phase probed with a 2.87 MB window, immediately triggering 25 retransmissions in the first second, followed by 92 more in the second interval. This aggressive probing doesn't work well with artificial packet loss.
Model Mismatch: BBR models bottleneck bandwidth and RTT, but in a localhost environment with artificial constraints, these measurements may not accurately reflect the "real" network state.
Loss-Based vs Model-Based: In networks with random packet loss (as simulated here), loss-based algorithms (like NDM-TCP, Reno, Cubic) that treat loss as a signal may perform better than model-based algorithms that try to probe bandwidth.
Localhost Limitations: Testing on localhost may not provide the same network dynamics as real network interfaces, potentially disadvantaging BBR's bandwidth probing mechanism.
NDM-TCP's Advantages in This Environment
1. Entropy-Based Noise Detection: NDM-TCP's entropy calculation helps distinguish between:
   - Random packet loss (high entropy = noise)
   - Congestion-based loss (low entropy = real congestion)
2. Conservative Growth: NDM-TCP prioritizes stability over aggressive throughput maximization, resulting in:
   - Fewer retransmissions (26 vs 180)
   - More stable congestion window management
   - Better adaptation to loss conditions
3. Pattern Recognition: The neural network can learn patterns in RTT variations and adapt accordingly, even in artificial test conditions.
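As an illustration only (not the module's actual kernel code, which lives in the repository), the entropy idea can be sketched with Shannon entropy over binned inter-loss gaps; all function names and sample values here are hypothetical:

```python
import math
from collections import Counter

def shannon_entropy(samples, bins=8):
    """Shannon entropy (bits) of a sequence after coarse binning.
    Illustrative sketch only -- not the NDM-TCP kernel implementation."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins or 1.0          # avoid zero-width bins
    counts = Counter(min(int((s - lo) / width), bins - 1) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical inter-loss gaps (in packets): random loss scatters widely
# (high entropy), while congestion loss clusters tightly (low entropy).
random_loss_gaps = [3, 41, 7, 88, 15, 190, 2, 64, 29, 120]
congestion_gaps  = [4, 5, 4, 6, 5, 4, 5, 6, 4, 5]

print(shannon_entropy(random_loss_gaps))   # higher: treat as noise
print(shannon_entropy(congestion_gaps))    # lower: treat as congestion
```

Under this kind of heuristic, high-entropy loss would be ignored (hold the window) while low-entropy loss would trigger backoff, which is one plausible reading of the behavior described above.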
Important Disclaimers and Limitations
This is NOT conclusive evidence that NDM-TCP is "better" than BBR. Several critical points must be understood:
1. Test Environment Limitations:
   - Localhost testing doesn't represent real network behavior
   - Virtual machine adds additional variability
   - Artificial delay/loss simulation may not reflect real-world conditions
   - BBR is designed for real internet connections, not localhost loops
2. BBR's Real-World Strengths:
   - BBR has been extensively validated in production at Google
   - BBR excels in bufferbloat scenarios not tested here
   - BBR's benefits are most apparent on real-world WAN connections
   - The test conditions may specifically disadvantage BBR's design
3. NDM-TCP Status:
   - NOT standardized or production-ready
   - Experimental implementation for research purposes
   - Simplified from the real NDM-TCP architecture
   - Has not undergone the extensive validation BBR has received
4. Statistical Significance:
   - This is a single test run, not multiple trials
   - No statistical analysis of variance
   - Results may vary significantly with different test runs
   - Network conditions were artificially controlled
5. Comparison Context:
   - BBR performed worse than Reno and Cubic in these tests
   - This suggests the test environment may not suit BBR's design
   - Real-world performance on actual networks would likely differ significantly
Conclusion
In this specific simulated test environment (localhost with artificial delay, jitter, and packet loss), NDM-TCP standard v1.0 achieved:
- 85.6% fewer retransmissions than BBR (26 vs 180)
- Slightly higher throughput (50.5 vs 49.6 Mbps sender)
- More stable performance with conservative window management
Based on these test results, in this type of situation (simulated network with packet loss and delay), NDM-TCP demonstrates a more stable connection compared to BBR, as evidenced by the dramatically lower retransmission count and more consistent throughput delivery.
Important: This was an untrained test. The NDM-TCP kernel module was freshly loaded without any prior training or warm-up period. NDM-TCP's zero-training behavior means it adapts from the first packet, but its adaptive capabilities may improve with longer runtime as the neural network learns network patterns. This test represents the algorithm's behavior in a completely cold-start scenario.
CRITICAL DISCLAIMER - DO NOT MISUNDERSTAND THESE RESULTS:
I am explicitly stating that this may NOT be the best situation for BBR to handle. This appears to be a design philosophy problem rather than proof of superiority. BBR is designed for different network conditions (real-world WAN connections with bufferbloat), and these artificial test conditions may fundamentally conflict with how BBR operates.
I am NOT saying NDM-TCP has "beaten" BBR or is a "winner." I am NOT claiming that NDM-TCP is better than Cubic, Reno, or BBR based on these results. The previous article showed NDM-TCP performing well against Reno and Cubic, and this article shows good results against BBR—but all of these tests were conducted in localhost, artificial, simulation-based environments.
The fundamental issue: Even though NDM-TCP shows good results in simulation, we don't know what happens on real hardware in real network conditions. Simulation success does NOT guarantee real-world success. Many algorithms perform well in controlled simulations but fail or behave differently when deployed on actual network infrastructure with real traffic patterns, real hardware constraints, and real-world variability.
This is where community involvement becomes absolutely critical. If the community can test and prove that NDM-TCP's performance is not just a simulation artifact—that it actually works well on real hardware, real networks, and real production environments—then and only then can we say it's genuinely good for NDM-TCP. Until that validation happens, these results remain interesting research data from a limited, artificial test environment, nothing more.
The test environment (localhost with artificial constraints) does not represent real-world network conditions where BBR has proven its value. BBR's poor performance compared to even Reno and Cubic suggests the test conditions may be fundamentally unsuited to BBR's design philosophy, not that NDM-TCP is superior.
Critical Need for Community Collaboration:
These results are from a single, limited test scenario. To properly evaluate NDM-TCP's real-world viability and compare it fairly against established algorithms like BBR, extensive community collaboration is essential:
- Real Hardware Testing Needed: Tests on actual network hardware (not virtual machines or localhost) with real NICs, switches, and routers
- Variety of Test Scenarios Required:
  - Different network topologies (LAN, WAN, data center, wireless)
  - Various bandwidth conditions (1 Gbps, 10 Gbps, 40 Gbps, 100 Gbps+)
  - Different loss patterns (random, burst, correlated)
  - Multiple RTT ranges (sub-ms to hundreds of ms)
  - Real internet connections with actual bufferbloat
  - Production workload patterns
  - Scenarios specifically suited to BBR's design (to fairly test all algorithms)
- Long-term Testing: Extended runs to observe NDM-TCP's learning behavior over time
- Statistical Validation: Multiple test runs with proper statistical analysis
- Cross-validation: Testing by independent researchers with different setups
For the networking community: If you have access to real network infrastructure, proper testing facilities, production environments, or research labs, your collaboration in conducting more rigorous comparisons would be invaluable. The community can help determine whether NDM-TCP's promising simulation results translate to real-world benefits, or if they're merely artifacts of the artificial test environment. Only through real hardware testing across diverse scenarios can we know if NDM-TCP is genuinely viable outside of simulations.
Recommendations:
- NDM-TCP shows potential in simulated loss-heavy environments with stable connection characteristics
- These simulation results prove nothing about real-world performance
- Extensive real hardware testing with variety of scenarios is critically needed
- Community validation essential before any claims of effectiveness can be made
- BBR should be tested in its intended use case (real WAN connections with bufferbloat)
- Multiple test runs with statistical analysis needed for meaningful conclusions
- Longer-term testing required to observe NDM-TCP's adaptive learning over time
- Success in simulation ≠ success in reality; real hardware testing is the only way to know
All code, test configurations, and raw data are available in the GitHub repository for community review and replication.