DEV Community

Muhammed Shafin P

NDM-TCP vs Reno vs Cubic vs BBR: Testing Summary and Recommendations

GitHub Repository: https://github.com/hejhdiss/lkm-ndm-tcp

Introduction: Multiple Localhost Tests, One Pattern

I conducted multiple 40-second localhost tests comparing NDM-TCP against Reno, Cubic, and BBR (assumed v1) under different network conditions. These were not just two scenarios; they cover a range of localhost environments with different artificial constraints applied using tc (traffic control).

Critical Context: All of these tests were conducted on localhost with artificial environment constraints. This is why the results are interesting for research, but also why community validation on real hardware is absolutely essential.

The pattern that emerged across these multiple localhost test scenarios provides insights about when each algorithm performs best. Remember, though: artificial localhost testing ≠ real-world performance.

Test 1: Constrained Network (Poor WiFi Equivalent)

Test Conditions

This test was one of several localhost scenarios with artificial constraints using tc:

  • 20 Mbps bandwidth limit
  • ±50ms jitter (very high variation)
  • 0.5% packet loss (correlated)
  • Packet duplication and reordering
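For reference, constraints like these can be approximated with tc netem. The following is a minimal sketch, not the exact commands from the original tests: the interface, base delay, loss correlation, and duplication/reordering percentages are all assumptions, since the precise tc rules weren't published.

```shell
# Apply artificial constraints to the loopback interface (assumed: lo).
# "rate 20mbit" caps bandwidth (netem rate needs kernel 3.3+);
# "delay 50ms 50ms" gives a 50ms base delay with ±50ms jitter (base assumed);
# "loss 0.5% 25%" adds 0.5% loss with 25% correlation (correlation assumed);
# duplication and reordering percentages below are placeholders.
sudo tc qdisc add dev lo root netem \
    rate 20mbit \
    delay 50ms 50ms \
    loss 0.5% 25% \
    duplicate 1% \
    reorder 25% 50%

# Verify the active qdisc, then remove it after testing.
tc qdisc show dev lo
sudo tc qdisc del dev lo root
```

Removing the qdisc afterwards matters: a leftover netem rule on lo will silently skew every subsequent local benchmark.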

Real-World Equivalent: This resembles:

  • Poor WiFi connection with interference
  • 3G mobile connection
  • Weak 4G (around two bars) that bursts to roughly 20 Mbps but lacks stability
  • Congested public WiFi
  • Networks with high noise and instability

Results Summary

Algorithm   Throughput   Retransmissions   Key Behavior
NDM-TCP     19.6 Mbps    63                Stable, adaptive
Cubic       18.7 Mbps    144               Burst retransmissions
Reno        18.5 Mbps    145               Similar to Cubic
BBR (v1)    17.4 Mbps    243               Struggled badly

Key Finding: NDM-TCP Excels in Poor Conditions

Even in what should be BBR's territory (lossy, variable networks), BBR showed poor results.

  • BBR had 243 retransmissions vs NDM-TCP's 63 (nearly 4x worse)
  • BBR achieved the lowest throughput (17.4 Mbps)
  • BBR struggled throughout the entire 40 seconds
  • Multiple intervals with 0.00 Mbps throughput

Why BBR Struggled:
BBR is designed for real WAN connections with bufferbloat, not artificial localhost constraints. The simulated high jitter and correlated packet loss confused BBR's bandwidth and RTT models, causing aggressive probing that led to excessive retransmissions.

Why NDM-TCP Succeeded:

  • Entropy-based noise detection distinguished random loss from real congestion
  • Adaptive learning handled chaotic conditions
  • Conservative approach avoided triggering unnecessary retransmissions
  • Maintained stable throughput throughout the test

Clear Conclusion: In these poor, noisy network conditions, NDM-TCP showed dramatically cleaner and more stable performance than all other algorithms, including BBR.

Test 2: Clean Network (Pure Localhost)

Test Conditions

This test had NO artificial constraints:

  • No bandwidth limits
  • No delays or jitter
  • No packet loss
  • No packet manipulation
  • Pure localhost loopback performance
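A 40-second run of this kind can be reproduced with iperf3, which on Linux can select the congestion control per connection with -C. The port and the "ndm_tcp" algorithm name below are assumptions (the actual name registered by the lkm-ndm-tcp module may differ).

```shell
# Terminal 1: start an iperf3 server on localhost.
iperf3 -s -p 5201 &

# Terminal 2: run a 40-second test per algorithm with 1-second intervals.
# -C sets the congestion control for this connection (Linux only);
# "ndm_tcp" is an assumed name for the NDM-TCP module's algorithm.
for algo in reno cubic bbr ndm_tcp; do
    echo "=== $algo ==="
    iperf3 -c 127.0.0.1 -p 5201 -t 40 -i 1 -C "$algo"
done
```

The per-second intervals (-i 1) are what make patterns like BBR's "multiple intervals with 0.00 Mbps throughput" visible; the Retr column in iperf3's output gives the retransmission counts compared in the tables.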

Real-World Equivalent: This resembles:

  • Very good wired connection
  • Clean, stable LAN
  • High-quality fiber broadband
  • Ideal conditions with no interference

Results Summary

Algorithm   Throughput   Retransmissions   Cwnd Behavior
Cubic       60.6 Gbps    20                Stable at 1.06 MB
Reno        59.6 Gbps    8                 Stable at 2.19 MB
NDM-TCP     59.4 Gbps    3                 Grows to 15.1 GB!
BBR (v1)    58.0 Gbps    37                Variable (256 KB to 1.5 MB)

Key Finding: Cubic Best for Raw Throughput, NDM-TCP Still Cleanest

In Cubic's own turf (a clean, stable network), Cubic won on throughput, but only barely.

  • Cubic: 60.6 Gbps (highest throughput)
  • NDM-TCP: 59.4 Gbps (only 1.2 Gbps less = 2% difference)
  • NDM-TCP: 3 retransmissions (still the lowest by far)

Cubic's Victory:
Cubic's cubic growth function is perfectly suited for stable, high-bandwidth environments. Its conservative approach (1.06 MB cwnd) worked efficiently on clean localhost.

NDM-TCP's Behavior:

  • Scaled cwnd to 15.1 GB (gigabytes!), which is extremely aggressive
  • Still achieved only 3 retransmissions despite massive window
  • Showed it can scale to very high performance when conditions allow
  • Still demonstrated cleaner and more stable behavior (fewer retransmissions)

BBR's Continued Struggles:
Even in clean conditions, BBR had 37 retransmissions (worst of all), showing that localhost testing (constrained or not) doesn't suit BBR's design philosophy.
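The cwnd figures above can be observed live on Linux with ss, which prints per-socket TCP internals including the congestion window (in MSS units), the congestion control in use, RTT, and retransmit counts. The destination filter below assumes a localhost iperf3 test is running.

```shell
# Sample TCP internals once per second while a test is in progress.
# ss -t shows TCP sockets, -i adds internal state (cwnd, ca algorithm,
# rtt, retrans); the filter limits output to localhost connections.
watch -n 1 "ss -ti dst 127.0.0.1"
```

Watching cwnd this way is how behavior like NDM-TCP's runaway window growth versus Cubic's steady ~1 MB plateau shows up in real time.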

Practical Recommendations: When to Use What

Based on these artificial localhost test results:

Use TCP Cubic When:

You have a very good, clean connection

  • Stable fiber broadband
  • Quality wired LAN
  • Reliable, low-jitter network
  • You prioritize raw throughput over everything else
  • Network conditions are predictable and stable

Why: Cubic performed best in clean conditions (60.6 Gbps) and its conservative approach works well when the network is reliable.

I recommend Cubic for maximum throughput on good connections.
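For anyone who wants to try these algorithms, switching the system-wide default is a one-line sysctl change. The NDM-TCP module name below is an assumption based on the repository name; check the repo's build instructions for the actual name.

```shell
# List the congestion control algorithms currently registered.
sysctl net.ipv4.tcp_available_congestion_control

# Load an algorithm's module if it isn't listed (tcp_bbr ships with
# mainline kernels; an NDM-TCP module would come from the lkm-ndm-tcp
# build, with an assumed module name).
sudo modprobe tcp_bbr

# Set the system-wide default. Individual applications can still
# override it per socket via setsockopt(TCP_CONGESTION) or iperf3 -C.
sudo sysctl -w net.ipv4.tcp_congestion_control=cubic
```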

Use NDM-TCP When:

You need stability and consistency

  • Poor WiFi with interference
  • Mobile connections (3G, weak 4G)
  • Networks with variable quality
  • High-jitter or lossy environments
  • You prioritize connection stability over peak throughput
  • Retransmissions are costly (satellite, metered connections)

Why: NDM-TCP showed:

  • Dramatically fewer retransmissions in both tests (63 vs 243 in poor conditions, 3 vs 37 in clean)
  • Stable, predictable behavior
  • Adaptive response to network chaos
  • Near-Cubic performance even in clean conditions (only 2% less throughput)

NDM-TCP is for stability—and so far, it has consistently shown stability across different test scenarios.

What We Still Don't Know: Edge Cases and Limitations

Not everything is perfect. NDM-TCP has so far been tested only in artificial localhost scenarios. We need community support to find:

Where NDM-TCP Can't Maintain Stability:

  • Real hardware with real network conditions
  • Very high-speed networks (100 Gbps+) with real NICs
  • Long-distance connections (transcontinental, satellite)
  • Rapidly changing networks (mobile handoffs, route changes)
  • Extreme congestion (thousands of competing flows)
  • Production workloads (not just iperf3)
  • Different bandwidth ranges (1 Mbps? 10 Gbps?)

Edge Cases That Need Testing:

❓ What happens during sudden network failures?
❓ How does it handle asymmetric links (different up/down speeds)?
❓ Does it work with middleboxes, NAT, firewalls?
❓ What about wireless mesh networks?
❓ How does it coexist with other algorithms in shared networks?
❓ How stable is it over long runs (hours, days, weeks of runtime)?

Critical Questions:

  • Where does NDM-TCP break down?
  • What are its failure modes?
  • What conditions confuse its entropy detection?
  • When does adaptive learning make wrong decisions?

We need to know what NDM-TCP cannot manage, not just what it can do well.

The BBR Observation: Even in Its Territory

One surprising finding: Even in conditions that should favor BBR (lossy, variable network), BBR showed poor results in these artificial tests.

The 20 Mbps constrained test was meant to simulate exactly the kind of environment BBR is designed for—bandwidth-limited connections with packet loss. Yet:

  • BBR had the worst retransmissions (243)
  • BBR had the lowest throughput (17.4 Mbps)
  • BBR never stabilized over 40 seconds

Why? Likely because:

  1. Localhost testing doesn't represent real bufferbloat
  2. Artificial tc constraints don't mimic real network behavior
  3. BBR's model-based approach was confused by simulated conditions
  4. BBR v1 (assumed) may not handle these specific patterns well

This reinforces that artificial testing has severe limitations. BBR is proven in real-world Google production—these localhost results don't invalidate that. They just show that simulation ≠ reality.

Honest Assessment: All Tests Were Localhost with Artificial Constraints

Every single test mentioned was conducted on artificial localhost environments:

  • Not real network hardware
  • Not real propagation delays
  • Not real packet processing
  • Not real congestion
  • Running on VMware virtualization
  • All network conditions created artificially using tc (traffic control)

Whether constrained (with tc rules) or unconstrained (pure loopback), these were localhost tests—not real LAN, not real WAN, not real internet connections.

These tests reveal algorithm behavior in controlled localhost conditions, but cannot predict real-world performance on actual network hardware.

This is exactly why community validation is absolutely critical. We need testing on:

  • Real network interfaces (not loopback)
  • Real routers and switches (not tc emulation)
  • Real propagation delays (not artificial tc delays)
  • Real congestion (not simulated constraints)
  • Real hardware packet processing
  • Real production networks

Community Support Needed

To truly understand NDM-TCP's capabilities and limitations, we need:

  1. Real Hardware Testing:

    • Actual network interfaces
    • Physical routers and switches
    • Real fiber, wireless, satellite links
  2. Diverse Scenarios:

    • Different bandwidths (1 Mbps to 100+ Gbps)
    • Various latencies (sub-ms to hundreds of ms)
    • Multiple loss patterns
    • Production traffic (not just iperf3)
  3. Edge Case Discovery:

    • Find where it breaks
    • Identify failure modes
    • Determine limitations
    • Map performance boundaries
  4. Independent Validation:

    • Testing by multiple researchers
    • Different environments
    • Statistical analysis
    • Peer review

Conclusion: Patterns from Multiple Localhost Tests

Across multiple localhost test scenarios with different artificial constraints, a pattern emerged:

For maximum raw throughput on good connections: Use TCP Cubic

For stability and consistency across varying conditions: Use NDM-TCP

For real WAN with bufferbloat: Use BBR (but localhost testing doesn't suit its design)

NDM-TCP, Reno, Cubic, and BBR were all tested in various localhost environments—some with extreme artificial constraints (tc-created packet loss, jitter, bandwidth limits), some without any constraints (pure loopback). The results show interesting patterns, but they're all from localhost with artificial environment constraints.

This is why community validation is essential:

  • We need real hardware, not localhost loopback
  • We need real networks, not tc-emulated conditions
  • We need diverse real-world scenarios, not simulated constraints
  • We need to find NDM-TCP's real limitations, not just localhost behavior

Not everything is perfect. We need to know what NDM-TCP cannot handle in real networks, not just what it does well in localhost simulations.

Community collaboration on real hardware is absolutely critical to move beyond artificial localhost testing and discover NDM-TCP's true capabilities and limitations.


Disclaimer: All results are from multiple localhost test scenarios with artificial environment constraints created using tc (traffic control). BBR is assumed to be v1. Performance on real network hardware may differ significantly. These recommendations are based solely on localhost simulation results with artificial constraints, not production validation on real networks.
