NDM-TCP Optimized Version: Technical Overview

What's New in the Optimized Build

The optimized version of NDM-TCP (ndm_tcp_lkm_optimized.c) introduces several low-level improvements over the standard v1.0 implementation while maintaining the same algorithmic approach.

Key Optimizations

1. AVX/SIMD Acceleration
The optimized version uses x86_64 AVX intrinsics for the neural network forward pass. When AVX is available, the first hidden layer is computed with the vectorized vpmaddwd and vphaddd instructions, which parallelize the multiply-accumulate operations across input features.
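
To make the multiply-accumulate pattern concrete, here is a user-space sketch, not the module's actual code. It assumes 16-bit fixed-point features and weights, with the 6 inputs zero-padded to 8 lanes; the names dot8_simd, dot8_scalar, and dot8 are hypothetical. The intrinsics _mm_madd_epi16 and _mm_hadd_epi32 are the ones that compile to vpmaddwd and vphaddd when AVX is enabled.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <immintrin.h>

/* One neuron's weighted sum over 8 int16 lanes (assumed layout). */
__attribute__((target("avx")))
static int32_t dot8_simd(const int16_t *x, const int16_t *w)
{
    __m128i vx = _mm_loadu_si128((const __m128i *)x);
    __m128i vw = _mm_loadu_si128((const __m128i *)w);
    /* vpmaddwd: multiply 8 int16 pairs, add neighbours -> 4 int32 sums */
    __m128i prod = _mm_madd_epi16(vx, vw);
    /* vphaddd twice: fold the 4 partial sums into lane 0 */
    __m128i sum = _mm_hadd_epi32(prod, prod);
    sum = _mm_hadd_epi32(sum, sum);
    return _mm_cvtsi128_si32(sum);
}
#endif

/* Portable reference used when AVX is not available. */
static int32_t dot8_scalar(const int16_t *x, const int16_t *w)
{
    int32_t acc = 0;
    for (int i = 0; i < 8; i++)
        acc += (int32_t)x[i] * w[i];
    return acc;
}

/* Runtime dispatch: SIMD path on AVX-capable x86_64, scalar otherwise. */
static int32_t dot8(const int16_t *x, const int16_t *w)
{
#if defined(__x86_64__)
    if (__builtin_cpu_supports("avx"))
        return dot8_simd(x, w);
#endif
    return dot8_scalar(x, w);
}
```

Inside a kernel module, a SIMD section like dot8_simd would additionally have to be bracketed by kernel_fpu_begin() and kernel_fpu_end(), since the kernel does not save vector registers by default.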

2. Reduced Memory Footprint

RTT history window compressed from 16 to 8 slots
Hidden layer reduced from 8 to 4 neurons
Input features streamlined from 8 to 6
Total struct size optimized to exactly 64 bytes (ICSK_CA_PRIV_SIZE limit)
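
A compact layout along these lines could look like the sketch below. Every field name here is hypothetical and the real struct in ndm_tcp_lkm_optimized.c may be organized differently; the point is only to show how 8 RTT slots and 4 hidden values can fit a 64-byte budget, with a compile-time check guarding the size.

```c
#include <stdint.h>

#define NDM_RTT_SLOTS   8   /* RTT history slots (16 in the standard build) */
#define NDM_HIDDEN      4   /* hidden neurons (8 in the standard build) */
#define NDM_PRIV_BUDGET 64  /* byte budget quoted in the list above */

/* Hypothetical per-connection state; fields are illustrative only. */
struct ndm_priv_sketch {
    uint32_t rtt_hist_us[NDM_RTT_SLOTS]; /* 32 bytes */
    int16_t  hidden_q[NDM_HIDDEN];       /*  8 bytes */
    uint32_t min_rtt_us;                 /*  4 bytes */
    uint32_t prior_cwnd;                 /*  4 bytes */
    int32_t  cached_delta;               /*  4 bytes */
    uint16_t entropy_q8;                 /*  2 bytes */
    uint16_t plasticity_q8;              /*  2 bytes */
    uint8_t  hist_idx;                   /*  1 byte  */
    uint8_t  hist_cnt;                   /*  1 byte  */
    uint8_t  reuse_left;                 /*  1 byte  */
    uint8_t  flags;                      /*  1 byte  */
    uint32_t loss_events;                /*  4 bytes */
};

/* Fail the build if the struct drifts away from the budget. */
_Static_assert(sizeof(struct ndm_priv_sketch) == NDM_PRIV_BUDGET,
               "private state must stay at 64 bytes");

/* Ring-buffer insert; the mask works because the slot count is a power of two. */
static void ndm_push_rtt(struct ndm_priv_sketch *p, uint32_t rtt_us)
{
    p->rtt_hist_us[p->hist_idx] = rtt_us;
    p->hist_idx = (uint8_t)((p->hist_idx + 1) & (NDM_RTT_SLOTS - 1));
    if (p->hist_cnt < NDM_RTT_SLOTS)
        p->hist_cnt++;
}
```

In kernel code the usual form of this check is BUILD_BUG_ON(sizeof(...) > ICSK_CA_PRIV_SIZE) in the module's init function, which follows whatever value the target kernel defines.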

3. LUT-Based Activation Functions
Pre-computed lookup tables replace runtime calculations for tanh and sigmoid, trading ~500 bytes of read-only memory for faster activation.
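
The idea can be sketched in a few lines of fixed-point C. This is an illustration under assumptions, not the module's tables: it uses Q10 values (1024 represents 1.0), a deliberately shortened 17-entry tanh table with a step of 0.25, and it derives sigmoid from the tanh table via sigmoid(x) = (1 + tanh(x/2)) / 2 instead of keeping a second table. The names tanh_lut_q10, tanh_q10, and sigmoid_q10 are hypothetical.

```c
#include <stdint.h>

#define Q10_ONE 1024  /* 1.0 in Q10 fixed point (assumed format) */

/* tanh(k * 0.25) * 1024 for k = 0..16, rounded; covers |x| up to 4.0. */
static const int16_t tanh_lut_q10[17] = {
       0,  251,  473,  650,  780,  869,  927,  964,  987,
    1001, 1010, 1016, 1019, 1021, 1022, 1023, 1023
};

/* Table lookup with odd symmetry; inputs beyond the table saturate. */
static int32_t tanh_q10(int32_t x)
{
    uint32_t mag = x < 0 ? 0u - (uint32_t)x : (uint32_t)x;
    uint32_t idx = mag >> 8;          /* 256 in Q10 is a step of 0.25 */
    if (idx > 16)
        idx = 16;
    int32_t y = tanh_lut_q10[idx];
    return x < 0 ? -y : y;
}

/* sigmoid(x) = (1 + tanh(x / 2)) / 2, so one table serves both functions. */
static int32_t sigmoid_q10(int32_t x)
{
    return (Q10_ONE + tanh_q10(x / 2)) / 2;
}
```

A finer table (for example 128 entries of 16 bits per function) would land near the ~500 byte figure and reduce the step error, at the cost of more read-only data.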

4. Fast Entropy Calculation
Histogram binning uses fixed-point arithmetic and bit shifts instead of floating-point division, with an 8-bin LUT for entropy values.
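
Here is one way such a calculation could be written, as a sketch under assumptions. It takes 8 RTT samples, assigns them to 8 bins with a right shift chosen from the sample range, and sums a precomputed table of -(c/8)·log2(c/8) in Q8 (256 represents 1 bit), indexed by bin count c. The table layout, the Q8 format, and the names plogp_q8 and rtt_entropy_q8 are hypothetical.

```c
#include <stdint.h>

#define NDM_SAMPLES 8
#define NDM_BINS    8

/* -(c/8) * log2(c/8) scaled by 256, for bin counts c = 0..8 (rounded). */
static const uint16_t plogp_q8[NDM_SAMPLES + 1] = {
    0, 96, 128, 136, 128, 108, 80, 43, 0
};

/* Shannon entropy of the RTT histogram in Q8 bits (0 .. 768). */
static uint32_t rtt_entropy_q8(const uint32_t rtt[NDM_SAMPLES])
{
    uint32_t lo = rtt[0], hi = rtt[0];
    uint8_t count[NDM_BINS] = {0};
    uint32_t shift = 0, h = 0;

    for (int i = 1; i < NDM_SAMPLES; i++) {
        if (rtt[i] < lo) lo = rtt[i];
        if (rtt[i] > hi) hi = rtt[i];
    }
    /* Smallest shift that maps the whole range into bins 0..7. */
    while (((hi - lo) >> shift) >= NDM_BINS)
        shift++;
    for (int i = 0; i < NDM_SAMPLES; i++)
        count[(rtt[i] - lo) >> shift]++;
    /* No division or log at runtime: one table lookup per bin. */
    for (int b = 0; b < NDM_BINS; b++)
        h += plogp_q8[count[b]];
    return h;
}
```

With this scaling, identical samples give 0 and eight samples in eight different bins give 768, which is 3 bits.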

5. Computation Caching
When network conditions are stable (low entropy, high plasticity), the module reuses the previous cwnd delta for up to 8 consecutive ACKs, skipping neural network inference.
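
The reuse logic might be structured roughly as follows. This is a sketch, not the module's code: the thresholds, the Q8 scaling of entropy and plasticity, and the names ndm_cache, ndm_cwnd_delta, and demo_infer are all hypothetical, and demo_infer is only a stand-in that counts how often inference runs.

```c
#include <stdint.h>

#define NDM_MAX_REUSE          8    /* consecutive ACKs served from cache */
#define NDM_ENTROPY_LOW_Q8     64   /* assumed "low entropy" threshold */
#define NDM_PLASTICITY_HIGH_Q8 192  /* assumed "high plasticity" threshold */

struct ndm_cache {
    int32_t delta;       /* last cwnd delta produced by inference */
    uint8_t reuse_left;  /* remaining ACKs that may reuse delta */
    uint8_t valid;       /* set once delta holds a real result */
};

/* Return a cwnd delta, running inference only when the cache cannot be used. */
static int32_t ndm_cwnd_delta(struct ndm_cache *c,
                              uint16_t entropy_q8, uint16_t plasticity_q8,
                              int32_t (*infer)(void *), void *ctx)
{
    int stable = entropy_q8 < NDM_ENTROPY_LOW_Q8 &&
                 plasticity_q8 > NDM_PLASTICITY_HIGH_Q8;

    if (stable && c->valid && c->reuse_left > 0) {
        c->reuse_left--;             /* cache hit: skip the forward pass */
        return c->delta;
    }
    c->delta = infer(ctx);           /* cache miss: run the network */
    c->valid = 1;
    c->reuse_left = stable ? NDM_MAX_REUSE : 0;
    return c->delta;
}

/* Stand-in for the neural network: counts calls, returns a fixed delta. */
static int32_t demo_infer(void *ctx)
{
    (*(int *)ctx)++;
    return 2;
}
```

Under stable conditions this runs inference once per 9 ACKs (one miss followed by 8 hits); any unstable ACK forces a fresh inference and clears the reuse budget.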

Important Disclaimers

This optimized version does NOT deliver network performance gains over the real NDM-TCP v1.0 algorithm. The optimizations are purely implementation-level, aimed at compilation and runtime efficiency. The core congestion control logic and performance characteristics remain conceptually similar to the standard version.

These changes primarily affect:

CPU cycles per packet processed (an expected total reduction of 56% to 62%)
Memory cache efficiency
Compilation time and binary size

They do not fundamentally alter the network throughput, latency, or congestion response that users would observe in real-world testing.

Compilation Instructions

To compile the optimized version:

# Option 1: Rename and compile
cp ndm_tcp_lkm_optimized.c ndm_tcp_lkm.c
make

# Option 2: Modify Makefile to target optimized source directly

Building the module requires kernel headers and FPU support to be configured. The AVX optimizations activate automatically on compatible x86_64 systems.

Repository

Source code and build instructions: https://github.com/hejhdiss/lkm-ndm-tcp

Note: Both versions implement the same entropy-aware, neural network-based TCP congestion control algorithm. Choose the optimized build for production deployments where CPU efficiency matters, or stick with the standard version for easier debugging and code readability.
