Fabricio


nvidia-peermem "Invalid argument" on Ubuntu — Fix GPUDirect RDMA with DMA-BUF

TL;DR: If modprobe nvidia-peermem fails with Invalid argument (-EINVAL) on a system using the inbox Ubuntu InfiniBand stack (rdma-core), the module is not broken and you do not need it. nvidia-peermem requires an API that only exists in MLNX_OFED. On Hopper/Blackwell GPUs with the NVIDIA open driver, use DMA-BUF instead — it does GPUDirect RDMA natively. The one gotcha: you must enable nvidia-drm modeset=1.

Applies to: Ubuntu 22.04 / 24.04, inbox rdma-core stack, NVIDIA open kernel driver, H100 / H200 / B200, ConnectX-6/7 (or any HCA with ODP support).


The symptom

$ sudo modprobe nvidia-peermem
modprobe: ERROR: could not insert 'nvidia_peermem': Invalid argument

Either dmesg shows that nvidia-peermem loaded but registered nothing, or the load fails outright with -EINVAL. In both cases GPUDirect RDMA appears to be unavailable.

Why this happens (and why it is not a bug)

nvidia-peermem is the legacy path for GPUDirect RDMA. It registers GPU memory with the InfiniBand subsystem through a Mellanox-proprietary kernel API:

ib_register_peer_memory_client()

That symbol exists only in MLNX_OFED's build of ib_core. It is not in the mainline kernel, so it is not in Ubuntu's inbox kernel modules either, and the rdma-core userspace packages that make up the rest of the inbox InfiniBand stack cannot add it.

If you are on the inbox stack, nvidia-peermem was compiled without that API present, so it can never bind and always returns Invalid argument. No module parameter or config change will fix it, because the thing it needs was never there.
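To confirm this on your own machine, you can look for the symbol in the running kernel. The following is a minimal sketch: the function name has_peer_memory_api is made up for illustration, and it assumes ib_core is already loaded (otherwise none of its symbols are listed at all).

```shell
#!/usr/bin/env bash
# Succeeds if the given kallsyms-format file (default: /proc/kallsyms)
# lists the peer-memory registration symbol that nvidia-peermem binds to.
# has_peer_memory_api is a hypothetical helper name, not part of any tool.
has_peer_memory_api() {
    local kallsyms="${1:-/proc/kallsyms}"
    # Symbol names are readable without root; only the addresses are masked.
    grep -qw 'ib_register_peer_memory_client' "$kallsyms" 2>/dev/null
}

if has_peer_memory_api; then
    echo "peer-memory API present: this ib_core can host nvidia-peermem"
else
    echo "peer-memory API absent: nvidia-peermem cannot bind on this kernel"
fi
```

On an inbox stack the second message is the expected result.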

Do not install MLNX_OFED just to make nvidia-peermem load. That works, but it is the wrong fix — you would be adding a heavy proprietary stack to revive an obsolete module. There is a native path already in your kernel.

The fix: use DMA-BUF

On Hopper and newer with the open driver, GPUDirect RDMA works through DMA-BUF, a mainline Linux framework. No external module, no MLNX_OFED.

Requirements (check these first)

  • NVIDIA open kernel driver (not the proprietary build)
  • nvidia-drm modeset=1 enabled ← most common missing piece
  • Kernel built with:
    • CONFIG_DMA_SHARED_BUFFER=y
    • CONFIG_HMM_MIRROR=y
    • CONFIG_INFINIBAND_ON_DEMAND_PAGING=y
  • ib_umem_dmabuf symbols present in ib_uverbs
  • HCA with ODP support (ConnectX-6/7 have it)
  • Hopper or newer GPU (H100 / H200 / B200)
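
The checklist above can be walked through with a short script. This is a sketch under a few assumptions: the kernel config is at /boot/config-$(uname -r) as on stock Ubuntu, the open driver is identified by its dual MIT/GPL module license, and ODP support shows up as ODP_SUPPORT in the verbose ibv_devinfo output. The helper names check_kconfig and report are illustrative only. It does not check the GPU generation.

```shell
#!/usr/bin/env bash
# check_kconfig FILE OPTION: succeeds if OPTION is built in (=y) in FILE.
check_kconfig() {
    grep -qx "$2=y" "$1" 2>/dev/null
}

# report LABEL COMMAND...: run COMMAND quietly and print ok / MISSING.
report() {
    local label="$1"; shift
    if "$@" >/dev/null 2>&1; then
        printf '%-8s %s\n' ok "$label"
    else
        printf '%-8s %s\n' MISSING "$label"
    fi
}

cfg="/boot/config-$(uname -r)"   # assumption: stock Ubuntu config location
for opt in CONFIG_DMA_SHARED_BUFFER CONFIG_HMM_MIRROR CONFIG_INFINIBAND_ON_DEMAND_PAGING; do
    report "$opt=y" check_kconfig "$cfg" "$opt"
done

# The open kernel modules are dual-licensed; the proprietary build reports "NVIDIA".
report "NVIDIA open kernel driver" sh -c 'modinfo -F license nvidia | grep -q "MIT/GPL"'
report "nvidia-drm modeset=1" grep -qx Y /sys/module/nvidia_drm/parameters/modeset
# Only visible once ib_uverbs is loaded.
report "ib_umem_dmabuf symbols" grep -q ib_umem_dmabuf_get /proc/kallsyms
report "HCA reports ODP support" sh -c 'ibv_devinfo -v | grep -q ODP_SUPPORT'
```

Any line printed as MISSING points at the requirement to fix before moving on.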

Step 1 — Enable nvidia-drm modeset

Check current state:

cat /sys/module/nvidia_drm/parameters/modeset

If it returns N, DMA-BUF export is inactive. Enable it:

# Runtime (fails if nvidia_drm is still in use, e.g. by a display server; stop it first or reboot)
sudo modprobe -r nvidia_drm && sudo modprobe nvidia_drm modeset=1

# Persistent across reboots
echo 'options nvidia-drm modeset=1' | sudo tee /etc/modprobe.d/nvidia-drm-modeset.conf
sudo update-initramfs -u

Re-check that the parameter now reads Y.

Step 2 — Verify GPUDirect RDMA actually works

Do not trust "it should work now." Confirm the full path: allocate GPU memory, export it as a DMA-BUF file descriptor, register it with the HCA.

The three calls that must succeed:

  1. cudaMalloc() — allocate GPU memory
  2. cuMemGetHandleForAddressRange() with CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD — export as a DMA-BUF fd
  3. ibv_reg_dmabuf_mr() — register that fd with the InfiniBand HCA

If all three return success, GPU memory is directly addressable by the HCA over DMA-BUF and GPUDirect RDMA is working. nvidia-peermem is not needed.
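
One way to run the three calls is a small C program, written out and built from the shell. This is a sketch, not a reference test: it assumes the CUDA toolkit (nvcc) and libibverbs headers are installed, registers against the first RDMA device it finds, and uses a 2 MiB buffer because the exported range has to be aligned to the host page size. The file name dmabuf_test.c and the helper write_dmabuf_test are made up for illustration.

```shell
#!/usr/bin/env bash
# Write the three-call test program to the given path (default: dmabuf_test.c).
write_dmabuf_test() {
    cat > "${1:-dmabuf_test.c}" <<'EOF'
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <cuda.h>
#include <cuda_runtime_api.h>
#include <infiniband/verbs.h>

int main(void) {
    const size_t size = 2UL << 20;   /* 2 MiB, a multiple of the host page size */
    void *gpu = NULL;
    int fd = -1;

    /* 1. Allocate GPU memory (this also creates the CUDA context). */
    if (cudaMalloc(&gpu, size) != cudaSuccess) { fprintf(stderr, "cudaMalloc failed\n"); return 1; }

    /* 2. Export the range as a DMA-BUF file descriptor. */
    CUresult rc = cuMemGetHandleForAddressRange(&fd, (CUdeviceptr)(uintptr_t)gpu, size,
                                                CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0);
    if (rc != CUDA_SUCCESS) { fprintf(stderr, "DMA-BUF export failed: %d\n", (int)rc); return 2; }

    /* 3. Register the fd with the first HCA (assumption: that is the one you want). */
    struct ibv_device **devs = ibv_get_device_list(NULL);
    if (!devs || !devs[0]) { fprintf(stderr, "no RDMA device found\n"); return 3; }
    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ctx ? ibv_alloc_pd(ctx) : NULL;
    if (!pd) { fprintf(stderr, "cannot open device or allocate PD\n"); return 3; }
    struct ibv_mr *mr = ibv_reg_dmabuf_mr(pd, 0, size, (uint64_t)(uintptr_t)gpu, fd,
                                          IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ |
                                          IBV_ACCESS_REMOTE_WRITE);
    if (!mr) { perror("ibv_reg_dmabuf_mr"); return 4; }

    printf("OK: GPU memory registered via DMA-BUF on %s (lkey=0x%x)\n",
           ibv_get_device_name(devs[0]), mr->lkey);
    ibv_dereg_mr(mr); ibv_dealloc_pd(pd); ibv_close_device(ctx);
    ibv_free_device_list(devs); close(fd); cudaFree(gpu);
    return 0;
}
EOF
}

write_dmabuf_test dmabuf_test.c
# Build and run only when the CUDA toolkit is on PATH.
if command -v nvcc >/dev/null 2>&1; then
    if nvcc -o dmabuf_test dmabuf_test.c -lcuda -libverbs; then
        ./dmabuf_test || echo "three-call test failed (exit code $?)"
    fi
else
    echo "nvcc not found; build later with: nvcc -o dmabuf_test dmabuf_test.c -lcuda -libverbs"
fi
```

The exit code tells you which step broke: 1 for allocation, 2 for the DMA-BUF export (check modeset and the open driver), 3 for device setup, 4 for the registration with the HCA.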

Summary

                          Legacy (nvidia-peermem)   Modern (DMA-BUF)
Requires MLNX_OFED        Yes                       No
External module           Yes                       No
Works on inbox rdma-core  No                        Yes
Supported GPUs            All                       Hopper+
NVIDIA recommendation     Deprecated                Preferred

If nvidia-peermem fails with Invalid argument on an inbox stack, that is expected. Enable nvidia-drm modeset=1, use DMA-BUF, verify with the three-call test above.


Related symptoms worth checking on the same box

  • All IB ports stuck in INIT, LID 0 → no Subnet Manager on the fabric. Start one: sudo apt install opensm && sudo systemctl start opensm. Ports go Active within seconds.
  • One port Down/Polling at SDR while others are Active → check the switch side by directed route. If both ends are polling, it is physical (cable / transceiver / seat), not software. Reseat or swap.
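
To see at a glance which ports are in which state, the kernel's sysfs tree can be read directly. A minimal sketch, assuming the standard /sys/class/infiniband layout; list_ib_ports is a hypothetical helper name, and its optional argument exists only so the function can be pointed at a copy of the tree.

```shell
#!/usr/bin/env bash
# Print logical state, physical state and LID for every IB port.
# ROOT defaults to the live sysfs tree.
list_ib_ports() {
    local root="${1:-/sys/class/infiniband}" port dev
    for port in "$root"/*/ports/*; do
        [ -r "$port/state" ] || continue   # no devices: the glob stays unexpanded
        dev=$(basename "$(dirname "$(dirname "$port")")")
        printf '%s port %s: state=%s phys=%s lid=%s\n' \
            "$dev" "$(basename "$port")" \
            "$(cat "$port/state")" "$(cat "$port/phys_state")" "$(cat "$port/lid")"
    done
}

list_ib_ports
```

A port showing state INIT with lid 0x0 matches the missing Subnet Manager case; a port whose physical state is Polling matches the cabling case.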
