Sergio Andres Usma

Posted on Apr 5

Creating a 50 GB Swap File on Jetson AGX Orin (Root on NVMe)

#jetson #linux #swap #agxorin

Abstract

This document describes the process of creating, tuning, and managing a large swap file on an NVIDIA Jetson AGX Orin 64 GB running Ubuntu 22.04.5 LTS aarch64. The configuration is specifically optimized for running large language models (LLMs) alongside CUDA, cuDNN, and TensorRT by leveraging a fast NVMe SSD as the primary swap backing store.

The implementation was validated using a 50 GB swap file configuration alongside existing zram layers. The procedure successfully extended the usable memory capacity, allowing for the deployment of larger models without triggering immediate Out-Of-Memory (OOM) errors, provided the storage-to-RAM paging latency is acceptable.

This tutorial serves as a technical reference for advanced Jetson and Linux users. It provides a reproducible method for extending virtual memory on edge AI hardware to support demanding 34B–70B parameter models.

1. Hardware and Software Environment

The target environment is an NVIDIA Jetson AGX Orin Developer Kit equipped with 64 GB of unified memory. The system runs Ubuntu 22.04.5 LTS on an aarch64 kernel (5.15.185-tegra). The installation includes JetPack 6.2.2, providing the necessary software stack for AI inference, including CUDA 12.6, cuDNN 9.3.0, and TensorRT 10.3.0.

The primary storage for the swap file is the NVMe SSD, which serves as the root filesystem. This choice is critical for minimizing the performance penalty during memory paging operations.

Component	Detail
Hardware	NVIDIA Jetson AGX Orin Developer Kit 64 GB
OS	Ubuntu 22.04.5 LTS aarch64
Kernel	5.15.185-tegra
RAM	64 GB unified memory
JetPack	6.2.2+b24 (nvidia-jetpack)
CUDA	12.6 (nvcc 12.6.68)
cuDNN	9.3.0
TensorRT	10.3.0.30-1+cuda12.5

Table 1 — Jetson AGX Orin environment for swap configuration

2. Swap Location Strategy

Effective swap placement is determined by the throughput and endurance of the underlying storage media. On the Jetson AGX Orin, the system utilizes eMMC for the boot partition and an NVMe SSD for the primary root filesystem.

Storage	Approx Speed	Recommendation
NVMe SSD	~2000 MB/s	Best — primary location for swap
eMMC	~400 MB/s	Secondary fallback; higher wear risk
USB Drive	~100 MB/s	Not recommended due to high latency

Table 2 — Recommended swap backing storage on Jetson AGX Orin

For this configuration, the swap file is placed directly on the NVMe-backed root filesystem (/) at /swapfile. This ensures the highest possible I/O performance for paging operations.

3. Step-by-Step Swap File Creation

The following steps outline the allocation and initialization of a 50 GB swap file.

3.1 Check Devices and Free Space

Before allocation, verify the available space on the target partition. The lsblk command shows which device backs each mount point, while df -h reports the free space on the root filesystem.

# List block devices and mount points
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,ROTA

# Check free space on the root filesystem
df -h /

The current configuration shows approximately 636 GB of available space on /dev/nvme0n1p1, which is more than sufficient for a 50 GB allocation.
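The free-space check can also be scripted so that a later step refuses to run on a nearly full disk. The sketch below is one way to do it, not part of the original procedure: the function name has_free_space and the 10 GiB safety margin are assumptions chosen for illustration.

```shell
#!/bin/sh
# has_free_space PATH REQUIRED_GIB  (hypothetical helper)
# Succeeds when the filesystem holding PATH has at least REQUIRED_GIB GiB free.
has_free_space() {
    path="$1"
    required_gib="$2"
    # df -Pk prints POSIX-format output in 1 KiB blocks; column 4 is "Available".
    avail_kib=$(df -Pk "$path" | awk 'NR == 2 {print $4}')
    required_kib=$((required_gib * 1024 * 1024))
    [ "$avail_kib" -ge "$required_kib" ]
}

# 50 GiB swap file plus an assumed 10 GiB safety margin.
if has_free_space / 60; then
    echo "Enough free space on / for a 50 GB swap file"
else
    echo "Not enough free space on / for a 50 GB swap file" >&2
fi
```

Checking for more than the bare 50 GB leaves room for logs and model downloads once the swap file exists.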

3.2 Create the Swap File

The fallocate utility reserves the file's blocks without writing zeros to them, so the 50 GB allocation completes almost instantly.

# Allocate 50 GB for the swap file on the root filesystem
sudo fallocate -l 50G /swapfile
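The swapon manual notes that files preallocated with fallocate can be rejected as swap on some filesystems, and that writing zeros with dd is the most portable alternative. The following sketch tries fallocate first and falls back to dd. The helper name make_swap_backing is an assumption for illustration, and the function must run as root when it targets /swapfile.

```shell
#!/bin/sh
# make_swap_backing FILE SIZE_MIB  (hypothetical helper)
# Tries fallocate first, then falls back to writing zeros with dd.
make_swap_backing() {
    file="$1"
    size_mib="$2"
    if fallocate -l "${size_mib}MiB" "$file" 2>/dev/null; then
        echo "allocated with fallocate"
    else
        # Remove any partial file, then write real zero-filled blocks.
        rm -f "$file"
        dd if=/dev/zero of="$file" bs=1M count="$size_mib" status=none
        echo "allocated with dd"
    fi
}

# Real use, from a root shell (50 GiB = 51200 MiB):
# make_swap_backing /swapfile 51200
```

The dd path is much slower because it writes all 50 GB, so it is only worth taking when swapon later refuses the fallocate-created file.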

3.3 Secure and Format the Swap File

Swap can hold any data that was previously in memory, including keys and credentials, so the file must be restricted to root-only access before it is formatted.

# Restrict permissions to root read/write only
sudo chmod 600 /swapfile

# Format the file as swap space
sudo mkswap /swapfile
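The permissions can be checked from a script before the file is enabled. This is a minimal sketch assuming GNU stat, as shipped with Ubuntu. The helper name swap_perms_ok is hypothetical, and its optional second argument (the expected owner UID, default 0 for root) exists only to make the function easy to try on a scratch file.

```shell
#!/bin/sh
# swap_perms_ok FILE [OWNER_UID]  (hypothetical helper)
# Succeeds when FILE has mode 600 and is owned by OWNER_UID (default 0, root).
swap_perms_ok() {
    file="$1"
    owner_uid="${2:-0}"
    # %a = octal permission bits, %u = numeric owner ID.
    [ "$(stat -c '%a %u' "$file")" = "600 $owner_uid" ]
}

if [ -e /swapfile ]; then
    if swap_perms_ok /swapfile; then
        echo "/swapfile is restricted to root"
    else
        echo "/swapfile permissions are too open" >&2
    fi
fi
```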

3.4 Enable the Swap File

Once formatted, the swap file must be activated in the running kernel.

# Enable the swap file
sudo swapon /swapfile

# Verify active swap devices
swapon --show

# Confirm memory and swap totals
free -h
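For scripted setups, the same verification can be done by reading /proc/swaps, which lists every active swap area. The sketch below is an illustration rather than part of the original procedure. The helper name swap_is_active is an assumption, and the optional second argument lets the function read a saved copy of the table instead of the live one.

```shell
#!/bin/sh
# swap_is_active TARGET [SWAPS_FILE]  (hypothetical helper)
# Succeeds when TARGET appears in the first column of the swap table.
swap_is_active() {
    target="$1"
    swaps="${2:-/proc/swaps}"
    # Skip the header line, then look for an exact match on the name column.
    awk -v t="$target" 'NR > 1 && $1 == t {found = 1} END {exit !found}' "$swaps"
}

if swap_is_active /swapfile; then
    echo "/swapfile is active"
else
    echo "/swapfile is not active" >&2
fi
```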

4. Making Swap Persistent Across Reboots

To ensure the swap file is automatically re-enabled upon system restart, an entry must be added to the /etc/fstab configuration file.

# Append the swap file definition to /etc/fstab
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

# Verify the entry exists
grep swap /etc/fstab
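Running the append command twice leaves two identical lines in /etc/fstab. A guarded version, sketched below, adds the entry only when no /swapfile line exists yet. The helper name add_swap_entry is hypothetical, and taking the file path as an argument is a choice made here so the function can be tried on a copy first.

```shell
#!/bin/sh
# add_swap_entry FSTAB_FILE  (hypothetical helper)
# Appends the swap file definition unless a /swapfile entry already exists.
add_swap_entry() {
    fstab="$1"
    entry='/swapfile none swap sw 0 0'
    if grep -qE '^/swapfile[[:space:]]' "$fstab"; then
        echo "entry already present"
    else
        printf '%s\n' "$entry" >> "$fstab"
        echo "entry added"
    fi
}

# Real use, from a root shell:
# add_swap_entry /etc/fstab
```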

5. Tuning Swappiness and zram for LLM Workloads

Optimal performance for LLM inference requires tuning the kernel to prioritize physical RAM and the compressed zram layer over the disk-backed swap file.

5.1 Adjust Swappiness and Cache Pressure

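Lowering vm.swappiness makes the kernel prefer reclaiming page cache over swapping out anonymous memory (heap and other process allocations), so pages reach swap only under real memory pressure. The setting applies to every swap area, zram included. Lowering vm.vfs_cache_pressure below its default of 100 makes the kernel retain directory and inode caches longer.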

# Apply settings immediately
sudo sysctl vm.swappiness=10
sudo sysctl vm.vfs_cache_pressure=50

# Persist the settings across reboots
echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf
echo 'vm.vfs_cache_pressure=50' | sudo tee -a /etc/sysctl.conf

# Reload sysctl configuration
sudo sysctl -p

Swappiness	Behavior Description
0	Swap only when absolutely out of RAM
10	Recommended for LLM workloads
60	Typical Linux default
100	Very aggressive swapping

Table 3 — Swappiness values and behavior for Jetson LLM use
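As an alternative to appending to /etc/sysctl.conf, the same two settings can live in a dedicated drop-in file under /etc/sysctl.d, which is easier to remove later and cannot accumulate duplicate lines. The sketch below assumes this layout. The helper name write_swap_tuning and the file name 99-swap-tuning.conf are illustrative choices, not something the original setup used.

```shell
#!/bin/sh
# write_swap_tuning CONF_FILE  (hypothetical helper)
# Writes the swap tuning values to CONF_FILE, replacing any previous content.
write_swap_tuning() {
    conf="$1"
    cat > "$conf" <<'EOF'
# Keep anonymous pages in RAM where possible
vm.swappiness=10
vm.vfs_cache_pressure=50
EOF
}

# Real use, from a root shell (the file name is an assumption):
# write_swap_tuning /etc/sysctl.d/99-swap-tuning.conf
# sysctl --system   # loads every sysctl configuration file, drop-ins included
```

Note that sysctl -p without arguments reads only /etc/sysctl.conf, so a drop-in file needs sysctl --system or a reboot to take effect.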

6. Relationship Between zram and /swapfile

The Jetson system utilizes a tiered memory architecture. On stock JetPack images, the nvzramconfig service creates one compressed RAM-based swap device per CPU core (zram0 through zram11 on this system). The hierarchy of memory allocation is as follows:

Physical RAM (64 GB unified memory)
zram (Compressed swap in RAM, ~31 GB total)
NVMe Swap File (50 GB on /swapfile)

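The kernel uses active swap areas in descending priority order. Provided the zram devices carry a higher priority than /swapfile (the PRIO column of swapon --show confirms this), evicted pages are compressed into zram first and spill to the higher-latency NVMe swap file only once zram is full.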
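The order in which the kernel will use the swap areas can be read directly from the Priority column of /proc/swaps. The sketch below lists the areas from highest to lowest priority. The helper name swap_order is an assumption for illustration. If /swapfile ever appears above the zram devices, an explicit priority can be assigned with swapon -p or the pri= option in /etc/fstab.

```shell
#!/bin/sh
# swap_order [SWAPS_FILE]  (hypothetical helper)
# Prints swap areas in the order the kernel uses them (highest priority first).
swap_order() {
    swaps="${1:-/proc/swaps}"
    # Column 5 is Priority, column 1 is the device or file name.
    awk 'NR > 1 {print $5, $1}' "$swaps" | sort -k1,1nr | awk '{print $2}'
}

swap_order
```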

7. Removing or Reconfiguring the Swap File

If disk space needs to be reclaimed, the swap file can be decommissioned following these steps:

# Disable the swap file usage
sudo swapoff /swapfile

# Remove the entry from /etc/fstab
sudo sed -i '/\/swapfile/d' /etc/fstab

# Delete the physical file
sudo rm /swapfile

# Optional: re-apply /etc/sysctl.conf (not required for swap removal;
# the swappiness settings still govern the remaining zram swap)
sudo sysctl -p
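Before running swapoff, it helps to confirm that the pages currently held in the swap file will fit back into memory, because swapoff has to move every one of them out of the file first. The following is a conservative sketch of such a check, comparing the file's used size with MemAvailable. The helper name safe_to_swapoff and its optional file arguments are assumptions for illustration.

```shell
#!/bin/sh
# safe_to_swapoff TARGET [SWAPS_FILE] [MEMINFO_FILE]  (hypothetical helper)
# Succeeds when the data stored in TARGET fits into currently available RAM.
safe_to_swapoff() {
    target="$1"
    swaps="${2:-/proc/swaps}"
    meminfo="${3:-/proc/meminfo}"
    # Column 4 of the swap table is the used size in KiB.
    used_kib=$(awk -v t="$target" '$1 == t {print $4}' "$swaps")
    avail_kib=$(awk '/^MemAvailable:/ {print $2}' "$meminfo")
    # An inactive target counts as 0 KiB used.
    [ "${used_kib:-0}" -le "${avail_kib:-0}" ]
}

if safe_to_swapoff /swapfile; then
    echo "swapoff /swapfile should fit in available memory"
else
    echo "stop workloads first: swapped data exceeds available memory" >&2
fi
```

The check ignores the zram devices, which could absorb some of the pages, so a failure means "free memory first" rather than "swapoff is impossible".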

8. Practical Outcomes

Increased Capacity: Successfully established a 50 GB swap area on NVMe, expanding the total virtual memory capacity.
Stability: Provided a critical safety margin for running 70B parameter models (e.g., Q4_K_M) that may exceed the 64 GB physical RAM limit during peak usage.
Optimized Hierarchy: Integrated the new disk-backed swap into the existing zram architecture without disrupting the compressed RAM layer.
Persistence: Achieved a fully automated configuration that survives system reboots via entries in /etc/fstab and /etc/sysctl.conf.

9. Conclusions

Configuring a large, NVMe-backed swap file is a highly effective strategy for maximizing the utility of the NVIDIA Jetson AGX Orin 64 GB for large-scale AI workloads. By following the documented procedure of using fallocate, setting strict chmod 600 permissions, and tuning swappiness to 10, users can achieve a stable environment capable of handling models that exceed physical memory boundaries.

While the performance penalty of disk-based swapping is unavoidable, the use of high-speed NVMe storage and a tiered zram approach limits the impact on inference latency, making it a viable solution for non-interactive or batch processing of 34B–70B parameter models.

DEV Community