Serge Logvinov

Proxmox Virtual Machine optimization - Deep Dive

In the previous articles, I covered the basic VM settings you should configure by default in Proxmox VE.

In this article, I’ll explain what actually happens under the hood and why proper CPU, NUMA, and interrupt configuration is critical for high-performance workloads - especially if you run latency-sensitive services or Kubernetes worker nodes.

CPU affinity alone is not enough

When you configure a CPU affinity list in Proxmox:

  • The VM is restricted to a predefined set of physical CPU cores.
  • All vCPUs are allowed to run on those cores.
  • However, the hypervisor scheduler can still move individual vCPUs between the allowed cores.

If you allow cores 0-7, vCPU-1 may run on core 0 now, then move to core 3, and later to core 6.
The VM expects predictable CPU behavior, especially for workloads like databases, networking services, or Kubernetes nodes, which apply their own optimizations based on CPU cache locality and latency. Moving vCPUs between physical cores can cause unpredictable performance degradation.

To achieve stable performance, you need to ensure that each vCPU of the VM is pinned to a specific physical CPU core.
Proxmox does not automatically pin each vCPU one-by-one.

You must configure this explicitly with scripts or automate it (see the solution below).
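
As an illustration, here is a minimal manual sketch of one-by-one pinning. VM ID 100 and host cores 0-3 (one per vCPU) are assumptions for this example; verify the pid-file path and thread names on your host. It relies on KVM naming its vCPU threads "CPU <n>/KVM".

#!/bin/bash
# Minimal sketch: pin each vCPU thread of one VM to a dedicated host core.
# VMID and PINS are assumptions for this example.
VMID=100
PINS=(0 1 2 3)                 # host cores, one per vCPU

PID=$(cat /var/run/qemu-server/${VMID}.pid)

# Walk the VM's threads; vCPU threads are named "CPU <n>/KVM".
for TASK in /proc/${PID}/task/*; do
  case "$(cat ${TASK}/comm)" in
    CPU*KVM*)
      VCPU=$(sed 's/^CPU \([0-9]*\).*/\1/' ${TASK}/comm)   # vCPU index
      taskset -cp "${PINS[$VCPU]}" "$(basename ${TASK})"   # pin it
      ;;
  esac
done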

Memory NUMA nodes - avoid cross-node memory access

Modern multi-socket servers use a NUMA (Non-Uniform Memory Access) architecture.

If NUMA is not configured correctly, Proxmox may allocate the VM's memory across multiple NUMA nodes, and the guest then has to access memory attached to a remote node. This results in increased latency and cross-node (cross-socket) memory traffic, which can significantly degrade performance.
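
You can check how a running VM's memory is currently spread across NUMA nodes with numastat from the numactl package (VM ID 100 is just an example here):

# Per-NUMA-node memory usage of the QEMU process backing VM 100:
PID=$(cat /var/run/qemu-server/100.pid)
numastat -p "${PID}"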

To avoid this, you need to:

  • identify which physical cores belong to each NUMA node
  • define the VM CPU affinity within a single NUMA node
  • configure the VM NUMA settings to match that node (see the example below)
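
A quick way to see the core-to-node mapping on the host, plus a sketch of the matching VM settings (VM ID 100 and cores 0-7 on node 0 are assumptions for this example):

# Which physical cores belong to which NUMA node:
lscpu -e=CPU,NODE,SOCKET,CORE
numactl --hardware

# Matching settings in /etc/pve/qemu-server/100.conf:
# affinity: 0-7
# numa: 1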

Why is using only one NUMA node per VM the best strategy? Because QEMU does not expose the host's CPU topology to the guest by default.
You need to pass it explicitly with QEMU arguments.

If all of the VM's CPU cores and threads come from a single NUMA node, these arguments are enough:

args: -cpu 'host,topoext=on,host-cache-info=on' -smp '4,sockets=1,cores=2,threads=2,maxcpus=4'

Do not forget to set cores and threads according to your VM CPU configuration.
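
Inside the guest, you can then check that the topology and cache information arrived as expected:

# Sockets, cores, threads, NUMA nodes and cache sizes as seen by the guest:
lscpu | grep -E 'Socket|Core|Thread|NUMA|cache'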

SR-IOV devices

Each hardware device uses interrupts to notify the Linux kernel that it has data to process.
If the interrupt is handled by a different CPU core than the one running the VM’s vCPU, several problems may occur:

  • CPU cache misses
  • cross-core synchronization overhead
  • increased memory traffic

To solve this, the hardware interrupt affinity list must match the CPU affinity list of the VM.
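
A minimal manual sketch, assuming an SR-IOV VF at PCI address 0000:01:10.0 and a VM affinity list of 0-3 (both values are examples). Keep in mind that irqbalance, if running, may later override these settings:

VF=0000:01:10.0
CORES=0-3

# Steer every MSI-X vector of the VF to the cores used by the VM.
for IRQ in /sys/bus/pci/devices/${VF}/msi_irqs/*; do
  echo "${CORES}" > /proc/irq/$(basename ${IRQ})/smp_affinity_list
done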

Solution

To automate all of this, I created an open-source component as part of Karpenter for Proxmox.

The Proxmox Scheduler:

  • observes running VMs
  • reads their CPU affinity configuration
  • pins each vCPU to a specific host core
  • sets correct interrupt affinity for SR-IOV devices
  • optionally tunes the CPU frequency governor to reduce power consumption

It is distributed as a deb package and can be installed on the Proxmox host:

wget https://github.com/sergelogvinov/karpenter-provider-proxmox/releases/download/v0.10.1/proxmox-scheduler_0.10.1_linux_amd64.deb
dpkg -i proxmox-scheduler_0.10.1_linux_amd64.deb

You can also optimize power usage, as sketched below:

  • set CPU governor to performance for cores used by VMs
  • set CPU governor to powersave for unused cores
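
A minimal sketch, assuming cores 0-3 are used by VMs and cores 4-7 are idle (the available governors depend on your cpufreq driver):

# Cores running VMs: keep them at full speed.
for c in 0 1 2 3; do
  echo performance > /sys/devices/system/cpu/cpu${c}/cpufreq/scaling_governor
done

# Unused cores: let them save power.
for c in 4 5 6 7; do
  echo powersave > /sys/devices/system/cpu/cpu${c}/cpufreq/scaling_governor
done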
