DEV Community

Guatu

Posted on • Originally published at guatulabs.dev

AMD Ryzen C-State Freezes: How `processor.max_cstate=1` Saved My Proxmox Node

  1. THE SYMPTOM

    My Proxmox node would randomly freeze — no logs, no crash, just unresponsive. SSH dropped, web UI froze, and I had to hard-reboot. It was the kind of thing that breaks your day when you're trying to debug a Kubernetes deployment.

  2. WHAT I EXPECTED

    I expected a normal Proxmox VM host — stable, responsive, and reliable. After all, it’s a Ryzen 7840U in a decent case with decent cooling. I was running a few VMs and some Kubernetes workloads, nothing heavy. But every few days, it would lock up.

  3. WHAT ACTUALLY HAPPENED

    Turns out, the problem was in the CPU’s power management. Ryzen Zen 3 and Zen 4 chips, especially the mobile variants, are widely reported to lock up completely when a core enters a deep C-state (C2/C3) under certain Linux conditions. The CPU goes to sleep, and the system never wakes up. No MCE, no EDAC, no useful logs, just a frozen node. And yes, it’s frustrating as hell.

I tried checking dmesg, journalctl, and even ran perf and turbostat, but nothing obvious came up. It wasn’t the GPU, it wasn’t the RAM, it wasn’t a thermal shutdown. It was the C-state.

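  4. THE FIX

    Add this to your kernel command line by setting it in /etc/default/grub. If GRUB_CMDLINE_LINUX already holds other parameters, keep them and append this one: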
GRUB_CMDLINE_LINUX="processor.max_cstate=1"

Then run:

sudo update-grub
sudo reboot

This caps the deepest idle state the kernel will request at C1, so the cores never enter the C2/C3 states that trigger the freeze. For most Ryzen systems, C1 is fine. It doesn’t disable power savings: cores still halt in C1 when idle, and frequency scaling is unaffected. The cost is somewhat higher idle power draw rather than lost performance, and the stability gain is worth it.
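If you'd rather script the edit than do it by hand, here is a minimal sketch. The function name `add_cstate_param` is hypothetical (not part of GRUB or Proxmox), and it assumes GNU sed (for `-i`) and a `GRUB_CMDLINE_LINUX="..."` line written with double quotes. It only edits the file you pass in; it does not run update-grub or reboot.

```shell
#!/bin/sh
# Hypothetical helper: add_cstate_param FILE
# Appends processor.max_cstate=1 to the GRUB_CMDLINE_LINUX="..." line in FILE,
# keeping any parameters already there. Does nothing if it is already present.
# Assumes GNU sed (-i) and a double-quoted value.
add_cstate_param() {
    file="$1"
    param="processor.max_cstate=1"

    # Already set on the target line: leave the file alone.
    if grep -q "^GRUB_CMDLINE_LINUX=\".*${param}" "$file"; then
        return 0
    fi

    if grep -q '^GRUB_CMDLINE_LINUX="' "$file"; then
        # Insert the parameter just before the closing quote, then strip the
        # leading space that results when the original value was empty.
        sed -i \
            -e "s|^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"|\1 ${param}\"|" \
            -e "s|^GRUB_CMDLINE_LINUX=\" |GRUB_CMDLINE_LINUX=\"|" \
            "$file"
    else
        # No such line yet: add one at the end of the file.
        printf 'GRUB_CMDLINE_LINUX="%s"\n' "$param" >> "$file"
    fi
}
```

Typical use would be to back up /etc/default/grub first, call `add_cstate_param /etc/default/grub`, and then run update-grub and reboot as usual. Running it a second time leaves the file unchanged.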

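If you're using Proxmox with GRUB, edit /etc/default/grub, add the line above, and then run the update and reboot. Proxmox installs that boot through systemd-boot (typically ZFS root on UEFI) read the kernel command line from /etc/kernel/cmdline instead; there the parameter goes on that single line, followed by proxmox-boot-tool refresh. Either way, you can verify it's working with: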

sudo turbostat --Summary --interval 10

Look at Busy% (time spent in C0) and the C-state residency columns such as C1% and anything deeper. If C1 is the only idle state showing residency, you’re good to go.
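A second check that doesn't need turbostat is to read what the running kernel itself reports. The sketch below is an illustration, not an official tool: the function name `check_cstate_limit` and the `SYSFS_ROOT` / `PROC_CMDLINE` overrides are hypothetical and exist only so the function can be pointed at test paths. It assumes the standard cpuidle sysfs layout; the module parameter file may not be exposed or readable on every kernel, so that step is skipped if it's missing.

```shell
#!/bin/sh
# Sketch: confirm processor.max_cstate=1 was applied and show which idle
# states cpu0 exposes. SYSFS_ROOT and PROC_CMDLINE are hypothetical
# overrides for testing; by default the real /sys and /proc are used.
check_cstate_limit() {
    sysfs="${SYSFS_ROOT:-/sys}"
    cmdline="${PROC_CMDLINE:-/proc/cmdline}"

    # 1. Was the parameter on the command line the kernel booted with?
    if grep -q 'processor\.max_cstate=1' "$cmdline"; then
        echo "cmdline: processor.max_cstate=1 present"
    else
        echo "cmdline: processor.max_cstate=1 MISSING"
        return 1
    fi

    # 2. Value reported by the ACPI processor driver, if the kernel exposes it.
    param="$sysfs/module/processor/parameters/max_cstate"
    if [ -r "$param" ]; then
        echo "module: max_cstate=$(cat "$param")"
    fi

    # 3. Idle states registered for cpu0, with how often each was entered.
    for state in "$sysfs"/devices/system/cpu/cpu0/cpuidle/state*; do
        [ -d "$state" ] || continue
        echo "$(cat "$state/name"): entered $(cat "$state/usage") times"
    done
    return 0
}
```

Run `check_cstate_limit` as root after the reboot. With the limit in place, the state list should generally stop at C1 (often preceded by a POLL entry); if C2 or C3 still appear with growing counts, the parameter did not take effect.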

  5. WHY THIS MATTERS

    This issue is not rare. If you're running Proxmox, Kubernetes, or even a headless Ryzen machine as a homelab node, you may hit this, especially on Zen 3 or Zen 4 mobile CPUs (like the 5800U or 7840U). The problem is Linux-specific, and it's not always caught in logs. You won’t know it's happening until your node dies.

If you're building a homelab node and using Ryzen, consider this a must-check. It's a low-effort fix that avoids a ton of headache. I’ve seen this on multiple machines, especially when running resource-heavy containers or VMs. It's not a Proxmox bug and, as far as I can tell, not a kernel bug either; it looks like a hardware quirk that Linux's idle handling happens to expose. But with that one line in your GRUB config, you can avoid it.

If you're deploying nodes in a cluster, don't ignore this. It's not just a dev box thing. I’ve seen Kubernetes nodes go NotReady for no apparent reason, and this was the cause. You can also pair this with a watchdog daemon, like watchdog-mux, to auto-reboot on lockups if you're still getting edge cases.

So yes — it's a gotcha. But one you can fix in a few minutes. And one that could save you from a lot of late-night troubleshooting.
