Sean Killeen

Originally published at seankilleen.com

TIL: Kubernetes Auto-scaling and Requests vs Limits

I recently had a strange experience that forced me to revise an incorrect mental model I had about Kubernetes, and I figured I'd share it here in case it helps someone else.

Background / Challenge

  • I have a Horizontal Pod Autoscaler (HPA) set to scale at 80% CPU or 80% RAM, with a minimum of 2 pods and a maximum of 5.
  • I’ve given these pods a limit of 1GB of RAM (throwing some more resources at a problem temporarily 😉); there’s a rough sketch of this setup just after the list.
  • I recently saw my HPA set the pod count to 3. So I’m curious as to why – maybe these things are just hogging RAM?
  • The HPA reports memory utilization as 84/80 (84% current against the 80% target), even with 3 pods running.
  • However, I check our instance of Goldilocks, which gives us recommendations from a Vertical Pod Autoscaler (in observe-only mode) – and it’s telling me I can set our resources way lower.
  • So I run kubectl top pods --all-namespaces --sort-by=memory… and I see the pods are using 145Mi, 118Mi, and 115Mi – far from the 1024Mi (1GB) limit I specified.
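
For context, here’s a minimal sketch of roughly what that setup looks like, using the autoscaling/v2 HPA API. The names (web-app) are placeholders rather than my actual manifests; the numbers simply mirror the bullets above.

```yaml
# Sketch of the HPA described above: scale between 2 and 5 pods,
# targeting 80% CPU and 80% memory utilization. "web-app" is a placeholder name.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

And the container spec in the target Deployment carries the memory limit along these lines (abridged):

```yaml
# Abridged container spec from the target Deployment. Only the limit
# from the bullets above is shown; requests aren't mentioned there.
resources:
  limits:
    memory: 1Gi
```

The 84/80 reading comes from kubectl get hpa, whose TARGETS column shows current vs. target utilization side by side (something like 84%/80%).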

What gives?

Read more at SeanKilleen.com!
