When every Pod screams for CPU and memory, who decides who lives, who waits, and who gets evicted?
Kubernetes isn't just a scheduler — it's a negotiator of fairness and efficiency.
Every second, it balances hundreds of workloads, deciding what runs, what waits, and what gets terminated — while maintaining reliability and cost efficiency.
This article unpacks how Quality of Service (QoS), Priority Classes, Preemption, and Bin-Packing Scoring come together to keep your cluster stable and fair.
⚙️ The Challenge: Competing Workloads in Shared Clusters
When multiple workloads share cluster resources, conflicts are inevitable:
- High-traffic apps starve lower-priority workloads.
- Batch jobs hog memory.
- Pods without limits cause unpredictable evictions.
Kubernetes addresses this by applying a layered decision-making model — QoS, Priority, Preemption, and Scoring.
🧭 QoS (Quality of Service): Who Gets Evicted First
Each Pod belongs to a QoS class based on CPU and memory configuration:
| QoS Class | Description | Eviction Priority |
|---|---|---|
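| Guaranteed | Every container sets CPU and memory requests equal to its limits | Evicted last |
| Burstable | At least one container sets a request or limit, but the Pod does not meet the Guaranteed criteria | Evicted after BestEffort |
| BestEffort | No container sets any requests or limits | Evicted first |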
💡 Lesson: Always define requests and limits — QoS decides who survives under node pressure.
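As a rough illustration of the QoS classes above, the sketch below shows two Pods that would land in different classes. The Pod names, image, and resource figures are hypothetical placeholders, not values from a real cluster.

```yaml
# Guaranteed: every container sets CPU and memory requests equal to limits.
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed-demo   # hypothetical name
spec:
  containers:
    - name: app
      image: nginx            # placeholder image
      resources:
        requests:
          cpu: "500m"
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "256Mi"
---
# Burstable: requests are set, but limits are higher than requests.
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable-demo    # hypothetical name
spec:
  containers:
    - name: app
      image: nginx            # placeholder image
      resources:
        requests:
          cpu: "250m"
          memory: "128Mi"
        limits:
          cpu: "500m"
          memory: "256Mi"
```

Once a Pod is created, the class Kubernetes assigned can be read from its `status.qosClass` field, for example with `kubectl get pod qos-guaranteed-demo -o jsonpath='{.status.qosClass}'`.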
🧱 Priority Classes: Who Runs First
QoS defines who stays under node pressure, while Priority Classes define who gets scheduled first.
A PriorityClass maps a name to an integer value; Pods with higher values are placed ahead of lower ones in the scheduling queue.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-services
value: 100000
description: Critical platform workloads
💡 Lesson: Reserve high priorities for mission-critical services.
Overusing "high" priority leads to chaos — not resilience.
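A Pod opts into a priority by naming the class in its spec. The minimal sketch below assumes the `critical-services` PriorityClass defined above already exists; the Pod name, image, and resource figures are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: payments-api                     # hypothetical name
spec:
  # Must match the name of an existing PriorityClass.
  priorityClassName: critical-services
  containers:
    - name: app
      image: nginx                       # placeholder image
      resources:
        requests:
          cpu: "250m"
          memory: "128Mi"
        limits:
          cpu: "250m"
          memory: "128Mi"
```

If the referenced PriorityClass does not exist, the Pod is expected to be rejected at admission, so the class should be created first.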
⚔️ Preemption: Controlled Sacrifice, Not Chaos
When a high-priority Pod can't be scheduled:
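- The scheduler looks for a node where removing lower-priority Pods would free enough resources.
- It marks those Pods for graceful termination.
- It schedules the high-priority Pod once the resources are released.
The scheduler tries to respect PodDisruptionBudgets (PDBs) when choosing victims, but only on a best-effort basis: if no other option exists, it can still preempt Pods covered by a PDB.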
💡 Lesson: Preemption is controlled resilience — ensuring important workloads run while maintaining order.
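Two knobs can help keep preemption contained. The sketch below shows a PodDisruptionBudget for a lower-priority workload and a PriorityClass that jumps the scheduling queue without evicting anything. All names, labels, and numbers are hypothetical.

```yaml
# Asks the scheduler to keep at least 2 matching Pods running when it
# picks preemption victims. This is honored on a best-effort basis only.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-workers-pdb          # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: batch-workers           # hypothetical label
---
# Pods in this class are queued ahead of lower-priority Pods,
# but never preempt running Pods to make room.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-no-preempt   # hypothetical name
value: 50000
preemptionPolicy: Never
globalDefault: false
description: High priority in the queue, without preempting other Pods
```

The second class suits workloads that should start early when capacity is free but are not worth disrupting others for.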
⚖️ Scoring & Bin-Packing: Finding the Right Home
Once eligible nodes are filtered, Kubernetes enters the scoring phase to find the best fit.
Plugins involved:
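- NodeResourcesFit (formerly LeastRequestedPriority) → with its default LeastAllocated strategy, favors underutilized nodes; the MostAllocated strategy packs nodes tightly instead.
- NodeResourcesBalancedAllocation (formerly BalancedResourceAllocation) → balances CPU & memory use.
- ImageLocality (formerly ImageLocalityPriority) → prefers nodes with cached images.
- NodeAffinity (formerly NodeAffinityPriority) → honors affinity preferences.
- PodTopologySpread → spreads Pods across zones and other topology domains.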
Each node receives a score (0–100) from multiple plugins.
Weighted scores are combined:
final_score = (w1*s1) + (w2*s2) + ...
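The weights in this formula are set in the scheduler configuration. The sketch below, using the current scheduler-framework plugin names, shows one way a bin-packing profile might look. The profile name and the weight values are hypothetical, and it assumes a scheduler version that accepts the `kubescheduler.config.k8s.io/v1` API and lets a default plugin's weight be overridden by listing it under `score.enabled`.

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: bin-packing-scheduler   # hypothetical profile name
    pluginConfig:
      - name: NodeResourcesFit
        args:
          scoringStrategy:
            # MostAllocated scores fuller nodes higher (bin-packing);
            # the default, LeastAllocated, spreads Pods out instead.
            type: MostAllocated
            resources:
              - name: cpu
                weight: 1
              - name: memory
                weight: 1
    plugins:
      score:
        enabled:
          # Illustrative weights: these are the w1, w2 of the formula.
          - name: NodeResourcesFit
            weight: 2
          - name: ImageLocality
            weight: 1
```

With these illustrative weights, a node that scores 60 from NodeResourcesFit and 40 from ImageLocality would contribute 2*60 + 1*40 = 160 from those two plugins.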
QoS defines survivability.
Priority defines importance.
Scoring defines placement.
Together, they shape a stable and efficient cluster.
🧩 Visual Flow: Kubernetes Scheduling & Bin-Packing
🧠 Key Lessons for SREs & Platform Teams
✅ Always define CPU/memory requests & limits.
✅ Use PriorityClasses sparingly.
✅ Test evictions under simulated stress.
✅ Combine QoS + PDB + Priority for controlled resilience.
✅ Observe scheduling metrics (kube_pod_status_phase, scheduler_pending_pods, scheduler_schedule_attempts_total, scheduler_preemption_attempts_total) regularly.
🚀 Takeaway
Kubernetes doesn't just schedule Pods — it negotiates priorities.
Reliability doesn't come from overprovisioning, but from predictable, fair, and disciplined scheduling.
Resilience = Consistency in scheduling decisions.