Srinivasaraju Tangella

Stop Random Pod Scheduling: Master Kubernetes Affinity & Anti-Affinity with NGINX (Practical Guide for DevOps & SRE)

When you don’t control where pods land in Kubernetes, you leak performance, reliability, and even cost. The default scheduler is good—but it’s not telepathic. It has no idea that your frontend wants to be near Redis or that your three replicas shouldn’t be sitting on the same node waiting to be killed together.
That’s exactly why Pod Affinity and Pod Anti-Affinity exist.

In this article, we’ll break down both concepts in a way that makes sense for real production clusters and then prove scheduling behavior with NGINX deployments.

What Is Pod Affinity? (Keep Pods Close)

Pod Affinity instructs Kubernetes to place pods close to other pods that match certain labels.

Think about:
App → Cache (low latency)
API → DB (chatty traffic)
Microservice → Microservice (tight coupling)

When two components constantly talk over the network, co-locating them eliminates cross-node hops and sometimes cross-zone cloud charges.
Common use cases:

Reduce inter-node network latency
Co-locate with Redis / Memcached / Kafka controllers
Keep components in the same failure domain

What Is Pod Anti-Affinity? (Keep Pods Apart)

Pod Anti-Affinity does the opposite. It tells Kubernetes:
“Don’t crowd pods from this group on the same node.”

This is critical for:

high availability (HA)
fault isolation
failure domain control

If you run three replicas of your NGINX ingress and they all land on a single node, that node becomes a single point of failure. Anti-Affinity fixes that.

Hard vs Soft Rules

There are two modes:

Hard

requiredDuringSchedulingIgnoredDuringExecution
The rule must be satisfied. If no node qualifies, the pod stays Pending.

Soft

preferredDuringSchedulingIgnoredDuringExecution
Best effort. The scheduler tries to honor the rule but won't block placement.

In production, HA components generally use hard Anti-Affinity.
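
For reference, a soft rule carries a weight (1–100) that feeds into the scheduler's scoring. A minimal sketch, assuming pods labeled app: nginx that we would prefer to spread across nodes:

Yaml

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: nginx
        topologyKey: kubernetes.io/hostname

Here the scheduler prefers to place app: nginx pods on different nodes, but will still co-locate them if no other node fits.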

Deploying Pod Affinity with NGINX (Proof: Pods Stay Close)

For demonstration, we’ll co-locate NGINX with Redis.

Step 1: Deploy Redis

Yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis

Step 2: Deploy NGINX with Pod Affinity

Yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-affinity
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-affinity
  template:
    metadata:
      labels:
        app: nginx-affinity
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: redis
            topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx
        image: nginx

Verify Placement

kubectl get pods -o wide
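
If the cluster is running other workloads, the same check is easier to read when filtered by the labels used above:

kubectl get pods -l 'app in (redis,nginx-affinity)' -o wide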

Expected output style:

redis-7dfc6 Running node1
nginx-affinity-fc77 Running node1
nginx-affinity-d6b8 Running node1

Proof achieved: the scheduler co-located the NGINX pods on the same node as Redis to reduce latency.

Deploying Pod Anti-Affinity with NGINX (Proof: Pods Stay Apart)

Now let’s use 3 replicas and force them to spread for HA.

Deployment YAML

Yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-anti
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-anti
  template:
    metadata:
      labels:
        app: nginx-anti
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: nginx-anti
            topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx
        image: nginx

Verify Placement

kubectl get pods -o wide

Expected output:

nginx-anti-1 Running node1
nginx-anti-2 Running node2
nginx-anti-3 Running node3

Proof achieved: the replicas are spread across nodes, so no single node failure can take the service down completely.
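
A side effect of the hard rule worth knowing: on a three-node cluster, a fourth replica has no node left that satisfies the constraint, so it should sit in Pending. You can see this by scaling up (assuming three schedulable nodes):

kubectl scale deployment nginx-anti --replicas=4
kubectl get pods -l app=nginx-anti

Scaling back to 3 clears the Pending pod.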

Why DevOps & SRE Actually Use This

This isn’t theoretical. You will use Affinity/Anti-Affinity frequently if you operate any non-toy cluster.
Real industry scenarios include:

For performance

API close to Redis or Memcached
Kafka brokers near Zookeeper
Elasticsearch hot nodes near collectors

For reliability

NGINX ingress replicas spread across nodes
Prometheus and Alertmanager separated
Kafka brokers in separate availability zones

For cost

Prevent cross-zone data transfer charges (AWS/GCP)

For compliance

Keep workloads within geographic or legal boundaries

This is what separates “can deploy YAML” engineers from real systems engineers.

Topology and Failure Domain Awareness

Affinity only becomes truly valuable when you incorporate topology. The topologyKey can be any node label, which lets you express failure domains such as:
node (hostname)
rack
datacenter
zone
region

Cloud vendor defaults include:

topology.kubernetes.io/zone
topology.kubernetes.io/region

Real systems use these to ensure brokers like Kafka or MongoDB never share the same zone, because losing a zone must not kill quorum.
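
As a sketch, the anti-affinity rule from earlier spreads replicas across zones instead of nodes just by swapping the topologyKey (assuming nodes carry the standard zone label):

Yaml

podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchLabels:
        app: nginx-anti
    topologyKey: topology.kubernetes.io/zone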

Topology Spread Constraints (The Modern Alternative)

Kubernetes also provides a more generalized mechanism: Topology Spread Constraints.
These constraints let you declare that pods should be evenly distributed across topology domains, without expressing the rule relative to another workload's pods.

Example:

Yaml

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: nginx

This makes Kubernetes behave like a distributed systems scheduler instead of a simple bin-packer.

Scheduler Strategy for Distributed Systems

Affinity and anti-affinity patterns become essential in stateful or quorum-based systems like:

Kafka brokers
Zookeeper
Etcd
MongoDB
Cassandra
Elasticsearch masters
Consul
TiDB
Vitess

These systems are built around failure domains and must survive node or zone loss. Anti-affinity enforces quorum safety. Affinity often co-locates controllers near storage engines.

What Real DevOps & SRE Teams Actually Do

Once you enter real production, affinity is used alongside:

multi-AZ deployments
cluster autoscalers
pod disruption budgets
node taints and tolerations
resource quotas
priority classes

Affinity does not exist in isolation. It is part of expressing system intent to schedulers.
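
As one example of how these pieces combine, a PodDisruptionBudget stops voluntary disruptions (node drains, upgrades) from evicting too many of the replicas that Anti-Affinity just spread out. A minimal sketch for the nginx-anti Deployment above (the nginx-anti-pdb name is illustrative):

Yaml

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-anti-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: nginx-anti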

Understanding the Scheduler Internals

Kubernetes performs two phases during scheduling:

Filtering: nodes that violate hard rules are excluded.

Scoring: the remaining nodes are ranked based on:

availability
resource pressure
topology
affinity/anti-affinity weights

The node with the highest score wins. Hard rules gate placement; soft rules modulate scoring.

This makes Kubernetes schedulers deterministic under constraints and intelligent under preference.
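
To make the two phases concrete, here is a sketch (reusing the nginx-anti and redis labels from earlier) that pairs a hard anti-affinity term, which gates which nodes are considered at all, with a weighted soft affinity term, which only nudges the scoring:

Yaml

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:    # filtering: hard gate
    - labelSelector:
        matchLabels:
          app: nginx-anti
      topologyKey: kubernetes.io/hostname
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:   # scoring: soft preference
    - weight: 50
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: redis
        topologyKey: topology.kubernetes.io/zone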

Key Takeaways

Pod Affinity = keep pods together for performance
Pod Anti-Affinity = keep pods apart for HA
Hard rules block scheduling; soft rules influence but don't block
Verified behavior using NGINX deployments
Useful for DevOps, SRE, platform and infra engineers running real workloads
