Kubernetes: Even Distribution of Pods Across Cluster Nodes


Introduction

Managing Pod distribution across a cluster is hard. Kubernetes' Pod affinity and anti-affinity features allow some control over Pod placement, but they only cover part of the Pod distribution use cases.

There is a common need to distribute the Pods evenly across the cluster for high availability and efficient cluster resource utilization.

The PodTopologySpread scheduling plugin was designed to fill that gap, and it has been stable since Kubernetes v1.19.

Source: Pod Topology Spread Constraints

In this article, I’ll show an example of using the topology spread constraints feature of Kubernetes to distribute a Pod workload across the cluster nodes as evenly as possible.


Part 1. Spin Multi-node Kubernetes Cluster

If you already have a Kubernetes cluster with three or more worker nodes, you can skip this cluster setup part.

I’ll be using an awesome tool called kind to spin up a local Kubernetes cluster using Docker containers as “nodes”.

By default, when creating a multi-node cluster via kind, it doesn’t assign a unique hostname to each worker node (very unkind 😄).

First, create a directory called hostnames containing a file for each worker with its unique hostname.

$ mkdir hostnames
$ echo 'worker-1' > hostnames/worker-1
$ echo 'worker-2' > hostnames/worker-2
$ echo 'worker-3' > hostnames/worker-3

Now, save the kind cluster config shown below, which creates a Kubernetes cluster consisting of 1 control plane (master) node and 3 workers. The config also defines a mount per worker to set its unique hostname.

$ cat > unkind-config.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
    extraMounts:
    - hostPath: hostnames/worker-1
      containerPath: /etc/hostname
  - role: worker
    extraMounts:
    - hostPath: hostnames/worker-2
      containerPath: /etc/hostname
  - role: worker
    extraMounts:
    - hostPath: hostnames/worker-3
      containerPath: /etc/hostname
EOF

Finally, spin up the Kubernetes cluster as such:

$ kind create cluster --config unkind-config.yaml

The output should be similar to what's shown below:

Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.21.1) 🖼
 ✓ Preparing nodes 📦 📦 📦 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
 ✓ Joining worker nodes 🚜
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Thanks for using kind! 😊

Now, verify the cluster is up and running:

$ kubectl get nodes

The output should be similar to what's shown below:

NAME                 STATUS   ROLES                  AGE     VERSION
kind-control-plane   Ready    control-plane,master   3m29s   v1.21.1
worker-1             Ready    <none>                 2m58s   v1.21.1
worker-2             Ready    <none>                 2m58s   v1.21.1
worker-3             Ready    <none>                 2m58s   v1.21.1

We’re now ready to play around with the cluster!


Part 2. Distribute Pods Evenly Across The Cluster

The topology spread constraints rely on node labels to identify the topology domain(s) that each worker Node is in.

To distribute Pods across all cluster worker nodes in an absolutely even manner, we can use the well-known node label kubernetes.io/hostname as the topology key, which ensures each worker node is its own topology domain.
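You can confirm that each node carries this label; the -L flag adds a column showing its value. The output below is illustrative, and the ages will differ on your cluster:

$ kubectl get nodes -L kubernetes.io/hostname
NAME                 STATUS   ROLES                  AGE   VERSION   HOSTNAME
kind-control-plane   Ready    control-plane,master   5m    v1.21.1   kind-control-plane
worker-1             Ready    <none>                 5m    v1.21.1   worker-1
worker-2             Ready    <none>                 5m    v1.21.1   worker-2
worker-3             Ready    <none>                 5m    v1.21.1   worker-3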

In the manifest below, we define a Deployment with 3 replicas that assigns the label type=dummy to each Pod, along with a topologySpreadConstraints entry that acts on Pods carrying that label.

The topologySpreadConstraints field in the Pod template's spec is defined as:

  • maxSkew: 1 — the Pod count may differ by at most 1 between any two topology domains, i.e. distribute Pods as evenly as possible
  • topologyKey: kubernetes.io/hostname — use the node hostname as the topology domain
  • whenUnsatisfiable: ScheduleAnyway — still schedule Pods even when the even-distribution constraint can’t be satisfied
  • labelSelector — only act on Pods that match this selector

Finally, the Pods run a container image called pause that does absolutely nothing! 😃

apiVersion: v1
kind: Namespace
metadata:
  name: dummy
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dummy
  namespace: dummy
spec:
  replicas: 3
  selector:
    matchLabels:
      type: dummy
  template:
    metadata:
      labels:
        type: dummy
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              type: dummy    
      containers:
      - name: pause
        image: k8s.gcr.io/pause:3.1

Now, let’s apply the manifest:

$ kubectl apply -f dummy-deployment.yaml
namespace/dummy created
deployment.apps/dummy created

And verify that Pod placement is balanced across all worker nodes:

$ kubectl -n dummy get pods -o wide --sort-by=.spec.nodeName
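The Pod names, IPs, and ages below are illustrative and will differ on your cluster, but you should see exactly one Pod per worker node:

NAME                     READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
dummy-6bc95dd48b-xxxxx   1/1     Running   0          30s   10.244.1.2   worker-1   <none>           <none>
dummy-6bc95dd48b-yyyyy   1/1     Running   0          30s   10.244.2.2   worker-2   <none>           <none>
dummy-6bc95dd48b-zzzzz   1/1     Running   0          30s   10.244.3.2   worker-3   <none>           <none>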

As we can see from the output, the Pods are scheduled evenly, one each on worker-1, worker-2, and worker-3.

We can further scale the Deployment up to 30 replicas and validate the Pod distribution as we scale.

$ kubectl -n dummy scale deploy/dummy --replicas 30

After scaling up, the Pods remain evenly distributed across all cluster nodes; with 30 replicas on 3 workers, each node ends up with 10 Pods. #awesomeness
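One quick way to check the per-node Pod count is to group the NODE column of the wide output (a sketch; any equivalent counting command works). With an even spread, the expected result is:

$ kubectl -n dummy get pods -o wide --no-headers | awk '{print $7}' | sort | uniq -c
     10 worker-1
     10 worker-2
     10 worker-3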


Conclusion

The PodTopologySpread scheduling plugin gives Kubernetes administrators the power to achieve high availability for applications as well as efficient utilization of cluster resources.

Known Limitations:

  • Scaling down a Deployment does not take the spread constraints into account and may leave the Pod distribution imbalanced. You can use the Descheduler to rebalance the Pods; see the policy sketch below.
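As a rough sketch, a Descheduler policy that evicts Pods violating their topology spread constraints could look like this (the strategy name comes from the Descheduler's v1alpha1 policy API; check the Descheduler docs for the format matching your version):

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemovePodsViolatingTopologySpreadConstraint":
    enabled: true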

