<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Emilio Forrer</title>
    <description>The latest articles on DEV Community by Emilio Forrer (@emilioforrer).</description>
    <link>https://dev.to/emilioforrer</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F809710%2Fbf9eb977-4ba0-4af2-9a85-04591ce4633e.jpeg</url>
      <title>DEV Community: Emilio Forrer</title>
      <link>https://dev.to/emilioforrer</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/emilioforrer"/>
    <language>en</language>
    <item>
      <title>Kubernetes Inter-pod anti-affinity and de-schedule</title>
      <dc:creator>Emilio Forrer</dc:creator>
      <pubDate>Fri, 01 Mar 2024 05:00:58 +0000</pubDate>
      <link>https://dev.to/emilioforrer/kubernetes-inter-pod-anti-affinity-and-de-schedule-500o</link>
      <guid>https://dev.to/emilioforrer/kubernetes-inter-pod-anti-affinity-and-de-schedule-500o</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Before talking about affinity and anti-affinity inside a Kubernetes cluster, let's first understand what Kubernetes is. Kubernetes is a platform for managing and orchestrating container-based workloads and services, and it offers many features such as auto-scaling (vertical and horizontal), container replicas, and secret management.&lt;/p&gt;

&lt;p&gt;Kubernetes also offers two important scheduling features that can be configured to control how pods are placed on nodes. Those features are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Node affinity&lt;/strong&gt;: This is similar to &lt;code&gt;nodeSelector&lt;/code&gt;, with the difference that the language is more expressive and you can create rules that are not &lt;strong&gt;hard requirements&lt;/strong&gt; but rather &lt;strong&gt;soft/preferred&lt;/strong&gt; rules, meaning that the scheduler will still be able to schedule your pod even if the rules cannot be met (see the sketch after this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inter-pod affinity and anti-affinity&lt;/strong&gt;: These allow you to define rules that constrain which nodes your pod is eligible to be scheduled on, based on the labels of pods already running on a node rather than the labels of the node itself.&lt;/li&gt;
&lt;/ul&gt;
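
&lt;p&gt;For reference, a minimal sketch of a preferred node affinity rule looks like this (the &lt;code&gt;disktype: ssd&lt;/code&gt; label is just a hypothetical example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Hypothetical example: prefer (but do not require) nodes labeled disktype=ssd
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      preference:
        matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;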

&lt;p&gt;The focus of this post will be on &lt;strong&gt;Inter-pod anti-affinity&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How to deploy an application with a rule that tells the scheduler to prefer placing a pod away from nodes that already run a pod with the same labels as the one being scheduled (as with multiple replicas of the same app).&lt;/li&gt;
&lt;li&gt;How to fix a real-world edge case that can make your pods get stuck on the same node, even if you have specified a pod anti-affinity rule.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So let's get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating a local multi-node cluster
&lt;/h2&gt;

&lt;p&gt;To create a local multi-node cluster in our machine, we will be using &lt;a href="https://kind.sigs.k8s.io/" rel="noopener noreferrer"&gt;Kind&lt;/a&gt;, so let's go ahead and follow the &lt;a href="https://kind.sigs.k8s.io/docs/user/quick-start/#installation" rel="noopener noreferrer"&gt;installation guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Once installed, let's create a configuration file (&lt;code&gt;kind-config.yaml&lt;/code&gt;), specifying a cluster with 1 control-plane and 3 worker nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Cluster&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kind.x-k8s.io/v1alpha4&lt;/span&gt;
&lt;span class="c1"&gt;# One control plane node and three "workers".&lt;/span&gt;
&lt;span class="na"&gt;nodes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;control-plane&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;worker&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;worker&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;worker&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's create a cluster by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kind create cluster &lt;span class="nt"&gt;--name&lt;/span&gt; k8s-playground &lt;span class="nt"&gt;--config&lt;/span&gt; kind-config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: It may take a few minutes, depending on your computer's resources.&lt;/p&gt;

&lt;p&gt;Let's check that the cluster was created and has the right nodes. To do this, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see output similar to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;NAME                           STATUS   ROLES                  AGE   VERSION
k8s-playground-control-plane   Ready    control-plane,master   51s   v1.21.1
k8s-playground-worker          Ready    &amp;lt;none&amp;gt;                 25s   v1.21.1
k8s-playground-worker2         Ready    &amp;lt;none&amp;gt;                 25s   v1.21.1
k8s-playground-worker3         Ready    &amp;lt;none&amp;gt;                 25s   v1.21.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, create a new deployment file (&lt;code&gt;deployment.yaml&lt;/code&gt;) for a Deployment named &lt;code&gt;demo-app&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;demo-app&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;demo-app&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;demo-app&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# These are the the Pod labels&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;demo-app&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;affinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;podAntiAffinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;preferredDuringSchedulingIgnoredDuringExecution&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;weight&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;
            &lt;span class="na"&gt;podAffinityTerm&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;labelSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;matchExpressions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# The key and value of the label that you will match against&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app&lt;/span&gt;
                  &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;In&lt;/span&gt;
                  &lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;demo-app&lt;/span&gt; &lt;span class="c1"&gt;# In this example we are matching against the same lables as the pod label&lt;/span&gt;
              &lt;span class="na"&gt;topologyKey&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kubernetes.io/hostname&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginxdemos/hello&lt;/span&gt;
        &lt;span class="na"&gt;imagePullPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Always&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;hello&lt;/span&gt;
        &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's apply the deployment&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; deployment.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run &lt;code&gt;kubectl get pods -o wide&lt;/code&gt; to see the running pods&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE                     NOMINATED NODE   READINESS GATES
demo-app-99d479bc9-w6f6p   1/1     Running   0          4s    10.244.1.7   k8s-playground-worker3   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xhfj8   1/1     Running   0          4s    10.244.3.6   k8s-playground-worker    &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xwsfk   1/1     Running   0          4s    10.244.2.8   k8s-playground-worker2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, Kubernetes preferred to place each pod on a node that did not already have an instance of the app running.&lt;/p&gt;

&lt;p&gt;What happens if we have more replicas than nodes? Well, let's see. Run the following command to scale the deployment to 5 replicas:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl scale deployment demo-app &lt;span class="nt"&gt;--replicas&lt;/span&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now run &lt;code&gt;kubectl get pods -o wide&lt;/code&gt;; the output should be similar to this one&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;NAME                       READY   STATUS    RESTARTS   AGE    IP           NODE                     NOMINATED NODE   READINESS GATES
demo-app-99d479bc9-blmdm   1/1     Running   0          14s    10.244.1.8   k8s-playground-worker3   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-td9rk   1/1     Running   0          14s    10.244.3.7   k8s-playground-worker    &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-w6f6p   1/1     Running   0          5m5s   10.244.1.7   k8s-playground-worker3   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xhfj8   1/1     Running   0          5m5s   10.244.3.6   k8s-playground-worker    &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xwsfk   1/1     Running   0          5m5s   10.244.2.8   k8s-playground-worker2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, since we are using &lt;code&gt;preferredDuringSchedulingIgnoredDuringExecution&lt;/code&gt;, Kubernetes &lt;strong&gt;"preferred"&lt;/strong&gt; to place the other 2 replicas on nodes that already had pods of the same app running, since there was no empty node left to schedule them to.&lt;/p&gt;
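
&lt;p&gt;For comparison, if we had used the hard &lt;code&gt;requiredDuringSchedulingIgnoredDuringExecution&lt;/code&gt; variant, the 2 extra replicas would have stayed &lt;code&gt;Pending&lt;/code&gt; instead, since no node without a &lt;code&gt;demo-app&lt;/code&gt; pod was left. A minimal sketch of that variant (note that required terms are listed directly, without a weight):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - demo-app
      topologyKey: kubernetes.io/hostname
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;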

&lt;h2&gt;
  
  
  Taking down nodes
&lt;/h2&gt;

&lt;p&gt;What happens if a node goes down? Well, let's find out.&lt;/p&gt;

&lt;p&gt;Let's drain the node &lt;code&gt;k8s-playground-worker3&lt;/code&gt; to simulate that the node went down.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl drain k8s-playground-worker3 &lt;span class="nt"&gt;--ignore-daemonsets&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we run &lt;code&gt;kubectl get pods -o wide&lt;/code&gt;, we can see that all pods got rescheduled on nodes &lt;code&gt;k8s-playground-worker&lt;/code&gt; and &lt;code&gt;k8s-playground-worker2&lt;/code&gt;, since &lt;code&gt;k8s-playground-worker3&lt;/code&gt; went down.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;NAME                       READY   STATUS    RESTARTS   AGE     IP           NODE                     NOMINATED NODE   READINESS GATES
demo-app-99d479bc9-2ztpg   1/1     Running   0          2m11s   10.244.2.9   k8s-playground-worker2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-c6pfn   1/1     Running   0          2m11s   10.244.3.9   k8s-playground-worker    &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-td9rk   1/1     Running   0          20m     10.244.3.7   k8s-playground-worker    &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xhfj8   1/1     Running   0          25m     10.244.3.6   k8s-playground-worker    &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xwsfk   1/1     Running   0          25m     10.244.2.8   k8s-playground-worker2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's drain the node &lt;code&gt;k8s-playground-worker2&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl drain k8s-playground-worker2 &lt;span class="nt"&gt;--ignore-daemonsets&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we run &lt;code&gt;kubectl get pods -o wide&lt;/code&gt;, we can see, as expected, that all the pods are running only on the node &lt;code&gt;k8s-playground-worker&lt;/code&gt;, since there is no other schedulable node in the cluster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;NAME                       READY   STATUS    RESTARTS   AGE    IP            NODE                    NOMINATED NODE   READINESS GATES
demo-app-99d479bc9-87pwh   1/1     Running   0          47s    10.244.3.11   k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-c6pfn   1/1     Running   0          7m5s   10.244.3.9    k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-kvwq7   1/1     Running   0          47s    10.244.3.10   k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-td9rk   1/1     Running   0          25m    10.244.3.7    k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xhfj8   1/1     Running   0          30m    10.244.3.6    k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Restoring the drained nodes
&lt;/h2&gt;

&lt;p&gt;What happens if a node goes back online?&lt;/p&gt;

&lt;p&gt;Let's see. Run the following commands to uncordon the nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl uncordon k8s-playground-worker2
kubectl uncordon k8s-playground-worker3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait a moment, then run the following command to see if all nodes are back online.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; wide
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It should print output similar to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;NAME                           STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION                      CONTAINER-RUNTIME
k8s-playground-control-plane   Ready    control-plane,master   64m   v1.21.1   172.29.0.5    &amp;lt;none&amp;gt;        Ubuntu 21.04   5.10.60.1-microsoft-standard-WSL2   containerd://1.5.2
k8s-playground-worker          Ready    &amp;lt;none&amp;gt;                 64m   v1.21.1   172.29.0.2    &amp;lt;none&amp;gt;        Ubuntu 21.04   5.10.60.1-microsoft-standard-WSL2   containerd://1.5.2
k8s-playground-worker2         Ready    &amp;lt;none&amp;gt;                 64m   v1.21.1   172.29.0.4    &amp;lt;none&amp;gt;        Ubuntu 21.04   5.10.60.1-microsoft-standard-WSL2   containerd://1.5.2
k8s-playground-worker3         Ready    &amp;lt;none&amp;gt;                 64m   v1.21.1   172.29.0.3    &amp;lt;none&amp;gt;        Ubuntu 21.04   5.10.60.1-microsoft-standard-WSL2   containerd://1.5.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see that all the nodes are back online!&lt;/p&gt;

&lt;h2&gt;
  
  
  What happened to the pods?
&lt;/h2&gt;

&lt;p&gt;Run &lt;code&gt;kubectl get pods -o wide&lt;/code&gt;, to list the pods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;NAME                       READY   STATUS    RESTARTS   AGE   IP            NODE                    NOMINATED NODE   READINESS GATES
demo-app-99d479bc9-87pwh   1/1     Running   0          16m   10.244.3.11   k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-c6pfn   1/1     Running   0          22m   10.244.3.9    k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-kvwq7   1/1     Running   0          16m   10.244.3.10   k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-td9rk   1/1     Running   0          41m   10.244.3.7    k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xhfj8   1/1     Running   0          46m   10.244.3.6    k8s-playground-worker   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What?!&lt;/strong&gt; All pods are still running on node &lt;code&gt;k8s-playground-worker&lt;/code&gt;, even though all the other nodes are back online!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What does this mean?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If node &lt;code&gt;k8s-playground-worker&lt;/code&gt; goes down, our application will have downtime while the pods are rescheduled to the other nodes, since all the pods are on the same node.&lt;/li&gt;
&lt;li&gt;We have lost high availability (HA) in our cluster for that app, even though multiple nodes are up and running.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The issue
&lt;/h2&gt;

&lt;p&gt;What happened is that the inter-pod anti-affinity mechanism is &lt;strong&gt;only relevant during scheduling&lt;/strong&gt;; as the &lt;code&gt;IgnoredDuringExecution&lt;/code&gt; part of the rule name indicates, once a pod is running, the rules are not re-evaluated. To apply the rules again, you need to recreate the pod.&lt;/p&gt;
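
&lt;p&gt;A quick manual workaround is to recreate the pods yourself, for example with a rolling restart, so the scheduler evaluates the rules again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl rollout restart deployment demo-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This works, but it is manual. Ideally, the redistribution should happen automatically.&lt;/p&gt;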

&lt;h2&gt;
  
  
  The solution
&lt;/h2&gt;

&lt;p&gt;To fix this, we need something that watches for node changes, re-evaluates the rules, and de-schedules pods so the workload is redistributed according to the rule specification.&lt;/p&gt;

&lt;p&gt;Luckily there is already a tool that does that. It is called &lt;a href="https://github.com/kubernetes-sigs/descheduler" rel="noopener noreferrer"&gt;Descheduler&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let's install the &lt;a href="https://github.com/kubernetes-sigs/descheduler/blob/master/charts/descheduler/README.md" rel="noopener noreferrer"&gt;Helm Chart&lt;/a&gt; by running the following commands.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add descheduler https://kubernetes-sigs.github.io/descheduler/
helm &lt;span class="nb"&gt;install &lt;/span&gt;descheduler &lt;span class="nt"&gt;--namespace&lt;/span&gt; kube-system descheduler/descheduler
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
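
&lt;p&gt;Under the hood, the chart runs the descheduler periodically with a default policy that enables several strategies. A minimal sketch of a standalone policy (using the &lt;code&gt;descheduler/v1alpha1&lt;/code&gt; API of this era) that enables the two strategies most relevant to our scenario could look like this; note this is an illustrative sketch, not the chart's exact defaults:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  # Evicts extra pods of the same ReplicaSet that ended up on a single node
  RemoveDuplicates:
    enabled: true
  # Evicts pods that violate inter-pod anti-affinity rules
  RemovePodsViolatingInterPodAntiAffinity:
    enabled: true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;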



&lt;p&gt;And that's all; we only need to wait a few minutes for it to take effect. Run &lt;code&gt;kubectl get pods -o wide&lt;/code&gt; to watch the changes in the pods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;NAME                       READY   STATUS    RESTARTS   AGE   IP            NODE                     NOMINATED NODE   READINESS GATES
demo-app-99d479bc9-9vxbr   1/1     Running   0          45s   10.244.2.12   k8s-playground-worker2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-s95f8   1/1     Running   0          45s   10.244.1.11   k8s-playground-worker3   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-td9rk   1/1     Running   0          91m   10.244.3.7    k8s-playground-worker    &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xgjgz   1/1     Running   0          45s   10.244.2.11   k8s-playground-worker2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
demo-app-99d479bc9-xhfj8   1/1     Running   0          96m   10.244.3.6    k8s-playground-worker    &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, the anti-affinity rules got reapplied and the pods were rescheduled onto different nodes again. High availability for your application has been restored!&lt;/p&gt;
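
&lt;p&gt;To double-check the spread at a glance, you can count the pods per node with a quick one-liner (NODE is the seventh column of the wide output):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get pods -o wide --no-headers | awk '{print $7}' | sort | uniq -c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;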

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: &lt;a href="https://github.com/kubernetes-sigs/descheduler" rel="noopener noreferrer"&gt;Descheduler&lt;/a&gt; has a lot more options, but that's a story for another post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;We have learned how to create an application with anti-affinity rules that spread its pods across nodes, avoiding scheduling all replicas of the app on the same node and improving high availability. We also learned that some edge cases can make our apps lose high availability and introduce downtime, and that tools like &lt;a href="https://github.com/kubernetes-sigs/descheduler" rel="noopener noreferrer"&gt;Descheduler&lt;/a&gt; can help us overcome those issues.&lt;/p&gt;

&lt;p&gt;If you want to learn more about how to assign pods to nodes, you can check the official documentation &lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Thanks for taking the time to read this article. See you in the next one!&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
