<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Javier Martínez</title>
    <description>The latest articles on DEV Community by Javier Martínez (@javicps).</description>
    <link>https://dev.to/javicps</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F905745%2F290f42a9-9755-4168-a20f-7e2ebb003e42.jpg</url>
      <title>DEV Community: Javier Martínez</title>
      <link>https://dev.to/javicps</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/javicps"/>
    <language>en</language>
    <item>
      <title>Top metrics for Elasticsearch monitoring with Prometheus</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Tue, 09 May 2023 08:18:13 +0000</pubDate>
      <link>https://dev.to/sysdig/top-metrics-for-elasticsearch-monitoring-with-prometheus-3pca</link>
      <guid>https://dev.to/sysdig/top-metrics-for-elasticsearch-monitoring-with-prometheus-3pca</guid>
      <description>&lt;p&gt;Starting the journey for Elasticsearch monitoring is crucial to get the right visibility and transparency over its behavior.&lt;/p&gt;

&lt;p&gt;Elasticsearch is one of the most widely used &lt;strong&gt;search and analytics engines&lt;/strong&gt;. It provides both scalability and redundancy to deliver highly available search. As of 2023, more than sixty thousand companies of all sizes and backgrounds use it as their search solution to track a diverse range of data, such as analytics, logs, or business information.&lt;/p&gt;

&lt;p&gt;By distributing data as JSON documents and indexing that data across several shards, Elasticsearch provides high availability, fast search, and redundancy.&lt;/p&gt;

&lt;p&gt;In this article, we will evaluate the most important Prometheus metrics provided by the Elasticsearch exporter.&lt;/p&gt;

&lt;p&gt;You will learn the main areas to focus on when monitoring an Elasticsearch system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start monitoring Elasticsearch with Prometheus.&lt;/li&gt;



&lt;li&gt;How to monitor Golden Signals.&lt;/li&gt;



&lt;li&gt;How to monitor infra metrics.&lt;/li&gt;



&lt;li&gt;How to monitor index performance.&lt;/li&gt;



&lt;li&gt;How to monitor search performance.&lt;/li&gt;



&lt;li&gt;How to monitor cluster performance.&lt;/li&gt;



&lt;li&gt;Advanced monitoring and next steps.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="start-monitoring"&gt;How to start monitoring ElasticSearch with Prometheus&lt;/h2&gt;

&lt;p&gt;As usual, the easiest way to start your Prometheus monitoring journey with Elasticsearch is to use &lt;a href="https://promcat.io" rel="noopener nofollow noreferrer"&gt;PromCat.io&lt;/a&gt; to find the best configs, dashboards, and alerts. The &lt;a href="https://promcat.io/apps/elasticsearch#SetupGuide" rel="noopener nofollow noreferrer"&gt;Elasticsearch setup guide in PromCat&lt;/a&gt; includes the Elasticsearch exporter with a series of out-of-the-box metrics that will be automatically scraped by Prometheus. It also includes a collection of curated alerts and dashboards to start monitoring Elasticsearch right away.&lt;/p&gt;
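&lt;p&gt;If you prefer to wire the exporter up manually, the scrape job is a short addition to &lt;code&gt;prometheus.yml&lt;/code&gt;. This is a minimal sketch, assuming the exporter is reachable at the hypothetical address &lt;code&gt;elasticsearch-exporter:9114&lt;/code&gt; (9114 is the exporter’s default port):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;scrape_configs:
  - job_name: "elasticsearch"
    static_configs:
      - targets: ["elasticsearch-exporter:9114"]&lt;/code&gt;&lt;/pre&gt;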

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZfNQKQOA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://sysdig.com/wp-content/uploads/Top-metrics-elasticsearch-image1-1170x383.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZfNQKQOA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://sysdig.com/wp-content/uploads/Top-metrics-elasticsearch-image1-1170x383.png" alt="Top metrics for Elasticsearch - metric list" title="image_tooltip" width="800" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can combine these metrics with the &lt;a href="https://sysdig.com/blog/exporters-and-target-labels/" rel="noopener"&gt;Node Exporter&lt;/a&gt; to get more insights into your infrastructure. Also, if you're running Elasticsearch on Kubernetes, you can &lt;a href="https://sysdig.es/blog/kubernetes-monitoring-prometheus/#kube-state-metrics" rel="noopener nofollow noreferrer"&gt;use KSM and CAdvisor&lt;/a&gt; to combine Kubernetes metrics with Elasticsearch metrics.&lt;/p&gt;

&lt;h2 id="monitor-golden-signals"&gt;How to monitor Golden Signals in Elasticsearch&lt;/h2&gt;

&lt;p&gt;To review a bare minimum of important metrics, remember to check the so-called &lt;a href="https://sysdig.com/blog/golden-signals-kubernetes/" rel="noopener"&gt;Golden Signals&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Errors.&lt;/li&gt;



&lt;li&gt;Traffic.&lt;/li&gt;



&lt;li&gt;Saturation.&lt;/li&gt;



&lt;li&gt;Latency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These represent the essential metrics to watch in a system for black-box monitoring (focusing only on what’s happening in the system, not why). In other words, Golden Signals measure symptoms, not causes. They are a good starting point for creating an Elasticsearch monitoring dashboard.&lt;/p&gt;

&lt;h3&gt;Errors&lt;/h3&gt;

&lt;h5&gt;elasticsearch_cluster_health_status&lt;/h5&gt;

&lt;p&gt;Cluster health in Elasticsearch is measured by the colors green, yellow, and red, as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Green&lt;/strong&gt;: Data integrity is correct, no shard is missing.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Yellow&lt;/strong&gt;: At least one replica shard is unassigned, but data integrity is preserved because the primary shards are available.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;Red&lt;/strong&gt;: A primary shard is missing or unassigned, so some data is unavailable and may be lost.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With &lt;code&gt;elasticsearch_cluster_health_status&lt;/code&gt;, you can quickly check the current situation for Elasticsearch data on a particular cluster. Remember that this won’t tell you the actual cause of the data integrity loss, only that you need to act to prevent further problems.&lt;/p&gt;
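&lt;p&gt;As a sketch, an alert on this metric can be as simple as the following PromQL expressions, assuming the exporter’s usual labeling of one gauge per color:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Cluster has lost a primary shard
elasticsearch_cluster_health_status{color="red"} == 1

# Cluster is running without some replicas
elasticsearch_cluster_health_status{color="yellow"} == 1&lt;/code&gt;&lt;/pre&gt;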

&lt;h3&gt;Traffic&lt;/h3&gt;

&lt;h5&gt;elasticsearch_indices_search_query_total&lt;/h5&gt;

&lt;p&gt;This metric is a counter with the total number of search queries executed, which, as a raw number, won’t tell you much by itself.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Consider using &lt;code&gt;rate()&lt;/code&gt; or &lt;code&gt;irate()&lt;/code&gt; as well, to detect sudden changes or spikes in traffic. Dig deeper into Prometheus queries with our &lt;a href="https://sysdig.com/blog/getting-started-with-promql-cheatsheet/" rel="noreferrer noopener"&gt;Getting started with PromQL guide&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
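&lt;p&gt;For example, a per-second search rate over the last five minutes might look like this (the window is an assumption; tune it to your scrape interval):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;rate(elasticsearch_indices_search_query_total[5m])&lt;/code&gt;&lt;/pre&gt;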

&lt;h3&gt;Saturation&lt;/h3&gt;

&lt;p&gt;For a detailed saturation analysis, check the section on How to monitor Elasticsearch infra metrics.&lt;/p&gt;

&lt;h3&gt;Latency&lt;/h3&gt;

&lt;p&gt;For a detailed latency analysis, check the section on How to monitor Elasticsearch index performance.&lt;/p&gt;

&lt;h2 id="monitor-elasticsearch-infra-metrics"&gt;How to monitor Elasticsearch infra metrics&lt;/h2&gt;

&lt;p&gt;Infrastructure monitoring focuses on tracking the overall performance of the servers and nodes of a system. As with similar cloud applications, most of the effort will be spent monitoring CPU and memory consumption.&lt;/p&gt;

&lt;h3&gt;Monitoring Elasticsearch CPU&lt;/h3&gt;

&lt;h5&gt;elasticsearch_process_cpu_percent&lt;/h5&gt;

&lt;p&gt;This is a gauge metric used to measure the current CPU usage percent (0-100) of the Elasticsearch process. Since chances are that you’re running several Elasticsearch nodes, you will need to track each one separately.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--R1QaaKO---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://sysdig.com/wp-content/uploads/Top-metrics-elasticsearch-image2-1170x553.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--R1QaaKO---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://sysdig.com/wp-content/uploads/Top-metrics-elasticsearch-image2-1170x553.png" alt="Top metrics for Elasticsearch - CPU usage" title="image_tooltip" width="800" height="378"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h5&gt;elasticsearch_indices_store_throttle_time_seconds_total&lt;/h5&gt;

&lt;p&gt;In case you’re using a file system as an index store, you can expect a certain level of delays in input and output operations. This metric represents how much your Elasticsearch index store is being throttled.&lt;/p&gt;

&lt;p&gt;Since this is a counter metric that only accumulates the total number of seconds, consider using &lt;code&gt;rate&lt;/code&gt; or &lt;code&gt;irate&lt;/code&gt; to see how quickly it is changing.&lt;/p&gt;
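&lt;p&gt;A possible query, using a five-minute window as an example, would be:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Seconds of throttling per second; anything consistently above 0 deserves attention
rate(elasticsearch_indices_store_throttle_time_seconds_total[5m])&lt;/code&gt;&lt;/pre&gt;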

&lt;h3&gt;Monitoring Elasticsearch JVM Memory&lt;/h3&gt;

&lt;p&gt;Elasticsearch is based on &lt;a rel="noopener nofollow noreferrer" href="https://lucene.apache.org/"&gt;Lucene&lt;/a&gt;, which is written in Java. This means that monitoring the Java Virtual Machine (JVM) memory is crucial to understanding the current usage of the whole system.&lt;/p&gt;

&lt;h5&gt;elasticsearch_jvm_memory_used_bytes&lt;/h5&gt;

&lt;p&gt;This metric is a gauge that represents the memory usage in bytes for each area.&lt;/p&gt;
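&lt;p&gt;To turn this into a percentage, you can divide it by the maximum heap available. This sketch assumes the exporter also exposes &lt;code&gt;elasticsearch_jvm_memory_max_bytes&lt;/code&gt; with a matching &lt;code&gt;area&lt;/code&gt; label:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;100 * elasticsearch_jvm_memory_used_bytes{area="heap"}
  / elasticsearch_jvm_memory_max_bytes{area="heap"}&lt;/code&gt;&lt;/pre&gt;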

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7PmWYJ4H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://sysdig.com/wp-content/uploads/Top-metrics-elasticsearch-image3-1170x455.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7PmWYJ4H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://sysdig.com/wp-content/uploads/Top-metrics-elasticsearch-image3-1170x455.png" alt="Top metrics for Elasticsearch - Memory used" title="image_tooltip" width="800" height="311"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="monitor-elasticsearch-index-performance"&gt;How to monitor Elasticsearch index performance&lt;/h2&gt;

&lt;p&gt;An index in Elasticsearch is a logical namespace that partitions data. Elasticsearch indexes documents so that they can be retrieved or searched as fast as possible.&lt;/p&gt;

&lt;p&gt;Every time a new index is created, you can define the number of shards and replicas for it:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}&lt;/code&gt;&lt;/pre&gt;
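&lt;p&gt;For instance, you could apply those settings when creating a hypothetical index called &lt;code&gt;my-index&lt;/code&gt; through the REST API (assuming Elasticsearch listens on localhost:9200):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;curl -X PUT "http://localhost:9200/my-index" \
  -H 'Content-Type: application/json' \
  -d '{"settings": {"number_of_shards": 1, "number_of_replicas": 1}}'&lt;/code&gt;&lt;/pre&gt;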

&lt;h5&gt;elasticsearch_indices_indexing_index_time_seconds_total&lt;/h5&gt;

&lt;p&gt;This metric is a counter of the accumulated seconds spent on indexing. It gives you a good approximation of Elasticsearch indexing performance.&lt;/p&gt;

&lt;p&gt;Note that you can divide this metric by &lt;code&gt;elasticsearch_indices_indexing_index_total&lt;/code&gt; in order to get the average indexing time per operation.&lt;/p&gt;
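&lt;p&gt;In PromQL, dividing the rates of the two counters gives the recent average rather than an all-time average; the five-minute window here is just an example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;rate(elasticsearch_indices_indexing_index_time_seconds_total[5m])
  / rate(elasticsearch_indices_indexing_index_total[5m])&lt;/code&gt;&lt;/pre&gt;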

&lt;h5&gt;elasticsearch_indices_refresh_time_seconds_total&lt;/h5&gt;

&lt;p&gt;For an index to be searchable, Elasticsearch needs a refresh to be executed. This is controlled by the &lt;code&gt;index.refresh_interval&lt;/code&gt; setting, which defaults to one second.&lt;/p&gt;

&lt;p&gt;The metric &lt;code&gt;elasticsearch_indices_refresh_time_seconds_total&lt;/code&gt; is a counter with the total time spent refreshing in Elasticsearch.&lt;/p&gt;

&lt;p&gt;In case you want to measure the average refresh time, you can divide this metric by &lt;code&gt;elasticsearch_indices_refresh_total&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id="monitor-elasticsearch-search-performance"&gt;How to monitor Elasticsearch search performance&lt;/h2&gt;

&lt;p&gt;While Elasticsearch promises near-instant query speed, chances are that in the real world, you may find that this is not the case. The number of shards, the chosen storage solution, or the cache configuration can all impact search performance, and it’s crucial to track the current behavior.&lt;/p&gt;

&lt;p&gt;Additionally, the use of wildcards, joins, or the number of fields being searched will drastically affect the overall processing time of search queries.&lt;/p&gt;

&lt;h5&gt;elasticsearch_indices_search_fetch_time_seconds&lt;/h5&gt;

&lt;p&gt;A counter metric aggregating the total number of seconds spent fetching search results.&lt;/p&gt;

&lt;p&gt;In case you want to retrieve the average fetch time per operation, just divide the result by &lt;code&gt;elasticsearch_indices_search_fetch_total&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id="monitor-elasticsearch-cluster"&gt;How to monitor Elasticsearch cluster performance&lt;/h2&gt;

&lt;p&gt;Apart from the usual cloud requirements, for an Elasticsearch system you will also want to look at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Number of shards.&lt;/li&gt;



&lt;li&gt;Number of replicas.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a rule of thumb, keep fewer than 20 shards per GB of heap space.&lt;/p&gt;

&lt;p&gt;Note as well that it’s suggested to have a separate cluster dedicated to monitoring.&lt;/p&gt;

&lt;h5&gt;elasticsearch_cluster_health_active_shards&lt;/h5&gt;

&lt;p&gt;This metric is a gauge that indicates the number of active shards (both primaries and replicas) across the cluster.&lt;/p&gt;

&lt;h5&gt;elasticsearch_cluster_health_relocating_shards&lt;/h5&gt;

&lt;p&gt;Elasticsearch will dynamically move shards between nodes based on balancing or current usage. With this metric, you can track when this movement is happening.&lt;/p&gt;
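&lt;p&gt;A simple expression to surface ongoing relocation, sketched as an alert condition, could be:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;elasticsearch_cluster_health_relocating_shards &amp;gt; 0&lt;/code&gt;&lt;/pre&gt;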

&lt;h2 id="advanced-monitoring"&gt;Advanced Monitoring&lt;/h2&gt;

&lt;p&gt;Remember that the Prometheus exporter will give you a set of out-of-the-box metrics that are relevant enough to kickstart your monitoring journey. But the real challenge comes when you take the step to create your own custom metrics tailored to your application.&lt;/p&gt;

&lt;h3&gt;REST API&lt;/h3&gt;

&lt;p&gt;Additionally, mind that Elasticsearch &lt;a rel="noopener nofollow noreferrer" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/rest-apis.html"&gt;provides a REST API&lt;/a&gt; that you can query for more fine-grained monitoring.&lt;/p&gt;

&lt;h3&gt;VisualVM&lt;/h3&gt;

&lt;p&gt;The &lt;a rel="noopener nofollow noreferrer" href="https://visualvm.github.io/"&gt;Java VisualVM&lt;/a&gt; project is an advanced dashboard for Memory and CPU monitoring. It features advanced resource visualization, as well as process and thread utilization.&lt;/p&gt;

&lt;h2&gt;Download the Dashboards&lt;/h2&gt;

&lt;p&gt;You can download the dashboards with the metrics seen in this article &lt;a rel="noopener nofollow noreferrer" href="https://promcat.io/apps/elasticsearch"&gt;through the Promcat official page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is a curated selection of the above metrics that can be easily integrated with your Grafana or Sysdig Monitor solution.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--cidTehhY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://sysdig.com/wp-content/uploads/Top-metrics-elasticsearch-image4-1170x739.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--cidTehhY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://sysdig.com/wp-content/uploads/Top-metrics-elasticsearch-image4-1170x739.png" alt="Top metrics for Elasticsearch - Grafana dashboards" title="image_tooltip" width="800" height="505"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Elasticsearch is one of the most important search engines available, featuring high availability, high scalability, and distributed capabilities through redundancy.&lt;/p&gt;

&lt;p&gt;Using the Elasticsearch exporter for Prometheus you can kickstart the monitoring journey in an easy way, by automatically receiving the important metrics directly.&lt;/p&gt;

&lt;p&gt;As with many other applications, CPU and memory are crucial to understanding system saturation. You should be aware of current CPU throttling and the memory handling of the JVM.&lt;/p&gt;

&lt;p&gt;Finally, it’s important to dig deeper into the particularities of Elasticsearch, like indices and search capabilities, to truly understand the challenges of monitoring and visualization.&lt;/p&gt;





</description>
      <category>elasticsearch</category>
      <category>prometheus</category>
      <category>monitoring</category>
      <category>devops</category>
    </item>
    <item>
      <title>Kubernetes CreateContainerConfigError and CreateContainerError</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Thu, 23 Mar 2023 15:58:05 +0000</pubDate>
      <link>https://dev.to/sysdig/kubernetes-createcontainerconfigerror-and-createcontainererror-1o5a</link>
      <guid>https://dev.to/sysdig/kubernetes-createcontainerconfigerror-and-createcontainererror-1o5a</guid>
      <description>&lt;p&gt;CreateContainerConfigError and CreateContainerError are two of the most prevalent Kubernetes errors found in cloud-native applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CreateContainerConfigError&lt;/strong&gt; is an error happening when the &lt;strong&gt;configuration specified for a container in a Pod is not correct&lt;/strong&gt; or is missing a vital part.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CreateContainerError&lt;/strong&gt; is a problem happening&lt;strong&gt; at a later stage&lt;/strong&gt; in the container creation flow. Kubernetes displays this error when it attempts to create the container in the Pod.&lt;/p&gt;

&lt;p&gt;In this article, you will learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is Kubernetes CreateContainerConfigError?&lt;/li&gt;



&lt;li&gt;What is Kubernetes CreateContainerError?&lt;/li&gt;



&lt;li&gt;Kubernetes container creation flow&lt;/li&gt;



&lt;li&gt;Common causes for CreateContainerError and CreateConfigError&lt;/li&gt;



&lt;li&gt;How to troubleshoot both errors&lt;/li&gt;



&lt;li&gt;How to detect both errors in Prometheus&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="what-is-createcontainerconfigerror"&gt;What is CreateContainerConfigError?&lt;/h2&gt;

&lt;p&gt;During the process to start a new container, Kubernetes first tries to generate the configuration for it. In fact, this is handled internally by calling a method called &lt;em&gt;generateContainerConfig&lt;/em&gt;, which will try to retrieve:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Container command and arguments&lt;/li&gt;



&lt;li&gt;Relevant persistent volumes for the container&lt;/li&gt;



&lt;li&gt;Relevant ConfigMaps for the container&lt;/li&gt;



&lt;li&gt;Relevant secrets for the container&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any problem in the elements above will result in a CreateContainerConfigError.  &lt;/p&gt;

&lt;h2 id="what-is-createcontainererror"&gt;What is CreateContainerError?&lt;/h2&gt;

&lt;p&gt;Kubernetes throws a CreateContainerError when there’s a problem in the creation of the container, but unrelated to configuration, like a referenced volume not being accessible or a container name already being used.&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;em&gt;Similar to other problems like &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/" rel="noopener noreferrer"&gt;CrashLoopBackOff&lt;/a&gt;, this article only covers the most common causes, but there are many others depending on your current application.&lt;/em&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;h2&gt;How you can detect CreateContainerConfigError and CreateContainerError&lt;/h2&gt;

&lt;p&gt;You can detect both errors by running &lt;code&gt;kubectl get pods&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;NAME  READY STATUS                     RESTARTS AGE
mypod 0/1   CreateContainerConfigError 0        11m&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As you can see from this output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Pod is not ready: a container has an error.&lt;/li&gt;



&lt;li&gt;There are no restarts: these two errors are not like CrashLoopBackOff, where automatic retries are in place.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="container-creation-flow"&gt;Kubernetes container creation flow&lt;/h2&gt;

&lt;p&gt;In order to understand CreateContainerError and CreateContainerConfigError, we first need to know the exact flow for container creation.&lt;/p&gt;

&lt;p&gt;Kubernetes follows these steps every time a new container needs to be started:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pull the image.&lt;/li&gt;



&lt;li&gt;Generate container configuration.&lt;/li&gt;



&lt;li&gt;Precreate container.&lt;/li&gt;



&lt;li&gt;Create container.&lt;/li&gt;



&lt;li&gt;Pre-start container.&lt;/li&gt;



&lt;li&gt;Start container.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As you can see, steps 2 and 4 are where a CreateContainerConfigError and a CreateContainerError might appear, respectively.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/Createcontainererror-02-1170x464.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FCreatecontainererror-02-1170x464.png" alt="Create container and start container flow"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="common-causes-createcontainerconfigerror"&gt;Common causes for CreateContainerError and CreateContainerConfigError&lt;/h2&gt;

&lt;h3&gt;Not found ConfigMap&lt;/h3&gt;

&lt;p&gt;Kubernetes &lt;a href="https://kubernetes.io/docs/concepts/configuration/configmap/" rel="noopener noreferrer"&gt;ConfigMaps&lt;/a&gt; are a key element to store non-confidential information to be used by Pods as key-value pairs.&lt;/p&gt;

&lt;p&gt;When adding a ConfigMap reference in a Pod, you are effectively indicating that it should retrieve specific data from it. But, if a Pod references a non-existent ConfigMap, Kubernetes will return a CreateContainerConfigError.&lt;/p&gt;
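&lt;p&gt;As an illustration, the following Pod manifest would trigger the error if the referenced ConfigMap does not exist (all names here are hypothetical):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: mycontainer
      image: nginx
      envFrom:
        - configMapRef:
            name: missing-configmap  # not created, so the Pod shows CreateContainerConfigError&lt;/code&gt;&lt;/pre&gt;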

&lt;h3&gt;Not found Secret&lt;/h3&gt;

&lt;p&gt;Secrets are a more secure manner to store sensitive information in Kubernetes. Remember, though, this is just raw data encoded in base64, so it’s not really encrypted, just obfuscated.&lt;/p&gt;

&lt;p&gt;In case a Pod contains a reference to a non-existent secret, Kubelet will throw a CreateContainerConfigError, indicating that necessary data couldn’t be retrieved in order to form container config.&lt;/p&gt;

&lt;h3&gt;Container name already in use&lt;/h3&gt;

&lt;p&gt;While unusual, in some cases a conflict might occur because a particular container name is already in use. Since every Docker container must have a unique name, you will need to either delete the original or rename the new one being created.&lt;/p&gt;

&lt;h2 id="troubleshoot"&gt;How to troubleshoot CreateContainerError and CreateContainerConfigError&lt;/h2&gt;

&lt;p&gt;While the causes for an error in container creation might vary, you can always rely on the following methods to troubleshoot the problem that’s preventing the container from starting.&lt;/p&gt;

&lt;h3&gt;Describe Pods&lt;/h3&gt;

&lt;p&gt;With &lt;code&gt;kubectl describe pod&lt;/code&gt;, you can retrieve the detailed information for the affected Pod and its containers:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Containers:
  mycontainer:
    Container ID:
    Image:          nginx
    Image ID:
    Port:           &amp;lt;none&amp;gt;
    Host Port:      &amp;lt;none&amp;gt;
    State:          Waiting
      Reason:       CreateContainerConfigError
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:  3
---
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      myconfigmap
    Optional:  false&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Get logs from containers&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;kubectl logs&lt;/code&gt; to retrieve the log information from containers in the Pod. Note that for Pods with multiple containers, you need to use the &lt;code&gt;--all-containers&lt;/code&gt; parameter:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Error from server (BadRequest): container "mycontainer" in pod "mypod" is waiting to start: CreateContainerConfigError&lt;/code&gt;&lt;/pre&gt;
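&lt;p&gt;For reference, the commands would look like this for a hypothetical Pod named &lt;code&gt;mypod&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kubectl logs mypod
kubectl logs mypod --all-containers&lt;/code&gt;&lt;/pre&gt;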

&lt;h3&gt;Check the events&lt;/h3&gt;

&lt;p&gt;You can also run &lt;code&gt;kubectl get events&lt;/code&gt; to retrieve all the recent events happening in your Pods. Remember that the &lt;code&gt;kubectl describe pod&lt;/code&gt; command also displays the Pod’s events at the end.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/Createcontainererror-03-1170x586.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FCreatecontainererror-03-1170x586.png" alt="Createcontainerconfig error troubleshooting diagram"&gt;&lt;/a&gt;Terminal windows for the kubectl commands used to troubleshoot a CreateContainerConfigError&lt;/p&gt;



&lt;h2 id="detect-in-prometheus"&gt;How to detect CreateContainerConfigError and CreateContainerError in Prometheus&lt;/h2&gt;

&lt;p&gt;When using Prometheus + kube-state-metrics, you can quickly retrieve Pods that have containers with errors at creation or config steps:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kube_pod_container_status_waiting_reason{reason="CreateContainerConfigError"} &amp;gt; 0
kube_pod_container_status_waiting_reason{reason="CreateContainerError"} &amp;gt; 0&lt;/code&gt;&lt;/pre&gt;
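&lt;p&gt;Wrapped into a Prometheus alerting rule, a sketch covering both reasons at once might look like this (rule names and thresholds are assumptions to adapt):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;groups:
  - name: container-creation-errors
    rules:
      - alert: ContainerCreationError
        expr: kube_pod_container_status_waiting_reason{reason=~"CreateContainerConfigError|CreateContainerError"} &amp;gt; 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container in {{ $labels.namespace }}/{{ $labels.pod }} cannot be created"&lt;/code&gt;&lt;/pre&gt;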

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image2-53-1170x997.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage2-53-1170x997.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="other-errors"&gt;Other similar errors&lt;/h2&gt;

&lt;h3&gt;Pending&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/blog/kubernetes-pod-pending-problems/" rel="noopener noreferrer"&gt;Pending is a Pod status&lt;/a&gt; that appears when the Pod couldn’t even be started. Note that this happens at schedule time, so Kube-scheduler couldn’t find a node because of not enough resources or not proper taints/tolerations config.&lt;/p&gt;

&lt;h3&gt;ContainerCreating&lt;/h3&gt;

&lt;p&gt;ContainerCreating is another waiting status reason that can appear when the container could not be started because of a problem in the execution (e.g., &lt;code&gt;No command specified&lt;/code&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Error from server (BadRequest): container "mycontainer" in pod "mypod" is waiting to start: ContainerCreating   &lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;RunContainerError&lt;/h3&gt;

&lt;p&gt;This might be a similar situation to CreateContainerError, but note that this happens during the run step and not the container creation step.&lt;/p&gt;

&lt;p&gt;A RunContainerError most likely points to problems happening at runtime, like attempts to write on a read-only volume.&lt;/p&gt;

&lt;h3&gt;CrashLoopBackOff&lt;/h3&gt;

&lt;p&gt;Remember that &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/" rel="noopener noreferrer"&gt;CrashLoopBackOff&lt;/a&gt; is not technically an error, but the waiting time grace period that is added between retrials.&lt;/p&gt;

&lt;p&gt;Unlike CrashLoopBackOff events, CreateContainerError and CreateContainerConfigError won’t be retried automatically.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;In this article, you have seen how both CreateContainerConfigError and CreateContainerError are important messages in the Kubernetes container creation process. Being able to detect them and understand at which stage they are happening is crucial for the day-to-day debugging of cloud-native services.&lt;/p&gt;

&lt;p&gt;Also, it’s important to know the internal behavior of the Kubernetes container creation flow and what is errors might appear at each step.&lt;/p&gt;

&lt;p&gt;Finally, CreateContainerConfigError and CreateContainerError might be mistaken for other Kubernetes errors, but these two happen at the container creation stage and are not automatically retried.&lt;/p&gt;









&lt;h2&gt;&lt;em&gt;Troubleshoot CreateContainerError with Sysdig Monitor
&lt;/em&gt;&lt;/h2&gt;
&lt;p&gt;With Sysdig Monitor’s Advisor, you can easily detect which containers are having CreateContainerConfigError or CreateContainerError problems in your Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;Advisor accelerates mean time to resolution (MTTR) with live logs, performance data, and suggested remediation steps. It’s the easy button for Kubernetes troubleshooting!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image3-39-1170x579.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage3-39-1170x579.png" alt="Rightsize your Kubernetes Resources with Sysdig Monitor"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;p&gt;&lt;a href="https://sysdig.com/start-free/" rel="noopener noreferrer"&gt;Try it free&lt;/a&gt; for 30 days!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>monitoring</category>
      <category>cloud</category>
      <category>devops</category>
    </item>
    <item>
      <title>Monitoring with Custom Metrics</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Thu, 02 Mar 2023 10:19:53 +0000</pubDate>
      <link>https://dev.to/sysdig/monitoring-with-custom-metrics-p52</link>
      <guid>https://dev.to/sysdig/monitoring-with-custom-metrics-p52</guid>
      <description>&lt;p&gt;&lt;strong&gt;Custom metrics&lt;/strong&gt; are application-level or business-related tailored metrics, as opposed to the ones that come directly out-of-the-box from monitoring systems like Prometheus (e.g: kube-state-metrics or node exporter)&lt;/p&gt;

&lt;p&gt;When kickstarting a monitoring project with Prometheus, you might realize that you get an initial set of out-of-the-box metrics from just Node Exporter and Kube State Metrics. But this will only get you so far, since you will just be performing black-box monitoring. How can you go to the next level and observe what’s beyond?&lt;/p&gt;

&lt;p&gt;Custom metrics are an essential part of the day-to-day monitoring of cloud-native systems, as they add a business- and application-level dimension. A custom metric can be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Metrics provided by an exporter&lt;/li&gt;



&lt;li&gt;Tailored metrics designed by the customer&lt;/li&gt;



&lt;li&gt;An aggregate from previous existing metrics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this article, you will see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why custom metrics are important&lt;/li&gt;

&lt;li&gt;When to use custom metrics&lt;/li&gt;

&lt;li&gt;Considerations when creating custom metrics&lt;/li&gt;

&lt;li&gt;Kubernetes Metric API&lt;/li&gt;

&lt;li&gt;Prometheus custom metrics&lt;/li&gt;

&lt;li&gt;Challenges when using custom metrics&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Why custom metrics are important&lt;/h2&gt;

&lt;p&gt;Custom metrics allow companies to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monitor Key Performance Indicators (KPIs).&lt;/li&gt;



&lt;li&gt;Detect issues faster.&lt;/li&gt;



&lt;li&gt;Track resource utilization.&lt;/li&gt;



&lt;li&gt;Measure latency.&lt;/li&gt;



&lt;li&gt;Track specific values from their services and systems.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples of custom metrics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Latency of transactions in milliseconds.&lt;/li&gt;

&lt;li&gt;Open database connections.&lt;/li&gt;

&lt;li&gt;Percentage of cache hits vs. cache misses.&lt;/li&gt;

&lt;li&gt;Orders or sales in an e-commerce site.&lt;/li&gt;

&lt;li&gt;Percentage of slow responses.&lt;/li&gt;

&lt;li&gt;Percentage of resource-intensive responses.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As you can see, any metric retrieved from an exporter or created ad hoc fits the definition of a custom metric.&lt;/p&gt;

&lt;h2&gt;When to use Custom Metrics&lt;/h2&gt;

&lt;h3&gt;Autoscaling&lt;/h3&gt;

&lt;p&gt;By providing specific visibility over your system, you can define rules on how the workload should scale.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Horizontal autoscaling: add or remove replicas of a Pod.&lt;/li&gt;



&lt;li&gt;Vertical autoscaling: modify limits and requests of a container.&lt;/li&gt;



&lt;li&gt;Cluster autoscaling: add or remove nodes in a cluster.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to dig deeper, check &lt;a href="https://sysdig.com/blog/kubernetes-autoscaler/" rel="noopener"&gt;this article about autoscaling in Kubernetes&lt;/a&gt;.&lt;/p&gt;
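&lt;p&gt;As a concrete sketch, a HorizontalPodAutoscaler can scale on a custom metric served through &lt;code&gt;custom.metrics.k8s.io&lt;/code&gt;. The deployment and metric names below (&lt;code&gt;checkout&lt;/code&gt;, &lt;code&gt;orders_per_second&lt;/code&gt;) are hypothetical:&lt;/p&gt;

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout        # hypothetical deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: orders_per_second   # hypothetical custom metric
        target:
          type: AverageValue
          averageValue: "50"
```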

&lt;h3&gt;Latency monitoring&lt;/h3&gt;

&lt;p&gt;Latency measures the time it takes for a system to serve a request. This monitoring &lt;a href="https://sysdig.com/blog/golden-signals-kubernetes/" rel="noopener"&gt;golden signal&lt;/a&gt; is essential to understand what the end-user experience for your application is.&lt;/p&gt;

&lt;p&gt;These are considered custom metrics as they are not part of the out-of-the-box set of metrics coming from Kube State Metrics or Node Exporter. In order to measure latency, you might want to either track individual systems (database, API) or end-to-end.&lt;/p&gt;

&lt;h3&gt;Application level monitoring&lt;/h3&gt;

&lt;p&gt;Kube-state-metrics or node-exporter might be a good &lt;a href="https://sysdig.com/blog/cloud-monitoring-journey/" rel="noopener"&gt;starting point for observability&lt;/a&gt;, but they just scratch the surface as they perform black-box monitoring. By instrumenting your own application and services, you create a curated and personalized set of metrics for your own particular case.&lt;/p&gt;
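&lt;p&gt;Instrumentation ultimately boils down to exposing samples in the Prometheus text exposition format on a /metrics endpoint. A minimal sketch in plain Python of what one exposed sample looks like (in practice you would use a client library such as prometheus_client):&lt;/p&gt;

```python
def format_metric(name, labels, value):
    """Render one sample in the Prometheus text exposition format:
    metric_name{label="value",...} value"""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"

# A hypothetical business metric for an e-commerce service:
line = format_metric("orders_total", {"env": "prod", "region": "eu"}, 42)
# orders_total{env="prod",region="eu"} 42
```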

&lt;h2&gt;Considerations when creating Custom Metrics&lt;/h2&gt;

&lt;h3&gt;Naming&lt;/h3&gt;

&lt;p&gt;Check for any &lt;a href="https://prometheus.io/docs/practices/naming/" rel="noreferrer noopener"&gt;existing naming conventions&lt;/a&gt;, as a new name might collide with existing ones or simply be confusing. A custom metric’s name is the first description of its purpose.&lt;/p&gt;

&lt;h3&gt;Labels&lt;/h3&gt;

&lt;p&gt;Thanks to labels, we can add dimensions to our metrics and later filter and refine by those characteristics. Cardinality is the number of possible values for each label; since each combination of label values requires its own time series entry, cardinality can increase resource usage drastically. Choosing labels carefully is key to avoiding a cardinality explosion, one of the main causes of sudden resource spending spikes.&lt;/p&gt;
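&lt;p&gt;As a back-of-the-envelope check, the worst-case number of time series for one metric is the product of the distinct values of each label (the label names and counts below are made up for illustration):&lt;/p&gt;

```python
from math import prod

def series_count(label_cardinalities):
    """Worst-case time series for one metric: the product of the
    number of distinct values of each label."""
    return prod(label_cardinalities.values())

# One HTTP metric: 5 methods x 10 status codes x 50 endpoints
n = series_count({"method": 5, "status_code": 10, "endpoint": 50})  # 2500
# Adding a user_id label with 10,000 values would multiply that by 10,000.
```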

&lt;h3&gt;Costs&lt;/h3&gt;

&lt;p&gt;Custom metrics may have costs associated with them, depending on the monitoring system you are using. Double-check which dimension is used to scale costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Number of time series&lt;/li&gt;



&lt;li&gt;Number of labels&lt;/li&gt;



&lt;li&gt;Data storage&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Custom Metric lifecycle&lt;/h3&gt;

&lt;p&gt;In case the custom metric is related to a batch job or a short-lived script, consider using &lt;a href="https://github.com/prometheus/pushgateway" rel="noopener nofollow noreferrer"&gt;Pushgateway&lt;/a&gt;, since the process may finish before Prometheus gets a chance to scrape it.&lt;/p&gt;

&lt;h2&gt;Kubernetes Metric API&lt;/h2&gt;

&lt;p&gt;One of the most important features of Kubernetes is the ability to automatically scale the workload based on metric values.&lt;/p&gt;

&lt;p&gt;The metrics APIs are defined in the &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/metrics"&gt;official Kubernetes metrics repository&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;metrics.k8s.io&lt;/code&gt;&lt;/li&gt;



&lt;li&gt;&lt;code&gt;custom.metrics.k8s.io&lt;/code&gt;&lt;/li&gt;



&lt;li&gt;&lt;code&gt;external.metrics.k8s.io&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Creating new metrics&lt;/h3&gt;

&lt;p&gt;You can write new custom metric values by calling the Kubernetes metrics API through the API server proxy, as follows:&lt;/p&gt;

&lt;pre&gt;curl -X POST \
  -H 'Content-Type: application/json' \
  http://localhost:8001/api/v1/namespaces/custom-metrics/services/custom-metrics-apiserver:http/proxy/write-metrics/namespaces/default/services/kubernetes/test-metric \
  --data-raw '"300m"'
&lt;/pre&gt;

&lt;h2&gt;Prometheus custom metrics&lt;/h2&gt;

&lt;p&gt;As we mentioned, every exporter that we include in our Prometheus integration will account for several custom metrics.&lt;/p&gt;

&lt;p&gt;Check the following post for a &lt;a href="https://sysdig.com/blog/prometheus-metrics/" rel="noopener"&gt;detailed guide on Prometheus metrics&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Challenges when using custom metrics&lt;/h2&gt;

&lt;h3&gt;Cardinality explosion&lt;/h3&gt;

&lt;p&gt;While the resources consumed by some metrics might be negligible, the moment those metrics are combined with labels in queries, things might get out of hand.&lt;/p&gt;

&lt;p&gt;Cardinality refers to the Cartesian product of a metric and its label values. The result is the number of time series entries that must be stored for that single metric.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0nVj2Ktr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://sysdig.com/wp-content/uploads/Custom-metrics-image-1-1-1170x644.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0nVj2Ktr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://sysdig.com/wp-content/uploads/Custom-metrics-image-1-1-1170x644.png" alt="Custom metrics - cardinality example" width="880" height="484"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Also, every metric is scraped and stored in a time series database at each &lt;code&gt;scrape_interval&lt;/code&gt;. The shorter this interval, the higher the number of samples stored.&lt;/p&gt;
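&lt;p&gt;A rough sizing sketch (the numbers are illustrative): samples stored per day equal the series count times the number of scrapes per day.&lt;/p&gt;

```python
def samples_per_day(series, scrape_interval_seconds):
    """Samples ingested per day for a set of time series
    scraped at a fixed interval."""
    scrapes_per_day = 24 * 3600 // scrape_interval_seconds
    return series * scrapes_per_day

# 2,500 series scraped every 15 seconds:
n = samples_per_day(2500, 15)  # 14,400,000 samples per day
```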

&lt;p&gt;All these factors will eventually lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Higher resource consumption.&lt;/li&gt;



&lt;li&gt;Higher storage demand.&lt;/li&gt;



&lt;li&gt;Monitoring performance degradation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Moreover, most common monitoring tools don’t give &lt;a href="https://sysdig.com/use-cases/cloud-monitoring/" rel="noopener"&gt;visibility into the current cardinality of metrics or the costs associated with them&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Exporter overuse&lt;/h3&gt;

&lt;p&gt;Exporters are a great way to add relevant metrics to your system: with them, you can easily instrument metrics bound to your microservices and containers. But with great power comes great responsibility. Chances are that many of the metrics included in an exporter package are not relevant to your business at all.&lt;/p&gt;

&lt;p&gt;By enabling custom metrics and exporters in your solution, you may end up with a burst in the number of time series database entries.&lt;/p&gt;

&lt;h3&gt;Cost spikes&lt;/h3&gt;

&lt;p&gt;Because of the factors explained above, monitoring costs can spike suddenly: your current solution might consume more resources than expected, or you might surpass pricing thresholds in your monitoring platform.&lt;/p&gt;

&lt;h3&gt;Alert fatigue&lt;/h3&gt;

&lt;p&gt;Once metrics are in place, most companies and individuals start adding alerts and notifications for values that exceed certain thresholds. However, this can multiply notification sources and reduce attention span.&lt;br&gt;Learn more about &lt;a href="https://sysdig.com/blog/prometheus-alertmanager" rel="noopener"&gt;Alert Fatigue and how to mitigate it&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Custom metrics represent the next step for cloud-native monitoring, as they are the core of business observability. While using Prometheus alongside kube-state-metrics and Node Exporter is a good starting point, eventually companies and organizations will need to take the next step and create tailored, on-point metrics to suit their needs.&lt;/p&gt;

</description>
      <category>prometheus</category>
      <category>monitoring</category>
      <category>devops</category>
    </item>
    <item>
      <title>Prometheus Alertmanager best practices</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Thu, 09 Feb 2023 09:59:54 +0000</pubDate>
      <link>https://dev.to/sysdig/prometheus-alertmanager-best-practices-4872</link>
      <guid>https://dev.to/sysdig/prometheus-alertmanager-best-practices-4872</guid>
      <description>&lt;p&gt;Have you ever fallen asleep to the sounds of your on-call team in a Zoom call? If you’ve had the misfortune to sympathize with this experience, you likely understand the problem of &lt;strong&gt;Alert Fatigue&lt;/strong&gt; firsthand.&lt;/p&gt;

&lt;p&gt;During an active incident, it can be exhausting to tease the upstream root cause from downstream noise while you’re context switching between your terminal and your alerts.&lt;/p&gt;

&lt;p&gt;This is where &lt;strong&gt;Alertmanager&lt;/strong&gt; comes in, providing a way to mitigate each of the problems related to Alert Fatigue.&lt;/p&gt;

&lt;p&gt;In this article, you will learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What Alert Fatigue is&lt;/li&gt;



&lt;li&gt;What AlertManager is&lt;/li&gt;



&lt;li&gt;Routing&lt;/li&gt;



&lt;li&gt;Inhibition&lt;/li&gt;



&lt;li&gt;Silencing and Throttling&lt;/li&gt;



&lt;li&gt;Grouping&lt;/li&gt;



&lt;li&gt;Notification Template&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="alert-fatigue"&gt;Alert Fatigue&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Alert Fatigue&lt;/strong&gt; is the exhaustion of frequently responding to unprioritized and unactionable alerts, and it is not sustainable in the long term. Not every alert is so urgent that it should wake up a developer: a sustainable on-call week must prioritize sleep as well. Some questions worth asking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Was an engineer woken up more than twice this week?&lt;/li&gt;



&lt;li&gt;Can the resolution be automated or wait until morning?&lt;/li&gt;



&lt;li&gt;How many people were involved?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Companies often focus on response time and how long a resolution takes, but how do they know the on-call process itself is not contributing to burnout?&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pain Point&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Feature&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Alertmanager&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Send alerts to the right team&lt;/td&gt;
&lt;td&gt;Routing&lt;/td&gt;
&lt;td&gt;Labeled alerts are routed to the corresponding receiver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Too many alerts at once&lt;/td&gt;
&lt;td&gt;Inhibition&lt;/td&gt;
&lt;td&gt;Alerts can inhibit other alerts (e.g., Datacenter down alert inhibits downtime alert)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;False positive on an Alert&lt;/td&gt;
&lt;td&gt;Silencing&lt;/td&gt;
&lt;td&gt;Temporarily silence an alert, especially when performing scheduled maintenance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alerts are too frequent&lt;/td&gt;
&lt;td&gt;Throttling&lt;/td&gt;
&lt;td&gt;Customizable back-off options to avoid re-notifying too frequently&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unorganized alerts&lt;/td&gt;
&lt;td&gt;Grouping&lt;/td&gt;
&lt;td&gt;Logically group alerts by labels such as 'environment=dev' or 'service=broker'&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Notifications are unstructured&lt;/td&gt;
&lt;td&gt;Notification Template&lt;/td&gt;
&lt;td&gt;Standardize alerts to a template so that alerts are structured across services&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;

&lt;h2 id="alertmanager"&gt;Alertmanager&lt;/h2&gt;

&lt;p&gt;Prometheus &lt;strong&gt;Alertmanager&lt;/strong&gt; is the open source standard for translating alerts into alert notifications for your engineering team. &lt;a href="https://prometheus.io/docs/alerting/latest/alertmanager/" rel="noreferrer noopener"&gt;Alertmanager&lt;/a&gt; challenges the assumption that a dozen alerts should result in a dozen alert notifications. By leveraging the features of Alertmanager, dozens of alerts can be distilled into a handful of alert notifications, allowing on-call engineers to context switch less by thinking in terms of incidents rather than alerts.&lt;/p&gt;

&lt;h2 id="routing"&gt;Routing&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Routing&lt;/strong&gt; is the ability to send alerts to a variety of receivers including Slack, Pagerduty, and email. It is the core feature of Alertmanager.&lt;/p&gt;

&lt;pre&gt;route:
  receiver: slack-default            # Fallback Receiver if no routes are matched
  routes:
    - receiver: pagerduty-logging
      continue: true
    - match:
      team: support
      receiver: jira
    - match:
      team: on-call
      receiver: pagerduty-prod&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/BlogImages-PrometheusAlertmanagerBestPractices-diagram1-1170x351.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlogImages-PrometheusAlertmanagerBestPractices-diagram1-1170x351.png" alt="Prometheus alertmanager diagram 1" title="image_tooltip"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here, an alert with the label &lt;code&gt;{team:on-call}&lt;/code&gt; was triggered. Routes are matched from top to bottom, with the first receiver being &lt;code&gt;pagerduty-logging&lt;/code&gt;, a receiver for your on-call manager to track all alerts at the end of each month. Since the alert does not have a &lt;code&gt;{team:support}&lt;/code&gt; label, matching continues to &lt;code&gt;{team:on-call}&lt;/code&gt;, where the alert is properly routed to the &lt;code&gt;pagerduty-prod&lt;/code&gt; receiver. The fallback receiver, &lt;code&gt;slack-default&lt;/code&gt;, is specified at the root route in case no routes match.&lt;/p&gt;

&lt;h2 id="inhibition"&gt;Inhibition&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Inhibition&lt;/strong&gt; is the process of &lt;strong&gt;muting downstream alerts&lt;/strong&gt; depending on their label set. Of course, this means that alerts must be systematically tagged in a logical and standardized way, but that's a human problem, not an Alertmanager one. While there is no native support for warning thresholds, the user can take advantage of labels and inhibit a warning when the critical condition is met.&lt;/p&gt;

&lt;p&gt;This has the unique advantage of supporting a warning condition for alerts that don't use a scalar comparison. It's all well and good to warn at 60% CPU usage and alert at 80% CPU usage, but what if we wanted to craft a warning and alert that compares two queries? This alert triggers when a node has more pods than its capacity.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(sum by (kube_node_name) (kube_pod_container_status_running)) &amp;gt; 
on(kube_node_name) kube_node_status_capacity_pods&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We can do exactly this by using inhibition with Alertmanager. In the first example, an alert with the label &lt;code&gt;{severity=critical}&lt;/code&gt; will inhibit an alert of &lt;code&gt;{severity=warning}&lt;/code&gt; if they share the same region and alertname.&lt;/p&gt;

&lt;p&gt;In the second example, we can also inhibit downstream alerts when we know they won't be important in the root cause. It might be expected that a Kafka consumer behaves anomalously when the Kafka producer doesn't publish anything to the topic.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['region','alertname']
  - source_match:
      service: 'kafka_producer'
    target_match:
      service: 'kafka_consumer'
    equal: ['environment','topic']
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/BlogImages-PrometheusAlertmanagerBestPractices-diagram2-1170x429.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlogImages-PrometheusAlertmanagerBestPractices-diagram2-1170x429.png" alt="Prometheus alertmanager diagram 2" title="image_tooltip"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="silencing-throttling"&gt;Silencing and Throttling&lt;/h2&gt;

&lt;p&gt;Now that you've woken up at 2 a.m. to exactly one root cause alert, you may want to acknowledge the alert and move forward with remediation. It’s too early to resolve the alert but alert re-notifications don’t give any extra context. This is where silencing and throttling can help.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Silencing&lt;/strong&gt; allows you to temporarily snooze an alert if you're expecting the alert to trigger for a scheduled procedure, such as database maintenance, or if you've already acknowledged the alert during an incident and want to keep it from renotifying while you remediate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Throttling&lt;/strong&gt; solves a similar pain point but in a slightly different fashion. Throttles allow the user to tailor the renotification settings with three main parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;group_wait&lt;/li&gt;



&lt;li&gt;group_interval&lt;/li&gt;



&lt;li&gt;repeat_interval&lt;/li&gt;
&lt;/ul&gt;
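&lt;p&gt;These three parameters can be modeled as a tiny decision function (a simplified sketch of Alertmanager's timing semantics, not its actual implementation):&lt;/p&gt;

```python
def next_notification_delay(is_new_group, group_changed,
                            group_wait, group_interval, repeat_interval):
    """Delay (seconds) before the next notification for an alert group,
    in a simplified model of Alertmanager's throttling parameters."""
    if is_new_group:
        return group_wait          # wait to batch the first alerts of a new group
    if group_changed:
        return group_interval      # batch newly added or resolved alerts
    return repeat_interval         # remind about still-firing, unchanged alerts

# New group with group_wait=30s, group_interval=300s, repeat_interval=24h:
d = next_notification_delay(True, False, 30, 300, 86400)  # 30
```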

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/BlogImages-PrometheusAlertmanagerBestPractices-diagram3-1170x468.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlogImages-PrometheusAlertmanagerBestPractices-diagram3-1170x468.png" alt="Prometheus alertmanager diagram 3" title="image_tooltip"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When Alert #1 and Alert #3 are initially triggered, Alertmanager uses &lt;code&gt;group_wait&lt;/code&gt; to delay the first notification by 30 seconds. After that initial notification, any new alert notifications are delayed by &lt;code&gt;group_interval&lt;/code&gt;. Since no new alert arrived in the next 90 seconds, no notification was sent. Over the subsequent 90 seconds, however, Alert #2 was triggered, so a notification covering Alert #2 and Alert #3 was sent. So that currently firing alerts aren't forgotten when nothing new triggers, &lt;code&gt;repeat_interval&lt;/code&gt; can be set to a value such as 24 hours, making the firing alerts re-notify every 24 hours.&lt;/p&gt;

&lt;h2 id="grouping"&gt;Grouping&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Grouping&lt;/strong&gt; in Alertmanager allows multiple alerts sharing a similar label set to be sent in the same notification. This is not to be confused with Prometheus grouping, where alert rules in a group are evaluated in sequential order. By default, all alerts for a given route are grouped together. A &lt;code&gt;group_by&lt;/code&gt; field can be specified to logically group alerts.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;route:
  receiver: slack-default            # Fallback Receiver if no routes are matched
  group_by: [env]
  routes:
    - match:
        team: on-call
      group_by: [region, service]
      receiver: pagerduty-prod
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/BlogImages-PrometheusAlertmanagerBestPractices-diagram4-1170x819.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlogImages-PrometheusAlertmanagerBestPractices-diagram4-1170x819.png" alt="Prometheus alertmanager diagram 4" title="image_tooltip"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Alerts that have the label {team:on-call} will be grouped by both region and service. This allows users to immediately have context that all of the notifications within this alert group share the same service and region. Grouping with information such as &lt;code&gt;instance_id&lt;/code&gt; or &lt;code&gt;ip_address&lt;/code&gt; tends to be less useful, since it means that every unique &lt;code&gt;instance_id&lt;/code&gt; or &lt;code&gt;ip_address&lt;/code&gt; will produce its own notification group. This may produce noisy notifications and defeat the purpose of grouping.&lt;/p&gt;

&lt;p&gt;If no grouping is configured, all alerts will be part of the same alert notification for a given route.&lt;/p&gt;

&lt;h2 id="notification-template"&gt;Notification Template&lt;/h2&gt;

&lt;p&gt;Notification templates offer a way to customize and standardize alert notifications. For example, a notification template can use labels to automatically link to a runbook or include useful labels for the on-call team to build context. Here, &lt;code&gt;app&lt;/code&gt; and &lt;code&gt;alertname&lt;/code&gt; labels are interpolated into a path that links out to a runbook. Standardizing on a notification template can make the on-call process run more smoothly since the on-call team may not be the direct maintainers of the microservice that is paging.&lt;/p&gt;
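&lt;p&gt;As a sketch (the channel name and runbook URL scheme are hypothetical), a Slack receiver can interpolate those labels in its notification template:&lt;/p&gt;

```yaml
receivers:
  - name: slack-on-call
    slack_configs:
      - channel: '#on-call'
        title: '{{ .CommonLabels.alertname }} on {{ .CommonLabels.app }}'
        text: 'Runbook: https://runbooks.example.com/{{ .CommonLabels.app }}/{{ .CommonLabels.alertname }}'
```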

&lt;h2&gt;&lt;em&gt;Manage alerts with a click with Sysdig Monitor&lt;/em&gt;&lt;/h2&gt;

&lt;p&gt;As organizations grow, maintaining Prometheus and Alertmanager can become difficult to manage across teams. Sysdig Monitor makes this easy with Role-Based Access Control where teams can focus on the metrics and alerts most important to them. We offer a turn-key solution where you can manage your alerts from a single pane of glass. With Sysdig Monitor you can spend less time maintaining Prometheus Alertmanager and spend more time monitoring your actual infrastructure. Come chat with industry experts in monitoring and alerting and we'll get you up and running.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/Prometheus-Alertmanager-CTA-1170x530.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FPrometheus-Alertmanager-CTA-1170x530.png" alt="Alert Monitoring in Sysdig Monitor"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;p&gt;Sign up now for a &lt;a href="https://sysdig.com/company/free-trial-monitor/" rel="noopener noreferrer"&gt;free trial of Sysdig Monitor&lt;/a&gt;&lt;/p&gt;

</description>
      <category>prometheus</category>
      <category>devops</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Kubernetes OOM and CPU Throttling</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Thu, 26 Jan 2023 09:56:52 +0000</pubDate>
      <link>https://dev.to/sysdig/kubernetes-oom-and-cpu-throttling-n55</link>
      <guid>https://dev.to/sysdig/kubernetes-oom-and-cpu-throttling-n55</guid>
      <description>&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;

&lt;p&gt;When working with Kubernetes, Out of Memory (OOM) errors and CPU throttling are the main headaches of resource handling in cloud applications. Why is that?&lt;/p&gt;

&lt;p&gt;CPU and Memory requirements in cloud applications are ever more important, since they are tied directly to your cloud costs.&lt;/p&gt;

&lt;p&gt;With &lt;a href="https://sysdig.com/blog/kubernetes-limits-requests/" rel="noreferrer noopener"&gt;limits and requests&lt;/a&gt;, you can configure how your pods should allocate memory and CPU resources in order to prevent resource starvation and adjust cloud costs.&lt;/p&gt;
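&lt;p&gt;For reference, requests and limits are set per container in the Pod spec; a minimal sketch (the values are illustrative):&lt;/p&gt;

```yaml
resources:
  requests:
    memory: "256Mi"   # guaranteed amount, used for scheduling
    cpu: "250m"
  limits:
    memory: "512Mi"   # exceeding this gets the container OOMKilled
    cpu: "500m"       # exceeding this gets the container throttled
```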

&lt;p&gt;In case a Node doesn’t have enough resources, &lt;a href="https://sysdig.com/blog/kubernetes-pod-evicted/" rel="noopener noreferrer"&gt;Pods might get evicted&lt;/a&gt; via preemption or node-pressure.&lt;br&gt;When a process runs Out Of Memory (OOM), it’s killed because it exceeded the memory it was allowed to use.&lt;br&gt;When CPU consumption is higher than the actual limits, the process starts being throttled.&lt;/p&gt;

&lt;p&gt;But how can you actively monitor how close your Kubernetes Pods are to OOM kills and CPU throttling?&lt;/p&gt;

&lt;h2 id="kubernetes-oom"&gt;Kubernetes OOM&lt;/h2&gt;

&lt;p&gt;Every container in a Pod needs memory to run.&lt;/p&gt;

&lt;p&gt;Kubernetes limits are set per container in either a Pod definition or a Deployment definition.&lt;/p&gt;

&lt;p&gt;All modern Unix systems have a way to kill processes when they need to reclaim memory. In Kubernetes, this is reported as exit code 137 or &lt;code&gt;OOMKilled&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;   State:          Running
      Started:      Thu, 10 Oct 2019 11:14:13 +0200
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Thu, 10 Oct 2019 11:04:03 +0200
      Finished:     Thu, 10 Oct 2019 11:14:11 +0200
&lt;/pre&gt;

&lt;p&gt;This Exit Code 137 means that the process used more memory than the allowed amount and had to be terminated.&lt;/p&gt;

&lt;p&gt;This is a Linux kernel feature, where the kernel sets an &lt;code&gt;oom_score&lt;/code&gt; value for each process running in the system. It also allows setting a value called &lt;code&gt;oom_score_adj&lt;/code&gt;, which Kubernetes uses to implement Quality of Service. The kernel’s &lt;code&gt;OOM Killer&lt;/code&gt; reviews running processes and terminates those that are using more memory than they should.&lt;/p&gt;

&lt;p&gt;Note that in Kubernetes, a process can reach any of these limits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A Kubernetes Limit set on the container.&lt;/li&gt;



&lt;li&gt;A Kubernetes ResourceQuota set on the namespace.&lt;/li&gt;



&lt;li&gt;The node’s actual Memory size.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/BlogImages-TroubleshootKubernetesOOM-1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlogImages-TroubleshootKubernetesOOM-1.png" alt="Kubernetes OOM graph" title="image_tooltip" width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Memory overcommitment&lt;/h3&gt;

&lt;p&gt;Limits can be higher than requests, so the sum of all limits can be higher than node capacity. This is called overcommit and it is very common. In practice, if all containers use more memory than requested, it can exhaust the memory in the node. This usually causes the &lt;a href="https://sysdig.com/blog/kubernetes-pod-evicted/" rel="noopener noreferrer"&gt;death of some pods&lt;/a&gt; in order to free some memory.&lt;/p&gt;
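&lt;p&gt;A quick way to reason about this: the overcommit ratio is the sum of all container memory limits divided by the node's capacity (the numbers below are made up):&lt;/p&gt;

```python
def overcommit_ratio(container_limits_gib, node_capacity_gib):
    """Ratio above 1.0 means the node is overcommitted: if every
    container used its full limit, the node would run out of memory."""
    return sum(container_limits_gib) / node_capacity_gib

# Three containers with 3, 3, and 2 GiB limits on a 4 GiB node:
ratio = overcommit_ratio([3, 3, 2], 4)  # 2.0 -- heavily overcommitted
```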

&lt;h3&gt;Monitoring Kubernetes OOM&lt;/h3&gt;

&lt;p&gt;When using node exporter in Prometheus, there’s one metric called &lt;code&gt;node_vmstat_oom_kill&lt;/code&gt;. It’s important to track when an OOM kill happens, but you might want to get ahead and have visibility of such an event before it happens.&lt;/p&gt;

&lt;p&gt;Instead, you can check how close a process is to its Kubernetes memory limit:&lt;/p&gt;

&lt;pre&gt;(sum by (namespace,pod,container)
(container_memory_working_set_bytes{container!=""}) / sum by
(namespace,pod,container)
(kube_pod_container_resource_limits{resource="memory"})) &amp;gt; 0.8
&lt;/pre&gt;

&lt;h2 id="kubernetes-cpu-throttling"&gt;Kubernetes CPU throttling&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CPU Throttling&lt;/strong&gt; is a behavior where processes are slowed when they are about to reach some resource limits.&lt;/p&gt;

&lt;p&gt;Similar to the memory case, these limits could be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A Kubernetes Limit set on the container.&lt;/li&gt;



&lt;li&gt;A Kubernetes ResourceQuota set on the namespace.&lt;/li&gt;



&lt;li&gt;The node’s actual CPU capacity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of the following analogy. We have a highway with some traffic where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU is the road.&lt;/li&gt;



&lt;li&gt;Vehicles represent processes, each with a different size.&lt;/li&gt;



&lt;li&gt;Multiple lanes represent having several cores.&lt;/li&gt;



&lt;li&gt;A request would be an exclusive road, like a bike lane.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Throttling here is represented as a traffic jam: eventually, all processes will run, but everything will be slower.&lt;/p&gt;

&lt;h3&gt;CPU process in Kubernetes&lt;/h3&gt;

&lt;p&gt;CPU is handled in Kubernetes with &lt;strong&gt;shares&lt;/strong&gt;. Each CPU core is divided into 1024 shares, then divided between all processes running by using the cgroups (control groups) feature of the Linux kernel.&lt;/p&gt;
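&lt;p&gt;The arithmetic behind this is simple: a CPU request in millicores maps to cgroup v1 shares at 1024 shares per core (a sketch of the conversion, not the actual kubelet code):&lt;/p&gt;

```python
def cpu_shares(request_millicores):
    """cgroup v1 cpu.shares for a CPU request: 1024 shares per core."""
    return request_millicores * 1024 // 1000

shares = cpu_shares(500)  # a 500m CPU request maps to 512 shares
```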

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/BlogImages-TroubleshootKubernetesOOM-4.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlogImages-TroubleshootKubernetesOOM-4.png" alt="Kubernetes shares system for CPU" width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If the CPU can handle all current processes, no action is needed. If processes are using more than 100% of the CPU, shares come into play. Like any Linux system, Kubernetes relies on the kernel’s CFS (Completely Fair Scheduler), so the processes with more shares get more CPU time.&lt;/p&gt;

&lt;p&gt;Unlike memory, Kubernetes won’t kill Pods because of CPU throttling.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/BlogImages-TroubleshootKubernetesOOM-2.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlogImages-TroubleshootKubernetesOOM-2.png" alt="Kubernetes Throttling graph" title="image_tooltip" width="800" height="365"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;em&gt;You can check CPU stats in &lt;code&gt;/sys/fs/cgroup/cpu/cpu.stat&lt;/code&gt;&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;

&lt;h3&gt;CPU overcommitment&lt;/h3&gt;

&lt;p&gt;As we saw in the &lt;a href="https://sysdig.com/blog/kubernetes-limits-requests/" rel="noopener noreferrer"&gt;limits and requests article&lt;/a&gt;, it’s important to set limits or requests when we want to restrict the resource consumption of our processes. Nevertheless, beware of setting total requests larger than the node’s actual CPU capacity: since each container is guaranteed its requested amount of CPU, some Pods would never be scheduled.&lt;/p&gt;

&lt;h3&gt;Monitoring Kubernetes CPU throttling&lt;/h3&gt;

&lt;p&gt;You can check how close a process is to the Kubernetes limits:&lt;/p&gt;

&lt;pre&gt;(sum by (namespace,pod,container)(rate(container_cpu_usage_seconds_total
{container!=""}[5m])) / sum by (namespace,pod,container)
(kube_pod_container_resource_limits{resource="cpu"})) &amp;gt; 0.8&lt;/pre&gt;

&lt;p&gt;In case we want to track the amount of throttling happening in our cluster, cadvisor provides &lt;code&gt;container_cpu_cfs_throttled_periods_total&lt;/code&gt; and &lt;code&gt;container_cpu_cfs_periods_total&lt;/code&gt;. With these two, you can easily calculate the % of throttling in all CPU periods.&lt;/p&gt;
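&lt;p&gt;For instance, the throttled percentage over a window is just the ratio of the two counters' increases (the values below are illustrative):&lt;/p&gt;

```python
def throttled_percentage(throttled_periods, total_periods):
    """Share of CFS periods in which the container was throttled,
    i.e. the increase of container_cpu_cfs_throttled_periods_total
    over the increase of container_cpu_cfs_periods_total."""
    if total_periods == 0:
        return 0.0
    return 100.0 * throttled_periods / total_periods

# 25 throttled periods out of 200 CFS periods in the window:
pct = throttled_percentage(25, 200)  # 12.5
```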

&lt;h2 id="best-practices"&gt;Best practices&lt;/h2&gt;

&lt;h3&gt;Beware of limits and requests&lt;/h3&gt;

&lt;p&gt;Limits are a way to set up a maximum cap on resources in your node, but these need to be treated carefully, as you might end up with a process throttled or killed.&lt;/p&gt;

&lt;h3&gt;Prepare against eviction&lt;/h3&gt;

&lt;p&gt;By setting very low requests, you might think this will grant a minimum of either CPU or Memory to your process. But &lt;code&gt;kubelet&lt;/code&gt; will first evict those Pods whose usage is higher than their requests, so you’re marking those as the first to be killed!&lt;/p&gt;

&lt;p&gt;In case you need to protect specific Pods against preemption (when &lt;code&gt;kube-scheduler&lt;/code&gt; needs to allocate a new Pod), assign Priority Classes to your most important processes.&lt;/p&gt;
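
&lt;p&gt;As a minimal sketch (the class name, value, and Pod spec here are arbitrary), you create the Priority Class and then reference it from the Pod spec via &lt;code&gt;priorityClassName&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-workload
value: 1000000
globalDefault: false
description: "Pods that should not be preempted"
---
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  priorityClassName: critical-workload
  containers:
  - name: myapp
    image: myapp:latest
&lt;/pre&gt;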

&lt;h3&gt;Throttling is a silent enemy&lt;/h3&gt;

&lt;p&gt;By setting unrealistic limits or overcommitting, you might not be aware that your processes are being throttled and their performance impacted. Proactively monitor your CPU usage and know your actual limits in both containers and namespaces.&lt;/p&gt;

&lt;h2 id="wrapping-up"&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;Here’s a cheat sheet on Kubernetes resource management for CPU and Memory. This summarizes the current article plus these ones which are part of the same series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://sysdig.com/blog/kubernetes-pod-evicted/" rel="noopener noreferrer"&gt;https://sysdig.com/blog/kubernetes-pod-evicted/&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/kubernetes-limits-requests/" rel="noopener noreferrer"&gt;https://sysdig.com/blog/kubernetes-limits-requests/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/BlogImages-TroubleshootKubernetesOOM-diagram-1170x664.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlogImages-TroubleshootKubernetesOOM-diagram-1170x664.png" alt="Kubernetes CPU and Memory cheatsheet" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;








&lt;h2&gt;Rightsize your Kubernetes Resources with Sysdig Monitor&lt;/h2&gt;





&lt;p&gt;With Sysdig Monitor’s new feature, Cost Advisor, you can optimize your Kubernetes costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory requests&lt;/li&gt;



&lt;li&gt;CPU requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With our out-of-the-box Kubernetes Dashboards, you can &lt;a href="https://sysdig.com/blog/kubernetes-capacity-planning/" rel="noreferrer noopener"&gt;discover underutilized resources&lt;/a&gt; in a couple of clicks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlog-How-to-do-capacity-planning-for-Kubernetes-Image-14.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlog-How-to-do-capacity-planning-for-Kubernetes-Image-14.png" alt="Capacity planning Kubernetes Sysdig Monitor" width="800" height="515"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/company/free-trial-monitor/" rel="noopener noreferrer"&gt;Try it free for 30 days!&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>cloud</category>
      <category>prometheus</category>
    </item>
    <item>
      <title>Kubernetes Services: ClusterIP, Nodeport and LoadBalancer</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Fri, 09 Dec 2022 10:03:19 +0000</pubDate>
      <link>https://dev.to/sysdig/kubernetes-services-clusterip-nodeport-and-loadbalancer-1g3m</link>
      <guid>https://dev.to/sysdig/kubernetes-services-clusterip-nodeport-and-loadbalancer-1g3m</guid>
      <description>&lt;p&gt;Pods are ephemeral. And they are meant to be. They can be seamlessly destroyed and replaced if using a Deployment. Or they can be scaled at some point when using Horizontal Pod Autoscaling (HPA).&lt;/p&gt;

&lt;p&gt;This means we can’t rely on the Pod IP address to connect with applications running in our containers internally or externally, as the Pod might not be there in the future.&lt;/p&gt;

&lt;p&gt;You may have noticed that Kubernetes Pods get assigned an IP address:&lt;/p&gt;

&lt;pre&gt;stable-kube-state-metrics-758c964b95-6fnbl               1/1     Running   0          3d20h   100.96.2.5      ip-172-20-54-111.ec2.internal   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
stable-prometheus-node-exporter-4brgv                    1/1     Running   0          3d20h   172.20.60.26    ip-172-20-60-26.ec2.internal
&lt;/pre&gt;

&lt;p&gt;This is a unique and internal IP for this particular Pod, but there’s no guarantee that this IP will exist in the future, due to the Pod's nature.&lt;/p&gt;

&lt;h2&gt;Services&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;Kubernetes Service&lt;/strong&gt; is a mechanism to &lt;strong&gt;expose applications both internally and externally&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Every service will create a stable, long-lived IP address that can be used as a connection point.&lt;/p&gt;

&lt;p&gt;Additionally, it will open a &lt;code&gt;port&lt;/code&gt; that will be linked with a &lt;code&gt;targetPort&lt;/code&gt;. Some services can create ports in every &lt;a href="https://sysdig.com/learn-cloud-native/kubernetes-101/what-is-a-kubernetes-node/" rel="noreferrer noopener"&gt;Node&lt;/a&gt;, and even external IPs to create connectors outside the cluster.&lt;/p&gt;

&lt;p&gt;With the combination of both IP and Port, we can create a way to uniquely identify an application.&lt;/p&gt;

&lt;h3&gt;Creating a service&lt;/h3&gt;

&lt;p&gt;Every service has a selector filter that will link it with a set of Pods in your cluster.&lt;/p&gt;

&lt;pre&gt;spec:
  selector:
    app.kubernetes.io/name: myapp
&lt;/pre&gt;

&lt;p&gt;So all Pods with the label &lt;em&gt;app.kubernetes.io/name: myapp&lt;/em&gt; will be linked to this service.&lt;/p&gt;

&lt;p&gt;There are three port attributes involved in a Service configuration:&lt;/p&gt;

&lt;pre&gt;  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30036
    protocol: TCP
&lt;/pre&gt;

&lt;ul&gt;
&lt;li&gt;port: the new service port that will be created to connect to the application.&lt;/li&gt;



&lt;li&gt;targetPort: application port that we want to target with the services requests.&lt;/li&gt;



&lt;li&gt;nodePort: this is a port in the range of 30000-32767 that will be open in each node. If left empty, Kubernetes selects a free one in that range.&lt;/li&gt;



&lt;li&gt;protocol: TCP is the default one, but you can use others like SCTP or UDP.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can review services created with:&lt;/p&gt;

&lt;pre&gt;kubectl get services
kubectl get svc
&lt;/pre&gt;

&lt;h3&gt;Types of services&lt;/h3&gt;

&lt;p&gt;Kubernetes allows the creation of these types of services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ClusterIP (default)&lt;/li&gt;



&lt;li&gt;Nodeport&lt;/li&gt;



&lt;li&gt;LoadBalancer&lt;/li&gt;



&lt;li&gt;ExternalName&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s see each of them in detail.&lt;/p&gt;

&lt;h2&gt;ClusterIP&lt;/h2&gt;

&lt;p&gt;This is the default Service type in Kubernetes.&lt;/p&gt;

&lt;p&gt;As indicated by its name, this is just an address that can be used inside the cluster.&lt;/p&gt;

&lt;p&gt;Take, for example, the initial helm installation for Prometheus Stack. It installs Pods, Deployments, and Services for the Prometheus and Grafana ecosystem.&lt;/p&gt;

&lt;pre&gt;NAME                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                     ClusterIP   None            &amp;lt;none&amp;gt;        9093/TCP,9094/TCP,9094/UDP   3m27s
kubernetes                                ClusterIP   100.64.0.1      &amp;lt;none&amp;gt;        443/TCP                      18h
prometheus-operated                       ClusterIP   None            &amp;lt;none&amp;gt;        9090/TCP                     3m27s
stable-grafana                            ClusterIP   100.66.46.251   &amp;lt;none&amp;gt;        80/TCP                       3m29s
stable-kube-prometheus-sta-alertmanager   ClusterIP   100.64.23.19    &amp;lt;none&amp;gt;        9093/TCP                     3m29s
stable-kube-prometheus-sta-operator       ClusterIP   100.69.14.239   &amp;lt;none&amp;gt;        443/TCP                      3m29s
stable-kube-prometheus-sta-prometheus     ClusterIP   100.70.168.92   &amp;lt;none&amp;gt;        9090/TCP                     3m29s
stable-kube-state-metrics                 ClusterIP   100.70.80.72    &amp;lt;none&amp;gt;        8080/TCP                     3m29s
stable-prometheus-node-exporter           ClusterIP   100.68.71.253   &amp;lt;none&amp;gt;        9100/TCP                     3m29s
&lt;/pre&gt;



&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--s4XVCH_6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://sysdig.com/wp-content/uploads/Kubernetes-services-01-1170x644.png" alt="Kubernetes Services ClusterIP" width="880" height="484"&gt;



&lt;p&gt;This creates a connection using an internal Cluster IP address and a Port.&lt;/p&gt;
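
&lt;p&gt;Inside the cluster, each service is also reachable through a DNS name of the form &lt;code&gt;&amp;lt;service&amp;gt;.&amp;lt;namespace&amp;gt;.svc.cluster.local&lt;/code&gt;. For instance, assuming the Grafana service above lives in the &lt;em&gt;default&lt;/em&gt; namespace, a Pod in the cluster could reach it with:&lt;/p&gt;

&lt;pre&gt;curl http://stable-grafana.default.svc.cluster.local:80
&lt;/pre&gt;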



&lt;p&gt;But, what if we need to use this connector from outside the Cluster? This IP is internal and won’t work outside.&lt;/p&gt;



&lt;p&gt;This is where the rest of the services come in…&lt;/p&gt;



&lt;h2&gt;NodePort&lt;/h2&gt;



&lt;p&gt;A NodePort differs from the ClusterIP in the sense that it exposes a port in each Node.&lt;/p&gt;



&lt;p&gt;When a NodePort is created, kube-proxy exposes a port in the range 30000-32767:&lt;/p&gt;



&lt;pre&gt;apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  selector:
    app: myapp
  type: NodePort
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30036
    protocol: TCP&lt;/pre&gt;



&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0hrnbgS6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://sysdig.com/wp-content/uploads/Kubernetes-services-02-1170x644.png" alt="Kubernetes Services Nodeport" width="880" height="484"&gt;



&lt;p&gt;NodePort is the preferred choice for non-HTTP communication.&lt;/p&gt;



&lt;p&gt;The problem with using a NodePort is that you still need to access each of the Nodes separately.&lt;/p&gt;
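
&lt;p&gt;In other words, with the NodePort above you reach the application through each individual node's address (the node IP is a placeholder here):&lt;/p&gt;

&lt;pre&gt;curl http://&amp;lt;node-ip&amp;gt;:30036
&lt;/pre&gt;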



&lt;p&gt;So, let’s have a look at the next item on the list…&lt;/p&gt;



&lt;h2&gt;LoadBalancer&lt;/h2&gt;



&lt;p&gt;A LoadBalancer is a Kubernetes service that:&lt;/p&gt;



&lt;ul&gt;
&lt;li&gt;Creates a service like ClusterIP&lt;/li&gt;



&lt;li&gt;Opens a port in every node like NodePort&lt;/li&gt;



&lt;li&gt;Uses a load balancer implementation from your cloud provider (the provider must support this for LoadBalancers to work).&lt;/li&gt;
&lt;/ul&gt;



&lt;pre&gt;apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  ports:
  - name: web
    port: 80
  selector:
    app: web
  type: LoadBalancer
&lt;/pre&gt;

&lt;p&gt;Right after creation, the external address shows as pending while the cloud provider provisions it; a bit later, it becomes available:&lt;/p&gt;

&lt;pre&gt;my-service                                LoadBalancer   100.71.69.103   &amp;lt;pending&amp;gt;     80:32147/TCP                 12s
my-service                                LoadBalancer   100.71.69.103   a16038a91350f45bebb49af853ab6bd3-2079646983.us-east-1.elb.amazonaws.com   80:32147/TCP                 16m
&lt;/pre&gt;

&lt;p&gt;In this case, Amazon Web Services (AWS) was being used, so an external address from AWS was created.&lt;/p&gt;

&lt;p&gt;Then, if you use &lt;code&gt;kubectl describe service my-service&lt;/code&gt;, you will find that several new attributes were added:&lt;/p&gt;

&lt;pre&gt;Name:                     my-service
Namespace:                default
Labels:                   &amp;lt;none&amp;gt;
Annotations:              &amp;lt;none&amp;gt;
Selector:                 app.kubernetes.io/name=pegasus
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       100.71.69.103
IPs:                      100.71.69.103
LoadBalancer Ingress:     a16038a91350f45bebb49af853ab6bd3-2079646983.us-east-1.elb.amazonaws.com
Port:                     &amp;lt;unset&amp;gt;  80/TCP
TargetPort:               9376/TCP
NodePort:                 &amp;lt;unset&amp;gt;  32147/TCP
Endpoints:                &amp;lt;none&amp;gt;
Session Affinity:         None
External Traffic Policy:  Cluster
&lt;/pre&gt;

&lt;p&gt;The main difference with NodePort is that a LoadBalancer can be reached through a single external address, and it will try to distribute requests equally across Nodes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eGQ-V_pF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://sysdig.com/wp-content/uploads/Kubernetes-services-03-1170x644.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eGQ-V_pF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://sysdig.com/wp-content/uploads/Kubernetes-services-03-1170x644.png" alt="Kubernetes Service LoadBalancer" width="880" height="484"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;ExternalName&lt;/h2&gt;

&lt;p&gt;The ExternalName service was introduced to cover the need of connecting to an element outside of the Kubernetes cluster. Think of it not as a way to connect to an item within your cluster, but as a connector to an element that lives outside the cluster.&lt;/p&gt;

&lt;p&gt;This serves two purposes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It creates a single endpoint for all communications to that element.&lt;/li&gt;



&lt;li&gt;In case that external service needs to be replaced, it’s easier to switch by just modifying the ExternalName, instead of all connections.&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  ports:
    - name: web
      port: 80
  type: ExternalName
  externalName: db.myexternalserver.com
&lt;/pre&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Services are a key aspect of Kubernetes, as they provide a way to expose internal endpoints inside and outside of the cluster.&lt;/p&gt;

&lt;p&gt;The ClusterIP service just creates a connector for in-cluster communication. Use it only in case you have a specific application that needs to connect with others inside your cluster.&lt;/p&gt;

&lt;p&gt;NodePort and LoadBalancer are used for external access to your applications. It’s preferable to use LoadBalancer to distribute requests equally in multi-Pod implementations, but note that your cloud provider must implement load balancing for this type to be available.&lt;/p&gt;

&lt;p&gt;Apart from these, Kubernetes provides Ingresses, a way to create an HTTP connection with load balancing for external use.&lt;/p&gt;
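
&lt;p&gt;As a quick illustration (the host and service names here are assumed), an Ingress routes HTTP traffic to a backing service:&lt;/p&gt;

&lt;pre&gt;apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myingress
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myservice
            port:
              number: 80
&lt;/pre&gt;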








&lt;h2&gt;&lt;em&gt;Debug service golden signals with Sysdig Monitor&lt;/em&gt;&lt;/h2&gt;

&lt;p&gt;With Sysdig Monitor, you can quickly debug:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Error rate&lt;/li&gt;
&lt;li&gt;Saturation&lt;/li&gt;
&lt;li&gt;Traffic&lt;/li&gt;
&lt;li&gt;Latency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And thanks to its Container Observability with eBPF, you can do this without adding any app or code instrumentation.&lt;/p&gt;

&lt;p&gt;Sysdig Advisor accelerates mean time to resolution (MTTR) with live logs, performance data, and suggested remediation steps. It’s the easy button for Kubernetes troubleshooting!&lt;/p&gt;

&lt;p&gt;&lt;a&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NpnA6y3k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://sysdig.com/wp-content/uploads/image-3-1.png" alt="How to debug a crashloopbackoff with Sysdig Monitor Advisor" width="880" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sign up now for a &lt;a href="https://sysdig.com/company/free-trial-monitor/" rel="noopener"&gt;free trial of Sysdig Monitor&lt;/a&gt;&lt;/p&gt;



</description>
      <category>kubernetes</category>
      <category>services</category>
    </item>
    <item>
      <title>Kubernetes 1.26 What's new?</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Thu, 01 Dec 2022 09:19:05 +0000</pubDate>
      <link>https://dev.to/sysdig/kubernetes-126-whats-new-4736</link>
      <guid>https://dev.to/sysdig/kubernetes-126-whats-new-4736</guid>
      <description>&lt;p&gt;&lt;strong&gt;Kubernetes 1.26&lt;/strong&gt; is about to be released, and it comes packed with novelties! Where do we begin?&lt;/p&gt;

&lt;p&gt;This release brings 37 enhancements, on par with the &lt;a href="https://sysdig.com/blog/kubernetes-1-25-whats-new/" rel="noopener noreferrer"&gt;40 in Kubernetes 1.25&lt;/a&gt; and the &lt;a href="https://sysdig.com/blog/kubernetes-1-24-whats-new/" rel="noopener noreferrer"&gt;46 in Kubernetes 1.24&lt;/a&gt;. Of those 37 enhancements, 11 are graduating to Stable, 10 are existing features that keep improving, 16 are completely new, and one is a deprecated feature.&lt;/p&gt;

&lt;p&gt;Watch out for all the &lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#deprecations" rel="noopener noreferrer"&gt;deprecations and removals in this version&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;One new feature stands out in this release that has the potential to &lt;strong&gt;change the way users interact with Kubernetes&lt;/strong&gt;: being able to &lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3294" rel="noopener noreferrer"&gt;provision volumes with snapshots from other namespaces&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;There are also new features aimed at &lt;strong&gt;high performance workloads&lt;/strong&gt;, like scientific research or machine learning: better &lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3545" rel="noopener noreferrer"&gt;control over what physical CPU cores your workloads run on&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Also, other features will make life easier for cluster administrators, like &lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3515" rel="noopener noreferrer"&gt;support for OpenAPIv3&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We are really hyped about this release!&lt;/p&gt;

&lt;p&gt;There is plenty to talk about, so let's get started with what’s new in Kubernetes 1.26.&lt;/p&gt;

&lt;h2 id="editors"&gt;Kubernetes 1.26 – Editor’s pick:&lt;/h2&gt;

&lt;p&gt;These are the features that look most exciting to us in this release (ymmv):&lt;/p&gt;

&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3294" rel="noopener noreferrer"&gt;#3294&lt;/a&gt; Provision volumes from cross-namespace snapshots&lt;/h3&gt;





&lt;p&gt;The &lt;em&gt;VolumeSnapshot&lt;/em&gt; feature allows Kubernetes users to provision volumes from volume snapshots, providing great benefits for users and applications, like enabling database administrators to snapshot a database before any critical operation, or the ability to develop and implement backup solutions.&lt;/p&gt;





&lt;p&gt;Starting in Kubernetes 1.26 as an Alpha feature, users will be able to create a &lt;em&gt;PersistentVolumeClaim&lt;/em&gt; from a &lt;em&gt;VolumeSnapshot&lt;/em&gt; across namespaces, breaking the initial limitation of having both objects in the same namespace.&lt;/p&gt;





&lt;p&gt;This enhancement eliminates constraints that prevented users and applications from carrying out fundamental tasks, like saving a database checkpoint when applications and services live in different namespaces.&lt;/p&gt;
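
&lt;p&gt;As a sketch (the names are hypothetical, and the Alpha feature gate must be enabled), the &lt;em&gt;PersistentVolumeClaim&lt;/em&gt; references the snapshot through &lt;code&gt;dataSourceRef&lt;/code&gt; with an explicit namespace:&lt;/p&gt;

&lt;pre&gt;apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data
  namespace: app-team
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: db-checkpoint
    namespace: backup-team
&lt;/pre&gt;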





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://www.linkedin.com/in/v%C3%ADctor-hernando-martin-49836334/"&gt;Víctor Hernando&lt;/a&gt; - Sr. Technical Marketing Manager at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3488" rel="noopener noreferrer"&gt;#3488&lt;/a&gt; CEL for admission control&lt;/h3&gt;





&lt;p&gt;Finally, a practical implementation of the validation expression language from Kubernetes 1.25!&lt;/p&gt;





&lt;p&gt;By defining rules for the admission controller as Kubernetes objects, we can start forgetting about managing webhooks, simplifying the setup of our clusters. Not only that, but implementing &lt;a href="https://sysdig.com/learn-cloud-native/kubernetes-security/kubernetes-security-101/" rel="noopener noreferrer"&gt;Kubernetes security&lt;/a&gt; is a bit easier now.&lt;/p&gt;
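
&lt;p&gt;As a sketch of what such a rule looks like (the resource scope and expression are illustrative), a &lt;em&gt;ValidatingAdmissionPolicy&lt;/em&gt; declares its checks as CEL expressions:&lt;/p&gt;

&lt;pre&gt;apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: replica-limit
spec:
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: "object.spec.replicas &amp;lt;= 5"
&lt;/pre&gt;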





&lt;p&gt;We love to see these user-friendly improvements. They are the key to keep growing Kubernetes adoption.&lt;/p&gt;





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://www.linkedin.com/in/victorjimenez/"&gt;Víctor Jiménez Cerrada&lt;/a&gt; - Content Engineering Manager at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3466" rel="noopener noreferrer"&gt;#3466&lt;/a&gt; Kubernetes component health SLIs&lt;/h3&gt;





&lt;p&gt;Since Kubernetes 1.26, you can configure Service Level Indicator (SLI) metrics for the Kubernetes components binaries. Once you enable them, Kubernetes will expose the SLI metrics in the &lt;code&gt;/metrics/slis&lt;/code&gt; endpoint - so you won't need a Prometheus exporter. This can take &lt;a href="https://sysdig.com/blog/kubernetes-monitoring-prometheus/" rel="noopener noreferrer"&gt;Kubernetes monitoring&lt;/a&gt; to another level making it easier to create health dashboards and configure PromQL alerts to assure your cluster's stability.&lt;/p&gt;
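
&lt;p&gt;As a sketch, after enabling the &lt;code&gt;ComponentSLIs&lt;/code&gt; feature gate on a component, the new endpoint can be queried directly, for example against the API server:&lt;/p&gt;

&lt;pre&gt;kubectl get --raw /metrics/slis
&lt;/pre&gt;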





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://twitter.com/eckelon"&gt;Jesús Ángel Samitier&lt;/a&gt; - Integrations Engineer at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#2371" rel="noopener noreferrer"&gt;#2371&lt;/a&gt; cAdvisor-less, CRI-full container and &lt;em&gt;Pod&lt;/em&gt; stats&lt;/h3&gt;





&lt;p&gt;Currently, to gather metrics from containers, such as CPU or memory consumed, Kubernetes relies on cAdvisor. This feature presents an alternative, enriching the CRI API to provide all the metrics from the containers, allowing more flexibility and better accuracy. After all, it's the Container Runtime who best knows the behavior of the container.&lt;/p&gt;





&lt;p&gt;This feature represents one more step on the roadmap to remove cAdvisor from Kubernetes code. However, during this transition, cAdvisor will be modified not to generate the metrics added to the CRI API, avoiding duplicated metrics with possibly different and incoherent values.&lt;/p&gt;





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://twitter.com/maellyssa"&gt;David de Torres Huerta&lt;/a&gt; – Engineering Manager at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3063" rel="noopener noreferrer"&gt;#3063&lt;/a&gt; Dynamic resource allocation&lt;/h3&gt;





&lt;p&gt;This new Kubernetes release introduces a new Alpha feature which will provide extended resource management for advanced hardware. As a cherry on top, it comes with a user-friendly API to describe resource requests. With the increasing demand to process different hardware components, like GPU or FPGA, and the need to set up initialization and cleanup, this new feature will speed up Kubernetes adoption in areas like scientific research or edge computing.&lt;/p&gt;





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://www.linkedin.com/in/javier-martinez-2b2a955/"&gt;Javier Martínez&lt;/a&gt; - Devops Content Engineer at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3545" rel="noopener noreferrer"&gt;#3545&lt;/a&gt; Improved multi-numa alignment in Topology Manager&lt;/h3&gt;





&lt;p&gt;This is yet another feature aimed at high performance workloads, like those involved in scientific computing. We are seeing the new CPU manager taking shape since Kubernetes 1.22 and 1.23, enabling developers to keep their workloads close to where their data is stored in memory, improving performance. Kubernetes 1.26 goes a step further, opening the door to further customizations for this feature. After all, not all workloads and CPU architectures are the same.&lt;/p&gt;





&lt;p&gt;The future of HPC on Kubernetes is looking quite promising, indeed.&lt;/p&gt;





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://www.linkedin.com/in/vjjmiras/"&gt;Vicente J. Jiménez Miras&lt;/a&gt; – Security Content Engineer at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3335" rel="noopener noreferrer"&gt;#3335&lt;/a&gt; Allow StatefulSet to control start replica ordinal numbering&lt;/h3&gt;





&lt;p&gt;&lt;em&gt;StatefulSets&lt;/em&gt; in Kubernetes often are critical backend services, like clustered databases or message queues.&lt;br&gt;This enhancement, seemingly a trivial numbering change, allows for greater flexibility and enables new techniques for rolling cross-namespace or even cross-cluster migrations of the replicas of the &lt;em&gt;StatefulSet&lt;/em&gt; &lt;strong&gt;without any downtime&lt;/strong&gt;. While the process might seem a bit clunky, involving careful definition of &lt;em&gt;PodDisruptionBudgets&lt;/em&gt; and the moving of resources relative to the migrating replica, we can surely envision tools (or existing operators enhancements) that automate these operations for &lt;strong&gt;seamless migrations&lt;/strong&gt;, in stark contrast with the cold-migration strategy (shutdown-backup-restore) that is currently possible.&lt;/p&gt;
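
&lt;p&gt;A minimal sketch of the new field (requires the &lt;code&gt;StatefulSetStartOrdinal&lt;/code&gt; feature gate; the name is illustrative): setting &lt;code&gt;spec.ordinals.start&lt;/code&gt; shifts the replica numbering, so three replicas would be named db-5, db-6, and db-7 instead of starting at zero:&lt;/p&gt;

&lt;pre&gt;apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  ordinals:
    start: 5
  replicas: 3
&lt;/pre&gt;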





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://www.linkedin.com/in/danielsimionato/"&gt;Daniel Simionato&lt;/a&gt; - Security Content Engineer at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3325" rel="noopener noreferrer"&gt;#3325&lt;/a&gt; Auth API to get self user attributes&lt;/h3&gt;





&lt;p&gt;This new feature coming to Alpha will simplify cluster administrators' work, especially when they are managing multiple clusters. It will also assist in complex authentication flows, as it lets users query their own user information or permissions inside the cluster.&lt;/p&gt;





&lt;p&gt;Also, this works whether you are using a proxy (the Kubernetes API server fills in the &lt;code&gt;userInfo&lt;/code&gt; after all authentication mechanisms are applied) or impersonating a user (you receive the details and properties of the user that was impersonated), so you can retrieve your effective user information in a very easy way.&lt;/p&gt;
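
&lt;p&gt;In practice, this surfaces through a new &lt;em&gt;SelfSubjectReview&lt;/em&gt; API and, assuming the Alpha feature is enabled in your cluster and client, a companion kubectl command:&lt;/p&gt;

&lt;pre&gt;kubectl alpha auth whoami
&lt;/pre&gt;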





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://www.linkedin.com/in/miguelhzbz/"&gt;Miguel Hernández&lt;/a&gt; - Security Content Engineer at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h3&gt;
&lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3352" rel="noopener noreferrer"&gt;#3352&lt;/a&gt; Aggregated Discovery&lt;/h3&gt;





&lt;p&gt;This is a tiny change for users, but one step further in cleaning up the Kubernetes internals and improving performance. Reducing the number of API calls by aggregating them (at least on the discovery side) is a nice solution to a growing problem. Hopefully, this will provide a small break to cluster administrators.&lt;/p&gt;





&lt;p&gt;&lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://www.linkedin.com/in/ddok/"&gt;Devid Dokash&lt;/a&gt; - Content Engineering Intern at Sysdig&lt;/em&gt;&lt;/p&gt;





&lt;h2 id="deprecations"&gt;Deprecations&lt;/h2&gt;





&lt;p&gt;A few beta APIs and features have been removed in Kubernetes 1.26, including:&lt;/p&gt;





&lt;p&gt;&lt;strong&gt;Deprecated &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/docs/reference/using-api/deprecation-guide/#v1-25"&gt;API versions&lt;/a&gt;&lt;/strong&gt; that are no longer served, and you should use a newer one:&lt;/p&gt;





&lt;ul&gt;
&lt;li&gt;CRI &lt;code&gt;v1alpha2&lt;/code&gt;, use &lt;code&gt;v1&lt;/code&gt; (&lt;em&gt;containerd&lt;/em&gt; version 1.5 and older are not supported).&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;flowcontrol.apiserver.k8s.io/v1beta1&lt;/code&gt;, use &lt;code&gt;v1beta2&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;autoscaling/v2beta2&lt;/code&gt;, use &lt;code&gt;v2&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;





&lt;p&gt;&lt;strong&gt;Deprecated&lt;/strong&gt;. Implement an alternative before the next release goes out:&lt;/p&gt;





&lt;ul&gt;
&lt;li&gt;In-tree GlusterFS driver.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;kubectl --prune-whitelist&lt;/code&gt;, use &lt;code&gt;--prune-allowlist&lt;/code&gt; instead.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;kube-apiserver --master-service-namespace&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;Several unused options for &lt;code&gt;kubectl run&lt;/code&gt;: &lt;code&gt;--cascade&lt;/code&gt;, &lt;code&gt;--filename&lt;/code&gt;, &lt;code&gt;--force&lt;/code&gt;, &lt;code&gt;--grace-period&lt;/code&gt;, &lt;code&gt;--kustomize&lt;/code&gt;, &lt;code&gt;--recursive&lt;/code&gt;, &lt;code&gt;--timeout&lt;/code&gt;, &lt;code&gt;--wait&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;CLI flag &lt;code&gt;pod-eviction-timeout&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;The &lt;code&gt;apiserver_request_slo_duration_seconds&lt;/code&gt; metric, use &lt;code&gt;apiserver_request_sli_duration_seconds&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;





&lt;p&gt;&lt;strong&gt;Removed&lt;/strong&gt;. Implement an alternative before upgrading:&lt;/p&gt;





&lt;ul&gt;
&lt;li&gt;Legacy authentication for Azure and Google Cloud.&lt;/li&gt;



&lt;li&gt;The &lt;code&gt;userspace&lt;/code&gt; proxy mode.&lt;/li&gt;



&lt;li&gt;Dynamic &lt;em&gt;kubelet&lt;/em&gt; configuration.&lt;/li&gt;



&lt;li&gt;Several &lt;a href="https://sysdig.com/blog/kubernetes-1-23-whats-new/#2845" rel="noopener noreferrer"&gt;command line arguments related to logging&lt;/a&gt;.&lt;/li&gt;



&lt;li&gt;in-tree OpenStack (&lt;code&gt;cinder&lt;/code&gt; volume type), use &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/cloud-provider-openstack"&gt;the CSI driver&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;





&lt;p&gt;&lt;strong&gt;Other changes&lt;/strong&gt; you should adapt your configs for:&lt;/p&gt;





&lt;ul&gt;
&lt;li&gt;Pod Security admission: the pod-security &lt;code&gt;warn&lt;/code&gt; level will now default to the &lt;code&gt;enforce&lt;/code&gt; level.&lt;/li&gt;



&lt;li&gt;kubelet: The default &lt;code&gt;cpuCFSQuotaPeriod&lt;/code&gt; value with the &lt;code&gt;cpuCFSQuotaPeriod&lt;/code&gt; flag enabled is now 100µs instead of 100ms.&lt;/li&gt;



&lt;li&gt;kubelet: The &lt;code&gt;--container-runtime-endpoint&lt;/code&gt; flag cannot be empty anymore.&lt;/li&gt;



&lt;li&gt;kube-apiserver: gzip compression switched from level 4 to level 1.&lt;/li&gt;



&lt;li&gt;Metrics: Changed &lt;code&gt;preemption_victims&lt;/code&gt; from &lt;code&gt;LinearBuckets&lt;/code&gt; to &lt;code&gt;ExponentialBuckets&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;Metrics: &lt;code&gt;etcd_db_total_size_in_bytes&lt;/code&gt; is renamed to &lt;code&gt;apiserver_storage_db_total_size_in_bytes&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;Metrics: &lt;code&gt;kubelet_kubelet_credential_provider_plugin_duration&lt;/code&gt; is renamed &lt;code&gt;kubelet_credential_provider_plugin_duration&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;Metrics: &lt;code&gt;kubelet_kubelet_credential_provider_plugin_errors&lt;/code&gt; is renamed &lt;code&gt;kubelet_credential_provider_plugin_errors&lt;/code&gt;.&lt;/li&gt;



&lt;li&gt;Removed Windows Server, Version 20H2 flavors from various container images.&lt;/li&gt;



&lt;li&gt;The e2e.test binary no longer emits JSON structs to document progress.&lt;/li&gt;
&lt;/ul&gt;





&lt;p&gt;You can check the full list of changes in the &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md"&gt;Kubernetes 1.26 release notes&lt;/a&gt;. Also, we recommend the &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/blog/2022/11/18/upcoming-changes-in-kubernetes-1-26/"&gt;Kubernetes Removals and Deprecations In 1.26&lt;/a&gt; article, as well as keeping the &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/docs/reference/using-api/deprecation-guide/"&gt;deprecated API migration guide&lt;/a&gt; close for the future.&lt;/p&gt;





&lt;h3 id="281"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/281"&gt;#281&lt;/a&gt; Dynamic Kubelet configuration&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; node&lt;/p&gt;





&lt;p&gt;After being in beta since Kubernetes 1.11, the Kubernetes team &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/kubernetes/issues/100799"&gt;has decided&lt;/a&gt; to deprecate &lt;code&gt;DynamicKubeletConfig&lt;/code&gt; instead of continuing its development.&lt;/p&gt;





&lt;p&gt;This feature was marked for deprecation in 1.21, then removed from the Kubelet in 1.24. Now in 1.26, it has been &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/pull/3605/files#diff-138ec4a122ef9ea3b885191796faf63ca6511747e4be18840dd67ffa2a386d1d"&gt;completely removed from Kubernetes&lt;/a&gt;.&lt;/p&gt;





&lt;h2 id="api"&gt;Kubernetes 1.26 API&lt;/h2&gt;





&lt;h3 id="3352"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3352"&gt;#3352&lt;/a&gt; Aggregated discovery&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; api-machinery&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;AggregatedDiscoveryEndpoint&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;Every Kubernetes client, like &lt;code&gt;kubectl&lt;/code&gt;, needs to discover which APIs, and which versions of those APIs, are available in the &lt;code&gt;kube-apiserver&lt;/code&gt;. For that, they need to make a request for each API group and version, which causes a storm of requests.&lt;/p&gt;





&lt;p&gt;This enhancement aims to reduce all those calls to just two.&lt;/p&gt;





&lt;p&gt;Clients can include &lt;code&gt;as=APIGroupDiscoveryList&lt;/code&gt; in the &lt;code&gt;Accept&lt;/code&gt; header of their requests to the &lt;code&gt;/api&lt;/code&gt; and &lt;code&gt;/apis&lt;/code&gt; endpoints. Then, the server will return an aggregated document (&lt;code&gt;APIGroupDiscoveryList&lt;/code&gt;) with all the available APIs and their versions.&lt;/p&gt;
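
&lt;p&gt;As an illustrative sketch, with the feature gate enabled and the API server proxied locally via &lt;code&gt;kubectl proxy&lt;/code&gt;, the aggregated document can be requested directly. The exact &lt;code&gt;Accept&lt;/code&gt; parameters below follow the KEP and may change while the feature is alpha:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ kubectl proxy --port=8001 &amp;amp;
$ curl -H 'Accept: application/json;g=apidiscovery.k8s.io;v=v2beta1;as=APIGroupDiscoveryList' \
    http://localhost:8001/apis
&lt;/code&gt;&lt;/pre&gt;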





&lt;h3 id="3488"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3488"&gt;#3488&lt;/a&gt; CEL for admission control&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; api-machinery&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;ValidatingAdmissionPolicy&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;Building on &lt;a href="https://sysdig.com/blog/kubernetes-1-25-whats-new/#2876" rel="noopener noreferrer"&gt;#2876 CRD validation expression language&lt;/a&gt; from Kubernetes 1.25, this enhancement provides a new admission controller type (&lt;code&gt;ValidatingAdmissionPolicy&lt;/code&gt;) that allows implementing some validations without relying on webhooks.&lt;/p&gt;





&lt;p&gt;These new policies can be defined like:&lt;/p&gt;





&lt;pre&gt;&lt;code&gt;apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "demo-policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
    - expression: "object.spec.replicas &amp;lt;= 5"
&lt;/code&gt;&lt;/pre&gt;





&lt;p&gt;This policy would reject the creation or update of any matching Deployment with more than &lt;code&gt;5&lt;/code&gt; replicas.&lt;/p&gt;





&lt;p&gt;Discover the full power of this feature &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/website/blob/9a8c421c1e18dd9485788f1ffc23944c41e91483/content/en/docs/reference/access-authn-authz/validating-admission-policy.md"&gt;in the docs&lt;/a&gt;.&lt;/p&gt;





&lt;h3 id="1965"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/1965"&gt;#1965&lt;/a&gt; &lt;em&gt;kube-apiserver&lt;/em&gt; identity&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; api-machinery&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;APIServerIdentity&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;In order to better control which &lt;em&gt;kube-apiservers&lt;/em&gt; are alive in a high availability cluster, a new lease / heartbeat system has been implemented.&lt;/p&gt;





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-20/#1965" rel="noopener noreferrer"&gt;What's new in Kubernetes 1.20&lt;/a&gt;" article.&lt;/p&gt;





&lt;h2 id="apps"&gt;Apps in Kubernetes 1.26&lt;/h2&gt;





&lt;h3 id="3017"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3017"&gt;#3017&lt;/a&gt; &lt;em&gt;PodHealthyPolicy&lt;/em&gt; for &lt;em&gt;PodDisruptionBudget&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; apps&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;PDBUnhealthyPodEvictionPolicy&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;&lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/docs/tasks/run-application/configure-pdb/"&gt;PodDisruptionBudget&lt;/a&gt;&lt;/em&gt; allows you to communicate some minimums to your cluster administrator to make maintenance tasks easier, like "Do not destroy more than one of these" or "Keep at least two of these alive".&lt;/p&gt;

&lt;p&gt;However, this only takes into account whether the pods are running, not whether they are healthy. It may happen that your pods are Running but not Ready, and a &lt;em&gt;PodDisruptionBudget&lt;/em&gt; may be preventing their eviction.&lt;/p&gt;

&lt;p&gt;This enhancement expands these budget definitions with the &lt;code&gt;status.currentHealthy&lt;/code&gt;, &lt;code&gt;status.desiredHealthy&lt;/code&gt;, and &lt;code&gt;spec.unhealthyPodEvictionPolicy&lt;/code&gt; extra fields to help you define how to manage unhealthy pods.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ kubectl get poddisruptionbudgets example-pdb -o yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
[...]
spec:
  unhealthyPodEvictionPolicy: IfHealthyBudget
[...]
status:
  currentHealthy: 3
  desiredHealthy: 2
  disruptionsAllowed: 1
  expectedPods: 3
  observedGeneration: 1
&lt;/code&gt;&lt;/pre&gt;
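
&lt;p&gt;To opt out of the default behavior, set the policy in the budget's spec. A minimal sketch, where the name and selector are placeholders:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: example
  # AlwaysAllow permits evicting Running-but-not-Ready pods,
  # even when the budget would otherwise block the eviction.
  unhealthyPodEvictionPolicy: AlwaysAllow
&lt;/code&gt;&lt;/pre&gt;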

&lt;h3 id="3335"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3335"&gt;#3335&lt;/a&gt; Allow &lt;em&gt;StatefulSet&lt;/em&gt; to control start replica ordinal numbering&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; apps&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;StatefulSetStartOrdinal&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;&lt;em&gt;StatefulSets&lt;/em&gt; in Kubernetes currently number their pods using ordinal numbers, with the first replica being &lt;code&gt;0&lt;/code&gt; and the last being &lt;code&gt;spec.replicas - 1&lt;/code&gt;.&lt;/p&gt;





&lt;p&gt;This enhancement adds a new struct with a single field to the &lt;em&gt;StatefulSet&lt;/em&gt; manifest spec, &lt;code&gt;spec.ordinals.start&lt;/code&gt;, which lets you define the starting number for the replicas controlled by the &lt;em&gt;StatefulSet&lt;/em&gt;.&lt;/p&gt;





&lt;p&gt;This is useful, for example, in cross-namespace or cross-cluster migrations of a &lt;em&gt;StatefulSet&lt;/em&gt;, where a clever use of &lt;em&gt;PodDisruptionBudgets&lt;/em&gt; (and multi-cluster services) allows a controlled rolling migration of the replicas without any downtime for the &lt;em&gt;StatefulSet&lt;/em&gt;.&lt;/p&gt;
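
&lt;p&gt;A minimal sketch of the new field, with a placeholder name and replica count:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example
spec:
  replicas: 3
  ordinals:
    # Replicas will be named example-5, example-6, and example-7
    # instead of example-0 through example-2.
    start: 5
[...]
&lt;/code&gt;&lt;/pre&gt;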





&lt;h3 id="3329"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3329"&gt;#3329&lt;/a&gt; Retriable and non-retriable &lt;em&gt;Pod&lt;/em&gt; failures for &lt;em&gt;Jobs&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; apps&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;JobPodFailurePolicy&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;PodDisruptionConditions&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This enhancement allows us to configure a &lt;code&gt;.spec.podFailurePolicy&lt;/code&gt; on the &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/docs/concepts/workloads/controllers/job/"&gt;Jobs&lt;/a&gt;'s spec that determines whether the Job should be retried or not in case of failure. This way, Kubernetes can terminate Jobs early, avoiding increasing the backoff time in case of infrastructure failures or application errors.&lt;/p&gt;
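
&lt;p&gt;A sketch of a Job spec using this policy (the exit code and image name are illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  backoffLimit: 6
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: example-image
  podFailurePolicy:
    rules:
    # Fail the whole Job immediately on a non-retriable
    # application error, instead of retrying up to backoffLimit.
    - action: FailJob
      onExitCodes:
        containerName: main
        operator: In
        values: [42]
    # Do not count pod disruptions (e.g., a node drain)
    # against the backoff limit.
    - action: Ignore
      onPodConditions:
      - type: DisruptionTarget
&lt;/code&gt;&lt;/pre&gt;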

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-25-whats-new/#3329" rel="noopener noreferrer"&gt;What's new in Kubernetes 1.25&lt;/a&gt;" article.&lt;/p&gt;

&lt;h3 id="2307"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/2307"&gt;#2307&lt;/a&gt; &lt;em&gt;Job&lt;/em&gt; tracking without lingering &lt;em&gt;Pods&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; apps&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;JobTrackingWithFinalizers&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;With this enhancement, Jobs will be able to remove completed pods earlier, freeing resources in the cluster.&lt;/p&gt;

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/#2307" rel="noopener noreferrer"&gt;Kubernetes 1.22 - What's new?&lt;/a&gt;" article.&lt;/p&gt;

&lt;h2 id="auth"&gt;Kubernetes 1.26 Auth&lt;/h2&gt;

&lt;h3 id="3325"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3325"&gt;#3325&lt;/a&gt; Auth &lt;em&gt;API&lt;/em&gt; to get self user attributes&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; auth&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;APISelfSubjectAttributesReview&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This new feature is extremely useful when a complicated authentication flow is used in a Kubernetes cluster and you want to know your resulting &lt;code&gt;userInfo&lt;/code&gt; after all authentication mechanisms are applied.&lt;/p&gt;





&lt;p&gt;Executing &lt;code&gt;kubectl alpha auth whoami&lt;/code&gt; will produce the following output:&lt;/p&gt;





&lt;pre&gt;&lt;code&gt;apiVersion: authentication.k8s.io/v1alpha1
kind: SelfSubjectReview
status:
  userInfo:
    username: jane.doe
    uid: b79dbf30-0c6a-11ed-861d-0242ac120002
    groups:
    - students
    - teachers
    - system:authenticated
    extra:
      skills:
      - reading
      - learning
      subjects:
      - math
      - sports
&lt;/code&gt;&lt;/pre&gt;





&lt;p&gt;In summary, we can now make a typical &lt;em&gt;/me&lt;/em&gt; request to find out our own user attributes once we are authenticated in the cluster.&lt;/p&gt;





&lt;h3 id="2799"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/2799"&gt;#2799&lt;/a&gt; Reduction of secret-based service account tokens&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; auth&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;LegacyServiceAccountTokenNoAutoGeneration&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;API credentials are now obtained through the &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/"&gt;TokenRequest API&lt;/a&gt;, stable since &lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/#542" rel="noopener noreferrer"&gt;Kubernetes 1.22&lt;/a&gt;, and are mounted into Pods using a projected volume. They are automatically invalidated when their associated Pod is deleted.&lt;/p&gt;





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-24-whats-new/#2799" rel="noopener noreferrer"&gt;Kubernetes 1.24 - What's new?&lt;/a&gt;" article.&lt;/p&gt;





&lt;h2 id="network"&gt;Network in Kubernetes 1.26&lt;/h2&gt;





&lt;h3 id="3453"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3453"&gt;#3453&lt;/a&gt; Minimizing &lt;em&gt;iptables-restore&lt;/em&gt; input size&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; network&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;MinimizeIPTablesRestore&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This enhancement aims to improve the performance of &lt;code&gt;kube-proxy&lt;/code&gt;. It will do so by only sending the rules that have changed on the calls to &lt;code&gt;iptables-restore&lt;/code&gt;, instead of the whole set of rules.&lt;/p&gt;
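
&lt;p&gt;Since the feature is alpha, it has to be enabled explicitly on &lt;code&gt;kube-proxy&lt;/code&gt;; a sketch (how the flag is passed depends on how your cluster deploys kube-proxy):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kube-proxy --feature-gates=MinimizeIPTablesRestore=true [...]
&lt;/code&gt;&lt;/pre&gt;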





&lt;h3 id="1669"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/1669"&gt;#1669&lt;/a&gt; &lt;em&gt;Proxy&lt;/em&gt; terminating endpoints&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; network&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;ProxyTerminatingEndpoints&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This enhancement prevents traffic drops during rolling updates by sending all external traffic to both ready and not ready terminating endpoints (preferring the ready ones).&lt;/p&gt;





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/#1669" rel="noopener noreferrer"&gt;Kubernetes 1.22 - What's new?&lt;/a&gt;" article.&lt;/p&gt;





&lt;h3 id="2595"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/2595"&gt;#2595&lt;/a&gt; Expanded DNS configuration&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; network&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;ExpandedDNSConfig&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;With this enhancement, Kubernetes allows up to 32 DNS search domains, and an increased number of characters for the search path (up to 2048), to keep up with recent DNS resolvers.&lt;/p&gt;





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/#2595" rel="noopener noreferrer"&gt;Kubernetes 1.22 - What's new?&lt;/a&gt;" article.&lt;/p&gt;





&lt;h3 id="1435"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/1435"&gt;#1435&lt;/a&gt; Support of mixed protocols in &lt;em&gt;Services&lt;/em&gt; with &lt;em&gt;type=LoadBalancer&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; network&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;MixedProtocolLBService&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This enhancement allows a LoadBalancer Service to serve different protocols under the same port (UDP, TCP). For example, serving both UDP and TCP requests for a DNS or SIP server on the same port.&lt;/p&gt;
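
&lt;p&gt;A sketch of such a Service for a DNS server, with placeholder names:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: example-dns
spec:
  type: LoadBalancer
  selector:
    app: example-dns
  ports:
  # Both protocols are served on the same port number.
  - name: dns-tcp
    protocol: TCP
    port: 53
  - name: dns-udp
    protocol: UDP
    port: 53
&lt;/code&gt;&lt;/pre&gt;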

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-20/#1435" rel="noopener noreferrer"&gt;Kubernetes 1.20 - What's new?&lt;/a&gt;" article.&lt;/p&gt;

&lt;h3 id="2086"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/2086"&gt;#2086&lt;/a&gt; Service internal traffic policy&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; network&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;ServiceInternalTrafficPolicy&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;You can now set the &lt;code&gt;spec.internalTrafficPolicy&lt;/code&gt; field on &lt;code&gt;Service&lt;/code&gt; objects to optimize your cluster traffic:&lt;/p&gt;





&lt;ul&gt;
&lt;li&gt;With &lt;code&gt;Cluster&lt;/code&gt; (the default), the routing will behave as usual.&lt;/li&gt;



&lt;li&gt;With &lt;code&gt;Local&lt;/code&gt;, traffic will only be sent to endpoints on the same node as the client, and dropped if there are none.&lt;/li&gt;
&lt;/ul&gt;
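
&lt;p&gt;A sketch with placeholder names:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: example-local
spec:
  selector:
    app: example
  ports:
  - port: 80
  # Only route internal traffic to endpoints on the same node
  # as the client pod.
  internalTrafficPolicy: Local
&lt;/code&gt;&lt;/pre&gt;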





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-21-whats-new/#2086" rel="noopener noreferrer"&gt;Kubernetes 1.21 - What's new?&lt;/a&gt;" article.&lt;/p&gt;





&lt;h3 id="3070"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3070"&gt;#3070&lt;/a&gt; Reserve service IP ranges for dynamic and static IP allocation&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; network&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;ServiceIPStaticSubrange&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This update to the &lt;code&gt;--service-cluster-ip-range&lt;/code&gt; flag will lower the risk of IP conflicts between Services using static and dynamic IP allocation, while remaining backwards compatible.&lt;/p&gt;





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-24-whats-new/#3070" rel="noopener noreferrer"&gt;What's new in Kubernetes 1.24&lt;/a&gt;" article.&lt;/p&gt;





&lt;h2 id="nodes"&gt;Kubernetes 1.26 Nodes&lt;/h2&gt;





&lt;h3 id="2371"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/2371"&gt;#2371&lt;/a&gt; cAdvisor-less, CRI-full container and Pod stats&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Major change to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; node&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;PodAndContainerStatsFromCRI&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This enhancement summarizes the efforts to retrieve all the stats about running containers and pods from the &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/"&gt;Container Runtime Interface (CRI)&lt;/a&gt;, removing the dependencies from cAdvisor.&lt;/p&gt;





&lt;p&gt;Starting with 1.26, when this feature gate is enabled, the metrics on &lt;code&gt;/metrics/cadvisor&lt;/code&gt; are gathered through the CRI instead of cAdvisor.&lt;/p&gt;





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-23-whats-new/#2371" rel="noopener noreferrer"&gt;Kubernetes 1.23 - What's new?&lt;/a&gt;" article.&lt;/p&gt;





&lt;h3 id="3063"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3063"&gt;#3063&lt;/a&gt; Dynamic resource allocation&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; node&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;DynamicResourceAllocation&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;Traditionally, the Kubernetes scheduler could only take into account &lt;a href="https://sysdig.com/blog/kubernetes-limits-requests/" rel="noopener noreferrer"&gt;CPU and memory limits and requests&lt;/a&gt;. Later on, the scheduler was expanded to also take storage and other resources into account. However, this is limiting in many scenarios.&lt;/p&gt;





&lt;p&gt;For example, what if the device needs initialization and cleanup, like an FPGA; or what if you want to limit the access to the resource, like a shared GPU?&lt;/p&gt;





&lt;p&gt;This new API covers those scenarios of resource allocation and dynamic detection, using the new &lt;code&gt;ResourceClaimTemplate&lt;/code&gt; and &lt;code&gt;ResourceClass&lt;/code&gt; objects, and the new &lt;code&gt;resourceClaims&lt;/code&gt; field inside &lt;em&gt;Pods&lt;/em&gt;.&lt;/p&gt;





&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Pod
[...]
spec:
  resourceClaims:
  - name: resource0
    source:
      resourceClaimTemplateName: resource-claim-template
  - name: resource1
    source:
      resourceClaimTemplateName: resource-claim-template
[...]
&lt;/code&gt;&lt;/pre&gt;





&lt;p&gt;The scheduler can keep track of these resource claims, and only schedule &lt;em&gt;Pods&lt;/em&gt; in those nodes with enough resources available.&lt;/p&gt;
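
&lt;p&gt;The claim template referenced by the Pod points at objects that a vendor's resource driver would define. A sketch, where the class name and driver name are placeholders:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClass
metadata:
  name: example-class
# The third-party driver responsible for allocating this resource.
driverName: resources.example.com
---
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClaimTemplate
metadata:
  name: resource-claim-template
spec:
  spec:
    resourceClassName: example-class
&lt;/code&gt;&lt;/pre&gt;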





&lt;h3 id="3386"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3386"&gt;#3386&lt;/a&gt; Kubelet evented &lt;em&gt;PLEG&lt;/em&gt; for better performance&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; node&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;EventedPLEG&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;The aim of this enhancement is to reduce the CPU usage of the &lt;code&gt;kubelet&lt;/code&gt; when keeping track of all the pod states.&lt;/p&gt;





&lt;p&gt;It will partially reduce the periodic polling that the &lt;code&gt;kubelet&lt;/code&gt; performs, instead relying on notifications from the Container Runtime Interface (CRI) as much as possible.&lt;/p&gt;





&lt;p&gt;If you are interested in the implementation details, you may want to &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3386-kubelet-evented-pleg/README.md"&gt;take a look at the KEP&lt;/a&gt;.&lt;/p&gt;





&lt;h3 id="3545"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3545"&gt;#3545&lt;/a&gt; Improved &lt;em&gt;multi-NUMA&lt;/em&gt; alignment in topology manager&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; node&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;TopologyManagerPolicyOptions&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;TopologyManagerPolicyBetaOptions&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;TopologyManagerPolicyAlphaOptions&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This is an improvement for &lt;em&gt;TopologyManager&lt;/em&gt; to better handle Non-Uniform Memory Access (&lt;a rel="noopener nofollow noreferrer" href="https://en.wikipedia.org/wiki/Non-uniform_memory_access"&gt;NUMA&lt;/a&gt;) nodes. For some high-performance workloads, it is very important to control in which physical CPU cores they run. You can significantly improve performance if you avoid memory jumping between the caches of the same chip, or between sockets.&lt;/p&gt;





&lt;p&gt;A new &lt;code&gt;--topology-manager-policy-options&lt;/code&gt; flag for the &lt;code&gt;kubelet&lt;/code&gt; will let you pass options that modify the behavior of the Topology Manager.&lt;/p&gt;





&lt;p&gt;Currently, only one alpha option is available:&lt;/p&gt;





&lt;ul&gt;
&lt;li&gt;When &lt;code&gt;prefer-closest-numa-nodes=true&lt;/code&gt; is passed along, the Topology Manager will align the resources on either a single NUMA node or the minimum number of NUMA nodes possible.&lt;/li&gt;
&lt;/ul&gt;
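
&lt;p&gt;A sketch of how this option could be passed to the &lt;code&gt;kubelet&lt;/code&gt; (the policy choice here is illustrative, since the option is only meaningful with a non-default Topology Manager policy):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kubelet --topology-manager-policy=best-effort \
  --topology-manager-policy-options=prefer-closest-numa-nodes=true [...]
&lt;/code&gt;&lt;/pre&gt;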





&lt;p&gt;As new options may be added in the future, several feature gates have been added so you can choose to focus only on the stable ones:&lt;/p&gt;





&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;TopologyManagerPolicyOptions&lt;/code&gt;: Will enable the &lt;code&gt;topology-manager-policy-options&lt;/code&gt; flag and the stable options.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;TopologyManagerPolicyBetaOptions&lt;/code&gt;: Will also enable the beta options.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;TopologyManagerPolicyAlphaOptions&lt;/code&gt;: Will also enable the alpha options.&lt;/li&gt;
&lt;/ul&gt;





&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt; &lt;a href="https://sysdig.com/blog/kubernetes-1-23-whats-new/#2902" rel="noopener noreferrer"&gt;#2902 CPUManager policy option to distribute CPUs across NUMA nodes in Kubernetes 1.23&lt;/a&gt;.&lt;br&gt;&lt;strong&gt;Related:&lt;/strong&gt; &lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/#2625" rel="noopener noreferrer"&gt;#2625 New CPU Manager Policies in Kubernetes 1.22&lt;/a&gt;.&lt;/p&gt;





&lt;h3 id="2133"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/2133"&gt;#2133&lt;/a&gt; Kubelet credential provider&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; node&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;KubeletCredentialProviders&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This enhancement replaces in-tree container image registry credential providers with a new mechanism that is external and pluggable.&lt;/p&gt;





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-20/#2133" rel="noopener noreferrer"&gt;Kubernetes 1.20 - What's new?&lt;/a&gt;" article.&lt;/p&gt;





&lt;h3 id="3570"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3570"&gt;#3570&lt;/a&gt; Graduate to &lt;em&gt;CPUManager&lt;/em&gt; to GA&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; node&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;CPUManager&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;The CPUManager is the Kubelet component responsible for assigning pod containers to sets of CPUs on the local node.&lt;/p&gt;





&lt;p&gt;It was introduced in Kubernetes 1.8, and graduated to beta in release 1.10. For 1.26, the core CPUManager has been deemed stable, while experimentation continues with the additional work on its policies.&lt;/p&gt;





&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt; &lt;a href="http://sysdig.com/blog/kubernetes-1-26-whats-new/#3545" rel="noopener noreferrer"&gt;#3545 Improved multi-numa alignment in Topology Manager in Kubernetes 1.26&lt;/a&gt;.&lt;br&gt;&lt;strong&gt;Related:&lt;/strong&gt; &lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/#2625" rel="noopener noreferrer"&gt;#2625 New CPU Manager Policies in Kubernetes 1.22&lt;/a&gt;.&lt;/p&gt;





&lt;h3 id="3573"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3573"&gt;#3573&lt;/a&gt; Graduate &lt;em&gt;DeviceManager&lt;/em&gt; to GA&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; node&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;DevicePlugins&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;The DeviceManager in the Kubelet is the component managing the interactions with the different Device Plugins.&lt;/p&gt;





&lt;p&gt;Initially introduced in Kubernetes 1.8 and moved to beta stage in release 1.10, the Device Plugin framework saw widespread adoption and is finally moving to GA in 1.26.&lt;/p&gt;





&lt;p&gt;This framework allows the use of external devices (e.g., &lt;a rel="noopener nofollow noreferrer" href="https://github.com/NVIDIA/k8s-device-plugin"&gt;NVIDIA GPUs&lt;/a&gt;, &lt;a rel="noopener nofollow noreferrer" href="https://github.com/RadeonOpenCompute/k8s-device-plugin"&gt;AMD GPUs&lt;/a&gt;, &lt;a rel="noopener nofollow noreferrer" href="https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin"&gt;SR-IOV NICs&lt;/a&gt;) without modifying core Kubernetes components.&lt;/p&gt;





&lt;h2 id="scheduling"&gt;Scheduling in Kubernetes 1.26&lt;/h2&gt;





&lt;h3 id="3521"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3521"&gt;#3521&lt;/a&gt; &lt;em&gt;Pod&lt;/em&gt; scheduling readiness&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; scheduling&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;PodSchedulingReadiness&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This enhancement aims to optimize scheduling by letting the Pods define when they are ready to be actually scheduled.&lt;/p&gt;





&lt;p&gt;Not all &lt;a href="https://sysdig.com/blog/kubernetes-pod-pending-problems/" rel="noopener noreferrer"&gt;pending Pods&lt;/a&gt; are ready to be scheduled. Some of them lack essential resources for some time, which causes extra work in the scheduler as it repeatedly tries to place them.&lt;/p&gt;





&lt;p&gt;The new &lt;code&gt;.spec.schedulingGates&lt;/code&gt; field of a Pod lets you indicate when it is ready for scheduling:&lt;/p&gt;





&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Pod
[...]
spec:
  schedulingGates:
  - name: foo
  - name: bar
[...]
&lt;/code&gt;&lt;/pre&gt;





&lt;p&gt;When any scheduling gate is present, the Pod won't be scheduled.&lt;/p&gt;





&lt;p&gt;You can check the status with:&lt;/p&gt;





&lt;pre&gt;&lt;code&gt;$ kubectl get pod test-pod
NAME       READY   STATUS            RESTARTS   AGE
test-pod   0/1     SchedulingGated   0          7s
&lt;/code&gt;&lt;/pre&gt;
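&lt;p&gt;Once the external controller decides the Pod is ready, it clears the gates and scheduling proceeds. As a minimal sketch (assuming the &lt;code&gt;test-pod&lt;/code&gt; and gate names from the snippets above), one way to lift them is a JSON patch that drops the field:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Remove all scheduling gates; the scheduler will then consider the Pod
$ kubectl patch pod test-pod --type=json \
  -p='[{"op": "remove", "path": "/spec/schedulingGates"}]'
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note that scheduling gates can only be removed from an existing Pod, never added.&lt;/p&gt;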





&lt;h3 id="3094"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3094"&gt;#3094&lt;/a&gt; Take taints/tolerations into consideration when calculating &lt;em&gt;PodTopologySpread&lt;/em&gt; skew&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; scheduling&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;NodeInclusionPolicyInPodTopologySpread&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;As we discussed in our "&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-16/#895" rel="noopener noreferrer"&gt;Kubernetes 1.16 - What's new?&lt;/a&gt;" article, the &lt;code&gt;topologySpreadConstraints&lt;/code&gt; fields, along with &lt;code&gt;maxSkew&lt;/code&gt;, allow you to spread your workloads across nodes. A new &lt;code&gt;NodeInclusionPolicies&lt;/code&gt; field allows taking into account &lt;code&gt;NodeAffinity&lt;/code&gt; and &lt;code&gt;NodeTaint&lt;/code&gt; when calculating this pod topology spread skew.&lt;/p&gt;
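&lt;p&gt;The new policies are set per constraint. A minimal sketch of how they fit into a Pod spec (the label and values are illustrative; both fields accept &lt;code&gt;Honor&lt;/code&gt; or &lt;code&gt;Ignore&lt;/code&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: myapp
    nodeAffinityPolicy: Honor   # respect nodeAffinity/nodeSelector when computing skew
    nodeTaintsPolicy: Honor     # exclude tainted nodes the Pod does not tolerate
&lt;/code&gt;&lt;/pre&gt;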





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-25-whats-new/#3094" rel="noopener noreferrer"&gt;What's new in Kubernetes 1.25&lt;/a&gt;" article.&lt;/p&gt;





&lt;h2 id="storage"&gt;Kubernetes 1.26 storage&lt;/h2&gt;





&lt;h3 id="3294"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3294"&gt;#3294&lt;/a&gt; Provision volumes from cross-namespace snapshots&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; storage&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;CrossNamespaceVolumeDataSource&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;Prior to Kubernetes 1.26, users were able to provision volumes from snapshots thanks to the &lt;code&gt;VolumeSnapshot&lt;/code&gt; feature. While this is a great and super useful feature, it had some limitations, like the inability to bind a &lt;code&gt;PersistentVolumeClaim&lt;/code&gt; to &lt;code&gt;VolumeSnapshots&lt;/code&gt; from other namespaces.&lt;/p&gt;





&lt;p&gt;This enhancement breaks this limitation and allows Kubernetes users to provision volumes from snapshots across namespaces.&lt;/p&gt;





&lt;p&gt;If you want to use the cross-namespace VolumeSnapshot feature, you’ll have to first create a &lt;code&gt;ReferenceGrant&lt;/code&gt; object, and then a &lt;code&gt;PersistentVolumeClaim&lt;/code&gt; binding to the &lt;code&gt;VolumeSnapshot&lt;/code&gt;. Here, you’ll find a simple example of both objects for learning purposes.&lt;/p&gt;





&lt;pre&gt;&lt;code&gt;---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: ReferenceGrant
metadata:
  name: test
  namespace: default
spec:
  from:
  - group: ""
    kind: PersistentVolumeClaim
    namespace: nstest1
  to:
  - group: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: testsnapshot
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testvolumeclaim
  namespace: nstest1
spec:
  storageClassName: mystorageclass
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: testsnapshot
    namespace: default
  volumeMode: Filesystem
&lt;/code&gt;&lt;/pre&gt;





&lt;h3 id="2268"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/2268"&gt;#2268&lt;/a&gt; Non-graceful node shutdown&lt;/h3&gt;





&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; storage&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;NodeOutOfServiceVolumeDetach&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;





&lt;p&gt;This enhancement addresses node shutdown cases that are not detected properly, where the pods that are part of a &lt;em&gt;StatefulSet&lt;/em&gt; will be stuck in terminating status on the shutdown node and cannot be moved to a new running node.&lt;/p&gt;





&lt;p&gt;In this case, the pods will be forcefully deleted, triggering the deletion of the &lt;em&gt;VolumeAttachments&lt;/em&gt;, and new pods will be created on a different running node so that the application can continue to function.&lt;/p&gt;
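&lt;p&gt;The mechanism is opt-in per incident: after confirming the node is really down, a cluster administrator applies the out-of-service taint, which tells Kubernetes it is safe to detach the volumes and delete the stuck pods (the node name below is illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Mark a node as out of service after a hard, undetected shutdown
$ kubectl taint nodes node-1 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
&lt;/code&gt;&lt;/pre&gt;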





&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-24-whats-new/#2268" rel="noopener noreferrer"&gt;Kubernetes 1.24 - What's new?&lt;/a&gt;" article.&lt;/p&gt;





&lt;h3 id="3333"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3333"&gt;#3333&lt;/a&gt; Retroactive default &lt;em&gt;StorageClass&lt;/em&gt; &lt;em&gt;assignement&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; storage&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;RetroactiveDefaultStorageClass&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This enhancement helps manage the case when cluster administrators change the default storage class. All &lt;em&gt;PVCs&lt;/em&gt; without &lt;em&gt;StorageClass&lt;/em&gt; that were created while the change took place will retroactively be set to the new default &lt;em&gt;StorageClass&lt;/em&gt;.&lt;/p&gt;
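&lt;p&gt;As a reminder, the cluster default is controlled by an annotation on the &lt;em&gt;StorageClass&lt;/em&gt;. A minimal sketch of switching it (the class name is illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Make "mystorageclass" the cluster default; with this feature enabled,
# PVCs created without a StorageClass while no default existed are updated retroactively
$ kubectl patch storageclass mystorageclass \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
&lt;/code&gt;&lt;/pre&gt;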

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-25-whats-new/#3333" rel="noopener noreferrer"&gt;What's new in Kubernetes 1.25&lt;/a&gt;" article.&lt;/p&gt;

&lt;h3 id="1491"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/1491"&gt;#1491&lt;/a&gt; vSphere &lt;em&gt;in-tree&lt;/em&gt; to CSI driver migration&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; storage&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;CSIMigrationvSphere&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;As we covered in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-23-whats-new/#1487" rel="noopener noreferrer"&gt;What's new in Kubernetes 1.23&lt;/a&gt;" article, the CSI driver for vSphere has been stable for some time. All plugin operations for &lt;code&gt;vspherevolume&lt;/code&gt; are now redirected to &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes-sigs/vsphere-csi-driver"&gt;the out-of-tree 'csi.vsphere.vmware.com' driver&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This enhancement is part of the &lt;a href="https://sysdig.com/blog/kubernetes-1-25-whats-new/#625" rel="noopener noreferrer"&gt;#625 In-tree storage plugin to CSI Driver Migration&lt;/a&gt; effort.&lt;/p&gt;

&lt;h3 id="1885"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/1885"&gt;#1885&lt;/a&gt; Azure file &lt;em&gt;in-tree&lt;/em&gt; to CSI driver migration&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; storage&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;CSIMigrationAzureFile&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This enhancement summarizes &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/docs/concepts/storage/volumes/#azurefile"&gt;the work to move Azure File code&lt;/a&gt; out of the main Kubernetes binaries (out-of-tree).&lt;/p&gt;

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-21-whats-new/#1885" rel="noopener noreferrer"&gt;Kubernetes 1.21 - What's new?&lt;/a&gt;" article.&lt;/p&gt;

&lt;h3 id="2317"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/2317"&gt;#2317&lt;/a&gt; Allow Kubernetes to supply pod's &lt;em&gt;fsgroup&lt;/em&gt; to CSI driver on mount&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; storage&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;DelegateFSGroupToCSIDriver&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This enhancement proposes providing the CSI driver with the &lt;em&gt;fsgroup&lt;/em&gt; of the pods as an explicit field, so the CSI driver can be the one applying this natively on mount time.&lt;/p&gt;

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/#2317" rel="noopener noreferrer"&gt;Kubernetes 1.22 - What's new?&lt;/a&gt;" article.&lt;/p&gt;

&lt;h2 id="other"&gt;Other enhancements in Kubernetes 1.26&lt;/h2&gt;

&lt;h3 id="3466"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3466"&gt;#3466&lt;/a&gt; Kubernetes component health SLIs&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; instrumentation&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;ComponentSLIs&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;There isn't a standard format to query the health data of Kubernetes components.&lt;/p&gt;

&lt;p&gt;Starting with Kubernetes 1.26, a new endpoint &lt;code&gt;/metrics/slis&lt;/code&gt; will be available on each component exposing their Service Level Indicator (SLI) metrics in Prometheus format.&lt;/p&gt;

&lt;p&gt;For each component, two metrics will be exposed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;gauge&lt;/strong&gt;, representing the current state of the healthcheck.&lt;/li&gt;



&lt;li&gt;A &lt;strong&gt;counter&lt;/strong&gt;, recording the cumulative counts observed for each healthcheck state.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this information, you can check the status of the Kubernetes internals over time, e.g.:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kubernetes_healthcheck{name="etcd",type="readyz"}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And create an alert for when something's wrong, e.g.:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kubernetes_healthchecks_total{name="etcd",status="error",type="readyz"} &amp;gt; 0&lt;/code&gt;&lt;/pre&gt;
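&lt;p&gt;For instance, the expression above could be wired into a Prometheus alerting rule (group and alert names are illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;groups:
- name: kubernetes-slis
  rules:
  - alert: EtcdHealthcheckFailing
    expr: kubernetes_healthchecks_total{name="etcd",status="error",type="readyz"} &amp;gt; 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: etcd readyz healthcheck is reporting errors
&lt;/code&gt;&lt;/pre&gt;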

&lt;h3 id="3498"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3498"&gt;#3498&lt;/a&gt; Extend metrics stability&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; instrumentation&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; N/A&lt;/p&gt;

&lt;p&gt;Metrics in Kubernetes are classified as &lt;code&gt;alpha&lt;/code&gt; or &lt;code&gt;stable&lt;/code&gt;. The &lt;code&gt;stable&lt;/code&gt; ones are guaranteed to be maintained, providing you with the information to prepare your dashboards so they don't break unexpectedly when you upgrade your cluster.&lt;/p&gt;

&lt;p&gt;In Kubernetes 1.26, two new classes are added:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;beta&lt;/code&gt;: For metrics related to beta features. They may change or disappear, but they are in a more advanced development state than the alpha ones.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;internal&lt;/code&gt;: Metrics for internal usage that you shouldn't worry about, either because they don't provide useful information for cluster administrators, or because they may change without notice.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can check a full &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/docs/reference/instrumentation/metrics/"&gt;list of available metrics in the documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt; &lt;a href="https://sysdig.com/blog/kubernetes-1-21-whats-new/#1209" rel="noopener noreferrer"&gt;#1209 Metrics stability enhancement in Kubernetes 1.21&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id="3515"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3515"&gt;#3515&lt;/a&gt; OpenAPI v3 for &lt;em&gt;kubectl&lt;/em&gt; explain&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; cli&lt;br&gt;&lt;strong&gt;Environment variable:&lt;/strong&gt; &lt;code&gt;KUBECTL_EXPLAIN_OPENAPIV3&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This enhancement allows &lt;code&gt;kubectl explain&lt;/code&gt; to gather the data from OpenAPIv3 instead of v2.&lt;/p&gt;

&lt;p&gt;In OpenAPIv3, some data can be represented in a better way, like &lt;em&gt;CustomResourceDefinition&lt;/em&gt;s (CRDs).&lt;/p&gt;

&lt;p&gt;Internal work is also being made to improve how &lt;code&gt;kubectl explain&lt;/code&gt; prints the output.&lt;/p&gt;
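&lt;p&gt;To try it out, opt in with the environment variable mentioned above (the field being explained is just an example):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Use the OpenAPI v3 schema when explaining a field
$ KUBECTL_EXPLAIN_OPENAPIV3=true kubectl explain pods.spec.schedulingGates
&lt;/code&gt;&lt;/pre&gt;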

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt; &lt;a href="https://sysdig.com/blog/kubernetes-1-24-whats-new/#2896" rel="noopener noreferrer"&gt;#2896 OpenAPI v3 in Kubernetes 1.24&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id="1440"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/1440"&gt;#1440&lt;/a&gt; &lt;em&gt;kubectl&lt;/em&gt; events&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; cli&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; N/A&lt;/p&gt;

&lt;p&gt;A new &lt;code&gt;kubectl events&lt;/code&gt; command is available that will enhance the current functionality of &lt;code&gt;kubectl get events&lt;/code&gt;.&lt;/p&gt;
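&lt;p&gt;For example, &lt;code&gt;kubectl events&lt;/code&gt; can filter by a single object and keep watching for new events (the pod name is illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Show events related to one pod and keep watching for new ones
$ kubectl events --for pod/web-1 --watch
&lt;/code&gt;&lt;/pre&gt;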

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-23-whats-new/#1440" rel="noopener noreferrer"&gt;Kubernetes 1.23 - What's new?&lt;/a&gt;" article.&lt;/p&gt;

&lt;h3 id="3031"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3031"&gt;#3031&lt;/a&gt; Signing release artifacts&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Beta&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; release&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; N/A&lt;/p&gt;

&lt;p&gt;This enhancement introduces a unified way to sign artifacts in order to help avoid &lt;a href="https://sysdig.com/blog/software-supply-chain-security/" rel="noopener noreferrer"&gt;supply chain attacks&lt;/a&gt;. It relies on the &lt;a rel="noopener nofollow noreferrer" href="https://www.sigstore.dev/"&gt;sigstore&lt;/a&gt; project tools, and more specifically &lt;code&gt;&lt;a rel="noopener nofollow noreferrer" href="https://github.com/sigstore/cosign"&gt;cosign&lt;/a&gt;&lt;/code&gt;. Although it doesn’t add new functionality, it will surely help to keep our cluster more protected.&lt;/p&gt;

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-24-whats-new/#3031" rel="noopener noreferrer"&gt;Kubernetes 1.24 - What's new?&lt;/a&gt;" article.&lt;/p&gt;

&lt;h3 id="3503"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/3503"&gt;#3503&lt;/a&gt; Host network support for Windows pods&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Net new to Alpha&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; windows&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;WindowsHostNetwork&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;false&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;There is an odd situation with Windows pods: you can set &lt;code&gt;hostNetwork=true&lt;/code&gt; for them, but it doesn't change anything. There isn't any platform impediment; the implementation was simply missing.&lt;/p&gt;

&lt;p&gt;Starting with Kubernetes 1.26, the &lt;code&gt;kubelet&lt;/code&gt; can now request that Windows pods use the host's network namespace instead of creating a new pod network namespace.&lt;/p&gt;

&lt;p&gt;This will come in handy to avoid port exhaustion when there are large numbers of services.&lt;/p&gt;
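&lt;p&gt;With the feature gate enabled, the usual field now takes effect on Windows nodes too. A minimal sketch (pod name and image are illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: host-net-win
spec:
  hostNetwork: true          # now honored on Windows when WindowsHostNetwork is enabled
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: app
    image: mcr.microsoft.com/windows/nanoserver:ltsc2022
&lt;/code&gt;&lt;/pre&gt;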

&lt;h3 id="1981"&gt;
&lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/enhancements/issues/1981"&gt;#1981&lt;/a&gt; Support for Windows privileged containers&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage:&lt;/strong&gt; Graduating to Stable&lt;br&gt;&lt;strong&gt;Feature group:&lt;/strong&gt; windows&lt;br&gt;&lt;strong&gt;Feature gate:&lt;/strong&gt; &lt;code&gt;WindowsHostProcessContainers&lt;/code&gt; &lt;strong&gt;Default value:&lt;/strong&gt; &lt;code&gt;true&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This enhancement brings the &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/docs/concepts/workloads/pods/#privileged-mode-for-containers"&gt;privileged containers&lt;/a&gt; feature available in Linux to Windows hosts.&lt;/p&gt;

&lt;p&gt;Privileged containers have access to the host, as if they were running directly on it. Although they are not recommended for most of the workloads, they are quite useful for administration, security, and monitoring purposes.&lt;/p&gt;

&lt;p&gt;Read more in our "&lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/#1981" rel="noopener noreferrer"&gt;Kubernetes 1.22 - What's new?&lt;/a&gt;" article.&lt;/p&gt;








&lt;p&gt;That’s all for Kubernetes 1.26, folks! Exciting as always; get ready to upgrade your clusters if you are intending to use any of these features.&lt;/p&gt;

&lt;p&gt;If you liked this, you might want to check out our previous ‘What’s new in Kubernetes’ editions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://sysdig.com/blog/kubernetes-1-26-whats-new/" rel="noopener noreferrer"&gt;Kubernetes 1.26 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/kubernetes-1-25-whats-new/" rel="noopener noreferrer"&gt;Kubernetes 1.25 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/kubernetes-1-24-whats-new/" rel="noopener noreferrer"&gt;Kubernetes 1.24 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/kubernetes-1-23-whats-new/" rel="noopener noreferrer"&gt;Kubernetes 1.23 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/kubernetes-1-22-whats-new/" rel="noopener noreferrer"&gt;Kubernetes 1.22 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/kubernetes-1-21-whats-new/" rel="noopener noreferrer"&gt;Kubernetes 1.21 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-20/" rel="noopener noreferrer"&gt;Kubernetes 1.20 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-19/" rel="noopener noreferrer"&gt;Kubernetes 1.19 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-18/" rel="noopener noreferrer"&gt;Kubernetes 1.18 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-17/" rel="noopener noreferrer"&gt;Kubernetes 1.17 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-16/" rel="noopener noreferrer"&gt;Kubernetes 1.16 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-15/" rel="noopener noreferrer"&gt;Kubernetes 1.15 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-14/" rel="noopener noreferrer"&gt;Kubernetes 1.14 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-in-kubernetes-1-13" rel="noopener noreferrer"&gt;Kubernetes 1.13 - What's new?&lt;/a&gt;&lt;/li&gt;



&lt;li&gt;&lt;a href="https://sysdig.com/blog/whats-new-in-kubernetes-1-12" rel="noopener noreferrer"&gt;Kubernetes 1.12 - What's new?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Get involved in the Kubernetes community:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Visit &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io"&gt;the project homepage&lt;/a&gt;.&lt;/li&gt;



&lt;li&gt;Check out &lt;a rel="noopener nofollow noreferrer" href="https://github.com/kubernetes/"&gt;the Kubernetes project on GitHub&lt;/a&gt;.&lt;/li&gt;



&lt;li&gt;Get involved &lt;a rel="noopener nofollow noreferrer" href="https://kubernetes.io/community/"&gt;with the Kubernetes community&lt;/a&gt;.&lt;/li&gt;



&lt;li&gt;Meet the maintainers &lt;a rel="noopener nofollow noreferrer" href="https://slack.k8s.io"&gt;on the Kubernetes Slack&lt;/a&gt;.&lt;/li&gt;



&lt;li&gt;Follow &lt;a rel="noopener nofollow noreferrer" href="https://twitter.com/kubernetesio"&gt;@KubernetesIO on Twitter&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And if you enjoy keeping up to date with the Kubernetes ecosystem, &lt;a href="https://go.sysdig.com/container-newsletter-signup.html" rel="noopener noreferrer"&gt;subscribe to our container newsletter&lt;/a&gt;, a monthly email with the coolest stuff happening in the cloud-native ecosystem.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>docker</category>
    </item>
    <item>
      <title>Understanding Kubernetes Limits and Requests</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Mon, 21 Nov 2022 08:43:18 +0000</pubDate>
      <link>https://dev.to/sysdig/understanding-kubernetes-limits-and-requests-5m1</link>
      <guid>https://dev.to/sysdig/understanding-kubernetes-limits-and-requests-5m1</guid>
      <description>&lt;p&gt;When working with containers in Kubernetes, it’s important to know what are the resources involved and how they are needed. Some processes will require more CPU or memory than others. Some are critical and should never be starved. &lt;/p&gt;

&lt;p&gt;Knowing that, we should configure our containers and Pods properly in order to get the best out of both.&lt;/p&gt;

&lt;p&gt;In this article, we will see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Introduction to Kubernetes Limits and Requests&lt;/li&gt;



&lt;li&gt;Hands-on example&lt;/li&gt;



&lt;li&gt;Kubernetes Requests&lt;/li&gt;



&lt;li&gt;Kubernetes Limits&lt;/li&gt;



&lt;li&gt;CPU particularities&lt;/li&gt;



&lt;li&gt;Memory particularities&lt;/li&gt;



&lt;li&gt;Namespace ResourceQuota&lt;/li&gt;



&lt;li&gt;Namespace LimitRange&lt;/li&gt;



&lt;li&gt;Conclusion&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="introduction"&gt;Introduction to Kubernetes Limits and Requests&lt;/h2&gt;

&lt;p&gt;Limits and Requests are important settings when working with Kubernetes. This article will focus on the two most important ones: CPU and memory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes defines Limits as the&lt;/strong&gt; &lt;strong&gt;maximum amount of a resource&lt;/strong&gt; to be used by a container. This means that the container can never consume more than the memory amount or CPU amount indicated. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Requests, on the other hand, are the minimum guaranteed amount of a resource&lt;/strong&gt; that is reserved for a container.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Request-04-1-1170x585.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Request-04-1-1170x585.png" alt="" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="handson"&gt;Hands-on example&lt;/h2&gt;

&lt;p&gt;Let’s have a look at this deployment, where we are setting up limits and requests for two different containers on both CPU and memory.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kind: Deployment
apiVersion: apps/v1
…
template:
  spec:
    containers:
      - name: redis
        image: redis:5.0.3-alpine
        resources:
          &lt;strong&gt;limits&lt;/strong&gt;:
            memory: 600Mi
            cpu: 1
          &lt;strong&gt;requests&lt;/strong&gt;:
            memory: 300Mi
            cpu: 500m
      - name: busybox
        image: busybox:1.28
        resources:
          &lt;strong&gt;limits&lt;/strong&gt;:
            memory: 200Mi
            cpu: 300m
          &lt;strong&gt;requests&lt;/strong&gt;:
            memory: 100Mi
            cpu: 100m&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Let’s say we are running a cluster with, for example, 4 cores and 16GB RAM nodes. We can extract a lot of information:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/Kubernetes-Limits-and-Request-05.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Request-05-1170x828.png" alt="Kubernetes Limits and Requests practical example" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pod effective request&lt;/strong&gt; is 400 MiB of memory and 600 millicores of CPU. You need a node with enough free allocatable space to schedule the pod.&lt;/li&gt;



&lt;li&gt;
&lt;strong&gt;CPU shares&lt;/strong&gt; for the redis container will be 512, and 102 for the busybox container. Kubernetes always assigns 1024 shares per core, so redis: 1024 * 0.5 cores ≅ 512 and busybox: 1024 * 0.1 cores ≅ 102.&lt;/li&gt;



&lt;li&gt;Redis container will be &lt;strong&gt;OOM killed&lt;/strong&gt; if it tries to allocate more than 600Mi of RAM, most likely making the pod fail.&lt;/li&gt;



&lt;li&gt;Redis will suffer &lt;strong&gt;CPU throttling&lt;/strong&gt; if it tries to use more than 100ms of CPU every 100ms (since we have 4 cores, the available time would be 400ms every 100ms), causing performance degradation.&lt;/li&gt;



&lt;li&gt;Busybox container will be &lt;strong&gt;OOM killed&lt;/strong&gt; if it tries to allocate more than 200Mi of RAM, resulting in a failed pod.&lt;/li&gt;



&lt;li&gt;Busybox will suffer &lt;strong&gt;CPU throttling&lt;/strong&gt; if it tries to use more than 30ms of CPU every 100ms, causing performance degradation.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id="kubernetesrequests"&gt;Kubernetes Requests&lt;/h2&gt;

&lt;p&gt;Kubernetes defines requests as a &lt;strong&gt;guaranteed minimum amount of a resource&lt;/strong&gt; to be used by a container.&lt;/p&gt;

&lt;p&gt;Basically, it will set the minimum amount of the resource for the container to consume.&lt;/p&gt;

&lt;p&gt;When a Pod is scheduled, kube-scheduler will check the Kubernetes requests in order to allocate it to a particular Node that can satisfy at least that amount for all containers in the Pod. If the requested amount is higher than the available resource, the Pod will not be scheduled and remain in Pending status.&lt;/p&gt;

&lt;p&gt;For more information about Pending status, check &lt;a href="https://sysdig.com/blog/kubernetes-pod-pending-problems/" rel="noreferrer noopener"&gt;Understanding Kubernetes Pod pending problems&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In this example, in the container definition we set a request of 0.1 cores (100 millicores) of CPU and 4Mi of memory:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;resources:
  requests:
    cpu: 0.1
    memory: 4Mi&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Requests are used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When allocating Pods to a Node, so the indicated requests by the containers in the Pod are satisfied.&lt;/li&gt;



&lt;li&gt;At runtime, the indicated amount of requests will be guaranteed as a minimum for the containers in that Pod.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage4.png" alt="How to set good CPU requests" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="kuberneteslimits"&gt;Kubernetes Limits&lt;/h2&gt;

&lt;p&gt;Kubernetes defines &lt;strong&gt;limits&lt;/strong&gt; as a &lt;strong&gt;maximum amount of a resource&lt;/strong&gt; to be used by a container.&lt;/p&gt;

&lt;p&gt;This means that the container can never consume more than the memory amount or CPU amount indicated.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;    resources:
      limits:
        cpu: 0.5
        memory: 100Mi
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Limits are used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When allocating Pods to a Node. If no requests are set, by default, Kubernetes will assign requests = limits.&lt;/li&gt;



&lt;li&gt;At runtime, Kubernetes will check that the containers in the Pod are not consuming a higher amount of resources than indicated in the limit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage6.png" alt="Setting good Limits in Kubernetes" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="cpuparticularities"&gt;CPU particularities&lt;/h2&gt;

&lt;p&gt;CPU is a &lt;strong&gt;compressible resource&lt;/strong&gt;, meaning that it can be stretched in order to satisfy all the demand. If processes request too much CPU, some of them will be throttled.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CPU&lt;/strong&gt; represents &lt;strong&gt;computing processing time&lt;/strong&gt;, measured in cores. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can use millicores (m) to represent smaller amounts than a core (e.g., 500m would be half a core)&lt;/li&gt;



&lt;li&gt;The minimum amount is 1m&lt;/li&gt;



&lt;li&gt;A Node might have more than one core available, so requesting CPU &amp;gt; 1 is possible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Requests-1-1170x644.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Requests-1-1170x644.png" alt="Kubernetes requests for CPU image" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="memoryparticularities"&gt;Memory particularities&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Memory&lt;/strong&gt; is a &lt;strong&gt;non-compressible&lt;/strong&gt; resource, meaning that it can’t be stretched in the same manner as CPU. If a process doesn’t get enough memory to work, the process is killed.&lt;/p&gt;

&lt;p&gt;Memory is measured in Kubernetes in &lt;strong&gt;bytes&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can use E, P, T, G, M, k to represent Exabyte, Petabyte, Terabyte, Gigabyte, Megabyte, and kilobyte, although only the last four are commonly used (e.g., 500M, 4G)&lt;/li&gt;



&lt;li&gt;Warning: don’t use lowercase m for memory (this represents Millibytes, which is ridiculously low)&lt;/li&gt;



&lt;li&gt;You can define Mebibytes using Mi, as well as the rest as Ei, Pi, Ti (e.g., 500Mi)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;span&gt;&lt;em&gt;A Mebibyte is 2 to the power of 20 bytes (its analogues scale accordingly: a Kibibyte is 2^10 bytes, and a Gibibyte is 2^30). These units were created to avoid confusion with the kilo and mega prefixes of the metric system, which are multiples of 1,000. You should prefer this notation, as it is the unambiguous way to specify byte quantities.&lt;/em&gt;&lt;br&gt;&lt;/span&gt;&lt;/p&gt;
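
&lt;p&gt;For example (values are illustrative), a container requesting memory in Mebibytes would declare:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;resources:
  requests:
    memory: 256Mi
  limits:
    memory: 512Mi
&lt;/code&gt;&lt;/pre&gt;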

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Requests-2-1170x644.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Requests-2-1170x644.png" alt="Kubernetes Limits for memory image" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="bestpractices"&gt;Best practices&lt;/h2&gt;

&lt;p&gt;In very few cases should you be using limits to control your resource usage in Kubernetes. This is because if you want to avoid starvation (ensuring that every important process gets its share), you should be using requests in the first place. &lt;/p&gt;

&lt;p&gt;By setting limits, you are only preventing a process from retrieving additional resources in exceptional cases, causing an OOM kill when the memory limit is exceeded, and throttling when the CPU limit is exceeded (the process will need to wait until the CPU can be used again).&lt;/p&gt;

&lt;p&gt;For more information, check the &lt;a href="https://sysdig.com/blog/troubleshoot-kubernetes-oom/" rel="noopener noreferrer"&gt;article about OOM and Throttling&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you’re setting a request value equal to the limit in all containers of a Pod, that Pod will get the Guaranteed Quality of Service. &lt;/p&gt;
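
&lt;p&gt;As a minimal sketch (the Pod and image names are illustrative), a Pod where every container sets requests equal to limits gets the Guaranteed QoS class:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: 500m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 256Mi
&lt;/code&gt;&lt;/pre&gt;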

&lt;p&gt;Note as well that Pods with a resource usage higher than their requests are more likely to be evicted, so setting very low requests causes more harm than good. For more information, check the article about &lt;a href="https://docs.google.com/document/u/0/d/1NvedVZgcPdtiSIFZH_-q5-C43xt12nfMWERA6Mk5dOc/edit" rel="noopener noreferrer"&gt;Pod eviction and Quality of Service&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id="namespaceresourcequota"&gt;Namespace ResourceQuota&lt;/h2&gt;

&lt;p&gt;Thanks to namespaces, we can isolate Kubernetes resources into different groups, also called tenants.&lt;/p&gt;

&lt;p&gt;With &lt;strong&gt;ResourceQuotas&lt;/strong&gt;, you can &lt;strong&gt;set a memory or CPU limit for the entire namespace&lt;/strong&gt;, ensuring that the entities in it can’t consume more than that amount.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-demo
spec:
  hard:
    requests.cpu: 2
    requests.memory: 1Gi
    limits.cpu: 3
    limits.memory: 2Gi

&lt;/code&gt;&lt;/pre&gt;

&lt;ul&gt;
&lt;li&gt;requests.cpu: the maximum amount of CPU for the sum of all requests in this namespace&lt;/li&gt;



&lt;li&gt;requests.memory: the maximum amount of Memory for the sum of all requests in this namespace&lt;/li&gt;



&lt;li&gt;limits.cpu: the maximum amount of CPU for the sum of all limits in this namespace&lt;/li&gt;



&lt;li&gt;limits.memory: the maximum amount of memory for the sum of all limits in this namespace&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then, apply it to your namespace:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kubectl apply -f resourcequota.yaml --namespace=mynamespace
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can list the current ResourceQuota for a namespace with:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kubectl get resourcequota -n mynamespace
&lt;/code&gt;&lt;/pre&gt;
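
&lt;p&gt;To check how much of the quota is currently consumed, describe it and compare the Used and Hard columns (mem-cpu-demo is the quota name from the example above):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kubectl describe resourcequota mem-cpu-demo -n mynamespace
&lt;/code&gt;&lt;/pre&gt;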

&lt;p&gt;Note that if you set up ResourceQuota for a given resource in a namespace, you then need to specify limits or requests accordingly for every Pod in that namespace. If not, Kubernetes will return a “failed quota” error:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Error from server (Forbidden): error when creating "mypod.yaml": pods "mypod" is forbidden: failed quota: mem-cpu-demo: must specify limits.cpu,limits.memory,requests.cpu,requests.memory
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In case you try to add a new Pod with container limits or requests that exceed the current ResourceQuota, Kubernetes will return an “exceeded quota” error:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Error from server (Forbidden): error when creating "mypod.yaml": pods "mypod" is forbidden: exceeded quota: mem-cpu-demo, requested: limits.memory=2Gi,requests.memory=2Gi, used: limits.memory=1Gi,requests.memory=1Gi, limited: limits.memory=2Gi,requests.memory=1Gi
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="namespacelimitrange"&gt;Namespace LimitRange&lt;/h2&gt;

&lt;p&gt;ResourceQuotas are useful if we want to restrict the total amount of a resource allocatable for a namespace. But what happens if we want to give default values to the elements inside?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LimitRanges&lt;/strong&gt; are a Kubernetes policy that &lt;strong&gt;restricts the resource settings for each entity&lt;/strong&gt; in a namespace.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-resource-constraint
spec:
  limits:
  - default:
      cpu: 500m
    defaultRequest:
      cpu: 500m
    min:
      cpu: 100m
    max:
      cpu: "1"
    type: Container
&lt;/code&gt;&lt;/pre&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;default&lt;/code&gt;: Containers created will have this limit if none is specified.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;defaultRequest&lt;/code&gt;: Containers created will have this request if none is specified.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;min&lt;/code&gt;: Containers created can’t have limits or requests smaller than this.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;max&lt;/code&gt;: Containers created can’t have limits or requests bigger than this.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Later, if you create a new Pod with no requests or limits set, LimitRange will automatically set these values to all its containers:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;    Limits:
      cpu:  500m
    Requests:
      cpu:  500m
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now, imagine that you add a new Pod with 1200m as its CPU limit. You will receive the following error:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Error from server (Forbidden): error when creating "pods/mypod.yaml": pods "mypod" is forbidden: maximum cpu usage per Container is 1, but limit is 1200m
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note that, by default, all containers in a Pod will effectively have a request of 100m CPU, even with no LimitRanges set.&lt;/p&gt;

&lt;h2 id="conclusion"&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Choosing the optimal limits for our Kubernetes cluster is key in order to get the best of both energy consumption and costs.&lt;/p&gt;

&lt;p&gt;Oversizing or dedicating too many resources for our Pods may lead to costs skyrocketing.&lt;/p&gt;

&lt;p&gt;Undersizing or dedicating very few CPU or Memory will lead to applications not performing correctly, or even Pods being evicted.&lt;/p&gt;

&lt;p&gt;As mentioned, Kubernetes limits shouldn’t be used, except in very specific situations, as they may cause more harm than good: a container can be killed when it exceeds its memory limit, or throttled when it exhausts its CPU limit.&lt;/p&gt;

&lt;p&gt;For requests, use them when you need to ensure a process gets a guaranteed share of a resource.&lt;/p&gt;








&lt;h2&gt;Rightsize your Kubernetes resources with Sysdig Monitor&lt;/h2&gt;





&lt;p&gt;With Sysdig Monitor’s new feature, Cost Advisor, you can optimize your Kubernetes costs by rightsizing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory requests&lt;/li&gt;



&lt;li&gt;CPU requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sysdig Advisor accelerates mean time to resolution (MTTR) with live logs, performance data, and suggested remediation steps. It’s the easy button for Kubernetes troubleshooting!&lt;/p&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Request-06-1170x1063.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FKubernetes-Limits-and-Request-06-1170x1063.png" alt="" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/company/free-trial-monitor/" rel="noopener noreferrer"&gt;Try it free for 30 days!&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>monitoring</category>
      <category>cpu</category>
      <category>memory</category>
    </item>
    <item>
      <title>The four Golden Signals of Kubernetes monitoring</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Fri, 28 Oct 2022 09:20:10 +0000</pubDate>
      <link>https://dev.to/sysdig/the-four-golden-signals-of-kubernetes-monitoring-b7d</link>
      <guid>https://dev.to/sysdig/the-four-golden-signals-of-kubernetes-monitoring-b7d</guid>
      <description>&lt;p&gt;&lt;strong&gt;Golden Signals&lt;/strong&gt; are a reduced set of metrics that offer a wide view of a service from a user or consumer perspective: Latency, Traffic, Errors and Saturation. By focusing on these, you can be quicker at detecting potential problems that might be directly affecting the behavior of the application.&lt;/p&gt;

&lt;p&gt;Google introduced the term "Golden Signals" to refer to the essential metrics that you need to measure in your applications. They are the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
Errors - rate of requests that fail.&lt;/li&gt;



&lt;li&gt;
Saturation - consumption of your system resources.&lt;/li&gt;



&lt;li&gt;
Traffic - amount of use of your service per time unit.&lt;/li&gt;



&lt;li&gt;
Latency - the time it takes to serve a request.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image9-10.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage9-10-1170x644.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;p&gt;This is just a set of essential signals to start monitoring in your system. In other words, if you’re wondering which signals to monitor, you will need to look at these four first.&lt;/p&gt;

&lt;p&gt;Enter: Goldilocks and the four Monitoring Signals&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Once upon a time, there was a little girl called Goldilocks, who lived at the other side of the wood and had been sent on an errand by her mother, passed by the house, and looked in at the window…&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id="errors"&gt;Errors&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Goldilocks then tried the little chair, which belonged to the Little Bear, and found it just right, but she sat in it so hard that she broke it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image4-19.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage4-19-1170x644.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;The error rate for the chairs is ⅓&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Errors&lt;/strong&gt; golden signal measures the rate of requests that fail.&lt;/p&gt;

&lt;p&gt;Note that measuring the bulk amount of errors might not be the best course of action. If your application has a sudden peak of requests, then logically the amount of failed requests may increase.&lt;/p&gt;

&lt;p&gt;That’s why usually monitoring systems focus on the error rate, calculated as the percent of calls that are failing from the total.&lt;/p&gt;

&lt;p&gt;If you’re managing a web application, typically you will discriminate between those calls returning HTTP status in the 400-499 range (client errors) and 500-599 (server errors).&lt;/p&gt;

&lt;h3&gt;Measuring errors in Kubernetes&lt;/h3&gt;

&lt;p&gt;One thermometer for the errors happening in Kubernetes is the Kubelet. You can use several of its Prometheus metrics to measure the amount of errors.&lt;/p&gt;

&lt;p&gt;The most important one is &lt;code&gt;kubelet_runtime_operations_errors_total&lt;/code&gt;, which indicates low-level issues in the node, like problems with the container runtime.&lt;/p&gt;

&lt;p&gt;If you want to visualize the error rate per operation, you can divide by &lt;code&gt;kubelet_runtime_operations_total&lt;/code&gt;.&lt;/p&gt;
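
&lt;p&gt;As a sketch, the resulting error ratio per operation divides the error counter by the total counter:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sum(rate(kubelet_runtime_operations_errors_total{job="kubelet"}[5m])) by (operation_type)
/
sum(rate(kubelet_runtime_operations_total{job="kubelet"}[5m])) by (operation_type)
&lt;/code&gt;&lt;/pre&gt;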

&lt;h3&gt;Errors example&lt;/h3&gt;

&lt;p&gt;Here's the Kubelet Prometheus metric for error rate in a Kubernetes cluster:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sum(rate(kubelet_runtime_operations_errors_total{cluster="",
job="kubelet", metrics_path="/metrics"}[$__rate_interval])) 
by (instance, operation_type)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image16-2-1170x445.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage16-2-1170x445.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="saturation"&gt;Saturation&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Goldilocks tasted the porridge in the dear little bowl, and it was just right, and it tasted so good that she tasted and tasted, and tasted and tasted until she was full.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image17-2-1170x644.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage17-2-1170x644.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;After eating one small bowl, Goldilocks is unable to eat more. That’s saturation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Saturation&lt;/strong&gt; measures the consumption of your system resources, usually as a percentage of the maximum capacity. Examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU usage&lt;/li&gt;



&lt;li&gt;Disk space&lt;/li&gt;



&lt;li&gt;Memory usage&lt;/li&gt;



&lt;li&gt;Network bandwidth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the end, cloud applications run on machines, which have a limited amount of these resources.&lt;/p&gt;

&lt;p&gt;In order to measure saturation correctly, you should be aware of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What are the consequences if the resource is depleted? It could be that your entire system is unusable because this space has run out. Or maybe further requests are throttled until the system is less saturated.&lt;/li&gt;



&lt;li&gt;Saturation is not only about resources that are about to be depleted. It’s also about over-resourcing: allocating a higher quantity of resources than what is needed. This is crucial for cost savings.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Measuring saturation in Kubernetes&lt;/h3&gt;

&lt;p&gt;Since saturation depends on the resource being observed, you can use different metrics for Kubernetes entities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;node_cpu_seconds_total&lt;/code&gt; to measure machine CPU utilization.&lt;/li&gt;



&lt;li&gt;
&lt;code&gt;container_memory_usage_bytes&lt;/code&gt; to measure the memory utilization at container level (paired with &lt;code&gt;container_memory_max_usage_bytes&lt;/code&gt;).&lt;/li&gt;



&lt;li&gt;The amount of Pods that a &lt;a href="https://sysdig.com/learn-cloud-native/kubernetes-101/what-is-a-kubernetes-node/" rel="noopener noreferrer"&gt;Node&lt;/a&gt; can contain is also a Kubernetes resource.&lt;/li&gt;
&lt;/ul&gt;
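
&lt;p&gt;As a hedged sketch (it assumes kube-state-metrics is deployed to expose the configured limits), memory saturation per Pod can be approximated as usage divided by the memory limit:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sum(container_memory_usage_bytes{container!=""}) by (namespace, pod)
/
sum(kube_pod_container_resource_limits{resource="memory"}) by (namespace, pod)
&lt;/code&gt;&lt;/pre&gt;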

&lt;h3&gt;Saturation example&lt;/h3&gt;

&lt;p&gt;Here’s a PromQL example of a Saturation signal, measuring CPU usage percent in a Kubernetes node.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image13-8-1170x410.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage13-8-1170x410.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="traffic"&gt;Traffic&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;And the Middle-sized Bear said:&lt;br&gt;“Somebody has been tumbling my bed!”&lt;br&gt;And the Little bear piped:&lt;br&gt;“Somebody has been tumbling my bed, and here she is!”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image12-7-1170x644.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage12-7-1170x644.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;One of the beds is being used when none should be. That’s unusual traffic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traffic&lt;/strong&gt; measures the amount of use of your service per time unit.&lt;/p&gt;

&lt;p&gt;In essence, this will represent the usage of your current service. This is important not only for business reasons, but also to detect anomalies.&lt;/p&gt;

&lt;p&gt;Is the amount of requests too high? This could be due to a peak of users or because of a misconfiguration causing retries.&lt;/p&gt;

&lt;p&gt;Is the amount of requests too low? That may reflect that one of your systems is failing.&lt;/p&gt;

&lt;p&gt;Still, traffic signals should always be measured with a time reference. As an example, this blog receives more visits from Tuesday to Thursday.&lt;/p&gt;

&lt;p&gt;Depending on your application, you could be measuring traffic by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requests per minute for a web application&lt;/li&gt;



&lt;li&gt;Queries per minute for a database application&lt;/li&gt;



&lt;li&gt;Endpoint requests per minute for an API&lt;/li&gt;
&lt;/ul&gt;
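
&lt;p&gt;As a sketch (the metric name http_requests_total is an assumption about your instrumentation), requests per minute for a web application could be measured as:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sum(rate(http_requests_total[5m])) * 60
&lt;/code&gt;&lt;/pre&gt;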

&lt;h3&gt;Traffic example&lt;/h3&gt;

&lt;p&gt;Here’s a Google Analytics chart displaying traffic distributed by hour:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image3-23.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage3-23.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="latency"&gt;Latency&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;At that, Goldilocks woke in a fright, and jumped out of the window and ran away as fast as her legs could carry her, and never went near the Three Bears’ snug little house again.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image11-6-1170x644.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage11-6-1170x644.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;Goldilocks ran down the stairs in just two seconds. That’s a very low latency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Latency&lt;/strong&gt; is defined as the time it takes to serve a request.&lt;/p&gt;

&lt;h3&gt;Average latency&lt;/h3&gt;

&lt;p&gt;When working with latencies, your first impulse may be to measure average latency, but depending on your system that might not be the best idea. There may be very fast or very slow requests distorting the results.&lt;/p&gt;

&lt;p&gt;Instead, consider using percentiles, like p99, p95, and p50 (also known as the median), to measure how long the fastest 99%, 95%, or 50% of requests, respectively, took to complete.&lt;/p&gt;

&lt;h3&gt;Failed vs. successful&lt;/h3&gt;

&lt;p&gt;When measuring latency, it’s also important to discriminate between failed and successful requests, as failed ones might take significantly less time than the correct ones.&lt;/p&gt;

&lt;h3&gt;Apdex Score&lt;/h3&gt;

&lt;p&gt;As described above, latency information may not be informative enough:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Some users might perceive applications as slower, depending on the action they are performing.&lt;/li&gt;



&lt;li&gt;Some users might perceive applications as slower, based on the typical latencies of the industry.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where the Apdex (Application Performance Index) comes in. It’s defined as:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image1-37.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage1-37.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Where t is the target latency that we consider reasonable.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Satisfied will represent the amount of users with requests under the target latency.&lt;/li&gt;



&lt;li&gt;Tolerant will represent the amount of non-satisfied users with requests below four times the target latency.&lt;/li&gt;



&lt;li&gt;Frustrated will represent the amount of users with requests above the tolerant latency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output for the formula will be an index from 0 to 1, indicating how performant our system is in terms of latency.&lt;/p&gt;
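
&lt;p&gt;As a hedged sketch: with a Prometheus histogram that happens to have buckets at the target latency t (here, 0.3s) and at 4t (1.2s), the Apdex score can be approximated as:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(
  sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m]))
  +
  sum(rate(http_request_duration_seconds_bucket{le="1.2"}[5m]))
) / 2
/
sum(rate(http_request_duration_seconds_count[5m]))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The le="1.2" bucket already includes the satisfied requests, so summing both buckets and halving yields satisfied + tolerant/2.&lt;/p&gt;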

&lt;h3&gt;Measuring latency in Kubernetes&lt;/h3&gt;

&lt;p&gt;In order to measure the latency in your Kubernetes cluster, you can use metrics like &lt;code&gt;http_request_duration_seconds_sum&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can also measure the latency for the api-server by using Prometheus metrics like &lt;code&gt;apiserver_request_duration_seconds&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;Latency example&lt;/h3&gt;

&lt;p&gt;Here’s an example of a Latency PromQL query for the 95% best performing HTTP requests in Prometheus:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;histogram_quantile(0.95, sum(rate(prometheus_http_request_duration_seconds_bucket[5m]))
by (le))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image10-10-1170x408.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage10-10-1170x408.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="red-method"&gt;RED Method&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;RED Method&lt;/strong&gt; was created by Tom Wilkie, from Weaveworks. It is heavily inspired by the Golden Signals and it’s focused on microservices architectures.&lt;/p&gt;

&lt;p&gt;RED stands for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rate&lt;/li&gt;



&lt;li&gt;Error&lt;/li&gt;



&lt;li&gt;Duration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rate&lt;/strong&gt; measures the number of requests per second (equivalent to Traffic in the Golden Signals).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error&lt;/strong&gt; measures the number of failed requests (similar to the one in Golden Signals).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Duration&lt;/strong&gt; measures the amount of time to process a request (similar to Latency in Golden Signals).&lt;/p&gt;

&lt;h2 id="use-method"&gt;USE Method&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;USE Method&lt;/strong&gt; was created by Brendan Gregg and it’s used to measure infrastructure.&lt;/p&gt;

&lt;p&gt;USE stands for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Utilization&lt;/li&gt;



&lt;li&gt;Saturation&lt;/li&gt;



&lt;li&gt;Errors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means for every resource in your system (CPU, disk, etc.), you need to check the three elements above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Utilization&lt;/strong&gt; is defined as the percentage of usage for that resource.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Saturation&lt;/strong&gt; is defined as the queue for requests in the system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Errors&lt;/strong&gt; is defined as the number of errors happening in the system.&lt;/p&gt;

&lt;p&gt;While it may not be intuitive, Saturation in the Golden Signals does not correspond to Saturation in USE, but rather to Utilization.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/image8-12-1170x439.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fimage8-12-1170x439.png" alt="The four Golden Signals of Kubernetes monitoring"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 id="practical-example"&gt;A practical example of Golden signals in Kubernetes&lt;/h2&gt;

&lt;p&gt;As an example to illustrate the use of Golden Signals, here’s a simple Go application with Prometheus instrumentation. The application applies a random delay between 0 and 12 seconds in order to produce usable latency information. Traffic is generated with curl, in several infinite loops.&lt;/p&gt;

&lt;p&gt;A &lt;a rel="noopener nofollow noreferrer" href="https://prometheus.io/docs/practices/histograms/"&gt;histogram&lt;/a&gt; was included to collect metrics related to latency and requests. These metrics will help us obtain the first three Golden Signals: latency, request rate, and error rate. To obtain saturation, use the percentage of CPU used in the nodes, directly from Prometheus and node-exporter.&lt;/p&gt;



&lt;pre&gt;&lt;code&gt;
File: main.go
-------------
package main
import (
    "fmt"
    "log"
    "math/rand"
    "net/http"
    "time"
    "github.com/gorilla/mux"
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)
func main() {
    //Prometheus: Histogram to collect required metrics
    histogram := prometheus.NewHistogramVec(prometheus.HistogramOpts{
        Name:    "greeting_seconds",
        Help:    "Time take to greet someone",
        Buckets: []float64{1, 2, 5, 6, 10}, //Buckets sized to cover the artificial delay of up to 11 seconds
    }, []string{"code"}) //This will be partitioned by the HTTP code.
    router := mux.NewRouter()
    router.Handle("/sayhello/{name}", Sayhello(histogram))
    router.Handle("/metrics", promhttp.Handler()) //Metrics endpoint for scraping
    router.Handle("/{anything}", Sayhello(histogram))
    router.Handle("/", Sayhello(histogram))
    //Registering the defined metric with Prometheus
    prometheus.Register(histogram)
    log.Fatal(http.ListenAndServe(":8080", router))
}
func Sayhello(histogram *prometheus.HistogramVec) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        //Monitoring how long it takes to respond
        start := time.Now()
        defer r.Body.Close()
        code := 500
        defer func() {
            httpDuration := time.Since(start)
            histogram.WithLabelValues(fmt.Sprintf("%d", code)).Observe(httpDuration.Seconds())
        }()
        if r.Method == "GET" {
            vars := mux.Vars(r)
            code = http.StatusOK
            if _, ok := vars["anything"]; ok {
                //Sleep random seconds
                rand.Seed(time.Now().UnixNano())
                n := rand.Intn(2) // n will be 0 or 1
                time.Sleep(time.Duration(n) * time.Second)
                code = http.StatusNotFound
                w.WriteHeader(code)
            }
            //Sleep random seconds
            rand.Seed(time.Now().UnixNano())
            n := rand.Intn(12) //n will be between 0 and 11
            time.Sleep(time.Duration(n) * time.Second)
            name := vars["name"]
            greet := fmt.Sprintf("Hello %s \n", name)
            w.Write([]byte(greet))
        } else {
            code = http.StatusBadRequest
            w.WriteHeader(code)
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The application was deployed in a Kubernetes cluster with Prometheus and Grafana, and a dashboard with the Golden Signals was generated. These are the PromQL queries used to obtain the data for the dashboards:&lt;/p&gt;

&lt;h3&gt;Latency:&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;sum(greeting_seconds_sum)/sum(greeting_seconds_count)  //Average
histogram_quantile(0.95, sum(rate(greeting_seconds_bucket[5m])) by (le)) //Percentile p95&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Request rate:&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;sum(rate(greeting_seconds_count{}[2m]))  //Including errors
rate(greeting_seconds_count{code="200"}[2m])  //Only 200 OK requests&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Errors per second:&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;sum(rate(greeting_seconds_count{code!="200"}[2m]))&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Saturation:&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="conclusion"&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Golden Signals, RED, and USE are guidelines on what to focus on when looking at your systems, but they are just the bare minimum of what to measure.&lt;/p&gt;

&lt;p&gt;Understand the &lt;strong&gt;errors&lt;/strong&gt; in your system. They will be a thermometer for all the other metrics, as they point to any unusual behavior. And remember to mark requests as erroneous correctly, counting only those that are genuinely incorrect. Otherwise, your system will be prone to false positives or false negatives.&lt;/p&gt;

&lt;p&gt;Measure &lt;strong&gt;latency&lt;/strong&gt; of your requests. Try to understand your bottlenecks and what the negative experiences are when latency is higher than expected.&lt;/p&gt;

&lt;p&gt;Visualize &lt;strong&gt;saturation&lt;/strong&gt; and understand the resources involved in your solution. What are the consequences if a resource gets depleted?&lt;/p&gt;

&lt;p&gt;Measure &lt;strong&gt;traffic&lt;/strong&gt; to understand your usage curves. You will be able to find the best time to take down your system for an update, or you could be alerted when there’s an unexpected amount of users.&lt;/p&gt;

&lt;p&gt;Once metrics are in place, it’s important to set up alerts, which will notify you in case any of these metrics reach a certain threshold.&lt;/p&gt;









&lt;h2&gt;Track golden signals easily with Sysdig Monitor&lt;/h2&gt;
&lt;p&gt;With Sysdig Monitor, you can quickly review the golden signals in your system, out of the box.&lt;/p&gt;

&lt;p&gt;Easily review the latency, errors, saturation, and traffic for the Pods in your cluster. And thanks to its container observability with eBPF, you can do this without adding any app or code instrumentation.&lt;/p&gt;

&lt;p&gt;Sysdig Advisor accelerates mean time to resolution (MTTR) with live logs, performance data, and suggested remediation steps. It’s the easy button for Kubernetes troubleshooting!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/GoldenSignals-11.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FGoldenSignals-11.png" alt="Sysdig Monitor Golden Signals"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/company/free-trial-monitor/" rel="noopener noreferrer"&gt;Try it free for 30 days!&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>monitoring</category>
      <category>prometheus</category>
    </item>
    <item>
      <title>Kubernetes ErrImagePull and ImagePullBackOff in detail</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Wed, 05 Oct 2022 19:34:01 +0000</pubDate>
      <link>https://dev.to/sysdig/kubernetes-errimagepull-and-imagepullbackoff-in-detail-1ga2</link>
      <guid>https://dev.to/sysdig/kubernetes-errimagepull-and-imagepullbackoff-in-detail-1ga2</guid>
      <description>&lt;p&gt;
Pod statuses like ImagePullBackOff or ErrImagePull are common when working with containers.
&lt;/p&gt;

&lt;p&gt;
&lt;strong&gt;ErrImagePull&lt;/strong&gt; is an error happening when the image specified for a container can’t be retrieved or pulled.
&lt;/p&gt;

&lt;p&gt;
&lt;strong&gt;ImagePullBackOff&lt;/strong&gt; is the waiting grace period while the image pull is fixed.
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/wp-content/uploads/ErrImagePull-ImagePullbackOff-00.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FErrImagePull-ImagePullbackOff-00.png" alt="ErrImagePull and ImagePullBackOff cover"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
In this article, we will take a look at:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
Container Images
&lt;/li&gt;
&lt;li&gt;
Pulling Images
&lt;/li&gt;
&lt;li&gt;
Image Pull Policy
&lt;/li&gt;
&lt;li&gt;
ErrImagePull
&lt;/li&gt;
&lt;li&gt;
Debugging ErrImagePull
&lt;/li&gt;
&lt;li&gt;
Monitoring Image Pull Errors
&lt;/li&gt;
&lt;li&gt;
Other Image Errors
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="containerimages"&gt;Container Images&lt;/h2&gt;

&lt;p&gt;
One of the greatest strengths of containerization is the ability to run any particular image in seconds. A &lt;b&gt;container&lt;/b&gt; is a group of processes executing in isolation from the underlying system. A &lt;b&gt;container image&lt;/b&gt; contains all the resources needed to run those processes: the binaries, libraries, and any necessary configuration.
&lt;/p&gt;

&lt;p&gt;
A &lt;b&gt;container registry&lt;/b&gt; is a repository for container images, where there are two basic actions:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Push&lt;/strong&gt;: upload an image so it’s available in the repo
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pull&lt;/strong&gt;: download an image to use it in a container
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;span&gt;&lt;em&gt;
The docker CLI will be used in the examples for this article, but you can use any tool that implements the Open Container Initiative Distribution specs for all the container registry interactions.
&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;

&lt;h2 id="pullingimages"&gt;Pulling images&lt;/h2&gt;
&lt;p&gt;
Images are identified by name. In addition, a particular version of an image can be labeled with a specific &lt;strong&gt;tag&lt;/strong&gt;. An image can also be identified by its &lt;strong&gt;digest&lt;/strong&gt;, a hash of its content.
&lt;/p&gt;

&lt;p&gt;
The tag &lt;code&gt;latest&lt;/code&gt; refers to the most recent version of a given image.
&lt;/p&gt;

&lt;h3&gt;Pull images by name&lt;/h3&gt;

&lt;p&gt;
By only providing the name for the image, the image with the tag &lt;code&gt;latest&lt;/code&gt; will be pulled:
&lt;/p&gt;

&lt;pre&gt;docker pull nginx
kubectl run mypod --image=nginx
&lt;/pre&gt;

&lt;h3&gt;Pull images by name + tag&lt;/h3&gt;

&lt;p&gt;
If you don’t want to pull the &lt;code&gt;latest&lt;/code&gt; image, you can provide a specific release tag:
&lt;/p&gt;

&lt;pre&gt;docker pull nginx:1.23.1-alpine
kubectl run mypod --image=nginx:1.23.1-alpine
&lt;/pre&gt;

&lt;p&gt;
For more information, you can check this &lt;a href="https://sysdig.es/blog/toctou-tag-mutability/" rel="noopener noreferrer"&gt;article about tag mutability&lt;/a&gt;.
&lt;/p&gt;

&lt;h3&gt;Pull images by digest&lt;/h3&gt;

&lt;p&gt;
A digest is a SHA-256 hash of the image content. You can pull an image by its digest, which lets you verify the authenticity and integrity of what was downloaded.
&lt;/p&gt;

&lt;pre&gt;docker pull nginx@sha256:d164f755e525e8baee113987bdc70298da4c6f48fdc0bbd395817edf17cf7c2b
kubectl run mypod --image=nginx@sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5
&lt;/pre&gt;
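&lt;p&gt;&lt;em&gt;As a minimal sketch of that verification step (not tied to any registry client), you can hash the downloaded content and compare it against the expected digest:&lt;/em&gt;&lt;/p&gt;

```python
import hashlib

# Minimal sketch: check that downloaded image content matches an expected
# "sha256:..." digest, the same integrity check a registry client performs.
def verify_digest(content, expected):
    algo, _, hex_digest = expected.partition(":")
    if algo != "sha256":
        raise ValueError("only sha256 digests are handled in this sketch")
    return hashlib.sha256(content).hexdigest() == hex_digest

blob = b"example layer content"
digest = "sha256:" + hashlib.sha256(blob).hexdigest()
print(verify_digest(blob, digest))              # True
print(verify_digest(b"tampered blob", digest))  # False
```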

&lt;h2 id="imagepullpolicy"&gt;Image Pull Policy&lt;/h2&gt;

&lt;p&gt;
Kubernetes features the ability to set an &lt;strong&gt;Image Pull Policy&lt;/strong&gt; (&lt;strong&gt;imagePullPolicy&lt;/strong&gt; field) for each container. Based on this, the way the &lt;code&gt;kubelet&lt;/code&gt; retrieves the container image will differ.
&lt;/p&gt;

&lt;p&gt;
There are three different values for imagePullPolicy:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Always
&lt;/li&gt;
&lt;li&gt;IfNotPresent
&lt;/li&gt;
&lt;li&gt;Never
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Always&lt;/h3&gt;

&lt;p&gt;
With imagePullPolicy set to &lt;strong&gt;Always&lt;/strong&gt;, kubelet &lt;strong&gt;will check the repository every time&lt;/strong&gt; it pulls the image for this container.
&lt;/p&gt;

&lt;h3&gt;IfNotPresent&lt;/h3&gt;

&lt;p&gt;
With imagePullPolicy set to &lt;strong&gt;IfNotPresent&lt;/strong&gt;, kubelet will only pull the image from the repository&lt;strong&gt; if it isn’t already present&lt;/strong&gt; on the node.
&lt;/p&gt;

&lt;h3&gt;Never&lt;/h3&gt;

&lt;p&gt;
With imagePullPolicy set to &lt;strong&gt;Never&lt;/strong&gt;, kubelet &lt;strong&gt;will never try to pull&lt;/strong&gt; images from the image registry. If there’s an image cached locally (pre-pulled), it will be used to start the container.
&lt;/p&gt;

&lt;p&gt;
If the image is not present locally, Pod creation will fail with &lt;code&gt;ErrImageNeverPull&lt;/code&gt; error.
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/wp-content/uploads/ErrImagePull-ImagePullbackOff-01.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FErrImagePull-ImagePullbackOff-01.png" alt="ImagePullPolicy description: always, never and ifnotpresent"&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
Note that you can modify the entire image pull policy of your cluster by using the AlwaysPullImages &lt;a href="https://sysdig.com/blog/kubernetes-admission-controllers/" rel="noopener noreferrer"&gt;admission controller&lt;/a&gt;.
&lt;/p&gt;

&lt;h3&gt;Default Image Pull Policy&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;If you omit the imagePullPolicy and the tag is &lt;code&gt;latest&lt;/code&gt;, imagePullPolicy is set to &lt;code&gt;Always&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;If you omit the imagePullPolicy and the tag for the image, imagePullPolicy is set to &lt;code&gt;Always&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;If you omit the imagePullPolicy and the tag is set to a value different than &lt;code&gt;latest&lt;/code&gt;, imagePullPolicy is set to &lt;code&gt;IfNotPresent&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;
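&lt;p&gt;&lt;em&gt;These defaulting rules can be sketched as a small helper. This is an illustrative function only, not part of any Kubernetes client library:&lt;/em&gt;&lt;/p&gt;

```python
# Illustrative helper mirroring the defaulting rules above; not part of any
# Kubernetes client library. Registry hosts with ports are ignored for brevity.
def default_image_pull_policy(image):
    _, _, tag = image.partition(":")
    if tag in ("", "latest"):
        return "Always"        # tag omitted, or tag is "latest"
    return "IfNotPresent"      # any other explicit tag

print(default_image_pull_policy("nginx"))                # Always
print(default_image_pull_policy("nginx:latest"))         # Always
print(default_image_pull_policy("nginx:1.23.1-alpine"))  # IfNotPresent
```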

&lt;h2 id="errimagepull"&gt;ErrImagePull&lt;/h2&gt;

&lt;p&gt;
When Kubernetes tries to pull an image for a container in a Pod, things might go wrong. The status &lt;strong&gt;ErrImagePull&lt;/strong&gt; is displayed when &lt;code&gt;kubelet&lt;/code&gt; tried to start a container in the Pod, but something was wrong with the image specified in your Pod, Deployment, or ReplicaSet manifest.
&lt;/p&gt;

&lt;p&gt;
Imagine that you are using kubectl to retrieve information about the Pods in your cluster:
&lt;/p&gt;

&lt;pre&gt;$ kubectl get pods
NAME    READY   STATUS             RESTARTS   AGE
goodpod 1/1     Running            0          21h
mypod   0/1     ErrImagePull       0          4s
&lt;/pre&gt;

&lt;p&gt;
Which means:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pod is not in &lt;code&gt;READY&lt;/code&gt; status
&lt;/li&gt;
&lt;li&gt;Status is &lt;code&gt;ErrImagePull&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
Additionally, you can check the logs for containers in your Pod:
&lt;/p&gt;

&lt;pre&gt;$ kubectl logs mypod --all-containers
Error from server (BadRequest): container "mycontainer" in pod "mypod" is waiting to start: trying and failing to pull image
&lt;/pre&gt;

&lt;p&gt;
In this case, this points to a 400 Error (BadRequest), likely because the indicated image is not available or doesn’t exist.
&lt;/p&gt;

&lt;h2 id="imagepullbackoff"&gt;ImagePullBackOff&lt;/h2&gt;

&lt;p&gt;
ImagePullBackOff is a Kubernetes waiting status: a grace period with an increasing back-off between retries. After the back-off period expires, kubelet will try to pull the image again.
&lt;/p&gt;

&lt;p&gt;
This is similar to the &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/" rel="noopener noreferrer"&gt;CrashLoopBackOff status&lt;/a&gt;, which is also a grace period between retries after an error in a container. The back-off delay increases with each retry, up to a maximum of five minutes.
&lt;/p&gt;
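&lt;p&gt;&lt;em&gt;The back-off progression can be sketched as follows, assuming a 10-second initial delay for illustration, doubling up to the five-minute cap:&lt;/em&gt;&lt;/p&gt;

```python
# Sketch of the back-off progression between image pull retries: the delay
# doubles after each failure, capped at five minutes. The 10-second initial
# delay is an assumption for illustration.
def backoff_delays(attempts, initial=10, cap=300):
    delay, delays = initial, []
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, cap)
    return delays

print(backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```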

&lt;p&gt;
Note that ImagePullBackOff is not an error. As mentioned, it’s just a waiting status caused by a problem when pulling the image.
&lt;/p&gt;

&lt;pre&gt;$ kubectl get pods
NAME    READY   STATUS             RESTARTS   AGE
goodpod 1/1     Running            0          21h
mypod   0/1     ImagePullBackOff   0          84s
&lt;/pre&gt;

&lt;p&gt;
Which means:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pod is not in &lt;code&gt;READY&lt;/code&gt; status
&lt;/li&gt;
&lt;li&gt;Status is &lt;code&gt;ImagePullBackOff&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Unlike CrashLoopBackOff, there are no restarts (technically, the Pod hasn’t even started)
&lt;/li&gt;
&lt;/ul&gt;

&lt;pre&gt;$ kubectl describe pod mypod
State:          Waiting
Reason:       ImagePullBackOff
...
Warning  Failed     3m57s (x4 over 5m28s)  kubelet            Error: ErrImagePull
Warning  Failed     3m42s (x6 over 5m28s)  kubelet            Error: ImagePullBackOff
Normal   BackOff    18s (x20 over 5m28s)   kubelet            Back-off pulling image "failed-image"
&lt;/pre&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/wp-content/uploads/ErrImagePull-ImagePullbackOff-02.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FErrImagePull-ImagePullbackOff-02.png" alt="ErrImagePull and ImagePullBackOff timeline"&gt;&lt;/a&gt;


&lt;/p&gt;
&lt;h2 id="debuggingerrimagepull"&gt;Debugging ErrImagePull and ImagePullBackOff&lt;/h2&gt;
&lt;p&gt;
There are several potential causes of why you might encounter an Image Pull Error. Here are some examples:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wrong image name
&lt;/li&gt;
&lt;li&gt;Wrong image tag
&lt;/li&gt;
&lt;li&gt;Wrong image digest
&lt;/li&gt;
&lt;li&gt;Network problem or image repo not available
&lt;/li&gt;
&lt;li&gt;Pulling from a private registry but no imagePullSecrets was provided
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
This is just a list of possible causes; there might be many others depending on your setup. The best course of action is to check:
&lt;/p&gt;

&lt;pre&gt;
$ kubectl describe pod podname

$ kubectl logs podname --all-containers

$ kubectl get events --field-selector involvedObject.name=podname
&lt;/pre&gt;

&lt;p&gt;
In the following example you can see how to dig into the logs, where an image error is found.

&lt;a href="https://sysdig.com/wp-content/uploads/ErrImagePull-ImagePullbackOff-03.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FErrImagePull-ImagePullbackOff-03.png" alt="Three terminals with debugging options for ErrImagePull and ImagePullBackOff"&gt;&lt;/a&gt;

&lt;/p&gt;
&lt;h2 id="otherimageerrors"&gt;Other image errors&lt;/h2&gt;


&lt;h3&gt;ErrImageNeverPull&lt;/h3&gt;
&lt;p&gt;
This error appears when kubelet fails to pull an image in the node and the imagePullPolicy is set to Never. In order to fix it, either change the Pull Policy to allow images to be pulled externally or add the correct image locally.
&lt;/p&gt;

&lt;h3&gt;Pending&lt;/h3&gt;

&lt;p&gt;
Remember that an ErrImagePull and the associated ImagePullBackOff may be different from a Pending status on your Pod.
&lt;/p&gt;

&lt;p&gt;
A Pending status is most likely the result of kube-scheduler not being able to assign your Pod to an eligible Node.
&lt;/p&gt;

&lt;h2 id="monitoringimagepullerrors"&gt;Monitoring Image Pull Errors in Prometheus&lt;/h2&gt;

&lt;p&gt;
Using Prometheus and Kube State Metrics (KSM), we can easily track our Pods with containers in ImagePullBackOff or ErrImagePull statuses.
&lt;/p&gt;

&lt;pre&gt;kube_pod_container_status_waiting_reason{reason="ErrImagePull"}
kube_pod_container_status_waiting_reason{reason="ImagePullBackOff"}
&lt;/pre&gt;

&lt;p&gt;
In fact, these two metrics are complementary, as we can see in the following Prometheus queries:
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/wp-content/uploads/ErrImagePull-ImagePullbackOff-04-1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FErrImagePull-ImagePullbackOff-04-1.png" alt="Monitoring ErrImagePull and ImagePullBackOff in Prometheus"&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
The Pod shifts between the waiting period in ImagePullBackOff and the image pull retry returning an ErrImagePull.
&lt;/p&gt;

&lt;p&gt;
Also, if you’re using containers with &lt;code&gt;ImagePullPolicy&lt;/code&gt; set to &lt;code&gt;Never&lt;/code&gt;, remember that you need to track the error as &lt;code&gt;ErrImageNeverPull&lt;/code&gt;.
&lt;/p&gt;

&lt;pre&gt;kube_pod_container_status_waiting_reason{reason="ErrImageNeverPull"}
&lt;/pre&gt;

&lt;h2 id="conclusion"&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;
Container images are a great way to kickstart your cloud application needs. Thanks to them, you have access to thousands of curated applications that are ready to be started and scaled.
&lt;/p&gt;

&lt;p&gt;
However, due to misconfiguration, misalignments, or repository problems, image errors might start appearing. A container can’t start properly if the image definition is malformed or there are errors on the setup.
&lt;/p&gt;

&lt;p&gt;
Kubernetes provides a grace period in case of an image pull error. This Image Pull Backoff is quite useful, as it gives you time to fix the problem in the image definition. But you need to be aware of when this happens in your cluster and what it means each time.
&lt;/p&gt;




&lt;h2&gt;Troubleshoot Image Pull Errors with Sysdig Monitor&lt;/h2&gt;

&lt;p&gt;
With the Advisor feature of Sysdig Monitor, you can easily review Image Pull Errors happening in your Kubernetes cluster.
&lt;/p&gt;

&lt;p&gt;
Sysdig Advisor accelerates mean time to resolution (MTTR) with live logs, performance data, and suggested remediation steps. It’s the easy button for Kubernetes troubleshooting!
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/wp-content/uploads/ErrImagePull-ImagePullbackOff-05.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FErrImagePull-ImagePullbackOff-05.png" alt="Troubleshooting ErrImagePull and ImagePullBackOff in Sysdig Monitor"&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;a href="https://sysdig.com/company/free-trial-monitor/" rel="noopener noreferrer"&gt;Try it free for 30 days!&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>container</category>
      <category>docker</category>
      <category>orchestration</category>
    </item>
    <item>
      <title>Understanding Kubernetes Evicted Pods</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Tue, 20 Sep 2022 15:55:44 +0000</pubDate>
      <link>https://dev.to/sysdig/understanding-kubernetes-evicted-pods-1hmd</link>
      <guid>https://dev.to/sysdig/understanding-kubernetes-evicted-pods-1hmd</guid>
      <description>&lt;p&gt;
What does it mean that Kubernetes Pods are evicted? They are terminated, usually the result of not having enough resources. But why does this happen?
&lt;/p&gt;

&lt;p&gt;
&lt;strong&gt;Eviction&lt;/strong&gt; is a process where a &lt;strong&gt;Pod&lt;/strong&gt; assigned to a Node is &lt;strong&gt;asked to terminate&lt;/strong&gt;. One of the most common cases in Kubernetes is &lt;strong&gt;Preemption&lt;/strong&gt;, where in order to schedule a new Pod in a Node with limited resources, another Pod needs to be terminated to free resources for the first one. 
&lt;/p&gt;
&lt;p&gt;
Also, Kubernetes constantly checks resources and evicts Pods if needed, a process called &lt;strong&gt;Node-pressure eviction&lt;/strong&gt;.
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlog_Images-SecureKubernetesDeployment-featured.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlog_Images-SecureKubernetesDeployment-featured.png" alt="Understanding Kubernetes Evicted Pods main image"&gt;&lt;/a&gt;
 
&lt;em&gt;Every day, thousands of Pods are evicted from their homes. Stranded and confused, they have to abandon their previous lifestyle. Some of them even become nodeless. The current society, imposing higher demands of CPU and memory, is part of the problem.&lt;/em&gt;
&lt;/p&gt;



&lt;p&gt;
During this article, you will discover:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
Reasons why Pods are evicted: Preemption and Node-pressure
&lt;/li&gt;
&lt;li&gt;
Preemption eviction
&lt;/li&gt;
&lt;li&gt;
Pod Priority Classes
&lt;/li&gt;
&lt;li&gt;
Node-pressure eviction
&lt;/li&gt;
&lt;li&gt;
Quality of Service Classes
&lt;/li&gt;
&lt;li&gt;
Other types of eviction
&lt;/li&gt;
&lt;li&gt;
Kubernetes Pod eviction monitoring in Prometheus
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="reasonswhypodsareevicted"&gt;Reasons why Pods are evicted: Preemption and Node-pressure&lt;/h2&gt;

&lt;p&gt;
There are several reasons why Pod eviction can happen in Kubernetes. The most important ones are:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Preemption
&lt;/li&gt;
&lt;li&gt;Node-pressure eviction
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="preemptioneviction"&gt;Preemption eviction&lt;/h2&gt;

&lt;p&gt;
&lt;strong&gt;Preemption&lt;/strong&gt; is the following process: if a new Pod needs to be scheduled but &lt;strong&gt;doesn’t have any suitable Node with enough resources&lt;/strong&gt;, then kube-scheduler will check if, by &lt;strong&gt;evicting&lt;/strong&gt; (terminating) some Pods with &lt;strong&gt;lower priority&lt;/strong&gt;, the new Pod can be scheduled on that Node.
&lt;/p&gt;

&lt;p&gt;
Let’s first understand how Kubernetes scheduling works.
&lt;/p&gt;

&lt;h3&gt;Pod Scheduling&lt;/h3&gt;

&lt;p&gt;
Kubernetes &lt;strong&gt;Scheduling&lt;/strong&gt; is the process where &lt;strong&gt;Pods are assigned to nodes&lt;/strong&gt;.
&lt;/p&gt;

&lt;p&gt;
By default, there’s a Kubernetes entity responsible for scheduling, called &lt;code&gt;kube-scheduler&lt;/code&gt;, which runs in the control plane. The Pod will start in the &lt;a href="https://sysdig.com/blog/kubernetes-pod-pending-problems/" rel="noopener noreferrer"&gt;Pending state&lt;/a&gt; until a matching node is found.
&lt;/p&gt;

&lt;p&gt;
The process of assigning a Pod to a Node follows this sequence:
&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Filtering
&lt;/li&gt;
&lt;li&gt;Scoring
&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Filtering&lt;/h4&gt;

&lt;p&gt;
During the &lt;strong&gt;Filtering step,&lt;/strong&gt; &lt;code&gt;kube-scheduler&lt;/code&gt; will &lt;strong&gt;select&lt;/strong&gt; all Nodes &lt;strong&gt;where the current Pod might be placed&lt;/strong&gt;. Features like Taints and Tolerations will be taken into account here. Once finished, it will have a list of suitable Nodes for that Pod.
&lt;/p&gt;

&lt;h4&gt;Scoring&lt;/h4&gt;

&lt;p&gt;
During the &lt;strong&gt;Scoring step,&lt;/strong&gt; &lt;code&gt;kube-scheduler&lt;/code&gt; will take the resulting list from the previous step and &lt;strong&gt;assign a score to each of the nodes&lt;/strong&gt;. This way, candidate nodes are ordered from most suitable to least. In case two nodes have the same score, kube-scheduler orders them randomly.
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/wp-content/uploads/Troubleshooting-Evicted-Pods-02.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FTroubleshooting-Evicted-Pods-02.png" alt="Filtering and Scoring process"&gt;&lt;/a&gt;
&lt;/p&gt;
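&lt;p&gt;&lt;em&gt;A toy sketch of the two steps: filter out nodes that can’t fit the Pod, then score the survivors (here, simply by free memory). Real kube-scheduler plugins are far richer; the names and values below are illustrative only:&lt;/em&gt;&lt;/p&gt;

```python
# Toy sketch of the two scheduling steps. Filtering keeps nodes that can fit
# the Pod's request; Scoring picks the best survivor (here, most free memory).
# Real kube-scheduler plugins are far richer; names below are illustrative.
def schedule(pod_request, nodes):
    feasible = [n for n in nodes if n["free_mem"] >= pod_request]   # Filtering
    if not feasible:
        return None              # no suitable Node: Kubernetes tries preemption
    return max(feasible, key=lambda n: n["free_mem"])["name"]       # Scoring

nodes = [
    {"name": "node-a", "free_mem": 512},
    {"name": "node-b", "free_mem": 2048},
    {"name": "node-c", "free_mem": 128},
]
print(schedule(1024, nodes))  # node-b
print(schedule(4096, nodes))  # None
```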

&lt;p&gt;
But, what happens if there are no suitable Nodes for a Pod to run? When that’s the case, Kubernetes will start the &lt;strong&gt;preemption&lt;/strong&gt; process, trying to &lt;strong&gt;evict&lt;/strong&gt; lower priority Pods in order for the new one to be assigned.
&lt;/p&gt;

&lt;h3 id="podpriorityclasses"&gt;Pod Priority Classes&lt;/h3&gt;

&lt;p&gt;
How can I prevent a particular Pod from being evicted in case of a preemption process? Chances are, a specific Pod is critical for you and should never be terminated.
&lt;/p&gt;

&lt;p&gt;
That’s why Kubernetes features &lt;strong&gt;Priority Classes&lt;/strong&gt;.
&lt;/p&gt;

&lt;p&gt;
A &lt;strong&gt;Priority Class&lt;/strong&gt; is a Kubernetes object that allows us to map &lt;strong&gt;numerical priority values&lt;/strong&gt; to specific Pods. Those with a higher value are classified as more important and less likely to be evicted.
&lt;/p&gt;

&lt;p&gt;
You can query current Priority Classes using:
&lt;/p&gt;

&lt;pre&gt;kubectl get priorityclasses
kubectl get pc

NAME                      VALUE        GLOBAL-DEFAULT   AGE
system-cluster-critical   2000000000   false            2d
system-node-critical      2000001000   false            2d
&lt;/pre&gt;

&lt;h3&gt;Priority Class example&lt;/h3&gt;

&lt;p&gt;
Let’s do a practical example using the &lt;a href="https://tapas.io/episode/220482" rel="noopener noreferrer"&gt;Berry Club comic&lt;/a&gt; from &lt;a href="https://www.mrlovenstein.com/" rel="noopener noreferrer"&gt;Mr. Lovenstein&lt;/a&gt;:
&lt;/p&gt;

&lt;p&gt;
There are three Pods representing blueberry, raspberry and strawberry:
&lt;/p&gt;

&lt;pre&gt;NAME         READY   STATUS             RESTARTS   AGE
blueberry    1/1     Running            0          4h41m
raspberry    1/1     Running            0          58m
strawberry   1/1     Running            0          5h22m
&lt;/pre&gt;

&lt;p&gt;
And there are two Priority Classes: trueberry and falseberry. The first one will have a higher value indicating higher priority.
&lt;/p&gt;

&lt;pre&gt;apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: trueberry
value: 1000000
globalDefault: false
description: "This fruit is a true berry"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: falseberry
value: 5000
globalDefault: false
description: "This fruit is a false berry"
&lt;/pre&gt;

&lt;ul&gt;
&lt;li&gt;Blueberry will have the trueberry priority class (value = 1000000)
&lt;/li&gt;
&lt;li&gt;raspberry and strawberry will both have the falseberry priority class (value = 5000)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
This will mean that in case of a preemption, raspberry and strawberry are more likely to be evicted to make room for higher priority Pods.
&lt;/p&gt;

&lt;p&gt;
Then assign the Priority Classes to Pods by adding this to the Pod definition:
&lt;/p&gt;

&lt;pre&gt; priorityClassName: trueberry
&lt;/pre&gt;

&lt;p&gt;
Let’s now try to add three more fruits, but with a twist. All of the new fruits will contain the higher Priority Class called &lt;code&gt;trueberry&lt;/code&gt;.
&lt;/p&gt;

&lt;p&gt;
Since the three new fruits have memory or CPU requirements that the node can’t satisfy, &lt;code&gt;kube-scheduler&lt;/code&gt; preempts (evicts) all Pods with lower priority than the new fruits. Blueberry stays running, as it has the highest priority class.
&lt;/p&gt;

&lt;pre&gt;NAME         READY   STATUS             RESTARTS   AGE
banana       0/1     ContainerCreating  0          2s
blueberry    1/1     Running            0          4h42m
raspberry    0/1     Terminating        0          59m
strawberry   0/1     Terminating        0          5h23m
tomato       0/1     ContainerCreating  0          2s
watermelon   0/1     ContainerCreating  0          2s
&lt;/pre&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/wp-content/uploads/Blog_Images-TrobleshootingEvicted-diagram4.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlog_Images-TrobleshootingEvicted-diagram4.png" alt="Kubernetes Priority Classes - Live example"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
This is the end result:
&lt;/p&gt;

&lt;pre&gt;NAME         READY   STATUS             RESTARTS   AGE
banana       1/1     Running            0          3s
blueberry    1/1     Running            0          4h43m
tomato       1/1     Running            0          3s
watermelon   1/1     Running            0          3s
&lt;/pre&gt;

&lt;p&gt;
These are strange times for berry club...
&lt;/p&gt;

&lt;h2 id="nodepressureeviction"&gt;Node-pressure eviction&lt;/h2&gt;

&lt;p&gt;
Apart from preemption, Kubernetes also constantly checks node resources, like disk pressure, CPU or Out of Memory (OOM).
&lt;/p&gt;

&lt;p&gt;
In case a resource (like &lt;strong&gt;CPU&lt;/strong&gt; or &lt;strong&gt;memory&lt;/strong&gt;) consumption in the node &lt;strong&gt;reaches a certain threshold&lt;/strong&gt;, &lt;code&gt;kubelet&lt;/code&gt; will start evicting Pods in order to &lt;strong&gt;free up the resource&lt;/strong&gt;. Quality of Service (QoS) will be taken into account to determine the eviction order.
&lt;/p&gt;

&lt;h3 id="qosclasses"&gt;Quality of Service Classes&lt;/h3&gt;

&lt;p&gt;
In Kubernetes, Pods are given one of three &lt;strong&gt;QoS Classes&lt;/strong&gt;, which define how likely they are to be evicted when resources run short, from least likely to most likely:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Guaranteed&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Burstable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BestEffort&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
How are these QoS Classes assigned to Pods? This is based on &lt;strong&gt;limits and requests&lt;/strong&gt; for &lt;strong&gt;CPU&lt;/strong&gt; and &lt;strong&gt;memory&lt;/strong&gt;. As a reminder:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Limits&lt;/strong&gt;: maximum amount of a resource that a container can use.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Requests&lt;/strong&gt;: minimum desired amount of resources for a container to run.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
For more information about limits and requests, please check &lt;a href="https://sysdig.com/blog/kubernetes-limits-requests/" rel="noopener noreferrer"&gt;Understanding Kubernetes limits and requests by example&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/wp-content/uploads/Troubleshooting-Evicted-Pods-03.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FTroubleshooting-Evicted-Pods-03.png" alt="QoS Classes in Kubernetes"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h4&gt;Guaranteed&lt;/h4&gt;

&lt;p&gt;
A Pod is assigned a QoS Class of &lt;strong&gt;Guaranteed&lt;/strong&gt; if:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All containers in the Pod have &lt;strong&gt;both Limits and Requests set&lt;/strong&gt; for CPU and memory.
&lt;/li&gt;
&lt;li&gt;All containers in the Pod have &lt;strong&gt;the same value&lt;/strong&gt; for CPU Limit and CPU Request.
&lt;/li&gt;
&lt;li&gt;All containers in the Pod have &lt;strong&gt;the same value&lt;/strong&gt; for memory Limit and memory Request.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
A Guaranteed Pod won’t be evicted in normal circumstances to allocate another Pod in the node.
&lt;/p&gt;

&lt;h4&gt;Burstable&lt;/h4&gt;

&lt;p&gt;
A Pod is assigned a QoS Class of &lt;strong&gt;Burstable&lt;/strong&gt; if:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It doesn’t have a QoS Class of Guaranteed.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Either Limits or Requests&lt;/strong&gt; have been set for a container in the Pod.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
A Burstable Pod can be evicted, though it’s less likely than the next category.
&lt;/p&gt;

&lt;h4&gt;BestEffort&lt;/h4&gt;

&lt;p&gt;
A Pod is assigned a QoS Class of &lt;strong&gt;BestEffort&lt;/strong&gt; if:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No Limits and Requests&lt;/strong&gt; are set for any container in the Pod.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
BestEffort Pods have the highest chance of eviction in case of a node-pressure process happening in the node.
&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;em&gt;
Important: there may be other available resources in Limits and Requests, like ephemeral-storage, but they are not used for QoS Class calculation.
&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
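&lt;p&gt;&lt;em&gt;The classification rules above can be sketched as a small function. This is an illustration, not the exact kubelet implementation (for instance, it ignores the case where requests default to limits):&lt;/em&gt;&lt;/p&gt;

```python
# Sketch of the QoS classification rules above. Each container is a dict with
# optional "requests" and "limits" maps; real kubelet logic also handles
# requests defaulting to limits, which this illustration skips.
def qos_class(containers):
    def guaranteed(c):
        req, lim = c.get("requests", {}), c.get("limits", {})
        return all(
            req.get(r) is not None and req.get(r) == lim.get(r)
            for r in ("cpu", "memory")
        )
    if all(guaranteed(c) for c in containers):
        return "Guaranteed"
    if any(c.get("requests") or c.get("limits") for c in containers):
        return "Burstable"
    return "BestEffort"

print(qos_class([{"requests": {"cpu": "500m", "memory": "1Gi"},
                  "limits":   {"cpu": "500m", "memory": "1Gi"}}]))  # Guaranteed
print(qos_class([{"requests": {"cpu": "250m"}}]))                   # Burstable
print(qos_class([{}]))                                              # BestEffort
```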
&lt;p&gt;


&lt;a href="https://sysdig.com/wp-content/uploads/Blog_Images-TrobleshootingEvicted-diagram3.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FBlog_Images-TrobleshootingEvicted-diagram3.png" alt="Quality of Service cheatsheet"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
As mentioned, QoS Classes will be taken into account for node-pressure eviction. Here’s the process that happens internally.
&lt;/p&gt;

&lt;p&gt;
The kubelet ranks the Pods to be evicted in the following order:
&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;BestEffort&lt;/code&gt; Pods or &lt;code&gt;Burstable&lt;/code&gt; Pods where usage exceeds requests
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Burstable&lt;/code&gt; Pods where usage is below requests or &lt;code&gt;Guaranteed&lt;/code&gt; Pods
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Kubernetes will try to evict Pods from group 1 before group 2.
&lt;/p&gt;

&lt;p&gt;
Some takeaways from the above:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you set very low requests in your containers, their Pod is likely to be assigned to group 1, which means it's more likely to be evicted.
&lt;/li&gt;
&lt;li&gt;You can’t tell which specific Pod is going to be evicted, just that Kubernetes will try to evict ones from group 1 before group 2.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Guaranteed&lt;/code&gt; Pods are usually safe from eviction: &lt;code&gt;kubelet&lt;/code&gt; won’t evict them in order to schedule other Pods. But if some system services need more resources, the kubelet will terminate &lt;code&gt;Guaranteed&lt;/code&gt; Pods if necessary, always with the lowest priority.
&lt;/li&gt;
&lt;/ul&gt;
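&lt;p&gt;&lt;em&gt;The two-group ranking can be sketched like this, with illustrative numeric values (BestEffort Pods request nothing, so any usage exceeds their requests):&lt;/em&gt;&lt;/p&gt;

```python
# Sketch of the eviction ranking: group 1 is Pods whose usage exceeds their
# requests (BestEffort Pods request nothing, so any usage qualifies);
# group 2 is everyone else. Values are illustrative, not real units.
def eviction_group(pod):
    return 1 if pod["usage"] > pod["requests"] else 2

pods = [
    {"name": "besteffort",    "requests": 0,   "usage": 50},
    {"name": "burstable-hot", "requests": 100, "usage": 150},
    {"name": "guaranteed",    "requests": 200, "usage": 180},
]
ranked = sorted(pods, key=eviction_group)  # group 1 first, stable within groups
print([p["name"] for p in ranked])  # ['besteffort', 'burstable-hot', 'guaranteed']
```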

&lt;h2 id="othertypeseviction"&gt;Other types of eviction&lt;/h2&gt;

&lt;p&gt;
This article is focused on preemption and node-pressure eviction, but Pods can be evicted in other ways as well. Examples include:
&lt;/p&gt;

&lt;h3&gt;API-initiated eviction&lt;/h3&gt;

&lt;p&gt;
You can request an on-demand eviction of a Pod in one of your nodes by using the &lt;a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#create-eviction-pod-v1-core" rel="noopener noreferrer"&gt;Kubernetes Eviction API&lt;/a&gt;.
&lt;/p&gt;

&lt;h3&gt;Taint-based eviction&lt;/h3&gt;

&lt;p&gt;
With Kubernetes Taints and Tolerations, you can guide how your Pods are assigned to Nodes. But if you apply a &lt;code&gt;NoExecute&lt;/code&gt; taint to an existing Node, all Pods that don’t tolerate it will be evicted immediately.
&lt;/p&gt;
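&lt;p&gt;
For example, tainting a node with the &lt;code&gt;NoExecute&lt;/code&gt; effect (the node name, key, and value here are illustrative) evicts every Pod on it that lacks a matching toleration:
&lt;/p&gt;

&lt;pre&gt;
kubectl taint nodes node1 maintenance=planned:NoExecute
&lt;/pre&gt;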

&lt;h3&gt;Node drain&lt;/h3&gt;

&lt;p&gt;
There are times when Nodes become unusable or you no longer want to schedule work on them. The command &lt;code&gt;kubectl cordon&lt;/code&gt; prevents new Pods from being scheduled on a node, but you can also empty a node of its current Pods entirely. If you run &lt;code&gt;kubectl drain nodename&lt;/code&gt;, all Pods in the node will be evicted, respecting their graceful termination period.
&lt;/p&gt;
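&lt;p&gt;
A typical sequence looks like this (the node name is illustrative):
&lt;/p&gt;

&lt;pre&gt;
# Mark the node unschedulable; existing Pods keep running
kubectl cordon node1

# Evict every Pod from the node (DaemonSet Pods require this flag)
kubectl drain node1 --ignore-daemonsets
&lt;/pre&gt;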

&lt;h2 id="podevictionprometheus"&gt;Kubernetes Pod eviction monitoring in Prometheus&lt;/h2&gt;

&lt;p&gt;
You can use Prometheus to easily monitor Pod evictions with the following query:
&lt;/p&gt;

&lt;pre&gt;
kube_pod_status_reason{reason="Evicted"} &amp;gt; 0
&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/Troubleshooting-Evicted-Pods-6.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FTroubleshooting-Evicted-Pods-6.png" alt="Monitor Evicted Pods with Prometheus"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;
This will display all evicted Pods in your cluster. You can also pair it with &lt;code&gt;kube_pod_status_phase{phase="Failed"}&lt;/code&gt; to alert on Pods that were evicted after a failure.
&lt;/p&gt;
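&lt;p&gt;
As a sketch of that pairing (assuming both series come from kube-state-metrics with matching &lt;code&gt;namespace&lt;/code&gt; and &lt;code&gt;pod&lt;/code&gt; labels), you could use:
&lt;/p&gt;

&lt;pre&gt;
kube_pod_status_reason{reason="Evicted"} &amp;gt; 0
  and on(namespace, pod)
kube_pod_status_phase{phase="Failed"} &amp;gt; 0
&lt;/pre&gt;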

&lt;p&gt;
If you want to dig deeper, check the following articles for monitoring resources in Prometheus:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://sysdig.com/blog/kubernetes-resource-limits/" rel="noopener noreferrer"&gt;How to rightsize the Kubernetes resource limits&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://sysdig.com/blog/kubernetes-capacity-planning/" rel="noopener noreferrer"&gt;Kubernetes capacity planning: How to rightsize the requests of your cluster &lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id="conclusion"&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;
As you can see, eviction is just another Kubernetes mechanism for managing limited resources: in this case, the node resources that Pods are using.
&lt;/p&gt;

&lt;p&gt;
During &lt;strong&gt;preemption&lt;/strong&gt;, Kubernetes will try to free up resources by evicting lower-priority Pods in order to schedule a new one. With Priority Classes, you can control which Pods are more likely to keep running after preemption.
&lt;/p&gt;

&lt;p&gt;
During execution, Kubernetes will check for &lt;strong&gt;node pressure&lt;/strong&gt; and evict Pods if needed. With QoS classes, you can control which Pods are more likely to be evicted under node pressure.
&lt;/p&gt;

&lt;p&gt;
Memory and CPU are finite resources in your nodes, and you need to configure your Pods, containers, and nodes to use the right amount of them. Managing these resources properly not only reduces costs, but also ensures that your most important processes keep running.
&lt;/p&gt;





&lt;h2&gt;Get ahead of Pod eviction with Sysdig Monitor&lt;/h2&gt;
&lt;p&gt;
With Sysdig Advisor, you can review cluster resource availability in order to prevent Pod eviction. Featuring:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU overcommitment metric
&lt;/li&gt;
&lt;li&gt;CPU capacity
&lt;/li&gt;
&lt;li&gt;Tips on how to fix issues
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/cpu-overcommit.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2Fcpu-overcommit.png" alt="Screenshot of Sysdig Monitor, with the CPU overcommit resource"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;
Sysdig Advisor accelerates mean time to resolution (MTTR) with live logs, performance data, and suggested remediation steps. It’s the easy button for Kubernetes troubleshooting!
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://sysdig.com/company/free-trial-monitor/" rel="noopener noreferrer"&gt;Try it free for 30 days!&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>memory</category>
      <category>pods</category>
      <category>evicted</category>
    </item>
    <item>
      <title>What is Kubernetes CrashLoopBackOff? And how to fix it</title>
      <dc:creator>Javier Martínez</dc:creator>
      <pubDate>Mon, 29 Aug 2022 11:49:48 +0000</pubDate>
      <link>https://dev.to/sysdig/what-is-kubernetes-crashloopbackoff-and-how-to-fix-it-1p84</link>
      <guid>https://dev.to/sysdig/what-is-kubernetes-crashloopbackoff-and-how-to-fix-it-1p84</guid>
      <description>&lt;p&gt;&lt;strong&gt;CrashLoopBackOff is a Kubernetes state representing a restart loop&lt;/strong&gt; that is happening in a Pod: a container in the Pod is started, but &lt;strong&gt;crashes and is then restarted, over and over again&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Kubernetes will wait an increasing back-off time between restarts to give you a chance to fix the error. As such, CrashLoopBackOff is not an error in itself, but an indication that an underlying error is preventing a Pod from starting properly.&lt;/p&gt;

&lt;p&gt;Note that the Pod restarts because its &lt;code&gt;restartPolicy&lt;/code&gt; is set to &lt;code&gt;Always&lt;/code&gt; (the default) or &lt;code&gt;OnFailure&lt;/code&gt;. &lt;a href="https://sysdig.com/blog/how-to-monitor-kubelet/" rel="noopener noreferrer"&gt;The kubelet&lt;/a&gt; reads this configuration and restarts the containers in the Pod, causing the loop. This behavior is actually useful, since it provides some time for missing resources to finish loading, as well as for us to detect the problem and debug it – more on that &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/#howtodebugtroubleshootandfixkubernetescrashloopbackoff" rel="noopener noreferrer"&gt;later&lt;/a&gt;.&lt;/p&gt;
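&lt;p&gt;The &lt;code&gt;restartPolicy&lt;/code&gt; lives at the Pod spec level. An illustrative fragment (the names are placeholders):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  restartPolicy: Always   # default; OnFailure and Never are also valid
  containers:
  - name: myapp
    image: myapp:1.0

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;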

&lt;p&gt;That explains the &lt;em&gt;CrashLoop&lt;/em&gt; part, but what about the &lt;strong&gt;&lt;em&gt;BackOff&lt;/em&gt; time&lt;/strong&gt;? Basically, it’s an &lt;strong&gt;exponential delay between restarts&lt;/strong&gt; (10s, 20s, 40s, …) which is capped at five minutes. When a Pod’s status displays CrashLoopBackOff, it means that it’s currently waiting the indicated time before restarting the containers again. And they will probably fail again, unless the underlying issue is fixed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/What-is-Crashloopbackoff-02.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FWhat-is-Crashloopbackoff-02.png" alt="Kubernetes Crashloopbackoff, an illustrated representation. A Pod is in a loop. It tries to run, but it fails, so it goes to a Failed state. If waits a bit to help you debug, then it tries to run again. If the issue is not fixed, we are in a loop. It fails again."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article you’ll see:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/#whatisakubernetescrashloopbackoffthemeaning" rel="noopener noreferrer"&gt;What is CrashLoopBackOff?&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/#howcaniseeiftherearecrashloopbackoffinmycluster" rel="noopener noreferrer"&gt;How to detect CrashLoopBackOff problems&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/#whydoesacrashloopbackoffoccur" rel="noopener noreferrer"&gt;Common causes for a CrashLoopBackOff&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/#howtodebugtroubleshootandfixkubernetescrashloopbackoff" rel="noopener noreferrer"&gt;Kubernetes tools for debugging a CrashLoopBackOff&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://sysdig.com/blog/debug-kubernetes-crashloopbackoff/#howtoalertonkubernetescrashloopbackoff" rel="noopener noreferrer"&gt;How to detect CrashLoopBackOff with Prometheus&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  How to detect a CrashLoopBackOff in your cluster?
&lt;/h2&gt;

&lt;p&gt;Most likely, you discovered one or more pods in this state by listing the pods with &lt;code&gt;kubectl get pods&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get pods
NAME                     READY     STATUS             RESTARTS   AGE
flask-7996469c47-d7zl2   1/1       Running            1          77d
flask-7996469c47-tdr2n   1/1       Running            0          77d
nginx-5796d5bc7c-2jdr5   0/1       CrashLoopBackOff   2          1m
nginx-5796d5bc7c-xsl6p   0/1       CrashLoopBackOff   2          1m

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the output, you can see that the last two pods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Are not&lt;/strong&gt; in &lt;code&gt;READY&lt;/code&gt; condition (&lt;code&gt;0/1&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;  Their &lt;strong&gt;status&lt;/strong&gt; displays &lt;code&gt;CrashLoopBackOff&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  Column &lt;code&gt;RESTARTS&lt;/code&gt; displays &lt;strong&gt;one or more&lt;/strong&gt; restarts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These three signals are pointing to what we explained: pods are failing, and they are being restarted. Between restarts, there’s a grace period which is represented as &lt;code&gt;CrashLoopBackOff&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You may also be “lucky” enough to catch the Pod during the brief time it is in the &lt;code&gt;Running&lt;/code&gt; or the &lt;code&gt;Failed&lt;/code&gt; state.&lt;/p&gt;
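&lt;p&gt;Rather than relying on luck, you can watch the Pod list and see those state transitions as they happen:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -w

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;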

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/What-is-Crashloopbackoff-03.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FWhat-is-Crashloopbackoff-03.png" alt="A timeline of a CrashloopBackoff. Everytime it fails, the BackoffTime and the Restart Count are increased"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Common reasons for a CrashLoopBackOff
&lt;/h2&gt;

&lt;p&gt;It’s important to note that a CrashLoopBackOff is &lt;strong&gt;not the actual error that is crashing the pod&lt;/strong&gt;. Remember that it’s just displaying the loop happening in the &lt;code&gt;STATUS&lt;/code&gt; column. You need to find the underlying error affecting the containers.&lt;/p&gt;

&lt;p&gt;Some of the errors linked to the actual &lt;strong&gt;application&lt;/strong&gt; are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Misconfigurations:&lt;/strong&gt; Like a typo in a configuration file.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;A resource is not available:&lt;/strong&gt; Like a PersistentVolume that is not mounted.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Wrong command line arguments:&lt;/strong&gt; Either missing, or the incorrect ones.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Bugs &amp;amp; Exceptions:&lt;/strong&gt; That can be anything, very specific to your application.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And finally, errors from the network and permissions are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  You tried to &lt;strong&gt;bind an existing port&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;  The &lt;strong&gt;memory limits are too low&lt;/strong&gt;, so the container is &lt;a href="https://sysdig.com/blog/troubleshoot-kubernetes-oom/" rel="noopener noreferrer"&gt;Out Of Memory killed&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Failing &lt;a href="https://sysdig.com/blog/whats-new-kubernetes-1-16/#950" rel="noopener noreferrer"&gt;liveness probes&lt;/a&gt;&lt;/strong&gt;, which cause the container to be restarted repeatedly.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Read-only filesystems&lt;/strong&gt;, or lack of permissions in general.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once again, this is just a list of possible causes but there could be many others.&lt;/p&gt;

&lt;p&gt;Let’s now see how to dig deeper and find the actual cause.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to debug, troubleshoot and fix a CrashLoopBackOff state
&lt;/h2&gt;

&lt;p&gt;From the previous section, you understand that there are plenty of reasons why a pod ends up in a CrashLoopBackOff state. Now, how do you know which one is affecting you? Let’s review &lt;strong&gt;some tools you can use to debug it&lt;/strong&gt;, and in which order to use them.&lt;/p&gt;

&lt;p&gt;This could be our best course of action:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Check the &lt;strong&gt;pod description&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt; Check the &lt;strong&gt;pod&lt;/strong&gt; &lt;strong&gt;logs&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt; Check the &lt;strong&gt;events&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt; Check the &lt;strong&gt;deployment&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  1. Check the pod description – kubectl describe pod
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;kubectl describe pod&lt;/code&gt; command provides detailed information of a specific Pod and its containers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl describe pod the-pod-name
Name:         the-pod-name
Namespace:    default
Priority:     0
…
State:          Waiting
Reason:       CrashLoopBackOff
Last State:     Terminated
Reason:       Error
…
Warning  BackOff                1m (x5 over 1m)   kubelet, ip-10-0-9-132.us-east-2.compute.internal  Back-off restarting failed container
…

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the describe output, you can extract the following information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Current pod &lt;code&gt;State&lt;/code&gt; is &lt;code&gt;Waiting&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Reason&lt;/strong&gt; for the Waiting state is “&lt;strong&gt;CrashLoopBackOff&lt;/strong&gt;”.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Last&lt;/strong&gt; (or previous) state was “&lt;strong&gt;Terminated&lt;/strong&gt;”.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Reason&lt;/strong&gt; for the last termination was “&lt;strong&gt;Error&lt;/strong&gt;”.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That aligns with the loop behavior we’ve been explaining.&lt;/p&gt;

&lt;p&gt;By using &lt;code&gt;kubectl describe pod&lt;/code&gt; you can check for misconfigurations in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  The pod definition.&lt;/li&gt;
&lt;li&gt;  The &lt;strong&gt;container&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;  The &lt;strong&gt;image&lt;/strong&gt; pulled for the container.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Resources&lt;/strong&gt; allocated for the container.&lt;/li&gt;
&lt;li&gt;  Wrong or missing &lt;strong&gt;arguments&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;  …
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;…
Warning  BackOff                1m (x5 over 1m)   kubelet, ip-10-0-9-132.us-east-2.compute.internal  Back-off restarting failed container
…

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the final lines, you see a list of the last events associated with this pod, where one of those is &lt;code&gt;"Back-off restarting failed container"&lt;/code&gt;. This is the event linked to the restart loop. There will be just one event line even if multiple restarts have happened; the counter (here, &lt;code&gt;x5&lt;/code&gt;) shows how many times.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Check the logs – kubectl logs
&lt;/h3&gt;

&lt;p&gt;You can view the logs for all the containers of the pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs mypod --all-containers

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or even a container in that pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs mypod -c mycontainer

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a wrong value in the affected pod is the cause, the logs may display useful information.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Check the events – kubectl get events
&lt;/h3&gt;

&lt;p&gt;They can be listed with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get events

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, you can list all of the events of a single Pod by using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get events --field-selector involvedObject.name=mypod

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that this information is also present at the bottom of the &lt;code&gt;describe pod&lt;/code&gt; output.&lt;/p&gt;
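&lt;p&gt;Events are not guaranteed to be listed in order, so it can also help to sort them by creation timestamp:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get events --sort-by=.metadata.creationTimestamp

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;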

&lt;h3&gt;
  
  
  4. Check the deployment – kubectl describe deployment
&lt;/h3&gt;

&lt;p&gt;You can get this information with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl describe deployment mydeployment

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If there’s a Deployment defining the desired Pod state, it might contain a misconfiguration that is causing the CrashLoopBackOff.&lt;/p&gt;

&lt;h3&gt;
  
  
  Putting it all together
&lt;/h3&gt;

&lt;p&gt;In the following example you can see how to dig into the logs, where an error in a command argument is found.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/What-is-Crashloopbackoff-04.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FWhat-is-Crashloopbackoff-04.png" alt="Debugging a Crashloopbackoff. It shows three terminals with the relationship between several debug commands."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How to detect CrashLoopBackOff in Prometheus
&lt;/h2&gt;

&lt;p&gt;If you’re using &lt;a href="https://sysdig.com/blog/prometheus-query-examples/" rel="noopener noreferrer"&gt;Prometheus for cloud monitoring&lt;/a&gt;, here are some tips that can help you alert when a CrashLoopBackOff takes place.&lt;/p&gt;

&lt;p&gt;You can quickly scan the containers in your cluster that are in CrashLoopBackOff &lt;code&gt;status&lt;/code&gt; by using the following expression (you will need &lt;a href="https://docs.sysdig.com/en/docs/installation/sysdig-agent/agent-configuration/enable-kube-state-metrics/" rel="noopener noreferrer"&gt;Kube State Metrics&lt;/a&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/What-is-Crashloopbackoff-05.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FWhat-is-Crashloopbackoff-05.png" alt="PromQL example of CrashLoopBackOff detection based on pod status waiting."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Alternatively, you can track the number of restarts happening in pods with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;rate(kube_pod_container_status_restarts_total[5m]) &amp;gt; 0

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/What-is-Crashloopbackoff-06.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FWhat-is-Crashloopbackoff-06.png" alt="PromQL example of CrashLoopBackOff detection based on restart rate"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt; Not all restarts happening in your cluster are related to CrashLoopBackOff statuses.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/What-is-Crashloopbackoff-07.png" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FWhat-is-Crashloopbackoff-07.png" alt="Correlation between restarts and crashloopbackoff. Not all restarts are caused by a crashloopbackoff."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After every CrashLoopBackOff period there should be a restart (1), but there could be restarts not related with CrashLoopBackOff (2).&lt;/p&gt;

&lt;p&gt;Afterwards, you could create a Prometheus Alerting Rule like the following to receive notifications if any of your pods are in this state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- alert: RestartsAlert
  expr: rate(kube_pod_container_status_restarts_total[5m]) &amp;gt; 0
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: Pod is being restarted
    description: Pod {{ $labels.pod }} in {{ $labels.namespace }} has a container {{ $labels.container }} which is being restarted

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we have seen how CrashLoopBackOff isn’t an error by itself, but just a notification of the retrial loop that is happening in the pod.&lt;/p&gt;

&lt;p&gt;We saw a description of the states it passes through, and then how to track it with &lt;code&gt;kubectl&lt;/code&gt; commands.&lt;/p&gt;

&lt;p&gt;Also, we have seen common misconfigurations that can cause this state and what tools you can use to debug it.&lt;/p&gt;

&lt;p&gt;Finally, we reviewed how Prometheus can help us in tracking and alerting CrashLoopBackOff events in our pods.&lt;/p&gt;

&lt;p&gt;Although not an intuitive message, CrashLoopBackOff is a useful concept that makes sense and is nothing to be afraid of.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;em&gt;Debug CrashLoopBackOff faster with Sysdig Monitor&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;Advisor, a new Kubernetes troubleshooting product in Sysdig Monitor, &lt;strong&gt;accelerates troubleshooting by up to 10x&lt;/strong&gt;. Advisor displays a prioritized list of issues and relevant troubleshooting data to surface the biggest problem areas and accelerate time to resolution.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/wp-content/uploads/What-is-Crashloopbackoff-08.gif" rel="noopener noreferrer"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsysdig.com%2Fwp-content%2Fuploads%2FWhat-is-Crashloopbackoff-08.gif" alt="How to debug a crashloopbackoff with Sysdig Monitor Advisor"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sysdig.com/company/free-trial-monitor/" rel="noopener noreferrer"&gt;Try it for yourself for free for 30 days!&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>crashloopbackoff</category>
      <category>pods</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
