<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ana Cozma</title>
    <description>The latest articles on DEV Community by Ana Cozma (@the_cozma).</description>
    <link>https://dev.to/the_cozma</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F769233%2F0f14f6ea-58bf-4797-b2fe-b2677507caa6.jpeg</url>
      <title>DEV Community: Ana Cozma</title>
      <link>https://dev.to/the_cozma</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/the_cozma"/>
    <language>en</language>
    <item>
      <title>How to Check TLS Configuration of URLs with Curl and Bash Script</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Tue, 20 Aug 2024 14:22:06 +0000</pubDate>
      <link>https://dev.to/the_cozma/how-to-check-tls-configuration-of-urls-with-curl-and-bash-script-3ad1</link>
      <guid>https://dev.to/the_cozma/how-to-check-tls-configuration-of-urls-with-curl-and-bash-script-3ad1</guid>
      <description>&lt;p&gt;If you are working in an Azure environment and you are using Azure Availability Tests you might run into the following Health Advisory event:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;On 31 October 2024, in alignment with the Azure wide legacy TLS deprecation, TLS 1.0/1.1 protocol versions and the below listed TLS 1.2/1.3 legacy Cipher suites and Elliptical curves will be retired for Application Insights availability tests.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For a list of deprecated versions and remaining supported versions have a look over the official documentation &lt;a href="https://learn.microsoft.com/en-us/azure/azure-monitor/app/availability?tabs=standard#deprecating-tls-configuration" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But how do you quickly check which endpoint in your availability tests is impacted?&lt;/p&gt;

&lt;p&gt;This was the initial scenario that led me to create this script and ultimately write this blog post, but these checks apply to any case where you need to retrieve and verify the TLS configuration of URLs. Whether you're ensuring compliance with security standards, troubleshooting connection issues, or simply gathering information for audits, the script can help you get the information you need.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Curl for one URL
&lt;/h2&gt;

&lt;p&gt;You can use the &lt;code&gt;curl&lt;/code&gt; command with the &lt;code&gt;-v&lt;/code&gt; (verbose) option to see detailed information about the TLS handshake, including the TLS version, by running the command below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; https://example.com 2&amp;gt;&amp;amp;1 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"SSL connection using"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Explanation of the command and its parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;curl&lt;/code&gt; to make a request to the specified URL.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;-s&lt;/code&gt; option makes curl silent, except for errors.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;-v&lt;/code&gt; option outputs verbose information.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;2&amp;gt;&amp;amp;1&lt;/code&gt; redirects the standard error (where verbose output is written) to standard output, allowing grep to filter it.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;grep "SSL connection using"&lt;/code&gt; command filters out the line containing the TLS version.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Running the command produces output similar to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;*&lt;/span&gt; SSL connection using TLSv1.2 / ECDHE_RSA_AES_256_GCM_SHA384
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But what if you have more than one URL to check? Running this command manually for each one is tedious and involves a lot of copy-pasting. So let's look into saving some time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Curl and a Bash script to loop through a list of URLs
&lt;/h2&gt;

&lt;p&gt;We can take this a step further and create a script that will accept a list of URLs, loop through them and output the information we need. You can achieve this by following the steps below:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Create the file:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;First, let's create a file called &lt;code&gt;check_tls_version.sh&lt;/code&gt; (or name it whatever you like):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;touch &lt;/span&gt;check_tls_version.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Make the Script Executable:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x check_tls_details.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Create the Script:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vi check_tls_version.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can use vi, nano, or any editor of your choice, and paste the code below into the newly created file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Check if a file was provided as an argument&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Usage: &lt;/span&gt;&lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="s2"&gt; &amp;lt;file_with_urls&amp;gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Read the file line by line&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; url&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="c"&gt;# Make sure the line is not empty&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Checking &lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;

    &lt;span class="c"&gt;# Use curl to fetch the TLS details&lt;/span&gt;
    &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;&amp;amp;1&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;# Extract the TLS version&lt;/span&gt;
    &lt;span class="nv"&gt;tls_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"SSL connection using"&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $5}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;# Extract the cipher suite&lt;/span&gt;
    &lt;span class="nv"&gt;cipher_suite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"SSL connection using"&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;&lt;span class="s1"&gt;'/'&lt;/span&gt; &lt;span class="s1"&gt;'{print $2}'&lt;/span&gt; | xargs&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;# Extract the elliptic curve (if available)&lt;/span&gt;
    &lt;span class="nv"&gt;elliptic_curve&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"SSL certificate verify"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s1"&gt;'(?&amp;lt;=using ).*(?= curve)'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;# Output the results&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"URL: &lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"TLS Version: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;tls_version&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;Could&lt;/span&gt;&lt;span class="p"&gt; not retrieve TLS version&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Cipher Suite: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;cipher_suite&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;Could&lt;/span&gt;&lt;span class="p"&gt; not retrieve cipher suite&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$elliptic_curve&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Elliptic Curve: &lt;/span&gt;&lt;span class="nv"&gt;$elliptic_curve&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else
      &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Elliptic Curve: Could not retrieve elliptic curve or not applicable"&lt;/span&gt;
    &lt;span class="k"&gt;fi
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
  &lt;span class="k"&gt;fi
done&lt;/span&gt; &amp;lt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the sample above, I added comments on what each line does, but feel free to modify it to extract more or less of the information you are interested in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Prepare a File with URLs:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a text file with one URL per line, e.g., &lt;code&gt;urls.txt&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;https://example.com
https://anotherexample.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;5. Run the Script:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Having both the script (&lt;code&gt;check_tls_version.sh&lt;/code&gt;) and our list of URLs (&lt;code&gt;urls.txt&lt;/code&gt;), we can now run the script we created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./check_tls_details.sh urls.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;6. Sample Output:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The script will produce output similar to this for each URL you provided:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;Checking https://example.com...&lt;/span&gt;
&lt;span class="na"&gt;URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://example.com&lt;/span&gt;
&lt;span class="na"&gt;TLS Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TLSv1.3&lt;/span&gt;
&lt;span class="na"&gt;Cipher Suite&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AEAD-AES128-GCM-SHA256&lt;/span&gt;
&lt;span class="na"&gt;Elliptic Curve&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;X25519&lt;/span&gt;

&lt;span class="s"&gt;Checking https://anotherexample.com...&lt;/span&gt;
&lt;span class="na"&gt;URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://anotherexample.com&lt;/span&gt;
&lt;span class="na"&gt;TLS Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TLSv1.2&lt;/span&gt;
&lt;span class="na"&gt;Cipher Suite&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ECDHE-RSA-AES256-GCM-SHA384&lt;/span&gt;
&lt;span class="na"&gt;Elliptic Curve&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prime256v1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Explanation of the output parameters:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;URL:&lt;/em&gt; The URL being checked, as provided in &lt;code&gt;urls.txt&lt;/code&gt;.&lt;br&gt;
&lt;em&gt;TLS Version:&lt;/em&gt; The TLS version used by the URL.&lt;br&gt;
&lt;em&gt;Cipher Suite:&lt;/em&gt; The cipher suite used for the connection.&lt;br&gt;
&lt;em&gt;Elliptic Curve:&lt;/em&gt; The elliptic curve used, if applicable.&lt;/p&gt;

&lt;p&gt;Now that you have your data you can simply compare the TLS version, Cipher Suite or Elliptic Curve against the deprecated or supported versions and take appropriate actions to update them.&lt;/p&gt;
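&lt;p&gt;That comparison step can also be scripted. Below is a minimal sketch of a helper that flags anything below TLS 1.2; note that the version list here is purely illustrative, so check the official documentation linked above for the authoritative set of retired versions:&lt;/p&gt;

```shell
#!/bin/bash
# Sketch: classify a TLS version string as reported by curl.
# NOTE: the "deprecated" list below is illustrative only; consult the
# official Azure documentation for the authoritative retired versions.
classify_tls() {
  case "$1" in
    TLSv1|TLSv1.0|TLSv1.1) echo "DEPRECATED" ;;
    TLSv1.2|TLSv1.3)       echo "OK" ;;
    *)                     echo "UNKNOWN" ;;
  esac
}

classify_tls "TLSv1.1"   # prints DEPRECATED
classify_tls "TLSv1.3"   # prints OK
```

&lt;p&gt;You could call this helper from the loop in the script above, passing it the extracted &lt;code&gt;tls_version&lt;/code&gt; value.&lt;/p&gt;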

&lt;p&gt;Lastly, I used curl on macOS, but you can do the same on Windows by installing it from &lt;a href="https://curl.se/windows/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thank you for reading, and I hope this helps someone out there with their use case!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>bash</category>
      <category>devops</category>
    </item>
    <item>
      <title>Understanding and Mitigating the Latest OpenSSH Vulnerability (CVE-2024-6387) in AKS</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Wed, 17 Jul 2024 12:46:47 +0000</pubDate>
      <link>https://dev.to/the_cozma/understanding-and-mitigating-the-latest-openssh-vulnerability-cve-2024-6387-in-aks-44ak</link>
      <guid>https://dev.to/the_cozma/understanding-and-mitigating-the-latest-openssh-vulnerability-cve-2024-6387-in-aks-44ak</guid>
      <description>&lt;p&gt;Recently a new vulnerability in OpenSSH has been identified and the first question that popped into my mind was: &lt;em&gt;How do I make sure my nodes are not affected by _this vulnerability&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;In this blog post, I want to go over what the vulnerability is and how it can be exploited, explain how to check whether your Azure Kubernetes Service (AKS) cluster is vulnerable to CVE-2024-6387, and cover what you can do about it, including the different options for upgrading the VMSS image and how to choose between them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understand the vulnerability
&lt;/h2&gt;

&lt;h3&gt;
  
  
  CVE-2024-6387
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2024-6387" rel="noopener noreferrer"&gt;CVE-2024-6387&lt;/a&gt; is a critical unauthenticated RCE-as-root vulnerability that was identified in the OpenSSH server, &lt;code&gt;sshd&lt;/code&gt;, in glibc-based Linux systems. If exploited, this vulnerability grants full root access, affects the default configuration and does not require user interaction thus it is classified as a &lt;strong&gt;High Severity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This was identified on the 1st of July 2024.&lt;/p&gt;

&lt;p&gt;The researchers who discovered it also noted that in 2006 OpenSSH faced a similar vulnerability, known as &lt;a href="https://security-tracker.debian.org/tracker/CVE-2006-5051" rel="noopener noreferrer"&gt;CVE-2006-5051&lt;/a&gt;. While the 2006 issue was patched, later code changes reintroduced the bug. This is why CVE-2024-6387 is dubbed the "regreSSHion" bug: a regression of an issue that had already been fixed.&lt;/p&gt;

&lt;p&gt;CVE-2024-6387 vulnerability impacts the following OpenSSH server versions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenSSH versions from &lt;code&gt;8.5p1&lt;/code&gt; up to, but not including, &lt;code&gt;9.8p1&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;OpenSSH versions earlier than &lt;code&gt;4.4p1&lt;/code&gt;, if they have not been backport-patched against CVE-2006-5051 or patched against CVE-2008-4109&lt;/li&gt;
&lt;/ul&gt;
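&lt;p&gt;To find out which version a machine is running, &lt;code&gt;ssh -V&lt;/code&gt; (or &lt;code&gt;dpkg -l openssh-server&lt;/code&gt; on Debian/Ubuntu) prints it. As a rough sketch, a helper like the one below can tell whether a version string falls inside the affected range; the parsing is deliberately simplified (major.minor only) and it ignores the legacy pre-4.4p1 case:&lt;/p&gt;

```shell
#!/bin/bash
# Sketch: check whether an OpenSSH version string such as "9.6p1" falls in
# the range affected by CVE-2024-6387 (8.5p1 up to, but not including, 9.8p1).
# Simplified: looks at major.minor only and ignores the pre-4.4p1 legacy case.
vulnerable_to_regresshion() {
  major="${1%%.*}"     # text before the first dot
  rest="${1#*.}"       # text after the first dot
  minor="${rest%%p*}"  # strip the patch suffix, e.g. "6p1" becomes "6"
  case "$major.$minor" in
    8.5|8.6|8.7|8.8|8.9|9.0|9.1|9.2|9.3|9.4|9.5|9.6|9.7) echo "yes" ;;
    *) echo "no" ;;
  esac
}

vulnerable_to_regresshion "9.6p1"   # prints yes
vulnerable_to_regresshion "9.8p1"   # prints no
```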

&lt;h3&gt;
  
  
  CVE-2024-6409
&lt;/h3&gt;

&lt;p&gt;As of the 9th of July another vulnerability has been discovered: &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2024-6409" rel="noopener noreferrer"&gt;CVE-2024-6409&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is a distinct vulnerability from the regreSSHion bug. The vulnerability allows an attacker to execute code within the &lt;em&gt;privsep&lt;/em&gt; child process. This child process is a part of OpenSSH that runs with restricted privileges to limit the damage that can be done if it is compromised.&lt;/p&gt;

&lt;p&gt;The vulnerability is caused by a race condition related to how signals are handled. This means that the &lt;em&gt;privsep&lt;/em&gt; child process can be exploited because the timing of signal handling operations can be manipulated, leading to unintended behavior that allows code execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Impact:&lt;/strong&gt; OpenSSH versions 8.7p1 and 8.8p1, as shipped with Red Hat Enterprise Linux 9.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Machines patched for CVE-2024-6387 will also be patched for CVE-2024-6409.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Suggested actions against the vulnerability
&lt;/h2&gt;

&lt;p&gt;To protect against this vulnerability, the main suggestion is to upgrade the package using a command such as &lt;code&gt;apt upgrade openssh-sftp-server&lt;/code&gt;. If you cannot do this and need a quick workaround, an option is to set the &lt;code&gt;LoginGraceTime&lt;/code&gt; SSH configuration parameter to 0, as recommended by &lt;a href="https://ubuntu.com/blog/ubuntu-regresshion-security-fix" rel="noopener noreferrer"&gt;Ubuntu&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let's look into both recommendations and understand them a bit more, starting with the workaround:&lt;/p&gt;

&lt;h3&gt;
  
  
  Set LoginGraceTime to 0
&lt;/h3&gt;

&lt;p&gt;OpenSSH allows remote connections to server machines. The &lt;code&gt;LoginGraceTime&lt;/code&gt; SSH server configuration parameter specifies &lt;em&gt;the time allowed for successful authentication to the server&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This means that a longer grace period allows more unauthenticated connections to remain open, while a shorter one can protect against brute-force attacks in certain cases.&lt;/p&gt;

&lt;p&gt;In the context of the identified vulnerability, this matters because the vulnerable code is called only when the &lt;code&gt;LoginGraceTime&lt;/code&gt; timer fires. The reasoning is that by setting it to 0, meaning no timeout, the timer never fires, the vulnerable code is never called, and the vulnerability is eliminated.&lt;/p&gt;

&lt;p&gt;But there is a caveat here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;While you eliminate the risk of calling the vulnerable code, and you are protected against brute force attacks, by setting this to 0 you are making &lt;code&gt;sshd&lt;/code&gt; vulnerable to denial of service attacks. So it's good to consider your options carefully and the tradeoff when you are configuring these settings.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
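&lt;p&gt;If you decide to apply the workaround anyway, here is a sketch of what that could look like on an Ubuntu machine; the drop-in path and service name are assumptions on my part, so verify them for your distribution:&lt;/p&gt;

```shell
# Sketch: temporary mitigation by disabling the login grace timeout.
# WARNING: this trades the RCE risk for a potential DoS risk, as explained
# in the MaxStartups section below; prefer upgrading sshd when possible.
echo "LoginGraceTime 0" | sudo tee /etc/ssh/sshd_config.d/99-regresshion.conf
sudo sshd -t                   # validate the configuration before applying it
sudo systemctl restart ssh     # the service is named "sshd" on some distros
```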

&lt;p&gt;&lt;strong&gt;Denial of Service through MaxStartups Exhaustion Explained&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;MaxStartups&lt;/code&gt; is another &lt;code&gt;sshd&lt;/code&gt; configuration that limits the number of concurrent unauthenticated connections.&lt;/p&gt;

&lt;p&gt;If &lt;code&gt;LoginGraceTime&lt;/code&gt; is set to 0, attackers can open numerous connections without being timed out. Since these connections won't be closed due to timeout, they will remain open indefinitely.&lt;/p&gt;

&lt;p&gt;This can exhaust the allowed number of connections specified by &lt;code&gt;MaxStartups&lt;/code&gt;, preventing legitimate users from accessing the SSH service.&lt;/p&gt;

&lt;p&gt;Essentially, the server becomes overwhelmed with these open connections, leading to a denial of service for legitimate users.&lt;/p&gt;
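&lt;p&gt;For reference, this is roughly what the two settings look like in &lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt;; the values shown are the OpenSSH defaults:&lt;/p&gt;

```shell
# Time (in seconds) a client gets to authenticate before the server
# drops the connection; 0 disables the timeout entirely
LoginGraceTime 120
# start:rate:full -- once 10 unauthenticated connections are open, drop
# new ones with 30% probability, and drop all new ones at 100
MaxStartups 10:30:100
```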

&lt;p&gt;This is why the main recommendation is to upgrade to a patched version of &lt;code&gt;sshd&lt;/code&gt; where the underlying vulnerability has been addressed. This ensures that &lt;code&gt;LoginGraceTime&lt;/code&gt; can be set to a reasonable value, and the server can handle connection attempts appropriately without being vulnerable to a DoS attack via &lt;code&gt;MaxStartups&lt;/code&gt; exhaustion.&lt;/p&gt;

&lt;h3&gt;
  
  
  Upgrade to a patched version of &lt;code&gt;sshd&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Now onto the main fix and what this means for your virtual machine scale sets (VMSS) in the AKS context. When running AKS, modifying the VMSS yourself is generally not recommended due to the following reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Managed Service:&lt;/em&gt; AKS is a &lt;strong&gt;managed&lt;/strong&gt; Kubernetes service, meaning Microsoft handles most of the underlying infrastructure management for you. Directly modifying VMSS configurations can interfere with the automated management and updates provided by AKS.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Configuration Consistency:&lt;/em&gt; AKS maintains certain configurations to ensure the cluster operates correctly. Manual modifications to the VMSS could lead to a configuration drift, where the manually set configurations diverge from the managed state AKS expects and maintains.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Stability and Reliability:&lt;/em&gt; Direct modifications can lead to instability or unexpected behavior within your cluster. This includes potential issues during upgrades, scaling operations, or applying patches.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For these reasons, handling the fix for this vulnerability means waiting for the Azure release team to provide a patched image.&lt;/p&gt;

&lt;h2&gt;
  
  
  Check the AKS version
&lt;/h2&gt;

&lt;p&gt;When you upgrade Kubernetes, the node images are upgraded as well, so a good place to start is identifying the Kubernetes version your AKS clusters are running. You can do this through the Azure portal, the CLI, or the API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Azure Portal:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Navigate to your AKS cluster resource and check the version information in the &lt;em&gt;Overview&lt;/em&gt; section.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Azure CLI:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks show &lt;span class="nt"&gt;--resource-group&lt;/span&gt; &amp;lt;ResourceGroupName&amp;gt; &lt;span class="nt"&gt;--name&lt;/span&gt; &amp;lt;AKSClusterName&amp;gt; &lt;span class="nt"&gt;--query&lt;/span&gt; kubernetesVersion
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: Replace &lt;code&gt;ResourceGroupName&lt;/code&gt; and &lt;code&gt;AKSClusterName&lt;/code&gt; with your actual resource group and AKS cluster names.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Then by making use of &lt;code&gt;kubectl&lt;/code&gt; command line, you can retrieve the exact version of the node images you are using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; wide
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By running these commands you will know your Kubernetes version as well as the OS image version your nodes are running. You can then compare your node image version against the versions listed as vulnerable in the CVE details to determine whether your nodes are running an image with a vulnerable version of &lt;code&gt;sshd&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Check and upgrade the AKS VMSS node image
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Identify the patched image version
&lt;/h3&gt;

&lt;p&gt;Azure Kubernetes Service regularly provides new node images, so it's good to upgrade your node images frequently to take advantage of the latest AKS features. Linux node images are updated weekly, and Windows node images are updated monthly.&lt;/p&gt;

&lt;p&gt;For Azure, and AKS more specifically, you should perform the following checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check for the node image with a patched &lt;code&gt;sshd&lt;/code&gt; version on &lt;a href="https://github.com/Azure/AKS/releases" rel="noopener noreferrer"&gt;GitHub Azure AKS Releases&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Check the rollout schedule of the patched node image in your region &lt;a href="https://releases.aks.azure.com/#tabeuro" rel="noopener noreferrer"&gt;AKS Release Status page&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Tip: It is also a good practice in general to check the release page for announcements on upcoming releases and the fixes they include, and to keep your node images up to date to protect against the latest vulnerabilities.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At the time of writing, we'll be looking out for the rollout of the image version &lt;strong&gt;202407.08.0&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Generally, when you upgrade the Kubernetes version the images will be upgraded as well, but when you have a security patch you might want to upgrade only the image and not the Kubernetes version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Please consider carefully before upgrading a node image version because it's not possible to downgrade it afterward!&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Verify the patched image version availability
&lt;/h3&gt;

&lt;p&gt;In order to &lt;strong&gt;check for available node image upgrades&lt;/strong&gt; for the nodes in your node pool simply run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks nodepool get-upgrades &lt;span class="nt"&gt;--nodepool-name&lt;/span&gt; mynodepool &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; myAKSCluster &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myResourceGroup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the JSON output, check the &lt;code&gt;latestNodeImageVersion&lt;/code&gt; parameter which indicates the version of the latest image available that the nodes can be upgraded to.&lt;/p&gt;

&lt;p&gt;Then, check the node image you are actually running on (this can be done via the Azure portal or the CLI). Using the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks nodepool show &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myResourceGroup &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; myAKSCluster &lt;span class="nt"&gt;--name&lt;/span&gt; mynodepool &lt;span class="nt"&gt;--query&lt;/span&gt; nodeImageVersion
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simply compare the two image versions. If they differ, an upgrade is available for your nodes. If not, you are already running the latest image and should watch the releases page for the rollout of the image you want to upgrade to.&lt;/p&gt;
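&lt;p&gt;The comparison itself is a simple string check. A sketch is shown below; in practice the two values would come from the &lt;code&gt;az aks nodepool show&lt;/code&gt; and &lt;code&gt;az aks nodepool get-upgrades&lt;/code&gt; commands above, and the image names here are made up for illustration:&lt;/p&gt;

```shell
#!/bin/bash
# Sketch: compare the running node image version with the latest available.
# In practice, populate these two variables from the az commands shown above.
current="AKSUbuntu-2204gen2containerd-202406.07.0"   # hypothetical value
latest="AKSUbuntu-2204gen2containerd-202407.08.0"    # hypothetical value

if [ "$current" != "$latest" ]; then
  echo "Upgrade available: $current -> $latest"
else
  echo "Already running the latest node image"
fi
```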

&lt;p&gt;Having the image version available in your region, the next step will be &lt;strong&gt;performing the actual node image upgrade&lt;/strong&gt;. There are several ways of handling this depending on your scenario which I will detail below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Upgrade all node images in all node pools
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;TL;DR&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLI Command:&lt;/strong&gt; &lt;code&gt;az aks upgrade --node-image-only&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Scope:&lt;/strong&gt; This command applies the upgrade to all node pools in the specified AKS cluster.&lt;br&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Use this when you want to ensure that all nodes in your entire cluster are updated to the latest node image version.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How To&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Use the &lt;code&gt;az aks upgrade&lt;/code&gt; command with the &lt;code&gt;--node-image-only&lt;/code&gt; flag to upgrade the node images across all node pools in the AKS cluster. This command ensures that only the node image is upgraded without altering the Kubernetes version.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks upgrade &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myResourceGroup &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; myAKSCluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--node-image-only&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;After initiating the upgrade, you can verify the status of the node images using the &lt;code&gt;kubectl get nodes&lt;/code&gt; command with a specific JSONPath query to output the node names and their image versions.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubernetes\.azure\.com\/node-image-version}{"\n"}{end}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Once the upgrade is complete, you can retrieve the updated details of the node pools, including the current node image version, using the &lt;code&gt;az aks show&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks show &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myResourceGroup &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; myAKSCluster
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;/ol&gt;

&lt;h3&gt;
  
  
  Upgrade a specific node pool
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;TL;DR&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLI Command:&lt;/strong&gt; &lt;code&gt;az aks nodepool upgrade --node-image-only&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Scope:&lt;/strong&gt; This command targets a specific node pool within the AKS cluster, identified by the &lt;code&gt;--name&lt;/code&gt; parameter.&lt;br&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Use this when you need to upgrade the node image for only one particular node pool, perhaps for testing or staggered rollout purposes.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How To&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;If you want to upgrade the node image of a specific node pool without affecting the entire cluster, use the &lt;code&gt;az aks nodepool upgrade&lt;/code&gt; command with the &lt;code&gt;--node-image-only&lt;/code&gt; flag.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks nodepool upgrade &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myResourceGroup &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; myAKSCluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; mynodepool &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--node-image-only&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Similar to the cluster-wide upgrade, check the status of the node images with the &lt;code&gt;kubectl get nodes&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubernetes\.azure\.com\/node-image-version}{"\n"}{end}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Use the &lt;code&gt;az aks nodepool show&lt;/code&gt; command to get the details of the updated node pool.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks nodepool show &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myResourceGroup &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; myAKSCluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; mynodepool
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;/ol&gt;

&lt;h3&gt;
  
  
  Use Node Surge to Speed Up Upgrades
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;TL;DR&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLI Command&lt;/strong&gt;: &lt;code&gt;az aks nodepool update --max-surge&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Scope&lt;/strong&gt;: This command also targets a specific node pool, but includes the &lt;code&gt;--max-surge&lt;/code&gt; parameter to control the number of extra nodes that can be created to expedite the upgrade.&lt;br&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Use this when you want to perform a faster upgrade of a node pool by temporarily increasing the number of nodes during the upgrade process, thereby reducing downtime or upgrade duration.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How To&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;To speed up the node image upgrade process, you can use the &lt;code&gt;az aks nodepool update&lt;/code&gt; command with the &lt;code&gt;--max-surge&lt;/code&gt; flag, which specifies the number of extra nodes used during the upgrade process. This allows more nodes to be upgraded simultaneously.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks nodepool update &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myResourceGroup &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; myAKSCluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; mynodepool &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--max-surge&lt;/span&gt; 33% &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--no-wait&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Check the node image status as previously described.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubernetes\.azure\.com\/node-image-version}{"\n"}{end}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Retrieve the updated node pool details using the &lt;code&gt;az aks nodepool show&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;az aks nodepool show &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myResourceGroup &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; myAKSCluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; mynodepool
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The choice between the three approaches depends on your strategy and what you want to focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you have a new security patch or critical update and want every node in your cluster to be updated as quickly as possible without specifying individual node pools, upgrade the entire cluster.&lt;/li&gt;
&lt;li&gt;If you are running different workloads on separate node pools and want to update the node image for only one specific pool, for example to test compatibility or performance, use a targeted node pool upgrade.&lt;/li&gt;
&lt;li&gt;If you need a faster upgrade for a specific node pool and can afford to temporarily add more nodes to handle the upgrade process, use node surge.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;I hope this article gives you an idea of this particular security vulnerability, how you can mitigate it, and how you can approach security patches in the context of AKS VMSS in the future. Thank you for reading!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>azure</category>
      <category>aks</category>
      <category>devops</category>
    </item>
    <item>
      <title>AWS: Handling 'Cannot delete entity, must remove tokens from principal first' error</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Thu, 08 Feb 2024 12:19:33 +0000</pubDate>
      <link>https://dev.to/the_cozma/aws-handling-cannot-delete-entity-must-remove-tokens-from-principal-first-error-57pl</link>
      <guid>https://dev.to/the_cozma/aws-handling-cannot-delete-entity-must-remove-tokens-from-principal-first-error-57pl</guid>
      <description>&lt;p&gt;This blog post will be a quick one focusing on troubleshooting a less clear error, &lt;em&gt;'Cannot delete entity, must remove tokens from principal first'&lt;/em&gt;, that Terraform can throw when you try to delete IAM users from AWS.&lt;/p&gt;

&lt;p&gt;Let's assume that in your Terraform configuration you manage IAM users and you want to delete one of them. You'd think that simply removing the Terraform code and running &lt;code&gt;terraform apply&lt;/code&gt; would delete the user. That's what I assumed. But as soon as I ran the command to destroy the resource, I ran into an issue:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  # aws_iam_user.little_tester will be destroyed
  # (because aws_iam_user.little_tester is not in configuration)
  - resource "aws_iam_user" "little_tester" {
      - arn           = "arn:aws:iam::xxxxxxxxxx:user/little_tester" -&amp;gt; null
      - force_destroy = false -&amp;gt; null
      - id            = "little_tester" -&amp;gt; null
      - name          = "little_tester" -&amp;gt; null
      - path          = "/" -&amp;gt; null
      - tags          = {
          - "Company"  = "MyCompany"
          - "Location" = "Aruba"
          - "Unit"     = "Front Desk"
        } -&amp;gt; null
      - tags_all      = {
          - "Company"  = "MyCompany"
          - "Location" = "Aruba"
          - "Unit"     = "Front Desk"
        } -&amp;gt; null
      - unique_id     = "AAAAAAAAAAAAAAAAA" -&amp;gt; null
    }

Plan: 0 to add, 0 to change, 1 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_iam_user.little_tester: Destroying... [id=little_tester]
╷
│ Error: deleting IAM User (little_tester): DeleteConflict: Cannot delete entity, must remove tokens from principal first.
│     status code: 409, request id: ...
│
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  So what does this mean?
&lt;/h2&gt;

&lt;p&gt;The error &lt;em&gt;Cannot delete entity, must remove tokens from principal first&lt;/em&gt; says that the user has some tokens that need to be removed before the user itself can be deleted. The tokens it refers to can be active access keys or registered MFA devices.&lt;/p&gt;

&lt;p&gt;The decision to prevent the deletion of a user while any of these active tokens are associated with it makes sense from a security perspective, because it aims to prevent the accidental deletion of users that are still active.&lt;/p&gt;

&lt;p&gt;A way to confirm whether this is the case is to go to the AWS Console and check the user's Security credentials. There you should see any active access keys or registered MFA devices.&lt;/p&gt;

&lt;p&gt;Having checked that, I saw that the user had an Access key that was still active and had an active MFA device. I removed both manually and then ran &lt;code&gt;terraform apply&lt;/code&gt; again. And it worked! The user was deleted successfully.&lt;/p&gt;

&lt;h2&gt;
  
  
  How can this happen?
&lt;/h2&gt;

&lt;p&gt;The user's access key and the MFA device configured for their account were not managed by Terraform, meaning they were created manually. Terraform was not aware of them and could not delete them, and this was preventing the deletion of the user.&lt;/p&gt;

&lt;p&gt;This can happen if the user was created through Terraform code, but the rest of the configuration was done manually afterwards: adding an access key, adding an MFA device, etc. You then end up with a mix of Terraform-managed and non-Terraform-managed resources.&lt;/p&gt;

&lt;p&gt;Something to think about for future cases: this could also happen if you create a user group in Terraform and then add users to it manually later on. These users will be part of the group, but Terraform will not be aware of them and will not be able to manage them. The same applies to any other scenario where you mix non-Terraform-managed and Terraform-managed resources.&lt;/p&gt;

&lt;h2&gt;
  
  
  What can you do?
&lt;/h2&gt;

&lt;p&gt;The first option is to add the access key and MFA device to the Terraform configuration, so that the creation and removal of users becomes part of a complete flow fully managed by Terraform.&lt;/p&gt;
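&lt;p&gt;As a minimal sketch of this option (reusing the &lt;code&gt;little_tester&lt;/code&gt; user from the plan output above; the output name is illustrative), managing the access key alongside the user could look like this. Note that an MFA device cannot be fully enrolled through Terraform alone, since activating it requires TOTP codes from the physical or virtual device:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight hcl"&gt;&lt;code&gt;resource "aws_iam_user" "little_tester" {
  name = "little_tester"
}

# Managing the key in Terraform means destroying the user also destroys the key
resource "aws_iam_access_key" "little_tester" {
  user = aws_iam_user.little_tester.name
}

# The secret is only available at creation time; handle it carefully
output "little_tester_secret" {
  value     = aws_iam_access_key.little_tester.secret
  sensitive = true
}
&lt;/code&gt;&lt;/pre&gt;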

&lt;p&gt;The second option is to go to AWS Console &amp;gt; IAM and check the user's Security credentials and MFA devices. Deactivate and remove the active ones manually, then go back to your configuration and run &lt;code&gt;terraform apply&lt;/code&gt; again.&lt;/p&gt;

&lt;p&gt;And lastly, you can add the &lt;a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_user#force_destroy" rel="noopener noreferrer"&gt;&lt;code&gt;force_destroy&lt;/code&gt; argument&lt;/a&gt; to the &lt;code&gt;aws_iam_user&lt;/code&gt; resource in your Terraform configuration. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;force_destroy - (Optional, default false) When destroying this user, destroy even if it has non-Terraform-managed IAM access keys, login profile or MFA devices. Without force_destroy a user with non-Terraform-managed access keys and login profile will fail to be destroyed.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Enabling it allows Terraform to delete the user even if it has non-Terraform-managed access keys and MFA devices.&lt;/p&gt;
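&lt;p&gt;As a sketch, enabling it on the user from the earlier plan output looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight hcl"&gt;&lt;code&gt;resource "aws_iam_user" "little_tester" {
  name          = "little_tester"
  # Allow Terraform to delete the user even with non-Terraform-managed
  # access keys, login profile or MFA devices
  force_destroy = true
}
&lt;/code&gt;&lt;/pre&gt;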

&lt;p&gt;&lt;strong&gt;Warning!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;While it does seem a convenient option, be very careful with this argument, as it can lead to the accidental deletion of users that are still active. I would advise using it only if you are sure that the user is not active (perhaps with a check that runs before the resources are destroyed), you are aware of the security implications, and you have reviewed which team members can run the Terraform code.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Hope this helps someone out there!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>terraform</category>
      <category>iac</category>
    </item>
    <item>
      <title>Azure Application Gateway WAF config vs WAF policy</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Thu, 02 Nov 2023 13:49:04 +0000</pubDate>
      <link>https://dev.to/the_cozma/azure-application-gateway-waf-config-vs-waf-policy-i56</link>
      <guid>https://dev.to/the_cozma/azure-application-gateway-waf-config-vs-waf-policy-i56</guid>
      <description>&lt;p&gt;Recently, I had to enable WAF on our Azure Application Gateway. Because of our infrastructure setup, I wanted to have all the rules from OWASP 3.2 enabled, but I needed to be able to exclude some of our (valid) requests from being blocked as well. To achieve this, I could either try to configure the WAF Config section on our Gateway or create a WAF policy. &lt;/p&gt;

&lt;p&gt;Given that it was not entirely clear how you can use proper exclusions and filters based on what you need, I decided to write this post to explain the differences I found between the two and how you can use them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is WAF?
&lt;/h2&gt;

&lt;p&gt;To recap what Web Application Firewall (WAF) is, here is a brief explanation from the official documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Azure Web Application Firewall (WAF) on Azure Application Gateway provides centralized protection of your web applications from common exploits and vulnerabilities. Web applications are increasingly targeted by malicious attacks that exploit commonly known vulnerabilities. SQL injection and cross-site scripting are among the most common attacks.&lt;/p&gt;

&lt;p&gt;WAF on Application Gateway is based on the Core Rule Set (CRS) from the Open Web Application Security Project (OWASP).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Before being able to enable and benefit from WAF capabilities, you will need to check the &lt;strong&gt;SKU&lt;/strong&gt; of the Application Gateway you have. WAF can only be enabled on the &lt;strong&gt;WAF_v2&lt;/strong&gt; SKU and not the Standard SKU. As this was my case as well, I first had to change the SKU of the Application Gateway. This can be done either from the Azure Portal or using Terraform (or any other IaC tool; in my case, I used Terraform). &lt;/p&gt;
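&lt;p&gt;For reference, in Terraform the SKU is set in the &lt;code&gt;sku&lt;/code&gt; block of the &lt;code&gt;azurerm_application_gateway&lt;/code&gt; resource; a minimal sketch (the capacity value is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight hcl"&gt;&lt;code&gt;resource "azurerm_application_gateway" "application_gateway" {
  (...)
  sku {
    name     = "WAF_v2"
    tier     = "WAF_v2"
    capacity = 2
  }
}
&lt;/code&gt;&lt;/pre&gt;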

&lt;p&gt;After this, you can proceed with configuring WAF. This can be done in two ways: either using the built-in &lt;strong&gt;WAF Config section&lt;/strong&gt; of the Application Gateway or creating a &lt;strong&gt;WAF policy&lt;/strong&gt; for the Azure Application Gateway.&lt;/p&gt;

&lt;p&gt;Let's look at what each one is and how you can use them.&lt;/p&gt;

&lt;h2&gt;
  
  
  WAF config
&lt;/h2&gt;

&lt;p&gt;The WAF config section is a built-in part of the Application Gateway configuration as can be seen in the image below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvswb0svgpid5wxry3lyh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvswb0svgpid5wxry3lyh.png" alt="Waf Config" width="800" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;WAF Config is the Application Gateway's built-in method to configure WAF and it is the section where you can add your configurations such as exclusions or custom rules.&lt;/p&gt;

&lt;p&gt;When using Terraform you can find the &lt;code&gt;waf_configuration&lt;/code&gt; block under the &lt;code&gt;azurerm_application_gateway&lt;/code&gt; &lt;a href="https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/application_gateway#waf_configuration" rel="noopener noreferrer"&gt;resource&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let's look at an example of how you can configure it using Terraform. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Scenario:&lt;/em&gt; I would like to configure it to use the OWASP 3.2 rules, enable the WAF, and exclude some of our telemetry requests from being blocked, while also disabling some rules. This is how the basic configuration would look:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_application_gateway"&lt;/span&gt; &lt;span class="s2"&gt;"application_gateway"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="p"&gt;(...)&lt;/span&gt;
  &lt;span class="s2"&gt;"waf_configuration"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;enabled&lt;/span&gt;                  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;firewall_mode&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Prevention"&lt;/span&gt;
    &lt;span class="nx"&gt;rule_set_type&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"OWASP"&lt;/span&gt;
    &lt;span class="nx"&gt;rule_set_version&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"3.2"&lt;/span&gt;
    &lt;span class="nx"&gt;file_upload_limit_mb&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
    &lt;span class="nx"&gt;max_request_body_size_kb&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;
    &lt;span class="nx"&gt;request_body_check&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

    &lt;span class="nx"&gt;exclusion&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;match_variable&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"RequestCookieNames"&lt;/span&gt;
        &lt;span class="nx"&gt;selector&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"telemetry"&lt;/span&gt;
        &lt;span class="nx"&gt;selector_match_operator&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Contains"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;disabled_rule_group&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;rule_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"REQUEST-920-PROTOCOL-ENFORCEMENT"&lt;/span&gt;
    &lt;span class="nx"&gt;rules&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="mi"&gt;920230&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="mi"&gt;920320&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;disabled_rule_group&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;rule_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"REQUEST-921-PROTOCOL-ATTACK"&lt;/span&gt;
      &lt;span class="nx"&gt;rules&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="mi"&gt;921180&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="mi"&gt;921170&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The scenario was simple and while the configuration itself is not hard to do, there are a few &lt;strong&gt;drawbacks&lt;/strong&gt; to using it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;it does not allow you to add &lt;em&gt;custom rules&lt;/em&gt; from the Azure Portal UI. This means that if you want to add a custom rule, you will have to do it using the Azure CLI (or PowerShell). Ideally, I would like to have all my configurations in one place and not have to use multiple tools to configure or maintain my resources.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;if you have multiple Application Gateways, you will have to configure each one of them separately. Because WAF Config is built into the Application Gateway, it is &lt;em&gt;managed locally to that specific Application Gateway&lt;/em&gt;, while its configuration applies to everything in that Application Gateway resource. This was my case as well, since I don't manage just one Application Gateway.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;if you are working with Azure Front Door, you cannot use WAF Config in that context, because &lt;em&gt;Azure Front Door does not support WAF Config&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  WAF policy
&lt;/h2&gt;

&lt;p&gt;As opposed to WAF Config, which is a built-in functionality in the Application Gateway, WAF policies are a &lt;strong&gt;standalone resource&lt;/strong&gt; that enables you to configure WAF. This means that you can create a WAF policy and then apply it to multiple Application Gateways or even Azure Front Door resources as well.&lt;/p&gt;

&lt;p&gt;WAF policy allows you to have a &lt;strong&gt;centralized configuration&lt;/strong&gt; for all your WAF resources. This means that you can have the same configuration for all your WAF resources and you can also have a &lt;strong&gt;single place&lt;/strong&gt; where you can manage your WAF configuration.&lt;/p&gt;

&lt;p&gt;Because it is a standalone resource, the first benefit is that you can find all the necessary configuration in the Azure Portal UI:&lt;/p&gt;

&lt;p&gt;You have your Managed rules:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbtgbv10be5rlzddyptij.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbtgbv10be5rlzddyptij.png" alt="Alt text" width="800" height="618"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your Custom rules:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fgwemmpv3iawgpzrdyk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fgwemmpv3iawgpzrdyk.png" alt="Alt text" width="800" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And your associated gateways:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu757091jzw7t33w1hias.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu757091jzw7t33w1hias.png" alt="Alt text" width="800" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can already guess from the screenshots, WAF Policy gives you a bit more control over your configuration as you can be more detailed in what you want to exclude or include in your rules.&lt;/p&gt;

&lt;p&gt;You have the flexibility to link a WAF (Web Application Firewall) policy in various ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;globally&lt;/strong&gt; by assigning it to an Azure Application Gateway resource&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;per-site&lt;/strong&gt; level by linking it to a listener&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;per URI&lt;/strong&gt; level by associating it with a particular route path&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For more details (and examples) on how you can link a WAF policy to your resources, you can check the official documentation &lt;a href="https://learn.microsoft.com/en-us/azure/web-application-firewall/ag/policy-overview" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In Terraform this means you will need to create a new resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_web_application_firewall_policy"&lt;/span&gt; &lt;span class="s2"&gt;"waf_policy"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"wafpolicy"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;

  &lt;span class="nx"&gt;policy_settings&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;enabled&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;mode&lt;/span&gt;                        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Prevention"&lt;/span&gt;
    &lt;span class="nx"&gt;request_body_check&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="nx"&gt;file_upload_limit_in_mb&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
    &lt;span class="nx"&gt;max_request_body_size_in_kb&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;## Example of managed rules&lt;/span&gt;
  &lt;span class="nx"&gt;managed_rules&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;managed_rule_set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;type&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"OWASP"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"3.2"&lt;/span&gt;
      &lt;span class="nx"&gt;rule_group_override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;rule_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"REQUEST-920-PROTOCOL-ENFORCEMENT"&lt;/span&gt;
        &lt;span class="nx"&gt;disabled_rules&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="mi"&gt;920200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="mi"&gt;920201&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="mi"&gt;920202&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="nx"&gt;rule_group_override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;rule_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"REQUEST-921-PROTOCOL-ATTACK"&lt;/span&gt;
        &lt;span class="nx"&gt;disabled_rules&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="mi"&gt;921170&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="mi"&gt;921180&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="nx"&gt;rule_group_override&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;rule_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"REQUEST-942-APPLICATION-ATTACK-SQLI"&lt;/span&gt;
        &lt;span class="nx"&gt;disabled_rules&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="mi"&gt;942430&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;## Example of custom rules&lt;/span&gt;
  &lt;span class="nx"&gt;custom_rules&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ExcludeServicesFromWAF"&lt;/span&gt;
    &lt;span class="nx"&gt;priority&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;14&lt;/span&gt;
    &lt;span class="nx"&gt;rule_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"MatchRule"&lt;/span&gt;

    &lt;span class="nx"&gt;match_conditions&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;match_variables&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;variable_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"RequestUri"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="nx"&gt;operator&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Contains"&lt;/span&gt;
      &lt;span class="nx"&gt;negation_condition&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="nx"&gt;match_values&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s2"&gt;"/service1/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;(...)&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After creating the WAF Policy, you will need to associate it with the Application Gateway, which is done by adding the following parameters to the &lt;code&gt;azurerm_application_gateway&lt;/code&gt; resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;  &lt;span class="nx"&gt;firewall_policy_id&lt;/span&gt;                &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_web_application_firewall_policy&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wafpolicy&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;force_firewall_policy_association&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first thing you will notice in the Azure Portal is that your Application Gateway resource no longer has the WAF Config section; instead, it links to the WAF Policy you just created:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flti0u6bnjow052xzf8xu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flti0u6bnjow052xzf8xu.png" alt="Alt text" width="575" height="339"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This means that any change to your WAF configuration now has to be made in the WAF Policy resource itself, not in the Application Gateway resource.&lt;br&gt;
This gave me the granularity I needed to exclude the requests I wanted, while keeping the same configuration for all my Application Gateways.&lt;/p&gt;

&lt;p&gt;In my case, WAF Config was not the right answer for what I needed: the same exclusions on all our gateways, the same custom rules regardless of environment, and the ability to exclude the requests coming from our services.&lt;br&gt;
This is why I decided to look into WAF Policies instead.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;WAF Config is a good option if you want to configure WAF settings at the Application Gateway level that apply to all the listeners and rules within it. It is well suited to the case where you have a single set of WAF settings to apply to all the web applications behind the Application Gateway.&lt;/p&gt;

&lt;p&gt;WAF Policy, on the other hand, is a good choice when you need more granular control over your WAF settings and want to define custom settings and rules on a per-application or per-path basis. One use case is having several applications behind the Application Gateway with different security concerns, each requiring different WAF settings.&lt;/p&gt;

&lt;p&gt;I did not dive into all the rules and settings you can configure for WAF; that will be the topic of a separate, more in-depth article. I hope this post helps you decide which option is best for you.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thank you for reading and hope this helps somebody else!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>azure</category>
      <category>waf</category>
    </item>
    <item>
      <title>Ensuring Seamless Operations: Troubleshooting and Resolving Dapr Certificate Expiry</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Thu, 20 Jul 2023 10:50:49 +0000</pubDate>
      <link>https://dev.to/the_cozma/ensuring-seamless-operations-troubleshooting-and-resolving-dapr-certificate-expiry-4a74</link>
      <guid>https://dev.to/the_cozma/ensuring-seamless-operations-troubleshooting-and-resolving-dapr-certificate-expiry-4a74</guid>
      <description>&lt;p&gt;A CNCF project, the &lt;a href="https://dapr.io/" rel="noopener noreferrer"&gt;Distributed Application Runtime (Dapr)&lt;/a&gt; provides APIs that simplify microservice connectivity. Whether your communication pattern is service to service invocation or pub/sub messaging, Dapr helps you write resilient and secured microservices. Essentially, it provides a new way to build microservices by using the reusable blocks implemented as  sidecars.&lt;/p&gt;

&lt;p&gt;While Dapr is great because it is language agnostic and solves some of the challenges that come with microservices and distributed systems, such as message broker integration and encryption, troubleshooting Dapr issues can be challenging. Dapr logs, especially the error messages, can be quite generic and sometimes do not provide enough information to understand what is going on.&lt;/p&gt;

&lt;p&gt;In this blog post, I want to detail a problem I had with Dapr certificate expiration: the symptoms the application was showing, how I tracked down the root cause, and how I solved it.&lt;/p&gt;

&lt;p&gt;I also want to highlight how important it is to have proper monitoring in place, so I will touch on that as well by sharing some lessons learned and what I ended up setting up to avoid repeating the same mistakes in the future.&lt;/p&gt;

&lt;h2&gt;
  
  
  Symptoms
&lt;/h2&gt;

&lt;p&gt;Application deployment was failing because the Dapr sidecar could not be injected. The pod kept restarting until it reached the 5-minute default timeout and was rolled back. Checking the events on the pod, I noticed the &lt;code&gt;GET /healthz&lt;/code&gt; endpoints for the liveness and readiness probes were returning &lt;code&gt;connect: connection refused&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;There were no errors in the app logs or in the Dapr sidecar logs. The only thing I noticed was that the Dapr sidecar was in a &lt;code&gt;CrashLoopBackOff&lt;/code&gt; state.&lt;/p&gt;

&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Step 1&lt;/em&gt;: &lt;strong&gt;Dapr Operator&lt;/strong&gt; logs
&lt;/h3&gt;

&lt;p&gt;Since no logs were available on the pod or the Dapr sidecar, I started by checking the logs of the next best thing, the &lt;strong&gt;Dapr Operator&lt;/strong&gt;, and noticed the following errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"instance"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr-operator-0000000000-abcd"&lt;/span&gt;,&lt;span class="s2"&gt;"level"&lt;/span&gt;:&lt;span class="s2"&gt;"info"&lt;/span&gt;,&lt;span class="s2"&gt;"msg"&lt;/span&gt;:&lt;span class="s2"&gt;"starting webhooks"&lt;/span&gt;,&lt;span class="s2"&gt;"scope"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr.operator"&lt;/span&gt;,&lt;span class="s2"&gt;"time"&lt;/span&gt;:&lt;span class="s2"&gt;"2023-05-25T12:51:13.267369255Z"&lt;/span&gt;,&lt;span class="s2"&gt;"type"&lt;/span&gt;:&lt;span class="s2"&gt;"log"&lt;/span&gt;,&lt;span class="s2"&gt;"ver"&lt;/span&gt;:&lt;span class="s2"&gt;"1.10.4"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
I0525 12:51:13.269285       1 leaderelection.go:248] attempting to acquire leader lease dapr-system/webhooks.dapr.io...
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"instance"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr-operator-0000000000-abcd"&lt;/span&gt;,&lt;span class="s2"&gt;"level"&lt;/span&gt;:&lt;span class="s2"&gt;"info"&lt;/span&gt;,&lt;span class="s2"&gt;"msg"&lt;/span&gt;:&lt;span class="s2"&gt;"Conversion webhook for &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;subscriptions.dapr.io&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; is up to date"&lt;/span&gt;,&lt;span class="s2"&gt;"scope"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr.operator"&lt;/span&gt;,&lt;span class="s2"&gt;"time"&lt;/span&gt;:&lt;span class="s2"&gt;"2023-05-25T12:51:13.277615379Z"&lt;/span&gt;,&lt;span class="s2"&gt;"type"&lt;/span&gt;:&lt;span class="s2"&gt;"log"&lt;/span&gt;,&lt;span class="s2"&gt;"ver"&lt;/span&gt;:&lt;span class="s2"&gt;"1.10.4"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
W0601 02:52:46.530879       1 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: watch of &lt;span class="k"&gt;*&lt;/span&gt;v1.Secret ended with: an error on the server &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"unable to decode an event from the watch stream: http2: client connection lost"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; has prevented the request from succeeding
W0601 02:52:46.531001       1 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: watch of &lt;span class="k"&gt;*&lt;/span&gt;v1alpha1.Configuration ended with: an error on the server &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"unable to decode an event from the watch stream: http2: client connection lost"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; has prevented the request from succeeding
W0601 02:52:46.531061       1 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: watch of &lt;span class="k"&gt;*&lt;/span&gt;v1.Service ended with: an error on the server &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"unable to decode an event from the watch stream: http2: client connection lost"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; has prevented the request from succeeding
E0601 02:52:46.531050       1 leaderelection.go:330] error retrieving resource lock dapr-system/operator.dapr.io: Get &lt;span class="s2"&gt;"https://X.X.X.X:443/apis/coordination.k8s.io/v1/namespaces/dapr-system/leases/operator.dapr.io"&lt;/span&gt;: http2: client connection lost
W0601 02:52:46.531095       1 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: watch of &lt;span class="k"&gt;*&lt;/span&gt;v1alpha1.Resiliency ended with: an error on the server &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"unable to decode an event from the watch stream: http2: client connection lost"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; has prevented the request from succeeding
W0601 02:52:46.530891       1 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: watch of &lt;span class="k"&gt;*&lt;/span&gt;v1alpha1.Component ended with: an error on the server &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"unable to decode an event from the watch stream: http2: client connection lost"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; has prevented the request from succeeding
E0601 02:52:46.531191       1 leaderelection.go:330] error retrieving resource lock dapr-system/webhooks.dapr.io: Get &lt;span class="s2"&gt;"https://X.X.X.X:443/apis/coordination.k8s.io/v1/namespaces/dapr-system/leases/webhooks.dapr.io"&lt;/span&gt;: http2: client connection lost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Dapr operator works by establishing an admission webhook, which enables Kubernetes (K8s) to interact with it when it intends to deploy a new pod. After a successful response, the daprd container is added to the pod. For more detailed information on how the operator works, check the &lt;a href="https://docs.dapr.io/concepts/dapr-services/operator/" rel="noopener noreferrer"&gt;Dapr Operator control plane service overview documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Step 2&lt;/em&gt;: Investigate the &lt;code&gt;http2: client connection lost&lt;/code&gt; error
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;http2: client connection lost&lt;/code&gt; error indicated to me that K8s could not successfully invoke the admission webhook, so I started to check one by one:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Network connectivity&lt;/strong&gt;: The error message mentioned a potential issue with the client connection being lost, so I verified that the machine running the Dapr process could establish a stable connection to the Kubernetes API server. I checked for any network connectivity issues or firewalls that might be interfering with the communication. Everything was fine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API server issues&lt;/strong&gt;: I also checked for any issues with the Kubernetes API server itself, such as high load, resource constraints, or misconfiguration. No issues found.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Namespace or resource deletion&lt;/strong&gt;: I checked that nothing had been deleted in the dapr-system namespace and that the webhooks.dapr.io resource was still present. Everything was still there.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Step 3&lt;/em&gt;: &lt;strong&gt;AKS cluster&lt;/strong&gt; logs
&lt;/h3&gt;

&lt;p&gt;As a next step, I started looking into the &lt;strong&gt;AKS cluster logs&lt;/strong&gt; and noticed that all the services using Dapr reported the same &lt;code&gt;authentication handshake failed&lt;/code&gt; error. The full log is below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"app_id"&lt;/span&gt;:&lt;span class="s2"&gt;"app1"&lt;/span&gt;,&lt;span class="s2"&gt;"instance"&lt;/span&gt;:&lt;span class="s2"&gt;"app1-123456-abc7"&lt;/span&gt;,&lt;span class="s2"&gt;"level"&lt;/span&gt;:&lt;span class="s2"&gt;"info"&lt;/span&gt;,&lt;span class="s2"&gt;"msg"&lt;/span&gt;:&lt;span class="s2"&gt;"sending workload csr request to sentry"&lt;/span&gt;,&lt;span class="s2"&gt;"scope"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr.runtime.grpc.internal"&lt;/span&gt;,&lt;span class="s2"&gt;"time"&lt;/span&gt;:&lt;span class="s2"&gt;"2023-06-19T13:19:53.535345802Z"&lt;/span&gt;,&lt;span class="s2"&gt;"type"&lt;/span&gt;:&lt;span class="s2"&gt;"log"&lt;/span&gt;,&lt;span class="s2"&gt;"ver"&lt;/span&gt;:&lt;span class="s2"&gt;"1.10.4"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
2023-06-19 15:19:53.535 
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"app_id"&lt;/span&gt;:&lt;span class="s2"&gt;"app1"&lt;/span&gt;,&lt;span class="s2"&gt;"instance"&lt;/span&gt;:&lt;span class="s2"&gt;"app1-123456-abc7"&lt;/span&gt;,&lt;span class="s2"&gt;"level"&lt;/span&gt;:&lt;span class="s2"&gt;"info"&lt;/span&gt;,&lt;span class="s2"&gt;"msg"&lt;/span&gt;:&lt;span class="s2"&gt;"renewing certificate: requesting new cert and restarting gRPC server"&lt;/span&gt;,&lt;span class="s2"&gt;"scope"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr.runtime.grpc.internal"&lt;/span&gt;,&lt;span class="s2"&gt;"time"&lt;/span&gt;:&lt;span class="s2"&gt;"2023-06-19T13:19:53.535329702Z"&lt;/span&gt;,&lt;span class="s2"&gt;"type"&lt;/span&gt;:&lt;span class="s2"&gt;"log"&lt;/span&gt;,&lt;span class="s2"&gt;"ver"&lt;/span&gt;:&lt;span class="s2"&gt;"1.10.4"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
2023-06-19 15:19:53.535 
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"app_id"&lt;/span&gt;:&lt;span class="s2"&gt;"app1"&lt;/span&gt;,&lt;span class="s2"&gt;"instance"&lt;/span&gt;:&lt;span class="s2"&gt;"app1-123456-abc7"&lt;/span&gt;,&lt;span class="s2"&gt;"level"&lt;/span&gt;:&lt;span class="s2"&gt;"error"&lt;/span&gt;,&lt;span class="s2"&gt;"msg"&lt;/span&gt;:&lt;span class="s2"&gt;"error starting server: error from authenticator CreateSignedWorkloadCert: error from sentry SignCertificate: rpc error: code = Unavailable desc = connection error: desc = &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2023-06-19T13:19:51Z is after 2023-06-16T12:31:17Z&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;,&lt;span class="s2"&gt;"scope"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr.runtime.grpc.internal"&lt;/span&gt;,&lt;span class="s2"&gt;"time"&lt;/span&gt;:&lt;span class="s2"&gt;"2023-06-19T13:19:53.535259601Z"&lt;/span&gt;,&lt;span class="s2"&gt;"type"&lt;/span&gt;:&lt;span class="s2"&gt;"log"&lt;/span&gt;,&lt;span class="s2"&gt;"ver"&lt;/span&gt;:&lt;span class="s2"&gt;"1.10.4"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The errors above confirmed that a connection could not be established because authentication failed during the TLS handshake: the certificate had expired.&lt;/p&gt;
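&lt;p&gt;As a sanity check, you can reproduce the comparison from the log message itself. A minimal sketch, using the two timestamps from the handshake error above:&lt;/p&gt;

```python
from datetime import datetime, timezone

def cert_expired(current: str, not_after: str) -> bool:
    """Return True if `current` is past the certificate's notAfter timestamp."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    now = datetime.strptime(current, fmt).replace(tzinfo=timezone.utc)
    expiry = datetime.strptime(not_after, fmt).replace(tzinfo=timezone.utc)
    return now > expiry

# Timestamps taken from the handshake error above:
print(cert_expired("2023-06-19T13:19:51Z", "2023-06-16T12:31:17Z"))  # True
```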

&lt;h3&gt;
  
  
  &lt;em&gt;Step 4&lt;/em&gt;: &lt;em&gt;Dapr Sentry&lt;/em&gt; logs
&lt;/h3&gt;

&lt;p&gt;To dig deeper, I researched how Dapr handles mTLS, which pointed me to the Dapr Sentry service.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The Dapr Sentry service manages mTLS between services and acts as a certificate authority. It generates mTLS certificates and distributes them to any running sidecars. This allows sidecars to communicate with encrypted, mTLS traffic.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So I went to check the &lt;em&gt;Dapr Sentry&lt;/em&gt; logs and I finally found the issue: &lt;strong&gt;Dapr root certificate expired&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;2023-06-19 14:49:06.566 
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"instance"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr-sentry-123456-abc7"&lt;/span&gt;,&lt;span class="s2"&gt;"level"&lt;/span&gt;:&lt;span class="s2"&gt;"warning"&lt;/span&gt;,&lt;span class="s2"&gt;"msg"&lt;/span&gt;:&lt;span class="s2"&gt;"Dapr root certificate expiration warning: certificate has expired."&lt;/span&gt;,&lt;span class="s2"&gt;"scope"&lt;/span&gt;:&lt;span class="s2"&gt;"dapr.sentry"&lt;/span&gt;,&lt;span class="s2"&gt;"time"&lt;/span&gt;:&lt;span class="s2"&gt;"2023-06-19T12:49:06.566339341Z"&lt;/span&gt;,&lt;span class="s2"&gt;"type"&lt;/span&gt;:&lt;span class="s2"&gt;"log"&lt;/span&gt;,&lt;span class="s2"&gt;"ver"&lt;/span&gt;:&lt;span class="s2"&gt;"1.10.4"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To view the logs of the Dapr Sentry service, you can run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl logs &lt;span class="nt"&gt;--selector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;dapr-sentry &lt;span class="nt"&gt;--namespace&lt;/span&gt; &amp;lt;NAMESPACE&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Generating a new root certificate
&lt;/h2&gt;

&lt;p&gt;By default, the certificate expires after 365 days. You can change the expiration time by setting the &lt;code&gt;--cert-chain-expiration&lt;/code&gt; flag when you start the Dapr Sentry service. The value is in days.&lt;/p&gt;

&lt;p&gt;Dapr encrypts communication between applications using a self-signed root certificate valid for one year, so it was time to renew it.&lt;/p&gt;
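&lt;p&gt;Before renewing, you can check the root certificate's expiry date yourself with &lt;code&gt;openssl&lt;/code&gt;. A minimal sketch, run here against a throwaway self-signed certificate for illustration (in a real cluster you would first extract the root &lt;code&gt;ca.crt&lt;/code&gt; from the secret Dapr keeps in the dapr-system namespace):&lt;/p&gt;

```shell
# For illustration only: generate a throwaway self-signed CA valid for 365 days.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=cluster.local" \
  -days 365 -keyout ca.key -out ca.crt 2>/dev/null

# Print the certificate's expiry date.
openssl x509 -in ca.crt -noout -enddate

# checkend exits 0 if the certificate is still valid N seconds from now.
if openssl x509 -in ca.crt -noout -checkend $((30 * 24 * 3600)); then
  echo "valid for at least 30 more days"
else
  echo "expires within 30 days"
fi
```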

&lt;p&gt;To renew the certificate, I followed the recommended steps to root and issuer certificate upgrade using CLI. You can find the steps &lt;a href="https://docs.dapr.io/operations/security/mtls/#root-and-issuer-certificate-upgrade-using-cli-recommended" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generated brand-new root and issuer certificates, signed by a newly generated private root key, by running the following command:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dapr mtls renew-certificate &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="nt"&gt;--valid-until&lt;/span&gt; &amp;lt;days&amp;gt; &lt;span class="nt"&gt;--restart&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;⌛  Starting certificate rotation
ℹ️  generating fresh certificates
ℹ️  Updating certifcates &lt;span class="k"&gt;in &lt;/span&gt;your Kubernetes cluster
ℹ️  Dapr control plane version 1.10.4 detected &lt;span class="k"&gt;in &lt;/span&gt;namespace dapr-system
✅  Certificate rotation is successful! Your new certicate is valid through Wed, 18 Jun 2025 13:37:30 UTC
ℹ️  Restarting deploy/dapr-sentry..
ℹ️  Restarting deploy/dapr-operator..
ℹ️  Restarting statefulsets/dapr-placement-server..
✅  All control plane services have restarted successfully!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Restarted one of the applications in Kubernetes to see if the changes worked. And they did!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Redeployed all applications that were using Dapr via our normal Github Actions pipelines.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There was no downtime and the process was quite smooth. Dapr does not renew certificates automatically, so depending on your setup you will need to renew them manually or build an intermediary service that does it for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Lesson learned no 1: Have an overview of your Dapr services
&lt;/h3&gt;

&lt;p&gt;I had no overview of the Dapr system, which cost me a lot of time in getting to the root cause. So the first thing I did was create a &lt;strong&gt;dashboard&lt;/strong&gt; giving an overview of our Dapr services and their certificates. I started from the official one from &lt;a href="https://github.com/dapr/dapr/tree/master/grafana" rel="noopener noreferrer"&gt;Grafana&lt;/a&gt;, but it is a bit outdated and I ran into some issues with the queries, so I made some changes; you can find the JSON of the dashboard below if it helps anyone.&lt;/p&gt;

&lt;p&gt;For the &lt;a href="https://ana-cozma.github.io/blog/posts/dapr-certificate-renewal/" rel="noopener noreferrer"&gt;full Grafana Dashboard JSON&lt;/a&gt;, go to Next steps &amp;gt; Lesson learned no 1: Have an overview of your Dapr services and click on Expand report.&lt;/p&gt;

&lt;p&gt;I added some variables for the Prometheus datasource name and the cluster name. You can change the refresh rate of the dashboard and the time range.&lt;/p&gt;

&lt;p&gt;The output is a dashboard that looks something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjs6sq2hqz250dvlfctmt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjs6sq2hqz250dvlfctmt.png" alt="Dapr Dashboard" width="800" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each section of the dashboard has a very nice info box that will tell you what the section shows and how to interpret the data.&lt;/p&gt;

&lt;p&gt;Ideally, a good practice is not to overload yourself by creating tons of dashboards that you will not look at, will not maintain, or will simply forget about.&lt;/p&gt;

&lt;p&gt;In this case, though, it is quite useful to have one: in the event of an incident, it will save you hours of troubleshooting and give you a good overview of the system and what is failing. If you look at the panels of the dashboard, you'll see that it tracks CSR failures, server TLS certificate issuance failures, and more.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lesson learned no 2: Make sure you are aware before expiration
&lt;/h3&gt;

&lt;p&gt;Beginning &lt;strong&gt;30 days&lt;/strong&gt; prior to mTLS root certificate expiration, the Dapr Sentry service emits hourly warning-level logs indicating that the root certificate is about to expire. You can use these logs to set up alerts that notify you before the certificate expires.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Dapr root certificate expiration warning: certificate expires in 2 days and 15 hours"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
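&lt;p&gt;Since these warnings encode the remaining lifetime in prose, a small parser can turn them into a number you can alert on. A sketch, assuming the message format shown in the warning above:&lt;/p&gt;

```python
import re

def hours_remaining(msg: str):
    """Extract the remaining certificate lifetime, in hours, from a sentry warning."""
    m = re.search(r"expires in (\d+) days? and (\d+) hours?", msg)
    if m is None:
        return None
    days, hours = (int(g) for g in m.groups())
    return days * 24 + hours

warning = "Dapr root certificate expiration warning: certificate expires in 2 days and 15 hours"
print(hours_remaining(warning))  # 63
```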



&lt;p&gt;The first thing is to &lt;strong&gt;configure a Loki data source&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I already had this done, and setting it up might be the subject of another blog post. In a nutshell, Loki is a log aggregation system that integrates with Grafana and lets you ingest and query log data, so I just made sure my Loki data source was configured correctly.&lt;/p&gt;

&lt;p&gt;Next, I &lt;strong&gt;created a log query&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the Explore view of Grafana, with the Loki data source selected, I wrote a log query that retrieves the logs I want to use for the alert. The query you build might differ, but it should match the logs produced by the &lt;code&gt;kubectl logs&lt;/code&gt; command for dapr-sentry.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{cluster="$cluster", namespace="dapr-system"} |= `Dapr root certificate expiration warning: certificate expires in`
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adjust the query based on the specific log lines or patterns you want to target. I wanted all the logs with the certificate expiration warning, starting from the 30-day mark, but you can edit the query so it only matches x days before the expiration.&lt;/p&gt;
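&lt;p&gt;For instance, you can add a &lt;code&gt;regexp&lt;/code&gt; parser stage to pull the number of days out of the message as a label; a sketch (the stage names and label are an assumption about your Loki version, so adjust to your setup):&lt;/p&gt;

```logql
{cluster="$cluster", namespace="dapr-system"}
  |= `Dapr root certificate expiration warning`
  | regexp `expires in (?P<days>\d+) day`
```

&lt;p&gt;With the &lt;code&gt;days&lt;/code&gt; label extracted, a label-filter stage comparing it against your threshold lets the query match only once you are close enough to expiry.&lt;/p&gt;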

&lt;p&gt;A good rule of thumb is to &lt;strong&gt;test the log query&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;After executing the query, you should see the warnings in the log entries. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tip:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If no logs are returned, check that the query is correct, that the data source is set up correctly, and that the logs are being ingested by Loki.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Since all was good in my case, I proceeded to &lt;strong&gt;add this query from the Explore page to my previously created dashboard&lt;/strong&gt; so I can see the logs there as well. I created a new panel with the logs and a description of what they mean for anyone reading it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7lyov367k9bi8lpb40h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7lyov367k9bi8lpb40h.png" alt="Alt text" width="800" height="119"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And lastly, I &lt;strong&gt;created an alert rule&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the Alerting section of Grafana, I went to "Create Rule" to define an alert rule based on the previous query, and I defined the conditions that trigger the alert from the log query results. For example, you can set a condition like "Count() is above 0" to trigger the alert when at least one log entry matches the query, or customize it based on your needs.&lt;/p&gt;

&lt;p&gt;The implementation of the alert will differ based on the tooling you use and the channel you want to be alerted on (Slack, email, etc.).&lt;/p&gt;

&lt;p&gt;I hope this gave you some insight into how you can troubleshoot and monitor Dapr in your environments.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thank you for reading! And let me know if you have any questions or feedback.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>dapr</category>
      <category>kubernetes</category>
      <category>aks</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Troubleshooting and Resolving a Pod Stuck in 'CreateContainerConfigError' in Kubernetes</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Mon, 30 Jan 2023 21:58:25 +0000</pubDate>
      <link>https://dev.to/the_cozma/troubleshooting-and-resolving-a-pod-stuck-in-createcontainerconfigerror-in-kubernetes-56ij</link>
      <guid>https://dev.to/the_cozma/troubleshooting-and-resolving-a-pod-stuck-in-createcontainerconfigerror-in-kubernetes-56ij</guid>
      <description>&lt;p&gt;The other day I was making changes to my helm charts and, after deploying my application, I noticed that one of my pods was stuck in a &lt;code&gt;CreateContainerConfigError&lt;/code&gt; state. This is a pretty tricky error because it doesn't give you any details on what the underlying issue could be.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the CreateContainerConfigError?
&lt;/h2&gt;

&lt;p&gt;To understand this, let's look at what happens at deployment time to give you an idea of the flow and what could go wrong at each step. &lt;/p&gt;

&lt;p&gt;When you deploy a pod, the first step is to pull the image from the registry and then create the container. If the &lt;strong&gt;image is not found&lt;/strong&gt;, then Kubernetes will return an &lt;strong&gt;ErrImagePull&lt;/strong&gt; error. If the image is found, then it will proceed to create the container. &lt;/p&gt;

&lt;p&gt;If the &lt;strong&gt;container creation fails&lt;/strong&gt;, it will return a &lt;strong&gt;CreateContainerError&lt;/strong&gt;. If the container creation succeeds, Kubernetes starts the container.&lt;/p&gt;

&lt;p&gt;If Kubernetes &lt;strong&gt;cannot generate the container's configuration&lt;/strong&gt; (for example, because a referenced ConfigMap or Secret is missing), it will return a &lt;strong&gt;CreateContainerConfigError&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In other words, the error happens while the container is transitioning from a Pending to a Running state. It is at this point that the deployment configuration is validated to make sure the container can be started; if the configuration is invalid, Kubernetes returns a &lt;strong&gt;CreateContainerConfigError&lt;/strong&gt; error.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Troubleshoot the CreateContainerConfigError
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Disclaimer: There can be many reasons why the container configuration is invalid, and it will depend on your specific configuration. I will only cover the one I encountered. If you have run into a different cause, please leave a comment below.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Because the error happens during the validation of the configuration, a good starting point is to double-check the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;is the ConfigMap missing? Is it properly configured?&lt;/li&gt;
&lt;li&gt;is a Secret missing? Is it properly configured?&lt;/li&gt;
&lt;li&gt;is the PersistentVolume missing? Is it properly configured?&lt;/li&gt;
&lt;li&gt;is the Pod being created correctly? Are there any empty or invalid fields?&lt;/li&gt;
&lt;/ul&gt;
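&lt;p&gt;The cross-check behind that list can be sketched in a few lines: given a pod spec and the names of the ConfigMaps and Secrets that actually exist in the namespace, list every dangling reference. The pod spec below is hypothetical data for illustration:&lt;/p&gt;

```python
def missing_references(pod_spec, configmaps, secrets):
    """Return (kind, name) pairs the pod references but that do not exist."""
    missing = []
    for container in pod_spec.get("containers", []):
        # Env vars sourced from individual ConfigMap/Secret keys.
        for env in container.get("env", []):
            ref = env.get("valueFrom", {})
            cm = ref.get("configMapKeyRef", {}).get("name")
            if cm and cm not in configmaps:
                missing.append(("ConfigMap", cm))
            sec = ref.get("secretKeyRef", {}).get("name")
            if sec and sec not in secrets:
                missing.append(("Secret", sec))
        # Whole-object envFrom references.
        for src in container.get("envFrom", []):
            cm = src.get("configMapRef", {}).get("name")
            if cm and cm not in configmaps:
                missing.append(("ConfigMap", cm))
            sec = src.get("secretRef", {}).get("name")
            if sec and sec not in secrets:
                missing.append(("Secret", sec))
    return missing

# Hypothetical pod spec referencing one Secret and one ConfigMap:
spec = {"containers": [{"name": "app",
                        "env": [{"name": "DB_PASS",
                                 "valueFrom": {"secretKeyRef": {"name": "db-creds", "key": "pass"}}}],
                        "envFrom": [{"configMapRef": {"name": "app-config"}}]}]}
print(missing_references(spec, configmaps=set(), secrets={"db-creds"}))
# [('ConfigMap', 'app-config')]
```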

&lt;p&gt;Now that we understand what the error is, and what we should be looking at, let's look at how to troubleshoot it and narrow down the problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Check the Pod Status
&lt;/h3&gt;

&lt;p&gt;The first thing I did was identify the pod with the error so I could drill into it.&lt;/p&gt;

&lt;p&gt;You can do this by running &lt;code&gt;kubectl get pods -n &amp;lt;namespace&amp;gt;&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~ kubectl get pods &lt;span class="nt"&gt;-n&lt;/span&gt; my-service                                                                   
NAME                                READY   STATUS                       RESTARTS       AGE
my-service-00000000078c9fff-dssbk   0/2     CreateContainerConfigError   1 &lt;span class="o"&gt;(&lt;/span&gt;10s ago&lt;span class="o"&gt;)&lt;/span&gt;    28s
my-service-00000000bcddf7d-xfsmk    2/2     Running                      25 &lt;span class="o"&gt;(&lt;/span&gt;42h ago&lt;span class="o"&gt;)&lt;/span&gt;   16d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
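&lt;p&gt;When the namespace is busy, it can help to filter the STATUS column down to just the failing pods. Here is a small sketch using awk; a saved sample of the output above stands in for a live cluster, but on a real one you could pipe &lt;code&gt;kubectl get pods -n &amp;lt;namespace&amp;gt; --no-headers&lt;/code&gt; into the same filter:&lt;/p&gt;

```shell
# Saved `kubectl get pods` output stands in for a live cluster here.
pods='my-service-00000000078c9fff-dssbk   0/2   CreateContainerConfigError   1   28s
my-service-00000000bcddf7d-xfsmk    2/2   Running                      25  16d'

# Keep only the names of pods whose STATUS column (field 3) shows the error.
failing=$(printf '%s\n' "$pods" | awk '$3 == "CreateContainerConfigError" {print $1}')
echo "$failing"
```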



&lt;h3&gt;
  
  
  Check the Events
&lt;/h3&gt;

&lt;p&gt;Next, we want to see all the events on the pod.&lt;/p&gt;

&lt;p&gt;You can do this by running &lt;code&gt;kubectl describe pod &amp;lt;pod-name&amp;gt; -n &amp;lt;namespace&amp;gt;&lt;/code&gt; and looking at the &lt;strong&gt;Events&lt;/strong&gt; section at the bottom. This gives you a lot of information about the pod, including the events that have happened to it, similar to the following output (redacted to remove sensitive information).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~ kubectl describe pod my-service-00000000078c9fff-dssbk &lt;span class="nt"&gt;-n&lt;/span&gt; my-service 

Name:             my-service-00000000078c9fff-dssbk
Namespace:        my-service
Priority:         0
Service Account:  default
Node:             &amp;lt;node-details&amp;gt;
Start Time:       Wed, 25 Jan 2023 15:36:14 +0100
Labels:           app.kubernetes.io/instance&lt;span class="o"&gt;=&lt;/span&gt;my-service
                  app.kubernetes.io/name&lt;span class="o"&gt;=&lt;/span&gt;my-service
                  pod-template-hash&lt;span class="o"&gt;=&lt;/span&gt;00000000
Annotations:      &amp;lt;annotations&amp;gt;
Status:           Pending
IP:               
IPs:
  IP:           
Controlled By:  ReplicaSet/
Containers:
  my-service:
    Container ID:
    Image:          &amp;lt;image-name&amp;gt;
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CreateContainerConfigError
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  128Mi
    Requests:
      cpu:      100m
      memory:   128Mi
    Liveness:   http-get http://:http/ &lt;span class="nv"&gt;delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;15s &lt;span class="nb"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60s &lt;span class="nv"&gt;period&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60s &lt;span class="c"&gt;#success=1 #failure=3&lt;/span&gt;
    Readiness:  http-get http://:http/ &lt;span class="nv"&gt;delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;15s &lt;span class="nb"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60s &lt;span class="nv"&gt;period&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60s &lt;span class="c"&gt;#success=1 #failure=3&lt;/span&gt;
    Environment:
    &lt;span class="o"&gt;(&lt;/span&gt;...&lt;span class="o"&gt;)&lt;/span&gt;
      AzureWebJobsStorage:                                                  &amp;lt;&lt;span class="nb"&gt;set &lt;/span&gt;to the key &lt;span class="s1"&gt;'AzureWebJobsStorage'&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;secret &lt;span class="s1"&gt;'my-service'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;                                     Optional: &lt;span class="nb"&gt;false
      &lt;/span&gt;AzureAccessKey:                                                       &amp;lt;&lt;span class="nb"&gt;set &lt;/span&gt;to the key &lt;span class="s1"&gt;'AzureAccessKey'&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;secret &lt;span class="s1"&gt;'my-service'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;                                          Optional: &lt;span class="nb"&gt;false
      &lt;/span&gt;AzureTopicEndpoint:                                                   &amp;lt;&lt;span class="nb"&gt;set &lt;/span&gt;to the key &lt;span class="s1"&gt;'AzureTopicEndpoint'&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;secret &lt;span class="s1"&gt;'my-service'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;                                      Optional: &lt;span class="nb"&gt;false
      &lt;/span&gt;ClientId:                                                             &amp;lt;&lt;span class="nb"&gt;set &lt;/span&gt;to the key &lt;span class="s1"&gt;'ClientId'&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;secret &lt;span class="s1"&gt;'my-service'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;                                                Optional: &lt;span class="nb"&gt;false&lt;/span&gt;
    &lt;span class="o"&gt;(&lt;/span&gt;...&lt;span class="o"&gt;)&lt;/span&gt;
    State:          Waiting
    &lt;span class="o"&gt;(&lt;/span&gt;...&lt;span class="o"&gt;)&lt;/span&gt;
Events:
  Type     Reason     Age                From               Message
  &lt;span class="nt"&gt;----&lt;/span&gt;     &lt;span class="nt"&gt;------&lt;/span&gt;     &lt;span class="nt"&gt;----&lt;/span&gt;               &lt;span class="nt"&gt;----&lt;/span&gt;               &lt;span class="nt"&gt;-------&lt;/span&gt;
  Normal   Scheduled  94s                default-scheduler  Successfully assigned my-service/my-service-00000000078c9fff-dssbk to &amp;lt;node-name&amp;gt;
  Normal   Pulled     94s                kubelet            Successfully pulled image &lt;span class="s2"&gt;"image"&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;165.014261ms
  Warning  Failed     77s &lt;span class="o"&gt;(&lt;/span&gt;x4 over 94s&lt;span class="o"&gt;)&lt;/span&gt;  kubelet            Error: couldn&lt;span class="s1"&gt;'t find key ClientId in Secret my-service/my-service
(...)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Events section shows a list of all the events that have occurred in the process of creating the pod. &lt;/p&gt;

&lt;p&gt;And here we find the issue. The pod is actually missing a secret key (the ClientId, in my case) that it needs to start. And that is why the pod is in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    State:          Waiting
      Reason:       CreateContainerConfigError
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
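&lt;p&gt;Since the describe output is long, it can also help to filter it down to just the Warning events, which usually point straight at the root cause. A minimal sketch with grep, where a few redacted sample lines stand in for live output (on a real cluster you could pipe &lt;code&gt;kubectl describe pod &amp;lt;pod-name&amp;gt; -n &amp;lt;namespace&amp;gt;&lt;/code&gt; into the same filter):&lt;/p&gt;

```shell
# Redacted lines from a `kubectl describe pod` Events section stand in for live output.
events='Normal   Scheduled  94s  default-scheduler  Successfully assigned my-service pod to node
Normal   Pulled     94s  kubelet            Successfully pulled image "image"
Warning  Failed     77s  kubelet            Error: could not find key ClientId in Secret my-service/my-service'

# Keep only the Warning events.
warnings=$(printf '%s\n' "$events" | grep 'Warning')
echo "$warnings"
```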



&lt;p&gt;If you want to double-check that the secret is missing, you can run &lt;code&gt;kubectl get secrets -n &amp;lt;namespace&amp;gt;&lt;/code&gt; and check whether the secret is there.&lt;/p&gt;

&lt;p&gt;Or you can output it in a JSON format and check that the key is missing by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get secret my-service &lt;span class="nt"&gt;-n&lt;/span&gt; my-service &lt;span class="nt"&gt;-o&lt;/span&gt; json | jq &lt;span class="s1"&gt;'.data | map_values(@base64d)'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
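&lt;p&gt;To see what the &lt;code&gt;@base64d&lt;/code&gt; filter in that command is doing, here is the decoding step on its own, with a made-up value standing in for real secret data:&lt;/p&gt;

```shell
# Secret values are stored base64-encoded under .data; jq's @base64d decodes them.
encoded='bXktY2xpZW50LWlk'   # hypothetical ClientId value as it would appear in .data
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"
```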



&lt;h2&gt;
  
  
  How to Resolve the CreateContainerConfigError
&lt;/h2&gt;

&lt;p&gt;In my case, because I store my infrastructure and configuration (including the Kubernetes secrets) in Terraform, I just needed to add the secret to the Terraform configuration and apply it. Because the deployment had already timed out, I also had to re-run the deployment; the pod would have picked the secret up automatically if I had applied it a bit sooner.&lt;/p&gt;
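&lt;p&gt;For reference, here is a hypothetical sketch of what such a Terraform addition can look like, using the &lt;code&gt;kubernetes_secret&lt;/code&gt; resource from the Terraform Kubernetes provider. The names match the example above, and the variables are placeholders for however you inject the actual values:&lt;/p&gt;

```hcl
# Hypothetical sketch: the Secret managed in Terraform, with the missing key added.
resource "kubernetes_secret" "my_service" {
  metadata {
    name      = "my-service"
    namespace = "my-service"
  }

  data = {
    AzureWebJobsStorage = var.azure_webjobs_storage
    AzureAccessKey      = var.azure_access_key
    AzureTopicEndpoint  = var.azure_topic_endpoint
    ClientId            = var.client_id # the key the pod could not find
  }
}
```

&lt;p&gt;Note that the provider base64-encodes the &lt;code&gt;data&lt;/code&gt; values for you, so they are supplied in plain text here.&lt;/p&gt;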

&lt;p&gt;Now that the pod has its necessary, valid configuration, if we run &lt;code&gt;kubectl get pods -n &amp;lt;namespace&amp;gt;&lt;/code&gt; again we can see that the pod is in a Running state.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get pods &lt;span class="nt"&gt;-n&lt;/span&gt; my-service                                                                   
NAME                                READY   STATUS                       RESTARTS       AGE
my-service-00000000078c9fff-dssbk   2/2     Running                       1 &lt;span class="o"&gt;(&lt;/span&gt;10s ago&lt;span class="o"&gt;)&lt;/span&gt;    28s
my-service-00000000bcddf7d-xfsmk    2/2     Terminating                  25 &lt;span class="o"&gt;(&lt;/span&gt;42h ago&lt;span class="o"&gt;)&lt;/span&gt;   16d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And there you have it. You have successfully resolved the CreateContainerConfigError.&lt;/p&gt;

&lt;p&gt;This was an easy one, let me know what you encountered in the comments below and how you fixed it.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Happy Coding and I hope this helps someone!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>crypto</category>
      <category>blockchain</category>
      <category>web3</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Book Review: Observability Engineering: Achieving Production Excellence</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Wed, 25 Jan 2023 12:50:46 +0000</pubDate>
      <link>https://dev.to/the_cozma/book-review-observability-engineering-achieving-production-excellence-h9o</link>
      <guid>https://dev.to/the_cozma/book-review-observability-engineering-achieving-production-excellence-h9o</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fana-cozma.github.io%2Fblog%2Fcoffee%2Fbook-review-observability%2Fobservability.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fana-cozma.github.io%2Fblog%2Fcoffee%2Fbook-review-observability%2Fobservability.png" alt="Observability Engineering" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the first book review in what I will call my &lt;em&gt;coffee reads&lt;/em&gt; section of the blog I will be reviewing the book &lt;a href="https://www.oreilly.com/library/view/observability-engineering/9781492050046/" rel="noopener noreferrer"&gt;Observability Engineering: Achieving Production Excellence&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Observability Engineering: Achieving Production Excellence
&lt;/h2&gt;

&lt;p&gt;The book, in its own description, sets out to be an advocate for the adoption of observability practices in the software industry. Written by Charity Majors, Liz Fong-Jones, and George Miranda from &lt;a href="https://www.honeycomb.io/" rel="noopener noreferrer"&gt;Honeycomb.io&lt;/a&gt;, it aims to be a resource for anyone interested in learning more about what good observability is, how you can build it on top of your system today, and how to implement it in your organization.&lt;/p&gt;

&lt;p&gt;The book, consisting of around 400 pages, is split into 3 main parts: the first part is an introduction to observability, the second part is a deep dive into the different observability tools and practices, and the third part is a guide on how to implement observability in your organization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Thoughts on the book
&lt;/h2&gt;

&lt;p&gt;I bought this book for Kindle and went into it with basic knowledge about observability and having used the Honeycomb product a bit for work. &lt;/p&gt;

&lt;p&gt;My expectations from the title and the summary of the book were to learn about the different tools and practices in the realm of observability in addition to getting a better understanding of the theory behind observability and how it can be implemented in an organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Highlights
&lt;/h3&gt;

&lt;p&gt;The introduction to observability section was a very high-level overview of what observability is and how it differs from monitoring.&lt;/p&gt;

&lt;p&gt;I appreciated the change in mentality the book triggered in me around the concept of observability versus the concept of monitoring. The book did a very good job of explaining how observability takes a more holistic approach than monitoring. I could relate to the examples given in the book about how monitoring is a reactive approach to problems while observability is a proactive one, and to all the frustrations that come from relying solely on monitoring: excessive alerting, false positives, lack of context, excessive use of dashboards, and a lack of understanding of the system.&lt;/p&gt;

&lt;p&gt;The book does not focus solely on the tools that Honeycomb uses; it tries to offer an overview of what the market has for this goal, and it does a good job of explaining the different tools and practices in terms of observability. The section on &lt;em&gt;tracing&lt;/em&gt; explains how tracing works and how it can be used to understand the flow of a request through a system, in great detail, with examples and code snippets. I found it easy to follow, and because of this I understood what benefits it brings and how it can be used. The section on &lt;em&gt;metrics&lt;/em&gt; was also very useful in explaining how you can understand the health of a system by using metrics and how you can also make use of them to detect any system anomalies.&lt;/p&gt;

&lt;p&gt;The key takeaway from this chapter was that by using observability concepts and tooling correctly you don't need to rely on software engineers to understand the system; you can use the data, which is available to everyone, to understand the system and make decisions based on it.&lt;/p&gt;

&lt;p&gt;The book also emphasizes the organizational and cultural changes that need to be made to implement observability in an organization. As an example, the book explains how the concept of blameless postmortems is a good way to encourage a culture of learning and how it can be used to improve the system, and how, if you rely on data, all the software engineers with a hero complex will have to adapt. I had a lot of respect for the authors for being honest about these points and for highlighting the day-to-day realities of implementing observability in an organization.&lt;/p&gt;

&lt;p&gt;The idea of adding case studies of companies that have implemented observability in their organization was a very nice touch, and I enjoyed reading about how different companies went about adopting observability practices and the challenges they needed to address.&lt;/p&gt;

&lt;h3&gt;
  
  
  Areas for Improvement
&lt;/h3&gt;

&lt;p&gt;Onto the things that I think could be revisited in the next versions of the book.&lt;/p&gt;

&lt;p&gt;There is a bit of repetition in the book, especially in the first part, where the same concepts are explained in different ways and the same idea (the difference between observability and monitoring) is reiterated several times. This could definitely be reduced in the next edition of the book.&lt;/p&gt;

&lt;p&gt;The book also does not go into implementation details on how observability can &lt;em&gt;actually&lt;/em&gt; be adopted. It reiterates the challenges but is a bit succinct on how to address them, and the same goes for implementing observability practices in your own system.&lt;/p&gt;

&lt;p&gt;Some sections had a lot of detailed code snippets that were hard to read, and I skimmed over them to get the gist, while in other sections I would have liked to see more code snippets to help me understand the concepts better.&lt;/p&gt;

&lt;p&gt;The case study section was too high-level for my liking. I would've loved to read more about the &lt;em&gt;challenges&lt;/em&gt; they faced and &lt;em&gt;how&lt;/em&gt; they addressed them, rather than team collaboration being mentioned as the success metric in adopting observability. I wanted the case studies to go into detail about the tools they used, how they used them, and what their lessons learned were.&lt;/p&gt;

&lt;p&gt;Because of this, the book felt a bit unbalanced at times, in my opinion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;To summarize, I would say the book has a lot of good key takeaways to set you on the path to adopting observability practices, whether in your day-to-day work, at team level, or maybe even at organizational level, but it did fall short in some areas. Nevertheless, it was a good read and I would recommend it to anyone interested in learning more about observability.&lt;/p&gt;

&lt;p&gt;If you've read the book let me know what you thought about it in the comments below. Or if you have any recommendations for other books on the topic of observability I would love to hear them.&lt;/p&gt;

&lt;p&gt;I will also keep reviewing tech books on my blog as part of the Coffee Reads series.&lt;/p&gt;

&lt;p&gt;Enjoy your coffee!&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>mcp</category>
      <category>community</category>
    </item>
    <item>
      <title>Kube-bench and Popeye: A Power Duo for AKS Security Compliance</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Mon, 23 Jan 2023 18:22:01 +0000</pubDate>
      <link>https://dev.to/the_cozma/kube-bench-and-popeye-a-power-duo-for-aks-security-compliance-38</link>
      <guid>https://dev.to/the_cozma/kube-bench-and-popeye-a-power-duo-for-aks-security-compliance-38</guid>
      <description>&lt;p&gt;In today's world, security is a top priority for any organization or at least it should be. With the rise of cloud computing, the number of security threats has increased exponentially.&lt;/p&gt;

&lt;p&gt;So how do we keep up? Where do we start?&lt;/p&gt;

&lt;p&gt;Microsoft has created a set of security benchmarks to give users a starting point for setting up their security configurations. The Microsoft cloud security benchmark (MCSB) is the successor of the Azure Security Benchmark (ASB), which was rebranded in October 2022 (currently in public preview).&lt;/p&gt;

&lt;p&gt;In this post, I would like to go over the Azure security baseline for Azure Kubernetes Service and give a shoutout to two tools that can aid you in the process of establishing your compliance with the baseline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Azure Security Baseline for AKS
&lt;/h2&gt;

&lt;p&gt;The Azure Security Baseline for &lt;a href="https://learn.microsoft.com/en-us/security/benchmark/azure/baselines/aks-security-baseline" rel="noopener noreferrer"&gt;Azure Kubernetes Service&lt;/a&gt; (AKS) is a set of recommendations for securing your AKS cluster.&lt;/p&gt;

&lt;p&gt;It is an exhaustive list covering various aspects of AKS security, and it also provides the corresponding actions to take in each case. From the documentation's overview:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;You can monitor this security baseline and its recommendations using Microsoft Defender for Cloud. Azure Policy definitions will be listed in the Regulatory Compliance section of the Microsoft Defender for Cloud dashboard.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;When a section has relevant Azure Policy Definitions, they are listed in this baseline to help you measure compliance to the Azure Security Benchmark controls and recommendations. Some recommendations may require a paid Microsoft Defender plan to enable certain security scenarios.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It is based on the CIS Kubernetes Benchmark and the Azure Security Benchmark v1.0.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;CIS Benchmarks are best practices for the secure configuration of a target system. Available for more than 100 CIS Benchmarks across 25+ vendor product families, CIS Benchmarks are developed through a unique consensus-based process comprised of cybersecurity professionals and subject matter experts around the world. CIS Benchmarks are the only consensus-based, best-practice security configuration guides both developed and accepted by government, business, industry, and academia.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For more information on &lt;strong&gt;CIS Benchmark&lt;/strong&gt; please check &lt;a href="https://www.cisecurity.org/cis-benchmarks/cis-benchmarks-faq" rel="noopener noreferrer"&gt;CIS Benchmark FAQ&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For more information on the &lt;strong&gt;CIS Benchmark for Kubernetes&lt;/strong&gt; please check the &lt;a href="https://www.cisecurity.org/benchmark/kubernetes" rel="noopener noreferrer"&gt;kubernetes benchmark&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;In the CIS Benchmark for Kubernetes document, there are instructions for both Master nodes and Worker nodes. But when using AKS we don't have access to the master nodes. In this case, we can make use of the &lt;a href="https://www.cisecurity.org/insights/blog/new-release-cis-azure-kubernetes-service-aks-benchmark" rel="noopener noreferrer"&gt;CIS Benchmark document for AKS&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What could we use to help us check our AKS setup against this benchmark?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We can start by looking at the Azure Portal and &lt;strong&gt;Microsoft Defender for Cloud&lt;/strong&gt;, checking out CIS compliance with &lt;strong&gt;Kube-bench&lt;/strong&gt; and any configuration mismatches with &lt;strong&gt;Popeye&lt;/strong&gt;. I will go into more detail on the last two tools. But first, let's see what Microsoft Defender for Cloud looks like and what you can get from it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Microsoft Defender for Cloud
&lt;/h2&gt;

&lt;p&gt;As suggested by Microsoft, we can start with Microsoft Defender for Cloud.&lt;br&gt;
If you go to the Azure Portal, search for Microsoft Defender for Cloud, filter by "Assessed Resources", and select your cluster, you will see all the cluster details along with the &lt;em&gt;Recommendations&lt;/em&gt; and &lt;em&gt;Alerts&lt;/em&gt; tabs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1oe5cfp5xn6kwojxqc7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1oe5cfp5xn6kwojxqc7.png" alt="Microsoft Defender for Cloud" width="800" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's take the first recommendation as an example:&lt;br&gt;
&lt;em&gt;Azure Kubernetes Service clusters should have Defender profile enabled&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you click on it and expand it, you will see the following information:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F50x27p24c7lj2ponmolp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F50x27p24c7lj2ponmolp.png" alt="Detail" width="800" height="714"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can choose to Exempt it (meaning you have either fixed the issue or don't want to fix it) or Enforce it (meaning you want to enforce this setting by adding it to an Azure Policy definition).&lt;/p&gt;

&lt;p&gt;There is also a nice description of the issue and suggested remediation steps to take.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsxkzii2vipbxbbdvoas.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsxkzii2vipbxbbdvoas.png" alt="Kube-bench" width="411" height="410"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Kube-bench
&lt;/h2&gt;

&lt;p&gt;The official repository can be found &lt;a href="https://github.com/aquasecurity/kube-bench" rel="noopener noreferrer"&gt;here&lt;/a&gt; with detailed installation instructions.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;kube-bench is a tool that checks whether Kubernetes is deployed securely by running the checks documented in the CIS Kubernetes Benchmark.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are multiple ways of running this tool that you can check &lt;a href="https://github.com/aquasecurity/kube-bench/blob/main/docs/running.md" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Setting it up
&lt;/h3&gt;

&lt;p&gt;To test out this tool, I decided to just apply it to my local cluster, so the first thing I did was start my &lt;a href="https://minikube.sigs.k8s.io/docs/start/" rel="noopener noreferrer"&gt;minikube&lt;/a&gt; instance and then run the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; minikube start
😄  minikube v1.22.0 on Darwin 12.6.2
✨  Using the hyperkit driver based on existing profile
👍  Starting control plane node minikube &lt;span class="k"&gt;in &lt;/span&gt;cluster minikube
🏃  Updating the running hyperkit &lt;span class="s2"&gt;"minikube"&lt;/span&gt; VM ...
🎉  minikube 1.28.0 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.28.0
💡  To disable this notice, run: &lt;span class="s1"&gt;'minikube config set WantUpdateNotification false'&lt;/span&gt;

🐳  Preparing Kubernetes v1.21.2 on Docker 20.10.6 ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass

❗  /usr/local/bin/kubectl is version 1.25.2, which may have incompatibilites with Kubernetes 1.21.2.
    ▪ Want kubectl v1.21.2? Try &lt;span class="s1"&gt;'minikube kubectl -- get pods -A'&lt;/span&gt;
🏄  Done! kubectl is now configured to use &lt;span class="s2"&gt;"minikube"&lt;/span&gt; cluster and &lt;span class="s2"&gt;"default"&lt;/span&gt; namespace by default

&lt;span class="c"&gt;# Download the job.yaml file&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; curl https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; job.yaml

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; job.yaml
job.batch/kube-bench created

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; kubectl get pods &lt;span class="nt"&gt;-A&lt;/span&gt;                                                                                                                                ✔  at minikube ⎈ 
NAMESPACE       NAME                                        READY   STATUS              RESTARTS   AGE
default         kube-bench-t2fgh                            0/1     ContainerCreating   0          5s

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; kubectl get pods &lt;span class="nt"&gt;-A&lt;/span&gt;                                                                                                                                ✔  at minikube ⎈
NAMESPACE       NAME                                        READY   STATUS      RESTARTS   AGE
default         kube-bench-t2fgh                            0/1     Completed   0          32s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can run Kube-bench inside a pod, but it will need access to the host's PID namespace to check the running processes, as well as access to some directories on the host where config files and other files are stored.&lt;/p&gt;

&lt;p&gt;The supplied &lt;code&gt;job.yaml&lt;/code&gt; file can be applied to run the tests as a job. This was enough for me to run locally to get a feel of what the tool does and how it generates the report.&lt;/p&gt;

&lt;p&gt;Next, after having run the tests, I wanted to get the report. The results of the tests can be found in the logs of the pod which you can get by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; kubectl logs kube-bench-t2fgh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
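&lt;p&gt;Since the full report is long, a quick tally of the result types can help you gauge where you stand. A small sketch, with sample result lines standing in for the real log (on a live cluster you could pipe &lt;code&gt;kubectl logs kube-bench-t2fgh&lt;/code&gt; into the same loop):&lt;/p&gt;

```shell
# Sample kube-bench result lines stand in for the real pod log here.
report='[PASS] 1.1.1 Ensure that the API server pod specification file permissions are set to 644 or more restrictive
[WARN] 1.1.9 Ensure that the Container Network Interface file permissions are set to 644 or more restrictive
[FAIL] 1.1.11 Ensure that the etcd data directory permissions are set to 700 or more restrictive
[PASS] 1.1.13 Ensure that the admin.conf file permissions are set to 644 or more restrictive'

# Count each result type; grep -c returns the number of matching lines.
for status in PASS FAIL WARN; do
  count=$(printf '%s\n' "$report" | grep -c "^\[$status\]")
  echo "$status: $count"
done
```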



&lt;p&gt;Kube-bench generates a report that looks like the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[INFO] 1 Master Node Security Configuration
[INFO] 1.1 Master Node Configuration Files
[PASS] 1.1.1 Ensure that the API server pod specification file permissions are set to 644 or more restrictive (Automated)
[PASS] 1.1.2 Ensure that the API server pod specification file ownership is set to root:root (Automated)
[PASS] 1.1.3 Ensure that the controller manager pod specification file permissions are set to 644 or more restrictive (Automated)
[PASS] 1.1.4 Ensure that the controller manager pod specification file ownership is set to root:root (Automated)
[PASS] 1.1.5 Ensure that the scheduler pod specification file permissions are set to 644 or more restrictive (Automated)
[PASS] 1.1.6 Ensure that the scheduler pod specification file ownership is set to root:root (Automated)
[PASS] 1.1.7 Ensure that the etcd pod specification file permissions are set to 644 or more restrictive (Automated)
[PASS] 1.1.8 Ensure that the etcd pod specification file ownership is set to root:root (Automated)
[WARN] 1.1.9 Ensure that the Container Network Interface file permissions are set to 644 or more restrictive (Manual)
[WARN] 1.1.10 Ensure that the Container Network Interface file ownership is set to root:root (Manual)
[FAIL] 1.1.11 Ensure that the etcd data directory permissions are set to 700 or more restrictive (Automated)
[FAIL] 1.1.12 Ensure that the etcd data directory ownership is set to etcd:etcd (Automated)
[PASS] 1.1.13 Ensure that the admin.conf file permissions are set to 644 or more restrictive (Automated)
[PASS] 1.1.14 Ensure that the admin.conf file ownership is set to root:root (Automated)
[PASS] 1.1.15 Ensure that the scheduler.conf file permissions are set to 644 or more restrictive (Automated)
[PASS] 1.1.16 Ensure that the scheduler.conf file ownership is set to root:root (Automated)
[PASS] 1.1.17 Ensure that the controller-manager.conf file permissions are set to 644 or more restrictive (Automated)
[PASS] 1.1.18 Ensure that the controller-manager.conf file ownership is set to root:root (Automated)
[FAIL] 1.1.19 Ensure that the Kubernetes PKI directory and file ownership is set to root:root (Automated)
[WARN] 1.1.20 Ensure that the Kubernetes PKI certificate file permissions are set to 644 or more restrictive (Manual)
[WARN] 1.1.21 Ensure that the Kubernetes PKI key file permissions are set to 600 (Manual)
[INFO] 1.2 API Server
[WARN] 1.2.1 Ensure that the --anonymous-auth argument is set to false (Manual)
[PASS] 1.2.2 Ensure that the --token-auth-file parameter is not set (Automated)
[PASS] 1.2.3 Ensure that the --kubelet-https argument is set to true (Automated)
[PASS] 1.2.4 Ensure that the --kubelet-client-certificate and --kubelet-client-key arguments are set as appropriate (Automated)
[FAIL] 1.2.5 Ensure that the --kubelet-certificate-authority argument is set as appropriate (Automated)
[PASS] 1.2.6 Ensure that the --authorization-mode argument is not set to AlwaysAllow (Automated)
[PASS] 1.2.7 Ensure that the --authorization-mode argument includes Node (Automated)
[PASS] 1.2.8 Ensure that the --authorization-mode argument includes RBAC (Automated)
[WARN] 1.2.9 Ensure that the admission control plugin EventRateLimit is set (Manual)
[PASS] 1.2.10 Ensure that the admission control plugin AlwaysAdmit is not set (Automated)
[WARN] 1.2.11 Ensure that the admission control plugin AlwaysPullImages is set (Manual)
[WARN] 1.2.12 Ensure that the admission control plugin SecurityContextDeny is set if PodSecurityPolicy is not used (Manual)
[PASS] 1.2.13 Ensure that the admission control plugin ServiceAccount is set (Automated)
[PASS] 1.2.14 Ensure that the admission control plugin NamespaceLifecycle is set (Automated)
[FAIL] 1.2.15 Ensure that the admission control plugin PodSecurityPolicy is set (Automated)
[PASS] 1.2.16 Ensure that the admission control plugin NodeRestriction is set (Automated)
[PASS] 1.2.17 Ensure that the --insecure-bind-address argument is not set (Automated)
[PASS] 1.2.18 Ensure that the --insecure-port argument is set to 0 (Automated)
[PASS] 1.2.19 Ensure that the --secure-port argument is not set to 0 (Automated)
[FAIL] 1.2.20 Ensure that the --profiling argument is set to false (Automated)
[FAIL] 1.2.21 Ensure that the --audit-log-path argument is set (Automated)
[FAIL] 1.2.22 Ensure that the --audit-log-maxage argument is set to 30 or as appropriate (Automated)
[FAIL] 1.2.23 Ensure that the --audit-log-maxbackup argument is set to 10 or as appropriate (Automated)
[FAIL] 1.2.24 Ensure that the --audit-log-maxsize argument is set to 100 or as appropriate (Automated)
[WARN] 1.2.25 Ensure that the --request-timeout argument is set as appropriate (Manual)
[PASS] 1.2.26 Ensure that the --service-account-lookup argument is set to true (Automated)
[PASS] 1.2.27 Ensure that the --service-account-key-file argument is set as appropriate (Automated)
[PASS] 1.2.28 Ensure that the --etcd-certfile and --etcd-keyfile arguments are set as appropriate (Automated)
[PASS] 1.2.29 Ensure that the --tls-cert-file and --tls-private-key-file arguments are set as appropriate (Automated)
[PASS] 1.2.30 Ensure that the --client-ca-file argument is set as appropriate (Automated)
[PASS] 1.2.31 Ensure that the --etcd-cafile argument is set as appropriate (Automated)
[WARN] 1.2.32 Ensure that the --encryption-provider-config argument is set as appropriate (Manual)
[WARN] 1.2.33 Ensure that encryption providers are appropriately configured (Manual)
[WARN] 1.2.34 Ensure that the API Server only makes use of Strong Cryptographic Ciphers (Manual)
[INFO] 1.3 Controller Manager
[WARN] 1.3.1 Ensure that the --terminated-pod-gc-threshold argument is set as appropriate (Manual)
[FAIL] 1.3.2 Ensure that the --profiling argument is set to false (Automated)
[PASS] 1.3.3 Ensure that the --use-service-account-credentials argument is set to true (Automated)
[PASS] 1.3.4 Ensure that the --service-account-private-key-file argument is set as appropriate (Automated)
[PASS] 1.3.5 Ensure that the --root-ca-file argument is set as appropriate (Automated)
[PASS] 1.3.6 Ensure that the RotateKubeletServerCertificate argument is set to true (Automated)
[PASS] 1.3.7 Ensure that the --bind-address argument is set to 127.0.0.1 (Automated)
[INFO] 1.4 Scheduler
[FAIL] 1.4.1 Ensure that the --profiling argument is set to false (Automated)
[PASS] 1.4.2 Ensure that the --bind-address argument is set to 127.0.0.1 (Automated)

== Remediations master ==
1.1.9 Run the below command (based on the file location on your system) on the master node.
For example,
chmod 644 &amp;lt;path/to/cni/files&amp;gt;

1.1.10 Run the below command (based on the file location on your system) on the master node.
For example,
chown root:root &amp;lt;path/to/cni/files&amp;gt;

1.1.11 On the etcd server node, get the etcd data directory, passed as an argument --data-dir,
from the below command:
ps -ef | grep etcd
Run the below command (based on the etcd data directory found above). For example,
chmod 700 /var/lib/etcd

1.1.12 On the etcd server node, get the etcd data directory, passed as an argument --data-dir,
from the below command:
ps -ef | grep etcd
Run the below command (based on the etcd data directory found above).
For example, chown etcd:etcd /var/lib/etcd

1.1.19 Run the below command (based on the file location on your system) on the master node.
For example,
chown -R root:root /etc/kubernetes/pki/

1.2.5 Follow the Kubernetes documentation and setup the TLS connection between
the apiserver and kubelets. Then, edit the API server pod specification file
/etc/kubernetes/manifests/kube-apiserver.yaml on the master node and set the
--kubelet-certificate-authority parameter to the path to the cert file for the certificate authority.
--kubelet-certificate-authority=&amp;lt;ca-string&amp;gt;

1.2.9 Follow the Kubernetes documentation and set the desired limits in a configuration file.
Then, edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
and set the below parameters.
--enable-admission-plugins=...,EventRateLimit,...
--admission-control-config-file=&amp;lt;path/to/configuration/file&amp;gt;

1.2.11 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the --enable-admission-plugins parameter to include
AlwaysPullImages.
--enable-admission-plugins=...,AlwaysPullImages,...

1.2.12 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the --enable-admission-plugins parameter to include
SecurityContextDeny, unless PodSecurityPolicy is already in place.
--enable-admission-plugins=...,SecurityContextDeny,...

1.2.15 Follow the documentation and create Pod Security Policy objects as per your environment.
Then, edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the --enable-admission-plugins parameter to a
value that includes PodSecurityPolicy:
--enable-admission-plugins=...,PodSecurityPolicy,...
Then restart the API Server.

1.2.20 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the below parameter.
--profiling=false

1.2.21 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the --audit-log-path parameter to a suitable path and
file where you would like audit logs to be written, for example:
--audit-log-path=/var/log/apiserver/audit.log

1.2.22 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the --audit-log-maxage parameter to 30 or as an appropriate number of days:
--audit-log-maxage=30

1.2.23 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the --audit-log-maxbackup parameter to 10 or to an appropriate
value.
--audit-log-maxbackup=10

1.2.24 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the --audit-log-maxsize parameter to an appropriate size in MB.
For example, to set it as 100 MB:
--audit-log-maxsize=100

1.2.25 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
and set the below parameter as appropriate and if needed.
For example,
--request-timeout=300s

1.2.32 Follow the Kubernetes documentation and configure an EncryptionConfig file.
Then, edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the --encryption-provider-config parameter to the path of that file: --encryption-provider-config=&amp;lt;/path/to/EncryptionConfig/File&amp;gt;

1.2.33 Follow the Kubernetes documentation and configure an EncryptionConfig file.
In this file, choose aescbc, kms or secretbox as the encryption provider.

1.2.34 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the below parameter.
--tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM
_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM
_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM
_SHA384

1.3.1 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --terminated-pod-gc-threshold to an appropriate threshold,
for example:
--terminated-pod-gc-threshold=10

1.3.2 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the below parameter.
--profiling=false

1.4.1 Edit the Scheduler pod specification file /etc/kubernetes/manifests/kube-scheduler.yaml file
on the master node and set the below parameter.
--profiling=false


== Summary master ==
39 checks PASS
12 checks FAIL
13 checks WARN
0 checks INFO

[INFO] 2 Etcd Node Configuration
[INFO] 2 Etcd Node Configuration Files
[PASS] 2.1 Ensure that the --cert-file and --key-file arguments are set as appropriate (Automated)
[PASS] 2.2 Ensure that the --client-cert-auth argument is set to true (Automated)
[PASS] 2.3 Ensure that the --auto-tls argument is not set to true (Automated)
[PASS] 2.4 Ensure that the --peer-cert-file and --peer-key-file arguments are set as appropriate (Automated)
[PASS] 2.5 Ensure that the --peer-client-cert-auth argument is set to true (Automated)
[PASS] 2.6 Ensure that the --peer-auto-tls argument is not set to true (Automated)
[PASS] 2.7 Ensure that a unique Certificate Authority is used for etcd (Manual)

== Summary etcd ==
7 checks PASS
0 checks FAIL
0 checks WARN
0 checks INFO

[INFO] 3 Control Plane Configuration
[INFO] 3.1 Authentication and Authorization
[WARN] 3.1.1 Client certificate authentication should not be used for users (Manual)
[INFO] 3.2 Logging
[WARN] 3.2.1 Ensure that a minimal audit policy is created (Manual)
[WARN] 3.2.2 Ensure that the audit policy covers key security concerns (Manual)

== Remediations controlplane ==
3.1.1 Alternative mechanisms provided by Kubernetes such as the use of OIDC should be
implemented in place of client certificates.

3.2.1 Create an audit policy file for your cluster.

3.2.2 Consider modification of the audit policy in use on the cluster to include these items, at a
minimum.


== Summary controlplane ==
0 checks PASS
0 checks FAIL
3 checks WARN
0 checks INFO

[INFO] 4 Worker Node Security Configuration
[INFO] 4.1 Worker Node Configuration Files
[PASS] 4.1.1 Ensure that the kubelet service file permissions are set to 644 or more restrictive (Automated)
[PASS] 4.1.2 Ensure that the kubelet service file ownership is set to root:root (Automated)
[PASS] 4.1.3 If proxy kubeconfig file exists ensure permissions are set to 644 or more restrictive (Manual)
[PASS] 4.1.4 If proxy kubeconfig file exists ensure ownership is set to root:root (Manual)
[PASS] 4.1.5 Ensure that the --kubeconfig kubelet.conf file permissions are set to 644 or more restrictive (Automated)
[PASS] 4.1.6 Ensure that the --kubeconfig kubelet.conf file ownership is set to root:root (Automated)
[WARN] 4.1.7 Ensure that the certificate authorities file permissions are set to 644 or more restrictive (Manual)
[WARN] 4.1.8 Ensure that the client certificate authorities file ownership is set to root:root (Manual)
[PASS] 4.1.9 Ensure that the kubelet --config configuration file has permissions set to 644 or more restrictive (Automated)
[PASS] 4.1.10 Ensure that the kubelet --config configuration file ownership is set to root:root (Automated)
[INFO] 4.2 Kubelet
[PASS] 4.2.1 Ensure that the anonymous-auth argument is set to false (Automated)
[PASS] 4.2.2 Ensure that the --authorization-mode argument is not set to AlwaysAllow (Automated)
[PASS] 4.2.3 Ensure that the --client-ca-file argument is set as appropriate (Automated)
[PASS] 4.2.4 Ensure that the --read-only-port argument is set to 0 (Manual)
[PASS] 4.2.5 Ensure that the --streaming-connection-idle-timeout argument is not set to 0 (Manual)
[FAIL] 4.2.6 Ensure that the --protect-kernel-defaults argument is set to true (Automated)
[PASS] 4.2.7 Ensure that the --make-iptables-util-chains argument is set to true (Automated)
[WARN] 4.2.8 Ensure that the --hostname-override argument is not set (Manual)
[WARN] 4.2.9 Ensure that the --event-qps argument is set to 0 or a level which ensures appropriate event capture (Manual)
[WARN] 4.2.10 Ensure that the --tls-cert-file and --tls-private-key-file arguments are set as appropriate (Manual)
[PASS] 4.2.11 Ensure that the --rotate-certificates argument is not set to false (Automated)
[PASS] 4.2.12 Verify that the RotateKubeletServerCertificate argument is set to true (Manual)
[WARN] 4.2.13 Ensure that the Kubelet only makes use of Strong Cryptographic Ciphers (Manual)

== Remediations node ==
4.1.7 Run the following command to modify the file permissions of the
--client-ca-file chmod 644 &amp;lt;filename&amp;gt;

4.1.8 Run the following command to modify the ownership of the --client-ca-file.
chown root:root &amp;lt;filename&amp;gt;

4.2.6 If using a Kubelet config file, edit the file to set protectKernelDefaults: true.
If using command line arguments, edit the kubelet service file
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf on each worker node and
set the below parameter in KUBELET_SYSTEM_PODS_ARGS variable.
--protect-kernel-defaults=true
Based on your system, restart the kubelet service. For example:
systemctl daemon-reload
systemctl restart kubelet.service

4.2.8 Edit the kubelet service file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
on each worker node and remove the --hostname-override argument from the
KUBELET_SYSTEM_PODS_ARGS variable.
Based on your system, restart the kubelet service. For example:
systemctl daemon-reload
systemctl restart kubelet.service

4.2.9 If using a Kubelet config file, edit the file to set eventRecordQPS: to an appropriate level.
If using command line arguments, edit the kubelet service file
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf on each worker node and
set the below parameter in KUBELET_SYSTEM_PODS_ARGS variable.
Based on your system, restart the kubelet service. For example:
systemctl daemon-reload
systemctl restart kubelet.service

4.2.10 If using a Kubelet config file, edit the file to set tlsCertFile to the location
of the certificate file to use to identify this Kubelet, and tlsPrivateKeyFile
to the location of the corresponding private key file.
If using command line arguments, edit the kubelet service file
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf on each worker node and
set the below parameters in KUBELET_CERTIFICATE_ARGS variable.
--tls-cert-file=&amp;lt;path/to/tls-certificate-file&amp;gt;
--tls-private-key-file=&amp;lt;path/to/tls-key-file&amp;gt;
Based on your system, restart the kubelet service. For example:
systemctl daemon-reload
systemctl restart kubelet.service

4.2.13 If using a Kubelet config file, edit the file to set TLSCipherSuites: to
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256
or to a subset of these values.
If using executable arguments, edit the kubelet service file
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf on each worker node and
set the --tls-cipher-suites parameter as follows, or to a subset of these values.
--tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256
Based on your system, restart the kubelet service. For example:
systemctl daemon-reload
systemctl restart kubelet.service


== Summary node ==
16 checks PASS
1 checks FAIL
6 checks WARN
0 checks INFO

[INFO] 5 Kubernetes Policies
[INFO] 5.1 RBAC and Service Accounts
[WARN] 5.1.1 Ensure that the cluster-admin role is only used where required (Manual)
[WARN] 5.1.2 Minimize access to secrets (Manual)
[WARN] 5.1.3 Minimize wildcard use in Roles and ClusterRoles (Manual)
[WARN] 5.1.4 Minimize access to create pods (Manual)
[WARN] 5.1.5 Ensure that default service accounts are not actively used. (Manual)
[WARN] 5.1.6 Ensure that Service Account Tokens are only mounted where necessary (Manual)
[WARN] 5.1.7 Avoid use of system:masters group (Manual)
[WARN] 5.1.8 Limit use of the Bind, Impersonate and Escalate permissions in the Kubernetes cluster (Manual)
[INFO] 5.2 Pod Security Policies
[WARN] 5.2.1 Minimize the admission of privileged containers (Automated)
[WARN] 5.2.2 Minimize the admission of containers wishing to share the host process ID namespace (Automated)
[WARN] 5.2.3 Minimize the admission of containers wishing to share the host IPC namespace (Automated)
[WARN] 5.2.4 Minimize the admission of containers wishing to share the host network namespace (Automated)
[WARN] 5.2.5 Minimize the admission of containers with allowPrivilegeEscalation (Automated)
[WARN] 5.2.6 Minimize the admission of root containers (Automated)
[WARN] 5.2.7 Minimize the admission of containers with the NET_RAW capability (Automated)
[WARN] 5.2.8 Minimize the admission of containers with added capabilities (Automated)
[WARN] 5.2.9 Minimize the admission of containers with capabilities assigned (Manual)
[INFO] 5.3 Network Policies and CNI
[WARN] 5.3.1 Ensure that the CNI in use supports Network Policies (Manual)
[WARN] 5.3.2 Ensure that all Namespaces have Network Policies defined (Manual)
[INFO] 5.4 Secrets Management
[WARN] 5.4.1 Prefer using secrets as files over secrets as environment variables (Manual)
[WARN] 5.4.2 Consider external secret storage (Manual)
[INFO] 5.5 Extensible Admission Control
[WARN] 5.5.1 Configure Image Provenance using ImagePolicyWebhook admission controller (Manual)
[INFO] 5.7 General Policies
[WARN] 5.7.1 Create administrative boundaries between resources using namespaces (Manual)
[WARN] 5.7.2 Ensure that the seccomp profile is set to docker/default in your pod definitions (Manual)
[WARN] 5.7.3 Apply Security Context to Your Pods and Containers (Manual)
[WARN] 5.7.4 The default namespace should not be used (Manual)

== Remediations policies ==
5.1.1 Identify all clusterrolebindings to the cluster-admin role. Check if they are used and
if they need this role or if they could use a role with fewer privileges.
Where possible, first bind users to a lower privileged role and then remove the
clusterrolebinding to the cluster-admin role :
kubectl delete clusterrolebinding [name]

5.1.2 Where possible, remove get, list and watch access to secret objects in the cluster.

5.1.3 Where possible replace any use of wildcards in clusterroles and roles with specific
objects or actions.

5.1.4 Where possible, remove create access to pod objects in the cluster.

5.1.5 Create explicit service accounts wherever a Kubernetes workload requires specific access
to the Kubernetes API server.
Modify the configuration of each default service account to include this value
automountServiceAccountToken: false

5.1.6 Modify the definition of pods and service accounts which do not need to mount service
account tokens to disable it.

5.1.7 Remove the system:masters group from all users in the cluster.

5.1.8 Where possible, remove the impersonate, bind and escalate rights from subjects.

5.2.1 Create a PSP as described in the Kubernetes documentation, ensuring that
the .spec.privileged field is omitted or set to false.

5.2.2 Create a PSP as described in the Kubernetes documentation, ensuring that the
.spec.hostPID field is omitted or set to false.

5.2.3 Create a PSP as described in the Kubernetes documentation, ensuring that the
.spec.hostIPC field is omitted or set to false.

5.2.4 Create a PSP as described in the Kubernetes documentation, ensuring that the
.spec.hostNetwork field is omitted or set to false.

5.2.5 Create a PSP as described in the Kubernetes documentation, ensuring that the
.spec.allowPrivilegeEscalation field is omitted or set to false.

5.2.6 Create a PSP as described in the Kubernetes documentation, ensuring that the
.spec.runAsUser.rule is set to either MustRunAsNonRoot or MustRunAs with the range of
UIDs not including 0.

5.2.7 Create a PSP as described in the Kubernetes documentation, ensuring that the
.spec.requiredDropCapabilities is set to include either NET_RAW or ALL.

5.2.8 Ensure that allowedCapabilities is not present in PSPs for the cluster unless
it is set to an empty array.

5.2.9 Review the use of capabilities in applications running on your cluster. Where a namespace
contains applications which do not require any Linux capabilities to operate, consider adding
a PSP which forbids the admission of containers which do not drop all capabilities.

5.3.1 If the CNI plugin in use does not support network policies, consideration should be given to
making use of a different plugin, or finding an alternate mechanism for restricting traffic
in the Kubernetes cluster.

5.3.2 Follow the documentation and create NetworkPolicy objects as you need them.

5.4.1 If possible, rewrite application code to read secrets from mounted secret files, rather than
from environment variables.

5.4.2 Refer to the secrets management options offered by your cloud provider or a third-party
secrets management solution.

5.5.1 Follow the Kubernetes documentation and setup image provenance.

5.7.1 Follow the documentation and create namespaces for objects in your deployment as you need
them.

5.7.2 Use security context to enable the docker/default seccomp profile in your pod definitions.
An example is as below:
  securityContext:
    seccompProfile:
      type: RuntimeDefault

5.7.3 Follow the Kubernetes documentation and apply security contexts to your pods. For a
suggested list of security contexts, you may refer to the CIS Security Benchmark for Docker
Containers.

5.7.4 Ensure that namespaces are created to allow for appropriate segregation of Kubernetes
resources and that all new resources are created in a specific namespace.


== Summary policies ==
0 checks PASS
0 checks FAIL
26 checks WARN
0 checks INFO

== Summary total ==
62 checks PASS
13 checks FAIL
48 checks WARN
0 checks INFO
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This can also be run inside an AKS cluster by following the instructions &lt;a href="https://github.com/aquasecurity/kube-bench/blob/main/docs/running.md#running-in-an-aks-cluster" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;br&gt;
As a reminder, Kube-bench cannot be run on AKS master nodes, only on worker nodes; as mentioned before, this is a limitation of AKS, not of Kube-bench.&lt;/p&gt;
&lt;h3&gt;
  
  
  The report breakdown
&lt;/h3&gt;

&lt;p&gt;As the report above shows, Kube-bench benchmarks five areas of your configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Control Plane Components&lt;/li&gt;
&lt;li&gt;Etcd&lt;/li&gt;
&lt;li&gt;Control Plane Configurations&lt;/li&gt;
&lt;li&gt;Worker Nodes&lt;/li&gt;
&lt;li&gt;Policies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each section starts with lines tagged INFO that describe the area it targets. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[INFO] 5 Kubernetes Policies
[INFO] 5.1 RBAC and Service Accounts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It then lists the checks performed for that section; each check receives a PASS, FAIL, or WARN status. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[WARN] 5.1.1 Ensure that the cluster-admin role is only used where required (Manual)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the tests run, it also suggests remediations for the checks that received a WARN or FAIL status. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;== Remediations policies ==
5.1.1 Identify all clusterrolebindings to the cluster-admin role. Check if they are used and
if they need this role or if they could use a role with fewer privileges.
Where possible, first bind users to a lower privileged role and then remove the
clusterrolebinding to the cluster-admin role :
kubectl delete clusterrolebinding [name]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each section ends with a summary of its results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;== Summary policies ==
0 checks PASS
0 checks FAIL
26 checks WARN
0 checks INFO
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
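&lt;p&gt;If you save the report to a file, a few lines of shell make triaging it easier. This is a minimal sketch: the &lt;code&gt;kube-bench.txt&lt;/code&gt; filename and the embedded excerpt are illustrative stand-ins for a real report.&lt;/p&gt;

```shell
# Triage a saved kube-bench report from the shell.
# The file name and sample excerpt below are illustrative.
cat > kube-bench.txt <<'EOF'
[PASS] 1.2.2 Ensure that the --token-auth-file parameter is not set (Automated)
[FAIL] 1.2.5 Ensure that the --kubelet-certificate-authority argument is set as appropriate (Automated)
[WARN] 5.1.1 Ensure that the cluster-admin role is only used where required (Manual)
EOF

# Show only the failing checks so they can be prioritized
grep '^\[FAIL\]' kube-bench.txt

# Tally the results per status
for status in PASS FAIL WARN; do
  printf '%s: %s\n' "$status" "$(grep -c "^\[$status\]" kube-bench.txt)"
done
# → PASS: 1, FAIL: 1, WARN: 1
```

&lt;p&gt;In a real run you would capture the report instead, for example with &lt;code&gt;kube-bench run &amp;gt; kube-bench.txt&lt;/code&gt;.&lt;/p&gt;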



&lt;h3&gt;
  
  
  Potential use cases
&lt;/h3&gt;

&lt;p&gt;By running it as a CronJob in your cluster, Kube-bench can help you identify potential security issues on an ongoing basis. It is easy to use and a great tool to have in your toolbox.&lt;br&gt;
You can configure it to run on a schedule like every week or month and get a report on the security of your cluster, while also taking into account the specific CIS benchmark for your cloud provider. For example, you can set up and run the &lt;a href="https://github.com/aquasecurity/kube-bench/blob/main/job-aks.yaml" rel="noopener noreferrer"&gt;job-aks.yaml&lt;/a&gt; file to run the tests on an AKS cluster.&lt;/p&gt;
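&lt;p&gt;As a rough sketch of what such a scheduled run could look like, here is a minimal CronJob wrapping the Kube-bench job. The name, image tag, schedule, and mounts are illustrative, not taken from the official manifest; the linked job-aks.yaml is the authoritative starting point for AKS:&lt;/p&gt;

```yaml
# Illustrative sketch: run kube-bench on a weekly schedule
# (names, tag, and schedule are examples only)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kube-bench
spec:
  schedule: "0 6 * * 1"          # every Monday at 06:00
  jobTemplate:
    spec:
      template:
        spec:
          hostPID: true          # lets kube-bench inspect node processes
          restartPolicy: Never
          containers:
            - name: kube-bench
              image: docker.io/aquasec/kube-bench:latest
              command: ["kube-bench"]
              volumeMounts:
                - name: var-lib-kubelet
                  mountPath: /var/lib/kubelet
                  readOnly: true
          volumes:
            - name: var-lib-kubelet
              hostPath:
                path: /var/lib/kubelet
```

&lt;p&gt;Applying it with &lt;code&gt;kubectl apply -f&lt;/code&gt; schedules the scan, and each run's report is then available via &lt;code&gt;kubectl logs&lt;/code&gt; on the completed pod.&lt;/p&gt;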
&lt;h2&gt;
  
  
  Popeye - A Kubernetes Cluster Sanitizer
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy82msy4one7f55o1zvm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy82msy4one7f55o1zvm.png" alt="Popeye" width="425" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repository for the tool can be found &lt;a href="https://github.com/derailed/popeye" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is a read-only utility that scans live Kubernetes clusters and reports potential issues with deployed resources and configurations.&lt;br&gt;
What I liked about this tool is that it is easy to install and use, and it delivers on its promise: reducing the cognitive overload of operating a Kubernetes cluster in the wild.&lt;/p&gt;
&lt;h3&gt;
  
  
  Setting it up
&lt;/h3&gt;

&lt;p&gt;Popeye can be run standalone from the command line, configured with a &lt;a href="https://github.com/derailed/popeye/blob/master/spinach/spinach_aks.yml" rel="noopener noreferrer"&gt;spinach.yml&lt;/a&gt; file, or deployed directly in the cluster as a CronJob.&lt;/p&gt;
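&lt;p&gt;A spinach file lets you tune the scan, for example by excluding namespaces or silencing specific sanitizer codes. A minimal sketch follows; the schema has changed between Popeye versions, so treat the keys and names below as illustrative rather than authoritative:&lt;/p&gt;

```yaml
# Illustrative spinach.yml: trim noise from a Popeye scan
popeye:
  excludes:
    global:
      namespaces:
        - kube-system        # skip the AKS-managed system namespace
    linters:
      pods:
        codes:
          - "POP-301"        # example: mute one specific check code
```

&lt;p&gt;You would then point Popeye at it with something like &lt;code&gt;popeye -f spinach.yml&lt;/code&gt;.&lt;/p&gt;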

&lt;p&gt;In this post, I will be using the command-line option on a Mac. To install it, I ran:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Install popeye
&amp;gt; brew install derailed/popeye/popeye
# Check popeye version
&amp;gt; popeye version                                                                                                                   
 ___     ___ _____   _____                       K          .-'-.
| _ \___| _ \ __\ \ / / __|                       8     __|      `\
|  _/ _ \  _/ _| \ V /| _|                         s   `-,-`--._   `\
|_| \___/_| |___| |_| |___|                       []  .-&amp;gt;'  a     `|-'
  Biffs`em and Buffs`em!                            `=/ (__/_       /
                                                      \_,    `    _)
                                                         `----;  |
Version:   0.10.1
Commit:    ae19897a4b5d3738a3e98179207759e45a53a64c
Date:      2022-06-28T14:46:13Z
Logs:      /var/folders/vp/l8dlq0gn3x71f3vk82shmzlm0000gn/T/popeye.log

# Connected to my AKS cluster
# Check the context I am using
&amp;gt; kubectl config current-context
# Run popeye
&amp;gt; popeye
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The report breakdown
&lt;/h3&gt;

&lt;p&gt;The report is printed to the console and looks something like the snippet below. I have trimmed some of the output for brevity; it should still give you an idea of the format, the types of checks performed, and their results.&lt;/p&gt;

&lt;p&gt;The report is nicely split into sections, each with a summary of the checks performed and their results, and it ends by assigning the cluster a grade.&lt;/p&gt;

&lt;p&gt;The color coding is also very helpful to quickly identify the issues:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Level&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Icon&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Jurassic&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Color&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ok&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;OK&lt;/td&gt;
&lt;td&gt;Green&lt;/td&gt;
&lt;td&gt;Happy!&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Info&lt;/td&gt;
&lt;td&gt;🔊&lt;/td&gt;
&lt;td&gt;I&lt;/td&gt;
&lt;td&gt;BlueGreen&lt;/td&gt;
&lt;td&gt;FYI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warn&lt;/td&gt;
&lt;td&gt;😱&lt;/td&gt;
&lt;td&gt;W&lt;/td&gt;
&lt;td&gt;Yellow&lt;/td&gt;
&lt;td&gt;Potential Issue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Error&lt;/td&gt;
&lt;td&gt;💥&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Red&lt;/td&gt;
&lt;td&gt;Action required&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; popeye                                                                                                                                      

 ___     ___ _____   _____                                                      K          .-&lt;span class="s1"&gt;'-.
| _ \___| _ \ __\ \ / / __|                                                      8     __|      `\
|  _/ _ \  _/ _| \ V /| _|                                                        s   `-,-`--._   `\
|_| \___/_| |___| |_| |___|                                                      []  .-&amp;gt;'&lt;/span&gt;  a     &lt;span class="sb"&gt;`&lt;/span&gt;|-&lt;span class="s1"&gt;'
  Biffs`em and Buffs`em!                                                          `=/ (__/_       /
                                                                                    \_,    `    _)
                                                                                       `----;  |


GENERAL [AKS-STAGING]
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · Connectivity...................................................................................✅
  · MetricServer...................................................................................✅


CLUSTER (1 SCANNED)                                                          💥 0 😱 0 🔊 0 ✅ 1 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · Version........................................................................................✅
    ✅ [POP-406] K8s version OK.


CLUSTERROLES (13 SCANNED)                                                   💥 0 😱 0 🔊 0 ✅ 13 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · azure-policy-webhook-cluster-role..............................................................✅
  · dapr-operator-admin............................................................................✅
  · dashboard-reader...............................................................................✅
  · gatekeeper-manager-role........................................................................✅
  · grafana-agent..................................................................................✅
  · keda-operator..................................................................................✅
  · keda-operator-external-metrics-reader..........................................................✅
  · kong-kong......................................................................................✅
  · omsagent-reader................................................................................✅
  · policy-agent...................................................................................✅
  · system:coredns-autoscaler......................................................................✅
  · system:metrics-server..........................................................................✅
  · system:prometheus..............................................................................✅


CLUSTERROLEBINDINGS (19 SCANNED)                                             💥 0 😱 6 🔊 0 ✅ 13 68٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · azure-policy-webhook-cluster-rolebinding.......................................................✅
  · dapr-operator..................................................................................✅
  · dapr-role-tokenreview-binding..................................................................😱
    😱 [POP-1300] References a ClusterRole (system:auth-delegator) which does not exist.
  · dashboard-reader-global........................................................................✅
  · gatekeeper-manager-rolebinding.................................................................✅
  · grafana-agent..................................................................................✅
  · keda-operator..................................................................................✅
  · keda-operator-hpa-controller-external-metrics..................................................✅
  · keda-operator-system-auth-delegator............................................................😱
    😱 [POP-1300] References a ClusterRole (system:auth-delegator) which does not exist.
  · kong-kong......................................................................................✅
  · kubelet-api-admin..............................................................................😱
    😱 [POP-1300] References a ClusterRole (system:kubelet-api-admin) which does not exist.
  · metrics-server:system:auth-delegator...........................................................😱
    😱 [POP-1300] References a ClusterRole (system:auth-delegator) which does not exist.
  · omsagentclusterrolebinding.....................................................................✅
  · policy-agent...................................................................................✅
  · replicaset-controller..........................................................................😱
    😱 [POP-1300] References a ClusterRole (system:controller:replicaset-controller) which does not
       exist.
  · system:coredns-autoscaler......................................................................✅
  · system:discovery...............................................................................😱
    😱 [POP-1300] References a ClusterRole (system:discovery) which does not exist.
  · system:metrics-server..........................................................................✅
  · system:prometheus..............................................................................✅


CONFIGMAPS (43 SCANNED)                                                     💥 0 😱 0 🔊 37 ✅ 6 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/kube-root-ca.crt...................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · dapr-system/operator.dapr.io...................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · dapr-system/webhooks.dapr.io...................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · default/kube-root-ca.crt.......................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
(...)


DAEMONSETS (9 SCANNED)                                                        💥 0 😱 2 🔊 0 ✅ 7 77٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · kube-system/azure-ip-masq-agent................................................................✅
  · kube-system/cloud-node-manager.................................................................✅
  · kube-system/cloud-node-manager-windows.........................................................✅
  · kube-system/csi-azuredisk-node.................................................................✅
  · kube-system/csi-azuredisk-node-win.............................................................✅
  · kube-system/csi-azurefile-node.................................................................✅
  · kube-system/csi-azurefile-node-win.............................................................✅
  · kube-system/kube-proxy.........................................................................😱
    🐳 kube-proxy
      😱 [POP-107] No resource limits defined.
    🐳 kube-proxy-bootstrap
      😱 [POP-107] No resource limits defined.
  · prometheus-agent/grafana-agent.................................................................😱
    🐳 agent
      😱 [POP-106] No resources requests/limits defined.


DEPLOYMENTS (29 SCANNED)                                                     💥 0 😱 4 🔊 3 ✅ 22 86٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/dapr-dashboard.....................................................................🔊
    🐳 dapr-dashboard
      🔊 [POP-108] Unnamed port 8080.
  · dapr-system/dapr-operator......................................................................🔊
    🐳 dapr-operator
      🔊 [POP-108] Unnamed port 6500.
  · dapr-system/dapr-sentry........................................................................🔊
    🐳 dapr-sentry
      🔊 [POP-108] Unnamed port 50001.
  · dapr-system/dapr-sidecar-injector..............................................................✅
  · gatekeeper-system/gatekeeper-audit.............................................................✅
  · gatekeeper-system/gatekeeper-controller........................................................✅
  · keda-system/keda-operator......................................................................✅
  · keda-system/keda-operator-metrics-apiserver....................................................✅
  · kong/kong-kong.................................................................................😱
    🐳 ingress-controller
      😱 [POP-106] No resources requests/limits defined.
    🐳 proxy
      😱 [POP-106] No resources requests/limits defined.
  · kube-system/azure-policy.......................................................................✅
  · kube-system/azure-policy-webhook...............................................................✅
  · kube-system/coredns............................................................................✅
  · kube-system/coredns-autoscaler.................................................................✅
  · kube-system/konnectivity-agent.................................................................✅
  · kube-system/metrics-server.....................................................................✅
(...)


HORIZONTALPODAUTOSCALERS (4 SCANNED)                                          💥 0 😱 2 🔊 0 ✅ 2 50٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · HPA............................................................................................😱
    😱 [POP-604] If ALL HPAs triggered, 11500m will match/exceed cluster CPU(4931m) capacity by
       6570m.
    😱 [POP-605] If ALL HPAs triggered, 14720Mi will match/exceed cluster memory(1523Mi) capacity by
       13196Mi.
  · my-service/keda-hpa-my-service.................................................................✅
  · my-service-2/keda-hpa-my-service-2.............................................................😱
    😱 [POP-602] Replicas (1/100) at burst will match/exceed cluster CPU(4931m) capacity by 5070m.
    😱 [POP-603] Replicas (1/100) at burst will match/exceed cluster memory(1523Mi) capacity by
       11276Mi.
(...)


INGRESSES (13 SCANNED)                                                      💥 0 😱 0 🔊 0 ✅ 13 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · my-service/my-service..........................................................................✅
  · kong/kong-dapr.................................................................................✅
(...)


NAMESPACES (23 SCANNED)                                                     💥 0 😱 0 🔊 3 ✅ 20 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system....................................................................................✅
  · default........................................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · gatekeeper-system..............................................................................✅
  · keda-system....................................................................................✅
  · kong...........................................................................................✅
  · kube-node-lease................................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · kube-public....................................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · kube-system....................................................................................✅
  · logstash.......................................................................................✅
  · prometheus-agent...............................................................................✅
(...)


NETWORKPOLICIES (2 SCANNED)                                                  💥 0 😱 0 🔊 0 ✅ 2 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · kube-system/konnectivity-agent.................................................................✅
  · kube-system/tunnelfront........................................................................✅


NODES (3 SCANNED)                                                             💥 0 😱 2 🔊 0 ✅ 1 33٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · aks-default-***-vmss000000.....................................................................😱
    😱 [POP-710] Memory threshold (80%) reached 83%.
  · aks-default-***-vmss000001.....................................................................😱
    😱 [POP-710] Memory threshold (80%) reached 105%.
  · aks-default-***-vmss00000a.....................................................................✅


PERSISTENTVOLUMES (3 SCANNED)                                                💥 0 😱 0 🔊 0 ✅ 3 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · pvc-**.........................................................................................✅
  · pvc-**.........................................................................................✅
  · pvc-**.........................................................................................✅


PERSISTENTVOLUMECLAIMS (3 SCANNED)                                           💥 0 😱 0 🔊 0 ✅ 3 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/raft-log-dapr-placement-server-0...................................................✅
  · dapr-system/raft-log-dapr-placement-server-1...................................................✅
  · dapr-system/raft-log-dapr-placement-server-2...................................................✅


PODS (72 SCANNED)                                                             💥 1 😱 71 🔊 0 ✅ 0 0٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/dapr-dashboard-1-lq942.............................................................😱
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    🐳 dapr-dashboard
      😱 [POP-102] No probes defined.
      🔊 [POP-108] Unnamed port 8080.
  · dapr-system/dapr-operator-1-chmmq..............................................................😱
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    🐳 dapr-operator
      😱 [POP-205] Pod was restarted (17) times.
      🔊 [POP-105] Liveness probe uses a port#, prefer a named port.
      🔊 [POP-105] Readiness probe uses a port#, prefer a named port.
      🔊 [POP-108] Unnamed port 6500.
  · dapr-system/dapr-placement-server-0............................................................😱
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 dapr-placement-server
      🔊 [POP-105] Liveness probe uses a port#, prefer a named port.
      🔊 [POP-105] Readiness probe uses a port#, prefer a named port.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
(...)


PODDISRUPTIONBUDGETS (8 SCANNED)                                              💥 0 😱 2 🔊 0 ✅ 6 75٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/dapr-dashboard-disruption-budget...................................................✅
  · dapr-system/dapr-operator-disruption-budget....................................................✅
  · dapr-system/dapr-placement-server-disruption-budget............................................✅
  · dapr-system/dapr-sentry-budget.................................................................✅
  · dapr-system/dapr-sidecar-injector-disruption-budget............................................✅
  · kube-system/coredns-pdb........................................................................😱
    😱 [POP-403] Deprecated PodDisruptionBudget API group "policy/v1beta1". Use "policy/v1" instead.
  · kube-system/konnectivity-agent.................................................................😱
    😱 [POP-403] Deprecated PodDisruptionBudget API group "policy/v1beta1". Use "policy/v1" instead.
  · logstash/logstash-logstash-pdb.................................................................✅


PODSECURITYPOLICIES (0 SCANNED)                                              💥 0 😱 0 🔊 0 ✅ 0 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · Nothing to report.


REPLICASETS (197 SCANNED)                                                  💥 0 😱 0 🔊 0 ✅ 197 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/dapr-dashboard-1...................................................................✅
  · dapr-system/dapr-operator-1....................................................................✅
  · dapr-system/dapr-sentry-1......................................................................✅
  · dapr-system/dapr-sidecar-injector-1............................................................✅
  · keda-system/keda-operator-1....................................................................✅
  · keda-system/keda-operator-metrics-apiserver-1..................................................✅
(...)


ROLES (5 SCANNED)                                                            💥 0 😱 0 🔊 0 ✅ 5 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · default/secret-reader..........................................................................✅
  · gatekeeper-system/gatekeeper-manager-role......................................................✅
  · kong/kong-kong.................................................................................✅
  · kube-system/azure-policy-webhook-role..........................................................✅
  · kube-system/policy-pod-agent...................................................................✅


ROLEBINDINGS (7 SCANNED)                                                      💥 0 😱 2 🔊 0 ✅ 5 71٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · default/dapr-secret-reader.....................................................................✅
  · gatekeeper-system/gatekeeper-manager-rolebinding...............................................✅
  · kong/kong-kong.................................................................................✅
  · kube-system/azure-policy-webhook-rolebinding...................................................✅
  · kube-system/keda-operator-auth-reader..........................................................😱
    😱 [POP-1300] References a Role (kube-system/extension-apiserver-authentication-reader) which
       does not exist.
  · kube-system/metrics-server-auth-reader.........................................................😱
    😱 [POP-1300] References a Role (kube-system/extension-apiserver-authentication-reader) which
       does not exist.
  · kube-system/policy-pod-agent...................................................................✅


SECRETS (277 SCANNED)                                                     💥 0 😱 0 🔊 254 ✅ 23 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/dapr-operator-token-tdnb4..........................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · dapr-system/dapr-sidecar-injector-cert.........................................................✅
  · dapr-system/dapr-trust-bundle..................................................................✅
(...)


SERVICES (35 SCANNED)                                                        💥 3 😱 17 🔊 7 ✅ 8 42٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/dapr-api...........................................................................🔊
    🔊 [POP-1102] Use of target port #6500 for service port TCP::80. Prefer named port.
  · dapr-system/dapr-dashboard.....................................................................😱
    🔊 [POP-1102] Use of target port #8080 for service port TCP::8080. Prefer named port.
    😱 [POP-1109] Only one Pod associated with this endpoint.
  · dapr-system/dapr-placement-server..............................................................🔊
    🔊 [POP-1102] Use of target port #50005 for service port TCP:api:50005. Prefer named port.
    🔊 [POP-1102] Use of target port #8201 for service port TCP:raft-node:8201. Prefer named port.
  · dapr-system/dapr-sentry........................................................................🔊
    🔊 [POP-1102] Use of target port #50001 for service port TCP::80. Prefer named port.
  · dapr-system/dapr-sidecar-injector..............................................................✅
  · dapr-system/dapr-webhook.......................................................................💥
    💥 [POP-1106] No target ports match service port TCP::443.
  · default/dapr-eventgrid-func-dapr...............................................................💥
    💥 [POP-1100] No pods match service selector.
    💥 [POP-1105] No associated endpoints.
  · default/kubernetes.............................................................................✅
(...)


SERVICEACCOUNTS (38 SCANNED)                                                 💥 0 😱 1 🔊 8 ✅ 29 97٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/dapr-operator......................................................................✅
  · dapr-system/dashboard-reader...................................................................✅
  · dapr-system/default............................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · default/default................................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · domain-event-emitter/default...................................................................✅
  · domain-start/default...........................................................................✅
  · gatekeeper-system/default......................................................................🔊
    🔊 [POP-400] Used? Unable to locate resource reference.
  · gatekeeper-system/gatekeeper-admin.............................................................✅
(...)


STATEFULSETS (2 SCANNED)                                                     💥 0 😱 0 🔊 0 ✅ 2 100٪
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · dapr-system/dapr-placement-server..............................................................✅
  · logstash/logstash-logstash.....................................................................✅


SUMMARY
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
Your cluster score: 82 -- B
                                                                                o          .-'&lt;/span&gt;-.
                                                                                 o     __| B    &lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
                                                                                  o   &lt;span class="sb"&gt;`&lt;/span&gt;-,-&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nt"&gt;--&lt;/span&gt;._   &lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
                                                                                 &lt;span class="o"&gt;[]&lt;/span&gt;  .-&amp;gt;&lt;span class="s1"&gt;'  a     `|-'&lt;/span&gt;
                                                                                  &lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/ &lt;span class="o"&gt;(&lt;/span&gt;__/_       /
                                                                                    &lt;span class="se"&gt;\_&lt;/span&gt;,    &lt;span class="sb"&gt;`&lt;/span&gt;    _&lt;span class="o"&gt;)&lt;/span&gt;
                                                                                       &lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nt"&gt;----&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Popeye has a somewhat different scope than Kube-bench: it scans your cluster for adherence to best practices and for potential issues. It is not a security scanner like Kube-bench, but it can surface potential security issues and support your day-to-day cluster management.&lt;/p&gt;

&lt;p&gt;It targets the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;nodes&lt;/li&gt;
&lt;li&gt;namespaces&lt;/li&gt;
&lt;li&gt;pods&lt;/li&gt;
&lt;li&gt;services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Its main focus is finding misconfigurations such as port mismatches, dead or unused resources, missing resource requests/limits, and so on. You can find the full list of available sanitizers &lt;a href="https://github.com/derailed/popeye#sanitizers" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Potential use cases
&lt;/h3&gt;

&lt;p&gt;Should you choose to run it in a pipeline or as a scheduled job, you can build on it and derive action items from the errors in the generated report.&lt;/p&gt;

&lt;p&gt;Alternatively, it can simply be a handy little tool to run against your cluster to get a quick overview of its state, from which you can extract action items for yourself. In that case, you can save the report using the &lt;code&gt;--save&lt;/code&gt; flag and attach it to your JIRA ticket or PR.&lt;/p&gt;

&lt;p&gt;This will save the report and print out its location:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; popeye &lt;span class="nt"&gt;--save&lt;/span&gt;                                                                                                                    
/var/folders/vp/l8dlq0gn3x71f3vk82shmzlm0000gn/T/popeye/sanitizer_aks-staging_1673563387070145000.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
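If you run Popeye in a pipeline, you could go a step further and gate the build on the overall cluster score. The sketch below is illustrative, not from the original post: the JSON field names are assumptions about the &lt;code&gt;-o json&lt;/code&gt; report shape (verify them against your Popeye version), and a canned report stands in for a real scan so the gating logic is self-contained.

```shell
#!/bin/sh
# In a real pipeline you would generate the report first, e.g.:
#   popeye -o json --save
# Here a canned report stands in so the gating logic runs on its own.
# NOTE: the "score" field name and JSON shape are assumptions; verify them
# against the JSON your Popeye version actually emits.
printf '%s\n' '{"popeye": {"score": 82, "grade": "B"}}' > popeye-report.json

THRESHOLD=80

# Extract the numeric score without requiring jq.
score=$(sed -n 's/.*"score": *\([0-9][0-9]*\).*/\1/p' popeye-report.json)
echo "Cluster score: ${score}"

if [ "${score}" -lt "${THRESHOLD}" ]; then
  echo "Score ${score} is below threshold ${THRESHOLD}; failing the build."
  exit 1
fi
echo "Score ${score} meets threshold ${THRESHOLD}."
```

With a gate like this, a drop in the cluster score fails the pipeline and forces the team to look at the report before merging.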



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;I quite enjoyed using both Popeye and Kube-bench. They are useful in different ways: Popeye is more of a cluster management tool, while Kube-bench is more of a security scanner, but both can be used to improve the security of your cluster.&lt;/p&gt;

&lt;p&gt;In the next posts in this series, we will look at the reports in more detail, focusing on the errors and seeing what we can do to improve our cluster score.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Until then, thank you for reading, and I hope you found this useful. Let me know what other tools you use!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>watercooler</category>
    </item>
    <item>
      <title>Terraform vs. Helm for managing K8s objects</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Wed, 10 Aug 2022 09:02:57 +0000</pubDate>
      <link>https://dev.to/the_cozma/terraform-vs-helm-for-managing-k8s-objects-3bgh</link>
      <guid>https://dev.to/the_cozma/terraform-vs-helm-for-managing-k8s-objects-3bgh</guid>
      <description>&lt;p&gt;When I started migrating to Kubernetes (K8s) I discovered that I can use Terraform for managing not only the infrastructure, but also I could define the K8s objects in it, but I also could use Helm to handle that. But what would be a good way to handle this?&lt;/p&gt;

&lt;p&gt;In this post, we will cover Terraform and Helm for managing Kubernetes objects, with some code snippets and an idea of how you can use them together to get started.&lt;/p&gt;

&lt;h3&gt;
  
  
  Structure of the post:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;What is Terraform?&lt;/li&gt;
&lt;li&gt;Manage Kubernetes Resources via Terraform&lt;/li&gt;
&lt;li&gt;What is Helm?&lt;/li&gt;
&lt;li&gt;Manage Kubernetes Resources via Helm&lt;/li&gt;
&lt;li&gt;Using Helm and Terraform Together&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What is Terraform?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;HashiCorp Terraform&lt;/a&gt; is an infrastructure as code tool that lets you define both cloud and on-prem resources in human-readable configuration files that you can version, reuse, and share. &lt;/p&gt;

&lt;p&gt;It can manage low-level components like compute, storage, and networking resources, as well as high-level components like DNS entries and SaaS features.&lt;/p&gt;

&lt;p&gt;Terraform treats infrastructure as code (IaC), meaning teams manage their infrastructure setup with configuration files instead of a graphical user interface (think of the Azure Portal and the like).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;So why would you use Terraform?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Some of the &lt;em&gt;benefits&lt;/em&gt; include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;It allows teams to build, change, and manage the infrastructure in a &lt;strong&gt;safe, consistent, and repeatable way&lt;/strong&gt; by defining resource configurations that can be versioned, reused, and shared.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It supports all major cloud &lt;strong&gt;providers&lt;/strong&gt;: Azure, AWS, GCP, and many others, which you can find &lt;a href="https://registry.terraform.io/browse/providers" rel="noopener noreferrer"&gt;by browsing their registry&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Providers define individual units of infrastructure, for example compute instances or private networks, as resources. You can compose resources from different providers into reusable Terraform configurations called modules, and manage them with a consistent language and workflow.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You define your providers in the Terraform code as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;helm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2.5.1"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;azuread&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/azuread"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2.23.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;azurerm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/azurerm"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Its configuration language is &lt;strong&gt;declarative&lt;/strong&gt;: &lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Meaning that it describes the desired &lt;strong&gt;end-state for your infrastructure&lt;/strong&gt;, in contrast to procedural programming languages that require step-by-step instructions to perform tasks. Terraform providers automatically calculate dependencies between resources to create or destroy them in the correct order.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This means that any new team member joining will be able to understand the infrastructure setup you have just by going through the configuration files.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Its &lt;strong&gt;state&lt;/strong&gt; allows you to track resource changes throughout your deployments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All configurations are subject to &lt;strong&gt;version control&lt;/strong&gt; to safely collaborate on infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can read more on the advantages it brings by looking over the &lt;a href="https://www.terraform.io/use-cases/infrastructure-as-code" rel="noopener noreferrer"&gt;official use cases&lt;/a&gt; from the Terraform documentation.&lt;/p&gt;
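To make the declarative model concrete, here is a small illustrative sketch (not from the original post): you declare only the desired end-state, and Terraform works out whether it needs to create, update, or do nothing. It assumes the &lt;code&gt;hashicorp/kubernetes&lt;/code&gt; provider is configured.

```hcl
# Desired end-state: a namespace named "staging" exists with this label.
# There are no imperative steps; Terraform computes the create/update/no-op
# plan by comparing this declaration against its state and the cluster.
resource "kubernetes_namespace" "staging" {
  metadata {
    name = "staging"

    labels = {
      environment = "staging"
    }
  }
}
```

Running &lt;code&gt;terraform plan&lt;/code&gt; against this shows exactly which operations Terraform would perform to reach that state.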

&lt;p&gt;&lt;em&gt;Terraform - How does it work?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Terraform follows a simple &lt;em&gt;workflow&lt;/em&gt; for managing your infrastructure. Once a new resource or a change to existing resources is desired, the team will:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Initialize&lt;/strong&gt; the backend by running the &lt;code&gt;terraform init&lt;/code&gt; command, which will install the plugins Terraform needs to manage the infrastructure.&lt;/p&gt;

&lt;p&gt;The output will look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;modules&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="err"&gt;(...)&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;module1&lt;/span&gt; &lt;span class="nx"&gt;in&lt;/span&gt; &lt;span class="err"&gt;../../&lt;/span&gt;&lt;span class="nx"&gt;modules&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;module1&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;module2&lt;/span&gt; &lt;span class="nx"&gt;in&lt;/span&gt; &lt;span class="err"&gt;../../&lt;/span&gt;&lt;span class="nx"&gt;modules&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;module2&lt;/span&gt;

&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;

&lt;span class="nx"&gt;Successfully&lt;/span&gt; &lt;span class="nx"&gt;configured&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="s2"&gt;"azurerm"&lt;/span&gt;&lt;span class="err"&gt;!&lt;/span&gt; &lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;will&lt;/span&gt; &lt;span class="nx"&gt;automatically&lt;/span&gt;
&lt;span class="nx"&gt;use&lt;/span&gt; &lt;span class="nx"&gt;this&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="nx"&gt;unless&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="nx"&gt;configuration&lt;/span&gt; &lt;span class="nx"&gt;changes&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;

&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="nx"&gt;plugins&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Reusing&lt;/span&gt; &lt;span class="nx"&gt;previous&lt;/span&gt; &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="nx"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt; &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;dependency&lt;/span&gt; &lt;span class="nx"&gt;lock&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Reusing&lt;/span&gt; &lt;span class="nx"&gt;previous&lt;/span&gt; &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="nx"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;azuread&lt;/span&gt; &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;dependency&lt;/span&gt; &lt;span class="nx"&gt;lock&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Reusing&lt;/span&gt; &lt;span class="nx"&gt;previous&lt;/span&gt; &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="nx"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;azurerm&lt;/span&gt; &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;dependency&lt;/span&gt; &lt;span class="nx"&gt;lock&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Reusing&lt;/span&gt; &lt;span class="nx"&gt;previous&lt;/span&gt; &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="nx"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;helm&lt;/span&gt; &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;dependency&lt;/span&gt; &lt;span class="nx"&gt;lock&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Installing&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;helm&lt;/span&gt; &lt;span class="nx"&gt;v2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;5.1&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Installed&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;helm&lt;/span&gt; &lt;span class="nx"&gt;v2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;5.1&lt;/span&gt; &lt;span class="err"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;signed&lt;/span&gt; &lt;span class="nx"&gt;by&lt;/span&gt; &lt;span class="nx"&gt;HashiCorp&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Installing&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt; &lt;span class="nx"&gt;v2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Installed&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt; &lt;span class="nx"&gt;v2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt; &lt;span class="err"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;signed&lt;/span&gt; &lt;span class="nx"&gt;by&lt;/span&gt; &lt;span class="nx"&gt;HashiCorp&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Installing&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;azuread&lt;/span&gt; &lt;span class="nx"&gt;v2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;23.0&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Installed&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;azuread&lt;/span&gt; &lt;span class="nx"&gt;v2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;23.0&lt;/span&gt; &lt;span class="err"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;signed&lt;/span&gt; &lt;span class="nx"&gt;by&lt;/span&gt; &lt;span class="nx"&gt;HashiCorp&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Installing&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;azurerm&lt;/span&gt; &lt;span class="nx"&gt;v3&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Installed&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;azurerm&lt;/span&gt; &lt;span class="nx"&gt;v3&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt; &lt;span class="err"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;signed&lt;/span&gt; &lt;span class="nx"&gt;by&lt;/span&gt; &lt;span class="nx"&gt;HashiCorp&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;

&lt;span class="nx"&gt;Partner&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="nx"&gt;community&lt;/span&gt; &lt;span class="nx"&gt;providers&lt;/span&gt; &lt;span class="nx"&gt;are&lt;/span&gt; &lt;span class="nx"&gt;signed&lt;/span&gt; &lt;span class="nx"&gt;by&lt;/span&gt; &lt;span class="nx"&gt;their&lt;/span&gt; &lt;span class="nx"&gt;developers&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;
&lt;span class="nx"&gt;If&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt;&lt;span class="s1"&gt;'d like to know more about provider signing, you can read about it here:
https://www.terraform.io/docs/cli/plugins/signing.html

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the example above, Terraform initializes the backend (in our case, the state file is stored in a container in Azure) and the providers (azurerm, azuread, helm, kubernetes), and reports successful completion.&lt;/p&gt;
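&lt;p&gt;As a rough sketch, storing the state file in an Azure storage container can be configured with the &lt;code&gt;azurerm&lt;/code&gt; backend. The resource group, storage account, and container names below are placeholders, not values from this project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;terraform {
  backend "azurerm" {
    # Placeholder names - replace with your own storage details
    resource_group_name  = "rg-tfstate"
    storage_account_name = "sttfstate"
    container_name       = "tfstate"
    key                  = "example.terraform.tfstate"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;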

&lt;p&gt;&lt;strong&gt;Plan&lt;/strong&gt; the changes to be made by running the &lt;code&gt;terraform plan&lt;/code&gt; command, which shows a preview of the 'planned' changes Terraform will make to match your configuration.&lt;/p&gt;

&lt;p&gt;During the plan, Terraform will mark which resources will be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;added, marked with a &lt;code&gt;+&lt;/code&gt; sign,&lt;/li&gt;
&lt;li&gt;updated, marked with a &lt;code&gt;~&lt;/code&gt; sign, or&lt;/li&gt;
&lt;li&gt;deleted, marked with a &lt;code&gt;-&lt;/code&gt; sign.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this example, Terraform is creating a brand new resource; as you can see, all attributes are marked with the &lt;code&gt;+&lt;/code&gt; sign:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;  &lt;span class="c1"&gt;# module.monitoring.helm_release.prometheus_agent[0] will be created&lt;/span&gt;
  &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"helm_release"&lt;/span&gt; &lt;span class="s2"&gt;"prometheus_agent"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;atomic&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;chart&lt;/span&gt;                      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"../../charts/prometheus-agent"&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;cleanup_on_fail&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;create_namespace&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;dependency_update&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;disable_crd_hooks&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;disable_openapi_validation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;disable_webhooks&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
      &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;force_update&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But in this one, the resource already exists, and Terraform is only updating it in place, which is marked with the &lt;code&gt;~&lt;/code&gt; sign:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;used&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;selected&lt;/span&gt; &lt;span class="nx"&gt;providers&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;generate&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;following&lt;/span&gt; &lt;span class="nx"&gt;execution&lt;/span&gt; &lt;span class="nx"&gt;plan&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;Resource&lt;/span&gt; &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="nx"&gt;are&lt;/span&gt; &lt;span class="nx"&gt;indicated&lt;/span&gt; &lt;span class="nx"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;following&lt;/span&gt; &lt;span class="nx"&gt;symbols&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;create&lt;/span&gt;
  &lt;span class="err"&gt;~&lt;/span&gt; &lt;span class="nx"&gt;update&lt;/span&gt; &lt;span class="nx"&gt;in-place&lt;/span&gt;

&lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;will&lt;/span&gt; &lt;span class="nx"&gt;perform&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;following&lt;/span&gt; &lt;span class="nx"&gt;actions&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# module.grafana.grafana_organization.org will be updated in-place&lt;/span&gt;
  &lt;span class="err"&gt;~&lt;/span&gt; &lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"grafana_organization"&lt;/span&gt; &lt;span class="s2"&gt;"org"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="err"&gt;~&lt;/span&gt; &lt;span class="nx"&gt;admins&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;"new_admin@grafana.admin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nx"&gt;id&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1"&lt;/span&gt;
        &lt;span class="nx"&gt;name&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"MyOrg"&lt;/span&gt;
        &lt;span class="c1"&gt;# (5 unchanged attributes hidden)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Terraform keeps track of your real infrastructure in a &lt;em&gt;state&lt;/em&gt; file, which acts as a source of truth for your environment: Terraform uses this file to determine the changes needed to make your infrastructure match your configuration.&lt;/p&gt;

&lt;p&gt;This is also helpful to detect infrastructure drift between the desired state and the current one.&lt;/p&gt;
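&lt;p&gt;One way to surface drift explicitly (assuming Terraform v0.15.4 or later) is a refresh-only plan, which compares the state file against the real infrastructure without proposing configuration changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;terraform plan -refresh-only
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;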

&lt;p&gt;&lt;strong&gt;Apply&lt;/strong&gt; the desired changes by running the &lt;code&gt;terraform apply&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;A view from the Terraform official documentation:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwsslgd6zgbsqvz0uwmin.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwsslgd6zgbsqvz0uwmin.png" alt=" " width="800" height="290"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So this is Terraform in a nutshell. &lt;/p&gt;
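&lt;p&gt;Condensed, the whole workflow comes down to three commands. Saving the plan with &lt;code&gt;-out&lt;/code&gt; is optional, but it guarantees that &lt;code&gt;apply&lt;/code&gt; executes exactly the plan you reviewed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;terraform init
terraform plan -out=tfplan
terraform apply tfplan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;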

&lt;h2&gt;
  
  
  Manage Kubernetes Resources via Terraform
&lt;/h2&gt;

&lt;p&gt;Terraform’s Kubernetes (K8S) provider is used to interact with the resources supported by Kubernetes and offers many benefits, but it's important to note that the provider is still relatively new. This means some resources might not yet be available in the provider, or there might be open bugs.&lt;/p&gt;

&lt;p&gt;That said, what would this look like?&lt;br&gt;
Let's look at an example of an AKS cluster defined in Terraform:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_kubernetes_cluster"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"aks-${var.prefix}-${var.env}"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;dns_prefix&lt;/span&gt;                        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.prefix}-${var.env}"&lt;/span&gt;
  &lt;span class="nx"&gt;role_based_access_control_enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ad_admin_group&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt; &lt;span class="err"&gt;?&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;kubernetes_version&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes_version&lt;/span&gt;

  &lt;span class="nx"&gt;dynamic&lt;/span&gt; &lt;span class="s2"&gt;"azure_active_directory_role_based_access_control"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ad_admin_group&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt; &lt;span class="err"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;admin_group_object_ids&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ad_admin_group&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="nx"&gt;azure_rbac_enabled&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;dynamic&lt;/span&gt; &lt;span class="s2"&gt;"oms_agent"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;oms_agent_enabled&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="err"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;log_analytics_workspace_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log_analytics_workspace_id&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;azure_policy_enabled&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;azure_policy_enabled&lt;/span&gt;
  &lt;span class="nx"&gt;http_application_routing_enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;api_server_authorized_ip_ranges&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;api_server_authorized_ip_ranges&lt;/span&gt;

  &lt;span class="nx"&gt;default_node_pool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"default"&lt;/span&gt;
    &lt;span class="nx"&gt;enable_auto_scaling&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;enable_auto_scaling&lt;/span&gt;
    &lt;span class="nx"&gt;max_count&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;enable_auto_scaling&lt;/span&gt; &lt;span class="err"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;max_count&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
    &lt;span class="nx"&gt;min_count&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;enable_auto_scaling&lt;/span&gt; &lt;span class="err"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;min_count&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
    &lt;span class="nx"&gt;node_count&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;node_count&lt;/span&gt;
    &lt;span class="nx"&gt;type&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"VirtualMachineScaleSets"&lt;/span&gt;
    &lt;span class="nx"&gt;vm_size&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;node_size&lt;/span&gt;
    &lt;span class="nx"&gt;tags&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;
    &lt;span class="nx"&gt;orchestrator_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;node_pool_orchestrator_version&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;service_principal&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;client_id&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azuread_application&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;application_id&lt;/span&gt;
    &lt;span class="nx"&gt;client_secret&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azuread_service_principal_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once you have defined the resources, run &lt;code&gt;terraform apply&lt;/code&gt; and Terraform will provision your infrastructure.&lt;/p&gt;
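&lt;p&gt;For the Kubernetes provider to talk to this cluster, it can be wired to the credentials exported by the AKS resource. A minimal sketch, assuming the &lt;code&gt;azurerm_kubernetes_cluster.main&lt;/code&gt; resource from above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;provider "kubernetes" {
  # Read the cluster credentials from the AKS resource's kube_config output
  host                   = azurerm_kubernetes_cluster.main.kube_config.0.host
  client_certificate     = base64decode(azurerm_kubernetes_cluster.main.kube_config.0.client_certificate)
  client_key             = base64decode(azurerm_kubernetes_cluster.main.kube_config.0.client_key)
  cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.main.kube_config.0.cluster_ca_certificate)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;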

&lt;p&gt;For the &lt;strong&gt;&lt;em&gt;deployment&lt;/em&gt;&lt;/strong&gt;, we create a separate Terraform file using the &lt;code&gt;kubernetes_deployment&lt;/code&gt; resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"kubernetes_deployment"&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt;
    &lt;span class="nx"&gt;labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;App&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Example"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;spec&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;replicas&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="nx"&gt;selector&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;match_labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;App&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Example"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;App&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Example"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;spec&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;container&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;image&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"nginx:1.7.8"&lt;/span&gt;
          &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt;

          &lt;span class="nx"&gt;port&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;container_port&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;

          &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;limits&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="nx"&gt;cpu&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0.5"&lt;/span&gt;
              &lt;span class="nx"&gt;memory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"512Mi"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="nx"&gt;requests&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="nx"&gt;cpu&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"250m"&lt;/span&gt;
              &lt;span class="nx"&gt;memory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"50Mi"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To create this, run &lt;code&gt;terraform apply&lt;/code&gt; again and confirm the changes.&lt;/p&gt;

&lt;p&gt;The same approach applies to creating a &lt;strong&gt;&lt;em&gt;service&lt;/em&gt;&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"kubernetes_service"&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;spec&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;selector&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;App&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;kubernetes_deployment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;App&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;port&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;port&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;
      &lt;span class="nx"&gt;target_port&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LoadBalancer"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to scale this setup, the approach is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; Make the changes to the replica count
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;  &lt;span class="nx"&gt;spec&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;replicas&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="nx"&gt;selector&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;match_labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;App&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Example"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Apply the Terraform code and confirm the changes&lt;/li&gt;
&lt;/ul&gt;
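
&lt;p&gt;That workflow from the command line could look like this (a minimal sketch, assuming the Terraform code lives in the current directory):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Preview the change: the plan should show the replica count being updated
terraform plan

# Apply the change after reviewing the plan
terraform apply
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;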

&lt;p&gt;Aside from this, we can also define the &lt;code&gt;kubernetes_namespace&lt;/code&gt; resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"kubernetes_namespace"&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;annotations&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And any secrets you might need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"kubernetes_secret"&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt;
    &lt;span class="nx"&gt;namespace&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"example"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"some_setting"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"false"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using this approach means you can take &lt;em&gt;advantage&lt;/em&gt; of Terraform's benefits, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you use one tool for managing your infrastructure resources as well as your cluster&lt;/li&gt;
&lt;li&gt;you use one language for all your infrastructure resources and the K8s objects&lt;/li&gt;
&lt;li&gt;you can review the plan of your changes before provisioning resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;em&gt;disadvantages&lt;/em&gt; are, of course, that you need to be familiar with the HCL language, and if the team is new to it, adoption takes a bit of time.&lt;/p&gt;

&lt;p&gt;The K8s Terraform provider might not fully support all the beta objects, so you might need to wait.&lt;/p&gt;

&lt;p&gt;If you are interested in provisioning a cluster and all the K8s objects via Terraform, please check the &lt;a href="https://learn.hashicorp.com/tutorials/terraform/kubernetes-provider" rel="noopener noreferrer"&gt;official documentation&lt;/a&gt; for step-by-step instructions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Helm?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://helm.sh/" rel="noopener noreferrer"&gt;Helm&lt;/a&gt; is a package manager tool that helps you manage Kubernetes applications. Helm makes use of &lt;strong&gt;Helm Charts&lt;/strong&gt; to define, install, and upgrade Kubernetes application.&lt;/p&gt;

&lt;p&gt;Let's look over some terminology when working with Helm: &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Helm:&lt;/em&gt; is the command-line interface that helps you define, install, and upgrade your Kubernetes application using charts.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Charts:&lt;/em&gt; are the format for Helm’s application package. The &lt;strong&gt;chart&lt;/strong&gt; is a bundle of information necessary to create an instance of a Kubernetes application: essentially a package of files and templates that gets converted into Kubernetes objects at deployment time.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Chart repository:&lt;/em&gt; is the location where you can store and share packaged charts.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The config:&lt;/em&gt; contains configuration information that can be merged into a packaged chart to create a releasable object.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Release:&lt;/em&gt; is a running instance of a chart, combined with a specific config. It is created by Helm to track the installation of the charts you created/defined.&lt;/p&gt;

&lt;p&gt;For more details, see the &lt;a href="https://helm.sh/docs/topics/architecture/" rel="noopener noreferrer"&gt;Helm architecture&lt;/a&gt; documentation.&lt;/p&gt;

&lt;p&gt;Some of the &lt;em&gt;benefits&lt;/em&gt; Helm brings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Charts are in YAML format and are &lt;strong&gt;reusable&lt;/strong&gt; because they provide repeatable application installation. Because of this you can use them in multiple environments (think dev, staging, and prod all using the same chart).&lt;/li&gt;
&lt;li&gt;Because charts provide a repeatable process, deployments become easier.&lt;/li&gt;
&lt;li&gt;A lot of charts are &lt;strong&gt;already &lt;a href="https://artifacthub.io/" rel="noopener noreferrer"&gt;available&lt;/a&gt;&lt;/strong&gt;, but you can create your own as well - &lt;strong&gt;custom&lt;/strong&gt; charts.&lt;/li&gt;
&lt;li&gt;You can create &lt;strong&gt;dependencies&lt;/strong&gt; between the charts and can also use &lt;a href="https://helm.sh/docs/chart_template_guide/subcharts_and_globals/" rel="noopener noreferrer"&gt;&lt;strong&gt;sub-charts&lt;/strong&gt;&lt;/a&gt; to add more flexibility to your setup.&lt;/li&gt;
&lt;li&gt;Charts serve as a single point of authority.&lt;/li&gt;
&lt;li&gt;Releases are tracked.&lt;/li&gt;
&lt;li&gt;You can upgrade or rollback multiple K8s objects together.&lt;/li&gt;
&lt;li&gt;Charts can be easily installed/uninstalled.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Manage Kubernetes Resources via Helm
&lt;/h2&gt;

&lt;p&gt;We looked over the main terminology when using Helm, but let's see what this looks like in practice.&lt;/p&gt;

&lt;p&gt;The first thing we need to do, in the repository where we have our code, is run the &lt;code&gt;helm create &amp;lt;app_name&amp;gt;&lt;/code&gt; command. This command creates a chart directory along with the common files and directories used in a chart. &lt;a href="https://helm.sh/docs/helm/helm_create/" rel="noopener noreferrer"&gt;More information on the command&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And this will create a structure as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.
└── example
    ├── Chart.yaml
    ├── charts
    ├── templates
    │   ├── NOTES.txt
    │   ├── _helpers.tpl
    │   ├── deployment.yaml
    │   ├── hpa.yaml
    │   ├── ingress.yaml
    │   ├── service.yaml
    │   ├── serviceaccount.yaml
    │   └── tests
    │       └── test-connection.yaml
    └── values.yaml

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We notice it created:&lt;br&gt;
A &lt;em&gt;Chart.yaml&lt;/em&gt; file, which contains the metadata about the chart.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v2&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;example&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;A Helm chart for Kubernetes&lt;/span&gt;

&lt;span class="c1"&gt;# A chart can be either an 'application' or a 'library' chart.&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Application charts are a collection of templates that can be packaged into versioned archives&lt;/span&gt;
&lt;span class="c1"&gt;# to be deployed.&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Library charts provide useful utilities or functions for the chart developer. They're included as&lt;/span&gt;
&lt;span class="c1"&gt;# a dependency of application charts to inject those utilities and functions into the rendering&lt;/span&gt;
&lt;span class="c1"&gt;# pipeline. Library charts do not define any templates and therefore cannot be deployed.&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application&lt;/span&gt;

&lt;span class="c1"&gt;# This is the chart version. This version number should be incremented each time you make changes&lt;/span&gt;
&lt;span class="c1"&gt;# to the chart and its templates, including the app version.&lt;/span&gt;
&lt;span class="c1"&gt;# Versions are expected to follow Semantic Versioning (https://semver.org/)&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.1.0&lt;/span&gt;

&lt;span class="c1"&gt;# This is the version number of the application being deployed. This version number should be&lt;/span&gt;
&lt;span class="c1"&gt;# incremented each time you make changes to the application. Versions are not expected to&lt;/span&gt;
&lt;span class="c1"&gt;# follow Semantic Versioning. They should reflect the version the application is using.&lt;/span&gt;
&lt;span class="c1"&gt;# It is recommended to use it with quotes.&lt;/span&gt;
&lt;span class="na"&gt;appVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.16.0"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;em&gt;charts&lt;/em&gt; directory where you can add any charts that your chart depends on.&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;templates&lt;/em&gt; directory:&lt;br&gt;
A directory that holds the template files, along with partials and helpers. The file called &lt;strong&gt;_helpers.tpl&lt;/strong&gt; is the default location for template partials that the rest of the YAML files rely on, as we will see.&lt;/p&gt;

&lt;p&gt;How does this work?&lt;/p&gt;

&lt;p&gt;Let's take a small example: in that file we define the fullname:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;/*&lt;/span&gt;
&lt;span class="nv"&gt;Create a default fully qualified app name.&lt;/span&gt;
&lt;span class="nv"&gt;We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).&lt;/span&gt;
&lt;span class="nv"&gt;If release name contains chart name it will be used as a full name.&lt;/span&gt;
&lt;span class="nv"&gt;*/&lt;/span&gt;&lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- define "example.fullname" -&lt;/span&gt;&lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- if .Values.fullnameOverride&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- .Values.fullnameOverride | trunc 63 | trimSuffix "-"&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- else&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- $name&lt;/span&gt; &lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;= default .Chart.Name .Values.nameOverride&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- if contains $name .Release.Name&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- .Release.Name | trunc 63 | trimSuffix "-"&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- else&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-"&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- end&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- end&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- end&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And in our &lt;code&gt;service.yaml&lt;/code&gt; file we can include the defined fullname like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;include "example.fullname" .&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- include "example.labels" . | nindent 4&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;.Values.service.type&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;.Values.service.port&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
      &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http&lt;/span&gt;
      &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt;- include "example.selectorLabels" . | nindent 4&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And in the &lt;code&gt;values.yaml&lt;/code&gt; file we can also override it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Default values for example.&lt;/span&gt;
&lt;span class="c1"&gt;# This is a YAML-formatted file.&lt;/span&gt;
&lt;span class="c1"&gt;# Declare variables to be passed into your templates.&lt;/span&gt;

&lt;span class="na"&gt;replicaCount&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;

&lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;repository&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
  &lt;span class="na"&gt;pullPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;IfNotPresent&lt;/span&gt;
  &lt;span class="c1"&gt;# Overrides the image tag whose default is the chart appVersion.&lt;/span&gt;
  &lt;span class="na"&gt;tag&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;

&lt;span class="na"&gt;imagePullSecrets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[]&lt;/span&gt;
&lt;span class="na"&gt;nameOverride&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
&lt;span class="na"&gt;fullnameOverride&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;

&lt;span class="na"&gt;serviceAccount&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# Specifies whether a service account should be created&lt;/span&gt;
  &lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="c1"&gt;# Annotations to add to the service account&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
  &lt;span class="c1"&gt;# The name of the service account to use.&lt;/span&gt;
  &lt;span class="c1"&gt;# If not set and create is true, a name is generated using the fullname template&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;

&lt;span class="na"&gt;podAnnotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;

&lt;span class="na"&gt;podSecurityContext&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
  &lt;span class="c1"&gt;# fsGroup: 2000&lt;/span&gt;

&lt;span class="na"&gt;securityContext&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
  &lt;span class="c1"&gt;# capabilities:&lt;/span&gt;
  &lt;span class="c1"&gt;#   drop:&lt;/span&gt;
  &lt;span class="c1"&gt;#   - ALL&lt;/span&gt;
  &lt;span class="c1"&gt;# readOnlyRootFilesystem: true&lt;/span&gt;
  &lt;span class="c1"&gt;# runAsNonRoot: true&lt;/span&gt;
  &lt;span class="c1"&gt;# runAsUser: 1000&lt;/span&gt;

&lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterIP&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;span class="s"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the example above, since we didn't set &lt;code&gt;fullnameOverride&lt;/code&gt; in the &lt;code&gt;values.yaml&lt;/code&gt; file, the fullname falls back to the chart name (combined with the release name), exactly as defined in the &lt;code&gt;_helpers.tpl&lt;/code&gt; file.&lt;/p&gt;
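
&lt;p&gt;If you want to check how the helper resolves, you can render the chart locally without installing anything (a small sketch; the release name &lt;code&gt;my-release&lt;/code&gt; and the chart path &lt;code&gt;./example&lt;/code&gt; are placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Render only the Service template and inspect the generated name
helm template my-release ./example --show-only templates/service.yaml

# Try the override without touching values.yaml
helm template my-release ./example --set fullnameOverride=custom-name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;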

&lt;p&gt;The &lt;em&gt;values.yaml&lt;/em&gt; file contains the default values for your templates. At this point you can split the values file per environment: one for staging, one for production, and so on.&lt;/p&gt;

&lt;p&gt;From here onwards you can start configuring based on what you need. If you check the YAML files you will notice the structure is very similar to what we defined in the Terraform code for the K8s objects.&lt;/p&gt;

&lt;p&gt;For the installation of the charts you can either run the &lt;code&gt;helm install/upgrade&lt;/code&gt; commands one by one OR add this step to your CI/CD pipelines.&lt;/p&gt;

&lt;p&gt;Going by the command line could look something like:&lt;br&gt;
&lt;code&gt;helm upgrade example infra/charts/example --install --wait --atomic --namespace=example --set=app.name=example --values=infra/charts/example/values.yaml&lt;/code&gt;&lt;br&gt;
where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;infra/charts/example - is the location of your &lt;code&gt;Chart.yaml&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;values=infra/charts/example/values.yaml - is the location of the values file&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--wait&lt;/code&gt; - will wait until either the release is successful or the default timeout is reached (5m) if no timeout is specified&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--atomic&lt;/code&gt;- if set, upgrade process rolls back changes made in case of failed upgrade&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check the full synopsis of the command &lt;a href="https://helm.sh/docs/helm/helm_upgrade/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;helm install or helm upgrade --install?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;install&lt;/em&gt; sub-command always installs a brand new chart, while the &lt;em&gt;upgrade&lt;/em&gt; sub-command can upgrade an existing release or, with the &lt;code&gt;--install&lt;/code&gt; flag, install the chart if it hasn't been installed before.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For simplicity you can always use the upgrade sub-command.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Using Helm and Terraform Together
&lt;/h2&gt;

&lt;p&gt;Helm and Terraform are not mutually exclusive and can be used together in the same K8s setup. The actual setup really depends on your project's complexity, which benefits you want to make use of, and which drawbacks you can live with.&lt;/p&gt;

&lt;p&gt;In a potential setup where you would use both you could structure it something like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use Terraform to &lt;em&gt;create and manage resources&lt;/em&gt;: the K8s cluster, the K8s namespace, and the K8s secrets (if any)&lt;/li&gt;
&lt;li&gt;use Helm charts to &lt;em&gt;deploy&lt;/em&gt; your applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the setup we currently use and it has served us well so far.&lt;/p&gt;

&lt;p&gt;It is worth mentioning that you can also use Terraform to handle your Helm deploys using the &lt;code&gt;helm_release&lt;/code&gt; &lt;a href="https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release" rel="noopener noreferrer"&gt;resource&lt;/a&gt;. &lt;/p&gt;
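
&lt;p&gt;A minimal sketch of what that could look like (the chart path and namespace are placeholders; see the provider documentation for the full set of arguments):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;resource "helm_release" "example" {
  name      = "example"
  chart     = "./infra/charts/example"
  namespace = "example"

  values = [
    file("./infra/charts/example/values.yaml")
  ]

  # Equivalent of the --wait and --atomic CLI flags
  wait   = true
  atomic = true
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;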

&lt;p&gt;In this approach you would have both infrastructure and provisioning in one place: Terraform. I will not go into the differences between them in this post, but whether to go with this approach should depend on how &lt;em&gt;frequently&lt;/em&gt; you need to apply changes to your infrastructure, because the Helm release takes place during the &lt;code&gt;terraform apply&lt;/code&gt; operation.&lt;/p&gt;

&lt;p&gt;There is no one-size-fits-all approach; you should tailor the tooling and strategy to your needs.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Hope you find this helpful. Thank you for reading and feel free to comment on your experience and what you prefer to use and why.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>helm</category>
      <category>terraform</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Terraform: Alternative to the Template provider on Apple M1 MBP</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Fri, 05 Aug 2022 08:24:30 +0000</pubDate>
      <link>https://dev.to/the_cozma/terraform-alternative-to-the-template-provider-on-apple-m1-mbp-1f2l</link>
      <guid>https://dev.to/the_cozma/terraform-alternative-to-the-template-provider-on-apple-m1-mbp-1f2l</guid>
      <description>&lt;p&gt;We ran into an issue while applying our Terraform infrastructure on a M1 Mac where we were making use of the &lt;a href="https://github.com/hashicorp/terraform-provider-template" rel="noopener noreferrer"&gt;Terraform Provider Template&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When applying it, we were getting the following error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template v2.2.0 does not have a package available for your current platform, darwin_arm64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since the provider is archived, we need to find an alternative.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What does archiving mean?&lt;/em&gt;&lt;br&gt;
Per Terraform's &lt;a href="https://www.terraform.io/internals/archiving" rel="noopener noreferrer"&gt;Archiving Providers&lt;/a&gt; documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;The code repository and all commit, issue, and PR history will still be available.&lt;/li&gt;
&lt;li&gt;Existing released binaries will remain available on the releases site.&lt;/li&gt;
&lt;li&gt;Documentation for the provider will remain on the Terraform website.&lt;/li&gt;
&lt;li&gt;Issues and pull requests are not being monitored, merged, or added.&lt;/li&gt;
&lt;li&gt;No new releases will be published.&lt;/li&gt;
&lt;li&gt;Nightly acceptance tests may not be run.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;So what alternatives do we have instead of the deprecated provider?&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;Let's look at an example.&lt;/p&gt;
&lt;h2&gt;
  
  
  Resource using the deprecated Template provider
&lt;/h2&gt;

&lt;p&gt;Let's say we have the following resource: a Grafana dashboard JSON file that we store in our Terraform code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"template_file"&lt;/span&gt; &lt;span class="s2"&gt;"grafana_json"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"${path.module}/grafana_dashboard.json"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;vars&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;title&lt;/span&gt;                      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;monitoring_title&lt;/span&gt;
    &lt;span class="nx"&gt;monitoring_datasource_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;monitoring_datasource_name&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the grafana dashboard Terraform resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"grafana_dashboard"&lt;/span&gt; &lt;span class="s2"&gt;"metrics"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;config_json&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;template_file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;grafana_json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rendered&lt;/span&gt;
  &lt;span class="nx"&gt;folder&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;monitoring_folder&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you try to apply this piece of code, it will throw the aforementioned error.&lt;/p&gt;

&lt;h2&gt;
  
  
  Updating to the built-in &lt;code&gt;templatefile&lt;/code&gt; Terraform function
&lt;/h2&gt;

&lt;p&gt;We can make use of the built-in &lt;code&gt;templatefile&lt;/code&gt; &lt;a href="https://www.terraform.io/language/functions/templatefile" rel="noopener noreferrer"&gt;Terraform function&lt;/a&gt; that:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;&lt;code&gt;templatefile&lt;/code&gt; reads the file at the given path and renders its content as a template using a supplied set of template variables.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The function uses the format:&lt;br&gt;
&lt;code&gt;templatefile(path, vars)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;In our case the &lt;code&gt;path&lt;/code&gt; is the path to the Grafana JSON file.&lt;/p&gt;

&lt;p&gt;And &lt;code&gt;vars&lt;/code&gt; contains all the variables that we need to use for the Grafana dashboard JSON file.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The "vars" argument must be a map. Within the template file, each of the keys in the map is available as a variable for interpolation. The template may also use any other function available in the Terraform language, except that recursive calls to templatefile are not permitted. Variable names must each start with a letter, followed by zero or more letters, digits, or underscores.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And our new code looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"grafana_dashboard"&lt;/span&gt; &lt;span class="s2"&gt;"metrics"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;config_json&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;templatefile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"${path.module}/grafana_dashboard.json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;title&lt;/span&gt;                      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;monitoring_title&lt;/span&gt;
    &lt;span class="nx"&gt;monitoring_datasource_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;monitoring_datasource_name&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="nx"&gt;folder&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;monitoring_folder&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Thanks to the new &lt;code&gt;templatefile&lt;/code&gt; function, we can get rid of the &lt;code&gt;template_file&lt;/code&gt; data source.&lt;/p&gt;

&lt;p&gt;This means at this point we no longer rely on the hashicorp/template provider and we can apply our infrastructure changes.&lt;/p&gt;

&lt;p&gt;At this point you can apply the infrastructure changes, but it might still fail with the same error. This is because, if the infrastructure has been initialized and applied previously, a record of the deprecated provider is stored in the lock file.&lt;/p&gt;

&lt;h2&gt;
  
  
  The template provider is still in the &lt;code&gt;.terraform.lock.hcl&lt;/code&gt; file
&lt;/h2&gt;

&lt;p&gt;If you run &lt;code&gt;terraform init&lt;/code&gt; and still see that the template provider is being installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;modules&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;

&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;

&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="nx"&gt;plugins&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Reusing&lt;/span&gt; &lt;span class="nx"&gt;previous&lt;/span&gt; &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="nx"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt; &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;dependency&lt;/span&gt; &lt;span class="nx"&gt;lock&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;
&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Reusing&lt;/span&gt; &lt;span class="nx"&gt;previous&lt;/span&gt; &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="nx"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;dependency&lt;/span&gt; &lt;span class="nx"&gt;lock&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;
&lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Using&lt;/span&gt; &lt;span class="nx"&gt;previously-installed&lt;/span&gt; &lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="nx"&gt;v2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;

&lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;has&lt;/span&gt; &lt;span class="nx"&gt;been&lt;/span&gt; &lt;span class="nx"&gt;successfully&lt;/span&gt; &lt;span class="nx"&gt;initialized&lt;/span&gt;&lt;span class="err"&gt;!&lt;/span&gt;

&lt;span class="nx"&gt;You&lt;/span&gt; &lt;span class="nx"&gt;may&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="nx"&gt;begin&lt;/span&gt; &lt;span class="nx"&gt;working&lt;/span&gt; &lt;span class="nx"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;Terraform&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;Try&lt;/span&gt; &lt;span class="nx"&gt;running&lt;/span&gt; &lt;span class="s2"&gt;"terraform plan"&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;see&lt;/span&gt;
&lt;span class="nx"&gt;any&lt;/span&gt; &lt;span class="nx"&gt;changes&lt;/span&gt; &lt;span class="nx"&gt;that&lt;/span&gt; &lt;span class="nx"&gt;are&lt;/span&gt; &lt;span class="nx"&gt;required&lt;/span&gt; &lt;span class="nx"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;your&lt;/span&gt; &lt;span class="nx"&gt;infrastructure&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;All&lt;/span&gt; &lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;commands&lt;/span&gt;
&lt;span class="nx"&gt;should&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="nx"&gt;work&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;

&lt;span class="nx"&gt;If&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt; &lt;span class="nx"&gt;ever&lt;/span&gt; &lt;span class="nx"&gt;set&lt;/span&gt; &lt;span class="nx"&gt;or&lt;/span&gt; &lt;span class="nx"&gt;change&lt;/span&gt; &lt;span class="nx"&gt;modules&lt;/span&gt; &lt;span class="nx"&gt;or&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="nx"&gt;configuration&lt;/span&gt; &lt;span class="nx"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;Terraform&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;
&lt;span class="nx"&gt;rerun&lt;/span&gt; &lt;span class="nx"&gt;this&lt;/span&gt; &lt;span class="nx"&gt;command&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;reinitialize&lt;/span&gt; &lt;span class="nx"&gt;your&lt;/span&gt; &lt;span class="nx"&gt;working&lt;/span&gt; &lt;span class="nx"&gt;directory&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;If&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt; &lt;span class="nx"&gt;forget&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;other&lt;/span&gt;
&lt;span class="nx"&gt;commands&lt;/span&gt; &lt;span class="nx"&gt;will&lt;/span&gt; &lt;span class="nx"&gt;detect&lt;/span&gt; &lt;span class="nx"&gt;it&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="nx"&gt;remind&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;do&lt;/span&gt; &lt;span class="nx"&gt;so&lt;/span&gt; &lt;span class="nx"&gt;if&lt;/span&gt; &lt;span class="nx"&gt;necessary&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then check the dependency lock file, &lt;code&gt;.terraform.lock.hcl&lt;/code&gt;. If you still see the provider there:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"registry.terraform.io/hashicorp/template"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2.2.0"&lt;/span&gt;
  &lt;span class="nx"&gt;hashes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s2"&gt;"h1:0wlehNaxBX7GJQnPfQwTNvvAf38Jm0Nv7ssKGMaG6Og="&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:01702196f0a0492ec07917db7aaa595843d8f171dc195f4c988d2ffca2a06386"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:09aae3da826ba3d7df69efeb25d146a1de0d03e951d35019a0f80e4f58c89b53"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:09ba83c0625b6fe0a954da6fbd0c355ac0b7f07f86c91a2a97849140fea49603"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:0e3a6c8e16f17f19010accd0844187d524580d9fdb0731f675ffcf4afba03d16"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:45f2c594b6f2f34ea663704cc72048b212fe7d16fb4cfd959365fa997228a776"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:77ea3e5a0446784d77114b5e851c970a3dde1e08fa6de38210b8385d7605d451"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:8a154388f3708e3df5a69122a23bdfaf760a523788a5081976b3d5616f7d30ae"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:992843002f2db5a11e626b3fc23dc0c87ad3729b3b3cff08e32ffb3df97edbde"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:ad906f4cebd3ec5e43d5cd6dc8f4c5c9cc3b33d2243c89c5fc18f97f7277b51d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"zh:c979425ddb256511137ecd093e23283234da0154b7fa8b21c2687182d9aea8b2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
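If you want a quick scripted check instead of opening the file, grepping the lock file works. A minimal, self-contained sketch (it writes a stand-in lock file so it can run anywhere; in a real project you would grep `.terraform.lock.hcl` directly):

```shell
# Stand-in for .terraform.lock.hcl, created only so this sketch is self-contained
lockfile="lock.example"
printf '%s\n' 'provider "registry.terraform.io/hashicorp/template" {' '  version = "2.2.0"' '}' > "$lockfile"

# The actual check: is the deprecated provider still recorded?
if grep -q 'hashicorp/template' "$lockfile"; then
  result="template provider still locked"
else
  result="lock file clean"
fi
echo "$result"
```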



&lt;p&gt;Check what is requiring the provider (it might still be used elsewhere in the code). This can be done by running the &lt;code&gt;terraform providers&lt;/code&gt; command, which:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The terraform providers command shows information about the provider requirements of the configuration in the current working directory, as an aid to understanding where each requirement was detected from.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;Providers&lt;/span&gt; &lt;span class="nx"&gt;required&lt;/span&gt; &lt;span class="nx"&gt;by&lt;/span&gt; &lt;span class="nx"&gt;configuration&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
&lt;span class="err"&gt;.&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="err"&gt;...&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt;
&lt;span class="err"&gt;│  &lt;/span&gt; &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;module&lt;/span&gt;
    &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="err"&gt;...&lt;/span&gt;

&lt;span class="nx"&gt;Providers&lt;/span&gt; &lt;span class="nx"&gt;required&lt;/span&gt; &lt;span class="nx"&gt;by&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;

    &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;hashicorp&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;template&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="p"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case we can see that the template provider is required by the state.&lt;/p&gt;

&lt;p&gt;In order to get rid of this dependency, make sure you update Terraform to &lt;strong&gt;v1.1.3&lt;/strong&gt; or later, since the following issue was fixed in version 1.1.3: &lt;a href="https://github.com/hashicorp/terraform/pull/30192" rel="noopener noreferrer"&gt;https://github.com/hashicorp/terraform/pull/30192&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To update the lock file and remove the entry for the deprecated template provider, we run &lt;code&gt;terraform init&lt;/code&gt; again.&lt;/p&gt;

&lt;p&gt;This is because Terraform relies on two sources for determining the truth: the &lt;em&gt;configuration&lt;/em&gt; itself and the &lt;em&gt;state&lt;/em&gt;. If you remove the dependency on a particular provider from &lt;em&gt;both&lt;/em&gt; your configuration and the state then running &lt;code&gt;terraform init&lt;/code&gt; will remove any existing lock file entry for that provider.&lt;/p&gt;

&lt;p&gt;And let's look at the output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;modules&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;

&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;

&lt;span class="nx"&gt;Successfully&lt;/span&gt; &lt;span class="nx"&gt;configured&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="s2"&gt;"azurerm"&lt;/span&gt;&lt;span class="err"&gt;!&lt;/span&gt; &lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;will&lt;/span&gt; &lt;span class="nx"&gt;automatically&lt;/span&gt;
&lt;span class="nx"&gt;use&lt;/span&gt; &lt;span class="nx"&gt;this&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="nx"&gt;unless&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="nx"&gt;configuration&lt;/span&gt; &lt;span class="nx"&gt;changes&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;

&lt;span class="nx"&gt;Initializing&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="nx"&gt;plugins&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;

&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Reusing&lt;/span&gt; &lt;span class="nx"&gt;previous&lt;/span&gt; &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="nx"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt; &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;dependency&lt;/span&gt; &lt;span class="nx"&gt;lock&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;

&lt;span class="nx"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Using&lt;/span&gt; &lt;span class="nx"&gt;previously-installed&lt;/span&gt; &lt;span class="nx"&gt;grafana&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;grafana&lt;/span&gt; &lt;span class="nx"&gt;v1&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;17.0&lt;/span&gt;

&lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;has&lt;/span&gt; &lt;span class="nx"&gt;made&lt;/span&gt; &lt;span class="nx"&gt;some&lt;/span&gt; &lt;span class="nx"&gt;changes&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="nx"&gt;dependency&lt;/span&gt; &lt;span class="nx"&gt;selections&lt;/span&gt; &lt;span class="nx"&gt;recorded&lt;/span&gt;
&lt;span class="nx"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lock&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hcl&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;Review&lt;/span&gt; &lt;span class="nx"&gt;those&lt;/span&gt; &lt;span class="nx"&gt;changes&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="nx"&gt;commit&lt;/span&gt; &lt;span class="nx"&gt;them&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;your&lt;/span&gt;
&lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="nx"&gt;control&lt;/span&gt; &lt;span class="nx"&gt;system&lt;/span&gt; &lt;span class="nx"&gt;if&lt;/span&gt; &lt;span class="nx"&gt;they&lt;/span&gt; &lt;span class="nx"&gt;represent&lt;/span&gt; &lt;span class="nx"&gt;changes&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt; &lt;span class="nx"&gt;intended&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;make&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;

&lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;has&lt;/span&gt; &lt;span class="nx"&gt;been&lt;/span&gt; &lt;span class="nx"&gt;successfully&lt;/span&gt; &lt;span class="nx"&gt;initialized&lt;/span&gt;&lt;span class="err"&gt;!&lt;/span&gt;

&lt;span class="nx"&gt;You&lt;/span&gt; &lt;span class="nx"&gt;may&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="nx"&gt;begin&lt;/span&gt; &lt;span class="nx"&gt;working&lt;/span&gt; &lt;span class="nx"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;Terraform&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;Try&lt;/span&gt; &lt;span class="nx"&gt;running&lt;/span&gt; &lt;span class="s2"&gt;"terraform plan"&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;see&lt;/span&gt;
&lt;span class="nx"&gt;any&lt;/span&gt; &lt;span class="nx"&gt;changes&lt;/span&gt; &lt;span class="nx"&gt;that&lt;/span&gt; &lt;span class="nx"&gt;are&lt;/span&gt; &lt;span class="nx"&gt;required&lt;/span&gt; &lt;span class="nx"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;your&lt;/span&gt; &lt;span class="nx"&gt;infrastructure&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;All&lt;/span&gt; &lt;span class="nx"&gt;Terraform&lt;/span&gt; &lt;span class="nx"&gt;commands&lt;/span&gt;
&lt;span class="nx"&gt;should&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="nx"&gt;work&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;

&lt;span class="nx"&gt;If&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt; &lt;span class="nx"&gt;ever&lt;/span&gt; &lt;span class="nx"&gt;set&lt;/span&gt; &lt;span class="nx"&gt;or&lt;/span&gt; &lt;span class="nx"&gt;change&lt;/span&gt; &lt;span class="nx"&gt;modules&lt;/span&gt; &lt;span class="nx"&gt;or&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="nx"&gt;configuration&lt;/span&gt; &lt;span class="nx"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;Terraform&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;
&lt;span class="nx"&gt;rerun&lt;/span&gt; &lt;span class="nx"&gt;this&lt;/span&gt; &lt;span class="nx"&gt;command&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;reinitialize&lt;/span&gt; &lt;span class="nx"&gt;your&lt;/span&gt; &lt;span class="nx"&gt;working&lt;/span&gt; &lt;span class="nx"&gt;directory&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;If&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt; &lt;span class="nx"&gt;forget&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;other&lt;/span&gt;
&lt;span class="nx"&gt;commands&lt;/span&gt; &lt;span class="nx"&gt;will&lt;/span&gt; &lt;span class="nx"&gt;detect&lt;/span&gt; &lt;span class="nx"&gt;it&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="nx"&gt;remind&lt;/span&gt; &lt;span class="nx"&gt;you&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;do&lt;/span&gt; &lt;span class="nx"&gt;so&lt;/span&gt; &lt;span class="nx"&gt;if&lt;/span&gt; &lt;span class="nx"&gt;necessary&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And we see the template provider is no longer there. &lt;/p&gt;

&lt;p&gt;Now you can safely commit the freshly updated &lt;a href="https://www.terraform.io/language/files/dependency-lock" rel="noopener noreferrer"&gt;Dependency Lock File&lt;/a&gt;. &lt;/p&gt;

</description>
      <category>terraform</category>
      <category>iac</category>
      <category>tutorial</category>
      <category>beginners</category>
    </item>
    <item>
      <title>[K8s] Fix Helm release failing with an upgrade still in progress</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Mon, 30 May 2022 19:49:10 +0000</pubDate>
      <link>https://dev.to/the_cozma/k8s-fix-helm-release-failing-with-an-upgrade-still-in-progress-52cd</link>
      <guid>https://dev.to/the_cozma/k8s-fix-helm-release-failing-with-an-upgrade-still-in-progress-52cd</guid>
      <description>&lt;p&gt;This article applies to: &lt;strong&gt;Helm v3.8.0&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Helm helps you manage Kubernetes applications — Helm Charts help you define, install, and upgrade even the most complex Kubernetes application. More details on Helm and the commands can be found in the &lt;a href="https://helm.sh/" rel="noopener noreferrer"&gt;official documentation&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Assuming you use Helm to handle your releases, you might end up in a situation where a release gets stuck in a pending state and all subsequent releases keep failing.&lt;/p&gt;

&lt;p&gt;This could happen if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you run the upgrade command from the CLI and accidentally (or not) interrupt it, or&lt;/li&gt;
&lt;li&gt;you have two deploys running at the same time (in GitHub Actions, for example)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Basically, any interruption during the install/upgrade process &lt;strong&gt;could&lt;/strong&gt; leave you in a state where you can no longer install another release.&lt;/p&gt;

&lt;p&gt;In the release logs the failing upgrade will show an error similar to the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error: UPGRADE FAILED: release &amp;lt;name&amp;gt; failed, and has been rolled back due to atomic being set: timed out waiting for the condition
Error: Error: The process '/usr/bin/helm3' failed with exit code 1
Error: The process '/usr/bin/helm3' failed with exit code 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the status will be stuck in: &lt;code&gt;PENDING_INSTALL&lt;/code&gt; or &lt;code&gt;PENDING_UPGRADE&lt;/code&gt; depending on the command you were running.&lt;/p&gt;

&lt;p&gt;Because of this pending state, when you run the command to list all releases it returns an empty result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; helm list --all                                                                               
NAME    NAMESPACE   REVISION    UPDATED STATUS  CHART   APP VERSION
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So what can we do now? In this article we will look at the two options described below. Keep in mind that, depending on your setup, it could be a different issue, but I hope these two pointers give you a place to start.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Roll back to the previous working version using the &lt;code&gt;helm rollback&lt;/code&gt; command.&lt;/li&gt;
&lt;li&gt;Delete the helm secret associated with the release and re-run the upgrade command.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So let's look at each option in detail.&lt;/p&gt;

&lt;h3&gt;
  
  
  First option: Roll back to the previous working version using the &lt;code&gt;helm rollback&lt;/code&gt; command
&lt;/h3&gt;

&lt;p&gt;From the official documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command rolls back a release to a previous revision.&lt;br&gt;
The first argument of the rollback command is the name of a release, and the second is a revision (version) number. If this argument is omitted, it will roll back to the previous release.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So, in this case, let's get the history of the releases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm history &amp;lt;releasename&amp;gt; -n &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the output you will notice that the STATUS of your release is &lt;code&gt;PENDING_UPGRADE&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;REVISION UPDATED                  STATUS          CHART     APP VERSION DESCRIPTION
1        Wed May 25 11:45:40 2022 DEPLOYED        api-0.1.0 1.16.0      Upgrade complete
2        Mon May 30 14:32:46 2022 PENDING_UPGRADE api-0.1.0 1.16.0      Preparing upgrade
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's perform the rollback by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm rollback &amp;lt;release&amp;gt; &amp;lt;revision&amp;gt; --namespace &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So in our case we run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; helm rollback api 1 --namespace api
Rollback was a success.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After we get confirmation that the rollback was successful, we run the command to get the history again.&lt;/p&gt;

&lt;p&gt;We now see that a new revision was created and that our rollback was successful: the STATUS of the latest revision is &lt;code&gt;DEPLOYED&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; helm history api -n api

REVISION UPDATED                  STATUS      CHART     APP VERSION DESCRIPTION
1        Wed May 25 11:45:40 2022 SUPERSEDED  api-0.1.0 1.16.0      Upgrade complete
2        Mon May 30 14:32:46 2022 SUPERSEDED  api-0.1.0 1.16.0      Preparing upgrade
3        Mon May 30 14:45:46 2022 DEPLOYED    api-0.1.0 1.16.0      Rollback to 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So what if the solution above didn't work?&lt;/p&gt;

&lt;h3&gt;
  
  
  Second option: Delete the helm secret associated with the release and re-run the upgrade command
&lt;/h3&gt;

&lt;p&gt;First we get all the secrets for the namespace by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get secrets -n &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the output you will notice a list of secrets in the following format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME                        TYPE               DATA AGE
api                         Opaque             21   473d
sh.helm.release.v1.api.v648 helm.sh/release.v1 1    6d5h
sh.helm.release.v1.api.v649 helm.sh/release.v1 1    5d1h
sh.helm.release.v1.api.v650 helm.sh/release.v1 1    57m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;So what's in a secret?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Helm 3 makes use of the Kubernetes &lt;a href="https://kubernetes.io/docs/concepts/configuration/secret/" rel="noopener noreferrer"&gt;Secrets&lt;/a&gt; object to store information about a release. These secrets are used by Helm to store and read its state every time we run &lt;code&gt;helm list&lt;/code&gt;, &lt;code&gt;helm history&lt;/code&gt; or, in our case, &lt;code&gt;helm upgrade&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The naming of the secrets is &lt;strong&gt;unique per namespace&lt;/strong&gt; and follows this convention:&lt;br&gt;
&lt;code&gt;sh.helm.release.v1.&amp;lt;release_name&amp;gt;.v&amp;lt;release_version&amp;gt;&lt;/code&gt;.&lt;/p&gt;
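Since the revision number is encoded at the end of each secret name, you can recover the latest revision with a bit of shell. A small sketch (the secret names are taken from the example output above):

```shell
# Example secret names, as returned by: kubectl get secrets -n <namespace>
secrets='sh.helm.release.v1.api.v648
sh.helm.release.v1.api.v649
sh.helm.release.v1.api.v650'

# Strip everything up to the last ".v" to keep only the revision number,
# then sort numerically and take the highest one
latest=$(printf '%s\n' "$secrets" | sed 's/.*\.v//' | sort -n | tail -n 1)
echo "$latest"
```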

&lt;p&gt;A maximum of 10 secrets is stored by default, but you can change this by setting the &lt;code&gt;--history-max&lt;/code&gt; flag on your &lt;a href="https://helm.sh/docs/helm/helm_upgrade/" rel="noopener noreferrer"&gt;helm upgrade&lt;/a&gt; command.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;--history-max int                            limit the maximum number of revisions saved per release. Use 0 for no limit (default 10)&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now that we know what these secrets are used for, let's delete the helm secret associated with the pending release by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete secret sh.helm.release.v1.&amp;lt;release_name&amp;gt;.v&amp;lt;release_version&amp;gt; -n &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we re-run the helm upgrade command (either from command line or from your deployment workflow), which, if all was good so far, should succeed.&lt;/p&gt;

&lt;p&gt;There is an open issue with &lt;a href="https://github.com/helm/helm/issues/4558" rel="noopener noreferrer"&gt;Helm&lt;/a&gt;, so hopefully these workarounds won't be needed anymore at some point. It has been open since 2018, though. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Of course there could be other cases or issues, but I hope this is a good place to start. If you ran into something similar, I would love to read your input on what the issue was and how you solved it, especially since I didn't find the error message intuitive.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thank you for reading!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>beginners</category>
      <category>devops</category>
      <category>helm</category>
    </item>
    <item>
      <title>[K8s] How to restart Kubernetes Pods</title>
      <dc:creator>Ana Cozma</dc:creator>
      <pubDate>Wed, 18 May 2022 07:04:25 +0000</pubDate>
      <link>https://dev.to/the_cozma/k8s-how-to-restart-kubernetes-pods-53mj</link>
      <guid>https://dev.to/the_cozma/k8s-how-to-restart-kubernetes-pods-53mj</guid>
      <description>&lt;p&gt;This article applies for &lt;strong&gt;Kubernetes v1.15&lt;/strong&gt; and above.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://kubernetes.io/" rel="noopener noreferrer"&gt;Kubernetes&lt;/a&gt;, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications.&lt;/p&gt;

&lt;p&gt;It groups containers that make up an application into logical units for easy management and discovery. But what if something happens to the container? In this case, you might need a quick and easy way to restart it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/" rel="noopener noreferrer"&gt;Kubernetes Pods&lt;/a&gt; usually run until there is a new deployment that will replace them. Therefore, there is no straightforward way to restart a single pod.&lt;/p&gt;

&lt;p&gt;What happens when a pod fails is that, instead of being restarted, it is replaced. &lt;/p&gt;

&lt;h1&gt;
  
  
  Restarting Pods Options
&lt;/h1&gt;

&lt;p&gt;There are a few available options that we will cover in this article:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Scaling down the number of replicas indicating how many Pods it should be maintaining in the ReplicaSet, effectively removing pods, then scaling back up&lt;br&gt;
  &lt;strong&gt;Causes downtime&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Deleting a single pod, forcing K8s to recreate it &lt;br&gt;
 &lt;strong&gt;Might cause downtime&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Starting a rollout (rolling restart method)&lt;br&gt;
 &lt;strong&gt;No downtime&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So let's look at each option in a bit more detail, keeping in mind that any of them could work for you depending on your needs. Some questions to ask: is it a live environment? Is it a new setup? Can you afford an outage on the app?&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Changing Replicas
&lt;/h2&gt;

&lt;p&gt;One option for restarting the pods is to effectively "shut them off" by first scaling the number of deployment replicas to zero:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl scale deployment &amp;lt;name&amp;gt; --replicas=0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, K8s will remove all the replicas that are no longer required.&lt;/p&gt;

&lt;p&gt;Then, scale them back up to the desired number.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl scale deployment &amp;lt;name&amp;gt; --replicas=&amp;lt;desired_number&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will stop and terminate all current pods and will schedule new pods in their place.&lt;/p&gt;

&lt;p&gt;Because we are "shutting down" pods, this option will cause downtime, since for a while there will be no container available. So if you are running a production system, the rolling restart method is the better approach.&lt;/p&gt;

&lt;p&gt;The names of the new scheduled pods &lt;em&gt;will&lt;/em&gt; be different from the previous ones. &lt;/p&gt;

&lt;p&gt;Run the following command to get the new names of the pods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
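&lt;p&gt;If you want to follow the Pods being terminated and recreated as you scale, you can watch the Pod list instead of polling it (the &lt;code&gt;-w&lt;/code&gt; flag streams changes until you interrupt the command):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n &amp;lt;namespace&amp;gt; -w
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;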



&lt;h2&gt;
  
  
  2. Deleting a Pod
&lt;/h2&gt;

&lt;p&gt;First, we get all the pods in a namespace by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we delete a single pod by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete pod &amp;lt;pod_name&amp;gt; -n &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;K8s will notice the difference between the actual and the desired state and will schedule a new pod until the desired number of replicas is restored.&lt;/p&gt;
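&lt;p&gt;If you need to recreate several pods at once, deleting them one by one by name gets tedious. Assuming your pods carry a common label (here &lt;code&gt;app=myapp&lt;/code&gt; is just an example label, not something from your cluster), you can delete them by label selector instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete pods -l app=myapp -n &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;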

&lt;h2&gt;
  
  
  3. Rolling Restart
&lt;/h2&gt;

&lt;p&gt;Since version 1.15, K8s allows you to execute a rolling restart of your deployment. &lt;br&gt;
&lt;em&gt;Note: Not only does kubectl need to be updated; make sure the cluster itself is running this version as well.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A rolling restart restarts all the pods of a deployment in sequence. The following command does this for every deployment in the namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl rollout restart deployment -n &amp;lt;yournamespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After running this command, K8s proceeds to shut down and restart each pod in the deployment one by one. &lt;/p&gt;

&lt;p&gt;Because this is done sequentially, there is always at least one pod running, meaning the application itself remains available, effectively allowing for &lt;strong&gt;zero downtime&lt;/strong&gt;.&lt;/p&gt;
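&lt;p&gt;You can also target a single deployment rather than the whole namespace, and follow the progress until all of its pods have been replaced (with &lt;code&gt;&amp;lt;deployment_name&amp;gt;&lt;/code&gt; as a placeholder for your deployment):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl rollout restart deployment/&amp;lt;deployment_name&amp;gt; -n &amp;lt;namespace&amp;gt;
kubectl rollout status deployment/&amp;lt;deployment_name&amp;gt; -n &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;kubectl rollout status&lt;/code&gt; blocks until the rollout finishes, which makes it handy in scripts and CI pipelines.&lt;/p&gt;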

&lt;p&gt;&lt;em&gt;Thank you for reading and I hope this helps someone!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>kubernetes</category>
      <category>k8s</category>
    </item>
  </channel>
</rss>
