<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ant(on) Weiss</title>
    <description>The latest articles on DEV Community by Ant(on) Weiss (@antweiss).</description>
    <link>https://dev.to/antweiss</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F106191%2F997cd74b-4346-4332-81f5-4ec7e4785416.jpeg</url>
      <title>DEV Community: Ant(on) Weiss</title>
      <link>https://dev.to/antweiss</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/antweiss"/>
    <language>en</language>
    <item>
      <title>Truly Reactive Cloud Native AI Agents with Kagent and Khook</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Tue, 09 Sep 2025 13:43:54 +0000</pubDate>
      <link>https://dev.to/antweiss/truly-reactive-cloud-native-ai-agents-with-kagent-and-khook-4knj</link>
      <guid>https://dev.to/antweiss/truly-reactive-cloud-native-ai-agents-with-kagent-and-khook-4knj</guid>
      <description>&lt;h1&gt;
  
  
  Agent Cloud-Native in the House!
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd7uwirbs595y5zzsk2b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd7uwirbs595y5zzsk2b.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Excited by the vision of smart AI agents watching over your Kubernetes cluster? &lt;br&gt;
Want an easy, fully cloud-native way to run your agentic software? &lt;br&gt;
Time to discover &lt;a href="https://kagent.dev" rel="noopener noreferrer"&gt;Kagent&lt;/a&gt;! &lt;br&gt;
It’s a fairly young OSS project started by the folks at &lt;a href="https://Solo.io" rel="noopener noreferrer"&gt;Solo.io&lt;/a&gt; that aims to make building and running AI agents on Kubernetes easy and fun. With all the bells and whistles one would expect - authz and authn, security, visualization, governance and audit, optimization, you name it. Some of it on the roadmap, some of it being built as we speak.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0tw6oe2omamz4u4x5ty.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0tw6oe2omamz4u4x5ty.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  Joining the Community
&lt;/h1&gt;

&lt;p&gt;I started playing with Kagent a couple of months ago, loved its ergonomics and decided to build a workshop around it - meanwhile fixing some docs, submitting PRs and joining the community meetings. If you’re looking for a great OSS community to join - look no further - it’s a warm and welcoming bunch of very smart folks. And there’s a lot of work to be done!&lt;/p&gt;
&lt;h1&gt;
  
  
  The Need for Reactivity
&lt;/h1&gt;

&lt;p&gt;So after working with Kagent for a while I realized something was missing. For me, anyway. You see - conceptually the difference between an agent and a tool is that an agent acts on your behalf, making decisions in alignment with the declared goals and guidelines. (With a tool - you have to make all the decisions yourself and spell out every wish as a command.) In this respect - much of the cloud native software is already agentic. Anything built on KRM (the &lt;a href="https://www.geeksforgeeks.org/linux-unix/kubernetes-resource-model-krm-and-how-to-make-use-of-yaml/" rel="noopener noreferrer"&gt;Kubernetes Resource Model&lt;/a&gt;) is declarative by nature - the user declares the desired state and then the operators or controllers make sure it becomes the actual state. And in that respect - they are definitely our agents: acting on our behalf, recreating missing pods, reconciling broken states. True agents are reactive by nature - they listen for events and correct course accordingly, with state or goal declarations being just one type of event.&lt;/p&gt;

&lt;p&gt;And that’s exactly what Kagent didn’t have. Until now, in order to summon an agent, one needed to chat with it - either in Kagent’s sleek Web UI, over CLI or via the API. But what good is an agent if it just sits there waiting for instructions? I wanted a way to make my agents reactive.&lt;/p&gt;
&lt;h1&gt;
  
  
  Khook
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8mzxcg00nd8jaybew8o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8mzxcg00nd8jaybew8o.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Enter Khook - a Kubernetes controller that allows defining:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes events to listen to&lt;/li&gt;
&lt;li&gt;the agent to call&lt;/li&gt;
&lt;li&gt;the templated prompt to pass to the agent &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Khook enables autonomous remediation and incident response. Finally - the ops person's dream come true!&lt;/p&gt;
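&lt;p&gt;To give a feel for it - a hook definition might look roughly like this (a sketch only; the field names here are illustrative, check the khook repo for the actual CRD schema):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: kagent.dev/v1alpha1    # illustrative API group/version
kind: Hook
metadata:
  name: oom-hook
spec:
  eventConfigurations:
    - eventType: oom-kill            # the Kubernetes event to listen to
      agentRef: k8s-troubleshooter   # the agent to call
      prompt: |                      # the templated prompt to pass to the agent
        Pod {{ .PodName }} in namespace {{ .Namespace }} was OOMKilled.
        Investigate and suggest a remediation.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;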

&lt;p&gt;The following diagram shows how Khook works:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74c60roop2xdyui3skl5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74c60roop2xdyui3skl5.png" alt=" "&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;It’s been a while since I’ve developed a full-blown Kubernetes controller from scratch and I have to thank Kiro and Cursor (and specifically Sonnet 3.7) for taking care of all the boilerplate for me. This definitely made me more productive.&lt;/p&gt;
&lt;h1&gt;
  
  
  A Hook to the Future
&lt;/h1&gt;

&lt;p&gt;Last Monday I presented Khook to the Kagent community and it makes me happy that it was met with a lot of excitement. In fact - since Friday - the &lt;a href="https://github.com/kagent-dev/khook" rel="noopener noreferrer"&gt;khook repository&lt;/a&gt; (with the gracious help of Eitan Yarmush) has been transferred into the kagent-dev org on GitHub and we’re actively looking for early users and contributors to take the hooks for test drives and find interesting bugs. Oh, and if you clicked on the link - make sure to star the repo. It will make my day brighter 🙏&lt;/p&gt;

&lt;p&gt;While building this project I realized it can be of much wider use - becoming the connective tissue between virtually any type of event (think task queues, DB transactions, webhooks) and any type of A2A-compatible agent (all agent communication in Kagent is based on the &lt;a href="https://github.com/a2aproject/A2A" rel="noopener noreferrer"&gt;A2A protocol&lt;/a&gt;). Especially now with &lt;a href="https://kagent.dev/docs/kagent/examples/a2a-byo" rel="noopener noreferrer"&gt;Kagent supporting BYO agents&lt;/a&gt;. So yes - a lot of space for improvement, innovation and experimentation. &lt;/p&gt;

&lt;p&gt;Find this exciting too? Drop me a line, &lt;a href="https://discord.com/invite/Fu3k65f2k3" rel="noopener noreferrer"&gt;join the Kagent community&lt;/a&gt; and contribute to Kagent or Khook or both.&lt;/p&gt;

&lt;p&gt;And may our future be agentic.&lt;/p&gt;

&lt;p&gt;Here's a demo of Khook triggering a Kagent agent:&lt;br&gt;
  &lt;iframe src="https://www.youtube.com/embed/cYXLKAXZnso"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

</description>
      <category>ai</category>
      <category>kubernetes</category>
      <category>eventdriven</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Improve App Availability with Preemptible Pods and PriorityClasses</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Tue, 20 Aug 2024 09:37:12 +0000</pubDate>
      <link>https://dev.to/antweiss/improve-app-availability-with-preemptible-pods-and-priorityclasses-3gh2</link>
      <guid>https://dev.to/antweiss/improve-app-availability-with-preemptible-pods-and-priorityclasses-3gh2</guid>
      <description>&lt;p&gt;Multiple apps are competing for resources in your cluster? Want to optimize resource allocation and application uptime?&lt;br&gt;
Look into configuring PriorityClasses and preemptible pods. &lt;/p&gt;

&lt;p&gt;Here's how it works: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdnjpanslz7eorxq1s9bp.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdnjpanslz7eorxq1s9bp.gif" alt="Image description" width="800" height="941"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the full overview and a practical walkthrough - read &lt;a href="https://www.perfectscale.io/blog/preemptible-pods" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;
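&lt;p&gt;As a minimal sketch (names and values here are mine) - a low-priority, preemptible workload could be marked like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 1000              # lower value = preempted first when resources run out
preemptionPolicy: Never  # pods of this class never preempt others themselves
globalDefault: false
description: "Best-effort batch workloads"
---
apiVersion: v1
kind: Pod
metadata:
  name: batch-job
spec:
  priorityClassName: low-priority
  containers:
    - name: worker
      image: busybox
      command: ["sleep", "3600"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;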

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>sre</category>
    </item>
    <item>
      <title>Karpenter moving to 1.0.0 - with the new stability guarantees</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Wed, 14 Aug 2024 08:44:05 +0000</pubDate>
      <link>https://dev.to/aws-builders/karpenter-moving-to-100-with-the-new-stability-guarantees-2ohd</link>
      <guid>https://dev.to/aws-builders/karpenter-moving-to-100-with-the-new-stability-guarantees-2ohd</guid>
      <description>&lt;p&gt;Karpenter is slowly but surely becoming the de-facto standard node autoscaler for Kubernetes. It started at AWS and is now getting &lt;a href="https://github.com/Azure/karpenter-provider-azure" rel="noopener noreferrer"&gt;adopted for AKS on Azure&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Many organizations have already switched to Karpenter from whatever they were using - whether it's the good old cluster-autoscaler or a commercial pay-to-scale solution from a 3rd party vendor. &lt;/p&gt;

&lt;p&gt;Now Karpenter is also a part of the Kubernetes autoscaling SIG. And that's why the Karpenter team decided it's a great time to promote &lt;code&gt;sigs.k8s.io/karpenter&lt;/code&gt; to package version &lt;code&gt;v1.0.0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As &lt;a href="https://github.com/kubernetes-sigs/karpenter/issues/1570" rel="noopener noreferrer"&gt;the official proposal&lt;/a&gt; says: "The sigs.k8s.io/karpenter package has long-been in a production-ready state &lt;strong&gt;for users&lt;/strong&gt;, but has not reflected this production-ready state through its versioning scheme. Given the first initial release of v1 APIs within Karpenter, the maintainer team feels this is the best time to make the bump to v1.0.0."&lt;/p&gt;

&lt;h2&gt;
  
  
  New Stability Guarantees
&lt;/h2&gt;

&lt;p&gt;The linked issue goes on to outline the &lt;em&gt;new stability guarantees&lt;/em&gt; - but these are of course not referring to the stability of Karpenter as a product. Instead, it's the Karpenter APIs that are now subject to the &lt;a href="https://kubernetes.io/docs/reference/using-api/" rel="noopener noreferrer"&gt;standard Kubernetes stability guarantees&lt;/a&gt;. The package itself, however, may still see breaking changes within the v1.x.y major version without a bump to v2.x.y.&lt;/p&gt;

&lt;h2&gt;
  
  
  Karpenter is Maturing
&lt;/h2&gt;

&lt;p&gt;All in all - this is great news. Karpenter has been reliable and cost-effective for quite some time but now it's also maturing as an OSS project and a part of the Kubernetes ecosystem.&lt;/p&gt;

&lt;p&gt;Interested in how to get the most out of your Karpenter when combined with pod optimization? Read this &lt;a href="https://www.perfectscale.io/blog/getting-the-most-out-of-karpenter-with-perfectscale" rel="noopener noreferrer"&gt;post&lt;/a&gt; I wrote for PerfectScale a while ago.&lt;/p&gt;

&lt;p&gt;Have you made the switch to Karpenter? Did it provide the optimization you expected? Share in comments!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>karpenter</category>
    </item>
    <item>
      <title>9 Ways to Spin Up an EKS Cluster - Way 4 - CloudFormation</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Sun, 04 Aug 2024 14:54:09 +0000</pubDate>
      <link>https://dev.to/aws-builders/9-ways-to-spin-up-an-eks-cluster-way-4-cloudformation-3len</link>
      <guid>https://dev.to/aws-builders/9-ways-to-spin-up-an-eks-cluster-way-4-cloudformation-3len</guid>
      <description>&lt;p&gt;I cheated for this one! Read on to see how: &lt;/p&gt;

&lt;p&gt;AWS CloudFormation is a robust Infrastructure-as-Code tool. It's well-supported and allows us to write templates in JSON or YAML. And it is also used behind the scenes by a number of tools in the cloud native ecosystem - for example, &lt;a href="https://kops.sigs.k8s.io/#what-is-kops" rel="noopener noreferrer"&gt;kops&lt;/a&gt;. But more importantly for this post - it's used by &lt;code&gt;eksctl&lt;/code&gt;!&lt;/p&gt;

&lt;p&gt;In my &lt;a href="https://dev.to/aws-builders/9-ways-to-spin-up-an-eks-cluster-way-3-eksctl-2op9"&gt;previous post&lt;/a&gt; I used &lt;code&gt;eksctl&lt;/code&gt; to spin up a cluster complete with Karpenter and a few more add-ons. &lt;/p&gt;

&lt;p&gt;So for this post - instead of starting from scratch I decided to just reuse the stack template created by &lt;code&gt;eksctl&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Yes, I could have used one of the quickstart stacks &lt;a href="https://github.com/aws-ia/cloudformation-base-eks" rel="noopener noreferrer"&gt;provided by AWS&lt;/a&gt; but this repo has so many options that I got lost reading the docs. &lt;/p&gt;

&lt;p&gt;So instead I just opted to export the template from the existing stack. That's the cheating part :)&lt;/p&gt;

&lt;h2&gt;
  
  
  Exporting the CloudFormation template
&lt;/h2&gt;

&lt;p&gt;But how does one export a CloudFormation template from &lt;code&gt;eksctl&lt;/code&gt;? As I found out - this was requested repeatedly, but never implemented. See &lt;a href="https://github.com/eksctl-io/eksctl/issues/5291" rel="noopener noreferrer"&gt;here&lt;/a&gt; for example. &lt;br&gt;
So instead I went to the CloudFormation console in AWS, clicked on the stacks that I wanted (the ones &lt;code&gt;eksctl&lt;/code&gt; generated) and went to the 'Template' tab.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosylhxqyye7ifdcz9uqh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosylhxqyye7ifdcz9uqh.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But as you can notice - this only gives us JSON. Now, I'm a YAML engineer, so I wanted YAML. To get it, I clicked on 'View in Application Composer', then on 'Template', and switched the toggle to 'YAML'. Voila - I can now copy the template text and continue editing it on my laptop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugxafnms88rjl6edja67.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugxafnms88rjl6edja67.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In its basic form &lt;code&gt;eksctl&lt;/code&gt; creates 2 stacks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one for the EKS control plane&lt;/li&gt;
&lt;li&gt;another one for the managed nodegroup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I could've bundled both stacks into one file but this separation actually makes a lot of sense. We may want more than one node group in our EKS, or we may decide to let go of node groups and opt to manage nodes with Karpenter. (You should!)&lt;/p&gt;

&lt;p&gt;So I created 2 template files - &lt;a href="https://github.com/antweiss/9-ways-2-EKS/blob/main/way-4-cloudformation/eks.yaml" rel="noopener noreferrer"&gt;eks.yaml&lt;/a&gt; (for the control plane) and &lt;a href="https://github.com/antweiss/9-ways-2-EKS/blob/main/way-4-cloudformation/ng.yaml" rel="noopener noreferrer"&gt;ng.yaml&lt;/a&gt; (for the node group). &lt;br&gt;
I've edited them both so they have no hard-coded resource names. Everything is based on the stack name you choose.&lt;/p&gt;

&lt;p&gt;The second stack receives the name of the first stack as a parameter and uses some of its exports, like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;

&lt;span class="c1"&gt;# the parameter:&lt;/span&gt;
&lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ClusterStack&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Name of the ClusterStack&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
    &lt;span class="na"&gt;Default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eks-way4&lt;/span&gt;
&lt;span class="c1"&gt;# and the reference:&lt;/span&gt;
&lt;span class="na"&gt;SecurityGroupIds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Fn::ImportValue&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${ClusterStack}::ClusterSecurityGroupId'&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I'm also defining SSH access to the nodes with an imported SSH key.&lt;br&gt;
You could of course skip the whole SSH stuff, but I tend to believe it's important - especially when bringing up clusters for learning purposes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Spinning Up EKS with CloudFormation
&lt;/h2&gt;

&lt;p&gt;Here's how to use this:&lt;/p&gt;

&lt;p&gt;1. Clone the example repo:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/antweiss/9-ways-2-EKS.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2. Change into the cloudformation folder:
```bash


cd 9-ways-2-EKS/way-4-cloudformation


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;3. Generate an ssh key:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh-keygen -f ./id_rsa -N '' -C eks-way4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;4. Insert the public key into the node group template:
I'm saving the original file with *bak* extension.
```bash


sed -ibak "s/SSH_KEY/$(cat id_rsa.pub)/g" ng.yaml


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The result should look something like (ng.yaml line 15):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;

&lt;span class="na"&gt;ImportedKeyPair&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::EC2::KeyPair&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;KeyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;AWS::StackName&lt;/span&gt;
     &lt;span class="c1"&gt;# this was PublicKeyMaterial: SSH_KEY&lt;/span&gt;
      &lt;span class="na"&gt;PublicKeyMaterial&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMgmzvgvz7NENY5X25QFLFlMHVCp7U98ykm1s3+JYftI eks-way4&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;5. And finally run the deploy.sh script with 2 parameters - the desired name of the cluster and the AWS region:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./deploy.sh eks-way4 eu-central-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Inside the script this is translated into the following 2 CloudFormation invocations:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws cloudformation create-stack --stack-name $1 \
                                --region $2 \
                                --template-body file://eks.yaml \
                                --capabilities CAPABILITY_NAMED_IAM
aws cloudformation create-stack --stack-name $1-ng \
                                --parameters ParameterKey=ClusterStack,ParameterValue=$1 \
                                --region $2 \
                                --template-body file://ng.yaml \
                                --capabilities CAPABILITY_NAMED_IAM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Things to note here: the necessary &lt;code&gt;CAPABILITY_NAMED_IAM&lt;/code&gt; capability&lt;br&gt;
and the &lt;code&gt;ClusterStack&lt;/code&gt; parameter that I'm passing to the second stack.&lt;/p&gt;
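&lt;p&gt;For reference - the &lt;code&gt;Fn::ImportValue&lt;/code&gt; in the node group template only works because the cluster stack exports the value under that name. In eks.yaml this looks something like the following (the &lt;code&gt;ControlPlane&lt;/code&gt; logical name is an assumption based on eksctl's output):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Outputs:
  ClusterSecurityGroupId:
    Value: !GetAtt ControlPlane.ClusterSecurityGroupId
    Export:
      Name: !Sub '${AWS::StackName}::ClusterSecurityGroupId'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;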

&lt;p&gt;After a short while we can verify the state of our stacks:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

aws cloudformation list-stacks &lt;span class="nt"&gt;--stack-status-filter&lt;/span&gt; CREATE_COMPLETE &lt;span class="nt"&gt;--region&lt;/span&gt; eu-central-1 &lt;span class="nt"&gt;--max-items&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"StackSummaries[*].StackName"&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If this returns the names of our 2 stacks:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

[
    "eks-way4-ng",
    "eks-way4"
]


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;then we're great! If not - proceed to the CloudFormation UI in the AWS console to find out why your stack creation failed.&lt;/p&gt;
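&lt;p&gt;Alternatively - the failure reasons can be pulled from the CLI with something like this (the JMESPath query is mine - adjust the stack name and region as needed):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws cloudformation describe-stack-events \
    --stack-name eks-way4-ng \
    --region eu-central-1 \
    --query "StackEvents[?contains(ResourceStatus,'FAILED')].[LogicalResourceId,ResourceStatusReason]" \
    --output table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;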

&lt;h1&gt;
  
  
  Summary
&lt;/h1&gt;

&lt;p&gt;CloudFormation is a great IaC tool if you're fine with AWS vendor lock-in. It's definitely possible to create an EKS cluster with CloudFormation and that's what tools such as &lt;code&gt;kops&lt;/code&gt; and &lt;code&gt;eksctl&lt;/code&gt; do under the hood. &lt;br&gt;
While writing CloudFormation from scratch is no fun - we can reuse the templates generated by &lt;code&gt;eksctl&lt;/code&gt; - as I did. Or we can build the templates ourselves in the &lt;a href="https://aws.amazon.com/application-composer/" rel="noopener noreferrer"&gt;AWS Application Composer&lt;/a&gt;. But then we need to take into account everything necessary for the cluster operation - security groups, gateways, routing rules, IAM policies. And that's a lot to take care of.&lt;/p&gt;

&lt;p&gt;Are you using pure CloudFormation as your IaC tool?&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>eks</category>
      <category>aws</category>
    </item>
    <item>
      <title>We Can Resize Pods without Restarts! Or Can't We?</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Thu, 01 Aug 2024 09:22:56 +0000</pubDate>
      <link>https://dev.to/antweiss/we-can-resize-pods-without-restarts-or-cant-we-18a2</link>
      <guid>https://dev.to/antweiss/we-can-resize-pods-without-restarts-or-cant-we-18a2</guid>
      <description>&lt;p&gt;Kubernetes v1.27 released in April 2023 came with an exciting announcement - we can now resize pod CPU and memory requests and limits in-place! Without deleting the pod or even restarting the containers!&lt;/p&gt;

&lt;p&gt;This happened more than a year ago and since then a lot of folks seem to think this feature is already publicly available or is due to become so tomorrow.&lt;/p&gt;

&lt;p&gt;But the reality is that this was originally released as an Alpha feature and has since had no success moving to Beta due to a number of unresolved issues.&lt;/p&gt;

&lt;p&gt;The latest status, as of June 2024, is that it has been pushed back to v1.32:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuo0em9jlene8n6ki26wm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuo0em9jlene8n6ki26wm.png" alt="Image description" width="800" height="193"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's the &lt;a href="https://github.com/kubernetes/enhancements/issues/1287#issuecomment-2155234389" rel="noopener noreferrer"&gt;link&lt;/a&gt; to that comment on Github.&lt;/p&gt;

&lt;p&gt;So first of all - this isn't coming tomorrow. But we can still play with the feature and understand its advantages and shortcomings. Which is exactly what I'm planning to do in this post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get a Cluster with Alpha Features
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://k3d.io/" rel="noopener noreferrer"&gt;k3d&lt;/a&gt; is irreplaceable when we want quickly and cheaply test Kubernetes Alpha features. All we need to do is to pass the correct feature gate to the correct control plane component.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install k3d
&lt;/h2&gt;

&lt;p&gt;If you still haven't done so - install k3d with curl and bash:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or with another method of your choice listed &lt;a href="https://k3d.io/v5.7.2/#installation" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our case the component is the API server and the feature gate is called &lt;code&gt;InPlacePodVerticalScaling&lt;/code&gt; as can be seen &lt;a href="https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm spinning up a single-node cluster with the following config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;cat &amp;lt;&amp;lt;'EOF' | k3d cluster create -c -&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k3d.io/v1alpha3&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Simple&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pod-resize&lt;/span&gt;
&lt;span class="na"&gt;servers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rancher/k3s:v1.30.2-k3s2&lt;/span&gt;
&lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;k3d&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;disableLoadbalancer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;k3s&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;extraArgs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# the feature gate is passed here&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;arg&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--kube-apiserver-arg=feature-gates=InPlacePodVerticalScaling=true&lt;/span&gt;
        &lt;span class="na"&gt;nodeFilters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;server:*&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  The Happy Path - Updating the CPU
&lt;/h1&gt;

&lt;p&gt;Now let's create a pod with one container defining resource requests and limits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;stress&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;progrium/stress&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--cpu"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--vm"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--vm-bytes"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;128M"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--vm-hang"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;stress&lt;/span&gt;
    &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;150M&lt;/span&gt;
        &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100m&lt;/span&gt;
      &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;150M&lt;/span&gt;
        &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100m&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can create the pod with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; https://raw.githubusercontent.com/perfectscale-io/inplace-pod-resize/main/guaranteed.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'm using &lt;code&gt;progrium/stress&lt;/code&gt; and setting it up for guaranteed CPU starvation by requesting only a tenth of the CPU it needs, and just enough memory. &lt;/p&gt;

&lt;p&gt;&lt;code&gt;stress --vm 1 --vm-bytes 128M --vm-hang 3&lt;/code&gt; tells stress to spawn one worker that allocates 128 MB of memory, holds it for 3 seconds, and then releases it. &lt;br&gt;
My pod is currently allowed only 150M of memory, so I expect it to run fine.&lt;/p&gt;

&lt;p&gt;Meanwhile, &lt;code&gt;stress --cpu 1&lt;/code&gt; tells the container to use one whole CPU, while it's actually allowed to use only 0.1 CPU. So it'll surely get throttled.&lt;/p&gt;
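&lt;p&gt;Back-of-the-envelope, that means the container will get roughly a tenth of the CPU time it wants. A tiny sketch of the arithmetic:&lt;/p&gt;

```shell
# stress tries to burn a full CPU (1000m) but the limit only grants 100m,
# so the cgroup CPU controller will let it run ~10% of the time.
wanted=1000   # millicores stress wants (--cpu 1)
limit=100     # millicores allowed by the pod's CPU limit
echo "$(( limit * 100 / wanted ))% duty cycle"
```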

&lt;p&gt;The container starts just fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pod
NAME     READY   STATUS      RESTARTS   AGE
stress   1/1     Running     0          7s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a few minutes I can also check its resource consumption by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl top pod stress
NAME     CPU(cores)   MEMORY(bytes)
stress   101m         131Mi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's running happily, consuming 101m of CPU and 131Mi of memory - all within the limits.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pod QoS Matters
&lt;/h2&gt;

&lt;p&gt;Now let's try to increase our container's limits in-place to give it more resources and see what happens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch pod stress -p '{"spec" : { "containers" : [{"name" : "stress", "resources": { "limits": {"cpu":"300m","memory":"250M"}}}]}}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Oops! That didn't work!&lt;br&gt;
We're getting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The Pod "stress" is invalid: metadata: Invalid value: "Guaranteed": Pod QoS is immutable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So what we now know is that while we can change the values of limits and requests, we can't change the &lt;a href="https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/" rel="noopener noreferrer"&gt;pod QoS class&lt;/a&gt;. I.e. the relationship between the requests and the limits has to stay the same.&lt;/p&gt;
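&lt;p&gt;For intuition, here's a simplified sketch (not Kubernetes' actual source) of how the QoS class falls out of the requests/limits relationship - the rejected patch above would have turned our Guaranteed pod into a Burstable one:&lt;/p&gt;

```shell
# Simplified QoS derivation for a single-container pod:
#   Guaranteed - requests == limits for both CPU and memory
#   BestEffort - no requests or limits at all
#   Burstable  - anything in between
qos_class() {
  req_cpu=$1; lim_cpu=$2; req_mem=$3; lim_mem=$4
  if [ -z "$req_cpu$lim_cpu$req_mem$lim_mem" ]; then
    echo "BestEffort"
  elif [ -n "$req_cpu" ] && [ "$req_cpu" = "$lim_cpu" ] \
    && [ -n "$req_mem" ] && [ "$req_mem" = "$lim_mem" ]; then
    echo "Guaranteed"
  else
    echo "Burstable"
  fi
}

qos_class 100m 100m 150M 150M   # Guaranteed - our stress pod as created
qos_class 100m 300m 150M 250M   # Burstable - what the rejected patch implied
```

&lt;p&gt;You can check the actual class with &lt;code&gt;kubectl get pod stress -ojsonpath='{.status.qosClass}'&lt;/code&gt;.&lt;/p&gt;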

&lt;h2&gt;
  
  
  Updating the Resources
&lt;/h2&gt;

&lt;p&gt;Let's try to update both the requests and the limits while staying within the Guaranteed QoS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch pod stress -p '{"spec" : { "containers" : [{"name" : "stress", "resources": {"requests": {"cpu":"300m","memory": "250M"}, "limits": {"cpu":"300m","memory":"250M"}}}]}}'
pod/stress patched
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we now watch &lt;code&gt;kubectl top pod stress&lt;/code&gt; we will see how the container gradually gets the additional CPU time:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgzt7kds7vxuoje1lnaei.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgzt7kds7vxuoje1lnaei.gif" alt="Image description" width="1920" height="1080"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The CGroups Behind the Scenes
&lt;/h2&gt;

&lt;p&gt;Now, being the curious cat that I am, I wanted to check how this works behind the scenes. I know cgroups are involved in setting container resource restrictions, but I like checking for myself how stuff works.&lt;br&gt;
The great thing about k3d is that it's very easy to get into your nodes with a simple &lt;code&gt;docker exec&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker exec -it k3d-pod-resize-server-0 sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now I want to find my container and identify the path to its cgroup definition.&lt;br&gt;
First, find the container ID using &lt;code&gt;ctr&lt;/code&gt; - the containerd command-line utility:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ctr c ls | grep stress
a4ad15ff9c7a71a0f1c34cdce9d1ae9d18ebd4e7b01f3c92ee796e5180729460    docker.io/progrium/stress:latest                       io.containerd.runc.v2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and then find the cgroup information for my container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ctr c info a4ad15ff9c7a71a0f1c34cdce9d1ae9d18ebd4e7b01f3c92ee796e5180729460 | grep cgroup

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;which will give me something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"destination": "/sys/fs/cgroup",
                "type": "cgroup",
                "source": "cgroup",
            "cgroupsPath": "/kubepods/podaa80f5b5-d68b-4ab6-ac38-df493310068b/a4ad15ff9c7a71a0f1c34cdce9d1ae9d18ebd4e7b01f3c92ee796e5180729460",
                    "type": "cgroup"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important parts here are &lt;code&gt;/sys/fs/cgroup&lt;/code&gt; where all the cgroup definitions are found and the &lt;code&gt;cgroupsPath&lt;/code&gt; - where the specific constraints for this container are defined. &lt;/p&gt;

&lt;p&gt;You'll notice there's a hierarchy here: first the &lt;code&gt;pod...&lt;/code&gt; directory, and then a directory named after the container ID. Since this is a single-container pod, all the cgroup values are mirrored in the parent directory - so that's where we're going to look.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat /sys/fs/cgroup/kubepods/podaa80f5b5-d68b-4ab6-ac38-df493310068b/memory.max

249999360
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's right - our 250M memory limit in bytes (rounded down by the kernel to a page-size multiple)!&lt;br&gt;
&lt;/p&gt;
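&lt;p&gt;The rounding down to a page-size multiple is easy to verify (assuming the common 4096-byte page size):&lt;/p&gt;

```shell
# The kernel rounds memory.max down to a multiple of the page size.
# 250M = 250000000 bytes; with 4096-byte pages:
PAGE_SIZE=4096
LIMIT=250000000
echo $(( LIMIT / PAGE_SIZE * PAGE_SIZE ))   # 249999360 - what memory.max shows
```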

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat /sys/fs/cgroup/kubepods/podaa80f5b5-d68b-4ab6-ac38-df493310068b/cpu.max

30000 100000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that's correct too! According to the Red Hat documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The first value is the allowed time quota in microseconds for which all processes collectively in a child group can run during one period. The second value specifies the length of the period.&lt;br&gt;
During a single period, when processes in a control group collectively exhaust the time specified by this quota, they are throttled for the remainder of the period and not allowed to run until the next period.&lt;/p&gt;
&lt;/blockquote&gt;
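&lt;p&gt;In other words, the two numbers translate straight into the CPU limit in millicores - quota divided by period:&lt;/p&gt;

```shell
# cpu.max holds "<quota> <period>" in microseconds.
# millicores = quota * 1000 / period
quota=30000
period=100000
echo "$(( quota * 1000 / period ))m"   # 300m - our patched CPU limit
```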

&lt;h2&gt;
  
  
  Impact on Scheduling
&lt;/h2&gt;

&lt;p&gt;Another thing I wanted to try is update the requests to more than my node can give and check if the scheduler will try to reschedule my pod to another node because the current one doesn't have the needed capacity.&lt;/p&gt;

&lt;p&gt;Let's check how many cpus my node has access to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get node -ojsonpath="{ .items[].status.allocatable.cpu } cpus"
8 cpus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I got 8. So let's try to request 10 and see what happens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch pod stress -p '{"spec" : { "containers" : [{"name" : "stress", "resources": {"requests": {"cpu": "10"}, "limits": {"cpu":"10"}}}]}}'
pod/stress patched
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alas, while the requests got updated - nothing else happens. The pod doesn't get rescheduled or evicted. Why? No idea. Had I created it with a 10-CPU request from the beginning, it would have stayed Pending because there aren't any nodes large enough. So I would expect a pod with requests higher than any node can satisfy to get evicted. But maybe my thinking is flawed?&lt;/p&gt;

&lt;h2&gt;
  
  
  Negating Resources
&lt;/h2&gt;

&lt;p&gt;Until now all worked fine because we were only adding resources. Everybody likes having more stuff, nobody likes when stuff is taken away from them. &lt;/p&gt;

&lt;p&gt;Let's start by taking back the CPU time we granted in the previous section:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch pod stress -p '{"spec" : { "containers" : [{"name" : "stress", "resources": {"requests": {"cpu":"100m"}, "limits": {"cpu":"100m"}}}]}}'
pod/stress patched
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'm bringing the CPU requests back to 100m. Quite expectedly, in a couple of seconds &lt;code&gt;kubectl top&lt;/code&gt; shows me that the pod's CPU consumption went down to 100m.&lt;br&gt;
And the cgroup &lt;code&gt;cpu.max&lt;/code&gt; file gets updated as expected:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat /sys/fs/cgroup/kubepods/podaa80f5b5-d68b-4ab6-ac38-df493310068b/cpu.max
10000 100000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But what if I try to reduce memory?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch pod stress -p '{"spec" : { "containers" : [{"name" : "stress", "resources": {"requests": {"memory": "150M"}, "limits": {"memory":"150M"}}}]}}
pod/stress patched
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Seems to work fine. Checking the cgroups I see the config has been updated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat /sys/fs/cgroup/kubepods/podaa80f5b5-d68b-4ab6-ac38-df493310068b/memory.max
149999616
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And what if I need to free even more memory?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch pod stress -p '{"spec" : { "containers" : [{"name" : "stress", "resources": {"requests": {"memory": "100M"}, "limits": {"memory":"100M"}}}]}}
pod/stress patched
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that I'm reducing memory to 100M, which should cause my container to get OOMKilled. And it seems to work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pod stress -ojsonpath="{ .spec.containers[0].resources }"

{"limits":{"cpu":"100m","memory":"100M"},"requests":{"cpu":"100m","memory":"100M"}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But I see that the pod continues running!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pod
NAME     READY   STATUS    RESTARTS   AGE
stress   1/1     Running   0          21m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And checking the cgroup &lt;code&gt;memory.max&lt;/code&gt; file shows why:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat /sys/fs/cgroup/kubepods/podaa80f5b5-d68b-4ab6-ac38-df493310068b/memory.max
149999616
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cgroup wasn't updated! Something is getting in our way, protecting the container from being given less memory than it's already using. This makes sense as a precaution - taking memory away from a running process may lead to irreversible corruption - but it also means the container limits now hold an incorrect value, which will surely puzzle anyone trying to understand why the container isn't getting OOMKilled.&lt;/p&gt;

&lt;p&gt;I would expect some validating admission hook to tell me that memory can't be reduced. Looks like a bug to me.&lt;/p&gt;

&lt;h2&gt;
  
  
  Saving Hungry Pods
&lt;/h2&gt;

&lt;p&gt;Ok, we found out that, memory being an incompressible resource, we can't really reduce it in-place to a value lower than what the container is already using.&lt;/p&gt;

&lt;p&gt;But can we save an OOMing container by giving it more memory?&lt;/p&gt;

&lt;p&gt;Let's try that with a similar pod - but one that gets only 100M of memory from the get-go (while trying to allocate 128M):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;hungry&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;progrium/stress&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--cpu"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--vm"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--vm-bytes"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;128M"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--vm-hang"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;stress&lt;/span&gt;
    &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100M&lt;/span&gt;
      &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100M&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create -f https://raw.githubusercontent.com/perfectscale-io/inplace-pod-resize/main/hungry.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Quite expectedly the container gets OOMKilled almost instantly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pod hungry
NAME     READY   STATUS      RESTARTS     AGE
hungry   0/1     OOMKilled   1 (5s ago)   8s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And it will continue restarting and getting OOMKilled until we update its memory limits. So let's save it from this misery by giving it the memory it needs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch pod hungry -p '{"spec" : { "containers" : [{"name" : "stress", "resources": {"requests": {"memory": "200M"}, "limits": {"memory":"200M"}}}]}}'
pod/hungry patched
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This seems to work fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pod hungry -ojsonpath="{ .spec.containers[0].resources }"
{"limits":{"memory":"200M"},"requests":{"memory":"200M"}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But the pod continues getting killed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pod hungry
NAME     READY   STATUS      RESTARTS      AGE
hungry   0/1     OOMKilled   4 (33s ago)   60s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And if we check the cgroup &lt;code&gt;memory.max&lt;/code&gt; file we'll see why:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat /sys/fs/cgroup/kubepods/burstable/pod708b8195-0ca0-45e0-9f2b-015f679c98da/memory.max
99999744
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Its memory limit never actually got updated! &lt;br&gt;
Why? I wasn't able to find an answer for this one. Why disallow saving containers from getting killed by giving them the memory they need? I'm not aware of any technical limitation that would prevent this, and I also didn't find anything in the &lt;a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources#in-place-update-of-pod-resources" rel="noopener noreferrer"&gt;KEP docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So it looks like the only way to fix the OOMKill is still by deleting the pod and creating a new one with more memory.&lt;/p&gt;

&lt;h1&gt;
  
  
  Summary
&lt;/h1&gt;

&lt;p&gt;In-place pod resizing is a long-awaited feature. In alpha since v1.27, it will hopefully make it to beta by v1.32 - if the drawbacks and bugs get fixed.&lt;br&gt;
Here are some of the ones I found:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory can't be reduced below what's currently used - and there's no notification about that.&lt;/li&gt;
&lt;li&gt;Giving more resources than available on the node doesn't lead to pod eviction (true for both CPU and Memory)&lt;/li&gt;
&lt;li&gt;If a pod is getting OOMKilled - it's not possible to give it more memory to save it from getting killed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Will these eventually get fixed? I certainly hope so. Will the feature make it to beta by v1.32? Let's keep our fingers crossed.&lt;/p&gt;

&lt;p&gt;Something in this post isn't clear or correct? Let me know in the comments. &lt;/p&gt;

&lt;p&gt;Thanks for reading and may your pods keep running!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
    </item>
    <item>
      <title>Fixing ko local image publishing on MacOs</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Mon, 22 Jul 2024 13:37:55 +0000</pubDate>
      <link>https://dev.to/antweiss/fixing-ko-local-image-publishing-on-macos-2p0d</link>
      <guid>https://dev.to/antweiss/fixing-ko-local-image-publishing-on-macos-2p0d</guid>
      <description>&lt;h1&gt;
  
  
  Preamble:
&lt;/h1&gt;

&lt;p&gt;I still use Docker Desktop to run containers on my MacBook Air. I know there's &lt;a href="https://github.com/abiosoft/colima" rel="noopener noreferrer"&gt;Colima&lt;/a&gt; but I have no time to switch and deal with the consequences. &lt;br&gt;
I also recently started using &lt;a href="https://github.com/ko-build/ko/tree/main" rel="noopener noreferrer"&gt;ko&lt;/a&gt; for containerizing my Go apps. &lt;/p&gt;
&lt;h1&gt;
  
  
  ko is Great but...
&lt;/h1&gt;

&lt;p&gt;I love &lt;strong&gt;ko&lt;/strong&gt; - it builds secure, slim, distroless images. But there's one issue: by default, &lt;code&gt;ko build&lt;/code&gt; pushes the resulting image to the remote registry. &lt;br&gt;
That's kinda fine for continuous delivery, but I do a lot of experiments and I don't always want to publish all the garbage I create to a remote registry - trying to be considerate of network bandwidth and image storage.&lt;/p&gt;

&lt;p&gt;So instead I want to build my images into the local image storage. &lt;br&gt;
It's possible to do that with &lt;code&gt;ko build . -L&lt;/code&gt;.&lt;br&gt;
On macOS, though, this was failing for me with the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2024/07/22 15:52:50 Loading otomato/myapp:717e6196339c956bc878bd58f5ab8244a709dc0510051f9e6df72620f28a2aaa
2024/07/22 15:52:50 daemon.Write response:
Error: failed to publish images: error publishing ko://github.com/otomato/myapp: error loading image: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Calling the Docker daemon
&lt;/h1&gt;

&lt;p&gt;Clearly, the Docker client inside &lt;code&gt;ko&lt;/code&gt; is trying to contact the Docker daemon on the standard socket and failing.&lt;/p&gt;

&lt;p&gt;I tried googling for this error but didn't find anything, so I decided to solve it myself.&lt;br&gt;
Here's the thing - on macOS the Docker socket isn't at the standard &lt;code&gt;/var/run/docker.sock&lt;/code&gt;; instead it's at &lt;code&gt;~/Library/Containers/com.docker.docker/Data/docker.raw.sock&lt;/code&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  The Solution
&lt;/h1&gt;

&lt;p&gt;To fix this, I needed to create a symlink from the actual Docker socket to the path where the standard Docker client expects to find it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; ~/Library/Containers/com.docker.docker/Data/docker.raw.sock /var/run/docker.sock
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that the Docker daemon can be contacted via the standard socket address, ko can load images into it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ko build . -B -L --platform linux/arm64
2024/07/22 16:04:04 Building github.com/otomato/myapp for linux/arm64
2024/07/22 16:04:04 Loading otomato/myapp:717e6196339c956bc878bd58f5ab8244a709dc0510051f9e6df72620f28a2aaa
2024/07/22 16:04:05 Loaded otomato/myapp:717e6196339c956bc878bd58f5ab8244a709dc0510051f9e6df72620f28a2aaa
2024/07/22 16:04:05 Adding tag latest
2024/07/22 16:04:05 Added tag latest
otomato/myapp:717e6196339c956bc878bd58f5ab8244a709dc0510051f9e6df72620f28a2aaa
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Meanwhile I also opened an issue on the &lt;code&gt;ko&lt;/code&gt; repo. But until it's fixed, this hack works like a charm.&lt;/p&gt;
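&lt;p&gt;An alternative worth trying (I haven't verified it with &lt;code&gt;ko&lt;/code&gt; myself, so treat it as a sketch) is pointing Docker clients at the non-standard socket via the &lt;code&gt;DOCKER_HOST&lt;/code&gt; environment variable instead of symlinking:&lt;/p&gt;

```shell
# Hypothetical alternative to the symlink: tell Docker clients where the
# Docker Desktop socket actually lives via the DOCKER_HOST variable.
export DOCKER_HOST="unix://$HOME/Library/Containers/com.docker.docker/Data/docker.raw.sock"
echo "$DOCKER_HOST"
```

&lt;p&gt;The symlink still has the advantage of working for tools that ignore &lt;code&gt;DOCKER_HOST&lt;/code&gt;.&lt;/p&gt;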

&lt;p&gt;Hope this helps you too.&lt;/p&gt;

</description>
      <category>go</category>
      <category>containers</category>
      <category>cloudnative</category>
      <category>devops</category>
    </item>
    <item>
      <title>9 Ways to Spin Up an EKS Cluster - Way 3 - eksctl</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Thu, 27 Jun 2024 11:00:29 +0000</pubDate>
      <link>https://dev.to/aws-builders/9-ways-to-spin-up-an-eks-cluster-way-3-eksctl-2op9</link>
      <guid>https://dev.to/aws-builders/9-ways-to-spin-up-an-eks-cluster-way-3-eksctl-2op9</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.to/aws-builders/9-ways-to-an-eks-cluster-way-2-aws-cli-3g94"&gt;previous post&lt;/a&gt; I showed how to spin up an EKS cluster with pure shell and AWS CLI. (All the links to other posts in this series will be &lt;a href="https://dev.to/aws-builders/8-ways-to-spin-up-an-eks-cluster-210b"&gt;here&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;This used to be the easiest way of getting to a cluster without leaving your terminal. But pretty early in EKS history (2017) some smart folks from a company named Weaveworks(RIP) realized it was too cumbersome to do this using the &lt;code&gt;aws cli&lt;/code&gt; subcommand and that EKS is complex enough to deserve a command-line client of its own.  That's how &lt;code&gt;eksctl&lt;/code&gt; was born. &lt;/p&gt;

&lt;p&gt;A few months ago Weaveworks (who brought us a plethora of great OSS tools like Flux, Flagger and Weave) was shut down. But AWS had announced full support for eksctl back in 2019, so &lt;code&gt;eksctl&lt;/code&gt; is now the de-facto standard EKS CLI tool.&lt;/p&gt;

&lt;p&gt;The great thing about &lt;code&gt;eksctl&lt;/code&gt; is that it allows one to create and manage clusters not only using one-off commands with arguments but also with YAML configuration files - in a true and familiar IaC way.&lt;/p&gt;

&lt;p&gt;We'll check out both options but first let's install eksctl and generate an SSH key so we can connect to the nodes in the clusters we create if needed. Please note - I'm not endorsing SSH connections to your EKS nodes. Do avoid this if possible - so as not to cause inadvertent configuration drift. But sometimes we still need this for troubleshooting, especially in training environments. So let's have the SSH key handy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install eksctl
&lt;/h2&gt;

&lt;p&gt;If you're on Linux - here are the official instructions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# for ARM systems, set ARCH to: `arm64`, `armv6` or `armv7`&lt;/span&gt;
&lt;span class="nv"&gt;ARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;amd64
&lt;span class="nv"&gt;PLATFORM&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;uname&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;_&lt;span class="nv"&gt;$ARCH&lt;/span&gt;

curl &lt;span class="nt"&gt;-sLO&lt;/span&gt; &lt;span class="s2"&gt;"https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_&lt;/span&gt;&lt;span class="nv"&gt;$PLATFORM&lt;/span&gt;&lt;span class="s2"&gt;.tar.gz"&lt;/span&gt;

&lt;span class="c"&gt;# (Optional) Verify checksum&lt;/span&gt;
curl &lt;span class="nt"&gt;-sL&lt;/span&gt; &lt;span class="s2"&gt;"https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_checksums.txt"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nv"&gt;$PLATFORM&lt;/span&gt; | &lt;span class="nb"&gt;sha256sum&lt;/span&gt; &lt;span class="nt"&gt;--check&lt;/span&gt;

&lt;span class="nb"&gt;tar&lt;/span&gt; &lt;span class="nt"&gt;-xzf&lt;/span&gt; eksctl_&lt;span class="nv"&gt;$PLATFORM&lt;/span&gt;.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; /tmp &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm &lt;/span&gt;eksctl_&lt;span class="nv"&gt;$PLATFORM&lt;/span&gt;.tar.gz

&lt;span class="nb"&gt;sudo mv&lt;/span&gt; /tmp/eksctl /usr/local/bin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Please note this doesn't install such eksctl prerequisites as &lt;code&gt;kubectl&lt;/code&gt; and &lt;code&gt;aws-iam-authenticator&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;And if, like me, you're on a Mac - definitely use &lt;code&gt;brew&lt;/code&gt;, as it takes care of all the dependencies (even though the official &lt;code&gt;eksctl&lt;/code&gt; docs don't recommend it):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;brew tap weaveworks/tap
brew install weaveworks/tap/eksctl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And now - let's generate that ssh key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh-keygen  -f ./id_rsa -N ''
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create an &lt;code&gt;id_rsa&lt;/code&gt; and &lt;code&gt;id_rsa.pub&lt;/code&gt; in your current directory. Make sure to run the following &lt;code&gt;eksctl&lt;/code&gt; commands from the same directory and it will pick up this key by default.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sidenote - the VPC
&lt;/h3&gt;

&lt;p&gt;If you've read the previous post in this series (where we created an EKS cluster using the AWS CLI), you'll have noticed that creating the VPC was a separate step. The added value of &lt;code&gt;eksctl&lt;/code&gt; is that it takes care of most dependencies and add-ons for us without the need to run additional commands. The same is true for VPC creation: a new VPC with a default subnet configuration is created each time we spin up a new cluster, unless we explicitly specify that we want to reuse an existing VPC.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create an EKS cluster - eksctl with arguments
&lt;/h3&gt;

&lt;p&gt;The most straightforward way of creating an EKS cluster with &lt;code&gt;eksctl&lt;/code&gt; is providing all the arguments on the command line and letting the tool take care of the defaults. This approach, while limited and not very repeatable, can definitely give us a cluster. &lt;/p&gt;

&lt;p&gt;The command I provide here defines quite a number of settings I personally find important even for small toy clusters I spin up for fun and games. But &lt;code&gt;eksctl&lt;/code&gt; can do its job even with less stuff defined. Look in the official "Getting Started" docs if you want just the bare bones.&lt;/p&gt;

&lt;p&gt;So here's what I decided to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# First - define the environment. &lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CLUSTER_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;way3
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;eu-central-1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;K8S_VERSION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.30
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;NODE_TYPE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;t2.medium
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MIN_NODES&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MAX_NODES&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'm starting out with small nodes and already preparing the cluster for auto-scaling with min and max node definitions. It's important to note that &lt;code&gt;eksctl&lt;/code&gt; allows us to enable the IAM policy for ASG access and define the auto-scaling range, but it doesn't take care of installing &lt;code&gt;cluster-autoscaler&lt;/code&gt; - we'd need to do that separately, if we wanted to. On the other hand - these days it makes total sense to start out with Karpenter, which &lt;code&gt;eksctl&lt;/code&gt; does support, but not on the command line. We'll see how to configure Karpenter in the next section.&lt;/p&gt;
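&lt;p&gt;For the record, installing &lt;code&gt;cluster-autoscaler&lt;/code&gt; separately is just a couple of commands with Helm. Here's a sketch, assuming the community autoscaler chart - the release name and values are illustrative, and the commands are only printed here, to be run once &lt;code&gt;kubectl&lt;/code&gt; points at the new cluster:&lt;/p&gt;

```shell
# Sketch: install cluster-autoscaler with Helm after eksctl created the
# cluster with --asg-access. We only print the commands here; run them
# against a live cluster.
CLUSTER_NAME=way3
AWS_REGION=eu-central-1

CMD="helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=$CLUSTER_NAME \
  --set awsRegion=$AWS_REGION"

echo "$CMD"
```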

&lt;p&gt;And now - time to spin up the cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eksctl create cluster &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="nv"&gt;$CLUSTER_NAME&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="nv"&gt;$AWS_REGION&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--with-oidc&lt;/span&gt; &lt;span class="nt"&gt;--version&lt;/span&gt; &lt;span class="nv"&gt;$K8S_VERSION&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--nodegroup-name&lt;/span&gt; ng-&lt;span class="nv"&gt;$CLUSTER_NAME&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--node-type&lt;/span&gt; t2.medium &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--nodes&lt;/span&gt; 1 &lt;span class="nt"&gt;--nodes-min&lt;/span&gt; 1 &lt;span class="nt"&gt;--nodes-max&lt;/span&gt; 3 &lt;span class="se"&gt;\ &lt;/span&gt;
                      &lt;span class="nt"&gt;--spot&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--ssh-access&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--asg-access&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--external-dns-access&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--full-ecr-access&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
                      &lt;span class="nt"&gt;--alb-ingress-access&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command gives us a full-featured cluster with IAM policies for ECR access (&lt;code&gt;--full-ecr-access&lt;/code&gt;), the external DNS controller (&lt;code&gt;--external-dns-access&lt;/code&gt;), the ALB ingress controller (&lt;code&gt;--alb-ingress-access&lt;/code&gt;), OIDC support and more. It also runs its nodes on spot instances for cost optimization - which is totally fine for a toy cluster but may not be appropriate if the application you're planning to deploy isn't disruption-tolerant.&lt;/p&gt;

&lt;p&gt;From the command output we learn that in the background our command is converted into a couple of CloudFormation stacks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
2024-06-27 12:51:47 [ℹ]  will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
2024-06-27 12:51:47 [ℹ]  will create a CloudFormation stack for cluster itself and 1 managed nodegroup stack(s)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After about 15 minutes (depending on the weather and the region you've decided to use) CloudFormation returns and we can access our cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get node
NAME                                             STATUS   ROLES    AGE   VERSION
ip-192-168-56-76.eu-central-1.compute.internal   Ready    &amp;lt;none&amp;gt;   35m   v1.29.3-eks-ae9a62a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that the new cluster context is added to your &lt;code&gt;kubeconfig&lt;/code&gt; automatically. &lt;br&gt;
If you want to update the &lt;code&gt;kubeconfig&lt;/code&gt; at a later time you can use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eksctl utils write-kubeconfig &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nv"&gt;$CLUSTER_NAME&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="nv"&gt;$AWS_REGION&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But, as we already said - the CLI approach is limited. To do real IaC we want to put the cluster definitions in a YAML config file. This gives us a lot more capabilities and allows us to commit the config file to source control for further collaboration, change tracking and automation.&lt;/p&gt;

&lt;p&gt;But first - let's remove the cluster we just created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl delete cluster --region=$AWS_REGION --name=$CLUSTER_NAME
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Create an EKS cluster - eksctl with a config file
&lt;/h3&gt;

&lt;p&gt;The config file I provide here gives us everything we defined at the command line and more. As mentioned - it also allows us to install Karpenter in the same &lt;code&gt;eksctl&lt;/code&gt; execution - thus giving us an industry-standard auto-scaling EKS cluster with just-in-time node provisioning. You can grab this file on &lt;a href="https://github.com/antweiss/9-ways-2-EKS/tree/main/way-3-eksctl" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eksctl.io/v1alpha5&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterConfig&lt;/span&gt;

&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;way3&lt;/span&gt;
  &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eu-central-1&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.30"&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;karpenter.sh/discovery&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;way3&lt;/span&gt;
&lt;span class="na"&gt;iam&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;withOIDC&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;managedNodeGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ng-way3-1&lt;/span&gt;
    &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;worker&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
    &lt;span class="na"&gt;instanceType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;t2.medium&lt;/span&gt;
    &lt;span class="na"&gt;desiredCapacity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="na"&gt;minSize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;maxSize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
    &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;nodegrouprole&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;way3&lt;/span&gt;
    &lt;span class="na"&gt;volumeSize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
    &lt;span class="na"&gt;iam&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;withAddonPolicies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;externalDNS&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;certManager&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;awsLoadBalancerController&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;albIngress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;ebs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;efs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;imageBuilder&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;cloudWatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;ssh&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;allow&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="c1"&gt;# will use ~/.ssh/id_rsa.pub as the default ssh key&lt;/span&gt;

&lt;span class="na"&gt;karpenter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0.37.0'&lt;/span&gt;
  &lt;span class="na"&gt;createServiceAccount&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;withSpotInterruptionQueue&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An attentive eye will notice I've also defined some additional stuff such as &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html" rel="noopener noreferrer"&gt;CloudWatch logging of the control plane&lt;/a&gt;, plus EBS and EFS access. Consider removing these lines if you don't need them. &lt;br&gt;
You'll also notice that not only does it install Karpenter, it also takes care of setting up the SpotInterruptionQueue, which allows Karpenter to replace spot instances before they get reclaimed.&lt;br&gt;
And there are many additional options available.&lt;br&gt;
So yes - this is a very scalable approach, which takes care of more or less everything one might need in an EKS cluster.&lt;/p&gt;

&lt;p&gt;Execute this plan with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl create cluster -f cluster.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This again creates a CloudFormation execution that, provided we have all the necessary permissions, should complete successfully.&lt;/p&gt;

&lt;p&gt;Let's check that Karpenter got installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get pod &lt;span class="nt"&gt;-A&lt;/span&gt;
NAMESPACE     NAME                         READY   STATUS    RESTARTS   AGE
karpenter     karpenter-79db484bbf-flzzq   1/1     Running   0          32s
karpenter     karpenter-79db484bbf-nfhsp   1/1     Running   0          32s
kube-system   aws-node-8h4ln               2/2     Running   0          17m
kube-system   aws-node-vq8wj               2/2     Running   0          18m
kube-system   coredns-6f6d89bcc9-qx497     1/1     Running   0          24m
kube-system   coredns-6f6d89bcc9-wwjtp     1/1     Running   0          24m
kube-system   kube-proxy-8mnd2             1/1     Running   0          18m
kube-system   kube-proxy-c5zkp             1/1     Running   0          17m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yup, here it is! &lt;/p&gt;
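&lt;p&gt;One caveat worth remembering: Karpenter won't actually provision anything until we give it a NodePool (and a matching EC2NodeClass). Here's a minimal sketch of such a NodePool, assuming the v1beta1 API that Karpenter 0.37 uses - treat all values as illustrative:&lt;/p&gt;

```shell
# Sketch: a minimal Karpenter NodePool manifest (v1beta1 API, Karpenter 0.37).
# Values are illustrative; it references an EC2NodeClass named "default"
# that you'd need to create as well.
NODEPOOL='apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
  limits:
    cpu: "16"'

# Write the manifest to a file for review before applying
printf '%s\n' "$NODEPOOL" > nodepool.yaml
# kubectl apply -f nodepool.yaml   # run against the live cluster
```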

&lt;p&gt;The upside of using the config file is of course the ability to manage things in a somewhat idempotent way. So, for example, if we want to change our node group config - we can update the following lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ng-1&lt;/span&gt;
    &lt;span class="s"&gt;labels&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;worker&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
    &lt;span class="na"&gt;instanceType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;t2.medium&lt;/span&gt;
    &lt;span class="na"&gt;desiredCapacity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;minSize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;maxSize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and then run &lt;code&gt;eksctl update nodegroup -f cluster.yaml&lt;/code&gt; - this will update our NodeGroup autoscaling range.&lt;/p&gt;
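&lt;p&gt;For a quick one-off change we can also scale the nodegroup straight from the command line, without editing the config file. A sketch, reusing the names from above (only printed here - run it against a live cluster):&lt;/p&gt;

```shell
# Sketch: one-off scaling of a managed nodegroup from the CLI.
# We print the command; eval it against a live cluster.
CLUSTER_NAME=way3
CMD="eksctl scale nodegroup --cluster=$CLUSTER_NAME \
  --name=ng-$CLUSTER_NAME-1 \
  --nodes=2 --nodes-min=1 --nodes-max=5"
echo "$CMD"
# eval "$CMD"
```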

&lt;p&gt;And of course eksctl provides us with a plethora of additional commands that come in very handy for ongoing management of EKS clusters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eksctl &lt;span class="nt"&gt;-h&lt;/span&gt;
The official CLI &lt;span class="k"&gt;for &lt;/span&gt;Amazon EKS

Usage: eksctl &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;command&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;flags]

Commands:
  eksctl anywhere                        EKS anywhere
  eksctl associate                       Associate resources with a cluster
  eksctl completion                      Generates shell completion scripts &lt;span class="k"&gt;for &lt;/span&gt;bash, zsh or fish
  eksctl create                          Create resource&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;
  eksctl delete                          Delete resource&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;
  eksctl deregister                      Deregister a non-EKS cluster
  eksctl disassociate                    Disassociate resources from a cluster
  eksctl drain                           Drain resource&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;
  eksctl &lt;span class="nb"&gt;enable                          &lt;/span&gt;Enable features &lt;span class="k"&gt;in &lt;/span&gt;a cluster
  eksctl get                             Get resource&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;
  eksctl &lt;span class="nb"&gt;help                            &lt;/span&gt;Help about any &lt;span class="nb"&gt;command
  &lt;/span&gt;eksctl info                            Output the version of eksctl, kubectl and OS info
  eksctl register                        Register a non-EKS cluster
  eksctl scale                           Scale resources&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;
  eksctl &lt;span class="nb"&gt;set                             &lt;/span&gt;Set values
  eksctl &lt;span class="nb"&gt;unset                           &lt;/span&gt;Unset values
  eksctl update                          Update resource&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;
  eksctl upgrade                         Upgrade resource&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;
  eksctl utils                           Various utils
  eksctl version                         Output the version of eksctl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All in all - eksctl is the go-to tool for EKS management if you haven't already standardized your cloud platform on another IaC solution such as Terraform, Pulumi, CDK or others, which we'll look into in the following posts.&lt;/p&gt;

&lt;p&gt;Thanks for reading and may your clusters be lean!&lt;/p&gt;

&lt;p&gt;P.S. now you've got a cluster - why not start managing its cost and performance for free with &lt;a href="https://perfectscale.io" rel="noopener noreferrer"&gt;PerfectScale&lt;/a&gt; - the leading Kubernetes cost optimization solution? &lt;/p&gt;

&lt;p&gt;Join now to build clusters you can be proud of: &lt;a href="https://perfectscale.io" rel="noopener noreferrer"&gt;https://perfectscale.io&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>eks</category>
      <category>iac</category>
    </item>
    <item>
      <title>DevOps Shorts 028 - Peter Guagenti</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Thu, 11 Apr 2024 13:29:40 +0000</pubDate>
      <link>https://dev.to/antweiss/devops-shorts-028-peter-guagenti-10pj</link>
      <guid>https://dev.to/antweiss/devops-shorts-028-peter-guagenti-10pj</guid>
      <description>&lt;p&gt;I haven't posted about DevOps Shorts episodes here yet, I think. Mainly publishing them on my homepage at &lt;a href="https://antweiss.com" rel="noopener noreferrer"&gt;https://antweiss.com&lt;/a&gt;. But now I intend to re-post them here too for wider exposure.&lt;/p&gt;

&lt;p&gt;So here goes:&lt;/p&gt;

&lt;h2&gt;
  
  
  Peter Guagenti - The AI is an Iron Man Suit for the Mind
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcouncils.forbes.com%2Fprofile%2F_next%2Fimage%3Furl%3Dhttps%253A%252F%252Fs3.amazonaws.com%252Fcco-avatars%252F49668400-f321-45cf-9e7b-7b82383a3dd4.png%26w%3D256%26q%3D75" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcouncils.forbes.com%2Fprofile%2F_next%2Fimage%3Furl%3Dhttps%253A%252F%252Fs3.amazonaws.com%252Fcco-avatars%252F49668400-f321-45cf-9e7b-7b82383a3dd4.png%26w%3D256%26q%3D75" alt="Peter Guagenti"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Today's episode is a bit of a departure from my regular format. And it's symbolic. The evolution of GenAI is definitely changing how we work in IT. The change may not be very evident yet, but we all know it's coming. And we still need to understand what exactly changes - besides the drop in Stack Overflow's popularity, that is.&lt;/p&gt;

&lt;p&gt;That's why my guest this time is Peter Guagenti - the President and CMO at &lt;a href="https://www.tabnine.com/" rel="noopener noreferrer"&gt;Tabnine&lt;/a&gt; - the AI coding assistant. Peter has worked at Nginx, CockroachDB and SingleStore, so he has a deep understanding of platform tooling and open source. And today he's bringing the message of AI-assisted coding. And together we're trying to understand how that changes platform and DevOps work.&lt;/p&gt;

&lt;p&gt;Listen to the episode to learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why AI changes everything about how we work (in DevOps too)&lt;/li&gt;
&lt;li&gt;Where AI extends beyond code completion/generation&lt;/li&gt;
&lt;li&gt;What's the role of context awareness&lt;/li&gt;
&lt;li&gt;How it changes our creativity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The episode is live on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://open.spotify.com/episode/3fOYs0gHHHr29mnOArq1Fq?si=9510ebedf86047ec" rel="noopener noreferrer"&gt;Spotify&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://youtu.be/MViM52Bgcgo" rel="noopener noreferrer"&gt;Youtube&lt;/a&gt; - also embedded at the bottom of this post.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Watch out for new episodes!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This episode is brought to you by &lt;a href="https://perfectscale.io" rel="noopener noreferrer"&gt;PerfectScale&lt;/a&gt; - the automated K8s optimization and management platform&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;To watch DevOps Shorts 028:&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/MViM52Bgcgo"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

</description>
      <category>devops</category>
      <category>platformengineering</category>
      <category>podcast</category>
      <category>ai</category>
    </item>
    <item>
      <title>Adding a canonical url to dev.to posts (in basic markdown editor)</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Tue, 02 Apr 2024 15:33:47 +0000</pubDate>
      <link>https://dev.to/antweiss/adding-a-canonical-url-to-devto-posts-in-basic-markdown-editor-1enn</link>
      <guid>https://dev.to/antweiss/adding-a-canonical-url-to-devto-posts-in-basic-markdown-editor-1enn</guid>
      <description>&lt;p&gt;TLDR: &lt;br&gt;
add a markdown header: &lt;code&gt;canonical_url: https://your.url.here&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Took me some time to find this. The preamble is that I'm using the basic markdown editor on dev.to. Or at least I was until now. Somehow I wasn't even aware of the Rich+Markdown option arriving. Trying it for the first time right now - writing this little blurb.&lt;br&gt;
Anyway - I needed to update some of my older posts with the canonical links that were missing.&lt;br&gt;
Almost all the guides out there showed how to do it in the new editor - through the cog menu at the bottom of the editor. &lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6xumo0bsczx5h94gyglq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6xumo0bsczx5h94gyglq.png" alt="The cog menu" width="366" height="624"&gt;&lt;/a&gt;&lt;br&gt;
But I needed to add it to older posts which were edited with basic.&lt;br&gt;
I guessed it would come down to adding a header, but it took me a few googlings to find out which one.&lt;br&gt;
So yes - just add this header:&lt;br&gt;
&lt;code&gt;canonical_url: https://your.url.here&lt;/code&gt; &lt;br&gt;
And you're good.&lt;/p&gt;
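&lt;p&gt;For context - in the basic editor the header lives in the front matter block at the very top of the post. Something like this (title and URL are placeholders):&lt;/p&gt;

```yaml
# dev.to front matter in the basic markdown editor - placeholder values
---
title: My Older Post
published: true
canonical_url: https://your.url.here
---
```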

&lt;p&gt;Happy blogging!&lt;/p&gt;

</description>
      <category>blogging</category>
      <category>seo</category>
      <category>howtodevto</category>
      <category>canonical</category>
    </item>
    <item>
      <title>9 Ways to an EKS Cluster - Way 2 - AWS CLI</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Sun, 25 Feb 2024 17:08:10 +0000</pubDate>
      <link>https://dev.to/aws-builders/9-ways-to-an-eks-cluster-way-2-aws-cli-3g94</link>
      <guid>https://dev.to/aws-builders/9-ways-to-an-eks-cluster-way-2-aws-cli-3g94</guid>
      <description>&lt;p&gt;In my previous post I started out with &lt;a href="https://dev.to/aws-builders/8-ways-to-spin-up-an-eks-cluster-210b"&gt;&lt;strong&gt;Way 1 - Create an EKS Cluster in AWS Management Console&lt;/strong&gt;&lt;/a&gt;. (All the links to other posts in this series will be &lt;a href="https://dev.to/aws-builders/8-ways-to-spin-up-an-eks-cluster-210b"&gt;here&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;Using the management console is quick and intuitive. But, as discussed - real platform engineers don't click on buttons. Instead they manage their Infra as Code. And - not all code was created the same. We can identify 3 layers of IaC - each one increasing in complexity and, therefore - flexibility:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Command line (CLI)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;DSL + Interpreter (e.g. HCL + Terraform, YAML + Ansible, etc.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Pure programming language (with Boto3, CDK, Pulumi, etc)&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So today we will start from the most accessible layer - the AWS CLI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;The AWS Command Line Interface (AWS CLI) is a unified tool to manage all your AWS services (not just EKS). With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.&lt;/p&gt;

&lt;p&gt;In case you still don't have AWS CLI v2 - please follow the &lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html#getting-started-install-instructions" rel="noopener noreferrer"&gt;official instructions&lt;/a&gt; to get it installed.&lt;/p&gt;

&lt;p&gt;While at it - I heartily recommend installing &lt;a href="https://github.com/awslabs/aws-shell" rel="noopener noreferrer"&gt;aws-shell&lt;/a&gt;, which boosts your AWS CLI productivity by providing graphical autocompletion, hints and shortcuts as shown in the image below. I only discovered it recently myself and it's definitely a game changer!&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0fimdzeguhqjcpo3qk0j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0fimdzeguhqjcpo3qk0j.png" alt="aws-shell" width="800" height="168"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Of course once you get the cluster running it's highly desirable that you also have &lt;code&gt;kubectl&lt;/code&gt; and/or &lt;a href="https://flathub.org/apps/dev.k8slens.OpenLens" rel="noopener noreferrer"&gt;OpenLens&lt;/a&gt; installed to interact with the cluster. But that's not specific to this EKS provisioning method.&lt;/p&gt;
&lt;h2&gt;
  
  
  Creating a VPC
&lt;/h2&gt;

&lt;p&gt;I skipped this part in the description of &lt;strong&gt;Way 1&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;But when working with the CLI - it makes sense to wrap everything into one script (provided in the accompanying &lt;a href="https://github.com/antweiss/9-ways-2-EKS/blob/main/way-2-aws-cli/eks.sh" rel="noopener noreferrer"&gt;repo&lt;/a&gt;) so let's create the VPC right here. &lt;/p&gt;

&lt;p&gt;The required VPC config is non-trivial - with 2 public and 2 private subnets and all the associated gateways, route tables and security groups. Luckily, AWS provides a CloudFormation template to make this easier, so we'll just use that. &lt;/p&gt;

&lt;p&gt;The template defines a default IPv4 CIDR range for your VPC. Each node, Pod, and load balancer that you deploy is assigned an IPv4 address from this block. It provides enough IP addresses for most implementations, but if it doesn't, then you can change it. For more information, see &lt;a href="https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Subnets.html#VPC_Sizing" rel="noopener noreferrer"&gt;VPC and subnet sizing&lt;/a&gt; in the Amazon VPC User Guide.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create the VPC CloudFormation stack:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EXPORT AWS_REGION=eu-central-1
aws cloudformation create-stack --stack-name Way2VPC \
    --region $AWS_REGION \
    --template-url https://s3.us-west-2.amazonaws.com/amazon-eks/cloudformation/2020-10-29/amazon-eks-vpc-private-subnets.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Stack creation takes a few minutes but the CLI prompt returns immediately. In order to check the stack status please run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws cloudformation describe-stacks --stack-name Way2VPC \
    --region $AWS_REGION \
    --query 'Stacks[*].StackStatus'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
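&lt;p&gt;Alternatively, instead of polling, we can let the CLI block until the stack is ready. A sketch using the same stack name - the command requires AWS credentials and returns only when creation completes, hence it's only printed here:&lt;/p&gt;

```shell
# Sketch: block until the VPC stack reaches CREATE_COMPLETE instead of
# polling describe-stacks. Printed only - eval against a real account.
STACK_NAME=Way2VPC
AWS_REGION=eu-central-1
CMD="aws cloudformation wait stack-create-complete \
  --stack-name $STACK_NAME --region $AWS_REGION"
echo "$CMD"
# eval "$CMD"
```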



&lt;p&gt;Once it returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[
    "CREATE_COMPLETE"
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;we can continue to -&amp;gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating the Cluster Role
&lt;/h2&gt;

&lt;p&gt;As mentioned in the previous post - we have to create an IAM role that will allow the control plane of our EKS to manage its nodes. We will name our role for this blog post &lt;strong&gt;Way2EKSClusterRole&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The following comes from the &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html" rel="noopener noreferrer"&gt;official guide&lt;/a&gt;, but as all of these commands are executed from the CLI - it makes sense to put them here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create the policy file:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt;eks-cluster-role-trust-policy.json &amp;lt;&amp;lt;EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Create the IAM role:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws iam create-role --role-name Way2EKSClusterRole --assume-role-policy-document file://"eks-cluster-role-trust-policy.json"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Attach the Amazon EKS managed policy named &lt;a href="https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonEKSClusterPolicy.html#AmazonEKSClusterPolicy-json" rel="noopener noreferrer"&gt;AmazonEKSClusterPolicy&lt;/a&gt; to the role:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy --role-name Way2EKSClusterRole
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Finally - Let's Create the Cluster
&lt;/h2&gt;

&lt;p&gt;Now that we have the VPC and the role -  we can create the cluster.&lt;br&gt;
First - define the environment variables. Feel free to modify these as appropriate for your environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export REGION=eu-central-1
export CLUSTERNAME=way2
export K8S_VERSION=1.29
export ROLE_ARN=$(aws iam get-role --role-name Way2EKSClusterRole --query 'Role.Arn' --output text)
export SECURITY_GROUP=$(aws cloudformation describe-stacks --stack-name Way2VPC --region $REGION --query 'Stacks[*].Outputs[?OutputKey==`SecurityGroups`].OutputValue | [0] | [0]' --output text)
export SUBNET_IDS=$(aws cloudformation describe-stacks --stack-name Way2VPC --region $REGION --query 'Stacks[*].Outputs[?OutputKey==`SubnetIds`].OutputValue | [0] | [0]' --output text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note I'm using the &lt;code&gt;--query&lt;/code&gt; option to retrieve the necessary resource properties and &lt;code&gt;--output text&lt;/code&gt; to make sure they are not quoted. This is needed to use them as env vars in the following command that finally creates our cluster!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws eks create-cluster --region $REGION \
  --name $CLUSTERNAME \
  --kubernetes-version $K8S_VERSION \
  --role-arn $ROLE_ARN \
  --resources-vpc-config subnetIds=$SUBNET_IDS,securityGroupIds=$SECURITY_GROUP

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It takes several minutes to create the cluster.&lt;br&gt;
You can verify it's been created by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws eks describe-cluster --region $REGION --name $CLUSTERNAME --query "cluster.status"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
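&lt;p&gt;Rather than re-running that check by hand, we can poll until the status flips (the AWS CLI also ships a built-in waiter, &lt;code&gt;aws eks wait cluster-active&lt;/code&gt;). Here's a minimal polling sketch - &lt;code&gt;check_status&lt;/code&gt; and &lt;code&gt;wait_until_active&lt;/code&gt; are illustrative names; on a real cluster &lt;code&gt;check_status&lt;/code&gt; would wrap the &lt;code&gt;describe-cluster&lt;/code&gt; call above:&lt;/p&gt;

```shell
# Minimal polling sketch. check_status is a stand-in for:
#   aws eks describe-cluster --region $REGION --name $CLUSTERNAME \
#     --query "cluster.status" --output text
wait_until_active() {
  local tries=0
  until [ "$(check_status)" = "ACTIVE" ]; do
    tries=$((tries + 1))
    [ "$tries" -ge 60 ] && return 1      # give up after ~60 polls
    sleep "${POLL_INTERVAL:-30}"
  done
}
```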



&lt;p&gt;If the response is "ACTIVE" - we're good to go and connect to the cluster by generating a &lt;code&gt;kubeconfig&lt;/code&gt; definition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws eks update-kubeconfig --region $REGION --name  $CLUSTERNAME
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adding a NodeGroup
&lt;/h2&gt;

&lt;p&gt;Once this works and we can successfully run &lt;code&gt;kubectl get nodes&lt;/code&gt; - we recall we still need to add nodes.&lt;br&gt;
The options here are abound - we can choose between managed and unmanaged node groups (or even AWS Fargate), we can define which AMIs and instance types to choose and if the resulting machines will Spot or On-Demand. This is really beyond the scope of my post. Right here we'll opt for a minimum viable nodegroup - with defaults defined by AWS.&lt;/p&gt;
&lt;h2&gt;
  
  
  Creating the Node Role
&lt;/h2&gt;

&lt;p&gt;Our nodes also need an IAM Role - to pull container images from ECR, to assign IPs for the AWS CNI and a bunch of other stuff.&lt;/p&gt;

&lt;p&gt;Let's create that role:&lt;/p&gt;

&lt;p&gt;Define the trust relationship:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt;node-role-trust-relationship.json &amp;lt;&amp;lt;EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And now create the role and attach all the necessary policies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export NODE_ROLE_NAME=Way2EKSNodeRole
aws iam create-role \
  --role-name $NODE_ROLE_NAME \
  --assume-role-policy-document file://"node-role-trust-relationship.json"
aws iam attach-role-policy \
  --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy \
  --role-name $NODE_ROLE_NAME
aws iam attach-role-policy \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly \
  --role-name $NODE_ROLE_NAME
aws iam attach-role-policy \
  --policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy \
  --role-name $NODE_ROLE_NAME
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  A side note about IPv6
&lt;/h2&gt;

&lt;p&gt;All the commands I give only provision a cluster with IPv4 support, because that's what the majority of us need. Should you need IPv6 support - please refer to the official docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finally Create the NodeGroup
&lt;/h2&gt;

&lt;p&gt;There's a funny quirk in the way subnet ids are passed to this command. In &lt;code&gt;aws eks create-cluster&lt;/code&gt; subnet ids need to be comma-separated, but in &lt;code&gt;create-nodegroup&lt;/code&gt; they are for some reason expected to be separated by spaces... Go figure :))&lt;/p&gt;

&lt;p&gt;So first do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export SUBNET_IDS=$(echo $SUBNET_IDS | tr ',' ' ')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
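&lt;p&gt;To see what this does, here's the same &lt;code&gt;tr&lt;/code&gt; conversion applied to a made-up sample value (the subnet ids below are placeholders, not real ones):&lt;/p&gt;

```shell
# hypothetical subnet ids, for illustration only
SAMPLE="subnet-aaa111,subnet-bbb222,subnet-ccc333"
echo "$SAMPLE" | tr ',' ' '
# prints: subnet-aaa111 subnet-bbb222 subnet-ccc333
```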



&lt;p&gt;And then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export NODE_ROLE_ARN=$(aws iam get-role --role-name $NODE_ROLE_NAME --query 'Role.Arn' --output text)

aws eks create-nodegroup --cluster-name $CLUSTERNAME \
--nodegroup-name Way2NodeGroup \
--subnets $SUBNET_IDS \
--node-role $NODE_ROLE_ARN \
--region $REGION

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a NodeGroup with the default scaling params of&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; "minSize": 1,
 "maxSize": 2,
 "desiredSize": 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you need a different scaling config - modify this accordingly.&lt;/p&gt;

&lt;p&gt;After a few minutes we can recheck our nodes by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get node
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which should give us something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME                                               STATUS   ROLES    AGE     VERSION
ip-192-168-177-103.eu-central-1.compute.internal   Ready    &amp;lt;none&amp;gt;   5m57s   v1.29.0-eks-5e0fdde
ip-192-168-98-28.eu-central-1.compute.internal     Ready    &amp;lt;none&amp;gt;   5m58s   v1.29.0-eks-5e0fdde
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And we now have an EKS cluster all created from the CLI!&lt;br&gt;
And you can even connect it to PerfectScale to start monitoring and optimizing your Kubernetes resource usage right from the start - &lt;a href="https://app.perfectscale.io/account/sign-up" rel="noopener noreferrer"&gt;sign up here&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Creating an EKS cluster can be done with (relatively) simple AWS CLI commands. &lt;br&gt;
There are a lot of commands to run, and while they can be &lt;a href="https://github.com/antweiss/9-ways-2-EKS/blob/main/way-2-aws-cli/eks.sh" rel="noopener noreferrer"&gt;wrapped in a script&lt;/a&gt; and parameterized - it's still not a very good solution. The good thing is that we don't need anything besides the AWS CLI. Well, some CloudFormation, but it's AWS-provided. &lt;br&gt;
The worst part is that such a script isn't idempotent. Once we create all (or some of) these resources - the script won't run cleanly a second time. And removing all the resources we've created means a lot of manual work.&lt;/p&gt;

&lt;p&gt;And that's why we're going to explore &lt;a href="https://dev.to/aws-builders/8-ways-to-spin-up-an-eks-cluster-210b"&gt;additional ways of provisioning an EKS cluster&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;See you in the next installment of this series.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>eks</category>
      <category>kubernetes</category>
      <category>devops</category>
    </item>
    <item>
      <title>Exploring cgroups v2 and MemoryQoS With EKS and Bottlerocket</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Mon, 19 Feb 2024 14:49:05 +0000</pubDate>
      <link>https://dev.to/aws-builders/exploring-cgroups-v2-and-memoryqos-with-eks-and-bottlerocket-a7g</link>
      <guid>https://dev.to/aws-builders/exploring-cgroups-v2-and-memoryqos-with-eks-and-bottlerocket-a7g</guid>
      <description>&lt;p&gt;Bottlerocket is a Linux-based operating system optimized for hosting containers. It was originally developed at AWS specifically for runnning secure and performant Kubernetes nodes. It's minimal, secure and supports atomic updates.&lt;/p&gt;

&lt;p&gt;According to this &lt;a href="https://github.com/Bottlerocket-os/Bottlerocket/discussions/2874" rel="noopener noreferrer"&gt;discussion&lt;/a&gt; - starting with Bottlerocket 1.13.0 (Mar 2023) new distributions will default to using Cgroups v2 interface for process organization and enforcing resource limits.&lt;/p&gt;

&lt;p&gt;In this post I intend to explore how this works for EKS clusters running Kubernetes 1.26+ and what this change means for EKS users.&lt;/p&gt;

&lt;h1&gt;
  
  
  Cgroups - An Intro
&lt;/h1&gt;

&lt;p&gt;Cgroups (short for Control Groups) is a Linux kernel feature that lies at the foundation of what we now know as Linux containers.&lt;/p&gt;

&lt;p&gt;The feature makes it possible to limit, account for and isolate resource usage for a collection of processes.&lt;/p&gt;

&lt;p&gt;It was developed at Google circa 2007 and merged into Linux kernel mainline in 2008.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwizardzines.com%2Fimages%2Fuploads%2Fcgroups.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwizardzines.com%2Fimages%2Fuploads%2Fcgroups.png" alt="Julia Evans' wonderful cgroups comic"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Cgroups and Kubernetes
&lt;/h2&gt;

&lt;p&gt;Kubernetes allows us to define resource usage for containers via the &lt;a href="https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#resources" rel="noopener noreferrer"&gt;resources&lt;/a&gt; map in the Pod API spec.&lt;br&gt;
These definitions are then passed by the kubelet on to the container runtime on the node and translated into Cgroups configuration. &lt;/p&gt;

&lt;p&gt;Up until version 1.25 Kubernetes only supported Cgroups v1 by default. In 1.25, stable support for Cgroups v2 was added. Now, when running on a node with Cgroups v2, the kubelet automatically identifies this and performs accordingly. But what does this mean for our workload configuration? In order to understand that we need to explain what Cgroups v2 is.&lt;/p&gt;
&lt;h2&gt;
  
  
  Cgroups V2
&lt;/h2&gt;

&lt;p&gt;Cgroups v2 was released in 2015, introducing an API redesign - mainly a unified hierarchy and improved consistency. The following diagram shows the change in how Cgroup controllers are ordered in v2 vs. v1:&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fv2%2Fresize%3Afit%3A640%2Fformat%3Awebp%2F1%2AP7ZLLF_F4TMgGfaJ2XIfuQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fv2%2Fresize%3Afit%3A640%2Fformat%3Awebp%2F1%2AP7ZLLF_F4TMgGfaJ2XIfuQ.png" alt="cgroup hierarchy"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;According to &lt;a href="https://kubernetes.io/docs/concepts/architecture/cgroups/#:~:text=Some%20Kubernetes%20features%20exclusively%20use%20cgroup%20v2%20for%20enhanced%20resource%20management%20and%20isolation.%20For%20example%2C%20the%20MemoryQoS%20feature%20improves%20memory%20QoS%20and%20relies%20on%20cgroup%20v2%20primitives." rel="noopener noreferrer"&gt;this architecture document&lt;/a&gt; : &lt;em&gt;"Some Kubernetes features exclusively use cgroup v2 for enhanced resource management and isolation. For example, the MemoryQoS feature improves memory QoS and relies on cgroup v2 primitives."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And when we look at the description of the aforementioned &lt;em&gt;MemoryQoS&lt;/em&gt; feature we find out that "In cgroup v1, and prior to this feature, the container runtime never took into account and effectively ignored spec.containers[].resources.requests["memory"]." and that "Fortunately, cgroup v2 brings a new design and implementation to achieve full protection on memory... With this experimental feature, quality-of-service for pods and containers extends to cover not just CPU time but memory as well."&lt;/p&gt;

&lt;p&gt;Well, first of all - &lt;strong&gt;it's a bit shocking and even insulting to learn that container runtimes ignored our settings&lt;/strong&gt;! But I was also very curious to learn how this changes now that cgroups v2 support is introduced.&lt;/p&gt;
&lt;h2&gt;
  
  
  MemoryQoS and Cgroups v2
&lt;/h2&gt;

&lt;p&gt;According to this &lt;a href="https://kubernetes.io/blog/2021/11/26/qos-memory-resources/" rel="noopener noreferrer"&gt;page&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;Memory QoS uses the memory controller of cgroup v2 to guarantee memory resources in Kubernetes. Memory requests and limits of containers in a pod are used to set the interfaces &lt;code&gt;memory.min&lt;/code&gt; and &lt;code&gt;memory.high&lt;/code&gt; provided by the memory controller. When &lt;code&gt;memory.min&lt;/code&gt; is set to memory requests, memory resources are reserved and never reclaimed by the kernel; this is how Memory QoS ensures the availability of memory for Kubernetes pods. And if memory limits are set in the container, meaning the system needs to limit container memory usage, Memory QoS uses &lt;code&gt;memory.high&lt;/code&gt; to throttle a workload approaching its memory limit, ensuring that the system is not overwhelmed by instantaneous memory allocation.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fkubernetes.io%2Fblog%2F2021%2F11%2F26%2Fqos-memory-resources%2Fmemory-qos-cal.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fkubernetes.io%2Fblog%2F2021%2F11%2F26%2Fqos-memory-resources%2Fmemory-qos-cal.svg" alt="container memory in cgroup v2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is all great! Let's now provision an EKS cluster with some Bottlerocket nodes and see how this works in practice.&lt;/p&gt;

&lt;p&gt;To easily spin up a cluster - use the &lt;a href="https://github.com/antweiss/botllerocket-cgroupv2/blob/main/cluster.yaml" rel="noopener noreferrer"&gt;cluster.yaml&lt;/a&gt; in the attached github repository:&lt;/p&gt;

&lt;p&gt;generate ssh keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh-keygen &lt;span class="nt"&gt;-f&lt;/span&gt; ./mykey
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and create the cluster&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eksctl create cluster &lt;span class="nt"&gt;-f&lt;/span&gt; cluster.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a cluster with one Bottlerocket node. It also configures ssh access to the nodes by running the &lt;a href="https://github.com/Bottlerocket-os/Bottlerocket-admin-container" rel="noopener noreferrer"&gt;Bottlerocket admin container&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This means we can now access the node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;NODE_IP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get node &lt;span class="nt"&gt;-oyaml&lt;/span&gt; | yq  &lt;span class="s1"&gt;'.items[].status.addresses[] | select(.type=="ExternalIP") | .address'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
ssh &lt;span class="nt"&gt;-i&lt;/span&gt; mykey ec2-user@&lt;span class="nv"&gt;$NODE_IP&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We get greeted with the following screen:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkz1iy6w20r9g1seljoyw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkz1iy6w20r9g1seljoyw.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As this says - we can get admin access to the Bottlerocket filesystem by running &lt;code&gt;sudo sheltie&lt;/code&gt;. So let's do that!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;ec2-user@admin]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;sheltie
&lt;span class="o"&gt;[&lt;/span&gt;bash-5.1]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;whoami
&lt;/span&gt;root
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can check if we in fact have &lt;code&gt;cgroupv2&lt;/code&gt; enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;bash-5.1]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-fc&lt;/span&gt; %T /sys/fs/cgroup/
cgroup2fs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yup! This is cgroupv2! Were this &lt;code&gt;cgroupv1&lt;/code&gt;, the output would've been &lt;code&gt;tmpfs&lt;/code&gt;.&lt;/p&gt;
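&lt;p&gt;If you check this often, the mapping is easy to wrap in a tiny helper (a sketch - the function name is mine):&lt;/p&gt;

```shell
# Maps the filesystem type reported by `stat -fc %T /sys/fs/cgroup/`
# to a human-readable cgroup version. Helper name is illustrative.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "cgroup v2" ;;
    tmpfs)     echo "cgroup v1" ;;
    *)         echo "unknown" ;;
  esac
}

# usage on a live node:
# cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup/)"
```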

&lt;h2&gt;
  
  
  Let's Deploy a Pod
&lt;/h2&gt;

&lt;p&gt;Ok, now let's deploy a pod to our node. We'll do that by creating a deployment based on the following &lt;code&gt;yaml&lt;/code&gt; spec. This deploys &lt;a href="https://github.com/antweiss/busyhttp" rel="noopener noreferrer"&gt;antweiss/busyhttp&lt;/a&gt;, which I forked from &lt;a href="https://github.com/jpetazzo/busyhttp" rel="noopener noreferrer"&gt;jpetazzo/busyhttp&lt;/a&gt; and added memory load and release endpoints to.&lt;br&gt;
You'll notice that the pod runs a container with Guaranteed QoS - i.e memory and CPU limits are equal to requests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;busyhttp&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;busyhttp&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;busyhttp&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;busyhttp&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;otomato/busyhttp&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;busyhttp&lt;/span&gt;
        &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;200Mi"&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;250m"&lt;/span&gt;
          &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;200Mi"&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;250m"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This spec is found in &lt;a href="https://github.com/antweiss/botllerocket-cgroupv2/blob/main/dep.yaml" rel="noopener noreferrer"&gt;dep.yaml&lt;/a&gt; and we can deploy it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; dep.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Check the Cgroup Impact
&lt;/h2&gt;

&lt;p&gt;Now let's go back to our node and see how our resource definitions are reflected in the &lt;code&gt;cgroup&lt;/code&gt; config.&lt;/p&gt;

&lt;p&gt;Back inside the &lt;code&gt;sheltie&lt;/code&gt; prompt, let's explore the containers running on Bottlerocket. Bottlerocket uses the &lt;code&gt;containerd&lt;/code&gt; container runtime. In order to interact with it we'll need &lt;a href="https://github.com/projectatomic/containerd/blob/master/docs/cli.md" rel="noopener noreferrer"&gt;&lt;code&gt;ctr&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When we run &lt;code&gt;ctr help&lt;/code&gt; - we get the following:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpo39grtb6hof9mcdzgee.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpo39grtb6hof9mcdzgee.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So &lt;code&gt;ctr&lt;/code&gt; is unsupported. A bit discouraging, but well, it's working. Let's try to look at our containers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bash-5.1&lt;span class="nv"&gt;$ &lt;/span&gt;ctr containers &lt;span class="nb"&gt;ls
&lt;/span&gt;CONTAINER    IMAGE    RUNTIME
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No containers?! But I do see my pod running on the node! Where is my container? Well, the answer to that is &lt;code&gt;namespaces&lt;/code&gt;. Yup, just like Kubernetes or the Linux kernel - containerd has namespaces. And all the containers executed by the kubelet live in a namespace called "k8s.io". We can see it by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bash-5.1&lt;span class="nv"&gt;$ &lt;/span&gt;ctr ns &lt;span class="nb"&gt;ls
&lt;/span&gt;NAME   LABELS
k8s.io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ok, let's check the containers in the "k8s.io" namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bash-5.1&lt;span class="nv"&gt;$ &lt;/span&gt;ctr &lt;span class="nt"&gt;-n&lt;/span&gt; k8s.io containers &lt;span class="nb"&gt;ls
&lt;/span&gt;CONTAINER                                                           IMAGE                                                                                                RUNTIME
0ed99eae66803896504d1853859d8866e00669b2610ba65cba6a17aa1300da48    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/pause:3.1-eksbuild.1                             io.containerd.runc.v2
154d9b7b3a83e4db6e3e4ac4ac1f836321337c604c3b590b5188b7a0773bdae1    docker.io/otomato/busyhttp:latest                                                                    io.containerd.runc.v2
3b63efe56d15e9c315c668a5913e17ade420cf7fb5ff7fa62b3c9b0e1574eab4    112233445566.dkr.ecr.eu-central-1.amazonaws.com/amazon/aws-network-policy-agent:v1.0.7-eksbuild.1    io.containerd.runc.v2
3c85fa3829f59f517db1c766e490a014357a760ce12e2859004cdfb8ea3d7cc6    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/pause:3.1-eksbuild.1                             io.containerd.runc.v2
420b62c7b2bdf2e7aa10baf8e4afd1ebda0cfff66300a23846758a029ad31222    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/pause:3.1-eksbuild.1                             io.containerd.runc.v2
4e21008f8fc70580906990fb95bed91f9155495270fbac1efb043f81e62a1c51    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.11.1-eksbuild.4                       io.containerd.runc.v2
4f9d20de851414160a5003eb6988f2b0df81dfe3d72d4ba3705db01a4571b515    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/kube-proxy:v1.29.0-minimal-eksbuild.1            io.containerd.runc.v2
5f8924422d852d09ad44f5d8579d9abaa78304d303007d566300db8f61978ee5    112233445566.dkr.ecr.eu-central-1.amazonaws.com/amazon-k8s-cni:v1.16.0-eksbuild.1                    io.containerd.runc.v2
7f5afdfbb9a8599c3c5888664f0df349aab8740be21d87e629ff7390e0524c2a    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/pause:3.1-eksbuild.1                             io.containerd.runc.v2
80ed8ba3fd4624770eb17087c1a046c90be28e9fb2e31630c82e67b4c0ae19dd    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/pause:3.1-eksbuild.1                             io.containerd.runc.v2
a0929884ccdd72de6bf848a037e80b206a4fb4e2f9b77be568bac8f51787cccb    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/pause:3.1-eksbuild.1                             io.containerd.runc.v2
a0dd24ecb0aee4eb645d25c75a3eade0c8c35fb09127db5ff7d8136d7bb86efe    112233445566.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.11.1-eksbuild.4                       io.containerd.runc.v2
c3a3a9214b11b808a88c0293312d8877497f06f87435af3a7334717a13588c26    112233445566.dkr.ecr.eu-central-1.amazonaws.com/amazon-k8s-cni-init:v1.16.0-eksbuild.1               io.containerd.runc.v2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we're talking! We have all the usual suspects here - coredns, kube-proxy, the omnipresent &lt;a href="https://docs.mirantis.com/mke/3.4/ref-arch/pause-containers.html" rel="noopener noreferrer"&gt;pause&lt;/a&gt; containers. But right now we're interested in the container based on the &lt;code&gt;docker.io/otomato/busyhttp:latest&lt;/code&gt; image.&lt;/p&gt;

&lt;p&gt;Let's look for its cgroup definition in the cgroup filesystem we discovered previously. First we need to extract the container id. &lt;code&gt;ctr&lt;/code&gt; supports filters for its listing function. So the way to parse out the container id by image name is the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export CONTAINER_ID=$(ctr -n k8s.io containers ls -q image==docker.io/otomato/busyhttp:latest)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the &lt;code&gt;-q&lt;/code&gt; that tells &lt;code&gt;ctr&lt;/code&gt; to only output the id.&lt;/p&gt;
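&lt;p&gt;One thing to watch out for: if the image isn't actually running, &lt;code&gt;ctr ... -q&lt;/code&gt; prints nothing, &lt;code&gt;CONTAINER_ID&lt;/code&gt; ends up empty, and the &lt;code&gt;find&lt;/code&gt; below would then match far too much. A small guard sketch (the helper name is mine):&lt;/p&gt;

```shell
# Fail early if a variable we depend on came back empty.
# $1 - the value to check, $2 - a name for the error message.
require_nonempty() {
  [ -n "$1" ] || { echo "error: $2 is empty" >&2; return 1; }
}

# usage:
# require_nonempty "$CONTAINER_ID" CONTAINER_ID || exit 1
```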

&lt;p&gt;Now we can find the container's cgroup config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find /sys/fs/cgroup/ &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="nv"&gt;$CONTAINER_ID&lt;/span&gt;&lt;span class="k"&gt;*&lt;/span&gt;
/sys/fs/cgroup/kubepods.slice/kubepods-pod5be5d94a_cbfe_416f_9010_6338003af666.slice/cri-containerd-154d9b7b3a83e4db6e3e4ac4ac1f836321337c604c3b590b5188b7a0773bdae1.scope
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives us a long path somewhere inside a folder called &lt;code&gt;kubepods.slice&lt;/code&gt;. Let's wrap this path in an environment variable and look around:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export MY_CGROUP_DIR=$(find /sys/fs/cgroup/ -name *$CONTAINER_ID*)
ls ${MY_CGROUP_DIR}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Whew! That's a lot of files! Now according to &lt;a href="https://kubernetes.io/blog/2021/11/26/qos-memory-resources/" rel="noopener noreferrer"&gt;this page on Memory QoS&lt;/a&gt; - our &lt;code&gt;requests.memory&lt;/code&gt; should be translated to &lt;code&gt;memory.min&lt;/code&gt; while &lt;code&gt;memory.high&lt;/code&gt; is calculated the following way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;memory,high = (pod.spec.containers[i].resources.limits[memory] or nodeAllocatableMemory) * throttlingFactor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
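&lt;p&gt;Plugging in our pod's 200Mi limit: if Memory QoS were active, with the 0.8 throttling factor used as the default in the alpha implementation described on that page (later releases changed the default, so verify against your kubelet version), we'd expect roughly:&lt;/p&gt;

```shell
# illustrative arithmetic only - 0.8 is the alpha-era default
# memoryThrottlingFactor; check your kubelet version's default
LIMIT=$((200 * 1024 * 1024))   # our 200Mi limit = 209715200 bytes
awk "BEGIN { printf \"%d\n\", $LIMIT * 0.8 }"
# prints: 167772160
```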



&lt;p&gt;Let's look at the limit first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat ${MY_CGROUP_DIR}/memory.high
max
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hmm. That's not a number. But we can also notice that there's a file called &lt;code&gt;memory.max&lt;/code&gt;. Let's look inside that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat ${MY_CGROUP_DIR}/memory.high
209715200
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ok, here's our limit! 209715200 bytes is exactly the 200Mi we defined in the &lt;code&gt;resources&lt;/code&gt; section of our pod spec.&lt;/p&gt;

&lt;p&gt;Now what about the requests? Let's look at &lt;code&gt;memory.min&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat ${MY_CGROUP_DIR}/memory.min
0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;0 is not the request we've defined. And that makes sense. &lt;a href="https://kubernetes.io/blog/2021/11/26/qos-memory-resources/" rel="noopener noreferrer"&gt;Memory QoS&lt;/a&gt; has been in alpha since Kubernetes 1.22 (August 2021) and according to the &lt;a href="https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2570-memory-qos/kep.yaml" rel="noopener noreferrer"&gt;KEP data&lt;/a&gt; was still in alpha as of 1.27.&lt;/p&gt;

&lt;p&gt;In order to see the actual request values for memory reflected in cgroup config one needs to enable the Memory QoS &lt;a href="https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/#kubelet-config-k8s-io-v1beta1-KubeletConfiguration:~:text=effect.%20Default%3A%2015-,featureGates,-map%5Bstring%5Dbool" rel="noopener noreferrer"&gt;feature gate in kubelet config&lt;/a&gt; as defined &lt;a href="https://github.com/kubernetes/kubernetes/blob/master/pkg/features/kube_features.go#L492" rel="noopener noreferrer"&gt;here&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kubelet.config.k8s.io/v1beta1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;KubeletConfiguration&lt;/span&gt;
&lt;span class="na"&gt;featureGates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;MamoryQoS&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Trouble is - due to the atomic nature of Bottlerocket OS - we can't change its KubeletConfiguration file (found at /etc/kubernetes/kubelet/config) directly. We can only pass settings through &lt;code&gt;settings.kubernetes&lt;/code&gt; via the API or a config file - and these currently don't support setting feature gates. So it looks like the only way to modify the kubelet to support Memory QoS on EKS Bottlerocket nodes is to build our own Bottlerocket images. Which is a subject for a whole other blog post.&lt;/p&gt;
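
&lt;p&gt;For reference - this is roughly how Bottlerocket settings get passed in an eksctl nodegroup spec (a sketch with illustrative values - note there's no key for kubelet feature gates here):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;managedNodeGroups:
  - name: bottlerocket-ng
    amiFamily: Bottlerocket
    bottlerocket:
      settings:
        kubernetes:
          max-pods: 20
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
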

&lt;p&gt;And for now - let's shrug our shoulders, scratch our heads and bring down our EKS cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eksctl delete cluster -f cluster.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Summing it All Up
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;cgroup v2&lt;/code&gt; is enabled by default in current Bottlerocket EKS instances. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;this allows better-organized resource management on the nodes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;an important Kubernetes feature based on &lt;code&gt;cgroup v2&lt;/code&gt; is Memory QoS, which ensures that memory requests are actually allocated by the container runtime and not merely checked by the Kubernetes scheduler&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Memory QoS is still in &lt;code&gt;alpha&lt;/code&gt; after 2 years&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There's no easy way to enable Memory QoS on Bottlerocket nodes without building the AMIs ourselves.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
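
&lt;p&gt;By the way - a quick way to verify which cgroup version a node is running is checking the filesystem type mounted at /sys/fs/cgroup (on cgroup v2 it's &lt;code&gt;cgroup2fs&lt;/code&gt;, on v1 it's &lt;code&gt;tmpfs&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;stat -fc %T /sys/fs/cgroup
cgroup2fs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
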

&lt;p&gt;Anyway - this was an interesting exploration. And if there's anything I got wrong or didn't make clear - please let me know in the comments.&lt;/p&gt;

&lt;p&gt;May all your containers run smoothly!&lt;/p&gt;

&lt;p&gt;The config files used in the blog post can be found &lt;a href="https://github.com/antweiss/botllerocket-cgroupv2" rel="noopener noreferrer"&gt;in this github repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This post was originally published &lt;a href="https://antweiss.com/blog/exploring-cgroups-v2-and-memoryqos-with-eks-and-bottlerocket/" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>9 Ways to Spin Up an EKS Cluster - Way 1 - the Console</title>
      <dc:creator>Ant(on) Weiss</dc:creator>
      <pubDate>Mon, 22 Jan 2024 18:27:43 +0000</pubDate>
      <link>https://dev.to/aws-builders/8-ways-to-spin-up-an-eks-cluster-210b</link>
      <guid>https://dev.to/aws-builders/8-ways-to-spin-up-an-eks-cluster-210b</guid>
      <description>&lt;h1&gt;
  
  
  We love EKS!
&lt;/h1&gt;

&lt;p&gt;If you're running on AWS - the best, most hassle-free way to get a Kubernetes cluster is EKS - the Elastic Kubernetes Service. The control plane of EKS clusters is fully managed by AWS, while the data plane - i.e. the worker nodes - can be defined and managed by the user in various available configurations. &lt;/p&gt;

&lt;p&gt;As with anything in modern cloud services - there are a number of ways to create and manage EKS clusters. Organizations just starting to build out their delivery platform need to choose a provisioning and management method. This choice has a significant impact on how their platform evolves - yet the criteria for making it often aren't clear. &lt;/p&gt;

&lt;p&gt;In this series I intend to give an overview of all the different options and provide a rundown of the benefits and downsides of each method.&lt;/p&gt;

&lt;p&gt;And here's our list:&lt;/p&gt;

&lt;h3&gt;
  
  
  Way 1 - Create an EKS Cluster in AWS Management Console
&lt;/h3&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://dev.to/aws-builders/9-ways-to-an-eks-cluster-way-2-aws-cli-3g94"&gt;Way 2 - Create an EKS Cluster in AWS cli&lt;/a&gt;
&lt;/h3&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://dev.to/aws-builders/9-ways-to-spin-up-an-eks-cluster-way-3-eksctl-2op9"&gt;Way 3 - Create an EKS Cluster with eksctl&lt;/a&gt;
&lt;/h3&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://dev.to/aws-builders/9-ways-to-spin-up-an-eks-cluster-way-4-cloudformation-3len"&gt;Way 4 - Create an EKS Cluster with CloudFormation&lt;/a&gt;
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Way 5 - Create an EKS Cluster with python and boto3
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Way 6 - Create an EKS Cluster with AWS CDK
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Way 7 - Create an EKS Cluster with Terraform
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Way 8 - Create an EKS Cluster with Pulumi
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Way 9 - Create an EKS Cluster with Crossplane
&lt;/h3&gt;

&lt;p&gt;In fact - the first 3 ways listed here (Management Console, AWS CLI and eksctl) are all laid out in &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html" rel="noopener noreferrer"&gt;this AWS guide&lt;/a&gt;, so I won't go into too much technical detail. But some things are still worth noting.&lt;/p&gt;

&lt;p&gt;So, without further ado - let's start!&lt;/p&gt;

&lt;h2&gt;
  
  
  Way 1 - Create an EKS Cluster in AWS Management Console
&lt;/h2&gt;

&lt;p&gt;So the fastest, most straightforward way of provisioning any AWS service is of course by going to the console and clicking your way through. No need to install anything on your computer, no need to learn new tools and languages.&lt;/p&gt;

&lt;p&gt;And it's actually so easy! Just go to your AWS Management Console, find EKS in the list of available services and proceed to "Add Cluster -&amp;gt; Create":&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcui751kyqnk5ewvu75zh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcui751kyqnk5ewvu75zh.png" alt="Add Cluster" width="800" height="179"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Right? Wrong!&lt;br&gt;
In fact - before clicking your way to a cluster you need to:&lt;/p&gt;

&lt;p&gt;a) Create a VPC and subnets that meet &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/network_reqs.html" rel="noopener noreferrer"&gt;Amazon EKS requirements&lt;/a&gt;. &lt;br&gt;
b) Create a Cluster Role in AWS IAM by following this &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/service_IAM_role.html#create-service-role" rel="noopener noreferrer"&gt;guide&lt;/a&gt;.&lt;/p&gt;
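
&lt;p&gt;If you prefer, prerequisite (b) can also be done from the CLI - a rough sketch (the role name is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws iam create-role \
  --role-name myEKSClusterRole \
  --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"eks.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

aws iam attach-role-policy \
  --role-name myEKSClusterRole \
  --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
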

&lt;p&gt;And then you can click your way through!&lt;/p&gt;
&lt;h2&gt;
  
  
  On choosing the Kubernetes version
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0w3lrdz9vq9nvsmf9pk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0w3lrdz9vq9nvsmf9pk.png" alt="Choose k8s version" width="800" height="148"&gt;&lt;/a&gt;&lt;br&gt;
This is something we need to consider for all the methods listed. Unless some specific limitation prevents you - always choose the latest version (currently it's 1.29). AWS makes sure to test the versions they provide and regularly deprecates older versions. Each Kubernetes version gets &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html" rel="noopener noreferrer"&gt;14 months of standard support&lt;/a&gt; and upgrading your production cluster can get nerve-racking and time-consuming. So again - make sure to always choose the latest one.&lt;/p&gt;
&lt;h2&gt;
  
  
  A note on observability
&lt;/h2&gt;

&lt;p&gt;The third screen you need to click through when creating EKS from the console is the Observability one. This currently allows you to enable EKS monitoring using &lt;a href="https://aws.amazon.com/prometheus/" rel="noopener noreferrer"&gt;Amazon Managed Service for Prometheus&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;You only need this if you're not using a 3rd-party observability service (like Datadog or New Relic) - all of the major ones support monitoring EKS today, and you can set that up at a later stage.&lt;/p&gt;
&lt;h2&gt;
  
  
  Creating some nodes
&lt;/h2&gt;

&lt;p&gt;After you've successfully clicked through, waited a while and finally saw the cluster state in the console change from "Creating" to "Active" - it's time to connect to the control plane from your kubectl client.&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4eny6ukx3t8sg412vxhp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4eny6ukx3t8sg412vxhp.png" alt="Cluster active" width="800" height="140"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's where you'll need the AWS CLI, even if you've used the console for everything else until now. Get the kubeconfig:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws eks update-kubeconfig --name mycluster --region eu-central-1

Added new context arn:aws:eks:eu-central-1:XXXXXXXXXXX:cluster/mycluster to /Users/antweiss/.kube/config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Try to look at the nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(⎈ | mycluster:default)➜  kubectl get node
No resources found
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that's where we realize we still need to create the nodes!&lt;br&gt;
This can be done by going to EKS-&amp;gt;Clusters-&amp;gt;mycluster-&amp;gt;Compute and choosing either to use self-managed nodes, create a managed &lt;em&gt;Node Group&lt;/em&gt; or utilize a &lt;em&gt;Fargate Profile&lt;/em&gt;.&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vw97grysztfvzd5wah4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vw97grysztfvzd5wah4.png" alt="Add some nodes" width="800" height="543"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Which option to use for your EKS nodes is a topic for a whole separate post, so I won't go into it here. You can consult &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/eks-compute.html" rel="noopener noreferrer"&gt;this page&lt;/a&gt; for a basic comparison of all these options. Or drop me a note in the comments if you'd like my advice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Provisioning EKS from the Management Console - the Bottom Line
&lt;/h2&gt;

&lt;p&gt;As we saw in this post - the manual method is kinda straightforward, but it still leaves a lot of details for us to take care of.&lt;br&gt;
In addition - this method doesn't scale well. It can work ok for a couple of small clusters, but once we are in production - running at scale, across multiple geographical regions - managing things by hand becomes too slow and error-prone. Professional platform engineers manage their &lt;em&gt;infrastructure as code&lt;/em&gt;. &lt;br&gt;
And that will be shown in the upcoming installments of this series.&lt;/p&gt;

&lt;p&gt;Subscribe for updates and leave your comments if there's anything unclear or plain wrong! ;)&lt;/p&gt;

</description>
      <category>eks</category>
      <category>aws</category>
      <category>kubernetes</category>
    </item>
  </channel>
</rss>
