<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yoshik Karnawat</title>
    <description>The latest articles on DEV Community by Yoshik Karnawat (@yoshik_karnawat).</description>
    <link>https://dev.to/yoshik_karnawat</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3416078%2F1eb0d2c4-4320-41b7-a44c-35acb2dd2a0b.jpg</url>
      <title>DEV Community: Yoshik Karnawat</title>
      <link>https://dev.to/yoshik_karnawat</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yoshik_karnawat"/>
    <language>en</language>
    <item>
      <title>Kubernetes Autoscaling: Never Manually Scale Again</title>
      <dc:creator>Yoshik Karnawat</dc:creator>
      <pubDate>Sun, 17 Aug 2025 11:54:04 +0000</pubDate>
      <link>https://dev.to/yoshik_karnawat/kubernetes-autoscaling-never-manually-scale-again-5dfk</link>
      <guid>https://dev.to/yoshik_karnawat/kubernetes-autoscaling-never-manually-scale-again-5dfk</guid>
      <description>&lt;p&gt;As a Site Reliability Engineer who has managed countless production Kubernetes clusters, I've learned that one of the biggest challenges teams face is properly sizing their applications. Too little resources? Your app crashes under load. Too much? You're burning money unnecessarily. &lt;/p&gt;

&lt;p&gt;That's where Kubernetes autoscaling comes to the rescue.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Autoscaling and Why Should You Care?
&lt;/h2&gt;

&lt;p&gt;Imagine you're running an online store. During normal hours, you might need 3 servers to handle traffic. But during Black Friday sales, you suddenly need 20 servers. Then afterward, you scale back down to save costs.&lt;/p&gt;

&lt;p&gt;Kubernetes autoscaling does exactly this - but automatically. No more 3 AM wake-up calls to manually add servers during traffic spikes.&lt;/p&gt;

&lt;p&gt;There are two main types of autoscaling in Kubernetes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Horizontal Pod Autoscaler (HPA)&lt;/strong&gt;: Adds more copies of your application&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vertical Pod Autoscaler (VPA)&lt;/strong&gt;: Gives more power (CPU/memory) to existing copies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of HPA as hiring more cashiers during busy hours, while VPA is like giving your existing cashiers faster computers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Horizontal Pod Autoscaler (HPA): Adding More Workers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How HPA Works
&lt;/h3&gt;

&lt;p&gt;HPA continuously monitors your application's resource usage. When CPU or memory usage gets too high, it automatically creates more pod copies to handle the load. When traffic decreases, it scales back down.&lt;/p&gt;

&lt;p&gt;Here's the process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Monitor&lt;/strong&gt;: HPA checks your app's metrics every 15 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calculate&lt;/strong&gt;: It determines if more or fewer pods are needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scale&lt;/strong&gt;: It adds or removes pods accordingly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wait&lt;/strong&gt;: It waits for the system to stabilize before making another decision&lt;/li&gt;
&lt;/ol&gt;
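
&lt;p&gt;The "Calculate" step boils down to one formula: desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization). Here's that arithmetic as a quick shell sketch, with made-up numbers (3 replicas, 90% observed CPU, 70% target):&lt;br&gt;
&lt;/p&gt;

```shell
# HPA's core formula: desired = ceil(current * observed / target).
# The numbers below are illustrative; HPA reads them from live metrics.
current_replicas=3
observed_util=90   # average CPU utilization across pods (%)
target_util=70     # averageUtilization from the HPA spec
# integer ceiling division
desired=$(( (current_replicas * observed_util + target_util - 1) / target_util ))
echo "$desired"    # prints 4: HPA would scale from 3 to 4 pods
```

&lt;p&gt;The same formula scales down too: 8 pods at 30% CPU against a 70% target gives ceil(8 * 30 / 70) = 4 pods.&lt;/p&gt;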

&lt;h3&gt;
  
  
  Setting Up HPA
&lt;/h3&gt;

&lt;p&gt;Let's say you have a web application that should scale up when CPU usage exceeds 70%. Here's how you set it up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;autoscaling/v2&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HorizontalPodAutoscaler&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app-hpa&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scaleTargetRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app&lt;/span&gt;
  &lt;span class="na"&gt;minReplicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;      &lt;span class="c1"&gt;# Never go below 2 pods&lt;/span&gt;
  &lt;span class="na"&gt;maxReplicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;     &lt;span class="c1"&gt;# Never exceed 10 pods&lt;/span&gt;
  &lt;span class="na"&gt;metrics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Resource&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cpu&lt;/span&gt;
      &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Utilization&lt;/span&gt;
        &lt;span class="na"&gt;averageUtilization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;70&lt;/span&gt;  &lt;span class="c1"&gt;# Scale when CPU &amp;gt; 70%&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Real-World Example
&lt;/h3&gt;

&lt;p&gt;I once worked with an e-commerce company that saw traffic spikes during lunch hours (12-2 PM). Without HPA, their app would slow down terribly during these periods, causing customers to abandon their shopping carts.&lt;/p&gt;

&lt;p&gt;After implementing HPA:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before lunch rush&lt;/strong&gt;: 3 pods running (normal load)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;During lunch rush&lt;/strong&gt;: Automatically scaled to 8 pods&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After lunch rush&lt;/strong&gt;: Gradually scaled back to 3 pods&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result? Page load times stayed consistent, and they processed 40% more orders during peak hours.&lt;/p&gt;

&lt;h3&gt;
  
  
  HPA Best Practices
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Set Resource Requests&lt;/strong&gt;&lt;br&gt;
Your pods MUST have CPU and memory requests defined. HPA computes utilization as a percentage of the requested amount, so it can't work without them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100m&lt;/span&gt;        &lt;span class="c1"&gt;# Required for HPA&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;128Mi&lt;/span&gt;
  &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;500m&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;512Mi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Don't Set Min Replicas Too Low&lt;/strong&gt;&lt;br&gt;
Always keep at least 2 replicas running. If your single pod crashes, your entire service goes down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Monitor Scaling Events&lt;/strong&gt;&lt;br&gt;
Use these commands to see what HPA is doing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get hpa
kubectl describe hpa web-app-hpa
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Avoid Flapping&lt;/strong&gt;&lt;br&gt;
If your app scales up and down too frequently, increase the stabilization window:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;behavior&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scaleDown&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;stabilizationWindowSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;  &lt;span class="c1"&gt;# Wait 5 minutes before scaling down&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Vertical Pod Autoscaler (VPA): Giving More Power
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How VPA Works
&lt;/h3&gt;

&lt;p&gt;While HPA adds more workers, VPA makes existing workers more powerful. It monitors your application's actual resource usage over time and automatically adjusts CPU and memory allocations.&lt;/p&gt;

&lt;p&gt;VPA has three modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Off&lt;/strong&gt;: Only provides recommendations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Initial&lt;/strong&gt;: Sets resources only when pods are created&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto&lt;/strong&gt;: Applies recommendations automatically by evicting and recreating pods&lt;/li&gt;
&lt;/ul&gt;
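
&lt;p&gt;A low-risk way to try VPA is "Off" mode: it only publishes recommendations and never touches running pods. A minimal sketch (the &lt;code&gt;web-app&lt;/code&gt; Deployment name is a placeholder):&lt;br&gt;
&lt;/p&gt;

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa-dry-run
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Off"   # recommend only; never restart pods
```

&lt;p&gt;Read the suggested requests with &lt;code&gt;kubectl describe vpa web-app-vpa-dry-run&lt;/code&gt; and apply them manually once you trust them.&lt;/p&gt;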

&lt;h3&gt;
  
  
  Setting Up VPA
&lt;/h3&gt;

&lt;p&gt;Here's a basic VPA configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;autoscaling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;VerticalPodAutoscaler&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app-vpa&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;targetRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app&lt;/span&gt;
  &lt;span class="na"&gt;updatePolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;updateMode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Auto"&lt;/span&gt;
  &lt;span class="na"&gt;resourcePolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;containerPolicies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app&lt;/span&gt;
      &lt;span class="na"&gt;minAllowed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100m&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;128Mi&lt;/span&gt;
      &lt;span class="na"&gt;maxAllowed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2000m&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2Gi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  When to Use VPA
&lt;/h3&gt;

&lt;p&gt;VPA is perfect for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database workloads&lt;/strong&gt;: Databases often need more memory rather than more instances&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data processing applications&lt;/strong&gt;: These might need varying amounts of CPU and memory&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Legacy applications&lt;/strong&gt;: Apps that can't easily scale horizontally&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  VPA Limitations to Know
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Pod Restarts Required&lt;/strong&gt;&lt;br&gt;
VPA needs to restart pods to apply new resource settings. Plan for this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Be Careful with Stateful Apps&lt;/strong&gt;&lt;br&gt;
In &lt;code&gt;Auto&lt;/code&gt; mode, VPA's pod restarts can disrupt databases and other stateful workloads. For those, prefer the &lt;code&gt;Off&lt;/code&gt; or &lt;code&gt;Initial&lt;/code&gt; modes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Shipped as a Separate Add-on&lt;/strong&gt;&lt;br&gt;
VPA isn't part of core Kubernetes and is less mature than HPA. Test thoroughly before production use.&lt;/p&gt;
&lt;h2&gt;
  
  
  Can You Use HPA and VPA Together?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Short answer&lt;/strong&gt;: Generally no, not on the same metrics.&lt;/p&gt;

&lt;p&gt;If both HPA and VPA try to scale based on CPU usage, they'll fight each other:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HPA adds more pods because CPU is high&lt;/li&gt;
&lt;li&gt;VPA increases CPU allocation because usage is high&lt;/li&gt;
&lt;li&gt;This creates confusion and unpredictable behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Safe combination&lt;/strong&gt;: Use HPA for CPU scaling and VPA only for memory optimization:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# HPA handles CPU scaling&lt;/span&gt;
&lt;span class="na"&gt;metrics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Resource&lt;/span&gt;
  &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cpu&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Utilization&lt;/span&gt;
      &lt;span class="na"&gt;averageUtilization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;70&lt;/span&gt;

&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="c1"&gt;# VPA handles only memory&lt;/span&gt;
&lt;span class="na"&gt;resourcePolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containerPolicies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app&lt;/span&gt;
    &lt;span class="na"&gt;controlledResources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Only memory, not CPU&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Choosing the Right Autoscaling Strategy
&lt;/h2&gt;

&lt;p&gt;Here's my decision framework after years of production experience:&lt;/p&gt;

&lt;h3&gt;
  
  
  Use HPA When:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Your app is &lt;strong&gt;stateless&lt;/strong&gt; (web servers, APIs)&lt;/li&gt;
&lt;li&gt;You have &lt;strong&gt;variable traffic patterns&lt;/strong&gt; (daily/weekly spikes)&lt;/li&gt;
&lt;li&gt;Your app can handle &lt;strong&gt;multiple instances&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You need &lt;strong&gt;fast scaling response&lt;/strong&gt; (seconds to minutes)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Use VPA When:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Your app has &lt;strong&gt;unpredictable resource needs&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You're running &lt;strong&gt;batch jobs or data processing&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You have &lt;strong&gt;stateful applications&lt;/strong&gt; that can't scale horizontally&lt;/li&gt;
&lt;li&gt;You want to &lt;strong&gt;optimize resource costs&lt;/strong&gt; over time&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Use Neither When:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Your app has &lt;strong&gt;steady, predictable load&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Resource requirements are &lt;strong&gt;well-known and stable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You prefer &lt;strong&gt;manual control&lt;/strong&gt; over scaling decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started: Your First Autoscaling Setup
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Install Metrics Server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
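
&lt;p&gt;Before creating an HPA, confirm the Metrics Server is actually serving data; without it, HPA targets show as &lt;code&gt;&amp;lt;unknown&amp;gt;&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

```shell
# Both commands should print usage numbers, not errors.
# Metrics can take a minute or two to appear after installation.
kubectl top nodes
kubectl top pods
```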



&lt;p&gt;&lt;strong&gt;Step 2: Create a Simple HPA&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl autoscale deployment web-app &lt;span class="nt"&gt;--cpu-percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;70 &lt;span class="nt"&gt;--min&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 &lt;span class="nt"&gt;--max&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Generate Some Load&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl run load-test &lt;span class="nt"&gt;--image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;busybox &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--restart&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Never &lt;span class="nt"&gt;--&lt;/span&gt; /bin/sh
&lt;span class="c"&gt;# Inside the pod, run:&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do &lt;/span&gt;wget &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="nt"&gt;-O-&lt;/span&gt; http://web-app-service&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 4: Watch It Scale&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get hpa &lt;span class="nt"&gt;--watch&lt;/span&gt;
kubectl get pods &lt;span class="nt"&gt;--watch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Kubernetes autoscaling isn't just a nice-to-have feature - it's essential for running resilient, cost-effective applications at scale. HPA helps you handle traffic spikes automatically, while VPA ensures you're not wasting resources.&lt;/p&gt;

&lt;p&gt;Start simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Implement HPA for your stateless web applications&lt;/li&gt;
&lt;li&gt;Set conservative scaling thresholds initially
&lt;/li&gt;
&lt;li&gt;Monitor and adjust based on real usage patterns&lt;/li&gt;
&lt;li&gt;Consider VPA for resource optimization once you're comfortable&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Remember, autoscaling is as much about saving money as it is about handling load. Done right, it keeps your applications responsive while optimizing costs - letting you sleep better at night knowing your systems can handle whatever comes their way.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>learning</category>
    </item>
    <item>
      <title>DevOps Skills Alone Aren’t Enough</title>
      <dc:creator>Yoshik Karnawat</dc:creator>
      <pubDate>Sat, 09 Aug 2025 04:25:37 +0000</pubDate>
      <link>https://dev.to/yoshik_karnawat/devops-skills-alone-arent-enough-heres-why-4kb5</link>
      <guid>https://dev.to/yoshik_karnawat/devops-skills-alone-arent-enough-heres-why-4kb5</guid>
      <description>&lt;p&gt;I've been an SRE engineer for over two years now. SRE stands for Site Reliability Engineering. It's about keeping production systems reliable and efficient.&lt;/p&gt;

&lt;p&gt;But here's the thing - it's not just about learning fancy terms. You know, SLA, SLO, MTTR, service reliability. That stuff matters, but the real work is keeping systems stable and making processes better.&lt;/p&gt;

&lt;p&gt;Since college, I believed one thing: skills beat paper.&lt;/p&gt;

&lt;p&gt;I thought you didn't need good grades or certifications. Just be good at what you do. That's it.&lt;/p&gt;

&lt;p&gt;Spoiler alert: I was completely wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  My College Mindset
&lt;/h2&gt;

&lt;p&gt;Back then, I was all about side projects. I'd spin up my own VMs. Set up Caddy servers. Configure Cloudflare. Build stuff that actually worked.&lt;/p&gt;

&lt;p&gt;This felt real. This felt valuable.&lt;/p&gt;

&lt;p&gt;Why would I need a piece of paper when I could show actual results? My GitHub was full of projects. My servers were running smoothly. I could troubleshoot problems and fix them.&lt;/p&gt;

&lt;p&gt;That had to be enough. Right?&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhr7o3q27fi4m3vdvgf8q.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhr7o3q27fi4m3vdvgf8q.jpeg" alt="DevOps Skills Alone Aren’t Enough" width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Reality Check Hit Different
&lt;/h2&gt;

&lt;p&gt;I was wrong. Dead wrong.&lt;/p&gt;

&lt;p&gt;And I learned this through a series of painful rejections and missed opportunities.&lt;/p&gt;

&lt;p&gt;Picture this: You're confident in your abilities. You apply for roles you know you can crush. Your technical skills are solid. Your projects speak for themselves. But somehow, you never make it past the initial screening.&lt;/p&gt;

&lt;p&gt;That was my reality for months.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Discovered (And Why It Matters to You)
&lt;/h2&gt;

&lt;p&gt;Skills give you the knowledge to work on real-world systems. But certifications? They give you something entirely different: credibility.&lt;/p&gt;

&lt;p&gt;Think about it from a hiring manager's perspective. They get hundreds of resumes. Everyone claims to know Kubernetes. Everyone says they understand monitoring. Everyone talks about incident response.&lt;/p&gt;

&lt;p&gt;How do they separate the real engineers from the wannabes?&lt;/p&gt;

&lt;p&gt;Certifications act as a filter. They validate claims. They prove you didn't just tinker with Docker on weekends - you understand the concepts well enough to pass rigorous exams.&lt;/p&gt;

&lt;p&gt;Here's a question for you: How many times have you been overlooked despite having the right skills? Drop a comment below - I bet your experience mirrors mine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The $400 Game-Changer
&lt;/h2&gt;

&lt;p&gt;Let me share something that'll probably surprise you. The CKAD certification costs around $400.&lt;/p&gt;

&lt;p&gt;My old thinking: "Four hundred dollars for paper that expires? That's ridiculous."&lt;/p&gt;

&lt;p&gt;My new reality: This "expensive paper" transformed my career trajectory.&lt;/p&gt;

&lt;p&gt;Here's what actually happens:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Immediate salary impact:&lt;/strong&gt; Most engineers see 10–20% salary increases after cloud-native certifications. On an $80,000 salary, that's $8,000–16,000 annually. The math is simple - you're profitable from month one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Visibility explosion:&lt;/strong&gt; Recruiters hunt for specific certifications. Without CKAD, you're invisible in Kubernetes job searches. With it, your phone starts ringing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Promotion acceleration:&lt;/strong&gt; When two engineers compete for advancement, guess who wins? The one with verified credentials.&lt;/p&gt;

&lt;p&gt;The career boost from that $400 investment? It's exponentially higher than you'd imagine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stop Overthinking the Investment
&lt;/h2&gt;

&lt;p&gt;Forget the $400 cost. Ignore the three-year expiry.&lt;/p&gt;

&lt;p&gt;Focus on this instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The role you'll land because of that certification&lt;/li&gt;
&lt;li&gt;The salary jump that follows&lt;/li&gt;
&lt;li&gt;The confidence surge in technical conversations&lt;/li&gt;
&lt;li&gt;The opportunities that suddenly appear&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That expiry date? It's actually brilliant. It keeps you current in a field that evolves at breakneck speed. An eternal certification would be worthless within two years anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Insurance Policy You Didn't Know You Needed
&lt;/h2&gt;

&lt;p&gt;Think of certifications as career insurance. You pay a small premium upfront to protect against massive potential losses.&lt;/p&gt;

&lt;p&gt;What losses? Missing opportunities because you lack credentials.&lt;/p&gt;

&lt;p&gt;Every month without certifications costs you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better job prospects&lt;/li&gt;
&lt;li&gt;Higher compensation&lt;/li&gt;
&lt;li&gt;Career advancement&lt;/li&gt;
&lt;li&gt;Professional recognition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The opportunity cost of staying uncertified far exceeds any certification fee.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Transformation Story
&lt;/h2&gt;

&lt;p&gt;Now I balance both worlds. Side projects remain crucial - they build real skills. But certifications validate those skills to decision-makers.&lt;/p&gt;

&lt;p&gt;My SRE focus areas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS/GCP/Azure cloud platforms&lt;/li&gt;
&lt;li&gt;Kubernetes ecosystem (CKA, CKAD)&lt;/li&gt;
&lt;li&gt;Monitoring and observability tools&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Your Turn: What's Holding You Back?
&lt;/h2&gt;

&lt;p&gt;I'm curious - what's your biggest barrier to getting certified? Is it the cost? Time constraints? Impostor syndrome?&lt;/p&gt;

&lt;p&gt;Share in the comments. I've probably faced the same obstacle and can offer some perspective.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Truth About Our Industry
&lt;/h2&gt;

&lt;p&gt;We work in a field where HR departments filter resumes with keyword searches. Where hiring managers need quick validation methods. Where automated systems decide who gets interviews.&lt;/p&gt;

&lt;p&gt;You can fight the system or work within it. Fighting is noble but ineffective. Working within it gets results.&lt;/p&gt;

&lt;p&gt;Build skills through hands-on projects. Prove those skills through certifications. This combination is unbeatable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Here's What Changes When You Get Certified
&lt;/h2&gt;

&lt;p&gt;The external validation shifts everything:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversations with senior engineers become peer discussions&lt;/li&gt;
&lt;li&gt;Salary negotiations start from higher baselines&lt;/li&gt;
&lt;li&gt;Career opportunities multiply exponentially&lt;/li&gt;
&lt;li&gt;Confidence permeates every technical interaction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That $400 investment doesn't just buy a certificate. It purchases career transformation.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Final Thoughts (And a Challenge for You)
&lt;/h2&gt;

&lt;p&gt;I started this journey believing skills trumped credentials. I was half right - skills matter enormously. But credentials amplify those skills in ways I never anticipated.&lt;/p&gt;

&lt;p&gt;Don't make my mistake. Don't let pride or cost concerns slow your career progression. The boost from strategic certifications will surprise you.&lt;/p&gt;

&lt;p&gt;Here's my challenge: Pick one certification relevant to your career goals. Budget for it this month. Schedule the exam within 90 days.&lt;/p&gt;

&lt;p&gt;Your future self will thank you for taking action today instead of waiting for the "perfect moment" that never comes.&lt;/p&gt;

&lt;p&gt;What certification will you tackle first? Drop your choice in the comments.&lt;/p&gt;

&lt;p&gt;Thanks for reading!&lt;/p&gt;

</description>
      <category>devops</category>
      <category>career</category>
    </item>
    <item>
      <title>Linux Observability: Troubleshooting Made Simple</title>
      <dc:creator>Yoshik Karnawat</dc:creator>
      <pubDate>Wed, 06 Aug 2025 05:02:11 +0000</pubDate>
      <link>https://dev.to/yoshik_karnawat/linux-observability-troubleshooting-made-simple-1k8e</link>
      <guid>https://dev.to/yoshik_karnawat/linux-observability-troubleshooting-made-simple-1k8e</guid>
      <description>&lt;p&gt;No jargon, no complexity, real command line solutions.&lt;/p&gt;

&lt;p&gt;Whether you're keeping systems running smoothly, managing deployments, or building backend services, these 10 commands will help you during 3 AM outages.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9cgazphx84b6adfk4y7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9cgazphx84b6adfk4y7.png" alt="Linux Debugging Commands" width="770" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. iostat
&lt;/h2&gt;

&lt;p&gt;Real-time disk performance statistics that show which storage devices are your performance bottlenecks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;iostat &lt;span class="nt"&gt;-x&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key indicators:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;%util&lt;/code&gt; - If this is consistently above 80%, your disk is a bottleneck&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;await&lt;/code&gt; - Average wait time for I/O requests (milliseconds)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;%iowait&lt;/code&gt; - CPU time spent waiting for disk operations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  2. vmstat
&lt;/h2&gt;

&lt;p&gt;A comprehensive view of system resource usage including memory, CPU, and I/O activity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vmstat 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key indicators:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;si/so&lt;/code&gt; - Swap in/out activity (any consistent values here mean memory pressure)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;wa&lt;/code&gt; - I/O wait percentage (high values indicate disk bottlenecks)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;r&lt;/code&gt; - Number of processes waiting for CPU time&lt;/li&gt;
&lt;/ul&gt;
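&lt;p&gt;The same indicators are easy to script. A minimal sketch on a hard-coded sample line, assuming the standard vmstat column layout (&lt;code&gt;si&lt;/code&gt;=$7, &lt;code&gt;so&lt;/code&gt;=$8, &lt;code&gt;wa&lt;/code&gt;=$16); live use would pipe &lt;code&gt;vmstat 1&lt;/code&gt; through the same awk with &lt;code&gt;NR&gt;2&lt;/code&gt; to skip the two header lines:&lt;/p&gt;

```shell
# Warn on swap activity or high I/O wait from a vmstat data line.
# Column positions assume the standard layout: si=$7, so=$8, wa=$16.
printf '3 0 2048 81320 3344 92608 12 40 11 22 150 233 4 2 60 34 0\n' |
  awk '{ if ($7+$8 > 0) print "swapping: si=" $7 " so=" $8
         if ($16+0 > 20) print "high iowait: " $16 "%" }'
```

&lt;p&gt;This sample line trips both alerts: the box is swapping and a third of CPU time is stuck waiting on disk.&lt;/p&gt;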

&lt;h2&gt;
  
  
  3. lsof
&lt;/h2&gt;

&lt;p&gt;Every open file, socket, and network connection on your system, along with which process owns it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsof &lt;span class="nt"&gt;-i&lt;/span&gt; :8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Port conflicts:&lt;/strong&gt; Find what's already using a port before your app starts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File descriptor leaks:&lt;/strong&gt; Track a process's open descriptors over time with &lt;code&gt;lsof -p &amp;lt;pid&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network debugging:&lt;/strong&gt; See all network connections with &lt;code&gt;lsof -i&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Find the biggest file handle users:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsof | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $2}'&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; | &lt;span class="nb"&gt;uniq&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="nt"&gt;-nr&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one-liner shows which processes are using the most file handles, crucial for debugging file descriptor exhaustion.&lt;/p&gt;
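&lt;p&gt;For a single suspect process, you can skip the full lsof scan and read the count straight from /proc. A Linux-only sketch; &lt;code&gt;$$&lt;/code&gt; here is just the current shell, substitute the PID you care about:&lt;/p&gt;

```shell
# Count one process's open file descriptors directly from /proc.
# Compare against the limit from "ulimit -n" to spot descriptor exhaustion.
ls /proc/$$/fd | wc -l
```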

&lt;h2&gt;
  
  
  4. sar
&lt;/h2&gt;

&lt;p&gt;Historical system performance data that helps you understand performance patterns over time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sar &lt;span class="nt"&gt;-u&lt;/span&gt; 1 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Track different metrics:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CPU usage:&lt;/strong&gt; &lt;code&gt;sar -u&lt;/code&gt; shows user, system, and idle time percentages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory:&lt;/strong&gt; &lt;code&gt;sar -r&lt;/code&gt; displays memory utilization trends&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network:&lt;/strong&gt; &lt;code&gt;sar -n DEV&lt;/code&gt; shows network interface statistics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Review historical data:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sar &lt;span class="nt"&gt;-f&lt;/span&gt; /var/log/sysstat/sa...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unlike real-time tools, sar lets you see what happened during that 3 AM performance spike when nobody was watching.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. iotop
&lt;/h2&gt;

&lt;p&gt;Which specific processes are generating disk I/O, sorted by actual usage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;iotop &lt;span class="nt"&gt;-o&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify I/O hogs instantly without guessing&lt;/li&gt;
&lt;li&gt;Track total read/write per process in real-time&lt;/li&gt;
&lt;li&gt;Find runaway processes that are thrashing your disks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;-o&lt;/code&gt; flag shows only processes actually doing I/O, filtering out the noise.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. strace
&lt;/h2&gt;

&lt;p&gt;Every system call your process makes - the ultimate debugging microscope.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;strace &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;file &amp;lt;command&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Advanced use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Track file access:&lt;/strong&gt; &lt;code&gt;-e trace=file&lt;/code&gt; shows only file-related calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor network:&lt;/strong&gt; &lt;code&gt;-e trace=network&lt;/code&gt; for socket operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time analysis:&lt;/strong&gt; &lt;code&gt;-T&lt;/code&gt; shows time spent in each system call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;For running processes:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;strace &lt;span class="nt"&gt;-p&lt;/span&gt; &amp;lt;pid&amp;gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; /tmp/trace.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This captures the behavior of a running process and all its children, writing to a file for later analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. ss (the modern netstat)
&lt;/h2&gt;

&lt;p&gt;Detailed network socket information and connection states.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ss &lt;span class="nt"&gt;-tulpn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Advanced use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Find connection states:&lt;/strong&gt; &lt;code&gt;ss -o state established&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory usage per socket:&lt;/strong&gt; &lt;code&gt;ss -m&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Process information:&lt;/strong&gt; &lt;code&gt;ss -p&lt;/code&gt; shows which process owns each connection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Track connection problems:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ss &lt;span class="nt"&gt;-s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This summary shows socket statistics including how many connections are in different states.&lt;/p&gt;
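&lt;p&gt;When ss isn't installed (minimal containers, rescue images), a similar state tally can be scraped from /proc. A Linux-only sketch; state codes are hex, e.g. 01=ESTABLISHED, 06=TIME_WAIT, 0A=LISTEN:&lt;/p&gt;

```shell
# Tally TCP socket states straight from the kernel's /proc/net/tcp table.
# Field 4 of each connection row is the state code in hex.
awk 'NR > 1 { states[$4]++ }
     END { for (s in states) print "state " s ": " states[s]; print "total " NR-1 }' /proc/net/tcp
```

&lt;p&gt;A sudden pile-up of one state (say, thousands of TIME_WAIT entries) is the same signal &lt;code&gt;ss -s&lt;/code&gt; would give you.&lt;/p&gt;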

&lt;h2&gt;
  
  
  8. dstat
&lt;/h2&gt;

&lt;p&gt;Combined CPU, disk, network, and memory statistics in a single, colorful display.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dstat &lt;span class="nt"&gt;-cdngy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Flag breakdown:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-c&lt;/code&gt; - CPU stats&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-d&lt;/code&gt; - Disk stats&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-n&lt;/code&gt; - Network stats&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-g&lt;/code&gt; - Page stats&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-y&lt;/code&gt; - System stats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Top consumers at a custom interval:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dstat &lt;span class="nt"&gt;--top-cpu&lt;/span&gt; &lt;span class="nt"&gt;--top-io&lt;/span&gt; &lt;span class="nt"&gt;--top-mem&lt;/span&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This shows the top processes consuming CPU, I/O, and memory every 5 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. pidstat
&lt;/h2&gt;

&lt;p&gt;Detailed resource usage for individual processes over time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pidstat &lt;span class="nt"&gt;-u&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Track specific processes:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pidstat &lt;span class="nt"&gt;-p&lt;/span&gt; &amp;lt;pid&amp;gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why it's better than ps:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shows trends over time, not just snapshots&lt;/li&gt;
&lt;li&gt;Per-thread statistics with &lt;code&gt;-t&lt;/code&gt; flag&lt;/li&gt;
&lt;li&gt;Historical data when combined with sar&lt;/li&gt;
&lt;/ul&gt;
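&lt;p&gt;The raw counters pidstat samples live in /proc, and you can read them yourself. A Linux-only sketch of accumulated CPU time for one process; fields 14 and 15 of /proc/PID/stat are utime and stime in clock ticks, and this field numbering assumes the process name (field 2) contains no spaces:&lt;/p&gt;

```shell
# Print a process's accumulated user and system CPU time in clock ticks.
# /proc/self/stat here refers to the awk process itself; substitute a PID.
awk '{ print "utime=" $14 " stime=" $15 " ticks" }' /proc/self/stat
```

&lt;p&gt;Sampling these twice and dividing the delta by the tick rate (&lt;code&gt;getconf CLK_TCK&lt;/code&gt;, usually 100) gives the same per-process CPU% that pidstat reports.&lt;/p&gt;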

&lt;h2&gt;
  
  
  10. perf
&lt;/h2&gt;

&lt;p&gt;Deep CPU performance analysis including cache misses, branch predictions, and instruction efficiency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;perf top
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Advanced profiling:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;perf record &lt;span class="nt"&gt;-g&lt;/span&gt; &amp;lt;command&amp;gt;
perf report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;System-wide analysis:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;perf &lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; &lt;span class="nb"&gt;sleep &lt;/span&gt;10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs system-wide performance counters for 10 seconds, showing you efficiency metrics like instructions per cycle and cache hit rates.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>linux</category>
    </item>
  </channel>
</rss>
