<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shreyans Sonthalia</title>
    <description>The latest articles on DEV Community by Shreyans Sonthalia (@ssshreyans26).</description>
    <link>https://dev.to/ssshreyans26</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3874860%2Ff8671b9e-9208-4c87-936c-2a34766b2b94.jpg</url>
      <title>DEV Community: Shreyans Sonthalia</title>
      <link>https://dev.to/ssshreyans26</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ssshreyans26"/>
    <language>en</language>
    <item>
      <title>Why Your Kubernetes Pod Keeps Getting Killed — And It's Not an OOMKill</title>
      <dc:creator>Shreyans Sonthalia</dc:creator>
      <pubDate>Wed, 15 Apr 2026 12:16:25 +0000</pubDate>
      <link>https://dev.to/ssshreyans26/why-your-kubernetes-pod-keeps-getting-killed-and-its-not-an-oomkill-3ji6</link>
      <guid>https://dev.to/ssshreyans26/why-your-kubernetes-pod-keeps-getting-killed-and-its-not-an-oomkill-3ji6</guid>
      <description>&lt;p&gt;&lt;em&gt;A real-world debugging guide: from mysterious pod terminations to discovering a hidden kernel memory leak consuming 55% of node RAM.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Incident
&lt;/h2&gt;

&lt;p&gt;It was a regular morning when we noticed something off. One of our production services — running on an EKS cluster — had been terminated and a new pod had spun up in its place. No deployment had been triggered. No config changes. The pod just... died.&lt;/p&gt;

&lt;p&gt;The Grafana dashboard for the old pod told a strange story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory usage had climbed to &lt;strong&gt;832 MiB&lt;/strong&gt;, then abruptly dropped to zero&lt;/li&gt;
&lt;li&gt;CPU dropped to zero at the same time&lt;/li&gt;
&lt;li&gt;After a ~45 minute gap, a new pod appeared and started running normally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The new pod was already using &lt;strong&gt;757 MiB&lt;/strong&gt; of memory and running just fine. So what killed the old one?&lt;/p&gt;

&lt;p&gt;This is the story of how we debugged it — and what we found.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: The Obvious Suspect — OOMKill
&lt;/h2&gt;

&lt;p&gt;When a Kubernetes pod dies unexpectedly, the first thing most engineers check is whether it was killed for using too much memory (an OOMKill). We looked at the deployment spec:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100m&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;500Mi&lt;/span&gt;
  &lt;span class="c1"&gt;# No limits set&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No memory limit was configured. In Kubernetes, if you don't set a &lt;code&gt;limit&lt;/code&gt;, the container can use as much memory as the node has available. So this wasn't a container-level OOMKill — the pod had no ceiling to hit.&lt;/p&gt;
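
&lt;p&gt;For reference, a ceiling would look like this (the &lt;code&gt;1Gi&lt;/code&gt; value is purely illustrative, not a recommendation for this service):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;resources:
  requests:
    cpu: 100m
    memory: 500Mi
  limits:
    memory: 1Gi   # container is OOMKilled if it crosses this
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a limit set, runaway container memory surfaces as a clean, attributable OOMKill event instead of silent node-level pressure.&lt;/p&gt;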

&lt;p&gt;&lt;strong&gt;But wait&lt;/strong&gt; — if the new pod was happily running at 757 MiB, why would 832 MiB on the old pod be a problem? Something else was going on.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Checking Kubernetes Events
&lt;/h2&gt;

&lt;p&gt;We tried to pull events for the terminated pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get events &lt;span class="nt"&gt;-n&lt;/span&gt; live &lt;span class="nt"&gt;--field-selector&lt;/span&gt; involvedObject.name&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;pod-name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing. Kubernetes only retains events for about an hour, and the pod had died over 4 hours ago. The events had expired.&lt;/p&gt;

&lt;p&gt;But when we checked broader events in the namespace, we found something interesting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TaintManagerEviction  pod/&amp;lt;new-pod&amp;gt;  Cancelling deletion of Pod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the &lt;strong&gt;node&lt;/strong&gt; had recent events:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NodeNotReady   node/&amp;lt;node-name&amp;gt;   Node status is now: NodeNotReady
NodeReady      node/&amp;lt;node-name&amp;gt;   Node status is now: NodeReady
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The node itself had gone &lt;strong&gt;NotReady&lt;/strong&gt;. When a Kubernetes node stops responding to the API server, the control plane taints it and, after a toleration window (five minutes by default), evicts the pods on it so they can be rescheduled elsewhere. This explained the pod termination — but why did the node go NotReady?&lt;/p&gt;

&lt;h3&gt;
  
  
  What Does NodeNotReady Mean?
&lt;/h3&gt;

&lt;p&gt;Every node in a Kubernetes cluster runs a process called the &lt;strong&gt;kubelet&lt;/strong&gt;. The kubelet sends periodic heartbeats to the API server (the control plane) saying "I'm alive and healthy." If the API server doesn't receive a heartbeat within a grace period (default 40 seconds), it marks the node as &lt;code&gt;NotReady&lt;/code&gt; and begins evicting pods to reschedule them elsewhere.&lt;/p&gt;

&lt;p&gt;A node goes NotReady when the kubelet process is too overwhelmed to send these heartbeats — usually due to extreme resource pressure (CPU, memory, or disk).&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: The CPU Credit Theory (A Red Herring)
&lt;/h2&gt;

&lt;p&gt;The node was a &lt;code&gt;t3a.medium&lt;/code&gt; instance on AWS. T3/T3a instances are &lt;strong&gt;burstable&lt;/strong&gt; — they don't give you full CPU all the time. We initially suspected that the instance had exhausted its CPU credits and was being throttled, causing the kubelet to miss heartbeats.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Not familiar with AWS burstable instances and CPU credits?&lt;/strong&gt; Read our deep dive: &lt;a href="https://dev.to/ssshreyans26/aws-burstable-instances-explained-cpu-credits-throttling-and-why-your-t3-instance-isnt-what-you-39o4"&gt;AWS Burstable Instances Explained: CPU Credits, Throttling, and Why Your t3 Instance Isn't What You Think&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We checked the credit configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 describe-instance-credit-specifications &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--instance-ids&lt;/span&gt; &amp;lt;instance-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CpuCredits: unlimited
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;T3 Unlimited mode&lt;/strong&gt; was already enabled — meaning the instance could burst beyond its credit balance without throttling (you just pay for the extra usage). We verified with CloudWatch: CPU credits were at 0 but surplus credits were maxed at 576. The instance was not being throttled.&lt;/p&gt;

&lt;p&gt;CPU credits: ruled out.&lt;/p&gt;

&lt;p&gt;But CloudWatch revealed something alarming: &lt;strong&gt;the instance had been running at ~100% CPU utilization for the entire day&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: What's Eating the CPU?
&lt;/h2&gt;

&lt;p&gt;We checked CPU usage of all pods on the node using &lt;code&gt;kubectl top&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;service pod:           7m    (0.007 CPUs)
weave-scope-agent:    40m
aws-node:             23m
kube-proxy:            7m
ebs-csi:               3m
efs-csi:               5m
──────────────────────────
Total:               ~85m   (out of 2000m available)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pods were barely using any CPU. Yet CloudWatch showed 100% at the instance level. The CPU was being consumed by something &lt;strong&gt;outside of Kubernetes pods&lt;/strong&gt; — at the operating system or kernel level.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5: Getting Inside the Node
&lt;/h2&gt;

&lt;p&gt;We needed to look at the node's operating system directly. We used AWS Systems Manager (SSM) to run commands on the instance without SSH:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ssm send-command &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--instance-ids&lt;/span&gt; &amp;lt;instance-id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--document-name&lt;/span&gt; &lt;span class="s2"&gt;"AWS-RunShellScript"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--parameters&lt;/span&gt; &lt;span class="s1"&gt;'commands=["cat /proc/loadavg"]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;34.04 25.03 22.70
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;A load average of 34 on a 2-CPU machine.&lt;/strong&gt; That's 17x the capacity. The system was completely overloaded.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Load Average?
&lt;/h3&gt;

&lt;p&gt;Load average is the average number of processes that are either running on a CPU or waiting in the run queue. On Linux it also counts processes in uninterruptible sleep, typically waiting on disk I/O, which is exactly the state swap thrashing produces. On a 2-CPU machine, a load average of 2.0 means both CPUs are fully utilized. A load average of 34 means there are 34 processes competing for 2 CPUs — each process spends most of its time waiting.&lt;/p&gt;

&lt;p&gt;The three numbers represent the 1-minute, 5-minute, and 15-minute averages. All three being high meant this had been going on for a long time.&lt;/p&gt;
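
&lt;p&gt;A quick way to put a load average in context is to normalize it by the CPU count. A minimal sketch, assuming a Linux box with &lt;code&gt;nproc&lt;/code&gt; available:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Load average only means something relative to the number of CPUs
cpus=$(nproc)
load1=$(awk '{print $1}' /proc/loadavg)
awk -v l="$load1" -v c="$cpus" 'BEGIN { printf "load per CPU: %.2f\n", l/c }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Anything consistently above 1.0 per CPU means work is queuing; our node was at 17.&lt;/p&gt;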




&lt;h2&gt;
  
  
  Step 6: Finding the Real Bottleneck with Linux PSI
&lt;/h2&gt;

&lt;p&gt;Linux has a feature called &lt;strong&gt;Pressure Stall Information (PSI)&lt;/strong&gt; that tells you exactly which resource is the bottleneck. We checked &lt;code&gt;/proc/pressure/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CPU:    some avg10=85.41  avg60=84.62  avg300=82.10
Memory: some avg10=98.98  avg60=98.90  avg300=98.38
        full avg10=62.85  avg60=63.91  avg300=63.33
IO:     some avg10=0.04   avg60=0.16   avg300=0.21
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The numbers told a clear story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;some&lt;/code&gt;&lt;/strong&gt; = percentage of time at least one process was stalled on this resource&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;full&lt;/code&gt;&lt;/strong&gt; = percentage of time ALL processes were stalled&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;99% of the time, some process was waiting for memory. 63% of the time, ALL processes were completely stalled.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This wasn't a CPU problem at all — it was a &lt;strong&gt;memory problem&lt;/strong&gt; that manifested as high CPU usage.&lt;/p&gt;
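
&lt;p&gt;PSI lines are plain &lt;code&gt;key=value&lt;/code&gt; text, so they are easy to script against. A minimal sketch, using a sample line (on a live node the input would come from &lt;code&gt;/proc/pressure/memory&lt;/code&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Pull avg10 out of a PSI line; on a node, replace the sample with:
#   line=$(grep ^some /proc/pressure/memory)
line='some avg10=98.98 avg60=98.90 avg300=98.38 total=123456789'
echo "$line" | grep -o 'avg10=[0-9.]*' | cut -d= -f2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As a rough rule of thumb, sustained &lt;code&gt;full&lt;/code&gt; pressure above even 10% is serious; our 63% meant the node spent most of its life frozen.&lt;/p&gt;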




&lt;h2&gt;
  
  
  Step 7: Swap Thrashing — The Real Killer
&lt;/h2&gt;

&lt;p&gt;The memory stats confirmed it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MemTotal:      3,936 MB
MemFree:          86 MB     (only 86 MB free!)
MemAvailable:    735 MB     (after counting reclaimable cache)
SwapTotal:     1,048 MB
SwapFree:        549 MB     (500 MB of swap in use)
Committed_AS:  5,001 MB     (5 GB committed on a 4 GB machine!)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The node had 5 GB of memory committed on a machine with only 4 GB of RAM. The overflow was being handled by &lt;strong&gt;swap&lt;/strong&gt; — a section of the disk used as overflow memory. But disk access is orders of magnitude slower than RAM, and when the system constantly shuttles pages between RAM and disk, you get &lt;strong&gt;swap thrashing&lt;/strong&gt;: the CPU spends all its time waiting for disk I/O instead of doing useful work.&lt;/p&gt;
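
&lt;p&gt;You can confirm active thrashing, as opposed to old but idle swap usage, by watching the kernel's cumulative swap counters on the node:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Pages swapped in/out since boot; sample twice a few seconds apart —
# rapidly climbing numbers mean the node is actively thrashing
grep -E '^pswp(in|out)' /proc/vmstat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;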

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Want to understand swap, swap thrashing, and why memory problems cause CPU spikes?&lt;/strong&gt; Read our explainer: &lt;a href="https://dev.to/ssshreyans26/linux-memory-explained-swap-kernel-slab-and-skbuff-what-kubernetes-doesnt-show-you-i1a"&gt;Linux Memory Explained: Swap, Kernel Slab, and skbuff — What Kubernetes Doesn't Show You&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This explained everything: swap thrashing -&amp;gt; kubelet can't send heartbeats -&amp;gt; NodeNotReady -&amp;gt; pod evicted. But where was all the memory going?&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 8: The Hidden Memory Consumer — Kernel Slab
&lt;/h2&gt;

&lt;p&gt;We checked memory distribution across all pods on the node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PODS (total):                    616 MB
├── main service                 381 MB
├── weave-scope-agent             64 MB
├── aws-node (VPC CNI)            39 MB
├── promtail                      30 MB
├── ebs-csi-node                  26 MB
├── kube-proxy                    24 MB

SYSTEM PROCESSES:                 58 MB
PAGE CACHE:                      831 MB
FREE:                             87 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's only about 1.6 GB accounted for. On a 4 GB node, where was the other &lt;strong&gt;2+ GB?&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;KERNEL SLAB:                   2,194 MB
├── SReclaimable:                 50 MB   (can be freed)
├── SUnreclaim:                2,143 MB   (CANNOT be freed!)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2.1 GB of non-reclaimable kernel memory.&lt;/strong&gt; Over half the node's RAM was consumed by the Linux kernel itself, completely invisible to all Kubernetes monitoring tools.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What is kernel slab memory and why can't Kubernetes see it?&lt;/strong&gt; This is covered in detail in: &lt;a href="https://dev.to/ssshreyans26/linux-memory-explained-swap-kernel-slab-and-skbuff-what-kubernetes-doesnt-show-you-i1a"&gt;Linux Memory Explained: Swap, Kernel Slab, and skbuff — What Kubernetes Doesn't Show You&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Normal &lt;code&gt;SUnreclaim&lt;/code&gt; on a healthy node is &lt;strong&gt;50-200 MB&lt;/strong&gt;. Our node had &lt;strong&gt;2,143 MB&lt;/strong&gt;. Something was leaking memory inside the kernel.&lt;/p&gt;
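
&lt;p&gt;This is a one-liner to check on any Linux node. The 500 MB threshold below is an arbitrary sanity bound we picked, not an official limit:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Print SUnreclaim in MB and warn when it looks inflated
# (500 MB is an arbitrary sanity threshold)
kb=$(awk '/^SUnreclaim:/ {print $2}' /proc/meminfo)
mb=$((kb / 1024))
echo "SUnreclaim: ${mb} MB"
if [ "$mb" -gt 500 ]; then echo "WARN: unusually high unreclaimable slab"; fi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;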




&lt;h2&gt;
  
  
  Step 9: Inside the Slab — 1.66 Million Leaked Network Packets
&lt;/h2&gt;

&lt;p&gt;We examined &lt;code&gt;/proc/slabinfo&lt;/code&gt; to see what was consuming the slab:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SLAB OBJECT              COUNT        x  SIZE      =  TOTAL
────────────────────────────────────────────────────────────
kmalloc-1k            1,667,384    x  1,024 B   =  1,632 MB
skbuff_head_cache     1,657,980    x    256 B   =    414 MB
────────────────────────────────────────────────────────────
These two alone:                                   2,046 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;1.66 million &lt;code&gt;skbuff_head_cache&lt;/code&gt; entries&lt;/strong&gt; — each one representing a network packet header in the Linux kernel. And 1.67 million &lt;code&gt;kmalloc-1k&lt;/code&gt; allocations (the associated packet data). The almost 1:1 ratio confirmed this was a &lt;strong&gt;network subsystem memory leak&lt;/strong&gt;: millions of network packets stuck in kernel memory, never being cleaned up.&lt;/p&gt;
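
&lt;p&gt;The totals in the table are just object count × object size, which you can reproduce from the raw numbers (small differences from the table come down to rounding and the moment of sampling):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# objects x size, in MiB — the same arithmetic behind the table above
awk 'BEGIN { printf "kmalloc-1k: %.0f MiB\n", 1667384*1024/1048576 }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;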

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What is &lt;code&gt;skbuff&lt;/code&gt; and how does it relate to network packets?&lt;/strong&gt; Explained in: &lt;a href="https://dev.to/ssshreyans26/linux-memory-explained-swap-kernel-slab-and-skbuff-what-kubernetes-doesnt-show-you-i1a"&gt;Linux Memory Explained: Swap, Kernel Slab, and skbuff — What Kubernetes Doesn't Show You&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Step 10: It's Not Just One Node
&lt;/h2&gt;

&lt;p&gt;Our affected pod ran on a dedicated node. Maybe this was a one-off? We checked two other nodes in the cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                      affected node   large node       another small node
                      (t3a.medium)    (t3a.xlarge)     (t3a.medium)
──────────────────────────────────────────────────────────────────────────
Total RAM             3,936 MB        16,207 MB        3,938 MB
Slab (SUnreclaim)     2,143 MB         4,533 MB        1,744 MB
skbuff count          1,667,384        3,309,501        1,310,669
Memory pressure       98.98%           0.00%            0.00%
Load average          32.64            2.11             0.12
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Every node had the same leak.&lt;/strong&gt; The &lt;code&gt;t3a.xlarge&lt;/code&gt; node (16 GB) had an even bigger leak at 4.5 GB — but survived because it had enough RAM headroom. The other &lt;code&gt;t3a.medium&lt;/code&gt; nodes were ticking time bombs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 11: The Culprit — An Abandoned Monitoring Tool
&lt;/h2&gt;

&lt;p&gt;What was common across all nodes and was intercepting network traffic? &lt;strong&gt;A network visualization DaemonSet.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We had &lt;a href="https://github.com/weaveworks/scope" rel="noopener noreferrer"&gt;Weave Scope&lt;/a&gt; running on every node — a tool that captures and analyzes network traffic to build a real-time map of your infrastructure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get daemonsets &lt;span class="nt"&gt;-n&lt;/span&gt; weave
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME                DESIRED   CURRENT   READY   AGE
weave-scope-agent   16        16        16      2y326d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key findings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Installed &lt;strong&gt;2 years and 326 days ago&lt;/strong&gt; via raw &lt;code&gt;kubectl apply&lt;/code&gt; (no Helm, no GitOps)&lt;/li&gt;
&lt;li&gt;Running &lt;code&gt;weaveworks/scope:1.13.2&lt;/code&gt; — the &lt;strong&gt;last version ever released&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Weaveworks, the company behind it, shut down in 2024&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;The DaemonSet was running on all 16 nodes, intercepting all network traffic&lt;/li&gt;
&lt;li&gt;Its packet interception was creating socket buffers in kernel space that were never freed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Over weeks and months, these accumulated into the millions, consuming gigabytes of kernel memory on every node.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Fix
&lt;/h2&gt;

&lt;p&gt;We deleted the entire namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl delete namespace weave
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The effect was immediate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                          BEFORE              AFTER
──────────────────────────────────────────────────────
Slab (SUnreclaim)         2,143 MB            74 MB
MemFree                   87 MB               1,937 MB
MemAvailable              735 MB              2,600 MB
Memory pressure           98.98%              0.00%
Load average              32.64               0.39
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the agent processes were killed, the kernel cleaned up all the orphaned socket buffers. &lt;strong&gt;2 GB of memory was freed instantly.&lt;/strong&gt; No node restart was even needed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Your monitoring tools can be the problem
&lt;/h3&gt;

&lt;p&gt;A monitoring tool designed to give visibility into our infrastructure was silently killing it. Tools that intercept network traffic at the kernel level can cause kernel-level resource leaks that are invisible to standard Kubernetes metrics.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Kubernetes metrics have a blind spot
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;kubectl top&lt;/code&gt; and Prometheus container metrics only show &lt;strong&gt;userspace&lt;/strong&gt; memory used by containers. The 2.1 GB of kernel slab memory was completely invisible. We only found it by getting a shell on the node itself (via SSM, in our case) and checking &lt;code&gt;/proc/meminfo&lt;/code&gt; and &lt;code&gt;/proc/slabinfo&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you're running node-exporter, consider alerting on &lt;code&gt;node_memory_SUnreclaim_bytes&lt;/code&gt; — it would have caught this early.&lt;/p&gt;
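
&lt;p&gt;A sketch of such an alert rule (the threshold and duration are starting points to tune, not canonical values):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;groups:
  - name: node-kernel-memory
    rules:
      - alert: KernelSlabUnreclaimHigh
        expr: node_memory_SUnreclaim_bytes &amp;gt; 500 * 1024 * 1024
        for: 30m
        annotations:
          summary: "Unreclaimable kernel slab above 500 MB on {{ $labels.instance }}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;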

&lt;h3&gt;
  
  
  3. Small nodes amplify kernel-level issues
&lt;/h3&gt;

&lt;p&gt;A &lt;code&gt;t3a.medium&lt;/code&gt; (4 GB RAM) leaves very little headroom after kubelet, container runtime, CNI plugins, CSI drivers, DaemonSet pods, and OS overhead. Any kernel-level issue eats directly into the limited space available for your workloads.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Audit your DaemonSets regularly
&lt;/h3&gt;

&lt;p&gt;DaemonSets run on every node. A single misbehaving DaemonSet multiplies its impact across your entire infrastructure. Review them periodically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get daemonsets &lt;span class="nt"&gt;--all-namespaces&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ask: Is this still needed? Is it maintained? When was it last updated?&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Abandoned open-source software is a liability
&lt;/h3&gt;

&lt;p&gt;Running unmaintained software in production — especially software that operates at the kernel level — is a risk that's easy to forget about. If the maintainers or company behind a tool have moved on, you should too.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. High CPU doesn't always mean high computation
&lt;/h3&gt;

&lt;p&gt;Our node showed 100% CPU, but actual computation was negligible. The CPU was spent on memory management — swapping pages in and out of disk. When you see high CPU coupled with high memory usage, check for swap thrashing first.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Follow the evidence, not assumptions
&lt;/h3&gt;

&lt;p&gt;Our investigation path: OOMKill? (no) -&amp;gt; CPU credits? (no) -&amp;gt; Node issue? (yes, NodeNotReady) -&amp;gt; What caused it? (memory pressure) -&amp;gt; Where's the memory? (kernel slab) -&amp;gt; What's in the slab? (leaked socket buffers) -&amp;gt; What's leaking? (abandoned DaemonSet). Each wrong hypothesis was eliminated with data, not guesswork.&lt;/p&gt;




&lt;h2&gt;
  
  
  Debugging Cheatsheet
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Kubernetes-level
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pod events&lt;/span&gt;
kubectl get events &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt; &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'.lastTimestamp'&lt;/span&gt;

&lt;span class="c"&gt;# Node conditions&lt;/span&gt;
kubectl describe node &amp;lt;node-name&amp;gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A5&lt;/span&gt; Conditions

&lt;span class="c"&gt;# All pods on a node&lt;/span&gt;
kubectl get pods &lt;span class="nt"&gt;--all-namespaces&lt;/span&gt; &lt;span class="nt"&gt;--field-selector&lt;/span&gt; spec.nodeName&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;node-name&amp;gt; &lt;span class="nt"&gt;-o&lt;/span&gt; wide

&lt;span class="c"&gt;# Pod resource usage&lt;/span&gt;
kubectl top pods &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt;

&lt;span class="c"&gt;# List all DaemonSets&lt;/span&gt;
kubectl get daemonsets &lt;span class="nt"&gt;--all-namespaces&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  OS-level (via SSM or SSH)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# System pressure — which resource is the bottleneck?&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/pressure/cpu
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/pressure/memory
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/pressure/io

&lt;span class="c"&gt;# Memory breakdown — look for SUnreclaim&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"MemTotal|MemFree|MemAvailable|Slab|SReclaimable|SUnreclaim|SwapTotal|SwapFree"&lt;/span&gt; /proc/meminfo

&lt;span class="c"&gt;# Top kernel slab consumers by memory (num_objs x objsize; needs root)&lt;/span&gt;
awk &lt;span class="s1"&gt;'NR&amp;gt;2 {printf "%-24s %10.1f MB\n", $1, $3*$4/1048576}'&lt;/span&gt; /proc/slabinfo | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="nt"&gt;-k2&lt;/span&gt; &lt;span class="nt"&gt;-rn&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-10&lt;/span&gt;

&lt;span class="c"&gt;# Load average&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/loadavg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  AWS-level
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# CPU credit balance (burstable instances)&lt;/span&gt;
aws cloudwatch get-metric-statistics &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; AWS/EC2 &lt;span class="nt"&gt;--metric-name&lt;/span&gt; CPUCreditBalance &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dimensions&lt;/span&gt; &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;InstanceId,Value&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--start-time&lt;/span&gt; &amp;lt;start&amp;gt; &lt;span class="nt"&gt;--end-time&lt;/span&gt; &amp;lt;end&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--period&lt;/span&gt; 300 &lt;span class="nt"&gt;--statistics&lt;/span&gt; Average

&lt;span class="c"&gt;# Run commands on a node without SSH&lt;/span&gt;
aws ssm send-command &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--instance-ids&lt;/span&gt; &amp;lt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--document-name&lt;/span&gt; &lt;span class="s2"&gt;"AWS-RunShellScript"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--parameters&lt;/span&gt; &lt;span class="s1"&gt;'commands=["your-command"]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/ssshreyans26/aws-burstable-instances-explained-cpu-credits-throttling-and-why-your-t3-instance-isnt-what-you-39o4"&gt;AWS Burstable Instances Explained: CPU Credits, Throttling, and Why Your t3 Instance Isn't What You Think&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/ssshreyans26/linux-memory-explained-swap-kernel-slab-and-skbuff-what-kubernetes-doesnt-show-you-i1a"&gt;Linux Memory Explained: Swap, Kernel Slab, and skbuff — What Kubernetes Doesn't Show You&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;The most dangerous problems in production aren't the ones that set off alarms — they're the ones that slowly accumulate in places you're not looking.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>linux</category>
      <category>sre</category>
    </item>
    <item>
      <title>Linux Memory Explained: Swap, Kernel Slab, and skbuff — What Kubernetes Doesn't Show You</title>
      <dc:creator>Shreyans Sonthalia</dc:creator>
      <pubDate>Wed, 15 Apr 2026 12:10:50 +0000</pubDate>
      <link>https://dev.to/ssshreyans26/linux-memory-explained-swap-kernel-slab-and-skbuff-what-kubernetes-doesnt-show-you-i1a</link>
      <guid>https://dev.to/ssshreyans26/linux-memory-explained-swap-kernel-slab-and-skbuff-what-kubernetes-doesnt-show-you-i1a</guid>
      <description>&lt;p&gt;&lt;em&gt;Your &lt;code&gt;kubectl top&lt;/code&gt; says the node has plenty of free memory. The node crashes anyway. Here's what's hiding in the gap.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem With Kubernetes Memory Metrics
&lt;/h2&gt;

&lt;p&gt;When you run &lt;code&gt;kubectl top node&lt;/code&gt;, you see something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ip-10-2-1-35   45m          2%     616Mi           15%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;15% memory usage. Looks healthy, right?&lt;/p&gt;

&lt;p&gt;But the node is swap thrashing, the load average is 34, and pods are being evicted. How?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Because Kubernetes only shows you userspace memory&lt;/strong&gt; — the memory your containers are using. It doesn't show you what the Linux kernel is consuming behind the scenes. On the node we were debugging, the kernel was secretly eating &lt;strong&gt;2.1 GB&lt;/strong&gt; out of 4 GB — and &lt;code&gt;kubectl&lt;/code&gt; had no idea.&lt;/p&gt;

&lt;p&gt;This post explains the layers of Linux memory that Kubernetes can't see, and how to find them when things go wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Linux Organizes Memory
&lt;/h2&gt;

&lt;p&gt;When you check &lt;code&gt;/proc/meminfo&lt;/code&gt; on a Linux machine, you see dozens of entries. Here's how they fit together on a 4 GB node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total RAM (MemTotal):                      3,936 MB
├── Used by applications (Anonymous pages):     617 MB
│   ├── Container processes (what kubectl sees)
│   └── System processes (kubelet, containerd, etc.)
├── Page Cache (file-backed pages):             831 MB
│   └── Cached file data (can be reclaimed)
├── Kernel Slab:                              2,194 MB  ← invisible to k8s
│   ├── SReclaimable:      50 MB (can be freed)
│   └── SUnreclaim:     2,143 MB (cannot be freed!)
├── Kernel Stack, Page Tables, etc.:             60 MB
└── Free:                                        87 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Kubernetes metrics cover the first bucket. Everything else is the OS and kernel.&lt;/p&gt;
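
&lt;p&gt;You can pull the major buckets from any Linux node directly. A rough sketch — &lt;code&gt;/proc/meminfo&lt;/code&gt; reports in kB:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# The major buckets, converted from kB to MB
awk '/^(MemTotal|MemFree|Cached|Slab|SReclaimable|SUnreclaim):/ {printf "%-14s %7d MB\n", $1, $2/1024}' /proc/meminfo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;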

&lt;p&gt;Let's break down each layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 1: Application Memory (What Kubernetes Shows)
&lt;/h2&gt;

&lt;p&gt;This is the memory your processes actively use — variables, heap allocations, stack frames. In Linux terms, these are &lt;strong&gt;anonymous pages&lt;/strong&gt; (memory not backed by any file on disk).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# What Kubernetes reports&lt;/span&gt;
kubectl top pods &lt;span class="nt"&gt;-n&lt;/span&gt; live

NAME                         CPU     MEMORY
nightfort-688ccc5974-p47qs   7m      381Mi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This 381 MiB is the &lt;strong&gt;Resident Set Size (RSS)&lt;/strong&gt; of the container's processes — the amount of physical RAM their memory allocations are currently occupying.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Number Isn't the Full Picture
&lt;/h3&gt;

&lt;p&gt;RSS only counts memory &lt;strong&gt;your process asked for&lt;/strong&gt;. It doesn't count:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory the kernel allocated &lt;strong&gt;on behalf of&lt;/strong&gt; your process (network buffers, file descriptors)&lt;/li&gt;
&lt;li&gt;Kernel data structures for managing your containers (cgroups, namespaces)&lt;/li&gt;
&lt;li&gt;Shared libraries loaded once but used by multiple containers&lt;/li&gt;
&lt;/ul&gt;
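&lt;p&gt;As a quick illustration (with made-up numbers), the value that container memory metrics roughly aggregate is the &lt;code&gt;VmRSS&lt;/code&gt; field of each process's &lt;code&gt;/proc/PID/status&lt;/code&gt;. A minimal sketch:&lt;/p&gt;

```python
# Illustrative sketch (sample values are invented): parse VmRSS, the resident
# set size, from /proc/PID/status-style text. This per-process number is
# roughly what kubectl-style container memory metrics sum up.
SAMPLE_STATUS = """\
Name:\tnode
VmRSS:\t  390144 kB
VmSwap:\t       0 kB
"""

def rss_mib(status_text):
    """Extract VmRSS in MiB from /proc/PID/status-style text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            kb = int(line.split()[1])
            return kb / 1024
    raise ValueError("VmRSS not found")

print(round(rss_mib(SAMPLE_STATUS)))  # prints 381
```

&lt;p&gt;Summing this over a pod's processes gets you close to the 381 MiB figure above — and nothing in it accounts for kernel-side allocations.&lt;/p&gt;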




&lt;h2&gt;
  
  
  Layer 2: Page Cache
&lt;/h2&gt;

&lt;p&gt;The page cache is Linux's way of &lt;strong&gt;caching file data in RAM&lt;/strong&gt; so that repeated reads don't hit the disk.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;First read of a file:   Disk → RAM (page cache) → Process     [slow]
Second read:             Page cache → Process                   [fast]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On our node, 831 MB was used for page cache. This sounds like a lot, but page cache is &lt;strong&gt;reclaimable&lt;/strong&gt; — the kernel will automatically free it when applications need more RAM. It's essentially "free memory being used productively."&lt;/p&gt;

&lt;p&gt;This is why &lt;code&gt;MemAvailable&lt;/code&gt; is often much higher than &lt;code&gt;MemFree&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MemFree:        87 MB    (truly unused)
MemAvailable:  735 MB    (free + reclaimable cache)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key insight&lt;/strong&gt;: If you see low &lt;code&gt;MemFree&lt;/code&gt; but healthy &lt;code&gt;MemAvailable&lt;/code&gt;, your system is fine — the kernel is just being smart about caching. Panic when &lt;code&gt;MemAvailable&lt;/code&gt; is low.&lt;/p&gt;
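&lt;p&gt;A minimal sketch of that check, using invented &lt;code&gt;/proc/meminfo&lt;/code&gt; values:&lt;/p&gt;

```python
# Sketch (invented values): low MemFree with a healthy MemAvailable is fine;
# only a low MemAvailable fraction is a real warning sign.
SAMPLE_MEMINFO = """\
MemTotal:       4096000 kB
MemFree:          89088 kB
MemAvailable:    752640 kB
"""

def meminfo_kb(text):
    """Parse 'Key:  value kB' lines into a dict of ints (kB)."""
    out = {}
    for line in text.splitlines():
        key, rest = line.split(":", 1)
        out[key] = int(rest.split()[0])
    return out

def memory_ok(meminfo_text, min_available_frac=0.10):
    """Healthy if MemAvailable is at least min_available_frac of MemTotal."""
    m = meminfo_kb(meminfo_text)
    return m["MemAvailable"] / m["MemTotal"] >= min_available_frac

print(memory_ok(SAMPLE_MEMINFO))  # prints True (about 18% available, 2% free)
```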




&lt;h2&gt;
  
  
  Layer 3: Kernel Slab Memory (The Hidden Consumer)
&lt;/h2&gt;

&lt;p&gt;This is where things get interesting — and where our production incident hid for months.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the Slab Allocator?
&lt;/h3&gt;

&lt;p&gt;The Linux kernel constantly needs to create and destroy small data structures: file descriptors, inode objects, network packet headers, process descriptors, and hundreds of other internal types. Allocating and freeing these one at a time from the general-purpose memory allocator would be slow.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;slab allocator&lt;/strong&gt; solves this by maintaining &lt;strong&gt;pre-allocated pools&lt;/strong&gt; for each object type. Think of it like a restaurant kitchen with separate prep stations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Instead of:
  "I need an inode" → malloc(sizeof(inode)) → slow, fragmentation

The kernel does:
  "I need an inode" → grab one from the inode pool → fast, no fragmentation
  "Done with inode" → return it to the pool → ready for reuse
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each pool is called a &lt;strong&gt;slab cache&lt;/strong&gt;. You can see all of them in &lt;code&gt;/proc/slabinfo&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/slabinfo | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="nt"&gt;-k3&lt;/span&gt; &lt;span class="nt"&gt;-rn&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kmalloc-1k        1,667,384   1024 bytes each  →  1,632 MB
skbuff_head_cache 1,657,980    256 bytes each  →    414 MB
dentry                9,248    192 bytes each  →    1.7 MB
xfs_inode             9,649   1024 bytes each  →    9.4 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
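&lt;p&gt;A rough way to turn slab counts into megabytes is &lt;code&gt;num_objs * objsize&lt;/code&gt;. A sketch over simplified sample lines (real &lt;code&gt;/proc/slabinfo&lt;/code&gt; has more columns and a header):&lt;/p&gt;

```python
# Rough sketch: approximate each slab cache's footprint as num_objs * objsize.
# The sample lines mimic the columns used here (name, active_objs, num_objs,
# objsize); the values mirror the broken node above.
SAMPLE_SLABINFO = [
    "kmalloc-1k        1667384 1667384 1024",
    "skbuff_head_cache 1657980 1657980  256",
    "dentry               9248    9248  192",
]

def slab_mb(line):
    """Return (cache_name, approximate MiB) for one simplified slabinfo line."""
    fields = line.split()
    name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
    return name, num_objs * objsize / (1024 * 1024)

for line in SAMPLE_SLABINFO:
    name, mb = slab_mb(line)
    print(f"{name:18s} {mb:8.1f} MiB")
```

&lt;p&gt;The totals land close to the table above; exact figures differ slightly because the kernel also tracks per-slab overhead.&lt;/p&gt;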



&lt;h3&gt;
  
  
  SReclaimable vs SUnreclaim
&lt;/h3&gt;

&lt;p&gt;Slab memory is split into two categories:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SReclaimable&lt;/strong&gt; — Slab caches that hold &lt;strong&gt;cached data&lt;/strong&gt; the kernel can regenerate. The biggest example is the &lt;strong&gt;dentry cache&lt;/strong&gt; (directory entry cache), which caches filesystem path lookups. If memory is needed, the kernel can shrink these caches.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SUnreclaim&lt;/strong&gt; — Slab caches that hold &lt;strong&gt;active data&lt;/strong&gt; the kernel is currently using. Network packet buffers, open file descriptors, active inode structures. These &lt;strong&gt;cannot be freed&lt;/strong&gt; until the code that created them explicitly releases them.&lt;/p&gt;

&lt;p&gt;On a healthy node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SReclaimable:    200 MB   (caches, will shrink if needed)
SUnreclaim:      100 MB   (active kernel objects)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On our broken node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SReclaimable:     50 MB
SUnreclaim:    2,143 MB   ← 21x normal!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why Kubernetes Can't See Slab Memory
&lt;/h3&gt;

&lt;p&gt;Kubernetes resource metrics come from &lt;strong&gt;cgroups&lt;/strong&gt; (control groups), which track memory allocated by processes inside containers. These slab allocations were charged to the kernel, &lt;strong&gt;not to any container's cgroup&lt;/strong&gt; — cgroup v2 can attribute some kernel memory to containers, but allocations made in interrupt context, like network buffers, typically land outside every container. Even if your container triggered the kernel allocation (by sending a network packet, for example), the slab memory shows up as kernel memory, not container memory.&lt;/p&gt;

&lt;p&gt;This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;kubectl top&lt;/code&gt; won't show it&lt;/li&gt;
&lt;li&gt;Prometheus container metrics won't show it&lt;/li&gt;
&lt;li&gt;Your pod's memory limit won't be hit by it&lt;/li&gt;
&lt;li&gt;But it still uses physical RAM on the node&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only way to see it is by checking &lt;code&gt;/proc/meminfo&lt;/code&gt; or using node-exporter's &lt;code&gt;node_memory_SUnreclaim_bytes&lt;/code&gt; metric.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 4: Swap — The Emergency Overflow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is Swap?
&lt;/h3&gt;

&lt;p&gt;Swap is a section of the disk that Linux uses as &lt;strong&gt;overflow memory&lt;/strong&gt; when physical RAM is full.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RAM (4 GB)     →  Fast (nanoseconds)    →  Expensive
Disk/Swap      →  Slow (milliseconds)   →  Cheap
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the kernel needs to free up RAM (because something needs more memory and there's nothing reclaimable left), it takes memory pages that haven't been accessed recently and writes them to the swap area on disk. This is called &lt;strong&gt;swapping out&lt;/strong&gt; or &lt;strong&gt;paging out&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  A Step-by-Step Example
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Stage 1: Everything fits in RAM&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RAM  [App 750MB] [Kubelet 200MB] [Other 500MB] [Cache 700MB] [Free 1.8GB]
Swap [empty]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All processes' memory is in RAM. Memory access is fast. No problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 2: RAM fills up&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RAM  [App 830MB] [Kubelet 200MB] [Other 800MB] [Cache 700MB] [Slab 2.1GB] [Free 87MB]
Swap [empty]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Free memory is nearly gone. The kernel starts shrinking the page cache, but slab (SUnreclaim) can't be freed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 3: Swap kicks in&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RAM  [App 750MB] [Kubelet 100MB] [Other 600MB] [Slab 2.1GB] [Cache 300MB]
Swap [Kubelet-old-pages 100MB | App-idle-pages 80MB | Other 320MB] = 500MB used
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The kernel identified memory pages that hadn't been accessed recently and moved them to disk. RAM now has room for active work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 4: Swap thrashing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is where things go catastrophically wrong. When a process needs a page that was swapped out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Normal access (page in RAM):
  CPU: "Give me address 0x1234"
  RAM: "Here you go"
  → 100 nanoseconds

Swapped access (page on disk):
  CPU: "Give me address 0x1234"
  RAM: "Not here — it's on disk"              → PAGE FAULT
  Kernel: "I need to load it from swap"
  Kernel: "But RAM is full. Let me swap OUT another page first"
  Disk write: Evict some other page to swap    → 1-5 milliseconds
  Disk read: Load the requested page           → 1-5 milliseconds
  CPU: "Finally!"
  → 2-10 milliseconds total (100,000x slower)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now multiply this by dozens of processes, all needing pages that were swapped out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Process A needs a page → it's on disk → swap in A, swap out B → 5ms
Process B runs → needs its page → swapped out by A! → swap in B, swap out C → 5ms
Process C runs → needs its page → swapped out by B! → swap in C, swap out A → 5ms
Process A runs → needs its page → swapped out by C! → ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This circular eviction is &lt;strong&gt;swap thrashing&lt;/strong&gt;. The system does almost no useful work — all CPU time is spent managing page faults and disk I/O.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Swap Thrashing Looks Like a CPU Problem
&lt;/h3&gt;

&lt;p&gt;CloudWatch and &lt;code&gt;top&lt;/code&gt; will show 100% CPU utilization during swap thrashing. But the CPU isn't doing computation. Here's the breakdown:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Actual computation:      ~5%     (your app, kubelet, etc.)
Kernel swap management:  ~30%    (deciding what to evict, page table updates)
I/O wait:               ~65%    (waiting for disk reads/writes)
────────────────────────────────
Total:                  ~100%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The load average also skyrockets because Linux counts processes in &lt;strong&gt;uninterruptible sleep&lt;/strong&gt; (waiting for disk I/O) in the load average. If 30 processes are all waiting for swap pages, the load average shows 30 — even though very little CPU work is happening.&lt;/p&gt;

&lt;p&gt;This is why our node showed a load average of 34 with pods using only 85m of CPU. The CPUs weren't busy computing — they were busy &lt;strong&gt;waiting for the disk&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is skbuff? (Socket Buffers)
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;sk_buff&lt;/code&gt; (socket buffer) is the data structure at the heart of Linux networking. Every network packet — in or out — is represented by an &lt;code&gt;sk_buff&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Anatomy of a Network Packet in Linux
&lt;/h3&gt;

&lt;p&gt;When your container sends an HTTP request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Application: send("GET /health HTTP/1.1\r\n...")
    ↓
Kernel: allocate an sk_buff
    ├── skbuff_head_cache entry (256 bytes) — metadata, pointers, protocol info
    └── kmalloc-1k entry (1024 bytes) — the actual packet data
    ↓
Network stack: add TCP header, IP header, Ethernet header
    ↓
Network driver: transmit the packet
    ↓
Kernel: free the sk_buff ← THIS is what wasn't happening
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On a healthy system, &lt;code&gt;sk_buff&lt;/code&gt; structures are allocated when a packet is created and freed when the packet is sent/received/dropped. The slab pool recycles them efficiently.&lt;/p&gt;

&lt;h3&gt;
  
  
  What a Leak Looks Like
&lt;/h3&gt;

&lt;p&gt;On our node, we found:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;skbuff_head_cache:  1,657,980 objects  (414 MB)
kmalloc-1k:         1,667,384 objects  (1,632 MB)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The almost 1:1 ratio between skbuff headers and 1KB allocations is the signature of a network packet leak. Each packet consists of a header + data buffer. 1.66 million packets were stuck in kernel memory, never freed.&lt;/p&gt;

&lt;p&gt;At a normal rate of ~1000 packets/second, 1.66 million packets represents about &lt;strong&gt;28 minutes of traffic&lt;/strong&gt; that was captured and never released. Over days and weeks, with the leaking tool constantly intercepting traffic, this accumulated to gigabytes.&lt;/p&gt;
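&lt;p&gt;The back-of-envelope math, with the packet rate taken as an assumption rather than a measured value:&lt;/p&gt;

```python
# Back-of-envelope check of the leak's age. The 1000 packets/second rate is
# an assumed figure, not a measurement.
def leak_duration_minutes(leaked_objects, packets_per_second=1000):
    """Minutes of traffic needed to produce this many sk_buff objects."""
    return leaked_objects / packets_per_second / 60

print(round(leak_duration_minutes(1_657_980)))  # prints 28
```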




&lt;h2&gt;
  
  
  How to Investigate Memory Issues on Kubernetes Nodes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Check if the problem is even memory
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/pressure/memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;some avg10=98.98 avg60=98.90 avg300=98.38 total=381246311078
full avg10=62.85 avg60=63.91 avg300=63.33 total=281968539996
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;some &amp;gt; 50%&lt;/code&gt; → memory pressure exists&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;full &amp;gt; 10%&lt;/code&gt; → severe memory pressure (all tasks stalling)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;full &amp;gt; 50%&lt;/code&gt; → critical — system is barely functional&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 2: Get the full memory breakdown
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"MemTotal|MemFree|MemAvailable|Buffers|Cached|Slab|SReclaimable|SUnreclaim|SwapTotal|SwapFree|AnonPages|Committed_AS"&lt;/span&gt; /proc/meminfo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read it as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MemTotal         → Total physical RAM
MemFree          → Completely unused RAM
MemAvailable     → Free + reclaimable (what's actually available)
AnonPages        → Application memory (what kubectl roughly shows)
Cached + Buffers → Page cache (reclaimable, usually harmless)
Slab             → Kernel internal allocations
  SReclaimable   → Kernel caches (can be freed)
  SUnreclaim     → Active kernel objects (cannot be freed!)
SwapTotal        → Total swap space
SwapFree         → Unused swap (SwapTotal - SwapFree = swap used)
Committed_AS     → Total memory promised to all processes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Red flags&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;SUnreclaim&lt;/code&gt; &amp;gt; 500 MB on a small node → possible kernel memory leak&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Committed_AS&lt;/code&gt; &amp;gt; &lt;code&gt;MemTotal + SwapTotal&lt;/code&gt; → system is overcommitted&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SwapFree&lt;/code&gt; much less than &lt;code&gt;SwapTotal&lt;/code&gt; → active swapping&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MemAvailable&lt;/code&gt; &amp;lt; 10% of &lt;code&gt;MemTotal&lt;/code&gt; → trouble ahead&lt;/li&gt;
&lt;/ul&gt;
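&lt;p&gt;Those red flags can be encoded as a small checker — a sketch with invented sample values, all in kB as &lt;code&gt;/proc/meminfo&lt;/code&gt; reports them:&lt;/p&gt;

```python
# Sketch: the red flags above as checks over parsed /proc/meminfo values
# (all in kB, as /proc/meminfo reports them; sample values are invented).
SAMPLE = {
    "MemTotal": 4096000, "MemAvailable": 350000,
    "SUnreclaim": 2194432, "SwapTotal": 2097152, "SwapFree": 1572864,
    "Committed_AS": 5500000,
}

def red_flags(m):
    """Return the list of triggered red flags for a meminfo dict (kB values)."""
    flags = []
    if m["SUnreclaim"] > 500 * 1024:                    # over 500 MB
        flags.append("possible kernel memory leak")
    if m["Committed_AS"] > m["MemTotal"] + m["SwapTotal"]:
        flags.append("overcommitted")
    if m["SwapTotal"] - m["SwapFree"] > 0.5 * m["SwapTotal"]:
        flags.append("active swapping")
    if m["MemTotal"] > 10 * m["MemAvailable"]:          # under 10% available
        flags.append("low available memory")
    return flags

print(red_flags(SAMPLE))  # ['possible kernel memory leak', 'low available memory']
```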

&lt;h3&gt;
  
  
  Step 3: If slab is high, find out what's in it
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Show top slab consumers by object count&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/slabinfo | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="nt"&gt;-k3&lt;/span&gt; &lt;span class="nt"&gt;-rn&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Common slab objects and what they mean:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Object&lt;/th&gt;
&lt;th&gt;What It Is&lt;/th&gt;
&lt;th&gt;High Count Means&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;skbuff_head_cache&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Network packet headers&lt;/td&gt;
&lt;td&gt;Network packet leak or very high traffic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kmalloc-*&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;General kernel allocations&lt;/td&gt;
&lt;td&gt;Often paired with another leak&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dentry&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Directory entry cache&lt;/td&gt;
&lt;td&gt;Many files/paths accessed (usually reclaimable)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;inode_cache&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;File inode cache&lt;/td&gt;
&lt;td&gt;Many files accessed (usually reclaimable)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ext4_inode_cache&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ext4 filesystem inodes&lt;/td&gt;
&lt;td&gt;Same as above, ext4 specific&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;nf_conntrack&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Connection tracking entries&lt;/td&gt;
&lt;td&gt;Too many network connections / conntrack leak&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Step 4: Check for swap thrashing
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Load average (should be &amp;lt; number of CPUs)&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/loadavg

&lt;span class="c"&gt;# Swap usage&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"SwapTotal|SwapFree"&lt;/span&gt; /proc/meminfo

&lt;span class="c"&gt;# If swap is being actively used, check swap I/O&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/vmstat | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"pswpin|pswpout"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pswpin&lt;/code&gt; = pages swapped in from disk (high = thrashing)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pswpout&lt;/code&gt; = pages swapped out to disk (high = thrashing)&lt;/li&gt;
&lt;/ul&gt;
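&lt;p&gt;Since &lt;code&gt;pswpin&lt;/code&gt; and &lt;code&gt;pswpout&lt;/code&gt; are cumulative counters, what matters is their rate of change between two samples. A sketch with invented values:&lt;/p&gt;

```python
# Sketch: pswpin and pswpout are cumulative counters in /proc/vmstat, so
# thrashing shows up as a large delta between two samples taken a few
# seconds apart (the sample values below are invented).
def swap_rate(before, after, interval_s):
    """Pages swapped in plus out, per second, between two vmstat samples."""
    delta = (after["pswpin"] - before["pswpin"]) + (after["pswpout"] - before["pswpout"])
    return delta / interval_s

t0 = {"pswpin": 1_200_000, "pswpout": 1_500_000}
t1 = {"pswpin": 1_212_000, "pswpout": 1_518_000}
print(swap_rate(t0, t1, interval_s=10))  # prints 3000.0 (pages/s: thrashing)
```

&lt;p&gt;Sustained rates in the thousands of pages per second are the thrashing signature; near-zero rates with nonzero swap usage just mean old cold pages are parked on disk.&lt;/p&gt;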




&lt;h2&gt;
  
  
  Monitoring: What to Alert On
&lt;/h2&gt;

&lt;p&gt;If you're running Prometheus with node-exporter, set up alerts for these metrics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Alert when non-reclaimable slab memory exceeds 500MB&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;alert&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HighKernelSlabMemory&lt;/span&gt;
  &lt;span class="na"&gt;expr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node_memory_SUnreclaim_bytes &amp;gt; 500 * 1024 * &lt;/span&gt;&lt;span class="m"&gt;1024&lt;/span&gt;
  &lt;span class="na"&gt;for&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30m&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;warning&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;High&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;non-reclaimable&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;kernel&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;slab&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;memory&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;on&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;$labels.instance&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}"&lt;/span&gt;

&lt;span class="c1"&gt;# Alert when swap usage exceeds 50%&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;alert&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HighSwapUsage&lt;/span&gt;
  &lt;span class="na"&gt;expr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;(1 - node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) &amp;gt; &lt;/span&gt;&lt;span class="m"&gt;0.5&lt;/span&gt;
  &lt;span class="na"&gt;for&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;15m&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;warning&lt;/span&gt;

&lt;span class="c1"&gt;# Alert when memory pressure is high (PSI)&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;alert&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MemoryPressureHigh&lt;/span&gt;
  &lt;span class="na"&gt;expr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node_pressure_memory_stalled_seconds_total rate &amp;gt; &lt;/span&gt;&lt;span class="m"&gt;0.5&lt;/span&gt;
  &lt;span class="na"&gt;for&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5m&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;critical&lt;/span&gt;

&lt;span class="c1"&gt;# Alert when available memory is critically low&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;alert&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LowAvailableMemory&lt;/span&gt;
  &lt;span class="na"&gt;expr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes &amp;lt; &lt;/span&gt;&lt;span class="m"&gt;0.1&lt;/span&gt;
  &lt;span class="na"&gt;for&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10m&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;critical&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;kubectl top&lt;/code&gt; only shows container memory.&lt;/strong&gt; The kernel can consume gigabytes that are invisible to Kubernetes. Always check &lt;code&gt;/proc/meminfo&lt;/code&gt; when debugging node-level memory issues.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;High &lt;code&gt;SUnreclaim&lt;/code&gt; means something is wrong.&lt;/strong&gt; Normal is 50-200 MB. If it's in the gigabytes, you have a kernel memory leak — find the leaking slab cache in &lt;code&gt;/proc/slabinfo&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Swap thrashing masquerades as a CPU problem.&lt;/strong&gt; If you see high CPU + high load average + swap usage, the CPU isn't busy computing — it's busy waiting for disk I/O from swap.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Page cache is not a problem.&lt;/strong&gt; Low &lt;code&gt;MemFree&lt;/code&gt; with healthy &lt;code&gt;MemAvailable&lt;/code&gt; is normal — the kernel is caching files intelligently. Only worry when &lt;code&gt;MemAvailable&lt;/code&gt; drops.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Network monitoring tools can leak socket buffers.&lt;/strong&gt; Any tool that intercepts packets at the kernel level (Weave Scope, long-running tcpdump, certain service mesh sidecars) can accumulate &lt;code&gt;sk_buff&lt;/code&gt; objects in slab memory over time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monitor &lt;code&gt;node_memory_SUnreclaim_bytes&lt;/code&gt;.&lt;/strong&gt; This is the one metric that would have caught our issue months before it caused an outage.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;em&gt;This post is part of a series on debugging Kubernetes pod terminations. Read the full incident story: &lt;a href="https://dev.tolink-to-blog-1"&gt;Why Your Kubernetes Pod Keeps Getting Killed — And It's Not an OOMKill&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>linux</category>
      <category>monitoring</category>
      <category>performance</category>
    </item>
    <item>
      <title>AWS Burstable Instances Explained: CPU Credits, Throttling, and Why Your t3 Instance Isn't What You Think</title>
      <dc:creator>Shreyans Sonthalia</dc:creator>
      <pubDate>Wed, 15 Apr 2026 12:10:14 +0000</pubDate>
      <link>https://dev.to/ssshreyans26/aws-burstable-instances-explained-cpu-credits-throttling-and-why-your-t3-instance-isnt-what-you-39o4</link>
      <guid>https://dev.to/ssshreyans26/aws-burstable-instances-explained-cpu-credits-throttling-and-why-your-t3-instance-isnt-what-you-39o4</guid>
      <description>&lt;p&gt;&lt;em&gt;You launched a t3a.medium with "2 vCPUs" but you're not getting 2 CPUs. Here's what you're actually paying for.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Misconception
&lt;/h2&gt;

&lt;p&gt;You go to the AWS console, launch a &lt;code&gt;t3a.medium&lt;/code&gt;, and see this in the spec:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;vCPUs&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;4 GiB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Price&lt;/td&gt;
&lt;td&gt;~$0.047/hr&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most engineers assume they're getting 2 full CPU cores, always available, for $0.047/hr. &lt;strong&gt;That's not what's happening.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What the "T" in T3 Means
&lt;/h2&gt;

&lt;p&gt;AWS has several instance families:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Family&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;CPU Model&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;T3/T3a/T4g&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Burstable&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shared, credit-based&lt;/td&gt;
&lt;td&gt;t3a.medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;M5/M6i/M7i&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;General purpose&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Dedicated&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;m5.large&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;C5/C6i/C7i&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Compute optimized&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Dedicated&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;c5.large&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The &lt;strong&gt;"T" stands for burstable&lt;/strong&gt;. When you buy a T-series instance, you're not buying dedicated CPU cores. You're buying a &lt;strong&gt;fraction&lt;/strong&gt; of a CPU with the ability to temporarily use more.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;t3a.medium&lt;/code&gt; gives you &lt;strong&gt;20% of each vCPU as a baseline&lt;/strong&gt; — meaning you can continuously use &lt;strong&gt;0.4 vCPUs&lt;/strong&gt; (20% x 2). The other 80% is available on-demand, but only if you have &lt;strong&gt;CPU credits&lt;/strong&gt; to spend.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why is it cheaper?
&lt;/h3&gt;

&lt;p&gt;This is the deal AWS offers: because most workloads don't use 100% CPU all the time, AWS can pack ~5 burstable instances onto the same physical hardware that would serve 1 dedicated instance. You get a discount; AWS gets better hardware utilization.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;t3a.medium:  2 vCPUs (burstable, 20% baseline)  →  ~$0.047/hr
m5.large:    2 vCPUs (dedicated, 100% always)    →  ~$0.096/hr
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The m5.large costs about twice as much because those vCPUs are reserved for you, always.&lt;/p&gt;




&lt;h2&gt;
  
  
  How CPU Credits Work
&lt;/h2&gt;

&lt;p&gt;The credit system is how AWS meters your burst usage.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Basic Math
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1 CPU credit = 1 vCPU running at 100% for 1 minute&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;t3a.medium&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Earns&lt;/strong&gt;: 24 credits per hour (12 per vCPU x 2 vCPUs)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Baseline&lt;/strong&gt;: 20% per vCPU (this is what 24 credits/hr translates to)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum balance&lt;/strong&gt;: 576 credits (can bank up to 24 hours worth)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A Real-World Example
&lt;/h3&gt;

&lt;p&gt;Say you're running a Kubernetes node with a service that normally uses &lt;strong&gt;0.01 CPU&lt;/strong&gt; (1% of one core). That's well under the 0.4 baseline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Earning:     24 credits/hour
Spending:    ~0.6 credits/hour  (0.01 CPU ≈ 0.5% utilization)
Net:         +23.4 credits/hour accumulating
Max balance: 576 credits
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your credit balance slowly fills up over 24 hours. Life is good.&lt;/p&gt;

&lt;p&gt;Now imagine a traffic spike hits and the node needs full CPU:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hour 1:  2.0 vCPUs used (100%)  → spends 120 credits, earns 24
Hour 2:  2.0 vCPUs used (100%)  → net drain: 96 credits/hour
Hour 3:  2.0 vCPUs used (100%)  → balance: 576 - 288 = 288
...
After 6 hours at full CPU:      → 576 credits exhausted
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
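&lt;p&gt;One nuance worth making explicit: the instance keeps earning its 24 credits/hour even while bursting, so a full balance drains at the net rate rather than the gross spend. A quick sketch with the same t3a.medium figures:&lt;/p&gt;

```python
# Time for a full t3a.medium credit balance to drain at 100% CPU.
# Credits accrue continuously, so the balance falls at (spend - earn).
max_balance = 576      # credits (24 hours of accrual)
spend_per_hour = 120   # 2 vCPUs x 60 minutes at 100%
earn_per_hour = 24

net_drain = spend_per_hour - earn_per_hour         # 96 credits/hour
hours_until_throttled = max_balance / net_drain    # 6.0

print(hours_until_throttled)  # 6.0
```

&lt;p&gt;Ignoring the concurrent earning gives the rougher 576 / 120 ≈ 4.8 hours figure; either way, a spike sustained for a few hours empties the tank.&lt;/p&gt;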



&lt;h3&gt;
  
  
  The Prepaid Data Plan Analogy
&lt;/h3&gt;

&lt;p&gt;Think of CPU credits like a &lt;strong&gt;prepaid mobile data plan&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You get 1 GB/day at &lt;strong&gt;4G speed&lt;/strong&gt; (full CPU)&lt;/li&gt;
&lt;li&gt;After 1 GB is used up, you're &lt;strong&gt;throttled to 2G speed&lt;/strong&gt; (20% baseline)&lt;/li&gt;
&lt;li&gt;You can still use the internet, but everything is painfully slow&lt;/li&gt;
&lt;li&gt;The next day your quota refills and you're back to full speed&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Happens When Credits Hit Zero
&lt;/h2&gt;

&lt;p&gt;This is where things get serious.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;With credits:    2.0 vCPUs available at full speed
Without credits: 2.0 vCPUs CAPPED at 20% → effectively 0.4 vCPUs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AWS hypervisor literally &lt;strong&gt;limits how many CPU cycles your instance can execute&lt;/strong&gt;. Your instance still shows 2 vCPUs, but each one can only do 20% of the work.&lt;/p&gt;

&lt;h3&gt;
  
  
  Impact on Kubernetes
&lt;/h3&gt;

&lt;p&gt;On a Kubernetes node throttled to 0.4 vCPUs, everything competes for scraps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubelet              → needs CPU for heartbeats every 10s
kube-proxy           → needs CPU for network rules
containerd           → container runtime
OS processes         → systemd, journald, etc.
Your application     → the thing you actually care about
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the kubelet can't send a heartbeat to the API server within 40 seconds (the default &lt;code&gt;node-monitor-grace-period&lt;/code&gt;), the API server marks the node as &lt;strong&gt;NodeNotReady&lt;/strong&gt; and starts evicting pods. Your application goes down — not because it was using too much CPU, but because the underlying node was throttled.&lt;/p&gt;




&lt;h2&gt;
  
  
  T3 Unlimited Mode
&lt;/h2&gt;

&lt;p&gt;AWS offers a way out: &lt;strong&gt;T3 Unlimited mode&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check current mode&lt;/span&gt;
aws ec2 describe-instance-credit-specifications &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--instance-ids&lt;/span&gt; &amp;lt;instance-id&amp;gt;

&lt;span class="c"&gt;# Enable unlimited mode&lt;/span&gt;
aws ec2 modify-instance-credit-specification &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--instance-credit-specification&lt;/span&gt; &lt;span class="nv"&gt;InstanceId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;,CpuCredits&lt;span class="o"&gt;=&lt;/span&gt;unlimited
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With Unlimited mode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your instance &lt;strong&gt;never gets throttled&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;When credits are exhausted, you keep bursting at full speed&lt;/li&gt;
&lt;li&gt;You pay a small surcharge for "surplus credits" (~$0.05 per vCPU-hour on t3a)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When Unlimited Mode Costs Extra
&lt;/h3&gt;

&lt;p&gt;You only pay extra when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your earned credits are exhausted, AND&lt;/li&gt;
&lt;li&gt;You're using more than baseline (20%)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If your average usage is below 20%, Unlimited mode costs &lt;strong&gt;nothing extra&lt;/strong&gt; — you earn enough credits to cover the occasional burst.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Average 10% usage:  Free — credits cover all bursts
Average 20% usage:  Free — exactly at baseline
Average 50% usage:  Extra cost — 30% surplus x $0.05/vCPU-hr
Average 100% usage: Expensive — just use a dedicated instance
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
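&lt;p&gt;To put numbers on that ladder, here's a sketch of the surplus charge for a t3a.medium, assuming the ~$0.05 per surplus vCPU-hour rate quoted earlier (the function is illustrative, not an AWS API):&lt;/p&gt;

```python
# Unlimited-mode surplus charge for a t3a.medium (2 vCPUs, 20% baseline).
# You pay only for average utilization above baseline, per vCPU-hour.
VCPUS = 2
BASELINE = 0.20
SURPLUS_RATE = 0.05  # USD per surplus vCPU-hour (approximate)

def surplus_cost_per_hour(avg_utilization: float) -> float:
    """Extra hourly cost when average per-vCPU utilization exceeds baseline."""
    over_baseline = max(avg_utilization - BASELINE, 0.0)
    return over_baseline * VCPUS * SURPLUS_RATE

print(surplus_cost_per_hour(0.10))            # 0.0 -- credits cover it
print(round(surplus_cost_per_hour(0.50), 4))  # 0.03
```

&lt;p&gt;At 50% average utilization that's roughly $0.03/hr on top of the $0.047 base, about $0.077/hr total, most of the way to an m5.large's $0.096/hr. That's exactly why sustained-high workloads belong on dedicated instances.&lt;/p&gt;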






&lt;h2&gt;
  
  
  Credit Balance: How to Check and What to Look For
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Via CloudWatch
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws cloudwatch get-metric-statistics &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; AWS/EC2 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--metric-name&lt;/span&gt; CPUCreditBalance &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dimensions&lt;/span&gt; &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;InstanceId,Value&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;instance-id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--start-time&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'6 hours ago'&lt;/span&gt; +%Y-%m-%dT%H:%M:%S&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--end-time&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; +%Y-%m-%dT%H:%M:%S&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--period&lt;/span&gt; 300 &lt;span class="nt"&gt;--statistics&lt;/span&gt; Average
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key Metrics to Monitor
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;What It Means&lt;/th&gt;
&lt;th&gt;Alert When&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CPUCreditBalance&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Earned credits remaining&lt;/td&gt;
&lt;td&gt;Drops below 50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CPUSurplusCreditBalance&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Surplus credits used (Unlimited mode)&lt;/td&gt;
&lt;td&gt;Consistently above 0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CPUSurplusCreditsCharged&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Surplus credits you're paying for&lt;/td&gt;
&lt;td&gt;Unexpected charges&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CPUCreditUsage&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Credits spent in the period&lt;/td&gt;
&lt;td&gt;Sustained high usage&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Reading the Credit Balance
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;576 credits  → Full (24 hours of baseline earned)
200 credits  → Healthy — some bursting happening
50 credits   → Warning — approaching exhaustion
0 credits    → Standard mode: THROTTLED / Unlimited mode: paying surplus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Instance Comparison: When to Use What
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Recommended&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Dev/staging environments&lt;/td&gt;
&lt;td&gt;t3a.medium&lt;/td&gt;
&lt;td&gt;Low baseline usage, cost-effective&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kubernetes worker nodes (production)&lt;/td&gt;
&lt;td&gt;m5.large or m6i.large&lt;/td&gt;
&lt;td&gt;Predictable performance, no throttling risk&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI/CD build agents&lt;/td&gt;
&lt;td&gt;t3a.xlarge with Unlimited&lt;/td&gt;
&lt;td&gt;Burst during builds, idle otherwise&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databases&lt;/td&gt;
&lt;td&gt;m5/r5 series&lt;/td&gt;
&lt;td&gt;Never throttle a database&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch processing&lt;/td&gt;
&lt;td&gt;c5/c6i series&lt;/td&gt;
&lt;td&gt;Sustained compute needs dedicated CPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Single dedicated-node workloads&lt;/td&gt;
&lt;td&gt;m5.large over t3a.medium&lt;/td&gt;
&lt;td&gt;Same vCPU count, guaranteed performance, ~2x the cost&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Hidden Cost of Burstable
&lt;/h3&gt;

&lt;p&gt;A &lt;code&gt;t3a.medium&lt;/code&gt; at $0.047/hr seems cheaper than an &lt;code&gt;m5.large&lt;/code&gt; at $0.096/hr. But consider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When a t3a node gets throttled and your pod gets evicted, what's the cost of that downtime?&lt;/li&gt;
&lt;li&gt;When you spend 3 hours debugging why a pod keeps dying, what's the engineering cost?&lt;/li&gt;
&lt;li&gt;If you enable Unlimited and burst frequently, the surplus charges can approach dedicated instance pricing anyway&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For production Kubernetes nodes, the small extra cost of dedicated instances often pays for itself in reliability and reduced debugging time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Reference: T3/T3a Instance Family
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Instance&lt;/th&gt;
&lt;th&gt;vCPUs&lt;/th&gt;
&lt;th&gt;RAM&lt;/th&gt;
&lt;th&gt;Baseline/vCPU&lt;/th&gt;
&lt;th&gt;Credits/hr&lt;/th&gt;
&lt;th&gt;Max Balance&lt;/th&gt;
&lt;th&gt;Price/hr (Mumbai)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;t3a.micro&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1 GiB&lt;/td&gt;
&lt;td&gt;10%&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;288&lt;/td&gt;
&lt;td&gt;~$0.012&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;t3a.small&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2 GiB&lt;/td&gt;
&lt;td&gt;20%&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;576&lt;/td&gt;
&lt;td&gt;~$0.024&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;t3a.medium&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;4 GiB&lt;/td&gt;
&lt;td&gt;20%&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;576&lt;/td&gt;
&lt;td&gt;~$0.047&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;t3a.large&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;8 GiB&lt;/td&gt;
&lt;td&gt;30%&lt;/td&gt;
&lt;td&gt;36&lt;/td&gt;
&lt;td&gt;864&lt;/td&gt;
&lt;td&gt;~$0.075&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;t3a.xlarge&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;16 GiB&lt;/td&gt;
&lt;td&gt;40%&lt;/td&gt;
&lt;td&gt;96&lt;/td&gt;
&lt;td&gt;2304&lt;/td&gt;
&lt;td&gt;~$0.150&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Note: Baseline percentages are &lt;strong&gt;per vCPU&lt;/strong&gt;. A t3a.medium with 20% baseline on 2 vCPUs gives you 0.4 vCPUs of sustained compute.&lt;/p&gt;
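&lt;p&gt;Since the per-vCPU baseline trips people up, here's the sustained-capacity arithmetic for the instances in the table (vCPU counts and baselines as listed above):&lt;/p&gt;

```python
# Sustained capacity = baseline fraction per vCPU x vCPU count.
# Figures from the T3a quick-reference table.
instances = {
    "t3a.micro":  (2, 0.10),
    "t3a.small":  (2, 0.20),
    "t3a.medium": (2, 0.20),
    "t3a.large":  (2, 0.30),
    "t3a.xlarge": (4, 0.40),
}

for name, (vcpus, baseline) in instances.items():
    print(f"{name:<11} {vcpus * baseline:.1f} sustained vCPUs")
```

&lt;p&gt;So even the largest instance here, the t3a.xlarge, only guarantees 1.6 of its 4 vCPUs as sustained compute.&lt;/p&gt;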




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;T-series instances are not dedicated compute.&lt;/strong&gt; The "2 vCPUs" you see is the burst ceiling, not the sustained capacity. Your sustained capacity is the baseline percentage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CPU credit exhaustion causes throttling, not failure.&lt;/strong&gt; Your instance doesn't stop — it slows down. This is often worse than a crash because it causes cascading timeouts and hard-to-diagnose performance issues.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable Unlimited mode on all production T-series instances.&lt;/strong&gt; There's no reason to risk throttling in production. The surplus cost is minimal for occasional bursts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;If you consistently need more than baseline, switch to a dedicated instance.&lt;/strong&gt; T-series instances are designed for workloads that are mostly idle with occasional spikes — not for sustained high CPU usage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monitor &lt;code&gt;CPUCreditBalance&lt;/code&gt; in CloudWatch.&lt;/strong&gt; Set up alerts before credits hit zero so you can react proactively.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;em&gt;This post is part of a series on debugging Kubernetes pod terminations. Read the full incident story: &lt;a href="https://dev.tolink-to-blog-1"&gt;Why Your Kubernetes Pod Keeps Getting Killed — And It's Not an OOMKill&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>performance</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
