<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shailendra Verma</title>
    <description>The latest articles on DEV Community by Shailendra Verma (@shailendra_verma).</description>
    <link>https://dev.to/shailendra_verma</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3675312%2F795a4859-3d7c-4f62-bdeb-d8163500ccec.jpeg</url>
      <title>DEV Community: Shailendra Verma</title>
      <link>https://dev.to/shailendra_verma</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shailendra_verma"/>
    <language>en</language>
    <item>
      <title>When Containers Kill Nodes: Understanding Zombie Processes and PID 1</title>
      <dc:creator>Shailendra Verma</dc:creator>
      <pubDate>Wed, 24 Dec 2025 14:42:39 +0000</pubDate>
      <link>https://dev.to/shailendra_verma/when-containers-kill-nodes-understanding-zombie-processes-and-pid-1-3o7f</link>
      <guid>https://dev.to/shailendra_verma/when-containers-kill-nodes-understanding-zombie-processes-and-pid-1-3o7f</guid>
      <description>&lt;p&gt;&lt;strong&gt;The Hook&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Early in my career, I witnessed something that changed how I think about containers forever. We were running MySQL on Kubernetes with Rocky Linux nodes. Everything seemed fine until nodes started dying one by one. The culprit? Zombie processes. Hundreds of them, silently accumulating until the node couldn't take it anymore.&lt;/p&gt;

&lt;p&gt;This incident taught me a fundamental truth: &lt;strong&gt;containers are not lightweight VMs. They're just processes.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What Exactly Are Zombie Processes?
&lt;/h2&gt;

&lt;p&gt;When a process finishes execution in Linux, it doesn't just disappear. It enters a "zombie" state: the process has completed, but its entry still exists in the process table.&lt;/p&gt;

&lt;p&gt;Why? Because the parent process needs to read the child's exit status using the &lt;code&gt;wait()&lt;/code&gt; system call. Until the parent calls &lt;code&gt;wait()&lt;/code&gt;, the child remains a zombie.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Parent Process
      |
      |--- fork() ---&amp;gt; Child Process
      |                     |
      |                     | (does work)
      |                     |
      |                     v
      |                 Exits (becomes zombie)
      |                     |
      |&amp;lt;--- wait() ---------+
      |
      v
   Zombie cleaned up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
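
&lt;p&gt;The lifecycle in the diagram can be reproduced in a few lines. This is a minimal sketch, assuming Linux with &lt;code&gt;/proc&lt;/code&gt; mounted and Python 3; the helper names &lt;code&gt;process_state&lt;/code&gt; and &lt;code&gt;demo_zombie&lt;/code&gt; are made up for illustration. The parent forks a child that exits immediately, observes it in state &lt;code&gt;Z&lt;/code&gt;, and only then reaps it.&lt;/p&gt;

```python
import os


def process_state(pid):
    """Return the one-letter state of pid from /proc/PID/stat (Linux only)."""
    with open(f"/proc/{pid}/stat", errors="replace") as f:
        data = f.read()
    # Format is "pid (comm) state ...": the state letter follows the last ")".
    return data[data.rfind(")") + 2]


def demo_zombie():
    """Fork a child, observe it as a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        # Child: exit right away with a recognisable status.
        os._exit(7)

    # Parent: block until the child has exited, but leave it unreaped.
    # WNOWAIT keeps the exit status in place, so the child stays a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state_before = process_state(pid)

    # Reaping: collect the exit status, which frees the process table entry.
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status)

    reaped = not os.path.exists(f"/proc/{pid}")
    return state_before, exit_code, reaped


if __name__ == "__main__":
    print(demo_zombie())
```

&lt;p&gt;The only thing separating a zombie from a fully cleaned-up process here is the &lt;code&gt;waitpid()&lt;/code&gt; call. If the parent never makes it, the entry stays.&lt;/p&gt;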



&lt;p&gt;In a normal Linux system, this isn't a big problem. If a parent dies without calling &lt;code&gt;wait()&lt;/code&gt;, the orphaned children get adopted by the &lt;code&gt;init&lt;/code&gt; process (PID 1). The init process calls &lt;code&gt;wait()&lt;/code&gt; on these adopted children when they exit, so zombies don't pile up.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Why Containers Break This Model&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here's where containers get tricky.&lt;/p&gt;

&lt;p&gt;When you run a container without an init process, your application becomes &lt;strong&gt;PID 1&lt;/strong&gt;. There is no traditional init process. Your app is now responsible for reaping zombie processes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; mysql:8.0&lt;/span&gt;
&lt;span class="c"&gt;# MySQL process becomes PID 1&lt;/span&gt;
&lt;span class="c"&gt;# It was never designed to be an init system&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
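
&lt;p&gt;Whether the application really is PID 1 can be checked from inside the container. A small sketch, assuming Linux with &lt;code&gt;/proc&lt;/code&gt; mounted; the function names are hypothetical helpers, not part of any library.&lt;/p&gt;

```python
import os


def pid1_command():
    """Return the command name of PID 1 in the current PID namespace (Linux only)."""
    with open("/proc/1/comm", errors="replace") as f:
        return f.read().strip()


def running_as_pid1():
    """True when this process is PID 1 and therefore inherits orphaned children."""
    return os.getpid() == 1


if __name__ == "__main__":
    print("PID 1 is:", pid1_command())
    print("this process is PID 1:", running_as_pid1())
```

&lt;p&gt;If the first line prints your application's name rather than an init such as &lt;code&gt;tini&lt;/code&gt;, the application is the one expected to reap orphans.&lt;/p&gt;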



&lt;p&gt;Most applications, including MySQL, are not designed to be init processes. They don't call &lt;code&gt;wait()&lt;/code&gt; on orphaned children that get reparented to them. So when those child processes die, they become zombies with no one to clean them up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The accumulation begins.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On the node, &lt;code&gt;ps aux | grep Z&lt;/code&gt; showed hundreds of zombie MySQL helper processes, each one dead but still holding onto its entry in the process table.&lt;/p&gt;

&lt;p&gt;Each zombie holds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An entry in the process table&lt;/li&gt;
&lt;li&gt;A PID (and PIDs are finite)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Eventually, you run out of PIDs or the process table fills up. New processes can't spawn. The node becomes unstable. Services crash.&lt;/p&gt;
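
&lt;p&gt;To watch the accumulation before it reaches that point, zombies can be counted directly from &lt;code&gt;/proc&lt;/code&gt; and compared against the PID limit. This is a rough sketch, assuming Linux; &lt;code&gt;parse_state&lt;/code&gt;, &lt;code&gt;count_zombies&lt;/code&gt;, and &lt;code&gt;pid_limit&lt;/code&gt; are illustrative names, and a cgroup PID limit (if one is set) may be lower than the kernel-wide value read here.&lt;/p&gt;

```python
import os


def parse_state(stat_text):
    """Extract the state letter from the contents of /proc/PID/stat."""
    # Format: "pid (comm) state ...". The command name may itself contain
    # spaces or parentheses, so search for the last ")" instead of splitting.
    return stat_text[stat_text.rfind(")") + 2]


def count_zombies(proc_root="/proc"):
    """Count processes currently in state Z under proc_root (Linux only)."""
    zombies = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                if parse_state(f.read()) == "Z":
                    zombies += 1
        except OSError:
            # The process went away between listdir() and open(); skip it.
            continue
    return zombies


def pid_limit():
    """Return the kernel-wide PID limit from /proc/sys/kernel/pid_max."""
    with open("/proc/sys/kernel/pid_max") as f:
        return int(f.read())


if __name__ == "__main__":
    print("zombies:", count_zombies(), "of pid_max:", pid_limit())
```

&lt;p&gt;Reading the state field is more precise than &lt;code&gt;grep Z&lt;/code&gt; on &lt;code&gt;ps&lt;/code&gt; output, which also matches any command line that happens to contain the letter Z.&lt;/p&gt;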




&lt;p&gt;&lt;strong&gt;The Fix: Tini&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The solution is surprisingly simple: use a proper init process designed for containers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/krallin/tini" rel="noopener noreferrer"&gt;Tini&lt;/a&gt; is a minimal init system built specifically for containers. It:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Runs as PID 1&lt;/li&gt;
&lt;li&gt;Spawns your application as a child process&lt;/li&gt;
&lt;li&gt;Forwards signals properly&lt;/li&gt;
&lt;li&gt;Reaps zombie processes by calling &lt;code&gt;wait()&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
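
&lt;p&gt;Those four steps are small enough to sketch. The code below is an illustration of the idea in Python, not Tini's actual implementation (Tini is written in C and handles more cases); &lt;code&gt;mini_init&lt;/code&gt;, &lt;code&gt;exit_code_from_status&lt;/code&gt;, and the choice of forwarded signals are assumptions made for the example.&lt;/p&gt;

```python
import os
import signal
import sys

# Signals passed on to the application (an illustrative subset).
FORWARDED = (signal.SIGTERM, signal.SIGINT, signal.SIGHUP)


def exit_code_from_status(status):
    """Map a wait() status to a shell-style exit code."""
    if os.WIFSIGNALED(status):
        return 128 + os.WTERMSIG(status)
    return os.WEXITSTATUS(status)


def mini_init(argv):
    """Run argv as a child, forward signals to it, reap children, return its exit code."""
    child_pid = os.fork()
    if child_pid == 0:
        try:
            os.execvp(argv[0], argv)  # replace the child with the application
        except OSError:
            os._exit(127)  # exec failed, e.g. command not found

    def forward(signum, _frame):
        # Hand the signal to the application instead of acting on it here.
        try:
            os.kill(child_pid, signum)
        except ProcessLookupError:
            pass  # the application has already exited

    previous = {sig: signal.signal(sig, forward) for sig in FORWARDED}
    try:
        while True:
            # os.wait() returns for ANY child, including orphans that were
            # reparented to us, so every zombie is reaped along the way.
            pid, status = os.wait()
            if pid == child_pid:
                return exit_code_from_status(status)
    finally:
        # Put the original handlers back so the caller is left unchanged.
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)


if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(mini_init(sys.argv[1:]))
```

&lt;p&gt;The key line is the bare &lt;code&gt;os.wait()&lt;/code&gt; in the loop: it collects every exited child, not just the application, which is exactly what an ordinary application running as PID 1 never does.&lt;/p&gt;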

&lt;p&gt;&lt;strong&gt;Implementation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1: Install in Dockerfile&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; mysql:8.0&lt;/span&gt;

&lt;span class="c"&gt;# Install tini (assumes a Debian-based image; other bases need their own package manager)&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; tini

&lt;span class="c"&gt;# Set tini as entrypoint, keeping the image's own entrypoint script behind it&lt;/span&gt;
&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["/usr/bin/tini", "--", "docker-entrypoint.sh"]&lt;/span&gt;

&lt;span class="c"&gt;# Your actual command&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["mysqld"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option 2: Use Docker's built-in init&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--init&lt;/span&gt; mysql:8.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option 3: Kubernetes&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mysql&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mysql:8.0&lt;/span&gt;
    &lt;span class="c1"&gt;# Note: For Kubernetes, you typically bake tini into the image&lt;/span&gt;
    &lt;span class="c1"&gt;# or use a base image that includes it&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Kubernetes, the safest pattern is to bake an init like &lt;code&gt;tini&lt;/code&gt; into the image, because relying on runtime flags is not portable across environments.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The Bigger Lesson&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This incident challenged my mental model of containers. I used to think of them as "lightweight VMs": isolated boxes running their own little world.&lt;/p&gt;

&lt;p&gt;The reality is different. A container is just a process with fancy isolation (namespaces, cgroups). It shares the kernel with the host. When that process misbehaves (by spawning zombies, consuming memory, or exhausting PIDs), the host suffers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Understanding this changes how you:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Debug container issues&lt;/li&gt;
&lt;li&gt;Design container images&lt;/li&gt;
&lt;li&gt;Think about resource limits and isolation&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Quick Reference&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;What Happens&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;App as PID 1, spawns children&lt;/td&gt;
&lt;td&gt;Zombies accumulate&lt;/td&gt;
&lt;td&gt;Use tini or --init&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;App crashes without signal handling&lt;/td&gt;
&lt;td&gt;Orphaned children become zombies&lt;/td&gt;
&lt;td&gt;Proper init + signal forwarding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Too many zombies&lt;/td&gt;
&lt;td&gt;PID exhaustion, node instability&lt;/td&gt;
&lt;td&gt;Prevention via init system&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Zombies are normal&lt;/strong&gt;: they only become a problem when not reaped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Containers don't have a traditional init&lt;/strong&gt; by default&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Your app shouldn't be PID 1&lt;/strong&gt; unless it's designed for it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tini/dumb-init are simple fixes&lt;/strong&gt; that should be standard practice&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Containers are processes&lt;/strong&gt;, not VMs. Never forget this.&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;em&gt;Have you faced similar container gotchas in production? I'd love to hear your war stories.&lt;/em&gt;&lt;/p&gt;




</description>
      <category>docker</category>
      <category>kubernetes</category>
      <category>linux</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
