<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lee Yao</title>
    <description>The latest articles on DEV Community by Lee Yao (@lee_yao_cfeb14fb9b141b8c5).</description>
    <link>https://dev.to/lee_yao_cfeb14fb9b141b8c5</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3917027%2F9dba3dda-178a-479e-a596-411d2f08f71d.jpg</url>
      <title>DEV Community: Lee Yao</title>
      <link>https://dev.to/lee_yao_cfeb14fb9b141b8c5</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lee_yao_cfeb14fb9b141b8c5"/>
    <language>en</language>
    <item>
      <title>Why My Spark Container Keeps Exiting — Docker PID 1 and the Daemon Trap</title>
      <dc:creator>Lee Yao</dc:creator>
      <pubDate>Thu, 07 May 2026 04:41:08 +0000</pubDate>
      <link>https://dev.to/lee_yao_cfeb14fb9b141b8c5/why-my-spark-container-keeps-exiting-docker-pid-1-and-the-daemon-trap-dgf</link>
      <guid>https://dev.to/lee_yao_cfeb14fb9b141b8c5/why-my-spark-container-keeps-exiting-docker-pid-1-and-the-daemon-trap-dgf</guid>
      <description>&lt;p&gt;I spent an embarrassing amount of time staring at my terminal, watching Spark containers start and immediately die. Three different attempts, three different failure modes, all in the same afternoon. If you're setting up Spark inside Docker and your container just... vanishes, this post is for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;I'm building a CMS Medicare streaming pipeline — pulling hospital charge data from the CMS public API, pushing it through Kafka, processing it with Spark Structured Streaming, and landing the results in Snowflake. The whole stack runs in Docker Compose. Kafka and ZooKeeper came up without a hitch. Spark did not.&lt;/p&gt;

&lt;p&gt;Here's what my &lt;code&gt;docker-compose.yml&lt;/code&gt; looked like at the start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;zookeeper&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;confluentinc/cp-zookeeper:7.4.0&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;ZOOKEEPER_CLIENT_PORT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2181&lt;/span&gt;

  &lt;span class="na"&gt;kafka&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;confluentinc/cp-kafka:7.4.0&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;zookeeper&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;9092:9092"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;KAFKA_ZOOKEEPER_CONNECT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;zookeeper:2181&lt;/span&gt;
      &lt;span class="na"&gt;KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;

  &lt;span class="na"&gt;spark&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bitnami/spark:3.5&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;kafka&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;SPARK_MODE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;master&lt;/span&gt;

  &lt;span class="na"&gt;spark-worker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bitnami/spark:3.5&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;spark&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;SPARK_MODE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;worker&lt;/span&gt;
      &lt;span class="na"&gt;SPARK_MASTER_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark://spark:7077&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looked reasonable enough. It wasn't.&lt;/p&gt;




&lt;h2&gt;
  
  
  Attempt 1 — The Image That No Longer Exists
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error response from daemon: failed to resolve reference
"docker.io/bitnami/spark:3.5": not found
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;bitnami/spark:3.5&lt;/code&gt; had been removed from Docker Hub. I tried &lt;code&gt;3.5.3&lt;/code&gt;. Gone. Tried &lt;code&gt;bitnami/spark:3&lt;/code&gt;. Also gone. The entire Bitnami Spark image line had vanished with no notice.&lt;/p&gt;

&lt;p&gt;This is the first thing worth remembering before we even get to the real problem: &lt;strong&gt;third-party images on Docker Hub can disappear at any time.&lt;/strong&gt; There is no deprecation warning, no migration guide. For anything that needs to be reproducible, you either pin to a verified digest or mirror the image in a private registry.&lt;/p&gt;
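&lt;p&gt;Pinning by digest means Compose references the image's immutable content hash instead of a mutable tag. A sketch (the digest below is a placeholder, not a real one):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;spark:
  # Placeholder digest. Look up the real one with:
  #   docker inspect --format '{{index .RepoDigests 0}}' apache/spark:3.5.1-python3
  image: apache/spark@sha256:0000000000000000000000000000000000000000000000000000000000000000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A digest-pinned reference keeps working even if the tag is deleted or repointed, as long as the underlying blobs still exist in the registry.&lt;/p&gt;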

&lt;p&gt;I switched to the Apache official image: &lt;code&gt;apache/spark:3.5.1-python3&lt;/code&gt;. That one pulled fine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Attempt 2 — Wrong Environment Variables
&lt;/h2&gt;

&lt;p&gt;I updated the image name but kept the same environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spark&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apache/spark:3.5.1-python3&lt;/span&gt;
  &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;SPARK_MODE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;master&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;docker-compose up -d&lt;/code&gt; reported all containers as "Started." But &lt;code&gt;docker ps&lt;/code&gt; only showed two running — Kafka and ZooKeeper. The Spark containers had already exited.&lt;/p&gt;

&lt;p&gt;The problem: &lt;strong&gt;&lt;code&gt;SPARK_MODE&lt;/code&gt; is a Bitnami-specific environment variable.&lt;/strong&gt; The Apache official image has never heard of it.&lt;/p&gt;

&lt;p&gt;Bitnami's image ships with a custom entrypoint script that reads &lt;code&gt;SPARK_MODE&lt;/code&gt; and decides whether to launch a master or worker. It's a convenience layer Bitnami built on top of vanilla Spark. The Apache official image has none of this. Its default entrypoint (&lt;code&gt;/opt/entrypoint.sh&lt;/code&gt;) simply executes whatever command you pass in. If you don't pass a meaningful command, it finishes and exits.&lt;/p&gt;

&lt;p&gt;The lesson: switching between images from different publishers is not just swapping the &lt;code&gt;image:&lt;/code&gt; field. Different publishers package the same software with different entrypoints, different environment variables, and different directory layouts. Before you can use an image correctly, you need to understand how &lt;em&gt;that specific image&lt;/em&gt; expects to be started.&lt;/p&gt;




&lt;h2&gt;
  
  
  Attempt 3 — The Real Trap: &lt;code&gt;start-master.sh&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Spark comes bundled with &lt;code&gt;start-master.sh&lt;/code&gt;. That seems like the right tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spark&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apache/spark:3.5.1-python3&lt;/span&gt;
  &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/opt/spark/sbin/start-master.sh&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same result. "Started." No Spark container.&lt;/p&gt;

&lt;p&gt;The container was starting. Spark Master was launching. And then everything was shutting down within a fraction of a second. To understand why, you need to know one foundational Docker rule.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Rule: Docker Containers Live and Die with PID 1
&lt;/h2&gt;

&lt;p&gt;Every container has a main process — specified by &lt;code&gt;CMD&lt;/code&gt;, &lt;code&gt;ENTRYPOINT&lt;/code&gt;, or &lt;code&gt;command&lt;/code&gt; in your Compose file. Inside the container, this process gets &lt;strong&gt;PID 1&lt;/strong&gt;. When PID 1 exits, the container exits. No exceptions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PID 1 is running  →  container is running
PID 1 exits       →  container exits immediately
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now look at what &lt;code&gt;start-master.sh&lt;/code&gt; actually does internally (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;nohup &lt;/span&gt;java &lt;span class="nt"&gt;-cp&lt;/span&gt; &lt;span class="nv"&gt;$SPARK_CLASSPATH&lt;/span&gt; org.apache.spark.deploy.master.Master &amp;amp;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Master started."&lt;/span&gt;
&lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See that &lt;code&gt;&amp;amp;&lt;/code&gt;? It puts the Spark Master process into the &lt;strong&gt;background&lt;/strong&gt;. The shell script (PID 1) spawns a child Java process, prints a message, and calls &lt;code&gt;exit 0&lt;/code&gt;. The moment it does that, Docker kills the container and everything inside it — including the Spark Master that just started.&lt;/p&gt;

&lt;p&gt;Here's the exact timeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;t=0.0s  Container starts; PID 1 = start-master.sh (bash)
t=0.1s  Bash forks a Java process (Spark Master) into the background
t=0.2s  Bash script reaches exit 0 → PID 1 terminates
t=0.2s  Docker detects PID 1 exit → tears down the container
t=0.2s  The background Java process is killed along with it
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Spark Master was alive for about 0.2 seconds.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;start-master.sh&lt;/code&gt; was written for bare-metal servers and VMs, where you start a background daemon and the OS keeps it alive after the startup script exits. Docker doesn't work that way. Docker is watching PID 1 and only PID 1.&lt;/p&gt;
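&lt;p&gt;You can reproduce the trap with plain bash, no Docker required. This is a hypothetical demo, not a real Spark script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Mimic start-master.sh: fork the "real work" into the background, then exit.
bash -c 'sleep 5 &amp;amp; echo "Master started."; exit 0'
# At a terminal this prints "Master started." and returns with status 0,
# long before the 5-second child finishes. If this were PID 1 in a
# container, that early exit would take the child down with it.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;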




&lt;h2&gt;
  
  
  Why Kafka and ZooKeeper Didn't Have This Problem
&lt;/h2&gt;

&lt;p&gt;Confluent's images use &lt;code&gt;exec&lt;/code&gt; in their entrypoints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;exec &lt;/span&gt;kafka-server-start /etc/kafka/server.properties
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In bash, &lt;code&gt;exec&lt;/code&gt; &lt;strong&gt;replaces the current process&lt;/strong&gt; with the specified command. The shell doesn't fork a child — it &lt;em&gt;becomes&lt;/em&gt; Kafka. Kafka inherits PID 1, runs in the foreground, and blocks indefinitely.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Image&lt;/th&gt;
&lt;th&gt;What PID 1 Does&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cp-kafka&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;exec kafka-server-start&lt;/code&gt; (foreground, blocking)&lt;/td&gt;
&lt;td&gt;✅ Container stays alive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cp-zookeeper&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;exec zookeeper-server-start&lt;/code&gt; (foreground, blocking)&lt;/td&gt;
&lt;td&gt;✅ Container stays alive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;apache/spark&lt;/code&gt; + &lt;code&gt;start-master.sh&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Forks Java to background with &lt;code&gt;&amp;amp;&lt;/code&gt;, script exits&lt;/td&gt;
&lt;td&gt;❌ Container exits immediately&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The entire difference: &lt;code&gt;&amp;amp;&lt;/code&gt; versus &lt;code&gt;exec&lt;/code&gt;.&lt;/p&gt;
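&lt;p&gt;You can see the difference with &lt;code&gt;$$&lt;/code&gt;, the shell's own PID. After &lt;code&gt;exec&lt;/code&gt;, the new command reports the same PID, because no new process was created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# The outer bash prints its PID, then execs an sh that prints its own.
# Both lines show the same number: exec replaced the process in place.
bash -c 'echo "before exec: $$"; exec sh -c "echo after exec: \$\$"'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;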




&lt;h2&gt;
  
  
  Four Ways to Fix It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Fix A: &lt;code&gt;tail -f /dev/null&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spark&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apache/spark:3.5.1-python3&lt;/span&gt;
  &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tail"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-f"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/dev/null"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./spark-apps:/opt/spark-apps&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;tail -f /dev/null&lt;/code&gt; watches a file that never gets new content. PID 1 blocks forever. Submit jobs via &lt;code&gt;docker exec&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec &lt;/span&gt;my-spark-container &lt;span class="se"&gt;\&lt;/span&gt;
  /opt/spark/bin/spark-submit &lt;span class="se"&gt;\&lt;/span&gt;
  /opt/spark-apps/my_job.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; local development, one-off job submission.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fix B: Run the Spark Master Class Directly
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="s"&gt;bash -c "&lt;/span&gt;
  &lt;span class="s"&gt;/opt/spark/bin/spark-class org.apache.spark.deploy.master.Master&lt;/span&gt;
  &lt;span class="s"&gt;--host spark --port 7077 --webui-port 8080&lt;/span&gt;
  &lt;span class="s"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Skips the wrapper script entirely. The Master process runs in the foreground as PID 1.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; when you actually need a running Master/Worker cluster.&lt;/p&gt;
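&lt;p&gt;The worker side looks analogous. A sketch, assuming the same image and the master address used earlier in the post:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;spark-worker:
  image: apache/spark:3.5.1-python3
  depends_on: [spark]
  # Worker class runs in the foreground as PID 1, pointed at the master.
  command: &amp;gt;
    bash -c "
    /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker
    spark://spark:7077
    "
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;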

&lt;h3&gt;
  
  
  Fix C: Custom Entrypoint Script
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# custom-entrypoint.sh&lt;/span&gt;
/opt/spark/sbin/start-master.sh   &lt;span class="c"&gt;# starts daemon in background&lt;/span&gt;
&lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /opt/spark/logs/&lt;span class="k"&gt;*&lt;/span&gt;         &lt;span class="c"&gt;# blocks + streams logs to stdout&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./custom-entrypoint.sh:/opt/custom-entrypoint.sh&lt;/span&gt;
&lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bash /opt/custom-entrypoint.sh&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Master auto-starts, container stays alive, and you get log output via &lt;code&gt;docker logs&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; when you want Spark to auto-start and want logs accessible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fix D: Use a Docker-Friendly Image
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;jupyter/pyspark-notebook&lt;/code&gt; handles all of this correctly out of the box. Its entrypoint is built around &lt;code&gt;exec&lt;/code&gt; from the start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; quick prototyping. Tradeoff: you depend on a third party to keep the image available.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Docker containers exit when PID 1 exits. Always.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;start-master.sh&lt;/code&gt; backgrounds Spark with &lt;code&gt;&amp;amp;&lt;/code&gt; and exits — which kills the container.&lt;/li&gt;
&lt;li&gt;Confluent's images use &lt;code&gt;exec&lt;/code&gt;, making the service itself PID 1 and keeping the container alive.&lt;/li&gt;
&lt;li&gt;The fix: ensure PID 1 is a foreground process that never returns.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Three patterns to spot in any startup script:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;command &amp;amp;&lt;/code&gt; — background execution, PID 1 exits shortly after → container dies&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;exec command&lt;/code&gt; — replaces PID 1, container lives as long as the process does → container survives&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nohup command &amp;amp;&lt;/code&gt; — classic daemon pattern, same problem as &lt;code&gt;&amp;amp;&lt;/code&gt; in Docker → container dies&lt;/li&gt;
&lt;/ul&gt;
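&lt;p&gt;A quick way to audit any startup script for these patterns. It's a rough grep heuristic, not a parser, and &lt;code&gt;start-script.sh&lt;/code&gt; is a stand-in for whatever script you're inspecting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Daemon patterns: a trailing &amp;amp; or nohup. Any hit is a red flag for PID 1.
grep -nE '&amp;amp;[[:space:]]*$|nohup ' start-script.sh
# The healthy pattern: an exec that hands PID 1 to the real service.
grep -n 'exec ' start-script.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;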

&lt;p&gt;Docker containers are not VMs. On a VM, daemonizing a process and exiting the startup script is completely normal. In Docker, the startup script &lt;em&gt;is&lt;/em&gt; the container. Once you internalize that, most "why does my container keep exiting" questions answer themselves.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>spark</category>
      <category>dataengineering</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
