<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Venkata BhumiReddy</title>
    <description>The latest articles on DEV Community by Venkata BhumiReddy (@venkata_bhumireddy).</description>
    <link>https://dev.to/venkata_bhumireddy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3830349%2Ffe392389-fb3a-4322-9f3d-361bee324311.png</url>
      <title>DEV Community: Venkata BhumiReddy</title>
      <link>https://dev.to/venkata_bhumireddy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/venkata_bhumireddy"/>
    <language>en</language>
    <item>
      <title>Kubernetes Probe Anti-Pattern: Stop Restarting Pods That Don't Need It</title>
      <dc:creator>Venkata BhumiReddy</dc:creator>
      <pubDate>Wed, 18 Mar 2026 04:17:03 +0000</pubDate>
      <link>https://dev.to/venkata_bhumireddy/kubernetes-probe-anti-pattern-stop-restarting-pods-that-dont-need-it-3khe</link>
      <guid>https://dev.to/venkata_bhumireddy/kubernetes-probe-anti-pattern-stop-restarting-pods-that-dont-need-it-3khe</guid>
      <description>&lt;p&gt;Have you ever watched your pod restart counter climb during a MongoDB re-election, an LDAP connection timeout, or some other external system failure, even though the JVM was perfectly fine? That's not a MongoDB or external system problem. That's a probe configuration problem, and it's one of the most common anti-patterns we see across Kubernetes deployments.&lt;/p&gt;

&lt;p&gt;This post walks through the problem, the live simulation we ran to prove it, and the exact fix using Spring Boot Actuator health groups.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Are Kubernetes Probes?
&lt;/h2&gt;

&lt;p&gt;Kubernetes uses three probe types to monitor container health:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Probe&lt;/th&gt;
&lt;th&gt;Question it answers&lt;/th&gt;
&lt;th&gt;Action on failure&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Liveness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Is the JVM alive and not deadlocked?&lt;/td&gt;
&lt;td&gt;Kills and &lt;strong&gt;restarts&lt;/strong&gt; the pod's container&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Readiness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Is the app ready to receive traffic?&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Removes&lt;/strong&gt; pod from Service endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Startup&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Has the app finished initializing?&lt;/td&gt;
&lt;td&gt;Kills and restarts the container if startup takes too long&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The difference between liveness and readiness is the most important thing to understand before you configure either one:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Liveness&lt;/strong&gt; says: &lt;em&gt;"this process is broken beyond self-repair — kill it."&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Readiness&lt;/strong&gt; says: &lt;em&gt;"this process isn't ready right now — don't send it traffic."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Anti-Pattern
&lt;/h2&gt;

&lt;p&gt;Here's what a lot of teams ship to production:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/actuator/health&lt;/span&gt;   &lt;span class="c1"&gt;# ❌ includes MongoDB, diskSpace, ALL deps&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
  &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;failureThreshold&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;

&lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/actuator/health&lt;/span&gt;   &lt;span class="c1"&gt;# ❌ same endpoint as liveness&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
  &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;failureThreshold&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both probes point to &lt;code&gt;/actuator/health&lt;/code&gt;. That endpoint aggregates &lt;strong&gt;everything&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"components"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"diskSpace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"livenessState"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mongo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mongoDB"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ping"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;           &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"readinessState"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The moment MongoDB goes down — even temporarily during a normal primary re-election — &lt;code&gt;/actuator/health&lt;/code&gt; returns &lt;code&gt;DOWN&lt;/code&gt;. The liveness probe fails. Kubernetes kills the pod.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The pod restart does nothing to fix MongoDB.&lt;/strong&gt; The JVM was healthy. You just killed a healthy process for no reason.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Failure Cascade
&lt;/h2&gt;

&lt;p&gt;Here's the timeline when this anti-pattern hits a MongoDB re-election:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;t=0s    MongoDB primary pod deleted (normal Kubernetes rolling update / failure)
t=2s    Spring Boot MongoDB driver loses connection
t=2s    /actuator/health → mongo: DOWN → overall: DOWN
t=5s    Liveness probe check #1 → FAIL
t=10s   Liveness probe check #2 → FAIL  ← failureThreshold: 2 reached
t=10s   Kubernetes KILLS the pod
t=40s   Pod still restarting ... MongoDB finishes re-election ✓
t=70s   Pod finally UP — but RESTARTS counter now shows 1, 2, 3...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If MongoDB stays down long enough, you get a &lt;strong&gt;restart loop&lt;/strong&gt;. The pod restarts repeatedly, failing health checks each time, never getting a chance to recover on its own.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Fix: Split Your Probes
&lt;/h2&gt;

&lt;p&gt;Spring Boot has shipped dedicated liveness and readiness health groups since 2.3. All you need to do is enable them and decide which health indicators each group includes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Spring Boot configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# application.yaml&lt;/span&gt;
&lt;span class="na"&gt;management&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;health&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;probes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;             &lt;span class="c1"&gt;# enables /liveness and /readiness&lt;/span&gt;
      &lt;span class="na"&gt;show-details&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;
      &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;liveness&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;livenessState&lt;/span&gt;        &lt;span class="c1"&gt;# ✅ JVM only&lt;/span&gt;
        &lt;span class="na"&gt;readiness&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo,readinessState&lt;/span&gt; &lt;span class="c1"&gt;# ✅ DB failure → remove from LB&lt;/span&gt;
  &lt;span class="na"&gt;endpoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;web&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;exposure&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;health,info&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Kubernetes deployment
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/actuator/health/liveness&lt;/span&gt;   &lt;span class="c1"&gt;# ✅ JVM only&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
  &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
  &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;failureThreshold&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;

&lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/actuator/health/readiness&lt;/span&gt;  &lt;span class="c1"&gt;# ✅ DB failure stops traffic, not pod&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
  &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
  &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;failureThreshold&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
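
&lt;p&gt;The manifest above relies on &lt;code&gt;initialDelaySeconds&lt;/code&gt; to give the JVM time to boot. The startup probe listed in the table at the top is another way to do that. Here is a minimal sketch, assuming the app normally starts within about 150 seconds; the threshold values are illustrative and not taken from the demo repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;startupProbe:
  httpGet:
    path: /actuator/health/liveness   # reuse the JVM-only group
    port: 8080
  periodSeconds: 5
  failureThreshold: 30                # assumption: 30 x 5s = up to 150s to start

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  periodSeconds: 5
  failureThreshold: 2                 # initialDelaySeconds dropped; startupProbe covers boot time
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Kubernetes holds back the liveness and readiness checks until the startup probe has succeeded once, so a slow boot no longer has to be covered by a fixed delay.&lt;/p&gt;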



&lt;h3&gt;
  
  
  What each endpoint returns now
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;GET /actuator/health/liveness&lt;/code&gt;&lt;/strong&gt; — MongoDB DOWN? &lt;strong&gt;Still &lt;code&gt;200 UP&lt;/code&gt;.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"components"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"livenessState"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;GET /actuator/health/readiness&lt;/code&gt;&lt;/strong&gt; — MongoDB DOWN? &lt;strong&gt;Returns &lt;code&gt;503 DOWN&lt;/code&gt;.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DOWN"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"components"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"readinessState"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mongo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DOWN"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now when MongoDB goes down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Readiness fails → pod is &lt;strong&gt;removed from the Service endpoints&lt;/strong&gt; (no traffic)&lt;/li&gt;
&lt;li&gt;Liveness stays UP → pod is &lt;strong&gt;never killed&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;MongoDB recovers in ~20-30s → readiness passes → pod &lt;strong&gt;automatically rejoins&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;RESTARTS counter: &lt;strong&gt;0&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
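
&lt;p&gt;How quickly readiness flips also depends on how long the MongoDB health check waits before giving up. The Java driver's default server selection timeout is 30 seconds, so shortening it in the connection URI lets the check fail fast instead of hanging. A sketch, assuming the URI is configured through &lt;code&gt;spring.data.mongodb.uri&lt;/code&gt;; the host and database names are placeholders, not the demo's values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# application.yaml
spring:
  data:
    mongodb:
      # hypothetical host and database; only serverSelectionTimeoutMS is the point here
      uri: mongodb://mongo.example.internal:27017/appdb?serverSelectionTimeoutMS=3000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a 3-second limit, the &lt;code&gt;mongo&lt;/code&gt; indicator reports &lt;code&gt;DOWN&lt;/code&gt; within a few seconds of losing the primary, and requests waiting on MongoDB fail quickly rather than tying up threads.&lt;/p&gt;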




&lt;h2&gt;
  
  
  Live Demo
&lt;/h2&gt;

&lt;p&gt;Check out the code from &lt;a href="https://github.com/codebhumi/app-kubernetes-probes" rel="noopener noreferrer"&gt;https://github.com/codebhumi/app-kubernetes-probes&lt;/a&gt; and follow the instructions in README.MD to compile and build the application.&lt;/p&gt;

&lt;p&gt;Now set this up on Docker Desktop Kubernetes with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MongoDB Community Operator (3-node replica set)&lt;/li&gt;
&lt;li&gt;Spring Boot 3.3 / Java 21&lt;/li&gt;
&lt;li&gt;Priority-weighted replica set so pod-0 is always the preferred primary&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Weighted priority — makes the demo reproducible
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;memberConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;votes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2"&lt;/span&gt;    &lt;span class="c1"&gt;# pod-0: always preferred primary&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;votes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;votes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you always know which pod to kill to trigger a re-election.&lt;/p&gt;

&lt;h3&gt;
  
  
  The kill command
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Confirm who is primary&lt;/span&gt;
kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; mongodb-replicaset-0 &lt;span class="nt"&gt;-n&lt;/span&gt; mongodb &lt;span class="nt"&gt;-c&lt;/span&gt; mongod &lt;span class="nt"&gt;--&lt;/span&gt; mongosh &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-u&lt;/span&gt; admin &lt;span class="nt"&gt;-p&lt;/span&gt; MyMongoExperiment &lt;span class="nt"&gt;--authenticationDatabase&lt;/span&gt; admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--eval&lt;/span&gt; &lt;span class="s1"&gt;'rs.status().members.forEach(m =&amp;gt; print(m.name, m.stateStr))'&lt;/span&gt;

&lt;span class="c"&gt;# Simulate full outage — kill all three pods&lt;/span&gt;
kubectl delete pod mongodb-replicaset-0 &lt;span class="se"&gt;\&lt;/span&gt;
                    mongodb-replicaset-1 &lt;span class="se"&gt;\&lt;/span&gt;
                    mongodb-replicaset-2 &lt;span class="nt"&gt;-n&lt;/span&gt; mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What to watch
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Terminal 1 — pod status + endpoints (your "load balancer view")&lt;/span&gt;
watch &lt;span class="nt"&gt;-n&lt;/span&gt; 2 &lt;span class="s1"&gt;'
echo "=== PODS ==="
kubectl get pods -n mongodb | grep app-kubernetes-probes
echo ""
echo "=== ENDPOINTS ==="
kubectl get endpoints app-kubernetes-probes-svc -n mongodb
'&lt;/span&gt;

&lt;span class="c"&gt;# Terminal 2 — health probe responses&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"--- &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%H:%M:%S&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; ---"&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"LIVENESS:  "&lt;/span&gt;
  curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; /dev/null &lt;span class="nt"&gt;-w&lt;/span&gt; &lt;span class="s2"&gt;"%{http_code}"&lt;/span&gt; http://localhost:30080/actuator/health/liveness
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"READINESS: "&lt;/span&gt;
  curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; /dev/null &lt;span class="nt"&gt;-w&lt;/span&gt; &lt;span class="s2"&gt;"%{http_code}"&lt;/span&gt; http://localhost:30080/actuator/health/readiness
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
  &lt;span class="nb"&gt;sleep &lt;/span&gt;3
&lt;span class="k"&gt;done&lt;/span&gt;

&lt;span class="c"&gt;# Terminal 3 — API traffic&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%H:%M:%S&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; /dev/null &lt;span class="nt"&gt;-w&lt;/span&gt; &lt;span class="s1"&gt;'%{http_code}'&lt;/span&gt; http://localhost:30080/api/products&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;sleep &lt;/span&gt;2
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Anti-pattern result
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PODS:
app-kubernetes-probes   0/1   Running   1    ← killed once
app-kubernetes-probes   0/1   Running   2    ← killed again
app-kubernetes-probes   1/1   Running   3    ← back but 3 restarts

ENDPOINTS:
app-kubernetes-probes-svc   10.1.0.15:8080   ← NEW IP (pod was killed)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Correct pattern result
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PODS:
app-kubernetes-probes   0/1   Running   0    ← removed from LB, NOT killed
app-kubernetes-probes   1/1   Running   0    ← rejoined, ZERO restarts

ENDPOINTS:
app-kubernetes-probes-svc   10.1.0.16:8080   ← SAME IP (pod survived!)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The same IP rejoining is the smoking gun.&lt;/strong&gt; It proves the pod was never killed — just temporarily removed from rotation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Before vs After
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Anti-pattern&lt;/th&gt;
&lt;th&gt;Correct pattern&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MongoDB goes down&lt;/td&gt;
&lt;td&gt;Liveness fails → &lt;strong&gt;pod killed&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Readiness fails → pod removed from LB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RESTARTS counter&lt;/td&gt;
&lt;td&gt;Climbs: 1, 2, 3...&lt;/td&gt;
&lt;td&gt;Stays at &lt;strong&gt;0&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recovery time&lt;/td&gt;
&lt;td&gt;60-90s (restart + initialDelay)&lt;/td&gt;
&lt;td&gt;20-30s (just re-election time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pod IP after recovery&lt;/td&gt;
&lt;td&gt;New IP — pod was killed&lt;/td&gt;
&lt;td&gt;Same IP — pod survived&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alert noise&lt;/td&gt;
&lt;td&gt;CrashLoopBackOff fires&lt;/td&gt;
&lt;td&gt;No alerts — expected transient state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Root cause addressed?&lt;/td&gt;
&lt;td&gt;No — restart doesn't fix MongoDB&lt;/td&gt;
&lt;td&gt;N/A — pod never restarted&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Rule of Thumb
&lt;/h2&gt;

&lt;p&gt;Put &lt;strong&gt;only &lt;code&gt;livenessState&lt;/code&gt;&lt;/strong&gt; in your liveness group. That's almost always sufficient. If the JVM is alive and not deadlocked, liveness should pass — regardless of what external dependencies are doing.&lt;/p&gt;

&lt;p&gt;Put external dependencies (&lt;code&gt;mongo&lt;/code&gt;, &lt;code&gt;redis&lt;/code&gt;, &lt;code&gt;db&lt;/code&gt;) in your &lt;strong&gt;readiness&lt;/strong&gt; group. Their failure means "I can't serve requests right now" — not "kill me."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Liveness  → am I broken?      → livenessState only
Readiness → am I ready?       → readinessState + all your dependencies
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
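
&lt;p&gt;As a sketch of that rule for an app with more than one dependency, assuming the Redis and JDBC starters are on the classpath so that the &lt;code&gt;redis&lt;/code&gt; and &lt;code&gt;db&lt;/code&gt; indicators exist (they are not part of the demo app):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# application.yaml
management:
  endpoint:
    health:
      probes:
        enabled: true
      group:
        liveness:
          include: livenessState                   # JVM state only
        readiness:
          include: readinessState,mongo,redis,db   # assumption: all three indicators are registered
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;List only indicators your application actually registers. Recent Spring Boot versions can reject a health group at startup if it names an indicator that does not exist.&lt;/p&gt;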






&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Both probes pointing to &lt;code&gt;/actuator/health&lt;/code&gt; is an anti-pattern&lt;/li&gt;
&lt;li&gt;When MongoDB goes down, liveness fails and the pod gets killed unnecessarily&lt;/li&gt;
&lt;li&gt;Set &lt;code&gt;management.endpoint.health.probes.enabled: true&lt;/code&gt; in Spring Boot&lt;/li&gt;
&lt;li&gt;Configure &lt;code&gt;group.liveness.include: livenessState&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Configure &lt;code&gt;group.readiness.include: mongo,readinessState&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Switch probe paths to &lt;code&gt;/actuator/health/liveness&lt;/code&gt; and &lt;code&gt;/actuator/health/readiness&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Add &lt;code&gt;serverSelectionTimeoutMS=3000&lt;/code&gt; to your MongoDB URI so health checks fail fast instead of waiting out the driver's 30-second default&lt;/li&gt;
&lt;li&gt;Watch your RESTARTS counter drop to zero&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/" rel="noopener noreferrer"&gt;Kubernetes Probe Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.spring.io/spring-boot/reference/actuator/endpoints.html#actuator.endpoints.kubernetes-probes" rel="noopener noreferrer"&gt;Spring Boot — Kubernetes Probes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.spring.io/spring-boot/reference/actuator/endpoints.html#actuator.endpoints.health.groups" rel="noopener noreferrer"&gt;Spring Boot — Health Groups&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mongodb/mongodb-kubernetes-operator" rel="noopener noreferrer"&gt;MongoDB Community Operator&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learnk8s.io/production-best-practices" rel="noopener noreferrer"&gt;Kubernetes Production Best Practices&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Demonstrated on Docker Desktop Kubernetes with MongoDB Community Operator, Spring Boot 3.3, Java 21. Production target: OpenShift.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>springboot</category>
      <category>resilience</category>
      <category>bestpractices</category>
    </item>
  </channel>
</rss>
