<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jyothi Kumar</title>
    <description>The latest articles on DEV Community by Jyothi Kumar (@jyothi_kumar_e50d1adf42ce).</description>
    <link>https://dev.to/jyothi_kumar_e50d1adf42ce</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3935358%2Fbcf775fa-8b92-44d4-9bf0-969a8b3ac00c.png</url>
      <title>DEV Community: Jyothi Kumar</title>
      <link>https://dev.to/jyothi_kumar_e50d1adf42ce</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jyothi_kumar_e50d1adf42ce"/>
    <language>en</language>
    <item>
      <title>Kubernetes in Production: Deployments, Scaling, and Troubleshooting the Right Way</title>
      <dc:creator>Jyothi Kumar</dc:creator>
      <pubDate>Sat, 16 May 2026 18:53:32 +0000</pubDate>
      <link>https://dev.to/jyothi_kumar_e50d1adf42ce/kubernetes-in-production-3kpi</link>
      <guid>https://dev.to/jyothi_kumar_e50d1adf42ce/kubernetes-in-production-3kpi</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flaswzhm04o282h7kgjop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flaswzhm04o282h7kgjop.png" alt=" " width="800" height="730"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Kubernetes in Production: Deployments, Scaling, and Troubleshooting the Right Way
&lt;/h1&gt;

&lt;p&gt;So you've got Kubernetes running locally. Maybe you've even deployed a few services to a staging cluster. But production is a different beast — and most tutorials stop right before things get real.&lt;/p&gt;

&lt;p&gt;This article covers what actually matters when running Kubernetes in production: reliable deployments, smart scaling, and debugging when things go wrong (because they will).&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Deployments: Ship Safely Every Time
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Use Rolling Updates with Sensible Defaults
&lt;/h3&gt;

&lt;p&gt;Kubernetes uses the &lt;code&gt;RollingUpdate&lt;/code&gt; strategy by default, but the defaults (&lt;code&gt;maxSurge&lt;/code&gt; and &lt;code&gt;maxUnavailable&lt;/code&gt; are both 25%) aren't always production-safe. Set them explicitly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RollingUpdate&lt;/span&gt;
    &lt;span class="na"&gt;rollingUpdate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;maxSurge&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
      &lt;span class="na"&gt;maxUnavailable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;maxUnavailable: 0&lt;/code&gt; ensures no pod is terminated before a healthy replacement is running. This is the single most impactful change you can make to reduce deployment-related downtime.&lt;/p&gt;
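&lt;p&gt;Two related Deployment fields reinforce this (the values below are illustrative, not prescriptive): &lt;code&gt;minReadySeconds&lt;/code&gt; requires a new pod to stay Ready for a while before it counts as available, and &lt;code&gt;progressDeadlineSeconds&lt;/code&gt; marks a stalled rollout as failed instead of letting it hang forever:&lt;/p&gt;

```yaml
spec:
  minReadySeconds: 10           # a new pod must stay Ready 10s before counting as available
  progressDeadlineSeconds: 600  # flag the rollout as failed if it stalls for 10 minutes
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
```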

&lt;h3&gt;
  
  
  Set Readiness and Liveness Probes
&lt;/h3&gt;

&lt;p&gt;Without probes, Kubernetes assumes a pod is ready the moment it starts. That's almost never true.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/healthz&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
  &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;

&lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/healthz&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
  &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt;
  &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Readiness probe&lt;/strong&gt;: controls when traffic is sent to the pod&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Liveness probe&lt;/strong&gt;: restarts the pod if it's stuck or deadlocked&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you only implement one thing from this article, make it readiness probes.&lt;/p&gt;
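&lt;p&gt;One caveat: for slow-starting apps, a large &lt;code&gt;initialDelaySeconds&lt;/code&gt; on the liveness probe is a blunt tool. A &lt;code&gt;startupProbe&lt;/code&gt; (sketched below with illustrative values) holds off the other probes until the app has booted, then hands over:&lt;/p&gt;

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30  # allow up to 30 x 10s = 5 minutes to start
  periodSeconds: 10     # liveness/readiness checks begin once this probe succeeds
```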

&lt;h3&gt;
  
  
  Always Set Resource Requests and Limits
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;250m"&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;256Mi"&lt;/span&gt;
  &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;500m"&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;512Mi"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without requests, the scheduler can't make good placement decisions. Without limits, a single misbehaving pod can starve its neighbors. Both will cause you pain in production.&lt;/p&gt;
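&lt;p&gt;A detail worth knowing: when requests equal limits for every container in a pod, Kubernetes assigns it the &lt;code&gt;Guaranteed&lt;/code&gt; QoS class, and those pods are evicted last under node pressure — a reasonable choice for critical services:&lt;/p&gt;

```yaml
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "500m"      # requests == limits for all containers => Guaranteed QoS class
    memory: "512Mi"
```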




&lt;h2&gt;
  
  
  2. Scaling: Handle Traffic Without Drama
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Horizontal Pod Autoscaler (HPA)
&lt;/h3&gt;

&lt;p&gt;HPA scales your pods based on CPU, memory, or custom metrics.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;autoscaling/v2&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HorizontalPodAutoscaler&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app-hpa&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scaleTargetRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app&lt;/span&gt;
  &lt;span class="na"&gt;minReplicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;maxReplicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;metrics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Resource&lt;/span&gt;
      &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cpu&lt;/span&gt;
        &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Utilization&lt;/span&gt;
          &lt;span class="na"&gt;averageUtilization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;60&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few rules of thumb:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Never set &lt;code&gt;minReplicas: 1&lt;/code&gt;&lt;/strong&gt; for production workloads — you lose high availability&lt;/li&gt;
&lt;li&gt;Target &lt;strong&gt;60–70% CPU utilization&lt;/strong&gt;, not 80%+. You want headroom before the next scale event kicks in&lt;/li&gt;
&lt;li&gt;Give HPA time to stabilize — avoid tuning it based on a single traffic spike&lt;/li&gt;
&lt;/ul&gt;
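&lt;p&gt;"Give HPA time to stabilize" can also be encoded in the spec itself: &lt;code&gt;autoscaling/v2&lt;/code&gt; supports a &lt;code&gt;behavior&lt;/code&gt; section (window values below are illustrative) that scales out fast but scales in slowly:&lt;/p&gt;

```yaml
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0    # react to spikes immediately
  scaleDown:
    stabilizationWindowSeconds: 300  # require 5 minutes of sustained low load before scaling in
```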

&lt;h3&gt;
  
  
  Cluster Autoscaler
&lt;/h3&gt;

&lt;p&gt;HPA scales pods; Cluster Autoscaler scales nodes. Use both together.&lt;/p&gt;

&lt;p&gt;When HPA adds pods and there's no room on existing nodes, Cluster Autoscaler provisions new nodes automatically. When load drops, it removes underutilized nodes to cut costs.&lt;/p&gt;

&lt;p&gt;Key config tip: the &lt;code&gt;--scale-down-utilization-threshold&lt;/code&gt; flag (default 0.5) controls how underutilized a node must be before it's considered for removal. If scale-downs are disrupting workloads, lower it — a lower threshold makes node removal less aggressive.&lt;/p&gt;
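&lt;p&gt;If specific pods must never be evicted during scale-down (a long-running batch job, for example), Cluster Autoscaler honors a per-pod annotation:&lt;/p&gt;

```yaml
metadata:
  annotations:
    # Cluster Autoscaler will not remove the node this pod is running on
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
```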

&lt;h3&gt;
  
  
  Pod Disruption Budgets (PDBs)
&lt;/h3&gt;

&lt;p&gt;PDBs protect your app during node maintenance or autoscaling events:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;policy/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PodDisruptionBudget&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app-pdb&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;minAvailable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells Kubernetes: "Never voluntarily evict pods from this app if it would leave fewer than 2 running." Without a PDB, rolling node upgrades can silently take down your entire service. Note that PDBs only guard against voluntary disruptions like drains and upgrades — not node crashes.&lt;/p&gt;
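&lt;p&gt;&lt;code&gt;maxUnavailable&lt;/code&gt; is an alternative to &lt;code&gt;minAvailable&lt;/code&gt;, and either field accepts a percentage — handy when the replica count is managed by HPA and changes over time:&lt;/p&gt;

```yaml
spec:
  maxUnavailable: 25%  # at most a quarter of the pods may be voluntarily disrupted at once
  selector:
    matchLabels:
      app: my-app
```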




&lt;h2&gt;
  
  
  3. Troubleshooting: Debug Like a Pro
&lt;/h2&gt;

&lt;p&gt;Here's a systematic approach when something breaks in production.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1 — Check Pod Status
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get pods &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt;
kubectl describe pod &amp;lt;pod-name&amp;gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look at the &lt;code&gt;Events&lt;/code&gt; section at the bottom of &lt;code&gt;describe&lt;/code&gt; output first. It tells you exactly what Kubernetes tried to do and where it failed.&lt;/p&gt;

&lt;p&gt;Common states and what they mean:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Likely Cause&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CrashLoopBackOff&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;App is crashing on startup — check logs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pending&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No node can schedule the pod — check resource requests or taints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;OOMKilled&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Memory limit too low — increase limits or fix a memory leak&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ImagePullBackOff&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Wrong image name/tag or missing registry credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
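&lt;p&gt;For &lt;code&gt;OOMKilled&lt;/code&gt; in particular, the termination reason lives in the pod's status and can be pulled directly with a JSONPath query (the pod name and namespace here are hypothetical):&lt;/p&gt;

```shell
# Why did the container last terminate? Prints e.g. "OOMKilled"
kubectl get pod my-app-6d4cf56db6-x7k2p -n production \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```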

&lt;h3&gt;
  
  
  Step 2 — Read the Logs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Current logs&lt;/span&gt;
kubectl logs &amp;lt;pod-name&amp;gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt;

&lt;span class="c"&gt;# Previous container instance (if crashing)&lt;/span&gt;
kubectl logs &amp;lt;pod-name&amp;gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt; &lt;span class="nt"&gt;--previous&lt;/span&gt;

&lt;span class="c"&gt;# Follow live logs&lt;/span&gt;
kubectl logs &lt;span class="nt"&gt;-f&lt;/span&gt; &amp;lt;pod-name&amp;gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--previous&lt;/code&gt; flag is critical for &lt;code&gt;CrashLoopBackOff&lt;/code&gt; — it shows you logs from the crashed container, not the restarted one.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3 — Exec Into the Pod
&lt;/h3&gt;

&lt;p&gt;When logs aren't enough:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &amp;lt;pod-name&amp;gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt; &lt;span class="nt"&gt;--&lt;/span&gt; /bin/sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From inside the pod you can test DNS resolution, check environment variables, curl internal services, and verify file mounts — all in the actual runtime environment.&lt;/p&gt;
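&lt;p&gt;A typical in-pod checklist might look like this (service names and paths are illustrative, and minimal images may lack &lt;code&gt;nslookup&lt;/code&gt; or &lt;code&gt;wget&lt;/code&gt;):&lt;/p&gt;

```shell
nslookup my-db.production.svc.cluster.local  # does service DNS resolve?
env | grep -i db                             # are the expected env vars present?
wget -qO- http://localhost:8080/healthz      # does the app answer its own health check?
ls -l /etc/config                            # is the ConfigMap volume actually mounted?
```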

&lt;h3&gt;
  
  
  Step 4 — Check Events Cluster-Wide
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get events &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt; &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'.lastTimestamp'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is often overlooked but invaluable. Node pressure, failed mounts, scheduler failures — all show up here.&lt;/p&gt;
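&lt;p&gt;In a busy namespace, filtering to warnings cuts the noise considerably (namespace is illustrative):&lt;/p&gt;

```shell
# Show only Warning events, newest last
kubectl get events -n production --field-selector type=Warning \
  --sort-by='.lastTimestamp'
```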

&lt;h3&gt;
  
  
  Step 5 — Inspect Resource Pressure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl top nodes
kubectl top pods &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If nodes are under memory or CPU pressure, they'll start evicting pods. This can look like random pod restarts when the real problem is a noisy neighbor.&lt;/p&gt;
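&lt;p&gt;To confirm whether a node really is under pressure, check its conditions (node name is hypothetical):&lt;/p&gt;

```shell
# Look for MemoryPressure, DiskPressure, or PIDPressure set to True
kubectl describe node worker-node-1 | grep -A 8 'Conditions:'
```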




&lt;h2&gt;
  
  
  Quick Reference Checklist
&lt;/h2&gt;

&lt;p&gt;Before any production deployment, verify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Readiness and liveness probes are configured&lt;/li&gt;
&lt;li&gt;[ ] Resource requests and limits are set&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;maxUnavailable: 0&lt;/code&gt; in rolling update strategy&lt;/li&gt;
&lt;li&gt;[ ] HPA is configured with &lt;code&gt;minReplicas &amp;gt;= 2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Pod Disruption Budget exists for critical services&lt;/li&gt;
&lt;li&gt;[ ] Image tags are pinned (never use &lt;code&gt;:latest&lt;/code&gt; in production)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;Most Kubernetes outages aren't caused by Kubernetes itself — they're caused by missing probes, absent resource limits, or no disruption budgets. The cluster is doing exactly what it's configured to do. Production-readiness is about closing those gaps before traffic finds them for you.&lt;/p&gt;

&lt;p&gt;Got questions or war stories from your own clusters? Drop them in the comments.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>infrastructure</category>
      <category>kubernetes</category>
      <category>sre</category>
    </item>
  </channel>
</rss>
