<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yutaro Yamanaka</title>
    <description>The latest articles on DEV Community by Yutaro Yamanaka (@yutaroyamanaka).</description>
    <link>https://dev.to/yutaroyamanaka</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1016493%2F8614e458-bc3a-43ca-a2f8-44b59624dbee.png</url>
      <title>DEV Community: Yutaro Yamanaka</title>
      <link>https://dev.to/yutaroyamanaka</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yutaroyamanaka"/>
    <language>en</language>
    <item>
      <title>Understand how graceful shutdown can achieve zero downtime during k8s rolling update</title>
      <dc:creator>Yutaro Yamanaka</dc:creator>
      <pubDate>Sun, 29 Jan 2023 08:10:20 +0000</pubDate>
      <link>https://dev.to/yutaroyamanaka/understand-how-graceful-shutdown-can-achieve-zero-downtime-during-k8s-rolling-update-15eh</link>
      <guid>https://dev.to/yutaroyamanaka/understand-how-graceful-shutdown-can-achieve-zero-downtime-during-k8s-rolling-update-15eh</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;TL;DR If we don't prepare graceful shutdown for our application running on k8s, it can return 502 error (Bad Gateway) momentarily during the rolling update.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;First, I'll briefly explain how old pods will be terminated after the rolling update starts. Then I'll show the simple graceful shutdown implementation that helped my Go application have zero downtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What happens in pod termination?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;According to &lt;a href="https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-lifetime" rel="noopener noreferrer"&gt;official documentation&lt;/a&gt;, the following two steps will run asynchronously;&lt;/p&gt;

&lt;p&gt;Step1. Run preStop hook if it is defined on the manifest file. After this, send SIGTERM to terminate a process in each container of the shutting-down pod.&lt;/p&gt;

&lt;p&gt;Step2. Detach shutting-down pods from their associated service. The service doesn't route requests to those pods anymore.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9y5yhs1dbgp5qc94g0hr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9y5yhs1dbgp5qc94g0hr.png" alt="Image description" width="800" height="667"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If we don't let the application sleep for a few seconds by preStop hook or handle SIGTERM appropriately, Step1 can finish earlier than Step2. And if there are some requests before Step2 ends, the service might route those requests to terminated pods and return 502 error. Therefore, rolling update can cause a short downtime until all coming requests are routed to new pods.&lt;/p&gt;

&lt;p&gt;Let's understand this more with two experiments.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Experiment&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Setup&lt;/strong&gt;&lt;br&gt;
Actually, running sleep command as a preStop hook is the easiest way to realize graceful shutdown. However, if our application is running on the lightweight container such as alpine, we can't set that command because shell is unavailable on such a container. &lt;/p&gt;

&lt;p&gt;The plan B is to handle SIGTERM in the application-code level.&lt;/p&gt;

&lt;p&gt;Here is my application code in Go.&lt;br&gt;
package main&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
 &lt;span class="s"&gt;"context"&lt;/span&gt;
 &lt;span class="s"&gt;"flag"&lt;/span&gt;
 &lt;span class="s"&gt;"log"&lt;/span&gt;
 &lt;span class="s"&gt;"net/http"&lt;/span&gt;
 &lt;span class="s"&gt;"os"&lt;/span&gt;
 &lt;span class="s"&gt;"os/signal"&lt;/span&gt;
 &lt;span class="s"&gt;"syscall"&lt;/span&gt;
 &lt;span class="s"&gt;"time"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;
 &lt;span class="n"&gt;flag&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DurationVar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"shutdown.delay"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"duration until shutdown starts"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;flag&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

 &lt;span class="n"&gt;srv&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;Addr&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;":8080"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandlerFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hello world"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stop&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NotifyContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Interrupt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;syscall&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SIGTERM&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
 &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Server is running"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;srv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ListenAndServe&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ErrServerClosed&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;}()&lt;/span&gt;

 &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
   &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;srv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Shutdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The server starts its shutdown in the seconds specified by shutdown.delay flag.&lt;br&gt;
I created a local k8s cluster by minikube and used &lt;a href="https://github.com/tsenart/vegeta" rel="noopener noreferrer"&gt;vegeta&lt;/a&gt; for sending HTTP requests to my application. You can check k8s manifest files and Dockerfile on &lt;a href="https://gist.github.com/yutaroyamanaka/3b77f028bfec131b86f2207714d7306c" rel="noopener noreferrer"&gt;Gist&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Experiment without graceful shutdown&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let's start with the first experiment which doesn't have graceful shutdown.&lt;br&gt;
In this case, we can set 0s as shutdown.delay.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# deployment.yaml&lt;/span&gt;
&lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;graceful-shudown&lt;/span&gt;
        &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--shutdown.delay=0s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# rolling update starts in 30 seconds&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sleep &lt;/span&gt;30&lt;span class="p"&gt;;&lt;/span&gt; kubectl rollout restart deployment graceful-shutdown
&lt;span class="c"&gt;# execute vegeta command on a different tab&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"GET http://graceful.shutdown.test"&lt;/span&gt; | vegeta attack &lt;span class="nt"&gt;-duration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60s &lt;span class="nt"&gt;-rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000 | &lt;span class="nb"&gt;tee &lt;/span&gt;results.bin | vegeta report
Requests      &lt;span class="o"&gt;[&lt;/span&gt;total, rate, throughput]  60000, 1000.02, 996.32
Duration      &lt;span class="o"&gt;[&lt;/span&gt;total, attack, &lt;span class="nb"&gt;wait&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;      59.999783354s, 59.999059582s, 723.772µs
Latencies     &lt;span class="o"&gt;[&lt;/span&gt;mean, 50, 95, 99, max]    136.958326ms, 553.588µs, 10.9967ms, 5.001062432s, 5.089183568s
Bytes In      &lt;span class="o"&gt;[&lt;/span&gt;total, mean]              690719, 11.51
Bytes Out     &lt;span class="o"&gt;[&lt;/span&gt;total, mean]              0, 0.00
Success       &lt;span class="o"&gt;[&lt;/span&gt;ratio]                    99.63%
Status Codes  &lt;span class="o"&gt;[&lt;/span&gt;code:count]               200:59779  502:221
Error Set:
502 Bad Gateway
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I sent requests for 60 seconds and started rolling update in 30 seconds. As we can see, some 502 responses were returned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Experiment with graceful shutdown&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For this experiment, I set 5s as shutdown.delay and kept other settings the same as the previous experiment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# deployment.yaml&lt;/span&gt;
&lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;graceful-shudown&lt;/span&gt;
        &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--shutdown.delay=5s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# rolling update starts in 30 seconds&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sleep &lt;/span&gt;30&lt;span class="p"&gt;;&lt;/span&gt; kubectl rollout restart deployment graceful-shutdown
&lt;span class="c"&gt;# execute vegeta command on a different tab&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"GET http://graceful.shutdown.test"&lt;/span&gt; | vegeta attack &lt;span class="nt"&gt;-duration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60s &lt;span class="nt"&gt;-rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000 | &lt;span class="nb"&gt;tee &lt;/span&gt;results.bin | vegeta report
Requests      &lt;span class="o"&gt;[&lt;/span&gt;total, rate, throughput]  60000, 1000.02, 1000.00
Duration      &lt;span class="o"&gt;[&lt;/span&gt;total, attack, &lt;span class="nb"&gt;wait&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;      59.999790006s, 59.999058824s, 731.182µs
Latencies     &lt;span class="o"&gt;[&lt;/span&gt;mean, 50, 95, 99, max]    1.662431ms, 512.264µs, 3.372343ms, 26.208994ms, 178.154272ms
Bytes In      &lt;span class="o"&gt;[&lt;/span&gt;total, mean]              660000, 11.00
Bytes Out     &lt;span class="o"&gt;[&lt;/span&gt;total, mean]              0, 0.00
Success       &lt;span class="o"&gt;[&lt;/span&gt;ratio]                    100.00%
Status Codes  &lt;span class="o"&gt;[&lt;/span&gt;code:count]               200:60000
Error Set:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This time, all responses had 200 status.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;To avoid downtime during rolling update, we have to implement graceful shutdown by some methods such as a preStop hook or signal handling before the server starts shutdown.&lt;/p&gt;

</description>
      <category>cryptocurrency</category>
      <category>crypto</category>
      <category>offers</category>
      <category>web3</category>
    </item>
  </channel>
</rss>
