Understand how graceful shutdown can achieve zero downtime during k8s rolling update

Yutaro Yamanaka — Sun, 29 Jan 2023 08:10:20 +0000

TL;DR: If an application running on k8s doesn't shut down gracefully, clients can briefly receive 502 (Bad Gateway) errors during a rolling update.

First, I'll briefly explain how old pods are terminated once a rolling update starts. Then I'll show the simple graceful shutdown implementation that gave my Go application zero downtime.

What happens in pod termination?

According to the official documentation, the following two steps run asynchronously, that is, independently of each other:

Step 1. Run the preStop hook if one is defined in the manifest file. After that, send SIGTERM to the process in each container of the terminating pod.

Step 2. Detach the terminating pod from its associated service, so that the service no longer routes requests to it.

If the application neither sleeps for a few seconds in a preStop hook nor handles SIGTERM appropriately, Step 1 can finish before Step 2. Until Step 2 completes, the service may still route incoming requests to pods whose processes have already exited, and those requests fail with a 502 error. A rolling update can therefore cause a short period of downtime, which lasts until all incoming requests are routed to the new pods.

Let's look at this more closely with two experiments.

Experiment

Setup
The easiest way to get graceful shutdown is to run the sleep command as a preStop hook. However, if the application runs in a minimal container image, such as one built from scratch or a distroless image, that command can't be used because the image contains neither a shell nor a sleep binary.

Plan B is to handle SIGTERM in the application code.

Here is my application code in Go.
package main

import (
 "context"
 "flag"
 "log"
 "net/http"
 "os"
 "os/signal"
 "syscall"
 "time"
)

func main() {
 var t time.Duration
 flag.DurationVar(&t, "shutdown.delay", 0, "duration until shutdown starts")
 flag.Parse()

 srv := http.Server{
  Addr: ":8080",
  Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
   w.Write([]byte("hello world"))
  }),
 }

 ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
 defer stop()
 go func() {
  log.Println("Server is running")
  if err := srv.ListenAndServe(); err != http.ErrServerClosed {
   log.Fatal(err)
  }
 }()

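 // Block until SIGINT or SIGTERM cancels ctx.
 <-ctx.Done()
 // Keep serving for the configured delay so the pod can be detached first.
 time.Sleep(t)
 // ctx is already canceled, so Shutdown needs a fresh context to be able to
 // wait for in-flight requests. The 10s limit is an assumed value.
 shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
 defer cancel()
 if err := srv.Shutdown(shutdownCtx); err != nil {
  log.Printf("shutdown: %v", err)
 }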
}

After receiving SIGTERM, the server waits for the duration given by the shutdown.delay flag and then starts shutting down.
I created a local k8s cluster with minikube and used vegeta to send HTTP requests to my application. You can check the k8s manifest files and the Dockerfile on Gist.

Experiment without graceful shutdown

Let's start with the first experiment, which has no graceful shutdown.
In this case, we set shutdown.delay to 0s.

# deployment.yaml
template:
    spec:
      containers:
      - name: graceful-shutdown
        args:
        - --shutdown.delay=0s

# rolling update starts in 30 seconds
$ sleep 30; kubectl rollout restart deployment graceful-shutdown
# execute vegeta command on a different tab
$ echo "GET http://graceful.shutdown.test" | vegeta attack -duration=60s -rate=1000 | tee results.bin | vegeta report
Requests      [total, rate, throughput]  60000, 1000.02, 996.32
Duration      [total, attack, wait]      59.999783354s, 59.999059582s, 723.772µs
Latencies     [mean, 50, 95, 99, max]    136.958326ms, 553.588µs, 10.9967ms, 5.001062432s, 5.089183568s
Bytes In      [total, mean]              690719, 11.51
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    99.63%
Status Codes  [code:count]               200:59779  502:221
Error Set:
502 Bad Gateway

I sent requests for 60 seconds and started the rolling update 30 seconds in. As we can see, 221 of the 60,000 requests received a 502 response.
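
The success ratio in the report follows directly from the status code counts. Here is a minimal Go sketch of that arithmetic, using the counts from the report above; the function name successRatio is made up for the example.

```go
package main

import "fmt"

// successRatio returns the percentage of requests that received a 200.
func successRatio(ok, total int) float64 {
	return float64(ok) / float64(total) * 100
}

func main() {
	ok, failed := 59779, 221 // status code counts from the vegeta report
	total := ok + failed

	fmt.Printf("%d requests, %.2f%% success\n", total, successRatio(ok, total))
	// prints "60000 requests, 99.63% success"
}
```

At 1000 requests per second, 221 failures correspond to roughly 0.22 seconds' worth of traffic, although the failed requests were not necessarily contiguous.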

Experiment with graceful shutdown

For this experiment, I set shutdown.delay to 5s and kept the other settings the same as in the previous experiment.

# deployment.yaml
template:
    spec:
      containers:
      - name: graceful-shutdown
        args:
        - --shutdown.delay=5s

# rolling update starts in 30 seconds
$ sleep 30; kubectl rollout restart deployment graceful-shutdown
# execute vegeta command on a different tab
$ echo "GET http://graceful.shutdown.test" | vegeta attack -duration=60s -rate=1000 | tee results.bin | vegeta report
Requests      [total, rate, throughput]  60000, 1000.02, 1000.00
Duration      [total, attack, wait]      59.999790006s, 59.999058824s, 731.182µs
Latencies     [mean, 50, 95, 99, max]    1.662431ms, 512.264µs, 3.372343ms, 26.208994ms, 178.154272ms
Bytes In      [total, mean]              660000, 11.00
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    100.00%
Status Codes  [code:count]               200:60000
Error Set:

This time, all 60,000 responses had status 200.

Conclusion

To avoid downtime during a rolling update, we have to implement graceful shutdown, either with a preStop hook or with signal handling that delays the server's shutdown long enough for the pod to be detached from the service.

DEV Community: Yutaro Yamanaka
