Ali

Kubernetes + Node.js: Health Checks and Graceful Shutdown Done Right

TL;DR: Kubernetes sends SIGTERM, then waits 30 seconds by default before sending SIGKILL. If your Node.js app doesn't handle SIGTERM properly, requests fail during deployments. This guide covers liveness/readiness probes, graceful shutdown, and the keep-alive problem.


You deploy a new version. Kubernetes starts rolling out pods. Users start seeing 502 errors.

Sound familiar?

The problem isn't Kubernetes. It's how your Node.js app handles the shutdown sequence. Get it right, and you get zero-downtime deployments. Get it wrong, and every deploy causes errors.

How Kubernetes Terminates Pods

When Kubernetes decides to kill a pod (deployment, scale-down, node drain), this happens:

1. Pod marked for termination; grace period countdown starts (default: 30s)
2. Pod removed from Service endpoints (in parallel with steps 3-4, not strictly before them)
3. PreStop hook runs (if configured)
4. SIGTERM sent to container
5. If still running when the grace period ends: SIGKILL sent (forced kill)

Your app has 30 seconds (minus any time the preStop hook uses) to:

  1. Stop accepting new connections
  2. Finish in-flight requests
  3. Close database/cache connections
  4. Exit cleanly

If you don't exit in time, Kubernetes sends SIGKILL. That's an immediate, non-graceful termination. Connections drop. Transactions fail.
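The time budget above can be sanity-checked with simple arithmetic. This is a minimal sketch, not part of any library: the function name and its parameters are hypothetical, and it only assumes that every phase of shutdown, including the preStop hook, has to fit inside the grace period.

```javascript
// Hypothetical helper: checks whether a shutdown plan fits inside the
// pod's terminationGracePeriodSeconds. All names here are illustrative.
function shutdownBudget({
  gracePeriodSeconds = 30, // Kubernetes default
  preStopSeconds = 0,      // time spent in the preStop hook
  drainDelaySeconds = 0,   // pause before closing the server
  requestTimeoutSeconds = 0, // longest wait for in-flight requests
  cleanupSeconds = 0,      // closing database/cache connections
}) {
  const plannedSeconds =
    preStopSeconds + drainDelaySeconds + requestTimeoutSeconds + cleanupSeconds
  return {
    plannedSeconds,
    marginSeconds: gracePeriodSeconds - plannedSeconds,
    fits: plannedSeconds < gracePeriodSeconds,
  }
}

console.log(shutdownBudget({
  preStopSeconds: 5,
  drainDelaySeconds: 5,
  requestTimeoutSeconds: 25,
  cleanupSeconds: 2,
}))
// plannedSeconds is 37, so this plan does not fit the default 30s grace period
```

If the plan does not fit, either shorten the waits or raise terminationGracePeriodSeconds in the pod spec.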

The SIGTERM Problem

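By default, Node.js does nothing special with SIGTERM. One container-specific caveat: when Node runs as PID 1 with no handler installed, the signal is typically ignored instead, and the pod sits there until SIGKILL. Either way, nothing gets cleaned up: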

// Default behavior: exit immediately
const server = app.listen(3000)

// When SIGTERM arrives:
// - Active requests get ECONNRESET
// - Database connections drop
// - No cleanup runs

You need to handle it explicitly:

process.on('SIGTERM', () => {
  console.log('SIGTERM received. Shutting down gracefully...')
  server.close(() => {
    console.log('HTTP server closed')
    process.exit(0)
  })
})

But server.close() alone isn't enough. Keep reading.
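One gap in the handler above is that the server.close() callback may never fire if connections stay open. A rough sketch of one way to bound the wait follows; closeWithDeadline is a hypothetical helper name, not a Node.js API, and the 10-second value in the usage comment is only an example.

```javascript
// Hypothetical helper: resolves 'closed' when server.close() finishes,
// or 'timeout' if open connections keep it from finishing in time.
function closeWithDeadline(server, timeoutMs) {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve('timeout'), timeoutMs)
    server.close(() => {
      clearTimeout(timer) // do not let the timer keep the process alive
      resolve('closed')
    })
  })
}

// Usage sketch (assumes `server` is a listening http.Server):
// process.on('SIGTERM', async () => {
//   const outcome = await closeWithDeadline(server, 10000)
//   process.exit(outcome === 'closed' ? 0 : 1)
// })
```

Exiting with a non-zero code on timeout is a design choice here: it makes forced shutdowns visible in pod status instead of looking like clean exits.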

Liveness vs Readiness Probes

Two Kubernetes probes matter here. Liveness decides whether a container gets restarted; readiness decides whether it receives traffic:

Liveness Probe

Question: "Is this container alive?"

If it fails: Kubernetes restarts the container.

Implementation:

livenessProbe:
  httpGet:
    path: /health/live
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 10

The matching Express route:

app.get('/health/live', (req, res) => {
  // If the process is running, it's alive
  res.status(200).json({ status: 'alive' })
})

Liveness should almost always return 200. The only time it should fail is if your app is in a broken state that requires a restart (deadlock, memory corruption).

Common mistake: Making liveness depend on database connectivity. If your database goes down, Kubernetes restarts all your pods. Now you have no pods AND no database. Bad.

Readiness Probe

Question: "Can this container handle traffic?"

If it fails: Kubernetes stops routing traffic to this pod.

Implementation:

readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5

The matching Express route:

let isShuttingDown = false

app.get('/health/ready', (req, res) => {
  if (isShuttingDown) {
    return res.status(503).json({ status: 'shutting_down' })
  }
  res.status(200).json({ status: 'ready' })
})

process.on('SIGTERM', () => {
  isShuttingDown = true
  // Continue with shutdown...
})

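During shutdown: Return 503 so that anything health-checking the pod stops sending new traffic while existing requests finish. Kubernetes itself removes a terminating pod from Service endpoints regardless of probe results, so treat the 503 as a backstop, not the main mechanism.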
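A failing readiness probe does not take effect instantly. As a rough illustration, the function below (a hypothetical name, not a Kubernetes API) estimates the upper bound on detection time, assuming the probe fails immediately and ignoring probe timeouts. Kubernetes marks the pod unready only after failureThreshold consecutive failures, and that threshold defaults to 3.

```javascript
// Rough upper bound on how long a failing readiness probe takes to mark
// the pod unready. failureThreshold defaults to 3 in Kubernetes.
function worstCaseUnreadySeconds({ periodSeconds, failureThreshold = 3 }) {
  return periodSeconds * failureThreshold
}

// With the readinessProbe settings shown above (periodSeconds: 5):
console.log(worstCaseUnreadySeconds({ periodSeconds: 5 })) // 15
```

So with these settings, a 503 from the readiness endpoint alone can take up to about 15 seconds to stop traffic.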

The Keep-Alive Problem

HTTP keep-alive connections are the silent killer of graceful shutdown.

Here's what happens:

1. Client opens connection to your pod
2. Client sends request
3. Your pod responds
4. Connection stays open (keep-alive)
5. Kubernetes sends SIGTERM to your pod
6. You call server.close()
7. server.close() waits for all connections to close
8. Keep-alive connections are idle but open
9. server.close() waits... and waits...
10. 30 seconds pass
11. SIGKILL
12. Client's next request on that connection fails

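The problem: before Node.js 19, server.close() doesn't close idle connections. It just stops accepting new ones and waits. From Node.js 19 on, server.close() also closes idle connections, and Node.js 18.2 added server.closeIdleConnections() and server.closeAllConnections() so you can do it explicitly.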

Solution: Track and terminate connections

const connections = new Set()

server.on('connection', (socket) => {
  connections.add(socket)
  socket.on('close', () => connections.delete(socket))
})

process.on('SIGTERM', () => {
  // Stop new connections; exit once every connection has closed
  server.close(() => process.exit(0))

  // Close idle keep-alive connections only (Node.js 18.2+).
  // Calling socket.end() on every tracked socket would also cut off
  // requests that are still waiting for their response.
  server.closeIdleConnections()

  // Force-close whatever is still open after the timeout
  setTimeout(() => {
    for (const socket of connections) {
      socket.destroy()  // Force close
    }
  }, 10000)
})
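To see the keep-alive effect in isolation, here is a small self-contained sketch using only Node's built-in http module. The helper name closeIdleIfSupported and the demo function are illustrative; server.closeIdleConnections() is the real Node.js method (available from 18.2.0), and the feature check is there so the script still runs on older versions, where it simply waits for the server's keep-alive timeout.

```javascript
const http = require('node:http')

// Closes idle keep-alive sockets if this Node.js version supports it.
// Returns whether the call was made.
function closeIdleIfSupported(server) {
  if (typeof server.closeIdleConnections !== 'function') return false
  server.closeIdleConnections()
  return true
}

// Demo: make one keep-alive request, then shut down and measure how long
// server.close() takes to finish.
function demo() {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => res.end('ok'))
    const agent = new http.Agent({ keepAlive: true })
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address()
      http.get({ host: '127.0.0.1', port, agent }, (res) => {
        res.resume()
        res.on('end', () => {
          // The connection is now idle but still open.
          const started = Date.now()
          server.close(() => {
            agent.destroy()
            resolve(Date.now() - started)
          })
          closeIdleIfSupported(server)
        })
      }).on('error', reject)
    })
  })
}

demo().then((ms) => console.log(`server.close() finished after ${ms} ms`))
```

Commenting out the closeIdleIfSupported(server) call on a Node.js 18 runtime should make the difference visible: the close callback then has to wait for the idle socket to time out.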

The Race Condition

There's a race between:

  1. Kubernetes removing your pod from endpoints
  2. Your app receiving SIGTERM

Sometimes, requests arrive AFTER SIGTERM but BEFORE the pod is removed from the load balancer.

Solution: Shutdown delay

process.on('SIGTERM', async () => {
  console.log('SIGTERM received')

  // 1. Mark as not ready (stop new requests)
  isShuttingDown = true

  // 2. Wait for load balancer to catch up
  await new Promise(resolve => setTimeout(resolve, 5000))

  // 3. Now close the server
  server.close()
})

Or configure it in your Kubernetes spec:

spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ["sleep", "5"]

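The preStop hook runs before SIGTERM, giving the load balancer time to update. Two things to keep in mind: the hook's runtime counts against terminationGracePeriodSeconds, and the exec form needs a sleep binary in the image, which minimal or distroless images may not have.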

Complete Manual Implementation

Here's everything together:

import express from 'express'
import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()
const app = express()
let isShuttingDown = false
const connections = new Set()

// Health checks
app.get('/health/live', (req, res) => {
  res.status(200).json({ status: 'alive' })
})

app.get('/health/ready', (req, res) => {
  if (isShuttingDown) {
    return res.status(503).json({ status: 'shutting_down' })
  }
  res.status(200).json({ status: 'ready' })
})

// Your routes
app.get('/api/data', async (req, res) => {
  const data = await prisma.user.findMany()
  res.json(data)
})

const server = app.listen(3000)

// Track connections
server.on('connection', (socket) => {
  connections.add(socket)
  socket.on('close', () => connections.delete(socket))
})

// Graceful shutdown
async function shutdown(signal) {
  if (isShuttingDown) return // ignore repeated signals
  console.log(`${signal} received. Starting graceful shutdown...`)

  // 1. Stop accepting new work (readiness now returns 503)
  isShuttingDown = true

  // 2. Wait for load balancer (if in Kubernetes)
  await new Promise(resolve => setTimeout(resolve, 5000))

  // 3. Stop HTTP server; resolves once every connection has closed
  const serverClosed = new Promise(resolve => server.close(resolve))

  // 4. Close idle connections only (Node.js 18.2+);
  //    sockets with in-flight requests are left alone
  server.closeIdleConnections()

  // 5. Wait for active requests (max 25 seconds, less if they finish early).
  //    The total must stay below terminationGracePeriodSeconds.
  let timer
  const timedOut = new Promise(resolve => { timer = setTimeout(resolve, 25000) })
  await Promise.race([serverClosed, timedOut])
  clearTimeout(timer)

  // 6. Force close remaining connections
  for (const socket of connections) {
    socket.destroy()
  }

  // 7. Close database
  await prisma.$disconnect()

  console.log('Graceful shutdown complete')
  process.exit(0)
}

process.on('SIGTERM', () => shutdown('SIGTERM'))
process.on('SIGINT', () => shutdown('SIGINT'))

That's 60+ lines just for proper Kubernetes shutdown. Every app needs this.

The Zero-Config Way

Kaput handles all of this automatically:

import express from 'express'
import '@joint-ops/kaput'
import { expressHealthMiddleware } from '@joint-ops/kaput'
import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()
const app = express()

// Kubernetes-ready health checks
app.use(expressHealthMiddleware())

app.get('/api/data', async (req, res) => {
  const data = await prisma.user.findMany()
  res.json(data)
})

app.listen(3000)

That's it. Kaput:

  • Handles SIGTERM
  • Tracks connections
  • Returns 503 on readiness during shutdown
  • Closes Prisma in the right order
  • Manages timeouts

Kubernetes Manifest

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: app
        image: my-app:latest
        ports:
        - containerPort: 3000
        livenessProbe:
          httpGet:
            path: /health/live
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
        lifecycle:
          preStop:
            exec:
              command: ["sleep", "5"]

Kaput's expressHealthMiddleware() provides:

  • /health - General health status
  • /health/live - Liveness probe (always 200)
  • /health/ready - Readiness probe (503 during shutdown)

Common Kubernetes + Node.js Mistakes

1. No SIGTERM Handler

// Bad: Process exits immediately, connections drop
app.listen(3000)

2. Liveness Depends on External Services

# Bad: If DB is down, pods restart forever
livenessProbe:
  httpGet:
    path: /health  # Returns 500 if DB is down

# Good: Separate liveness from dependencies
livenessProbe:
  httpGet:
    path: /health/live  # Always 200 if process is running

3. Readiness Doesn't Change on Shutdown

// Bad: Keeps accepting traffic after SIGTERM
app.get('/health/ready', (req, res) => {
  res.status(200).json({ status: 'ready' })
})

4. terminationGracePeriodSeconds Too Short

# Bad: Only 10 seconds to shutdown
terminationGracePeriodSeconds: 10

# Good: Enough time for requests to complete
terminationGracePeriodSeconds: 60

5. No preStop Hook

# Bad: Race between endpoint removal and SIGTERM
containers:
- name: app
  # No preStop hook

# Good: Delay to let load balancer update
lifecycle:
  preStop:
    exec:
      command: ["sleep", "5"]

6. Using npm start in Dockerfile

# Bad: shell form runs "/bin/sh -c npm start", so a shell and npm sit between SIGTERM and Node
CMD npm start

# Good: exec form runs Node directly, so it receives SIGTERM
CMD ["node", "dist/server.js"]

Testing Graceful Shutdown

Local Testing

# Terminal 1: Start your app
node server.js

# Terminal 2: Send requests and print each HTTP status (000 = connection failed)
while true; do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/api/data; sleep 0.1; done

# Terminal 3: Send SIGTERM
kill -TERM $(pgrep -f "node server.js")

# Watch Terminal 1 for shutdown logs
# Watch Terminal 2 - requests already in flight should still return 200;
# once the server has stopped listening, new requests show 000 (there is no load balancer locally)

Kubernetes Testing

# Watch pod status
kubectl get pods -w

# In another terminal, trigger a rollout
kubectl rollout restart deployment/my-app

# Watch application logs for errors during the rollout
# (502s seen by clients won't appear here; check your ingress or load generator as well)
kubectl logs -f deployment/my-app | grep -E "(error|Error|ERROR)"

Quick Start with Kaput

npm install @joint-ops/kaput

Then create the server:

// server.js
import express from 'express'
import '@joint-ops/kaput'
import { expressHealthMiddleware } from '@joint-ops/kaput'

const app = express()
app.use(expressHealthMiddleware())

app.get('/', (req, res) => res.json({ ok: true }))

app.listen(3000, () => {
  console.log('Server running on port 3000')
})

Deploy to Kubernetes with the manifest above. Zero-downtime deployments. No boilerplate.


Summary

Kubernetes graceful shutdown requires:

  1. SIGTERM handler - Stop accepting work, clean up
  2. Readiness probe - Return 503 during shutdown
  3. Liveness probe - Always 200 (process alive)
  4. Connection tracking - Close keep-alive connections
  5. Shutdown delay - Wait for load balancer to update
  6. Resource ordering - HTTP before databases

Kaput handles all of this with one import. Add health middleware, deploy, done.



Written by Muhammad Ali, Full-Stack & Web3 Gaming Engineer at JointOps. We build production-grade backends for containerized environments.

Tags: #kubernetes #nodejs #gracefulshutdown #devops #docker #healthchecks #expressjs #production
