DEV Community

1xApi

Posted on • Originally published at 1xapi.com

How to Implement Graceful Shutdown in Node.js APIs for Zero-Downtime Deployments (2026 Guide)

Zero-downtime deployments are non-negotiable for production APIs. Yet one of the most common causes of dropped requests and 502 errors during deployments is something deceptively simple: your Node.js process doesn't know how to die gracefully.

When Kubernetes sends a SIGTERM to your pod, or Docker stops a container, your API has a window to finish in-flight requests, close database connections, flush queues, and exit cleanly. Without a proper shutdown handler, requests get silently dropped, transactions are left open, and Redis connections leak — all while your users experience mysterious errors during what should be a seamless deploy.

This guide walks through building a production-grade graceful shutdown system for Node.js APIs in 2026, covering Express, Fastify, Hono, and Kubernetes-specific patterns.


Why Graceful Shutdown Matters in 2026

Modern deployment pipelines run rolling updates continuously. A typical Kubernetes rolling update sends SIGTERM to the old pod while simultaneously routing new traffic to the new pod. The gap between these two events — the termination grace period — is your window to clean up.

| Scenario | Without Graceful Shutdown | With Graceful Shutdown |
| --- | --- | --- |
| Rolling deploy | Requests dropped, 502 errors | Zero dropped requests |
| Scale-down event | Connections terminated | Connections drained |
| Pod eviction | Open transactions, data risk | Clean commit or rollback |
| Health check transition | Traffic sent to dying pod | Removed from load balancer first |

In Kubernetes, the default terminationGracePeriodSeconds is 30 seconds. That's your budget. Use it wisely.


Understanding the Shutdown Signal Chain

Before writing code, understand what actually happens when your pod terminates:

  1. Kubernetes sends SIGTERM to PID 1 in your container
  2. Kubernetes simultaneously removes your pod from the Endpoints list (this can take 2–10 seconds to propagate through kube-proxy/iptables)
  3. After terminationGracePeriodSeconds, Kubernetes sends SIGKILL (no escape)

The critical gap is step 2. Traffic may still arrive at your pod for several seconds after it receives SIGTERM. This is why a naive process.exit(0) on SIGTERM still drops requests.

The solution: add a pre-stop sleep and stop accepting new connections gradually.
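The failure mode is easy to sketch. Below is a minimal illustration (not the full handler, which Step 1 builds out): the naive exit drops whatever arrives during the propagation window, while flipping a flag first lets the readiness probe and middleware start refusing work before the process goes away.

```javascript
// Anti-pattern: exiting the moment SIGTERM arrives drops any request
// still being routed to this pod during the propagation window:
//   process.on('SIGTERM', () => process.exit(0)); // DON'T do this

// Better: record that we're draining first; close/cleanup follows later.
let draining = false;

process.on('SIGTERM', () => {
  draining = true; // readiness checks and middleware key off this flag
  // ...then server.close(), resource cleanup, and finally process.exit(0)
});
```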


Step 1: Basic Shutdown Handler

Let's start with the fundamentals — a minimal but correct shutdown handler for any Node.js HTTP server:

// server.js
import express from 'express';

const app = express();
let isShuttingDown = false;

// Middleware: reject new requests during shutdown
app.use((req, res, next) => {
  if (isShuttingDown) {
    res.set('Connection', 'close');
    return res.status(503).json({ error: 'Server is shutting down' });
  }
  next();
});

app.get('/health', (req, res) => {
  if (isShuttingDown) return res.status(503).json({ status: 'shutting_down' });
  res.json({ status: 'ok' });
});

app.get('/api/data', async (req, res) => {
  // Simulate async work
  await new Promise(resolve => setTimeout(resolve, 200));
  res.json({ data: 'response' });
});

const server = app.listen(3000, () => {
  console.log('Server listening on port 3000');
});

// Track active connections for forced drain
const connections = new Set();
server.on('connection', socket => {
  connections.add(socket);
  socket.on('close', () => connections.delete(socket));
});

async function shutdown(signal) {
  console.log(`[Shutdown] Received ${signal}`);
  isShuttingDown = true;

  // Stop accepting new connections
  server.close(err => {
    if (err) {
      console.error('[Shutdown] Error:', err);
      process.exit(1);
    }
    console.log('[Shutdown] All connections closed. Exiting.');
    process.exit(0);
  });

  // Force-destroy any remaining connections after 30s
  setTimeout(() => {
    console.warn('[Shutdown] Timeout hit, destroying remaining connections');
    for (const socket of connections) socket.destroy();
    process.exit(1);
  }, 30_000);
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT',  () => shutdown('SIGINT'));

This handles the basics: the isShuttingDown flag prevents new work from entering, and server.close() waits for in-flight requests before exiting. The 30-second hard exit is your safety net.
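As an aside: on Node 18.2+, the hand-tracked connections Set can be replaced by the server's built-in helpers. A sketch, assuming a plain node:http server (the same methods exist on the server returned by Express's app.listen):

```javascript
import http from 'node:http';

const server = http.createServer((req, res) => res.end('ok'));
server.listen(0); // ephemeral port, for illustration

function shutdown() {
  // Stop accepting new connections; wait for in-flight responses
  server.close(() => process.exit(0));
  // Immediately drop keep-alive sockets that have no active request,
  // so close() isn't held open by idle clients
  server.closeIdleConnections();
  // Last resort after 30s: destroy everything still open
  setTimeout(() => {
    server.closeAllConnections();
    process.exit(1);
  }, 30_000).unref();
}

process.on('SIGTERM', shutdown);
```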


Step 2: The Pre-Stop Sleep (Kubernetes Critical Pattern)

Because Kubernetes propagates endpoint removal asynchronously, you need to delay the actual shutdown start by a few seconds after receiving SIGTERM. This prevents 502s from traffic that's still being routed to the terminating pod.

There are two ways to implement this:

Option A: Kubernetes preStop hook (recommended)

# deployment.yaml
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: api
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 10"]

The preStop hook runs before SIGTERM is sent. Setting it to sleep 10 gives kube-proxy 10 seconds to drain existing connections before your app even starts shutting down. Your terminationGracePeriodSeconds must be greater than preStop duration + your app's shutdown time.
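That inequality is worth encoding in a pre-deploy check. A tiny helper sketch (validateGracePeriod is a hypothetical name, not part of kubectl or any Kubernetes API):

```javascript
// Returns whether the grace period leaves headroom beyond
// preStop sleep + the app's own worst-case shutdown time.
function validateGracePeriod({ terminationGracePeriodSeconds, preStopSeconds, shutdownSeconds }) {
  const needed = preStopSeconds + shutdownSeconds;
  return {
    ok: terminationGracePeriodSeconds > needed,
    headroom: terminationGracePeriodSeconds - needed, // seconds to spare
  };
}
```

With the values used in this guide (60s grace period, 10s preStop, a 30s shutdown budget), that leaves 20 seconds of headroom.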

Option B: Sleep inside the shutdown handler

async function shutdown(signal) {
  console.log(`[Shutdown] Received ${signal}`);
  isShuttingDown = true;

  // Give load balancer time to route traffic away (5–15 seconds)
  const DRAIN_DELAY = parseInt(process.env.SHUTDOWN_DRAIN_DELAY ?? '10', 10) * 1000;
  console.log(`[Shutdown] Waiting ${DRAIN_DELAY}ms for traffic drain...`);
  await new Promise(resolve => setTimeout(resolve, DRAIN_DELAY));

  // Now stop accepting connections
  server.close(() => {
    console.log('[Shutdown] Server closed. Exiting.');
    process.exit(0);
  });

  setTimeout(() => process.exit(1), 30_000);
}

Option A is preferred in Kubernetes because it keeps your app logic clean and the delay configurable per-environment without code changes.


Step 3: Production-Grade Shutdown Manager

Real APIs have more than just HTTP connections to clean up: PostgreSQL pools, Redis connections, open file handles, message queue consumers. Here's a ShutdownManager class that handles all of them:

// shutdown.js
export class ShutdownManager {
  #isShuttingDown = false;
  #cleanupHandlers = [];
  #timeout;

  constructor({ timeoutMs = 30_000 } = {}) {
    this.#timeout = timeoutMs;
  }

  /** Register a named cleanup handler */
  register(name, fn) {
    this.#cleanupHandlers.push({ name, fn });
    return this; // chainable
  }

  /** Call from your server setup */
  listen(server) {
    const connections = new Set();
    server.on('connection', s => {
      connections.add(s);
      s.on('close', () => connections.delete(s));
    });

    const handler = async signal => {
      if (this.#isShuttingDown) return;
      this.#isShuttingDown = true;
      console.log(`\n[Shutdown] Signal: ${signal}`);

      // Hard timeout as last resort
      const forceExit = setTimeout(() => {
        console.error('[Shutdown] Timeout exceeded — forcing exit');
        process.exit(1);
      }, this.#timeout);
      forceExit.unref(); // Don't prevent clean exit

      // 1. Stop HTTP server; closeIdleConnections() (Node 18.2+) drops
      // keep-alive sockets with no request in flight so close() can finish
      server.closeIdleConnections?.();
      await new Promise(resolve => server.close(resolve));
      console.log('[Shutdown] HTTP server closed');

      // 2. Run cleanup handlers in order
      for (const { name, fn } of this.#cleanupHandlers) {
        try {
          await fn();
          console.log(`[Shutdown] ✓ ${name}`);
        } catch (err) {
          console.error(`[Shutdown] ✗ ${name}:`, err.message);
        }
      }

      // 3. Force-close any remaining sockets
      for (const s of connections) s.destroy();

      clearTimeout(forceExit);
      console.log('[Shutdown] Complete. Goodbye.');
      process.exit(0);
    };

    process.on('SIGTERM', handler);
    process.on('SIGINT',  handler);
  }

  get isShuttingDown() { return this.#isShuttingDown; }
}

Usage with real resources:

// app.js
import { Pool } from 'pg';
import { Redis } from 'ioredis';
import { ShutdownManager } from './shutdown.js';

const pool  = new Pool({ connectionString: process.env.DATABASE_URL });
const redis = new Redis(process.env.REDIS_URL);

const shutdown = new ShutdownManager({ timeoutMs: 30_000 });

shutdown
  .register('PostgreSQL pool', () => pool.end())
  .register('Redis client',    () => redis.quit())
  .register('Flush metrics',   () => metrics.flush()); // optional

const server = app.listen(3000);
shutdown.listen(server);

Each cleanup handler runs sequentially, with individual error isolation — a failed Redis close won't prevent the PostgreSQL pool from shutting down.
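The error-isolation behavior is easy to verify in isolation. Here's a stripped-down version of the same loop, runnable without a server (handler names are just examples):

```javascript
// Runs handlers in order; a throwing handler is recorded but
// never prevents the handlers after it from running.
async function runCleanup(handlers) {
  const results = [];
  for (const { name, fn } of handlers) {
    try {
      await fn();
      results.push({ name, ok: true });
    } catch (err) {
      results.push({ name, ok: false, error: err.message });
    }
  }
  return results;
}

// Example: a failing Redis close still lets the pool handler run.
runCleanup([
  { name: 'Redis client', fn: async () => { throw new Error('ECONNRESET'); } },
  { name: 'PostgreSQL pool', fn: async () => {} },
]).then(results => console.log(results));
```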


Step 4: Health Check Integration

Your Kubernetes readiness probe needs to know when to stop sending traffic before SIGTERM arrives. Update your health endpoint:

app.get('/health/ready', (req, res) => {
  if (shutdown.isShuttingDown) {
    // Return 503 → Kubernetes removes pod from endpoints immediately
    return res.status(503).json({
      status: 'shutting_down',
      message: 'Pod is draining'
    });
  }
  res.json({ status: 'ready' });
});

app.get('/health/live', (req, res) => {
  // Liveness probe stays 200 until we're fully done
  // (returning 500 triggers a pod restart, not a clean shutdown)
  res.json({ status: 'alive' });
});

Corresponding Kubernetes probes:

readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 1   # Remove from rotation immediately on first failure

livenessProbe:
  httpGet:
    path: /health/live
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 3

With failureThreshold: 1 on the readiness probe, as soon as your pod returns 503 on /health/ready, Kubernetes removes it from the load balancer — eliminating the race condition entirely.


Step 5: Fastify and Hono Patterns

The same principles apply to modern frameworks, with minor differences.

Fastify (with built-in fastify.close())

import Fastify from 'fastify';

const fastify = Fastify({ logger: true });
let isShuttingDown = false;

fastify.addHook('onRequest', async (request, reply) => {
  if (isShuttingDown) {
    reply.header('Connection', 'close');
    // Returning the reply stops the request lifecycle here
    return reply.status(503).send({ error: 'shutting down' });
  }
});

const shutdown = async signal => {
  fastify.log.info({ signal }, 'Shutdown initiated');
  isShuttingDown = true;

  await fastify.close(); // waits for in-flight requests + runs onClose hooks
  process.exit(0);
};

process.on('SIGTERM', shutdown);
process.on('SIGINT',  shutdown);

Fastify's fastify.close() handles the connection draining internally and fires any registered fastify.addHook('onClose', ...) handlers — making cleanup registration more ergonomic.

Hono on Node.js

import { serve } from '@hono/node-server';
import { Hono } from 'hono';

const app = new Hono();
let isShuttingDown = false;

app.use('*', async (c, next) => {
  if (isShuttingDown) {
    return c.json({ error: 'shutting down' }, 503);
  }
  return next();
});

const server = serve({ fetch: app.fetch, port: 3000 });

process.on('SIGTERM', async () => {
  isShuttingDown = true;
  server.close(() => process.exit(0));
  setTimeout(() => process.exit(1), 30_000);
});

Step 6: Handling BullMQ Workers

If your API runs BullMQ background workers, graceful shutdown is critical — abruptly stopping a worker leaves jobs in the active state and they won't be retried until the lock expires (default: 30 seconds).

import { Worker } from 'bullmq';

const worker = new Worker('emails', processJob, {
  connection: redis,
  lockDuration: 30_000,
});

shutdown.register('BullMQ worker', async () => {
  await worker.close();           // waits for current job to finish
  console.log('Worker drained');
});

worker.close() stops the worker from picking up new jobs, waits for the active job to complete, then closes the connection. For jobs that run longer than lockDuration, no extra wiring is needed: BullMQ renews the lock automatically while the processor is still running.


Complete Deployment Checklist

Here's the full picture for zero-downtime Kubernetes deployments in 2026:

# deployment.yaml (production template)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0       # Never reduce capacity during rollout
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: api
        image: my-api:latest
        ports:
        - containerPort: 3000
        env:
        - name: SHUTDOWN_DRAIN_DELAY
          value: "10"
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 10"]
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 3000
          periodSeconds: 5
          failureThreshold: 1
        livenessProbe:
          httpGet:
            path: /health/live
            port: 3000
          periodSeconds: 10
          failureThreshold: 3

Checklist:

  • [ ] process.on('SIGTERM') handler registered
  • [ ] process.on('SIGINT') handler registered
  • [ ] isShuttingDown flag rejects new requests with 503
  • [ ] /health/ready returns 503 when shutting down
  • [ ] Database pool properly closed on shutdown
  • [ ] Redis/cache client properly closed on shutdown
  • [ ] BullMQ workers drained before exit
  • [ ] Hard timeout (process.exit(1)) as last resort
  • [ ] preStop hook adds drain delay in Kubernetes
  • [ ] terminationGracePeriodSeconds > preStop + shutdown time

Testing Your Shutdown Handler

Don't assume it works — test it:

# Start your server
node server.js &
SERVER_PID=$!

# Send some requests in a loop
for i in $(seq 1 20); do
  curl -s http://localhost:3000/api/data &
done

# Send SIGTERM mid-flight
sleep 0.1 && kill -TERM $SERVER_PID

# Wait and check — no 502s, no lost responses
wait
echo "All requests completed cleanly"

For load testing under shutdown, use k6:

// k6-shutdown-test.js
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  vus: 50,
  duration: '30s',
};

export default function () {
  const res = http.get('http://localhost:3000/api/data');
  check(res, {
    'status is 200 or 503': r => r.status === 200 || r.status === 503,
    'never 502': r => r.status !== 502,
  });
}

A 503 during shutdown is acceptable (it's intentional). A 502 means a connection was dropped — that's a bug.


Summary

Graceful shutdown is one of those things that seems optional until production punishes you for ignoring it. In a world of continuous deployment, Kubernetes rolling updates, and auto-scaling, every restart is a potential incident without it.

The pattern in 2026 is clear:

  1. Handle SIGTERM and SIGINT
  2. Set an isShuttingDown flag immediately and return 503 on readiness
  3. Stop accepting new connections with server.close()
  4. Run cleanup handlers (DB, Redis, queues) sequentially
  5. Force exit after 30 seconds as a last resort
  6. Add a preStop sleep in Kubernetes to absorb the endpoint propagation delay

With these patterns in place, your API can deploy dozens of times a day without a single dropped request.


