1xApi

Posted on • Originally published at 1xapi.com

How to Implement Zero-Downtime Graceful Shutdown in Node.js APIs (2026 Guide)

Zero-downtime deployments are only possible if your API can shut down cleanly. Without a proper graceful shutdown, rolling deployments drop in-flight requests, Kubernetes rolls back, and your users see 502 errors.

This guide covers everything you need to implement production-grade graceful shutdown in Node.js APIs in 2026 — from basic SIGTERM handling to connection draining, health check transitions, and Kubernetes lifecycle hooks.

Why Graceful Shutdown Matters in 2026

Modern APIs run in containerized, orchestrated environments (Kubernetes, ECS, Cloud Run). When you deploy a new version:

  1. The orchestrator sends SIGTERM to the old pod
  2. It waits for terminationGracePeriodSeconds (default: 30s)
  3. If the process hasn't exited, it sends SIGKILL — forcefully terminating all connections

Without handling SIGTERM, your Node.js process dies instantly, dropping every in-flight request mid-response. In a busy API with hundreds of concurrent connections, this causes a flood of 502/504 errors.

The cost: lost transactions, broken API clients, Kubernetes health check failures, and potential rollbacks.

The Five Stages of Graceful Shutdown

A production-grade shutdown follows this sequence:

  1. Receive SIGTERM — stop accepting new requests
  2. Signal readiness/liveness probes — return 503 from /health so load balancers deregister you
  3. Drain in-flight requests — wait for all active requests to complete
  4. Close resource connections — DB pools, Redis, message queues, file handles
  5. Exit cleanly — call process.exit(0) with a success code

Skip any step, and you either drop requests or leave zombie connections in your database pool.

Basic SIGTERM Handler

Here's the minimum viable graceful shutdown for an Express API:

import express from 'express';
import http from 'http';
// Assumes `db` (a pg pool) and `redis` (a Redis client) are created in ./db
import { db, redis } from './db';

const app = express();
const server = http.createServer(app);

let isShuttingDown = false;

// Health check — returns 503 during shutdown so LB stops routing
app.get('/health', (req, res) => {
  if (isShuttingDown) {
    return res.status(503).json({ status: 'shutting_down' });
  }
  res.json({ status: 'healthy', uptime: process.uptime() });
});

app.get('/api/users', async (req, res) => {
  const { rows } = await db.query('SELECT * FROM users LIMIT 20');
  res.json(rows);
});

async function gracefulShutdown(signal: string) {
  console.log(`[${signal}] Starting graceful shutdown...`);
  isShuttingDown = true;

  // Stop accepting new connections
  server.close(async (err) => {
    if (err) {
      console.error('Error closing server:', err);
      process.exit(1);
    }

    try {
      await Promise.all([
        db.end(),         // close DB pool
        redis.quit(),     // close Redis connection
      ]);
      console.log('All connections closed. Exiting cleanly.');
      process.exit(0);
    } catch (cleanupErr) {
      console.error('Cleanup error:', cleanupErr);
      process.exit(1);
    }
  });

  // Force-exit safety valve — Kubernetes will SIGKILL at 30s anyway
  setTimeout(() => {
    console.error('Graceful shutdown timed out. Forcing exit.');
    process.exit(1);
  }, 25000);
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT',  () => gracefulShutdown('SIGINT'));

server.listen(3000, () => console.log('API listening on :3000'));

The key insight here: returning 503 from /health is the most important step. It tells your Kubernetes readiness probe and any upstream load balancer to stop routing new traffic to this pod immediately, so in-flight requests can drain without new ones piling up behind them.

The Keep-Alive Connection Problem

server.close() stops accepting new connections, but there's a subtle trap: HTTP/1.1 keep-alive connections keep the server alive indefinitely.

If a client opened a persistent connection and is sitting idle between requests, server.close() won't close that socket — it waits for the client to close it naturally. In practice, this means your server can hang for the full grace period without ever exiting.

The Fix: Track and Destroy Idle Connections

const activeConnections = new Set<import('net').Socket>();

server.on('connection', (socket) => {
  activeConnections.add(socket);
  socket.on('close', () => activeConnections.delete(socket));
});

// Inside gracefulShutdown, after server.close():
function destroyIdleConnections() {
  for (const socket of activeConnections) {
    // Only destroy sockets with no pending request (idle keep-alive)
    // @ts-ignore — _httpMessage is internal but reliable
    if (!socket._httpMessage || socket._httpMessage.finished) {
      socket.destroy();
      activeConnections.delete(socket);
    }
  }
}

// Call this after server.close() to push idle connections out
destroyIdleConnections();

This forces idle keep-alive sockets closed immediately while letting active requests finish naturally. Recent Node.js versions reduce the need for this hack: server.closeIdleConnections() (added in Node 18.2) destroys idle keep-alive sockets for you, and since Node 19 server.close() closes idle connections automatically. The manual tracking above is still useful on older runtimes, or when you want explicit control over which sockets get destroyed.
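If you're on Node 18.2 or later, you can skip the socket bookkeeping entirely. A minimal sketch using the built-in server.closeIdleConnections() (and its more aggressive sibling, server.closeAllConnections()):

```typescript
import http from 'node:http';

const server = http.createServer((req, res) => {
  res.end('ok');
});

function shutdown() {
  // Stop accepting new connections; the callback fires once all sockets are gone
  server.close(() => {
    console.log('Server closed');
    process.exit(0);
  });

  // Built-in since Node 18.2: destroys sockets with no request in flight,
  // while active responses are left to finish naturally
  server.closeIdleConnections();

  // Escalation if draining stalls past your deadline (drops active requests):
  // server.closeAllConnections();
}

process.on('SIGTERM', shutdown);
server.listen(3000, () => console.log('API listening on :3000'));
```

Prefer the built-in over the _httpMessage hack where your runtime allows it, since it doesn't depend on Node internals.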

Tracking Active Requests with a Counter

For complex APIs, you want to know exactly when the last in-flight request completes:

let activeRequests = 0;
let resolveIdle: (() => void) | null = null;

// Middleware to count in-flight requests
app.use((req, res, next) => {
  activeRequests++;
  res.on('finish', () => {
    activeRequests--;
    if (isShuttingDown && activeRequests === 0 && resolveIdle) {
      resolveIdle();
    }
  });
  next();
});

function waitForIdleRequests(timeoutMs = 20000): Promise<void> {
  if (activeRequests === 0) return Promise.resolve();
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('Drain timeout')), timeoutMs);
    resolveIdle = () => {
      clearTimeout(timer); // don't leave a stray rejection timer behind
      resolve();
    };
  });
}

// In gracefulShutdown:
async function gracefulShutdown(signal: string) {
  console.log(`[${signal}] Shutting down. Active requests: ${activeRequests}`);
  isShuttingDown = true;

  server.close();
  destroyIdleConnections();

  try {
    await waitForIdleRequests(20000);
    console.log('All requests drained.');
  } catch {
    console.warn('Drain timeout — forcing close with active requests.');
  }

  await Promise.allSettled([db.end(), redis.quit()]);
  process.exit(0);
}

Promise.allSettled is intentional here — you want to attempt cleanup on all resources even if one fails. Promise.all would reject on the first failure, and the process.exit(0) that follows could fire before the remaining close calls have finished.
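The difference is easy to see with a toy example (the failing closeRedis here is a stand-in, not a real client):

```typescript
async function demo() {
  const closeDb = async () => 'db closed';
  const closeRedis = async () => {
    throw new Error('redis already gone');
  };

  // Promise.all rejects as soon as one promise rejects; closeDb's
  // success is discarded and never reported
  await Promise.all([closeDb(), closeRedis()]).catch((err) =>
    console.log('Promise.all short-circuited:', err.message),
  );

  // Promise.allSettled waits for every promise and reports each outcome
  const results = await Promise.allSettled([closeDb(), closeRedis()]);
  for (const r of results) {
    console.log(r.status === 'fulfilled' ? r.value : r.reason.message);
  }
  // db closed
  // redis already gone
}

demo();
```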

Kubernetes Integration

Kubernetes manages your pod lifecycle through two mechanisms that interact with graceful shutdown:

terminationGracePeriodSeconds

# deployment.yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60   # Give 60s total
      containers:
        - name: api
          image: your-api:latest
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            periodSeconds: 5
            failureThreshold: 1           # Deregister after first 503
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 10

preStop Hook

There's a race condition between SIGTERM arrival and the load balancer finishing deregistration. Even after the pod is marked "Terminating", in-flight requests from already-established connections continue arriving for a few seconds.

The fix: add a preStop hook that sleeps before SIGTERM fires:

lifecycle:
  preStop:
    exec:
      command: ["sleep", "5"]

This adds a 5-second delay between the pod entering "Terminating" state and SIGTERM being sent. With a terminationGracePeriodSeconds: 60, your timeline becomes:

  • T+0s — Pod marked Terminating, preStop hook runs
  • T+5s — SIGTERM sent, app starts shutdown sequence
  • T+5-25s — Active requests drain, DB/Redis close
  • T+30s — Clean process.exit(0)
  • T+60s — Kubernetes SIGKILL deadline (never reached)
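If you deploy somewhere without preStop hooks (bare VMs, some PaaS platforms), you can approximate the same delay in-process. A sketch, with the 5-second figure mirroring the hook above:

```typescript
import http from 'node:http';

let isShuttingDown = false;

const server = http.createServer((req, res) => {
  if (req.url === '/health') {
    res.statusCode = isShuttingDown ? 503 : 200;
    return res.end(JSON.stringify({ status: isShuttingDown ? 'shutting_down' : 'healthy' }));
  }
  res.end('ok');
});

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function gracefulShutdown(signal: string) {
  console.log(`[${signal}] Failing health checks, waiting for deregistration...`);
  isShuttingDown = true; // /health starts returning 503 immediately

  // Keep serving traffic while the load balancer notices the failing probe
  await sleep(5000);

  // Now stop accepting connections and exit once existing ones drain
  server.close(() => process.exit(0));
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
server.listen(3000, () => console.log('API listening on :3000'));
```

The trade-off: the delay lives in your application code rather than the deployment manifest, so tune it to however long your particular load balancer takes to deregister a failing target.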

Full Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0         # Zero-downtime: never remove before adding
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: api
          image: your-api:latest
          ports:
            - containerPort: 3000
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            periodSeconds: 5
            failureThreshold: 1
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 10
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "5"]

With maxUnavailable: 0, Kubernetes won't terminate the old pod until the new one passes its readiness probe — ensuring continuous availability.

Database and Redis Cleanup

Different clients need different teardown strategies:

import pg from 'pg';
import { createClient } from 'redis';
import { Sequelize } from 'sequelize';

const pgPool = new pg.Pool({ max: 20 });
const redisClient = createClient();
const sequelize = new Sequelize(/* ... */);

async function closeAllConnections() {
  const results = await Promise.allSettled([
    // PostgreSQL pool — waits for active queries to finish, then closes
    pgPool.end().then(() => console.log('PG pool closed')),

    // Redis — QUIT sends a clean close command
    redisClient.quit().then(() => console.log('Redis closed')),

    // Sequelize — closes all pool connections
    sequelize.close().then(() => console.log('Sequelize pool closed')),
  ]);

  for (const result of results) {
    if (result.status === 'rejected') {
      console.error('Connection close error:', result.reason);
    }
  }
}

Important: pgPool.end() waits for checked-out clients to finish their work and be released before closing the pool. This is exactly what you want — it won't interrupt a query that's already executing.
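One caveat: a wedged client can still stall pgPool.end() past your shutdown budget. A defensive pattern is to race each cleanup step against a deadline (a sketch — the withTimeout helper and the 5s budget are illustrative, not part of any client's API):

```typescript
// Race a cleanup promise against a deadline so one stuck resource
// can't consume the entire shutdown budget
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} close timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Inside closeAllConnections, wrap each close call:
// await Promise.allSettled([
//   withTimeout(pgPool.end(), 5000, 'pg'),
//   withTimeout(redisClient.quit(), 5000, 'redis'),
//   withTimeout(sequelize.close(), 5000, 'sequelize'),
// ]);
```

Combined with Promise.allSettled, a timed-out resource simply shows up as a rejected result instead of blocking the exit.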

Handling Uncaught Errors During Shutdown

Don't let error handlers interfere with your shutdown sequence:

let isShuttingDown = false;

process.on('uncaughtException', (err) => {
  console.error('Uncaught exception:', err);
  if (!isShuttingDown) {
    gracefulShutdown('uncaughtException');
  }
});

process.on('unhandledRejection', (reason) => {
  console.error('Unhandled rejection:', reason);
  if (!isShuttingDown) {
    gracefulShutdown('unhandledRejection');
  }
});

process.on('SIGTERM', () => {
  if (!isShuttingDown) gracefulShutdown('SIGTERM');
});

process.on('SIGINT', () => {
  if (!isShuttingDown) gracefulShutdown('SIGINT');
});

The isShuttingDown guard prevents double-invocation — e.g., if SIGTERM triggers shutdown and then an error fires during cleanup.

BullMQ Worker Shutdown

Background job workers need special handling — the process must not die mid-job:

import { Worker } from 'bullmq';
import { redisConnection } from './redis';

const worker = new Worker('emails', processEmailJob, {
  connection: redisConnection,
  concurrency: 5,
});

async function shutdownWorker() {
  console.log('Closing BullMQ worker...');
  // close() waits for all active jobs to finish before stopping
  await worker.close();
  console.log('Worker closed cleanly.');
}

process.on('SIGTERM', async () => {
  isShuttingDown = true;
  await Promise.all([
    shutdownServer(),   // HTTP server
    shutdownWorker(),   // BullMQ worker
  ]);
  await closeAllConnections();
  process.exit(0);
});

worker.close() in BullMQ waits for in-progress jobs to complete, then stops polling for new ones. Note that BullMQ has no awareness of Kubernetes deadlines, so make sure your longest-running job fits inside the terminationGracePeriodSeconds window (or call worker.close(true) to force-close without waiting for active jobs).

Testing Your Shutdown

Don't trust your shutdown code until you've tested it under load:

// test/graceful-shutdown.test.ts
import { spawn } from 'child_process';
import fetch from 'node-fetch';

describe('graceful shutdown', () => {
  it('completes in-flight requests on SIGTERM', async () => {
    const server = spawn('node', ['dist/server.js']);

    await new Promise(r => setTimeout(r, 500)); // wait for startup

    // Start a slow request (will take 2s to complete)
    const slowRequest = fetch('http://localhost:3000/api/slow');

    // Immediately send SIGTERM
    server.kill('SIGTERM');

    // The slow request should still succeed
    const response = await slowRequest;
    expect(response.status).toBe(200);

    // Server should have exited cleanly
    const exitCode = await new Promise(r => server.on('close', r));
    expect(exitCode).toBe(0);
  });

  it('returns 503 on /health during shutdown', async () => {
    const server = spawn('node', ['dist/server.js']);
    await new Promise(r => setTimeout(r, 500));

    server.kill('SIGTERM');
    await new Promise(r => setTimeout(r, 100)); // give it a moment

    const health = await fetch('http://localhost:3000/health');
    expect(health.status).toBe(503);
  });
});

Complete Production-Ready Implementation

Putting it all together — a production graceful shutdown module you can drop into any Node.js API:

// lib/shutdown.ts
import { Server } from 'http';
import { Socket } from 'net';

interface ShutdownOptions {
  server: Server;
  drainTimeoutMs?: number;
  cleanupFns?: Array<() => Promise<void>>;
  logger?: typeof console;
}

export function setupGracefulShutdown({
  server,
  drainTimeoutMs = 20000,
  cleanupFns = [],
  logger = console,
}: ShutdownOptions) {
  let isShuttingDown = false;
  let activeRequests = 0;
  let resolveIdle: (() => void) | null = null;
  const sockets = new Set<Socket>();

  server.on('connection', (socket: Socket) => {
    sockets.add(socket);
    socket.on('close', () => sockets.delete(socket));
  });

  // Middleware — attach to app BEFORE routes
  const requestTracker = (req: any, res: any, next: any) => {
    if (isShuttingDown) {
      res.set('Connection', 'close');
      return res.status(503).json({ error: 'Server is shutting down' });
    }
    activeRequests++;
    res.on('finish', () => {
      activeRequests--;
      if (activeRequests === 0 && resolveIdle) resolveIdle();
    });
    next();
  };

  const healthStatus = () => (isShuttingDown ? 'shutting_down' : 'healthy');

  async function shutdown(signal: string) {
    if (isShuttingDown) return;
    isShuttingDown = true;

    logger.info(`[${signal}] Graceful shutdown started. Active: ${activeRequests}`);

    server.close();

    // Destroy idle keep-alive connections
    for (const socket of sockets) {
      // @ts-ignore
      if (!socket._httpMessage || socket._httpMessage.finished) {
        socket.destroy();
      }
    }

    // Wait for in-flight requests
    if (activeRequests > 0) {
      await new Promise<void>((resolve, reject) => {
        const timer = setTimeout(() => reject(new Error('Drain timeout')), drainTimeoutMs);
        resolveIdle = () => {
          clearTimeout(timer); // cancel the rejection timer once drained
          resolve();
        };
      }).catch((err) => logger.warn(err.message));
    }

    // Run cleanup functions in parallel
    await Promise.allSettled(cleanupFns.map((fn) => fn()));

    logger.info('Shutdown complete.');
    process.exit(0);
  }

  process.on('SIGTERM', () => shutdown('SIGTERM'));
  process.on('SIGINT', () => shutdown('SIGINT'));
  process.on('uncaughtException', (err) => {
    logger.error('Uncaught exception:', err);
    shutdown('uncaughtException');
  });
  process.on('unhandledRejection', (reason) => {
    logger.error('Unhandled rejection:', reason);
    shutdown('unhandledRejection');
  });

  return { requestTracker, healthStatus };
}

Usage in your server file:

// server.ts
import express from 'express';
import http from 'http';
import { setupGracefulShutdown } from './lib/shutdown';
import { pgPool, redisClient } from './db';

const app = express();
const server = http.createServer(app);

const { requestTracker, healthStatus } = setupGracefulShutdown({
  server,
  drainTimeoutMs: 20000,
  cleanupFns: [
    () => pgPool.end(),
    () => redisClient.quit(),
  ],
});

// Health check — must be BEFORE requestTracker middleware
app.get('/health', (req, res) => {
  const status = healthStatus();
  res.status(status === 'healthy' ? 200 : 503).json({ status });
});

// Request tracking middleware
app.use(requestTracker);

// Your API routes
app.use('/api', apiRouter);

server.listen(3000, () => console.log('API listening on :3000'));

Quick Reference: Shutdown Checklist

  • Handle SIGTERM + SIGINT — catches both Kubernetes and Ctrl+C signals
  • Return 503 from /health — triggers load balancer deregistration
  • Track the active request count — know when it's safe to close connections
  • Destroy idle keep-alive sockets — prevent server.close() from hanging
  • Set a 25s force-exit timeout — never exceed the Kubernetes terminationGracePeriod
  • Use Promise.allSettled for cleanup — close all resources even if one fails
  • Add a Kubernetes preStop sleep — eliminate the race condition on traffic draining
  • Test with concurrent load — verify no requests are dropped during a real shutdown

Summary

Graceful shutdown isn't glamorous, but it's the difference between zero-downtime deployments and a flood of 502 errors in your observability dashboard.

The pattern is straightforward: flip a shutdown flag, return 503 from health checks, drain active requests, close resource connections, then exit. The Kubernetes preStop hook and correct terminationGracePeriodSeconds tuning give you the time budget to do it right.

If you're building production APIs and want to monetize them on RapidAPI via 1xAPI, graceful shutdown is one of the non-negotiables for maintaining the uptime SLAs that paying API consumers expect.
