DEV Community

1xApi

Posted on • Originally published at 1xapi.com

How to Implement Graceful Shutdown in Node.js APIs (Zero Dropped Requests on Deploy)

Deploying a Node.js API without proper graceful shutdown is like pulling the power cord on a running server. Every rolling deploy, every Kubernetes pod restart, every Docker container stop — if your app ignores SIGTERM, you're dropping in-flight requests, corrupting database transactions, and leaving clients staring at 502 errors.

This guide covers everything you need to implement graceful shutdown correctly in Node.js APIs in 2026 — from the basics of SIGTERM handling to keep-alive connection draining, database cleanup, Kubernetes configuration, and production-tested patterns.

Why Graceful Shutdown Matters in 2026

Modern APIs live in orchestrated environments. Kubernetes, Docker Swarm, ECS, and even simple docker compose down — all of them use the same shutdown sequence:

  1. Container receives SIGTERM (polite stop)
  2. Orchestrator waits terminationGracePeriodSeconds (Kubernetes default: 30s)
  3. If still running: container receives SIGKILL (force kill, no cleanup)
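To see the first step in isolation, a minimal script can catch the signal and simulate the orchestrator's polite stop by sending SIGTERM to itself (the self-kill is purely for demonstration):

```javascript
// signal-demo.js — observe SIGTERM delivery without a real orchestrator
let received = null;

process.on('SIGTERM', () => {
  received = 'SIGTERM';
  console.log('SIGTERM received — a real app would start draining here');
});

// Simulate `docker stop` / Kubernetes by signalling our own process:
process.kill(process.pid, 'SIGTERM');
```

Run it and the handler fires instead of the process dying; delete the `process.on` line and the same script is killed outright.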

Two failure modes are common:

Pattern 1 — The Ignore: App doesn't listen to SIGTERM at all. Docker waits 10s then SIGKILL. Every in-flight request dies.

Pattern 2 — The Instant Exit: App calls process.exit(0) immediately on SIGTERM. Same result — requests dropped, database connections severed mid-query.

The cost: during a rolling deploy of a busy API where each pod handles 1,000 req/s, a 10s hard shutdown kills ~10,000 requests per pod. With 5 pods rolling, that's ~50,000 failed requests every deploy.
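The arithmetic behind those numbers, assuming the 1,000 req/s figure is per pod:

```javascript
// Back-of-envelope cost of a hard shutdown during a rolling deploy
const reqPerSecPerPod = 1_000;
const hardKillSeconds = 10;   // Docker's default SIGTERM → SIGKILL window
const podsRolling = 5;

const droppedPerPod = reqPerSecPerPod * hardKillSeconds;
const droppedPerDeploy = droppedPerPod * podsRolling;
console.log({ droppedPerPod, droppedPerDeploy }); // { droppedPerPod: 10000, droppedPerDeploy: 50000 }
```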

The Minimal Correct Implementation

Here's the foundation — every Node.js API should have this:

// server.js
import express from 'express';
// `db` below stands in for your database client (e.g. a pg Pool) created elsewhere

const app = express();

app.get('/api/health', (req, res) => res.json({ status: 'ok' }));

app.get('/api/data', async (req, res) => {
  const data = await db.query('SELECT * FROM items LIMIT 100');
  res.json(data);
});

const server = app.listen(3000, () => {
  console.log('Server listening on :3000');
});

async function shutdown(signal) {
  console.log(`${signal} received — starting graceful shutdown`);

  // 1. Stop accepting new connections
  server.close(async () => {
    console.log('HTTP server closed');

    // 2. Close DB connections cleanly
    await db.end();
    console.log('Database pool closed');

    process.exit(0);
  });

  // 3. Force exit if drain takes too long
  const timer = setTimeout(() => {
    console.error('Shutdown timeout — forcing exit');
    process.exit(1);
  }, 10_000);
  timer.unref(); // don't keep the process alive just for the safety timer
}

process.on('SIGTERM', () => shutdown('SIGTERM')); // Docker / Kubernetes
process.on('SIGINT',  () => shutdown('SIGINT'));  // Ctrl+C

This is correct but incomplete. There's a critical problem: keep-alive connections.

The Keep-Alive Connection Problem

HTTP/1.1 keep-alive connections stay open between requests. When server.close() is called, Node.js stops accepting new connections but waits for all existing keep-alive connections to close on their own. A client that opened a keep-alive connection 5 minutes ago is still "connected" even if it hasn't sent a request in 4 minutes.

Result: server.close() callback never fires, your 10s timeout kicks in, and you get a hard exit anyway.

The fix is to track connections and destroy idle ones immediately on shutdown:

// connection-tracker.js
export function trackConnections(server) {
  const connections = new Map();
  let shuttingDown = false;

  server.on('connection', (socket) => {
    connections.set(socket, { idle: true, createdAt: Date.now() });

    socket.on('close', () => connections.delete(socket));
  });

  // Mark connection as active when handling a request
  server.on('request', (req, res) => {
    const socket = req.socket;
    const conn = connections.get(socket);
    if (conn) conn.idle = false;

    res.on('finish', () => {
      const conn = connections.get(socket);
      if (conn) {
        conn.idle = true;
        // If we're shutting down, destroy idle connections immediately
        if (shuttingDown) {
          socket.destroy();
        }
      }
    });
  });

  return {
    destroy() {
      shuttingDown = true;
      for (const [socket, meta] of connections) {
        if (meta.idle) {
          socket.destroy(); // Kill idle keep-alive connections
        }
        // Active connections will be destroyed when their request finishes (above)
      }
    },
    count: () => connections.size,
  };
}
// server.js
import express from 'express';
import { trackConnections } from './connection-tracker.js';

const app = express();
const server = app.listen(3000);
const tracker = trackConnections(server);

async function shutdown(signal) {
  console.log(`${signal} — draining requests`);
  console.log(`Active connections: ${tracker.count()}`);

  // Destroy idle keep-alive connections immediately
  tracker.destroy();

  server.close(async () => {
    await db.end();
    process.exit(0);
  });

  const timer = setTimeout(() => {
    console.error(`Shutdown timeout with ${tracker.count()} connections remaining`);
    process.exit(1);
  }, 30_000);
  timer.unref(); // don't keep the process alive just for the safety timer
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT',  () => shutdown('SIGINT'));
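If you run Node 18.2 or newer, much of the tracker above is built in: http.Server exposes closeIdleConnections() and closeAllConnections(). A self-contained sketch of the same drain behaviour (the demo's self-request is illustrative):

```javascript
// builtin-drain.js — Node ≥18.2: destroy idle keep-alive sockets on shutdown
import http from 'node:http';

const server = http.createServer((req, res) => res.end('ok'));

export const closedAfterMs = new Promise((resolve) => {
  server.listen(0, async () => {
    const { port } = server.address();
    const agent = new http.Agent({ keepAlive: true });

    // One request over a keep-alive connection, then leave the socket idle.
    await new Promise((done) => {
      http.get({ port, agent }, (res) => res.resume().on('end', done));
    });

    const t0 = Date.now();
    server.close(() => resolve(Date.now() - t0)); // would hang on the idle socket...
    server.closeIdleConnections();                // ...without this call
  });
});
```

server.close() still needs to be called first so no new connections are accepted; closeAllConnections() is the harsher variant that also kills in-flight requests, best reserved for the hard-timeout path.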

Alternatively, use the well-maintained stoppable npm package (4M+ weekly downloads) which does this for you:

import stoppable from 'stoppable';

const server = stoppable(app.listen(3000), 10_000); // 10s grace period

async function shutdown(signal) {
  server.stop(async (err) => {
    if (err) console.error('Shutdown error:', err);
    await db.end();
    process.exit(err ? 1 : 0);
  });
}

Draining In-Flight Requests Properly

For APIs with long-running requests (file uploads, heavy queries, streaming responses), you need a request counter so you can know exactly when it's safe to exit:

// in-flight-tracker.js
let inFlight = 0;
let shuttingDown = false;

export function inFlightMiddleware(req, res, next) {
  if (shuttingDown) {
    // Reject new requests during shutdown with 503
    res.set('Connection', 'close');
    return res.status(503).json({
      error: 'Service shutting down',
      retryAfter: 5,
    });
  }

  inFlight++;
  // 'close' fires even when the client aborts; 'finish' only fires on a
  // complete response, which would leak the counter on aborted requests.
  res.on('close', () => {
    inFlight--;
    if (shuttingDown && inFlight === 0) {
      console.log('All in-flight requests drained');
    }
  });

  next();
}

export function startShutdown() {
  shuttingDown = true;
  return new Promise((resolve) => {
    if (inFlight === 0) return resolve();
    const interval = setInterval(() => {
      console.log(`Waiting for ${inFlight} in-flight requests...`);
      if (inFlight === 0) {
        clearInterval(interval);
        resolve();
      }
    }, 500);
  });
}
import express from 'express';
import { inFlightMiddleware, startShutdown } from './in-flight-tracker.js';

const app = express();
app.use(inFlightMiddleware); // Must be first middleware

async function shutdown(signal) {
  console.log(`${signal} — draining requests`);

  try {
    await Promise.race([
      startShutdown(),
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error('Drain timeout')), 25_000)
      ),
    ]);
  } catch (err) {
    // Without this catch, the drain-timeout rejection would escape as an
    // unhandled rejection instead of falling through to cleanup.
    console.error(err.message, '— proceeding with requests still in flight');
  }

  await db.end();
  await redis.quit();
  process.exit(0);
}

Resource Cleanup Order

Order matters. Clean up in reverse dependency order:

async function shutdown(signal) {
  console.log(`${signal} — graceful shutdown started at ${new Date().toISOString()}`);

  const steps = [
    // 1. Stop health check passing (Kubernetes removes from load balancer)
    () => {
      isHealthy = false;
      console.log('Health check disabled');
    },

    // 2. Small delay so load balancer picks up health status
    () => new Promise(resolve => setTimeout(resolve, 2_000)),

    // 3. Stop accepting new connections
    () => new Promise((resolve, reject) => server.close((err) => err ? reject(err) : resolve())),

    // 4. Drain in-flight requests
    () => startShutdown(),

    // 5. Flush message queue / complete pending jobs (close() options vary by queue library)
    () => queue.close({ timeout: 10_000 }),

    // 6. Close cache connections
    () => redis.quit(),

    // 7. Close database pool last (queries may be running until step 4 completes)
    () => db.end(),
  ];

  for (const step of steps) {
    try {
      await step();
    } catch (err) {
      console.error('Shutdown step failed:', err.message);
    }
  }

  console.log('Graceful shutdown complete');
  process.exit(0);
}

Kubernetes Configuration

Your Node.js code and Kubernetes config must be aligned. The key rule: terminationGracePeriodSeconds must always be greater than your application shutdown timeout.

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0     # Never remove a pod before its replacement is ready
      maxSurge: 1           # Add 1 extra pod during rollout
  template:
    spec:
      terminationGracePeriodSeconds: 60  # Must be > app shutdown timeout (we use 30s)
      containers:
        - name: api
          image: my-api:latest
          ports:
            - containerPort: 3000
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 10
          lifecycle:
            preStop:
              exec:
                # Give load balancer 5s to route away before SIGTERM fires
                command: ["/bin/sleep", "5"]
          env:
            - name: SHUTDOWN_TIMEOUT_MS
              value: "30000"

The preStop hook is critical. Kubernetes sends SIGTERM at the same time it removes the pod from the load balancer's endpoint list — but the load balancer takes a few seconds to propagate. Without the preStop sleep, your pod receives SIGTERM and starts refusing connections while the load balancer still routes traffic to it, causing 5–10 seconds of 503 errors on every deploy.

With preStop: sleep 5 and terminationGracePeriodSeconds: 60, you get:

  • 0s: Pod enters terminating state, Kubernetes starts removing from endpoints
  • 5s: preStop completes, SIGTERM sent to container
  • 5–35s: Application drains connections
  • 35s: process.exit(0) if clean, or...
  • 60s: Kubernetes sends SIGKILL (safety net)
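The SHUTDOWN_TIMEOUT_MS variable from the manifest can then drive the application-side safety net, keeping the two numbers in one place (the armSafetyNet helper is illustrative; the variable name matches the Deployment above):

```javascript
// The shutdown budget comes from the environment so the manifest is the source of truth
const SHUTDOWN_TIMEOUT_MS = Number(process.env.SHUTDOWN_TIMEOUT_MS ?? 30_000);

function armSafetyNet(ms = SHUTDOWN_TIMEOUT_MS) {
  const timer = setTimeout(() => {
    console.error(`Shutdown exceeded ${ms}ms — forcing exit`);
    process.exit(1);
  }, ms);
  timer.unref(); // a clean exit shouldn't be blocked by the safety timer itself
  return timer;
}
```

Call armSafetyNet() as the first line of shutdown(); because 30s (app) + 5s (preStop) stays under the 60s grace period, SIGKILL remains a true last resort.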

Health Check Integration

Your /health endpoint should reflect shutdown state:

let isHealthy = true;
let isReady = true;

// Liveness probe — is the process alive?
app.get('/api/health/live', (req, res) => {
  res.status(isHealthy ? 200 : 503).json({ status: isHealthy ? 'ok' : 'shutting_down' });
});

// Readiness probe — should this pod receive traffic?
app.get('/api/health/ready', (req, res) => {
  res.status(isReady ? 200 : 503).json({
    status: isReady ? 'ready' : 'not_ready',
    uptime: process.uptime(),
  });
});

async function shutdown(signal) {
  // First thing: fail readiness check.
  // Kubernetes stops sending new traffic after ~15s (periodSeconds 5 × failureThreshold 3)
  isReady = false;

  // Wait for Kubernetes to notice (3 failed checks × 5s = 15s)
  await new Promise(resolve => setTimeout(resolve, 15_000));

  // Now drain and exit
  // ...
}

Handling Uncaught Exceptions

Unhandled promise rejections and uncaught exceptions should also trigger graceful shutdown rather than an immediate crash — though after an uncaught exception the process state is suspect, so keep the cleanup short:

process.on('uncaughtException', async (err) => {
  console.error('Uncaught exception:', err);
  await logger.error('uncaughtException', { error: err.message, stack: err.stack });
  await shutdown('uncaughtException');
});

process.on('unhandledRejection', async (reason) => {
  console.error('Unhandled rejection:', reason);
  await logger.error('unhandledRejection', { reason: String(reason) });
  // For unhandled rejections: log, don't exit — they may be non-fatal
  // If your policy is to exit: call shutdown() here
});

Testing Graceful Shutdown

Don't deploy without testing it. A simple test:

// test/graceful-shutdown.test.js
import { describe, it, beforeEach, afterEach } from 'node:test';
import assert from 'node:assert';
import { spawn } from 'node:child_process';

describe('Graceful Shutdown', () => {
  let proc;

  // Spawn a fresh server per test — the first test SIGTERMs the process,
  // so a shared `before` would leave later tests without a server.
  beforeEach(async () => {
    proc = spawn('node', ['server.js'], {
      env: { ...process.env, PORT: '3100' },
    });
    // Wait for server to start
    await new Promise(resolve => setTimeout(resolve, 500));
  });

  afterEach(() => proc?.kill('SIGKILL'));

  it('completes in-flight requests before exiting', async () => {

    // Start a slow request
    const requestPromise = fetch('http://localhost:3100/api/slow'); // takes 2s

    // Send SIGTERM 200ms later
    await new Promise(resolve => setTimeout(resolve, 200));
    proc.kill('SIGTERM');

    // The slow request should still complete
    const res = await requestPromise;
    assert.strictEqual(res.status, 200);
  });

  it('returns 503 for new requests after shutdown starts', async () => {
    await new Promise(resolve => setTimeout(resolve, 200));
    proc.kill('SIGTERM');
    await new Promise(resolve => setTimeout(resolve, 100));

    const res = await fetch('http://localhost:3100/api/data').catch(() => null);
    // Either 503 or connection refused — both are correct
    assert.ok(!res || res.status === 503);
  });
});

Production Checklist

Before deploying, verify:

  • [ ] SIGTERM and SIGINT handlers are registered
  • [ ] server.close() is called (stops accepting new connections)
  • [ ] Keep-alive connections are explicitly destroyed or use stoppable
  • [ ] In-flight requests are tracked and drained
  • [ ] Database/cache connections are closed in dependency order
  • [ ] A hard timeout (setTimeout with process.exit(1)) is set as a safety net
  • [ ] terminationGracePeriodSeconds > app shutdown timeout + preStop delay
  • [ ] Readiness probe is failed before draining (so load balancer routes away)
  • [ ] Graceful shutdown is covered by integration tests

Summary

Graceful shutdown is a first-class operational concern, not an afterthought. The pattern is always the same:

  1. Fail readiness → load balancer stops routing
  2. Stop accepting → server.close()
  3. Drain connections → destroy idle keep-alives, wait for active requests
  4. Clean up resources → queues, cache, database (in reverse dependency order)
  5. Exit cleanly → process.exit(0) with a hard timeout safety net

For APIs published on 1xAPI, this is especially important — clients retry on failure, and a graceful shutdown means their retry hits the new version instead of getting a connection reset.

Implement it once, test it properly, and never drop a request on deploy again.
