AttractivePenguin
Node.js Graceful Shutdown: How to Stop Dropping Requests in Production

Your deployment just went out. Kubernetes gracefully terminated the old pods. And somehow — somehow — your error tracker lit up with a burst of 502s and connection resets. Users saw broken pages. Your Slack channel erupted.

Sound familiar? You are not alone. Improper shutdown handling is one of the most common yet overlooked causes of dropped requests in production Node.js workloads. The good news: it is entirely fixable, and the fix is simpler than you think.

Why Graceful Shutdown Matters

When a container or process receives a termination signal (SIGTERM), it does not die instantly. The orchestrator — Kubernetes, Docker, systemd — gives it a grace period. During that window, in-flight requests must complete, and new requests should be routed elsewhere.

Without a SIGTERM handler, Node.js exits immediately on signal. Any request being processed at that moment gets cut off mid-response. The client sees a connection reset, a timeout, or a partial response.

With proper graceful shutdown:

  1. You stop accepting new requests
  2. You let in-flight requests finish
  3. You close database connections and cleanup resources
  4. Then you exit cleanly

Let us build this step by step.


Step 1: Basic SIGTERM Handling

The absolute minimum — catch SIGTERM and exit on your terms:

// server.js
const http = require('http');

const server = http.createServer((req, res) => {
  res.writeHead(200);
  res.end('Hello, world!\n');
});

server.listen(3000, () => {
  console.log('Server listening on port 3000');
});

function shutdown() {
  console.log('Shutting down gracefully...');
  server.close(() => {
    console.log('All connections closed. Exiting.');
    process.exit(0);
  });

  // Fallback: force exit after 10 seconds
  setTimeout(() => {
    console.error('Forced shutdown after timeout');
    process.exit(1);
  }, 10000);
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

What this does: server.close() stops the server from accepting new connections but keeps existing ones alive. Once all connections drain, the callback fires and we exit cleanly. The setTimeout is a safety net — if something hangs, we do not wait forever.

This is a start, but real applications have more complexity.


Step 2: Connection Draining with Express

In a typical Express app, you want to signal the load balancer to stop sending traffic before you start draining. Enter health check endpoints.

const express = require('express');
const app = express();
const PORT = 3000;

let isShuttingDown = false;

// Health check endpoint — load balancers hit this
app.get('/health', (req, res) => {
  if (isShuttingDown) {
    res.status(503).json({ status: 'shutting_down' });
  } else {
    res.status(200).json({ status: 'healthy' });
  }
});

// Reject new requests during shutdown
app.use((req, res, next) => {
  if (isShuttingDown) {
    res.set('Connection', 'close');
    res.status(503).json({ error: 'Server is shutting down' });
    return;
  }
  next();
});

// Your routes
app.get('/', (req, res) => {
  res.json({ message: 'Hello, world!' });
});

const server = app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});

function shutdown() {
  console.log('Received shutdown signal');
  isShuttingDown = true;

  // Tell load balancer to stop routing here
  console.log('Marking server as unhealthy...');

  // Wait a bit for the load balancer to notice
  setTimeout(() => {
    console.log('Closing server...');
    server.close(() => {
      console.log('All connections drained. Exiting.');
      process.exit(0);
    });

    // Force exit after 15 seconds
    setTimeout(() => {
      console.error('Forced shutdown — connections did not drain in time');
      process.exit(1);
    }, 15000);
  }, 5000); // 5s for LB to notice health check failing
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

The key insight: set isShuttingDown = true immediately, so the health check returns 503. Your load balancer (or Kubernetes Ingress) will stop routing new traffic. Then wait a few seconds before calling server.close() to give the LB time to update its routing table.


Step 3: The Same Pattern with Fastify

Fastify has a built-in graceful shutdown via close(), but the pattern is the same:

const fastify = require('fastify')({ logger: true });

let isShuttingDown = false;

fastify.get('/health', async (request, reply) => {
  if (isShuttingDown) {
    reply.code(503).send({ status: 'shutting_down' });
    return;
  }
  return { status: 'healthy' };
});

fastify.addHook('onRequest', async (request, reply) => {
  if (isShuttingDown) {
    reply.code(503).header('Connection', 'close').send({ error: 'Shutting down' });
    return;
  }
});

fastify.listen({ port: 3000, host: '0.0.0.0' }).catch((err) => {
  fastify.log.error(err);
  process.exit(1);
});

async function shutdown() {
  isShuttingDown = true;
  console.log('Shutting down...');

  // Give LB time to notice
  await new Promise(resolve => setTimeout(resolve, 5000));

  try {
    await fastify.close();
    console.log('Fastify closed. Exiting.');
    process.exit(0);
  } catch (err) {
    console.error('Error during shutdown:', err);
    process.exit(1);
  }
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

Fastify's close() method handles draining connections and closing the server for you.


Step 4: Kubernetes Pod Lifecycle Integration

This is where most people get it wrong. Kubernetes has a specific pod termination sequence, and your app needs to cooperate with it.

How Kubernetes Terminates a Pod

  1. Kubernetes runs the container's preStop hook, if one is defined
  2. It sends SIGTERM to the container
  3. It waits for terminationGracePeriodSeconds (default: 30s); the clock starts when termination begins, so preStop time counts against it
  4. If the container is still running, it sends SIGKILL

Your app must finish draining within that window. Here is the critical configuration:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-node-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-node-app
  template:
    metadata:
      labels:
        app: my-node-app
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: app
        image: my-node-app:latest
        ports:
        - containerPort: 3000
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 10"]
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5

Why the preStop Hook Matters

When Kubernetes sends SIGTERM, it also immediately removes the pod from the Service endpoint list. But — and this is subtle — kube-proxy on each node has a propagation delay before iptables rules are updated. Requests already routed to this pod may still arrive for a few seconds after SIGTERM.

The preStop hook runs before SIGTERM is sent. By sleeping 10 seconds, we give the network fabric time to update. During this sleep, the pod is still serving traffic normally. Only after the sleep does SIGTERM arrive, and by then, the network should have stopped routing new traffic to us.

This is the single most impactful Kubernetes-specific fix for dropped requests.


Step 5: Handling Database Connections and Cleanup

Real apps have database pools, message queue consumers, and other resources. Drain those too:

const { Pool } = require('pg');
const pool = new Pool({ max: 20 });

// (`server` and `isShuttingDown` come from the Express example above)
async function shutdown() {
  isShuttingDown = true;
  console.log('Shutting down...');

  // Safety net — never exceed the Kubernetes grace period.
  // Start this timer when shutdown begins, not at process startup.
  const forceExit = setTimeout(() => {
    console.error('Grace period exceeded. Force exit.');
    process.exit(1);
  }, 55000); // 55s if terminationGracePeriodSeconds is 60s

  // Step 1: Stop accepting new HTTP requests
  // Step 2: Wait for load balancer to notice (via preStop or delay)
  await new Promise(resolve => setTimeout(resolve, 5000));

  // Step 3: Close HTTP server — drains in-flight requests
  await new Promise(resolve => server.close(resolve));

  // Step 4: Close database pool
  try {
    await pool.end();
    console.log('Database pool closed.');
  } catch (err) {
    console.error('Error closing DB pool:', err);
  }

  // Step 5: Exit
  clearTimeout(forceExit);
  console.log('Clean shutdown complete.');
  process.exit(0);
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

Order matters: close the HTTP server before the database pool. If you close the DB first, in-flight requests will crash trying to query.
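One way to make that ordering explicit and hard to get wrong is a tiny cleanup registry that runs close functions strictly in registration order. A sketch with placeholder resources (onShutdown and runCleanup are hypothetical helpers of mine, not a library API):

```javascript
// Cleanup tasks run strictly in registration order, so register the
// HTTP server before the database pool.
const cleanupTasks = [];

function onShutdown(name, fn) {
  cleanupTasks.push({ name, fn });
}

async function runCleanup() {
  const completed = [];
  for (const { name, fn } of cleanupTasks) {
    try {
      await fn();
      completed.push(name);
    } catch (err) {
      // A failed task is logged but does not block the remaining ones
      console.error(`cleanup "${name}" failed:`, err);
    }
  }
  return completed;
}

// Placeholder resources for illustration:
onShutdown('http', async () => { /* await new Promise(r => server.close(r)) */ });
onShutdown('db',   async () => { /* await pool.end() */ });
```

Your shutdown() then reduces to setting the flag, waiting for the load balancer, and awaiting runCleanup().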


Real-World Scenarios

Scenario 1: Kubernetes Rolling Update

You update your Deployment image tag. Kubernetes creates a new pod, waits for it to pass readiness checks, then terminates an old pod. With proper graceful shutdown + preStop hook, the old pod drains completely before exiting. Zero dropped requests.

Scenario 2: Docker Restart

docker restart my-node-app

Docker sends SIGTERM, waits 10 seconds (default), then SIGKILL. If your app handles SIGTERM and drains in under 10 seconds, you are fine. If not, increase the timeout:

docker stop --time 30 my-node-app

Scenario 3: HPA Scale Down

Horizontal Pod Autoscaler scales down and terminates pods. Same mechanics apply — SIGTERM, grace period, drain. Your graceful shutdown code handles all of these identically.


FAQ and Troubleshooting

"I added a SIGTERM handler but requests still drop"

Check how your process runs inside the container. If you use the shell form of CMD (CMD node server.js), PID 1 is actually /bin/sh, and the shell does not forward SIGTERM to your Node process. Even when Node itself is PID 1, the kernel ignores default signal dispositions for PID 1, so an unhandled SIGTERM is silently dropped (an explicit handler like the one above still fires). Fix: use the exec form of CMD (["node", "server.js"]), add init: true to your Docker Compose service, or use tini as the init process.

Dockerfile fix:

RUN apt-get update && apt-get install -y tini
ENTRYPOINT ["tini", "--"]
CMD ["node", "server.js"]
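Docker Compose alternative, mentioned above: init: true makes Docker run its built-in init process (tini) as PID 1, which forwards signals to Node. Service name and image are placeholders:

```yaml
# docker-compose.yml
services:
  app:
    image: my-node-app:latest
    init: true   # run Docker's built-in tini as PID 1
```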

"My shutdown takes too long and gets SIGKILLed"

Your total shutdown time (preStop sleep + drain time + cleanup) must be less than terminationGracePeriodSeconds. If you have a 10s preStop sleep and need 20s to drain, set terminationGracePeriodSeconds: 35 (add buffer).
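As a worked budget with the numbers above (all values are illustrative):

```javascript
// Every phase of shutdown must fit inside the grace period,
// including the preStop sleep, which counts against it.
const preStopSleep = 10; // seconds of `sleep` in the preStop hook
const drainTime = 20;    // worst-case in-flight drain + cleanup
const buffer = 5;        // safety margin
const gracePeriod = preStopSleep + drainTime + buffer;
console.log(`terminationGracePeriodSeconds: ${gracePeriod}`); // 35
```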

"HTTP keep-alive connections never drain"

server.close() waits for all connections to close. HTTP keep-alive keeps connections open for reuse, so on Node versions before 19 server.close() might never call its callback (Node 19+ closes idle keep-alive connections automatically on close). Fix: destroy lingering connections after a timeout.

function forceCloseRemainingConnections(server, timeout = 10000) {
  setTimeout(() => {
    server.closeIdleConnections?.(); // Node 18.2+: drop idle keep-alive sockets
    server.closeAllConnections?.();  // last resort: sever anything still open
  }, timeout);
}

Or set a shorter keep-alive on shutdown:

server.on('connection', (socket) => {
  if (isShuttingDown) {
    socket.setKeepAlive(false);
    // setTimeout alone only emits a 'timeout' event; pass a callback
    // to actually destroy the socket after 5 idle seconds
    socket.setTimeout(5000, () => socket.destroy());
  }
});

"I am using PM2, not Kubernetes"

PM2 has its own shutdown mechanism: on stop/reload it sends SIGINT, waits kill_timeout milliseconds, then sends SIGKILL. Listen for that signal:

process.on('SIGINT', shutdown); // PM2 sends SIGINT on stop/reload

Run with: pm2 start server.js --kill-timeout 15000

"Readiness and liveness probes use the same endpoint — is that a problem?"

Ideally, use separate endpoints. Liveness = "is this process alive?" Readiness = "can this process handle traffic?" During shutdown, you want readiness to fail (503) while liveness still passes (200) — otherwise Kubernetes restarts the pod before it finishes draining.

app.get('/health/live', (req, res) => {
  res.json({ status: 'alive' });
});

app.get('/health/ready', (req, res) => {
  if (isShuttingDown) {
    res.status(503).json({ status: 'not_ready' });
  } else {
    res.json({ status: 'ready' });
  }
});
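The matching probe configuration, swapping the shared /health from the earlier deployment for the split endpoints:

```yaml
livenessProbe:
  httpGet:
    path: /health/live
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5
```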

The Complete Picture

Here is a summary checklist for production-ready graceful shutdown:

  • Listen for SIGTERM and SIGINT
  • Set a shutdown flag to fail health checks immediately
  • Separate liveness and readiness probes — only fail readiness during shutdown
  • Add a Kubernetes preStop hook (sleep 5–10s) to let the network update
  • Call server.close() after a short delay to drain in-flight requests
  • Close resources (DB, queues, file handles) after the HTTP server
  • Set a forced exit timeout well below terminationGracePeriodSeconds
  • Handle PID 1 signal issues with tini or direct node execution
  • Destroy idle keep-alive connections after a reasonable drain window

Implementing graceful shutdown is not optional for production workloads. It is the difference between zero-downtime deployments and angry users. The patterns above handle Kubernetes, Docker, and bare-metal alike. Add them to your template, and you will never drop a request on deployment day again.


Got questions or your own shutdown war stories? Drop them in the comments — I read every one.
