Node.js Graceful Shutdown: Stop Dropping Requests in Production
Your deployment just went out. Kubernetes gracefully terminated the old pods. And somehow — somehow — your error tracker lit up with a burst of 502s and connection resets. Users saw broken pages. Your Slack channel erupted.
Sound familiar? You are not alone. Improper shutdown handling is one of the most common yet overlooked causes of dropped requests in production Node.js workloads. The good news: it is entirely fixable, and the fix is simpler than you think.
Why Graceful Shutdown Matters
When a container or process receives a termination signal (SIGTERM), it does not die instantly. The orchestrator — Kubernetes, Docker, systemd — gives it a grace period. During that window, in-flight requests must complete, and new requests should be routed elsewhere.
Without a SIGTERM handler, Node.js exits immediately on signal. Any request being processed at that moment gets cut off mid-response. The client sees a connection reset, a timeout, or a partial response.
With proper graceful shutdown:
- You stop accepting new requests
- You let in-flight requests finish
- You close database connections and cleanup resources
- Then you exit cleanly
Let us build this step by step.
Step 1: Basic SIGTERM Handling
The absolute minimum — catch SIGTERM and exit on your terms:
```javascript
// server.js
const http = require('http');

const server = http.createServer((req, res) => {
  res.writeHead(200);
  res.end('Hello, world!\n');
});

server.listen(3000, () => {
  console.log('Server listening on port 3000');
});

function shutdown() {
  console.log('Shutting down gracefully...');
  server.close(() => {
    console.log('All connections closed. Exiting.');
    process.exit(0);
  });
  // Fallback: force exit after 10 seconds
  setTimeout(() => {
    console.error('Forced shutdown after timeout');
    process.exit(1);
  }, 10000);
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
```
What this does: server.close() stops the server from accepting new connections but keeps existing ones alive. Once all connections drain, the callback fires and we exit cleanly. The setTimeout is a safety net — if something hangs, we do not wait forever.
This is a start, but real applications have more complexity.
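You can sanity-check the handler without a deployment: a process can send SIGTERM to itself. This is a standalone sketch (not part of server.js) showing that the handler runs before exit:

```javascript
// Demonstrates that a SIGTERM handler gets to run before the process exits.
let cleanedUp = false;

process.on('SIGTERM', () => {
  cleanedUp = true; // stand-in for real cleanup work
  console.log('cleanup done:', cleanedUp);
  process.exit(0);
});

// Simulate the orchestrator by signalling ourselves.
process.kill(process.pid, 'SIGTERM');

// Keep the event loop alive long enough for the signal to be delivered.
setTimeout(() => {
  console.error('handler never ran');
  process.exit(1);
}, 1000);
```

In a real deployment the signal comes from the orchestrator; `kill -TERM <pid>` from a shell exercises the same path.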
Step 2: Connection Draining with Express
In a typical Express app, you want to signal the load balancer to stop sending traffic before you start draining. Enter health check endpoints.
```javascript
const express = require('express');

const app = express();
const PORT = 3000;

let isShuttingDown = false;

// Health check endpoint — load balancers hit this
app.get('/health', (req, res) => {
  if (isShuttingDown) {
    res.status(503).json({ status: 'shutting_down' });
  } else {
    res.status(200).json({ status: 'healthy' });
  }
});

// Reject new requests during shutdown
app.use((req, res, next) => {
  if (isShuttingDown) {
    res.set('Connection', 'close');
    res.status(503).json({ error: 'Server is shutting down' });
    return;
  }
  next();
});

// Your routes
app.get('/', (req, res) => {
  res.json({ message: 'Hello, world!' });
});

const server = app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});

function shutdown() {
  console.log('Received shutdown signal');
  isShuttingDown = true;

  // Tell load balancer to stop routing here
  console.log('Marking server as unhealthy...');

  // Wait a bit for the load balancer to notice
  setTimeout(() => {
    console.log('Closing server...');
    server.close(() => {
      console.log('All connections drained. Exiting.');
      process.exit(0);
    });

    // Force exit after 15 seconds
    setTimeout(() => {
      console.error('Forced shutdown — connections did not drain in time');
      process.exit(1);
    }, 15000);
  }, 5000); // 5s for LB to notice health check failing
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
```
The key insight: set isShuttingDown = true immediately, so the health check returns 503. Your load balancer (or Kubernetes Ingress) will stop routing new traffic. Then wait a few seconds before calling server.close() to give the LB time to update its routing table.
Step 3: The Same Pattern with Fastify
Fastify has a built-in graceful shutdown via close(), but the pattern is the same:
```javascript
const fastify = require('fastify')({ logger: true });

let isShuttingDown = false;

fastify.get('/health', async (request, reply) => {
  if (isShuttingDown) {
    reply.code(503).send({ status: 'shutting_down' });
    return;
  }
  return { status: 'healthy' };
});

fastify.addHook('onRequest', async (request, reply) => {
  if (isShuttingDown) {
    reply.code(503).header('Connection', 'close').send({ error: 'Shutting down' });
    return;
  }
});

// listen() returns a promise — handle startup failure explicitly
fastify.listen({ port: 3000, host: '0.0.0.0' }).catch((err) => {
  fastify.log.error(err);
  process.exit(1);
});

async function shutdown() {
  isShuttingDown = true;
  console.log('Shutting down...');

  // Give LB time to notice
  await new Promise(resolve => setTimeout(resolve, 5000));

  try {
    await fastify.close();
    console.log('Fastify closed. Exiting.');
    process.exit(0);
  } catch (err) {
    console.error('Error during shutdown:', err);
    process.exit(1);
  }
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
```
Fastify's close() method handles draining connections and closing the server for you.
Step 4: Kubernetes Pod Lifecycle Integration
This is where most people get it wrong. Kubernetes has a specific pod termination sequence, and your app needs to cooperate with it.
How Kubernetes Terminates a Pod
- Kubernetes runs the preStop hook (if one is defined), then sends SIGTERM to the container
- It waits up to terminationGracePeriodSeconds (default: 30s), counted from the start of termination, so preStop time eats into this budget
- If the container is still running when the budget runs out, it sends SIGKILL
Your app must finish draining within that window. Here is the critical configuration:
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-node-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-node-app
  template:
    metadata:
      labels:
        app: my-node-app
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          image: my-node-app:latest
          ports:
            - containerPort: 3000
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
```
Why the preStop Hook Matters
When a pod begins terminating, Kubernetes removes it from the Service endpoint list and starts the container shutdown sequence in parallel. But — and this is subtle — kube-proxy on each node updates its iptables rules with a propagation delay, so requests already routed to this pod may keep arriving for a few seconds.
The preStop hook runs before SIGTERM is sent (and its duration counts against the grace period). By sleeping 10 seconds, we give the network fabric time to update. During this sleep, the pod is still serving traffic normally. Only after the sleep does SIGTERM arrive, and by then, the network should have stopped routing new traffic to us.
This is the single most impactful Kubernetes-specific fix for dropped requests.
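If your platform does not expose lifecycle hooks, the same delay can live inside the app. A minimal sketch, assuming a SHUTDOWN_DELAY_MS environment variable (a name chosen here for illustration):

```javascript
const SHUTDOWN_DELAY_MS = Number(process.env.SHUTDOWN_DELAY_MS ?? 10000);

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// The server keeps answering during the delay, exactly like a preStop sleep,
// then drains in-flight requests via server.close().
async function shutdown(server, delayMs = SHUTDOWN_DELAY_MS) {
  console.log('draining: still serving while routing updates');
  await delay(delayMs);
  await new Promise((resolve) => server.close(resolve));
  console.log('drained');
}
```

The trade-off: the delay now ships with the app, so changing it means a new image rather than a manifest edit.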
Step 5: Handling Database Connections and Cleanup
Real apps have database pools, message queue consumers, and other resources. Drain those too:
```javascript
const { Pool } = require('pg');

const pool = new Pool({ max: 20 });

async function shutdown() {
  isShuttingDown = true;
  console.log('Shutting down...');

  // Step 1: Stop accepting new HTTP requests (isShuttingDown flag)
  // Step 2: Wait for load balancer to notice (via preStop or delay)
  await new Promise(resolve => setTimeout(resolve, 5000));

  // Step 3: Close HTTP server — drains in-flight requests
  await new Promise(resolve => server.close(resolve));

  // Step 4: Close database pool
  try {
    await pool.end();
    console.log('Database pool closed.');
  } catch (err) {
    console.error('Error closing DB pool:', err);
  }

  // Step 5: Exit
  console.log('Clean shutdown complete.');
  process.exit(0);
}

function onSignal() {
  // Safety net — never exceed the Kubernetes grace period.
  // Note: this timer must be armed when shutdown starts, not at process
  // startup, or it would kill a healthy process 55 seconds after boot.
  setTimeout(() => {
    console.error('Grace period exceeded. Force exit.');
    process.exit(1);
  }, 55000); // 55s if terminationGracePeriodSeconds is 60s

  shutdown();
}

process.on('SIGTERM', onSignal);
process.on('SIGINT', onSignal);
```
Order matters: close the HTTP server before the database pool. If you close the DB first, in-flight requests will crash trying to query.
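One way to make that ordering explicit and hard to get wrong is a list of named cleanup steps run strictly in sequence (a sketch; the step names are illustrative):

```javascript
// Runs cleanup steps one at a time, in order, and never lets one
// failed step prevent the remaining steps from running.
async function runCleanup(steps) {
  const closed = [];
  for (const step of steps) {
    try {
      await step.run(); // each step settles before the next starts
      closed.push(step.name);
    } catch (err) {
      console.error(`error closing ${step.name}:`, err);
    }
  }
  return closed;
}

// Inside shutdown(): HTTP server first, database pool second.
// await runCleanup([
//   { name: 'http', run: () => new Promise((r) => server.close(r)) },
//   { name: 'db',   run: () => pool.end() },
// ]);
```

Adding a queue consumer or a cache client later is then a one-line change, and the order stays visible in one place.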
Real-World Scenarios
Scenario 1: Kubernetes Rolling Update
You update your Deployment image tag. Kubernetes creates a new pod, waits for it to pass readiness checks, then terminates an old pod. With proper graceful shutdown + preStop hook, the old pod drains completely before exiting. Zero dropped requests.
Scenario 2: Docker Restart
```shell
docker restart my-node-app
```
Docker sends SIGTERM, waits 10 seconds (default), then SIGKILL. If your app handles SIGTERM and drains in under 10 seconds, you are fine. If not, increase the timeout:
```shell
docker restart --time 30 my-node-app
```
Scenario 3: HPA Scale Down
Horizontal Pod Autoscaler scales down and terminates pods. Same mechanics apply — SIGTERM, grace period, drain. Your graceful shutdown code handles all of these identically.
FAQ and Troubleshooting
"I added a SIGTERM handler but requests still drop"
Check how your container starts Node. With the shell form (CMD node server.js), /bin/sh becomes PID 1 and does not forward SIGTERM to your Node process at all. And when Node itself runs as PID 1, the kernel ignores signals that have no explicit handler (the default terminate-on-SIGTERM behavior does not apply), and the process inherits zombie-reaping duties it does not perform. Fix: use the exec form (CMD ["node", "server.js"]), add init: true to your Docker Compose service, or use tini as the init process.
Dockerfile fix:
```dockerfile
RUN apt-get update && apt-get install -y tini
ENTRYPOINT ["tini", "--"]
CMD ["node", "server.js"]
```
"My shutdown takes too long and gets SIGKILLed"
Your total shutdown time (preStop sleep + drain time + cleanup) must be less than terminationGracePeriodSeconds. If you have a 10s preStop sleep and need 20s to drain, set terminationGracePeriodSeconds: 35 (add buffer).
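To keep the in-process force-exit timer in step with the pod spec, you can derive it from an environment variable set alongside terminationGracePeriodSeconds (GRACE_PERIOD_SECONDS below is an illustrative name, not a Kubernetes built-in):

```javascript
// e.g. env: GRACE_PERIOD_SECONDS: "60" in the container spec
const GRACE_PERIOD_SECONDS = Number(process.env.GRACE_PERIOD_SECONDS ?? 30);

// Leave a 5-second buffer so we exit on our own terms, never via SIGKILL.
const FORCE_EXIT_MS = Math.max((GRACE_PERIOD_SECONDS - 5) * 1000, 1000);

function armForceExit() {
  const timer = setTimeout(() => {
    console.error('Grace period exceeded. Force exit.');
    process.exit(1);
  }, FORCE_EXIT_MS);
  timer.unref(); // the timer alone should not keep the process alive
  return timer;
}
```

Now changing the grace period in the manifest automatically moves the in-app deadline with it.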
"HTTP keep-alive connections never drain"
server.close() waits for all connections to close. HTTP keep-alive keeps connections open for reuse, which means server.close() might never call its callback. Fix: destroy keep-alive connections after a timeout.
```javascript
function forceCloseIdleConnections(server, timeout = 10000) {
  setTimeout(() => {
    // closeIdleConnections() destroys only sockets with no active request;
    // closeAllConnections() would also kill in-flight ones. Both need Node 18.2+.
    server.closeIdleConnections?.();
  }, timeout);
}
```
Or set a shorter keep-alive on shutdown:
```javascript
server.on('connection', (socket) => {
  // Note: this only affects sockets opened after shutdown begins;
  // keep-alive sockets that are already open are untouched.
  if (isShuttingDown) {
    socket.setKeepAlive(false);
    socket.setTimeout(5000);
  }
});
```
"I am using PM2, not Kubernetes"
PM2 has its own shutdown mechanism: it sends SIGINT on stop/reload, waits kill_timeout milliseconds (1600 by default), then sends SIGKILL. Listen for SIGINT and raise the timeout if you need longer to drain:
```javascript
process.on('SIGINT', shutdown); // PM2 sends SIGINT on stop/reload
```
Run with: `pm2 start server.js --kill-timeout 15000`
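PM2 can also deliver shutdown as an IPC message instead of a signal (useful on Windows, where POSIX signals are unavailable) when the process is started with --shutdown-with-message:

```javascript
let stopped = false;

function shutdown() {
  stopped = true; // stand-in for the real drain-and-exit routine
  console.log('shutdown triggered');
}

// With: pm2 start server.js --shutdown-with-message
// PM2 sends the string 'shutdown' over the IPC channel instead of SIGINT.
process.on('message', (msg) => {
  if (msg === 'shutdown') shutdown();
});
```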
"Readiness and liveness probes use the same endpoint — is that a problem?"
Ideally, use separate endpoints. Liveness = "is this process alive?" Readiness = "can this process handle traffic?" During shutdown, you want readiness to fail (503) while liveness still passes (200) — otherwise Kubernetes restarts the pod before it finishes draining.
```javascript
app.get('/health/live', (req, res) => {
  res.json({ status: 'alive' });
});

app.get('/health/ready', (req, res) => {
  if (isShuttingDown) {
    res.status(503).json({ status: 'not_ready' });
  } else {
    res.json({ status: 'ready' });
  }
});
```
The Complete Picture
Here is a summary checklist for production-ready graceful shutdown:
- Listen for SIGTERM and SIGINT
- Set a shutdown flag to fail health checks immediately
- Separate liveness and readiness probes — only fail readiness during shutdown
- Add a Kubernetes preStop hook (sleep 5–10s) to let the network update
- Call server.close() after a short delay to drain in-flight requests
- Close resources (DB, queues, file handles) after the HTTP server
- Set a forced exit timeout well below terminationGracePeriodSeconds
- Handle PID 1 signal issues with tini or direct node execution
- Destroy idle keep-alive connections after a reasonable drain window
Implementing graceful shutdown is not optional for production workloads. It is the difference between zero-downtime deployments and angry users. The patterns above handle Kubernetes, Docker, and bare-metal alike. Add them to your template, and you will never drop a request on deployment day again.
Got questions or your own shutdown war stories? Drop them in the comments — I read every one.