😱 Production went down… because of a default setting
Have you ever seen this?
The dev team ships a new feature using a modern, elegant, enterprise-ready framework like NestJS. Everything works perfectly on local. QA passes. Production deploys.
And suddenly…
- 500 errors everywhere
- Kubernetes pods restarting
- Logs that tell you absolutely nothing
The root cause?
👉 A 100kb default request body limit that nobody knew existed.
Yes. That tiny default almost ruined our Friday night. 😅
As a DevOps engineer working with Kubernetes and Azure, this was a wake-up call. NestJS is an amazing framework—but defaults are not production-ready by magic.
This post is not about blaming developers. It’s about shared responsibility.
Framework defaults are convenient. Production environments are not.
Below are 20 NestJS defaults and behaviors that every DevOps engineer should be aware of when NestJS lands in a Kubernetes cluster.
1️⃣ The 100kb That Almost Killed PROD (Body Parser)
NestJS (with Express under the hood) uses body-parser. By default, it only accepts 100kb for JSON payloads.
If your API handles file uploads, base64 payloads, or large requests, you’ll get a PayloadTooLargeError (an HTTP 413 at the parser level) that often becomes a silent 500.
👉 Fix it explicitly in main.ts:
app.use(bodyParser.json({ limit: '10mb' }));
Lesson: defaults are optimized for demos, not real traffic.
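For context, here is roughly where that line could live. This is a minimal sketch, assuming the default Express adapter; AppModule, the port, and the 10mb value are placeholders, not recommendations.

```typescript
// main.ts (sketch): AppModule and the port are placeholders for your own app.
import { NestFactory } from '@nestjs/core';
import * as bodyParser from 'body-parser';
import { AppModule } from './app.module';

async function bootstrap(): Promise<void> {
  const app = await NestFactory.create(AppModule);

  // Raise the JSON body limit explicitly instead of relying on the 100kb default.
  // Size it for your largest legitimate payload, not "as big as possible".
  app.use(bodyParser.json({ limit: '10mb' }));

  await app.listen(3000);
}

void bootstrap();
```

Recent Nest releases also expose app.useBodyParser('json', { limit: '10mb' }) on NestExpressApplication, which may be the cleaner option if your version has it. Keep in mind that an Ingress in front of the pod can enforce its own body size limit too.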
2️⃣ HTTP Timeouts Are Not Infinite
Node’s HTTP server does not wait forever, and its default timeouts differ between Node versions.
Long-running requests, slow downstream services, or blocked connections can slowly exhaust your pod resources.
Configure:
- server.timeout
- keepAliveTimeout
- Or enforce limits at the Ingress / Load Balancer level
DevOps takeaway: timeouts are part of capacity planning.
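A minimal sketch of setting these explicitly on a plain Node http.Server. The numbers are assumptions (a load balancer with a 60s idle timeout), and requestTimeout is an extra knob not listed above. In a Nest app, app.getHttpServer() should hand you the same kind of server object.

```typescript
import { createServer, Server } from 'node:http';

// Hypothetical settings object; align the values with your Ingress / load balancer.
export interface TimeoutSettings {
  socketIdleMs: number; // server.timeout: close sockets idle for this long
  keepAliveMs: number;  // server.keepAliveTimeout: idle time allowed between requests
  requestMs: number;    // server.requestTimeout: max time to receive a full request
}

export function applyTimeouts(server: Server, t: TimeoutSettings): Server {
  server.timeout = t.socketIdleMs;
  server.keepAliveTimeout = t.keepAliveMs;
  server.requestTimeout = t.requestMs;
  return server;
}

// Example: keep-alive slightly longer than the assumed 60s load balancer idle timeout,
// so the load balancer closes idle connections before the pod does.
export const tunedServer = applyTimeouts(createServer(), {
  socketIdleMs: 30_000,
  keepAliveMs: 65_000,
  requestMs: 30_000,
});
```

The point is less the exact values and more that they are written down, reviewed, and consistent with whatever sits in front of the pod.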
3️⃣ The Logger That Doesn’t Log (in Production)
console.log is fine… until it isn’t.
In production you want:
- Structured logs (JSON)
- Correlation IDs
- Compatibility with ELK, Datadog, Azure Monitor
Use app.useLogger() with pino or winston.
Logs are not for developers. They are for incident response.
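To make "structured logs" concrete, here is a tiny sketch of a JSON-lines logger. It is a stand-in for pino or winston, not a replacement: JsonLogger and formatLine are made-up names, and the field names are assumptions you would adapt to your log pipeline.

```typescript
// Hypothetical stand-in for pino or winston: one JSON object per line on stdout.
type Level = 'log' | 'error' | 'warn' | 'debug';

export function formatLine(
  level: Level,
  message: unknown,
  context?: string,
  fields: Record<string, unknown> = {}, // e.g. { correlationId: '...' }
): string {
  return JSON.stringify({
    timestamp: new Date().toISOString(),
    level,
    context,
    message: message instanceof Error ? message.message : message,
    ...fields,
  });
}

export class JsonLogger {
  log(message: unknown, ...rest: unknown[]): void { this.write('log', message, rest); }
  error(message: unknown, ...rest: unknown[]): void { this.write('error', message, rest); }
  warn(message: unknown, ...rest: unknown[]): void { this.write('warn', message, rest); }
  debug(message: unknown, ...rest: unknown[]): void { this.write('debug', message, rest); }

  private write(level: Level, message: unknown, rest: unknown[]): void {
    // Simplification: treat the last string argument as the context (usually a class name).
    const last = rest[rest.length - 1];
    const context = typeof last === 'string' ? last : undefined;
    console.log(formatLine(level, message, context));
  }
}
```

Because the class has the log, error, and warn methods Nest expects from a logger, something like app.useLogger(new JsonLogger()) in main.ts should be enough to wire it in. For anything beyond a demo, reach for pino or winston as suggested above.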
4️⃣ Shutdown Hooks: Kubernetes Is Not Polite
When Kubernetes terminates a pod, it sends SIGTERM.
If your app ignores it:
- In-flight requests are dropped
- Users see 5xx errors
- You lose trust
Enable graceful shutdown:
app.enableShutdownHooks();
And drain traffic properly: stop taking new requests and let in-flight ones finish before the process exits.
5️⃣ Environment Variables: .env Is Not Production
@nestjs/config looks for .env files by default.
In Kubernetes, configuration comes from:
- Environment variables
- ConfigMaps
- Secrets
Avoid filesystem dependencies:
ConfigModule.forRoot({ ignoreEnvFile: process.env.NODE_ENV === 'production' });
6️⃣ Global Modules: Convenience vs Control
@Global() modules feel nice.
They also:
- Hide dependencies
- Increase coupling
- Complicate testing
- Make scaling harder
What’s convenient for development can be painful in production.
7️⃣ Dependency Injection Can Betray You
Custom providers and factories can introduce:
- Circular dependencies
- Runtime undefined errors
These issues often appear only under load.
Use dependency graphs and be suspicious of “magic”.
8️⃣ Global Exception Filters Are Mandatory
Without a global exception filter:
- Errors are inconsistent
- Stack traces may leak
- Logs are unstructured
Always normalize errors:
app.useGlobalFilters(...)
Predictable errors = predictable operations.
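Here is a framework-free sketch of what "normalize errors" could mean in practice. The response shape and the normalizeError name are assumptions; the idea is that the catch method of a global filter would call something like this before writing the response.

```typescript
// Hypothetical response shape; adapt the fields to your API contract.
export interface ErrorBody {
  statusCode: number;
  message: string;
  requestId?: string;
}

// Anything that exposes an HTTP status the way Nest's HttpException does.
interface HasStatus {
  getStatus(): number;
  message: string;
}

function hasStatus(e: unknown): e is HasStatus {
  return typeof e === 'object' && e !== null && typeof (e as HasStatus).getStatus === 'function';
}

export function normalizeError(exception: unknown, requestId?: string): ErrorBody {
  if (hasStatus(exception)) {
    const statusCode = exception.getStatus();
    // Client errors keep their message; server errors get a generic one.
    const message = statusCode >= 500 ? 'Unexpected error' : exception.message;
    return { statusCode, message, requestId };
  }
  // Unknown errors: never leak the original message or stack to the client.
  return { statusCode: 500, message: 'Unexpected error', requestId };
}
```

The full exception, stack included, still belongs in your logs. It just should not travel back to the caller.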
9️⃣ Guards Can Kill Performance
Guards run before interceptors and pipes.
Heavy logic here (rate limiting, DB calls) affects every request.
Security ≠ expensive execution.
🔟 Shipping start:dev to Kubernetes (Yes, It Happens)
Always:
- Build with npm run build
- Run node dist/main.js
Otherwise you get:
- Huge images
- Dev dependencies in prod
- Slower startups
I’ve seen 2GB images because of this.
1️⃣1️⃣ CacheModule Is Not Shared
In-memory cache:
- Is pod-local
- Is volatile
- Breaks consistency in clusters
For real workloads, use Redis or another external store.
1️⃣2️⃣ Scheduled Tasks Can Block the Event Loop
@nestjs/schedule runs inside the same process.
Long-running jobs = slower APIs.
Move heavy work out or make it truly async.
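One way to read "truly async" is to split a long loop into chunks and yield to the event loop between them. The sketch below assumes a job that iterates over many items; processInChunks and the chunk size are made up for illustration.

```typescript
import { setImmediate as yieldToEventLoop } from 'node:timers/promises';

// Processes items in small batches and yields between batches,
// so pending I/O (such as incoming HTTP requests) gets a turn.
export async function processInChunks<T, R>(
  items: readonly T[],
  chunkSize: number,
  handle: (item: T) => R,
): Promise<R[]> {
  if (chunkSize < 1) throw new RangeError('chunkSize must be >= 1');
  const results: R[] = [];
  for (let start = 0; start < items.length; start += chunkSize) {
    for (const item of items.slice(start, start + chunkSize)) {
      results.push(handle(item));
    }
    // Hand control back to the event loop before starting the next chunk.
    await yieldToEventLoop();
  }
  return results;
}
```

This only keeps the API responsive. It does not make the job cheaper, so CPU-heavy work is usually still better off in a worker thread or a separate job outside the API pod.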
1️⃣3️⃣ Correlation Headers Belong to Middleware
Headers like:
- X-Request-ID
- X-Correlation-ID
Should be:
- Global
- Mandatory
- Documented
Not copy-pasted across controllers.
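A minimal middleware sketch of that idea. It is deliberately framework-free: Req and Res are made-up structural types that Express request and response objects should satisfy, and the header choice is an assumption.

```typescript
import { randomUUID } from 'node:crypto';

// Minimal structural types so the sketch does not depend on Express typings.
interface Req {
  headers: Record<string, string | string[] | undefined>;
}
interface Res {
  setHeader(name: string, value: string): unknown;
}

export const REQUEST_ID_HEADER = 'x-request-id';

// Reuses the caller's ID when present, otherwise generates one,
// and echoes it back so clients can quote it in bug reports.
export function correlationId(req: Req, res: Res, next: () => void): void {
  const incoming = req.headers[REQUEST_ID_HEADER];
  const id = (Array.isArray(incoming) ? incoming[0] : incoming) || randomUUID();
  req.headers[REQUEST_ID_HEADER] = id;
  res.setHeader('X-Request-ID', id);
  next();
}
```

Registered once with app.use(correlationId) in main.ts, every request gets an ID without any controller knowing about it.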
1️⃣4️⃣ Health Checks Must Be Honest
Kubernetes doesn’t care if port 3000 responds.
It cares if:
- DB is reachable
- Queues are alive
- Dependencies are healthy
Liveness ≠ Readiness.
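The difference can be sketched in a few lines. This is an illustration, not the @nestjs/terminus API: the Check type and the function names are made up, and the actual DB or queue pings are left to you.

```typescript
// A check resolves when the dependency is healthy and rejects otherwise.
export type Check = () => Promise<void>;
export type Status = 'up' | 'down';

export interface ReadinessReport {
  ready: boolean;
  details: Record<string, Status>;
}

// Liveness: "is the process alive?" Keep it dependency-free, so a DB outage
// does not make Kubernetes restart pods that are otherwise fine.
export function liveness(): { status: 'ok' } {
  return { status: 'ok' };
}

// Readiness: "can this pod serve traffic right now?" Every dependency must be up.
export async function readiness(checks: Record<string, Check>): Promise<ReadinessReport> {
  const details: Record<string, Status> = {};
  await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      try {
        await check();
        details[name] = 'up';
      } catch {
        details[name] = 'down';
      }
    }),
  );
  const ready = Object.values(details).every((status) => status === 'up');
  return { ready, details };
}
```

A readiness endpoint would typically answer with 503 when ready is false, so the pod is taken out of rotation instead of being restarted.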
1️⃣5️⃣ Response Size Is a Memory Problem
Large payloads = memory pressure.
Pagination is not optional. It’s a survival mechanism.
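The cheapest version of that mechanism is a hard cap on page size. A small sketch, with made-up defaults (20 per page, 100 max) and a hypothetical clampPagination helper:

```typescript
// Hypothetical defaults; tune them to your payload sizes.
export const DEFAULT_LIMIT = 20;
export const MAX_LIMIT = 100;

// Turns raw query string values into safe pagination parameters.
export function clampPagination(
  rawLimit?: string,
  rawOffset?: string,
): { limit: number; offset: number } {
  const limit = Number.parseInt(rawLimit ?? '', 10);
  const offset = Number.parseInt(rawOffset ?? '', 10);
  return {
    // Missing, invalid, or non-positive limits fall back to the default; big ones are capped.
    limit: Number.isNaN(limit) || limit < 1 ? DEFAULT_LIMIT : Math.min(limit, MAX_LIMIT),
    // Missing, invalid, or negative offsets start from the beginning.
    offset: Number.isNaN(offset) || offset < 0 ? 0 : offset,
  };
}
```

With this in place, a client asking for limit=100000 gets 100 rows, not an out-of-memory pod.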
1️⃣6️⃣ API Versioning Is a Day-One Decision
No versioning = broken clients later.
NestJS supports:
- URI versioning
- Header versioning
- Media type versioning
- Custom versioning
Pick one early.
1️⃣7️⃣ WebSockets Don’t Scale by Default
WebSocket connections stick to a pod.
For horizontal scaling, you need:
- Redis adapters
- Message brokers
Stateful connections break stateless assumptions.
1️⃣8️⃣ Database Connection Pools Matter
Default pool sizes are usually small.
Multiply that by pod replicas and you’ll hit DB limits fast.
Tune pools intentionally.
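The arithmetic is simple enough to write down. The helpers below are hypothetical, and so are the numbers in the note after them.

```typescript
// Worst case: every replica opens its full pool at the same time.
export function totalConnections(poolSizePerPod: number, replicas: number): number {
  return poolSizePerPod * replicas;
}

// Largest per-pod pool that still fits the database limit,
// after reserving some connections for migrations, admin tools, and so on.
export function maxPoolSizePerPod(
  dbMaxConnections: number,
  maxReplicas: number,
  reserved = 0,
): number {
  if (maxReplicas < 1) throw new RangeError('maxReplicas must be >= 1');
  return Math.max(0, Math.floor((dbMaxConnections - reserved) / maxReplicas));
}
```

Say the database allows 100 connections, 10 are reserved, and autoscaling can reach 12 replicas. A pool of 10 per pod would ask for 120 connections, while the budget works out to 7 per pod. Rolling updates can briefly run more pods than the replica count, so leave some headroom.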
1️⃣9️⃣ Axios Has No Timeout (Yes, Really)
NestJS HttpModule uses Axios.
Axios defaults:
- No timeout
One hanging dependency can block your service forever.
Always set timeouts.
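With @nestjs/axios, a default timeout can be set where the module is registered. A minimal sketch: PaymentsModule is a made-up name, and 5000 ms is an assumption you should replace with what your downstream service can actually meet.

```typescript
import { Module } from '@nestjs/common';
import { HttpModule } from '@nestjs/axios';

// Hypothetical feature module that calls a downstream HTTP service.
@Module({
  imports: [
    HttpModule.register({
      timeout: 5000, // milliseconds; the Axios default of 0 means "wait forever"
      maxRedirects: 5,
    }),
  ],
})
export class PaymentsModule {}
```

Every HttpService injected from this module then inherits the timeout, so a hanging dependency fails fast instead of holding requests open.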
2️⃣0️⃣ Swagger Is Not Decoration
Outdated Swagger docs cause:
- Misuse
- Invalid payloads
- Production incidents
Automate updates. Treat docs as code.
🧠 Final Thought: DevOps Is Not Just YAML
NestJS is enterprise-ready, but enterprises are messy.
Scalability doesn’t come from frameworks alone. It comes from Dev and Ops understanding how software behaves under pressure.
So next time NestJS lands in your cluster, don’t just review Kubernetes manifests.
👉 Open main.ts.
It might save your weekend.
Thank you for reading! 😊


