Before diving in, I must admit that the title is a bit vague. Let me describe the issue first, because the solution below is not a silver bullet and only suits specific runtime environments.
In our company we use Google Cloud Run to deploy web applications, and every app is built into a Docker image. For now we use Cloud Run's default memory limit, which is 256 MB per container. Recently we started to notice that some of the applications exceed this limit, causing a container to restart and in some cases even resulting in downtime of a service.
These applications run as Node processes (a Next.js server and a SvelteKit server), and the Docker entrypoint is just pnpm run start, which executes the corresponding command under the hood.
Troubleshooting
Unfortunately, it's impossible to get inside a running instance in Cloud Run, so the only way to debug a problem is via logs. The web interface doesn't give you any information about running processes, just a chart of memory utilization for your containers.
On this chart, we can see that sometimes there is simply not enough memory for a service.
My initial idea was to periodically run top inside a container and log the output, then see how different conditions, such as heavy traffic, affect memory usage. There's a tool called multi-start that helps you run multiple processes inside a Docker container. I slightly modified its code to execute top -bcn1 -o %MEM -w256 at a fixed interval and log the result to stdout. But right after adding it to the image and deploying it to Google Cloud, I noticed something strange:
1) Total memory consumption according to top is 160 MB, but the chart above says the container uses up to 90% of its memory. Well, memory calculation can be interesting sometimes.
2) For some reason, the container thinks it has 1 GB of RAM allocated, yet the limit in Cloud Run is set to 256 MB.
3) pnpm run start consumes almost the same amount of memory as the Next.js server, even though it's just a runner for the actual command!
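For reference, here is a minimal sketch of that kind of periodic snapshot written as a small Node script. It is not the modified multi-start code: the helper names and the TOP_LOG_INTERVAL_MS environment variable are made up for illustration, and it assumes a procps-style top is available in the image.

```javascript
const { execFile } = require('node:child_process');
const os = require('node:os');

// Same flags as above: batch mode, full command lines, one iteration,
// sorted by memory usage, output width of 256 columns.
const TOP_ARGS = ['-bcn1', '-o', '%MEM', '-w256'];

const toMiB = (bytes) => Math.round(bytes / (1024 * 1024));

// Hypothetical helper: total RAM as reported by the OS next to the
// resident set size of the current Node process.
function memorySummary() {
  const total = toMiB(os.totalmem());
  const rss = toMiB(process.memoryUsage().rss);
  return `total=${total}MiB rss=${rss}MiB`;
}

// Hypothetical helper: logs one summary line plus one `top` snapshot.
function logSnapshot(log = console.log) {
  log(memorySummary());
  execFile('top', TOP_ARGS, (error, stdout) => {
    log(error ? `top failed: ${error.message}` : stdout);
  });
}

// Hypothetical helper: repeats the snapshot every `intervalMs` milliseconds
// and returns the timer so the caller can stop it with clearInterval.
function startSnapshotLogger(intervalMs, log = console.log) {
  return setInterval(() => logSnapshot(log), intervalMs);
}

// Opt-in only: nothing runs unless the (made-up) variable is set.
if (process.env.TOP_LOG_INTERVAL_MS) {
  startSnapshotLogger(Number(process.env.TOP_LOG_INTERVAL_MS));
}
```

Everything goes to stdout, so the snapshots end up in the Cloud Run logs next to the application output. Comparing the total value with the configured limit is a quick way to spot a mismatch like the one in point 2.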
Solution
In fact, the only thing I had to do was replace pnpm run start with the corresponding script from package.json. For the Next.js apps that was node ./node_modules/next/dist/bin/next start, and for SvelteKit it was node ./build (you might have a different setup).
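If you are not sure what the direct command for your app looks like, a small helper can derive it from package.json. This is only a sketch under stated assumptions: directStartCommand and readJsonOrNull are hypothetical names, and it assumes the start script is a single command whose name matches its package name (as with next start). Anything it cannot resolve is returned unchanged.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: parse a JSON file, or return null if it is missing/invalid.
function readJsonOrNull(file) {
  try {
    return JSON.parse(fs.readFileSync(file, 'utf8'));
  } catch {
    return null;
  }
}

// Hypothetical helper: turn `scripts.start` into a direct `node ...` command.
function directStartCommand(projectDir = process.cwd()) {
  const project = readJsonOrNull(path.join(projectDir, 'package.json'));
  const start = project && project.scripts && project.scripts.start;
  if (!start) return null;

  const [command, ...args] = start.trim().split(/\s+/);
  if (command === 'node') return start; // already a direct invocation

  // Assumption: the command name equals the package name (true for `next`).
  const pkgDir = path.join(projectDir, 'node_modules', command);
  const pkg = readJsonOrNull(path.join(pkgDir, 'package.json'));
  if (!pkg || !pkg.bin) return start;

  // `bin` is either a single path or a map of command name to path.
  const bin = typeof pkg.bin === 'string' ? pkg.bin : pkg.bin[command];
  if (!bin) return start;

  const rel = path.relative(projectDir, path.join(pkgDir, bin));
  return ['node', `./${rel.split(path.sep).join('/')}`, ...args].join(' ');
}

// Prints the resolved command for the current directory (or null).
console.log(directStartCommand());
```

Run it once in the project root and paste the printed command into the Docker entrypoint. Resolving it at container start would bring back an extra wrapper process, which is exactly what we are trying to avoid.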
The results are noticeable: the change immediately saved almost 60 MB of memory, which is a lot in such circumstances. The drop is also reflected in the dashboard.
Conclusion
This may sound like an edge case, but if you run your Node application in an environment with such limited resources, try invoking the scripts directly instead of wrapping them in a pnpm command.
It's also unclear to me why pnpm is this memory-hungry. If anyone has some ideas, please share them.
Hope it helps!