Recently, we migrated from Lambda to Fargate to optimize the performance of a Node.js (NestJS) application. Through trial and error, we experimented with various task specs (CPU and memory allocation) to find the optimal settings. Here’s the journey of how we arrived at the best configuration.
Differences from Lambda
At first we ran on Lambda, which gave us a smooth experience: each execution environment handles only one request at a time, so requests never contend with each other for CPU or memory. For short-lived processes, Lambda worked efficiently, and scaling was handled automatically.
However, for long-running batch processes or handling large amounts of data, we began to experience issues like cold starts and timeouts (30 seconds in our setup). That’s when we decided to move to Fargate, which offers more flexible resource management.
Our Experience with Fargate: Trial and Error with Task Specs
One of the strengths of Fargate is that you can finely tune the CPU and memory resources. However, this flexibility comes with the challenge of figuring out the optimal configuration. Since Node.js runs JavaScript on a single main thread, we needed to determine how much CPU and memory the application could actually put to use.
1vCPU 2048MB Configuration
We started with this configuration, but it quickly ran into problems due to Node.js's default heap memory limitations. Memory shortages were frequent, especially when processing large datasets, and garbage collection occurred too often. This led to unstable performance, with tasks crashing frequently.
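One way to make the heap limit explicit, rather than relying on the Node.js default, is to derive a `--max-old-space-size` value from the task memory and pass it through `NODE_OPTIONS`. The following is a minimal sketch under assumptions: the helper names are hypothetical, and the 25% headroom (left for buffers, native memory, and the runtime itself) is a rule of thumb, not a value we measured.

```typescript
// Hypothetical helper: derive a V8 old-space limit (in MB) from the Fargate task memory.
// The 25% headroom is an assumed rule of thumb, not a measured value.
function suggestMaxOldSpaceMb(taskMemoryMb: number, headroomFraction = 0.25): number {
  if (taskMemoryMb <= 0 || headroomFraction < 0 || headroomFraction >= 1) {
    throw new RangeError("taskMemoryMb must be positive and headroomFraction in [0, 1)");
  }
  return Math.floor(taskMemoryMb * (1 - headroomFraction));
}

// Build the value for the container's NODE_OPTIONS environment variable.
function nodeOptionsFor(taskMemoryMb: number): string {
  return `--max-old-space-size=${suggestMaxOldSpaceMb(taskMemoryMb)}`;
}

console.log(nodeOptionsFor(2048)); // --max-old-space-size=1536
console.log(nodeOptionsFor(4096)); // --max-old-space-size=3072
```

The limit actually in effect can be checked at runtime with `v8.getHeapStatistics().heap_size_limit` from the built-in `node:v8` module, which reports the total heap limit in bytes.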
2vCPU 4096MB Configuration
When we doubled the memory and moved to 2 vCPUs, we saw a significant performance improvement. JavaScript still ran on a single main thread, but background work (such as V8's garbage collection helper threads and libuv's thread pool for file I/O) could now run on the second core instead of competing with the main thread. This resulted in much better overall throughput, and this configuration became our go-to choice.
4vCPU 8192MB Configuration
Thinking that more resources would lower latency, we increased to 4vCPU. However, this turned out to be a mistake. Because our application runs JavaScript on a single main thread, the extra cores sat mostly idle, and performance was actually worse than with 2vCPU, which we attribute to added overhead. We quickly learned that over-allocating resources not only results in waste but can also degrade performance.
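Not every CPU and memory pair is accepted by Fargate, so it can help to check a candidate task size before deploying it. The sketch below is an illustration only: `isValidTaskSize` is a hypothetical helper, and the memory ranges (in 1 GB steps for 1, 2, and 4 vCPU) reflect the Fargate task size table as we understand it, so verify them against the current AWS documentation.

```typescript
// CPU is expressed in ECS CPU units (1024 units = 1 vCPU); memory in MB.
// Assumed ranges for 1, 2 and 4 vCPU only; other task sizes are omitted here.
const MEMORY_RANGE_MB: Record<number, { min: number; max: number }> = {
  1024: { min: 2048, max: 8192 },
  2048: { min: 4096, max: 16384 },
  4096: { min: 8192, max: 30720 },
};

// Hypothetical helper: true when memory is inside the range and a multiple of 1024 MB.
function isValidTaskSize(cpuUnits: number, memoryMb: number): boolean {
  const range = MEMORY_RANGE_MB[cpuUnits];
  if (range === undefined) return false; // CPU size not covered by this sketch
  return memoryMb >= range.min && memoryMb <= range.max && memoryMb % 1024 === 0;
}

console.log(isValidTaskSize(1024, 2048)); // true
console.log(isValidTaskSize(2048, 4096)); // true
console.log(isValidTaskSize(4096, 8000)); // false: below the range and not a multiple of 1024
```

All three sizes we tried sit at the minimum memory allowed for their CPU tier, so memory can still be raised without changing the vCPU count.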
Autoscaling Configuration
One of the most important considerations when using Fargate is how to scale the number of tasks. We eventually settled on a task spec of 2vCPU 4096MB, and after that, the benefits of autoscaling became much more apparent.
The Benefits of 2vCPU 4096MB Configuration
We chose this configuration because it strikes the best balance for handling Node.js's single-threaded processing. With 1vCPU, the main thread is consumed by request handling, while garbage collection and I/O operations can choke the available memory and CPU, causing a drop in performance. By upgrading to 2vCPU, the main thread can focus on request processing, while background processes are efficiently handled by the second core.
Additionally, the 4096MB of memory helped reduce task crashes caused by heap memory shortages, resulting in stable operation. This configuration was particularly optimal for requests and batch processes that required significant memory usage.
Scaling Based on 60% CPU Utilization, Between 1-10 Tasks
For autoscaling, we set a rule that adds tasks when CPU usage exceeds 60% and removes tasks when it drops below 60%. The number of tasks adjusts dynamically between 1 and 10.
The reason for this setup is that the 60% CPU utilization threshold served as a good indicator for when a Fargate task was beginning to feel strain. When CPU usage went over 60%, we noticed the request handling would start to slow down. By preemptively increasing the number of tasks, we were able to distribute the load and maintain performance.
By scaling up to 10 tasks, we ensured that even during peak traffic, the system could handle large numbers of requests. Conversely, scaling down to 1 task allowed us to minimize costs during periods of low traffic. This 1-10 range struck the perfect balance between maintaining performance and controlling costs.
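To make the rule concrete, here is a rough model of how a 60% CPU target and a 1-10 task range translate into a task count. It is a simplified proportional calculation for illustration, not the algorithm AWS uses (the real service also applies cooldowns and alarm evaluation periods). The `ScalingRule` type and `desiredTaskCount` function are hypothetical names.

```typescript
// Hypothetical description of the scaling rule from this article.
interface ScalingRule {
  targetCpuPercent: number;
  minTasks: number;
  maxTasks: number;
}

const rule: ScalingRule = { targetCpuPercent: 60, minTasks: 1, maxTasks: 10 };

// Estimate how many tasks would bring average CPU back to the target,
// then clamp the result to the allowed range.
function desiredTaskCount(currentTasks: number, avgCpuPercent: number, r: ScalingRule = rule): number {
  const proportional = Math.ceil((currentTasks * avgCpuPercent) / r.targetCpuPercent);
  return Math.min(r.maxTasks, Math.max(r.minTasks, proportional));
}

console.log(desiredTaskCount(2, 90));   // 3  (scale out: 2 tasks at 90% CPU)
console.log(desiredTaskCount(4, 30));   // 2  (scale in: 4 tasks at 30% CPU)
console.log(desiredTaskCount(10, 100)); // 10 (capped at the maximum)
console.log(desiredTaskCount(1, 10));   // 1  (never below the minimum)
```

The clamp is what keeps costs bounded at peak and keeps at least one task running when traffic is low.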
Conclusion
One key takeaway from moving to Fargate is that, unlike Lambda, you need to take control of resource management yourself. For Node.js applications, over-allocating resources can actually have the opposite effect and degrade performance. Finding the right task specs and effectively utilizing autoscaling has enabled us to manage resources efficiently while keeping costs down.