Orel Bello for AWS Community Builders

Posted on Apr 24 • Originally published at Medium on Oct 21, 2024

Pay Less For Serverless: Practical Tips

#melioengineering #aws #lambda #serverless

Intro

We all know the benefits of a serverless architecture. The concept is pretty simple: we pay AWS to manage the infrastructure for us so that we can focus solely on development instead of handling and maintaining servers.

But what about the costs?

In a small environment with infrequent access, the serverless architecture can actually save you money — for example, when you don’t have traffic, the environment scales to zero and you don’t pay at all.

But in a large environment, such as ours at Melio (where all of our architecture is serverless), the price can spike and reach over $100K a month on the Lambda functions alone. So what can we do to optimize it?

The first thing we need to do is determine which services are used in a serverless architecture, and then we can see how to optimize them.

This blog post will explore the various strategies for cost optimization in a serverless architecture, focusing on services and best practices to ensure efficient spending.

Who am I and why do I care about cloud costs?

My name is Orel Bello, and for the last two years I’ve been working as a DevOps Engineer at Melio. I’m an AWS Certified Solutions Architect Professional and Melio’s focal point for FinOps.

Since I started using AWS, I have paid attention to the price of every resource, as pricing is a big part of the AWS Solutions Architect Associate certification that I went through at the beginning of my cloud journey. So I knew we had a lot to cut.

Recently Melio started the enrollment process for the AWS EDP (Enterprise Discount Program), which calls for cost optimization beforehand, so let’s start saving money:

Lambda Pricing

Before optimizing Lambda costs, it’s important to understand the pricing model.

You are charged based on execution time (measured in milliseconds) and the amount of memory allocated, plus a small fee per request.

For example, a function with 128MB of memory (which costs $0.0000000021 per millisecond) and an execution time of 3 seconds would cost $0.0000000021 * 3000 = $0.0000063 in duration charges per invocation.

If you double the memory and halve the execution time, the cost will remain roughly the same. However, the performance improvement might vary depending on the task.
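The arithmetic above can be sketched in a few lines of Python. This is only an illustration: it uses the 128MB price quoted in this post, assumes the per-millisecond price scales linearly with memory, and ignores the per-request fee. The function name `invocation_cost` is made up for this example.

```python
# Price per millisecond for a 128 MB function, taken from the example above.
PRICE_PER_MS_128MB = 0.0000000021


def invocation_cost(memory_mb: int, duration_ms: float) -> float:
    """Estimate the duration charge for a single invocation.

    Assumption: the per-millisecond price grows linearly with memory.
    """
    price_per_ms = PRICE_PER_MS_128MB * (memory_mb / 128)
    return price_per_ms * duration_ms


# 128 MB for 3 seconds versus 256 MB for 1.5 seconds.
baseline = invocation_cost(128, 3000)
doubled_memory = invocation_cost(256, 1500)
print(f"128 MB / 3.0 s: ${baseline:.7f}")
print(f"256 MB / 1.5 s: ${doubled_memory:.7f}")
```

Both lines print the same amount, which is the point: doubling the memory only costs more if the execution time does not drop accordingly.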

Remember, each Lambda execution environment handles only one request at a time, so concurrent requests run in separate environments. Therefore, more requests lead to more invocations, which increases costs.

1. Introducing AWS Lambda Power Tuning:

So you’ve just created a new Lambda function. How do you choose how much RAM to allocate? (While you can’t set the vCPU allocation directly, Lambda allocates CPU in proportion to memory, so increasing RAM also gives the function more CPU.)

This open-source tool can help you optimize your Lambda function and suggest the best power configuration to minimize cost and/or maximize performance.

It runs your function with several memory configurations, reports the average execution time and cost for each, and suggests the best value for RAM.

So by increasing RAM based on the results, you can make your Lambda function run faster, and you’ll often pay less (or about the same) because the execution time is reduced.

2. Why not just set the timeout to the max value?

Setting the timeout to the maximum can be costly because you are charged for every millisecond your Lambda function runs. If an error occurs and the function simply waits for the timeout (for example, when you’re accessing an unresponsive API), you will incur unnecessary charges. Therefore, it’s crucial to set the timeout to fit your specific needs.

To determine the correct timeout value for your Lambda function, you can use CloudWatch metrics or the Lambda Power Tuning tool. Both report the average execution time; add a buffer for safety on top of it and set the timeout accordingly.
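As a rough sketch of the "observed duration plus buffer" idea, the helper below derives a timeout from a list of measured durations. The function name, the 1.5x buffer, and the use of a high percentile instead of the plain average are assumptions made for this example (an average can hide slow but legitimate invocations), not something the tools above prescribe.

```python
import math

LAMBDA_MAX_TIMEOUT_S = 900  # Lambda's maximum timeout (15 minutes)


def suggest_timeout_seconds(durations_ms, percentile=0.99, buffer_factor=1.5):
    """Suggest a timeout in whole seconds from observed durations in ms."""
    if not durations_ms:
        raise ValueError("need at least one observed duration")
    ordered = sorted(durations_ms)
    # Nearest-rank percentile: smallest sample covering `percentile` of the data.
    rank = max(1, math.ceil(percentile * len(ordered)))
    high_ms = ordered[rank - 1]
    # Add the safety buffer and round up, since timeouts are set in seconds.
    seconds = math.ceil(high_ms * buffer_factor / 1000)
    return min(LAMBDA_MAX_TIMEOUT_S, max(1, seconds))


# Durations in milliseconds, e.g. exported from CloudWatch.
print(suggest_timeout_seconds([800, 900, 1000, 1200]))
```

A function that normally finishes in about a second then gets a timeout of a couple of seconds rather than the 15-minute maximum.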

3. Don’t put all your code inside the Lambda handler

Lambda functions run inside an execution environment (a microVM) that can be reused across invocations. However, the handler itself is executed fresh on every call. If you set up resources like a database connection within the handler, they are recreated with every call, slowing performance and potentially increasing costs.

To improve performance and cut expenses, it’s best practice to set up lasting resources, such as database connections, outside the handler. This enables subsequent invocations to reuse these established resources, leading to quicker execution and savings.
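Here is a minimal sketch of that pattern in Python. The in-memory `sqlite3` database is only a stand-in so the example runs anywhere; in a real function the module-level object would be your actual database client or SDK client, and the table name is hypothetical.

```python
import sqlite3

# Created once per execution environment (during the cold start) and
# reused by every invocation that lands on the same environment.
# sqlite3 is just a stand-in for a real database connection here.
connection = sqlite3.connect(":memory:", check_same_thread=False)
connection.execute("CREATE TABLE IF NOT EXISTS visits (id INTEGER PRIMARY KEY)")


def handler(event, context):
    # Only per-request work happens inside the handler.
    connection.execute("INSERT INTO visits DEFAULT VALUES")
    connection.commit()
    count = connection.execute("SELECT COUNT(*) FROM visits").fetchone()[0]
    return {"visits": count}
```

Calling `handler` twice in the same environment returns a count of 1 and then 2, which shows that the second call reused the connection (and its state) instead of creating a new one.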

4. Migrate to the ARM-based AWS Graviton processor:

Using ARM architecture with Graviton processors instead of x86 processors can reduce the overall cost of your Lambda function by up to 20% while improving performance by up to 19%!

The migration itself is pretty simple: unless you have dependencies or libraries that are compiled for x86, you don’t need to take any further steps when migrating to the Graviton processor.

Of course, it’s always best practice to run tests on Dev environments first before making changes on Production, but the transition itself should be pretty seamless.

5. Provisioned Concurrency — Don’t use it recklessly!

Provisioned Concurrency keeps your Lambda functions ‘warm’ and ready for action, making them execute faster by eliminating cold starts and improving performance.

Keep in mind that you’re billed for the number of provisioned concurrency units and for how long they’re active, so if you use it recklessly, it can become very expensive.

So what do you need to do?

Use Provisioned Concurrency only for production workloads with user-facing functions, and avoid using it in development environments.
Provision the minimum amount of concurrency that your function needs. You can determine it by analyzing application traffic patterns and performance requirements, for example with the CloudWatch ProvisionedConcurrencyUtilization metric. Remember that over-provisioning will just cause extra costs.
Use the auto-scaling feature of Provisioned Concurrency to gradually scale your function based on utilization, ensuring you avoid over-provisioning.

Also, functions with shorter execution times require less Provisioned Concurrency, so if you optimize your code and RAM configuration, and lower your execution time, you can also save money on the Provisioned Concurrency.

Remember: A serverless environment will not cost you money when there is no traffic, but you will pay for the provisioned concurrency! So even if you have an inactive environment, you must take it into account.
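To see why shorter execution times need less Provisioned Concurrency, a common rule of thumb is that the concurrency required by a steady load is roughly the request rate multiplied by the average duration. The sketch below applies that rule; the function name and the sample numbers are made up for illustration, and real traffic is rarely perfectly steady.

```python
import math


def required_concurrency(requests_per_second: float, avg_duration_ms: float) -> int:
    """Estimate concurrent executions needed for a steady load."""
    return math.ceil(requests_per_second * avg_duration_ms / 1000)


# Same traffic, before and after optimizing the function's duration.
print(required_concurrency(100, 500))
print(required_concurrency(100, 200))
```

With the same 100 requests per second, cutting the average duration from 500 ms to 200 ms lowers the estimate from 50 to 20 concurrent executions.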

6. Don’t Use Sleep:

Have you ever needed to wait for an operation running outside the Lambda function to finish? Did you use ‘sleep’ while you waited?

For those of you who aren’t familiar with the sleep method, it’s pretty straightforward: you specify the amount of time you want the function to wait for the external operation to finish.

So why is it bad practice to use it inside a Lambda function?

As you may have already guessed, it’s because we pay for the time the Lambda function spends waiting.

So what can we do instead of using sleep?

7. Introducing Step Functions:

Step Functions is a serverless orchestration service that integrates natively with Lambda and many other services, and lets you build a workflow as a state machine.

This can help us split a large Lambda function that needs to wait for an I/O operation into smaller functions, and put the logic that waits and checks whether the I/O operation has finished between them, outside of our Lambda functions, so we won’t pay for a function while it’s waiting!

So if the wait is free on Step Functions, what is the pricing?

We pay per state transition (in Standard workflows).

Let’s take a look at a common use case:

We trigger an operation from a Lambda function and set up a loop that checks whether it’s done, with a ‘Wait’ state between each check.

If we want to save costs, we can set a longer wait time, which lowers the number of transitions and reduces the overall cost.

For a small state machine, it’s pretty insignificant, but on a large scale, this can get expensive.
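The effect of the wait time on the bill can be estimated with a small calculation. Everything specific here is an assumption for illustration: the loop is taken to be Wait, then a status check, then a Choice state (three transitions per polling round), and the price per transition is a placeholder you should replace with the current price for your Region.

```python
import math

# Assumed price per state transition; check current pricing for your Region.
ASSUMED_PRICE_PER_TRANSITION = 0.000025
# Assumed loop shape: Wait -> check status -> Choice.
TRANSITIONS_PER_POLL = 3


def polling_transitions(operation_seconds: float, wait_seconds: float) -> int:
    """Count the transitions spent polling until the operation finishes."""
    polls = math.ceil(operation_seconds / wait_seconds)
    return polls * TRANSITIONS_PER_POLL


def polling_cost(operation_seconds: float, wait_seconds: float, executions: int) -> float:
    """Estimate the polling cost across many executions."""
    transitions = polling_transitions(operation_seconds, wait_seconds)
    return transitions * executions * ASSUMED_PRICE_PER_TRANSITION


# A 10-minute operation, polled every 10 seconds versus every 60 seconds.
print(polling_transitions(600, 10), polling_transitions(600, 60))
```

Under these assumptions, polling a 10-minute operation every 10 seconds takes 180 transitions, while polling every 60 seconds takes 30. The trade-off is that a longer wait can delay the moment the workflow notices the operation has finished.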

8. Compute Savings Plan:

So what is the AWS Compute Savings Plan?

You basically commit to a consistent amount of compute usage for a term of 1 or 3 years, and in exchange you get a discount of up to 17% on Lambda (the Compute Savings Plan also applies to EC2 and Fargate, where the discount can reach up to 66%).

The pricing model of a Savings Plan is more flexible than RIs (Reserved Instances), as you aren’t bound to use a specific instance type or a specific region.

If you’re afraid of the commitment, you can always choose the most basic option of 1 year with no upfront payment. If you’re working at a steady pace with solid usage of Lambda functions, using Savings Plans should be a no-brainer.

9. Logs — storing is cheap, writing is EXPENSIVE

Logs are crucial; we just can’t live without them.

But do we really need all of our logs? There are several log levels, such as DEBUG, INFO, WARN, ERROR, and FATAL, listed here from the most common to the least common.

Do we really need to write them at such a high frequency? Is every INFO message really needed?

Also, if we’re using a third-party monitoring tool, which itself costs a lot, do we really need to write the logs to CloudWatch as well?

We need to understand that nothing is free and writing logs costs money, and with some work, we can save a lot of money!

So what can you do?

Ensure that only crucial logs are written. (You can do so by utilizing the FingersCrossedHandler library, which sends logs only when errors occur.)
Set a retention policy to delete the logs once they’re no longer needed (or archive them in S3 Glacier).
When applicable, consider using the new Infrequent Access log class in CloudWatch, which can save you up to 50% on log ingestion costs. (Note that it doesn’t fit every use case, as it doesn’t support real-time monitoring, metric filters, subscription filters, and log anomaly detection.)
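The "send logs only when errors occur" idea from the first tip can be sketched with Python’s standard `logging` module. This is not the FingersCrossedHandler library itself, just a simplified stand-in written for this example; the class name `BufferUntilErrorHandler` and its `discard` method are hypothetical.

```python
import logging


class BufferUntilErrorHandler(logging.Handler):
    """Hold records in memory and forward them only if an error occurs."""

    def __init__(self, target, trigger_level=logging.ERROR):
        super().__init__()
        self.target = target
        self.trigger_level = trigger_level
        self.buffer = []

    def emit(self, record):
        self.buffer.append(record)
        if record.levelno >= self.trigger_level:
            # An error happened: write out everything buffered so far.
            for buffered_record in self.buffer:
                self.target.handle(buffered_record)
            self.buffer.clear()

    def discard(self):
        """Drop buffered records, e.g. after a successful invocation."""
        self.buffer.clear()


logger = logging.getLogger("app")
logger.setLevel(logging.DEBUG)
buffered = BufferUntilErrorHandler(logging.StreamHandler())
logger.addHandler(buffered)


def handler(event, context):
    try:
        logger.info("processing event")  # written only if an error follows
        return {"ok": True}
    finally:
        buffered.discard()
```

On a successful invocation nothing is written at all; when an ERROR is logged, the buffered DEBUG and INFO records leading up to it are written together with it, so the context is still there when you need it.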

10. VPC Endpoints (VPCE)

This awesome feature is not unique to a serverless-based architecture, but it’s a must-have!

Basically, instead of leaving your VPC to reach AWS services via the NAT Gateway, which isn’t cheap, you can use the AWS backbone network to connect to AWS services directly from your VPC, without traversing the public internet.

This solution is more secure, efficient, and cost-effective, and you can use it with different services, such as S3, DynamoDB, ECR, EC2, Lambda, KMS, SSM and so on.

This simple yet powerful feature can reduce your data processing costs and save you some money.
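To get a feel for the potential saving, you can compare the data processing charge of the NAT Gateway with the cost of an endpoint. The helper below is a back-of-the-envelope sketch: the function name is made up, all prices are passed in as parameters, and the figures in the usage line are illustrative placeholders, so plug in the current prices for your Region.

```python
HOURS_PER_MONTH = 730  # assumed average month length in hours


def monthly_savings(gb_per_month, nat_price_per_gb, endpoint_price_per_gb,
                    endpoint_hourly_price=0.0, endpoint_count=1):
    """Estimate the monthly saving from moving traffic off the NAT Gateway."""
    nat_cost = gb_per_month * nat_price_per_gb
    endpoint_cost = (gb_per_month * endpoint_price_per_gb
                     + endpoint_hourly_price * HOURS_PER_MONTH * endpoint_count)
    return nat_cost - endpoint_cost


# Illustrative figures only: 10 TB a month through two endpoints.
print(round(monthly_savings(10_000, 0.045, 0.01, 0.01, 2), 2))
```

The estimate only covers traffic that actually moves to the endpoint; if the NAT Gateway stays in place for other traffic, its own hourly charge still applies.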

Conclusion

Cost optimization in a serverless environment is (almost) all about the Lambda function.

There is no doubt that this kind of cost optimization requires more effort from both DevOps and developers, and there isn’t much low-hanging fruit, but once you define guidelines in your organization and enforce them, you will be able to save a lot of money.
