This is part four of the “Well-Architected Serverless” series. In this post, we’ll talk about the Cost Optimization (COST) pillar of the Well-Architected Framework (WAF).
The COST pillar concerns itself with the money you spend on your cloud infrastructure. It’s important to think about your system’s cost because, in reality, the perfect system won’t be used simply because it’s too expensive.
That means you have to analyze your security, performance, and reliability requirements and design a system that might not be 100% effective but is as efficient as possible, keeping its costs bearable for day-to-day business.
This also means cost can be traded against performance and reliability. Can your users wait a few hundred milliseconds longer for a result if they only have to pay a tenth of the price? Do your customers want to pay a fortune so that your system is offline for only two hours a year, or is that not worth their money, and they are okay with one hour of downtime every three months?
The COST pillar consists of five parts. Let’s look into them.
The first step to cloud financial management is establishing a cost optimization function at your company: a person or team that spends dedicated time analyzing costs and proposing optimizations.
Next, you should have financial and technical leads at your company work more closely together. Serverless technology enables engineers to make better predictions on how much a customer’s action will actually cost, allowing financial personnel to make better pricing decisions.
Make sure your processes and company culture are aligned with costs when choosing AWS services; this way, an otherwise successful project won’t fail in the end because it’s too expensive to operate.
AWS also helps with tools that forecast the costs of your cloud infrastructure, and you can use Dashbird’s Lambda cost calculator if you’re just starting. Additionally, Dashbird has a built-in cost tracking feature to understand the per-resource cost and changes in costs across your serverless infrastructure.
It’s crucial to get a feeling for what you spend and what is actually used in your system. Luckily, for serverless technology this is often a fairly simple calculation. But as your system grows, you have to think about account structure. At a minimum, have one management account with one member account. This way, you can centralize the billing, making it easier to find out what could be optimized.
Understanding costs in detail across your AWS managed services stack is key to identifying optimization opportunities and making changes that drive better unit economics for your business. With Dashbird, an organization can understand what parts of the applications cost the most and what exactly is driving the cost at a low level (such as latency of Lambda executions or overprovisioning of resources). Once the cost breakdown is known, you can adjust the application toward the best balance of cost, performance, and reliability.
Optimization targets are also required if you want to stay competitive. Make it your team’s goal to improve efficiency at least every six months. This can mean decommissioning resources that aren’t used anymore, which isn’t much of a problem for serverless systems because of the on-demand pricing. But it also means you should be on the lookout for new features that might lower your bill, either through direct savings like shorter invocation times or indirectly by freeing an employee from manual work.
Here’s how Blow Ltd, UK’s leading on-demand beauty services software, reduced their time to discovery from hours to seconds, freeing up their developers’ time from debugging to focus on product development.
Use the right service for a use-case and use the right types, sizes, and numbers of services to do so. Get your data to S3 Glacier if it isn’t used anymore. Think about Lambda configurations and don’t leave everything on default; sometimes, more Lambda resources can lead to shorter invocation times that are cheaper in the end.
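To make the point about Lambda configurations concrete, here is a minimal sketch that compares the per-invocation cost of one function at different memory sizes. The prices are assumptions based on typical on-demand x86 pricing and may not match your region, and the durations are hypothetical measurements, not real benchmarks.

```python
# Assumed on-demand prices; check the current AWS Lambda pricing page for your region.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_REQUEST = 0.20 / 1_000_000


def invocation_cost(memory_mb: int, duration_ms: float) -> float:
    """Cost of one invocation: compute (GB-seconds) plus the request fee."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST


# Hypothetical measurements: memory size in MB -> average duration in ms.
measurements = {128: 2400, 512: 480, 1024: 260}

costs = {mem: invocation_cost(mem, ms) for mem, ms in measurements.items()}
cheapest = min(costs, key=costs.get)

for mem, cost in sorted(costs.items()):
    print(f"{mem:>5} MB: ${cost * 1_000_000:.2f} per million invocations")
print(f"Cheapest configuration: {cheapest} MB")
```

With these made-up numbers, the 512 MB setting ends up cheaper than the 128 MB default because the function finishes so much faster. Whether that holds for your function depends entirely on how its duration scales with memory.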
Don’t forget data transfer costs. Getting data out of AWS is especially costly; you need to include these costs when using third-party services that extract data out of your AWS infrastructure.
You should match up the demand and supply of your resources, so you have enough if a spike hits and don’t pay for things that aren’t used right now. In a serverless system, this is mostly done automatically for you.
Sometimes you have to integrate with non-serverless technology. If your access patterns cause a non-serverless system to scale up, that takes some time, and scaling back down takes time too, during which you pay for capacity that sits idle.
As already mentioned, you should always be on the lookout for optimization opportunities. Just yesterday (December 1st, 2020), AWS released millisecond-based billing for Lambda. Until then, getting a function’s run time below 100 ms gave you no further cost savings, so you could stop optimizing once you hit that mark. Was an off-the-shelf Python script good enough before? It could now be much cheaper to let an optimized Rust binary do the work.
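The effect of the billing change can be sketched with a little arithmetic. The example below rounds a measured duration up to the billing granularity, as duration-based billing does; the two durations (a 95 ms script and a 12 ms optimized binary) are hypothetical values chosen for illustration.

```python
import math


def billed_ms(duration_ms: float, granularity_ms: int) -> int:
    """Round a duration up to the next multiple of the billing granularity."""
    return math.ceil(duration_ms / granularity_ms) * granularity_ms


# Hypothetical durations for the same task.
script_ms = 95   # off-the-shelf script
binary_ms = 12   # optimized binary

# Old model (100 ms increments): both are billed as 100 ms.
old_script = billed_ms(script_ms, 100)
old_binary = billed_ms(binary_ms, 100)

# New model (1 ms increments): the faster binary is billed for far less time.
new_script = billed_ms(script_ms, 1)
new_binary = billed_ms(binary_ms, 1)

saving = 1 - new_binary / new_script
print(f"100 ms billing: {old_script} ms vs {old_binary} ms")
print(f"  1 ms billing: {new_script} ms vs {new_binary} ms ({saving:.0%} less billed time)")
```

Under 100 ms billing, the optimization would not have changed the bill at all; under 1 ms billing, the billed duration shrinks in proportion to the actual run time.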
You should take some time to review new services or new features of existing services to see if they fit your use-cases even better. That way, no competitor can outrun you just by being cheaper.
Well, this time, there is just one question. This is because serverless technology is priced on-demand, which doesn’t require you to think about what you do with resources you don’t use.
The starting point for any optimization journey is first getting a detailed understanding of the current situation and the cost structure of the application.
After understanding the situation and being able to derive insights from it, it’s good to iterate until you find what configuration is the cheapest while still delivering the expected results. Lambda is now billed by the millisecond, so getting an invocation time of 50ms could be worth your while.
Think about Lambda throttling and parallel executions, especially if you don’t charge your customers on-demand yourself. If they pay a fixed amount of money per time frame, but you forget that they can scale up to thousands of parallel invocations, you have a serious problem.
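A rough way to see the size of that problem is to estimate the worst-case compute cost when every allowed concurrent execution is busy for a full hour. This is a sketch under assumptions: the price per GB-second is an assumed on-demand value, the memory size is hypothetical, and request fees and other services are ignored.

```python
# Assumed on-demand compute price; check current pricing for your region.
PRICE_PER_GB_SECOND = 0.0000166667


def worst_case_hourly_cost(concurrency: int, memory_mb: int) -> float:
    """Compute cost of keeping `concurrency` executions busy for one hour."""
    gb_seconds = concurrency * (memory_mb / 1024) * 3600
    return gb_seconds * PRICE_PER_GB_SECOND


# Hypothetical 512 MB function, with and without a concurrency cap.
uncapped = worst_case_hourly_cost(1000, 512)
capped = worst_case_hourly_cost(50, 512)

print(f"1000 parallel executions: ${uncapped:.2f} per hour")
print(f"  50 parallel executions: ${capped:.2f} per hour")

# A cap like this can be set with reserved concurrency, for example via boto3:
#   boto3.client("lambda").put_function_concurrency(
#       FunctionName="my-function",          # hypothetical function name
#       ReservedConcurrentExecutions=50,
#   )
```

If a customer on a flat fee can drive the function to its full concurrency, the uncapped figure is what you pay for every hour they do so. The cap bounds that exposure at the price of throttled requests.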
Tag your resources, so you know to which project they belong.
Keep logging costs with CloudWatch in mind. Keep your retention periods low if not required otherwise and only log what you need.
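As a sketch of how retention could be tightened in bulk, the helper below sets a retention period on every CloudWatch log group that currently never expires. It takes the CloudWatch Logs client as a parameter, so the function itself does not depend on credentials; the 14-day default is an arbitrary assumption, not a recommendation from AWS.

```python
from typing import List


def apply_retention(logs_client, retention_days: int = 14) -> List[str]:
    """Set a retention policy on log groups that have none; return their names."""
    updated = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate():
        for group in page["logGroups"]:
            # Log groups without "retentionInDays" keep their data forever.
            if "retentionInDays" not in group:
                logs_client.put_retention_policy(
                    logGroupName=group["logGroupName"],
                    retentionInDays=retention_days,
                )
                updated.append(group["logGroupName"])
    return updated


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   changed = apply_retention(boto3.client("logs"), retention_days=14)
#   print(f"Updated {len(changed)} log groups")
```

Log groups that already have a retention period are left untouched, so explicitly chosen longer periods (for audit logs, say) are not overwritten.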
Last but not least, integrate managed services directly. You don’t need a Lambda function to integrate a DynamoDB table with an API Gateway. This also applies to many other AWS services. If all your Lambda function does is transform one format into another, you may well be able to do the same with a VTL template, which has no cold starts and no additional cost.
With Dashbird, your serverless infrastructure is constantly analyzed for optimization opportunities, and opportunities for better decision-making are surfaced in real-time.
Serverless technology gives you a detailed insight into your costs and many ways to optimize inefficiencies away. You only pay for what you use, so if you can get away without a Lambda function and directly integrate two services, you don’t have to pay for it.
But as with the other pillars, it’s not just about technology. It’s also about people. You have to keep their skills up to date and foster a culture of curiosity and cost awareness. If your team can learn new skills regularly, it can very well be that they get new ideas from the outside without even actively thinking about it. Use-cases that have been too costly before may now be very affordable.
You can find out more about building complex, Well-Architected serverless architectures in our recent webinar with Tim Robinson (AWS).