CAST AI

Originally published at cast.ai

Zoom, Spotify and others slashed their cloud costs by millions - how did they do it?

In Q1 2021, Zoom reported that its gross margin widened to 73.9% from 69.4% in the previous quarter - primarily thanks to the optimization of public cloud resources.

And Zoom is certainly not the only company that has realized the value of optimizing its cloud infrastructure.

As businesses migrate their workloads to the cloud and build cloud-native applications, they’re starting to realize that overprovisioning and cloud sprawl aren’t just urban legends. 

For startups, the cloud is an essential technology because of its unparalleled support for scalability. But the cloud may quickly turn into a struggle because of growing costs.

Here's what a16z wrote in a recent analysis:

“[...] across 50 of the top public software companies currently utilizing cloud infrastructure, an estimated $100B of market value is being lost among them due to cloud impact on margins — relative to running the infrastructure themselves.”

How can companies deal with the long-term cost implications of the cloud? Cloud cost optimization is the best answer.

3 companies that optimized their cloud costs

1. Spotify developed a cost allocation tool to save millions of dollars

Service ownership is a key problem in taming cloud costs. Tracking which team is responsible for which part of the cloud bill, and keeping those costs down, is a pain.

To handle this, Spotify developed a homegrown solution called Cost Insights that tracks the company’s cloud expenses. By doing that, Spotify allows engineers to take ownership of cloud spend.
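Spotify's own tooling isn't shown here, but the core idea of cost allocation (attributing every billing line item to an owning team) can be sketched in a few lines. Everything below is hypothetical and for illustration only: the `LineItem` type, the team and service names, and the amounts are all made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    service: str     # hypothetical service name
    owner: str       # team that owns the service; "" if untagged
    cost_usd: float  # made-up cost for the billing period


def cost_by_owner(items):
    """Sum cost per owning team; untagged spend is grouped under 'unowned'."""
    totals = defaultdict(float)
    for item in items:
        totals[item.owner or "unowned"] += item.cost_usd
    return dict(totals)


# Made-up billing data, not real figures from any company.
bill = [
    LineItem("playlist-api", "team-a", 1200.0),
    LineItem("search-index", "team-b", 800.0),
    LineItem("playlist-cache", "team-a", 300.0),
    LineItem("legacy-batch", "", 450.0),
]

# Print teams from highest to lowest spend.
for owner, total in sorted(cost_by_owner(bill).items(), key=lambda kv: -kv[1]):
    print(f"{owner}: ${total:,.2f}")
```

Surfacing the "unowned" bucket explicitly is a deliberate choice in this sketch: spend that nobody owns is usually the first thing worth chasing down.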

Spotify also helps its developers by suggesting optimization strategies, such as autoscaling, right within the portal. An internal crowdsourced document called Our Cookbook lets engineers share what has worked for them in terms of system optimizations to help other teams.

Soon enough, developers started to treat cost optimization like a game - bragging about their victories and motivating other teams to play as well. Spotify is planning to add a leaderboard functionality to play on these social and competitive components of cost control.

Result? Spotify reduced its annual cloud spend by millions of dollars. All thanks to helping engineers to make smarter decisions about resource allocation. 

2. Segment optimized its infra and increased margin by 20%

Segment managed to reduce its infrastructure costs by 30% despite increasing traffic volume by 25%, all within six months. 

How come? It was all thanks to incremental optimization of infrastructure decisions.

Here’s one example: 

Segment has an internal validation service written in Node.js that checks incoming messages to ensure that they match the expected message format.

The team ran the service in containers with one full vCPU and 4GB of memory allocated to each. To publish 200,000 messages per second, Segment needed 800 of these containers, as each container processed 250 messages per second.

The service still gets the same allocation, but the team rewrote and optimized the logic.

Result? Today, Segment processes 220,000 messages per second across only 340 containers. The throughput per container more than doubled to reach almost 650 messages per second, allowing Segment to cut its expenses for this component in half. 
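The container counts follow from simple capacity arithmetic. Here's a minimal sketch that uses only the figures quoted above; the function name is illustrative and not something from Segment's codebase.

```python
import math


def containers_needed(total_msgs_per_sec: int, per_container_msgs_per_sec: float) -> int:
    """Smallest number of containers that can absorb the target throughput."""
    return math.ceil(total_msgs_per_sec / per_container_msgs_per_sec)


# Before the rewrite: 200,000 msg/s at 250 msg/s per container.
before = containers_needed(200_000, 250)

# After the rewrite: 220,000 msg/s spread across 340 containers.
per_container_after = 220_000 / 340

# Relative drop in container count (the per-container allocation is unchanged).
reduction = 1 - 340 / before

print(before, round(per_container_after), f"{reduction:.1%}")
# prints: 800 647 57.5%
```

Because each container still gets the same vCPU and memory allocation, the container count is a reasonable proxy for cost: going from 800 to 340 containers is a drop of about 57.5%, which matches the claim that expenses for this component were cut roughly in half.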

By the way, Segment is no stranger to cloud cost optimization. The team shared another great story on its blog about how it handled rapidly growing cloud costs and trimmed its AWS bill by $1 million annually.

3. La Fourche moved to different VMs and saved 69.9% 

The online grocery store La Fourche was running its containers on Amazon Elastic Kubernetes Service (EKS) and soon saw its cloud bill rise from $1,000 to $10,000. This is when the company’s CTO decided that optimization couldn’t wait.

Had La Fourche waited any longer, it might have become the next victim of the never-ending cycle of growing cloud bills and long-term savings plans.

La Fourche started by analyzing the cloud bill in detail. The company turned to the CAST AI Savings Report and ran an agent in read-only mode over its EKS infrastructure. The Savings Report showed that moving to different virtual machines would cut costs substantially.

Before the move, La Fourche was using 15 t3.2xlarge and 2 t3.xlarge instances. At the time of the analysis, the report put the cost of this setup at $4,349.95.

Moving these workloads to 5 c5a.2xlarge instances instead would reduce that cost by a striking 69.9%, to $1,310.40. The following month, La Fourche got a bill that was lower by $3,000.
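The 69.9% figure can be checked from the two totals reported above. This is just the arithmetic, assuming both amounts cover the same billing period; the variable names are ours.

```python
before_cost = 4349.95  # 15 x t3.2xlarge + 2 x t3.xlarge, as reported
after_cost = 1310.40   # 5 x c5a.2xlarge, as reported

savings = before_cost - after_cost
savings_pct = savings / before_cost * 100

print(f"${savings:,.2f} saved ({savings_pct:.1f}%)")
# prints: $3,039.55 saved (69.9%)
```

The projected saving of about $3,040 lines up with the roughly $3,000 drop La Fourche saw on its next bill.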

Over to you

The latest edition of the State of the Cloud Report revealed that 61% of organizations are planning to optimize their use of cloud resources in search of cost savings. Cloud cost optimization is the top initiative for the fifth year in a row.

Reducing cloud costs can have a dramatic impact on a company’s bottom line. The examples we shared above clearly show that infrastructure costs depend on a team’s ability to provision just what it needs.

Cloud cost optimization is the low-hanging fruit here - it lets you stay in the cloud, but at half the cost.
