
Kazuya


AWS re:Invent 2025 - Amazon's finops: Cloud cost lessons from a global e-commerce giant (AMZ308)

🦄 Making great presentations more accessible.
This project enhances multilingual accessibility and discoverability while preserving the original content. Detailed transcriptions and keyframes capture the nuances and technical insights that convey the full value of each session.

Note: A comprehensive list of re:Invent 2025 transcribed articles is available in this Spreadsheet!

Overview

📖 AWS re:Invent 2025 - Amazon's finops: Cloud cost lessons from a global e-commerce giant (AMZ308)

In this video, Nathan Perry, Senior Cloud FinOps Architect at AWS, shares Amazon's internal FinOps transformation journey over five years. He outlines three key lessons: building foundations with AWS billing and cost management services like Cost and Usage Reports and Cost Explorer, driving efficiency through business-aligned mechanisms including a "credit score" metric for resource optimization, and scaling through intelligent automation. Perry emphasizes moving from monthly to hourly/ARN-level cost visibility, democratizing cost data across teams, and integrating business metrics with cloud costs. He demonstrates how Amazon evolved from custom tools to AWS native services, deployed Cloud Intelligence Dashboards for role-specific views, and created automated workflows that transform manual processes into self-improving systems, enabling teams to focus on strategic decisions rather than routine analysis.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Building a Modern FinOps Foundation with AWS Billing and Cost Management Services

My name is Nathan Perry. I'm a Senior Cloud FinOps Architect on the AWS Optics team, and I've spent the last 11 years at Amazon helping some of our largest AWS customers build modern and scalable approaches to FinOps on the AWS cloud. For the last five years, I've worked with Amazon and all the Amazon lines of business as an AWS customer. If anyone's ever had a challenge explaining what they do to their friends or family, double that for me because I work for Amazon, but my customer is Amazon, and so it kind of breaks my parents' brain. But the idea is that I help Amazon understand how to operate effectively in the FinOps space, lower cost, work on cost visibility, cost allocation, and so I'm here today to share some of the lessons that I've learned in the last five years working with Amazon on this.

I want to share three key lessons from our modernization journey that can help you accelerate your FinOps practices. First, building on AWS billing and cost management services for your foundation. Second, driving efficiency through business-aligned mechanisms. And third, scaling your FinOps practice through intelligent automation. Whether you're just starting your FinOps journey or you're scaling up, these insights are intended to help you move faster as you integrate your financial management systems with AWS billing and cost management. So let's start at the beginning to set the stage.

Thumbnail 100

Amazon and all of our lines of business operate on AWS. Just like many customers, Amazon had to learn to operate in the cloud, and more specifically how to build a really comprehensive FinOps stance, with the only caveat being enormous scale. So we started with a data foundation. Our FinOps journey started with custom financial reporting that provided monthly views of our cloud costs.

Thumbnail 130

Thumbnail 160

Today we're working on transforming that foundation using AWS Data Exports, AWS Cost and Usage Reports or CUR, and then other AWS billing and cost management services. We're moving away from visualizing cost at a monthly grain or an account grain in favor of ARN grain and hourly grain. We're democratizing our cost data, and what I mean by that is very often cost visibility can be siloed. It can be owned by FinOps teams or finance teams or operations teams. We're putting this in front of all of our teams, our builders, our leaders, our finance, our FinOps teams.

So building on this CUR foundation, we're enabling Cost Explorer across the organization to allow self-service cost analysis and enable more real-time decision making. We're deploying organization-wide tagging strategies that track investment and provide better fine-grain cost controls organization-wide. And we're integrating this with AWS features like Compute Optimizer and AWS Cost Optimization Hub to vend efficiency opportunities at scale. This represents a bit of a shift for us from some of the centralized reporting to a more distributed model of cost intelligence.
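To make the hourly, ARN-grain idea concrete, here is a minimal sketch of aggregating cost at that grain. The rows and column names below are illustrative stand-ins, not the real CUR or Data Exports schema, which carries many more fields:

```python
from collections import defaultdict

# Hypothetical CUR-style line items (illustrative column subset only).
cur_rows = [
    {"usage_start": "2025-12-01T00:00:00Z",
     "resource_arn": "arn:aws:ec2:us-east-1:111122223333:instance/i-0abc",
     "unblended_cost": 0.096},
    {"usage_start": "2025-12-01T00:00:00Z",
     "resource_arn": "arn:aws:s3:::example-bucket",
     "unblended_cost": 0.004},
    {"usage_start": "2025-12-01T01:00:00Z",
     "resource_arn": "arn:aws:ec2:us-east-1:111122223333:instance/i-0abc",
     "unblended_cost": 0.096},
]

def cost_by_arn_and_hour(rows):
    """Aggregate unblended cost at (resource ARN, hour) grain."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["resource_arn"], row["usage_start"])] += row["unblended_cost"]
    return dict(totals)

totals = cost_by_arn_and_hour(cur_rows)
```

In practice this kind of aggregation runs in a query engine such as Amazon Athena over the exported data rather than in application code; the point is simply that the grain shifts from account-per-month to resource-per-hour.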

Thumbnail 220

And we're also leveraging AWS Organizations to help establish consistent controls, and it provides an aggregated tool set across all of our account footprints, like a unified framework for our entire AWS footprint. So remember that building on AWS services allows our teams to focus on driving business value rather than simply having to work on analysis. So with this foundation in place, let's look at how we approach efficiency and how the integration of business-specific data helps play a part.

Thumbnail 250

Thumbnail 270

Connecting Cloud Costs to Business Outcomes Through Data Integration

So the next challenge that we encountered was in driving widespread adoption of our cost management practices and in integrating these to run more efficiently in the cloud. The key insight that we discovered was to make it as simple as possible to connect cloud costs to business outcomes at the team level. So we started by taking data and trying to make it more insightful. We combined two critical elements: granular costs from Cost and Usage Reports, or CUR, and business metrics that teams actually care about. This allowed Amazon, and it will allow you, to see not just how much you're spending, but what you're getting for each dollar spent.
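The "what you're getting for each dollar" idea reduces to a unit-economics ratio. A minimal sketch, with hypothetical numbers:

```python
def unit_cost(cloud_cost, business_metric_value):
    """Cost per unit of a business metric (e.g., dollars per order)."""
    if business_metric_value == 0:
        return None  # metric idle; surface for review rather than divide by zero
    return cloud_cost / business_metric_value

# Hypothetical: $100,000 of compute supporting 2,000,000 orders
# -> $0.05 per order
print(unit_cost(100_000, 2_000_000))
```

The denominator is whatever metric the team actually cares about, which is why joining business data to CUR data per team matters more than any single company-wide ratio.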

Thumbnail 300

Second, integrating business context. The key for us to driving adoption was connecting costs to business outcomes for our teams. We've connected account- and tag-based cost allocation to business context and investment tracking, and we're automating return on investment analysis with AWS cost management services as the foundation. And we're working to simplify cost visibility.

Thumbnail 320

So for instances where the democratization or availability of Cost Explorer by itself is not enough for a team, we're leveraging the AWS Cloud Intelligence Dashboards. If anyone is not familiar with the Cloud Intelligence Dashboards, or the CID, it's an open source framework that was built to provide AWS customers very actionable insights and optimization opportunities at really any scale. An example is Amazon's budget data: we nest AWS service-specific budget data against actual AWS usage. This allows teams to see budgetary variances.

So finance teams or operational efficiency teams can step in if an individual builder team or a larger part of the organization is having challenges, if they need to work on cost reduction initiatives, or if they need to revisit their budgets or their forecasts due to changes in business needs. So we use the CID to integrate that contextualized business data alongside AWS infrastructure usage. And we create role-specific views that enable even more in-depth service analysis by our teams. They can now answer their own questions much more effectively.
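The budget-variance view boils down to comparing budgeted against actual spend per service. A hedged sketch of that arithmetic, with made-up figures:

```python
def budget_variance(budgeted, actual):
    """Return (absolute variance, percent variance); positive means over budget."""
    delta = actual - budgeted
    pct = (delta / budgeted) * 100 if budgeted else None
    return delta, pct

# Hypothetical: a team budgeted $50,000 for Amazon EC2 but spent $57,500
delta, pct = budget_variance(50_000, 57_500)  # -> (7500, 15.0)
```

Surfacing the percentage next to the absolute number is what lets a finance team triage: a 15% miss on a large budget and a 15% miss on a tiny one call for very different conversations.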

I found having conversations with builders very challenging because without that visibility, it's like everyone is expected to be a FinOps expert. And we just really can't expect that of our engineering teams, right? We want them to build, and we want the monotony of FinOps to land on our shoulders because we're the FinOps practitioners in the room.

So here's a real world example. When a team sees a cost spike today, they can identify the accounts, the services, the resources where it originated. They can visualize the business impact by looking at that alongside metrics like revenue or budget. And they can tie it to an individual or initiative so that they can take immediate action.

Driving Efficiency at Scale: From Resource Utilization to Credit Score Mechanisms

So now that we have teams engaged with this cost data, let's look at how we're incorporating efficiency into that mix. While cost visibility is crucial, and honestly, it's probably the thing I'm most passionate about, driving efficiency really requires mechanisms that can measure, monitor and improve resource utilization at full scale. So here's how we evolved our approach to efficiency.

Thumbnail 480

The first thing was realizing that efficiency is not one size fits all. In the early days of our AWS adoption, we recognized the need to provide efficiency reporting to Amazon teams. Amazon has a pretty well-known culture of efficiency with our frugality mindset, and that's really baked into how our builders operate, but that alone wasn't enough to really enable efficient usage of the cloud. So we started by measuring the basic resource utilization metrics: CPU utilization, memory usage patterns, network throughput across all of our teams, and we quickly learned that efficiency was not one size fits all.

Thumbnail 540

Different workloads required different approaches, and the version of good that translated very well for a team like Amazon retail wasn't necessarily the same version of good that translated to teams with very different workloads like Alexa or Prime Video. So we needed to build central efficiency mechanisms and we needed alignment on how we could get as close to perfect in terms of these centralized efficiency metrics as we could.

We built mechanisms to track business specific efficiency and monitor resource utilization against this centrally agreed upon ideal. And again, this wasn't universal, but for the lines of business that it held mostly true, we saw very significant gains in efficiency and cost reduction really right out of the gate. We created a metric called the credit score, and so our credit score essentially measured resource efficiency across lots of different services, and it was an iterative way to basically continue honing this ideal of a central baseline while continuing to build on what that version of good should look like, what the North Star is.

So for our teams using Graviton: is usage aligned with central efficiency campaigns around capacity utilization or storage class optimization? Is it aligned with the business data you're pulling in, like revenue or budget data? All of these inputs were included in a formula that allowed us to measure, essentially, the FinOps maturity of all of our lines of business. And it allowed these mechanisms to really help teams optimize and save large dollar amounts.

So here's an example of how it would work in practice. Teams would receive a weekly efficiency score, and it would provide cost recommendations that could be reviewed and prioritized by their level of impact. Teams were able to group these recommendations by things like technology category, storage, compute, generative AI, database, network. And then they could see them on accounts, on teams, they could associate them with owners. All stakeholders in the cloud cost lifecycle could see the opportunity and where it sits. So finance teams, leadership, technology owners, or operational efficiency teams, all of them could view this data through the lens of their own respective disciplines, so they could see it in a way that made the most sense to them.
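The talk doesn't disclose the actual credit score formula, but one plausible shape, sketched here purely as an assumption, is a weighted average of per-pillar efficiency scores, with the pillar names and weights below entirely hypothetical:

```python
# Hypothetical pillars and weights; Amazon's actual credit score formula
# is internal and not described in the session.
WEIGHTS = {
    "graviton_adoption": 0.25,
    "capacity_utilization": 0.5,
    "storage_class_optimization": 0.25,
}

def credit_score(pillar_scores, weights=WEIGHTS):
    """Weighted average of per-pillar efficiency scores, each on a 0-100 scale."""
    total_weight = sum(weights.values())
    return sum(pillar_scores[p] * w for p, w in weights.items()) / total_weight

score = credit_score({
    "graviton_adoption": 80,
    "capacity_utilization": 60,
    "storage_class_optimization": 90,
})
# 80*0.25 + 60*0.5 + 90*0.25 = 72.5
```

The useful property of a composite like this is exactly what the talk describes: the weights can be iterated on centrally as the "North Star" definition of good evolves, without every team rebuilding its own reporting.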

Thumbnail 690

Thumbnail 700

Scaling FinOps Through Intelligent Automation and the Path Forward

Now that we've established our efficiency mechanisms, I want to talk about how automation has been helping us scale these practices. At Amazon, we've learned that true cloud financial management is not a series of isolated tasks. It's a continuous intelligent cycle. By integrating AWS services with automated workflows, we're working to transform what was once a very manual and reactive process into more of a self-improving system.

Starting with this concept of an intelligence cycle, our journey, which is still in progress, towards automating FinOps started with a really simple goal, which was just give the teams better visibility into their cloud costs. Through continuous iteration, we're working to create systems that have deeper insights into infrastructure spending patterns. So where teams once spent hours on manual analysis and spreadsheets, we're introducing automation into key processes like financial planning. Each improvement works to feed back into this learning cycle, and it's helping us enhance our capabilities and move closer to our vision of intelligent cloud financial management.

Thumbnail 770

Effective FinOps automation starts with really comprehensive data: usage trends, budget variance, capacity requirements. When optimization opportunities are identified, teams receive notifications, and they receive these through their own preferred channels. For well understood scenarios, teams can define policies and thresholds that trigger automated responses while maintaining appropriate controls over more complex decisions. This balance between automation and oversight is super important, and it's crucial to building trust and driving adoption.
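A sketch of what such a threshold policy might look like; the tier names and percentages here are hypothetical illustrations, not Amazon's actual policy values:

```python
def evaluate_policy(daily_cost, baseline, notify_pct=0.15, act_pct=0.40):
    """Map a cost deviation from baseline to an action tier.

    Hypothetical tiers: modest deviations notify the owning team through
    their preferred channel; large, well-understood deviations can trigger
    an automated response; everything else needs no action.
    """
    if baseline <= 0:
        return "notify"  # no usable baseline yet; ask a human to look
    deviation = (daily_cost - baseline) / baseline
    if deviation >= act_pct:
        return "automated_response"
    if deviation >= notify_pct:
        return "notify"
    return "ok"
```

Keeping the thresholds as explicit, team-owned parameters is the "appropriate controls" part: automation only acts where a team has deliberately opted in.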

Thumbnail 810

So on the topic of trust, the key for us to scaling FinOps wasn't just technology. It's building trust through very consistent transparency. Every action, whether it's automated or whether it's manual, it has to be logged. It has to be tracked in detail. Teams need to see exactly what's happening with their infrastructure costs and why. This transparency first approach has been really crucial in driving adoption of all the automation and the planning capabilities that we use across Amazon and all the Amazon lines of business, particularly in retail, where we've seen the impact of combining human insight with automated analysis.

Starting a FinOps automation journey doesn't require building everything at once. Begin with AWS services as your foundation and focus first on gaining visibility into your costs. Then gradually automate well understood processes. In our case, it was our financial planning and our OP1 cycle. Planning and capacity management were big for us. As you build trust through transparency, and you have results that can back that trust up, you can expand both the scope and the sophistication of your automation. Our goal, and it's important to remember this, wasn't to remove humans from the mix. It was essentially to elevate their role from routine analysis to more strategic decision making.

So now that I've talked about how these three pieces fit together, visibility, efficiency, and automation, I want to take a look at how you can build your own FinOps roadmap.

Thumbnail 910

Throughout this section I've discussed various aspects of cloud financial management. Let me share something that I find pretty powerful, specifically how these pieces fit together to accelerate FinOps journeys of different sizes.

Thumbnail 930

Thumbnail 980

Every successful transformation starts with a solid foundation. We began our journey with everything custom: our own tools, our own mechanisms, our own processes. What we discovered was that AWS billing and cost management services now provide better visibility than our custom solutions ever did. The AWS Cost and Usage Report gives us the granular data that our engineers and our leaders never knew they needed until they saw it. They didn't understand the power of that fine-grain cost control. Cost Explorer puts analysis capabilities directly into our teams' hands, and AWS Organizations lets us implement governance at massive scale, whether we're managing dozens, thousands, or tens of thousands of accounts.

But data alone is not enough. We learned that lesson the hard way. Real transformation for us happened when we connected business outcomes to our cloud cost. Think about it: knowing you spent $100,000 on compute or storage is important, it's a helpful metric, but understanding that that $100,000 can be attributed to a million dollars in revenue, that's something that's very actionable. That's where the AWS Cloud Intelligence Dashboards come in for us. They don't just show costs, they show team-level value. Everybody is able to see the metrics that matter to them, in their language, aligned with their goals. What began as an internal efficiency mechanism has evolved into features that anyone can implement today.

Thumbnail 1040

And automation. Imagine moving from monthly cost reviews to daily optimization actions. We're not talking about just alerts, we're talking about intelligent systems that detect anomalies, recommend optimization, and even implement improvement automatically. Teams that once spent hours analyzing spreadsheets now focus on strategic decisions while automation handles a lot of the routine. This isn't just about saving time, it's about operating at scale.
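"Detect anomalies" can start as a simple statistical check long before any machine learning enters the picture. A minimal sketch, with a made-up z-score threshold and made-up daily spend figures (AWS Cost Anomaly Detection covers this as a managed service):

```python
from statistics import mean, stdev

def is_cost_anomaly(history, today, z_threshold=3.0):
    """Flag today's spend when it sits more than z_threshold standard
    deviations away from the historical daily mean."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu  # flat history: any change stands out
    return abs(today - mu) / sigma > z_threshold

daily_spend = [100.0, 102.0, 98.0, 101.0, 99.0]  # hypothetical daily costs
print(is_cost_anomaly(daily_spend, 150.0))  # a 50% spike stands out
print(is_cost_anomaly(daily_spend, 101.0))  # ordinary day-to-day noise
```

The detection itself is the easy part; the transparency and logging discussed above are what make teams comfortable letting a flag like this trigger an action instead of just an email.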

I want to make this concrete. Three years ago, we were where so many AWS customers were or still are today. We were juggling multiple tools, we were wrestling with manual processes, and we were really struggling to get clear cost visibility. Today, we're integrating AWS services and we're automating insights. Looking ahead to 2026 and into 2027, we're enabling AI-enhanced optimization, predictive actions, and we're automating capacity planning. The part that is the coolest to me is that this evolution is not years away for anybody that's trying to embrace it today. The capabilities are available right now.

Thumbnail 1130

Thumbnail 1150

So the acceleration path starts with three clear layers. First, laying your data foundation. Enable the Cost and Usage Report, implement a solid tagging strategy, and get comfortable with tools like Cost Explorer and Cost Optimization Hub. This gives you the visibility that you need to move forward confidently. Next, build your integration layer. Deploy tools like the Cloud Intelligence Dashboards to connect your business metrics and enable automated recommendations. This helps turn data into business intelligence. And finally, activate your optimization layer. Implement efficiency recommendations, enable controls for transparency, and automate response actions. This is where FinOps practices become really scalable.

So remember how we started talking about the Dunning-Kruger effect? When Amazon started our FinOps journey, we really thought that our expertise in building custom solutions and tools was our greatest asset, and it gave us a very false sense of confidence in terms of how successful we were going to be right out of the gate. What we learned and what I hope you all take away from this is that often the fastest path isn't necessarily the obvious one. It requires that we challenge our assumptions. And while Amazon's scale can seem daunting, and it still does to me at times, the principles that we've learned work for organizations of any size.

Start with AWS billing and cost management services. Build mechanisms that matter to your business units and automate intelligently. What we found is that the smartest choice was really the simplest one, and it's not what we started with. That choice is something that you can use today to scale your own organization's approach to cloud financial management.

Thumbnail 1240

So thanks, everybody. Before I leave, I want to give a quick plug for One Amazon Lane. If you haven't been to it in the Caesars Forum, it's in the southeast corner, and it's a very cool interactive exhibit where you can put hands on a bunch of really cool Amazon tech. Thank you all for your time today, and enjoy re:Invent.


This article is entirely auto-generated using Amazon Bedrock.
