🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.
Overview
📖 AWS re:Invent 2025 - An unexpected journey building AWS MCP Servers (OPN401)
In this video, Paul Vincent and Laith Al-Saadoon from AWS share their journey building 60+ specialized AWS MCP Servers that have garnered over 7,400 GitHub stars. They demonstrate how Model Context Protocol acts as a universal connector between AI coding assistants and AWS services, enabling developers to build full-stack applications in days instead of weeks. The presentation includes live demos of the diagramming and pricing MCP servers, deep code walkthroughs of the Core server's role-based composability and the Billing and Cost Management server's multi-service integration, and explains their design philosophy of avoiding one-to-one API wrappers in favor of workflow-oriented tools. They emphasize their mono repo approach with shared templates and how these servers reduced their prototyping time from 6-8 weeks to 5 days.
This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
Introduction: AWS MCP Servers and Community Impact
Good morning. Can you all hear me? Okay, good. Awesome. Good morning. I'm Paul Vincent, a Principal Prototyping Architect here at AWS. I'm joined here by Laith Al-Saadoon, a Principal AI Engineer at AWS. Hi everybody, and we're going to talk this morning about our unexpected journey building AWS MCP Servers. So thank you all for joining us this morning for the first session of the day.
We're going to cover a couple of things here. Let me go through the agenda. First, we're going to start with a moment of gratitude: thanking the community for the support that we've had for MCP. Laith and I run the open source AWS MCP Servers project. Hopefully you've all used them thus far. We're also going to talk about how we got here a little bit, and then we'll have a quick refresher on MCP to get us all in the same context.
We'll also discuss MCP server and tool design choices, why we made some of the choices when we designed our initial MCP repository, and how you might use some of those techniques when designing your own internal MCP repositories for your own usage within your own companies. Then we're going to look at the architecture choices that we made. We're going to do some code walkthroughs and a demo of a couple of the MCP servers just to see them working.
First, I just want to say thank you for coming to our talk. Genuinely, I wanted to say thank you. It's really the community, the adoption, and your enthusiasm, for those of you who have used the MCP servers, that have made this a really great project. We've now had over 7,400 GitHub stars, lots of forks, and contributions from both the community and the AWS service teams.
We have 60 specialized MCP servers across use cases, personas, and role-based types of servers that you can use in the specific scenarios you need. We also want to give a huge thank you to the AWS community builders and heroes and our other external advocates for AWS who have voluntarily written blogs and articles about how to use these MCP servers in whatever tool you choose, whether that's Kiro, Claude, or any other AI coding assistant or agent framework that supports MCP. We really want to thank you for your support.
Across these servers, we've had several million downloads. This means any developer is now able to create a full-stack serverless amateur radio logging web application, which was one of our marquee use cases when we first started building this. One of my favorite quotes about what developers are actually saying about the MCP servers, I'll pull up in just a moment. While you're here, I have QR codes for the GitHub repo and the articles from external contributors and authors. Please take a look at those. I'll have them up again at the end.
This is what one of our customers, one of the developers in the community, is saying about the MCP servers, and this is exactly what we set out to do. This is a quote from Robert Weston from Alpha Data. It really perfectly captures why we built these MCP servers to begin with. He said this isn't full AWS automation. It helps you where you actually get stuck. The key thing here is this is how cloud development should feel. It just knows what I'm trying to build, how I build it on AWS, and takes away some of the guesswork of which service, which IAM policy, how do I do this. It basically builds all of that into your AI coding assistant.
We're really grateful for this quote because it validates what we set out to do to begin with. Now this project has exploded with a variety of use cases, whether that's in your coding assistant or in autonomous agents that do things on your behalf in AWS, like monitoring your logs or changing configurations. We had one customer use one of the servers to do database administration for their RDS instances in a vast multi-account setup. We're really excited about what customers are saying and using our MCP servers for.
Now, this is a 400 level talk, so we wanted to set the stage and context about how we got here, and then we're going to go super deep into demos and code walkthroughs of some of the MCP servers.
Specifically, we'll explore some of the peculiar parts of the choices that we made in our MCP servers that may not be obvious and really set MCP servers apart from API wrappers on top of AWS. But for now, over to Paul.
The Genesis: From Prototyping Challenges to Model Context Protocol
So how did we get here? Laith and I are part of a team focused on prototyping and cloud engineering. We build prototypes for customers, coming in and doing quick rapid prototypes that are typically six-, seven-, or eight-week engagements. We decided that we needed to figure out how, in the age of agentic code development, we could make our engagements with our customers go faster and be more productive. We were finding that we spent a lot of time during these engagements on knowledge transfer and looking up APIs, trying to figure out what to do, because the code assistants could generate code, but they had no idea how AWS worked.
That's where we came across the Model Context Protocol and MCP servers. The MCP servers allowed us to say: we know CDK really well, so let's put that into a server we can give to the model, so the model then knows CDK really well. Figuring out how to shorten that time frame for our prototyping engagements was really the reason we started looking into all of this. If anybody remembers in The Matrix when Neo gets jacked into the thing in the back of his head and all of a sudden knows kung fu, that's what we wanted to give our models. We wanted to be able to jack this thing in called MCP and say: now they know CDK, now they know AWS protocols, and now they know all the various services that AWS has. General models have a good feel for these services, but they don't really know the detail they need to actually perform and build quality code in a short amount of time. That's what MCP gives us.
Here's our Model Context Protocol. Let's do a real quick overview. It's an open standard. A lot of people like to say it's the USB port for models, right? It's just plug and play. I like to call it the HDMI port because it's a little more descriptive. Think back to before HDMI: for audio and video connections you had these red and white plugs, you had S-Video, and you had all the other different connectors, depending on what you had on the other side, if you wanted to connect your VCR. It was a pain. MCP solves the same problem, because before MCP we couldn't take just any API, give it to the model, and have the model understand how to talk to that API. MCP does that by allowing us to have whatever we want on the model side, and in the middle we have this universal connector, whether you call it HDMI, USB, or whatever you want, that makes it really easy for the model to understand what it's talking to.
We have 60 plus specialized AWS MCP servers in our repository, broken up into three major buckets. The first is documentation and knowledge. That's the AWS Knowledge Server. If you haven't used that yet, I recommend you definitely check that out when you leave here. It's an MCP server that connects directly to AWS documentation. If you've ever had to search through AWS documentation to try to find out how to do something, you probably know that pain. It's very painful to look for things when you want to find them on AWS. The Knowledge Server allows your coding assistant to have direct access to all of our documentation really efficiently. It understands things like how to make a call to a DynamoDB table, how to invoke a particular service, how to use location services, and all those things are in the knowledge server and really quickly accessible via the knowledge server.
The second bucket is workload specialization. Those are MCP servers focused on services like EKS and serverless functions, so how to actually operate the service. Those MCP servers are there to help you and your coding assistant understand how to deal with EKS or how to deal with Lambda functions. Then we have our developer specialized bucket, and that's where we come into a server for CDK. We have a server for Terraform. I've actually had customers where we created a solution for them in CDK and then they said, oh, we forgot to tell you, we actually want Terraform. It's as easy as telling that MCP server to convert this whole CDK stack to a Terraform stack and you're done. That's how simple it is with MCP because we've given it that kung fu. It understands Terraform. It understands CDK, so it can do that translation very quickly for you.
We have the pricing, diagramming, and front-end specialty servers. How many times have you built something, deployed it to production, and then found out it costs a million dollars to run? Now you can actually create your code, do your development, and run the MCP server for pricing and optimization to see how much it's actually going to cost you to run with various scenarios. It will model that out using direct access to the APIs. The model understands AWS pricing, which can be a difficult challenge when you're trying to review something you've written or an application you've built.
Front-end is another developer-focused one. I'm a back-end coder. I've been coding for about three decades. I'm not a front-end developer, but I am now, because I have the knowledge. I know how to do front-end development now because the front-end server allows me to write really quality front-end code without having to be a front-end developer, which is really cool for me. I know AWS, and that's the whole idea behind our MCP servers. We want to give that feeling of instant expert knowledge to our coding assistants, whatever they might be.
Live Demo: Diagramming and Pricing Analysis with MCP
Now you might say that's great for AWS services. Well, internally, if you have some internal knowledge and internal processes and things, you can create your own MCP servers, plug them into your workflow, and now they know whatever expertise you have going on in your own companies. That's the idea behind MCP. So we're going to do a real quick demo. Let me see if I can push the right button. This is my Kiro instance. Because we only had 60 minutes, we can't really build the whole thing from scratch all the way to the end. So I have a solution already built here that I want to do some things to.
We talked about amateur radio. I'm an amateur radio guy, so we built a call logging system to be able to log radio contacts for the amateur radio hobby. I built that and it's a fully agentic workflow. It's using Agent Core. It's got three or four agents with an orchestrator. It's got a lot of things in here, but I don't really have a good visual of what it looks like. So I want to use my MCP to get a quick diagram of this solution. This is going to use the MCP server, the diagramming MCP server, to actually go out, take a look at what I've built, and then build me a nice graphical representation of that.
Quick show of hands. Does anybody do amateur radio? There are a few. I told you we exist. There are dozens of us. When I brought this up to Laith, he's like, I don't even know what that is. There are a lot of us. So while this is working, notice here "diagram generate diagram." That's the MCP tool that's loaded by the diagram MCP server. The cool thing about this is that I don't have that MCP server loaded. I have the MCP core server loaded. We've done some really cool things with our MCP core server. The concept is if we have 60 plus MCP servers that you're trying to use within your own organizations, how do you manage which ones to use? We've created MCP core, which is the only one we need when we're doing specific developer functions. I have a role configured on mine that says "solutions architect," and that role says to bind dynamically when I load the core MCP server a bunch of other MCP server tools based upon the profile that I've set within the configuration for that one MCP server. That's how we can manage 60 plus MCP servers without having to load them all in. We can do it more profile-based, and Laith can actually do a deep code walkthrough of how we did that.
It looks like my diagram has been saved. Let me pop it real quick and see what it looks like. There it is. Let me go bigger. So we built this just now on the fly. That MCP server went out, evaluated my code, and drew this diagram for me, and it got it right. We've got some agent systems here, multi-agent systems. This is using Agent Core. We have three agents, which is correct, and an orchestrator. We have some remote APIs going out to get things like QRZ.com for call sign lookup and verification. A lot of good things happening all through this diagram for us automatically through the magic of MCP.
Another thing we can do is ask how much it would cost to run this for 10,000 users. We'll do the same thing. It's bound to my profile, since it says I'm a solutions architect and I need to know things like pricing. It binds all those tools from the other MCP servers into this one instance of Core, and now it's going to actually call the pricing server if I approve it. Let me approve that. It's going to call the pricing server and get all the pricing information. It looks at the code first and understands all the things I've built, then calls the individual pricing APIs for those and builds out the full solution for me. Then it does some modeling on the number of users and that type of thing. I don't know why it's not approving automatically, but we'll keep saying yes. There we go.
So it's able to go out there and use the API to actually do that, and it'll do some scenario-based things as well. It can model 10,000 users, 100,000 users, a million users, and it'll bring that report out for you and give you a good feeling for what it's going to cost to run this solution the way it's coded today. That also gives you the insights to say, well, I need to change this area here. I want to change the architecture here because I need to cut costs. So it gives you that upfront way before you even deploy it, which is nice.
Who has used the AWS Calculator or Simple Monthly Calculator? Everybody, right? So if you haven't tried this pricing MCP server with your infrastructure-as-code stack or something like that, you're in for a treat. You'll probably never want to go back to the calculator again. What's important to call out is that with managed services like Lambda and Bedrock, cost is highly variable, based on an on-demand, per-use pricing model. We've built into the pricing server prompts and tool descriptions that provide hints about assumptions, like assumptions about your stack. So you can actually bake it in and say 10,000 users calling Bedrock maybe three times a day with an LLM, and it has the hints there to build out this estimate. It's really powerful.
You notice it created the report over here, and I've asked it to write it out to a file so we can persist that over time, and it's working to do that. But if you can see here, we got a pretty good monthly cost to run this. If I wanted to run it at scale at 100,000 users, it's almost $2,000 to run an amateur radio application. I think that's a little outside of my hobby budget, so I'm not going to run this on my own. But if I wanted to provide a service maybe to other radio operators, that might be something I could think of.
So let's look at use cases and things we can do. If we look at the way we've organized these things, we have Explore and Plan. Explore and Plan is the phase of experimentation, where you ask: what do I want to do? It's quick, rapid prototyping development. Using our MCP servers to do that quickly flows down into Create. That's where we've got our plan and our user requirements well defined within our workflow, and now we use our MCP servers to get that deep knowledge again, our kung fu, so we can actually build something really useful and really valuable.
Then we use another set of MCP servers to Test and Secure. We have MCP servers for architecture, and we have MCP servers for developing user test cases around the code that's being developed. This is the full flywheel effect. As you're going through there, then Review and Deploy. At the end of that test, we take a look at the test outputs, we review, and then we use MCP servers for Terraform, CloudFormation, or CDK to actually deploy our solution. We deploy that out to our environment, and then we have another set of MCP servers that we can use for maintaining and operationalizing it.
We can look at security logs, we can look at utilization, we can troubleshoot and repair any problems that we might have, any unforeseen things that didn't come out of our test. It always happens, right? Nothing's perfect. We have another set of MCP servers to help us with that phase of the cycle, and it's just a continual cycle. But what we're able to do is what might have taken weeks and weeks and weeks to do, now we can do really quickly. So we're able to now develop prototypes for our customers in a five-day period, where it used to take us six to eight weeks. Really good stuff coming out of this and using MCP for that.
Design Philosophy: Server Separation and Role-Based Composability
All right, so first, quick show of hands again. Who has not used MCP with an AI coding assistant? Raise your hand.
Okay, awesome. There are a few. So if you're using AI coding assistants and you haven't used MCP at all, any sort of MCP, you're really in for a treat. Try that right after the session, even if it's not one of our MCP servers. If you use third-party open source ones like Context7, it's going to change the way you look at your AI coding assistants.
I wanted to now go deeper into some of the choices that we made with our MCP servers, the real design behind them. We mentioned from the beginning we have 60+ MCP servers. Is that too many? Is that too few? Is that just right? That's something we are constantly assessing. But the one thing that we know for sure is that these are emerging standards. I would say it's hard to say that there are best practices in the space because it's just now a year old, but we've seen some emerging trends and things that we view as maybe standards or soft standards.
The first is the separation of servers. We've decided that, for now, this is a great way to provide isolation between tools: a user-controlled or operator-controlled constraint on the tools and context provided to your AI assistant or agent at any given time. So if you know definitively that you are only doing Terraform and, let's say, AgentCore, just two things on AWS, then you just need those two MCP servers. You don't need all 60. We think that's a great and convenient way to separate what you need at any given time and control the context in your agent.
We could have built one big monolithic server, but we wanted to let developers decompose and recompose multiple specialized servers. Every project is different, and every customer is different. Maybe you're using Terraform or CDK. Maybe you use serverless, or maybe you use Kubernetes and EKS and don't care about serverless. Whatever you choose, you can control it definitively with the MCP config in your session, your coding assistant, or your agent.
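As a rough illustration, an MCP config pinned to just two servers might look like the following. The package names and uvx invocations here are illustrative; check the repo's README for the exact entries:

```json
{
  "mcpServers": {
    "awslabs.terraform-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.terraform-mcp-server@latest"]
    },
    "awslabs.aws-pricing-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.aws-pricing-mcp-server@latest"]
    }
  }
}
```

Only the servers listed here get spawned, so this file is the operator-controlled constraint on what tools and context the agent sees.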
Now, having said that, we also wanted to think about multi-server composability. When we first started, we had five and then ten MCP servers: pricing, diagram, documentation, CDK, and AgentCore. Then that expanded as AWS service and product teams started contributing their MCP servers to the repo. We realized that we needed to think more deeply about composability, and what we saw emerge was role-based or profile-based composability.
We took the time to hand-carve groups of servers based on common jobs to be done or role-based groupings. So if you say solutions architect, you get a set of MCP servers by default, and you just have that one core MCP server in your config. Let's say it's Kiro or whatever: you have this MCP.json, you put in one core MCP server, and you control the environment variables. I'll walk through that in a second. It dynamically binds the other servers needed to fit that profile. We take a little bit of the guesswork off your plate.
That's another place, by the way, where it's open source, so it's a great way to get community contributions and get your input. If you take a look at core and you say, "Hey, this might be a good profile here. I always use these servers together," you can tell us, and maybe that's a place where we want to consider a profile. That's what we think about composability.
Breaking the Pattern: Why One-to-One API Wrappers Don't Work
The third tenet, or ground rule, is no one-to-one API wrappers. What does this mean? Has anyone built an MCP server where you basically map one MCP tool to one API method, like a get-something, and then there's a second tool like delete-something or list-something? Some people know what I'm talking about. What we're seeing is that this is becoming a bit of an anti-pattern. Agents don't necessarily work that way.
When you have very granular tools, it takes multiple turns to do something that an agent could have done with one tool as a workflow. The guidance we've set is that tools should generally compose more than one API and possibly more than one service. We don't want to wrap APIs as individual tools. For AWS, that means we don't want to have a describe EC2 instance as a tool. We have that covered, by the way. There's an AWS API MCP server that covers the breadth of all AWS APIs but with one tool. How that works is it's more parameter driven. It accepts CLI commands and actually executes those, so you don't have 15,000 tools on the API server. You have one tool that accepts command lines, and that's very different from wrapping all the DynamoDB API methods and calling that the DynamoDB MCP server. That's what we don't do in these projects, and we're very careful and methodical about accepting anything that wraps an API method one-to-one.
It's very controversial because it's a quick path, a no-brainer, so to speak. You have these APIs, so let me just make them as tools. But we think it's a great place to think about the surface that your agent is actually going to work with in a new way versus the granular tools that exist in your APIs. Maybe your REST APIs are more coarse, and maybe that's fine, but we want folks to think critically about that, especially when it comes to all the AWS APIs.
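To make that contrast concrete, here is a minimal, hypothetical sketch of the parameter-driven pattern: a single coarse tool that accepts a CLI-style command instead of one tool per API method. This is not the actual AWS API MCP server code; the tool name and the validation logic are illustrative only.

```python
# Hypothetical sketch: one coarse, parameter-driven tool instead of
# thousands of one-to-one API wrappers. Illustrative only.
import shlex
import subprocess

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("aws-api-sketch")

@mcp.tool()
def call_aws(cli_command: str) -> str:
    """Execute an AWS CLI command, e.g. 'aws ec2 describe-instances'."""
    args = shlex.split(cli_command)
    if args[:1] != ["aws"]:
        return "Error: only 'aws ...' commands are accepted."
    # A real server would also validate the service and operation against
    # an allowlist and enforce read-only verbs before executing anything.
    result = subprocess.run(args, capture_output=True, text=True, timeout=60)
    return result.stdout if result.returncode == 0 else f"Error: {result.stderr}"

if __name__ == "__main__":
    mcp.run()  # stdio transport: spawned as a child process by the assistant
```

The point is the surface area: the agent reasons over one well-described tool rather than thousands of near-identical wrappers.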
Code Walkthrough: The Core MCP Server Architecture
So let's go through the code walkthrough. I'm going to walk through two MCP servers. They are open source on GitHub, so you can view this yourself at any time. If you scan the QR code, you can follow along with me if you have your laptop open. If not, just follow along here on the screen. These two servers, I think, really characterize what we mean by composability, no wrappers, and all three tenets that we talked about. The first one is core.
So the first thing I'll walk through is that we are using FastMCP from the Anthropic MCP SDK for Python. What this does in Python is it lets us proxy other MCP servers. We create a single MCP entry point and then bind other MCP servers as and when needed based on the roles. That's why you see here we have imports for all the other servers as well.
Now, the first place you start when you look at this code, and just a reminder, this is not an MCP deep dive; we're deep diving into some of the choices we made in our repo. Keep in mind that MCP here starts from standard IO. These are all standard IO, local MCP servers at the moment. The server is basically started as a child process from your agent, whether that's Kiro or Cursor or what have you. It spawns a process on your local device, and that's what kickstarts this server.py.
We have a helper method that imports servers, and then we check for the profiles based on the environment. You can see this when you create an MCP.json. Let me see if I can get mine open real quick; it's not in your workspace, it's mine. If you flip it over, right here.
You can see here, in the env property of the MCP.json, that we provide these key-value pairs for the roles, and those roles are defined by the server. We check for the presence of those roles; it can be one or more. We import the default, which is the AWS Foundation. Basically, this is not super complicated: we just match conditions based on the presence of those profiles.
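For reference, the shape of that config is roughly the following. The role keys are the key-value pairs the server checks for; the exact package name and supported role names should be confirmed in the repo:

```json
{
  "mcpServers": {
    "awslabs.core-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.core-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "mcp-dev",
        "solutions-architect": "true"
      }
    }
  }
}
```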
What we do is use our helper function to call import server, which basically binds a local proxy to the other MCP server. It adds a prefix to the tools so you know the tool came from the core server plus the bound proxy server; the tool name contains both core and the proxied MCP server's name. With all of that set up, we conditionally bind and proxy those servers.
Once that's all set up, we repeat the same pattern for security and identity, databases (NoSQL and time series), messaging and events, and so on, using the same helper method. Then we get to main. We actually create the instance of MCP up at the top; before we ever reach main, the entry point of server.py, we bind all the tools and servers first. Then we start the server with mcp.run at the end.
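Pulled together, the shape of that logic looks roughly like this. To be clear, this is a simplified sketch, not the actual core server source: the role names, the ROLE_SERVERS mapping, and the sub_server_tools stand-in are all illustrative.

```python
# Simplified sketch of the core server's role-based conditional binding.
# ROLE_SERVERS and sub_server_tools are illustrative stand-ins, not repo code.
import os

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("core")

# Hand-carved role -> server groupings (hypothetical subset).
ROLE_SERVERS = {
    "solutions-architect": ["pricing", "diagram", "documentation"],
    "serverless-developer": ["lambda", "cdk"],
}

def sub_server_tools(name: str):
    """Stand-in for discovering a proxied sub-server's tools; the real
    core server binds local proxies to the actual MCP servers here."""
    def placeholder(query: str) -> str:
        return f"[{name}] would handle: {query}"
    yield placeholder, "example_tool"

def import_server(parent: FastMCP, name: str) -> None:
    """Bind a sub-server's tools onto core, prefixing each tool name so
    its origin stays visible, e.g. core_pricing_example_tool."""
    for fn, tool_name in sub_server_tools(name):
        parent.add_tool(fn, name=f"core_{name}_{tool_name}")

# The AWS Foundation default is always bound; role flags add the rest.
import_server(mcp, "aws-foundation")
for role, servers in ROLE_SERVERS.items():
    if os.environ.get(role) == "true":  # key-value pairs from mcp.json "env"
        for name in servers:
            import_server(mcp, name)

if __name__ == "__main__":
    mcp.run()  # stdio: started as a child process by the coding assistant
```

The real server proxies full MCP servers rather than single placeholder tools, but the conditional, prefix-preserving binding is the important pattern.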
The core server logic is actually very simple: just conditional binding based on those key-value pairs in your .env file or the env property of the MCP.json. It imports the existing MCP servers and the tools they have, so you get all the properties of those other MCP servers. They all run locally; the exception right now is the Knowledge MCP server, which runs remotely. For the ones that do need AWS authentication, those MCP servers will use your current credentials. You can either set an AWS profile or rely on the credential chain that boto3 follows, which usually starts with environment variables, then your AWS credentials file or config file. It cascades down, so it uses your current active session that way.
What we do is we set up a profile that we know we want to use for the MCP server so we can control that and say this MCP server is using a profile for maybe a different AWS account or maybe has a different IAM role and policies. We use profiles quite prolifically when we're starting these up so as not to mix up what we want in our current terminal active session with what we want on the MCP server. We control that through profiles. I highly recommend that approach.
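As a minimal sketch of that approach (the profile name is a placeholder):

```python
# Pinning an MCP server's AWS identity to a dedicated profile keeps it
# separate from whatever session is active in your terminal.
import os

import boto3

# Set via the "env" block in mcp.json, e.g. "AWS_PROFILE": "mcp-dev".
session = boto3.Session(profile_name=os.environ.get("AWS_PROFILE"))
pricing = session.client("pricing")  # this client now uses that profile's role
```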
A quick example of that: we have Nova Canvas as one of our MCP server tools. You may not want your developers to have Nova Canvas access within their test account, so you might have a centralized account that you've approved for Nova Canvas inference, and they would use the configuration for the profile that gives them role-based access into it. Then, as they're generating images for their application, it uses that profile against that account, not the account they're developing in.
Deep Dive: Billing and Cost Management Server with SQL Integration
The second one I want to walk through really distinguishes itself through its macro capabilities that span services, all related to billing and cost management, or FinOps. I want you to picture the last time you had to investigate a cost anomaly on your AWS account.
Picture that. Or maybe now you have multiple accounts in your organization, cost allocation tags, and you want to factor in cost optimization levers like savings plans and spot instances, and the multiple services that requires. Under the covers, that's roughly a dozen services just around cost efficiency: Cost Explorer, cost and usage reports, and so on. This is the Billing and Cost Management server, or BCM, which packs basically the entire FinOps profile into one MCP server. Why I think it really stands out is that it composes for a job to be done, and it does not wrap APIs one-to-one. It provides higher-order tools.
The other thing with billing and cost management is that you have to deal with pagination a lot, and you don't necessarily want to rewrite pagination handling over and over again for different API calls. The same goes for handling errors. Agents benefit from much more descriptive error messages, and having this centralized billing server makes the mental model much more tractable: you can craft error messages for an agent that is acting as a FinOps employee in your company. You can return these custom error messages versus a 500 or some other raw error from AWS that might not necessarily help an agent do the FinOps job.
So this is a place where consolidating a server as macro tools around jobs to be done can help you with your mental model as well. We're showing you this not just because we want you to use the server, but because we hope you can see from some of the patterns here how to architect your own. Another reason this server stands out: if you've used the cost and usage reporting and Cost Explorer APIs, you know they have large responses, pagination, and that sort of thing. Many folks just use the console and have a much more pleasant experience that way because it's all built in. But if you're building this yourself with the APIs, you have to deal with a lot of stuff: pagination, analytics. What we do in the server is build in a small in-memory SQLite table or tables, so cost and usage data gets loaded into your instance of the server at runtime.
This also makes it really easy for an AI agent to actually understand the billing. It doesn't have to understand how to query cost and usage reporting on AWS; it just needs to understand SQL, and we've known for maybe two years now that models write SQL pretty well. So the main interface for billing and cost management is actually SQL, through the tools we've developed here. I'll show you that right now. We have this unified SQL interface, basically querying a persistent session database that gets loaded. Let me go over to that.
Everything is open source, so you can take a look at all of this on your own and see how to extend it or fork it for your own internal usage. We have some helpers here for validating the SQL queries that the agent writes. Finally, we have the execute query function, which is what's actually put behind the tool calls the agent uses. This is generally how the FastMCP framework works: you put a tool decorator on a function, and it becomes a tool available on the MCP server once you start it. Once you apply this decorator, which in effect wraps the function, the agent sees it as a tool to execute a SQL query with optional column definitions. At this point, the database has already been loaded with your cost and usage data, and your agents interact with a local database instead of calling AWS APIs over and over.
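Here is a heavily simplified sketch of that shape: cost data loaded into an in-memory SQLite session database, with a single decorated tool exposing SQL to the agent. The schema, the validation, and the loading step are all illustrative; the real server's code is in the repo.

```python
# Sketch: SQL as the agent's main interface to cost and usage data.
# Schema and validation here are illustrative, not the BCM server's.
import sqlite3

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("bcm-sketch")

# In-memory session database, loaded at runtime from the billing APIs.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute(
    "CREATE TABLE cost_usage (service TEXT, usage_type TEXT, "
    "period TEXT, unblended_cost REAL)"
)
# In the real server, rows come from paginated Cost Explorer / CUR calls.
db.execute(
    "INSERT INTO cost_usage VALUES ('AmazonS3', 'TimedStorage', '2025-11', 412.07)"
)

@mcp.tool()
def execute_query(sql: str) -> str:
    """Run a read-only SQL query against the session cost database."""
    if not sql.lstrip().lower().startswith("select"):
        # Descriptive errors help the agent self-correct on the next turn.
        return "Error: only SELECT queries are allowed against session data."
    try:
        rows = db.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"SQL error: {exc}. Check table 'cost_usage' and column names."
    return "\n".join(str(r) for r in rows) or "Query returned no rows."

if __name__ == "__main__":
    mcp.run()
```

Note how the error strings are written for the agent, not for a human log reader; that is the descriptive-error point from earlier in practice.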
We actually first built the server for our customer. They said, "Look, we have a lot of people, and some people aren't necessarily dedicated to FinOps, but they have to do cost and usage optimization all the time." Their biggest pain point was S3 object storage and figuring out how much they're spending on a given prefix in a bucket and account. For those of you who haven't used that or had to do that before, what you have to do is use Storage Lens to get very fine-grained details about your S3 storage costs at the prefix and bucket level. That also gives you more insight into the storage class and lifecycle rules that you have on a bucket.
So we said, "Great, we can do cost explorer, cost optimization, RDS reserved instances, and saving plans, but what about storage?" That's where we decided we needed to make this into one FinOps MCP server. We had been thinking about it, but when we had a customer tell us they wanted it, that really motivated us to get started. This one spans multiple services. S3 Storage Lens puts data in S3 buckets or Redshift, and you can set a few different destinations. However, it doesn't have a data plane API by itself. It has a control plane API, which means you can configure Storage Lens and start collecting data from the S3 buckets, but it doesn't have its own data plane API to actually query. You have to get the data from an S3 bucket or a Redshift data warehouse. You could use Athena, but we decided the agent needs to be able to make sure that S3 Storage Lens is enabled and then access the Storage Lens data.
The Storage Lens tools on the BCM go through the whole workflow of validating that there's a manifest for the Storage Lens and reading the manifest file so it understands bucket names, object names, and keys. From the manifest, we get the locations of the actual data files. These are just internal functions at this point that we use in the tool itself. We have different handling for CSV and Parquet on S3. Right now this only supports the S3 target, and then we have an Athena handler. Again, multiple services, but we wanted to take out all the guesswork and set up cross-service integration. You just have this Storage Lens tool with an Athena handler to actually create the tables against the Storage Lens data file locations based on the manifest.
Finally, we get to the execute query method here. This is a method on the Athena handler class that is presented as a tool. For this particular one, we're not wrapping it with a decorator; we actually bind it as a tool later on at startup. By the time the agent actually sees this tool and calls it, the server will have validated the Storage Lens manifest and data location and created the Athena table against it. Now it has access to real Storage Lens data and can even make recommendations like, "Hey, this bucket over here with this prefix actually has 200 terabytes." It gets really fine-grained, versus just saying this account is using this much S3, or even just one bucket is using this much. It really gets down to the details, so you can see exactly where your spend is in an S3 bucket or across your S3 infrastructure on AWS.
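As a condensed, hypothetical sketch of that workflow: read the Storage Lens manifest from S3, then register the exported data files as an Athena table and query them. The bucket names, manifest fields, and table DDL below are placeholders, not the server's actual code.

```python
# Condensed sketch of the Storage Lens -> Athena workflow.
# Buckets, manifest fields, database/table names, and DDL are placeholders.
import json

import boto3

s3 = boto3.client("s3")
athena = boto3.client("athena")

def read_manifest(bucket: str, key: str) -> dict:
    """The manifest tells us where the actual Storage Lens data files live."""
    obj = s3.get_object(Bucket=bucket, Key=key)
    return json.loads(obj["Body"].read())

def run_athena(query: str, output: str) -> str:
    """Kick off an Athena query; the real tool would poll for completion."""
    resp = athena.start_query_execution(
        QueryString=query,
        ResultConfiguration={"OutputLocation": output},
    )
    return resp["QueryExecutionId"]

manifest = read_manifest("my-lens-export-bucket", "manifest.json")  # placeholders
data_key = manifest["reportFiles"][0]["key"]  # placeholder manifest field

# Create an external table over the exported files (real code would use the
# files' common prefix), then the agent can ask fine-grained questions.
ddl = f"""CREATE EXTERNAL TABLE IF NOT EXISTS storage_lens (
  bucket_name string, prefix string, metric_name string, metric_value double
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://my-lens-export-bucket/{data_key}'"""
run_athena(ddl, "s3://my-athena-results-bucket/")
run_athena(
    "SELECT bucket_name, prefix, SUM(metric_value) AS bytes "
    "FROM storage_lens WHERE metric_name = 'StorageBytes' "
    "GROUP BY bucket_name, prefix ORDER BY bytes DESC LIMIT 10",
    "s3://my-athena-results-bucket/",
)
```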
Building for Scale: Mono Repo Strategy and Shared Templates
So I'll start to take us out here. Thank you. Why did we set out to do this presentation to begin with? It's open source; a lot of people have already written about it and used it, hopefully most of you have, maybe even already in your coding agent of choice. But one thing we wanted to share is our experience actually building these servers. It's not just, "Hey, come and check out these servers." We really wanted to share what we've learned, how we built them, and maybe give you some tips if you're thinking about building your own set of MCP servers for your use cases in your company.
In fact, I had an ISP customer I spoke with a few months ago, and they said, "We modeled our MCP server repository on what you guys did." That was really flattering; it was an honor. So a few things I wanted to point out. One of the choices we made was a mono repo. Instead of having 60 GitHub repos, we brought them all into one place, and that gives us a lot of leverage: consistent CI/CD and consistent code quality. In fact, our code quality consistently increases; we won't accept a PR if it lowers our code quality or test coverage. We have consistent static security scanning and style, and also just discoverability. AWS is a very large company, and that's no secret, so this helped us as well as customers find all the servers in one place so we don't have to reinvent wheels.
The other thing we did was shared templates. Because it's a mono repo and it's all written in Python, there's a lot of shared code for these servers, but we wanted to ship them as individual modules. So it's a mono repo, but it doesn't ship one module. We actually ship each server as its own pip package. So we had to use some templating to share code. In a classical sense, if this wasn't a mono repo, it would be a monolith with one package, and then all the code that was common would just be all in one place. But because we had a mono repo, we had templates that we built with cookiecutter, an open source templating framework for Python.
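To illustrate the idea, generating a new server skeleton from a shared cookiecutter template looks something like this. The template path and context keys are hypothetical, not the repo's actual template:

```python
# Hypothetical use of a shared cookiecutter template to stamp out a new
# MCP server package inside the mono repo; paths and keys are illustrative.
from cookiecutter.main import cookiecutter

cookiecutter(
    "templates/mcp-server",   # shared template with CI, lint, test scaffolding
    no_input=True,            # non-interactive generation
    extra_context={
        "project_name": "my-team-mcp-server",
        "description": "Workflow-oriented tools for our internal platform",
    },
    output_dir="src/",        # each server still ships as its own pip package
)
```

The template carries the shared boilerplate, so each generated server stays an independent module while inheriting the repo's conventions.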
So just some highlights on how we built that. I'll take us out here, and we'll be available for questions just outside of the speaking area. The main thing, especially for coding assistants now, is that these servers can be used anywhere that you can use MCP. That's one of the virtues of MCP, as Paul mentioned from the beginning. You can use it wherever it's supported. We really wanted when we set out to do this at the beginning to create context-aware coding that actually understands AWS patterns, best practices, and anti-patterns.
We wanted to avoid having to repeatedly specify which AWS services to use, like DynamoDB or API Gateway in a particular way. The cross-service, cross-API approach really takes out the guesswork for the complex workflows you may have across the AWS portfolio. When implementing solutions in your MCP servers, focus on creating workflows at the highest conceptual level that you can wrap as a single tool. That's generally the pattern you want to follow.
Through this, we've accelerated our own capabilities. At Amazon, we call it dogfooding, which means we use our own tools. We've used this in our prototyping practice and our professional services practice daily. There's not a day that goes by without me using one of the MCP servers. It's accelerated our prototyping and build capabilities from weeks to hours.
One of the things that was really important to us is that this has lowered the learning curve for AWS. It feels like an extension of your code, whatever your actual business logic is. When you use these AWS MCP servers, they bring in all of that context for infrastructure code, IAM policies, cost optimization, and all of those things. This allows you to really focus on your differentiators and achieve faster time to value. You're not having to context switch between documentation and your code. All of that context gets brought into your coding assistant as and when needed.
Closing: Open Source Contribution and Community Engagement
I want to thank you. This is open source, so you can get involved. Please take a picture if you'd like. I see some phones out, which is awesome. While you have your phones, please take two minutes to do the post-session survey. Your feedback is really important to us. If you want to see more of me and Paul, or if you don't like our voices, let us know. Your feedback matters a lot to us, and we greatly appreciate it. We've baked in a few minutes here just for you. You should see it on the AWS events app where you can put in your feedback.
Check out the articles and resources if you want to learn more about what other people are saying about this. I think that's one of the greatest privileges we have—hearing what other folks are actually saying about this and using it in the real world. Contribute to it. It's open source under Apache 2.0. There are a lot of AWS service teams contributing, including AWS field teams like us, but we also have community contributions as well.
A great example is the core MCP server. Maybe you have a certain set of personas, or maybe you just want to look at the code and get inspiration for your own use cases. Please steal liberally from that. It's Apache 2.0, so there are basically no restrictions on that. With that, thank you very much. I appreciate you, and go give your solutions kung fu.
This article is entirely auto-generated using Amazon Bedrock.