Kazuya

Posted on Dec 5, 2025 • Edited on Dec 8, 2025

AWS re:Invent 2025 - BYOK: The Key to Meeting Enterprise SaaS Security Demands (ISV405)

🦄 Making great presentations more accessible.
This project enhances multilingual accessibility and discoverability while preserving the original content. Detailed transcriptions and keyframes capture the nuances and technical insights that convey the full value of each session.

Overview

In this video, AWS Principal Solutions Architects Peter O'Donnell and Jenn Reed explain how ISVs can implement customer-controlled encryption using AWS KMS for data protection. They cover the evolution from expensive single-tenant HSMs to flexible cloud-based KMS, clarify different BYOK (Bring Your Own Key) implementations including import key material and cross-account key sharing, and demonstrate envelope encryption patterns. The session includes live code demonstrations of the AWS Encryption SDK and database-specific SDK for DynamoDB, showing hierarchical key rings, multi-tenant key management, and branch key caching strategies. They discuss cross-account resource policies requiring only encrypt, decrypt, generate data key, and describe key permissions, emphasize CloudTrail logging for auditability, and address the newly announced partner-managed delegated permissions feature. The presenters also explore advanced topics like searchable encryption, homomorphic encryption capabilities, key rotation without re-encryption, and trade-offs between security and performance in multi-tenant SaaS architectures.

This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Introduction: The Importance of Data Protection and Customer Key Management for ISVs

Hi, everybody. Thanks for being here. We know that investing in travel and being here is no small thing, so welcome to Las Vegas and welcome to re:Invent. We know that customers have this expectation around where their data is, and ISVs have this covenant that you are holding and handling customer data on behalf of that customer. If you're doing it on AWS, there are some very definite ways to not only secure customer data but to provide the necessary assurances and transparency so that customers can form independent opinions that you're doing the right thing over time.

My name is Peter O'Donnell. I'm a Principal Solutions Architect and I've been with Amazon Web Services for just about 10.5 years. This is my 12th re:Invent, and I'm joined today by Jenn Reed, another Principal Solutions Architect who covers one of our largest and most strategically critical ISVs. We're going to talk a little bit about what it means to protect data and where cryptography plays a role in that.

In the history of what it means to manage keys, we have come a long way from very expensive, hard to manage, cumbersome, single-tenant, gigantic HSMs that live in your own building to a much more flexible cloud-like capability with AWS KMS, our Key Management Service, which does still indeed have very expensive KMS HSMs behind it. We'll talk about why customers care about bringing their own key and what BYOK means, because as it turns out, we've overloaded that term in our history. We'll talk about two different ways of approaching what it means to protect and secure customer data with customer keys, and then finally, Jenn is going to show you some really neat software and some code that we've developed to give you an example and a taste for what it takes to implement these things.

Understanding BYOK: From Import Key Material to Customer-Controlled Encryption

I mentioned that we've overloaded the term BYOK. In the beginning, we had a very large and important financial services customer who demanded that we create something that was a big red button, in their words. This became a feature known as import key material. If you're familiar with KMS, there's this idea of a customer managed key, a CMK. You think about it as a key, and it has a key ID and it even has an ARN, an Amazon Resource Name. But the CMK is not the actual key. The key is a notional construct, and there is in fact key material within it.

The original BYOK feature allowed you to import your own key material into that construct, and that was actually not the interesting part. The interesting part is that we allow you to burn that key material instantaneously while keeping the construct of the CMK. That allows a customer to hit that big red button, remove the key material from inside that CMK, but later reimport it and undo that big red button. What we're primarily talking about today when we say BYOK is allowing customers to have their own key material and their own key to protect their data, and that is the high order tenet.

The high order tenet is that customer data ought to be protected by customer keys. This is the expectation that the engineers have. This is the expectation that the executives have. This is the expectation that regulators and other third-party stakeholders may have. It's a very simple, straightforward idea: if it's customer data, it should be protected with customer keys. We take that tenet very seriously in our own services. When we store and process data on your behalf across dozens and dozens of services from databases to file systems to the object store and everything else, we allow our customers to protect their data with their keys.

As ISVs, you have the same covenant and the same set of expectations for largely the same reasons. Slack was one of the first big ISVs that really got this right. If you use Slack, and we certainly do, there's a lot of confidential stuff in Slack, whether it's messages, the contact data itself, exchanging files, and of course Slack these days can do lots of other really interesting collaboration things. But this idea of what it means to have your own key is often perceived differently by different customers. We generally take it to mean that it's a customer CMK in a customer account that is shared cross-account to the ISV to make use of, and that's what we're going to talk about.

Why Customers Demand Control: The Big Red Button, Observability, and Compliance

When customers think about this problem, I think it's important as ISVs that you understand where they are coming from and why.

The customer's position is: I need my keys to protect my data. I mentioned this idea of the big red button. Because of this unique role where you are handling and processing customer data on their behalf, trust is core to your business; it's a covenant of trust. But customers still very often believe that they need a big red button for the case where you have an event or they spill something into your software. If your product handles logging, for example, you should already know about log spills. The customer will want a mechanism to immediately disable access to that data, and that's the big red button.

But there are also dimensions of assurance that are required here. Customers want to observe how you are using the key material over time and how you are implementing the covenant of what you agreed to on paper or what you agreed to in a meeting. That's also part of what it means to let them use their own key. On AWS, we log almost all of our APIs into CloudTrail. I say almost all because there are data plane events in S3 that you actually have to turn on. But KMS logs everything—all encrypt, decrypt, generate data key, describe key, and a bunch of other APIs are in CloudTrail.

What's interesting about KMS is that if you use it cross-account, where a calling principal in one account using S3 references a key in another account, we output two events. Identical events, but into two separate trails. The trail of the calling principal, which might be your ISV service account, and the trail that belongs to the account that holds the resource—that is, the key. Customers really like this. You'll still have your own CloudTrail where you can see your use of the customer's key. But because the key belongs to the customer in the other account, they will get corresponding CloudTrail records as well, and this is very important.
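As a rough illustration of what either party can do with those records, the sketch below tallies KMS API calls made against one key from a list of already-parsed CloudTrail records. The field names (`eventSource`, `eventName`, `resources`, `ARN`) follow the CloudTrail record format; the key ARN, the sample records, and the `summarize_key_usage` helper are hypothetical and only meant to show the idea.

```python
from collections import Counter

def summarize_key_usage(records, key_arn):
    """Count KMS API calls per event name for one key ARN."""
    counts = Counter()
    for record in records:
        # Only KMS events are of interest here.
        if record.get("eventSource") != "kms.amazonaws.com":
            continue
        arns = {res.get("ARN") for res in record.get("resources", [])}
        if key_arn in arns:
            counts[record["eventName"]] += 1
    return dict(counts)

# Hypothetical sample data, not real CloudTrail output.
KEY_ARN = "arn:aws:kms:us-west-2:111122223333:key/example-key-id"
SAMPLE_RECORDS = [
    {"eventSource": "kms.amazonaws.com", "eventName": "GenerateDataKey",
     "resources": [{"ARN": KEY_ARN}]},
    {"eventSource": "kms.amazonaws.com", "eventName": "Decrypt",
     "resources": [{"ARN": KEY_ARN}]},
    {"eventSource": "kms.amazonaws.com", "eventName": "GenerateDataKey",
     "resources": [{"ARN": KEY_ARN}]},
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "resources": []},
]

usage = summarize_key_usage(SAMPLE_RECORDS, KEY_ARN)
```

Because the customer receives corresponding records in their own trail, the same kind of tally can be run on both sides and compared.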

I mentioned this idea of the big red button, but disabling and revoking access is just one part. Customers want to be able to control the lifecycle of that key, maybe give you a new key after a period of time. By thinking of it this way and building this support into your products from the outset, you can support that. You can say, "We're going to start using a different key. Sure, go into the settings control panel and change the ARN of the key, and of course make sure you properly authorize it."

There's also a compliance dimension to this. Whether it's compliance with an external entity like a regulator or anything else, or even just internal controls compliance, being able to protect the data is not enough. You also must be able to tell that story and provide assurances over time. In an international market, the idea of data sovereignty becomes even more important. Especially in Europe now and increasingly in jurisdictions all over the world, there is a heightened emphasis on the core tenet that my data should be protected with my keys. That can help lead to data sovereignty outcomes.

Third-party stakeholders also love this. At the beginning of my career, I worked for a very large bank through AWS. I worked for AWS, but they were my big customer, and they made a clear decision early on that all data in the cloud would be encrypted. Today, that's a very normal thing to say. Ten years ago, it was actually pretty revolutionary that they were going to move a bank system of record to the cloud at all and demanded ubiquitous encryption. But today, this is table stakes.

If your product doesn't support these capabilities, there will be customers who will see that as a gating decision—a purchasing gating decision. If they can't have their data protected by their keys, they might not buy. So the motivation for you is, as I just said, that this is table stakes. Having been here since 2015, I can tell you when this wonderful customer of ours first said ubiquitous encryption, it was a big idea. We didn't really have ubiquitous encryption at the time in 2015. Today we do. But that also means that your products must as well.

Even if you support encryption, you need to be able to tell a rich story. If it's just an encrypted file system and there is no separate authorization once the file system is mounted, that may not solve your customer's requirements either.

AWS KMS: A Trusted Foundation for Customer Key Management

Now the good news is customers tend to trust AWS KMS. I can tell you firsthand that a lot of the first part of my career here was explaining to customers how KMS worked, explaining the HSMs, the design of the HSMs, and our regulatory position regarding FIPS or anything else. Now customers generally support KMS. They like its characteristics, they like that it comes from us, and they like that it's FIPS validated now with the 140-3, which is the new program that succeeded 140-2, and at level 3. If you're not a FIPS nerd, it doesn't matter. Just know that those are the table stakes, but we have it.

Achieving this kind of compliance coverage is really critical for customers, especially as they tell their own story. Their data that they are holding is protected by KMS. Why wouldn't their data that you are holding also meet that same bar? Now what's cool is that KMS is really flexible and lovely. It integrates with all of our managed services, and you can integrate it with your own software. In the second part of the presentation, Jenn will show you what that means because we have built some really nice SDKs to make this easier for you.

We have an SDK for database services that even allows a little bit of searching over encrypted data in DynamoDB, and an SDK for general client-side encryption, whether it's a blob or anything else. We've got some SDKs that integrate with KMS, but they certainly don't lock you into it. One thing that will be shown is that you have alternatives for the key provider. It works out of the box with KMS, but if you want to hook it up to something else, you certainly can.

KMS solves a lot of problems for you. The soul of what we do here at AWS is what we call undifferentiated heavy lifting. We've already figured out how to scale KMS and how to keep it healthy. It is our highest level of SLA. We deliver KMS at an SLA of five nines, and in fact the reality is significantly more available than that. There's already transparent key rotation built into KMS where we will increment the underlying key material, and there's just a lot of genuine value that comes from the fact that you're integrating with KMS.

KMS Architecture Options: Native, CloudHSM, and External Key Store (XKS)

So there are a couple of different kinds of KMS. They all have the top layer, the KMS APIs, and then there's the second layer, which is where the keys actually live. The best solution for effectively all customers is native AWS KMS, where the key material is both generated within and used for cryptographic operations within KMS multi-tenant HSMs. These are the real deal. They are bespoke, they're built to our design, and they are FIPS validated. In fact, the entire box is FIPS validated. It's not just a little widget inside of a plain vanilla server. The entire module is a validated FIPS module.

But there are some alternatives that will be relevant for your customers. I mentioned earlier that the original definition of BYOK allowed for imported key material. We call this an imported CMK, where the CMK construct exists with the KMS API, but the key material is injected by the customer. Some customers came to us and said they couldn't quite convince everybody that a multi-tenant HSM is a good idea. I think it is, and I'm an expert in this stuff, but sometimes people are skeptical about what a multi-tenant HSM means. So we offered an alternative.

We have a CloudHSM product that is in fact a single-tenant HSM, and you can graft that onto the top layer of KMS. That's called a custom key store, and it does exactly what it says. If you've got a stakeholder or a customer stakeholder who absolutely can't accept using native KMS, you can hook CloudHSM up to it and get that outcome. We built another thing that we don't think anybody should use, but we built it because customers demanded it of us, which is practically the reason we build anything. That's called XKS, the external key store. This allows for what you will sometimes hear referred to as HYOK, hold your own key.

The cool part about the cloud is that we do all of the impossible part, from the ground to the concrete to locking the doors at night, all the way up the stack. Some customers, whether because the requirement is placed upon them by a third party or because there's somebody in the security organization that needs a warm blanket, which is an acceptable reason, believe that the key needs to live in a building that they control. It needs to be in their house. We'll build it, and we built it. This allows you to have an HSM that lives in your own office building but is hooked up to the front end of KMS, and that's called XKS. Probably don't let people do this. It comes with scaling and availability characteristics that are going to be radically different than regular KMS, where we do the scaling and availability for you, but it is out there.

Encryption Fundamentals: Envelope Encryption and the Key Hierarchy

The good news is that as you build to integrate with KMS, your customers still get all this flexibility. You yourself as the ISV don't really need to worry about this part, but it will be relevant for the conversation. Let's do a quick primer. I think a lot of people think they know what encryption is, but as it turns out, encrypting data is the easy part. All you have to do is combine a data key with data, and then you get ciphered data, encrypted data. That is categorically the easy part. That's what you're probably already doing.

But the problem is, what do I do with this data key? Where do I put it? How do I keep it safe over time? The answer is that you combine that data key with another key and you encrypt the key with a different key. This is known as wrapping, wrapping like a present. You're going to wrap that data key with the KMS key, and now you have an encrypted data key. So where do you store that? Well, as it turns out, the right place to store it is just to store it with the encrypted data. You store the data encryption key. I'm going to start calling that a DEK. You store the DEK wrapped with the top key that lives in KMS and you put it in one package, and then you put it somewhere safe, like S3.

The question is, you told me that you start with a DEK, you've got a data encryption key, then you have to protect that with another key. Great. Is it turtles all the way down? What do I do with the KMS key? That's where we come in. We do the rest of it. If you want to learn about the full key hierarchy of KMS, you can find that online. It's actually a pretty interesting story. We take care of all that other hard part. Of course, we protect your KMS keys at rest by encrypting them again, and then there's a whole thing, and it kind of is turtles all the way down, but it ends in printed out paper that if we ever lost power, we could turn the stack of turtles back on the hard way. We've never had that, knock wood, but we do have a plan for that.

That's the primer for encryption. We call this pattern of unique DEKs that protect data, with DEKs that are then protected by a higher order key, envelope encryption. This is the way all of it works at AWS. This is the way you should probably do it as well. Jenn is going to talk about this idea of a key ring, about starting your own key hierarchy with a key that you get out of KMS.

Implementation Strategies: Cross-Account Key Sharing and the Data Key Broker Pattern

That all sounds great. Now we know why customers care. We know what they're looking for. We kind of understand envelope encryption. How do you do it? How should a customer let you use their key material to integrate with your software to protect their data that is in your hands? There are a couple of simple ways to do this. If you're just writing objects into S3, just do cross-account key sharing. End of list. But if you're holding a multi-tenant database with individual customer records, it's a lot more complicated than just giving the ARN.

If you've got your own data plane, depending on the nature of your ISV, and it's not just a relational database but perhaps a much more complex data plane that is itself probably multi-tenant, now you've got some very interesting complexities. Complexities around availability, complexities around persistence, because the only thing worse than losing data is losing a key, because losing a key is like losing data except even harder. So how do customers share their keys with you? There are three ways to do this. If what you're doing is primarily just interacting with AWS resources, that is to say you're not in the weeds of your own multi-tenant data plane, just do cross-account resource sharing.

Cross-account resource sharing is one of the oldest tricks in the IAM book. You write a resource policy that goes on the key, which we call a key policy. You delegate effectively a service principal that belongs to you, the ISV. So service.acme.corp.com is allowed to call encrypt, decrypt, generate data key, and describe key. That's the simplest part.

Some customers think, or some ISVs think, "Well, what I really want to do is assume one of their roles and then use that as the key." Why would you do that? Don't do that. Just do cross-account resource sharing. Now, sometimes vendors say, "I'll call this BYO key, BYOK, and I'll have all the key material. There's a key for Alice and a key for Bob." But it's not really bring your own key if you do that. Customers don't get the logging visibility that we talked about, and they certainly don't get the big red button that we talked about.

Your customer's level of trust in the eventual solution that you're going to market with can be greatly affected by the choices you make here. We want you to be successful. We really want our customers to be successful. You guys are our customers, but often your customers are also somehow our customers. We want you to do this the right way, which is why we're here having this talk.

Cross-account key sharing is really straightforward. Resources on AWS have their own policy document called a resource policy. When you're talking about S3, it's often referred to as a bucket policy. If you're talking about KMS, it's called a key policy. This is really straightforward. Statement: allow a principal that belongs to Acme Corp. Action: encrypt, decrypt, generate data key, and describe key. Pretty straightforward stuff.
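To make that statement concrete, here is a minimal sketch that builds such a key policy statement as JSON. The four action names are the IAM names for the operations listed above; the role ARN, the `Sid`, and the `isv_key_policy_statement` helper are hypothetical, and a real key policy would also keep the customer's own administrative statements alongside this one.

```python
import json

# The only KMS actions the ISV principal needs on the customer's key.
ISV_ACTIONS = [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:GenerateDataKey",
    "kms:DescribeKey",
]

def isv_key_policy_statement(isv_principal_arn):
    """Build one key policy statement granting cryptographic use only."""
    return {
        "Sid": "AllowIsvCryptographicUse",  # hypothetical statement ID
        "Effect": "Allow",
        "Principal": {"AWS": isv_principal_arn},
        "Action": list(ISV_ACTIONS),
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }

# Hypothetical ISV role; the account ID and role name are placeholders.
statement = isv_key_policy_statement(
    "arn:aws:iam::444455556666:role/acme-saas-data-plane"
)
print(json.dumps(statement, indent=2))
```

Publishing a snippet like this output is one way to give customers something to copy into their key policy during onboarding.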

It's essential to understand that you're only going to get authorized for the cryptographic operations. You do not need, nor should you ask for, other actions on the key. Those ought to belong to the key owner, your customer, because some of those actions can be really powerful. Disable does exactly what it sounds like. You don't need the ability to call disable on the key, and you should not ask for it. There are a couple of other very powerful actions. Put key policy: you do not need that, nor should you ask for it.

Perhaps most powerful of all, there's an API called ScheduleKeyDeletion. We don't actually let you delete a key right away because that's a very serious foot gun; you can only schedule its deletion. You do not need it, and you should not ask for it. All you need on the key is very simple: encrypt, decrypt, generate data key, and describe key.

How do you get this going? How do you bootstrap this new customer onboarding? It's probably enough to just publish some JSON. This is what it needs to look like—copy and paste it on your key policy. Your sales engineers can probably do this for your customer during onboarding or ramping or whatever. You could probably give them some CloudFormation, which would create a key and then put the right policy on it. There are lots of other ways to think about this. You want to give them a Terraform thing? Sure, whatever.

You should be reasoning about this because it's essential that your customers not only get it right so they can start receiving the benefit of your software, but it would be dumb to delay time to value just because it takes a week to figure out what to put on the key policy. You want your customers to get it right for safety and security. If they give you an Allow on kms:*, you now have too much ability for all the things I just talked about: schedule key deletion, disable, put key policy. Don't let them give you kms:* on the key policy. Only take the actions you need.
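One way to guard against an over-broad grant during onboarding is to check the statement a customer gives you before accepting it. The sketch below is a simplified illustration, not a full IAM policy evaluator: it only inspects the `Action` field of a single statement, and the `excess_actions` helper is hypothetical.

```python
# The four cryptographic actions discussed above, lowercased because
# IAM action names are matched case-insensitively.
NEEDED = {
    "kms:encrypt",
    "kms:decrypt",
    "kms:generatedatakey",
    "kms:describekey",
}

def excess_actions(statement):
    """Return the actions in a statement that go beyond what the ISV needs."""
    actions = statement.get("Action", [])
    if isinstance(actions, str):  # a single action may be given as a string
        actions = [actions]
    return sorted(a for a in actions if a.lower() not in NEEDED)

too_broad = {"Effect": "Allow", "Action": "kms:*", "Resource": "*"}
just_right = {
    "Effect": "Allow",
    "Action": ["kms:Encrypt", "kms:Decrypt",
               "kms:GenerateDataKey", "kms:DescribeKey"],
    "Resource": "*",
}
risky = {
    "Effect": "Allow",
    "Action": ["kms:Decrypt", "kms:ScheduleKeyDeletion", "kms:PutKeyPolicy"],
    "Resource": "*",
}
```

A non-empty result is a signal to go back to the customer and ask for a narrower policy rather than accept the extra permissions.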

Everybody knows the phrase least privilege. Make sure you get least privilege. There are a couple of different patterns ultimately. What I just described is really simple stuff. If I'm going to put an object in S3 and I want to protect it with your key, all I have to do is call put object and pass the key ID. But there are some more interesting and complex ways to do this, such as this idea of a data key broker.

You're going to start your own key hierarchy from the customer's KMS key. As you ingest log files or create whatever kind of unit or entity in your data plane, you can take that envelope encryption pattern and add one more layer. So it's KMS key to customer key, customer key to data keys. You can add one more layer too, depending on what makes the most sense for you. This effectively is a key hierarchy, and the data key broker pattern here is a simple way to do that.

As customers upload data, there is certainly the idea of crypto wear out. You don't want to use a single data encryption key for a colossal body of data. That has blast radius problems, and at hyperscale, there's a theoretical cryptanalytic problem with that. If you create petabytes of data with a single data encryption key, it creates an entry point for an adversary to reverse the key. It's very easy to avoid that by not encrypting everything with a single data encryption key. The workflow looks like this. These slides will eventually get published on the internet. This is effectively what the flow looks like for this idea of the broker.
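The idea of not protecting everything with a single data encryption key can be sketched as a small broker that hands out a fresh key once a byte budget is used up. This is an illustration only: `DataKeyBroker` and `fake_generate_data_key` are hypothetical names, the local random key stands in for a real KMS generate data key call (which would also return a wrapped copy of the key), and the 100-byte budget is chosen to keep the example short, not as a recommended limit.

```python
import secrets

class DataKeyBroker:
    """Hand out a data key, replacing it once a byte budget is exhausted."""

    def __init__(self, generate_data_key, max_bytes_per_key):
        self._generate = generate_data_key
        self._max_bytes = max_bytes_per_key
        self._current = None
        self._used = 0
        self.keys_issued = 0

    def key_for(self, payload_size):
        """Return the key to use for a payload of the given size in bytes."""
        over_budget = self._used + payload_size > self._max_bytes
        if self._current is None or over_budget:
            self._current = self._generate()
            self._used = 0
            self.keys_issued += 1
        self._used += payload_size
        return self._current

def fake_generate_data_key():
    """Local stand-in for a KMS data key request: 32 random bytes."""
    return secrets.token_bytes(32)

broker = DataKeyBroker(fake_generate_data_key, max_bytes_per_key=100)
first = broker.key_for(60)
second = broker.key_for(40)   # still within the 100-byte budget
third = broker.key_for(1)     # budget exceeded, so a new key is issued
```

Bounding how much data each key protects limits the blast radius of any single key, which is the concern raised above.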

AWS Encryption SDKs: Simplifying Client-Side Encryption and Multi-Region Resiliency

There are some more questions about this. How do I request and deal with the data keys? How do I deal with multi-tenancy, and where does this get solved? This is where our encryption SDKs come in. They're open source and available for a handful of very popular languages like Java and Python. It's awesome. It does exactly what it says on the tin. It's written by a group of experts, and it's open source so you can understand it, reason about it, and maybe even improve it on your own.

For client-side encryption, there are a bunch of questions. How do I properly implement it? What modes should I be using? Again, the SDK is almost definitely the right way to answer all these questions. The SDK has a message format that's pretty flexible. You can even get multi-region resiliency, either by using multi-region keys, which is a new thing we introduced two to three years ago, or by wrapping the data key under keys in two different regions and keeping both wrapped copies with the data. So if you're in Oregon, you can unwrap the data encryption key in Oregon. There's another wrapped copy of the same data encryption key that you can unwrap in Virginia.

There are two of these SDKs. One is the general purpose ESDK. This is suitable for blobs, whatever it is, a lump. The other is the database-specific SDK, aimed particularly at DynamoDB, which is probably the right data store for most things, though it certainly works for the relational products as well. The database-specific SDK has some additional considerations built into it. These are expert level software packages. They're both open source and maintained by a really brilliant group of people. The ESDK is designed for portability so that you can deal with these things.

There's a way to register multiple key providers, so that if your customer expects you to be available in both Oregon and Virginia, just as examples, there's a way to do that. The single message format makes it really easy and high speed to deal with. I mentioned this idea of where do I store the wrapped key and where do I store the encrypted data. The answer is to put them together. The ESDK does that. It automatically puts them together. Ultimately, you only need to keep track of two things. One is that lump, the data, and the other is where the key is, the key provider. The key provider is not the data encryption key, because the data encryption key is with the lump, the encrypted lump. The key provider is which KMS, KMS in Virginia or KMS in Oregon, because like most things at AWS, that will be region specific.
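The "lump plus key provider" idea can be pictured with a small data structure that keeps the ciphertext together with one wrapped copy of the data key per region. This is a conceptual sketch only: it is not the ESDK's actual message format, and `WrappedKey`, `EncryptedMessage`, `pick_wrapped_key`, the key ARNs, and the placeholder byte strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class WrappedKey:
    provider_arn: str   # which KMS key (and so which region) wrapped the DEK
    wrapped_dek: bytes  # the data key, encrypted under that KMS key

@dataclass(frozen=True)
class EncryptedMessage:
    ciphertext: bytes
    wrapped_keys: List[WrappedKey]  # stored together with the ciphertext

def region_of(arn):
    """Extract the region field from an ARN (arn:partition:service:region:...)."""
    return arn.split(":")[3]

def pick_wrapped_key(message, local_region):
    """Prefer the wrapped key from the local region, else fall back to the first."""
    if not message.wrapped_keys:
        raise ValueError("message carries no wrapped data keys")
    for wrapped in message.wrapped_keys:
        if region_of(wrapped.provider_arn) == local_region:
            return wrapped
    return message.wrapped_keys[0]

# Placeholder values; no real encryption happens in this sketch.
message = EncryptedMessage(
    ciphertext=b"<ciphertext>",
    wrapped_keys=[
        WrappedKey("arn:aws:kms:us-west-2:111122223333:key/oregon-key", b"<w1>"),
        WrappedKey("arn:aws:kms:us-east-1:111122223333:key/virginia-key", b"<w2>"),
    ],
)
```

A reader in Virginia would unwrap the second copy locally, while a reader in a region with no matching copy falls back to a cross-region call against the first provider.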

This is a little bit of an overview. There are some details left out of this for the simplicity of the diagram, but this is basically what that looks like. You call through to the ESDK. The ESDK calls down to the operating system to get the right crypto. Every operating system is going to know how to speak AES-GCM, which is a mode of operation for AES. We get the blob that comes out, what's in the lower right there, where you get everything necessary together. I mentioned this idea of a hierarchical key ring, and this is where I'm going to turn it over to Jenn. There's a deep explanation of this, and practically everything I said is examined very deeply in our documentation. So hopefully this is a pretty good primer.

Code Walkthrough: Implementing Hierarchical Keys with the Beer Store Application

Visual Studio is your friend. I hope everybody can see that pretty well. So, how many people in here understand Java? This will look familiar to you. This is a Java application called Beer Store, which is an example application. It has been constructed to show the hierarchical key. I want to start by looking at how those tenant data keys are created, and then we'll take a step back to understand where that's stored in DynamoDB and where that's declared. Taking another step back from that, we'll look at the actual KMS so you can understand where the KMS ID and those components are configured and declared, and then get called down here. This way you can understand the context from this perspective.

So if we look down here, the most important part is that you're going to have AWS credentials. In this implementation, when you're onboarding a tenant, each tenant is going to get their own KMS ID, their own client ID, and credentials. This is so that each one of your tenants will be able to understand from a process perspective which one is used for which key. For SaaS providers, you want to make sure each customer that you have, each tenant that you have, is able to manage their own keys independently, not all on one cycle.

If we scroll down a little further in that onboarding process, each tenant gets a branch key along with their credentials, for use within the application that calls it. When a tenant is offboarded, because we cache those keys in DynamoDB, they age out. The caching default here is set to 15 minutes; the key is deactivated within the application so it can no longer be used, and then it comes out of DynamoDB itself. You can see that after it has expired, the key is no longer available.

If we scroll down here, you're going to see how the rotation of each of the keys works. If you're going to have customers managing their keys, you want them to manage the rotation of those keys. Each customer is going to have different expectations of what that rotation cycle should be based upon what their internal compliance or security program requires. Then scrolling down, we can see how you might want to handle multiple tenants' data at a single time. Here's a key ring, a multi-tenant key ring handling multiple customers at the same time, but each of those are cached separately.

One data stream can't read another: even though they're being handled by the same process, each data stream is only readable with the encryption keys stored for it. At the very bottom, you can see that you can support both mechanisms. Setting up, establishing, and maintaining KMS can be quite a process. For most customers, when you're setting up key rings and those KMS keys, holding them in a multi-tenant key ring like the upper example makes a lot of sense. But if you have a customer who says, no, I really want my KMS key handled as a single tenant, in a process with only my data, then here's an example of that, with just a single tenant within that key ring being handled at one time. So there are two ways of doing that.

Now, if we take a step back, remember what I said about how that expiration works in DynamoDB. Here is where the data key broker for the DynamoDB table is actually set up, and you can see how each of those data objects is configured.

It's important to understand what's established here. The critical part is really understanding the coordination of those keys for each tenant within DynamoDB. When you're actually rotating them, you set up your intermediate keys. Down here is where you're setting up your key ID and the branch ID, which is per tenant, along with the encryption context and the actual credential provider that's being called, which is defined in the other Java object we'll go into next. For that branch key request, here's how that's being called. The default time shown here is how long you're going to hold on to a key in the cache before a deletion takes effect. The sample application uses 15 minutes, but it can be 5 minutes or 1 minute, however long you want to hold it cached.

That caching means if someone deletes it, it will then no longer be in DynamoDB after that. However, it may mean you're going to make requests to DynamoDB more frequently to refresh your cache. You have to pick that parameter of how long you want to cache for, because that's when something marked for deletion will actually disappear from your cache. But that also means you have to be querying DynamoDB to actually refresh your cache. It's really about what your application is doing and what that means from a deletion perspective, and letting that guide your decision.

If most of your customers are happy with 15 minutes or 5 minutes but you have one customer that needs it to be a minute, then in that instance it might make sense to have most of them under the multi-tenant aspect and then have the other one on the single-tenant aspect with that more reduced time frame. That way you can handle their refresh without having to refresh every tenant's keys at the same cycle. Does that make sense to everyone?
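To make that trade-off concrete, here is a minimal sketch of the two-cache arrangement just described: one shared cache with a relaxed time to live for most tenants, and a dedicated cache with a short time to live for the one strict tenant. Everything in it is hypothetical: `TtlKeyCache`, `fetch_key`, and the tenant names are illustrative stand-ins, not the cache or key store API of the AWS Database Encryption SDK.

```python
import time
from typing import Callable, Dict, Tuple


class TtlKeyCache:
    """Toy key cache with a single time to live (TTL) for all entries.

    Hypothetical illustration only. `fetch_key` stands in for whatever
    call reads a tenant's branch key from the key store.
    """

    def __init__(self, fetch_key: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_key = fetch_key
        self._ttl_seconds = ttl_seconds
        self._clock = clock
        # tenant_id -> (key bytes, expiry timestamp)
        self._entries: Dict[str, Tuple[bytes, float]] = {}
        self.fetch_count = 0  # how many times we went back to the key store

    def get(self, tenant_id: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(tenant_id)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh, no key store call
        key = self._fetch_key(tenant_id)
        self.fetch_count += 1
        self._entries[tenant_id] = (key, now + self._ttl_seconds)
        return key


def fake_fetch(tenant_id: str) -> bytes:
    """Stand-in for a key store lookup (assumption for the example)."""
    return ("key-for-" + tenant_id).encode()


fake_now = [0.0]  # controllable clock so the example is deterministic
shared_cache = TtlKeyCache(fake_fetch, ttl_seconds=15 * 60, clock=lambda: fake_now[0])
strict_cache = TtlKeyCache(fake_fetch, ttl_seconds=60, clock=lambda: fake_now[0])

shared_cache.get("tenant-a")
strict_cache.get("tenant-strict")
fake_now[0] = 61.0  # just over a minute later
shared_cache.get("tenant-a")       # served from cache
strict_cache.get("tenant-strict")  # expired, so fetched again
```

Keeping the strict tenant in its own cache means only that tenant pays for the extra refreshes; the shared cache keeps its longer cycle.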

So if we take a step back to look at the KMS client, here you can see how it's established from the SDK, setting up the individual KMS keys that, scrolling down, the service client will actually use when the application is running. Any questions? It's pretty straightforward. A lot of our security and observability customers use this, and some of our business apps like Slack and Zoom, and some others, use the same sort of concept. It's really about how you expose that so the customer is providing access appropriately to the KMS without having to plug in the ARN.

Partner-Managed Delegated Permissions and Addressing Customer Concerns About Caching

Peter was explaining earlier how you can give them CloudFormation scripts to do that. But on November 19th of this year, prior to re:Invent, the IAM team announced partner-managed delegated permissions, which actually makes this a little easier. Within your app or SaaS application, you can make a request via IAM. In order to register, you have to provide policy boundary templates to the IAM team that are approved by AWS; each of those is a scoped-down policy. From your SaaS platform, you can make a request on the customer's behalf to the administrator, or the user with the appropriate permissions in IAM, for that administrator to approve or deny.

What that does is then allow you to actually update the permissions via CloudFormation or Terraform, leveraging that temporary delegation to deploy a role or deploy sharing of KMS. But it can only do that one time. After that point, I know we talked about encrypt, decrypt, and generate data key, but most observability and security providers tend to just need decrypt because they're not encrypting the key. They want to decrypt it so that they can read it and then store it on their side in the data store that they're using. But they don't necessarily want to wrap it in an envelope on their side. Then they make the request back to actually read the data using your key, so using that cross-account role, but they never want to do any physical encrypting of additional data, and you wouldn't want them to, because that can be a bit risky. The same goes for generating the keys. The customer should always generate those keys.

There are different things you want to think about from your application's perspective in the design. What do you really need, and is it something you just need in order to set up? In that case, you can use a partner delegated permission request for a one-time action, with permission boundaries. Or is it something that you're going to want as a general rule for encrypt, decrypt, or generate data key? There are ways you can spread that out so at any one point in time, you can give that granular access to your customer.

What I think is really powerful is the ability for you to make those specific requests and policy statements so the customer can make those choices. You're not having the friction of having them go talk to the DevOps person to provide the script, to the template, to the right account. It'll initiate it in the appropriate account for that appropriate admin to approve or deny. So combining these capabilities together gives you as an ISV the ability to help customers bring their own keys, but do it in a way that makes the most sense. We're always encouraging our ISV partners to build on AWS so you can use KMS where appropriate, and then if the customer does have an on-premises key store, you're still able to access that by leveraging the KMS construct to import as well.

Any other questions? Yes: with branch keys, do you see any pushback from customers who are really focused on auditability, since you're caching the key and the usage isn't showing up in logging? I think you can have a patient conversation with the customer to explain. The question there is that with branch keys, because you're not going back to KMS every time, you're not going to get KMS CloudTrail events for all the things you're doing. True. I'd like to believe customers will accept this (maybe this is me being a naive solutions architect who always believes I could convince someone of something if I'm right), but the question is ultimately: if you want hyperscale and performance, you can't emit logging every time you touch something. It probably shouldn't be at the top of your list, but you could emit your own logging.

Absolutely, you should always emit logging to provide the telemetry for those actions, because it's also in here. I scrolled past it, but you're also logging the number of times your application is making use of that key. And then of course that would be emitted in your application logs. But I would tell you, the question of caching anything is like this: it also maybe undermines the big red button, right? If I'm caching, if I've invented a new top key and I've got that top key in the clear, the big red button goes away, and a lot of things go away. Your colleagues in your security team, in your application security team, and the people that are doing your SOC report should all have strong opinions about this. Caching is one thing, but committing that top key to non-volatile storage in the clear is one of a lot of things that of course you should not be doing.

Yes, I just want to add one more thing. Kevin Lee, one of the product managers for KMS, has a point. There are some out-of-band ways to check whether you still have permission to call the customer's key, even when the caching time frame you promised the customer is, say, five minutes. That's a really interesting point. I think the point that Kevin's making there is that there are some out-of-band ways to check to see if you still have the authorization in the first place, and you could use that as a signal to drop that cache, right? Should I repeat that back? Okay, time to ask some questions now. I couldn't hear. Should we do more questions? Yeah, the question is, should we do more questions? Yes, absolutely, ask your questions.
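As a rough sketch of that idea, the function below walks a cache and drops the entry for any tenant whose authorization check fails. The session does not say which out-of-band check to use, so `still_authorized` is a purely hypothetical callback that you would back with whatever signal you trust; the cache is modeled as a plain dictionary.

```python
from typing import Callable, Dict, List


def evict_revoked_tenants(cache: Dict[str, bytes],
                          still_authorized: Callable[[str], bool]) -> List[str]:
    """Drop cached keys for tenants that no longer authorize us.

    `still_authorized` is a hypothetical out-of-band probe (an assumption
    for this example, not an AWS API). Returns the evicted tenant IDs.
    """
    revoked = [tenant_id for tenant_id in cache if not still_authorized(tenant_id)]
    for tenant_id in revoked:
        del cache[tenant_id]
    return revoked


# Example: tenant-b has revoked access, so its cached key is dropped
# immediately instead of lingering until the time to live runs out.
key_cache = {"tenant-a": b"key-a", "tenant-b": b"key-b"}
evicted = evict_revoked_tenants(key_cache, lambda tenant_id: tenant_id != "tenant-b")
```

Running a check like this on a short schedule lets you keep a longer cache lifetime for performance while still reacting quickly when a customer pulls access.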

Q&A: Security Models, Key Rotation, Searchable Encryption, and Homomorphic Encryption

For large providers looking at bring your own key, there's a debate between your security team and your strategy on how to protect your data. Yes, so some of the conclusions that are coming:

Bring Your Own Key is a security model because essentially the only way we can provide good security is if you hold your own key. Third parties may provide some services, but what we showed as an example is that you own your own key and hold your own key. Each of the processes has access to it, but the admins don't have access to the processes themselves.

This is something I happen to discuss a lot. When we're talking about AI and agents, people like to think of them as people, and I beg you to stop thinking of them that way. Think of them as processes. If you think about them as processes, because that's what they are, then it's making sure that each process only has access to that component. How do I architect a solution to ensure that process only has access to that data and those keys, and another process never has it, nor can it hand the data over?

The notional idea of ownership, without pulling apart the ontological concept of ownership, is this: if it's the customer's key in the customer's account, that to me is as strong of an ownership model as if it's in their own building. I work here, but I'm very passionate about it. You own your data. If you put an object into S3, is it any less yours than if that same file was in a file system in your building? No, it's absolutely still yours.

I think external key stores, as I mentioned earlier, are a warm blanket. But to me, it is still very much your own key if it is in your account and the ISV has cross-account authorization to make use of it. The main point is this: it doesn't give you extra security if you own the key as long as the third party doesn't have access. That doesn't mean it's secure. I think that's the case.

But that's a path that leads to the idea that you don't own your own data if you're hiring an ISV to process the data, and that can't be true because then none of you would have businesses. The boundaries of ownership are important. If you're accepting the idea that you're going to hire an ISV to do stuff with your data, you are already through that looking glass. You are already asking them to touch your data.

That's also where you start to ask: if they're going to do any processing on that data, what security and compliance have they gone through? Do they have ISO certification? Do they have a confident SOC 2 program? I like to go further than SOC 2, but to each their own. What that gives you is a third-party auditor coming in and validating the controls that are put into place, making sure they're handling the data properly, ensuring that the employees never have access and that the processes themselves never write out data that isn't appropriate, and that all of that activity is logged so you have that telemetry.

I heard you say that for so long as the ISV has the authorization to use the key, you've got a security problem. I can't accept that because again, I hired the ISV to touch my data. It depends on what your application is doing. If I'm writing something and I only ever write, then I should only have encrypt on behalf of the customer and generate the key. If all I'm ever doing is reading it from, say, a SIEM or detection and response, I should only ever be able to read it. I should never be able to actually write it or generate a key for it.
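A small sketch of that least-privilege split, written as a helper that builds one statement for a customer's KMS key policy. The action names (`kms:Encrypt`, `kms:Decrypt`, `kms:GenerateDataKey`, `kms:DescribeKey`) are real KMS permissions; the `writer` and `reader` labels, the helper name, and the role ARN are assumptions made up for the example, and a real policy would likely add conditions.

```python
import json
from typing import Dict, List

# Which KMS actions each kind of ISV workload gets (labels are hypothetical).
ACTIONS_BY_WORKLOAD: Dict[str, List[str]] = {
    # Only ever writes data on the customer's behalf.
    "writer": ["kms:Encrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
    # Only ever reads data, for example a SIEM or detection and response tool.
    "reader": ["kms:Decrypt", "kms:DescribeKey"],
}


def key_policy_statement(isv_role_arn: str, workload: str) -> dict:
    """Build one key policy statement granting only what the workload needs."""
    return {
        "Sid": "AllowIsv" + workload.capitalize(),
        "Effect": "Allow",
        "Principal": {"AWS": isv_role_arn},
        "Action": ACTIONS_BY_WORKLOAD[workload],
        # In a KMS key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }


# Placeholder ARN for illustration only.
reader_statement = key_policy_statement(
    "arn:aws:iam::111122223333:role/ExampleIsvReaderRole", "reader")
print(json.dumps(reader_statement, indent=2))
```

The read-only statement never contains `kms:Encrypt` or `kms:GenerateDataKey`, which is the point being made: grant the action the application performs and nothing beyond it.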

Making sure that you understand what the action is and making sure that you don't give permission beyond that is important. There's an appropriate permission boundary in place in addition to those things. So how do you deal with different customers with different requirements? I get that some customers are going to say, "I want to hold my own key. I want to do it in KMS," and I'm an avid AWS user, so that's fine. Others are going to say, "I don't know what you're talking about, and I don't use AWS. I don't have an account."

My short answer to that is to build to KMS. If customers want to hold their own key, there's a way to do that with KMS, and it's abstracted. Does that impact how you interact with it? It doesn't change any of the API contracts. It's XKS, the external key store, where it lives in an HSM in your office building or data center. It does have implications for performance.

If you're operating a petabyte-scale data lake with 1000 nodes banging on it and you have to go back to the HSM every time, or you have to go to an on-premises HSM going over a network connection with a really small caching time, this is where it starts to become important. The caching discussion is just like in a database itself. You sometimes have to cache the most frequently accessed things, and that's why you set a time to live. Same thing for DNS. You set time to live because you need to cache the fully qualified domain names.

Now, if you have very dynamic IPs, for load balancers and things like that, your time to live is probably in seconds. But if you're sitting at the top of the name space, your time to live might be 24 hours. So it's really about understanding that from a KMS perspective and architecting it in a way that makes sense based upon where the customer wants the key, how long it's cached, where it's being stored, and you as the provider.

That's why I'm saying for that really particular customer, it's that bottom example, and that time to live has to be longer because it's on-premises and you have to take into account the fact that every time you clear your cache, you're going to have to pull it back. Then you need to make sure it's distributed. So you have to architect to account for that. But to summarize the answer: the characteristic of the key with an XKS custom key store versus regular KMS does not change API interactions. It can change performance characteristics and availability characteristics of the results.
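A back-of-the-envelope way to see why the time to live matters more for an on-premises key store: if every node caches independently and refreshes each key once per time to live, the steady-state call rate against the key store is roughly nodes times keys divided by the time to live. That model is an assumption made for illustration (no shared cache, no jitter, no retries), and the function name is hypothetical.

```python
def refreshes_per_second(nodes: int, keys_per_node: int, ttl_seconds: float) -> float:
    """Approximate key store calls per second under the simple model above."""
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return nodes * keys_per_node / ttl_seconds


# 1000 nodes, one tenant key each, comparing a short and a long cache lifetime.
short_ttl_rate = refreshes_per_second(1000, 1, 50)       # 20 calls per second
long_ttl_rate = refreshes_per_second(1000, 1, 15 * 60)   # about 1.1 calls per second
```

Each of those calls crosses the network to the customer's HSM when the key lives in an external key store, which is why that tenant is better isolated from the others.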

That's why I'm saying the multi-tenant one above makes sense most of the time. If you have one that is that particular customer and they're going to be on-premises and they want to hold their own key, I am not having that impact the other tenants. Does that make sense? On the onboarding side, there's going to be lots of different processes for different requirements. Onboarding a new client involves many questions.

I'm used to thinking in terms of data and key encryption, so now my data is encrypted. We talked about rotation of the key. I'm having trouble getting past that hurdle of now I've got a petabyte of data that's encrypted. We are very opinionated about what rotation is and what it is not. A lot of people think of rotation as if you rotate, then you're going to destroy the previous key, which necessitates re-encrypting all the data so that it's now protected by the new key. That is not how we do it.

Historically, I'm very opinionated about this, let me get through this quickly. Historically, you rotated for two reasons. Number one, humans had access to it, and now that human left the building. We don't have that problem with KMS. Nobody ever touches those keys, not even us. There is no way for us to access your key material. The other reason you rotate is crypto wearout. You use the same key for all this data, and now you need another one.

The envelope encryption pattern addresses both problems. The blast radius of an individual key is necessarily very narrow because it's typically per resource or per file or whatever. There is no such thing as getting the key out of KMS. There's no API to get key material. You can't export it. And even if you could, you couldn't make use of it. As for what we do when we rotate: I mentioned earlier the idea that the CMK, the thing the ARN names, is itself effectively a vessel. Inside of that is the real key material.

If you turn on rotation on a periodic basis, by default it's one year, and did we ever lower the period for that? We did. There you go, 30 days to 7 years. I should note that it's very hard this whole week because there are going to be new things and it's hard for us to keep up. At that rotation period, we create new key material that will be used for future encrypt and generate data key events.

When we rotate, we keep the old key material. If we didn't, we would have to re-encrypt a giant body of data, which is expensive, time consuming, and really hard to get right because of edge cases. If we got rid of the key material, it would be very dangerous. So when we rotate, we just generate new key material that is used for new cryptographic operations, but we keep the old one. This is really important because if you ever have to go back in time to look at data from a period when it was previously written, you need to be able to read that data. That's why the retention for seven years is necessary.

Data that might be historically stored in Glacier is still attached to the older key material from when it was created. So when you need to access it as the owner, when the application is making the call, it can go back to that point in time. Let me restate the two big punchlines to this topic. Number one, the envelope encryption pattern takes off the table the reasons to even think about rotation ever, full stop. Number two, we have a rotation feature that you can turn on to make somebody in audit happy, and you don't need to worry about how to go back in time and use the older key material. It's being taken care of for you.
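The rotation behavior described here can be modeled in a few lines: a key "vessel" that holds every version of its key material, uses the newest version for new wrap operations, and can still unwrap anything produced under an older version. This is a toy model under loud assumptions: `KeyVessel` is a hypothetical class, the explicit version number stands in for whatever metadata lets the service find the right material, and the XOR "wrap" is a placeholder that is not real cryptography.

```python
import hashlib
import secrets
from typing import List, Tuple


def _toy_wrap(material: bytes, data: bytes) -> bytes:
    """Placeholder for real key wrapping. NOT secure, illustration only.

    XOR with a pad is its own inverse, so the same call also unwraps.
    """
    pad = hashlib.sha256(material).digest()
    if len(data) > len(pad):
        raise ValueError("toy wrap handles at most 32 bytes")
    return bytes(a ^ b for a, b in zip(data, pad))


class KeyVessel:
    """One logical key that retains every version of its key material."""

    def __init__(self) -> None:
        self._versions: List[bytes] = [secrets.token_bytes(32)]

    def rotate(self) -> None:
        # New material for future operations; old material is kept.
        self._versions.append(secrets.token_bytes(32))

    def wrap(self, data_key: bytes) -> Tuple[int, bytes]:
        version = len(self._versions) - 1  # always the newest material
        return version, _toy_wrap(self._versions[version], data_key)

    def unwrap(self, version: int, wrapped: bytes) -> bytes:
        # Old ciphertexts name the version they were written under.
        return _toy_wrap(self._versions[version], wrapped)


vessel = KeyVessel()
old_data_key = secrets.token_bytes(32)
old_version, old_wrapped = vessel.wrap(old_data_key)

vessel.rotate()  # nothing previously written has to be re-encrypted

new_data_key = secrets.token_bytes(32)
new_version, new_wrapped = vessel.wrap(new_data_key)
recovered_old = vessel.unwrap(old_version, old_wrapped)
```

Because only small data keys are wrapped under the vessel, the petabyte of data underneath never has to be touched when rotation happens.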

Now let me address the competing interests here. You want to encrypt everything, but there are trade-offs that you want to work around. Let's talk about some of those trade-offs. The classic one is searching. That's the biggest problem. If I want to have a column in a relational data store of taxpayer ID or Social Security number, but I also want to look up customers by taxpayer ID, there's not a great way to do that. There are cutting-edge homomorphic encryption options that are just finally starting to become real. We hired a trio of cryptographers who have worked together for twenty years, and there are searchable encryption options out there. You can kind of search across encrypted fields these days. It's not super high performance, and I wouldn't put it in the hot path for a really high TPS application, but there are some ways to do that.

Deterministic encryption for particularly narrow key spaces like a Social Security number or a credit card number can be super dangerous. Social Security numbers are super predictable. If you know where and when someone was born, you've already lost half of the Social Security number space because they're very predictable. Same with credit card numbers because of CVV2. There's an internal checksum to the format of a credit card number, and the first six digits belong to banks. So the key spaces get really narrow really fast, which means it's trivial to create a rainbow table of all possible values. You have to be very careful with deterministic encryption.
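The arithmetic behind "really narrow really fast" can be sketched as follows. The card number checksum used here is the Luhn check digit, which is the internal checksum built into the number format. The counts assume a 16-digit card number with a known six-digit issuer prefix, and a nine-digit Social Security number whose first five digits are treated as guessable; both are simplifying assumptions for the example, and the function names are hypothetical.

```python
def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of digits (without the check digit)."""
    total = 0
    # Walk from the rightmost payload digit, doubling every other digit.
    for index, char in enumerate(reversed(payload)):
        digit = int(char)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def candidate_count(total_digits: int, known_digits: int,
                    has_check_digit: bool = False) -> int:
    """How many values remain once known digits and a check digit are fixed."""
    free_digits = total_digits - known_digits - (1 if has_check_digit else 0)
    return 10 ** free_digits


# 16 digits, 6 known from the issuer, 1 determined by the checksum.
cards_per_issuer_prefix = candidate_count(16, 6, has_check_digit=True)
# 9 digits with the first 5 assumed guessable.
ssn_candidates = candidate_count(9, 5)
```

A space of a billion values, or ten thousand, is small enough to enumerate, so anyone who can observe or compute the deterministic output for chosen inputs can build a lookup table for every possible value.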

It's getting there, but again, the larger the field is, and if the cardinality of the field blooms like a free text field, then all the promises fall apart. If the cardinality of what you're attempting to search across is relatively finite, the solution works much better. But if it's like I want to encrypt all the notes from the call center, that's where the approach becomes problematic.

However, could I encrypt addresses and still search across it? Yes, probably. The database SDK implements this for DynamoDB, and searchable encryption in DynamoDB is available today. It comes with a number of caveats that you and your engineering team should investigate, but it's effective and it works. We have customers using it with the AWS database encryption SDK.

Your account team should follow up with us. The team working on this vanguard of exploring new capabilities is very interested in talking to customers and hearing about use cases. If you're interested in having a conversation about homomorphic encryption, have your account team reach out. If you search the session catalog for homomorphic, you'll find the sessions. I'm pretty sure Tal and Raj are talking about it this week. Tal Rabin leads a trio of cryptographers, and they all have Wikipedia pages if that helps contextualize who they are and what they're about.

We're doing some really cool work in this space. We've reached the end of our time, as the red light is flashing. Thank you all for coming. I hope you found this informative. Please fill out the survey and let us know how you liked it. We look forward to talking with you more.

; This article is entirely auto-generated using Amazon Bedrock.
