DEV Community

Kazuya
AWS re:Invent 2025-Privacy-preserving AI primitives: Building blocks for regulated industries-ARC328

🦄 Making great presentations more accessible.
This project enhances multilingual accessibility and discoverability while preserving the original content. Detailed transcriptions and keyframes capture the nuances and technical insights that convey the full value of each session.

Note: A comprehensive list of re:Invent 2025 transcribed articles is available in this Spreadsheet!

Overview

📖 AWS re:Invent 2025-Privacy-preserving AI primitives: Building blocks for regulated industries-ARC328

In this video, Ruben Merz and J.D. Bean present privacy-preserving AI building blocks for regulated industries. They cover encryption fundamentals using AWS KMS with envelope encryption patterns, client-side encryption via AWS Encryption SDK, and tokenization for reducing compliance scope. The session explores confidential computing through AWS Nitro System, Nitro Enclaves, and Nitro TPM for cryptographic attestation enabling zero operator access. Advanced techniques include federated learning for distributed training without sharing raw data, differential privacy for mathematical privacy guarantees (demonstrated via AWS Clean Rooms), and Fully Homomorphic Encryption using OpenFHE for computing on encrypted data. The presenters emphasize combining multiple layers—encryption at rest/in transit, attestable compute environments, and cryptographic computing—to achieve privacy-in-depth while maintaining cloud agility for AI/ML workloads in regulated environments.


Note: This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction: Privacy-Preserving AI Primitives for Regulated Industries

Okay, super cool. Alright, welcome to Privacy-Preserving AI Primitives: Building Blocks for Regulated Industries. This is ARC 328, and we're super happy to have you here. My name is Ruben Merz. I'm a Principal Solutions Architect at AWS, and I'm here with my colleague J.D. Bean.

Hi y'all, my name is J.D. Bean. I'm a Principal Architect in the AWS Compute and ML Services organization. I have the privilege of being embedded pretty deeply with our engineering and product teams, working on a wide range of services to help enable our customers to overcome their security, compliance, and privacy challenges. Thank you.

Alright, so thanks for being here. I've been working in regulated industries for many years now, and with AI's fundamental need for data and growing regulation, protecting that data is more critical than ever. Both J.D. and I work with customers like you, regulated customers, exactly on that topic to help you adopt and leverage the cloud. Today we want to tell you about our experience doing exactly this. We want to give you essential building blocks for you to protect this data with confidence. By the end of this talk, we want you to understand what these building blocks are and how you can get started.

Thumbnail 80

Now, this is a level 300 session, and we're assuming that you're familiar with some of AWS's core services across compute, database, and storage. You also know AWS Key Management Service, encryption at rest and in transit. Now, if you don't feel comfortable with this, don't worry, we've got you covered. We'll go through the basics, and the most important thing is that you're building systems.

Thumbnail 110

So today we cover the following. We want to double down on the urgency of data privacy in today's landscape, and we'll establish our mental model for AI and machine learning for this talk, as well as the reference scenario that we are going to use. Then we'll dive into the building blocks, and we'll close with outlook and the next steps. So let's get started.

Thumbnail 130

The Regulatory Landscape and Mental Model for AI/ML Systems

Quick show of hands, who here today is working in one of these regulated industries? Alright, me too. So you're familiar with the growing regulation, right? Today, if you're handling regulated workloads, you have to account for cybersecurity, data protection, business continuity, and more recently, AI system regulation.

Thumbnail 160

Now AI systems regulation is a fast-evolving and complex landscape to navigate. It's also super broad. It extends beyond cybersecurity into many topics that would each benefit from a full session. Today we want to address data privacy and data protection. That's because these topics have emerged as a central focus. We see convergence across industries with data protection, data privacy, and cybersecurity merging into unified compliance requirements towards three core capabilities: visibility, secure access, and control over your data.

Thumbnail 200

Now when you handle regulated data, assurance is mandatory. You have governance frameworks and accountability mechanisms that require you to have verifiable evidence that your data controls work as intended. At AWS, we want to give you the tools and compliance certifications that provide the assurance you need to meet your compliance requirements while still benefiting from the agility and innovation of the cloud.

Thumbnail 240

So the slide behind me, the picture, is the well-known Shared Responsibility Model. We take care of security of the cloud, and you take care of security in the cloud. For this presentation, we have two tenets. The first one is that we want to give you building blocks that provide controls with assurance, backed by verifiable evidence.

So how are we doing this on our side of the Shared Responsibility Model? Well, we provide you with contractual commitments, for example, customer agreement and data processing addendum. We have more than 140 security certifications and compliance statements, and we have third-party audits. Now on your side of that model, I'm sure you can find similar statements for your vendors. What we see across our customers, regulated customers, is very often for AI that you are using open-source frameworks because they provide you with complete control and visibility into how your data is being processed.

Thumbnail 310

Now another topic, this is the AI/ML spectrum. There's a broad spectrum, right? Some of you are using statistical models with just a few thousand parameters, all the way to foundation models that have billions of parameters. Now, not everyone here is using foundation models. You might also not be using deep learning, right? But you're all using data. And that's our second tenet. We want the building blocks that we provide you today to be broadly applicable in any of these situations.

Thumbnail 340

And that's our mental model for today. Basically, AI and ML systems have two core processes: training and inference. During training, your training data flows through your training process, with code that you have written, and gives you a model. During inference, you take this model and new data, it flows through the inference process, and you get an output. Now the bottom line here is that in a regulated environment, every component represents potentially sensitive data. It can be your training data sets, your training code, your model weights, or your inference prompts and outputs, and depending on your regulatory requirements or threat model, each data type might require protections and appropriate controls.

Thumbnail 390

I also want to call out or make a link to the well-known generative AI security scoping matrix. This matrix maps five implementation patterns from consumer apps, for example, like Claude, all the way to enterprise solutions like Amazon QuickSight. And as you move from scope one to five, you gain increasing control over training and inference model and data. And basically at scope one, the provider has to handle security. At scope five, you own it end to end, so all of it.

Thumbnail 420

Now last slide before we move into the first building block. We'll consider three types of scenarios for today. The first one is scenarios where you're using Amazon EC2 instances directly or maybe with Amazon EKS for training or for inference. In that scenario, you maintain control over the compute layer. The second scenario shows serverless inference here with Lambda with minimal infrastructure management. And the third scenario that you can consider is one where you leverage managed AI services, for example, Amazon Bedrock for foundation model inference or Amazon SageMaker AI for ML workflow. All of these scenarios, they can leverage various storage and database solutions to provide persistent storage. And so for each scenario, we want to discuss controls with assurance that apply broadly regardless of your deployment.

Thumbnail 470

Thumbnail 480

First Building Block: Encryption with AWS Key Management Service

So let's move into our very first building block, which is none other than encryption. Alright, so AWS Key Management Service, KMS, is your central encryption control plane. It handles authentication, authorization, and logging. Now, from an architecture point of view, the most important decision that you have to make is how you store the keys, and you have options. With native KMS, you have access to a fleet of shared HSMs that are FIPS 140-3 Level 3 compliant. That's the most reasonable option for most use cases.

The second option is using AWS CloudHSM. It provides dedicated HSMs for regulated scenarios where your regulation requires it. The third option is an external key store, where the keys are stored on premises, at the expense of possibly additional latency as well as a dependency on your infrastructure. And finally, you can also decide to import external key material so that you have cryptographic proof of origin. So there's a spectrum again. As you move up, you have greater control, but at the expense of possible latency and higher operational overhead; as you move down, you have a managed service that gives you the best performance and availability. Ultimately, which option you choose depends on your compliance and security requirements, and you can mix and match one or multiple options depending on the workload.

Thumbnail 580

Now the most important thing about KMS in that context is that keys never leave KMS unencrypted, and KMS provides four core cryptographic operations. The one I want to call out here because we're going to talk about it later in this presentation is the generation and export of data keys. KMS integrates with more than 100 services.

Thumbnail 620

For services like Amazon S3 or DynamoDB, envelope encryption is used, and we'll talk about this later. That's a pattern we're going to use and reuse throughout the presentation, and everything is monitored through CloudWatch and CloudTrail. Now let's recap encryption at rest. At AWS, we provide three approaches that allow you to balance control with operational simplicity. With client-side encryption, you encrypt your data before it reaches AWS, and for this you can use SDKs or client-specific services. The second option is using customer managed keys in AWS KMS. This gives you control over the key policy as well as the entire lifecycle of the key, and in particular in regulated environments, you control the rotation. This can be mandatory depending on your compliance requirements. The last option is the use of AWS owned keys to provide transparent encryption for many AWS services, automatic rotation, and no charges for storage and usage. Again, the choice pretty much depends on your compliance requirements.

Thumbnail 690

You can use a customer managed key to protect many machine learning assets with these keys. It can be Bedrock custom models, it can be agent session data, or model artifacts in S3. Now let's recap encryption in transit. Again, we have multiple options, so you can also use client-side encryption. Now of course, here you have to decrypt the data before you can process it. The interesting thing is that you can use the isolation properties of EC2 instances or Lambda in order to do this. At the application layer, we provide you with TLS 1.2 or 1.3 with perfect forward secrecy. This is protecting services, for example, for Bedrock or SageMaker AI, and all control plane APIs.

At the VPC layer, between compatible instances, you benefit from automatic encryption that is taken care of by the Nitro System. In addition, in regulated contexts, we recently released VPC encryption control that provides centralized encryption visibility and enforcement for all traffic within and across the VPC. Again, this is useful in regulatory contexts because that might be a requirement. Finally, at the physical layer, we encrypt all traffic at region boundaries over our backbone as well as between availability zones and over peering connections, either with VPC peering or transit gateway. So together, all these layers create really defense-in-depth protection for all of your AI and ML workloads.
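To make the application-layer requirement concrete, a client can refuse anything below TLS 1.2 explicitly. Here is a minimal Python sketch using only the standard library; it configures a client-side context rather than calling any particular AWS endpoint:

```python
import ssl

# Build a client-side TLS context that refuses anything below TLS 1.2,
# mirroring the minimum that AWS service endpoints accept.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2

# Certificate validation and hostname checking stay on (the defaults),
# so a misconfigured or spoofed endpoint fails the handshake.
assert context.verify_mode == ssl.CERT_REQUIRED
assert context.check_hostname is True

print(context.minimum_version.name)
```

Any `wrap_socket` call made with this context will then fail the handshake against a server that only offers TLS 1.0 or 1.1, rather than silently downgrading.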

Thumbnail 780

Thumbnail 810

One thing I want to call out is a specific use case, which is for distributed AI training. I want to call out Elastic Fabric Adapter, EFA, very briefly. EFA delivers always-encrypted high-performance networking. It provides hardware-accelerated encryption. It takes advantage of the Nitro System again with no performance penalty, and basically it gives you GPU-direct RDMA access and up to 3,200 gigabits per second of throughput, and it's really here to enable distributed AI training at scale without compromising security.

Now let's try to put that together on a high-level architecture slide. At AWS, we really want you to use encryption. We provide all these options to protect data across the cloud and on premises in both directions. For example, here, TLS safeguards data flowing both ways, training data imported from your premises or from AWS storage services all the way into AWS compute services, whether EC2 or Lambda, and it also protects all control plane APIs. I also want to mention Direct Connect because it uses MACsec at interconnect points, and you can use site-to-site VPN on top if you want to.

Thumbnail 890

Within the VPC, we talked about the Nitro System that provides encryption for training workloads between compatible EC2 instances. Of course, each storage service uses encryption at rest with envelope encryption and KMS-managed keys that you can control. So essentially, you maintain control through your customer managed keys while we handle the actual encryption. Now I want to talk briefly about envelope encryption, because then I want to talk about client-side encryption. Envelope encryption implements a two-tier cryptographic hierarchy where AWS KMS manages the master key, while so-called data keys, which you obtain through the GenerateDataKey API, handle the actual encryption operations.

Envelope Encryption and Client-Side Encryption Patterns

When you call GenerateDataKey, AWS KMS returns a plaintext data key and an encrypted copy of it. You use the plaintext data key to encrypt your data, discard it immediately, and store the encrypted version alongside your encrypted data. For decryption, you do the reverse: you call KMS Decrypt on the encrypted data key, and then you can decrypt your data. This is a stateless architecture. You don't need to keep state anywhere, because the encrypted data key travels with the data, and it allows for the encryption of large data sets that exceed the 4-kilobyte KMS direct encryption limit. It also enables data key caching to minimize API calls and latency. That's super important for large-scale AI training and model artifacts.
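The shape of that flow can be sketched in a few lines. This is a deliberately toy illustration of the envelope pattern: the `ToyKms` class and the XOR-based "cipher" are stand-ins invented for this sketch, purely to show the data flow. In practice the data key comes from the KMS GenerateDataKey API and the payload is encrypted with a real cipher such as AES-GCM:

```python
import os, hashlib

class ToyKms:
    """Stand-in for AWS KMS: holds a master key that never leaves it."""
    def __init__(self):
        self._master = os.urandom(32)

    def _wrap(self, key: bytes) -> bytes:
        # Toy "encryption" of the data key under the master key (XOR with
        # a keystream derived from the master key). NOT real cryptography.
        stream = hashlib.sha256(self._master).digest()
        return bytes(a ^ b for a, b in zip(key, stream))

    def generate_data_key(self):
        # Mirrors KMS GenerateDataKey: returns plaintext + encrypted copies.
        plaintext_key = os.urandom(32)
        return plaintext_key, self._wrap(plaintext_key)

    def decrypt(self, encrypted_key: bytes) -> bytes:
        return self._wrap(encrypted_key)  # XOR is its own inverse

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher keyed by the data key. NOT real cryptography.
    stream = hashlib.sha256(key).digest() * (len(data) // 32 + 1)
    return bytes(a ^ b for a, b in zip(data, stream))

kms = ToyKms()

# Producer: get a data key, encrypt, discard the plaintext key, and store
# the ciphertext together with the *encrypted* data key (stateless).
plaintext_key, encrypted_key = kms.generate_data_key()
ciphertext = xor_cipher(plaintext_key, b"training batch #1")
del plaintext_key

# Consumer: ask KMS to unwrap the data key, then decrypt locally.
recovered_key = kms.decrypt(encrypted_key)
plaintext = xor_cipher(recovered_key, ciphertext)
print(plaintext)  # b"training batch #1"
```

Note how nothing persistent is needed beyond the ciphertext and the encrypted data key; that is exactly the statelessness the pattern relies on.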

Thumbnail 960

So how do you use that in the context of client-side encryption? In the context of AI and ML workloads, a producer can encrypt training data sets, model artifacts, or inference payloads on premises or in the cloud using envelope encryption. Then it either persists that ciphertext to Amazon S3 or maybe Amazon DynamoDB, or can pass it directly to the consumer. All of the processing, either the processing of the data or the encryption or the decryption of data, can take place across AWS Lambda, across Amazon EC2, across on-premises, and you can even use AWS Nitro Enclaves in the case of EC2. Consumers will then retrieve or receive ciphertext, decrypt with identity and access management scoped permissions, and process the data. That's an important point, right? Because you're using key policies, you can precisely control who is allowed to decrypt the data. So training data, model weights, hyperparameters, gradient checkpoints, whatever, they can all remain encrypted throughout the machine learning life cycle.

Thumbnail 1030

Now I said we really want you to use encryption, so we provide you here three options for client-side encryption. The first one is the AWS Encryption SDK. This is a cross-platform SDK that enables general purpose cryptographic operations. I often use it in Python or Rust. Another option is the AWS Database Encryption SDK. It delivers record-level encryption as well as field-level signatures that you can use for data authenticity. And the last option is the Amazon S3 Encryption Client. What's cool about it is that it can automatically encrypt and decrypt your objects by intercepting PutObject and GetObject calls, so it's completely transparent. Each of these solutions implements the same envelope encryption primitives, optimized for specific use cases and data schemas.

Thumbnail 1090

So let's look at an example. This Python example first creates an AWS Encryption SDK client, that's the encryption client on the slide, and then creates a key ring. In the context of the key ring creation, you can set encryption context metadata. These are optional non-secret key-value pairs that provide additional authenticated data for KMS encryption operations, and they are bound to the ciphertext to, for example, prevent ciphertext substitution attacks. So you can bind a model version or training job ID to encrypted training data sets and model artifacts. The encrypt operation generates unique data keys per object, and the decrypt operation is fairly straightforward. What I'm not showing here is the use of, for example, data key caching, which you can control via the number of messages, the number of bytes encrypted, or the age of the key. We also provide other options, for example, hierarchical key rings that can use branch keys with DynamoDB.
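The encryption context behaves like additional authenticated data: it is stored in the clear, but any mismatch at decrypt time fails authentication. The following is a deliberately simplified stand-in for that binding, using a plain stdlib HMAC rather than the Encryption SDK's actual message format, just to show why a bound context defeats ciphertext substitution:

```python
import hmac, hashlib, json, os

MAC_KEY = os.urandom(32)  # stands in for key material the SDK would manage

def seal(ciphertext: bytes, context: dict) -> bytes:
    """Bind non-secret context (e.g. model version, training job ID) to a
    ciphertext. Toy sketch: HMAC over the canonicalized context plus the
    ciphertext, so neither can be swapped without detection."""
    canonical = json.dumps(context, sort_keys=True).encode()
    return hmac.new(MAC_KEY, canonical + ciphertext, hashlib.sha256).digest()

def verify(ciphertext: bytes, context: dict, tag: bytes) -> bool:
    return hmac.compare_digest(seal(ciphertext, context), tag)

ctx = {"model_version": "v1", "training_job": "job-42"}
blob = b"<encrypted model artifact>"
tag = seal(blob, ctx)

assert verify(blob, ctx, tag)                                 # context matches
assert not verify(blob, {**ctx, "model_version": "v2"}, tag)  # substitution detected
assert not verify(b"tampered", ctx, tag)                      # tampering detected
```

In the real SDK this check is built into the message format, so a decrypt call with the wrong expected context simply fails.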

Thumbnail 1170

Thumbnail 1180

Sometimes you can't use the Encryption SDK, but nothing prevents you from combining those best patterns with open source libraries. So if we now revisit again our examples, client-side encryption provides you another layer of encryption. Everything you've learned on the slide before still applies, encryption in transit and at rest, but you can use client-side encryption for additional assurance. So in the EC2 example, you can protect data coming in from on-premises or from storage services. And again, the advantage here, or one of the benefits here, is that the KMS key policy gives you precise control about who is authorized to do the decryption.

Thumbnail 1210

So these AWS encryption SDKs protect data broadly, securing training datasets, inference data, and model artifacts. And the bottom line is that with customer-managed keys, you have control. Our fundamental security principle is making sure that there is no human interaction with plaintext cryptographic key material across all our services.

Thumbnail 1240

Thumbnail 1250

Second Building Block: Tokenization for Reducing Compliance Scope

I want to now briefly discuss a second building block, which is tokenization. Tokenization can complement encryption. Basically, what tokenization allows you to do is to reduce compliance scope by simply removing or replacing sensitive data in your dataset. In the example here, we have, for example, credit card numbers or an email address, and these are replaced by tokens. Tokens can be randomly or pseudo-randomly generated. The token together with the original value, a key-value pair, is stored in what's called a token vault. The point of having a token vault is that it's also a central point of control for who is authorized to recover the mapping between a token and the original sensitive data.
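A token vault reduces to a guarded two-way mapping. Here is a minimal sketch; the class, the caller names, and the allow-list check are all invented for illustration, and a production vault would sit behind KMS-encrypted storage and IAM policies rather than an in-memory dict:

```python
import secrets

class TokenVault:
    """Toy token vault: issues random tokens for sensitive values and
    gates detokenization on an allow-list of callers."""
    def __init__(self, authorized: set):
        self._forward = {}        # sensitive value -> token
        self._reverse = {}        # token -> sensitive value
        self._authorized = authorized

    def tokenize(self, value: str) -> str:
        if value in self._forward:             # stable token per value
            return self._forward[value]
        token = "tok_" + secrets.token_hex(8)  # random; carries no information
        self._forward[value] = token
        self._reverse[token] = value
        return token

    def detokenize(self, token: str, caller: str) -> str:
        # Central point of control: only authorized callers may recover
        # the original mapping.
        if caller not in self._authorized:
            raise PermissionError(f"{caller} may not recover originals")
        return self._reverse[token]

vault = TokenVault(authorized={"payments-service"})
t = vault.tokenize("4111 1111 1111 1111")
print(t)  # random token, e.g. "tok_…"
print(vault.detokenize(t, "payments-service"))
```

Downstream systems that only ever see `t` drop out of compliance scope for the card number entirely; only the vault and its authorized callers remain in scope.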

Thumbnail 1310

All right, so let us look at the high-level architecture. This diagram shows you one potential implementation using a serverless approach with AWS Lambda. You see here one Lambda function that takes care of the tokenization process, and another Lambda function that implements the business logic for the actual tokenization via some cryptographic operation. You can also see that we're reusing the client-side encryption pattern, so on top of encryption in transit and at rest, we're adding another layer of protection with client-side encryption using the AWS Database Encryption SDK. Again, keep in mind that tokenization reduces compliance scope by removing sensitive data from your system.

Thumbnail 1370

Third Building Block: Confidential Computing with AWS Nitro System

With that, we're moving to our third building block, and I will let JD come on stage. Thanks, Ruben. So we've talked a little bit about encryption in transit, we've spoken a little bit about encryption at rest, and now we move on to an area that is a little bit more on the frontiers, a little bit more present in the dialogue of the last five years or so than it was previously, which is the concept of protecting data while it is in use. Before I move forward, we'll speak about a few ways of approaching this problem. The area that I'm going to focus on is called confidential computing.

Thumbnail 1430

Can I see a quick show of hands about who may have heard of that concept before? Okay, we have a few, so I'll define it briefly for you now, which is we really define confidential computing as the act of trying to protect sensitive data or code from any external unauthorized access through the use of special-purpose hardware, firmware, and associated software. At AWS, the primary confidential computing technology we offer is known as the AWS Nitro System.

The AWS Nitro System is really fundamentally the underlying technology that powers every single virtualized EC2 instance that we have released since 2018. This was the result of many years of development, beginning in 2012 and culminating in the official unveiling and release of the Nitro System in November of 2017 here at re:Invent. It consists of a collection of special-purpose hardware devices as well as a custom hypervisor built for AWS, and we're now on our sixth generation of custom chips powering the system.

Thumbnail 1480

What this enables us to offer to our customers is something that is pretty much unheard of in the industry. We offer always-on confidential computing to each and every user of any EC2 instance that's based on the Nitro System, that is, any modern instance released since 2018 and onwards. This is free of cost. It's free of performance impact. It is simply the default offering of every single EC2 instance.

There is no technical mechanism for anyone at AWS to access the contents of a customer's Nitro-based EC2 instance or encrypted storage or encrypted transit traffic moving in and out of the physical underlying virtualization server. Customers don't need to change their code to enable this. It is simply there and always on, and that's really the beauty of confidential computing compared to some of the techniques we'll talk about later, which is that it enables you to isolate sensitive data without substantially modifying your general purpose code.

Thumbnail 1540

I'm not going to get too into exactly how we manage this. It's a great story. I've made a habit of telling it on quite a few stages, but we've gone to great lengths to provide assurance for our customers about this claim. One of the first things we did was to publish a deep dive white paper that provides an overview and kind of deep understanding of how the different components of the Nitro System work together to create this outcome. We've also provided a third party validation from an outside security firm, the NCC Group, which came in and affirmed our security claims about zero operator access. Lastly, we provided a very clear and simple statement in our service terms, Section 96, which has been updated to reflect a commitment to each and every AWS customer about this zero operator access posture.

Thumbnail 1590

Now it's important to note in this context that the protection of EC2 instances covers not only the CPU resources or the vCPU of a customer's instance or the memory that they use but also the contents of any associated accelerator devices whether that's AWS's Trainium chips or NVIDIA, Qualcomm, AMD, or Intel accelerators. We've listed some of the accelerators and GPU powered instances here that this covers, but of course this also applies to some of the instances that we've released just this week here at re:Invent.

Thumbnail 1630

Now as we began to interact more and more with our customers and to tell them about this property that we had baked into the Nitro System, we heard more from our customers about them asking whether or not they could be enabled to build systems that similarly excluded any possibility of external unauthorized access at a deep and technical level. Now I want to be clear here. Not every workload requires this type of mechanism. Furthermore, not every regulated workload requires this type of mechanism. That said, we do speak regularly with customers who are looking for these capabilities and who feel that they will help them to achieve their compliance requirements, their security requirements, or even just to help them to build trust with prospective customers.

Thumbnail 1700

So ultimately when we speak with our customers about how they can design a system like this, the first questions we ask them are what are you looking to protect, that is, what is the sensitive confidential data, and who are you looking to protect that from. If the folks that you were trying to protect that from are AWS operators, there is no need to change the way you design your code running in EC2, in a Nitro-based EC2 instance, right? As I mentioned earlier, you already have that protection on by default. Now, if what you're concerned about is something like perhaps your own operators or administrators or supply chain issues, that's where we have a couple of other technologies I want to share with you today.

Thumbnail 1720

So the first offering that we released to our customers was called AWS Nitro Enclaves. What this enables you to do is to take an EC2 instance that would otherwise contain administrative access and large stacks of third-party applications and code, which previously would have had to bring in data that was encrypted at rest or in transit and then decrypt it for processing in plaintext. You can now take that instance and divide it into two strongly isolated compute domains. Technically these are two separate VMs that operate within what is logically your EC2 instance, and the second virtual machine we call a Nitro Enclave.

Now a Nitro Enclave in many respects is like another EC2 instance running alongside your EC2 instance, but it has a few differences. It has no local storage and no TCP/IP networking. By default, there's no SSH or remote interactive access into the Nitro Enclave, and all of this creates a constrained, lowered attack surface. Critically, it's also very useful that a Nitro Enclave is launched from a file that can be hashed down to a single cryptographic value representing the entire execution context.

Everything inside the enclave that can run or do anything is ultimately represented and traceable back to this hash value. Keep that in mind as we'll return to it in just a few slides.

Thumbnail 1820

Cryptographic Attestation and Zero Operator Access Architectures

As we spoke more and more with our customers who were really enjoying Nitro Enclaves, what we found was that customers wanted to take their own EC2 instance. They didn't really feel the need to divide it into two different parts, but what they wanted to do was make their own EC2 instance isolated and hardened in a way that was very similar to a Nitro Enclave. So we recently, just a few months ago, released a new offering to help our customers apply a zero operator access configuration to their own EC2 instance.

Thumbnail 1860

What this ultimately looks like is taking a copy of Amazon Linux, a few recipes that AWS provides, as well as the trusted code and apps provided by the customer, and combining them together using a tool called KIWI Next Generation. Ultimately, this creates an AMI, a machine image for an EC2 instance that doesn't contain things like SSH or serial console access and that has a read-only file system to create an immutable image that can boot up and similarly be captured from beginning to end with a single cryptographic hash. Now I mentioned I would speak more about this idea of a cryptographic hash in a second.

Thumbnail 1900

And here we are. So we now have these two capabilities, the Nitro TPM, which powers the attestable AMI flow I mentioned earlier, and Nitro Enclaves, both of which can provide a cryptographic attestation of the code that runs inside them to a third-party system. What they're able to do is obtain a document from the Nitro hypervisor that attests to the specific measurements of those environments, so that external systems can validate that the code running inside is in fact the code that is authorized for various interactions.

Thumbnail 1950

Now, a particularly useful flow is the interaction between those two cryptographic attestation mechanisms and AWS KMS. With Nitro TPM or Nitro Enclaves, we take a measurement, generate the attestation document, have it signed by the Nitro hypervisor, and allow these environments to pass a request to AWS KMS to, for example, decrypt a secret value or cryptographic credential. AWS KMS can then validate the attestation document and make an authorization decision based on its contents before either denying the request or returning the secret value to the calling trusted execution environment.

Thumbnail 1990

Now this is a bit of a noisy screen, but I've highlighted two particular values here in blue and purple. One is PCR 4 in a Nitro TPM attestation document. This value corresponds to the measurement I was talking about earlier: it's a chained value that covers the entire boot context of the EC2 instance, including the boot code, the kernel, and the root file system. So it measures the entire instance. At a technical level, it's essentially a cryptographic execution identity.

PCR 7 is something somewhat different. It's actually chained to the UEFI secure boot policy of the instance. Now there's a lot of details in there that probably aren't worth going into here, so I'm going to take a step back and basically tell you how you can think of that value. It effectively corresponds to the ability to say has this instance been signed or has this set of code been signed by a particular cryptographic token. So generally, right, what this looks like is effectively that you have a CI/CD pipeline and you're able to sign things that come out of that pipeline and bless them as being appropriate to handle sensitive data.

Usually the bar for that sensitive data access is going to be something like zero operator access. Now, you take one of these two measurements, or perhaps both. Of course PCR 4 is a very, very specific measurement and is perhaps a bit brittle, whereas PCR 7 is a little bit more scalable but slightly lower assurance, and you could take these values and actually plug them in directly to the KMS key policy that you set in KMS.

Thumbnail 2100

Right? So here you have a principal being allowed to decrypt a value, and you see under the conditions section that we're looking for particular values in the recipient attestation NitroTPM PCR (insert number here, so 4 or 7). And what we do here is include the two values from our attestation document.
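To make the policy shape concrete, here is a minimal sketch in Python of what such a KMS key policy statement could look like. Note the assumptions: the principal ARN and PCR digests are placeholders, and the condition-key spelling `kms:RecipientAttestation:NitroTPM:PCR<n>` is reconstructed from the slide wording (for Nitro Enclaves, the documented keys follow the `kms:RecipientAttestation:PCR<n>` pattern), so check the KMS documentation for the exact NitroTPM variant.

```python
import json

# Placeholder digests: real PCR values are SHA-384 measurements (96 hex chars)
# produced by the Nitro hypervisor, not all-zero/all-f strings.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowDecryptOnlyFromAttestedInstance",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:role/InferenceRole"},
        "Action": "kms:Decrypt",
        "Resource": "*",
        "Condition": {
            "StringEqualsIgnoreCase": {
                # Condition keys as sketched on the slide: recipient
                # attestation, NitroTPM, PCR<insert number here>.
                "kms:RecipientAttestation:NitroTPM:PCR4": "0" * 96,
                "kms:RecipientAttestation:NitroTPM:PCR7": "f" * 96,
            }
        },
    }],
}

# The policy must serialize cleanly before being attached to the key.
policy_document = json.dumps(policy)
```

With this in place, KMS denies any `Decrypt` call whose attestation document does not carry the pinned measurements, even if the caller's IAM identity would otherwise be allowed.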

Thumbnail 2130

Okay, we've laid a little bit of groundwork. What does that actually look like? How can I use that to achieve a valuable outcome? So here's a very simple architectural diagram of what this looks like. You have, for example, an end user device that encrypts data using an AWS KMS key that only permits decryption of that data when the request comes along with an attestation document bearing a trusted PCR measurement value. The encrypted user data is then sent into a Nitro-based EC2 instance that has full access to one of these powerful accelerators, as I mentioned at the outset of my talk today. We already know that AWS operators do not have access to this EC2 instance. We also know, because of the measurements and because of the pre-validation that we performed, that no operators or staff from the customer that operates the EC2 instance can access the contents of that sensitive data or modify the systems that run inside it.

Thumbnail 2210

So once this data comes in, a call is made to KMS. KMS then validates the attestation document, decrypts the data, sends it back using TLS, and an inference request is able to be performed. So let's take a look at ways in which these types of technologies as base primitives can be integrated into some of the workflows that we've spoken about so far. So I actually want to go back to tokenization, which Ruben was speaking about briefly. One of the things about tokenization is, of course, that at some point the sensitive data needs to be tokenized. It needs to be transformed, and what better type of environment to handle that sensitive data and perform that sensitive tokenization process than a Nitro Enclave or an attestable EC2 instance. There you have zero operator access, you have a strong isolation boundary, and inside you can then safely handle this sensitive data and transform it into less sensitive data.
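To make the tokenization step itself concrete, here is a minimal sketch of the kind of transformation such an environment might perform: replace a sensitive value with a random token and keep the mapping in a vault that never leaves the trusted boundary. All names are illustrative; a production system would persist the vault in an encrypted store and run this logic inside the enclave.

```python
import secrets

class TokenVault:
    """Toy token vault: maps random tokens back to sensitive values.
    In the pattern described above, this state would live only inside
    the zero-operator-access environment."""

    def __init__(self):
        self._vault = {}

    def tokenize(self, sensitive: str) -> str:
        # A random token carries no information about the original value,
        # so it is safe to hand to downstream, lower-trust systems.
        token = "tok_" + secrets.token_hex(16)
        self._vault[token] = sensitive
        return token

    def detokenize(self, token: str) -> str:
        # Only callers inside the trusted boundary can reverse the mapping.
        return self._vault[token]

vault = TokenVault()
token = vault.tokenize("4111-1111-1111-1111")
```

The systems outside the boundary only ever see `token`; the reverse lookup is possible only where the vault lives, which is exactly what shrinks the compliance scope.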

Thumbnail 2270

Thanks to attestation, you can also be confident that the attested service you are providing that sensitive data to is in fact an appropriate tokenization system, and that it will handle the data as its code instructs. Now here's a slightly more complex view that is directly tied to an inference request. In this case, this could be an agent executing a local model, or even just a broader, more simple, direct prompt-response inference flow. But you have basically a client device that encrypts the prompt data using a KMS key that has been similarly configured to only allow decryption by a trusted environment. That data is then sent in, encrypted client side, through to an EC2 instance, which has the ability, thanks to cryptographic attestation, to prove to KMS that it is a trusted environment, to then decrypt the prompt, perform an inference function and potentially additional processing on it, and then re-encrypt that data and return it back to the client, all without any system with any potential for unauthorized access ever having been able to access that customer data in plaintext.
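The client-side step in this flow is classic envelope encryption: generate a fresh data key, encrypt the prompt with it, and send the data key wrapped under the KMS key so that only an attested environment can unwrap it. The sketch below shows only the envelope pattern; a toy SHA-256 XOR keystream stands in for a real AEAD cipher such as AES-GCM, and the local `wrap`/`unwrap` helpers stand in for the KMS Encrypt/Decrypt calls that would be gated on attestation.

```python
import hashlib
import secrets

def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream from SHA-256 in counter mode; a real client would
    # use AES-GCM via the AWS Encryption SDK instead.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Stand-ins for KMS: in the real flow, unwrap succeeds only after KMS
# validates the caller's attestation document against the key policy.
KMS_KEY = secrets.token_bytes(32)
wrap = lambda data_key: xor(data_key, KMS_KEY)
unwrap = lambda blob: xor(blob, KMS_KEY)

# Client side: a fresh data key per message, prompt encrypted locally.
data_key = secrets.token_bytes(32)
envelope = {"ciphertext": xor(b"my sensitive prompt", data_key),
            "wrapped_key": wrap(data_key)}

# Attested side: unwrap the data key, then decrypt the prompt.
plaintext = xor(envelope["ciphertext"], unwrap(envelope["wrapped_key"]))
```

Everything that travels over the wire, the ciphertext and the wrapped key, is useless without the KMS key, which is exactly the property the attestation condition protects.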

Thumbnail 2340

Now, similarly, you could perform something relatively equivalent with a Nitro Enclave. The only difference is Nitro Enclaves do not have direct access to accelerator hardware. So typically here you'd be thinking about a smaller model that would still operate performantly with CPU-based inference. That said, because of the Nitro Enclave's ability to run as effectively a sidecar to another compute environment, it opens the possibility for more distributed inference workloads and opportunities there.

Thumbnail 2370

Thumbnail 2390

Fourth Building Block: Federated Learning for Distributed Data

And with that said, I'm going to hand things back to Ruben to close things out. Thank you. Thank you. We're now going to move on to building blocks that are more on your side of the shared responsibility model. The first one we want to talk about is federated learning. It addresses cases where your data might be constrained to specific boundaries. This can be because of regulatory constraints, organizational boundaries, or security boundaries. In this context, federated learning inverts the centralized training model.

In a federated learning model, you have nodes that collaboratively train across a distributed dataset. Each node keeps its data locally and only shares model updates with a centralized aggregation server, so raw data never leaves the node, essentially the boundary. We can highlight two options for production implementations. One of them is the open source Flower framework, which provides flexibility with support for, for example, PyTorch or TensorFlow, and supports enterprise-scale deployment. The other one is NVIDIA FLARE, which delivers GPU support. On AWS, you can deploy federated learning via SageMaker, containers on EC2, on EKS, on ECS, or even some other options.

Thumbnail 2460

This diagram highlights federated learning on AWS with security enforced at multiple layers, so we keep the same principle of having multiple layers of security. The central orchestrator distributes the global model to nodes on AWS or external premises. The attestable AMI here, for example, provides cryptographic proof that instances run trusted software. Now each node trains locally on private data and returns only encrypted model parameters, never raw data, and the aggregator combines updates using secure aggregation protocols. So security operates at multiple levels. You have the Nitro TPM to help with cryptographic attestation. You have client-side encryption to provide additional protection. You have identity and access management to control who can, for example, perform decryption operations or communicate with the central aggregation server, and you can use, for example, IAM Roles Anywhere to make sure that only authorized nodes can participate in the protocol.
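The aggregation step at the center of this picture is typically federated averaging: the server combines the model updates weighted by how many local examples each node trained on, and never touches the raw data. A minimal sketch, with plain Python lists standing in for weight tensors (in Flower, this logic is provided by the built-in FedAvg strategy):

```python
def fedavg(updates):
    """updates: list of (num_examples, weights) pairs, one per node.
    Returns the example-weighted average of the weight vectors."""
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [sum(n * w[i] for n, w in updates) / total for i in range(dim)]

# Three nodes report local updates; only parameters cross the boundary,
# never the training data itself.
updates = [
    (100, [1.0, 2.0]),
    (300, [3.0, 4.0]),
    (100, [1.0, 2.0]),
]
global_weights = fedavg(updates)  # → [2.2, 3.2]
```

In a hardened deployment, each `(num_examples, weights)` pair would arrive encrypted, and secure aggregation would prevent the server from inspecting any single node's update in isolation.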

Thumbnail 2520

Thumbnail 2530

Thumbnail 2540

Fifth Building Block: Differential Privacy and AWS Clean Rooms

To dive deeper, you can find here selected articles from the AWS and Amazon Science blogs. The second building block that we want to talk about addresses a fundamental challenge tied to privacy protection: differential privacy, or DP for short. It addresses the erosion of privacy through repeated queries or analyses, or through the availability of side information, and I think the best way to explain it is with an example.

So let's consider queries on a student earnings dataset, where our study ran two queries across a semester, one at the beginning of the semester and one at the end. The first query returns that out of 3,005 students, 202 earn over 40K. The second query returns, later in time, that out of 3,004 students, 201 earn over 40K. This one-student difference can prove critical because it provides precise information about a specific individual. Now let's say that someone has access to side information: they know the specific student who left the school during the semester. By combining the two answers, they can identify that individual and learn that they earned over 40K.

Thumbnail 2610

So differential privacy is a mathematical framework that provides mathematical guarantees on protecting individual data. The fundamental intuition is that DP injects carefully calibrated statistical noise into computations or queries, and that noise has to be tuned to obscure specific data points while keeping the results reasonably accurate. In practice, this involves two considerations. One is that you have to manage the so-called privacy budget: a parameter called epsilon quantifies how much privacy you lose with every query. The other parameter, called delta, bounds the probability of a catastrophic privacy failure where the mechanism suddenly breaks down and reveals private information. There are trade-offs here that essentially balance accuracy, how much you can protect individual information, and computational complexity.
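The classic way to realize this for counting queries is the Laplace mechanism: add noise drawn from a Laplace distribution with scale sensitivity/epsilon, which for a count is 1/epsilon, since adding or removing one individual changes a count by at most one. A minimal stdlib sketch (the specific epsilon and counts are illustrative):

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # Sample Laplace(0, scale) by inverse-CDF from a uniform draw.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Laplace mechanism for a counting query (sensitivity = 1).
    Smaller epsilon means more noise and stronger privacy."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
# The two semester queries from the example, now answered noisily:
first = noisy_count(202, epsilon=0.5, rng=rng)
second = noisy_count(201, epsilon=0.5, rng=rng)
```

Because both answers are perturbed independently, the difference between them no longer pins down the departed student, yet averaged over many students the statistics stay useful.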

Thumbnail 2690

If you want to implement differential privacy in your algorithms, plenty of open source frameworks exist; the right one really depends on your use case, and at AWS we provide one option that we'll discuss afterwards with AWS Clean Rooms. Now I want to finish giving you the intuition of how DP works. So let's say the analysts were using a DP-based system that adds controlled randomness.

Now the first query would simply return a noisy count of 204 students earning over 40K, and the second query would return a noisy count of 199 earning over 40K. That randomness is what prevents identifying a specific individual. Differential privacy is a tool that adds formal privacy guarantees that remain robust, and it's a complementary approach: you can, for example, often combine it with federated learning to prevent privacy erosion across many training rounds.

Thumbnail 2740

Implementing differential privacy is complex. You have to manage privacy budgets, tune the parameters, and track them over multiple queries. AWS Clean Rooms is a service for secure multi-party data collaborations, and it implements managed differential privacy controls. In this context, a so-called Clean Rooms operator can set the total privacy budget. For example, here we have three queries that each consume a portion of that budget, and a fourth query that would go over the budget. In that case, the query is blocked, which prevents the leaking of individual information.
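The budget mechanics that Clean Rooms manages for you can be sketched as a simple epsilon accountant: each query spends part of the budget, and a query that would overspend is refused. The class and the numbers below are purely illustrative of the pattern, not the service's internals.

```python
class PrivacyBudget:
    """Toy epsilon accountant in the spirit of the Clean Rooms example:
    under sequential composition, the epsilon spent per query simply sums."""

    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def run_query(self, cost: float) -> bool:
        if self.spent + cost > self.total:
            return False  # query blocked: it would exceed the budget
        self.spent += cost
        return True

budget = PrivacyBudget(total_epsilon=1.0)
results = [budget.run_query(0.3) for _ in range(3)]  # three queries succeed
blocked = budget.run_query(0.3)                      # the fourth is refused
```

Refusing the fourth query is the whole point: without a hard stop, an analyst could keep querying until the accumulated answers erode the noise protection entirely.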

Thumbnail 2790

Thumbnail 2810

Sixth Building Block: Cryptographic Computing and Fully Homomorphic Encryption

Again, if you want to dive deeper, we really encourage you to explore selected articles from our Amazon Science colleagues. You can find one early reference that dates back to 2008, so this is something we've been working on for a long time. The last building block that I want to tell you about is my favorite one. So what if you never had to disclose data? This is cryptographic computing. With cryptographic computing, you can perform computation on encrypted data without ever decrypting it, and this comprises multiple forms with different trade-offs.

For example, with open source frameworks such as SEAL or OpenFHE, you can compute arbitrary functions on encrypted data. Now this flexibility comes at the expense of performance. On AWS, we provide two options for production deployment. One of them is integrated into Clean Rooms, and I'll tell you about it afterwards. The other one is integrated into the AWS Database Encryption SDK, and it provides encrypted search.

So again, if we come back to our tokenization example, that's a pretty cool application, because now let's say that you have to look up the mapping between an original token and its encrypted counterpart. Without encrypted search, you would have to go through all of the mappings and decrypt each one of them. Thanks to encrypted search, you can perform that lookup without ever decrypting the data.
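One common way to get equality search over encrypted values, in the spirit of what the Database Encryption SDK calls beacons, is to store a keyed hash (an HMAC tag) alongside each encrypted record and match on the tag, so the store finds records without ever seeing plaintext. A simplified stdlib sketch; the key and records are illustrative, and real beacons also truncate the tag to limit what equality patterns reveal.

```python
import hashlib
import hmac

SEARCH_KEY = b"shared-secret-search-key"  # illustrative; held by clients only

def beacon(value: str) -> str:
    # Deterministic keyed tag: equal plaintexts yield equal tags,
    # so the store can match records without decrypting anything.
    return hmac.new(SEARCH_KEY, value.encode(), hashlib.sha256).hexdigest()

# The store holds (tag, encrypted_record) pairs; ciphertexts are
# represented by placeholder strings here.
index = {
    beacon("alice@example.com"): "<encrypted record 1>",
    beacon("bob@example.com"): "<encrypted record 2>",
}

# Lookup: compute the tag client-side, match server-side, no decryption.
hit = index.get(beacon("alice@example.com"))
```

The server learns only which stored tags are equal to the queried tag, never the underlying values, which is exactly the "search without decrypting" property described above.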

Thumbnail 2880

In the context of Clean Rooms, the use case it addresses is regulatory scenarios where the data has to remain encrypted in the cloud. One thing to always be aware of with cryptographic computing is that it comes with trade-offs. In the context of Clean Rooms, these trade-offs are the following. One is that the participants must agree on a shared secret key. Another is that using cryptographic computing with Clean Rooms restricts the types of operations you can run in your collaboration, so you can't use all the operations that you could use without cryptographic computing. And the third aspect you have to pay attention to is storage overhead.

Thumbnail 2930

Now I want to talk about Fully Homomorphic Encryption, which enables computing arbitrary functions on encrypted data without ever decrypting it. The example you see here is a high-level example to give you the intuition. Essentially, you have a plaintext x that is encrypted, and then the function F, which maps the computation you want to perform into one that works over ciphertexts, operates directly on the ciphertext to produce an encrypted result that can then be decrypted.
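To make the Enc, F, Dec picture concrete, here is a small pure-Python implementation of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. This is a simpler relative of FHE used only to show the principle; the tiny primes are for illustration, and real deployments use keys of thousands of bits and fully homomorphic schemes such as CKKS or BGV from OpenFHE.

```python
import math
import random

def keygen(p: int, q: int):
    """Paillier keys from two distinct primes (toy sizes here)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because we fix the generator g = n + 1
    return (n, n + 1), (lam, mu, n)  # public (n, g), private (lam, mu, n)

def encrypt(pk, m: int) -> int:
    n, g = pk
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:  # r must be invertible mod n
        r = random.randrange(1, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2

def decrypt(sk, c: int) -> int:
    lam, mu, n = sk
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n  # L(x) = (x - 1) / n

def eval_add(pk, c1: int, c2: int) -> int:
    # This is F: multiplying ciphertexts adds the underlying plaintexts.
    n, _ = pk
    return c1 * c2 % (n * n)

pk, sk = keygen(1117, 1123)
c1, c2 = encrypt(pk, 42), encrypt(pk, 100)
total = decrypt(sk, eval_add(pk, c1, c2))  # 142, computed while encrypted
```

The party doing `eval_add` never holds the private key, so it computes the sum without ever learning 42 or 100, which is the essence of what the slide's F is doing.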

If you're familiar with these types of approaches, you might have heard that they've been traditionally super computationally expensive. Now in recent years, there has been a lot of progress, either in terms of acceleration options by using ASICs or GPUs which you find on the cloud, or by finding specific optimizations for use cases. And today, the bottom line is that you can apply this type of technique for AI workloads with acceptable trade-offs.

One of the things that I find the most impressive is that modern approaches can even give you a compiler that automatically takes the plaintext function f(x) and transforms it into its encrypted counterpart F(X), which makes the usability much better than something like 10 years ago.

Thumbnail 3020

So let me show you that in action. I'm giving you an example here with OpenFHE, one of the open source libraries you can use. It has Python bindings, which make it very approachable if you want to get started. The example shows vector multiplication. In step one, we set up a cryptographic context and select a specific scheme. In step two, we generate encryption keys. In step three, we encrypt, no surprise, and the most important step is step four, where we perform operations on encrypted data; in this specific scenario, a vector multiplication. Step five is the decryption. And this is something you can run on your own laptop or machine if you want.

Thumbnail 3070

Thumbnail 3120

Now, if you can do this, that means you can do machine learning; you can, for example, apply it to inference. Let me show you another example here that has been adapted from an open source repository. This is linear support vector machine inference. The structure is very much the same; the only differences are the operations that we're using. In this example, we're taking a trade-off between performance and data protection: the inference values remain encrypted, but the model weights remain in plaintext. You don't have to do that, right? That's a trade-off you can choose. This example, if you run it on a reasonably sized CPU machine, runs in tens of milliseconds and provides close to 95% accuracy compared to a plaintext approach. So that means you can actually deploy these approaches in the cloud, even on a Lambda function in a serverless approach.
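For reference, the computation the encrypted version evaluates is just the linear SVM decision function, a dot product plus a bias followed by a sign. With plaintext weights and an encrypted input, the FHE circuit only needs ciphertext-plaintext multiplications and additions, which is why this workload is so tractable. A plaintext sketch with illustrative numbers (not taken from the talk's repository):

```python
def svm_decision(weights, bias, x):
    """Linear SVM inference: sign(w · x + b).
    Under FHE, x would be a ciphertext and w, b plaintext constants, so
    only ciphertext-plaintext multiplications and additions are needed."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1

# Illustrative model parameters and inputs.
w, b = [0.5, -1.25, 2.0], -0.75
label_a = svm_decision(w, b, [1.0, 0.2, 0.5])  # → 1
label_b = svm_decision(w, b, [0.0, 1.0, 0.0])  # → -1
```

One caveat of the encrypted variant is the final sign: comparisons are expensive under FHE, so implementations often return the encrypted score and let the client take the sign after decryption.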

This last high-level architecture shows you three different deployment options that very much depend on your model type and on your performance requirements. It's a client-server architecture where the client is responsible for setting up a public-private key pair and for handling the communication protocol with a model-serving API. For a lightweight FHE model, you can use the synchronous Lambda path, the one on the top, which processes requests directly in the Lambda function and returns the result. For more computationally intensive models, for example when using these techniques on a reasonably sized neural network, you can either use a serverless approach with GPUs, for example with Amazon ECS, or you can use asynchronous processing, where the request is first put into a queue, acted on later, and you recover the result afterwards. That was the third option I described, the one without GPU support. You also need an Amazon S3 bucket, because you need to be able to store what's called an evaluation key, specific cryptographic material used in those cases that is generally too large to be passed directly via API calls.

Thumbnail 3230

Thumbnail 3250

And what we really want to show you here with this slide is that there's a perfect match between the cloud and cryptographic computing, right, because you have access to serverless resources and to GPUs, and your data can remain encrypted at all times. Again, this was our last building block, one with great potential, and we'd love to hear how we can help you here. And to dive deeper, here is another selection of resources and articles from our Amazon Science colleagues. All right, let's wrap up. We've shown you multiple building blocks, and our key message here is privacy in depth. You can use them together. You can combine them to meet your specific compliance and protection requirements. There is not a one-size-fits-all, right? It's all about visibility, secure access, and control over data. At the end of the day, what we want is for you to be able to meet your compliance requirements while benefiting from the agility and innovation of the cloud.

Thumbnail 3280

Conclusion: Privacy in Depth and AWS's Commitment to Data Protection

The other thing that we really want to call out here is that data protection and privacy are fundamental at AWS, and many of these building blocks are used under the hood by managed AWS services. For example, with Amazon Bedrock, we ensure that your data is never shared with foundation model providers, and prompt data is not stored.

All API calls remain within your AWS regions. We're using encryption in transit and encryption at rest with customer-managed keys. CloudWatch and CloudTrail provide monitoring and auditability backed by over 20 compliance standards, and the same set of principles and approaches are being applied to Amazon SageMaker AI. Finally, this presentation is about building blocks with assurance.

Thumbnail 3330

At AWS, security, data protection, and privacy are foundational to everything that we do and build. And as for you, builders in regulated industries, as I said several times, we really want you to use encryption and keep control of your keys, backed by AWS KMS. Encryption is easy on AWS, so just use it.

The Nitro System is always on. It provides strong physical and logical isolation together with attestation capabilities, and that gives you control over how your data can be processed. And finally, on your side of the shared responsibility model, take advantage of federated learning, differential privacy, and cryptographic computing. They are here today. You can use them right now, and again, we'd love to hear how we can help you there and how to deploy them in production.

These building blocks give you the confidence and assurance to innovate while you comply with an evolving regulatory landscape. Thank you very much for listening, and we'll be available to take your questions just outside this room.


; This article is entirely auto-generated using Amazon Bedrock.
