🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.
Overview
📖 AWS re:Invent 2025 - Demystify attestation: Cryptographically verify execution environment (CMP317)
In this video, Alex Graf and Sudhir from AWS EC2 explain cryptographic attestation for confidential computing. They demonstrate how to protect sensitive AI models and data using AWS Nitro's two-dimensional security model: isolating workloads from both AWS operators and customer operators. The session covers two attestation mechanisms—EC2 Instance Attestation using Nitro TPM with PCR measurements, and Nitro Enclaves for isolated workload execution. A live demo shows packaging an LLM application using KIWI-NG and Nix, implementing envelope encryption with AWS KMS policies based on PCR values, and solving multi-party collaboration scenarios where model owners protect intellectual property while consumers protect sensitive prompts. The presentation includes practical code examples and QR codes to GitHub repositories with sample applications.
This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
Introduction: The AI Model Protection Challenge
Hello, good afternoon, and welcome to the session on demystifying attestation: how to make cryptographic attestation work and verify your execution environment so you're only running what you want to be running. I'm Alex Graf, a principal engineer in the EC2 organization, and with me is Sudhir, a senior solutions architect who also works on EC2 topics, including confidential computing, just like me.
So imagine you are an AI company. You build an AI model, and you are deeply invested in it. You basically spend all of your money building such an AI model, a super expensive thing, so you want customers to use it. How do you get customers to use it? First, you get a customer who has data they want to send to you to perform some operation that you and your AI model can do for them. Once you do that operation, you run inference on that data, the model spits out a result, and your customer is happy.
The obvious first thing you would do design-wise to build such an architecture is to build it as a service, right? Your customer sends that data over to you, you process it and generate the result, and everyone is happy in most environments. Unless you realize that your customer actually has super sensitive data they don't want to hand to you. So how do you solve that problem? Your customer cannot hand that data over into your environment and instead wants to keep it in their own. Well, easy, right? You just give them the environment instead. Why don't we allow the customer to run our AI model on their own premises or in their own account, so that they don't have to trust us for the execution?
Well, the problem with that is that training this AI model was not cheap and it's kind of intellectual property, so you really want to make sure that nobody can see that. So what can we do? Confidential computing to the rescue. If we put a box around it—because putting boxes around it always makes things look prettier—and we secure that box with a nice shield around it and call it confidential computing, we have the problem solved.
The reason for this presentation is to take a look at how that actually works under the hood, and then understand what we can do and how we can do it. Confidential computing in a nutshell means that we don't give anyone access to things that are running inside of an environment. What we do with this box and shield around it really just means that whenever there's some communication going to the outside world, we define that interface to a point where data that we don't want to leak does not leak. Sounds very easy in a nutshell, but the devil's in the details, obviously.
Understanding Confidential Computing and the Nitro Stack
The basic gist of what we're building with confidential computing here is that your account operator does not gain access into the instance or into an entity, into a trusted execution environment that is actually executing your sensitive workload. But there's more. This confidential computing environment is running on top of a stack of technology. In AWS's case, that's the Nitro stack. And in Nitro, we follow the exact same model. We have APIs that are clearly defined to not include any access patterns or access paths into customer data so that any workload that executes on top of that stack is safe from AWS operators who otherwise still need to operate the fleet.
One way to think of this problem, instead of this picture, is to look at it in dimensions. We call the environment where AWS operators don't have access dimension one. So we, including me (I operate this fleet on a day-to-day basis), do not have any API access that would allow us into customer data. Dimension two, on the other hand, is about your applications fending off access from either your own operators or from somebody else running the execution environment, like in the AI model example we showed earlier.
And if you want to know more about how this whole concept works, there's a very nice white paper that we wrote on how the Nitro model works and how the whole overall system actually is designed.
We designed this approach and validated the entire design with the NCC Group, an external assessor, to confirm there are no gaps in the Nitro system that would compromise the security claims I outlined earlier. We are so confident about this stance that we even included in our terms of use that we do not have operator access into EC2 instances, for example. To recap, confidential computing means that AWS operators do not have access to your EC2 instance data. It's always on, and there's nothing you need to do. If that's all you care about, you can sleep peacefully.
However, this goes further. True confidential computing, in my opinion, means you also isolate the environment from other entities, not just AWS operators. It means you want to isolate your workload from your operators or from other operators who could possibly access it. That is part of customer responsibility. However, we provide very nice tools to make this an easy, achievable task, which I will outline in the rest of this presentation.
There is one piece I haven't mentioned until now. While it's nice to say you don't have access to this data, you want to prove it, right? You want some mechanism to verify that the thing you're talking to is actually running this code, rather than just taking my word for it. We can get into that.
Cryptographic Attestation: Validating Execution Environments
When we talk about the states of data, we've looked at data in use so far: you have some confidential computing execution environment, and it processes data. But where does that data come from? Data doesn't appear out of thin air; it needs to come from somewhere. It usually comes from storage (data at rest) or from communication with someone else (data in transit). Either you have storage where you put data, or you talk to somebody live. That data needs to be encrypted with an encryption key known only to this execution environment, so you can be sure that only trusted entities can communicate either with storage or over the network.
To get there, you need some type of shared secret, some way to authenticate and ensure that a shared secret is available only inside this execution environment, not outside. But how do we get it in? If you bake it into the image at launch, it is effectively a public secret, because anyone could inspect the original image that was executing. So we need to provision it at runtime.
Let's look at how you actually build and execute these environments. Confidential computing execution environments always start from some description. You have something that describes what the image and the environment will look like. You generate an image from that description, and then from that image you spawn the actual environment. You can obviously spawn multiple environments. Ideally, you want a reproducible path from the description to the image so you always receive the same result.
However, one really important thing happens when you generate the image. You don't only generate the image; you also generate cryptographic hashes that describe the full contents of that image. That's a really key point. The reason we need these hashes is that later, when we execute the execution environment, our trustworthy Nitro stack regenerates those same hashes and provides you with an attestation document, which you can then use to identify what was actually running in that instance at that point in time. If we then compare the hashes from image generation with the hashes from execution, we know that the environment we have is the environment we built, which is the main property we're looking for here.
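To make that comparison concrete, here is a minimal sketch in Python. It assumes you already recorded the expected PCR digests at image-build time and parsed the attestation document into a dictionary of PCR index to hex digest; the values and names are illustrative, not a fixed API.

```python
# Expected measurements recorded when the image was built (placeholders).
EXPECTED_PCRS = {
    4: "a1b2c3...",  # hash of the boot contents (e.g., the UKI)
    7: "d4e5f6...",  # hash of the Secure Boot policy
}

def environment_matches(attested_pcrs: dict[int, str]) -> bool:
    """True only if every expected PCR matches the attested value."""
    return all(attested_pcrs.get(i) == v for i, v in EXPECTED_PCRS.items())
```

If any measured component changed, even by one byte, the corresponding PCR value changes and the comparison fails.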
Now we still don't have a secret, but we know that we have an environment that looks the same as what we wanted to have. We can get that quite easily using a service called the Key Management Service. We integrated capabilities into the Key Management Service which allow you to define a policy.
KMS is a service that allows you to store and perform operations on key material. You can specify that you are only allowed to operate on this key if your policy matches. Basically, if your execution environment is exactly the same as the image that you vetted and is actually built with trustworthy code, you gain access.
To recap, cryptographic attestation means that we are validating the execution environment against the image that we built. Validation of that is integrated into the Key Management Service. You don't have to use AWS KMS. It just makes your life a lot easier. You use that mechanism in order to gain access to data in transit or at rest.
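As a hedged sketch of what that KMS integration looks like from inside the execution environment: the KMS Decrypt API accepts a Recipient parameter carrying an attestation document, and on a policy match it returns the result encrypted to the public key embedded in that document, so only the attested environment can read it. The function and variable names here are illustrative.

```python
import boto3

kms = boto3.client("kms")

def decrypt_inside_attested_env(ciphertext: bytes, attestation_doc: bytes) -> bytes:
    """Ask KMS to decrypt, proving our identity with an attestation document."""
    response = kms.decrypt(
        CiphertextBlob=ciphertext,
        Recipient={
            "KeyEncryptionAlgorithm": "RSAES_OAEP_SHA_256",
            "AttestationDocument": attestation_doc,  # raw bytes from the TEE
        },
    )
    # The plaintext comes back encrypted to the ephemeral public key from the
    # attestation document; only the TEE holds the matching private key.
    return response["CiphertextForRecipient"]
```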
EC2 Instance Attestation: Building Attestable AMIs with Full EC2 Features
Typically in these confidential computing environments we assume that code is public and data is private. If you need private code that you don't want anyone to see, you treat it like data: you encrypt it, and the problem reduces to protecting data again. In order to do all of this attestation, we need an attestable AMI, and an attestable AMI is not the same as a normal image.
It's a very special type of image that you have to build in order to gain this property where the boot stack is going to actually give you these measurements and hashes that allow you to reason about the contents. For that we provide two mechanisms. One is super new. We just launched this earlier this year, and I'm very excited about this new feature. It's called EC2 Instance Attestation.
EC2 Instance Attestation means you have a description. There are multiple different formats of descriptions, but the description describes what you want to have inside of your attestable AMI. Think of it like a Dockerfile. Then you generate the attestable AMI. It's a very special type of AMI that has a special boot flow which also generates a set of hashes. You can always generate these hashes after the fact, like when the image is already there, but typically you would generate them while you generate the AMI.
Then you can launch the AMI like you would any other EC2 instance. If you haven't launched an EC2 instance manually, try it. It's really fun. It means you can get a virtual machine at any point in time very quickly and easily. With this mechanism we are hooking confidential computing, including actual workload attestation, into the whole EC2 ecosystem.
So you can launch an EC2 instance, and this EC2 instance hashes its boot contents into NitroTPM, a virtual Trusted Platform Module that we have provided in EC2 instances for a while already. The new thing we launched this year is that NitroTPM can now also issue a NitroTPM attestation document, which gives you all of the KMS integration I was talking about earlier, so you can unlock your secrets based on your boot flow.
What makes that attestable AMI so different is that it contains a completely different boot environment. The Nitro stack starts up the virtual machine, which executes a UEFI boot environment (low-level firmware), which then launches something called a Unified Kernel Image, or UKI. That's a standard upstream Linux concept, but we're reusing it in attestable AMIs because it gives us a very nice property.
It is a single binary that contains the kernel, initrd, and command line of the target system. This means that when UEFI launches it, it measures the full contents of that UKI, hashing them into PCR 4. The UKI then seeds PCR 12 with the kernel command line, and that command line contains a hash of the root file system. By looking at that one single number in PCR 4, you can reason about all of the contents of the system, because everything rolls up into a single number.
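On a running attestable instance you can inspect these measurements directly. A small sketch, assuming the tpm2-tools package is installed (the sample recipes shown later in the demo include NitroTPM tooling); subprocess is just a thin wrapper around the tpm2_pcrread CLI here.

```python
import subprocess

# Read the SHA-256 bank of the PCRs discussed above from the NitroTPM:
# PCR 4 (UKI), PCR 7 (Secure Boot policy), PCR 12 (kernel command line).
out = subprocess.run(
    ["tpm2_pcrread", "sha256:4,7,12"],
    capture_output=True, text=True, check=True,
)
print(out.stdout)
```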
Thanks to the Nitro TPM attestation document, you can also reason about the fact that you are running on a Nitro stack because the document is signed with an AWS key that only Nitro environments have access to. So by looking at basically these two numbers plus an AWS signature in the document, you can reason about the two dimensions.
I am running on Nitro, so I'm running in an environment that AWS operators do not have access to, and I am running exactly the image I built. The templates we provide don't contain any operator access by default, so as long as you don't add operator access, you are running an environment that does not contain any operator access either.
The super amazing thing about EC2 Instance Attestation is that you get all of these properties with all of the features and capabilities of an EC2 instance. You can use Elastic Network Adapters, Elastic Fabric Adapters, instance storage, and accelerators like Trainium, anything you want. All of the features that you're used to and love in EC2 just work, the same way as with auto scaling groups. From an EC2 point of view, these things are just AMIs, just normal machine images.
The NitroTPM attestation document proves the two dimensions. It proves that you are running on the Nitro stack with no AWS operator access, and it proves that you are running exactly the image that you built. The interactive access that I talked about, where an operator could possibly go and extract data, is excluded as long as you follow our samples and don't add that access. You know that you don't have that access, and you can prove it, again using the PCR hashes. We provide sample descriptions for Amazon Linux 2023 as well as NixOS.
This QR code brings you to the web page where you can just go and get started. I would totally encourage you to do so. It's actually pretty fun and easy to do. This attestation is really useful if you have an application that you want to lift and shift. You take the full application and you put it into an EC2 instance that is fully attested and validated, and you are sure that it's running exactly your code.
Nitro Enclaves: Isolating Sensitive Workloads Within Applications
But there's a second mechanism that we have, which is called Nitro Enclaves, a feature we have had for quite a while. Quick show of hands: who's familiar with Nitro Enclaves? That's more hands than for EC2 instances. Nitro Enclaves is the feature we launched first, but it's actually the more complicated feature in a way.
In a Nitro Enclave scenario, you would typically say: my application consists of a lot of different things. I have third-party applications running somewhere. I have a full-blown operating system that does a lot of things. I have a big part of my application that processes random data, maybe talks to external entities, and that I may not fully trust. But I have this tiny piece of code that I want to shield, isolate, and make sure is running securely in an environment that nobody and nothing else can access. That's what Nitro Enclaves are for, and that's where they really shine.
If you want to take your application and split it into two, a trusted and an untrusted part that still work in lockstep, hand in hand as a single entity, then Nitro Enclaves is the solution for you. A Nitro Enclave provides you with a separate tiny virtual machine that you can spawn on the fly out of your EC2 instance's resources, and that can then process highly sensitive data inside the enclave.
The enclave only communicates with its parent using a vsock channel, a special low-level transport. This means you need to explicitly handle every communication that goes in and out, which is a special property of Nitro Enclaves: you can always reason about every single communication that crosses the enclave boundary, because it's built to only communicate between your parent application and your enclave. Enclaves are the tool of choice if you want to perform isolated operations on behalf of a parent, splitting your application in two.
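To make the vsock constraint concrete, here is a minimal sketch of the parent side of that channel in Python. The enclave CID, port, and message format are placeholders; the CID is assigned when the enclave is launched, and the wire protocol is entirely application-defined.

```python
import socket

ENCLAVE_CID = 16   # placeholder; taken from the enclave launch output
PORT = 5000        # placeholder; the listener the enclave code opens

with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as s:
    s.connect((ENCLAVE_CID, PORT))
    # Everything in and out of the enclave flows through messages like this,
    # which is what makes the interface auditable.
    s.sendall(b'{"op": "process", "payload": "..."}')
    reply = s.recv(4096)
```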
Building an enclave looks the same as the picture I described earlier. Instead of an image description with source code attributes, you take an actual built container image and convert it: you generate an enclave image file, an EIF file, from it. While generating it, or after the fact, you can generate hashes of the image so you know what to expect later at boot. From that image you also launch a Nitro Enclave.
When you launch a Nitro Enclave, the Nitro hypervisor hashes all of the contents of the actual image and provides you with PCR values, which are then fed into a Nitro Enclave attestation document. This follows the exact same flow as before. The actual measurement itself is a bit simpler, because instead of a chain of boot steps, the Nitro stack takes the EIF file, boots it, and seeds the PCR 0 value, which covers all of the contents of the EIF file. There are more PCRs, but you can read up on them online if you're interested.
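As a hedged sketch of that build step: the nitro-cli tool converts a Docker image into an EIF and prints the resulting measurements, including PCR 0, as JSON. The image name and output path below are placeholders.

```python
import json
import subprocess

# Build the enclave image file from a container image and capture the
# measurements that nitro-cli prints on stdout.
result = subprocess.run(
    ["nitro-cli", "build-enclave",
     "--docker-uri", "my-llm-app:latest",
     "--output-file", "llm-app.eif"],
    capture_output=True, text=True, check=True,
)
measurements = json.loads(result.stdout)["Measurements"]
print("PCR0:", measurements["PCR0"])  # compare against the attestation document later
```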
Using these two values, the AWS signature on the Nitro Enclave attestation document as well as PCR 0, you can reason about the same two dimensions. You know that it's exactly your image running, which means no interactive access (hopefully, unless you put an SSH server behind vsock), and you know it's the Nitro stack itself as well, because the document is signed with the Nitro attestation key. Nitro Enclaves is a super useful feature, but it does have a restricted feature set. You only have CPU, memory, and vsock available, which dramatically simplifies auditing because there isn't a lot of surrounding code or access mechanisms available to the environment.
However, this also means you don't get access to GPUs, for example. For that, we recommend EC2 Instance Attestation instead, because that gives you a full environment just like you would on a normal EC2 instance. The attestation document for Nitro Enclaves proves the Nitro as well as the Nitro Enclave boot contents. If you want to communicate to the outside world from a Nitro Enclave, everything goes through VSOCK, including access to KMS. It may be a bit more tricky to set up, but it's usually worth it.
Nitro Enclaves is what you use if you want to execute an attested and isolated ephemeral workload that depends on a parent application. Ephemeral, because you usually don't have persistent storage unless you build it on top yourself. We actually have a really cool demo application that Sudhir has built. He's going to show you how to use all of these technologies I just described to solve the initial problem I talked about: how do I give you an environment that launches and operates my AI model without you having visibility into the model itself.
Demo Setup: Multi-Party Collaboration for LLM Model Protection
I promise, I have only a couple of slides and then we'll get into the code. Let's look at one of the cool use cases you can solve with EC2 Instance Attestation. Here's an example, and by the way, everything I'm going to show in the next few minutes you can follow along with. Grab the QR codes and the sample applications that I'm going to show. The code samples show how to take the sample application and package it into an attestable AMI using the tools that Alex mentioned, KIWI-NG and Nix. All of that is already open source, so grab these QR codes.
Here's the setup for the sample app. Imagine, and again this is just continuing what Alex outlined at the very beginning, that you are an AI/ML model owner. I'm going to take the example of an LLM model owner, so let's talk about generative AI applications. Let's say I am in the business of fine-tuning domain-specific LLM models, and I want my customers to use them. I have invested a considerable amount of resources in fine-tuning and training those LLM models. So how do I make sure that my customers can use my LLM models without me worrying about model weight exfiltration? That's one way to look at the possible threat that I want to mitigate as a model owner.
Similarly, the model consumer, the other half of this multi-party collaboration setup, would worry about something different. What if I'm actually sending highly sensitive data in the prompts, in the query, in the context that I supply to this large language model? It's pretty common these days that you would want to enrich what you send to the LLM to get more context-aware answers back from the LLM. You would send maybe your organizational data, enterprise data, personal information, and so on. So you do have a use case where highly sensitive data is being sent and you want to make sure that the other party cannot exfiltrate what you have said.
Here is one way to solve it. Let me look at a very simple setup.
Here is how you deploy. I am not saying this is the only way to consume LLM models on AWS. There are other services, but for the purpose of this conversation, since we are talking about cryptographic attestation and Amazon EC2 instances, here is one way to host an LLM model. It is served using a model server and then I have a UI or a front end that is consuming that LLM API. So far so good. This is a classic and simple example of what we call multi-party collaboration. This is not multi-party competition; this is multi-party collaboration. You have two parties here: the party that owns the LLM model and the party that consumes the LLM model.
I think I have a slide on that. There you go. As I explained, the party that consumes the LLM model wants to protect certain things that are being supplied to the LLM model. The other party wants to protect the model itself. How would you solve for a pattern like this? Here is another visual which shows the prompt and query being sensitive. The LLM model is intellectual property of the party which owns the LLM model, so that is sensitive as well.
There are more patterns, like I said; this is not the only problem we could solve. There are more design patterns in generative AI applications where components of your application stack anonymize or pseudonymize the prompts and sensitive information before sending them over to the LLM. You could have RAG components that append additional context; that is something you probably want to protect, because RAG has access to your organizational information, for example. There are many more, and you can extrapolate, but the primitives to solve a problem like this remain the same.
Let me take a look at those primitives. The first and foremost: as a model owner, you would want to encrypt your LLM model weights. We call it envelope encryption using AWS KMS or your own key management system before you publish it to the other party. If you recall, we started off with the problem statement saying that instead of asking your customers to send their sensitive data to you, why don't you invert the deployment topology where you are packaging up your LLM model weights in such a way that they can actually be deployed in your customer's AWS account or at the point of consumption.
So the sensitive information that is being sent as input to the LLM model stays where it is. It does not go beyond the boundary of the AWS account where it belongs. Envelope encryption is one way to package up your LLM model weights to be made available to the consumer. But the next step is the most important step. Beyond just encrypting them, because we are talking about confidential computing environments and because we have a nice tool called attestation at our disposal, what we will do is seal the model weights to the measurements that you saw in Alex's talk.
These measurements represent the confidential computing environment, the isolated compute environment that you have built. You went through the build flow, you built an attestable AMI, and one of the things that came out of the build process is measurements that represent this AMI. Now imagine you have the execution environment, the EC2 instance, which also presents you the same measurements in an attestation document. So what I am going to do as a model owner is not only encrypt but seal my model weights to be unsealed only in an execution environment that shows me or presents to me these exact same measurements in that attestation document.
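Here is a minimal owner-side sketch of that envelope-encryption step, assuming the boto3 and cryptography packages and a KMS key you control; the key ARN and file names are placeholders. The sealing itself happens in the KMS key policy, shown later, not in this code.

```python
import os
import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kms = boto3.client("kms")
KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE"  # placeholder

# 1. Ask KMS for a fresh data key under the master key.
dk = kms.generate_data_key(KeyId=KEY_ARN, KeySpec="AES_256")

# 2. Encrypt the model weights locally with the plaintext data key.
nonce = os.urandom(12)
with open("mistral-7b.gguf", "rb") as f:
    ciphertext = AESGCM(dk["Plaintext"]).encrypt(nonce, f.read(), None)

# 3. Publish only the encrypted weights (nonce prepended) and the
#    *encrypted* data key; the plaintext data key never leaves this process.
with open("mistral-7b.gguf.enc", "wb") as f:
    f.write(nonce + ciphertext)
with open("data-key.enc", "wb") as f:
    f.write(dk["CiphertextBlob"])
```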
Live Demonstration: Encrypting, Sealing, and Consuming LLM Models
Let me actually look at the sample application. Enough of design patterns and primitives; we will actually see how these primitives and design patterns resulted in a sample app, and how we package it up and deploy it as an EC2 instance with Instance Attestation, which becomes your confidential computing environment. I am going to switch over to my laptop here. Let me know if some of this is not clearly legible.
So first and foremost, let me give you a peek at the sample app itself.
Here's the sample app. Does it look okay so far? Do you want me to zoom it up? Let me zoom it up real quick. This is the same concept that we just looked at in the slides. I have two parties. One party is the model owner, and the other party is the model consumer. The model owner wants to encrypt and seal the model weights and then publish it to the model consumer. We're talking about an EC2 instance here with instance attestation. They want to consume it, so they will be sending their inference requests to this EC2 instance, to the application that we're going to package up into this EC2 instance. Out comes a chat interface where they can chat with it.
Let me show you how it comes together. For the purpose of this demo, we have simplified a bunch of concepts. I made it very simple, and I'm actually running both the front end and the back end of the sample application inside the same EC2 instance. By no means is this well architected. You could divide up these components into each of its own execution environment with its own attestation and its own measurements. There's nothing stopping you from doing that, but this is just for simplicity's sake.
What I have here is two views. Here is what I can do as an LLM model owner. As I mentioned, I can envelope-encrypt my model weights and then publish and seal them. I'm going to take some off-the-shelf LLM model weights available on Hugging Face. Just play along and imagine that these are my intellectual property: highly fine-tuned, domain-specific LLM models or what have you. I have this LLM model weight that I'm going to encrypt with a KMS key ID and publish to an S3 bucket.
I'm not going to bore you with the loading demo and have you sit here and watch it; I went ahead and published it already. Here we have it: the encrypted model weight, which in this case is a Mistral 7B model, about 4 gigabytes on disk. What I also have, and this is the cool and important part of envelope encryption, is a backing KMS key, which is what this ID represents. What we did behind the scenes, if you look at these steps, is generate a data key. Those are the fundamentals of envelope encryption: you generate a data key using the master key, and then you use the data key to encrypt the GGUF file.

What you publish to your consumer is two things. One is the encrypted GGUF file, which is what you see here. GGUF is just one representation of LLM model weights, and there are several others; for simplicity, we used the most popular format that my simple LLM server, Ollama, can serve. We'll see those details, but yes, those are the two things I'm publishing: the LLM model weights, which are encrypted, and the encrypted data key that I used to encrypt them.
So far, we've done one thing as an LLM model owner, which is publish my model weights. Now for the sealing step, I'm going to take this KMS key ID, and this is where attestation comes into the picture. Let me actually show you attestation before we go into the sealing model. You might be wondering what the PCRs are. If I go to the EC2 Instance Attestation tab, this is a pretty detailed view of what an attestation document looks like. This is the NitroTPM-generated attestation document, the EC2 instance attestation document. It has a bunch of things that might be of interest to you. One of those is the nonce. The nonce is typically used in sessions to preserve freshness; you just want to make sure the same attestation document is not replayed. But one of the important things is the PCR values.
We just talked about what PCRs are and how some of them come out as output from the build process. I'm going to show you the interesting ones: PCR 4, PCR 7, which covers the Secure Boot policy, and then importantly PCR 12 as well. These are probably the three things that you would want to verify and introspect in that attestation document at a minimum. There are a few other things that are also important for verifying the integrity of the attestation document itself, which is the certificate chain that I'm showing here.
We start with the leaf certificate, the one representing the TPM itself, and go all the way up to the root certificate that anchors the signature on this attestation document. If an EC2 instance presents you an attestation document, you should verify the signature and make sure that it is a genuine Nitro-signed attestation document before you consume these PCRs and reason about them.
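A hedged sketch of that chain walk using the Python cryptography package: verify each certificate's signature with its issuer's public key, from the leaf up to a root you already trust and have pinned out of band. Nitro attestation chains use ECDSA certificates; validity-period, extension, and revocation checks are omitted here for brevity.

```python
from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import ec

def verify_chain(certs_leaf_to_root: list[x509.Certificate]) -> None:
    """Raise if any certificate was not signed by the next one in the chain."""
    for child, issuer in zip(certs_leaf_to_root, certs_leaf_to_root[1:]):
        issuer.public_key().verify(
            child.signature,
            child.tbs_certificate_bytes,
            ec.ECDSA(child.signature_hash_algorithm),
        )
    # Separately, compare certs_leaf_to_root[-1] against the pinned AWS root.
```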
That's why these are important, and that's exactly what we're doing. There is verification here: semantic validation, certificate chain validation, nonce verification, things like that; we baked all of this into the sample application. Now, if you go as a model owner to the KMS key that you own and have used to encrypt the LLM model weights, here is how the experience looks. Some parts I have done ahead of time, but here is the key policy for this KMS key. I have simply added a condition, and this is possible because we built out-of-the-box integration of EC2 Instance Attestation with AWS KMS, just like we did for Nitro Enclaves. You could do this exact same thing for your own microservices and other components that want to interface with this instance or confidential computing environment; there's nothing stopping you.
Here is why it is simple: because we've done it out of the box. With the KMS key policy, you simply add a condition that says: only allow this action, which is decrypt, because from the consumer's perspective we're trying to decrypt the GGUF, the LLM model weights. As an owner, I'm saying that I will only allow decrypt of my model weights if the request came from an EC2 instance that presents these exact PCR values. If you issue a decrypt request from a regular EC2 instance that does not present an attestation document, it will simply fail. It's as simple as that.
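As a hedged sketch of what such a policy statement can look like: for Nitro Enclaves the condition keys follow the kms:RecipientAttestation:PCR&lt;n&gt; pattern, and the EC2 Instance Attestation integration works the same way, but confirm the exact key names and PCR indices in the current KMS documentation. The principal and digests below are placeholders.

```python
import json

policy_statement = {
    "Sid": "AllowDecryptOnlyFromAttestedEnvironment",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::111122223333:role/consumer-instance-role"},
    "Action": "kms:Decrypt",
    "Resource": "*",
    "Condition": {
        "StringEqualsIgnoreCase": {
            # Digests recorded when the attestable AMI was built.
            "kms:RecipientAttestation:PCR4": "<pcr4-from-image-build>",
            "kms:RecipientAttestation:PCR7": "<pcr7-from-image-build>",
        }
    },
}
print(json.dumps(policy_statement, indent=2))
```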
Now with that in place, that concludes what you need to do as a model owner before you ask your customers to consume it. I'm not going to go into detail about how you actually onboard customers or how you do the handshake: customer, here are the PCR values you need to trust, and here is the recipe I used to build this environment. All of that is part of your customer onboarding, however you establish that handshake. But at a simple level, this is how it looks.
Now, from a consumer standpoint, here's what I'm going to do. Again, the sample application just makes it easy, but the primitives remain the same. The publisher gave me an S3 bucket where the encrypted files are available, so I'm going to point to them. I'm going to use the KMS key ID provided by the publisher, and I'm going to ask: will you allow me to decrypt? You go through these steps: you download the encrypted files and attempt to decrypt the data key. If the data key decryption succeeds, because you presented the right attestation document with the right PCR measurements, then you get to decrypt the actual model weights.
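The consumer-side counterpart to the earlier owner sketch, under the same assumed file layout (nonce prepended to the ciphertext). In the attested flow, the Decrypt request also carries the attestation document, as in the Recipient sketch shown earlier; KMS only honors it when the key policy's PCR conditions match.

```python
import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kms = boto3.client("kms")

# 1. Ask KMS to unwrap the data key; this is the step the key policy gates
#    on the instance's attestation document.
with open("data-key.enc", "rb") as f:
    data_key = kms.decrypt(CiphertextBlob=f.read())["Plaintext"]

# 2. Decrypt the model weights locally with the recovered data key.
with open("mistral-7b.gguf.enc", "rb") as f:
    blob = f.read()
nonce, ciphertext = blob[:12], blob[12:]
with open("mistral-7b.gguf", "wb") as f:
    f.write(AESGCM(data_key).decrypt(nonce, ciphertext, None))
```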
That also takes some time, so I'm not going to do it live right here, but the end result is loading the LLM into the Ollama server that's running locally within the EC2 instance. Additionally, and this is functionality that I baked into the sample app, as a consumer of the LLM model I may want to know that I'm talking to a specific model and not just any LLM. So in this case, if my threat model says make sure you're talking to a Mistral 7B and not some other LLM model, what I do is hash the model weights and extend those hashes into one of the available PCRs. In this case, I'm hashing it into PCR 15.
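A sketch of that measure-and-extend step, assuming tpm2-tools is available in the image; the file name and PCR index match the demo but are otherwise arbitrary.

```python
import hashlib
import subprocess

# Hash the decrypted model weights...
with open("mistral-7b.gguf", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

# ...and extend the digest into a free PCR. The TPM computes
# new_PCR15 = H(old_PCR15 || digest), so a verifier that knows the expected
# model hash can recompute the value and check it in the attestation document.
subprocess.run(["tpm2_pcrextend", f"15:sha256={digest}"], check=True)
```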
What does it mean? There are some out-of-the-box PCR registers that track certain interesting properties of your attestable AMI, such as PCR 4, 7, and 12. But you also have the flexibility to measure things that are interesting to you and your workload, and then use one of the available platform configuration registers (PCRs) in the NitroTPM attestation document.
You extend those so that you can verify them later. One example I just showed was the LLM model hash, but it could be more. In total, you have 24 PCR registers; some are reserved, and some are available for your own custom hashes. That gives you the idea of the consumption experience. Again, both the publisher's and the consumer's experiences hinge on attestation documents being exchanged in the handshake with AWS KMS. But you could do this exact same thing with other components of your architecture.
Let's say you have microservices and you want microservice A to do this attestation handshake with microservice B. They can exchange attestation documents with each other and get the work done. That's the idea. I'm also going to show you one of the capabilities we mentioned: unlike Enclaves, you can attach and consume a GPU, which is exactly what I'm doing here. In this particular case, I deployed this to a G5 instance, which gives me an NVIDIA A10G GPU. So there's that additional capability. As for which path you take, Enclaves or EC2 Instance Attestation, you can make the decision based on the features available in one versus the other.
Packaging Applications: KIWI-NG, Nix, and Nitro Enclaves Approaches
That's the quick walkthrough about the sample application. Now let's get into details about packaging itself. By the way, once you have the QR code, you can go to the sample application and see how each of those UI components and the backing API implement things like end-to-end encryption and fetching the attestation document. Now let me show you the KIWI recipe that resulted in that sample application being deployed into the EC2 instance. I'll call out some key features here. If I go to the KIWI recipe, you'll see that KIWI has its own declarative way of specifying what packages should get into the AMI and what should be ignored. One of the things I'll show you here is these few lines of code.
One of the interesting properties to solve the multi-party collaboration problem that we're looking at right now is that the sample application is hosted in your customer's account. You want it to be an isolated compute environment where your customer cannot get into it and potentially exfiltrate the model weights. All they can do is use the API and consume the LLM model using the LLM model server. What's interesting to me is I want to add these properties where I'm saying, "Hey KIWI, when you build this AMI for me, make sure you never include these packages." That's what this ignore statement says. I'm saying none of the interactive access should be possible. Things like EC2 Instance Connect, your Systems Manager console, your SSM agent which gives you SSM session access, OpenSSH, SSHD, all of those interactive paths to get into the EC2 instance are mitigated and taken away with these few lines of code.
You'll see again in the sample code, and I'll show you the QR code for this one as well, that here I'm specifying all the packages that I need. You'll see packages like tpm2-tools and the AWS NitroTPM tools, things you need to get the attestation document, extend PCRs, and so on. That's a bit about the KIWI recipe that resulted in the sample app you just saw.

Now, the same sample app can also be packaged as an attestable AMI using Nix. Nix is yet another way to build your machine images, and unlike KIWI, Nix gives you a very important and interesting property: reproducible builds. Let me show you the sample app that Nix created. It's going to look and feel the same, because all I'm doing is packaging the same application with different tools to produce an attestable AMI. Just to show some variation, in the Nix recipe that I'm going to show, I took away all the GPU drivers.
The intention here is to deploy the same sample application on a CPU-only instance type with no GPU attached. If you look at the environment, you'll see that there's no GPU. The idea is that the same sample application can be packaged into an AMI, and if you build it the right way, it can be deployed across multiple instance types. If you look at the NitroTPM documentation, you'll see there are many more instance types available to right-size your workload compared to Enclaves. Enclaves also offer flexibility in the CPU and memory provided to your workloads, which we'll see in a moment. The sample application is the same, and the attestation document looks the same, because this is also NitroTPM attestation.

However, the recipe is a little different, and I'll show you some of the interesting pieces of code. This is worth mentioning. In KIWI-NG, the way I told KIWI to bake my sample application into the AMI was by using something called overlays: I told KIWI, here is my sample application, potentially in a different GitHub repository; take it and overlay it on top of what you already have in the base image the recipe describes, then output the AMI with the sample application baked in. In Nix, it's a little different. You write out something called a Nix flake, where you tell Nix how to package and build your application: here is how you build my front end, here is how you build my back end, now go do it. Nix does the same thing as KIWI-NG in that, along with creating the AMI, it gives you a bunch of PCRs that you can later use in attestation as well.
Now let's look at the last piece of the puzzle, which is taking the same sample application and packaging it to run inside an enclave. I did that ahead of time already. The semantics are a little different when it comes to Nitro Enclaves: there are differences in how you package your application for an enclave, and there are differences in attestation as well.

Here is the Nitro Enclave attestation document. It might look similar; you still have the nonce, module ID, user data, public key, and so on. You do have PCRs, but these PCRs mean different things. For example, PCR 0 is the hash of the entire Nitro Enclave image. I grabbed some of the higher-order PCRs; in this case, it looks like I loaded the model multiple times, so I took PCRs from 16 to 22 and extended them with the hash of my model weights. So even Nitro Enclaves gives you the capability to use a PCR to measure and extend what's interesting to you in your workload. It has the same certificate chain, and that's about it as far as attestation goes. Pretty much all the other attestation concepts remain the same for enclave attestation as well, and none of the model owner and consumer functionality changes.
Let's look at what's different when it comes to Enclaves. We mentioned vsock. One of the things you get with an enclave is that you do not have any ENI attached. If you want to communicate with the external world, which you have to do to talk to KMS, to talk to S3 to get the model weights, and finally to expose your API (the OpenAI-compatible API that serves the LLM to the front end), you need proxies for all of that. That's one difference in how you package this application. The other difference is that when you write out an enclave, you have to give it a bootstrap script; you have to tell it how to run this particular application inside the enclave. For that, I have some additional scripts. Those are some of the differences, but the idea is that you could take the same application, the multi-party collaboration application we set out to solve with attestation capabilities, to both of these environments. You have seen how to use Nix and KIWI-NG to package it up to consume EC2 Instance Attestation, and you have also seen how to package it for Nitro Enclaves.
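A minimal sketch of that proxy pattern: forward a local TCP port on the parent to a vsock port the enclave listens on, so the front end can reach the LLM API. CID and ports are placeholders, and production setups would typically use the vsock-proxy utility that ships with the Nitro Enclaves CLI rather than hand-rolling this.

```python
import socket
import threading

ENCLAVE_CID, VSOCK_PORT, TCP_PORT = 16, 5000, 8080  # placeholders

def pipe(src: socket.socket, dst: socket.socket) -> None:
    # Copy bytes one way until the source closes.
    while data := src.recv(4096):
        dst.sendall(data)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", TCP_PORT))
listener.listen()
while True:
    client, _ = listener.accept()
    upstream = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    upstream.connect((ENCLAVE_CID, VSOCK_PORT))
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()
```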
Conclusion and Resources for Getting Started
That's the demo portion. I'm going to leave some QR codes for you so that you can follow along later with this exact same application. Let me switch back to the presentation. Here's the KIWI-NG QR code; this gives you the sample packaging code that shows how to take the sample application and package it into an attestable AMI using KIWI-NG. There's a quick architecture diagram available in the GitHub repo as well, so you don't have to worry about taking a screenshot. Here's the Nix sample repo. This is the same sample MPC application that you saw, and it gives you the Nix flake that shows you how to go about it.
By the way, we packaged this up into a workshop. If you are interested in learning how to actually build attestable AMIs, we have a session tomorrow, CMP410. It's a short and sweet builder session. Please do attend if you're interested in getting hands-on training on how to build attestable AMIs; we will use this exact same sample MPC app that you just saw. Finally, for Nitro Enclaves, this QR code leads you to a hands-on workshop that shows you how to serve LLM model weights using Nitro Enclaves. That's the end of my demo.
With that, we have learned how to do confidential computing in an EC2 environment with different mechanisms. I would encourage you to have a look at the survey and please fill it out. Let us know how we did, and I'm opening things to questions now. Thank you.
This article is entirely auto-generated using Amazon Bedrock.