🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.
Overview
📖 AWS re:Invent 2025 - State of the Art: AWS data protection in 2025 (ft. Vanguard) (SEC203)
In this video, Ken Beer, Director of AWS Cryptography, and Rajeev Sharma from Vanguard discuss post-quantum cryptography (PQC) migration strategies. Beer explains the "harvest now, decrypt later" threat where encrypted data captured today could be decrypted by future quantum computers, emphasizing the need for quantum-safe algorithms like ML-KEM and ML-DSA. AWS has implemented PQC across services including KMS, CloudFront, S3, ALB, and API Gateway, with AWS-LC and S2N libraries supporting these algorithms. Sharma shares Vanguard's journey establishing a Cryptography Center of Excellence, using AWS Config and CloudWatch for visibility, and prioritizing ingress/egress traffic protection. Both speakers stress the importance of crypto agility, starting with public-facing endpoints, and building infrastructure for future algorithm migrations. The session provides practical guidance on inventory, prioritization, and implementation timelines, with Vanguard targeting late 2029 for complete ML-KEM deployment across their $11.9 trillion asset management infrastructure.
; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
Introduction: AWS Cryptography Services and the Post-Quantum Challenge
Make sure everyone can hear me. If you can't hear me, raise your hand. My name is Ken Beer. I'm the Director of AWS Cryptography. I've been at AWS for a little over 13 years and helped launch the Key Management Service back in 2012 here at re:Invent. Now I own the full stack from open source cryptographic libraries and algorithm implementations up to value-added services like KMS, CloudHSM, and AWS Payment Cryptography.
Today, you'll hear from me about what AWS cryptography is up to, what we've done in the past year, and what we're thinking of doing in the coming year. You're going to learn about where we are focused and where we think you might want to be focused in the area of cryptography and securing your data. The real treat today is you get to hear from Rajeev Sharma from Vanguard, who's going to talk about their approach to achieving a state we call crypto agility in an effort to migrate to post-quantum cryptography. This is something that all of you will eventually have to do. The question is how soon. I think it's really useful to learn from some of the early adopters in the space to figure out how you're going to strategize.
One of the things that we have asserted at AWS is let us do the undifferentiated heavy lifting for you. Certainly when it comes to running compute and storage, when you use a cloud service provider, you no longer have to care about the physical security of the data center or all the front-end networking equipment in front of API endpoints. We take that away from you and ensure that it is always patched and upgraded to the state of the art when it comes to protecting your data.
You can look at other tasks related to data protection, whether it's digital certificates, managing encryption keys, or deploying software that handles PKI or encryption. We want to provide solutions for our customers when it comes to software libraries that you would use to encrypt or digitally sign data. We offer those for free. They're all open source. We have a very strong commitment to transparency about what we're doing in the area of cryptography, even when we are on the bleeding edge and inventing new areas of cryptography.
AWS Key Management Service: The Foundation of Cloud Encryption at Scale
Let me talk a little about the services that we have offered and how they're helping customers deal with undifferentiated heavy lifting. I talked about the Key Management Service, which is what we call a foundational root of trust at Amazon. All encryption of customer data ultimately chains up to keys that are managed inside KMS. If any of you have used KMS, we have a resource type called a customer managed key that gets to be your root of trust for whatever data you're protecting across however many AWS services you want to use, as well as your own applications running anywhere on the planet.
KMS is what we believe is the largest lock and key service in the world in terms of raw numbers of keys that we're handling, as well as the number of encrypt, decrypt, sign, and verify operations that we do on a daily basis. We average around 30 billion operations per hour. Depending on the time of year, we can peak north of 30 percent above that. We're able to scale to meet your needs. If you need to do more cryptographic operations faster because your business is taking off, it's fairly straightforward for us to add capacity.
One of the ways we did that is we invented our own hardware security module 12 years ago and designed it for the cloud. It's highly cost effective and highly scalable, so we can meet you as you grow your business. One of the things we did with KMS is we changed the game a little bit on how you think about managing encryption keys. For years, developers had to know where their key was located and how to ensure that the physical and logical security controls around the location of the key were adequate. It's a hard problem to solve, especially when the key at the top of a hierarchy must always be in plain text all the time for high availability.
With KMS, we said you don't have to worry about that anymore. The only thing you have to worry about is access control to use a key.
The access control mechanism is the same one used for every other API at AWS. Most customers can reason about this by thinking of a KMS key like an S3 bucket. You have a resource policy that controls who can use this key, under which conditions, and in which services it can be used. Data tagged with specific tags can be encrypted and decrypted under this key. KMS has scaled nicely for us, and we have a host of other services to solve slightly different problems.
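As a rough illustration of that resource-policy model, the sketch below creates a customer managed key whose policy allows one role to use the key only through S3 in a single region, via the documented kms:ViaService condition key. The account ID, role name, and region are placeholders, not values from the talk.

```python
import json
import boto3

kms = boto3.client("kms", region_name="us-east-1")

# Placeholder identifiers -- substitute your own account, role, and region.
ACCOUNT_ID = "111122223333"
APP_ROLE = f"arn:aws:iam::{ACCOUNT_ID}:role/ExampleAppRole"

key_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Keep account-level administration so the key cannot be orphaned.
            "Sid": "EnableIAMUserPermissions",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{ACCOUNT_ID}:root"},
            "Action": "kms:*",
            "Resource": "*",
        },
        {
            # Let the application role use the key, but only via S3 in this region.
            "Sid": "AllowUseViaS3Only",
            "Effect": "Allow",
            "Principal": {"AWS": APP_ROLE},
            "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey*"],
            "Resource": "*",
            "Condition": {
                "StringEquals": {"kms:ViaService": "s3.us-east-1.amazonaws.com"}
            },
        },
    ],
}

response = kms.create_key(
    Description="Example customer managed key scoped to S3 usage",
    Policy=json.dumps(key_policy),
)
print(response["KeyMetadata"]["Arn"])
```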
Expanding the Cryptography Portfolio: CloudHSM, Certificate Manager, and Secrets Manager
The CloudHSM service is, in many ways, a stepping stone. Many customers, especially in banking, government, and some large manufacturing firms, had been using commercial off-the-shelf HSMs. Their application and client code talk to the CloudHSM over one of the arcane cryptographic protocols. This is where you want to do a lift and shift when you have a client that you do not want to change, or cannot change, to talk to an HSM.
Certificate Manager is designed to simplify PKI. Your certificate is an AWS resource, and you can configure where that certificate gets deployed across any of the AWS load balancing services. Earlier this year, we launched what we call take-home certs, or exportable certificates, so you can have us issue a certificate for you and use it wherever you want. You will see a lot more work in this space to simplify the issuing and deploying of certificates because we think it will be a linchpin for the explosion of machine identities, part of which is being driven by the explosion of agentic AI.
Certificate Manager is designed for those public-facing, publicly trusted certificates that you would attach to your website, for example. Private Certificate Authority is where you manage your own trust store. Your client knows how to validate or verify a certificate because it is an internal mesh network. You do not rely on the browser's trust store to validate a certificate; you can do it yourself.
Secrets Manager is for when you do not have an encryption key that you need to manage because KMS does it, and you do not have a certificate to manage because ACM or PCA do it, but you have a random database credential for some application stack or a SaaS provider that you do not want to bake into your code. You put it into Secrets Manager, and that string that is your secret or credential is itself encrypted under KMS keys. That is important to know as well.
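A minimal sketch of that pattern, using hypothetical names: the secret string is stored in Secrets Manager and encrypted under the customer managed key you specify, and Secrets Manager decrypts it on your behalf at retrieval time.

```python
import boto3

secrets = boto3.client("secretsmanager", region_name="us-east-1")

# Hypothetical names -- replace with your own secret name and KMS key alias.
secrets.create_secret(
    Name="example/app/db-credentials",
    SecretString='{"username": "app_user", "password": "replace-me"}',
    KmsKeyId="alias/example-app-key",  # the customer managed key protecting this secret
)

# Retrieval: Secrets Manager decrypts under that KMS key on your behalf.
value = secrets.get_secret_value(SecretId="example/app/db-credentials")
print(value["SecretString"])
```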
Even the certificate management services in PKI have all of the private keys and key pairs generated in KMS and vended through ACM and PCA. All the signing operations for the issuing of certificates happen through KMS. That is why we call KMS a cryptographic root of trust. We talked about Certificate Manager launching exportable public certificates earlier this year. Secrets Manager made an announcement in the past week that talks about some deeper integrations with third-party systems.
For the past couple of years, for example, if you use CyberArk, they have an on-premises secrets management system. You have a way to synchronize secrets that are in CyberArk with AWS, which is what we launched two years ago. You will see a growing number of additional partners where if you are using a secret in Salesforce to manage all your accounts and your access to your Salesforce apps, you can now synchronize that with Secrets Manager. We are trying to make it easy for you to look at Secrets Manager as your single pane of glass for arbitrary secrets management, where those secrets are specific to your application stack.
Private CA is doing more and more integrations where certificates are needed, certainly when spinning up large numbers of Kubernetes clusters. You have to get the right certificates because each of those clusters and those nodes have their own identities, and those need to be based in PKI. I want to shift the conversation to the main theme of this talk, which is post-quantum cryptography.
Understanding the Quantum Threat: Why Post-Quantum Cryptography Matters Now
What is post-quantum cryptography? It is a new type of cryptography, a new set of algorithms that are designed to protect you from a threat that does not exist yet. This might sound kind of weird, but it is something we all need to be thinking about. A quantum computer—there are certainly prototypes of quantum computers out there, and you read about them every month with breakthroughs somewhere.
At some point, a quantum computer will become what we call cryptographically relevant and may be strong enough to brute force attack the classic asymmetric algorithms that we rely on today for all of our transport security and many other digital signature applications. RSA and elliptic curve are things that have been in use for 30 to 40 years. If a quantum computer that is strong enough and stable enough appears, there is already an exploit in the wild called Shor's algorithm.
We don't know when this quantum computer is going to appear. Nobody really knows. I would encourage you when you're reading articles that say it's right around the corner or that it's a matter of months to take it with a grain of salt, because there are still a ton of material science problems to solve. The stability of these computers is the critical part because without stable quantum computing, you can't run Shor's algorithm for as long as you need to potentially crack an RSA 2048-bit key or higher.
If we don't have a quantum computer today, why do we care? Well, here's the theoretical risk, which we call harvest now, decrypt later. The problem is that asymmetric keys and the RSA and elliptic curve algorithms are used in every TLS session that secures a channel between an arbitrary client and an arbitrary server across the internet. In almost all cases, you as the owner of the server component or the owner of the client component have no ability to prevent somebody from copying packets, even though they're encrypted packets. There are just too many places for someone to plug in a network sniffer, pull down a PCAP file, and say they're going to sit there and wait for a cryptographically relevant quantum computer, and then they're going to take this PCAP file and brute force attack it.
Why would they do this? Because they might be able to get an underlying secret that was used to encrypt very sensitive data that was sent across that wire. So the question that we're all asking ourselves is where are the network connections where data that's sensitive today needs to stay sensitive and secured for many years into the future. That's a hard question for a lot of companies to answer. But there is one company or organization that's pretty clear about that threat model, and it's nation states. The US government a couple of years ago said they need their secrets to stay secret for about 50 years. If anybody were to intercept a transmission over the wire and decrypt it in 10, 20, or 30 years, that's a very bad outcome.
The US government has been doing a lot of work over the past 5 to 10 years to try to find a mitigation. The mitigation here is that you have to be able to create a ciphertext so that if it's captured, you have confidence that it is sufficiently protected against this imaginary quantum computer that does not yet exist. The good news is that NIST has spent a lot of time and there are some quantum-safe algorithms. You'll hear a term called lattice-based cryptography with acronyms like ML-KEM, ML-DSA, or SLH-DSA. This is a lot of alphabet soup, and it's stuff that you're eventually going to learn and it will be standard and typical the same way that we talk about RSA and ECC today.
Two Critical Risks: Confidentiality and Authentication in the Quantum Era
These algorithms are available, they've been standardized, and now vendors like us are in the process of implementing them. So the question is how should you be taking advantage of this? The harvest now, decrypt later threat is really addressing what we call a confidentiality risk. The confidentiality of your data tomorrow is at risk if you don't encrypt it correctly or with quantum-safe cryptography today. The other risk is what we call an authentication risk.
As we all know, it's one thing to be able to crack encryption and discover the underlying secrets. Perhaps another scary attack vector is that somebody can impersonate you and pretend to send a message as if they were you and fool the recipient into believing it's true.
This is where digital signatures today mitigate against those impersonation attacks. But we need to be thinking about how we ensure the integrity and the authentication of signed data into the future. If you are a company that signs firmware that gets deployed into devices and you simply cannot update those devices with a new digital signature for many years, you're concerned about this problem. Amazon is concerned about this problem with the Leo satellites that we're putting up into space. Lots of set top box vendors and lots of other manufacturing firms are concerned about this problem as well.
AWS Migration Strategy: A Phased Approach to Post-Quantum Cryptography
This is another area where, while the cryptography is a little bit different and it's not about securing confidentiality, it's about ensuring authentication. We put together a blog post on the AWS Security blog last November that outlines how we at AWS are thinking about this because we have to go through our own migration. We don't pretend like we have the perfect checklist that everybody should follow, but we know that there are different steps, and one of the real challenges is that most organizations have not had to worry about a migration in cryptography for thirty years.
When the AES standard was being developed in the mid-nineties, it was replacing the Data Encryption Standard. People realized they had to re-encrypt stuff with this new symmetric cipher because the old one was simply broken. These newfangled Pentium II chips could brute force attack DES. So there was a lot of work in the nineties. Fast forward to today, and those people aren't around anymore. Nobody really knows how to do a cryptographic migration. You cannot get away from doing some kind of inventory.
You have to identify, especially for encryption in transit to address the harvest now decrypt later threat, what are the most important connections and where is your most sensitive data coming from and going to. You've got to prioritize where you want to make changes. That's why we think that integrating quantum-safe cryptography into your public endpoints, your public-facing endpoints, is the first place to start. Next is this authentication problem with code signing, where you want to make sure that you're not going to be forced to replace a long-term digital signature in a device that you can't access.
Then the third area is also an authentication problem, but it has to do with short-lived roots of trust. The best use case to think of here is actually the digital certificates that are used to express the identity of your web server and your client. You get a digital certificate for your public-facing website that says you are example.com. Anybody who gets that certificate can say, yes, I trust the certificate authority that issued this certificate because they are Amazon or they are DigiCert or they are Sectigo or whoever the CA happens to be. Your client says, I can trust the thing that I'm connecting to.
The reason that trust works is that there's an RSA or an elliptic curve digital signature inside there. Both parties can trust that nobody's impersonating somebody they shouldn't. But in a world where RSA and elliptic curve can be brute force attacked, somebody could potentially spoof you. So we have a solution. There are algorithms like ML-DSA that can be used here. Part of the problem, however, is that in a world where you might own the server but have no control over the browsers that are the typical clients, interoperability is critical.
Interoperability is a thing where everybody has to move at the same rate, and right now the browser community is still trying to figure out how they might want to deploy ML-DSA. They're looking at alternative algorithms because these quantum-safe algorithms create larger signatures, so more bytes on the wire. They also take a little bit more CPU to do the math, and so it's going to add latency. That's why we think PQ for server and client authentication is the final step, the final area, mostly because you can't do anything today with public certificates. Nobody is issuing a public certificate with an ML-DSA signature because you can't.
You can't use an ML-DSA signature because browsers have to build support for that first, and then all of the SDKs that do client-side authentication have to support that. One of the things that we realized is that the core algorithms are only part of the story. We talked a little bit about NIST and the development of the core algorithms; they have been issued and they're standardized. You'll see things like FIPS 203, FIPS 204, and FIPS 205. These are shorthand for the underlying algorithms, and that's step one.
But then step two is figuring out how these algorithms should work in protocols like TLS or SSH, or IPsec or MACsec. There's a whole range of protocols between clients and servers to try and build an encrypted and authenticated session. The IETF tends to be the place where you define protocol implementations, and you'll see that AWS was recently part of a new RFC to try and standardize how ML-DSA should be used in digital signatures.
The telecom industry has their own set of custom protocols that they use to secure all of voice and data across telecom, and ETSI in Europe is sort of the leading standards body. We've been involved with them for 10 years to make sure that, for example, NIST, IETF, and ETSI aren't doing different things because that's the bane of any new standard—when 4 different companies or 5 different countries decide to go their own way. We're trying to avoid that from happening.
Building the Foundation: AWS-LC, S2N, and Service-Level PQ Implementation
So what have we done internally? Well, where we started is with core algorithm implementations. AWS Libcrypto, or AWS-LC, is our offering in this space. AWS-LC is open source and completely free, with runtimes that work in almost every major development environment. We have some optimizations for x86 as well as ARM64.
We decided to build our own cryptographic library around the time Amazon got into the business of building our own hardware, because we now have ways to use less CPU to do AES, RSA, and ECC operations. The less CPU we use for cryptographic operations, the more CPU is available for your applications to do their thing. So we start with AWS-LC and then we work up the stack with our open source TLS implementation called S2N.
S2N is highly optimized to try and minimize the latency for TLS connections. It uses AWS-LC and supports the ML-DSA algorithm. You'll see more and more of our open source tooling supporting post-quantum cryptography by default, and then those can be deployed inside your apps, in your software vendors' apps, and inside our own AWS services.
This diagram is trying to explain where the work needs to happen and where you might instead delegate work to us, rather than you going through all of your source code to try and figure out how to upgrade the crypto. What we're trying to point out here is that if it's your own custom application code, you're just going to have to go through it and figure out where you might find examples of RSA or elliptic curve being used and how you're going to swap it out.
Sometimes that's working with a third party vendor, and sometimes that's swapping out a new open source module. Your mileage may vary when it's your own application code. But customers also think in terms of the endpoint that clients are talking to, which we call the TLS termination endpoint. When EC2 first came out, people said, "Great, I can run my own web server. I'll run a copy of Nginx on EC2, and that's where my web server will live, and that's where I'll put my digital certificate, and I'll terminate TLS inside an EC2 instance."
Over the years, we've offered more and more of what I call reverse proxy services or load balancing services, where you let us handle TLS termination for you. Application Load Balancer, API Gateway, and CloudFront are just a few of the solutions in this space. If you're relying on one of those services to handle TLS termination on your behalf, then as of today, you already have post-quantum cryptography. It's there. You didn't have to do anything. You may have to make a selection to upgrade a particular policy, depending on how much curation you do on your endpoint, but we've done all the low-level coding and cryptographic changes on your behalf.
The same thing applies with other managed services. When you talk to the S3 endpoint, you don't manage the S3 endpoint. We do.
As of a couple of weeks ago, it now offers ML-KEM as a PQ option. So if your client can talk PQ to the S3 endpoint, you'll get quantum-safe cryptography and protect the data that you're uploading to and downloading from S3. This is the story going forward: every AWS service that exposes a public endpoint will support ML-KEM. I'd love to tell you a date by when we'll do that, but we're talking about hundreds of services. Just know that it is my number one priority in 2026, and I've got most of the company on board with me. You'll see a later slide where we go through the number of services that already have it, and they're the ones you would expect. The long tail of services will be coming soon.
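One way to check whether a given public endpoint will negotiate the hybrid key agreement is to offer it only that group and see whether the handshake succeeds. The sketch below shells out to openssl s_client; it assumes an OpenSSL build that recognizes the X25519MLKEM768 group name (roughly 3.5 or newer), and because the exact output format varies by version, it keys off the handshake succeeding rather than parsing specific output lines.

```python
import subprocess

def negotiates_pq(host: str, port: int = 443) -> bool:
    """Return True if the server completes a TLS 1.3 handshake when offered
    only the hybrid X25519MLKEM768 group. Assumes an OpenSSL build that
    supports that group name."""
    proc = subprocess.run(
        [
            "openssl", "s_client",
            "-connect", f"{host}:{port}",
            "-tls1_3",
            "-groups", "X25519MLKEM768",  # offer only the PQ hybrid group
            "-brief",
        ],
        input="Q\n",            # close the session right after the handshake
        capture_output=True,
        text=True,
        timeout=20,
    )
    # If the server cannot do ML-KEM, the handshake fails and s_client
    # exits with a non-zero status (behavior observed on recent OpenSSL).
    return proc.returncode == 0

if __name__ == "__main__":
    print(negotiates_pq("s3.us-east-1.amazonaws.com"))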
Timeline and Progress: From NIST Standards to AWS Service Deployment
One of the things that people have said about PQ is that it's new, it's different, it's going to break somewhere, it's going to be extra CPU, and it's going to add latency. A lot of these people who have lived through the migration of the TLS protocol from 1.0 to 1.1 to 1.2 have some battle scars. When you change the underlying protocol and all the steps required to do a handshake, you do see breakage, and a lot of old clients just aren't going to work.
But the good news is when you're taking an existing protocol like TLS 1.3, which is required for PQ, and you're just adding a new cipher suite at the top of the prioritized list, you get very little breakage. The good news here is we are seeing an insanely low amount of issue reports. If you follow Cloudflare, they're very good at blogging about their adoption of PQ. Over half of the traffic that's going through Cloudflare is now using ML-KEM with no complaints. The latency issue is a non-issue, and breakage does not appear to be happening. This is good news and will make it easier for you to convince your site reliability engineers that it's okay to upgrade to this new algorithm.
So, let me review a little timeline of what we've talked about. NIST certifies the ML-KEM and ML-DSA standards as FIPS 203 and 204 in August of last year. We release a version of our core library, AWS-LC, that supports initial implementations of ML-KEM late last year. We then deploy it in some of our core security services like KMS and ACM and Secrets Manager. We also make it available in a few flavors of our AWS SDK because we know that that's the number one client that our customers are using to talk to AWS services. KMS then launches support for ML-DSA as a native cryptographic operation. So if you want to do firmware signing using an ML-DSA key, you can now do that using the KMS sign operation. CloudFront, Amazon's CDN and outside of S3 the largest fleet in terms of accepting TLS requests, supports ML-KEM by default in September.
We then released a new version of AWS-LC that we're submitting for FIPS validation. This has optimized implementations of both ML-KEM and ML-DSA. These optimizations mean less CPU on your clients and servers to do cryptography. Even though the algorithms are more CPU intensive and more bytes on the wire, we think we have done enough optimizations to make that transition negligible for you. This brings us up to more or less today. In the past month, let me talk a little bit about the things that we have done most recently.
In the past couple of weeks, we've exposed ML-KEM PQ-TLS on the load balancing and reverse proxy services where it's your endpoint, it's your FQDN, you manage it, and we're just managing the web server on your behalf: ALB, NLB, and API Gateway. So you can now configure your endpoint to use a quantum-safe TLS security policy, as sketched below, on top of the services that have already been supporting this. We think that when you look at the most important TLS connections into AWS, as long as you're using an AWS service to terminate TLS, you're in fairly good shape. You might argue, well, I use a lot of DynamoDB. Don't worry, they're coming.
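For a listener you own, that change is a configuration update rather than a code change. A sketch, assuming a hypothetical ALB listener ARN: list the TLS security policies available in your region, pick the quantum-safe one (the exact policy names are not reproduced here and should be taken from the list), and apply it to the listener.

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Discover the TLS security policies available in this region and look for
# ones that advertise post-quantum / ML-KEM support.
for policy in elbv2.describe_ssl_policies()["SslPolicies"]:
    print(policy["Name"])

# Hypothetical values -- substitute your listener ARN and the actual
# quantum-safe policy name you found in the list above.
LISTENER_ARN = (
    "arn:aws:elasticloadbalancing:us-east-1:111122223333:"
    "listener/app/example/abc/def"
)
PQ_POLICY_NAME = "<quantum-safe-policy-from-the-list-above>"

elbv2.modify_listener(ListenerArn=LISTENER_ARN, SslPolicy=PQ_POLICY_NAME)
```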
So this is where you inventory and you say, all right, where are most of my TLS connections across the public internet to infrastructure that stores customer data. You have your own list. We have our list based on what we know about all of our customers. We're working through it on your behalf.
Private CA launched support for ML-DSA. Since KMS lets you do a raw ML-DSA signature, Private Certificate Authority said we'll take that raw signature that KMS gives us and we'll use it to issue a cert. This is a new kind of certificate. Instead of an RSA signature or an elliptic curve signature, it now has an ML-DSA signature. We could only launch this in Private Certificate Authority because there's no such thing as an ML-DSA signature that is publicly trusted, because the browsers don't yet support it.
But if you control your own client and you can control your own trust store on that client, you could issue a certificate through PCA that supports ML-DSA. Now you have quantum-safe authentication combined with the quantum-safe confidentiality of ML-KEM in your private mesh network. This is another pretty important primitive, and one of the things that we want customers to start to experiment with to understand how it affects their private mesh networks.
Are there performance issues with ML-DSA? ML-KEM is really a nothing burger when it comes to added latency and performance, but ML-DSA might create some issues because of the certificate validation process. You've got to have the right client-side code to do that. So there's going to be some experimentation. It's one of the reasons why the public browsers are taking a little bit longer to support this. They're working through those issues with this core primitive. We think that our customers can also start to experiment in this space.
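As a sketch of the raw signing primitive mentioned above: create an asymmetric KMS key with an ML-DSA key spec, then call the Sign and Verify APIs over a release artifact. The key spec and signing algorithm strings shown here are my best reading of the KMS post-quantum launch and should be checked against the current KMS API reference before use.

```python
import boto3

kms = boto3.client("kms", region_name="us-east-1")

# Assumed key spec / signing algorithm identifiers for KMS ML-DSA support --
# verify the exact enum values in the current KMS documentation.
key = kms.create_key(
    KeySpec="ML_DSA_65",
    KeyUsage="SIGN_VERIFY",
    Description="Example ML-DSA signing key for release artifacts",
)
key_id = key["KeyMetadata"]["KeyId"]

# A small release manifest; for large firmware images you would sign a
# digest or manifest rather than the raw image itself.
manifest = b"firmware=example-device,version=1.2.3,sha384=..."

signature = kms.sign(
    KeyId=key_id,
    Message=manifest,
    MessageType="RAW",
    SigningAlgorithm="ML_DSA_SHAKE_256",  # assumed ML-DSA algorithm identifier
)["Signature"]

verified = kms.verify(
    KeyId=key_id,
    Message=manifest,
    MessageType="RAW",
    Signature=signature,
    SigningAlgorithm="ML_DSA_SHAKE_256",
)["SignatureValid"]
print(verified)
```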
Prioritization Framework and the Path to Cryptographic Agility
When the public browsers announce support for some type of PQ algorithm, likely ML-DSA, then we will start issuing public certificates with this. In summary, don't just say I need a budget for PQ migration as if it's a one-shot deal. You've got to break it down. You've got to prioritize. Protecting confidentiality of data in transit is your first job. If that's all you do over the next three years, that's okay. That's winning.
Long-term roots of trust, whether it's code signing or other long-term digital signatures, maybe that's applicable to you, maybe not. With digital certificates, especially the public ones, you're going to have to wait anyway until the browsers support it and the overall infrastructure is able to handle quantum-safe signatures. One of the things that we are doing at Amazon is we're using this pressure to adopt PQ as a way to go back and say let's think about how we build pipelines that are cryptographically agile, so the next algorithm that we have to deploy is easier.
We don't have to go through a billion lines of source code and try to figure out where this arcane algorithm is being used and how to upgrade it, how to test it, how to know whether it breaks, and how to roll it back. These are all important issues. So we would strongly encourage you as you go through your migration to think about when is the next time you have to do this. Because while I asserted it was thirty years ago when we had to do it when DES was broken, I'm fairly certain that we're going to have more algorithms that we want to migrate to.
Why? Because the actual attack vector does not yet exist, and when it does, we're going to learn about other potential risks. There's a lot of work already happening, especially in the digital signature area, to try to optimize and make the digital signatures smaller and faster. So while ML-DSA is sort of the preferred implementation today, there will likely be three or four more options. You're going to see vendors adopt those. You may choose to adopt them, and that's going to be yet another process of how do I swap out an algorithm inside a mission-critical system to a new algorithm without breakage. That's really the task.
I've shared a little bit about how we think about the problem and what we're doing at AWS for undifferentiated heavy lifting. I'm going to bring up Rajeev Sharma from Vanguard to talk about how he's applying some of these concepts specifically to what Vanguard needs to do.
Vanguard's Journey: Strategic Response and the Cryptography Center of Excellence
Thank you, Ken. That was great. I am excited to be here to talk to you about Vanguard's journey to quantum safety. Here at Vanguard, our mission is to take a stand for all investors, treat them fairly, and give them the best chance of investment success. Let me give you a quick snapshot of what Vanguard is. We are a global asset manager with over 50 million investors and $11.9 trillion in assets under management today. Our internal crew, which are our employees, is around 20,000 strong. The key point here is that we operate without any physical branches, so all our interactions with our investors happen over secure, encrypted digital channels.
A few years ago, as Ken mentioned, we identified quantum computing as a material threat to that communication. Since then, we have been actively involved in mitigating that threat. Around the same time, governments have taken action. In the United States, NIST released their lattice-based encryption algorithm standards in August of 2024. In the EU, there are existing regulations today like NIS 2 and DORA, which specify that you must adhere to the strongest form of encryption for your communications, and this might include quantum safety in the future. The Canadian government has also issued their plan to implement quantum safety in the Canadian government. If you have not started your PQC migration or looked at PQC in any way, this is really the tap on the shoulder. This is the go. We are ready to go.
For technical context, Ken mentioned many Amazon services, and we are heavy users of Amazon, but we also live in an ecosystem of data centers, other cloud providers, as well as our third-party SaaS partners. Integrating within AWS as well as outside of AWS will be an important thing to cover. For today's discussion, I am focusing on what we are doing within the AWS environment to support this. How do we plan for our journey? There are really four pillars that we are looking at here. The first is around strategic response, really framing this problem as not just swapping out encryption algorithms, but as a business problem. The second part is embedding quantum risk in how we look at our enterprise risk horizon, building internal initiatives and centers of excellence. Third and fourth, we look at visibility tools to understand where all our cryptography is today, which is a hard problem to answer, and then work with our IT partners to start migrating those to the newer lattice-based algorithms.
PQC is not just a solo sport. It is not like one team in an organization is going to do this. We have many teams doing this, as well as getting executive buy-in. Turning to strategic response, you want to speak in terms of business risk and opportunity. You want to look at this problem the way we looked at resiliency or any of these other large IT risks. This is going to hit us in a regulatory compliance way at some point in the future. More importantly, our investors and our clients expect us to have the highest level of security. They are going to be looking to see that we are implementing quantum-safe cryptography on our websites. The point is to stress the urgency of early action.
The work that we have in front of us could take many years to do across a large environment. Starting now and planning now to get there well before what we call Q day, or when a quantum computer is available, is urgent. We need to start that work today. We also recommend some of these actions to executive leadership as well as the board. We have got to allocate resources. We have to allocate planning. We have got to think about things like change management.
How are we going to talk to our vendors? How are we going to talk to our 50 million investors? The second part of the journey is around the internal initiatives. This is about building the Cryptography Center of Excellence. I would say we shortened this once to crypto Center of Excellence and it got confused with cryptocurrency in the organization, so don't do that—spell it out. We now call it the Cryptography Center of Excellence, and this is a group within our organization. It's a small team, but we are starting to grow it. They're looking at this as an enterprise-wide transition. They're laying out the roadmap and the plans, organizing the IT teams, and working with our third-party risk management teams to understand the scope of this post-quantum cryptography problem.
They're also defining things like what good means and how we know that we've completed the cryptography transition. We're looking at metrics such as the percent of traffic that's TLS encrypted with ML-KEM algorithms or the number of third-party vendors and partners that we interact with that also support ML-KEM. Part of that involves this framework here: discover, prioritize, and transform. Whenever you do any kind of large transformation exercise, you need to understand what that problem is. That's what we're doing today. We started this effort about a year and a half ago to discover and prioritize all the different areas where encryption lives in our environments. Then, as Amazon services start to implement ML-KEM and other quantum-safe algorithms like Ken was talking about, we start to transform those areas and enforce ML-KEM as needed as we go through the environment.
The goal is to have a complete ML-KEM implementation throughout the environment and be quantum safe by sometime in late 2029. One of the reasons we can do this is Amazon's commitment to the shared responsibility model, where they take on the heavy lifting of building the cryptography into the services, and it's really just up to us to configure and use those services. Talking about visibility and IT collaboration, I kind of put these two together. The Cryptography Center of Excellence is framing up this problem in two dimensions. The first dimension is around posture and monitoring. Posture is about what is possible in your service and in the environment. For example, how is your CloudFront configured, or how is your load balancer configured? What are the different algorithms that are available on that service?
The second part of this dimension is monitoring. If I have a list of TLS cipher suites that are available, one of them may be PQC enabled and the rest may not be. We look on the wire for that TLS handshake, for that client hello and server hello, to understand which of those protocols actually took place. That gives us a high-level inventory of what's going on. The second dimension is around ingress and egress, and this is important because we need to understand which party is initiating that TLS connection. From an ingress standpoint, this is going to be your external browsers, for example. They're the ones initiating and offering a cipher suite to a CDN, let's say CloudFront, which also has a list of cipher suites available, and the two of them negotiate the most optimal cipher suite.
Then from the edge location, it's going to go into a load balancer. These two hops are over the Internet today, so that load balancer is exposed on the Internet. It's getting traffic from CloudFront over the Internet, and the browser is connected to CloudFront over the Internet. From a prioritization standpoint, this is one of the most important places to start our work. As of September 5th, if I get the date right, CloudFront supported ML-KEM as one of its options, and it was backwards compatible, so we instantly got the advantage of having any browser that supported that quantum-safe protocol now able to use that, and we did see somewhere between 75 to 80% of the browsers actually supporting or capable of supporting that type of connection.
The load balancers have just come recently, so the CloudFront to load balancer connection will be the next thing that we're going to start to tackle.
From the egress side, this is where our systems are initiating the connection. Think of this as your EC2 or your containers inside the VPC making an outbound connection. They have a cipher suite on their software as well. In our architecture, they go out through a proxy and then out an Internet gateway and then to a partner, whatever that partner may be. It could be a login service, an authentication service, or anything like that.
In this flow, when we're talking about ingress and egress, we're trying to understand what is configured on that compute, how that proxy or firewall is configured, and, from a third-party risk perspective, where our partners stand. We have hundreds of partners that we interact with. Which of these partners has gotten the tap on the shoulder and is actually working on a PQC implementation? This is a partnership across everybody.
When you take those two dimensions and lay them out, you get a four-box grid: ingress posture, ingress monitoring, egress posture, and egress monitoring. This is all around the crypto discovery and prioritization patterns. Each one of these boxes has a different tool or technique for how you accomplish it.
Practical Implementation: Monitoring Tools, Visibility Techniques, and Closing the Loop
Let's talk about the ingress, and I'm going to focus mainly on the Amazon services side of the house here. When you're looking at the ingress posture, when we're talking about that CloudFront distribution, we use AWS Config to understand how CloudFront is actually configured. So what is the posture of it? This is an example query, and the result shows which SSL policies are available on each distribution.
You can do this through Config, and you can do this through other CNAPP tools as well. If you have a third-party CNAPP, it can provide the same type of data. The second part of the ingress posture is from the CloudFront distribution into the load balancers, and it's a very similar thing. You can get the posture by querying Config or using a CNAPP tool to understand what's out there.
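A sketch of that posture check through the AWS Config advanced query API follows. The SQL is illustrative: the exact configuration field paths for CloudFront distributions and ELBv2 listeners should be confirmed against the Config resource schemas in your own recorder, and a CNAPP tool could return equivalent data.

```python
import boto3

config = boto3.client("config", region_name="us-east-1")

def run_query(expression: str):
    """Page through an AWS Config advanced query and return all result rows."""
    results, token = [], None
    while True:
        kwargs = {"Expression": expression, "Limit": 100}
        if token:
            kwargs["NextToken"] = token
        page = config.select_resource_config(**kwargs)
        results.extend(page.get("Results", []))
        token = page.get("NextToken")
        if not token:
            return results

# Posture of CloudFront distributions (field path is illustrative -- check
# the AWS::CloudFront::Distribution schema recorded by Config).
cloudfront_rows = run_query(
    "SELECT resourceId, configuration.distributionConfig.viewerCertificate "
    "WHERE resourceType = 'AWS::CloudFront::Distribution'"
)

# Posture of ALB/NLB listeners, including the TLS security policy in force
# (again, field names are illustrative).
listener_rows = run_query(
    "SELECT resourceId, configuration.sslPolicy, configuration.protocol "
    "WHERE resourceType = 'AWS::ElasticLoadBalancingV2::Listener'"
)

for row in cloudfront_rows + listener_rows:
    print(row)
```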
From the monitoring standpoint, what we've done is turn on CloudFront logs and deliver them to CloudWatch. From there you can query CloudWatch to understand which SSL cipher suites were actually negotiated on the wire. This gives us some idea of what cipher suite was actually used.
However, whether it was ML-KEM or not is still not in the logs today. So that's one thing that we need to figure out: how we can get the actual key exchange algorithm in here, not just the cipher suite. But what we do know, based on their user agents and things like that, is which of these browsers support the ML-KEM cipher suite. Same with the load balancers. We've used Athena to query the logs, and from there we can tell which cipher suites were used and infer which connections were actually PQC or not.
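A sketch of that load balancer monitoring query via Athena, assuming you have already created the documented ALB access-log table (here called alb_logs, in a database called logs_db); the ssl_cipher and ssl_protocol columns come from the standard ALB log format, and the database, table, and output bucket names are placeholders. A similar aggregation can be run in CloudWatch Logs Insights against CloudFront logs.

```python
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Assumes an Athena table over ALB access logs created per the AWS docs;
# add a partition filter appropriate to your table definition for large logs.
QUERY = """
SELECT ssl_protocol,
       ssl_cipher,
       count(*) AS connections
FROM alb_logs
GROUP BY ssl_protocol, ssl_cipher
ORDER BY connections DESC
"""

execution = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "logs_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes (simplified; add error handling as needed).
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)[
        "QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])
```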
From the egress side, if you remember, we have a proxy architecture today that's making calls out through an ENI out the Internet gateway into our partner infrastructure. In this model, what we're doing is we're simply capturing the logs that are coming out of that firewall to understand what cipher suite was used there.
The other part of egress posture: in some of our environments we have a transparent proxy, where the proxy or firewall is not breaking the TLS connection but passing it straight through. In that case, we have to look at the source code sitting on the compute and how it was written to understand what ciphers were used there.
There are a couple of techniques here. You can use static code analysis; there are third-party SCA tools available.
From the Post-Quantum Cryptography Alliance, there is also an open source tool called CBOMkit, which currently covers Java and Python. We've experimented a little bit with that. Another thing we're experimenting with is using something like CodeQL or GitHub Advanced Security to understand what different ciphers are actually configured in that environment.
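As a toy illustration of the static-discovery idea (not a substitute for CBOMkit, CodeQL, or a commercial SCA tool), a naive scan like the one below can at least surface where classical algorithm names appear in a repository so those spots can be triaged by hand.

```python
import re
from pathlib import Path

# Naive indicators of classical public-key crypto usage; a real inventory
# tool understands language semantics, key sizes, and transitive libraries.
PATTERNS = re.compile(
    r"\b(RSA|ECDSA|ECDH|secp256r1|secp384r1|prime256v1|Cipher\.getInstance)\b"
)

def scan(repo_root: str, suffixes=(".py", ".java", ".go", ".ts")):
    """Print file:line hits for classical crypto identifiers under repo_root."""
    for path in Path(repo_root).rglob("*"):
        if path.suffix not in suffixes or not path.is_file():
            continue
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if PATTERNS.search(line):
                print(f"{path}:{lineno}: {line.strip()}")

if __name__ == "__main__":
    scan(".")
```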
From the monitoring side, this is where we're actually capturing the TLS handshake. One thing we experimented with in monitoring was VPC mirrors. If you don't have a proxy, this is something you can turn on for a few minutes just to understand what's going on. When you look at a VPC mirror, it will actually capture the traffic off that ENI. It is still TLS encrypted, but you can see the handshake, and then you can send those logs over to another compute instance which will capture the logs and write them to a bucket or a log source.
I probably wouldn't recommend this in a live production environment because it's quite heavy, but it's something you could do to try out. It's interesting education to see how the ciphers are actually being handled. Some quick takeaways from our side: you really want to establish a cryptography center of excellence. That's really where we want to go, and then have them define quantifiable outcomes. What does it mean to succeed? How do you know that you're done? How do you know that you've gotten every piece of cryptography out there?
Then focus on building that muscle memory around your cryptographic assets, because we may need to do this again one day. Start the transformation when ML-KEM is ready. As these services in AWS start to implement quantum-safe algorithms, we can start adopting them. This is a very high-level overview of how to adopt quantum safety and crypto agility in your environment. If you want to dive deeper, there are a couple of resources here from AWS. The first one is a PQC migration plan, which is about how to actually build this roadmap. The second one is a hands-on workshop where you can experiment with some of the services and understand how to turn on PQC.
There's also a Post Quantum Hub here to get the latest news and information on PQC from AWS. Later this week on Thursday, there are two other sessions that dive deeper into more technical details around this. There's SEC404, which is actually a workshop, and SEC331, which talks more about the actual algorithms in place.
I wanted to echo something that Rajeev brought up which might be perceived as a gap. Rajeev, you talked about how do I prove that my last request was protected with ML-KEM or some quantum-safe cryptography, right?
When it comes to detecting whether or not an endpoint supports a particular algorithm, there are plenty of scanning tools that will tell you. They'll say this certificate is signed with an RSA key or an ECC key, and you'll see more and more of those tools that will say this endpoint is capable of ML-KEM and PQ. But actually showing evidence that yes, my connection from my client to this endpoint was in fact done over ML-KEM, and that means the trillionth connection will also be protected under ML-KEM—that's an important thing you want to hear.
One of the things we're doing is, if it's an AWS service or our endpoint, a managed endpoint where you're just going to get whatever crypto we give you, we're in the process of adding a field inside the CloudTrail log called TLS details, and it will very specifically include the string that includes ML-KEM.
X25519MLKEM768 is a long, complex string, and the presence of it means good. The absence of it might mean bad, depending on what you're looking for. The thing you're going to want to do is configure your application stack to emit its logs to something like CloudWatch. You've got to get your application stack to emit the right thing to CloudWatch. This could be under your control or might not be, depending on whether you've delegated to some other SaaS provider or software provider that you're running inside your EC2 instance. But it's a critical part of this because it brings things full circle.
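A sketch of what closing that loop could look like once the field carries the negotiated group: pull recent management events for a service endpoint from CloudTrail and check the tlsDetails block for an ML-KEM indicator. CloudTrail already records a tlsDetails section with the TLS version and cipher suite; exactly where the hybrid-group string will surface is based on the plan described above, so treat that part as an assumption.

```python
import json
import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

# Look at recent KMS management events as an example; any service whose API
# calls land in CloudTrail could be checked the same way.
events = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventSource", "AttributeValue": "kms.amazonaws.com"}
    ],
    MaxResults=50,
)["Events"]

for event in events:
    record = json.loads(event["CloudTrailEvent"])
    tls = record.get("tlsDetails", {})
    # Assumption: the negotiated hybrid group (a string containing "MLKEM")
    # will be exposed somewhere inside tlsDetails, per the plan in the talk.
    marker = "PQ" if "MLKEM" in json.dumps(tls).upper() else "classical/unknown"
    print(record["eventName"], tls.get("tlsVersion"), tls.get("cipherSuite"), marker)
```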
At some point, you're going to be asked by an auditor or regulator to prove that you're using ML-KEM. It's probably going to be somebody representing a US government customer you might be trying to sell to. Being able to point to the CloudTrail log and say there it is settles the issue. But it's definitely an area that we need to work on throughout 2026 inside our managed services. Every ISV partner that we work with at AWS, we're reminding them and trying to get them to be out in front of it. I thought it was worth coming up and reiterating that because it's a very important part of closing the circle.
I think we're done, and we've got a 30-minute break. I think Rajeev and I will hang out up here if anybody has any follow-up questions. Thanks a lot for your time.
; This article is entirely auto-generated using Amazon Bedrock.