DEV Community

Kazuya


AWS re:Invent 2025 - Last HSM Standing: PayPal POS’s move to AWS Payment Cryptography (SEC215)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Last HSM Standing: PayPal POS’s move to AWS Payment Cryptography (SEC215)

In this video, Mark Cline from AWS and Philip Witty from PayPal POS discuss how PayPal POS eliminated physical HSMs by migrating to AWS Payment Cryptography. Philip shares their journey from operating three data centers with redundant HSM setups to a fully cloud-based solution. Key highlights include achieving PCI PIN and P2PE compliance, using shadowing techniques to validate the migration with production traffic, and successfully switching to AWS Payment Cryptography during an unexpected data center outage with zero customer impact. The migration reduced operational burden, eliminated data center costs for new regions, simplified developer workflows by replacing complex HSM interfaces with REST APIs, and transformed the team from infrastructure-focused to product-focused. Philip emphasizes that 99%+ of traffic now bypasses HSMs, with significant improvements in resilience and audit scope reduction.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part


Introducing AWS Payment Cryptography: A Cloud-Based Alternative to Physical HSMs

Good morning, everyone. My name is Mark Cline. I am a Principal Product Manager here at AWS in charge of Payment Cryptography, and I'm joined by Philip Witty from PayPal POS. We're here to talk to you about PayPal POS's experience eliminating their HSMs and moving fully to the cloud for their payment processing.


Let me provide a quick introduction to AWS Payment Cryptography. This is our cloud-based solution for securely processing card transactions, replacing the physical hardware security modules or HSMs that customers traditionally had in their data centers. AWS Payment Cryptography is used for use cases like card issuing, transaction acquiring, which Philip will be talking about, as well as issuer transaction processing. It provides many, if not all, of the same functions that traditional payment HSMs provide, such as translating PCI PIN data using DUKPT, decrypting P2PE data, managing keys, and importing and exporting keys. For transaction processing, it handles generating CVV-2s and validating ARQCs. This is presented as an API-based service, so it is all RESTful APIs. It is an elastic service where you can get started in just a minute or two, so there is no need to provision hardware. It is integrated into many AWS tools such as AWS IAM, Amazon CloudWatch for monitoring, as well as AWS CloudTrail.


Before Philip talks about PayPal's experience, here are some of the benefits of the service. It removes your dependency on dedicated payment HSMs, allowing you to migrate your critical payment workloads. You can reduce your operational burden by using a fully managed service that is PCI PIN, PCI P2PE, PCI 3DS, and PCI DSS compliant. You get high throughput and low latency, since this is a completely elastic service that scales up and down automatically based on your usage and your customers' usage. Setup is also simpler since, as I mentioned, there is no need to provision or buy hardware. You can literally start using the service in a minute or two. Now let me turn it over to Philip, who will tell us about PayPal POS's experience using Payment Cryptography.


PayPal POS's Journey: From Data Center HSMs to AWS Payment Cryptography

PayPal POS was originally founded as Zettle in 2010, a small Swedish startup primarily making card terminals that you can attach to your mobile phone so you can quickly start taking payments. Our typical merchants are ice cream shops, beach shops, and all kinds of seasonal businesses. Our market is very seasonal and heavily tied to summer weather, and all of our transactions are in-person.


This is our offering: a very simple little reader that takes PIN transactions. That is why we need Payment Cryptography. Where are we live? We operate in Europe and in North America, including Mexico, and as a result we wanted a global presence for our data centers.


We have been running data centers in three different locations. Here we can see what we have. We try to have a fully redundant setup within each data center, so we have a couple of switches coming in with two fiber lines going to two HSMs. In theory, we can lose one HSM and one switch in each region without losing service.


This is what we have in every one of our regions, and we can also fail over between our regions. We are a very lean team, a very small team, but we have very high vertical ownership. My team owns the incoming payment gateway, the acquirer service, the crypto service, which was our interface to talk to the HSMs, and we own the data center and the setup there. So why were we looking for a new solution?


I want to go back to the lean team aspect. I remember on my first day when I started, I suddenly met my new team, and they said, "Hey, this person has given their notice and they're leaving in a few months. This person is giving their notice in a few months." Suddenly I was alone on a team, which is a product team, but we own our data centers. We spend a lot of time on infrastructure and compliance. We were also struggling because we wanted to roll out to new regions, and spinning up a data center is really expensive.


Where did we look at going? We considered a few different options. One was getting HSMs hosted by external companies so that we did less of the work. The other, which came along at the right time, was AWS Payment Cryptography. It fit all our requirements, and we are a company that is already fully committed to managed infrastructure. We are using ECS Fargate, for example, and elastic load balancers. For us, it seemed like the perfect solution: we can spin it up with zero work required and just start sending requests.


So what did we plan? The first thing, which was a very boring blocker, is that before we could actually do any work, we had to get audit approval. This was because we are doing PCI P2PE, which is for card reader terminals you ship out to merchants. It reduces the audit scope if those terminals are under PCI P2PE. But that audit looks at your current environment, so we ended up having to get a delta audit done before we could even start deploying anything to production. We also found that, since it was a very new service, we were learning together with both PayPal internally, which had to approve the service, and our auditors.


The next step, once we could roll that out, was what we called shadowing. This is where we mirrored traffic to both setups so we could see how AWS Payment Cryptography performed before we switched anything. We planned to do this as a gradual rollout, again because this project had a large number of external dependencies. For example, within our cryptography setup, we have keys for the incoming terminals and keys for the outgoing acquirers. In some cases we had to exchange new keys, or we had to check that the acquirers would support any changes in our encryption methods. So it ended up being a little staggered, bit by bit. The final step was promotion: the end goal was to remove our HSMs from our transaction flow so that we no longer think about them. I no longer have to think about audits, or go through every quarter checking that the configurations haven't changed and sending my shares off to the auditors.


So how did we try to do the shadowing? As with the previous flow, we had a request coming into our payment gateway, and here we would either send it off to the HSM via our crypto interface, or we'd send it to AWS Payment Cryptography. Our original plan was to keep our little crypto service as the abstraction layer between the HSMs and AWS Payment Cryptography, but we quickly found we could just drop this entire service from the flow. It was necessary when talking to the HSMs, which have binary wire protocols and require you to deal with connection pooling, but now we were just calling an AWS API. It was no different to doing an S3 PutObject or a DynamoDB call. So we realized we could move that whole crypto work into our payment gateway service and knock out an entire middleman.
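As a rough illustration of what "just calling an AWS API" could look like, the sketch below builds a PIN translation request for the TranslatePinData operation of the AWS Payment Cryptography data plane. The key ARNs, the key serial number, the choice of ISO format 0 on both sides, and the helper names are assumptions made for illustration; the talk does not describe PayPal POS's actual key setup or code.

```python
from typing import Any, Protocol


class PinTranslator(Protocol):
    """Anything exposing translate_pin_data, e.g. a boto3 data-plane client."""

    def translate_pin_data(self, **kwargs: Any) -> dict[str, Any]: ...


def build_translate_pin_request(
    encrypted_pin_block: str,
    pan: str,
    terminal_key_arn: str,   # hypothetical: DUKPT base key shared with the terminal
    acquirer_key_arn: str,   # hypothetical: PIN encryption key shared with the acquirer
    key_serial_number: str,  # KSN sent by the terminal with the PIN block
) -> dict[str, Any]:
    """Build the keyword arguments for a TranslatePinData call."""
    return {
        "IncomingKeyIdentifier": terminal_key_arn,
        "OutgoingKeyIdentifier": acquirer_key_arn,
        # Assumption: ISO format 0 PIN blocks on both sides.
        "IncomingTranslationAttributes": {"IsoFormat0": {"PrimaryAccountNumber": pan}},
        "OutgoingTranslationAttributes": {"IsoFormat0": {"PrimaryAccountNumber": pan}},
        "EncryptedPinBlock": encrypted_pin_block,
        "IncomingDukptAttributes": {"KeySerialNumber": key_serial_number},
    }


def translate_pin(client: PinTranslator, **fields: str) -> str:
    """Translate a terminal-encrypted PIN block into an acquirer-encrypted one."""
    response = client.translate_pin_data(**build_translate_pin_request(**fields))
    return response["PinBlock"]
```

With boto3, the client would typically be created with `boto3.client("payment-cryptography-data")`. Passing the client in as a parameter keeps the gateway code testable with a stub and leaves credentials, region, and connection settings to the caller.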


For the shadowing, we would always send our request to the HSM, and that is always the response we would use. Then, asynchronously, we would send it off to AWS Payment Cryptography and record the response. Since a lot of these cryptography operations are idempotent, you always get the same result, so we can actually compare the binary responses. We can measure both the correctness and the response times with our full production traffic. I want to talk about how we were measuring this. As I said before, we would send every request off to our metrics, and every morning I'd wake up and this would be the dashboard I'd look at. You can see that we have result discrepancies, for example, which showed whether we were getting a different response when doing a request against our HSM versus AWS Payment Cryptography. In this case it was a small bug on the AWS side, but we had that close relationship with AWS, so we could say, "Hey, we're seeing this," and they could fix it. We could get full confidence here without any transaction impact.
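The shadowing pattern described here can be sketched in a few lines. This is a minimal illustration rather than PayPal POS's implementation: the class and field names are hypothetical, the two backends are plain callables standing in for the HSM and AWS Payment Cryptography, and it assumes the operation is deterministic so that the two byte responses can be compared directly.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional

CryptoCall = Callable[[bytes], bytes]


@dataclass
class ShadowResult:
    operation: str
    matched: bool
    primary_ms: float
    shadow_ms: Optional[float]
    shadow_error: Optional[str] = None


class ShadowingClient:
    """Serve every request from the primary; mirror it to the shadow off the hot path."""

    def __init__(self, primary: CryptoCall, shadow: CryptoCall,
                 record: Callable[[ShadowResult], None], max_workers: int = 4) -> None:
        self._primary = primary
        self._shadow = shadow
        self._record = record
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def call(self, operation: str, request: bytes) -> bytes:
        start = time.perf_counter()
        response = self._primary(request)  # the only response the caller ever sees
        primary_ms = (time.perf_counter() - start) * 1000
        # Fire and forget: the shadow call must never delay or fail the transaction.
        self._pool.submit(self._compare, operation, request, response, primary_ms)
        return response

    def _compare(self, operation: str, request: bytes,
                 primary_response: bytes, primary_ms: float) -> None:
        start = time.perf_counter()
        try:
            shadow_response = self._shadow(request)
        except Exception as exc:  # shadow failures are recorded, not raised
            self._record(ShadowResult(operation, False, primary_ms, None, repr(exc)))
            return
        shadow_ms = (time.perf_counter() - start) * 1000
        matched = shadow_response == primary_response
        self._record(ShadowResult(operation, matched, primary_ms, shadow_ms))

    def close(self) -> None:
        """Wait for in-flight shadow calls to finish."""
        self._pool.shutdown(wait=True)
```

The `record` callback is where results would be pushed to a metrics system, so that discrepancies and per-backend response times end up on a dashboard like the one described in the talk.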


A lot of what we learned was very similar to other learnings from moving from on-premises to the cloud. Suddenly we had a lot less control over our latency. Here we can see APC, the blue line, which has this very peaky response time. During the nights, when we had lower traffic, we found it would increase, and I think that's because there was some kind of cold start time. But this was when we had just started shadowing, which we rolled out back in March. Six months later we could see that, for example, the translate PIN duration had dropped lower, because we're doing a lot more requests and we're keeping a lot more on top of things. We also had skipped operations, and this is what I was mentioning: we couldn't just roll this out in one big bang, and we didn't want to. We wanted to roll out gradually and get as much data going through this as possible without having to wait for everything.
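A comparison like the one on that latency chart can be reduced to a few percentiles per backend. The sketch below is only an illustration: the function names and the choice of p50, p95, and max are assumptions, not the metrics PayPal POS actually charted.

```python
from statistics import quantiles


def summarize_latency(samples_ms: list[float]) -> dict[str, float]:
    """Return p50, p95, and max for a list of response times in milliseconds.

    Needs at least two samples; the "inclusive" method keeps every
    percentile within the observed min..max range.
    """
    cuts = quantiles(samples_ms, n=100, method="inclusive")  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "max": max(samples_ms)}


def compare_backends(samples: dict[str, list[float]]) -> dict[str, dict[str, float]]:
    """Summarize each backend (e.g. "hsm" and "apc") separately."""
    return {name: summarize_latency(values) for name, values in samples.items()}
```

Looking at p95 and max alongside the median is what makes a "peaky" profile visible: occasional slow calls, such as the suspected cold starts at night, barely move the median.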


So here we had to skip some operations because they were supported by our HSM setup but not yet by our AWS Payment Cryptography setup, either because keys had not been exchanged yet or for some other reason. Originally we were planning on shadowing for three to six months, I think, because ultimately we had the data center contracts and couldn't just switch over as soon as we wanted to. We'd still be paying for them, so we wanted to keep both running in parallel for a long time to give us real confidence. But it didn't work out that way. After one or two months of shadowing, I think, we were doing routine patching in one of our data centers in London and we had an issue: one of our switches went down, and that took the other switch down. Suddenly we had to shift traffic over to another data center for our HSMs. Then, for the first time, we could actually help with this from our office in Stockholm, alongside the people on the ground. They were in the data center trying to fix the switch, and here we were in Stockholm thinking, maybe we can just skip this whole operation and bypass the data center entirely.

Suddenly, overnight, we shifted everything to AWS Payment Cryptography rather than to HSMs. It was really nice that we could feel useful and achieve something, and since we had been shadowing for so long, we had the confidence to do it. Everything suddenly changed: no more HSMs, no more data center involvement, and zero customer impact, which was the real pleasure. We avoided an incident. Even though we had redundancy, we would have been running with reduced redundancy if we hadn't managed to swap over. For our customers, we're really focused on reliability. Since we have a lot of small and medium businesses, they can't handle a day of downtime. If they miss a day's transactions selling their things at the Christmas market, there's a good chance they won't survive, so we're really focused on this. It was of the utmost importance to us that we didn't start losing transactions.
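One way to picture the gradual rollout, the skipped operations, and the overnight switch is a per-operation mode table. The sketch below is a guess at the shape of such a mechanism, not PayPal POS's code: the mode names, the `Rollout` class, and the `route` helper are all hypothetical.

```python
from enum import Enum
from typing import Callable

CryptoCall = Callable[[bytes], bytes]


class Mode(Enum):
    HSM_ONLY = "hsm_only"  # skipped: not yet supported on the cloud side (e.g. keys not exchanged)
    SHADOW = "shadow"      # HSM answers; the cloud response is only recorded
    PROMOTED = "promoted"  # cloud service answers; the HSM is no longer called


class Rollout:
    """Tracks which mode each crypto operation is currently in."""

    def __init__(self, modes: dict[str, Mode]) -> None:
        self._modes = dict(modes)

    def mode_for(self, operation: str) -> Mode:
        # Unknown operations stay on the HSM until explicitly enabled.
        return self._modes.get(operation, Mode.HSM_ONLY)

    def promote_shadowed(self) -> list[str]:
        """Flip every shadowed operation to PROMOTED; HSM_ONLY ones are left alone."""
        promoted = sorted(op for op, mode in self._modes.items() if mode is Mode.SHADOW)
        for op in promoted:
            self._modes[op] = Mode.PROMOTED
        return promoted


def route(rollout: Rollout, operation: str, request: bytes, hsm: CryptoCall,
          cloud: CryptoCall, shadow: Callable[[str, bytes, bytes], None]) -> bytes:
    """Send one request to the backend its current mode dictates."""
    mode = rollout.mode_for(operation)
    if mode is Mode.PROMOTED:
        return cloud(request)
    response = hsm(request)
    if mode is Mode.SHADOW:
        shadow(operation, request, response)  # compare asynchronously elsewhere
    return response
```

In this picture, the switch during the outage amounts to calling `promote_shadowed()` once: every operation that had already been validated by shadowing moves to the cloud service, while anything still marked `HSM_ONLY` keeps its existing path.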


Transformative Benefits: Enhanced Developer Ownership and Product Focus

There were difficult parts. Although it's cryptography and I thought it was unique and special, I realized all we're doing is a very standard migration from on-premises to cloud. It's work. You hit up all these old flows, you end up having to touch all parts of the code base, even bits which you're questioning whether they're still used, but you have to touch them anyway. It's just work, a classic migration. It was a very new AWS service. We had teething issues, but in return we also had a close relationship with AWS, which was nice.


Within PayPal, I had to say, "Hey, we want to use this new service. Yes, I know it's cryptography in the cloud and you don't believe it exists, but it does." It's the same with our auditors. We ended up sending them to AWS to have a meeting and figure out more about how AWS Payment Cryptography works.


As much as I'd like to say that we have no HSMs anymore, we do still have some long-tail flows, which we're working on in two ways: by changing how we do things, and by getting help with new features on the AWS side to get rid of them. We are at 99.9, maybe 99.1% of traffic not going through our HSM. For now we have kept one HSM in our secure room for handling incoming key components. The fact that this is no longer on my plate is such a relief. It was mine to own before, and this is what I enjoy.


I spend a lot of my time thinking about PCI compliance. I'm very heavily involved in the yearly audits for PCI DSS, then you've got every two years your PCI P2PE, and then every three years for PIN. A huge amount of those audits has been focused on the HSM. It's non-negotiable: we had to keep that compliance. When we talked to PayPal, they didn't believe this service would actually be capable of doing this, or that it would be PCI PIN compliant.


I've struggled with how to phrase this, but HSMs have very old-fashioned interfaces with very precise commands. They were clearly made for specific payment features that needed exact commands. Now we've moved to AWS Payment Cryptography, and instead of sending a translate PIN command, which corresponds to a couple of digits followed by an exclamation mark and then a code block, we're using a JSON API called Translate PIN. We found that we are no longer designing our features around what our HSMs can support. We don't have to tell our third parties, "Hey, you want to do this, but we can't do that, we need to check with the HSM." I now just take it for granted that we have the features in AWS Payment Cryptography, that it is generic, and that you can do transactions from one type of key to another type of key.


There's increased resilience because, ultimately, I was never a person who should have been in charge of switches in a data center, but I was. Moving it to AWS has brought it up to the same standard as all of our other services, and it's just a load off our mind. The really important thing here is launching a new region. We were in three regions: just one in the US and a couple in Europe. But now that we're expanding more in the US and becoming a greater part of the PayPal offering, we want to spin up another one in the US. Previously that was tens of thousands of dollars. We'd have to rent the rack, get the fiber lines set up, buy hardware, and buy licenses for that hardware. Now, when we're moving into a new region, AWS Payment Cryptography is the only bit which doesn't cost more money. We might have to spend more money on compute, load balancers, network, and storage, but AWS Payment Cryptography has a per-request model. We are just shifting traffic from one region to another, so it is genuinely free to launch a new region. It's really nice that my team has gone from being the blocker to being the enabler.


Developer ownership is for me the biggest win and the most unexpected one. Although we're a bigger product team with very high vertical ownership, within that team there have always been some crypto people, and it's been our responsibility to handle any changes to our abstraction layer, because other people are terrified of crypto. It isn't something they feel they can touch, and they don't want to understand it. The HSM interface was complex enough that it didn't feel like you could understand it. But now, it's just like calling S3 or DynamoDB. It's an AWS API. Developers are touching it, thinking about the performance, and considering whether they need more connection pooling, because it's something they're familiar with. It's no longer a big unknown.


Regarding product focus, we were a small team with data centers. We had three data centers around the world, and we would have to fly around to them to do patching, sometimes doing it badly. We owned a lot of infrastructure and spent considerable time on it. Now my job has transformed. Since I started two years ago at PayPal, my job has shifted from being infrastructure focused to being product focused.


We also achieved a reduced audit scope. This ties in with being more product focused. Every single time we had an on-site audit, we would have to go through all of our configurations and get them signed off, specifying which commands we had enabled. Now this has almost entirely shifted over to AWS, and we are simply a user of the service.

Thank you so much, Philip, for sharing that. Let me provide a quick summary of what we discussed today. Thank you for sharing about PayPal's experience getting rid of one of your last dependencies on your physical data centers. I think it was great that you were able to maintain your PCI PIN and PCI P2PE certification the entire time. You gave a good example of going into a new region at effectively no cost, allowing you to easily expand into new regions. We're in twelve regions today with more in the future.

You mentioned simplifying infrastructure by getting rid of hundreds of lines of code and removing your middleware. Your developers can now use this directly because it's a REST API. Thank you for this great partnership. You've been live on the service for a little over a year now. As you saw in those before and after graphs, not only was PayPal happy with the performance when they originally joined, but we've actually improved performance and stability since then.

Thank you to Philip for coming here and sharing PayPal's story. Thank you to the audience today. If anyone has questions or wants to talk about their experience, Philip and I will be outside after the presentation. Thank you very much, everybody.


This article is entirely auto-generated using Amazon Bedrock.
