🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.
Overview
📖 AWS re:Invent 2025 - Ticketmaster: Enhancing live event experiences for fans with AWS (SPF206)
In this video, Ticketmaster's Mike Fuller explains how AWS Local Zones reduced latency from 60ms to under 2ms for their Phoenix data center, enabling real-time interactive seat maps and ticket recommendations for 600 million annual ticket sales. The session covers Ticketmaster's 10-year AWS evolution, the physics of achieving sub-millisecond latency, and specific architectural best practices including Direct Connect Gateway configuration, VPC design per Local Zone, and firewall placement strategies. Mike demonstrates how Local Zones enable hybrid cloud bursting for on-sale events and discusses future plans with EKS Hybrid Nodes for workload placement flexibility across 35 global Local Zone locations.
This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
Introduction: Speakers and Session Overview
Hello, hola, bonjour, salam alaikum, marhaban, namaste. Welcome to AWS re:Invent and welcome to SPF206, Ticketmaster: Enhancing live event experiences for fans with AWS. Thank you for joining us today in person. For those of you watching this recording in a different time and space, thank you for clicking play. I'm extremely delighted to be here with you. My name is Susmitha Marupaka, and I lead the go-to-market and strategy for edge computing and inferencing.
It is an absolute honor to introduce my dear friend and colleague Sylvia. Hey, nice meeting everyone in person and online. I am Sylvia Lu. I'm a product manager in AWS EC2 Local Zones, and I work on the Local Zones expansion and enabling customer use cases like Ticketmaster. With that being said, I'll hand it over to Mike, our friend from Ticketmaster.
Thank you, Sylvia. We saved the best for last. We have with us a very special guest, Mike from Ticketmaster. Hi, hello everybody. Thank you very much. Yeah, I'm Mike Fuller. I'm an accomplished architect at Ticketmaster, specifically with our Infrastructure Architecture Services team. And what does that mean? We basically help all of our infrastructure service teams, Kubernetes, AWS, our on-premises services, DNS, all that sort of thing, design best practices and consistent architectures that we can use to expand and improve our overall capabilities for our developers and for our customers on the outside. My particular focus is AWS. That's why I'm here for you all today.
Now that we've got the introductions done and the recording is underway, we can all go home. Yes, I'm just kidding. Well, we are here today to spend the next hour first by level-setting expectations about Ticketmaster and their evolution and journey with AWS, followed by why latency is important for improving fan experiences at Ticketmaster. Then Sylvia, my colleague, will walk you through the Local Zones overview and pass it back to Mike to cover the do's and don'ts, best practices, and forward-looking strategies. Whether you're in person with us or joining online, you're in for a cool ride.
Before we get started, a quick show of hands. How many of you have ever booked a ticket on Ticketmaster for your favorite event or concert? Quite a few of you, Mike. All right, without further ado, the stage is yours, Mike.
Ticketmaster's Scale, Global Presence, and AWS Journey
Perfect, thank you very much. So yeah, good thing that most of you have used Ticketmaster. That makes this a lot easier. I don't have to give too much background, but just wanted to start with the basics. We are a live marketplace for events and sports, concerts, anything like that. We try and connect our fans, which is what we refer to our customers as, to be able to attend these unique one-time live experiences. Obviously, if you go live, you're not going to have that same experience again, so hopefully it's a nice life memory for you.
But I want to go over some history about Ticketmaster. It's hard to hear from you guys, so I'm just going to make this rhetorical, but how old would you think Ticketmaster is? Many people probably don't realize we've been around quite a long time. We actually were founded the same year Apple was, so 1976, so we're almost 50 years old. Next year will be our 50th anniversary, I guess, which is pretty cool. But over that time, we've obviously seen lots of different technology changes, new capabilities, new solutions to our current state we are today with AI and everything going on.
Each year we sell over 600 million tickets, so that's almost 2 million a day, which is quite impressive, and we have over 2 billion website visits a year as well. What most people may not realize, in addition to selling tickets to customers, we also handle the entry side. So when you go to an event, you might have your digital barcode. It's being scanned, we do validation, make sure it's not a fraudulent ticket, things like that. So we have lots of services and capabilities that have to provide these for customers.
On top of that, we have a B2B business as well to help, whether it's an artist or a venue that has to plan events, organize a concert, maybe a schedule, price tickets and seats and all those things. We have to provide capabilities for that as well, so it's not just selling tickets to our fans. And we started in the US, but now we're in over 35 countries. We also have a global support staff to align with that, so we're a very large distributed company.
So why am I actually mentioning all this?
Just to give you some background, we're a very large, very diverse infrastructure that's gone through many different iterations and changes over the years to adapt to new technologies, new capabilities, and new priorities that we have to pursue.
All right, so let's get a little closer to the actual Local Zones discussion. Our journey into AWS started roughly 10 years ago. We started the same as probably many of you did if you were an enterprise at that time, using basic compute, storage, and network. We probably had a handful of accounts back then to where we are today, where we have hundreds of accounts, hundreds of VPCs, and hundreds of AWS services. So we've sort of evolved with AWS as they grew, and we've grown alongside them. A key thing though is we have many systems and services that can't necessarily move to the cloud, so we're not a fully cloud organization. We're very strongly hybrid and have a very well-set balance of what's on-premises and what's in AWS, and this is where Local Zones is really going to come into play for us.
All right, so why did we choose AWS? There are many different reasons. These aren't necessarily our primary reasons we chose it, but they're the ones that make the most sense for our discussion today. The big one up front is global footprint. I mentioned we're a global organization. We have actual Ticketmaster data centers for on-premises systems all over the world, and we needed a cloud provider that could align with those locations. We don't want our data center hundreds or thousands of miles away from the cloud provider, because that's going to present other issues. So AWS, with their large number of regions, is able to closely align with where our data centers are, which provides a good benefit that we're going to talk about in regards to latency.
APIs and services, if you've used AWS, you know they're well known for very clear, very usable APIs that are well documented. This is key for us because of the automation we like to use, infrastructure as code, and all the components that are necessary for us to sort of quickly respond to changes in our environment. Maybe we have big concerts going on sale, we need to scale up and scale down rapidly, things that we need to be able to do in an automated fashion to match that speed. And last but most definitely not least is rapid development and prototyping. We could build and test everything on-premises, but that's a lot harder or a lot more upfront investment to buy, say, racks of servers to do a, hey, will this work sort of experiment. We can use AWS, build something quickly, and maybe it works, maybe it doesn't. If it works, great. AWS scales fantastically. If it doesn't, we tear it down. We lost a little bit of money, but it's not a huge upfront investment. So that's a huge capability when we're moving into testing and trying new things as we move forward.
The Critical Role of Latency in Fan Experience
Thank you, Mike. Before we talk about why latency is important, I want to give you a real-world sense of what a millisecond means. The blink of an eye is about 100 milliseconds. We are talking about a single millisecond here. That means Mike and his team have to complete Ticketmaster workflows 100 times faster than the blink of an eye. Mike, could you please explain why latency is important for enhancing the Ticketmaster user experience?
Yes, of course. So many of us have probably interacted with some website where you're trying to search or query something, and the response takes a considerable amount of time or things don't load properly. We get frustrated, probably leave the website, go somewhere else, and maybe make a purchase on another site. These issues can come from many things, but latency is often a contributor. So for example, we're just going to tell a story here. Within our Ticketmaster website, if you search for a specific event, you'll get brought to our event details page. This is where you can find out all the info on the actual location. So for example, the Sphere here, which is a fantastic venue if you've not been, you can see a list of all the tickets available, which can be either primary tickets, meaning the venue itself is selling them, or resale tickets, meaning maybe somebody got sick, can't make the event, so they can sell it to somebody else.
We also can provide seat recommendations. Maybe you want the best seat, maybe you want the lowest cost seat, or even better, you could search and find the exact seat you want with our interactive seat maps. You can pinch and zoom if you're on a mobile device, scroll down and say, hey, I want an exit row, I want the very front row, middle seat, whatever the case may be. You can sort of visually move around this. And all of these different interactions require different services and microservices and database communications, all having different latency requirements, and each call possibly resulting in different time to make that call and respond to the request.
As I mentioned already, we have databases, applications, and multiple services all around the world. Sometimes these systems that have to talk just for this event details page aren't sitting right next to each other. Maybe something is in AWS, something is on-premises, and we have to be able to optimize our network to handle those types of scenarios.
I see two operative words here: one, real-time, and two, interactive. That raises two design questions, or challenges. One is the interdependency of the microservices, applications, and databases that need an up-to-date inventory before providing recommendations, live seat maps, or ticket management. Mike, are those calls serialized or parallel? If they are serialized, then how are you working on latency to improve the fan experience?
You are absolutely correct. They have to be serial for our use case. You can't display a seat map or display recommendations without knowing which tickets and seats are available, right? So there are certain interactions here that have to go in specific orders, and this is where the latency comes into play. Say we had these multi-service calls and each was 50 milliseconds, 100 milliseconds, just throwing out completely random numbers right now, those would add up. If it's 50 milliseconds plus 50, you might not notice it initially, but the more calls and services that have to complete before you can present that page back to the user, the more it adds up, which is what we're calling compounding latency. That's what goes back to what I said before about a really degrading user experience, where it just gets frustrating and you leave the page.
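To make that compounding effect concrete, here is a minimal sketch in Python. The call names and per-call latencies are made up for illustration, not Ticketmaster's actual services or numbers; the point is simply that serialized calls sum.

```python
# Hypothetical per-call latencies (ms) for building an event details page.
# These are illustrative figures, not real Ticketmaster measurements.
serial_calls_ms = {
    "lookup_event": 50,          # event details
    "fetch_inventory": 50,       # available primary/resale tickets
    "rank_recommendations": 50,  # best seat / lowest cost suggestions
    "load_seat_map": 50,         # interactive seat map data
}

def page_latency(per_call_ms):
    """Serialized calls compound: the page can't render until every call has returned."""
    return sum(per_call_ms.values())

print(page_latency(serial_calls_ms))                  # 200 ms when each hop crosses a long-haul link
print(page_latency({k: 2 for k in serial_calls_ms}))  # 8 ms when each hop is under 2 ms
```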
So we really have two options. We could leave the website like that, but that's not really an option because we don't want to have a bad user experience. Or we could limit site capabilities. We don't really want to do that either, but it's better to say if latency was bad, not have some of these fully interactive components versus have it there and only maybe a handful of users ever get it to work properly.
To solve this problem, we've had to find ways to strategically place applications, databases, web servers, and all the different components that make up our system into different locations to sort of play like a game board or a chessboard, moving them to where they're needed at any one time. Maybe it needs to be near your customer, maybe it needs to be near a specific database, but that's really our goal: to find ways to optimize so that our developers building these services can pick the solution that works best for them on our network.
Evolution of Infrastructure Architecture: From On-Premises to Local Zones
So Mike, while designing the infrastructure architecture for Ticketmaster, you might have had multiple strategies over the years, but in the interest of time, are there any three key strategies which you want to talk about in your evolution?
Yeah, sure. So what I want to talk about is basically, obviously, as a 50-year-old company, we've gone through many, many different infrastructure architecture designs and network designs. So we're just going to do something somewhat generic but actually representative of what we did, but keep it at a high level as well since this is only a 200-level course.
So we'll start with before 10 years ago, before we moved to AWS. Like I mentioned, we have data centers located all around the country in the US, all around the world, and we have to have them be able to communicate, obviously interconnecting them for this conversation. Moving forward, I'm going to focus on the US just to keep this a little bit simpler. But say in the US we have a West Coast and an East Coast: East Coast in Northern Virginia, West Coast in Phoenix, Arizona. We'll have application stacks. Maybe they all reside in the same location, so maybe we have our web app and databases all in Phoenix and they're just talking to each other. Maybe we have them in multiple racks in case there's some failures. That's obviously low latency and not an issue.
But there are those issues or those times where maybe we have to have a primary database in one location, maybe that's Virginia, and then maybe we have an application server in Phoenix that needs to communicate with it. That's where we run into the issue of, I guess, physics, where they can only talk so quickly when they're 2,000 or so miles apart, right? So we on average would encounter 60 millisecond latency over that connection, which is not ideal for this compounding latency component that I mentioned before, but we were able to work around it at the time and make it work for our needs.
Then early cloud. So starting about 10 years ago, which is fairly recent for us, we went into AWS. We had to pick a region. On the East Coast it's really easy: there's the Northern Virginia Ticketmaster data center, there's Northern Virginia AWS, with single-digit-millisecond latency in between, so it's not really interesting to talk about today. But there is no AWS region in Phoenix, so this presented a challenge. We had to pick and choose: okay, do we go with Northern California, do we go with Portland, Oregon? Neither is necessarily very close to Phoenix, right? But we did end up choosing Oregon.
We ended up choosing Oregon because at the time that was what was available. We stood up a Direct Connect, our VPCs, all those interactive components, and we were able to test it. Now we have 30 to 40 millisecond latency because it's about half the distance to go to Portland from Phoenix than it is to go to Ashburn or Northern Virginia. So it's getting better, still not as good as it could be. It's not nearly the same as if they were on the same physical data center. So we have a 60 millisecond option and now we have a 40 millisecond option.
About a year ago we started working with AWS on Local Zones, so we have systems that require really low latency and we have databases and maybe other components that need to live on-premises. So we need to find a way to still use some AWS resources but still communicate with those on-premises sites. We're presented with Local Zones where there is a Local Zone in Phoenix, not too far from our data center, which allows us to, once we put another Direct Connect in place and route over that traffic, do that in under 2 milliseconds.
So if you look at this, you see three different sort of connection speeds, and they can all serve different purposes. We still have plenty of use cases where everything should be on-premises, plenty where we want to use a parent region and maybe take advantage of managed services or some components that are only available there. But now we also have Local Zones. So this is sort of that chessboard, those puzzle pieces that we want to move around to meet different needs for whatever our products are. Given that very rough intro to Local Zones, let's pass it over to the experts who can talk about it more.
AWS Local Zones: Bringing Cloud Closer to Metropolitan Areas
Thank you Mike. That was the evolution from BC to MC, before cloud to modern cloud. But being AWS we constantly come up with new acronyms. Now we are going to switch from modern cloud to metropolitan areas. A quick show of hands, how many of you have already deployed your workloads in Local Zones or are planning to deploy your workloads in Local Zones? That's a mixed crowd. Sylvia, over to you.
I see a few hands, so it's a good place for us to talk about Local Zones then. Thanks, Mike, for that very insightful look at Ticketmaster's journey with AWS and Local Zones. Now, let's spend some time diving into what Local Zones are and their use cases, including the low latency and hybrid migration cases Mike just talked about for Ticketmaster, and discuss how Local Zones let us bring the cloud closer to users in various metropolitan areas.
All right. First, what are Local Zones? Local Zones are a type of AWS infrastructure that extends AWS and the cloud to more locations, closer to you, your end users, or your workload. We see Local Zones use cases falling into two broad types. One is latency based: customers need to deploy workloads in a Local Zone, or use a distributed edge architecture, to deliver single-digit-millisecond latency. The second is location based: customers need their workloads in a particular place for several reasons, including residency or proximity to their data centers.
In both types, Local Zones enable customers to access the cloud in various locations worldwide to meet their respective low latency or data residency needs. What does that mean? Let's use the slide here as an example and focus on the middle of it. For customers with end users or workloads around Lagos, Nigeria, the nearest region, AWS Cape Town in South Africa, is thousands of miles away. While that works for most use cases, for customers who need low latency around Lagos or data residency within Nigeria, it just doesn't work.
So instead, we launched the Lagos Local Zone a few years back, and customers can architect their workloads to achieve single-digit-millisecond latency or meet their residency needs, while still seamlessly accessing the Cape Town region for the full set of services for the non-latency-sensitive or non-residency-sensitive parts of the workload. Similarly in the US, as Mike was talking about with Phoenix on the left side of the slide, and as we will learn later in the session, with the Phoenix Local Zone and also the Los Angeles Local Zones, Ticketmaster can achieve single-digit-millisecond latency for their latency-sensitive and hybrid migration workloads.
Okay, so knowing that, how can customers use Local Zones? You can think of Local Zones as very similar to Availability Zones from an experience perspective; they're just physically located somewhere other than where the region is.
AWS deploys and operates the local zone infrastructure in the places that you need, and you can focus on the rest. Similar to regions, local zones provide the consistent experience with elastic on-demand capabilities with pay-as-you-go pricing and the same API, so you can enjoy the same experience. Customers get the same AWS security, core services, and developer experiences that you are used to. Local zones are also connected to the parent region with multiple redundant, secure, and high-speed links that allow you to leverage the regional services.
Knowing about this, what are the specific use cases that customers use local zones for? Over the years, we have seen customers from multiple industries adopting local zones to unlock a variety of use cases. We have more traditional enterprise workloads with migration and modernization needs. These workloads include everything from back-office applications to SaaS applications that customers want to move to the cloud. However, given the interdependency for these more enterprise workloads, the size and constraints, they're difficult to move to the cloud today directly and fully.
With local zones, customers can use local zones to move a portion of that workload to the cloud or as a stepping stone to first deploy with a hybrid architecture with on-premises and local zones together before fully moving to the cloud. Secondly, we also have workloads that involve data with local data processing or data residency needs. This includes data-intensive workloads that cannot be easily migrated or workloads impacted by data security, data sovereignty, and geopolitical regulations that require data to be in a specific location. Finally, we also have workloads that require low latency, as we talked about a little bit already. That includes real-time medical imaging, gaming as the picture shows over here, and content creation that need the low latency capabilities.
So knowing the use cases, how do local zones actually help customers achieve that? We support a core set of services to unblock these use cases. Customers can choose from a selection of general-purpose compute, memory, storage-optimized Amazon EC2 instances, and EBS volumes, as shown in the blue circles with all the core set of services. In addition to the core set of services, we also offer services in white circles over here, like RDS, FSx, and S3 in select local zones in different locations for various customer use cases in specific locations.
Recently, over the past two years or so, we have also added AI and ML capabilities such as accelerated compute instances, including P5, P6, Trainium2, and other instances available in select local zones for customers to locally run AI training and inference workloads. Besides, as we discussed before, customers can seamlessly access the full set of services in the parent region for the full set of workloads. Now we know about the local zone capabilities and services. Let's take a look at our global footprint and where the local zones are.
Currently, we have Local Zones available in 35 metro areas across the globe. This includes 17 in the US, including the ones we've just talked about, Phoenix and Los Angeles, and in fact we also have a Local Zone in Las Vegas, plus 18 outside of the US. These available Local Zones are shown in pink on the slide. Besides these, the white markers show 10-plus additional Local Zones that have been announced, and we plan to launch them in the next few years to enable low latency and data residency access to more locations for more customers.
Architecting for Low Latency: Network Design and Connectivity Strategies
So with local zones, we extend AWS and cloud to more locations, enabling customers to architect for various use cases, including latency and hybrid migration that Ticketmaster powers for fans like us. I'm so excited. With that, I'm turning back to Susmitha and Mike to walk us through the low latency architecture and Ticketmaster's journey. Thank you, Sylvia. When it comes to latency, two things come to mind. One is physics. You cannot cheat physics.
Second, the speed of light. As a rough estimate, in the time budget you have for sub-millisecond latency, light can only travel about 150 miles, which is why distance matters so much. Now, let's dive deep into how to architect for low latency using AWS. No scuba gear required. That was a joke.
Anyways, before talking about the low latency network architecture, let's take a step back and understand the Local Zones architecture. As you see on the screen, on the left-hand side is the AWS region, and on the right-hand side is the Local Zone, which is like an Availability Zone in a metropolitan location, as Sylvia mentioned. How do you configure your workloads in the Local Zone? Step one, opt in to the Local Zone of your choice from the AWS region dropdown menu in your console. Step two, extend your VPC from the parent region over to your Local Zone. Step three, create subnets and resources, and start using them.
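As a rough sketch of those three steps with boto3 (the zone-group name, zone name, VPC ID, and CIDR below are placeholders; check the console for the exact values in your account):

```python
import boto3

# The Phoenix Local Zone is parented to us-west-2 (Oregon) in this example.
ec2 = boto3.client("ec2", region_name="us-west-2")

# Step 1: opt in to the Local Zone group (example group name).
ec2.modify_availability_zone_group(GroupName="us-west-2-phx-2", OptInStatus="opted-in")

# Step 2: the VPC itself lives in the parent region; "extending" it means adding a
# subnet whose Availability Zone is the Local Zone.
vpc_id = "vpc-0123456789abcdef0"  # placeholder VPC in the parent region

# Step 3: create a subnet in the Local Zone and start launching resources into it.
subnet = ec2.create_subnet(
    VpcId=vpc_id,
    CidrBlock="10.0.128.0/24",            # placeholder CIDR
    AvailabilityZone="us-west-2-phx-2a",  # example Local Zone name
)
print(subnet["Subnet"]["SubnetId"])
```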
A quick walkthrough of the terminology used. Each Local Zone hosts the data plane, whereas the control plane resides in the AWS region, which we call the parent region. There are two ingress points to connect to Local Zones. The first is for customers like you who have on-premises facilities and want a low latency connection through Direct Connect; we have over 100 Direct Connect locations in the AWS global infrastructure. The second is for users who want to connect over the public internet: Local Zones provide an Internet Gateway, so users coming in over the public interface land on ENIs that live in the Local Zone.
Now let's examine the various strategies and tricks of the trade for architecting for low latency. You have your on-premises or colocation facilities, and you establish a Direct Connect to a Local Zone of your choice. With that Direct Connect, your existing private network can be extended into your Local Zone. Great, you've landed in a Local Zone. What's next? There are three things to consider.
First, our customers run workloads across the AWS global infrastructure and need global reach. To facilitate that global connectivity, as Sylvia mentioned, Local Zones are connected to the parent region via multiple redundant, high-speed AWS backbone links, so you have access to regional services as well as global connectivity. Wait a minute, but how do I connect? You can connect through a Transit Gateway. Let's take a step back. AWS networking has various methodologies for global connectivity, including but not limited to Transit Gateway, Cloud WAN, VPC peering, and others. Let's assume that global connectivity is via Transit Gateway.
To achieve global connectivity from a Local Zone, our customers associate a Transit Gateway with a Direct Connect Gateway via a transit virtual interface. Second, what if you have multiple Local Zones and you want to establish east-west connectivity across them? Do you need to go through the parent region? Not at all. To establish low latency east-west connectivity between Local Zones, like Ticketmaster does between Phoenix and Los Angeles, you use a Direct Connect Gateway connection between the Local Zones to carry that east-west traffic.
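As a hedged sketch of that Direct Connect Gateway wiring with boto3: the gateway name, ASN, and gateway IDs are placeholders, and the same association call is used whether you attach a Transit Gateway or the Virtual Private Gateways of the Local Zone VPCs.

```python
import boto3

dx = boto3.client("directconnect", region_name="us-west-2")

# Create a Direct Connect Gateway dedicated to the Local Zone path.
dxgw = dx.create_direct_connect_gateway(
    directConnectGatewayName="local-zones-dxgw",  # placeholder name
    amazonSideAsn=64512,                          # placeholder private ASN
)
dxgw_id = dxgw["directConnectGateway"]["directConnectGatewayId"]

# Associate the gateway on the AWS side. For the Local Zone path this is typically
# the Virtual Private Gateway attached to each Local Zone VPC (Phoenix, LA, ...);
# for regional reach it can instead be a Transit Gateway.
dx.create_direct_connect_gateway_association(
    directConnectGatewayId=dxgw_id,
    gatewayId="vgw-0aaaabbbbccccdddd",  # placeholder VGW on the Phoenix Local Zone VPC
)
```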
Step three, what if I have multiple VPCs? Our customers have tens or hundreds of VPCs and accounts and subnets. How do I architect my low latency networking with multiple tens or hundreds of VPCs and subnets? Let's hear straight from the horse's mouth. Awesome, thank you.
All right, everybody hear me still? Hopefully, all right, good. Sorry, I didn't hear the feedback, so yeah, hopefully we're all experts on network connectivity now, right? So we look at this drawing. It's going to very closely resemble the one you probably just saw because we're following the best practices from AWS, so it makes perfect sense. We have our on-premises data centers. Again, we're going to talk about Phoenix for this one, so you see that on the far right on your side. And then we actually have two different colos that we connect to.
So we have a Portland connection, Direct Connect, which connects to a Transit Gateway, which is in our primary parent region, right? This is how we have our on-premises data center be able to talk to all of our hundreds of VPCs, whether it's in US East 1, US West 2, or any of our other global regions. For Phoenix specifically and all of our Local Zones, we have a Direct Connect going from our data center to Phoenix. From there, we have a Direct Connect Gateway specifically for our Local Zones that can connect to, as we mentioned, Phoenix, and we also can connect to LA. So it allows us to sort of expand to any of the Local Zones there. We could add Vegas, possibly anything in the Southwest.
And at the same time, now we have multiple paths. Things that need to talk direct from a Local Zone to on-premises can go through their own Direct Connect. If the Local Zone needs to talk to the parent, it can go straight through the normal VPC routing pieces. If it needs to talk to another VPC, it can go that same path, go to the parent VPC, go to a Transit Gateway, and then from there to the other VPC. So we have all the different pathways that can be followed, and we can optimize and take the path of least resistance or least latency.
Scaling for Peak Demand: Bursting to the Cloud with EKS Hybrid Nodes
All right, that's a great overview of low latency networking architecture. So far we've covered why latency is important and how Local Zones can help with your low latency workloads. Now let's take a step back and dive into a different segment. You learned earlier from Mike that Ticketmaster supports more than half a billion ticket sales annually. Mike, can you please explain how you handle that massive scale?
Yes, of course. So we have sort of two types of sales. We have our sort of baseline. You can think of the day-to-day sales that happen routinely. Maybe a sports team or something has all their season tickets or their different tickets listed there. But then we have the really challenging ones, which are on-sale events. Maybe an artist or some promoter wants to put on a huge ticket release. Come Friday, everybody, we're going to open up and sell all of our tickets to all of our events, right? That requires a drastic increase in our capacity to support those on-sales, and it can become a strain depending on how far in advance we might know. It might be we get some lead time, maybe we don't.
With our AWS resources, that's not necessarily as big of an issue because we can use Auto Scaling Groups and the native elasticity in AWS to scale up quickly. But like I said previously, we have on-premises resources that we can't operate in AWS that we also need to scale where possible. So we're left with two options. We can make a large capital investment, set up a bunch of extra racks and extra servers, and meet the demands for that specific event, but then it's sitting idle afterwards. So that's not really the best use of our money.
The other option is to find ways to burst into the cloud to support that extra capacity. Prior to Local Zones, sometimes we could do that and sometimes we couldn't, because as I said before, it's a 40 millisecond connection from our data center to the parent region, and we have systems and applications for which that's too much latency to actually work. But now that Local Zones are an option, it's much more like the Local Zone and our data center are sitting next to each other, with the sub-2-millisecond latency that I mentioned, right?
So this allows us a new capability of, say, just hypothetically, we daily run 20 application servers, right? And an event's coming up that we need to scale that up to 100. Now we can expand that extra 80 into the cloud. We can run that event, process, meet our demands, and then scale it down when it's complete. So now we've paid that variable expense. We've satisfied what we needed for the on-sale, and it can go away, and we can just sort of rinse and repeat as we need to do that. So it's ideal for those cases where we have to have sort of surprise last minute or things with not a lot of lead time, or we don't want to buy all the hardware up front to scale up and down quickly.
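As a minimal illustration of that burst pattern, here is a sketch with boto3. The Auto Scaling group name and the counts are hypothetical, and it assumes the cloud-side fleet in the Local Zone subnet absorbs the extra 80 servers while the baseline 20 stay on-premises.

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-west-2")

ASG_NAME = "onsale-app-servers-phx"  # hypothetical group backing the Local Zone subnet

def burst_for_onsale(extra_capacity: int = 80) -> None:
    """Scale the cloud-side fleet up ahead of an on-sale event."""
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=ASG_NAME,
        DesiredCapacity=extra_capacity,
        HonorCooldown=False,  # react immediately for the scheduled on-sale
    )

def scale_back_down() -> None:
    """Return to zero cloud-side servers once the on-sale is complete."""
    autoscaling.set_desired_capacity(AutoScalingGroupName=ASG_NAME, DesiredCapacity=0)
```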
And that also sort of ties into another goal that we're looking for, which is workload placement flexibility. So I mentioned earlier the game board and shuffling pieces around, right? We're a very heavy Kubernetes shop, and so we've recently started looking at EKS Hybrid Nodes, which will allow us to basically use Amazon EKS, which if you don't know what that is, is Amazon's managed Kubernetes service.
We can use that running it out of AWS, out of say Portland in the parent region. The control plane would live there, but it would also allow us to deploy nodes to on-premises, to Local Zones, or to the parent region. So we can go to any of those three locations, which basically solves our shifting workloads as needed to where they have to be for different scaled events.
Okay, thank you, Mike. Now let me explain from the AWS EKS Hybrid Nodes perspective. First point: EKS Hybrid Nodes are 100% upstream Kubernetes conformant. What does that mean? You'll have access to the innovation and features of the global Kubernetes community. Second, as you see in this picture, you use the AWS API for EKS Hybrid Nodes, and it gives you a consistent developer experience: you can build once and deploy wherever you need it, in a Local Zone, on Outposts, on EKS Hybrid Nodes, or in an AWS region. It follows a bring-your-own-infrastructure model where, as you see at the bottom, you can use your own APIs for your on-premises infrastructure.
But wait, first, how do I connect from on-premises to the EKS clusters in the AWS region? And second, is it secure? Let's address the first point first. Customers use AWS Direct Connect, Site-to-Site VPN, or their own VPN service to connect the VPC in the AWS region to their on-premises locations, where the self-managed worker nodes run. But what about security? Whenever those nodes are added to EKS clusters, temporary IAM credentials are provisioned through SSM hybrid activations or IAM Roles Anywhere, giving you a single pane of management and the security that AWS provides with IAM.
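As one hedged sketch of the SSM hybrid-activation path mentioned above: the role name, instance name, and registration limit are hypothetical. The activation ID and code returned here are what each on-premises node uses when it registers, so it can assume the role and receive temporary credentials.

```python
import boto3

ssm = boto3.client("ssm", region_name="us-west-2")

# Create a hybrid activation; on-premises nodes register with the returned
# ID and code to obtain temporary IAM credentials under the given role.
activation = ssm.create_activation(
    Description="Hybrid nodes - Phoenix data center",  # illustrative description
    DefaultInstanceName="phx-hybrid-node",             # hypothetical name prefix
    IamRole="HybridNodesRole",                         # hypothetical role with the required permissions
    RegistrationLimit=100,                             # cap on how many machines may register
)
print(activation["ActivationId"], activation["ActivationCode"])
```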
Best Practices and Lessons Learned: Do's and Don'ts for Local Zones Deployment
With that, let's move on to the next section on best practices and do's and don'ts. We have spent about 30 minutes understanding Ticketmaster, low latency, and the evolution of the architecture. But let's also hear from Mike about the good, the bad, and the excellent from his team's journey with AWS Local Zones and the Ticketmaster architecture. Could you please walk us through the best practices and the do's and don'ts?
Of course. So yeah, what I'd like to do is, like I said, we started this journey maybe a year ago, and there are obvious things we learned that if any of you are starting this that you would pick up very quickly. The top two that we'll talk about here, you'd probably figure those out relatively quickly, but there are some of the more network configuration pieces that some are obvious, some aren't. But we figure we can save you some time if we can walk through those and make you aware of them as you're going into it.
So number one, I know it's already been mentioned a couple of times, but I just want to reiterate it so it sticks. A Local Zone, you need to realize, is basically an Availability Zone, right? If you have an AZ A and an AZ B up in your primary region, your Local Zone just becomes your AZ C. There are different underlying network speeds or latency you might have to address, but for the most part you treat it the same. So it's just something you'd realize very quickly, but you want to be aware of that going in.
The second one is you really need to know what type of services or EC2 instances or families or managed services you want to use, because not every Local Zone offers the same set. It's something we realized going in that we have very specific needs. We need to make sure that, say, the Phoenix Local Zone actually supports those families of EC2 instances, or the LA Local Zone if it has different capabilities. The QR code there will take you to the web page that lists all the existing Local Zones, all the EC2 types, I believe, and the services. AWS is constantly updating it, and you need to be aware so you don't waste your time setting up the networking for a Local Zone only to have your developers try to build something that isn't feasible in that location.
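You can also query this programmatically instead of checking the page by hand. A small sketch with boto3 (the zone name is an example, and results are paginated for larger listings):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# List which instance types are actually offered in a given Local Zone
# before committing to build there.
offerings = ec2.describe_instance_type_offerings(
    LocationType="availability-zone",
    Filters=[{"Name": "location", "Values": ["us-west-2-phx-2a"]}],  # example Local Zone
)
available = sorted(o["InstanceType"] for o in offerings["InstanceTypeOfferings"])
print(available)
```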
All right, so now let's get into the nerdy networking diagrams. The first one is fairly obvious, but you might not realize it right away: if you're trying to address latency with direct on-premises connectivity, the Local Zone needs its own Direct Connect. This picture right here is the don't. The reason being, say I just had an existing VPC, I enabled it for Local Zones, and I created a Phoenix subnet, right?
If I deploy an EC2 instance there that I want to talk to on-premises, the traffic has to travel from Phoenix to Portland over a Direct Connect and back down to Phoenix. So even though the Local Zone and our data center are only miles apart, the traffic is traveling 2,000 miles to get between them because it's taking that long route. You're not improving anything; you're actually making it twice as bad. So don't do that.
What you want to do is set up that dedicated Direct Connect we've mentioned a couple of times, with a Virtual Private Gateway that you can send your on-premises traffic to. You create route tables that say: whatever my CIDR space for on-premises is, 192.168, send all that traffic this way over the Direct Connect to on-premises. At the same time, all the traffic from that Local Zone to everywhere else in AWS goes through your parent region, so you can differentiate and send traffic in different directions and get the best of both worlds for latency.
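A minimal sketch of that split routing with boto3, where the route table ID, Virtual Private Gateway ID, and CIDR are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

ROUTE_TABLE_ID = "rtb-0123456789abcdef0"  # placeholder: route table for the Phoenix Local Zone subnet
VGW_ID = "vgw-0aaaabbbbccccdddd"          # placeholder: Virtual Private Gateway on the local Direct Connect

# Send on-premises traffic straight over the local Direct Connect...
ec2.create_route(
    RouteTableId=ROUTE_TABLE_ID,
    DestinationCidrBlock="192.168.0.0/16",  # example on-premises CIDR space
    GatewayId=VGW_ID,
)
# ...while everything else keeps following the existing routes toward the parent
# region (for example, via the Transit Gateway attachment).
```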
The fourth one is very much tied to the previous one. A lot of you, and certainly we do, have firewalls up in your parent region that are used for north-south traffic filtering, going out to the Internet or coming in from it, or east-west filtering between your VPCs. If you want that same sort of filtering capability within your Local Zone, you're probably not going to be able to reuse the one that's in the parent, for the exact same reason I just talked about. To use it, you would need to send your traffic from Phoenix all the way to Portland, do the filtering, and then either go out to the Internet or come back to Phoenix, or whichever path it needs to take, and you're increasing that latency. So instead, you're probably going to have to set up a virtual firewall appliance in the Local Zone. Things like Network Firewall from AWS are not yet available in a Local Zone, so you can't reuse that component. A local firewall appliance can be deployed in the Local Zone; you can use it for Internet-bound traffic, or you can place it in front of your Direct Connect. We also considered placing a firewall on the on-premises side and trusting the Direct Connect. So there are a couple of different models, but you're going to have to place your own firewall in the Local Zone if you want that capability.
All right, and the last one, this was the trickiest one that got us. We initially created a VPC and put both a Phoenix and an LA Local Zone subnet in that same VPC, thinking, hey, we're just testing this out, this should work. We started running some tests between Phoenix and LA and wondered why it was taking 60 milliseconds or so for them to talk. The reason is that if you're familiar with VPC routing and you look at the route table, you can't explicitly tell Phoenix how to get to LA, because they're both within the same VPC. You have that default local route, so AWS picks and manages how that routing happens, and every request was basically going up to the parent and coming back down. So we worked with AWS, and they clarified our mistake and said that's not the way to do it. What you want is a different VPC for each Local Zone.
The reason being, now with this sort of setup, I can have a different Virtual Private Gateway on each local zone and I can now make a route in that routing table that says if I need to go from Phoenix to LA, send all that traffic to the Direct Connect side, not to the Transit Gateway side. And what that means is the traffic will leave through the Virtual Private Gateway, go out to the Direct Connect, hairpin back, and go to LA, and that'll give us like 15 milliseconds, which was more what we were expecting up front. So that was a trickier one. It was just, I'm pretty sure it was in the documentation, I just missed it as we were going through, but it's a good one to keep in mind if you're going down that journey of trying to have multiple local zones that can talk to each other east-west.
Looking Ahead: Future Initiatives and Session Wrap-Up
Thank you, Mike, for sharing the tricks of the trade and the do's and don'ts. We also want to hear from you. For those of you in person who are using or deploying Local Zones, we would like to hear your do's and don'ts and best practices. For those of you watching us online, please share comments; our team is constantly evaluating them, and we always want to build and work backwards from customer feedback.
To recap: number one, a Local Zone is an Availability Zone in a metropolitan area closer to you. Number two, because of the edge proximity and the compute services we bring closer to users, a Local Zone has a more limited set of services compared to a region, which has 200-plus services; we constantly evaluate and add services based on your feedback and demand signals. Third, Mike, do you want to take that one? Yes, let me cover the road ahead.
So those were our primary lessons learned. Where do we want to go now? We have new capabilities we want to implement and experiment with. I mentioned EKS Hybrid Nodes; that's a key initiative we have ongoing, and we're expanding it out, taking advantage of the Local Zone and that workload placement flexibility, moving things around.
Beyond that, our entire discussion today has been about Phoenix because that was the first problem we tried to solve. But we've heard that there are many more Local Zone locations, and we have many different AWS regions. So we're going to begin looking at whether the same sort of approach can benefit us in these other regions. Or think about the EU, where there are tighter regulations on data residency and other requirements. There might be other use cases where we can deploy into a Local Zone to comply with data residency requirements or bring workloads closer to the end user or to other databases. So there are other ways we can begin looking at this. Since we've pretty much solved the Phoenix location issue, now we can look at the other use cases that are possible.
That's Ticketmaster enhancing live event experiences with Local Zones. Please continue your learning with AWS. For those of you in person, try AWS Trivia on Skill Builder when you board an AWS shuttle to the Venetian or MGM Grand; there are cool prizes to win and leaderboards to brag about. For those of you watching online, AWS Trivia launched as a multiplayer activity you can play with colleagues or friends based on all the training and learning you have done on AWS.
Next up, we have over 30 plus sessions on hybrid edge. You can watch them online later in a different time or space, including sessions from me and my colleagues where we talk about how to deploy AI and ML workloads in Local Zones, non-AI and ML workloads, and last but not least, AI factories. Please feel free to check out the sessions on hybrid edge and computing.
With that, thank you for being a wonderful audience. For those of you in person, please don't forget to give your feedback and rate our session.
This article is entirely auto-generated using Amazon Bedrock.