AWS re:Invent 2025-AWS Local Zones- Sophos’ new edge in the global race against cyber-attacks-HMC215


Overview


In this video, Ben Lavasani from AWS and John Peterson and Simon Reed from Sophos demonstrate how AWS Local Zones dramatically reduced latency for Sophos's threat intelligence service, SXL (Sophos Extensible List). Sophos processes 223 terabytes of data daily, blocking 11 million threats across 600,000 customers. Previously deployed across five AWS regions, high latency caused performance degradation and timeout issues. By deploying to Local Zones, particularly in Latin America and North America, Sophos shifted their latency bell curve significantly lower. Their deploy-measure-optimize approach proved resilient during Hurricane Milton in Florida, where the system auto-scaled successfully during massive traffic spikes as power was restored. Key takeaways include the importance of strong architecture, early deployment with real-world measurement, and close collaboration with AWS service teams.

This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Introduction to AWS Local Zones and the Network Edge Security Challenge

Hello everybody. Good afternoon and welcome to this lightning talk. Thank you so much for joining. According to a recent Department of Homeland Security cybersecurity report, more than two-thirds of cyber incidents now originate at the network edge, the very point where users, devices, and data meet the cloud. That's the front line where every millisecond counts, and it's exactly where Sophos is planting their new edge defense systems, powered by AWS Local Zones. In this short talk, we will show you how we can turn that statistic on its head, delivering low-latency protection right where you need it, right where the attack surface is.

My name is Ben Lavasani, and I'm a hybrid cloud specialist with AWS. I'm honored to be joined today by John Peterson and Simon Reed, both from Sophos, who I'll be handing over to in a moment. But just before we dive into the details, I want to give you a small bit of background on Local Zones so you're familiar with the concept.

Local Zones is part of our hybrid cloud portfolio, and we're building a continuum of AWS services where our customers need it. It always starts with the regions. We're going to continue to innovate and expand our regional infrastructure, but we get customers that need to get closer to their endpoints. These could be in metro areas distributed around the world, as well as on-premises environments and the far edge. We're building this in a consistent way using the same APIs, familiar services, and the same tools for automation so that you're able to use any of these services across that portfolio in a very common way.

Local Zones are managed and deployed by AWS, expanding into metros around the world. They're designed to support low-latency workloads as well as local data processing. You can run on-demand Elastic Compute Cloud instances in the Local Zone, and each Local Zone extends a parent region. For example, we have one here in Las Vegas that's connected to and extends the Oregon region, so we can reach AWS Cloud locally here in Las Vegas. You can also use them in hybrid mode, so you can extend to the region and use regional services as well, and we have many customers that do that. Of course, they're integrated with the same APIs I mentioned before, among others.

Here's a short map just to show you where the locations are. We started in the US, so the vast majority are there. All the purple ones are generally available, and the white ones are all in planning. We're expanding into Europe, the Middle East, Africa, as well as Australia, New Zealand, Asia Pacific, and South America. Here are just some examples of other locations we're extending into, so you can see the distance between these locations and the regions. It's quite far, so there's a lot of value in latency as well as residency.

That was a very brief background. This lightning talk is mostly about the Sophos team, so at this stage, I would like to hand over to John.

Sophos's Global Cybersecurity Operations and the Latency Problem with SXL

Thanks, Ben. All right, hello folks, can everybody hear me? Great. I'm John Peterson, the Chief Development Officer at Sophos, which means I look after engineering and product development for the company as a whole. I work closely with Simon Reed, who you'll be hearing from in a moment as well.

My main task here today is to give you a quick Sophos crash course so you know what we're all about at Sophos. To start off, Sophos was one of the first cybersecurity pioneers, and we were one of the first companies that started doing antivirus back in the mid-1980s. Since that time, we've obviously diversified our business and reinvented ourselves many times. Today, we're one of the largest cybersecurity vendors in the world with over 600,000 customers and 25,000 channel partners that service that customer base.

For that reason, our primary mission is really bringing positive cybersecurity outcomes to the masses. Our view is that every business deserves a superior cybersecurity outcome regardless of their size, and our mission over the last 40 years has been to make that a reality. Cybersecurity, as you can probably imagine and can see from this funnel on the right-hand side, is a very data-intensive business. If you look across our entire portfolio today, primarily driven by our extended detection and response and managed detection and response offerings, our systems process over 223 terabytes of raw data every day.

From that payload, we extract 34 million unique detections every day. We block 11 million threats and surface 1,100 cases for our MDR teams to analyze and investigate, which ultimately results in over 200 real threat actors being stopped every day within our customer base. We are quite proud of that fact, and you are going to hear a little bit more about how Local Zones has helped us achieve those outcomes for all those customers.

I will pass it over to Simon now, and he can cover that for everybody.

Thank you, JP. You have obviously seen the scale of our product and services deployment. I am responsible for threat intelligence at Sophos, which is effectively all the decisions and intelligence that enable our products to inform and protect our customers.

The key thing about threat intelligence is that it needs to be available in a number of locations. Yes, it is in our product, and yes, it is in the cloud, but the key component here is that it has to work as a systematic unit. We are here today to talk about one of our key elements of our global infrastructure, and that is SXL, the Sophos Extensible List. This is effectively a service we provide in the cloud, built on AWS, which connects all of our products and services back to our threat intelligence cloud. As you can see, we have been through many iterations of this. It was first developed in 2007, and we are on the fourth generation of this technology. It is absolutely critical to how we get intelligence into our products and into our customer base.

Historically, this was deployed as a set of large services across five AWS regions. As you can see from the numbers, it is extensively used. Every single one of our products is effectively permanently connected to our threat intelligence platform. These numbers are large, although a number of companies here today have larger ones. There is one key differentiator with these numbers, though: we are in the business of stopping attacks. This means our products are in line and effectively halting function within our customers' estate. Our products have local decision-making, which is particularly fast, but at times they need to reach back out to our threat intelligence products. This effectively introduces some latency within our overall systems. There are very few places or services deployed where latency is this critical.

We were facing a wide latency spread across our customer base for this service. The lower graph here shows our original service, where latency is distributed in a bell curve across the planet. We have a global customer base spread across nearly every country around the world. Where customers had medium and low latencies, we were in a good position. We were not slowing down a customer's traffic, whether that be their endpoints, email, their firewalls, or their cloud infrastructure. However, what we were finding is that once latency got into the higher ranges, we were effectively slowing down or impacting the ability of that customer to function from a performance point of view.

In a worst-case scenario, when latency got to very high levels, there were timeouts in the product. Threat intelligence was not getting to our products, and the trade-off there was not just degraded performance, it was degraded threat intelligence. We undertook a project with the Local Zones teams here to make a dramatic shift in the global latency of how we are providing this service and effectively move that bell curve dramatically into the lower areas.

Deploying Local Zones: From Regional Variability to Dynamic Global Strategy

The key thing here is where we started our journey. Our service was deployed effectively in five regions, five AWS regions serving large geographic areas. What we found with this analysis is that there was a large degree of variability in latency in each of these areas. Obviously, as you would expect, between North America and South America, there is a large distribution.

However, even within the United States, there is significant variation in latency from our customers' perspective when using AWS Regions. This is an example of when we moved to local zones in one area, specifically in South America. Red indicates highly degraded latency in our products in that region, yellow represents borderline performance, and green indicates latency within acceptable tolerances. As you can see, when we deployed local zones and moved the SXL service closer to our customers, it made a dramatic difference to the overall latency distribution within that customer base.
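The red, yellow, and green bands described here can be illustrated with a small sketch. The thresholds and sample values below are hypothetical, since the talk does not give the actual tolerances or measurements; the point is only to show how a latency distribution can be summarized as band shares before and after a change.

```python
from collections import Counter

# Hypothetical thresholds in milliseconds; the talk does not state real values.
GREEN_MAX_MS = 50.0
YELLOW_MAX_MS = 150.0


def classify(latency_ms: float) -> str:
    """Map one latency sample to a colour band."""
    if latency_ms <= GREEN_MAX_MS:
        return "green"
    if latency_ms <= YELLOW_MAX_MS:
        return "yellow"
    return "red"


def band_shares(samples_ms):
    """Return the fraction of samples that fall into each band."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    counts = Counter(classify(s) for s in samples_ms)
    total = len(samples_ms)
    return {band: counts.get(band, 0) / total
            for band in ("green", "yellow", "red")}


# Made-up samples for illustration only, not measurements from the talk.
before = [40.0, 120.0, 180.0, 220.0, 260.0]
after = [12.0, 18.0, 35.0, 60.0, 170.0]

print("before:", band_shares(before))
print("after: ", band_shares(after))
```

With these made-up numbers, most samples move from the red band to the green band, which is the kind of shift the slide describes.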

This is where we are today with the deployment of our local zones. On this diagram, you can see the regions in the larger areas and our deployment of local zones. We are in mid-cycle here in relation to deploying these zones, and we have focused primarily on Latin America and North America. Where we are going with this is we are turning the problem on its head. When we started this journey, we really did not want to get into manual assessment of which local zone we would be in. We are effectively changing our deployment strategy in relation to this technology.

We will deploy to a local zone by default. We will then dynamically measure the changes in latency for customers in that region, and based on that real-world measurement, we will determine whether we keep that local zone in place because it provides value to our customer base from a latency point of view, or whether we remove ourselves from that environment. The dynamic nature of this approach is necessary because the upfront analysis of whether we should deploy to a local zone is particularly challenging given the complexity of the internet, how it is wired, and the deployment from our customers' perspective. So we are going down a route of deploy, trial, measure, and then understand what the deployment is actually doing for us.
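The deploy, measure, and decide loop can be sketched as a simple rule. This is a minimal illustration, not Sophos's actual decision logic: the function name, the use of the median, and the 10 ms threshold are all assumptions made for the example.

```python
from statistics import median

# Hypothetical minimum improvement (ms) needed to justify keeping a Local Zone.
MIN_IMPROVEMENT_MS = 10.0


def keep_local_zone(region_ms, local_zone_ms,
                    min_improvement_ms=MIN_IMPROVEMENT_MS):
    """Decide whether a trial Local Zone deployment stays in place.

    region_ms:     latencies customers saw when served from the parent region
    local_zone_ms: latencies the same customers saw from the Local Zone
    Returns True when the median latency dropped by at least the threshold.
    """
    if not region_ms or not local_zone_ms:
        raise ValueError("both measurement sets must be non-empty")
    improvement = median(region_ms) - median(local_zone_ms)
    return improvement >= min_improvement_ms


# Made-up measurements for two imaginary metros.
print(keep_local_zone([180.0, 200.0, 220.0], [30.0, 40.0, 50.0]))  # large gain
print(keep_local_zone([45.0, 50.0, 55.0], [44.0, 48.0, 52.0]))     # marginal gain
```

A real decision would likely weigh more than one statistic (tail latency, timeout rate, cost), but the shape is the same: deploy first, then let measured data decide.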

It would not be AWS without an architecture slide. We were told in advance we had to have an architecture slide, so here it is. What we have here is our products and services that talk to an AWS local zone. We are based on core technology of EC2 and Route 53. That effectively talks back to our larger existing regions that contain all of the complex architecture. The interesting thing here is that we moved a very sophisticated caching layer from the region to the local zone. Our architecture allowed us to do this and get a deployment with our sophisticated caching closer to the customers, which yielded all of these results.
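The idea of a caching layer in the Local Zone that falls back to the parent region can be sketched as a read-through cache. This is a deliberately simplified illustration: the `EdgeCache` class, the TTL, and the `regional_lookup` function are hypothetical, and the caching layer described in the talk is far more sophisticated than this.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Read-through cache: answer locally when possible, else ask the region."""

    def __init__(self, fetch_from_region: Callable[[str], str],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch_from_region   # slow path back to the parent region
        self._ttl = ttl_seconds           # hypothetical freshness window
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1                # served from the edge
            return entry[0]
        self.misses += 1                  # missing or expired: go to the region
        verdict = self._fetch(key)
        self._entries[key] = (verdict, now + self._ttl)
        return verdict


def regional_lookup(key: str) -> str:
    """Stand-in for the regional threat intelligence service (hypothetical)."""
    return "malicious" if key.endswith(".bad.example") else "clean"


cache = EdgeCache(regional_lookup)
cache.lookup("site.bad.example")   # first lookup goes to the region
cache.lookup("site.bad.example")   # second lookup is answered at the edge
print(cache.hits, cache.misses)
```

Only the misses pay the round trip to the region, which is why moving the cache closer to customers lowers latency for most lookups.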

It is all well and good in the lab and measuring things, but just as you are trialing this technology, interesting things happen. This was our first ever local zone deployment. It was done manually by the team to trial the technology and ensure we actually understood how it was behaving in practice, and it was in Florida. We deployed the local zone and had the team watching all the metrics. Everything was ticking along nicely in normal circumstances with all lights green. And then suddenly, Hurricane Milton appeared. We were faced with a situation where the hurricane moved through Florida, causing major power outages throughout the whole Florida region, with some areas going days without power. Ultimately the recovery started, where power came back on and computer systems lit up. The interesting thing here is that the top graph shows our normal daily traffic flow through that local zone, all measured in units of 1,000.

The bottom graph shows what effectively happened in the few days following Hurricane Milton. What is happening here is computer systems are being turned on en masse as power comes back on, and from a product point of view, all local caches have been blown. Our product is also tuned so that on a reboot, you rescan systems and capability. You can see a dramatic uptick in the volume going through that local zone. The good news is it all held together, all auto scaled, and there was continuous service from our threat protection through this period. This was a really good test of the system, and something the team are very proud of.
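To make the auto-scaling behaviour concrete, here is a minimal capacity calculation. The talk does not say which scaling policy or numbers Sophos uses, so the per-instance capacity, the minimum fleet size, and the traffic figures below are all assumptions for illustration.

```python
import math


def required_instances(requests_per_second: float,
                       capacity_per_instance: float,
                       min_instances: int = 2) -> int:
    """Instances needed to serve the load, never below a minimum fleet size."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, needed)


# Made-up numbers: a normal day versus a surge when machines reboot en masse
# with empty local caches and rescan on startup.
normal = required_instances(4_000, capacity_per_instance=1_000)
surge = required_instances(25_000, capacity_per_instance=1_000)
print(normal, surge)
```

The surge case is why scaling has to be automatic: after an outage the load does not ramp gradually, it arrives as soon as power returns.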

Last but not least, what are the takeaways for us? First, we always knew there was an opportunity here to reduce the latency of this service. We had debated a number of times whether we should just deploy some more regions, and we never felt like that really was a game changer for us. It would have improved the situation in a number of areas, but was it a fundamental change? We were effectively sitting here knowing what the problem was and being unsatisfied with our route out of this. Then we saw the local zones announcement, and it took all of 60 seconds of reading the press release and all the information about local zones to realize this is our opportunity for the game changer.

Secondly, strong architecture is essential. We made this change relatively easily because of the architecture we had in our threat intelligence cloud, but it was not just the architecture that helped us. We truly understood the dynamics, the data, the metrics, and the interactions between our front-facing caching layer that effectively formed the basis of what we moved into local zones and the back-end threat intelligence services. We understood that before we made any changes, so it is not just the architecture. The flows and metrics through the architecture are really critical.

It has been really helpful on this journey to have immediate contact with the local zones teams. As soon as we saw the press release and knew we wanted to do this, we reached out to our AWS account representative and got directly in contact with the local zones team. We pushed hard on our vision of what we wanted this to be. At the time they announced it, it was US only, and we said, wait, what about the rest of the planet? If we are going to do this, we want to touch every metro region. We worked with the local zones services teams, so the takeaway there is to be bold with the AWS service teams.

Last but not least, the services we all build are complex and sophisticated. It is hard to predict with high degrees of accuracy what will happen in real life. Deploy early, measure, and then when circumstances like hurricanes happen, use them as an opportunity to stress test what you are doing and benefit from that. Thank you everyone for this talk. I am going to be standing at the back with John Peterson, who is over there, very happy to talk more about what Sophos does, low latency environments, and the spread of where we have threat intelligence and why this is so key to what we are doing going forward. Thank you everyone.

This article is entirely auto-generated using Amazon Bedrock.