Kazuya

Posted on Dec 8, 2025

AWS re:Invent 2025 - Launch web applications in seconds with Amazon ECS [Butterfly] (CNS379)

🦄 Making great presentations more accessible.
This project enhances multilingual accessibility and discoverability while preserving the original content. Detailed transcriptions and keyframes capture the nuances and technical insights that convey the full value of each session.

Overview

📖 AWS re:Invent 2025 - Launch web applications in seconds with Amazon ECS Butterfly

In this video, Malcolm Featonby and Thomas demonstrate ECS Express Mode, a new feature that simplifies container orchestration on Amazon ECS. They show how Express Mode reduces configuration complexity from extensive CloudFormation templates to just three required parameters: container image, task execution role, and infrastructure role. Thomas provides a live demo of creating, updating, and deleting services through CLI and console, showcasing automatic provisioning of load balancers, security groups, auto-scaling policies, and canary deployments. The feature manages up to 25 services per load balancer, implements zero-downtime updates, and orchestrates resource lifecycle automatically while maintaining full ECS flexibility. Express Mode is available now at no additional charge across all AWS regions where ECS operates.

This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Introduction to ECS Express Mode: Simplifying Container Orchestration

Good afternoon, everybody. Thank you so much for coming out and seeing us. I hope you've been having a fantastic re:Invent. We're almost at the end of it, but we're loving the enthusiasm, so that's amazing. Thank you so much. My name is Malcolm Featonby. I'm a Senior Principal Engineer with the Serverless and Containers Organization, and with me on stage, I have the privilege of having Thomas. Thomas, do you want to introduce yourself?

Absolutely, thank you, Malcolm. My name is Thomas. I've been in the container space in AWS for close to six years now across a few different services: first on Elastic Beanstalk, then on App Runner, and this last year I had the great privilege of working on ECS Express Mode. So we're very excited to share with you today just how simple Express Mode makes container orchestration on ECS.

All right, so we're going to jump right in. One of the things that we wanted to be clear about and just start off with is that this is ECS. We're talking about ECS Express, but ECS Express is a feature of ECS. And ECS is that product that, hopefully, you all have been using. It's the one that you've come to know and love. It's the service that we launched eleven years ago. It's a service that currently provisions around three billion tasks a week in thirty-eight of our thirty-eight AWS regions in which it is hosted, so it's everywhere. And what we found is that, of the customers who come to AWS for the first time and are running containerized workloads, sixty-five percent start with ECS.

So really, our goal here is to make it clear that many customers, the logos of which are up there, are using this product. We launched over eighteen million tasks during Prime Day 2025. So it's a very reliable, trusted, established service. And so the reason we wanted to make that clear is because some of the magic we're going to show you may make it look like it's different, but it's not. It's the ECS you've come to know and love.

One of the things as a developer that I find that I love to do is it's all about solving the problem. And solving the problem is really about making sure that I can maximize the time I spend in building, in writing the code in my business logic. And I really want to as much as possible offload the undifferentiated work, that heavy lift that's required, that toil that's required in order for me to provision the load balancer, get the service deployed, et cetera. I really want to focus on the code because that's where the gold is. And we want to make sure when you're using ECS and your developers are using ECS that you're getting that same value.

The whole idea is really to maximize your engineers' cycles so that they're focusing on delivering the value for your customers and the value for your business. So do that, and offload as much as possible of that undifferentiated toil, which is required and important, to the AWS services. ECS is a prime example of where you can do that.

Now ECS is a very rich ecosystem, and you'll see that if you've been using ECS, you'll see that it has a lot of moving parts. You know, there are task definitions, there's load balancers that you're configuring, there's certificates that you're managing, there's security groups, et cetera, et cetera. There's quite a lot to it, and there's a reason for that. In order for us to be able to make sure that we can meet you where you are and support your workloads, we want to be in a position where we can provide that rich platform for you so that we can support a heterogeneous workload type, effectively any workload that you bring to us.

But importantly, in many cases, certain types of workloads actually don't necessarily require that you do all of this customization, this configuration. And we wanted to offload a lot of that undifferentiated toil, that configuration, and again, get back to allowing your engineers to focus on that problem solving and not have to worry too much about this. This is what a typical web service application would look like if you were configuring it on ECS. All of the attributes that you see there, all of the configurations and content are what would go into comprising what's needed in order to get a web service up and running as an ECS service in your cluster running behind a load balancer.

With ECS Express, it's about making sure that we can get you there as quickly as possible. And so with Express, what you'll see is in fact, we've cut down the number of things that you need to worry about as a developer to the absolute minimum. You need to have a container image because that's going to contain your application. But once you've got a container image and you bring to us a task execution role, which is the permissions that the application needs in order to run, and an infrastructure role which gives us permissions to be able to provision a bunch of that configuration on your behalf, you're kind of good to go, right? Super simple. That's all it takes, and we'll do the rest for you.
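To make the "three required inputs" idea concrete, here is a minimal Python sketch. The class name `ExpressServiceRequest`, its field names, and the example image and ARNs are illustrative assumptions, not the actual ECS API shape. It only models the point made above: everything beyond the image and the two roles can be left to defaults.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExpressServiceRequest:
    """Hypothetical model of the three inputs named in the talk."""
    image: str                    # container image holding the application
    execution_role_arn: str       # task execution role
    infrastructure_role_arn: str  # role that lets ECS provision resources
    # Everything else is optional; an empty dict stands for "use defaults".
    overrides: dict = field(default_factory=dict)

    def missing_fields(self) -> list[str]:
        """Return the names of required inputs that were left empty."""
        required = {
            "image": self.image,
            "execution_role_arn": self.execution_role_arn,
            "infrastructure_role_arn": self.infrastructure_role_arn,
        }
        return [name for name, value in required.items() if not value.strip()]


# Placeholder values, assumed for illustration only.
request = ExpressServiceRequest(
    image="public.ecr.aws/nginx/nginx:latest",
    execution_role_arn="arn:aws:iam::123456789012:role/example-execution-role",
    infrastructure_role_arn="arn:aws:iam::123456789012:role/example-infrastructure-role",
)
print(request.missing_fields())
```

An empty list from `missing_fields()` means the request carries all three required inputs; anything in `overrides` would be the optional customization.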

Live Demo: Creating and Deploying Services with ECS Express Mode

Now don't take my word for it, right? We have Thomas here who has spent a bunch of time actually thinking about and building this. He is one of the technical leads on the program. So I'm going to hand it over to Thomas so that he can show you this.

Thank you very much, Malcolm, and as the gentleman in the audience pointed out, let me show you just how simple Express Mode makes container orchestration. We'll start off in the CLI, and this is baked into the AWS CLI that you already know and are familiar with. We'll start off with a simple create-express-gateway-service command, and as you see, all we require is that execution role, infrastructure role, your image, and we're already good to go. You get back all the configuration that Express Mode is defaulting and those best practices which are embedded on your behalf. We'll go into a little bit more detail on that configuration in a second.

In the meantime, I want to go ahead and kick off our second express gateway service creation. This one will be a little bit more complicated, but you'll still be able to see the orchestration magic. I passed in this new parameter that we're offering, --monitor-resources, and this is available in the public AWS CLI. No third-party CLI required. You can see and follow along with all the resources that Express Mode is provisioning on your behalf. You can see the cluster there that we use for the default, you get back immediately your service, and you can see the target service revision and then all of those downstream resources that we orchestrate and manage on your behalf.

Importantly, you get your ingress path, that URL endpoint where your application will be accessible, and then you see your load balancer, target groups, security groups coming down to auto scaling. You'll see your scalable target is being registered as well as your auto scaling policy. The metric alarm is then used in case of rollbacks during deployment, so we will be able to monitor your metrics for 5XX and 4XX from your load balancer. Then there's the security group that we manage to ensure that we have least-privilege access between your Application Load Balancer and your ECS service as well.

You're able to actually use this really cool monitoring feature during the create experience, update experience, as well as the delete experience. It's really the entire lifecycle of your application management. Now I'll hop out of that, and with those two services created, we'll go into the console here and refresh our cluster. As Malcolm was pointing to, this really is ECS at its heart. All the ECS bread and butter that you know and love, your cluster, your service, all the way down to your task definition is still available for you.

Opening up our two services here, do it live 1 is that simple service we used just with the NGINX and the three required parameters. We can already see on our observability tab those metrics coming through, and these new load balancer metrics where we'll be able to live tail any request that gets sent to the load balancer. Coming to the resources tab, again you can see all those resources that Express Mode is managing on your behalf and follow along as they get provisioned. It looks like this service is very close to being done. Its deployment has nearly succeeded.

We'll take a look at do it live 2, that more complicated second application we created here, and similarly, we can follow along with the resource creation. Now, something interesting you may have noticed. If I open the load balancer here, this takes me to the EC2 console. I can check out the load balancer. We have the listener here on 443 for HTTPS resolution, and we have six rules here. We're actually sharing the same load balancer now across six different Express Mode services, and we can scale up to 25 different services at a time that are using the same VPC network configuration for both public and private services. You benefit from the economies of scale where you're only paying for that one load balancer, and we're taking care of all the orchestration and scaling that load balancer and deprovisioning when necessary.
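The sharing behavior described here can be pictured with a small toy allocator in Python. This is a sketch under assumptions: the class `LoadBalancerPool`, the `network_key` string, and the rule that a 26th service gets a fresh load balancer are all hypothetical. The talk only states that up to 25 services with the same VPC network configuration can share one load balancer.

```python
from collections import defaultdict

MAX_SERVICES_PER_LOAD_BALANCER = 25  # limit quoted in the talk


class LoadBalancerPool:
    """Toy allocator: services with the same network config share a load balancer."""

    def __init__(self) -> None:
        # network config key -> list of load balancers, each a list of service names
        self._pools: dict[str, list[list[str]]] = defaultdict(list)

    def place(self, service: str, network_key: str) -> int:
        """Attach a service and return the index of the load balancer used."""
        balancers = self._pools[network_key]
        for index, services in enumerate(balancers):
            if len(services) < MAX_SERVICES_PER_LOAD_BALANCER:
                services.append(service)
                return index
        # Assumption: when every balancer is full (or none exists), add another.
        balancers.append([service])
        return len(balancers) - 1

    def balancer_count(self, network_key: str) -> int:
        """Number of load balancers currently used for this network config."""
        return len(self._pools[network_key])


pool = LoadBalancerPool()
for n in range(26):
    pool.place(f"service-{n}", network_key="vpc-a/public")
print(pool.balancer_count("vpc-a/public"))
```

The point of the sketch is the cost model: the first 25 services in a given network configuration all land on one load balancer, so only one is paid for.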

Coming back to here, it looks like our first deployment has succeeded, so our application is now available. Coming to the second one here, it looks like that application is still running. We have our task launched, and we're just wrapping up the deployment now. I'll go ahead and pop on to the URL that's vended by us, and there we go. In a matter of minutes, we have our scalable load balanced web application ready to serve traffic. That's a simple NGINX straightforward one. The second one that we created, which was a little bit more complicated, I actually passed in a task role to it, and that task role will allow the application to fetch live the number of tasks that are being used and served by ECS, so it's making calls to the ECS endpoints, as well as some environment variables.

This allows us to determine how many desired tasks there are, how many running tasks, and how many pending tasks there are. Additionally, we added in a custom scaling policy through that Express Mode API and we said, instead of the default of scaling with CPU, I want to scale by request count. So based on the number of requests, we're going to scale up or scale down the number of tasks that we have. And then we can open up this one here, and there we go. So we have our traffic visualizer running on port 8080 here.

I'll go ahead and start sending requests now. So this is going to monitor the live number of requests that are being sent to this endpoint. We have 50 requests per second, and the auto scaling configuration was set to scale to 500 requests per target per minute. So we see we have one task desired and one task running for now. I'll pull this off to the side and we can come back to this a little bit later.
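The numbers quoted here can be worked through with a few lines of Python. This is only the arithmetic implied by the stated figures, not a prediction of what the service does: the function name and the `min_tasks` parameter are assumptions, real scaling moves in steps, and the talk does not state the configured minimum or maximum task counts.

```python
import math


def steady_state_tasks(requests_per_second: float,
                       target_per_task_per_minute: float,
                       min_tasks: int = 1) -> int:
    """Tasks needed so each task sees at most the target request rate."""
    requests_per_minute = requests_per_second * 60
    needed = math.ceil(requests_per_minute / target_per_task_per_minute)
    return max(min_tasks, needed)


# 50 requests per second against a target of 500 requests per task per minute.
print(steady_state_tasks(50, 500))
```

At 50 requests per second the service receives 3,000 requests per minute, well above what a single task should take at a 500-per-minute target, which is why a scale-up is expected once the load is sustained.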

Complete Lifecycle Management: Updates, Auto Scaling, and Deletion

Now, it's not just about the creation experience, so it's not just about getting started, it's really as Malcolm was pointing to about that entire lifecycle of your application. So now I want to work through the update experience, and this is where things get really cool. When you think about an update for this simple application, we had NGINX, but let's say we move to that more complicated container image, which I just showed you, the traffic visualizer. We need to change some of the configuration, importantly, the image, and then the container port for this task and for the service itself.

And just like that, in a single command, we're able to update it. But here's the thing: think about what's required behind the scenes to orchestrate this. If you were to update the container port for your service, you're going everywhere, starting from the load balancer, where you're changing the security group egress path on that load balancer. You're changing the container port on the target group. You're changing the container port in your task definition, and then even the security group that's associated with the service. So you need to orchestrate four or five different things all at once, and then you need to actually make the updates in the deployment at that same time, all while ensuring no downtime at all. Express Mode takes that pain away from you and orchestrates it on your behalf; with a single command, we can get that update going.

Down here in the Resources tab, which I showed before, you can not only see the resources that are being provisioned, but also the resources from the prior service revision that are now being deprovisioned. So we can see that we're reusing a lot of the same resources. Things like the metric alarm and scaling policy that are still the same and that we didn't update, those are immediately available. Things like the target groups which now need to be recreated because of the new container ports, those will be created and then we can see the old target groups being deprovisioned as well.
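The reuse-versus-recreate behavior can be sketched as a simple comparison between two service revisions. This Python example is a hedged illustration, not how Express Mode is implemented: `plan_update`, the resource names, and the old port value of 80 are assumptions (the talk mentions port 8080 only for the traffic visualizer).

```python
def plan_update(old: dict[str, dict], new: dict[str, dict]) -> dict[str, list[str]]:
    """Compare two revisions' resources, keyed by name, valued by configuration."""
    plan = {"reuse": [], "create": [], "deprovision": []}
    for name, config in new.items():
        if old.get(name) == config:
            plan["reuse"].append(name)            # unchanged: keep as-is
        else:
            plan["create"].append(name)           # new or changed: provision fresh
            if name in old:
                plan["deprovision"].append(name)  # old copy is retired afterwards
    for name in old:
        if name not in new:
            plan["deprovision"].append(name)      # no longer needed at all
    return plan


old_revision = {
    "target_group": {"port": 80},        # assumed port for the NGINX image
    "scaling_policy": {"metric": "cpu"},
    "metric_alarm": {"watch": "4XX/5XX"},
}
new_revision = {
    "target_group": {"port": 8080},      # container port changed
    "scaling_policy": {"metric": "cpu"},
    "metric_alarm": {"watch": "4XX/5XX"},
}
print(plan_update(old_revision, new_revision))
```

In this toy plan the target group appears under both `create` and `deprovision`, which mirrors what the Resources tab shows: a new target group comes up for the new port while the old one is drained and removed.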

So through this panel, we're able to really monitor those resources at any point in time across multiple different updates for your service and see what is active at any given time and what is going on beneath the Express Mode layer itself. So talking more about beneath the Express Mode layer, let's really talk about the flexibility that you have with Express Mode. It's not merely this single API that we have for you for create, update, and delete. As Malcolm said, you really have the full strength and power of ECS and the roots of it at your disposal.

So for instance, if you want to go in and fine tune some of the parameters in your task definition, you can go ahead and open that up, find your task definition here. Make the changes that you'd like, push another revision, update your service, and then you're able to integrate that back through Express Mode and you can continue to use that API and serve your service through Express Mode. So through that way, you really have the full flexibility of ECS at your disposal. What's going on now is you see for that service that we updated, we have two tasks running, and we can take a look at the Deployments tab to see what's going on there.

So we've launched that new deployment, and in this deployment you can see we're using a canary-based deployment strategy. So we're using the state of the art best practices for ECS deployments in order to ensure zero downtime here. We have bake time integrated on the deployment and then we have monitoring as well, with an alarm monitoring the number of 4XX and 5XX responses that are being served by your application, in case we need to roll back to the prior service revision. All of that will be handled on your behalf.

So for this canary strategy, it will launch that second task, and then 5% of your traffic from your Application Load Balancer will be sent to that new task, while 95% remains behind. We'll bake for some time, and then we'll switch to full 100%.

If anything goes wrong, if your application has an issue and we need to roll back, we'll roll that back. Otherwise, you're good to go and you've completed your updates entirely.
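The canary sequence described above (5% to the new task, bake, then 100%, or roll back on an alarm) can be written down as a tiny state sequence. This Python sketch is illustrative only: the function name, the phase labels, and the single boolean standing in for the 4XX/5XX alarm are assumptions, and real bake times are not modeled.

```python
def canary_rollout(alarm_during_bake: bool,
                   canary_percent: int = 5) -> list[tuple[str, int, int]]:
    """Return (phase, old_revision_percent, new_revision_percent) steps."""
    steps = [("start", 100, 0)]
    # Shift a small slice of traffic to the new revision first.
    steps.append(("canary", 100 - canary_percent, canary_percent))
    if alarm_during_bake:
        # Alarm fired while baking: send all traffic back to the old revision.
        steps.append(("rollback", 100, 0))
    else:
        # Bake time passed cleanly: shift everything to the new revision.
        steps.append(("complete", 0, 100))
    return steps


for phase, old_pct, new_pct in canary_rollout(alarm_during_bake=False):
    print(f"{phase}: old={old_pct}% new={new_pct}%")
```

Calling `canary_rollout(alarm_during_bake=True)` ends in the `rollback` phase instead, with all traffic back on the prior service revision.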

Shifting back here now to the traffic visualizer, we see that this is starting to ramp up the number of requests. So in just a few short minutes, we've sent over 12,000 requests to this same service, and we've integrated with auto scaling in such a way that we require three data points from auto scaling before auto scaling sends a trigger to ECS, and then we will scale up. This allows for any small perturbations in your traffic. Let's say you have an occasional spike that goes back down. You won't necessarily need to expand and scale up your fleet, but once you have shown a consistent increase in traffic, then we will begin to scale up that fleet accordingly.
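The "three data points before a trigger" idea can be sketched as a small debouncing check. This Python example assumes the three points must be consecutive breaches of the target, and the class name, threshold, and sample values are made up for illustration; the talk does not spell out the exact evaluation rule.

```python
from collections import deque


class ConsecutiveBreachTrigger:
    """Fire only after `required` consecutive data points exceed the threshold."""

    def __init__(self, threshold: float, required: int = 3) -> None:
        self.threshold = threshold
        # Keeps only the most recent `required` breach flags.
        self._recent = deque(maxlen=required)

    def observe(self, value: float) -> bool:
        """Record one data point and report whether scaling should trigger."""
        self._recent.append(value > self.threshold)
        return len(self._recent) == self._recent.maxlen and all(self._recent)


trigger = ConsecutiveBreachTrigger(threshold=500)
samples = [900, 300, 700, 800, 1200]  # one spike, then sustained load
print([trigger.observe(v) for v in samples])  # [False, False, False, False, True]
```

The isolated spike at the start never triggers anything; only the run of three high readings at the end does, which is the behavior described for absorbing short perturbations.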

And just right there as you see, we've now gotten that trigger from auto scaling. ECS has begun to scale up those tasks, so we've transitioned from one task desired, one task running to now three tasks desired, three tasks running, all without any intervention on your part. We've made this repository image publicly available, so if you'd like, go ahead and take a picture of this. Scan the QR code. That will send you to our public ECR repository where you can launch this application with one single command on ECS Express Mode.

Coming back again to our service, we'll go back to that more complicated application, and just to show you the final stepping stone of the complete lifecycle management of your application, let's look at how you delete your Express Mode service. You see, we have all these downstream resources for you, but we've made this super simple to orchestrate on your behalf. See, Express Mode will identify which resources from your application are still needed and which ones are unique to this single service and can be deprovisioned.

So as we look at this, something like your cluster, which is still being used by multiple different services, that will be retained. Your load balancer, which as we saw before is used by multiple different services, that will be retained as well. Your log group, of course, you probably don't want to delete your log groups. You want to retain those as well, so those have retention. But things like your target group, your scaling policy, the service itself, those will be going to draining and then those will deprovision.
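The retention rules just described can be summarized in a short Python sketch. It is a simplified model under assumptions: `plan_delete`, the resource names, and the hyphenated service names are hypothetical, and the only rules encoded are the ones stated in the talk (shared resources and log groups are kept, resources unique to the service are deprovisioned).

```python
ALWAYS_RETAINED = {"log_group"}  # the talk says log groups are kept


def plan_delete(service: str, usage: dict[str, set[str]]) -> dict[str, list[str]]:
    """usage maps each resource name to the set of services that use it."""
    plan = {"retain": [], "deprovision": []}
    for resource, users in sorted(usage.items()):
        shared = bool(users - {service})  # any other service still using it?
        if resource in ALWAYS_RETAINED or shared:
            plan["retain"].append(resource)
        elif service in users:
            plan["deprovision"].append(resource)
    return plan


usage = {
    "cluster": {"do-it-live-1", "do-it-live-2"},
    "load_balancer": {"do-it-live-1", "do-it-live-2"},
    "log_group": {"do-it-live-2"},
    "target_group": {"do-it-live-2"},
    "scaling_policy": {"do-it-live-2"},
    "listener_rule": {"do-it-live-2"},
}
print(plan_delete("do-it-live-2", usage))
```

Deleting the second demo service in this model retains the cluster, load balancer, and log group, and deprovisions the listener rule, scaling policy, and target group.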

So again, you can follow along in this resources panel here, which will refresh for you. You can see we've deleted that listener rule, that target group, et cetera, and if we were to access the endpoints, you can see this is no longer accessible. So that's really the full end-to-end lifecycle of support and orchestration that Express Mode offers for you as well. Now, I'll hand it back to Malcolm, who will wrap us up a little bit and share more about our complete integration.

ECS Express Mode Benefits: Full ECS Power with Simplified Experience

All right, thanks, Thomas. Claudio, there you go. I got my mic back. So one of the things that we just wanted to call out there is that was a live demo, right? Like that's, you know, tempting the demo gods, but we have that much faith in the solution. So kudos, Thomas. And really what we wanted to kind of make clear here is this is ECS. It's an ECS feature. It's not something different. It is ECS.

What I really loved about Thomas's demo there is he shows you that although it's a simplified experience, the end result is we're creating resources in your account. Those resources are available to you. You can go and have a look at the task definition. You can add sidecars if that's what you want to do. So it is still full-blown ECS with all of that richness. It's just a much simplified developer experience, and it's end-to-end. It's not just about when you create the application. It's about managing the full lifecycle of that application.

One of the things that kind of really brings that home for me is if you have a look on the left, that very long bar that you probably can't read, that's the CloudFormation template that's required in order to establish and build a web service application running on ECS previous to Express Mode. On the right-hand side is what you do now. And importantly, this is live and available right now, right? It's available through all of the different mechanisms, the SDK, the CLI, CloudFormation, and the console. That's what you end up doing, a significantly simplified outcome, although you still end up getting all of those resources that you need.

All right, importantly, ECS Express is a feature of ECS. There is no additional charge. The only thing that you're going to be paying for is you're going to be paying for the resources that you consume, so your load balancer and your compute as you normally would. This is really just about a simplified developer experience. We have 40 seconds before we get thrown off stage, so we're not going to take any questions right now, but we will be over here on the side and we would love to speak to you. We really appreciate you taking the time to come and see us. Thank you so much and enjoy the replay tonight.

This article is entirely auto-generated using Amazon Bedrock.