DEV Community

Steve Cook

FastAPI Container Deployment with AWS Lambda and Amazon ECS

AWS container services provide many options for running containerized workloads in the cloud. In this post we will show how a Python application packaged into a container image can be executed on both AWS Lambda and Amazon ECS.

The Python workload is a simple web service endpoint that adds numbers together and returns the result. It is developed using the popular FastAPI library. Applications built with FastAPI are typically containerized and run behind an ASGI web server such as Uvicorn. The Mangum library provides an adapter that allows FastAPI applications to run in AWS Lambda behind Amazon API Gateway, translating the AWS Lambda proxy event into the Python ASGI standard.


Architecture diagram: the ECS task definition and the Lambda function reference the same container image from ECR.


The ability to run the same container image on both ECS and Lambda comes from overriding the container's default ENTRYPOINT and CMD settings.

ENTRYPOINT defines the executable that will run when the container is launched. CMD defines the parameters that are passed to the executable. For a detailed discussion of these Dockerfile instructions, see the AWS blog post “Demystifying ENTRYPOINT and CMD in Docker”.
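As a concrete illustration, the process launched inside the container is simply ENTRYPOINT followed by CMD. The sketch below uses the values from this post (assuming Uvicorn binds 0.0.0.0) and plain shell variables to stand in for Docker's behavior:

```shell
# Docker concatenates ENTRYPOINT and CMD to form the container's process.
# With this post's image configured for ECS:
ENTRYPOINT="uvicorn"
CMD="main:app --host 0.0.0.0 --port 80"
# The effective command line inside the container:
echo "$ENTRYPOINT $CMD"
```

Overriding either piece at run time (for example with `docker run --entrypoint`, or the `entryPoint`/`command` fields of an ECS task definition) swaps out one half of this command line without rebuilding the image.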

Setup and Code

The example FastAPI code, along with a CDK application that builds the container image, uploads it to Amazon ECR, and deploys it to both AWS Lambda and an Amazon ECS service, can be found at

The FastAPI service is very simple: it accepts zero or more integers in a query parameter and returns their sum.

from typing import List, Optional
from fastapi import FastAPI
from fastapi.param_functions import Query
from fastapi.responses import PlainTextResponse
from mangum import Mangum

app = FastAPI()

@app.get("/math/add", response_class=PlainTextResponse)
async def addition(i: Optional[List[int]] = Query([])):
    return str(sum(i))

# Health check endpoint
@app.get("/health")
async def health():
    return {"status": "OK"}

# Lambda entry point: Mangum adapts API Gateway events to ASGI
handler = Mangum(app)
# Based on the standard AWS Lambda Python base image (tag is illustrative)
FROM public.ecr.aws/lambda/python:3.9
# Install FastAPI, the Uvicorn web server (for ECS), and Mangum (for Lambda)
RUN pip install fastapi "uvicorn[standard]" mangum
# Copy the application into the Lambda task root
COPY main.py ${LAMBDA_TASK_ROOT}
# Default: hand the Mangum handler to the Lambda runtime
CMD ["main.handler"]

The Dockerfile in the example above is based on the standard Lambda base image. Its default ENTRYPOINT executes a shell script that starts the Lambda runtime environment and loads the handler specified in CMD, so this container can be pulled from ECR and run in Lambda directly. The key to allowing this same image to run in Amazon ECS is the installation of Uvicorn.

To run this same container in ECS, we must provide overrides for ENTRYPOINT and CMD. In the ECS task definition we override ENTRYPOINT with ["uvicorn"], which sets the container to run the Uvicorn web server on startup. CMD is overridden with ["main:app", "--host", "0.0.0.0", "--port", "80"], which tells Uvicorn which application to load and which host and port to bind.

The CDK code used to define the ECS Task is:

math_task = ecs.FargateTaskDefinition(self, "MathTask")
# Same ECR image as the Lambda function, with ENTRYPOINT and CMD
# overridden so the container starts Uvicorn instead of the Lambda runtime
math_task.add_container(
    "math",
    image=ecs.ContainerImage.from_ecr_repository(image.repository, image.asset_hash),
    entry_point=["uvicorn"],
    command=["main:app", "--host", "0.0.0.0", "--port", "80"],
    port_mappings=[ecs.PortMapping(container_port=80)],
)
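For reference, the container definition that ends up in the synthesized ECS task definition looks roughly like this (a hand-written sketch, not actual CDK output; the container name is assumed):

```json
{
  "containerDefinitions": [
    {
      "name": "math",
      "entryPoint": ["uvicorn"],
      "command": ["main:app", "--host", "0.0.0.0", "--port", "80"],
      "portMappings": [{"containerPort": 80, "protocol": "tcp"}]
    }
  ]
}
```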

Hitting the ALB, we get the following:

% http ""
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 2
Content-Type: text/plain; charset=utf-8
Date: Fri, 22 Oct 2021 04:24:35 GMT
server: uvicorn


And here is the result from API Gateway and Lambda:

% http ""
HTTP/1.1 200 OK
Apigw-Requestid: Hl6YIiA0ywMEPTA=
Connection: keep-alive
Content-Length: 2
Content-Type: text/plain; charset=utf-8
Date: Fri, 22 Oct 2021 04:28:01 GMT




By planning ahead and packaging a few extra dependencies into our container image, we gain the ability to choose where to run the container. Using CDK to build the container image, store it in ECR, and provision both the ECS service and the Lambda function works very well and makes it quick to test out these ideas.

Top comments (1)

sandnath

Hi Steve, great article! Thank you!
One question, in this scenario, when using FastAPI on AWS, is there anything you suggest for keeping some data in the memory and serving from there? I have a small amount of data (20-30 MB) which I thought of frequently using while serving the traffic and I was thinking of keeping it in memory instead of putting it into a database and querying that for every request (which of course has its network/connection and other overhead). I am assuming that as long as the ASGI is running with the Python/FastAPI code, and I read that file while starting the server, it will always be there to serve every request and I do not need to load the file anymore. Wanted to understand or hear your expert opinion on this issue. Thanks in advance.