If you are an aspiring Data Scientist, ML Engineer, or DevOps Engineer looking to expand your knowledge of AWS services, you’ve come to the right place!
Many ML engineers struggle to deploy models without incurring SageMaker costs. This post shows you how to serve real-time ML inference using only Lambda and Docker: fully serverless, low-cost, and production-grade.
In this tutorial, we will go over a small project: developing an AWS Lambda function, packaged in a Docker container, that predicts the injury duration of soccer players based on the injury description. The technologies we are going to use in this project are:
- Python
- Docker
- AWS services:
  - Lambda
  - API Gateway
  - CloudFormation
  - Elastic Container Registry
The four simple steps we are going to take to complete this project are:
- Create a script that trains and optimizes an AI model on the dataset
- Create the Lambda code that takes in the parameters from which we predict the injury duration
- Create the CloudFormation template that builds our Lambda using Docker
- Deploy that CloudFormation template to our AWS environment
The GitHub repository can be found here: https://github.com/mate329/dockerized-AI-Lambda
Important note - all resources and identifiers shown were used in a temporary environment and have since been deleted.
Let’s start!
Training and optimization script for AI Model
Data Description
The dataset we are using to train our AI model is about soccer injuries. Injuries are a common occurrence in any sport, and it would be very beneficial for any professional sports club to have a system that, based on the injury description, predicts the length of the injury. Here are the parameters used to describe an injury:
The dataset has been downloaded from https://stemgames.hr/en/event/competition-in-problem-solving-exercises/technology-arena/
Libraries used
The libraries we are going to import into our training script:
- NumPy - used for numerical computing
- Pandas - used for data manipulation, analysis, and loading CSV files
- the star of our training process - Optuna - used for hyperparameter optimization of AI models
- Pickle - saving and loading Python objects
- Joblib - used for saving our trained model so the Lambda can load it later
- JSON - standard library for working with JSON data
- XGBoost - our AI library, containing the regressor model we are going to use
- Scikit-Learn - used for splitting data into training and test datasets
import numpy as np
import pandas as pd
import optuna
import pickle
import joblib
import json
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
Loading the data and creating datasets
Next, we are going to load the data, get the list of features in our dataset, and split the data into training and test datasets:
# Loading the data, where 'kaggle_x_train.csv' contains the injury description
# while 'kaggle_y_train.csv' contains the actual duration of the injury
data_train_x = pd.read_csv('data/kaggle_x_train.csv')
data_train_y = pd.read_csv('data/kaggle_y_train.csv')
# Since the description and actual duration of the injury are in two different
# files, we need to merge them into one object
merged_train = pd.merge(data_train_x, data_train_y[['Id', 'injury_duration']], on='Id')
merged_train.drop(columns=['Id'], inplace=True)
# Separate features and target variable
X = merged_train.drop('injury_duration', axis=1)
y = merged_train['injury_duration']
# Save feature names for Lambda validation
feature_names = list(X.columns)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
AI Model Optimization
Now we come to the part of the code where we define the hyperparameter optimization for our AI model.
Hyperparameter optimization is the search for the combination of model settings (called hyperparameters) that gives the best performance on a given dataset. To make this process as easy as possible, we are using the Python library Optuna, which does the optimization for us.
We will define a method called `objective`, in which we declare the parameters we want to optimize and their ranges; we then provide the `params` dictionary to the AI model, train it on our data, and evaluate the results.
Root Mean Square Error (RMSE) is one of the metrics used for assessing model performance. In this project, RMSE is the square root of the average squared difference between the actual and predicted injury durations.
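In formula form:

$$
\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
$$

where $y_i$ is the actual and $\hat{y}_i$ the predicted injury duration for the $i$-th sample.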
To start the optimization process, we call the Optuna library and create something called a study - think of it as starting an experiment. We set the direction to `minimize`, since we want the final model to predict injury durations closest to the actual ones, and we start the optimization by calling the `optimize` method with the number of trials we want to run.
def objective(trial):
    # Define hyperparameter space
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 1, 100),
        'max_depth': trial.suggest_int('max_depth', 1, 10),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 10),
        'gamma': trial.suggest_float('gamma', 0, 0.3)
    }

    # Create and fit model
    model = XGBRegressor(**params)
    model.fit(X_train, y_train, eval_set=[(X_test, y_test)], verbose=False)

    # Make predictions and return RMSE
    preds = model.predict(X_test)
    mse = mean_squared_error(y_test, preds)
    rmse = np.sqrt(mse)
    return rmse

# Minimize RMSE
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=200)
Feel free to experiment with the hyperparameter ranges, mine are used just as an example.
Saving needed artifacts for Lambda
At the end of our script, we save the necessary files (called artifacts) that our Lambda is going to use. For this, we use the `joblib` library to save the final model with the highest performance.
# Fetch the best hyperparameters found during the study
best_params = study.best_params
print(f"Best parameters: {best_params}")
print(f"Best RMSE: {study.best_value}")
# Train final model with best parameters and evaluate
final_model = XGBRegressor(**best_params)
final_model.fit(X_train, y_train)
final_preds = final_model.predict(X_test)
print("Final model performance:")
print(f"Best parameters: {best_params}")
print(f"RMSE: {np.sqrt(mean_squared_error(y_test, final_preds)):.4f}")
# Export artifacts for AWS Lambda
print("Exporting model artifacts...")
# 1. Save the trained model
joblib.dump(final_model, 'PredictInjuryDurationLambda/injury_model.pkl')
print("Model saved as 'PredictInjuryDurationLambda/injury_model.pkl'")
# 2. Save model metadata
metadata = {
    'feature_names': feature_names,
    'best_params': best_params,
    'model_performance': {
        'rmse': float(np.sqrt(mean_squared_error(y_test, final_preds)))
    },
    'training_date': pd.Timestamp.now().isoformat(),
    'n_features': len(feature_names)
}

with open('PredictInjuryDurationLambda/model_metadata.json', 'w') as f:
    json.dump(metadata, f, indent=2)
print("Metadata saved as 'PredictInjuryDurationLambda/model_metadata.json'")
Here is a screenshot of the terminal window, with the Optuna library logging every trial that occurs during optimization and the final parameters logged at the end:
And that’s it for our model training! Next, we’ll go over the Lambda code and a full step-by-step guide on how to dockerize it!
Lambda code and Dockerizing the model
To start off, let’s explain our architecture: the client sends a request to the API Gateway URL, which forwards it to the Lambda. The Lambda runs from the Docker image stored in Elastic Container Registry (ECR) and processes the incoming request to predict the injury duration. After the prediction, the Lambda returns the result to the client via API Gateway.
Here is the architecture diagram:
Lambda code
The following Lambda code loads the incoming payload, validates that no information is missing, and proceeds with the injury duration prediction:
import json
import joblib
import pandas as pd
from aws_lambda_powertools import Logger

# Initialize logger
logger = Logger(service="injury_prediction")

# Load artifacts once when Lambda container starts
model = joblib.load('injury_model.pkl')
with open('model_metadata.json', 'r') as f:
    metadata = json.load(f)

@logger.inject_lambda_context(log_event=True)
def lambda_handler(event, context):
    # Load the incoming event and set up the logger
    event_body = json.loads(event.get('body')) if 'body' in event else event
    request_id = context.aws_request_id
    logger.append_keys(request_id=request_id)

    # Validate event structure
    if 'features' not in event_body:
        logger.error(f"Invalid event structure: {event_body}")
        return {
            'statusCode': 400,
            'body': json.dumps({'error': 'Invalid event structure'})
        }

    # Extract features from event
    features = event_body['features']

    # Validate features match training
    expected_features = metadata['feature_names']
    if list(features.keys()) != expected_features:
        logger.error(f"Feature mismatch: expected {expected_features}, got {list(features.keys())}")
        return {
            'statusCode': 400,
            'body': json.dumps({'error': 'Feature mismatch'})
        }

    return perform_prediction(features, expected_features)

def perform_prediction(features, expected_features):
    try:
        # Create DataFrame with correct feature order
        X = pd.DataFrame([features])[expected_features]

        # Make prediction
        prediction = model.predict(X)[0]
        logger.info(f"Prediction made successfully: {prediction} days")

        # Return response
        return {
            'statusCode': 200,
            'body': json.dumps({
                'injury_duration_days': float(prediction),
                'model_version': metadata['training_date']
            })
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({'error': str(e)})
        }
To improve performance, the AI model is loaded into a global variable. This follows AWS Lambda best practices: globals are initialized once and reused across subsequent invocations, until the execution environment is shut down after a period of inactivity.
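If you want to exercise the handler before building any image, here is a minimal sketch of a local smoke test. The stub context only provides the attributes that Powertools and our handler read (an assumption about Powertools internals), and the dummy feature values are placeholders:

```python
# local_smoke_test.py - a hypothetical helper, run from the folder containing
# lambda_handler.py and the model artifacts.
import json
from types import SimpleNamespace

import lambda_handler  # loads injury_model.pkl and model_metadata.json on import

# Dummy payload that matches the saved feature order exactly
features = {name: 1 for name in lambda_handler.metadata['feature_names']}
event = {'features': features}

# Stub only the context attributes read by Powertools and the handler
context = SimpleNamespace(
    function_name='PredictInjuryDurationLambda',
    memory_limit_in_mb=512,
    invoked_function_arn='arn:aws:lambda:local:000000000000:function:local-test',
    aws_request_id='local-test-request',
)

response = lambda_handler.lambda_handler(event, context)
print(response['statusCode'], json.loads(response['body']))
```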
Defining Dockerfile
To define the configuration of our Docker image, we need to create a Dockerfile. We’ll use an official AWS Lambda base image for Python 3.12, copy over the requirements.txt file with our Lambda dependencies, install them, and then copy over the artifacts we generated when training our AI model. Lastly, we’ll define the entry method for our Lambda.
# Use an official AWS Lambda base image for Python
FROM public.ecr.aws/lambda/python:3.12
# Install required Python libraries
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Copy the saved model and Lambda function code into the container
COPY lambda_handler.py ./
COPY injury_model.pkl ./
COPY model_metadata.json ./
# Command to run the Lambda function
CMD ["lambda_handler.lambda_handler"]
Our `requirements.txt` file for the Lambda is the following:
numpy
pandas
xgboost-cpu
scikit-learn
joblib
aws-lambda-powertools
We are using the XGBoost-CPU package instead of the regular XGBoost package because Lambda is a CPU-only resource. As a bonus, this saved us ~800 MB in Docker image size, because the CPU-only build doesn’t download the binaries used for interacting with the computer’s GPU.
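Before wiring the image into CloudFormation, you can also smoke-test the container directly: AWS’s Lambda base images ship with the Runtime Interface Emulator, which exposes a local invocation endpoint when the container runs. A minimal sketch, assuming you’ve built the image as `injury-lambda` and saved the test payload (shown in the testing section below) as `test_payload.json` - both names are just examples:

```python
# First, build and run the container locally:
#   docker build -t injury-lambda .
#   docker run -p 9000:8080 injury-lambda
import json

import requests

with open('test_payload.json') as f:
    event = json.load(f)

# Standard invocation path exposed by the Lambda Runtime Interface Emulator
url = 'http://localhost:9000/2015-03-31/functions/function/invocations'
response = requests.post(url, json=event)
print(response.json())
```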
Defining CloudFormation resources
The following CloudFormation template is a very simple one - we just have our Lambda and an API Gateway resource to open it up to the Internet.
The most important configuration is under the `Metadata` section of our Lambda: we need to define the name of our Dockerfile, the DockerContext, and how the Docker image will be tagged when it gets created. In addition, `PackageType` needs to be set to `Image`, so CloudFormation recognizes that we are deploying a dockerized Lambda.
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  Predict Injury Duration Lambda

Resources:
  PredictInjuryDurationLambda:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      Timeout: 60
      Architectures:
        - x86_64
      Events:
        Inference:
          Type: HttpApi
          Properties:
            Path: /predict-injury-duration
            Method: post
            ApiId: !Ref PredictInjuryDurationApi
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: ./
      DockerTag: latest

  PredictInjuryDurationApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      CorsConfiguration:
        AllowOrigins:
          - '*' # Allow requests from any origin
        AllowHeaders:
          - '*'
        AllowMethods:
          - '*'

Outputs:
  PredictInjuryDurationApi:
    Description: "API Gateway endpoint URL for Prod stage for Inference function"
    Value: !Sub "https://${PredictInjuryDurationApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/predict-injury-duration/"
  LambdaFunctionArn:
    Description: "Injury Duration Prediction Lambda Function ARN"
    Value: !GetAtt PredictInjuryDurationLambda.Arn
Deploying the Lambda with AWS Serverless Application Model and Docker
To continue, you will need to install the following dependencies:
- AWS Serverless Application Model - https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-sam-cli.html
- Docker - https://docs.docker.com/engine/install/
After installing these dependencies, you can continue with the tutorial.
We can build and locally test our Lambda with three simple terminal commands:
- `sam build` - finds our CloudFormation template and builds the necessary artifacts for deploying to AWS
- `sam local start-api` - runs our Lambda locally without any expense, so we can confirm that everything is working as expected before deployment
- `sam deploy` - takes the built CloudFormation template resources and sends them to your AWS account to be deployed
Building the resources
Open up your terminal window, go into the folder where you created your `template.yaml` file (in my case, that’s the `PredictInjuryDurationLambda` folder) and run `sam build`. The expected output should look like the following (note that this output was captured with an earlier version of `requirements.txt` that still listed `optuna` and the full `xgboost` package, which is why you’ll see them in the log):
(venv) matia@matia-H510M-H-V2:./PredictInjuryDurationLambda$ sam build
Building codeuri: /home/matia/dev/cloudkey/mlops-cicd/injuries_lambda/PredictInjuryDurationLambda runtime: None architecture: x86_64 functions: PredictInjuryDurationLambda
Building image for PredictInjuryDurationLambda function
Setting DockerBuildArgs for PredictInjuryDurationLambda function
Step 1/7 : FROM public.ecr.aws/lambda/python:3.12
3.12: Pulling from lambda/python
8deb1a9ce5e3: Pull complete
99a4e43f82e3: Pull complete
0e56aa1f1c26: Pull complete
e2ef3e53683d: Pull complete
b9dec667dad3: Pull complete
94964360ff6a: Pull complete
Status: Downloaded newer image for public.ecr.aws/lambda/python:3.12 ---> ef61d0102ac9
Step 2/7 : COPY requirements.txt ./
---> bb551c013332
Step 3/7 : RUN pip install --no-cache-dir -r requirements.txt
---> Running in 4dd05ae833d0
Collecting numpy (from -r requirements.txt (line 1))
Downloading numpy-2.2.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB)
Collecting pandas (from -r requirements.txt (line 2))
Downloading pandas-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (89 kB)
Collecting optuna (from -r requirements.txt (line 3))
Downloading optuna-4.3.0-py3-none-any.whl.metadata (17 kB)
Collecting xgboost (from -r requirements.txt (line 4))
Downloading xgboost-3.0.2-py3-none-manylinux_2_28_x86_64.whl.metadata (2.1 kB)
Collecting scikit-learn (from -r requirements.txt (line 5))
Downloading scikit_learn-1.6.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (18 kB)
Collecting joblib (from -r requirements.txt (line 6))
Downloading joblib-1.5.1-py3-none-any.whl.metadata (5.6 kB)
Requirement already satisfied: python-dateutil>=2.8.2 in /var/lang/lib/python3.12/site-packages (from pandas->-r requirements.txt (line 2)) (2.9.0.post0)
Collecting pytz>=2020.1 (from pandas->-r requirements.txt (line 2))
Downloading pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas->-r requirements.txt (line 2))
Downloading tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting alembic>=1.5.0 (from optuna->-r requirements.txt (line 3))
Downloading alembic-1.16.1-py3-none-any.whl.metadata (7.3 kB)
Collecting colorlog (from optuna->-r requirements.txt (line 3))
Downloading colorlog-6.9.0-py3-none-any.whl.metadata (10 kB)
Collecting packaging>=20.0 (from optuna->-r requirements.txt (line 3))
Downloading packaging-25.0-py3-none-any.whl.metadata (3.3 kB)
Collecting sqlalchemy>=1.4.2 (from optuna->-r requirements.txt (line 3))
Downloading sqlalchemy-2.0.41-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.6 kB)
Collecting tqdm (from optuna->-r requirements.txt (line 3))
Downloading tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)
Collecting PyYAML (from optuna->-r requirements.txt (line 3))
Downloading PyYAML-6.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting nvidia-nccl-cu12 (from xgboost->-r requirements.txt (line 4))
Downloading nvidia_nccl_cu12-2.27.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (2.0 kB)
Collecting scipy (from xgboost->-r requirements.txt (line 4))
Downloading scipy-1.15.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Collecting threadpoolctl>=3.1.0 (from scikit-learn->-r requirements.txt (line 5))
Downloading threadpoolctl-3.6.0-py3-none-any.whl.metadata (13 kB)
Collecting Mako (from alembic>=1.5.0->optuna->-r requirements.txt (line 3))
Downloading mako-1.3.10-py3-none-any.whl.metadata (2.9 kB)
Collecting typing-extensions>=4.12 (from alembic>=1.5.0->optuna->-r requirements.txt (line 3))
Downloading typing_extensions-4.14.0-py3-none-any.whl.metadata (3.0 kB)
Requirement already satisfied: six>=1.5 in /var/lang/lib/python3.12/site-packages (from python-dateutil>=2.8.2->pandas->-r requirements.txt (line 2)) (1.17.0)
Collecting greenlet>=1 (from sqlalchemy>=1.4.2->optuna->-r requirements.txt (line 3))
Downloading greenlet-3.2.2-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (4.1 kB)
Collecting MarkupSafe>=0.9.2 (from Mako->alembic>=1.5.0->optuna->-r requirements.txt (line 3))
Downloading MarkupSafe-3.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.0 kB)
Downloading numpy-2.2.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.5/16.5 MB 6.6 MB/s eta 0:00:00
Downloading pandas-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.7/12.7 MB 6.8 MB/s eta 0:00:00
Downloading optuna-4.3.0-py3-none-any.whl (386 kB)
Downloading xgboost-3.0.2-py3-none-manylinux_2_28_x86_64.whl (253.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 253.9/253.9 MB 6.7 MB/s eta 0:00:00
Downloading scikit_learn-1.6.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.1/13.1 MB 7.4 MB/s eta 0:00:00
Downloading joblib-1.5.1-py3-none-any.whl (307 kB)
Downloading alembic-1.16.1-py3-none-any.whl (242 kB)
Downloading packaging-25.0-py3-none-any.whl (66 kB)
Downloading pytz-2025.2-py2.py3-none-any.whl (509 kB)
Downloading scipy-1.15.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (37.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 37.3/37.3 MB 7.2 MB/s eta 0:00:00
Downloading sqlalchemy-2.0.41-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 7.7 MB/s eta 0:00:00
Downloading threadpoolctl-3.6.0-py3-none-any.whl (18 kB)
Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB)
Downloading colorlog-6.9.0-py3-none-any.whl (11 kB)
Downloading nvidia_nccl_cu12-2.27.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (322.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 322.4/322.4 MB 6.9 MB/s eta 0:00:00
Downloading PyYAML-6.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (767 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 767.5/767.5 kB 9.5 MB/s eta 0:00:00
Downloading tqdm-4.67.1-py3-none-any.whl (78 kB)
Downloading greenlet-3.2.2-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (603 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 603.9/603.9 kB 6.1 MB/s eta 0:00:00
Downloading typing_extensions-4.14.0-py3-none-any.whl (43 kB)
Downloading mako-1.3.10-py3-none-any.whl (78 kB)
Downloading MarkupSafe-3.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23 kB)
Installing collected packages: pytz, tzdata, typing-extensions, tqdm, threadpoolctl, PyYAML, packaging, nvidia-nccl-cu12, numpy, MarkupSafe, joblib, greenlet, colorlog, sqlalchemy, scipy, pandas, Mako, xgboost, scikit-learn, alembic, optuna
Successfully installed Mako-1.3.10 MarkupSafe-3.0.2 PyYAML-6.0.2 alembic-1.16.1 colorlog-6.9.0 greenlet-3.2.2 joblib-1.5.1 numpy-2.2.6 nvidia-nccl-cu12-2.27.3 optuna-4.3.0 packaging-25.0 pandas-2.2.3 pytz-2025.2 scikit-learn-1.6.1 scipy-1.15.3 sqlalchemy-2.0.41 threadpoolctl-3.6.0 tqdm-4.67.1 typing-extensions-4.14.0 tzdata-2025.2 xgboost-3.0.2
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
[notice] A new release of pip is available: 24.3.1 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
---> Removed intermediate container 4dd05ae833d0
---> 17d74dbc2906
Step 4/7 : COPY lambda_handler.py ./
---> 04d29738de3f
Step 5/7 : COPY injury_model.pkl ./
---> 8af7e1760bdf
Step 6/7 : COPY model_metadata.json ./
---> 6e20d1175fb5
Step 7/7 : CMD ["lambda_handler.lambda_handler"]
---> Running in e80c76b0a956
---> Removed intermediate container e80c76b0a956
---> 49128c299b03
Successfully built 49128c299b03
Successfully tagged predictinjurydurationlambda:latest
Simple! Now let’s test out our Lambda locally to confirm that it’s working.
Testing with the local build
We’ll spin up our stack locally by running `sam local start-api`, which is a great way of verifying that our Lambda works as expected. Here is the terminal output, together with the Postman API call to our Lambda.
Feel free to use the payload I used for testing the Lambda:
{
  "features": {
    "age": 28,
    "is_contact": 1,
    "has_stopped": 1,
    "swelling": 2,
    "tone": 1,
    "palpation": 3,
    "is_contraction_painful": 1,
    "is_stretching_painful": 1,
    "class": 3,
    "is_proximal": 0,
    "is_abdominal": 0,
    "is_distal": 1,
    "fascia_depth": 2,
    "is_hamstring": 0,
    "is_quadriceps": 1,
    "is_add_abd": 0,
    "is_calf": 1,
    "is_belly": 0
  }
}
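If you’d rather test from a script than from Postman, here is a small sketch using Python’s `requests`. It assumes the payload above was saved as `test_payload.json` (a hypothetical helper file), and relies on `sam local start-api` serving on port 3000, its default:

```python
import json

import requests

with open('test_payload.json') as f:
    payload = json.load(f)

# Default local endpoint exposed by `sam local start-api`
response = requests.post(
    'http://127.0.0.1:3000/predict-injury-duration',
    json=payload,
)
print(response.status_code, response.json())
```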
And here are the results from the Postman API call:
You can see that the model predicted an injury duration of 17 days, based on these descriptors.
Deploying the Lambda to AWS account
You can deploy your Lambda and API Gateway by running the `sam deploy -g` command (the `-g` flag stands for “guided”) inside your terminal window, and SAM will guide you through the deployment process. The output will look like this:
(venv) matia@matia-H510M-H-V2:./PredictInjuryDurationLambda$ sam deploy -g
Configuring SAM deploy
======================
Looking for config file [samconfig.toml] : Found
Reading default arguments : Success
Setting default arguments for 'sam deploy'
=========================================
Stack Name [sam-app]: PredictInjuryStack
AWS Region [eu-central-1]:
#Shows you resources changes to be deployed and require a 'Y' to initiate deploy
Confirm changes before deploy [y/N]:
#SAM needs permission to be able to create roles to connect to the resources in your template
Allow SAM CLI IAM role creation [Y/n]:
#Preserves the state of previously provisioned resources when an operation fails
Disable rollback [y/N]:
PredictInjuryDurationLambda has no authentication. Is this okay? [y/N]: y
Save arguments to configuration file [Y/n]:
SAM configuration file [samconfig.toml]:
SAM configuration environment [default]:
Looking for resources needed for deployment:
Managed S3 bucket: aws-sam-cli-managed-default-samclisourcebucket-vvz5tfqz29f9
A different default S3 bucket can be set in samconfig.toml and auto resolution of buckets turned off by setting resolve_s3=False
Image repositories: Not found.
#Managed repositories will be deleted when their functions are removed from the template and deployed
Create managed ECR repositories for all functions? [Y/n]:
Saved arguments to config file
Running 'sam deploy' for future deployments will use the parameters saved above.
The above parameters can be changed by modifying samconfig.toml
Learn more about samconfig.toml syntax at
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-config.html
75bd0feec80a: Pushed
af455ae28cb4: Pushed
4761438a3d36: Pushed
1687a3ac8fa7: Pushed
8f989553341c: Pushed
c678e1c5e5d8: Pushed
5964a0804fb6: Pushed
6a9b57324378: Pushed
f27b91471588: Pushed
af56b219ad31: Pushed
0b81e7a3683d: Pushed
predictinjurydurationlambda-49128c299b03-latest: digest: sha256:fa1c14414b5029b79f0e97c5647331729beb42891e9eb814b8f77c0a5cda5e15 size: 2621
Deploying with following values
===============================
Stack name : PredictInjuryStack
Region : eu-central-1
Confirm changeset : False
Disable rollback : False
Deployment image repository :
{
"PredictInjuryDurationLambda": "9257xxxxxxxx.dkr.ecr.eu-central-1.amazonaws.com/predictinjurystack1f00e9c8/predictinjurydurationlambdad41e7e42repo"
}
Deployment s3 bucket : aws-sam-cli-managed-default-samclisourcebucket-vvz5tfqz29f9
Capabilities : ["CAPABILITY_IAM"]
Parameter overrides : {}
Signing Profiles : {}
Initiating deployment
=====================
PredictInjuryDurationLambda has no authentication.
Uploading to PredictInjuryStack/335eac6f8b2db4755ca431660bfaf17f.template 1578 / 1578 (100.00%)
Waiting for changeset to be created..
CloudFormation stack changeset
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Operation LogicalResourceId ResourceType Replacement
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Add PredictInjuryDurationApiApiGatewayDefaultSta AWS::ApiGatewayV2::Stage N/A
ge
+ Add PredictInjuryDurationApi AWS::ApiGatewayV2::Api N/A
+ Add PredictInjuryDurationLambdaInferencePermissi AWS::Lambda::Permission N/A
on
+ Add PredictInjuryDurationLambdaRole AWS::IAM::Role N/A
+ Add PredictInjuryDurationLambda AWS::Lambda::Function N/A
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Changeset created successfully. arn:aws:cloudformation:eu-central-1:9257xxxxxxx:changeSet/samcli-deploy1749059624/b0ee6116-2396-4322-bd2e-7a9467225a40
2025-06-04 19:53:50 - Waiting for stack create/update to complete
CloudFormation events from stack operations (refresh every 5.0 seconds)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ResourceStatus ResourceType LogicalResourceId ResourceStatusReason
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CREATE_IN_PROGRESS AWS::CloudFormation::Stack PredictInjuryStack User Initiated
CREATE_IN_PROGRESS AWS::IAM::Role PredictInjuryDurationLambdaRole -
CREATE_IN_PROGRESS AWS::IAM::Role PredictInjuryDurationLambdaRole Resource creation Initiated
CREATE_COMPLETE AWS::IAM::Role PredictInjuryDurationLambdaRole -
CREATE_IN_PROGRESS AWS::Lambda::Function PredictInjuryDurationLambda -
CREATE_IN_PROGRESS AWS::Lambda::Function PredictInjuryDurationLambda Resource creation Initiated
CREATE_IN_PROGRESS - CONFIGURATION_COMPLETE AWS::Lambda::Function PredictInjuryDurationLambda Eventual consistency check initiated
CREATE_IN_PROGRESS AWS::ApiGatewayV2::Api PredictInjuryDurationApi -
CREATE_IN_PROGRESS AWS::ApiGatewayV2::Api PredictInjuryDurationApi Resource creation Initiated
CREATE_COMPLETE AWS::ApiGatewayV2::Api PredictInjuryDurationApi -
CREATE_IN_PROGRESS AWS::Lambda::Permission PredictInjuryDurationLambdaInferencePermissi -
on
CREATE_IN_PROGRESS AWS::Lambda::Permission PredictInjuryDurationLambdaInferencePermissi Resource creation Initiated
on
CREATE_COMPLETE AWS::Lambda::Permission PredictInjuryDurationLambdaInferencePermissi -
on
CREATE_COMPLETE AWS::Lambda::Function PredictInjuryDurationLambda -
CREATE_IN_PROGRESS AWS::ApiGatewayV2::Stage PredictInjuryDurationApiApiGatewayDefaultSta -
ge
CREATE_IN_PROGRESS AWS::ApiGatewayV2::Stage PredictInjuryDurationApiApiGatewayDefaultSta Resource creation Initiated
ge
CREATE_COMPLETE AWS::ApiGatewayV2::Stage PredictInjuryDurationApiApiGatewayDefaultSta -
ge
CREATE_COMPLETE AWS::CloudFormation::Stack PredictInjuryStack -
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CloudFormation outputs from deployed stack
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Outputs
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key PredictInjuryDurationApi
Description API Gateway endpoint URL for Prod stage for PredictInjuryDurationApi function
Value https://hfd90813bg.execute-api.eu-central-1.amazonaws.com/Prod/predict-injury-duration/
Key LambdaFunctionArn
Description Injury Duration Prediction Lambda Function ARN
Value arn:aws:lambda:eu-central-1:9257xxxxxxx:function:PredictInjuryStack-PredictInjuryDurationLambda-B78GP4IFvJbf
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Successfully created/updated stack - PredictInjuryStack in eu-central-1
And that’s it! Your Lambda containing the AI model is successfully deployed. You can take your `PredictInjuryDurationApi` output and put it inside an API testing tool, like Postman, to see your Lambda in action!
We’re going to use the same payload we used for testing the Lambda locally, and the results are the same, which confirms that everything is working!
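The same scripted check works against the deployed endpoint; just paste in the `PredictInjuryDurationApi` value from your stack outputs (the URL below is a placeholder):

```python
import json

import requests

# Replace with the PredictInjuryDurationApi output of your own stack
API_URL = 'https://<api-id>.execute-api.<region>.amazonaws.com/Prod/predict-injury-duration/'

with open('test_payload.json') as f:
    payload = json.load(f)

response = requests.post(API_URL, json=payload)
print(response.status_code, response.json())
```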
It is important to note that the Docker image is deployed to Elastic Container Registry (ECR) on AWS. When you delete the resources created in this project, make sure to also remove the ECR repository and all images stored in it.
Conclusion
Congratulations! You’ve successfully built and deployed a production-ready AI model with AWS Lambda and Docker. Let’s go over what we’ve learned in this post:
- writing a complete training script, with XGBoost and Optuna as our AI model and optimizer combination
- containerizing a Lambda function to handle real-time predictions
- deploying the containerized Lambda and making it accessible from anywhere on the Internet
This project is a very cost-effective solution: there are no servers to provision and maintain, everything runs serverless - isn’t it awesome?
Thank you for reading! Wishing you the greatest day!