
Shrestha Pandey

How to Deploy Your ML Model to AWS (Step-by-Step Guide)

I've trained more ML models than I've deployed. There's something comforting about the local loop: model.fit(), model.evaluate(), hitting 94% accuracy, then staring at the screen wondering, "Okay, how do I make this actually useful?"

If you're stuck there right now, this guide will help.

Note: I wrote this based on AWS documentation and standard SageMaker patterns. If you try it, drop a comment about what worked (or broke).

What You Need Before Starting

  • AWS account with SageMaker enabled
  • A trained model saved as model.pkl (or .joblib)
  • requirements.txt with your dependencies
  • Python 3.8+ installed
  • AWS CLI configured (aws configure)

Step 1: Save Your Model

import joblib
joblib.dump(model, 'model.pkl')

Create a requirements.txt file:

scikit-learn==1.2.0
pandas==1.5.0
numpy==1.23.0

Keep both files in the same folder.
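
The pinned versions should match the environment you trained in, because a pickled scikit-learn model may fail to load (or load with warnings) under a different library version. Here is a minimal sketch that reads the installed versions with the standard library's importlib.metadata. The helper name pin_requirements is my own, not part of any library, and the package list is just the one from this guide.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_requirements(packages):
    """Return 'name==version' lines for packages installed in this environment.

    pin_requirements is a hypothetical helper written for this guide.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Leave a visible marker instead of guessing a version
            lines.append(f"# {name}: not installed here")
    return lines


if __name__ == "__main__":
    # Use PyPI distribution names: 'scikit-learn', not 'sklearn'
    print("\n".join(pin_requirements(["scikit-learn", "pandas", "numpy"])))
```

Redirect the output into requirements.txt once the lines look right.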

Step 2: Upload to S3

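import tarfile
import boto3

s3 = boto3.client('s3')

bucket_name = 'my-unique-ml-bucket-12345'  # Bucket names are globally unique, so change this

# us-east-1 is the default region and must not be passed as a LocationConstraint.
# For any other region, add CreateBucketConfiguration={'LocationConstraint': region}.
s3.create_bucket(Bucket=bucket_name)

# SageMaker expects the model artifact as a gzipped tarball, not a bare .pkl
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('model.pkl')

s3.upload_file('model.tar.gz', bucket_name, 'models/model.tar.gz')
s3.upload_file('requirements.txt', bucket_name, 'models/requirements.txt')

model_s3_path = f's3://{bucket_name}/models/model.tar.gz'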
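
One detail that trips people up when creating the bucket: S3 rejects a LocationConstraint of us-east-1, while every other region requires one. A small sketch of how you might build the arguments for either case. The function name create_bucket_kwargs and the example region are assumptions of mine, not part of boto3.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for boto3's s3.create_bucket for a given region.

    create_bucket_kwargs is a hypothetical helper written for this guide.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        # Every region except us-east-1 needs an explicit LocationConstraint
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


# Usage (needs AWS credentials, so it is left commented out):
# import boto3
# region = "eu-west-1"  # example region, pick your own
# s3 = boto3.client("s3", region_name=region)
# s3.create_bucket(**create_bucket_kwargs("my-unique-ml-bucket-12345", region))
```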

Step 3: Write Your Inference Script

Save this as inference.py:

import json
import joblib
import numpy as np
import os

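def model_fn(model_dir):
    # Called once when the container starts; model_dir is where SageMaker unpacks the model artifact
    return joblib.load(os.path.join(model_dir, 'model.pkl'))

def input_fn(input_data, content_type):
    # Deserialize the request body into a 2D array of shape (n_samples, n_features)
    if content_type == 'application/json':
        data = json.loads(input_data)
        return np.array(data['features'])
    raise ValueError(f"Unsupported content type: {content_type}")

def predict_fn(input_data, model):
    return model.predict(input_data)

def output_fn(prediction, accept):
    # accept is the response type the client asked for; this script always returns JSON
    return json.dumps({'predictions': prediction.tolist()})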

SageMaker calls model_fn once when the container starts, then runs input_fn, predict_fn, and output_fn, in that order, each time someone hits your endpoint.
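
You can catch most handler bugs before paying for an endpoint by chaining the functions locally in the order the SageMaker scikit-learn container calls them. This is a rough sketch, not SageMaker's actual serving code: simulate_request is a name I made up, and the stand-in handlers below replace the real model with a function that sums each row so the example runs without a trained model.

```python
import json
from types import SimpleNamespace


def simulate_request(handlers, model_dir, body,
                     content_type="application/json", accept="application/json"):
    """Chain the four handlers the way a single request would.

    `handlers` is any object exposing model_fn, input_fn, predict_fn and
    output_fn, for example the imported inference module.
    """
    model = handlers.model_fn(model_dir)            # happens once at startup in the real container
    data = handlers.input_fn(body, content_type)    # deserialize the request body
    prediction = handlers.predict_fn(data, model)   # run the model
    return handlers.output_fn(prediction, accept)   # serialize the response


# Stand-in handlers (hypothetical): the fake "model" sums each row of features.
fake = SimpleNamespace(
    model_fn=lambda model_dir: (lambda rows: [sum(r) for r in rows]),
    input_fn=lambda body, content_type: json.loads(body)["features"],
    predict_fn=lambda data, model: model(data),
    output_fn=lambda prediction, accept: json.dumps({"predictions": prediction}),
)

print(simulate_request(fake, ".", json.dumps({"features": [[1, 2, 3]]})))
# prints {"predictions": [6]}
```

With the real script, import inference and pass the module itself as handlers, with model_dir pointing at the folder that holds model.pkl.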

Step 4: Deploy Using Python SDK

Run this in a Python script:

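from sagemaker.sklearn.model import SKLearnModel
from sagemaker import get_execution_role

# get_execution_role() works inside SageMaker notebooks and Studio.
# Elsewhere, pass the ARN of an IAM role that SageMaker can assume.
sklearn_model = SKLearnModel(
    model_data=model_s3_path,
    role=get_execution_role(),
    entry_point='inference.py',
    framework_version='1.2-1',  # SageMaker scikit-learn container version; match your scikit-learn
    py_version='py3'
)

# The instance type belongs to deploy(), not to the model constructor
sklearn_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large',
    endpoint_name='my-model-endpoint'
)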

This takes 5–10 minutes. You'll see the endpoint status go from Creating → InService.
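
deploy() blocks until the endpoint is ready by default, but if you check on it from another script you need to poll the status yourself. Below is a minimal polling sketch. The function wait_for_endpoint and its parameters are my own invention; the status lookup is passed in as a callable so the loop can be tried without an AWS account.

```python
import time


def wait_for_endpoint(get_status, poll_seconds=30, timeout_seconds=1800, sleep=time.sleep):
    """Poll get_status() until it returns 'InService'; raise on failure or timeout.

    wait_for_endpoint is a hypothetical helper written for this guide.
    """
    waited = 0
    while True:
        status = get_status()
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError("Endpoint creation failed; check FailureReason and the CloudWatch logs")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Endpoint still '{status}' after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage (needs AWS credentials, so it is left commented out):
# import boto3
# sm = boto3.client("sagemaker")
# wait_for_endpoint(
#     lambda: sm.describe_endpoint(EndpointName="my-model-endpoint")["EndpointStatus"]
# )
```

boto3 also ships a ready-made waiter, sm.get_waiter("endpoint_in_service"), if you would rather not write the loop.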

Step 5: Test Your Endpoint

import boto3
import json

runtime = boto3.client('sagemaker-runtime')

response = runtime.invoke_endpoint(
    EndpointName='my-model-endpoint',
    ContentType='application/json',
    Body=json.dumps({'features': [[5.1, 3.5, 1.4, 0.2]]})
)

result = json.loads(response['Body'].read().decode())
print(result)

If you see {'predictions': [...]}, it worked.
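
Note that features is a list of rows, not a flat list. scikit-learn's predict() expects a 2D array, so sending [5.1, 3.5, 1.4, 0.2] without the outer brackets will most likely come back as an error from the endpoint. A small sketch of a payload builder that catches this on the client side; build_payload is a hypothetical helper, not part of boto3 or SageMaker.

```python
import json


def build_payload(rows):
    """Serialize a batch of feature rows as {'features': [[...], ...]}.

    build_payload is a hypothetical helper written for this guide.
    """
    if not rows or not all(isinstance(row, (list, tuple)) for row in rows):
        raise ValueError("rows must be a non-empty list of feature lists, e.g. [[5.1, 3.5, 1.4, 0.2]]")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("every row must have the same number of features")
    return json.dumps({"features": [list(row) for row in rows]})


print(build_payload([[5.1, 3.5, 1.4, 0.2]]))
# prints {"features": [[5.1, 3.5, 1.4, 0.2]]}
```

Pass the returned string as Body in invoke_endpoint.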

Step 6: Clean Up

Endpoints cost money even when idle:

aws sagemaker delete-endpoint --endpoint-name my-model-endpoint
aws sagemaker delete-endpoint-config --endpoint-config-name my-model-endpoint

Common Errors (And Fixes)

Error | Fix
NoCredentialsError | Run aws configure again
InvalidRoleException | IAM role needs S3 + SageMaker permissions
ModelError | Check inference.py for missing imports
Endpoint stuck on Creating | Wait 5–10 more minutes

Your IAM role needs:

  • s3:GetObject, s3:PutObject
  • sagemaker:CreateModel, sagemaker:CreateEndpoint

Cost Breakdown

Resource | Cost
ml.m5.large | ~$0.20/hour (~$144/month if 24/7)
S3 storage | ~$0.02/GB/month

Delete when not using. I've seen $50 surprises from idle endpoints.
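
The arithmetic is simple: hourly rate times hours running. Here is a tiny sketch you can adapt. The function name monthly_cost is mine, and the ~$0.20/hour figure is the rough estimate from this guide, so check current SageMaker pricing for your region before relying on it.

```python
def monthly_cost(hourly_rate, hours_per_day=24, days=30):
    """Estimated cost in dollars of keeping one instance running.

    monthly_cost is a hypothetical helper written for this guide.
    """
    return round(hourly_rate * hours_per_day * days, 2)


print(monthly_cost(0.20))                   # endpoint left on around the clock
print(monthly_cost(0.20, hours_per_day=1))  # endpoint up for one hour a day
```

The gap between those two numbers is why deleting idle endpoints matters.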

Verify This Before You Trust It

If you're following this, check:

  1. AWS SDK version: run pip show boto3 sagemaker
  2. IAM role permissions: the biggest blocker is usually missing permissions
  3. Region mismatch: the S3 bucket region must match the SageMaker region
  4. inference.py imports: make sure joblib and numpy are available in the container (os is part of the standard library)
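
For the region check, one quirk to know: s3.get_bucket_location reports buckets in us-east-1 with a LocationConstraint of None, so a naive string comparison gives a false mismatch. A small sketch of a comparison that handles this. The helper names bucket_region and regions_match are my own.

```python
def bucket_region(location_constraint):
    """Map the LocationConstraint from s3.get_bucket_location to a region name.

    bucket_region and regions_match are hypothetical helpers written for this guide.
    """
    if location_constraint is None:
        return "us-east-1"  # S3 reports us-east-1 buckets as None
    if location_constraint == "EU":
        return "eu-west-1"  # legacy value still returned for some old buckets
    return location_constraint


def regions_match(location_constraint, sagemaker_region):
    """True when the bucket lives in the region SageMaker will run in."""
    return bucket_region(location_constraint) == sagemaker_region


# Usage (needs AWS credentials, so it is left commented out):
# import boto3
# s3 = boto3.client("s3")
# loc = s3.get_bucket_location(Bucket="my-unique-ml-bucket-12345")["LocationConstraint"]
# print(regions_match(loc, boto3.session.Session().region_name))
```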

If something breaks, comment below with the error. I'll update this guide.

Final Thoughts

Deploying ML feels intimidating until you do it once. SageMaker handles most of the complexity. You just upload your model to S3, point SageMaker at it, and deploy.

I've trained models that sat on my laptop for months because I didn't know how to deploy them. Now I tell people: "Just run this script, it's not that hard."


If you're building something with this, drop a comment. I love seeing what people deploy.
