Training a model in a Jupyter Notebook is satisfying. But deploying it? That's where the headaches usually start. Today, I'm going to show you how to deploy a Keras image classifier using AWS Lambda and TensorFlow Lite.
Why Serverless?
AWS Lambda is "serverless," meaning you don't manage servers, an OS, or hardware. You just upload code, and it's cheap because you pay only for the time your code actually runs.
The Heavyweight Problem 🏋️
Standard TensorFlow is huge (roughly 1.7 GB installed). Try to shove that into a Lambda function and you hit the deployment package size limit (250 MB unzipped for zip-based functions) and painfully slow cold starts.
The Lightweight Solution ⚡
We use TensorFlow Lite instead. The tflite-runtime package is only a few megabytes, and the converted model is optimized for inference (prediction) only, with all of the training logic stripped out.
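For context, converting a trained Keras model into the .tflite format takes only a few lines. The sketch below is an assumption-laden example: it supposes the trained model was saved as clothing-model.h5 (your filename may differ) and that the conversion runs on the training machine, where full TensorFlow is still available.

import tensorflow as tf
from tensorflow import keras

# Load the trained Keras model (full TensorFlow is only needed for this one-off conversion step)
model = keras.models.load_model('clothing-model.h5')

# Convert the model to the TF-Lite format used by the lightweight runtime
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the converted model; this is the file the Lambda function will load
with open('clothing-model.tflite', 'wb') as f_out:
    f_out.write(tflite_model)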
Step 1: The Handler Code
Your Python script needs a special function to handle the AWS event:
import tflite_runtime.interpreter as tflite
from keras_image_helper import create_preprocessor

# Load the model once, at import time, so warm invocations can reuse it
interpreter = tflite.Interpreter(model_path='clothing-model.tflite')
interpreter.allocate_tensors()

def lambda_handler(event, context):
    url = event['url']
    # ... preprocessing and inference logic ...
    return result
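To make the elided part concrete, here is one way the full handler could look. Treat it as a sketch rather than the definitive version: the 'xception' preprocessor, the 299x299 input size, and the class labels below are assumptions that must match how your own model was trained.

import tflite_runtime.interpreter as tflite
from keras_image_helper import create_preprocessor

interpreter = tflite.Interpreter(model_path='clothing-model.tflite')
interpreter.allocate_tensors()

# Locate the model's input and output tensors
input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']

# Assumption: the model expects Xception-style preprocessing at 299x299
preprocessor = create_preprocessor('xception', target_size=(299, 299))

# Assumption: these labels match the order of the model's output neurons
classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants',
           'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']

def predict(url):
    X = preprocessor.from_url(url)          # download and preprocess the image
    interpreter.set_tensor(input_index, X)  # feed the batch to the model
    interpreter.invoke()                    # run inference
    preds = interpreter.get_tensor(output_index)
    return dict(zip(classes, preds[0].tolist()))

def lambda_handler(event, context):
    return predict(event['url'])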
Step 2: The Dockerfile
We use Docker to package our dependencies. Crucial Tip: when installing the TF-Lite runtime from a URL, make sure you use the raw version of the link (the direct .whl download, not the page that links to it), or pip will throw a BadZipFile error.
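Here is a minimal sketch of such a Dockerfile. It assumes a Python 3.8 Lambda base image, a handler saved as lambda_function.py, and a prebuilt tflite_runtime wheel; the wheel URL below is a placeholder, so point it at the actual raw .whl link for your Python version and architecture.

FROM public.ecr.aws/lambda/python:3.8

RUN pip install keras-image-helper

# Placeholder URL: use the direct (raw) link to the .whl file, not the page that links to it,
# otherwise pip downloads HTML and fails with a BadZipFile error
RUN pip install https://example.com/wheels/tflite_runtime-2.7.0-cp38-cp38-linux_x86_64.whl

COPY clothing-model.tflite .
COPY lambda_function.py .

CMD [ "lambda_function.lambda_handler" ]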
Step 3: Deploy with Serverless Framework
Instead of clicking buttons in the AWS console, we can use a serverless.yml file to describe our infrastructure:
service: clothing-model

provider:
  name: aws
  ecr:
    images:
      appimage:
        path: ./

functions:
  predict:
    image:
      name: appimage
    events:
      - http:
          path: predict
          method: post
Running serverless deploy handles the Docker build, the ECR push, and the creation of the Lambda function and its API Gateway endpoint automatically!
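When the deploy finishes, the framework prints the API Gateway endpoint it created. A quick smoke test from Python might look like this; the endpoint and image URL below are placeholders for whatever your own deploy prints and whatever image you want to classify.

import requests

# Placeholder endpoint: use the URL that `serverless deploy` prints for your stack
url = 'https://abc123.execute-api.us-east-1.amazonaws.com/dev/predict'

# Placeholder image URL to classify
data = {'url': 'https://example.com/some-clothing-image.jpg'}

result = requests.post(url, json=data).json()
print(result)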
Happy coding!