AWS re:Invent 2021 — Serverless Inference on SageMaker! FOR REAL!

#serverless #mlops #machinelearning #deeplearning

At long last, Amazon SageMaker supports serverless endpoints.

In this video, I demo this newly launched capability, named Serverless Inference. Starting from a pre-trained DistilBERT model on the Hugging Face model hub, I fine-tune it for sentiment analysis on the IMDB movie review dataset. Then, I deploy the model to a serverless endpoint, and I run multi-threaded benchmarks with short and long token sequences. Finally, I plot latency numbers and compute latency quantiles.
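The benchmarking step can be sketched roughly as follows. This is a minimal illustration, not the code used in the video: `fake_predict` is a hypothetical stand-in for a call to the deployed endpoint (it only sleeps in proportion to the input length), and the helper names, thread count, and request count are assumptions chosen for the example.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(predict, payload):
    """Return the wall-clock latency of one prediction, in milliseconds."""
    start = time.perf_counter()
    predict(payload)
    return (time.perf_counter() - start) * 1000.0


def benchmark(predict, payload, num_requests=100, num_threads=4):
    """Send num_requests copies of payload using num_threads worker threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(timed_call, predict, payload) for _ in range(num_requests)
        ]
        return [f.result() for f in futures]


def latency_quantiles(latencies_ms, percentiles=(50, 90, 95, 99)):
    """Map each percentile (e.g. 99 for P99) to a latency in milliseconds."""
    # statistics.quantiles with n=100 returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {f"p{p}": cuts[p - 1] for p in percentiles}


def fake_predict(payload):
    """Hypothetical endpoint call: sleeps longer for longer inputs."""
    num_tokens = len(payload["inputs"].split())
    time.sleep(0.00002 * num_tokens)


if __name__ == "__main__":
    short_payload = {"inputs": "great movie"}
    long_payload = {"inputs": "great movie " * 100}
    for name, payload in [("short", short_payload), ("long", long_payload)]:
        latencies = benchmark(fake_predict, payload, num_requests=50, num_threads=4)
        print(name, latency_quantiles(latencies))
```

To measure a real endpoint, `fake_predict` would be replaced by a function that sends the payload to the endpoint and waits for the response; the list of latencies returned by `benchmark` can also be passed to a plotting library.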
