<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ayush Rastogi</title>
    <description>The latest articles on DEV Community by Ayush Rastogi (@erm0rden).</description>
    <link>https://dev.to/erm0rden</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F511788%2F3bf64d1e-87de-4146-92f4-70e0fb4a49be.jpg</url>
      <title>DEV Community: Ayush Rastogi</title>
      <link>https://dev.to/erm0rden</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/erm0rden"/>
    <language>en</language>
    <item>
      <title>Different ways of making REST Endpoint inference with Sagemaker SDK</title>
      <dc:creator>Ayush Rastogi</dc:creator>
      <pubDate>Mon, 02 May 2022 04:35:50 +0000</pubDate>
      <link>https://dev.to/erm0rden/different-ways-of-making-rest-endpoint-inference-with-sagemaker-sdk-241j</link>
      <guid>https://dev.to/erm0rden/different-ways-of-making-rest-endpoint-inference-with-sagemaker-sdk-241j</guid>
      <description>&lt;p&gt;In this article we are going to see two ways of making inference from a machine learning model endpoint using Sagemaker SDK. The machine learning model is trained on 'Amazon Customer Reviews Dataset' and can predict customer review rating on a scale of 1-5. As a part of transfer learning, I have used BERT model with pre-training carried out through DistilBERT, using Hugging face transformer, which provides bi-directional attention, and TensorFlow. Details regarding model implementation are shown below. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3bRWUoM0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/a33nn5ht66w13oocut9p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3bRWUoM0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/a33nn5ht66w13oocut9p.png" alt="High-level ML model details" width="880" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first step for the inference script is to include the boilerplate details such as the session, bucket, role, region, account ID, and SageMaker object. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rxdUcijI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tb2ed8mluwyhxqt96hyt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rxdUcijI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tb2ed8mluwyhxqt96hyt.png" alt="Boiler plate Sagemaker setup code" width="550" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The second step is to include the details recorded during the model deployment stage: the training job name, the model name, and the endpoint details. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CmSOLLIJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gnrfqwq4s0lxwuk4uq9m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CmSOLLIJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gnrfqwq4s0lxwuk4uq9m.png" alt="Retrieving model endpoint" width="429" height="176"&gt;&lt;/a&gt;&lt;/p&gt;
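&lt;p&gt;In code, this step is just a matter of recording the names from your own deployment run. The specific names and the naming scheme below are placeholders, not the values in the screenshot. &lt;/p&gt;

```python
# Placeholder values; in practice these come from your own
# training and deployment run.
training_job_name = "tensorflow-training-2022-05-01-00-00-00-000"
model_name = training_job_name                 # model created from the job
endpoint_name = f"{model_name}-endpoint"       # assumed naming scheme


def endpoint_status(endpoint_name):
    # Confirm the endpoint exists and is InService before predicting.
    import boto3
    sm = boto3.client("sagemaker")
    return sm.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
```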

&lt;p&gt;Using the deployment-stage details, a predictor object needs to be created that includes the session details and the endpoint name. We use JSONSerializer and JSONDeserializer to serialize the input into JSON and to deserialize the endpoint's JSON response back into a Python object. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TvoQV28T--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3mgz8ydgw5qkn60ec2yi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TvoQV28T--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3mgz8ydgw5qkn60ec2yi.png" alt="Creating predictor object" width="429" height="289"&gt;&lt;/a&gt;&lt;/p&gt;
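&lt;p&gt;A sketch of this step is shown below, assuming SageMaker Python SDK v2. The serialize/deserialize helpers only illustrate conceptually what JSONSerializer and JSONDeserializer do on the wire. &lt;/p&gt;

```python
import json


def serialize(obj):
    # What JSONSerializer does, conceptually: Python object to JSON bytes.
    return json.dumps(obj).encode("utf-8")


def deserialize(body):
    # What JSONDeserializer does, conceptually: JSON bytes to Python object.
    return json.loads(body.decode("utf-8"))


def make_predictor(endpoint_name, session=None):
    # Predictor bound to an existing endpoint, with JSON in and JSON out.
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import JSONSerializer
    from sagemaker.deserializers import JSONDeserializer

    return Predictor(
        endpoint_name=endpoint_name,
        sagemaker_session=session,
        serializer=JSONSerializer(),
        deserializer=JSONDeserializer(),
    )
```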

&lt;p&gt;Category 1: Real-time/Ad-hoc predictions &lt;br&gt;
Once this predictor is created, it can be used to make ad-hoc predictions by providing inputs as features, as shown below. Here, predictor.predict() is used to make real-time predictions against the SageMaker endpoint with Python objects. Four example Amazon reviews are shown. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ryOqQB60--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/t7t9bs32vfceh2uaq5hr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ryOqQB60--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/t7t9bs32vfceh2uaq5hr.png" alt="Real-time predictions" width="880" height="302"&gt;&lt;/a&gt;&lt;/p&gt;
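&lt;p&gt;The ad-hoc prediction step can be sketched as follows. The request shape (a "features" record per review) and the response shape (a "predicted_label" per record) are assumptions about this particular model, not a general SageMaker contract. &lt;/p&gt;

```python
def predict_reviews(predictor, reviews):
    # Real-time inference: each review text becomes one "features" record.
    inputs = [{"features": [review]} for review in reviews]
    predicted = predictor.predict(inputs)
    # Assumed response shape: one {"predicted_label": n} dict per input.
    return [p["predicted_label"] for p in predicted]


def format_results(reviews, labels):
    # Pair each review with its predicted star rating for display.
    return [f"{label} stars: {review}" for review, label in zip(reviews, labels)]
```

Usage: `predict_reviews(predictor, ["I love it!", "Broke after a day."])` returns one predicted star rating per review.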

&lt;p&gt;Category 2: Batch predictions&lt;br&gt;
The reviews are stored in a .tsv (tab-separated values) file and can be processed all at once. Predictions were made for a sample of 1,000 out of 102,084 reviews, which took 5 minutes 45 seconds on the compute being used. By passing an argument of 10 to the df_sample_reviews.head() call, we can retrieve a small number of rows and validate the predictions. The column serving as the input is 'review body', and predictions are compared against 'star rating'. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---KsORxkO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/op0a725lm27zki5hdhpb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---KsORxkO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/op0a725lm27zki5hdhpb.png" alt="Batch predictions" width="617" height="762"&gt;&lt;/a&gt;&lt;/p&gt;
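&lt;p&gt;A rough sketch of the batch flow is shown below. The underscore column names (review_body, star_rating) and the per-call request shape are assumptions; chunking keeps each predict() call within the endpoint's payload limits. &lt;/p&gt;

```python
def batched(items, size):
    # Split the reviews into fixed-size chunks so each predict() call
    # stays within the endpoint's payload limits.
    return [items[i:i + size] for i in range(0, len(items), size)]


def predict_tsv(predictor, path, n_samples=1000, batch_size=100):
    # pandas imported lazily; reads the tab-separated review file.
    import pandas as pd

    df = pd.read_csv(path, sep="\t")
    df_sample_reviews = df.sample(n=n_samples, random_state=42)
    labels = []
    for chunk in batched(list(df_sample_reviews["review_body"]), batch_size):
        inputs = [{"features": [review]} for review in chunk]
        labels.extend(p["predicted_label"] for p in predictor.predict(inputs))
    df_sample_reviews["predicted_label"] = labels
    # Compare predictions against the ground-truth star rating.
    return df_sample_reviews[["review_body", "star_rating", "predicted_label"]]
```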

&lt;p&gt;Predictions can also be made by invoking the model's SageMaker endpoint directly with an HTTP request/response. This approach is primarily used when models are deployed as microservices in a production environment. &lt;/p&gt;
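&lt;p&gt;The direct-invocation path can be sketched with boto3's sagemaker-runtime client and its InvokeEndpoint API, which is what a microservice would call; the JSON body shape is again an assumption about this model. &lt;/p&gt;

```python
import json


def build_request_body(reviews):
    # JSON body for the InvokeEndpoint call (shape assumed for this model).
    return json.dumps([{"features": [r]} for r in reviews])


def invoke(endpoint_name, reviews):
    # Low-level path: bypass the SDK Predictor and call the runtime
    # API directly, as a production microservice would.
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=build_request_body(reviews),
    )
    return json.loads(response["Body"].read())
```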

&lt;p&gt;References: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data Science on AWS, Chris Fregly and Antje Barth&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/index.html?nc2=h_ql_doc_do_v"&gt;https://docs.aws.amazon.com/index.html?nc2=h_ql_doc_do_v&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://sagemaker.readthedocs.io/en/stable/"&gt;https://sagemaker.readthedocs.io/en/stable/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>sagemaker</category>
      <category>machinelearning</category>
      <category>rest</category>
    </item>
    <item>
      <title>Brief Intro to Services required for Sagemaker</title>
      <dc:creator>Ayush Rastogi</dc:creator>
      <pubDate>Sun, 01 May 2022 22:17:44 +0000</pubDate>
      <link>https://dev.to/erm0rden/brief-intro-to-services-required-for-sagemaker-4jbk</link>
      <guid>https://dev.to/erm0rden/brief-intro-to-services-required-for-sagemaker-4jbk</guid>
      <description>&lt;p&gt;In this article we are going to take a look at some of AWS services required for setup of AWS Sagemaker. Once you have an understanding of the services listed in this blog, you can start using sagemaker for machine learning. This will ensure you have a safe environment to provision AWS services and allow compute resources. Knowledge about some of these services is critical as it helps in determining the right decisions to comply with AWS shared responsibility model while taking security into consideration. The model is shown below.&lt;br&gt;
Some of the services we would need include: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;IAM account&lt;/li&gt;
&lt;li&gt;S3 bucket&lt;/li&gt;
&lt;li&gt;VPC &lt;/li&gt;
&lt;li&gt;EC2 instances. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XQ1O3nvL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bgbc0vd64eylkwpa0tb3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XQ1O3nvL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bgbc0vd64eylkwpa0tb3.png" alt="AWS Shared Responsibility Model" width="880" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's go into the detail of what each service is and how it can be provisioned. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IAM (Identity and Access Management)&lt;/strong&gt;: This falls under the 'security, identity and compliance' category in the AWS list of services. To access AWS services, you need to provision IAM roles to gain and manage access to AWS resources. IAM policies are attached to specific roles, providing granular permissions and authentication to users for specific services. IAM implements the concepts of principals, actions, resources, and conditions, defining which principals can perform which actions on which resources, and under which conditions. Two of the most commonly used policy types are identity-based policies and resource-based policies. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hlxT28IS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wgtctrhroks59y4v60jd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hlxT28IS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wgtctrhroks59y4v60jd.png" alt="AWS IAM" width="849" height="370"&gt;&lt;/a&gt;&lt;/p&gt;
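&lt;p&gt;The principals/actions/resources/conditions idea is easiest to see in an actual policy document. Below is an illustrative identity-based policy; the bucket name and role name are placeholders, but the Version/Statement schema is the standard IAM policy grammar. &lt;/p&gt;

```python
import json

# Illustrative identity-based policy: allow S3 read/write on one bucket,
# but only over TLS. Bucket ARN is a placeholder.
sagemaker_s3_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::my-sagemaker-bucket/*",
            "Condition": {"Bool": {"aws:SecureTransport": "true"}},
        }
    ],
}


def attach_inline_policy(role_name, policy_name, policy_doc):
    # Attach the policy document inline to an IAM role.
    import boto3

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy_doc),
    )
```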

&lt;p&gt;&lt;strong&gt;S3 (Simple Storage Service)&lt;/strong&gt; - This falls under the 'storage' category in the AWS list of services. Amazon S3 is an object storage service used to store and protect any amount of data for a range of use cases, such as data lakes, websites, mobile applications, backup and restore, archiving, enterprise applications, IoT devices, and big data analytics. Features provided by S3 include multiple storage classes, storage and access management, data processing via S3 Object Lambda, and logging and monitoring tools. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--UbQqnowj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wm9smffvo8dz675be4x6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--UbQqnowj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wm9smffvo8dz675be4x6.png" alt="AWS S3" width="880" height="325"&gt;&lt;/a&gt;&lt;/p&gt;
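&lt;p&gt;For SageMaker specifically, S3 usage mostly means uploading training data and referring to it by S3 URI. A minimal sketch, with placeholder bucket and key names: &lt;/p&gt;

```python
def s3_uri(bucket, key):
    # SageMaker training jobs and models refer to data by S3 URI.
    return f"s3://{bucket}/{key}"


def upload_training_data(local_path, bucket, key):
    # Upload a local file to S3 so SageMaker can read it.
    import boto3

    boto3.client("s3").upload_file(local_path, bucket, key)
    return s3_uri(bucket, key)
```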

&lt;p&gt;&lt;strong&gt;VPC (Virtual Private Cloud)&lt;/strong&gt; - To keep data and applications secure, AWS resources are often provisioned inside a virtual network. Not only is this considered more secure, it can also be managed according to an organization's requirements, providing more flexibility while keeping the benefits of AWS's scalable infrastructure. For example, AWS resources such as EC2 instances can be commissioned inside a VPC by specifying an IP address range for the VPC and adding subnets, security groups, and route table configuration. A public subnet can be used for internet access (for example, cloning GitHub repositories), whereas a private subnet can be used for sensitive data. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wCDtQrA7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7nwfc2395yb2d1ltv1ae.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wCDtQrA7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7nwfc2395yb2d1ltv1ae.png" alt="Example of VPC" width="880" height="579"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Image Ref:   &lt;a href="https://www.geeksforgeeks.org/amazon-vpc-introduction-to-amazon-virtual-cloud/"&gt;https://www.geeksforgeeks.org/amazon-vpc-introduction-to-amazon-virtual-cloud/&lt;/a&gt;&lt;/p&gt;
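&lt;p&gt;The VPC-plus-subnets layout in the diagram can be sketched as follows; the CIDR ranges are illustrative defaults, and the key constraint is that each subnet's range must fall inside the VPC's range. &lt;/p&gt;

```python
import ipaddress


def subnet_fits(vpc_cidr, subnet_cidr):
    # A subnet's CIDR block must fall inside the VPC's address range.
    return ipaddress.ip_network(subnet_cidr).subnet_of(
        ipaddress.ip_network(vpc_cidr)
    )


def create_vpc_with_subnets(vpc_cidr="10.0.0.0/16",
                            public_cidr="10.0.1.0/24",
                            private_cidr="10.0.2.0/24"):
    # Provisioning sketch: a VPC plus one public and one private subnet.
    # (Route tables, internet gateway, and security groups omitted.)
    import boto3

    ec2 = boto3.client("ec2")
    vpc_id = ec2.create_vpc(CidrBlock=vpc_cidr)["Vpc"]["VpcId"]
    public = ec2.create_subnet(VpcId=vpc_id, CidrBlock=public_cidr)
    private = ec2.create_subnet(VpcId=vpc_id, CidrBlock=private_cidr)
    return vpc_id, public, private
```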

&lt;p&gt;&lt;strong&gt;EC2 (Elastic Compute Cloud)&lt;/strong&gt;: To perform computations in the cloud, you need a virtual computing environment, and the service AWS offers for this is EC2. Apart from being highly scalable, EC2 instances come in a variety of configurations of CPU, memory, storage, and networking capacity. Using a service like this removes dependencies on, and investments in, on-premises hardware. Some compute examples are shown below. EC2 Spot Instances can be utilized for various stateless, fault-tolerant, or flexible applications such as big data, containerized workloads, CI/CD, web servers, high-performance computing (HPC), and test &amp;amp; development workloads. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--98s5iPxO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/b1q0fvmu9nufye71bcpt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--98s5iPxO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/b1q0fvmu9nufye71bcpt.png" alt="EC2 Spot Instances" width="880" height="336"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7DiCZkgf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hd1lme4scw0p1kf0qa0d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7DiCZkgf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hd1lme4scw0p1kf0qa0d.png" alt="Some of the available instances in EC2" width="548" height="220"&gt;&lt;/a&gt;&lt;/p&gt;
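&lt;p&gt;Choosing an instance configuration and requesting Spot capacity can be sketched as below. The instance table is a tiny illustrative subset (check the EC2 documentation for the full, current list), and the AMI ID is a placeholder. &lt;/p&gt;

```python
# Illustrative subset of instance sizes as (vCPU, memory in GiB).
INSTANCE_TYPES = {
    "t3.medium": (2, 4),
    "m5.xlarge": (4, 16),
    "m5.2xlarge": (8, 32),
}


def smallest_instance(min_vcpu, min_mem_gib):
    # Pick the smallest listed type that meets both requirements.
    candidates = [
        name for name, (vcpu, mem) in INSTANCE_TYPES.items()
        if vcpu >= min_vcpu and mem >= min_mem_gib
    ]
    return min(candidates, key=lambda n: INSTANCE_TYPES[n]) if candidates else None


def request_spot_instance(instance_type, ami_id):
    # Spot request sketch; ami_id must be a real image ID in practice.
    import boto3

    ec2 = boto3.client("ec2")
    return ec2.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        InstanceMarketOptions={"MarketType": "spot"},
    )
```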

&lt;p&gt;Note: Image references from  &lt;a href="https://docs.aws.amazon.com/index.html"&gt;https://docs.aws.amazon.com/index.html&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>awscommunitybuilder</category>
      <category>sagemaker</category>
      <category>cloudcomputing</category>
    </item>
  </channel>
</rss>
