NLP Made Easy: Sentiment Analysis with Amazon Comprehend

#nlp #python #aws

Natural language processing (NLP) has become an integral part of many AI-driven products, and sentiment analysis accounts for a large share of them because of its value in business decision-making.
However, building, deploying and monitoring a model of this kind isn't easy!

In this short tutorial, we'll see how Amazon Comprehend helps us perform sentiment analysis faster and more efficiently.

What is Amazon Comprehend?

Amazon Comprehend is a natural language processing (NLP) service that provides APIs for custom entity recognition, custom classification, key phrase extraction, sentiment analysis, entity recognition and more, all of which you can easily integrate into your applications. It also offers medical insights and protected health information (PHI) detection via Amazon Comprehend Medical.

Getting Started with Amazon Comprehend

Getting started with Amazon Comprehend is a straightforward process.

Step 1: Create an AWS Account
The first thing to do is create an AWS account if you do not already have one.

Step 2: Open Comprehend
Inside the AWS Management Console, search for "Comprehend" in the services search bar. Click on Amazon Comprehend to open the service dashboard. You should see the Amazon Comprehend landing page.

Next, click on "Launch Amazon Comprehend" to start analyzing text.

Step 3: Real-Time Analysis
With Amazon Comprehend, we can analyze text in real time using built-in or custom models. Creating a custom model to classify sentiment is outside the scope of this tutorial. We'll use one of Comprehend's built-in models to perform real-time sentiment analysis on our text in two separate ways:

Amazon Comprehend Interface

Sentiment analysis can be performed quickly and easily inside Comprehend's real-time analysis interface. Simply scroll down to the input data section, type in your own text and click on "Analyze". Scroll further down and choose the sentiment tab to view the sentiments and their confidence scores.

In the result, our review text, "My order was delayed by several days without any updates or communication from the seller. Terrible shipping service." is classified as negative with a 99% confidence score.

Via API calls using Python's Boto3

We can also access this sentiment model programmatically through an API using Boto3. Boto3 is the official Python software development kit (SDK) provided by Amazon Web Services (AWS) for interacting with AWS services from Python. For Comprehend, the full documentation lists all the methods available on the Comprehend client.

To do this, we need an access key and a secret access key so that we can connect remotely and access services within our account. To create them, follow the simple steps in this link.

Import the boto3 library and create our client.

import boto3

# Replace the empty strings with your own access key and secret access key.
client = boto3.client('comprehend', region_name='us-east-1',
                      aws_access_key_id='', aws_secret_access_key='')
client
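Hard-coding keys in a script makes them easy to leak. As an alternative, boto3 can pick up credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables when no keys are passed explicitly. The sketch below assumes that setup. The helper names `missing_credentials` and `make_comprehend_client` are made up for this example, and the check ignores the other places boto3 looks for credentials, such as a shared credentials file.

```python
import os

# Environment variables that boto3 reads when no keys are passed explicitly.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def make_comprehend_client(region_name="us-east-1"):
    """Create a Comprehend client, failing early if the variables are missing."""
    missing = missing_credentials()
    if missing:
        raise RuntimeError("Set these environment variables first: " + ", ".join(missing))
    import boto3  # imported here so the check above works without boto3 installed
    return boto3.client("comprehend", region_name=region_name)
```

Calling `make_comprehend_client()` would then stand in for the explicit `boto3.client(...)` call, with no secrets stored in the script itself.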

Next, pass our text into the detect_sentiment method to predict its sentiment.

text = """My order was delayed by several days without any updates or communication from the seller. Terrible shipping service."""

client.detect_sentiment(Text=text, LanguageCode='en')

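The response is a dictionary containing the predicted sentiment label and a confidence score for each class (positive, negative, neutral and mixed).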
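To pull out the winning label and its score, a small helper like the sketch below can be used. It assumes the response carries a "Sentiment" label and a "SentimentScore" dictionary keyed by Positive, Negative, Neutral and Mixed. The numbers in `sample_response` are illustrative placeholders, not real output, and `top_sentiment` is a helper name invented here.

```python
def top_sentiment(response):
    """Return (label, confidence) from a detect_sentiment response dictionary."""
    label = response["Sentiment"]        # e.g. "NEGATIVE"
    scores = response["SentimentScore"]  # keys: Positive, Negative, Neutral, Mixed
    return label, scores[label.capitalize()]


# Illustrative values only; a real response also carries a "ResponseMetadata" entry.
sample_response = {
    "Sentiment": "NEGATIVE",
    "SentimentScore": {"Positive": 0.01, "Negative": 0.97, "Neutral": 0.01, "Mixed": 0.01},
}

label, confidence = top_sentiment(sample_response)
print(f"{label}: {confidence:.0%}")
```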

Now, let's make it a bit more interesting. We'll download a review dataset from Kaggle, link here.

import pandas as pd

data = pd.read_csv("amazon_reviews.csv")
data = data[['reviewText']]
data

The data is made up of 4915 rows, but for the purpose of this tutorial we'll choose 100 random samples for prediction.

data_100 = data.sample(100)

def detect_sentiment(review):
    # Look the column up by name; review[0] relies on deprecated positional indexing.
    response = client.detect_sentiment(Text=review['reviewText'], LanguageCode='en')
    return response["Sentiment"]

data_100['Sentiment'] = data_100.apply(detect_sentiment, axis=1)
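Real review data often contains empty cells, and detect_sentiment rejects empty text as well as text above its size limit. The sketch below assumes a limit of 5,000 bytes of UTF-8 per document (worth checking against the current service quotas) and uses a hypothetical helper, `prepare_text`, to clean a value before it is sent.

```python
MAX_BYTES = 5000  # assumed per-document limit for detect_sentiment; verify against current quotas


def prepare_text(value, max_bytes=MAX_BYTES):
    """Return a non-empty string that fits in max_bytes of UTF-8, or None to skip."""
    if not isinstance(value, str):  # pandas stores missing cells as float NaN
        return None
    text = value.strip()
    if not text:
        return None
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cut at the byte limit and drop any partial character left at the end.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

It could be applied with `data_100['reviewText'].map(prepare_text)` before calling the API, skipping rows where the result is None.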

Predicting 100 reviews took 1 minute 34 seconds, a little under one second per review, which is relatively fast for a model that performs well.
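Much of that time is likely spent on network round trips, since each review is sent in its own request. Comprehend also exposes batch_detect_sentiment, which accepts a list of documents per call. The sketch below assumes a limit of 25 documents per batch and a client created as shown earlier. `batch_sentiments` is a hypothetical helper, and the stub client exists only so the example runs without AWS access.

```python
def batch_sentiments(client, texts, batch_size=25, language_code="en"):
    """Return one sentiment label per text, or None where no result came back."""
    labels = [None] * len(texts)
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        response = client.batch_detect_sentiment(TextList=chunk, LanguageCode=language_code)
        for item in response["ResultList"]:
            # "Index" is the position of the document within this chunk.
            labels[start + item["Index"]] = item["Sentiment"]
    return labels


class StubClient:
    """Stand-in for the boto3 client that labels every document NEUTRAL."""

    def __init__(self):
        self.calls = 0

    def batch_detect_sentiment(self, TextList, LanguageCode):
        self.calls += 1
        results = [{"Index": i, "Sentiment": "NEUTRAL"} for i in range(len(TextList))]
        return {"ResultList": results, "ErrorList": []}


stub = StubClient()
labels = batch_sentiments(stub, ["some review"] * 100)
print(len(labels), stub.calls)
```

With the real client, `texts` could be `data_100['reviewText'].tolist()`, which turns 100 single requests into 4 batched ones.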

Pricing

According to the official pricing information here, requests made to Amazon Comprehend's APIs for entity recognition, sentiment analysis, syntax analysis, key phrase extraction, and language detection are measured in units of 100 characters (1 unit = 100 characters), with a 3-unit (300-character) minimum charge per request. Total cost = [No. of units] x [Cost per unit], and the cost per unit is $0.0001.
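As a worked example of that formula, the sketch below estimates the cost of a set of requests from their character counts. It uses the $0.0001 unit price quoted above, assumes partial units are rounded up, and does not model volume tiers or any free tier. `units_for` and `estimate_cost` are names chosen for this example.

```python
import math

UNIT_CHARS = 100        # 1 unit = 100 characters
MIN_UNITS = 3           # minimum charge per request
COST_PER_UNIT = 0.0001  # USD, as quoted above


def units_for(text):
    """Number of billable units for one request (assumes partial units round up)."""
    return max(MIN_UNITS, math.ceil(len(text) / UNIT_CHARS))


def estimate_cost(texts):
    """Estimated cost in USD for sending each text as its own request."""
    return sum(units_for(t) for t in texts) * COST_PER_UNIT
```

For example, 100 reviews of 550 characters each come to 6 units per request, or 600 units and about $0.06 in total.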

Conclusion

In conclusion, sentiment analysis, a crucial aspect of natural language processing (NLP), has become an indispensable tool for businesses and organizations seeking to understand human sentiment and emotions.

Amazon Comprehend offers a user-friendly and powerful platform for sentiment analysis, giving businesses of all sizes the ability to leverage NLP without the need for extensive expertise or infrastructure. By using Amazon Comprehend, companies can unlock valuable insights from vast amounts of text data.