DEV Community

Jeya Shri

Understanding Text Using Amazon Comprehend (AI series on AWS)

We have already seen how AWS interprets images with Amazon Rekognition and documents with Amazon Textract. Here we shift the focus from images and documents to something almost every application handles daily: text.

Valuable information lives in emails, reviews, support tickets, social media posts, chat messages, feedback forms, and survey responses. However, reading and analyzing this data manually becomes impractical as the volume grows.

Amazon Comprehend allows applications to extract meaning from text using Natural Language Processing (NLP).

What Is Amazon Comprehend?

Amazon Comprehend is a fully managed AI service that reads text and extracts sentiment, key phrases, entities, language, and topics. Unlike conventional text processing, which relies on keyword matching, Comprehend interprets words in their context.

For example, it can tell a positive customer review from a negative one, recognize that "Amazon" refers to an organization rather than a river, or pick out the key phrases in a lengthy paragraph.

This matters because applications today generate large volumes of unstructured text. Amazon Comprehend lets developers turn that unstructured text into structured, actionable information without having to build NLP models themselves.

How Does It Work?

When you send text to Amazon Comprehend, it processes the input with pre-trained deep learning models designed to understand language. These models are trained on large datasets covering many languages and writing styles.

The service splits text into tokens, analyzes grammar, evaluates semantic meaning, and applies classification methods to generate insights. All of this happens behind the scenes. For the developer, it remains a simple API call that returns a JSON response.

This abstraction makes Comprehend usable even by software developers who have no experience in linguistics or machine learning.
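To make the JSON response concrete, here is a small sketch of reading a sentiment result. The sample dictionary is hand-written for illustration (the scores are made up, not real output); it only mirrors the general shape of a sentiment response, which carries a label plus one score per sentiment class. The helper name summarize_sentiment is hypothetical.

```python
# Illustrative sample only: hand-written values in the shape that a
# sentiment response takes (a label plus one score per sentiment class).
sample_response = {
    "Sentiment": "MIXED",
    "SentimentScore": {
        "Positive": 0.30,
        "Negative": 0.06,
        "Neutral": 0.02,
        "Mixed": 0.62,
    },
}


def summarize_sentiment(response):
    """Return the sentiment label and the score assigned to that label."""
    label = response["Sentiment"]  # e.g. "POSITIVE", "NEGATIVE", "MIXED"
    # Labels are upper-case while score keys are capitalized ("Mixed").
    score = response["SentimentScore"][label.capitalize()]
    return label, score


label, score = summarize_sentiment(sample_response)
print(f"{label} ({score:.2f})")
```

In a real application the dictionary would come from the API call rather than a literal, but the lookup logic would stay the same.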

Major functionalities:

Amazon Comprehend offers several core features, each addressing a particular text analysis need.

Sentiment analysis classifies text as positive, negative, neutral, or mixed. It is typically applied to customer feedback systems, review analysis, and social media monitoring.

Entity recognition identifies real-world objects such as people, commercial items, locations, dates, quantities, and organizations. Given the sentence "Apple released the iPhone in California", Comprehend can recognize Apple as an organization and California as a location.

Key phrase extraction highlights the most relevant phrases in a text. This is useful for summarization, indexing, and search optimization.

Language detection automatically identifies the language of the input text, which is particularly useful in global applications dealing with multilingual data.

Topic modeling, available through asynchronous jobs, finds themes across a large collection of unlabeled documents.

Together, these features let applications analyze textual data in depth and at scale.
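As a rough sketch of how two of these features can be chained, the function below detects the dominant language first and then asks for key phrases in that language. The function name and the idea of passing the client in as an argument are choices made for this illustration; the client is assumed to be a boto3 Comprehend client created with boto3.client("comprehend").

```python
def detect_language_and_key_phrases(client, text):
    """Detect the dominant language, then extract key phrases in that language.

    `client` is assumed to be a boto3 Comprehend client; it is passed in
    so the function can be exercised without real AWS credentials.
    """
    languages = client.detect_dominant_language(Text=text)["Languages"]
    # Each entry has a LanguageCode and a confidence Score; keep the best one.
    best = max(languages, key=lambda lang: lang["Score"])
    language_code = best["LanguageCode"]

    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    return language_code, [p["Text"] for p in phrases["KeyPhrases"]]


# Usage (requires AWS credentials and the boto3 package):
#   import boto3
#   code, phrases = detect_language_and_key_phrases(
#       boto3.client("comprehend"), "The delivery was late but support was great."
#   )
```

Key phrase extraction supports fewer languages than language detection does, so a production version would likely need to handle the case where the detected language is not supported.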

Applications

Amazon Comprehend is used across many industries. Customer support platforms analyze tickets to detect recurring problems and customer mood. E-commerce companies study product perception through reviews. Financial institutions scan reports and emails to identify risk indicators. Healthcare institutions examine clinical notes, often with the help of Amazon Comprehend Medical, a specialized variant of the service.

Even in basic projects, Comprehend can power features such as feedback dashboards, automated tagging systems, or sentiment-driven alerts.

You can try Amazon Comprehend directly in the AWS Console.

Service walk-through in AWS Console

Once signed in to the AWS Console, type "Comprehend" in the search box and open the service. The console has a real-time analysis area where you can paste sample text, analyze it immediately, and see the detected sentiment, entities, key phrases, and language.

This interactive experience helps users learn how the service interprets different kinds of text and build confidence before using it in applications.

Developing Python applications with Amazon Comprehend

The following Python code uses the AWS SDK (boto3) to analyze sentiment and detect entities.

import boto3

# Create a Comprehend client (region and credentials come from your AWS configuration)
comprehend = boto3.client('comprehend')

text = "Customer service was very good, though it took a long time to get the goods."

# Detect the overall sentiment of the text
sentiment_response = comprehend.detect_sentiment(
    Text=text,
    LanguageCode='en'
)

# Detect named entities (people, organizations, locations, and so on)
entities_response = comprehend.detect_entities(
    Text=text,
    LanguageCode='en'
)

print("Sentiment:", sentiment_response['Sentiment'])
print("Entities:")
for entity in entities_response['Entities']:
    print(entity['Text'], entity['Type'])

This illustrates how easily Comprehend fits into applications. With just a few lines of code, your application can extract insights that would otherwise have required complicated NLP pipelines.

The response also includes confidence scores, which can be used in decision-making logic within applications.
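One way such decision logic might look is sketched below: a ticket is flagged for follow-up only when the sentiment is negative and the confidence is high. The function name should_escalate and the 0.8 threshold are assumptions made for this example, not values recommended by AWS.

```python
def should_escalate(sentiment_response, threshold=0.8):
    """Flag text for human follow-up when it is confidently negative.

    The 0.8 default threshold is an arbitrary illustrative value.
    """
    scores = sentiment_response["SentimentScore"]
    return (
        sentiment_response["Sentiment"] == "NEGATIVE"
        and scores["Negative"] >= threshold
    )


# Usage with the response from the sentiment call shown earlier:
#   if should_escalate(sentiment_response):
#       print("Route this message to a support agent")
```

Tuning the threshold against real feedback is usually necessary, since a value that is too low produces noisy alerts and one that is too high misses genuine complaints.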

Batch Processing and Asynchronous Jobs

Although the real-time APIs are efficient for small inputs, Amazon Comprehend also supports asynchronous processing for large inputs. Batch jobs can analyze thousands of documents stored in S3 and deliver the results once processing completes.

This makes Comprehend suitable for analytics workloads, historical data processing, and enterprise-level text analysis.
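A minimal sketch of starting an asynchronous sentiment job and checking its status could look like the following. The helper names, and whatever S3 URIs, role ARN, and job name you pass in, are placeholders for this illustration; the client is assumed to be a boto3 Comprehend client, and the IAM role must allow Comprehend to read the input location and write to the output location.

```python
def start_sentiment_job(client, input_s3_uri, output_s3_uri, role_arn, job_name):
    """Start an asynchronous sentiment job over documents stored in S3."""
    response = client.start_sentiment_detection_job(
        JobName=job_name,
        LanguageCode="en",
        DataAccessRoleArn=role_arn,  # IAM role Comprehend assumes to access S3
        InputDataConfig={
            "S3Uri": input_s3_uri,
            "InputFormat": "ONE_DOC_PER_LINE",  # alternative: "ONE_DOC_PER_FILE"
        },
        OutputDataConfig={"S3Uri": output_s3_uri},
    )
    return response["JobId"]


def job_status(client, job_id):
    """Return the job's current status, e.g. IN_PROGRESS or COMPLETED."""
    response = client.describe_sentiment_detection_job(JobId=job_id)
    return response["SentimentDetectionJobProperties"]["JobStatus"]
```

Because the job runs in the background, the application typically polls the status at intervals and reads the output from S3 only after the job reports completion.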

Pricing and Cost Awareness

Amazon Comprehend is priced on a pay-as-you-go basis, usually by the number of characters processed. The price varies with the type of analysis performed.

The free tier allows enough usage to experiment with the service, learn about it, and build small projects. However, when working with large amounts of text, developers should always monitor usage to prevent unexpected costs.
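Since pricing depends on character counts, a back-of-the-envelope estimate can be done before sending any text. The sketch below assumes text is billed in units of 100 characters with a minimum of 3 units per request; treat both numbers as assumptions to confirm against the current AWS pricing page. The per-unit price is deliberately left as a required argument rather than hard-coded, and the function names are hypothetical.

```python
import math


def estimate_units(documents, unit_size=100, min_units_per_request=3):
    """Estimate billable units when each document is sent as one request.

    Assumptions to verify against the pricing page: text is billed in units
    of `unit_size` characters, with a per-request minimum of
    `min_units_per_request` units.
    """
    total = 0
    for doc in documents:
        units = math.ceil(len(doc) / unit_size)  # round partial units up
        total += max(units, min_units_per_request)
    return total


def estimate_cost(documents, price_per_unit):
    """Multiply the estimated units by a per-unit price supplied by the caller."""
    return estimate_units(documents) * price_per_unit
```

Under these assumptions, a 50-character review still counts as 3 units, so many very short requests can cost more than their raw character count suggests.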

When Should We Use Amazon Comprehend?

Amazon Comprehend is the right option when your application needs to interpret text rather than merely search for words. It is particularly well suited to sentiment analysis, entity extraction, text classification, and large-scale document analysis.

If your application needs highly domain-specific language understanding, you may need custom classification models or specialized services. For most general-purpose applications, however, Comprehend is sufficiently capable.

Conclusion:

Amazon Comprehend shows how powerful AI can be made readily available through simple APIs. By abstracting away the complexity of NLP, AWS lets developers concentrate on functionality rather than models.

For beginners, learning Amazon Comprehend is a natural next step toward building intelligent, data-driven applications that understand human language.

Next in this series, we will discuss Amazon Polly, the text-to-speech service that converts text into voice and opens up possibilities for voice-enabled applications.
