In this project, we will create an automated pipeline for analyzing the sentiment of product reviews. Instead of building a full application, we will upload product reviews in JSON format to an Amazon S3 bucket. As soon as a new file is uploaded to S3, an S3 event notification will trigger an AWS Lambda function. This Lambda function will use the Amazon Comprehend API to perform sentiment analysis on each review in the uploaded file. Once the reviews are processed, the Lambda function will upload the analyzed results to a second S3 bucket in JSON format. Amazon Athena will then be used to query the sentiment data stored in S3. Finally, the data will be connected to Amazon QuickSight for interactive visualization of the sentiment trends.
Prerequisites:
AWS Account
QuickSight Account Setup
Experience working with AWS Services
Programming skills: Python, JSON
Step 1: Create S3 Buckets
Let's get our hands dirty and start by creating two S3 buckets: one for storing product reviews and the other for storing the results of Amazon Comprehend's sentiment analysis.
Search for Amazon S3 among the services in your AWS account.
Click Create bucket.
Select General purpose as the bucket type.
Provide a globally unique name for the bucket, for example product-review-bucket-789.
For Object Ownership, select ACLs disabled.
Click Create bucket.
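Upload the product review JSON file to this bucket. The event notification created in Step 4 only fires for uploads made after it exists, so upload the file again after Step 4 to run the pipeline.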
Create one more bucket to store the sentiment analysis data, for example sentiment-analysis-bucket-1234.
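The same setup can be scripted. The sketch below is one possible way to do it with boto3, not the only one. The review schema (review_id, product, review_text), the reviews.json key, and the helper names are assumptions made for illustration, since the article does not prescribe a file layout.

```python
import json

# Hypothetical review schema: the field names below are an assumption.
SAMPLE_REVIEWS = [
    {"review_id": "r1", "product": "Wireless Mouse",
     "review_text": "Works great and the battery lasts for weeks."},
    {"review_id": "r2", "product": "Wireless Mouse",
     "review_text": "Stopped working after two days. Very disappointed."},
]


def bucket_request(name, region):
    """Build the keyword arguments for the S3 create_bucket call.

    us-east-1 must not be given a LocationConstraint; other regions need one.
    """
    request = {"Bucket": name}
    if region != "us-east-1":
        request["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return request


def create_buckets_and_upload(s3, region, input_bucket, output_bucket):
    """Create both buckets, then upload the sample reviews to the input bucket."""
    for name in (input_bucket, output_bucket):
        s3.create_bucket(**bucket_request(name, region))
    s3.put_object(
        Bucket=input_bucket,
        Key="reviews.json",  # hypothetical object key
        Body=json.dumps(SAMPLE_REVIEWS, indent=2).encode("utf-8"),
        ContentType="application/json",
    )


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   s3 = boto3.client("s3", region_name="us-east-1")
#   create_buckets_and_upload(s3, "us-east-1",
#                             "product-review-bucket-789",
#                             "sentiment-analysis-bucket-1234")
```

Passing the client in as an argument keeps the helper easy to try against a stub before it touches a real account.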
Step 2: Create Lambda Function
Search for Lambda in the services.
Click on Create a function.
Select Author from scratch.
Provide a name for the Lambda function, for example LambdaForSentimentAnalysis.
Select the runtime as Python 3.13.
Choose Create a custom role and go to the IAM console.
Choose the service as Lambda.
Select AmazonS3FullAccess and ComprehendFullAccess. [Note: These broad policies keep the walkthrough simple. It is good practice to grant granular access instead, limited to the specific buckets and the Comprehend actions the function uses.]
Click Create role.
Go back to the Lambda screen and select the role just created in Existing roles.
Finally, click Create function.
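As an illustration of the granular access mentioned in the note above, here is a minimal sketch of a narrower policy document built in Python. The function name and the Sid values are hypothetical, and the statement set is an assumption about what this pipeline needs: read from the input bucket, write to the output bucket, and call Comprehend's DetectSentiment action. The Comprehend statement uses "*" as the resource on the assumption that this action is not scoped to an individual resource.

```python
import json


def least_privilege_policy(input_bucket, output_bucket):
    """Return an IAM policy document limited to what the pipeline uses."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadReviews",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{input_bucket}/*",
            },
            {
                "Sid": "WriteResults",
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{output_bucket}/*",
            },
            {
                "Sid": "DetectSentiment",
                "Effect": "Allow",
                "Action": "comprehend:DetectSentiment",
                "Resource": "*",  # assumption: no narrower resource applies
            },
        ],
    }


policy = least_privilege_policy(
    "product-review-bucket-789", "sentiment-analysis-bucket-1234"
)
# Paste the printed JSON into the IAM policy editor.
print(json.dumps(policy, indent=2))
```

The function would still need permission to write its logs, for example through the AWSLambdaBasicExecutionRole managed policy.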
Step 3: Write Python code
Open an editor of your choice and create a file.
Write code to read the product review JSON file from the S3 bucket.
Use the Amazon Comprehend API to analyze the sentiment of each review.
Store the analyzed sentiment data in the output S3 bucket as JSON.
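The three steps above could be sketched as the handler below. This is not the author's original code; it is a minimal version under stated assumptions. The input file is assumed to be a JSON array of objects with hypothetical review_id and review_text fields, reviews are assumed to be in English, and results are written under a hypothetical sentiment/ prefix with one JSON object per line, which is the layout Athena's JSON SerDes expect.

```python
import json
import urllib.parse

OUTPUT_BUCKET = "sentiment-analysis-bucket-1234"  # example name from Step 1
MAX_TEXT_BYTES = 5000  # assumed DetectSentiment size limit, in UTF-8 bytes


def truncate_utf8(text, limit=MAX_TEXT_BYTES):
    """Trim text so that its UTF-8 encoding fits within `limit` bytes."""
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


def analyze_reviews(reviews, comprehend):
    """Call Comprehend once per review and return flat result records."""
    results = []
    for review in reviews:
        text = truncate_utf8(review.get("review_text", ""))
        if not text.strip():
            continue  # skip reviews with no text to analyze
        response = comprehend.detect_sentiment(Text=text, LanguageCode="en")
        scores = response["SentimentScore"]
        results.append({
            "review_id": review.get("review_id"),
            "review_text": text,
            "sentiment": response["Sentiment"],
            "positive": scores["Positive"],
            "negative": scores["Negative"],
            "neutral": scores["Neutral"],
            "mixed": scores["Mixed"],
        })
    return results


def process_upload(event, s3, comprehend, output_bucket=OUTPUT_BUCKET):
    """Handle every S3 record in the event; return the output keys written."""
    written = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        results = analyze_reviews(json.loads(body), comprehend)
        lines = "\n".join(json.dumps(item) for item in results)
        out_key = f"sentiment/{key}"  # hypothetical output prefix
        s3.put_object(
            Bucket=output_bucket,
            Key=out_key,
            Body=lines.encode("utf-8"),
            ContentType="application/json",
        )
        written.append(out_key)
    return written


def lambda_handler(event, context):
    import boto3  # bundled with the Lambda Python runtime

    keys = process_upload(event, boto3.client("s3"), boto3.client("comprehend"))
    return {"statusCode": 200, "body": json.dumps({"written": keys})}
```

Writing to a different bucket than the one that triggers the function matters here: if the output landed in the input bucket, each result file would trigger the function again.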
Step 4: Create S3 Event Notification to Trigger Lambda
Select the product review S3 bucket, product-review-bucket-789.
Under Properties, click Create event notification.
Provide a name for the event, for example NewReviewUploadTrigger.
Select All object create events.
Choose the Lambda function created in Step 2 as the destination.
Click Save changes.
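For readers who prefer to script this step, the console actions above roughly correspond to the boto3 calls in the sketch below. The helper names and the AllowS3Invoke statement ID are hypothetical. The console grants S3 permission to invoke the function automatically; a script has to add that permission itself.

```python
def notification_config(function_arn, name="NewReviewUploadTrigger"):
    """Build the notification configuration for all object-create events."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": name,
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }


def attach_trigger(s3, lambda_client, bucket, function_arn):
    """Let S3 invoke the function, then register the bucket notification."""
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId="AllowS3Invoke",  # hypothetical statement ID
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn=f"arn:aws:s3:::{bucket}",
    )
    # This call replaces any notification configuration already on the bucket.
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=notification_config(function_arn),
    )


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   attach_trigger(boto3.client("s3"), boto3.client("lambda"),
#                  "product-review-bucket-789", "<your-function-arn>")
```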
Step 5: Set up Athena for Querying S3 Data
Select Athena in the services.
Select Query your data with Trino SQL.
Click Launch query editor.
In the Athena console, click the Settings icon.
Under Query result location, specify an S3 location to store Athena query results.
Use the query editor to create a new database and a table that points to the sentiment analysis data in S3.
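The statements to run in the query editor depend on how the Lambda function wrote its output. The sketch below builds them in Python under several assumptions: results are stored one JSON object per line under a hypothetical sentiment/ prefix, the columns are the hypothetical ones listed in the table definition, and sentiment_db and review_sentiment are made-up names. The SQL strings can be pasted into the query editor, or submitted with the small run_query helper.

```python
def create_database_sql(database):
    """DDL for the database; use underscores rather than hyphens in the name."""
    return f"CREATE DATABASE IF NOT EXISTS {database}"


def create_table_sql(database, table, location):
    """DDL for an external table over newline-delimited JSON files in S3."""
    return f"""
CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (
  review_id string,
  review_text string,
  sentiment string,
  positive double,
  negative double,
  neutral double,
  mixed double
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '{location}'
""".strip()


def run_query(athena, sql, output_location):
    """Submit a statement to Athena and return its query execution ID."""
    response = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical names and locations, shown for illustration only.
DATABASE = "sentiment_db"
TABLE = "review_sentiment"
DATA_LOCATION = "s3://sentiment-analysis-bucket-1234/sentiment/"

print(create_database_sql(DATABASE))
print(create_table_sql(DATABASE, TABLE, DATA_LOCATION))

# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   athena = boto3.client("athena")
#   run_query(athena, create_database_sql(DATABASE), "<your-query-result-location>")
```

Athena runs statements asynchronously, so run_query returns as soon as the statement is submitted. It also helps to keep the query result location outside the table's LOCATION, so that result files are not read back as table data.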
Step 6: Set Up Amazon QuickSight for Visualization
Go to QuickSight in the AWS Console and set up an Amazon QuickSight account.
Attach an IAM policy to allow QuickSight to access Athena and S3.
Click Datasets, then New dataset.
Choose Athena as the data source and enter a Data source name.
Select primary as the Athena workgroup.
Select the database and table created in the previous step.
Select Directly query your data and click Visualize, then Create.
Choose visuals such as a pie chart or bar chart, and select sentiment as the Group/X-axis field.
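Grouping by sentiment means the chart shows how many reviews fall under each sentiment label. As a quick sanity check on what the visual should display, the same grouping can be reproduced locally. This is a minimal sketch that assumes the results are stored one JSON object per line with a sentiment field; the sample records are invented for illustration.

```python
import json
from collections import Counter


def sentiment_counts(json_lines):
    """Count reviews per sentiment label in newline-delimited JSON text."""
    counts = Counter()
    for line in json_lines.splitlines():
        if line.strip():
            counts[json.loads(line)["sentiment"]] += 1
    return dict(counts)


# Invented sample records in the assumed output layout.
sample = "\n".join([
    json.dumps({"review_id": "r1", "sentiment": "POSITIVE"}),
    json.dumps({"review_id": "r2", "sentiment": "NEGATIVE"}),
    json.dumps({"review_id": "r3", "sentiment": "POSITIVE"}),
])

print(sentiment_counts(sample))  # {'POSITIVE': 2, 'NEGATIVE': 1}
```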
Thank you for taking the time to read my article. If I've overlooked any steps or missed any details, please don't hesitate to get in touch.
Feel free to reach out to me anytime.
~ Palak Bhawsar