DEV Community

Deepak Poudel
S3 File Analysis with AWS Lambda: Counting Words and SNS Notifications

In today's fast-paced world, automation is the key to efficiency. AWS Lambda, a popular serverless computing service, lets you run code without provisioning or managing servers. In this tutorial, I'll walk you through using AWS Lambda to automatically count the words in a file uploaded to an S3 bucket and then send the word count by email via Amazon SNS (Simple Notification Service). This tool can be valuable for applications such as content analysis and document processing.

Prerequisites
Before diving into the implementation, make sure you have the following prerequisite in place:

● An AWS account with appropriate permissions to create Lambda functions, S3 buckets, and SNS topics.

Step 1: Log in to the AWS Management Console and navigate to the S3 service.

Step 2: Create a Bucket:
Once you're in the S3 dashboard, click the "Create bucket" button.

Step 3: Configure Bucket Settings:

● Bucket Name: Choose a globally unique name for your S3 bucket. Bucket names must be unique across all AWS accounts, so pick a name that no one else has used.
● Region: Select the AWS Region where you want to create the bucket. Choose a Region that is geographically close to your intended users or applications for better performance.
● Configure options such as versioning, logging, and tags as needed: Depending on your specific use case, you can enable features like versioning to keep multiple versions of files or configure logging to track bucket activity.
● Set Permissions: By default, the bucket is private, meaning only the AWS account that created it has access. If you want to grant public access or specific permissions to other AWS accounts or IAM users, you can configure bucket policies, access control lists (ACLs), or IAM policies.

● Review and Create: After configuring the settings for your S3 bucket, review your choices to ensure they are correct. Double-check the bucket name for uniqueness.
● Click the "Create bucket" button to create the S3 bucket.
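
For readers who prefer scripting to the console, the bucket settings above could be applied with boto3 roughly as follows. This is a minimal sketch rather than the method used in this tutorial: the helper names `build_create_bucket_kwargs` and `create_private_bucket` are hypothetical, the bucket name and Region in the usage comment are placeholders, and the S3 client is passed in as a parameter so the request-building logic can be checked without AWS credentials.

```python
def build_create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for the S3 CreateBucket call (hypothetical helper)."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed as a
    # LocationConstraint; every other Region has to be named explicitly.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_private_bucket(s3_client, bucket_name, region):
    """Create the bucket and keep it private by blocking all public access."""
    s3_client.create_bucket(**build_create_bucket_kwargs(bucket_name, region))
    s3_client.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )


# Usage (needs boto3 and AWS credentials; the names below are placeholders):
#   import boto3
#   s3 = boto3.client("s3", region_name="eu-west-1")
#   create_private_bucket(s3, "my-unique-wordcount-bucket", "eu-west-1")
```

Passing the client in, instead of creating it inside the helper, keeps the Region of the client and the Region of the bucket visibly in sync.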

Step 4: Create an IAM role to permit the Lambda function to access Amazon S3 and Amazon SNS:

● Create a New Role:
➔ In the IAM dashboard, click "Roles" in the left sidebar.

➔ Click the "Create role" button to create a new role with the required permission policies.

● Select Type of Trusted Entity:

➔ For the trusted entity type, select "AWS service" since you're creating this role for AWS Lambda.

➔ In the use case options, choose "Lambda".
● Attach Permissions Policies:

In the "Permissions" step, attach the policies that define what the role is allowed to do. Search for and select the following policies:

➔ AWSLambdaBasicExecutionRole: This policy lets your Lambda function write logs to CloudWatch Logs.
➔ AmazonSNSFullAccess: This policy provides full access to Amazon SNS, allowing your Lambda function to publish messages to SNS topics.
➔ AmazonS3FullAccess: This policy provides full access to Amazon S3, allowing your Lambda function to read from and write to S3 buckets.
You can search for these policies in the search box and attach them individually.

● Name and Create Role:

➔ Give your role a name. In this case, you can name it "wordCounterRoleforlambda" or choose any other desired name.

➔ Add a description to help you remember the purpose of this role.
➔ Click the "Create role" button to create the role.
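
The two full-access policies grant far more than this function uses: it only reads objects from one bucket and publishes to one topic. As a hedged alternative, the sketch below builds the trust policy that the "Lambda" use case corresponds to, plus a narrower permissions policy. The function names are hypothetical, and the account ID and topic name in the example ARN are placeholders, not values from this tutorial.

```python
import json


def lambda_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def word_counter_permissions(bucket_name, topic_arn):
    """Narrower alternative to the full-access policies (an assumption,
    not the setup used in this tutorial)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # read uploaded objects from the one bucket
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {   # publish the result to the one topic
                "Effect": "Allow",
                "Action": "sns:Publish",
                "Resource": topic_arn,
            },
        ],
    }


# Placeholder account ID and topic name, for illustration only.
policy = word_counter_permissions(
    "deepakwordcountbucket",
    "arn:aws:sns:us-east-1:123456789012:wordCountTopic",
)
print(json.dumps(policy, indent=2))
```

A role using this policy would still need AWSLambdaBasicExecutionRole (or equivalent statements) so the function can write its logs to CloudWatch Logs.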

Step 5: Create a Simple Notification Service (SNS) topic.
● Navigate to SNS (Simple Notification Service):

➔ Create a New Topic: In the SNS dashboard, click the "Create topic" button to create a new SNS topic.

➔ Configure Topic Details: Set the type to "Standard" and provide a name for your topic.

➔ Click "Create topic".

➔ Create Subscription: With your topic selected, click the "Create subscription" button to set up a new subscription.

➔ Choose "Email" as the protocol, enter your email address as the endpoint, and click "Create subscription".

➔ You will shortly receive an email asking you to confirm your subscription.

➔ Click "Confirm subscription" in that email. Until the subscription is confirmed, SNS does not deliver messages to that address.
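
The same topic and email subscription could be created from code. The following is a rough sketch under stated assumptions: `create_topic_with_email_subscription` is a hypothetical helper, the topic name and address in the usage comment are placeholders, and the SNS client is passed in so the logic can be exercised with a stub.

```python
def create_topic_with_email_subscription(sns_client, topic_name, email_address):
    """Create a standard SNS topic and subscribe an email address to it."""
    # create_topic is idempotent: it returns the existing ARN if a topic
    # with this name (and the same attributes) already exists.
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    subscription = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="email",
        Endpoint=email_address,
    )
    # For email, SNS reports "pending confirmation" until the link in the
    # confirmation email has been clicked.
    return topic_arn, subscription["SubscriptionArn"]


# Usage (needs boto3 and AWS credentials; the values below are placeholders):
#   import boto3
#   arn, status = create_topic_with_email_subscription(
#       boto3.client("sns"), "wordCountTopic", "you@example.com")
```

The returned topic ARN is the value the Lambda function later reads from its `topicARN` environment variable.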

Step 6: Create a Lambda Function
● Go to the AWS Lambda console.
● Click "Create function" and choose "Author from scratch".

● Give your function a name, and choose a supported Python 3 runtime (for example, Python 3.12; Python 3.7 has since been deprecated on Lambda).

● Under "Change default execution role", select "Use an existing role", pick the role created in Step 4, and click "Create function".

● Add a trigger to the Lambda function.

➔ In the trigger configuration, choose S3 as the source, select the bucket created earlier, and click "Add".

➔ In the code editor (lambda_function.py by default), enter the following code:

```python
import json
import os
from urllib.parse import unquote_plus

import boto3


def lambda_handler(event, context):
    # Find the topic ARN from the environment variables.
    TOPIC_ARN = os.environ['topicARN']
    print("Topic ARN =", TOPIC_ARN)

    # Create an S3 resource and find the S3 bucket name and file name
    # (object key) from the event object.
    s3 = boto3.resource('s3')

    record = event['Records'][0]
    bucketName = record['s3']['bucket']['name']
    print("bucketName =", bucketName)
    # Object keys arrive URL-encoded in S3 events (for example, spaces
    # become '+'), so decode the key before using it.
    objectKey = unquote_plus(record['s3']['object']['key'])
    print("objectKey =", objectKey)

    # Read the contents of the file (as bytes).
    textFile = s3.Object(bucketName, objectKey)
    fileContent = textFile.get()['Body'].read()
    print("fileContent =", fileContent)

    # Count the number of whitespace-separated words in the file.
    wordCount = len(fileContent.split())
    print('Number of words in text file:', wordCount)

    # Create an SNS client, then format and publish a message containing
    # the word count to the topic.
    snsClient = boto3.client('sns')
    message = 'The word count in the file ' + objectKey + ' is ' + str(wordCount) + '.'

    response = snsClient.publish(
        TopicArn=TOPIC_ARN,
        Subject='Word Count Result',
        Message=message
    )

    # Return a successful function execution message.
    return {
        'statusCode': 200,
        'body': json.dumps('File successfully processed by wordCounter Lambda function')
    }
```

This AWS Lambda function, when triggered by an S3 event, retrieves the file uploaded to an S3 bucket, counts the number of words in that file, and publishes the word count as a message to an Amazon SNS topic. The Lambda function uses the Boto3 library to interact with AWS services. It starts by extracting the S3 bucket name and object key from the event object, then reads the content of the file from S3. After counting the words, it formats a message and publishes it to the specified SNS topic. Finally, it returns a successful execution message. This code is designed to automate the word-counting process and notify subscribers via SNS when new files are uploaded to the S3 bucket.
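
The parsing and counting steps described above can be tried locally, without AWS, by separating them from the S3 and SNS calls. The sketch below is an illustration only: `extract_s3_location` and `count_words` are hypothetical helper names, and the sample event contains just the fields the handler reads. The key is URL-decoded because S3 encodes object keys in event notifications.

```python
from urllib.parse import unquote_plus


def extract_s3_location(event):
    """Return (bucket, key) from the first record of an S3 event."""
    s3_info = event["Records"][0]["s3"]
    # S3 URL-encodes object keys in event notifications, so "my notes.txt"
    # arrives as "my+notes.txt" and needs decoding before it is used.
    return s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


def count_words(file_content):
    """Count whitespace-separated words in the raw bytes of a file."""
    return len(file_content.split())


sample_event = {"Records": [{"s3": {
    "bucket": {"name": "deepakwordcountbucket"},
    "object": {"key": "my+notes.txt"},
}}]}

bucket, key = extract_s3_location(sample_event)
print(bucket, key)                            # deepakwordcountbucket my notes.txt
print(count_words(b"one two  three\nfour"))   # 4
```

Splitting on whitespace treats runs of spaces and newlines as a single separator, which is why the double space and the line break in the sample do not inflate the count.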

● Open the "Configuration" tab and select "Environment variables".

➔ Navigate to the SNS topic created earlier and copy its ARN.

➔ Add an environment variable with the key topicARN (the name the code reads) and the topic ARN as its value.

● On the Lambda dashboard, click "Test".

● Configure Test Event
➔ Give the event a name.
➔ Choose the "s3-put" template.

➔ Update the S3 bucket name, bucket ARN, principal ID, and object key in the template:

```json
{
  "Records": [
    {
      "eventVersion": "2.0",
      "eventSource": "aws:s3",
      "awsRegion": "us-east-1",
      "eventTime": "1970-01-01T00:00:00.000Z",
      "eventName": "ObjectCreated:Put",
      "userIdentity": {
        "principalId": "EXAMPLE"
      },
      "requestParameters": {
        "sourceIPAddress": "127.0.0.1"
      },
      "responseElements": {
        "x-amz-request-id": "EXAMPLE123456789",
        "x-amz-id-2": "EXAMPLE123/5678abcdefghijklambdaisawesome/mnopqrstuvwxyzABCDEFGH"
      },
      "s3": {
        "s3SchemaVersion": "1.0",
        "configurationId": "testConfigRule",
        "bucket": {
          "name": "deepakwordcountbucket",
          "ownerIdentity": {
            "principalId": "*"
          },
          "arn": "arn:aws:s3:::deepakwordcountbucket"
        },
        "object": {
          "key": "word.txt",
          "size": 1024,
          "eTag": "0123456789abcdef0123456789abcdef",
          "sequencer": "0A1B2C3D4E5F678901"
        }
      }
    }
  ]
}
```
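
The console test could also be reproduced from a script. The following sketch assumes the function is named `wordCounter` (a guess based on the return message in the handler, not something this tutorial states) and that the object `word.txt` already exists in the bucket, since the handler really reads it. Both helper names are hypothetical, and the Lambda client is passed in so the logic can be checked with a stub.

```python
import json


def minimal_s3_put_event(bucket_name, object_key):
    """Build only the fields of the S3 put event that the handler reads."""
    return {"Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": bucket_name, "arn": f"arn:aws:s3:::{bucket_name}"},
            "object": {"key": object_key},
        },
    }]}


def invoke_with_test_event(lambda_client, function_name, event):
    """Invoke the function synchronously and return its decoded JSON response."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(event).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())


# Usage (needs boto3 and AWS credentials; "wordCounter" is an assumed name):
#   import boto3
#   event = minimal_s3_put_event("deepakwordcountbucket", "word.txt")
#   print(invoke_with_test_event(boto3.client("lambda"), "wordCounter", event))
```

A synchronous invocation returns the handler's own return value, so a successful run should come back with the `statusCode` and `body` fields that the handler builds.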

Step 7: Create a file and upload it to the S3 bucket created earlier
● Create a new text file and add some content.

Note that the example file contains 6 words.

● In the Lambda console, click "Deploy" so that the saved code is live before the trigger fires.

● Upload the file to the S3 bucket (deepakwordcountbucket).

● You will receive an email stating the number of words in the file you uploaded to the S3 bucket.
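
The upload can also be scripted, which makes it easy to compare the emailed count with the expected one. This is a minimal sketch: `upload_text` is a hypothetical helper, the six-word sample sentence is made up for illustration (it is not the content of the file used in this tutorial), and the S3 client is passed in as a parameter.

```python
def upload_text(s3_client, bucket_name, object_key, text):
    """Upload a small text file to S3, which fires the ObjectCreated trigger."""
    body = text.encode("utf-8")
    s3_client.put_object(Bucket=bucket_name, Key=object_key, Body=body)
    # Return the count expected in the email, computed the same way the
    # handler does (split on whitespace).
    return len(body.split())


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   expected = upload_text(boto3.client("s3"), "deepakwordcountbucket",
#                          "word.txt", "AWS Lambda counts words in files")
#   print(expected)  # 6
```

If the number in the email differs from the returned value, the function is probably still running an older, undeployed version of the code.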

Conclusion
In this tutorial, I've shown you how to automate word counting on files uploaded to an S3 bucket using AWS Lambda and send the word count in an email notification via Amazon SNS. This automation can be a valuable addition to various data processing and content analysis workflows, saving you time and effort while keeping you informed about the contents of uploaded files. Explore further customization and integration options to suit your specific use cases.
