DEV Community

Revathi Joshi for AWS Community Builders

How to import CSV data into DynamoDB using Lambda and S3 Event triggers

As part of my learning journey with DynamoDB and its interaction with other AWS services, I am writing this article on how an S3 event trigger invokes a Lambda function that imports CSV data into a DynamoDB table, using the AWS Management Console.

DynamoDB

DynamoDB is a non-relational, key-value database. A key-value database stores data as a collection of key-value pairs in which a key serves as a unique identifier. In DynamoDB this identifier is the primary key, which consists of a partition key (also known as a hash key) and, optionally, a sort key.
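
As a rough mental model (not a description of how DynamoDB works internally), a table keyed only by a partition key behaves like a dictionary: each key maps to one item, and writing an item with an existing key replaces the old one. The sketch below uses a plain Python dict; the `put_item` helper and the sample values are made up for illustration.

```python
# In-memory stand-in for a table whose primary key is a single partition key "Id".
table = {}

def put_item(item):
    """Store an item under its partition key, replacing any existing item."""
    table[item["Id"]] = item

put_item({"Id": "1", "name": "Alice"})
put_item({"Id": "2", "name": "Bob"})
put_item({"Id": "1", "name": "Alicia"})  # same key: the earlier item is replaced

print(len(table))          # 2
print(table["1"]["name"])  # Alicia
```

This is also why importing the same CSV file twice does not create duplicate items, as long as the key values are unchanged.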

DynamoDB is fully integrated with AWS Backup. You can use the DynamoDB console, API, and AWS Command Line Interface (AWS CLI) to enable automatic backups and restore for your DynamoDB tables.

What is Lambda?

AWS Lambda is a serverless, event-driven compute service that lets you run code for virtually any type of application or back-end service without provisioning or managing servers. Lambda takes care of capacity provisioning, automatic scaling, code monitoring, and logging. You can trigger Lambda from over 200 AWS services and only pay for what you use.

S3 Bucket

Amazon Simple Storage Service (Amazon S3) is an object storage service. You can store and protect any amount of data for a range of use cases, like backup and restore, data lakes, archives, mobile applications, and websites.

When an object in your S3 bucket is created, updated, or removed, S3 can generate an event notification. You can use Lambda to process such event notifications from S3.

So we add a trigger to our Amazon S3 bucket that calls our Lambda function whenever new data arrives. To invoke the function, Amazon S3 needs permission from the function's resource-based policy.
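
When the trigger is added through the console, this permission is set up for you. If you script the setup instead, a statement along the following lines is needed. This is only a sketch: the statement ID is a made-up name, and the function and bucket names are the ones used later in this article.

```python
def s3_invoke_permission(function_name, bucket_name):
    """Build keyword arguments for the Lambda AddPermission API call."""
    return {
        "FunctionName": function_name,
        "StatementId": "allow-s3-invoke",  # hypothetical statement ID
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        # Only this bucket is allowed to invoke the function.
        "SourceArn": f"arn:aws:s3:::{bucket_name}",
    }

def apply_permission(params):
    # Requires boto3 and AWS credentials; not called in this sketch.
    import boto3
    boto3.client("lambda").add_permission(**params)

params = s3_invoke_permission("csv-s3-lambda", "friends-s3")
print(params)
```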

An IAM Role

An IAM role is an AWS identity. Every IAM role has its own permission policy that defines what the role can and cannot do. It is like an IAM user, but without long-term credentials such as a password or access keys. When you assume a role, you receive temporary security credentials for the role session. Roles let a service access resources in other services on your behalf. A role that a service assumes to perform actions on your behalf is called a service role. When a role serves a specialized purpose for a service, it is categorized as a service role for that service (for example, a service role for Lambda) or as a service-linked role.

Configuring the role with the proper policies and permissions is the key issue here, so that the Lambda function can read the CSV file from S3 and write the data to DynamoDB.
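
A role has two parts: a trust policy that says who may assume it, and permission policies that say what it may do. For a Lambda role, the trust policy typically looks like the sketch below, which is the shape you get when you select Lambda as the use case in the console.

```python
import json

# Trust policy: lets the Lambda service assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```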

Please visit my GitHub Repository for DynamoDB articles on various topics, which is updated on a regular basis.

Let’s get started!

Objectives:

  • Create an Amazon DynamoDB table.

  • Create an S3 bucket

  • Upload a CSV file.

  • Create the IAM role needed for the solution, configured according to the principle of least privilege

  • Create a Lambda function with a timeout of 1 minute, which contains the code to import the CSV data into DynamoDB

  • Test the CSV data import in Lambda

  • Add an event trigger to the S3 bucket to call our Lambda function whenever new data arrives

  • Test the setup by uploading new data to trigger an import into DynamoDB

  • Cleanup

Pre-requisites:

  • AWS user account with admin access, not a root account.

  • Cloud9 IDE with AWS CLI.

  • Create an IAM role

Resources Used:

1. How to create a DynamoDB table
2. How to create an S3 bucket and an S3 trigger
3. How to create a Lambda function
4. An IAM role with required permission policy

Steps to implement this project:

1. Create an Amazon DynamoDB table:

  • On the DynamoDB dashboard / Tables / Create table / under Table details

Table Name: FriendsDDB
Partition key: Id
Type: String

Create Table

Status should be Active
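
For reference, the same table could be created with boto3 roughly as follows. This is a sketch, not what the console does behind the scenes: the on-demand billing mode is my assumption, and `create_table_params` and `create_table` are hypothetical helper names.

```python
def create_table_params(table_name="FriendsDDB", key_name="Id"):
    """Build keyword arguments for the DynamoDB CreateTable API call."""
    return {
        "TableName": table_name,
        # HASH marks the partition key; "S" declares it as a string.
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": key_name, "AttributeType": "S"}],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }

def create_table(params):
    # Requires boto3 and AWS credentials; not called in this sketch.
    import boto3
    client = boto3.client("dynamodb")
    client.create_table(**params)
    # Block until the table status becomes ACTIVE.
    client.get_waiter("table_exists").wait(TableName=params["TableName"])

print(create_table_params())
```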

2. Create an S3 bucket

  • On Amazon S3 Console / Create bucket / Under Create bucket, General configuration

Bucket name: friends-s3

Create bucket

3. Upload a CSV file

  • Click on the S3 bucket - friends-s3

File: friends.csv

  • Under friends-s3, Objects, Upload / Under Upload, For Files and folders / Add files

select - friends.csv

Upload
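
If you need a file to follow along with, a friends.csv of the expected shape can be generated as below. The Lambda code in step 5 reads three comma-separated fields per line (id, name, subject). The rows here are invented sample data, and leaving out a header row is an assumption on my part: the import code does not skip one.

```python
import csv
import io

# Invented sample rows: id, name, subject.
ROWS = [
    ("1", "Alice", "Math"),
    ("2", "Bob", "Physics"),
    ("3", "Carol", "History"),
]

def build_csv(rows):
    """Serialize rows to CSV text with plain "\n" line endings."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = build_csv(ROWS)
print(csv_text, end="")

# To save the file for upload:
# with open("friends.csv", "w", newline="") as f:
#     f.write(csv_text)
```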

4. Create the IAM role needed for the solution, configured according to the principle of least privilege

  • On the IAM dashboard / Roles / Create role / Under Select trusted entity, Trusted entity type / Select AWS service / Under Use case / Select Lambda

  • Next

  • Under Name, review, and create, Role details /

Role name: csv-lambda-role
Create role

  • Click the role you just created, csv-lambda-role. From the Add permissions drop-down, select Attach policies

  • Under Attach policy to csv-lambda-role / Search for AmazonDynamoDBFullAccess / Check the box / Attach policies

  • Attach AmazonS3FullAccess in the same way.
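
The two managed policies above grant far more than this function needs. A tighter policy, closer to least privilege, might look like the sketch below. The region and account ID are placeholders (not values from this walkthrough), and `least_privilege_policy` is a hypothetical helper; I have not run the walkthrough with this policy, so treat it as a starting point.

```python
import json

def least_privilege_policy(bucket, table, region, account_id):
    """Allow only reading objects from one bucket and writing items to one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Effect": "Allow",
                "Action": "dynamodb:PutItem",
                "Resource": f"arn:aws:dynamodb:{region}:{account_id}:table/{table}",
            },
        ],
    }

# "us-east-1" and "123456789012" are placeholder values.
policy = least_privilege_policy("friends-s3", "FriendsDDB", "us-east-1", "123456789012")
print(json.dumps(policy, indent=2))
```

If you also want the function's print output in CloudWatch Logs, the role additionally needs logging permissions, for example through the AWSLambdaBasicExecutionRole managed policy.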

5. Create a Lambda function with a timeout of 1 minute, which contains the code to import the CSV data into DynamoDB

  • On the Lambda Console / Functions / Create function / Select Author from scratch / Under Basic information

Function name: csv-s3-lambda
Runtime: From the drop-down choose Python 3.9

  • Click on Change default execution role / select Use an existing Role

  • Select the role you created just now - csv-lambda-role

Create function

  • Once the function is created, it will open the main page of the Lambda function.

csv-lambda.py contains Python code that uses the boto3 APIs for AWS.

  • In csv-lambda.py, set the table name:

    • Update table = dynamodb.Table("<DynamoDB Table") to
    • table = dynamodb.Table("FriendsDDB")
  • The Python code below does the following:

    • Reads the CSV file from the S3 bucket.
    • Splits the CSV data into lines, and each line into fields.
    • Writes each row to the DynamoDB table as an item.


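import urllib.parse

import boto3

s3_client = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")

table = dynamodb.Table("FriendsDDB")

def lambda_handler(event, context):
    # The bucket name and object key come from the S3 event record.
    record = event['Records'][0]['s3']
    bucket_name = record['bucket']['name']
    # Object keys arrive URL-encoded in S3 event notifications.
    s3_file_name = urllib.parse.unquote_plus(record['object']['key'])
    resp = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
    data = resp['Body'].read().decode("utf-8")
    # One CSV row per line; splitlines() also copes with Windows line endings.
    friends = data.splitlines()
    for friend in friends:
        # Skip blank lines, such as the one left by a trailing newline.
        if not friend.strip():
            continue
        print(friend)
        friend_data = friend.split(",")
        # add to dynamodb
        try:
            table.put_item(
                Item = {
                    # Must match the table's partition key name (Id) exactly.
                    "Id"        : friend_data[0],
                    "name"      : friend_data[1],
                    "Subject"   : friend_data[2]
                }
            )
        except Exception as e:
            # Report the failing row instead of hiding the error.
            print(f"Could not import row {friend!r}: {e}")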
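
The parsing step can be tried locally, without any AWS access. The sketch below swaps in Python's csv module in place of a plain split on commas, which also copes with quoted fields that contain commas. `parse_rows` is a hypothetical helper, not part of csv-lambda.py, and the sample rows are invented.

```python
import csv
import io

def parse_rows(data):
    """Turn CSV text into a list of DynamoDB-style items, skipping short or blank rows."""
    items = []
    for row in csv.reader(io.StringIO(data)):
        if len(row) < 3:
            continue
        items.append({"Id": row[0], "name": row[1], "Subject": row[2]})
    return items

sample = "1,Alice,Math\n2,Bob,Physics\n\n"
for item in parse_rows(sample):
    print(item)
```

Inside the handler, the resulting items could then be written in batches with `table.batch_writer()` from boto3 instead of one `put_item` call per row, which tends to matter once the files get larger.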



  • Remove the existing code in the function's code editor window.

  • Copy and paste the code from csv-lambda.py into lambda_function.py under Code source.

  • After updating the code, click on the Deploy button to save the code.

  • Change the function timeout as follows:

    • Navigate to the Configuration tab.
    • Click on General configuration, then click on Edit.
    • Under Edit basic settings, change the Timeout value to 1 min.
    • Click on the Save button.

6. Test the CSV Data Import in Lambda:

  • In the csv-s3-lambda lambda function page, click on the Test tab.

  • Configure the test event as follows:

Test event action: Create a new event

Event name: csv

Template: Select Amazon S3 Put. Once selected, it is displayed as s3-put.

  • In the event JSON below the template:

Under s3 → bucket → name → enter friends-s3

Under s3 → bucket → arn → enter "arn": "arn:aws:s3:::friends-s3"

Under s3 → object → key → enter friends.csv

  • Click on Create and then Save to save the changes.

Note: Make sure the S3 bucket name and file name are correct in the JSON.
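
Of everything in the s3-put template, the function only reads the bucket name and the object key. A trimmed-down version of the event, written here as a Python dict for illustration, shows the path the handler follows to reach them:

```python
# Trimmed-down s3-put test event: only the fields the handler reads.
test_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "friends-s3", "arn": "arn:aws:s3:::friends-s3"},
                "object": {"key": "friends.csv"},
            }
        }
    ]
}

record = test_event["Records"][0]["s3"]
print(record["bucket"]["name"], record["object"]["key"])  # friends-s3 friends.csv
```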

  • Click on Test in the top-right corner to trigger the Lambda function.

  • Once the Lambda function has executed successfully, you will see a detailed success message, and the log output lists the rows read from the CSV file.

  • Go to the DynamoDB console, select the FriendsDDB table, and click on Explore table items
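
The import can also be checked from code by counting the items in the table. The sketch below pages through Scan results; `count_items` is a hypothetical helper that accepts any object with a boto3-style `scan` method. A full scan is fine for a small demo table but gets expensive on large ones.

```python
def count_items(table):
    """Count all items in a table by paging through Scan results."""
    total = 0
    kwargs = {"Select": "COUNT"}  # return counts only, not the items
    while True:
        page = table.scan(**kwargs)
        total += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return total
        # More pages remain: continue where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With credentials in place, it could be called as `count_items(boto3.resource("dynamodb").Table("FriendsDDB"))`.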

7. Add an event trigger to the S3 bucket to call our Lambda function whenever new data arrives.

  • On the S3 Console / Click on the S3 bucket named friends-s3

  • Click on the Properties tab / go down to Event notifications. 

  • Click on Create event notification button

  • Under General configuration /

Name: friends_upload

Suffix: .csv

All Object create events: check

Destination: Select Lambda Function

Lambda: Select csv-s3-lambda

Click on Save changes

  • Now every time a CSV file is uploaded to our S3 bucket, it will trigger the lambda to import the CSV data into the DynamoDB table
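
The console settings above correspond roughly to the notification configuration sketched below. The function ARN uses a placeholder region and account ID, and `notification_config` and `apply_notification` are hypothetical helper names. Unlike the console, a scripted setup does not grant S3 permission to invoke the function, so that has to be added separately.

```python
def notification_config(function_arn):
    """Build an S3 notification configuration for .csv object-created events."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "friends_upload",
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".csv"}]}
                },
            }
        ]
    }

def apply_notification(bucket, config):
    # Requires boto3 and AWS credentials; not called in this sketch.
    import boto3
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=config
    )

# Placeholder region and account ID.
config = notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:csv-s3-lambda"
)
print(config)
```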

8. Test the setup - Testing S3 Event Trigger to Import New Data into DynamoDB

file: friends1.csv

Upload the friends1.csv file to the friends-s3 bucket

  • This upload event should trigger our Lambda function to import the CSV data into the DynamoDB table FriendsDDB.

  • Go to the DynamoDB table FriendsDDB to see the changes.

  • Click on the refresh button if items have not yet changed.

  • You can see that new CSV data has been successfully imported into the DynamoDB table.

9. Cleanup

  • Delete the Lambda function
  • Delete the S3 bucket (empty it first)
  • Delete the DynamoDB table
  • Delete the IAM role
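
The cleanup can also be scripted. The sketch below lists the API calls in an order that should work, assuming the bucket is not versioned and holds only the two files uploaded in this walkthrough. `cleanup_plan` and `run` are hypothetical helpers.

```python
# AWS managed policies attached to the role in step 4.
MANAGED_POLICIES = [
    "arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess",
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
]

def cleanup_plan():
    """Return (service, operation, kwargs) tuples in deletion order."""
    steps = [("lambda", "delete_function", {"FunctionName": "csv-s3-lambda"})]
    # A bucket has to be empty before it can be deleted.
    for key in ("friends.csv", "friends1.csv"):
        steps.append(("s3", "delete_object", {"Bucket": "friends-s3", "Key": key}))
    steps.append(("s3", "delete_bucket", {"Bucket": "friends-s3"}))
    steps.append(("dynamodb", "delete_table", {"TableName": "FriendsDDB"}))
    # Managed policies have to be detached before the role can be deleted.
    for arn in MANAGED_POLICIES:
        steps.append(
            ("iam", "detach_role_policy", {"RoleName": "csv-lambda-role", "PolicyArn": arn})
        )
    steps.append(("iam", "delete_role", {"RoleName": "csv-lambda-role"}))
    return steps

def run(steps):
    # Requires boto3 and AWS credentials; not called in this sketch.
    import boto3
    for service, operation, kwargs in steps:
        getattr(boto3.client(service), operation)(**kwargs)

for service, operation, kwargs in cleanup_plan():
    print(service, operation, kwargs)
```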

What we have done so far

  • We have successfully created an Amazon DynamoDB Table.

  • We have successfully created a Lambda function and configured it to import CSV data from S3 into DynamoDB.

  • We have created an S3 event to trigger our Lambda function.

  • We have tested the import of a new CSV file to the DynamoDB table.

Top comments (1)

Marcel Kühnau

Hey
great post so far

but if I follow along and create the test I get an error saying "name friends not found" and now im a bit sad