Hritik Raj
⚡ AWS 146: Event-Driven Automation - Syncing S3 Buckets with Lambda & DynamoDB

🚀 Serverless Sync: Automating File Transfers with AWS Lambda

Hey Cloud Architects 👋


Welcome to Day 46 of the #100DaysOfCloud Challenge!
Today, we are moving into advanced serverless workflows. The Nautilus DevOps team needs to automate file management between a public upload bucket and a secure private storage bucket. We are implementing an event-driven solution where an upload to S3 triggers a Lambda function to copy the file and log the entire transaction into DynamoDB.

This task is part of my hands-on practice on the KodeKloud Engineer platform, perfect for mastering serverless integration and IAM roles.


🎯 Objective

  • Provision a public upload bucket (xfusion-public-27433) and a private storage bucket (xfusion-private-26642).
  • Create a DynamoDB table named xfusion-S3CopyLogs for auditing.
  • Configure an IAM Role (lambda_execution_role) with permissions for S3, DynamoDB, and CloudWatch.
  • Deploy a Lambda function (xfusion-copyfunction) to handle the copy logic and metadata logging.
  • Verify the automated workflow by uploading sample.zip and checking the logs.

💡 The Power of Event-Driven Design

In an event-driven system, code only runs when something happens (like a file upload). This is highly cost-effective and scalable: you aren't paying for idle servers; you only pay for the few milliseconds the Lambda function actually runs.

🔹 Key Concepts

  • S3 Event Notifications: A feature that detects uploads and sends a signal (a JSON event) to Lambda; a sketch of parsing that event follows this list.
  • Lambda Execution Role: A set of permissions that allows your code to "talk" to other AWS services like S3 and DynamoDB securely.
  • Boto3 (Python SDK): The library used within the Lambda function to interact with AWS resources programmatically.
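
For context, here is roughly how a handler pulls the source bucket and object key out of that JSON event. This is a minimal sketch (the helper name is my own), assuming the standard S3 event record shape:

```python
import urllib.parse

def parse_s3_event(event):
    """Extract the source bucket and object key from an S3 event."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # S3 URL-encodes keys in event payloads (spaces arrive as '+'),
    # so decode before using the key in API calls.
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    return bucket, key
```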

πŸ› οΈ Step-by-Step: Serverless Workflow


🔹 Phase A: Storage & Database Setup

First, we prepare our data "landing" and "logging" zones; a boto3 sketch of this setup follows the list.

  • Public Bucket: Create xfusion-public-27433 with Block Public Access disabled so it can accept public uploads.
  • Private Bucket: Create xfusion-private-26642 with all public access blocked.
  • DynamoDB Table: Create xfusion-S3CopyLogs with a Partition Key named LogID (String).
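
If you prefer scripting this over clicking through the console, here is a hedged boto3 sketch of Phase A. It assumes us-east-1 (other regions need a LocationConstraint on create_bucket) and on-demand billing for the table:

```python
import boto3

s3 = boto3.client("s3")
dynamodb = boto3.client("dynamodb")

# Public upload bucket: lift the default Block Public Access settings.
s3.create_bucket(Bucket="xfusion-public-27433")
s3.put_public_access_block(
    Bucket="xfusion-public-27433",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": False,
        "IgnorePublicAcls": False,
        "BlockPublicPolicy": False,
        "RestrictPublicBuckets": False,
    },
)

# Private bucket: Block Public Access stays on by default.
s3.create_bucket(Bucket="xfusion-private-26642")

# Audit table keyed on LogID (String). PAY_PER_REQUEST is my assumption.
dynamodb.create_table(
    TableName="xfusion-S3CopyLogs",
    AttributeDefinitions=[{"AttributeName": "LogID", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "LogID", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
)
```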

🔹 Phase B: Identity Management (IAM)

The Lambda function needs permission to read from one bucket and write to another.

  • Create Role: Named lambda_execution_role.
  • Attach Policies: Add permissions for s3:GetObject (public bucket), s3:PutObject (private bucket), and dynamodb:PutItem (log table), plus CloudWatch Logs permissions for troubleshooting; see the sketch after this list.
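
A boto3 sketch of the role setup. The trust policy lets Lambda assume the role; the inline policy name (xfusion-copy-permissions) and the exact resource ARNs are my assumptions based on the task's names:

```python
import json
import boto3

iam = boto3.client("iam")

# Trust policy: allow the Lambda service to assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="lambda_execution_role",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Least-privilege permissions: read source, write destination, log to DynamoDB,
# and emit CloudWatch Logs for troubleshooting.
permissions = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::xfusion-public-27433/*"},
        {"Effect": "Allow", "Action": "s3:PutObject",
         "Resource": "arn:aws:s3:::xfusion-private-26642/*"},
        {"Effect": "Allow", "Action": "dynamodb:PutItem",
         "Resource": "arn:aws:dynamodb:*:*:table/xfusion-S3CopyLogs"},
        {"Effect": "Allow",
         "Action": ["logs:CreateLogGroup", "logs:CreateLogStream",
                    "logs:PutLogEvents"],
         "Resource": "*"},
    ],
}

iam.put_role_policy(
    RoleName="lambda_execution_role",
    PolicyName="xfusion-copy-permissions",  # illustrative name
    PolicyDocument=json.dumps(permissions),
)
```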


🔹 Phase C: Lambda Function Deployment

Now, we deploy the logic that bridges the two buckets.

  • Function Name: xfusion-copyfunction (Python).
  • Configuration: Replace the placeholders in the lambda-function.py script with your specific private bucket name and DynamoDB table name (a sketch of the finished script follows this list).
  • Trigger: Add an S3 trigger to the xfusion-public-27433 bucket for the "All object create events" type.
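
For reference, here is a minimal sketch of what lambda-function.py might look like once the placeholders are filled in. The attribute names in the log item (SourceBucket, DestinationBucket, ObjectKey, Timestamp) are illustrative; match whatever schema your lab expects:

```python
import json
import uuid
import urllib.parse
import boto3

s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("xfusion-S3CopyLogs")

DEST_BUCKET = "xfusion-private-26642"

def lambda_handler(event, context):
    for record in event["Records"]:
        src_bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Copy the uploaded object into the private bucket.
        s3.copy_object(
            Bucket=DEST_BUCKET,
            Key=key,
            CopySource={"Bucket": src_bucket, "Key": key},
        )

        # Write the audit entry to DynamoDB.
        table.put_item(Item={
            "LogID": str(uuid.uuid4()),
            "SourceBucket": src_bucket,
            "DestinationBucket": DEST_BUCKET,
            "ObjectKey": key,
            "Timestamp": record["eventTime"],
        })

    return {"statusCode": 200, "body": json.dumps("Copy complete")}
```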


🔹 Phase D: Testing & Verification

With the infrastructure live, it’s time to test the automation; each check below can also be scripted, as sketched after the list.

  • Upload: Use the CLI or console to upload sample.zip to the public bucket.
  • S3 Check: Navigate to the private bucket to confirm sample.zip has appeared.
  • DynamoDB Check: View the xfusion-S3CopyLogs items to verify a log entry exists with the source, destination, and object key.
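
The same checks, scripted with boto3 (assumes sample.zip sits in your working directory):

```python
import time
import boto3

s3 = boto3.client("s3")
s3.upload_file("sample.zip", "xfusion-public-27433", "sample.zip")

time.sleep(10)  # give the Lambda a moment to fire

# head_object raises a ClientError if the copy never landed.
s3.head_object(Bucket="xfusion-private-26642", Key="sample.zip")

# A full scan is fine for a small lab table.
table = boto3.resource("dynamodb").Table("xfusion-S3CopyLogs")
for item in table.scan()["Items"]:
    print(item)
```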

✅ Verify Success

  • Automation: The file appeared in the private bucket without any manual move command.
  • Auditing: A new entry in DynamoDB accurately reflects the transfer details.
  • Permissions: The Lambda function successfully assumed the lambda_execution_role to perform the task.

πŸ“ Key Takeaways

  • 🚀 Zero Server Management: Everything we built today is serverless; AWS handles the underlying scaling and availability.
  • 🛡️ Principle of Least Privilege: By creating a custom IAM role, we ensured the Lambda function only has access to exactly what it needs and nothing more.
  • 📦 Visibility: Logging to DynamoDB provides a persistent audit trail that is much easier to query than raw CloudWatch logs.

🚫 Common Mistakes

  • Circular Triggers: Never set a Lambda to trigger on a bucket it writes to, or you will create an infinite loop and a massive AWS bill! (A defensive guard is sketched after this list.)
  • Missing Placeholder Update: Forgetting to update the Python script with your actual bucket and table names will cause the Lambda to fail with a "ResourceNotFound" error.
  • IAM Policy Delay: Sometimes it takes a minute for new IAM permissions to propagate; if your first test fails, wait 60 seconds and try again.
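
One cheap safeguard against the circular-trigger mistake: a guard you could add at the top of the handler (the helper itself is hypothetical):

```python
DEST_BUCKET = "xfusion-private-26642"

def should_skip(record):
    """Ignore events from the bucket we write to, so a misconfigured
    trigger can never start an infinite copy loop."""
    return record["s3"]["bucket"]["name"] == DEST_BUCKET
```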

🌟 Final Thoughts

This project is a perfect example of how serverless components can be stitched together to solve complex business problems. You’ve just built an automated, secure, and auditable data pipeline that forms the core of many modern enterprise applications.


🌟 Practice Like a Pro

Want to master serverless architectures? Sharpen your skills here:
👉 KodeKloud Engineer - Practice Labs


🔗 Let’s Connect
