Hritik Raj

Posted on Jan 3

🪣 AWS 123: Data in Motion - Migrating S3 Buckets via AWS CLI

#aws #s3 #devops #100daysofcloud

🔄 Efficient Data Migration: S3 Sync Strategies

Hey Cloud Builders 👋

Welcome to Day 23 of the #100DaysOfCloud Challenge!
Today, the Nautilus DevOps team is tackling a high-stakes data migration. We need to move a substantial amount of data from an old bucket to a brand-new one while ensuring 100% data consistency using the power of the AWS CLI.

By using the sync command instead of a simple cp (copy), we can ensure that our migration is both fast and accurate.

🎯 Objective

Create a new private S3 bucket named devops-sync-19208
Migrate all data from devops-s3-12582 to the new bucket
Verify that both buckets are perfectly synchronized
Perform all actions exclusively via the AWS CLI

💡 Why S3 Sync Over Copy?

While aws s3 cp is great for single files, aws s3 sync is the better fit for migrations: it walks the whole bucket and transfers only the objects that are missing or changed at the destination.

🔹 Key Concepts

Sync Command: Recursively copies new and updated files from the source to the destination. It compares file sizes and modification times to avoid redundant transfers.
Private by Default: Security is paramount. New buckets should always remain private unless there is a specific requirement for public access.
Data Integrity: Migrations aren't finished until they are verified. We use listing commands to ensure the object counts match.

🛠️ Step-by-Step: S3 Data Migration

We’ll move logically from Creation → Migration → Verification.

🔹 Phase A: Create the New S3 Bucket

Use the mb (Make Bucket) command to create your destination. Here [DESTINATION_BUCKET] stands for the full S3 URI of the new bucket, s3://devops-sync-19208:

aws s3 mb [DESTINATION_BUCKET] --region us-east-1
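Since the objective calls for a private bucket, it can help to set and confirm that explicitly rather than rely on defaults. The following is a minimal sketch, assuming the AWS CLI is installed, credentials are configured, and the caller is allowed to manage the bucket's public access block. The function name make_private_bucket is hypothetical, not something from the original task.

```shell
#!/bin/sh
# Hypothetical helper: create a bucket and block all public access to it.
# Assumes the AWS CLI is installed and credentials are already configured.
make_private_bucket() (
  bucket="$1"                 # bare bucket name, e.g. devops-sync-19208
  region="${2:-us-east-1}"    # defaults to the region used in this post

  # mb = make bucket
  aws s3 mb "s3://${bucket}" --region "$region" || exit 1

  # Explicitly turn on all four public access block settings.
  aws s3api put-public-access-block \
    --bucket "$bucket" \
    --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true" \
    || exit 1

  # Print the resulting configuration so it can be checked by eye.
  aws s3api get-public-access-block --bucket "$bucket"
)
```

Usage: make_private_bucket devops-sync-19208 us-east-1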

🔹 Phase B: Migrate Data using Sync

Now, we trigger the migration. The syntax is aws s3 sync <source> <destination>, where [SOURCE_BUCKET] is s3://devops-s3-12582 and [DESTINATION_BUCKET] is s3://devops-sync-19208.

aws s3 sync [SOURCE_BUCKET] [DESTINATION_BUCKET]
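Before moving a large amount of data, you may want to preview what sync would do. Here is a small sketch using the --dryrun flag, which prints the planned operations without copying anything. The function name count_planned_copies is hypothetical, and it assumes each planned operation is printed on its own line starting with "(dryrun)".

```shell
#!/bin/sh
# Hypothetical helper: count how many operations a sync WOULD perform.
# Nothing is copied, because --dryrun only prints the plan.
count_planned_copies() (
  src="$1"   # e.g. s3://devops-s3-12582
  dst="$2"   # e.g. s3://devops-sync-19208
  # grep -c prints the number of matching lines (0 when nothing is planned).
  aws s3 sync "$src" "$dst" --dryrun | grep -c '^(dryrun)'
)
```

Usage: count_planned_copies s3://devops-s3-12582 s3://devops-sync-19208. Once a real sync has completed and the source has not changed, running this again should print 0.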

🔹 Phase C: Verify Data Consistency

To ensure the migration was successful, we list the contents of both buckets to compare:

Check Source:

aws s3 ls [SOURCE_BUCKET] --recursive --human-readable --summarize

Check Destination:

aws s3 ls [DESTINATION_BUCKET] --recursive --human-readable --summarize
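Comparing the two summaries by eye works, but the check can also be scripted. The sketch below assumes the --summarize output ends with "Total Objects:" and "Total Size:" lines, and it leaves out --human-readable so the size is compared in exact bytes. The function names bucket_summary and buckets_match are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<object count> <size in bytes>" for one bucket.
bucket_summary() (
  aws s3 ls "$1" --recursive --summarize |
    awk -F': *' '
      /^Total Objects:/  { objects = $2 }
      /^ *Total Size:/   { size = $2 }
      END {
        # Fail if the summary lines were missing (e.g. the listing failed).
        if (objects == "" || size == "") exit 1
        print objects, size
      }'
)

# Hypothetical helper: succeed only when both buckets report the same totals.
buckets_match() (
  src=$(bucket_summary "$1") || exit 1
  dst=$(bucket_summary "$2") || exit 1
  echo "source:      $src"
  echo "destination: $dst"
  [ "$src" = "$dst" ]
)
```

Usage: buckets_match s3://devops-s3-12582 s3://devops-sync-19208 && echo "in sync". Matching totals are a quick sanity check, not a byte-for-byte comparison of every object.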

✅ Verify Success

🎉 If the "Total Objects" and "Total Size" match in both command outputs, mission accomplished! Every object made it across. Keep in mind that --human-readable rounds the size, so drop that flag when you want to compare exact byte counts.

📝 Key Takeaways

🚀 sync is "Idempotent": You can run it multiple times; it will only copy what has changed since the last run.
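🔐 Permissions: Ensure your CLI user has s3:ListBucket and s3:GetObject on the source, and s3:ListBucket and s3:PutObject on the destination (sync lists both sides to decide what to copy). Creating the bucket with mb also requires s3:CreateBucket.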
🌍 Cross-Region: You can sync buckets even if they are in different AWS regions!
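For the cross-region case, the CLI lets you state both regions explicitly: --source-region for the source bucket and --region for the destination. A minimal sketch follows; the function name sync_cross_region and the us-west-2 region in the usage line are assumptions for illustration only, since the post itself works in a single region.

```shell
#!/bin/sh
# Hypothetical helper: sync between buckets that live in different regions.
sync_cross_region() (
  src="$1"          # source URI, e.g. s3://devops-s3-12582
  src_region="$2"   # region of the source bucket (assumed, e.g. us-west-2)
  dst="$3"          # destination URI, e.g. s3://devops-sync-19208
  dst_region="$4"   # region of the destination bucket, e.g. us-east-1
  aws s3 sync "$src" "$dst" \
    --source-region "$src_region" \
    --region "$dst_region"
)
```

Usage: sync_cross_region s3://devops-s3-12582 us-west-2 s3://devops-sync-19208 us-east-1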

🚫 Common Mistakes

Missing the S3 Prefix: Always remember the s3:// before the bucket name.
Trailing Slashes: Be careful with slashes at the end of bucket names; they can sometimes affect how folders are nested during a sync.
Bucket Names: Remember that S3 bucket names must be globally unique.
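The first and last mistakes above can be caught before any command reaches AWS. Here is a rough sketch of a pre-flight check. The function name is_valid_s3_uri is hypothetical, and it covers only the basic naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit). It does not cover every rule, and it cannot tell you whether a name is already taken globally.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for s3://<plausible-bucket-name>[/prefix].
is_valid_s3_uri() (
  uri="$1"

  # Mistake 1: forgetting the s3:// scheme.
  case "$uri" in
    s3://*) ;;
    *) echo "missing s3:// prefix: $uri" >&2; exit 1 ;;
  esac

  # Strip the scheme, then keep everything up to the first slash.
  rest="${uri#s3://}"
  bucket="${rest%%/*}"

  # Basic naming rules only; LC_ALL=C keeps [a-z] strictly lowercase ASCII.
  if ! printf '%s\n' "$bucket" |
       LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$'; then
    echo "invalid bucket name: $bucket" >&2
    exit 1
  fi
)
```

Usage: is_valid_s3_uri s3://devops-sync-19208 && echo "looks fine"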

🌟 Final Thoughts

You’ve just executed a fundamental DevOps task: Data Reliability. By mastering the AWS CLI for S3, you can automate backups, website deployments, and large-scale data transfers with a single line of code.

This skill is essential for:

Disaster Recovery (DR) setups
Moving from Development to Production environments
Periodic data archival

🔗 Let’s Connect

💬 LinkedIn: Hritik Raj
⭐ Support my journey on GitHub: 100 Days of Cloud
