Hritik Raj

🪣 AWS 123: Data in Motion - Migrating S3 Buckets via AWS CLI

AWS

🔄 Efficient Data Migration: S3 Sync Strategies

Hey Cloud Builders 👋

Welcome to Day 23 of the #100DaysOfCloud Challenge!
Today, the Nautilus DevOps team is tackling a high-stakes data migration. We need to move a substantial amount of data from an old bucket to a brand-new one while ensuring 100% data consistency using the power of the AWS CLI.

By using the sync command instead of a simple cp (copy), we can ensure that our migration is both fast and accurate.


🎯 Objective

  • Create a new private S3 bucket named devops-sync-19208
  • Migrate all data from devops-s3-12582 to the new bucket
  • Verify that both buckets are perfectly synchronized
  • Perform all actions exclusively via the AWS CLI

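💡 Why S3 Sync Over Copy?

While aws s3 cp is great for single files, aws s3 sync is the better fit for migrations: it skips objects that already exist unchanged at the destination, so an interrupted run can simply be run again.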

🔹 Key Concepts

  • Sync Command: Recursively copies new and updated files from the source to the destination. It compares file sizes and modification times to avoid redundant transfers.

  • Private by Default: Security is paramount. New buckets should always remain private unless there is a specific requirement for public access.

  • Data Integrity: Migrations aren't finished until they are verified. We use listing commands to ensure the object counts match.


🛠️ Step-by-Step: S3 Data Migration

We'll move logically from Creation → Migration → Verification.


🔹 Phase A: Create the New S3 Bucket

Use the mb (Make Bucket) command to create your destination:

aws s3 mb s3://devops-sync-19208 --region us-east-1

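Since the objective asks for a private bucket, it can be worth confirming that right after creation. Here is a minimal sketch, assuming your CLI user is allowed to call s3api head-bucket and s3api get-public-access-block. The check_bucket_private helper name is made up for this post and is not part of the AWS CLI:

```shell
# Assumption: the destination bucket name from the Objective, without the s3:// prefix
# (the s3api commands take a plain bucket name).
DEST_BUCKET="devops-sync-19208"

# check_bucket_private NAME (hypothetical helper)
# Confirms the bucket is reachable, then prints its Block Public Access settings.
check_bucket_private() {
  # head-bucket exits non-zero if the bucket is missing or access is denied
  aws s3api head-bucket --bucket "$1" || return 1
  # Prints the four bucket-level Block Public Access flags as JSON
  aws s3api get-public-access-block --bucket "$1"
}
```

Run check_bucket_private "$DEST_BUCKET" after Phase A. For a private bucket you would expect BlockPublicAcls, IgnorePublicAcls, BlockPublicPolicy, and RestrictPublicBuckets to all be reported as true.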

🔹 Phase B: Migrate Data using Sync

Now, we trigger the migration. The syntax is aws s3 sync <source> <destination>.

aws s3 sync s3://devops-s3-12582 s3://devops-sync-19208

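Before moving a lot of data, you may want to preview what sync is about to do. The sketch below wraps the same command with the CLI's --dryrun flag, which lists the planned operations without transferring anything. The two function names are just illustrative helpers, and the bucket names are the ones from the Objective:

```shell
# Bucket names from the Objective, without the s3:// prefix
SRC_BUCKET="devops-s3-12582"
DEST_BUCKET="devops-sync-19208"

# preview_sync SRC DEST (hypothetical helper)
# Lists the copies sync would perform, without transferring anything.
preview_sync() {
  aws s3 sync "s3://$1" "s3://$2" --dryrun
}

# run_sync SRC DEST (hypothetical helper)
# Performs the actual transfer.
run_sync() {
  aws s3 sync "s3://$1" "s3://$2"
}
```

A typical flow would be preview_sync "$SRC_BUCKET" "$DEST_BUCKET" first, then run_sync with the same arguments once the preview looks right.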

🔹 Phase C: Verify Data Consistency

To ensure the migration was successful, we list the contents of both buckets to compare:

  • Check Source:
aws s3 ls s3://devops-s3-12582 --recursive --human-readable --summarize

  • Check Destination:
aws s3 ls s3://devops-sync-19208 --recursive --human-readable --summarize

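Comparing the two listings by eye works for small buckets, but the comparison can also be scripted. The following is a rough sketch, assuming the summary lines printed by --summarize look like "Total Objects: 3" and "Total Size: 1024". It leaves out --human-readable so the sizes are compared in exact bytes. Both helper names are invented for this example:

```shell
# bucket_summary NAME (hypothetical helper)
# Prints "<object count> <total bytes>" for a bucket, or nothing if the listing failed.
bucket_summary() {
  aws s3 ls "s3://$1" --recursive --summarize |
    awk '/Total Objects:/ {objects = $3}
         /Total Size:/    {size = $3}
         END { if (objects != "") print objects, size }'
}

# buckets_match SRC DEST (hypothetical helper)
# Succeeds only if both buckets report the same object count and total size.
buckets_match() {
  src_summary="$(bucket_summary "$1")"
  dst_summary="$(bucket_summary "$2")"
  echo "source:      $src_summary"
  echo "destination: $dst_summary"
  [ -n "$src_summary" ] && [ "$src_summary" = "$dst_summary" ]
}
```

For example, buckets_match devops-s3-12582 devops-sync-19208 && echo "in sync" prints both summaries and only reports success when they agree. Keep in mind that this compares counts and byte totals, not the content of each object.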

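✅ Verify Success

🎉 If the "Total Objects" and "Total Size" match in both command outputs, mission accomplished! Matching counts and sizes are a strong sign that nothing was lost along the way; for extra confidence, run the sync again and confirm it has nothing left to copy.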


📝 Key Takeaways

  • 🚀 sync is "Idempotent": You can run it multiple times; it will only copy what has changed since the last run.
  • 🔐 Permissions: Ensure your CLI user has s3:ListBucket and s3:GetObject on the source, and s3:ListBucket and s3:PutObject on the destination (sync lists the destination to decide what still needs copying).
  • 🌍 Cross-Region: You can sync buckets even if they are in different AWS regions!
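For the cross-region case, the CLI lets you name both regions explicitly: --source-region is the region of the source bucket, and --region then refers to the destination. A small sketch, where the helper name and the eu-west-1 region in the usage note are only placeholders:

```shell
# sync_cross_region SRC DEST SRC_REGION DEST_REGION (hypothetical helper)
# Syncs two buckets that live in different regions.
sync_cross_region() {
  # --source-region: region of the source bucket
  # --region:        region of the destination bucket
  aws s3 sync "s3://$1" "s3://$2" --source-region "$3" --region "$4"
}
```

You could call it as sync_cross_region my-source-bucket my-dest-bucket us-east-1 eu-west-1, swapping in your own bucket names and regions.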

🚫 Common Mistakes

  • Missing the S3 Prefix: Always remember the s3:// before the bucket name.
  • Trailing Slashes: Be careful with slashes at the end of bucket names; they can sometimes affect how folders are nested during a sync.
  • Bucket Names: Remember that S3 bucket names must be globally unique.
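If you script these commands, the missing-prefix mistake is easy to guard against. Here is a tiny sketch of a helper (the to_s3_uri name is made up for illustration) that adds s3:// only when it is not already there:

```shell
# to_s3_uri NAME_OR_URI (hypothetical helper)
# Prints the argument as an S3 URI, adding the s3:// prefix if it is missing.
to_s3_uri() {
  case "$1" in
    s3://*) printf '%s\n' "$1" ;;       # already a URI, leave untouched
    *)      printf 's3://%s\n' "$1" ;;  # bare bucket name, add the prefix
  esac
}
```

With this in place, aws s3 ls "$(to_s3_uri devops-sync-19208)" works whether or not the caller remembered the prefix.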

🌟 Final Thoughts

You've just executed a fundamental DevOps task: Data Reliability. By mastering the AWS CLI for S3, you can automate backups, website deployments, and large-scale data transfers with a single line of code.

This skill is essential for:

  • Disaster Recovery (DR) setups
  • Moving from Development to Production environments
  • Periodic data archival
