🚀 Executive Summary
TL;DR: This guide addresses the challenge of manually migrating assets between cloud storage providers like DigitalOcean Spaces and AWS S3. It provides a professional, automated solution using Boto3, leveraging its S3-compatible API to efficiently transfer data programmatically.
🎯 Key Takeaways
- DigitalOcean Spaces are S3-compatible, allowing Boto3’s S3 client to interact with them by specifying a custom endpoint_url and region.
- Using get_paginator('list_objects_v2') is crucial for efficiently listing all objects in large DigitalOcean Spaces, preventing memory issues and ensuring a comprehensive migration.
- When uploading to AWS S3, preserve the ContentType from the source object and explicitly set the ACL (e.g., 'private') and, where appropriate, the StorageClass for the destination object.
- Common pitfalls include authentication errors caused by incorrect credentials or insufficient IAM permissions, API rate limiting under high throughput, and memory exhaustion when handling very large files.
- For large files, avoid reading the entire object into memory; instead, stream do_response['Body'] directly to put_object or use boto3.s3.transfer.S3Transfer for robust managed uploads.
- Post-migration best practices include securely deleting source objects, implementing S3 Lifecycle policies for cost optimization, enabling S3 bucket versioning, and integrating with other AWS services.
Moving Assets from DigitalOcean Spaces to AWS S3 using Boto3
As a DevOps Engineer, you often find yourself navigating the complex landscape of cloud infrastructure,
seeking efficiency, cost optimization, and robust data management solutions. Migrating assets
between cloud storage providers is a common, yet often daunting, task. Manually downloading
terabytes of data from one service only to re-upload it to another is not only
time-consuming and error-prone but also a significant drain on productivity. This process
becomes even more challenging when dealing with a multitude of objects, diverse metadata,
and the need for minimal downtime.
Whether you’re consolidating your infrastructure under a single cloud provider, optimizing
for cost and performance, or simply transitioning away from a legacy setup, automating
this migration is paramount. This tutorial from TechResolve will guide you through a
professional and efficient method to move your valuable assets from DigitalOcean Spaces
to AWS S3 using Boto3, the AWS SDK for Python. Boto3’s versatility allows us
to interact with DigitalOcean Spaces (thanks to its S3-compatible API) and seamlessly
transfer data to AWS S3, all with a robust and programmatic approach.
Prerequisites
Before we dive into the migration process, ensure you have the following in place:
- Python 3.x: Installed on your local machine or server.
- pip: Python’s package installer, usually bundled with Python 3.x.
- DigitalOcean Spaces Access Key and Secret: These credentials are required to authenticate and access your DigitalOcean Space. You can generate them in your DigitalOcean account settings under “API” > “Spaces access keys”.
- DigitalOcean Space Name and Endpoint URL: For example, your-space-name and nyc3.digitaloceanspaces.com.
- AWS Access Key ID and Secret Access Key: An IAM user with programmatic access and appropriate permissions (at least s3:PutObject, s3:GetObject, s3:ListBucket) on your target S3 bucket. Best practice is to use an IAM user or role scoped to least privilege (see the sketch after this list).
- AWS S3 Bucket Name and Region: Your destination bucket, e.g., your-aws-bucket in us-east-1.
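As a rough illustration of that least-privilege advice, the sketch below attaches an inline policy to a hypothetical IAM user named s3-migration-user; the user name, policy name, and bucket ARN are placeholders you would replace with your own, and you may prefer to manage this through the console or infrastructure-as-code instead.
import json
import boto3

# Hypothetical least-privilege policy for the migration user (names and ARNs are placeholders).
MIGRATION_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::your-aws-bucket"
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::your-aws-bucket/*"
        }
    ]
}

iam_client = boto3.client('iam')
iam_client.put_user_policy(
    UserName='s3-migration-user',                 # placeholder IAM user
    PolicyName='s3-migration-least-privilege',
    PolicyDocument=json.dumps(MIGRATION_POLICY)
)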
Step-by-Step Guide
Step 1: Set Up Your Python Environment and Install Boto3
It’s always a good practice to work within a virtual environment to manage your project’s
dependencies. Open your terminal or command prompt and execute the following commands:
python3 -m venv env
source env/bin/activate # On Windows, use `env\Scripts\activate`
pip install boto3
Once Boto3 is installed, you have the necessary library to interact with both
AWS S3 and DigitalOcean Spaces, as DigitalOcean Spaces are S3-compatible, allowing
us to use Boto3’s S3 client with a custom endpoint.
Step 2: Configure Credentials for DigitalOcean Spaces and AWS S3
For security, it’s recommended to supply credentials through environment variables rather than hardcoding them in your script. For this tutorial’s clarity, the script below reads each value with os.getenv and falls back to an inline placeholder you can replace. In a production environment, consider a secrets manager or configuration management tool, or load the values from a secure file.
Boto3 automatically looks for AWS credentials in several locations, including environment
variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) and the
~/.aws/credentials file. For DigitalOcean, we’ll explicitly pass the
credentials and endpoint URL to the Boto3 client.
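If you prefer to keep credentials out of the script entirely, one option (a sketch, assuming you add a named profile called digitalocean to ~/.aws/credentials for the Spaces keys) is to build the two clients from separate Boto3 sessions:
import boto3

# AWS side: the default credential chain (environment variables or ~/.aws/credentials) is used.
aws_session = boto3.session.Session(region_name='us-east-1')
aws_s3_client = aws_session.client('s3')

# DigitalOcean side: a dedicated named profile, e.g. [digitalocean] in ~/.aws/credentials.
do_session = boto3.session.Session(profile_name='digitalocean')
do_spaces_client = do_session.client(
    's3',
    region_name='nyc3',  # arbitrary for Spaces; the endpoint_url does the real routing
    endpoint_url='https://nyc3.digitaloceanspaces.com'  # example endpoint, match your Space's region
)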
Step 3: List Objects from DigitalOcean Spaces
First, we need to connect to your DigitalOcean Space and retrieve a list of all objects
you intend to migrate. We will use boto3.client('s3', ...) for this.
import boto3
import os
# DigitalOcean Spaces Configuration
DO_SPACES_KEY = os.getenv('DO_SPACES_KEY', 'YOUR_DO_SPACES_ACCESS_KEY')
DO_SPACES_SECRET = os.getenv('DO_SPACES_SECRET', 'YOUR_DO_SPACES_SECRET_KEY')
DO_SPACES_ENDPOINT = os.getenv('DO_SPACES_ENDPOINT', 'nyc3.digitaloceanspaces.com') # e.g., nyc3.digitaloceanspaces.com
DO_SPACES_BUCKET_NAME = os.getenv('DO_SPACES_BUCKET_NAME', 'your-do-space-name')
# AWS S3 Configuration
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID', 'YOUR_AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY', 'YOUR_AWS_SECRET_ACCESS_KEY')
AWS_S3_REGION = os.getenv('AWS_S3_REGION', 'us-east-1') # e.g., us-east-1
AWS_S3_BUCKET_NAME = os.getenv('AWS_S3_BUCKET_NAME', 'your-aws-s3-bucket-name')
# Initialize S3 client for DigitalOcean Spaces
do_spaces_client = boto3.client(
's3',
region_name='nyc3', # The region is often part of the endpoint, but boto3 expects something.
# For DO, this can be arbitrary as long as endpoint_url is correct.
endpoint_url=f'https://{DO_SPACES_ENDPOINT}',
aws_access_key_id=DO_SPACES_KEY,
aws_secret_access_key=DO_SPACES_SECRET
)
print(f"Listing objects in DigitalOcean Space: {DO_SPACES_BUCKET_NAME}...")
do_objects_to_migrate = []
paginator = do_spaces_client.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket=DO_SPACES_BUCKET_NAME)
for page in pages:
if "Contents" in page:
for obj in page["Contents"]:
do_objects_to_migrate.append(obj["Key"])
else:
print("DigitalOcean Space is empty or no 'Contents' found in page.")
print(f"Found {len(do_objects_to_migrate)} objects in DigitalOcean Space.")
The code above initializes an S3 client configured for DigitalOcean Spaces. It then uses
a paginator to efficiently list all objects in your specified Space, which is crucial for
Spaces containing a large number of assets. Each object’s Key (its path) is
stored for subsequent download and upload.
Step 4: Download Objects from DigitalOcean Spaces and Upload to AWS S3
Now, we’ll iterate through the list of object keys, download each object from DigitalOcean
Spaces, and then immediately upload it to your AWS S3 bucket. We’ll use a separate
Boto3 client for AWS S3.
# Initialize S3 client for AWS S3
aws_s3_client = boto3.client(
's3',
region_name=AWS_S3_REGION,
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)
print(f"Starting migration to AWS S3 bucket: {AWS_S3_BUCKET_NAME} in region {AWS_S3_REGION}...")
for object_key in do_objects_to_migrate:
try:
print(f"Migrating object: {object_key}")
# 1. Download object from DigitalOcean Spaces
do_response = do_spaces_client.get_object(Bucket=DO_SPACES_BUCKET_NAME, Key=object_key)
object_body = do_response['Body'].read()
content_type = do_response['ContentType'] if 'ContentType' in do_response else 'binary/octet-stream'
# Optional: Preserve metadata if needed. Boto3 copies common headers by default for put_object.
# If custom metadata is crucial, you'd extract it from do_response['Metadata'] and pass to put_object.
# 2. Upload object to AWS S3
aws_s3_client.put_object(
Bucket=AWS_S3_BUCKET_NAME,
Key=object_key,
Body=object_body,
ContentType=content_type,
ACL='private' # Or 'public-read' if your objects need public access
# You can also set StorageClass='STANDARD_IA' for Infrequent Access, etc.
)
print(f"Successfully migrated {object_key}")
except Exception as e:
print(f"Error migrating {object_key}: {e}")
print("Migration complete!")
This script connects to both DigitalOcean Spaces and AWS S3. For each object found
in your DigitalOcean Space, it performs a get_object call to retrieve
its content and then uses put_object to upload it to the specified
AWS S3 bucket. We also preserve the ContentType and set a default
ACL (Access Control List) on the uploaded objects. Note that if your destination
bucket has ACLs disabled (the “Bucket owner enforced” Object Ownership setting, the
default for new buckets), requests that set an ACL may be rejected, so drop the
ACL parameter in that case. You can also adjust parameters such as
StorageClass, Metadata, and ServerSideEncryption
based on your specific requirements.
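For reference, here is a hedged variant of the upload call showing some of those optional parameters; the metadata pass-through and encryption settings are illustrative assumptions, not requirements of the migration itself.
# Illustrative variant of the upload call with optional parameters (values are examples).
aws_s3_client.put_object(
    Bucket=AWS_S3_BUCKET_NAME,
    Key=object_key,
    Body=object_body,
    ContentType=content_type,
    StorageClass='STANDARD_IA',                 # cheaper for infrequently accessed assets
    Metadata=do_response.get('Metadata', {}),   # carry over custom x-amz-meta-* metadata
    ServerSideEncryption='AES256'               # SSE-S3; use 'aws:kms' plus SSEKMSKeyId for KMS keys
)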
Step 5: Verification
After the script completes, it’s crucial to verify that all assets have been
successfully migrated. You can do this by:
- Checking the AWS S3 Console: Navigate to your target bucket in the AWS Management Console and visually inspect the uploaded objects.
- Running a verification script: Write a small Python script using Boto3 to list objects in your AWS S3 bucket and compare the count and some sample keys against your DigitalOcean Space’s contents; a minimal sketch follows below.
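A minimal verification sketch, assuming the clients and bucket names from the earlier steps are still in scope, compares the key sets and reports anything missing from the destination:
# Compare keys between the source Space and the destination bucket.
def list_keys(client, bucket):
    keys = set()
    for page in client.get_paginator('list_objects_v2').paginate(Bucket=bucket):
        for obj in page.get('Contents', []):
            keys.add(obj['Key'])
    return keys

source_keys = list_keys(do_spaces_client, DO_SPACES_BUCKET_NAME)
dest_keys = list_keys(aws_s3_client, AWS_S3_BUCKET_NAME)
missing = source_keys - dest_keys

print(f"Source objects: {len(source_keys)}, destination objects: {len(dest_keys)}")
if missing:
    print(f"Missing from S3 (up to 10 shown): {sorted(missing)[:10]}")
else:
    print("All source keys are present in the destination bucket.")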
Common Pitfalls
- Authentication and Authorization Errors:
Ensure your DigitalOcean Spaces access keys and AWS access keys are correct and
have the necessary permissions. For DigitalOcean, this means read access to the Space.
For AWS, the IAM user or role must have s3:PutObject,
s3:GetObject, and s3:ListBucket permissions on the target
bucket. Look for errors like “Access Denied” or “InvalidAccessKeyId”.
- Rate Limiting:
For very large numbers of small objects or large objects, you might hit API rate
limits from either DigitalOcean or AWS. Boto3 includes automatic retry mechanisms
with exponential backoff for transient errors, but for sustained high throughput,
you might need to implement custom delays or consider parallel processing
(e.g., using Python’s multiprocessing module) with careful rate control.
- Memory and Large Files:
The current script downloads the entire object into memory
(object_body = do_response['Body'].read()) before uploading.
For extremely large files (gigabytes or more), this can lead to memory exhaustion.
A more robust solution for large files involves streaming the data directly
between the two services, without fully loading it into memory. This can be achieved
by passing the do_response['Body'] directly to put_object
without calling .read(), or by using boto3.s3.transfer.S3Transfer
for managed uploads and downloads.
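As one way to address the large-file concern (a sketch, assuming the clients and variables from the earlier steps), Boto3’s managed transfer layer can stream the source body without buffering the whole object: upload_fileobj reads the stream in chunks, switches to multipart uploads above a size threshold, and TransferConfig tunes chunk size and concurrency.
from boto3.s3.transfer import TransferConfig

# Stream a single large object from Spaces to S3 without loading it fully into memory.
transfer_config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # switch to multipart uploads above ~64 MB
    multipart_chunksize=64 * 1024 * 1024,
    max_concurrency=4                      # keep modest to reduce the risk of rate limiting
)

do_response = do_spaces_client.get_object(Bucket=DO_SPACES_BUCKET_NAME, Key=object_key)
aws_s3_client.upload_fileobj(
    do_response['Body'],                   # botocore StreamingBody is a readable file-like object
    AWS_S3_BUCKET_NAME,
    object_key,
    ExtraArgs={'ContentType': do_response.get('ContentType', 'binary/octet-stream')},
    Config=transfer_config
)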
Conclusion
You have successfully orchestrated a data migration from DigitalOcean Spaces to AWS S3 using
Boto3. This programmatic approach not only saves countless hours of manual effort but also
provides a repeatable, auditable, and scalable solution for your cloud storage migration
needs. Automating such tasks is a cornerstone of modern DevOps practices, ensuring consistency
and reducing human error.
Now that your assets reside in AWS S3, you can leverage the full power of the AWS ecosystem.
Consider these next steps:
- Delete Source Objects: Once verified, securely delete the original objects from your DigitalOcean Space to avoid duplicate storage costs.
- Implement Lifecycle Policies: Configure S3 Lifecycle policies to automatically transition objects to different storage classes (e.g., S3 Glacier) or delete them after a certain period, further optimizing costs.
- Set up Versioning: Enable S3 bucket versioning to protect against accidental deletions and overwrites; a short scripted sketch of this and the lifecycle step follows after this list.
- Integrate with AWS Services: Explore how your newly migrated assets can integrate with other AWS services like CloudFront for CDN, Lambda for event-driven processing, or Athena for analytics.
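If you would like to script those two housekeeping steps, a rough sketch (assuming the same aws_s3_client and bucket name as before; the 90-day Glacier transition is only an example) might look like this:
# Enable versioning on the destination bucket.
aws_s3_client.put_bucket_versioning(
    Bucket=AWS_S3_BUCKET_NAME,
    VersioningConfiguration={'Status': 'Enabled'}
)

# Example lifecycle rule: transition all objects to Glacier after 90 days (illustrative values).
aws_s3_client.put_bucket_lifecycle_configuration(
    Bucket=AWS_S3_BUCKET_NAME,
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'archive-old-assets',
                'Status': 'Enabled',
                'Filter': {'Prefix': ''},   # apply to every object in the bucket
                'Transitions': [{'Days': 90, 'StorageClass': 'GLACIER'}]
            }
        ]
    }
)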
At TechResolve, we believe in empowering engineers with the tools and knowledge
to build resilient and efficient cloud infrastructures. Happy migrating!
👉 Read the original article on TechResolve.blog
☕ Support my work
If this article helped you, you can buy me a coffee:
