Darian Vance
Solved: Moving Assets from DigitalOcean Spaces to AWS S3 using Boto3

🚀 Executive Summary

TL;DR: This guide addresses the challenge of manually migrating assets between cloud storage providers like DigitalOcean Spaces and AWS S3. It provides a professional, automated solution using Boto3, leveraging its S3-compatible API to efficiently transfer data programmatically.

🎯 Key Takeaways

  • DigitalOcean Spaces are S3-compatible, so Boto3's S3 client can interact with them by specifying a custom endpoint_url and region.
  • Using get_paginator('list_objects_v2') is crucial for efficiently listing all objects in large DigitalOcean Spaces, preventing memory issues and ensuring a complete migration.
  • When uploading to AWS S3, preserve the ContentType from the source object and explicitly set the ACL (e.g., 'private') and, if needed, a StorageClass for the destination object.
  • Common pitfalls include authentication errors due to incorrect credentials or insufficient IAM permissions, API rate limiting at high throughput, and memory exhaustion when handling very large files.
  • For large files, avoid reading the entire object into memory; instead, stream do_response['Body'] through a managed transfer such as upload_fileobj (built on boto3.s3.transfer) for robust, chunked uploads.
  • Post-migration best practices include securely deleting source objects, implementing S3 Lifecycle policies for cost optimization, enabling S3 bucket versioning, and integrating with other AWS services.


As a DevOps Engineer, you often find yourself navigating the complex landscape of cloud infrastructure, seeking efficiency, cost optimization, and robust data management solutions. Migrating assets between cloud storage providers is a common, yet often daunting, task. Manually downloading terabytes of data from one service only to re-upload it to another is not only time-consuming and error-prone but also a significant drain on productivity. The process becomes even more challenging when dealing with a multitude of objects, diverse metadata, and the need for minimal downtime.

Whether you're consolidating your infrastructure under a single cloud provider, optimizing for cost and performance, or simply transitioning away from a legacy setup, automating this migration is paramount. This tutorial from TechResolve will guide you through a professional and efficient method to move your valuable assets from DigitalOcean Spaces to AWS S3 using Boto3, the AWS SDK for Python. Boto3's versatility allows us to interact with DigitalOcean Spaces (thanks to its S3-compatible API) and seamlessly transfer data to AWS S3, all with a robust and programmatic approach.

Prerequisites

Before we dive into the migration process, ensure you have the following in place:

  • Python 3.x: Installed on your local machine or server.
  • pip: Python’s package installer, usually bundled with Python 3.x.
  • DigitalOcean Spaces Access Key and Secret: These credentials are required to authenticate and access your DigitalOcean Space. You can generate them in your DigitalOcean account settings under “API” > “Spaces access keys”.
  • DigitalOcean Space Name and Endpoint URL: For example, your-space-name and nyc3.digitaloceanspaces.com.
  • AWS Access Key ID and Secret Access Key: An IAM user with programmatic access and appropriate permissions (at least s3:PutObject, s3:GetObject, s3:ListBucket) for your target S3 bucket. Best practice is to use an IAM role with least privilege.
  • AWS S3 Bucket Name and Region: Your destination bucket, e.g., your-aws-bucket in us-east-1.

Step-by-Step Guide

Step 1: Set Up Your Python Environment and Install Boto3

It's always a good practice to work within a virtual environment to manage your project's dependencies. Open your terminal or command prompt and execute the following commands:

python3 -m venv env
source env/bin/activate  # On Windows, use `env\Scripts\activate`
pip install boto3

Once Boto3 is installed, you have the necessary library to interact with both AWS S3 and DigitalOcean Spaces, as DigitalOcean Spaces are S3-compatible, allowing us to use Boto3's S3 client with a custom endpoint.

Step 2: Configure Credentials for DigitalOcean Spaces and AWS S3

For security, it's recommended to use environment variables for your credentials rather than hardcoding them in your script. However, for this tutorial's clarity, we will place them within the script. In a production environment, consider using a configuration management tool or loading from a secure file.

Boto3 automatically looks for AWS credentials in several locations, including environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) and the ~/.aws/credentials file. For DigitalOcean, we'll explicitly pass the credentials and endpoint URL to the Boto3 client.
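
As a minimal sketch of both resolution paths, the snippet below assumes AWS credentials live in a named profile in ~/.aws/credentials (the profile name "migration" is illustrative) and that the DigitalOcean values are exported as environment variables:

import os

import boto3

# AWS side: let Boto3 resolve credentials from a named profile in ~/.aws/credentials.
# The profile name "migration" is a placeholder.
aws_session = boto3.Session(profile_name='migration', region_name='us-east-1')
aws_s3_client = aws_session.client('s3')

# DigitalOcean side: pass the credentials and endpoint explicitly.
do_spaces_client = boto3.client(
    's3',
    region_name='nyc3',  # arbitrary for Spaces; endpoint_url does the routing
    endpoint_url=f"https://{os.environ['DO_SPACES_ENDPOINT']}",
    aws_access_key_id=os.environ['DO_SPACES_KEY'],
    aws_secret_access_key=os.environ['DO_SPACES_SECRET']
)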

Step 3: List Objects from DigitalOcean Spaces

First, we need to connect to your DigitalOcean Space and retrieve a list of all objects you intend to migrate. We will use boto3.client('s3', ...) for this.

import boto3
import os

# DigitalOcean Spaces Configuration
DO_SPACES_KEY = os.getenv('DO_SPACES_KEY', 'YOUR_DO_SPACES_ACCESS_KEY')
DO_SPACES_SECRET = os.getenv('DO_SPACES_SECRET', 'YOUR_DO_SPACES_SECRET_KEY')
DO_SPACES_ENDPOINT = os.getenv('DO_SPACES_ENDPOINT', 'nyc3.digitaloceanspaces.com') # e.g., nyc3.digitaloceanspaces.com
DO_SPACES_BUCKET_NAME = os.getenv('DO_SPACES_BUCKET_NAME', 'your-do-space-name')

# AWS S3 Configuration
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID', 'YOUR_AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY', 'YOUR_AWS_SECRET_ACCESS_KEY')
AWS_S3_REGION = os.getenv('AWS_S3_REGION', 'us-east-1') # e.g., us-east-1
AWS_S3_BUCKET_NAME = os.getenv('AWS_S3_BUCKET_NAME', 'your-aws-s3-bucket-name')

# Initialize S3 client for DigitalOcean Spaces
do_spaces_client = boto3.client(
    's3',
    region_name='nyc3', # The region is often part of the endpoint, but boto3 expects something.
                        # For DO, this can be arbitrary as long as endpoint_url is correct.
    endpoint_url=f'https://{DO_SPACES_ENDPOINT}',
    aws_access_key_id=DO_SPACES_KEY,
    aws_secret_access_key=DO_SPACES_SECRET
)

print(f"Listing objects in DigitalOcean Space: {DO_SPACES_BUCKET_NAME}...")

do_objects_to_migrate = []
paginator = do_spaces_client.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket=DO_SPACES_BUCKET_NAME)

for page in pages:
    if "Contents" in page:
        for obj in page["Contents"]:
            do_objects_to_migrate.append(obj["Key"])
    else:
        print("DigitalOcean Space is empty or no 'Contents' found in page.")

print(f"Found {len(do_objects_to_migrate)} objects in DigitalOcean Space.")

The code above initializes an S3 client configured for DigitalOcean Spaces. It then uses a paginator to efficiently list all objects in your specified Space, which is crucial for Spaces containing a large number of assets. Each object's Key (its path) is stored for subsequent download and upload.

Step 4: Download Objects from DigitalOcean Spaces and Upload to AWS S3

Now, we'll iterate through the list of object keys, download each object from DigitalOcean Spaces, and then immediately upload it to your AWS S3 bucket. We'll use a separate Boto3 client for AWS S3.

# Initialize S3 client for AWS S3
aws_s3_client = boto3.client(
    's3',
    region_name=AWS_S3_REGION,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)

print(f"Starting migration to AWS S3 bucket: {AWS_S3_BUCKET_NAME} in region {AWS_S3_REGION}...")

for object_key in do_objects_to_migrate:
    try:
        print(f"Migrating object: {object_key}")

        # 1. Download object from DigitalOcean Spaces
        do_response = do_spaces_client.get_object(Bucket=DO_SPACES_BUCKET_NAME, Key=object_key)
        object_body = do_response['Body'].read()
        content_type = do_response.get('ContentType', 'binary/octet-stream')

        # Optional: put_object does NOT carry custom metadata over automatically.
        # If it matters, read it from do_response['Metadata'] and pass it via the
        # Metadata parameter below.

        # 2. Upload object to AWS S3
        aws_s3_client.put_object(
            Bucket=AWS_S3_BUCKET_NAME,
            Key=object_key,
            Body=object_body,
            ContentType=content_type,
            ACL='private'  # Or 'public-read'; omit this line if the bucket has ACLs
                           # disabled (Object Ownership: "Bucket owner enforced")
            # You can also set StorageClass='STANDARD_IA' for Infrequent Access, etc.
        )
        print(f"Successfully migrated {object_key}")

    except Exception as e:
        print(f"Error migrating {object_key}: {e}")

print("Migration complete!")

This script connects to both DigitalOcean Spaces and AWS S3. For each object found in your DigitalOcean Space, it performs a get_object call to retrieve its content and then uses put_object to upload it to the specified AWS S3 bucket. We also preserve the ContentType and set a default ACL (Access Control List) for the uploaded objects. You might need to adjust the ACL and consider other parameters such as StorageClass, Metadata, and ServerSideEncryption based on your specific requirements.
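
As a hedged illustration, this variant of the upload call (reusing object_key, object_body, content_type, and do_response from the loop above) shows where those optional parameters would go; the storage class and encryption choice are assumptions to adapt to your needs:

aws_s3_client.put_object(
    Bucket=AWS_S3_BUCKET_NAME,
    Key=object_key,
    Body=object_body,
    ContentType=content_type,
    Metadata=do_response.get('Metadata', {}),  # carry over any custom x-amz-meta-* values
    StorageClass='STANDARD_IA',                # cheaper for infrequently accessed assets
    ServerSideEncryption='AES256'              # SSE-S3 encryption at rest
)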

Step 5: Verification

After the script completes, it's crucial to verify that all assets have been successfully migrated. You can do this by:

  • Checking the AWS S3 Console: Navigate to your target bucket in the AWS Management Console and visually inspect the uploaded objects.
  • Running a verification script: Write a small Python script using Boto3 to list objects in your AWS S3 bucket and compare the count and sample keys against your DigitalOcean Space's contents; a minimal sketch follows below.
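
The following sketch assumes the two clients, bucket names, and the do_objects_to_migrate list from the earlier steps are still in scope:

def list_keys(client, bucket):
    """Collect every object key in a bucket using the list_objects_v2 paginator."""
    keys = set()
    for page in client.get_paginator('list_objects_v2').paginate(Bucket=bucket):
        for obj in page.get('Contents', []):
            keys.add(obj['Key'])
    return keys

source_keys = set(do_objects_to_migrate)
dest_keys = list_keys(aws_s3_client, AWS_S3_BUCKET_NAME)
missing = source_keys - dest_keys

print(f"Source: {len(source_keys)}, Destination: {len(dest_keys)}, Missing: {len(missing)}")
for key in sorted(missing)[:20]:  # print a small sample of anything that did not arrive
    print(f"  missing: {key}")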

Common Pitfalls

  • Authentication and Authorization Errors: Ensure your DigitalOcean Spaces access keys and AWS access keys are correct and have the necessary permissions. For DigitalOcean, this means read access to the Space. For AWS, the IAM user or role must have s3:PutObject, s3:GetObject, and s3:ListBucket permissions on the target bucket (a minimal policy sketch follows below). Look for errors like "Access Denied" or "InvalidAccessKeyId".
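
For reference, a least-privilege policy document for the destination bucket might look like this sketch; the bucket name is a placeholder, and you would attach the resulting JSON to your IAM user or role via the console or the IAM API:

import json

# Placeholder bucket name; replace with your destination bucket.
migration_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::your-aws-s3-bucket-name"
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::your-aws-s3-bucket-name/*"
        }
    ]
}

print(json.dumps(migration_policy, indent=2))  # paste the output into the IAM console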

  • Rate Limiting: For very large numbers of small objects or for large objects, you might hit API rate limits from either DigitalOcean or AWS. Boto3 includes automatic retry mechanisms with exponential backoff for transient errors, but for sustained high throughput you may need to tune the retry configuration or add parallel processing (e.g., using Python's concurrent.futures or multiprocessing) with careful rate control; a sketch using a bounded thread pool follows below.
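
One hedged approach combines Boto3's adaptive retry mode with a bounded thread pool. Here, migrate_object is a hypothetical wrapper around the get_object/put_object pair from Step 4, and the sketch assumes the clients, configuration variables, and key list from the earlier steps:

from concurrent.futures import ThreadPoolExecutor

import boto3
from botocore.config import Config

# Client-side rate limiting and more aggressive retries for the AWS side.
retry_config = Config(retries={'max_attempts': 10, 'mode': 'adaptive'})
aws_s3_client = boto3.client(
    's3',
    region_name=AWS_S3_REGION,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
    config=retry_config
)

def migrate_object(object_key):
    """Hypothetical wrapper: copy a single object, as in Step 4."""
    do_response = do_spaces_client.get_object(Bucket=DO_SPACES_BUCKET_NAME, Key=object_key)
    aws_s3_client.put_object(
        Bucket=AWS_S3_BUCKET_NAME,
        Key=object_key,
        Body=do_response['Body'].read(),
        ContentType=do_response.get('ContentType', 'binary/octet-stream')
    )

# Keep max_workers modest so neither provider throttles the transfer.
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(migrate_object, do_objects_to_migrate))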

  • Memory and Large Files: The current script downloads the entire object into memory (object_body = do_response['Body'].read()) before uploading. For extremely large files (gigabytes or more), this can lead to memory exhaustion. A more robust approach streams the data between the two services without fully loading it into memory: since put_object expects bytes or a seekable file object, pass do_response['Body'] to a managed transfer method such as upload_fileobj (built on boto3.s3.transfer), which uploads in chunks; a sketch follows below.
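
A minimal streaming sketch, assuming the clients, bucket names, and object_key from the earlier steps; upload_fileobj is Boto3's managed transfer method and reads the source stream in chunks:

# Stream the object between providers without buffering it fully in memory.
do_response = do_spaces_client.get_object(Bucket=DO_SPACES_BUCKET_NAME, Key=object_key)

aws_s3_client.upload_fileobj(
    Fileobj=do_response['Body'],  # botocore StreamingBody, consumed in chunks
    Bucket=AWS_S3_BUCKET_NAME,
    Key=object_key,
    ExtraArgs={'ContentType': do_response.get('ContentType', 'binary/octet-stream')}
)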

Conclusion

You have successfully orchestrated a data migration from DigitalOcean Spaces to AWS S3 using Boto3. This programmatic approach not only saves countless hours of manual effort but also provides a repeatable, auditable, and scalable solution for your cloud storage migration needs. Automating such tasks is a cornerstone of modern DevOps practice, ensuring consistency and reducing human error.

Now that your assets reside in AWS S3, you can leverage the full power of the AWS ecosystem. Consider these next steps:

  • Delete Source Objects: Once verified, securely delete the original objects from your DigitalOcean Space to avoid duplicate storage costs.
  • Implement Lifecycle Policies: Configure S3 Lifecycle policies to automatically transition objects to different storage classes (e.g., S3 Glacier) or delete them after a certain period, further optimizing costs (a Boto3 sketch covering lifecycle rules and versioning follows this list).
  • Set up Versioning: Enable S3 bucket versioning to protect against accidental deletions and overwrites.
  • Integrate with AWS Services: Explore how your newly migrated assets can integrate with other AWS services like CloudFront for CDN, Lambda for event-driven processing, or Athena for analytics.
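
A hedged sketch of the lifecycle and versioning items, reusing the AWS client and bucket name from the migration script; the rule ID and the 90-day threshold are illustrative, and the calls require the corresponding s3:PutBucketVersioning and s3:PutLifecycleConfiguration permissions:

# Enable versioning on the destination bucket.
aws_s3_client.put_bucket_versioning(
    Bucket=AWS_S3_BUCKET_NAME,
    VersioningConfiguration={'Status': 'Enabled'}
)

# Transition everything to Glacier after 90 days (illustrative rule).
aws_s3_client.put_bucket_lifecycle_configuration(
    Bucket=AWS_S3_BUCKET_NAME,
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'archive-after-90-days',
                'Filter': {'Prefix': ''},   # apply to all objects
                'Status': 'Enabled',
                'Transitions': [{'Days': 90, 'StorageClass': 'GLACIER'}]
            }
        ]
    }
)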

At TechResolve, we believe in empowering engineers with the tools and knowledge to build resilient and efficient cloud infrastructures. Happy migrating!



👉 Read the original article on TechResolve.blog


Support my work

If this article helped you, you can buy me a coffee:

👉 https://buymeacoffee.com/darianvance
