TL;DR
You can save money on AWS S3 by using Python scripts to automatically move files between storage classes based on how often you access them. AWS features like Intelligent-Tiering and lifecycle policies make this process easy and hands-free.
Abstract
Managing cloud storage efficiently is essential for organizations of all sizes. Amazon Web Services (AWS) provides features like S3 Intelligent-Tiering and lifecycle policies to help automatically move files to the most cost-effective storage locations based on how often they are accessed. This article explains, in simple terms and with step-by-step code, how you can use Python and the Boto3 library to automate these processes—making sure your data is always stored in the right place at the right price. References to AWS documentation and helpful resources are provided throughout.
Introduction
Cloud storage is like a giant online hard drive where companies keep their files, databases, and backups. But just like at home, if you don’t organize your storage, you can end up paying too much for things you rarely use. AWS S3 (Simple Storage Service) offers tools to help you automatically move your files to less expensive storage areas when you don’t need them as often, and bring them back when you do. These tools are called Intelligent-Tiering and lifecycle policies. By automating these processes, companies can save money and reduce manual work, especially when dealing with thousands or millions of files. Learn more about S3 Intelligent-Tiering.
Prerequisites
Before you start, you’ll need:
- An AWS account with permission to create and manage S3 buckets.
- Python installed on your computer (version 3.x is recommended).
- The Boto3 library, which is a tool that lets you control AWS services from Python.
- Your AWS credentials (like a username and password for AWS), which you can set up using the AWS website or the AWS Command Line Interface (CLI).
- If you’re new to Python or AWS, don’t worry—there are many beginner guides online. Here’s AWS’s getting started guide.
Setting Up Your Environment
First, you’ll need to install Boto3 so your Python scripts can talk to AWS. Open your command prompt or terminal and run:
How to install boto3:
Shell
pip install boto3
Next, set up your AWS credentials. You can do this by running these commands in your terminal (replace the values with your own keys):
Shell
export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY
export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY
export AWS_DEFAULT_REGION=us-east-1
If you’re on Windows, or if you simply prefer an interactive setup, you can use the AWS CLI to configure these credentials instead:
Shell
aws configure
See more about AWS credential setup.
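Once your credentials are in place, a quick sanity check helps catch typos before you start creating buckets. The snippet below is a minimal sketch that simply asks AWS who you are; it raises an error if Boto3 cannot find valid credentials.
Python
import boto3

# Ask AWS "who am I?"; this fails immediately if credentials are missing or invalid
sts = boto3.client('sts')
identity = sts.get_caller_identity()
print(identity['Account'], identity['Arn'])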
Creating and Managing S3 Buckets Programmatically
An S3 bucket is like a folder in the cloud where you store your files. You can create one using Python and Boto3 with just a few lines of code:
Python
import boto3
s3 = boto3.client('s3')
s3.create_bucket(Bucket='my-example-bucket')
This code tells AWS to create a new storage bucket named “my-example-bucket.” You can now upload files to it, organize them, and apply storage rules. More about S3 buckets.
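Note that this one-liner works as written only in the us-east-1 region. In any other region, S3 requires a LocationConstraint, and bucket names must be globally unique. Here is a minimal sketch assuming a bucket in eu-west-1 (both the region and the bucket name are just examples):
Python
import boto3

region = 'eu-west-1'  # example region; omit CreateBucketConfiguration for us-east-1
s3 = boto3.client('s3', region_name=region)
s3.create_bucket(
    Bucket='my-example-bucket-eu',  # bucket names must be globally unique
    CreateBucketConfiguration={'LocationConstraint': region}
)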
Enabling S3 Intelligent-Tiering with Python
S3 Intelligent-Tiering is a feature that automatically moves your files between different “tiers” of storage based on how often you use them. For example, files you use every day stay in a fast, slightly more expensive tier, while files you rarely touch are moved to a slower, cheaper tier. This helps you save money without having to move files manually. How S3 Intelligent-Tiering works.
Here’s how you can set it up with Python:
Python
import boto3

s3 = boto3.client('s3')

intelligent_tiering_config = {
    'Id': 'MyIntelligentTieringConfig',
    'Status': 'Enabled',
    'Filter': {'Prefix': ''},  # apply to all objects in the bucket
    'Tierings': [
        # AWS requires at least 90 days for ARCHIVE_ACCESS and 180 for DEEP_ARCHIVE_ACCESS
        {'Days': 90, 'AccessTier': 'ARCHIVE_ACCESS'},
        {'Days': 180, 'AccessTier': 'DEEP_ARCHIVE_ACCESS'}
    ]
}

s3.put_bucket_intelligent_tiering_configuration(
    Bucket='my-example-bucket',
    Id='MyIntelligentTieringConfig',
    IntelligentTieringConfiguration=intelligent_tiering_config
)
This configuration tells AWS to move objects that haven’t been accessed for 90 days into the Archive Access tier, and after 180 days into the Deep Archive Access tier, which is even cheaper (90 and 180 days are the minimum periods AWS allows for these tiers). You can check your configuration with the get_bucket_intelligent_tiering_configuration API.
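The configuration above only affects objects that are already stored in the Intelligent-Tiering storage class. If you want new uploads to land there directly, you can set the storage class at upload time. A minimal sketch, reusing the s3 client from above and assuming a local file named report.pdf (the file and key names are just examples):
Python
s3.upload_file(
    'report.pdf',                      # local file to upload (example name)
    'my-example-bucket',
    'reports/report.pdf',              # object key in the bucket
    ExtraArgs={'StorageClass': 'INTELLIGENT_TIERING'}
)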
Automating Lifecycle Policies
A lifecycle policy is like a set of rules that tells AWS what to do with your files over time. For example, you might want to automatically move files to cheaper storage after a month, or even delete them after a year. This is especially useful for old logs, backups, or files you don’t need forever. More about S3 lifecycle policies.
Here’s how you can set up a lifecycle policy with Python:
Python
lifecycle_policy = {
    'Rules': [
        {
            'ID': 'ArchiveOldFiles',
            'Status': 'Enabled',
            'Filter': {'Prefix': ''},  # apply to every object in the bucket
            'Transitions': [
                {'Days': 30, 'StorageClass': 'STANDARD_IA'},
                {'Days': 90, 'StorageClass': 'GLACIER'},
                {'Days': 365, 'StorageClass': 'DEEP_ARCHIVE'}
            ]
        }
    ]
}

s3.put_bucket_lifecycle_configuration(
    Bucket='my-example-bucket',
    LifecycleConfiguration=lifecycle_policy
)
This rule moves files to less expensive storage classes as they get older. You can adjust the days and storage classes to fit your needs. See the full API reference.
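Because lifecycle rules accept a prefix filter, you can also target just one part of a bucket, for example old logs, and even delete them after a year as mentioned above. The sketch below assumes log files live under a logs/ prefix (the prefix and day counts are examples). Keep in mind that put_bucket_lifecycle_configuration replaces the bucket’s entire lifecycle configuration, so every rule you want to keep must be submitted together in one call.
Python
logs_policy = {
    'Rules': [
        {
            'ID': 'ExpireOldLogs',
            'Status': 'Enabled',
            'Filter': {'Prefix': 'logs/'},   # only objects under logs/ are affected
            'Transitions': [
                {'Days': 30, 'StorageClass': 'GLACIER'}
            ],
            'Expiration': {'Days': 365}      # permanently delete log objects after one year
        }
    ]
}

s3.put_bucket_lifecycle_configuration(
    Bucket='my-example-bucket',
    LifecycleConfiguration=logs_policy
)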
Monitoring and Analyzing Storage Classes
It’s important to know where your files are and what storage class they’re in. You can use Python to list your files and see their current storage status:
Python
import boto3

s3 = boto3.client('s3')
bucket = 'my-example-bucket'

paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get('Contents', []):
        head = s3.head_object(Bucket=bucket, Key=obj['Key'])
        # head_object omits StorageClass for STANDARD objects, so default to 'STANDARD';
        # ArchiveStatus only appears for Intelligent-Tiering objects in an archive tier
        print(obj['Key'], head.get('StorageClass', 'STANDARD'), head.get('ArchiveStatus', '-'))
This script prints out each file’s name and its storage class (like STANDARD, GLACIER, etc.), so you can audit your storage and make sure your policies are working. How to check S3 object storage class using boto3.
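Calling head_object on every key is fine for small buckets, but list_objects_v2 already includes each object’s storage class, so for a quick overview you can aggregate directly from the listing. A minimal sketch that counts objects and bytes per storage class:
Python
import boto3
from collections import Counter

s3 = boto3.client('s3')
bucket = 'my-example-bucket'
counts, total_bytes = Counter(), Counter()

paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get('Contents', []):
        # the listing reports the storage class directly, no per-object head_object needed
        storage_class = obj.get('StorageClass', 'STANDARD')
        counts[storage_class] += 1
        total_bytes[storage_class] += obj['Size']

for storage_class in counts:
    gib = total_bytes[storage_class] / (1024 ** 3)
    print(f'{storage_class}: {counts[storage_class]} objects, {gib:.2f} GiB')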
Changing Storage Class of an Object
Sometimes you may want to manually move a file to a different storage class (for example, if you know you won’t need it for a long time). You can do this with the following code:
Python
s3.copy_object(
    Bucket='my-example-bucket',
    CopySource={'Bucket': 'my-example-bucket', 'Key': 'old-object.txt'},
    Key='old-object.txt',
    StorageClass='GLACIER'
)
This command copies the file to itself, but changes its storage class to GLACIER, which is a low-cost, long-term storage option. How to change storage class of object in S3 bucket using boto3.
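Keep in mind that objects in GLACIER (or DEEP_ARCHIVE) cannot be downloaded directly; you first request a temporary restore and wait for it to complete, which can take minutes to hours depending on the retrieval tier. A minimal sketch of that request (the key name and retention period are examples):
Python
# Ask S3 to make a temporary, downloadable copy of an archived object
s3.restore_object(
    Bucket='my-example-bucket',
    Key='old-object.txt',
    RestoreRequest={
        'Days': 7,                                    # keep the restored copy for 7 days
        'GlacierJobParameters': {'Tier': 'Standard'}  # Expedited, Standard, or Bulk
    }
)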
Advanced: Automating Tiering for Multiple Buckets
If your organization has many buckets or a lot of data, you can use Python scripts to read a list of buckets from a file (like CSV or YAML) and apply these policies to all of them in a loop. This way, you don’t have to repeat the same steps for each bucket manually, saving time and reducing errors. See AWS’s automation examples.
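AWS does not prescribe one way to do this, but a simple loop over a list of bucket names gets you most of the way. The sketch below assumes a hypothetical buckets.csv file with one bucket name per line and reuses the lifecycle policy idea from earlier:
Python
import csv
import boto3

s3 = boto3.client('s3')

# buckets.csv is a hypothetical file with one bucket name per line
with open('buckets.csv', newline='') as f:
    bucket_names = [row[0] for row in csv.reader(f) if row]

lifecycle_policy = {
    'Rules': [
        {
            'ID': 'ArchiveOldFiles',
            'Status': 'Enabled',
            'Filter': {'Prefix': ''},
            'Transitions': [
                {'Days': 30, 'StorageClass': 'STANDARD_IA'},
                {'Days': 90, 'StorageClass': 'GLACIER'}
            ]
        }
    ]
}

for name in bucket_names:
    try:
        s3.put_bucket_lifecycle_configuration(
            Bucket=name,
            LifecycleConfiguration=lifecycle_policy
        )
        print(f'Applied lifecycle policy to {name}')
    except Exception as exc:
        # keep going if one bucket fails, e.g. because of missing permissions
        print(f'Skipping {name}: {exc}')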
Experimental Results
Organizations that have automated S3 lifecycle policies and intelligent tiering have reported saving 30–60% on their storage costs for data that isn’t accessed often. This is because AWS automatically moves files to cheaper storage classes as they age or are less frequently used. However, it’s important to note that very small files (under 128KB) are not eligible for automatic tier migration. Read about cost savings with Intelligent-Tiering.
Best Practices and Recommendations
- Review your storage usage regularly. Make sure your policies are working and your data is where you expect it to be.
- Monitor costs and access patterns. Adjust your rules as your business needs change.
- Use tags and prefixes. This helps you apply different policies to different types of files (for example, keep important documents in fast storage, but archive old logs); a tag-based rule is sketched after this list.
- Test on a small scale first. Before rolling out automation to all your data, try it on a test bucket to make sure it works as expected.
Best practices for S3 lifecycle and tiering.
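Building on the “use tags and prefixes” tip above, lifecycle rules can also be filtered by object tag, so different kinds of files in the same bucket can follow different schedules. A minimal sketch assuming objects are tagged archive=true (the tag key, value, and day count are examples):
Python
tagged_policy = {
    'Rules': [
        {
            'ID': 'ArchiveByTag',
            'Status': 'Enabled',
            # only objects tagged archive=true are affected by this rule
            'Filter': {'Tag': {'Key': 'archive', 'Value': 'true'}},
            'Transitions': [
                {'Days': 60, 'StorageClass': 'GLACIER'}
            ]
        }
    ]
}

s3.put_bucket_lifecycle_configuration(
    Bucket='my-example-bucket',
    LifecycleConfiguration=tagged_policy
)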
Conclusion
Automating S3 storage tiering and lifecycle management with Python and Boto3 helps organizations save money, reduce manual work, and keep their data organized. Even if you’re not a programmer, understanding these concepts can help you make smarter decisions about your cloud storage. With the example scripts and AWS features shown here, you can ensure your data is always in the right place at the right price.