🚀 Executive Summary
TL;DR: Manual AWS EC2 snapshot management is prone to errors and costly. This guide provides a robust, cost-effective solution to automate EC2 snapshot creation using AWS Lambda and CloudWatch Events, ensuring critical data backup without manual overhead.
🎯 Key Takeaways
- A dedicated IAM role for the Lambda function must grant ec2:DescribeInstances, ec2:DescribeVolumes, ec2:CreateSnapshot, ec2:CreateTags, and logs:PutLogEvents permissions.
- The Python Lambda function, using boto3, iterates through running EC2 instances, identifies EBS volumes, and creates snapshots with descriptive tags for management.
- Amazon CloudWatch Events (EventBridge) are configured with a cron schedule (e.g., cron(0 2 * * ? *)) to trigger the Lambda function periodically, automating the snapshot process.
Automating AWS EC2 Snapshots with Lambda & CloudWatch Events
As a Senior DevOps Engineer at TechResolve, I understand the critical importance of data backup and disaster recovery. For many organizations, managing snapshots of AWS EC2 instances and their associated EBS volumes can be a tedious and error-prone manual task. Neglecting this can lead to significant data loss and operational disruption. While third-party solutions exist, they often come with a hefty price tag and may not offer the granular control you need.
This tutorial will guide SysAdmins, Developers, and DevOps Engineers through building a robust, cost-effective, and fully automated solution for EC2 snapshot management using AWS Lambda and CloudWatch Events. By leveraging AWS’s native services, you can ensure your critical data is regularly backed up without the manual overhead or the expense of external tools, freeing your team to focus on innovation rather than operational toil.
Prerequisites
Before we dive into the implementation, ensure you have the following:
- An AWS Account with administrative access or an IAM user with permissions to create/manage IAM Roles, Lambda Functions, EC2 resources, and CloudWatch Events.
- A basic understanding of AWS Identity and Access Management (IAM), AWS Lambda, Amazon EC2, and Amazon CloudWatch.
- The AWS Command Line Interface (CLI) installed and configured (optional but recommended for easier setup of IAM roles and event rules).
- Basic familiarity with Python 3.x.
Step-by-Step Guide
Step 1: Create an IAM Role for the Lambda Function
Our Lambda function will need specific permissions to interact with EC2 to create snapshots. We’ll create an IAM role that the Lambda function will assume when it executes.
First, define the trust policy, allowing Lambda to assume this role. Save this JSON as lambda-trust-policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Next, define the permissions policy. This policy grants the necessary actions for our Lambda function: describing EC2 instances and volumes, and creating snapshots. Save this JSON as ec2-snapshot-policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeVolumes",
        "ec2:CreateSnapshot",
        "ec2:CreateTags"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
The "Resource": "*" for EC2 actions is common for snapshot creation across all instances within a region. The logs permissions are crucial for the Lambda function to output logs to CloudWatch Logs.
Now, use the AWS CLI to create the role and attach the policy:
aws iam create-role --role-name LambdaEC2SnapshotRole --assume-role-policy-document file://lambda-trust-policy.json
aws iam put-role-policy --role-name LambdaEC2SnapshotRole --policy-name EC2SnapshotPolicy --policy-document file://ec2-snapshot-policy.json
Make a note of the Role ARN after creation; you’ll need it for the Lambda function configuration.
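If you did not copy the ARN from the create-role output, it can be looked up again later. The following is a minimal sketch, assuming boto3 is installed and your credentials allow iam:GetRole; the helper name get_role_arn is hypothetical and not part of the original setup.

```python
def get_role_arn(iam_client, role_name="LambdaEC2SnapshotRole"):
    """Return the ARN of an existing IAM role.

    iam_client is expected to be a boto3 IAM client, e.g. boto3.client("iam").
    """
    response = iam_client.get_role(RoleName=role_name)
    return response["Role"]["Arn"]


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   print(get_role_arn(boto3.client("iam")))
```

Passing the client in as an argument keeps the helper easy to exercise without touching a real AWS account.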
Step 2: Develop the Lambda Function
This Python function will iterate through all running EC2 instances in a specific region, identify their attached EBS volumes, and create snapshots. It will also add descriptive tags to the snapshots for easier identification and management.
Create a file named snapshot_lambda.py with the following code:
import datetime
import os

import boto3


def lambda_handler(event, context):
    # Lambda sets AWS_REGION automatically; the fallback only applies when running locally
    ec2 = boto3.client('ec2', region_name=os.environ.get('AWS_REGION', 'us-east-1'))

    # Define a custom tag key to identify instances for snapshot
    # Or simply snapshot all instances
    # INSTANCE_TAG_KEY = os.environ.get('INSTANCE_TAG_KEY', 'AutoSnapshot')
    # INSTANCE_TAG_VALUE = os.environ.get('INSTANCE_TAG_VALUE', 'True')

    # Take one timestamp so every snapshot from this run is labeled consistently
    now = datetime.datetime.now()

    try:
        # Describe running instances; the paginator follows NextToken across result pages
        # To filter by a tag, uncomment and modify the Filters below
        paginator = ec2.get_paginator('describe_instances')
        pages = paginator.paginate(
            Filters=[
                {'Name': 'instance-state-name', 'Values': ['running']},
                # {'Name': f'tag:{INSTANCE_TAG_KEY}', 'Values': [INSTANCE_TAG_VALUE]}
            ]
        )

        for page in pages:
            for reservation in page['Reservations']:
                for instance in reservation['Instances']:
                    instance_id = instance['InstanceId']
                    instance_name = 'No-Name'

                    # Try to find the instance name tag
                    for tag in instance.get('Tags', []):
                        if tag['Key'] == 'Name':
                            instance_name = tag['Value']
                            break

                    print(f"Processing instance: {instance_id} ({instance_name})")

                    for block_device_mapping in instance.get('BlockDeviceMappings', []):
                        # Skip mappings that are not EBS volumes
                        if 'Ebs' not in block_device_mapping:
                            continue

                        volume_id = block_device_mapping['Ebs']['VolumeId']
                        description = (
                            f"Automated snapshot of {volume_id} "
                            f"attached to {instance_id} ({instance_name}) "
                            f"created on {now.strftime('%Y-%m-%d %H:%M:%S')}."
                        )

                        print(f"Creating snapshot for volume: {volume_id}")
                        snapshot = ec2.create_snapshot(
                            VolumeId=volume_id,
                            Description=description,
                            TagSpecifications=[
                                {
                                    'ResourceType': 'snapshot',
                                    'Tags': [
                                        {'Key': 'CreatedBy', 'Value': 'Lambda'},
                                        {'Key': 'Automation', 'Value': 'EC2SnapshotTool'},
                                        {'Key': 'InstanceId', 'Value': instance_id},
                                        {'Key': 'InstanceName', 'Value': instance_name},
                                        {'Key': 'VolumeId', 'Value': volume_id},
                                        {'Key': 'Name', 'Value': f"{instance_name}-{volume_id}-snapshot-{now.strftime('%Y%m%d%H%M')}"}
                                    ]
                                }
                            ]
                        )
                        print(f"Snapshot created: {snapshot['SnapshotId']}")

    except Exception as e:
        print(f"Error creating snapshots: {e}")
        # Re-raise so the invocation is recorded as failed
        raise

    return {
        'statusCode': 200,
        'body': 'EC2 snapshots created successfully!'
    }
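The commented-out tag filter in the function above could also be built by a small helper, so that tag filtering is switched on through environment variables instead of code edits. This is a sketch under assumptions: the helper name build_instance_filters is hypothetical, and the INSTANCE_TAG_KEY / INSTANCE_TAG_VALUE variables follow the commented lines in the function but are not set anywhere in this guide.

```python
import os


def build_instance_filters(environ=os.environ):
    """Build the Filters list for describe_instances.

    Always restricts to running instances. Adds a tag filter only when
    INSTANCE_TAG_KEY is set (an assumed, optional environment variable).
    """
    filters = [{"Name": "instance-state-name", "Values": ["running"]}]

    tag_key = environ.get("INSTANCE_TAG_KEY")
    if tag_key:
        # Default value mirrors the commented example in the handler
        tag_value = environ.get("INSTANCE_TAG_VALUE", "True")
        filters.append({"Name": f"tag:{tag_key}", "Values": [tag_value]})

    return filters
```

The handler would then pass Filters=build_instance_filters() to the describe_instances call. With no variables set, the behavior is unchanged: every running instance is included.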
Now, zip the Python file and create the Lambda function. Replace the role ARN below with the ARN you obtained in Step 1.
zip snapshot_lambda.zip snapshot_lambda.py
aws lambda create-function \
--function-name EC2AutomatedSnapshotter \
--runtime python3.12 \
--handler snapshot_lambda.lambda_handler \
--zip-file fileb://snapshot_lambda.zip \
--role arn:aws:iam::123456789012:role/LambdaEC2SnapshotRole \
--timeout 300 \
--memory-size 128
Note: The zip file must exist before you run create-function, which is why the zip command comes first.
Do not pass AWS_REGION through --environment: it is a reserved Lambda environment variable, and create-function rejects it. Lambda sets AWS_REGION automatically to the region where the function is deployed, so the function snapshots instances in that region. To target a different operational region, deploy the function there, for example by adding --region us-east-1 to the command above.
Step 3: Configure CloudWatch Event Rule
To trigger our Lambda function on a schedule, we’ll use CloudWatch Events (now often referred to as Amazon EventBridge). We’ll set up a daily schedule.
First, create the CloudWatch Event Rule. This example sets a daily trigger at 02:00 UTC:
aws events put-rule \
--name DailyEC2SnapshotTrigger \
--schedule-expression "cron(0 2 * * ? *)" \
--state ENABLED
The cron expression cron(0 2 * * ? *) means at 02:00 AM UTC every day.
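EventBridge cron expressions have six fields (minutes, hours, day-of-month, month, day-of-week, year), which differs from the five-field Unix format. The sketch below splits an expression into named fields as a sanity check before calling put-rule. The function name parse_schedule is hypothetical, and the check covers only the field count and the rule that one of the two day fields must be "?"; it is not a full validator.

```python
FIELD_NAMES = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_schedule(expression):
    """Split an EventBridge cron(...) expression into its six named fields."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("expected an expression of the form cron(...)")

    fields = expression[len("cron("):-1].split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 6 fields, got {len(fields)}")

    parsed = dict(zip(FIELD_NAMES, fields))

    # Day-of-month and day-of-week cannot both be specified:
    # when one carries a value or *, the other must be "?"
    if "?" not in (parsed["day_of_month"], parsed["day_of_week"]):
        raise ValueError('one of day_of_month or day_of_week must be "?"')

    return parsed


print(parse_schedule("cron(0 2 * * ? *)"))
```

For the rule in this guide, the parsed result has minutes "0" and hours "2", with "?" in the day-of-week position.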
Next, add the Lambda function as a target for this rule:
aws events put-targets \
--rule DailyEC2SnapshotTrigger \
--targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:123456789012:function:EC2AutomatedSnapshotter"
Replace the Lambda ARN with your function’s ARN.
Finally, grant the CloudWatch Event Rule permission to invoke your Lambda function:
aws lambda add-permission \
--function-name EC2AutomatedSnapshotter \
--statement-id AllowCloudWatchToInvoke \
--action "lambda:InvokeFunction" \
--principal events.amazonaws.com \
--source-arn arn:aws:events:us-east-1:123456789012:rule/DailyEC2SnapshotTrigger
Again, ensure the region and account ID match your setup.
Step 4: Test and Monitor
After setting up, it’s crucial to test your Lambda function. You can manually invoke it from the AWS Lambda console by creating a test event (an empty JSON {} is sufficient). Monitor the CloudWatch Logs for your Lambda function to ensure it executes successfully and creates snapshots as expected.
Periodically check the EC2 Snapshots section in the AWS Management Console to verify that snapshots are being created with the correct tags.
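The console check can also be scripted. The following is a minimal sketch, assuming it runs from a workstation whose credentials allow ec2:DescribeSnapshots (the Lambda role from Step 1 does not include that action). The helper name list_automated_snapshots is hypothetical; the Automation tag and its EC2SnapshotTool value come from the Lambda code above.

```python
def list_automated_snapshots(ec2_client, tag_value="EC2SnapshotTool"):
    """Return (snapshot_id, volume_id, state) for snapshots created by the Lambda.

    ec2_client is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    paginator = ec2_client.get_paginator("describe_snapshots")
    pages = paginator.paginate(
        OwnerIds=["self"],  # only snapshots owned by this account
        Filters=[{"Name": "tag:Automation", "Values": [tag_value]}],
    )
    return [
        (snap["SnapshotId"], snap["VolumeId"], snap["State"])
        for page in pages
        for snap in page["Snapshots"]
    ]


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   for row in list_automated_snapshots(boto3.client("ec2", region_name="us-east-1")):
#       print(row)
```

A snapshot in the pending state is still copying data; it is safe to rely on only once its state is completed.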
Common Pitfalls
- IAM Permissions: The most frequent issue. Double-check that your Lambda’s IAM role has all the necessary ec2:DescribeInstances, ec2:DescribeVolumes, ec2:CreateSnapshot, and ec2:CreateTags permissions, as well as CloudWatch Logs permissions.
- Lambda Timeout: If you have a large number of EC2 instances and EBS volumes, the default 3-second Lambda timeout will not be enough. The provided setup uses 300 seconds (5 minutes), which should be sufficient for most cases, but adjust it if you encounter timeouts.
- Regionality: Ensure your Lambda function is deployed in the same AWS region where your EC2 instances reside, or modify the Lambda code to iterate across multiple regions if your instances are geographically dispersed. The provided code assumes a single region, read from the AWS_REGION environment variable, which Lambda sets to the region the function runs in.
- Missing Tags: If you uncommented the tag-based filtering in the Lambda code, ensure your EC2 instances are correctly tagged. Otherwise, the Lambda function might not find any instances to snapshot.
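For the timeout pitfall, one option is to stop starting new snapshots when the invocation is about to run out of time, so the function exits cleanly and logs what was skipped. The sketch below relies on get_remaining_time_in_millis(), which the Lambda context object provides; the function name snapshot_until_deadline, the create_snapshot callback, and the 15-second margin are assumptions for illustration.

```python
def snapshot_until_deadline(volume_ids, context, create_snapshot, safety_margin_ms=15_000):
    """Call create_snapshot(volume_id) for each volume while enough time remains.

    Returns (done, skipped) so the caller can log volumes that were not reached.
    """
    done, skipped = [], []
    for volume_id in volume_ids:
        # Stop issuing new requests once the remaining time drops to the margin
        if context.get_remaining_time_in_millis() <= safety_margin_ms:
            skipped.append(volume_id)
            continue
        create_snapshot(volume_id)
        done.append(volume_id)
    return done, skipped
```

Skipped volumes can be printed to CloudWatch Logs so that a recurring shortfall shows up as a signal to raise the timeout.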
Conclusion
Automating AWS EC2 snapshots with Lambda and CloudWatch Events provides a powerful, serverless, and cost-effective solution for ensuring your critical data is backed up reliably. By following this guide, you’ve established an infrastructure that eliminates manual backup chores, reduces human error, and gives you peace of mind regarding your data’s integrity.
This setup is just the beginning. Consider enhancing your automated snapshot strategy by implementing:
- Snapshot Retention Policy: Extend the Lambda function to delete old snapshots, preventing excessive storage costs.
- Multi-Region Support: Modify the Lambda to iterate through multiple AWS regions if your infrastructure spans across them.
- Notifications: Integrate with Amazon SNS to send notifications (e.g., email or Slack) upon successful snapshot completion or failure.
- Custom Tagging Strategies: Develop more sophisticated tagging to categorize and manage snapshots based on environment, application, or compliance requirements.
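As a starting point for the retention idea, the selection logic can be kept separate from the AWS calls. This is a sketch under assumptions: the function name select_expired_snapshots and the retention_days parameter are hypothetical, and the input is assumed to be the list of snapshot dictionaries returned by boto3's describe_snapshots, whose StartTime values are timezone-aware datetimes.

```python
import datetime


def select_expired_snapshots(snapshots, retention_days, now=None):
    """Return the IDs of snapshots started more than retention_days ago."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    cutoff = now - datetime.timedelta(days=retention_days)
    return [snap["SnapshotId"] for snap in snapshots if snap["StartTime"] < cutoff]


# Small local example with hand-written data instead of a real API response
example_now = datetime.datetime(2024, 1, 31, tzinfo=datetime.timezone.utc)
example = [
    {"SnapshotId": "snap-old", "StartTime": datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc)},
    {"SnapshotId": "snap-new", "StartTime": datetime.datetime(2024, 1, 30, tzinfo=datetime.timezone.utc)},
]
print(select_expired_snapshots(example, retention_days=7, now=example_now))
```

Each returned ID could then be passed to ec2.delete_snapshot(SnapshotId=...). That requires adding ec2:DescribeSnapshots and ec2:DeleteSnapshot to the role policy from Step 1, and filtering on the Automation tag first so that manually created snapshots are never touched.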
By continuously refining your automation, you can build an even more resilient and efficient cloud environment. Happy automating!
