Dmitry Romanoff


Automating EC2 Instance and Tag Management with AWS SDK and PostgreSQL

Introduction

Managing EC2 instances at scale across multiple AWS regions can become a daunting task. Whether you're keeping track of instance details, monitoring their status, or managing tags, the process can be time-consuming. In this article, we’ll explore how you can automate the collection of EC2 instance information and its tags across multiple AWS regions, and store this data in a PostgreSQL database for easy querying and reporting. We’ll leverage Python, the boto3 library for interacting with AWS, and psycopg2 for PostgreSQL integration.

Prerequisites

To follow along with this tutorial, you will need:

  • AWS credentials configured on your machine. You can set them up using aws configure.
  • Python 3.x installed on your machine along with the following libraries:
    • boto3: AWS SDK for Python
    • psycopg2: PostgreSQL adapter for Python

You can install the required libraries using the command below (if psycopg2 fails to build from source on your machine, the prebuilt psycopg2-binary package is a drop-in alternative for development):

pip install boto3 psycopg2

Additionally, you should have a PostgreSQL database running, either locally or in the cloud.

Step 1: Setting Up AWS and PostgreSQL Connections

We start by configuring the AWS EC2 client and PostgreSQL connection parameters. The following code sets up the database details as well as the function for fetching EC2 regions.

import boto3
import psycopg2
from psycopg2 import sql

# PostgreSQL connection details
DB_HOST = "..."
DB_PORT = "5432"  # default port for PostgreSQL
DB_NAME = "..."
DB_USER = "..."
DB_PASSWORD = "..."

# Function to get all EC2 regions
def get_all_regions():
    ec2_client = boto3.client('ec2')
    response = ec2_client.describe_regions()
    regions = [region['RegionName'] for region in response['Regions']]
    return regions

Here, get_all_regions() uses the boto3 EC2 client to fetch a list of all available regions. These will be used to gather EC2 data across all regions in the subsequent steps.
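For reference, describe_regions returns a structure shaped roughly like the sample below (entries and fields are illustrative, trimmed to what the script uses); the list comprehension simply pulls out the RegionName values. Note that by default only regions enabled for your account are returned:

```python
# Illustrative sample of a describe_regions response (hypothetical values)
sample_response = {
    'Regions': [
        {'RegionName': 'us-east-1', 'OptInStatus': 'opt-in-not-required'},
        {'RegionName': 'eu-west-1', 'OptInStatus': 'opt-in-not-required'},
    ]
}

# The same comprehension as in get_all_regions()
regions = [region['RegionName'] for region in sample_response['Regions']]
print(regions)  # ['us-east-1', 'eu-west-1']
```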

Step 2: Fetching EC2 Instances and Their Tags

Once we have the list of regions, we need to gather information about EC2 instances and their tags. For this, we define the get_ec2_instances_and_tags() function:

# Function to get EC2 instances and tags for a given region
def get_ec2_instances_and_tags(region):
    ec2_client = boto3.client('ec2', region_name=region)

    ec2_data = {}

    # Describe EC2 instances, paginating so large fleets aren't silently truncated
    instance_paginator = ec2_client.get_paginator('describe_instances')
    for page in instance_paginator.paginate():
        for reservation in page['Reservations']:
            for instance in reservation['Instances']:
                instance_id = instance['InstanceId']
                instance_name = None
                instance_type = instance['InstanceType']
                instance_state = instance['State']['Name']

                # Try to get the instance name from its tags
                for tag in instance.get('Tags', []):
                    if tag['Key'] == 'Name':
                        instance_name = tag['Value']
                        break

                ec2_data[instance_id] = {
                    'instance_id': instance_id,
                    'instance_name': instance_name,
                    'instance_type': instance_type,
                    'status': instance_state,
                    'region': region,  # already known from the function argument
                    'tags': {}
                }

    # Describe EC2 tags, restricted to instances, also with pagination
    tag_paginator = ec2_client.get_paginator('describe_tags')
    for page in tag_paginator.paginate(
            Filters=[{'Name': 'resource-type', 'Values': ['instance']}]):
        for tag in page['Tags']:
            instance_id = tag['ResourceId']
            if instance_id in ec2_data:
                ec2_data[instance_id]['tags'][tag['Key']] = tag['Value']

    return ec2_data

In this function:

  • We fetch the EC2 instances in a specific region and their associated tags.
  • The instance details (like ID, type, and status) are stored in a dictionary.
  • Tags are also gathered and added to the corresponding EC2 instance data.
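The tag-merge step can be seen in isolation with sample data: tags whose ResourceId does not match a known instance are simply skipped.

```python
# Illustrative sample: merging describe_tags results into ec2_data
ec2_data = {
    'i-0abc123': {'instance_id': 'i-0abc123', 'instance_name': None,
                  'instance_type': 't3.micro', 'status': 'running',
                  'region': 'us-east-1', 'tags': {}}
}
tags = [
    {'ResourceId': 'i-0abc123', 'Key': 'Name', 'Value': 'web-server'},
    {'ResourceId': 'i-0deadbeef', 'Key': 'Env', 'Value': 'dev'},  # unknown instance, skipped
]
for tag in tags:
    if tag['ResourceId'] in ec2_data:
        ec2_data[tag['ResourceId']]['tags'][tag['Key']] = tag['Value']

print(ec2_data['i-0abc123']['tags'])  # {'Name': 'web-server'}
```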

Step 3: Inserting EC2 Data into PostgreSQL

Next, we set up the insert_ec2_data_into_db() function that connects to the PostgreSQL database and inserts the EC2 data:

# Function to connect to PostgreSQL and insert data
def insert_ec2_data_into_db(ec2_data):
    # Connect to your PostgreSQL server
    conn = psycopg2.connect(
        dbname=DB_NAME,
        user=DB_USER,
        password=DB_PASSWORD,
        host=DB_HOST,
        port=DB_PORT
    )
    cur = conn.cursor()

    # Create tables
    cur.execute("""
    CREATE TABLE IF NOT EXISTS ec2_instances (
        instance_id VARCHAR(50) PRIMARY KEY,
        instance_name VARCHAR(100),
        instance_type VARCHAR(50),
        status VARCHAR(50),
        region VARCHAR(50)
    );
    """)

    cur.execute("""
    CREATE TABLE IF NOT EXISTS ec2_tags (
        tag_id SERIAL PRIMARY KEY,
        instance_id VARCHAR(50),
        tag_key VARCHAR(100),
        tag_value VARCHAR(255),
        FOREIGN KEY (instance_id) REFERENCES ec2_instances (instance_id) ON DELETE CASCADE
    );
    """)

    cur.execute("""
    TRUNCATE TABLE ec2_tags;
    TRUNCATE TABLE ec2_instances CASCADE;
    """)

    # Insert EC2 instances and tags
    for instance_id, instance_data in ec2_data.items():
        # Insert instance data
        cur.execute("""
        INSERT INTO ec2_instances (instance_id, instance_name, instance_type, status, region)
        VALUES (%s, %s, %s, %s, %s)
        ON CONFLICT (instance_id) DO NOTHING;
        """, (
            instance_id,
            instance_data['instance_name'],
            instance_data['instance_type'],
            instance_data['status'],
            instance_data['region']
        ))

        # Insert tags
        for tag_key, tag_value in instance_data['tags'].items():
            cur.execute("""
            INSERT INTO ec2_tags (instance_id, tag_key, tag_value)
            VALUES (%s, %s, %s);
            """, (instance_id, tag_key, tag_value))

    # Commit and close connection
    conn.commit()
    cur.close()
    conn.close()

Here, we:

  • Create tables (ec2_instances and ec2_tags) to store instance details and tags.
  • Use TRUNCATE to clear the existing data before inserting fresh EC2 data.
  • Insert EC2 instance details and their tags into the database.
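For larger fleets, inserting tags one row at a time means one round-trip per tag; psycopg2's execute_values helper (from psycopg2.extras) can batch them into a single statement. A sketch of that variation: the flattening step is pure Python and runnable below, while the insert call itself is commented out since it needs a live connection.

```python
# Flatten ec2_data into (instance_id, tag_key, tag_value) rows for a batch insert
ec2_data = {
    'i-0abc123': {'tags': {'Name': 'web-server', 'Env': 'prod'}},
    'i-0def456': {'tags': {'Env': 'dev'}},
}
tag_rows = [
    (instance_id, key, value)
    for instance_id, data in ec2_data.items()
    for key, value in sorted(data['tags'].items())
]
print(tag_rows)

# With a live cursor, the rows could then be inserted in one statement:
# from psycopg2.extras import execute_values
# execute_values(cur,
#     "INSERT INTO ec2_tags (instance_id, tag_key, tag_value) VALUES %s",
#     tag_rows)
```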

Step 4: Putting It All Together

Finally, we tie all these components together in the script's entry point:

# Main entry point
if __name__ == '__main__':
    regions = get_all_regions()
    ec2_data = {}

    # Iterate through each region and fetch EC2 instances and tags
    for region in regions:
        print(f"Fetching EC2 instances from region: {region}")
        region_data = get_ec2_instances_and_tags(region)
        ec2_data.update(region_data)  # Combine the data from each region

    # Insert the fetched EC2 data into the database
    insert_ec2_data_into_db(ec2_data)
    print("EC2 instances and tags have been inserted into the database.")
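One failing region (for example, a permissions issue) currently aborts the whole run. A hedged sketch of a more forgiving loop, where collect_all_regions and the stubbed fetcher are illustrative names rather than part of the article's script; in practice the fetcher would be get_ec2_instances_and_tags and the exceptions would be botocore ClientError subclasses:

```python
# Illustrative sketch: keep collecting even if one region fails
def collect_all_regions(regions, fetch_region):
    ec2_data = {}
    failed = []
    for region in regions:
        try:
            ec2_data.update(fetch_region(region))
        except Exception as exc:  # in real use, catch botocore ClientError
            failed.append((region, str(exc)))
    return ec2_data, failed

# Usage with a stubbed fetcher standing in for get_ec2_instances_and_tags:
def fake_fetch(region):
    if region == 'bad-region':
        raise RuntimeError('AuthFailure')
    return {f'i-{region}': {'region': region}}

data, failed = collect_all_regions(['us-east-1', 'bad-region'], fake_fetch)
print(sorted(data), failed)  # ['i-us-east-1'] [('bad-region', 'AuthFailure')]
```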

Step 5: Running Queries

Once your EC2 data is stored in PostgreSQL, you can run SQL queries to generate reports. Here’s a simple query that joins the EC2 instances and their tags:

SELECT t1.*, t2.tag_key, t2.tag_value
FROM ec2_instances t1
JOIN ec2_tags t2 ON t1.instance_id = t2.instance_id
ORDER BY t1.instance_id, t2.tag_key, t2.tag_value;

This query will give you a detailed report of all EC2 instances along with their associated tags.
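Another report worth having is a tag-hygiene check. The sketch below, written against the same schema, uses a LEFT JOIN to list instances that are missing a Name tag:

```sql
-- Instances with no 'Name' tag
SELECT i.instance_id, i.instance_type, i.region
FROM ec2_instances i
LEFT JOIN ec2_tags t
  ON t.instance_id = i.instance_id AND t.tag_key = 'Name'
WHERE t.tag_id IS NULL
ORDER BY i.region, i.instance_id;
```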

Conclusion

In this article, we demonstrated how to automate the retrieval of EC2 instances and tags across all AWS regions using Python and the boto3 library. We then stored this data in a PostgreSQL database and outlined how to run SQL queries for reporting purposes. This solution can be a game changer for managing EC2 instances at scale, especially when you need to report on or monitor large sets of resources across multiple regions.

By automating this process, you save time and ensure that your EC2 data is always up to date and easily accessible for any analysis or reporting needs.
