Fetch a bunch of AWS resource tags (without being throttled!)

AJ Kerrigan ・4 min read

TL;DR - The Resource Groups Tagging API can help you fetch resource tags in bulk, even if you don't use resource groups!

The Problem

You want to programmatically build a list of active AWS resources. For a service like EC2, you call DescribeInstances and get tags included in the response. Yay!

For other services (I'll use RDS here), you need to do this in two steps:

  1. Scan for active resources (DescribeDBInstances)
  2. Fetch tags for those resources

There are a couple different ways to handle that second step, and the simplest one will start to hurt as your resource count grows.

Fetching Tags - Simplest Way

The RDS API has a ListTagsForResource action. If you've got resources and need their tags, that's the most obvious way to get them. The wrinkle is that you can only fetch tags for one resource at a time.

So that's a bit frustrating, but with a bit of looping it all seems doable:

import boto3

rds = boto3.client('rds')

# Build a mapping of instance ARNs to details
instances = {
    instance['DBInstanceArn']: instance
    for instance in rds.describe_db_instances()['DBInstances']
}

# Add tag detail to each instance
for arn, instance in instances.items():
    instance['Tags'] = rds.list_tags_for_resource(ResourceName=arn).get('TagList')

This works fine for a handful of resources, but will eventually choke for a couple reasons:

  1. DescribeDBInstances can only return up to 100 records in a single call. Some of you likely noticed this oversight. Fortunately we can work around that by using boto3's excellent paginator support.

  2. If you needed to solve the pagination issue, you have over 100 resources. That means you're also bombarding the RDS API with over 100 ListTagsForResource calls in a short period of time. AWS doesn't currently publish the rate limits for the RDS API, but they will still bite you.

Avoiding API Rate Limits / Throttling

If you bump into API rate limits while trying to fetch resource tags, you'll probably have two thoughts:

  1. I'll be fine if I'm responsible about using backoff / retry logic.
  2. Still, I really wish I could pull tags for more than one resource at a time!
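For the first thought, here is a minimal sketch of what backoff / retry logic can look like. Every name in it (call_with_backoff, is_throttle, the default delays) is hypothetical and made up for illustration. boto3 clients also retry some errors on their own, and that behavior can be tuned through botocore's Config object, so treat this as a picture of the idea rather than something you necessarily need to hand-roll.

```python
import random
import time


def call_with_backoff(func, is_throttle, max_attempts=5, base_delay=0.5,
                      sleep=time.sleep):
    """Call func(), retrying with exponential backoff on throttling errors.

    All names here are hypothetical. is_throttle(exc) decides whether an
    exception looks like a rate-limit error and is worth retrying.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            # Give up on non-throttling errors, or once attempts run out
            if not is_throttle(exc) or attempt == max_attempts - 1:
                raise
            # Exponential backoff with full jitter: 0..base_delay * 2^attempt
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

With boto3, func could be a lambda wrapping a single list_tags_for_resource call, and is_throttle could inspect exc.response['Error']['Code'] on a botocore ClientError. Which error codes count as throttling for a given service is an assumption you would need to check yourself.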

The Resource Groups Tagging API

I have to be honest here, I completely ignored the Resource Groups Tagging API for a long time. I don't use Resource Groups much, and I wrongly assumed that something called the Resource Groups Tagging API would be aimed at managing tags for Resource Groups.

I still think that was a reasonable assumption...

But for our purpose here, let's look specifically at the GetResources action. As the docs point out, it:

Returns all the tagged or previously tagged resources that are located in the specified region for the AWS account.

Well how about that! That means code like this can help find all DB instance tags in a given account and region:

tagging = boto3.client('resourcegroupstaggingapi')
instance_tags = tagging.get_resources(ResourceTypeFilters=['rds:db'])

We still need to address paginated responses to handle more than 100 resources, but this is already a huge win:

  1. In the most extreme case, we've reduced our API call count by a factor of 100.
  2. This reduced number of calls doesn't even target the RDS API anymore.

That means we get the data we need more quickly, with less risk of error and less impact to others working in the same account and region.
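To make the "factor of 100" concrete, here's a rough back-of-the-envelope sketch. The helper name tag_call_counts is made up, and the numbers assume every GetResources page comes back full with 100 resources, which is the best case rather than a guarantee.

```python
import math


def tag_call_counts(resource_count, page_size=100):
    """Return (per_resource_calls, bulk_calls) needed to fetch tags.

    Assumes every bulk page comes back full with page_size resources.
    """
    # One ListTagsForResource call per resource
    per_resource = resource_count
    # One GetResources call per page of results
    bulk = math.ceil(resource_count / page_size)
    return per_resource, bulk


for count in (50, 100, 1000):
    per_resource, bulk = tag_call_counts(count)
    print(f'{count} resources: {per_resource} calls vs {bulk}')
```

At 1,000 instances that works out to 1,000 tag calls against the RDS API versus 10 calls against the tagging API.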

Putting It All Together

With all of this in mind, here's a breakdown of just one way to get a list of RDS instances and tags.

from itertools import chain

import boto3

rds = boto3.client('rds')
tagging = boto3.client('resourcegroupstaggingapi')

We'll use chain in a little bit to make working with lists of lists nicer.

# Build a mapping of instance ARNs to details
paginator = rds.get_paginator('describe_db_instances')
instances = {
    instance['DBInstanceArn']: instance
    for page in paginator.paginate()
    for instance in page['DBInstances']
}

This is a pagination-friendly version of the mapping we built earlier.

# Fetch tag data for all tagged (or previously tagged) RDS DB instances
paginator = tagging.get_paginator('get_resources')
tag_mappings = chain.from_iterable(
    page['ResourceTagMappingList']
    for page in paginator.paginate(ResourceTypeFilters=['rds:db'])
)

The get_resources() paginator gives us a collection of pages, and each page has a collection of results. chain.from_iterable() helps us treat this "list of lists" as a single collection.

Using rds:db as a resource type filter ensures that we only fetch tags for RDS DB instances, rather than bringing other RDS resources (like snapshots) along for the ride.
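If chain.from_iterable() is new to you, here's a tiny standalone sketch with made-up pages (the 'mapping-N' strings stand in for real result dicts). One thing worth noticing: the result is a lazy, single-pass iterator rather than a list, so it can only be looped over once.

```python
from itertools import chain

# Made-up pages shaped like GetResources responses
pages = [
    {'ResourceTagMappingList': ['mapping-1', 'mapping-2']},
    {'ResourceTagMappingList': ['mapping-3']},
]

# Lazily walks each page's list in order, yielding one item at a time
flat = chain.from_iterable(page['ResourceTagMappingList'] for page in pages)

first_pass = list(flat)   # ['mapping-1', 'mapping-2', 'mapping-3']
second_pass = list(flat)  # [] because the iterator is already exhausted
```

If you need to walk the results more than once, wrapping the chain in list() up front is one option, at the cost of holding everything in memory.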

# Add tag detail to each instance
for tag_mapping in tag_mappings:
    # Convert list of Key/Value pairs to dict for convenience
    tags = {tags['Key']: tags['Value'] for tags in tag_mapping['Tags']}

    instances[tag_mapping['ResourceARN']]['Tags'] = tags

Thanks to chain.from_iterable(), we can loop over tag_mappings as if it were a flat list.

Since having a tag dictionary is typically more useful than a list of Key/Value pairs, we may as well convert it.
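There are two edge cases the loop above doesn't cover. GetResources only returns tagged or previously tagged resources, so a never-tagged instance ends up with no 'Tags' key at all. And if the tag data ever mentions an ARN that the DescribeDBInstances scan didn't return (say, an instance created or deleted between the two calls, which is my assumption about how that could happen), the lookup raises a KeyError. Here's a slightly more defensive sketch. The merge_tags helper and the sample data are made up for illustration.

```python
def merge_tags(instances, tag_mappings):
    """Attach a Tags dict to each instance, in place.

    instances: {ARN: instance detail dict}
    tag_mappings: iterable of {'ResourceARN': ..., 'Tags': [...]} dicts
    """
    # Start every instance with an empty dict so never-tagged ones have a key
    for instance in instances.values():
        instance['Tags'] = {}

    for mapping in tag_mappings:
        instance = instances.get(mapping['ResourceARN'])
        if instance is None:
            # The tag data mentions an ARN the RDS scan did not return; skip it
            continue
        instance['Tags'] = {t['Key']: t['Value'] for t in mapping['Tags']}
    return instances


# Made-up sample data
sample_instances = {
    'arn:aws:rds:us-east-1:123456789012:db:tagged': {},
    'arn:aws:rds:us-east-1:123456789012:db:never-tagged': {},
}
sample_mappings = [
    {'ResourceARN': 'arn:aws:rds:us-east-1:123456789012:db:tagged',
     'Tags': [{'Key': 'env', 'Value': 'prod'}]},
    {'ResourceARN': 'arn:aws:rds:us-east-1:123456789012:db:unknown',
     'Tags': [{'Key': 'env', 'Value': 'dev'}]},
]
merged = merge_tags(sample_instances, sample_mappings)
```

Whether silently skipping unknown ARNs is the right call depends on your use case. Logging them may be more useful if you're auditing.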

Acknowledgements

Big shout out here to the Cloud Custodian project. Its centralized approach to tag-fetching was what helped me realize how mistaken I was to overlook the Resource Groups Tagging API for so long.

For AWS discussions that go beyond or outside the official documentation, I've found the Open Guide to AWS repo and Slack channel to be immensely useful.

Feedback

If you've gotten this far, thanks for reading! I like to chat about Python and/or AWS, so please say hi :).

Please fire away in the comments if you have suggestions to improve this post, or better ways to fetch tags at scale.

Discussion


Great write-up. However, the Resource Groups Tagging API still has a major flaw: it cannot give you tag data on provisioned resources that have never been tagged.

 

That's true. Out of curiosity, when has that been an issue for you? When I'm pulling tags, I'm typically also pulling other information from a service API and merging it with the tag data. There have been a couple exceptions though:

  1. When I've wanted a central, service-neutral way to report on untagged resources.
    This isn't bad to work around most of the time.

  2. When I wanted to reliably list all SQS queues in an account with more than 1,000 queues.
    This one is a bit more tedious since ListQueues doesn't support pagination. It's doable by listing queues with different name prefixes, just feels like an awkward edge case. If the Resource Groups Tagging API could reliably list all queues regardless of whether they had ever been tagged, it would have been a nice surprise.

I'm curious about other places where the Resource Groups Tagging API is almost a good fit.

 

One major issue comes up if you want to use tags to track billing. This requires some sort of tagging compliance system where untagged resources in an application are detected and remediated, so that organizations can have the most accurate data possible on the cost of their applications. If the Resource Groups Tagging API also returned provisioned resources that were never tagged, tagging governance would be a lot less complicated in general.