DEV Community

Markus Toivakka
Markus Toivakka

Posted on

Using Athena to query multi-account Cloudwatch Logs

The scenario. We have multiple workloads and environments deployed to multi-account organization in AWS Organization. Cloudwatch Logs is used to to store logs from various services within a scope of a single AWS account. EC2s are pushing system logs, Lambda functions pushing execution logs and so on.

In order to increase understanding on application logs, aggregating logs from separate AWS accounts to a single service or S3 bucket can be helpful. Depending on business, regulatory compliance may also oblige you to keep logs for certain amount of time. Handling long time log storage in a centralised account can help when implementing access control and retention schedules.

This post demonstrates how to implement centralised log storage to multi-account organization in AWS Organizations. Main goal is to create cloud infrastructure to:

  • Store application logs from multiple AWS accounts to a single S3 bucket.
  • Run ad-hoc queries agains data in S3 with Amazon Athena.

Let's get started!

Overview

Logging overview

The proposed solution uses Cloudwatch Logs for log ingestion on source accounts. Amazon Kinesis is used for log delivery to centralised storage and S3 for long term log data storage. Finally we will use Amazon Athena to run SQL queries against these logs.

Deployment

Cloudformation templates for the logging solution can be found here: https://github.com/markymarkus/cloudformation/tree/master/centralised-cloudwatch-logs

logging-account-template.yml provisions infrastructure for Log Storage Account. Template parameter OrganizationId is used on Cloudwatch Logs Destination access policy to allow log delivery from AWS Organizations member accounts.

member-account-template.yml demonstrates how to create Subscription Filter to Cloudwatch Logs. You can create this also on the same account where logging-account-template.yml is deployed.

Log Source Accounts

Source accounts have workloads producing and ingesting logs to Cloudwatch Logs. Each log group is storing logging events from a specific source. In our example:
/var/log/messages stores system logs from EC2 instances.
/aws/lambda/hello-world stores logs from Hello World Lambda function.
Having consistent naming conventions for log groups will come handy when we run Athena queries against the data. Queries like '/var/log/messages of every EC2 instance in every AWS account of Organization' are depended on consistent naming.

Logs are sent from a member account to a receiving centralised destination in Log Storage Account through a subscription filter.

Log Storage Account

Log Storage Account receives and prepares logging data for Athena and stores logs to S3 bucket. Logs are stored in one JSON record per line format. Supported by Athena and easy to export other services and tools.

Transforming log events

By default, Firehose writes JSON records in a stream to S3 bucket without separators or new lines. This would work if we would not be using Athena to run queries. Athena requires each JSON record to be represented in a separate line.
To split JSON records we are using Firehose transformation Lambda. Lambda function reads records batch and adds a new line character + "\n" after every record. Processed records are written back to Firehose stream which finally delivers logs to S3 bucket.

output = []
for record in event['records']:
    payload = base64.b64decode(record['data'])
    striodata = BytesIO(payload)
    with gzip.GzipFile(fileobj=striodata, mode='r') as f:
        payload = f.read().decode("utf-8")

    # Add newline to each record
    output_record = {
        'recordId': record['recordId'],
        'result': 'Ok',
        'data': base64.b64encode((payload + "\n").encode("utf-8"))
    }
    output.append(output_record)
Enter fullscreen mode Exit fullscreen mode

Running Athena queries

At this point we have application logs in S3 bucket in Log Storage account. Kinesis is a time based stream so each file contains logging from multiple Source AWS accounts and log groups:

Image description

To get Athena queries running, first create external table pointing to the data:

CREATE EXTERNAL TABLE logs (
    messageType string,
    owner string,
    logGroup string,
    subscriptionFilters string,
    logEvents array<struct<id:string,
              timestamp:string,
              message:string
              >>
  )           
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://BUCKET_NAME/logs/year=2022/month=05/day=05/' 
Enter fullscreen mode Exit fullscreen mode

logEvents are stored in array structure. In order to query Cloudwatch Logs fields within an array, you need to UNNEST logEvents. Here are couple of queries to get you started:

First query returns latest streamed logs from Source Accounts:

SELECT owner,loggroup,n.message FROM logs
CROSS JOIN UNNEST(logs.logevents) AS t (n)
LIMIT 20
######
1   111111111111    /var/log/messages   May 5 03:07:48 ip-10-0-24-56 dhclient[2085]: XMT: Solicit on eth0, interval 119340ms.
2   222222222222    /aws/lambda/hello-world-lambda  START RequestId: fcc9e873-d4c1-4ca3-a7de-fd5490300740 Version: $LATEST
3   222222222222    /aws/lambda/hello-world-lambda  [DEBUG] 2022-05-05T03:07:50.972Z fcc9e873-d4c1-4ca3-a7de-fd5490300740 {'version': '0', 'id': ....
4   222222222222    /aws/lambda/hello-world-lambda  [DEBUG] 2022-05-05T03:07:50.973Z fcc9e873-d4c1-4ca3-a7de-fd5490300740 Hello World!
5   111111111111    /aws/lambda/lambda-writer-LambdaFunction    START RequestId: 38fb9b6b-dbea-4875-91cc-cf1dd5b36ab9 Version: $LATEST
6   111111111111    /aws/lambda/lambda-writer-LambdaFunction    [DEBUG] 2022-05-05T03:08:16.81Z 38fb9b6b-dbea-4875-91cc-cf1dd5b36ab9
7   111111111111    /var/log/messages   May 5 11:40:01 ip-10-0-24-56 systemd: Created slice User Slice of root.
8   111111111111    /var/log/messages   May 5 11:40:01 ip-10-0-24-56 systemd: Started Session 169 of user root. 

Enter fullscreen mode Exit fullscreen mode

Next SQL query return log event count for each logging source(Log group):


SELECT owner,loggroup,count(*) FROM logs
CROSS JOIN UNNEST(logs.logevents) AS t (n)
GROUP BY owner,loggroup
###
1   111111111111    /var/log/secure 429
2   111111111111    /var/log/messages   1670
3   222222222222    /aws/lambda/hello-world-lambda  5764
4   111111111111    /aws/lambda/lambda-writer-LambdaFunction    7198
...
Enter fullscreen mode Exit fullscreen mode

Conclusion

If you are in process of building and planning a logging strategy, this solution can be a good starting point. You can collect Cloudwatch Logs from multiple accounts and regions to a single S3 bucket. Athena queries can be executed against the consolidated logging data. I encourage you to experiment with SQL queries to the logging data. Analysing source specific logging patterns and event amounts may help you to improve an overall log management process.

Thanks for reading.

Top comments (1)

Collapse
 
mmuller88 profile image
Martin Muller

That is super cool :)