DEV Community

How to track Cognito Identity Pool IAM roles API calls via Athena

About CloudTrail

AWS CloudTrail is a service that records AWS API calls and events for an AWS account. It can deliver these logs to an S3 bucket as gzip-compressed JSON files (*.json.gz).

About Athena

Amazon Athena is an interactive query service for large-scale data analysis. It lets users run SQL queries directly on data stored in S3 buckets quickly, flexibly, and cost-effectively.

How these services work together

Once CloudTrail delivers its logs to S3, Athena can query them with SQL, which lets us monitor and audit the activity carried out in an AWS account.

The purpose of this article is to demonstrate Athena SQL queries that:

  • Check whether the Cognito Identity Pool is being used
  • Check which AWS API calls are being made with the temporary credentials of the Cognito IAM roles

Steps

  1. Create a trail.
  2. Create an Athena database and configure the query result location.
  3. Run an Athena query to create the table.
  4. Run an Athena query to obtain the trail logs.
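
The database in step 2 can be created from the Athena query editor. A minimal sketch, assuming a database named cloudtrail_audit (a hypothetical name, pick your own). The query result location is not set in SQL; it is configured in the Athena settings or in the workgroup.

```sql
-- Hypothetical database name; replace it with your own.
CREATE DATABASE IF NOT EXISTS cloudtrail_audit;
```

Selecting this database in the query editor before running the later queries keeps the table and the queries in the same place.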

Athena Queries

To track Cognito roles API calls, we will use the table creation query from the official AWS documentation with an update.

To query Cognito role events, the CloudTrail table schema must define the webIdFederationData column as follows:

webIdFederationData: STRUCT<
                federatedProvider: STRING,
                attributes: map<string,string>>

Example Athena query to create the table:

CREATE EXTERNAL TABLE cloudtrail_logs_pp(
    eventVersion STRING,
    userIdentity STRUCT<
        type: STRING,
        principalId: STRING,
        arn: STRING,
        accountId: STRING,
        invokedBy: STRING,
        accessKeyId: STRING,
        userName: STRING,
        sessionContext: STRUCT<
            attributes: STRUCT<
                mfaAuthenticated: STRING,
                creationDate: STRING>,
            sessionIssuer: STRUCT<
                type: STRING,
                principalId: STRING,
                arn: STRING,
                accountId: STRING,
                userName: STRING>,
            ec2RoleDelivery:string,
            webIdFederationData: STRUCT<
                federatedProvider: STRING,
                attributes: map<string,string>>
        >
    >,
    eventTime STRING,
    eventSource STRING,
    eventName STRING,
    awsRegion STRING,
    sourceIpAddress STRING,
    userAgent STRING,
    errorCode STRING,
    errorMessage STRING,
    requestparameters STRING,
    responseelements STRING,
    additionaleventdata STRING,
    requestId STRING,
    eventId STRING,
    readOnly STRING,
    resources ARRAY<STRUCT<
        arn: STRING,
        accountId: STRING,
        type: STRING>>,
    eventType STRING,
    apiVersion STRING,
    recipientAccountId STRING,
    serviceEventDetails STRING,
    sharedEventID STRING,
    vpcendpointid STRING,
    eventCategory STRING,
    tlsDetails struct<
        tlsVersion:string,
        cipherSuite:string,
        clientProvidedHostHeader:string>
  )
PARTITIONED BY (
   `timestamp` string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://bucket-trail/AWSLogs/account-id/CloudTrail/aws-region'
TBLPROPERTIES (
  'projection.enabled'='true', 
  'projection.timestamp.format'='yyyy/MM/dd', 
  'projection.timestamp.interval'='1', 
  'projection.timestamp.interval.unit'='DAYS', 
  'projection.timestamp.range'='2020/01/01,NOW', 
  'projection.timestamp.type'='date', 
  'storage.location.template'='s3://bucket-trail/AWSLogs/account-id/CloudTrail/aws-region/${timestamp}')

Values that should be overridden for your environment:

  • LOCATION
    • bucket-trail: Name of the S3 bucket where the trail logs are saved.
    • account-id: AWS account ID (for example, 012345678912).
    • aws-region: AWS region (for example, us-east-1).
  • projection.timestamp.range
    • '2020/01/01,NOW': Date range of the trail logs to query (for example, '2024/02/01,NOW').
  • storage.location.template
    • bucket-trail: Name of the S3 bucket where the trail logs are saved.
    • account-id: AWS account ID (for example, 012345678912).
    • aws-region: AWS region (for example, us-east-1).
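
After the table is created, a quick check can confirm that partition projection resolves to real log files and that the webIdFederationData struct parses. This is a sketch that assumes the column was created with the STRUCT definition given earlier. The date is a placeholder and should be a day inside your projection range that has logs.

```sql
-- Sanity check: read one day's partition and pull the nested federation field.
-- '2024/02/01' is a placeholder date in the yyyy/MM/dd partition format.
SELECT
  eventtime,
  eventname,
  useridentity.sessioncontext.webidfederationdata.federatedprovider AS federated_provider
FROM cloudtrail_logs_pp
WHERE "timestamp" = '2024/02/01'
  AND useridentity.sessioncontext.webidfederationdata.federatedprovider IS NOT NULL
LIMIT 10
```

For sessions issued through a Cognito Identity Pool, the provider value is expected to be cognito-identity.amazonaws.com. An empty result may only mean that no federated calls were logged on that day.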

Run an Athena query to obtain the trail logs

SELECT
 useridentity.arn,
 eventname,
 sourceipaddress,
 eventtime
FROM cloudtrail_logs_pp
WHERE userIdentity.sessionContext.sessionIssuer.arn = 'arn:aws:iam::123456789012:role/ROLE-UNAUTHENTICATED'
LIMIT 100

Values that should be overridden for your environment:

  • WHERE userIdentity.sessionContext.sessionIssuer.arn = 'arn:aws:iam::123456789012:role/ROLE-UNAUTHENTICATED'
    • 'arn:aws:iam::123456789012:role/ROLE-UNAUTHENTICATED': ARN of the Cognito Identity Pool IAM role.
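
The two goals stated at the start of the article can be sketched as aggregate queries over the same table. Both assume the table name cloudtrail_logs_pp from the creation query and reuse the example role ARN. They also add a filter on the timestamp partition column (the start date is a placeholder) so that Athena scans only the days of interest.

```sql
-- 1) Is the Identity Pool being used? Count calls per day made with the role.
SELECT
  "timestamp" AS log_day,
  count(*) AS api_calls
FROM cloudtrail_logs_pp
WHERE useridentity.sessioncontext.sessionissuer.arn = 'arn:aws:iam::123456789012:role/ROLE-UNAUTHENTICATED'
  AND "timestamp" >= '2024/02/01'
GROUP BY "timestamp"
ORDER BY log_day;

-- 2) Which API calls are made with the role's temporary credentials?
SELECT
  eventsource,
  eventname,
  count(*) AS api_calls
FROM cloudtrail_logs_pp
WHERE useridentity.sessioncontext.sessionissuer.arn = 'arn:aws:iam::123456789012:role/ROLE-UNAUTHENTICATED'
  AND "timestamp" >= '2024/02/01'
GROUP BY eventsource, eventname
ORDER BY api_calls DESC;
```

Athena runs one statement per query, so run the two statements separately. Only events that the trail is configured to record will appear, so an empty result for the first query suggests, but does not prove, that the role is unused in that period.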
