An earlier post described how to get DynamoDB data into a form where changes in the data can be analyzed, and how to do that manually. This post describes how to (semi-)automate the export of the DynamoDB table data and how to analyze it with Athena.
One approach uses a Lambda function, the other uses Step Functions. Both approaches implement the same steps: trigger the export to an S3 bucket, create an Athena table for the exported data, and prepare a named query for the analysis.
The data for this example looks like this.
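For illustration, an item in this table has roughly the following shape, written here as a TypeScript literal in DynamoDB JSON notation (the values are made up):

// hypothetical item with the same shape as the example data
const exampleItem = {
  pk: { S: '1' },
  person: {
    M: {
      firstname: { S: 'Jane' },
      lastname: { S: 'Doe' },
      gender: { S: 'female' },
      jobArea: { S: 'Accounting' },
      jobType: { S: 'Manager' },
      jobDescriptor: { S: 'Senior' },
    },
  },
};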
With Lambda
This Lambda function triggers the export via the SDK and creates or updates a named query.
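A minimal sketch of such a handler, assuming environment variables for the table, bucket, and Athena database (the actual Lambda in the repo also deletes the old Athena table, which is omitted here):

import { DynamoDBClient, ExportTableToPointInTimeCommand } from '@aws-sdk/client-dynamodb';
import { AthenaClient, CreateNamedQueryCommand } from '@aws-sdk/client-athena';

const ddb = new DynamoDBClient({});
const athena = new AthenaClient({});

// the CREATE EXTERNAL TABLE statement shown below, with 's3Location' as a placeholder
const createTableQuery = "CREATE EXTERNAL TABLE ddb_exported_table (...) LOCATION 's3Location' ...";

export const handler = async (): Promise<void> => {
  // 1. trigger the export of the table to S3
  const exportResult = await ddb.send(
    new ExportTableToPointInTimeCommand({
      TableArn: process.env.TABLE_ARN!,
      S3Bucket: process.env.EXPORT_BUCKET!,
      S3Prefix: 'ddb-exports',
      ExportFormat: 'DYNAMODB_JSON',
    }),
  );

  // 2. derive the data location of this export from the export ARN
  const exportId = exportResult.ExportDescription?.ExportArn?.split('/').pop();
  const s3Location = `s3://${process.env.EXPORT_BUCKET}/ddb-exports/AWSDynamoDB/${exportId}/data/`;

  // 3. store a named query that creates the Athena table on top of that location
  await athena.send(
    new CreateNamedQueryCommand({
      Name: 'ddb-export-create-table',
      Database: process.env.ATHENA_DATABASE ?? 'default',
      QueryString: createTableQuery.replace("'s3Location'", `'${s3Location}'`),
    }),
  );
};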
The query creates the Athena table. The export id is set by the Lambda function by replacing the “s3Location” placeholder with something like s3://<<bucket name>>/ddb-exports/AWSDynamoDB/<<ddb-export-id>>/data/.
CREATE EXTERNAL TABLE ddb_exported_table (
Item struct<pk:struct<S:string>,
person:struct<M:struct<
jobArea:struct<S:string>,
firstname:struct<S:string>,
gender:struct<S:string>,
jobType:struct<S:string>,
jobDescriptor:struct<S:string>,
lastname:struct<S:string>
>>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3Location'
TBLPROPERTIES ( 'has_encrypted_data'='true');
SELECT
item.pk.S as pk,
item.person.M.firstname.S as firstname,
item.person.M.lastname.S as lastname,
item.person.M.jobArea.S as jobArea,
item.person.M.gender.S as gender,
item.person.M.jobType.S as jobType,
item.person.M.jobDescriptor.S as jobDescriptor
FROM "db_name"."table_name";
After you have started the Lambda function you have to wait until the export is finished. Then you can run the query that creates the Athena table (the Lambda has already deleted the old table). After that you can use the prepared query for the analysis.
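To check from code whether the export is finished, a small helper like the following could be used (an assumption for illustration, not part of the repo):

import { DynamoDBClient, DescribeExportCommand } from '@aws-sdk/client-dynamodb';

const ddb = new DynamoDBClient({});

// returns true once DynamoDB reports the export as COMPLETED (other states: IN_PROGRESS, FAILED)
export const isExportFinished = async (exportArn: string): Promise<boolean> => {
  const { ExportDescription } = await ddb.send(new DescribeExportCommand({ ExportArn: exportArn }));
  return ExportDescription?.ExportStatus === 'COMPLETED';
};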
A more orchestrated approach uses Step Functions. That's better suited for waiting for the results :)
With Step Functions
These are the steps that are orchestrated by the state machine.
It's defined here.
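As an orientation, a state machine along these lines could be sketched with CDK roughly as follows (construct names, IAM resources, and the createTableQuery input field are assumptions; the linked definition in the repo is the source of truth):

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

export class DdbExportSfnStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const table = new dynamodb.Table(this, 'Table', {
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      pointInTimeRecovery: true, // required for exports
    });
    const bucket = new s3.Bucket(this, 'ExportBucket');

    // 1. trigger the export
    const startExport = new tasks.CallAwsService(this, 'ExportToS3', {
      service: 'dynamodb',
      action: 'exportTableToPointInTime',
      parameters: {
        TableArn: table.tableArn,
        S3Bucket: bucket.bucketName,
        S3Prefix: 'ddb-exports',
      },
      iamResources: [table.tableArn],
      resultPath: '$.export',
    });

    // 2. wait and poll until the export is done
    const waitForExport = new sfn.Wait(this, 'WaitForExport', {
      time: sfn.WaitTime.duration(cdk.Duration.minutes(1)),
    });
    const describeExport = new tasks.CallAwsService(this, 'DescribeExport', {
      service: 'dynamodb',
      action: 'describeExport',
      parameters: {
        ExportArn: sfn.JsonPath.stringAt('$.export.ExportDescription.ExportArn'),
      },
      iamResources: [`${table.tableArn}/export/*`],
      resultPath: '$.export',
    });

    // 3. create the new Athena table via a query taken from the state machine input
    const createTable = new tasks.AthenaStartQueryExecution(this, 'CreateAthenaTable', {
      queryString: sfn.JsonPath.stringAt('$.createTableQuery'),
      resultConfiguration: {
        outputLocation: { bucketName: bucket.bucketName, objectKey: 'athena-results' },
      },
    });

    const exportFinished = new sfn.Choice(this, 'ExportFinished?')
      .when(
        sfn.Condition.stringEquals('$.export.ExportDescription.ExportStatus', 'COMPLETED'),
        createTable,
      )
      .otherwise(waitForExport);

    new sfn.StateMachine(this, 'DdbExportStateMachine', {
      definition: startExport.next(waitForExport).next(describeExport).next(exportFinished),
    });
  }
}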
The state machine can be started with the default input values.
It takes a few minutes to complete.
The “Recent queries” section in Athena lists the queries for dropping the old table and creating the new one.
After it's finished you can choose the saved query with the name sfn-ddb-export-read-table. It can be used to query all the data from the DynamoDB table and can be adapted for more “complex” queries.
Code
JohannesKonings / examples-aws-dynamodb-analytics-cdk
Example of how to analyze DynamoDB data with Athena in different ways, created with AWS CDK
diff over all: npm run cdk -- diff
deploy over all: npm run cdk -- deploy
scenarios
ddb-kinesis-athena: DynamoDb -> DynamoDb Streams -> Kinesis Data Streams -> Data Firehose -> S3 -> Glue Crawler -> Athena
diff: npm run cdk:ddb-kinesis-athena -- diff
deploy: npm run cdk:ddb-kinesis-athena -- deploy
ddb-pipe-athena: DynamoDb -> EventBridge Pipe (DynamoDb Streams -> Data Firehose) -> S3 -> Glue Crawler -> Athena
diff: npm run cdk:ddb-pipe-athena -- diff
deploy: npm run cdk:ddb-pipe-athena -- deploy
archive
deploy options
The config file controls the deployment options.
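Inferred from the snippets below, the relevant part of the config type could look roughly like this (the actual type in the repo may contain more fields):

// assumed shape, derived from the config examples in this post
export type KinesisFormat = 'JSON' | 'PARQUET';

export interface Config {
  kinesisFormat: KinesisFormat;
  isQuicksight: boolean;
}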
Firehose
The formats JSON and Parquet can be chosen:
export const config: Config = {
...
kinesisFormat: 'JSON',
...
}
export const config: Config = {
...
kinesisFormat: 'PARQUET',
...
}
Quicksight
export const config: Config = {
...
isQuicksight: true,
...
}
cd cdk
QUICKSIGHT_USERNAME=<<Quicksight user name>> npx cdk deploy
description
see https://github.com/JohannesKonings/test-aws-dynamodb-athena-cdk