
AWS CDK and Amazon DynamoDB global tables

Amazon DynamoDB is THE database for serverless AWS applications. Its HTTP-based connection model makes integrating with serverless computing services like AWS Lambda easy. One of the additional capabilities of Amazon DynamoDB is global tables, which allow you to run replicas of a table in other AWS regions.

In this blog post, I will touch on Amazon DynamoDB global tables in the context of the AWS CDK framework – there are a couple of nuances with huge implications you should be aware of.

Let us dive in.

Traditional way of creating global tables

There are two ways I'm aware of to create global tables using AWS CDK, with one of them being, in my humble opinion, much better for the maintainability of your application.

Let us start with the "traditional" way of creating the Amazon DynamoDB global tables via the AWS CDK – by using the aws_dynamodb.Table construct and specifying the replicationRegions property.

const globalTable = new dynamodb.Table(this, "Table", {
  partitionKey: { name: "id", type: dynamodb.AttributeType.STRING },
  replicationRegions: ["us-east-1", "us-east-2", "us-west-2"],
  billingMode: dynamodb.BillingMode.PROVISIONED
});

Here I've defined a table with three replication regions. If you deploy this code snippet and then open the Amazon DynamoDB web console, you will see that the table has three replicas – one for each region.

On the surface, all looks good and works as expected. But as soon as you peek behind the curtain of the construct, you will see something unexpected – deploying the "Table" created an AWS CloudFormation custom resource alongside the DynamoDB table resource.

The problem with the custom resource

You can find the code of the custom resource in question here.

The custom resource is responsible for creating the replica tables. It uses the AWS SDK updateTable API and provides the appropriate value for the ReplicaUpdates property. On the surface, that does not sound bad, but by creating the replicas via an AWS SDK call, we effectively detach those resources from your AWS CloudFormation template and your AWS CDK code.

Simplified diagram

The implications of this architecture are enormous from the maintainability perspective. Here are the two cases of confusion and toil I had to deal with multiple times when operating global tables deployed via the replicationRegions property.

The DeletionPolicy attribute

If you are running a production workload, it is considered good practice (I would argue a must-have) to specify the DeletionPolicy (in AWS CDK, the name is removalPolicy) of RETAIN for resources that are stateful and hold production data.

With the DeletionPolicy set to RETAIN, CloudFormation will not delete the resource when you delete the stack. As you can imagine, this can save you from a disaster.

Usually, in AWS CDK, this is done by using the applyRemovalPolicy method on a given construct.

const globalTable = new cdk.aws_dynamodb.Table(this, "Table", {
  // ...
});
globalTable.applyRemovalPolicy(cdk.RemovalPolicy.RETAIN);

The problem is that replicas created via the custom resource will not inherit that removalPolicy. This setting will only apply to the "root" table.

So, if I were to delete the CloudFormation stack, CloudFormation would preserve the "root" table but delete the custom resource. Since the custom resource is triggered with a Delete event, it would issue the Delete action for all the replica tables. And with that, all of your replica tables are gone.

To make sure the Delete event never reaches the custom resource the AWS CDK uses, you have to set the DeletionPolicy on the custom resource itself. That is not so easy, as you have to "find" the custom resource node in the AWS CDK construct tree and override its DeletionPolicy attribute. Here is how I would do it.

const globalTable = new cdk.aws_dynamodb.Table(this, id, {
  // ...
});

/**
 * This only applies to the "root table" and NOT THE REPLICAS!
 */
globalTable.applyRemovalPolicy(removalPolicy);

/**
 * Make sure we do not remove the custom resource
 */
const customReplicaResource = globalTable.node.children.find(child => {
  return (child as any).resource?.cfnResourceType === "Custom::DynamoDBReplica";
}) as cdk.CfnResource;

customReplicaResource.applyRemovalPolicy(removalPolicy);

Enabling point-in-time recovery on the replica tables

For production usage, enabling point-in-time recovery (PITR) is arguably a must. It is not only helpful in data-corruption scenarios but also for disaster recovery. In CDK, applying the PITR on a global table (created via the replicationRegions property) might initially seem easy.

const globalTable = new cdk.aws_dynamodb.Table(this, id, {
  pointInTimeRecovery: true
  // ...
});

The problem is, similarly to the DeletionPolicy we have looked at earlier, that the pointInTimeRecovery property only applies to the "root" table. It will NOT be "forwarded" to the replica tables. An auditing tool will catch that for you in the best-case scenario. In the worst, you will not be able to recover your replica tables if something goes wrong.

All hope is not lost, though. You could also use a workaround to ensure that the replica tables have the same PITR setting as your "root" table. This involves calling updateContinuousBackups yourself via the AwsCustomResource construct for each region where the replica resides.

const replicationRegions = ["eu-central-1", "eu-west-1"];

const globalTable = new cdk.aws_dynamodb.Table(this, id, {
  pointInTimeRecovery: true,
  replicationRegions
  // ...
});

for (const replicationRegion of replicationRegions) {
  // Use a region-specific construct id – reusing the same `id` for every
  // iteration would cause a construct id collision.
  new cdk.custom_resources.AwsCustomResource(this, `${id}-PITR-${replicationRegion}`, {
    onUpdate: {
      service: "DynamoDB",
      action: "updateContinuousBackups",
      parameters: {
        TableName: globalTable.tableName,
        PointInTimeRecoverySpecification: {
          PointInTimeRecoveryEnabled: true
        }
      },
      region: replicationRegion,
      physicalResourceId: cdk.custom_resources.PhysicalResourceId.of(
        `${id}-PITR-${replicationRegion}`
      )
    },
    policy: cdk.custom_resources.AwsCustomResourcePolicy.fromSdkCalls({
      // The replica lives in `replicationRegion`, so build its ARN instead of
      // using `globalTable.tableArn` (which points at the "root" region).
      resources: [
        cdk.Stack.of(this).formatArn({
          service: "dynamodb",
          resource: "table",
          region: replicationRegion,
          resourceName: globalTable.tableName,
          arnFormat: cdk.ArnFormat.SLASH_RESOURCE_NAME
        })
      ]
    })
  });
}

A different way of creating global tables

Instead of using the aws_dynamodb.Table construct, consider using the aws_dynamodb.CfnGlobalTable construct. To the best of my knowledge, this resource was made available to us in May 2021, and I would highly encourage you to make it your default, even if you do not anticipate using replica tables in your architecture in the immediate future.

If you want to dive deeper into technical details about the resource to which CfnGlobalTable corresponds, look no further than this excellent blog post.

I advocate for the CfnGlobalTable usage because the gotchas mentioned in the previous section do not apply to this construct. For example, to specify the DeletionPolicy, which would apply to all tables, even the replicas, all you need to do is to use the applyRemovalPolicy method available on the construct.

const globalTable = new cdk.aws_dynamodb.CfnGlobalTable(this, id, {
  // ...
});
globalTable.applyRemovalPolicy(cdk.RemovalPolicy.RETAIN);

What about PITR? To enable it on all the tables (the replicas and the "root" table), you do not need to fiddle with AWS SDK calls through a custom resource (or some other workaround). You specify the PITR setting per replica – for the "root" table and the replica tables alike.

const globalTable = new cdk.aws_dynamodb.CfnGlobalTable(this, id, {
  // ...
  replicas: [
    {
      region: "eu-central-1",
      pointInTimeRecoverySpecification: {
        pointInTimeRecoveryEnabled: true
      }
    },
    // Let us say the "root" table lives in "eu-west-1"
    {
      region: "eu-west-1",
      pointInTimeRecoverySpecification: {
        pointInTimeRecoveryEnabled: true
      }
    }
  ]
});

Another significant reason for using the CfnGlobalTable is that this resource uses a newer version of Amazon DynamoDB global tables. You can read about different versions of global tables here.

Migration

I'm not yet ready to publish a migration story here (I'm still evaluating different avenues to do so), so please keep an eye on my next article. If you want to perform migration now, I would base the steps necessary on this and this article.

Closing words

I hope that the information in this article will save you some of the toil and frustration I had to go through while working with global tables created via the replicationRegions property in the AWS CDK.

As I mentioned, the CfnGlobalTable is objectively better suited for the job. Yes, you might lose some of the niceties of the higher-level L2 CDK construct, but using the newer version of the Amazon DynamoDB global tables will pay dividends in the long run.

For more similar content, please consider following me on Twitter – @wm_matuszewski.

Thank you for your time.
