Michael O'Brien

Posted on Mar 12, 2021 • Originally published at sensedeep.com

DynamoDB Sparse Indexes with Transparent Follow

#dynamodb #serverless #aws #nosql

DynamoDB provides secondary indexes for fast access via alternate partition and sort keys.

You can create up to 20 global secondary indexes (the default quota) and 5 local secondary indexes on a DynamoDB table to support your access patterns. With single-table design, however, you typically need only a couple of indexes to support a wide variety of access patterns.

By using sparse secondary indexes, you can minimize your DynamoDB storage costs and not compromise on having fast, efficient queries. You can also improve your ability to evolve your data designs in the future.

Using DynamoDB OneTable makes retrieving data from sparse, keys-only secondary indexes easy and cost effective. It provides a transparent follow option that retrieves complete data items via sparse, keys-only indexes.

Secondary Indexes

A DynamoDB secondary index contains a subset of the data from the primary index. Unlike indexes in relational databases, DynamoDB actually replicates the data into the secondary index, so if you replicate all the attributes of all items into a secondary index, you will double your storage costs. If you have two secondary indexes and you replicate all the attributes into both, you will triple your storage costs. Depending on your application, this may be a significant cost issue.

Data retrieved via a global secondary index is eventually consistent with the primary index; strongly consistent reads are available only on local secondary indexes. Further, a secondary index cannot be used to update or delete data items: writes always go through the table's primary key. Yet despite these limitations, secondary indexes are essential to supporting the required access patterns in most applications.

Sparse Indexes

DynamoDB will only copy an item from the primary index into a secondary index if the index's key attributes (its partition key and, if defined, its sort key) are present in the item. If any of those attributes is absent, the index will not contain that item. An index that holds only some of the table's items in this way is called a "sparse" index. Sparse indexes can be very useful for implementing specific queries over a small subset of the table.
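To make the membership rule concrete, here is a minimal in-memory sketch. It does not call DynamoDB; it only mimics which items would be copied into an index. The attribute names (pk, sk, gs1pk, gs1sk) and the sample items are hypothetical.

```javascript
// In-memory illustration only: mimics which items DynamoDB copies into an index.
// All attribute names and values are hypothetical.
const items = [
    {pk: 'account#1', sk: 'account#', name: 'acme', gs1pk: 'account#', gs1sk: 'account#acme'},
    {pk: 'account#1', sk: 'user#42', email: 'ann@example.com'},
    {pk: 'account#1', sk: 'user#43', email: 'bob@example.com'},
]

// An item appears in an index only if it has every key attribute of that index.
function itemsInIndex(items, indexKeys) {
    return items.filter((item) => indexKeys.every((key) => item[key] !== undefined))
}

const gs1 = itemsInIndex(items, ['gs1pk', 'gs1sk'])
console.log(gs1.length)     // 1
```

Only the account item carries the gs1 key attributes, so the index holds one item even though the table holds three.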

A sparse index will incur storage charges only for the data that is replicated into the index. If the sparse index has only one item replicated from the primary index, then the storage charge will be for just one item. This makes sparse indexes a very cost effective solution.
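As a rough worked example of the storage arithmetic, the sketch below compares a fully replicated index with a sparse, keys-only one. The figures (a 10 GB table, 1% of items indexed, keys making up 10% of each item) are invented for illustration, and the estimate ignores per-item index overhead and pricing details.

```javascript
// Rough estimate only: ignores per-item index overhead and pricing tiers.
// indexes: [{itemFraction, attributeFraction}], each value between 0 and 1.
function estimateStorageBytes(tableBytes, indexes) {
    const indexBytes = indexes.reduce(
        (total, ix) => total + tableBytes * ix.itemFraction * ix.attributeFraction, 0)
    return tableBytes + indexBytes
}

const GB = 1024 ** 3

// One index that replicates every attribute of every item.
const full = estimateStorageBytes(10 * GB, [{itemFraction: 1, attributeFraction: 1}])

// One sparse, keys-only index (hypothetical: 1% of items, keys are 10% of item size).
const sparse = estimateStorageBytes(10 * GB, [{itemFraction: 0.01, attributeFraction: 0.1}])

console.log(full / GB)                  // 20
console.log((sparse / GB).toFixed(2))   // 10.01
```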

Projected Attributes

A secondary index can project (replicate) either all item attributes to the secondary index, a subset of the attributes or only the key attributes.
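For reference, the projection is chosen when the index is defined. The sketch below shows a CreateTable request body with a keys-only global secondary index. The table, index, and attribute names are hypothetical; the field names follow the DynamoDB CreateTable API.

```javascript
// Hypothetical table, index, and attribute names.
// The object shape follows the DynamoDB CreateTable API.
const createTableParams = {
    TableName: 'MyTable',
    BillingMode: 'PAY_PER_REQUEST',
    AttributeDefinitions: [
        {AttributeName: 'pk', AttributeType: 'S'},
        {AttributeName: 'sk', AttributeType: 'S'},
        {AttributeName: 'gs1pk', AttributeType: 'S'},
        {AttributeName: 'gs1sk', AttributeType: 'S'},
    ],
    KeySchema: [
        {AttributeName: 'pk', KeyType: 'HASH'},
        {AttributeName: 'sk', KeyType: 'RANGE'},
    ],
    GlobalSecondaryIndexes: [{
        IndexName: 'gs1',
        KeySchema: [
            {AttributeName: 'gs1pk', KeyType: 'HASH'},
            {AttributeName: 'gs1sk', KeyType: 'RANGE'},
        ],
        // Alternatives: {ProjectionType: 'ALL'} or
        // {ProjectionType: 'INCLUDE', NonKeyAttributes: ['name']}
        Projection: {ProjectionType: 'KEYS_ONLY'},
    }],
}
```

With KEYS_ONLY, each index entry carries the index keys plus the table's primary key, which is what makes a follow-up read of the full item possible.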

If you project only the keys, then a read from the secondary index will return only the key attributes. From these you can read all the attributes from the primary index, but that will incur an additional read request and require additional code to manage the second request.
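The two-step pattern looks roughly like the sketch below. It uses in-memory stand-ins for the table and the keys-only index rather than the AWS SDK, and every name in it is hypothetical. In a real application the second step would be a GetItem or BatchGetItem request against the table.

```javascript
// In-memory stand-ins for a table and a keys-only index; this is not the AWS SDK.
// All names and values are hypothetical.
const table = new Map([
    ['account#1|account#', {pk: 'account#1', sk: 'account#', name: 'acme', plan: 'pro'}],
])
const keysOnlyIndex = [
    {gs1pk: 'account#', gs1sk: 'account#acme', pk: 'account#1', sk: 'account#'},
]

// Read 1: query the index, which returns key attributes only.
function queryIndex(gs1pk, gs1sk) {
    return keysOnlyIndex.filter((entry) => entry.gs1pk === gs1pk && entry.gs1sk === gs1sk)
}

// Read 2: fetch the full item from the primary index by its primary key.
function getItem(pk, sk) {
    return table.get(`${pk}|${sk}`)
}

// Combine both reads so that callers receive complete items.
function findViaIndex(gs1pk, gs1sk) {
    return queryIndex(gs1pk, gs1sk).map((keys) => getItem(keys.pk, keys.sk))
}

const [account] = findViaIndex('account#', 'account#acme')
console.log(account.plan)   // pro
```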

If you project all the attributes, you will double your storage costs for those items. Projecting a subset of attributes can seem like an ideal middle ground, but it ties the index to today's queries. Changing which attributes you project later can make evolving your application more difficult, even with a migration tool like the OneTable Migrate CLI.

OneTable Follow

To make using keys-only sparse indexes easier, the DynamoDB OneTable library has a helpful follow option that will return a complete item from a secondary index.

If reading from a sparse secondary index that projects keys only, you would normally have to issue a second read to fetch the full attributes from the desired item. By using the follow option, OneTable will transparently follow the retrieved primary keys and fetch the full item from the primary index so that you do not have to issue the second read manually.

let account = await Account.find({name: 'acme'}, {index: 'gs1', follow: true})

Under the hood, OneTable is still performing two reads to retrieve the item but your code is much cleaner. For situations where the storage costs are a concern, this approach allows minimal cost, keys-only secondary indexes to be used without the complexity of multiple requests in your code.

SenseDeep with OneTable

At SenseDeep, we've used OneTable and the OneTable CLI extensively with our SenseDeep serverless developer studio. All data is stored in a single DynamoDB table and we rely heavily on single-table design patterns. We could not be more satisfied with our DynamoDB implementation. Our storage and database access costs are insanely low and access/response times are excellent.

Please try our serverless troubleshooter, SenseDeep.

Contact

You can contact me (Michael O'Brien) on Twitter at @mobstream or by email, and read my blog.

To learn more about SenseDeep and how to use our serverless developer studio, please visit https://www.sensedeep.com/.
