TL;DR
- DynamoDB is well suited for Online Transaction Processing (OLTP) workloads where you need consistent single-digit millisecond response times at any scale
- Unlike with traditional databases, unlocking DynamoDB's power requires knowing your request patterns upfront - not as an afterthought, but as a fundamental part of your design process. You can't throw arbitrary queries at it; instead, you design your data model around the specific questions you need to answer. This makes it ideal for applications with well-defined workflows, like e-commerce carts, user sessions, or game states, but less suitable for exploratory analytics or ad-hoc queries.
How it works
Data is stored in tables. A table contains items with attributes.
You can think of items as rows or tuples in a relational database and attributes as columns.
Primary keys
Every item is uniquely identified by its primary key, which is either simple (a partition key alone) or composite (a partition key plus a sort key).
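For example, an item in a hypothetical Orders table with a composite key might look like this (the attribute names are illustrative, not a DynamoDB convention):

```python
order_item = {
    "customer_id": "c-123",    # partition key: decides which partition stores the item
    "order_id": "o-2024-001",  # sort key: orders items within a partition
    "status": "SHIPPED",       # regular attribute
    "total": 42,               # regular attribute
}
```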
Basic item requests:
Write
- PutItem: Write an item with the specified primary key, overwriting any existing item with that key.
- UpdateItem: Change attributes of the item with the specified primary key.
- BatchWriteItem: Write (put or delete) a batch of items by their primary keys in a single request.
- DeleteItem: Remove the item associated with the specified primary key.
Read
- GetItem: Retrieve the item associated with the specified primary key.
- BatchGetItem: Retrieve multiple items by their primary keys in a single request.
- Query: For a specified partition key, retrieve items matching a sort key expression (in forward or reverse order).
- Scan: Retrieve every item in the table (expensive; avoid it on hot paths).
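Here is a minimal sketch of these requests with Python's boto3 SDK, assuming the hypothetical Orders table from above (composite key customer_id + order_id):

```python
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Orders")  # hypothetical table name

# PutItem: write (or overwrite) the item at this primary key
table.put_item(Item={"customer_id": "c-123", "order_id": "o-2024-001", "total": 42})

# GetItem: fetch one item by its full primary key
resp = table.get_item(Key={"customer_id": "c-123", "order_id": "o-2024-001"})
item = resp.get("Item")

# Query: all 2024 orders for one customer, newest first
resp = table.query(
    KeyConditionExpression=Key("customer_id").eq("c-123")
    & Key("order_id").begins_with("o-2024"),
    ScanIndexForward=False,  # reverse sort key order
)
orders = resp["Items"]
```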
Details: Working with items and attributes in DynamoDB
Secondary indexes: let you query data by attributes other than your table's primary key.
- Local secondary indexes:
  - share the table's partition key but use a different sort key (defined at table creation)
  - let you query items with the same partition key in a different sort order
- Global secondary indexes:
  - use their own partition (and optional sort) key, independent of the table's
  - span all partitions, so you can query over the entire table
Pros:
- Improved query performance
- Support complex queries
- Non-Unique Indexing: They allow for indexing on non-unique attributes, which broadens the range of query possibilities, such as searching for products by category or timestamps in logs
Cons:
- Write Performance Overhead: Each time an item in the base table is updated, corresponding entries in the secondary index must also be updated. This can slow down insert and update operations, especially if there are multiple secondary indexes on a table
- Storage Costs: While secondary indexes reduce data retrieval time, they also consume additional storage space, which can be a consideration in environments with limited resources
- Limited Use Cases: Secondary indexes should not be applied to attributes with low or very high cardinality; low cardinality can lead to inefficient queries, while high cardinality can result in excessive scanning across nodes
Important notes:
- A secondary index should not be applied to attributes with too low (e.g. gender -> male/female) or too high cardinality (e.g. a user's unique ID)
  - in the former case, querying is inefficient because low cardinality leads to wide index partitions, leaving a lot of data to scan
  - in the latter case, queries can be executed at best on one node and at worst across all nodes
- Learn more: https://www.waitingforcode.com/general-big-data/secondary-index-nosql-data-stores/read?t#sample_implementation
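As a rough sketch, querying a global secondary index with boto3 looks like a regular table query plus an IndexName. The status-index GSI on a "status" attribute is an assumption for illustration:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Orders")  # hypothetical table

# Query a GSI by passing IndexName; the key condition targets the index's keys.
resp = table.query(
    IndexName="status-index",  # hypothetical GSI with partition key "status"
    KeyConditionExpression=Key("status").eq("SHIPPED"),
)
shipped_orders = resp["Items"]
```

Note that GSI queries are eventually consistent and, with provisioned capacity, consume the index's own throughput rather than the table's.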
Design considerations
Partition keys
When selecting an attribute as a partition key in a NoSQL database, it is crucial to consider several key factors to ensure that read and write operations are evenly distributed across partitions. Here are the primary considerations:
High Cardinality (that is, many unique values):
- Definition: Choose attributes that have a large number of distinct values, such as user_id, email, or order_id, to help ensure that data is spread across many partitions.
- Impact: High cardinality minimizes the risk of "hot partitions," where one partition receives significantly more traffic than others, leading to performance bottlenecks.
Avoid Monotonically Increasing Values:
- Examples: Attributes like timestamps or sequential IDs can lead to uneven distribution because new entries are always directed to the same partition until it reaches capacity.
- Recommendation: Instead, use composite keys or append a random suffix to create variability in the partition key (see the sketch below).
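A minimal sketch of the random-suffix technique, assuming a write-heavy table keyed by date (SHARD_COUNT, the table, and the key names are illustrative):

```python
import random

import boto3

SHARD_COUNT = 10  # assumption: tune to your write volume
table = boto3.resource("dynamodb").Table("Events")  # hypothetical table

def sharded_pk(base_key: str) -> str:
    # Spread writes for one logical key across SHARD_COUNT partition keys.
    return f"{base_key}#{random.randint(0, SHARD_COUNT - 1)}"

# Writes for the hot date "2024-06-01" now land on keys 2024-06-01#0 .. #9.
table.put_item(Item={"pk": sharded_pk("2024-06-01"), "event_id": "evt-1"})
```

The trade-off is on the read side: fetching all events for a date now requires querying every shard and merging the results.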
Hot and Cold Data
Separate data that is frequently accessed (hot) from data that is rarely accessed (cold), for example by moving cold items to a separate table or to cheaper storage.
Large attributes
Ideally, keep item sizes small - DynamoDB caps each item at 400 KB. Strategies for large data:
- Compress large data before storing it
- Store large data in external storage like S3
- Break it up into smaller items
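As an illustrative sketch of the compression option (the table and attribute names are assumptions):

```python
import gzip
import json

import boto3

table = boto3.resource("dynamodb").Table("Documents")  # hypothetical table

payload = {"report": "a large JSON document..."}

# Compress before writing; binary data is stored as DynamoDB's B type,
# which helps stay under the 400 KB item limit.
table.put_item(Item={
    "doc_id": "d-1",
    "body_gz": gzip.compress(json.dumps(payload).encode("utf-8")),
})

# Read back and decompress (.value unwraps boto3's Binary wrapper).
resp = table.get_item(Key={"doc_id": "d-1"})
restored = json.loads(gzip.decompress(resp["Item"]["body_gz"].value))
```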
Closing Thoughts
DynamoDB represents a fundamental shift in how we approach database design. While traditional databases let us figure out our queries after the fact, DynamoDB asks us to think differently – to start with our application's access patterns and build our data model around them. This isn't just a technical constraint; it's a design philosophy that rewards careful planning with exceptional performance and scalability.
Throughout this course, I've come to appreciate that success with DynamoDB isn't about fighting its constraints but embracing them. Whether it's choosing the right partition keys, managing hot and cold data, or designing secondary indexes, each decision flows from understanding your application's specific needs upfront.