DEV Community

Quoc-Hung Hoang

What I Learned from the 'Amazon DynamoDB for Serverless Architectures' Course on AWS Skill Builder

TL;DR

  • DynamoDB is designed for Online Transaction Processing (OLTP) workloads that need consistent single-digit millisecond response times at any scale
  • Unlike traditional databases, the key to unlocking DynamoDB's power lies in knowing your request patterns upfront - not as an afterthought, but as a fundamental part of your design process. You can't just throw any query at it; instead, you design your data model around specific questions you need to answer. This approach makes it ideal for applications with well-defined workflows, like e-commerce carts, user sessions, or game states, but less suitable for exploratory analytics or ad-hoc queries.

How it works

Data is stored in tables. A table contains items with attributes.
You can think of items as rows or tuples in a relational database and attributes as columns.
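To make the item/attribute idea concrete, here is a minimal sketch of how a plain Python record maps onto DynamoDB's typed attribute format, where every value is tagged with its type (`S` for string, `N` for number, and so on). The helper names `to_attribute_value` and `to_item` and the sample record are my own illustration, not something from the course; boto3 ships its own serializer for real use.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Convert a plain Python value to DynamoDB's typed attribute format."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, Decimal)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(record):
    """An item is a set of named attributes, much like a row with columns."""
    return {name: to_attribute_value(v) for name, v in record.items()}


# Hypothetical shopping-cart record used only for illustration.
item = to_item({"user_id": "u-42", "cart_total": 3, "is_guest": False})
print(item)
```

Unlike a relational row, two items in the same table do not need to share the same attributes; only the primary key attributes are required on every item.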
Tables and Partitions

Primary keys

Basic item requests:

Write

  • PutItem: Write item to specified primary key.
  • UpdateItem: Change attributes for item with specified primary key.
  • BatchWriteItem: Put or delete multiple items, each identified by its primary key, in a single request.
  • DeleteItem: Remove item associated with specified primary key.

Read

  • GetItem: Retrieve item associated with specified primary key.
  • BatchGetItem: Retrieve multiple items by their primary keys in a single request.
  • Query: For specified partition key, retrieve items matching sort key expression (forward/reverse order).
  • Scan: Give me every item in my table.
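As a rough sketch of what these requests look like, the functions below build the parameter dictionaries for PutItem, GetItem, and Query. The `Orders` table, its `customer_id` partition key, and its `order_date` sort key are assumptions I made up for illustration. The code only builds the request shapes, so it runs without an AWS account.

```python
TABLE = "Orders"  # hypothetical table: partition key customer_id, sort key order_date


def put_item_request(customer_id, order_date, amount):
    """PutItem: write one item under the given primary key."""
    return {
        "TableName": TABLE,
        "Item": {
            "customer_id": {"S": customer_id},
            "order_date": {"S": order_date},
            "amount": {"N": str(amount)},
        },
    }


def get_item_request(customer_id, order_date):
    """GetItem: the full primary key (partition + sort) is required."""
    return {
        "TableName": TABLE,
        "Key": {
            "customer_id": {"S": customer_id},
            "order_date": {"S": order_date},
        },
    }


def query_request(customer_id, date_prefix, newest_first=True):
    """Query: one partition key value plus a condition on the sort key."""
    return {
        "TableName": TABLE,
        "KeyConditionExpression": "customer_id = :c AND begins_with(order_date, :d)",
        "ExpressionAttributeValues": {
            ":c": {"S": customer_id},
            ":d": {"S": date_prefix},
        },
        # False returns items in descending sort key order.
        "ScanIndexForward": not newest_first,
    }


request = query_request("c-1", "2024-06")
print(request)
```

With boto3 installed and credentials configured, a dictionary like this could be passed along the lines of `boto3.client("dynamodb").query(**request)`. Note that Query always needs an exact partition key value; only Scan reads across all partitions, which is why it gets expensive on large tables.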

Details: Working with items and attributes in DynamoDB

Secondary indexes: allow you to query data based on attributes other than your table's primary key.

  • Local secondary indexes
    • index is local to partition key
    • allow you to query items with the same partition key
  • Global secondary indexes:
    • allow you to query over the entire table
    • index is across all partitions
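The difference between the two index types shows up in their key schemas. Below is a sketch of a table definition with one local and one global secondary index, in the shape the CreateTable API expects. The table, index, and attribute names are hypothetical; the point is that the local index reuses the table's partition key while the global index picks a different one.

```python
def hash_key(key_schema):
    """Return the partition (HASH) key attribute name of a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


# Hypothetical table definition; all names are illustrative only.
table_definition = {
    "TableName": "Orders",
    "BillingMode": "PAY_PER_REQUEST",
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_date", "AttributeType": "S"},
        {"AttributeName": "order_amount", "AttributeType": "N"},
        {"AttributeName": "order_state", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},
        {"AttributeName": "order_date", "KeyType": "RANGE"},
    ],
    # Local: same partition key as the table, alternative sort key.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "by_amount",
            "KeySchema": [
                {"AttributeName": "customer_id", "KeyType": "HASH"},
                {"AttributeName": "order_amount", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    # Global: its own partition key, so it spans the whole table.
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "by_state",
            "KeySchema": [
                {"AttributeName": "order_state", "KeyType": "HASH"},
                {"AttributeName": "order_date", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
}

# Querying an index uses the same Query request plus an IndexName.
index_query = {
    "TableName": "Orders",
    "IndexName": "by_state",
    "KeyConditionExpression": "order_state = :s",
    "ExpressionAttributeValues": {":s": {"S": "SHIPPED"}},
}

lsi = table_definition["LocalSecondaryIndexes"][0]
gsi = table_definition["GlobalSecondaryIndexes"][0]
print(hash_key(lsi["KeySchema"]), hash_key(gsi["KeySchema"]))
```

One practical consequence: local secondary indexes can only be defined when the table is created, while global secondary indexes can be added later.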

Pros:

  • Improved query performance
  • Support complex queries
  • Non-Unique Indexing: They allow for indexing on non-unique attributes, which broadens the range of query possibilities, such as searching for products by category or timestamps in logs

Cons:

  • Write Performance Overhead: Each time an item in the base table is updated, corresponding entries in the secondary index must also be updated. This can slow down insert and update operations, especially if there are multiple secondary indexes on a table
  • Storage Costs: While secondary indexes reduce data retrieval time, they also consume additional storage space, which can be a consideration in environments with limited resources
  • Limited Use Cases: Index keys need the same care as table keys. A global secondary index whose partition key has low cardinality concentrates traffic on a few index partitions, which can lead to throttling and inefficient queries

Important notes:

Design considerations

Partition keys

When selecting an attribute as a partition key in a NoSQL database, it is crucial to consider several key factors to ensure that read and write operations are evenly distributed across partitions. Here are the primary considerations:

  • High Cardinality (that means there are lots of unique values):
    • Definition: Choose attributes that have a large number of distinct values (high cardinality). For instance, using user_id, email, or order_id can help ensure that data is spread across many partitions.
    • Impact: High cardinality minimizes the risk of “hot partitions,” where one partition receives significantly more traffic than others, leading to performance bottlenecks.

  • Avoid Monotonically Increasing Values:
    • Examples: A coarse timestamp, such as the current date, makes a poor partition key because every new item shares the same key value and is therefore directed to the same partition. DynamoDB hashes the partition key, so the problem is many writes sharing one value rather than the values being sequential.
    • Recommendation: Instead, consider using composite keys or appending random suffixes to create variability in the partition key.
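The suffix recommendation can be sketched as follows. This is a minimal illustration under my own assumptions: a shard count of 10, a date used as the base key, and hypothetical function names. A random suffix spreads writes but forces reads to check every shard; a suffix calculated from another attribute lets you recompute the exact key when you already know the item.

```python
import hashlib
import random
from collections import Counter

SHARD_COUNT = 10  # assumed value; tune to the write volume you expect


def random_sharded_key(base_key, rng=random):
    """Spread writes for one logical key across SHARD_COUNT key values."""
    return f"{base_key}#{rng.randrange(SHARD_COUNT)}"


def calculated_sharded_key(base_key, item_id):
    """Derive the suffix from the item itself so it can be recomputed on read."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % SHARD_COUNT
    return f"{base_key}#{shard}"


def all_shard_keys(base_key):
    """Reading everything for a logical key means one Query per shard."""
    return [f"{base_key}#{n}" for n in range(SHARD_COUNT)]


# Simulate 1000 writes that would otherwise all share the key "2024-06-01".
counts = Counter(
    calculated_sharded_key("2024-06-01", f"order-{i}") for i in range(1000)
)
for key in sorted(counts):
    print(key, counts[key])
```

The trade-off is on the read side: to list everything for the date, the application queries each key from `all_shard_keys` and merges the results.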

Hot and Cold Data

Separate frequently accessed (hot) data from rarely accessed (cold) data, so that each can be sized and paid for according to how it is actually used.
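One way to do this for time-based data is to write each period to its own table, so that older tables can be given less capacity or archived. The sketch below assumes one table per month and a hypothetical `events_YYYY_MM` naming scheme; it is one possible layout, not the only one.

```python
from datetime import date


def table_for(day, prefix="events"):
    """Pick the per-month table an item belongs to (naming is hypothetical)."""
    return f"{prefix}_{day:%Y_%m}"


def is_hot(day, today, hot_months=1):
    """Treat data from the most recent `hot_months` calendar months as hot."""
    months_old = (today.year - day.year) * 12 + (today.month - day.month)
    return months_old < hot_months


today = date(2024, 6, 30)
for day in (date(2024, 6, 15), date(2024, 4, 2)):
    print(table_for(day), "hot" if is_hot(day, today) else "cold")
```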

Large attributes

Ideally, keep items small (DynamoDB caps a single item at 400 KB):

  • Compress large data before storing it
  • Store large data in external storage like S3
  • Break it up into smaller items
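Two of these options, compressing and splitting, can be sketched with the standard library alone. The chunk size, the `doc_id`/`part` attribute names, and the helper functions are assumptions for illustration; in a real table `doc_id` would be the partition key and `part` the sort key, so one Query returns all pieces in order.

```python
import zlib

MAX_ITEM_BYTES = 400 * 1024  # DynamoDB's per-item size limit
CHUNK_BYTES = 100 * 1024     # assumed chunk size, well under the limit


def compress_text(text):
    """Compress a large text attribute before storing it as binary."""
    return zlib.compress(text.encode("utf-8"))


def decompress_text(blob):
    return zlib.decompress(blob).decode("utf-8")


def split_into_items(doc_id, payload, chunk_bytes=CHUNK_BYTES):
    """Break one large payload into several small items that share doc_id."""
    chunks = [payload[i:i + chunk_bytes] for i in range(0, len(payload), chunk_bytes)]
    return [{"doc_id": doc_id, "part": n, "data": chunk} for n, chunk in enumerate(chunks)]


def join_items(items):
    """Reassemble the payload from its parts, ordered by part number."""
    return b"".join(item["data"] for item in sorted(items, key=lambda i: i["part"]))


text = "lorem ipsum " * 50000          # 600,000 bytes, too big for one item
blob = compress_text(text)             # repetitive text compresses very well
parts = split_into_items("doc-1", text.encode("utf-8"))

print(len(text.encode("utf-8")) > MAX_ITEM_BYTES, len(blob), len(parts))
```

For payloads that stay large even after compression, the S3 option is usually simpler: store the object in S3 and keep only its key in the DynamoDB item.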

Closing Thoughts

DynamoDB represents a fundamental shift in how we approach database design. While traditional databases let us figure out our queries after the fact, DynamoDB asks us to think differently – to start with our application's access patterns and build our data model around them. This isn't just a technical constraint; it's a design philosophy that rewards careful planning with exceptional performance and scalability.

Throughout this course, I've come to appreciate that success with DynamoDB isn't about fighting its constraints but embracing them. Whether it's choosing the right partition keys, managing hot and cold data, or designing secondary indexes, each decision flows from understanding your application's specific needs upfront.
