Query Data with DynamoDB

#dynamodb #aws #nosql #cloudnative

Introducing Today's Project!

What is Amazon DynamoDB?

Amazon DynamoDB is a fully managed, serverless NoSQL database service from AWS that provides fast, predictable performance and scales automatically. It is useful because it eliminates the need to manage servers, supports both key‑value and document data models, and ensures single‑digit millisecond response times even at massive scale.

How I used Amazon DynamoDB in this project

In today’s project, I used Amazon DynamoDB to practice reading data and updating related tables in a way that keeps everything consistent. I started by running get-item commands to retrieve specific records by partition key, using projection expressions to pull back only the attributes I needed. Then I explored how related tables can be updated together by running a transaction with transact-write-items, which let me insert a new comment into one table while updating a counter in another. This showed me how DynamoDB ensures atomicity: both operations succeed or fail together, which makes it really useful for handling connected data across multiple tables without risking mismatched updates.

One thing I didn’t expect in this project is how seamlessly DynamoDB handled transactions across multiple tables. I thought working with related data in separate tables would require a lot of manual coordination, but using transact-write-items made it surprisingly straightforward to insert a new record in one table while updating another in the same request. It was eye‑opening to see how DynamoDB guarantees atomicity: either both operations succeed or neither does, which really simplifies keeping related data consistent. This step showed me that DynamoDB isn’t just about speed and scalability, but also about reliability when managing related data.

This project took me through the full cycle of working with DynamoDB: retrieving specific items with get-item and projection expressions, exploring how tables can be related, and finally running a transaction that updated two tables at once. It wasn’t just about learning commands; it was about seeing how DynamoDB ensures consistency and reliability when handling connected data.

Querying DynamoDB Tables

A partition key is the primary attribute DynamoDB uses to distribute and retrieve data across its storage partitions. Every item in a DynamoDB table must include a partition key, and items with the same partition key value are grouped together. This key determines where the data is stored internally and is essential for efficient queries, since DynamoDB can quickly locate items based on that key rather than scanning the entire table.

A sort key is the secondary attribute in a DynamoDB table’s primary key schema that works alongside the partition key to uniquely identify items. While the partition key determines which partition the data belongs to, the sort key organizes items within that partition. This means multiple items can share the same partition key but be distinguished by different sort key values. Sort keys also enable powerful query patterns, such as retrieving items in a range (e.g., all comments after a certain date) or ordering results by the sort key.
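
To make the two roles concrete, here is a minimal in-memory sketch in Python. It is only a toy model of the idea, not how DynamoDB is implemented: the class name TinyTable, the key values, and the timestamps are all made up for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


class TinyTable:
    """Toy model: items grouped by partition key, ordered by sort key."""

    def __init__(self):
        # partition key value -> list of (sort key, item), kept sorted by sort key
        self._partitions = defaultdict(list)

    def put(self, pk, sk, item):
        rows = self._partitions[pk]
        keys = [k for k, _ in rows]
        i = bisect_left(keys, sk)
        if i < len(rows) and rows[i][0] == sk:
            rows[i] = (sk, item)  # same partition key + sort key: overwrite
        else:
            rows.insert(i, (sk, item))  # keep the partition ordered by sort key

    def query(self, pk, sk_from=None):
        """Read one partition, optionally only items with sort key >= sk_from."""
        rows = self._partitions.get(pk, [])
        if sk_from is None:
            return [item for _, item in rows]
        keys = [k for k, _ in rows]
        return [item for _, item in rows[bisect_left(keys, sk_from):]]


# Hypothetical data: the partition key is a post id, the sort key a timestamp.
table = TinyTable()
table.put("Events/post-1", "2024-01-02T10:00:00Z", {"Comment": "second"})
table.put("Events/post-1", "2024-01-01T09:00:00Z", {"Comment": "first"})
table.put("Events/post-2", "2024-01-01T12:00:00Z", {"Comment": "other post"})

all_for_post = table.query("Events/post-1")
recent_only = table.query("Events/post-1", sk_from="2024-01-02T00:00:00Z")
print(all_for_post)
print(recent_only)
```

The lookup only ever touches one partition, and the range filter works because the items inside it are already ordered by the sort key.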

Limits of Using DynamoDB

I ran into an error when I queried for items in the Comment table without providing a value for the partition key Id. This was because DynamoDB requires an exact partition key value in every query. Without it, the system doesn’t know which partition to look in, so the console flagged the input as invalid. In other words, the query failed because the partition key field was left empty. A sort key condition is optional, but the partition key is not; reading items without knowing the partition key means running a scan over the whole table instead.

Insights we could extract from our Comment table include which posts attract the most engagement, how often specific users contribute comments, and time-based activity patterns such as peak commenting hours or days. We can also observe relationships between posts and their associated comments, giving us a clear picture of community interaction at a structural level.

Insights we can’t easily extract from the Comment table include deeper qualitative analysis, such as the sentiment or tone of comments, trending topics across multiple posts, or demographic-based engagement patterns. DynamoDB stores structured attributes but doesn’t analyze their meaning or support joins across tables, so extracting these kinds of insights would require additional tools or data sources.
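
Going back to the failed query at the start of this section, the partition key rule can be sketched as a small Python helper that builds the parameters for a Query request and refuses to continue when the partition key value is empty. The parameter names (TableName, KeyConditionExpression, ExpressionAttributeNames, ExpressionAttributeValues) follow the DynamoDB Query API. The helper itself, the sort key name CommentDateTime, and the sample values are assumptions for illustration, as is treating both keys as strings.

```python
def build_query_params(table_name, pk_name, pk_value, sk_name=None, sk_from=None):
    """Build Query parameters; the partition key value is mandatory."""
    if not pk_value:
        # Mirrors the console error: no partition key value, no query.
        raise ValueError("a Query needs an exact partition key value")

    names = {"#pk": pk_name}
    values = {":pk": {"S": pk_value}}  # assumes a string partition key
    condition = "#pk = :pk"

    # The sort key condition is optional and only narrows the result.
    if sk_name is not None and sk_from is not None:
        names["#sk"] = sk_name
        values[":sk"] = {"S": sk_from}
        condition += " AND #sk >= :sk"

    return {
        "TableName": table_name,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


# Hypothetical values; only the table name and the key name Id come from the project.
params = build_query_params(
    "Comment", "Id", "Events/post-1",
    sk_name="CommentDateTime", sk_from="2024-01-01T00:00:00Z",
)
print(params["KeyConditionExpression"])

try:
    build_query_params("Comment", "Id", "")
except ValueError as err:
    print("rejected:", err)
```

If boto3 is installed and credentials are configured, a dictionary like this should be usable as keyword arguments to the low-level client's query call, but the sketch deliberately stops before making any request.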

Running Queries with CLI

A command I ran in CloudShell was:

aws dynamodb get-item \
--table-name ContentCatalog \
--key '{"Id":{"N":"202"}}' \
--projection-expression "Title, ContentType, Services" \
--return-consumed-capacity TOTAL

This command fetches the item in the ContentCatalog table whose partition key Id equals 202. Instead of returning the entire record, DynamoDB returns only the attributes I specified in the projection expression: Title, ContentType, and Services. Alongside those values, the response also includes a ConsumedCapacity block that shows how many read capacity units (RCUs) were used, giving me both the filtered item data and a usage report in one result.

Options I could add to the command affect how DynamoDB returns the data and what additional information I get back. Specifically:

--consistent-read
Requests a strongly consistent read, so I always get the most up-to-date version of the item rather than a possibly stale copy from a replica. Without this flag, reads are eventually consistent.

--projection-expression
Lets me specify which attributes to return, so instead of the full record I only get the fields I care about (Title, ContentType, and Services).

--return-consumed-capacity TOTAL
Adds a usage report to the response, showing how many read capacity units the request consumed.
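
The same request and its options can be sketched in Python as a plain parameter dictionary, together with a rough estimate of what the ConsumedCapacity block might report. The parameter names follow the DynamoDB GetItem API. The two helper functions are hypothetical, and the estimate assumes the standard rule of one RCU per 4 KB for a strongly consistent read and half of that for an eventually consistent one, with the 4 KB block taken as 4096 bytes.

```python
import math


def build_get_item_params(consistent_read=False):
    """Parameters equivalent to the get-item command above."""
    return {
        "TableName": "ContentCatalog",
        "Key": {"Id": {"N": "202"}},
        "ProjectionExpression": "Title, ContentType, Services",
        "ConsistentRead": consistent_read,  # --consistent-read
        "ReturnConsumedCapacity": "TOTAL",  # --return-consumed-capacity TOTAL
    }


def estimate_read_capacity_units(item_size_bytes, consistent_read=False):
    """Rough RCU estimate for a single GetItem (an assumption-based sketch)."""
    # Reads are billed in 4 KB blocks, rounded up, based on the stored item
    # size; a projection expression trims the response, not the billed size.
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    return blocks * (1.0 if consistent_read else 0.5)


params = build_get_item_params(consistent_read=True)
print(params)
print(estimate_read_capacity_units(1000))                        # eventually consistent
print(estimate_read_capacity_units(1000, consistent_read=True))  # strongly consistent
```

The point of the second helper is the trade-off behind --consistent-read: the fresher read costs twice as much capacity for the same item.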

Transactions

A transaction is a coordinated set of operations in DynamoDB that are executed together so they either all succeed or all fail. When you need to update related data across multiple tables or items, you can group those changes into a single transaction. DynamoDB then guarantees atomicity: if one part of the transaction cannot be completed, none of the changes are applied. This ensures consistency and prevents situations where one table is updated but another is left behind, keeping your data reliable and synchronized across different parts of your application.
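
The all-or-nothing behaviour can be illustrated with a small in-memory sketch in Python. This is not how DynamoDB implements transactions; it only demonstrates the guarantee. All names here (apply_atomically, the operation functions, the sample data) are made up for the example.

```python
import copy


def apply_atomically(tables, operations):
    """Apply every operation to a working copy; commit only if all succeed."""
    working = copy.deepcopy(tables)
    for operation in operations:
        operation(working)  # an exception here aborts before anything is committed
    tables.clear()
    tables.update(working)


def add_comment(state):
    state["Comment"].append({"Id": "post-1", "Comment": "hello"})


def increment_counter(state):
    state["Forum"]["Events"]["Comments"] += 1


def failing_update(state):
    raise RuntimeError("simulated failure in the second operation")


tables = {"Comment": [], "Forum": {"Events": {"Comments": 0}}}

# Both operations succeed, so both changes are committed.
apply_atomically(tables, [add_comment, increment_counter])

# The second operation fails, so the new comment is not committed either.
try:
    apply_atomically(tables, [add_comment, failing_update])
except RuntimeError:
    pass

print(len(tables["Comment"]), tables["Forum"]["Events"]["Comments"])
```

After the failed attempt the comment list and the counter still agree, which is exactly the mismatch a transaction is meant to prevent.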

I ran a transaction using the aws dynamodb transact-write-items command with a client request token called TRANSACTION1, which makes the request idempotent if it is retried. This transaction did two things. First, it added a new item to the Comment table with details such as the event name, the date and time of the comment, the comment text, and the user who posted it. Second, it updated the Forum table by incrementing the Comments attribute for the Events item. By grouping these two operations together in a single transaction, DynamoDB ensured that both changes either succeeded or failed as one unit, keeping the data consistent across the related tables.
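
As a rough reconstruction of what such a request could look like, the Python sketch below builds the list of transaction items and prints it as JSON. The Put and Update structure follows the TransactWriteItems API. The attribute names (Id, CommentDateTime, Comment, PostedBy), the Forum key attribute Name, and every sample value are assumptions, since the exact command is not reproduced in this post.

```python
import json


def build_transact_items(post_id, comment_time, text, user, forum_name):
    """Build a two-step transaction: put a comment, bump the forum counter."""
    return [
        {
            "Put": {
                "TableName": "Comment",
                "Item": {  # attribute names are hypothetical
                    "Id": {"S": post_id},
                    "CommentDateTime": {"S": comment_time},
                    "Comment": {"S": text},
                    "PostedBy": {"S": user},
                },
            }
        },
        {
            "Update": {
                "TableName": "Forum",
                "Key": {"Name": {"S": forum_name}},  # key attribute is an assumption
                # ADD increments a numeric attribute (creating it if missing);
                # the #c placeholder sidesteps any reserved-word clash.
                "UpdateExpression": "ADD #c :inc",
                "ExpressionAttributeNames": {"#c": "Comments"},
                "ExpressionAttributeValues": {":inc": {"N": "1"}},
            }
        },
    ]


items = build_transact_items(
    post_id="Events/example-post",       # placeholder values
    comment_time="2024-01-01T12:00:00Z",
    text="Example comment",
    user="Example User",
    forum_name="Events",
)
payload = json.dumps(items, indent=2)
print(payload)
```

The printed JSON is the kind of value that could be passed to the --transact-items option, alongside --client-request-token TRANSACTION1.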