Crizant Lai

Posted on • Originally published at Medium

Things you should consider before using DynamoDb

Lately, I've been working on a project with DynamoDB, and this is my first time using this AWS service. It is not quite a regular NoSQL database like MongoDB: I have run into a few issues (or features?), which I will share here along with the way I dealt with them (when I know how).

No empty strings

DynamoDB is a NoSQL database, but it cannot store empty strings, not at any level of your JSON object. If even a single value anywhere in your object is an empty string, DynamoDB throws an error.

The workaround: the aws-sdk (for Node developers like me, you can find it on npm) provides a DocumentClient option, convertEmptyValues. If you set it to true, empty strings are converted to null before being written to the DB.
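To make the effect concrete, here is a minimal stand-in for what that option effectively does (the real conversion lives inside aws-sdk's DocumentClient; this `convertEmptyValues` function is just an illustration, not the SDK's code):

```javascript
// Walk an item and replace every empty string with null, at any depth,
// mimicking what the DocumentClient's convertEmptyValues option does
// before an item is marshalled for DynamoDB.
const convertEmptyValues = (value) => {
  if (value === '') return null
  if (Array.isArray(value)) return value.map(convertEmptyValues)
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, convertEmptyValues(v)])
    )
  }
  return value
}

// An item that would otherwise make DynamoDB throw:
const item = { id: '42', name: '', tags: ['', 'a'], meta: { note: '' } }
console.log(convertEmptyValues(item))
// → { id: '42', name: null, tags: [ null, 'a' ], meta: { note: null } }
```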

Query operation size limit

DynamoDB returns at most 1MB of data in a single query/scan operation. If your query matches more data than that, the response includes a LastEvaluatedKey that you must pass as ExclusiveStartKey in the next request, in order to retrieve the "next page" of data.
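That paging loop can be sketched like this. The `queryPage` parameter is a stand-in of my own for a real `docClient.query(params).promise()` call (so the sketch stays runnable without an AWS account); LastEvaluatedKey and ExclusiveStartKey are the actual DynamoDB field names:

```javascript
// Collect all pages of a query/scan by feeding each response's
// LastEvaluatedKey back in as ExclusiveStartKey until DynamoDB
// stops returning one.
const queryAllPages = async (queryPage, baseParams) => {
  const items = []
  let lastKey
  do {
    const params = lastKey
      ? { ...baseParams, ExclusiveStartKey: lastKey }
      : baseParams
    const response = await queryPage(params)
    items.push(...response.Items)
    lastKey = response.LastEvaluatedKey
  } while (lastKey)
  return items
}
```

Against a real table you would call it as `queryAllPages(params => docClient.query(params).promise(), { TableName: 'my-table', /* ... */ })`.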

Batch write limit

Each batch write request can include at most 25 items, which is very inconvenient. To deal with this limit, and to handle write failures caused by exceeding the provisioned throughput when multiple nodes write concurrently, I had to do it like this:

const R = require('ramda')
const aws = require('aws-sdk')

const batchWriteToDb = async (tableName, data) => {
  const dynamoDb = new aws.DynamoDB({
    maxRetries: 999
  })
  const docClient = new aws.DynamoDB.DocumentClient({
    service: dynamoDb,
    convertEmptyValues: true
  })
  // DynamoDB accepts at most 25 items per batch write,
  // so split the data into segments of 25
  const dataSegments = R.splitEvery(25, data)
  for (const segment of dataSegments) {
    const params = {
      RequestItems: {
        [tableName]: segment.map(item => ({
          PutRequest: {
            Item: item
          }
        }))
      }
    }
    try {
      let response = await docClient.batchWrite(params).promise()
      // items throttled by the provisioned throughput come back in
      // UnprocessedItems; keep retrying until the map is empty
      while (!R.isEmpty(response.UnprocessedItems)) {
        const count = response.UnprocessedItems[tableName].length
        console.log(`${count} unprocessed item(s) left, retrying...`)
        const retryParams = {
          RequestItems: response.UnprocessedItems
        }
        response = await docClient.batchWrite(retryParams).promise()
      }
    } catch (error) {
      console.log(error, error.stack)
    }
  }
}

Very clumsy, right?

Counting number of rows in a table

In other databases there are simple APIs to count the number of rows in a table, but not in DynamoDB. There are two ways to get the item count of a table:

1. Use DescribeTable API

It returns information about a table, including its approximate item count. But that count is only updated roughly every six hours, which I find unacceptable.

2. Do a full table scan

Yet if your data exceeds 1MB, you have to repeat the scan operation, page by page, to cover the whole table…
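A sketch of that counting scan, again with a stand-in `scanPage` function of my own in place of a real `docClient.scan(params).promise()` call. One mitigation worth knowing: passing Select: 'COUNT' (a real scan parameter) makes each page return only a Count instead of the items themselves, so you page through keys without transferring the actual data:

```javascript
// Count the items in a table with repeated scans. Each page returns
// only a Count (thanks to Select: 'COUNT'), but we still have to
// follow LastEvaluatedKey until the table is exhausted.
const countTable = async (scanPage, tableName) => {
  let total = 0
  let lastKey
  do {
    const params = {
      TableName: tableName,
      Select: 'COUNT',
      ...(lastKey && { ExclusiveStartKey: lastKey })
    }
    const response = await scanPage(params)
    total += response.Count
    lastKey = response.LastEvaluatedKey
  } while (lastKey)
  return total
}
```

Still clumsy compared to a single COUNT(*), but at least the 1MB pages carry no item payload.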

Conclusion

DynamoDB is powerful, but it definitely has some trade-offs. If you come from the world of other NoSQL databases like MongoDB, make sure you take the above points into consideration before switching.
