I opened up the documentation for the DynamoDB v2 SDK the other day. I just wanted to look up some parameters on BatchGet when I saw it.
[Image: banner stating to start using v3 instead of the v2 SDK]
I let out a big sigh.
I have a lot of code out there using v2 of the SDK. This was going to be a big deal.
I thought about leaving it alone and continuing to use v2 of the SDK. But I decided I'd better practice what I preach and make an effort to stay up to date.
So my team and I began the journey to upgrade to AWS SDK v3. Here are some lessons we learned along the way.
The SDK Is Not Preloaded
One of the nice things about lambdas is that they come with the AWS SDK preloaded in the runtime. Unfortunately, that's not true with v3 of the SDK.
If you want to use v3 of the AWS SDK in your lambdas, you must add the packages yourself.
Adding packages isn't hard to do; we do it all the time. But when my team and I first started implementing it, it was a gotcha because we assumed the SDK was already included.
Since these packages will be used many times across all the functions in your ecosystem, it is best to add them to a shared layer. The way I structure my SAM templates is to create a dependency layer and add it globally to every function.
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Runtime: nodejs14.x
    Architectures:
      - arm64
    Tracing: Active
    Timeout: 3
    Handler: index.handler
    Layers:
      - !Ref DependencyLayer

Resources:
  DependencyLayer:
    Type: AWS::Serverless::LayerVersion
    Metadata:
      BuildMethod: nodejs14.x
    Properties:
      LayerName: dependency-layer
      ContentUri: layers/
      CompatibleRuntimes:
        - nodejs14.x
This pattern automatically adds the packages defined in the lambda layer to all functions defined in your template. So as you start consuming different clients in the SDK, add them to the layer to have them added into your lambda environments.
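With the template above, the layer's dependencies live in a package.json under the ContentUri directory, which sam build installs for you. Something like this (the exact path and version ranges are my assumptions, not prescriptive):

{
  "name": "dependency-layer",
  "dependencies": {
    "@aws-sdk/client-dynamodb": "^3.0.0",
    "@aws-sdk/util-dynamodb": "^3.0.0"
  }
}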
For a real-world example, you can check out my repo on AWS WebSockets.
DynamoDB Data Must Be Marshalled (and Unmarshalled)
When we first started consuming v3 of the SDK, nothing seemed to work. We kept getting a weird error back from Dynamo:
TypeError: Cannot read property '0' of undefined
This was frustrating because we hadn't changed how the code functionally worked. We just swapped out v2 for v3 of the SDK. This led us to believe the migration was going to be a much bigger change than we originally thought.
However, we stumbled across this post from Nadtakan Futhoem that stated the data had to be marshalled prior to the call.
Simply put, marshalling is adding the data type to the JSON. For a deeper insight into marshalling, you can check out this page. DynamoDB requires data to be in a specific format before it can be saved. V2 handled that for you, but v3 requires the caller to do it.
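As a quick sketch of what that looks like (the object shape here is just an illustration):

const { marshall } = require('@aws-sdk/util-dynamodb');

marshall({ id: '123', total: 42 });
// => { id: { S: '123' }, total: { N: '42' } }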
Any value you are passing into the SDK must be marshalled:

- Item in the PutItemCommand
- Key in the GetItemCommand
- ExpressionAttributeValues in the QueryCommand
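For instance, a minimal sketch of a put with a marshalled Item (the table name and item shape are made up for illustration):

const { DynamoDBClient, PutItemCommand } = require('@aws-sdk/client-dynamodb');
const { marshall } = require('@aws-sdk/util-dynamodb');

const ddb = new DynamoDBClient();

// Inside an async handler
const command = new PutItemCommand({
  TableName: 'orders', // hypothetical table
  Item: marshall({ pk: 'order#123', total: 42 })
});
await ddb.send(command);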
On the flip side, data you receive in a response must be unmarshalled. The entire Item object can be unmarshalled directly. However, when getting results from a query, you cannot unmarshall the entire array. You must process each item individually.
const { unmarshall } = require('@aws-sdk/util-dynamodb');

// buildQueryCommand is a helper defined elsewhere in this module
const command = exports.buildQueryCommand(params);
const result = await ddb.send(command);

// unmarshall cannot take the whole array, so convert each item individually
return result.Items?.map(item => unmarshall(item));
To perform the marshalling and unmarshalling, you can use the util-dynamodb package from the SDK.
You Don't Need to .promise() Anymore
One of the changes made in v3 of the SDK was to make command execution generic. The caller builds a command and passes it into a single .send function on the client.
This is great because it makes your code more explicit and makes it easier to tree shake and pull in only the code you need.
The generic .send function returns a promise, so you no longer have to add .promise() to the end of your calls!
Be sure to take this into consideration as you're refactoring your code. While you're replacing the existing calls, don't add .promise() back into the mix.
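Side by side, the change looks something like this (a rough sketch; params stands in for your usual GetItem parameters):

// v2: call a method on the client, then tack on .promise()
const AWS = require('aws-sdk');
const ddbV2 = new AWS.DynamoDB();
const v2Result = await ddbV2.getItem(params).promise();

// v3: build a command and hand it to the generic .send, which already returns a promise
const { DynamoDBClient, GetItemCommand } = require('@aws-sdk/client-dynamodb');
const ddbV3 = new DynamoDBClient();
const v3Result = await ddbV3.send(new GetItemCommand(params));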
Do Not Refactor Everything at Once
A benefit of serverless is modularity: not everything needs to be done at once. You can go function by function over time, working through your repository. Or you could make the switch point-forward, doing all new development with v3 of the SDK.
As you go back and make changes to your existing functions that are written in v2, you can swap them over to v3 while you're there.
AWS is big on minimizing blast radius, the impact when something goes wrong. If you change everything in one go, everything can potentially fail at once. Going function by function breaks that risk into many smaller changes.
Paul Swail provides a two-phase approach to migration that minimizes risk and keeps you moving forward quickly.
Conclusion
If you aren't already, it's time to start using AWS SDK v3. It is faster, makes your lambda packages smaller, and is designed for better overall developer usability.
As with all things software, it comes with trade-offs. You will now be responsible for marshalling your data before it goes to DynamoDB, and you have to manage the SDK in your lambda layers.
With serverless development, you must make a conscious effort to stay up to date. The further you fall behind, the harder it will be to catch up when you finally decide to do it. Start with updating one function. Then do another. Then update all functions you touch while doing routine maintenance or defect fixing. That's the beauty of serverless: you can take a phased approach.
Happy coding!
Top comments (2)
Hi, nice article. I wrote something similar some time ago, because indeed switching to v3 is not so straightforward.
From my tests, you don't really need marshalling/unmarshalling if you use the DynamoDBDocumentClient. (It is the same confusion as with v2 between the DynamoDBClient and the DynamoDBDocumentClient.)
I wrote some samples here
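For anyone curious, a minimal sketch of that approach (the table name and item are hypothetical):

const { DynamoDBClient } = require('@aws-sdk/client-dynamodb');
const { DynamoDBDocumentClient, PutCommand } = require('@aws-sdk/lib-dynamodb');

const docClient = DynamoDBDocumentClient.from(new DynamoDBClient());

// Plain JS objects go in and come out; no marshall/unmarshall needed
await docClient.send(new PutCommand({
  TableName: 'orders', // hypothetical table
  Item: { pk: 'order#123', total: 42 }
}));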
This is great, thank you!