Hi folks.
I would like to share with you an idea/prototype for managing and running migration and seed scripts for DynamoDB.
First, let's clarify what a migration and a seed script are.
Migration refers to the process of transferring data from one database to another, or making structural changes to a database, such as upgrading, transforming, or migrating to a new database management system (DBMS). Since the AWS CDK already handles DynamoDB structural changes for us, let's agree that in this article, migration refers only to transforming data.
Seed refers to the process of populating a database with initial data, often for development, testing, or setup purposes.
Second, let's outline the required functionality.
In the company where I've worked, the product follows a multi-service architecture. I identified five key requirements:
- TypeScript support: The tool should support TypeScript.
- Minimal migration/seed script logic: The migration and seeding processes should require minimal scripting.
- Ability to run specific scripts: The tool should allow developers to execute particular scripts.
- Service independence: The tool should function independently of any specific services.
- Leverage the latest AWS SDK version: The tool should utilize the most recent version of the AWS SDK.
Third, before diving into my own implementation, I explored existing packages. One option was the Dynamit CLI, created by my friend Victor Korzunin. While Dynamit CLI offers basic functionality and handles most common tasks, it doesn't fully meet all of the requirements outlined above. Therefore, I decided to implement my own solution.
To present my solution, I've created a separate repository dedicated to migration scripts. You can find the source code here. The repository structure is straightforward, consisting of three primary folders:
- The first folder, named `templates`, contains files for the Plop package. Plop, a micro-generator framework, allows us to generate standardized templates for migration and seed scripts.
- The second folder, `scripts`, can contain subdirectories, each named after a specific service, to organize service-specific migration/seed scripts.
- The final folder, `framework`, houses the core migration logic and interfaces.
First script
Let's run the `npm run plop` command to generate our first script. This will prompt you to provide a name for the script and the desired folder location.
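For context, the generator behind that command might be wired up roughly like this. The prompt names, paths, and template file are my assumptions for illustration; only the Plop API itself is real:

```typescript
// plopfile.ts: a rough sketch of the generator behind `npm run plop`.
import type { NodePlopAPI } from "plop";

export default function (plop: NodePlopAPI) {
  plop.setGenerator("script", {
    description: "Generate a migration/seed script",
    prompts: [
      { type: "input", name: "name", message: "Script name?" },
      { type: "input", name: "folder", message: "Target service folder?" }
    ],
    actions: [
      {
        // Drops a pre-filled script into the matching scripts/ subfolder.
        type: "add",
        path: "scripts/{{folder}}/{{kebabCase name}}.ts",
        templateFile: "templates/script.ts.hbs"
      }
    ]
  });
}
```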
Let's open the file and examine the code.
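The exact generated file lives in the repository; the sketch below approximates its shape based on the description that follows. The framework import path and the constructor/`Migrator` signatures are assumptions:

```typescript
// migration-foo.ts: a sketch of a generated script (paths/signatures are assumptions).
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import {
  BatchWriteCommand,
  DynamoDBDocumentClient,
  paginateScan
} from "@aws-sdk/lib-dynamodb";
import { DynamoDBScriptTracker, Migrator } from "../../framework";

const client = DynamoDBDocumentClient.from(new DynamoDBClient({}));

const scriptTracker = new DynamoDBScriptTracker(client, {
  scriptName: "migration-foo",           // filled in by plop from the name you enter
  scriptStore: "migration-script-store"  // table that records executed script names
});

// Required: retrieves items in chunks; the source could be an API, a file, S3, or DynamoDB.
async function* read() {
  const paginator = paginateScan({ client }, { TableName: "source-table" });
  for await (const page of paginator) {
    yield page.Items ?? [];
  }
}

// Optional: a pure transformation over one chunk; exported so it can be unit tested.
export function transform(items: Record<string, any>[]) {
  return items.map((item) => ({ ...item, migrated: true }));
}

// Required: writes one chunk to the target; the framework splits oversized arrays.
async function write(items: Record<string, any>[]) {
  await client.send(
    new BatchWriteCommand({
      RequestItems: {
        "target-table": items.map((Item) => ({ PutRequest: { Item } }))
      }
    })
  );
}

export default new Migrator({ scriptTracker, operations: { read, transform, write } });
```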
First, I'd like to draw your attention to the `DynamoDBScriptTracker` instance. The first argument in the constructor is a client, and the second is an object with two properties. The first property, `scriptName`, is automatically generated when I provide the name using the plop console command. The second property, `scriptStore`, refers to the name of the DynamoDB table where I store all previously executed script names.
NOTE: If you choose a different store for tracking, you only need to implement the ScriptTracker interface. This store could be anything—a relational database, a file, or another storage solution.
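For illustration, here is a minimal sketch of what that contract and a file-based alternative could look like. The method names are my assumptions, not the repository's exact interface:

```typescript
import { promises as fs } from "fs";

// A minimal sketch of the ScriptTracker contract.
interface ScriptTracker {
  isExecuted(): Promise<boolean>;  // was this script already run?
  markExecuted(): Promise<void>;   // record a successful run
}

// Example alternative: a file-based tracker for local development.
export class FileScriptTracker implements ScriptTracker {
  constructor(
    private readonly scriptName: string,
    private readonly storePath = ".executed-scripts"
  ) {}

  async isExecuted(): Promise<boolean> {
    const content = await fs.readFile(this.storePath, "utf8").catch(() => "");
    return content.split("\n").includes(this.scriptName);
  }

  async markExecuted(): Promise<void> {
    await fs.appendFile(this.storePath, `${this.scriptName}\n`);
  }
}
```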
Then, as you can see, you need to implement three functions: `read`, `write`, and `transform`.
- `read`: This required function retrieves items in chunks. The data source could be anything: an API, file, S3 bucket, or DynamoDB table.
- `transform`: This optional function applies transformations to an array of items. Additionally, you can cover this function with unit tests (a test sketch follows this list).
- `write`: This required function writes a batch of items to a target resource. You don't need to worry about the array's size, as it is automatically split into chunks under the hood.
Finally, you need to export an instance of the `Migrator` class, passing the arguments `scriptTracker` and `operations`. Additionally, there is one more optional argument, `force`, which allows you to skip execution checks and run the script directly.
If you open the migrator file, you will not find any complicated logic; there is only one public method, `run`, and that is pretty much it. The other private methods provide logging and recursively execute functions from the operations dependency. You can find the implementation of the Migrator class below.
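The exact implementation lives in the repository, so treat the following as a sketch of the behavior described above. Note that the real class recurses over the operations, while this sketch uses a for-await loop for brevity:

```typescript
import type { ScriptTracker } from "./script-tracker"; // path/name are assumptions

type Operations<T> = {
  read: () => AsyncGenerator<T[]>;       // required: yields chunks from the source
  transform?: (items: T[]) => T[];       // optional: pure per-chunk transformation
  write: (items: T[]) => Promise<void>;  // required: persists a chunk to the target
};

export class Migrator<T = Record<string, any>> {
  constructor(
    private readonly args: {
      scriptTracker: ScriptTracker;
      operations: Operations<T>;
      force?: boolean; // skip the "already executed" check
    }
  ) {}

  // The single public entry point.
  async run(): Promise<void> {
    const { scriptTracker, force } = this.args;
    if (!force && (await scriptTracker.isExecuted())) {
      console.log("Script already executed, skipping.");
      return;
    }
    await this.process();
    await scriptTracker.markExecuted();
    console.log("Script finished successfully.");
  }

  // Pulls chunks from read, applies the optional transform, hands them to write.
  private async process(): Promise<void> {
    const { read, transform, write } = this.args.operations;
    for await (const chunk of read()) {
      const items = transform ? transform(chunk) : chunk;
      await write(items);
      console.log(`Processed ${items.length} items`);
    }
  }
}
```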
This implementation is straightforward and flexible, allowing it to be used not only for migrations but also as a seeding tool. Additionally, the data source and target can be entirely different from each other.
The only question left is: How do you run it?
Before we dive into running the solution, we first need to decide how to compile and store it. There are several options available, and I’ll outline a few:
- We can build artifacts on a pre-push hook and push them to the repository (the `dist` folder, for example).
- We can build artifacts on a pre-push hook and store them in an S3 bucket.
- We can set up a Bitbucket pipeline or GitHub Actions workflow to handle builds and storage. Note that Bitbucket Pipelines only stores artifacts for 14 days, so you'd need to build them daily or on demand if you require longer retention.
There are plenty of other options as well, and you can choose the one that best fits your requirements.
To run it, I created my own CLI and published it in our private npm repository. There are several commands available, one of which is `migration run`. In my next article, I will discuss how to build your own CLI. For now, I will provide the code and the command.
The command looks like:
`npx @company/cli migration --list="migration-foo, migration-bar"`
and the code:
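The real handler lives in our private CLI, so here is a sketch of what it might look like; the names, paths, and argument parsing are my assumptions:

```typescript
// A sketch of the `migration run` handler (names and layout are assumptions).
import path from "path";

export async function runMigrations(list: string): Promise<void> {
  const names = list.split(",").map((name) => name.trim()).filter(Boolean);

  for (const name of names) {
    // Resolve the compiled script by name; the layout mirrors the scripts/ folder.
    const scriptPath = path.resolve("dist", "scripts", `${name}.js`);
    const { default: migrator } = await import(scriptPath);
    await migrator.run();
  }
}

// e.g. npx @company/cli migration --list="migration-foo, migration-bar"
const list = process.argv
  .find((arg) => arg.startsWith("--list="))
  ?.slice("--list=".length);

runMigrations(list ?? "").catch((error) => {
  console.error(error);
  process.exit(1);
});
```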
I hope the idea is clear: you simply provide a list of migration or seed script names, and the command looks up these files in a loop and then executes them. I’ve also omitted some details for clarity, but keep in mind that you’ll also need to provide additional arguments, such as the AWS token and environment.
Last but not least, if you appreciate my work, you can buy me a coffee or subscribe.