DEV Community

Amit Wagner


Keeping duplicated data between collections synced with mongodb-data-sync

MongoDB is undoubtedly a popular document database. It has become a top database choice for many web apps, especially those powered by Node.js. Scalability, efficiency, and speed are only some of its benefits. However, high read performance often depends on denormalization, and keeping denormalized data in order can be painful with MongoDB.

Why you should, in some cases, denormalize the data

Denormalization has many advantages, the most notable being improved read performance. Another is simpler read queries, because joins are no longer needed. If you've used MongoDB before, you probably already know that the only join it provides is a left outer join (through the $lookup aggregation stage), unlike relational databases, which offer several join types. This often forces you to write application-level joins that query the database multiple times and slow reads down. To overcome this, many programmers denormalize their data.
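To make the difference concrete, here is a minimal sketch that uses plain in-memory arrays as stand-ins for two collections. The collection names, field names, and data are hypothetical and only illustrate the shape of the two approaches; no database is involved.

```javascript
// In-memory stand-ins for two collections (hypothetical data, not a real database).
const users = [
  { _id: 1, name: "Dana" },
  { _id: 2, name: "Noam" },
];
const normalizedPosts = [
  { _id: 10, title: "Hello", authorId: 1 },
  { _id: 11, title: "Indexes", authorId: 2 },
];

// Normalized schema: an application-level join needs a second lookup
// (against a real database, a second query) to resolve the author names.
function postsWithAuthorsJoined(posts, userList) {
  const usersById = new Map(userList.map((u) => [u._id, u]));
  return posts.map((p) => {
    const author = usersById.get(p.authorId);
    return { title: p.title, authorName: author ? author.name : null };
  });
}

// Denormalized schema: the author's name is duplicated into each post,
// so a single read returns everything the page needs.
const denormalizedPosts = [
  { _id: 10, title: "Hello", author: { _id: 1, name: "Dana" } },
  { _id: 11, title: "Indexes", author: { _id: 2, name: "Noam" } },
];
function postsWithAuthorsDenormalized(posts) {
  return posts.map((p) => ({ title: p.title, authorName: p.author.name }));
}

const joined = postsWithAuthorsJoined(normalizedPosts, users);
const direct = postsWithAuthorsDenormalized(denormalizedPosts);
```

Both functions produce the same result, but the denormalized version never touches the users data at read time.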

Denormalization also makes it possible to index all the fields a query needs, because they live in the same collection, which enables faster searching and sorting.
However, it has its own downsides: you have to handle duplicated data and keep it consistent across collections. When you denormalize a MongoDB schema, you should expect less consistent results than with a normalized schema, unless something keeps the copies in sync.
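The consistency problem can be shown with a small in-memory sketch. The data and helper names below are hypothetical; the point is only that updating the source document does nothing to the copies embedded elsewhere.

```javascript
// Source of truth and a collection holding duplicated copies (hypothetical data).
const users = [{ _id: 1, name: "Dana" }];
const posts = [
  { _id: 10, title: "Hello", author: { _id: 1, name: "Dana" } },
  { _id: 11, title: "Again", author: { _id: 1, name: "Dana" } },
];

// Update only the source document, as a typical "edit profile" handler would.
function renameUser(userList, userId, newName) {
  const user = userList.find((u) => u._id === userId);
  if (user) user.name = newName;
}

// Return the ids of posts whose embedded author name no longer matches the source.
function findStaleCopies(userList, postList) {
  const usersById = new Map(userList.map((u) => [u._id, u]));
  return postList
    .filter((p) => {
      const source = usersById.get(p.author._id);
      return source !== undefined && source.name !== p.author.name;
    })
    .map((p) => p._id);
}

renameUser(users, 1, "Dana K.");
const staleIds = findStaleCopies(users, posts);
```

After the rename, every post that embeds the old name is stale until something propagates the change.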

The Solution

One possible solution is to create jobs that periodically sync the data so the copies become consistent again. Another is to update every affected collection yourself whenever the source data changes. Both are tedious and make denormalization discouraging. With mongodb-data-sync it is easier: you set it up once and no longer have to handle consistency by hand.

As an application grows, more relationships appear between collections, and they affect the application logic. When you denormalize the database, mongodb-data-sync lets you simply declare these dependencies. It then takes over the task of syncing the data, so you do not have to deal with the complexity yourself, and it aims to do so without hurting speed while keeping the data consistent.

Often, a tool that makes software easier to build adds overhead that can hurt the product over time. That is less of a concern here, because most of the checks before synchronization are done in memory. mongodb-data-sync also uses MongoDB's change streams, which let applications and libraries react to database changes in real time.
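As a rough illustration of the idea, the sketch below declares one dependency and decides, purely in memory, whether an update event requires any writes. It is not the mongodb-data-sync API: the declaration format and the affectedUpdates helper are assumptions made for this example. Only the event shape (operationType, ns, documentKey, updateDescription.updatedFields) follows MongoDB's change stream update events, which the Node.js driver delivers through collection.watch().

```javascript
// Hypothetical dependency declaration (NOT the mongodb-data-sync API):
// posts duplicate the user's name under "author.name".
const dependencies = [
  {
    dependentCollection: "posts",
    refCollection: "users",
    localField: "author._id", // field in posts that references users._id
    fields: { "author.name": "name" }, // dependent field path -> source field
  },
];

// Decide in memory which writes an update event requires.
// Simplification: only top-level source fields are matched.
function affectedUpdates(event, deps) {
  if (event.operationType !== "update") return [];
  const changed = event.updateDescription.updatedFields;
  const plan = [];
  for (const dep of deps) {
    if (dep.refCollection !== event.ns.coll) continue;
    const set = {};
    for (const [targetField, sourceField] of Object.entries(dep.fields)) {
      if (Object.prototype.hasOwnProperty.call(changed, sourceField)) {
        set[targetField] = changed[sourceField];
      }
    }
    // Skip the dependency entirely when no duplicated field changed.
    if (Object.keys(set).length > 0) {
      plan.push({
        collection: dep.dependentCollection,
        filter: { [dep.localField]: event.documentKey._id },
        update: { $set: set },
      });
    }
  }
  return plan;
}

// A change event shaped like the ones a change stream emits for an update.
const renameEvent = {
  operationType: "update",
  ns: { db: "app", coll: "users" },
  documentKey: { _id: 1 },
  updateDescription: { updatedFields: { name: "Dana K." }, removedFields: [] },
};
// An update that touches no duplicated field needs no sync work.
const unrelatedEvent = {
  operationType: "update",
  ns: { db: "app", coll: "users" },
  documentKey: { _id: 1 },
  updateDescription: { updatedFields: { lastLogin: "2020-01-01" }, removedFields: [] },
};

const renamePlan = affectedUpdates(renameEvent, dependencies);
const unrelatedPlan = affectedUpdates(unrelatedEvent, dependencies);
```

Each entry in the resulting plan could be passed to an updateMany call on the dependent collection. Events that touch no declared field produce an empty plan, which is where an in-memory check avoids unnecessary database work.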

Installing mongodb-data-sync

There are two parts to mongodb-data-sync: the engine and the SDK. To get the library working, you need to install these parts and set them up:

npm install mongodb-data-sync -g will install the engine
npm install mongodb-data-sync --save will install the SDK

You can check the package's npm page to learn more about how to use the library in your projects and handle your MongoDB dependencies gracefully.
