Manav Misra

MongoDB 1️⃣-Few/Many

Photo by Everton Vila on Unsplash

I love 💙 MongoDB and 💯 reach for it before SQL and its limited paradigms whenever feasible. No, I'm not saying NoSQL is 💯 > SQL, but the majority of the time it probably is, unless the data truly is independent and only exists because of relationships. However, that's a convo for another day.

A common challenge of using MongoDB (unlike with SQL) is the decision fatigue that comes with figuring out how to model relationships.

What Not To Do

One bad solution to this problem is to just use NoSQL like it's SQL.

Just put all the things into separate collections (a la SQL tables) and then use references (Mongoose populate or native $lookup) for all the things. In that case, just don't use NoSQL. 🙃

References

I'm putting my references at the top because my post really does little more than regurgitate 🤮 and remix what's already been said by official MongoDB folks.

  1. MongoDB Schema Design Best Practices
  2. MongoDB Schema Design Best Practices (Video)
  3. Schema Design Anti-Patterns - Part 1

You should review these for full insights. I am just summarizing and remixing.

This post just focuses on relationships, and not other considerations that are covered in the aforementioned resources ☝️.

TLDR

Information that tends to be viewed/accessed together should stay together. Ergo, favor embedding over references/populate/$lookup.

1️⃣-Few

Prefer embedding for one-to-few relationships.

What's 'few'? Up to a couple of hundred.

A user with a few different addresses and/or social media accounts. This is a 'one-to-few.'

It's reasonable that I would click on the user and want to view all of the details together at the same time.

I don't need to create a separate view in my front-end just to view the addresses. Just show me the user and all of the details including addresses in one view. This means it's one read to get what I want.

In addition, it's unlikely that I need to make frequent updates to a user's addresses.

Embed an array of address subdocuments into the user document. Do not make it a reference like you would in SQL.
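Here's a minimal sketch of that one-to-few shape. It uses plain Python dicts as a stand-in for a real collection, so no driver is needed, and the field names (`name`, `addresses`, `label`, `city`, `zip`) are made up for illustration.

```python
# Hypothetical user document with its addresses embedded (one-to-few).
user = {
    "_id": 1,
    "name": "Ada",
    "addresses": [
        {"label": "home", "city": "Chicago", "zip": "60601"},
        {"label": "work", "city": "Evanston", "zip": "60201"},
    ],
}

# Stand-in for a "users" collection: a dict keyed by _id.
users = {user["_id"]: user}


def get_user_profile(user_id):
    """One lookup returns the user and every address together."""
    return users[user_id]


profile = get_user_profile(1)
cities = [address["city"] for address in profile["addresses"]]
```

The point is that the profile view needs exactly one read. There is no second query and no join step to pull the addresses in.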

But......

Needing to access an object on its own is a compelling reason not to embed it.

Let's imagine 🚲s. An e-commerce site that sells products in addition to individual parts that make up said products.

Chances are, I don't need to click a link to a product and at the same time see a comprehensive listing of all of the associated part details. Instead, I might have links on the web view where I can click on a part number and then see details for a specific part.

It's also a reasonable use case that I might be shopping for 1️⃣ of the individual parts and not a product. In other words, I need to view a product in isolation and also view a part in isolation.

In this case, unlike a user's addresses, it makes sense for these document types to be presented by themselves.

Furthermore, prices change on both products and on individual parts.

Keep them as separate collections and do use references.
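A rough sketch of the referenced version, again with plain dicts standing in for two collections. The IDs, names, prices, and the `part_ids` field are all hypothetical, and `get_product_with_parts` only imitates what populate or $lookup would do for you.

```python
# Two separate "collections", each keyed by _id.
parts = {
    "P-100": {"_id": "P-100", "name": "Chain", "price": 25.00},
    "P-200": {"_id": "P-200", "name": "Saddle", "price": 40.00},
}
products = {
    "B-1": {
        "_id": "B-1",
        "name": "Road bike",
        "price": 900.00,
        "part_ids": ["P-100", "P-200"],  # references, not embedded copies
    },
}


def get_part(part_id):
    """A part can be viewed on its own, without touching products."""
    return parts[part_id]


def get_product_with_parts(product_id):
    """Resolve the references only when the combined view is wanted."""
    product = products[product_id]
    resolved = [parts[part_id] for part_id in product["part_ids"]]
    return {**product, "parts": resolved}


# A price change touches exactly one part document.
parts["P-100"]["price"] = 27.50
bike = get_product_with_parts("B-1")
```

Because the part lives in one place, the price update happens once and every product that references it sees the new value.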

1️⃣-Many

What's 'many'? More than a 'few': a few thousand, or, more specifically, an unbounded array. This is especially important due to MongoDB's 16MB document limit.

Just use references. Don't embed.
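To see why an unbounded array is a problem, here's some back-of-the-envelope arithmetic against the 16MB limit. The 300-byte average item size is purely an assumption for illustration, and real BSON adds its own overhead per field and per array element, so treat the result as a loose ceiling at best.

```python
# The document size limit, taken as 16 * 1024 * 1024 bytes.
MAX_DOC_BYTES = 16 * 1024 * 1024


def max_embedded_items(avg_item_bytes, base_doc_bytes=0):
    """Rough upper bound on how many embedded items fit in one document."""
    return (MAX_DOC_BYTES - base_doc_bytes) // avg_item_bytes


AVG_ITEM_BYTES = 300  # assumed average size of one embedded subdocument
ceiling = max_embedded_items(AVG_ITEM_BYTES)
```

With these assumed numbers, the ceiling lands in the tens of thousands of items. That sounds like a lot, but the array is unbounded, so it gets there eventually, and every read of the parent drags the whole array along well before that.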


There are also use cases for 1️⃣-Squillions mentioned in the aforementioned resources. Not covering that here.


Favor Embedding

If not sure, then favor embedding. Unlike SQL, it's 🆗 to have duplicate data, if needed.

Duplication is only an issue if the embedded data needs to be updated frequently. In that case, separate it out with references. But unless you are sure, favor embedding.

With the users and addresses scenario ☝️, a user doesn't frequently update their addresses (or do they? 👇🏾), social media, etc., so embed.

Frequent Updates and/or Isolated Views? References!

For 🚲 products/parts, prices of each might fluctuate somewhat frequently, and, as discussed previously, I probably don't need to look 👀 at all of that information together, anyway. They would be separate views. Use references.

It All Depends 😵‍💫

What post on development would be complete without these infamous words?

Think all the way back ☝️ to our example regarding users and addresses. The assumption was that users were the focal point of the application.

What if it's more about the addresses and who is living at what address instead? Say, for a housing complex with tenants.

What we might do is flip it 🙃. Embed the users into their addresses. As folks move in and out, we'll just be updating that embedded users array for an address document. 🆒.

The assumption here is that most of the reads pertain to addresses and not users by themselves. We also assume that the users' individual data doesn't need to be updated frequently.
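The flipped version might look something like this sketch. The address fields and tenant names are invented, and `move_in` / `move_out` are hypothetical helpers that mimic what `$push` and `$pull` updates would do to the embedded array.

```python
# Hypothetical address document with its tenants embedded.
address = {
    "_id": "unit-4B",
    "street": "12 Elm St",
    "tenants": [{"name": "Sam", "phone": "555-0100"}],
}


def move_in(address_doc, tenant):
    """Add a tenant to the embedded array (similar in spirit to $push)."""
    address_doc["tenants"].append(tenant)


def move_out(address_doc, tenant_name):
    """Drop a tenant from the embedded array (similar in spirit to $pull)."""
    address_doc["tenants"] = [
        tenant for tenant in address_doc["tenants"] if tenant["name"] != tenant_name
    ]


move_in(address, {"name": "Kim", "phone": "555-0199"})
move_out(address, "Sam")
current_tenants = [tenant["name"] for tenant in address["tenants"]]
```

Reading "who lives at unit-4B?" is still a single document read. Only the direction of the embedding changed.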

If this were not the case and individual users and addresses were read and updated about equally, then we should use references, as long as we are sure that there is a need. When in doubt, favor embedding.

Authors Of Pain 📚

Consider authors and books data. What is the app about? Is it an authors directory listing for publishers? Then embed books into authors. An author is not going to have an unbounded array situation where they are cranking out books like Tweets. We are assuming a couple of hundred here. Not several thousand.

Or, is it more about finding books for users/readers? Then, embed authors into books. Again, a book is not going to have an unbounded array of authors.

What if it's both? What if the site serves both audiences? I might want a view showing all of the authors with the embedded 📚, and also want to just browse books without worrying as much about the author.

Then, we should use a reference.
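One way to sketch the "serves both audiences" case: keep authors and books as separate collections and let each book carry its author IDs. Putting the reference on the book side is just my assumption here (it could live on the author instead), and all names and titles are placeholders.

```python
# Two separate "collections"; books reference authors by _id.
authors = {
    "a1": {"_id": "a1", "name": "Author One"},
    "a2": {"_id": "a2", "name": "Author Two"},
}
books = {
    "b1": {"_id": "b1", "title": "Book A", "author_ids": ["a1"]},
    "b2": {"_id": "b2", "title": "Book B", "author_ids": ["a1", "a2"]},
}


def books_by_author(author_id):
    """Author-centric view: every title that references this author."""
    return [
        book["title"] for book in books.values() if author_id in book["author_ids"]
    ]


def authors_of(book_id):
    """Book-centric view: resolve the author references for one book."""
    return [authors[author_id]["name"] for author_id in books[book_id]["author_ids"]]


titles_for_a1 = books_by_author("a1")
names_for_b2 = authors_of("b2")
```

Both views work off the same data, and neither document type has to carry a copy of the other.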

Unsure? Embed! In NoSQL, data schemas are flexible, and it's easier to fix a bad design decision later in the process. This is not so with SQL.

Hybrid Approaches

Still, favor embedding if not sure.

Use this in situations such as movies and reviews. Here, it doesn't make sense to access reviews outside of the context of a movie. I am also facing an unbounded array situation where a bunch of jerks online will all feel obligated to pour their thoughts out about some movie such that over time there are thousands and thousands of reviews. They're kind of like jerks that pour their thoughts out on MongoDB schemas! 😏

When I click on a movie, do I want to read all of that 💩? No! But...I can use a hybrid approach where I embed the last few reviews in the movie and keep an array of references to all of the rest. Best of both worlds!
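Here's a minimal sketch of that hybrid shape with plain dicts. The cap of 3, the field names (`recent_reviews`, `review_ids`), and the `add_review` helper are all assumptions for illustration. In real MongoDB, the capped array could probably be maintained with a `$push` that uses the `$each` and `$slice` modifiers.

```python
RECENT_LIMIT = 3  # assumed number of reviews to keep embedded

# Hypothetical movie document: a few embedded reviews plus references to all.
movie = {"_id": "m1", "title": "Some Movie", "recent_reviews": [], "review_ids": []}

# Stand-in for a separate "reviews" collection, keyed by _id.
reviews = {}


def add_review(movie_doc, review_id, text):
    """Store the full review separately; embed only the newest few."""
    reviews[review_id] = {"_id": review_id, "movie_id": movie_doc["_id"], "text": text}
    movie_doc["review_ids"].append(review_id)
    movie_doc["recent_reviews"].append({"_id": review_id, "text": text})
    # Keep only the last RECENT_LIMIT embedded reviews.
    movie_doc["recent_reviews"] = movie_doc["recent_reviews"][-RECENT_LIMIT:]


for number in range(1, 6):
    add_review(movie, f"r{number}", f"Review number {number}")

embedded_ids = [review["_id"] for review in movie["recent_reviews"]]
```

The movie page renders from one read with the latest few reviews, and the full pile only gets fetched if someone actually asks for it.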


What you use together, store together.
