While working on a MERN stack project, I came across a situation where I wanted to populate a field and also populate a field inside that populated field (I know it's confusing. Bear with me :p). I solved it and decided to share it with you all. Nice idea, isn't it? Let's get started then!
I assume you know the basics of Mongoose, MongoDB and Node.js. In this post, I will cover populate: what it is, how it works, and how to use it to populate documents in MongoDB.
What is Population?
Population is a way of automatically replacing a path in a document with actual documents from other collections, e.g. replacing the user id in a document with the data of that user. Mongoose has an awesome method, populate, to help us. We define refs in our schema and Mongoose uses those refs to look for documents in other collections.
Some points about populate:
- If no document is found to populate, the field will be null.
- In the case of an array of documents, if no documents are found, it will be an empty array.
- You can chain populate calls to populate multiple fields.
- If two populate calls populate the same field, the second one overrides the first.
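To make the first two points concrete, here is a minimal sketch in plain JavaScript that imitates the behaviour with in-memory Maps. It does not use Mongoose and is not how Mongoose is implemented; the helper names populateOne and populateMany and the sample data are made up for illustration.

```javascript
// In-memory stand-ins for two collections, keyed by id (hypothetical data).
const users = new Map([
  ["u1", { _id: "u1", name: "john doe" }],
]);
const blogs = new Map([
  ["b1", { _id: "b1", title: "how to do nothing" }],
]);

// Replace a single id with its document, or null when nothing matches.
function populateOne(id, collection) {
  return collection.has(id) ? collection.get(id) : null;
}

// Replace an array of ids with documents, dropping ids that match nothing.
function populateMany(ids, collection) {
  return ids
    .map((id) => collection.get(id))
    .filter((doc) => doc !== undefined);
}

console.log(populateOne("u1", users));            // the user document
console.log(populateOne("missing", users));       // null
console.log(populateMany(["b1", "gone"], blogs)); // only the b1 document
console.log(populateMany(["gone"], blogs));       // []
```

A single reference that cannot be resolved becomes null, while an array of references simply ends up shorter (or empty).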
First things first. We need an example to work with!
We are going to create 3 collections with 3 schemas:
- User
- Blog
- Comment
const mongoose = require('mongoose');
const Schema = mongoose.Schema;

const UserSchema = new Schema({
  name: String,
  email: String,
  blogs: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: "Blog"
  }]
});

const BlogSchema = new Schema({
  title: String,
  user: {
    type: mongoose.Schema.Types.ObjectId,
    ref: "User"
  },
  body: String,
  comments: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: "Comment"
  }]
});

const CommentSchema = new Schema({
  user: {
    type: mongoose.Schema.Types.ObjectId,
    ref: "User"
  },
  blog: {
    type: mongoose.Schema.Types.ObjectId,
    ref: "Blog"
  },
  body: String
});

// Model names must match the ref values used in the schemas above
const User = mongoose.model("User", UserSchema);
const Blog = mongoose.model("Blog", BlogSchema);
const Comment = mongoose.model("Comment", CommentSchema);

module.exports = { User, Blog, Comment };
These are minimal schemas with references to each other, which is all we need to use the populate method.
How Populate Works
Now let's see how populate works. I won't be writing the whole code, only the important parts.
Suppose you want a user by id along with their blogs. If you do it without populate, you will get the user document with an array of blog ids. But we want blog documents instead of ids!
So let's see how to do it.
// in your node js file
// the schema file exports an object, so pick the User model from it
const { User } = require("path/to/userSchema");

User
  .findOne({ _id: userId })
  .populate("blogs") // key to populate
  .then(user => {
    res.json(user);
  });

/*
OUTPUT:
{
  _id: userid, // obviously it will be id generated by mongo
  name: "john doe",
  email: "john@email.com",
  blogs: [
    {
      _id: blogid,
      title: "how to do nothing",
      body: "Interesting matter in the blog...",
      comments: [commentId_1, commentId_2]
    }
  ]
}
*/
Easy, right? Populate is awesome for joining documents like that. You will get the user with all blog documents in the blogs array.
But if you look at the output, you will notice the comments array is still full of comment ids instead of documents from comments. How do we populate them? Keep reading to find out...
Nested Populate in a document !
Let's see how to do nested populate in a query and populate comments in user blogs.
User
  .findOne({ _id: userId })
  .populate({
    path: "blogs", // populate blogs
    populate: {
      path: "comments" // in blogs, populate comments
    }
  })
  .then(user => {
    res.json(user);
  });

/*
OUTPUT:
{
  _id: userid, // obviously it will be id generated by mongo
  name: "john doe",
  email: "john@email.com",
  blogs: [
    {
      _id: blogid,
      title: "how to do nothing",
      body: "Interesting matter in the blog...",
      comments: [
        {
          user: userId,
          blog: blogId,
          body: "your blog is awesome !"
        }
      ]
    }
  ]
}
*/
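If the nesting feels abstract, the idea can be sketched as a small recursive function over in-memory data. This is only an illustration in plain JavaScript, not Mongoose's implementation: the db object, the populateDoc helper and the extra from key (naming the collection to read) are assumptions of this sketch.

```javascript
// Hypothetical in-memory collections, keyed by id.
const db = {
  blogs: new Map([
    ["b1", { _id: "b1", title: "how to do nothing", comments: ["c1"] }],
  ]),
  comments: new Map([
    ["c1", { _id: "c1", body: "your blog is awesome !" }],
  ]),
};

// spec mirrors the { path, populate } shape used above, plus a `from`
// key (an assumption of this sketch) naming the collection to read.
function populateDoc(doc, spec) {
  const ids = doc[spec.path] || [];
  const docs = ids
    .map((id) => db[spec.from].get(id))
    .filter((found) => found !== undefined)
    // Recurse into each populated document when a nested spec is given.
    .map((found) => (spec.populate ? populateDoc(found, spec.populate) : found));
  // Return a copy so the original document keeps its ids.
  return { ...doc, [spec.path]: docs };
}

const user = { _id: "u1", name: "john doe", blogs: ["b1"] };
const populated = populateDoc(user, {
  path: "blogs",
  from: "blogs",
  populate: { path: "comments", from: "comments" },
});

console.log(JSON.stringify(populated, null, 2));
```

Each level resolves its own ids first and then hands every resolved document to the next level, which is why the inner populate sits inside the outer one.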
That's all there is to nested populate. If you want to select specific fields while populating, you can use the select key to specify the fields inside an object.
// simple populate
User
  .findOne({ _id: userId })
  .populate("blogs", { title: 1 }) // get title only (Blog has no name field)

// nested populate
User
  .findOne({ _id: userId })
  .populate({
    path: "blogs",
    populate: {
      path: "comments",
      select: { body: 1 }
    }
  })
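To picture what a select like { body: 1 } leaves behind, here is a rough plain JavaScript imitation. The project helper is hypothetical and only mimics the result; it relies on one thing MongoDB does by default, which is keeping _id in an inclusion projection unless you exclude it.

```javascript
// Keep only the fields marked with 1, plus _id (kept by default in MongoDB
// inclusion projections unless explicitly excluded).
function project(doc, fields) {
  const out = { _id: doc._id };
  for (const [key, flag] of Object.entries(fields)) {
    if (flag === 1 && key in doc) out[key] = doc[key];
  }
  return out;
}

// Hypothetical comment document.
const comment = {
  _id: "c1",
  user: "u1",
  blog: "b1",
  body: "your blog is awesome !",
};

console.log(project(comment, { body: 1 }));
// → { _id: 'c1', body: 'your blog is awesome !' }
```

So each populated comment would carry only _id and body, and the user and blog ids would be left out.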
EXTRA : Populate After Save !?
Sometimes (rarely), you may want to populate a document after saving it to MongoDB. For example, you create a new comment and save it, but when you send it in the response you want to include the user info instead of just the user id.
// the schema file exports an object, so pick the Comment model from it
const { Comment } = require("/path/to/commentSchema");

let newComment = new Comment({
  user: userId,
  blog: blogId,
  body: "this is a new comment"
});

newComment.save().then(result => {
  // Model.populate fills paths on an already loaded document
  return Comment
    .populate(newComment, { path: "user" })
    .then(comment => {
      res.json({
        message: "Comment added",
        comment
      });
    });
});

/*
OUTPUT: Comment
{
  user: {
    _id: userid,
    name: "john doe",
    email: "johndoe@email.com",
    blogs: [blogId_1, blogId_2]
  },
  blog: blogId,
  body: "this is a new comment"
}
*/
Any suggestions are much appreciated.
Hope you found it useful and learned something new from it.
Happy Coding :)
If you want to learn MongoDB, do check out my Learn MongoDB Series
Top comments (55)
From the outside, this seems like basically creating JOIN functionality from an RDBMS. That's not a bad thing! It turns out that sometimes an RDBMS needs document-style records, and sometimes document DBs need relations :) Just wanted to make the comparison in case it helps anyone else (and to make sure I understand it correctly).
Correct!! And nice comparison. :)
Mongo newbie here 👋🏻
Is it possible to do findOne after the populate?
The query string I have available belongs to the referenced document. I would like to first populate it and then find the document I am looking for.
Otherwise I first need to do a findOne in one collection and pass the result to another collection's findOne.
I am trying to avoid an unnecessary findOne operation, but maybe I am overthinking it and it's okay to do multiple findOnes to reach a result?
Another solution that comes to mind is to make the query string I have available an indexed unique field so I can directly do findOne.
Great point you made there. I never tried that, but I guess you can give it a try. For cases like these, I tend to use aggregation.
Thank you!!! This information helped me a lot!! :)
You're welcome. 😊
I have a schema named Product, inside which I have added another schema named Category. I want to get all the products related to a particular category (_id) that is passed in. How do I do that? Please help.
I think you can simply use find on products and check for category using the $in operator. If it doesn't help, can you show a dummy schema and what you want in the result?
Thank you for great article.
But I have a concern, what if I use my custom Id:string as a key, what will the type be rather than mongoose.Schema.Types.ObjectId?
There is a concept called "Virtuals". I came across it when I was thinking about the same thing. I can only give you a hint about it because I have not learned or applied them. In virtuals, you define the conditions for the virtual: 'localField' and 'foreignField'. Local field refers to the name of the field in your current schema. Foreign field is the name of the field in the other schema that refers to your current schema. Then you use populate normally.
There may be some mistakes in the above example, but I hope it helps you understand the basics at least.
Maybe once I get time to study, I will write about it as well. :)
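The localField / foreignField matching described above can be pictured with a small plain JavaScript sketch. It does not use Mongoose and is not how Mongoose implements virtuals; the populateVirtual helper, the from and as option names, and the sample authors and posts are all hypothetical.

```javascript
// Hypothetical data: authors use a custom string id instead of an ObjectId.
const authors = [{ customId: "author-1", name: "john doe" }];
const posts = [
  { title: "how to do nothing", authorId: "author-1" },
  { title: "orphan post", authorId: "author-9" },
];

// For every doc, attach the documents from `from` whose foreignField
// equals the doc's localField. The result is stored under `as`.
function populateVirtual(docs, { from, localField, foreignField, as }) {
  return docs.map((doc) => ({
    ...doc,
    [as]: from.filter((other) => other[foreignField] === doc[localField]),
  }));
}

const result = populateVirtual(posts, {
  from: authors,
  localField: "authorId",
  foreignField: "customId",
  as: "author",
});

console.log(JSON.stringify(result, null, 2));
```

The point is only that the join key does not have to be _id: any pair of fields with equal values can link the two documents.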
Thanks, that's helpful. But the "yahoo" could be anything when we call .populate("field name of Authors"). Some things are still not clear. However, I got around it by modifying the data type of the _id field.
Good... things are not 100% clear to me as well, but I am sure after trying and testing we can understand it better :)
Hello Paras, I wanted to know how you can find or filter items based on the populated items.
Hi Neural. I don't think we can find or filter items based on populated items. It may also slow down the query.
I would say go for aggregation, because in aggregation we have better control over how we structure and filter data in different steps. Populate is good for simple to intermediate scenarios, but aggregation is more helpful when you need to handle slightly complex to advanced scenarios.
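As a rough sketch of what that could look like, here is a pipeline that joins blogs with their author and then filters on the author's name. It is only an outline under assumptions: the blogsByAuthorName helper is made up, the field names follow the Blog schema from the article, and "users" assumes the user documents live in a collection with that name.

```javascript
// Build a pipeline that returns blogs whose author has the given name.
// "users" is an assumed collection name for the user documents.
function blogsByAuthorName(name) {
  return [
    // Join each blog with the user document its `user` field points to.
    { $lookup: { from: "users", localField: "user", foreignField: "_id", as: "user" } },
    // $lookup yields an array; flatten it to a single embedded user.
    { $unwind: "$user" },
    // Filter on a field of the joined document.
    { $match: { "user.name": name } },
  ];
}

const pipeline = blogsByAuthorName("john doe");
console.log(JSON.stringify(pipeline, null, 2));

// With the Blog model from the article, this would be run as:
// Blog.aggregate(pipeline).then(blogs => res.json(blogs));
```

Because the $match stage runs after the join, it can filter on fields of the joined user, which is the part populate alone does not give you.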
Thank you for the helpful explanation. You have explained everything clearly and in an easy way. I wish you good luck!
I am glad you found it helpful. Thank you as well :)
You too Good Luck !!
For anyone that wants more info, you can read more here. mongoosejs.com/docs/tutorials/virt...
Thanks @paras594 for the article, really helpful
Welcome and thanks for the reference !! You made the article more useful :)
Thank you for great article.
Welcome ❤️
Hi Paras,
This was a nice read. I am going to implement this for my MEAN project. A Couple of questions.
I have a Log collection, which has 4 ObjectId type fields. i.e. client, service, executive, manager - of course apart from its own fields. And when on frontend, the Log is displayed, I am planning to populate all these 4 (i.e. Client Name, Category, Sub Category, Rating), (Service Name, Service Freq), ExecName and ManagerName.
The Log display on frontend has Search and filter options on various fields. So when a user searches or applies filters the backend Mongoose query will run again.
What would be the performance impact if the number of entries in the Log collection is in the range of 5,000 to 50,000?
For this type of case, would there be any other option for fetching the related data, or is this the best option?
Populate is not bad. It is optimised. But there is another way. If you know about the aggregation framework in MongoDB, there is a $lookup stage that you can add to a pipeline. I am still learning about the aggregation framework. I have read that $lookup is faster than populate. Try to find out more about it.
Just know one thing: populate makes a second call to the database. So in your case 4 populates = 4 calls.
Or, you can do a live test :p... run the query with populate and check its performance, then run the query with a $lookup pipeline and compare. (MongoDB has ways to check query performance.)
I hope it helps you. All the best for your project :)
Thanks Paras for your quick response. It surely helps.
You are welcome sir !!
$lookup cannot be sharded and so it limits your scaling, super annoying constraint because I loved this stage until I spotted that.
Okayy. thank you !! I didn't know that :)
I still have to learn more about sharding. Never tried it.
Does populate use a find() query behind the scenes, and if it does, will it also trigger all the pre query middlewares of the model whose documents are used to populate?
Suppose .populate('comments')
will it also run the pre query middlewares of the ref of comments?
Yes, we can say it makes a call to the find method behind the scenes. But I don't know about pre query middlewares. You have to try it once to see the outcome. Maybe it will run the pre query middlewares. Never really tried :)
I have another question. We also have the option to select certain fields while using populate, like
.populate({path: 'comments', select: 'content name'}). Do you have any idea whether it fetches only the selected fields, or fetches the whole documents and then selects the fields?
Honestly, I don't have an idea about this one. But I can say that returning whole docs is easier because it doesn't have to convert them to partial ones and can return BSON data directly. So selecting specific fields adds an overhead. Don't let it discourage you from using select in populate or find.
Awesome . I knew this In django , Finally Got one for Node 🤘
Great bro
Very much helpful
Thank you !!! :)