Codefari.com
var duplicatesIds = [];
db.Employee.aggregate([
    {
        $group: {
            _id: {
                EmpId: "$EmpId"
            },
            dups: {
                "$addToSet": "$_id"
            },
            count: {
                "$sum": 1
            }
        }
    }, {
        $match: {
            count: {
                "$gt": 1
            }
        }
    }
], {
    allowDiskUse: true
}).forEach(function (doc) {
    doc.dups.shift();
    doc.dups.forEach(function (dupId) {
        duplicatesIds.push(dupId);
    });
});
printjson(duplicatesIds);
db.Employee.remove({_id: {$in: duplicatesIds}});
db.Employee.find();
Let's walk through the query above step by step.
1- var duplicatesIds = []: Declares the array into which the query pushes the IDs of the duplicate documents.
2- {$group:{_id:{EmpId:"$EmpId"},dups:{"$addToSet":"$_id"},count:{"$sum":1}}}: This stage groups the documents by EmpId. For each group, $addToSet collects the _id values of its documents into an array named "dups", and count:{"$sum":1} counts how many documents share that EmpId.
3- {$match:{count:{"$gt":1}}}: This stage keeps only the groups whose count is greater than 1, that is, the EmpId values that occur in more than one document.
4- forEach: Iterates over the resulting groups one by one. Each group carries the array of _id values of the documents that share an EmpId, for example
"dups" : [
ObjectId("5e5f5d20cad2677f9f839327"),
ObjectId("5e5f5d27cad2677f9f839328"),
ObjectId("5e5f5cf8cad2677f9f839323")
].
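5- doc.dups.shift(): Removes the first _id from the array so that this document is not deleted. In other words, every duplicate except one is deleted. Because $addToSet does not guarantee the order of the array, which document is kept is not specified.
6- doc.dups.forEach(function (dupId) {...}): Iterates over the remaining _id values and pushes each one (duplicatesIds.push(dupId)) onto the duplicatesIds array declared above.
7- db.Employee.find(): Fetches the records that remain after printjson(duplicatesIds) has printed the collected IDs and db.Employee.remove({_id:{$in:duplicatesIds}}) has deleted those documents.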
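The steps above can be tried without a database. The following is a minimal sketch in plain JavaScript that mimics the group, match, and shift logic on an in-memory array; it does not call MongoDB. The sample documents, the numeric _id values, and the helper name collectDuplicateIds are assumptions made for illustration only. Unlike $addToSet, the array here keeps insertion order, so the first document seen for each EmpId is the one that survives.

```javascript
// Plain-JavaScript sketch of the pipeline logic; no MongoDB connection is used.
// collectDuplicateIds and the sample data below are hypothetical.
function collectDuplicateIds(docs) {
  // Mimics $group: one entry per EmpId, holding the _id values and a count.
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.EmpId)) {
      groups.set(doc.EmpId, { dups: [], count: 0 });
    }
    const group = groups.get(doc.EmpId);
    group.dups.push(doc._id); // _id values are unique, so push acts like $addToSet here
    group.count += 1;
  }

  const duplicatesIds = [];
  for (const group of groups.values()) {
    // Mimics $match: only EmpId values that occur more than once.
    if (group.count > 1) {
      group.dups.shift(); // drop one _id so that document is kept
      for (const dupId of group.dups) {
        duplicatesIds.push(dupId);
      }
    }
  }
  return duplicatesIds;
}

// Hypothetical sample collection: EmpId 101 appears three times.
const employees = [
  { _id: 1, EmpId: 101 },
  { _id: 2, EmpId: 102 },
  { _id: 3, EmpId: 101 },
  { _id: 4, EmpId: 103 },
  { _id: 5, EmpId: 101 },
];

const duplicatesIds = collectDuplicateIds(employees);

// Mimics remove({_id: {$in: duplicatesIds}}) followed by find().
const remaining = employees.filter((doc) => !duplicatesIds.includes(doc._id));

console.log(duplicatesIds); // [ 3, 5 ]
console.log(remaining.map((doc) => doc._id)); // [ 1, 2, 4 ]
```

Running the sketch with Node.js shows that two of the three documents with EmpId 101 are collected for deletion, and one document per EmpId remains.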
Finally, execute the MongoDB query above; db.Employee.find() will then return only one document for each EmpId.
For more details, visit codefari.com.