DEV Community 👩‍💻👨‍💻

Dilip Kumar Singh
Dilip Kumar Singh

Posted on • Originally published at codefari.com

MongoDB Query: Remove duplicate records from the collection except one

Codefari.com
var duplicatesIds = [];
db.Employee.aggregate([
{
$group: {
_id: {
EmpId: "$EmpId"
},
dups: {
"$addToSet": "$_id"
},
count: {
"$sum": 1
}
}
}, {
$match: {
count: {
"$gt": 1
}
}
}
], {
allowDiskUse: true
}).forEach(function (doc) {
doc.dups.shift();
doc.dups.forEach(function (dupId) {
duplicatesIds.push(dupId);
})
});
printjson(duplicatesIds);

db.Employee.remove({_id:{$in:duplicatesIds}})
db.Employee.find();

Now we will do an analysis of the above-written query.
1- var duplicatesIds = []: This is an array declaration where this query will push the duplicate IDs.

2-{$group:{_id:{EmpId:"$EmpId"},dups:{"$addToSet":"$_id"} ,count:{"$sum":1}}}: Here we are grouping the records on behalf of EmpId, and using $addToSet command, we can create an array "dups", and count:{"$sum":1} is counting the duplicate records.

3- {$match:{count:{"$gt":1}}}: Here we are filtering the records that have a count greater than 1. As the above group pipeline, we are counting the duplicate records on behalf of EmpId.

4- ForEach: we are iterating records one by one here which are grouped EmpId, here we will find the array of duplicate records, for example
"dups" : [
ObjectId("5e5f5d20cad2677f9f839327"),
ObjectId("5e5f5d27cad2677f9f839328"),
ObjectId("5e5f5cf8cad2677f9f839323")
].

5- doc.dups.shift():Here we are removing one record which will not be deleted, and It means we will delete the duplicates except one document.

6- doc.dups.forEach(function (dupId): here again, we are iterating the array to push (duplicatesIds.push(dupId)) it records (duplicatesIds)on the above-declared array.

7- db.Employee.find(): to fetch the records.
Now finally execute the above MongoDB query, and you will find the following records.

For more details follow the codefari.com

Top comments (0)

18 Useful Github Repositories Every Developer Should Bookmark

>> Check out this classic DEV post <<