DEV Community

Dilip Kumar Singh


MongoDB Query: Remove duplicate records from the collection except one
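var duplicatesIds = [];
db.Employee.aggregate([{
    // Group documents by EmpId, collecting their _id values and counting them
    $group: {
        _id: { EmpId: "$EmpId" },
        dups: { "$addToSet": "$_id" },
        count: { "$sum": 1 }
    }
}, {
    // Keep only the groups that contain more than one document
    $match: {
        count: { "$gt": 1 }
    }
}], {
    allowDiskUse: true
}).forEach(function (doc) {
    doc.dups.shift(); // keep one document per EmpId
    doc.dups.forEach(function (dupId) {
        duplicatesIds.push(dupId); // collect the rest for deletion
    });
});
// Delete the collected duplicates, then list what remains
db.Employee.deleteMany({ _id: { $in: duplicatesIds } });
db.Employee.find();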


Now let us analyze the query written above.
1- var duplicatesIds = []: This declares the array into which the query pushes the _id values of the duplicate records.

2- {$group:{_id:{EmpId:"$EmpId"},dups:{"$addToSet":"$_id"},count:{"$sum":1}}}: Here we group the records by EmpId. The $addToSet accumulator builds the array "dups" from the _id values in each group, and count:{"$sum":1} counts the records in each group.

3- {$match:{count:{"$gt":1}}}: Here we keep only the groups whose count is greater than 1, that is, the EmpId values that occur in more than one record. The count comes from the $group stage above.

4- forEach: Here we iterate one by one over the groups, one group per duplicated EmpId. Each group carries the array of _id values of its duplicate records, for example
"dups" : [

5- doc.dups.shift(): This removes the first _id from the array so that its record is not deleted. In other words, we delete all the duplicates except one document.

6- doc.dups.forEach(function (dupId): Here we iterate over the remaining _id values and push each one (duplicatesIds.push(dupId)) onto the duplicatesIds array declared above.

7- db.Employee.find(): This fetches the records that remain in the collection.
Now execute the MongoDB query above; db.Employee.find() should then return only one record for each EmpId.
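
The steps above can be traced without a database. The following is a minimal sketch in plain JavaScript that emulates the grouping, filtering, and collecting logic on an in-memory array. The sample data and the helper names findDuplicateGroups and collectDuplicateIds are hypothetical and only for illustration; they are not part of the MongoDB API.

```javascript
// Hypothetical sample data standing in for the Employee collection.
const employees = [
  { _id: "a1", EmpId: 101 },
  { _id: "b1", EmpId: 102 },
  { _id: "a2", EmpId: 101 },
  { _id: "c1", EmpId: 103 },
  { _id: "a3", EmpId: 101 },
  { _id: "b2", EmpId: 102 },
];

// Steps 2 and 3: emulate $group (by EmpId) followed by $match (count > 1).
function findDuplicateGroups(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.EmpId)) {
      groups.set(doc.EmpId, { _id: { EmpId: doc.EmpId }, dups: [], count: 0 });
    }
    const group = groups.get(doc.EmpId);
    group.dups.push(doc._id); // _id is unique, so push acts like $addToSet here
    group.count += 1; // equivalent of count: { $sum: 1 }
  }
  return [...groups.values()].filter((group) => group.count > 1);
}

// Steps 4 to 6: drop one _id per group (the survivor) and collect the rest.
function collectDuplicateIds(groups) {
  const duplicatesIds = [];
  groups.forEach(function (doc) {
    const ids = doc.dups.slice(); // copy, so the input groups are not mutated
    ids.shift(); // the first _id is kept in the collection
    ids.forEach(function (dupId) {
      duplicatesIds.push(dupId);
    });
  });
  return duplicatesIds;
}

const duplicateGroups = findDuplicateGroups(employees);
const duplicatesIds = collectDuplicateIds(duplicateGroups);

// Emulate the delete: keep every document whose _id was not collected.
const remaining = employees.filter((doc) => !duplicatesIds.includes(doc._id));

console.log(duplicatesIds); // a2, a3 and b2 are collected for deletion
console.log(remaining.map((doc) => doc._id)); // a1, b1 and c1 remain
```

In this sketch the surviving record is always the first one encountered for each EmpId. In the real query, MongoDB does not guarantee the order of the array built by $addToSet, so which duplicate survives should be treated as arbitrary.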

