DEV Community

Removing duplicates in an Array of Objects in JS with Sets

Marina Mosti on February 04, 2019

The other day at work I was faced with what I think is a rather common problem when dealing with data coming from an API. I was getting from my as...
 
Matt

Without getting into cryptic one-liners, there's a pretty straightforward linear-time solution as well.


const seen = new Set();
const arr = [
  { id: 1, name: "test1" },
  { id: 2, name: "test2" },
  { id: 2, name: "test3" },
  { id: 3, name: "test4" },
  { id: 4, name: "test5" },
  { id: 5, name: "test6" },
  { id: 5, name: "test7" },
  { id: 6, name: "test8" }
];

const filteredArr = arr.filter(el => {
  const duplicate = seen.has(el.id);
  seen.add(el.id);
  return !duplicate;
});

Marina Mosti

Hey Matt, nice solution! Yeah, all roads lead to Rome :)

ajv

Well... yes and no.

I feel it's important to point out that the OP's solution loops over the whole array twice. I know there are times to make a trade-off between performance and readability, but I don't feel this needs to be one of those times :)

spiritupbro

wow crazy matt
thanks for this

John Fewell

Great answer, thank you Matt!

Tony Childs • Edited

Here's another possibility using the Map class constructor and values method:

const arr = [
  { id: 1, name: "test1" },
  { id: 2, name: "test2" },
  { id: 2, name: "test3" },
  { id: 3, name: "test4" },
  { id: 4, name: "test5" },
  { id: 5, name: "test6" },
  { id: 5, name: "test7" },
  { id: 6, name: "test8" }
];

const uniqueObjects = [...new Map(arr.map(item => [item.id, item])).values()]

Corfitz • Edited

There is a difference between Tony's and Matt's approaches in what the final array will look like.

Matt's approach adds the id of each entry it loops through to a Set and checks whether it has been 'seen' before, returning the object only if it has not. So for ID: 2, Matt will return the object with name: "test2", as it considers the next object a duplicate and skips it.

Tony's approach creates a new Map using the ID as the key, which has to be unique, and then extracts the values, e.g. 1 => { id: 1, name: "test1" }, 2 => { id: 2, name: "test2" }, etc. What this means, though, is that even though id: 2 has already been added to the map, it is simply overwritten by the third item in the array, so Tony will return name: "test3" for ID: 2.

Just keep in mind whether you want the first or the last object with a duplicated identifier to be the source of truth.
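
To make the first-versus-last distinction concrete, here is a small sketch that runs both approaches on a shortened version of the sample array. The names keepFirst and keepLast are only illustrative and do not come from either comment.

```javascript
const arr = [
  { id: 1, name: "test1" },
  { id: 2, name: "test2" },
  { id: 2, name: "test3" },
  { id: 3, name: "test4" }
];

// First occurrence wins: filter with a Set of ids already seen.
const seen = new Set();
const keepFirst = arr.filter(el => {
  if (seen.has(el.id)) return false;
  seen.add(el.id);
  return true;
});

// Last occurrence wins: later entries overwrite earlier ones in the Map.
const keepLast = [...new Map(arr.map(el => [el.id, el])).values()];

console.log(keepFirst.find(el => el.id === 2).name); // "test2"
console.log(keepLast.find(el => el.id === 2).name);  // "test3"
```

Both results contain three objects; they only differ in which of the two id: 2 objects survives.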

alanwong9179

I had to create an account to say thank you! This answer saved my day.

Mayank Soni

Hi Tony. Your answer helped me, thanks!
BTW, can you please help me achieve this array?
[
{ id: 1, name: "test1" },
{ id: 2, name: "test2" },
{ id: 2, name: "test3" },
{ id: 3, name: "test2" }
]

I mean that even if the 'id' or the 'name' already exists, the object should not be omitted as long as either value is different (as in the case of 'name: "test2"') in the whole array.

Corfitz

Hey @mayanxoni ,

Might be late with a reply here, but in your case I would probably map over your array of objects as stringified content, since you are not relying on a single identifier but on the entire object.

NB: This might not be performant when you are dealing with bigger objects and large arrays.

const arr = [
{ id: 1, name: "test1" },
{ id: 2, name: "test2" },
{ id: 2, name: "test2" },
{ id: 2, name: "test2" },
{ id: 2, name: "test3" },
{ id: 3, name: "test2" }
];

const newArray = Array.from(new Set(arr.map(el => JSON.stringify(el)))).map(el => JSON.parse(el));
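
One caveat with the stringify approach is that JSON.stringify is sensitive to property order, so two objects with the same content but differently ordered keys produce different strings and both survive. The sketch below shows this and one possible workaround for flat objects; stableKey is a hypothetical helper, not something from the comment above.

```javascript
const items = [
  { id: 2, name: "test2" },
  { name: "test2", id: 2 }, // same content, different key order
  { id: 2, name: "test3" }
];

// Plain stringify sees the first two objects as different strings.
const naive = Array.from(new Set(items.map(el => JSON.stringify(el))))
  .map(el => JSON.parse(el));

// Hypothetical helper: serialize with keys sorted (flat objects only).
const stableKey = obj =>
  JSON.stringify(Object.keys(obj).sort().map(key => [key, obj[key]]));

// Keep the first object seen for each normalized key.
const byKey = new Map();
for (const el of items) {
  const key = stableKey(el);
  if (!byKey.has(key)) byKey.set(key, el);
}
const normalized = [...byKey.values()];

console.log(naive.length);      // 3
console.log(normalized.length); // 2
```

Nested objects would need the same normalization applied recursively, which this sketch does not attempt.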
 
Tony Childs

Hi Mayank,

I'm not sure I follow. If no duplicates are removed, then that is just the original array, is it not? Or do you mean you want an array with just ids 1, 2, and 3? If so, you can use Array.prototype.filter and only return true for the ids you want to keep.

brian

Great

Sam Clark

wow, i love that

marlon zayro arias v

Thanks !!

Jonas Winzen

Interesting challenge. I have three very similar solutions. They are all based on the same principle of reducing the array into a key-value structure and re-creating the array from the values only.

Approach 1: Classical Reducer

(reducer maintains immutability)

/**
 * classic reducer
 **/
const uniqByProp = prop => arr =>
  Object.values(
    arr.reduce(
      (acc, item) =>
        item && item[prop]
          ? { ...acc, [item[prop]]: item } // only include items whose prop is truthy
          : acc,
      {}
    )
  );

// usage:

const uniqueById = uniqByProp("id");

const unifiedArray = uniqueById(arrayWithDuplicates);


Depending on your array size, this approach can easily become a bottleneck in your app, because the spread copies the whole accumulator on every iteration. Mutating the accumulator object directly in the reducer is more performant.

Approach 2: Reducer with object-mutation

/**
 * using object mutation
 **/
const uniqByProp = prop => arr =>
  Object.values(
    arr.reduce(
      (acc, item) => (
        item && item[prop] && (acc[item[prop]] = item), acc
      ), // using object mutation (faster)
      {}
    )
  );

// usage (same as above):

const uniqueById = uniqByProp("id");

const unifiedArray = uniqueById(arrayWithDuplicates);


The larger your input array, the more performance you gain from the second approach. In my benchmark (an input array of length 500 with a duplicate-element probability of 0.5), the second approach is ~440x as fast as the first approach.

Approach 3: Using ES6 Map

My favorite approach uses a Map instead of an object to accumulate the elements. This has the advantage of preserving the ordering of the original array:

/**
 * using ES6 Map
 **/
const uniqByProp_map = prop => arr =>
  Array.from(
    arr
      .reduce(
        (acc, item) => (
          item && item[prop] && acc.set(item[prop], item),
          acc
        ), // using map (preserves ordering)
        new Map()
      )
      .values()
  );

// usage (same as above, but with the Map-based function):

const uniqueById = uniqByProp_map("id");

const unifiedArray = uniqueById(arrayWithDuplicates);


Using the same benchmark conditions as above, this approach is ~2x as fast as the second approach and ~900x as fast as the first approach.
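
The ordering claim for Approach 3 can be checked with a short sketch. Plain objects iterate integer-like keys in ascending numeric order, while a Map iterates keys in the order they were first inserted. The sample data and the names viaObject and viaMap are made up for illustration.

```javascript
const arr = [
  { id: 30, name: "c" },
  { id: 4, name: "a" },
  { id: 30, name: "c2" },
  { id: 15, name: "b" }
];

// Plain object: integer-like keys come back in ascending numeric order.
const viaObject = Object.values(
  arr.reduce((acc, item) => {
    acc[item.id] = item;
    return acc;
  }, {})
);

// Map: keys come back in the order they were first inserted.
const viaMap = Array.from(
  arr.reduce((acc, item) => acc.set(item.id, item), new Map()).values()
);

console.log(viaObject.map(item => item.id)); // [4, 15, 30]
console.log(viaMap.map(item => item.id));    // [30, 4, 15]
```

Note that in the Map version the id: 30 entry keeps its original position but holds the last value written for that key (name: "c2").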

Conclusion

Even though all three approaches look quite similar, they have surprisingly different performance characteristics.

You'll find the benchmarks I used here: jsperf.com/uniq-by-prop

Drozerah • Edited

Hi! Another one-line path to Rome, from France:


arr = arr.filter((power, toThe, yellowVests) => yellowVests.map(updateDemocracy => updateDemocracy['id']).indexOf(power['id']) === toThe)

console.log(arr)

Marina Mosti

Merci! :)

ArthurCat

More JS but too slow.

Itachi Uchiha

I also have this solution O.o

const dupAddress = [
    {
        id: 1,
        name: 'Istanbul'
    },
    {
        id: 2,
        name: 'Kocaeli'
    },
    {
        id: 3,
        name: 'Ankara'
    },
    {
        id: 1,
        name: 'Istanbul'
    }
]

let addresses = [...new Set([...dupAddress.map(address => dupAddress[address.id])])]

console.log(addresses)

But this only works with address.id; it doesn't work with address.name.

Why doesn't it work like this?

let addresses = [...new Set([...dupAddress.map(address => dupAddress[address.name])])]

Marina Mosti

Well, you're passing address.id as an index into the dupAddress array, and that's just not going to work because id !== index. Try changing it to address.id or address.name without accessing the array.
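
A short sketch of what actually happens with that data may help. With ids, the lookup only appears to work because the ids happen to be valid array positions; with names, every lookup is undefined. The uniqueBy helper is hypothetical and simply reuses the Map technique from earlier in the thread.

```javascript
const dupAddress = [
  { id: 1, name: "Istanbul" },
  { id: 2, name: "Kocaeli" },
  { id: 3, name: "Ankara" },
  { id: 1, name: "Istanbul" }
];

// id used as an array index: id 1 picks the element at position 1, and so on.
const byIndex = [...new Set(dupAddress.map(address => dupAddress[address.id]))];

// name used as an array index: arrays have no such property, so every lookup is undefined.
const byName = [...new Set(dupAddress.map(address => dupAddress[address.name]))];

// Hypothetical helper: key a Map on the property value itself.
const uniqueBy = (list, prop) =>
  [...new Map(list.map(item => [item[prop], item])).values()];

console.log(byIndex.map(a => a.name)); // ["Kocaeli", "Ankara", "Istanbul"]
console.log(byName);                   // [undefined]
console.log(uniqueBy(dupAddress, "name").map(a => a.name)); // ["Istanbul", "Kocaeli", "Ankara"]
```

Mapping to address.id or address.name directly, as suggested above, gives a list of unique ids or names rather than unique objects, which may or may not be what is wanted.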

Itachi Uchiha • Edited

Okay, I tried it and it didn't work; I was actually wondering why. Thanks.

waqas
let addresses = [...new Set([...dupAddress.map(address => dupAddress[address.name])])]

hasan

const seen = new Set(); // questionId._id values already encountered
const singleEle = [];   // first item found for each distinct questionId._id
const arr = [
  { id: 1, questionId: { _id: "5e2016a1560d8c2aa842e65d" } },
  { id: 1, questionId: { _id: "5e1c211cc201f33834e7baf1" } },
  { id: 1, questionId: { _id: "5e201733560d8c2aa842e65e" } }
];
arr.forEach(item => {
  if (!seen.has(item.questionId._id)) {
    seen.add(item.questionId._id);
    singleEle.push(item);
  }
});

Johnny Reina

I ran into this issue recently, and though I've always used the same code to find distinct primitives (before we had the Set object), this code required me to adhere to the C# API, where you pass in a comparison function T -> T -> boolean. This solution felt relatively clean, though obviously it does not run in linear time.
github.com/jreina/ShittyLINQ.js/bl...
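
As a rough illustration of the comparator-based idea (not the code behind the link, which is not reproduced here), a distinct function that takes an equality callback could look like the sketch below. The name distinctWith is an assumption. Each item is compared against everything kept so far, which is why the running time is quadratic rather than linear.

```javascript
// Hypothetical helper: keep an item only if no already-kept item equals it.
function distinctWith(list, equals) {
  const kept = [];
  for (const item of list) {
    if (!kept.some(other => equals(other, item))) kept.push(item);
  }
  return kept;
}

const users = [
  { id: 1, name: "test1" },
  { id: 2, name: "test2" },
  { id: 2, name: "test3" },
  { id: 3, name: "test4" }
];

const distinctUsers = distinctWith(users, (a, b) => a.id === b.id);
console.log(distinctUsers.map(u => u.name)); // ["test1", "test2", "test4"]
```

The advantage over a Set or Map keyed on one property is that the comparator can express any notion of equality, at the cost of the extra comparisons.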

Aissa0347 • Edited

Thanks for sharing, I also found this helpful:

const arr = [
  { id: 1, name: "test1" },
  { id: 2, name: "test2" },
  { id: 2, name: "test3" },
  { id: 3, name: "test4" },
  { id: 4, name: "test5" },
  { id: 5, name: "test6" },
  { id: 5, name: "test7" },
  { id: 6, name: "test8" }
];
const filteredArr = arr.filter((currentUser, index) => {
  // keep an item only if it is the first one in arr with this id
  return arr.findIndex((user) => user.id === currentUser.id) === index;
});

JuanDouek

Other solutions...

Leave last appearance:

const arr = [
    { id: 1, name: "test1" },
    { id: 2, name: "test2" },
    { id: 2, name: "test3" },
    { id: 2, name: "test4" },
    { id: 3, name: "test5" },
    { id: 4, name: "test5" }
];

const by_id = {};

for (const item of arr) by_id[item.id] = item;

const uniques = Object.values(by_id);

Leave first appearance:

const arr = [
    { id: 1, name: "test1" },
    { id: 2, name: "test2" },
    { id: 2, name: "test3" },
    { id: 2, name: "test4" },
    { id: 3, name: "test5" },
    { id: 4, name: "test5" }
];

const by_id = {};

for (const item of arr)
    if(!by_id[item.id]) by_id[item.id] = item;

const uniques = Object.values(by_id);

Remove duplicated names also:

const arr = [
    { id: 1, name: "test1" },
    { id: 2, name: "test2" },
    { id: 2, name: "test3" },
    { id: 2, name: "test4" },
    { id: 3, name: "test5" },
    { id: 4, name: "test5" }
];

const by_id = {};
const by_name = {};

for (const item of arr)
    if(!by_id[item.id] && !by_name[item.name])
    {
        by_id[item.id] = item;
        by_name[item.name] = 1;
    }

const uniques = Object.values(by_id);

DuaneQ

Thanks for the code snippet, Marina... I'm getting two errors when I attempt to use it. The first is "Set is only referred to a type but is being used as a value here".

When I use the "REDUCE" example, I get the following in the console:

Maximum call stack size exceeded
at Array.reduce

feldmanovitch

Man, I created an account here just to say thanks. So... thanks! :D

I came here because ChatGPT suggested a filter-based solution that did not work, since filter considers two objects unequal when they live in different spots in memory, even though they contain the same values.

You saved me a lot of headaches.
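
The behaviour described here comes from JavaScript comparing objects by reference. The exact code ChatGPT suggested is not shown in the comment, so the sketch below only demonstrates the underlying comparison rules with made-up variables a and b.

```javascript
const a = { id: 1, name: "test1" };
const b = { id: 1, name: "test1" };

console.log(a === b);                    // false: two distinct objects
console.log([a, b].indexOf(b));          // 1: indexOf also compares by reference
console.log(new Set([a, b]).size);       // 2: a Set of objects keeps both
console.log(new Set([a.id, b.id]).size); // 1: primitives compare by value
```

This is why the solutions in this thread deduplicate on a primitive key such as id rather than on the objects themselves.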

Laura Pereyra

I have two lists of objects and I need to remove all duplicates, something like this:

INPUTS
var array1 = [(1, 'banana', 'yellow'), (1, 'apple', 'red'), (1, 'orange', 'orange')];
var array2 = [(1, 'banana', 'yellow'), (1, 'apple', 'red'), (2, 'grapes', 'purple')];

OUTPUT
array1 = [(1, 'orange', 'orange')]

I tried using filter but it doesn't work for objects :(
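
One possible way to get that output, sketched under the assumption that each tuple is really an object with id, name and color fields (the question does not give field names, so these are made up): build a string key from the fields that define equality, collect the keys of array2 in a Set, and keep only the items of array1 whose key is absent.

```javascript
// Field names (id, name, color) are assumed; the question only shows tuples.
const array1 = [
  { id: 1, name: "banana", color: "yellow" },
  { id: 1, name: "apple", color: "red" },
  { id: 1, name: "orange", color: "orange" }
];
const array2 = [
  { id: 1, name: "banana", color: "yellow" },
  { id: 1, name: "apple", color: "red" },
  { id: 2, name: "grapes", color: "purple" }
];

// Build a comparable string from the fields that define equality.
const keyOf = item => JSON.stringify([item.id, item.name, item.color]);

const inArray2 = new Set(array2.map(keyOf));
const onlyInArray1 = array1.filter(item => !inArray2.has(keyOf(item)));

console.log(onlyInArray1); // [{ id: 1, name: "orange", color: "orange" }]
```

filter itself works fine on objects; what fails is comparing the objects directly, which is why the sketch compares derived string keys instead.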

Fraser Kemp

What does this bit do? I'm not sure I understand it: acc.concat([current])

Marina Mosti

Array.prototype.concat is a way to concatenate two arrays into a new one.
developer.mozilla.org/en-US/docs/W...
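
As a small sketch of what acc.concat([current]) does inside a reducer: it returns a new array with current appended and leaves the accumulator untouched. The reducer below is an assumed shape for illustration, not a copy of the one in the article.

```javascript
const before = [{ id: 1 }];
const current = { id: 2 };

const after = before.concat([current]);
console.log(after.length);  // 2
console.log(before.length); // 1, concat does not modify the original array

// Assumed reducer shape: append current only if its id is not in acc yet.
const arr = [{ id: 1 }, { id: 2 }, { id: 2 }, { id: 3 }];
const unique = arr.reduce(
  (acc, item) =>
    acc.some(existing => existing.id === item.id) ? acc : acc.concat([item]),
  []
);
console.log(unique.map(item => item.id)); // [1, 2, 3]
```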

Fahim Uddin

Thanks for the solution.

brian

Thanks for this post

Chris Routy • Edited

Thank you @marinamosti , you saved my life xD

Seyed Sina

The array reduce solution saved my day!
Thanks, Marina

Marina Mosti

Welcome! :D

Carchuli • Edited

Thanks! This is exactly what I was looking for

Sam Clark

Really enjoyed this article and the rad discussion!!!!!!

yukoliesh

Thanks so much for your article. It saved and made my day. :)

Jamie Perkins

This is great! Thanks