Remove Duplicates With Set() - Full-Stop

Adam Nathaniel Davis (bytebodger) ・3 min read

This will probably seem like a trivial and overly simplistic post. But I've seen the wrong answer posted so frequently - even right here on Dev.to - that it's honestly starting to get me kinda annoyed.

Everyone who wants to throw their hat in the ring as some kinda JS "mentor" decides to crank out an article about de-duping arrays. On the surface, it kinda makes sense. Because de-duping an array is a common coding task given to people during whiteboard interviews.

It's one of those questions that seems downright silly for a seasoned dev. But more junior types might struggle with it. More importantly, there are (theoretically) many different ways that you could attack the problem. Thus, it could potentially give the evaluator a great chance to "see" how you think.

So this should be a great chance to write up an awesome how-to article, right? You can display your epic dev knowledge and maybe help some noobs while you burnish your reputation... right??

Unfortunately, I see sooooo many articles where the proposed approach is just downright wrong. I'm not claiming that other proposed solutions don't work. I'm saying that other proposed solutions "work" - but they're still very, very wrong.




Use Set() - It's That Simple

Let's have a long, detailed, crazy-intense discussion about all the various ways we could de-dupe an array. Or... let's not. That doesn't sound like much fun at all. Instead, let's just use Set() - and then go on to something far more productive.

I've lost track of how many times I've seen - on this site, on Stack Overflow, on Medium, on... wherever - someone claiming to show you how to de-dupe an array. And then they start pulling out a ridiculous mess of .reduce(), .filter(), .map(), .forEach(), or .whateverArrayPrototypeInterestsMeToday().

Stop. It.

Right now. Just... stop it.

Here's your dead-simple rule for de-duping JS arrays:

If the task is only to de-dupe a JS array containing scalar values, you should be reaching for Set(). Every. Single. Time.


Honestly, this is a fairly narrow use-case in real-life code. I rarely find myself with an array for which I only need to de-dupe it. More often than not, there's some combination of tricks I need to perform on the array. And in those cases, it's perfectly natural (and correct) to start reaching for all of those Array.prototype functions. But if you truly just need to de-dupe an array then, for the love of god, man, just use Set(). Period.



Secret Knowledge(???)

This really shouldn't be any kind of "secret knowledge". Set() isn't a hidden "trick". But most devs rarely use Set() and know little about it. The fact is that I rarely use it myself. But I know what it does. Specifically, I know that it has three very cool features:

  1. Each element in a set must be unique. To be more specific, a Set() will not allow you to create duplicate elements.

  2. It can be initialized with an array (or any other iterable) - and if that array contains duplicates, the duplicates will be ignored.

  3. You can use the spread operator to convert it back into a plain-ol' array.

When you combine all these features, you have a powerful (and dead-simple) de-duping tool. It looks like this:

const theDupes = [6,7,8,6,7,2,5,6,7,8,3,6,8,7,2,5];
const noDupes = [...new Set(theDupes)]; // [6,7,8,2,5,3]
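If you want to see each of those three features on its own, here's a minimal sketch. The `dedupe` helper name is made up for this example (it's not from any library); everything else is plain built-in `Set` behavior.

```javascript
// Feature 1: a Set silently ignores values it already holds.
const seen = new Set();
seen.add('a');
seen.add('a');
console.log(seen.size); // 1

// Feature 2: the constructor accepts any iterable, such as an array.
const fromArray = new Set([1, 1, 2, 3, 3]);
console.log(fromArray.size); // 3

// Feature 3: spread turns the Set back into an array, in insertion order.
const backToArray = [...fromArray];
console.log(backToArray); // [1, 2, 3]

// Hypothetical helper (name invented for this sketch) wrapping the one-liner.
const dedupe = (values) => [...new Set(values)];
console.log(dedupe(['b', 'a', 'b', 'c', 'a'])); // ['b', 'a', 'c']

// Set compares with SameValueZero rather than ===, so NaN matches NaN
// and -0 is stored as 0.
console.log(dedupe([NaN, NaN, 0, -0])); // [NaN, 0]
```

Note that the result keeps the order in which each value first appeared, which is usually what you want from a de-dupe.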

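Maybe you're thinking that this isn't often recommended due to performance?? Umm... no. A Set checks whether it already holds a value with a hash-style lookup instead of a scan, so the whole de-dupe is a single pass over the array. The popular .filter() + .indexOf() combo, by contrast, rescans the array for every single element. This technique typically beats every other approach, and on large arrays the comparison is rarely even close.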

De-duping with Set() takes fewer LoC than anything in those other tutorials. It's typically far faster, too. So why would anyone suggest using something else??
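If you'd rather measure than take my word for it, here's a rough sketch you can run yourself. The helper names and the sample data are invented for this example, and the timings depend entirely on your machine and JS engine, so no numbers are claimed here.

```javascript
// Rough, machine-dependent timing sketch; helper names are made up for this example.
const dedupeWithSet = (values) => [...new Set(values)];

// A commonly posted alternative: indexOf rescans the array for every element.
const dedupeWithFilter = (values) =>
  values.filter((value, index) => values.indexOf(value) === index);

// 20,000 integers drawn from 0..4999, so there are plenty of duplicates.
const sample = Array.from({ length: 20000 }, (_, i) => (i * 7919) % 5000);

// Runs fn on the sample, logs the elapsed wall-clock time, and returns the result.
const time = (label, fn) => {
  const start = Date.now();
  const result = fn(sample);
  console.log(`${label}: ${Date.now() - start} ms, ${result.length} unique values`);
  return result;
};

const viaSet = time('Set', dedupeWithSet);
const viaFilter = time('filter + indexOf', dedupeWithFilter);

// Both approaches produce the same values in the same order.
console.log(JSON.stringify(viaSet) === JSON.stringify(viaFilter)); // true
```

Bump the array length up or down to see how the gap changes with input size.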

The only valid excuse I can muster is if you're not using Babel and your code needs to run on IE. But if you're not using Babel and your code needs to run on IE, you have a lot bigger problems than de-duping arrays.

So the next time you see someone hawking a "How To De-Dupe An Array In JavaScript" tutorial, and they don't use Set(), please do us all a favor - and tell them to get the hell off the interwebs, before they hurt themselves or others.

Discussion (6)

Usman Khalil (monfernape)

Does that work with array of objects, having similar properties?

Vesa Piittinen (merri)

No.

new Set([{ a: 1 }, { a: 1 }]).size
// -> 2

Adam Nathaniel Davis (bytebodger) • Author • Edited

As Vesa demonstrated, this only works on scalar values. The reason is pretty simple:

console.log({a: 1} === {a: 1}); // false

You could kinda sorta get around this using the old JSON.stringify() hack:

const set1 = new Set([{ a: 1 }, { a: 1 }]);
console.log(set1.size); // 2

const set2 = new Set([JSON.stringify({ a: 1 }), JSON.stringify({ a: 1 })]);
console.log(set2.size); // 1

const set3 = new Set([['a'], ['a']]);
console.log(set3.size); // 2

const set4 = new Set([JSON.stringify(['a']), JSON.stringify(['a'])]);
console.log(set4.size); // 1

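That only "works" if the keys/elements are in the same order, though. It also breaks on anything JSON.stringify() can't faithfully represent - e.g., if the objects/arrays contain functions, which get silently dropped from objects and turned into null inside arrays. And keep in mind that the resulting Set holds strings, so you'd need JSON.parse() to get objects back.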
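To make those caveats concrete, here's a minimal sketch that wraps the hack in a function. The `dedupeViaJson` name is invented for this example, and the function is only meant to illustrate the limitations, not to be a recommended utility.

```javascript
// Hypothetical helper wrapping the JSON.stringify() hack:
// serialize each item, de-dupe the strings with a Set, then parse them back.
// The results are fresh copies, not the original objects.
const dedupeViaJson = (items) =>
  [...new Set(items.map((item) => JSON.stringify(item)))].map((text) => JSON.parse(text));

console.log(dedupeViaJson([{ a: 1 }, { a: 1 }, ['a'], ['a']])); // [{ a: 1 }, ['a']]

// Caveat 1: key order changes the string, so these are NOT treated as duplicates.
console.log(dedupeViaJson([{ a: 1, b: 2 }, { b: 2, a: 1 }]).length); // 2

// Caveat 2: function-valued properties are silently dropped during serialization,
// so two different objects collapse into one.
console.log(dedupeViaJson([{ f: () => 1 }, { f: () => 2 }]).length); // 1
```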

Abhijit Hota (kretaceous)

Came here to comment that. @bytebodger you might want to add that as a catch.

Because this particular gotcha has been a pain for me earlier.

Adam Nathaniel Davis (bytebodger) • Author • Edited

I did edit the original article slightly to indicate that it works with scalar (primitive) values. But I hadn't really thought of this as much of a "gotcha", because two objects that look the same... are not the same.

FWIW, this approach actually works with arrays and objects, if they are truly the same array/object. In other words, you can do something like this:

const anObject = {one: 'uno', two: 'dos'};
const anotherObject = anObject;
const anArray = [anObject, anotherObject];
const noDupes = [...new Set(anArray)];

And the resulting noDupes array will only contain one object. But if you do this:

const anObject = {one: 'uno', two: 'dos'};
const anotherObject = {one: 'uno', two: 'dos'};
const anArray = [anObject, anotherObject];
const noDupes = [...new Set(anArray)];

noDupes will contain two objects. Because, even though anObject and anotherObject look the same to our eye, they are not the same value.
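If you do want look-alike objects treated as duplicates, one option is to de-dupe on a key you choose yourself and let a Set track the keys. This is only a sketch: the `dedupeBy` helper and the `|`-joined key are assumptions made for this example, and you'd pick whatever key actually defines "the same" for your data.

```javascript
// Hypothetical helper: keep the first item seen for each key returned by keyOf.
const dedupeBy = (items, keyOf) => {
  const seenKeys = new Set();
  return items.filter((item) => {
    const key = keyOf(item);
    if (seenKeys.has(key)) return false; // already kept an item with this key
    seenKeys.add(key);
    return true;
  });
};

const anObject = { one: 'uno', two: 'dos' };
const anotherObject = { one: 'uno', two: 'dos' };

// The key here is a primitive string, so the Set can compare it by value.
const noDupes = dedupeBy([anObject, anotherObject], (item) => `${item.one}|${item.two}`);

console.log(noDupes.length); // 1
console.log(noDupes[0] === anObject); // true (the first occurrence is the one kept)
```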

Kelly Stannard (kwstannard)

I do this with Ruby too.