Will Harris

JavaScript: How to Remove Duplicate Values from Arrays

Originally posted on Will's blog


In a previous post we saw how to determine whether a JavaScript array contains duplicate values. Today, I want to show a few different methods I've found for removing duplicate values from an array.

Using the Array.prototype.filter() & Array.prototype.indexOf() methods

let originalArray = [1, 2, 3, 4, 1, 2, 3, 4]

let uniqueArray = originalArray.filter((item, index, array) => {
  return array.indexOf(item) === index
})

// uniqueArray === [1, 2, 3, 4]

The basic strategy here is to iterate through originalArray and check whether the index of the item we are currently examining is the same as the index of that item's first occurrence in originalArray.

Because indexOf returns the first index it finds for a given value, the check only passes for the first occurrence of each value; for any duplicate, indexOf points back at an earlier index!
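For example, with originalArray above:

originalArray.indexOf(1) // 0, the index of the first occurrence
// The 1 at index 0 passes the check (0 === 0),
// but the duplicate 1 at index 4 fails it (0 !== 4) and is filtered out.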

Note that this method is not the most efficient: it executes in quadratic time, since indexOf re-scans the array for every item that filter visits. So if the arrays you're checking are very large, you may want to use a different method.

Another thing worth noting is that we can use the same method to return only the duplicate values by inverting the comparison:

let duplicateArray = originalArray.filter((item, index, array) => {
  return array.indexOf(item) !== index
})

Using Array.prototype.reduce() & Array.prototype.includes()

let originalArray = [1, 2, 3, 4, 1, 2, 3, 4]

let uniqueArray = originalArray.reduce((unique, item) => {
  return unique.includes(item) ? unique : [...unique, item]
}, [])

// uniqueArray === [1, 2, 3, 4]

Here the strategy is to keep a running list of the unique items in our reducer function's 'accumulator' (unique). For each item in originalArray we check to see if the accumulator includes the item under consideration.

  • If it does contain the item, return the accumulator without making any changes, in effect 'skipping over' that item.
  • If it does not contain the item, spread the values in the accumulator into a new array, and add the item under consideration.

Array.prototype.includes returns a boolean value: true if the value is found in the array, false if not. This boolean value drives our conditional, determining what to do with each value.
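For example:

[1, 2, 3].includes(2) // true
[1, 2, 3].includes(5) // false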

I find this approach less intuitive and harder to read, but it works.

Also note that the empty array passed in after the reducer function is the starting value for the accumulator, so on the first pass through the reducer, unique is an empty array.
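To make that concrete, here is how unique evolves over the passes:

// unique = []           item = 1 -> [1]
// unique = [1]          item = 2 -> [1, 2]
// unique = [1, 2]       item = 3 -> [1, 2, 3]
// unique = [1, 2, 3]    item = 4 -> [1, 2, 3, 4]
// unique = [1, 2, 3, 4] item = 1 -> [1, 2, 3, 4] (skipped)
// ...and so on for the remaining duplicates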

⚡ Using the ES6 Set object ⚡

let originalArray = [1, 2, 3, 4, 1, 2, 3, 4]

let uniqueArray = [...new Set(originalArray)]

// or

let uniqueArray = Array.from(new Set(originalArray))

// uniqueArray === [1, 2, 3, 4]

This approach harnesses the power of the Set object, introduced in ES6.

Sets are guaranteed to preserve the insertion order of their items and to only contain unique values. It is therefore by definition impossible for a Set to contain duplicates!

Here we call the Set object's constructor, passing it the array we'd like to construct a Set from. Then, once we've trimmed out all the duplicates and stored the remaining values in our Set, we convert back to an array and return the result.
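If you log the intermediate Set, you can see the duplicates are already gone before we convert back:

let intermediateSet = new Set(originalArray)
// Set(4) { 1, 2, 3, 4 }: duplicates are dropped on insertion

let backToArray = [...intermediateSet]
// backToArray === [1, 2, 3, 4]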

I've seen some discussion of this approach being a bit less performant if the array under consideration is very large and contains many duplicate values. However, the same discussion found that this approach is very efficient in a scenario where the data has very few duplicates.

Personally I think the conciseness of this last approach is enough of a benefit to warrant using the Set object approach, unless there's a compelling performance reason not to.

Top comments (18)

Mike Talbot ⭐

Surely Set must be faster - the whole thing is an O(N) operation with Set, but it's getting towards O(N**2) with anything that has to scan like includes or filter... (The reduce version is better than O(N**2) as it is only having to test the currently created unique list)

I decided to see just how different it is. In this example on jsperf, you can see that the "reduce" version is around 180x slower and the indexOf version even worse at almost 380x slower. Also, the version of "reduce" I've used is slightly faster than the one shown in the post, because it doesn't make a new array on each loop and just modifies the one being created.
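A mutating reduce along those lines might look something like this (a sketch; the exact jsperf code may differ):

let uniqueArray = originalArray.reduce((unique, item) => {
  // Push into the existing accumulator instead of spreading into a new array
  if (!unique.includes(item)) unique.push(item)
  return unique
}, [])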

Will Harris

Thanks for throwing together that jsperf test Mike!

Set does appear to be much faster, which is fantastic because the syntax is so much easier to grok (for me, at least).

Mike Talbot ⭐

Yes, it's nice when good syntax also has great performance, haha! That's not something I normally find! (Immutability, for example, has nicer syntax but comes with a performance penalty.)

Evan Kennedy

The approach below is O(n) and should work without ES6. The issue with some of the approaches above is that they use Array.indexOf or Array.includes, which is basically another loop through the array. You take a performance hit when the searched values are located near the end of the array. The only downside to the method below is that it's memory intensive for large arrays; you're essentially trading memory for performance.

const originalArray = [1, 2, 3, 4, 1, 2, 3, 4];

let key = {};

const uniqueArray = originalArray.filter((item) => {
  // Already seen this value: drop it
  if (key[item]) return false;

  // First time seeing this value: record it and keep it
  key[item] = true;
  return true;
});
willsmart

A similar function using Set is

function withoutDuplicates(inputArray) {
  const trackerSet = new Set();
  return inputArray.filter(
    // add returns the Set instance, which is truthy. You may wish to
    // use a more explicit truthy return.
    v => !trackerSet.has(v) && trackerSet.add(v)
  )
}

These have the main advantage over Array.from(new Set(inputArray)) in that there's no need to iterate the Set after filling it with values, but I've done a little profiling and my snippet has pretty much the same characteristics and timing as Array.from(new Set(inputArray)) (just a little bit slower), even for very large arrays.

tbh, I think the big win is that they aren't leaning on JS Set's latent insertion-order stability, making the code more WYSIWYG, and more portable since in many other C-style languages Sets don't have that guarantee.

Will Harris

Thanks Evan! That's also a great way to approach the problem.

James McConnon

Awesome article. This is the first time I'm seeing Sets. I'm also glad to read in the comments below that their runtime is better than expected, because to me they're more readable. I also think a lot of the array prototype functions can make code harder to read in the long run. Assuming you know a bit of background on Sets, that last example is so concise!

That being said, could you explain how the ... spread operator works with it? Trying to get my head around that at the moment.

Will Harris

Thanks James!

Because the Set object implements the iterable protocol, we can use the ES6 spread syntax to expand the values from the Set into an array.

It really is doing the same thing as the second part of that example, where I called Array.from().
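Both produce the same result:

let set = new Set([1, 2, 3, 4])

let viaSpread = [...set]       // [1, 2, 3, 4]
let viaFrom = Array.from(set)  // [1, 2, 3, 4]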

James McConnon

Thank you, that makes sense. The spread operator has been a tough one to learn for me. I always feel like I kind of have an idea of what it is doing in one situation, but struggle to use it myself.

SavagePixie

Sets don't come with the same prototype functions as arrays. They have a forEach method, but no map or filter, for example (I think). The spread operator and square brackets there are simply extracting the values from the Set and turning them into an array.
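A quick check in the console bears that out:

let set = new Set([1, 2, 3])

typeof set.forEach // "function"
typeof set.map     // "undefined"
typeof set.filter  // "undefined"

// Convert to an array first if you need map or filter:
let doubled = [...set].map(n => n * 2) // [2, 4, 6]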

Felipe Pontes

Set works great when you are dealing with primitive values. But if you are working with some list of items (e.g. list of comment authors in this post), you're better off using a key mapping approach:

const mapping = {}
const unique = array.filter((item) => {
  // Assumes each item has a unique id to key on
  if (item.id in mapping) return false
  mapping[item.id] = true
  return true
})
Michael R.

Is there any way to count how many times the duplicate object(s) appears in the array before eliminating said duplicates using this method? Asking for a noob trying to solve an annoying kata. :)

Reference link

miggu

What is quadratic time?

Kyle Stephens
jesuscovam

too much coolness in one function

shahid
caoshouse

let originalArray = [1, 2, 3, 4, 1, 2, 3, 4];
for (var i = 0, obj = {}; i < originalArray.length; i++) {
  obj[originalArray[i]] = null
}
let noDuplicates = Object.keys(obj)
// Note: Object.keys returns strings, so noDuplicates is ['1', '2', '3', '4']

Mark Voorberg

Same using .reduce()...

let myArray = [1, 2, 2, "hello world", 4, 5, 5, {moo: "cow"}];
let myObject = myArray.reduce((acc, x) => { acc[x] = 0; return acc }, {});
let noDuplicates = Object.keys(myObject);
// Note: object keys are always strings, so the numbers come back as strings
// and the object literal is collapsed to the single key "[object Object]"