Svein Petter Gjøby

Posted on Dec 8, 2019

Removing duplicates from an array

#webdev #javascript

Originally posted on javascript.christmas

Never underestimate the importance of code quality. As a developer it is key to clearly communicate the implementation of any solution you are working on through readable code. Knowing more than one way to solve a given problem can help you write more readable code. Let's look at three different ways to remove duplicate primitive values from an array.

const array = [1, 1, 1, 3, 3, 2, 2];

// Method 1: Using a Set
const uniqueWithSet = [...new Set(array)];

// Method 2: Array.prototype.reduce
const uniqueWithReduce = array.reduce((result, element) => {
  return result.includes(element) ? result : [...result, element];
}, []);

// Method 3: Array.prototype.filter
const uniqueWithFilter = array.filter((element, index) => {
  return array.indexOf(element) === index;
});

Set

A Set is an object that lets you store unique values. Repeated calls of Set.add(value) with the same value don’t do anything.

const uniqueNames = new Set();

uniqueNames.add("Dasher"); // {"Dasher"}
uniqueNames.add("Dasher"); // {"Dasher"}

A Set cannot contain duplicate values. By passing the array to a Set and then using the spread operator to transform the Set back into an array, we can remove duplicate elements from the array.

const array = [1, 1, 1, 3, 3, 2, 2];

const uniqueSet = new Set(array); // {1, 3, 2}

const uniqueArray = [...uniqueSet]; // [1, 3, 2]
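It may help to see what a Set counts as "the same value". The sketch below assumes the standard behaviour: primitives are compared by value (with NaN treated as equal to itself), while objects are compared by reference. The variable names are made up for illustration.

```javascript
// Primitives are compared by value; 1 and "1" are different values.
const mixed = [1, "1", 1, NaN, NaN, "a", "a"];
const uniqueMixed = [...new Set(mixed)]; // [1, "1", NaN, "a"]

// Objects are compared by reference, so structurally equal objects both stay.
const objects = [{ id: 1 }, { id: 1 }];
const uniqueObjects = [...new Set(objects)]; // still two elements

// The same reference repeated twice is a duplicate.
const rudolph = { name: "Rudolph" };
const uniqueRefs = [...new Set([rudolph, rudolph])]; // one element
```

This is why the article limits itself to primitive values: deduplicating objects by their contents needs a different strategy.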

Reduce

The reduce method executes a reducer function (provided by you) on each element of the array, resulting in a single output value. The value returned from a reducer function is assigned to the accumulator, which is passed as the first argument of the subsequent execution of the reducer function and ultimately becomes the final resulting value.

To remove duplicate elements from an array, we can provide a function that checks if the accumulated array includes the current element. If not, we add the current element to the array. We also pass an empty array as the initial value of the accumulator.

const array = [1, 1, 1, 3, 3, 2, 2];

const reducerFunction = (result, element) => {
  return result.includes(element) ? result : [...result, element];
};

// The empty array is the initial value of the accumulator.
const unique = array.reduce(reducerFunction, []);
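To make the accumulator easier to follow, here is a small sketch that records it after every step. Note that reduce needs an empty array as its second argument here: without an initial value, the first element (the number 1) would become the accumulator, and calling includes on a number throws a TypeError. The names `numbers`, `steps` and `uniqueNumbers` are only for illustration.

```javascript
const numbers = [1, 1, 1, 3, 3, 2, 2];

// Record the accumulator after every step to see how the result grows.
const steps = [];
const uniqueNumbers = numbers.reduce((result, element) => {
  const next = result.includes(element) ? result : [...result, element];
  steps.push(next);
  return next;
}, []); // the empty array is the initial accumulator

console.log(steps);
// [[1], [1], [1], [1, 3], [1, 3], [1, 3, 2], [1, 3, 2]]
console.log(uniqueNumbers); // [1, 3, 2]
```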

Filter

The key to understanding this method is understanding how indexOf and filter work.

indexOf returns the first index of a given element in an array.
filter creates a new array with all the elements that pass a test. You provide the test as the first argument of filter.

If we combine these two methods, by providing a test that checks if each element is the first occurrence of the given element in the array, we can remove duplicate elements from arrays.

const isFirst = (element, index) => {
  // Checks if a given element is the first occurrence of it.
  return array.indexOf(element) === index;
}

const unique = array.filter(isFirst);
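The test function does not have to rely on a variable from the surrounding scope. A minimal sketch, assuming we use the third argument that filter passes to its callback (the array being filtered); `isFirstOccurrence` and the sample data are made-up names:

```javascript
// The callback receives (element, index, array), so it needs no outer variable.
const isFirstOccurrence = (element, index, source) =>
  source.indexOf(element) === index;

const reindeer = ["Dasher", "Dancer", "Dasher", "Vixen", "Dancer"];
const uniqueReindeer = reindeer.filter(isFirstOccurrence);
console.log(uniqueReindeer); // ["Dasher", "Dancer", "Vixen"]

// Caveat: indexOf uses strict equality, so it never finds NaN.
const withNaN = [NaN, 1, NaN].filter(isFirstOccurrence);
console.log(withNaN); // [1]
```

The last two lines show one difference from the Set approach: with filter and indexOf, NaN values are dropped entirely rather than kept once.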

Which method should I choose?

We saw three different methods to remove duplicate elements from an array. It's easy to imagine a fourth method that would improve readability even further: creating a proposal to add Array.prototype.unique to ECMAScript.

In terms of readability and performance, I prefer the first method. With a Set, your code is short, performant and easy to understand.
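Until something like that exists, a tiny helper gets close. The sketch below uses a hypothetical `uniqueValues` function (not a built-in) and checks that all three methods agree on the example array. On performance: a Set looks up values in roughly constant time on average, while includes and indexOf scan the array each time, so the reduce and filter versions do more work as the array grows.

```javascript
// Hypothetical helper, not a built-in: wraps the Set approach behind a name.
const uniqueValues = (iterable) => [...new Set(iterable)];

const sample = [1, 1, 1, 3, 3, 2, 2];

const viaSet = uniqueValues(sample);
const viaReduce = sample.reduce(
  (result, element) => (result.includes(element) ? result : [...result, element]),
  []
);
const viaFilter = sample.filter(
  (element, index, source) => source.indexOf(element) === index
);

console.log(viaSet, viaReduce, viaFilter); // each is [1, 3, 2]
```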

Top comments (9)

Charles Assunção • Dec 8 '19 • Edited

Nice article. :)

I guess on a daily basis I would go with the Set approach for simplicity. I just wanted to add that in an interview I would probably try to come up with an O(n) solution, so something like:

function removeDuplicates(arr){
    let map = {};

    return arr.reduce((acc, curr) => {
        if(!map[curr]){
            map[curr] = true;
            return [...acc, curr];
        } else {
            return acc;
        }
    }, []);
}
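A few things to watch in the snippet above: spreading the accumulator copies it every time a new value is added, a plain object turns its keys into strings (so 1 and "1" count as the same value), and inherited keys such as "constructor" already look truthy on an empty object. A sketch of a variant that avoids these, assuming a Set for the lookup and push for the result; `removeDuplicatesLinear` is a name made up for this example:

```javascript
function removeDuplicatesLinear(arr) {
  const seen = new Set(); // remembers values already emitted, without string coercion
  const result = [];

  for (const value of arr) {
    if (!seen.has(value)) {
      seen.add(value);
      result.push(value); // push avoids copying the accumulator on every step
    }
  }
  return result;
}

console.log(removeDuplicatesLinear([1, "1", 1, 3, 3, 2, 2])); // [1, "1", 3, 2]
```

The result array is local to the function, so the input array is left untouched.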

Oliver Anteros • Dec 8 '19 • Edited

const removeDouplicate = (arr) => 
    Object.keys(
        arr.reduce((res, val) => ({
            ...res, 
            [val]: true
        }), {})
    );

Regarding comments about returning a string array independent of the input data:

const removeDouplicate = (arr) => 
    Object.entries(
        arr.reduce((res, val) => ({
            ...res, 
            [val]: val
        }), {})
    ).map(([_, val]) => val);

Regarding comments about creating N number of objects:

const removeDouplicate = (arr) => 
    Object.entries(
        arr.reduce((res, val) => (
            res[val] = val,
            res
        ), {})
    ).map(([_, val]) => val);

This will work for all arrays as long as they only contain primitive values, since res[{ foo: 42 }] will access the key res['[object Object]'] no matter the object.
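Two caveats of using object keys are worth illustrating: integer-like keys are listed in ascending order rather than insertion order, and 1 and "1" end up under the same key. The sketch below reuses the last variant under the made-up name `viaObjectKeys`, and contrasts it with a Map-based alternative (`viaMap`, also a made-up name), which keeps keys as they are and preserves insertion order.

```javascript
const viaObjectKeys = (arr) =>
  Object.entries(
    arr.reduce((res, val) => ((res[val] = val), res), {})
  ).map(([_, val]) => val);

// Integer-like keys are listed in ascending order, not insertion order.
console.log(viaObjectKeys([1, 1, 1, 3, 3, 2, 2])); // [1, 2, 3]

// 1 and "1" share the key "1"; the later value wins.
console.log(viaObjectKeys([1, "1"])); // ["1"]

// A Map keeps keys uncoerced and in insertion order.
const viaMap = (arr) => [...new Map(arr.map((val) => [val, val])).values()];
console.log(viaMap([1, "1", 3, 2, 3])); // [1, "1", 3, 2]
```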

Charles Assunção • Dec 9 '19

Yeah, this works, but you are "mutating" the data itself.

Mahendra • Dec 9 '19 • Edited

One more solution..

let obj = {}
let tempArray = []
for (let i = 0; i < arr.length; i++) {
  // Only keep values that have not been recorded yet.
  if (!Object.values(obj).includes(arr[i])) {
    obj[arr[i]] = arr[i]
    tempArray.push(arr[i])
  }
}
console.log(tempArray)

Charles Assunção • Dec 9 '19

Not really, I am keeping the same entries as the original array and not converting them to strings, which can be even trickier if your elements are objects or something else. By returning an array of strings, you are not preserving the data.

Jilles van Gurp • Dec 9 '19

It starts with realizing that arrays are blocks of memory and that technically you are creating a new array with some of the old values as you can't update the size of an array after you create one.

So, you are copying data to a new array instead of modifying an existing one. There's no way around having to iterate over your data at least once. However, the difference is in the price of your contains operation. I suspect/hope that the set might do some optimal things here (like using a hash function). Of course then having to use the spread operation might mean you are copying twice; which is of course not ideal. Your other two options probably have O(n) complexity for the contains check.

So if you do this a lot, use a Set as your primary data structure instead of converting to and from arrays because that isn't free.
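A rough sketch of that idea, assuming the collection only needs membership checks and iteration; `visited` and `visit` are hypothetical names:

```javascript
// Keep the collection as a Set from the start; no array round trip needed.
const visited = new Set();

const visit = (house) => {
  if (visited.has(house)) return false; // average constant-time membership check
  visited.add(house);
  return true;
};

visit("Oslo");
visit("Bergen");
visit("Oslo"); // ignored, already present

console.log(visited.size); // 2
for (const house of visited) console.log(house); // iterates in insertion order
```

Converting to an array with the spread operator is then only needed where an array is actually required.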

Andy • Dec 8 '19

Before reading your article, I asked myself how I'd do it, and intuitively I came up with the Set solution. So I couldn't agree more with the solution you have chosen.