Simon Taylor


Filter Unique in JavaScript

This is a repost from my Medium account. I’ve been planning to move away from Medium for a long time, ever since they introduced their awful monetization model. Anyway, it’s one of my more popular posts, so I hope you enjoy it!

For a while I have been thinking about a simple problem: why can’t we use .filter on arrays in JavaScript to get all the unique values in the array? The short answer is: you can! But I went on a short journey down JavaScript lane to find the best way to do this.

Overview

To me, filtering this way feels natural. The purpose of applying .filter to an array is to remove unwanted values, which in this case means duplicates. There are plenty of ways of achieving this without using .filter, but they feel both less readable and less elegant. An added benefit of using .filter is that you can neatly chain it with other array methods (e.g. .map, .reduce, and other .filter calls).
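
For example, a chain might look like this (a quick sketch; the unique predicate here is just a simple first-occurrence check so the snippet runs on its own, and the rest of the post digs into implementations properly):

// A sketch of chaining: parse strings to numbers, drop duplicates, then sum.
const unique = (x, i, a) => a.indexOf(x) === i;

const total = ['1', '2', '2', '3']
  .map(Number)                     // [1, 2, 2, 3]
  .filter(unique)                  // [1, 2, 3]
  .reduce((sum, n) => sum + n, 0); // 6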

If you don’t care about using .filter, there’s a neat trick you can do with the ES6 Set class. It essentially converts the array to a Set, which only stores unique values, and then converts it back into an array. It’s clean and elegant.

const unique = arr => [...new Set(arr)];
unique(myArray)
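
For example:

unique([1, 2, 2, 'a', 'a', 3]); // => [1, 2, 'a', 3]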

Version 1

After a bit of tinkering the first version I came up with was this...

const unique = (elem, index, array) => {
  for (var i = 0; i < index; i++) {
    if (array[i] === elem) return false;
  }
  return true;
};

myArray.filter(unique);

Try it out in runkit.

I was initially pretty happy with this approach.

  • It works for strings and numbers (primitives). I wasn’t trying to make this work for deep equality comparisons.
  • It only looks at the array elements that precede the current element, so that the first instance of a value will return true and any subsequent instances will return false. It doesn’t need to look at the entire array.
  • It returns false as soon as the duplicate is found (it doesn’t keep checking the remainder of the array).

What could be better?

After a while, I realized that while Version 1 worked pretty well, it could be better. Consider this example: an array with 1000 items.

const myArray = [<...998 items that aren't 1>, 1, 1];
Enter fullscreen mode Exit fullscreen mode

When we get to the last item, it will start looking at items that precede it (999 items) to see if the same element already exists. It starts at array position [0] and will keep checking every item until it reaches the second to last 1 at array position [998].

But this feels unnecessary: we already found a 1, so we know any subsequent 1s cannot be unique. It would be great if we could somehow cache each unique value we find, so that we don’t need to recheck the entire array for repeated values.

I then started thinking about creating my own cache object. It’s not hard to store values in a cache and provide a function to check whether they already exist, but this felt like reinventing the wheel: it sounds exactly like what Set does.
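
Roughly, a hand-rolled cache might look something like this sketch (primitives only; uniqueWithCache is just an illustrative name):

// A makeshift cache using a plain object, keyed by type and value so that
// 1 and '1' don't collide.
const uniqueWithCache = () => {
  const seen = {};
  return elem => {
    const key = typeof elem + ':' + String(elem);
    if (seen[key]) return false; // already seen, exclude it
    seen[key] = true;            // first occurrence, remember it and keep it
    return true;
  };
};

myArray.filter(uniqueWithCache());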

Version 2

This got me thinking again about Set and its elegance, and I started to wonder if I could somehow apply it within a .filter.

One of the challenges with writing a function to use with .filter is that you are limited to the arguments filter provides: only the current element, its index, and the initial array. It does not, unfortunately, give you the array being constructed for the result as you iterate (the way .reduce gives you access to the “accumulator”). If it did, this would be an easier problem to solve.
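
For comparison, here is a rough sketch of a .reduce-based version that leans on that accumulator directly:

// The accumulator is the result array built so far, so we can check it on
// every step; .filter gives us no equivalent.
const uniqueReduce = arr =>
  arr.reduce((acc, elem) => (acc.includes(elem) ? acc : [...acc, elem]), []);

uniqueReduce([1, 2, 2, 3]); // => [1, 2, 3]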

After some hacking, this is what I came up with…

const unique = () => {
  let cache;

  return (elem, index, array) => {
    if (!cache) cache = new Set(array);
    return cache.delete(elem);
  };
};

myArray.filter(unique());

Try it out on runkit.

This solution first generates a cache of the entire array using Set. Then as it iterates it checks each value against the cache using the .delete method. .delete will remove the element from the cache and return true if it can be found, or return false if it cannot.

This means that the first time a value is encountered it returns true (a successful deletion), since every element exists once in the cache, and so it is included in the result. On subsequent checks of the same value (i.e. duplicates) it returns false (the deletion fails), because that value has already been deleted, and so it is excluded from the result.
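
You can see the .delete behaviour in isolation:

const cache = new Set([1, 2, 3]);
cache.delete(2); // => true  (2 was in the cache and has now been removed)
cache.delete(2); // => false (2 is already gone)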

In terms of performance, this solution is great, because we find all the unique values up front, and then each iteration is purely a single check against the list of unique values. There is no need to check the same values repeatedly.

The one caveat is that this solution requires a cache to be generated and shared across calls, which is why it needs a closure: .filter(unique()) rather than .filter(unique). But I think this is a fair tradeoff for a performant and clean solution.

The Internet’s Solution

After I published this article, I discovered a pretty neat solution on the internet. It’s a one-liner, which makes it all the more beautiful.

const unique = (x, i, a) => a.indexOf(x) == i;

myArray.filter(unique);

So I started thinking maybe I wasted my time?

Performance

I believe performance metrics may have changed since I first posted this article, so please check JSBench for yourself before deciding on a solution.

The final piece of the puzzle was working out which solution is the most performant. You can see the results for yourself on JSBench.

[Screenshot: JSBench performance results]

Honestly, they’re all pretty close. Version 2 is faster than Version 1, which is faster than “The Internet’s Solution”. That said, the differences are small, and it likely takes a very large data set to make my solution (Version 2) worthwhile. So if that isn’t something you need, I’d probably just go with the one-liner.
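
If you’d rather sanity-check things locally than rely on JSBench alone, a rough console.time comparison (just a sketch, not the benchmark behind the numbers above) looks something like this:

// Rough local comparison of the three approaches on a large-ish array.
const bigArray = Array.from({ length: 100000 }, () =>
  Math.floor(Math.random() * 1000)
);

const uniqueV1 = (elem, index, array) => {
  for (var i = 0; i < index; i++) {
    if (array[i] === elem) return false;
  }
  return true;
};

const uniqueV2 = () => {
  let cache;
  return (elem, index, array) => {
    if (!cache) cache = new Set(array);
    return cache.delete(elem);
  };
};

const uniqueOneLiner = (x, i, a) => a.indexOf(x) === i;

console.time('Version 1');
bigArray.filter(uniqueV1);
console.timeEnd('Version 1');

console.time('Version 2');
bigArray.filter(uniqueV2());
console.timeEnd('Version 2');

console.time('One-liner');
bigArray.filter(uniqueOneLiner);
console.timeEnd('One-liner');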

Wrap up

So that’s pretty much it. You can try out my solutions on runkit or copy and paste them into your project. I also created an npm package called youneek, if you prefer that. If you like this article / solution, the best thanks you can give me is a star on GitHub.

Appendix

What if I do want to filter objects using deep equality? That’s beyond the scope of what I was trying to achieve, but you could probably use the first solution (Version 1) with something like fast-deep-equal to do the comparison.

const isEqual = require('fast-deep-equal');

const deepUnique = (elem, index, array) => {
  for (var i = 0; i < index; i++) {
    if (isEqual(array[i], elem)) return false;
  }
  return true;
};

myArray.filter(deepUnique);
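
For example, with plain objects:

[{ a: 1 }, { a: 1 }, { b: 2 }].filter(deepUnique); // => [{ a: 1 }, { b: 2 }]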
