Originally posted on Will's blog
When dealing with arrays of values in JavaScript we sometimes want to determine if the array contains any duplicate values. Unfortunately, JavaScript arrays do not expose any built-in methods that can do this for us -- we have to write the implementation ourselves.
One approach to this problem might look like this:
function checkForDuplicates(array) {
let valuesAlreadySeen = []
for (let i = 0; i < array.length; i++) {
let value = array[i]
if (valuesAlreadySeen.indexOf(value) !== -1) {
return true
}
valuesAlreadySeen.push(value)
}
return false
}
This works, but in the worst-case scenario where the only duplicate value occurs at the end of the array, this is not a very performant approach. We would have to iterate through the entire array (which could be huge!) only to realize at the last element that there is actually a non-unique value in the array.
Another approach that I recently learned harnesses the power of ES6 Sets.
In case you aren't familiar with Sets in JavaScript (I wasn't until recently!), here is the MDN definition:
Set
objects are collections of values. You can iterate through the elements of a set in insertion order. A value in theSet
may only occur once; it is unique in theSet
's collection.
Read that last line one more time, as that is our secret sauce here. 'A value in the Set
may only occur once; it is unique in the Set
's collection.'
This fact means that we can convert our original array into a Set
and then be confident that it only contains unique values. Once we've extracted all of the unique values from the array and stored them in our Set
, we can compare the lengths of the array and the Set. If the lengths are not the same, it must follow that the array contained duplicate values!
Here's what that approach looks like:
function checkForDuplicates(array) {
return new Set(array).size !== array.length
}
If the length of the Set
and the array are not the same this function will return true
, indicating that the array did contain duplicates. Otherwise, if the array and the Set
are the same length the function will return false
and we can be certain that the original array contained no duplicate values!
I really like this second approach for how concise and expressive it is, but you might run into browser support issues if you need to target older browsers so take that into consideration!
Top comments (6)
Nice technique. But it works only with primitives.
If you do not mind I want to add a bit more info about this topic.
If you need to find duplicates in array of objects, you can do something like this:
or use almost the same technique as the author:
Yep. It is cleaner than mine.
could you please explain this syntax to me and thank you in advance.
The first function uses a for loop to iterate through the array and an indexOf method to search for a value in the valuesAlreadySeen array. This can result in a worst-case time complexity of O(n^2), where n is the number of elements in the array. In the case of a large array with 1,000,000 elements, this could lead to performance issues and slow execution times.
When duplicates are found early in the array, the first function might be faster as it can exit early after finding the duplicates. However, considering the average case and worst-case scenarios, the second function is generally faster for larger arrays.
Although the second function doesn't have the advantage of early exits like the first function, it has a better average-case and worst-case time complexity. In real-world scenarios, where duplicates may not always be at the beginning of the array, the second function is more efficient in terms of time complexity.
If we need a solution that can exit early when duplicates are found and still maintain good performance for large arrays, we can use a combination of a Set and a loop
This function iterates over the array using a for loop and checks for duplicates using a Set. It maintains the fast O(1) Set operations while still allowing for early exits when duplicates are found.
Oh I like that...