Algorithm Problem: First Duplicate in Array

#javascript #computerscience

We are given the following problem statement.

Given an array a that contains only numbers in the range
from 1 to a.length, find the first duplicate number for
which the second occurrence has the minimal index. In other words, if
there are more than 1 duplicated numbers, return the number for
which the second occurrence has a smaller index than the second
occurrence of the other number does. If there are no such elements,
return -1.

So in essence, we need to search the array and find the pair of the first duplicate elements. There are several approaches to tackle this problem, let's take a look at some of these.

Approach #1: The naive loop approach

The first and most intuitive way to do this is to choose an element, and iterate all the way to the end of the array, checking whether the element has a duplicate.

How would such an algorithm work? First, we select the first element
and look all the way through the end of the list. If we find an element
that is duplicate, we just return the element and stop here. If we don't
we do the same steps but starting from the second element of the
list. We continue doing so, until reaching the second to last element
of the array. If we didn't find any duplicate element up to this point,
then there are no duplicate elements and we should return -1. We stop
at this element since it is our last chance to find a duplicate. If the
last element had a duplicate, we would already found it.

While this solution works, it has a time complexity of O(N²).
Surely there must be a better solution...

Approach #2: The memory approach

In the previous implementation, we had a problem. Our algorithm didn't remember the elements it encountered. For that reason, it passed the duplicate elements multiple times, until it had one element of the pair. What if our algorithm remembered each element it encountered? Then, as soon as it encountered a duplicate element it would stop.

Now in order for our algorithm to remember what elements it encountered, we need to store them somewhere. I will go for the object, but an array would be perfectly valid.

So we once again, start iterating our array, but now we perform a check for every element. If we encountered this element before, we got our duplicate and we can go home. If we didn't encounter it, we are going to store it.

This time, we need to iterate the array only once. The complexity to iterate the array once will be O(N). Storing and retrieving an item from an object has a complexity of O(1), so our final time complexity will be O(N). But, in this case, we are introducing an O(N) space complexity too since we are storing the elements of the array once again.

These are just two solutions for the problem, that I came up with. Surely there are more out there. Do you have something to add? Leave a comment below, and thanks for reading!

Originally posted here

For the record, I'd like to apologize in advance that you are about to read
my writing and my erratic thought process. This is my first post here so
I hope you enjoyed it! :bowtie:

Top comments (3)

Jason Steinhauser • Aug 9 '19

You can push elements into a Set and check to see if it exists already when while iterating. If it exists, then you can terminate and return the index of the repeated element. This should run in O(n) time (disregarding Set resizing).

Rodrigo Sene • Aug 7 '19

same code for both approaches

Kostas Kalafatis • Aug 8 '19

Care to elaborate?