Angela F.

Posted on Jun 10 • Edited on Jun 25

Leetcode 150 | Day 3: Remove Duplicates From Sorted Array - Naive vs. Optimized

#leetcode #javascript #algorithms

Leetcode 80: Remove Duplicates II

Leetcode 80 asks us to modify a sorted array so that each unique element appears at most twice. We are to preserve relative order and store the result in the first part of array nums. The first k elements of nums will hold the result, and beyond k it does not matter what values may have been left behind. Lastly, this must be accomplished by modifying the input array in-place, without allocating any extra space.

I chose to do a writeup on this problem because it's the first time I've run into the possibility that there really isn't a feasible naive approach that respects all of the restrictions put in place - namely, the requirement of modifying in-place. I tried using splice() to shift elements out of the array while tracking position with a single pointer. However, using splice() and subsequently having to decrement the pointer with i-- doesn't work in the case of three or more consecutive duplicates. It will skip a value for the same reason outlined in Day 2. There was an identical issue I ran into there, and it is explained here Day 2 Remove Element. It may not be worth the time to continue digging around, but I am curious whether there does exist a naive approach that respects the restrictions completely.

While it's true the naive approach fails the restrictions laid out in the problem - even if that weren't the case, I'd still argue against it. The time complexity is the same as the optimized approach, but as you'll see below, the difference in readability is significant.

For both approaches we will use the following array:

nums = [1, 2, 3, 3, 4, 4, 4]

Approach 1: Naive (Nested For Loop)

Solution:

var removeDuplicates = function(nums) {
    let counts = [];

    for (let i = 0; i < nums.length; i++) {
        if (counts.length === 0 || counts[counts.length - 1][0] !== nums[i]) {
            counts.push([nums[i], 1]);
        } else {
            counts[counts.length - 1][1]++;
        }
    }

    let idx = 0;
    for (let [num, count] of counts) {
        let timesToAdd = Math.min(count, 2);
        for (let i = 0; i < timesToAdd; i++) {
            nums[idx++] = num;
        }
    }

    return idx;
};

This approach works in two passes. First, it builds a counts array that tracks how many times each number appears consecutively. Each entry is a pair: [number, count]. Then it loops over counts, writes each number back into nums at most twice using a nested loop, and returns the final index as k.

It gets the job done. But it requires an extra array, and the logic is spread across two separate loops with a nested loop inside the second. Compare that to what follows.

Time complexity: O(n)

Two passes through the data, but each is O(n). The nested loop runs at most twice per element, so it doesn't change the overall complexity.

Space complexity: O(n)

The counts array grows with the input. This is what disqualifies it - the problem explicitly requires O(1) extra memory.

Approach 2: Optimized (Two Pointers)

Solution:

var removeDuplicates = function(nums) {
    let slow = 2;

    for (let fast = 2; fast < nums.length; fast++) {
        if (nums[fast] !== nums[slow - 2]) {
            nums[slow] = nums[fast];
            slow++;
        }
    }

    return slow;
};

This approach is comparatively more straightforward, and less code. We start the algorithm by pointing both the slow and fast pointer to index 2. Why 2? Because we can make two assumptions here - only one can be true, and both are acceptable per the problem description.

Assumption 1: Both index 0 and index 1 hold the same value. This is allowed. Assuming it's true means we can skip those positions and iterate over a smaller portion of the array.

Assumption 2: The elements at index 0 and 1 are unique values. Also allowed. Same result - we skip ahead and iterate over less.

Perhaps it can be argued that the time saved here is nominal. But it can also be said, why not?

Once we've entered the for loop and hit the if statement, the condition checks whether the value at nums[fast] is the same as the value two positions behind slow. If it is, that means we already have two of this value written into the result - adding another would violate the restriction. So we skip it. If it's different, we write nums[fast] to nums[slow] and increment slow. The for loop handles advancing fast automatically at the start of each iteration.

Step 1.

Both slow and fast begin at index 2. We check whether nums[fast] equals nums[slow - 2]. nums[2] is 3 and nums[0] is 1. They are not equal, so we write 3 to nums[slow] and increment slow to 3.

Step 2.

fast advances to index 3. nums[3] is 3 and nums[slow - 2] is nums[1], which is 2. They are not equal, so we write 3 to nums[slow]. Increment slow to 4.

Step 3.

fast advances to index 4. nums[4] is 4 and nums[slow - 2] is nums[2], which is 3. Not equal - write 4 to nums[slow]. Increment slow to 5.

Step 4.

fast advances to index 5. nums[5] is 4 and nums[slow - 2] is nums[3], which is 3. Not equal - write 4 to nums[slow]. Increment slow to 6.

Step 5.

fast advances to index 6. nums[6] is 4 and nums[slow - 2] is nums[4], which is 4. Equal - this would be a third consecutive 4, which violates the restriction. We skip. slow stays at 6. fast has reached the end of the array and the loop ends.

We return slow, which is 6. The first 6 elements of nums are [1, 2, 3, 3, 4, 4].

Time complexity: O(n)

One pass through the array. No shifting, no extra structures.

Space complexity: O(1)

Everything happens in place. No extra memory allocated.

Conclusion

Both approaches have a time complexity of O(n). However, only Approach 2 truly meets the requirements of the problem. Space complexity for Approach 1 is O(n), while Approach 2 is O(1).

Approach 1 pros:

Readable logic. Each step is explicit.
No pointer math required.

Approach 1 cons:

O(n) space. Fails the problem's constraints.
More code. Two loops, one nested.
Three loops total - two outer, one nested - means more to mentally track when reading the algorithm end to end.

Approach 2 pros:

O(1) space. Meets all requirements.
Four lines of logic once you understand the two-pointer pattern.
Scales cleanly.

Approach 2 cons:

The two-pointer pattern is less intuitive at first glance, especially the slow - 2 check.
Starting at index 2 requires understanding why the first two elements are skipped.

Sometimes we have to sacrifice readability for better time complexity, or for other reasons. This time, I don't think that's the case. We get better readability and better space complexity.

...Day 4 coming soon