DEV Community

Mahbub Alam Masum

Posted on • Originally published at blog.masum.dev

How to Remove Duplicates from a Sorted Array

Removing duplicates from a sorted array is a common problem in coding interviews and programming challenges, and it can be solved in several ways. In this article, we'll discuss two approaches: a brute-force solution using a set, and an optimal solution using the two-pointers technique.

Solution 1: Brute Force Approach (using Set)

This method uses a set to store unique elements, ensuring that duplicates are removed.

Implementation:

// Solution-1: Brute Force Approach (using Set)
// Time Complexity: O(n log n)
// Space Complexity: O(n)
int removeDuplicates(vector<int> &arr, int n)
{
    set<int> uniqueElements;

    // Insert every element; the set silently discards duplicates
    for (int i = 0; i < n; i++)
    {
        uniqueElements.insert(arr[i]);
    }

    int k = uniqueElements.size();

    // Copy unique elements from the set back into the input array.
    // std::set iterates in ascending order, so the array stays sorted.
    int i = 0;
    for (auto x : uniqueElements)
    {
        arr[i] = x;
        i++;
    }

    // Return the number of unique elements
    return k;
}

Logic:

  1. Use a Set: Insert all elements of the array into a set to remove duplicates.

  2. Get Unique Count: The size of the set gives the number of unique elements.

  3. Copy Back to Array: Copy elements from the set back to the array.

Time Complexity: O(n log n)

  • Explanation: Inserting each element into the set takes O(log n) time, resulting in O(n log n) for n elements.

Space Complexity: O(n)

  • Explanation: The set stores all unique elements, potentially up to n elements.

Example:

  • Input: arr = [1, 1, 2, 2, 3, 4, 4], n = 7

  • Output: k = 4, arr = [1, 2, 3, 4, ...]

  • Explanation: The array contains 4 unique elements, which now occupy the first k positions; the contents beyond index k - 1 are irrelevant.


Solution 2: Optimal Approach (Two-Pointers Technique)

This method uses two pointers to overwrite duplicates in place, providing an efficient solution.

Implementation:

// Solution-2: Optimal Approach (Two-Pointers Technique)
// Time Complexity: O(n)
// Space Complexity: O(1)
int removeDuplicates(vector<int> &arr, int n)
{
    if (n == 0)
        return 0; // guard: an empty array has no unique elements

    int i = 0; // Marks the end of the unique prefix built so far
    for (int j = 1; j < n; j++)
    {
        if (arr[i] != arr[j])
        {
            i++;
            arr[i] = arr[j];
        }
    }

    // Return the number of unique elements
    return i + 1;
}

Logic:

  1. Initialize Pointers: i marks the end of the unique prefix built so far, while j traverses the rest of the array.

  2. Compare Elements: If arr[j] differs from arr[i], a new unique value has been found: increment i and copy arr[j] into arr[i].

  3. Return Count: The value i + 1 gives the number of unique elements (with an explicit n == 0 check to return 0 for an empty array).

Time Complexity: O(n)

  • Explanation: The array is traversed once.

Space Complexity: O(1)

  • Explanation: No additional space is used apart from variables.

Example:

  • Input: arr = [1, 1, 2, 2, 3, 4, 4], n = 7

  • Output: k = 4, arr = [1, 2, 3, 4, ...]

  • Explanation: The array contains 4 unique elements, which the two pointers compact into the first k positions in a single pass.


Comparison

  • Brute Force Method:

    • Pros: Simple and straightforward.
    • Cons: Inefficient for large arrays due to O(n log n) time complexity and additional space usage.
    • Use Case: Useful when simplicity is more important than efficiency.
  • Optimal Method:

    • Pros: Efficient with O(n) time complexity and O(1) space complexity.
    • Cons: Requires in-place modification of the array.
    • Use Case: Ideal for large arrays and when in-place operations are feasible.

Edge Cases

  • Empty Array: Returns 0 as there are no elements.

  • Single Element Array: Returns 1 as a single element is trivially unique.

  • All Identical Elements: Returns 1 as all elements are the same.

Additional Notes

  • Efficiency: The optimal approach is significantly more efficient for large datasets.

  • Simplicity: Despite its efficiency, the optimal approach is simple to implement.

  • Practicality: The optimal method is generally preferred due to its linear time complexity and constant space complexity.

Conclusion

Removing duplicates from a sorted array can be done efficiently using an in-place approach with two pointers. While the brute force method provides a straightforward solution, the optimal approach is both efficient and easy to implement, making it suitable for large datasets.

