DEV Community

Cover image for Removing Duplicates from an Array (Using C#)
Sung M. Kim
Sung M. Kim

Posted on • Originally published at slightedgecoder.com on

Removing Duplicates from an Array (Using C#)

* Featured Image – “removed.jpg” by Don Crowley, used under BY SA

This article is from last year but after reading Sai's article,

(go check it 👆 out~ it's good)

I wanted to share a C# version.


I’ve experimented a few different ways to remove duplicate items in an array.

An array could be either ordered or not.

There are many solutions to this problem and I will show two different ways for “unordered” array, and one for the ordered one.

Let’s take care of the easy one first.

Case 1 – Removing duplicates from an ordered array.

Supposed that you are given an array, {1, 2, 2, 3, 4, 4, 4, 5}.

Since an array is ordered, you can simply check if a next item is same as the current item being checked against. Here is the function, which implements the algorithm.

private static int[] RemoveDuplicatesByCheckingPreviousElement(int[] a)
{
    List<int> result = new List<int>();
    for (int i = 0; i < a.Length - 1; i++)
    {
        if (a[i] != a[i + 1])
        {
            result.Add(a[i]);
        }
    }

    result.Add(a[a.Length - 1]);
    return result.ToArray();
}
Enter fullscreen mode Exit fullscreen mode

You only add the current item, if the next one is not the same as the current one. E.g.)

E.g.) While on the 3rd item, “2”, then the next item is “3”, so we know that we have passed through all possible duplicates since the array is sorted.

We need to add the last item to the result because the last one doesn’t have the next element to check for a duplicate (line #12).

You can find more details on this YouTube video, How to remove duplicates from a sorted array in C/C++.

The source is in C++ but I translated it to C#.

Case 2.1 – Removing duplicates from an unordered array using a set.

By definition, a set doesn’t allow duplicate value. So in .NET, instantiating a HashSet object by passing an array would have removed duplicates automatically.

But the point of this exercise is to do so manually. I am using HashSet because it has an O(1) lookup time while an array has an O(N) time complexity.

The algorithm is simple. While going through each item in the array, if the current item has not been seen, we add it to the set as demonstrated in following code snippet.

private static int[] RemoveDuplicatesUsingHashTable(int[] a)
{
    HashSet<int> alreadySeen = new HashSet<int>();
    foreach (int item in a)
    {
        if (!alreadySeen.Contains(item))
            alreadySeen.Add(item);
    }
    return alreadySeen.ToArray();
}
Enter fullscreen mode Exit fullscreen mode

You can find details on this YouTube video, [Interview Question] Duplicate Integers in Array.

Case 2.2 – Removing duplicates from an unordered array by sorting it and check for next element for duplicates.

The algorithm is to sort the current array and apply case 1 algorithm.

private static int[] RemoveDuplicatesBySortingFirst(int[] a)
{
    Array.Sort(a);
    return RemoveDuplicatesByCheckingPreviousElement(a);
}
Enter fullscreen mode Exit fullscreen mode

Note that RemoveDuplicatesByCheckingPreviousElement is the function name from case 1.

Sorting the array takes O(N log N) time and case 1, O(N), which results in O(N) total time complexity.

Conclusion

When you were asked to solve this problem in an interview, ask the interviewer if the array is sorted or not. Use any algorithms above depending on the context;

If the array is ordered, then use case 1 algorithm, else either 2.1 or 2.2. But be aware that case 2.2 returns an array in different order from the original one.

E.g.) Given an array a = {8, 1, 1, 3, 2, 2},

RemoveDuplicatesBySortingFirst(a)
Enter fullscreen mode Exit fullscreen mode

Above function returns {1, 2, 3, 8}, not {8, 1, 3, 2}.

Full working source code can be found on GitHub.

The post Removing Duplicates from an Array (Using C#) appeared first on Slight Edge Coder.

Top comments (2)

Collapse
 
sait profile image
Sai gowtham

Do you have any solution for removing duplicate objects from an array?

Collapse
 
dance2die profile image
Sung M. Kim

I haven't implemented any but the implementation can be a bit tricky for the case 1 & 2.1

Case 1 & 2.1

  • Implement IComparable interface for instance of the class you are comparing. So that you can compare current & next objects in an array.
  • Also override the comparison operators (>, >=, <, <=).
// To enable Point class object instance comparable.
public class Point : IComparable<Point>
{
    public int CompareTo(Point otherPoint)
    {
        return this.X > otherPoint.X && this.Y > otherPoint.Y;
    }

    public static bool operator >  (Point p1, Point p2) => p1.CompareTo(p2) == 1;
    public static bool operator <  (Point p1, Point p2) => p1.CompareTo(p2) == -1;
    public static bool operator >=  (Point p1, Point p2) => p1.CompareTo(p2) >= 0;
    public static bool operator <=  (Point p1, Point p2) p1.CompareTo(p2) <= 0;
}
...

// case 1
if (a[i] != a[i + 1])
{
    result.Add(a[i]);
}

// case 2.1
if (!alreadySeen.Contains(item))
    alreadySeen.Add(item);

For Case 2.2

Create a class implementing IComparer interface so that you can specify how objects can be sorted.

public class PointComparer<Point> : IComparer<Point>
{
  int IComparer.Compare(Point p1, Point p2)  
  {
      return p1.X > p2.X && p1.Y > p2.Y;
  }
}

// ...
Array.Sort(points, new PointComparer());