Davide Bellone

Originally published at code4it.dev

C# Tip: Use a SortedSet to avoid duplicates and sort items

As you probably know, you can create collections of items without duplicates by using a HashSet<T> object.

It is quite useful to remove duplicates from a list of items of the same type.

How can we ensure that we always have sorted items? The answer is simple: SortedSet<T>!

HashSet: a collection without duplicates

A simple HashSet creates a collection of unordered items without duplicates.

This example

var hashSet = new HashSet<string>();
hashSet.Add("Turin");
hashSet.Add("Naples");
hashSet.Add("Rome");
hashSet.Add("Bari");
hashSet.Add("Rome");
hashSet.Add("Turin");


var resultHashSet = string.Join(',', hashSet);
Console.WriteLine(resultHashSet);

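prints this string: Turin,Naples,Rome,Bari. The duplicates are gone. Here the items happen to come out in insertion order, but that is an implementation detail: HashSet<T> does not guarantee any particular order, so don't rely on it.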
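As a small aside, here is a minimal sketch of how the deduplication shows up in code: Add returns a bool that tells you whether the item was actually inserted, and the constructor accepts an existing collection. The variable names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

var cities = new HashSet<string>();

bool addedFirst = cities.Add("Rome"); // true: "Rome" was not in the set yet
bool addedAgain = cities.Add("Rome"); // false: duplicate, the set is unchanged

Console.WriteLine($"{addedFirst} {addedAgain} {cities.Count}"); // True False 1

// Removing duplicates from an existing list: pass it to the constructor.
var raw = new List<string> { "Turin", "Naples", "Rome", "Bari", "Rome", "Turin" };
var unique = new HashSet<string>(raw);

Console.WriteLine(unique.Count); // 4
```

Checking the return value of Add is handy when you want to react to duplicates instead of silently dropping them.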

SortedSet: a sorted collection without duplicates

To sort those items, we have two approaches.

You can simply sort the collection once you've finished adding items:

var hashSet = new HashSet<string>();
hashSet.Add("Turin");
hashSet.Add("Naples");
hashSet.Add("Rome");
hashSet.Add("Bari");
hashSet.Add("Rome");
hashSet.Add("Turin");

// OrderBy works on any IEnumerable<T>, so no intermediate list is needed.
var items = hashSet.OrderBy(s => s);


var resultHashSet = string.Join(',', items);
Console.WriteLine(resultHashSet);


Or, even better, use the right data structure: a SortedSet<T>.

var sortedSet = new SortedSet<string>();

sortedSet.Add("Turin");
sortedSet.Add("Naples");
sortedSet.Add("Rome");
sortedSet.Add("Bari");
sortedSet.Add("Rome");
sortedSet.Add("Turin");


var resultSortedSet = string.Join(',', sortedSet);
Console.WriteLine(resultSortedSet);

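Both approaches print Bari,Naples,Rome,Turin. The difference is that the first one sorts on demand and produces a separate sorted sequence every time you need one, while a SortedSet<T> keeps its items in order as you add them: there is no extra sorting step and no additional copy of the data. That usually makes it the better choice when items arrive over time and you need them sorted at any moment.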
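To make the "always sorted" part concrete, here is a minimal sketch that prints the set after a few insertions and then uses Min, Max, and GetViewBetween. Passing StringComparer.Ordinal is my own assumption, added only so that the output does not depend on the current culture; the examples above use the default comparer.

```csharp
using System;
using System.Collections.Generic;

// Assumption: ordinal comparison, to keep the output culture-independent.
var cities = new SortedSet<string>(StringComparer.Ordinal);

cities.Add("Turin");
cities.Add("Naples");
Console.WriteLine(string.Join(',', cities)); // Naples,Turin

cities.Add("Bari");
cities.Add("Rome");
Console.WriteLine(string.Join(',', cities)); // Bari,Naples,Rome,Turin

// The smallest and largest items are directly available.
Console.WriteLine(cities.Min); // Bari
Console.WriteLine(cities.Max); // Turin

// A view of the items between two bounds (both inclusive).
var middle = cities.GetViewBetween("C", "S");
Console.WriteLine(string.Join(',', middle)); // Naples,Rome
```

None of these reads requires a sort: the set is already in order after every Add.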

Use custom sorting rules

What if we wanted to use a SortedSet with a custom object, like User?

public class User { 
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public User(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

Of course, we can do that:

var set = new SortedSet<User>();

set.Add(new User("Davide", "Bellone"));
set.Add(new User("Scott", "Hanselman"));
set.Add(new User("Safia", "Abdalla"));
set.Add(new User("David", "Fowler"));
set.Add(new User("Maria", "Naggaga"));
set.Add(new User("Davide", "Bellone"));//DUPLICATE!

foreach (var user in set)
{
    Console.WriteLine($"{user.LastName} {user.FirstName}");
}

But we will get an error at runtime: the code compiles, yet the set cannot put its items in order because our class doesn't know how to compare two users!

That's why we must update our User class so that it implements the IComparable<User> interface:

public class User : IComparable<User>
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public User(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // Sort by last name first, then by first name.
    public int CompareTo(User other)
    {
        // By convention, any instance comes after null.
        if (other is null) return 1;

        var lastNameComparison = LastName.CompareTo(other.LastName);

        return (lastNameComparison != 0)
            ? lastNameComparison
            : FirstName.CompareTo(other.FirstName);
    }
}

In this way, everything works as expected:

Abdalla Safia
Bellone Davide
Fowler David
Hanselman Scott
Naggaga Maria

Notice that the second Davide Bellone has disappeared: CompareTo returned 0 for it, so the set treated it as a duplicate.
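If you can't (or don't want to) touch the class, SortedSet<T> also accepts an IComparer<T> in its constructor. Below is a minimal sketch under a few assumptions of mine: a named tuple stands in for the User class to keep the snippet self-contained, the byFirstName comparer is a hypothetical name, and it sorts by first name with ordinal comparison, which is a different rule from the one used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: first name first, then last name, ordinal comparison.
var byFirstName = Comparer<(string FirstName, string LastName)>.Create((a, b) =>
{
    int first = string.CompareOrdinal(a.FirstName, b.FirstName);
    return first != 0 ? first : string.CompareOrdinal(a.LastName, b.LastName);
});

// The named tuple is only a stand-in for the User class.
var users = new SortedSet<(string FirstName, string LastName)>(byFirstName)
{
    ("Davide", "Bellone"),
    ("Scott", "Hanselman"),
    ("Safia", "Abdalla"),
    ("David", "Fowler"),
    ("Maria", "Naggaga"),
    ("Davide", "Bellone"), // the comparer returns 0 here: ignored as a duplicate
};

var names = string.Join(", ", users.Select(u => u.FirstName));

Console.WriteLine(users.Count); // 5
Console.WriteLine(names);       // David, Davide, Maria, Safia, Scott
```

Keep in mind that the comparer also decides what counts as a duplicate: two items for which it returns 0 are considered the same item, whatever Equals says.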

This article first appeared on Code4IT

Wrapping up

Choosing the right data type is crucial for building robust and performant applications.

In this article, we've used a SortedSet to store items in a collection that keeps them sorted and free of duplicates.

I've never used it in a project. So, how did I know about it? I just explored the libraries I was using!

From time to time, spend a few minutes reading the documentation, take a look at the most common libraries, and so on: you'll find lots of stuff that you never thought existed!

Toy with your code! Explore it. Be curious.

And have fun!

🐧
