DEV Community

Albert Bennett
Albert Bennett

Posted on

Data Structures in C# - The complex ones... and some other ones as well

Hi and welcome to my post!
Today I'll be going through the more complex data structures.
I think most people reading this will be familiar with some of the more common data structures such as arrays, lists and dictionaries but, I wonder how many would be familiar with the likes of binary trees, stacks and linked lists.
In this post I'll show you the wonderful world of data structures.

Feel free to leave a like if you learned something new or found out something interesting.

There is a lot in this post so I've added link to the individual header in this post for reference:

Binary Trees

Here we go starting of with the more complex of the data structures.
Before we get into the code implementation of a flavor of binary tree lets first define the common elements of binary trees:

  • Node: This object represents the connection between itself and other nodes in the binary tree
  • Root Node: This is the first node in the tree and has no parent but, can have children
  • Child: Any node that has a parent, is a child of that parent.
  • Sibling: This is any node that lays on the same depth as the current node and has the same parent node
  • Parent: This is any node that has children, in a binary tree a parent can have up to two children
  • Leaf: This is a child that has no children
  • Depth: This is the position in the tree that the node is in

Here is a diagram of an example full binary tree:
Binary tree example
In our example tree above, 0 is the root node and is the parent of both 5 and 10. 5 is the parent node of both 3 and 7. 3 and 7 are siblings as they both have 5 as a parent and are at the same depth. 3, 7 and 10 are all leaf nodes as they don't have any children of their own.
With all of that out of the way lets go on to the types of binary trees. Binary trees come into five distinct flavors:

  • Pathological binary tree: A pathological Binary tree is a binary tree where each node has only one child. These kinds of binary trees act similar to linked lists. Example below: Pathological binary tree example
  • Full binary tree: These kinds of binary trees can have from 0 to 2 children. See example below, it the same diagram as above in the explainer on the elements of a binary tree: Binary tree example
  • Complete binary tree: This is a type of binary tree where all nodes are filled on all depths and is skewed to the left, except for the last depth. Example of a complete binary tree
  • Balanced binary tree: This kind of binary is where the left and right side of the binary tree differ in depth by at most one. Example of a balanced binary tree
  • Perfect binary tree: This is a binary tree where all nodes have two children and all leaves are at the same depth. Example of a perfect binary tree

Super! So we have the different types of binary trees defined, next on to how this kind of data structure is constructed in code. to clarify we will be building a full binary tree, as that is the most common type of binary tree that I have experienced.

To start we will define what a node looks like.

    internal class Node(Node? parent, int value)
    {
        readonly int value = value;
        readonly Node? parent = parent;
        Node? leftChild;
        Node? rightChild;

        public int Value { get { return value; } }
        public Node? Parent { get { return parent; } }
        public Node? LeftChild { get { return leftChild; } }
        public Node? RightChild { get { return rightChild; } }

        // This a recursive method to add the new nodes to our tree
        public void Add(int value)
        {
            /*If the new value is < our value either add a child node
             or check the left child node*/
            if (value < this.value)
            {
                if (leftChild != null)
                {
                    leftChild.Add(value);
                }
                else
                {
                    leftChild = new Node(this, value);
                    Console.WriteLine($"parent:{this.value}, left child: {value}");
                }

                return;
            }

            /*If the new value is >= our value either add a child node
                or check the right child node*/
            if (rightChild != null)
            {
                rightChild.Add(value);
            }
            else
            {
                rightChild = new Node(this, value);
                Console.WriteLine($"parent:{this.value}, right child: {value}");
            }
        }
    }
Enter fullscreen mode Exit fullscreen mode

And that's it... that's our tree...
As you can see from above, for a node we can create one with no parent to be our root node from here when we add to the root node it checks it's children if they are null it creates on based on the value to be added or it will call add on one of it's child nodes. The adding of the child nodes is a O(N) complexity where N is the number of nodes in the binary tree:
I've written the below to test this out:

List<int> values = new() { 3, 7, 8, 6, 10 };
Node rootNode = new(null, 5);

Console.WriteLine($"Root node value: {rootNode.Value}");
Console.WriteLine(string.Join(',', values));

foreach(int i in values)
{
    rootNode.Add(i);
}

Console.ReadKey();
Enter fullscreen mode Exit fullscreen mode

The output:
Binary tree output
Diagram of the tree output
Now there are a few bits we can add to the node structure to help to query it. First we will add a new enum to tell us if the node is a root, child or leaf node:

    internal enum NodeTypes : byte
    {
        Root = 0,
        Child = 1,
        Leaf = 2
    }
Enter fullscreen mode Exit fullscreen mode

Added to the node class:

        public NodeTypes Type
        {
            get
            {
                return parent == null ? NodeTypes.Root :
                    leftChild != null || rightChild != null ?
                        NodeTypes.Child : NodeTypes.Leaf;
            }
        }
Enter fullscreen mode Exit fullscreen mode

These types will help us query the nodes and let us know if we have reached the end of any branch of a tree.
Next we will write a method to let use know if a node is a sibling of another node:

        public bool IsSibling(Node node)
        {
            if (node.Type == NodeTypes.Root)
                return false;

            return node.parent == this.parent;
        }
Enter fullscreen mode Exit fullscreen mode

Finally the below method will let us search for a value in the binary tree and return the node that contains the value. The search is a O(N) complexity where N is the number of nodes in the binary tree:

        /* This is another recursive method where each node is checked until either
           the tree has been searched or the value has been found */
        public Node? FindNodeWithValue(int value)
        {
            if (this.value == value)
            {
                return this;
            }
            else if (Type != NodeTypes.Leaf)
            {
                Node? foundNode = null;

                if (leftChild != null)
                {
                    foundNode = leftChild.FindNodeWithValue(value);
                }

                if (foundNode == null)
                {
                    if (rightChild != null)
                    {
                        return rightChild.FindNodeWithValue(value);
                    }
                }

                return foundNode;
            }

            return null;
        }
Enter fullscreen mode Exit fullscreen mode

Essentially every node is searched until the value is found, now there are other methods for searching a binary tree including depth first search.
Search and finding outputs
Now, there are other kinds of tree not just binary trees. A binary tree is just a tree structure with up to two children per node. You also have oct trees (8 children per node) that are used for partitioning 3D space for the likes of defining the level of detail on terrain in a video game or quad trees (4 children per node) that are used for image compression.


Stacks

So stacks, this is a generic collection that works on the principal last in first out. This means that the last item in is the first item in the collection. Also they don't support an indexer.
They can be defined as follows:

Stack<int> stack = new Stack<int>();
Enter fullscreen mode Exit fullscreen mode

Items can be added to a stack by pushing them onto it:

Stack<int> stack = new Stack<int>();

stack.Push(5);
stack.Push(6);
stack.Push(7);

Console.WriteLine($"{string.Join(',', stack)}");
Enter fullscreen mode Exit fullscreen mode

stack output
To view the top item in the stack you can use peek. the stack:

Stack<int> stack = new Stack<int>();

stack.Push(5);
stack.Push(6);
stack.Push(7);
stack.Push(8);
stack.Push(9);
stack.Push(10);

Console.WriteLine($"Stack: {string.Join(',', stack)}");

int peeked = stack.Peek();
Console.WriteLine($"Item: {peeked}");
Enter fullscreen mode Exit fullscreen mode

peeking in a stack

You can remove items from the stack by using stack pop. Using stack pop will only remove the first item in the collection and return it.

Stack<int> stack = new Stack<int>();

stack.Push(5);
stack.Push(6);
stack.Push(7);

Console.WriteLine($"Stack: {string.Join(',', stack)}");

int removed = stack.Pop();
Console.WriteLine($"Item removed: {removed}");

Console.WriteLine($"Stack: {string.Join(',', stack)}");
Enter fullscreen mode Exit fullscreen mode

popped stack
You can also remove all items in the stack with clear.

The main use cases for stacks is when you need to make use of the last item added to the collection. So it could be useful in scenarios where you need to manage the history of a item such as browsing history or you need to keep an audit log or track of what had happened to a particular object to get to the state that it is in now.


Queues

Queues work on the principal of first in first out and also don't support an indexer.
Queues can be instantiated as follows:

Queue<string> names = new Queue<string> ();
Enter fullscreen mode Exit fullscreen mode

Items are added to the queue using enqueue. This will add items to the end of the collection similar to a list.

Queue<string> names = new Queue<string> ();

names.Enqueue("Tom");
names.Enqueue("Timmy");
names.Enqueue("Thomas");
names.Enqueue("Tommity");
names.Enqueue("Tommiton");
names.Enqueue("Tombert");
names.Enqueue("Tomantha");

Console.WriteLine($"{string.Join(',', names)}");
Enter fullscreen mode Exit fullscreen mode

enqueue example
Like with a stack you can view the item in the front of the collection without removing it using peek.

string peeked = names.Peek();
Console.WriteLine($"Peeked: {peeked}");
Enter fullscreen mode Exit fullscreen mode

queue peeked example
Similar again to stacks you can remove the first item in the collection with dequeue:

Queue<string> names = new Queue<string> ();

names.Enqueue("Tom");
names.Enqueue("Timmy");
names.Enqueue("Thomas");
names.Enqueue("Tommity");
names.Enqueue("Tommiton");
names.Enqueue("Tombert");
names.Enqueue("Tomantha");

Console.WriteLine($"{string.Join(',', names)}");

string dequeued = names.Dequeue();
Console.WriteLine($"dequeued: {dequeued}");
Console.WriteLine($"{string.Join(',', names)}");
Enter fullscreen mode Exit fullscreen mode

dequeuing of a queue
You can also clear a queue by calling the clear method.

The main use case for queues is for well managing messages such as in message queues in a Publish Subscribe architecture. In it queues are use to store messages to mitigate system interrupts so that when the consuming resource is free to take on a new task the queue is dequeued and the first item is sent to the consumer.


Data tables

Data tables as the name suggest is a code representation of a table with columns and rows that you can use to define the relationship between the different data elements and query the data in memory.

As you'd imagine Data Tables are formed from several different elements:

  • Data Table: defines the table in memory
  • Data Row: defines a row in the Data Table
  • Data Column: defines a column in the Data Table
  • Data Relationship: defines the relationship between two columns in Data Tables (like a foreign key to table)
  • Data Set: defines the collection of the data objects (tables, relations, etc)

Super! Lets start with an example of a simple table that some customer information in it:

using System.Data;

const string CUSTOMER_TABLE_NAME = "Customers";

const string ID_COLUMN_NAME = "customer_id";
const string CUSTOMER_FIRST_NAME_COLUMN_NAME = "customer_first_name";
const string CUSTOMER_LAST_NAME_COLUMN_NAME = "customer_last_name";

DataSet ds = new DataSet();

CreateCustomerTable(ds);

static void CreateCustomerTable(DataSet ds)
{
    //Creates our data table
    DataTable dataTable = new DataTable(CUSTOMER_TABLE_NAME);

    //Defines the columns in the data table
    dataTable.Columns.Add(ID_COLUMN_NAME, typeof(int));
    dataTable.Columns.Add(CUSTOMER_FIRST_NAME_COLUMN_NAME, typeof(string));
    dataTable.Columns.Add(CUSTOMER_LAST_NAME_COLUMN_NAME, typeof(string));

    addDataToCustomerTable(1, "John", "Tomson", dataTable);
    addDataToCustomerTable(2, "Tom", "Johnson", dataTable);

    //Adding our table to the data set
    ds.Tables.Add(dataTable);
}

//The method is used to add a new row to the table
static void addDataToCustomerTable(int id, string customerFirstName, string customerLastName, DataTable table)
{
    DataRow row = table.NewRow();

    row[ID_COLUMN_NAME] = id;
    row[CUSTOMER_FIRST_NAME_COLUMN_NAME] = customerFirstName;
    row[CUSTOMER_LAST_NAME_COLUMN_NAME] = customerLastName;

    table.Rows.Add(row);
}
Enter fullscreen mode Exit fullscreen mode

That is the basics of creating a new data table. However, next we'll add two new tables and define a relationship between them. In the example I'll be defining an Item table and an Online Order table.

using System.Data;

const string CUSTOMER_TABLE_NAME = "Customers";
const string ONLINE_ORDER_TABLE_NAME = "OnlineOrders";
const string ITEM_TABLE_NAME = "Items";

const string CUSTOMER_ID_COLUMN_NAME = "customer_id";
const string CUSTOMER_FIRST_NAME_COLUMN_NAME = "customer_first_name";
const string CUSTOMER_LAST_NAME_COLUMN_NAME = "customer_last_name";

const string ORDER_ID_COLUMN_NAME = "online_order_id";
const string QUANTITY_COLUMN_NAME = "quantity";

const string ITEM_ID_COLUMN_NAME = "item_id";

const string ITEM_DESCRIPTION_COLUMN_NAME = "description";

DataSet ds = new DataSet();

CreateCustomerTable(ds);
CreateOnlineOrderTable(ds);
CreateItemTable(ds);

CreateTableRelationships(ds);

void CreateTableRelationships(DataSet ds)
{
    ds.Tables[ONLINE_ORDER_TABLE_NAME].ParentRelations.Add(
        ds.Tables[CUSTOMER_TABLE_NAME].Columns[CUSTOMER_ID_COLUMN_NAME],
        ds.Tables[ONLINE_ORDER_TABLE_NAME].Columns[CUSTOMER_ID_COLUMN_NAME]);

    ds.Tables[ONLINE_ORDER_TABLE_NAME].ParentRelations.Add(
        ds.Tables[ITEM_TABLE_NAME].Columns[ITEM_ID_COLUMN_NAME],
        ds.Tables[ONLINE_ORDER_TABLE_NAME].Columns[ITEM_ID_COLUMN_NAME]);
}

static void CreateOnlineOrderTable(DataSet ds)
{
    //Creates our data table
    DataTable dataTable = new DataTable(ONLINE_ORDER_TABLE_NAME);

    //Defines the columns in the data table
    DataColumn idColumn = new DataColumn(ORDER_ID_COLUMN_NAME, typeof(int));
    idColumn.Unique = true;
    idColumn.AutoIncrement = true;

    dataTable.Columns.Add(idColumn);
    dataTable.Columns.Add(CUSTOMER_ID_COLUMN_NAME, typeof(int));
    dataTable.Columns.Add(ITEM_ID_COLUMN_NAME, typeof(int));
    dataTable.Columns.Add(QUANTITY_COLUMN_NAME, typeof(string));

    addDataToTable(CreateDataForOrderTable(1, 1, 15), dataTable);

    //Adding our table to the data set
    ds.Tables.Add(dataTable);
}

static void CreateCustomerTable(DataSet ds)
{
    //Creates our data table
    DataTable dataTable = new DataTable(CUSTOMER_TABLE_NAME);

    //Defines the columns in the data table
    DataColumn idColumn = new DataColumn(CUSTOMER_ID_COLUMN_NAME, typeof(int));
    idColumn.Unique = true;
    idColumn.AutoIncrement = true;

    dataTable.Columns.Add(idColumn);
    dataTable.Columns.Add(CUSTOMER_FIRST_NAME_COLUMN_NAME, typeof(string));
    dataTable.Columns.Add(CUSTOMER_LAST_NAME_COLUMN_NAME, typeof(string));

    addDataToTable(CreateDataForCustomerTable("John", "Tomson"), dataTable);
    addDataToTable(CreateDataForCustomerTable("Tom", "Johnson"), dataTable);

    //Adding our table to the data set
    ds.Tables.Add(dataTable);
}

static void CreateItemTable(DataSet ds)
{
    //Creates our data table
    DataTable dataTable = new DataTable(ITEM_TABLE_NAME);

    //Defines the columns in the data table
    DataColumn idColumn = new DataColumn(ITEM_ID_COLUMN_NAME, typeof(int));
    idColumn.Unique = true;
    idColumn.AutoIncrement = true;

    dataTable.Columns.Add(idColumn);
    dataTable.Columns.Add(ITEM_DESCRIPTION_COLUMN_NAME, typeof(string));

    addDataToTable(CreateDataForItemTable("Apple"), dataTable);
    addDataToTable(CreateDataForItemTable("Orange"), dataTable);

    //Adding our table to the data set
    ds.Tables.Add(dataTable);
}

static Dictionary<string, object> CreateDataForCustomerTable(string firstName, string lastName)
{
    return new Dictionary<string, object>
    {
        { CUSTOMER_FIRST_NAME_COLUMN_NAME, firstName },
        { CUSTOMER_LAST_NAME_COLUMN_NAME, lastName }
    };
}

static Dictionary<string, object> CreateDataForOrderTable(int customerId, int itemId, int quantity)
{
    return new Dictionary<string, object>
    {
        { CUSTOMER_ID_COLUMN_NAME, customerId },
        { ITEM_ID_COLUMN_NAME, itemId },
        { QUANTITY_COLUMN_NAME, quantity }
    };
}

static Dictionary<string, object> CreateDataForItemTable(string description)
{
    return new Dictionary<string, object>
    {
        { ITEM_DESCRIPTION_COLUMN_NAME, description }
    };
}

//The method is used to add a new row to the table
static void addDataToTable(Dictionary<string, object> data, DataTable table)
{
    DataRow row = table.NewRow();

    foreach (KeyValuePair<string, object> kvp in data)
    {
        row[kvp.Key] = kvp.Value;
    }

    table.Rows.Add(row);
}
Enter fullscreen mode Exit fullscreen mode

In the code sample above both the Item table and Customer table can join the Online Order table see entity relationship diagram below:
Entity relationship diagram from example code

Nice one!
With the relationships defined if we try to add something to our online order table with out it being in either customers or our item times first we should get errors as that data doesn't exist in the joining tables.

The below should throw an exception when we try to add a new row into the online order table as there is no primary key in the customer table with an id of 3, as such we get a foreign key constraint issue.
foreign key constraint error

That is the basics of data tables in C#. The DataTable object in C# is actually quite robust. There are many other features of it such as data binding and setting grid views for displaying the table. Overall if you need to have some control over data as you would have in a database(to lesser extent of course) the DataTable is a good way to do this in C# as it mimics some of the same features that a table in the database has.


Hashset

A Hashset is an collection that prevents duplicates by ensuring each element in the collection is unique. Kind of like keys in a dictionary.

As Hashset is instantiated in much the same way as a list or other generic collection:

HashSet<int> hashset = [1,2,3,4,5];
Enter fullscreen mode Exit fullscreen mode

Hashset elements
When I try to add a new element to the Hashset that already exist if doesn't add a new entry into the collection:

HashSet<int> hashset = [1,2,3,4,5];

Console.WriteLine($"Hashset elements: {string.Join(',', hashset)}");

int toAdd = 1;

Console.WriteLine($"Attempting to add {toAdd}");
hashset.Add(toAdd);

Console.WriteLine($"Hashset elements: {string.Join(',', hashset)}");
Enter fullscreen mode Exit fullscreen mode

Hashset elements after an attempted insert

The other thing I'd mention is that with Hashset they do have most of the same functionality as other generic collections such as adding and removing elements. The main use case for them is to ensure you aren't storing duplicate elements in a collection


I topic isn't something I see much of online and to be honest before exploring this topic and seeing the other data structures I might not have considered using most of them. In my day job a lot of the time when I need a go to collection it's either dictionaries, lists or arrays. Rarely had I seen Hashsets or DataTables. Maybe in some older applications but, I can't remember the last time I needed to implement a binary tree.

Thanks for reading my post.
Feel free to leave a like if you learned something new or found out something interesting and I'll see you in the next post >>

Top comments (0)