DEV Community

mzakzook
mzakzook

Posted on

Balanced BST - Part 1

Implementing a balanced binary search tree is a task I was challenged to accomplish by a friend of mine who is a senior engineer at Apple. He said that it would be a good exercise to better understand the mechanisms of how a binary search tree works and why it is a powerful data structure.

I first googled 'implement binary search tree javascript' and discovered several methods for implementing unbalanced BSTs. I read through the code and realized that, depending on the order of insertion, an unbalanced BST could effectively become more of a linked list. This is because if data was inserted in ascending or descending order, you would end up with a root node with child nodes on only one side. I wanted to find a more practical approach so I searched for a balanced solution and found very little. I soon realized that it is not a common task that would be completed in high-level languages like JavaScript because other lower-level languages perform much better for data driven tasks; languages such as Java or C/C++.

I went back to my friend and asked what specifically I should read about, and he suggested red-black trees & AVL Trees. In this post, I will discuss my introduction to red-black trees.

Red-black trees are utilized frequently because they efficiently perform searching, inserting and removing. According to Wikipedia, 'the balancing of the tree is not perfect, but it is good enough to allow it to guarantee searching in O(log n) time, where n is the total number of elements in the tree. The insertion and deletion operations, along with the tree rearrangement and recoloring, are also performed in O(log n) time' (Wikipedia - Red-black tree). The extra memory required is negligible compared to an unbalanced BST and search is improved from O(n) to O(log n).

As mentioned in the quote from Wikipedia, red-black trees are not perfectly balanced. This is interesting because they are still able to guarantee operations of O(log n) time. The general concept is that each node carries an extra bit, compared to an unbalanced BST, and that bit identifies if the node is red or black. The red and black designations prevent the tree from becoming too unbalanced, so that operations will always have a worst case scenario of O(log n) time. A node is painted red or black after insertion or removal of a node. The rearrangement and painting of nodes bubble up until the structure is consistent with the properties of a red-black tree.

According to Wikipedia, the five properties of a red-black tree are (Wikipedia - Red-black tree):

  1. Each node is either red or black.
  2. The root is black. This rule is sometimes omitted. Since the root can always be changed from red to black, but not necessarily vice versa, this rule has little effect on analysis.
  3. All leaves (NIL) are black.
  4. If a node is red, then both its children are black.
  5. Every path from a given node to any of its descendant NIL nodes goes through the same number of black nodes.

There are different methods for implementing a red-black tree and I am still working out the details of how each function comes together to allow for insertion, removal, rearrangement and recoloring. This past week was helpful in getting me to understand the complexity of implementing advanced data structures and was educational about the history of balanced BSTs and some of the computer scientists that have contributed to the advancement of tree architecture over the last ~40 years.

Top comments (0)