Resources:
Takeaways:
- The union-find data structure (also known as disjoint-set) keeps track of elements that are partitioned into disjoint subsets (also called components).
- When elements are initially added to union-find, they are all considered members of their own component. That means if we create a union-find with 3 elements it will also contain 3 components (each component being a single element). This is achieved by having each element specify itself as its parent (self-referencing).
- Union-find has three main operations:
MakeSet()
,Union()
, &Find()
.-
MakeSet()
takes an element, adds it to the underlying collection/tree, gives it either arank
of 0 orsize
of 1 (explained later on), and specifies a parent pointer to itself (which creates a new component). -
Find()
takes an elementx
and traverses it's parent chain.Find()
tells us what componentx
is in by finding the root element of that component. If two elements have the same root, they are in the same component. -
Union()
takes two elementsx & y
and merges the components they belong to together.Union()
leveragesFind()
to determine what componentsx & y
are in. If the roots are the same,x & y
are in the same component, and no action is taken. If the roots are different, this meansx & y
are in separate components. These components are merged by pointing the root of one of the components to the root of the other (one root becomes the parent of the other).- How do we determine which component is the one to merge into the other? By using
rank
orsize
mentioned earlier.
- How do we determine which component is the one to merge into the other? By using
-
- Union by size merges the smallest component (fewest elements) into the larger one.
-
Union by rank merges the component with the shorter tree into the taller one. Each componenet has a rank, which starts as 0. If two components are unioned and they have the same rank then the resulting component's rank is increased by 1. If the unioned components have a different rank, then the resulting component's rank is equal to the highest rank between the two.
- We use rank instead of tree height or depth because of a techninque called path compression that changes the height/depth of the components over time
- Path compression is utilized in the
Find()
operation to flatten a components tree. It does this by making every element of the component point to the components root (every element's parent is now the root of the component). This makes theFind()
operation faster for that component, as it has a flatter chain to traverse when looking for a component's root. - Union-find is often implemented with either an array or using a more object-oriented approach. In my implementations I use an array.
- Union-find is a useful data structure when used on graphs. Union-find can be used for things such as efficiently connecting vertices, finding connected components, & determining minimum spanning tree.
- Using path compression, both union-find using
rank
orsize
have an amortized time complexity ofO(n)
forUnion()
&Find()
operations. Space isO(n)
. Without path compression, bothUnion()
&Find()
operations have a complexity ofO(log n)
.
Below are array based union-find implementations by size & rank:
As always, if you found any errors in this post please let me know!
Top comments (0)