DEV Community

JB
JB

Posted on

2 1

Union-find (Disjoint-set)

Resources:

  1. Youtube playlist
  2. Video explanation
  3. Princeton explanation & implementations
  4. Wikipedia article

Takeaways:

  • The union-find data structure (also known as disjoint-set) keeps track of elements that are partitioned into disjoint subsets (also called components).
  • When elements are initially added to union-find, they are all considered members of their own component. That means if we create a union-find with 3 elements it will also contain 3 components (each component being a single element). This is achieved by having each element specify itself as its parent (self-referencing).
  • Union-find has three main operations: MakeSet(), Union(), & Find().
    • MakeSet() takes an element, adds it to the underlying collection/tree, gives it either a rank of 0 or size of 1 (explained later on), and specifies a parent pointer to itself (which creates a new component).
    • Find() takes an element x and traverses it's parent chain. Find() tells us what component x is in by finding the root element of that component. If two elements have the same root, they are in the same component.
    • Union() takes two elements x & y and merges the components they belong to together. Union() leverages Find() to determine what components x & y are in. If the roots are the same, x & y are in the same component, and no action is taken. If the roots are different, this means x & y are in separate components. These components are merged by pointing the root of one of the components to the root of the other (one root becomes the parent of the other).
      • How do we determine which component is the one to merge into the other? By using rank or size mentioned earlier.
  • Union by size merges the smallest component (fewest elements) into the larger one.
  • Union by rank merges the component with the shorter tree into the taller one. Each componenet has a rank, which starts as 0. If two components are unioned and they have the same rank then the resulting component's rank is increased by 1. If the unioned components have a different rank, then the resulting component's rank is equal to the highest rank between the two.
    • We use rank instead of tree height or depth because of a techninque called path compression that changes the height/depth of the components over time
  • Path compression is utilized in the Find() operation to flatten a components tree. It does this by making every element of the component point to the components root (every element's parent is now the root of the component). This makes the Find() operation faster for that component, as it has a flatter chain to traverse when looking for a component's root.
  • Union-find is often implemented with either an array or using a more object-oriented approach. In my implementations I use an array.
  • Union-find is a useful data structure when used on graphs. Union-find can be used for things such as efficiently connecting vertices, finding connected components, & determining minimum spanning tree.
  • Using path compression, both union-find using rank or size have an amortized time complexity of O(n) for Union() & Find() operations. Space is O(n). Without path compression, both Union() & Find() operations have a complexity of O(log n).

Below are array based union-find implementations by size & rank:

using System;
using System.Collections.Generic;
using System.Diagnostics;
public class Program
{
public class UnionFindBySize
{
// size of component
private int[] size;
// parent of item
private int[] parent;
private int componentCount;
public int NumComponents { get { return componentCount; } }
public UnionFindBySize(int n)
{
componentCount = n;
parent = new int[n];
size = new int[n];
for (int i = 0; i < n; i++)
{
// i is it's own parent
// everything is self referencing at first
parent[i] = i;
// all self referencing, so component size is always 1
size[i] = 1;
}
}
public int Find(int input)
{
var root = input;
while (root != parent[root])
{
root = parent[root];
}
// no path compression
// return root;
// path compression:
// point all of input's ancestors to root
// aka make root parent of ancestors
while (input != root)
{
int next = parent[input];
parent[input] = root;
input = next;
}
// representative item of component
return root;
}
public void Union(int a, int b)
{
var rootA = Find(a);
var rootB = Find(b);
// Is already Connected?
if (rootA == rootB)
{
return;
}
// merge smaller component into large
if (size[rootA] < size[rootB])
{
size[rootB] += size[rootA];
parent[rootA] = rootB;
}
else
{
size[rootA] += size[rootB];
parent[rootB] = rootA;
}
// merged so overall # decreased
componentCount--;
}
public bool IsConnected(int a, int b)
{
return Find(a) == Find(b);
}
}
public class UnionFindByRank
{
private int[] rank;
private int[] parent;
private int componentCount;
public int NumComponents { get { return componentCount; } }
public UnionFindByRank(int n)
{
componentCount = n;
rank = new int[n];
parent = new int[n];
for (int i = 0; i < n; i++)
{
parent[i] = i;
rank[i] = 0;
}
}
public int Find(int input)
{
var root = input;
while (root != parent[root])
{
root = parent[root];
}
// no path compression
// return root;
// path compression
while (input != root)
{
var next = parent[input];
parent[input] = root;
input = next;
}
return root;
}
public void Union(int a, int b)
{
var rootA = Find(a);
var rootB = Find(b);
if (rootA == rootB)
{
return;
}
if (rank[rootA] <= rank[rootB])
{
parent[rootA] = rootB;
if (rank[rootA] == rank[rootB])
{
rank[rootB]++;
}
}
else
{
parent[rootB] = rootA;
}
componentCount--;
}
public bool IsConnected(int a, int b)
{
return Find(a) == Find(b);
}
}
public static void Main()
{
var ufRank = new UnionFindByRank(10);
var ufSize = new UnionFindBySize(10);
Debug.Assert(ufSize.NumComponents == 10);
Debug.Assert(ufRank.NumComponents == 10);
ufSize.Union(4, 3);
ufSize.Union(3, 8);
ufSize.Union(6, 5);
ufSize.Union(9, 4);
ufSize.Union(2, 1);
Debug.Assert(ufSize.NumComponents == 5);
ufSize.Union(8, 9);
Debug.Assert(ufSize.NumComponents == 5);
ufSize.Union(5, 0);
ufSize.Union(7, 2);
ufSize.Union(6, 1);
Debug.Assert(ufSize.NumComponents == 2);
ufSize.Union(1, 0);
ufSize.Union(6, 7);
Debug.Assert(ufSize.NumComponents == 2);
ufRank.Union(4, 3);
ufRank.Union(3, 8);
ufRank.Union(6, 5);
ufRank.Union(9, 4);
ufRank.Union(2, 1);
Debug.Assert(ufRank.NumComponents == 5);
ufRank.Union(8, 9);
Debug.Assert(ufRank.NumComponents == 5);
ufRank.Union(5, 0);
ufRank.Union(7, 2);
ufRank.Union(6, 1);
Debug.Assert(ufRank.NumComponents == 2);
ufRank.Union(1, 0);
ufRank.Union(6, 7);
Debug.Assert(ufRank.NumComponents == 2);
}
}
view raw UnionFind.cs hosted with ❤ by GitHub

As always, if you found any errors in this post please let me know!

Hostinger image

Get n8n VPS hosting 3x cheaper than a cloud solution

Get fast, easy, secure n8n VPS hosting from $4.99/mo at Hostinger. Automate any workflow using a pre-installed n8n application and no-code customization.

Start now

Top comments (0)

Billboard image

Create up to 10 Postgres Databases on Neon's free plan.

If you're starting a new project, Neon has got your databases covered. No credit cards. No trials. No getting in your way.

Try Neon for Free →

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay