Solution Developed In:
The Question
For this article we will be covering Leetcode's '1584. Min Cost to Connect All Points' question. This question is very similar to the 684. Redundant Connection question. As we're going to use Union Find to solve this problem. If you have not solved 684. Redundant Connection question using Union Find yet, then you should do so by following this guide here.
Question:
You are given an array points representing integer coordinates of some points on a 2D-plane, where
points[i] = [xi, yi]
.
The cost of connecting two points[xi, yi]
and[xj, yj]
is the manhattan distance between them:|xi - xj|
+|yi - yj|
, where|val|
denotes the absolute value ofval
.
Input: points = [[0,0],[2,2],[3,10],[5,2],[7,0]]
Output: 20
Explanation: We can connect the points as shown above to get the minimum cost of 20.
Notice that there is a unique path between every pair of points.
Explaining The Question
This Question is rated Medium. Which is false. I consider this question to be a Hard question. As the Datastructre used to solve this question is rarely known and the specific algorithms to use (Kruskals Algorithm or Prims Algorithm) is also rarely seen. I think it would be impossible to solve this question if you hadn't encourted these algorithms / data structures / Minimum spanning tree algorithms. Nonetheless, this is a fantastic problem to solve.
What's expected of you is to use Union Find to solve this problem. Specifically, Union Find by Rank is expected. And given this structure, we will use Kruskals Algorithm to solve this problem.
We have been given a list of Nodes and Edges ([Node -> Edge]). Which forms a Graph, we need to connect this entire graph together at the minimal cost. This forms a Minimum Spanning Tree. The cost of a connection is determined by the Manhattan Distance between two nodes. So we need to connect all nodes to their closest neighbors.
Recommended Knowledge
- Graph Theory
- Union Find
- Union Find by Rank
- Path Compression
- Amortized Analysis
- Kruskals Algorithm
- Minimum Spanning Tree
- Manhattan Distance
- Priority Queue
- Heap
What do we know?
- All pairs are distinct.
- We need to connect all nodes to the cheapest connection as defined by the Manhattan Distance.
How we're going to do it:
We're going to use Union Find to solve this problem. Specifically, Union Find by Rank. We're going to use Kruskals Algorithm to create a Minimum Spanning Tree by connecting each node to their cheapest connection. We will union all nodes starting with the operation that is the cheapest.
Meaning, that prior to the union find we will create a list of Operations. An Operation means's that if we was to connect Node_1
to Node_2
, how much would it cost
? What this forms is an Array of Arrays that looks like this:
[
[1, 2, 1]
[2, 1, 1]
[3, 4, 2]
[4, 3, 2]
[5, 6, 3]
[6, 5, 3]
]
]
Where [Node_1, Node_2, Cost] is the operation. We sort this list of operations by the cost
. So we start with the cheapest connection and then attempt to connect Node_1 to Node_2 using UnionFind. Each time we union two nodes, we will add the cost of the connection to the total cost. Once we have unionised all nodes, we will have a Minimum Spanning Tree and thus our total cost. This is known as Kruskals Algorithm. We will use a Min Heap to find the order the cost of the connections. So we can always start with the cheapest connection.
While we're running through the list of operations, we will also tally the number of operations processed so we can exit the program early, as we could've already connected all the nodes and we're running redundant operations. We will also note the cost if the Union was successful.
Big O Notation:
Time Complexity: O(N x E) | Where n is the number of nodes in the Graph. As we're going to visit every node in the matrix. Where V is the number of nodes in the graph and E is the number of edges in the graph. Although we could easily argue it's O(n x e ^ 2) as we're going to visit every node for every node. As every node is a potential connection.
Space Complexity: O(N x E) | As we're going to store the list of operations in a Min Heap.
We did although implemented a Path Compression and Union by Rank technique to achieve a Amortized O(1) time complexity on our Union and Find functions. But as we will still need to iterate through the Nodes, we will still have a O(n x e) time complexity.
Could this be improved?
Yes, Prim's Algorithm is a better algorithm to solve this question. But I think Kruskals Algorithm is a better algorithm to solve this question as you're more likely to come across union find questions than Prim's Algorithm questions.
Leetcode Results:
See Submission Link:
Do note, this question wasn't developed very well for Javascript, as half the time this question won't even count as valid due to taking so long despite being a very valid
answer using Kruskals Algorithm.
The Solution
class UnionFind {
/**
* @summary We're going to generate a UnionFind data structure.
* Union Find is a special data-structure that can be used to form
* a disjoint set (A tree). For this solution, we're going to use
* the Rank variant of Union Find. All this mean's is that we keep
* track the number of nodes a given tree has. It allows us to merge
* trees that will require the minimal amount of work (Increases
* the Amortized Complexity).
*
* @param {Array} edges [[node, edge_to_connect_to], [node, edge_to_connect_to]]
*/
constructor(edges) {
// Create a array of Ranks (Index -> Tree Size)
// Looks Like: [1,1,1,1]
// (Each node is a tree of size 1 by default)
this.ranks = new Array(edges.length).fill(1);
// Create a array of Parents (Index -> Index of Parent)
// If we keep following the parent, we'll eventually find
// the root of the tree.
// Looks Like: [0,1,2,3]
// (Each node's parent is itself by default, as it's the root of it's tree)
this.parents = Array.from(Array(edges.length).keys());
}
/**
* @summary Find the root of a given node, we do this by asking the parents
* list 'Who's the parent of this node's index?', we repeat this until the parent
* of the node is itself. Meaning, we have reached the root of the tree.
* We have also utilized a concept called 'Path Compression'. This mean's
* instead of going up the tree one node at a time, we go up the tree 2 nodes
* at a time. Tree height should be very small due to the 'rank' concept.
*
* Time Complexity: Amortized O(1) (Best, tree height is small)
* : O(log n) (Average)
* : O(n) (Worst, linked list tree)
*
* Space Complexity: O(1) (Finding allocated no space)
*
* Technically, we rate an algorithm by it's worst case. Thus this is
* O(n) in time. But it's such a rare case that we don't care, so it's better
* to use the amortized case of O(1)
*
* @param {Number} index (Index of node in [Parents, Ranks, Edges])
* @return {Number} (Index of parent, the root node of the tree)
*/
find(index) {
// Get parent of node
let parent = this.parents[index];
// Keep getting parents until we reach the root of the tree
while (parent != this.parents[parent]) {
// Path Compression
parent = this.parents[this.parents[parent]];
}
return parent;
}
/**
* @summary Merge two trees by connecting the root of the tree by rank.
* What this mean's, is we're going to find the parents of both of the supplied
* nodes, and then figure out which tree is larger. We then connect the root of
* the smaller tree to the root of the larger tree.
* Why? Because, during the Find() operation, we want to reduce the number of
* steps required to get to the root of a given tree. By merging smaller into larger
* we won't need as many steps to find the root of a given parent.
*
* This is called Union by Rank. Rank meaning, size of a given tree. When you combine
* Path Compression and Union by Rank, you get a amortized O(1) time complexity.
*
* Time and Space Complexity is the same as Find() as we utilise that function.
*
* @param {Number} n1 (Index of node 1)
* @param {Number} n2 (Index of node 2)
* @return {Boolean} (False if nodes are already in the same tree)
*/
union(n1, n2) {
// Find the parents of each node.
const n1_parent = this.find(n1);
const n2_parent = this.find(n2);
// Are the nodes already in the same tree?
// REDUNDANT CONNECTION!!!
if (n1_parent === n2_parent) return false;
// Union by rank, merge smallest into largest.
if (this.ranks[n1_parent] > this.ranks[n2_parent]) {
// Update parent and ranks
this.parents[n2_parent] = n1_parent;
this.ranks [n2_parent] += this.ranks[n1_parent];
} else {
// Merge n1 into n2
this.parents[n1_parent] = n2_parent;
this.ranks [n1_parent] += this.ranks[n2_parent];
}
// Successfully merged. Ranks and parents updated
return true;
}
}
/**
* @param {number[][]} points
* @return {number}
*/
var minCostConnectPoints = function (points) {
// We're going to perform Kruskal's algorithm to find the minimum cost of connecting all the points.
// Which results in a minimum spanning tree. (MST). Kruskal's algorithm is a greedy algorithm,
// that connects a node with another node based on the smallest distance. So we always
// connect 2 nodes together knowing that it's the smallest distance.
// We're going to create a list of possible operations, Node -> Closest Node.
// We're going to union these 2 nodes by rank and note the cost. We run through all
// the cheapest operations and connect the nodes together. We then return the cost once
// we have connected all the nodes.
// Base case
if (points.length === 1) return 0;
// STAGE 1
// Create a list of operations
// Node -> [All Nodes except itself] | Cost
// As all nodes are a candidate for connecting. Once created, we sort our operations by cost.
// as in Kruskal's algorithm, we always start by connecting the cheapest nodes together.
// We will use a MinHeap to achieve this. [Cost (Priority)] -> [Node, Vertex]
const node_edge_cost = new MinPriorityQueue();
// Prevent Duplicate Operations (Not Needed)
const operation_set = new Set();
/**
* @summary: Manhattan distance between 2 nodes on this graph.
* Time : O(1)
* Space : O(1)
*
* @param {number} point1
* @param {number} point2
* @return {number} Manhattan distance
*/
const distance = (point1, point2) => {
return Math.abs(point1[0] - point2[0]) + Math.abs(point1[1] - point2[1]);
};
// Populate the heap with all possible
// operations. Except for itself. We do this because any node
// could be the closest to said node.
for (let i = 0; i < points.length; i++) {
for (let j = 0; j < points.length; j++) {
if (i != j && !operation_set.has(`${j}-${i}`)) {
// Add the operation to the adjacency list
// [Node, Possible Connection] => Operation Cost
node_edge_cost.enqueue([i,j], distance(points[i], points[j]))
}
}
}
// Unlock our Union Find
const UF = new UnionFind(points);
// Unionise all nodes
// with their cheapest node and note it's cost
// Merge into the smallest tree
let union_cost = 0;
let number_of_connections = 0;
// Starting at the smallest operation, unionise all nodes to
// their closest connection. If union is successful, add the cost. (Distance) (Priority in heap)
// We also keep track of the number of unions that occur, as many connections
// will accidentally be duplicates. It mean's we can exit the loop early. Preventing
// lots of unnecessary work.
while (node_edge_cost.size()){
// Get the cheapest operation from the heap
const node = node_edge_cost.dequeue();
const vertex = node.element[0];
const edge = node.element[1];
// Early exit if we've already connected all the nodes.
if (number_of_connections === points.length - 1) return union_cost;
// Unionise the nodes, add the cost.
if (UF.union(vertex, edge)) {
number_of_connections += 1;
union_cost += node.priority;
}
}
// Optimisations Made (Not Required, but increases the Amortized Complexity)
// Union Find by Rank
// Find by Path Compression
// Early Exit by connection counting.
// Duplicate Operations Check. (Prevents extra node finding)
// We also used a heap to prevent a O(n^2) time of sorting.
// Time and Space: O(n^2) due to building the adjacency list.
return union_cost;
};
Top comments (0)