Union-Find
Union-Find (also called Disjoint Set Union) tracks which nodes belong to the same connected component, supporting merge and membership queries in near-constant time using two optimizations.
- Replaces repeated BFS/DFS connectivity checks with O(α(n)) amortized operations — effectively O(1) — using path compression and union by rank
- Two operations: find (returns the root representative of a node's component) and union (merges two components, returns false if they were already connected)
- Reach for it when the problem involves counting connected components, detecting cycles in undirected graphs, or dynamically connecting nodes as edges are added
Quick Revision - Must Do Problems
If short on time, solve only these. They cover ~90% of Union-Find patterns.
| # | Problem | Pattern | Why It's Essential | Time to Solve |
|---|---|---|---|---|
| 1 | Number of Connected Components | Count | Basic union-find — count successful unions | 15 min |
| 2 | Redundant Connection | Cycle Detection | Return the edge whose union() returns false | 20 min |
| 3 | Number of Provinces | Count | Same pattern, adj matrix input | 15 min |
Practice Tips
How to Identify a Union-Find Problem
- "How many connected groups / components?"
- "Is there a cycle in this undirected graph?"
- "Are nodes X and Y in the same group?"
- "Add edges one at a time — after each addition, how many components remain?"
- BFS/DFS also works for static graphs; reach for Union-Find when edges are added dynamically or when you need repeated connectivity queries
Union-Find vs. Other Techniques
| If you see... | Use... | Not... |
|---|---|---|
| Connected components (static graph) | Union-Find or DFS | Union-Find not required, but cleaner |
| Cycle detection in undirected graph | Union-Find | DFS with visited[] (either works) |
| Cycle detection in directed graph | DFS with 3 states | Union-Find (does NOT work for directed!) |
| Edges added dynamically, query connectivity | Union-Find | Re-running BFS/DFS each time |
Interview Approach
- Initialize
parent[i] = iandrank[i] = 0for all nodes find(x): follow parent pointers to the root; apply path compression —parent[x] = find(parent[x])union(x, y): find both roots; if same root, they're already connected (return false / cycle detected); otherwise attach smaller-rank tree under larger-rank- For component count: start with n components and decrement by 1 on every successful union
- For cycle detection: return the edge where
union()returns false (the two nodes were already connected)
1 Main Pattern
Union-Find with Path Compression + Union by Rank
int[] parent, rank;
void init(int n) {
parent = new int[n];
rank = new int[n];
for (int i = 0; i < n; i++) {
parent[i] = i; // every node is its own root
}
}
int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]); // path compression
}
return parent[x];
}
boolean union(int x, int y) {
int rx = find(x), ry = find(y);
if (rx == ry) {
return false; // already connected
}
if (rank[rx] < rank[ry]) { int t = rx; rx = ry; ry = t; } // rx = larger rank
parent[ry] = rx;
if (rank[rx] == rank[ry]) {
rank[rx]++;
}
return true; // successfully merged
}
General Templates
Component Count Template
// Start with n components; decrement on every successful union
int components = n;
UnionFind uf = new UnionFind(n);
for (int[] edge : edges) {
if (uf.union(edge[0], edge[1])) components--;
}
Cycle Detection Template
// The first edge where union() returns false creates a cycle
for (int[] edge : edges) {
if (!uf.union(edge[0], edge[1])) return edge; // this edge is redundant
}
Problems in this guide
| # | Title |
|---|---|
| 1 | Number of Connected Components in Undirected Graph ⭐ |
| 2 | Redundant Connection ⭐ |
| 3 | Number of Provinces ⭐ |
Problem 1: Number of Connected Components in Undirected Graph ⭐ MUST DO
| Difficulty | Time to Solve | Pattern |
|---|---|---|
| Medium | 15 min | Union-Find (Component Count) |
Given n nodes and a list of edges, return the number of connected components.
Example:
Input: n = 5, edges = [[0,1],[1,2],[3,4]] → 2
Input: n = 5, edges = [[0,1],[1,2],[2,3],[3,4]] → 1
Key Insight:
- Initialize each node as its own component (count = n). For every edge, attempt a union. A successful union means two previously separate components merged — decrement the count. After all edges, the count is the number of remaining components.
Solution:
public int countComponents(int n, int[][] edges) {
int[] parent = new int[n], rank = new int[n];
for (int i = 0; i < n; i++) {
parent[i] = i;
}
int components = n;
for (int[] e : edges) {
if (union(parent, rank, e[0], e[1])) components--;
}
return components;
}
private int find(int[] parent, int x) {
if (parent[x] != x) {
parent[x] = find(parent, parent[x]);
}
return parent[x];
}
private boolean union(int[] parent, int[] rank, int x, int y) {
int rx = find(parent, x), ry = find(parent, y);
if (rx == ry) {
return false;
}
if (rank[rx] < rank[ry]) {
int t = rx; rx = ry; ry = t;
}
parent[ry] = rx;
if (rank[rx] == rank[ry]) {
rank[rx]++;
}
return true;
}
Complexity:
| Time | O(E × α(n)) ≈ O(E) — α(n) is the inverse Ackermann function, effectively constant |
| Space | O(n) — parent and rank arrays |
Problem 2: Redundant Connection ⭐ MUST DO
| Difficulty | Time to Solve | Pattern |
|---|---|---|
| Medium | 20 min | Union-Find (Cycle Detection) |
Given a tree with one extra edge added (making one cycle), return that redundant edge. If multiple answers exist, return the one that appears last.
Example:
Input: [[1,2],[1,3],[2,3]] → [2,3]
Input: [[1,2],[2,3],[3,4],[1,4],[1,5]] → [1,4]
Key Insight:
- Process edges one by one. A tree with n nodes has exactly n-1 edges — adding one creates exactly one cycle. The first edge whose two endpoints are already in the same component (union returns false) is the redundant one. Since we want the last such edge if there are ties, processing in order and returning the last failure handles it naturally.
Solution:
public int[] findRedundantConnection(int[][] edges) {
int n = edges.length;
int[] parent = new int[n + 1], rank = new int[n + 1];
for (int i = 0; i <= n; i++) {
parent[i] = i;
}
for (int[] e : edges) {
if (!union(parent, rank, e[0], e[1])) return e;
}
return new int[]{};
}
private int find(int[] parent, int x) {
if (parent[x] != x) {
parent[x] = find(parent, parent[x]);
}
return parent[x];
}
private boolean union(int[] parent, int[] rank, int x, int y) {
int rx = find(parent, x), ry = find(parent, y);
if (rx == ry) {
return false;
}
if (rank[rx] < rank[ry]) {
int t = rx; rx = ry; ry = t;
}
parent[ry] = rx;
if (rank[rx] == rank[ry]) {
rank[rx]++;
}
return true;
}
Complexity:
| Time | O(E × α(n)) ≈ O(E) |
| Space | O(n) |
Problem 3: Number of Provinces ⭐ MUST DO
| Difficulty | Time to Solve | Pattern |
|---|---|---|
| Medium | 15 min | Union-Find (Component Count) |
Given an n×n adjacency matrix where isConnected[i][j] = 1 means cities i and j are directly connected, return the number of provinces (groups of directly or indirectly connected cities).
Example:
Input: [[1,1,0],[1,1,0],[0,0,1]] → 2
Input: [[1,0,0],[0,1,0],[0,0,1]] → 3
Key Insight:
- Same connected-components count as Problem 1, but the graph is given as an adjacency matrix rather than an edge list. Iterate only the upper triangle (
j > i) to avoid processing each edge twice. The union-find logic is identical.
Solution:
public int findCircleNum(int[][] isConnected) {
int n = isConnected.length;
int[] parent = new int[n], rank = new int[n];
for (int i = 0; i < n; i++) {
parent[i] = i;
}
int components = n;
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++)
}
if (isConnected[i][j] == 1 && union(parent, rank, i, j)) {
components--;
}
return components;
}
private int find(int[] parent, int x) {
if (parent[x] != x) {
parent[x] = find(parent, parent[x]);
}
return parent[x];
}
private boolean union(int[] parent, int[] rank, int x, int y) {
int rx = find(parent, x), ry = find(parent, y);
if (rx == ry) {
return false;
}
if (rank[rx] < rank[ry]) {
int t = rx; rx = ry; ry = t;
}
parent[ry] = rx;
if (rank[rx] == rank[ry]) {
rank[rx]++;
}
return true;
}
Complexity:
| Time | O(n²) — must scan all matrix cells |
| Space | O(n) |
Quick Reference Table
| # | Problem | Pattern | Key Technique | Time |
|---|---|---|---|---|
| 1 | Number of Connected Components ⭐ | Component Count | Start n; decrement on successful union | O(E) |
| 2 | Redundant Connection ⭐ | Cycle Detection | Return edge where union() returns false | O(E) |
| 3 | Number of Provinces ⭐ | Component Count | Adj matrix — upper triangle only, same logic | O(n²) |
When to Use Each Pattern
Component Count
- Need the total number of isolated groups after all edges are added
- Initialize count = n; decrement once per successful
union()call - Works for both edge-list input and adjacency-matrix input
Cycle Detection
- Process edges in order; the first edge where
union()returns false creates a cycle - Only valid for undirected graphs — directed cycle detection requires DFS with 3-state visited array
- When the problem asks to return the redundant edge, process in order (last qualifying edge is returned naturally)
Common Mistakes to Avoid
Structural Mistakes
- Using Union-Find on directed graphs —
union(x, y)merges components symmetrically; it cannot model directed edges. Use DFS with 3-state visited array for directed cycle detection (Course Schedule pattern) - Comparing raw node values instead of roots — always call
find()on both nodes before comparing; comparingx == yinstead offind(x) == find(y)gives wrong results when nodes are in the same component but have different parents
Initialization Mistakes
- Missing node 0 or off-by-one — if nodes are 1-indexed (common in LeetCode), allocate
parent[n+1]and initializeparent[i] = iforifrom 0 to n - Forgetting rank initialization —
rank[]starts at 0 for all nodes; forgetting to initialize it leaves garbage values that corrupt union-by-rank
Optimization Mistakes
- Omitting path compression — without
parent[x] = find(parent[x]), the tree can degenerate to a chain andfind()becomes O(n) per call; path compression is what makes it near-O(1)