Union-FindDisjoint SetConnected ComponentsCycle DetectionKruskal's MST

Union-Find (Disjoint Set Union) — The Complete Guide

Master Union-Find from first principles. Learn to group elements into dynamic sets, detect cycles, count connected components, and confidently solve every DSU-based interview question.

45 min read10 sections
01

Intuition from Scratch

Why does this pattern exist?

Imagine you're at a party and people keep introducing each other. "Alice knows Bob." "Bob knows Carol." Now Alice, Bob, and Carol are all in the same friend group — even though Alice and Carol never met directly. As more introductions happen, friend groups merge. At any point, you want to answer: "Are these two people in the same group?" and "How many groups are there?"

Union-Find (also called Disjoint Set Union or DSU) is a data structure built exactly for this. It supports two operations: union(a, b) — merge the groups containing a and b, and find(a) — determine which group a belongs to. Both operations run in nearly O(1) amortized time with two optimizations: path compression and union by rank.

Union-Find is the go-to data structure whenever you need to dynamically group elements and query connectivity. It's simpler than graph traversal (BFS/DFS) for connectivity problems, and it handles edges arriving one at a time — something BFS/DFS can't do efficiently without rebuilding the graph.

How It Works (The Basics)

Union-Find — Core Ideatext
Each element starts as its own group (parent = itself).

find(x): Follow parent pointers until you reach the root.
          The root is the "representative" of the group.

union(a, b): Find the roots of a and b.
             If different, make one root point to the other.
             Now both groups are merged.

Example: union(1, 2), union(3, 4), union(2, 3)

Step 1: union(1, 2)     Step 2: union(3, 4)     Step 3: union(2, 3)
  12                   12   34          12

                                                    34

find(1) = 1, find(4) = 1same group

Two optimizations make this nearly O(1):
  Path Compression: During find(), point every node directly to the root.
  Union by Rank:    Attach the shorter tree under the taller tree.

Analogies to Build Intuition

🤝

Friend Groups at a Party

Each person starts alone. When two people are introduced, their entire friend groups merge. To check if Alice and Dave know each other (transitively), you check if they're in the same group. Union-Find tracks these merging groups efficiently — each 'find' asks 'who's the leader of your group?' and each 'union' merges two groups under one leader.

🏝️

Islands Connected by Bridges

Imagine islands in an ocean. Each bridge connects two islands. As bridges are built, islands merge into larger landmasses. Union-Find tracks which islands are connected. 'Find' tells you which landmass an island belongs to. 'Union' merges two landmasses when a bridge is built between them.

🔌

Electrical Wiring

You have components that need to be connected. Each wire connects two components. After all wiring, you want to know: are these two components on the same circuit? How many separate circuits exist? Union-Find answers both questions — union when you add a wire, find to check connectivity, count roots for the number of circuits.

What kind of problems does it solve?

Union-Find is your go-to when:

  • You need to count connected components as edges are added
  • You need to check if two nodes are connected (same component)
  • You need to detect cycles in an undirected graph
  • You need to build a Minimum Spanning Tree (Kruskal's algorithm)
  • You need to group elements dynamically based on equivalence relations
  • You need to process edges/connections one at a time (online connectivity)

The core insight

Union-Find answers "are these two elements in the same group?" in nearly O(1). It's the fastest way to track dynamic connectivity. Whenever you see a problem about merging groups, checking connectivity, or counting components — and edges arrive one at a time — Union-Find is almost certainly the answer.

02

Pattern Recognition

Recognizing when Union-Find is the right tool is the most important skill. Here are the signals to look for and the traps to avoid.

Keywords & Signals in the Problem Statement

Use Union-Find when you see...

  • "Number of connected components" or "number of islands"
  • "Are these two nodes connected?" (dynamic connectivity)
  • "Detect a cycle in an undirected graph"
  • "Minimum spanning tree" (Kruskal's algorithm)
  • "Group elements by equivalence" (accounts merge, similar strings)
  • "Redundant connection" — find the edge that creates a cycle
  • "Earliest time when all nodes are connected"
  • "Make connected" — minimum edges to connect all components
  • Edges/connections arriving one at a time (online processing)
  • "Union" or "merge" in the problem description

Don't use Union-Find when...

  • You need shortest path — use BFS or Dijkstra
  • You need to traverse all nodes in a component — use BFS/DFS
  • The graph is directed — Union-Find is for undirected graphs only
  • You need to split groups (un-union) — Union-Find doesn't support this
  • The graph is static and you just need components once — BFS/DFS is simpler
  • You need topological sort — use DFS or Kahn's algorithm

Union-Find vs. Similar Patterns

PatternWhen to UseKey Difference
Union-Find (DSU)Dynamic connectivity, cycle detection, component counting, MSTNearly O(1) union/find — handles edges arriving one at a time
BFS/DFSTraversal, shortest path (unweighted), exploring all nodes in a componentO(V + E) per traversal — better when you need to visit nodes, not just check connectivity
Kruskal's (uses DSU)Minimum Spanning TreeSort edges by weight, use Union-Find to add edges without creating cycles
Topological SortOrdering in directed acyclic graphsFor directed graphs — Union-Find only works on undirected graphs
Graph ColoringBipartiteness check, 2-coloringBFS/DFS with colors — Union-Find can also solve bipartite checks with a weighted variant

The Union-Find question to always ask

When you see a connectivity problem, ask: "Am I processing edges one at a time and need to check/update connectivity after each edge?" If yes → Union-Find. If you just need to find components once on a static graph → BFS/DFS is simpler. Union-Find shines when the graph is built incrementally.

03

Brute Force → Optimized Thinking

Let's walk through the thinking evolution using a concrete example: Number of Connected Components in an Undirected Graph — given n nodes and a list of edges, return the number of connected components.

1

Brute Force — BFS/DFS from Every Unvisited Node

O(V + E) time · O(V + E) space

Build an adjacency list, then run BFS/DFS from each unvisited node. Each traversal discovers one component. Count the number of traversals. This works but requires building the full graph upfront.

BFS/DFS Approachtypescript
function countComponentsBFS(n: number, edges: number[][]): number {
  // Build adjacency list — O(V + E)
  const adj: number[][] = Array.from({ length: n }, () => []);
  for (const [u, v] of edges) {
    adj[u].push(v);
    adj[v].push(u);
  }

  const visited = new Set<number>();
  let components = 0;

  for (let i = 0; i < n; i++) {
    if (!visited.has(i)) {
      components++;
      // BFS to mark all nodes in this component
      const queue = [i];
      visited.add(i);
      while (queue.length > 0) {
        const node = queue.shift()!;
        for (const neighbor of adj[node]) {
          if (!visited.has(neighbor)) {
            visited.add(neighbor);
            queue.push(neighbor);
          }
        }
      }
    }
  }

  return components;
}
// Works fine for static graphs. But what if edges arrive one at a time
// and you need the component count after EACH edge? You'd need to
// rebuild and re-traverse — O((V + E) × E) total. Way too slow.
2

Naive Union-Find — No Optimizations

O(n) per find · O(n) per union

Each element has a parent pointer. Find follows the chain to the root. Union connects two roots. But without optimizations, the tree can become a long chain (like a linked list), making find O(n).

Naive Union-Findtypescript
class NaiveUF {
  parent: number[];

  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i);
  }

  find(x: number): number {
    while (this.parent[x] !== x) {
      x = this.parent[x]; // follow parent chain — O(n) worst case
    }
    return x;
  }

  union(a: number, b: number): void {
    const rootA = this.find(a);
    const rootB = this.find(b);
    if (rootA !== rootB) {
      this.parent[rootA] = rootB; // just attach one root to the other
    }
  }
}
// Problem: Without optimizations, the tree can degenerate into a chain.
// find() becomes O(n), making total complexity O(n × E).
3

Optimal — Path Compression + Union by Rank

O(α(n)) ≈ O(1) amortized per operation

Two optimizations make Union-Find nearly O(1): (1) Path compression — during find(), point every node directly to the root, flattening the tree. (2) Union by rank — attach the shorter tree under the taller one, keeping trees balanced. Together, they give O(α(n)) amortized time, where α is the inverse Ackermann function — effectively constant.

Optimized Union-Findtypescript
class UnionFind {
  parent: number[];
  rank: number[];
  count: number; // number of components

  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.rank = new Array(n).fill(0);
    this.count = n;
  }

  find(x: number): number {
    if (this.parent[x] !== x) {
      this.parent[x] = this.find(this.parent[x]); // path compression
    }
    return this.parent[x];
  }

  union(a: number, b: number): boolean {
    const rootA = this.find(a);
    const rootB = this.find(b);
    if (rootA === rootB) return false; // already connected

    // Union by rank — attach shorter tree under taller
    if (this.rank[rootA] < this.rank[rootB]) {
      this.parent[rootA] = rootB;
    } else if (this.rank[rootA] > this.rank[rootB]) {
      this.parent[rootB] = rootA;
    } else {
      this.parent[rootB] = rootA;
      this.rank[rootA]++;
    }

    this.count--;
    return true; // merged two components
  }

  connected(a: number, b: number): boolean {
    return this.find(a) === this.find(b);
  }
}
// α(n) ≤ 4 for any practical n (up to 2^65536).
// Effectively O(1) per operation. Total for E edges: O(E × α(n)) ≈ O(E).

Why Do the Optimizations Work?

1

Path compression flattens the tree

Every time you call find(x), you make every node on the path point directly to the root. Next time you call find() on any of those nodes, it's O(1). The tree becomes almost flat after a few operations.

2

Union by rank keeps trees balanced

By always attaching the shorter tree under the taller one, you prevent the tree from degenerating into a chain. The maximum height is O(log n) even without path compression. With both optimizations, it's O(α(n)) ≈ O(1).

3

Together they give inverse Ackermann time

The inverse Ackermann function α(n) grows so slowly that it's ≤ 4 for any n up to 2^65536. For all practical purposes, each operation is O(1). This makes Union-Find one of the most efficient data structures in existence.

The thinking process matters more than the code

In interviews, walk through this progression: "BFS/DFS works for static graphs but can't handle edges arriving one at a time efficiently. Naive Union-Find has O(n) find. With path compression and union by rank, each operation is amortized O(α(n)) ≈ O(1)." This shows the interviewer you understand the evolution from brute force to optimal.

04

Core Templates

Union-Find problems follow a few recurring templates. Memorize the base class and these usage patterns — most problems are variations.

Template 1: The Union-Find Class (Reusable)

This is the class you'll write in every Union-Find problem. Memorize it — it's always the same.

Union-Find Class — Memorize Thistypescript
class UnionFind {
  parent: number[];
  rank: number[];
  count: number;

  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.rank = new Array(n).fill(0);
    this.count = n; // each element is its own component
  }

  find(x: number): number {
    if (this.parent[x] !== x) {
      this.parent[x] = this.find(this.parent[x]); // path compression
    }
    return this.parent[x];
  }

  union(a: number, b: number): boolean {
    const rootA = this.find(a);
    const rootB = this.find(b);
    if (rootA === rootB) return false; // already same group

    if (this.rank[rootA] < this.rank[rootB]) {
      this.parent[rootA] = rootB;
    } else if (this.rank[rootA] > this.rank[rootB]) {
      this.parent[rootB] = rootA;
    } else {
      this.parent[rootB] = rootA;
      this.rank[rootA]++;
    }

    this.count--;
    return true;
  }

  connected(a: number, b: number): boolean {
    return this.find(a) === this.find(b);
  }
}
// Variation: use "size" instead of "rank" (union by size)
// — attach smaller tree under larger, track component sizes.

Template 2: Count Connected Components

The most common usage. Create a Union-Find, process all edges, and the component count is tracked automatically.

Count Components Templatetypescript
function countComponents(n: number, edges: number[][]): number {
  const uf = new UnionFind(n);

  for (const [u, v] of edges) {
    uf.union(u, v);
  }

  return uf.count;
}
// Each union that merges two different components decrements count.
// Final count = number of connected components.
// When to use: Number of Provinces, Number of Islands (with coordinate mapping),
// Number of Connected Components

Template 3: Cycle Detection in Undirected Graph

If union(a, b) returns false (they're already in the same component), the edge (a, b) creates a cycle. This is the foundation of Kruskal's MST algorithm.

Cycle Detection Templatetypescript
function findRedundantConnection(edges: number[][]): number[] {
  const n = edges.length;
  const uf = new UnionFind(n + 1); // 1-indexed nodes

  for (const [u, v] of edges) {
    if (!uf.union(u, v)) {
      return [u, v]; // this edge creates a cycle!
    }
  }

  return []; // no cycle found
}
// Key insight: if find(u) === find(v) BEFORE adding edge (u, v),
// then u and v are already connected — adding this edge creates a cycle.
// When to use: Redundant Connection, Graph Valid Tree, Kruskal's MST

Template 4: Union-Find on a 2D Grid

For grid problems (like Number of Islands), map 2D coordinates to 1D indices: index = row × cols + col. Then use standard Union-Find.

2D Grid Union-Find Templatetypescript
function numIslands(grid: string[][]): number {
  const rows = grid.length, cols = grid[0].length;
  const uf = new UnionFind(rows * cols);
  let waterCount = 0;

  const dirs = [[0, 1], [1, 0]]; // only right and down to avoid duplicates

  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      if (grid[r][c] === '0') {
        waterCount++;
        continue;
      }
      const idx = r * cols + c;
      for (const [dr, dc] of dirs) {
        const nr = r + dr, nc = c + dc;
        if (nr < rows && nc < cols && grid[nr][nc] === '1') {
          uf.union(idx, nr * cols + nc);
        }
      }
    }
  }

  return uf.count - waterCount;
}
// Map (row, col) → row * cols + col for 1D indexing.
// Only check right and down neighbors to avoid double-counting.
// Subtract water cells from the component count.

Which template to pick?

Ask yourself: (1) Am I counting components? → Template 2. (2) Am I detecting a cycle or finding a redundant edge? → Template 3. (3) Is the problem on a 2D grid? → Template 4. All of them use the same Union-Find class (Template 1).

05

Variations & Sub-Patterns

Union-Find isn't a single trick — it's a versatile tool with several important variations.

VariationStrategyClassic Example
Basic ConnectivityUnion edges, count components or check connectedNumber of Provinces, Number of Connected Components
Cycle DetectionIf union returns false, the edge creates a cycleRedundant Connection, Graph Valid Tree
Kruskal's MSTSort edges by weight, union if not already connectedMin Cost to Connect All Points, Connecting Cities With Minimum Cost
Union by SizeTrack component sizes, attach smaller under largerAccounts Merge, Largest Component Size by Common Factor
2D Grid DSUMap (row, col) to 1D index, union adjacent land cellsNumber of Islands, Number of Islands II (online)
Weighted / Ranked DSUTrack relative weights between nodes (e.g., ratios)Evaluate Division, Swim in Rising Water
Online ConnectivityProcess edges one at a time, answer queries after eachEarliest Moment When Everyone Becomes Friends

Problems mentioned above

Deep Dive: Kruskal's MST — Min Cost to Connect All Points

Given n points, connect all of them with minimum total Manhattan distance. This is Kruskal's algorithm: sort all possible edges by weight, then greedily add edges that connect different components (using Union-Find to check).

Kruskal's MST with Union-Findtypescript
function minCostConnectPoints(points: number[][]): number {
  const n = points.length;
  const edges: [number, number, number][] = []; // [weight, u, v]

  // Generate all edges with Manhattan distance
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const dist = Math.abs(points[i][0] - points[j][0])
                 + Math.abs(points[i][1] - points[j][1]);
      edges.push([dist, i, j]);
    }
  }

  // Sort edges by weight (ascending)
  edges.sort((a, b) => a[0] - b[0]);

  const uf = new UnionFind(n);
  let totalCost = 0;
  let edgesUsed = 0;

  for (const [weight, u, v] of edges) {
    if (uf.union(u, v)) {
      totalCost += weight;
      edgesUsed++;
      if (edgesUsed === n - 1) break; // MST has n-1 edges
    }
  }

  return totalCost;
}
// Time: O(E log E) for sorting + O(E × α(n)) for unions
// Space: O(E + n)
// Key: Kruskal's = sort edges + Union-Find to avoid cycles.
// MST is complete when we've added n-1 edges.

Deep Dive: Accounts Merge — Union-Find with Grouping

Given accounts where each has a name and emails, merge accounts that share any email. This is a Union-Find problem where emails are the elements and accounts are merged when they share an email.

Accounts Mergetypescript
function accountsMerge(accounts: string[][]): string[][] {
  const uf = new UnionFind(accounts.length);
  const emailToAccount = new Map<string, number>(); // email → account index

  // Union accounts that share emails
  for (let i = 0; i < accounts.length; i++) {
    for (let j = 1; j < accounts[i].length; j++) {
      const email = accounts[i][j];
      if (emailToAccount.has(email)) {
        uf.union(i, emailToAccount.get(email)!);
      } else {
        emailToAccount.set(email, i);
      }
    }
  }

  // Group emails by root account
  const groups = new Map<number, Set<string>>();
  for (let i = 0; i < accounts.length; i++) {
    const root = uf.find(i);
    if (!groups.has(root)) groups.set(root, new Set());
    for (let j = 1; j < accounts[i].length; j++) {
      groups.get(root)!.add(accounts[i][j]);
    }
  }

  // Build result
  const result: string[][] = [];
  for (const [root, emails] of groups) {
    result.push([accounts[root][0], ...[...emails].sort()]);
  }

  return result;
}
// Time: O(n × k × α(n) + E log E) where k = emails per account
// Key: Union accounts (not emails) — use email as the "bridge" to find
// which accounts should be merged. Then group emails by root account.

Deep Dive: Number of Islands II — Online DSU

Given a grid and a list of positions where land is added one at a time, return the number of islands after each addition. This is the classic online Union-Find problem — BFS/DFS would need to re-traverse after each addition.

Number of Islands II — Onlinetypescript
function numIslands2(m: number, n: number, positions: number[][]): number[] {
  const uf = new UnionFind(m * n);
  uf.count = 0; // start with 0 components (no land yet)
  const grid = Array.from({ length: m }, () => new Array(n).fill(false));
  const result: number[] = [];
  const dirs = [[0, 1], [0, -1], [1, 0], [-1, 0]];

  for (const [r, c] of positions) {
    if (grid[r][c]) {
      result.push(uf.count); // duplicate — skip
      continue;
    }

    grid[r][c] = true;
    uf.count++; // new island
    const idx = r * n + c;
    uf.parent[idx] = idx; // initialize this cell

    for (const [dr, dc] of dirs) {
      const nr = r + dr, nc = c + dc;
      if (nr >= 0 && nr < m && nc >= 0 && nc < n && grid[nr][nc]) {
        uf.union(idx, nr * n + nc); // merge with adjacent land
      }
    }

    result.push(uf.count);
  }

  return result;
}
// Time: O(k × α(m×n)) where k = number of positions
// Key: each new land cell starts as its own island (count++),
// then merges with adjacent land (count-- per successful union).
// BFS/DFS would be O(k × m × n) — Union-Find is much faster.
06

Visual Walkthrough

Let's trace through three classic Union-Find scenarios step by step.

Union-Find Operations — Step by Step

Union-Find Operations — Tracetext
n = 6, edges = [[0,1], [1,2], [3,4], [2,4]]

Initial: parent = [0, 1, 2, 3, 4, 5], rank = [0,0,0,0,0,0], count = 6
         Each node is its own component: {0} {1} {2} {3} {4} {5}

Edge [0, 1]: find(0)=0, find(1)=1different roots
  union: parent[1] = 0, rank[0] = 1, count = 5
  parent = [0, 0, 2, 3, 4, 5]
  Components: {0,1} {2} {3} {4} {5}

Edge [1, 2]: find(1)→parent[1]=0find(0)=0, find(2)=2different
  union: parent[2] = 0, count = 4
  parent = [0, 0, 0, 3, 4, 5]
  Components: {0,1,2} {3} {4} {5}

  Path compression: find(1) set parent[1] = 0 (already was 0)
                    find(2) set parent[2] = 0

Edge [3, 4]: find(3)=3, find(4)=4different
  union: parent[4] = 3, rank[3] = 1, count = 3
  parent = [0, 0, 0, 3, 3, 5]
  Components: {0,1,2} {3,4} {5}

Edge [2, 4]: find(2)→parent[2]=0=root, find(4)→parent[4]=3=root
  find(2)=0, find(4)=3different
  union: rank[0]=1, rank[3]=1equal, parent[3] = 0, rank[0] = 2
  count = 2
  parent = [0, 0, 0, 0, 3, 5]
  Components: {0,1,2,3,4} {5}

Now find(4): parent[4]=3parent[3]=0root!
  Path compression: parent[4] = 0 (shortcut!)
  parent = [0, 0, 0, 0, 0, 5]

Final: 2 components: {0,1,2,3,4} and {5}

Cycle Detection — Step by Step

Redundant Connection — Tracetext
edges = [[1,2], [1,3], [2,3]]  (1-indexed)

Initial: parent = [0, 1, 2, 3], count = 3

Edge [1, 2]: find(1)=1, find(2)=2differentunion
  parent = [0, 1, 1, 3], count = 2

Edge [1, 3]: find(1)=1, find(3)=3differentunion
  parent = [0, 1, 1, 1], count = 1

Edge [2, 3]: find(2)→parent[2]=1=root, find(3)→parent[3]=1=root
  find(2)=1, find(3)=1SAME ROOT!
  union returns falseTHIS EDGE CREATES A CYCLE!

Answer: [2, 3]

Key: Before adding edge (u, v), if find(u) === find(v),
they're already connected — this edge is redundant and creates a cycle.

Kruskal's MST — Step by Step

Kruskal's MST — Tracetext
4 nodes, edges sorted by weight:
  [1, 0,1], [2, 1,2], [3, 0,2], [4, 2,3], [5, 1,3]

Initial: parent = [0,1,2,3], count = 4, totalCost = 0

Edge weight=1, (0,1): find(0)≠find(1) → union
  totalCost = 1, edgesUsed = 1, count = 3

Edge weight=2, (1,2): find(1)≠find(2) → union
  totalCost = 3, edgesUsed = 2, count = 2

Edge weight=3, (0,2): find(0)=find(2) → SAME! Skip (would create cycle)

Edge weight=4, (2,3): find(2)≠find(3) → union
  totalCost = 7, edgesUsed = 3 = n-1DONE!

MST edges: (0,1), (1,2), (2,3), total cost = 7

Key: Sort edges by weight. Greedily add the cheapest edge that
connects two different components. Union-Find detects cycles in O(α(n)).
Stop when edgesUsed = n - 1 (MST has exactly n-1 edges).
07

Practice Problems

These 7 problems are carefully ordered from easy to hard. Each one teaches a different aspect of Union-Find. Solve them in order.

1

LC 547. Number of Provinces

Medium

Why this pattern:

Basic connectivity counting. Given an adjacency matrix of cities, find the number of connected groups (provinces). This is the 'hello world' of Union-Find — union connected cities, return the component count.

Key idea: Create UnionFind(n). For each pair (i, j) where isConnected[i][j] = 1, call union(i, j). Return uf.count. The adjacency matrix is symmetric, so only process the upper triangle to avoid redundant unions.

2

LC 684. Redundant Connection

Medium

Why this pattern:

Cycle detection — find the edge that creates a cycle. Process edges in order. The first edge where union returns false (both endpoints already connected) is the redundant edge.

Key idea: For each edge [u, v]: if union(u, v) returns false, this edge connects two already-connected nodes — it's the cycle-creating edge. Return it. The problem guarantees exactly one redundant edge.

3

LC 261. Graph Valid Tree

Medium

Why this pattern:

A valid tree has exactly n-1 edges and no cycles. Use Union-Find to check both: if any union returns false, there's a cycle. If edges.length !== n-1, it's not a tree.

Key idea: First check: edges.length === n - 1 (necessary condition). Then process all edges with Union-Find. If any union returns false (cycle detected), return false. If all unions succeed and edge count is n-1, it's a valid tree.

4

LC 323. Number of Connected Components in an Undirected Graph

Medium

Why this pattern:

Direct component counting. Given n nodes and edges, count connected components. Create UnionFind(n), process all edges, return uf.count. The simplest Union-Find application.

Key idea: Initialize with n components. Each successful union decrements the count. After processing all edges, uf.count is the answer. This is Template 2 in its purest form.

5

LC 721. Accounts Merge

Medium

Why this pattern:

Union-Find with grouping. Accounts sharing any email belong to the same person. Union account indices when they share an email. Then group all emails by their root account.

Key idea: Map each email to the first account that owns it. When a later account has the same email, union the two accounts. After all unions, group emails by root account index. Sort emails within each group. The tricky part is the email-to-account mapping, not the Union-Find itself.

6

LC 1584. Min Cost to Connect All Points

Medium

Why this pattern:

Kruskal's MST. Generate all edges with Manhattan distances, sort by weight, greedily add edges that connect different components. Stop when n-1 edges are added.

Key idea: Generate O(n²) edges. Sort by distance. Use Union-Find to add the cheapest edge that doesn't create a cycle. Total cost of the n-1 accepted edges is the MST weight. Early termination when edgesUsed = n-1.

7

LC 778. Swim in Rising Water

Hard

Why this pattern:

Sort cells by elevation, process from lowest to highest. At each elevation, 'activate' the cell and union it with adjacent active cells. The answer is the elevation when (0,0) and (n-1,n-1) become connected.

Key idea: Create (elevation, row, col) tuples for all cells. Sort by elevation. Process in order: activate each cell, union with active neighbors. After each union, check if top-left and bottom-right are connected. The current elevation is the answer. This is Union-Find + sorting — a powerful combination.

Practice strategy

For each problem: (1) Identify what the "elements" and "connections" are. (2) Write the Union-Find class first (it's always the same). (3) Figure out the union logic. (4) After solving, write: "I union [what] when [condition], and the answer is [what I read from the DSU]."

08

Common Mistakes

These are the traps that catch people on Union-Find problems. Learn them here so you don't learn them in an interview.

🔗

Forgetting path compression

You implement find() as a simple while loop without path compression. The tree degenerates into a chain, and find() becomes O(n) instead of O(α(n)). Your solution TLEs on large inputs.

Always use path compression: this.parent[x] = this.find(this.parent[x]). This single line flattens the tree during every find() call. It's the difference between O(n) and O(1) amortized. Never skip it.

⚖️

Forgetting union by rank/size

You always attach rootA under rootB (or vice versa) without considering tree heights. This can create unbalanced trees, degrading performance.

Track rank (or size) for each root. Always attach the shorter/smaller tree under the taller/larger one. If ranks are equal, pick either and increment the winner's rank. This keeps trees balanced.

🔢

Off-by-one with 1-indexed nodes

The problem uses 1-indexed nodes (like Redundant Connection) but you create UnionFind(n) instead of UnionFind(n+1). Node n has no slot in the parent array.

Check the problem's indexing. If nodes are 1 to n, create UnionFind(n + 1) so index n is valid. Alternatively, subtract 1 from all node values to make them 0-indexed.

🗺️

Wrong 2D-to-1D mapping on grids

You map (row, col) to row + col instead of row * cols + col. Multiple cells map to the same index, causing incorrect unions.

The correct mapping is: index = row * numCols + col. This gives each cell a unique index in [0, rows × cols). Double-check by verifying that (0, cols-1) and (1, 0) map to different indices.

🔄

Unioning roots instead of calling find() first

You do parent[a] = b instead of parent[find(a)] = find(b). This connects node a to node b, but doesn't merge their entire groups. Nodes in a's group that don't go through a are now disconnected.

ALWAYS call find() on both arguments before modifying parent. The union operation connects ROOTS, not individual nodes: rootA = find(a), rootB = find(b), then parent[rootA] = rootB.

📊

Not tracking component count correctly

You decrement count on every union call, even when the two elements are already in the same component. This makes the count go negative or gives wrong results.

Only decrement count when union actually merges two DIFFERENT components. Check if find(a) !== find(b) before decrementing. The union method should return a boolean indicating whether a merge happened.

09

Interview Strategy

Knowing when and how to use Union-Find is half the battle. Here's how to present it in an interview setting to maximize your score.

The 5-Step Interview Flow

1

Identify the Connectivity Pattern

'I see that we're processing connections one at a time and need to track which elements are in the same group. This is a classic Union-Find problem.' Name the pattern early — it shows confidence.

2

Mention BFS/DFS as the Alternative

'I could use BFS/DFS to find components, but since edges arrive one at a time and I need connectivity after each edge, Union-Find is more efficient — O(α(n)) per operation vs rebuilding the graph each time.' Show you considered alternatives.

3

Write the Union-Find Class First

'Let me write the Union-Find class with path compression and union by rank — this is standard.' Write it quickly and confidently. It's always the same code. This is your foundation.

4

Explain the Problem-Specific Logic

'Now I'll process each edge: union the two endpoints. If union returns false, this edge creates a cycle.' The Union-Find class is generic — the interesting part is how you USE it for the specific problem.

5

Analyze Complexity

'Each union/find is O(α(n)) amortized, which is effectively O(1). With E edges, total time is O(E × α(n)) ≈ O(E). Space is O(n) for the parent and rank arrays.' Mention α(n) — it shows you understand the theory.

Common Follow-Up Questions

Follow-UpHow to Handle
"What is α(n)?"The inverse Ackermann function. It grows so slowly that α(n) ≤ 4 for any n up to 2^65536. For all practical purposes, it's constant. It comes from the combined analysis of path compression + union by rank.
"Can you use BFS/DFS instead?"Yes, for static graphs. BFS/DFS is O(V + E) per traversal. But if edges arrive one at a time and you need connectivity after each, you'd need to re-traverse — O(E × (V + E)) total. Union-Find handles this in O(E × α(n)) ≈ O(E).
"What about union by size instead of rank?"Both work. Union by size attaches the smaller component under the larger one and tracks actual sizes. Union by rank uses an upper bound on height. Both give O(α(n)). Size is useful when you need component sizes (e.g., largest component).
"Can Union-Find handle directed graphs?"Standard Union-Find is for undirected graphs only. For directed graphs, you need different techniques: DFS for cycle detection, topological sort for ordering. There's a weighted Union-Find variant for some directed problems (like Evaluate Division).
"Can you un-union (split groups)?"Standard Union-Find doesn't support splitting. If you need to remove edges, process edges in reverse order (offline) or use a more complex structure like Link-Cut Trees. Mention this limitation.
"Why not just use a hash set for each component?"Merging two hash sets is O(min(|A|, |B|)) per merge. With n merges, worst case is O(n²). Union-Find is O(α(n)) per operation — much faster. The tree structure with path compression is the key advantage.

The meta-skill

Interviewers love Union-Find because it tests whether you can recognize a connectivity pattern and implement a non-trivial data structure from scratch. The key sentence: "I'll use Union-Find with path compression and union by rank. Each operation is O(α(n)) amortized, effectively O(1). I union [elements] when [condition], and the answer is [component count / cycle detection / MST cost]."

10

Summary & Cheat Sheet

Pin this section. Review it before mock interviews and contests.

Quick Revision Cheat Sheet

Core idea: Track dynamic groups with two operations: find(x) returns the group leader, union(a,b) merges two groups. Nearly O(1) per operation.

Path compression: During find(), point every node directly to the root: parent[x] = find(parent[x]). Flattens the tree for future lookups.

Union by rank: Attach shorter tree under taller. If equal rank, pick either and increment. Prevents degenerate chains.

Complexity: O(α(n)) per operation where α is inverse Ackermann — effectively O(1). Total for E edges: O(E × α(n)) ≈ O(E).

Component count: Start with count = n. Each successful union decrements count. Final count = number of connected components.

Cycle detection: If find(u) === find(v) BEFORE adding edge (u,v), the edge creates a cycle. This is how Kruskal's avoids cycles.

Kruskal's MST: Sort edges by weight. Greedily add cheapest edge that connects different components (union returns true). Stop at n-1 edges.

2D grid mapping: index = row × numCols + col. Creates unique 1D index for each cell. Only union right and down neighbors to avoid duplicates.

Online connectivity: Union-Find handles edges arriving one at a time. BFS/DFS needs the full graph upfront. This is Union-Find's killer advantage.

Union by size: Alternative to rank: track actual component sizes. Attach smaller under larger. Useful when you need component sizes.

Can't un-union: Union-Find doesn't support splitting groups. If needed, process in reverse order (offline) or use Link-Cut Trees.

Undirected only: Standard Union-Find is for undirected graphs. For directed graphs, use DFS, topological sort, or weighted DSU.

Mental trigger: "Connected components" or "merge groups" or "cycle in undirected graph" or "MST" → Union-Find.

Decision Flowchart

When to Use Union-Find — Decision Treetext
Does the problem involve grouping/connectivity in an undirected graph?
├── NONot a Union-Find problem
│         → Directed graph? Use DFS/Topological Sort
│         → Shortest path? Use BFS/Dijkstra
│         → Traversal needed? Use BFS/DFS
└── YESAre edges processed one at a time (online)?
          ├── YESUnion-Find is the best choice
          │         ├── Counting components? → Track uf.count
          │         ├── Detecting cycles? → Check if union returns false
          │         ├── Building MST? → Sort edges + Union-Find (Kruskal's)
          │         └── Grouping by equivalence? → Union elements sharing a property
          └── NOIs the graph static (all edges known upfront)?
                    ├── YESBFS/DFS is simpler for one-time component finding
                    │         (Union-Find also works but is overkill)
                    └── NOUnion-Find for dynamic edge additions

Union-Find Operations Quick Reference

OperationTimeDescriptionCommon Pitfall
find(x)O(α(n))Return root (group leader) of xForgetting path compression
union(a, b)O(α(n))Merge groups of a and bNot calling find() on both first
connected(a, b)O(α(n))Check if a and b are in same groupComparing a and b directly instead of find(a) and find(b)
countO(1)Number of componentsDecrementing on redundant unions
ConstructorO(n)Initialize n singleton componentsWrong size for 1-indexed problems

One last thing

Union-Find is one of those data structures that seems magical the first time you see it. Two simple optimizations — path compression and union by rank — turn a naive O(n) operation into effectively O(1). The class is always the same 25 lines of code. Memorize it, and the only challenge in each problem is figuring out what to union and when. That's the real skill: not implementing Union-Find, but recognizing that a problem is a Union-Find problem in disguise.