Minimum Spanning Tree (Prims and Kruskals) A Deep Dive for Devs

All Notes

Code

What’s the Fuss About Minimum Spanning Trees (MSTs)?

Imagine you’re laying out fiber optic cables to connect a cluster of data centers, or designing a network of roads to connect towns, or even optimizing circuit board traces. In all these scenarios, you want to ensure every point is connected to every other point, but you also want to minimize the total cost (length of cable, distance of road, amount of copper). This is precisely where Minimum Spanning Trees (MSTs) shine.

Let’s break down what an MST is, starting with the basics.

Graphs: The Foundation

An MST lives within a graph. A graph G = (V, E) consists of:

Vertices (nodes) V: The points of interest (e.g., data centers, towns).
Edges E: Connections between vertices (e.g., fiber cables, roads).
Edge Weights: A cost or distance associated with each edge. This is crucial for MSTs.

We’re specifically interested in undirected, connected graphs with weighted edges.

Spanning Trees: The Connection

A spanning tree of a connected graph is a subgraph that:

Includes all vertices of the original graph.
Is a tree (i.e., it’s connected and has no cycles).
Contains exactly |V| - 1 edges.

Think of it as the minimum set of connections required to keep the entire graph connected without any redundant paths (cycles).

Minimum Spanning Tree (MST): The Optimal Connection

A Minimum Spanning Tree (MST) is a spanning tree whose sum of edge weights is less than or equal to the sum of edge weights of any other spanning tree. In essence, it’s the “cheapest” way to connect all vertices.

Key properties of an MST:

It’s a tree, so it has no cycles.
It connects all vertices.
The sum of its edge weights is minimal.
For a connected graph with distinct edge weights, the MST is unique. If weights are not distinct, there might be multiple MSTs with the same minimum total weight, but their total weight will be identical.

Why Do We Care? Real-World Applications

MSTs are not just an academic curiosity. They’re fundamental to:

Network Design: Laying out cables (telecommunications, power grids) efficiently.
Clustering: Grouping similar data points in machine learning (e.g., single-linkage clustering).
Computer Vision: Image segmentation.
Circuit Design: Minimizing wire length on printed circuit boards.
Transportation Networks: Designing efficient routes.

Now that we know what an MST is and why it’s important, let’s dive into two of the most popular algorithms for finding them: Prim’s and Kruskal’s. Both are greedy algorithms, meaning they make the locally optimal choice at each step hoping it leads to a globally optimal solution. (Spoiler: for MSTs, greedy algorithms actually work!)

Prim’s Algorithm: Building Incrementally

Prim’s algorithm builds the MST by incrementally adding vertices to a growing tree, starting from an arbitrary vertex. At each step, it adds the cheapest edge that connects a vertex inside the current tree to a vertex outside it.

How Prim’s Algorithm Works

Initialization: Pick an arbitrary starting vertex and add it to your MST. Initialize a data structure (like a priority queue) to store all edges connecting vertices in your current MST to vertices outside it.
Iteration: Repeat |V| - 1 times (or until all vertices are included):
- From the priority queue, extract the edge with the minimum weight that connects a vertex already in the MST to a vertex not yet in the MST.
- Add this edge and the new vertex to your MST.
- For the newly added vertex, add all its incident edges (that connect to vertices not yet in the MST) to the priority queue.
Termination: When all vertices are included, you have your MST.

Note: The priority queue is key here. It allows efficient retrieval of the minimum-weight edge. If implemented with a min-heap, this step is O(log E) or O(log V) depending on what you store.

Prim’s Algorithm Complexity

Time Complexity:
- Using a binary min-heap: O(E log V) or O(E log E). Since E can be up to V^2, log E can be log V^2 = 2 log V, so O(E log V) is the more common way to express it. This is efficient for dense graphs (where E is close to V^2).
- Using an adjacency matrix and simple array to find min edge: O(V^2).
Space Complexity: O(V + E) for storing the graph, and O(V) or O(E) for the priority queue and distances.

Prim’s Algorithm Python Example

Let’s represent our graph using an adjacency list where each entry is (neighbor, weight).

import heapq

def prim(graph):
    """
    Finds the Minimum Spanning Tree (MST) of a connected, undirected,
    weighted graph using Prim's algorithm.

    Args:
        graph (dict): An adjacency list representation of the graph.
                      {vertex: [(neighbor, weight), ...]}

    Returns:
        tuple: A tuple containing:
               - list: The edges forming the MST, e.g., [(u, v, weight), ...]
               - int: The total weight of the MST.
    """
    if not graph:
        return [], 0

    vertices = list(graph.keys())
    if not vertices:
        return [], 0

    # Start with an arbitrary vertex (e.g., the first one in the list)
    start_node = vertices[0]
    
    # Set to keep track of vertices already included in the MST
    in_mst = {start_node}
    
    # Priority queue to store edges (weight, u, v)
    # The edge (u, v) connects a vertex u IN the MST to a vertex v NOT IN the MST.
    min_heap = []

    # Add all edges connected to the start_node to the priority queue
    # This also handles cases where a neighbor might not be a direct key in graph (isolated node scenarios),
    # though for connected graphs used here, all vertices will be keys.
    for neighbor, weight in graph.get(start_node, []):
        heapq.heappush(min_heap, (weight, start_node, neighbor))

    mst_edges = []
    mst_weight = 0

    # Continue until all vertices are included in the MST
    # An MST for a graph with |V| vertices has |V|-1 edges.
    while min_heap and len(in_mst) < len(vertices):
        weight, u, v = heapq.heappop(min_heap)

        # If 'v' is already in the MST, this edge forms a cycle, so skip it
        if v in in_mst:
            continue

        # Add the edge to the MST
        mst_edges.append((u, v, weight))
        mst_weight += weight
        
        # Add 'v' to the set of vertices in the MST
        in_mst.add(v)

        # Now, add all edges connected to the newly added vertex 'v'
        # that lead to vertices NOT YET in the MST
        for neighbor_of_v, weight_of_edge in graph.get(v, []):
            if neighbor_of_v not in in_mst:
                heapq.heappush(min_heap, (weight_of_edge, v, neighbor_of_v))
    
    # Check if all vertices were connected. If not, the graph was not connected.
    if len(in_mst) < len(vertices):
        print("Warning: The graph is not connected. MST cannot include all vertices from the original graph.")
        # For a truly disconnected graph, this implementation finds an MST of the connected component
        # containing the start_node. It won't find a spanning tree for the whole graph.
        return [], 0 

    return mst_edges, mst_weight

# --- Example Usage ---
# Graph represented as an adjacency list (from CLRS, Fig 23.4, "MST-PRIM"):
# {node: [(neighbor1, weight1), (neighbor2, weight2), ...]}
# Note: For an undirected graph, if (u,v,w) exists, (v,u,w) must also be present.
graph_prim_example = {
    'A': [('B', 4), ('H', 8)],
    'B': [('A', 4), ('C', 8), ('H', 11)],
    'C': [('B', 8), ('D', 7), ('F', 4), ('I', 2)],
    'D': [('C', 7), ('E', 9), ('F', 14)],
    'E': [('D', 9), ('F', 10)],
    'F': [('C', 4), ('D', 14), ('E', 10), ('G', 2)],
    'G': [('F', 2), ('H', 1), ('I', 6)],
    'H': [('A', 8), ('B', 11), ('G', 1), ('I', 7)],
    'I': [('C', 2), ('G', 6), ('H', 7)]
}

mst_prim_edges, total_prim_weight = prim(graph_prim_example)

print("--- Prim's Algorithm ---")
print("MST Edges:")
for u, v, w in mst_prim_edges:
    print(f"  Edge: ({u}-{v}), Weight: {w}")
print(f"Total MST Weight: {total_prim_weight}")

# Another example: A simpler graph
simple_graph_prim = {
    'A': [('B', 1), ('C', 3)],
    'B': [('A', 1), ('C', 1), ('D', 5)],
    'C': [('A', 3), ('B', 1), ('D', 2)],
    'D': [('B', 5), ('C', 2)]
}

mst_simple_prim_edges, total_simple_prim_weight = prim(simple_graph_prim)

print("\n--- Prim's Algorithm (Simple Graph) ---")
print("MST Edges:")
for u, v, w in mst_simple_prim_edges:
    print(f"  Edge: ({u}-{v}), Weight: {w}")
print(f"Total MST Weight: {total_simple_prim_weight}")

--- Prim's Algorithm ---
MST Edges:
  Edge: (A-B), Weight: 4
  Edge: (A-H), Weight: 8
  Edge: (H-G), Weight: 1
  Edge: (G-F), Weight: 2
  Edge: (F-C), Weight: 4
  Edge: (C-I), Weight: 2
  Edge: (C-D), Weight: 7
  Edge: (D-E), Weight: 9
Total MST Weight: 37

--- Prim's Algorithm (Simple Graph) ---
MST Edges:
  Edge: (A-B), Weight: 1
  Edge: (B-C), Weight: 1
  Edge: (C-D), Weight: 2
Total MST Weight: 4

Kruskal’s Algorithm: Building by Edge Selection

Kruskal’s algorithm takes a different approach. It sorts all edges by weight in ascending order and then iterates through them. For each edge, it checks if adding it to the MST would form a cycle. If not, it adds the edge.

How Kruskal’s Algorithm Works

Initialization: Create a list of all edges in the graph, each with its weight.
Sorting: Sort all edges in non-decreasing order of their weights.
Disjoint Set Union (DSU): Initialize a Disjoint Set Union (also known as Union-Find) data structure. Each vertex initially belongs to its own separate set. The DSU helps efficiently detect cycles.
Iteration: Iterate through the sorted edges:
- For each edge (u, v) with weight w:
  - Check if u and v are already in the same connected component (i.e., if find(u) == find(v)).
  - If they are in the same component, adding this edge would create a cycle. Skip it.
  - If they are in different components, add this edge to your MST, and then merge the components of u and v using union(u, v).
Termination: Continue until |V| - 1 edges have been added to the MST (as an MST always has |V| - 1 edges), or all edges have been processed.

Note: The Disjoint Set Union data structure is critical for Kruskal’s. Its find (determine representative of a set) and union (merge two sets) operations need to be very efficient. Path compression and union by rank/size are standard optimizations that make these operations nearly constant time on average (α(V) where α is the inverse Ackermann function, which grows extremely slowly, practically < 5 for any real-world graph size).

Kruskal’s Algorithm Complexity

Time Complexity:
- O(E log E) or O(E log V): The dominant factor is sorting the edges (E log E). Since E <= V^2, log E is log V^2 = 2 log V, so E log E is equivalent to E log V. The DSU operations (find and union) are practically constant time. Kruskal’s is generally preferred for sparse graphs (graphs with relatively few edges, i.e., E is much closer to V than V^2) because its performance is dominated by the O(E log E) sort.
Space Complexity: O(V + E) for storing edges and the DSU structure.

Kruskal’s Algorithm Python Example

We’ll need a simple DisjointSetUnion class for this.

class DisjointSetUnion:
    """
    A simple Disjoint Set Union (DSU) data structure with path compression
    and union by size for efficient `find` and `union` operations.
    """
    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}
        self.size = {v: 1 for v in vertices} # For union by size

    def find(self, i):
        """
        Finds the representative (root) of the set containing element 'i'.
        Applies path compression.
        """
        if self.parent[i] == i:
            return i
        self.parent[i] = self.find(self.parent[i]) # Path compression
        return self.parent[i]

    def union(self, i, j):
        """
        Unites the sets containing elements 'i' and 'j'.
        Applies union by size. Returns True if union happened, False if already in same set.
        """
        root_i = self.find(i)
        root_j = self.find(j)

        if root_i != root_j:
            # Union by size: attach smaller tree under root of larger tree
            if self.size[root_i] < self.size[root_j]:
                self.parent[root_i] = root_j
                self.size[root_j] += self.size[root_i]
            else:
                self.parent[root_j] = root_i
                self.size[root_i] += self.size[root_j]
            return True # Union successful
        return False # Already in the same set

def kruskal(graph_edges, num_vertices):
    """
    Finds the Minimum Spanning Tree (MST) of a connected, undirected,
    weighted graph using Kruskal's algorithm.

    Args:
        graph_edges (list): A list of edges, where each edge is
                            (u, v, weight).
        num_vertices (int): The total number of vertices in the graph.
                            This is used to know when enough edges are added.

    Returns:
        tuple: A tuple containing:
               - list: The edges forming the MST, e.g., [(u, v, weight), ...]
               - int: The total weight of the MST.
    """
    if not graph_edges and num_vertices == 0:
        return [], 0
    if num_vertices <= 1: # An MST of 0 or 1 vertex is an empty graph or the vertex itself
        return [], 0

    # Sort all edges by weight in non-decreasing order
    # Note: Python's sorted() is stable, but for elements with equal weight, 
    # the relative order might vary depending on Python version/implementation details.
    # This can lead to different valid MSTs if edge weights are not unique, but the total weight will be the same.
    sorted_edges = sorted(graph_edges, key=lambda x: x[2])

    # Initialize DSU structure with all unique vertices found in the edges
    all_vertices = set()
    for u, v, _ in graph_edges:
        all_vertices.add(u)
        all_vertices.add(v)
    
    # It's important to initialize DSU for ALL vertices in the graph,
    # including potential isolated ones, if `num_vertices` includes them.
    # Here, we assume `all_vertices` captures all relevant vertices for connectivity.
    dsu = DisjointSetUnion(all_vertices)

    mst_edges = []
    mst_weight = 0
    edges_count = 0

    # Iterate through sorted edges
    for u, v, weight in sorted_edges:
        # If adding this edge does not form a cycle (i.e., u and v are in different components)
        if dsu.union(u, v):
            mst_edges.append((u, v, weight))
            mst_weight += weight
            edges_count += 1
            
            # An MST has exactly |V|-1 edges. Once we have enough, we're done.
            if edges_count == num_vertices - 1:
                break
    
    # Check if all vertices were connected. If the graph was connected, 
    # edges_count should be num_vertices - 1.
    if edges_count < num_vertices - 1:
        print("Warning: The graph is not connected. MST cannot include all vertices.")
        # If the graph has isolated vertices not part of any edge, 
        # num_vertices - 1 might be larger than actual edges_count for the connected components.
        # For a disconnected graph, Kruskal's will find a Minimum Spanning Forest (MSF).
        # We can still return the MST of the largest component found.
            
    return mst_edges, mst_weight

# --- Example Usage ---
# Graph represented as a list of edges: (u, v, weight)
# Vertices are A, B, C, D, E, F, G, H, I (9 vertices)
graph_kruskal_edges = [
    ('A', 'B', 4), ('A', 'H', 8),
    ('B', 'C', 8), ('B', 'H', 11),
    ('C', 'D', 7), ('C', 'F', 4), ('C', 'I', 2),
    ('D', 'E', 9), ('D', 'F', 14),
    ('E', 'F', 10),
    ('F', 'G', 2),
    ('G', 'H', 1), ('G', 'I', 6),
    ('H', 'I', 7)
]
num_vertices_kruskal = 9 # A, B, C, D, E, F, G, H, I

mst_kruskal_edges, total_kruskal_weight = kruskal(graph_kruskal_edges, num_vertices_kruskal)

print("\n--- Kruskal's Algorithm ---")
print("MST Edges:")
for u, v, w in mst_kruskal_edges:
    print(f"  Edge: ({u}-{v}), Weight: {w}")
print(f"Total MST Weight: {total_kruskal_weight}")

# Another example: A simpler graph
# Vertices are A, B, C, D (4 vertices)
simple_graph_kruskal = [
    ('A', 'B', 1), ('A', 'C', 3),
    ('B', 'C', 1), ('B', 'D', 5),
    ('C', 'D', 2)
]
num_vertices_simple_kruskal = 4

mst_simple_kruskal_edges, total_simple_kruskal_weight = kruskal(simple_graph_kruskal, num_vertices_simple_kruskal)

print("\n--- Kruskal's Algorithm (Simple Graph) ---")
print("MST Edges:")
for u, v, w in mst_simple_kruskal_edges:
    print(f"  Edge: ({u}-{v}), Weight: {w}")
print(f"Total MST Weight: {total_simple_kruskal_weight}")

--- Kruskal's Algorithm ---
MST Edges:
  Edge: (G-H), Weight: 1
  Edge: (C-I), Weight: 2
  Edge: (F-G), Weight: 2
  Edge: (A-B), Weight: 4
  Edge: (C-F), Weight: 4
  Edge: (C-D), Weight: 7
  Edge: (A-H), Weight: 8
  Edge: (D-E), Weight: 9
Total MST Weight: 37

--- Kruskal's Algorithm (Simple Graph) ---
MST Edges:
  Edge: (A-B), Weight: 1
  Edge: (B-C), Weight: 1
  Edge: (C-D), Weight: 2
Total MST Weight: 4

Prim’s vs. Kruskal’s: When to Use Which?

Both Prim’s and Kruskal’s algorithms correctly find an MST. Their practical performance depends heavily on the characteristics of the graph:

Feature	Prim’s Algorithm	Kruskal’s Algorithm
Approach	Expands a single component until it covers all vertices. “Vertex-centric”.	Adds edges in increasing order of weight if they don’t form a cycle. “Edge-centric”.
Data Structure	Priority Queue (Min-Heap) for edges.	Disjoint Set Union (Union-Find) for components.
Time Complexity	`O(E log V)` (with binary heap) or `O(V^2)` (with adjacency matrix).	`O(E log E)` (dominated by sorting edges).
Space Complexity	`O(V + E)`	`O(V + E)`
Best for	Dense graphs (E ≈ V^2). When `E` is large, `log V` is smaller than `log E`. `O(V^2)` is also good here.	Sparse graphs (E ≈ V). When `E` is small, `E log E` is faster.
Implementation	Slightly simpler to implement if standard library priority queues are used.	Requires a robust Disjoint Set Union implementation.

In summary:

If your graph is dense, Prim’s algorithm (especially with an adjacency matrix or a Fibonacci heap if you’re feeling adventurous) often performs better. Its O(V^2) complexity makes it a solid choice here.
If your graph is sparse, Kruskal’s algorithm is usually faster. Its O(E log E) complexity is efficient when E is much smaller than V^2.
Connectivity: Prim’s always maintains a single connected component that grows. Kruskal’s gradually merges multiple components until all are connected. If the graph is disconnected, Kruskal’s will naturally find a Minimum Spanning Forest (an MST for each connected component), whereas Prim’s will only find an MST for the component reachable from its starting node.

Conclusion

Minimum Spanning Trees are a fundamental concept in graph theory with widespread applications in engineering and computer science. Prim’s and Kruskal’s algorithms offer elegant and efficient solutions to this problem, each with its strengths depending on the graph’s structure. Understanding their core mechanics, data structure requirements, and complexity trade-offs empowers you to choose the right tool for the job.

With these examples, you should have a solid foundation to implement and apply MST algorithms in your own projects. Happy coding!

How Spotify Uses Shuffling Algorithms (It’s Not Really Random)

Jun 17, 2025

How to Simulate Gravitational Waves in a Python Sandbox

Jun 17, 2025

Understanding the Reactor Pattern

Jun 18, 2025

Greedy Algorithms In Graphs - A Developers Guide with Examples

Jun 18, 2025

Using `watch` and `entr` to Live-Reload Anything

Jun 17, 2025

Demystifying Longest Common Substring - A Practical Guide for Developers