Prim's algorithm for minimum spanning tree

After reading this article, you will not only have learned the algorithm pattern, but will also be able to solve the following LeetCode problems:

1135. Connecting Cities With Minimum Cost (medium)

1584. Min Cost to Connect All Points (medium)

-----------

This is the 7th article in the graph theory algorithm series. First, here are the graph theory articles I have written before:

1. Graph Theory Algorithm Foundation

2. Graph Judgment Algorithm

3. Cycle Detection and Topological Sorting Algorithm

4. Dijkstra's shortest path algorithm

5. Union Find algorithm

6. Kruskal Minimum Spanning Tree Algorithm

Although advanced algorithms such as graph theory algorithms are not difficult, articles about them generally get few readers. I did not originally plan to write about Prim's algorithm, but for the sake of a complete knowledge structure, I decided to fill this gap. With it, the classic graph theory algorithms are basically all covered.

Prim's algorithm and Kruskal's algorithm are both classic minimum spanning tree algorithms. Before reading this article, I hope you have read the previous article, Kruskal Minimum Spanning Tree Algorithm, and understand the basic definition of a minimum spanning tree and the basic principle of Kruskal's algorithm, so that you can easily follow the logic of Prim's algorithm.

Compare Kruskal's Algorithm

The minimum spanning tree problem asks you to pick a set of edges from the graph, call it mst, with the following properties:

1. These edges form a tree (the difference between a tree and a graph is that it cannot contain cycles).

2. The tree formed by these edges should contain all nodes.

3. The sum of the weights of these edges should be as small as possible.

So what logic does the Kruskal algorithm use to satisfy the above conditions and calculate the minimum spanning tree?

First, Kruskal's algorithm uses a greedy idea to make the sum of the weights as small as possible:

Sort all edges by weight in ascending order, then, starting from the lightest edge, add each suitable edge to the mst set, so that the tree formed by the selected edges has the smallest total weight.

Second, Kruskal's algorithm uses the Union-Find algorithm to ensure that the selected edges form a single "tree", with no cycles and without ending up as a "forest":

If the two endpoints of an edge are already connected, adding the edge would create a cycle in the tree; if more than one connected component remains at the end, the result is a "forest" rather than a "tree".
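As a reminder of how these two pieces fit together, here is a minimal sketch of Kruskal's algorithm. It is not the code from the earlier article: the class name KruskalSketch, the method mstWeight, and the {u, v, weight} edge layout with nodes numbered from 0 are assumptions made for illustration.

```java
import java.util.Arrays;

class KruskalSketch {
    // Union-Find parent array; parent[x] == x means x is a root.
    private static int[] parent;

    private static int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // path halving
            x = parent[x];
        }
        return x;
    }

    // edges[i] = {u, v, weight}, nodes are numbered 0..n-1 (assumed layout).
    // Returns the total weight of the spanning tree, or -1 if the graph is a "forest".
    static int mstWeight(int n, int[][] edges) {
        parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
        // Greedy part: look at the edges from lightest to heaviest.
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[2], b[2]));

        int sum = 0;
        int components = n;
        for (int[] e : sorted) {
            int ru = find(e[0]);
            int rv = find(e[1]);
            if (ru == rv) {
                continue; // endpoints already connected: this edge would create a cycle
            }
            parent[ru] = rv;
            sum += e[2];
            components--;
        }
        return components == 1 ? sum : -1;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1, 1}, {1, 2, 2}, {0, 2, 3}};
        System.out.println(mstWeight(3, edges));
    }
}
```

On the triangle in main, the edge of weight 3 is skipped because its endpoints are already connected by the two lighter edges.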

So, what logic does Prim's algorithm, the protagonist of this article, use to calculate the minimum spanning tree?

First, Prim's algorithm also uses a greedy idea to make the weight of the spanning tree as small as possible. This is the "cut theorem", which will be explained in detail below.

Second, Prim's algorithm uses a BFS-like traversal and a boolean visited array to avoid cycles, which ensures that the selected edges end up forming a tree.

Prim's algorithm does not need to sort all edges in advance. Instead, it uses a priority queue to sort dynamically, so I think of Prim's algorithm as a dynamic version of Kruskal's.

Let's introduce the core principle of Prim's algorithm: the cut theorem (commonly known as the cut property).

Cut Theorem

The term "cut" is actually easy to understand: it divides the nodes of a graph into two non-overlapping, non-empty sets:

In the figure, the red line divides the nodes into two sets, which is one "cut". The edges that the red line passes through (marked in blue) are called "crossing edges".

PS: Remember the meaning of these two terms. We will use them frequently below, so don't confuse them.

Of course, a graph has many possible cuts: by definition, any single stroke that divides the nodes into two parts counts as one.

Next we introduce the "cut theorem":

For any cut, the crossing edge with the smallest weight must be an edge of the minimum spanning tree.

This is easy to argue. Suppose a weighted undirected graph has a minimum spanning tree, shown as the edges marked in green in the following figure:

Then you can certainly find many cuts that split this minimum spanning tree into two subtrees. For example, the following cut:

You will find that any one of the blue crossing edges can connect these two subtrees into a spanning tree.

So, to make the total weight of the final spanning tree as small as possible, which one should you choose?

You must choose the crossing edge with the smallest weight, which proves the cut theorem.

You can also prove the cut theorem by contradiction:

Given the minimum spanning tree of a graph, for any cut there must be at least one crossing edge that belongs to the minimum spanning tree.

Suppose this crossing edge is not the lightest one. Then the total weight of the minimum spanning tree could still be reduced by using the lightest crossing edge instead, which is a contradiction: the total weight of a minimum spanning tree is already the smallest possible, so how could it be reduced? Therefore the cut theorem holds.
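To make the two terms concrete, here is a small sketch that lists the crossing edges of a cut and picks the lightest one. The class name CutSketch, the {u, v, weight} edge layout, and the sample graph with its weights are all made up for illustration; they are not the graph from the figures.

```java
import java.util.ArrayList;
import java.util.List;

class CutSketch {
    // edges[i] = {u, v, weight}; inS[x] says which side of the cut node x is on.
    // An edge is a crossing edge when its two endpoints are on different sides.
    static List<int[]> crossingEdges(int[][] edges, boolean[] inS) {
        List<int[]> result = new ArrayList<>();
        for (int[] e : edges) {
            if (inS[e[0]] != inS[e[1]]) {
                result.add(e);
            }
        }
        return result;
    }

    // Returns the crossing edge with the smallest weight, or null if there is none.
    static int[] lightestCrossingEdge(int[][] edges, boolean[] inS) {
        int[] best = null;
        for (int[] e : crossingEdges(edges, inS)) {
            if (best == null || e[2] < best[2]) {
                best = e;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A square 0-1-2-3 with one diagonal 0-2 (hypothetical weights).
        int[][] edges = {{0, 1, 4}, {1, 2, 2}, {2, 3, 5}, {3, 0, 1}, {0, 2, 3}};
        boolean[] inS = {true, true, false, false}; // the cut {0, 1} | {2, 3}
        int[] e = lightestCrossingEdge(edges, inS);
        System.out.println(e[0] + "-" + e[1] + " weight " + e[2]);
    }
}
```

For this cut, the crossing edges are 1-2, 3-0 and 0-2, and the lightest is 3-0 with weight 1. The minimum spanning tree of this small graph is 3-0, 1-2, 0-2, which indeed contains that edge.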

With the cut theorem, you probably already have an idea for computing the minimum spanning tree:

Since every cut is guaranteed to yield one edge of the minimum spanning tree, I can just cut anywhere, each time taking the lightest crossing edge and adding it to the minimum spanning tree, until all edges of the minimum spanning tree have been cut out.

This is indeed the core idea of Prim's algorithm, but implementing it takes some technique.

Because a computer cannot understand what "cutting anywhere" means, you need to design mechanical rules for choosing the cuts, so that the algorithm wastes as little effort as possible.

Prim algorithm implementation

When a general problem is hard to solve, we can start from a simpler special case. Prim's algorithm follows exactly this way of thinking.

By the definition of a cut, any division of the nodes into two non-overlapping, non-empty sets is a legal cut. So if I cut out just one node, is that a legal cut?

Yes, and it is the simplest cut. Its crossing edges are also easy to determine: they are exactly the edges adjacent to that node.

So we can pick any node to start from. Suppose we start cutting from node A:

Since this is a legal cut, by the cut theorem the lightest of the crossing edges AB and AF must be an edge of the minimum spanning tree:

Now the first edge of the minimum spanning tree (edge AB) has been found. How should we arrange the next cut?

Following the logic of Prim's algorithm, we next cut around the two nodes A and B:

Then, among the crossing edges produced by this cut (the blue edges in the figure), we find the one with the smallest weight, which gives the second edge of the minimum spanning tree, BC:

What's next? Similarly, we cut around the three nodes A, B, C. The lightest crossing edge produced is BD, so BD is the third edge of the minimum spanning tree:

Next, we cut around the four nodes A, B, C, D...

That is how Prim's algorithm works: each cut yields one edge of the minimum spanning tree, and then a new cut is made, until all edges of the minimum spanning tree have been found.
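The walkthrough above can be written down almost literally. The following is a deliberately naive sketch, assuming an edge list of {u, v, weight} triples and nodes numbered from 0 (the class and method names are hypothetical): in every round it rescans all edges to find the lightest crossing edge of the current cut. It is only meant to mirror the steps and is slower than a priority-queue implementation.

```java
class NaivePrimSketch {
    // edges[i] = {u, v, weight}, undirected, nodes numbered 0..n-1, n >= 1 (assumed).
    // Returns the weight of the minimum spanning tree, or -1 if the graph is not connected.
    static int mstWeight(int n, int[][] edges) {
        boolean[] inTree = new boolean[n];
        inTree[0] = true; // the simplest cut: cut out a single node
        int sum = 0;
        // A spanning tree of n nodes has n - 1 edges, so we make n - 1 cuts.
        for (int round = 1; round < n; round++) {
            int[] best = null;
            for (int[] e : edges) {
                boolean crossing = inTree[e[0]] != inTree[e[1]];
                if (crossing && (best == null || e[2] < best[2])) {
                    best = e;
                }
            }
            if (best == null) {
                return -1; // no crossing edge left: some node is unreachable
            }
            sum += best[2];
            // Exactly one endpoint was outside; now both are inside the tree.
            inTree[best[0]] = true;
            inTree[best[1]] = true;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1, 4}, {1, 2, 2}, {2, 3, 5}, {3, 0, 1}, {0, 2, 3}};
        System.out.println(mstWeight(4, edges));
    }
}
```

On the sample graph in main (hypothetical weights), the cuts pick 3-0, then 0-2, then 1-2, for a total weight of 6.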

One advantage of designing the algorithm this way is that the crossing edges produced by each new cut are easy to determine.

For example, looking back at the figure above, suppose I already know all the crossing edges of the cut around A, B (written cut({A, B}) for short), which are the blue edges in the figure:

Can I then quickly compute cut({A, B, C}), that is, all the crossing edges of the cut around A, B, C?

Yes, because we find that:

cut({A, B, C}) = cut({A, B}) + cut({C})

where cut({C}) is the set of all edges adjacent to C.

This property makes it possible to implement cuts and handle crossing edges in code:

During the process, we only need to keep adding the adjacent edges of each new node to the set of crossing edges to obtain all the crossing edges of the new cut.

Of course, careful readers will notice that edge BC appears in both cut({A, B}) and cut({C}).

But this is easy to handle: a boolean array inMST prevents crossing edges from being counted twice.
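Here is a small sketch of this incremental update. It assumes edges are stored as {from, to, weight} with from already inside the tree, and the names IncrementalCutSketch, build and grow are hypothetical. Unlike a lazy implementation that leaves stale edges in the collection and skips them later, this sketch removes them immediately so that the list always equals the exact set of crossing edges.

```java
import java.util.ArrayList;
import java.util.List;

class IncrementalCutSketch {
    // Builds an undirected adjacency list; each edge is stored as {from, to, weight}.
    static List<List<int[]>> build(int n, int[][] edges) {
        List<List<int[]>> graph = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            graph.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            graph.get(e[0]).add(new int[]{e[0], e[1], e[2]});
            graph.get(e[1]).add(new int[]{e[1], e[0], e[2]});
        }
        return graph;
    }

    // Moves node s into the tree and returns the crossing edges of the new cut:
    // the old crossing edges plus the adjacent edges of s, minus every edge
    // whose far endpoint is now inside the tree (such as BC in the text).
    static List<int[]> grow(List<int[]> crossing, List<List<int[]>> graph,
                            boolean[] inMST, int s) {
        inMST[s] = true;
        List<int[]> next = new ArrayList<>();
        for (int[] e : crossing) {
            if (!inMST[e[1]]) {
                next.add(e); // still crosses the cut
            }
        }
        for (int[] e : graph.get(s)) {
            if (!inMST[e[1]]) {
                next.add(e); // new crossing edge contributed by s
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // Nodes 0..3 stand for A..D; the weights are made up.
        List<List<int[]>> graph = build(4, new int[][]{{0, 1, 1}, {1, 2, 2}, {0, 2, 4}, {2, 3, 3}});
        boolean[] inMST = new boolean[4];
        List<int[]> crossing = new ArrayList<>();
        for (int s = 0; s < 3; s++) {
            crossing = grow(crossing, graph, inMST, s);
            System.out.println("after adding node " + s + ": " + crossing.size() + " crossing edge(s)");
        }
    }
}
```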

One last question: we collect the crossing edges in order to find the one with the smallest weight. How do we do that?

It's simple: store the crossing edges in a priority queue, which dynamically gives us the crossing edge with the smallest weight.
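A quick sketch of that behaviour, using java.util.PriorityQueue with a comparator on the weight field (the class name EdgeQueueSketch and the {from, to, weight} layout are assumptions for illustration): no matter in which order edges are offered, poll always returns the lightest one still in the queue.

```java
import java.util.PriorityQueue;

class EdgeQueueSketch {
    // Offers all edges ({from, to, weight}) and returns the weights in poll order.
    static int[] pollOrder(int[][] edges) {
        PriorityQueue<int[]> pq =
                new PriorityQueue<>((a, b) -> Integer.compare(a[2], b[2]));
        for (int[] e : edges) {
            pq.offer(e);
        }
        int[] weights = new int[edges.length];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = pq.poll()[2]; // always the smallest remaining weight
        }
        return weights;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1, 4}, {0, 2, 1}, {1, 2, 3}};
        for (int w : pollOrder(edges)) {
            System.out.println(w);
        }
    }
}
```

Each offer and poll costs O(log k) for a queue holding k edges, which is where the logarithmic factor in the running time of Prim's algorithm comes from.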

Now that we understand the principle, let's look at the code for Prim's algorithm:

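import java.util.List;
import java.util.PriorityQueue;

class Prim {
    // Core data structure: a priority queue that stores the crossing edges
    private PriorityQueue<int[]> pq;
    // Works like a visited array: records which nodes are already part of the minimum spanning tree
    private boolean[] inMST;
    // Total weight of the minimum spanning tree
    private int weightSum = 0;
    // graph is an adjacency list representation of the graph,
    // graph[s] holds all edges adjacent to node s,
    // and a triple int[]{from, to, weight} represents one edge
    private List<int[]>[] graph;

    public Prim(List<int[]>[] graph) {
        this.graph = graph;
        this.pq = new PriorityQueue<>((a, b) -> {
            // Order edges by weight, smallest first
            // (Integer.compare avoids the overflow risk of a[2] - b[2])
            return Integer.compare(a[2], b[2]);
        });
        // The graph has n nodes
        int n = graph.length;
        this.inMST = new boolean[n];

        // Any node can be the starting point of the first cut; we start from node 0
        inMST[0] = true;
        cut(0);
        // Keep cutting and adding edges to the minimum spanning tree
        while (!pq.isEmpty()) {
            int[] edge = pq.poll();
            int to = edge[1];
            int weight = edge[2];
            if (inMST[to]) {
                // Node to is already in the minimum spanning tree, skip it;
                // otherwise this edge would create a cycle
                continue;
            }
            // Add edge to the minimum spanning tree
            weightSum += weight;
            inMST[to] = true;
            // With node to added, a new cut produces more crossing edges
            cut(to);
        }
    }

    // Add the crossing edges of s to the priority queue
    private void cut(int s) {
        // Iterate over the edges adjacent to s
        for (int[] edge : graph[s]) {
            int to = edge[1];
            if (inMST[to]) {
                // The adjacent node to is already in the minimum spanning tree, skip it;
                // otherwise this edge would create a cycle
                continue;
            }
            // Add the edge to the queue of crossing edges
            pq.offer(edge);
        }
    }

    // Total weight of the minimum spanning tree
    public int weightSum() {
        return weightSum;
    }

    // Check whether the minimum spanning tree contains every node in the graph
    public boolean allConnected() {
        for (int i = 0; i < inMST.length; i++) {
            if (!inMST[i]) {
                return false;
            }
        }
        return true;
    }
}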

With the cut theorem understood and the detailed comments in the code, you should be able to follow this implementation of Prim's algorithm.

Here we can revisit the relationship between Prim's algorithm and Kruskal's algorithm mentioned at the beginning of this article:

Kruskal's algorithm sorts all the edges at the beginning, and then selects the edges belonging to the minimum spanning tree from the edge with the smallest weight to form the minimum spanning tree.

Prim's algorithm starts from the cut around a single starting node (one set of crossing edges) and runs logic similar to BFS. With the help of the cut theorem and the dynamic sorting of the priority queue, it "grows" a minimum spanning tree from that starting node.

Speaking of which, what is the time complexity of Prim's algorithm?

It is not hard to analyze: the cost is dominated by operations on the priority queue pq. Since pq holds edges, if the graph has E edges, pq is operated on at most O(E) times. The cost of each operation depends on the number of elements in the queue, which is O(logE) in the worst case.

So the total time complexity of this implementation of Prim's algorithm is O(ElogE). Recall that the time complexity of Kruskal's algorithm is dominated by sorting all edges by weight, which is also O(ElogE).

That said, as with Dijkstra's algorithm discussed earlier, the time complexity of Prim's algorithm can be optimized further. The optimization lies in the implementation of the priority queue and has little to do with the idea of Prim's algorithm itself, so we will not discuss it here; interested readers can look it up.

Next, let's do some hands-on practice and use Prim's algorithm to solve the LeetCode problems that we previously solved with Kruskal's algorithm.

Topic practice

The first is LeetCode problem 1135, "Connecting Cities With Minimum Cost", which is a standard minimum spanning tree problem:

The function signature is as follows:

int minimumCost(int n, int[][] connections);

Each city corresponds to a node in the graph, the cost of connecting two cities corresponds to the weight of an edge, and the minimum cost of connecting all cities is the total weight of the minimum spanning tree.

The solution is then straightforward. We first convert the connections input into an adjacency list, and then pass it to the Prim class implemented above:

public int minimumCost(int n, int[][] connections) {
    // Convert the input into an undirected graph in adjacency list form
    List<int[]>[] graph = buildGraph(n, connections);
    // Run Prim's algorithm
    Prim prim = new Prim(graph);

    if (!prim.allConnected()) {
        // The minimum spanning tree cannot cover all nodes
        return -1;
    }

    return prim.weightSum();
}

List<int[]>[] buildGraph(int n, int[][] connections) {
    // The graph has n nodes
    List<int[]>[] graph = new LinkedList[n];
    for (int i = 0; i < n; i++) {
        graph[i] = new LinkedList<>();
    }
    for (int[] conn : connections) {
        // The problem numbers nodes starting from 1,
        // but our Prim implementation numbers them starting from 0
        int u = conn[0] - 1;
        int v = conn[1] - 1;
        int weight = conn[2];
        // An "undirected graph" is really a "bidirectional graph";
        // an edge is represented as int[]{from, to, weight}
        graph[u].add(new int[]{u, v, weight});
        graph[v].add(new int[]{v, u, weight});
    }
    return graph;
}

class Prim { /* see above */ }

There are two points to note about the buildGraph function:

First, the node numbers given by the problem start from 1, so we shift the indices to start from 0 so that the Prim class can be used.

Second, there is the question of how to represent an undirected weighted graph with an adjacency list. As explained in Graph Theory Algorithm Foundation, an "undirected graph" can be understood as a "bidirectional graph".

In this way, the graph we build matches the input format of the Prim class, and Prim's algorithm can be used directly to compute the minimum spanning tree.

Next, let's look at problem 1584, "Min Cost to Connect All Points":

For example, given the sample input from the problem:

points = [[0,0],[2,2],[3,10],[5,2],[7,0]]

The algorithm should return 20, connecting the points as follows:

The function signature is as follows:

int minCostConnectPoints(int[][] points);

Obviously, this is also a standard minimum spanning tree problem: each point is a node in an undirected weighted graph, the weight of the edge is the Manhattan distance, and the minimum cost of connecting all the points is the sum of the weights of the minimum spanning tree.

Therefore, we only need to convert the points array into the form of an adjacency list, and then we can reuse the previously implemented Prim algorithm class:

public int minCostConnectPoints(int[][] points) {
    int n = points.length;
    List<int[]>[] graph = buildGraph(n, points);
    return new Prim(graph).weightSum();
}

// Build the undirected graph
List<int[]>[] buildGraph(int n, int[][] points) {
    List<int[]>[] graph = new LinkedList[n];
    for (int i = 0; i < n; i++) {
        graph[i] = new LinkedList<>();
    }
    // Generate all edges and their weights
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            int xi = points[i][0], yi = points[i][1];
            int xj = points[j][0], yj = points[j][1];
            int weight = Math.abs(xi - xj) + Math.abs(yi - yj);
            // Use the index in points to represent each coordinate point
            graph[i].add(new int[]{i, j, weight});
            graph[j].add(new int[]{j, i, weight});
        }
    }
    return graph;
}

class Prim { /* see above */ }

One small adaptation was made for this problem: each coordinate point is a pair, so strictly speaking a weighted edge would need a 5-tuple, which is inconvenient for Prim's algorithm. Instead, we use each point's index in the points array to represent it, so that the earlier Prim logic can be reused directly.
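As a cross-check on the sample input, here is a compact, self-contained sketch of the same lazy Prim idea. It is a variant rather than the Prim class above: it skips the adjacency list and generates the Manhattan-distance edges on the fly whenever a point joins the tree, and it stores queue entries as {node, weight}. The class name PointsPrimSketch and the method name minCost are hypothetical.

```java
import java.util.PriorityQueue;

class PointsPrimSketch {
    static int minCost(int[][] points) {
        int n = points.length;
        if (n == 0) {
            return 0;
        }
        boolean[] inMST = new boolean[n];
        // Each entry is {node, weight of the crossing edge that reaches it}.
        PriorityQueue<int[]> pq =
                new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        pq.offer(new int[]{0, 0}); // start from point 0 at no cost
        int sum = 0;
        int added = 0;
        while (!pq.isEmpty() && added < n) {
            int[] cur = pq.poll();
            int u = cur[0];
            if (inMST[u]) {
                continue; // stale entry: u joined the tree through a lighter edge
            }
            inMST[u] = true;
            sum += cur[1];
            added++;
            // The new cut: every point outside the tree is adjacent to u.
            for (int v = 0; v < n; v++) {
                if (!inMST[v]) {
                    int w = Math.abs(points[u][0] - points[v][0])
                          + Math.abs(points[u][1] - points[v][1]);
                    pq.offer(new int[]{v, w});
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] points = {{0, 0}, {2, 2}, {3, 10}, {5, 2}, {7, 0}};
        System.out.println(minCost(points)); // 20 for the sample above
    }
}
```

On the sample, the tree uses the edges of weight 4, 3, 4 and 9 (between points 0-1, 1-3, 3-4 and 1-2), which add up to 20.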

That concludes Prim's algorithm, and with it the graph theory algorithm series is nearly complete. Stay tuned for more articles.


labuladong
