Topological sorting, YYDS

After reading this article, you will not only have learned an algorithmic pattern, but will also be able to solve the following LeetCode problems:

207. Course Schedule

210. Course Schedule II

-----------

Many readers have left messages saying they want to see algorithms related to graphs, so to satisfy everyone, I will walk through graph techniques using concrete algorithm problems.

As the previous article on the framework thinking for learning data structures explained, algorithms on data structures come down to two things: traversal and access. The basic traversal of a graph is also very simple; the previous article on graph algorithm basics showed how to extend the traversal framework from multi-way trees to graphs.

Graphs also have some specialized algorithms, such as bipartite graph detection, cycle detection, topological sorting, the classic minimum spanning tree and single-source shortest path problems, and harder ones such as network flow.

However, in my experience, unless you do competitive programming or are particularly interested, you don't need to learn problems like network flow. Minimum spanning tree and shortest path don't come up often when solving problems, but they are classic algorithms, and you can master them if you have time to spare. Topological sorting, on the other hand, is a fairly basic and useful algorithm that you should know well.

So this article uses concrete problems to cover two graph algorithms: cycle detection in directed graphs and topological sorting.

Determine whether there is a cycle in the directed graph

Let's first look at LeetCode problem 207, "Course Schedule":

The function signature is as follows:

boolean canFinish(int numCourses, int[][] prerequisites);

The problem should not be hard to understand. When is it impossible to finish all the courses? When there is a circular dependency.

This scenario is actually very common in real life. Importing packages in code is one example: we must design the directory structure sensibly, otherwise there will be circular dependencies and the compiler will report an error. So the compiler uses a similar algorithm to decide whether your code can be compiled.

When you see a dependency problem, the first thing that should come to mind is to turn it into a "directed graph". If the graph contains a cycle, there is a circular dependency.

Specifically, we can first regard courses as nodes in a "directed graph", numbered 0, 1, ..., numCourses-1, and regard the dependencies between courses as directed edges between nodes.

For example, if you must complete course 1 before you can take course 3, then there is a directed edge from node 1 to node 3.

So from the prerequisites array given in the problem we can build a graph like this:

If this directed graph contains a cycle, the courses have a circular dependency and it is impossible to finish them all; conversely, if there is no cycle, all courses can definitely be finished.

So to solve this problem, we first convert the input into a directed graph, then determine whether the graph contains a cycle.

How do we convert it into a graph? The previous article on graph theory fundamentals described two ways of storing a graph: the adjacency matrix and the adjacency list.

In my experience solving problems, the adjacency list is the more common choice, for example the following structure:

List<Integer>[] graph;

graph[s] is a list that stores the nodes that node s points to.

So we can first write a function that builds the graph:

List<Integer>[] buildGraph(int numCourses, int[][] prerequisites) {
    // The graph has numCourses nodes
    List<Integer>[] graph = new LinkedList[numCourses];
    for (int i = 0; i < numCourses; i++) {
        graph[i] = new LinkedList<>();
    }
    for (int[] edge : prerequisites) {
        int from = edge[1];
        int to = edge[0];
        // Course from must be finished before course to,
        // so add a directed edge from "from" to "to"
        graph[from].add(to);
    }
    return graph;
}
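To see what this function produces, here is a minimal self-contained sketch run on a small made-up input. The class name BuildGraphDemo and the sample prerequisites are assumptions for illustration only, and the sketch uses List<List<Integer>> instead of a generic array purely to avoid the unchecked-array warning.

```java
import java.util.ArrayList;
import java.util.List;

class BuildGraphDemo {
    // Same logic as buildGraph above, stored as a list of lists.
    static List<List<Integer>> buildGraph(int numCourses, int[][] prerequisites) {
        List<List<Integer>> graph = new ArrayList<>();
        for (int i = 0; i < numCourses; i++) {
            graph.add(new ArrayList<>());
        }
        for (int[] edge : prerequisites) {
            int from = edge[1];
            int to = edge[0];
            // from must be finished before to: edge from -> to
            graph.get(from).add(to);
        }
        return graph;
    }

    public static void main(String[] args) {
        // Hypothetical input: 1 and 2 require 0; 3 requires both 1 and 2.
        int[][] prerequisites = {{1, 0}, {2, 0}, {3, 1}, {3, 2}};
        List<List<Integer>> graph = buildGraph(4, prerequisites);
        for (int s = 0; s < graph.size(); s++) {
            System.out.println(s + " -> " + graph.get(s));
        }
        // Prints:
        // 0 -> [1, 2]
        // 1 -> [3]
        // 2 -> [3]
        // 3 -> []
    }
}
```

Each line of the output is one adjacency list: the nodes that the node on the left points to.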

The graph is built. How do we tell whether it contains a cycle?

Don't worry, let's first think about how to traverse the graph. Once you can traverse it, you can check for a cycle.

The previous article on graph theory fundamentals gave the framework for traversing a graph with DFS, which is just the multi-way tree traversal framework extended with a visited array:

// Prevents visiting the same node twice
boolean[] visited;
// DFS from node s, marking every visited node as true
void traverse(List<Integer>[] graph, int s) {
    if (visited[s]) {
        return;
    }
    /* Pre-order position */
    // Mark the current node as visited
    visited[s] = true;
    for (int t : graph[s]) {
        traverse(graph, t);
    }
    /* Post-order position */
}

Then we can directly apply this traversal code:

// Prevents visiting the same node twice
boolean[] visited;

boolean canFinish(int numCourses, int[][] prerequisites) {
    List<Integer>[] graph = buildGraph(numCourses, prerequisites);
    
    visited = new boolean[numCourses];
    for (int i = 0; i < numCourses; i++) {
        traverse(graph, i);
    }
    // Placeholder: this version only traverses the graph; cycle detection is added later
    return true;
}

void traverse(List<Integer>[] graph, int s) {
    // See the code above
}

Note that the nodes of the graph are not necessarily all connected, so a for loop runs the DFS once from every node as the starting point.

This way every node in the graph gets traversed; if you print the visited array, every entry should be true.
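The effect of the for loop can be checked on a graph that is not connected. The sketch below is an illustration under assumptions: the class name TraverseAllDemo, the helper visitFrom, and the two-component sample graph are all made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TraverseAllDemo {
    static boolean[] visited;

    // Plain DFS with a visited array, as in the framework above.
    static void traverse(List<List<Integer>> graph, int s) {
        if (visited[s]) {
            return;
        }
        visited[s] = true;
        for (int t : graph.get(s)) {
            traverse(graph, t);
        }
    }

    // Hypothetical helper: runs the DFS from the given start nodes only
    // and returns the resulting visited array.
    static boolean[] visitFrom(List<List<Integer>> graph, int... starts) {
        visited = new boolean[graph.size()];
        for (int s : starts) {
            traverse(graph, s);
        }
        return visited;
    }

    // Made-up graph with two separate components: 0 -> 1 and 2 -> 3.
    static List<List<Integer>> sampleGraph() {
        List<List<Integer>> graph = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            graph.add(new ArrayList<>());
        }
        graph.get(0).add(1);
        graph.get(2).add(3);
        return graph;
    }

    public static void main(String[] args) {
        List<List<Integer>> graph = sampleGraph();
        // Starting from node 0 alone never reaches nodes 2 and 3.
        System.out.println(Arrays.toString(visitFrom(graph, 0)));          // [true, true, false, false]
        // Starting from every node covers the whole graph.
        System.out.println(Arrays.toString(visitFrom(graph, 0, 1, 2, 3))); // [true, true, true, true]
    }
}
```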

The earlier article on the framework thinking for learning data structures and algorithms said that traversing a graph is much like traversing a multi-way tree, so this should be easy to follow by now.

So how do we determine whether the graph contains a cycle?

As the earlier article on the core pattern of backtracking algorithms said, you can think of a recursive function as a pointer that walks around the recursion tree. It is similar here:

You can also think of traverse as a pointer walking over the nodes of the graph; just add a boolean array onPath that records the path traverse is currently on:

boolean[] onPath;

boolean hasCycle = false;
boolean[] visited;

void traverse(List<Integer>[] graph, int s) {
    if (onPath[s]) {
        // Found a cycle!
        hasCycle = true;
    }
    if (visited[s]) {
        return;
    }
    // Mark node s as visited
    visited[s] = true;
    // Start traversing node s
    onPath[s] = true;
    for (int t : graph[s]) {
        traverse(graph, t);
    }
    // Finished traversing node s
    onPath[s] = false;
}

This is a bit like a backtracking algorithm. When entering node s, mark onPath[s] as true, and set it back to false when leaving. If you find that onPath[s] is already marked, a cycle has been found.

This way, cycle detection comes along for free while traversing the graph. The complete code is as follows:

// Records the nodes on the current traverse recursion path
boolean[] onPath;
// Records visited nodes, to avoid going over the same ground twice
boolean[] visited;
// Records whether the graph contains a cycle
boolean hasCycle = false;

boolean canFinish(int numCourses, int[][] prerequisites) {
    List<Integer>[] graph = buildGraph(numCourses, prerequisites);
    
    visited = new boolean[numCourses];
    onPath = new boolean[numCourses];
    
    for (int i = 0; i < numCourses; i++) {
        // Traverse every node in the graph
        traverse(graph, i);
    }
    // All courses can be finished as long as there is no circular dependency
    return !hasCycle;
}

void traverse(List<Integer>[] graph, int s) {
    if (onPath[s]) {
        // A cycle has been found
        hasCycle = true;
    }
    
    if (visited[s] || hasCycle) {
        // If a cycle has already been found, there is no need to keep traversing
        return;
    }
    // Pre-order position
    visited[s] = true;
    onPath[s] = true;
    for (int t : graph[s]) {
        traverse(graph, t);
    }
    // Post-order position
    onPath[s] = false;
}

List<Integer>[] buildGraph(int numCourses, int[][] prerequisites) {
    // See the code above
}

That solves this problem; the core of it is determining whether a directed graph contains a cycle.
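To try the solution on small inputs, here is a self-contained sketch of the same logic. The class name CycleCheckDemo and the three sample inputs are assumptions for illustration. The first input is a "diamond" in which node 3 is reached twice without any cycle, which is exactly the case where visited alone would not be enough and onPath is needed.

```java
import java.util.ArrayList;
import java.util.List;

class CycleCheckDemo {
    private boolean[] onPath;
    private boolean[] visited;
    private boolean hasCycle;

    boolean canFinish(int numCourses, int[][] prerequisites) {
        // Build the adjacency list: edge from prerequisite to dependent course.
        List<List<Integer>> graph = new ArrayList<>();
        for (int i = 0; i < numCourses; i++) {
            graph.add(new ArrayList<>());
        }
        for (int[] edge : prerequisites) {
            graph.get(edge[1]).add(edge[0]);
        }
        visited = new boolean[numCourses];
        onPath = new boolean[numCourses];
        hasCycle = false;
        for (int i = 0; i < numCourses; i++) {
            traverse(graph, i);
        }
        return !hasCycle;
    }

    private void traverse(List<List<Integer>> graph, int s) {
        if (onPath[s]) {
            // s is already on the current recursion path: cycle
            hasCycle = true;
        }
        if (visited[s] || hasCycle) {
            return;
        }
        visited[s] = true;
        onPath[s] = true;
        for (int t : graph.get(s)) {
            traverse(graph, t);
        }
        onPath[s] = false;
    }

    public static void main(String[] args) {
        // Diamond 0 -> 1 -> 3 and 0 -> 2 -> 3: node 3 is reached twice, but there is no cycle.
        System.out.println(new CycleCheckDemo().canFinish(4, new int[][]{{1, 0}, {2, 0}, {3, 1}, {3, 2}})); // true
        // Cycle 0 -> 1 -> 2 -> 0.
        System.out.println(new CycleCheckDemo().canFinish(3, new int[][]{{1, 0}, {2, 1}, {0, 2}}));         // false
        // A course that requires itself is a cycle of length one.
        System.out.println(new CycleCheckDemo().canFinish(1, new int[][]{{0, 0}}));                         // false
    }
}
```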

But what if the problem setter wants to annoy you further and asks you not only to determine whether there is a cycle, but also to return the specific nodes that form it?

You might say: aren't the nodes marked true in onPath exactly the nodes of the cycle?

No. Suppose the green nodes in the figure below are the recursion path; their onPath values are all true, but clearly only some of them form the cycle:

This question is left for you to think about. I will pin the correct answer at the top of the official account's comment section.
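For readers who want something to compare their own attempt against, here is one possible approach, offered as a sketch and not necessarily the answer the author has in mind. It assumes that, in addition to onPath, the current recursion path is kept in order in a list (named path here); when a node that is already on the path is reached again, the cycle is the part of the path from that node onward. The class name FindCycleDemo, the method findCycle, and the sample input are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class FindCycleDemo {
    private boolean[] onPath;
    private boolean[] visited;
    // Nodes on the current recursion path, in the order they were entered.
    private final List<Integer> path = new ArrayList<>();
    private List<Integer> cycle = new ArrayList<>();

    // Returns the nodes of one cycle, or an empty list if the graph is acyclic.
    List<Integer> findCycle(int numCourses, int[][] prerequisites) {
        List<List<Integer>> graph = new ArrayList<>();
        for (int i = 0; i < numCourses; i++) {
            graph.add(new ArrayList<>());
        }
        for (int[] edge : prerequisites) {
            graph.get(edge[1]).add(edge[0]);
        }
        visited = new boolean[numCourses];
        onPath = new boolean[numCourses];
        path.clear();
        cycle = new ArrayList<>();
        for (int i = 0; i < numCourses; i++) {
            traverse(graph, i);
        }
        return cycle;
    }

    private void traverse(List<List<Integer>> graph, int s) {
        if (!cycle.isEmpty()) {
            return; // a cycle has already been recorded
        }
        if (onPath[s]) {
            // Everything on the path before s leads into the cycle but is not part of it.
            cycle = new ArrayList<>(path.subList(path.indexOf(s), path.size()));
            return;
        }
        if (visited[s]) {
            return;
        }
        visited[s] = true;
        onPath[s] = true;
        path.add(s);
        for (int t : graph.get(s)) {
            traverse(graph, t);
        }
        path.remove(path.size() - 1); // remove by index: leave node s
        onPath[s] = false;
    }

    public static void main(String[] args) {
        // Made-up input with edges 0 -> 1 -> 2 -> 3 -> 1: node 0 is on the path but not in the cycle.
        int[][] prerequisites = {{1, 0}, {2, 1}, {3, 2}, {1, 3}};
        System.out.println(new FindCycleDemo().findCycle(4, prerequisites)); // [1, 2, 3]
    }
}
```

In the sample, all of 0, 1, 2, 3 have onPath set to true when the cycle is found, yet only 1, 2, 3 are returned, which matches the point made above about the green nodes.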

Next, let's talk about a classic graph algorithm: topological sorting.

Topological sort

Take a look at LeetCode problem 210, "Course Schedule II":

This is the advanced version of the previous problem. Instead of only asking whether all courses can be finished, it asks you to return a valid order in which to take them, guaranteeing that by the time you start each course, its prerequisites have already been completed.

The function signature is as follows:

int[] findOrder(int numCourses, int[][] prerequisites);

First, a word about the term topological sorting. The definitions you find online are very mathematical, so here is a picture from Baidu Baike to give you an intuitive feel:

Intuitively, topological sorting "flattens" a graph so that in the flattened picture all arrows point the same way; for example, every arrow in the picture above points to the right.

Obviously, if a directed graph contains a cycle, it cannot be topologically sorted, because it is impossible to make all the arrows point the same way; conversely, if a graph is a "directed acyclic graph", it can always be topologically sorted.

But what does our problem have to do with topological sorting?

It's easy to see: if courses are abstracted as nodes and the dependencies between courses as directed edges, then a topological ordering of this graph is an order in which to take the courses.
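The "all arrows point the same way" idea can be written down as a check: in a valid order, every edge goes from an earlier position to a later one. The sketch below is only an illustration; the class name TopoOrderCheck, the method isTopologicalOrder, and the sample inputs are assumptions, not part of the problem.

```java
import java.util.Arrays;

class TopoOrderCheck {
    // Returns true if order contains every course 0..numCourses-1 exactly once
    // and every prerequisite appears before the course that depends on it.
    static boolean isTopologicalOrder(int numCourses, int[][] prerequisites, int[] order) {
        if (order.length != numCourses) {
            return false;
        }
        // position[c] = index of course c in the order, or -1 if not seen yet
        int[] position = new int[numCourses];
        Arrays.fill(position, -1);
        for (int i = 0; i < order.length; i++) {
            int course = order[i];
            if (course < 0 || course >= numCourses || position[course] != -1) {
                return false; // out of range or listed twice
            }
            position[course] = i;
        }
        for (int[] edge : prerequisites) {
            int from = edge[1];
            int to = edge[0];
            // The arrow from -> to must point "to the right".
            if (position[from] >= position[to]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up input: 1 and 2 require 0; 3 requires both 1 and 2.
        int[][] prerequisites = {{1, 0}, {2, 0}, {3, 1}, {3, 2}};
        System.out.println(isTopologicalOrder(4, prerequisites, new int[]{0, 2, 1, 3})); // true
        System.out.println(isTopologicalOrder(4, prerequisites, new int[]{0, 1, 2, 3})); // true
        System.out.println(isTopologicalOrder(4, prerequisites, new int[]{3, 1, 2, 0})); // false
    }
}
```

The first two calls also show that a graph can have more than one valid topological ordering.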

First, we check whether the course dependencies in the input form a cycle. If they do, topological sorting is impossible, so we can reuse the main function of the previous problem:

public int[] findOrder(int numCourses, int[][] prerequisites) {
    if (!canFinish(numCourses, prerequisites)) {
        // It is impossible to finish all courses
        return new int[]{};
    }
    // ...
}

So the key question is: how do we perform the topological sort? Is it time to show off some fancy technique again?

It is actually very simple: reverse the result of a post-order traversal and you get the topological ordering.

Look at the solution code directly:

boolean[] visited;
// Records the post-order traversal result
List<Integer> postorder = new ArrayList<>();

int[] findOrder(int numCourses, int[][] prerequisites) {
    // First make sure the graph has no cycle
    if (!canFinish(numCourses, prerequisites)) {
        return new int[]{};
    }
    // Build the graph
    List<Integer>[] graph = buildGraph(numCourses, prerequisites);
    // Run the DFS traversal
    visited = new boolean[numCourses];
    for (int i = 0; i < numCourses; i++) {
        traverse(graph, i);
    }
    // Reverse the post-order result and convert it to int[]
    Collections.reverse(postorder);
    int[] res = new int[numCourses];
    for (int i = 0; i < numCourses; i++) {
        res[i] = postorder.get(i);
    }
    return res;
}

void traverse(List<Integer>[] graph, int s) {
    if (visited[s]) {
        return;
    }
    
    visited[s] = true;
    for (int t : graph[s]) {
        traverse(graph, t);
    }
    // Post-order position
    postorder.add(s);
}

// See the solution to the previous problem
// (its traversal helper needs a different name so it does not clash with traverse above)
boolean canFinish(int numCourses, int[][] prerequisites);

// See the code above
List<Integer>[] buildGraph(int numCourses, int[][] prerequisites);

Although the code looks long, the logic should be very clear. As long as the graph has no cycle, we call the traverse function to run a DFS over the graph, record the post-order traversal result, and finally reverse that result to get the answer.
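For readers who want to run the whole thing, here is one way the pieces might be assembled into a single class. It follows the same two steps (cycle check, then reversed post-order); the class name CourseOrderDemo and the helper names detectCycle and collectPostorder are assumptions introduced so that the two traversals do not share a name, and the sample inputs are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CourseOrderDemo {
    private boolean[] visited;
    private boolean[] onPath;
    private boolean hasCycle;
    private final List<Integer> postorder = new ArrayList<>();

    int[] findOrder(int numCourses, int[][] prerequisites) {
        List<List<Integer>> graph = new ArrayList<>();
        for (int i = 0; i < numCourses; i++) {
            graph.add(new ArrayList<>());
        }
        for (int[] edge : prerequisites) {
            graph.get(edge[1]).add(edge[0]); // prerequisite -> dependent course
        }
        // Step 1: cycle detection, as in the previous problem.
        visited = new boolean[numCourses];
        onPath = new boolean[numCourses];
        hasCycle = false;
        for (int i = 0; i < numCourses; i++) {
            detectCycle(graph, i);
        }
        if (hasCycle) {
            return new int[]{};
        }
        // Step 2: post-order DFS, then reverse.
        visited = new boolean[numCourses];
        postorder.clear();
        for (int i = 0; i < numCourses; i++) {
            collectPostorder(graph, i);
        }
        Collections.reverse(postorder);
        int[] res = new int[numCourses];
        for (int i = 0; i < numCourses; i++) {
            res[i] = postorder.get(i);
        }
        return res;
    }

    private void detectCycle(List<List<Integer>> graph, int s) {
        if (onPath[s]) {
            hasCycle = true;
        }
        if (visited[s] || hasCycle) {
            return;
        }
        visited[s] = true;
        onPath[s] = true;
        for (int t : graph.get(s)) {
            detectCycle(graph, t);
        }
        onPath[s] = false;
    }

    private void collectPostorder(List<List<Integer>> graph, int s) {
        if (visited[s]) {
            return;
        }
        visited[s] = true;
        for (int t : graph.get(s)) {
            collectPostorder(graph, t);
        }
        postorder.add(s); // post-order position
    }

    public static void main(String[] args) {
        // 1 and 2 require 0; 3 requires both 1 and 2. Post-order is [3, 1, 2, 0].
        int[][] diamond = {{1, 0}, {2, 0}, {3, 1}, {3, 2}};
        System.out.println(Arrays.toString(new CourseOrderDemo().findOrder(4, diamond)));                    // [0, 2, 1, 3]
        // 0 and 1 require each other, so no order exists.
        System.out.println(Arrays.toString(new CourseOrderDemo().findOrder(2, new int[][]{{1, 0}, {0, 1}}))); // []
    }
}
```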

So why is the reversed post-order traversal a topological ordering?

I will skip the mathematical proof here and explain with an intuitive example. Let's take binary trees, using the binary tree traversal framework we have covered many times:

void traverse(TreeNode root) {
    if (root == null) {
        return;
    }
    // Pre-order position
    traverse(root.left);
    // In-order position
    traverse(root.right);
    // Post-order position
}

When does the post-order position of a binary tree run? Only after the left and right subtrees have been traversed. In other words, the root node is added to the result list only after all nodes of the left and right subtrees have been added.

This property of post-order traversal is very important. Topological sorting is based on post-order traversal because a task can start only after all the tasks it depends on have been completed.

Think of each task as a node in the binary tree and the tasks it depends on as its child nodes. Shouldn't you process all the child nodes before processing the parent? Isn't that exactly a post-order traversal?

Now for why the post-order result has to be reversed to get the final topological ordering.

We said a node can be thought of as a task and its child nodes as that task's dependencies. But note how we expressed dependencies earlier: if A must be finished before B can be done, then there is a directed edge from A to B, meaning that B depends on A.

So if the parent node depends on its child nodes, that should look like this in a binary tree:

Isn't that the opposite of the way pointers go in a normal binary tree? That is why the normal post-order result has to be reversed to get the topological ordering.
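The role of edge direction can be seen by building the graph the other way around. In the sketch below, which is an illustration under assumptions and not the article's solution, each course points to the courses it depends on, like a parent pointing to its children. With edges in that direction the plain post-order is already a valid order and no reversal is needed. The class name DependencyPostorderDemo and the method name are made up, and the input is assumed to be acyclic.

```java
import java.util.ArrayList;
import java.util.List;

class DependencyPostorderDemo {
    // Assumes the input has no cycle. Edges point from a course to its prerequisites.
    static List<Integer> postorderOverDependencies(int numCourses, int[][] prerequisites) {
        List<List<Integer>> dependsOn = new ArrayList<>();
        for (int i = 0; i < numCourses; i++) {
            dependsOn.add(new ArrayList<>());
        }
        for (int[] edge : prerequisites) {
            // edge[0] depends on edge[1]: the opposite direction to buildGraph
            dependsOn.get(edge[0]).add(edge[1]);
        }
        boolean[] visited = new boolean[numCourses];
        List<Integer> postorder = new ArrayList<>();
        for (int i = 0; i < numCourses; i++) {
            traverse(dependsOn, i, visited, postorder);
        }
        return postorder;
    }

    private static void traverse(List<List<Integer>> dependsOn, int s,
                                 boolean[] visited, List<Integer> postorder) {
        if (visited[s]) {
            return;
        }
        visited[s] = true;
        for (int t : dependsOn.get(s)) {
            traverse(dependsOn, t, visited, postorder);
        }
        // All dependencies of s are already in the list, so s can be taken now.
        postorder.add(s);
    }

    public static void main(String[] args) {
        // Made-up input: course 0 requires 1, and course 1 requires 2.
        int[][] prerequisites = {{0, 1}, {1, 2}};
        System.out.println(postorderOverDependencies(3, prerequisites)); // [2, 1, 0]
    }
}
```

With buildGraph's direction (prerequisite pointing to dependent course), the same traversal would produce the list in the opposite sense, which is why the solution above reverses it.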

That is my brief explanation of why the topological ordering is the reverse of the post-order traversal. My explanation is intuitive but is not a rigorous mathematical proof; interested readers can look one up.

In short, remember that a topological ordering is the reversed post-order traversal, that topological sorting only works on directed acyclic graphs, and that cycle detection is needed before topological sorting. These points are enough.

＿＿＿＿＿＿＿＿＿＿＿＿＿

For more high-quality algorithm articles, click on my avatar. I will take you through LeetCode step by step and am dedicated to explaining algorithms clearly! My algorithm tutorial has earned 90k stars; feel free to give it a like!

labuladong
