
After reading this article, you will not only have learned the algorithm routine, but you can also go to LeetCode and solve the following problem:

797. All Paths From Source to Target (medium)

-----------

Readers often ask me about the "graph" data structure. As I said in Framework Thinking for Learning Data Structures and Algorithms, although graphs support more algorithms and can solve more complex problems, in essence a graph can be regarded as an extension of an N-ary tree.

Graph problems rarely appear in interviews and written tests. Even when they do, most of them are simple traversal questions, which can largely reuse the traversal of N-ary trees.

So this article sticks to the style of our account and covers only the most practical part of "graphs", the part closest to us, so that you get an intuitive understanding of them. At the end of the article I list other classic graph algorithms, which you should be able to pick up after understanding this article.

The logical structure and concrete implementation of graphs

A graph is composed of nodes and edges. Its logical structure is as follows:

What is a "logical structure"? It means that, for the convenience of study, we abstract the graph into this form.

According to this logical structure, we can think that the implementation of each node is as follows:

/* Logical structure of a graph node */
class Vertex {
    int id;
    Vertex[] neighbors;
}

Does this implementation look familiar? It is almost exactly the same as the N-ary tree node we talked about earlier:

/* Basic N-ary tree node */
class TreeNode {
    int val;
    TreeNode[] children;
}

So a graph is really nothing profound; it is just a slightly more advanced N-ary tree.

However, the implementation above is only "logical". In practice we rarely use this Vertex class to implement a graph; instead we use the frequently mentioned adjacency list and adjacency matrix.

Take the graph from just now as an example:

Stored as an adjacency list and as an adjacency matrix, it looks like this:

The adjacency list is very intuitive. I store the neighbors of each node x in a list and associate x with that list, so that from a node x I can find all of its neighbors.

The adjacency matrix is a two-dimensional boolean array, which we call matrix. If nodes x and y are connected, set matrix[x][y] to true (the green squares in the figure above represent true). To find the neighbors of x, just scan matrix[x][..].

If expressed in code form, the adjacency list and adjacency matrix will look like this:

// Adjacency list
// graph[x] stores all the neighbors of x
List<Integer>[] graph;

// Adjacency matrix
// matrix[x][y] records whether there is an edge from x to y
boolean[][] matrix;
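As a minimal sketch of how these two declarations might be filled in, the following builds both representations from a list of directed edges. The class name GraphStorageDemo, the helper methods, and the sample edge list are assumptions made for illustration; they are not part of the original article.

```java
import java.util.ArrayList;
import java.util.List;

class GraphStorageDemo {
    // Build an adjacency list for a directed graph with n nodes.
    @SuppressWarnings("unchecked")
    static List<Integer>[] buildAdjacencyList(int n, int[][] edges) {
        List<Integer>[] graph = new List[n];
        for (int i = 0; i < n; i++) {
            graph[i] = new ArrayList<>();
        }
        for (int[] e : edges) {
            graph[e[0]].add(e[1]); // edge e[0] -> e[1]
        }
        return graph;
    }

    // Build an adjacency matrix for the same directed graph.
    static boolean[][] buildAdjacencyMatrix(int n, int[][] edges) {
        boolean[][] matrix = new boolean[n][n];
        for (int[] e : edges) {
            matrix[e[0]][e[1]] = true; // edge e[0] -> e[1]
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Hypothetical sample graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
        int[][] edges = {{0, 1}, {0, 2}, {1, 3}, {2, 3}};
        List<Integer>[] graph = buildAdjacencyList(4, edges);
        boolean[][] matrix = buildAdjacencyMatrix(4, edges);
        System.out.println(graph[0]);     // prints [1, 2]
        System.out.println(matrix[1][3]); // prints true
    }
}
```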

So why are there two ways to store graphs? It must be because each has its pros and cons.

For adjacency lists, the benefit is that they take up less space.

You can see that the adjacency matrix has many empty positions, so it definitely needs more storage space.

However, the adjacency list cannot quickly determine whether two nodes are adjacent.

For example, to determine whether node 1 is adjacent to node 3, I have to go through the neighbor list of 1 and check whether 3 is in it. With the adjacency matrix it is simple: just look at matrix[1][3], which is very efficient.

Therefore, which method to use to implement the graph depends on the specific situation.
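The trade-off can be sketched as two versions of an adjacency check. The class AdjacencyCheckDemo and the method name hasEdge are hypothetical names chosen for this illustration: the list version has to scan the neighbors of x, while the matrix version is a single array lookup.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AdjacencyCheckDemo {
    // Adjacency list: scan the neighbor list of x, time proportional to its length.
    static boolean hasEdge(List<Integer>[] graph, int x, int y) {
        for (int neighbor : graph[x]) {
            if (neighbor == y) {
                return true;
            }
        }
        return false;
    }

    // Adjacency matrix: a single lookup, constant time.
    static boolean hasEdge(boolean[][] matrix, int x, int y) {
        return matrix[x][y];
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        // Hypothetical sample graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
        List<Integer>[] graph = new List[] {
            Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(3),
            Collections.<Integer>emptyList()
        };
        boolean[][] matrix = new boolean[4][4];
        matrix[0][1] = matrix[0][2] = matrix[1][3] = matrix[2][3] = true;

        System.out.println(hasEdge(graph, 1, 3));  // prints true
        System.out.println(hasEdge(matrix, 1, 3)); // prints true
        System.out.println(hasEdge(graph, 3, 1));  // prints false
    }
}
```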

Well, for the "graph" data structure, understanding the above is enough.

You may then ask: this graph model is only a "directed, unweighted graph". What about weighted graphs, undirected graphs, and so on?

In fact, these more complex models are all derived from this simplest graph.

How do we implement a weighted graph? It's very simple:

With an adjacency list, we store not only all the neighbors of x but also the weight of the edge from x to each neighbor. Isn't that a weighted directed graph?

With an adjacency matrix, matrix[x][y] is no longer a boolean value but an int value: 0 means no connection, and any other value is the weight. Isn't that a weighted directed graph?

If it is expressed in the form of code, it will look like this:

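// Adjacency list
// graph[x] stores all the neighbors of x together with the corresponding weights
List<int[]>[] graph;

// Adjacency matrix
// matrix[x][y] records the weight of the edge from x to y; 0 means not adjacent
int[][] matrix;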

How do we implement an undirected graph? That is also very simple. Isn't "undirected" just equivalent to "bidirectional"?

To connect nodes x and y in an undirected graph, set both matrix[x][y] and matrix[y][x] to true. The adjacency list is handled similarly: add y to the neighbor list of x, and add x to the neighbor list of y.

Putting the above tricks together gives you an undirected weighted graph...
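Combining the two tricks might look like the sketch below. The class UndirectedWeightedGraph and its methods are hypothetical, and it keeps both representations side by side purely for illustration; real code would normally pick one. Each list entry is assumed to be an int[] of the form {neighbor, weight}.

```java
import java.util.ArrayList;
import java.util.List;

class UndirectedWeightedGraph {
    // graph[x] holds {neighbor, weight} pairs
    private final List<int[]>[] graph;
    // matrix[x][y] is the edge weight; 0 means not adjacent
    private final int[][] matrix;

    @SuppressWarnings("unchecked")
    UndirectedWeightedGraph(int n) {
        graph = new List[n];
        for (int i = 0; i < n; i++) {
            graph[i] = new ArrayList<>();
        }
        matrix = new int[n][n];
    }

    // "Undirected" means the edge is recorded in both directions.
    void addEdge(int x, int y, int weight) {
        graph[x].add(new int[] {y, weight});
        graph[y].add(new int[] {x, weight});
        matrix[x][y] = weight;
        matrix[y][x] = weight;
    }

    List<int[]> neighbors(int x) {
        return graph[x];
    }

    int weight(int x, int y) {
        return matrix[x][y];
    }

    public static void main(String[] args) {
        UndirectedWeightedGraph g = new UndirectedWeightedGraph(3);
        g.addEdge(0, 1, 5);
        g.addEdge(1, 2, 7);
        System.out.println(g.weight(1, 0));        // prints 5
        System.out.println(g.neighbors(1).size()); // prints 2
    }
}
```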

Alright, that's all for the basic introduction to graphs. Now, no matter what kind of messy graph comes along, you should have a clear idea of how to handle it.

Let's take a look at the problem that all data structures cannot escape: traversal.

Graph traversal

Framework Thinking for Learning Data Structures and Algorithms said that data structures are invented for nothing more than traversal and access, so "traversal" is the foundation of all data structures.

How do we traverse a graph? Again, look at the N-ary tree, whose traversal framework is as follows:

/* N-ary tree traversal framework */
void traverse(TreeNode root) {
    if (root == null) return;

    for (TreeNode child : root.children) {
        traverse(child);
    }
}

The biggest difference between a graph and an N-ary tree is that a graph may contain cycles. If you start traversing from some node in the graph, you may come back to that same node after going around a cycle.

So, if the graph contains cycles, the traversal framework needs a visited array to assist:

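// Records the nodes that have been traversed
boolean[] visited;
// Records the path from the starting point to the current node
boolean[] onPath;

/* Graph traversal framework */
void traverse(Graph graph, int s) {
    if (visited[s]) return;
    // Passing through node s: mark it as traversed
    visited[s] = true;
    // Make a choice: mark node s as on the path
    onPath[s] = true;
    for (int neighbor : graph.neighbors(s)) {
        traverse(graph, neighbor);
    }
    // Undo the choice: node s leaves the path
    onPath[s] = false;
}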

Note the difference between the visited array and the onPath array. Since a binary tree is a special kind of graph, we can use the process of traversing a binary tree to understand the difference between the two arrays:

The GIF above shows the recursive traversal of a binary tree. Nodes marked true in visited are shown in gray, and nodes marked true in onPath are shown in green. Now you can see the difference between the two, right?

If you are asked to handle path-related problems, this onPath variable will definitely be used, for example in topological sorting.
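One path-related check that onPath makes possible is cycle detection, which topological sorting depends on because a topological order exists only for acyclic graphs. The following is a minimal sketch under stated assumptions, not the article's own topological sort code: the class CycleDetectionDemo, the field cycleFound, and the int[][] adjacency-list input format are all chosen for illustration.

```java
class CycleDetectionDemo {
    private final int[][] graph;     // graph[s] lists the neighbors of s
    private final boolean[] visited; // nodes that have been traversed
    private final boolean[] onPath;  // nodes on the current recursion path
    private boolean cycleFound = false;

    private CycleDetectionDemo(int[][] graph) {
        this.graph = graph;
        this.visited = new boolean[graph.length];
        this.onPath = new boolean[graph.length];
    }

    // Returns true if the directed graph contains a cycle.
    static boolean hasCycle(int[][] graph) {
        CycleDetectionDemo demo = new CycleDetectionDemo(graph);
        // The graph may be disconnected, so start a traversal from every node.
        for (int s = 0; s < graph.length; s++) {
            demo.traverse(s);
        }
        return demo.cycleFound;
    }

    private void traverse(int s) {
        if (onPath[s]) {
            // s is already on the current path: we walked around a cycle.
            cycleFound = true;
            return;
        }
        if (visited[s] || cycleFound) {
            return;
        }
        visited[s] = true;
        onPath[s] = true;  // make a choice
        for (int neighbor : graph[s]) {
            traverse(neighbor);
        }
        onPath[s] = false; // undo the choice
    }

    public static void main(String[] args) {
        int[][] cyclic = {{1}, {2}, {0}};           // 0 -> 1 -> 2 -> 0
        int[][] acyclic = {{1, 2}, {3}, {3}, {}};   // a DAG
        System.out.println(hasCycle(cyclic));  // prints true
        System.out.println(hasCycle(acyclic)); // prints false
    }
}
```

Note that the onPath check has to come before the visited check: a node on the current path is always visited as well, so testing visited first would hide the cycle.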

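In addition, you should have noticed that the operations on the onPath array look very much like "make a choice" and "undo the choice" in the core routine of the backtracking algorithm. The difference lies in the position: in backtracking, "make a choice" and "undo the choice" are inside the for loop, while the operations on the onPath array are outside the for loop.

The only difference between placing them inside and outside the for loop is how the root node is handled.

For example, consider the following two N-ary tree traversals: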

void traverse(TreeNode root) {
    if (root == null) return;
    System.out.println("enter: " + root.val);
    for (TreeNode child : root.children) {
        traverse(child);
    }
    System.out.println("leave: " + root.val);
}

void traverse(TreeNode root) {
    if (root == null) return;
    for (TreeNode child : root.children) {
        System.out.println("enter: " + child.val);
        traverse(child);
        System.out.println("leave: " + child.val);
    }
}

The former correctly prints the enter and leave information of every node, while the latter misses the enter and leave information of the root node of the whole tree.
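To see the difference concretely, here is a small runnable sketch of the two variants. It collects the messages into a list instead of printing them directly; the class TraversalPositionDemo, the method names traverseOutside and traverseInside, and the three-node sample tree are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

class TraversalPositionDemo {
    static class TreeNode {
        int val;
        TreeNode[] children;

        TreeNode(int val, TreeNode... children) {
            this.val = val;
            this.children = children;
        }
    }

    // Variant 1: enter/leave recorded outside the for loop.
    static void traverseOutside(TreeNode root, List<String> log) {
        if (root == null) return;
        log.add("enter: " + root.val);
        for (TreeNode child : root.children) {
            traverseOutside(child, log);
        }
        log.add("leave: " + root.val);
    }

    // Variant 2: enter/leave recorded inside the for loop.
    static void traverseInside(TreeNode root, List<String> log) {
        if (root == null) return;
        for (TreeNode child : root.children) {
            log.add("enter: " + child.val);
            traverseInside(child, log);
            log.add("leave: " + child.val);
        }
    }

    public static void main(String[] args) {
        // Sample tree: node 1 with children 2 and 3
        TreeNode root = new TreeNode(1, new TreeNode(2), new TreeNode(3));

        List<String> outside = new ArrayList<>();
        traverseOutside(root, outside);
        // prints [enter: 1, enter: 2, leave: 2, enter: 3, leave: 3, leave: 1]
        System.out.println(outside);

        List<String> inside = new ArrayList<>();
        traverseInside(root, inside);
        // prints [enter: 2, leave: 2, enter: 3, leave: 3]
        System.out.println(inside);
    }
}
```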

Why does the backtracking algorithm framework use the latter? Because backtracking is concerned not with the nodes but with the branches. If you don't believe it, look at the figures in the backtracking core routine article.

Obviously, for the "graph" traversal here, we should put the onPath operations outside the for loop, otherwise the starting node will be missing from the record.

Having said so much about onPath, let's talk about the visited array, whose purpose is obvious: since the graph may contain cycles, the visited array prevents the recursion from traversing the same node repeatedly and falling into an infinite loop.

Of course, if the problem tells you that the graph contains no cycles, you can omit the visited array, and the traversal is then basically that of an N-ary tree.
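As a minimal sketch of what the visited array buys you, the traversal below runs on a graph that does contain a cycle and still terminates. The class VisitedTraversalDemo, the method visitOrder, and the int[][] adjacency-list input are assumptions for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class VisitedTraversalDemo {
    // Returns the nodes reachable from start, in the order they are first visited.
    static List<Integer> visitOrder(int[][] graph, int start) {
        List<Integer> order = new ArrayList<>();
        traverse(graph, start, new boolean[graph.length], order);
        return order;
    }

    private static void traverse(int[][] graph, int s, boolean[] visited, List<Integer> order) {
        // Without this check, the cycle 0 -> 1 -> 2 -> 0 would recurse forever.
        if (visited[s]) return;
        visited[s] = true;
        order.add(s);
        for (int neighbor : graph[s]) {
            traverse(graph, neighbor, visited, order);
        }
    }

    public static void main(String[] args) {
        // 0 -> 1 -> 2 -> 0 forms a cycle; 2 also points to 3
        int[][] graph = {{1}, {2}, {0, 3}, {}};
        System.out.println(visitOrder(graph, 0)); // prints [0, 1, 2, 3]
    }
}
```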

Practice problem

Let's look at LeetCode problem 797, "All Paths From Source to Target". The function signature is as follows:

List<List<Integer>> allPathsSourceTarget(int[][] graph);

The input is a directed acyclic graph with n nodes, labeled 0, 1, 2, ..., n - 1. Compute all paths from node 0 to node n - 1.

The input graph is actually represented as an "adjacency list": graph[i] stores all the neighbors of node i.

For example, the input graph = [[1,2],[3],[3],[]] represents the following graph:

The algorithm should return [[0,1,3],[0,2,3]], which is all paths from 0 to 3.

The solution is very simple. Take 0 as the starting point, traverse the graph, and record the path traversed along the way. When the traversal reaches the end point, record the path.

Since the input graph is acyclic, we do not need the visited array, and directly apply the graph traversal framework:

// Records all paths
List<List<Integer>> res = new LinkedList<>();

public List<List<Integer>> allPathsSourceTarget(int[][] graph) {
    // Maintains the path taken during the recursion
    LinkedList<Integer> path = new LinkedList<>();
    traverse(graph, 0, path);
    return res;
}

/* Graph traversal framework */
void traverse(int[][] graph, int s, LinkedList<Integer> path) {

    // Add node s to the path
    path.addLast(s);

    int n = graph.length;
    if (s == n - 1) {
        // Reached the end point: save a copy of the current path
        res.add(new LinkedList<>(path));
        path.removeLast();
        return;
    }

    // Recurse into each neighbor
    for (int v : graph[s]) {
        traverse(graph, v, path);
    }

    // Remove node s from the path
    path.removeLast();
}

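That solves the problem. Pay attention to a language feature of Java: when adding path to res, you need to copy it into a new list, otherwise every list in res will end up empty, because they would all refer to the same path object, which is emptied again as the recursion backtracks.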
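The copying pitfall can be reproduced in isolation. The following sketch is not part of the solution itself: the class PathCopyDemo and the method collect are hypothetical, and the path 0, 1, 3 is simply pushed and popped by hand to mimic what the recursion does.

```java
import java.util.LinkedList;
import java.util.List;

class PathCopyDemo {
    // Records the path 0 -> 1 -> 3 into res, then backtracks.
    // If copy is false, res stores a reference to the shared path object.
    static List<List<Integer>> collect(boolean copy) {
        List<List<Integer>> res = new LinkedList<>();
        LinkedList<Integer> path = new LinkedList<>();
        path.addLast(0);
        path.addLast(1);
        path.addLast(3);

        if (copy) {
            res.add(new LinkedList<>(path)); // snapshot of the current path
        } else {
            res.add(path);                   // same object that is mutated below
        }

        // Backtracking removes the nodes again, as the recursion would.
        path.removeLast();
        path.removeLast();
        path.removeLast();
        return res;
    }

    public static void main(String[] args) {
        System.out.println(collect(true));  // prints [[0, 1, 3]]
        System.out.println(collect(false)); // prints [[]]
    }
}
```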

Finally, to sum up: graphs are mainly stored as adjacency lists or adjacency matrices. No matter how fancy a graph is, it can be stored in one of these two ways.

In written tests, the most commonly tested graph algorithm is traversal, whose framework is very similar to that of N-ary tree traversal.

Of course, there are many other interesting graph algorithms, such as bipartite graph detection, cycle detection and topological sorting (similar to the way compilers detect circular references), minimum spanning trees, Dijkstra's shortest path algorithm, and so on. Interested readers can go and look them up. That's all for this article.

_____________

To see more high-quality algorithm articles, click on my avatar. I walk you through LeetCode problems step by step and am dedicated to explaining algorithms clearly! My algorithm tutorial has won 90k stars, and you are welcome to like it!

labuladong