What graph theory topics are tested in interviews? This article tells you!


Graph theory is a branch of mathematics that takes graphs as its object of study. A graph in graph theory consists of a set of given points and lines connecting pairs of points. It is usually used to describe some relationship among certain things: a point represents a thing, and a line connecting two points indicates that the relationship holds between the two corresponding things.

The following shows the logical structure of a graph:

(Figure: the logical structure of a graph)

A graph is one of the most complex data structures, and the data structures covered earlier can all be regarded as special cases of graphs. Then why not call everything a graph, instead of dividing things into so many data structures?

This is because in many cases you don't need such complex functionality, and many features of graphs go unused. Calling everything a graph would also make communication very awkward. You would not want to tell someone every time that "this problem tests a special kind of graph, the kind that...". That is far too long-winded, so the special cases of graphs are given their own names, which makes communication easier. We only reach for a "real" graph when we encounter very complicated situations.

The previous chapter mentioned that data structures serve algorithms: a data structure stores data so that algorithms can run more efficiently. So when do you need a graph to store data, and where does its efficiency come from? The answer is simple: if the data cannot be stored well with a simpler data structure, use a graph. For example, if we need to store two-way friend relationships, and these relationships are many-to-many, then we must use a graph, because other data structures cannot model it.

Basic concepts

Undirected Graph & Directed Graph

As mentioned earlier, binary trees can fully represent other tree structures. Similarly, directed graphs can fully represent undirected graphs and mixed graphs, so directed graphs have always been the focus of study.

All the graphs in this article are directed graphs.

As mentioned earlier, we use a line connecting two points to indicate a relationship between the two corresponding things. If that relationship has a direction, the graph is a directed graph; otherwise it is an undirected graph. For example, if A knows B, B does not necessarily know A. The relationship is one-way, so we need a directed graph to represent it. With an undirected graph, we could not tell whether the edge between A and B means that A knows B or that B knows A.
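As a minimal sketch of the idea that a directed graph can stand in for an undirected one, an undirected edge can be stored as two directed edges, one in each direction. The helper names `add_directed_edge` and `add_undirected_edge` are made up for this illustration and do not come from any library.

```python
from collections import defaultdict


def add_directed_edge(graph, u, v):
    # u -> v only: "u knows v" says nothing about v knowing u
    graph[u].append(v)


def add_undirected_edge(graph, u, v):
    # an undirected edge is stored as a pair of directed edges
    add_directed_edge(graph, u, v)
    add_directed_edge(graph, v, u)


knows = defaultdict(list)
add_directed_edge(knows, "A", "B")      # A knows B

friends = defaultdict(list)
add_undirected_edge(friends, "A", "B")  # A and B are friends
```

Here `knows["B"]` stays empty, while `friends["B"]` contains `"A"`.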

By convention, when drawing graphs we use edges with arrows for directed graphs and edges without arrows for undirected graphs.

Weighted Graph & Unweighted Graph

If the edges carry weights, the graph is a weighted graph; otherwise it is an unweighted graph. So what is a weight? Exchange rates are an example of a weighted graph: if 1 unit of currency A is exchanged for 5 units of currency B, then the weight of the edge from A to B is 5. A relationship like friendship can be seen as an unweighted graph.

Indegree & Outdegree

The number of edges pointing to node A is the in-degree of A. Similarly, the number of edges leaving A is the out-degree of A.

Still taking the above graph as an example, the in-degree and out-degree of all nodes in this graph are both 1.
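A small sketch of how the two degrees can be counted from an edge list. The function name `degrees` and the three-node cycle below are assumptions chosen for illustration; they are not the graph from the figure.

```python
from collections import Counter


def degrees(edges):
    """Return (indegree, outdegree) counters for a list of directed (u, v) edges."""
    indegree, outdegree = Counter(), Counter()
    for u, v in edges:
        outdegree[u] += 1  # the edge leaves u
        indegree[v] += 1   # the edge enters v
    return indegree, outdegree


# a hypothetical 3-node cycle: A -> B -> C -> A
indegree, outdegree = degrees([("A", "B"), ("B", "C"), ("C", "A")])
```

In this cycle every node has in-degree 1 and out-degree 1.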

Path & Cycle

Cyclic Graph: the graph above is a cyclic graph, because we can start from some point in the graph and return to that starting point. This is just like a ring in real life.
Acyclic Graph

I can turn the graph above into an acyclic graph with a small modification, after which it no longer contains a cycle.

Connected graph & strongly connected graph

In an undirected graph, if there is a path between any two vertices i and j, the graph is called a connected graph.

In a directed graph, if there is a path between any two vertices i and j in both directions, the graph is called a strongly connected graph.
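One way to check strong connectivity, sketched under the assumption that the graph is given as a vertex list plus a directed edge list: pick any vertex and verify that it reaches every vertex both in the graph and in the graph with all edges reversed. The names `reachable` and `is_strongly_connected` are hypothetical helpers, not a library API.

```python
def reachable(adj, start):
    """Set of vertices reachable from start by following directed edges."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_strongly_connected(vertices, edges):
    # assumes every endpoint in edges also appears in vertices
    adj, radj = {}, {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        radj.setdefault(v, []).append(u)  # same edge, reversed
    if not vertices:
        return True
    start = vertices[0]
    everyone = set(vertices)
    # start reaches everyone, and everyone reaches start
    return reachable(adj, start) == everyone and reachable(radj, start) == everyone
```

A cycle 0 -> 1 -> 2 -> 0 passes the check, while the chain 0 -> 1 -> 2 does not, because nothing leads back to 0.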

Spanning tree

A spanning tree of a connected graph is a connected subgraph that contains all n vertices of the graph but only the n-1 edges needed to form a tree. A spanning tree with n vertices has exactly n-1 edges; adding one more edge to it must create a cycle. Among all spanning trees of a connected weighted graph, the one whose total edge cost is smallest is called the minimum spanning tree, where the total cost is the sum of the weights of all its edges.

Graph creation

Graph problems generally do not hand you a ready-made graph data structure. Once you recognize that a problem is a graph problem, the first step in solving it is usually to build the graph.

The above are all about the logical structure of graphs, so how are graphs stored in the computer?

We know that a graph is composed of points and edges. In theory, we only need to store all the edge relationships in the graph, because the edge already contains the relationship between two points.

Here I briefly introduce two common ways to build a graph: the adjacency matrix (commonly used, important) and the adjacency list.

Adjacency Matrix (common)

The first way is to use an array or hash table to store the graph, here we use a two-dimensional array to store it.

Use an n * n matrix graph to describe the graph, where graph[i][j] describes the edge between vertex i and vertex j.

Generally, for unweighted graphs I use graph[i][j] = 1 to indicate that there is an edge between vertex i and vertex j, directed from i to j, and graph[i][j] = 0 to indicate that there is no such edge. For a weighted graph, we can store other numbers, which represent the weights.

It can be seen that the matrix above is symmetric about the diagonal, so we only need to look at half of it; the other half is wasted space.

The space complexity of this storage method is O(n^2), where n is the number of vertices. If the graph is sparse (the number of edges is much smaller than n^2), a lot of space is wasted. And if the graph is undirected, about half of the space is always redundant. The figure below also illustrates this point.

The main advantages of the adjacency matrix are:

Intuitive and simple.
Checking whether two vertices are directly connected takes O(1) time. Reading or updating in-degrees and out-degrees is also O(1), provided the degrees are kept in separate counters rather than recomputed from the matrix.

Because it is relatively simple to use, nearly all of my solutions that need to build a graph use this method.

For example, LeetCode 743. Network Delay Time. Problem description:

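There are N network nodes, labeled 1 to N.

You are given a list times representing the travel times of signals along directed edges. times[i] = (u, v, w), where u is the source node, v is the target node, and w is the time it takes a signal to travel from the source node to the target node.

Now we send a signal from some node K. How long does it take for all nodes to receive the signal? If it is impossible for all nodes to receive the signal, return -1.


Example:

Input: times = [[2,1,1],[2,3,1],[3,4,1]], N = 4, K = 2
Output: 2


Note:

N is in the range [1, 100].
K is in the range [1, N].
The length of times is in the range [1, 6000].
All edges times[i] = (u, v, w) satisfy 1 <= u, v <= N and 0 <= w <= 100.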

This is a typical graph problem. For this problem, how do we use the adjacency matrix to build a graph?

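Typical graph-building code:

Use a hash table to build the graph (strictly speaking, a hash table of neighbor lists is closer to an adjacency list, but it is used in the same way here):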

    import collections

    graph = collections.defaultdict(list)
    for fr, to, w in times:
        graph[fr - 1].append((to - 1, w))

Use a two-dimensional array to construct an adjacency matrix:

graph = [[0] * n for _ in range(n)]  # create an n * n matrix; 0 stands for "no edge" here

for fr, to, w in times:
    graph[fr-1][to-1] = w

This constructs an adjacency matrix, and we can then traverse the graph based on it.

Adjacency List

For each point, a linked list is stored to point to all points directly connected to the point. For a weighted graph, the value of the element in the linked list corresponds to the weight.

For example, in an undirected, unweighted graph:

graph-1
(Picture from https://zhuanlan.zhihu.com/p/25498681)

It can be seen that in an undirected graph the adjacency matrix is symmetric about the diagonal, and every edge appears twice in the adjacency list, once for each endpoint.

And in a directed, unweighted graph:

graph-2

(Picture from https://zhuanlan.zhihu.com/p/25498681)

Since the adjacency list is a little troublesome to use, it is also not commonly used. In order to reduce the cognitive burden of beginners, I will not post the code.

Graph traversal

Once the graph is built, the next step is to traverse it.

Whatever algorithm you use, you will need to traverse the graph. Generally there are two methods: depth-first search and breadth-first search (other unusual traversal methods have little practical value and are not worth learning).

Whatever the traversal, if the graph has cycles you must record which nodes have been visited to prevent infinite loops. Of course, you do not always need an actual set for this: for example, you can mark nodes in place with a value outside the data range, in which case the extra space for the visited record is $O(1)$.
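To make the in-place marking idea concrete, here is a minimal sketch on a grid whose cells are 0 or 1. It marks visited cells with 2, a value outside the data range, instead of keeping a separate visited set. The function name `count_islands` and the grid encoding are assumptions for this example; note that the traversal stack itself still takes space.

```python
def count_islands(grid):
    """Count groups of 4-connected 1s, marking visited cells in place with 2."""
    rows = len(grid)
    cols = len(grid[0]) if grid else 0
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue  # water, or land that was already visited
            count += 1
            grid[r][c] = 2  # in-place "visited" mark
            stack = [(r, c)]
            while stack:
                x, y = stack.pop()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < rows and 0 <= ny < cols and grid[nx][ny] == 1:
                        grid[nx][ny] = 2
                        stack.append((nx, ny))
    return count
```

The trade-off is that the input is modified; if the caller still needs the original grid, the marks have to be undone afterwards or a visited set used instead.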

Here a directed graph is taken as the example; undirected graphs are similar, so I won't repeat the discussion.

Graph search is covered in detail in the later search topics, so I will only touch on it here.

Depth First Search (DFS)

Depth-first traversal of a graph starts from some vertex v and keeps visiting neighbors, then the neighbors of those neighbors, until every reachable vertex has been visited.

As shown above, if we use DFS starting from node A, one possible visiting order is A -> C -> B -> D -> F -> G -> E. Another is A -> D -> C -> B -> F -> G -> E, and so on, depending on your code, but they are all depth-first.
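A minimal recursive DFS sketch. The graph here is a small hypothetical one (it is not the graph from the figure), stored as a dict from each node to its list of neighbors, and the function name `dfs` is an assumption of this example.

```python
def dfs(graph, start):
    """Return vertices in the order a recursive DFS first visits them."""
    visited, order = set(), []

    def visit(u):
        visited.add(u)
        order.append(u)
        for v in graph.get(u, []):
            if v not in visited:  # the visited set prevents looping on cycles
                visit(v)

    visit(start)
    return order


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"]}
print(dfs(graph, "A"))  # ['A', 'B', 'D', 'C']
```

Changing the order of the neighbor lists changes the visiting order, which is why several orders are all valid depth-first traversals.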

Breadth First Search [Breadth First Search, BFS]

Breadth-first search can be vividly described as "taking a shallow taste of everything before going deeper". It needs a queue to maintain the order of the traversed vertices, so that their adjacent vertices can be visited in dequeue order.

As shown above, if we use BFS starting from node A, one possible visiting order is A -> B -> C -> F -> E -> G -> D. Another is A -> B -> F -> E -> C -> G -> D, and so on, depending on your code, but they are all breadth-first.
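A minimal BFS sketch using a queue. The graph is a small hypothetical one (not the graph from the figure), stored as a dict from each node to its list of neighbors, and the function name `bfs` is an assumption of this example.

```python
from collections import deque


def bfs(graph, start):
    """Return vertices in the order BFS dequeues them."""
    visited, order = {start}, []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph.get(u, []):
            if v not in visited:  # mark on enqueue so no node is queued twice
                visited.add(v)
                queue.append(v)
    return order


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"]}
print(bfs(graph, "A"))  # ['A', 'B', 'C', 'D']
```

Nodes come out layer by layer: first the start node, then its direct neighbors, then their neighbors.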

Note that DFS and BFS are algorithmic ideas rather than specific algorithms, so they are highly adaptable and not tied to a particular data structure. They can be used on the graphs described in this article and on the trees described earlier; in fact, they can be used on any non-linear data structure.

Common algorithms

Graph problems lend themselves well to algorithm templates.

Here are some common template problems:

Dijkstra
Floyd-Warshall
Minimum Spanning Tree (Kruskal & Prim)
A* pathfinding algorithm
Bipartite graph (coloring method)
Topological Sort

Listed below are templates of common algorithms.

All the following templates are based on the adjacency matrix.

It is strongly recommended that you study the classic algorithms below only after finishing the search topic. Try a few ordinary search problems first, and move on once you can solve them. Recommended problem: Maximize the value of a path in a graph

Shortest distance

Dijkstra algorithm

The basic idea of Dijkstra is breadth-first traversal. In fact, most shortest-path algorithms are breadth-first at heart; they differ only in their expansion strategy.

Dijkstra's algorithm mainly solves for the shortest distance from one point in the graph to any other point, i.e., the single-source shortest path problem.

The name Dijkstra is hard to remember, so you can simply think of it as the DJ algorithm. Easy to remember, isn't it?

For example, give you a few cities and the distance between them. Let you plan the shortest route from city a to city b.

For this problem, we can first establish the distance between cities with a graph, and then use dijkstra to do it. So how does dijkstra calculate the shortest path?

The basic idea of the dj algorithm is greedy. Starting from the starting point, it repeatedly takes the unvisited point with the smallest known distance and expands to all of its neighbors. This is essentially a breadth-first traversal. Here we use a heap so that the point with the smallest cost can be found in $O(\log N)$ time.

If a normal queue is used, it is actually a special case where all edge weights in the graph are the same.

For example, we find the shortest distance from point start to point end. We expect the dj algorithm to be used in this way.

For example, a picture is like this:

E -- 1 --> B -- 1 --> C -- 1 --> D -- 1 --> F
 \                                         /\
  \                                        ||
    -------- 2 ---------> G ------- 1 ------

We can build this graph as follows (a hash table mapping each node to its list of [neighbor, weight] pairs):

G = {
    "B": [["C", 1]],
    "C": [["D", 1]],
    "D": [["F", 1]],
    "E": [["B", 1], ["G", 2]],
    "F": [],
    "G": [["F", 1]],
}

shortDistance = dijkstra(G, "E", "C")
print(shortDistance)  # E -- 1 --> B -- 1 --> C == 2

Specific algorithm:

1. Initialize the heap. Every entry in the heap is a 2-tuple (cost, v), meaning "the distance from start to v is cost". So initially the heap holds the tuple (0, start).
2. Pop (cost, v) from the heap; the first pop is always (0, start). If v has already been visited, skip it to avoid going around cycles.
3. If v is the end point we are looking for, return cost directly; at this moment cost is the shortest distance from start to that point.
4. Otherwise, push the neighbors of v onto the heap, i.e., push (cost + c, neibor), where neibor is a neighbor of v and c is the distance from v to neibor (the cost of the move).
5. Repeat steps 2-4.

Code template:

Python

import heapq


def dijkstra(graph, start, end):
    # Every entry in the heap is a 2-tuple (cost, i), meaning "the distance from start to i is cost".
    heap = [(0, start)]
    visited = set()
    while heap:
        (cost, u) = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            return cost
        for v, c in graph[u]:
            if v in visited:
                continue
            next_cost = cost + c  # avoid shadowing the built-in next()
            heapq.heappush(heap, (next_cost, v))
    return -1

JavaScript

const dijkstra = (graph, start, end) => {
  const visited = new Set()
  const minHeap = new MinPriorityQueue();
  // Note: new MinPriorityQueue() is a built-in API of the LeetCode environment. Its enqueue takes two parts:
  // element and priority.
  // The heap is ordered by priority; element can carry any payload.
  minHeap.enqueue(start, 0)

  while(!minHeap.isEmpty()){
    const {element, priority} = minHeap.dequeue();
    // The two variables below are not required; they are only here for readability
    const curPoint = element;
    const curCost = priority;

    if(curPoint === end) return curCost;
    if(visited.has(curPoint)) continue;
    visited.add(curPoint);

    if(!graph[curPoint]) continue;
    for(const [nextPoint, nextCost] of graph[curPoint]){
      if(visited.has(nextPoint)) continue;
      // Note that the heap always holds distances from start to some point;
      // nextCost is the distance from curPoint to nextPoint, and curPoint is not necessarily start.
      const accumulatedCost = nextCost + curCost;
      minHeap.enqueue(nextPoint, accumulatedCost);
    }
  }
  return -1
}

Once you know this algorithm template, you can go and solve (AC) 743. Network Delay Time.

Here is the complete code for your reference:

Python

import collections
import heapq
from typing import List


class Solution:
    def dijkstra(self, graph, start, end):
        heap = [(0, start)]
        visited = set()
        while heap:
            (cost, u) = heapq.heappop(heap)
            if u in visited:
                continue
            visited.add(u)
            if u == end:
                return cost
            for v, c in graph[u]:
                if v in visited:
                    continue
                next_cost = cost + c  # avoid shadowing the built-in next()
                heapq.heappush(heap, (next_cost, v))
        return -1
    def networkDelayTime(self, times: List[List[int]], N: int, K: int) -> int:
        graph = collections.defaultdict(list)
        for fr, to, w in times:
            graph[fr - 1].append((to - 1, w))
        ans = -1
        for to in range(N):
            dist = self.dijkstra(graph, K - 1, to)
            if dist == -1: return -1
            ans = max(ans, dist)
        return ans

JavaScript

const networkDelayTime = (times, n, k) => {
    // Ahem, this is not the best way to apply Dijkstra to this problem
    const graph = {};
    for(const [from, to, weight] of times){
        if(!graph[from]) graph[from] = [];
        graph[from].push([to, weight]);
    }

    let ans = -1;
    for(let to = 1; to <= n; to++){
        let dist = dijkstra(graph, k, to)
        if(dist === -1) return -1;
        ans = Math.max(ans, dist);
    }
    return ans;
};

const dijkstra = (graph, startPoint, endPoint) => {
  const visited = new Set()
  const minHeap = new MinPriorityQueue();
  // Note: new MinPriorityQueue() is a built-in API of the LeetCode environment. Its enqueue takes two parts:
  // element and priority.
  // The heap is ordered by priority; element can carry any payload.
  minHeap.enqueue(startPoint, 0)

  while(!minHeap.isEmpty()){
    const {element, priority} = minHeap.dequeue();
    // The two variables below are not required; they are only here for readability
    const curPoint = element;
    const curCost = priority;
    if(visited.has(curPoint)) continue;
    visited.add(curPoint)
    if(curPoint === endPoint) return curCost;

    if(!graph[curPoint]) continue;
    for(const [nextPoint, nextCost] of graph[curPoint]){
      if(visited.has(nextPoint)) continue;
      // Note that the heap always holds distances from startPoint to some point;
      // nextCost is the distance from curPoint to nextPoint, and curPoint is not necessarily startPoint.
      const accumulatedCost = nextCost + curCost;
      minHeap.enqueue(nextPoint, accumulatedCost);
    }
  }
  return -1
}

The heap-based implementation above runs in $O(e \log e)$ time, where e is the number of edges in the graph, because each edge can push at most one entry onto the heap. (The often-quoted bound $O(v \log v + e)$, with v the number of points, requires a more sophisticated heap such as a Fibonacci heap.)

Finally, I leave you a question: if we want to calculate the distance from one point to all other points, how should our algorithm be adjusted?

Tip: you can use a dist hash table to record the shortest distance from the starting point to each point. Once you figure it out, you can use problem 882 to verify it~
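One possible answer to the question, sketched under the assumption that the graph uses the same node-to-list-of-(neighbor, weight) format as the template above. The function name `dijkstra_all` is made up for this illustration. Instead of stopping at an end point, it records the cost of every node the first time that node is popped.

```python
import heapq


def dijkstra_all(graph, start):
    """Return {node: shortest distance from start} for every reachable node."""
    dist = {}
    heap = [(0, start)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u in dist:
            continue  # u was already settled with a cost that is no larger
        dist[u] = cost
        for v, c in graph.get(u, []):
            if v not in dist:
                heapq.heappush(heap, (cost + c, v))
    return dist


G = {
    "B": [["C", 1]],
    "C": [["D", 1]],
    "D": [["F", 1]],
    "E": [["B", 1], ["G", 2]],
    "F": [],
    "G": [["F", 1]],
}
print(dijkstra_all(G, "E"))
```

With such a table, a problem like 743 needs only one run: if some node is missing from `dist` the answer is -1, otherwise it is the largest value in `dist`.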

It is worth noting that Dijkstra cannot handle the situation where the edge weight is negative. That is, if there are edges with negative weights, the answer may be incorrect. The shortest path based on the dynamic programming algorithm (described below) can handle this situation.

Floyd-Warshall algorithm

Floyd-Warshall can find the shortest distance between any two points, i.e., the multi-source shortest path problem, which is different from dj.

In addition, the Bellman-Ford algorithm is also a classic dynamic programming algorithm for solving the shortest path. This is also different from dj, which is based on greed.

Compared with the dijkstra algorithm above, its calculation process saves intermediate results to avoid repeated computation, so it is particularly suitable for finding the distance between any two points in the graph, as in LeetCode 1462. Course Schedule IV. Beyond this advantage, the biggest difference from the Bellman-Ford algorithm described below is that this algorithm solves multi-source shortest paths, while Bellman-Ford solves single-source shortest paths. In terms of both complexity and ease of writing, Bellman-Ford is simpler; we will introduce it later.

Of course, this does not mean that Bellman-Ford and the dijkstra algorithm above cannot handle multi-source shortest paths; you only need to add a for loop that enumerates every starting point.

Another very important point is that the Floyd-Warshall algorithm uses dynamic programming rather than a greedy strategy, so it can handle negative weights (as long as the graph has no negative cycle). This deserves special attention. For details of dynamic programming, please refer to the later topics on dynamic programming and the knapsack problem.

The algorithm is not difficult to understand. Simply put: the shortest path from i to j = the minimum, over intermediate points k, of (the shortest path from i to k + the shortest path from k to j). As shown below:

The shortest distance from u to v is the shortest distance from u to x plus the shortest distance from x to v. In the figure above, x is the only way to get from u to v. If it were not, we would need to compute the value for each candidate intermediate node and take the smallest one.

The correctness of the algorithm is self-evident: to get from i to j we either go directly or pass through some other point k in the graph. There may be more than one candidate intermediate node k, and taking the smallest value over them naturally gives the shortest distance from i to j.
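Written as a recurrence, the idea looks roughly like this. The notation is an assumption of this sketch: $d_k(i, j)$ is the shortest distance from i to j using only the first k vertices as intermediate points, and $w(i, j)$ is the weight of the edge from i to j.

```latex
d_0(i, j) =
\begin{cases}
0 & \text{if } i = j \\
w(i, j) & \text{if there is an edge from } i \text{ to } j \\
\infty & \text{otherwise}
\end{cases}
\qquad
d_k(i, j) = \min\bigl(d_{k-1}(i, j),\; d_{k-1}(i, k) + d_{k-1}(k, j)\bigr)
```

The first argument of the min is "do not pass through k" and the second is "pass through k", which matches the two cases in the argument above. After all n vertices have been allowed as intermediate points, $d_n(i, j)$ is the shortest distance from i to j.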

Question: can the longest acyclic path be solved with dynamic programming?

The time complexity of this algorithm is $O(N^3)$, and the space complexity is $O(N^2)$, where N is the number of vertices.

Code template:

Python

# graph is an adjacency matrix and n is the number of vertices
# graph has the form: graph[u][v] = w

def floyd_warshall(graph, n):
    dist = [[float("inf") for _ in range(n)] for _ in range(n)]

    for i in range(n):
        for j in range(n):
            dist[i][j] = graph[i][j]

    # check vertex k against all other vertices (i, j)
    for k in range(n):
        # looping through rows of graph array
        for i in range(n):
            # looping through columns of graph array
            for j in range(n):
                if (
                    dist[i][k] != float("inf")
                    and dist[k][j] != float("inf")
                    and dist[i][k] + dist[k][j] < dist[i][j]
                ):
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

JavaScript

const floydWarshall = (graph, v)=>{
  const dist = new Array(v).fill(0).map(() => new Array(v).fill(Number.MAX_SAFE_INTEGER))

  for(let i = 0; i < v; i++){
    for(let j = 0; j < v; j++){
      // same point: the distance is 0
      if(i === j) dist[i][j] = 0;
      // the distance between i and j is known
      else if(graph[i][j]) dist[i][j] = graph[i][j];
      // the distance between i and j is unknown: default to the maximum value
      else dist[i][j] = Number.MAX_SAFE_INTEGER;
    }
  }

  // check whether some point k makes the distance between i and j shorter; if so, update the shortest distance
  for(let k = 0; k < v; k++){
    for(let i = 0; i < v; i++){
      for(let j = 0; j < v; j++){
        dist[i][j] = Math.min(dist[i][j], dist[i][k] + dist[k][j])
      }
    }
  }
  return dist // return whatever you need
}

Let's go back and see how to apply the template to LeetCode 1462. Course Schedule IV. Problem description:

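You need to take n courses in total, numbered from 0 to n-1.

Some courses have direct prerequisites. For example, if you must take course 1 before taking course 0, the prerequisite is given as the pair [1,0].

You are given the total number of courses n, a list of direct prerequisite pairs prerequisite, and a list of query pairs queries.

For each query pair queries[i], determine whether queries[i][0] is a prerequisite of queries[i][1].

Return a list of booleans, where each element is the answer to the corresponding query pair in queries.

Note: if course a is a prerequisite of course b and course b is a prerequisite of course c, then course a is also a prerequisite of course c.

Example 1:

Input: n = 2, prerequisites = [[1,0]], queries = [[0,1],[1,0]]
Output: [false,true]
Explanation: course 0 is not a prerequisite of course 1, but course 1 is a prerequisite of course 0.

Example 2:

Input: n = 2, prerequisites = [], queries = [[1,0],[0,1]]
Output: [false,false]
Explanation: there are no prerequisite pairs, so the courses are independent of each other.

Example 3:

Input: n = 3, prerequisites = [[1,2],[1,0],[2,0]], queries = [[1,0],[1,2]]
Output: [true,true]

Example 4:

Input: n = 3, prerequisites = [[1,0],[2,0]], queries = [[0,1],[2,0]]
Output: [false,true]

Example 5:

Input: n = 5, prerequisites = [[0,1],[1,2],[2,3],[3,4]], queries = [[0,4],[4,0],[1,3],[3,0]]
Output: [true,false,true,false]

Constraints:

2 <= n <= 100
0 <= prerequisite.length <= (n * (n - 1) / 2)
0 <= prerequisite[i][0], prerequisite[i][1] < n
prerequisite[i][0] != prerequisite[i][1]
The prerequisite graph has no cycles.
The prerequisite graph has no repeated edges.
1 <= queries.length <= 10^4
queries[i][0] != queries[i][1]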

This problem can also be solved with Floyd-Warshall. Think of it this way: if there is a path from i to j (the distance from i to j is greater than 0), isn't one of them a prerequisite of the other? The number of queries in this problem is about 10^4, so running the dijkstra algorithm above for every query will definitely time out, which makes Floyd-Warshall a wise choice.

I'll apply the template directly here with a small change. Complete code:
Python

from typing import List


class Solution:
    def floyd_warshall(self, dist, v):
        # dist[i][j] is True when j is reachable from i (transitive closure)
        for k in range(v):
            for i in range(v):
                for j in range(v):
                    dist[i][j] = dist[i][j] or (dist[i][k] and dist[k][j])

        return dist

    def checkIfPrerequisite(self, n: int, prerequisites: List[List[int]], queries: List[List[int]]) -> List[bool]:
        graph = [[False] * n for _ in range(n)]
        ans = []

        # each pair is [prerequisite, course]; the edge is stored as graph[course][prerequisite]
        for to, fr in prerequisites:
            graph[fr][to] = True
        dist = self.floyd_warshall(graph, n)
        for to, fr in queries:
            ans.append(bool(dist[fr][to]))
        return ans

JavaScript

// Ahem, this way of writing it is not optimal for this problem
var checkIfPrerequisite = function(numCourses, prerequisites, queries) {
    const graph = {}
    for(const [course, pre] of prerequisites){
        if(!graph[pre]) graph[pre] = {}
        graph[pre][course] = true
    }

    const ans = []

    const dist = floydWarshall(graph, numCourses)
    for(const [course, pre] of queries){
        ans.push(dist[pre][course])
    }

    return ans
};

// a hyphen is not valid in a JavaScript identifier, so the function is named floydWarshall
var floydWarshall = function(graph, n){
    const dist = Array.from({length: n + 1}).map(() => Array.from({length: n + 1}).fill(false))
    for(let k = 0; k < n; k++){
        for(let i = 0; i < n; i++){
            for(let j = 0; j < n; j++){
                if(graph[i] && graph[i][j]) dist[i][j] = true
                if(graph[i] && graph[k]){
                    dist[i][j] = (dist[i][j])|| (dist[i][k] && dist[k][j])
                }else if(graph[i]){
                    dist[i][j] = dist[i][j]
                }
            }
        }
    }
    return dist
}

If you can solve this problem, I recommend another one to you: 1617. Count Subtrees With Max Distance Between Cities. There is a solution on the international site whose code is very clear, although its performance is not very good: https://leetcode.com/problems/count-subtrees-with-max-distance-between-cities/discuss/1136596/Python-Floyd-Warshall-and-check-all-subtrees

You can also use this topic to practice the dynamic programming algorithm on the graph.

787. Cheapest Flights Within K Stops

Bellman-Ford Algorithm

Similar to the above algorithm. This solution mainly solves the shortest path of a single source, that is, the shortest distance from a certain point to other points in the graph.

The basic idea is also dynamic programming.

The core algorithm is:

Initialize the distance of the starting point to 0.
Process (relax) every edge in the graph repeatedly until the distances stabilize. The rule is: for each directed edge (u, v) with weight w, if dist[u] + w is less than dist[v], we have found a shorter way to reach v, so update dist[v].
The number of passes needed is bounded by the number of vertices V, so it is simplest to just run that many passes.
Finally, check whether there is a cycle caused by negative edges. (Take note of this step.)

For example, the following graph contains a cycle B -> C -> D -> B whose total weight is negative, so in theory the distance from B to C and D can be made arbitrarily small. We need to detect this situation and exit.
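To see the detection step in action, here is a small sketch. The edge weights are invented for illustration (the figure's actual weights are not reproduced here), vertices are numbered 0..n-1 with A=0, B=1, C=2, D=3, and `has_negative_cycle` is a hypothetical helper name.

```python
import math


def has_negative_cycle(n, edges, s):
    """Bellman-Ford: True if a negative cycle is reachable from s."""
    dist = [math.inf] * n
    dist[s] = 0
    # n - 1 passes are enough to settle all shortest paths when no negative cycle exists
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # one extra pass: any further improvement can only come from a negative cycle
    return any(dist[u] + w < dist[v] for u, v, w in edges)


# A -> B -> C -> D -> B, where the cycle B -> C -> D -> B sums to 1 + 1 - 3 = -1
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 1, -3)]
print(has_negative_cycle(4, edges, 0))  # True
```

If the last edge had weight 3 instead of -3, the extra pass would find nothing left to improve and the function would return False.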

The time complexity of this algorithm: $O(V*E)$, and the space complexity: $O(V)$.

Code example:
Python

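import math
from collections import defaultdict


# return -1 if a negative cycle is reachable from s (no shortest path exists)
# else return dis map where dis[v] means for point s the least cost to point v
def bell_man(edges, s):
    # n is the number of vertices, inferred here from the edge list
    n = len({s} | {x for u, v, _ in edges for x in (u, v)})
    dis = defaultdict(lambda: math.inf)
    dis[s] = 0
    for _ in range(n):
        for u, v, w in edges:
            if dis[u] + w < dis[v]:
                dis[v] = dis[u] + w

    for u, v, w in edges:
        if dis[u] + w < dis[v]:
            return -1

    return dis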

JavaScript

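const BellmanFord = (edges, startPoint)=>{
  // Assumes vertices are integers 0..n-1; n is inferred from the edge list
  let n = startPoint + 1;
  for(const [u, v] of edges) n = Math.max(n, u + 1, v + 1);
  // Infinity (rather than a large finite number) keeps unreachable nodes unreachable
  const dist = new Array(n).fill(Infinity);
  dist[startPoint] = 0;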

  for(let i = 0; i < n; i++){
    for(const [u, v, w] of edges){
        if(dist[u] + w < dist[v]){
            dist[v] = dist[u] + w;
        }
    }
  }

  for(const [u, v, w] of edges){
    if(dist[u] + w < dist[v]) return -1;
  }

  return dist
}

Recommended reading:

bellman-ford-algorithm

Topic recommendation:

Best Currency Path

Topological sort

In computer science, a topological sort of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering. A topological sort is possible if and only if the graph has no directed cycle (that is, it is a directed acyclic graph).

A typical problem gives you a set of courses with prerequisite relationships among them and asks you for a feasible study order in which prerequisites are always taken first. Any directed acyclic graph has at least one topological ordering, and known algorithms can construct a topological ordering of any directed acyclic graph in linear time.

Kahn's algorithm

Put simply, let L be the result list. First find the nodes with in-degree zero and put them into L, because these nodes have no parent nodes. Then remove the edges connected to these nodes from the graph and look again for nodes with in-degree zero. The parents of these newly found nodes are already in L, so they can be appended to L as well. Repeat until no node with in-degree zero can be found. If the number of elements in L then equals the total number of nodes, the sort is complete; if not, the original graph contains a cycle and cannot be topologically sorted.

import collections


def topologicalSort(graph):
    """
    Kahn's Algorithm is used to find Topological ordering of Directed Acyclic Graph
    using BFS
    """
    indegree = [0] * len(graph)
    queue = collections.deque()
    topo = []
    cnt = 0

    for key, values in graph.items():
        for i in values:
            indegree[i] += 1

    for i in range(len(indegree)):
        if indegree[i] == 0:
            queue.append(i)

    while queue:
        vertex = queue.popleft()
        cnt += 1
        topo.append(vertex)
        for x in graph[vertex]:
            indegree[x] -= 1
            if indegree[x] == 0:
                queue.append(x)

    if cnt != len(graph):
        print("Cycle exists")
    else:
        print(topo)


# Adjacency List of Graph
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4, 5], 4: [], 5: []}
topologicalSort(graph)

Minimum spanning tree

First, let's look at what a spanning tree is.

First of all, a spanning tree is a subgraph of the original graph, and it is essentially a tree; that is why it is called a spanning tree rather than a spanning graph. Secondly, a spanning tree must include all the vertices of the graph. The following figure does not contain all vertices (in other words, the vertices are not all in the same connected component), so it is not a spanning tree.

The yellow vertices are not included

You can think of a spanning tree as a multi-way tree whose root node is not fixed. Since it is a tree, it cannot contain cycles. The following figure is therefore not a spanning tree.

Therefore, it is not difficult to conclude that the number of edges of a spanning tree (and hence of a minimum spanning tree) is n-1, where n is the number of vertices.

Next, let's look at what a minimum spanning tree is.

A minimum spanning tree adds the keyword "minimum" to spanning tree; it is short for minimum-weight spanning tree. This also tells us that minimum spanning trees are about weighted graphs. The weight of a spanning tree is the sum of the weights of all its edges, and a minimum spanning tree is a spanning tree whose weight sum is smallest. It follows that neither the spanning tree nor the minimum spanning tree is necessarily unique.

Minimum spanning trees are very useful in real life. For example, suppose I want to build a subway covering n stations, and these n stations must all be reachable from one another (in the same connected component). How should it be built to minimize the cost? Since the route between each pair of stations is different, the costs differ too, so this is a real-world use case for the minimum spanning tree. There are many similar examples.

(The picture comes from Wikipedia)

It is not difficult to see that computing the minimum spanning tree means selecting n-1 edges from the edge set so that they form a spanning tree with the smallest possible weight sum.
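To make that definition concrete, here is a brute-force sketch that literally tries every choice of n-1 edges. It is only practical for tiny graphs and is not how the algorithms in this section work. The function names and the (cost, u, v) edge format with vertices 0..n-1 are assumptions made for this example.

```python
from itertools import combinations


def connects_all(n, chosen):
    """True if the chosen (cost, u, v) edges connect vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, u, v in chosen:
        parent[find(u)] = find(v)
    return len({find(x) for x in range(n)}) == 1


def brute_force_mst(n, edges):
    """Try every subset of n - 1 edges and keep the cheapest one that spans."""
    best = None
    for chosen in combinations(edges, n - 1):
        # n - 1 edges that connect all n vertices necessarily form a tree.
        if connects_all(n, chosen):
            total = sum(cost for cost, _, _ in chosen)
            if best is None or total < best:
                best = total
    return best  # None when the graph is not connected


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
print(brute_force_mst(4, edges))  # 6
```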

Kruskal and Prim are two classic algorithms for finding the minimum spanning tree. How do these two algorithms calculate the minimum spanning tree? In this section we will take a look at them.

Kruskal

Kruskal is relatively easy to understand and it is recommended to master it.

The Kruskal algorithm is also vividly called the edge-adding method. At each step, it selects the edge with the smallest weight and adds it to the result set. To avoid creating a cycle (adding a cycle is pointless: as long as the weights are positive, it can only make the result worse), we need to check whether the two endpoints of the currently selected edge are already connected through the edges selected so far. If they are, there is no need to select the edge, because doing so would create a cycle. Algorithmically, we can use union-find (a disjoint-set data structure) to help with this check. Union-find is explained in the advanced chapter.

The find method in the code below is the core of union-find.

Kruskal's specific algorithm:

1. Sort the edges by weight in ascending order.
2. Initialize the n vertices as n separate connected components.
3. In ascending order of weight, pick the smallest remaining edge. If its two endpoints are already connected through previously selected edges (so adding it would create a cycle), discard it; otherwise add it to the result set.
4. Repeat step 3 until all n vertices are in one connected component.

Code template:

Where edges is an array whose items have the form (cost, fr, to), meaning that there is an edge with a weight of cost between fr and to.

class DisjointSetUnion:
    def __init__(self, n):
        self.n = n
        self.rank = [1] * n
        self.f = list(range(n))
    
    def find(self, x: int) -> int:
        if self.f[x] == x:
            return x
        self.f[x] = self.find(self.f[x])
        return self.f[x]
    
    def unionSet(self, x: int, y: int) -> bool:
        fx, fy = self.find(x), self.find(y)
        if fx == fy:
            return False

        if self.rank[fx] < self.rank[fy]:
            fx, fy = fy, fx
        
        self.rank[fx] += self.rank[fy]
        self.f[fy] = fx
        return True

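class Solution:
    def Kruskal(self, n: int, edges) -> int:
        # n: number of vertices (labelled 0..n-1); edges: list of (cost, fr, to)
        dsu = DisjointSetUnion(n)

        # Process edges from the smallest weight to the largest
        edges.sort()

        # ret: total weight so far; num: vertices already connected into the tree
        ret, num = 0, 1
        for length, x, y in edges:
            # unionSet returns False when x and y are already connected (cycle)
            if dsu.unionSet(x, y):
                ret += length
                num += 1
                if num == n:
                    break

        return ret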
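The template above returns only the total weight. If you also want to know which edges were chosen, a variant along the following lines may help. This is a self-contained sketch, not the author's code: the function name kruskal_edges, the explicit n parameter, and the inlined find helper are assumptions for illustration.

```python
def kruskal_edges(n, edges):
    """Return (total_weight, chosen_edges) for vertices 0..n-1.

    edges holds (cost, fr, to) tuples, as in the template above.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, chosen = 0, []
    for cost, fr, to in sorted(edges):
        root_fr, root_to = find(fr), find(to)
        if root_fr == root_to:
            continue  # same component: this edge would close a cycle
        parent[root_fr] = root_to
        total += cost
        chosen.append((cost, fr, to))
        if len(chosen) == n - 1:
            break
    return total, chosen


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
print(kruskal_edges(4, edges))
# (6, [(1, 0, 1), (2, 1, 2), (3, 2, 3)])
```

If fewer than n-1 edges come back, the input graph was not connected.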

Prim

Prim's algorithm is also vividly called the vertex-adding method. At each step, it selects the vertex that can be reached from the current tree by the smallest-weight edge and adds it to the result set. It looks like a real-world tree that keeps growing.

Prim's specific algorithm:

1. Initialize the minimum spanning tree vertex set MV with any vertex of the graph, and the minimum spanning tree edge set ME as empty. Our goal is to grow MV until it equals V; the edge set is built up automatically as MV grows.
2. From the set E (the edge set of the original graph), select the smallest edge <u, v> where u is already in MV and v is not yet in MV (like a growing real-world tree), add v to MV, and add <u, v> to ME.
3. Repeat step 2 until MV contains all n vertices.

Code template:

Where dist is a two-dimensional array, and dist[i][j] = x means that there is an edge with weight x from vertex i to vertex j.

class Solution:
    def Prim(self, dist) -> int:
        n = len(dist)
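        d = [float("inf")] * n  # smallest edge weight from each vertex to the vertices already in the MST
        vis = [False] * n  # whether the vertex has already been added to the MST
        d[0] = 0
        ans = 0
        for _ in range(n):
            # Find the unvisited vertex with the smallest d in this round
            M = float("inf")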
            for i in range(n):
                if not vis[i] and d[i] < M:
                    node = i
                    M = d[i]
            vis[node] = True
            ans += M
            for i in range(n):
                if not vis[i]:
                    d[i] = min(d[i], dist[i][node])
        return ans

Comparison of two algorithms

For convenience, let V be the number of vertices in the graph and E the number of edges. Kruskal's algorithm runs in $O(E\log E)$ time, and Prim's algorithm runs in $O(E + V\log V)$ time when implemented with a Fibonacci heap (the array-based template above takes $O(V^2)$). Therefore Prim is suitable for dense graphs, while Kruskal is suitable for sparse graphs.
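For sparse graphs, Prim is usually written with a priority queue rather than a full scan of the array. The sketch below uses Python's heapq (a binary heap), which gives roughly $O(E\log V)$; the $O(E + V\log V)$ bound needs a Fibonacci heap, which the standard library does not provide. The adjacency-list input format and the name prim_heap are assumptions made for this example.

```python
import heapq


def prim_heap(n, adj):
    """adj[u] is a list of (weight, v) pairs for an undirected graph on 0..n-1."""
    visited = [False] * n
    heap = [(0, 0)]  # (weight of the edge used to reach the vertex, vertex)
    total = added = 0
    while heap and added < n:
        w, u = heapq.heappop(heap)
        if visited[u]:
            continue  # stale entry: u was already reached by a cheaper edge
        visited[u] = True
        total += w
        added += 1
        for weight, v in adj[u]:
            if not visited[v]:
                heapq.heappush(heap, (weight, v))
    return total if added == n else None  # None: graph is not connected


adj = [
    [(1, 1), (4, 2)],
    [(1, 0), (2, 2), (5, 3)],
    [(4, 0), (2, 1), (3, 3)],
    [(5, 1), (3, 2)],
]
print(prim_heap(4, adj))  # 6
```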

You can also refer to the Wikipedia article on minimum spanning trees as a supplement.

In addition, here is a video with well-made animations that you can use as a reference: https://www.bilibili.com/video/BV1Eb41177d1/

You can use the LeetCode problem Min Cost to Connect All Points to practice the algorithm.

Other algorithms

A star pathfinding algorithm

A star pathfinding finds the shortest distance or shortest path between two points in a two-dimensional grid. It is a heuristic algorithm commonly used to compute NPC movement in games. Such problems usually involve obstacles, and LeetCode problems add further restrictions that make them harder.

These problems are generally rated hard. The idea is not difficult to understand, but writing a complete, bug-free solution is not so easy.

In this algorithm, we start from the starting point, check the four adjacent cells, and try to expand until we find the target. There is more than one way to implement A star pathfinding; you can look into the variants if you are interested.

The formula is expressed as: f(n)=g(n)+h(n).

where:

f(n) is the estimated cost from the initial state to the target state through state n,
g(n) is the actual cost from the initial state to state n in the state space,
h(n) is the estimated cost of the best path from state n to the target state.

If g(n) is 0, that is, only the heuristic h(n) from vertex n to the target is computed and the distance from the starting point to n is ignored, the algorithm becomes a greedy best-first search. This is the fastest variant, but it may not find the optimal solution.
If h(n) is never greater than the actual distance from vertex n to the target vertex, the optimal solution is guaranteed to be found. The smaller h(n) is, the more nodes need to be expanded and the less efficient the algorithm is. Common heuristics are Euclidean distance, Manhattan distance, and Chebyshev distance.
If h(n) is 0, that is, only the shortest distance g(n) from the starting point to vertex n is computed and no heuristic is used, the problem becomes a single-source shortest path problem, that is, Dijkstra's algorithm. This variant expands the most vertices.
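The effect of h(n) can be observed by running the same search with two different heuristics. The following is a sketch under stated assumptions, not the template from this article: it uses heapq, coordinates are (row, col) tuples, every move costs 1, and the names a_star, manhattan and zero are made up for the example. It returns the path length together with the number of cells expanded.

```python
import heapq


def a_star(grid, start, goal, h):
    """Return (path_length, expanded_count); path_length is None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    best_g = {start: 0}
    heap = [(h(start, goal), 0, start)]  # entries are (f, g, cell)
    expanded = set()
    while heap:
        f, g, cell = heapq.heappop(heap)
        if cell in expanded:
            continue  # stale heap entry
        expanded.add(cell)
        if cell == goal:
            return g, len(expanded)
        r, c = cell
        for dr, dc in ((-1, 0), (0, -1), (1, 0), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(heap, (ng + h((nr, nc), goal), ng, (nr, nc)))
    return None, len(expanded)


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def zero(a, b):
    return 0  # h(n) = 0 turns A* into Dijkstra's algorithm


grid = [
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
start, goal = (0, 0), (4, 5)
print(a_star(grid, start, goal, manhattan))  # (path length, cells expanded)
print(a_star(grid, start, goal, zero))       # same length, at least as many cells
```

Both runs report the same path length, because Manhattan distance never overestimates on a 4-connected grid; the run with h(n) = 0 expands at least as many cells.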

An important concept here is the heuristic function. Generally, we use Manhattan distance as the heuristic, that is, H(n) = D * (abs(n.x - goal.x) + abs(n.y - goal.y)), where D is the cost of moving between adjacent cells.
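For reference, the three heuristics mentioned above can be written as follows. This is a small sketch: points are assumed to be (x, y) tuples, and D defaults to 1 as an illustrative step cost.

```python
import math


def manhattan(n, goal, D=1):
    # Suits grids that allow moves in 4 directions.
    return D * (abs(n[0] - goal[0]) + abs(n[1] - goal[1]))


def chebyshev(n, goal, D=1):
    # Suits grids where a diagonal move costs the same as a straight move.
    return D * max(abs(n[0] - goal[0]), abs(n[1] - goal[1]))


def euclidean(n, goal, D=1):
    # Straight-line distance, for movement at any angle.
    return D * math.hypot(n[0] - goal[0], n[1] - goal[1])


n, goal = (0, 0), (3, 4)
print(manhattan(n, goal))  # 7
print(chebyshev(n, goal))  # 4
print(euclidean(n, goal))  # 5.0
```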

(The picture is from Wikipedia https://zh.wikipedia.org/wiki/A*%E6%90%9C%E5%B0%8B%E6%BC%94%E7%AE%97%E6%B3%95 )

A complete code template:

grid = [
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # 0 are free path whereas 1's are obstacles
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]

"""
heuristic = [[9, 8, 7, 6, 5, 4],
             [8, 7, 6, 5, 4, 3],
             [7, 6, 5, 4, 3, 2],
             [6, 5, 4, 3, 2, 1],
             [5, 4, 3, 2, 1, 0]]"""

init = [0, 0]
goal = [len(grid) - 1, len(grid[0]) - 1]  # all coordinates are given in format [y,x]
cost = 1

# the cost map which pushes the path closer to the goal
heuristic = [[0 for row in range(len(grid[0]))] for col in range(len(grid))]
for i in range(len(grid)):
    for j in range(len(grid[0])):
        heuristic[i][j] = abs(i - goal[0]) + abs(j - goal[1])
        if grid[i][j] == 1:
            heuristic[i][j] = 99  # added extra penalty in the heuristic map


# the actions we can take
delta = [[-1, 0], [0, -1], [1, 0], [0, 1]]  # go up  # go left  # go down  # go right


# function to search the path
def search(grid, init, goal, cost, heuristic):

    closed = [
        [0 for col in range(len(grid[0]))] for row in range(len(grid))
    ]  # the reference grid
    closed[init[0]][init[1]] = 1
    action = [
        [0 for col in range(len(grid[0]))] for row in range(len(grid))
    ]  # the action grid

    x = init[0]
    y = init[1]
    g = 0
    f = g + heuristic[init[0]][init[1]]
    cell = [[f, g, x, y]]

    found = False  # flag that is set when search is complete
    resign = False  # flag set if we can't find expand

    while not found and not resign:
        if len(cell) == 0:
            return "FAIL"
        else:  # to choose the least costliest action so as to move closer to the goal
            cell.sort()
            cell.reverse()
            next = cell.pop()
            x = next[2]
            y = next[3]
            g = next[1]

            if x == goal[0] and y == goal[1]:
                found = True
            else:
                for i in range(len(delta)):  # to try out different valid actions
                    x2 = x + delta[i][0]
                    y2 = y + delta[i][1]
                    if x2 >= 0 and x2 < len(grid) and y2 >= 0 and y2 < len(grid[0]):
                        if closed[x2][y2] == 0 and grid[x2][y2] == 0:
                            g2 = g + cost
                            f2 = g2 + heuristic[x2][y2]
                            cell.append([f2, g2, x2, y2])
                            closed[x2][y2] = 1
                            action[x2][y2] = i
    invpath = []
    x = goal[0]
    y = goal[1]
    invpath.append([x, y])  # we get the reverse path from here
    while x != init[0] or y != init[1]:
        x2 = x - delta[action[x][y]][0]
        y2 = y - delta[action[x][y]][1]
        x = x2
        y = y2
        invpath.append([x, y])

    path = []
    for i in range(len(invpath)):
        path.append(invpath[len(invpath) - 1 - i])
    print("ACTION MAP")
    for i in range(len(action)):
        print(action[i])

    return path


a = search(grid, init, goal, cost, heuristic)
for i in range(len(a)):
    print(a[i])

Typical problem: 1263. Minimum Moves to Move a Box to Their Target Location (Sokoban)

Bipartite graph

I have covered bipartite graphs in my solutions to two problems. After reading them, you can solve both problems. In fact, there is almost no difference between the two.

The recommended order is: 886 first, then 785.
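As a rough idea of what such problems ask for, a graph is bipartite if its vertices can be coloured with two colours so that no edge joins two vertices of the same colour. The following is a minimal BFS colouring sketch, assuming the graph is an adjacency list over vertices 0..n-1; the name is_bipartite is chosen for this example.

```python
from collections import deque


def is_bipartite(graph):
    """graph[u] lists the neighbours of u in an undirected graph on 0..n-1."""
    color = [-1] * len(graph)  # -1 means not coloured yet
    for start in range(len(graph)):
        if color[start] != -1:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if color[v] == -1:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # an edge joins two vertices of the same colour
    return True


print(is_bipartite([[1, 3], [0, 2], [1, 3], [0, 2]]))        # True
print(is_bipartite([[1, 2, 3], [0, 2], [0, 1, 3], [0, 2]]))  # False
```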

Summary

Understanding the common concepts of graphs only gets us started. Next, we need to practice on problems.

Graph problems generally fall into two categories: search problems and dynamic programming problems.

For search problems, we can:

First, build the graph.
Second, traverse the graph built in the first step to find a feasible solution.

If the problem states that the graph is acyclic, we can skip the visited array; otherwise we usually need it. You can also mark visited nodes in place to reduce space complexity. Specific search techniques are discussed in the search chapter of the topics section.
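The two steps can be sketched as follows. This is an illustrative example rather than a universal template: the input is assumed to be a list of undirected (u, v) pairs, and the names build_graph and reachable are made up here.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Step 1: turn a list of undirected (u, v) pairs into an adjacency list."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        graph[v].append(u)
    return graph


def reachable(graph, start, target):
    """Step 2: BFS from start; the visited set stops cycles from looping forever."""
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return False


graph = build_graph([(0, 1), (1, 2), (2, 0), (3, 4)])
print(reachable(graph, 0, 2))  # True
print(reachable(graph, 0, 4))  # False
```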

Graph problems are relatively difficult, especially when it comes to writing the code. But as far as interview questions are concerned, there are not many types of graph problems.

As far as search problems are concerned, many can be solved with templates. So practice the templates and type them out yourself until you can write them from memory.
For dynamic programming, a classic example is the Floyd-Warshall algorithm. Once you understand it, you can practice with 787. Cheapest Flights Within K Stops. Of course, this requires learning dynamic programming first, which we will explain in depth in the later "Dynamic Programming" and "Knapsack Problem" chapters.

The common template problems for graphs are as follows:

Shortest path. Algorithms include Dijkstra's algorithm, the Floyd-Warshall algorithm, and the Bellman-Ford algorithm. Some of these are single-source algorithms and some are multi-source; some are greedy and some are dynamic programming.
Topological sorting. Topological sorting can be done with BFS or DFS. Compared with shortest path, this type of problem is simple once you know it.
Minimum spanning tree. This is the least frequently tested of the three types, so you can leave it for last.
A star pathfinding and bipartite graph problems appear very rarely; you can choose whether to master them according to your own situation.
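Since topological sorting can also be done with DFS, here is a sketch of that variant for comparison with the BFS (Kahn) approach. It assumes the graph is a dict that maps every vertex to a list of its successors; the name topo_sort_dfs and the three-state marking are choices made for this example.

```python
def topo_sort_dfs(graph):
    """Return a topological order of graph, or None if it contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on the current DFS path / finished
    state = {v: WHITE for v in graph}
    order = []

    def dfs(u):
        state[u] = GRAY
        for v in graph[u]:
            if state[v] == GRAY:
                return False  # back edge: the graph contains a cycle
            if state[v] == WHITE and not dfs(v):
                return False
        state[u] = BLACK
        order.append(u)  # u is appended after all of its successors
        return True

    for v in graph:
        if state[v] == WHITE and not dfs(v):
            return None
    return order[::-1]


graph = {0: [1, 2], 1: [3], 2: [3], 3: [4, 5], 4: [], 5: []}
print(topo_sort_dfs(graph))  # [0, 2, 1, 3, 5, 4]
```

The order may differ from the BFS result; any order in which every edge points forward is a valid answer.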


lucifer
