Recently, a reader talked with me about the algorithm questions he ran into in an interview. One of them is quite interesting, so I will share it with everyone.

He said that he interviewed for a front-end position at a large blockchain company and was asked an algorithm question. It is a very common question, with a LeetCode original: 110. Balanced Binary Tree, rated Easy.

However, the interviewer added a small twist, and the difficulty jumped instantly. Let's take a look at what the interviewer changed.

Problem

The problem is to determine whether a binary tree is balanced. A balanced binary tree is one in which, for every node, the depths of the left and right subtrees differ by no more than 1. The input parameter is the root node root of the binary tree, and the output is a bool value.

The code will be called as follows:

console.log(isBalance([3, 9, 2, null, null, 5, 5]));

console.log(isBalance([1, 1, 2, 3, 4, null, null, 4, 4]));

Ideas

The solution revolves around the definition of a balanced binary tree.

For each node in the binary tree:

  • Calculate the heights of the left and right subtrees separately. If the height difference is greater than 1, return false directly.
  • Otherwise, recurse into the left and right child nodes. If both subtrees are balanced binary trees, return true. Otherwise, return false.

As you can see, our algorithm is simply the definition itself.

Calculating the depth of a node is relatively easy. You can use either pre-order traversal with parameter extension or post-order traversal. Here I use pre-order traversal with parameter extension.

If you are unfamiliar with this, I strongly recommend reading my article "I have almost finished all the tree problems, and I found these things...".

So you can write the following code.

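function getDepth(root, d = 0) {
  // An empty subtree ends the path; d is the number of nodes on the path so far.
  if (!root) return d;
  return Math.max(getDepth(root.left, d + 1), getDepth(root.right, d + 1));
}

function dfs(root) {
  if (!root) return true;
  // A height difference greater than 1 at any node means the tree is unbalanced.
  if (Math.abs(getDepth(root.left) - getDepth(root.right)) > 1) return false;
  return dfs(root.left) && dfs(root.right);
}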

function isBalance(root) {
  return dfs(root);
}

It is easy to see that the result has nothing to do with the val of a node (TreeNode); the value of val does not affect the result at all.

Is this over?

If you look carefully at the usage example given in the problem, you will find that it passes in an array of nodes, not the root node root of a binary tree.

Therefore, we first need to build a binary tree from the array. Building a binary tree is essentially a deserialization process. To know how to deserialize, you must first understand serialization.

There are many ways to serialize a binary tree, so which one does the problem use? This requires you to communicate with the interviewer. Chances are that the interviewer is waiting for you to ask!

Deserialization

Let's first look at what serialization is. The following definition comes from Wikipedia:

In data processing in computer science, serialization is the process of converting a data structure or object state into a format that can be stored (for example, in a file or a memory buffer) or transmitted (for example, over a network), so that the original state can later be restored in the same or another computer environment. When the resulting bytes are read back according to the serialization format, they can be used to produce a copy with the same semantics as the original object. For many objects, such as complex objects that make heavy use of references, this reconstruction process is not easy. Serialization of objects in object-oriented programming does not include the functions previously associated with the original object. This process is also called marshalling. The reverse operation, extracting a data structure from a series of bytes, is deserialization (also known as unmarshalling).

It can be seen that the application of serialization and deserialization in computer science is still very extensive. Take the LeetCode platform as an example, which allows users to input something like:

[1,2,3,null,null,4,5]

This data structure describes a tree:

([1,2,3,null,null,4,5] corresponding binary tree)

In fact, serialization and deserialization are concepts rather than one specific algorithm. There are many algorithms, and they differ for different data structures.

Prerequisites

Before reading on, you need to be familiar with tree traversal, BFS, and DFS. If you are not yet familiar with them, I recommend reading related articles first. I have also written a summary article, "Binary Tree Traversal", which you can take a look at.

Preface

We know that depth-first traversal of a binary tree can be divided into pre-order, in-order, and post-order traversal according to when the root node is visited. If the root node is visited first, it is a pre-order traversal; if it is visited last, it is a post-order traversal; otherwise it is an in-order traversal. The relative order of the left and right nodes does not change: it is always left first and then right.

Of course, it can also be set to right and then left.

Knowing the in-order traversal together with either the pre-order or the post-order traversal is enough to restore the original tree structure. Isn't this exactly serialization and deserialization? If you are unfamiliar with this, take a look at my "Constructing Binary Tree Series".

With that premise, the algorithm follows naturally. First perform two different traversals on the binary tree, say pre-order and in-order. Then serialize the two traversal results, for example by joining each one into a string with a comma "," and separating the two results with "$". Afterwards, you can deserialize the string, for example by splitting it back into arrays on the comma ",".

Serialization:

class Solution:
    def preorder(self, root: TreeNode):
        if not root: return []
        return [str(root.val)] + self.preorder(root.left) + self.preorder(root.right)
    def inorder(self, root: TreeNode):
        if not root: return []
        return self.inorder(root.left) + [str(root.val)] + self.inorder(root.right)
    def serialize(self, root):
        ans = ''
        ans += ','.join(self.preorder(root))
        ans += '$'
        ans += ','.join(self.inorder(root))

        return ans

Deserialization:

Here I directly reuse the solution to 105. Construct Binary Tree from Preorder and Inorder Traversal, without changing a single line of code.

class Solution:
    def deserialize(self, data: str):
        preorder, inorder = data.split('$')
        if not preorder: return None
        return self.buildTree(preorder.split(','), inorder.split(','))

    def buildTree(self, preorder: List[int], inorder: List[int]) -> TreeNode:
        # In fact, inorder and preorder are always empty at the same time, so checking either one works
        if not preorder:
            return None
        root = TreeNode(preorder[0])

        i = inorder.index(root.val)
        root.left = self.buildTree(preorder[1:i + 1], inorder[:i])
        root.right = self.buildTree(preorder[i + 1:], inorder[i+1:])

        return root

In fact, this algorithm is not always correct, because the nodes of the tree may contain duplicate values. In other words, my earlier statement that the traversal results are enough to restore the original tree structure was not rigorous. Strictly speaking, it should be: if there are no duplicate elements in the tree, the in-order traversal together with the pre-order or post-order traversal can restore the original tree structure.

Observant readers will have noticed that I used i = inorder.index(root.val) in the code above. If there are duplicate elements, the index i found this way may not be the right one. This algorithm can still be used if the problem guarantees no duplicate elements. In practice, however, it is unrealistic to assume there are none, so other methods need to be considered. What kind of method?
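To make the problem with duplicates concrete, here is a minimal sketch. The TreeNode class and the helper names are assumptions made for illustration, not part of the solution above. It builds two trees of different shapes whose pre-order and in-order sequences are identical, so those two sequences alone cannot tell them apart.

```python
class TreeNode:
    """Minimal binary tree node, assumed for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def preorder(root):
    if not root:
        return []
    return [root.val] + preorder(root.left) + preorder(root.right)


def inorder(root):
    if not root:
        return []
    return inorder(root.left) + [root.val] + inorder(root.right)


# Tree a: root 1 whose LEFT child is 1.
# Tree b: root 1 whose RIGHT child is 1.
a = TreeNode(1, left=TreeNode(1))
b = TreeNode(1, right=TreeNode(1))

same_preorder = preorder(a) == preorder(b)  # both sequences are [1, 1]
same_inorder = inorder(a) == inorder(b)     # both sequences are [1, 1]
print(same_preorder, same_inorder)
```

Both comparisons hold even though the two trees are structurally different, which is why a lookup such as inorder.index(root.val) cannot be trusted once values repeat.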

The answer is to record the empty nodes. Now let's get to the main topic.

DFS

Serialization

Let's imitate LeetCode's notation. For example: [1,2,3,null,null,4,5] (essentially a BFS level-order traversal), the corresponding tree is as follows:

This notation is chosen over a DFS notation only because it looks more intuitive. It does not mean that this section is about BFS serialization and deserialization.

The serialization code is very simple. We only need to output the empty nodes in addition to a normal traversal (a normal traversal does not process empty nodes).

For example, we perform a pre-order traversal of the tree and add handling for empty nodes. Pre-order traversal is chosen because it makes the position of the root node easy to know and the code easy to write. Try it yourself if you don't believe it.

Therefore, serialization is just an ordinary DFS. Let me show you the code directly.

Python code:

class Codec:
    def serialize_dfs(self, root, ans):
        # Empty nodes must be serialized too; otherwise the tree cannot be uniquely determined. (Not repeated below.)
        if not root: return ans + '#,'
        # Nodes are separated by commas (,)
        ans += str(root.val) + ','
        ans = self.serialize_dfs(root.left, ans)
        ans = self.serialize_dfs(root.right, ans)
        return ans
    def serialize(self, root):
        # An extra comma is appended at the end, so the last character is removed. (Not repeated below.)
        return self.serialize_dfs(root, '')[:-1]

Java code:

public class Codec {
    public String serialize_dfs(TreeNode root, String str) {
        if (root == null) {
            str += "None,";
        } else {
            str += String.valueOf(root.val) + ",";
            str = serialize_dfs(root.left, str);
            str = serialize_dfs(root.right, str);
        }
        return str;
    }

    public String serialize(TreeNode root) {
        return serialize_dfs(root, "");
    }
}

[1,2,3,null,null,4,5] will be serialized as 1,2,#,#,3,4,#,#,5,#,# (the Python version writes # for an empty node, while the Java version writes None).
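The result above can be checked with a small standalone sketch. The TreeNode class is an assumption for illustration, and the function collects tokens in a list and joins them once instead of concatenating strings as the Codec above does.

```python
class TreeNode:
    """Minimal binary tree node, assumed for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def serialize(root):
    """Pre-order DFS that writes '#' for every empty child."""
    parts = []

    def dfs(node):
        if not node:
            parts.append('#')
            return
        parts.append(str(node.val))
        dfs(node.left)
        dfs(node.right)

    dfs(root)
    return ','.join(parts)


# The tree written as [1,2,3,null,null,4,5] in level order.
root = TreeNode(1, TreeNode(2), TreeNode(3, TreeNode(4), TreeNode(5)))
print(serialize(root))  # 1,2,#,#,3,4,#,#,5,#,#
```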

(Animation, from LeetCode)

Deserialization

The first step in deserialization is to split the string. In the example above, it becomes the array [1,2,#,#,3,4,#,#,5,#,#]. We then perform a pre-order traversal as well, processing one element at a time and rebuilding the tree as we go. Because the data was produced by a pre-order traversal, the first element is the root, followed by its entire left subtree, and then its entire right subtree.

Python code:

    def deserialize_dfs(self, nodes):
        if nodes:
            if nodes[0] == '#':
                nodes.pop(0)
                return None
            root = TreeNode(nodes.pop(0))
            root.left = self.deserialize_dfs(nodes)
            root.right = self.deserialize_dfs(nodes)
            return root
        return None

    def deserialize(self, data: str):
        nodes = data.split(',')
        return self.deserialize_dfs(nodes)

Java code:

    public TreeNode deserialize_dfs(List<String> l) {
        if (l.get(0).equals("None")) {
            l.remove(0);
            return null;
        }

        TreeNode root = new TreeNode(Integer.valueOf(l.get(0)));
        l.remove(0);
        root.left = deserialize_dfs(l);
        root.right = deserialize_dfs(l);

        return root;
    }

    public TreeNode deserialize(String data) {
        String[] data_array = data.split(",");
        List<String> data_list = new LinkedList<String>(Arrays.asList(data_array));
        return deserialize_dfs(data_list);
    }

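Complexity analysis

  • Time complexity: Each node is processed once, so the time complexity is $O(N)$, where $N$ is the total number of nodes. (In the Python version, nodes.pop(0) on a list itself costs $O(N)$ per call; consuming the tokens through a collections.deque or an index keeps the total at $O(N)$.)
  • Space complexity: The recursion stack takes $O(h)$, where $h$ is the depth of the tree, and the array of split tokens takes $O(N)$.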
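As a rough sketch of a deserializer that consumes each token in constant time, the version below walks the tokens with an iterator instead of popping from the front of a list. The TreeNode class, the function names, and the conversion of values with int() are assumptions for illustration; the serializer is included only to check the round trip.

```python
class TreeNode:
    """Minimal binary tree node, assumed for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def serialize(root):
    """Pre-order DFS with '#' for empty nodes (same format as above)."""
    if not root:
        return '#'
    return ','.join([str(root.val), serialize(root.left), serialize(root.right)])


def deserialize(data):
    """Rebuild the tree by consuming tokens in pre-order."""
    tokens = iter(data.split(','))

    def build():
        token = next(tokens)       # each token is read exactly once
        if token == '#':
            return None
        node = TreeNode(int(token))  # assumes integer node values
        node.left = build()
        node.right = build()
        return node

    return build()


data = '1,2,#,#,3,4,#,#,5,#,#'
root = deserialize(data)
print(serialize(root) == data)
```

Serializing the rebuilt tree gives back the original string, which is a convenient sanity check for any serialize/deserialize pair.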

BFS

Serialization

In fact, we can also use BFS to represent a tree. In that case, it is consistent with LeetCode's notation.

We know that a level-order traversal naturally has levels. Some problems require you to record the level of each node, and some do not.

This is just a plain BFS; the only difference is the addition of empty nodes.

Python code:


class Codec:
    def serialize(self, root):
        ans = ''
        queue = [root]
        while queue:
            node = queue.pop(0)
            if node:
                ans += str(node.val) + ','
                queue.append(node.left)
                queue.append(node.right)
            else:
                ans += '#,'
        return ans[:-1]
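To see what this produces for the example tree, here is a standalone sketch. The TreeNode class is an assumption for illustration, and a collections.deque replaces the list-based queue so that removing from the front does not shift the remaining elements.

```python
from collections import deque


class TreeNode:
    """Minimal binary tree node, assumed for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def serialize(root):
    """Level-order BFS that writes '#' for every empty node."""
    parts = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node:
            parts.append(str(node.val))
            queue.append(node.left)
            queue.append(node.right)
        else:
            parts.append('#')
    return ','.join(parts)


# The tree written as [1,2,3,null,null,4,5] in level order.
root = TreeNode(1, TreeNode(2), TreeNode(3, TreeNode(4), TreeNode(5)))
print(serialize(root))  # 1,2,3,#,#,4,5,#,#,#,#
```

Compared with [1,2,3,null,null,4,5], the output also lists the empty children of the last level, so it ends with four extra # markers.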

Deserialization

Consider the tree shown in the figure:

Its level-order traversal is [1,2,3,#,#,4,5]. Let's see how to restore the binary tree from this traversal. The following is a schematic diagram I drew:

(Animation: level-order traversal of the tree)

It is easy to see:

  • A node of level x must point to nodes of level x + 1. How do we find level x + 1? This is easy to do with a level-order traversal.
  • For a given level x, its nodes correspond from left to right to the nodes of level x + 1. That is, the left and right children of the first node are the 1st and 2nd nodes of the next layer, the left and right children of the second node are the 3rd and 4th nodes of the next layer, and so on.
  • In fact, if you observe carefully, no special handling is needed to tell level x from level x + 1. We can turn the idea around: the left and right children of the first node are the 1st and 2nd nodes, the left and right children of the second node are the 3rd and 4th nodes, and so on. (Note that the words "of the next layer" have been dropped.)

So our approach is again BFS, connecting the left and right children in turn.

Python code:

    def deserialize(self, data: str):
        # Requires `import collections` at the top of the file
        if not data or data == '#': return None
        # Prepare the data
        nodes = data.split(',')
        # BFS
        root = TreeNode(nodes[0])
        queue = collections.deque([root])
        # root has already been created, so start from index 1
        i = 1

        while queue and i < len(nodes):
            node = queue.popleft()
            lv = nodes[i]
            # A missing trailing value is treated as an empty node
            rv = nodes[i + 1] if i + 1 < len(nodes) else '#'
            i += 2
            # For a given level x, its nodes map from left to right onto the nodes of level x + 1
            # node belongs to level x, while l and r belong to level x + 1
            if lv != '#':
                l = TreeNode(lv)
                node.left = l
                queue.append(l)

            if rv != '#':
                r = TreeNode(rv)
                node.right = r
                queue.append(r)
        return root

Complexity analysis

  • Time complexity: Each node will be processed once, so the time complexity is $O(N)$, where $N$ is the total number of nodes.
  • Space complexity: $O(N)$, where $N$ is the total number of nodes.
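The BFS reconstruction can be exercised with a standalone sketch like the one below. The TreeNode class, the shape helper, and the int() conversion are assumptions for illustration. The sketch accepts both the full form with every trailing # and the shorter form that stops after the last real node.

```python
from collections import deque


class TreeNode:
    """Minimal binary tree node, assumed for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def deserialize(data):
    """Rebuild a tree from a level-order string with '#' for empty nodes."""
    if not data or data == '#':
        return None
    tokens = data.split(',')
    root = TreeNode(int(tokens[0]))  # assumes integer node values
    queue = deque([root])
    i = 1
    while queue and i < len(tokens):
        node = queue.popleft()
        # The next two tokens are the left and right child of `node`.
        if tokens[i] != '#':
            node.left = TreeNode(int(tokens[i]))
            queue.append(node.left)
        i += 1
        if i < len(tokens) and tokens[i] != '#':
            node.right = TreeNode(int(tokens[i]))
            queue.append(node.right)
        i += 1
    return root


def shape(node):
    """Nested-tuple view of a tree, handy for comparing structures."""
    if not node:
        return None
    return (node.val, shape(node.left), shape(node.right))


full = deserialize('1,2,3,#,#,4,5,#,#,#,#')
short = deserialize('1,2,3,#,#,4,5')
print(shape(full) == shape(short))
```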

Is this the end?

With the serialization knowledge above, we can ask the interviewer which serialization method is being used, choose the matching deserialization scheme to construct the binary tree, and finally solve the problem with the method from the beginning of this article.
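Putting the pieces together, a Python sketch of the whole answer might look like this. It assumes the interviewer confirms that the array is a LeetCode-style level-order listing in which None marks an empty node and empty nodes have no children listed. The names build_tree, get_depth, and is_balance are illustrative.

```python
from collections import deque


class TreeNode:
    """Minimal binary tree node, assumed for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def build_tree(values):
    """Build a tree from a level-order list where None marks an empty node."""
    if not values or values[0] is None:
        return None
    root = TreeNode(values[0])
    queue = deque([root])
    i = 1
    while queue and i < len(values):
        node = queue.popleft()
        if values[i] is not None:
            node.left = TreeNode(values[i])
            queue.append(node.left)
        i += 1
        if i < len(values) and values[i] is not None:
            node.right = TreeNode(values[i])
            queue.append(node.right)
        i += 1
    return root


def get_depth(root):
    if not root:
        return 0
    return max(get_depth(root.left), get_depth(root.right)) + 1


def is_balanced_tree(root):
    if not root:
        return True
    if abs(get_depth(root.left) - get_depth(root.right)) > 1:
        return False
    return is_balanced_tree(root.left) and is_balanced_tree(root.right)


def is_balance(values):
    """Deserialize first, then apply the definition-based check."""
    return is_balanced_tree(build_tree(values))


print(is_balance([3, 9, 2, None, None, 5, 5]))        # True
print(is_balance([1, 1, 2, 3, 4, None, None, 4, 4]))  # False
```

In the second example the left subtree of the root has depth 3 and the right subtree has depth 1, so the check fails at the root.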

Do you think it's over here?

Not at all! The interviewer then asked him to state the complexity of his solution.

After reading this, you might as well pause for a while and think about the complexity of this solution?

1

2

3

4

5

Ok, let's reveal the secret.

The time complexity is $O(n) + O(n^2)$, where $O(n)$ is the time to build the tree and $O(n^2)$ is the time to determine whether it is a balanced binary tree.

Why is the time complexity of judging a balanced binary tree $O(n^2)$? Because we calculate the depth at every node, the total time is the sum of the depths of all nodes. The worst case is when the tree degenerates into a linked list, where this sum is $1 + 2 + \dots + n$. By the arithmetic series formula, the time complexity is $O(n^2)$.

The space complexity is clearly $O(n)$. This includes the overhead of constructing the binary tree and of the recursion stack.

The interviewer asked again: Can it be optimized?

After reading this, you might as well pause for a while and think about how this solution could be optimized.

1

2

3

4

5

Ok, let's reveal the secret.

There are two optimization methods:

  • The first trades space for time. Record the return value of the getDepth function so that getDepth is never executed more than once with the same argument. In this way, the time complexity can be reduced to $O(n)$.
  • The second method is similar to the first, and its essence is memoized recursion (similar to dynamic programming).
In my previous article, "Reader: Xifa, how can memoized recursion be changed into dynamic programming?", I described in detail the mutual conversion between memoized recursion and dynamic programming. If you read it, you will recognize this as memoized recursion.

The code of the first method is relatively simple, so I won't write it. Here is the code for the second method.

Define the function getDepth(root) to return the depth of root. Note that if the subtree rooted at root is unbalanced, it directly returns -1, so that the two functions above (getDepth and isBalance) can be merged into one.

class Solution:
    def isBalanced(self, root: TreeNode) -> bool:
        def getDepth(root: TreeNode) -> int:
            if not root:
                return 0
            lh = getDepth(root.left)
            rh = getDepth(root.right)
            # lh == -1 means the left subtree is unbalanced
            # rh == -1 means the right subtree is unbalanced
            if lh == -1 or rh == -1 or abs(rh - lh) > 1:
                return -1
            return max(lh, rh) + 1

        return getDepth(root) != -1
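For comparison, the first method (caching the result of getDepth) could be sketched roughly as follows. This is one possible way to do it rather than a definitive implementation: it assumes that TreeNode instances are hashable by identity, which holds for a plain class like the one below, and it uses functools.lru_cache as the cache.

```python
from functools import lru_cache


class TreeNode:
    """Minimal binary tree node, assumed for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def is_balanced(root):
    # Each node's depth is computed once and then served from the cache.
    @lru_cache(maxsize=None)
    def get_depth(node):
        if not node:
            return 0
        return max(get_depth(node.left), get_depth(node.right)) + 1

    def dfs(node):
        if not node:
            return True
        if abs(get_depth(node.left) - get_depth(node.right)) > 1:
            return False
        return dfs(node.left) and dfs(node.right)

    return dfs(root)


balanced = TreeNode(3, TreeNode(9), TreeNode(2, TreeNode(5), TreeNode(5)))
chain = TreeNode(1, TreeNode(2, TreeNode(3)))
print(is_balanced(balanced), is_balanced(chain))  # True False
```

The cache costs $O(n)$ extra space, which is the price of the space-for-time trade described above.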

Summary

Although this interview question is a common, routine one, the parameters were changed a bit and the difficulty jumped immediately. If the interviewer did not tell you directly how the nodes are serialized, it may have been deliberate. There are many ways to serialize a binary tree, so which one does the problem use? This requires you to communicate with the interviewer. Chances are that the interviewer is waiting for you to ask! This is the difficulty of this question.

The essence of constructing a binary tree is deserialization. How to deserialize depends on the serialization algorithm that was used.

The serialization method can be divided into storing empty nodes and not storing empty nodes according to whether to store empty nodes.

Storing empty nodes will cause a waste of space, and not storing empty nodes will make it impossible to uniquely determine a tree containing duplicate values.

Regarding serialization, this article mainly covers the serialization and deserialization of binary trees. After reading it, you can confidently go and solve (AC) the following two problems:

In addition, brute-force solutions alone are not enough. Everyone should hold themselves to a higher standard.

At the very least, you have to be able to analyze your own algorithms, and the most commonly used tool is complexity analysis. Furthermore, if you can optimize the algorithm, that is a bonus. For example, here I described two optimization methods that reduce the time to $O(n)$.

That is all for this article. If you have any thoughts on it, please leave me a message. I will read and reply to them one by one when I have time. I'm Lucifer, maintainer of the best algorithm solutions repository in West Lake District, with over 40K stars on GitHub. You can also follow my public account "Likoujiajia", where I help you tackle the hard parts of algorithms.
In addition, the 1,000-page e-book that I have compiled is available as a free download for a limited time. Go to the backend of my public account "Likoujiajia" and reply "e-book" to get it.

