Data Structure and Algorithm: Introduction to Greedy Algorithm

This tutorial outlines and explains the greedy algorithm with easy-to-follow code and diagrams.

1. Prefix tree

1.1 Description

The prefix tree is related to the greedy algorithm, but we will set that relationship aside for now.

A prefix tree, also known as a trie or word search tree, is a tree structure used to store a large number of strings.

It trades space for time: strings with a common prefix share the same path, which reduces the cost of each query.

1.2 Classic prefix tree

In the classic prefix tree, the characters are stored on the paths (edges), and the nodes themselves hold no data.

1.3 Code definition

1.3.1 Node structure

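To support useful operations on a Trie, some data items are added to each node in practice.

public class Node {
    // Number of times this node was reached while building the trie
    int pass;
    // Number of stored strings that end at this node (0 means no string ends here)
    int end;
    // Paths to the child nodes, one slot per character
    Node[] nexts; // alternatives: HashMap<Character, Node> or TreeMap<Character, Node>

    public Node() {
        this.pass = 0;
        this.end = 0;
        // If the dictionary only contains 'a'-'z', use nexts[26] with indices 0-25.
        // Whether a path exists is decided by whether the child node exists:
        // nexts[0] == null means there is no path for 'a' below this node;
        // nexts[0] != null means there is a path for 'a' below this node.
        nexts = new Node[26];
    }
}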

Note: In the prefix tree, each node's downward paths are implemented by attaching child nodes. In code, the child slots are numbered by array index, so each slot "carries" a different character through the one-to-one correspondence between indices and characters.

If the strings may contain many kinds of characters, an array is not a good way to hold the child nodes. A Java char can take more than 60,000 values, and you would not allocate an array of that size in every node up front. When the character set is large, use a hash table instead of an array; its keys also correspond one-to-one with characters.

Replacing the array with a hash table does not change the overall algorithm; only the coding details change.

With a hash table, however, the paths are unordered. If you want the paths kept in order, as they are in the array, use an ordered map (such as TreeMap) instead of a hash table.
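As a rough illustration of the note above, here is a minimal sketch of a node whose children live in a map instead of a fixed array. The class names `MapTrieNode` and `MapTrie` and the method `firstCharacters` are hypothetical and not part of the original code; the counting logic mirrors the `pass`/`end` idea of the array version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical map-based variant of Node (an assumption for illustration only).
class MapTrieNode {
    int pass;
    int end;
    // TreeMap keeps the children sorted by character;
    // a HashMap would also work if the order of the paths does not matter.
    Map<Character, MapTrieNode> nexts = new TreeMap<>();
}

class MapTrie {
    private final MapTrieNode root = new MapTrieNode();

    // Same idea as the array version: bump pass along the path, end on the last node.
    void insert(String word) {
        if (word == null) {
            return;
        }
        MapTrieNode cur = root;
        cur.pass++;
        for (char c : word.toCharArray()) {
            cur = cur.nexts.computeIfAbsent(c, k -> new MapTrieNode());
            cur.pass++;
        }
        cur.end++;
    }

    // Number of stored strings that start with pre.
    int prefixNumber(String pre) {
        if (pre == null) {
            return 0;
        }
        MapTrieNode cur = root;
        for (char c : pre.toCharArray()) {
            cur = cur.nexts.get(c);
            if (cur == null) {
                return 0;
            }
        }
        return cur.pass;
    }

    // Characters on the paths leaving the root, in sorted order thanks to TreeMap.
    List<Character> firstCharacters() {
        return new ArrayList<>(root.nexts.keySet());
    }
}
```

Each node only allocates entries for the characters that actually occur, so the character set no longer has to be known in advance.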

Once the nodes carry these data items, many questions become easy to answer, for example:

How do I know whether the string "bck" is stored in the Trie?

Answer: Start from the root node. Is there a path for 'b'? Yes. From there, is there a path for 'c'? Yes. Then is there a path for 'k'? Yes. Now check end on the node at the end of the 'k' path: if end != 0, "bck" is stored; if end == 0, "bck" is not stored.

How do I know how many of the strings stored in the Trie have the prefix "ab"?

Answer: Start from the root node. Is there a path for 'a'? Yes. From there, is there a path for 'b'? Yes. Then check pass on the node at the end of the 'b' path. The value of pass is the number of strings prefixed with "ab".

How do I know how many strings are stored in the Trie?

Answer: Just check pass on the root node.

How do I know how many empty strings are stored in the Trie?

Answer: Just check end on the root node.

As these questions show, the data items on the nodes make it easy to query any string stored in the Trie, and a query is cheap: it only walks as many nodes as the string has characters.

1.3.2 Tree structure

public class Trie {
    // Root node
    private Node root;
    
    public Trie() {
        this.root = new Node();
    }
    
    // Operations (see 1.4)
    ...
}

1.4 Basic operation

1.4.1 Add string

Idea: Starting from the root node, add 1 to pass on every node along the path (the root included), and add 1 to end on the node where the string ends.

Code:

public void insert(String word) {
    if (word == null) {
        return ;
    }
    char[] charSequence = word.toCharArray();
    // Every string starts at the root node
    Node cur = this.root;
    cur.pass ++;
    for (int i = 0; i < charSequence.length; i ++) {
        // Assumes word only contains 'a'-'z'
        int index = charSequence[i] - 'a';
        // Attach a node for this character if there is none yet
        if (cur.nexts[index] == null) {
            cur.nexts[index] = new Node();
        }
        // Move down
        cur = cur.nexts[index];
        cur.pass ++;
    }
    // Record that a string ends at this node
    cur.end ++;
}

Note: Every string is inserted starting from the root, which means that every stored string has the empty string as a prefix.

1.4.2 Delete string

Idea: If the string is stored, start from the root node, subtract 1 from pass on every node along the path, and subtract 1 from end on the node where the string ends.

Code:

public void delete(String word) {
    // Do nothing unless the word is actually stored
    if (word == null || search(word) == 0) {
        return ;
    }
    char[] charSequence = word.toCharArray();
    Node cur = this.root;
    cur.pass --;
    for (int i = 0; i < charSequence.length; i ++) {
        int index = charSequence[i] - 'a';
        // If the child's pass drops to 0 after the update, no stored string goes through it any more
        if (-- cur.nexts[index].pass == 0) {
            // Release every node on the path below
            cur.nexts[index] = null;
            return ;
        }
        cur = cur.nexts[index];
    }
    cur.end --;
}

Note: If the target string is stored only once, the nodes that become unused after the counts are updated need to be released. Java has automatic garbage collection, so the first time a child's pass drops to 0 we can simply set the reference to it to null, and all nodes below it are reclaimed automatically. In C++ you would have to walk to the end of the path and release the nodes manually while backtracking.

1.4.3 Query string

Idea: Walk the path of the string. If the whole path exists, return end of the node where the string ends.

Code:

// Returns how many times word is stored
public int search(String word) {
    if (word == null) {
        return 0;
    }
    char[] charSequence = word.toCharArray();
    Node cur = this.root;
    // Only walk the Trie; do not update any data
    for (int i = 0; i < charSequence.length; i ++) {
        int index = charSequence[i] - 'a';
        // No node attached means the string is not stored
        if (cur.nexts[index] == null) {
            return 0;
        }
        cur = cur.nexts[index];
    }
    // Return end of the node where the string ends
    return cur.end;
}

1.4.4 Query prefix

Idea: Walk the path of the prefix. If the whole path exists, return pass of the node where the prefix ends.

Code:

// Returns how many stored strings have pre as a prefix
public int preFixNumber(String pre) {
    if (pre == null) {
        return 0;
    }
    char[] charSequence = pre.toCharArray();
    Node cur = this.root;
    // Only walk the Trie; do not update any data
    for (int i = 0; i < charSequence.length; i ++) {
        int index = charSequence[i] - 'a';
        // No node attached means no stored string has this prefix
        if (cur.nexts[index] == null) {
            return 0;
        }
        cur = cur.nexts[index];
    }
    // Return pass of the node where the prefix ends
    return cur.pass;
}
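
To tie the four operations together, the following self-contained sketch restates the Trie in compact form and runs the questions from 1.3.1 against it. The class name `TrieDemo`, the helper `walk`, and the sample words are assumptions made for this demo; the behavior is intended to match the code above for strings of 'a'-'z'.

```java
// Compact restatement of the Trie from this section, written only so the demo runs on its own.
class TrieDemo {
    static class Node {
        int pass;
        int end;
        Node[] nexts = new Node[26];
    }

    private final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        cur.pass++;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (cur.nexts[index] == null) {
                cur.nexts[index] = new Node();
            }
            cur = cur.nexts[index];
            cur.pass++;
        }
        cur.end++;
    }

    // Hypothetical helper: follows the path for s and returns null if the path does not exist.
    private Node walk(String s) {
        Node cur = root;
        for (char c : s.toCharArray()) {
            cur = cur.nexts[c - 'a'];
            if (cur == null) {
                return null;
            }
        }
        return cur;
    }

    int search(String word) {
        Node node = walk(word);
        return node == null ? 0 : node.end;
    }

    int preFixNumber(String pre) {
        Node node = walk(pre);
        return node == null ? 0 : node.pass;
    }

    void delete(String word) {
        if (search(word) == 0) {
            return;
        }
        Node cur = root;
        cur.pass--;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (--cur.nexts[index].pass == 0) {
                cur.nexts[index] = null;
                return;
            }
            cur = cur.nexts[index];
        }
        cur.end--;
    }

    public static void main(String[] args) {
        TrieDemo trie = new TrieDemo();
        for (String w : new String[] {"abc", "abd", "bck", "bck", ""}) {
            trie.insert(w);
        }
        System.out.println(trie.search("bck"));      // how many times "bck" is stored
        System.out.println(trie.preFixNumber("ab")); // strings prefixed with "ab"
        System.out.println(trie.preFixNumber(""));   // pass of the root: all stored strings
        System.out.println(trie.search(""));         // end of the root: empty strings
        trie.delete("abd");
        System.out.println(trie.preFixNumber("ab")); // one fewer after the delete
    }
}
```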

2. Greedy algorithm

2.1 Concept

A greedy algorithm fixes a criterion, considers the samples that best meet the criterion first and those that meet it least last, and arrives at the answer this way.

In other words, it does not consider overall optimality at each step; every choice is only locally optimal in some sense.

2.2 Description

The greedy algorithm is in fact one of the most commonly used algorithms, and its code is usually very short.

Many everyday greedy algorithms only aim for a good solution, not an optimal one. For most of them, the step from the local optimum to the global optimum cannot be proved, or the proof is wrong, because being greedy is sometimes quite subjective. The greedy problems we meet in exercises and interviews, however, are deterministic and have a global optimum that can be found. Checking a greedy strategy there would, in principle, require a proof that local optimality leads to global optimality.

This article does not show such proofs, because the way a locally optimal strategy leads to the global optimum is proved differently for every problem, and there is not enough time in an interview to prove every greedy strategy. The following sections introduce a very useful technique instead. It requires preparing a number of templates, but you only need to prepare them once. With them ready, you can answer greedy questions quickly and accurately, much faster than by proof.

2.3 Meeting scheduling problem

Problem:

Several projects each need to use a conference room for a presentation, and the room cannot host two presentations at the same time. Given all the projects with the start time and end time of each, arrange the schedule so that the room hosts as many presentations as possible, and return that maximum number.

Analysis:

Greedy strategy A: schedule the project that starts earliest first.

This does not yield the global optimum. Counter-example: a project that starts first but runs very long, such as [0, 10), blocks the two short projects [1, 2) and [3, 4) that would both fit.

Greedy strategy B: schedule the project with the shortest duration first.

This does not yield the global optimum either. Counter-example: a short project such as [4, 7) that straddles [0, 5) and [6, 10) is chosen first and blocks both of them.

Greedy strategy C: schedule the project that ends earliest first.

This yields the global optimum.

Code:

import java.util.Arrays;
import java.util.Comparator;

public class Solution {

    static class Program {
        // Start time of the project
        public int begin;
        // End time of the project
        public int end;
        public Program(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Comparator that orders Programs by end time
    static class ProgramComparator implements Comparator<Program> {
        @Override
        public int compare(Program program1, Program program2) {
            return Integer.compare(program1.end, program2.end);
        }
    }

    public int arrangeProgram(Program[] programs, int currentTime) {
        if (programs == null || programs.length == 0) {
            return 0;
        }
        // Number of presentations that can be scheduled
        int number = 0;
        // Sort the Programs by end time, earliest first
        Arrays.sort(programs, new ProgramComparator());
        for (int i = 0; i < programs.length; i ++) {
            // Schedule the project only if the room is free when it begins
            // (a project may begin at the moment the previous one ends)
            if (currentTime <= programs[i].begin) {
                number ++;
                // The room stays occupied until this project ends
                currentTime = programs[i].end;
            }
        }
        return number;
    }
}
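
To see strategies A and B fail on concrete data, the sketch below runs all three strategies through one generic routine. The class name `GreedyStrategies`, the `{begin, end}` array encoding, and the two sample inputs are assumptions chosen for illustration; they are not the original figures. Intervals are treated as half-open, so a meeting may begin when another ends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class GreedyStrategies {
    static final Comparator<int[]> EARLIEST_START = Comparator.comparingInt(m -> m[0]);
    static final Comparator<int[]> SHORTEST_FIRST = Comparator.comparingInt(m -> m[1] - m[0]);
    static final Comparator<int[]> EARLIEST_END = Comparator.comparingInt(m -> m[1]);

    // Each meeting is {begin, end}. Take the meetings in the given priority order
    // and keep one only if it does not overlap any meeting already chosen.
    static int schedule(int[][] meetings, Comparator<int[]> priority) {
        int[][] sorted = meetings.clone();
        Arrays.sort(sorted, priority);
        List<int[]> chosen = new ArrayList<>();
        for (int[] m : sorted) {
            boolean clash = false;
            for (int[] c : chosen) {
                if (m[0] < c[1] && c[0] < m[1]) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                chosen.add(m);
            }
        }
        return chosen.size();
    }

    public static void main(String[] args) {
        // One long early meeting hides two short ones.
        int[][] againstA = {{0, 10}, {1, 2}, {3, 4}};
        // One short meeting straddles two longer ones.
        int[][] againstB = {{0, 5}, {4, 7}, {6, 10}};
        System.out.println("A: " + schedule(againstA, EARLIEST_START)
                + " vs C: " + schedule(againstA, EARLIEST_END));
        System.out.println("B: " + schedule(againstB, SHORTEST_FIRST)
                + " vs C: " + schedule(againstB, EARLIEST_END));
    }
}
```

On both inputs the earliest-end strategy schedules two meetings while the competing strategy schedules only one.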

2.4 Problem solving procedures

After reading 2.3 you will surely ask: why does greedy strategy C find the optimal solution? Don't worry about why strategy C is right. In practice you get there by guessing, and by guessing skillfully.

In a greedy problem you will usually come up with several candidate strategies. If you can knock some of them down with counter-examples, great. But if a few strategies look plausible and you cannot find a counter-example, do you need a rigorous mathematical proof? You can attempt one when practicing on your own, but not during an interview!

The mathematical proof of a greedy strategy is different for every problem, because each problem has its own setting. The proof techniques also vary wildly, and what they test is mathematical ability.

So how do you verify that your greedy strategy is correct? With a checker (对数器): a test harness that compares your solution with a brute-force one on random inputs.

Routine for solving the problem:

Implement a solution X that does not rely on any greedy strategy; the most brute-force attempt will do.
Come up with greedy strategy A, greedy strategy B, greedy strategy C...
Use solution X and the checker to test each greedy strategy, and find out by experiment which strategy is correct.
Don't worry about proving the greedy strategy.

Preparation:

Prepare code templates for brute-force attempts or full permutations.
Prepare a code template for the checker.
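
Applied to the meeting problem from 2.3, the routine might look like the sketch below: a brute-force solution X that tries every subset, two candidate greedy strategies, and a loop that compares each candidate with X on random inputs. All names (`MeetingChecker`, `findMismatch`, `randomMeetings`) and the `{begin, end}` encoding are assumptions for illustration, and meetings are assumed to have positive length.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToIntFunction;

class MeetingChecker {
    // Solution X: try every subset of meetings and keep the largest one without overlaps.
    static int bruteForce(int[][] meetings) {
        int best = 0;
        for (int mask = 0; mask < (1 << meetings.length); mask++) {
            if (compatible(meetings, mask)) {
                best = Math.max(best, Integer.bitCount(mask));
            }
        }
        return best;
    }

    private static boolean compatible(int[][] m, int mask) {
        for (int i = 0; i < m.length; i++) {
            if ((mask & (1 << i)) == 0) continue;
            for (int j = i + 1; j < m.length; j++) {
                if ((mask & (1 << j)) == 0) continue;
                // Half-open intervals overlap when each begins before the other ends.
                if (m[i][0] < m[j][1] && m[j][0] < m[i][1]) return false;
            }
        }
        return true;
    }

    // Candidate strategy C: earliest end first.
    static int greedyByEnd(int[][] meetings) {
        return sweep(meetings, 1);
    }

    // Candidate strategy A: earliest start first (expected to fail).
    static int greedyByStart(int[][] meetings) {
        return sweep(meetings, 0);
    }

    // Sorts by the given field (0 = begin, 1 = end) and takes each meeting that fits.
    private static int sweep(int[][] meetings, int field) {
        int[][] sorted = meetings.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[field], b[field]));
        int count = 0;
        int currentTime = 0;
        for (int[] m : sorted) {
            if (currentTime <= m[0]) {
                count++;
                currentTime = m[1];
            }
        }
        return count;
    }

    static int[][] randomMeetings(Random random, int maxMeetings, int maxTime) {
        int[][] meetings = new int[random.nextInt(maxMeetings + 1)][];
        for (int i = 0; i < meetings.length; i++) {
            int begin = random.nextInt(maxTime);                   // 0 .. maxTime-1
            int end = begin + 1 + random.nextInt(maxTime - begin); // begin+1 .. maxTime
            meetings[i] = new int[] {begin, end};
        }
        return meetings;
    }

    // Returns the first random input on which candidate and brute force disagree, or null.
    static int[][] findMismatch(ToIntFunction<int[][]> candidate, int trials,
                                int maxMeetings, int maxTime, long seed) {
        Random random = new Random(seed);
        for (int t = 0; t < trials; t++) {
            int[][] sample = randomMeetings(random, maxMeetings, maxTime);
            if (candidate.applyAsInt(sample) != bruteForce(sample)) {
                return sample;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("strategy C mismatch: "
                + Arrays.deepToString(findMismatch(MeetingChecker::greedyByEnd, 2000, 8, 20, 42L)));
        System.out.println("strategy A mismatch: "
                + Arrays.deepToString(findMismatch(MeetingChecker::greedyByStart, 2000, 8, 20, 42L)));
    }
}
```

Keeping the inputs small (here at most 8 meetings) is deliberate: the brute force stays fast, and any failing sample that is printed is short enough to debug by hand.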

2.5 Interview situation

Greedy problems are not very flexible, so they make up a relatively small share of interviews: out of about five questions, at most one is a greedy problem.

First, greedy problems cannot test coding skill, because the code is extremely simple.

Second, greedy problems do not discriminate well between candidates. Once the greedy strategy is found, every solution looks the same; the only difference is between 0 and 1, that is, whether you found the strategy or not.

3. Checker

3.1 Description

A checker is very handy and can help you solve many problems.

Suppose you want to test method A, and the same problem can be solved with several strategies. Ignoring time complexity, you can write a brute-force solution (for example, one that enumerates all permutations and combinations). Call it method B: its time complexity is poor, but it is easy to think of and easy to write correctly. Why does method A need testing? Because method A is usually the one that is harder to come up with, or the one written to achieve a lower time complexity.

Without a checker, you would have to rely on an online judge (OJ) for every test. But if every problem depends on an OJ, what do you do with an unfamiliar problem that you cannot find on any OJ? Do you skip practicing it? Moreover, the test data online is written by people. Is whoever prepared the test cases sure to include a case that makes wrong code fail? You passed, but can you guarantee that your code is correct? Not necessarily. This is where the checker comes in: because it compares your code with a brute-force reference on a large number of random cases, it is far more reliable.

3.2 Implementation

Implement a random sample generator that controls the sample size and the number of tests and produces random data. Run each generated sample through method A to get res1 and through method B to get res2, then check whether res1 and res2 agree. Test tens of thousands of samples, then change the sample size and test again. When res1 and res2 disagree, either method A is wrong, or method B is wrong, or both are wrong.

Implementation of random numbers in Java:

// Returns a double uniformly distributed over [0, 1)
double r1 = Math.random();
// Returns a double uniformly distributed over [0, N)
double r2 = N * Math.random();
// Returns an int uniformly distributed over the integers in [0, N-1]
int r3 = (int) (N * Math.random());

By generating random numbers, you can randomize every part of a test: the sample size, the number of tests, and the test data.

The business logic differs from problem to problem, so the checker differs too. Implement it to fit the problem at hand.
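
The generator-and-compare loop described in 3.2 could be sketched as follows. The problem used here (finding the maximum of an array) is deliberately trivial and is only a placeholder; the class name `RandomChecker` and all method names are assumptions, and in practice method A and method B would be replaced by the solutions being compared.

```java
import java.util.Arrays;

class RandomChecker {
    // Length is uniform over [0, maxSize]; each value is uniform over [-maxValue, maxValue].
    static int[] randomArray(int maxSize, int maxValue) {
        int[] arr = new int[(int) ((maxSize + 1) * Math.random())];
        for (int i = 0; i < arr.length; i++) {
            arr[i] = (int) ((2 * maxValue + 1) * Math.random()) - maxValue;
        }
        return arr;
    }

    // Method A (the one under test): a single pass over the array.
    static int maxOnePass(int[] arr) {
        int max = Integer.MIN_VALUE;
        for (int v : arr) {
            max = Math.max(max, v);
        }
        return max;
    }

    // Method B (slow but easy to trust): sort a copy and take the last element.
    static int maxBySorting(int[] arr) {
        if (arr.length == 0) {
            return Integer.MIN_VALUE;
        }
        int[] copy = Arrays.copyOf(arr, arr.length);
        Arrays.sort(copy);
        return copy[copy.length - 1];
    }

    // Runs testTimes random samples; prints the first failing sample and returns false on a mismatch.
    static boolean run(int testTimes, int maxSize, int maxValue) {
        for (int i = 0; i < testTimes; i++) {
            int[] sample = randomArray(maxSize, maxValue);
            int res1 = maxOnePass(sample);
            int res2 = maxBySorting(sample);
            if (res1 != res2) {
                System.out.println("Mismatch on " + Arrays.toString(sample)
                        + ": " + res1 + " vs " + res2);
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(run(100_000, 20, 100) ? "all samples agree" : "found a mismatch");
    }
}
```

Printing the failing sample matters as much as detecting it: with a small maxSize, the sample is usually short enough to trace by hand.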

4. Interview Questions

4.1 Finding the median

Problem:

A user stores N numbers into a structure one at a time, and the median of all numbers currently stored in the structure must be available at any time.

Rule: with an odd count of numbers, the median is the middle number; with an even count, the median is the average of the two middle numbers.

Analysis:

This problem has nothing to do with greedy algorithms; it is a classic exercise in applying heaps.

Because heaps are used so widely in greedy algorithms, this problem is a good way to get familiar with heap operations.

The time complexity of this algorithm is low: adding to or removing from a heap is O(log N), and reading the top is O(1).

The process is:

Prepare a max-heap and a min-heap.
The first number goes into the max-heap.
Then repeat the following for each new number:
    If the incoming number cur <= the top of the max-heap, cur goes into the max-heap; otherwise cur goes into the min-heap.
    Compare the sizes of the two heaps. If one holds 2 more elements than the other, pop the top of the larger heap and push it into the smaller heap.
Stop once all the numbers are stored.
The smaller half of the numbers is now in the max-heap and the larger half in the min-heap, so the median can be found from the tops of the two heaps.

Code:

import java.util.Comparator;
import java.util.PriorityQueue;

public class Solution {

    // Comparator for the max-heap
    static class BigComparator implements Comparator<Integer> {
        @Override
        public int compare(Integer num1, Integer num2) {
            return Integer.compare(num2, num1);
        }
    }

    public static double findMedian(int[] input) {
        if (input == null || input.length == 0) {
            return 0;
        }
        PriorityQueue<Integer> bigRootHeap = new PriorityQueue<>(new BigComparator());
        PriorityQueue<Integer> smallRootHeap = new PriorityQueue<>();
        // The first number goes into the max-heap
        bigRootHeap.add(input[0]);
        for (int i = 1; i < input.length; i ++) {
            if (input[i] <= bigRootHeap.peek()) {
                bigRootHeap.add(input[i]);
            } else {
                smallRootHeap.add(input[i]);
            }
            // Rebalance when the sizes differ by 2
            if (Math.abs(bigRootHeap.size() - smallRootHeap.size()) == 2) {
                if (bigRootHeap.size() > smallRootHeap.size()) {
                    smallRootHeap.add(bigRootHeap.poll());
                } else {
                    bigRootHeap.add(smallRootHeap.poll());
                }
            }
        }
        // Odd count: the median is the top of whichever heap holds one more element
        if (input.length % 2 != 0) {
            return bigRootHeap.size() > smallRootHeap.size() ? bigRootHeap.peek() : smallRootHeap.peek();
        }
        // Even count: the median is the average of the two tops
        return (bigRootHeap.peek() + smallRootHeap.peek()) / 2.0;
    }
}
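
The problem statement asks for the median "at any time", so a streaming form of the same two-heap idea may be closer to what an interviewer expects. The sketch below is one possible shape; the class name `MedianHolder` and its methods `add` and `median` are hypothetical, not part of the original solution.

```java
import java.util.Collections;
import java.util.PriorityQueue;

class MedianHolder {
    // Max-heap holding the smaller half of the numbers seen so far
    private final PriorityQueue<Integer> lowerHalf = new PriorityQueue<>(Collections.reverseOrder());
    // Min-heap holding the larger half
    private final PriorityQueue<Integer> upperHalf = new PriorityQueue<>();

    void add(int num) {
        if (lowerHalf.isEmpty() || num <= lowerHalf.peek()) {
            lowerHalf.add(num);
        } else {
            upperHalf.add(num);
        }
        // Rebalance when one heap holds 2 more elements than the other
        if (lowerHalf.size() - upperHalf.size() == 2) {
            upperHalf.add(lowerHalf.poll());
        } else if (upperHalf.size() - lowerHalf.size() == 2) {
            lowerHalf.add(upperHalf.poll());
        }
    }

    double median() {
        if (lowerHalf.isEmpty() && upperHalf.isEmpty()) {
            throw new IllegalStateException("no numbers stored");
        }
        if (lowerHalf.size() > upperHalf.size()) {
            return lowerHalf.peek();
        }
        if (upperHalf.size() > lowerHalf.size()) {
            return upperHalf.peek();
        }
        // Halve each top separately to avoid int overflow in the sum
        return lowerHalf.peek() / 2.0 + upperHalf.peek() / 2.0;
    }
}
```

Each `add` costs O(log N) and each `median` costs O(1), so the median can be queried after every insertion.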

4.2 Gold Bar Problem

Problem:

Cutting a gold bar into two pieces costs as many copper coins as the length of the bar being cut. For example, cutting a bar of length 20 into two pieces costs 20 copper coins.

A group of people want to split a whole gold bar among themselves. How should it be cut to spend the fewest copper coins?

For example, the array [10, 20, 30] represents three people, and the whole bar has length 10 + 20 + 30 = 60. The bar must be cut into three pieces of lengths 10, 20, and 30. If you first cut the bar of length 60 into 10 and 50, that costs 60; cutting the piece of length 50 into 20 and 30 then costs 50; the total is 110 copper coins.

But if you first cut the bar of length 60 into 30 and 30, that costs 60; cutting one piece of length 30 into 10 and 20 then costs 30; the total is 90 copper coins.

Given an array, return the minimum cost of the split.

Analysis:

This is a classic Huffman coding problem.

The process is:

Put all the elements of the array into a min-heap.
Then repeat:
    Pop the two smallest nodes from the top of the min-heap and combine them into a new node, as when building a Huffman tree. The value of the new node is the cost of one cut.
    Put the new node back into the min-heap.
Stop when only one node is left in the min-heap.

Code:

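// Requires: import java.util.PriorityQueue;
public int leastMoney(int[] parts) {
    // Fewer than two pieces means no cut is needed
    if (parts == null || parts.length < 2) {
        return 0;
    }
    PriorityQueue<Integer> smallRootHeap = new PriorityQueue<>();
    // Minimum amount of money needed
    int money = 0;
    // Put every piece into the min-heap
    for (int i = 0; i < parts.length; i ++) {
        smallRootHeap.add(parts[i]);
    }
    // Stop when only one node is left in the heap
    while (smallRootHeap.size() > 1) {
        // Pop the two smallest nodes and add them up
        int cur = smallRootHeap.poll() + smallRootHeap.poll();
        money += cur;
        // Push the merged node back into the heap
        smallRootHeap.add(cur);
    }
    return money;
}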
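
In the spirit of section 3, the greedy answer can be cross-checked against a brute force that tries every possible pair as the next merge. This is a sketch under assumed names (`GoldBarCheck`, `greedy`, `bruteForce`); the brute force is exponential and only meant for small arrays.

```java
import java.util.PriorityQueue;

class GoldBarCheck {
    // Greedy: repeatedly merge the two smallest pieces (the Huffman construction).
    static int greedy(int[] parts) {
        if (parts == null || parts.length < 2) {
            return 0;
        }
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int p : parts) {
            heap.add(p);
        }
        int money = 0;
        while (heap.size() > 1) {
            int merged = heap.poll() + heap.poll();
            money += merged;
            heap.add(merged);
        }
        return money;
    }

    // Brute force: try every pair of pieces as the next merge and keep the cheapest total.
    static int bruteForce(int[] parts) {
        if (parts == null || parts.length < 2) {
            return 0;
        }
        return process(parts, 0);
    }

    private static int process(int[] pieces, int costSoFar) {
        if (pieces.length == 1) {
            return costSoFar;
        }
        int best = Integer.MAX_VALUE;
        for (int i = 0; i < pieces.length; i++) {
            for (int j = i + 1; j < pieces.length; j++) {
                int cost = costSoFar + pieces[i] + pieces[j];
                best = Math.min(best, process(merge(pieces, i, j), cost));
            }
        }
        return best;
    }

    // Returns a new array in which pieces i and j are replaced by their sum.
    private static int[] merge(int[] pieces, int i, int j) {
        int[] next = new int[pieces.length - 1];
        int k = 0;
        for (int p = 0; p < pieces.length; p++) {
            if (p != i && p != j) {
                next[k++] = pieces[p];
            }
        }
        next[k] = pieces[i] + pieces[j];
        return next;
    }

    public static void main(String[] args) {
        int[] parts = {10, 20, 30};
        System.out.println(greedy(parts) + " " + bruteForce(parts)); // both give 90, as in the example
    }
}
```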

4.3 Project planning problem

Problem:

Your team has received some projects, each with a cost and a profit. Because the team is small, it can only do the projects one at a time. Given that the team currently has M in available funds and can do at most K projects, what is the maximum amount of available funds it can end up with?

Note: costs[i], profits[i], M and K are all positive numbers.

Analysis:

The process is:

Build a min-heap and a max-heap. The min-heap is ordered by cost and the max-heap is ordered by profit.
Put all the projects into the min-heap.
Then repeat:
    Pop every project whose cost is less than or equal to M from the min-heap and push it into the max-heap.
    Pop the most profitable project from the max-heap and do it.
    Add the profit of the project just completed to M.
Stop when K projects have been completed or no affordable project remains.

Code:

import java.util.Comparator;
import java.util.PriorityQueue;

public class Solution {

    static class Project {
        int cost;
        int profit;
        public Project(int cost, int profit) {
            this.cost = cost;
            this.profit = profit;
        }
    }

    static class minCostComparator implements Comparator<Project> {
        @Override
        public int compare(Project p1, Project p2) {
            return Integer.compare(p1.cost, p2.cost);
        }
    }

    static class maxProfitComparator implements Comparator<Project> {
        @Override
        public int compare(Project p1, Project p2) {
            return Integer.compare(p2.profit, p1.profit);
        }
    }

    public static int findMaximumFund(int M, int K, int[] costs, int[] profits) {
        // With no project allowed or no project given, the funds stay at M
        if (K <= 0 || costs == null || profits == null) {
            return M;
        }
        // Min-heap ordered by cost
        PriorityQueue<Project> costSmallRootHeap = new PriorityQueue<>(new minCostComparator());
        // Max-heap ordered by profit
        PriorityQueue<Project> profitBigRootHeap = new PriorityQueue<>(new maxProfitComparator());
        // Put all projects into the min-heap
        for (int i = 0; i < costs.length; i ++) {
            costSmallRootHeap.add(new Project(costs[i], profits[i]));
        }
        // At most K projects can be done
        for (int i = 0; i < K; i ++) {
            // Move every project that is currently affordable into the max-heap
            while (!costSmallRootHeap.isEmpty() && costSmallRootHeap.peek().cost <= M) {
                profitBigRootHeap.add(costSmallRootHeap.poll());
            }
            // No project can be done
            if (profitBigRootHeap.isEmpty()) {
                return M;
            }
            // Do the most profitable project in the max-heap
            Project cur = profitBigRootHeap.poll();
            M += cur.profit;
        }
        return M;
    }
}
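
A small worked example may help in following how M grows from round to round. The sketch below is a condensed restatement of the same two-heap process using `{cost, profit}` arrays; the class name `ProjectPlanDemo` and the sample numbers are assumptions made for illustration.

```java
import java.util.PriorityQueue;

class ProjectPlanDemo {
    static int findMaximumFund(int M, int K, int[] costs, int[] profits) {
        // Each heap entry is {cost, profit}
        PriorityQueue<int[]> byCost = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        PriorityQueue<int[]> byProfit = new PriorityQueue<>((a, b) -> Integer.compare(b[1], a[1]));
        for (int i = 0; i < costs.length; i++) {
            byCost.add(new int[] {costs[i], profits[i]});
        }
        for (int done = 0; done < K; done++) {
            // Unlock every project that the current funds can pay for
            while (!byCost.isEmpty() && byCost.peek()[0] <= M) {
                byProfit.add(byCost.poll());
            }
            if (byProfit.isEmpty()) {
                break; // nothing affordable is left
            }
            M += byProfit.poll()[1];
        }
        return M;
    }

    public static void main(String[] args) {
        int[] costs = {1, 1, 2, 3};
        int[] profits = {1, 2, 3, 4};
        // Round 1: funds 1 unlock the two projects of cost 1; take profit 2, funds become 3.
        // Round 2: funds 3 unlock the projects of cost 2 and 3; take profit 4, funds become 7.
        System.out.println(findMaximumFund(1, 2, costs, profits));
    }
}
```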

4.4 N queen problem

Problem:

The N-queens problem asks you to place N queens on an N*N chessboard so that no two queens share a row, a column, or a diagonal.

Given an integer n, return the number of ways to place the n queens.

For example:

n=1, return 1.

n=2 or 3, return 0. (With 2 or 3 queens, no placement works.)

n=8, return 92.

Analysis:

This is a classic problem. Its optimal solution is very complicated, and it is a dynamic programming problem with aftereffects: every placement constrains all later placements.

Unless you are writing a paper, the best solution in an interview is a depth-first search: place a queen in each row in turn, and use brute-force recursion to try every column in each row.

The time complexity of this approach is still very high.

There are N choices in the first row, N choices in the second row, N choices in the third row, and so on for N rows in total, so the time complexity is O(N^N).

Suppose record[0] = 2 and record[1] = 4. When the depth-first traversal reaches i = 2, there are 3 valid positions for a queen in the third row (columns 1, 6 and 7 on an 8*8 board). The search goes depth-first into each of these three positions in turn, and the same process repeats in every later row.
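
The walk-through above can be checked with a few lines of code. The sketch below lists the valid columns of one row given the queens already recorded; the class name `QueenRowDemo`, the method `validColumns`, and the 8*8 board size are assumptions, since the original figure is not available.

```java
import java.util.ArrayList;
import java.util.List;

class QueenRowDemo {
    // Columns in row i that do not clash with the queens recorded in rows [0, i-1].
    static List<Integer> validColumns(int i, int n, int[] record) {
        List<Integer> columns = new ArrayList<>();
        for (int j = 0; j < n; j++) {
            boolean ok = true;
            for (int k = 0; k < i; k++) {
                // Same column, or same diagonal (row distance equals column distance)
                if (record[k] == j || Math.abs(k - i) == Math.abs(record[k] - j)) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                columns.add(j);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        int[] record = new int[8]; // assumed 8*8 board
        record[0] = 2;
        record[1] = 4;
        System.out.println(validColumns(2, 8, record)); // the three candidate columns in row 2
    }
}
```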

Code:

public static int nQueen(int n) {
    if (n < 1) {
        return 0;
    }
    int[] record = new int[n];
    // Start from row 0
    return process(0, n, record);
}

/**
 * @param i the row currently being processed
 * @param n the total number of rows
 * @param record column of the queen placed in each of rows [0, i-1]; record[1] = 2 means row 1 has a queen in column 2
 * @return the number of valid ways to complete the board, given the queens already placed
 */
public static int process(int i, int n, int[] record) {
    // Reaching the row after the last one ends the recursion: this placement is valid
    if (i == n) {
        return 1;
    }
    // Number of valid placements for rows [i, n-1]
    int result = 0;
    // Try columns [0, n-1] of row i, depth-first
    for (int j = 0; j < n; j ++) {
        // Check whether a queen can be placed at row i, column j
        if (isValid(i, j, record)) {
            record[i] = j;
            // Go on to the next row
            result += process(i + 1, n, record);
        }
    }
    return result;
}

// Checks whether a queen can be placed at row i, column j
public static boolean isValid(int i, int j, int[] record) {
    // Compare with every queen placed in rows [0, i-1]
    for (int k = 0; k < i; k ++) {
        // Same column or same diagonal (the same row is impossible)
        if (record[k] == j || Math.abs(k - i) == Math.abs(record[k] - j)) {
            return false;
        }
    }
    return true;
}

4.5 N queen problem (optimization)

Although the N-queens problem cannot be improved in time complexity, its constant factor can be reduced considerably.

The time complexity stays the same, but the constant factor of the implementation can be made very small.

How small? For example, for the 14-queens problem the solution in 4.4 runs for about 5 s, while the optimized solution in 4.5 runs for about 0.2 s. For the 15-queens problem, the 4.4 solution runs for about 1 minute, and the 4.5 solution runs for about 1.5 s.

Analysis:

Use bit manipulation to speed things up. Bit-level acceleration is a very common technique and is well worth mastering.

Because the solution uses bitwise operations, it depends on how the variables are stored. The variables in this code are 32-bit ints, so it cannot handle more than 32 queens.

To handle more than 32 queens, change the parameter type to long.

Code:

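public static int nQueen(int n) {
    if (n < 1 || n > 32) {
        return 0;
    }
    // The low n bits are 1; for n == 32, -1 has all 32 bits set
    int limit = n == 32 ? -1 : (1 << n) - 1;
    return process(limit, 0, 0, 0);
}

/**
 * @param limit bit mask whose low n bits are 1, one bit per column of the board
 * @param colLim column restrictions: a queen cannot go where a bit is 1 and can go where it is 0
 * @param leftDiaLim left-diagonal restrictions: a queen cannot go where a bit is 1 and can go where it is 0
 * @param rightDiaLim right-diagonal restrictions: a queen cannot go where a bit is 1 and can go where it is 0
 * @return the number of valid ways to place the remaining queens under these restrictions
 */
public static int process(int limit, int colLim, int leftDiaLim, int rightDiaLim) {
    // Every column holds a queen: one complete placement
    if (colLim == limit) {
        return 1;
    }
    int mostRightOne = 0;
    // Each 1 bit in pos is a column where a queen can go in this row
    int pos = limit & (~ (colLim | leftDiaLim | rightDiaLim));
    int res = 0;
    while (pos != 0) {
        // Extract the rightmost 1 among the candidate columns
        mostRightOne = pos & (~ pos + 1);
        pos = pos - mostRightOne;
        // Update the restrictions and recurse into the next row
        res += process(limit, colLim | mostRightOne,
                       (leftDiaLim | mostRightOne) << 1,
                       (rightDiaLim | mostRightOne) >>> 1);
    }
    return res;
}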
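
The two bit tricks in this solution, masking out the forbidden columns and peeling off the lowest 1 bit with `pos & (~pos + 1)`, can be observed in isolation. The sketch below enumerates the candidate columns of one row as single-bit masks; the class name `QueenBitsDemo` and the method `candidates` are hypothetical, and the 4-queens numbers are only an example.

```java
import java.util.ArrayList;
import java.util.List;

class QueenBitsDemo {
    // Lists the candidate columns for the next row as single-bit masks, lowest bit first.
    static List<Integer> candidates(int limit, int colLim, int leftDiaLim, int rightDiaLim) {
        List<Integer> result = new ArrayList<>();
        // Keep only the bits inside the board that no restriction covers
        int pos = limit & ~(colLim | leftDiaLim | rightDiaLim);
        while (pos != 0) {
            // ~pos + 1 equals -pos, so this isolates the lowest set bit of pos
            int mostRightOne = pos & (~pos + 1);
            result.add(mostRightOne);
            pos -= mostRightOne;
        }
        return result;
    }

    public static void main(String[] args) {
        int limit = (1 << 4) - 1; // 4 queens: 0b1111
        // Empty board: every column is a candidate.
        System.out.println(candidates(limit, 0, 0, 0));
        // After a queen on bit 2 (0b0100): column 0b0100,
        // left diagonal 0b0100 << 1 = 0b1000, right diagonal 0b0100 >>> 1 = 0b0010.
        System.out.println(candidates(limit, 0b0100, 0b1000, 0b0010));
    }
}
```

Each loop iteration costs a handful of bit operations, compared with the scan over all earlier rows that isValid performs in 4.4; that difference is where the constant-factor saving comes from.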
