We have finally reached the sorting algorithms. Whether or not you have studied algorithms systematically, you have probably heard of several famous sorting algorithms. This series does not start with the most commonly used ones, though. Instead, it groups the algorithms by category and introduces them one by one.

The first category is insertion-based sorting. As the name implies, insertion sort "inserts" one or more unordered records into an already ordered sequence. Typical examples are simple insertion sort and Shell sort.

Simple insertion sort

Simple insertion sort is also called direct insertion sort. Let's look at the code first and then walk through it step by step.

function InsertSort($arr)
{
    $n = count($arr);
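    for ($i = 1; $i < $n; $i++) { // Start from the second element (index 1)
        $tmp = $arr[$i]; // Take the first element of the unsorted part
        for ($j = $i; $j > 0 && $arr[$j - 1] > $tmp; $j--) { // Scan backward while the previous element is larger than $tmp
            $arr[$j] = $arr[$j - 1]; // Shift the larger element one position to the right
        }
        // Put the element into its proper position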
        $arr[$j] = $tmp;
    }
    echo implode(', ', $arr), PHP_EOL;
}

$numbers = [49, 38, 65, 97, 76, 13, 27, 49];
InsertSort($numbers);

// Input:  49, 38, 65, 97, 76, 13, 27, 49
// Output: 13, 27, 38, 49, 49, 65, 76, 97

There is not much code, and it is quite easy to understand. Let's walk through the first two numbers of the test data.

The outer loop starts at i = 1, so the first element taken from the unsorted part is tmp = arr[1], which means tmp = 38.

Then the inner loop starts with j = 1. It checks whether the element at j - 1 is larger than tmp. Since arr[0] = 49 is larger than 38, we enter the loop body and execute arr[1] = arr[0]. At this point both arr[0] and arr[1] are 49, and the sequence is 49, 49, 65, ...

The j-- at the end of that iteration makes j = 0, so the loop exits and arr[0] = tmp stores 38. The sequence is now 38, 49, 65, ...
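To follow the remaining passes as well, here is a minimal sketch that records the array after every pass of the outer loop. The function name insertSortTrace is a hypothetical helper introduced only for this illustration; the loop body is the same as in InsertSort above.

```php
<?php
// Hypothetical helper (not part of the article's test code):
// returns a snapshot of the array after each outer-loop pass.
function insertSortTrace(array $arr): array
{
    $snapshots = [];
    $n = count($arr);
    for ($i = 1; $i < $n; $i++) {
        $tmp = $arr[$i];
        // Shift larger elements of the ordered prefix one step to the right
        for ($j = $i; $j > 0 && $arr[$j - 1] > $tmp; $j--) {
            $arr[$j] = $arr[$j - 1];
        }
        $arr[$j] = $tmp;
        $snapshots[] = $arr; // state after inserting the element at index $i
    }
    return $snapshots;
}

foreach (insertSortTrace([49, 38, 65, 97, 76, 13, 27, 49]) as $pass => $state) {
    echo 'i = ', $pass + 1, ': ', implode(', ', $state), PHP_EOL;
}
```

The first printed line should be 38, 49, 65, 97, 76, 13, 27, 49, matching the walkthrough above, and the last line is the fully sorted sequence.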

The figure below shows the sorting of the entire sequence more clearly.

/img/bVcTTsG

As the steps above show, simple insertion sort starts from one end and gradually grows an ordered prefix. In the code, the inner loop keeps decreasing j and compares tmp with the elements of the ordered part in front of it. Once the proper position is found, tmp is placed there.

From the code and the analysis above, the time complexity of simple insertion sort is O(n^2). It is also a stable sort. What is a stable sort? Careful readers may have noticed that the test data contains two equal values, 49. Stable means that equal elements keep their relative order after sorting: the 49 that was in front before sorting is still in front of the other 49 afterwards. This is what sorting stability means.

In addition, simple insertion sort works best when the initial records are already nearly ordered. When the initial records are unordered and n is large, the running time grows quickly and the algorithm is not a good choice.
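Stability is hard to see with plain numbers, because the two 49s look identical. The sketch below attaches a label to each value so the two 49s can be told apart. The function insertSortByKey and the letter labels are assumptions made for this illustration; the comparison logic is the same as in InsertSort.

```php
<?php
// Hypothetical variant: sorts [key, label] pairs by key only.
function insertSortByKey(array $records): array
{
    $n = count($records);
    for ($i = 1; $i < $n; $i++) {
        $tmp = $records[$i];
        // Strict ">" means a record never moves past an earlier record with an equal key
        for ($j = $i; $j > 0 && $records[$j - 1][0] > $tmp[0]; $j--) {
            $records[$j] = $records[$j - 1];
        }
        $records[$j] = $tmp;
    }
    return $records;
}

// Labels a..h mark the original positions; the two 49s are 'a' and 'h'
$records = [[49, 'a'], [38, 'b'], [65, 'c'], [97, 'd'], [76, 'e'], [13, 'f'], [27, 'g'], [49, 'h']];
foreach (insertSortByKey($records) as [$key, $label]) {
    echo $key, $label, ' ';
}
echo PHP_EOL;
```

In the output, 49a should still appear before 49h. If the comparison were changed to ">=", equal keys would be shifted past each other and the sort would no longer be stable.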
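One rough way to see this effect is to count how many times the inner loop shifts an element. The helper countInsertionShifts below is a hypothetical name used only for this sketch, and the three small inputs are made-up examples rather than data from the article.

```php
<?php
// Hypothetical helper: runs simple insertion sort and returns the number of element shifts.
function countInsertionShifts(array $arr): int
{
    $shifts = 0;
    $n = count($arr);
    for ($i = 1; $i < $n; $i++) {
        $tmp = $arr[$i];
        for ($j = $i; $j > 0 && $arr[$j - 1] > $tmp; $j--) {
            $arr[$j] = $arr[$j - 1];
            $shifts++; // one shift per element moved to the right
        }
        $arr[$j] = $tmp;
    }
    return $shifts;
}

echo 'sorted:   ', countInsertionShifts(range(1, 8)), PHP_EOL;
echo 'nearly:   ', countInsertionShifts([1, 2, 4, 3, 5, 6, 8, 7]), PHP_EOL;
echo 'reversed: ', countInsertionShifts(range(8, 1)), PHP_EOL;
```

For these eight-element inputs the counts should be 0, 2 and 28: an already ordered input needs no shifts at all, while a reversed input needs n(n-1)/2 of them.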

Shell sort

Simple insertion sort is easy to understand, right? So what on earth is Shell sort? Don't worry if the name gives no clue: the algorithm is named after its inventor, Donald Shell. In fact, Shell sort is still an insertion-based sorting algorithm.

As mentioned above, simple insertion sort suits nearly ordered data, and Shell sort was designed to improve its efficiency. The idea is to reduce the number of records sorted at a time and, over several passes, bring the data into a nearly ordered state.

For this algorithm we will not start with the code. Let's look at the figure first.

/img/bVcTTsH

Does that make sense? We split the data into groups, and each grouping is based on an increment. In the diagram, the first pass uses an increment of 5 and the second pass uses 3. The third pass uses an increment of 1, which is just an ordinary simple insertion sort. We will see this reflected in the code shortly.

Let's analyze the three passes one by one, in order of increment:

1) In the first pass, the increment is 5. Elements that are 5 positions apart form a group, which gives three pairs: 49 and 13, 38 and 27, 65 and 49 (97 and 76 have no partner and stay where they are). After a simple insertion sort within each group, the array becomes 13, 27, 49, 97, 76, 49, 38, 65.

2) In the second pass, the increment is 3. This gives three groups: 13, 97, 38; 27, 76, 65; and 49, 49, which is already in order. After a simple insertion sort within each group, the array becomes 13, 27, 49, 38, 65, 49, 97, 76.

3) After these two passes, the array is already nearly ordered. The last step is one more simple insertion sort with an increment of 1. In other words, the last pass is an ordinary simple insertion sort.

Is it clearer after the step-by-step explanation? To repeat the key point: Shell sort performs insertion sort on groups that span large gaps, and the final pass degenerates into a simple insertion sort with an increment of 1. Now let's look at the code:
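To make the grouping concrete, the sketch below lists which elements fall into the same group for a given increment. The function groupsByIncrement is a hypothetical helper written for this illustration: two elements share a group when their indexes differ by a multiple of the increment.

```php
<?php
// Hypothetical helper: splits a list into the groups formed by increment $d.
// Elements whose indexes have the same remainder modulo $d belong together.
function groupsByIncrement(array $arr, int $d): array
{
    $groups = [];
    foreach (array_values($arr) as $index => $value) {
        $groups[$index % $d][] = $value;
    }
    return $groups;
}

$numbers = [49, 38, 65, 97, 76, 13, 27, 49];
foreach (groupsByIncrement($numbers, 5) as $k => $group) {
    echo 'group ', $k, ': ', implode(', ', $group), PHP_EOL;
}
```

With an increment of 5 on the test data, this should print the pairs 49, 13 and 38, 27 and 65, 49, followed by the single-element groups 97 and 76. With an increment of 1, every element ends up in the same group.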
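The three passes can be checked with a small sketch that stores the array after each increment. The name shellSortTrace and the way increments are passed in as a parameter are assumptions made for this illustration.

```php
<?php
// Hypothetical helper: runs one gapped insertion sort per increment
// and returns the array state after each pass, keyed by the increment.
function shellSortTrace(array $arr, array $increments): array
{
    $n = count($arr);
    $snapshots = [];
    foreach ($increments as $d) {
        if ($d >= $n) {
            continue; // an increment this large would form no groups
        }
        for ($p = $d; $p < $n; $p++) {
            $tmp = $arr[$p];
            // Insertion sort inside the group that $p belongs to
            for ($i = $p; $i >= $d && $arr[$i - $d] > $tmp; $i -= $d) {
                $arr[$i] = $arr[$i - $d];
            }
            $arr[$i] = $tmp;
        }
        $snapshots[$d] = $arr;
    }
    return $snapshots;
}

foreach (shellSortTrace([49, 38, 65, 97, 76, 13, 27, 49], [5, 3, 1]) as $d => $state) {
    echo 'increment ', $d, ': ', implode(', ', $state), PHP_EOL;
}
```

The lines for increments 5 and 3 should match the intermediate arrays listed in steps 1) and 2), and the line for increment 1 is the sorted result.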

function ShellSort($arr)
{
    $n = count($arr);
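    $sedgewick = [5, 3, 1]; // Increment sequence used in the figure
    $count = count($sedgewick);

    // The initial increment must be smaller than the length of the sequence to sort
    for ($si = 0; $si < $count && $sedgewick[$si] >= $n; $si++);

    // Grouping loop: use the increments 5, 3, 1 in turn
    for (; $si < $count; $si++) {
        $d = $sedgewick[$si];
        // Visit every element from index $d onward
        for ($p = $d; $p < $n; $p++) {
            $tmp = $arr[$p];
            // Insertion sort within the current group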
            for ($i = $p; $i >= $d && $arr[$i - $d] > $tmp; $i -= $d) {
                $arr[$i] = $arr[$i - $d];
            }
            $arr[$i] = $tmp;
        }
    }
    echo implode(', ', $arr), PHP_EOL;
}
ShellSort($numbers);

Looking at the code, there are three nested for loops. How can that improve efficiency? Admittedly, the improvement Shell sort brings is limited. The earlier grouped passes make the data nearly ordered, and each of those passes works on small groups, so it needs relatively few comparisons. By the time the final simple insertion sort runs, the data is nearly in order, so far fewer elements have to be moved. With a suitable increment sequence, the time complexity can be reduced to roughly O(n(log2 n)^2).

Summary

How was this appetizer on sorting? We did not start with bubble sort and quicksort, which everyone already knows by heart. A less famous algorithm is not a useless one. For example, one company I interviewed with stated in its interview questions that bubble sort and quicksort could not be used. In that situation, the intuitive and easy-to-understand simple insertion sort can certainly help us get through the interview!
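The reduction in element moves can be observed on small inputs. The sketch below counts shifts for any increment sequence; passing [1] alone gives plain simple insertion sort. The function countShellShifts is a hypothetical helper, and the numbers it produces only describe these tiny inputs, not the asymptotic complexity.

```php
<?php
// Hypothetical helper: counts element shifts for a given increment sequence.
// With $increments = [1] this is exactly simple insertion sort.
function countShellShifts(array $arr, array $increments): int
{
    $n = count($arr);
    $shifts = 0;
    foreach ($increments as $d) {
        if ($d >= $n) {
            continue; // skip increments that do not fit the input
        }
        for ($p = $d; $p < $n; $p++) {
            $tmp = $arr[$p];
            for ($i = $p; $i >= $d && $arr[$i - $d] > $tmp; $i -= $d) {
                $arr[$i] = $arr[$i - $d];
                $shifts++;
            }
            $arr[$i] = $tmp;
        }
    }
    return $shifts;
}

$numbers = [49, 38, 65, 97, 76, 13, 27, 49];
echo 'insertion only: ', countShellShifts($numbers, [1]), PHP_EOL;
echo 'with 5, 3, 1:   ', countShellShifts($numbers, [5, 3, 1]), PHP_EOL;
```

On the test data this should report 15 shifts for simple insertion sort and 8 for the 5, 3, 1 sequence. A reversed input such as range(8, 1) shows a larger gap, 28 versus 10, because each long-distance move in an early pass removes several out-of-order pairs at once.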

Test code:

https://github.com/zhangyue0503/Data-structure-and-algorithm/blob/master/7. Sorting/source/7.1 Insertion sorting: simple insertion, hill

References:

The examples in this article are from "Data Structure" (Second Edition), Yan Weimin

"Data Structure" (Second Edition), Chen Yue

You can find me on the major media platforms by searching for [Hardcore Project Manager]

