7-4 Replacement Selection (30 points)

When the input is much too large to fit into memory, we have to do external sorting instead of internal sorting. One of the key steps in external sorting is to generate sets of sorted records (also called runs) with limited internal memory. The simplest method is to read as many records as possible into the memory, and sort them internally, then write the resulting run back to some tape. The size of each run is the same as the capacity of the internal memory.
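To make the simplest method concrete, here is a minimal sketch. It assumes the records are already available in a `std::vector<int>` rather than on tape, and the helper name `naiveRuns` is hypothetical, not something from the problem statement.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: cut `records` into chunks of at most `capacity`
// elements and sort each chunk on its own, as the simplest method does.
std::vector<std::vector<int>> naiveRuns(const std::vector<int>& records,
                                        std::size_t capacity) {
    std::vector<std::vector<int>> runs;
    if (capacity == 0) return runs;  // no memory means no runs can be built
    for (std::size_t start = 0; start < records.size(); start += capacity) {
        std::size_t end = std::min(start + capacity, records.size());
        // "Read" one memory-load of records, sort it, and emit it as a run.
        std::vector<int> run(records.begin() + start, records.begin() + end);
        std::sort(run.begin(), run.end());
        runs.push_back(run);
    }
    return runs;
}
```

Every run except possibly the last has exactly `capacity` records, which is the limitation replacement selection tries to improve on.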

The replacement selection sorting algorithm was described in 1965 by Donald Knuth. Notice that as soon as the first record is written to an output tape, the memory it used becomes available for another record. Assume that we are sorting in ascending order: if the next record is not smaller than the record we have just output, then it can be included in the run.

For example, suppose that we have a set of input { 81, 94, 11, 96, 12, 99, 35 }, and our memory can sort 3 records only. By the simplest method we will obtain three runs: { 11, 81, 94 }, { 12, 96, 99 } and { 35 }. According to the replacement selection algorithm, we would read and sort the first 3 records { 81, 94, 11 } and output 11 as the smallest one. Then one space is available so 96 is read in and will join the first run since it is larger than 11. Now we have { 81, 94, 96 }. After 81 is out, 12 comes in but it must belong to the next run since it is smaller than 81. Hence we have { 94, 96, 12 } where 12 will stay since it belongs to the next run. When 94 is out and 99 is in, since 99 is larger than 94, it must belong to the first run. Eventually we will obtain two runs: the first one contains { 11, 81, 94, 96, 99 } and the second one contains { 12, 35 }.
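The walkthrough above can be sketched in code. This version uses one common variant, a single min-heap whose entries are tagged with a run index, which is not necessarily how the solution later in this post does it. The function name `replacementSelection` and the in-memory `std::vector<int>` input are assumptions made for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Sketch: replacement selection with a single heap of (run index, value).
// Records tagged with a later run sort after every record of the current run.
std::vector<std::vector<int>> replacementSelection(const std::vector<int>& input,
                                                   std::size_t capacity) {
    using Tagged = std::pair<int, int>;  // (run index, value)
    std::priority_queue<Tagged, std::vector<Tagged>, std::greater<Tagged>> heap;
    std::vector<std::vector<int>> runs;
    std::size_t next = 0;  // index of the next unread record

    // Fill the memory with the first `capacity` records, all in run 0.
    while (next < input.size() && heap.size() < capacity) {
        heap.push({0, input[next++]});
    }
    while (!heap.empty()) {
        Tagged top = heap.top();
        heap.pop();
        int run = top.first;
        int value = top.second;
        if (static_cast<std::size_t>(run) == runs.size()) {
            runs.emplace_back();  // first record of a new run
        }
        runs[run].push_back(value);
        if (next < input.size()) {
            int incoming = input[next++];
            // "Not smaller" than the record just output: same run.
            // Otherwise the record has to wait for the next run.
            heap.push({incoming >= value ? run : run + 1, incoming});
        }
    }
    return runs;
}
```

Called with `{ 81, 94, 11, 96, 12, 99, 35 }` and a capacity of 3, this yields the two runs from the example: `{ 11, 81, 94, 96, 99 }` and `{ 12, 35 }`.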

Your job is to implement this replacement selection algorithm.

Input Specification:

Each input file contains several test cases. The first line gives two positive integers $N$ ($\le 10^5$) and $M$ ($< N/2$), which are the total number of records to be sorted, and the capacity of the internal memory. Then N numbers are given in the next line, all in the range of int. All the numbers in a line are separated by a space.

Output Specification:

For each test case, print in each line a run (in ascending order) generated by the replacement selection algorithm. All the numbers in a line must be separated by exactly 1 space, and there must be no extra space at the beginning or the end of the line.
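The spacing rule is easy to get wrong at the ends of a line. One way to satisfy it, sketched with a hypothetical helper `formatRun` that builds the line as a string before printing:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: join one run into a single output line, with exactly
// one space between numbers and no space at the beginning or the end.
std::string formatRun(const std::vector<int>& run) {
    std::string line;
    for (std::size_t i = 0; i < run.size(); ++i) {
        if (i > 0) line += ' ';  // separator goes before every element but the first
        line += std::to_string(run[i]);
    }
    return line;
}
```

Writing the separator before each element other than the first avoids the `size() - 1` arithmetic, which underflows for an empty unsigned size.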

Sample Input:

13 3
81 94 11 96 12 99 17 35 28 58 41 75 15

Sample Output:

11 81 94 96 99
12 17 28 35 41 58 75
15

Problem limits:

image.png

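Problem summary:

Given a sequence of length N and an internal memory of capacity M (< N/2), use replacement selection, the run-generation step of external sorting, to split the sequence into sorted runs within that memory and print each run.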

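Algorithm idea:

This is essentially a simulation of run generation in external sorting. The example in the problem statement is already fairly detailed; the simulation proceeds as follows:

  • 1. Suppose the records still to be read are stored in the queue unsorted, and the first M elements have already been pushed into the current-run queue currentRun. As long as unsorted is not empty, repeat steps 2 and 3; otherwise go to step 4.
  • 2. Pop the smallest element from currentRun and append it to the result array for temporary storage. Then dequeue one element from unsorted: if it is greater than the element that just entered result, push it into currentRun; otherwise append it to nextRun, the queue for the next run.
  • 3. Check whether currentRun is empty. If it is, the current run is finished: output result and clear it, then move the contents of nextRun into currentRun and clear nextRun.
  • 4. Check whether currentRun still holds unprocessed elements; if it does, append them to result in ascending order and output result. Then check whether nextRun still holds elements; if it does, push them into currentRun to sort them and output the elements of currentRun one by one.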
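The steps above can be condensed into a single function. This is only a sketch: it returns the runs instead of printing them, replaces the unsorted queue with an index into the input vector, and folds step 4 into the main loop by continuing until currentRun is empty. The name `simulateRuns` is hypothetical. It keeps the strict "greater than" test from step 2, whereas the problem statement's "not smaller" wording would suggest `>=` for equal records.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Sketch of steps 1-4 with the same container names as the description.
std::vector<std::vector<int>> simulateRuns(const std::vector<int>& input,
                                           std::size_t M) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> currentRun;
    std::vector<int> nextRun;  // records held back for the next run
    std::vector<int> result;   // records output so far in the current run
    std::vector<std::vector<int>> runs;
    std::size_t pos = 0;       // next unread record (stands in for `unsorted`)

    // Step 1: the first M records start out in the current run.
    while (pos < input.size() && currentRun.size() < M) {
        currentRun.push(input[pos++]);
    }
    while (!currentRun.empty()) {
        // Step 2: output the smallest record, then read a replacement if any.
        int smallest = currentRun.top();
        currentRun.pop();
        result.push_back(smallest);
        if (pos < input.size()) {
            int incoming = input[pos++];
            if (incoming > smallest) {
                currentRun.push(incoming);
            } else {
                nextRun.push_back(incoming);
            }
        }
        // Step 3 (and step 4 once the input is exhausted): close the run.
        if (currentRun.empty()) {
            runs.push_back(result);
            result.clear();
            for (int v : nextRun) currentRun.push(v);
            nextRun.clear();
        }
    }
    return runs;
}
```

On the sample input with M = 3 this produces the three runs shown in the sample output.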

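Notes:

  • 1. Since the smallest element of the current run has to be fetched at every step, a priority queue is the most convenient choice.
  • 2. If test points 1 and 2 exceed the memory limit, a likely cause is that nextRun was not released in time. Here nextRun is a vector and has to be cleared explicitly after each run; when it was a priority queue instead, the memory problem showed up.
  • 3. The check for an empty current run must come after the current step has been fully processed; otherwise test point 1 gives a wrong answer or test point 2 gives a presentation error.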
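To illustrate notes 1 and 2 together, here is a small sketch with a hypothetical helper `drainAscending`. It moves the held-back records into a min-heap, releases the vector, and pops the heap to show the ascending order. Swapping with an empty temporary is one common idiom for giving the buffer back; whether it matters for the memory limit on this particular judge is an assumption.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: load nextRun into a min-heap, release nextRun,
// and return the values in the order the heap pops them.
std::vector<int> drainAscending(std::vector<int>& nextRun) {
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap(
        nextRun.begin(), nextRun.end());
    // clear() removes the elements but usually keeps the allocated capacity;
    // swapping with an empty temporary also hands the buffer back.
    std::vector<int>().swap(nextRun);
    std::vector<int> order;
    while (!heap.empty()) {
        order.push_back(heap.top());  // smallest remaining value
        heap.pop();
    }
    return order;
}
```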

Submission result:

image.png

AC code:

#include<cstdio>
#include<functional>
#include<vector>
#include<queue>

using namespace std;

priority_queue<int,vector<int>,greater<int> > currentRun;// current run (min-heap)
queue<int> unsorted;// records not yet read into memory
vector<int> nextRun;// records held back for the next run
vector<int> result;// output of the current run

int main(){
    int N,M;
    scanf("%d %d",&N,&M);
    // first push the first M numbers into the current run
    for(int i=0;i<M;++i){
        int num;
        scanf("%d",&num);
        currentRun.push(num);
    }
    // push the remaining N-M numbers into the unsorted queue
    for(int i=M;i<N;++i){
        int num;
        scanf("%d",&num);
        unsorted.push(num);
    }
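    while(!unsorted.empty()){
        // records remain unread: output the smallest record of the current run
        int temp = currentRun.top();
        currentRun.pop();
        result.push_back(temp);
        // read the next record from the unsorted queue
        int num = unsorted.front();
        unsorted.pop();
        if(num>result.back()){
            // larger than the record just output: it joins the current run
            currentRun.push(num);
        }else{
            // otherwise it waits for the next run
            nextRun.push_back(num);
        }
        // this emptiness check must come after the step above has finished; otherwise test point 1 fails or test point 2 gets a presentation error
        if(currentRun.empty()) {
            // current run finished: output result and start the next run
            for (size_t i = 0; i < result.size(); ++i) {
                printf("%d", result[i]);
                if (i + 1 < result.size()) printf(" ");
            }
            printf("\n");
            for (int i:nextRun) {
                currentRun.push(i);
            }
            nextRun.clear();
            result.clear();
        }
    }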
    // the unsorted queue is now empty; flush whatever is left of the current run
    if(!currentRun.empty()) {
        while(!currentRun.empty()){
            // the current run still has records to output
            int temp = currentRun.top();
            currentRun.pop();
            result.push_back(temp);
        }
        for(size_t i=0;i<result.size();++i){
            printf("%d",result[i]);
            if(i+1<result.size()) printf(" ");
        }
        printf("\n");
    }
    if(!nextRun.empty()){
        // records are still waiting for the next run
        for(int i : nextRun){
            currentRun.push(i);
        }
        // output them in ascending order as the final run
        while(!currentRun.empty()){
            int temp = currentRun.top();
            currentRun.pop();
            if(currentRun.empty()){
                // last element: no trailing space
                printf("%d",temp);
            }else{
                printf("%d ",temp);
            }
        }
    }
    return 0;
}

乔梓鑫