Classic interview question: When do you think Go will preempt P?

Search WeChat for [脑子进煎鱼了] to follow the author. This article is included in the GitHub repository github.com/eddycjy/blog, together with my article series, materials, and open source Go books.

Hello everyone, I am fried fish.

A few days ago, we discussed the question "On a single-core CPU, if you start two Goroutines and one of them is an endless loop, what happens?", and one detail came up there: preempting P.

Some readers had a follow-up question: how does Go preempt P, and how is that implemented?

In this article we will demystify how P is preempted.

The history of the scheduler

In the early days of Go, Goroutines were not designed to be preemptible. A Goroutine only triggered a scheduling switch during operations such as reading and writing, actively yielding, and locking.

This has a serious problem. When the garbage collector performs STW (stop the world), a Goroutine that keeps blocking in a call makes the garbage collector wait for it, with no telling how long the wait will last.

Preemptive scheduling solves this: if a Goroutine runs for too long, it is preempted.

Go has had a preemptive scheduler since Go 1.2, and it has been improved continuously ever since:

Go 0.x: Single-threaded scheduler.
Go 1.0: Multi-threaded scheduler.
Go 1.1: Work-stealing scheduler.
Go 1.2 to Go 1.13: Cooperative preemptive scheduler.
Go 1.14: Signal-based preemptive scheduler.

There is also a newer scheduler proposal based on non-uniform memory access (NUMA). However, because the implementation is too complicated and its priority is not high enough, it has not been put on the agenda for a long time.

Interested readers can refer to "NUMA-aware scheduler for Go", proposed by Dmitry Vyukov (dvyukov).

Why do you want to preempt P

Why do you want to preempt P? To put it bluntly, without preemption the other Goroutines never get a chance to run and the program hangs, or resources end up unevenly distributed.

This is obviously unreasonable in the design of a scheduler.

Consider this example:

package main

import (
    "fmt"
    "runtime"
    "time"
)

// Main Goroutine
func main() {
    // Simulate a single-core CPU
    runtime.GOMAXPROCS(1)

    // Simulate a Goroutine stuck in an endless loop
    go func() {
        for {
        }
    }()

    time.Sleep(time.Millisecond)
    fmt.Println("脑子进煎鱼了")
}

In older versions of Go (before the signal-based preemption of Go 1.14), this example blocks forever and the message is never printed. It is exactly the kind of situation that requires preemption.
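Under cooperative scheduling, a loop like the one above had to give up P by itself. The following is a minimal sketch of that idea, not code from the original example: the busy Goroutine calls runtime.Gosched() on every iteration so the main Goroutine can run again. The helper name runWithYield and the timeout parameter are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// runWithYield (hypothetical helper) starts a busy Goroutine that yields on
// every iteration, sleeps briefly, and reports whether the caller resumed
// before the timeout elapsed.
func runWithYield(timeout time.Duration) bool {
	stop := make(chan struct{})
	defer close(stop) // let the busy Goroutine exit when we return

	go func() {
		for {
			select {
			case <-stop:
				return
			default:
				// Voluntarily give up P so other Goroutines can run.
				runtime.Gosched()
			}
		}
	}()

	start := time.Now()
	time.Sleep(time.Millisecond)
	return time.Since(start) < timeout
}

func main() {
	// Simulate a single-core CPU, as in the example above.
	runtime.GOMAXPROCS(1)
	fmt.Println("main resumed in time:", runWithYield(time.Second))
}
```

The point of the sketch is only that the loop contains an explicit scheduling point; the empty for loop in the example has none, which is why it needs preemption.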

Some readers may ask whether taking P away creates new problems. The M that was using P is left out in the cold (an M must be bound to a P), and without a P it cannot continue executing.

This is actually not a problem, because the Goroutine is blocked in a system call and has nothing new to execute for the time being.

But what if the code becomes runnable again after a long wait (long waits are legitimate in business code)? That is, the Goroutine has recovered from the blocked state and expects to continue running. What happens if there is no P?

In that case the Goroutine behaves like any other Goroutine and first checks whether the M it is on is still bound to a P:

If there is a P, it adjusts its state and continues to run.
If there is no P, it grabs a P again, then occupies and binds that P for its own use.

That is, preempting P is itself a two-way behavior. If you steal my P, I can also grab someone else’s P to continue running.
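The two-way behavior can be sketched with a toy model. This is an illustration under stated assumptions, not the runtime's implementation: the types P, M, and sched and the methods retake and exitSyscall are simplified stand-ins invented for this sketch.

```go
package main

import "fmt"

// P and M are simplified stand-ins for the runtime's structures.
type P struct{ id int }

type M struct {
	id int
	p  *P // nil when the M has lost its P
}

// sched is a toy scheduler that only tracks idle Ps.
type sched struct{ idle []*P }

// retake models a P being taken away from an M blocked in a system call.
func (s *sched) retake(m *M) {
	if m.p != nil {
		s.idle = append(s.idle, m.p)
		m.p = nil
	}
}

// exitSyscall models an M whose Goroutine has returned from the blocking
// call: keep the P if it is still bound, otherwise grab an idle one.
// It reports whether the M ended up with a P.
func (s *sched) exitSyscall(m *M) bool {
	if m.p != nil {
		return true // P was never taken away
	}
	if len(s.idle) == 0 {
		return false // no P available in this toy model
	}
	last := len(s.idle) - 1
	m.p = s.idle[last]
	s.idle = s.idle[:last]
	return true
}

func main() {
	s := &sched{}
	m := &M{id: 1, p: &P{id: 0}}

	s.retake(m)
	fmt.Println("has P after retake:", m.p != nil)

	ok := s.exitSyscall(m)
	fmt.Println("reacquired a P:", ok)
}
```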

How to preempt P

Having explained why P is preempted, let's dig deeper: how exactly does the runtime preempt a particular P?

This involves the runtime.retake method, which handles the following two scenarios:

Preempt a P that is blocked in a system call.
Preempt a G that has been running for too long.

This article focuses on the scenario of preempting P. The analysis follows:

func retake(now int64) uint32 {
    n := 0
    // Lock allpLock so that allp cannot change during the loop
    lock(&allpLock)
    // Main logic: loop over every P
    for i := 0; i < len(allp); i++ {
        _p_ := allp[i]
        pd := &_p_.sysmontick
        s := _p_.status
        sysretake := false
        ...
        if s == _Psyscall {
            // Check whether more than 1 sysmon tick cycle has passed
            t := int64(_p_.syscalltick)
            if !sysretake && int64(pd.syscalltick) != t {
                pd.syscalltick = uint32(t)
                pd.syscallwhen = now
                continue
            }
      
            ...
        }
    }
    unlock(&allpLock)
    return uint32(n)
}

This method first locks allpLock. As its name suggests, allpLock prevents the allp array from changing.

It protects P-less reads and size changes of allp, idlepMask, and timerpMask, as well as all writes to allp, so that subsequent operations are not affected.

Scenario one

After this preparation, the main logic uses an ordinary for loop to process every P in allp one by one.

            t := int64(_p_.syscalltick)
            if !sysretake && int64(pd.syscalltick) != t {
                pd.syscalltick = uint32(t)
                pd.syscallwhen = now
                continue
            }

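The first scenario checks syscalltick. If the tick has changed since sysmon last looked, the new value and the current time are recorded and this P is skipped. If the tick is unchanged, the P has been in the same system call for more than 1 sysmon tick cycle (at least 20us) and becomes a candidate for being preempted from the system call.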
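The tick comparison above can be sketched in isolation. This is a simplified model, not the runtime's code: the types pState and tickSnapshot and the function observe are hypothetical names, and the sysretake flag from the excerpt is left out.

```go
package main

import "fmt"

// pState is a stand-in for the field retake reads from a P.
type pState struct {
	syscalltick uint32 // incremented each time the P enters a new system call
}

// tickSnapshot plays the role of sysmontick: what sysmon saw last time.
type tickSnapshot struct {
	syscalltick uint32
	syscallwhen int64
}

// observe returns true when the P is still in the same system call that was
// seen on the previous pass. Otherwise it records the new tick and time and
// returns false, which corresponds to the "continue" in the excerpt.
func observe(p *pState, snap *tickSnapshot, now int64) bool {
	if snap.syscalltick != p.syscalltick {
		snap.syscalltick = p.syscalltick
		snap.syscallwhen = now
		return false
	}
	return true
}

func main() {
	p := &pState{syscalltick: 5}
	snap := &tickSnapshot{}

	fmt.Println(observe(p, snap, 100)) // first sighting of tick 5: false
	fmt.Println(observe(p, snap, 200)) // same tick one pass later: true

	p.syscalltick++                    // the P entered a new system call
	fmt.Println(observe(p, snap, 300)) // tick changed, timer restarts: false
}
```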

Scenario two

If the P is not skipped, execution continues with the following logic:

func retake(now int64) uint32 {
    for i := 0; i < len(allp); i++ {
        ...
        if s == _Psyscall {
            // Analysis starts here
            if runqempty(_p_) &&
                atomic.Load(&sched.nmspinning)+atomic.Load(&sched.npidle) > 0 &&
                pd.syscallwhen+10*1000*1000 > now {
                continue
            }
            ...
        }
    }
    unlock(&allpLock)
    return uint32(n)
}

The second scenario centers on this chain of conditions:

runqempty(_p_) checks whether P's local run queue is empty, that is, whether other tasks are waiting to be executed.
atomic.Load(&sched.nmspinning)+atomic.Load(&sched.npidle) > 0 checks whether there is a spinning M (one that is looking for a G to steal) or an idle P.
pd.syscallwhen+10*1000*1000 > now checks whether the system call has lasted less than 10ms (the values are in nanoseconds).

When all three hold, this P is skipped. What looks strange here is that runqempty has already established that there are no other tasks to execute, so there would seem to be no need to grab P at all.

In practice, however, a P stuck in a system call can prevent the sysmon thread from going into deep sleep, so the runtime still wants to retake the P eventually, which is what the 10ms limit ensures.
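The three conditions can be written as a small pure function to see how they combine. This is a sketch under stated assumptions: shouldSkipRetake and retakeAfterNs are hypothetical names, and the parameters stand in for the values the excerpt reads from the runtime.

```go
package main

import "fmt"

// retakeAfterNs is 10ms expressed in nanoseconds, as in 10*1000*1000.
const retakeAfterNs = 10 * 1000 * 1000

// shouldSkipRetake returns true when the P should be left alone for now:
// its run queue is empty, there is already a spinning M or an idle P,
// and the system call has lasted less than 10ms.
func shouldSkipRetake(runqEmpty bool, nmspinning, npidle uint32, syscallwhen, now int64) bool {
	return runqEmpty &&
		nmspinning+npidle > 0 &&
		syscallwhen+retakeAfterNs > now
}

func main() {
	const ms = 1000 * 1000

	// Nothing queued, an idle P exists, 1ms in the system call: skip.
	fmt.Println(shouldSkipRetake(true, 0, 1, 0, 1*ms))
	// Work is queued on this P: do not skip.
	fmt.Println(shouldSkipRetake(false, 0, 1, 0, 1*ms))
	// No spinning M and no idle P: do not skip.
	fmt.Println(shouldSkipRetake(true, 0, 0, 0, 1*ms))
	// System call has lasted 11ms: do not skip.
	fmt.Println(shouldSkipRetake(true, 0, 1, 0, 11*ms))
}
```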

Once these checks have passed, the code enters the stage of actually taking P away:

func retake(now int64) uint32 {
    for i := 0; i < len(allp); i++ {
        ...
        if s == _Psyscall {
            // Continues from the previous part
            unlock(&allpLock)
            incidlelocked(-1)
            if atomic.Cas(&_p_.status, s, _Pidle) {
                if trace.enabled {
                    traceGoSysBlock(_p_)
                    traceProcStop(_p_)
                }
                n++
                _p_.syscalltick++
                handoffp(_p_)
            }
            incidlelocked(1)
            lock(&allpLock)
        }
    }
    unlock(&allpLock)
    return uint32(n)
}

Release the lock: unlock is called on allpLock so that sched.lock can be acquired in the following steps.
Reduce idle M: the number of idle locked Ms is decremented before the atomic operation (CAS), pretending that one more M is running. Otherwise the M whose P is being retaken could exit the system call, increment nmidle, and report a deadlock.
Modify the P state: atomic.Cas sets the status of the retaken P to idle (_Pidle) so that it can be used by other Ms.
Hand off P: handoffp takes P from the M that is in a system call or locked, and a new M takes over this P.
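The role of the CAS in the steps above is to make sure only one party wins the P. The following sketch shows that property with sync/atomic. It is an illustration rather than runtime code: the constants pIdle, pRunning, and pSyscall and both functions are hypothetical, and the competing side moves the status to pRunning purely to make the two outcomes distinguishable.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Hypothetical status values standing in for the runtime's P states.
const (
	pIdle uint32 = iota
	pRunning
	pSyscall
)

// tryRetake models sysmon's side: pSyscall -> pIdle.
func tryRetake(status *uint32) bool {
	return atomic.CompareAndSwapUint32(status, pSyscall, pIdle)
}

// tryExitSyscall models the M returning from the system call and trying to
// keep its P: pSyscall -> pRunning.
func tryExitSyscall(status *uint32) bool {
	return atomic.CompareAndSwapUint32(status, pSyscall, pRunning)
}

func main() {
	status := pSyscall
	var retaken, reclaimed bool

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); retaken = tryRetake(&status) }()
	go func() { defer wg.Done(); reclaimed = tryExitSyscall(&status) }()
	wg.Wait()

	// Whichever CAS runs first changes the status, so the other one fails.
	fmt.Println("exactly one winner:", retaken != reclaimed)
}
```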

Summary

At this point, the basic process of preempting P is complete. We can conclude that P is preempted under the following conditions:

If a system call times out: a task has spent more than 1 sysmon tick cycle (at least 20us) in a system call, so P is preempted from the system call.
If there is no idle P: every P is already bound to an M. The P held by the current system call has to be preempted. The system call itself does not need this P, so it is handed to another M to schedule other Gs.
If there are Gs waiting in P's run queue: the Gs in P's local queue need to be scheduled promptly, but their own P is tied up in a system call. Another M is found to take over P and continue scheduling those Gs.

If you have any questions, feel free to leave a comment. The best relationships are those of mutual achievement, and your likes are my greatest motivation to keep writing. Thanks for your support.
