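Go Preemptive Scheduling

Why preemptive scheduling is needed

If a single G holds its M/P for too long, the other goroutines are delayed;
If a G enters an infinite loop, it occupies that P and M forever, so the other Gs never get scheduled and starve.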

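How preemptive scheduling is implemented

The compiler adds a small piece of extra code (the stack check that calls morestack) at the entry of nearly every function or method, which gives the runtime a chance to check whether the G should be preempted;
Before Go 1.14, a G that runs a pure computation loop with no function calls still could not be preempted. In other words, apart from such call-free loops, the runtime gets a chance to preempt a G whenever the G calls a function;
Go 1.14 introduced signal-based preemption, which solves this problem.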

Implementation details

sysmon does the checking: if a G has been running for more than 10ms, retake preempts it:

// src/runtime/proc.go
func main() {
    ......
    systemstack(func() {
        newm(sysmon, nil)
    })
    ......
}

// src/runtime/proc.go
// forcePreemptNS is the time slice given to a G before it is
// preempted.
const forcePreemptNS = 10 * 1000 * 1000 // 10ms

func retake(now int64) uint32 {
    for i := int32(0); i < gomaxprocs; i++ {
        _p_ := allp[i]
        if _p_ == nil {
                continue
        }
        pd := &_p_.sysmontick
        s := _p_.status
        ...
        if s == _Prunning {
            // Preempt G if it's running for too long.
            t := int64(_p_.schedtick)
            if int64(pd.schedtick) != t {
                    pd.schedtick = uint32(t)
                    pd.schedwhen = now
                    continue
            }
            if pd.schedwhen+forcePreemptNS > now {
                    continue
            }
            preemptone(_p_) // request preemption of the G running on this P
        }
        if s == _Psyscall {
            // Retake P from syscall if it's there for more than 1 sysmon tick (at least 20us).
            .....
            handoffp(_p_)
        }
    }
    ...
}

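The retake flow: iterate over every P

If a P is running (_Prunning), its schedtick has not changed since sysmon's previous check, and more than 10ms have passed since then, preempt its G:
- call preemptone, which sets gp.preempt = true;
If a P is in a system call (_Psyscall) and has been there for more than one sysmon tick, retake the P:
- call handoffp to detach the P from its M.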
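The timing rule in the _Prunning branch can be isolated into a small model. This is a simplified sketch, not the runtime's code: the tick type and the shouldPreempt function are hypothetical names, and only the two sysmontick fields used in the excerpt above are modeled.

```go
package main

import "fmt"

// Same value as the runtime constant quoted above: 10ms in nanoseconds.
const forcePreemptNS = 10 * 1000 * 1000

// tick is a hypothetical, reduced copy of sysmontick.
type tick struct {
	schedtick uint32 // last schedtick value sysmon observed on the P
	schedwhen int64  // time in ns at which that value was first observed
}

// shouldPreempt models the _Prunning branch of retake. pSchedtick is the
// P's current scheduling counter, which changes when the P switches to
// another G.
func shouldPreempt(pd *tick, pSchedtick uint32, now int64) bool {
	if pd.schedtick != pSchedtick {
		// The P scheduled a different G since the last check:
		// restart the clock and do not preempt.
		pd.schedtick = pSchedtick
		pd.schedwhen = now
		return false
	}
	// Same G as last time: preempt once the time slice is used up.
	return pd.schedwhen+forcePreemptNS <= now
}

func main() {
	pd := &tick{}
	fmt.Println(shouldPreempt(pd, 1, 0))          // false: first observation
	fmt.Println(shouldPreempt(pd, 1, 5_000_000))  // false: only 5ms elapsed
	fmt.Println(shouldPreempt(pd, 1, 10_000_000)) // true: 10ms elapsed
	fmt.Println(shouldPreempt(pd, 2, 12_000_000)) // false: a new G is running
}
```

Because the clock restarts only when sysmon notices a changed schedtick, a G can run somewhat longer than 10ms before a preemption request is issued.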

// src/runtime/proc.go
func preemptone(_p_ *p) bool {
    ...
    gp.preempt = true

    // Every call in a go routine checks for stack overflow by
    // comparing the current stack pointer to gp->stackguard0.
    // Setting gp->stackguard0 to StackPreempt folds
    // preemption into the normal stack overflow check.
    gp.stackguard0 = stackPreempt
    return true
}

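In plain terms:

If a G runs for 10ms, sysmon considers that too long and requests preemption: it sets gp.preempt = true and sets gp.stackguard0 to stackPreempt;
Once the flag is set, the next time this G calls a function or method the stack check fails and the runtime can preempt it: the G leaves the running state, becomes runnable again, and is put on the global run queue to wait for its next turn.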

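Signal-based preemption in Go 1.14

Preemption flow:

At startup, the runtime registers a handler for the SIGURG signal;
sysmon periodically checks for Ps that have been running too long and sends the signal to the corresponding M;
When the M receives the signal, it suspends the current goroutine and runs the scheduler again.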

Reference: https://www.cnblogs.com/sunsk...
