Interviewer: How much do you know about Go's mutex?

Foreword

Hello everyone, my name is asong.
When it comes to concurrent or multi-threaded programming, locks are the first thing that comes to mind. A lock is a synchronization primitive that keeps multiple threads from racing when they access the same memory, which is what makes concurrent code safe. Go favors channels, sharing memory by communicating, and this design sets it apart from many mainstream languages. Still, channels do not fit every scenario, so the sync package also provides a mutex and a read-write lock, and everyday Go code relies on both. I will cover how they are implemented in two articles; this one looks at the mutex.

This article is based on Go 1.18.

Go language mutex design and implementation

Introduction to mutex

The Mutex type in the sync package is Go's mutex. It provides three public methods: Lock() acquires the lock, Unlock() releases it, and TryLock(), added in Go 1.18, attempts to acquire the lock without blocking:

Lock(): acquires the lock. A Mutex is not reentrant: if a goroutine calls Lock again before the lock has been released, it blocks forever unless another goroutine unlocks it (when every goroutine is blocked, the runtime aborts with a fatal deadlock error).
Unlock(): releases the lock. Unlocking a Mutex that is not locked is a fatal run-time error. A locked Mutex is not associated with a particular goroutine, so one goroutine may lock it and another may unlock it.
TryLock(): tries to acquire the lock without spinning or blocking. If the lock is held by another goroutine or the mutex is in starvation mode, it returns false immediately; otherwise it attempts to take the lock and returns false immediately if that attempt fails.
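
The behavior described above can be checked with a small program. This is a minimal sketch assuming Go 1.18 or later; the helper name demo is made up for illustration. It shows that TryLock reports false while the lock is held, and that a goroutine other than the one that locked the Mutex is allowed to unlock it.

```go
package main

import (
	"fmt"
	"sync"
)

// demo is a hypothetical helper that exercises Lock, Unlock, and TryLock.
// It returns what TryLock reported while the lock was held and again
// after a different goroutine released it.
func demo() (whileHeld, afterRelease bool) {
	var mu sync.Mutex

	mu.Lock() // locked by the calling goroutine

	// TryLock never blocks: the lock is held, so it reports false.
	whileHeld = mu.TryLock()

	// A Mutex is not tied to the goroutine that locked it, so another
	// goroutine is allowed to unlock it.
	done := make(chan struct{})
	go func() {
		mu.Unlock()
		close(done)
	}()
	<-done

	// The lock is free and uncontended now, so TryLock succeeds.
	afterRelease = mu.TryLock()
	if afterRelease {
		mu.Unlock()
	}
	return whileHeld, afterRelease
}

func main() {
	held, free := demo()
	fmt.Println(held, free) // prints "false true"
}
```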

Mutex has a simple structure with only two fields:

 type Mutex struct {
    state int32
    sema  uint32
}

state: the current state of the mutex, a composite field;
sema: a semaphore, used to put waiting goroutines to sleep and to wake them up

At first glance the structure may be puzzling: a mutex sounds like a complicated thing, so how can two fields be enough? The answer is bit flags. Different bits of state carry different meanings, so a small amount of memory encodes a lot of information. The lowest three bits, from low to high, are mutexLocked, mutexWoken, and mutexStarving, and the remaining bits count how many goroutines are currently waiting for the lock:

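 const (
   mutexLocked = 1 << iota // the mutex is locked
   mutexWoken // a goroutine has been woken up in normal mode
   mutexStarving // the mutex is in starvation mode
   mutexWaiterShift = iota // shift (3) to reach the waiter count stored in the remaining bits
)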
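
To make the bit layout concrete, here is a small sketch that splits a state word into its parts. The constants are copied from the listing above; the helper decode is hypothetical and exists only for illustration, since the real state field is unexported.

```go
package main

import "fmt"

// These constants mirror the unexported ones in sync/mutex.go (Go 1.18).
const (
	mutexLocked      = 1 << iota // 1: the mutex is locked
	mutexWoken                   // 2: a waiter has been woken
	mutexStarving                // 4: the mutex is in starvation mode
	mutexWaiterShift = iota      // 3: the waiter count starts at bit 3
)

// decode is a hypothetical helper that splits a state word into its parts.
func decode(state int32) (locked, woken, starving bool, waiters int32) {
	locked = state&mutexLocked != 0
	woken = state&mutexWoken != 0
	starving = state&mutexStarving != 0
	waiters = state >> mutexWaiterShift
	return
}

func main() {
	// Two waiters (2<<3 = 16) plus the locked bit (1) gives state 17.
	state := int32(2<<mutexWaiterShift | mutexLocked)
	locked, woken, starving, waiters := decode(state)
	fmt.Println(locked, woken, starving, waiters) // prints "true false false 2"
}
```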

The original Mutex implementation had only normal mode. In normal mode, waiters queue in first-in, first-out order, but a waiter that has just been woken does not own the lock: it competes with newly arriving goroutines, which are already running on a CPU and therefore tend to win. A woken waiter can thus stay blocked for a long time, so Go 1.9 introduced starvation mode. When a waiter fails to acquire the lock for more than 1ms, the mutex switches to starvation mode. In starvation mode, the unlocking goroutine hands the mutex directly to the waiter at the front of the queue; newly arriving goroutines neither try to acquire the lock nor spin, and simply join the tail of the queue. When a waiter acquires the mutex and it is either the last waiter in the queue or has waited less than 1ms, the mutex switches back to normal mode.

With the basics covered, let's walk through how Mutex works, from locking to unlocking.

Lock

Start with the Lock method:

 func (m *Mutex) Lock() {
    // Fast path: if the lock is completely idle (m.state == 0), acquire it by setting m.state to mutexLocked (1)
    if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
        if race.Enabled {
            race.Acquire(unsafe.Pointer(m))
        }
        return
    }
    // Slow path (outlined so that the fast path can be inlined)
    m.lockSlow()
}

The code above has two parts:

Check the lock state with a CAS: if the lock is completely idle, that is, m.state is 0, acquire it by setting m.state to 1 (mutexLocked).
If the CAS fails because the lock is held or other state bits are set, call lockSlow, which competes for the lock by spinning or queues up and waits for it to be released. lockSlow is described next.

The lockSlow code is fairly long, and its body is a for loop. Its logic can be divided into the following four parts:

Initialize the local state
Check whether the spin conditions are met, and spin if they are
Compute the desired state for acquiring the lock
Install the desired state with a CAS operation

State initialization

The lockSlow method first initializes five local variables:

 func (m *Mutex) lockSlow() {
    var waitStartTime int64 
    starving := false
    awoke := false
    iter := 0
    old := m.state
    ........
}

waitStartTime records when the goroutine started waiting, and is used to compute how long the waiter has waited
starving is the starvation flag. If the wait exceeds 1ms, starving is set to true, and later steps mark the Mutex itself as starving
awoke indicates whether this goroutine has been woken. A spinning goroutine is effectively a waiter that is already running on a CPU, so to keep Unlock from waking another goroutine it tries to set the woken bit of the Mutex while spinning; once that succeeds, it sets its own awoke to true
iter counts how many times this goroutine has spun
old holds a snapshot of the current lock state

Spin

The conditions for spinning are strict:

 for {
    // Spinning is allowed only when two conditions hold.
    // Condition 1: the lock is held and is not in starvation mode.
    // Condition 2 is checked inside runtime_canSpin: multicore CPU and fewer than 4 spins so far.
    if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
        // !awoke: the current goroutine has not set the woken flag yet
        // old&mutexWoken == 0: no other goroutine has been woken
        // old>>mutexWaiterShift != 0: there are goroutines in the wait queue
        // The CAS tries to set the Woken bit (the second-lowest bit) so that
        // Unlock() does not wake up any other waiter
        if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
            atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
            // This goroutine now owns the woken flag
            awoke = true
        }
        // Spin
        runtime_doSpin()
        // Count the spin
        iter++
        // Re-read the lock state
        old = m.state
        continue
    }
}

The spin condition is intricate. A goroutine spins because it is betting that the current holder will release the lock shortly, so several conditions have to hold before spinning is worthwhile. Here they are in words:

old&(mutexLocked|mutexStarving) == mutexLocked checks that the lock is locked and in normal mode. Why does this work?

mutexLocked in binary is 001

mutexStarving in binary is 100

mutexLocked|mutexStarving in binary is 101. ANDing the current state with 101 keeps only the locked and starving bits. In starvation mode the third-lowest bit is 1, and when the lock is held the lowest bit is 1, so the result equals mutexLocked (001) only when the lock is held and not starving;
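
The mask check can be tried on a few sample states. This is a sketch only: the constants are copied from sync/mutex.go, and lockedAndNotStarving is a hypothetical name for the first half of the spin condition.

```go
package main

import "fmt"

// Constants mirroring the unexported ones in sync/mutex.go (Go 1.18).
const (
	mutexLocked = 1 << iota // 001
	mutexWoken              // 010
	mutexStarving           // 100
)

// lockedAndNotStarving is a hypothetical name for the first half of the
// spin condition: the lock is held and the mutex is in normal mode.
func lockedAndNotStarving(old int32) bool {
	return old&(mutexLocked|mutexStarving) == mutexLocked
}

func main() {
	fmt.Println(lockedAndNotStarving(0b001)) // locked, normal mode: true
	fmt.Println(lockedAndNotStarving(0b011)) // locked and woken, normal mode: true
	fmt.Println(lockedAndNotStarving(0b101)) // locked, starvation mode: false
	fmt.Println(lockedAndNotStarving(0b000)) // unlocked: false
}
```

The woken bit is masked out, so it has no influence on the result; only the locked and starving bits matter.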

runtime_canSpin() decides whether the remaining spin conditions are met:

 // go/go1.18/src/runtime/proc.go
const active_spin     = 4
func sync_runtime_canSpin(i int) bool {
    if i >= active_spin || ncpu <= 1 || gomaxprocs <= int32(sched.npidle+sched.nmspinning)+1 {
        return false
    }
    if p := getg().m.p.ptr(); !runqempty(p) {
        return false
    }
    return true
}

The spin conditions are as follows:

The goroutine has spun fewer than 4 times
The machine has more than one CPU core
GOMAXPROCS > 1
At least one other P is running, and the local run queue of the current P is empty;

After judging that the current goroutine can spin, call the runtime_doSpin method to spin:

 const active_spin_cnt = 30
func sync_runtime_doSpin() {
    procyield(active_spin_cnt)
}
// asm_amd64.s
TEXT runtime·procyield(SB),NOSPLIT,$0-0
    MOVL    cycles+0(FP), AX
again:
    PAUSE
    SUBL    $1, AX
    JNZ    again
    RET

The loop count is 30: each spin executes the PAUSE instruction 30 times, busy-waiting and consuming CPU time;

That is the whole spin logic. Spinning is an optimization over the block -> wake up -> compete for the lock cycle: the goroutine stays on the CPU for a short while in the hope that the lock is released in the meantime.

Computing the desired state

After the spin logic, the code computes the new state it wants the mutex to have, working out the mutexLocked, mutexStarving, and mutexWoken bits and the waiter count in turn:

First, compute the mutexLocked bit:

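 // Start the new state from the old state
        new := old
        // Only try to take the lock when the mutex is not starving;
        // in starvation mode new arrivals must queue
        if old&mutexStarving == 0 {
            new |= mutexLocked
        }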

Update the waiter count (the bits above mutexWaiterShift):

 // If the old state is locked or starving, this goroutine has to wait, so add one waiter
if old&(mutexLocked|mutexStarving) != 0 {
            new += 1 << mutexWaiterShift
        }

Compute the mutexStarving bit:

 // If this goroutine is starving and the lock is still held, set the Starving bit
 // (the third-lowest bit) to switch the mutex to starvation mode
if starving && old&mutexLocked != 0 {
            new |= mutexStarving
        }

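Compute the mutexWoken bit:

 // This goroutine was woken (or claimed the woken flag while spinning), so reset the flag
if awoke {
            // The woken bit must be set at this point; otherwise the state is inconsistent
            if new&mutexWoken == 0 {
                throw("sync: inconsistent mutex state")
            }
            // Clear the woken bit in the new state: this goroutine will either acquire the lock or block.
            // If it blocks, it needs an unlocking goroutine to wake it later.
            // If Unlock saw the woken bit still set, it would not wake anyone, and this goroutine could never be woken to take the lock
            new &^= mutexWoken
}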
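
The four steps above can be collected into one pure function and tried on sample inputs. This is a sketch for illustration, not the runtime's code: desiredState is a hypothetical name, it works on a plain int32 instead of the atomic field, and it returns ok = false where the real code calls throw.

```go
package main

import "fmt"

// Constants mirroring the unexported ones in sync/mutex.go (Go 1.18).
const (
	mutexLocked = 1 << iota
	mutexWoken
	mutexStarving
	mutexWaiterShift = iota
)

// desiredState is a hypothetical pure function that applies the four steps
// to a snapshot of the state. ok is false where the real code would throw.
func desiredState(old int32, starving, awoke bool) (next int32, ok bool) {
	next = old
	if old&mutexStarving == 0 {
		next |= mutexLocked // normal mode: try to take the lock
	}
	if old&(mutexLocked|mutexStarving) != 0 {
		next += 1 << mutexWaiterShift // we will have to wait: one more waiter
	}
	if starving && old&mutexLocked != 0 {
		next |= mutexStarving // request starvation mode
	}
	if awoke {
		if next&mutexWoken == 0 {
			return 0, false // inconsistent state
		}
		next &^= mutexWoken // give up the woken flag
	}
	return next, true
}

func main() {
	fmt.Println(desiredState(0, false, false)) // free lock: prints "1 true"
	fmt.Println(desiredState(1, false, false)) // held lock: prints "9 true"
	fmt.Println(desiredState(3, true, true))   // held, woken, caller starving: prints "13 true"
}
```

Reading the last result: 13 is 1101 in binary, that is, one waiter, starving set, woken cleared, locked set.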

Update the desired state with the `CAS` operation

Having computed the desired state, the code then tries to install it with CAS:

 // Try to update the lock state to the desired state
 if atomic.CompareAndSwapInt32(&m.state, old, new) {
     // If the old state was neither locked nor starving, this CAS acquired the lock, so exit
     if old&(mutexLocked|mutexStarving) == 0 {
         break // locked the mutex with CAS
     }
     // Otherwise the lock has not been acquired. waitStartTime records when this goroutine
     // started waiting; waitStartTime != 0 means it has waited before, so it is queued at
     // the head of the wait queue, otherwise at the tail
     queueLifo := waitStartTime != 0
     if waitStartTime == 0 {
         waitStartTime = runtime_nanotime()
     }
     // Block until woken through the semaphore
     runtime_SemacquireMutex(&m.sema, queueLifo, 1)
     // After waking up, decide whether this goroutine counts as starving:
     // 1. it was already starving, or
     // 2. it has waited for more than 1ms
     starving = starving || runtime_nanotime()-waitStartTime > starvationThresholdNs
     // Re-read the lock state
     old = m.state
     // If the mutex is in starvation mode, the lock was handed directly to this goroutine
     if old&mutexStarving != 0 {
         // The lock must not be marked locked or woken, and the wait queue must not be empty;
         // anything else means the state is inconsistent
         if old&(mutexLocked|mutexWoken) != 0 || old>>mutexWaiterShift == 0 {
             throw("sync: inconsistent mutex state")
         }
         // This goroutine takes the lock: set the locked bit and decrement the waiter count
         delta := int32(mutexLocked - 1<<mutexWaiterShift)
         // Leave starvation mode (clear the starving bit) if this goroutine is not starving
         // or it is the last waiter in the queue
         if !starving || old>>mutexWaiterShift == 1 {
             delta -= mutexStarving
         }
         // Apply the update and exit the loop holding the lock
         atomic.AddInt32(&m.state, delta)
         break
     }
     // Normal mode: mark this goroutine as woken and reset the spin count
     awoke = true
     iter = 0
 } else {
     // The CAS failed because the state changed; reload it and retry the loop
     old = m.state
 }

This block is the most intricate part. The CAS publishes the desired state; if the old state was unlocked and not starving, the CAS itself acquired the lock and the loop exits. Otherwise the goroutine calls runtime_SemacquireMutex (implemented by runtime.sync_runtime_SemacquireMutex) and sleeps on the semaphore until it is released; once the goroutine obtains the semaphore, the call returns immediately. A goroutine that is waiting for the first time is placed at the tail of the queue, while a goroutine that was already woken once and lost the race is placed at the head. After waking up, it either takes the lock directly (starvation mode) or goes around the loop again (normal mode). It takes a few passes through the code to absorb the whole flow.
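
The delta arithmetic in the starvation branch is easier to follow with numbers. The sketch below applies it to plain integers; handoff is a hypothetical helper, and the constants are copied from sync/mutex.go.

```go
package main

import "fmt"

// Constants mirroring the unexported ones in sync/mutex.go (Go 1.18).
const (
	mutexLocked = 1 << iota
	mutexWoken
	mutexStarving
	mutexWaiterShift = iota
)

// handoff is a hypothetical helper that returns the state after a woken
// waiter takes ownership of a mutex that is in starvation mode.
func handoff(old int32, starving bool) int32 {
	// Set the locked bit (+1) and remove one waiter (-8): delta = -7.
	delta := int32(mutexLocked - 1<<mutexWaiterShift)
	// Leave starvation mode if the waiter is not starving or is the last one.
	if !starving || old>>mutexWaiterShift == 1 {
		delta -= mutexStarving
	}
	return old + delta
}

func main() {
	// 20 = 10100: two waiters, starving, unlocked.
	fmt.Println(handoff(20, true)) // prints 13 (1101: one waiter, starving, locked)
	// 12 = 1100: one waiter, starving, unlocked. The last waiter ends starvation mode.
	fmt.Println(handoff(12, true)) // prints 1 (locked, normal mode, no waiters)
	// A waiter that waited less than 1ms also ends starvation mode.
	fmt.Println(handoff(20, false)) // prints 9 (1001: one waiter, locked)
}
```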

Unlock

Compared with locking, the unlocking logic is less complicated. Let's look at Unlock:

 func (m *Mutex) Unlock() {
    // Fast path: drop lock bit.
    new := atomic.AddInt32(&m.state, -mutexLocked)
    if new != 0 {
        // Outlined slow path to allow inlining the fast path.
        // To hide unlockSlow during tracing we skip one extra frame when tracing GoUnblock.
        m.unlockSlow(new)
    }
}

Unlock first takes the fast path: atomic.AddInt32 clears the lowest bit of m.state and returns the new value. If the new value is 0, the lock is completely idle and unlocking is done. If it is non-zero, other bits are still set; for example, there may be waiting goroutines that have not been woken yet, and the wake-up logic in unlockSlow takes over:

 func (m *Mutex) unlockSlow(new int32) {
    // The mutex was not locked before Unlock was called: abort with a fatal error
    if (new+mutexLocked)&mutexLocked == 0 {
        throw("sync: unlock of unlocked mutex")
    }
    // Normal mode
    if new&mutexStarving == 0 {
        old := new
        for {
            // Return if there are no waiters.
            // Return if the lock is locked again: another goroutine already acquired it.
            // Return if the woken bit is set: a goroutine is already awake, so there is no need to wake another.
            // Return if the mutex switched to starvation mode: the lock will be handed to the waiter at the head of the queue.
            if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken|mutexStarving) != 0 {
                return
            }
            // Claim the right to wake someone: set the woken bit and decrement the waiter count
            new = (old - 1<<mutexWaiterShift) | mutexWoken
            if atomic.CompareAndSwapInt32(&m.state, old, new) {
                // CAS succeeded: wake one goroutine
                runtime_Semrelease(&m.sema, false, 1)
                return
            }
            // CAS failed: reload the state and try again in the next iteration
            old = m.state
        }
    } else {
        // Starvation mode: hand the lock directly to the waiter at the head of the queue
        runtime_Semrelease(&m.sema, true, 1)
    }
}

Both normal mode and starvation mode wake a goroutine by calling func runtime_Semrelease(s *uint32, handoff bool, skipframes int); they differ only in the second argument. When handoff is true, the count is passed directly to the first waiter in the queue.
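
The unlock arithmetic can also be followed on plain integers. This is a sketch under the same assumptions as before: the constants are copied from sync/mutex.go, and the three helper names are made up for illustration.

```go
package main

import "fmt"

// Constants mirroring the unexported ones in sync/mutex.go (Go 1.18).
const (
	mutexLocked = 1 << iota
	mutexWoken
	mutexStarving
	mutexWaiterShift = iota
)

// unlockFast mimics atomic.AddInt32(&m.state, -mutexLocked) on a plain value
// and reports whether the slow path would be needed.
func unlockFast(state int32) (next int32, needSlow bool) {
	next = state - mutexLocked
	return next, next != 0
}

// unlockOfUnlocked mirrors the first check in unlockSlow: adding the locked
// bit back must yield a state whose locked bit is set.
func unlockOfUnlocked(next int32) bool {
	return (next+mutexLocked)&mutexLocked == 0
}

// wakeOne mirrors the normal-mode update: one waiter fewer, woken bit set.
func wakeOne(old int32) int32 {
	return (old - 1<<mutexWaiterShift) | mutexWoken
}

func main() {
	fmt.Println(unlockFast(1)) // locked, no waiters: prints "0 false"
	fmt.Println(unlockFast(9)) // locked, one waiter: prints "8 true"
	fmt.Println(wakeOne(8))    // one waiter left to wake: prints 2
	next, _ := unlockFast(0)   // Unlock on a mutex that was never locked
	fmt.Println(next, unlockOfUnlocked(next)) // prints "-1 true"
}
```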

Non-blocking lock

Go 1.18 introduced a non-blocking locking method, TryLock(), and its implementation is short:

 func (m *Mutex) TryLock() bool {
    // Take a snapshot of the current state
    old := m.state
    // If the lock is held or the mutex is starving, fail immediately
    if old&(mutexLocked|mutexStarving) != 0 {
        return false
    }
    // Try to take the lock; if the CAS fails, give up immediately
    if !atomic.CompareAndSwapInt32(&m.state, old, old|mutexLocked) {
        return false
    }

    return true
}

The implementation of TryLock is simple and consists of two checks:

Check the current lock state: if the lock is held or the mutex is in starvation mode, acquiring fails immediately
Try to acquire the lock with CAS: if the CAS fails, acquiring fails immediately

Using TryLock is discouraged; at least, I have not come up with a scenario that really needs it.
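
For reference, one pattern that is sometimes mentioned is skipping work when another caller is already doing it. The sketch below illustrates that idea only; the refresher type and tryRefresh method are hypothetical, and whether this is preferable to restructuring the code is a judgment call. It assumes Go 1.18 or later.

```go
package main

import (
	"fmt"
	"sync"
)

// refresher is a hypothetical type whose refresh work should not overlap.
type refresher struct {
	mu   sync.Mutex
	runs int
}

// tryRefresh runs work only if no other refresh currently holds the lock.
// It reports whether the work was executed.
func (r *refresher) tryRefresh(work func()) bool {
	if !r.mu.TryLock() {
		return false // a refresh is already in progress: skip instead of queueing
	}
	defer r.mu.Unlock()
	r.runs++
	work()
	return true
}

func main() {
	var r refresher
	var inner bool
	// The inner call happens while the outer call holds the lock,
	// so it is skipped rather than deadlocking.
	outer := r.tryRefresh(func() {
		inner = r.tryRefresh(func() {})
	})
	fmt.Println(outer, inner, r.runs) // prints "true false 1"
}
```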

Summary

After reading the source you will find that the mutex logic is intricate. There is not much code, but it is hard to follow, and some details need several readings before the reasoning behind them becomes clear. To close, here are the key points about the mutex:

A mutex has two modes: normal mode and starvation mode. Starvation mode, introduced in Go 1.9, addresses the case in normal mode where a freshly woken goroutine keeps losing the lock to newly arriving goroutines. If a goroutine fails to acquire the lock for more than 1ms, the Mutex switches to starvation mode. If a goroutine acquires the lock and it is the last waiter in the queue or it waited less than 1ms, the Mutex switches back to normal mode
The locking process:
- If the lock is completely idle, acquire it directly with CAS
- If the lock is locked, in normal mode, and the spin conditions are met, spin up to 4 times
- Once the goroutine can no longer spin, compute the desired lock state
- Try to install the desired state with CAS. If the CAS succeeds, check whether this goroutine obtained the lock: if so, return; if not, go to sleep and wait to be woken
- After being woken, if the mutex is in starvation mode the goroutine takes the lock directly; otherwise it resets the spin count, sets its woken flag, and runs the spin and acquire logic of the for loop again
The unlocking process:
- Atomically clear mutexLocked; if the resulting state is 0, the lock is completely idle and unlocking is done
- If the resulting state is not 0, enter the unlockSlow logic
- Unlocking an unlocked mutex triggers a fatal error: the locked bit is already 0, so subtracting mutexLocked corrupts the whole state word, which is why this check exists
- If the mutex is in starvation mode, directly wake the waiter at the head of the queue
- If the mutex is in normal mode and there are no waiters, return directly. Also return directly if the lock has already been locked again, a goroutine has already been woken, or the mutex has entered starvation mode, because in each case another goroutine is already taking care of the lock. Otherwise set the woken bit, decrement the waiter count with CAS, and wake one waiter
Do not copy a Mutex after it has been used. Copying a Mutex copies its state as well, so a copy taken while the lock is held starts out locked; Lock only succeeds on the fast path when the state is completely idle, so locking the copy blocks forever and the program deadlocks.
The implementation of TryLock is simple: if the lock is held or the mutex is in starvation mode, it fails immediately, and if the attempt to take the lock with CAS fails, it also returns immediately;
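
The point about copying can be observed directly. The sketch below is a deliberately incorrect program that copies a struct while its Mutex is held; the counter type and copyWhileLocked helper are hypothetical. It uses TryLock (Go 1.18 or later) on the copy so that the example reports the problem instead of deadlocking.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical struct that embeds a Mutex by value.
type counter struct {
	mu sync.Mutex
	n  int
}

// copyWhileLocked copies a counter while its mutex is held and reports
// whether the mutex inside the copy can be acquired afterwards.
func copyWhileLocked() bool {
	var a counter
	a.mu.Lock()
	b := a        // BUG on purpose: copies the locked state; go vet's copylocks check flags this
	a.mu.Unlock() // releases a.mu only; b.mu still looks locked

	// Lock on b.mu would block forever, so probe it with TryLock instead.
	return b.mu.TryLock()
}

func main() {
	fmt.Println(copyWhileLocked()) // prints "false": the copy was born locked
}
```

Passing a pointer to the struct, or keeping the Mutex behind a pointer, avoids the copy.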

Is there anything about mutexes that is still unclear after this article? Criticism and corrections are welcome in the comments.

That is all for this article. I'm asong, see you next time.

You are welcome to follow the public account: Golang Dream Factory
