Foreword
A long time ago, during an interview, I was asked about the synchronized lock escalation process. At the time I only had a superficial understanding of it; I did learn the escalation process later, but I still didn't really understand where the optimization lies or which scenarios it optimizes for. What I actually wanted to know was the motivating scenario behind each step of the lock upgrade process. Especially after seeing JDK 15 deprecate and disable biased locking, I wondered why this technique was removed: has the JDK found a better optimization, or is the technique simply no longer applicable today? Let me state up front that the answer is in JEP 374. I originally wanted to post the answer directly, but considering that some readers may not know the lock upgrade process of synchronized, here is a brief introduction to it first.
In fact, this is also a common interview question, but interviewers usually ask about the process of lock escalation rather than which scenarios actually benefit from it. This is something I often wonder about: why ask "what" instead of "why"?
There are always well-publicized optimizations that everyone is willing to believe in, but we should all keep one saying in mind: without investigation, there is no right to speak.
Introduction to synchronized locks
Here we briefly review synchronized. Synchronized is the first synchronization tool we encounter. It has many aliases: internal lock, exclusive lock, and pessimistic lock. It guarantees atomicity, visibility, and ordering. An instance method modified by the synchronized keyword is called a synchronized instance method, and a static method modified by synchronized is called a synchronized static method. The entire body of a synchronized method is its critical section.
// synchronized instance method
public synchronized void synchronizedDemo(){
}
// synchronized static method
public static synchronized void synchronizedStaticDemo(){
}
public void synchronizedDemoPlus(){
    // synchronized code block
    synchronized (this){
    }
}
Any object on the Java platform has a unique lock associated with it, and a thread needs to acquire that lock to enter the critical section. So where is the lock stored? The answer is the object header: in HotSpot, the header of an ordinary Java object consists of a mark word, which records the lock state (along with things like the identity hash code and GC age), and a klass pointer to the object's class metadata.
Generally speaking, our understanding of this lock is: when multiple threads try to enter the critical section, each applies for the lock, and a thread that finds the lock already held by another thread is blocked.
A more accurate description is that the JVM maintains an entry set for each internal lock to record the threads waiting to acquire it. When the lock is released by its holding thread, an arbitrary thread from the entry set is woken up by the JVM. Seeing this, some readers may ask: how does the JVM implement the lock acquisition and release described above? Answering this requires decompiling the bytecode. Here we use a synchronized code block to observe the internal implementation of synchronized:
/**
 * Compile this code into bytecode first;
 * the answer to the question lies in the bytecode.
 */
public class ThreadDemo{
    public void synchronizedDemo(){
        synchronized (this){
        }
    }
}
Then find the folder containing the bytecode generated for this class, open a command line, and execute the following command:
javap -c ./ThreadDemo.class
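For the class above, the output looks roughly like this (exact offsets can vary with the compiler version):

public void synchronizedDemo();
    Code:
       0: aload_0
       1: dup
       2: astore_1
       3: monitorenter
       4: aload_1
       5: monitorexit
       6: goto          14
       9: astore_2
      10: aload_1
      11: monitorexit
      12: aload_2
      13: athrow
      14: return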
monitorenter is the instruction for entering the critical section and acquiring the lock, and monitorexit is the instruction for leaving the critical section and releasing the lock. Why are there two release instructions? That is a good question: the second monitorexit is prepared for the case where the code in the critical section throws an exception. When implementing monitorenter and monitorexit, the JVM needs to use an atomic operation (CAS), which is expensive.
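For intuition about what that atomic operation looks like, the java.util.concurrent.atomic classes expose the same compare-and-swap primitive directly. A minimal illustration (the class name and 0/1 encoding are my own):

import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        // 0 = unlocked, 1 = locked: conceptually similar to how a monitor
        // acquisition flips ownership with a single atomic instruction.
        AtomicInteger state = new AtomicInteger(0);
        System.out.println(state.compareAndSet(0, 1)); // true: we "acquired" it
        System.out.println(state.compareAndSet(0, 1)); // false: already held
    }
}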
Lock Escalation Overview
But we know that Java threads are mapped to operating-system-level threads, so blocking and waking a thread requires asking the operating system. If a thread holds the lock for only a short time, putting a waiting thread into a deep sleep and having the operating system wake it up again is disproportionately expensive. This leads to lock escalation:
At the beginning, the lock state in the object header is lock-free. The JVM maintains a preference (bias) for each object: when the internal lock corresponding to an object is acquired for the first time by some thread, that thread is recorded as the object's biased thread. Whenever this biased thread applies for or releases the lock again, it does not need to resort to the expensive atomic operation described above, thus reducing the overhead of lock acquisition and release.
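If you want to see this for yourself, the OpenJDK JOL tool can print an object's mark word. This is a minimal sketch, assuming the org.openjdk.jol:jol-core dependency is on the classpath; on JDK 8 to 14, run it with -XX:BiasedLockingStartupDelay=0 so biasing is active from startup:

import org.openjdk.jol.info.ClassLayout;

public class BiasedLockDemo {
    public static void main(String[] args) {
        Object lock = new Object();
        // Before any synchronization: the mark word is in the unlocked
        // (biasable) state.
        System.out.println(ClassLayout.parseInstance(lock).toPrintable());
        synchronized (lock) {
            // First acquisition: with biased locking active, the mark word
            // now records this thread as the biased owner.
            System.out.println(ClassLayout.parseInstance(lock).toPrintable());
        }
    }
}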
This optimization is based on the observation that most locks are never contended, and that many of them are held by at most one thread throughout their lifetime. Most blog posts introduce biased locking based on this observation, too. But I didn't understand this sentence at first: I use synchronization precisely to solve problems caused by multiple threads competing for resources, so what kind of scenario is this observation based on? To answer that, let me first ask a question: what are the thread-safe collections in Java? The average student might answer:
- ConcurrentHashMap
- CopyOnWriteArrayList
- CopyOnWriteArraySet
These are all sophisticated concurrent collections, introduced in JDK 1.5. But there are also some lesser-known thread-safe collections:
- Hashtable
- Vector
These two collections are among Java's original collections, introduced in JDK 1.0. Nowadays hardly anyone chooses them, because their approach to thread safety is simple and crude: every public method is declared synchronized. I looked up when ArrayList and HashMap were introduced: JDK 1.2. So early Java programmers had no choice at all; only Hashtable and Vector were available, even in single-threaded scenarios where no ArrayList or HashMap existed yet. Java is a backward-compatible language, and even though JDK 19 is about to be released, some projects are still stuck on JDK 5 or 6. The introduction of biased locking in JDK 6 was therefore meant to optimize the performance of this early, legacy code. This is also one of the reasons JDK 15 removed biased locking, and the more decisive reason is that keeping biased locking enabled can actually degrade performance.
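To see the cost biased locking was meant to remove, note that every call on these legacy collections takes the collection's monitor, even in purely single-threaded code:

import java.util.Vector;

public class LegacyCollectionDemo {
    public static void main(String[] args) {
        Vector<String> v = new Vector<>();
        v.add("a");                   // add() is synchronized: acquires and releases v's monitor
        System.out.println(v.get(0)); // get() is synchronized: acquires the monitor again
        // With biased locking, once v is biased toward this thread, these
        // repeated uncontended acquisitions avoid the CAS on the mark word.
    }
}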
Why remove the biased lock
Let's take a closer look at why the JEP 374 proposal removes biased locks:
Biased locking is an optimization technique used in the HotSpot Virtual Machine to reduce the overhead of uncontended locking. It aims to avoid executing a compare-and-swap atomic operation when acquiring a monitor by assuming that a monitor remains owned by a given thread until a different thread tries to acquire it. The initial lock of the monitor biases the monitor towards that thread, avoiding the need for atomic instructions in subsequent synchronized operations on the same object. When many threads perform many synchronized operations on objects used in a single-threaded fashion, biasing the locks has historically led to significant performance improvements over regular locking techniques.
The performance gains seen in the past are far less evident today. Many applications that benefited from biased locking are older, legacy applications that use the early Java collection APIs, which synchronize on every access (e.g., Hashtable and Vector). Newer applications generally use the non-synchronized collections (e.g., HashMap and ArrayList), introduced in Java 1.2 for single-threaded scenarios, or the even more-performant concurrent data structures, introduced in Java 5, for multi-threaded scenarios. This means that applications that benefit from biased locking due to unnecessary synchronization will likely see a performance improvement if the code is updated to use these newer classes. Furthermore, applications built around a thread-pool queue and worker threads generally perform better with biased locking disabled. (SPECjbb2015 was designed that way, e.g., while SPECjvm98 and SPECjbb2005 were not). Biased locking comes with the cost of requiring an expensive revocation operation in case of contention. Applications that benefit from it are therefore only those that exhibit significant amounts of uncontended synchronized operations, like those mentioned above, so that the cost of executing cheap lock owner checks plus an occasional expensive revocation is still lower than the cost of executing the eluded compare-and-swap atomic instructions. Changes in the cost of atomic instructions since the introduction of biased locking into HotSpot also change the amount of uncontended operations needed for that relation to remain true. Another aspect worth noting is that applications won't have noticeable performance improvements from biased locking even when the previous cost relation is true when the time spent on synchronized operations is still only a small fraction of the total application workload.

Biased locking introduced a lot of complex code into the synchronization subsystem and is invasive to other HotSpot components as well. This complexity is a barrier to understanding various parts of the code and an impediment to making significant design changes within the synchronization subsystem. To that end we would like to disable, deprecate, and eventually remove support for biased locking.
To sum up, biased locking was introduced mainly to optimize code written against Hashtable and Vector, the pre-JDK 1.2 collections discussed above, whose locks are typically acquired by only one thread during the collection's entire life cycle. It is being removed now because almost nobody uses these two collections anymore, while revoking a biased lock remains expensive, so JDK 15 decided to deprecate and disable this feature.
Lightweight lock to heavyweight lock?
Leaving the biased lock aside for now, let's talk about the rest of the lock upgrade process:
The previous step was the upgrade from lock-free to biased lock. Suppose another thread now tries to acquire a biased lock; at this point the biased lock is upgraded to a lightweight lock. The concrete behavior of a lightweight lock is that a thread which fails to acquire the lock does not fall into a blocked state but spins, that is, it keeps trying to acquire the lock in a loop. However, long-term spinning consumes CPU resources, so after a certain number of attempts the lock is inflated to a heavyweight lock. If the lock is in the heavyweight state, a thread that fails to acquire it enters the blocked state.
A description seen in "Java Multithreaded Programming Practical Guide":
In the case of lock contention, when a thread applies for a lock that happens to be held by another thread, it needs to wait for the lock to be released by its holder. A conservative way to implement this waiting is to suspend the thread, but suspending the thread causes context switching, so for a given lock instance this strategy is more suitable for scenarios where most threads in the system hold the lock for a long time, so that the gains offset the context-switching overhead. Another implementation is busy waiting, which is essentially a loop statement with an empty body, as shown in the following code:
while (lockIsHeldByOtherThread) {}
As can be seen, busy waiting repeatedly performs a no-op until the required condition holds. The advantage of this strategy is that it causes no context switches; the disadvantage is that it consumes more processor resources.
In fact, the JVM does not have to choose just one of the two strategies; it can combine them. For a given lock instance, the JVM determines, based on information collected at runtime, whether the lock tends to be held for a "longer" or "shorter" time. For locks held for a "longer" time, the JVM selects the suspend-and-wait strategy; for locks held for a "shorter" time, it selects the busy-wait strategy. The JVM may also adopt the busy-wait strategy first and fall back to suspend-and-wait when busy waiting fails. This JVM optimization is called adaptive locking.
When I read this paragraph, I wondered: would the JVM ever adopt the suspend-and-wait strategy first and then switch to busy waiting? Searching the whole web, I couldn't find any discussion of this; the only related discussion is about adaptive locks. Adaptive spinning was introduced in JDK 1.6. "Adaptive" means the spin duration is no longer fixed, but is determined by the previous spin time on the same lock and the state of the lock's owner. If, on a given lock object, a spin wait has just succeeded in acquiring the lock and the thread holding the lock is running, the JVM assumes that spinning is likely to succeed again, and allows a relatively longer spin. If spinning rarely succeeds in acquiring a given lock, future attempts to acquire that lock will likely skip the spinning phase and block the thread directly, avoiding wasted processor resources.
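To make the idea concrete, here is a toy sketch of adaptive spinning (my own illustration under simplified assumptions, not the HotSpot implementation): the spin budget grows after successful spins and shrinks after failures, after which the thread falls back to parking as a stand-in for blocking.

import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

public class AdaptiveSpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);
    // Per-lock spin budget; races on this heuristic are harmless.
    private volatile int spinBudget = 128;

    public void lock() {
        int budget = spinBudget;
        for (int i = 0; i < budget; i++) {
            if (held.compareAndSet(false, true)) {
                // Spinning paid off: allow a longer spin next time.
                spinBudget = Math.min(budget * 2, 1 << 14);
                return;
            }
            Thread.onSpinWait(); // JDK 9+ spin hint
        }
        // Spinning failed: spin less next time, then fall back to parking
        // between attempts instead of burning CPU.
        spinBudget = Math.max(budget / 2, 16);
        while (!held.compareAndSet(false, true)) {
            LockSupport.parkNanos(1_000_000L); // ~1 ms
        }
    }

    public void unlock() {
        held.set(false);
    }
}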
At this point the question is basically answered: we have covered the lock escalation process. After JDK 8, the process goes from lock-free to biased lock to spinning, where the JVM decides based on the spin success rate: if spinning succeeds often, it keeps spinning; if spinning rarely succeeds in acquiring the lock, continuing to spin would waste resources, so the lock is inflated to a heavyweight lock.
Standard answer to the lock escalation process
If an interviewer asks about the lock upgrade process, I think the standard answer is as follows:
- In JDK 8 to 14, the lock goes from lock-free to biased lock to adaptive lock. The so-called adaptive lock means that the JVM decides whether to spin or block a thread based on information collected at runtime: once another thread contends, the biased lock is upgraded to a lightweight (spinning) lock, and if the failure rate of spin acquisition is relatively high, indicating that a single thread holds the lock for a long time, the JVM inflates the lightweight lock into a heavyweight lock.
- Biased locking was removed in JDK 15 because it was introduced mainly to optimize code built on the two JDK 1.0 collections, and these collections are rarely used nowadays, while revoking the biased-lock state is rather expensive, so JDK 15 deprecated and disabled biased locking. The lock upgrade process from JDK 15 onward is therefore from lock-free to lightweight lock to heavyweight lock.
Write at the end
I had originally planned to also write about lock downgrading, but lock downgrading involves safepoints, and introducing safepoints drags in GC. I considered covering it, but on reflection a proper introduction to safepoints is not realistic here.
Reference documentation
- JEP 374: Deprecate and Disable Biased Locking https://openjdk.org/jeps/374
- In-depth analysis: lock upgrade process and lock status, you will understand after reading this! https://segmentfault.com/a/1190000022904663
- Must say about Java "lock" https://tech.meituan.com/2018/11/15/java-lock.html