
1. Introduction to synchronized

Synchronized is a Java keyword that provides a synchronization lock. In multi-threaded programming, multiple threads may compete for the same shared resource, usually called a critical resource, at the same time. Such a resource can be read and modified by several threads concurrently, but thread execution depends on CPU scheduling, whose order is not controllable, so a synchronization mechanism is needed to control access to shared resources. This is why the synchronization lock synchronized came into being.

2. How to solve the problem of thread concurrency safety

When multiple threads concurrently read and write a critical resource, thread safety issues arise. One solution is synchronized, mutually exclusive access: at any given moment, only one thread may access the critical resource. Note that when multiple threads execute the same method, the local variables inside the method are not critical resources, because each invocation stores them in the local variable table of that thread's private stack frame; they are not shared, so they cannot cause thread safety issues.
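The shared-counter problem and its synchronized fix can be sketched as follows (class and method names are hypothetical, not from the original article):

```java
// Hypothetical demo: two threads each increment a shared counter N times.
// Without synchronized, count++ is a non-atomic read-modify-write and
// updates can be lost; with synchronized methods the result is exact.
public class CounterDemo {
    private int count = 0;

    public synchronized void increment() { // locks on "this"
        count++;
    }

    public synchronized int get() {
        return count;
    }

    // Runs two threads incrementing the same counter and returns the total.
    public static int run(int perThread) {
        CounterDemo c = new CounterDemo();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) c.increment();
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(CounterDemo.run(10_000)); // prints 20000
    }
}
```

Removing the `synchronized` keyword from `increment()` would make the result nondeterministic (usually less than 20000).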

3. Synchronized usage

The synchronized keyword is mainly used in the following three ways:

  1. Modifying an instance method: locks on the current instance object (this). If multiple threads call the method on different instances, synchronization is not guaranteed.
  2. Modifying a static method: locks on the Class object of the current class, which must be acquired before entering the synchronized code. Because the lock is the Class object itself, synchronization is guaranteed even when multiple threads call the static method through different instances.
  3. Modifying a code block: locks on an explicitly specified object; the lock on that object must be acquired before entering the synchronized block.
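The three forms above can be sketched side by side (class name and return strings are hypothetical, for illustration only):

```java
public class SyncUsageDemo {
    private static final Object LOCK = new Object();

    // 1. Instance method: locks on "this", i.e. the current instance.
    public synchronized String instanceMethod() {
        return "locks this";
    }

    // 2. Static method: locks on the Class object (SyncUsageDemo.class),
    //    so it synchronizes across all instances of the class.
    public static synchronized String staticMethod() {
        return "locks SyncUsageDemo.class";
    }

    // 3. Block: locks on an explicitly chosen object.
    public String blockMethod() {
        synchronized (LOCK) {
            return "locks LOCK";
        }
    }
}
```

Note that `instanceMethod()` on two different instances uses two different locks, while `staticMethod()` always uses the single Class-object lock.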

4. Synchronized principle analysis

Let's first look at a synchronized code block through a simple case:

public class SynchTestDemo {
    
    public void print() {
        synchronized ("得物") {
            System.out.println("Hello World");
        }
    }
    
}

Synchronized is a Java keyword, so there is no way to look at its underlying source code directly; we can only inspect the compiled class file through disassembly.

First use the javac SynchTestDemo.java command to compile SynchTestDemo.java into SynchTestDemo.class; then use the javap -v SynchTestDemo.class command to disassemble the class file and obtain the bytecode instructions:

These bytecode instructions will not be explained one by one here; their meanings can be looked up in the JVM instruction manual. From the disassembly it can be seen that the monitorexit instruction actually appears twice: the first releases the lock on the normal path, and the second releases it when an exception occurs. This guarantees that the lock is always released and the thread cannot deadlock on it.
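Since the original figure is not reproduced here, below is a rough sketch of what `javap -v` prints for the print() method above (constant-pool indices and offsets are illustrative and will differ on your machine):

```
public void print();
  Code:
     0: ldc           #2   // String 得物 (the lock object)
     2: dup
     3: astore_1
     4: monitorenter        // acquire the monitor of the lock object
     5: getstatic     #3   // Field java/lang/System.out
     8: ldc           #4   // String Hello World
    10: invokevirtual #5   // Method java/io/PrintStream.println
    13: aload_1
    14: monitorexit         // normal path: release the monitor
    15: goto          23
    18: astore_2
    19: aload_1
    20: monitorexit         // exception path: release the monitor
    21: aload_2
    22: athrow
    23: return
```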

monitorenter

First, let's look at the JVM specification's description of monitorenter:

In translation: every object has a monitor associated with it. While the monitor is held, the object is in a locked state and other threads cannot acquire the monitor. When the JVM executes a monitorenter instruction inside a method on some thread, it tries to obtain ownership of the corresponding monitor, as follows:

  1. If the monitor's entry count is 0, the thread enters the monitor, the entry count is set to 1, and the thread becomes the monitor's owner;
  2. If the thread already owns the monitor and is simply re-entering, the entry count is incremented by 1;
  3. If the monitor is owned by another thread, the thread blocks until the entry count drops to 0, then retries to acquire ownership of the monitor.
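The reentrancy described in point 2 can be observed directly: a synchronized method may call another synchronized method on the same object without deadlocking, because the monitor's entry count is simply incremented and later decremented. A minimal sketch (class and method names hypothetical):

```java
public class ReentrantDemo {
    public synchronized String outer() {
        // Already holding this object's monitor; entry count is 1 here.
        return inner(); // re-entering raises the count to 2, no deadlock
    }

    public synchronized String inner() {
        // Entry count drops back to 1 when this method returns,
        // and to 0 when outer() returns.
        return "re-entered";
    }
}
```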

monitorexit

We can also look at the JVM specification's description of monitorexit:

The translation is:

  1. Only a thread that owns the monitor of the current object may execute the monitorexit instruction;
  2. Executing monitorexit decrements the monitor's entry count by 1. When the count reaches 0, the current thread exits the monitor and no longer owns it; at that point, other threads blocked on this monitor may try to acquire ownership.

When the synchronized keyword is compiled to bytecode, it is translated into the two instructions monitorenter and monitorexit, placed at the beginning and end of the synchronized block's logic, as shown in the following figure:

Each synchronization object has its own Monitor (monitor lock), the locking process is shown in the following figure:

From the above we can see the implementation principle of synchronized: its bottom layer is the monitor object. The wait/notify methods are in fact also implemented through the monitor object, which is why they can only be called inside a synchronized block or method; otherwise java.lang.IllegalMonitorStateException is thrown.
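This behavior is easy to verify: calling wait() without owning the object's monitor fails immediately (class and method names hypothetical):

```java
public class MonitorStateDemo {
    // wait()/notify() require the caller to own the object's monitor,
    // so calling wait() outside a synchronized block on that object
    // throws IllegalMonitorStateException.
    public static boolean waitWithoutMonitor() {
        Object lock = new Object();
        try {
            lock.wait(); // not inside synchronized (lock)
            return false;
        } catch (IllegalMonitorStateException e) {
            return true; // expected: monitor not owned
        } catch (InterruptedException e) {
            return false;
        }
    }
}
```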

Let's take a look at the synchronization method through a simple case:

public class SynchTestDemo {
    
    public synchronized void print() {
        System.out.println("Hello World");
    }
    
}

Disassembling in the same way as above, we can look at this method's bytecode:

From the disassembly it can be seen that a synchronized method is not implemented with the monitorenter and monitorexit instructions. Instead, compared with an ordinary method, its access flags carry an extra ACC_SYNCHRONIZED identifier, and the JVM implements method synchronization based on this flag.

When the method is called, the JVM checks whether the ACC_SYNCHRONIZED flag is set. If it is, the executing thread first acquires the monitor, executes the method body only after the acquisition succeeds, and releases the monitor when the method completes. While the method executes, no other thread can obtain the same monitor object.

In essence, the two synchronization forms are the same. Executing these instructions leads the JVM to call the operating system's mutex primitives; a blocked thread is suspended and waits to be rescheduled, which forces switches between "user mode" and "kernel mode" and has a large impact on performance.

5. What is a monitor?

A monitor is usually described as an object and can be understood as a synchronization tool or a synchronization mechanism. Every Java object comes with a lock when it is created with new: the monitor lock, i.e. the object lock. Its state lives in the object header (Mark Word); when the lock flag bits are 10, the header stores a pointer to the start address of the monitor object. In the HotSpot JVM, the monitor is implemented by the underlying C++ class ObjectMonitor:

ObjectMonitor() {
    _header = NULL;
    _count = 0;          // number of times the owner thread has acquired the lock
    _waiters = 0,
    _recursions = 0;     // re-entry count of the owning thread
    _object = NULL;      // the object this monitor is associated with
    _owner = NULL;       // identifies the thread that owns this monitor
    _WaitSet = NULL;     // threads in wait state are added to _WaitSet
    _WaitSetLock = 0;
    _Responsible = NULL;
    _succ = NULL;
    _cxq = NULL;         // one-way queue for threads contending for the lock
    FreeNext = NULL;
    _EntryList = NULL;   // threads blocked waiting for the lock are added to this list
    _SpinFreq = 0;
    _SpinClock = 0;
    OwnerIsThread = 0;
}
  1. _owner: initially NULL. When a thread takes the monitor, _owner is set to that thread's unique identifier; when the thread releases the monitor, _owner returns to NULL. _owner is itself a critical resource, and the JVM guarantees its thread safety through CAS operations;
  2. _cxq: the contention queue. All threads requesting the lock are first placed into this singly linked queue. _cxq is also a critical resource, and the JVM modifies it with CAS atomic instructions: the old head of _cxq is stored into the new node's next field, and _cxq is pointed at the new node (the new thread). _cxq therefore behaves as a last-in, first-out stack;
  3. _EntryList: threads in the _cxq queue that become eligible candidates for the lock are moved into this list;
  4. _WaitSet: threads blocked because they called the wait method are placed in this set.

Let's use an example to analyze the difference between the _cxq queue and the _EntryList queue:

public void print() throws InterruptedException {
    synchronized (obj) {
        System.out.println("Hello World");
        //obj.wait();
    }
}

Suppose the code above is executed by multiple threads. Thread t1 enters the synchronized block first and acquires the lock. Then thread t2 tries to execute the same code; it fails to grab the lock and enters the _cxq queue to wait. Next, thread t3 arrives; naturally it also fails to grab the lock and likewise enters _cxq to wait. When t1 finishes the synchronized block and releases the lock, any of t1, t2, or t3 may grab it. If t1 grabs it again, the threads t2 and t3 that previously entered _cxq are moved into _EntryList to wait. If a new thread t4 arrives at this point and fails to grab the lock, it still enters _cxq to wait.

Let's analyze the _WaitSet queue and the _EntryList queue in detail below:

The markOop->monitor() of each object can return the ObjectMonitor. Each thread waiting for a lock has an ObjectWaiter object, which stores the thread and is used to park/unpark it; ObjectWaiter is a node in a doubly linked list structure.

Combining this with the monitor structure diagram above: when the owner thread finishes executing, it releases the lock, and at that point either a blocked thread or an awakened waiting thread may grab it. In the JVM, each thread waiting for a lock is encapsulated into an ObjectWaiter object; _owner identifies the thread that owns the monitor, while _EntryList and _WaitSet hold lists of ObjectWaiter objects. The biggest difference between them is that _EntryList stores threads blocked waiting for the lock, while _WaitSet stores threads in the wait state.

When multiple threads access the same piece of code at the same time:

  • First, the thread enters the _EntryList collection. When a thread obtains the object's monitor, _owner in the monitor is set to the current thread and the counter _count is incremented by 1.
  • If the thread calls the wait() method, it releases the monitor it currently holds, _owner is set back to null, _count is decremented by 1, and the thread enters _WaitSet to wait to be awakened.
  • When the current thread finishes executing, it likewise releases the monitor and _count is reset, so that other threads can acquire the lock.

The monitor lives in the object header (Mark Word) of every Java object, which is why any Java object can be used as a lock. Because notify/notifyAll/wait use the monitor lock object, they must be called inside a synchronized block. Under multithreading, when threads need to access critical resources at the same time, the monitor ensures that only one thread accesses the shared data at any moment.
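The _owner/_count/_WaitSet transitions above correspond to ordinary wait/notify code. A minimal sketch (class name hypothetical; the Thread.sleep is only to make the demo deterministic, not a synchronization technique):

```java
public class WaitNotifyDemo {
    private final Object lock = new Object();
    private boolean ready = false;

    public void await() throws InterruptedException {
        synchronized (lock) {      // monitorenter: _owner = this thread
            while (!ready) {       // guard against spurious wakeups
                lock.wait();       // release the monitor, enter _WaitSet
            }
        }                          // monitorexit: _owner back to null
    }

    public void produce() {
        synchronized (lock) {
            ready = true;
            lock.notify();         // wake one thread from the _WaitSet
        }
    }

    // Starts a waiter, signals it, and reports whether it finished.
    public static boolean demo() {
        WaitNotifyDemo d = new WaitNotifyDemo();
        Thread waiter = new Thread(() -> {
            try { d.await(); } catch (InterruptedException ignored) { }
        });
        waiter.start();
        try {
            Thread.sleep(100);     // give the waiter time to park (demo only)
            d.produce();
            waiter.join(2000);
        } catch (InterruptedException ignored) { }
        return !waiter.isAlive();  // true if the waiter was woken and exited
    }
}
```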

This raises a question: synchronized is an object lock, and the lock is attached to the object, so how is the state of the lock recorded? The answer is that the lock state is recorded in each object's object header (Mark Word). So what is the object header?

6. What is the object header

In the JVM, an object's layout in memory is divided into three areas: object header, instance data, and alignment padding, as shown below:

The object header itself has two parts. The first part (Mark Word) stores the object's own runtime data: hash code, GC generational age, lock status flags, the lock held by a thread, biased thread ID, biased timestamp, and so on. The other part is the type pointer (Klass pointer), which points to the object's class metadata; the virtual machine uses this pointer to determine which class the object is an instance of.

Class<? extends SynchTestDemo> synchClass = synchTestDemo.getClass();

It is worth noting that the class metadata lives in the method area and is different from the synchClass bytecode object on the heap. synchClass can be understood as follows: after the class is loaded, the JVM exposes the class information as an object on the heap, and reflection is used to access all of that information (functions and fields included). Internally, however, most JVM structures are implemented in C++; when the JVM itself needs class information, it follows the type pointer in the object header to the class metadata in the method area.

Instance data: stores the class's field data, including fields inherited from the parent class.

Alignment padding: the virtual machine requires that an object's starting address be an integer multiple of 8 bytes. The padding does not necessarily exist; it is only for byte alignment.

Let's take a look at the structure of the object header:

In a 32-bit virtual machine, Mark Word is 32-bit in size, and its storage structure is as follows:

In a 64-bit virtual machine, Mark Word is 64-bit in size, and its storage structure is as follows:

Nowadays virtual machines are basically 64-bit, and a full 64-bit pointer in the object header wastes space, so the JVM enables pointer compression by default and the Klass pointer is recorded in 32 bits. You can also control pointer compression with the following JVM parameters:

Enable compressed pointers: -XX:+UseCompressedOops
Disable compressed pointers: -XX:-UseCompressedOops

So why does the JVM enable pointer compression by default? On a 32-bit JVM, the Klass pointer in the object header is stored in 4 bytes, while on a 64-bit JVM it would take 8 bytes; ordinary object references behave the same way, 4 bytes on 32-bit and 8 bytes on 64-bit. An engineering project can have millions of objects, and if every pointer took 8 bytes, those objects would invisibly consume a great deal of extra space, putting pressure on the heap, filling it faster, and triggering GC more easily. The main purpose of pointer compression is to shrink the memory addresses stored per object, so that the same heap size can hold more objects.

Here is one additional small point: the object header uses 4 bits to store the object's generational age. 2 to the 4th power is 16, giving a range of 0 to 15, which explains why, during GC, the JVM's default threshold for promoting an object from the young generation to the old generation is an age of 15.

7. Optimization of synchronized locks

The operating system is divided into "user space" and "kernel space", and the JVM runs in "user mode". Before JDK 1.6, using a synchronized lock required calls into the underlying operating system: the monitor blocks and wakes threads, and blocking and waking require the CPU to switch from "user mode" to "kernel mode". Frequent blocking and waking is a heavy burden for the CPU and puts a lot of pressure on the system's concurrent performance, and each user-mode/kernel-mode switch itself degrades performance and efficiency. That is why synchronized before JDK 1.6 was a heavyweight lock, as shown below:

Then Doug Lea, a professor at the State University of New York, seeing that the JDK's synchronized performance was relatively poor, implemented the AQS-based ReentrantLock in pure Java (of course its bottom layer still calls lower-level primitives), as shown in the following figure. It is fair to say that ReentrantLock appeared precisely to make up for the various deficiencies of synchronized.

Because of these serious performance shortcomings, Oracle officially upgraded synchronized after JDK 1.6. The entire lock upgrade process is shown in the figure above, and it introduces the following terms:

No lock

No lock means the resource is not locked at all: all threads can access and attempt to modify the same resource, but only one thread succeeds at a time. The bottom layer is implemented with CAS. Lock-free techniques cannot replace locks in every situation, but in some scenarios their performance is very high.
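The CAS retry loop mentioned above can be sketched with the JDK's AtomicInteger (class and method names in this demo are hypothetical; compareAndSet is the real JDK API):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    // Lock-free increment: read the old value, then attempt to swap in
    // old + 1. compareAndSet only succeeds if no other thread changed
    // the value in between; on failure we simply retry.
    public static int casIncrement(AtomicInteger value) {
        int old;
        do {
            old = value.get();
        } while (!value.compareAndSet(old, old + 1));
        return old + 1;
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);
        System.out.println(casIncrement(counter)); // prints 1
    }
}
```

No thread ever blocks here; a losing thread just loops again, which is why this pattern shines when contention is brief.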

Biased lock (no lock -> biased lock)

The "bias" of the bias lock is the "bias" of eccentricity and the "bias" of favoritism. It means that the lock will be biased to the first thread to obtain it, and the thread ID of the lock bias will be stored in the object header. When a thread enters and exits a synchronized block, it only needs to check whether it is a bias lock, lock flag bit, and ThreadID.

Starting from the lock-free state, the JVM by default enables an "anonymous" bias: before any thread holds the lock, an anonymously biased lock is set up in advance. Once a thread holds the lock, it uses a CAS operation to write its thread ID into the high 23 bits of the object's Mark Word [on a 32-bit virtual machine]. If that same thread competes for the lock again later, only the ThreadID needs checking. The goal is to reduce the unnecessary lightweight-lock execution path when there is no real multi-thread competition: acquiring and releasing a lightweight lock relies on multiple CAS atomic instructions, whereas a biased lock needs only a single CAS instruction when installing the ThreadID.

Lightweight lock (biased lock -> lightweight lock)

When threads execute the synchronized block alternately and competition is not fierce, the biased lock is upgraded to a lightweight lock. In most cases the lock is repeatedly acquired by the same thread with no multi-thread competition, which is exactly what the biased lock is for: improving performance when only one thread executes the synchronized block. When a thread enters the block and acquires the lock, its thread ID is stored in the Mark Word; on subsequent entries and exits, the thread no longer locks and unlocks via CAS but simply checks whether the Mark Word stores a bias pointing at the current thread. Biased locks were introduced to minimize the unnecessary lightweight-lock path in the absence of multi-threaded competition, since lightweight lock acquisition and release need multiple CAS atomic instructions while a biased lock needs only one CAS to install the ThreadID. When a biased lock is revoked, the object returns to the unlocked state (flag bits "01") or becomes a lightweight lock (flag bits "00").

Spin lock

In many scenarios, the locked state of a shared resource lasts only a short time, and it is not worth blocking and waking threads for that interval. If the physical machine has more than one processor, so that two or more threads can run in parallel, we can let the thread requesting the lock "wait a moment" without giving up its processor time, to see whether the thread holding the lock releases it soon. To make the thread wait, we simply have it execute a busy loop (spin); this is the spin lock.
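The idea can be sketched as a minimal user-level spin lock built on an AtomicBoolean (class name hypothetical; this is an illustration of spinning, not how the JVM implements its internal spin locks; Thread.onSpinWait requires JDK 9+):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLockDemo {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // Busy-wait (spin) instead of blocking the thread: keep trying
        // to flip the flag from false to true until we succeed.
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint to the CPU that we are spinning
        }
    }

    public void unlock() {
        locked.set(false);
    }
}
```

If the lock is held only briefly, the spinning thread acquires it after a few iterations without ever being descheduled; if the hold time were long, this loop would just burn CPU, which is exactly the trade-off discussed below.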

When threads t1 and t2 compete for the same lock at the same time and t1 grabs it first, the lock is not immediately upgraded to a heavyweight lock. Instead, t2 spins a few times (the default spin count is 10, changeable with the parameter -XX:PreBlockSpin). If t2 exceeds the maximum spin count, it is suspended in the traditional way and the lock is upgraded to a heavyweight lock.

Spin waiting cannot replace blocking, and for now it requires at least two processor cores. Although spinning itself avoids the overhead of thread switching, it occupies processor time: if the lock is held only briefly, spin waiting works very well; if the lock is held for a long time, the spinning thread just consumes processor resources without doing any useful work, wasting performance.

Spin locks were introduced in JDK 1.4 but turned off by default; they could be enabled with the -XX:+UseSpinning parameter. Since JDK 1.6, spin locks are enabled by default.

Heavyweight lock

When the lock is upgraded to a heavyweight lock, the lock flag bits become "10" and the Mark Word stores a pointer to the heavyweight lock. Threads waiting for the lock then enter the blocked state.

Lock elimination

Lock elimination means that when the just-in-time compiler (JIT) runs, it removes locks on code that requires synchronization in source form but where it detects that no shared-data contention can actually occur. The main basis for lock elimination is escape analysis: if the JIT determines that, in a piece of code, none of the data on the heap escapes to be accessed by other threads, that data can be treated as stack data, private to the thread, and synchronization locking is naturally unnecessary.

public class SynchRemoveDemo {
    public static void main(String[] args) {
        stringContact("AA", "BB", "CC");
    }
    public static String stringContact(String s1, String s2, String s3) {
        StringBuffer sb = new StringBuffer();
        return sb.append(s1).append(s2).append(s3).toString();
    }
}

// source code of the append() method
@Override
public synchronized StringBuffer append(String str) {
   toStringCache = null;
   super.append(str);
   return this;
}

StringBuffer's append() is a synchronized method whose lock is this, i.e. the sb object. The virtual machine finds that sb's dynamic scope is confined to the stringContact() method. In other words, the reference to sb never "escapes" outside stringContact() and no other thread can ever access it. So although there is a lock here, it can be safely eliminated; after just-in-time compilation, this piece of code ignores all synchronization and executes directly.

By the way, a small JVM side note, "object escape analysis": it analyzes the dynamic scope of an object. When an object is defined inside a method, it may be referenced by external code, for example by being passed as an argument to other methods. If escape analysis determines the object is never accessed externally, the object may be allocated on the stack instead, so its memory is destroyed when the stack frame pops, reducing garbage collection pressure. The sb object above never escapes stringContact(), so it may be allocated on the thread's stack, though only "may". This topic is worth studying further on your own.

Lock coarsening

When the JVM detects that a series of consecutive small operations all lock on the same object, it enlarges (coarsens) the scope of the synchronized region and places it outside the whole series, so that the lock is taken only once. Consider the following example:

public class SynchDemo {
    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < 50; i++) {
            sb.append("AA");
        }
        System.out.println(sb.toString());
    }
}

// source code of the append() method
@Override
public synchronized StringBuffer append(String str) {
   toStringCache = null;
   super.append(str);
   return this;
}

StringBuffer's append() is a synchronized method. As the code above shows, every loop iteration takes the lock inside append(). Through its analysis, the VM effectively transforms this into the following: the synchronized lock inside append() is removed, and a single lock is placed directly around the for loop.

public class SynchDemo {
    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer();
        synchronized (sb) {
            for (int i = 0; i < 50; i++) {
                sb.append("AA");
            }
        }
        System.out.println(sb.toString());
    }
}

// the append() method after coarsening (the method-level synchronized is conceptually removed)
@Override
public StringBuffer append(String str) {
   toStringCache = null;
   super.append(str);
   return this;
}

8. Analyze the lock upgrade process through the object header

You can observe the changes to the object header during lock upgrades with an object-header analysis tool: JOL (Java Object Layout), an OpenJDK open-source toolkit for analyzing runtime object-header lock state. Introduce the following Maven dependency:

<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.10</version>
</dependency>

Observe the object header in the unlocked state [no lock]:

public static void main(String[] args) throws InterruptedException {
    Object object = new Object();
    System.out.println(ClassLayout.parseInstance(object).toPrintable());
}

Running result:

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)          line 1: Mark Word
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)          line 2: Mark Word
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243) line 3: Klass pointer
     12     4        (loss due to the next object alignment)                                                                  line 4: alignment padding
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Here is a one-time explanation of the printed fields (later outputs will not be re-analyzed in detail):

OFFSET: memory address offset

SIZE: the size in bytes of this entry

Instance size: 16 bytes: the total size of this new Object instance

Since the current machine runs a 64-bit operating system, the first two lines represent the Mark Word of the object header, as annotated in the output above: exactly 8 bytes, each byte being 8 bits, i.e. exactly 64 bits. Comparing with the 32-bit and 64-bit object-header layouts shown earlier, the first line is the one to watch when analyzing the object header's lock upgrades.

The third line is the type pointer (which, as mentioned above, points to the class metadata in the method area), also annotated in the output. On a 64-bit machine the Klass pointer defaults to 8 bytes; here it is 4 bytes because of pointer compression.

The fourth line is the alignment padding, which sometimes exists and sometimes does not. The JVM needs to ensure the object size is an integer multiple of 8 bytes; computing in 8-byte multiples at the hardware level improves the efficiency of object storage.

It can be observed that this new Object actually needs only 12 bytes, so 4 bytes of padding are added to bring the Object instance to 16 bytes, an integer multiple of 8 bytes.

The JVM uses little-endian byte order, so to read the header we convert to big-endian. The conversion is shown in the following figure:

It can be seen that the object starts out unlocked, observable from the trailing "001" bits. The preceding 25 bits would hold the hash code, so why are they all 0 here? The hash code is computed lazily (only when it is first requested), so the high 25 bits of this fresh object contain no hash code yet.

Observe the object header when locking without contention [no lock -> biased lock]:

public static void main(String[] args) throws InterruptedException {
    Object object = new Object();
    System.out.println(ClassLayout.parseInstance(object).toPrintable());
    synchronized (object) {
        System.out.println(ClassLayout.parseInstance(object).toPrintable());
    }
}

Running result (the JVM defaults to little-endian):

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           90 39 62 05 (10010000 00111001 01100010 00000101) (90323344)
      4     4        (object header)                           00 70 00 00 (00000000 01110000 00000000 00000000) (28672)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Analysis of running results:

From the running results, the first printout shows the "001" unlocked state, but the later printout shows "000", which is not the biased-lock state: checking the table above, "000" is directly the lightweight-lock state. The reason is that at JVM startup many internal threads execute synchronized code, and to avoid the pointless performance overhead of the lock-upgrade path (biased lock -> lightweight lock -> heavyweight lock) during startup, the JVM by default delays enabling biased locking. Simply adding a delay before the code lets us observe the biased lock:

public static void main(String[] args) throws InterruptedException {
    TimeUnit.SECONDS.sleep(6);
    Object o = new Object();
    System.out.println(ClassLayout.parseInstance(o).toPrintable());
    synchronized (o) {
        System.out.println(ClassLayout.parseInstance(o).toPrintable());
    }
}
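As an aside, sleeping is not the only way past the delay: HotSpot also exposes it as the flag `-XX:BiasedLockingStartupDelay=0` (biased locking itself is controlled by `-XX:±UseBiasedLocking`, and note that JEP 374 deprecated the feature in JDK 15 and it was removed later). Below is a stdlib-only sketch meant to be run with that flag; it only confirms monitor ownership rather than dumping the mark word, since dumping the header would again require JOL on the classpath:

```java
public class BiasedLockFlagDemo {
    // Intended to be run as: java -XX:BiasedLockingStartupDelay=0 BiasedLockFlagDemo
    // (meaningful on JDK 8-14; biased locking no longer exists on recent JDKs)
    public static void main(String[] args) {
        Object o = new Object();
        synchronized (o) {
            // inside the block the current thread owns o's monitor,
            // regardless of whether the lock is biased or lightweight
            System.out.println(Thread.holdsLock(o)); // prints: true
        }
    }
}
```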

Operation result (JVM default little endian mode):

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 90 80 de (00000101 10010000 10000000 11011110) (-561999867)
      4     4        (object header)                           b2 7f 00 00 (10110010 01111111 00000000 00000000) (32690)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Comparing the results with biased locking disabled and enabled:

Biased locking disabled (big-endian), unlocked: 00000000 00000000 00000000 00000001
Biased locking enabled (big-endian), unlocked:  00000000 00000000 00000000 00000101
Biased locking enabled (big-endian), locked:    11011110 10000000 10010000 00000101

With biased locking enabled, the unlocked object header already carries the biased-lock bit pattern; this is called an anonymous bias (the biasable state), meaning the lock can be biased but is not yet biased toward anyone. The thread-id bits of the header are all zero, so no thread has been recorded yet: the object is simply ready for biasing, and the first thread to acquire the lock can record its thread id into those bits with a single CAS operation.
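The CAS step just described can be sketched with an `AtomicLong` standing in for the mark word's thread-id field. This is a deliberate simplification for illustration (the real mark word also packs the epoch, age, and lock bits), and the class and method names are invented:

```java
import java.util.concurrent.atomic.AtomicLong;

public class BiasSketch {
    // 0 = anonymously biased: biasable, but no thread id recorded yet
    private final AtomicLong biasedThreadId = new AtomicLong(0);

    /** Try to bias the "lock" toward the calling thread with one CAS. */
    public boolean tryBias() {
        long self = Thread.currentThread().getId();
        // the CAS succeeds only while still anonymously biased;
        // afterwards, only the owning thread re-enters for free
        return biasedThreadId.compareAndSet(0, self)
                || biasedThreadId.get() == self;
    }

    public static void main(String[] args) {
        BiasSketch s = new BiasSketch();
        System.out.println(s.tryBias()); // first caller wins the bias: true
        System.out.println(s.tryBias()); // re-entry by the same thread: true
    }
}
```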

Observe the object header under locking with mild contention [biased lock -> lightweight lock]:

public static void main(String[] args) throws InterruptedException {

    Thread.sleep(5000);

    Object object = new Object();

    // main thread
    System.out.println(ClassLayout.parseInstance(object).toPrintable());

    // thread t1
    new Thread(() -> {
        synchronized (object) {
            System.out.println(ClassLayout.parseInstance(object).toPrintable());
        }
    }, "t1").start();

    Thread.sleep(2000);

    // main thread
    System.out.println(ClassLayout.parseInstance(object).toPrintable());
    // thread t2
    new Thread(() -> {
        synchronized (object) {
            System.out.println(ClassLayout.parseInstance(object).toPrintable());
        }
    }, "t2").start();
}

Operation result (JVM default little endian mode):

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)                // printed by main thread
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 90 94 2d (00000101 10010000 10010100 00101101) (764710917)        // printed by t1
      4     4        (object header)                           c9 7f 00 00 (11001001 01111111 00000000 00000000) (32713)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 90 94 2d (00000101 10010000 10010100 00101101) (764710917)       // printed by main thread
      4     4        (object header)                           c9 7f 00 00 (11001001 01111111 00000000 00000000) (32713)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           08 a9 d5 07 (00001000 10101001 11010101 00000111) (131442952)        // printed by t2
      4     4        (object header)                           00 70 00 00 (00000000 01110000 00000000 00000000) (28672)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Analysis of running results:

At the start, the object header printed by the main thread shows an anonymous bias;

Then thread t1 prints the object header. Comparing it with the first printout, t1's header is still a biased lock, but its thread-id bits now record t1's thread id;

Control then returns to the main thread, which still prints the header data t1 left behind. In other words, once a biased lock has been biased toward a thread it cannot simply be re-biased: the header keeps recording the thread it was previously biased to;

Finally, thread t2 prints the object header, and this last printout shows the lock has been upgraded to a lightweight lock: t1 and t2 entered the synchronized block on object alternately, but without fierce contention, so the lock was upgraded only as far as a lightweight lock.

Observe the object header through the whole upgrade from unlocked to heavyweight lock [unlocked -> heavyweight lock]:

// assumes: import static java.lang.Thread.sleep;
public static void main(String[] args) throws InterruptedException {
    sleep(5000);
    Object object = new Object();

    System.out.println(ClassLayout.parseInstance(object).toPrintable());

    new Thread(() -> {
        synchronized (object) {
            System.out.println(ClassLayout.parseInstance(object).toPrintable());
            // hold the lock longer to create contention
            try {
                sleep(5000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }, "t0").start();

    sleep(5000);

    new Thread(() -> {
        synchronized (object) {
            System.out.println(ClassLayout.parseInstance(object).toPrintable());
            // hold the lock longer to create contention
            try {
                sleep(5000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }, "t1").start();

    new Thread(() -> {
        synchronized (object) {
            System.out.println(ClassLayout.parseInstance(object).toPrintable());
            try {
                sleep(2000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }, "t2").start();
}

Operation result:

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)            // printed by main thread
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           05 d8 8f ef (00000101 11011000 10001111 11101111) (-275785723)    // printed by t0
      4     4        (object header)                           ce 7f 00 00 (11001110 01111111 00000000 00000000) (32718)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           00 e9 a9 09 (00000000 11101001 10101001 00001001) (162130176)    // printed by t1
      4     4        (object header)                           ce 7f 00 00 (11001110 01111111 00000000 00000000) (32718)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           0a d8 80 f0 (00001010 11011000 10000000 11110000) (-259991542)    // printed by t2
      4     4        (object header)                           ce 7f 00 00 (11001110 01111111 00000000 00000000) (32718)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Analysis of running results (JVM default little-endian mode):

The program sleeps for five seconds at the start so that the JVM finishes its startup work first, since by default the JVM delays enabling biased locking. The main thread's first printout, ending in "101", is the default anonymous bias: the biased-lock pattern with no thread id set. Thread t0 then prints immediately; at that point a single CAS is enough to install t0's thread id in the object header, so this printout is still the biased state. After the program sleeps another five seconds, threads t1 and t2 each deliberately sleep inside the synchronized block, so whichever thread grabs the lock first forces the other to spin-wait. Accordingly, t1 prints "00", already a lightweight lock. Finally, t2 prints "10": the lock has been upgraded to a heavyweight lock, evidently because t2 exceeded the maximum spin count while waiting.
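The spin-then-inflate behaviour can be imitated with a toy lock: spin on a CAS for a bounded number of attempts (the lightweight path), then fall back to parking the thread (the heavyweight path). This is purely an illustrative sketch, not how HotSpot implements monitors; `MAX_SPINS` is an invented constant, whereas HotSpot tunes its spinning adaptively:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

public class SpinThenParkLock {
    private static final int MAX_SPINS = 1000; // invented bound for the sketch
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    public void lock() {
        Thread self = Thread.currentThread();
        int spins = 0;
        // fast path: spin, hoping the owner releases quickly
        while (!owner.compareAndSet(null, self)) {
            if (++spins > MAX_SPINS) {
                // slow path: stop burning CPU and briefly block instead
                LockSupport.parkNanos(1_000_000);
                spins = 0;
            }
        }
    }

    public void unlock() {
        owner.set(null);
    }

    public static void main(String[] args) throws InterruptedException {
        SpinThenParkLock lock = new SpinThenParkLock();
        int[] counter = {0};
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                lock.lock();
                try { counter[0]++; } finally { lock.unlock(); }
            }
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start(); a.join(); b.join();
        System.out.println(counter[0]); // 200000: the toy lock is mutually exclusive
    }
}
```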

Nine, summary

How to optimize synchronized when writing code?

My summary is:

1. Reduce the scope of synchronized: keep the synchronized code block as short as possible, shortening the time spent executing inside it and thereby reducing lock contention.
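Point 1 can be illustrated by hoisting work that touches no shared state out of the synchronized block. The names here (`expensive`, `addSlow`, `addFast`) are made up for the sketch:

```java
public class NarrowScope {
    private long total;

    // Coarse: the computation runs while holding the lock,
    // even though it touches no shared state
    public synchronized void addSlow(int n) {
        total += expensive(n);
    }

    // Better: compute first, then lock only the shared-state update
    public void addFast(int n) {
        long v = expensive(n); // outside the lock
        synchronized (this) {
            total += v;
        }
    }

    private long expensive(int n) {
        long r = 0;
        for (int i = 0; i < n; i++) r += i; // stands in for real work
        return r;
    }

    public synchronized long total() { return total; }

    public static void main(String[] args) {
        NarrowScope s = new NarrowScope();
        s.addSlow(10); // adds 45
        s.addFast(10); // adds 45
        System.out.println(s.total()); // prints: 90
    }
}
```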

2. Reduce the granularity of the synchronized lock: split one lock into several to improve concurrency. The underlying designs of Hashtable and ConcurrentHashMap are a good reference.

Hashtable locks the entire hash table: while one operation is in progress, no other operation can proceed.

ConcurrentHashMap, by contrast, locks only part of the table: in its segmented design (used before JDK 8), each lock covers one segment, so locking the current segment does not affect operations on other segments.
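The segment idea can be sketched as lock striping: hash each key to one of N stripes and synchronize only on that stripe's lock, so operations on different stripes proceed in parallel. `StripedCounter` and its stripe count are invented for this sketch:

```java
public class StripedCounter {
    private final Object[] locks;
    private final long[] counts;

    public StripedCounter(int stripes) {
        locks = new Object[stripes];
        counts = new long[stripes];
        for (int i = 0; i < stripes; i++) locks[i] = new Object();
    }

    public void increment(int key) {
        int stripe = Math.floorMod(key, locks.length);
        synchronized (locks[stripe]) { // lock one stripe, not the whole structure
            counts[stripe]++;
        }
    }

    public long total() {
        long sum = 0;
        for (int i = 0; i < locks.length; i++) {
            synchronized (locks[i]) { sum += counts[i]; }
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        StripedCounter c = new StripedCounter(16);
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) c.increment(i);
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start(); a.join(); b.join();
        System.out.println(c.total()); // prints: 200000
    }
}
```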

Text/harmony


