"Deep Understanding of Java Virtual Machine: Advanced Features and Best Practices of JVM" (Part 2)

💡 Learning without thinking is worthless, thinking without learning is perilous. - Confucius

This article covers "In-depth Understanding of Java Virtual Machine: JVM Advanced Features and Best Practices" by Zhou Zhiming, a very hard-core book!

The book is divided into 5 parts that center on memory management, the execution subsystem, program compilation and optimization, and efficient concurrency. Together they give a comprehensive, in-depth analysis of the JVM and reveal how it works.

The whole book consists of 5 parts and thirteen chapters, 358,929 words in total. The overall structure is so clear that it is hard to excerpt when writing reading notes (I even wanted to copy out the whole book). The following notes start with the third part of the book; I hope readers will savor them!

1. Part III Virtual Machine Execution Subsystem

The conversion of the result of code compilation from native machine code to bytecode is a small step in the development of storage formats, but a giant step in the development of programming languages

Chapter 6 Class File Structure

Computers only know 0 and 1, so the program we write needs to be translated into binary format consisting of 0 and 1 by the compiler before it can be executed by the computer.

1) The cornerstone of platform independence

Virtual machines for each platform, together with the program storage format shared by all platforms, bytecode (ByteCode), form the cornerstone of platform independence.

2) The structure of the Class file

Any Class file corresponds to the definition information of a unique class or interface, but conversely, the class or interface does not necessarily have to be defined in the file (for example, the class or interface can also be directly generated by the class loader)

The Class file is a binary stream based on 8-bit bytes. The data items are packed into the Class file in strict order, with no separators between them, so almost everything stored in the Class file is data the program needs in order to run.

The Class file format uses a pseudo-structure similar to the C language structure to store data. There are only two data types in this pseudo-structure:

Unsigned numbers: basic data types that can be used to describe numbers, index references, quantity values, or string values encoded in UTF-8
Tables: composite data types composed of multiple unsigned numbers or other tables as data items; by convention, all table names end with _info.

1. Magic number and version of Class file

The first 4 bytes of each Class file are called the magic number (0xCAFEBABE), and its only function is to determine whether the file is a Class file that can be accepted by the virtual machine.

The 4 bytes following the magic number store the version number of the Class file: the 5th and 6th bytes are the minor version, and the 7th and 8th bytes are the major version number.

Java major version numbers start at 45, and the major version number increases by 1 with each major JDK release after JDK 1.1.

2. Constant pool

The constant pool entry comes right after the major and minor version numbers.

The number of constants in the constant pool is not fixed, so a u2 type of data will be placed at the entry of the constant pool, representing the constant pool capacity count value.

The constant pool capacity (offset address: 0x00000008) is hexadecimal 0x0016, which is 22 in decimal, which means that there are 21 constants in the constant pool, and the index value ranges from 1 to 21.
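The layout described so far (magic number, minor version, major version, constant pool count) can be read back with a few lines of Java. This is only a minimal sketch: the class name ClassHeaderDemo and the hand-written 10-byte prefix are made up for illustration, and the major version 0x32 in it is an arbitrary assumption, not a value taken from the book's example.

```java
import java.nio.ByteBuffer;

final class ClassHeaderDemo {
    /** Returns {minor_version, major_version, constant_pool_count}, read in file order. */
    static int[] readHeader(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes);   // ByteBuffer is big-endian by default
        int magic = in.getInt();                       // u4, bytes 0-3
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a Class file");
        }
        int minor = in.getShort() & 0xFFFF;            // u2, bytes 4-5
        int major = in.getShort() & 0xFFFF;            // u2, bytes 6-7
        int poolCount = in.getShort() & 0xFFFF;        // u2, bytes 8-9 (offset 0x00000008)
        return new int[] {minor, major, poolCount};
    }

    public static void main(String[] args) {
        // Hypothetical prefix: magic, minor 0, major 0x32, constant pool count 0x0016.
        byte[] prefix = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, 0x00, 0x32, 0x00, 0x16
        };
        int[] header = readHeader(prefix);
        // The count is one larger than the number of constants, because index 0 is not used.
        System.out.println("major=" + header[1] + " constants=" + (header[2] - 1));
    }
}
```

With the prefix above the program prints major=50 constants=21, matching the 0x0016 example: a count of 22 means constants 1 to 21.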

The constant pool mainly stores two types of constants: literals and symbolic references

Symbolic references include three types of constants:

Fully qualified names of classes and interfaces
Field names and descriptors
Method names and descriptors

3. Access flags

After the end of the constant pool, the next two bytes represent the access flags (access_flags), which are used to identify some class or interface level access information

Is this Class a class or an interface
Is it defined as a public type
Is it defined as abstract type
If it is a class, is it declared final?

Figure: Access flags

4. Class index, parent class index and interface index collection

The class index and the parent class index are each a single u2 value, and the interface index collection is a set of u2 values. The Class file uses these three items to determine the inheritance relationship of the class.

Figure: Class index, parent class index and interface index collection

The three u2 values starting at offset 0x000000F1 are 0x0001, 0x0003 and 0x0000: the class index is 1, the parent class index is 3, and the interface index collection has size 0. The constants for the class and its parent class can then be looked up in the constant pool printed by the javap command.

5. Field table collection

Field tables are used to describe variables declared in an interface or class.

Figure: Field access flags

Descriptors for methods and fields are more complex than fully qualified names and simple names. A descriptor describes the data type of a field, or the parameter list of a method (number, types and order of parameters) and its return value. According to the descriptor rules, the basic data types (byte, char, double, float, int, long, short, boolean) and void, which represents no return value, are each represented by one uppercase character, while an object type is represented by the character L followed by the fully qualified name of the object's class.

Figure: Meaning of descriptor characters

For array types, each dimension will be described with a leading "[" character
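To see these rules in action without opening a Class file, the standard java.lang.invoke.MethodType class can print the descriptor for a given return type and parameter list. A small sketch; DescriptorDemo and its descriptor() helper are names invented for this note.

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {
    /** Builds the method descriptor for the given return type and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // int indexOf(char[] source, int offset, String target)
        System.out.println(descriptor(int.class, char[].class, int.class, String.class));
        // void run()
        System.out.println(descriptor(void.class));
        // long count(String[][] grid): one "[" per array dimension
        System.out.println(descriptor(long.class, String[][].class));
    }
}
```

The three lines printed are ([CILjava/lang/String;)I, ()V and ([[Ljava/lang/String;)J: parameters appear in order inside the parentheses and the return type follows them.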

6. Method table collection

The structure of the method table is the same as that of the field table, including access flags (access_flags), name index (name_index), descriptor index (descriptor_index), and attribute table sets (attributes) in turn.

The code in a method, after being compiled into bytecode instructions by the compiler, is stored in an attribute named Code in the method's attribute table collection. The attribute table is the most extensible data item in the Class file format.

If a parent class method is not overridden in the child class, the method information from the parent class does not appear in the child's method table collection. Conversely, methods added automatically by the compiler may appear, the most typical being the class constructor <clinit>() and the instance constructor <init>().

7. Attribute table collection

The restrictions on the attribute table collection are slightly looser: the attribute tables are not required to appear in a strict order, and as long as an attribute name does not duplicate an existing one, anyone's compiler implementation may write its own attribute information into the attribute table.

3) Introduction to bytecode instructions

A Java virtual machine instruction consists of a one-byte number that represents a specific operation (the opcode), followed by zero or more values that represent the parameters required by the operation (the operands).

Chapter 7 Virtual Machine Class Loading Mechanism

The virtual machine loads the data describing a class from the Class file into memory, verifies it, converts and resolves it, and initializes it, finally forming a Java type that the virtual machine can use directly. This is the class loading mechanism of the virtual machine.

1) Timing of class loading

The life cycle of a class runs from being loaded into virtual machine memory until it is unloaded from memory. It consists of 7 stages: Loading, Verification, Preparation, Resolution, Initialization, Using and Unloading.

Figure: Life cycle of a class

The order of the 5 phases of loading, verification, preparation, initialization and unloading is fixed, and the class loading process must start these phases step by step in this order. The resolution phase is the exception: in some cases it can start after the initialization phase, in order to support runtime binding in the Java language (also known as dynamic binding or late binding).

When initialization is triggered:

When encountering one of the four bytecode instructions new, getstatic, putstatic or invokestatic, if the class has not been initialized, its initialization must be triggered first. In layman's terms: when instantiating an object with the new keyword, when reading or setting a static field of a class (except static fields that are declared final and whose value has been placed in the constant pool at compile time), and when calling a static method of a class
When using the method of the java.lang.reflect package to make a reflection call to a class, if the class has not been initialized, it needs to be initialized first
When initializing a class, if you find that its parent class has not been initialized, you need to trigger the initialization of its parent class first
When the virtual machine starts, the user needs to specify a main class to be executed (the class containing the main() method), and the virtual machine will initialize this main class first
When using the dynamic language support of JDK 1.7, if the final resolution result of a java.lang.invoke.MethodHandle instance is a method handle of kind REF_getStatic, REF_putStatic or REF_invokeStatic, and the class corresponding to this method handle has not been initialized, its initialization must be triggered first
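The first rule, including its exception for compile-time constants, can be observed directly. The sketch below is only an illustration; InitDemo, Parent, Child and the LOG buffer are hypothetical names. It also shows a related detail: reading a static field through a subclass initializes only the class that actually declares the field.

```java
final class InitDemo {
    static final StringBuilder LOG = new StringBuilder();

    static class Parent {
        static int value = 1;
        static { LOG.append("Parent;"); }
    }

    static class Child extends Parent {
        static final String CONST = "constant"; // compile-time constant, inlined by javac
        static int counter;
        static { LOG.append("Child;"); }
    }

    /** Touches the classes in three ways and returns the order in which static initializers ran. */
    static String run() {
        String c = Child.CONST;          // no getstatic is emitted, so nothing is initialized
        LOG.append("afterConst;");
        int v = Child.value;             // field is declared in Parent: only Parent is initialized
        LOG.append("afterParentField;");
        Child.counter = c.length() + v;  // putstatic on Child's own field initializes Child
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Run once in a fresh JVM, this prints afterConst;Parent;afterParentField;Child; which shows that the constant access triggered nothing and that Child was initialized only when its own static field was written.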

2) The process of class loading

1. Loading

During the loading phase, the virtual machine needs to do 3 things:

Get the binary byte stream defining a class by its fully qualified name
Convert the static storage structure represented by this byte stream into the runtime data structure of the method area
Generate a java.lang.Class object representing this class in memory as the access entry for various data of this class in the method area

2. Verification

Verification is the first step of the linking phase. Its purpose is to ensure that the information contained in the byte stream of the Class file meets the requirements of the current virtual machine and will not compromise the security of the virtual machine itself.

Class files are not necessarily required to be compiled from Java source code, and can be generated by any method, even including direct writing with a hexadecimal editor to generate Class files.

The verification phase will complete four stages of inspection actions:

File format verification

Whether it starts with the magic number 0xCAFEBABE
Whether the major and minor versions are within the processing scope of the current virtual machine
Whether there are unsupported constant types in the constants of the constant pool (check the constant tag flag)
...

The main purpose of this stage is to ensure that the input byte stream can be correctly parsed and stored in the method area. This stage works directly on the binary byte stream.

Metadata verification

Does this class have a parent class?
Whether the parent class of this class inherits a class that is not allowed to be inherited (final modified class)
If the class is not abstract, does it implement all the methods required to be implemented in its parent class or interface
...

This stage performs semantic analysis on the information described by the bytecode. Its main purpose is to verify the metadata of the class semantically, ensuring that none of it violates the Java language specification.

Bytecode verification

Ensure that the data type of the operand stack and the instruction code sequence can work together at any time
Guarantees that jump instructions will not jump to bytecode outside the body of the method
...

The main purpose of this stage is to determine that the program semantics are legal and logical through data flow and control flow analysis.

Symbolic reference verification

Whether the corresponding class can be found by the fully qualified name described by the string in the symbol reference
Whether the specified class contains a field or method that matches the given simple name and descriptor
...

The main purpose of this stage is to check that information outside the class itself (the various symbolic references in the constant pool) matches, to ensure that the resolution action can be executed normally.

3. Preparation

The preparation phase is the phase of formally allocating memory for class variables and setting the initial values of class variables. The memory used by these variables will be allocated in the method area.
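In practice the initial value set during preparation is the zero value of the type; the value written in the source code is only assigned later, when the class initializer runs. A small sketch that makes this visible (PreparationDemo and peek() are names invented for this note):

```java
final class PreparationDemo {
    // Static initializers run in textual order, so peek() executes before the
    // assignment "value = 123" and observes the zero value set during preparation.
    static final int seenDuringInit = peek();
    static int value = 123;

    static int peek() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(seenDuringInit + " then " + value);
    }
}
```

The program prints 0 then 123: the field already existed with value 0 before its initializer ran.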

4. Resolution

The resolution phase is the process in which the virtual machine replaces the symbolic references in the constant pool with direct references.

Symbolic reference: a symbolic reference describes the referenced target with a set of symbols. The symbols can be any form of literal, as long as they locate the target unambiguously. Symbolic references have nothing to do with the memory layout implemented by the virtual machine, and the target of the reference is not necessarily loaded into memory.
Direct reference : A direct reference can be a pointer directly to the target, a relative offset, or a handle that can be located indirectly to the target. Direct references are related to the memory layout implemented by the virtual machine. The direct references translated from the same symbolic reference on different virtual machine instances are generally not the same. If there is a direct reference, the referenced target must already exist in memory.

5. Initialization

The initialization phase is the last step of the class loading process. It is the process of executing the class constructor <clinit>() method.

3) Class loader

In the loading phase, the action of obtaining the binary byte stream that describes a class from its fully qualified name is implemented outside the Java virtual machine, so that the application can decide for itself how to obtain the classes it needs. The code module that implements this action is called a class loader.

1. Classes and class loaders

Each class loader has its own class namespace. Comparing whether two classes are "equal" only makes sense if they were loaded by the same class loader. Otherwise, even if the two classes come from the same Class file and are loaded by the same virtual machine, they are not equal as long as the class loaders that loaded them are different.
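A rough way to try this out: define the same Class file a second time through a loader that refuses to delegate for that one class. The sketch below assumes the compiled .class file is reachable as a class path resource and uses InputStream.readAllBytes() (Java 9 or later); LoaderIdentityDemo and IsolatingLoader are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;

final class LoaderIdentityDemo {
    /** Hypothetical loader that defines this demo class itself instead of delegating. */
    static final class IsolatingLoader extends ClassLoader {
        @Override
        public Class<?> loadClass(String name) throws ClassNotFoundException {
            if (!name.equals(LoaderIdentityDemo.class.getName())) {
                return super.loadClass(name); // everything else: normal delegation
            }
            String resource = "/" + name.replace('.', '/') + ".class";
            try (InputStream in = LoaderIdentityDemo.class.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns true if the second copy has the same name but is a different Class object. */
    static boolean sameNameDifferentClass() {
        try {
            Class<?> copy = new IsolatingLoader().loadClass(LoaderIdentityDemo.class.getName());
            return copy.getName().equals(LoaderIdentityDemo.class.getName())
                    && copy != LoaderIdentityDemo.class;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("same name, different class: " + sameNameDifferentClass());
    }
}
```

Both Class objects report the same fully qualified name, yet they are distinct types, so checks such as instanceof or equals() between them would fail.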

2. Parent delegation model

Class loaders can be divided into 3 categories:

Bootstrap ClassLoader

This class loader is responsible for loading the class libraries stored in the <JAVA_HOME>\lib directory, or in the path specified by the -Xbootclasspath parameter, into virtual machine memory

Extension ClassLoader

This loader is implemented by sun.misc.Launcher$ExtClassLoader, which is responsible for loading all class libraries in the <JAVA_HOME>\lib\ext directory or in the path specified by the java.ext.dirs system variable. Developers can use the extension class loader directly.

Application ClassLoader

This class loader is implemented by sun.misc.Launcher$AppClassLoader. Since it is the return value of the getSystemClassLoader() method in ClassLoader, it is generally called the system class loader. If the application has not defined its own class loader, this is normally the default class loader of the program

Here, the parent-child relationship between class loaders is generally not implemented in an inheritance relationship, but uses a composition relationship to reuse the code of the parent loader.

How the parent delegation model works

If a class loader receives a class loading request, it does not try to load the class itself first, but delegates the request to its parent class loader. This is true at every level, so every load request is eventually passed up to the top-level bootstrap class loader. Only when the parent loader reports that it cannot complete the request (the required class is not found in its search scope) does the child loader try to load the class itself.
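The delegation order can be watched from the outside with a loader that only records when it is finally asked to do the work itself. In java.lang.ClassLoader that fallback hook is findClass(), which loadClass() calls only after the parents have failed. RecordingLoader and the class name no.such.Type below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DelegationDemo {
    /** Hypothetical loader that never finds anything, but remembers what it was asked for. */
    static final class RecordingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name); // reached only after every parent has failed
            throw new ClassNotFoundException(name);
        }
    }

    static List<String> demo() {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        List<String> events = new ArrayList<>();
        try {
            Class<?> s = loader.loadClass("java.lang.String");
            // getClassLoader() returns null for classes loaded by the bootstrap loader.
            events.add("String loaded by bootstrap: " + (s.getClassLoader() == null));
            events.add("findClass calls so far: " + loader.findClassCalls.size());
            loader.loadClass("no.such.Type");
        } catch (ClassNotFoundException e) {
            events.add("findClass asked for: " + loader.findClassCalls);
        }
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The request for java.lang.String is answered at the top of the chain without the custom loader doing anything, while the unknown class travels all the way up, fails everywhere, and only then reaches findClass().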

3. Breaking the parent delegation model

After JDK 1.2, users are no longer encouraged to override the loadClass() method; they should put their own class loading logic in the findClass() method instead. If the parent loader fails to load the class, loadClass() calls the loader's own findClass() method to complete the loading, which ensures that newly written class loaders follow the parent delegation rules.
Thread Context ClassLoader. This class loader can be set through the setContextClassLoader() method of the java.lang.Thread class. If it is not set when the thread is created, the thread inherits one from its parent thread; if it has not been set anywhere in the application, it defaults to the application class loader.

Chapter 8 Virtual Machine Bytecode Execution Engine

1) Runtime stack frame structure

The stack frame is a data structure that supports method invocation and method execution in the virtual machine. It is the element of the virtual machine stack in the virtual machine's runtime data area.

The stack frame stores the method's local variable table, operand stack, dynamic linking information, method return address and other information.

The method call chain in a thread may be very long. For the execution engine, only the stack frame at the top of the stack is active; it is called the current stack frame.

1. Local variable table

The local variable table is a set of variable value storage spaces used to store method parameters and local variables defined within the method.

The capacity of the local variable table is measured in variable slots, its smallest unit.

2. Operand stack

The operand stack, also often called the operation stack, is a last-in, first-out stack. The maximum depth of the operand stack is written to the max_stack data item of the Code attribute at compile time.

In the conceptual model, two stack frames, as elements of the virtual machine stack, are completely independent of each other. Most virtual machine implementations, however, optimize this by letting two adjacent stack frames overlap. Part of the data is then shared during a method call, and no extra copying is needed to pass parameters.

Figure: Data sharing between stack frames

3. Dynamic linking

Each stack frame contains a reference to the method it belongs to in the runtime constant pool. This reference is held to support dynamic linking during method invocation.

4. Method return address

There are two ways to stop a method from running:

The execution engine encounters any method-return bytecode instruction. This exit is called normal method invocation completion
An exception occurs during method execution and is not handled within the method body. This exit is called abrupt method invocation completion

2) Method invocation

Method invocation is not the same as method execution. The only task in the method invocation phase is to determine the version of the called method (that is, which method to call), and the specific running process inside the method is not involved for the time being.

At compile time, all method calls stored in the Class file are only symbolic references, and the direct reference of the target method can only be determined during class loading, even during runtime.

Resolution

In the resolution phase of class loading, some of the symbolic references are converted into direct references. This resolution is possible only if the method has a version that can be determined before the program actually runs, and that version cannot change at run time.
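Static methods are a familiar case of a call whose target is fixed before the program runs, while ordinary instance methods are chosen at run time from the actual type of the receiver. The contrast can be sketched as follows (InvocationDemo, Base and Derived are illustrative names, not from the book):

```java
final class InvocationDemo {
    static class Base {
        static String kind() { return "Base.kind"; }   // static: target fixed before run time
        String name() { return "Base.name"; }          // virtual: target chosen at run time
    }

    static class Derived extends Base {
        static String kind() { return "Derived.kind"; } // hides Base.kind, does not override it
        @Override
        String name() { return "Derived.name"; }
    }

    /** Calls one statically bound and one dynamically dispatched method. */
    static String describe(Base b) {
        return Base.kind() + "," + b.name();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Derived()));
    }
}
```

Passing a Derived instance prints Base.kind,Derived.name: the static call ignores the runtime type entirely, whereas the instance call follows it.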

2. Part IV Program Compilation and Code Optimization

From the first day that computer programs appeared, the pursuit of efficiency has been a firm belief of programmers. This process is like a never-ending Formula 1 race: the programmer is the driver, and the technology platform is the car speeding on the track.

Chapter 10 Early (Compiler) Optimizations

1) Javac compiler

The Javac compiler itself is a program written in the Java language

The compilation process can be roughly divided into three processes, namely:

The process of parsing and filling the symbol table
Annotation processing for plug-in annotation processors
Analysis and bytecode generation process

Figure: Javac compilation process

The entry point of Javac compilation action is com.sun.tools.javac.main.JavaCompiler class

1. Parse and fill the symbol table

Parse:

The parsing step includes the two processes of lexical analysis and syntax analysis in the classic program compilation principle

The lexical analysis process is implemented by the com.sun.tools.javac.parser.Scanner class; it converts the character stream of the source code into a sequence of tokens.
The syntax analysis process is implemented by the com.sun.tools.javac.parser.Parser class; it constructs an abstract syntax tree from the token sequence, and the tree produced at this stage is represented by the com.sun.tools.javac.tree.JCTree class.

After this step, the compiler will basically no longer operate on the source code, and the subsequent operations are based on the abstract syntax tree.

Fill the symbol table:

The action of populating the symbol table is implemented by the enterTrees() method.

The symbol table is a table composed of a set of symbol addresses and symbol information, and the registered information is used in different stages of compilation.

The process of filling the symbol table is implemented by the com.sun.tools.javac.comp.Enter class. The output of this process is a To Do List, which contains the top-level node of the abstract syntax tree for each compilation unit, and the top-level node of package-info.java (if it exists).

2. Annotation processor

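Annotations, introduced in JDK 1.5, originally took effect at run time, just like ordinary Java code. JDK 1.6 added the pluggable annotation processing API (JSR 269), which lets annotation processors handle annotations at compile time.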

3. Semantic Analysis and Bytecode Generation

After the above steps are completed, an abstract syntax tree can be obtained, but there is no guarantee that the source program is logical. The main task of semantic analysis is to examine the context-sensitive nature of the structurally correct source program.

Attribution check
Data and control flow analysis
Desugaring (removing syntactic sugar)
Bytecode generation

Chapter 11 Late (Runtime) Optimization

1) Interpreter and Compiler

When the program needs to start and execute quickly, the interpreter can be used first, which saves compilation time and runs the code immediately. As the program keeps running, the compiler gradually comes into play: the more code is compiled into native code, the higher the execution efficiency. When memory in the runtime environment is tightly constrained (as in some embedded systems), interpreted execution can be used to save memory; otherwise, compiled execution can be used to improve efficiency.

There are two just-in-time (JIT) compilers built into the HotSpot virtual machine, called the Client Compiler and the Server Compiler, or the C1 compiler and the C2 compiler for short. Users can use -client or -server to specify whether the virtual machine runs in Client mode or Server mode.

In order to achieve the best balance between program startup response speed and running efficiency, tiered compilation was introduced:

Tier 0: The program is interpreted; the interpreter does not turn on performance monitoring (profiling), and it can trigger tier 1 compilation
Tier 1: Also known as C1 compilation; compiles bytecode to native code, performs simple and reliable optimizations, and adds performance monitoring logic if necessary
Tier 2: Also known as C2 compilation; also compiles bytecode into native code, but enables some optimizations that take longer to compile, and even some aggressive, less reliable optimizations based on performance monitoring information

2) Compilation targets and trigger conditions

There are two types of hot spot code that the JIT compiler compiles at run time:

Method called multiple times
The body of the loop that is executed multiple times

Determining whether a piece of code is hot spot code and whether it should trigger JIT compilation is called hot spot detection

Sampling-based hotspot detection : The virtual machine periodically checks the stack top of each thread. If a method is found to appear frequently on the top of the stack, then this method is a hotspot method.
Counter-based hotspot detection : The virtual machine using this method establishes a counter for each method (even a code block), counts the execution times of the method, and considers it a hotspot method if the execution times exceed a certain threshold. (method call counter and back edge counter)

Back-edge counter: counts the number of times the loop body code in a method is executed. A bytecode instruction that makes control flow jump backward is called a back edge. The purpose of back-edge counting is to trigger OSR (on-stack replacement) compilation.
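The idea behind the two counters can be pictured with a toy model. This is a deliberately simplified sketch and not HotSpot's actual logic: the class HotSpotCounterSketch, its two independent thresholds and the "greater or equal" comparison are all assumptions made for illustration.

```java
final class HotSpotCounterSketch {
    private final int invocationThreshold; // hypothetical threshold for whole-method compilation
    private final int backEdgeThreshold;   // hypothetical threshold for OSR compilation
    private int invocations;
    private int backEdges;

    HotSpotCounterSketch(int invocationThreshold, int backEdgeThreshold) {
        this.invocationThreshold = invocationThreshold;
        this.backEdgeThreshold = backEdgeThreshold;
    }

    /** Called on method entry; true means the method would be submitted for compilation. */
    boolean onInvocation() {
        invocations++;
        return invocations >= invocationThreshold;
    }

    /** Called on every backward jump of a loop; true means OSR compilation would be requested. */
    boolean onBackEdge() {
        backEdges++;
        return backEdges >= backEdgeThreshold;
    }

    public static void main(String[] args) {
        HotSpotCounterSketch counters = new HotSpotCounterSketch(3, 5);
        counters.onInvocation();               // a single call to the method
        int iteration = 0;
        while (!counters.onBackEdge()) {       // a long-running loop inside that one call
            iteration++;
        }
        System.out.println("OSR requested after back edge " + (iteration + 1));
    }
}
```

The point of the second counter is visible in main(): a method that is called only once but loops for a long time never becomes hot by invocation count, yet its loop body still qualifies for compilation.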

Figure: Method invocation counter

Figure: Back-edge counter

3) Compilation optimization technology

Figure: Overview of optimization techniques

1. Common subexpression elimination

If an expression E has been evaluated and the values of all variables in E have not changed since the previous evaluation, then this occurrence of E becomes a common subexpression.
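Written out by hand at source level, the transformation looks like the following. The JIT compiler works on its intermediate representation rather than on Java source, so this is only an illustration; CseDemo and its method names are invented for this note.

```java
final class CseDemo {
    /** Original form: c * b and b * c are the same expression E, evaluated twice. */
    static int original(int a, int b, int c) {
        return (c * b) * 12 + a + (a + b * c);
    }

    /** After common subexpression elimination: E is evaluated once and reused. */
    static int afterCse(int a, int b, int c) {
        int e = c * b;
        return e * 12 + a + (a + e);
    }

    public static void main(String[] args) {
        System.out.println(original(2, 3, 4) + " == " + afterCse(2, 3, 4));
    }
}
```

Both methods return the same value for every input, because neither b nor c changes between the two evaluations of E.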

2. Array bounds check elimination

Array bounds checking is necessary for safety, but it does not necessarily have to be performed at run time on every single access.

3. Method inlining

Method inlining copies the code of the target method into the method that makes the call, avoiding an actual method call. To solve the problem of inlining virtual methods in Java, a technique called Class Hierarchy Analysis (CHA) was introduced. It is an application-wide type analysis technique that determines, among the classes currently loaded, whether an interface has more than one implementation, whether a class has subclasses, whether a subclass is abstract, and so on.

4. Escape Analysis

The basic behavior of escape analysis is to analyze the dynamic scope of objects: when an object is defined in a method, it may be referenced by external methods, such as passed as a call parameter to other methods, which is called method escape. It may even be accessed by external threads, which is called thread escape.

If it can be proven that an object does not escape from a method or thread, some optimizations can be applied to it:

Stack allocation
Synchronization elimination
Scalar replacement
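The cases can be told apart in a few lines of code. The sketch below only illustrates the categories: whether a given JVM actually performs these optimizations depends on the implementation and its settings, and scalarReplaced() is a hand-written picture of the result, not compiler output. EscapeDemo and Point are hypothetical names.

```java
final class EscapeDemo {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static Point shared; // a class variable is reachable from other threads

    /** No escape: p is never visible outside this method. */
    static int noEscape(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    /** Roughly what scalar replacement amounts to: the object is replaced by its fields. */
    static int scalarReplaced(int x, int y) {
        int px = x;
        int py = y;
        return px + py;
    }

    /** Method escape: the object is handed to the caller. */
    static Point methodEscape(int x, int y) {
        return new Point(x, y);
    }

    /** Thread escape: the object is stored where other threads can reach it. */
    static void threadEscape(int x, int y) {
        shared = new Point(x, y);
    }

    public static void main(String[] args) {
        System.out.println(noEscape(3, 4) + " == " + scalarReplaced(3, 4));
    }
}
```

Only noEscape() is a candidate for stack allocation or scalar replacement; in the other two methods the object must remain a real heap object because code outside the method can still see it.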

3. Part V Efficient Concurrency

The wide application of concurrent processing is the fundamental reason why Amdahl's Law has replaced Moore's Law as the driving force of computer performance development, and it is also the most powerful weapon for human beings to squeeze computer computing power.

Chapter 12 The Java Memory Model and Threads

1) Efficiency and consistency of hardware

When the computing tasks of multiple processors involve the same main memory area, their cached data may become inconsistent, so read and write operations must follow a cache coherence protocol, such as MSI, MESI or MOSI.

2) Main memory and working memory

The main goal of the Java memory model is to define the access rules for each variable in the program, that is, the low-level details of storing and retrieving variables in and out of memory in the virtual machine.

3) Interaction between main memory and working memory

The Java memory model defines the following 8 operations to carry out the interaction between main memory and working memory. The virtual machine implementation must ensure that each of them is atomic and indivisible.

lock: acts on a variable in main memory; it marks the variable as exclusively owned by one thread
unlock: acts on a variable in main memory; it releases a variable that is in the locked state, after which the variable can be locked by other threads
read: acts on a variable in main memory; it transfers the value of the variable from main memory to the thread's working memory for use by the subsequent load action
load: acts on a variable in working memory; it puts the value obtained from main memory by the read operation into the variable copy in working memory
use: acts on a variable in working memory; it passes the value of the variable in working memory to the execution engine, and is performed whenever the virtual machine encounters a bytecode instruction that needs the value of the variable
assign: acts on a variable in working memory; it assigns a value received from the execution engine to the variable in working memory, and is performed whenever the virtual machine encounters a bytecode instruction that assigns a value to the variable
store: acts on a variable in working memory; it transfers the value of the variable in working memory to main memory for use by the subsequent write operation
write: acts on a variable in main memory; it puts the value obtained from working memory by the store operation into the variable in main memory

If you want to copy a variable from main memory to working memory, you must perform read and load operations sequentially. If you want to synchronize variables from working memory back to main memory, you must perform store and write operations sequentially.

The Java memory model specifies the following rules to be satisfied when performing the above 8 basic operations:

One of read and load, store and write operations is not allowed to appear alone, that is, a variable is not allowed to be read from main memory but not accepted by working memory, or a write-back is initiated from working memory but not accepted by main memory.
A thread is not allowed to discard its most recent assign operation, i.e. after a variable has changed in working memory it must synchronize the change back to main memory
A thread is not allowed to synchronize data from the thread's working memory back to main memory for no reason (no assign operation has occurred)
A new variable can only be created in main memory, it is not allowed to use an uninitialized (load or assign) variable directly in working memory
A variable can only be locked by one thread at the same time, but the lock operation can be repeated by the same thread for many times. After the lock is executed multiple times, the variable will be unlocked only after the same number of unlock operations are performed.
If the lock operation is performed on a variable, the value of the variable in the working memory will be cleared. Before the execution engine uses the variable, the load or assign operation needs to be re-executed to initialize the value of the variable.
If a variable is not previously locked by a lock operation, it is not allowed to perform an unlock operation on it, nor to unlock a variable that is locked by another thread
Before an unlock operation is performed on a variable, the variable must be synchronized back into main memory.
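The rule that repeated lock operations must be matched by the same number of unlock operations has a close library-level analogue in java.util.concurrent.locks.ReentrantLock, whose hold count can be inspected. The sketch below uses that class purely as an analogy; it is not the memory model's lock operation itself, and ReentrantCountDemo is a name invented for this note.

```java
import java.util.concurrent.locks.ReentrantLock;

final class ReentrantCountDemo {
    /** Locks twice, unlocks twice, and records the lock state along the way. */
    static String trace() {
        ReentrantLock lock = new ReentrantLock();
        StringBuilder out = new StringBuilder();
        lock.lock();
        lock.lock();                               // same thread: the hold count goes to 2
        out.append(lock.getHoldCount());
        lock.unlock();                             // one unlock is not enough
        out.append(lock.isLocked() ? ",still locked" : ",free");
        lock.unlock();                             // second unlock releases the lock
        out.append(lock.isLocked() ? ",still locked" : ",free");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```

The trace is 2,still locked,free: the lock only becomes available to other threads after as many unlocks as there were locks.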

1. Atomicity, visibility, and ordering

The Java memory model is built around how atomicity, visibility and ordering are handled in concurrent processes

1. Atomicity

Atomic variable operations are directly guaranteed by the read, load, assign, use, store and write operations of the Java memory model.

2. Visibility

Visibility means that when a thread modifies the value of a shared variable, other threads are immediately aware of the modification. In addition to volatile, Java can also guarantee visibility through synchronized and final.
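A common way to see the effect of volatile is a stop flag shared between two threads. This is a minimal sketch; VisibilityDemo and the 5-second join timeout are assumptions for illustration. Because the flag is volatile, the worker is guaranteed to observe the write and leave its loop.

```java
final class VisibilityDemo {
    private static volatile boolean stop; // volatile: the write below becomes visible to the worker

    /** Starts a spinning worker, raises the flag, and reports whether the worker finished. */
    static boolean workerStopsAfterFlag() {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until the flag change becomes visible
            }
        });
        worker.start();
        stop = true;
        try {
            worker.join(5000); // generous timeout so the demo can never hang forever
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStopsAfterFlag());
    }
}
```

Without the volatile modifier the memory model would not require the worker to ever see the update, so the loop could in principle spin indefinitely.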

3. Ordering

Observed from within a thread, all of its operations are in order; observed from another thread, all operations appear out of order.

The first half of the sentence refers to within-thread as-if-serial semantics. The second half refers to instruction reordering and to the delay in synchronizing working memory with main memory.

4) Java and threads

1. Thread implementation

Each thread can share process resources (memory address, file I/O, etc.), and can be scheduled independently (thread is the basic unit of CPU scheduling)

There are three main ways to implement threads:

Kernel thread implementation

A kernel thread is a thread supported directly by the operating system kernel, which also handles thread switching. Programs generally do not use kernel threads directly; instead they use a high-level interface to kernel threads, the lightweight process (LWP), which is what we usually call a thread.

limitation:

Because lightweight processes are implemented on top of kernel threads, the various thread operations (creation, destruction and synchronization) require system calls. System calls are relatively expensive, since they require switching back and forth between user mode and kernel mode.

User thread implementation

In the broad sense, any thread that is not a kernel thread can be considered a user thread, so lightweight processes also count as user threads. In the narrow sense, a user thread is one built entirely on a thread library in user space, whose existence the system kernel cannot perceive. The creation, synchronization, destruction and scheduling of such user threads are completed entirely in user mode without help from the kernel.

limitation:

Without the support of the system kernel, every thread operation has to be handled by the user program itself, and problems such as blocking and scheduling become extremely difficult to deal with.

User thread plus lightweight process hybrid implementation

In this hybrid implementation, there are both user threads and lightweight processes. User threads are still built entirely in user space, so operations such as creating, switching and destroying them remain cheap, and large-scale user-thread concurrency can be supported.

2. Java thread scheduling

Thread scheduling refers to the process by which the system allocates processor usage rights to threads. There are two main scheduling methods: cooperative thread scheduling and preemptive thread scheduling

Cooperative Thread Scheduling

The execution time of the thread is controlled by the thread itself. After the thread has finished its work, it should actively notify the system to switch to another thread.

Features: The implementation is simple, but the execution time of a thread is out of the system's control. If a thread is badly written and never tells the system to switch, the program stays blocked there indefinitely.

Preemptive thread scheduling

Each thread will be allocated execution time by the system, and the switching of threads is not determined by the thread itself.

Features: The execution time of each thread is controlled by the system, so a single thread cannot cause the entire process to block.

3. Thread state switching

There are 6 thread states defined in the Java language:

New: A thread that has been created but not yet started is in this state.
Runnable: Runnable covers both Running and Ready in operating system terms; a thread in this state may be executing, or it may be waiting for the CPU to allocate execution time to it.
Waiting (indefinitely): Threads in this state are not allocated CPU execution time; they wait to be explicitly awakened by other threads.
Timed Waiting: Threads in this state are not allocated CPU execution time either, but they do not need to be explicitly awakened by other threads; the system wakes them automatically after a certain period of time.
Blocked: The thread is blocked, waiting to acquire an exclusive lock; this happens when another thread relinquishes the lock.
Terminated: The state of a thread that has finished execution.
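
These states correspond to the constants of the standard Thread.State enum. The sketch below observes two of them that can be checked deterministically; the class name ThreadStateDemo and its method are hypothetical.

```java
// Hypothetical demo class: observes a thread's state before it is
// started and after it has finished.
class ThreadStateDemo {
    static Thread.State[] observe() {
        Thread t = new Thread(() -> { /* no work */ });
        Thread.State beforeStart = t.getState();   // NEW: created, not started
        t.start();
        try {
            t.join();                               // wait for the thread to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State afterJoin = t.getState();      // TERMINATED: run() has ended
        return new Thread.State[] { beforeStart, afterJoin };
    }

    public static void main(String[] args) {
        for (Thread.State s : Thread.State.values()) {
            System.out.println(s);                  // lists every state the JDK defines
        }
    }
}
```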

Chapter 13 Thread Safety and Lock Optimization

1) Thread safety

An object is thread-safe if, when multiple threads access it, calling the object yields correct results without having to consider how those threads are scheduled and interleaved in the runtime environment, without additional synchronization, and without any other coordination on the caller's side.

Ordered from strongest to weakest degree of safety, the data shared by various operations in the Java language can be divided into 5 categories:

immutable

Immutable objects are always thread-safe. There are many ways to ensure that an object's behavior does not affect its own state; the simplest is to declare all state-carrying variables in the object as final, so that the object is immutable once its constructor has finished.
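
A minimal sketch of the "all state is final" approach follows. The class ImmutablePoint and its methods are hypothetical names used only for illustration. Any "modification" returns a new object, so instances can be shared between threads without synchronization.

```java
// Hypothetical immutable value class: every field is final and no
// method changes the state of an existing instance.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }

    int y() { return y; }

    // Returns a new instance instead of mutating this one.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.withX(5);
        System.out.println(p.x() + "," + p.y() + " -> " + q.x() + "," + q.y());
    }
}
```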

absolutely thread safe

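Absolute thread safety fully satisfies the definition above: whatever the runtime environment, the caller never needs any additional synchronization. This is a very strict requirement, and most of the classes that mark themselves as thread-safe in the Java API are not absolutely thread-safe.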

Relatively thread safe

Relative thread safety is thread safety in the usual sense. It guarantees that each individual operation on the object is thread-safe, so no additional safeguards are needed for a single call. For certain sequences of consecutive calls in a specific order, however, the caller may still need additional synchronization to ensure correctness.
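
java.util.Vector is a commonly cited example of relative thread safety: each method is synchronized, but a check followed by a removal is two separate calls. The sketch below guards that compound action on the caller's side. The class VectorDemo and its methods are hypothetical, and locking on the vector itself relies on the fact that Vector's own methods synchronize on the instance.

```java
import java.util.Vector;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: caller-side synchronization around a
// compound check-then-act sequence on a Vector.
class VectorDemo {
    // The emptiness check and the removal run as one unit under the
    // vector's monitor, so the index cannot go stale in between.
    static Integer removeLast(Vector<Integer> v) {
        synchronized (v) {
            if (v.isEmpty()) {
                return null;
            }
            return v.remove(v.size() - 1);
        }
    }

    // Two threads drain the same vector; returns how many elements
    // were removed in total.
    static int drainWithTwoThreads(int n) {
        Vector<Integer> v = new Vector<>();
        for (int i = 0; i < n; i++) {
            v.add(i);
        }
        AtomicInteger removed = new AtomicInteger();
        Runnable drain = () -> {
            while (removeLast(v) != null) {
                removed.incrementAndGet();
            }
        };
        Thread a = new Thread(drain);
        Thread b = new Thread(drain);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return removed.get();
    }

    public static void main(String[] args) {
        System.out.println(drainWithTwoThreads(1000));
    }
}
```

Without the synchronized block, one thread could read the size, another could remove the last element, and the first would then call remove with an index that no longer exists.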

thread compatible

Thread compatibility means that the object itself is not thread-safe, but the object can be safely used in a concurrent environment by using synchronization methods correctly on the calling side.

thread opposition

Thread opposition refers to code that cannot be used concurrently in a multithreaded environment, regardless of whether synchronization measures are taken by the caller.

2) How to implement thread safety

1. Mutual exclusion synchronization

Mutual exclusion is the method, synchronization is the purpose. The most basic method is to use the synchronized keyword. After compilation, two bytecode instructions, monitorenter and monitorexit, are generated before and after the synchronized block.
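
A small sketch of mutual exclusion with a synchronized block follows; the class SyncCounter and its methods are hypothetical. The unsynchronized value++ is a read-modify-write sequence, so concurrent increments could be lost without the lock.

```java
// Hypothetical demo class: a counter protected by synchronized blocks.
class SyncCounter {
    private int value;

    void increment() {
        synchronized (this) {   // compiled with monitorenter / monitorexit
            value++;
        }
    }

    int get() {
        synchronized (this) {
            return value;
        }
    }

    // Starts the given number of threads, each incrementing perThread times.
    static int run(int threads, int perThread) {
        SyncCounter counter = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000));
    }
}
```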

Besides the synchronized keyword, you can also use ReentrantLock from the JUC package (java.util.concurrent.locks) to achieve synchronization. Compared with synchronized, ReentrantLock adds some advanced features, mainly these three: waiting for the lock can be interrupted, fair locking is available, and one lock can be bound to multiple conditions.
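
The three ReentrantLock features can be sketched together in a small bounded buffer. This is only an illustration: the class BoundedBuffer, its capacity and its helper method are assumptions, not code from the book.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: a bounded buffer built on one fair
// ReentrantLock with two conditions.
class BoundedBuffer {
    private final ReentrantLock lock = new ReentrantLock(true);   // fair lock
    private final Condition notFull = lock.newCondition();        // first condition
    private final Condition notEmpty = lock.newCondition();       // second condition
    private final ArrayDeque<Integer> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    void put(int x) throws InterruptedException {
        lock.lockInterruptibly();            // waiting for the lock can be interrupted
        try {
            while (items.size() == capacity) {
                notFull.await();
            }
            items.addLast(x);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    int take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            int x = items.removeFirst();
            notFull.signal();
            return x;
        } finally {
            lock.unlock();
        }
    }

    // A producer thread puts 1..n; the calling thread takes n values and sums them.
    static int sumThroughBuffer(int n) {
        BoundedBuffer buffer = new BoundedBuffer(4);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    buffer.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) {
                sum += buffer.take();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumThroughBuffer(100));
    }
}
```

With synchronized, a single monitor offers only one implicit wait set, so producers and consumers would have to share it; separate Condition objects let each side wake only the other.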

2. Non-blocking synchronization

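Non-blocking synchronization is an optimistic concurrency strategy based on conflict detection: perform the operation first, and if another thread turns out to have contended for the shared data, take compensating measures, typically retrying until the operation succeeds. This is usually implemented with CAS (compare-and-swap).

CAS has a logical loophole known as the ABA problem: if a value changes from A to B and back to A, CAS concludes that it was never changed. In most cases, the ABA problem does not affect the correctness of concurrent programs. If you do need to solve it, switching to traditional mutual exclusion synchronization may be more efficient than using the atomic classes.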
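
The sketch below shows a CAS retry loop with AtomicInteger and then reproduces the ABA situation, using AtomicStampedReference as one way the JDK offers to detect it. The class CasDemo and its method names are hypothetical; the values are arbitrary.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical demo class: an optimistic CAS retry loop and the ABA problem.
class CasDemo {
    // Read the current value, compute the next one, and retry until
    // compareAndSet succeeds.
    static int incrementWithCas(AtomicInteger counter) {
        while (true) {
            int current = counter.get();
            int next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Returns { plain CAS result, stamped CAS result } after an A -> B -> A change.
    static boolean[] abaDemo() {
        AtomicInteger plain = new AtomicInteger(1);
        plain.set(2);                                   // A -> B
        plain.set(1);                                   // B -> A
        boolean plainSucceeds = plain.compareAndSet(1, 3);   // change goes unnoticed

        String a = "A";
        String b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);
        int staleStamp = ref.getStamp();                // an observer remembers stamp 0
        ref.compareAndSet(a, b, 0, 1);                  // A -> B, stamp becomes 1
        ref.compareAndSet(b, a, 1, 2);                  // B -> A, stamp becomes 2
        boolean stampedSucceeds =
                ref.compareAndSet(a, "C", staleStamp, staleStamp + 1);  // stale stamp is rejected
        return new boolean[] { plainSucceeds, stampedSucceeds };
    }

    public static void main(String[] args) {
        boolean[] r = abaDemo();
        System.out.println("plain CAS: " + r[0] + ", stamped CAS: " + r[1]);
    }
}
```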

3. No synchronization scheme

If a method does not involve shared data, it naturally does not need any synchronization measures to ensure correctness.
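
Two common ways to avoid shared data are reentrant code (only parameters and local variables) and thread-local storage. The sketch below illustrates both under that assumption; the class NoSyncDemo, its field and its methods are hypothetical names.

```java
// Hypothetical demo class: code that needs no synchronization because
// it never touches data shared between threads.
class NoSyncDemo {
    // Reentrant: depends only on its parameter and local variables.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Each thread gets its own one-element counter array.
    private static final ThreadLocal<int[]> PER_THREAD =
            ThreadLocal.withInitial(() -> new int[1]);

    // Two threads each increment their own thread-local counter.
    static int[] countInTwoThreads(int times) {
        int[] results = new int[2];
        Thread[] workers = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> {
                for (int k = 0; k < times; k++) {
                    PER_THREAD.get()[0]++;          // touches only this thread's copy
                }
                results[slot] = PER_THREAD.get()[0];
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();                           // join makes the results visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return results;
    }

    public static void main(String[] args) {
        int[] r = countInTwoThreads(1000);
        System.out.println(sumTo(100) + " " + r[0] + " " + r[1]);
    }
}
```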

3) Lock optimization

The HotSpot virtual machine development team spent a lot of effort in JDK 6 implementing various lock optimization techniques, such as adaptive spinning, lock elimination, lock coarsening, lightweight locking, and biased locking.
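
As a hedged illustration, the method below is the kind of code that lock elimination and lock coarsening target. Whether a given JVM actually applies either optimization depends on its version, its flags and the JIT compiler's analysis; the class LockOptimizationDemo is a hypothetical name.

```java
// Hypothetical demo class: code that is a candidate for lock
// elimination and lock coarsening.
class LockOptimizationDemo {
    static String concat(String a, String b, String c) {
        // sb never escapes this method, so no other thread can ever
        // contend for its lock; escape analysis may let the JIT drop
        // the locking inside the synchronized append() calls.
        StringBuffer sb = new StringBuffer();
        // Three consecutive lock/unlock pairs on the same object are
        // also a candidate for being merged into one (coarsening).
        sb.append(a);
        sb.append(b);
        sb.append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat("a", "b", "c"));
    }
}
```

The result is the same with or without the optimizations; they only remove locking work that can be proven unnecessary.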

This article collects reading notes on the second half of "In-depth Understanding of Java Virtual Machine: JVM Advanced Features and Best Practices". Please read it slowly and turn it into your own knowledge!

Don't talk empty-handed, don't be lazy, let's be a programmer who brags about the architecture with Xiaocai. Please pay attention to be a companion, so that Xiaocai is no longer alone. See you below!

👀 Work harder today, and you will be able to say one less word of begging tomorrow!