Abstract: The G1 garbage collector is aimed mainly at server-side applications.
This article is shared from the Huawei Cloud Community post "JVM Interview High-frequency Test Site: From the shallower to the deeper, let you understand the G1 garbage collector!", original author: Code Pipi Shrimp.
Introduction to G1 Garbage Collector
The G1 garbage collector is aimed mainly at server-side applications. It is a milestone in the history of garbage collector development and differs from earlier collectors. The first difference is a change in thinking about heap layout, as shown in the following figure:
G1's division of the Java heap
The figure may not be clear at first if you are not yet familiar with G1. The explanation below should make it easier to follow.
G1 divides the Java heap differently from the layout used by earlier collectors.
In the past, the Java heap was divided into a young generation and an old generation. The young generation was divided into Eden and Survivor areas, and Survivor was divided into "from" and "to" areas.
G1 no longer insists on a fixed number of generational areas with fixed sizes. Instead, it divides the contiguous Java heap into multiple independent Regions of equal size, and each Region can act as Eden space, Survivor space, or old generation space as needed.
This change in design lets G1 collect any part of the heap. The criterion is no longer which generation a piece of memory belongs to, but which Regions hold the most garbage and give the greatest reclamation benefit. This is the Mixed GC mode of the G1 collector.
There is also a special kind of Region, the Humongous Region, dedicated to storing large objects. G1 treats any object larger than half the capacity of a Region as a large object. A large object that exceeds the capacity of a whole Region is placed in N contiguous Humongous Regions.
The size of a Region ranges from 1 MB to 32 MB and must be a power of two.
The default number of
The Region size can be set explicitly with the JVM option -XX:G1HeapRegionSize=N
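The sizing rule for large objects can be sketched in a few lines of Java. This is only an illustration of the rule described above, not HotSpot code: the class `HumongousSketch`, its method names, and the 4 MB region size are assumptions made for the example.

```java
// Hypothetical helper, not HotSpot code: models the humongous-object rule described above.
final class HumongousSketch {
    private HumongousSketch() {}

    /** An object counts as humongous when it is larger than half of one Region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Number of contiguous Regions needed to hold the object (ceiling division). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 4L * 1024 * 1024; // assumed: JVM started with -XX:G1HeapRegionSize=4m
        long[] sizes = {1L << 20, 3L << 20, 9L << 20};
        for (long size : sizes) {
            System.out.printf("%d bytes: humongous=%b, regions=%d%n",
                    size, isHumongous(size, region), regionsNeeded(size, region));
        }
    }
}
```

With a 4 MB Region, a 3 MB object is already humongous, and a 9 MB object would need three contiguous Regions.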
This design makes G1 feel completely new, but a careful reader may have spotted a problem: if objects in one Region reference objects in another Region, how are these cross-Region references handled?
- Both G1 and other generational collectors use a Remembered Set to avoid scanning the whole heap.
- Each Region has its own Remembered Set.
- Every time a reference-type field is written, a write barrier intercepts the operation.
- The barrier checks whether the object the new reference points to is in a different Region from the field being written (other collectors check whether an old generation object references a young generation object).
- If the Regions differ, the reference is recorded, via the card table, in the Remembered Set of the Region that contains the referenced object.
- During garbage collection, the Remembered Set is added to the GC Roots enumeration, which guarantees correct results without a whole-heap scan.
G1's Remembered Set can be understood as a hash table: the key is the starting address of another Region, and the value is a set of card table index numbers.
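The write barrier and the hash-table view of the Remembered Set can be modelled as follows. This is a toy sketch, not HotSpot's implementation: addresses are plain `long` values, and `RememberedSetSketch`, `REGION_BYTES`, `CARD_BYTES`, and the method names are all assumptions chosen for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model, not HotSpot code. Addresses are plain longs; both sizes are assumed values.
final class RememberedSetSketch {
    static final long REGION_BYTES = 1L << 20; // assumed 1 MB Regions
    static final long CARD_BYTES = 512;        // assumed card size

    // Per target Region: (start address of the referencing Region -> card indices in it).
    private final Map<Long, Map<Long, Set<Long>>> rsets = new HashMap<>();

    static long regionStart(long address) {
        return address - (address % REGION_BYTES);
    }

    static long cardIndex(long address) {
        return address / CARD_BYTES;
    }

    /** Write barrier: runs when the field at fieldAddress is set to point at targetAddress. */
    void postWriteBarrier(long fieldAddress, long targetAddress) {
        long from = regionStart(fieldAddress);
        long to = regionStart(targetAddress);
        if (from == to) {
            return; // same-Region reference: nothing to remember
        }
        // Record the card that holds the field in the Remembered Set of the target Region.
        rsets.computeIfAbsent(to, k -> new HashMap<>())
             .computeIfAbsent(from, k -> new TreeSet<>())
             .add(cardIndex(fieldAddress));
    }

    /** Cards that must be scanned as extra roots when the given Region is collected. */
    Map<Long, Set<Long>> rememberedSetOf(long targetRegionStart) {
        return rsets.getOrDefault(targetRegionStart, Collections.emptyMap());
    }
}
```

When a Region is collected, only the cards listed in its Remembered Set need to be scanned in addition to the ordinary GC Roots, which is what makes the whole-heap scan unnecessary.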
Because G1 divides the Java heap into Regions, and there are far more Regions than there are generations in a traditional layout, G1 needs extra space equal to roughly 10% to 20% of the Java heap capacity to maintain the collector's bookkeeping, more than traditional collectors require.
G1 garbage collector workflow
- Initial Marking: marks only the objects directly reachable from GC Roots and updates the TAMS (Top at Mark Start) pointer, so that user threads running concurrently in the next phase can correctly allocate new objects in available Regions. User threads must be paused in this phase, but the pause is short. It is also piggybacked on a Minor GC, so in practice G1 incurs no additional pause here.
- Concurrent Marking: starts from GC Roots, analyzes the reachability of objects in the heap, and recursively scans the whole object graph to find the live objects. This phase takes a long time, but it runs concurrently with the user program. After the object graph scan is complete, the objects recorded by SATB whose references changed during concurrent execution must be processed again.
- Final Marking: pauses user threads briefly once more to process the few SATB records left over after the concurrent phase ends.
- Live Data Counting and Evacuation: updates the statistics of each Region, ranks Regions by reclamation value and cost, and builds a collection plan according to the pause time the user expects. Any number of Regions can be chosen to form the collection set. The surviving objects in those Regions are copied into empty Regions, and the old Regions are then cleared entirely.
Except for concurrent marking, every phase is stop-the-world (STW).
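The last phase, ranking Regions by value and cost and then choosing a collection set that fits the expected pause time, can be illustrated with a simple greedy selection. This is a sketch under stated assumptions and not G1's actual policy code: `CollectionSetSketch`, `RegionStats`, its fields, and the greedy strategy itself are hypothetical, and the predicted costs are supplied by hand and must be positive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of picking Regions under a pause-time goal; not HotSpot's policy code.
final class CollectionSetSketch {
    /** Per-Region statistics; all field names are assumptions for this sketch. */
    static final class RegionStats {
        final int id;
        final long reclaimableBytes;  // garbage that evacuating this Region would free
        final double predictedCostMs; // estimated evacuation time, must be > 0

        RegionStats(int id, long reclaimableBytes, double predictedCostMs) {
            this.id = id;
            this.reclaimableBytes = reclaimableBytes;
            this.predictedCostMs = predictedCostMs;
        }

        /** Bytes reclaimed per millisecond of pause: the "value versus cost" ranking key. */
        double efficiency() {
            return reclaimableBytes / predictedCostMs;
        }
    }

    /** Greedily picks the most efficient Regions whose total predicted cost fits the goal. */
    static List<RegionStats> choose(List<RegionStats> candidates, double pauseGoalMs) {
        List<RegionStats> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));

        List<RegionStats> chosen = new ArrayList<>();
        double budgetMs = pauseGoalMs;
        for (RegionStats region : sorted) {
            if (region.predictedCostMs <= budgetMs) {
                chosen.add(region);
                budgetMs -= region.predictedCostMs;
            }
        }
        return chosen;
    }
}
```

In this model, a Region with little garbage and many live objects ranks last and is skipped once the remaining pause budget is too small, which mirrors the idea of collecting the Regions with the greatest benefit first.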
The difference between G1 and CMS
- Viewed as a whole, G1 is based on a mark-compact algorithm, but locally (between two Regions) it is based on a copying algorithm. CMS is based on a mark-sweep algorithm, so G1 does not produce memory fragmentation while CMS does.
- CMS uses a post-write barrier to maintain the card table. G1 also uses a post-write barrier to maintain the card table, and in addition uses a pre-write barrier to track pointer changes during concurrent marking (in order to implement snapshot-at-the-beginning, SATB).
- CMS uses the traditional young generation and old generation layout for the Java heap, while G1 uses the new Region-based layout.
- The CMS collector only collects the old generation and is paired with a young generation collector such as Serial or ParNew. The G1 collector covers both the old generation and the young generation, so it does not need to be combined with another collector.
- CMS uses incremental update to handle marking errors caused by reference changes during concurrent marking, while G1 uses SATB.
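The SATB idea behind G1's pre-write barrier can be shown with a very small simulation. It is a toy model and not HotSpot code: `SatbSketch`, `Node`, and the single `ref` field per object are assumptions that keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy object graph, not HotSpot code. Each Node has one reference field to keep the model small.
final class SatbSketch {
    static final class Node {
        final String name;
        Node ref;

        Node(String name) {
            this.name = name;
        }
    }

    private boolean markingActive;
    private final Deque<Node> satbQueue = new ArrayDeque<>();
    private final Set<Node> marked = new HashSet<>();

    void startMarking() {
        markingActive = true;
    }

    /** Reference store with a pre-write barrier: log the OLD value before overwriting it. */
    void writeRef(Node holder, Node newValue) {
        if (markingActive && holder.ref != null) {
            satbQueue.push(holder.ref);
        }
        holder.ref = newValue;
    }

    /** Marks start and everything reachable from it, stopping at already marked nodes. */
    void markFrom(Node start) {
        Node current = start;
        while (current != null && marked.add(current)) {
            current = current.ref;
        }
    }

    /** Final marking: drain the logged old values so the graph as of marking start stays live. */
    void finalMark() {
        while (!satbQueue.isEmpty()) {
            markFrom(satbQueue.pop());
        }
        markingActive = false;
    }

    boolean isMarked(Node node) {
        return marked.contains(node);
    }
}
```

A possible scenario: B references C, the marker has already finished with A, and the user program then stores C into A and clears B's reference. A plain traversal would never reach C, but because the pre-write barrier logged C as the old value of B's field, `finalMark()` still marks it.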