
Interviewer: Can you tell me about the G1 garbage collector?

Candidate: Hmm, okay.

Candidate: Last time I mentioned the drawbacks of the CMS garbage collector: memory fragmentation, and the need to reserve space.

Candidate: When CMS runs into either of these problems, the pause is very likely to become too long. To put it bluntly, the pause time of CMS is "unpredictable."

Candidate: G1 can be understood as an "upgrade" over the CMS garbage collector.

Candidate: With G1 you can set the Stop The World pause time you want, and the collector will try to meet that target.
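
Candidate: To make that concrete: the pause target is normally passed as a JVM option, -XX:MaxGCPauseMillis, alongside -XX:+UseG1GC. Here is a minimal sketch for checking what the running JVM actually picked up. The class name G1PauseGoalCheck and the 100 ms value are just my own choices for illustration, and the collector names it prints depend on the JDK version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Example launch, assuming the file is saved as G1PauseGoalCheck.java:
//   java -XX:+UseG1GC -XX:MaxGCPauseMillis=100 G1PauseGoalCheck.java
class G1PauseGoalCheck {

    // Names of the collectors registered by the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // The -XX options the JVM was started with (empty if none were passed).
    static List<String> xxFlags() {
        List<String> flags = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-XX:")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        System.out.println("-XX flags:  " + xxFlags());
    }
}
```

Candidate: Keep in mind that the value is a goal G1 tries to meet, not a guarantee.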

Candidate: When I introduced the JVM heap earlier, I drew a picture in which the generations of the heap were separated "physically".

Candidate: In the world of G1, the heap is no longer divided "physically" but "logically".

Candidate: However, the concept of "generations" mentioned before still applies in G1.

Candidate: For example, new objects are generally allocated in the Eden area, and young-generation objects that are still alive after 15 Minor GCs (the default threshold) are promoted to the old generation...

Candidate: Let me draw the layout of the "heap" in the world of G1.

Candidate: As the figure shows, the heap is divided into many equal-sized areas, and in G1 each of them is called a Region.

Candidate: I shouldn't need to say more about the old generation, the young generation, and Survivor, right? The rules are the same as with CMS.

Candidate: G1 also has another kind of area called Humongous, which is used to store particularly large objects (larger than half the size of a Region).

Candidate: Once a large object is found to have no references, it can be reclaimed directly during a young-generation Minor GC.
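
Candidate: The "larger than half a Region" rule is simple enough to write down. This is only a sketch of the rule as I just described it: the class and method names are made up, the 1 MB Region size is an assumed example value, and I am assuming that an object too big for one Region spans several of them.

```java
class HumongousSketch {

    // An object counts as humongous when it is larger than half a Region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Number of Regions such an object would span (ceiling division).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 1024 * 1024;      // assumed Region size: 1 MB
        long small = 100 * 1024;        // a 100 KB object
        long big = 5 * region / 2;      // a 2.5 MB object

        System.out.println(isHumongous(small, region));  // false
        System.out.println(isHumongous(big, region));    // true
        System.out.println(regionsNeeded(big, region));  // 3
    }
}
```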

Interviewer: Well...

Candidate: In fact, if you think about it a little, you can see why the "heap space" is "subdivided" into many small areas.

Candidate: With the previous garbage collectors, the heap is divided "physically".

Candidate: If the heap is large, a whole large area has to be reclaimed on every "garbage collection", so the collection time is hard to control.

Candidate: Once the heap is divided into many small areas, the "collection time" is easy to control by choosing how many of these "small areas" to reclaim.

Interviewer: Well...

Interviewer: I think I get the idea. How about you talk about its GC process?

Candidate: Well, collections in G1 can be divided into Minor GC (Young GC) and Mixed GC. There are also special scenarios in which a Full GC may occur.

Candidate: Shall I start with Minor GC?

Interviewer: Well...

Candidate: A G1 Minor GC is triggered under the same condition as in the garbage collectors mentioned earlier.

Candidate: Once the Eden area is full, a Minor GC is triggered. A Minor GC is also Stop The World.

Candidate: I want to add that in the world of G1, the heap space occupied by the young and old generations is not fixed (it is adjusted dynamically according to the "maximum pause time").

Candidate: It is enough to know that G1 provides parameters for configuring this.

Candidate: Therefore, dynamically changing the number of young-generation Regions can "control" the overhead of a Minor GC.

Interviewer: Well, what about the collection process of a Minor GC? Can you go into a little more detail?

Candidate: I think a Minor GC can be roughly divided into three steps: root scanning, updating and processing the RSet, and copying objects.

Candidate: The first step should be easy to understand, because it is similar to what CMS does and can be thought of as the initial marking process.

Candidate: The second step involves the concept of the "RSet".

Interviewer: Well...

Candidate: Last time, when we talked about the CMS collection process, we also talked about Minor GC, which uses a "card table" to avoid scanning every object in the old generation.

Candidate: A Minor GC reclaims young-generation objects, but if objects in the old generation refer to objects in the young generation, those referenced objects must not be reclaimed.

Candidate: The same problem exists in G1 (it is still a Minor GC, after all). CMS uses a card table, while the structure G1 uses to solve the "cross-generation reference" problem is generally called the RSet (Remembered Set).

Candidate: Every Region has its own RSet, which records "which other Regions hold references to objects in the current Region".

Candidate: For a young-generation Region, the RSet only stores references from the old generation (references from the young generation need not be stored, because the whole young generation is scanned during a Minor GC anyway).

Candidate: For an old-generation Region, the RSet only stores references from other old-generation Regions (G1 always collects the young generation when it collects old-generation Regions, so references from the young generation need not be stored).
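
Candidate: If it helps, here is a toy model of those two rules. It is only a sketch under my own simplifications: Region, Kind, and recordReference are hypothetical names, and each RSet stores plain Region ids, whereas a real RSet works at a much finer granularity inside the JVM.

```java
import java.util.HashSet;
import java.util.Set;

class RSetSketch {
    enum Kind { YOUNG, OLD }

    static final class Region {
        final int id;
        final Kind kind;
        // Remembered set: ids of other Regions that hold references into this one.
        final Set<Integer> rset = new HashSet<>();

        Region(int id, Kind kind) {
            this.id = id;
            this.kind = kind;
        }
    }

    // Called when an object in `from` stores a reference to an object in `to`.
    static void recordReference(Region from, Region to) {
        if (from == to) {
            return; // references inside one Region need no entry
        }
        if (from.kind == Kind.YOUNG) {
            return; // young Regions are always collected and scanned, so nothing is stored
        }
        to.rset.add(from.id); // only references coming from old Regions are remembered
    }

    public static void main(String[] args) {
        Region eden = new Region(1, Kind.YOUNG);
        Region oldA = new Region(2, Kind.OLD);
        Region oldB = new Region(3, Kind.OLD);

        recordReference(oldA, eden); // old -> young: remembered
        recordReference(eden, oldB); // young -> old: not remembered
        recordReference(oldA, oldB); // old -> old: remembered

        System.out.println("eden RSet: " + eden.rset); // [2]
        System.out.println("oldA RSet: " + oldA.rset); // []
        System.out.println("oldB RSet: " + oldB.rset); // [2]
    }
}
```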

Interviewer: Well...

Candidate: So the second step should be easy to understand, right?

Candidate: It is nothing more than processing and scanning the RSet information, so that old-generation references to young-generation objects are treated as additional GC Roots and those objects are not reclaimed.

Candidate: The third step is also very easy to understand: copy the objects that survived the scan into an "empty Survivor area" or the "old generation", and clear the Eden areas.

Candidate: What I want to mention here is that G1 has another term, CSet.

Candidate: Its full name is Collection Set, and it holds the Regions that "will be garbage collected" in one GC. All surviving objects in the CSet are moved to other available Regions.
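
Candidate: The copy step is easier to see in a toy model. The following sketch is only an illustration, not how HotSpot implements it: Obj, Region, and evacuate are hypothetical names, liveness is given as a flag instead of being discovered by scanning, and the destination Regions are assumed to be outside the CSet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EvacuationSketch {
    static final int TENURING_THRESHOLD = 15; // the default age mentioned earlier

    static final class Obj {
        final boolean live;
        int age; // number of Minor GCs survived so far

        Obj(boolean live, int age) {
            this.live = live;
            this.age = age;
        }
    }

    static final class Region {
        final List<Obj> objects = new ArrayList<>();
    }

    // Copies every live object out of the CSet, then frees the CSet Regions.
    static void evacuate(List<Region> cset, Region survivor, Region old) {
        for (Region region : cset) {
            for (Obj obj : region.objects) {
                if (!obj.live) {
                    continue; // garbage is simply not copied
                }
                obj.age++;
                if (obj.age >= TENURING_THRESHOLD) {
                    old.objects.add(obj);      // old enough: promote
                } else {
                    survivor.objects.add(obj); // still young: keep in a Survivor Region
                }
            }
            region.objects.clear(); // the whole Region becomes free
        }
    }

    public static void main(String[] args) {
        Region eden = new Region();
        eden.objects.add(new Obj(true, 0));          // fresh live object
        eden.objects.add(new Obj(false, 0));         // garbage
        Region fromSurvivor = new Region();
        fromSurvivor.objects.add(new Obj(true, 14)); // about to reach the threshold

        Region toSurvivor = new Region();
        Region old = new Region();
        evacuate(Arrays.asList(eden, fromSurvivor), toSurvivor, old);

        System.out.println(toSurvivor.objects.size()); // 1
        System.out.println(old.objects.size());        // 1
        System.out.println(eden.objects.size());       // 0
    }
}
```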

Candidate: At the end of a Minor GC, soft references, weak references, JNI weak references, and other reference types are processed, and the collection ends.

Interviewer: Well, I understand. It's not difficult.

Interviewer: I remember you mentioned Mixed GC earlier. Why not talk about that process?

Candidate: Okay, no problem.

Candidate: A Mixed GC is triggered when the heap occupancy reaches a certain threshold (45% by default, configurable through the -XX:InitiatingHeapOccupancyPercent parameter).
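
Candidate: As a rough illustration of that threshold check, here is a small sketch. The names are hypothetical, and the occupancy it prints comes from Runtime, which is only an approximation of the figure G1 tracks internally.

```java
class IhopSketch {
    static final int DEFAULT_THRESHOLD_PERCENT = 45; // the default mentioned above

    // True once used space reaches the given percentage of the heap.
    // Dividing first avoids overflow when the heap size is reported as very large.
    static boolean reachedThreshold(long usedBytes, long heapBytes, int percent) {
        return usedBytes >= heapBytes / 100 * percent;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // approximate current usage
        long max = rt.maxMemory();                      // upper bound for the heap
        System.out.println("used=" + used + " max=" + max + " reached="
                + reachedThreshold(used, max, DEFAULT_THRESHOLD_PERCENT));
    }
}
```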

Candidate: Mixed GC relies on the per-Region statistics gathered by "global concurrent marking".

Candidate: The "global concurrent marking" process is very similar to CMS. The steps are roughly: initial marking (STW), concurrent marking, final marking (STW), and cleanup (STW).

Interviewer: It really is similar. Will you go on to the specific process?

Candidate: Well, first I want to explain that a Mixed GC always collects the young generation and also collects some old-generation Regions, which is why it is called a "mixed" GC.

Candidate: First comes "initial marking". It "shares" the Stop The World pause of a Minor GC (a Mixed GC always involves a Minor GC) and reuses the "scan GC Roots" operation.

Candidate: In this process, both the old and young generations are scanned.

Candidate: In general, "initial marking" is relatively fast, since it does not trace further through the object graph.

Interviewer: ...

Candidate: Next comes "concurrent marking", a stage that does not Stop The World.

Candidate: The GC threads run alongside the user threads and are responsible for collecting the live-object information of each Region.

Candidate: They trace down from the GC Roots to find the surviving objects in the entire heap, which is time-consuming.

Interviewer: Well...

Candidate: Next comes the "remark" stage (the final marking), which, as in CMS, marks the objects whose references changed during the "concurrent marking" stage.

Candidate: Isn't it simple?

Interviewer: Wait a minute.

Interviewer: In its "remark" phase, CMS has to rescan all thread stacks and the entire young generation as roots.

Interviewer: As far as I know, G1 does not seem to work like this. Do you know about that?

Candidate: Right, G1 does not work like this. To handle references that change during the "concurrent marking" phase, G1 uses the SATB (snapshot-at-the-beginning) algorithm.

Candidate: It can be simply understood as: when the GC starts, it takes a "snapshot" of the surviving objects.

Candidate: During the "concurrent phase", it records the old reference value every time a reference changes.

Candidate: Then in the "remark" phase, only the references that "changed" are scanned to see whether the objects they pointed to are still alive, and those objects are added to the "GC Roots".

Candidate: But the SATB algorithm has a small problem: if G1 considers an object alive at the beginning, the object will not be reclaimed in this GC, even if it has become garbage during the "concurrent phase".

Candidate: Therefore, G1 may also have the problem of "floating garbage".

Candidate: But in general this is not a big problem for G1 (after all, it does not aim to remove all the garbage at once, but focuses on the Stop The World time).
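
Candidate: Here is a toy model of the SATB idea, including the floating garbage. It is a sketch under heavy simplifications of my own: every name is hypothetical, each object has a single reference field, and "concurrent" marking is just an ordinary method call, whereas the real barrier and queues live inside the JVM.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class SatbSketch {
    static final class Node {
        final String name;
        Node ref; // a single reference field, for brevity

        Node(String name) {
            this.name = name;
        }
    }

    boolean markingActive = false;
    final Deque<Node> satbQueue = new ArrayDeque<>(); // old values of overwritten references
    final Set<Node> marked = new HashSet<>();

    // Write barrier: while marking is active, remember the old target before overwriting it.
    void writeRef(Node holder, Node newTarget) {
        if (markingActive && holder.ref != null) {
            satbQueue.add(holder.ref);
        }
        holder.ref = newTarget;
    }

    // Marks everything reachable from `start` by following the ref field.
    void markFrom(Node start) {
        for (Node n = start; n != null && marked.add(n); n = n.ref) {
            // marked.add returns false for an already marked node, which ends the walk
        }
    }

    // Remark: drain the queue so everything that was alive in the snapshot stays marked.
    void remark() {
        while (!satbQueue.isEmpty()) {
            markFrom(satbQueue.poll());
        }
    }

    public static void main(String[] args) {
        SatbSketch gc = new SatbSketch();
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        root.ref = a;
        a.ref = b;

        gc.markingActive = true; // snapshot taken: root -> a -> b
        gc.writeRef(a, null);    // user thread drops a -> b before the marker reaches a
        gc.markFrom(root);       // marking now only sees root and a
        System.out.println(gc.marked.contains(b)); // false
        gc.remark();             // the queue still holds b
        System.out.println(gc.marked.contains(b)); // true: b is floating garbage
    }
}
```

Candidate: In this run b is unreachable by the end, yet it stays marked because it was alive in the snapshot, so it is only reclaimed in a later cycle.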

Interviewer: Well...

Candidate: The last stage is "cleanup". This stage is also Stop The World, and it mainly does accounting and resets the marking state.

Candidate: The "pause prediction model" (which is driven by the configured pause time) determines how many Regions will be collected in this GC.

Candidate: Generally speaking, a Mixed GC selects all young-generation Regions, plus some old-generation Regions with "high collection value" (high value simply means more garbage).

Candidate: Finally, the Mixed GC does its reclamation by "copying".

Candidate: Therefore, one collection may not reclaim all the garbage; G1 chooses the number of Regions based on the pause time.
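
Candidate: A simplified way to picture that selection is a greedy choice under a time budget. This is only a sketch of the idea, not G1's actual prediction model: the names are hypothetical, the per-Region cost in milliseconds is an invented input, and the budget stands for whatever pause time is left after the young Regions are accounted for.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class CSetSelectionSketch {
    static final class OldRegion {
        final int id;
        final long garbageBytes;  // reclaimable space found by concurrent marking
        final double predictedMs; // hypothetical predicted cost of collecting this Region

        OldRegion(int id, long garbageBytes, double predictedMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMs = predictedMs;
        }
    }

    // Picks the Regions with the most garbage first until the time budget is used up.
    static List<OldRegion> chooseOldRegions(List<OldRegion> candidates, double budgetMs) {
        List<OldRegion> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong((OldRegion r) -> r.garbageBytes).reversed());

        List<OldRegion> chosen = new ArrayList<>();
        double spentMs = 0;
        for (OldRegion region : sorted) {
            if (spentMs + region.predictedMs > budgetMs) {
                break; // the next most valuable Region no longer fits in the pause
            }
            chosen.add(region);
            spentMs += region.predictedMs;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<OldRegion> candidates = new ArrayList<>();
        candidates.add(new OldRegion(1, 900, 3.0));
        candidates.add(new OldRegion(2, 100, 2.0));
        candidates.add(new OldRegion(3, 500, 4.0));

        List<Integer> ids = new ArrayList<>();
        for (OldRegion region : chooseOldRegions(candidates, 8.0)) {
            ids.add(region.id);
        }
        System.out.println(ids); // [1, 3]
    }
}
```

Candidate: Region 2 is left for a later collection, which is exactly why one Mixed GC may not reclaim everything.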

Interviewer: Well, I roughly understand the process.

Interviewer: Can a Full GC happen in G1?

Candidate: If Mixed GC cannot keep up with the allocation rate of the user threads, so that the old generation fills up and Mixed GC cannot continue, G1 falls back to a Serial Old style Full GC that collects the entire heap.

Candidate: However, this scenario is still much rarer than with CMS. After all, G1 does not have the memory fragmentation problem of CMS.

Summary of this article (features of the G1 garbage collector):

  • The generations change from "physical" to "logical": the heap memory is "logically" divided into multiple Regions
  • A CSet stores the set of Regions to be collected
  • An RSet deals with cross-generation references (note: an RSet does not keep references that come from the young generation)
  • G1 collections can be simply divided into Minor GC, Mixed GC, and Full GC
  • [Triggered when the Eden area is full] The Minor GC process can be simply divided into: (STW) scanning GC Roots, updating and processing the RSet, and copying and clearing
  • [Triggered when heap occupancy reaches a certain percentage] Mixed GC relies on "global concurrent marking" to obtain the CSet (the Regions to reclaim), and then "copies and clears"
  • As R put it when describing the principle of G1: from a macro perspective, G1 is essentially "global concurrent marking" plus "copying surviving objects"
  • The SATB algorithm deals with object references that may be modified during the "concurrent marking" phase
  • Pause-time parameters are provided for users to set (G1 tries to meet the pause time by adjusting the number of Regions to be collected during a GC)

Java3y