
How Java Garbage Collection Works

Original article: https://javapapers.com/java/h...

This tutorial covers the basics of Java garbage collection and how it works. It is the second part of the garbage collection tutorial series; ideally you have already read the first part, the introduction to Java garbage collection.

Java garbage collection is an automatic process that manages the runtime memory used by programs. Because it is automatic, the JVM relieves the programmer of the overhead of allocating and freeing memory in a program.

Java Garbage Collection Initiation

Because the process is automatic, programmers need not initiate garbage collection explicitly in code. System.gc() and Runtime.getRuntime().gc() are hooks that request the JVM to start a garbage collection.

Although this request mechanism gives the programmer a way to ask for a collection, the onus is on the JVM. It can choose to ignore the request, so these calls are not guaranteed to run a garbage collection. The JVM may base this decision on factors such as the eden space available in heap memory. The JVM specification leaves the choice to the implementation, so the details are implementation specific.
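
The request mechanism described above can be sketched as follows. This is a minimal illustration, not a benchmark: the class name GcRequestDemo and its helper methods are hypothetical, and the numbers printed depend entirely on the JVM in use. The "after" figure may or may not be lower than the "before" figure, because the JVM is free to ignore the request.

```java
public class GcRequestDemo {

    /** Heap bytes currently in use, as reported by the running JVM. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Creates short-lived garbage, requests a collection, returns {before, after}. */
    static long[] requestGc() {
        for (int i = 0; i < 100_000; i++) {
            byte[] garbage = new byte[1024]; // unreachable after each iteration
        }
        long before = usedHeapBytes();
        System.gc(); // a request only; same effect as Runtime.getRuntime().gc()
        long after = usedHeapBytes();
        return new long[] {before, after};
    }

    public static void main(String[] args) {
        long[] result = requestGc();
        System.out.println("Used before request: " + result[0] + " bytes");
        System.out.println("Used after request:  " + result[1] + " bytes");
    }
}
```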

It is well known that garbage collection cannot be forced. Even so, I found a scenario in which invoking System.gc() makes sense; a separate article covers that corner case.

Java Garbage Collection Process

Garbage collection is the process of reclaiming unused memory space and making it available for future instances.

[Figure: Java garbage collection process]

Eden Space: When an instance is created, it is first stored in the eden space in the young generation of the heap.

NOTE: If any of these terms are unfamiliar, I recommend reading the garbage collection introduction tutorial, which covers the memory model, JVM architecture and this terminology in detail.

Survivor Space (S0 and S1): As part of the minor garbage collection cycle, objects that are live (still referenced) are moved from eden space to survivor space S0. Similarly, the garbage collector scans S0 and moves the live instances to S1.

Instances that are not live (no longer referenced) are marked for garbage collection. Depending on the garbage collector chosen (there are four types of garbage collectors, covered in the next tutorial), the marked instances are either removed from memory on the go or evicted in a separate pass.

Old Generation: The old, or tenured, generation is the second logical part of the heap memory. When the garbage collector runs a minor GC cycle, instances that are still live in the S1 survivor space are promoted to the old generation. Objects in the S1 space that are no longer referenced are marked for eviction.
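
To see how a running JVM divides its heap, the standard java.lang.management API can list the heap memory pools. The sketch below is illustrative only: the class name HeapPools and the helper heapPoolNames() are hypothetical, and the pool names printed (eden, survivor, old or tenured, or something else entirely) depend on the JVM and on the garbage collector selected.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class HeapPools {

    /** Names of the heap memory pools exposed by the running JVM. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is invalid
            long used = (usage == null) ? 0 : usage.getUsed();
            System.out.println(pool.getName() + ": " + used + " bytes used");
        }
    }
}
```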

Major GC: The old generation is the last phase in the life cycle of an instance with respect to the Java garbage collection process. Major GC is the garbage collection process that scans the old generation part of the heap. Instances that are no longer referenced are marked for eviction; the rest continue to stay in the old generation.
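
One way to observe minor and major collections at runtime is through the collector beans in java.lang.management. The following is a rough sketch under assumptions: the class name GcCounters and the helper collectorCount() are hypothetical, and how many beans exist, what they are called, and which of them handles the young or old generation all vary by JVM and collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCounters {

    /** Number of collector beans the running JVM exposes. */
    static int collectorCount() {
        return ManagementFactory.getGarbageCollectorMXBeans().size();
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() return -1 when undefined
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```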

Memory Fragmentation: Once instances are deleted from the heap, their locations become empty and available for future allocations. These empty spaces end up fragmented across the memory area. For quicker allocation, the memory should be defragmented. Depending on the garbage collector chosen, the reclaimed memory is either compacted on the go or compacted in a separate pass of the GC.
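
The idea of compaction can be shown with a toy model. This is not how any real JVM stores objects: the heap is modelled here as a plain array in which null marks a reclaimed slot, and the class name CompactionSketch is hypothetical. After compaction, all free space is contiguous, so the next allocation only needs to bump a single index.

```java
import java.util.Arrays;

public class CompactionSketch {

    /**
     * Slides live (non-null) slots to the front of the array, preserving
     * their order, and returns the index of the first free slot.
     */
    static int compact(String[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String live = heap[i];
                heap[i] = null;       // vacate the old location
                heap[free++] = live;  // move to the next compacted position
            }
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", null, "B", null, null, "C"}; // holes left by eviction
        int nextFree = compact(heap);
        System.out.println(Arrays.toString(heap) + ", next free slot = " + nextFree);
        // prints: [A, B, C, null, null, null], next free slot = 3
    }
}
```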

Finalization of Instances in Garbage Collection

Just before evicting an instance and reclaiming its memory, the Java garbage collector invokes the finalize() method of that instance so that it gets a chance to free any resources it holds. Although finalize() is guaranteed to be invoked before the memory is reclaimed, no order or time is specified. The order across multiple instances cannot be predetermined, and finalizers can even run in parallel. Programs should not presume an order between instances when reclaiming resources in finalize().

  • Any uncaught exception thrown during finalization is silently ignored, and the finalization of that instance is cancelled.
  • The JVM specification does not prescribe how garbage collection treats weak references, and it says so explicitly. The details are left to the implementer.
  • Garbage collection is done by a daemon thread.

When Does an Object Become Eligible for Garbage Collection?

  • Any instance that cannot be reached from a live thread.
  • Circularly referenced instances that cannot be reached from any other instance.
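
Both rules come down to reachability from a set of roots. The toy "mark" pass below illustrates the idea on a small graph of named nodes; it is a sketch, not a description of any real collector, and the names ReachabilitySketch, cycle() and reachable() are hypothetical. Three nodes that reference each other in a cycle are all reachable while one root points into the cycle, and none are reachable once the root is gone.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReachabilitySketch {

    /** A three-node cycle: t1 -> t2 -> t3 -> t1. */
    static Map<String, List<String>> cycle() {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("t1", Arrays.asList("t2"));
        edges.put("t2", Arrays.asList("t3"));
        edges.put("t3", Arrays.asList("t1"));
        return edges;
    }

    /** Returns every node reachable from the roots by following edges. */
    static Set<String> reachable(Map<String, List<String>> edges, Collection<String> roots) {
        Set<String> marked = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String node = pending.pop();
            if (marked.add(node)) { // first visit: follow its outgoing references
                List<String> targets = edges.get(node);
                if (targets != null) {
                    for (String next : targets) {
                        pending.push(next);
                    }
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = cycle();
        // One root into the cycle keeps all three nodes alive.
        System.out.println(reachable(edges, Arrays.asList("t1"))); // prints [t1, t2, t3]
        // With no roots, the cycle is an unreachable island.
        System.out.println(reachable(edges, Collections.<String>emptyList())); // prints []
    }
}
```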

There are different types of references in Java. An instance's eligibility for garbage collection depends on the type of reference that refers to it.

Reference            Garbage Collection
Strong Reference     Not eligible for garbage collection
Soft Reference       Garbage collection possible, but only as a last option
Weak Reference       Eligible for garbage collection
Phantom Reference    Eligible for garbage collection
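
The reference types above correspond to classes in the java.lang.ref package. The sketch below shows the parts that are predictable: a weak reference still returns its referent while a strong reference exists, and PhantomReference.get() always returns null. The class name ReferenceKinds and its helper methods are hypothetical. What the weak and soft references return after the strong reference is dropped depends on whether and when the JVM collects, so the last line of output is not guaranteed.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceKinds {

    /** True: a weakly referenced object survives while a strong reference exists. */
    static boolean weakSurvivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // request only; the referent is strongly reachable either way
        return weak.get() == strong;
    }

    /** True: a phantom reference never hands its referent back. */
    static boolean phantomGetIsNull() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("weak survives while strongly held: " + weakSurvivesWhileStronglyHeld());
        System.out.println("phantom get() is null: " + phantomGetIsNull());

        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(new Object());
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null; // drop the only strong reference
        System.gc();   // request only; the results below are not guaranteed
        System.out.println("after GC request: weak=" + weak.get() + ", soft=" + soft.get());
    }
}
```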

As an optimization, the Java compiler can choose to assign null to a variable that will not be used again, which makes the instance it referred to eligible for eviction.

class Animal {
    public static void main(String[] args) {
        Animal lion = new Animal();
        System.out.println("Main is completed.");
    }

    protected void finalize() {
        System.out.println("Rest in Peace!");
    }
}

In the above class, the lion instance is never used after the line that instantiates it. So, as an optimization measure, the Java compiler can assign lion = null just after that line. As a result, the finalizer can print 'Rest in Peace!' even before the println output appears. We cannot prove this deterministically, as it depends on the JVM implementation and the memory in use at runtime. The lesson is that the compiler can choose to free instances earlier in a program if it sees that they are not referenced again.

  • Here is one more example of when an instance can become eligible for garbage collection. All the field values of an instance can be loaded into registers, and from then on the program reads the values from the registers. If the values will never be written back to the instance, the instance can be marked eligible for garbage collection even though its values are still in use. Classic, isn't it?
  • Eligibility can be as simple as an instance becoming collectable when null is assigned to the variable that referred to it, or as complex as the point above. These are choices made by the JVM implementer. The objective is to leave as small a footprint as possible, improve responsiveness and increase throughput. To achieve this, the JVM implementer can choose a better scheme or algorithm for reclaiming memory during garbage collection.
  • When finalize() is invoked, the thread that invokes it holds no user-visible synchronization locks.

Example Program for GC Scope

class GCScope {
    GCScope t;
    static int i = 1;

    public static void main(String[] args) {
        GCScope t1 = new GCScope();
        GCScope t2 = new GCScope();
        GCScope t3 = new GCScope();
        // No object is eligible for GC

        t1.t = t2; // No object is eligible for GC
        t2.t = t3; // No object is eligible for GC
        t3.t = t1; // No object is eligible for GC

        t1 = null;
        // No object is eligible for GC (t3.t still refers to the first object)

        t2 = null;
        // No object is eligible for GC (t3.t.t still refers to the second object)

        t3 = null;
        // All three objects are eligible for GC. No variable refers to any of
        // them; only their t fields refer to each other in a circle, forming
        // an island of objects with no external reference.
    }

    // Runs only if and when the JVM finalizes an instance; it may never run.
    protected void finalize() {
        System.out.println("Garbage collected from object" + i);
        i++;
    }
}

Example Program for GC OutOfMemoryError

Garbage collection does not guarantee safety from out-of-memory issues. Careless code can still lead to an OutOfMemoryError.

import java.util.LinkedList;
import java.util.List;

public class GC {
    public static void main(String[] args) {
        List<String> list = new LinkedList<>();
        // Infinite loop that adds a new String to the list on each iteration.
        // Every String stays reachable through the list, so none can be
        // collected and the heap eventually fills up.
        do {
            list.add(new String("Hello, World"));
        } while (true);
    }
}

Output:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.LinkedList.linkLast(LinkedList.java:142)
    at java.util.LinkedList.add(LinkedList.java:338)
    at com.javapapers.java.GCScope.main(GCScope.java:12)

Next is the third part of the garbage collection tutorial series, which covers the different types of Java garbage collectors available.


捏造的信仰

Java developer