The JavaScript memory mechanism in the V8 engine

For front-end engineers, the JS memory mechanism cannot be ignored. If you want to become an expert or build high-performance front-end applications, you need to understand how JavaScript manages memory.

Let's look at an example first

    function foo (){
        let a = 1
        let b = a
        a = 2
        console.log(a) // 2
        console.log(b) // 1
        
        let c = { name: '掘金' }
        let d = c
        c.name = '沐华'
        console.log(c) // { name: '沐华' }
        console.log(d) // { name: '沐华' }
    }
    foo()

As you can see, modifying values of different data types gives different results: changing a does not affect b, but changing c.name also changes what d shows.

This is because different data types are stored in different places in memory. While JS runs, there are three main memory spaces: code space, stack, and heap.

The code space mainly stores executable code. There is a lot to say about it; see my other article for a detailed introduction.

Let's take a look at the stack and heap first

Stack and heap

In JS, every piece of data needs its own memory space, and the different memory spaces have different characteristics.

The call stack is also called the execution stack. It works on a first-in, last-out basis: the context that was pushed last is popped first.

Stack:

Stores primitive types: Number, String, Boolean, null, undefined, Symbol, BigInt
Accessed last in, first out (like a bottle: whatever goes in last comes out first)
Memory is allocated and released automatically, and each value occupies a fixed size
Also stores reference-type variables, but what is saved is not the object itself, only a pointer to it (the object's address in heap memory)
Variables defined inside a function are stored on the stack; when the function finishes, its stack memory is destroyed automatically
Functions can call themselves recursively; as the stack gets deeper, the JS engine has to maintain a long chain of calls, and when stack memory runs out, a stack overflow occurs
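The last point, about recursion exhausting the stack, can be observed directly. The following is a minimal sketch, assuming a V8-based runtime such as Node or Chrome; the function name measureStackDepth is made up for illustration, and the depth reached depends on the engine and its settings.

```javascript
// Recurse without a base case until the engine refuses to push another frame.
function measureStackDepth() {
  let depth = 0;

  function recurse() {
    depth += 1; // one more frame on the call stack
    recurse();
  }

  try {
    recurse();
  } catch (err) {
    // V8 reports stack exhaustion as a RangeError.
    return { depth, errorName: err.name };
  }
}

const overflow = measureStackDepth();
console.log(overflow.errorName); // RangeError
console.log(overflow.depth > 0); // true
```

Because the error is an ordinary exception, the program can catch it and carry on, but in practice it usually signals a missing base case.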

Heap:

Stores reference types: Object (Function/Array/Date/RegExp)
Memory is allocated dynamically and its size can vary; it is not released automatically when a function returns, but reclaimed later by the garbage collector
Objects in heap memory are not destroyed when the function finishes, because they may still be referenced by another variable (passed as an argument, for example)
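To make the last point concrete, here is a small sketch of an object that outlives the function that created it. The helper names createUser and rename are invented for this illustration.

```javascript
function createUser() {
  const user = { name: '沐华' }; // the object lives in the heap; `user` only holds a reference
  return user;                  // the reference is handed to the caller
}

function rename(target, newName) {
  target.name = newName; // the parameter points at the same heap object
}

const kept = createUser(); // createUser has finished, but the object is still referenced
const alias = kept;        // copies the reference, not the object
rename(alias, '掘金');

console.log(kept.name);      // 掘金
console.log(kept === alias); // true
```

The stack frame of createUser is gone by the time rename runs, yet the object is still reachable through kept and alias, so it cannot be reclaimed.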

`Why is there a distinction between stack and heap`

This is usually related to the garbage collection mechanism. Each function, when executed, sets up its own stack memory and puts its variables into it one by one. When the function finishes, that stack memory is destroyed automatically.

To keep the memory footprint of a running program as small as possible, the stack space is kept small, while the heap space is large.

Every object that is created is saved in the heap so that it can be reused. Even when the function finishes, the object is not destroyed, because it may still be referenced by another variable (passed as an argument, for example). Only when nothing references the object any more is it destroyed by the garbage collection mechanism.

The JS engine also needs the stack to maintain the state of the execution context while the program runs. If all data lived on the stack and the stack were large, context switching would become slower, which would hurt the execution efficiency of the whole program.

`Memory leaks and garbage collection`

As mentioned above, memory is allocated when variables (objects, strings, etc.) are created in JS, and memory is "automatically" released when they are no longer used. This process of automatically releasing memory is called garbage collection. It is precisely because of the existence of the garbage collection mechanism that many developers do not care about memory management during development, which leads to memory leaks in some cases.

Memory life cycle:

Memory allocation: When we declare variables, functions, and objects, the system will automatically allocate memory for them
Memory usage: read and write memory, that is, use variables, functions, parameters, etc.
Memory recycling: After use, the memory that is no longer used will be automatically recycled by the garbage collection mechanism

Local variables (variables inside a function) are reclaimed when the function finishes, provided nothing else (such as a closure) still references them

The life cycle of a global variable does not end until the browser unloads the page, which means that global variables are not garbage collected
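The closure exception for local variables can be seen in a short sketch. The names makeCounter and next are chosen only for this example.

```javascript
function makeCounter() {
  let count = 0;           // local variable of makeCounter
  return function next() { // closure that still references `count`
    count += 1;
    return count;
  };
}

const next = makeCounter(); // makeCounter has returned, but `count` lives on
console.log(next()); // 1
console.log(next()); // 2
```

As long as next is reachable, count cannot be reclaimed; once the last reference to next is dropped, both become eligible for collection.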

`Memory leak`

The operation of the program requires memory. For the continuously running service process, the memory that is no longer used must be released in time, otherwise the memory usage will become larger and larger, which will affect the system performance at least, and cause the process to crash in severe cases.

A memory leak happens when, through negligence or error, a program fails to release memory that is no longer used, so that memory is wasted

`Determine memory leak`

In Chrome browser, you can check the memory usage like this

Developer Tools => Performance => check the Memory box => click Record in the upper left corner => interact with the page, then click Stop

Then the memory usage during this period will be displayed

From a single recording: look at the memory trend graph; if usage keeps rising, you can suspect a memory leak
From several recordings: compare the memory usage of each; if it rises from one to the next, you can also suspect a memory leak

In Node, use the process.memoryUsage method to check memory usage

    console.log(process.memoryUsage());

heapUsed: the part of the heap that is actually in use
rss (resident set size): all memory held by the process, including the code area and the stack
heapTotal: the memory occupied by the heap, both used and unused
external: memory occupied by C++ objects bound to JavaScript objects that V8 manages

When checking for a memory leak, heapUsed is the field to watch
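As a rough sketch of how heapUsed can be watched, the snippet below samples it before and after a deliberately leaky loop. It assumes a Node environment; retained, heapUsedMB, and leakOnce are hypothetical names, and the exact numbers vary from run to run.

```javascript
const retained = []; // hypothetical module-level array that is never cleared

function heapUsedMB() {
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

function leakOnce() {
  // Each call keeps another 50,000-element array reachable from `retained`.
  retained.push(new Array(50000).fill(0));
}

const before = heapUsedMB();
for (let i = 0; i < 100; i += 1) {
  leakOnce();
}
const after = heapUsedMB();

console.log(`heapUsed before: ${before.toFixed(1)} MB`);
console.log(`heapUsed after:  ${after.toFixed(1)} MB`);
```

A single pair of samples proves little. In a real service you would sample periodically and look for a baseline that keeps climbing even after garbage collection has run.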

`What causes memory leaks`

Global variables created by accident, by assigning without declaring
Forgotten timers and callbacks: whatever a timer references stays in memory until the timer is cleared
Closures that keep referencing data that is no longer needed
References to DOM nodes (for example, if a td is still referenced after the whole table has been removed, the entire table stays in memory)
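The forgotten-timer case from the list above can be sketched as follows. The names startPolling, samples, and poller are hypothetical; the point is only that the callback, and everything it closes over, stays reachable until clearInterval is called.

```javascript
function startPolling(readSample, intervalMs) {
  const samples = []; // grows for as long as the timer keeps firing
  let timerId = setInterval(() => {
    samples.push(readSample());
  }, intervalMs);

  return {
    isRunning() {
      return timerId !== null;
    },
    stop() {
      clearInterval(timerId); // the runtime drops its reference to the callback
      timerId = null;
    },
  };
}

const poller = startPolling(() => Date.now(), 1000);
console.log(poller.isRunning()); // true
poller.stop();
console.log(poller.isRunning()); // false
```

If stop is never called, samples keeps growing and can never be collected, even when no other code uses it any more.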

`How to avoid memory leaks`

So remember one principle: give back what you no longer use, and do it promptly.

Reduce unnecessary global variables, for example by using strict mode to avoid creating globals by accident
Reduce long-lived objects and avoid keeping too many objects around
Release references promptly once data is no longer needed (variables held by closures, DOM references, timers)
Organize the logic to avoid infinite loops, which freeze and crash the browser
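The strict-mode advice in the first point can be checked with a small sketch. The names assignWithoutDeclaration and accidentalGlobal are made up for the example.

```javascript
function assignWithoutDeclaration() {
  'use strict';
  try {
    accidentalGlobal = 1; // never declared with let, const, or var
    return 'created a global';
  } catch (err) {
    return err.name;
  }
}

console.log(assignWithoutDeclaration());         // ReferenceError
console.log(typeof globalThis.accidentalGlobal); // undefined
```

Without the 'use strict' directive (and outside of modules, which are always strict), the same assignment would quietly create a property on the global object that lives until the page is unloaded.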

`Garbage collection`

JS has an automatic garbage collection mechanism, so how does this automatic garbage collection mechanism work?

`Reclaim the data in the execution stack`

Here is an example

    function foo(){
        let a = 1
        let b = { name: '沐华' }
        function showName(){
            let c = 2
            let d = { name: '沐华' }
        }
        showName()
    }
    foo()

Execution process:

The JS engine first creates an execution context for foo and pushes it onto the execution stack
When execution reaches the showName call, the engine creates an execution context for showName and pushes it too, so showName sits on top of foo in the stack
The showName context then runs first. The JS engine keeps a pointer (ESP) that records the current execution state; it points to the context being executed, which is now showName
When showName finishes, control returns to the next context, foo, and the showName context has to be destroyed. To do this the JS engine simply moves ESP down to the context below showName, which is foo. This move is what destroys the showName execution context
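The pointer movement described above can be imitated with an array and an index. This is only a toy model, not how V8 lays out its stack: frames, esp, enter, leave, and current are all invented names.

```javascript
const frames = []; // backing memory for the toy stack
let esp = -1;      // index of the context that is currently executing

function enter(name) {
  esp += 1;
  frames[esp] = name; // overwrites whatever an earlier call left in this slot
}

function leave() {
  esp -= 1; // "destroying" a context is just moving the pointer down
}

function current() {
  return esp >= 0 ? frames[esp] : null;
}

enter('foo');
enter('showName');
console.log(current()); // showName
leave();
console.log(current()); // foo
console.log(frames[1]); // showName (stale slot, no longer reachable through esp)
enter('anotherCall');
console.log(frames[1]); // anotherCall
```

Nothing is erased when leave runs. The old slot simply becomes invalid and is overwritten by the next call, which is why reclaiming stack data is so cheap.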


`Reclaim the data in the heap`

In essence, this means finding the values that are no longer used and releasing the memory they occupy.

In the example above, once foo and showName have finished, their execution contexts are cleaned up, but the two objects they created still occupy space. Object data lives in the heap; what was cleaned up on the stack is only the reference address, not the object data.

This is where the garbage collector comes in.

The hardest part of garbage collection is finding the values that are no longer needed. That is why there are many garbage collection algorithms, and no single one fits every scenario; you have to weigh the trade-offs for each case.

`Reference count`

Reference counting is an older garbage collection algorithm. Its definition of "memory no longer in use" is very simple: check whether anything references the object. If nothing points to it, the object is no longer needed and can be reclaimed.

But it has a fatal problem: circular reference

That is, if two objects reference each other, then even when neither is used any more, their counts never drop to zero, so they are never collected, and memory leaks
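The circular reference problem can be reproduced with a toy reference counter. This sketch is not how any real engine is implemented; makeCell, addReference, and dropReference are hypothetical helpers that only track counts by hand.

```javascript
function makeCell(name) {
  return { name, refCount: 0, pointsTo: null, freed: false };
}

function addReference(cell) {
  cell.refCount += 1;
}

function dropReference(cell) {
  cell.refCount -= 1;
  if (cell.refCount === 0) {
    cell.freed = true; // nothing points here any more, so reclaim it
    if (cell.pointsTo) dropReference(cell.pointsTo); // and release what it held
  }
}

const a = makeCell('a');
const b = makeCell('b');
addReference(a); // a variable refers to a
addReference(b); // a variable refers to b
a.pointsTo = b;
addReference(b); // a -> b
b.pointsTo = a;
addReference(a); // b -> a

dropReference(a); // both variables go out of scope
dropReference(b);

console.log(a.refCount, b.refCount); // 1 1
console.log(a.freed, b.freed);       // false false
```

Neither cell is reachable from the program any more, yet each still holds a count of 1 on behalf of the other, so a pure reference counter never frees them.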

To avoid the problems caused by circular references, modern browsers no longer use reference counting

In V8, the heap is divided into two regions: the young generation and the old generation.

`Young and old`

V8's GC is generational, so it divides heap memory into two parts: the young generation (handled by the minor garbage collector) and the old generation (handled by the major garbage collector)

`Young generation`

The young generation usually has a capacity of only 1 to 8 MB and mainly stores short-lived objects

The young generation uses the Scavenge GC algorithm, which divides the young generation space into two areas: the object area and the free area.

As the name implies, only one of the two areas is in use at a time, while the other stays free. The workflow is as follows:

Newly allocated objects are stored in the object area. When the object area is full, the GC algorithm starts
Garbage in the object area is marked. After marking, the surviving objects are copied to the free area and the objects no longer in use are destroyed. Because the survivors are copied side by side, this leaves no memory fragments
After copying, the object area and the free area swap roles. The garbage is gone, and the two areas of the young generation can be reused indefinitely
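The copy-and-swap workflow above can be sketched with two arrays. This is a simplified illustration, not V8's implementation: the space object, makeObject, and scavenge are hypothetical, and "copying" is modeled by pushing the same object into the other array.

```javascript
function makeObject(name) {
  return { name, refs: [] };
}

function scavenge(space, roots) {
  const copied = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (copied.has(obj)) continue;
    copied.add(obj);
    space.freeArea.push(obj); // "copy" the survivor; survivors end up packed together
    pending.push(...obj.refs);
  }
  // Whatever was not copied is garbage. Swap the roles of the two areas.
  space.objectArea = space.freeArea;
  space.freeArea = [];
  return space;
}

const live = makeObject('live');
const child = makeObject('child');
const garbage = makeObject('garbage');
live.refs.push(child);

const youngSpace = { objectArea: [live, child, garbage], freeArea: [] };
scavenge(youngSpace, [live]);
console.log(youngSpace.objectArea.map((o) => o.name)); // [ 'live', 'child' ]
console.log(youngSpace.freeArea.length);               // 0
```

The cost of a collection depends on the number of survivors, not on the amount of garbage, which suits a region where most objects die young.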

Because the young generation space is small, it fills up easily, so:

Objects that are still alive after two garbage collections are moved to the old generation
If objects occupy more than 25% of the free area after copying, they are moved to the old generation so that memory allocation is not affected

`Old generation`

The old generation is characterized by a large space and long-lived objects

The old generation uses the mark-sweep (mark clear) and mark-compact (mark compression) algorithms, because copying large objects with the Scavenge algorithm would take too long

`Mark clear`

The mark-sweep algorithm starts first in the following cases:

When a space has no blocks left to hand out
When the objects in a space exceed a certain limit
When the space cannot guarantee room for objects moving from the young generation to the old generation

The mark-sweep process works like this:

Starting from the root (the JS global object), traverse every object that can be reached and mark it as alive
After marking, destroy the unmarked objects
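The two steps above, mark and then clear, can be sketched over a small object graph. This is a toy model under stated assumptions: the heap is an array of slots, markSweep is an invented name, and each object lists what it references in a refs array.

```javascript
function markSweep(heap, root) {
  // Mark phase: everything reachable from the root is alive.
  const marked = new Set();
  const pending = [root];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    pending.push(...obj.refs);
  }
  // Sweep phase: free unmarked slots in place, which leaves holes behind.
  return heap.map((obj) => (obj !== null && marked.has(obj) ? obj : null));
}

const globalObject = { name: 'global', refs: [] };
const config = { name: 'config', refs: [] };
const orphan = { name: 'orphan', refs: [] };
const cache = { name: 'cache', refs: [] };
globalObject.refs.push(config, cache);

const heapBefore = [globalObject, config, orphan, cache];
const heapAfter = markSweep(heapBefore, globalObject);
console.log(heapAfter.map((o) => (o ? o.name : null)));
// [ 'global', 'config', null, 'cache' ]
```

The null slot in the middle is the kind of hole that accumulates as fragmentation after repeated sweeps.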

During garbage collection, execution of the JS script is suspended and resumes once collection is complete. This behavior is called stop-the-world.

For example, if the heap holds more than 1 GB of data, a full garbage collection may take more than 1 second. During this time the JS thread is suspended, which hurts the page's performance and responsiveness.

`Incremental mark`

So in 2011, V8 switched from stop-the-world marking to incremental marking. With incremental marking, the GC breaks the marking work into many small tasks that are interleaved with JS tasks, which keeps the application from stalling.
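The idea of splitting marking into small slices can be imitated with a generator. This is a loose sketch rather than V8's design: incrementalMark and stepBudget are hypothetical names. A real collector also needs write barriers to stay correct when JS changes the object graph between slices, which this toy ignores.

```javascript
function* incrementalMark(root, stepBudget) {
  const marked = new Set();
  const pending = [root];
  while (pending.length > 0) {
    // One slice: process at most `stepBudget` entries, then pause.
    for (let n = 0; n < stepBudget && pending.length > 0; n += 1) {
      const obj = pending.pop();
      if (marked.has(obj)) continue;
      marked.add(obj);
      pending.push(...obj.refs);
    }
    if (pending.length > 0) yield marked.size; // hand control back to JS
  }
  return marked;
}

const leafA = { name: 'a', refs: [] };
const leafB = { name: 'b', refs: [] };
const leafC = { name: 'c', refs: [] };
const rootObject = { name: 'root', refs: [leafA, leafB, leafC] };

const marker = incrementalMark(rootObject, 2);
const pauses = [];
let step = marker.next();
while (!step.done) {
  pauses.push(step.value); // a JS task could run here between slices
  step = marker.next();
}

console.log(pauses.length);   // 1
console.log(step.value.size); // 4
```

Each pause is short because its length is bounded by the budget, even though the total amount of marking work stays the same.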

`Concurrent mark`

Then in 2018 came another major step forward in GC: concurrent marking, which lets JS keep running while the GC scans and marks objects.

`Mark compression`

Sweeping leaves the heap fragmented. When fragmentation exceeds a certain limit, the mark-compact (mark compression) algorithm starts: it moves the surviving objects to one end of the heap and, once they have all been moved, cleans up the memory beyond them.
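Moving survivors to one end of the heap can be sketched with an array of slots, where null stands for a freed hole. This is a toy illustration: compact is an invented name, and strings stand in for objects. A real collector must also update every pointer to a moved object.

```javascript
function compact(slots) {
  let free = 0; // next position at the packed end of the heap
  for (let i = 0; i < slots.length; i += 1) {
    if (slots[i] !== null) {
      const survivor = slots[i];
      slots[i] = null;
      slots[free] = survivor; // slide the survivor down into the first gap
      free += 1;
    }
  }
  return free; // index where the contiguous free space begins
}

const fragmented = ['global', null, 'config', null, null, 'cache'];
const firstFree = compact(fragmented);
console.log(fragmented); // [ 'global', 'config', 'cache', null, null, null ]
console.log(firstFree);  // 3
```

After compaction all free memory forms one contiguous block, so a large object that would not fit into any single hole can be allocated again.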

