
1. The evolution history of V8

The first version of V8 was released in 2008. Its architecture was relatively radical for its time: JavaScript was compiled directly into machine code and executed, so execution was very fast. However, the only compiler was Codegen, so the optimizations it could apply were very limited.

In 2010, V8 introduced the Crankshaft compiler. JavaScript was first compiled by the Full-Codegen compiler; if a block of code was subsequently executed many times, it was recompiled by Crankshaft, which generated more optimized machine code, and execution switched to that optimized code, improving performance.

Crankshaft's optimizations were still limited, so the TurboFan compiler was added to V8 in 2015. At this point V8 still compiled source code directly into machine code, and this architecture had a core problem: memory consumption was enormous. A source file of a few KB could expand into tens of MB of machine code, occupying a huge amount of memory.

The Ignition interpreter was added to V8 in 2016, reintroducing bytecode with the aim of reducing memory usage.

In 2017, V8 officially shipped a new compilation pipeline that combines Ignition and TurboFan to compile and execute code. From V8 5.9 onward, the early Full-Codegen and Crankshaft compilers are no longer used to execute JavaScript. The current architecture has three core modules: the parser (Parser), the interpreter (Ignition), and the optimizing compiler (TurboFan).

When V8 executes JavaScript source code, the parser first parses the source into an abstract syntax tree (AST). The interpreter then translates the AST into bytecode and executes it, counting how many times each piece of code runs as it goes. If the count exceeds a certain threshold, the code is marked as hot, and the runtime information collected so far is fed to the optimizing compiler (TurboFan), which compiles the bytecode into optimized machine code. When that code runs again, the interpreter hands execution over to the optimized machine code instead of reinterpreting the bytecode, greatly improving efficiency. This technique of compiling code at runtime is called just-in-time compilation (JIT).
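As a rough illustration of the pipeline described above, consider a function that is called many times. The exact hotness thresholds are internal V8 details and vary across versions; this is only a sketch of the warm-up behavior.

```javascript
// Sketch: a function that V8 will likely mark as hot after enough calls.
function sumTo(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}

// The first calls run as interpreted bytecode. After the call count passes
// V8's internal threshold, TurboFan compiles `sumTo` to optimized machine
// code using the type feedback Ignition collected (here: `n` is always a
// small integer, so integer arithmetic can be assumed).
for (let i = 0; i < 100000; i++) {
  sumTo(100);
}
console.log(sumTo(100)); // 5050
```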

2. The parser of V8

The parser turns JavaScript source code into an AST. This process involves lexical analysis and syntax analysis, and uses pre-parsing to improve startup efficiency.

Lexical analysis: the JavaScript source code is broken into tokens, the smallest meaningful units.
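For illustration, here is a toy tokenizer (not V8's actual Scanner API) showing the kind of token stream lexical analysis produces for a simple statement. Real V8 tokens also carry source positions and many more token types.

```javascript
// Toy sketch of lexical analysis for `const sum = a + b;`.
const source = 'const sum = a + b;';

function tokenize(src) {
  const tokens = [];
  // Identifiers/keywords, integer literals, and a few punctuators only.
  const re = /\s*([A-Za-z_$][\w$]*|\d+|[=+;])/g;
  let m;
  while ((m = re.exec(src)) !== null) {
    const text = m[1];
    let type;
    if (text === 'const') type = 'Keyword';
    else if (/^\d+$/.test(text)) type = 'Numeric';
    else if (/^[=+;]$/.test(text)) type = 'Punctuator';
    else type = 'Identifier';
    tokens.push({ type, value: text });
  }
  return tokens;
}

console.log(tokenize(source).map(t => `${t.type}:${t.value}`).join(' '));
// Keyword:const Identifier:sum Punctuator:= Identifier:a Punctuator:+ Identifier:b Punctuator:;
```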

In V8, the Scanner is responsible for receiving a stream of Unicode characters and turning it into tokens for the parser to consume.

Syntax analysis: following the grammar rules, the tokens are assembled into a hierarchical abstract syntax tree. If the source code violates the grammar during this process, parsing is terminated and a syntax error is thrown.
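To make the tree structure concrete, here is an ESTree-style AST (the format used by tools such as Acorn and Babel) for the same statement. V8's internal AST node names differ, but the hierarchical shape is analogous.

```javascript
// Illustration: an ESTree-style AST for `const sum = a + b;`.
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'const',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'sum' },
          // The initializer is itself a subtree: a binary expression
          // whose children are the two identifiers.
          init: {
            type: 'BinaryExpression',
            operator: '+',
            left: { type: 'Identifier', name: 'a' },
            right: { type: 'Identifier', name: 'b' },
          },
        },
      ],
    },
  ],
};

console.log(ast.body[0].declarations[0].init.operator); // +
```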

For a piece of JavaScript source code, if all of it had to be parsed before any of it could execute, three problems would inevitably arise:

1. Parsing all the code at once lengthens the time before execution can begin.
2. Memory consumption increases, because the parsed AST and the bytecode compiled from it are held in memory.
3. Disk space is consumed, because compiled code is cached on disk.

Therefore, mainstream engines now parse lazily. During parsing, functions that are not executed immediately are only pre-parsed, and a function is fully parsed only when it is called. Pre-parsing merely verifies that the function's syntax is valid, records the function declaration, and determines the function's scope; it does not generate an AST. Pre-parsing is performed by a dedicated Pre-Parser.
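A sketch of how lazy parsing plays out in practice. The heuristics here (including the parenthesized-IIFE hint) are V8 internals that may change between versions.

```javascript
// `later` is only pre-parsed at script load: its syntax is checked and its
// scope recorded, but no AST or bytecode is produced until the first call.
function later(x) {
  return x * 2;
}

// A parenthesized function expression invoked immediately: V8's heuristic
// treats this "PIFE" pattern as a hint that the function will run right
// away, so it is fully parsed up front instead of pre-parsed.
const now = (function (x) {
  return x + 1;
})(41);

console.log(now);       // 42
console.log(later(21)); // 42  (full parse of `later` happens at this call)
```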

3. The interpreter of V8

Converting JavaScript source code straight into CPU-recognizable machine code consumes a huge amount of memory, so V8 introduced bytecode to solve the memory problem. Bytecode is an abstraction over machine code: its syntax somewhat resembles assembly, with each bytecode representing one instruction.

The interpreter Ignition generates bytecode from the AST and executes it.
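To give a feel for what this bytecode looks like, here is a minimal function together with a rough sketch of the register-machine-style bytecode Ignition emits for it. The exact output varies by V8 version; you can inspect the real thing with Node's V8 flags.

```javascript
// A function whose Ignition bytecode is short enough to read.
function add(a, b) {
  return a + b;
}

// Rough sketch of Ignition's bytecode for `add` (inspect the actual output
// with `node --print-bytecode --print-bytecode-filter=add file.js`):
//
//   Ldar a1        ; load argument b into the accumulator register
//   Add a0, [0]    ; add argument a; feedback slot 0 records observed types
//   Return         ; return the accumulator
console.log(add(1, 2)); // 3
```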

During execution, Ignition collects feedback (such as the value types it observes) and hands it to TurboFan for optimizing compilation. Based on that feedback, TurboFan compiles the bytecode into optimized machine code, and Ignition subsequently runs the optimized machine code instead of the bytecode.

4. V8's optimizing compiler

When the Ignition interpreter executes bytecode, the bytecode still has to be translated into machine code, because the CPU can only recognize machine code. This extra layer of translation may look inefficient, but compared with machine code, bytecode is much easier to optimize. The most important optimization is compiling hot code with TurboFan: while interpreting, Ignition marks code that is executed repeatedly as hot, and TurboFan compiles that marked code into more efficient machine code.
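The feedback Ignition collects is what makes this optimization possible, and it also explains why optimized code can be thrown away again. A sketch, with the caveat that when exactly V8 optimizes or deoptimizes is an internal detail:

```javascript
// Sketch: why type feedback matters for TurboFan's optimized code.
function getX(p) {
  return p.x;
}

// Calling with objects of one consistent shape keeps the feedback for the
// `p.x` load monomorphic, so TurboFan can compile it as a load from a
// fixed offset inside the object.
for (let i = 0; i < 100000; i++) {
  getX({ x: i });
}

// An object with a different shape (property order differs) breaks that
// assumption; V8 falls back ("deoptimizes") to the bytecode and collects
// new feedback. The result is still correct, just slower.
console.log(getX({ y: 0, x: 7 })); // 7
```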

TurboFan relies mainly on two techniques: one is inlining, and the other is escape analysis.

Inlining substitutes the body of a called function into its caller. For nested function calls like the code on the left of the figure below, compiling without optimization generates machine code for both functions plus the calls between them. To improve performance, TurboFan inlines the callee into the caller before compiling. Furthermore, once the values of the variables inside the function are known, the result can be computed at compile time, as in the code on the right of the figure below. The machine code finally generated is much smaller than before optimization, so execution is naturally faster. Inlining reduces call overhead, eliminates redundant code, and enables constant folding, and it is often the foundation for escape analysis.
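A hand-written illustration of what inlining achieves; the real transformation happens on TurboFan's intermediate representation, not on JavaScript source.

```javascript
// Before: two separate functions, so an unoptimized compile emits code
// for both plus a call between them.
function square(n) {
  return n * n;
}
function sumOfSquares(a, b) {
  return square(a) + square(b);
}

// After inlining, the body of `square` is substituted into the caller:
function sumOfSquaresInlined(a, b) {
  return a * a + b * b;
}

// If the arguments turn out to be compile-time constants, constant
// folding can go further and reduce the whole computation to a literal:
const folded = 3 * 3 + 4 * 4; // 25

console.log(sumOfSquares(3, 4), sumOfSquaresInlined(3, 4), folded); // 25 25 25
```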

Escape analysis determines whether an object's lifetime is confined to the current function. If an object is defined inside a function and used only there, i.e. it is neither returned nor passed to other functions, it is considered not to have "escaped". During optimizing compilation, unescaped objects are replaced with scalars, eliminating the object allocation and the memory accesses for its properties, which improves execution speed and reduces memory usage.
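A sketch of an object that is a candidate for this scalar replacement (whether V8 actually applies it in a given case is an internal decision):

```javascript
// `point` is created inside the function, never returned, and never passed
// to another function, so it does not escape. Escape analysis lets
// TurboFan drop the allocation and operate on the two scalars directly.
function distanceSquared(x, y) {
  const point = { x: x, y: y };
  return point.x * point.x + point.y * point.y;
}

// Conceptually equivalent scalar-replaced form: no object is ever
// allocated, and no property loads from memory are needed.
function distanceSquaredScalar(x, y) {
  return x * x + y * y;
}

console.log(distanceSquared(3, 4)); // 25
```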

This article is based on the video: https://www.zhihu.com/zvideo/1408790742785916928

