As a front-end programmer, the first thing you do at work every day is turn on the computer and, almost without thinking, open the Chrome browser, then either slack off for a while or get straight to work. From then on, the browser window stays with you for the whole day, normally until seven or eight o'clock, or nine or ten if it is a late night, watching over your work. It is such a loyal companion, so ask yourself: have you ever seriously tried to understand how it works? Have you stepped into its inner world?
If you have ever been curious, then read on: this issue is "Going into the Heart of Chrome and Understanding How the V8 Engine Works".
What is V8
Before getting into a deep understanding of a thing, we must first know what it is.
V8 is Google's open-source, high-performance JavaScript and WebAssembly engine, written in C++. It is used in Chrome, Node.js, and other applications. It implements ECMAScript and WebAssembly, and runs on Windows 7 and above, macOS 10.12+, and Linux systems that use x64, IA-32, ARM, or MIPS processors. V8 can run standalone, or it can be embedded in any C++ application.
Origin of V8
Next, let's look at how it was born and where its name comes from.
V8 was originally developed by Lars Bak's team. It is named after the V8 car engine (an eight-cylinder V-shaped engine), signalling that it was meant to be a very high-performance JavaScript engine. It was released as open source on September 2, 2008, together with Chrome.
Why do we need V8
The JavaScript code we write ultimately has to be executed by the machine, but the machine cannot understand a high-level language directly. A series of processing steps is needed to convert the high-level language into instructions the machine can recognize, that is, binary code, which is then handed to the machine for execution. This intermediate conversion is exactly the work V8 does.
Next, let's take a closer look.
V8 composition
First, let's look at the internal composition of V8. V8 contains many modules, of which the four most important are:
- Parser : Parser, responsible for parsing the source code into an AST
- Ignition : Interpreter, responsible for converting the AST into bytecode and executing it, and marking hot code at the same time
- TurboFan : Compiler, responsible for compiling hot code into machine code and executing it
- Orinoco : Garbage collector, responsible for reclaiming memory space
V8 workflow
The following flow chart shows how several important modules in V8 work together.
We will analyze them one by one.
Parser
The Parser is responsible for converting the source code into an abstract syntax tree (AST). The conversion has two important stages: lexical analysis and syntax analysis.
Lexical analysis
Also called tokenization, this is the process of converting code in the form of a string into a sequence of tokens. A token is a string that is the smallest unit making up the source code, similar to a word in English. Lexical analysis can therefore be understood as the process of combining letters into words. It does not care about the relationships between words. For example, brackets are marked as tokens during lexical analysis, but whether they are matched is not checked.
The tokens in JavaScript mainly include the following:
Keywords: var, let, const, etc.
Identifiers: consecutive characters not enclosed in quotation marks, which may be a variable, a keyword such as if or else, or a built-in constant such as true or false
Operators: +, -, *, /, etc.
Numbers: hexadecimal, decimal, octal, scientific notation, etc.
Strings: string literals such as 'hello world'
Whitespace: consecutive spaces, line breaks, indentation, etc.
Comments: a line comment or block comment, treated as a minimal syntactic unit that cannot be split
Punctuation: braces, parentheses, semicolons, colons, etc.
The following are the tokens generated by esprima's lexical analysis of const a = 'hello world'.
[
  {
    "type": "Keyword",
    "value": "const"
  },
  {
    "type": "Identifier",
    "value": "a"
  },
  {
    "type": "Punctuator",
    "value": "="
  },
  {
    "type": "String",
    "value": "'hello world'"
  }
]
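To make the idea of lexical analysis concrete, here is a minimal sketch of a hand-written lexer. It is only an illustration under simplifying assumptions: the rule table, the `KEYWORDS` set, and the `tokenize` function are made up for this example and are not how esprima or V8's scanner is actually implemented. It handles just enough of the language to reproduce the token list above.

```javascript
// Toy lexer: a simplified sketch, not esprima's or V8's actual scanner.
const KEYWORDS = new Set(["var", "let", "const", "if", "else"]);

// Rules are tried in order at the current position (sticky "y" flag).
const RULES = [
  { type: "Whitespace", regex: /\s+/y },
  { type: "String", regex: /'[^']*'|"[^"]*"/y },
  { type: "Numeric", regex: /\d+(\.\d+)?/y },
  { type: "Identifier", regex: /[A-Za-z_$][A-Za-z0-9_$]*/y },
  { type: "Punctuator", regex: /[=+\-*\/;(){}]/y },
];

function tokenize(source) {
  const tokens = [];
  let pos = 0;
  while (pos < source.length) {
    let matched = false;
    for (const { type, regex } of RULES) {
      regex.lastIndex = pos;
      const match = regex.exec(source);
      if (match === null) continue;
      matched = true;
      pos = regex.lastIndex;
      // Whitespace is recognized but not emitted, matching the output above.
      if (type !== "Whitespace") {
        const value = match[0];
        const isKeyword = type === "Identifier" && KEYWORDS.has(value);
        tokens.push({ type: isKeyword ? "Keyword" : type, value });
      }
      break;
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character at ${pos}: ${source[pos]}`);
    }
  }
  return tokens;
}

console.log(tokenize("const a = 'hello world'"));
```

Note that `tokenize("((")` returns two Punctuator tokens without complaint: checking whether brackets match is left to syntax analysis.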
Syntax analysis
Syntax analysis is the process of converting the tokens produced by lexical analysis into an AST according to a given formal grammar. In other words, it is the process of combining words into sentences. The grammar is checked during the conversion, and a syntax error is thrown if the code is not grammatical.
The AST generated by parsing const a = 'hello world' from above is as follows:
{
  "type": "Program",
  "body": [
    {
      "type": "VariableDeclaration",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "id": {
            "type": "Identifier",
            "name": "a"
          },
          "init": {
            "type": "Literal",
            "value": "hello world",
            "raw": "'hello world'"
          }
        }
      ],
      "kind": "const"
    }
  ],
  "sourceType": "script"
}
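The step from tokens to AST can be sketched in a few lines as well. The function below is a hypothetical, deliberately tiny parser that accepts only the form `const <identifier> = <string>`; the names `parseConstDeclaration` and `expect` are invented for this example, and a real parser such as esprima's or V8's covers the full grammar. It shows the two things described above: tokens are combined according to a grammar, and a syntax error is thrown when they do not fit.

```javascript
// Toy parser for `const <identifier> = <string literal>` only.
// A sketch of the idea, not esprima's or V8's real parser.
function parseConstDeclaration(tokens) {
  let pos = 0;

  // Consume the next token, or throw if it is not what the grammar expects.
  function expect(type, value) {
    const token = tokens[pos];
    const ok =
      token !== undefined &&
      token.type === type &&
      (value === undefined || token.value === value);
    if (!ok) {
      const found = token ? token.value : "end of input";
      throw new SyntaxError(`Expected ${value ?? type} but found ${found}`);
    }
    pos += 1;
    return token;
  }

  const kind = expect("Keyword", "const");
  const id = expect("Identifier");
  expect("Punctuator", "=");
  const init = expect("String");
  if (pos < tokens.length) {
    throw new SyntaxError(`Unexpected token ${tokens[pos].value}`);
  }

  return {
    type: "Program",
    body: [
      {
        type: "VariableDeclaration",
        declarations: [
          {
            type: "VariableDeclarator",
            id: { type: "Identifier", name: id.value },
            // Strip the surrounding quotes to get the literal's value.
            init: { type: "Literal", value: init.value.slice(1, -1), raw: init.value },
          },
        ],
        kind: kind.value,
      },
    ],
    sourceType: "script",
  };
}

const tokens = [
  { type: "Keyword", value: "const" },
  { type: "Identifier", value: "a" },
  { type: "Punctuator", value: "=" },
  { type: "String", value: "'hello world'" },
];
const ast = parseConstDeclaration(tokens);
console.log(JSON.stringify(ast, null, 2));
```

Passing an incomplete token list, for example only `const a`, makes `expect` throw a SyntaxError, which mirrors the grammar check performed during syntax analysis.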
The AST generated by the Parser is then handed over to the Ignition interpreter for processing.
Ignition interpreter
The Ignition interpreter is responsible for converting the AST into bytecode and executing it. Bytecode is an intermediate code between the AST and machine code. It is independent of any particular machine's instruction set, and it must be converted into machine code by the interpreter before it can be executed.
At this point you probably have a question. Since bytecode also has to be converted into machine code before it can run, why not convert the AST into machine code from the start? Machine code runs faster when executed directly, so why add an intermediate step?
In fact, V8 versions before 5.9 did not use bytecode by default. Instead, JS code was compiled directly into machine code, and the machine code was kept in memory. This took up a lot of memory, and because early phones had little memory, the excessive usage greatly degraded their performance. Compiling directly into machine code also led to long compilation times and slow startup. Furthermore, converting JS code directly into machine code requires writing different code for different CPU architectures, which makes the complexity very high.
Starting with version 5.9, V8 executes bytecode by default, which addresses the problems above: high memory usage, long startup time, and high code complexity.
Next, let's look at how Ignition converts the AST into bytecode.
The following figure is the flow chart of Ignition's work. The AST first passes through the bytecode generator, and the final bytecode is produced only after a series of optimizations.
The optimizations include:
- Register Optimizer : Mainly to avoid unnecessary loading and storage of registers
- Peephole Optimizer : Find the reusable part of the bytecode and merge it
- Dead-code Elimination : Delete useless code and reduce the size of bytecode
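As a rough illustration of bytecode generation followed by an optimization pass, the sketch below compiles a small expression AST into a toy bytecode, removes unreachable instructions, and runs the result. Everything here is an assumption made for the example: the opcodes `PUSH`, `ADD`, `MUL`, and `RETURN` and the functions `generate`, `eliminateDeadCode`, and `run` are invented, and the toy uses a stack machine only to keep the code short, whereas Ignition's real bytecode is register-based.

```javascript
// Toy stack-machine bytecode; the opcode names are made up and are not Ignition's.
const OPCODES = { "+": "ADD", "*": "MUL" };

// Bytecode generator: walk the AST and emit instructions in post-order.
function generate(node, out = []) {
  if (node.type === "Literal") {
    out.push(["PUSH", node.value]);
  } else if (node.type === "BinaryExpression" && node.operator in OPCODES) {
    generate(node.left, out);
    generate(node.right, out);
    out.push([OPCODES[node.operator]]);
  } else {
    throw new Error(`Unsupported node: ${node.type}`);
  }
  return out;
}

// Dead-code elimination for straight-line code (no jumps):
// nothing after the first RETURN can ever execute, so drop it.
function eliminateDeadCode(bytecode) {
  const end = bytecode.findIndex(([op]) => op === "RETURN");
  return end === -1 ? bytecode : bytecode.slice(0, end + 1);
}

// Interpreter: execute the instructions one by one on an operand stack.
function run(bytecode) {
  const stack = [];
  for (const [op, arg] of bytecode) {
    if (op === "PUSH") stack.push(arg);
    else if (op === "ADD") stack.push(stack.pop() + stack.pop());
    else if (op === "MUL") stack.push(stack.pop() * stack.pop());
    else if (op === "RETURN") return stack.pop();
    else throw new Error(`Unknown opcode: ${op}`);
  }
  return undefined;
}

// AST for the expression 1 + 2 * 3
const expression = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Literal", value: 1 },
  right: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "Literal", value: 2 },
    right: { type: "Literal", value: 3 },
  },
};

// The trailing PUSH after RETURN is unreachable and gets removed.
const rawBytecode = [...generate(expression), ["RETURN"], ["PUSH", 99]];
const optimizedBytecode = eliminateDeadCode(rawBytecode);
console.log(rawBytecode.length, optimizedBytecode.length); // 7 6
console.log(run(optimizedBytecode)); // 7
```

The smaller instruction list produces the same result, which is the point of such passes: less bytecode to store and interpret, with unchanged behavior.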
Once the code has been converted into bytecode, it can be executed by the interpreter. While executing, Ignition monitors the code and records execution information, such as how many times a function has run and which arguments were passed on each call.
When the same code is executed many times, it is marked as hot code. Hot code is handed over to the TurboFan compiler for processing.
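The bookkeeping described above can be sketched with a simple wrapper that counts calls and records argument types. This is only an illustration: the `profile` helper and the `HOT_THRESHOLD` value are hypothetical, and V8's real heuristics for deciding what is hot are more involved than a fixed call count.

```javascript
// Hypothetical threshold for this sketch; not V8's real heuristic.
const HOT_THRESHOLD = 3;

// Wrap a function so that every call records execution information.
function profile(fn) {
  const info = { calls: 0, argTypes: new Set(), hot: false };
  function wrapped(...args) {
    info.calls += 1;
    // Record the types of the arguments seen on this call.
    info.argTypes.add(args.map((arg) => typeof arg).join(","));
    if (info.calls >= HOT_THRESHOLD) info.hot = true;
    return fn(...args);
  }
  wrapped.info = info;
  return wrapped;
}

const add = profile((a, b) => a + b);
add(1, 2);
add(3, 4);
console.log(add.info.hot); // false: only two calls so far
add(5, 6);
console.log(add.info.hot, [...add.info.argTypes]); // true [ 'number,number' ]
```

The recorded argument types matter as much as the call count, because they are what a compiler can later use to specialize the code.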
TurboFan compiler
After TurboFan receives the hot code marked by Ignition, it first optimizes it, then compiles the optimized bytecode into more efficient machine code and stores it. The next time the same code runs, the corresponding machine code is executed directly, which greatly improves execution efficiency.
When the optimized machine code no longer fits the code being executed, TurboFan performs deoptimization: the optimized machine code is discarded, execution falls back to bytecode, and control of the code is returned to Ignition.
Now let's take a look at the specific implementation process.
Take sum += arr[i] as an example. Since JS is a dynamically typed language, sum and arr[i] may have different types on each execution. When this code runs, Ignition checks the data types of sum and arr[i]. When it finds that the same code has been executed many times, it marks it as hot code and hands it over to TurboFan.
From TurboFan's point of view, checking the data types of sum and arr[i] on every execution is a waste of time. So the data types of sum and arr[i] are determined from the previous executions, and the code is compiled into machine code for those types. The next time it runs, the type-checking step is skipped.
But if the data type of arr[i] changes during later execution, the previously generated machine code no longer meets the requirements. TurboFan then discards that machine code and hands execution back to Ignition, which completes the deoptimization process.
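The optimize-then-deoptimize cycle for sum += arr[i] can be imitated in plain JavaScript. The sketch below is an analogy, not V8's mechanism: `createAccumulator`, the `OPTIMIZE_AFTER` threshold, and the `generic` and `specialized` functions are all hypothetical stand-ins for the interpreter path, the compiled fast path, and its type guard.

```javascript
// Assumed value for illustration; not V8's real threshold.
const OPTIMIZE_AFTER = 3;

function createAccumulator() {
  const stats = { optimized: false, numberRuns: 0, deopts: 0 };

  // Generic path (stands in for the interpreter): inspects the operand
  // types on every call and records what it has seen.
  function generic(sum, value) {
    if (typeof sum === "number" && typeof value === "number") {
      stats.numberRuns += 1;
      if (stats.numberRuns >= OPTIMIZE_AFTER) stats.optimized = true;
    } else {
      stats.numberRuns = 0;
    }
    return sum + value;
  }

  // Specialized path (stands in for optimized machine code): assumes both
  // operands are numbers, protected by a cheap guard.
  function specialized(sum, value) {
    if (typeof sum !== "number" || typeof value !== "number") {
      // The assumption no longer holds: discard the fast path and fall back.
      stats.optimized = false;
      stats.numberRuns = 0;
      stats.deopts += 1;
      return generic(sum, value);
    }
    return sum + value;
  }

  function add(sum, value) {
    return stats.optimized ? specialized(sum, value) : generic(sum, value);
  }

  return { add, stats };
}

const { add, stats } = createAccumulator();
let sum = 0;
for (const item of [1, 2, 3, 4]) sum = add(sum, item);
console.log(sum, stats.optimized); // 10 true

// A string element breaks the "numbers only" assumption and triggers a fallback.
sum = add(sum, "5");
console.log(sum, stats.optimized, stats.deopts); // 105 false 1
```

After three number-only additions the fast path takes over. The string "5" then fails the guard, so the fast path is thrown away and the generic path produces the string "105".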
(Figures: the hot code, the code before optimization, and the code after optimization.)
Summary
Now let's summarize the execution process of V8:
- The source code passes through the Parser, which produces the AST after lexical analysis and syntax analysis
- The AST passes through the Ignition interpreter, which generates bytecode and executes it
- During execution, if hot code is found, it is handed over to the TurboFan compiler, which generates machine code and executes it
- If the hot code no longer meets the requirements, deoptimization is performed
This technique of combining bytecode with an interpreter and a compiler is what we usually call just-in-time compilation (JIT).
This article does not cover the garbage collector Orinoco. V8's garbage collection deserves a detailed article of its own, so we will see you in the next issue.