
Deliciousness: 🌟🌟🌟🌟🌟

Flavor: tomato beef

Canteen proprietress: Boss, do interviewers really ask how the Chrome V8 engine works?

Canteen owner: This topic doesn't just come up in interviews. Learning how a JS engine works helps you understand JavaScript itself better, along with much of the front-end ecosystem: Babel's lexical and syntactic analysis, ESLint's syntax-checking principles, and how front-end frameworks like React and Vue are implemented. In short, learning engine internals serves many purposes at once.

Canteen proprietress: Okay, enough chatter, let's get started~

Looking at V8 from a macro perspective

V8 is the front-end world's celebrity engine. Written in C++, it is Google's open-source high-performance JavaScript and WebAssembly engine, used mainly in Chrome, Node.js, Electron...

Before we get to our protagonist, the V8 engine, let's look at where V8 sits from a macro perspective and establish a frame of reference.

With the rapid development of information technology today, this crazy world is full of various electronic devices, such as mobile phones, computers, electronic watches, smart speakers that we use every day, and more and more electric cars running on the road.

As software engineers, we can view them all as "computers": each consists of a central processing unit (CPU), storage, and input/output devices. The CPU is like a chef, executing instructions and cooking dishes according to the recipes; storage is like a refrigerator, holding the data and instructions (the ingredients) waiting to be used.

When the computer is powered on, the CPU starts reading instructions from a fixed location in storage and executes them one by one. The computer can also be connected to various external devices, such as a mouse, keyboard, or screen. The CPU doesn't need to fully understand each device's capabilities; it is only responsible for exchanging data through the device's ports, and device manufacturers ship software matched to the hardware so it can cooperate with the CPU. With that, we have the most basic computer, the architecture John von Neumann, the father of the computer, proposed in 1945.

However, because machine instructions are unfriendly to humans and hard to read and remember, people invented programming languages and compilers. A compiler converts a language that humans understand more easily into machine instructions. We also need an operating system to handle software governance. There are many operating systems: Windows, macOS, Linux, Android, iOS, HarmonyOS (Hongmeng), and so on, with countless devices running them. To smooth over this client diversity, run across platforms, and provide a unified programming interface, the browser was born.

We can therefore regard the browser as an operating system on top of the operating system, and for the JavaScript code we front-end engineers know best, the browser engine (such as V8) is its entire world.

The most powerful JavaScript engine on the planet

There is no doubt that V8 is the most popular and powerful JavaScript engine. The name V8 was inspired by the classic "muscle car" engines of the 1950s.

Programming Languages Software Award

V8 has also been recognized by academia, winning the Programming Languages Software Award from ACM SIGPLAN.

Mainstream JS engine

The mainstream JavaScript engines are as follows:

  • V8 (Chrome, Node.js)
  • SpiderMonkey (Firefox)
  • JavaScriptCore (Safari)
  • Chakra (legacy Edge)
  • Hermes (React Native)

V8 release cycle

The V8 team uses 4 Chrome release channels to push new versions to users.

  • Canary releases (every day)
  • Dev releases (weekly)
  • Beta releases (every 6 weeks)
  • Stable releases (every 6 weeks)

To learn more, please click V8 release process.

V8 architecture evolution history

On September 2, 2008, V8 and Chrome were open-sourced on the same day; the earliest code commit dates back to June 30, 2008. You can view a visual evolution of the V8 code base through the link below.

The V8 architecture at that time was simple and crude, with just a single compiler, Codegen.

In 2010, the Crankshaft optimizing compiler was added to V8, greatly improving runtime performance: the machine code Crankshaft generated ran twice as fast as Codegen's, while its size shrank by 30%.

In 2015, in order to further improve performance, V8 introduced the TurboFan optimizing compiler.

Then came the watershed. Until then, V8 had compiled source code directly to machine code. But as Chrome spread to mobile devices, the V8 team discovered a fatal problem with this architecture: compilation took too long, and the machine code consumed too much memory.

Therefore, the V8 team refactored the engine architecture and introduced the Ignition interpreter and bytecode in 2016.

In 2017, V8 made the new compilation pipeline (Ignition + TurboFan) the default and removed Full-codegen and Crankshaft.

A high-performance JS engine needs more than a heavily optimizing compiler like TurboFan; there is also plenty of room for optimization before the optimizing compiler ever gets a chance to run.

So in 2021, V8 introduced a new compiler into the pipeline: Sparkplug.

To learn more about Sparkplug, please click Sparkplug.

Canteen proprietress: So the V8 architecture has gone through that many changes!

Canteen owner: Yes, the V8 team has made a lot of efforts to continuously optimize the performance of the engine.

V8 working mechanism

Knock on the blackboard: we now come to the focus of this article.

Canteen proprietress: *takes out a little notebook and starts writing*

The core process of V8 executing JavaScript code is divided into the following two stages:

  • Compilation
  • Execution

The compilation stage is when V8 converts JavaScript into bytecode or binary machine code. The execution stage is when the interpreter interprets and executes the bytecode, or the CPU directly executes the binary machine code.

In order to have a better understanding of the overall working mechanism of V8, let's first understand the following concepts.

Machine language, assembly language, high-level language

The CPU's instruction set is machine language, and the CPU can only recognize binary instructions. Binary is hard for humans to read and remember, so people devised a language that can be recognized and memorized: assembly language. An assembler converts assembly instructions into machine instructions.

Different CPUs have different instruction sets, so programming in assembly means targeting each CPU architecture (ARM, MIPS, etc.) separately, and the learning cost is high. Assembly's level of abstraction is still far from enough, so high-level languages came into being: they shield the details of the computer architecture and are compatible with many different CPU architectures.

The CPU does not understand high-level languages either. Generally, there are two ways to execute high-level language code:

  • Interpreted execution
  • Compiled execution

Interpreted execution vs. compiled execution

With interpreted execution, a parser first converts the input source code into intermediate code, then an interpreter directly interprets and executes that intermediate code and outputs the result.

With compiled execution, the source code is likewise converted into intermediate code, but a compiler then compiles the intermediate code into machine code. The compiled machine code is usually stored as a binary file, and executing the binary produces the result. The machine code can also be kept in memory and executed directly from there.
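To make the difference concrete, here is a toy sketch (not how any real engine is structured): the same tiny AST is either walked by an interpreter on every run, or translated once into runnable code. The `interpret` and `compile` names are invented purely for illustration.

```javascript
// Toy AST for the expression "2 + 4"
const ast = {
  type: 'Add',
  left: { type: 'Lit', value: 2 },
  right: { type: 'Lit', value: 4 },
};

// Interpreted execution: walk the tree every time the code runs.
function interpret(node) {
  if (node.type === 'Lit') return node.value;
  if (node.type === 'Add') return interpret(node.left) + interpret(node.right);
  throw new Error('unknown node: ' + node.type);
}

// Compiled execution: translate the tree into target code once, then just run it.
function compile(node) {
  if (node.type === 'Lit') return String(node.value);
  if (node.type === 'Add') return `(${compile(node.left)} + ${compile(node.right)})`;
  throw new Error('unknown node: ' + node.type);
}
const compiled = new Function('return ' + compile(ast));

console.log(interpret(ast)); // 6
console.log(compiled());     // 6
```

Interpretation pays the tree-walking cost on every run; compilation pays a one-time translation cost and then runs at full speed, which is exactly the trade-off the next section weighs.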

JIT (Just In Time)

Interpreted execution starts quickly but runs slowly, while compiled execution starts slowly but runs quickly.

After weighing the trade-offs, V8 uses both interpreted and compiled execution. This hybrid approach is called JIT (just-in-time compilation).

When V8 executes JavaScript source code, the parser first parses the source into an AST (abstract syntax tree), then the interpreter (Ignition) converts the AST into bytecode and executes it as it interprets.

The interpreter also records how many times a given code fragment executes. Once the count exceeds a threshold, the code is marked as hot, and its runtime profile is fed back to the optimizing compiler TurboFan, which uses that feedback to compile the bytecode into optimized machine code.
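The hot-code idea can be sketched in a few lines of plain JavaScript. This is only an analogy for the tiering behavior just described, not V8's actual mechanism; `makeTieringFunction` and the threshold value are invented for illustration.

```javascript
// Toy sketch of JIT-style tiering: count calls, and past a threshold
// swap in an "optimized" version, the way hot bytecode is handed to TurboFan.
const HOT_THRESHOLD = 3;

function makeTieringFunction(baselineFn, optimizeFn) {
  let calls = 0;
  let current = baselineFn;
  return (...args) => {
    calls += 1;
    if (calls === HOT_THRESHOLD) {
      current = optimizeFn(); // the "optimizing compiler" kicks in for hot code
    }
    return current(...args);
  };
}

const add = makeTieringFunction(
  (a, b) => a + b,      // stand-in for interpreted bytecode
  () => (a, b) => a + b // stand-in for optimized machine code
);

for (let i = 0; i < 5; i++) add(2, 4); // the optimized path runs from call 3 on
```

In real V8 the "optimized version" is genuinely faster machine code specialized on the observed types, and it can be thrown away (deoptimized) if those assumptions stop holding.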

Canteen proprietress: So when this code runs again, the interpreter can directly run the optimized machine code without reinterpreting it. That should improve performance a lot, right?

Canteen owner: That's right!

The names of V8's interpreter and compiler carry a fun meaning: the interpreter Ignition is the igniter, and the compiler TurboFan is the turbocharger. The code is first fired up by the igniter, and once TurboFan kicks in, execution gets faster and faster.

After understanding the general working mechanism of V8, let's go deeper and take a look at the working principle of the V8 core module.

V8 core module working principle

The core modules of V8 include:

  • Parser : The parser is responsible for converting JavaScript code into an AST abstract syntax tree.
  • Ignition : The interpreter is responsible for converting the AST into bytecode and collecting the optimized compilation information required by TurboFan.
  • TurboFan : Use the information collected by the interpreter to convert bytecode into optimized machine code.

V8 must wait for compilation to complete before it can run the code, so performance during parsing and compilation is critical.

Parser

The parsing process of the parser is divided into two stages:

  • Lexical analysis (Scanner lexical analyzer)
  • Syntax analysis (Pre-Parser, Parser parser)

lexical analysis

The Scanner receives a Unicode character stream, breaks it into tokens, and hands them to the Parser.

For example, the following code:

let myName = '童欧巴'

will be parsed into let, myName, =, and '童欧巴': a keyword, an identifier, an assignment operator, and a string.
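As a rough illustration of what a scanner does, here is a toy tokenizer, vastly simpler than V8's real Scanner; the `tokenize` helper and its token names are invented for this sketch.

```javascript
// Toy scanner: match the head of the remaining input against a small
// ordered list of token patterns, emitting one token per match.
function tokenize(source) {
  const tokenSpec = [
    ['keyword', /^(let|const|var)\b/],
    ['identifier', /^[A-Za-z_$][A-Za-z0-9_$]*/],
    ['assign', /^=/],
    ['string', /^'[^']*'/],
    ['whitespace', /^\s+/], // matched but not emitted
  ];
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    let matched = false;
    for (const [type, re] of tokenSpec) {
      const m = re.exec(rest);
      if (m) {
        if (type !== 'whitespace') tokens.push({ type, value: m[0] });
        rest = rest.slice(m[0].length);
        matched = true;
        break;
      }
    }
    if (!matched) throw new SyntaxError('Unexpected input: ' + rest);
  }
  return tokens;
}

tokenize("let myName = '童欧巴'");
// → keyword 'let', identifier 'myName', assign '=', string "'童欧巴'"
```

The real Scanner is incremental (it produces tokens on demand for the parser) and handles the full ECMAScript lexical grammar, but the shape of the output is the same: a typed stream of tokens.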

Syntax analysis

Next, syntax analysis converts the tokens from the previous step into an AST according to the grammar rules. If the source contains a syntax error, parsing terminates at this stage and a syntax error is thrown.

You can check the structure of AST through this website: https://astexplorer.net/

You can also try https://resources.jointjs.com/demos/javascript-ast .
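For reference, the AST those tools report for the earlier snippet `let myName = '童欧巴'` looks roughly like this ESTree-shaped object (positions and some fields are omitted for brevity):

```javascript
// Approximate ESTree-style AST for: let myName = '童欧巴'
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'let',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'myName' },
          init: { type: 'Literal', value: '童欧巴' },
        },
      ],
    },
  ],
};
```

Each token from lexical analysis shows up as a node or a field here: the keyword becomes `kind`, the identifier becomes the `id` node, and the string becomes the `init` literal.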

With the AST in hand, V8 generates the code's execution context.

Lazy parsing

Mainstream JavaScript engines all use lazy parsing, because fully parsing the source before execution would not only lengthen startup time but also consume more memory and disk space.

Lazy parsing means that when a function that is not executed immediately is encountered, it is only pre-parsed (by the Pre-Parser); the function is fully parsed when it is actually called.

During pre-parsing, V8 only verifies that the function's syntax is valid, resolves the function declaration, and determines the function's scope; it does not generate an AST. This work is done by the Pre-Parser.
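A small example of where lazy parsing applies. The comments are a simplified sketch of the engine's behavior; the parenthesized-function hint for eager parsing is a documented V8 heuristic, but the exact timing is up to the engine.

```javascript
// When this script loads, `outer` itself is lazily handled: its body is
// only pre-parsed (syntax-checked, scopes resolved), not turned into an AST.
function outer() {
  // `inner` is likewise only pre-parsed until it is actually called.
  function inner(a, b) {
    return a + b;
  }
  return inner(2, 4);
}

// An IIFE runs immediately, so wrapping a function in parentheses,
// `(function () { ... })()`, hints the engine to parse it eagerly.
const immediate = (function () {
  return 'parsed and run right away';
})();

const result = outer(); // the first call triggers full parsing of outer, then inner
```

Pre-parsing is roughly twice as cheap as full parsing, so skipping the full parse of functions that may never run is a significant startup win.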

Interpreter Ignition

After getting the AST and execution context, the interpreter will convert the AST into bytecode and execute it.

Canteen proprietress: Why introduce bytecode?

The introduction of bytecode is an engineering trade-off. As the figure shows, the machine code generated from a source file of just a few KB already takes up a large amount of memory.

Compared with machine code, bytecode not only occupies less memory, it is also much faster to generate, which improves startup speed. Bytecode does not execute as fast as machine code, but the small sacrifice in execution efficiency is well worth what it buys.

Moreover, bytecode is independent of any particular machine's instruction set: the interpreter converts bytecode into machine code when executing it, which also makes V8 easier to port to different CPU architectures.

You can view the bytecode generated by the JavaScript code through the following command.

node --print-bytecode index.js

You can also view it through the following link:

Let's look at a piece of code:

// index.js
function add(a, b) {
    return a + b
}

add(2, 4)

The above code will generate the following bytecode after executing the command:

[generated bytecode for function: add (0x1d3fb97c7da1 <SharedFunctionInfo add>)]
Parameter count 3
Register count 0
Frame size 0
   25 S> 0x1d3fb97c8686 @    0 : 25 02             Ldar a1
   34 E> 0x1d3fb97c8688 @    2 : 34 03 00          Add a0, [0]
   37 S> 0x1d3fb97c868b @    5 : aa                Return
Constant pool (size = 0)
Handler Table (size = 0)
Source Position Table (size = 8)
0x1d3fb97c8691 <ByteArray[8]>

Among them, Parameter count 3 indicates three parameters: the explicit a and b, plus this. The bytecodes break down as follows:

Ldar a1     // load the value of register a1 into the accumulator
Add a0, [0] // load the value of register a0, add it to the accumulator, and put the result back into the accumulator
Return      // end execution of the current function and hand control back to the caller, with the accumulator's value as the return value

Each line of bytecode corresponds to a specific function, and each line of bytecode is like building Lego blocks, assembled together to form a complete program.

Interpreters generally come in two designs: stack-based and register-based. Early V8 interpreters were stack-based; the current V8 interpreter adopts a register-based design, supporting instructions that operate on registers and using registers to hold parameters and intermediate results.

When executing bytecode, the Ignition interpreter mainly uses general-purpose registers and an accumulator register: function parameters and local variables are stored in general-purpose registers, while the accumulator holds intermediate results.

While executing instructions, the CPU needs to read and write data. Reading and writing directly from memory would seriously hurt program performance, so the CPU introduces registers and keeps some intermediate data in them to speed up execution.
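The three bytecodes in the listing above can be mimicked with a toy accumulator machine, hugely simplified compared with Ignition; the `run` helper and its data layout are invented for illustration.

```javascript
// Toy register machine with an accumulator, mimicking Ignition's
// Ldar / Add / Return bytecodes for the add(a, b) example.
function run(bytecodes, args) {
  const registers = { a0: args[0], a1: args[1] }; // argument registers
  let accumulator = 0;                            // the single accumulator
  for (const [op, operand] of bytecodes) {
    switch (op) {
      case 'Ldar': // load a register's value into the accumulator
        accumulator = registers[operand];
        break;
      case 'Add': // add a register's value to the accumulator
        accumulator = registers[operand] + accumulator;
        break;
      case 'Return': // return the accumulator's value to the caller
        return accumulator;
      default:
        throw new Error('unknown bytecode: ' + op);
    }
  }
}

run([['Ldar', 'a1'], ['Add', 'a0'], ['Return']], [2, 4]); // → 6
```

Note how intermediate results never touch a general-purpose register: everything flows through the accumulator, which is exactly the pattern in the real bytecode listing for add.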

Compiler TurboFan

In terms of compilation, the V8 team also made a lot of optimizations. Let's look at inline and escape analysis.

Inlining

Regarding inlining, let's first look at a piece of code:

function add(a, b) {
  return a + b
}
function foo() {
  return add(2, 4)
}

As shown in the above code, we call the function add in the foo function. The add function receives two parameters a and b and returns their sum. If it is not optimized by the compiler, the machine code corresponding to these two functions will be generated separately.

In order to improve performance, the TurboFan optimizing compiler will inline the above two functions before compiling. The inlined function is as follows:

function fooAddInlined() {
  var a = 2
  var b = 4
  var addReturnValue = a + b
  return addReturnValue
}

// Because the values of a and b in fooAddInlined are known, it can be optimized further
function fooAddInlined() {
  return 6
}

After inline optimization, the machine code generated by the compilation will be much simplified, and the execution efficiency will also be greatly improved.

Escape Analysis

Escape analysis is not hard to understand: it analyzes whether an object's lifetime is confined to the current function. Consider this code:

function add(a, b){
  const obj = { x: a, y: b }
  return obj.x + obj.y
}

If an object is defined inside a function and used only within that function, it is considered "not to escape". The code above can then be optimized roughly like this:

function add(a, b){
  const obj_x = a
  const obj_y = b
  return obj_x + obj_y
}

After this optimization the object definition disappears: the variables are loaded directly into registers, with no need to fetch object properties from memory. This both reduces memory consumption and improves execution efficiency.

Escape analysis has also been tied to security vulnerabilities in Chrome that ended up slowing down the entire Internet. If you are interested, see "The V8 team that slowed down the entire Internet".

Beyond the optimizations and modules above, V8 has many more optimization techniques and core modules, such as hidden classes for fast object-property access, inline caches to speed up function calls, the Orinoco garbage collector, the Liftoff WebAssembly compiler, and so on. This article won't cover them all; you can explore them on your own.

Summary

This article has introduced, from a macro perspective, the evolution of V8 and its architecture, V8's working mechanism, and how its core modules work. We can see that whether it is Chrome or Node.js, each is really just a bridge, carrying the JavaScript code we front-end engineers write to its final destination, where it is converted into the target machine's machine code and executed. The V8 team has put enormous effort into this journey, and they deserve our greatest respect.

Although the instruction set of the CPU is limited, the programs written by our software engineers are not fixed. It is these programs that are finally executed by the CPU that have the possibility of changing the world.

You are the best programs that change the world!

Canteen proprietress: Tongtong, you are the fattest! ^_^

Standing on the shoulders of giants

  • How does V8 execute JavaScript code? -Lao Jiang
  • Graphical Google V8-Li Bing
  • Xu Shiwei's Architecture Class
  • https://v8.dev/blog
❤️ Triple combo of love

1. If you think the food and drink in the canteen suit your appetite, give it a like. Your like is my biggest motivation.

2. Follow the Front-end Canteen public account and eat well at every meal!

3. Like, comment, forward === reminder!


童欧巴