AST (Abstract Syntax Tree)

As a front-end developer, you can use ASTs at work whether or not you know what they are. Less, Babel, ESLint, code minification, and even the browser's ability to run the JavaScript in our everyday projects are all built on ASTs. Once you understand ASTs, you can also build small tools of your own to liven up monotonous, boring work.

What is AST

An AST (Abstract Syntax Tree) is an abstract representation of the syntactic structure of source code. It expresses the structure of a program in tree form, and each node of the tree represents a construct in the source code. The syntax is "abstract" because the tree does not record every detail of the concrete syntax. For example, nested parentheses are implicit in the shape of the tree rather than appearing as nodes, and an if-condition-then construct can be represented by a single node with three branches. (This definition comes from Wikipedia.)
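To make the point about parentheses concrete, here is a minimal sketch. The two trees are written by hand in an ESTree-like shape (they are illustrative literals, not the output of any parser), and `evaluate` is a hypothetical helper that supports only `+` and `*`. The grouping in (1 + 2) * 3 shows up only as the position of the addition node, not as a "parenthesis" node.

```javascript
// Hand-written tree for (1 + 2) * 3: the addition is the left child of the multiplication.
const grouped = {
    type: 'BinaryExpression',
    operator: '*',
    left: {
        type: 'BinaryExpression',
        operator: '+',
        left: { type: 'Literal', value: 1 },
        right: { type: 'Literal', value: 2 }
    },
    right: { type: 'Literal', value: 3 }
};

// Hand-written tree for 1 + 2 * 3: the multiplication is the right child of the addition.
const ungrouped = {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Literal', value: 1 },
    right: {
        type: 'BinaryExpression',
        operator: '*',
        left: { type: 'Literal', value: 2 },
        right: { type: 'Literal', value: 3 }
    }
};

// Hypothetical helper: evaluates the two node types used above.
function evaluate(node) {
    if (node.type === 'Literal') {
        return node.value;
    }
    if (node.type === 'BinaryExpression') {
        const left = evaluate(node.left);
        const right = evaluate(node.right);
        switch (node.operator) {
            case '+': return left + right;
            case '*': return left * right;
            default: throw new Error('Unsupported operator: ' + node.operator);
        }
    }
    throw new Error('Unsupported node type: ' + node.type);
}

console.log(evaluate(grouped));   // 9
console.log(evaluate(ungrouped)); // 7
```

Neither tree contains a node for the parentheses; the order of evaluation follows entirely from which node is nested inside which.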

JavaScript AST conversion tools

For JavaScript, a JS parser converts JS code into an AST. The more common JS parsers and related tools are:

esprima (popular library)
Babylon (used in Babel; published as @babel/parser since Babel 7)
acorn (used in webpack)
espree (derived from acorn, used in ESLint)
astexplorer (online tool for viewing, in real time, the AST produced by a JS parser of your choice)

The examples in this article are all implemented with esprima.

How to convert code to AST

Converting code into an AST involves two important stages: lexical analysis and syntax analysis.

`lexical analysis`

Also called tokenization, it is the process of converting code, in the form of a string, into a sequence of tokens. A token here is a string that forms the smallest meaningful unit of the source code, similar to a word in English. Lexical analysis can be thought of as combining letters into words. It does not care about the relationships between words: for example, brackets are recognized as tokens, but whether they match is not checked.

Tokens in JavaScript mainly include the following:

Keywords: var, let, const, etc.
Identifier: consecutive characters not enclosed in quotation marks, usually the name of a variable, function, or property. Words such as if and else, or true and false, have the same shape, but esprima classifies them as Keyword and Boolean tokens respectively
Operators: +, -, *, / etc.
Numbers: hexadecimal, decimal, octal, scientific notation, etc.
String: text enclosed in quotation marks, such as the value assigned to a variable
Whitespace: consecutive spaces, line breaks, indentation, etc.
Comment: a line comment or block comment, treated as a single unit that cannot be split further
Punctuation: braces, parentheses, semicolons, colons, etc.

The following are the tokens esprima generates from lexical analysis of const a = 'hello world':

[
    {
        "type": "Keyword",
        "value": "const"
    },
    {
        "type": "Identifier",
        "value": "a"
    },
    {
        "type": "Punctuator",
        "value": "="
    },
    {
        "type": "String",
        "value": "'hello world'"
    }
]
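To show what lexical analysis involves, here is a minimal hand-rolled tokenizer. It is a sketch, not esprima's implementation: the `tokenize` function, the `KEYWORDS` set, and the regular expression are all assumptions made for illustration, and they cover only a small subset of JavaScript (no comments, template strings, or regular expressions). Token type names follow the esprima output above.

```javascript
// Hypothetical, deliberately incomplete keyword list.
const KEYWORDS = new Set(['var', 'let', 'const', 'if', 'else', 'function', 'return']);

function tokenize(source) {
    const tokens = [];
    // One capture group per token class; the sticky flag (y) anchors each match at lastIndex.
    const pattern = /(\s+)|([A-Za-z_$][\w$]*)|(\d+(?:\.\d+)?)|('(?:[^'\\\n]|\\.)*'|"(?:[^"\\\n]|\\.)*")|(===|!==|==|!=|<=|>=|&&|\|\||[-+*\/=<>!(){}\[\];,.:])/y;
    let position = 0;
    while (position < source.length) {
        pattern.lastIndex = position;
        const match = pattern.exec(source);
        if (match === null) {
            throw new SyntaxError('Unexpected character at index ' + position);
        }
        const [text, , word, number, string, punctuator] = match;
        if (word !== undefined) {
            tokens.push({ type: KEYWORDS.has(word) ? 'Keyword' : 'Identifier', value: word });
        } else if (number !== undefined) {
            tokens.push({ type: 'Numeric', value: number });
        } else if (string !== undefined) {
            tokens.push({ type: 'String', value: string });
        } else if (punctuator !== undefined) {
            tokens.push({ type: 'Punctuator', value: punctuator });
        }
        // Whitespace (the first group) is skipped instead of being emitted as a token.
        position += text.length;
    }
    return tokens;
}

console.log(tokenize("const a = 'hello world'"));

// Unbalanced brackets are still tokenized; matching them is the parser's job.
console.log(tokenize('((a').length); // 3
```

The second call illustrates the point made earlier: the tokenizer happily produces tokens for unbalanced brackets, because it never looks at how tokens relate to each other.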

`Parsing`

Also called syntax analysis, it is the process of building an AST from the token sequence according to a given formal grammar, that is, combining words into sentences. The syntax is validated along the way, and a syntax error is thrown if the code is invalid.

After the const a = 'hello world' above is parsed, the generated AST is as follows:

{
  "type": "Program",
  "body": [
    {
      "type": "VariableDeclaration",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "id": {
            "type": "Identifier",
            "name": "a"
          },
          "init": {
            "type": "Literal",
            "value": "hello world",
            "raw": "'hello world'"
          }
        }
      ],
      "kind": "const"
    }
  ],
  "sourceType": "script"
}
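The step from tokens to tree can be sketched with a tiny hand-written parser. This is an illustration under strong assumptions, not how esprima works internally: `parseDeclaration` and `expect` are hypothetical names, and the only grammar accepted is a single declaration of the form (var | let | const) Identifier = (String | Numeric). It takes a token array shaped like the one in the lexical analysis section and returns a tree shaped like the one above.

```javascript
// Hypothetical mini-parser for: (var | let | const) Identifier = (String | Numeric)
function parseDeclaration(tokens) {
    let index = 0;

    // Consume the next token if it has the expected type (and value, when given).
    function expect(type, value) {
        const token = tokens[index];
        if (!token || token.type !== type || (value !== undefined && token.value !== value)) {
            const found = token ? token.value : 'end of input';
            throw new SyntaxError('Expected ' + (value || type) + ' but found ' + found);
        }
        index += 1;
        return token;
    }

    const keyword = expect('Keyword');
    if (!['var', 'let', 'const'].includes(keyword.value)) {
        throw new SyntaxError('Unexpected keyword ' + keyword.value);
    }
    const id = expect('Identifier');
    expect('Punctuator', '=');
    const next = tokens[index];
    const init = next && next.type === 'Numeric' ? expect('Numeric') : expect('String');
    if (index < tokens.length) {
        throw new SyntaxError('Unexpected token ' + tokens[index].value);
    }

    // Simplification: strip the quotes without interpreting escape sequences.
    const value = init.type === 'Numeric' ? Number(init.value) : init.value.slice(1, -1);

    return {
        type: 'Program',
        body: [{
            type: 'VariableDeclaration',
            declarations: [{
                type: 'VariableDeclarator',
                id: { type: 'Identifier', name: id.value },
                init: { type: 'Literal', value: value, raw: init.value }
            }],
            kind: keyword.value
        }],
        sourceType: 'script'
    };
}

const sampleTokens = [
    { type: 'Keyword', value: 'const' },
    { type: 'Identifier', value: 'a' },
    { type: 'Punctuator', value: '=' },
    { type: 'String', value: "'hello world'" }
];

console.log(JSON.stringify(parseDeclaration(sampleTokens), null, 2));
```

Passing an incomplete token list, for example only the first two tokens, makes `expect` throw a SyntaxError, which mirrors the validation described in this section.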

Once we have the AST, we can analyze it and build our own tooling on top of it. The simplest example is renaming a variable in the code.

`practice`

Below we rename the variable a defined in the code above to b. To do this, we convert the source code into an AST, modify the tree, and then convert the AST back into target code. In other words, we go through three steps: parse -> transform -> generate.

First, we need to see exactly how the AST of the source code differs from the AST of the target code. The following is the AST generated from const b = 'hello world':

{
  "type": "Program",
  "body": [
    {
      "type": "VariableDeclaration",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "id": {
            "type": "Identifier",
            "name": "b" // this is the difference
          },
          "init": {
            "type": "Literal",
            "value": "hello world",
            "raw": "'hello world'"
          }
        }
      ],
      "kind": "const"
    }
  ],
  "sourceType": "script"
}

Comparing the two, the only difference is the value of the name property on the id node, whose type is Identifier. So we can modify the AST to get what we need.

We need to install two packages: estraverse (to traverse the AST) and escodegen (to generate JS from the AST).

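const esprima = require('esprima');
const estraverse = require('estraverse');
const escodegen = require('escodegen');

const program = "const a = 'hello world'";

// Parse: source code -> AST
const ASTree = esprima.parseScript(program);

// Transform: visit every node and rename the identifier a to b
estraverse.traverse(ASTree, {
    enter(node) {
        changeAToB(node);
    }
});

// Generate: AST -> source code
const generatedCode = escodegen.generate(ASTree);
console.log(generatedCode); // const b = 'hello world';

// Function declarations are hoisted, so this can be defined after its first use.
function changeAToB(node) {
    // Only rename identifiers that are actually named a.
    if (node.type === 'Identifier' && node.name === 'a') {
        node.name = 'b';
    }
}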

See how easy that was? Once you are comfortable with ASTs, you can do many things. Babel plugins are built the same way, just with different libraries.

To learn how to implement a Babel plugin, refer to the official Babel plugin handbook.
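For readers curious about what the traversal step in the practice section does conceptually, the sketch below walks an AST without any libraries. It is not estraverse's implementation: `walk` and `renameIdentifier` are hypothetical helpers, the tree is the one shown earlier for const a = 'hello world' written out as a literal, and the rename works by name only, with no awareness of scope.

```javascript
// Depth-first walk over any ESTree-style value; calls visit for every object that has a type.
function walk(node, visit) {
    if (node === null || typeof node !== 'object') {
        return;
    }
    if (Array.isArray(node)) {
        node.forEach(child => walk(child, visit));
        return;
    }
    if (typeof node.type === 'string') {
        visit(node);
    }
    for (const key of Object.keys(node)) {
        walk(node[key], visit);
    }
}

// Hypothetical helper: renames identifiers by name only (no scope analysis).
function renameIdentifier(ast, from, to) {
    walk(ast, node => {
        if (node.type === 'Identifier' && node.name === from) {
            node.name = to;
        }
    });
    return ast;
}

// The AST of const a = 'hello world', written out by hand.
const sampleAst = {
    type: 'Program',
    body: [{
        type: 'VariableDeclaration',
        declarations: [{
            type: 'VariableDeclarator',
            id: { type: 'Identifier', name: 'a' },
            init: { type: 'Literal', value: 'hello world', raw: "'hello world'" }
        }],
        kind: 'const'
    }],
    sourceType: 'script'
};

renameIdentifier(sampleAst, 'a', 'b');
console.log(sampleAst.body[0].declarations[0].id.name); // b
```

A production-quality rename would also need to respect scoping, so that an unrelated a in another function is left alone; that is one reason to prefer established tooling over a hand-written walker.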


阳呀呀
