Foreword

In the previous article, I shared how the Go compiler scans source files into tokens. This article covers the syntax analysis stage: how the parser consumes those tokens according to the grammar. It covers the following:

  1. An overview of Go parsing
  2. The detailed Go parsing process
  3. An example showing the complete parsing process

💡 Tips: This article involves grammars and productions. If you are not familiar with them, please read Lexical Analysis & Syntax Analysis Basics first.

The article is fairly long, but much of it is code, included to make things easier to follow. I believe you will gain something from reading it.

Overview of Go parsing

To make the rest easier to follow, here is a source file. You can relate the content below to this file:

package main

import (
    "fmt"
    "go/token"
)

type aType string
const A = 666
var B = 888

func main() {
    fmt.Println("Test Parser")
    token.NewFileSet()
}

Entry point

When I introduced lexical analysis in the previous article, I explained the Go compilation entry point in detail. The parser is initialized at the compile entry, and the lexical analyzer is initialized while the parser is being initialized, so the lexical analyzer (scanner) is embedded inside the parser. The structure of the parser is defined in src/cmd/compile/internal/syntax/parser.go:

type parser struct {
    file  *PosBase // information about the opened file (such as file name, line, and column)
    errh  ErrorHandler // error callback
    mode  Mode // parsing mode
    pragh PragmaHandler
    scanner // the embedded lexical analyzer

    base   *PosBase // current position base
    first  error    // first error encountered
    errcnt int      // number of errors encountered
    pragma Pragma   // pragmas

    fnest  int    // function nesting level (for error handling)
    xnest  int    // expression nesting level (for complit ambiguity resolution)
    indent []byte // tracing support
}

The previous article already covered where the entry point is and what it does (src/cmd/compile/internal/syntax/syntax.go → Parse(...)). After the parser and lexical analyzer have been initialized, Parse calls the parser's fileOrNil() method, which is the core method of syntax analysis. The rest of this section describes what this core method does.

Go grammar parsing structure

This part covers how each kind of declaration is parsed, the structure of each declaration node, and the relationships between the nodes of the syntax tree generated by the parser. Some places are hard to follow. Read the text first and then combine it with the parsing diagram further down, which may make it easier to understand.

Grammar rules in Go parsing

I'll give an overall introduction to this part first, and then go into the details.

Open the fileOrNil() method and you will see this line in its comments:

SourceFile = PackageClause ";" { ImportDecl ";" } { TopLevelDecl ";" } .

This is the grammar rule Go uses to parse a source file, which was mentioned in the series article Lexical Analysis and Syntax Analysis Basics. After the Go compiler has initialized the parser and lexical analyzer, it calls this method. The method repeatedly obtains tokens through the next() method provided by the lexical analyzer and performs syntax analysis according to the grammar rule above.

You may not know the productions of PackageClause, ImportDecl, and TopLevelDecl. You can find them in the same file (productions are introduced in Lexical Analysis and Syntax Analysis Basics):

SourceFile = PackageClause ";" { ImportDecl ";" } { TopLevelDecl ";" } .

PackageClause = "package" PackageName . 
PackageName = identifier . 

ImportDecl = "import" ( ImportSpec | "(" { ImportSpec ";" } ")" ) . 
ImportSpec = [ "." | PackageName ] ImportPath . 
ImportPath = string_lit .

TopLevelDecl = Declaration | FunctionDecl | MethodDecl . 
Declaration = ConstDecl | TypeDecl | VarDecl . 

ConstDecl = "const" ( ConstSpec | "(" { ConstSpec ";" } ")" ) . 
ConstSpec = IdentifierList [ [ Type ] "=" ExpressionList ] . 

TypeDecl = "type" ( TypeSpec | "(" { TypeSpec ";" } ")" ) . 
TypeSpec = AliasDecl | TypeDef . 
AliasDecl = identifier "=" Type . 
TypeDef = identifier Type . 

VarDecl = "var" ( VarSpec | "(" { VarSpec ";" } ")" ) . 
VarSpec = IdentifierList ( Type [ "=" ExpressionList ] | "=" ExpressionList ) .

Put simply, fileOrNil() parses the source file according to the SourceFile grammar and finally returns a syntax tree (the File structure, introduced below). The compiler parses each source file into one syntax tree.

Below is a brief explanation of several of these grammar rules (if you have read Lexical Analysis and Syntax Analysis Basics, they should be easy to understand):

SourceFile = PackageClause ";" { ImportDecl ";" } { TopLevelDecl ";" } .

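We can see that SourceFile is made up of three nonterminals: PackageClause, ImportDecl, and TopLevelDecl. This means:
Every source file consists mainly of a package clause, import declarations, and top-level declarations. ImportDecl and TopLevelDecl are optional and may appear zero or more times, which is what the curly braces mean. In other words, every source file should conform to this grammar rule.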

First is the package clause: PackageClause
PackageClause = "package" PackageName . 
PackageName = identifier . 
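PackageClause consists of the terminal package and the nonterminal PackageName, and PackageName consists of a single identifier.

So when scanning a source file, the first token obtained should be the package token, followed by an identifier token. After the package clause has been parsed, import declarations should follow.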
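
The package clause step can be tried out with a small sketch. This is not the compiler's code: it uses the standard library's go/scanner (whose token names differ from the compiler's _Package and _Name), and parsePackageClause is a hypothetical helper written only for illustration.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// parsePackageClause checks that src begins with `"package" identifier`
// and returns the package name. It is a toy stand-in for the first step
// of parsing, not the compiler's implementation.
func parsePackageClause(src string) (string, error) {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // no error handler; comments are skipped

	// PackageClause = "package" PackageName .
	if _, tok, _ := s.Scan(); tok != token.PACKAGE {
		return "", fmt.Errorf("package statement must be first, found %s", tok)
	}
	// PackageName = identifier .
	_, tok, lit := s.Scan()
	if tok != token.IDENT {
		return "", fmt.Errorf("expected package name, found %s", tok)
	}
	return lit, nil
}

func main() {
	name, err := parsePackageClause("package main\n\nimport \"fmt\"\n")
	fmt.Println(name, err)

	_, err = parsePackageClause("import \"fmt\"\n")
	fmt.Println(err)
}
```

The second call fails because the first token is import rather than package, which mirrors the "package statement must be first" check described in this article.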

Next are import declarations: ImportDecl
ImportDecl = "import" ( ImportSpec | "(" { ImportSpec ";" } ")" ) . 
ImportSpec = [ "." | PackageName ] ImportPath . 
ImportPath = string_lit .

Now that the meaning of the grammar above is clear, let's see how the fileOrNil() method parses a file according to it. But first, you need to know what the nodes of the syntax tree generated by fileOrNil() look like.

The structure of each node of the syntax tree

We know that fileOrNil() generates a syntax tree according to the SourceFile grammar rule. The root of each syntax tree has this structure (src/cmd/compile/internal/syntax/nodes.go → File):

type File struct {
    Pragma   Pragma
    PkgName  *Name
    DeclList []Decl
    Lines    uint
    node
}

It mainly contains the package name of the source file (PkgName) and all declarations in the source file (DeclList). Note that import declarations are also parsed into DeclList as declarations.
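
The compiler's internal syntax package cannot be imported from ordinary programs, so the sketch below uses the standard library's go/parser and go/ast instead. Its node types differ from the compiler's (for example, ast.File has Name and Decls rather than PkgName and DeclList, and a grouped import is stored as one GenDecl), but it shows the same idea: a file node holding a package name and a list of declarations that includes the imports. The helper name declKinds is made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is the source file from the beginning of this article.
const sample = `package main

import (
	"fmt"
	"go/token"
)

type aType string
const A = 666
var B = 888

func main() {
	fmt.Println("Test Parser")
	token.NewFileSet()
}
`

// declKinds returns the package name and one label per top-level declaration.
func declKinds(src string) (string, []string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "", src, 0)
	if err != nil {
		return "", nil, err
	}
	var kinds []string
	for _, d := range f.Decls {
		switch d := d.(type) {
		case *ast.GenDecl: // import, type, const, or var
			kinds = append(kinds, d.Tok.String())
		case *ast.FuncDecl:
			kinds = append(kinds, "func "+d.Name.Name)
		}
	}
	return f.Name.Name, kinds, nil
}

func main() {
	pkg, kinds, err := declKinds(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(pkg, kinds)
}
```

The import block shows up in the declaration list alongside type, const, var, and func, just as the text above describes for DeclList.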

fileOrNil() parses every declaration in the source file (such as var, type, and const) into DeclList. Each kind of declaration has a corresponding structure that stores the declaration's information, and these structures are the child nodes of the syntax tree. The grammar above already shows the productions for constant, type, and variable declarations. The productions for functions and methods can be found in src/cmd/compile/internal/syntax/parser.go:

FunctionDecl = "func" FunctionName ( Function | Signature ) .
FunctionName = identifier .
Function     = Signature FunctionBody .

MethodDecl   = "func" Receiver MethodName ( Function | Signature ) .
Receiver     = Parameters .

The root node of the syntax tree is the File structure, and its child nodes have the following structures:

ImportDecl struct {
        Group        *Group // nil means not part of a group
        Pragma       Pragma
        LocalPkgName *Name // including "."; nil means no rename present
        Path         *BasicLit
        decl
    }

ConstDecl struct {
        Group    *Group // nil means not part of a group
        Pragma   Pragma
        NameList []*Name
        Type     Expr // nil means no type
        Values   Expr // nil means no values
        decl
    }

TypeDecl struct {
        Group  *Group // nil means not part of a group
        Pragma Pragma
        Name   *Name
        Alias  bool
        Type   Expr
        decl
    }

VarDecl struct {
        Group    *Group // nil means not part of a group
        Pragma   Pragma
        NameList []*Name
        Type     Expr // nil means no type
        Values   Expr // nil means no values
        decl
    }

FuncDecl struct {
        Pragma Pragma
        Recv   *Field // nil means regular function
        Name   *Name
        Type   *FuncType
        Body   *BlockStmt // nil means no body (forward declaration)
        decl
    }

These node structures are defined in src/cmd/compile/internal/syntax/nodes.go. During parsing, when the input matches the ImportDecl grammar rule, the parser creates a node of the corresponding structure to store the relevant information. When it matches the grammar of a var declaration, it creates a VarDecl structure to hold the declaration's information.

Parsing functions is more complicated. From the structure of the function node you can see that it contains a receiver, a function name, a type, and a function body. The most complex part is the function body, which is a BlockStmt structure:

BlockStmt struct {
        List   []Stmt // Stmt is an interface
        Rbrace Pos
        stmt
    }

BlockStmt is composed of a series of statements, which in turn contain declarations and expressions. You can see many statement and expression structures in src/cmd/compile/internal/syntax/nodes.go (these structures are also nodes below the function declaration):

// ----------------------------------------------------------------------------
// Statements
......
SendStmt struct {
        Chan, Value Expr // Chan <- Value
        simpleStmt
    }

    DeclStmt struct {
        DeclList []Decl
        stmt
    }

    AssignStmt struct {
        Op       Operator // 0 means no operation
        Lhs, Rhs Expr     // Rhs == ImplicitOne means Lhs++ (Op == Add) or Lhs-- (Op == Sub)
        simpleStmt
    }
......
ReturnStmt struct {
        Results Expr // nil means no explicit return values
        stmt
    }

    IfStmt struct {
        Init SimpleStmt
        Cond Expr
        Then *BlockStmt
        Else Stmt // either nil, *IfStmt, or *BlockStmt
        stmt
    }

    ForStmt struct {
        Init SimpleStmt // incl. *RangeClause
        Cond Expr
        Post SimpleStmt
        Body *BlockStmt
        stmt
    }
......

// ----------------------------------------------------------------------------
// Expressions
......
// [Len]Elem
    ArrayType struct {
        // TODO(gri) consider using Name{"..."} instead of nil (permits attaching of comments)
        Len  Expr // nil means Len is ...
        Elem Expr
        expr
    }

    // []Elem
    SliceType struct {
        Elem Expr
        expr
    }

    // ...Elem
    DotsType struct {
        Elem Expr
        expr
    }

    // struct { FieldList[0] TagList[0]; FieldList[1] TagList[1]; ... }
    StructType struct {
        FieldList []*Field
        TagList   []*BasicLit // i >= len(TagList) || TagList[i] == nil means no tag for field i
        expr
    }
......
// X[Index]
    IndexExpr struct {
        X     Expr
        Index Expr
        expr
    }

    // X[Index[0] : Index[1] : Index[2]]
    SliceExpr struct {
        X     Expr
        Index [3]Expr
        // Full indicates whether this is a simple or full slice expression.
        // In a valid AST, this is equivalent to Index[2] != nil.
        // TODO(mdempsky): This is only needed to report the "3-index
        // slice of string" error when Index[2] is missing.
        Full bool
        expr
    }

    // X.(Type)
    AssertExpr struct {
        X    Expr
        Type Expr
        expr
    }
......

Does it look familiar? There are if, for, and return. Stmt in BlockStmt is an interface type, which means that the various statement structures above (such as IfStmt, ForStmt, and ReturnStmt) can implement the Stmt interface.
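
The pattern of embedding a small struct (stmt, expr, decl) to satisfy an interface can be sketched as follows. This is a simplified illustration rather than the compiler's definitions: the marker method name aStmt and the string-typed Results field are assumptions made for the example.

```go
package main

import "fmt"

// Stmt is a toy statement interface. The unexported marker method means
// only types that embed stmt (in this package) can satisfy it.
type Stmt interface {
	aStmt()
}

// stmt is embedded in every statement node to provide the marker method.
type stmt struct{}

func (stmt) aStmt() {}

// ReturnStmt is a hypothetical, simplified statement node.
type ReturnStmt struct {
	Results string
	stmt
}

// BlockStmt holds a list of statements and is itself a statement.
type BlockStmt struct {
	List []Stmt
	stmt
}

// describe uses a type switch to recover the concrete node type.
func describe(s Stmt) string {
	switch s := s.(type) {
	case *ReturnStmt:
		return "return " + s.Results
	case *BlockStmt:
		return fmt.Sprintf("block with %d statements", len(s.List))
	default:
		return "unknown"
	}
}

func main() {
	body := &BlockStmt{List: []Stmt{&ReturnStmt{Results: "x"}}}
	fmt.Println(describe(body))
	for _, s := range body.List {
		fmt.Println(describe(s))
	}
}
```

Because List is a slice of the interface type, a block can hold any mix of statement nodes, and code that walks the tree uses a type switch to find out which one it has.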

Knowing the relationship between the nodes in the syntax tree and the possible structures of these nodes, let's see how fileOrNil parses the nodes of the syntax tree step by step.

Source implementation of fileOrNil()

In this part I won't go deep into each function. I only outline how fileOrNil() parses the tokens of a source file step by step, because the goal here is to understand the overall flow first. How it parses import, var, const, and func in detail is covered in the next part.

You can see that the code implementation of fileOrNil() mainly includes the following parts:

// SourceFile = PackageClause ";" { ImportDecl ";" } { TopLevelDecl ";" } .
func (p *parser) fileOrNil() *File {
    ......

    // 1. Create the File structure
    f := new(File)
    f.pos = p.pos()

    // 2. First parse the package clause at the start of the file
    // PackageClause
    if !p.got(_Package) { // check that the file starts with a package clause
        p.syntaxError("package statement must be first")
        return nil
    }

    // 3. After the package clause, parse the import declarations (to the parser, every import is a declaration)
    // { ImportDecl ";" }
    for p.got(_Import) {
        f.DeclList = p.appendGroup(f.DeclList, p.importDecl)
        p.want(_Semi)
    }

    // 4. Switch on the current token and parse the corresponding kind of declaration
    // { TopLevelDecl ";" }
    for p.tok != _EOF {
        switch p.tok {
        case _Const:
            p.next() // advance to the next token
            f.DeclList = p.appendGroup(f.DeclList, p.constDecl)

        ......
    }
    // p.tok == _EOF

    p.clearPragma()
    f.Lines = p.line

    return f
}

There are two important methods in fileOrNil(), which are the key to understanding what fileOrNil() is doing:

  • got
  • appendGroup

First is the got function. Its argument is a token, and it checks whether the token currently obtained from the lexical analyzer is the token passed in. If it is, got consumes that token and returns true.
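
The behaviour of got, and of its stricter sibling want that appears in the code above, can be sketched with a toy parser. Everything here is hypothetical: toyParser works on a slice of strings instead of the compiler's scanner, and its error messages are invented for the example.

```go
package main

import "fmt"

// toyParser walks a pre-scanned token slice. The real parser pulls tokens
// from its embedded scanner instead.
type toyParser struct {
	toks []string
	pos  int
	errs []string
}

// tok returns the current token, or "EOF" when the input is exhausted.
func (p *toyParser) tok() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return "EOF"
}

// next advances to the following token.
func (p *toyParser) next() {
	if p.pos < len(p.toks) {
		p.pos++
	}
}

// got consumes the current token and returns true if it equals tok.
// Otherwise it leaves the input untouched and returns false.
func (p *toyParser) got(tok string) bool {
	if p.tok() == tok {
		p.next()
		return true
	}
	return false
}

// want is like got, but records an error when the token is missing.
func (p *toyParser) want(tok string) {
	if !p.got(tok) {
		p.errs = append(p.errs, "expected "+tok+", found "+p.tok())
	}
}

func main() {
	p := &toyParser{toks: []string{"package", "main", ";"}}
	fmt.Println(p.got("import"))  // not an import: nothing is consumed
	fmt.Println(p.got("package")) // matches: the token is consumed
	p.want("main")
	p.want("{") // the current token is ";", so an error is recorded
	fmt.Println(p.errs)
}
```

The difference matters in loops such as `for p.got(_Import)`: got quietly returns false when the token is absent, which ends the loop, while want treats a missing token as a syntax error.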

Then there is the appendGroup function, which takes two parameters. The first is DeclList (introduced earlier with the members of the File structure; it is a slice that stores all the declarations in the source file). The second is the parsing function for the current kind of declaration (for example, when the parser is at an import declaration, the import parsing method is passed to appendGroup as the second argument).

When parsing the import declaration statement, it is the following code:

for p.got(_Import) {
        f.DeclList = p.appendGroup(f.DeclList, p.importDecl)
        p.want(_Semi)
}

The role of appendGroup is to detect grouped (batch) declarations, such as the following:

// grouped import declarations
import (
    "fmt"
    "io"
    "strconv"
    "strings"
)

// grouped var declarations
var (
    x int
    y int
)

Declaration structures that support grouped declarations have a Group field, which indicates that the declarations belong to the same group. Examples are the import and var declaration structures:

ImportDecl struct {
        Group        *Group // nil means not part of a group
        Pragma       Pragma
        LocalPkgName *Name // including "."; nil means no rename present
        Path         *BasicLit
        decl
}

VarDecl struct {
        Group    *Group // nil means not part of a group
        Pragma   Pragma
        NameList []*Name
        Type     Expr // nil means no type
        Values   Expr // nil means no values
        decl
}
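
How a Group pointer ties grouped declarations together can be sketched with a toy version of appendGroup. This is an illustration under simplifying assumptions, not the compiler's implementation: tokens are plain strings, a spec is a single name, and error handling is omitted.

```go
package main

import "fmt"

// Group marks declarations that came from one parenthesized block.
// It has a field so that distinct groups are guaranteed distinct pointers.
type Group struct {
	_ int
}

// Decl is a toy declaration node: a name plus its (possibly nil) group.
type Decl struct {
	Name  string
	Group *Group
}

type toyParser struct {
	toks []string
	pos  int
}

func (p *toyParser) tok() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return "EOF"
}

func (p *toyParser) got(tok string) bool {
	if p.tok() == tok {
		p.pos++
		return true
	}
	return false
}

// spec parses a single toy spec: one token used as the declared name.
func (p *toyParser) spec(g *Group) *Decl {
	d := &Decl{Name: p.tok(), Group: g}
	p.pos++
	return d
}

// appendGroup parses either one spec, or a "(" spec ";" ... ")" block
// whose specs all share a freshly created Group.
func (p *toyParser) appendGroup(list []*Decl, f func(*Group) *Decl) []*Decl {
	if p.got("(") {
		g := new(Group)
		for p.tok() != ")" && p.tok() != "EOF" {
			list = append(list, f(g))
			if !p.got(";") {
				break
			}
		}
		p.got(")")
		return list
	}
	return append(list, f(nil))
}

func main() {
	// Tokens for: var ( x; y; ); var z;   (types omitted for brevity)
	p := &toyParser{toks: []string{"var", "(", "x", ";", "y", ";", ")", ";", "var", "z", ";"}}
	var decls []*Decl
	for p.got("var") {
		decls = p.appendGroup(decls, p.spec)
		p.got(";")
	}
	for _, d := range decls {
		fmt.Printf("%s grouped=%v\n", d.Name, d.Group != nil)
	}
}
```

Each spec still becomes its own declaration node. The only thing that records that x and y were written inside the same parentheses is that they point at the same Group value, while z has a nil Group.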

Inside appendGroup, the parsing method for the corresponding declaration type is called. Like fileOrNil(), it parses according to the grammar of that declaration. For example, the method for parsing import declarations is importDecl():

for p.got(_Import) {
        f.DeclList = p.appendGroup(f.DeclList, p.importDecl)
        p.want(_Semi)
}
......

// ImportSpec = [ "." | PackageName ] ImportPath .
// ImportPath = string_lit .
func (p *parser) importDecl(group *Group) Decl {
    if trace {
        defer p.trace("importDecl")()
    }

    d := new(ImportDecl)
    d.pos = p.pos()
    d.Group = group
    d.Pragma = p.takePragma()

    switch p.tok {
    case _Name:
        d.LocalPkgName = p.name()
    case _Dot:
        d.LocalPkgName = p.newName(".")
        p.next()
    }
    d.Path = p.oliteral()
    if d.Path == nil {
        p.syntaxError("missing import path")
        p.advance(_Semi, _Rparen)
        return nil
    }

    return d
}

We can see that it likewise first creates the structure for the declaration (here an ImportDecl), then records the declaration's information and parses the rest according to the declaration's grammar.

The other declarations, such as const, type, var, and func, are matched through the switch and parsed. Each has its own parsing method (code written according to its grammar rule). I will not list them all here. You can find them in src/cmd/compile/internal/syntax/parser.go.

As mentioned earlier, the parser ultimately builds a syntax tree out of these node structures. Its root is src/cmd/compile/internal/syntax/nodes.go → File. Its structure was introduced earlier and mainly contains the package name and all declarations. These declarations are the child nodes of the syntax tree.

It may be a little difficult to understand through the text description above. The following shows what the whole grammar parsing process is like by means of a diagram.

Graphical parsing process

Take the source code provided at the beginning of the article as an example to demonstrate the parsing process. As described in Go Lexical Analysis, the compilation entry of Go starts from src/cmd/compile/internal/gc/noder.go → parseFiles.

At this point, I believe you already have an overall understanding of the syntax analysis part of Go. However, the above does not show how the various declaration statements are parsed down, which requires a deep look at how the parsing method of each declaration statement is implemented.

Detailed Go parsing process

For the parsing of various declarations and expressions in Go syntax analysis, you can find the corresponding methods in src/cmd/compile/internal/syntax/parser.go

Parsing of variable and import declarations

The invocation of the parsing method related to the variable declaration can be found in the src/cmd/compile/internal/syntax/parser.go→ fileOrNil()

Import declaration parsing

You can see the following code in src/cmd/compile/internal/syntax/parser.go→ fileOrNil()

for p.got(_Import) {
        f.DeclList = p.appendGroup(f.DeclList, p.importDecl)
        p.want(_Semi)
    }

In this appendGroup, the importDecl() method will eventually be called (as can be seen from the above code, the parsing method of the import declaration will not be executed until the import token is matched)

The first thing you need to know is that import can be used in the following ways:

import "a" // the default import form; a is the package name
import aAlias "x/a" // give the package x/a the alias aAlias
import . "c" // import the package's exported identifiers directly into the current file's namespace
import _ "d" // import the package only to trigger its initialization; no identifiers are imported into the current file's namespace

For space reasons, I will not repeat the source code of the importDecl() method here. In summary, it does the following:

  1. Create an ImportDecl structure (which will eventually be appended to File.DeclList)
  2. Initialize some data in the structure, such as the position of the parsed token and the group
  3. Check whether the current token is a _Name token (an identifier) or _Dot (.)
  4. If the token is _Name, the local package name is read from it. If it is _Dot (.), a new name "." is created
  5. Then match the import path, which is mainly done by oliteral()
  6. Return the ImportDecl structure
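
The steps above can be sketched for a single ImportSpec using the standard library's go/scanner. This is a toy illustration, not the compiler's importDecl(): the importSpec type and parseImportSpec function are hypothetical, and the token names (token.IDENT, token.PERIOD, token.STRING) are those of go/token rather than the compiler's _Name, _Dot, and _Literal.

```go
package main

import (
	"errors"
	"fmt"
	"go/scanner"
	"go/token"
)

// importSpec holds the two parts of:
//   ImportSpec = [ "." | PackageName ] ImportPath .
type importSpec struct {
	LocalName string // "" means no rename present
	Path      string // the string literal, quotes included
}

// parseImportSpec parses the text that follows the "import" keyword.
func parseImportSpec(src string) (*importSpec, error) {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)

	spec := new(importSpec)
	_, tok, lit := s.Scan()
	switch tok {
	case token.IDENT: // PackageName (also covers the blank identifier "_")
		spec.LocalName = lit
		_, tok, lit = s.Scan()
	case token.PERIOD: // "."
		spec.LocalName = "."
		_, tok, lit = s.Scan()
	}
	// ImportPath = string_lit .
	if tok != token.STRING {
		return nil, errors.New("missing import path")
	}
	spec.Path = lit
	return spec, nil
}

func main() {
	for _, src := range []string{`"a"`, `aAlias "x/a"`, `. "c"`, `_ "d"`, `aAlias`} {
		spec, err := parseImportSpec(src)
		if err != nil {
			fmt.Printf("%-14s error: %v\n", src, err)
			continue
		}
		fmt.Printf("%-14s name=%q path=%s\n", src, spec.LocalName, spec.Path)
	}
}
```

The optional part of the production becomes a switch with no default branch: if the first token is neither an identifier nor a dot, the parser simply moves on and expects the path.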

Worth mentioning here is the oliteral() method, which checks whether the current token is a basic literal, that is, _Literal.

💡 Tips: Basic literals are only integers, floating-point numbers, imaginary numbers, runes, and strings.

If it is a basic literal, a BasicLit node is created and some of its information is initialized:

BasicLit struct {
        Value string   // the literal's text
        Kind  LitKind  // which kind of basic literal (IntLit, FloatLit, ImagLit, RuneLit, StringLit)
        Bad   bool // true means the literal Value has syntax errors
        expr
}

It is easy to see that once parsing reaches a basic literal, it cannot be decomposed further: it is a terminal symbol in the grammar. In the syntax tree, these terminals are all leaf nodes.
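
The five literal kinds can be observed with the standard library, whose ast.BasicLit plays a similar role to the compiler's BasicLit. Note that go/token names the kinds INT, FLOAT, IMAG, CHAR, and STRING rather than IntLit, FloatLit, and so on. litKind is a helper written for this sketch.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// litKind parses src as an expression and, if the result is a basic
// literal node, returns the name of its kind.
func litKind(src string) string {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return "error"
	}
	lit, ok := expr.(*ast.BasicLit)
	if !ok {
		return "not a basic literal"
	}
	return lit.Kind.String()
}

func main() {
	for _, src := range []string{"666", "6.66", "2i", "'x'", `"fmt"`, "A"} {
		fmt.Printf("%-6s %s\n", src, litKind(src))
	}
}
```

The last input, A, is an identifier, so it is parsed into a different leaf node type and is reported as not being a basic literal.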

Go's standard library has packages for testing syntax analysis. Here I use the interface provided by go/parser to see the result of parsing imports:

💡 Tips: As with lexical analysis (see Go Lexical Analyzer), the implementation of syntax analysis in the standard library differs from the one in the Go compiler, mainly in structural design (for example, the node structures are different). The underlying idea is the same, so once you understand the compiler's parser, the standard library's is easy to follow.

package main

import (
    "fmt"
    "go/parser"
    "go/token"
)
const src = `
package test
import "a"
import aAlias "x/a"
import . "c"
import _ "d"
`

func main() {
    fileSet := token.NewFileSet()
    f, err := parser.ParseFile(fileSet, "", src, parser.ImportsOnly) // parser.ImportsOnly mode parses only the package clause and the imports
    if err != nil {
        panic(err.Error())
    }
    for _, s := range f.Imports{
        fmt.Printf("import:name = %v, path = %#v\n", s.Name, s.Path)
    }
}

Output:

import:name = <nil>, path = &ast.BasicLit{ValuePos:22, Kind:9, Value:"\"a\""}
import:name = aAlias, path = &ast.BasicLit{ValuePos:40, Kind:9, Value:"\"x/a\""}
import:name = ., path = &ast.BasicLit{ValuePos:55, Kind:9, Value:"\"c\""}
import:name = _, path = &ast.BasicLit{ValuePos:68, Kind:9, Value:"\"d\""}

You can test the parsing of the other declarations and expressions below with the same standard library methods. I will not show each test (the approach is the same; only the fields you print change).

Type declaration parsing

When the parser gets _Type , it will call the parsing method of type to parse it. Also in fileOrNil(), you can see the following code:

......
case _Type:
            p.next()
            f.DeclList = p.appendGroup(f.DeclList, p.typeDecl)
......

It calls the typeDecl() method through appendGroup, which parses according to the grammar of type declarations introduced earlier. type can be used in the following ways:

type a string
type b = string

Here's a look at what exactly this method does:

  • Create a TypeDecl structure (which will eventually be appended to File.DeclList)
  • Initialize some data in the structure, such as the position of the parsed token and the group
  • Check whether the next token is _Assign (which marks an alias); if so, advance to the next token
  • Parse the type on the right-hand side, such as chan, struct, map, or func (implemented in the typeOrNil() method, which is essentially a large switch statement)
  • Return the TypeDecl structure

Once the type token on the far right is known, parsing continues according to that type. If it is a chan type, for example, a chan type structure is created and initialized (every structure created during parsing is a node).

You can also test type parsing with go/parser→ParseFile
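
A minimal sketch of such a test is shown below. It relies on go/ast rather than the compiler's nodes: ast.TypeSpec has no Alias field, but its Assign field holds the position of "=" and is valid only for alias declarations. The helper name typeSpecs and the sample source are made up for the example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const typeSrc = `package p
type a string
type b = string
type c chan int
`

// typeSpecs describes every type declared at the top level of src.
func typeSpecs(src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, d := range f.Decls {
		gd, ok := d.(*ast.GenDecl)
		if !ok || gd.Tok != token.TYPE {
			continue
		}
		for _, s := range gd.Specs {
			ts := s.(*ast.TypeSpec)
			kind := "definition"
			if ts.Assign.IsValid() { // position of "=", valid only for aliases
				kind = "alias"
			}
			out = append(out, fmt.Sprintf("%s: %s, type node %T", ts.Name.Name, kind, ts.Type))
		}
	}
	return out, nil
}

func main() {
	specs, err := typeSpecs(typeSrc)
	if err != nil {
		panic(err)
	}
	for _, s := range specs {
		fmt.Println(s)
	}
}
```

The type on the right-hand side becomes its own node: a plain name such as string is an identifier node, while chan int becomes a channel type node.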

Parsing const declarations

When the parser gets _Const , it will call the const parsing method to parse it. Also in fileOrNil(), you can see the following code:

......
case _Const:
            p.next()
            f.DeclList = p.appendGroup(f.DeclList, p.constDecl)
......

const has the following uses:

const A = 666
const B float64 = 6.66
const (
    _   token = iota
    _EOF       
    _Name    
    _Literal
)

Next, look at the constDecl method (it parses according to the grammar of const declarations). I will not go through it again: it is similar to type parsing above in that it first creates the corresponding structure and then records the declaration's information. The difference is that it has a list of names, because one const spec can declare several names at once. The method that parses this list is parser.go → nameList().
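
The name list can be seen with go/parser as well. In go/ast, const and var specs are both represented by ast.ValueSpec, whose Names and Values fields are slices (the compiler's ConstDecl uses NameList and a single Values expression instead). constNames and the sample source are assumptions for this sketch.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

const constSrc = `package p
const A = 666
const B, C float64 = 6.66, 7.77
const (
	X = iota
	Y
)
`

// constNames returns, for each const spec, its comma-joined names and
// the number of value expressions written in the source.
func constNames(src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, d := range f.Decls {
		gd, ok := d.(*ast.GenDecl)
		if !ok || gd.Tok != token.CONST {
			continue
		}
		for _, s := range gd.Specs {
			vs := s.(*ast.ValueSpec)
			names := make([]string, len(vs.Names))
			for i, n := range vs.Names {
				names[i] = n.Name
			}
			out = append(out, fmt.Sprintf("%s (values: %d)", strings.Join(names, ","), len(vs.Values)))
		}
	}
	return out, nil
}

func main() {
	specs, err := constNames(constSrc)
	if err != nil {
		panic(err)
	}
	for _, s := range specs {
		fmt.Println(s)
	}
}
```

Y has no value expression of its own in the source, which matches the optional `[ [ Type ] "=" ExpressionList ]` part of the ConstSpec production.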

Parsing var declarations

When the parser gets _Var , it will call the parsing method of var to parse it. Also in fileOrNil(), you can see the following code:

......
case _Var:
            p.next()
            f.DeclList = p.appendGroup(f.DeclList, p.varDecl)
......

As with the two declarations above, the corresponding parsing method is called. What sets var apart is that its declaration may involve expressions, so the var parsing method also involves expression parsing. I will analyze expression parsing in detail in a later part.
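
To see that initializers are parsed as expression nodes, the sketch below prints the node type of each initializer using go/ast (the compiler's node types have different names). varInits and the sample source are hypothetical. The parser does no type checking, so the undeclared function f is fine here.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const varSrc = `package p
var B = 888
var C = B + 1
var D int
var E = f(B)
`

// varInits returns, for each declared variable, the node type of its
// initializer expression, or "<none>" if it has no initializer.
func varInits(src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, d := range f.Decls {
		gd, ok := d.(*ast.GenDecl)
		if !ok || gd.Tok != token.VAR {
			continue
		}
		for _, s := range gd.Specs {
			vs := s.(*ast.ValueSpec)
			for i, name := range vs.Names {
				node := "<none>"
				if i < len(vs.Values) {
					node = fmt.Sprintf("%T", vs.Values[i])
				}
				out = append(out, name.Name+": "+node)
			}
		}
	}
	return out, nil
}

func main() {
	inits, err := varInits(varSrc)
	if err != nil {
		panic(err)
	}
	for _, s := range inits {
		fmt.Println(s)
	}
}
```

A literal, a binary operation, and a call each produce a different expression node, which is why parsing a declaration with an initializer has to hand over to the expression parser.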

Parsing function declarations

💡 Description: A diagram will be added later to show the process of function parsing

The last is the parsing of the function declaration. As mentioned earlier, the structure of the function declaration node is as follows:

FuncDecl struct {
        Pragma Pragma
        Recv   *Field // receiver
        Name   *Name  // function name
        Type   *FuncType // function type
        Body   *BlockStmt // function body
        decl
    }

In fileOrNil(), you can see the following code:

case _Func:
            p.next()
            if d := p.funcDeclOrNil(); d != nil {
                f.DeclList = append(f.DeclList, d)
            }

The core method for parsing functions is funcDeclOrNil. Because function parsing is a little more complicated, I paste its implementation here and explain what the code is doing through comments:

// Grammar for functions
// FunctionDecl = "func" FunctionName ( Function | Signature ) .
// FunctionName = identifier .
// Function     = Signature FunctionBody .

// Grammar for methods
// MethodDecl   = "func" Receiver MethodName ( Function | Signature ) .
// Receiver     = Parameters . // the method's receiver
func (p *parser) funcDeclOrNil() *FuncDecl {
    if trace {
        defer p.trace("funcDecl")()
    }

    f := new(FuncDecl) // create the function declaration structure (node)
    f.pos = p.pos()
    f.Pragma = p.takePragma()

    if p.tok == _Lparen { // a left parenthesis here means this is a method
        rcvr := p.paramList() // parse the receiver list
        switch len(rcvr) {
        case 0:
            p.error("method has no receiver")
        default:
            p.error("method has multiple receivers")
            fallthrough
        case 1:
            f.Recv = rcvr[0]
        }
    }

    if p.tok != _Name { // check whether the current token is an identifier (the function name)
        p.syntaxError("expecting name or (")
        p.advance(_Lbrace, _Semi)
        return nil
    }

    f.Name = p.name()
    f.Type = p.funcType() // parse the function type (implementation shown below)
    if p.tok == _Lbrace { // a left curly brace means a function body follows
        f.Body = p.funcBody() // parse the function body (implementation shown below)
    }
    }

    return f
}
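
The four parts of a function node (receiver, name, type, body) can be inspected with go/parser. The sketch below uses ast.FuncDecl, which is laid out much like the FuncDecl shown earlier but is not the same type. funcInfo and the sample source are made up for illustration, and T is deliberately left undeclared because the parser does not resolve types.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const funcSrc = `package p
func add(a, b int) int { return a + b }
func (t T) String() string { return "" }
func external(x float64) float64
`

// funcInfo summarizes each function declaration in src: whether it is a
// method, how many parameters and results it has, and whether it has a body.
func funcInfo(src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, d := range f.Decls {
		fd, ok := d.(*ast.FuncDecl)
		if !ok {
			continue
		}
		kind := "function"
		if fd.Recv != nil { // a receiver list is present only for methods
			kind = "method"
		}
		out = append(out, fmt.Sprintf("%s: %s, params=%d, results=%d, body=%v",
			fd.Name.Name, kind,
			fd.Type.Params.NumFields(),
			fd.Type.Results.NumFields(), // NumFields is safe on a nil list
			fd.Body != nil))
	}
	return out, nil
}

func main() {
	infos, err := funcInfo(funcSrc)
	if err != nil {
		panic(err)
	}
	for _, s := range infos {
		fmt.Println(s)
	}
}
```

The declaration of external has a signature but no body, which corresponds to the `( Function | Signature )` alternative in the FunctionDecl production and to a nil Body in the node.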

Two important parts of function parsing are funcType() and funcBody(). What do they do internally?

/*
FuncType struct {
        ParamList  []*Field
        ResultList []*Field
        expr
}
*/
func (p *parser) funcType() *FuncType {
    if trace {
        defer p.trace("funcType")()
    }

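    typ := new(FuncType) // create the function type structure (its main members are the parameter list and the result list)
    typ.pos = p.pos()
    typ.ParamList = p.paramList() // parse the parameter list (a slice of Field structures, each holding a parameter name and type)
    typ.ResultList = p.funcResult() // parse the result list (also a slice of Field structures)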

    return typ
}
func (p *parser) funcBody() *BlockStmt {
    p.fnest++ // track the function nesting level
    errcnt := p.errcnt // remember the current error count
    body := p.blockStmt("") // parse the statements in the function body (implementation below)
    p.fnest--

    // Don't check branches if there were syntax errors in the function
    // as it may lead to spurious errors (e.g., see test/switch2.go) or
    // possibly crashes due to incomplete syntax trees.
    if p.mode&CheckBranches != 0 && errcnt == p.errcnt {
        checkBranches(body, p.errh)
    }

    return body
}

func (p *parser) blockStmt(context string) *BlockStmt {
    if trace {
        defer p.trace("blockStmt")()
    }

    s := new(BlockStmt) // create the structure for the function body
    s.pos = p.pos()

    // people coming from C may forget that braces are mandatory in Go
    if !p.got(_Lbrace) {
        p.syntaxError("expecting { after " + context)
        p.advance(_Name, _Rbrace)
        s.Rbrace = p.pos() // in case we found "}"
        if p.got(_Rbrace) {
            return s
        }
    }

    s.List = p.stmtList() // parse the statements in the function body (a switch on the current token decides which kind of declaration or statement it is, then parses it according to the matching grammar)
    s.Rbrace = p.pos()
    p.want(_Rbrace)

    return s
}

The above function parsing process is shown with a diagram for easy understanding:

As for how assignments, for, go, defer, select, and other statements in the function body are parsed, you can read the source yourself.
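
As a starting point for that exploration, the sketch below lists the statement node produced for each statement in a function body, again using go/ast rather than the compiler's own node types. stmtTypes and the demo function are assumptions made for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const bodySrc = `package p
func demo(ch chan int) int {
	total := 0
	for i := 0; i < 3; i++ {
		total += i
	}
	go func() { ch <- total }()
	defer close(ch)
	if total > 2 {
		return total
	}
	return 0
}
`

// stmtTypes returns the concrete node type of each top-level statement
// in the body of the function called name.
func stmtTypes(src, name string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "", src, 0)
	if err != nil {
		return nil, err
	}
	for _, d := range f.Decls {
		fd, ok := d.(*ast.FuncDecl)
		if !ok || fd.Name.Name != name || fd.Body == nil {
			continue
		}
		var out []string
		for _, s := range fd.Body.List {
			out = append(out, fmt.Sprintf("%T", s))
		}
		return out, nil
	}
	return nil, fmt.Errorf("function %q not found", name)
}

func main() {
	types, err := stmtTypes(bodySrc, "demo")
	if err != nil {
		panic(err)
	}
	for _, t := range types {
		fmt.Println(t)
	}
}
```

Only the top-level statements of the body are listed. The statements inside the for and if blocks live in the nested block nodes, which is the same recursive structure that blockStmt and stmtList build in the compiler.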

Summary

This article gives an overall picture of syntax analysis and briefly shows the internal implementation of parsing type, const, and func declarations. Syntax analysis also includes expression parsing and the parsing of other grammar rules. There is too much to cover one by one; interested readers can study the rest on their own.

Thanks for reading. The next topic is: Abstract Syntax Tree Construction


书旅