Go compilation principle series 6 (type checking)

6. Go compilation process - type checking

Foreword

The previous article covered the construction of the abstract syntax tree. The next stage is type checking, which traverses every node of the abstract syntax tree and statically checks each kind of node. The work breaks down as follows:

Validate the names and types of constants, types, and functions
Check variable assignments and initializations
Evaluate compile-time constants and bind declarations to identifiers
Rewrite some built-in functions (covered below when walking through the source code)
Check the key and value types of maps (hashes)
Perform special syntactic or semantic checks (Is the referenced struct field exported, that is, capitalized? Does an access to an array literal exceed its length? Is the array index a non-negative integer?)
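To get a feel for the kinds of errors listed above, here is a minimal sketch that uses the standard library's go/types package. Note that go/types is a separate type checker from the cmd/compile/internal/gc code discussed in this article; it is used here only because it can be called from an ordinary program. The helper countTypeErrors and the snippets are made up for this illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// countTypeErrors parses src and runs go/types over it, returning the
// number of type errors reported. It is a hypothetical helper for this
// sketch, not part of the compiler.
func countTypeErrors(src string) (int, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return 0, err // syntax error: type checking never started
	}
	n := 0
	conf := types.Config{
		// Collect every error instead of stopping at the first one.
		Error: func(err error) {
			n++
			fmt.Println("  type error:", err)
		},
	}
	// The returned error repeats the first reported error, so it is ignored.
	_, _ = conf.Check("demo", fset, []*ast.File{f}, nil)
	return n, nil
}

func main() {
	snippets := []struct{ name, src string }{
		{"valid", "package demo\nvar a = [3]int{1, 2, 3}\nvar b = a[2]\n"},
		{"index out of range", "package demo\nvar a = [3]int{1, 2, 3}\nvar b = a[5]\n"},
		{"negative index", "package demo\nvar a = [3]int{1, 2, 3}\nvar b = a[-1]\n"},
		{"mismatched assignment", "package demo\nvar s string = 1\n"},
	}
	for _, s := range snippets {
		fmt.Println(s.name)
		n, err := countTypeErrors(s.src)
		if err != nil {
			fmt.Println("  parse error:", err)
			continue
		}
		fmt.Println("  errors:", n)
	}
}
```

The exact error messages depend on the Go version, so the sketch only counts them.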

Type checking ensures that no node of the abstract syntax tree carries a type error. Because this is static type checking done at compile time, static type errors in the source code are reported during this phase. Whether a type implements the interface it is assigned to is also checked at this stage
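The interface check mentioned above is what makes the common compile-time assertion idiom work. The following is a small sketch; Shape, Square, and totalArea are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// Shape is a hypothetical interface used only for this illustration.
type Shape interface {
	Area() float64
}

// Square is a hypothetical type that implements Shape.
type Square struct{ Side float64 }

// Area satisfies the Shape interface.
func (s Square) Area() float64 { return s.Side * s.Side }

// Compile-time assertion: this declaration stops compiling as soon as
// Square no longer satisfies Shape.
var _ Shape = Square{}

// totalArea accepts any values whose types implement Shape.
func totalArea(shapes ...Shape) float64 {
	sum := 0.0
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}

func main() {
	fmt.Println(totalArea(Square{Side: 2}, Square{Side: 3})) // 4 + 9
}
```

If the Area method were removed, the build would fail at the `var _ Shape = Square{}` line during type checking, before any code runs.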

This article covers what Go's type checking phase does and how certain special nodes are rewritten while they are checked (for example, map and make)

Overview of type checking

The type checking phase traverses each node of the abstract syntax tree to determine the type of the node. For example, the following two forms

 var test int
test := 1

The first form specifies the type directly; the second requires the compiler to obtain the variable's type through type inference

As mentioned in the previous articles, the entry file for Go compilation is:

 Entry file for Go compilation: src/cmd/compile/main.go -> gc.Main(archInit)

Inside gc.Main(archInit), after lexical analysis, syntax analysis, and construction of the abstract syntax tree, you will find the following code:

 func Main(archInit func(*Arch)) {
    ......
    lines := parseFiles(flag.Args()) // lexical analysis, syntax analysis, and AST construction all happen here
    ......
    // traverse the abstract syntax trees and type-check each node
    for i := 0; i < len(xtop); i++ {
        n := xtop[i]
        if op := n.Op; op != ODCL && op != OAS && op != OAS2 && (op != ODCLTYPE || !n.Left.Name.Param.Alias()) {
            xtop[i] = typecheck(n, ctxStmt)
        }
    }
    for i := 0; i < len(xtop); i++ {
        n := xtop[i]
        if op := n.Op; op == ODCL || op == OAS || op == OAS2 || op == ODCLTYPE && n.Left.Name.Param.Alias() {
            xtop[i] = typecheck(n, ctxStmt)
        }
    }
    ......
    checkMapKeys() // check the types of map keys
    ......
}

Here xtop is a slice that holds the root node of each abstract syntax tree (as mentioned in the article on abstract syntax tree construction, each top-level declaration such as var, const, type, or func is built into its own tree). The loops therefore start from the root node of each tree and type check the trees one by one

From the code above, we can see that type checking is driven mainly by the typecheck function, which checks constants, types, function declarations, assignment statements, and so on. The first loop skips ODCL, OAS, OAS2, and alias type declarations; the second loop handles exactly those nodes. Afterwards, checkMapKeys() is called to check the types of map keys (both functions are described in detail below)
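The two loops in Main partition the top-level nodes with complementary conditions. The sketch below reproduces only that ordering logic; Op, node, isPhase2, and checkOrder are simplified stand-ins invented for this example, not the compiler's real definitions.

```go
package main

import "fmt"

// Op and node are simplified stand-ins for the compiler's Op and *Node.
type Op int

const (
	ODCLFUNC Op = iota
	ODCLTYPE
	ODCL
	OAS
	OAS2
)

type node struct {
	Op    Op
	Name  string // label used only for printing
	Alias bool   // stands in for n.Left.Name.Param.Alias()
}

// isPhase2 mirrors the condition of the second loop in Main.
func isPhase2(n node) bool {
	return n.Op == ODCL || n.Op == OAS || n.Op == OAS2 || (n.Op == ODCLTYPE && n.Alias)
}

// checkOrder returns the labels in the order the two loops would visit them.
func checkOrder(xtop []node) []string {
	var order []string
	for _, n := range xtop { // first loop: everything except phase-2 nodes
		if !isPhase2(n) {
			order = append(order, n.Name)
		}
	}
	for _, n := range xtop { // second loop: declarations, assignments, aliases
		if isPhase2(n) {
			order = append(order, n.Name)
		}
	}
	return order
}

func main() {
	xtop := []node{
		{Op: OAS, Name: "var v = f()"},
		{Op: ODCLFUNC, Name: "func f"},
		{Op: ODCLTYPE, Name: "type T struct"},
		{Op: ODCLTYPE, Name: "type A = T", Alias: true},
	}
	for i, name := range checkOrder(xtop) {
		fmt.Println(i, name)
	}
}
```

With this input, the function and the struct type are visited before the variable assignment and the alias, regardless of their order in the source.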

The core logic of typecheck lives in the typecheck1 function that it calls. typecheck1 consists of one large switch statement that selects the processing logic according to the Op of each node. There are many branches, so only a few special ones are examined in depth here

 func typecheck1(n *Node, top int) (res *Node) {
        ......
        switch n.Op {
        // until typecheck is complete, do nothing.
        default:
            Dump("typecheck", n)
    
            Fatalf("typecheck %v", n.Op)
    
        // names
        case OLITERAL:
            ok |= ctxExpr
    
            if n.Type == nil && n.Val().Ctype() == CTSTR {
                n.Type = types.UntypedString
            }
    
        case ONONAME:
            ok |= ctxExpr
    
        case ONAME:
            ......
        case OTARRAY:
        ......
        case OTMAP:
        ......
        }
        ......
}
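The shape of typecheck1, a single switch on the node's Op that fills in the node's type, can be illustrated with a toy checker. This is only a sketch of the dispatch pattern under simplified assumptions: the op constants, the expr struct, and the string-valued types are all invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

type op int

const (
	opIntLit op = iota
	opStrLit
	opAdd
)

// expr is a toy AST node; typ is filled in by typecheck.
type expr struct {
	op          op
	left, right *expr
	typ         string
}

// typecheck sets e.typ for e and its children, or returns the first error.
func typecheck(e *expr) error {
	switch e.op {
	case opIntLit:
		e.typ = "int"
	case opStrLit:
		e.typ = "string"
	case opAdd:
		// Check both operands first, then require matching types.
		if err := typecheck(e.left); err != nil {
			return err
		}
		if err := typecheck(e.right); err != nil {
			return err
		}
		if e.left.typ != e.right.typ {
			return fmt.Errorf("mismatched types %s and %s", e.left.typ, e.right.typ)
		}
		e.typ = e.left.typ
	default:
		return errors.New("unknown op")
	}
	return nil
}

func main() {
	ok := &expr{op: opAdd, left: &expr{op: opIntLit}, right: &expr{op: opIntLit}}
	bad := &expr{op: opAdd, left: &expr{op: opIntLit}, right: &expr{op: opStrLit}}
	fmt.Println(typecheck(ok), ok.typ)
	fmt.Println(typecheck(bad))
}
```

The real typecheck1 differs in many ways (it reports errors through yyerror, tracks contexts such as ctxExpr and ctxType, and may replace the node), but the per-Op dispatch is the same idea.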

A deep dive into type checking

OAS: Assignment Statement

 // Left = Right or (if Colas=true) Left := Right
// If Colas, then Ninit includes a DCL node for Left.
OAS

Assignment statements are handled mainly by typecheckas, in /usr/local/go/src/cmd/compile/internal/gc/typecheck.go:

 case OAS:
        ok |= ctxStmt

        typecheckas(n)

        // Code that creates temps does not bother to set defn, so do it here.
        if n.Left.Op == ONAME && n.Left.IsAutoTmp() {
            n.Left.Name.Defn = n
        }

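Inside typecheckas you can see the following code. When the variable on the left is defined by this statement and has no declared type, defaultlit converts the untyped constant on the right to its default type, and that type is then assigned to the variable on the left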

 func typecheckas(n *Node) {
        ......
        if n.Left.Name != nil && n.Left.Name.Defn == n && n.Left.Name.Param.Ntype == nil {
                n.Right = defaultlit(n.Right, nil)
                n.Left.Type = n.Right.Type
        }
        ......
}

For example, in var a = 666 the untyped constant 666 defaults to int, so a has type int
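The effect of this defaulting is visible from ordinary Go code. The short sketch below prints the types that variables receive when they are initialized from untyped constants; typeOf is a small helper written for this example.

```go
package main

import "fmt"

// typeOf returns the type name of v as printed by the %T verb.
func typeOf(v interface{}) string { return fmt.Sprintf("%T", v) }

func main() {
	var a = 666         // untyped integer constant defaults to int
	var f = 6.66        // untyped floating-point constant defaults to float64
	var r = 'x'         // untyped rune constant defaults to rune (int32)
	var c = 2i          // untyped complex constant defaults to complex128
	var s = "six"       // untyped string constant defaults to string
	var g float64 = 666 // a declared type wins over the default

	fmt.Println(typeOf(a), typeOf(f), typeOf(r), typeOf(c), typeOf(s), typeOf(g))
}
```

The last declaration has an explicit type, which corresponds to the case where n.Left.Name.Param.Ntype is not nil, so the branch shown above is skipped.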

OTARRAY: arrays and slices

 OTARRAY // []int, [8]int, [N]int or [...]int

For a node whose Op is OTARRAY, the element type (the node's right child) is checked first

 case OTARRAY:
        ok |= ctxType
        r := typecheck(n.Right, ctxType)
        if r.Type == nil {
            n.Type = nil
            return n
        }

Then, depending on the left child, there are three cases: []int, [...]int, and [6]int

[]int: t = types.NewSlice(r.Type) is called directly. It returns a structure of type TSLICE, which also stores the element type
[...]int: handled by the typecheckcomplit function:

 func typecheckcomplit(n *Node) (res *Node) {
        ......
        // Need to handle [...]T arrays specially.
        if n.Right.Op == OTARRAY && n.Right.Left != nil && n.Right.Left.Op == ODDD {
                n.Right.Right = typecheck(n.Right.Right, ctxType)
                if n.Right.Right.Type == nil {
                        n.Type = nil
                        return n
                }
                elemType := n.Right.Right.Type
        
                length := typecheckarraylit(elemType, -1, n.List.Slice(), "array literal")
        
                n.Op = OARRAYLIT
                n.Type = types.NewArray(elemType, length)
                n.Right = nil
                return n
        }
        ......
}

This function obtains the number of elements in the array literal and then calls [types.NewArray](https://draveness.me/golang/tree/cmd/compile/internal/types.NewArray) to initialize a structure that stores the array's element type and length

[6]int: if the length is given when the array is declared, [types.NewArray](https://draveness.me/golang/tree/cmd/compile/internal/types.NewArray) is called directly to initialize a structure that stores the array's element type and length

As you can see, the length of an array is determined during type checking

Finally, the node's Type and related information are updated

 setTypeNode(n, t)
n.Left = nil
n.Right = nil
checkwidth(t)
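That the array length is fixed at compile time can be observed from user code: the length becomes part of the type, and len of an array literal is a constant expression. A small sketch (the constant names n1 and n2 are arbitrary):

```go
package main

import "fmt"

// len of an array value is a compile-time constant, so it can initialize
// a const. A slice would not be accepted here.
const (
	n1 = len([...]int{1, 2, 3})      // three elements
	n2 = len([...]string{5: "last"}) // highest index 5, so length 6
)

func main() {
	s := []int{1, 2, 3}    // slice: the type carries no length
	a := [...]int{1, 2, 3} // array: length inferred from the literal
	b := [6]int{}          // array: length written explicitly

	fmt.Printf("%T %T %T\n", s, a, b)
	fmt.Println(n1, n2)
}
```

The [...]int literal ends up with the same kind of type as an explicitly sized array, which matches the typecheckcomplit code above calling types.NewArray with the computed length.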

OTMAP: map (hash)

 OTMAP    // map[string]int

For a node whose Op is OTMAP, the left and right children (the key type and the value type) are type checked first. A TMAP structure is then created, and the map's key and value types are stored in it.

 case OTMAP:
        ok |= ctxType
        n.Left = typecheck(n.Left, ctxType)
        n.Right = typecheck(n.Right, ctxType)
        l := n.Left
        r := n.Right
        if l.Type == nil || r.Type == nil {
            n.Type = nil
            return n
        }
        if l.Type.NotInHeap() {
            yyerror("incomplete (or unallocatable) map key not allowed")
        }
        if r.Type.NotInHeap() {
            yyerror("incomplete (or unallocatable) map value not allowed")
        }

        setTypeNode(n, types.NewMap(l.Type, r.Type))
        mapqueue = append(mapqueue, n) // check map keys when all types are settled
        n.Left = nil
        n.Right = nil

......
func NewMap(k, v *Type) *Type {
    t := New(TMAP)
    mt := t.MapType()
    mt.Key = k
    mt.Elem = v
    return t
}

The code shows that, besides modifying the node, it also appends the node to the mapqueue queue. As mentioned in the overview, checkMapKeys() later revisits these nodes to check the map key types, once all types have been settled

 func checkMapKeys() {
    for _, n := range mapqueue {
        k := n.Type.MapType().Key
        if !k.Broke() && !IsComparable(k) {
            yyerrorl(n.Pos, "invalid map key type %v", k)
        }
    }
    mapqueue = nil
}

In other words, it traverses the queue and verifies that each key type is comparable and can therefore be used as a map key
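The comparability rule that checkMapKeys enforces can be explored at run time with the reflect package, whose Type.Comparable method answers a similar question. This is an analogy, not the compiler's IsComparable; validMapKey, point, and bag are hypothetical names for this sketch.

```go
package main

import (
	"fmt"
	"reflect"
)

// point contains only comparable fields, so it may be used as a map key.
type point struct{ X, Y int }

// bag contains a slice, which makes the whole struct non-comparable.
type bag struct{ Items []int }

// validMapKey reports whether the dynamic type of v is comparable.
// v must not be a nil interface value.
func validMapKey(v interface{}) bool {
	return reflect.TypeOf(v).Comparable()
}

func main() {
	values := []interface{}{
		"text", 42, point{}, [2]int{}, // comparable
		bag{}, []int{}, map[string]int{}, func() {}, // not comparable
	}
	for _, v := range values {
		fmt.Printf("%-16T valid map key: %v\n", v, validMapKey(v))
	}
}
```

Declaring map[bag]int or map[[]int]string in source code is rejected by the compiler with an "invalid map key type" error, which is the yyerrorl call shown above.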

OMAKE: make

 OMAKE          // make(List) (before type checking converts to one of the following)
OMAKECHAN      // make(Type, Left) (type is chan)
OMAKEMAP       // make(Type, Left) (type is map)
OMAKESLICE     // make(Type, Left, Right) (type is slice)

When writing Go code, we often use the built-in make function to create slices, maps, and channels. In the type checking phase, the compiler rewrites each OMAKE node into a more specific one:

 make slice: OMAKESLICE
make map: OMAKEMAP
make chan: OMAKECHAN

The implementation first takes the first argument of make, which is the type, and then handles each kind of type differently

 case OMAKE:
        ok |= ctxExpr
        args := n.List.Slice()
        ......
        l := args[0]
        l = typecheck(l, ctxType)
        t := l.Type
        ......
        i := 1
        switch t.Etype {
        default:
            yyerror("cannot make type %v", t)
            n.Type = nil
            return n

        case TSLICE:
            ......
            n.Left = l
            n.Right = r
            n.Op = OMAKESLICE

        case TMAP:
            ......
            n.Op = OMAKEMAP
        case TCHAN:
            ......
            n.Op = OMAKECHAN
        }

        n.Type = t

If the first argument is a slice type: the length (len) and capacity (cap) arguments are read and checked for validity, and the node's Op is rewritten to OMAKESLICE
If the first argument is a map type: the second argument of make (the initial size of the map) is read; if it is absent, it defaults to 0. The node's Op is rewritten to OMAKEMAP
If the first argument is a chan type: the second argument of make (the buffer size of the channel) is read; if it is absent, it defaults to 0. The node's Op is rewritten to OMAKECHAN

The code for these branches is not pasted here; you can read it in the source
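As a rough run-time analogy of this compile-time rewrite, the sketch below maps a value created by make to the Op name its OMAKE node would be rewritten to. makeOp is a hypothetical helper built on the reflect package; the compiler itself switches on t.Etype, as shown in the excerpt above.

```go
package main

import (
	"fmt"
	"reflect"
)

// makeOp returns the name of the Op that a make call producing v's type
// would be rewritten to, or an error for types that make does not accept.
// v must not be a nil interface value.
func makeOp(v interface{}) (string, error) {
	switch reflect.TypeOf(v).Kind() {
	case reflect.Slice:
		return "OMAKESLICE", nil
	case reflect.Map:
		return "OMAKEMAP", nil
	case reflect.Chan:
		return "OMAKECHAN", nil
	default:
		return "", fmt.Errorf("cannot make type %T", v)
	}
}

func main() {
	s := make([]int, 2, 8)    // len 2, cap 8
	m := make(map[string]int) // size argument omitted
	c := make(chan int)       // buffer size omitted, so the channel is unbuffered

	for _, v := range []interface{}{s, m, c, 42} {
		op, err := makeOp(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%T -> %s\n", v, op)
	}
	fmt.Println(len(s), cap(s), len(m), cap(c))
}
```

The final line shows the defaults described above: the map starts empty and the channel has a buffer size of 0 when the second argument is left out.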

Summary

This article walked through the type checking of a few special node types. Many other node types are type checked as well; you can read the source code to explore them

The earlier articles did not cover how to debug the Go source code, so the next article will share a method for doing so, using the construction of the abstract syntax tree as the example

References

go-ast-book
"Analysis of the underlying principles of the Go language"
Faith Oriented Programming - Type Checking
