Foreword
The previous version of the script interpreter GScript implemented the four basic arithmetic operations and AST generation.
When I went to add a % operator for modulo, I found the work tedious and almost entirely repetitive. It comes down to two steps:
- Add support for the % symbol to the lexer.
- Implement the logic for the % token when the parser traverses the AST.
Lexing and AST traversal are purely repetitive work, so can these two steps be simplified?
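To make the repetition concrete, here is a minimal sketch of a hand-written lexer for arithmetic operators. It is not GScript's actual lexer: the names `TokenKind`, `Token`, `operators`, and `Tokenize` are hypothetical, and parentheses are left out to keep it short. It shows that supporting `%` means editing both the kind list and the operator table by hand.

```go
package main

import (
	"fmt"
	"unicode"
)

// TokenKind is a hypothetical token category used only in this sketch.
type TokenKind string

const (
	Number TokenKind = "NUMBER"
	Plus   TokenKind = "PLUS"
	Sub    TokenKind = "SUB"
	Mult   TokenKind = "MULT"
	Div    TokenKind = "DIV"
	Mod    TokenKind = "MOD" // the new kind that has to be added by hand
)

// Token pairs a kind with the matched source text.
type Token struct {
	Kind TokenKind
	Text string
}

// operators maps each single-character operator to its kind.
// Supporting "%" means touching this table, and the evaluator separately.
var operators = map[rune]TokenKind{
	'+': Plus,
	'-': Sub,
	'*': Mult,
	'/': Div,
	'%': Mod,
}

// Tokenize splits an arithmetic expression into number and operator tokens.
func Tokenize(src string) ([]Token, error) {
	var tokens []Token
	runes := []rune(src)
	for i := 0; i < len(runes); {
		r := runes[i]
		switch {
		case unicode.IsSpace(r):
			i++
		case unicode.IsDigit(r):
			start := i
			for i < len(runes) && unicode.IsDigit(runes[i]) {
				i++
			}
			tokens = append(tokens, Token{Number, string(runes[start:i])})
		default:
			kind, ok := operators[r]
			if !ok {
				return nil, fmt.Errorf("unexpected character %q at offset %d", r, i)
			}
			tokens = append(tokens, Token{kind, string(r)})
			i++
		}
	}
	return tokens, nil
}

func main() {
	tokens, err := Tokenize("7 % 3")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range tokens {
		fmt.Printf("%s (%q)\n", t.Kind, t.Text)
	}
}
```

Every further operator repeats the same pattern, which is the kind of work a generator can take over.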
Antlr
Antlr is a common tool that solves exactly these problems. With it, we only need to write a grammar file; Antlr then generates the lexer and parser automatically, and it can generate code in different target languages.
Let's take GScript as an example to see how Antlr helps us generate a lexer.
func TestGScriptVisitor_Visit_Lexer(t *testing.T) {
	expression := "(2+3) * 2"
	input := antlr.NewInputStream(expression)
	lexer := parser.NewGScriptLexer(input)
	for {
		tok := lexer.NextToken()
		if tok.GetTokenType() == antlr.TokenEOF {
			break
		}
		fmt.Printf("%s (%q) %d\n",
			lexer.SymbolicNames[tok.GetTokenType()], tok.GetText(), tok.GetColumn())
	}
}
//output:
("(") 0
DECIMAL_LITERAL ("2") 1
PLUS ("+") 2
DECIMAL_LITERAL ("3") 3
(")") 4
MULT ("*") 6
DECIMAL_LITERAL ("2") 8
Antlr automatically splits our expression into tokens, and for each token it also records the line, the column, and other position information, which is useful for syntax checking at compile time.
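As an illustration of why position information matters, the sketch below turns a line and column into a readable error message with a caret under the offending character. `SyntaxError` and `Render` are hypothetical names for this example and are not part of Antlr or GScript. It assumes 1-based lines and 0-based columns, which matches the column values in the output above.

```go
package main

import (
	"fmt"
	"strings"
)

// SyntaxError is a hypothetical error type that carries a token position.
type SyntaxError struct {
	Line   int // 1-based line number
	Column int // 0-based column within the line
	Msg    string
}

func (e *SyntaxError) Error() string {
	return fmt.Sprintf("line %d:%d %s", e.Line, e.Column, e.Msg)
}

// Render returns the message, the offending source line, and a caret
// placed under the reported column.
func Render(src string, e *SyntaxError) string {
	lines := strings.Split(src, "\n")
	if e.Line < 1 || e.Line > len(lines) {
		return e.Error()
	}
	line := lines[e.Line-1]
	col := e.Column
	if col < 0 {
		col = 0
	}
	if col > len(line) {
		col = len(line)
	}
	return fmt.Sprintf("%s\n%s\n%s^", e.Error(), line, strings.Repeat(" ", col))
}

func main() {
	err := &SyntaxError{Line: 1, Column: 6, Msg: "unexpected token"}
	fmt.Println(Render("(2+3) ^ 2", err))
}
```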
To achieve this, we only need to write the lexer and grammar rules.
The rules corresponding to the example above are as follows:
expr
: '(' expr ')' #NestedExpr
| liter=literal #Liter
| lhs=expr bop=( MULT | DIV ) rhs=expr #MultDivExpr
| lhs=expr bop=MOD rhs=expr #ModExpr
| lhs=expr bop=( PLUS | SUB ) rhs=expr #PlusSubExpr
| expr bop=(LE | GE | GT | LT ) expr # GLe
| expr bop=(EQUAL | NOTEQUAL) expr # EqualOrNot
;
DECIMAL_LITERAL: ('0' | [1-9] (Digits? | '_'+ Digits)) [lL]?;
Full rules: https://github.com/crossoverJie/gscript/blob/main/GScript.g4
Run:
antlr -Dlanguage=Go -o parser -visitor -no-listener GScript.g4
This generates Go code for us (the default target is Java). For Antlr's lexer and grammar rules and its installation steps, please refer to the official website.
When we want to implement the actual language logic, we only need to implement the relevant interface. Antlr traverses the AST automatically (the traversal can also be controlled manually), and each time it visits an AST node it calls back the method we implemented, so we can define the behavior of each grammar rule ourselves.
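The callback idea can be sketched without Antlr at all. The example below is a simplified, standard-library-only illustration of the visitor pattern; `Visitor`, `Node`, `Number`, `Binary`, and `Evaluator` are hypothetical names and do not correspond to the code Antlr generates. Each node routes itself to the callback for its own type, and the evaluator only fills in the callbacks. Unsupported operators simply panic in this sketch.

```go
package main

import "fmt"

// Visitor lists one callback per node type.
type Visitor interface {
	VisitNumber(n *Number) interface{}
	VisitBinary(n *Binary) interface{}
}

// Node is any AST node that can route itself to the matching callback.
type Node interface {
	Accept(v Visitor) interface{}
}

// Number is a literal integer node.
type Number struct{ Value int }

// Binary is an operator node with two operands.
type Binary struct {
	Op       string
	Lhs, Rhs Node
}

func (n *Number) Accept(v Visitor) interface{} { return v.VisitNumber(n) }
func (n *Binary) Accept(v Visitor) interface{} { return v.VisitBinary(n) }

// Evaluator implements Visitor and computes the value of the tree.
type Evaluator struct{}

func (e *Evaluator) VisitNumber(n *Number) interface{} { return n.Value }

func (e *Evaluator) VisitBinary(n *Binary) interface{} {
	// Visiting the children first yields the operand values.
	lhs := n.Lhs.Accept(e).(int)
	rhs := n.Rhs.Accept(e).(int)
	switch n.Op {
	case "+":
		return lhs + rhs
	case "-":
		return lhs - rhs
	case "*":
		return lhs * rhs
	default:
		panic("unsupported operator " + n.Op)
	}
}

func main() {
	// Hand-built tree for (2+3) * 2.
	tree := &Binary{
		Op:  "*",
		Lhs: &Binary{Op: "+", Lhs: &Number{Value: 2}, Rhs: &Number{Value: 3}},
		Rhs: &Number{Value: 2},
	}
	fmt.Println(tree.Accept(&Evaluator{})) // prints 10
}
```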
Take the new modulo operation here as an example:
func (v *GScriptVisitor) VisitModExpr(ctx *parser.ModExprContext) interface{} {
	lhs := v.Visit(ctx.GetLhs())
	rhs := v.Visit(ctx.GetRhs())
	return lhs.(int) % rhs.(int)
}
When Antlr calls back the VisitModExpr method, the values on the left and right sides of the % symbol are available, and all that remains is to perform the operation itself.
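One thing to keep in mind with this style: in Go, a type assertion such as `lhs.(int)` panics when the operand is not an int, and integer `%` panics when the right-hand side is zero. A possible way to guard against both is sketched below. `Mod` is a hypothetical helper, not part of GScript, and whether to return an error or report it some other way is a design choice.

```go
package main

import (
	"errors"
	"fmt"
)

// Mod is a hypothetical helper that checks both operands before applying %.
func Mod(lhs, rhs interface{}) (int, error) {
	l, ok := lhs.(int)
	if !ok {
		return 0, fmt.Errorf("left operand of %% must be int, got %T", lhs)
	}
	r, ok := rhs.(int)
	if !ok {
		return 0, fmt.Errorf("right operand of %% must be int, got %T", rhs)
	}
	if r == 0 {
		return 0, errors.New("modulo by zero")
	}
	return l % r, nil
}

func main() {
	fmt.Println(Mod(7, 3))   // 1 <nil>
	fmt.Println(Mod(7, 0))   // 0 modulo by zero
	fmt.Println(Mod(7, "3")) // 0 right operand of % must be int, got string
}
```

Note that Go's `%` takes the sign of the left operand, so `Mod(-7, 3)` yields -1 rather than 2.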
Following the same pattern, this release adds a new if/else statement. Its syntax is shown in the following test:
func TestGScriptVisitor_VisitIfElse8(t *testing.T) {
	expression := `
if(3!=(1+2)){
	return 1+3
} else {
	return false
}`
	input := antlr.NewInputStream(expression)
	lexer := parser.NewGScriptLexer(input)
	stream := antlr.NewCommonTokenStream(lexer, 0)
	// Named p so the variable does not shadow the parser package.
	p := parser.NewGScriptParser(stream)
	p.BuildParseTrees = true
	tree := p.Prog()
	visitor := GScriptVisitor{}
	result := visitor.Visit(tree)
	fmt.Println(expression, " result:", result)
	// 3 != (1+2) is false, so the else branch returns false.
	assert.Equal(t, false, result)
}
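How an if/else node can be evaluated is sketched below in a simplified form that does not depend on Antlr. `Expr` and `IfElse` are hypothetical types, not GScript's implementation. The point is only that the condition is evaluated first and then exactly one of the two branches runs.

```go
package main

import (
	"errors"
	"fmt"
)

// Expr is a hypothetical deferred expression: calling it evaluates it.
type Expr func() interface{}

// IfElse is a hypothetical node holding a condition and two branches.
type IfElse struct {
	Cond Expr
	Then Expr
	Else Expr
}

// Eval evaluates the condition and then exactly one branch.
func (s IfElse) Eval() (interface{}, error) {
	cond, ok := s.Cond().(bool)
	if !ok {
		return nil, errors.New("if condition must evaluate to a bool")
	}
	if cond {
		return s.Then(), nil
	}
	return s.Else(), nil
}

func main() {
	// Mirrors: if(3!=(1+2)){ return 1+3 } else { return false }
	stmt := IfElse{
		Cond: func() interface{} { return 3 != (1 + 2) },
		Then: func() interface{} { return 1 + 3 },
		Else: func() interface{} { return false },
	}
	result, err := stmt.Eval()
	fmt.Println(result, err) // false <nil>
}
```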
Antlr has other advantages as well; for example, it handles:
- Left recursion.
- Ambiguity.
- Operator precedence.
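To give a sense of what handling these by hand involves, here is a minimal precedence-climbing evaluator. It is a sketch only: the `precedence` table is hypothetical and not taken from GScript.g4, tokens must be separated by spaces to avoid writing a lexer, and a loop replaces the left-recursive rule that a grammar file can state directly.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// precedence is a hypothetical table: a higher number binds tighter.
var precedence = map[string]int{"+": 1, "-": 1, "*": 2, "/": 2, "%": 2}

type parser struct {
	toks []string
	pos  int
}

func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

func (p *parser) next() string {
	t := p.peek()
	p.pos++
	return t
}

// parseExpr parses operators whose precedence is at least minPrec.
// The loop replaces the left recursion of a rule like expr: expr op expr.
func (p *parser) parseExpr(minPrec int) (int, error) {
	lhs, err := p.parsePrimary()
	if err != nil {
		return 0, err
	}
	for {
		op := p.peek()
		prec, ok := precedence[op]
		if !ok || prec < minPrec {
			return lhs, nil
		}
		p.next()
		// prec+1 makes operators of equal precedence left-associative.
		rhs, err := p.parseExpr(prec + 1)
		if err != nil {
			return 0, err
		}
		switch op {
		case "+":
			lhs += rhs
		case "-":
			lhs -= rhs
		case "*":
			lhs *= rhs
		case "/", "%":
			if rhs == 0 {
				return 0, fmt.Errorf("division by zero")
			}
			if op == "/" {
				lhs /= rhs
			} else {
				lhs %= rhs
			}
		}
	}
}

// parsePrimary parses a number or a parenthesized expression.
func (p *parser) parsePrimary() (int, error) {
	tok := p.next()
	if tok == "(" {
		v, err := p.parseExpr(1)
		if err != nil {
			return 0, err
		}
		if p.next() != ")" {
			return 0, fmt.Errorf("expected )")
		}
		return v, nil
	}
	n, err := strconv.Atoi(tok)
	if err != nil {
		return 0, fmt.Errorf("expected number, got %q", tok)
	}
	return n, nil
}

// Eval evaluates a space-separated arithmetic expression.
func Eval(src string) (int, error) {
	p := &parser{toks: strings.Fields(src)}
	v, err := p.parseExpr(1)
	if err != nil {
		return 0, err
	}
	if p.pos != len(p.toks) {
		return 0, fmt.Errorf("unexpected token %q", p.peek())
	}
	return v, nil
}

func main() {
	fmt.Println(Eval("2 + 3 * 2"))     // 8 <nil>
	fmt.Println(Eval("( 2 + 3 ) * 2")) // 10 <nil>
	fmt.Println(Eval("10 - 4 - 3"))    // 3 <nil>
}
```

With a generator, the same information is expressed by the order of the alternatives in the expr rule instead of by code like this.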
It is also worth installing the Antlr plug-in for your IDE: it displays the AST visually, which makes debugging much easier.
Upgrading xjson
With the statement support provided by GScript, xjson can also be used in some interesting new ways.
Because xjson's arithmetic syntax was not built with Antlr, supporting GScript statements required additional adaptation code.
This also shows how important front-end tools such as Antlr are: the efficiency gain is obvious.
Summary
With the help of Antlr, GScript will go on to support function calls, a more complete type system, object-oriented features, and more. If you are interested, stay tuned.
Source address:
https://github.com/crossoverJie/gscript