Foreword
Recently I have been relearning compiler principles. I actually reviewed them once before, two years ago. Back then the goal was to build a tool that generates Python sqlalchemy models from MySQL DDL.
Related article here: Handwriting a lexer
That tool did what it needed to, but in hindsight the implementation is fairly rough and it only used lexical analysis. So this time my goal is to implement a fully functional scripting "language" by going through lexical analysis -> syntax analysis -> semantic analysis.
Effect
There are already some early results, as shown in the following figure:
It currently supports the following basic features:
- Variable declaration and assignment (only int is supported)
- Arithmetic operations (with operator precedence)
- Syntax checking
- Debug mode, which can print the AST
If you are interested, you can view the source code here:
https://github.com/crossoverJie/gscript
If you have a local Go environment, you can also install and run it directly:
go get github.com/crossoverJie/gscript
gscript -h
Or directly download the binary and run: https://github.com/crossoverJie/gscript/releases
Implementation
The current version is written in Go. As the title says, the core code is under 1k lines, although that is partly because the feature set is still rudimentary.
Still, the sparrow may be small, but it has all its vital organs: the current version already applies some compiler theory, namely lexical analysis and syntax analysis.
The basic implementation flow is as shown above:
- The lexer parses the source code into tokens.
- The parser then derives an abstract syntax tree (AST) from the tokens.
  - If the source contains a syntax error, this step reports a compilation failure, for example 2*(1+, which is missing a closing parenthesis.
Because I did not use a tool like ANTLR to help generate code (otherwise there would be far more features by now), the lexer and the parser are both handwritten. The amount of code is small, so anyone who wants to debug it can read the source directly.
Lexer: token/token.go:39
Parser: syntax/syntax.go
This involves concepts such as finite state machines and the recursive descent algorithm. They are not discussed in this article; I will write them up once the project is more complete.
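For readers who want a feel for recursive descent in the meantime, here is a minimal sketch, assuming a tiny grammar of integers, + - * / and parentheses. It is not gscript's parser: Node, Parse and the error messages are hypothetical, and to stay short it reads characters directly instead of tokens and does not handle whitespace. Each grammar rule becomes one method, and operator precedence falls out of which rule calls which.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical AST node; Op == 0 marks an integer leaf.
type Node struct {
	Op          byte
	Text        string // literal text of a leaf, e.g. "42"
	Left, Right *Node
}

// String renders the tree in prefix form, e.g. (+ 1 (* 2 3)).
func (n *Node) String() string {
	if n.Op == 0 {
		return n.Text
	}
	return fmt.Sprintf("(%c %s %s)", n.Op, n.Left, n.Right)
}

type parser struct {
	src string
	pos int
}

// peek returns the next byte, or 0 at end of input.
func (p *parser) peek() byte {
	if p.pos < len(p.src) {
		return p.src[p.pos]
	}
	return 0
}

// expr := term (('+' | '-') term)*
// term := factor (('*' | '/') factor)*
func (p *parser) expr() (*Node, error) { return p.chain(p.term, "+-") }
func (p *parser) term() (*Node, error) { return p.chain(p.factor, "*/") }

// chain parses operands joined left-associatively by any operator in ops.
func (p *parser) chain(operand func() (*Node, error), ops string) (*Node, error) {
	left, err := operand()
	if err != nil {
		return nil, err
	}
	for c := p.peek(); strings.IndexByte(ops, c) >= 0; c = p.peek() {
		p.pos++ // consume the operator
		right, err := operand()
		if err != nil {
			return nil, err
		}
		left = &Node{Op: c, Left: left, Right: right}
	}
	return left, nil
}

// factor := INT | '(' expr ')'
func (p *parser) factor() (*Node, error) {
	switch c := p.peek(); {
	case c == '(':
		p.pos++ // consume '('
		n, err := p.expr()
		if err == nil && p.peek() != ')' {
			err = fmt.Errorf("compile error at offset %d: expected ')'", p.pos)
		}
		p.pos++ // consume ')'; has no effect on the result when err != nil
		return n, err
	case c >= '0' && c <= '9':
		start := p.pos
		for c = p.peek(); c >= '0' && c <= '9'; c = p.peek() {
			p.pos++
		}
		return &Node{Text: p.src[start:p.pos]}, nil
	default:
		return nil, fmt.Errorf("compile error at offset %d: expected a number or '('", p.pos)
	}
}

// Parse builds the AST for src or reports the first syntax error.
func Parse(src string) (*Node, error) {
	p := &parser{src: src}
	n, err := p.expr()
	if err == nil && p.pos < len(p.src) {
		return nil, fmt.Errorf("compile error at offset %d: unexpected %q", p.pos, p.peek())
	}
	return n, err
}

func main() {
	for _, src := range []string{"1+2*3", "(1+2)*3", "2*(1+"} {
		if ast, err := Parse(src); err != nil {
			fmt.Println(src, "=>", err)
		} else {
			fmt.Println(src, "=>", ast)
		}
	}
}
```

Running it prints (+ 1 (* 2 3)) for 1+2*3 and (* (+ 1 2) 3) for (1+2)*3, which shows multiplication binding tighter than addition, while 2*(1+ ends in a compile error because the input runs out where an operand is expected.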
Plans
Finally, it is time to make some promises. If all goes well, the following features will be added in the future:
- More basic types, string/long etc.
- Variable scope, functions.
- Even closures.
- OOP is definitely indispensable.
Once these features are implemented, it can be regarded as a "modern" scripting language. I will keep sharing the interesting things I run into while learning and implementing it.
Source address:
https://github.com/crossoverJie/gscript