Foreword

Recently I have been relearning compiler principles. I actually reviewed them once before, two years ago; back then the goal was to build a tool that generates Python SQLAlchemy models from MySQL DDL.


Related article here: Handwriting a lexer

Although that tool does its job, the implementation now looks rather rough, and it only uses lexical analysis. So this time my goal is to implement a fully functional scripting "language" by going through lexical analysis -> syntax analysis -> semantic analysis.

Results

There are already some interim results, as shown in the following figure:

It currently supports the following basic features:

  • Variable declaration and assignment (only int is supported)
  • Arithmetic expressions (with operator precedence)
  • Syntax checking
  • Debug mode, which can print the AST

If you are interested, you can view the source code here:
https://github.com/crossoverJie/gscript

If you have a local Go environment, you can also install and run it:

go get github.com/crossoverJie/gscript
gscript -h

Or download the binary directly and run it: https://github.com/crossoverJie/gscript/releases

Implementation

The current version is written in Go. As the title says, the core code is indeed under 1k lines, although that is partly because the feature set is still rudimentary.

Still, small as it is, it has all the essential parts: the current version already applies some knowledge from compiler principles, namely lexical analysis and syntax analysis.

The basic implementation process, as shown in the figure above, is:

  • The lexical analyzer (lexer) splits the source code into tokens
  • The parser then derives an abstract syntax tree (AST) from those tokens

    • If the source contains a syntax error, this step reports a compilation failure, for example for 2*(1+ which is missing a closing parenthesis.
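
As a rough illustration of the two steps above, here is a minimal sketch in Go. It is not the actual gscript code: the names Node, tokenize, parser and Parse are made up for this example, and it only handles integer literals, + - * / and parentheses. The parser is a small recursive descent parser, where term binds tighter than expr, which is one common way to get operator precedence.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Node is a hypothetical AST node: an int literal (Op == 0) or a binary operation.
type Node struct {
	Op          byte
	Value       int
	Left, Right *Node
}

// String renders the tree with explicit parentheses, like a tiny debug dump.
func (n *Node) String() string {
	if n.Op == 0 {
		return strconv.Itoa(n.Value)
	}
	return fmt.Sprintf("(%s %c %s)", n.Left, n.Op, n.Right)
}

// tokenize is step 1: split the source into integer, operator and parenthesis tokens.
func tokenize(src string) ([]string, error) {
	var tokens []string
	for i := 0; i < len(src); {
		c := src[i]
		switch {
		case c == ' ':
			i++
		case c >= '0' && c <= '9':
			start := i
			for i < len(src) && src[i] >= '0' && src[i] <= '9' {
				i++
			}
			tokens = append(tokens, src[start:i])
		case strings.IndexByte("+-*/()", c) >= 0:
			tokens = append(tokens, string(c))
			i++
		default:
			return nil, fmt.Errorf("unexpected character %q at offset %d", c, i)
		}
	}
	return tokens, nil
}

type parser struct {
	tokens []string
	pos    int
}

// peek returns the current token, or "" at end of input.
func (p *parser) peek() string {
	if p.pos < len(p.tokens) {
		return p.tokens[p.pos]
	}
	return ""
}

// binary parses `next (op next)*` and builds a left-associative tree.
func (p *parser) binary(ops string, next func() (*Node, error)) (*Node, error) {
	left, err := next()
	if err != nil {
		return nil, err
	}
	for tok := p.peek(); tok != "" && strings.Contains(ops, tok); tok = p.peek() {
		p.pos++
		right, err := next()
		if err != nil {
			return nil, err
		}
		left = &Node{Op: tok[0], Left: left, Right: right}
	}
	return left, nil
}

// expr := term (('+' | '-') term)*      term := factor (('*' | '/') factor)*
func (p *parser) expr() (*Node, error) { return p.binary("+-", p.term) }
func (p *parser) term() (*Node, error) { return p.binary("*/", p.factor) }

// factor := INT | '(' expr ')'
func (p *parser) factor() (*Node, error) {
	tok := p.peek()
	if tok == "" {
		return nil, fmt.Errorf("syntax error: unexpected end of input")
	}
	p.pos++
	if tok == "(" {
		n, err := p.expr()
		if err != nil {
			return nil, err
		}
		if p.peek() != ")" {
			return nil, fmt.Errorf("syntax error: missing ')'")
		}
		p.pos++
		return n, nil
	}
	v, err := strconv.Atoi(tok)
	if err != nil {
		return nil, fmt.Errorf("syntax error: unexpected token %q", tok)
	}
	return &Node{Value: v}, nil
}

// Parse is step 2: derive the AST from the tokens and reject leftover tokens.
func Parse(src string) (*Node, error) {
	tokens, err := tokenize(src)
	if err != nil {
		return nil, err
	}
	p := &parser{tokens: tokens}
	n, err := p.expr()
	if err == nil && p.peek() != "" {
		return nil, fmt.Errorf("syntax error: unexpected token %q", p.peek())
	}
	return n, err
}

func main() {
	for _, src := range []string{"1+2*3", "2*(1+3)", "2*(1+"} {
		if ast, err := Parse(src); err != nil {
			fmt.Println(src, "=>", err)
		} else {
			fmt.Println(src, "=>", ast)
		}
	}
}
```

In this sketch 1+2*3 becomes the tree (1 + (2 * 3)), because the multiplication is grouped one level deeper than the addition, and the incomplete input 2*(1+ comes back as an error instead of a tree.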

Because no tool such as ANTLR is used to generate code (otherwise there would be far more features by now), both the lexer and the parser are handwritten, and the amount of code is small. If you want to debug it, you can read the source directly.

Lexer: token/token.go:39
Parser: syntax/syntax.go

This involves concepts such as finite state machines and the recursive descent algorithm. They are not covered in detail in this article; I will write them up once the project is more complete.
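
Just to give a rough taste of the finite state machine idea, here is a minimal Go sketch of a lexer driven by an explicit state variable. It is an assumption-laden illustration, not how gscript's lexer is written: the State, Token and Lex names are hypothetical, only spaces are skipped as whitespace, and only the int keyword is recognized.

```go
package main

import "fmt"

// State is a hypothetical lexer state.
type State int

const (
	Initial State = iota // between tokens
	InIdent              // inside an identifier or keyword
	InInt                // inside an integer literal
)

// Token is a hypothetical token: Kind is "KEYWORD", "IDENT", "INT" or "SYMBOL".
type Token struct {
	Kind string
	Text string
}

func isLetter(c byte) bool { return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') }
func isDigit(c byte) bool  { return c >= '0' && c <= '9' }

// Lex reads one character at a time; the current state decides whether the
// character extends the token being built or finishes it.
func Lex(src string) []Token {
	var tokens []Token
	state, start := Initial, 0

	// flush emits the token collected in src[start:end], if any, and resets the state.
	flush := func(end int) {
		switch state {
		case InIdent:
			kind := "IDENT"
			if src[start:end] == "int" {
				kind = "KEYWORD"
			}
			tokens = append(tokens, Token{kind, src[start:end]})
		case InInt:
			tokens = append(tokens, Token{"INT", src[start:end]})
		}
		state = Initial
	}

	for i := 0; i <= len(src); i++ {
		var c byte // the zero byte acts as an end-of-input sentinel
		if i < len(src) {
			c = src[i]
		}
		switch {
		case state == InIdent && (isLetter(c) || isDigit(c)):
			// stay in InIdent
		case state == InInt && isDigit(c):
			// stay in InInt
		default:
			flush(i) // the current token, if any, ends here
			switch {
			case isLetter(c):
				state, start = InIdent, i
			case isDigit(c):
				state, start = InInt, i
			case c != ' ' && c != 0:
				tokens = append(tokens, Token{"SYMBOL", string(c)})
			}
		}
	}
	return tokens
}

func main() {
	for _, t := range Lex("int a = 10") {
		fmt.Printf("%-8s%q\n", t.Kind, t.Text)
	}
}
```

For the input int a = 10 this sketch produces four tokens: the keyword int, the identifier a, the symbol = and the integer 10.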

Plans

Finally, it is time for some promises about the future. If all goes well, the following features will be added:

  • More basic types, such as string and long.
  • Variable scope and functions.
  • Even closures.
  • OOP, of course, is a must.

Once these features are implemented, it can be considered a "modern" scripting language. I will keep writing about the interesting things I run into while learning and implementing it.

Source code:
https://github.com/crossoverJie/gscript


crossoverJie