
Foreword

This article records the pitfalls I ran into while implementing recursive calls in GScript. I found almost nothing about this kind of problem on the Chinese-language internet, so it seemed worth writing down.

Before we start, here is a brief overview of what is new in GScript v0.0.9:

  • Support for variadic parameters
  • Improved semantics for the append function
  • Improved compile error messages
  • Support for recursive calls

First, variadic parameters:

 //formats according to a format specifier and writes to standard output.
printf(string format, any ...a){}

//formats according to a format specifier and returns the resulting string.
string sprintf(string format, any ...a){}

These are two standard functions added in this release, and both accept variadic parameters. The ... marks a parameter as variadic, and calls look like this:

 printf("hello %s ","123");
printf("hello-%s-%s ","123","abc");
printf("hello-%s-%d ","123",123);
string format = "this is %s ";
printf(format, "gscript");

string s = sprintf("nice to meet %s", "you");
assertEqual(s,"nice to meet you");

As in most languages, a variadic parameter is essentially an array, so it can be iterated with a loop:

 int add(string s, int ...num){
    println(s);
    int sum = 0;
    for(int i=0;i<len(num);i++){
        int v = num[i];
        sum = sum+v;
    }
    return sum;
}
int x = add("abc", 1,2,3,4);
println(x);
assertEqual(x, 10);
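
For comparison, here is a minimal sketch of the same function in Go, where a variadic parameter is likewise an ordinary slice inside the function body. It only illustrates the idea and says nothing about how GScript implements variadic parameters internally.

```go
package main

import "fmt"

// add prints s, then sums the variadic arguments.
// Inside the function, num is an ordinary []int slice.
func add(s string, num ...int) int {
	fmt.Println(s)
	sum := 0
	for i := 0; i < len(num); i++ {
		sum += num[i]
	}
	return sum
}

func main() {
	x := add("abc", 1, 2, 3, 4)
	fmt.Println(x) // 10
}
```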

Next, the semantics of the built-in function append() were improved, following the suggestion in issue 12:
https://github.com/crossoverJie/gscript/issues/12

 // appends "v" to the end of an array "a"
append(any[] a, any v){}

 // Before
int[] a={1,2,3};
println(a);
println();
a = append(a,4);
println(a);
// Output: [1 2 3 4]

// Now
int[] a={1,2,3};
println(a);
println();
append(a,4);
int b = a[3];
assertEqual(4, b);
println(a);
// Output: [1 2 3 4]

Now append no longer requires reassigning the result, and the data is still appended. This may look like a switch from pass-by-value to pass-by-reference, but underneath it is still pass-by-value. The new behavior is syntactic sugar that performs the reassignment on the user's behalf.
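
The idea of "value semantics plus an automatic reassignment" can be sketched in Go as follows. The names env, builtinAppend and execAppend are hypothetical and exist only for this illustration; GScript's actual implementation may be structured differently.

```go
package main

import "fmt"

// env is a hypothetical variable table: variable name -> array value.
type env map[string][]int

// builtinAppend has value semantics: it returns a new array and
// never touches the variable table.
func builtinAppend(a []int, v int) []int {
	out := make([]int, len(a), len(a)+1)
	copy(out, a)
	return append(out, v)
}

// execAppend models the sugar: the statement append(name, v)
// is executed as if the user had written name = append(name, v).
func (e env) execAppend(name string, v int) {
	e[name] = builtinAppend(e[name], v)
}

func main() {
	e := env{"a": {1, 2, 3}}
	e.execAppend("a", 4)
	fmt.Println(e["a"]) // [1 2 3 4]
}
```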


Next, compile error messages were added. Take the following code:

 a+2;
b+c;

Using an undeclared variable now fails at compile time:

 1:0: undefined: a
2:0: undefined: b
2:2: undefined: c

Errors such as duplicate declarations are also reported:

 class T{}
class T{}

// output:
2:0: class T redeclared in this block
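
As a rough illustration of how position-tagged messages like the ones above can be produced, here is a minimal Go sketch. The ident type and the checkUndefined function are assumptions made for this example and are not taken from GScript's resolver.

```go
package main

import "fmt"

// ident is a hypothetical token: a name plus its source position.
type ident struct {
	name      string
	line, col int
}

// checkUndefined reports every use of a name that is not in declared,
// formatted as "line:col: undefined: name".
func checkUndefined(declared map[string]bool, uses []ident) []string {
	var errs []string
	for _, u := range uses {
		if !declared[u.name] {
			errs = append(errs, fmt.Sprintf("%d:%d: undefined: %s", u.line, u.col, u.name))
		}
	}
	return errs
}

func main() {
	// Identifiers of "a+2;" on line 1 and "b+c;" on line 2.
	uses := []ident{{"a", 1, 0}, {"b", 2, 0}, {"c", 2, 2}}
	for _, e := range checkUndefined(map[string]bool{}, uses) {
		fmt.Println(e)
	}
}
```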


The last item, and the focus of this article, is support for recursive functions.

 int num(int x,int y){
    if (y==1 || y ==x) {
        return 1;
    }
    int v1 = num(x - 1, y - 1);
    return v1;
}

In previous versions, the line int v1 = num(x - 1, y - 1); did not execute correctly. The specific reason is analyzed below.

Now a program such as printing Yang Hui's triangle (Pascal's triangle) can be implemented with recursion:

 int num(int x,int y){
    if (y==1 || y ==x) {
        return 1;
    }
    int v1 = num(x - 1, y - 1);
    int v2 = num(x - 1, y);
    int c = v1+v2;
    // int c = num(x - 1, y - 1)+num(x - 1, y);
    return c;
}
printTriangle(int row){
    for (int i = 1; i <= row; i++) {
        for (int j = 1; j <= row - i; j++) {
           print(" ");
        }
        for (int j = 1; j <= i; j++) {
            print(num(i, j) + " ");
        }
        println("");
    }
}
printTriangle(7);

// output:
      1 
     1 1 
    1 2 1 
   1 3 3 1 
  1 4 6 4 1 
 1 5 10 10 5 1 
1 6 15 20 15 6 1

return in a function

 int num(int x,int y){
    if (y==1 || y ==x) {
        return 1;
    }
    int v1 = num(x - 1, y - 1);
    return v1;
}

Now let's look at why, once return 1 has executed in this code, the following statements are no longer executed.

In fact, I had earlier solved the requirement that once a function executes return, its subsequent statements must not run. That is the same logic described above, except that here recursion is involved.

First simplify the code for easy analysis:

 int f1(int a){
    if (a==10){
        return 10;
    }
    println("abc");
}

When the parameter a equals 10, the subsequent print statement must not execute. How can this requirement be met?

Thinking about it the way a person normally would: once a return statement has executed, the function it belongs to should be marked as returned, and its subsequent statements must not run.

But how is that done in practice?

Looking at the AST makes it clear:

When a return statement is encountered, the interpreter walks recursively up the syntax tree and marks every enclosing block node, indicating that subsequent statements in those blocks should no longer execute. The return value is recorded at the same time.

Then, when execution reaches the next statement, println("abc");, it checks whether the block is marked and, if so, returns directly. This is how a return statement prevents subsequent code from running.

Part of the implementation code is as follows:

 // On return, recursively scan upward through all enclosing Blocks and mark them, so that later statements can return directly.
func (v *Visitor) scanBlockStatementCtx(tree antlr.ParseTree, value interface{}) {
    context, ok := tree.(*parser.BlockContext)
    if ok {
        if v.blockCtx2Mark == nil {
            v.blockCtx2Mark = make(map[*parser.BlockContext]interface{})
        }
        v.blockCtx2Mark[context] = value
    }
    if tree.GetParent() != nil {
        v.scanBlockStatementCtx(tree.GetParent().(antlr.ParseTree), value)
    }
}

Source address:
https://github.com/crossoverJie/gscript/blob/793d196244416574bd6be641534742e57c54db7a/visitor.go#L182
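
To make the mark-then-check flow easier to follow without ANTLR, here is a self-contained Go sketch on a toy tree. The node and interp types and the markBlocks and returned methods are hypothetical stand-ins for this example; the real code works on the generated parser contexts shown above.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a parse-tree node.
type node struct {
	parent  *node
	isBlock bool
}

// interp holds the marks: block node -> recorded return value.
type interp struct {
	marks map[*node]interface{}
}

// markBlocks walks from n up to the root and marks every block on the way.
func (in *interp) markBlocks(n *node, value interface{}) {
	for ; n != nil; n = n.parent {
		if n.isBlock {
			if in.marks == nil {
				in.marks = make(map[*node]interface{})
			}
			in.marks[n] = value
		}
	}
}

// returned reports whether block was marked by a return statement,
// together with the recorded return value.
func (in *interp) returned(block *node) (interface{}, bool) {
	v, ok := in.marks[block]
	return v, ok
}

func main() {
	// f1's body block contains an if statement whose block holds "return 10".
	body := &node{isBlock: true}
	ifStmt := &node{parent: body}
	ifBlock := &node{parent: ifStmt, isBlock: true}
	ret := &node{parent: ifBlock}

	in := &interp{}
	in.markBlocks(ret, 10) // executing "return 10"

	// Before running println("abc") in the body block, check the mark.
	if v, ok := in.returned(body); ok {
		fmt.Println("skip println, return", v)
	}
}
```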

The recursion problem

But this also introduces a problem: with recursion, the code after the recursive call is not executed.

The solution is actually simple: when deciding whether to return directly, add one more condition, namely that the block contains no recursive call.

So we first need to know whether the block contains a recursive call.

The whole process has the following steps:

  • Compile time: At each function declaration, record the mapping between the function and the current context.
  • Compile time: When scanning a statement, look up the function that corresponds to that statement's context.
  • Compile time: If the scanned statement is a function call, check whether the called function is the function that owns the block, that is, the function found in the second step.
  • Compile time: If the two functions are the same, mark the current block as containing a recursive call.
  • Runtime: When performing the return check described earlier, additionally check whether the current block contains a recursive call; if it does, do not return directly.

Part of the code is as follows:

https://github.com/crossoverJie/gscript/blob/3e179f27cb30ca5c3af57b3fbf2e46075baa266b/resolver/ref_resolver.go#L70
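
The five steps can be sketched in Go as follows. This is a simplified model under stated assumptions: the block type, markRecursive and shouldReturn are hypothetical names, and functions are identified by plain strings rather than by the resolver's symbols.

```go
package main

import "fmt"

// block is a hypothetical block node.
type block struct {
	fn    string   // enclosing function, recorded at declaration time (step 1)
	calls []string // functions called by the statements in this block
}

// markRecursive covers steps 2 to 4: for every call statement, compare the
// callee with the function that owns the block and mark the block if equal.
func markRecursive(blocks []*block) map[*block]bool {
	rec := make(map[*block]bool)
	for _, b := range blocks {
		for _, callee := range b.calls {
			if callee == b.fn {
				rec[b] = true
			}
		}
	}
	return rec
}

// shouldReturn is step 5: a block marked by return is left directly
// only when it contains no recursive call.
func shouldReturn(returned, rec map[*block]bool, b *block) bool {
	return returned[b] && !rec[b]
}

func main() {
	numBody := &block{fn: "num", calls: []string{"num", "num"}}
	f1Body := &block{fn: "f1", calls: []string{"println"}}
	rec := markRecursive([]*block{numBody, f1Body})

	// Pretend a return statement has marked both blocks at runtime.
	returned := map[*block]bool{numBody: true, f1Body: true}
	fmt.Println(shouldReturn(returned, rec, numBody)) // false
	fmt.Println(shouldReturn(returned, rec, f1Body))  // true
}
```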

Summary

The recursive call problem had me stuck for a long time. I had ideas, but the code I wrote never behaved as expected. That night I sat in front of the computer until two or three in the morning, still puzzled.

In the end, when I could not hold out any longer and went to bed, I suddenly had an epiphany and thought of a solution. I got up early the next day, hurried to try it, and it really worked.

So when you run into a hard problem, relaxing for a while often brings unexpected results.

Finally, recursion still has performance problems in some cases. In the future I will try to move this marking work into the compile phase: slow compilation is acceptable, but a slow runtime is a problem.

I will also keep improving runtime exceptions. Currently the interpreter simply panics with no stack trace, which makes for a poor experience. Anyone interested is welcome to try it out and report bugs.

Source address:

https://github.com/crossoverJie/gscript


crossoverJie