Foreword

In the last article I introduced some basic concepts of syntax parsing and showed how to implement syntax tree parsing for the Simple language interpreter through a custom DSL. In this article, the last in the series, I'll show you how the Simple interpreter executes the generated syntax tree.

The evaluate function and scope

The evaluate function already appeared when I introduced syntax parsing. In fact, every AST node has a corresponding evaluate function, which tells the Simple interpreter how to execute that node. The interpreter therefore executes code by starting from the root node, running the evaluate function of the current node, and then recursively running the evaluate functions of its child nodes.
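
As a rough illustration of this recursion, here is a minimal sketch with two hypothetical node types, NumberLiteral and BinaryExpression. They are simplified stand-ins written for this example, not the classes from the Simple source code, and they leave out the scope argument that the real evaluate function takes.

```typescript
// Hypothetical, simplified AST nodes (not the classes from the Simple repo).
interface AstNode {
  evaluate(): number
}

class NumberLiteral implements AstNode {
  private value: number

  constructor(value: number) {
    this.value = value
  }

  // A leaf node has no children, so it just returns its own value.
  evaluate(): number {
    return this.value
  }
}

class BinaryExpression implements AstNode {
  private operator: '+' | '*'
  private left: AstNode
  private right: AstNode

  constructor(operator: '+' | '*', left: AstNode, right: AstNode) {
    this.operator = operator
    this.left = left
    this.right = right
  }

  // A parent node first evaluates its children, then combines the results.
  evaluate(): number {
    const leftValue = this.left.evaluate()
    const rightValue = this.right.evaluate()
    return this.operator === '+' ? leftValue + rightValue : leftValue * rightValue
  }
}

// AST for the expression 1 + 2 * 3
const rootNode = new BinaryExpression(
  '+',
  new NumberLiteral(1),
  new BinaryExpression('*', new NumberLiteral(2), new NumberLiteral(3))
)

// Evaluation starts at the root and recurses into the children.
const evaluated = rootNode.evaluate() // 7
```

Calling evaluate on the root is enough to run the whole tree, because each node is responsible for evaluating its own children.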

JavaScript code always executes within a scope. When we access a variable, the engine first checks whether it is defined in the current scope. If it is not, the engine walks up the scope chain until it reaches the global scope. If no scope on the chain defines the variable, an Uncaught ReferenceError: xx is not defined error is thrown. When implementing the Simple language interpreter, I modeled a class called Environment on JavaScript's scope. Let's take a look at the implementation of the Environment class:

// lib/runtime/Environment.ts

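// The Environment class is the scope of the Simple language
class Environment {
  // parent points to the parent scope of the current scope
  private parent: Environment | null = null
  // values stores the variables of the current scope as key-value pairs
  // e.g. values = {a: 10} means the current scope has a variable a whose value is 10
  protected values: Record<string, any> = {}

  // The nodes shown later call new Environment(env), so the constructor
  // is assumed to take an optional parent scope
  constructor(parent: Environment | null = null) {
    this.parent = parent
  }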

  // create is called whenever a new variable is defined in the current scope
  // e.g. executing let a = 10 calls env.create('a', 10)
  create(key: string, value: any) {
    if(this.values.hasOwnProperty(key)) {
      throw new Error(`${key} has been initialized`)
    }
    this.values[key] = value
  }

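  // When a variable is reassigned, Simple walks up the scope chain from the current scope, finds the nearest scope that defines the variable, and reassigns it there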
  update(key: string, value: any) {
    const matchedEnvironment = this.getEnvironmentWithKey(key)
    if (!matchedEnvironment) {
      throw new Error(`Uncaught ReferenceError: ${key} hasn't been defined`)
    }
    matchedEnvironment.values = {
      ...matchedEnvironment.values,
      [key]: value
    }
  }

  // Look up a variable along the scope chain; throw an Uncaught ReferenceError if it is not found
  get(key: string) {
    const matchedEnvironment = this.getEnvironmentWithKey(key)
    if (!matchedEnvironment) {
      throw new Error(`Uncaught ReferenceError: ${key} is not defined`)
    }

    return matchedEnvironment.values[key]
  }

  // Walk up the scope chain looking for the scope that defines the variable; return null if none is found
  private getEnvironmentWithKey(key: string): Environment | null {
    if(this.values.hasOwnProperty(key)) {
      return this
    }
  
    let currentEnvironment = this.parent
    while(currentEnvironment) {
      if (currentEnvironment.values.hasOwnProperty(key)) {
        return currentEnvironment
      }
      currentEnvironment = currentEnvironment.parent
    }

    return null
  }
}

As the code and comments above show, the so-called scope chain is really just a linked list of Environment instances. To resolve a variable, the interpreter searches along this chain and reports an error if no definition is found. Next, let's walk through the execution of a for loop to see the process in detail:
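
To make the lookup rules concrete, the sketch below builds a two-level scope chain and exercises shadowing, lookup in a parent scope, reassignment, and a failed lookup. MiniEnvironment is a trimmed-down stand-in that I wrote for this example so it can run on its own. It follows the same rules as Environment but stores values in a Map, and its helper name find is an assumption, not something from the Simple source code.

```typescript
// MiniEnvironment is a hypothetical, trimmed-down stand-in for Environment.
class MiniEnvironment {
  private values = new Map<string, unknown>()
  private parent: MiniEnvironment | null

  constructor(parent: MiniEnvironment | null = null) {
    this.parent = parent
  }

  // Define a variable in this scope only.
  create(key: string, value: unknown): void {
    if (this.values.has(key)) {
      throw new Error(`${key} has been initialized`)
    }
    this.values.set(key, value)
  }

  // Walk the linked list of scopes, innermost first.
  private find(key: string): MiniEnvironment | null {
    let current: MiniEnvironment | null = this
    while (current) {
      if (current.values.has(key)) {
        return current
      }
      current = current.parent
    }
    return null
  }

  get(key: string): unknown {
    const owner = this.find(key)
    if (!owner) {
      throw new Error(`Uncaught ReferenceError: ${key} is not defined`)
    }
    return owner.values.get(key)
  }

  update(key: string, value: unknown): void {
    const owner = this.find(key)
    if (!owner) {
      throw new Error(`Uncaught ReferenceError: ${key} hasn't been defined`)
    }
    owner.values.set(key, value)
  }
}

const globalScope = new MiniEnvironment()
globalScope.create('a', 1)
globalScope.create('b', 2)

const innerScope = new MiniEnvironment(globalScope)
innerScope.create('a', 10) // shadows the outer a

const innerA = innerScope.get('a') // 10, found in innerScope
const innerB = innerScope.get('b') // 2, found one level up
innerScope.update('b', 20) // writes to globalScope, where b is defined
const outerB = globalScope.get('b') // 20

let lookupError = ''
try {
  innerScope.get('c') // no scope on the chain defines c
} catch (e) {
  lookupError = (e as Error).message
}
```

Note that create only ever touches the current scope, while get and update search the whole chain. This is what lets an inner scope shadow a variable without overwriting the outer one.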

Executed code:

for(let i = 0; i < 10; i++) {
  console.log(i);
};

How the ForStatement node executes this code:

// lib/ast/node/ForStatement.ts
class ForStatement extends Node {
  ...

  // evaluate receives a scope object that represents the scope in which the current AST node executes
  evaluate(env: Environment): any {
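    // The contents of the for loop's parentheses live in their own scope, so create a new scope based on the one passed down from the parent node and call it bridgeEnvironment
    const bridgeEnvironment = new Environment(env)
    // The variable initialization inside the for parentheses (let i = 0) happens in this scope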
    this.init.evaluate(bridgeEnvironment)

    // The loop keeps running while it has not been exited by a break statement, the function has not returned, and the test expression (i < 10) is truthy; otherwise it stops
    while(!runtime.isBreak && !runtime.isReturn && this.test.evaluate(bridgeEnvironment)) {
      // The loop body (console.log(i)) is a new scope, so create a child scope based on bridgeEnvironment
      const executionEnvironment = new Environment(bridgeEnvironment)
      this.body.evaluate(executionEnvironment)
      // The loop variable update (i++) runs in bridgeEnvironment
      this.update.evaluate(bridgeEnvironment)
    }
  }
}
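
One way to see why the loop body gets a fresh scope on every iteration is to write out a small loop by hand, as in the sketch below. LoopScope and runLoop are hypothetical helpers written for this example, not part of the Simple source code. The loop they imitate is for (let i = 0; i < 3; i++) { let doubled = i * 2 }.

```typescript
// LoopScope is a hypothetical, minimal stand-in for Environment.
class LoopScope {
  private values = new Map<string, number>()
  private parent: LoopScope | null

  constructor(parent: LoopScope | null = null) {
    this.parent = parent
  }

  create(key: string, value: number): void {
    if (this.values.has(key)) {
      throw new Error(`${key} has been initialized`)
    }
    this.values.set(key, value)
  }

  get(key: string): number {
    const value = this.values.get(key)
    if (value !== undefined) {
      return value
    }
    if (!this.parent) {
      throw new Error(`Uncaught ReferenceError: ${key} is not defined`)
    }
    return this.parent.get(key)
  }

  // Reassign a variable in the nearest scope that defines it.
  update(key: string, value: number): void {
    if (this.values.has(key)) {
      this.values.set(key, value)
    } else if (this.parent) {
      this.parent.update(key, value)
    } else {
      throw new Error(`Uncaught ReferenceError: ${key} hasn't been defined`)
    }
  }
}

// Hand-written equivalent of: for (let i = 0; i < 3; i++) { let doubled = i * 2 }
function runLoop(freshScopePerIteration: boolean): number[] {
  const collected: number[] = []
  const bridge = new LoopScope()
  bridge.create('i', 0) // init
  while (bridge.get('i') < 3) { // test
    const bodyScope = freshScopePerIteration ? new LoopScope(bridge) : bridge
    bodyScope.create('doubled', bodyScope.get('i') * 2) // body
    collected.push(bodyScope.get('doubled'))
    bridge.update('i', bridge.get('i') + 1) // update
  }
  return collected
}

const perIteration = runLoop(true) // [0, 2, 4]

let sharedScopeError = ''
try {
  runLoop(false) // reuses one scope, so the second let doubled collides
} catch (e) {
  sharedScopeError = (e as Error).message // "doubled has been initialized"
}
```

With a fresh child scope per iteration, a variable declared in the body can be declared again on the next pass. If the body shared the bridge scope, the second iteration would hit the "has been initialized" check in create.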

Closures and this binding

Now that we understand the general execution flow of the evaluate function, let's look at how closures are implemented. JavaScript is lexically scoped, which means that a function's scope chain is determined when the function is defined. Let's see how closures work in the Simple language by reading the evaluate function of the function declaration node, FunctionDeclaration:

// lib/ast/node/FunctionDeclaration.ts
class FunctionDeclaration extends Node {
  ...

  // evaluate runs when the function declaration statement is executed; the object passed in is the current execution scope
  evaluate(env: Environment): any {
    // Create a new FunctionDeclaration object, because the same function may be defined more than once (for example, when it is nested inside a parent function)
    const func = new FunctionDeclaration()
    // Copy the function
    func.loc = this.loc
    func.id = this.id
    func.params = [...this.params]
    func.body = this.body
    
    // When the function is declared, parentEnv records the current execution scope. This is the closure!
    func.parentEnv = env

    // Register the function in the current execution scope, which also lets it call itself recursively
    env.create(this.id.name, func)
  }
  ...
}

As the code above shows, implementing closures in the Simple language only requires recording the current scope (parentEnv) when the function is declared.
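
The effect of recording parentEnv can be seen in a classic counter example. The sketch below is a simplified model, not the Simple implementation: ClosureEnv and FunctionValue are hypothetical classes written for this example, and the function body is a TypeScript callback standing in for an AST body node. It imitates a makeCounter function that declares let count = 0 and an inner increment function.

```typescript
// ClosureEnv is a hypothetical, minimal stand-in for Environment.
class ClosureEnv {
  private values = new Map<string, number>()
  private parent: ClosureEnv | null

  constructor(parent: ClosureEnv | null = null) {
    this.parent = parent
  }

  create(key: string, value: number): void {
    this.values.set(key, value)
  }

  get(key: string): number {
    const value = this.values.get(key)
    if (value !== undefined) {
      return value
    }
    if (!this.parent) {
      throw new Error(`Uncaught ReferenceError: ${key} is not defined`)
    }
    return this.parent.get(key)
  }

  update(key: string, value: number): void {
    if (this.values.has(key)) {
      this.values.set(key, value)
    } else if (this.parent) {
      this.parent.update(key, value)
    } else {
      throw new Error(`Uncaught ReferenceError: ${key} hasn't been defined`)
    }
  }
}

// Hypothetical function value. The body is a TypeScript callback standing in
// for the AST body node that the real FunctionDeclaration holds.
class FunctionValue {
  private parentEnv: ClosureEnv
  private body: (env: ClosureEnv) => number

  constructor(parentEnv: ClosureEnv, body: (env: ClosureEnv) => number) {
    this.parentEnv = parentEnv
    this.body = body
  }

  // The call scope hangs off the definition-time scope, not the caller's scope.
  call(): number {
    return this.body(new ClosureEnv(this.parentEnv))
  }
}

const globalEnv = new ClosureEnv()
globalEnv.create('count', 100) // unrelated global that shares the name

// Scope of one makeCounter() call, where `let count = 0` runs
const makeCounterEnv = new ClosureEnv(globalEnv)
makeCounterEnv.create('count', 0)

// Declaring increment inside makeCounter records makeCounterEnv as parentEnv
const increment = new FunctionValue(makeCounterEnv, (env) => {
  env.update('count', env.get('count') + 1)
  return env.get('count')
})

// Even when called from the global scope, increment still sees makeCounter's count
const firstCall = increment.call() // 1
const secondCall = increment.call() // 2
const globalCount = globalEnv.get('count') // 100
```

Because the call scope is created from parentEnv, the function keeps updating the count it captured at definition time, and the global variable with the same name is never touched.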

Next, let's look at how the interpreter decides which object this is bound to when a function is executed:

// lib/ast/node/FunctionDeclaration.ts
class FunctionDeclaration extends Node {
  ...

  // If the function is called on an instance, that instance is passed in as callerInstance; for example, in a.test(), a is the callerInstance of test
  call(args: Array<any>, callerInstance?: any): any {
    // Throw if the number of arguments does not match the number of declared parameters
    if (this.params.length !== args.length) {
      throw new Error('function declared parameters are not matched with arguments')
    }

    // This is the key to closures: the parent scope of the call is the scope that was recorded when the function was defined!
    const callEnvironment = new Environment(this.parentEnv)
    
    // Initialize the function parameters
    for (let i = 0; i < args.length; i++) {
      const argument = args[i]
      const param = this.params[i]

      callEnvironment.create(param.name, argument)
    }
    // Create the function's arguments object
    callEnvironment.create('arguments', args)

    // If the function has a caller instance, this is bound to that instance
    if (callerInstance) {
      callEnvironment.create('this', callerInstance)
    } else {
      // Otherwise, go to the root of the function's scope chain and bind this to the global process (node) or window (browser) object
      callEnvironment.create('this', this.parentEnv.getRootEnv().get('process'))
    }

    // Execute the function body
    this.body.evaluate(callEnvironment)
  }
}

The code above gives a rough idea of how this is bound in the Simple language. Real JavaScript engines may implement it quite differently, so treat it only as a reference.
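
The binding rule in call can be reduced to a few lines, as in the sketch below. BoundableFunction, SimpleObject and fakeGlobal are hypothetical names made up for this example. How the real interpreter evaluates a.test() and hands a to call is not shown in this article, so passing the object explicitly here is an assumption.

```typescript
type SimpleObject = { [key: string]: unknown }

// Hypothetical function value whose body is a TypeScript callback that
// receives the resolved `this` value.
class BoundableFunction {
  private body: (thisValue: SimpleObject) => unknown

  constructor(body: (thisValue: SimpleObject) => unknown) {
    this.body = body
  }

  // Same rule as the if/else in call(): use the caller instance when there is
  // one, otherwise fall back to the global object.
  call(globalObject: SimpleObject, callerInstance?: SimpleObject): unknown {
    const thisValue = callerInstance ? callerInstance : globalObject
    return this.body(thisValue)
  }
}

// Stand-in for the global process or window object
const fakeGlobal: SimpleObject = { label: 'global' }

// Models a function whose body is: return this.label
const getLabel = new BoundableFunction((thisValue) => thisValue['label'])
const objA: SimpleObject = { label: 'a', test: getLabel }

// Models objA.test(): objA is passed as the caller instance
const viaMember = getLabel.call(fakeGlobal, objA) // 'a'
// Models a plain test() call: no caller instance, so this is the global object
const viaPlainCall = getLabel.call(fakeGlobal) // 'global'
```

In other words, this depends on how the function is called, while the scope chain depends on where the function was defined.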

Summary

In this article I showed how the Simple interpreter executes code, including closures and this binding. Due to space limitations, a lot has been left out, such as how a break statement exits a for or while loop and how a function's return statement passes its value back to the caller. If you are interested, you can take a look at my source code:
https://github.com/XiaocongDong/simple

Finally, I hope that after this series of three articles you have a better understanding of compiler principles and some of the more difficult language features of JavaScript. I also hope to keep bringing you high-quality content so that we can make progress together.

Personal tech updates

This article was first published on my blog.

You are welcome to follow my WeChat official account (进击的大葱) so that we can learn and grow together.

进击的大葱