[JVM source code analysis] How the virtual machine interprets and executes Java methods (Part 2)

This article was written and compiled by Kumo (Ma Zhi), chief lecturer of the HeapDump performance community.

Chapter 34-Resolving the invokeinterface bytecode instruction

As with the invokevirtual instruction, when the target method has not yet been resolved, the LinkResolver::resolve_invoke() function must be called to resolve it. This function calls a number of other functions to complete the method resolution, as shown in the following figure.

The pink part of the figure above is where resolution differs from that of the invokevirtual bytecode instruction. The resolve_pool() function and the functions it calls were covered in detail in the discussion of invokevirtual and are not repeated here.

The LinkResolver::resolve_invokeinterface() function is called to resolve the bytecode instruction. Its implementation is as follows:

void LinkResolver::resolve_invokeinterface(
 CallInfo& result,
 Handle recv,
 constantPoolHandle pool,
 int index, // index of the constant pool cache entry
 TRAPS
) {
  KlassHandle resolved_klass;
  Symbol* method_name = NULL;
  Symbol* method_signature = NULL;
  KlassHandle current_klass;
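  // When resolving the constant pool, the arguments pool (the constant pool of the
  // method currently executing on the stack) and index (the index of the constant pool
  // cache entry, which still has to be mapped to the original constant pool index) are
  // already set. From these two values we can resolve resolved_klass, the method name
  // method_name and the method signature method_signature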
  resolve_pool(resolved_klass, method_name, method_signature, current_klass, pool, index, CHECK);

  KlassHandle recvrKlass (THREAD, recv.is_null() ? (Klass*)NULL : recv->klass());
  resolve_interface_call(result, recv, recvrKlass, resolved_klass, method_name, method_signature, current_klass, true, true, CHECK);
}

We then look at the implementation of the resolve\_interface\_call() function, as follows:

void LinkResolver::resolve_interface_call(
 CallInfo& result,
 Handle recv,
 KlassHandle recv_klass,
 KlassHandle resolved_klass,
 Symbol* method_name,
 Symbol* method_signature,
 KlassHandle current_klass,
 bool             check_access,
 bool            check_null_and_abstract,
 TRAPS
) {
  methodHandle resolved_method;
  linktime_resolve_interface_method(resolved_method, resolved_klass, method_name, method_signature, current_klass, check_access, CHECK);
  runtime_resolve_interface_method(result, resolved_method, resolved_klass, recv, recv_klass, check_null_and_abstract, CHECK);
}

Two functions are called to resolve the method. First look at the implementation of the linktime\_resolve\_interface_method() function.

The linktime\_resolve\_interface\_method() function calls the LinkResolver::resolve\_interface_method() function, which is implemented as follows:

void LinkResolver::resolve_interface_method(
 methodHandle& resolved_method,
 KlassHandle resolved_klass,
 Symbol* method_name,
 Symbol* method_signature,
 KlassHandle current_klass,
 bool          check_access,
 bool          nostatics,
 TRAPS
) {
  // Look up the method in the interface and its superclass java.lang.Object, including static methods
  lookup_method_in_klasses(resolved_method, resolved_klass, method_name, method_signature, false, true, CHECK);

  if (resolved_method.is_null()) {
    // Look up the method in all implemented interfaces
    lookup_method_in_interfaces(resolved_method, resolved_klass, method_name, method_signature, CHECK);
    if (resolved_method.is_null()) {
      // no method found
      // ...
    }
  }

  // ...
}

First, the LinkResolver::lookup\_method\_in_klasses() function is called to find the method. This function was introduced in the discussion of the invokevirtual bytecode instruction, but only the logic relevant to invokevirtual was covered there. Here we look at the logic relevant to invokeinterface. The implementation is as follows:

void LinkResolver::lookup_method_in_klasses(
 methodHandle& result,
 KlassHandle klass,
 Symbol* name,
 Symbol* signature,
 bool checkpolymorphism,
 // false for invokevirtual, true for invokeinterface
 bool in_imethod_resolve,
 TRAPS
) {
  Method* result_oop = klass->uncached_lookup_method(name, signature);

  // When resolving a method on an interface, ignore static and non-public
  // methods of class Object, such as clone, finalize and registerNatives
  if (
      in_imethod_resolve &&
      result_oop != NULL &&
      klass->is_interface() &&
      (result_oop->is_static() || !result_oop->is_public()) &&
      result_oop->method_holder() == SystemDictionary::Object_klass() // the method is declared in class Object
  ) {
    result_oop = NULL;
  }

  if (result_oop == NULL) {
    Array<Method*>* default_methods = InstanceKlass::cast(klass())->default_methods();
    if (default_methods != NULL) {
      result_oop = InstanceKlass::find_method(default_methods, name, signature);
    }
  }
  // ...
  result = methodHandle(THREAD, result_oop);
}

The uncached\_lookup\_method() function is called to search the current class and its superclasses. If nothing is found, or if what is found is an ineligible method of the Object class (static or non-public), the find_method() function is called to search the default methods. Default methods on interfaces are one of the new features of Java 8: an interface can provide a non-abstract method implementation, as long as that method is marked with the default keyword.

The implementation of uncached\_lookup\_method() function is as follows:

Method* InstanceKlass::uncached_lookup_method(Symbol* name, Symbol* signature) const {
  Klass* klass = const_cast<InstanceKlass*>(this);
  bool dont_ignore_overpasses = true; 
  while (klass != NULL) {
    Method* method = InstanceKlass::cast(klass)->find_method(name, signature);
    if ((method != NULL) && (dont_ignore_overpasses || !method->is_overpass())) {
      return method;
    }
    klass = InstanceKlass::cast(klass)->super();
    dont_ignore_overpasses = false; // do not accept overpass methods found in superclasses
  }
  return NULL;
}
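The superclass walk in this listing can be modeled in isolation. The following is a minimal sketch, not HotSpot code: ToyMethod, ToyKlass and toy_uncached_lookup are hypothetical stand-ins, and a linear scan replaces the real per-class lookup. It only illustrates that an overpass method is accepted in the starting class but skipped in superclasses.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for HotSpot's Method and InstanceKlass.
struct ToyMethod {
  std::string name;
  std::string signature;
  bool is_overpass;
};

struct ToyKlass {
  const ToyKlass* super;           // nullptr for the root class
  std::vector<ToyMethod> methods;  // methods declared directly in this class

  // Linear scan; the real lookup is a binary search over sorted methods.
  const ToyMethod* find_method(const std::string& name,
                               const std::string& signature) const {
    for (const ToyMethod& m : methods) {
      if (m.name == name && m.signature == signature) return &m;
    }
    return nullptr;
  }
};

// Walk from the class up through its superclasses. Overpass methods are
// accepted only in the starting class, mirroring the loop in the listing.
const ToyMethod* toy_uncached_lookup(const ToyKlass* klass,
                                     const std::string& name,
                                     const std::string& signature) {
  bool dont_ignore_overpasses = true;
  for (const ToyKlass* k = klass; k != nullptr; k = k->super) {
    const ToyMethod* m = k->find_method(name, signature);
    if (m != nullptr && (dont_ignore_overpasses || !m->is_overpass)) {
      return m;
    }
    dont_ignore_overpasses = false;  // superclasses: skip overpass methods
  }
  return nullptr;
}
```

Looking up an overpass method through a subclass returns nullptr in this model, while looking it up on the declaring class itself returns the method.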

The method is looked up in the current class and its superclasses. For each class, find\_method() is called, which in turn calls an overloaded find\_method() that searches the methods stored in the InstanceKlass::\_methods attribute. When searching the default methods, the same overloaded find\_method() is called to search the methods stored in the InstanceKlass::\_default\_methods attribute. The implementation of the overloaded find_method() function is as follows:

Method* InstanceKlass::find_method(Array<Method*>* methods, Symbol* name, Symbol* signature) {
  int hit = find_method_index(methods, name, signature);
  return hit >= 0 ? methods->at(hit): NULL;
}

The find\_method\_index() function uses binary search to find the method with the given name and signature, which is possible because the methods in the InstanceKlass::\_methods and InstanceKlass::\_default_methods attributes are sorted. How these methods are stored and sorted is described in detail in the book "In-depth analysis of the Java virtual machine: source code analysis and detailed examples (basic volume)" and is not repeated here.
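To make the idea concrete, here is a minimal sketch of a binary search over a sorted method array. It is an illustration under stated assumptions, not the HotSpot implementation: NamedMethod and toy_find_method_index are hypothetical names, and the array is assumed to be sorted by name string so that overloads sit next to each other.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a method entry; only name and signature matter here.
struct NamedMethod {
  std::string name;
  std::string signature;
};

// Returns the index of the method with the given name and signature, or -1.
// Precondition (assumption of this sketch): methods is sorted by name, so
// overloads of the same name are adjacent.
int toy_find_method_index(const std::vector<NamedMethod>& methods,
                          const std::string& name,
                          const std::string& signature) {
  const int size = static_cast<int>(methods.size());
  int lo = 0;
  int hi = size - 1;
  while (lo <= hi) {
    const int mid = lo + (hi - lo) / 2;
    const int cmp = methods[mid].name.compare(name);
    if (cmp == 0) {
      // One method with this name was hit; scan its neighbours in both
      // directions for the overload with the requested signature.
      for (int i = mid; i >= 0 && methods[i].name == name; --i) {
        if (methods[i].signature == signature) return i;
      }
      for (int i = mid + 1; i < size && methods[i].name == name; ++i) {
        if (methods[i].signature == signature) return i;
      }
      return -1;
    }
    if (cmp < 0) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return -1;
}
```

The binary search narrows the range by name only; the signature is checked in the short linear scan around the hit, which keeps overloads cheap to distinguish.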

The implementation of the called LinkResolver::runtime\_resolve\_interface_method() function is as follows:

void LinkResolver::runtime_resolve_interface_method(
 CallInfo& result,
 methodHandle resolved_method,
 KlassHandle resolved_klass,
 Handle recv,
 KlassHandle recv_klass,
 bool check_null_and_abstract, // true when called from resolve_invokeinterface()
 TRAPS
) {
  // ...

  methodHandle sel_method;

  lookup_instance_method_in_klasses(
            sel_method, 
            recv_klass,
            resolved_method->name(),
            resolved_method->signature(), 
            CHECK);

  if (sel_method.is_null() && !check_null_and_abstract) {
    sel_method = resolved_method;
  }

  // ...
  // If the interface method resolved to a method of class Object, the call is dispatched
  // through the vtable, so the vtable-related information is what we update
  if (!resolved_method->has_itable_index()) {
    int vtable_index = resolved_method->vtable_index();
    assert(vtable_index == sel_method->vtable_index(), "sanity check");
    result.set_virtual(resolved_klass, recv_klass, resolved_method, sel_method, vtable_index, CHECK);
  } else {
    int itable_index = resolved_method()->itable_index();
    result.set_interface(resolved_klass, recv_klass, resolved_method, sel_method, itable_index, CHECK);
  }
}

When the resolved method has no itable index, the call is dispatched dynamically through the vtable; otherwise it is dispatched dynamically through the itable.

The implementation of the called lookup\_instance\_method\_in\_klasses() function is as follows:

void LinkResolver::lookup_instance_method_in_klasses(
 methodHandle& result,
 KlassHandle klass,
 Symbol* name,
 Symbol* signature,
 TRAPS
) {
  Method* result_oop = klass->uncached_lookup_method(name, signature);
  result = methodHandle(THREAD, result_oop);
  // Loop to find the method implementation, skipping static methods
  while (!result.is_null() && result->is_static() && result->method_holder()->super() != NULL) {
    KlassHandle super_klass = KlassHandle(THREAD, result->method_holder()->super());
    result = methodHandle(THREAD, super_klass->uncached_lookup_method(name, signature));
  }

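  // result is non-NULL when an implementation of the interface method was found in the
  // class that owns the itable or in one of its superclasses. Otherwise it is NULL and we
  // fall back to the default method, which also counts as an implementation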
  if (result.is_null()) {
    Array<Method*>* default_methods = InstanceKlass::cast(klass())->default_methods();
    if (default_methods != NULL) {
      result = methodHandle(InstanceKlass::find_method(default_methods, name, signature));
    }
  }
}

As above, the find_method() function is called when searching for a default method implementation. This function was described in detail in the discussion of how the invokevirtual bytecode instruction is resolved and is not repeated here.

At the end of the LinkResolver::runtime\_resolve\_interface\_method() function, either CallInfo::set\_interface() or CallInfo::set\_virtual() is called to save the lookup result in the CallInfo instance. Later, the InterpreterRuntime::resolve\_invoke() function updates the ConstantPoolCacheEntry from the information saved in the CallInfo instance, as follows:

switch (info.call_kind()) {
  // ...
  case CallInfo::itable_call:
    cache_entry(thread)->set_itable_call(
      bytecode,
      info.resolved_method(),
      info.itable_index());
    break;
  default: ShouldNotReachHere();
}

When the CallInfo holds itable dispatch information, the set\_itable\_call() function is called. Its implementation is as follows:

void ConstantPoolCacheEntry::set_itable_call(
 Bytecodes::Code invoke_code,
 methodHandle method,
 int index
) {
  assert(invoke_code == Bytecodes::_invokeinterface, "");
  InstanceKlass* interf = method->method_holder();
  // interf must be an interface, and method must be a non-final method
  set_f1(interf); // for itable calls, _f1 holds the InstanceKlass of the interface
  set_f2(index); // for itable calls, _f2 holds the itable index
  set_method_flags(as_TosState(method->result_type()),
                   0, // no option bits
                   method()->size_of_parameters());
  set_bytecode_1(Bytecodes::_invokeinterface);
}

In this way the information in the CallInfo instance is used to update the ConstantPoolCacheEntry.
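What set\_itable\_call() stores can be pictured with a small model. This is a sketch under assumptions, not the real class: ToyCacheEntry and the toy_ functions are hypothetical, option bits are left out, and the flags layout (TosState in the top 4 bits, parameter size in the low 8 bits) is inferred from the `shr $0x1c` and `and $0xff` instructions in the interpreter assembly shown in this series.

```cpp
#include <cstdint>

// Hypothetical model of the fields that set_itable_call() fills in.
struct ToyCacheEntry {
  const void*   f1 = nullptr;  // for itable calls: the interface klass
  std::intptr_t f2 = 0;        // for itable calls: the itable index
  std::uint32_t flags = 0;     // TosState (top 4 bits) | parameter size (low 8 bits)
};

// Assumed layout constants, inferred from the assembly listings.
constexpr int kTosStateShift = 28;
constexpr std::uint32_t kParameterSizeMask = 0xff;

// Store interface klass, itable index, return TosState and parameter size.
inline void toy_set_itable_call(ToyCacheEntry& entry,
                                const void* interface_klass,
                                int itable_index,
                                std::uint32_t tos_state,
                                std::uint32_t parameter_size) {
  entry.f1 = interface_klass;
  entry.f2 = itable_index;
  entry.flags = (tos_state << kTosStateShift) |
                (parameter_size & kParameterSizeMask);
}

// The two decoders correspond to "shr $0x1c" and "and $0xff".
inline std::uint32_t toy_tos_state(const ToyCacheEntry& entry) {
  return entry.flags >> kTosStateShift;
}

inline std::uint32_t toy_parameter_size(const ToyCacheEntry& entry) {
  return entry.flags & kParameterSizeMask;
}
```

After the entry is filled in once, the interpreter only needs these cheap field reads on every later execution of the same call site.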

Chapter 35-The invokespecial and invokestatic method call instructions

This chapter describes in detail the assembly implementation of the invokespecial and invokestatic bytecode instructions.

1. invokespecial instruction

The template of the invokespecial instruction is defined as follows:

def(Bytecodes::_invokespecial , ubcp|disp|clvm|____, vtos, vtos, invokespecial , f1_byte );

The generated function is invokespecial(), and the generated assembly code is as follows:

0x00007fffe1022250: mov %r13,-0x38(%rbp)
0x00007fffe1022254: movzwl 0x1(%r13),%edx
0x00007fffe1022259: mov -0x28(%rbp),%rcx
0x00007fffe102225d: shl $0x2,%edx
0x00007fffe1022260: mov 0x10(%rcx,%rdx,8),%ebx
// Extract b1 from indices[b2,b1,constant pool index] in the ConstantPoolCacheEntry
0x00007fffe1022264: shr $0x10,%ebx
0x00007fffe1022267: and $0xff,%ebx
// Check whether the invokespecial (183) bytecode is already linked; if so, jump
0x00007fffe102226d: cmp $0xb7,%ebx
0x00007fffe1022273: je 0x00007fffe1022312

// ... omitted: the call to InterpreterRuntime::resolve_invoke(),
// which links the invokespecial (183) bytecode
// because it has not been linked yet

// Load x of "invokespecial x" into %edx
0x00007fffe1022306: movzwl 0x1(%r13),%edx
// Load the address of the ConstantPoolCache into %rcx
0x00007fffe102230b: mov -0x28(%rbp),%rcx
// %edx holds the index of the ConstantPoolCacheEntry; convert it to a word offset
0x00007fffe102230f: shl $0x2,%edx

// Load the value of ConstantPoolCacheEntry::_f1
0x00007fffe1022312: mov 0x18(%rcx,%rdx,8),%rbx
// Load the value of ConstantPoolCacheEntry::_flags
0x00007fffe1022317: mov 0x28(%rcx,%rdx,8),%edx


// Copy flags into %ecx
0x00007fffe102231b: mov %edx,%ecx
// Extract the parameter size from flags
0x00007fffe102231d: and $0xff,%ecx
// Load recv. %rcx holds the parameter size, so the address is %rsp+%rcx*8-0x8.
// For an instance method the parameter size in flags already includes recv,
// since the first argument of an instance method call is this (recv)
0x00007fffe1022323: mov -0x8(%rsp,%rcx,8),%rcx
// Extract the return type from flags, i.e. the TosState stored in the top 4 bits of _flags
0x00007fffe1022328: shr $0x1c,%edx
// Load the address of TemplateInterpreter::invoke_return_entry into %r10
0x00007fffe102232b: movabs $0x7ffff73b6380,%r10
// Look up the invoke_return_entry address for this return type
0x00007fffe1022335: mov (%r10,%rdx,8),%rdx
// Push it as the return address for the method called by invokespecial
0x00007fffe1022339: push %rdx

// Null check
0x00007fffe102233a: cmp (%rcx),%rax

// ...

// Set the caller's stack top
0x00007fffe102235c: lea 0x8(%rsp),%r13
// Save the caller's stack top in the last_sp slot of the frame
0x00007fffe1022361: mov %r13,-0x10(%rbp)

// Jump to the Method::_from_interpreted_entry entry point
0x00007fffe1022365: jmpq *0x58(%rbx)

The invokespecial instruction calls private methods and constructors, which do not need dynamic dispatch. After the bytecode instruction has been resolved, \_f1 in the ConstantPoolCacheEntry points to the Method instance of the target method and \_f2 is not used, so the logic of the assembly above is simple and is not discussed further here.
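The address arithmetic in the listing can still be spelled out. The sketch below is derived only from the constants visible in the assembly (`shl $0x2`, the scale factor 8, and the displacements 0x10, 0x18 and 0x28); the names are hypothetical and the offsets are assumptions that hold for this particular listing, not a statement about every HotSpot build.

```cpp
#include <cstddef>
#include <cstdint>

// Constants read off the listing: the first entry starts at 0x10 and each
// entry spans 4 words of 8 bytes (index << 2, then scaled by 8).
constexpr std::size_t kCacheHeaderBytes = 0x10;
constexpr std::size_t kEntryBytes = 4 * 8;
constexpr std::size_t kF1OffsetInEntry = 0x18 - 0x10;
constexpr std::size_t kFlagsOffsetInEntry = 0x28 - 0x10;

// Byte offset of entry `index` from the start of the cache: 0x10(%rcx,%rdx,8).
constexpr std::size_t toy_entry_offset(std::size_t index) {
  return kCacheHeaderBytes + index * kEntryBytes;
}

// Byte offset of the entry's _f1 field: 0x18(%rcx,%rdx,8).
constexpr std::size_t toy_f1_offset(std::size_t index) {
  return toy_entry_offset(index) + kF1OffsetInEntry;
}

// Byte offset of the entry's _flags field: 0x28(%rcx,%rdx,8).
constexpr std::size_t toy_flags_offset(std::size_t index) {
  return toy_entry_offset(index) + kFlagsOffsetInEntry;
}

// b1 of indices[b2,b1,constant pool index]: shr $0x10 followed by and $0xff.
constexpr std::uint32_t toy_bytecode_1(std::uint32_t indices) {
  return (indices >> 16) & 0xff;
}

// The call site counts as linked once b1 equals the invoking opcode,
// for example 0xb7 for invokespecial.
constexpr bool toy_is_linked(std::uint32_t indices, std::uint32_t opcode) {
  return toy_bytecode_1(indices) == opcode;
}
```

For entry index 3, for example, the fields land at offsets 0x70 (indices), 0x78 (\_f1) and 0x88 (\_flags) from the start of the cache.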

2. Invokestatic instruction

The template of the invokestatic instruction is defined as follows:

def(Bytecodes::_invokestatic , ubcp|disp|clvm|____, vtos, vtos, invokestatic , f1_byte);

The generated function is invokestatic(), and the generated assembly code is as follows:

0x00007fffe101c030: mov %r13,-0x38(%rbp)
0x00007fffe101c034: movzwl 0x1(%r13),%edx
0x00007fffe101c039: mov -0x28(%rbp),%rcx
0x00007fffe101c03d: shl $0x2,%edx
0x00007fffe101c040: mov 0x10(%rcx,%rdx,8),%ebx
0x00007fffe101c044: shr $0x10,%ebx
0x00007fffe101c047: and $0xff,%ebx
// Check whether the invokestatic (184) bytecode is already linked; if so, jump
0x00007fffe101c04d: cmp $0xb8,%ebx
0x00007fffe101c053: je 0x00007fffe101c0f2


// Call InterpreterRuntime::resolve_invoke() to link the invokestatic (184)
// bytecode, because it has not been linked yet
// ... the assembly that resolves invokestatic is omitted

// Load x of "invokestatic x" into %edx
0x00007fffe101c0e6: movzwl 0x1(%r13),%edx
// Load the address of the ConstantPoolCache into %rcx
0x00007fffe101c0eb: mov -0x28(%rbp),%rcx
// %edx holds the index of the ConstantPoolCacheEntry; convert it to a word offset
0x00007fffe101c0ef: shl $0x2,%edx


// Load the value of ConstantPoolCacheEntry::_f1
0x00007fffe101c0f2: mov 0x18(%rcx,%rdx,8),%rbx
// Load the value of ConstantPoolCacheEntry::_flags
0x00007fffe101c0f7: mov 0x28(%rcx,%rdx,8),%edx


// Extract the return type from flags, i.e. the TosState stored in the top 4 bits of _flags
0x00007fffe101c0fb: shr $0x1c,%edx
// Load the address of TemplateInterpreter::invoke_return_entry into %r10
0x00007fffe101c0fe: movabs $0x7ffff73b5d00,%r10
// Look up the invoke_return_entry address for this return type
0x00007fffe101c108: mov (%r10,%rdx,8),%rdx
// Push it as the return address for the method called by invokestatic
0x00007fffe101c10c: push %rdx


// Set the caller's stack top
0x00007fffe101c10d: lea 0x8(%rsp),%r13
// Save the caller's stack top in the last_sp slot of the frame
0x00007fffe101c112: mov %r13,-0x10(%rbp)

// Jump to the Method::_from_interpreted_entry entry point
0x00007fffe101c116: jmpq *0x58(%rbx)

The invokestatic instruction calls static methods, which do not need dynamic dispatch. After the bytecode instruction has been resolved, \_f1 in the ConstantPoolCacheEntry points to the Method instance of the target method and \_f2 is not used, so the logic of the assembly above is simple and is not discussed further here.

The resolution process for invokestatic and invokespecial is not covered in detail here. Interested readers can follow it starting from the LinkResolver::resolve_invoke() function.

Chapter 36-The method return instruction return

The bytecode instructions related to method return are shown in the following table.

The template is defined as follows:

def(Bytecodes::_ireturn , ____|disp|clvm|____, itos, itos, _return , itos );
def(Bytecodes::_lreturn , ____|disp|clvm|____, ltos, ltos, _return , ltos );
def(Bytecodes::_freturn , ____|disp|clvm|____, ftos, ftos, _return , ftos );
def(Bytecodes::_dreturn , ____|disp|clvm|____, dtos, dtos, _return , dtos );
def(Bytecodes::_areturn , ____|disp|clvm|____, atos, atos, _return , atos );
def(Bytecodes::_return , ____|disp|clvm|____, vtos, vtos, _return , vtos );

def(Bytecodes::_return_register_finalizer , ____|disp|clvm|____, vtos, vtos, _return , vtos );

The generator function for all of them is TemplateTable::\_return(). If the return instruction is in an object's constructor, however, it may be rewritten to the \_return\_register\_finalizer instruction.

The assembly code corresponding to the generated return bytecode instruction is as follows:

Part 1

// Load JavaThread::_do_not_unlock_if_synchronized into %dl
0x00007fffe101b770: mov 0x2ad(%r15),%dl
// Reset JavaThread::_do_not_unlock_if_synchronized to false
0x00007fffe101b777: movb $0x0,0x2ad(%r15)

// Load the Method* into %rbx
0x00007fffe101b77f: mov -0x18(%rbp),%rbx
// Load Method::_access_flags into %ecx
0x00007fffe101b783: mov 0x28(%rbx),%ecx
// Check whether the access flags contain JVM_ACC_SYNCHRONIZED
0x00007fffe101b786: test $0x20,%ecx
// If the method is not synchronized, jump to ----unlocked----
0x00007fffe101b78c: je 0x00007fffe101b970


// If the _do_not_unlock_if_synchronized value saved in %dl is not 0,
// jump to no_unlock, meaning the lock must not be released
0x00007fffe101b792: test $0xff,%dl
0x00007fffe101b795: jne 0x00007fffe101ba90 // jump to ----no_unlock----

The JavaThread class defines a \_do\_not\_unlock\_if\_synchronized attribute. It indicates that the receiver should not be unlocked when an exception is thrown (for a non-static method call, the method is always invoked on some object, and that object is the receiver). The value is only used during interpreted execution and is initialized to false. As the assembly above shows, when \_do\_not\_unlock\_if\_synchronized is true the receiver does not need to be unlocked, so even though the current method is synchronized, execution jumps directly to no_unlock.

Part 2

If the following assembly code is reached, the value of \_do\_not\_unlock\_if_synchronized saved in the %dl register is 0, and the lock release operation must be performed.

// Store the result of the preceding bytecode on top of the expression stack.
// Because return does not return a result, there is no return value to save,
// so no push instruction is generated here

// Load the address of the BasicObjectLock into %rsi. %rsi can carry the
// second argument when calling a C++ function, so it can be passed as is
// if unlock_object needs to be called
0x00007fffe101b79b: lea -0x50(%rbp),%rsi

// Load the value of BasicObjectLock::_obj into %rax
0x00007fffe101b79f: mov 0x8(%rsi),%rax

// If it is not 0, jump to unlock: a non-zero value means
// _obj refers to a locked object, so the lock must be released
0x00007fffe101b7a3: test %rax,%rax
0x00007fffe101b7a6: jne 0x00007fffe101b8a8 // jump to ----unlock----

// For the other return instructions, the result was saved on the expression
// stack by the push above, so it can now be popped back into its register

The -0x50(%rbp) operand of the first instruction points to the first BasicObjectLock object; sizeof(BasicObjectLock) is 16 bytes. The structure of the Java interpreter stack frame was shown when stack frames were introduced, as follows:

Assuming there are 2 lock objects in the current stack frame, 2 BasicObjectLock objects are stored in the stack frame. BasicObjectLock has 2 attributes, \_lock and \_obj, each occupying 8 bytes. The layout is shown in the figure below.

The return bytecode instruction is responsible for releasing the lock taken by an interpreted Java method declared with the synchronized keyword. The lock object created for the synchronized keyword is the first one in the frame and is stored closest to the bottom of the current stack frame, which is the gray part in the figure above; the other lock objects do not concern us for now. The address of the BasicObjectLock in the gray part is obtained through -0x50(%rbp), after which its \_lock and \_obj attributes can be accessed.

Locking has not been covered yet, so it is not discussed further here; it will be described in detail later.
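The layout described above can be checked with a small model. This is a sketch with hypothetical names (ToyBasicLock, ToyBasicObjectLock, toy_monitor_offset); fixed-width integers stand in for the real pointer-sized fields, and the -0x40 boundary of the monitor area is an assumption read off the `lea -0x40(%rbp)` instruction in the listings of this chapter.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for BasicLock: a single word holding the displaced header.
struct ToyBasicLock {
  std::uint64_t displaced_header;
};

// Stand-in for BasicObjectLock: _lock at offset 0, _obj at offset 8.
struct ToyBasicObjectLock {
  ToyBasicLock lock;
  std::uint64_t obj;  // the locked object, 0 when the entry is free
};

static_assert(sizeof(ToyBasicObjectLock) == 16, "two 8-byte fields");
static_assert(offsetof(ToyBasicObjectLock, lock) == 0, "_lock comes first");
static_assert(offsetof(ToyBasicObjectLock, obj) == 8, "_obj follows _lock");

// Assumed boundary of the monitor area relative to %rbp.
constexpr std::ptrdiff_t kMonitorAreaBottomOffset = -0x40;

// Offset from %rbp of the n-th monitor, where n = 0 is the first one
// created (the one closest to the bottom of the frame).
constexpr std::ptrdiff_t toy_monitor_offset(int n) {
  return kMonitorAreaBottomOffset -
         (n + 1) * static_cast<std::ptrdiff_t>(sizeof(ToyBasicObjectLock));
}
```

With these assumptions the first monitor sits at -0x50 relative to %rbp, which matches the `lea -0x50(%rbp),%rsi` instruction, and a second one would sit at -0x60.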

Part 3

When the variable throw\_monitor\_exception is true, call\_VM() is used to generate the assembly code that throws the illegal monitor state exception. This code mainly executes the C++ function InterpreterRuntime::throw\_illegal\_monitor\_state\_exception(). It is followed by the assembly code generated by the should\_not\_reach\_here() function.

When throw\_monitor\_exception is false and install\_monitor\_exception is true, call\_VM() is used to generate assembly code that executes the C++ function InterpreterRuntime::new\_illegal\_monitor\_state_exception(). Execution then jumps to unlocked.

Part 4

In the InterpreterMacroAssembler::remove\_activation() function, after the unlock label is bound, the InterpreterMacroAssembler::unlock\_object() function is called to generate the assembly code below. The source comment for InterpreterMacroAssembler::unlock_object() describes it as follows:

Unlocks an object. Used in monitorexit bytecode and remove_activation. Throws an IllegalMonitorException if object is not locked by current thread.

The generated assembly code is as follows:

// **** unlock ****

// ====== Start of the assembly generated by InterpreterMacroAssembler::unlock_object() ======

// Save %r13 in the frame so that an exception cannot clobber its value
0x00007fffe101b8a8: mov %r13,-0x38(%rbp)

// Load the address of BasicObjectLock::_lock into %rax
0x00007fffe101b8ac: lea (%rsi),%rax
// Load BasicObjectLock::_obj into %rcx
0x00007fffe101b8af: mov 0x8(%rsi),%rcx

// Set BasicObjectLock::_obj to NULL to mark the entry as released
0x00007fffe101b8b3: movq $0x0,0x8(%rsi)

// ------ When UseBiasedLocking is true, MacroAssembler::biased_locking_exit() generates the following ------
// Load the mark word of the object referenced by BasicObjectLock::_obj and mask its low 3 bits
0x00007fffe101b8bb: mov (%rcx),%rdx
0x00007fffe101b8be: and $0x7,%rdx
// If the low 3 bits of the mark word show the biased-lock pattern, jump to ---- done ----
0x00007fffe101b8c2: cmp $0x5,%rdx
0x00007fffe101b8c6: je 0x00007fffe101b96c
// ------ End of the assembly generated by MacroAssembler::biased_locking_exit() ------

// Load the _displaced_header stored in BasicObjectLock::_lock
0x00007fffe101b8cc: mov (%rax),%rdx
// A zero value means a recursive lock; in that case jump to ---- done ----
0x00007fffe101b8cf: test %rdx,%rdx
0x00007fffe101b8d2: je 0x00007fffe101b96c

// Restore the mark word of the object referenced by BasicObjectLock::_obj
// to the original header saved in BasicObjectLock::_lock
0x00007fffe101b8d8: lock cmpxchg %rdx,(%rcx)
// If the compare-and-exchange succeeded (ZF set), the lock is released; jump to ---- done ----
0x00007fffe101b8dd: je 0x00007fffe101b96c

// Slow path: the exchange failed, so store the object back into
// BasicObjectLock::_obj before calling into the runtime
0x00007fffe101b8e3: mov %rcx,0x8(%rsi)

// ------ call_VM() generates the following assembly to execute the C++ function InterpreterRuntime::monitorexit() ------
0x00007fffe101b8e7: callq 0x00007fffe101b8f1
0x00007fffe101b8ec: jmpq 0x00007fffe101b96c
0x00007fffe101b8f1: lea 0x8(%rsp),%rax
0x00007fffe101b8f6: mov %r13,-0x38(%rbp)
0x00007fffe101b8fa: mov %r15,%rdi
0x00007fffe101b8fd: mov %rbp,0x200(%r15)
0x00007fffe101b904: mov %rax,0x1f0(%r15)
0x00007fffe101b90b: test $0xf,%esp
0x00007fffe101b911: je 0x00007fffe101b929
0x00007fffe101b917: sub $0x8,%rsp
0x00007fffe101b91b: callq 0x00007ffff66b3d22
0x00007fffe101b920: add $0x8,%rsp
0x00007fffe101b924: jmpq 0x00007fffe101b92e
0x00007fffe101b929: callq 0x00007ffff66b3d22
0x00007fffe101b92e: movabs $0x0,%r10
0x00007fffe101b938: mov %r10,0x1f0(%r15)
0x00007fffe101b93f: movabs $0x0,%r10
0x00007fffe101b949: mov %r10,0x200(%r15)
0x00007fffe101b950: cmpq $0x0,0x8(%r15)
0x00007fffe101b958: je 0x00007fffe101b963
0x00007fffe101b95e: jmpq 0x00007fffe1000420
0x00007fffe101b963: mov -0x38(%rbp),%r13
0x00007fffe101b967: mov -0x30(%rbp),%r14
0x00007fffe101b96b: retq 
// ------ End of the assembly generated by call_VM() ------

// **** done ****

0x00007fffe101b96c: mov -0x38(%rbp),%r13

// ====== End of the assembly generated by InterpreterMacroAssembler::unlock_object() ======

Part 5

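// For the other return instructions, the result was saved on the expression
// stack by an earlier push, so it can now be popped back into its register


// **** unlocked ****
// When this code runs, the frame no longer holds the method's lock,
// i.e. the related lock objects have all been released

// **** restart ****
// Check that all locks have really been released

// %rsi points to the BasicObjectLock closest to the top of the stack
0x00007fffe101b970: mov -0x40(%rbp),%rsi
// %rbx marks the bottom of the BasicObjectLock area (the end closest to the stack bottom)
0x00007fffe101b974: lea -0x40(%rbp),%rbx

// Jump to ----entry----
0x00007fffe101b978: jmpq 0x00007fffe101ba8b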

Part 6

The following code is produced by call\_VM(), which generates the call to the InterpreterRuntime::throw\_illegal\_monitor\_state_exception() function:

// **** exception ****
// Entry already locked, need to throw exception

// When throw_monitor_exception is true, the assembly generated by these 2 functions runs:
//   call_VM(), which calls the C++ function InterpreterRuntime::throw_illegal_monitor_state_exception()
//   should_not_reach_here()

// When throw_monitor_exception is false, the following runs instead:
//   the assembly generated by InterpreterMacroAssembler::unlock_object()
//   if install_monitor_exception is true, the assembly generated by call_VM(), which calls the C++ function InterpreterRuntime::new_illegal_monitor_state_exception()
//   an unconditional jump to ----restart----

Part 7

// **** loop ****

// Compare BasicObjectLock::_obj with NULL; if it is not NULL, jump to ----exception----
0x00007fffe101ba79: cmpq $0x0,0x8(%rsi)
0x00007fffe101ba81: jne 0x00007fffe101b97d // jump to ----exception----

Part 8

// **** entry ****

// 0x10 is sizeof(BasicObjectLock); advance to the next BasicObjectLock
0x00007fffe101ba87: add $0x10,%rsi
// Check whether the bottom of the lock storage area has been reached
0x00007fffe101ba8b: cmp %rbx,%rsi
// If not, jump to loop
0x00007fffe101ba8e: jne 0x00007fffe101ba79 // jump to ----loop----
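Taken together, the loop and entry labels amount to a scan over the monitor area that looks for an entry still holding an object. The following is a minimal sketch of that control flow with hypothetical names (ToyMonitor, toy_first_locked_monitor); a vector stands in for the region between the top pointer in %rsi and the bottom boundary in %rbx.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for one BasicObjectLock entry in the frame.
struct ToyMonitor {
  std::uint64_t lock;
  std::uint64_t obj;  // 0 means the entry is free
};

// monitors[0] plays the role of the top-most entry (%rsi at "restart");
// the end of the vector plays the role of the bottom boundary (%rbx).
// Returns the index of the first entry that is still locked, or -1 when
// the scan reaches the bottom without finding one.
int toy_first_locked_monitor(const std::vector<ToyMonitor>& monitors) {
  for (std::size_t i = 0; i < monitors.size(); ++i) {  // "entry": cmp %rbx,%rsi
    if (monitors[i].obj != 0) {                        // "loop": cmpq $0x0,0x8(%rsi)
      return static_cast<int>(i);                      // would jump to "exception"
    }
  }
  return -1;  // falls through to "no_unlock"
}
```

As in the assembly, the bounds check comes first, so an empty monitor area falls straight through without examining any entry.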

Part 9

// **** no_unlock ****

// jvmti support omitted

// Load the old stack pointer (saved rsp) stored at -0x8(%rbp) into %rbx
0x00007fffe101bac7: mov -0x8(%rbp),%rbx

// Remove the stack frame.
// leave is equivalent to:
// mov %rbp, %rsp
// pop %rbp
0x00007fffe101bacb: leaveq
// Pop the return address into %r13
0x00007fffe101bacc: pop %r13
// Set %rsp to the caller's stack top
0x00007fffe101bace: mov %rbx,%rsp
0x00007fffe101bad1: jmpq *%r13

The address popped into %r13 is the return address of the interpreted method. When the Java method was called from a C++ function, this return address leads back into that C++ function, and we do not need to consider it here.

The entire call conversion is shown in the figure below.

The red part indicates the end of this process.

Because the return bytecode instruction also handles lock release, the flowchart above looks somewhat complicated. The return instruction will be revisited after locking has been covered, so it is not discussed further here.

Chapter 37-Interpreter::\_invoke\_return_entry, the routine that restores the caller's stack frame

The previous chapter covered the execution logic of the return bytecode instruction. That instruction only releases the lock and pops the current stack frame. When control transfers back to the caller, the caller's stack frame state still has to be restored, for example making %r13 point to the bcp and %r14 point to the local variable table. The arguments that were pushed for the call also have to be popped, and execution has to continue at the caller's next bytecode instruction. All of this is done by the Interpreter::\_invoke\_return\_entry routines. They were mentioned in the earlier discussion of bytecode instructions such as invokevirtual and invokeinterface: when one of these instructions calls a method, it pushes as the return address the routine address stored in the one-dimensional array Interpreter::\_invoke\_return\_entry for the method's return type, and that routine runs after the callee's return bytecode instruction has executed.

In bytecode instructions such as invokevirtual and invokeinterface, the corresponding routine entry table is obtained by calling the following function:

address* TemplateInterpreter::invoke_return_entry_table_for(Bytecodes::Code code) {
  switch (code) {
  case Bytecodes::_invokestatic:
  case Bytecodes::_invokespecial:
  case Bytecodes::_invokevirtual:
  case Bytecodes::_invokehandle:
    return Interpreter::invoke_return_entry_table();
  case Bytecodes::_invokeinterface:
    return Interpreter::invokeinterface_return_entry_table();
  default:
    fatal(err_msg("invalid bytecode: %s", Bytecodes::name(code)));
    return NULL;
  }
}

As you can see, the invokeinterface bytecode gets its routine from the Interpreter::\_invokeinterface\_return\_entry array, and the others get theirs from the Interpreter::\_invoke\_return\_entry array. Both are one-dimensional arrays, declared as follows:

address TemplateInterpreter::_invoke_return_entry[TemplateInterpreter::number_of_return_addrs];
address TemplateInterpreter::_invokeinterface_return_entry[TemplateInterpreter::number_of_return_addrs];

Once the one-dimensional array has been returned, the entry address of the routine is selected from it according to the method's return type. Let's look at how these routines are generated.

The Interpreter::\_invoke\_return_entry entries are generated in the TemplateInterpreterGenerator::generate\_all() function, as follows:

{
    CodeletMark cm(_masm, "invoke return entry points");
    const TosState states[] = {itos, itos, itos, itos, ltos, ftos, dtos, atos, vtos};
    const int invoke_length = Bytecodes::length_for(Bytecodes::_invokestatic); // invoke_length=3
    const int invokeinterface_length = Bytecodes::length_for(Bytecodes::_invokeinterface); // invokeinterface_length=5
 
    for (int i = 0; i < Interpreter::number_of_return_addrs; i++) { // number_of_return_addrs = 9
       TosState state = states[i]; // TosState is an enum type
       Interpreter::_invoke_return_entry[i] = generate_return_entry_for(state, invoke_length, sizeof(u2)); 
       Interpreter::_invokeinterface_return_entry[i] = generate_return_entry_for(state, invokeinterface_length, sizeof(u2));
    }
}

Except for the invokedynamic bytecode instruction, every method call instruction executes a routine generated by generate\_return\_entry\_for() after the called method has been interpreted. The generate\_return\_entry\_for() function that generates these routines is implemented as follows:

address TemplateInterpreterGenerator::generate_return_entry_for(TosState state, int step, size_t index_size) {
 
  // Restore stack bottom in case i2c adjusted stack
  __ movptr(rsp, Address(rbp, frame::interpreter_frame_last_sp_offset * wordSize)); // interpreter_frame_last_sp_offset=-2
  // and NULL it as marker that esp is now tos until next java call
  __ movptr(Address(rbp, frame::interpreter_frame_last_sp_offset * wordSize), (int32_t)NULL_WORD);
 
  __ restore_bcp();
  __ restore_locals();
 
  // ...
 
  const Register cache = rbx;
  const Register index = rcx;
  __ get_cache_and_index_at_bcp(cache, index, 1, index_size);
 
  const Register flags = cache;
  __ movl(flags, Address(cache, index, Address::times_ptr, ConstantPoolCache::base_offset() + ConstantPoolCacheEntry::flags_offset()));
  __ andl(flags, ConstantPoolCacheEntry::parameter_size_mask);
  __ lea(rsp, Address(rsp, flags, Interpreter::stackElementScale()) ); // the stack element scale is 8
  __ dispatch_next(state, step);
 
  return entry;
}

According to the difference of the state (the return type of the method), when the next bytecode instruction of the caller method is selected to be executed, it is determined from which entry of the bytecode instruction to start execution. Let's take a look, when the passed state is itos (that is, when the return type of the method is int), the generated assembly code is as follows:

// Load -0x10(%rbp) into %rsp, then clear -0x10(%rbp)
0x00007fffe1006ce0: mov -0x10(%rbp),%rsp // update rsp
0x00007fffe1006ce4: movq $0x0,-0x10(%rbp) // clear that slot in the stack frame
// Restore bcp and locals: %r13 points to bcp, %r14 points to the local variable table
0x00007fffe1006cec: mov -0x38(%rbp),%r13
0x00007fffe1006cf0: mov -0x30(%rbp),%r14
// Load the ConstantPoolCacheEntry index into %ecx
0x00007fffe1006cf4: movzwl 0x1(%r13),%ecx

// Load the ConstantPoolCache stored at -0x28(%rbp) into %rbx
0x00007fffe1006cf9: mov -0x28(%rbp),%rbx
// shl is a logical left shift; it converts the entry index into a word offset
0x00007fffe1006cfd: shl $0x2,%ecx
// Load the _flags field of the ConstantPoolCacheEntry
0x00007fffe1006d00: mov 0x28(%rbx,%rcx,8),%ebx
// Extract the parameter size held in the low 8 bits of _flags
0x00007fffe1006d04: and $0xff,%ebx

// lea stores the computed address in %rsp, restoring the stack to its state before the call
0x00007fffe1006d0a: lea (%rsp,%rbx,8),%rsp

// Dispatch to the next instruction
0x00007fffe1006d0e: movzbl 0x3(%r13),%ebx
0x00007fffe1006d13: add $0x3,%r13
0x00007fffe1006d17: movabs $0x7ffff73b7ca0,%r10
0x00007fffe1006d21: jmpq *(%r10,%rbx,8)

The logic of this assembly code is straightforward, so it is not discussed further here.
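The stack adjustment performed by the return entry can be modelled in a few lines. The following is a minimal sketch, not HotSpot code: the helper name rsp_after_return and the sample flags value are made up for illustration. Only the 0xff mask and the 8-byte slot size are taken from the listing above.

```cpp
#include <cassert>
#include <cstdint>

// Mask and slot size as they appear in the generated assembly above.
constexpr std::uint32_t kParameterSizeMask = 0xff; // and $0xff,%ebx
constexpr std::uint64_t kStackElementSize  = 8;    // lea (%rsp,%rbx,8),%rsp

// Hypothetical helper: given the caller's saved last_sp and the _flags word of
// the resolved ConstantPoolCacheEntry, compute the stack pointer after the
// outgoing arguments have been discarded.
constexpr std::uint64_t rsp_after_return(std::uint64_t last_sp, std::uint32_t flags) {
  return last_sp + (flags & kParameterSizeMask) * kStackElementSize;
}

// An instance method add(int, int) occupies 3 slots (this, a, b),
// so 24 bytes are released. The upper bits of the sample value are arbitrary.
static_assert(rsp_after_return(0x1000, 0x30000003u) == 0x1018, "3 slots * 8 bytes");
```

The static_assert doubles as a worked example: with a saved stack pointer of 0x1000 and a parameter size of 3, the caller's expression stack resumes at 0x1018.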

Part 38-A Small Example of Calls Between Interpreted Methods

This part walks through a small example in which the interpreted main() method calls the add() method, which is also interpreted. The example is as follows:

package com.classloading;

public class TestInvokeMethod {
  public int add(int a, int b) {
    return a + b;
  }

  public static void main(String[] args) {
    TestInvokeMethod tim = new TestInvokeMethod();
    tim.add(2, 3);
  }
}

Compiled by javac, the class file contains the following constant pool and bytecode:

Constant pool:
   #1 = Methodref #5.#16 // java/lang/Object."<init>":()V
   #2 = Class #17 // com/classloading/TestInvokeMethod
   #3 = Methodref #2.#16 // com/classloading/TestInvokeMethod."<init>":()V
   #4 = Methodref #2.#18 // com/classloading/TestInvokeMethod.add:(II)I
   #5 = Class #19 // java/lang/Object
   #6 = Utf8 <init>
   #7 = Utf8 ()V
   #8 = Utf8 Code
   #9 = Utf8 LineNumberTable
  #10 = Utf8 add
  #11 = Utf8 (II)I
  #12 = Utf8 main
  #13 = Utf8 ([Ljava/lang/String;)V
  #14 = Utf8 SourceFile
  #15 = Utf8 TestInvokeMethod.java
  #16 = NameAndType #6:#7 // "<init>":()V
  #17 = Utf8 com/classloading/TestInvokeMethod
  #18 = NameAndType #10:#11 // add:(II)I
  #19 = Utf8 java/lang/Object
{
  public com.classloading.TestInvokeMethod();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1 // Method java/lang/Object."<init>":()V
         4: return

  public int add(int, int);
    descriptor: (II)I
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=3, args_size=3
         0: iload_1
         1: iload_2
         2: iadd
         3: ireturn

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=2, args_size=1
         0: new #2 // class com/classloading/TestInvokeMethod
         3: dup
         4: invokespecial #3 // Method "<init>":()V
         7: astore_1
         8: aload_1
         9: iconst_2
        10: iconst_3
        11: invokevirtual #4 // Method add:(II)I
        14: pop
        15: return
}

The call is described below in several parts.

1. C++ function calls main() method

Now we look at aload_1 whose bytecode index is 8. The stack frame status at this time is as follows:

Since the tos_out of aload_1 is atos, the address of the TestInvokeMethod instance is cached in the top-of-stack register (%rax). iconst_2 is therefore entered through its atos entry point (aep). The assembly of the iconst_2 instruction is as follows:

// aep
push %rax
jmpq // jump to the instruction body below

// ...

mov $0x2,%eax // assembly code of the instruction itself

Since the tos_out of iconst_2 is itos, the next instruction, iconst_3, is entered through its itos entry point (iep), as follows:

// iep
push %rax

mov $0x3,%eax

The next step is to execute the invokevirtual bytecode instruction. At this point 2 has been pushed onto the expression stack and 3 is held in the %eax register as the top-of-stack cache. Because the tos_in of invokevirtual is vtos, entering invokevirtual through its iep also pushes the value in %eax onto the expression stack. The final stack state is shown in the figure below.
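The interplay between the top-of-stack cache and the entry points can be replayed with a toy model. This is only a sketch under simplifying assumptions: ToyFrame, its members and replay_main_before_invoke are hypothetical names, a single flag stands in for the itos/atos/vtos states, and the real interpreter selects a separate entry stub per state instead of testing a flag at run time.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of the interpreter's top-of-stack cache: one register (%rax)
// may hold the topmost value instead of the expression stack.
struct ToyFrame {
  std::vector<std::int64_t> stack; // expression stack, back() is the top
  std::int64_t rax = 0;            // top-of-stack cache register
  bool cached = false;             // true while rax holds a live value (itos/atos)

  // Entry stub of a template whose tos_in is vtos: flush the cache first.
  // This corresponds to the "push %rax" at the aep/iep entry points.
  void enter_vtos() {
    if (cached) {
      stack.push_back(rax);
      cached = false;
    }
  }
  void aload(std::int64_t ref) { enter_vtos(); rax = ref; cached = true; } // tos_out = atos
  void iconst(std::int64_t v)  { enter_vtos(); rax = v;   cached = true; } // tos_out = itos
};

// Replays bytecodes 8..11 of main(): aload_1, iconst_2, iconst_3,
// followed by the entry into invokevirtual.
inline ToyFrame replay_main_before_invoke(std::int64_t receiver) {
  ToyFrame f;
  f.aload(receiver);
  f.iconst(2);
  f.iconst(3);
  f.enter_vtos(); // invokevirtual has tos_in = vtos, so 3 is pushed as well
  return f;
}
```

After the replay, the model's expression stack holds the receiver, 2 and 3 in that order, and the cache is empty, which matches the state described above just before the call.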

2. main() method calls add() method

When the invokevirtual bytecode instruction is executed, assume that it has already been resolved, that is, the information about the method call has already been saved in the corresponding ConstantPoolCacheEntry. The assembly code executed is as follows:

0x00007fffe1021f90: mov %r13,-0x38(%rbp) // save bcp in the stack frame
// Load x from "invokevirtual x" into %edx. x started out as a constant pool
// index, but by now it is the index of a ConstantPoolCacheEntry, because
// certain bytecode instructions in a method are rewritten when the class is linked
0x00007fffe1021f94: movzwl 0x1(%r13),%edx
// Load the address of the ConstantPoolCache into %rcx
0x00007fffe1021f99: mov -0x28(%rbp),%rcx

// Shift left by 2: %edx holds a ConstantPoolCacheEntry index and each
// ConstantPoolCacheEntry occupies 4 words
0x00007fffe1021f9d: shl $0x2,%edx

// Compute %rcx+%rdx*8+0x10 to read _indices from ConstantPoolCacheEntry[_indices,_f1,_f2,_flags].
// The ConstantPoolCache header is 0x10 bytes, so %rcx+0x10 is the address of
// the first ConstantPoolCacheEntry, and %rdx*8 is the byte offset relative to
// that first entry
0x00007fffe1021fa0: mov 0x10(%rcx,%rdx,8),%ebx

// Shift b2 of indices[b2,b1,constant pool index] down to the low byte
0x00007fffe1021fa4: shr $0x18,%ebx

// Keep only b2, the bytecode, in %ebx
0x00007fffe1021fa7: and $0xff,%ebx

// Check whether bytecode 182 (invokevirtual) has been resolved
0x00007fffe1021fad: cmp $0xb6,%ebx

// If it has, jump to resolved
0x00007fffe1021fb3: je 0x00007fffe1022052
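The resolved check in the listing above boils down to an address computation plus a byte extraction. The sketch below restates both in C++. The function names and the sample _indices values are assumptions made for illustration; the shift amounts, the 0x10 header size and the opcode 0xb6 come from the listing.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kInvokevirtual = 0xb6; // opcode 182

// _indices is laid out as [b2 | b1 | constant pool index], with b2 in bits 24..31.
constexpr std::uint32_t bytecode_2(std::uint32_t indices) {
  return (indices >> 24) & 0xff; // shr $0x18,%ebx ; and $0xff,%ebx
}

constexpr bool is_resolved_invokevirtual(std::uint32_t indices) {
  return bytecode_2(indices) == kInvokevirtual; // cmp $0xb6,%ebx ; je resolved
}

// Byte offset of entry `index` from the start of the ConstantPoolCache:
// 0x10 bytes of header, then 4 words (32 bytes) per entry.
constexpr std::uint64_t entry_offset(std::uint32_t index) {
  return 0x10 + (static_cast<std::uint64_t>(index) << 2) * 8; // shl $0x2 ; 0x10(%rcx,%rdx,8)
}

// Hypothetical sample values: an entry whose b2 byte holds invokevirtual,
// and one whose b2 byte has not been filled in.
static_assert(is_resolved_invokevirtual(0xb6000004u), "b2 == invokevirtual");
static_assert(!is_resolved_invokevirtual(0x00000004u), "b2 not set");
static_assert(entry_offset(0) == 0x10 && entry_offset(2) == 0x50, "32 bytes per entry");
```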

Let's look directly at the code that runs once the method has been resolved, as follows:

// **** resolved ****
// Definition of the resolved label; reaching it means invokevirtual has been resolved.
// Load ConstantPoolCacheEntry::_f2, a field that is only meaningful for virtual calls.
// The ConstantPoolCacheEntry array is stored right after the ConstantPoolCache,
// whose header is 0x10 bytes, and _f2 lies another 0x10 bytes into the entry,
// giving a total offset of 0x20
// Store ConstantPoolCacheEntry::_f2 in %rbx
0x00007fffe1022052: mov 0x20(%rcx,%rdx,8),%rbx
// Store ConstantPoolCacheEntry::_flags in %edx
0x00007fffe1022057: mov 0x28(%rcx,%rdx,8),%edx
// Copy flags to %ecx
0x00007fffe102205b: mov %edx,%ecx
// Extract the parameter size from flags
0x00007fffe102205d: and $0xff,%ecx

// Load recv. %rcx holds the parameter size, so the receiver is at %rsp+%rcx*8-0x8.
// For an instance method, the parameter size in flags already includes recv,
// since this (recv) is the first argument of an instance method call
0x00007fffe1022063: mov -0x8(%rsp,%rcx,8),%rcx // store recv in %rcx

// Save flags in %r13
0x00007fffe1022068: mov %edx,%r13d
// Extract the return type from flags, that is, the TosState kept in the high 4 bits of _flags
0x00007fffe102206b: shr $0x1c,%edx

// Load the address of TemplateInterpreter::invoke_return_entry into %r10
0x00007fffe102206e: movabs $0x7ffff73b6380,%r10
// %rdx holds the return type; use it to look up the return address.
// TemplateInterpreter::invoke_return_entry is an array, so the entry
// that matches the return type has to be selected
0x00007fffe1022078: mov (%r10,%rdx,8),%rdx
// Push the return address onto the stack
0x00007fffe102207c: push %rdx

// Restore ConstantPoolCacheEntry::_flags
0x00007fffe102207d: mov %r13d,%edx
// Restore bcp
0x00007fffe1022080: mov -0x38(%rbp),%r13

After this code has executed, the registers hold the following values:

rbx: the value of ConstantPoolCacheEntry::_f2
rcx: this, the first argument of the instance method call
rdx: the value of ConstantPoolCacheEntry::_flags

The state of the stack is shown in the figure below.

Note that the return address is itself the address of a routine: it is the element of the one-dimensional TemplateInterpreter::invoke_return_entry array at the index that corresponds to int, because add() returns an int. How is the return type of add() obtained? It is taken from the TosState stored in the _flags of the ConstantPoolCacheEntry.
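The way _flags is taken apart in the resolved path can be summarized in a short sketch. Everything here is illustrative: DecodedFlags, the helper functions and the sample flags value (state 3 in the high bits, 3 parameter slots in the low bits) are assumptions, while the masks, shifts and the table address are copied from the listing above.

```cpp
#include <cassert>
#include <cstdint>

struct DecodedFlags {
  std::uint32_t parameter_size; // low 8 bits, in stack slots (includes the receiver)
  std::uint32_t tos_state;      // high 4 bits, the result state of the callee
};

constexpr DecodedFlags decode_flags(std::uint32_t flags) {
  return DecodedFlags{flags & 0xffu, flags >> 28}; // and $0xff ; shr $0x1c
}

// Address of the receiver slot: -0x8(%rsp,%rcx,8)
constexpr std::uint64_t receiver_slot(std::uint64_t rsp, std::uint32_t flags) {
  return rsp + static_cast<std::uint64_t>(flags & 0xffu) * 8 - 8;
}

// Address of the array element that holds the return entry: (%r10,%rdx,8)
constexpr std::uint64_t return_entry_slot(std::uint64_t table, std::uint32_t flags) {
  return table + static_cast<std::uint64_t>(flags >> 28) * 8;
}

constexpr std::uint32_t kSampleFlags = 0x30000003u; // hypothetical value
static_assert(decode_flags(kSampleFlags).parameter_size == 3, "this, a, b");
static_assert(decode_flags(kSampleFlags).tos_state == 3, "high 4 bits");
static_assert(receiver_slot(0x1000, kSampleFlags) == 0x1010, "deepest of 3 slots");
static_assert(return_entry_slot(0x7ffff73b6380, kSampleFlags) == 0x7ffff73b6398, "base + 3*8");
```

The receiver is the deepest of the pushed arguments, which is why its address is derived from the parameter size rather than read from the stack top.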

Let's continue to look at the assembly code to be executed by the invokevirtual bytecode instruction, as follows:

// Copy flags to %eax
0x00007fffe1022084: mov %edx,%eax
// Test whether the called method is final
0x00007fffe1022086: and $0x100000,%eax
// If it is not final, jump to ----notFinal----
0x00007fffe102208c: je 0x00007fffe10220c0

// Dereference the receiver through (%rcx); if %rcx is null, this raises an OS exception
0x00007fffe1022092: cmp (%rcx),%rax

// Statistics (profiling) code omitted

// Compute the caller's stack top and save it
0x00007fffe10220b4: lea 0x8(%rsp),%r13
0x00007fffe10220b9: mov %r13,-0x10(%rbp)

// Jump to the Method::_from_interpreted_entry entry point
0x00007fffe10220bd: jmpq *0x58(%rbx)

Control then enters the Method::_from_interpreted_entry routine, which was described in detail earlier. Once it has run, a stack frame has been created for the add() method. The stack state at this point is shown in the figure below.
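The branch on the final bit and the computation of the caller's stack top from the listing above can be written down as follows. This is a simplified sketch: InvokePath, select_path and sender_sp are hypothetical names, and the sample flags values are arbitrary apart from bit 20.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kFinalMask = 0x100000; // and $0x100000,%eax (bit 20)

enum class InvokePath { DirectFinal, NotFinal };

// je ----notFinal---- is taken when the masked value is zero.
constexpr InvokePath select_path(std::uint32_t flags) {
  return (flags & kFinalMask) != 0 ? InvokePath::DirectFinal : InvokePath::NotFinal;
}

// lea 0x8(%rsp),%r13: the caller's stack top is one slot above the
// return address that was just pushed.
constexpr std::uint64_t sender_sp(std::uint64_t rsp) {
  return rsp + 8;
}

static_assert(select_path(0x30100003u) == InvokePath::DirectFinal, "bit 20 set");
static_assert(select_path(0x30000003u) == InvokePath::NotFinal, "bit 20 clear");
static_assert(sender_sp(0x7ff0) == 0x7ff8, "skip the return address slot");
```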

Next, iload_1 and iload_2 are executed. Since two iload instructions appear consecutively, they are handled as _fast_iload2. The assembly is as follows:

movzbl  0x1(%r13),%ebx
neg     %rbx
mov     (%r14,%rbx,8),%eax
push    %rax
movzbl  0x3(%r13),%ebx
neg     %rbx
mov     (%r14,%rbx,8),%eax

Note that only the first variable is pushed onto the stack; the second is kept in %eax as the top-of-stack cache.

Next, the iadd instruction is executed. Because its tos_in is itos, the assembly is as follows:

mov (%rsp),%edx
add    $0x8,%rsp
add    %edx,%eax

The final result is cached in %eax.
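The two listings above can be replayed on a toy model to see how 2 + 3 ends up in the cache register. The sketch below is an illustration only: ToyInterp and run_add are hypothetical names, memory is modelled as a small vector, and locals are addressed downward from the slot of local 0, mirroring the neg %rbx followed by (%r14,%rbx,8).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ToyInterp {
  std::vector<std::int32_t> locals_mem; // locals_mem[r14 - n] holds local n
  std::size_t r14;                      // position of local 0 inside locals_mem
  std::vector<std::int32_t> stack;      // expression stack, back() is the top
  std::int32_t eax;                     // top-of-stack cache

  std::int32_t local(std::size_t n) const { return locals_mem[r14 - n]; }

  void fast_iload2(std::size_t first, std::size_t second) {
    eax = local(first);
    stack.push_back(eax); // push %rax: only the first value reaches the stack
    eax = local(second);  // the second value stays in the cache
  }

  void iadd() {
    std::int32_t edx = stack.back(); // mov (%rsp),%edx
    stack.pop_back();                // add $0x8,%rsp
    eax += edx;                      // add %edx,%eax
  }
};

// Runs the body of add(a, b) up to, but not including, ireturn.
inline std::int32_t run_add(std::int32_t a, std::int32_t b) {
  // Memory order is [b, a, this]; local 0 (this) sits at the highest position.
  ToyInterp in{{b, a, 0}, 2, {}, 0};
  in.fast_iload2(1, 2);
  in.iadd();
  return in.eax;
}
```

Calling run_add(2, 3) in this model leaves 5 in eax and an empty expression stack, which is the state ireturn starts from.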

3. Exit the add() method

The ireturn bytecode instruction is then executed to pop the stack frame of the add() method. For this example, the assembly code executed is as follows:

// Load JavaThread::do_not_unlock_if_synchronized into %dl
0x00007fffe101b770: mov 0x2ad(%r15),%dl
// Reset JavaThread::do_not_unlock_if_synchronized to false
0x00007fffe101b777: movb $0x0,0x2ad(%r15)

// Load Method* into %rbx
0x00007fffe101b77f: mov -0x18(%rbp),%rbx
// Load Method::_access_flags into %ecx
0x00007fffe101b783: mov 0x28(%rbx),%ecx
// Check whether Method::_access_flags contains JVM_ACC_SYNCHRONIZED
0x00007fffe101b786: test $0x20,%ecx
// If the method is not synchronized, jump to ----unlocked----
0x00007fffe101b78c: je 0x00007fffe101b970
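The synchronized test in the listing above is a plain bit test on the access flags. A minimal sketch, using the access flag values of the class file format; the helper name needs_unlock is made up for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Access flag values as defined for methods in the class file format.
constexpr std::uint32_t kAccPublic       = 0x0001;
constexpr std::uint32_t kAccStatic       = 0x0008;
constexpr std::uint32_t kAccSynchronized = 0x0020; // test $0x20,%ecx

// True when the method return path has a monitor to release.
constexpr bool needs_unlock(std::uint32_t access_flags) {
  return (access_flags & kAccSynchronized) != 0;
}

static_assert(!needs_unlock(kAccPublic), "add() is declared ACC_PUBLIC only");
static_assert(!needs_unlock(kAccPublic | kAccStatic), "main() is ACC_PUBLIC, ACC_STATIC");
static_assert(needs_unlock(kAccPublic | kAccSynchronized), "a synchronized method");
```

Both methods of the example are declared without synchronized, so for them the jump to unlocked is taken.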

The assembly implementation at unlocked is as follows:

// Load the old stack pointer (saved rsp) stored at -0x8(%rbp) into %rbx
0x00007fffe101bac7: mov -0x8(%rbp),%rbx

// Remove the stack frame
// The leave instruction is equivalent to:
// mov %rbp, %rsp
// pop %rbp
0x00007fffe101bacb: leaveq
// Pop the return address into %r13
0x00007fffe101bacc: pop %r13
// Set %rsp to the caller's stack top
0x00007fffe101bace: mov %rbx,%rsp
0x00007fffe101bad1: jmpq *%r13

After the leaveq instruction has removed the frame, the stack state is as shown in the figure below.

The return address is then popped, and control jumps to it, that is, to the routine whose address is stored in the TemplateInterpreter::invoke_return_entry array.
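The frame removal sequence can be traced on a word-addressed toy memory. The sketch below is an illustration under stated assumptions: Regs and remove_frame are hypothetical names, addresses are vector positions rather than bytes, and the frame layout (sender sp one word below the saved rbp, return address one word above it) is the one implied by the listing above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Regs {
  std::size_t rsp;   // stack pointer (word position in mem)
  std::size_t rbp;   // frame pointer (word position in mem)
  std::uint64_t r13; // receives the return address
};

// A higher position in `mem` is a higher address; the stack grows toward 0.
inline Regs remove_frame(const std::vector<std::uint64_t>& mem, Regs r) {
  std::size_t rbx = static_cast<std::size_t>(mem[r.rbp - 1]); // mov -0x8(%rbp),%rbx
  r.rsp = r.rbp;                                   // leave: mov %rbp,%rsp
  r.rbp = static_cast<std::size_t>(mem[r.rsp]);    //        pop %rbp
  r.rsp += 1;
  r.r13 = mem[r.rsp];                              // pop %r13 (return address)
  r.rsp += 1;
  r.rsp = rbx;                                     // mov %rbx,%rsp
  return r;                                        // jmpq *%r13 would follow
}
```

Whatever the value of rsp after the two pops, it is overwritten by the saved sender sp, so the caller always resumes with the stack top it recorded before the call.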

4. Execute the return routine

For the example, the assembly code generated when the passed state is itos is as follows:

// Load -0x10(%rbp) into %rsp, then clear -0x10(%rbp)
0x00007fffe1006ce0: mov -0x10(%rbp),%rsp // update rsp
0x00007fffe1006ce4: movq $0x0,-0x10(%rbp) // clear that slot in the stack frame
// Restore bcp and locals: %r13 points to bcp, %r14 points to the local variable table
0x00007fffe1006cec: mov -0x38(%rbp),%r13
0x00007fffe1006cf0: mov -0x30(%rbp),%r14
// Load the ConstantPoolCacheEntry index into %ecx
0x00007fffe1006cf4: movzwl 0x1(%r13),%ecx

// Load the ConstantPoolCache stored at -0x28(%rbp) into %rbx
0x00007fffe1006cf9: mov -0x28(%rbp),%rbx
// shl is a logical left shift; it converts the entry index into a word offset
0x00007fffe1006cfd: shl $0x2,%ecx
// Load the _flags field of the ConstantPoolCacheEntry
0x00007fffe1006d00: mov 0x28(%rbx,%rcx,8),%ebx
// Extract the parameter size held in the low 8 bits of _flags
0x00007fffe1006d04: and $0xff,%ebx

// lea stores the computed address in %rsp, restoring the stack to its state before the call
0x00007fffe1006d0a: lea (%rsp,%rbx,8),%rsp

// Dispatch to the next instruction
0x00007fffe1006d0e: movzbl 0x3(%r13),%ebx
0x00007fffe1006d13: add $0x3,%r13
0x00007fffe1006d17: movabs $0x7ffff73b7ca0,%r10
0x00007fffe1006d21: jmpq *(%r10,%rbx,8)

This assembly code also unwinds the stack. Its most important job is to pop the arguments that were pushed for the interpreted call off the expression stack; it then dispatches to pop, the instruction that follows invokevirtual in the main() method. The stack state at this point is shown in the figure below.

Note that the result of calling add() is now held in the top-of-stack cache, so the jump to the next instruction, pop, must go through the iep entry of pop for execution to be correct.
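The final dispatch of the return routine reads the opcode that follows the 3-byte invokevirtual and advances bcp past it. A minimal sketch of that step is shown below. Dispatch, dispatch_next and kMainTail are hypothetical names, and the operand bytes 0x00 0x04 are only a stand-in for the index that follows the opcode; the opcodes themselves (invokevirtual 0xb6, pop 0x57, return 0xb1) are the standard JVM values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Tail of main() starting at bytecode index 11: invokevirtual <2-byte index>, pop, return.
constexpr std::uint8_t kMainTail[] = {0xb6, 0x00, 0x04, 0x57, 0xb1};

struct Dispatch {
  std::size_t bcp;      // new bytecode pointer, relative to kMainTail
  std::uint8_t opcode;  // opcode used to index the dispatch table for the current state
};

inline Dispatch dispatch_next(const std::uint8_t* code, std::size_t bcp, std::size_t step) {
  std::uint8_t next = code[bcp + step]; // movzbl 0x3(%r13),%ebx
  return Dispatch{bcp + step, next};    // add $0x3,%r13 ; jmpq *(%r10,%rbx,8)
}
```

In the generated code the table base in %r10 is the one that belongs to the itos state, which is how the jump lands on the iep entry of pop rather than on its vtos entry.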

5. Exit the main() method

When the pop instruction is executed, it will enter from the iep entry, and the executed assembly code is as follows:

// iep
push %rax

// ...

add    $0x8,%rsp

Since main() does not use the value returned by add(), the result is simply popped from the expression stack of main(). Next, the return instruction is executed. Its assembly code is as follows:

// Load JavaThread::do_not_unlock_if_synchronized into %dl
0x00007fffe101b770: mov 0x2ad(%r15),%dl
// Reset JavaThread::do_not_unlock_if_synchronized to false
0x00007fffe101b777: movb $0x0,0x2ad(%r15)

// Load Method* into %rbx
0x00007fffe101b77f: mov -0x18(%rbp),%rbx
// Load Method::_access_flags into %ecx
0x00007fffe101b783: mov 0x28(%rbx),%ecx
// Check whether Method::_access_flags contains JVM_ACC_SYNCHRONIZED
0x00007fffe101b786: test $0x20,%ecx
// If the method is not synchronized, jump to ----unlocked----
0x00007fffe101b78c: je 0x00007fffe101b970

The main() method is not a synchronized method, so execution jumps to unlocked. The unlocked code contains some logic for releasing locks, which is not important for this example. Let's look directly at the code that pops the frame, as follows:

// Load the old stack pointer (saved rsp) stored at -0x8(%rbp) into %rbx
0x00007fffe101bac7: mov -0x8(%rbp),%rbx

// Remove the stack frame
// The leave instruction is equivalent to:
// mov %rbp, %rsp
// pop %rbp
0x00007fffe101bacb: leaveq
// Pop the return address into %r13
0x00007fffe101bacc: pop %r13
// Set %rsp to the caller's stack top
0x00007fffe101bace: mov %rbx,%rsp
0x00007fffe101bad1: jmpq *%r13

The final stack state is shown in the figure below.

The return address here is the return address of the C++ caller. How the remaining stack frames are unwound and how the call finally ends is handled on the C++ side.