Introduction to ASM
ASM is a general Java bytecode manipulation and analysis framework, which can be used to modify existing classes or directly generate classes dynamically in binary form. ASM provides some common bytecode conversion and analysis algorithms, from which you can build customized complex conversion and code analysis tools. ASM provides functions similar to other Java bytecode frameworks, but focuses on performance. Because its design and implementation are as small and fast as possible, it is very suitable for use in dynamic systems (of course, it can also be used in a static way, such as in a compiler).
ASM is used in many projects, including the following:
- OpenJDK, to generate lambda call sites, and in the Nashorn compiler;
- the Groovy compiler and the Kotlin compiler;
- Cobertura and Jacoco, to instrument classes in order to measure code coverage;
- CGLIB, to dynamically generate proxy classes;
- Gradle, to generate some classes at runtime;
For more information, see the official website: https://asm.ow2.io/
IDE plugin
ASM manipulates bytecode directly. If you are not familiar with the bytecode instruction set, writing ASM code by hand is very difficult. For this reason, Bytecode Outline plug-ins are available for mainstream IDEs:
- IDEA: ASM Bytecode Outline
- Eclipse: BytecodeOutline
Taking IDEA as an example, right-click inside the class and choose Show Bytecode outline; the result looks roughly like the figure below:
The panel contains three tabs:
- Bytecode: the bytecode file corresponding to the class;
- ASMified: the ASM code that generates the corresponding bytecode;
- Groovified: the bytecode instructions corresponding to the class;
ASM API
The ASM library provides two APIs for generating and transforming compiled classes: the core API, which represents a class as a sequence of events, and the tree API, which represents a class as a tree of objects. They can be compared to the two ways of parsing XML files, SAX and DOM: the core API corresponds to SAX and the tree API corresponds to DOM. Each has its own advantages and disadvantages:
- Event-based APIs are faster than object-based APIs and require less memory. However, when using event-based APIs, the implementation of class conversion may be more difficult;
- The object-based API will load the entire class into memory;
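The SAX/DOM comparison can be tried out without ASM at all. The following is a minimal sketch using the XML parsers that ship with the JDK; the XML string and the class name SaxVsDomSketch are invented for illustration. The SAX version reacts to events and keeps nothing, while the DOM version builds the whole tree before querying it, which mirrors the trade-off between the core API and the tree API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SaxVsDomSketch {
    // Hypothetical document describing a class with two fields and one method.
    static final String XML = "<class name=\"A\"><field/><field/><method/></class>";

    private static InputStream input() {
        return new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8));
    }

    // Event-based (SAX): react to each element as it streams past; nothing is retained.
    static int countFieldsWithSax() {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("field".equals(qName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(input(), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    // Object-based (DOM): build the whole tree in memory, then query it.
    static int countFieldsWithDom() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(input());
            return doc.getElementsByTagName("field").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countFieldsWithSax() + " " + countFieldsWithDom());
    }
}
```

Both methods report the same count; the difference lies in how much of the document is held in memory while they work.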
The ASM library is organized in several packages, which are distributed in several separate JAR files:
- org.objectweb.asm and org.objectweb.asm.signature packages: define the event-based API and provide the class parser and writer components; contained in asm.jar;
- org.objectweb.asm.util package: provides various tools based on the core API that can be used while developing and debugging ASM applications; contained in asm-util.jar;
- org.objectweb.asm.commons package: provides several useful predefined class transformers, mostly based on the core API; contained in asm-commons.jar;
- org.objectweb.asm.tree package: defines the object-based API and provides tools for converting between the event-based and object-based representations; contained in asm-tree.jar;
- org.objectweb.asm.tree.analysis package: provides a class analysis framework based on the tree API and several predefined class analyzers; contained in asm-analysis.jar;
Core API
Before learning the core API, it is recommended to understand the visitor pattern, because ASM's manipulation and analysis of bytecode are based on it;
Visitor pattern
The visitor pattern suggests putting new behavior into a separate visitor class instead of trying to integrate it into the existing classes. The original object on which the operation is to be performed is passed as a parameter to a method of the visitor, so that the method can access all the necessary data contained in the object; common application scenarios:
- If you need to perform an operation on all elements of a complex object structure (such as an object tree), you can use the visitor pattern;
- The visitor pattern can be used to move auxiliary behavior out of the business logic;
- The pattern can be used when a behavior is meaningful only for some classes of a class hierarchy and not for the others;
Bytecode is exactly such a complex object structure. SQL parsing in Sharding-Jdbc also uses the visitor pattern. What these cases have in common is data with a relatively stable structure and a fixed syntax;
For more information, see: visitor pattern
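To make the idea concrete, here is a minimal visitor sketch in plain Java. The Shape, Circle, Square and AreaVisitor types are hypothetical and exist only for this illustration; they are not part of ASM. The point is that the area computation is added without touching the element classes, which only expose an accept method.

```java
import java.util.List;

final class VisitorSketch {
    // Runs one visitor over every element of the structure.
    static double totalArea(List<Shape> shapes) {
        AreaVisitor visitor = new AreaVisitor();
        double total = 0;
        for (Shape shape : shapes) {
            total += shape.accept(visitor);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalArea(List.of(new Square(2), new Square(3))));
    }
}

// Element side: each element only knows how to hand itself to a visitor.
interface Shape {
    <R> R accept(ShapeVisitor<R> visitor);
}

// Visitor side: one method per element type.
interface ShapeVisitor<R> {
    R visitCircle(Circle circle);
    R visitSquare(Square square);
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    @Override public <R> R accept(ShapeVisitor<R> visitor) { return visitor.visitCircle(this); }
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    @Override public <R> R accept(ShapeVisitor<R> visitor) { return visitor.visitSquare(this); }
}

// The new behavior lives here, not in Circle or Square.
final class AreaVisitor implements ShapeVisitor<Double> {
    @Override public Double visitCircle(Circle c) { return Math.PI * c.radius * c.radius; }
    @Override public Double visitSquare(Square s) { return s.side * s.side; }
}
```

ASM follows the same split: the class file plays the role of the element structure, and the visitor classes carry the behavior.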
Class
The visitor pattern has two core roles: the independent visitor and the event generator that accepts it. The two corresponding core classes in ASM are ClassVisitor and ClassReader, which are introduced below;
ClassVisitor
The ASM API used to generate and transform compiled classes is based on the ClassVisitor abstract class. Each method in this class corresponds to the class file structure of the same name:
public abstract class ClassVisitor {
    public ClassVisitor(int api);
    public ClassVisitor(int api, ClassVisitor cv);
    public void visit(int version, int access, String name, String signature, String superName, String[] interfaces);
    public void visitSource(String source, String debug);
    public void visitOuterClass(String owner, String name, String desc);
    public AnnotationVisitor visitAnnotation(String desc, boolean visible);
    public void visitAttribute(Attribute attr);
    public void visitInnerClass(String name, String outerName, String innerName, int access);
    public FieldVisitor visitField(int access, String name, String desc, String signature, Object value);
    public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions);
    public void visitEnd();
}
Sections whose content can be of arbitrary length and complexity are visited through auxiliary visitor classes returned by the corresponding methods, mainly AnnotationVisitor, FieldVisitor and MethodVisitor. For more information, please refer to the Java Virtual Machine Specification;
All the above methods are called by ClassReader, and the arguments of all methods are supplied by ClassReader. The methods are called in a fixed order:
visit visitSource? visitOuterClass? ( visitAnnotation | visitAttribute )* ( visitInnerClass | visitField | visitMethod )* visitEnd
visit is called first, followed by at most one call to visitSource, then at most one call to visitOuterClass, then any number of calls to visitAnnotation and visitAttribute in any order, then any number of calls to visitInnerClass, visitField and visitMethod in any order, and finally a single call to visitEnd.
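Because the call order is written as a small grammar, it can be checked mechanically. The sketch below is an illustration only and is not how ASM enforces the order: it encodes the grammar as a java.util.regex pattern over event names and tests whether a recorded sequence of calls matches. The class name VisitOrderSketch and the idea of recording calls as strings are assumptions made for this example.

```java
import java.util.List;
import java.util.regex.Pattern;

final class VisitOrderSketch {
    // Each event name is followed by one space so that "visit" cannot match a prefix
    // of a longer name such as "visitSource".
    private static final Pattern ORDER = Pattern.compile(
            "visit (visitSource )?(visitOuterClass )?"
            + "((visitAnnotation|visitAttribute) )*"
            + "((visitInnerClass|visitField|visitMethod) )*"
            + "visitEnd ");

    // Returns true if the recorded calls respect the ClassVisitor call order.
    static boolean isValidOrder(List<String> events) {
        String sequence = String.join(" ", events) + " ";
        return ORDER.matcher(sequence).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidOrder(List.of("visit", "visitSource", "visitField", "visitEnd")));
        System.out.println(isValidOrder(List.of("visit", "visitField", "visitSource", "visitEnd")));
    }
}
```

The second sequence in main is rejected because visitSource arrives after a field has already been visited.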
ClassReader
The main function of this class is to read the bytecode file and then notify the ClassVisitor of the data it has read. The bytecode can be passed in several ways:
- public ClassReader(final InputStream inputStream): as a byte stream;
- public ClassReader(final String className): by the fully qualified name of the class, which is then read as a resource from the class path;
- public ClassReader(final byte[] classFile): as the binary content of the class file;
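To get a feeling for what a class reader has to parse, the following sketch reads the first bytes of a class file using only the JDK. It locates the class as a resource on the class path and decodes the magic number and version fields defined by the class file format. The class name ClassFileHeaderSketch and the method readHeader are hypothetical; a real ClassReader goes on to parse the constant pool, fields, methods and attributes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassFileHeaderSketch {
    // Returns {magic, minorVersion, majorVersion} for the named class.
    static int[] readHeader(String className) {
        String resource = className.replace('.', '/') + ".class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("Class not found: " + className);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // 0xCAFEBABE for every class file
            int minor = data.readUnsignedShort();  // minor_version
            int major = data.readUnsignedShort();  // major_version, e.g. 52 for Java 8
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java.lang.String");
        System.out.printf("magic=%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```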
The common usage is as follows:
ClassReader classReader = new ClassReader("com/zh/asm/TestService");
ClassWriter classVisitor = new ClassWriter(ClassWriter.COMPUTE_MAXS);
classReader.accept(classVisitor, 0);
Besides the visitor, the accept method of ClassReader takes a parsingOptions parameter, whose options include:
- SKIP_CODE: skip the compiled code (useful if you only need the class structure);
- SKIP_DEBUG: do not visit debugging information, nor create artificial labels for it;
- SKIP_FRAMES: skip the stack map frames;
- EXPAND_FRAMES: expand the stack map frames;
ClassWriter
The example above uses ClassWriter, which inherits from ClassVisitor. It is mainly used to generate classes and can be used on its own, as shown below:
// Constants such as V1_5 and ACC_PUBLIC come from org.objectweb.asm.Opcodes (static import)
ClassWriter cw = new ClassWriter(0);
cw.visit(V1_5, ACC_PUBLIC + ACC_ABSTRACT + ACC_INTERFACE, "pkg/Comparable", null, "java/lang/Object", new String[]{"pkg/Mesurable"});
cw.visitField(ACC_PUBLIC + ACC_FINAL + ACC_STATIC, "LESS", "I", null, Integer.valueOf(-1)).visitEnd();
cw.visitField(ACC_PUBLIC + ACC_FINAL + ACC_STATIC, "EQUAL", "I", null, Integer.valueOf(0)).visitEnd();
cw.visitField(ACC_PUBLIC + ACC_FINAL + ACC_STATIC, "GREATER", "I", null, Integer.valueOf(1)).visitEnd();
cw.visitMethod(ACC_PUBLIC + ACC_ABSTRACT, "compareTo", "(Ljava/lang/Object;)I", null, null).visitEnd();
cw.visitEnd();
byte[] b = cw.toByteArray();
// Write the generated class to a file
try (FileOutputStream fileOutputStream = new FileOutputStream(new File("F:/asm/Comparable.class"))) {
    fileOutputStream.write(b);
}
The code above generates the class with ClassWriter, converts it into a byte array, and finally writes it to a file through FileOutputStream. The decompilation result is as follows:
package pkg;
public interface Comparable extends Mesurable {
    int LESS = -1;
    int EQUAL = 0;
    int GREATER = 1;
    int compareTo(Object var1);
}
When instantiating ClassWriter, you need to provide a flags parameter, whose options include:
- COMPUTE_MAXS: the sizes of the local variables and of the operand stack are computed for you; you must still call visitMaxs, but you can pass any arguments: they are ignored and recomputed; with this option you must still compute the frames yourself;
- COMPUTE_FRAMES: everything is computed automatically; you no longer need to call visitFrame, but visitMaxs must still be called (its arguments are ignored and recomputed);
- 0: nothing is computed automatically; you must compute the frames and the sizes of the local variables and operand stack yourself;
The example above uses ClassWriter alone, but combining the three core classes is more useful. Let's focus on the conversion operation;
Conversion operation
A ClassVisitor is placed between the class reader and the class writer, chaining the three together. The general code structure is as follows:
ClassReader classReader = new ClassReader("com/zh/asm/TestService");
ClassWriter classWriter = new ClassWriter(ClassWriter.COMPUTE_MAXS);
// Processing
ClassVisitor classVisitor = new AddFieldAdapter(classWriter...);
classReader.accept(classVisitor, 0);
The corresponding architecture of the above code is shown in the figure below:
An adapter for adding a field is provided here. It overrides the visitEnd method and adds the new field there. The code is as follows:
public class AddFieldAdapter extends ClassVisitor {
    private int fAcc;
    private String fName;
    private String fDesc;
    // Whether a field with the same name already exists
    private boolean isFieldPresent;

    public AddFieldAdapter(ClassVisitor cv, int fAcc, String fName,
                           String fDesc) {
        super(ASM4, cv);
        this.fAcc = fAcc;
        this.fName = fName;
        this.fDesc = fDesc;
    }

    @Override
    public FieldVisitor visitField(int access, String name, String desc,
                                   String signature, Object value) {
        // Check for a field with the same name; the new field is added in visitEnd only if none exists
        if (name.equals(fName)) {
            isFieldPresent = true;
        }
        return cv.visitField(access, name, desc, signature, value);
    }

    @Override
    public void visitEnd() {
        if (!isFieldPresent) {
            FieldVisitor fv = cv.visitField(fAcc, fName, fDesc, null, null);
            if (fv != null) {
                fv.visitEnd();
            }
        }
        cv.visitEnd();
    }
}
Given the order in which the methods of ClassVisitor are called, if the class has multiple fields, visitField is called multiple times; each call checks whether the field to be added already exists and records the result in the isFieldPresent flag, so that the final visitEnd call can decide whether the new field needs to be added;
ClassVisitor classVisitor = new AddFieldAdapter(classWriter,ACC_PUBLIC + ACC_FINAL + ACC_STATIC,"id","I");
A public static final int id field is added here; you can write the byte array into the class file, and then decompile and view it:
public class TestService {
    public static final int id;
    ......
}
Tools
In addition to the core classes above, ASM also provides some tool classes for the convenience of users:
- Type: a Type object represents a Java type and can be constructed from a type descriptor or from a Class object; the Type class also contains static variables that represent the primitive types;
- TraceClassVisitor: extends the ClassVisitor class and builds a textual representation of the visited class; use TraceClassVisitor to obtain a readable trace of what is actually generated;
- CheckClassAdapter: the ClassWriter class does not verify that its methods are called in the appropriate order and with valid arguments, so it may generate invalid classes that are rejected by the Java virtual machine verifier. To detect some of these errors as early as possible, you can use the CheckClassAdapter class;
- ASMifier: this class provides an alternative backend for the TraceClassVisitor tool (which by default uses a Textifier backend that produces a textual trace). This backend makes each method of the TraceClassVisitor class print the Java code that was used to call it.
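Type descriptors such as "(Ljava/lang/Object;)I" appear throughout ASM code, and they are easy to mistype by hand. As a sketch that needs nothing but the JDK, java.lang.invoke.MethodType can produce the same descriptor strings from Class objects; the class name DescriptorSketch is invented here, and this is shown as an analogy to what the Type utility is for rather than as a use of the ASM API itself.

```java
import java.lang.invoke.MethodType;

final class DescriptorSketch {
    // Descriptor of a method declared as: int compareTo(Object)
    static String compareToDescriptor() {
        return MethodType.methodType(int.class, Object.class).toMethodDescriptorString();
    }

    // Descriptor of a method declared as: void println(String)
    static String printlnDescriptor() {
        return MethodType.methodType(void.class, String.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(compareToDescriptor());
        System.out.println(printlnDescriptor());
    }
}
```

The two results are the descriptors used for compareTo in the ClassWriter example and for println in the MethodVisitor example of this article.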
Method
As mentioned in the introduction to ClassVisitor above, sections of arbitrary complexity are visited through auxiliary visitor classes, including AnnotationVisitor, FieldVisitor and MethodVisitor. Before introducing MethodVisitor, let's look at the Java virtual machine execution model;
Execution model
When a method is executed, the Java virtual machine creates a stack frame (Stack Frame) at the same time to store information such as the local variable table, the operand stack, dynamic linking and the method exit. The process from a method being called to its completion corresponds to a stack frame being pushed onto and popped off the virtual machine stack;
- Local variable table: contains variables that can be accessed in random order by their indexes;
- Operand stack: a stack of values that bytecode instructions use as operands;
Consider an execution stack with 3 frames:
The first frame: contains 3 local variables, the maximum size of its operand stack is 4, and it currently contains 2 values;
The second frame: contains 2 local variables, the maximum size of its operand stack is 3, and it currently contains 2 values;
The third frame: contains 4 local variables, the maximum size of its operand stack is 2, and it currently contains 2 values;
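The structure of a frame can be modelled in a few lines. The sketch below is a deliberate simplification that stores only int values; the class FrameSketch and its methods are hypothetical. It shows the two parts of a frame and why the maximum stack size matters: pushing beyond it is an error, which is the quantity that visitMaxs declares for a real method.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class FrameSketch {
    private final int[] locals;                              // local variable table, addressed by index
    private final Deque<Integer> stack = new ArrayDeque<>(); // operand stack
    private final int maxStack;                              // declared maximum operand stack size

    FrameSketch(int maxLocals, int maxStack) {
        this.locals = new int[maxLocals];
        this.maxStack = maxStack;
    }

    void setLocal(int index, int value) { locals[index] = value; }

    int getLocal(int index) { return locals[index]; }

    void push(int value) {
        if (stack.size() == maxStack) {
            throw new IllegalStateException("operand stack overflow");
        }
        stack.push(value);
    }

    int pop() { return stack.pop(); }

    int depth() { return stack.size(); }

    // Builds a frame shaped like the first frame above: 3 locals, max stack 4, 2 values on the stack.
    static FrameSketch firstFrame() {
        FrameSketch frame = new FrameSketch(3, 4);
        frame.setLocal(0, 1);
        frame.setLocal(1, 2);
        frame.setLocal(2, 3);
        frame.push(10);
        frame.push(20);
        return frame;
    }

    public static void main(String[] args) {
        FrameSketch frame = firstFrame();
        System.out.println("depth=" + frame.depth() + " top=" + frame.pop());
    }
}
```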
Bytecode instructions
A bytecode instruction consists of an opcode that identifies the instruction and a fixed number of arguments:
- Opcode: an unsigned byte value, identified by a mnemonic symbol. For example, the opcode value 0 is designated by the mnemonic NOP and corresponds to an instruction that does nothing.
- Arguments: static values that define the precise behavior of the instruction. They are given immediately after the opcode.
Bytecode instructions are divided into two categories:
- A small number of instructions transfer values between the local variables and the operand stack;
- The other instructions act only on the operand stack: they pop some values from the stack, compute a result from these values, and push it back onto the stack;
Local variable instructions:
- ILOAD: used to load a boolean, byte, char, short or int local variable;
- LLOAD, FLOAD, DLOAD: used to load long, float or double values respectively;
- ALOAD: used to load any non-primitive value, that is, object and array references;
Operand stack instructions:
- ISTORE: pops a boolean, byte, char, short or int value from the operand stack and stores it in the local variable designated by its index i;
- LSTORE, FSTORE, DSTORE: pop long, float or double values respectively;
- ASTORE: used to pop any non-primitive value;
- GETFIELD, PUTFIELD: GETFIELD owner name desc pops an object reference and pushes the value of its name field; PUTFIELD owner name desc pops a value and an object reference, and stores the value in its name field. In both cases, the object must be of type owner, and its field must be of type desc. GETSTATIC and PUTSTATIC are similar instructions, but for static fields;
- INVOKEVIRTUAL, INVOKESTATIC, INVOKESPECIAL, INVOKEINTERFACE, INVOKEDYNAMIC: INVOKEVIRTUAL owner name desc calls the name method defined in the owner class, whose method descriptor is desc. INVOKESTATIC is used for static methods, INVOKESPECIAL for private methods and constructors, and INVOKEINTERFACE for methods defined in interfaces. Finally, for Java 7 classes, INVOKEDYNAMIC is used for the new dynamic method invocation mechanism.
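How load, store and arithmetic instructions cooperate is easiest to see by executing a few of them. The following toy interpreter is a sketch under strong simplifications: it supports only four int instructions, and the Op enum, the Insn class and the run method are all invented for this example. It evaluates c = (a + b) * a with a, b and c held in local variable slots 0, 1 and 2.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MiniInterpreterSketch {
    enum Op { ILOAD, ISTORE, IADD, IMUL }

    // One instruction: an opcode plus a single argument (a local variable index, unused by IADD/IMUL).
    static final class Insn {
        final Op op;
        final int arg;
        Insn(Op op, int arg) {
            this.op = op;
            this.arg = arg;
        }
    }

    // Executes the instructions against the given local variables and returns them.
    static int[] run(Insn[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Insn insn : code) {
            switch (insn.op) {
                case ILOAD:  stack.push(locals[insn.arg]); break;       // local variable -> stack
                case ISTORE: locals[insn.arg] = stack.pop(); break;     // stack -> local variable
                case IADD:   stack.push(stack.pop() + stack.pop()); break;
                case IMUL:   stack.push(stack.pop() * stack.pop()); break;
            }
        }
        return locals;
    }

    // c = (a + b) * a, with a in slot 0, b in slot 1 and c in slot 2.
    static int example(int a, int b) {
        Insn[] code = {
            new Insn(Op.ILOAD, 0),
            new Insn(Op.ILOAD, 1),
            new Insn(Op.IADD, 0),
            new Insn(Op.ILOAD, 0),
            new Insn(Op.IMUL, 0),
            new Insn(Op.ISTORE, 2),
        };
        return run(code, new int[] {a, b, 0})[2];
    }

    public static void main(String[] args) {
        System.out.println(example(2, 3));
    }
}
```

The operand stack never holds more than two values in this program, so a real method with this body would declare a maximum stack size of 2.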
MethodVisitor
The ASM API used to generate and transform compiled methods is based on the MethodVisitor abstract class, which is returned by the visitMethod method of ClassVisitor. This class also defines one method per bytecode instruction category, where the categories are determined by the number and types of the arguments of the instructions. These methods are called in the following order:
visitAnnotationDefault? ( visitAnnotation | visitParameterAnnotation | visitAttribute )* ( visitCode ( visitTryCatchBlock | visitLabel | visitFrame | visitXxxInsn | visitLocalVariable | visitLineNumber )* visitMaxs )? visitEnd
Let's look at an example of transforming an existing method by adding a start log and an end log to it;
Prepare the class to be transformed; logs will be added before and after the query method:
public class TestService {
    public void query(int param) {
        System.out.println("service handle...");
    }
}
Override visitMethod in ClassVisitor:
public class MyClassVisitor extends ClassVisitor implements Opcodes {

    public MyClassVisitor(ClassVisitor cv) {
        super(ASM5, cv);
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String desc,
                                     String signature, String[] exceptions) {
        MethodVisitor methodVisitor = cv.visitMethod(access, name, desc, signature, exceptions);
        // Wrap every method except the constructor
        if (!name.equals("<init>") && methodVisitor != null) {
            methodVisitor = new MyMethodVisitor(methodVisitor);
        }
        return methodVisitor;
    }
}
The <init> method is filtered out; every other method is wrapped in a MyMethodVisitor, which overrides the relevant MethodVisitor methods;
Override MethodVisitor:
public class MyMethodVisitor extends MethodVisitor implements Opcodes {

    public MyMethodVisitor(MethodVisitor mv) {
        // Same API level as MyClassVisitor
        super(Opcodes.ASM5, mv);
    }

    @Override
    public void visitCode() {
        super.visitCode();
        // Print "start" when the method begins
        mv.visitFieldInsn(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
        mv.visitLdcInsn("start");
        mv.visitMethodInsn(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V", false);
    }

    @Override
    public void visitInsn(int opcode) {
        if ((opcode >= Opcodes.IRETURN && opcode <= Opcodes.RETURN) || opcode == Opcodes.ATHROW) {
            // Print "end" before the method returns
            mv.visitFieldInsn(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
            mv.visitLdcInsn("end");
            mv.visitMethodInsn(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V", false);
        }
        mv.visitInsn(opcode);
    }
}
The visitCode method is called before the method's instructions are visited, so the start log is inserted there; visitInsn needs to determine whether the instruction is a method return. A method normally executes mv.visitInsn(RETURN) before returning, so the return can be detected from the opcode;
View the newly generated bytecode file:
public class TestService {
    public TestService() {
    }

    public void query(int var1) {
        System.out.println("start");
        System.out.println("service handle...");
        System.out.println("end");
    }
}
Tools
Some tools are also provided for methods:
- LocalVariablesSorter: this method adapter renumbers the local variables used in a method in the order in which they appear in the method. You can also use its newLocal method to create a new local variable;
- AdviceAdapter: this method adapter is an abstract class that can be used to insert code at the beginning of a method and just before any RETURN or ATHROW instruction; its main advantage is that it also works for constructors, where the code must not be inserted at the very beginning of the constructor, but after the call to the super constructor.
Usage scenarios
ASM is used in many projects; here are two common usage scenarios: AOP and replacing reflection;
AOP
Aspect-oriented programming is mainly used to handle system-level concerns in program development, such as logging, transactions and permissions; the key technology is the proxy, which comes in dynamic and static forms, and there are many ways to implement it:
- AspectJ: uses static weaving, based on the static proxy principle;
- JDK dynamic proxy: the two core classes of the JDK dynamic proxy are Proxy and InvocationHandler;
- Cglib dynamic proxy: encapsulates ASM and can dynamically generate new classes; it is more powerful than the JDK dynamic proxy;
Among them, the Cglib dynamic proxy is implemented on top of ASM. The example above also demonstrated the bytecode enhancement capability of ASM;
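For comparison with the bytecode approach, the same start/end logging can be sketched with the JDK dynamic proxy, which needs an interface but no bytecode library. The QueryService interface, its implementation and the withLogging helper are hypothetical names chosen for this illustration; the log is collected in a list rather than printed so that the behavior is easy to inspect.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class JdkProxySketch {
    // Hypothetical service interface and implementation, invented for this sketch.
    interface QueryService {
        String query(int param);
    }

    static final class QueryServiceImpl implements QueryService {
        @Override
        public String query(int param) {
            return "result-" + param;
        }
    }

    // Wraps the target in a proxy that records "start" and "end" around every call.
    static QueryService withLogging(QueryService target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add("start " + method.getName());
            try {
                return method.invoke(target, args);
            } finally {
                log.add("end " + method.getName());
            }
        };
        return (QueryService) Proxy.newProxyInstance(
                QueryService.class.getClassLoader(),
                new Class<?>[] {QueryService.class},
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        QueryService service = withLogging(new QueryServiceImpl(), log);
        System.out.println(service.query(7));
        System.out.println(log);
    }
}
```

Unlike the ASM transformation shown earlier, this approach leaves the original class untouched and only intercepts calls made through the proxy.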
Replacing reflection
FastJson is known for its speed, and one reason is that it uses ASM instead of Java reflection; there is also a ReflectASM package dedicated to replacing Java reflection;
ReflectASM is a very small Java class library that provides high-performance reflection through code generation, and automatically provides access classes for getting and setting fields. The access classes use generated bytecode instead of Java's reflection mechanism, so they are very fast.
Here is a simple way of using ReflectASM:
TestBean testBean = new TestBean(1, "zhaohui", 18);
MethodAccess methodAccess = MethodAccess.get(TestBean.class);
String[] mns = methodAccess.getMethodNames();
for (int i = 0; i < mns.length; i++) {
    System.out.println(methodAccess.invoke(testBean, mns[i]));
}
The results of the TestBean methods are printed normally here. Why is it fast? Because ASM generates a temporary TestBeanMethodAccess class whose invoke method is overridden internally; the decompiled result is as follows:
public Object invoke(Object var1, int var2, Object... var3) {
    TestBean var4 = (TestBean)var1;
    switch(var2) {
        case 0:
            return var4.getName();
        case 1:
            return var4.getId();
        case 2:
            return var4.getAge();
        default:
            throw new IllegalArgumentException("Method not found: " + var2);
    }
}
As you can see, invoke boils down to an ordinary method call, which is faster than going through Java reflection.
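The difference between the two call paths can be sketched by hand, without generating any bytecode. In the example below, Bean stands in for the article's TestBean, and invokeByIndex is a hand-written version of what a generated access class does; both names are assumptions of this sketch, and no performance figures are claimed. The reflective variant has to look the method up and go through Method.invoke, while the index-based variant is a plain switch over direct calls.

```java
import java.lang.reflect.Method;

final class AccessSketch {
    // Hypothetical bean, standing in for the article's TestBean.
    static final class Bean {
        private final int id;
        private final String name;

        Bean(int id, String name) {
            this.id = id;
            this.name = name;
        }

        public int getId() { return id; }

        public String getName() { return name; }
    }

    // What a generated access class boils down to: method index -> direct call.
    static Object invokeByIndex(Bean bean, int methodIndex) {
        switch (methodIndex) {
            case 0:
                return bean.getName();
            case 1:
                return bean.getId();
            default:
                throw new IllegalArgumentException("Method not found: " + methodIndex);
        }
    }

    // The reflective equivalent: lookup by name, then Method.invoke.
    static Object invokeByReflection(Bean bean, String methodName) {
        try {
            Method method = Bean.class.getMethod(methodName);
            return method.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Bean bean = new Bean(1, "zhaohui");
        System.out.println(invokeByIndex(bean, 0) + " " + invokeByReflection(bean, "getId"));
    }
}
```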