1. Bytecode and reference detection

1.1 Java bytecode

The bytecode discussed in this article is Java bytecode, the instruction format executed by the Java virtual machine. You can view the bytecode of a class with the command javap -c -v xxx.class (where xxx.class is the path to the Class file), as shown in the following figure:

1.2 Bytecode detection

The essence of bytecode detection is to analyze and check the Class files generated by compiling .java or .kt files. Before introducing the principles and practice of bytecode analysis for reference detection, we first describe the background that led to this research.

2. Background of the bytecode reference detection research

The background starts with the software architecture of the app the author is responsible for: the domestic official website app.

2.1 Software architecture of the domestic official website app

The domestic official website app currently has 12 sub-repositories. Each sub-repository is compiled independently into an AAR file, which the app project then consumes. The software architecture is shown in the following figure:

Below the app itself, the light blue layer at the top is the business layer, the green layer in the middle is the component layer, and the dark blue layer at the bottom is the basic framework layer:

  • Business layer: the top layer of the architecture. It contains business modules divided by business line (such as mall, community, and service), each corresponding to a product business.
  • Component layer: basic app functions (such as login and self-upgrade) and components shared across businesses (such as sharing, address management, and video playback). It provides a certain degree of reuse.
  • Basic framework layer: basic components that are completely unrelated to the business (such as third-party frameworks and self-encapsulated general capabilities). It provides full reuse.

2.2 Development model of the domestic official website app client

  • The official website app is currently divided into 3 business lines. Parallel development of several business versions is the norm, so modularization is essential.
  • All modular sub-repositories of the official website app are consumed by the app as AARs, and upper-layer AARs may depend on lower-layer AARs.
  • The modularization and repository optimization work is interleaved with the business versions. Because the business versions are developed in parallel, the underlying repositories will inevitably be modified.
  • When business versions are developed in parallel, generally only the repositories whose code must change for the current version get a new branch. The other repositories continue to be consumed as their old AAR versions.

2.3 Runtime crashes caused by incorrect references to classes, methods, and properties

Assume the following scenario:

During development of version 5.0 of the official website app, the HardWare repository had no business changes, so we kept using its previous version, 4.9.0.0. (During a version's development, usually only the repositories that need changes get a new branch; the rest continue to be used at their old versions.) The Core repository did have code changes, so a new 5.0 branch was created, the code was modified, and the fun1 method of the CoreUtils class was deleted, as shown in the following figure:

Note: the fun1 method of CoreUtils in the Core repository is used by the v4.9.0.0 AAR of the HardWare (hardware detection) module. No other repository, including the main app project, uses fun1.

Question: will the scenario above cause a compilation problem?

Answer: no, compilation succeeds.

The main app repository depends on the AAR compiled from the HardWare repository at version 4.9.0.0. That AAR was already compiled back in version 4.9, so the HardWare code is not compiled again and cannot fail.

The main app repository depends on Core 5.0.0.0, while HardWare depends on Core 4.9.0.0. Dependency resolution picks the higher version, so Core 5.0.0.0 takes part in compiling the app project. The app repository itself never uses the deleted fun1 method, so it compiles without problems either.

Question: will there be a problem at runtime?

Answer: yes.

When the app runs code in the HardWare repository that calls the fun1 method of CoreUtils, it crashes at runtime with Method Not Found.

The Core version packaged into the app is 5.0.0.0, in which fun1 has been deleted, so the call fails at runtime.
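To make the reasoning above concrete, here is a minimal Java sketch of the scenario. It is a simplified model, not Gradle's actual resolution logic: it assumes the conflict between the two requested Core versions is resolved by taking the numerically highest one, and it represents an artifact's API as a plain set of method names. The class ConflictDemo and its methods are hypothetical; only CoreUtils, fun1, and the version numbers come from the scenario.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the scenario; not Gradle's real conflict resolution.
class ConflictDemo {
    // Compares dotted versions such as "4.9.0.0" and "5.0.0.0" part by part, numerically.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Assumption: when several versions of one module are requested, the highest wins.
    static String resolve(List<String> requestedVersions) {
        return Collections.max(requestedVersions, ConflictDemo::compareVersions);
    }

    // Methods a precompiled caller still uses but the resolved version no longer provides.
    static List<String> danglingReferences(Set<String> usedByCaller, Set<String> providedByResolved) {
        Set<String> missing = new TreeSet<>(usedByCaller);
        missing.removeAll(providedByResolved);
        return new ArrayList<>(missing);
    }
}
```

With the requested versions 4.9.0.0 and 5.0.0.0, resolve picks 5.0.0.0. If the HardWare AAR uses CoreUtils.fun1 and Core 5.0.0.0 no longer provides it, danglingReferences reports exactly that method, which is the reference the compiler never sees because the AAR is not recompiled.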

Real cases:

1) Method not found

2) Class not found

Fortunately, the problems above were found and fixed during development and testing. Had they reached production, the app would crash whenever the affected feature ran, which would be very serious.

If all modules of your app are source dependencies, the compiler normally reports reference problems, so there is usually nothing to worry about (unless an underlying SDK you depend on has a reference problem of its own). If your software architecture is similar to that of the official website app, however, you need to pay attention to this issue.

2.4 Analysis of the current situation

Runtime exceptions caused by reference problems have already occurred during local testing. Manual checking is not enough to catch them; an automated tool is needed. Traditional tools such as FindBugs and Lint are static code analysis tools and cannot detect the runtime exceptions that these latent reference problems cause. A self-developed automated checking tool was therefore urgently needed.

3. The bytecode detection solution

The idea is to use an automated tool during APK compilation to check every class in all JAR and AAR packages, verifying that each method called and each field accessed can actually be resolved. Suspected problems are then detected at compile time. If necessary, the build fails immediately and an error log is output to prompt the developer to investigate, which prevents the problem from reaching production and causing runtime exceptions.

Principle: the Java (or Kotlin) classes of each sub-repository are compiled into an AAR or JAR, and the AAR or JAR contains the Class files of all those classes. What we actually need to analyze are these compiled Class files.
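As an illustration of where those Class files live, the following Java sketch lists the class names found in a JAR using only the standard java.util.zip API. It is a sketch under assumptions: JarClassLister and sampleJar are hypothetical helpers, the JAR is passed as a byte array for simplicity, and unpacking an AAR (a zip archive that contains a classes.jar) is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: enumerates the Class files packaged in a JAR.
class JarClassLister {
    // Returns the sorted, dot-separated names of all .class entries in the JAR.
    static List<String> classNames(byte[] jarBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                String name = entry.getName();
                if (!entry.isDirectory() && name.endsWith(".class")) {
                    String withoutSuffix = name.substring(0, name.length() - ".class".length());
                    names.add(withoutSuffix.replace('/', '.'));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    // Builds a tiny in-memory archive with placeholder entries, for demonstration only.
    static byte[] sampleJar(String... entryNames) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(new byte[] {0}); // placeholder content, not real bytecode
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

For an archive built with sampleJar("com/abc/def/CoreUtils.class", "META-INF/MANIFEST.MF"), classNames returns only com.abc.def.CoreUtils. Each such entry is a Class file that a bytecode library can then load and analyze.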

How do we perform bytecode analysis on Class files?

Javassist or ASM is recommended here. The Android build process is driven mainly by Gradle, so to analyze the bytecode of Class files we implement our own Gradle Transform and analyze the Class bytecode inside it. Here we build a Gradle plugin directly.

During compilation, the plugin automatically analyzes the Class bytecode for method references, field references, and class references that cannot be resolved or that the current class has no permission to access. When a problem is found, it stops the build and outputs logs to prompt the developer to analyze it. The plugin also supports configuration.

At this point the overall framework of the solution is fairly clear, as shown in the following figure:

3.1 Principles of method and field reference detection

Identifying method and field reference problems:

How do we identify a method reference problem?

  • The method has been deleted, so no method with that name can be found;
  • No method with the same signature can be found, meaning that the number or types of the parameters do not match;
  • The method is non-public, and the current class does not have permission to access it.
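The first two cases above come down to a lookup by name and exact parameter types. The Java sketch below illustrates that lookup with java.lang.reflect on classes that are already loaded. This is an analogy under assumptions, not the bytecode-level check itself: MethodLookup is a hypothetical class, and the search covers only the class and its superclasses, ignoring methods inherited from interfaces.

```java
import java.lang.reflect.Method;

// Hypothetical helper: looks up a method by name and exact parameter types.
class MethodLookup {
    // Returns the matching method declared in owner or one of its superclasses, or null.
    static Method find(Class<?> owner, String name, Class<?>... parameterTypes) {
        for (Class<?> c = owner; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod(name, parameterTypes);
            } catch (NoSuchMethodException ignored) {
                // Not declared here; continue with the superclass.
            }
        }
        return null;
    }

    static boolean exists(Class<?> owner, String name, Class<?>... parameterTypes) {
        return find(owner, name, parameterTypes) != null;
    }
}
```

For example, exists(String.class, "substring", int.class) is true, while exists(String.class, "substring", String.class) is false because no overload takes a String (the signature mismatch case), and exists(String.class, "fun1") is false because no such method exists at all (the deleted method case).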

How do we identify a field reference problem?

  • The field has been deleted, so it can no longer be found;
  • The field is non-public, and the current class does not have permission to access it.

Access modifier description:
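The access rules behind these permission checks can be sketched in a few lines of Java. This is a deliberately simplified model: AccessRules is a hypothetical class, classes are identified by their fully qualified names, and special cases such as nested classes or the finer points of protected access are ignored.

```java
import java.lang.reflect.Modifier;

// Hypothetical, simplified model of Java's member access rules.
class AccessRules {
    // modifiers uses the java.lang.reflect.Modifier bit flags; 0 means package-private.
    static boolean isVisible(int modifiers, String declaringClass, String callerClass,
                             boolean callerIsSubclass) {
        if (Modifier.isPublic(modifiers)) {
            return true; // visible everywhere
        }
        if (Modifier.isPrivate(modifiers)) {
            return declaringClass.equals(callerClass); // only the declaring class
        }
        boolean samePackage = packageOf(declaringClass).equals(packageOf(callerClass));
        if (Modifier.isProtected(modifiers)) {
            return samePackage || callerIsSubclass; // same package or a subclass
        }
        return samePackage; // package-private (default) access
    }

    static String packageOf(String className) {
        int lastDot = className.lastIndexOf('.');
        return lastDot < 0 ? "" : className.substring(0, lastDot);
    }
}
```

Under this model a private member is visible only from its own class, a package-private member only from the same package, a protected member additionally from subclasses, and a public member from everywhere.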

Bytecode detection of method and field references: we can use Javassist, ASM, or another library that supports bytecode manipulation to scan the methods and fields of all classes and analyze every method call and field access for reference problems.

3.2 Method and field reference detection in practice

The following code is written in Kotlin. The details of implementing the Gradle Plugin and Transform are omitted; only the detection code is shown. Method and field reference detection:

// The Gradle Plugin and custom Transform boilerplate is omitted here.
// Method reference detection:
// iterate over every behavior declared in the class (methods and constructors; added by Qihaoxin)
classObj.declaredBehaviors.forEach { ctMethod ->
    // Instrument the body of each behavior declared in the current class
    ctMethod.instrument(object : ExprEditor() {
        override fun edit(m: MethodCall?) {
            super.edit(m)
            // Called back for every method call in the body; the reference check happens here
            try {
                // Not every method needs checking: skip system methods, third-party SDK methods, etc.
                // and check only our own business code
                if (!ctMethod.declaringClass.name.isNeedCheck()) {
                    return
                }
                if (m == null) {
                    throw Exception("MethodCall is null")
                }
                // Packages that do not need to be checked
                if (m.className.isNotWarn() || classObj.name.isNotWarn()) {
                    return
                }
                // If the method cannot be found (deleted, or signature mismatch),
                // m.method throws NotFoundException
                val callee = m.method
                // Access check: the method is non-public and not visible to the calling class
                if (!callee.visibleFrom(classObj)) {
                    throw Exception("${callee.name} is not visible to class ${classObj.name}")
                }
            } catch (e: Exception) {
                e.message?.let {
                    errorInfo += "--Method analysis Exception Message: $it \n"
                }
                errorInfo += "--Method analysis: exception in class ${ctMethod.declaringClass.name} at line ${m?.lineNumber}, in method ${ctMethod.name}\n"
                errorInfo += "------------------------------------------------\n"
                isError = true
            }
        }

        /**
         * Field access analysis mainly covers:
         * - a field that was deleted and can no longer be found
         * - private fields, usable only by the class that defines them
         * - protected fields, accessible from the class itself, subclasses, and the same package
         */
        override fun edit(f: FieldAccess?) {
            super.edit(f)
            try {
                if (f == null) {
                    throw Exception("FieldAccess is null")
                }
                // Packages that do not need to be checked
                if (f.className.isNotWarn() || classObj.name.isNotWarn()) {
                    return
                }
                // No null check is needed: if the field cannot be found (it was deleted),
                // f.field throws NotFoundException
                val field = f.field
                // Only handle behaviors defined in this class. Otherwise methods of the base class
                // would be processed too, and this class would be reported for private fields
                // of the base class that it never actually accesses.
                if (ctMethod.declaringClass.name == classObj.name) {
                    if (!field.visibleFrom(classObj)) {
                        throw Exception("${field.name} is not visible to class ${classObj.name}")
                    }
                }
            } catch (e: Exception) {
                e.message?.let {
                    errorInfo += "--Field analysis Exception Message: $it \n"
                }
                errorInfo += "--Field analysis: exception in class ${classObj.name} at line ${f?.lineNumber}, while accessing field ${f?.fieldName}\n"
                errorInfo += "------------------------------------------------\n"
                isError = true
            }
        }
    })
}

The implementation above traverses all methods and checks the method calls and field accesses inside them. But how do we check member variables (fields) whose initializers contain references?

class BillActivity {
    ...
    private String mTest1 = CreateNewAddressActivity.TAG;
    private static String mTest2 = new CreateNewAddressActivity().getFormatProvinceInfo("a","b", "c");
    ...
}

In the code above, for example, how should the initial values of mTest1 and mTest2 be checked? This problem troubled the author for a long time. Neither Javassist nor ASM seemed to offer an API for obtaining a field's initial value, and no ideas or material on analyzing field values directly in Class bytecode could be found.

After studying Class bytecode, running many experiments, and printing a lot of logs, a solution gradually emerged.

Let's first look at part of the bytecode of BillActivity:

Here we find the definition of the member variable mTest1, and in the Method section on the right there is an <init> method. After compilation, Java generates an <init> method in the bytecode file, called the instance constructor. It gathers instance initializer blocks, field initialization, the call to the parent class constructor, and similar operations into <init>. But what about our static variable mTest2?

A search shows that mTest2 is actually initialized in a static block. It looks as if the mTest2 assignment is not wrapped in any method, as shown in the following figure:

In fact, after compilation Java generates a <clinit> method in the bytecode file, called the class constructor. It gathers static initializer blocks and static variable initialization into <clinit>. The figure above does not show the name <clinit> because javap does not display it; it renders the method as a static block instead.

The experiment logs confirm that the initialization of mTest2 does happen in <clinit>. Viewing the same bytecode in the ByteCode view of ASMPlugin shows it labeled as the <clinit> method, as shown in the following figure:

From this we know that the assignments of mTest1 and mTest2 actually take place in the <init> and <clinit> methods. So by traversing all methods in a class, including these two, the method and field reference checks also cover member variable initializers.
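A small experiment that is consistent with this conclusion can be written in plain Java. It does not inspect bytecode; it only records the order in which initializers execute, which matches static field initializers being compiled into <clinit> (run once, when the class is initialized) and instance field initializers being compiled into <init> (run on every construction, before the constructor body). InitOrder and its members are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: records the order in which its initializers run.
class InitOrder {
    static final List<String> LOG = new ArrayList<>();

    // Static field initializer: executed during class initialization.
    static String sTest2 = record("static field initializer");

    // Instance field initializer: executed as part of every constructor call.
    String mTest1 = record("instance field initializer");

    InitOrder() {
        record("constructor body");
    }

    static String record(String step) {
        LOG.add(step);
        return step;
    }
}
```

After the first new InitOrder(), LOG contains the static field initializer, then the instance field initializer, then the constructor body. Any method call or field access inside those initializer expressions is therefore reached when the checker walks the class and instance initialization methods.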

The problem seemed to be completely solved, but after looking at the member variable code a few more times, a new problem appeared:

class BillActivity {
    ...
    private String mTest1 = CreateNewAddressActivity.TAG;
    private static String mTest2 = new CreateNewAddressActivity().getFormatProvinceInfo("a","b", "c");
    ...
}

We only checked whether the references to the TAG field and the getFormatProvinceInfo method were valid, but did not check the reference to the CreateNewAddressActivity class itself. If this class were private, there would still be a problem. So the reference check of the class itself must not be forgotten.

3.3 Principles of class reference detection

How do we identify a class reference problem?

  • The class has been deleted, so it can no longer be found;
  • The class is non-public, and the current class does not have permission to access it.
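To check class references, the checker first needs the list of classes a Class file refers to. The Java sketch below shows one way to obtain it without any bytecode library, by reading the CONSTANT_Class entries of the constant pool as laid out in the JVM class file format. It is a sketch under assumptions: ConstantPoolRefs is a hypothetical class, it needs Java 9 or later for readAllBytes, names are returned in internal form (with slashes, and array types in descriptor form), and the rest of the Class file is not parsed.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: lists the classes named in a Class file's constant pool.
class ConstantPoolRefs {
    // resourceName is a classpath resource such as "java/util/ArrayList.class".
    static Set<String> of(String resourceName) {
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (raw == null) {
                throw new IllegalArgumentException("class file not found: " + resourceName);
            }
            return parse(raw.readAllBytes());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Set<String> parse(byte[] classFile) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count];
        for (int i = 1; i < count; i++) { // constant pool indices start at 1
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                      // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break;  // Class
                case 8: case 16: case 19: case 20: in.skipBytes(2); break;  // String, MethodType, Module, Package
                case 15: in.skipBytes(3); break;                            // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.skipBytes(4); break;                                 // Integer, Float, refs, NameAndType, (Invoke)Dynamic
                case 5: case 6: in.skipBytes(8); i++; break;                // Long and Double occupy two slots
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<>();
        for (int index : classNameIndex) {
            if (index != 0) {
                names.add(utf8[index]);
            }
        }
        return names;
    }
}
```

For instance, ConstantPoolRefs.of("java/util/ArrayList.class") includes java/util/AbstractList, the superclass of ArrayList. Each returned name can then be looked up among the classes that will actually be packaged, and a failed lookup indicates a class reference problem.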

3.4 Class reference detection in practice

Class reference check:

// Class reference check
if (classObj.packageName.isNeedCheck()) {
    classObj.refClasses?.toList()?.forEach { refClassName ->
        try {
            if (refClassName.toString().isNotWarn() || classObj.name.isNotWarn()) {
                return@forEach
            }
            // The class was deleted and can no longer be found
            val refClass = classPool.getCtClass(refClassName.toString())
                ?: throw NotFoundException("Cannot find class: $refClassName")
            // Access check
            // ... omitted: same as the access check for methods and fields
        } catch (e: Exception) {
            e.message?.let {
                errorInfo += "--Class reference analysis Exception Message: $it \n"
            }
            errorInfo += "--Class reference analysis: class ${classObj.name} references $refClassName \n"
            errorInfo += "------------------------------------------------\n"
            isError = true
        }
    }
}

This concludes the principles and practice of bytecode reference detection.

3.5 Reflections on the solution

After implementing reference detection in the buildSrc of the domestic official website app, I learned that many other apps had also been modularized. They may well use a modular architecture similar to the official website app's and face similar pain points. Reflecting on the implementation, I realized that being embedded in buildSrc meant it had no general integration capability, and I felt the job was not finished. Having solved the pain point for our own app, the solution should also be shared horizontally to solve the pain points of the wider team. This led to the standalone Gradle plugin described next.

4. Standalone Gradle plugin

If your app's modules need reference detection at compile time, you are welcome to integrate the Gradle plugin for bytecode reference detection that I developed.

4.1 Goals of the standalone Gradle plugin

1) A standalone Gradle plugin that any app can integrate easily;

2) Support for common configuration items, such as a switch for the plugin and skipping of known exceptions;

3) Reference checks on the compiled bytecode of both Java and Kotlin. When an APK is built on CI or Jenkins and a reference problem is found, the build reports an error and outputs the details of the reference problem for developers to analyze and fix.

4.2 Plugin features

1) Method reference detection;

2) Field reference detection;

3) Class reference detection;

4) Common configuration items, including switching the plugin on or off.

For example, it can detect Class Not Found, Method Not Found, and Field Not Found problems. The plugin adds very little to build time. Taking the domestic official website app as an example, the plugin runs for about 2.3 seconds during the app build, so there is no need to worry about longer build times.

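4.3 Integrating the plugin

Add the dependency to build.gradle in the root directory of the main project:

dependencies {
        ...
        classpath "com.byteace.refercheck:byteace-refercheck:35-SNAPSHOT" // Currently a trial version that still needs iteration; feedback, suggestions, and bug reports are welcome and help improve the plugin
}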

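Apply the plugin in the build.gradle of the app project and set its configuration:

// Bytecode reference check plugin developed by the official website team
apply plugin: 'com.byteace.refercheck'
// Configuration of the bytecode reference check plugin
referCheckConfig {
        enable true // Whether to turn on the reference check
        strictMode true // Whether to stop the build when a problem is found
        check "com.abc.def" // Package names of the classes to check. Projects use many SDKs and third-party libraries that we usually do not check; list only the packages we care about
        notWarn "org.apache.http,com.core.videocompressor.VideoController" // Package or class names confirmed by manual review that should not be reported
}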

4.4 Description of the plugin configuration items

enable: whether to turn on the reference check. If false, no reference check is performed.

strictMode: when strict mode is on, the build is interrupted as soon as a reference problem is found. When strict mode is off, the problem is only printed in the build log and the build continues.

Suggestion: when building a Release package on Jenkins or CI, set both enable and strictMode to true in build.gradle.

check: the package names to check. Usually only the app's own package name is configured. If you also need to check third-party SDKs you depend on, add them as needed.

notWarn: reference problems that should not be reported. After a developer has reviewed a plugin error and confirmed that it will not actually cause a crash, the class name involved can be configured here so that the check is skipped. For example, if class A is reported as unable to reference a method in class B, configuring the class name of class B here suppresses the error.
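How the four configuration items might interact can be sketched as follows. This is an assumption-laden illustration, not the plugin's actual implementation, which is not shown in this article: ReferCheckSettings and its methods are hypothetical, check and notWarn are treated as comma-separated name prefixes, and matching is a plain startsWith comparison (so the prefix com.abc.def would also match a package named com.abc.defx).

```java
import java.util.List;

// Hypothetical model of the enable, strictMode, check, and notWarn configuration items.
class ReferCheckSettings {
    final boolean enable;
    final boolean strictMode;
    final String check;   // comma-separated package prefixes to check
    final String notWarn; // comma-separated package or class prefixes to ignore

    ReferCheckSettings(boolean enable, boolean strictMode, String check, String notWarn) {
        this.enable = enable;
        this.strictMode = strictMode;
        this.check = check;
        this.notWarn = notWarn;
    }

    static boolean matchesAny(String className, String commaSeparatedPrefixes) {
        for (String part : commaSeparatedPrefixes.split(",")) {
            String prefix = part.trim();
            if (!prefix.isEmpty() && className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // A problem is reported only for checked callers, and only if neither side is whitelisted.
    boolean shouldReport(String callerClass, String referencedClass) {
        return enable
                && matchesAny(callerClass, check)
                && !matchesAny(callerClass, notWarn)
                && !matchesAny(referencedClass, notWarn);
    }

    // In strict mode any reported problem fails the build; otherwise problems are only logged.
    boolean failsBuild(List<String> reportedProblems) {
        return enable && strictMode && !reportedProblems.isEmpty();
    }
}
```

With the configuration shown in section 4.3, a problem in a class under com.abc.def that refers to a class under org.apache.http would not be reported, while a problem referring to any other class would be reported and, with strictMode on, would fail the build.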

4.5 The notWarn configuration in the domestic official website app

The domestic official website app adds org.apache.http and com.core.videocompressor.VideoController to the notWarn whitelist. The org.apache.http package is provided by the Android system and does not take part in APK compilation. Without this configuration item an error is reported, although nothing goes wrong at runtime.

Without com.core.videocompressor.VideoController in the whitelist, the plugin reports that the CompressProgressListener class cannot be referenced from FileProcessFactory. Looking at the FileProcessFactory code, line 138 of the class calls the convertVideo method and passes null as the last parameter, listener.

The bytecode of this class is shown below. The null passed as the last argument of convertVideo is automatically cast to the parameter type:

CompressProgressListener is not public; it has default (package-private) access. FileProcessFactory and CompressProgressListener are not in the same package, so an error is reported. The code does not crash at runtime, however, so the class name has to be added to the whitelist.

If you run into a case that should not have been reported while using the plugin, you can skip it via the whitelist. Please also report the case to me; I will analyze it and update the plugin.

5. Summary

Bytecode is a deep topic. There are many online tutorials on bytecode instrumentation and code generation, but very little material on bytecode analysis. During this research it was therefore necessary to become familiar with bytecode, to experiment and explore step by step in practice, and to polish the details gradually.

Throughout the research we kept the generality and configurability of the solution in mind and ended up with a general-purpose Gradle plugin. We are actively promoting its adoption by other modules, using this opportunity to share the technology horizontally and contribute to the success of the wider team.

At present, two apps have integrated the plugin. It will continue to be maintained and iterated, and once it is stable it is planned to be integrated into CI and Jenkins. Apps that need reference detection are welcome to integrate the Gradle plugin; we hope it helps apps and teams that have pain points in this area.

Author: vivo official website mall client team-Qi Haoxin
