vivo's JaCoCo-based test coverage design and practice

Author: vivo Internet Server Team - Xu Shen

This article describes how vivo's internal R&D platform uses JaCoCo to measure test coverage. It covers how JaCoCo works, the problem of measuring coverage for newly added code that we ran into in practice, and a solution to the loss of coverage data caused by frequent releases.

1. Why do you need test coverage?

1.1 Problems often found in the daily R&D process

Test cases are designed from experience. When a new feature is developed, the test scenarios are often underestimated, and bugs are found only after launch;
Developers often make code changes outside the requirements (small-scale refactoring, or fixing small defects they come across during development), so the testing task cannot cover the affected scenarios, which causes production problems;
The effectiveness of testing cannot be assessed quantitatively, so the quality of testing work cannot be improved further.

1.2 Is there a technical means to avoid these problems as far as possible?

Code coverage is widely used in the industry to improve test quality. So what is code coverage?

Code coverage is a software testing metric that describes what proportion of a program's source code has been exercised by tests, and to what extent. The resulting proportion is called the code coverage.

Code coverage metrics usually include the following categories:

Function/method coverage: how many of the functions/methods have been called
Branch coverage: how many branches of control structures (such as if statements) have been executed
Condition coverage: how many boolean subexpressions have been evaluated as both true and false
Line coverage: how many lines of source code have been executed

1.3 Scenarios often found when using test coverage

In an if/else statement, the code in if{} is covered but the code in else{} is not, so we can conclude that some branch scenarios were not tested;
In a try/catch statement, the code in try{} is covered but the code in catch{} is not, so we can conclude that the exception scenario was not tested;
In an if (condition 1 || condition 2 || condition 3) statement, condition 1 is covered but condition 2 and condition 3 are not, so we can conclude that some conditional scenarios were not tested.

When testers use code coverage metrics correctly, the quality of testing improves, and with it the quality of the versions that go live.
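
To make the three scenarios above concrete, here is a minimal sketch in Java. The class and method names (CoverageScenarios, classify, parseOrDefault, canAccess) are hypothetical and are not taken from vivo's code; the comments mark what a coverage report would flag if only the most common path were tested.

```java
class CoverageScenarios {
    // if/else: a test that only passes non-negative values never runs the else block.
    static String classify(int value) {
        if (value >= 0) {
            return "non-negative";
        } else {
            return "negative";
        }
    }

    // try/catch: a test that only passes valid numbers never runs the catch block.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Short-circuit ||: if isAdmin is always true in the tests, the other two
    // conditions are never evaluated, so their branches stay uncovered.
    static boolean canAccess(boolean isAdmin, boolean isOwner, boolean isPublic) {
        return isAdmin || isOwner || isPublic;
    }
}
```

Covering the missing parts means adding cases such as a negative value, a non-numeric string, and a call where only the last condition is true.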

2. The use of JaCoCo in test coverage scenarios

2.1 Introduction to JaCoCo

Current mainstream code coverage tools:

C/C++→Gcov, Java→JaCoCo, JavaScript→Istanbul.

Because server-side code is mainly written in Java, the CICD platform supports Java first, using JaCoCo to provide code coverage statistics.

According to JaCoCo's official website, JaCoCo's mission is to provide the standard technology for code coverage analysis in Java VM based environments. The focus is on providing a lightweight, flexible, and well-documented library for integration with various build and development tools.

2.2 Advantages of JaCoCo

JaCoCo supports multi-dimensional coverage analysis such as instruction (C0), branch (C1), line, method, class and cyclomatic complexity;
It is based on Java bytecode, so it can also work without source files;
Good performance with little runtime overhead, especially for large projects;
A fairly complete API, which makes it easy to integrate with other tools;
A remote protocol and JMX control that can request coverage data dumps from the agent at any point in time.

2.3 Principle of JaCoCo

This section is based mainly on the JaCoCo official website.

JaCoCo supports several different ways of collecting coverage information, each implemented with a different technique. The orange path in the figure below is the approach JaCoCo recommends, which collects coverage information on the fly:

JaCoCo collects coverage information by inserting probes into Java bytecode. Probes are additional instructions that can be inserted between existing instructions. They do not change the behavior of the method, but record the fact that they have been executed.

The following is an example of a simple program:

This code is converted to the following bytecode after Java compilation:

Java bytecode is a linear sequence of instructions, so control flow is implemented with conditional and unconditional jump instructions. Technically, a jump target is an offset relative to the target instruction, much like jumps in assembly language. For better readability, symbolic labels (L1, L2) are used instead of actual instruction addresses.

The orange parts in the figure above are the inserted probes. In theory, we could insert a probe at every edge of the control flow graph. Since the probe implementation itself requires several bytecode instructions, this would increase the size of the class file several times. Fortunately this is not required: we only need a few probes per method, depending on the method's control flow. For example, a method without any branches needs only one probe.
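
As a hand-written illustration of the idea, the sketch below assumes a method shaped like the example in JaCoCo's control flow documentation (a call, an if/else, then a final call). JaCoCo inserts probes at the bytecode level; here they are written as plain Java assignments to a boolean array, and the names ProbeDemo, probes, and exampleWithProbes are hypothetical. The bytecode in the comment is simplified (owner classes omitted).

```java
class ProbeDemo {
    static boolean condition;                        // drives cond() for the demo
    static final boolean[] probes = new boolean[3];  // one flag per probe

    static void a() {}
    static void b() {}
    static void c() {}
    static void d() {}
    static boolean cond() { return condition; }

    // Original method. Its bytecode is roughly:
    //       INVOKESTATIC a()V
    //       INVOKESTATIC cond()Z
    //       IFEQ L1
    //       INVOKESTATIC b()V
    //       GOTO L2
    //   L1: INVOKESTATIC c()V
    //   L2: INVOKESTATIC d()V
    //       RETURN
    static void example() {
        a();
        if (cond()) {
            b();
        } else {
            c();
        }
        d();
    }

    // Same logic with three probes: one on each branch edge, one before the return.
    static void exampleWithProbes() {
        a();
        if (cond()) {
            b();
            probes[0] = true; // edge from the if branch to d()
        } else {
            c();
            probes[1] = true; // edge from the else branch to d()
        }
        d();
        probes[2] = true;     // edge leaving the method
    }
}
```

Three probes are enough for this method: which branch ran, and whether the method reached its end, can both be read from the array without a probe on every edge.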

If a probe has been executed, we know that the corresponding edge has been visited. From this edge we can draw conclusions about the preceding nodes and edges:

If an edge has been visited, we know that the source node of this edge has been executed;
If a node has been executed and the node is the target of only one edge, we know that edge has been visited.

Applying these rules recursively determines the execution status of every instruction in a method, provided the probes are in the right places. A probe is just a small sequence of additional instructions inserted at an edge of the control flow.
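
The two rules can be sketched as a small fixed-point computation. This is an illustration of the reasoning only, not JaCoCo's implementation; the class name ProbeInference and the "FROM->TO" string encoding of edges are assumptions made for brevity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ProbeInference {
    /** Edges are written "FROM->TO". Returns the nodes known to have been executed. */
    static Set<String> executedNodes(List<String> allEdges, Set<String> hitProbeEdges) {
        Set<String> visitedEdges = new HashSet<>(hitProbeEdges);
        Set<String> executed = new HashSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            // Rule 1: a visited edge implies that its source node was executed.
            for (String edge : visitedEdges) {
                if (executed.add(source(edge))) {
                    changed = true;
                }
            }
            // Rule 2: an executed node with exactly one incoming edge implies
            // that this edge was visited.
            for (String node : new ArrayList<>(executed)) {
                List<String> incoming = new ArrayList<>();
                for (String edge : allEdges) {
                    if (target(edge).equals(node)) {
                        incoming.add(edge);
                    }
                }
                if (incoming.size() == 1 && visitedEdges.add(incoming.get(0))) {
                    changed = true;
                }
            }
        }
        return executed;
    }

    static String source(String edge) { return edge.substring(0, edge.indexOf("->")); }

    static String target(String edge) { return edge.substring(edge.indexOf("->") + 2); }
}
```

For an if/else graph with edges A->B, A->C, B->D, C->D, D->EXIT and probes hit on B->D and D->EXIT, the rules conclude that A, B, and D ran and that C did not, even though no probe sits on A->B.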

3. CICD platform's solution for test coverage

Based on the JaCoCo principles introduced above and our company's internal R&D process, the code coverage function on the CICD platform is designed as follows:

As the design diagram shows, the whole process consists of three stages.

3.1 Before testing

Before testing, a tester (or a developer or operations engineer) enables the test coverage function on the pipeline. When the pipeline performs a release, the JaCoCo agent package is downloaded to the test environment, and the Java agent (-javaagent) parameter is configured when the Java process is started;
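
A minimal sketch of how such a startup parameter could be assembled is shown below. The option names (output, address, port, includes) are JaCoCo agent options and 6300 is JaCoCo's default port; the jar path, the package filter, and the helper class JacocoAgentArg are hypothetical, and the source does not state which options vivo's platform actually sets.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class JacocoAgentArg {
    /** Builds "-javaagent:<jar>=<k1>=<v1>,<k2>=<v2>,..." from the given options. */
    static String build(String agentJarPath, Map<String, String> options) {
        if (options.isEmpty()) {
            return "-javaagent:" + agentJarPath;
        }
        StringJoiner joined = new StringJoiner(",");
        for (Map.Entry<String, String> option : options.entrySet()) {
            joined.add(option.getKey() + "=" + option.getValue());
        }
        return "-javaagent:" + agentJarPath + "=" + joined;
    }

    /** Example values only: path and package filter are placeholders. */
    static String example() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("output", "tcpserver");      // let a client pull dumps over TCP
        options.put("address", "127.0.0.1");
        options.put("port", "6300");
        options.put("includes", "com.example.*");
        return build("/opt/jacoco/jacocoagent.jar", options);
    }
}
```

The tcpserver output mode fits the flow described here because the coverage data stays in the process memory until the platform asks for a dump.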

While the process is starting and afterwards, the agent intercepts each class file as it is loaded, instruments it, and inserts probes on the necessary paths (probe insertion was described in the previous section).

3.2 During testing

During testing, the tester executes test cases in the test environment, either manually or with automated scripts. The probes record the code that is called, and the probe data is kept in the memory of the Java process.

3.3 After testing

Testers can release to the test environment multiple times. For code on the same branch, the results of multiple test runs can be merged into one complete set of coverage data;

After testing, the CICD platform downloads (dumps) the coverage data through JaCoCo's API, either manually or automatically, merges it with the historical coverage data, and generates a test coverage report;
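
The merge step can be pictured as a per-class OR of probe arrays. The sketch below models only that semantics with plain Java collections; it does not call JaCoCo (whose own tooling, for example ExecFileLoader, handles real execution data files), and the class name CoverageMerge and the map-based layout are assumptions.

```java
import java.util.Map;

class CoverageMerge {
    /** Merges one dump into the accumulated store. Both map classid -> probe array. */
    static void mergeInto(Map<Long, boolean[]> store, Map<Long, boolean[]> dump) {
        for (Map.Entry<Long, boolean[]> entry : dump.entrySet()) {
            boolean[] incoming = entry.getValue();
            boolean[] existing = store.get(entry.getKey());
            if (existing == null) {
                // First time this class version is seen: keep a copy.
                store.put(entry.getKey(), incoming.clone());
                continue;
            }
            if (existing.length != incoming.length) {
                // Same id but different probe count: the data cannot be combined.
                throw new IllegalStateException(
                        "Incompatible probe data for class id " + Long.toHexString(entry.getKey()));
            }
            // A probe counts as hit if it was hit in any run.
            for (int i = 0; i < existing.length; i++) {
                existing[i] |= incoming[i];
            }
        }
    }
}
```

Because the key is the classid, data from an earlier run is only combined with later data when the class bytes were identical in both runs.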

Based on the test coverage report, testers review the scenarios that were missed, run supplementary tests, and afterwards summarize why the scenarios were missed in order to improve test efficiency.

4. Problems encountered in practice and their solutions

After the test coverage function had been in production for a while, we found several problems in practice, summarized below:

4.1 Compiling on different machines causes inconsistent classids

In practice we often see the following problem: users report, and confirm, that a case was executed normally, yet the generated report shows the code as not covered. Investigation shows that the class running in the test environment differs from the class used when the report was generated.

Inside JaCoCo, coverage data is stored with the classid as the key, and the classid is computed as a hash of the class's bytecode. The classid algorithm in the JaCoCo source code is as follows:
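
JaCoCo's documentation describes class ids as CRC64 checksums of the raw class file. The sketch below is an illustrative CRC64-style checksum written for this article, not a copy of the JaCoCo source; the class name ClassIdSketch is hypothetical and the polynomial constant should be treated as an assumption. Its purpose is to show why any change in the compiled bytes produces a different id.

```java
class ClassIdSketch {
    // Reversed polynomial used for this illustration (assumption, see lead-in).
    private static final long POLY64REV = 0xd800000000000000L;
    private static final long[] LOOKUP = new long[256];

    static {
        // Precompute the checksum contribution of every possible byte value.
        for (int i = 0; i < 256; i++) {
            long v = i;
            for (int j = 0; j < 8; j++) {
                v = ((v & 1) == 1) ? (v >>> 1) ^ POLY64REV : (v >>> 1);
            }
            LOOKUP[i] = v;
        }
    }

    /** Computes a 64-bit checksum over the raw bytes of a class file. */
    static long classId(byte[] classBytes) {
        long sum = 0;
        for (byte b : classBytes) {
            int index = ((int) sum ^ b) & 0xff;
            sum = (sum >>> 8) ^ LOOKUP[index];
        }
        return sum;
    }
}
```

The id depends on the compiled bytes, not on the source text, so a different JDK version or compiler setting can change the id even when the source code is identical.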

Inconsistencies include:

The machine that compiled the code at release time and the machine that generated the report have different environments, such as the operating system version or JDK version, so the compiled classes differ;
The code version compiled at release time differs from the code version used when the report is generated, so the compiled classes differ.

To solve the environment problem, either the build environment must be kept consistent throughout the test coverage process, or the code must be compiled only once and the same class files reused. Considering storage space, vivo chose to keep the environment consistent.

The second case is common in teams that practice agile R&D. Within one version, features are handed over to testing one function point at a time, so the source code is often modified while testing is still in progress, and the code version at report time no longer matches the code version at release time. This situation is more complicated and is discussed below.

4.2 Focusing on incremental code coverage during development

In daily R&D we mostly rely on automated scripts to regression-test the full code, while newly developed features show up mainly as incremental code. We therefore care more about the coverage of incremental code, but JaCoCo itself does not support incremental code coverage.

Many solutions to this problem can be found online. They are basically all based on git version differences: when the report is generated, classes with no differences are filtered out, which yields two coverage reports, one for the full code and one for the incremental code. We instead wanted to present the coverage of both incremental and full code in a single report, so that missed scenarios can be analyzed together with the covered paths shown in the full report, with the incremental code and its coverage marked in that report. The expected effect is shown in the following figure:

In order to achieve the above effect, several transformation steps are required:

Calculate the changes on the current code branch, accurate to the line of code
Modify JaCoCo's calculation logic to compute coverage metrics separately for incremental code
Modify the JaCoCo report format so that one report can show coverage for both full code and incremental code

To calculate the changes on a code branch, we gave up using GitLab's code comparison function to obtain the differences between versions. When two versions differ too much, calls to GitLab's API often time out;

GitLab's comparison function also cannot handle customized scenarios, for example a line of code that is flagged as changed only because it was reformatted. We therefore use the diff command that ships with Linux to compare code differences.
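
As a sketch of the line-level step, the code below extracts the changed line numbers of the new file from unified diff output (the format produced by diff -u). It assumes the diff covers a single file pair; the class name ChangedLines is hypothetical. GNU diff options such as -b or -w, which ignore whitespace changes, are one possible way to handle the formatting-only case, although the source does not say which options vivo uses.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChangedLines {
    // Hunk header, e.g. "@@ -10,4 +12,6 @@"; group 1 is the start line in the new file.
    private static final Pattern HUNK =
            Pattern.compile("^@@ -\\d+(?:,\\d+)? \\+(\\d+)(?:,\\d+)? @@.*");

    /** Returns the line numbers in the new file that were added or modified. */
    static Set<Integer> addedLines(List<String> unifiedDiff) {
        Set<Integer> added = new TreeSet<>();
        int newLine = 0;
        boolean inHunk = false;
        for (String line : unifiedDiff) {
            Matcher hunk = HUNK.matcher(line);
            if (hunk.matches()) {
                newLine = Integer.parseInt(hunk.group(1));
                inHunk = true;
            } else if (!inHunk) {
                continue;             // "---" / "+++" file headers before the first hunk
            } else if (line.startsWith("+")) {
                added.add(newLine++); // line exists only in the new file
            } else if (line.startsWith("-") || line.startsWith("\\")) {
                // Removed line or "\ No newline at end of file": not part of the new file.
            } else {
                newLine++;            // unchanged context line
            }
        }
        return added;
    }
}
```

The resulting set of line numbers is what the later steps need in order to decide, per line, whether coverage should also count toward the incremental totals.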

To change JaCoCo's calculation logic, we added coverage statistics for incremental code: new counters in the CoverageNodeImpl class count the coverage of new classes, methods, lines, and instructions, and the increment method in the SourceNodeImpl class gained statistics logic for incremental lines of code.
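
The idea of keeping a second set of counters for incremental code can be sketched independently of JaCoCo's classes. Everything in the block below (IncrementalLineCounter, Status, Counter, Result) is a simplified, hypothetical model at line level only; it is not the modified CoverageNodeImpl or SourceNodeImpl.

```java
import java.util.Set;

class IncrementalLineCounter {
    /** Simplified per-line status; EMPTY means the line has no executable code. */
    enum Status { EMPTY, NOT_COVERED, COVERED }

    static final class Counter {
        int covered;
        int missed;

        void add(Status status) {
            if (status == Status.COVERED) {
                covered++;
            } else if (status == Status.NOT_COVERED) {
                missed++;
            }
        }

        double coveredRatio() {
            int total = covered + missed;
            return total == 0 ? 0.0 : (double) covered / total;
        }
    }

    static final class Result {
        final Counter full = new Counter();        // all executable lines
        final Counter incremental = new Counter(); // only lines changed on the branch
    }

    /** lineStatus[i] is the status of line i + 1; changedLines holds 1-based line numbers. */
    static Result count(Status[] lineStatus, Set<Integer> changedLines) {
        Result result = new Result();
        for (int i = 0; i < lineStatus.length; i++) {
            result.full.add(lineStatus[i]);
            if (changedLines.contains(i + 1)) {
                result.incremental.add(lineStatus[i]);
            }
        }
        return result;
    }
}
```

Counting both totals in one pass is what allows a single report to show full and incremental coverage side by side.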

4.3 Revisiting the question about classid

The classid issue was mentioned above. If it is an environment problem, it is relatively easy to solve. But Internet teams now basically all work in agile mode, and it is practically impossible to wait until development is finished before handing over to testing. As a result, the latest coverage report inevitably loses coverage data one whole class at a time, and testers have to re-run test cases again and again; otherwise the coverage numbers look poor.

Now that we know what the problem is, is there a way to fix it? Could we simply find the previous classid and copy its probe data to the current classid? Of course not: when the source code changes, the number of probes changes too, and the following situations occur:

or like this

In such situations it is impossible to tell which probes were added or deleted. Even if the number of probes is the same before and after, the probe positions may have shifted because of the code change:

So is this problem unsolvable? Here is a general idea. Coverage data is currently stored per class. We can refine the storage granularity to the method level, so that most of a class's probe data can be retained. If one method is modified, the test data of the other methods is still kept and only the modified method needs to be re-tested, which greatly reduces how often testers have to repeat all the scenarios of an entire class.
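
A rough sketch of this idea follows. The source does not describe how methods are identified or how a change is detected, so the method key (name plus descriptor), the bodyHash fingerprint, and the class MethodLevelCarryOver are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class MethodLevelCarryOver {
    static final class MethodProbes {
        final String bodyHash;   // hypothetical fingerprint of the method's bytecode
        final boolean[] probes;  // probe data for this method only

        MethodProbes(String bodyHash, boolean[] probes) {
            this.bodyHash = bodyHash;
            this.probes = probes;
        }
    }

    /**
     * For every method in the new build, reuse the old probe data when the
     * method is unchanged; otherwise start from the new (empty) probe data.
     * Keys are method signatures such as "foo()V".
     */
    static Map<String, MethodProbes> carryOver(Map<String, MethodProbes> oldBuild,
                                               Map<String, MethodProbes> newBuild) {
        Map<String, MethodProbes> result = new HashMap<>();
        for (Map.Entry<String, MethodProbes> entry : newBuild.entrySet()) {
            MethodProbes fresh = entry.getValue();
            MethodProbes old = oldBuild.get(entry.getKey());
            boolean unchanged = old != null
                    && old.bodyHash.equals(fresh.bodyHash)
                    && old.probes.length == fresh.probes.length;
            result.put(entry.getKey(),
                    unchanged ? new MethodProbes(fresh.bodyHash, old.probes.clone()) : fresh);
        }
        return result;
    }
}
```

With this layout a change to one method discards only that method's probe data, while the coverage recorded for the rest of the class survives the new release.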

5. Summary

Has the test coverage function improved the quality of testing? The answer is clearly yes.

Of course, the problems described above caused testers some trouble: to improve the coverage numbers, they had to test the same function repeatedly. At the same time, the function also brought them benefits. Faced with strict coverage targets, testers were pushed to read the implementation logic of the code, which improved both their business knowledge and their code-reading skills. There were even cases where testers and developers debated face to face whether the code logic was reasonable.

Finally, test coverage is not the only criterion for measuring test quality; it should be used sensibly to improve test quality.
