Author: Su Yanjiao (Mu Lei)
Android projects generally use Gradle as their build and packaging tool. Gradle's concise, dynamic configuration is popular, but its slow build execution has long been criticized.
In recent years, as Youku's features have grown, its code size has increased sharply, and build time has risen with it. A full release build took as long as 35 minutes, which seriously hurt integration and iteration efficiency, so optimizing build speed became imperative. As of November 2021, Youku's build-time optimization has achieved fairly good results (see the table below), and this article documents the approach.
Android build type | 2020 | 2021 |
---|---|---|
Android debug package build time | 12 min | 2.5 min |
Android release package build time | 35 min | 12 min |
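As a quick check on the table above, the relative reductions can be computed directly from the four build times: roughly 79% for the debug package and 66% for the release package. The snippet below is only an illustrative calculation; the function name is ours and is not part of any Youku tooling.

```python
def reduction_percent(before_min: float, after_min: float) -> float:
    """Return the relative reduction in build time, in percent."""
    return (before_min - after_min) / before_min * 100.0


# Build times in minutes, taken from the table above.
debug_reduction = reduction_percent(12, 2.5)
release_reduction = reduction_percent(35, 12)

print(f"debug: {debug_reduction:.1f}%")
print(f"release: {release_reduction:.1f}%")
```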
Chart of the optimization measures and their benefits:
Optimization ideas
Technical optimization projects generally cover three dimensions: setting data metrics, technical optimization, and preventing regression of the results. Breaking the build optimization work down along these lines gives the following three sub-projects:
- Set data metrics: collect and select the core metrics that reflect the value of the results. This article uses build time, build failure rate, and number of builds per hour as the supporting data;
- Technical optimization: the factors that affect build speed fall into software and hardware, so build speed optimization splits into two directions: software optimization and hardware optimization;
- Regression prevention: keep the optimized metrics from deteriorating so that the gains are preserved.
The rest of this article covers three parts: data metrics and regression prevention, software optimization, and hardware optimization.
Optimization
Data metrics and regression prevention
An optimization project needs a corresponding system of data metrics so that optimization items and optimization schemes can be evaluated with data. Before starting the build optimization, the author built a data evaluation and monitoring dashboard on Alibaba's Aone FaaS (serverless) platform. The dashboard tracks metrics such as build type, build time, build success rate, and per-task duration, so that the optimization work can be organized by build type, task frequency, and time-consuming tasks.
With this data capability in place, tracking and analyzing the two key metrics, build time and build success rate, showed that build time is driven mainly by time-consuming tasks and that build success rate is driven mainly by unreasonable build tasks. Alarms on time-consuming tasks and on unreasonable tasks therefore let us detect and analyze build-speed deterioration quickly, which protects the optimization results.
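A time-consuming task alarm of the kind described above can be sketched as follows. This is a minimal illustration, not the Aone FaaS dashboard logic: the task names, the 60-second limit, and the 1.5x regression ratio are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    name: str          # build task name (hypothetical examples below)
    duration_s: float  # measured duration of the task in this build


def find_slow_tasks(samples, baseline_s, absolute_limit_s=60.0, regression_ratio=1.5):
    """Return names of tasks that are slow in absolute terms or regressed against a baseline."""
    alarms = []
    for sample in samples:
        baseline = baseline_s.get(sample.name)
        too_slow = sample.duration_s >= absolute_limit_s
        regressed = baseline is not None and sample.duration_s >= baseline * regression_ratio
        if too_slow or regressed:
            alarms.append(sample.name)
    return alarms


# Made-up measurements for three hypothetical tasks.
samples = [
    TaskSample("dexTask", 95.0),
    TaskSample("mergeTask", 20.0),
    TaskSample("packageTask", 12.0),
]
baseline = {"mergeTask": 10.0, "packageTask": 11.0}

alarms = find_slow_tasks(samples, baseline)
print(alarms)
```

Combining an absolute limit with a ratio against a baseline catches both tasks that are always expensive and tasks that have recently become slower.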
Software optimization
Removing Atlas from the build side
Atlas is a containerization framework for Android that grew out of the ongoing development of Mobile Taobao; it is also called a Dynamic Bundle framework. It mainly provides support for decoupling, componentization, and dynamic delivery, and it addresses issues across the coding phase, the APK runtime phase, and the subsequent operations and maintenance phase.
Relying on a deeply customized artifact structure and a highly complex, deeply hooked runtime framework, Atlas can be regarded as both a mobile OSGi implementation and a componentization solution. However, as Youku's mobile architecture was adjusted and its self-developed remote solution was rolled out, the Atlas runtime framework gradually lost its role as the OSGi framework, so the Atlas framework was removed from the runtime.
Once the Atlas dependency had been removed from the runtime, Atlas's complex build process (as shown in the figure below) also lost its purpose. Youku then launched a project to remove Atlas from the build side, with the goal of removing the Atlas build plugin and making the build native, clean, and streamlined. Through a series of actions such as making the build artifacts native, cleaning up build tasks, and upgrading the toolchain, we completed the removal of Atlas from the build side and also partially improved build performance.
Benefits: debug package build time is reduced by about 3 minutes; release package build time is reduced by 4 to 5 minutes.
Gradle upgrade and Android Gradle Plugin upgrade
The Gradle team has been continuously improving Gradle's build speed and other performance metrics, and the Google team likewise keeps improving the performance of the Android Gradle Plugin (AGP). To further improve build performance for Youku's Android app, we upgraded the build system: AGP from 3.0.1 (2017) to 3.4.3 (2019), and Gradle from 4.4 (2017) to 5.5 (2019).
Comparing build times before and after the upgrade shows that the performance improvement comes mainly from three sources:
- Toolchain upgrades: upgrading AGP also upgrades aapt2, ProGuard, and other build tools, which slightly improves build performance;
- Better task orchestration and parallelization: AGP 3.4.3 consolidates and optimizes the signing, compression, and alignment tasks;
- On-demand configuration and asynchronous loading: AGP 3.4.3 loads resources asynchronously. The configuration phase only pulls dependencies and no longer unpacks, filters, or merges artifacts, which effectively avoids I/O congestion and a busy CPU.
Benefits: debug package build time is reduced by about 2 minutes; release package build time is reduced by about 4 minutes.
dx build optimization
After the upgrade to Android Gradle Plugin 3.4.2, three new dx build parameters became available. Our testing showed that they significantly speed up dx processing of class files, so we set the following three properties to reduce build time.
android.dexingNumberOfBuckets=16
android.dexingWriteBuffer.size=256
android.dexingReadBuffer.size=256
A careful reading of the AGP source code shows that these parameters control the size of the in-memory dx cache and the size of the chunks used to read and write dex data. By default, dexingNumberOfBuckets is half the number of CPUs and the read and write size is 1KB, which keeps the CPU busy under heavy I/O. Reducing the number of disk writes and enlarging the cache significantly reduces build time.
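The effect of a larger write buffer on the number of writes that reach the disk can be illustrated with a generic sketch. This is not AGP's dexing code: it uses Python's standard `io.BufferedWriter` with a counting sink, and the 1 KiB and 256 KiB sizes assume that the property values above are expressed in KB, which the source does not state.

```python
import io


class CountingSink(io.RawIOBase):
    """Raw stream that discards data and counts how many writes reach it."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        return len(data)


def count_raw_writes(total_bytes, chunk_size, buffer_size):
    """Write total_bytes in chunk_size pieces through a buffer; return the raw write count."""
    sink = CountingSink()
    writer = io.BufferedWriter(sink, buffer_size=buffer_size)
    chunk = b"\0" * chunk_size
    for _ in range(total_bytes // chunk_size):
        writer.write(chunk)
    writer.flush()  # push any remaining buffered bytes to the sink
    return sink.calls


# 1 MiB written in 512-byte pieces, once with a small and once with a large buffer.
small_buffer_calls = count_raw_writes(1 << 20, 512, 1024)
large_buffer_calls = count_raw_writes(1 << 20, 512, 256 * 1024)
print(small_buffer_calls, large_buffer_calls)
```

The same amount of data produces far fewer underlying writes with the larger buffer, which is the general mechanism the paragraph above attributes the saving to.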
Benefits: debug/release package build time is reduced by about 3 minutes.
Redundant task cleanup
As the platform has been iterated and upgraded, some build tasks and build features have become obsolete. However, because of a peculiarity of the build system, namely its lack of a product-level rollback capability, build tasks have only ever been added and never removed. In addition, because changes to the build system are high-risk and low-reward and its logic is complex and outdated, there has been little motivation to govern build speed.
To address these problems, we worked out the function of each build task step by step through build logic analysis, build configuration cleanup, and single-task debugging. We cleaned up 30+ useless tasks such as postPackageDebug and simplified the core transform management and task management functions. The following table lists some of the redundant build tasks that were cleaned up.
Task name | Purpose / output | Keep or discard |
---|---|---|
postPackageDebug | APK post-processing | Can be discarded |
remoteSignAppDebug | Remote signing | Can be discarded |
DexCountDebug | Dex count | Can be discarded |
ChannelPackageDebug | Channel package build | Can be discarded |
generateAppInfoDebug | appinfo generation | Can be discarded |
uploadBuildFilesDebug | File upload | Can be discarded |
buildPatchBaseApkDebug | Hotfix build | Can be discarded |
.... |
Benefits: The debug package construction time is reduced by 15s+, and the release package construction time is reduced by 20s+.
Task pipeline
Gradle organizes a build as a set of tasks, and this extension mechanism makes it possible to post-process build artifacts. APK post-processing, for example, includes channel processing, arsc processing, alignment, signing, dex splitting, image compression, and other tasks. Each of these tasks decompresses, recompresses, and copies the APK again, which wastes CPU and system I/O and increases build time.
To reduce APK build time and simplify the handling of APK artifacts, we reorganized and extended the existing tasks into an artifact pipeline with minimal copying, a single decompression, and a single compression. The following figure shows the flow of APK post-processing during the build.
Benefit: The debug package time is reduced by 21s, and the release package time is reduced by 11s.
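The "decompress once, process in memory, compress once" idea behind the pipeline can be sketched with Python's standard `zipfile` module. This is a simplified illustration and not Youku's Gradle implementation: the step functions and file names are hypothetical, and real APK steps such as signing and alignment need dedicated tools.

```python
import io
import zipfile


def run_pipeline(archive_bytes, steps):
    """Unpack the archive once, run every step in memory, and repack once."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as src:
        entries = {name: src.read(name) for name in src.namelist()}

    for step in steps:
        entries = step(entries)  # each step maps {name: bytes} to {name: bytes}

    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name, data in entries.items():
            dst.writestr(name, data)
    return out.getvalue()


def add_channel(entries):
    # Hypothetical channel step: adds a marker file to the archive.
    updated = dict(entries)
    updated["assets/channel.txt"] = b"demo-channel"
    return updated


def drop_debug_files(entries):
    # Hypothetical cleanup step: removes files that should not ship.
    return {name: data for name, data in entries.items() if not name.endswith(".debug")}


def make_demo_archive():
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("classes.dex", b"dex-bytes")
        archive.writestr("notes.debug", b"temporary")
    return buf.getvalue()


result = run_pipeline(make_demo_archive(), [add_channel, drop_debug_files])
with zipfile.ZipFile(io.BytesIO(result)) as archive:
    result_names = sorted(archive.namelist())
print(result_names)
```

However many steps are chained, the archive is read and written exactly once, whereas running each step as an independent task would repeat both operations per step.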
Build template optimization
Android has many build variants for different purposes: debug, release, remote, non-remote, and so on. During development, some plugin optimizations, such as turbo dex reduction and 7zip compression, are unnecessary and can simply be disabled.
Benefits: debug package build time is shortened by about 1 to 2 minutes; release package build time is unchanged.
Reduce code size
Java code obfuscation accounts for about 60% of overall build time, and obfuscation time is positively correlated with code size. Build time is therefore positively correlated with code size as well.
Part of the growth in code size comes from business expansion, and part comes from accumulated redundant code (code rot). From the second half of 2020 to the first half of 2021, Youku ran an ongoing package size governance effort, which achieved good results and partially alleviated the code rot problem.
As shown in the figure below, Youku's Android app has shrunk by 25% since the second half of 2020, which contributes about 45s+ to build speed.
Benefits: release package build time is reduced by about 45s+; debug package build time is reduced by about 5s to 10s.
Hardware optimization
Private build tenant pool
We used iostat, tsar, and other Linux performance analysis tools to analyze the build process of Youku's Android app (as shown in the figure below). CPU io-wait was severe throughout the build, which means the build performs a large number of I/O operations. Because the build machines had insufficient I/O performance, builds took a long time. This indirectly confirms the effectiveness of the I/O-reducing measures in the software optimization above.
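The io-wait share that tools such as iostat report can also be derived from two samples of the aggregate `cpu` line in Linux's `/proc/stat`. The sketch below shows that calculation under the standard field order (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice); the sample values are made up and are not Youku measurements.

```python
def iowait_percent(before: str, after: str) -> float:
    """Share of CPU time spent in iowait between two aggregate 'cpu' lines of /proc/stat."""

    def parse(line):
        fields = line.split()
        if fields[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(value) for value in fields[1:]]

    first, second = parse(before), parse(after)
    deltas = [b - a for a, b in zip(first, second)]
    # Sum user..steal only; guest time is already counted inside user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return deltas[4] * 100.0 / total  # index 4 is the iowait field


# Two made-up samples taken some seconds apart during a build.
sample_before = "cpu 1000 0 500 8000 500 0 0 0 0 0"
sample_after = "cpu 1300 0 600 8200 900 0 0 0 0 0"

build_iowait = iowait_percent(sample_before, sample_after)
print(f"iowait: {build_iowait:.1f}%")
```

A persistently high value during a build points at storage rather than CPU as the bottleneck, which is the conclusion the analysis above reaches.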
There are two main ways to address an I/O bottleneck: using buffered I/O and improving I/O performance. We evaluated the options as follows:
- First, we built with only the AGP plugin retained and found that build speed did not improve significantly. This shows that the custom plugins leave no room for I/O optimization;
- Second, analysis of the Android build process showed few small, fragmented writes, so handling the bottleneck with buffered I/O offers little room for optimization;
- Finally, the better option is to replace mechanical hard drives with SSDs. The build comparison data on physical machines is as follows:
Machine type | Subsequent build (with dependency cache; the main scenario) | First build (no dependency cache) | Hardware |
---|---|---|---|
Group one (desktop) | 12min57s | 21min | SSD EX900 521G / write peak about 900 MB/s / Intel 2.9 GHz, 16 threads / 16 GB memory |
Group two (Dell R740 blade server) | 25min | 40min | Mechanical hard disk / write peak about 254 MB/s / Intel Xeon 2.1 GHz, 24 threads / 48 GB memory |
Group three (Dell workstation) | 19min58s | 27min | SSD EX900 521G / write peak about 900 MB/s / Intel Xeon 2.2 GHz, 20 threads / 32 GB memory |
Group four (desktop) | 10min10s | 23min | SSD 512G / write peak about 1 GB/s / AMD 3.5 GHz, 24 threads / 32 GB memory |
Group five (devops cluster) | 23min | 40min | Mostly mechanical hard drives, depending on the specific machine scheduled |
Benefits: debug package build time is reduced by about 5 minutes; release package build time is reduced by 10 minutes.
Summary
In summary, to achieve and maintain build speed optimization results, we can do the following:
- To support metric setting and regression prevention, define build optimization metrics and establish a sound data evaluation system;
- Software optimizations such as splitting build templates, cleaning up redundant tasks, upgrading Gradle and the Android Gradle Plugin, and tuning build parameters (dx build optimization) deliver most of the build speed gains;
- Hardware optimization requires analyzing the key bottleneck of the build process first, and the bottleneck may differ from one application to another.
Because of technical limitations and stability concerns, the following parts of the build speed optimization remain unfinished:
- Obfuscation rule control and cleanup: the execution time of the obfuscation task is positively correlated with the number of obfuscation rules, and unreasonable rules increase that time;
- Project rot control: the amount of useless code is an important indicator of how far a project has decayed, and it is also an important factor in build speed. How to control decay in large projects is the next important topic to explore for application architecture and overall application development;
- R8 build optimization: upgrade tests with R8, the Android dex build toolchain, showed a significant improvement in the build speed of Youku's release package. However, because of some Google bugs, the R8 optimization has been postponed.
Youku's technical team will continue to optimize build time along the lines above. You are welcome to discuss it with us at any time.
[Related Documents]
- Linux performance optimization guide: https://www.processon.com/view/618255cf1e08536d882c8afb?fromnew=1
- r8 related issues: https://issuetracker.google.com/issues/192304366?pli=1
- Atlas: https://github.com/alibaba/atlas
We are hiring!
The architecture team of the Youku Technology Center is hiring. Details are below.
【Job Description】
- Responsible for android infrastructure work with Youku as the core, including basic framework, middleware, etc.
- Responsible for the long-term management of app stability, performance, package slimming, etc., to improve the basic user experience.
- Research on the engineering of mobile frontier technologies and the exploration of technological trends.
- Solve all kinds of difficult problems and support fast, stable and efficient business iteration.
【Job Requirements】
- Familiar with basic technologies such as Android SDK and Framework, and have good source code analysis capabilities.
- Proficient in Java with solid fundamentals; Kotlin or C/C++ development experience is preferred.
- Experience in any part of the Dalvik/Art virtual machine is preferred.
- Experience in key technology selection, difficult bug repair, memory optimization, etc. is preferred.
Send resumes to: yanjiao.syj@alibaba-inc.com