Project background

The app is a consumer-facing LBS (location-based service) product. Users reported several problems while using it, including slow startup. This article analyzes the slow startup of the native Android app and describes the solutions we applied.

Challenges encountered

User feedback about slow startup is a subjective evaluation. For engineers, such feedback is not quantitative enough to provide the data needed to solve the problem. The negative reviews did, however, expose two major problems with our app.

1. Imperfect monitoring system

Neither the back-end services nor the mobile client had a full-link tracing framework, so our applications lacked comprehensive monitoring coverage. At the time we recorded only a small number of system-level metrics, and monitoring of critical code paths was seriously insufficient.

2. Poor code quality

Our project has gone through several generations of development and maintenance. As business complexity grew and people came and went, code quality became uneven: code rot is serious, some subsystems are extremely hard to read, and the code style is inconsistent.

Solving the problem

After analyzing the slow startup, we addressed each of the two challenges directly. The main steps were as follows.

1. Improve the monitoring system

After discussion, we were left with two options. The first was custom development on top of an open source APM system such as SkyWalking. The drawback was that our engineers were not familiar with such systems and could not acquire the necessary knowledge in a short time. The second was to adopt a mature commercial APM system, which would put more pressure on the budget in the short term. Given the urgency of the project, we chose the second option and introduced U-APM, the application performance monitoring platform from Umeng+.

With U-APM in place, we can monitor the application side in great detail, for example through the overview monitoring shown in the figure below. Using U-APM's startup analysis, we were able for the first time to see clearly how long user startup takes and how that time is distributed.

In addition, we introduced the ELK stack (Elasticsearch, Logstash, Kibana) and structured logging on the server side, so that information such as the parameters and elapsed time of critical code paths can be output and aggregated efficiently.
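As an illustration of what structured logging for a critical code path might look like, here is a minimal sketch in plain Java. It is not the team's actual logging code: the class `StructuredLog`, the method names, and the field names `path` and `elapsed_ms` are all assumptions made for this example. The idea is simply to write one JSON object per line so that a log shipper can index the fields without custom parsing rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper (not from the original project): one JSON object per log line. */
final class StructuredLog {

    /** Formats the fields as single-line JSON; numbers stay numeric, everything else becomes a string. */
    static String toJson(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v instanceof Number) {
                sb.append(v);
            } else {
                sb.append('"').append(escape(String.valueOf(v))).append('"');
            }
        }
        return sb.append('}').toString();
    }

    /** Escapes the characters that would break a JSON string literal in this simple sketch. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    /** Runs the task, then prints its path name, parameters, and elapsed milliseconds as one JSON line. */
    static <T> T timed(String path, Map<String, Object> params, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            Map<String, Object> event = new LinkedHashMap<>();
            event.putAll(params);
            event.put("path", path);          // assumed field name
            event.put("elapsed_ms", elapsedMs); // assumed field name
            System.out.println(toJson(event));
        }
    }
}
```

A caller would wrap a critical operation as `StructuredLog.timed("poi.search", params, () -> search(query))`, where `poi.search` and `search` are placeholder names. In a real service the line would go through the logging framework rather than `System.out`.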

2. Strengthen code quality control

Our earlier development focused too much on speed and neglected quality. Code review (CR) existed, but it was mostly a formality and quality control was lax. Drawing on industry best practices for engineering efficiency, we now monitor key indicators such as code submissions and the review return rate (how often a review sends code back for rework), publish regular reports, call out projects that fall short, and hold dedicated sessions to introduce practices for improving code quality.

3. Hands-on solution

With the monitoring system in place, we could finally see clearly what our system was doing. By analyzing the measured timings, we made a first identification of the bottlenecks and tackled the most significant ones first. In theory, every request has room for optimization. Does every request need to be optimized? Of course not: there are thousands of interfaces between services, and spending too much effort on non-critical ones yields half the result for twice the effort. We followed an ROI (return on investment) approach and gave priority to the function calls that are both slow and frequently requested, in order to get the greatest benefit. Specifically, we worked on the following points.
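One simple way to express this prioritization is to rank interfaces by the total time they consume, that is, average latency multiplied by call count over a sampling window. The sketch below is only an illustration under that assumption. The article does not say which exact formula was used, and the class `HotspotRanking`, its `Stat` type, and the sample names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical ranking helper: orders interfaces by total time consumed. */
final class HotspotRanking {

    /** Metrics for one interface over a sampling window. */
    static final class Stat {
        final String name;
        final double avgMs; // average latency in milliseconds
        final long calls;   // number of calls in the window

        Stat(String name, double avgMs, long calls) {
            this.name = name;
            this.avgMs = avgMs;
            this.calls = calls;
        }

        /** Total time spent in this interface: the assumed proxy for optimization payoff. */
        double totalMs() {
            return avgMs * calls;
        }
    }

    /** Returns a new list sorted by total time, largest first; the input list is not modified. */
    static List<Stat> rank(List<Stat> stats) {
        List<Stat> sorted = new ArrayList<>(stats);
        sorted.sort((a, b) -> Double.compare(b.totalMs(), a.totalMs()));
        return sorted;
    }
}
```

With this measure, an interface averaging 40 ms but called 5,000 times (200,000 ms in total) ranks above one averaging 900 ms but called 10 times (9,000 ms in total). A slow but rare call can still deserve attention if it sits on the startup path, so the ranking is a starting point rather than a rule.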

1) Android app issues

The first launch of the app took a long time. We trimmed the work done synchronously in Application and Activity so that the core page is rendered for the user as early as possible, and moved time-consuming, non-core page content into asynchronous tasks. For the relevant Android APIs and principles, see the official documentation: https://developer.android.google.cn/topic/performance/vitals/launch-time
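The split between synchronous and deferred startup work can be sketched in plain Java as follows. This is a simplified model, not Android code and not the project's implementation: the class `StartupPlan` and its methods are hypothetical, and on Android the `start()` call would typically sit in `Application.onCreate()` with the caller's thread being the main thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical startup planner: critical work runs inline, the rest runs in the background. */
final class StartupPlan {
    private final List<Runnable> critical = new ArrayList<>();
    private final List<Runnable> deferred = new ArrayList<>();

    /** Registers work that must finish before the first screen can be shown. */
    StartupPlan critical(Runnable task) {
        critical.add(task);
        return this;
    }

    /** Registers time-consuming or non-core work that can finish later. */
    StartupPlan deferred(Runnable task) {
        deferred.add(task);
        return this;
    }

    /**
     * Runs critical tasks on the calling thread, hands deferred tasks to one
     * background thread, and returns a future that completes when they finish.
     */
    CompletableFuture<Void> start() {
        for (Runnable task : critical) {
            task.run();
        }
        ExecutorService background = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "startup-deferred");
            t.setDaemon(true); // do not keep the process alive for deferred work
            return t;
        });
        CompletableFuture<Void> done =
                CompletableFuture.runAsync(() -> deferred.forEach(Runnable::run), background);
        background.shutdown(); // already-submitted work still runs to completion
        return done;
    }
}
```

The calling thread returns from `start()` as soon as the critical tasks are done, so the first screen is not blocked by, for example, analytics or non-core SDK initialization. Any deferred result that the UI needs later must be handed back to the UI thread by the caller.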

2) Server-side issues

Monitoring showed that some back-end services performed garbage collection (GC) frequently. After running for a while this led to stop-the-world pauses, which caused latency fluctuations. By monitoring the critical code paths and investigating memory usage, we traced the problem mainly to improper use of local caches in the application: a large number of temporary cache entries put pressure on memory and made response times fluctuate.
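The article does not describe the exact fix, but one common remedy for an unbounded local cache is to cap the number of entries and evict the least recently used one. The sketch below shows that technique with the standard `LinkedHashMap`. The class name `BoundedCache` is hypothetical, and whether the team used this approach is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-capped LRU cache built on LinkedHashMap's access-order mode. */
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, least recently used first
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

Because the cache can never hold more than `maxEntries` entries, its memory footprint stays bounded and it stops feeding the garbage collector with an ever-growing set of long-lived objects. `LinkedHashMap` is not thread-safe, so a shared cache would need external synchronization, for example `Collections.synchronizedMap`, or a dedicated caching library.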

Project summary

After adopting U-APM, enriching our monitoring, and improving code quality, the performance of the Android app gradually improved, and the number of users experiencing slow startup dropped significantly. In general, slow startup of an Android app has two main causes. First, the app's own code on the mobile side is poorly designed; the common problems are largely covered by the best practices on the official site (https://developer.android.google.cn/topic/performance/vitals). Second, the quality of the back-end services the app calls has room for optimization.

Google also publishes many practical examples of optimizing Android applications, such as the Android Performance Patterns video series.

Personal profile:

Mo Guangzhong is an open source technology enthusiast and full-stack R&D engineer who has participated in many open source projects and led the design and development of many apps.

