头图

This article was originally shared by the Rongyun technical team under the title "Practical Tips: Optimizing Stutter in the IM 'Message' List". The content has been revised to make it easier to follow.

1 Introduction

With the spread of the mobile Internet, IM apps have become indispensable in daily life, for IM developers and ordinary users alike: WeChat for keeping in touch with acquaintances, QQ the living fossil of IM, DingTalk in enterprise scenarios, and so on. Almost everyone has at least one installed.

Here are a few mainstream IM apps (one glance at the home page tells you which is which, so no more words needed):

As shown in the figure above, the home page of these IMs (that is, the "message" list) is the first thing users see every time they open the app. Over time, the "message" list accumulates more and more content, and the message types grow ever more complex.

No matter which IM it is, as the amount and variety of data in the "message" list grow, the scrolling experience inevitably suffers. As the "front page" of the entire IM, this list directly shapes the user's first impression, so its experience matters a great deal!

In view of this, the mainstream IMs on the market pay special attention to the scrolling experience of the "message" list and optimize it specifically (mainly against stutter).

This article shares the Rongyun IM technical team's analysis and practice around the stutter problem in the "message" list of its own product (using the Android side as the example), showing the analysis approach and solutions an IM team can apply to similar problems. I hope it inspires you.

Special note: the source code of the product optimized in this article is available through public channels; interested readers can download it from "Appendix 1: Source Code Download" at the end of this article. It is recommended for research and learning purposes only.

Study and exchange:

  • Instant messaging/push technology development and exchange group 5: 215477170 [recommended]
  • Introduction to mobile IM development: "One entry is enough for novices: Develop mobile IM from scratch"
  • Open source IM framework source code: https://github.com/JackJiang2011/MobileIMSDK

(This article has been simultaneously published at: http://www.52im.net/thread-3732-1-1.html)

2. Related articles

IM client optimization related articles:

"IM Development Practical Sharing: How I Solved Client Freezes Caused by Large Numbers of Offline Messages"
"IM Development Practical Sharing: Full-Text Chat Message Retrieval in NetEase Yunxin's IM Client"
"Rongyun Technology Sharing: Network Link Keep-Alive Practice in Rongyun's Android IM Products"
"Alibaba Technology Sharing: Xianyu IM's Cross-End Transformation Practice Based on Flutter"

Other articles shared by Rongyun technical team:

"Rongyun IM Technology Sharing: Thinking and Practice of Message Delivery Scheme for Ten Thousands of People Chatting"
"Rongyun Technology Sharing: Fully Revealing the Reliable Delivery Mechanism of 100 Million-level IM Messages"
"IM Message ID Technology Topic (3): Decrypting the Chat Message ID Generation Strategy of Rongyun IM Products"
"Instant Messaging Cloud Rongyun's CTO Shares His Entrepreneurship Experience: Technology Entrepreneurship, Are You Really Ready?"
"Rongyun Technology Sharing: Real-time Audio and Video First Frame Display Time Optimization Practice Based on WebRTC"

3. Technical background

For an IM software, the "message" list is the first interface the user comes into contact with. Whether the "message" list slides smoothly has a great impact on the user's experience.

With the continuous increase of functions and accumulation of data, more and more information will be displayed on the "message" list.

We found that after the product has been in use for a while, for example when you return to the "message" list and scroll after a call, severe stutter appears.

So we began a detailed analysis of the stutter in the "message" list, hoping to find the root cause and fix it with an appropriate solution.

PS: The source code of the product discussed in this article can be obtained from public channels, and interested readers can download it from "Appendix 1: Source Code Download" in this article.

4. What is stutter?

When it comes to app stutter, many people will say it is caused by the UI failing to complete rendering within 16ms.

So why does it need to be completed in 16ms? And what needs to be done within 16ms?

With these two questions, let's learn more in this section.

4.1 Refresh rate (RefreshRate) and frame rate (FrameRate)
Refresh rate: the number of times the screen refreshes per second, a property of the hardware. At present most phones have a 60Hz refresh rate (the screen refreshes 60 times per second), while some high-end devices use 120Hz (such as the iPad Pro).

Frame rate: the number of frames drawn per second, a property of the software. Generally, as long as the frame rate matches the refresh rate, the picture we see is smooth, so at a frame rate of 60FPS we do not perceive stutter.

So what is the relationship between refresh rate and frame rate?

An intuitive example makes it clear:

If the frame rate is 60 frames per second but the screen refresh rate is 30Hz, then the upper half of the screen may still show the previous frame while the lower half renders the next one: this is called "tearing". Conversely, if the frame rate is 30 frames per second and the refresh rate is 60Hz, two consecutive refreshes display the same picture, which appears as stutter.

Therefore, raising the frame rate or the refresh rate alone is meaningless; both need to improve together.

Since most Android screens currently refresh at 60Hz, reaching a frame rate of 60FPS requires completing each frame's drawing within 16.67ms (i.e. 1000ms / 60 frames ≈ 16.67ms per frame).

4.2 Vertical synchronization technology
Since the display starts from the top row of pixels and refreshes row by row downwards, there is a time difference between the refresh from the top to the bottom.

There are two common problems:

1) if the frame rate (FPS) is greater than the refresh rate, the screen tearing mentioned above occurs;
2) if the frame rate is much higher still, a rendered frame can be overwritten by the following frame before it is ever displayed, so the frame in between is skipped. This is called frame skipping.

To solve the problem of the frame rate exceeding the refresh rate, vertical synchronization was introduced. Simply put, the display sends a vertical synchronization signal (VSYNC) every 16ms, and the system waits for this signal before starting the next frame's rendering and buffer update, thereby locking the frame rate to the refresh rate.
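To make the relationship concrete, here is a small, framework-free sketch (my own illustration, not code from the article's product): given two consecutive frame timestamps, it computes how many frame periods were skipped. On Android the timestamps would come from Choreographer.FrameCallback#doFrame(frameTimeNanos); this class only does the math.

```java
// Hypothetical jank math: with VSYNC locking frames to ~16.67ms periods,
// an inter-frame gap much larger than one period means frames were skipped.
class FrameSkipDetector {
    private static final long FRAME_NS = 1_000_000_000L / 60; // one 60Hz period, ~16.67ms

    // Whole frame periods skipped between two consecutive frames.
    public static long skippedFrames(long prevFrameNanos, long frameNanos) {
        long delta = frameNanos - prevFrameNanos;
        return Math.max(0, Math.round((double) delta / FRAME_NS) - 1);
    }
}
```

A frame arriving 50ms after its predecessor, for example, means two frames were skipped in between.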

4.3 How does the system generate a frame
Before Android 4.0: processing user input events, drawing, and rasterization were all executed on the CPU by the application's main thread, which easily caused stutter. The main reason is that the main thread carries too heavy a load to handle so many events; in addition, the CPU has only a small number of ALUs (arithmetic logic units) and is not well suited to graphics computation.

After Android 4.0, the application turns on hardware acceleration by default.

After hardware acceleration is turned on: the image operations the CPU is bad at are handed to the GPU, which contains a large number of ALUs designed for massive parallel math (which is also why GPUs are commonly used for mining). In addition, the rendering work of the main thread is handed to a separate render thread (RenderThread): once the main thread has synchronized the content to the RenderThread, it is freed for other work while the render thread completes the remaining steps.

Then the complete one-frame process is as follows:

As shown in the figure above:

1) First, in the first 16ms, the display shows the content of frame 0, and the CPU/GPU processes the first frame;
2) After the vertical synchronization signal arrives, the CPU immediately processes the second frame, and after the processing is completed, it will be handed over to the GPU (the display will display the image of the first frame).

The whole process looks fine, but as soon as the frame rate (FPS) drops below the refresh rate, the screen stutters.

A and B in the figure represent two buffers. Because the CPU/GPU processing time exceeds 16ms, in the second 16ms the display should show the contents of buffer B, but instead it has to display the contents of buffer A again; in other words, a frame is dropped.

Because buffer A is occupied by the display and buffer B by the GPU, the CPU cannot start processing the next frame when the vertical synchronization signal (VSync) arrives, so no drawing work is triggered on the CPU during the second 16ms.

4.4 Triple Buffer
To solve the dropped frames caused by the frame rate (FPS) falling below the screen refresh rate, Android 4.1 introduced triple buffering.

With double buffering, since the display and the GPU each occupy one buffer, the CPU cannot draw when the vertical synchronization signal arrives. Adding one more buffer lets the CPU draw as soon as the signal arrives.

In the second 16ms, although one frame is displayed twice, while the display occupies buffer A and the GPU occupies buffer B, the CPU can still use buffer C to complete its drawing work, so the CPU is fully utilized. The subsequent display is also relatively smooth, effectively preventing the jank from getting worse.

From the drawing process we know that stutter comes from dropped frames, and frames are dropped because the data is not ready for display when the vertical synchronization signal arrives. Therefore, to deal with stutter, we must shorten the CPU/GPU drawing time as much as possible and ensure each frame's rendering completes within 16ms.

5. Analysis of the Stutter Problem

5.1 Stuttering effect in low-end mobile phones
With the theory above in hand, we began analyzing the "message" list stutter. Since the Pixel 5 the boss uses is a high-end device on which the stutter is not obvious, we deliberately borrowed a low-end device from the test team.

The configuration of this low-end machine is as follows:

Let's take a look at the effect before optimization:

Sure enough, it stutters badly. Let's check the phone's refresh rate:

It's 60Hz, no problem there.

Check the specific architecture of SDM450 on the Qualcomm website:

You can see that the CPU of this phone is an 8-core A53 Processor:

The A53 is generally used as the little core in a big.LITTLE architecture. Its main role is to save power: it handles scenarios with low performance demands, such as standby and background execution, and the A53 does take power efficiency to the extreme.

The Samsung Galaxy A20s uses only this processor, with no big cores, so it is naturally not very fast, which demands that our app be optimized all the better.

With a general understanding of the phone, we used tools to locate the stutter.

5.2 Analyze the stuck point
First, open the GPU rendering mode analysis tool that comes with the system to view the "message" list.

You can see the bars tower sky-high. The green horizontal line near the bottom of the figure represents 16ms; whenever a bar exceeds this line, frames may be dropped.

Using the color mapping table provided by Google, let's see roughly where the time is going.

First of all, we must make it clear that although this tool is called the GPU rendering mode analysis tool, most of the operations displayed in it occur in the CPU.

Secondly, according to the color comparison table, you may also find that the colors given by Google do not correspond to the colors on the real phone. So we can only judge the approximate location of the time-consuming.

As our screenshot shows, the green portions take a large share: part is Vsync delay, and the rest is input handling + animation + measure/layout.

The legend explains Vsync delay as the time taken by operations between two consecutive frames.

In fact, the next time SurfaceFlinger distributes Vsync, it inserts a "Vsync arrived" message into the UI thread's MessageQueue; the message is not executed immediately but only after the messages ahead of it finish. So Vsync delay is the time between the Vsync message being enqueued and it being executed. The longer this is, the more work the UI thread is doing, and some tasks need to be offloaded to other threads.

Input processing, animation, measurement/layout are all callbacks when the vertical synchronization signal arrives and starts to execute the doFrame method.

void doFrame(long frameTimeNanos, int frame) {
    // ... irrelevant code omitted

    try {
        Trace.traceBegin(Trace.TRACE_TAG_VIEW, "Choreographer#doFrame");
        AnimationUtils.lockAnimationClock(frameTimeNanos / TimeUtils.NANOS_PER_MS);

        mFrameInfo.markInputHandlingStart();
        // Input handling
        doCallbacks(Choreographer.CALLBACK_INPUT, frameTimeNanos);

        mFrameInfo.markAnimationsStart();
        // Animation
        doCallbacks(Choreographer.CALLBACK_ANIMATION, frameTimeNanos);
        doCallbacks(Choreographer.CALLBACK_INSETS_ANIMATION, frameTimeNanos);

        mFrameInfo.markPerformTraversalsStart();
        // Measure / layout
        doCallbacks(Choreographer.CALLBACK_TRAVERSAL, frameTimeNanos);
        doCallbacks(Choreographer.CALLBACK_COMMIT, frameTimeNanos);
    } finally {
        AnimationUtils.unlockAnimationClock();
        Trace.traceEnd(Trace.TRACE_TAG_VIEW);
    }
}

If this part is time-consuming, check whether the input event callbacks perform heavy work, whether there are large numbers of custom animations, or whether the layout hierarchy is so deep that measuring and laying out Views takes too long.

6. Specific optimization plan and practice summary

6.1 Asynchronous execution
After having a general direction, we began to optimize the "message" list.

In the problem analysis we found that Vsync delay was large, so the first thing we thought of was stripping time-consuming tasks out of the main thread and executing them on worker threads. To locate slow main-thread methods faster, you can use DiDi's DoKit or Tencent's Matrix for slow-function detection.

We found that in the ViewModel of the "message" list, LiveData was used to subscribe to the changes in the user information table, the group information table, and the group member table in the database. As long as there are changes in these three tables, the "message" list will be traversed again, the data will be updated, and the page refresh will be notified.

This part of the logic is executed in the main thread and takes about 80ms. If there are many "message" lists and large changes in the database table data, this part of the time will increase.

mConversationListLiveData.addSource(getAllUsers(), new Observer<List<User>>() {
    @Override
    public void onChanged(List<User> users) {
        if (users != null && users.size() > 0) {
            // Traverse the "message" list
            Iterator<BaseUiConversation> iterable = mUiConversationList.iterator();
            while (iterable.hasNext()) {
                BaseUiConversation uiConversation = iterable.next();
                // Update the user info on each item
                uiConversation.onUserInfoUpdate(users);
            }
            mConversationListLiveData.postValue(mUiConversationList);
        }
    }
});

Since this part is time-consuming, we can move the traversal and data update into a worker thread, then call postValue to notify the page to refresh once the work completes.
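As a minimal, framework-free sketch of this idea (class and method names are my own, not the product's API): the traversal runs on a single worker thread, and only the finished result is handed back. In the real ViewModel the hand-off would be LiveData.postValue(), which is itself safe to call from any thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical updater: strings stand in for BaseUiConversation items, and
// appending a suffix stands in for uiConversation.onUserInfoUpdate(users).
class ConversationListUpdater {
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "conversation-worker");
        t.setDaemon(true); // let the process exit even while the worker is idle
        return t;
    });

    // Returns a Future so completion can be observed; the real code would
    // instead call mConversationListLiveData.postValue(updated) at the end.
    public Future<List<String>> updateAsync(final List<String> conversations, final String suffix) {
        return worker.submit(() -> {
            List<String> updated = new ArrayList<>();
            // Guard the shared list: the UI thread may mutate it concurrently.
            synchronized (conversations) {
                for (String c : conversations) {
                    updated.add(c + suffix);
                }
            }
            return updated;
        });
    }
}
```

The synchronized block is the point the article warns about below: once the traversal leaves the main thread, the list becomes shared mutable state and must be guarded.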

We also found that every time you enter the "message" list, you need to get the "message" list data from the database, and you will also read the session data from the database when you load more.

After reading the session data, we will filter the acquired sessions. For example, sessions that are not in the same organization should be filtered out.

After the filtering is completed, it will be deduplicated:

1) If the session already exists, update the current session;
2) If it does not exist, create a new session and add it to the "message" list.
Then you need to sort the "message" list according to certain rules, and finally notify the UI to refresh.

The time-consuming of this part is 500ms~600ms, and the time-consuming will increase as the amount of data increases, so this part must be executed in a sub-thread.

But you must pay attention to thread safety issues here, otherwise data will be added multiple times, and multiple duplicate data will appear on the "message" list.

6.2 Increase cache
When reviewing the code, we found many places that fetch the current user's information, which is stored locally in SP (later changed to MMKV) in JSON format. Each fetch therefore reads from SP (an IO operation) and then deserializes the JSON into an object (reflection).

/**
 * Get current user information
 */
public UserCacheInfo getUserCache() {
    try {
        String userJson = sp.getString(Const.USER_INFO, "");
        if (TextUtils.isEmpty(userJson)) {
            return null;
        }
        Gson gson = new Gson();
        UserCacheInfo userCacheInfo = gson.fromJson(userJson, UserCacheInfo.class);
        return userCacheInfo;
    } catch (Exception e) {
        e.printStackTrace();
    }
    return null;
}

It would be very time-consuming to obtain the current user's information in this way every time.

In order to solve this problem, we cache the user information obtained for the first time, and return it directly if the current user information exists in the memory, and update the object in the memory every time the current user information is modified.

/**
 * Get current user information
 */
public UserCacheInfo getUserCacheInfo() {
    // If the current user info is already in memory, return it directly
    if (mUserCacheInfo != null) {
        return mUserCacheInfo;
    }
    // Otherwise read it from SP
    mUserCacheInfo = getUserInfoFromSp();
    if (mUserCacheInfo == null) {
        mUserCacheInfo = new UserCacheInfo();
    }
    return mUserCacheInfo;
}

/**
 * Save user information
 */
public void saveUserCache(UserCacheInfo userCacheInfo) {
    // Update the in-memory cache
    mUserCacheInfo = userCacheInfo;
    // Persist the user info to SP
    saveUserInfo(userCacheInfo);
}

6.3 Reduce the number of refreshes
In this scheme we need, on the one hand, to eliminate unreasonable refreshes and, on the other, to replace global refreshes with partial refreshes.

In the ViewModel of the "message" list, LiveData subscribes to changes in the user information table, group information table, and group member table in the database. As long as there are changes in these three tables, the "message" list will be traversed again, the data will be updated, and the page refresh will be notified.

The logic looks fine, but the code notifying the page to refresh was written inside the loop; that is, the page was refreshed once for every session updated. With 100 sessions, that is 100 refreshes.

mConversationListLiveData.addSource(getAllUsers(), new Observer<List<User>>() {
    @Override
    public void onChanged(List<User> users) {
        if (users != null && users.size() > 0) {
            // Traverse the "message" list
            Iterator<BaseUiConversation> iterable = mUiConversationList.iterator();
            while (iterable.hasNext()) {
                BaseUiConversation uiConversation = iterable.next();
                // Update the user info on each item
                uiConversation.onUserInfoUpdate(users);
                // The pre-optimization code notified the page on every iteration:
                // mConversationListLiveData.postValue(mUiConversationList);
            }
            mConversationListLiveData.postValue(mUiConversationList);
        }
    }
});

The fix: move the refresh notification outside the loop and refresh once after all the data has been updated.

Our app has a draft feature. Every time the user leaves a conversation, we check whether the input box still contains unsent text (a draft); if so, we save it and show [Draft] + content in the "message" list, restoring it the next time the user enters the conversation. Because of drafts, the page must be refreshed every time the user returns from a conversation to the "message" list. Before optimization this was a global refresh, but we only need to refresh the item for the conversation that was just exited.

For an IM app, unread-message reminders are a standard feature: an unread count is shown on the session's avatar in the "message" list. When we enter a conversation, its unread count must be cleared and the "message" list updated. Before optimization this, too, was a global refresh; it can likewise be changed to refresh a single item.

Our app also added a typing indicator: as long as someone is typing in a conversation, the text "XX is typing..." is shown in the "message" list. Before optimization this also refreshed the whole list; if people were typing in several conversations at once, essentially the entire "message" list refreshed constantly. So this was changed to a partial refresh as well, updating only the session items where someone is currently typing.
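All three partial refreshes above boil down to the same step: locate the one changed row, then refresh only it. A tiny framework-free sketch (the helper name is hypothetical, and strings stand in for conversation ids):

```java
import java.util.List;

// Hypothetical helper: find the position of the changed conversation so the
// adapter can call notifyItemChanged(position) instead of notifyDataSetChanged().
class PartialRefreshHelper {
    // Index of the conversation with the given id, or -1 if it is not in the list.
    public static int positionOf(List<String> conversationIds, String targetId) {
        for (int i = 0; i < conversationIds.size(); i++) {
            if (conversationIds.get(i).equals(targetId)) {
                return i;
            }
        }
        return -1;
    }
}
```

With the position in hand, returning from a conversation only needs one adapter.notifyItemChanged(position) call for that row, and RecyclerView rebinds just that item.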

6.4 onCreateViewHolder optimization

When analyzing the Systrace report, we found the situation in the above figure: a swipe is accompanied by a large number of CreateView operations.

Why does this happen?

We know RecyclerView has its own caching mechanism: while scrolling, if a newly revealed item has the same layout as an old one, CreateView is not executed again; the old item is reused and bindView sets the data, avoiding the IO and reflection cost of creating views.

So why is it different here than expected?

Let's take a look at the caching mechanism of RecyclerView first.

RecyclerView has 4 levels of cache, we only analyze the commonly used 2 levels here:

1)mCachedViews;
2)mRecyclerPool。
The default size of mCachedViews is 2. When an item has just moved out of the screen's visible range, it is placed in mCachedViews, because the user is likely to scroll it back into view; items in mCachedViews can be reused without re-running createView or bindView.

mCachedViews follows the FIFO principle: when the cache is full, the item that entered first is evicted into the next cache level.

mRecyclerPool is a RecycledViewPool, which maintains a separate pool per item type; each pool's default size is 5. Items evicted from mCachedViews have their data cleared and are placed into the pool matching their itemType.

There are two things worth noting here:

1) first, the item's data is cleared, meaning bindView must be re-executed to reset the data the next time the item is used;
2) second, there are multiple pools, one per itemType, each 5 entries by default. Items of different types go into different pools; whenever a new item is about to be displayed, RecyclerView first looks for a reusable item in the pool of the matching type. If one exists it is reused and only bindView runs; if not, the view is recreated and both createView and bindView must run.
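Incidentally, the default pool size of 5 per type is itself tunable: RecyclerView publicly exposes RecycledViewPool.setMaxRecycledViews(viewType, max). A configuration fragment (the TYPE_* constants are illustrative, not from the product's code):

```java
// Enlarge each per-type pool so more off-screen items survive for reuse.
RecyclerView.RecycledViewPool pool = recyclerView.getRecycledViewPool();
pool.setMaxRecycledViews(TYPE_GROUP, 10);   // TYPE_* are hypothetical viewType constants
pool.setMaxRecycledViews(TYPE_PRIVATE, 10);
pool.setMaxRecycledViews(TYPE_SECRET, 10);
```

This can soften, but not remove, the multi-type reuse problem analyzed below.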
A large number of CreateViews appeared in the Systrace report, indicating that there was a problem in reusing the item, which caused a new item to be recreated every time a new item was displayed.

Let's consider an extreme scenario. There are 3 types of items in our "message" list:

1) Group chat item;
2) Single chat item;
3) Secret chat item.
We can display 10 items on one screen. The first 10 items are all group chat types. From 11 to 20 are single chat items, from 21 to 30 are secret chat items.

From the figure, we can see that group chat 1 and group chat 2 have been removed from the screen and will be placed in the mCachedViews cache at this time. For single chat 1 and single chat 2 because no reusable item can be found in the single chat buffer pool of mRecyclerPool, the CreateView and BindView operations need to be performed.

Because everything scrolling off screen was a group chat, incoming single chat items could never get a reusable item from the single chat pool, so CreateView and BindView were always needed.

Only once single chat 1 enters the pool, as shown in the figure above, could a single chat (or group chat) item about to enter the screen be reused. But what comes in next is a secret chat, and since the secret chat pool has no reusable item, the incoming secret chat items also need CreateView and BindView. In this situation, RecyclerView's entire caching mechanism has essentially failed.

One extra note: why does the group chat pool hold group chat 1 ~ group chat 5 rather than group chat 6 ~ group chat 10? The drawing is not wrong: when a pool is already full, RecyclerView simply discards newly recycled items instead of adding them.

/**
 * Add a scrap ViewHolder to the pool.
 * <p>
 * If the pool is already full for that ViewHolder's type, it will be immediately discarded.
 *
 * @param scrap ViewHolder to be added to the pool.
 */
public void putRecycledView(ViewHolder scrap) {
    final int viewType = scrap.getItemViewType();
    final ArrayList<ViewHolder> scrapHeap = getScrapDataForType(viewType).mScrapHeap;
    // If the pool already holds the maximum number of views for this type, bail out
    if (mScrap.get(viewType).mMaxScrap <= scrapHeap.size()) {
        return;
    }
    if (DEBUG && scrapHeap.contains(scrap)) {
        throw new IllegalArgumentException("this scrap item already exists");
    }
    scrap.resetInternal();
    scrapHeap.add(scrap);
}

So far we can explain why the Systrace report showed so many CreateViews. Knowing the cause, we need a solution. Views are created repeatedly mainly because the reuse mechanism fails or works poorly, and the main reason it fails is having 3 different item types at the same time. If we merge the 3 types into one, then when single chat 4 enters the screen it can obtain a reusable item from the pool, skipping CreateView and going straight to BindView to reset the data.

With this idea in mind, we reviewed the code and found that group chat, single chat, and secret chat all use the same layout, so they can share a single itemType. They had been separated in the past for design-pattern reasons: implementing group chat, single chat, and secret chat in their own classes makes future extensions more convenient and clearer.

This is a trade-off between performance and design. But thinking it through, the layouts of the different chat types in the "message" list are essentially identical; the types differ only in what the UI displays, which we can reset during bindView.

At registration time we register only BaseConversationProvider, so there is only one itemType. GroupConversationProvider, PrivateConversationProvider, and SecretConversationProvider all inherit from BaseConversationProvider, and onCreateViewHolder is implemented only in BaseConversationProvider.

BaseConversationProvider holds a List containing the GroupConversationProvider, PrivateConversationProvider, and SecretConversationProvider objects. When bindViewHolder executes, the parent class's method runs first, handling the logic common to all three chat types, such as the avatar and the time of the last message. After that, isItemViewType determines the current chat type and the corresponding subclass's bindViewHolder performs the type-specific data handling. Note the display errors reuse can cause here: for example, a secret chat changes the color of the session title, and because items are reused, a group chat's title color ends up changed too.
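The delegation just described can be sketched framework-free as follows (class and method names are illustrative stand-ins, not the product's actual API): one binder owns the shared logic plus a list of type-specific providers, and all rows share a single item type.

```java
import java.util.List;

// Type-specific provider: only per-type styling lives here.
abstract class ConversationProvider {
    public abstract boolean matches(String conversationType);
    public abstract String bindSpecific(String content);
}

// Shared binder: common logic runs once, then the matching provider
// adds its own touches -- all under one item type.
class BaseConversationBinder {
    private final List<ConversationProvider> providers;

    public BaseConversationBinder(List<ConversationProvider> providers) {
        this.providers = providers;
    }

    public String bind(String conversationType, String title) {
        // Stand-in for the shared work (avatar, last-message time, ...)
        String common = "[common]" + title;
        for (ConversationProvider p : providers) {
            if (p.matches(conversationType)) {
                return p.bindSpecific(common);
            }
        }
        return common;
    }
}
```

Because holders are now shared across types, every type-specific attribute (such as the secret chat's title color) must be explicitly reset in the type-specific step, otherwise a reused row keeps the previous type's styling.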

After the transformation, we can save a lot of CreateView operations (IO + reflection), so that RecyclerView's caching mechanism can run well.

6.5 Preload + global cache
Although we have reduced the number of CreateViews, we still need CreateView on the first screen when we first enter, and we find that CreateView takes a long time.

Can this part of time be optimized?

The first thing we thought of was loading the layout asynchronously in onCreateViewHolder, doing the IO and reflection on a child thread; but we later dropped that solution (the specific reason is discussed later). If asynchronous loading is out, we can instead create the Views ahead of time and cache them.

We first created a ConversationItemPool class, which preloads items on child threads and caches them. When onCreateViewHolder executes, a cached item is taken directly from this class, reducing the time spent in onCreateViewHolder.


In ConversationItemPool we use a thread-safe queue to cache the created items. Since this is a global cache, we must watch out for memory leaks.
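As a framework-free sketch of what such a pool can look like (the real class caches Views inflated on a worker thread; here a generic T stands in for the View so the example is self-contained):

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

// Thread-safe item pool: preload ahead of time, hand items out in
// onCreateViewHolder, and take them back in onViewDetachedFromWindow.
class ItemPool<T> {
    private final ConcurrentLinkedQueue<T> pool = new ConcurrentLinkedQueue<>();

    // Called from a worker thread at startup: create `count` items in advance.
    public void preload(int count, Supplier<T> factory) {
        for (int i = 0; i < count; i++) {
            pool.offer(factory.get());
        }
    }

    // May return null: callers must fall back to creating an item normally.
    public T acquire() {
        return pool.poll();
    }

    // Return a detached item to the pool for later reuse.
    public void release(T item) {
        pool.offer(item);
    }
}
```

An unbounded global queue like this is exactly where leaks hide: anything released into it stays reachable forever, so cached entries holding a Context must be cleared when the list screen is destroyed.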

So how many items are appropriate to preload?

Comparing our test machines at different resolutions, the first screen generally shows 10-12 items. Since the cache cannot yet serve the first swipe, roughly 3 more items still need the CreateView method, and these must be counted as well, so we set the preload count to 16. Afterwards, Views are recycled in onViewDetachedFromWindow and returned to the pool.

@Override
public ViewHolder onCreateViewHolder(ViewGroup parent, int viewType) {
    // Take an item from the cache pool
    View view = ConversationListItemPool.getInstance().getItemFromPool();
    // If none was retrieved, create the item normally
    if (view == null) {
        view = LayoutInflater.from(parent.getContext()).inflate(R.layout.rc_conversationlist_item, parent, false);
    }
    return ViewHolder.createViewHolder(parent.getContext(), view);
}

Note: the onCreateViewHolder method must include a fallback: if no cached View is retrieved, one must be created and used normally. With this in place, we reduced onCreateViewHolder to 2 milliseconds or less, and to effectively zero once RecyclerView's own cache takes effect.

Besides preloading on asynchronous threads, the cost of creating Views from XML can also be addressed with open-source libraries such as the X2C framework, whose main principle is to convert XML files into Java code at compile time so that Views are created without IO or reflection. Alternatively, the layout can be built with Jetpack Compose's declarative UI.

6.6 onBindViewHolder optimization

When we checked the Systrace report, we found that besides CreateView, BindView is also time-consuming, and its cost even exceeds CreateView's. At that rate, if 10 new items are shown during one swipe, it takes more than 100 milliseconds.

This is absolutely unacceptable, so we set about cleaning up the time-consuming operations in onBindViewHolder.

First, be clear that onBindViewHolder should only be used for UI setup; it should not perform any time-consuming operations or business logic. Such work must be processed in advance and stored in the data source.

Checking onBindViewHolder, we found that if a user has no avatar, a default avatar is regenerated from the initials of the user name. This involves computing an MD5 hash, creating and compressing a Bitmap, and then writing it to local storage (IO), a very time-consuming sequence. So we extracted this work from onBindViewHolder, generated the data in advance into the data source, and read it directly from the data source when needed.
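
As a plain-Java sketch of moving this work off the bind path (the class and method names are illustrative; the Bitmap creation, compression, and disk IO are omitted), the MD5 cache key could be precomputed like this:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative helper: the hash used as a default-avatar cache key is computed
// once, off the bind path, and stored alongside the conversation data.
public class AvatarKey {
    /** MD5 hex digest of the user name; used only as a cache-file key, not for security. */
    public static String cacheKeyFor(String userName) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(userName.getBytes(StandardCharsets.UTF_8));
            // Pad to 32 hex characters so keys have a uniform length
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```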

On our "message" list, each conversation displays the send time of its last message, and the display format is complex. Previously, onBindViewHolder formatted the last message's millisecond timestamp into the corresponding display String on every call, which is also time-consuming. We extracted this code as well: onBindViewHolder now simply takes the already-formatted string out of the data source and displays it.
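
A minimal sketch of this precomputation (the class name and the simplified date pattern are ours; the real formatting rules are much more involved):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Illustrative sketch: format the timestamp once, when the data source is
// built on a worker thread, so onBindViewHolder only reads a plain String.
public class ConversationUiModel {
    private static final SimpleDateFormat FORMAT =
            new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.US); // simplified pattern

    public final long lastMessageTimeMillis;
    public final String displayTime; // precomputed; read directly during bind

    public ConversationUiModel(long lastMessageTimeMillis) {
        this.lastMessageTimeMillis = lastMessageTimeMillis;
        this.displayTime = FORMAT.format(new Date(lastMessageTimeMillis));
    }
}
```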

The unread message count is displayed on the avatar, but it has several different display cases.

for example:

1) If the unread count is a single digit, the background image is a circle;
2) If the unread count is two digits, the background image is an ellipse;
3) If the unread count is greater than 99, "99+" is displayed with a longer background image;
4) If the conversation is muted, only a small dot is shown and no count is displayed.
As shown below:

Because of these cases, the code here directly set different PNG background images according to the unread count. These backgrounds can actually be implemented with a Shape.

With a PNG image, the PNG must first be decoded and then rendered by the GPU, and decoding consumes CPU resources. Shape information, in contrast, is passed directly to the lower layer for the GPU to render, which is faster. So we replaced the PNG images with Shape implementations.
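
As an illustrative sketch (the file name, colors, and sizes here are made-up values, not the product's actual ones), the single-digit circular badge could be defined as a shape drawable:

```xml
<!-- res/drawable/bg_unread_circle.xml: illustrative circular badge background -->
<shape xmlns:android="http://schemas.android.com/apk/res/android"
    android:shape="oval">
    <solid android:color="#FFF44336" />
    <size android:width="18dp" android:height="18dp" />
</shape>
```

The two-digit and "99+" variants could similarly use a rectangle shape with a large corner radius, avoiding PNG decoding entirely.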

Besides image settings, the widget used most in onBindViewHolder is TextView, and text measurement accounts for a large share of the cost of setting text. This measurement can actually be executed on a child thread. Google is aware of this too: Android P introduced a new class, PrecomputedText, which allows the most time-consuming step, text measurement, to run on a child thread. Since this class is only available from Android P, we use AppCompatTextView instead of TextView, which handles version compatibility internally.

AppCompatTextView tv = (AppCompatTextView) view;
// Use this method instead of setText
tv.setTextFuture(PrecomputedTextCompat.getTextFuture(text, tv.getTextMetricsParamsCompat(), ThreadManager.getInstance().getTextExecutor()));

It is very simple to use; we will not repeat the principle here, you can look it up yourself. On lower versions, StaticLayout is used for rendering, which can also speed things up. For details, see "Improving Comment Rendering on Android" shared by Instagram.

6.7 Layout optimization
In addition to reducing BindView time, the depth of the layout hierarchy also affects the cost of onMeasure and onLayout. Using the GPU rendering profiling tool, we found that measurement and layout took a lot of time, so we planned to reduce the layout depth of the item.

Before optimization, the maximum depth of our item layout was 5; some levels existed only because a wrapping layout had been added for convenient show/hide control. Using ConstraintLayout, we finally reduced the maximum depth to 2.
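
An illustrative flattened item layout (the view IDs and sizes here are ours, not the actual SDK layout) keeps every child a direct child of one ConstraintLayout, so the maximum depth stays at 2:

```xml
<androidx.constraintlayout.widget.ConstraintLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    android:layout_width="match_parent"
    android:layout_height="wrap_content">

    <ImageView
        android:id="@+id/iv_avatar"
        android:layout_width="48dp"
        android:layout_height="48dp"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

    <TextView
        android:id="@+id/tv_title"
        android:layout_width="0dp"
        android:layout_height="wrap_content"
        app:layout_constraintStart_toEndOf="@id/iv_avatar"
        app:layout_constraintEnd_toStartOf="@id/tv_time"
        app:layout_constraintTop_toTopOf="@id/iv_avatar" />

    <TextView
        android:id="@+id/tv_time"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintTop_toTopOf="@id/tv_title" />
</androidx.constraintlayout.widget.ConstraintLayout>
```

Constraints replace nested wrapper layouts: show/hide control is done per-view instead of by toggling a wrapping container.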

In addition, we checked for redundant background colors, because setting the background repeatedly leads to overdraw. Overdraw means a pixel is drawn multiple times within the same frame; if invisible UI also performs drawing operations, pixels in some areas are drawn repeatedly, wasting a lot of CPU and GPU resources.

Besides removing duplicate backgrounds, we can also minimize the use of transparency. When drawing transparency, the Android system draws the same area twice: first the original content, then the added transparency effect. Essentially every alpha animation in Android causes overdraw, so minimize transparency animations and try not to use the alpha attribute on Views. For the underlying principle, see Google's official video.

After reducing the hierarchy with ConstraintLayout and removing duplicate backgrounds, the list still stuttered a little. Searching online, we found reports of problems when using ConstraintLayout inside RecyclerView items, apparently caused by a ConstraintLayout bug, so we checked which version we were using.

// App dependencies
appCompatVersion = '1.1.0'
constraintLayoutVersion = '2.0.0-beta3'

We were using a beta version. After switching to the latest stable version, 2.1.0, the situation improved considerably. So try not to ship test versions in commercial applications.

6.8 Other optimizations
In addition to the optimization points mentioned above, there are a few smaller ones.

1) Use a newer version of RecyclerView, where the prefetch function is enabled by default.

As the figure above shows, after the UI thread finishes processing data and hands it to the Render thread, it sits idle waiting for the next Vsync signal, and this idle time is wasted. With prefetch enabled, that idle time can be put to good use.

2) Call setHasFixedSize(true) on the RecyclerView. This tells the RecyclerView that its own size is not affected by the adapter contents, so updates made through the Adapter's onItemRangeChanged(), onItemRangeInserted(), onItemRangeRemoved(), and onItemRangeMoved() methods do not force the RecyclerView to recalculate its size.

3) If RecyclerView animations are not used, disable the default change animations via ((SimpleItemAnimator) rv.getItemAnimator()).setSupportsChangeAnimations(false) to improve efficiency.

7. Abandoned optimization schemes

While optimizing the "message" list, we tried some optimization schemes that were ultimately not adopted. They are listed here for reference.

7.1 Load layout asynchronously
As mentioned earlier, when reducing CreateView time, we initially planned to load the layout asynchronously so that the IO and reflection would execute on child threads.

We used Google's official AsyncLayoutInflater to load the layout asynchronously; it notifies us through a callback when loading finishes. But it is normally used in onCreate, whereas onCreateViewHolder must return a ViewHolder synchronously, so it cannot be used there directly.

To solve this, we customized an AsyncFrameLayout class inheriting from FrameLayout. In onCreateViewHolder we add the AsyncFrameLayout as the ViewHolder's root layout and call a custom inflate method to load the real layout asynchronously; once loading succeeds, the loaded layout is added to the AsyncFrameLayout as a child View.

public void inflate(int layoutId, OnInflateCompleted listener) {
    new AsyncLayoutInflater(getContext()).inflate(layoutId, this, new AsyncLayoutInflater.OnInflateFinishedListener() {
        @Override
        public void onInflateFinished(@NonNull View view, int resid, @Nullable ViewGroup parent) {
            // Mark the inflate as completed
            isInflated = true;
            // After the layout is loaded, add it as a child of the AsyncFrameLayout
            parent.addView(view);
            if (listener != null) {
                // After loading, replay the pending BindView request to bind the data
                listener.onCompleted(mBindRequest);
            }
            mBindRequest = null;
        }
    });
}

Note: because inflation is asynchronous, onBindViewHolder runs right after onCreateViewHolder completes, when the layout has most likely not finished loading. A flag such as isInflated is therefore needed to indicate whether the layout has loaded; if it has not, data must not be bound yet. The BindView request also needs to be recorded, and once the layout finishes loading it is actively invoked once to refresh the data.

The main reason this approach was not adopted is that it increases the layout depth; and once preloading was in place, this scheme was no longer needed.

7.2 DiffUtil
DiffUtil is a data comparison tool officially provided by Google. It can compare two sets of new and old data, find out the differences, and then notify RecyclerView to refresh.

DiffUtil uses Eugene W. Myers' difference algorithm to calculate the minimum number of updates needed to convert one list into the other. Since the comparison itself can be time-consuming, the AsyncListDiffer class can be used to run it on an asynchronous thread.

When using DiffUtil we found there were too many data fields to compare. To address this, we wrapped the data source, added a field indicating whether an item was updated, made all member variables private, and exposed setters that uniformly set the updated flag to true. Then, when comparing two data sets, we only need to check whether that flag is true to know whether there was an update.
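
The wrapper described above can be sketched as follows (class and field names are ours; the real data source has many more fields):

```java
// Illustrative "dirty flag" wrapper: every setter marks the item updated,
// so the diff comparison reduces to a single boolean check per item.
public class ConversationItem {
    private String title;
    private String lastMessage;
    private boolean updated; // set whenever any field changes

    public void setTitle(String title) {
        this.title = title;
        this.updated = true;
    }

    public void setLastMessage(String lastMessage) {
        this.lastMessage = lastMessage;
        this.updated = true;
    }

    public String getTitle() { return title; }
    public String getLastMessage() { return lastMessage; }

    /** Reads and clears the flag, e.g. after a diff pass has consumed it. */
    public boolean consumeUpdated() {
        boolean wasUpdated = updated;
        updated = false;
        return wasUpdated;
    }
}
```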

The idea was nice, but when actually wrapping the data source we found that the class contains other classes (i.e., member objects rather than primitive types). External code can obtain such an object via its getter and then modify it through the reference, bypassing the setter entirely. Fixing this would mean exposing setters for all nested-class attributes in the wrapper and removing the getters for the nested objects, a very large change.

If that were the only problem it could be solved, but the "message" list also moves a conversation to the top whenever it receives a new message. Because positions change, the whole list has to be refreshed, which defeats the original intent of using DiffUtil for partial refresh. For example, if the fifth conversation receives a new message it must move to first place; without refreshing the entire list, duplicated conversations can appear.

Because of this, we abandoned DiffUtil: even if the duplicate-conversation problem were solved, the benefit would not be great.

7.3 Refresh when sliding stops
To avoid a flood of refresh operations on the "message" list, we updated the data while the list was being swiped but waited until swiping stopped before refreshing.

However, in actual testing, the refresh after stopping caused a one-off visible freeze, which was more obvious on low- and mid-range devices, so this strategy was abandoned.

7.4 Pagination loading in advance
Since the "message" list can contain a large number of conversations, we load the data in pages.

To keep users from perceiving any loading wait, we planned to fetch more data shortly before the user scrolls to the end of the list, so that scrolling down feels seamless.
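
The trigger logic for this kind of ahead-of-time pagination can be sketched as follows (the class name and threshold value are illustrative):

```java
// Illustrative trigger for ahead-of-time pagination: decide whether to kick off
// loading the next page based on how close the last visible item is to the end
// of the currently loaded data.
public class PreloadTrigger {
    private final int threshold; // start loading when this many items remain

    public PreloadTrigger(int threshold) { this.threshold = threshold; }

    /** True when the next page should be requested (and no load is in flight). */
    public boolean shouldLoadMore(int lastVisiblePosition, int totalLoaded, boolean loading) {
        return !loading && totalLoaded - lastVisiblePosition <= threshold;
    }
}
```

In a RecyclerView this check would run from a scroll listener, using the layout manager's last visible position.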

The idea was sound, but in practice it also caused a momentary freeze on low-end devices, so this method was shelved for the time being.

Besides the abandoned schemes above, we noticed during optimization that the "message" lists of comparable products from other vendors do not scroll very fast. If the scroll speed is lower, fewer items need to be displayed per swipe, so each swipe renders less data. This is itself an optimization point; we may later consider lowering the scroll speed.

8. Summary of this article

During development, as business keeps being added, the complexity of our methods and logic keeps increasing. We must pay attention to time-consuming methods and move them to child threads for execution whenever possible.

When using RecyclerView, do not refresh blindly: refresh partially rather than globally whenever possible, and defer a refresh rather than refreshing immediately whenever possible.

When analyzing stalls, use tools to improve efficiency: after finding the general problem and a troubleshooting direction with Systrace, locate the specific code with the Profiler built into Android Studio.


This article has been simultaneously published on the official account of "Instant Messaging Technology Circle".
This article has been simultaneously published at: http://www.52im.net/thread-3732-1-1.html

JackJiang

Focused on the study and research of instant messaging (IM/push) technology.