A Preliminary Study of Snapshot Technology for DingTalk Mini Programs

Author: Sun Ran (Boiled Shrimp)

With mini programs, a white screen or a loading page is unavoidable while the container loads and the front end renders asynchronously. The first screen can take anywhere from a moment to several seconds to appear. A long white screen seriously hurts the user experience. According to Google's statistics, 53% of users leave a page outright if it takes more than 3 seconds to load.

To speed up the display of a mini program's homepage, Alipay and others use an HTML-based snapshot technology. The main idea is to cache the homepage's HTML together with its data and render it first at the next launch, which brings forward the moment the first screen appears. This suits traditional mini programs rendered in a WebView. HTML-based snapshots greatly shorten the white screen at startup, but the first screen still does not appear fast enough, and users still notice a white screen. In addition, the snapshot page cannot be clicked: users must wait for the JS side to be ready before they can interact.

In pursuit of the best possible experience, we propose a new mini program snapshot technology. The goal is to eliminate the white screen completely while also responding to user interaction.

Core idea

Different from the existing HTML-based snapshot technology, we propose a native image-level snapshot technology, which mainly consists of the following three steps:

Step 1: At a suitable time after the mini program starts, save its homepage as an image, which we call the snapshot
Step 2: The next time the mini program is opened, display the last saved snapshot first, then start the mini program
Step 3: At a suitable time after the mini program has started, hide the snapshot, show the real homepage, and save the current view as the next snapshot (same as step 1)
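
The three steps above can be sketched as a small cycle. This is only an illustration under stated assumptions: createSnapshotCycle, the store (anything with get/set, such as a Map), and the view object with showOverlay, hideOverlay, and capture are hypothetical names, not part of the DingTalk client.

```javascript
// Hypothetical sketch of the three-step snapshot cycle; all names are illustrative.
function createSnapshotCycle(store, view) {
  return {
    // Step 2: on launch, show the last saved snapshot (if any) before the real page.
    // Returns true when a snapshot was available and shown.
    launch(appId) {
      const cached = store.get(appId);
      if (cached !== undefined) view.showOverlay(cached);
      return cached !== undefined;
    },
    // Steps 1 and 3: once the real first screen is ready, hide the overlay
    // and save the current view as the snapshot for the next launch.
    firstScreenReady(appId) {
      view.hideOverlay();
      store.set(appId, view.capture());
    },
  };
}
```

On the very first launch there is nothing to show, so launch returns false and the mini program starts as usual; the snapshot saved by firstScreenReady is what the second launch displays.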

Effect

The new DING schedule page in DingTalk now uses this snapshot technology. The before and after comparison is as follows:

[Figure: before vs. after]

With the snapshot technology, the first screen of the page opens instantly, the startup white screen disappears completely, and the first-screen rendering time drops from about 1700ms to under 300ms.

Below I will introduce in detail several key considerations of the snapshot technology.

Scenarios and timing

The ideal snapshot overlaps completely with the first screen and causes no visible change when it is hidden. The timing of generating the snapshot and the scenario in which it is used directly determine how much the snapshot technology can improve the experience.

What page is suitable for snapshots?

Not every mini program is suited to using snapshots to improve the first-screen experience. Used improperly, a snapshot can even detract from the experience. For best results, snapshots are generally suitable when the first screen meets the following conditions:

The first screen is relatively fixed. If it is not, it is hard to find a snapshot time that guarantees the snapshot will match the next first screen
The first screen contains no private user data. Private user data should not be captured in a snapshot

When is the snapshot taken?

If the snapshot is taken too early, it may capture a white screen or an incompletely rendered homepage.

If the snapshot is taken too late, the user may already have interacted with the first screen (scrolling, clicking, etc.), which easily produces a snapshot that does not match the first screen.

Therefore, the best snapshot timing has to be determined for each first-screen scenario. Generally, we consider taking the snapshot at the following times:

In the onReady lifecycle callback of the mini program's first screen. The page may still not be fully rendered at that point, so consider taking the snapshot after a suitable delay.
If the first-screen data has to be fetched remotely, after that data has been received
Never after the user has scrolled, clicked, or otherwise interacted
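
The timing rules above can be combined into a single check. The following is a minimal sketch, not the actual client logic: shouldTakeSnapshot, the fields of the state object, and the default 100ms render delay are all assumptions made for illustration.

```javascript
// Hypothetical timing check. `state` is an assumed shape:
//   readyAt           timestamp (ms) when onReady fired, or null if it has not fired
//   needsRemoteData   whether the first screen depends on remotely fetched data
//   remoteDataLoaded  whether that data has arrived
//   userInteracted    whether the user has scrolled or clicked
// renderDelayMs is an assumed grace period after onReady, not a documented value.
function shouldTakeSnapshot(state, now, renderDelayMs = 100) {
  if (state.userInteracted) return false;                 // never snapshot after interaction
  if (state.readyAt === null) return false;               // onReady has not fired yet
  if (now - state.readyAt < renderDelayMs) return false;  // give rendering time to finish
  if (state.needsRemoteData && !state.remoteDataLoaded) return false;
  return true;
}
```

A caller would evaluate this whenever one of the inputs changes (onReady fires, data arrives, the delay elapses) and capture the snapshot the first time it returns true.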

When to hide the snapshot?

We generally hide the currently displayed snapshot at the moment a new snapshot is due to be generated. The usual order is to hide the old snapshot first and generate the new one immediately afterwards, so that the transition between the snapshot and the real page is seamless.

You also need to handle the case where the mini program fails to start. Set an upper limit on how long the snapshot may be displayed. If the first screen still has not started when the limit is reached, hide the snapshot anyway, to avoid the embarrassing situation where the user sees a first screen that never responds.
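
A display limit of this kind can be sketched as a guard that hides the snapshot exactly once, on whichever comes first: the first screen becoming ready, or the limit expiring. createOverlayGuard and its parameters are hypothetical, and the limit value is left to the caller because the article does not state one.

```javascript
// Hypothetical helper: hides the snapshot overlay exactly once, either when
// the first screen is ready or when the display limit expires.
// setTimer/clearTimer default to the standard timers and can be replaced in tests.
function createOverlayGuard(hide, maxDisplayMs, setTimer = setTimeout, clearTimer = clearTimeout) {
  let hidden = false;
  let timer = null;

  function hideOnce(reason) {
    if (hidden) return;                     // ignore whichever trigger arrives second
    hidden = true;
    if (timer !== null) clearTimer(timer);  // cancel the pending limit, if any
    hide(reason);                           // reason is "ready" or "timeout"
  }

  timer = setTimer(() => hideOnce("timeout"), maxDisplayMs);
  return { firstScreenReady: () => hideOnce("ready") };
}
```

Passing the reason to hide lets the caller tell a normal handover apart from a failed start, for example to skip saving a new snapshot after a timeout.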

We also made a small visual optimization when hiding the snapshot. If the snapshot is hidden abruptly and differs even slightly from the real page, the user may perceive a flicker.

So when hiding the snapshot, we play a 200ms fade-out animation to soften the flicker caused by such differences. The snapshot is sometimes taken slightly before asynchronous events complete, such as the homepage network data or images finishing loading, so it may lack elements or show inaccurate data compared with the real page. The fade-out effectively dilutes the visual oddness these errors cause. The following demo compares the two cases:

[Demo: hidden directly vs. faded out]

Interactivity

Because the snapshot and the real first screen look essentially the same, users will believe the first screen is already displayed and expect it to be interactive. Simply displaying a static snapshot is therefore not enough. Interactivity is an important capability of our snapshot.

Our snapshot responds to user clicks. When the user clicks the snapshot, the click event is buffered; when the snapshot is hidden, the event is dispatched to the real page.

If the user clicks several times during this period, only the last click is handled.

The user may feel that the response to the click is somewhat slow, but cannot tell whether the click landed on the snapshot or on the real homepage.
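
The buffering behaviour described above can be sketched as follows. createClickBuffer and dispatchToPage are hypothetical names; the sketch only models the "keep the last click, replay it on hide" rule and says nothing about how the native client forwards touch events.

```javascript
// Hypothetical click buffer for the snapshot layer.
// dispatchToPage(event) is assumed to deliver an event to the real page.
function createClickBuffer(dispatchToPage) {
  let pending = null;        // the most recent click received while the snapshot was shown
  let overlayVisible = true;

  return {
    onClick(event) {
      if (overlayVisible) {
        pending = event;     // a later click overwrites an earlier one
      } else {
        dispatchToPage(event);
      }
    },
    onSnapshotHidden() {
      overlayVisible = false;
      if (pending !== null) {
        const event = pending;
        pending = null;
        dispatchToPage(event); // replay only the last buffered click
      }
    },
  };
}
```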

If the mini program starts slowly, you can also consider showing a loading indicator after the user clicks.

To make the snapshot layer more interactive still, developers can even be allowed to configure click areas and simple actions on the snapshot layer, so that clicks on it get an immediate response. The DingTalk workbench suits this well: the applications shown in the workbench rarely change, and they occupy clearly divided areas.

You can configure different click areas and corresponding actions (for example: jump to other pages/applications), such as:

[{
  area: {
    left: 100, 
    top: 100, 
    width: 100, 
    height: 100
  },
  action: {
    type: 'openLink', 
    params: { url: 'http://xxx' }
  }
}, ...]

This way, a click on a configured area of the snapshot jumps directly, without waiting for the mini program to finish starting.
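
Resolving a click against such a configuration is a simple hit test. The sketch below assumes the configuration shape shown above (area with left, top, width, height, plus an action); findAction is a hypothetical helper, and treating the right and bottom edges as exclusive is a choice made here, not something the article specifies.

```javascript
// Hypothetical hit test: returns the action of the first configured area
// that contains the point (x, y), or null if the click hits no area.
function findAction(regions, x, y) {
  for (const { area, action } of regions) {
    const insideX = x >= area.left && x < area.left + area.width;
    const insideY = y >= area.top && y < area.top + area.height;
    if (insideX && insideY) return action;
  }
  return null;
}
```

A click that returns null would fall back to the buffered-click behaviour and be delivered to the real page once the snapshot is hidden.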

Storage and security

Snapshots are sensitive data: they may only be stored locally on the client and must not be uploaded. Their storage must be managed carefully, otherwise it could easily lead to a public relations incident.

For the storage of snapshots, we considered the following points:

Encrypted storage

Snapshot data must be stored encrypted. The encryption used here is the one provided by the group's Wireless Bodyguard.

Privacy protection

A snapshot must not contain private user data. In other words, it should contain only UI elements or meaningless default data.

[Figure: snapshot without private user data vs. snapshot containing private user data]

So how do you get a homepage snapshot without private user data? One option is to take the snapshot before the front end obtains data from the network or the cache. Such a snapshot is necessarily incomplete, however, and gives up some of the experience. This is why we do not recommend snapshots for first screens that show user data.

Snapshot cleanup

Snapshots are stored on the client, so there must be a storage limit. Once the snapshot data reaches a certain amount, some of the older snapshots must be evicted.
In addition, consider clearing the corresponding snapshots when the mini program is updated to a new version, and when the user logs out or switches accounts.
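
The two cleanup rules can be sketched with an in-memory index. SnapshotStore, its byte budget, and the entry fields (userId, appVersion, bytes) are assumptions for illustration; a real client would also delete the encrypted files on disk. Eviction here is oldest-first by save order, which is one reasonable reading of "old snapshot data".

```javascript
// Hypothetical snapshot index with a byte budget and targeted cleanup.
class SnapshotStore {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.totalBytes = 0;
    this.entries = new Map(); // Map keeps insertion order, used here as age order
  }

  remove(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return;
    this.totalBytes -= entry.bytes;
    this.entries.delete(key);
  }

  // entry is assumed to look like { userId, appVersion, bytes }.
  put(key, entry) {
    this.remove(key); // re-saving moves the key to the newest position
    this.entries.set(key, entry);
    this.totalBytes += entry.bytes;
    // Evict the oldest snapshots until the budget is respected again.
    while (this.totalBytes > this.maxBytes && this.entries.size > 0) {
      this.remove(this.entries.keys().next().value);
    }
  }

  // Used for version updates, logout, and account switches.
  removeWhere(predicate) {
    for (const [key, entry] of [...this.entries]) {
      if (predicate(entry)) this.remove(key);
    }
  }
}
```

For example, removeWhere((e) => e.userId === previousUserId) would clear one user's snapshots on logout. Note that a single snapshot larger than the whole budget is evicted immediately, so it is effectively never stored.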

Accuracy

Once snapshots are live, we need to monitor the experience they deliver. The best case is that users do not notice the snapshot at all, meaning the snapshot and the real page overlap completely. If the two differ noticeably, the experience suffers badly, and that is what we need to detect.

Here we focus mainly on one metric, accuracy: the degree of similarity (overlap) between the snapshot and the real page. The higher the accuracy, the more natural the transition from snapshot to real page and the better the experience. When accuracy is low, the snapshot not only fails to improve the experience but may also confuse users.

How to judge the accuracy of a snapshot

Every time a snapshot is generated, we compare it with the previous snapshot to obtain a quantitative measure of snapshot accuracy. The next question is how to judge the similarity of two snapshots.

The first idea that comes to mind is a pixel-by-pixel comparison that computes the proportion of identical pixels in the two snapshots: the higher the proportion, the more accurate the snapshot. In practice, this does not reflect true similarity or what the user perceives. For example, two snapshots that are slightly offset from each other can get a very low similarity score, and so can two snapshots with only a small color difference. Moreover, a snapshot may contain millions of pixels, and in our tests a single pixel-by-pixel comparison could take several seconds.

We now use the "perceptual hash algorithm" used by Google for image search to quantify snapshot accuracy. Roughly, the algorithm shrinks each image, derives "fingerprint" information from it, and computes a "difference index" by comparing the fingerprints of the two images. The higher the difference index, the lower the similarity. The algorithm reflects how similar two snapshots are and is far more efficient than pixel-by-pixel comparison: online statistics show that the whole algorithm takes no more than 3ms.
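
The article does not say which perceptual hash variant is used, so the sketch below uses average hash, one of the simplest members of that family, purely as an illustration. It assumes the image has already been converted to a grayscale array of rows with values 0 to 255; shrink, averageHash, and differenceIndex are hypothetical names. With the default 8x8 fingerprint the difference index ranges from 0 to 64, which may not match the scale of the numbers reported in this article.

```javascript
// Downscale a grayscale image (array of rows, values 0-255) to size x size
// by averaging the source pixels that fall into each cell.
function shrink(gray, size) {
  const h = gray.length;
  const w = gray[0].length;
  const cells = [];
  for (let cy = 0; cy < size; cy++) {
    const y0 = Math.floor((cy * h) / size);
    const y1 = Math.max(y0 + 1, Math.floor(((cy + 1) * h) / size));
    for (let cx = 0; cx < size; cx++) {
      const x0 = Math.floor((cx * w) / size);
      const x1 = Math.max(x0 + 1, Math.floor(((cx + 1) * w) / size));
      let sum = 0;
      for (let y = y0; y < y1; y++) {
        for (let x = x0; x < x1; x++) sum += gray[y][x];
      }
      cells.push(sum / ((y1 - y0) * (x1 - x0)));
    }
  }
  return cells;
}

// Fingerprint: one bit per cell, set when the cell is brighter than the mean.
function averageHash(gray, size = 8) {
  const cells = shrink(gray, size);
  const mean = cells.reduce((a, b) => a + b, 0) / cells.length;
  return cells.map((v) => (v > mean ? 1 : 0));
}

// Difference index: number of fingerprint bits that differ (Hamming distance).
function differenceIndex(hashA, hashB) {
  let diff = 0;
  for (let i = 0; i < hashA.length; i++) {
    if (hashA[i] !== hashB[i]) diff++;
  }
  return diff;
}
```

Because each fingerprint bit summarizes a whole block of pixels, a change confined to a few pixels usually leaves the fingerprint untouched, while a layout change flips many bits. Comparing two 64-bit fingerprints is also far cheaper than comparing millions of pixels.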

We ran the algorithm on several scenarios. For minor text changes the difference index is very low; for clearly visible differences it is higher. The value obtained this way reflects how the snapshot actually feels to the user.

Scene	Difference index	Visual effect
Minor text changes	1	[image]
Overall offset	6	[image]

How to reconstruct the scene of an inaccurate snapshot

Being able to measure accuracy is not enough: for snapshots with poor accuracy, we also need to know how the snapshot differs from the real page, so that the snapshot timing can be improved.

To do this, we record the state of the real page by capturing the front-end page's DOM tree at the moment of the snapshot. Specifically, we obtain a desensitized DOM tree when the snapshot is generated, combine it with the CSS files of the mini program framework, and use a browser to reconstruct the interface as it was at the time of the snapshot.
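
The article does not describe how the DOM tree is desensitized. One possible approach, sketched here under that caveat, is to keep only the structure needed for layout (tag and class name) and mask every non-whitespace character of text. The node shape ({ type, tag, className, children } or { type: "text", text }) and the desensitize function are assumptions for illustration.

```javascript
// Hypothetical desensitization of a DOM-like tree.
// Keeps tag and className so the framework CSS can still be applied,
// drops every other attribute, and masks text while preserving its length.
function desensitize(node) {
  if (node.type === "text") {
    return { type: "text", text: node.text.replace(/\S/g, "x") };
  }
  return {
    type: "element",
    tag: node.tag,
    className: node.className || "",
    children: (node.children || []).map(desensitize),
  };
}
```

Preserving text length and whitespace keeps line wrapping roughly the same, so the reconstructed page should still resemble the layout the user saw.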

Other capabilities

Partial snapshot

A major limitation of snapshots is that they cannot adapt to first screens that change. Using snapshots there easily means that no snapshot ever matches the real homepage, which degrades the experience. We therefore consider offering the ability to snapshot only the parts of the homepage that stay essentially the same and to skip the parts that change, so that part of the first screen can still appear instantly every time.

For example, on the DingTalk homepage the upper part is relatively fixed, while the feed in the lower part may show different content each time it is opened. In this scenario there is no need to snapshot the whole first screen each time. A specific height can be designated for the snapshot, so that part of the homepage appears instantly.

Snapshots longer than one screen

When the homepage is scrollable, we can even take snapshots longer than one screen and make the snapshot scrollable when it is displayed at the next launch. This approach has to deal with two issues:

Snapshot size
Online statistics show that a one-screen snapshot file averages about 100 KB. A snapshot longer than one screen may reach several hundred KB. When generating the snapshot, an upper limit on its length or size should be enforced, to prevent problems such as OOM on low-end devices when the snapshot is used.
Snapshot scrolling
If the user scrolls while the snapshot is displayed, the current scroll offset must be recorded when the snapshot is hidden, and the real homepage scrolled to the same position so that the snapshot and the real page overlap.
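
The scroll handover can be sketched as follows. createLongSnapshot and scrollPageTo are hypothetical names; the sketch only tracks a vertical offset, clamps it to the snapshot's length, and passes it to the real page when the snapshot is hidden.

```javascript
// Hypothetical controller for a snapshot taller than the screen.
// scrollPageTo(offset) is assumed to scroll the real homepage to that offset.
function createLongSnapshot(snapshotHeight, screenHeight, scrollPageTo) {
  let offset = 0;
  const maxOffset = Math.max(0, snapshotHeight - screenHeight);

  return {
    // Apply a scroll delta while the snapshot is shown; returns the clamped offset.
    scrollBy(dy) {
      offset = Math.min(maxOffset, Math.max(0, offset + dy));
      return offset;
    },
    // On hide, hand the recorded offset to the real page so both views line up.
    hide() {
      scrollPageTo(offset);
    },
  };
}
```

Clamping to the snapshot's own height also means the user cannot scroll past the captured content, which is where the length limit chosen at generation time becomes visible.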

Performance

For the performance of snapshots, we conducted laboratory tests and online statistics.

In the laboratory test, we constructed an extreme scenario with a very large snapshot (5.2 MB) and compared it with a normal snapshot on a low-end device:

	Ordinary scenario	Extreme scenario
Snapshot size	262 KB	5.2 MB
Memory footprint	1840 KB	3245 KB
Loading visual experience	Appears immediately	Very short delay

Loading a snapshot does not affect normal page transitions, but an oversized snapshot may load with a short delay.

Online data shows that a page with a snapshot takes about 280ms to load, and that the average snapshot size is about 110 KB.

Snapshot generation and accuracy detection both run on background threads, at a time when the user has not yet started interacting, and no snapshot is taken once the user has scrolled or interacted. The impact on performance is therefore small.

One more interesting statistic: users click the snapshot 0.6 times on average, and the first click comes at about 1500ms. In other words, by the time the snapshot has been displayed for 1.5 seconds, more than half of users have started interacting. This shows how important it is to make snapshots interactive.

Outlook

Although snapshot technology originated as a fix for the startup performance of mini programs, its application can be extended to many more places.

In theory, any asynchronously rendered scene qualifies: a mini program rendered by WebView or Weex, an ordinary H5 web page, or even some native scenes that need a loading phase. As long as there is a view that can be displayed in the client, snapshot technology can remove the white screen or loading phase, so that the view appears instantly and is interactive. Because the snapshot is a purely native technology, its implementation does not depend on how the real page is rendered. What matters more is choosing suitable snapshot timing and application scenarios to get a good experience.

Summary

We propose a new mini program snapshot technology that makes the mini program homepage open instantly and respond to interaction. It completely eliminates the loading page or white screen while the mini program opens, brings the opening experience close to native, and responds to user clicks.

It is a purely native technology that does not depend on the mini program container or on front-end rendering. Any view can be snapshotted, any stored snapshot can be displayed immediately, and the approach can even be extended beyond mini programs.

Its limitations lie mainly in its dependence on the form of the first screen and on the snapshot timing. First screens that change frequently or contain private user data are not suitable for snapshots, and generating a high-quality snapshot places strict demands on timing. As for guaranteeing accuracy, the snapshot similarity comparison still has plenty of room for optimization. These points will need continued refinement.