Interpretation of Cube Technology | The Past and Present of Cube Rendering Design

Author: He Jin (Xiao Jun)

This is the fourth article in the "Cube Technology Interpretation" series. You are welcome to review the previous articles:
"Cube Technology Interpretation | Cube Mini Program Technology Details"
"Interpretation of Cube Technology | Overview of Alipay's New Generation Dynamic Technology Architecture and Selection"
"Interpretation of Cube Technology | Detailed Explanation of Cube Card Technology Stack"

Alibaba relies heavily on front-end technology and has many front-end developers. Around 2016 to 2017, Weex was still at 1.0, React Native had not been open source for long, and Flutter had not yet been released. How to bring the front-end development model to both Android and iOS quickly was a hot topic, and Alipay incubated a dynamic cross-platform solution internally.

The first three articles introduced Cube's current architecture and the Cube card and Cube mini program product forms. This article focuses on Cube's rendering design and walks through the past and present of Cube card rendering technology.

Native rendering problems

We all know that a native view goes through several steps before it appears on screen. On Android these are create, measure, layout, and draw, and all of them must run on the main thread. In a native list, even when items are reused perfectly, rendering different data still requires the measure, layout, and draw steps. The deeper the view nesting, the more main-thread resources they consume. When the list is flung, the frame rate drops quickly and the page stutters. Solving this rendering-efficiency problem was therefore an important part of the Cube research.

The usual ways to improve the frame rate of list scrolling are to reduce view hierarchy depth and layout complexity, remove unnecessary background colors, eliminate overdraw, load images lazily, reuse items, and so on. None of them can bypass measure, layout, and draw. At that time, Weex and RN still mapped HTML tags to platform-layer views. In some scenarios developers could not optimize as they would in native development, and both frameworks were criticized for rendering performance. The rendering goals during the Cube research were therefore optimized rendering efficiency plus cross-platform support.

Cross-platform asynchronous rendering scheme

Asynchronous rendering

Given this background and these requirements, we wondered whether the key steps could be moved off the main thread, that is, rendered asynchronously. While the list scrolls, only the system gestures and the list's own scrolling algorithm and animation then need to occupy the main thread, which greatly improves the frame rate. The product of drawing the elements in a view is a pixel buffer (Cube's design uses a Bitmap), which is handed back to the main thread to refresh and display the view.
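The hand-off described above can be sketched in a few lines. This is a minimal, language-neutral illustration in Python, not Cube's implementation: `draw_item`, `ContainerView`, `render_async`, and the byte-per-pixel buffer are all hypothetical stand-ins for the drawing step, the platform view, and the Bitmap.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 4, 2  # tiny "bitmap" so the result is easy to inspect


def draw_item(text):
    """Worker-thread step: rasterize one list item into a pixel buffer."""
    pixels = bytearray(WIDTH * HEIGHT)  # stands in for a Bitmap
    data = text.encode("utf-8")[: len(pixels)]
    pixels[: len(data)] = data  # pretend each byte is one pixel
    return bytes(pixels)


class ContainerView:
    """Stand-in for the platform view; it only displays finished buffers."""

    def __init__(self):
        self.displayed = None

    def set_bitmap(self, pixels):
        self.displayed = pixels


def render_async(view, text, worker, main_queue):
    """Draw off the main thread, then post the result back to it."""
    future = worker.submit(draw_item, text)
    future.add_done_callback(
        lambda done: main_queue.put(lambda: view.set_bitmap(done.result()))
    )


def demo():
    main_queue = queue.Queue()  # tasks that must run on the "main thread"
    view = ContainerView()
    with ThreadPoolExecutor(max_workers=1) as worker:
        render_async(view, "Hi", worker, main_queue)
        # The main thread does no drawing; it only waits for a finished buffer.
        apply_bitmap = main_queue.get(timeout=5)
    apply_bitmap()  # executed by the caller, which plays the main thread
    return view.displayed
```

The only work left for the caller is the final `set_bitmap`, which mirrors the idea that the main thread just swaps in a finished pixel buffer.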

Cross-Platform Architecture

The other goal, cross-platform support, means being able to extend to other platforms quickly. Cube separates the platform-dependent parts into a platform layer.

platform

This layer defines standard C++ atomic interfaces common to all platforms, and each platform implements them in its own language. Initially only Android and iOS were implemented: Android calls Java methods through JNI, and iOS mixes C++ and Objective-C in the implementation files. To support another platform such as macOS in the future, only the interfaces defined by the platform layer need to be implemented, which meets the goal of extending to other platforms quickly.
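The shape of such a platform layer can be illustrated with an abstract interface and one implementation per platform. The real interfaces are C++ and are not listed in this article, so everything in this Python sketch (`Platform`, `text_width`, `label_size`, and the placeholder metrics) is a hypothetical example of the pattern rather than Cube's API.

```python
from abc import ABC, abstractmethod


class Platform(ABC):
    """Hypothetical atomic interface; the core only talks to this type."""

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def text_width(self, text: str, font_size: float) -> float: ...


class AndroidPlatform(Platform):
    # A real port would forward these calls to Java through JNI.
    def name(self):
        return "android"

    def text_width(self, text, font_size):
        return len(text) * font_size * 0.5  # placeholder metric


class IOSPlatform(Platform):
    # A real port would call Objective-C from a mixed C++/OC file.
    def name(self):
        return "ios"

    def text_width(self, text, font_size):
        return len(text) * font_size * 0.75  # placeholder metric


def label_size(platform, text, font_size):
    """Core-side logic, written once against the interface."""
    return (platform.text_width(text, font_size), font_size)
```

Under this arrangement, adding a platform means adding one more subclass; `label_size` and the rest of the core stay unchanged.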

core

The library is a base library written in C++ on top of the platform atomic interfaces. It provides file IO, UI controls, image download, message communication, and so on to the engine above it. Above the library sits the core implementation of Cube rendering. The rendering part includes the data model and the rendering logic. The component library refers to the system entity controls that Cube supports and to the entity components that developers can plug in.

The following figure is the first version of the Cube rendering architecture diagram.

[Figure: Cube rendering architecture diagram]

Asynchronous rendering technology selection

As mentioned earlier, the "product" of asynchronous rendering is a bitmap handed over to a "container" View. Why a bitmap, which looks very unfriendly to memory? What kind of View is the container, and is there anything special about it? This section covers the options studied during the Cube research and why the bitmap was chosen in the end.

Android platform technology selection

The road to the Android selection was bumpy. The first candidates for the container were TextureView and GLSurfaceView, which support an independent rendering thread. Both have obvious flaws: they cannot be used in common business list scenarios, only in specific ones.

SurfaceView, GLSurfaceView

SurfaceView has existed since Android 1.0. Its main feature is that it can be rendered from a child thread. The catch is that, although it inherits from View, it has its own independent Surface that is not part of the View hierarchy, and its display is not controlled by View properties. It therefore cannot be scaled or translated like a normal view, cannot be used as an item in a ListView/RecyclerView like a normal view, and scrolls out of sync with the surrounding views.

GLSurfaceView inherits from SurfaceView, comes with its own GLThread, and has the same problems as SurfaceView. In short, these two views are better suited to standalone scenes such as video or map rendering.

Some may ask: why not use a SurfaceView/GLSurfaceView for the entire page and implement even the list on the render thread? There are two problems:

1. If the list container is also implemented on the render thread, as Flutter does today, the list's gesture handling has to be implemented by hand: drag, fling, the various list scrolling animations, scrolling acceleration calculations, and so on, which is expensive. In addition, touch events are still captured by the platform layer and have to be handed over to the render thread for processing, so the cost of thread switching inevitably hurts the experience. Many rendering engines adapted from the Flutter engine face these problems today;

2. At that time, the Cube team's main goal was quick verification. Implementing a list was too costly and was not the main problem to solve.

TextureView

TextureView has been provided by Google since Android 4.0. It was introduced largely to make up for the poor integration of SurfaceView and GLSurfaceView with native Views. Given the animation problems between those two views and native views described in the previous section, TextureView seemed a better fit for our scenario: it supports an independent render thread and also integrates cleanly with native views.

However, the actual research showed that TextureView's rendering mechanism is not suitable for long lists. If every list item is a TextureView, each one has to be recycled when it leaves the screen and created again when it enters, otherwise memory becomes a problem. Recycling and creating a SurfaceTexture is asynchronous, which causes a black flash. It further turned out that there is an upper limit on the number and total capacity of TextureViews (the cumulative size of all the views), and that this limit varies widely between phones. Simply put, this technical route looks attractive but hides countless compatibility pitfalls.

Bitmap+Ordinary View

In the end, the seemingly imperfect bitmap solution was chosen. Most Android developers consider the memory that bitmaps consume unacceptable, but as Cube came to be used more and more widely, this choice gradually proved to be the most general solution available at the time.

Each layer corresponds to a system view. The content of each view is drawn asynchronously onto a bitmap on a sub-thread through the Canvas API. When the view comes on screen, the system onDraw draws the bitmap "product".

BitmapCache

Even with the Bitmap drawing scheme, excessive memory use has to be considered. Cube uses a BitmapCache here, mainly for list scenarios. It relies on the system's item-recycling callback to put the item's bitmap canvas into the cache. When an item is rendered on screen, a bitmap canvas is taken from the cache first. One of the same size is preferred. If none exists, one whose width and height are larger than the target is taken, and the view draws only the part of the bitmap it needs, so rendering stays correct.
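The lookup order described above (same size first, then a larger bitmap, otherwise allocate) can be sketched as follows. This is a hypothetical Python illustration: the class layout, the method names `recycle` and `obtain`, and the "pick the smallest bitmap that is large enough" tie-break are assumptions, since the article only says that a larger bitmap is taken. A production cache would also need a size limit, which is omitted here.

```python
class Bitmap:
    """Stand-in for a platform bitmap; only its dimensions matter here."""

    def __init__(self, width, height):
        self.width = width
        self.height = height


class BitmapCache:
    def __init__(self):
        self._free = []  # bitmaps returned by recycled list items

    def __len__(self):
        return len(self._free)

    def recycle(self, bitmap):
        """Called from the item-recycling callback."""
        self._free.append(bitmap)

    def obtain(self, width, height):
        """Return a bitmap at least width x height, reusing one if possible."""
        # 1. Prefer a bitmap of exactly the requested size.
        for index, bitmap in enumerate(self._free):
            if bitmap.width == width and bitmap.height == height:
                return self._free.pop(index)
        # 2. Otherwise reuse a larger one; the caller draws only the
        #    top-left width x height region. Choosing the smallest
        #    sufficient bitmap is an assumption of this sketch.
        larger = [b for b in self._free
                  if b.width >= width and b.height >= height]
        if larger:
            best = min(larger, key=lambda b: b.width * b.height)
            self._free.remove(best)
            return best
        # 3. Cache miss: allocate a new bitmap.
        return Bitmap(width, height)
```

For example, after recycling 100x50, 200x80, and 120x60 bitmaps, a request for 110x55 reuses the 120x60 bitmap instead of allocating a new one.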

iOS platform technology selection

The implementation principle on iOS is roughly the same as on Android. The difference is that the "product" drawn on the asynchronous thread is not rendered with CoreGraphics in the UIView's drawRect, because that approach is inefficient and makes the page stutter. Instead, the canvas is assigned to the UIView's layer and handed over to the system rendering layer.

The evolution of rendering technology

The overall scheme and the key technology choices of Cube asynchronous rendering are described above. Since the launch of DDA Planet in early 2019, Cube has been used more and more widely within Alipay, and the Cube team has kept exploring and optimizing against real business scenarios. Along the way, the rendering pipeline has been restructured twice. It is important to stress that this evolution happened under strict memory and performance constraints and with compromises for Android compatibility. Some designs that look less elegant or advanced were in fact forced choices, such as using Bitmap as the pixel buffer or the way third-party components are integrated. In a sense, discussing technical pros and cons without the constraints is not very meaningful. We borrowed some ideas from Flutter, but in the end Cube moved forward along the technical route that suits its own scenarios.

Common terms

LayoutTree: the original tree structure that DomApi builds through add, update, and remove operations and that yoga lays out. It describes the parent-child relationships of nodes and contains layout information;
RenderTree: the tree structure that describes the parent-child relationships of the drawing nodes and contains drawing information. An example of how it differs from the LayoutTree: if a layout node's visibility is gone, that node does not appear in the RenderTree;
Layer: in general, the root node and its child nodes are drawn on the same canvas, which is defined as a layer and corresponds to one view of the platform layer. When a child node has animation properties or extends beyond the bounds of its parent, it needs a separate layer;
LayerTree: the tree structure built from the layer nodes described above. Each layer corresponds to one platform-layer view, which we call a ContainerView;
Entity node: a node that requires an independent layer;
Virtual node: any node other than an entity node. Virtual nodes are drawn on the canvas of their parent container.
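The relationship between these terms can be made concrete with a small sketch. It is a hypothetical Python model, not Cube's data structures: the `Node` fields, `build_render_tree`, `split_into_layers`, and the sample card are invented for illustration. It shows a hidden node being dropped from the RenderTree and virtual nodes being grouped under the nearest entity node (layer).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    visible: bool = True
    needs_own_layer: bool = False  # e.g. animated or overflowing its parent
    children: list = field(default_factory=list)


def build_render_tree(node):
    """Copy the layout tree, leaving out nodes that are not rendered."""
    if not node.visible:
        return None
    kept = [build_render_tree(child) for child in node.children]
    kept = [child for child in kept if child is not None]
    return Node(node.name, True, node.needs_own_layer, kept)


def split_into_layers(root):
    """Map each layer (entity node) to the virtual nodes drawn on it."""
    layers = {}

    def visit(node, current_layer):
        if node is root or node.needs_own_layer:
            current_layer = node.name  # entity node: starts a new canvas
            layers[current_layer] = []
        else:
            layers[current_layer].append(node.name)  # virtual node
        for child in node.children:
            visit(child, current_layer)

    visit(root, None)
    return layers


def example_card():
    """A made-up card: one hidden node and one animated banner."""
    return Node("card", children=[
        Node("title"),
        Node("debug_overlay", visible=False),
        Node("banner", needs_own_layer=True,
             children=[Node("banner_text")]),
    ])
```

For the sample card, the render tree keeps `title` and `banner` but not `debug_overlay`, and the layers come out as `card` (drawing `title`) and `banner` (drawing `banner_text`).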

evolution

Research period - 1.0 verifies the feasibility of the scheme

During the research period the goal was to verify feasibility, and the scenario was relatively simple. The friends' feed page in Alipay was used as the verification scenario, with each post (an item/cell) as one rendering unit. Elements such as the time, the map, and the "Like", "Reward", and "Comment" text are drawn on the layer corresponding to the root node. The small icons next to the "Like", "Reward", and "Comment" text are external entity components, which are added to the rootLayer's View through addSubView.

data model

As shown in the figure below, the RenderTree is built from the layoutTree, and non-rendering nodes do not appear in it. The layerTree has only one self-drawn layer (the rootLayer) plus the custom components X. In the end, everything except the custom components is drawn on the rootLayer.

rendering process

The bridge thread builds the layoutTree through DomApi. When the main thread triggers rendering, it builds the RenderTree from the layoutTree. When it encounters an external entity component during construction, it creates an instance and adds it through addSubView. It then switches to a sub-thread to draw the RenderTree, that is, all virtual nodes on the rootLayer. When drawing is done, it switches back to the main thread to apply the texture (the bitmap "product").

shortcoming

A multi-layer structure is not supported
Entity views are not reused: however many items/cells the friends' feed has, that many "Like", "Reward", and "Comment" entity components are created

Still, this research verified the feasibility of asynchronous rendering: the frame rate improved significantly when the list was scrolled.

Productization period - 2.0 supports multiple layers

With feasibility verified, the product design had to support a multi-layer structure. That is, in a real card, one or several nodes are set as layers, and each of these nodes and its child nodes are drawn on a separate canvas and rendered by a separate layer.

data model

The improvement is that the layerTree now contains multiple layer nodes, and the virtual child nodes under a layer node are drawn on that layer's bitmap "product".

rendering process

While the bridge thread builds the layoutTree, each instruction (addNode, removeNode...) is forwarded to the render module on the main thread. The renderer builds the RenderTree from the instruction and uses the instruction information to generate a task that joins a queue. When the VSync signal arrives, the tasks are dequeued and deduplicated, the layerTree is built, and the layers are distributed to different draw threads for drawing. When drawing is complete, the flow switches back to the main thread to apply the texture (the bitmap "product").
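The queue-then-deduplicate step can be sketched as below. The article does not say what key is used for deduplication, so this Python sketch assumes tasks are collapsed per layer, so that each dirty layer is redrawn once per frame. `RenderTaskQueue`, `enqueue`, and `on_vsync` are hypothetical names.

```python
from collections import deque


class RenderTaskQueue:
    """Collects tasks between frames and collapses them at VSync."""

    def __init__(self):
        self._tasks = deque()

    def enqueue(self, instruction, layer_id):
        """Record that an instruction touched the given layer."""
        self._tasks.append((instruction, layer_id))

    def on_vsync(self):
        """Drain the queue and return each dirty layer once, in first-seen order."""
        dirty_layers = []
        seen = set()
        while self._tasks:
            _instruction, layer_id = self._tasks.popleft()
            if layer_id not in seen:
                seen.add(layer_id)
                dirty_layers.append(layer_id)
        return dirty_layers
```

With this sketch, three instructions on `root` and one on `banner` between two VSync signals produce two draw jobs rather than four, and the next VSync finds an empty queue.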

shortcoming

The main thread carries a large amount of computation, which may cause stutter
A render node holds not only drawing information but also acts as the drawing object and carries logic, for example ignoring and not displaying a node with display: "none", so its responsibilities are unclear

Optimization period - 3.0 learns from each other's strengths

As can be seen above, both the renderTree and the layerTree are built on the UI thread. With many nodes and complex conditions, the UI stutters. To pursue the best possible scrolling frame rate, main-thread computation should be reduced as much as possible. In the optimized 3.0 version, building layers from render objects and calculating the drawing area affected by node changes were moved to a sub-thread, which gives the version running online today.

data model

A PaintTree structure is added and mounted on the Layer node. Its style and attribute values are copied from the RenderTree, but it contains no logic: it is purely a drawing object. Each drawing task draws only the paint nodes on the paintTree, so it has no concurrency conflicts with the layerTree or the renderTree.
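The reason a copied tree avoids concurrency problems can be shown with an immutable snapshot. This is a hypothetical Python sketch: `RenderNode`, `PaintNode`, and `snapshot` are invented names, and freezing the copy is one possible way to express "no logic, only drawing data", not necessarily how Cube does it.

```python
from dataclasses import dataclass, field


@dataclass
class RenderNode:
    """Mutable node owned by the thread that maintains the render tree."""
    name: str
    style: dict
    children: list = field(default_factory=list)


@dataclass(frozen=True)
class PaintNode:
    """Immutable copy handed to a paint thread; data only, no logic."""
    name: str
    style: tuple     # sorted (key, value) pairs copied from the RenderNode
    children: tuple  # PaintNode copies of the children


def snapshot(render_node):
    """Deep-copy a render subtree into paint nodes."""
    return PaintNode(
        name=render_node.name,
        style=tuple(sorted(render_node.style.items())),
        children=tuple(snapshot(child) for child in render_node.children),
    )
```

Once the snapshot has been taken, later changes to the render tree (a new style value, an added child) do not affect the paint nodes, so a drawing task can read them without locking.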

rendering process

The layout thread builds the layoutTree, and the flow then switches to the render thread to build the renderTree. When the platform layer triggers rendering, the render thread builds the layerTree, calculates the affected area, and so on. The flow then switches to the main thread to add the entity View corresponding to each layer to the container View and generates a drawing task, which is executed on the paint thread. When drawing is complete, the flow switches back to the main thread to apply the texture (the bitmap "product").
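The order of thread hand-offs in this flow can be traced with one single-thread executor per role. This Python sketch only records which thread runs which step; the stage bodies are empty placeholders, the executor per role is an assumption, and waiting on each result in turn is a simplification that a real pipeline would replace with posted callbacks.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_frame(dom_ops):
    """Pass one frame through hypothetical layout/render/main/paint threads."""
    trace = []  # (stage, thread name) in execution order

    def stage(label, value):
        trace.append((label, threading.current_thread().name))
        return value  # placeholder: a real stage would transform the value

    def on(prefix):
        return ThreadPoolExecutor(max_workers=1, thread_name_prefix=prefix)

    with on("layout") as layout, on("render") as render, \
            on("main") as main, on("paint") as paint:
        tree = layout.submit(stage, "build layoutTree", list(dom_ops)).result()
        tree = render.submit(stage, "build renderTree", tree).result()
        layers = render.submit(stage, "build layerTree", tree).result()
        main.submit(stage, "attach layer views", layers).result()
        bitmap = paint.submit(stage, "draw paintTree", layers).result()
        main.submit(stage, "apply bitmap", bitmap).result()
    return trace
```

In the resulting trace the "main" executor appears only for attaching views and applying the bitmap, which is the point of the 3.0 redesign: tree building and drawing run elsewhere.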

shortcoming

The flash rate increases when the render thread is busy

The above is the evolution of Cube rendering from its birth to the current online solution. Currently, more than 20 businesses in the Alipay app are integrated in card form, more than 500 card templates are running online, and they serve more than 10 billion PVs. The solution has withstood the test of all business parties.

However, some problems were also found while supporting these businesses. For example, when there are too many rendering tasks, the render thread's queue backs up, and tasks that are not consumed in time raise the probability of a white screen. The Cube team is continuing to study optimizations.

existing problems

Consistency problem at both ends

Cube currently draws through the Canvas API provided by the system platform layer (CoreGraphics on iOS). As a result, the details of drawing points, lines, and surfaces must be aligned manually on the two platforms, otherwise the results differ. New features, such as dot-dash line support, require each platform to implement the DrawDottedLine interface separately. To address this, the Cube team is investigating self-drawing, that is, using the skia API to sink the drawing interface into C++ and achieve cross-platform self-drawing;
Text is another area prone to differences. The platform-layer API is used to lay out text, and the layout API is called again when drawing, so the results may differ between platforms. For the Cube mini program, the Cube team has already sunk the text layout algorithm into the C++ layer, so it does not depend on platform APIs and is consistent across the two platforms. Because of memory and performance constraints, this has not yet been applied to Cube cards.

flash problem

Because scrolling uses asynchronous rendering, a flash is unavoidable when a card is already on screen on the main thread but its asynchronous drawing has not finished. Thread switching has a cost, so in theory this flash always exists, and the only question is how long it lasts. The Cube team is committed to improving rendering efficiency, minimizing the loss caused by thread switching, and improving the list scrolling experience.

future plan

For the known problems, the Cube team is committed to continuous optimization. The main optimization points include but are not limited to the following:

Rendering snapshots, to improve cold-start rendering efficiency and reduce flash time;
Rendering strategies such as pre-rendering, adaptive synchronous and asynchronous drawing, thread model optimization, and component caching and preloading, to reduce the flash rate and improve rendering efficiency;
Optimizing the yoga layout engine for Cube cards, to improve layout efficiency;
Skia self-drawing, to achieve consistency across the two platforms.

Cube rendering technology is applied in two product forms, cards and mini programs, across diverse scenarios including inside the Alipay app, outside it, and IoT. The team will continue to work on rendering performance, user experience, and the tool chain, and strives to polish the product, serve developers well, and grow into a competitive cross-platform dynamic rendering solution.

