FlutterWeb performance optimization exploration and practice

The Meituan food delivery merchant team has been exploring FlutterWeb-based technology for a long time. It has now achieved multi-terminal reuse across App, PC, and H5 in multiple businesses, effectively improving overall R&D efficiency. Performance has been the biggest challenge in this process. Drawing on actual business scenarios, this article introduces the team's exploration and practice of FlutterWeb performance optimization, in the hope that it helps or inspires readers.

1. Background

1.1 About FlutterWeb

Back in 2018, Google first released a Beta version of FlutterWeb, expressing its vision of one codebase running on multiple terminals. After more than two years of work by countless engineers, Flutter 2.0 was officially released at the beginning of this year (March 2021). It merged FlutterWeb into the Stable channel, which signals Google's strengthened commitment to multi-terminal reuse.

Figure 1 FlutterWeb history

Of course, Google's "ambition" is not unfounded: it rests on Flutter's powerful cross-terminal capabilities. Let's look at how these capabilities are reflected on the Web side:

Figure 2 Flutter cross-terminal capabilities

The figure above shows the architecture diagrams of FlutterNative and FlutterWeb. The comparison shows that the application-layer Framework is shared, which means that in FlutterWeb we can also directly use components such as Widgets and Gestures to make logic cross-terminal. For cross-terminal rendering, FlutterWeb provides two modes to align with the rendering capabilities of the Engine layer: Canvaskit Render and HTML Render. The following table compares the two:

Figure 3 Mode comparison

Canvaskit Render mode : The bottom layer is based on the WebAssembly build of Skia, and the upper layer renders with WebGL, so it better guarantees consistency and scrolling performance. Poor compatibility, however, is a problem we have to face (WebAssembly is only supported from Chrome 57 onward). In addition, Skia's WebAssembly file is 2.5M, and the Skia self-drawing engine needs font library support, which means relying on very large Chinese font files. This has a considerable impact on page loading performance, so using Canvaskit Render directly on the Web is not recommended at present (the official recommendation is also to use Canvaskit Render mode for desktop applications).

HTML Render mode : It uses HTML + Canvas to align with the rendering capabilities of the Engine layer, so compatibility is excellent. In addition, MTFlutterWeb has explored and practiced scrolling performance optimization and can currently handle most business scenarios. As for loading performance, the initial package in this mode is 1.2M, about half the size of the Canvaskit Render output, and we can intervene in the compilation process to control the output, so there is a lot of room for optimization.

For these reasons, the Meituan takeaway technical team chose to explore and optimize the performance of FlutterWeb pages in HTML Render mode.

1.2 Business status

Meituan's takeaway merchant side provides services such as order management, product maintenance, customer reviews, and the takeaway classroom on both App and PC, and the business functions of the two are largely aligned. In addition, on the PC we provide a dedicated multi-store management function for chain merchants. To meet platform operation requirements, some businesses also have H5 scenarios that are shared externally. One example is Meituan Takeaway Merchant Class, a content platform that uses articles and videos to help merchants learn takeaway operation knowledge, understand industry developments, and keep up with business strategies. Because this content is meant to be spread widely, we provide the ability to share it outside the site.

Figure 4 Business forms

To achieve multi-terminal (App, PC, H5) reuse and improve R&D efficiency, we started building MTFlutterWeb. So far, more than 9 businesses have improved their efficiency based on MTFlutterWeb. In the App, we provide high-performance services based on FlutterNative; in PC and mobile browsers, we use FlutterWeb to adapt at low cost and improve overall R&D efficiency.

However, loading performance is the biggest obstacle to promoting MTFlutterWeb. Take the Meituan takeaway merchant classroom business as an example again. At the beginning of the project, the TP90 of the page's full load time was about 6s, well short of our baseline (a full load time TP90 of no more than 3s, determined mainly from the business scenarios and user profiles of Meituan takeaway merchants), so there was a lot of room to improve the user experience. Optimizing FlutterWeb page loading performance was therefore a problem we urgently needed to solve.

2. Challenge

Breaking through the loading performance bottleneck of FlutterWeb pages is a huge challenge, however. The difficulty lies mainly in FlutterWeb's lack of optimization strategies for static resources and in its complex architecture and compilation process. The following figure shows how Flutter business code is converted into Web platform output. Let's analyze it in detail:

Figure 5 FlutterWeb compilation process

Underlying SDKs such as Framework and Flutter_Web_SDK (Flutter_Web_SDK is based on HTML and Canvas and carries the concrete implementation of the HTML Render mode) can be imported directly by business code, helping us develop cross-terminal applications quickly;
flutter_tools is the compilation entry point for each platform (Android, iOS, Web). It receives the flutter build web command and its parameters, starts the compilation process, and waits for the result callback. In that callback, we can post-process the compiled output;
frontend_server converts Dart into an AST and generates the kernel intermediate file app.dill (the compilation process of every platform generates such an intermediate file), which is handed to each platform's compiler for translation;
Dart2JS Compiler is the Dart-SDK module responsible for translating to JS. It reads and parses the intermediate file app.dill, injects JS utility methods for Math, List, Map, and so on, and finally produces a JS file that can run on the Web platform;
The compilation output consists mainly of static resources such as main.dart.js, index.html, and images. For these static resources, FlutterWeb lacks the optimizations found in conventional Web projects, such as file hashing, file splitting, and CDN support.

It can be seen that optimizing the FlutterWeb compilation output requires intervening in many of FlutterWeb's compilation modules. To improve overall compilation efficiency, most of these modules are precompiled into snapshot files (a Dart compilation output that the Dart VM can run directly, improving execution efficiency), for example flutter_tools.snapshot, frontend_server.snapshot, and dart2js.snapshot, which makes it harder to intervene in the FlutterWeb compilation process.

3. Overall design

As mentioned earlier, Flutter's architecture and compilation process are fairly complex in order to make both logic and rendering cross-platform. However, because the platform-specific implementations (Android, iOS, Web) are decoupled from each other, our approach is to locate and optimize the Web platform implementation in each module (Dart-SDK, Framework, Flutter_Web_SDK, flutter_tools). The overall design is shown in the following figure:

Figure 6 Overall design

SDK slimming : We slimmed down the Dart-SDK, Framework, and Flutter_Web_SDK that FlutterWeb depends on, and integrated the streamlined SDKs into the CI/CD (continuous integration and deployment) system, laying the foundation for a smaller output package;
Compilation optimization : We intervened in the compilation process in flutter_tools and added JS file splitting, static resource hashing, uploading resource files to the CDN, and other optimizations, so that these basic performance techniques from conventional Web applications can be applied to FlutterWeb. We also strengthened resource optimization for scenarios specific to FlutterWeb, such as icon font trimming, Runtime Manifest isolation, and separate packaging for Mobile and PC;
Load optimization : After optimizing static resources at compile time, we support resource preloading and on-demand loading at runtime on the front end. Choosing reasonable loading timing reduces the initial code size and speeds up first-screen rendering.

Below, we will give a detailed description of each optimization.

4. Design and practice

4.1 Streamline SDK

4.1.1 Package volume analysis

To do a good job, one must first sharpen one's tools. Before we start trimming, we need a package size analysis tool similar to webpack-bundle-analyzer, so that we can visually compare the size share of each module and guide the optimization.

Dart2JS officially provides the --dump-info option for analyzing the JS output, but it falls short: it does not break down the size share of each module well. We recommend source-map-explorer instead. It works by mapping the output back through the sourcemap file, which clearly shows how much space each module occupies and guides the streamlining of the SDK. The following figure shows this breakdown for the FlutterWeb JS output (the screenshot only contains Framework and Flutter_Web_SDK):

Figure 7 Reverse-mapped size information

4.1.2 SDK tailoring

FlutterWeb mainly depends on the Dart-SDK, Framework, and Flutter_Web_SDK. These SDKs have a huge impact on package size and account for almost the entire initial package. When compiling in Release mode, the Dart compiler does use Tree-Shaking to remove imported but unused packages, classes, functions, and so on, which greatly reduces the package size. However, some code in these SDKs can still be optimized further.

Take the Flutter Framework as an example. Because it is shared by all platforms, it inevitably contains compatibility logic for each platform (usually in conditional form such as if-else or switch), and Tree-Shaking cannot remove this code. Consider the following code:

// FileName: flutter/lib/src/rendering/editable.dart
void _handleKeyEvent(RawKeyEvent keyEvent) {
  if (kIsWeb) {
    // On web platform, we should ignore the key.
    return;
  }
  // Other codes ...
}

The code above is taken from the RenderEditable class in Framework. When kIsWeb is true, the application is running on the Web platform. Because of how Tree-Shaking works, the compatibility logic for other platforms, that is, the part marked with the Other codes comment, cannot be removed. For the Web platform, however, this code is Dead Code (code that can never be executed) and can be optimized further.

Figure 8 Partial functional composition

The figure above shows part of the functional composition of the SDKs. It shows that the SDKs FlutterWeb depends on contain some low-frequency functions, such as Bluetooth, USB, WebRTC, and the gyroscope. We therefore made these long-tail functions customizable (they are disabled by default but can be enabled per business), so that any long-tail function that is not enabled is trimmed out.

Based on this analysis, our approach is to run a second pass of Dead Code removal and to trim these long-tail functions. Following this approach, we dug into the Dart-SDK, Framework, and Flutter_Web_SDK and reduced the JS Bundle size from 1.2M to 0.7M, laying a solid foundation for FlutterWeb page performance optimization.

Figure 9 Streamlining results

4.1.3 Integrating the SDK into CI/CD

To improve build efficiency, we packaged the environment that FlutterWeb depends on as a Docker image and integrated it into the CI/CD (continuous integration and deployment) system. Whenever the SDK was trimmed, however, we had to update the Docker image, which took a long time and was not flexible enough. We therefore package the Dart-SDK, Framework, and Flutter_Web_SDK by version and upload them to the cloud. Before compilation, the pipeline reads the CI/CD environment variable sdk_version (the SDK version number), pulls the SDK package of that version remotely, and replaces the corresponding modules in the current Docker environment. This makes SDK releases flexible. The flow is shown in the following figure:

Figure 10 CI/CD integration
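The version lookup that precedes compilation might be sketched as follows. Only the sdk_version variable name comes from the text; the base URL, the archive naming, the module identifiers, and the function name are assumptions made for illustration.

```dart
/// SDK modules named in the article (identifiers are illustrative).
const sdkModules = ['dart-sdk', 'framework', 'flutter_web_sdk'];

/// Builds the download URL of each SDK package for the version found in
/// [env]. The base URL and the `<module>.zip` naming are hypothetical.
Map<String, Uri> sdkPackageUrls(
  Map<String, String> env, {
  String baseUrl = 'https://example.com/flutter-web-sdk',
}) {
  final version = env['sdk_version'];
  if (version == null || version.isEmpty) {
    // Fail fast instead of silently building with a stale SDK.
    throw StateError('sdk_version is not set in the CI/CD environment');
  }
  return {
    for (final module in sdkModules)
      module: Uri.parse('$baseUrl/$version/$module.zip'),
  };
}
```

In a pipeline, the map passed in would typically be Platform.environment from dart:io, and each returned URL would be downloaded and unpacked over the matching module in the Docker environment.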

4.2 JS Sharding

After FlutterWeb is compiled, a single main.dart.js file is generated by default. It contains both the SDK code and the business logic, which causes the following problems:

Functions cannot be updated in time : To make the best use of the browser cache, our project enables strong caching of static resources. If main.dart.js is not named with a hash, new program code may not take effect in time;
CDN cannot be used : By default, FlutterWeb only loads resources relative to the current domain and cannot use a CDN domain other than the current one, so it cannot benefit from a CDN;
First-screen rendering performance is poor : Even after slimming the SDK, main.dart.js is still 0.7M or more. Loading and parsing such a single file takes too long, which inevitably delays first-screen rendering.

To support file hashing and CDN loading, we post-process the static resources in the flutter_tools compilation process: we traverse the static resource output, add a file Hash (the MD5 of the file content) to each file name, and update the resource references. At the same time, we customized the Dart-SDK and modified the loading logic of static resources such as main.dart.js and fonts to support loading from a CDN.
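The post-processing step can be illustrated with a minimal sketch. It is not the team's flutter_tools patch: the function names and the CDN base are hypothetical, and a 32-bit FNV-1a hash stands in for MD5 so that the example needs no external package.

```dart
/// 32-bit FNV-1a hash of [bytes] as 8 hex digits.
/// Stand-in for the MD5 content hash described in the text.
/// Intended for the Dart VM (64-bit ints), where the build tooling runs.
String contentHash(List<int> bytes) {
  var hash = 0x811c9dc5;
  for (final b in bytes) {
    hash ^= b & 0xff;
    hash = (hash * 0x01000193) & 0xffffffff;
  }
  return hash.toRadixString(16).padLeft(8, '0');
}

/// Inserts [hash] before the extension: main.dart.js -> main.dart.<hash>.js
String hashedName(String fileName, String hash) {
  final dot = fileName.lastIndexOf('.');
  if (dot <= 0) return '$fileName.$hash';
  return '${fileName.substring(0, dot)}.$hash${fileName.substring(dot)}';
}

/// Renames every asset by content hash and rewrites its references in
/// [html], optionally prefixing a CDN base. [renamed] receives the
/// old name -> new name mapping. Plain substring replacement is a
/// simplification: it assumes no asset name is contained in another.
String hashAssets(
  String html,
  Map<String, List<int>> assets,
  Map<String, String> renamed, {
  String cdnBase = '',
}) {
  var result = html;
  assets.forEach((name, bytes) {
    final newName = hashedName(name, contentHash(bytes));
    renamed[name] = newName;
    result = result.replaceAll(name, '$cdnBase$newName');
  });
  return result;
}
```

Because the name depends only on the file content, an unchanged file keeps its URL and stays in the strong cache, while any change produces a new URL that bypasses it.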

For a more detailed design, please refer to the article "The Practice of Flutter Web in Food Delivery". Below we focus on the optimization strategies related to splitting main.dart.js.

4.2.1 Lazy Loading

Dart officially provides the deferred as syntax to load libraries (and the Widgets in them) lazily, and dart2js packages lazily loaded code on demand during compilation. This splitting mechanism is called Lazy Loading. With it, we can use deferred imports for the routes (pages) in the routing table to separate the business code. The usage and effect are as follows:

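// Usage
import 'package:flutter/widgets.dart';
import 'pages/index/index.dart' deferred as IndexPageDefer;

final routes = <String, WidgetBuilder>{
  '/index': (context) => FutureBuilder(
    future: IndexPageDefer.loadLibrary(),
    // Build the deferred page only after its library has finished loading.
    builder: (context, snapshot) =>
        snapshot.connectionState == ConnectionState.done
            ? IndexPageDefer.Demo()
            : const SizedBox.shrink(),
  ),
  // ... ...
};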

Figure 11 Effect demonstration

After Lazy Loading is applied, the code of the business pages is split into multiple PartJS files (the xxx.part.js files in the figure). This seems to decouple the business code from the SDK, but in practice we found that every change to the business code still changed the compiled main.dart.js (its file Hash changed). After tracing the cause, we found that the part that changes is the loading logic and mapping of PartJS, which we call the Runtime Manifest. We therefore needed a solution that extracts the Runtime Manifest, so that changes to business code affect main.dart.js as little as possible.

4.2.2 Runtime Manifest extraction

With the business code extracted, main.dart.js now consists of the SDK and the Runtime Manifest:

Figure 12 Composition of main.dart.js

How can the Runtime Manifest be extracted? In conventional Web projects, the approach is to use bundlers such as Webpack and Rollup to extract basic dependencies such as the SDK, Utils, and third-party packages and give them a stable Hash value. The Runtime Manifest (the loading logic and mapping of the split files) is then injected into the HTML file, so that changes to business code do not affect the common package. Borrowing this idea, we analyzed in depth how FlutterWeb generates the Runtime Manifest and loads PartJS, and designed the following solution:

Figure 13 Runtime Manifest extraction

In the figure above, the Runtime Manifest is generated in the Dart2JS Compiler module. In that generation logic, we mark the Runtime Manifest code block; flutter_tools then extracts the marked block and writes it into the HTML file as a JS constant. In the PartJS loading process, we changed the manifest lookup to read that JS constant. With this split, a change to business code only changes the Runtime Manifest and no longer affects the common main.dart.js package.
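The mark-then-extract step could look roughly like the sketch below. The marker comments, the window.__RUNTIME_MANIFEST__ constant, and all function and class names are hypothetical; the real markers live in the team's customized dart2js and are not given in the text.

```dart
/// Hypothetical markers a patched dart2js would emit around the manifest.
const manifestStart = '/*__RUNTIME_MANIFEST_START__*/';
const manifestEnd = '/*__RUNTIME_MANIFEST_END__*/';

/// Hypothetical global that main.dart.js reads the manifest from.
const manifestGlobal = 'window.__RUNTIME_MANIFEST__';

class ExtractionResult {
  ExtractionResult(this.mainJs, this.html);
  final String mainJs; // main.dart.js without the manifest
  final String html; // HTML with the manifest injected as a JS constant
}

/// Moves the marked manifest expression out of [mainJs] into [html].
ExtractionResult extractRuntimeManifest(String mainJs, String html) {
  final start = mainJs.indexOf(manifestStart);
  if (start < 0) throw const FormatException('manifest start marker not found');
  final bodyStart = start + manifestStart.length;
  final end = mainJs.indexOf(manifestEnd, bodyStart);
  if (end < 0) throw const FormatException('manifest end marker not found');

  final manifest = mainJs.substring(bodyStart, end).trim();
  // The bundle now refers to the constant instead of embedding the data.
  final strippedJs =
      mainJs.replaceRange(start, end + manifestEnd.length, manifestGlobal);
  final script = '<script>$manifestGlobal = $manifest;</script>';
  final patchedHtml = html.contains('</head>')
      ? html.replaceFirst('</head>', '$script</head>')
      : '$script$html';
  return ExtractionResult(strippedJs, patchedHtml);
}
```

Two builds that differ only in their manifest then yield byte-identical main.dart.js content, so its content hash, and therefore its cached URL, stays stable.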

4.2.3 main.dart.js slicing

After Lazy Loading and Runtime Manifest extraction, main.dart.js is stable at about 0.7M. Loading such a large single file puts a heavy burden on the network, so we designed a slicing scheme that takes full advantage of the browser's ability to load multiple files in parallel and improves loading efficiency.

The implementation is as follows: during the flutter_tools compilation process, main.dart.js is split into multiple plain text files. The front end loads them in parallel through XHR, concatenates them in order into JavaScript code, and places the result in a <script> tag, which achieves parallel loading of the slices.

Figure 14 Parallel loading
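Both halves of the scheme, splitting at build time and ordered reassembly at load time, can be sketched in a few lines. The function names and the fetch callback (which stands in for the XHR request) are assumptions; the slice count is a free parameter that the text does not specify.

```dart
/// Splits [source] into at most [count] slices of near-equal length.
List<String> splitIntoSlices(String source, int count) {
  if (count < 1) throw ArgumentError.value(count, 'count', 'must be >= 1');
  final size = (source.length / count).ceil();
  final slices = <String>[];
  var start = 0;
  while (start < source.length) {
    var end = start + size;
    if (end >= source.length) {
      end = source.length;
    } else if (_isLowSurrogate(source.codeUnitAt(end))) {
      end++; // keep a UTF-16 surrogate pair inside one slice
    }
    slices.add(source.substring(start, end));
    start = end;
  }
  return slices;
}

bool _isLowSurrogate(int unit) => unit >= 0xDC00 && unit <= 0xDFFF;

/// Requests all slices concurrently and joins them in their original order.
/// Future.wait returns results in the order of the input futures, so the
/// order in which responses arrive does not matter.
Future<String> loadSlices(
  List<String> urls,
  Future<String> Function(String url) fetch,
) async {
  final parts = await Future.wait(urls.map(fetch));
  return parts.join();
}
```

The joined string is what would be placed in the script tag; keeping the slices as plain text prevents the browser from executing any slice on its own.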

4.3 Preloading scheme

As mentioned in the previous section, we have done a lot of work to keep the content of main.dart.js stable. However, because of how Flutter Tree-Shaking works, each project references different Framework Widgets, so the main.dart.js generated by each project is different. As more and more projects adopt FlutterWeb, users are increasingly likely to navigate between the pages of different businesses. Our expectation is that while business A is being visited, the main.dart.js referenced by business B can be cached in advance, so that when the user actually enters business B, the time to load its resources is saved. The detailed technical solution follows.

4.3.1 Technical scheme

We divide the overall technical solution into three stages: compiling, monitoring, and running.

In the compilation phase, the paths of qualifying resource files are filtered according to matching rules preconfigured on the release pipeline, and a cloud JSON file is generated and uploaded;
In the monitoring phase, after DOMContentLoaded, network resources, events, and DOM changes are monitored, and the results are analyzed and weighted according to specific rules to obtain an indicator that the first screen has finished loading;
In the running phase, once the first screen has loaded, the cloud JSON file delivered by the configuration platform is parsed, and the resources that match the configured rules are preloaded through HTTP XHR, which pre-caches the files.

The following figure shows the overall scheme design of pre-caching:

Figure 15 Pre-caching scheme design

Compilation phase

The compilation phase extends the existing release pipeline by adding a prefetch build job after flutter build. After the build, the job traverses and filters the output directory to collect the resources we need and generates the cloud JSON, which provides the data for the running phase. The following flowchart shows the detailed design of the compilation phase:

Figure 16 Pre-caching: compilation phase

The compilation phase is divided into three parts:

Part 1: Depending on the release environment, initialize the online or offline configuration platform in preparation for reading and writing configuration files;
Part 2: Download and parse the resource group JSON delivered by the configuration platform, filter out the resource paths that match the configured rules, update the JSON file, and publish it to the configuration platform;
Part 3: Through the API provided by the release pipeline, inject PROJECT_ID and the release environment into the HTML file as global variables for the runtime to read.

By integrating with the pipeline at compile time, we generate a new cloud JSON file and upload it to the cloud, providing the data that is delivered in the running phase.
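Part 2 of the compilation phase might be sketched as below. The JSON layout (one entry per project with an enabled flag and a resources array), the rule format, and the function name are assumptions; the text does not describe the actual schema of the cloud JSON.

```dart
import 'dart:convert';

/// Filters [buildOutput] by [rules] and stores the matching CDN URLs under
/// [projectId] in the JSON previously downloaded from the configuration
/// platform ([existingJson]). Returns the updated JSON to publish.
/// The `{projectId: {enabled, resources}}` layout is hypothetical.
String buildPrefetchJson({
  required String projectId,
  required Iterable<String> buildOutput,
  required List<RegExp> rules,
  required String cdnBase,
  String existingJson = '{}',
}) {
  final config = jsonDecode(existingJson) as Map<String, dynamic>;
  final resources = [
    for (final path in buildOutput)
      if (rules.any((rule) => rule.hasMatch(path))) '$cdnBase$path',
  ];
  // Entries of other projects are preserved; only this project is updated.
  config[projectId] = {'enabled': true, 'resources': resources};
  return jsonEncode(config);
}
```

A rule such as RegExp(r'\.js$') would, for instance, keep the hashed main.dart.js and PartJS files while skipping index.html and images.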

Monitoring phase

Browsers limit the number of concurrent file requests. To make sure the browser gives priority to rendering the current page while still completing pre-caching, we designed a loading strategy for the files to be cached that loads them without affecting the current page. The detailed design is as follows:

Figure 17 Pre-caching: monitoring phase

After the page fires DOMContentLoaded, we monitor changes in three areas.

The first is DOM changes. After Ajax requests on the page complete, the DOM changes as the data changes under the MV pattern. We use the browser's MutationObserver API to collect DOM changes, filter the effective nodes, traverse them depth-first, and calculate a recursive weight for each DOM node. When the weight falls below the threshold, we consider the first screen loaded.
The second is resource changes. We use the browser's PerformanceObserver API to filter resources of type img/script. When no new resources have been collected for 3 seconds, we consider the first screen loaded.
The third is user events. When the user triggers interactions such as click, wheel, or touchmove, we consider the page interactive, that is, the first screen loaded, and resources can be pre-cached afterwards.
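The resource and user-event signals can be combined into a small decision object, sketched below under stated assumptions: the class and method names are hypothetical, timestamps are passed in so the logic stays independent of any browser API, and the DOM weight signal is left out because its weighting formula is not given in the text.

```dart
/// Decides that the first screen is loaded once no new img/script resource
/// has been observed for [quietPeriod], or as soon as the user interacts.
class FirstScreenDetector {
  FirstScreenDetector(
    DateTime startedAt, {
    this.quietPeriod = const Duration(seconds: 3),
  }) : _lastResourceAt = startedAt;

  final Duration quietPeriod;
  DateTime _lastResourceAt;
  bool _userInteracted = false;

  /// Call for every img/script resource entry that is observed.
  void onResource(DateTime at) {
    if (at.isAfter(_lastResourceAt)) _lastResourceAt = at;
  }

  /// Call on click, wheel, touchmove, and similar events.
  void onUserInteraction() => _userInteracted = true;

  /// True when pre-caching may start without competing with the page.
  bool isFirstScreenLoaded(DateTime now) {
    if (_userInteracted) return true;
    return now.difference(_lastResourceAt) >= quietPeriod;
  }
}
```

In a browser, onResource would be driven by PerformanceObserver entries and onUserInteraction by the event listeners, with isFirstScreenLoaded polled on a timer.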

Through the steps above, we obtain the moment at which first-screen rendering completes, after which pre-caching can begin. The implementation of pre-caching is described next.

Running phase

The overall pre-caching process is: download the cloud JSON generated in the compilation phase, parse out the CDN paths of the resources to pre-cache, and request them through HTTP XHR, relying on the browser's own caching strategy to write the resource files of other businesses into the cache. When a user later visits a page whose resources are already cached, they have been loaded in advance, which effectively reduces first-screen loading time. The following figure shows the detailed design of the running phase:

Figure 18 Pre-caching: running phase

Once the monitoring phase reports that first-screen rendering has completed, we fetch the cloud JSON and first check whether caching is enabled for the project. If it is, the resource array is matched by the global variable PROJECT_ID, the resources are requested in advance through HTTP XHR, and the files are written into the browser cache. At this point, resource pre-caching is complete.
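A possible shape for this step is sketched below. The JSON layout (one entry per project with an enabled flag and a resources array) and the reading that the current project pre-caches the resources of the other projects are assumptions, as are the function names; the request callback stands in for the HTTP XHR call.

```dart
import 'dart:convert';

/// Returns the URLs to pre-cache while the page of [projectId] is open:
/// the resources of every other project, provided pre-caching is enabled
/// for [projectId]. The schema is hypothetical.
List<String> resourcesToPrefetch(String cloudJson, String projectId) {
  final config = jsonDecode(cloudJson) as Map<String, dynamic>;
  final own = config[projectId];
  if (own is! Map || own['enabled'] != true) return const [];

  final urls = <String>[];
  config.forEach((id, entry) {
    if (id == projectId || entry is! Map) return;
    final resources = entry['resources'];
    if (resources is List) urls.addAll(resources.whereType<String>());
  });
  return urls;
}

/// Requests each URL once so that the browser cache stores the response.
/// Requests run one at a time to leave connections free for the page.
/// Returns the number of successful requests.
Future<int> prefetchAll(
  List<String> urls,
  Future<void> Function(String url) request,
) async {
  var cached = 0;
  for (final url in urls) {
    try {
      await request(url);
      cached++;
    } catch (_) {
      // A failed prefetch must never affect the current page.
    }
  }
  return cached;
}
```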

4.3.2 Effect display and data comparison

When navigation between pages hits the pre-cache, the browser returns the data as 200 (Disk Cache), which saves a lot of resource loading time. The following figure shows resource loading after a cache hit:

Figure 19 Pre-caching effect

At present, pre-caching has been enabled on 10+ pages of Meituan's takeaway business. The average TP90 of resource loading time dropped from 400ms to 350ms, a reduction of 12.5%; the average TP50 dropped from 114ms to 100ms, a reduction of about 12%. As more projects adopt it, the effect of pre-caching will become more pronounced.

Figure 20 Pre-caching data

4.4 Packaging by platform

As mentioned earlier, most of Meituan's takeaway businesses are aligned between the two terminals (App and PC). To maximize efficiency, we strengthened FlutterWeb's multi-platform adaptability and achieved reuse of FlutterWeb on the PC side.

During PC adaptation, we inevitably need to write code compatible with both terminals, for example to reuse the card components on a list page. For this purpose we developed an adaptation tool, ResponsiveSystem: the implementations for PC and App are passed in separately, and it selects the right one for the current platform internally:

// ResponsiveSystem usage example
Container(
  child: ResponsiveSystem(
    app: AppWidget(),
    pc: PCWidget(),
  ),
)

The code above makes adapting between PC and App easy, but neither AppWidget nor PCWidget can be removed by Tree-Shaking during compilation, so the package size suffers. We therefore optimized the compilation process and designed a per-platform packaging solution:

Figure 21 Per-platform packaging

Modify flutter-cli to support the --responsiveSystem command-line parameter;
Add extra processing to the AST analysis stage in flutter_tools: match the ResponsiveSystem keyword and rewrite the AST node according to the target platform (PC or Mobile);
After removing the unused AST nodes, generate a code snapshot for each platform (each snapshot contains only that platform's code);
Compile two sets of JS output, PC and App, from the code snapshots and isolate their resources. Shared resources such as images and fonts go into the common directory.

In this way, we removed each platform's unused code and avoided the package size problem introduced by PC adaptation. Taking the Meituan takeaway merchant classroom business (6 pages) as an example again, after adopting per-platform packaging, the code size for a single platform dropped by about 100KB.

Figure 22 Effect

4.5 Simplified icon font

When a FlutterWeb page is visited, a 920KB icon font file, MaterialIcons-Regular.woff, is loaded even if the business code uses no Icon at all. We found that some system UI components in the Flutter Framework (such as CalendarDatePicker, PaginatedDataTable, and PopupMenuButton) use Icons, and that Flutter ships the full Icon font file for developers' convenience.

The --tree-shake-icons option that Flutter officially provides merges the Icons used by the business with a reduced font file (approximately 690KB) maintained internally by Flutter, which reduces the font file size to some extent. What we need, however, is to package only the Icons the business actually uses, so rather than relying on tree-shake-icons as shipped, we designed an on-demand packaging solution for Icons:

Figure 23 Icon font trimming

Scan all business code and the Plugins, Packages, and Flutter Framework it depends on, and collect every Icon that is used;
Compare the scanned Icons with material/icons.dart (the file that contains the unicode code points of the Flutter Icons) to obtain a reduced list of icon code points: iconStrList;
Use FontTools to generate the .woff font file from iconStrList. The resulting font file contains only the Icons that are actually used.
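The first two steps can be approximated with the sketch below. It is a simplification under stated assumptions: usages are found with a regular expression rather than by analyzing the code, the function names are hypothetical, and the code points in the sample table are placeholders rather than the real values from material/icons.dart.

```dart
/// Hypothetical excerpt of the name -> code point table in
/// material/icons.dart. The values are placeholders, not the real ones.
const sampleIconTable = <String, int>{
  'add': 0xe000,
  'arrow_back': 0xe001,
  'close': 0xe002,
};

/// Matches usages such as `Icons.arrow_back`.
final _iconUsage = RegExp(r'\bIcons\.([a-z_][a-z0-9_]*)');

/// Scans [sources] for `Icons.<name>` and returns the sorted, de-duplicated
/// code points of the names found in [table].
List<int> usedIconCodePoints(Iterable<String> sources, Map<String, int> table) {
  final used = <int>{};
  for (final source in sources) {
    for (final match in _iconUsage.allMatches(source)) {
      final codePoint = table[match.group(1)];
      if (codePoint != null) used.add(codePoint);
    }
  }
  return used.toList()..sort();
}

/// Formats code points as "U+E000,U+E002", a comma-separated form that
/// font subsetting tools commonly accept.
String toUnicodeList(List<int> codePoints) => codePoints
    .map((c) => 'U+${c.toRadixString(16).toUpperCase().padLeft(4, '0')}')
    .join(',');
```

The resulting list plays the role of iconStrList and would be handed to the font subsetting step that produces the trimmed .woff file.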

With this solution, we solved the package size problem caused by the oversized font file. Taking the Meituan takeaway merchant classroom business (5 Icons used in the business code) as an example, the font file shrank from 920KB to 11.6KB.

Figure 24 Effect

5. Summary and outlook

In summary, we explored and practiced FlutterWeb performance optimization based on the HTML Render mode. The work mainly covers streamlining the SDKs (Dart-SDK, Framework, Flutter_Web_SDK), optimizing the static resource output (for example JS splitting, file hashing, icon font trimming, and per-platform packaging), and optimizing front-end resource loading (preloading and on-demand requests). In the end, the JS output (non-business code) was reduced from 1.2M to 0.7M, and the TP90 of full page load time was reduced from 6s to 3s. This result meets most of the business requirements of Meituan takeaway merchants. Future plans focus on the following 3 directions:

Reduce the cost of Web-side adaptation : At present, 9+ businesses use MTFlutterWeb for multi-terminal reuse, but the efficiency of adapting to the Web side (especially the PC side) can still be improved. The goal is to reduce the adaptation cost to below 10% (currently about 20%);
Build a FlutterWeb disaster recovery system : Flutter dynamic packages have a certain probability of failing to load, and FlutterWeb as a fallback can raise the overall loading success rate of the business. In addition, FlutterWeb can provide "no installation, no update" capabilities, reducing the cost of maintaining old versions of FlutterNative;
Continue to advance performance optimization : The results so far have laid the foundation for promoting MTFlutterWeb, but there is still room for further optimization. For example, we currently only separate the business code and the Runtime Manifest, while the Framework and third-party packages also affect the browser cache hit rate to some extent. Separating this code as well can further improve page loading performance.


This article is produced by the Meituan technical team, and the copyright belongs to Meituan. You are welcome to reprint or use its content for non-commercial purposes such as sharing and communication; please indicate "the content is reproduced from the Meituan technical team". This article may not be reproduced or used commercially without permission. For any commercial use, please email tech@meituan.com to apply for authorization.