[Header image]

Image credit: https://unsplash.com/photos/m_HRfLhgABo

Author of this article: Wu Liuyi

The article was first published on my blog https://github.com/mcuking/blog/issues/110

Background

It has been more than a year since "Build Your Own Online IDE", an article on privately deploying the CodeSandbox sandbox, was published. The original goal was to build front-end code and preview the result in real time on our block reuse platform. The source-code-based low-code platform project launched by Cloud Music last year also needs to build front-end applications online and in real time. Initially we developed a sandbox from scratch for it, but the self-developed sandbox has the following problems:

  • Less flexibility

    The npm dependencies of the application being built have to be bundled into the sandbox's own code in advance; dependency content cannot be fetched dynamically from a service during the build;

  • Poor compatibility

    The technology choices of the application being built are fairly limited; for example, less is not supported;

  • No isolation from the platform

    The low-code platform and the sandbox are not isolated by an iframe, so the global variables or styles of the page built in the sandbox can be polluted by the surrounding low-code platform.

Of course, these problems could be solved gradually by continuing to develop the self-developed sandbox, but that would require investing more manpower.

CodeSandbox, on the other hand, is the most mainstream and mature online build sandbox and has none of the problems listed above. Its implementation is fully open source, so there are no security concerns either. We therefore decided to replace the low-code platform's self-developed sandbox with a privately deployed CodeSandbox. The work involved falls into two areas:

  • Customized requirements of the low-code platform

    For example, in order to drag and drop components onto the page built in the sandbox, native event listeners need to be attached across the iframe to that page so that the exact drop position can be calculated.

  • Improving sandbox build speed

    Since the low-code platform builds applications online, it has two characteristics: first, a complete front-end application needs to be built rather than isolated code fragments; second, the application code is modified frequently and the result must be viewable in real time. Both place higher demands on build speed.

Improving sandbox build speed in particular involved some twists and turns: building a simple middle/back-office application that depends on antd initially took nearly 2 minutes, and step-by-step optimization brought it down to about 1 second, opening almost instantly and even beating the build speed of the sandbox on the CodeSandbox official site.

Supplement: the two platforms mentioned above are described in the following articles, which interested readers can look up themselves:
Low-code platform: Thoughts and practice on building the NetEase Cloud Music low-code system
Block reuse platform: Practice of a cross-project block reuse scheme

The rest of this article walks through the performance optimization of the CodeSandbox sandbox. Before getting started, we briefly introduce the sandbox build process to make the later sections easier to follow.

Sandbox build process

CodeSandbox is essentially a simplified version of Webpack that runs in the browser. The following architecture diagram of the whole sandbox consists of two main parts: the online Bundler and the Packager service.

[Sandbox architecture diagram]

The user only needs to import the packaged Sandbox component. The component internally creates an iframe tag to load the deployed sandbox page, and the js code in that page is the core of the sandbox: the online Bundler. The first step of a build is that the Sandbox component passes a compile command, containing the source code of the application to be built, to the online Bundler in the iframe via postMessage. After receiving the compile command, the online Bundler starts building the application, first fetching the application's npm dependencies from the npm packaging (Packager) service.
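To make the message flow concrete, here is a rough sketch of what the host page's side of this handshake could look like. The field names and selector are illustrative assumptions; the real Sandbox component encapsulates this protocol for you.

```js
// Hypothetical sketch of the first step: the host page posts a compile
// command to the sandbox iframe. Field names are illustrative only.
const iframe = document.querySelector('#sandbox-iframe');

iframe.contentWindow.postMessage(
  {
    type: 'compile',
    // Source code of the application to build, keyed by file path
    modules: {
      '/package.json': JSON.stringify({ dependencies: { react: '17.0.2' } }),
      '/index.js': "import React from 'react';",
    },
  },
  '*' // in production, restrict this to the sandbox origin
);
```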

The three stages of a sandbox build are described in detail below: the dependency preload stage, the compilation stage, and the execution stage.

Dependency preload phase (Npm Preload)

Why the dependency preload phase is needed

Since it is practically impossible to install a front-end application's node_modules in a browser environment, the module resources of the npm packages it depends on have to be fetched from a server during the compilation stage. The entry file fields of the npm package ( package#main etc.) and its meta information are used to calculate the exact path on the CDN of a given module inside the npm package, and then the module content is requested. For example:

Suppose a view module of the front-end application, demo.js, references the react dependency, as shown below:

 import React from 'react';
const Demo = () => (<div>Demo</div>);
export default Demo;

After compiling the demo.js module, the sandbox continues with the module's dependency react. First it fetches react's package.json and meta information from the CDN:

https://unpkg.com/react@17.0.2/package.json

https://unpkg.com/react@17.0.2/?meta

Then it calculates the exact path of the react package's entry file (this is essentially a file resolve process) and requests the module content from the CDN:

https://unpkg.com/react@17.0.2/index.js

It then continues compiling that module and its dependencies, recursing until every referenced module of the application has been compiled.
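A simplified sketch of this resolve-and-fetch step is shown below; the unpkg URL pattern matches the examples above, while the resolution logic itself is heavily simplified.

```js
// Simplified sketch of resolving an npm module and fetching it from the CDN.
async function fetchNpmModule(name, version) {
  // 1. Fetch package.json to find the entry file
  const pkg = await fetch(`https://unpkg.com/${name}@${version}/package.json`)
    .then(res => res.json());

  // 2. Prefer the ESM entry, then main, then index.js
  const entry = (pkg.module || pkg.main || 'index.js').replace(/^\.\//, '');

  // 3. Request the resolved entry module from the CDN
  return fetch(`https://unpkg.com/${name}@${version}/${entry}`)
    .then(res => res.text());
}
```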

As you can see, a browser-based sandbox has to keep fetching npm module content from the CDN throughout the compilation of the application, producing a large number of HTTP requests: the infamous HTTP request waterfall. And because browsers limit the number of concurrent HTTP requests to the same domain (for HTTP/1.x, Chrome's limit is 6), the whole compilation process ends up being very time-consuming.

How the dependency preload phase works

To solve this problem, there is a dependency preload stage: before compiling the application, the sandbox first requests the content of the npm packages the application depends on from the npm packaging service, which bundles the modules exported by each npm package and returns them as a single JSON module, known as a Manifest. For example, here is the link to (and a screenshot of) the react package's Manifest:

https://prod-packager-packages.codesandbox.io/v2/packages/react/17.0.2.json

Manifest

In this way, only one HTTP request is needed to get the content of each npm package.

During the dependency preload stage, the sandbox requests the Manifests of all the application's dependencies and merges them into a single Manifest. In the subsequent compilation phase, the sandbox then only needs to look up a specific module of an npm package in this merged Manifest. Of course, if a module is not found in the Manifest, the sandbox still requests it from the CDN so that compilation can proceed.
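A minimal sketch of this preload-and-merge step, with the Manifest shape simplified, might look like this:

```js
// Sketch of the preload step: fetch every dependency's Manifest, merge them
// into one lookup table, and fall back to the CDN only on a miss.
async function preloadDependencies(dependencies) {
  const manifests = await Promise.all(
    Object.entries(dependencies).map(([name, version]) =>
      fetch(`https://prod-packager-packages.codesandbox.io/v2/packages/${name}/${version}.json`)
        .then(res => res.json())
    )
  );
  // Merge all module contents into a single Manifest
  return { contents: Object.assign({}, ...manifests.map(m => m.contents)) };
}

function lookupModule(manifest, path) {
  // The compiler checks the merged Manifest first; if the module is missing
  // it still requests the CDN so compilation can continue.
  const entry = manifest.contents[path];
  return entry ? entry.content : undefined;
}
```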

The principle of the Packager service

The basic principles of the npm packaging service (also called the Packager service) mentioned above are as follows:

It first installs the specified npm package on disk via yarn, then parses the require statements in the AST of the npm package's entry file, recursively resolves the required modules, and finally packs all referenced modules into a Manifest file for output (the purpose is to strip redundant files in npm packages, such as documentation).
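The flow can be outlined roughly as follows; findRequires is a hypothetical stand-in for the AST parsing and path resolution, so this is only an illustration of the idea, not the service's actual code.

```js
// Hypothetical outline of the Packager flow: install, walk the requires,
// emit a Manifest.
const { execSync } = require('child_process');
const fs = require('fs');
const path = require('path');

function packageToManifest(name, version, workDir) {
  // 1. Install the package with yarn into a temporary working directory
  execSync(`yarn add ${name}@${version}`, { cwd: workDir });

  const pkgRoot = path.join(workDir, 'node_modules', name);
  const pkg = JSON.parse(fs.readFileSync(path.join(pkgRoot, 'package.json'), 'utf8'));
  const entry = pkg.module || pkg.main || 'index.js';

  // 2. Starting from the entry file, recursively collect every required module
  const contents = {};
  const walk = file => {
    const key = '/' + path.relative(workDir, file);
    if (contents[key]) return;
    const code = fs.readFileSync(file, 'utf8');
    contents[key] = { content: code };
    findRequires(code, file).forEach(walk); // stand-in: parse AST, resolve require paths
  };
  walk(path.resolve(pkgRoot, entry));

  // 3. Only referenced modules end up in the Manifest (docs etc. are dropped)
  return { contents };
}
```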

In short, the dependency preload stage exists to avoid generating a large number of requests during the compilation stage and dragging out compilation time. Part of its goal is the same as Vite's dependency pre-bundling.

Note: the necessity and mechanism of the dependency preload stage are introduced in detail here mainly to pave the way for the performance optimization part described later. Readers who find the optimization part unclear can come back and review this section.

Compilation phase (Transpilation)

Simply put, the compilation phase starts from the application's entry file, compiles the source code, parses the AST, finds the modules it depends on, and then compiles those recursively, eventually forming a dependency graph. References between modules follow the CommonJS specification.

Supplement: for how CommonJS is simulated, refer to the following article about Webpack; it is not expanded here for reasons of space: webpack series - modularization principle - CommonJS
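A minimal sketch of this compile-and-recurse loop is shown below; transpile, parseDependencies and resolveModule are stand-ins for Babel, AST walking and module resolution rather than real CodeSandbox APIs.

```js
// Sketch of the compilation phase: transpile a module, collect its
// dependencies, and recurse until the whole dependency graph is built.
const transpiledModules = new Map();

async function compileModule(filePath, source) {
  if (transpiledModules.has(filePath)) return;

  const code = transpile(source);          // e.g. Babel: ESM/JSX -> CommonJS
  const deps = parseDependencies(code);    // collect require() specifiers
  transpiledModules.set(filePath, { code, deps });

  // Recursively compile every dependency (from the Manifest or the CDN),
  // gradually building up the dependency graph
  await Promise.all(
    deps.map(async specifier => {
      const dep = await resolveModule(specifier, filePath); // -> { path, source }
      await compileModule(dep.path, dep.source);
    })
  );
}
```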

[Compilation phase diagram]

Execution stage (Evaluation)

Like the compilation phase, it starts from the entry file: the entry file is executed with eval, and whenever require is called during execution, the required module is recursively evaluated with eval as well.
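A minimal sketch of this mechanism, assuming the compiled CommonJS sources are already available in a map, could look like this:

```js
// Sketch of the execution phase: evaluate CommonJS modules with eval, giving
// each one a require() that recursively evaluates its dependencies.
// compiledModules maps a module path to its transpiled CommonJS source, and
// resolve() turns a require specifier into such a path.
function createEvaluator(compiledModules, resolve) {
  const cache = new Map();

  function evaluateModule(filePath) {
    const cached = cache.get(filePath);
    if (cached) return cached.exports;

    const module = { exports: {} };
    cache.set(filePath, module);

    const code = compiledModules.get(filePath);
    const require = specifier => evaluateModule(resolve(specifier, filePath));

    // Wrap the compiled code in a CommonJS-style factory and eval it
    const factory = eval(`(function (require, module, exports) { ${code} \n})`);
    factory(require, module, module.exports);
    return module.exports;
  }

  return evaluateModule;
}
```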

At this point, the sandbox build process has been explained.

Improve sandbox build speed

Now we come to the topic of this article: how to improve the sandbox's build speed. Throughout, the simple middle/back-office application mentioned at the beginning, which depends on antd, is used as the example to show how the build was optimized step by step from 2 minutes down to about 1s. There are four main aspects:

  • Cache Packager service packaging results
  • Reduce the number of requests for a single npm package module during the compilation phase
  • Enable Service-Worker + CacheStorage cache
  • Implement functions like Webpack Externals

Cache Packager service packaging results

Analyzing the sandbox's application build process, the first problem we found was that requesting the antd package's Manifest from the Packager service during the dependency preload stage took about 1 minute, and the request sometimes even timed out. From the earlier description of how the Packager service works, we can conclude that the main cause is the size of the antd package (including its dependencies): both downloading the antd package and recursively packing all modules referenced from its entry file are very time-consuming.

To address this, the Packager service's packaging results can be cached; when the sandbox requests the package again, the result is read directly from the cache and returned, skipping the download + packaging process. How exactly to cache is up to the reader's own circumstances. As for the first packaging still being slow, common npm packages can be requested from the Packager service ahead of time to trigger packaging, so that their Manifests can be fetched quickly while an application is being built.
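As an illustration, a cache layer in front of the packaging step might be sketched like this; packAndBundle is a hypothetical stand-in for the install-and-pack step, and the in-memory Map could be replaced with Redis, object storage or a CDN.

```js
// Hypothetical sketch of a cache in front of the Packager, keyed by "<name>@<version>".
const manifestCache = new Map();

async function getManifest(name, version) {
  const key = `${name}@${version}`;

  if (manifestCache.has(key)) {
    return manifestCache.get(key);          // cache hit: skip download + packaging
  }

  const manifest = await packAndBundle(name, version); // slow path: yarn install + pack
  manifestCache.set(key, manifest);
  return manifest;
}

// Warm the cache for common packages ahead of time so the first build that
// needs them does not pay the packaging cost.
['react@17.0.2', 'react-dom@17.0.2', 'antd@4.18.3'].forEach(spec => {
  const at = spec.lastIndexOf('@');
  getManifest(spec.slice(0, at), spec.slice(at + 1)).catch(() => {});
});
```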

After caching the Packager service's packaging results, the application's build time dropped from nearly 2 minutes to about 70s.

Reduce the number of requests for a single npm package module during the compilation phase

Continuing to analyze the sandbox's network requests during the compilation phase, we find a large number of requests for individual modules of the antd and @babel/runtime packages, as shown in the following figure:

[Request waterfall]

According to the explanation of the sandbox's principles above, the dependency preload phase is specifically designed to avoid large numbers of single-module npm requests during compilation, so why are there still so many? There are two reasons:

  • The Packager service and the sandbox determine an npm package's entry file differently
  • The npm package itself does not specify an entry file, or the entry file cannot reach all the modules used during compilation

The Packager service and the sandbox determine an npm package's entry file differently

Take the antd package as an example. The package itself and its rc-xxx dependencies define both main and module fields in package.json. Taking rc-slider as an example, here is the entry file definition part of that package's package.json (note that the entry file names have no suffix):

 {
  "main": "./lib/index",
  "module": "./es/index",
  "name": "rc-slider",
  "version": "10.0.0-alpha.4"
}

We already know that the Packager service starts from an npm package's entry file and recursively packs every referenced module into the returned Manifest. The module field has higher priority than the main field, so the Packager service starts packing with ./es/index.js as the entry file. However, before the packed Manifest is returned to the sandbox, the service also checks whether the entry file defined by the module field in package.json actually exists in the npm package; if it does not, the module field is removed from package.json.

Unfortunately, the logic that checks whether the entry file exists does not handle entry file names without a suffix, and rc-slider's module field happens to omit the suffix, so in the returned Manifest the module field has been removed from rc-slider's package.json.

Next, the sandbox on the browser side starts compiling the application. When it reaches rc-slider, the module field in rc-slider's package.json has already been removed, so compilation starts from the ./lib/index.js entry specified by the main field. But the Manifest only contains the modules under es, so the modules under lib have to be requested from the CDN one by one during compilation, producing a large number of HTTP requests that block compilation.

[Request waterfall]

For the Packager service's failure to handle suffix-less entry file names, the author has submitted a PR to the CodeSandbox repository to fix it (click to view).

Next, let's look at another example: the relevant entry file part of the ramda package's package.json:

 {
  "exports": {
    ".": {
      "require": "./src/index.js",
      "import": "./es/index.js",
      "default": "./src/index.js"
    },
    "./es/": "./es/",
    "./src/": "./src/",
    "./dist/": "./dist/"
  },
  "main": "./src/index.js",
  "module": "./es/index.js",
  "name": "ramda",
  "version": "0.28.0"
}

The Packager service uses the ./es/index.js specified by the module field as the packaging entry, but during the compilation phase the sandbox ends up choosing the ./src/index.js specified by exports['.'].default as the entry to start compiling from, which again produces a large number of single-module requests.

The essence of the problem is that the strategy for determining an npm package's entry file is not exactly the same when the Packager service packs the package as when the sandbox builds the application. To solve it, the entry-file strategy on the two sides must be aligned.

The logic for determining the entry file on the sandbox side is in packages/sandpack-core/src/resolver/utils/pkg-json.ts.

The related logic on the Packager service side is in functions/packager/packages/find-package-infos.ts, functions/packager/packages/resolve-required-files.ts, and functions/packager/utils/resolver.ts.

Readers can decide whether to adopt the Packager service's strategy or the sandbox's strategy for determining the npm entry file as the unified standard; in short, the two sides must be consistent.
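As a sketch of what such a shared helper might look like (the exact priority order below is an assumption for illustration; the point is only that both sides apply the same rule):

```js
// Sketch of a single entry-resolution helper shared by the Packager service
// and the sandbox so they always agree on the entry file.
function resolveEntry(pkg) {
  const dot = pkg.exports && pkg.exports['.'];
  const fromExports =
    typeof dot === 'string' ? dot : dot && (dot.module || dot.import || dot.default);

  let entry = fromExports || pkg.module || pkg.main || './index.js';

  // Tolerate suffix-less entries such as "./es/index"
  if (!/\.(js|mjs|cjs|json)$/.test(entry)) {
    entry += '.js';
  }
  return entry;
}
```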

The npm package itself does not specify an entry file, or the entry file cannot reach all the modules used during compilation

First, analyze the @babel/runtime package. Looking at its package.json, you can see that it does not define an entry file. This package is generally used by referencing specific modules inside it directly, for example var _classCallCheck = require("@babel/runtime/helpers/classCallCheck");. According to the Packager service's packaging principle, the modules of this package that are used during compilation therefore cannot be packed into the Manifest, which again leads to a large number of single-module requests during the compilation phase.

For this, the author only applies special handling to special cases: when packaging an npm package that defines no entry file, or whose entry file cannot reach all the modules used during compilation, the specified directories or modules are manually packed into the Manifest during packaging. For example, for the @babel/runtime package, all files under its root directory are manually packed into the Manifest. There is no better solution yet; if readers have one, please leave a comment.

Of course, for internal npm packages you can also add a custom field such as sandpackEntries to package.json, i.e. specify multiple entry files, so that the Packager service can pack as many of the modules used during compilation into the Manifest as possible. For example, a component for the low-code platform may be split into a normal mode and a design mode; the design mode makes it easier to drag the component and configure its parameters on the low-code platform, and a Designer.js is defined alongside index.js as the component's entry file in design mode. In this case multiple entry files can be specified (the multi-entry concept exists only for the Packager service). The related change is the resolveRequiredFiles function in functions/packager/packages/resolve-required-files.ts, as shown in the following figure:

define multi entries
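The figure above shows the actual change; the underlying idea can be sketched roughly as follows, using the custom sandpackEntries field (our own convention, not an npm standard):

```js
// Rough sketch: collect the normal entry plus any extra "sandpackEntries"
// so the Packager walks them all when building the Manifest.
function collectEntryFiles(pkg) {
  const entries = new Set();
  entries.add(pkg.module || pkg.main || './index.js');

  // e.g. "sandpackEntries": ["./index.js", "./Designer.js"]
  (pkg.sandpackEntries || []).forEach(entry => entries.add(entry));

  return [...entries];
}
```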

By reducing the number of single-module npm requests in the compilation phase, the application's build time dropped from about 70s to about 35s.

Enable Service-Worker + CacheStorage cache

While analyzing the large number of single-module npm requests, the author also built exactly the same application in the sandbox on the CodeSandbox official site and did not hit this problem. It turned out that the official site simply caches resources that have already been requested; that is, when using CodeSandbox for the first time, or when building an application in the browser's incognito mode, you will still see a large number of HTTP requests.

So how does the official site cache? Requests made during the application build are intercepted by a Service Worker. If a resource should be cached, the Service Worker first checks whether it is already in CacheStorage; if not, it requests the remote service and stores a copy of the response in CacheStorage; if a cached copy is found, it is read directly from CacheStorage and returned, reducing request time.
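Written out as a plain Service Worker fetch handler, the strategy looks roughly like this (the real script is generated rather than hand-written, and the hostnames are simply the ones mentioned in this article):

```js
// Sketch of a cache-first fetch handler for build-related resources.
const CACHE_NAME = 'sandbox-dependency-cache-v1';

self.addEventListener('fetch', event => {
  const url = event.request.url;
  // Only cache build-related resources: Manifests and single npm modules
  const shouldCache =
    url.includes('prod-packager-packages.codesandbox.io') || url.includes('unpkg.com');
  if (!shouldCache) return;

  event.respondWith(
    caches.open(CACHE_NAME).then(async cache => {
      const hit = await cache.match(event.request);
      if (hit) return hit;                          // serve from CacheStorage

      const response = await fetch(event.request);  // fall back to the network
      cache.put(event.request, response.clone());   // cache a copy for next time
      return response;
    })
  );
});
```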

As shown in the figure below, the content cached by CodeSandbox mainly includes:

  1. Static resource modules for sandboxed pages
  2. Manifest of npm packages requested from the Packager service
  3. npm package single module content requested from CDN

cacheStorage

However, CodeSandbox disables this caching feature in the sandbox version provided to the outside world, so we need to re-enable it. The relevant code is in packages/app/src/sandbox/index.ts, as shown in the following figure:

cacheStorage

In addition, the caching feature is implemented with the SWPrecacheWebpackPlugin plugin: when the CodeSandbox sandbox code is packaged, the plugin is enabled with a specific cache policy configuration, and a service-worker.js script is then generated automatically in the built sandbox. The cache feature is enabled by registering and executing this script when the sandbox runs. What we need to do here is change the addresses in the cache policy to the corresponding addresses of our privately deployed sandbox. The relevant module is packages/app/config/webpack.prod.js:

cacheStorage

Supplement: the SWPrecacheWebpackPlugin plugin mainly saves you from writing the Service Worker script by hand; developers only need to provide the caching policy. For more details, see: https://www.npmjs.com/package/sw-precache-webpack-plugin
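A sketch of what the webpack.prod.js change might look like, with placeholder hostnames standing in for the privately deployed Packager service and CDN:

```js
// Sketch of the webpack.prod.js cache policy, assuming sw-precache-webpack-plugin.
// The hostnames below are placeholders for your own deployment.
const SWPrecacheWebpackPlugin = require('sw-precache-webpack-plugin');

module.exports = {
  // ...the rest of the sandbox webpack configuration
  plugins: [
    new SWPrecacheWebpackPlugin({
      cacheId: 'code-sandbox',
      filename: 'service-worker.js',
      runtimeCaching: [
        {
          // Manifests returned by the private Packager service
          urlPattern: /^https:\/\/packager\.example\.com\//,
          handler: 'cacheFirst',
        },
        {
          // Single npm modules served from the private CDN
          urlPattern: /^https:\/\/npm-cdn\.example\.com\//,
          handler: 'cacheFirst',
        },
      ],
    }),
  ],
};
```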

After enabling the cache on the browser side, the application's build time stabilizes at around 12s.

Implement functions like Webpack Externals

The optimizations in the three areas above are basically about the network: adding caches or reducing the number of requests. Can the compilation and execution of the code itself be optimized further? Let's analyze that next.

While debugging the sandbox's compilation process with the browser devtools, the author found a problem: even if the application uses only one component from the antd package, for example:

 import React from 'react';
import { Button } from 'antd';
const Btn = () => (<Button>Click Me</Button>);
export default Btn;

the sandbox still compiles the modules associated with every component in the antd package, which ultimately makes compilation take a long time. Investigation showed that the main reason is that all components are referenced from antd's entry file. Here is part of the es-mode entry file antd/es/index.js:

 export { default as Affix } from './affix';
export { default as Anchor } from './anchor';
export { default as AutoComplete } from './auto-complete';
...

From the explanation of the compilation and execution phases above, we know that the sandbox recursively compiles and executes all modules referenced from the antd entry file.

Because the sandbox also uses babel to compile js files, the author initially thought of integrating the babel-plugin-import plugin into js compilation. This plugin implements on-demand imports of components (click to view more details of the plugin). The following compilation effect makes it more intuitive:

 import { Button } from 'antd';
      ↓ ↓ ↓ ↓ ↓ ↓
var _button = require('antd/lib/button');
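For reference, the transformation above corresponds to a standard babel-plugin-import configuration like the following (not CodeSandbox-specific; style handling omitted):

```js
// babel.config.js — enable on-demand imports for antd
module.exports = {
  plugins: [
    ['import', { libraryName: 'antd', libraryDirectory: 'lib' }],
  ],
};
```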

After integrating the plugin, the sandbox's build speed did improve, but it still slows down as the number of components used by the application grows. So is there a better way to reduce, or even eliminate, the modules that need to be compiled? Yes: implement a feature similar to Webpack Externals. The whole idea works as follows:

1. Skip compilation of the antd package in the compilation phase to reduce compile time.

2. Before the execution phase starts, load and execute the UMD build of antd globally via a script tag, so that everything exported by the antd package is mounted onto the window object. Then, while executing the compiled code, whenever a component from the antd package is required, it is fetched from the window object and returned. The execution phase also gets shorter, since it is no longer necessary to execute all the modules associated with the antd package.

Note: this involves the concepts of Webpack Externals and the UMD module format, which are not covered here for reasons of space.

With the idea, let's start to transform the CodeSandbox source code:

The first change is in the compilation phase. When a module is compiled, its dependencies are added and compilation continues from them. When adding a dependency, we check whether it is an external npm package; if so, we return early, preventing that dependency from being compiled any further.

The specific code is in packages/sandpack-core/src/transpiled-module/transpiled-module.ts , and the changes are shown in the following figure:

[Externals: compilation phase]
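The figure above shows the actual diff; as a rough illustration of the check only (the function name is hypothetical):

```js
// Illustrative sketch of the compile-stage check: when adding a dependency,
// bail out early if it is declared external, so none of its modules get compiled.
function shouldAddDependency(externals, specifier) {
  // e.g. specifier === 'antd' or 'antd/es/button'
  const packageName = specifier.startsWith('@')
    ? specifier.split('/').slice(0, 2).join('/')
    : specifier.split('/')[0];

  // externals comes from sandbox.config.json, e.g. { antd: 'antd', react: 'React' }
  return !externals[packageName];
}
```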

Next is the change to the execution phase. CodeSandbox ultimately compiles all modules into CommonJS modules and then simulates a CommonJS environment to execute them (as mentioned in the build process section above). So it is enough to check, inside the simulated require function, whether the requested module belongs to an external npm package; if so, it is fetched from the window object and returned directly.

The specific code is in packages/sandpack-core/src/transpiled-module/transpiled-module.ts , and the changes are shown in the following figure:

[Externals: execution phase]
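Again, the figure shows the actual diff; the idea of the modified require can be sketched like this:

```js
// Illustrative sketch: the simulated require() first checks whether the
// specifier is an external package and, if so, returns the global that the
// UMD bundle attached to window.
function createRequire(externals, evaluateModule) {
  // externals: { antd: 'antd', react: 'React', 'react-dom': 'ReactDOM' }
  return function require(specifier) {
    const globalName = externals[specifier];
    if (globalName) {
      // The UMD build was loaded via a <script> tag before evaluation started
      return window[globalName];
    }
    return evaluateModule(specifier); // normal sandbox module evaluation
  };
}
```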

In addition, before the sandbox starts executing the compiled code, script tags must be created dynamically to load and execute the UMD builds such as antd. Fortunately, CodeSandbox already provides the ability to load external js/css resources dynamically, so no extra development is needed: just pass the links of the required js/css resources to the sandbox via the externalResources parameter.
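For illustration, dynamically injecting such resources boils down to something like the following sketch; CodeSandbox already ships this capability, so no such code needs to be written:

```js
// Sketch: load external js/css resources before evaluation starts.
function loadExternalResources(urls) {
  return Promise.all(
    urls.map(
      url =>
        new Promise((resolve, reject) => {
          const isCss = url.endsWith('.css');
          const el = document.createElement(isCss ? 'link' : 'script');
          if (isCss) {
            el.rel = 'stylesheet';
            el.href = url;
          } else {
            el.src = url; // UMD bundle attaches its export to window when executed
          }
          el.onload = resolve;
          el.onerror = reject;
          document.head.appendChild(el);
        })
    )
  );
}
```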

Finally, configure the relevant parameters in the sandbox.config.json file, as shown below:

 {
  "externals": {
    "react": "React",
    "react-dom": "ReactDOM",
    "antd": "antd"
  },
  "externalResources": [
    "https://unpkg.com/react@17.0.2/umd/react.development.js",
    "https://unpkg.com/react-dom@17.0.2/umd/react-dom.development.js",
    "https://unpkg.com/antd@4.18.3/dist/antd.min.js",
    "https://unpkg.fn.netease.com/antd@4.18.3/dist/antd.css"
  ]
}
Supplement: the content of sandbox.config.json is read during the sandbox build; the file is placed in the root directory of the application being built. Click to view configuration details.

Finally, after optimization in the four areas above, the sandbox can build the entire application in about 1s; the effect is shown below:

[Sandbox build result]

Future plans

So is this sandbox build performance optimization close to perfect?

The answer is of course no. As the application being built grows, the number of modules that must be compiled and executed grows too. The CodeSandbox sandbox recursively compiles every referenced module starting from the application's entry file, and then recursively executes every referenced module starting from that same entry file; this pattern inevitably makes the overall build time grow.

Is there a better way? Vite, which has become very popular recently, offers an idea: while the application code is executing, other modules are referenced through ES Modules, the browser issues a request for each module, and the server intercepts the request, compiles the matched module and returns it. This approach does not require compiling all of the application's modules in advance; on-demand, dynamic compilation greatly shortens build time, and the more complex the application, the more obvious the speed advantage.

The author is experimenting with adapting Vite so that it can run in the browser. The results will be summarized in the next article in this sandbox series, "Building a Browser-Based Vite Sandbox", and the prototype's implementation code will be synced to https://github.com/mcuking/vitesandbox-client. Stay tuned!

Concluding remarks

Implementing a sandbox environment that can run code (front-end, Node services and other application code) in the browser on the client side has several advantages over running code in server-side containers: it consumes no server resources, has low operating cost, and starts quickly, so it can create considerable value in many scenarios. The browser-side sandbox is also one of the few truly rich front-end applications: the main functionality of the whole sandbox runs in the browser, which poses a bigger challenge for front-end development.

The picture below shows some of the author's explorations in the sandbox field over the past two years. Interested readers are welcome to get in touch: https://github.com/mcuking/blog

[Sandbox roadmap]


This article is published from the NetEase Cloud Music technical team, and any form of reprinting of the article is prohibited without authorization. We recruit various technical positions all year round. If you are ready to change jobs and happen to like cloud music, then join us at grp.music-fe(at)corp.netease.com!
