
Preface

  • This article is a repost
  • Translated title: The State of Web Workers in 2021
  • Translator: Tapir
  • Translation source: Knowing
  • Original article: The State Of Web Workers In 2021
Introduction: The Web is single-threaded, which makes it increasingly difficult to write smooth and responsive applications. Web Workers have a bad reputation, but for Web developers they are a very important tool for solving this problem. Let's take a look at Web Workers.

We always compare the Web with so-called "Native" platforms (such as Android and iOS). The Web is streamed: when you open an application for the first time, none of its resources are available locally. This is a fundamental difference, and it prevents many of the architectures available on Native from being easily applied to the Web.

However, no matter which platform you focus on, you have almost certainly used or at least learned about multithreading. iOS lets developers parallelize code easily with Grand Central Dispatch, Android achieves the same with its new unified task scheduler WorkManager, and the game engine Unity uses job systems. The platforms listed above not only support multithreading, they also make multithreaded programming as easy as possible.

In this article, I will outline why I think multi-threading is important in the Web field, and then introduce the multi-threading primitives that we can use as developers. In addition, I will also talk about some topics related to architecture to help you achieve multi-threaded programming more easily (even incrementally).

Unpredictable performance issues

Our goal is to keep the application smooth and responsive. Smooth means a stable and sufficiently high frame rate. Responsive means that the UI reacts to user interactions with minimal latency. Both are key factors in making an application feel polished and high quality.

According to the RAIL model, responsive means reacting to user input within 100ms, and smooth means holding a stable 60 fps whenever anything on the screen is moving. As developers, we therefore have 1000ms/60 = 16.6ms to generate each frame, which is also called the "frame budget".

I said "we", but it is actually the browser that has 16.6ms to complete all the work behind rendering a frame. We developers are only directly responsible for part of that work. The browser's work includes (but is not limited to):

  • Detecting the element the user is interacting with
  • Dispatching the corresponding event
  • Running the associated JavaScript event handlers
  • Calculating styles
  • Doing layout
  • Painting layers
  • Compositing these layers into the final image the user sees on screen
  • (And more...)

What a huge workload.

On the other hand, the "performance gap" is widening. Flagship phones get faster with every product generation, while low-end models get cheaper and cheaper, which brings the mobile Internet to people who previously could not afford a phone. In terms of performance, these low-end phones are roughly equivalent to a 2012 iPhone.

Applications built for the Web run on a wide range of devices with widely varying performance. How long a piece of JavaScript takes to finish depends on how fast the device can run the code. This applies not only to JavaScript: the other tasks the browser performs (such as layout and paint) are constrained by device performance as well. A task that takes only 0.5ms on a modern iPhone may take 10ms on a Nokia 2. The performance of the user's device is completely unpredictable.

Note: RAIL has been a guiding framework for 6 years. Keep in mind that 60fps is really just a placeholder value for the native refresh rate of the user's display. For example, the new Pixel phones have a 90Hz screen and the iPad Pro has a 120Hz screen, which reduces the frame budget to 11.1ms and 8.3ms, respectively.

To complicate matters further, there is no better way to determine the refresh rate of the device running your app than measuring the time between requestAnimationFrame() callbacks.
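As a rough sketch of that measurement, the snippet below estimates the refresh rate from a list of requestAnimationFrame() timestamps and derives the frame budget from it. The function names are made up for this illustration, and using the median interval is a design choice of this sketch, not something the article prescribes.

```javascript
// Estimate the display refresh rate (in Hz) from requestAnimationFrame
// timestamps given in milliseconds. Expects at least two timestamps.
// The median interval is used so that a few dropped frames do not skew
// the estimate.
function estimateRefreshRate(timestamps) {
  const deltas = [];
  for (let i = 1; i < timestamps.length; i++) {
    deltas.push(timestamps[i] - timestamps[i - 1]);
  }
  deltas.sort((a, b) => a - b);
  const median = deltas[Math.floor(deltas.length / 2)];
  return 1000 / median;
}

// Frame budget in milliseconds for a given refresh rate in Hz.
function frameBudget(refreshRateHz) {
  return 1000 / refreshRateHz;
}

// Browser-only helper (hypothetical name): collects `frameCount` timestamps
// with requestAnimationFrame and resolves with the estimated refresh rate.
function measureRefreshRate(frameCount = 60) {
  return new Promise((resolve) => {
    const timestamps = [];
    function onFrame(now) {
      timestamps.push(now);
      if (timestamps.length < frameCount) {
        requestAnimationFrame(onFrame);
      } else {
        resolve(estimateRefreshRate(timestamps));
      }
    }
    requestAnimationFrame(onFrame);
  });
}
```

In a page, `measureRefreshRate().then((hz) => console.log(frameBudget(hz)))` would log the approximate budget per frame. The estimate is only meaningful while the page is visible and not already dropping most of its frames.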

JavaScript

JavaScript is designed to run in lockstep with the browser's main rendering loop, and almost all web applications follow this pattern. The disadvantage of this design is that slow JavaScript code blocks the rendering loop. Running in lockstep means that if one of the two is not finished, the other cannot continue. To allow long-running tasks to coexist with this design, JavaScript adopted an asynchronous model.

To keep the application smooth, you need to make sure that the time taken by your JavaScript code plus the other tasks the browser has to do (style, layout, paint...) does not exceed the frame budget of the device. To keep the application responsive, you need to make sure that no event handler takes more than 100ms, so that changes show up on the screen in time. Achieving this on your own device during development is already very hard, and achieving it on every device is almost impossible.

The usual advice is to "chunk your code", an approach also known as "yielding to the browser". The basic principle is the same: to give the browser a chance to ship the next frame, you split your code into chunks of similar size and hand control back to the browser between chunks so that it can render.

There are many ways to yield to the browser, but none of them is particularly elegant. The recently proposed task scheduling API aims to expose this capability directly. However, even if we could write await yieldToBrowser() (or something similar) to hand over control, the technique itself remains flawed: to stay within the frame budget, you need to do your work in sufficiently small chunks, and your code must yield at least once per frame.

Code that yields too often adds scheduling overhead, which hurts the overall performance of the application. Combined with the unpredictable device performance mentioned earlier, we can conclude that no single chunk size suits all devices. Chunking UI work is especially problematic, because rendering the complete UI step by step, yielding to the browser in between, increases the total cost of layout and paint.
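To make the chunking idea concrete, here is a minimal sketch. It assumes that yielding can be approximated with a zero-delay timer; `chunks`, `yieldToBrowser` and `processInChunks` are illustrative names, not an existing API.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function* chunks(items, size) {
  if (size < 1) throw new RangeError("chunk size must be at least 1");
  for (let start = 0; start < items.length; start += size) {
    yield items.slice(start, start + size);
  }
}

// Assumption: a zero-delay timer stands in for "yield to the browser".
// It lets pending rendering and input handling run before we continue.
function yieldToBrowser() {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Run `processItem` over all items, yielding after every chunk.
async function processInChunks(items, processItem, chunkSize) {
  const results = [];
  for (const chunk of chunks(items, chunkSize)) {
    for (const item of chunk) {
      results.push(processItem(item));
    }
    await yieldToBrowser();
  }
  return results;
}
```

The weakness described above is visible in the signature: `chunkSize` has to be picked by the caller, and a value that fits the frame budget on a fast phone can still blow it on a slow one.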

Web Workers

There is a way to stop running code in lockstep with the browser's rendering thread: we can move some code to a different thread. Once it runs on a different thread, long-running JavaScript can simply block, and we avoid the complexity and cost of chunking and yielding. With this approach, the rendering process will not even notice that another thread is performing a blocking task. The API that makes this possible on the Web is the Web Worker. A Web Worker is created by passing in the path to a separate JavaScript file, which is then loaded and run in the newly created thread.

const worker = new Worker("./worker.js");

Before we go deeper, it is important to note that although Web Workers, Service Workers and Worklets are similar, they are not the same thing and serve different purposes:

  • In this article, I will only discuss Web Workers (often simply called "Workers"). A Worker is a JavaScript scope running in a separate thread. A Worker is created (and owned) by a page.
  • A ServiceWorker is a short-lived JavaScript scope running in a separate thread that acts as a proxy for all network requests sent from pages of the same origin. Most importantly, Service Workers let you implement arbitrarily complex caching logic. You can also use a Service Worker for long-running background requests, push messages and other features that do not need to be tied to a specific page. It is similar to a Web Worker, except that a Service Worker has a specific purpose and additional constraints.
  • A Worklet is an isolated JavaScript scope with a severely restricted API surface. It may or may not run on a separate thread. The point of Worklets is that the browser can move them between threads. AudioWorklet, the CSS Painting API and Animation Worklet are examples of Worklets.
  • A SharedWorker is a special Web Worker: multiple tabs and windows of the same origin can reference the same SharedWorker. This API is almost impossible to polyfill, and currently only Blink has implemented it, so I will not go into depth in this article.

JavaScript is designed to run in lockstep with the browser, which means there is no concurrency to deal with, and as a result many of the APIs exposed to JavaScript are not thread-safe. For a data structure, being thread-safe means that multiple threads can access and modify it in parallel without corrupting its state.

This is generally achieved with mutexes: while one thread performs an operation, the mutex locks out the other threads. Because browsers and JavaScript engines do not have to handle any locking logic, they can make more optimizations and run code faster. On the other hand, the absence of a locking mechanism forces Workers to run in a completely isolated JavaScript scope, because any form of data sharing would cause problems due to the lack of thread safety.

Workers are the "thread" primitive of the Web, but these "threads" are very different from those in C++, Java and other languages. The biggest difference is that the required isolation means a Worker cannot access any variables or code of the page that created it, and vice versa. The only way to communicate is to call the postMessage API, which makes a copy of the message and triggers a message event on the receiving end. The isolation also means that a Worker cannot access the DOM, so the UI cannot be updated from a Worker, at least not without significant effort (such as AMP's worker-dom).

Browser support for Web Workers is essentially universal, reaching back even to IE10. Nevertheless, Web Worker usage remains low. I think this is largely due to the unusual design of the Worker API.
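The postMessage round trip described above might look roughly like the following. Both sides are shown in one listing for brevity, whereas in practice they live in separate files. `sumOfSquares` and `startWorker` are illustrative names, and the `importScripts` check is only a guard so that the worker half stays inert outside a classic Worker scope.

```javascript
// Pure computation used by the example. It does not touch the DOM, so it
// can run on any thread.
function sumOfSquares(numbers) {
  return numbers.reduce((total, n) => total + n * n, 0);
}

// --- Worker side (would live in worker.js) ---
// `importScripts` only exists inside a classic Worker scope, so this block
// does nothing when the snippet is loaded anywhere else.
if (typeof importScripts === "function") {
  self.addEventListener("message", (event) => {
    // event.data is a copy of what the page sent, not a shared reference.
    self.postMessage(sumOfSquares(event.data));
  });
}

// --- Page side (would live in main.js) ---
function startWorker() {
  const worker = new Worker("./worker.js");
  worker.addEventListener("message", (event) => {
    console.log("result from worker:", event.data);
  });
  // The array is copied on its way to the Worker.
  worker.postMessage([1, 2, 3]);
  return worker;
}
```

Neither side can reach into the other's variables. Everything that crosses the boundary goes through `postMessage` and arrives as a `message` event.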

Concurrency model of JavaScript

If you want to use Workers, you need to adjust the architecture of your application. JavaScript actually supports two different concurrency models, which are usually grouped under the term "Off-Main-Thread architecture". Both models use Workers, but in very different ways, and each comes with its own trade-offs. The two models represent opposite ends of a spectrum, and any application can find a suitable spot somewhere between them.

Concurrency Model #1: Actor

I personally tend to think of a Worker as an Actor in the sense of the Actor model. The implementation of the Actor model in the programming language Erlang is arguably the most popular one. Each Actor may or may not run on a separate thread, and it fully owns the data it operates on. No other thread can access that data, which makes synchronization mechanisms like mutexes unnecessary. Actors only send messages to other actors and react to the messages they receive.

For example, I think of the main thread as the Actor that owns and manages the DOM and with it the entire UI. It is responsible for updating the UI and capturing input events. Another Actor is responsible for managing the state of the application. The DOM Actor converts low-level input events into app-level semantic events and passes them to the state Actor. The state Actor modifies the state object according to the events it receives, possibly using a state machine or even involving other actors. Once the state object is updated, the state Actor sends a copy of it to the DOM Actor, which updates the DOM accordingly. Paul Lewis and I explored this Actor-centric application architecture at Chrome Dev Summit 2018.
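A minimal sketch of such a state Actor is shown below. The event shapes (`increment`, `reset`) and the function names are invented for this example. The `endpoint` parameter is anything with `addEventListener` and `postMessage`, which inside a Worker would be `self`.

```javascript
// State actor logic: turns an app-level semantic event into a new state
// object. The previous state object is never mutated.
function reduce(state, event) {
  switch (event.type) {
    case "increment":
      return { ...state, count: state.count + event.amount };
    case "reset":
      return { ...state, count: 0 };
    default:
      return state;
  }
}

// Connects the reducer to a message endpoint. After every incoming event
// the updated state is sent back (postMessage copies it on the way).
function runStateActor(endpoint, initialState) {
  let state = initialState;
  endpoint.addEventListener("message", (event) => {
    state = reduce(state, event.data);
    endpoint.postMessage(state);
  });
}

// Inside the Worker:  runStateActor(self, { count: 0 });
// On the page (the DOM actor), a click handler would translate the
// low-level event into a semantic one:
//   worker.postMessage({ type: "increment", amount: 1 });
// and a "message" listener on the worker would update the DOM from
// event.data.
```

Because the state Actor only ever sees semantic events and only ever emits state snapshots, it has no dependency on the DOM and can run in a Worker without modification.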

Of course, this model is not without problems. For example, every message you send needs to be copied. How long the copy takes depends not only on the size of the message, but also on what the application is doing at that moment. In my experience, postMessage is usually "fast enough", but it falls short in some scenarios. Another issue is the trade-off itself: moving code to a Worker frees up the main thread, but you pay for it with communication overhead, and the Worker may be busy running other code before it can respond to your message. These factors need to be balanced. If you are not careful, a Worker can have a negative impact on UI responsiveness.

Very complex messages can be sent through postMessage. The underlying algorithm (called "structured clone") can handle cyclic data structures and even Map and Set. However, it cannot handle functions or classes, because code cannot be shared across scopes in JavaScript. Annoyingly, passing a function through postMessage throws an error, whereas a class instance is silently converted to a plain JavaScript object, losing all of its methods in the process (the details behind this make sense, but are beyond the scope of this article).
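These copy semantics can be tried without a Worker by calling the global `structuredClone()` function, which exposes the same structured clone algorithm. The sketch assumes an environment that provides it (current browsers, Node.js 17 or later). The `Point` class is made up for the example.

```javascript
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  length() {
    return Math.hypot(this.x, this.y);
  }
}

// Cyclic structures, Map and Set all survive the copy.
const original = { tags: new Set(["a", "b"]), lookup: new Map([["k", 1]]) };
original.self = original;
const copy = structuredClone(original);

// A class instance arrives as a plain object: the data is kept, but the
// prototype (and with it every method) is gone.
const plainPoint = structuredClone(new Point(3, 4));

// A function cannot be cloned at all: a DataCloneError is thrown.
let functionError = null;
try {
  structuredClone(() => {});
} catch (error) {
  functionError = error;
}

console.log(copy.self === copy);       // the cycle points at the copy
console.log(typeof plainPoint.length); // the method did not survive
console.log(functionError.name);
```

The same rules apply to anything passed to `worker.postMessage()`, so class instances need to be rebuilt on the receiving side if their methods are required there.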

In addition, postMessage is a "fire-and-forget" messaging mechanism with no built-in concept of request and response. If you want a request/response mechanism (and in my experience, most application architectures eventually force you to), you have to build it yourself. This is where Comlink comes in: a library that uses an RPC protocol under the hood and lets the main thread and the Worker access each other's objects. With Comlink you do not have to deal with postMessage at all. The only thing to note is that, because postMessage is asynchronous, functions do not return their result directly but a promise for it. In my opinion, Comlink takes the best parts of the Actor model and of shared-memory concurrency and offers them to users.
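To illustrate what "doing it yourself" involves, here is a hand-rolled request/response layer on top of postMessage. This is not how Comlink is implemented and not its API. It is a minimal sketch with invented names (`createRequester`, `serveRequests`) that tags each message with an id so that replies can be matched to requests.

```javascript
// Page side: returns a `request(payload)` function that resolves with the
// reply carrying the same id. `endpoint` is a Worker or anything else with
// addEventListener/postMessage.
function createRequester(endpoint) {
  let nextId = 0;
  const pending = new Map(); // request id -> resolve callback

  endpoint.addEventListener("message", (event) => {
    const { id, result } = event.data;
    const resolve = pending.get(id);
    if (resolve) {
      pending.delete(id);
      resolve(result);
    }
  });

  return function request(payload) {
    return new Promise((resolve) => {
      const id = nextId++;
      pending.set(id, resolve);
      endpoint.postMessage({ id, payload });
    });
  };
}

// Worker side: answers every request by running `handler` on its payload
// and echoing the id back with the result.
function serveRequests(endpoint, handler) {
  endpoint.addEventListener("message", async (event) => {
    const { id, payload } = event.data;
    endpoint.postMessage({ id, result: await handler(payload) });
  });
}

// Page:    const request = createRequester(new Worker("./worker.js"));
//          const answer = await request({ op: "sum", values: [1, 2] });
// Worker:  serveRequests(self, ({ values }) => values.reduce((a, b) => a + b, 0));
```

The sketch leaves out error propagation, timeouts and cleanup, which is exactly the kind of bookkeeping a library can take off your hands.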

Comlink is not magic: its RPC protocol still has to go through postMessage. If, in the rare case, postMessage turns out to be the bottleneck of your application, you can try to make use of transferables: ArrayBuffers can be transferred instead of copied. Transferring an ArrayBuffer is almost instantaneous and also transfers ownership: the sending JavaScript scope loses access to the data in the process. I used this trick when I experimented with running the physics simulation of a WebVR application off the main thread.
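The loss of access on the sending side can be observed directly. The sketch below uses `structuredClone()` with its `transfer` option as a stand-in for a Worker (an assumption of this example: it needs a current browser or Node.js 17 or later). With a real Worker the equivalent call is `worker.postMessage(buffer, [buffer])`.

```javascript
// One kilobyte of data that we want to hand over without copying.
const buffer = new ArrayBuffer(1024);
new Uint8Array(buffer)[0] = 42;

// Stand-in for: worker.postMessage(buffer, [buffer]);
// The second argument lists the objects whose ownership is transferred.
const received = structuredClone(buffer, { transfer: [buffer] });

// The sender's buffer is now detached and reports a length of zero, while
// the receiver owns the original bytes.
console.log(buffer.byteLength);
console.log(received.byteLength, new Uint8Array(received)[0]);
```

Since the sender can no longer read the data, transferring fits best when a buffer is handed off for good or passed back and forth between the two sides.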

Concurrency Model #2: Shared memory

As I mentioned before, traditional threading is based on shared memory. This approach is not feasible in JavaScript, because almost all JavaScript APIs are designed on the assumption that objects are never accessed concurrently. Changing this now would either break the Web or cause a significant performance loss because of the synchronization that would become necessary. Instead, the concept of shared memory is currently limited to a single dedicated type: SharedArrayBuffer (or SAB for short).

Like an ArrayBuffer, a SAB is a linear chunk of memory that can be manipulated through Typed Arrays or DataViews. If a SAB is sent via postMessage, the other end does not receive a copy of the data but a handle to the same chunk of memory. Any modification made on one thread is visible on all other threads. To let you build your own mutexes and other concurrent data structures, Atomics provides a range of utilities for atomic operations and thread-safe waiting mechanisms.
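As a small illustration, the sketch below creates a SAB, wraps it in two views and updates a counter with Atomics. The second view stands in for the view a Worker would create after receiving the SAB via postMessage. Everything runs on one thread here purely for demonstration, and `incrementCounter` is an invented helper. Note that current browsers generally only expose SharedArrayBuffer to cross-origin isolated pages.

```javascript
// Room for two 32-bit counters in shared memory.
const shared = new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT);

// Two views onto the same memory. In a real application the second view
// would be created inside a Worker that received `shared` via postMessage.
const pageView = new Int32Array(shared);
const workerView = new Int32Array(shared);

// Atomically add 1 to a slot and return the new value. Atomics.add itself
// returns the value the slot held before the addition.
function incrementCounter(view, index) {
  return Atomics.add(view, index, 1) + 1;
}

incrementCounter(pageView, 0);
incrementCounter(workerView, 0);

// Both increments are visible through either view.
const total = Atomics.load(pageView, 0);
console.log(total, workerView[0]);
```

A plain `view[0]++` is a separate read and write, so two threads doing it at the same time can lose an update. `Atomics.add` performs the read-modify-write as one indivisible step.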

The disadvantages of SAB are manifold. First, and most importantly, a SAB is just a chunk of memory. It is a very low-level primitive that offers great flexibility and power at the cost of increased engineering and maintenance complexity. Also, you cannot work with JavaScript objects and arrays the way you are used to. It is just a series of bytes.

To make working in this area more productive, I wrote an experimental library called buffer-backed-object. It synthesizes JavaScript objects that persist their values to an underlying buffer. In addition, WebAssembly uses Workers and SharedArrayBuffer to support the threading model of C++ and other languages. WebAssembly currently offers the best experience for shared-memory concurrency, but it also requires you to give up many of the benefits (and comforts) of JavaScript and switch to another language, and it usually produces larger binaries.
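The general idea of an object whose fields live in a buffer can be sketched by hand with a DataView. The class below is not the buffer-backed-object API. It is a hypothetical `PlayerView` with a fixed, hand-picked layout, shown only to convey the principle.

```javascript
// A hand-written view over 12 bytes of a buffer:
//   bytes 0-3  x      (float32)
//   bytes 4-7  y      (float32)
//   bytes 8-11 score  (uint32)
// All values are stored little-endian.
class PlayerView {
  static BYTE_LENGTH = 12;

  constructor(buffer, byteOffset = 0) {
    this.view = new DataView(buffer, byteOffset, PlayerView.BYTE_LENGTH);
  }

  get x() { return this.view.getFloat32(0, true); }
  set x(value) { this.view.setFloat32(0, value, true); }

  get y() { return this.view.getFloat32(4, true); }
  set y(value) { this.view.setFloat32(4, value, true); }

  get score() { return this.view.getUint32(8, true); }
  set score(value) { this.view.setUint32(8, value, true); }
}

// Two players packed back to back in one buffer. A SharedArrayBuffer
// would work here as well.
const playerBuffer = new ArrayBuffer(2 * PlayerView.BYTE_LENGTH);
const secondPlayer = new PlayerView(playerBuffer, PlayerView.BYTE_LENGTH);
secondPlayer.x = 1.5;
secondPlayer.score = 7;
```

Reads and writes look like ordinary property access, but every value ends up in the buffer, so another view (or another thread, for a SAB) sees the same data.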

Case study: PROXX

In 2019, my team and I released PROXX, a Web-based Minesweeper clone that specifically targets feature phones. Feature phones have very low screen resolutions, usually no touch interface, a weak CPU and no GPU to make up for it. Despite all these limitations, feature phones are very popular because they are extremely cheap and still ship with a fully functional Web browser. Their popularity opens up the mobile Internet to people who could not afford it before.

To make sure the game runs responsively and smoothly on these feature phones, we used an Actor-like architecture. The main thread is responsible for rendering the DOM (via preact and, if available, WebGL) and for capturing UI events. The state and game logic of the entire application run in a Worker, which determines whether you stepped on a mine and, if not, how much of the board should be revealed. The game logic even sends intermediate results to the UI thread so that the user gets continuous visual updates.

Other benefits

I have talked about the importance of smoothness and responsiveness, and about how Workers make these goals easier to reach. An additional benefit is that Web Workers can help your application consume less battery power. By using more CPU cores in parallel, the CPU spends less time in "high performance" mode, which reduces overall power consumption. David Rousset from Microsoft has explored the power consumption of Web applications.

Adopting Web Workers

If you have read this far, I hope you have a better understanding of why Workers are so useful. The obvious next question is: how do you use them?

Workers have not yet been adopted at scale, so there is not much established practice or architecture around them. It is hard to tell in advance which parts of your code are worth moving to a Worker. I do not advocate one specific architecture over all others, but I want to share my approach with you. I have adopted Workers gradually this way and had a good experience:

Most people already build their applications out of modules, because most bundlers rely on modules for bundling and code splitting. The main technique for building an application with Web Workers is to strictly separate UI-related code from purely computational logic. This minimizes the number of modules that must stay on the main thread (such as those that call DOM APIs), so that everything else can run in a Worker.

In addition, try to rely on synchronous calls as little as possible, so that asynchronous patterns such as callbacks and async/await can be adopted later. Once that is in place, you can try using Comlink to move a module from the main thread to a Worker and measure whether this improves performance.

If you want to introduce Workers into an existing project, things can be a bit trickier. Take the time to carefully analyze which parts of your code rely on DOM operations or on APIs that can only be called on the main thread. If possible, remove these dependencies through refactoring and gradually move towards the model I proposed above.

In either case, a key point is to make sure the impact of the Off-Main-Thread architecture is measurable. Do not assume (or guess) whether using a Worker will be faster or slower. Browsers sometimes behave in inexplicable ways, so many optimizations turn out to be counterproductive. Getting concrete numbers is important, because they help you make an informed decision!
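One way to get such numbers is to time the same task repeatedly in both variants and compare summary statistics rather than single runs. The helper names below are invented for this sketch. It assumes `performance.now()` is available (browsers, Node.js 16 or later).

```javascript
// Reduce a list of durations (in ms) to a few comparable numbers.
function summarize(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const pick = (quantile) =>
    sorted[Math.min(sorted.length - 1, Math.floor(quantile * sorted.length))];
  return {
    median: pick(0.5),
    p95: pick(0.95),
    max: sorted[sorted.length - 1],
  };
}

// Run an (optionally async) task several times and summarize how long
// each run took from start to finish.
async function measure(task, runs = 20) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await task();
    samples.push(performance.now() - start);
  }
  return summarize(samples);
}

// Hypothetical usage, comparing two variants of the same work:
//   const onMain   = await measure(() => computeOnMainThread(input));
//   const inWorker = await measure(() => workerApi.compute(input));
```

Total duration is only part of the picture. A Worker variant can take longer overall and still be the better choice if it keeps the main thread free, so it is worth also watching frame times or input latency while the task runs.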

Web Workers and Bundlers

Most modern Web development setups use a bundler to significantly improve loading performance. A bundler packs multiple JavaScript modules into a single file. For Workers, however, the constructor requires the file to stay separate. I have seen many people pull out their Worker code and encode it as a Data URL or Blob URL instead of putting in the work to make their bundler do what they need. Both Data URLs and Blob URLs cause serious problems: Data URLs do not work in Safari at all, and while Blob URLs do work, they have no concept of origin or path, which means that path resolution and fetching do not behave as usual. This is another obstacle to adopting Workers, but the mainstream bundlers have recently become much better at handling them:

  • Webpack: For Webpack v4, the worker-loader plugin teaches Webpack to understand Workers. Starting with Webpack v5, Webpack understands the Worker constructor automatically and can even share modules between the main thread and the Worker to avoid loading them twice.
  • Rollup: For Rollup, I wrote rollup-plugin-off-main-thread, a plugin that makes Workers work out of the box.
  • Parcel: Parcel deserves a special mention, because both v1 and v2 support Workers out of the box without additional configuration.

It is very common to use ES Modules when developing applications with these bundlers. However, this brings new problems.

Web Workers and ES Modules

All modern browsers support <script type="module" src="file.js"> to run JavaScript modules. All modern browsers except Firefox now also support the Worker equivalent: new Worker("./worker.js", {type: "module"}). Safari only started supporting it recently, so it is important to think about how to support older browsers. Fortunately, all bundlers (with the plugins mentioned above) make sure that your module code runs in a Worker even if the browser does not support Module Workers. In that sense, using a bundler can be regarded as a polyfill for Module Workers.
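If you need to know at runtime whether the current browser understands `{type: "module"}`, one commonly used trick is to pass an options object with a getter and check whether the constructor reads it. The sketch below follows that idea. The function name and the injectable constructor parameter are choices made for this example, and the `"blob://"` URL is a deliberately unusable placeholder.

```javascript
// Returns true if the Worker constructor looks at the `type` option, which
// browsers without Module Worker support never do.
// `WorkerCtor` is injectable so the check can be exercised without a browser.
function supportsModuleWorkers(WorkerCtor = globalThis.Worker) {
  if (typeof WorkerCtor !== "function") {
    return false;
  }
  let typeWasRead = false;
  const options = {
    get type() {
      typeWasRead = true;
      return "module";
    },
  };
  try {
    // The URL is not meant to load anything. We only care about whether
    // the options are inspected during construction.
    const worker = new WorkerCtor("blob://", options);
    if (worker && typeof worker.terminate === "function") {
      worker.terminate();
    }
  } catch {
    // Construction may throw for the placeholder URL. By then the getter
    // has either been read or not, which is all we need to know.
  }
  return typeWasRead;
}
```

A result of `false` would be the signal to load a bundled, classic Worker script instead of the module version.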

The future

I like the Actor model. But concurrency in JavaScript is not designed very well. We have built a lot of tools and libraries to make up for it, but in the end this is something JavaScript should handle at the language level. Some TC39 engineers are very interested in this topic and are trying to find ways to make JavaScript support both models better. Several related proposals are currently being evaluated, such as allowing code to be sent via postMessage, or offering high-level, scheduler-like APIs (which are common on Native) for sharing objects between threads.

These proposals have not yet made significant progress in the standardization process, so I will not spend time discussing them in depth here. If you are curious, you can follow the TC39 proposals to see what the next generation of JavaScript will contain.

Summary

Workers are a key tool for keeping the main thread responsive and smooth, because they prevent long-running code from blocking browser rendering. Since communication with a Worker is inherently asynchronous, adopting Workers requires some adjustments to the application architecture, but in return it becomes easier to support the huge range of devices with vastly different performance.

You should make sure to use an architecture that makes it easy to move code around, so that you can measure the performance impact of an off-main-thread architecture. The design of Web Workers comes with a learning curve, but the most complicated parts can be abstracted away by libraries like Comlink.


FAQ

Some questions and ideas come up again and again, so I want to preempt them and record my answers here.

Isn't postMessage slow?

My core advice for all performance questions is: measure first! Until you have measured, there is no point in talking about fast or slow. That said, in my experience postMessage is usually "fast enough". My rule of thumb: if JSON.stringify(messagePayload) is under 10kb, you do not have to worry about dropped frames, even on the slowest phones. If postMessage really does become a bottleneck in your application, you can consider the following techniques:

  • Split your tasks so that you can send smaller messages
  • If the message is a state object and only a small part of it has changed, send only the changed part instead of the entire object
  • If you send a lot of messages, try combining multiple messages into one
  • As a last resort, you can try switching to a numeric representation of your message and transferring ArrayBuffers instead of sending object-based messages
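The rule of thumb and the "send only what changed" technique can be sketched as follows. The helper names are invented, the size is measured as UTF-8 bytes of the JSON string (one possible reading of "10kb"), and the diff is deliberately shallow.

```javascript
const TEN_KB = 10 * 1024;

// Size of the JSON representation of a payload in UTF-8 bytes.
function payloadBytes(payload) {
  return new TextEncoder().encode(JSON.stringify(payload)).byteLength;
}

// Rule-of-thumb check: is the payload small enough not to worry about?
function isSmallPayload(payload, limit = TEN_KB) {
  return payloadBytes(payload) < limit;
}

// Shallow diff: returns only the top-level keys of `next` whose value
// differs from `previous`. Removed keys and nested changes inside an
// unchanged reference are not detected.
function changedKeys(previous, next) {
  const patch = {};
  for (const key of Object.keys(next)) {
    if (!Object.is(previous[key], next[key])) {
      patch[key] = next[key];
    }
  }
  return patch;
}

// Hypothetical usage inside a state Worker:
//   self.postMessage(changedKeys(lastSentState, newState));
// with the receiving side applying the patch via Object.assign.
```

Measuring the JSON size is only a cheap proxy, since postMessage uses structured clone rather than JSON, but it gives a quick signal during development for when a message deserves a closer look.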

I want to access the DOM from the Worker

I get this feedback a lot. In most cases, however, it only moves the problem. You would effectively create a second main thread and run into the same problems again, just on a different thread. For the DOM to be safely accessible from multiple threads, locks would have to be added, which would slow down DOM operations and would likely break many existing web applications.

In addition, the lockstep model actually has advantages. It gives the browser a clear signal of when the DOM is in a usable state and can be rendered to the screen. In a multi-threaded DOM world, this signal would be lost, and we would have to deal with partial renders and similar issues ourselves.

I really don't like having to split my code into separate files just to use Workers

I agree. There are proposals under review in TC39 that would make it possible to inline one module into another without all the small problems that Data URLs and Blob URLs have. There is no satisfactory solution yet, but a future iteration of JavaScript will certainly address this.

Supplementary summary

Some scenarios where the author currently uses Workers:
  1. When an algorithm or piece of program logic runs for a relatively long time (beyond the "frame budget") and blocks the rendering engine.
  2. When you want to try out concurrent design patterns.
  3. When adjusting the design of the task scheduling architecture (scheduling mechanisms implemented in JS may not be optimal).
  4. ... When rendering and computation are fully decoupled, so that the computation can reasonably be split off into a Worker.
