Reproducing the problem scene with rrweb

Author of this article: Ling Jiang

Background

Cloud Music runs many internal content management systems (CMS) that support business operations and configuration. When operations staff run into problems while using them, they expect developers to respond and fix the problem promptly. The pain point is that developers have no access to the problem scene, which makes it hard to locate the problem quickly. A typical exchange looks like this:

Operations colleague Watson: "Sherlock, when I configure an mlog tag, I get a prompt saying the tag does not exist. Please take a look, it's urgent."
Developer Sherlock: "Don't panic, I'll take a look." (Opens the operations admin console in the test environment and repeats the operation; everything works normally...)
Developer Sherlock: "It works fine on my side. Where is your desk? I'll come over and have a look."
Operations colleague Watson: "I'm in Beijing..."
Developer Sherlock: "I'm in Hangzhou..."

To respond promptly to the problems operations staff encounter, and to locate and fix CMS users' problems as quickly as possible, we designed and implemented a one-click problem-reporting plug-in that restores the problem scene. It consists of two parts, recording and display:

ThemisRecord plug-in (recording): reports basic user information, user permissions, API requests and results, error stacks, and the screen recording
Listening platform (display): shows the screen recording playback together with the user, request, and error stack information

Reporting process

The main flow of the one-click reporting plug-in is shown in the figure below. During screen recording, the plug-in collects basic user information, API request data, error stack information, and the screen recording itself, then uploads the data to the NOS cloud and the listening platform.
[Figure: plug-in design]
Across the whole reporting flow, the hard part is recording and replaying the user's on-screen operations. After some investigation, we found that the open source rrweb library meets our needs well. rrweb supports scenarios such as screen recording and playback, custom events, and console recording and playback; screen recording and playback is the most commonly used. For details, see the scenario examples.

This article focuses on how the rrweb library implements screen recording and playback.

rrweb library

rrweb consists of three main libraries: rrweb, rrweb-player, and rrweb-snapshot.

rrweb: provides two methods, record and replay. record captures DOM changes on the page, and replay restores those changes according to their timestamps.
rrweb-player: a playback GUI for rrweb built on Svelte templates. It supports pausing, playback at multiple speeds, dragging the timeline, and so on, and internally calls the methods provided by rrweb, such as replay.
rrweb-snapshot: provides two major features, snapshot and rebuilding. snapshot serializes the DOM into incremental snapshots, and rebuilding restores incremental snapshots to the DOM.

To understand the principle of the rrweb library, you can start with the following key questions:

How to implement event monitoring
How to serialize the DOM
How to implement a custom timer

How to implement event monitoring

Recording with rrweb usually looks like the code below. The emit callback receives every event that corresponds to a DOM change, and each event can then be handled as the business requires. Our one-click reporting plug-in, for example, uploads the events to the cloud; developers then pull the data from the cloud and replay it on the listening platform.

let events = [];

rrweb.record({
  // emit option is required
  emit(event) {
    // push event into the events array
    events.push(event);
  },
});

The record method initializes event listeners by event type. DOM changes, mouse movement, mouse interaction, scrolling, and so on each have their own listener. This article focuses on how DOM changes are monitored and processed.

Listening for DOM changes relies on the browser's MutationObserver API. After a series of DOM changes, the API fires its callback asynchronously in batches and passes the changes to the callback as an array of MutationRecord objects. See MDN for details on MutationObserver.

rrweb builds its monitoring on this API as well. The callback is the processMutations method of the MutationBuffer class:

  const observer = new MutationObserver(
    mutationBuffer.processMutations.bind(mutationBuffer),
  );

The mutationBuffer.processMutations method handles each record according to MutationRecord.type:

type === 'attributes': a DOM attribute changed. Every node whose attributes changed is recorded in the this.attributes array with the structure { node: Node, attributes: {} }, and attributes holds only the attributes involved in this change;
type === 'characterData': a characterData node changed. It is recorded in the this.texts array with the structure { node: Node, value: string }, where value is the latest value of the characterData node;
type === 'childList': the child node tree (childList) changed. Compared with the previous two types, the processing is more complicated.
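The dispatch by type can be sketched as follows. This is a minimal illustration, not rrweb's source: SketchNode, SketchRecord, and SketchBuffer are hypothetical plain-object stand-ins for DOM nodes, MutationRecord, and MutationBuffer, so the snippet runs without a browser.

```typescript
// Plain-object stand-ins for DOM nodes and MutationRecords (assumed shapes).
interface SketchNode {
  label: string;
  attrs: Record<string, string>;
  text: string;
}

interface SketchRecord {
  type: 'attributes' | 'characterData' | 'childList';
  target: SketchNode;
  attributeName?: string;
}

class SketchBuffer {
  attributes: { node: SketchNode; attributes: Record<string, string> }[] = [];
  texts: { node: SketchNode; value: string }[] = [];
  childListTargets: SketchNode[] = [];

  processMutations(records: SketchRecord[]): void {
    for (const record of records) {
      const target = record.target;
      if (record.type === 'attributes') {
        const attrName = record.attributeName;
        if (attrName === undefined) continue;
        // One entry per node, holding only the attributes touched in this batch.
        let item = this.attributes.find((a) => a.node === target);
        if (!item) {
          item = { node: target, attributes: {} };
          this.attributes.push(item);
        }
        item.attributes[attrName] = target.attrs[attrName] ?? '';
      } else if (record.type === 'characterData') {
        // Keep only the latest text value of the node.
        const item = this.texts.find((t) => t.node === target);
        if (item) item.value = target.text;
        else this.texts.push({ node: target, value: target.text });
      } else {
        // childList changes need separate add/move/drop bookkeeping.
        this.childListTargets.push(target);
      }
    }
  }
}

const heading: SketchNode = { label: 'h1', attrs: { class: 'big', id: 'top' }, text: '' };
const textNode: SketchNode = { label: '#text', attrs: {}, text: 'hello' };

const buffer = new SketchBuffer();
buffer.processMutations([
  { type: 'attributes', target: heading, attributeName: 'class' },
  { type: 'characterData', target: textNode },
  { type: 'childList', target: heading },
]);
console.log(buffer.attributes[0].attributes); // only "class" is recorded, "id" is not
```

Keeping one entry per node means that repeated attribute changes within a batch collapse into a single record carrying the latest values.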

childList incremental snapshot

When the childList changes, recording the entire DOM tree every time would produce a huge amount of data, which is clearly not feasible. rrweb therefore uses incremental snapshots.

Three Sets are key here: addedSet, movedSet, and droppedSet. They correspond to the three node operations add, move, and delete, which is similar to the React diff mechanism. The Set structure is used to deduplicate DOM nodes.

New nodes

Traverse MutationRecord.addedNodes and add each node that has not been serialized yet to addedSet. If the node is also in the deleted set droppedSet, remove it from droppedSet.

Example: Create nodes n1, n2, append n2 to n1, and append n1 to body.

body
  n1
    n2

The operations above generate only one MutationRecord, the one for adding n1. Appending n2 to n1 does not generate a MutationRecord, so when traversing MutationRecord.addedNodes you also need to traverse each node's children; otherwise n2 would be missed.

After all MutationRecords have been traversed, the nodes in addedSet are serialized. The serialized result for each node is:
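A minimal sketch of that child traversal, using plain objects instead of real DOM nodes. TreeSketchNode and collectAdded are hypothetical names chosen for this illustration.

```typescript
interface TreeSketchNode {
  label: string;
  children: TreeSketchNode[];
}

// Add a node and all of its descendants to the set of added nodes.
function collectAdded(node: TreeSketchNode, added: Set<TreeSketchNode>): void {
  added.add(node);
  node.children.forEach((child) => collectAdded(child, added));
}

const n2: TreeSketchNode = { label: 'n2', children: [] };
const n1: TreeSketchNode = { label: 'n1', children: [n2] };

// The single MutationRecord lists only n1 in addedNodes.
const addedNodes: TreeSketchNode[] = [n1];

const addedSet = new Set<TreeSketchNode>();
addedNodes.forEach((node) => collectAdded(node, addedSet));

console.log(Array.from(addedSet).map((node) => node.label)); // n1 first, then n2
```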

export type addedNodeMutation = {
  parentId: number;
  nextId: number | null;
  node: serializedNodeWithId;
}

A node's position in the DOM is expressed through parentId and nextId. If the parent node or the next sibling of a DOM node has not been serialized yet, the node cannot be located precisely, so it is stored first and processed at the end.
[Figure: doubly linked list]

rrweb uses a doubly linked list, addList, to store nodes whose parent has not been added yet. When a node is inserted into addList:

If the node's previousSibling is already in the linked list, the node is inserted after node.previousSibling;
If the node's nextSibling is already in the linked list, the node is inserted before node.nextSibling;
If neither is in the list, the node is inserted at the head of the linked list.

This keeps sibling nodes in order: a node's nextSibling comes after it in the list, and its previousSibling comes before it. Once addedSet has been serialized, addList is traversed in reverse order, which ensures that a node's nextSibling is serialized before the node itself, so nextId is available when the node is serialized.
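The three insertion rules and the reverse traversal can be sketched like this. It is a simplified illustration under assumptions: SiblingNode, ListEntry, and AddListSketch are hypothetical names, the nodes are plain objects, and the example inserts nodes in an order where every later node already has a neighbour in the list.

```typescript
interface SiblingNode {
  label: string;
  previousSibling: SiblingNode | null;
  nextSibling: SiblingNode | null;
}

interface ListEntry {
  value: SiblingNode;
  previous: ListEntry | null;
  next: ListEntry | null;
}

class AddListSketch {
  head: ListEntry | null = null;
  private entries = new Map<SiblingNode, ListEntry>();

  add(node: SiblingNode): void {
    const entry: ListEntry = { value: node, previous: null, next: null };
    const before = node.previousSibling ? this.entries.get(node.previousSibling) : undefined;
    const after = node.nextSibling ? this.entries.get(node.nextSibling) : undefined;
    if (before) {
      // previousSibling is already listed: insert right after it.
      entry.previous = before;
      entry.next = before.next;
      if (before.next) before.next.previous = entry;
      before.next = entry;
    } else if (after) {
      // nextSibling is already listed: insert right before it.
      entry.next = after;
      entry.previous = after.previous;
      if (after.previous) after.previous.next = entry;
      else this.head = entry;
      after.previous = entry;
    } else {
      // Neither sibling is listed: insert at the head.
      entry.next = this.head;
      if (this.head) this.head.previous = entry;
      this.head = entry;
    }
    this.entries.set(node, entry);
  }

  // Walk from tail to head so that a nextSibling is visited before its node.
  reversed(): SiblingNode[] {
    let tail: ListEntry | null = this.head;
    while (tail && tail.next) tail = tail.next;
    const result: SiblingNode[] = [];
    for (let entry: ListEntry | null = tail; entry; entry = entry.previous) {
      result.push(entry.value);
    }
    return result;
  }
}

// Three siblings in DOM order: a, b, c.
const a: SiblingNode = { label: 'a', previousSibling: null, nextSibling: null };
const b: SiblingNode = { label: 'b', previousSibling: a, nextSibling: null };
const c: SiblingNode = { label: 'c', previousSibling: b, nextSibling: null };
a.nextSibling = b;
b.nextSibling = c;

const addList = new AddListSketch();
[b, c, a].forEach((node) => addList.add(node));
console.log(addList.reversed().map((node) => node.label)); // c, b, a
```

Visiting c before b and b before a means that each node's next sibling already has an id by the time the node itself is serialized.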

Node movement

Traverse MutationRecord.addedNodes. If a recorded node has the __sn attribute, add it to movedSet. The __sn attribute marks a DOM node that has already been serialized, so its presence means the node was moved.

Before a node in movedSet is serialized, rrweb checks whether its parent node has been removed:

If the parent node has been removed, nothing needs to be done and the node is skipped;
If the parent node has not been removed, the node is serialized.

Node deletion

Traverse MutationRecord.removedNodes:

If the node was newly added in this batch, ignore it, remove it from addedSet, and record it in droppedSet. The record is needed when the added nodes are processed: although the node itself was removed, its children may still be in addedSet, so when addedSet is processed, rrweb checks whether an ancestor of each node has been removed;
Otherwise, the node to be deleted is recorded in this.removes, together with its parentId and node id.
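The two removal cases can be sketched as below. This is a simplified illustration rather than rrweb's code: PendingNode, RemovalSketch, onRemoved, and hasDroppedAncestor are hypothetical names, and a node counts as "already serialized" here simply when it carries an id.

```typescript
interface PendingNode {
  label: string;
  id?: number; // present only if the node was serialized earlier
  parent?: PendingNode;
}

class RemovalSketch {
  addedSet = new Set<PendingNode>();
  droppedSet = new Set<PendingNode>();
  removes: { parentId: number; id: number }[] = [];

  onRemoved(node: PendingNode, parentId: number): void {
    if (this.addedSet.has(node)) {
      // Added and removed within the same batch: nothing to replay, but remember it.
      this.addedSet.delete(node);
      this.droppedSet.add(node);
    } else if (node.id !== undefined) {
      // A previously serialized node was removed: record it for playback.
      this.removes.push({ parentId, id: node.id });
    }
  }

  // Used while processing addedSet: skip nodes whose ancestor was dropped.
  hasDroppedAncestor(node: PendingNode): boolean {
    for (let p: PendingNode | undefined = node.parent; p !== undefined; p = p.parent) {
      if (this.droppedSet.has(p)) return true;
    }
    return false;
  }
}

const sketch = new RemovalSketch();
const fresh: PendingNode = { label: 'fresh' };
const freshChild: PendingNode = { label: 'freshChild', parent: fresh };
const existing: PendingNode = { label: 'existing', id: 7 };

sketch.addedSet.add(fresh);
sketch.addedSet.add(freshChild);

sketch.onRemoved(fresh, 3); // added in this batch, so it moves to droppedSet
sketch.onRemoved(existing, 3); // serialized earlier, so it goes to removes

console.log(sketch.removes); // one entry: parentId 3, id 7
console.log(sketch.hasDroppedAncestor(freshChild)); // true
```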

How to serialize the DOM

The MutationBuffer instance calls the serializeNodeWithId method of rrweb-snapshot to serialize DOM nodes.
serializeNodeWithId internally calls the serializeNode method, which serializes the different node types (Document, Doctype, Element, Text, CDATASection, and Comment) according to nodeType. The key part is the serialization of Element:

Traverse the element's attributes and call the transformAttribute method to convert resource paths to absolute paths;

    for (const { name, value } of Array.from((n as HTMLElement).attributes)) {
        attributes[name] = transformAttribute(doc, tagName, name, value);
    }

Decide whether the element should be hidden by checking whether it has the blockClass class name or matches the blockSelector selector. So that hiding the element does not affect the page layout, an empty element with the same width and height is returned;

    const needBlock = _isBlockedElement(
        n as HTMLElement,
        blockClass,
        blockSelector,
    );

Distinguish external stylesheets from inline styles, serialize the CSS, and convert the relative paths of resources referenced in the CSS into absolute paths. For an external stylesheet, read all styles through the cssRules of the CSSStyleSheet instance, concatenate them into one string, and store it in the _cssText attribute;

    if (tagName === 'link' && inlineStylesheet) {
        // document.styleSheets lists all externally linked stylesheets
        const stylesheet = Array.from(doc.styleSheets).find((s) => {
            return s.href === (n as HTMLLinkElement).href;
        });
        // get the string of all rules in this CSS file
        const cssText = getCssRulesString(stylesheet as CSSStyleSheet);
        if (cssText) {
            delete attributes.rel;
            delete attributes.href;
            // convert resource paths in the CSS file to absolute paths
            attributes._cssText = absoluteToStylesheet(
                cssText,
                stylesheet!.href!,
            );
        }
    }
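To illustrate what converting resource paths in CSS to absolute paths involves, here is a small sketch built on the standard URL constructor. It is not the implementation of absoluteToStylesheet: the function name absolutizeCssUrls and the regular expression are assumptions made for this example, and only simple url(...) references are handled.

```typescript
// Matches url(...) with optional single or double quotes around the path.
const URL_IN_CSS = /url\((['"]?)([^'")]+)\1\)/g;

function absolutizeCssUrls(cssText: string, stylesheetHref: string): string {
  return cssText.replace(URL_IN_CSS, (match: string, quote: string, path: string) => {
    // Data URIs carry their content inline and need no rewriting.
    if (path.indexOf('data:') === 0) return match;
    // Resolve the path relative to the stylesheet's own URL.
    const absolute = new URL(path, stylesheetHref).href;
    return `url(${quote}${absolute}${quote})`;
  });
}

const rewritten = absolutizeCssUrls(
  '.banner { background: url(../img/bg.png); }',
  'https://example.com/static/css/main.css',
);
console.log(rewritten);
// .banner { background: url(https://example.com/static/img/bg.png); }
```

Resolving against the stylesheet's URL rather than the page's URL matters because the recording is replayed on a different origin, where relative paths would otherwise point nowhere.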

Call the maskInputValue method to mask user input data;
Convert canvas content to a base64 image and save it, and record the current playback time of media elements, the scroll position of elements, and so on;
Return a serialized object, serializedNode, containing the attributes processed above. The key to serialization is that every node gets a unique id; rootId is the id of the document the node belongs to, which helps identify the root node during playback.

    return {
        type: NodeType.Element,
        tagName,
        attributes,
        childNodes: [],
        isSVG,
        needBlock,
        rootId,
    };

Event timestamp

Once the serialized DOM node is available, the wrapEvent method is called to add a timestamp to the event. The timestamp is needed during playback.

function wrapEvent(e: event): eventWithTime {
  return {
    ...e,
    timestamp: Date.now(),
  };
}

Serialized id

During serialization, serializeNodeWithId first reads the id from the DOM node's __sn.id attribute. If it does not exist, it calls genId to generate a new id and assigns it to __sn.id. The id uniquely identifies the DOM node, and the id -> DOM mapping built from it helps find the corresponding DOM node during playback.

function genId(): number {
  return _id++;
}

const serializedNode = Object.assign(_serializedNode, { id });

If the DOM node has child nodes, serializeNodeWithId is called recursively, and the following tree structure is finally returned:

{
    type: NodeType.Document,
    childNodes: [
        {
            type: NodeType.Element,
            tagName,
            attributes,
            childNodes: [{
                //...
            }],
            isSVG,
            needBlock,
            rootId,
        },
    ],
    rootId,
};
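The id assignment and the recursion can be put together in a short sketch. It works on plain objects so that it runs outside a browser; SourceNode, SerializedSketch, serializeWithId, and idToNode are hypothetical names, and starting the counter at 1 is an assumption of this example.

```typescript
interface SourceNode {
  tagName: string;
  children: SourceNode[];
  __sn?: { id: number };
}

interface SerializedSketch {
  id: number;
  tagName: string;
  childNodes: SerializedSketch[];
}

let nextId = 1; // assumed starting value for this sketch
function genId(): number {
  return nextId++;
}

// id -> node mapping, used to look nodes up again later.
const idToNode = new Map<number, SourceNode>();

function serializeWithId(node: SourceNode): SerializedSketch {
  // Reuse the id of a node that was serialized before; otherwise create one.
  const id = node.__sn ? node.__sn.id : genId();
  node.__sn = { id };
  idToNode.set(id, node);
  return {
    id,
    tagName: node.tagName,
    // Recurse into the children to build the serialized tree.
    childNodes: node.children.map((child) => serializeWithId(child)),
  };
}

const div: SourceNode = { tagName: 'div', children: [] };
const body: SourceNode = { tagName: 'body', children: [div] };
const html: SourceNode = { tagName: 'html', children: [body] };

const tree = serializeWithId(html);
const again = serializeWithId(html); // ids are stable on a second pass

console.log(JSON.stringify(tree));
console.log(idToNode.get(3) === div); // true
```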

How to implement a custom timer

[Figure: replay]
To support dragging the progress bar to any point and changing the playback speed during playback (as shown in the figure above), rrweb implements a custom high-precision timer, Timer. Its key properties and methods are:

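export declare class Timer {
    // initial playback position, i.e. the point in time the progress bar was dragged to
    timeOffset: number;
    // playback speed
    speed: number;
    // playback Action queue
    private actions;
    // add actions to the playback Action queue
    addActions(actions: actionWithDelay[]): void;
    // start playback
    start(): void;
    // set the playback speed
    setSpeed(speed: number): void;
}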

Playback entry

The events recorded above can be replayed inside an iframe through the play method of Replayer:

const replayer = new rrweb.Replayer(events);
replayer.play();

In the first step, when the rrweb.Replayer instance is initialized, an iframe is created as the container for event playback, and two services are created: createPlayerService handles the event playback logic, and createSpeedService controls the playback speed.

In the second step, calling replayer.play() triggers the PLAY event and starts the event playback flow.

// this.service is the playback control service instance created by createPlayerService
// timeOffset is the time offset after the mouse drag
this.service.send({ type: 'PLAY', payload: { timeOffset } });

Baseline timestamp generation

[Figure: timeline]

The key to supporting drag-and-drop during playback is the time offset parameter, timeOffset:

Total playback duration = events[n].timestamp - events[0].timestamp, where n is the length of the event queue minus one;
The total length of the timeline is the total playback duration, and the coordinate on the timeline where the mouse drag lands is timeOffset;
The baseline timestamp after dragging (baselineTime) is calculated from the timestamp of the first event and timeOffset;
Using each event's timestamp, the events after baselineTime are then taken from the full event queue; these form the event queue that needs to be played back.
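The steps above can be sketched with a few lines of arithmetic. TimedEvent, totalDuration, and eventsAfterOffset are hypothetical names for this illustration, and the timestamps are made-up sample values.

```typescript
interface TimedEvent {
  timestamp: number;
  label: string;
}

// Total playback duration: last timestamp minus first timestamp.
function totalDuration(list: TimedEvent[]): number {
  if (list.length === 0) return 0;
  return list[list.length - 1].timestamp - list[0].timestamp;
}

// Baseline timestamp for a drag to `timeOffset`, plus the events still to be played.
function eventsAfterOffset(
  list: TimedEvent[],
  timeOffset: number,
): { baselineTime: number; pending: TimedEvent[] } {
  if (list.length === 0) return { baselineTime: 0, pending: [] };
  const baselineTime = list[0].timestamp + timeOffset;
  const pending = list.filter((item) => item.timestamp >= baselineTime);
  return { baselineTime, pending };
}

const recorded: TimedEvent[] = [
  { timestamp: 1000, label: 'full snapshot' },
  { timestamp: 1200, label: 'mouse move' },
  { timestamp: 1500, label: 'DOM change' },
  { timestamp: 2000, label: 'scroll' },
];

const dragged = eventsAfterOffset(recorded, 300);
console.log(totalDuration(recorded)); // 1000
console.log(dragged.baselineTime); // 1300
console.log(dragged.pending.map((item) => item.label)); // 'DOM change', 'scroll'
```

A real replayer would still have to apply the events before the baseline to rebuild the page state at that point; the sketch leaves that part out.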

Playback Action queue conversion

After obtaining the event queue, traverse it, convert each event into the corresponding playback action according to its type, and add the action to the Action queue of the custom Timer.

actions.push({
    doAction: () => {
        castFn();
    },
    delay: event.delay!,
});

doAction is the method called during playback; what it does depends on the EventType. For example, a DOM element change corresponds to the incremental event EventType.IncrementalSnapshot. For this event type, the playback action calls the applyIncremental method to apply the incremental snapshot: it builds the actual DOM node from the serialized node data and adds it to the iframe container, which is the reverse of the DOM serialization described earlier.
delay = event.timestamp - baselineTime, the difference between the current event's timestamp and the baseline timestamp.
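As a rough sketch of this conversion, the snippet below turns a list of events into actions whose delay is relative to the baseline timestamp. RecordedItem, DelayedAction, and toActions are hypothetical names, and the apply callback stands in for whatever playback handler fits the event type.

```typescript
interface RecordedItem {
  timestamp: number;
  label: string;
}

interface DelayedAction {
  doAction: () => void;
  delay: number;
}

function toActions(
  items: RecordedItem[],
  baselineTime: number,
  apply: (item: RecordedItem) => void,
): DelayedAction[] {
  return items.map((item): DelayedAction => ({
    // What to do when the action becomes due.
    doAction: () => apply(item),
    // How long after the baseline the action becomes due.
    delay: item.timestamp - baselineTime,
  }));
}

const applied: string[] = [];
const queue = toActions(
  [
    { timestamp: 1500, label: 'DOM change' },
    { timestamp: 2000, label: 'scroll' },
  ],
  1300,
  (item) => {
    applied.push(item.label);
  },
);

console.log(queue.map((action) => action.delay)); // 200, 700
queue[0].doAction();
console.log(applied); // 'DOM change'
```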

requestAnimationFrame timed playback

The custom Timer is high-precision mainly because its start method uses requestAnimationFrame to drive the timed playback of the queue asynchronously. requestAnimationFrame callbacks run once per frame, in step with the browser's rendering, whereas the callbacks of the native setTimeout and setInterval may be delayed by other tasks on the main thread.

In addition, performance.now() is used to calculate the current playback time. performance.now() returns a floating-point timestamp with a precision of up to microseconds, which is higher than other available time functions; Date.now(), for example, only has millisecond resolution.

  public start() {
    this.timeOffset = 0;
    // performance.timing.navigationStart + performance.now() is approximately Date.now()
    let lastTimestamp = performance.now();
    // Action queue
    const { actions } = this;
    const self = this;
    function check() {
      const time = performance.now();
      // self.timeOffset is the current playback duration, accumulated as elapsed time * playback speed
      // it is accumulated because the speed may change several times during playback
      self.timeOffset += (time - lastTimestamp) * self.speed;
      lastTimestamp = time;
      // traverse the Action queue
      while (actions.length) {
        const action = actions[0];
        // delay is relative to the baseline timestamp; {timeOffset} ms have been played so far,
        // so every action with "delay <= current playback duration" must be played
        if (self.timeOffset >= action.delay) {
          actions.shift();
          action.doAction();
        } else {
          break;
        }
      }
      if (actions.length > 0 || self.liveMode) {
        self.raf = requestAnimationFrame(check);
      }
    }
    this.raf = requestAnimationFrame(check);
  }

After the playback Action queue has been built, timer.start() is called to play the actions back in order at the correct intervals. In each requestAnimationFrame callback, the Action queue is traversed from the front. If an Action's delay relative to the baseline timestamp is less than or equal to the current playback duration, the Action is due in this callback, so action.doAction is called to replay that incremental snapshot. A replayed Action is removed from the queue so that it is not executed again in the next requestAnimationFrame callback.
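To see how the accumulated timeOffset interacts with speed changes, the loop body can be isolated from requestAnimationFrame and driven by hand. This is a sketch under assumptions: SteppedTimer and tick are hypothetical names, and the now argument stands in for the value that performance.now() would return in each frame.

```typescript
interface QueuedAction {
  delay: number;
  doAction: () => void;
}

class SteppedTimer {
  timeOffset = 0;
  speed = 1;
  private actions: QueuedAction[];
  private lastTimestamp: number;

  constructor(actions: QueuedAction[], startTime: number) {
    this.actions = actions;
    this.lastTimestamp = startTime;
  }

  // One "frame": advance the playback clock, then run every action that is due.
  tick(now: number): void {
    this.timeOffset += (now - this.lastTimestamp) * this.speed;
    this.lastTimestamp = now;
    while (this.actions.length > 0) {
      const action = this.actions[0];
      if (this.timeOffset < action.delay) break;
      this.actions.shift();
      action.doAction();
    }
  }

  pendingCount(): number {
    return this.actions.length;
  }
}

const fired: string[] = [];
const steppedTimer = new SteppedTimer(
  [
    { delay: 100, doAction: () => { fired.push('a'); } },
    { delay: 250, doAction: () => { fired.push('b'); } },
    { delay: 400, doAction: () => { fired.push('c'); } },
  ],
  0,
);

steppedTimer.tick(100); // timeOffset = 100, so 'a' runs
steppedTimer.speed = 2; // switch to double speed
steppedTimer.tick(180); // timeOffset = 100 + 80 * 2 = 260, so 'b' runs

console.log(fired); // 'a', 'b'
console.log(steppedTimer.timeOffset); // 260
console.log(steppedTimer.pendingCount()); // 1
```

Because timeOffset is accumulated frame by frame, a speed change only affects the time that elapses after the change, which is why the 80 ms at double speed count as 160 ms of playback.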

Summary

Having worked through the key questions of how to implement event monitoring, how to serialize the DOM, and how to implement a custom timer, we now have a basic grasp of rrweb's workflow. In addition, rrweb uses the iframe's sandbox mode during playback to restrict certain JS behaviors. Interested readers can explore this further.

In short, rrweb makes it easy to implement screen recording and playback, as in the one-click reporting feature now used in our CMS business. Combining API requests, error stack information, and screen recording playback helps developers locate and solve problems, making you a Sherlock too.

This article is published by the NetEase Cloud Music front-end team. Any unauthorized reprinting of the article is prohibited. We recruit front-end, iOS, and Android engineers all year round. If you are ready to change jobs and happen to like Cloud Music, join us at grp.music-fe (at) corp.netease.com!
