React 源码深度解读(十):Diff 算法详解

10

欢迎关注我的公众号睿Talk,获取我最新的文章:
clipboard.png

一、前言

React 是一个十分庞大的库,由于要同时考虑 ReactDom 和 ReactNative ,还有服务器渲染等,导致其代码抽象化程度很高,嵌套层级非常深。阅读 React 源码是一个非常艰辛的过程,在学习过程中给我帮助最大的就是这个系列文章。作者对代码的调用关系梳理得非常清楚,而且还有配图帮助理解,非常值得一读。站在巨人的肩膀之上,我尝试再加入自己的理解,希望对有志于学习 React 源码的读者带来一点启发。

本系列文章基于 React 15.4.2 ,以下是本系列其它文章的传送门:
React 源码深度解读(一):首次 DOM 元素渲染 - Part 1
React 源码深度解读(二):首次 DOM 元素渲染 - Part 2
React 源码深度解读(三):首次 DOM 元素渲染 - Part 3
React 源码深度解读(四):首次自定义组件渲染 - Part 1
React 源码深度解读(五):首次自定义组件渲染 - Part 2
React 源码深度解读(六):依赖注入
React 源码深度解读(七):事务 - Part 1
React 源码深度解读(八):事务 - Part 2
React 源码深度解读(九):单个元素更新
React 源码深度解读(十):Diff 算法详解

二、Diff 策略

React 在比较新旧 2 棵虚拟 DOM 树的时候,会同时考虑两点:

  • 尽量少的创建 / 删除节点,多使用移动节点的方式
  • 比较次数要尽量少,算法要足够的快

如果只考虑第一点,算法的时间复杂度要达到 O(n3) 的级别。也就是说对于一个有 1000 个节点的树,要进行 10 亿次的比较,这对于前端应用来说是完全不可接受的。因此,React 选用了启发式的算法,将时间复杂度控制在 O(n) 的级别。这个算法基于以下 2 个假设:

  • 如果 2 个节点的类型不一样,以这 2 个节点为根结点的树会完全不同
  • 对于多次 render 中结构保持不变的节点,开发者会用一个 key 属性标识出来,以便重用

另外,React 只会对同一层的节点作比较,不会跨层级比较,如图所示:

clipboard.png

Diff 使用的是深度优先算法,当遇到下图这样的情况:

clipboard.png

最高效的算法应该是直接将 A 子树移动到 D 节点,但这样就涉及到跨层级比较,时间复杂度会陡然上升。React 的做法比较简单,它会先删除整个 A 子树,然后再重新创建一遍。结合到实际的使用场景,改变一个组件的从属关系的情况也是很少的。

clipboard.png

同样道理,当 D 节点改为 G 节点时,整棵 D 子树也会被删掉,E、F 节点会重新创建。

对于列表的 Diff,节点的 key 有助于节点的重用:

clipboard.png

如上图所示,当没有 key 的时候,如果中间插入一个新节点,Diff 过程中从第三个节点开始的节点都是删除旧节点,创建新节点。当有 key 的时候,除了第三个节点是新创建外,第四和第五个节点都是通过移动实现的。

三、无 Key Diff

在了解了总体的 Diff 策略后,我们来近距离的审视源码。先更新示例代码,注意 li 元素没有使用 key :

class App extends Component {
  constructor(props) {
    super(props);
    this.state = {
      data : ['one', 'two'],
    };

    this.timer = setInterval(
      () => this.tick(),
      5000
    );
  }

  tick() {
    this.setState({
      data: ['new', 'one', 'two'],
    });
  }

  render() {
    return (
      <ul>
      {
        this.state.data.map(function(val, i) {
          return <li>{ val }</li>;
        })
      }
      </ul>
    );
  }
}

export default App;

元素变化如下图所示,5 秒后会新增一个 new 元素。

clipboard.png

我们跳过前面的逻辑,直接来看 Diff 的代码

_updateRenderedComponent: function (transaction, context) {
    var prevComponentInstance = this._renderedComponent;
    // 之前的 Virtual DOM
    var prevRenderedElement = prevComponentInstance._currentElement;
    // 最新的 Virtual DOM
    var nextRenderedElement = this._renderValidatedComponent();

    var debugID = 0;
    if (__DEV__) {
        debugID = this._debugID;
    }

    if (shouldUpdateReactComponent(prevRenderedElement,
            nextRenderedElement)) {
        ReactReconciler.receiveComponent(
            prevComponentInstance,
            nextRenderedElement,
            transaction,
            this._processChildContext(context)
        );
    } else {
        ...
    }
},

// shouldUpdateReactComponent.js
function shouldUpdateReactComponent(prevElement, nextElement) {
    var prevEmpty = prevElement === null || prevElement === false;
    var nextEmpty = nextElement === null || nextElement === false;
    if (prevEmpty || nextEmpty) {
        return prevEmpty === nextEmpty;
    }

    var prevType = typeof prevElement;
    var nextType = typeof nextElement;
    if (prevType === 'string' || prevType === 'number') {
        return (nextType === 'string' || nextType === 'number');
    } else {
        return (
            nextType === 'object' &&
            prevElement.type === nextElement.type &&
            prevElement.key === nextElement.key
        );
    }
}

_updateRenderedComponent方法位于 ReactCompositeComponent 内。它先获取新、旧 2 个 Virtual DOM,然后通过shouldUpdateReactComponent判断节点类型是否相同。在我们的例子里,跟节点都是 ul 元素,因此跳过中间的层级后,会走到 ReactDOMComponent 的 updateComponent 方法。他会更新属性和子元素,更新属性部分上一篇文章已经讲清楚了,下面看看更新子元素部分。最终会调用 ReactMultiChild 的 _updateChildren :

_updateChildren: function (nextNestedChildrenElements, transaction, context) {
    ...
    
    var nextChildren = this._reconcilerUpdateChildren(
                prevChildren,
                nextNestedChildrenElements,
                mountImages,
                removedNodes,
                transaction,
                context
            );
            
    ...
    
    for (name in nextChildren) {
        if (!nextChildren.hasOwnProperty(name)) {
            continue;
        }
        var prevChild = prevChildren && prevChildren[name];
        var nextChild = nextChildren[name];
        if (prevChild === nextChild) {
            updates = enqueue(
                updates,
                this.moveChild(prevChild, lastPlacedNode,
                    nextIndex, lastIndex)
            );
            lastIndex = Math.max(prevChild._mountIndex,
                lastIndex);
            prevChild._mountIndex = nextIndex;
        } else {
            if (prevChild) {
                // Update `lastIndex` before `_mountIndex` gets unset by unmounting.
                lastIndex = Math.max(prevChild._mountIndex,
                    lastIndex);
                // The `removedNodes` loop below will actually remove the child.
            }
            // The child must be instantiated before it's mounted.
            updates = enqueue(
                updates,
                this._mountChildAtIndex(
                    nextChild,
                    mountImages[nextMountIndex],
                    lastPlacedNode,
                    nextIndex,
                    transaction,
                    context
                )
            );
            nextMountIndex++;
        }
        nextIndex++;
        lastPlacedNode = ReactReconciler.getHostNode(nextChild);
    }
    // Remove children that are no longer present.
    for (name in removedNodes) {
        if (removedNodes.hasOwnProperty(name)) {
            updates = enqueue(
                updates,
                this._unmountChild(prevChildren[name],
                    removedNodes[name])
            );
        }
    }
    if (updates) {
        processQueue(this, updates);
    }
    this._renderedChildren = nextChildren;
},

_reconcilerUpdateChildren: function (
    prevChildren,
    nextNestedChildrenElements,
    mountImages,
    removedNodes,
    transaction,
    context
) {
    var nextChildren;
    var selfDebugID = 0;
    
    nextChildren = flattenChildren(nextNestedChildrenElements,
        selfDebugID);
        
    ReactChildReconciler.updateChildren(
        prevChildren,
        nextChildren,
        mountImages,
        removedNodes,
        transaction,
        this,
        this._hostContainerInfo,
        context,
        selfDebugID
    );
    return nextChildren;
},

在开始 Diff li 元素之前,它会先调用_reconcilerUpdateChildren去更新 li 元素的子元素,也就是文本。在函数中调用了flattenChildren方法,将数组转换成对象:

function flattenSingleChildIntoContext(
    traverseContext: mixed,
    child: ReactElement < any > ,
    name: string,
    selfDebugID ? : number,
): void {
    // We found a component instance.
    if (traverseContext && typeof traverseContext === 'object') {
        const result = traverseContext;
        const keyUnique = (result[name] === undefined);
        
        if (keyUnique && child != null) {
            result[name] = child;
        }
    }
}

function flattenChildren(
    children: ReactElement < any > ,
    selfDebugID ? : number,
): ? {
    [name: string]: ReactElement < any >
} {
    if (children == null) {
        return children;
    }
    var result = {};

    traverseAllChildren(children, flattenSingleChildIntoContext, result);   
    
    return result;
}

// traverseAllChildren.js
function traverseAllChildren(children, callback, traverseContext) {
    if (children == null) {
        return 0;
    }

    return traverseAllChildrenImpl(children, '', callback, traverseContext);
}

function traverseAllChildrenImpl(
    children,
    nameSoFar,
    callback,
    traverseContext
) {
    var type = typeof children;

    if (type === 'undefined' || type === 'boolean') {
        // All of the above are perceived as null.
        children = null;
    }

    if (children === null ||
        type === 'string' ||
        type === 'number' ||
        // The following is inlined from ReactElement. This means we can optimize
        // some checks. React Fiber also inlines this logic for similar purposes.
        (type === 'object' && children.$$typeof === REACT_ELEMENT_TYPE)) {
        callback(
            traverseContext,
            children,
            // If it's the only child, treat the name as if it was wrapped in an array
            // so that it's consistent if the number of children grows.
            nameSoFar === '' ? SEPARATOR + getComponentKey(children, 0) :
            nameSoFar
        );
        return 1;
    }

    var child;
    var nextName;
    var subtreeCount = 0; // Count of children found in the current subtree.
    var nextNamePrefix = nameSoFar === '' ? SEPARATOR : nameSoFar +
        SUBSEPARATOR;

    if (Array.isArray(children)) {
        for (var i = 0; i < children.length; i++) {
            child = children[i];
            nextName = nextNamePrefix + getComponentKey(child, i);
            subtreeCount += traverseAllChildrenImpl(
                child,
                nextName,
                callback,
                traverseContext
            );
        }
    }
    
    ...

    return subtreeCount;
}

function getComponentKey(component, index) {
    // Do some typechecking here since we call this blindly. We want to ensure
    // that we don't block potential future ES APIs.
    if (component && typeof component === 'object' && component.key != null) {
        // Explicit key
        return KeyEscapeUtils.escape(component.key);
    }
    // Implicit key determined by the index in the set
    return index.toString(36);
}

flattenSingleChildIntoContext作为flattenChildren的回调函数,会作用在每一个数组元素上,构造一个对象(result)。对象的 key 是通过getComponentKey这个方法生成的,可以看到如果没有定义 key 属性,则默认会用数组的 index 作为 key 。最终构造出来的对象是这个样子的:

clipboard.png

然后就会调用ReactChildReconciler.updateChildren方法,去更新 li 的文本和创建新的 li 元素。

updateChildren: function (
    prevChildren,
    nextChildren,
    mountImages,
    removedNodes,
    transaction,
    hostParent,
    hostContainerInfo,
    context,
    selfDebugID // 0 in production and for roots
) {
    if (!nextChildren && !prevChildren) {
        return;
    }
    var name;
    var prevChild;
    for (name in nextChildren) {
        if (!nextChildren.hasOwnProperty(name)) {
            continue;
        }
        prevChild = prevChildren && prevChildren[name];
        var prevElement = prevChild && prevChild._currentElement;
        var nextElement = nextChildren[name];
        
        if (prevChild != null &&
            shouldUpdateReactComponent(prevElement, nextElement)) {
            ReactReconciler.receiveComponent(
                prevChild, nextElement, transaction, context
            );
            nextChildren[name] = prevChild;
        } else {
            if (prevChild) {
                removedNodes[name] = ReactReconciler.getHostNode(
                    prevChild);
                ReactReconciler.unmountComponent(prevChild, false);
            }
            // The child must be instantiated before it's mounted.
            var nextChildInstance = instantiateReactComponent(
                nextElement, true);
            nextChildren[name] = nextChildInstance;
            // Creating mount image now ensures refs are resolved in right order
            // (see https://github.com/facebook/react/pull/7101 for explanation).
            var nextChildMountImage = ReactReconciler.mountComponent(
                nextChildInstance,
                transaction,
                hostParent,
                hostContainerInfo,
                context,
                selfDebugID
            );
            mountImages.push(nextChildMountImage);
        }
    }
    // Unmount children that are no longer present.
    for (name in prevChildren) {
        if (prevChildren.hasOwnProperty(name) &&
            !(nextChildren && nextChildren.hasOwnProperty(name))) {
            prevChild = prevChildren[name];
            removedNodes[name] = ReactReconciler.getHostNode(
                prevChild);
            ReactReconciler.unmountComponent(prevChild, false);
        }
    }
},

在更新 li 前,会根据 key 去取上一次 render 对应的元素prevChild = prevChildren && prevChildren[name]。以第 0 个元素为例,prevElement 为 ReactElement[3]( key 为‘.0’),而 nextElement 为 ReactElement[2],因此会调用ReactReconciler.receiveComponent来更新文本元素,过程与上一篇文章一样。

clipboard.png

当遍历到最后一个 li ,由于没有 prevChild,会创建一个新的实例。

再回到 ReactMultiChild 的 _updateChildren 方法,这时nextChildren已经创建好,开始遍历:

_updateChildren: function (nextNestedChildrenElements, transaction, context) {
    ...
    
    var nextChildren = this._reconcilerUpdateChildren(
                prevChildren,
                nextNestedChildrenElements,
                mountImages,
                removedNodes,
                transaction,
                context
            );
            
    ...
    
    for (name in nextChildren) {
        if (!nextChildren.hasOwnProperty(name)) {
            continue;
        }
        var prevChild = prevChildren && prevChildren[name];
        var nextChild = nextChildren[name];
        if (prevChild === nextChild) {    ------------------------------------ 1)
            updates = enqueue(
                updates,
                this.moveChild(prevChild, lastPlacedNode,
                    nextIndex, lastIndex)
            );
            lastIndex = Math.max(prevChild._mountIndex,
                lastIndex);
            prevChild._mountIndex = nextIndex;
        } else {                          ------------------------------------ 2)
            if (prevChild) {
                // Update `lastIndex` before `_mountIndex` gets unset by unmounting.
                lastIndex = Math.max(prevChild._mountIndex,
                    lastIndex);
                // The `removedNodes` loop below will actually remove the child.
            }
            // The child must be instantiated before it's mounted.
            updates = enqueue(
                updates,
                this._mountChildAtIndex(
                    nextChild,
                    mountImages[nextMountIndex],
                    lastPlacedNode,
                    nextIndex,
                    transaction,
                    context
                )
            );
            nextMountIndex++;
        }
        nextIndex++;
        lastPlacedNode = ReactReconciler.getHostNode(nextChild);
    }
    // Remove children that are no longer present.
    for (name in removedNodes) {
        if (removedNodes.hasOwnProperty(name)) {
            updates = enqueue(
                updates,
                this._unmountChild(prevChildren[name],
                    removedNodes[name])
            );
        }
    }
    if (updates) {
        processQueue(this, updates);
    }
    this._renderedChildren = nextChildren;
},

前面 2 个 li 元素会走到分支 1),第三个元素会到分支 2),创建新的 li 元素,过程与上一篇的类似。

四、有 Key Diff

例子改一下,加入 key:

class App extends Component {
    constructor(props) {
        super(props);
        this.state = {
            data: ['one', 'two']
        };

        this.timer = setTimeout(() => this.tick(), 5000);
    }

    tick() {
        this.setState({
            data: ['new', 'one', 'two']
        });
    }

    render() {
        return (
            <ul>
                {
                    this.state.data.map(function (val, i) {
                         return <li key={val}>{ val }</li>;
                    })
                }
            </ul>
        );
    }
}

流程与无 key 类似,不再赘述。flattenChildren后的对象是这个样子的:

clipboard.png

由于使用了 key ,ReactChildReconciler.updateChildren不再需要更新 text 了,只需要创建一个新的实例。

clipboard.png

然后到 ReactMultiChild 的 _updateChildren :

_updateChildren: function (nextNestedChildrenElements, transaction, context) {
    ...
    
    for (name in nextChildren) {
        if (!nextChildren.hasOwnProperty(name)) {
            continue;
        }
        var prevChild = prevChildren && prevChildren[name];
        var nextChild = nextChildren[name];
        if (prevChild === nextChild) {    ------------------------------------ 1)
            updates = enqueue(
                updates,
                this.moveChild(prevChild, lastPlacedNode,
                    nextIndex, lastIndex)
            );
            lastIndex = Math.max(prevChild._mountIndex,
                lastIndex);
            prevChild._mountIndex = nextIndex;
        } else {                          ------------------------------------ 2)
            if (prevChild) {
                // Update `lastIndex` before `_mountIndex` gets unset by unmounting.
                lastIndex = Math.max(prevChild._mountIndex,
                    lastIndex);
                // The `removedNodes` loop below will actually remove the child.
            }
            // The child must be instantiated before it's mounted.
            updates = enqueue(
                updates,
                this._mountChildAtIndex(
                    nextChild,
                    mountImages[nextMountIndex],
                    lastPlacedNode,
                    nextIndex,
                    transaction,
                    context
                )
            );
            nextMountIndex++;
        }
        nextIndex++;
        lastPlacedNode = ReactReconciler.getHostNode(nextChild);
    }
    // Remove children that are no longer present.
    for (name in removedNodes) {
        if (removedNodes.hasOwnProperty(name)) {
            updates = enqueue(
                updates,
                this._unmountChild(prevChildren[name],
                    removedNodes[name])
            );
        }
    }
    if (updates) {
        processQueue(this, updates);
    }
    this._renderedChildren = nextChildren;
},

匹配第一个元素的时候,会到分支 2),后面 2 个元素都是走分支 1)。

有 key 跟没 key 相比,减少了 2 次文本元素的更新,提高了效率。

五、深挖 Key Diff

再来考虑更复杂的情况:

clipboard.png

假设要做上图的变化,再来分析下源码:

_updateChildren: function (nextNestedChildrenElements, transaction, context) {
    ...
    
    var nextIndex = 0;
    var lastIndex = 0;
    var nextMountIndex = 0;
    var lastPlacedNode = null;
            
    for (name in nextChildren) {
        if (!nextChildren.hasOwnProperty(name)) {
            continue;
        }
        var prevChild = prevChildren && prevChildren[name];
        var nextChild = nextChildren[name];
        if (prevChild === nextChild) {    ------------------------------------ 1)
            updates = enqueue(
                updates,
                this.moveChild(prevChild, lastPlacedNode, nextIndex, lastIndex)
            );
            lastIndex = Math.max(prevChild._mountIndex, lastIndex);
            prevChild._mountIndex = nextIndex;
        } else {                          ------------------------------------ 2)
            if (prevChild) {
                // Update `lastIndex` before `_mountIndex` gets unset by unmounting.
                lastIndex = Math.max(prevChild._mountIndex, lastIndex);
                // The `removedNodes` loop below will actually remove the child.
            }
            // The child must be instantiated before it's mounted.
            updates = enqueue(
                updates,
                this._mountChildAtIndex(
                    nextChild,
                    mountImages[nextMountIndex],
                    lastPlacedNode,
                    nextIndex,
                    transaction,
                    context
                )
            );
            nextMountIndex++;
        }
        nextIndex++;
        lastPlacedNode = ReactReconciler.getHostNode(nextChild);
    }
    ...
},

moveChild: function (child, afterNode, toIndex, lastIndex) {
    // If the index of `child` is less than `lastIndex`, then it needs to
    // be moved. Otherwise, we do not need to move it because a child will be
    // inserted or moved before `child`.
    if (child._mountIndex < lastIndex) {
        return makeMove(child, afterNode, toIndex);
    }
},

这里要先搞清楚 3 个 index:

  • nextIndex:遍历 nextChildren 时候的 index,每遍历一个元素加 1
  • lastIndex:上一次从 prevChildren 中取出来元素时,这个元素在 prevChildren 中的 index
  • _mountIndex:元素在数组中的位置

下面我们来走一遍流程:

  • nextChildren 的第一个元素是 B ,在 prevChildren 中的位置是 1(_mountIndex),nextIndex 和 lastIndex 都是 0。节点移动的前提是_mountIndex < lastIndex,因此 B 不需要移动。lastIndex 更新为 _mountIndex 和 lastIndex 中较大的:1 。nextIndex 自增:1;
  • nextChildren 的第二个元素是 A ,在 prevChildren 中的位置是 0(_mountIndex),nextIndex 和 lastIndex 都是 1。这时_mountIndex < lastIndex,因此将 A 移动到 lastPlacedNode (B)的后面 。lastIndex 更新为 _mountIndex 和 lastIndex 中较大的:1 。nextIndex 自增:2;
  • nextChildren 的第三个元素是 D ,中 prevChildren 中的位置是 3(_mountIndex),nextIndex 是 2 ,lastIndex 是 1。这时不满足_mountIndex < lastIndex,因此 D 不需要移动。lastIndex 更新为 _mountIndex 和 lastIndex 中较大的:3 。nextIndex 自增:3;
  • nextChildren 的第四个元素是 C ,中 prevChildren 中的位置是 2(_mountIndex),nextIndex 是 3 ,lastIndex 是 3。这时_mountIndex < lastIndex,因此将 C 移动到 lastPlacedNode (D)的后面。循环结束。

观察整个过程,移动的原则是将原来的元素往右边移,通过 lastIndex 来控制。在 lastIndex 左边的,就往 lastIndex 的右边移动;在 lastIndex 左边的,不需要动。

六、总结

本文详细讲解了 Diff 过程中 key 的作用以及复用节点的移动原则,并结合源码做了深入的讨论。到此为止,整个 React 源码解读系列先告一段落了,后会有期。

你可能感兴趣的

载入中...