"The times make the hero" is an enduring truth, and in the current era online documents can fairly be called one such "hero".

Since the outbreak of COVID-19 in 2020, remote work has upended the traditional corporate management model, and online documents, as an important part of remote-work software, have grown rapidly as a result.

Today, even though online office products such as Tencent Docs, Graphite Docs, Feishu, Yuque and Lingxi Documents are already on the market, online documents still face challenges in functionality, technology, data security, service, and ecosystem: data processing efficiency, multi-person collaboration, secondary development, system integration, framework compatibility, and so on.

From a technical point of view, online delivery, data processing, and multi-person collaboration are the three most critical capabilities of an online document system. The first two already have relatively mature technical solutions and are not difficult to implement, so multi-person collaboration is the core factor that determines how usable an online document system is.

What is multi-person collaboration?

Multi-person collaboration means that several people edit the same document at the same time and each user sees the others' changes without refreshing. Google Docs, Tencent Docs, Graphite Docs, Quip, and similar products all support it.

So, how is multi-person collaboration achieved?

For any information to be edited and displayed by multiple people in real time, the following three steps need to be implemented:

  • Operationalize
  • Transmit
  • Restore

These three steps resemble encoding and decoding: the information is first converted into a set of operations, the operations are then transmitted to other terminals over the network, and finally each receiving terminal restores the operations to information.

These steps look simple, but each requires careful thought. In operationalization, for example, the information has to be split and recombined, which raises several questions: how to ensure that every change can be decomposed into a set of operations, how to make the operations cover all possible changes, and how fine the granularity of the split should be.
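The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the function names `to_operations` and `restore` are hypothetical, the operations are derived with the standard library's `difflib`, and JSON stands in for the network transmission.

```python
import json
from difflib import SequenceMatcher

def to_operations(old: str, new: str) -> list:
    """Operationalize: describe the change old -> new as a list of operations."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(["retain", i2 - i1])
        if tag in ("delete", "replace"):
            ops.append(["delete", old[i1:i2]])
        if tag in ("insert", "replace"):
            ops.append(["insert", new[j1:j2]])
    return ops

def restore(old: str, ops: list) -> str:
    """Restore: replay the operations against the receiver's copy of the text."""
    out, pos = [], 0
    for kind, arg in ops:
        if kind == "retain":
            out.append(old[pos:pos + arg])
            pos += arg
        elif kind == "delete":
            pos += len(arg)          # skip the deleted characters
        else:                        # "insert"
            out.append(arg)
    out.append(old[pos:])            # keep any trailing text
    return "".join(out)

old, new = "hello world", "hello brave new world"
wire = json.dumps(to_operations(old, new))   # transmit: serialize for the network
received = restore(old, json.loads(wire))    # the receiver rebuilds the text
```

The granularity question shows up directly here: `difflib` works character by character, while a real editor might choose words, paragraphs, or cells as its unit.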

The following points need to be considered for transmission:

  1. Transfer content
    a. Original text
      i. Clear
      ii. Redundancy
    b. Compression technology
      i. Logical compression
      ii. Protocol compression
      iii. Manual compression
  2. Network protocol
    a. Socket
      i. TCP
      ii. UDP
    b. HTTP
    c. WebSocket
  3. QoS (Quality of Service)
    a. Fail fast
    b. Automatic rollback
    c. Automatic reconnection
    d. Automatic recovery
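As a rough illustration of the "transfer content" choices, the sketch below combines two of the compression ideas. It assumes that "logical compression" means merging operations before sending and that "protocol compression" means compressing the serialized bytes; both readings are assumptions, and the operation format and function names are hypothetical.

```python
import json
import zlib

def merge_adjacent_inserts(ops):
    """Logical compression: fold inserts that continue where the previous one ended."""
    merged = []
    for op in ops:
        last = merged[-1] if merged else None
        if (last and last["type"] == "insert" and op["type"] == "insert"
                and op["pos"] == last["pos"] + len(last["text"])):
            last["text"] += op["text"]
        else:
            merged.append(dict(op))  # copy so the caller's list is untouched
    return merged

def encode(ops):
    """Merge, serialize to JSON, then compress the bytes (protocol compression)."""
    return zlib.compress(json.dumps(merge_adjacent_inserts(ops)).encode("utf-8"))

def decode(payload):
    """Reverse of encode on the receiving side."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))

# A user types "hello" one key at a time starting at position 10.
typed = [{"type": "insert", "pos": 10 + i, "text": ch} for i, ch in enumerate("hello")]
decoded = decode(encode(typed))
```

Byte-level compression does not always pay off for very small payloads, so in practice the merge step may matter more for keystroke-sized operations.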

Restoration mainly involves:

  1. Restoring absolute operations
    a. Control the data volume
    b. Give reasonable reminders
  2. Restoring relative operations
    a. Strictly sequential
    b. Guarantee ordering at the source
    c. Remedy ordering after the fact
  3. Restoring local operations
    a. Filter the received operation set
    b. Refine the operation granularity at the source
    c. Save locally and execute locally
  4. Non-intrusive restoration
    a. Define what counts as an intrusion
    b. Eliminate intrusions
    c. A personalized view for each user
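For relative operations, ordering matters because each operation depends on the state left by the previous one. One possible remedy for out-of-order arrival is a small reorder buffer on the receiving side, sketched below. It assumes the sender numbers operations 1, 2, 3, ...; the class name `OrderedApplier` is hypothetical.

```python
class OrderedApplier:
    """Applies operations strictly in sequence, buffering any that arrive early."""

    def __init__(self):
        self.next_seq = 1      # sequence number we are waiting for
        self.pending = {}      # seq -> operation that arrived too early
        self.applied = []      # operations in the order they were applied

    def receive(self, seq, op):
        if seq < self.next_seq:
            return             # duplicate of something already applied
        self.pending[seq] = op
        # Drain the buffer for as long as the next expected operation is present.
        while self.next_seq in self.pending:
            self.applied.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

applier = OrderedApplier()
applier.receive(2, "insert 'b'")   # arrives early, held back
applier.receive(1, "insert 'a'")   # fills the gap, both are applied in order
```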

After understanding the basic principles of multi-person collaboration, let's study its technical difficulties.

What are the technical difficulties of multi-person collaboration?

Multi-person collaboration is essentially Multiple Leader Replication in a distributed system: every client can be regarded as a data leader. Synchronizing data among these leaders will inevitably run into out-of-order and conflict problems, and this is the main difficulty of multi-person collaboration.

For the conflict problem of Multiple Leader Replication, there are the following solutions:

  • Avoid conflicts, that is, do not allow multiple users to edit the same place at the same time. This solution is simple and crude, and you need to check whether it suits the product before adopting it.
  • Expose the conflict to the users and let them resolve it themselves. Most professional version control software currently works this way, but the approach does not suit products with many non-professional users, such as online documents.
  • Mark each write operation with a global index, which can be a timestamp or a sequence number. The index must be global and increasing. In any conflict, the write with the higher index wins. The advantage is that conflict resolution is fully automatic and needs no user involvement. The disadvantage is that with a long synchronization interval a lot of user input is lost.
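The third approach is often called last-write-wins. A minimal sketch, assuming a single shared cell and integer indexes handed out by some global counter (the class name and the sample values are hypothetical), makes the trade-off visible:

```python
class LastWriteWinsCell:
    """Keeps only the write that carries the highest global index."""

    def __init__(self, value=""):
        self.value = value
        self.index = 0

    def write(self, value, index):
        """Accept the write only if its index is newer; return whether it was kept."""
        if index > self.index:
            self.value, self.index = value, index
            return True
        return False

cell = LastWriteWinsCell()
# Bob's short edit was stamped later, but happens to reach the server first.
bob_kept = cell.write("typo fix", index=6)
# Alice's long paragraph, written offline and stamped earlier, is silently dropped.
alice_kept = cell.write("a long paragraph written offline", index=5)
```

No user is ever asked to resolve anything, but everything Alice typed during the synchronization gap is gone.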

In actual online document systems, the Operational Transformation (OT) algorithm is a more commonly used way to resolve multi-person collaboration conflicts. The technique dates from 1989. Its principle is to express every change to the text as a combination of the following three types of operations, with the goal of giving users eventual consistency:

  • retain(n): keep the next n characters
  • insert(str): insert the string str
  • delete(str): delete the string str

Once edits are expressed as these operations, the OT algorithm merges and transforms concurrent operations into a new operation stream and applies it to the historical version, which achieves lock-free simultaneous editing.
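To make the transformation step concrete, here is a deliberately reduced sketch that handles only two concurrent inserts. It uses a position-based `Insert` instead of the retain/insert/delete sequence described above, and a hypothetical `site` id to break ties; a real OT implementation also has to cover deletes and composite operations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Insert:
    pos: int    # character offset in the document
    text: str
    site: int   # hypothetical client id, used only to break ties

def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]

def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has already been applied."""
    goes_first = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site)
    if goes_first:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op

doc = "abc"
a = Insert(1, "X", site=1)   # client 1 inserts X after "a"
b = Insert(2, "Y", site=2)   # client 2 concurrently inserts Y after "b"

# Each client applies its own edit first, then the transformed remote edit.
at_client_1 = apply_op(apply_op(doc, a), transform(b, a))
at_client_2 = apply_op(apply_op(doc, b), transform(a, b))
```

Both clients end up with the same text even though they applied the edits in different orders, which is the convergence property OT aims for.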


Operation transformation process in OT algorithm technology (Source: https://en.wikipedia.org/wiki/Operational_transformation )

The idea behind the OT algorithm is actually very simple: each operation is transformed against concurrent operations under specific conditions. OT is used mainly for text, and its implementations are usually complex and hard to scale. For more advanced structures such as rich text, OT accepts that complexity in exchange for results that match users' expectations, without much negative impact on system performance. This is why most real-time collaborative editing logic is built on the OT algorithm.

For this reason, the OT algorithm has become one of the most important solutions for handling collaboration conflicts. However, although it has existed for more than 30 years and the theory around its control algorithms has flourished, it still does not handle distributed implementations well, and developing a system that supports real-time collaborative editing by multiple people is far more complicated than one might imagine.

Where is the breakthrough to achieve multi-person collaboration?

Clearly, algorithmic logic alone is not enough to build a complex multi-person real-time collaborative editing system. It also takes substantial R&D cost and time for each business scenario (project kanban, plain text editing, undo/redo, and so on), and continual exploration to find a balance between product performance and ease of use.

So, is there a simpler and faster solution?

By analyzing the sample code of many online collaborative office products on the market, we found that, in addition to the OT algorithm mentioned above, these products basically all use a third-party spreadsheet (form) component. By embedding such a component, an online document system can support the eventual consistency of multi-person collaboration well, give users an easier-to-use and more varied experience, reduce R&D cost, and handle more computation-intensive work, all of which greatly improves the efficiency of multi-person collaboration.

What functions does a spreadsheet component for multi-person collaboration need to have?

First, table functionality.

Because tables are far more sensitive to numerical values than other data types, a table used as a multi-person collaborative document allows finer operation granularity and more complex computation. The chosen component must therefore offer strong table support: it should be capable at data entry and data reporting, and it should also provide statistics, summary calculations, pivot analysis, and charting.

Second, an open API that supports customization.

This type of component needs to provide a rich set of events and application programming interfaces for controlling cell state, sheet protection, data transmission, and other logic. For multi-person collaboration it also needs functions such as preventing users from editing the same content at the same time and attaching timestamps to operations (serialization).
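As an illustration of the last requirement, the sketch below shows one way an application could keep two users out of the same cell and record when each lock was taken. It is not the API of any real component: `CellLocks` and its methods are hypothetical, and the timestamp is passed in explicitly to keep the example deterministic.

```python
class CellLocks:
    """Tracks which user is editing which cell, with the time the lock was taken."""

    def __init__(self):
        self.owners = {}   # (row, col) -> (user, acquired_at)

    def acquire(self, cell, user, now):
        holder = self.owners.get(cell)
        if holder is not None and holder[0] != user:
            return False                   # someone else is editing this cell
        self.owners[cell] = (user, now)    # take or refresh the lock
        return True

    def release(self, cell, user):
        holder = self.owners.get(cell)
        if holder is not None and holder[0] == user:
            del self.owners[cell]

locks = CellLocks()
alice_ok = locks.acquire((3, 2), "alice", now=1000)   # Alice starts editing C4
bob_ok = locks.acquire((3, 2), "bob", now=1001)       # Bob is refused
locks.release((3, 2), "alice")
bob_retry = locks.acquire((3, 2), "bob", now=1002)    # now Bob can edit
```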

Out of curiosity, I downloaded and tried a variety of online spreadsheet components and found that only a handful meet the above requirements; of these, SpreadJS stood out the most. The component focuses on an "online Excel" that can be embedded in other systems. Its pure front-end architecture can be embedded easily during system development, without having to consider compatibility with the native system. It is worth mentioning that SpreadJS uses a sparse array as its storage model. Compared with traditional linked storage or array storage, a sparse array stores only non-empty data and does not need to allocate extra memory for empty cells.

Besides saving memory, for loosely laid out data such as tables, a sparse array also makes it easier to build a data dictionary keyed by row index, so that nodes at any level of the storage structure can be replaced or restored at any time. With this feature, SpreadJS implements efficient data rollback and recovery (undo/redo) in multi-person collaboration.
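The general idea can be sketched with nested dictionaries keyed by row and column index. This is an illustration of sparse storage with undo/redo in general, not SpreadJS's actual implementation; the class `SparseSheet` and its methods are hypothetical.

```python
class SparseSheet:
    """Stores only non-empty cells: row index -> {column index -> value}."""

    def __init__(self):
        self.rows = {}
        self.undo_stack = []   # entries are (row, col, old_value, new_value)
        self.redo_stack = []

    def get(self, r, c):
        return self.rows.get(r, {}).get(c)   # None means the cell is empty

    def _write(self, r, c, value):
        if value is None:                    # clearing a cell frees its slot
            row = self.rows.get(r)
            if row is not None:
                row.pop(c, None)
                if not row:
                    del self.rows[r]
        else:
            self.rows.setdefault(r, {})[c] = value

    def set(self, r, c, value):
        self.undo_stack.append((r, c, self.get(r, c), value))
        self.redo_stack.clear()              # a new edit invalidates redo history
        self._write(r, c, value)

    def undo(self):
        if self.undo_stack:
            r, c, old, new = self.undo_stack.pop()
            self._write(r, c, old)
            self.redo_stack.append((r, c, old, new))

    def redo(self):
        if self.redo_stack:
            r, c, old, new = self.redo_stack.pop()
            self._write(r, c, new)
            self.undo_stack.append((r, c, old, new))

    def stored_cells(self):
        return sum(len(row) for row in self.rows.values())

sheet = SparseSheet()
sheet.set(0, 0, "Name")
sheet.set(9999, 50, 42)       # a far-away cell costs one entry, not 10,000 rows
sheet.set(0, 0, "Title")
sheet.undo()                  # cell (0, 0) goes back to "Name"
```

Because each undo entry records only the touched cell and its previous value, rolling back an edit does not require copying the rest of the sheet.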


SpreadJS's sparse matrix storage model (Sparse Array)

Concluding remarks

The pandemic has accelerated the digital transformation of enterprises. In the future, enterprise collaborative office products will develop toward better ease of use, stronger integration and secondary development capabilities, a high degree of compatibility with existing systems and business, and a closer fit with the habits of end users.

How to break down technical barriers and develop online document products that can not only meet the needs of users in different scenarios, but also have market competitiveness and differentiation is the primary consideration for SaaS companies and system suppliers.

"A fair wind lends me its strength and sends me up to the blue clouds." In today's fiercely competitive online document field, besides investing heavily in in-house research and development, learning to borrow strength from existing components to serve different business scenarios and customer needs may also be a good choice.

If you want to learn more about front-end spreadsheet technology, you are welcome to attend the SegmentFault D-Day big front-end technology salon to be held this Saturday, where Yao Yao, senior product technical consultant and technical evangelist at Putaocheng (GrapeCity), will share the ins and outs of front-end spreadsheet technology with you face to face.



小魔