Text | Song Guolei (GitHub ID: glmapper)

SOFAStack Committer, Senior R&D Engineer at Huami Technology

Responsible for the development of the Huami account system and framework governance

This article is 3,024 words and takes about 10 minutes to read

|Foreword|

This article mainly analyzes the source code of the data synchronization module of SOFARegistry. The concept of a registry and the overall architecture of SOFARegistry are not described in detail here; interested readers can find an introduction in "Registry Center under Massive Data - Introduction to SOFARegistry Architecture" [1].

This article is roughly organized into the following two parts:

- The first part uses the role classification in SOFARegistry to explain which roles perform data synchronization;

- The second part analyzes the specific implementation of data synchronization.

PART. 1 - Role classification of SOFARegistry

[Figure: the four roles of SOFARegistry]

As shown in the figure above, SOFARegistry contains the following 4 roles:

Client

Provides the basic API for applications to access the service registry. By depending on the client JAR package, an application can programmatically use the registry's service subscription and service publishing capabilities.

SessionServer

The session server accepts the client's service publishing and service subscription requests and, as an intermediate layer, forwards write operations to the DataServer layer. The SessionServer layer can be scaled out as the number of business machines grows.

DataServer

The data server stores the actual service data. Data is sharded by consistent hashing on dataInfoId and supports multi-copy backup to ensure high data availability. This layer can be scaled out as the volume of service data grows.

MetaServer

The metadata server maintains consistent lists of the SessionServers and DataServers in the cluster. As the address discovery service within the SOFARegistry cluster, it notifies the entire cluster when SessionServer or DataServer nodes change.

Among these four roles, MetaServer, as the metadata server, does not handle actual business data; it is only responsible for maintaining consistent lists of the cluster's SessionServers and DataServers, and is therefore not involved in data synchronization.

The core interactions between Client and SessionServer are subscription and publishing. Broadly speaking, these belong to data synchronization between the user-side client and the SOFARegistry cluster, which is outside the scope of this article; for details, see https://github.com/sofastack/sofa-registry/issues/195.

As a session service, SessionServer mainly solves the problem of handling massive numbers of client connections; secondly, it caches all pub data published by clients. The session itself does not persist service data but forwards it to the DataServer. The DataServer stores the service data, sharded by consistent hashing on dataInfoId, with multi-copy backup to ensure high data availability.
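To make the sharding idea concrete, the sketch below maps a dataInfoId to a slot with a simple hash-and-modulo calculation. It is illustrative only: the actual hash function and slot count are defined inside SOFARegistry's slot table, and the value of 256 used here is an assumption for the example.

```java
// Illustrative only: map a dataInfoId to a slot by hashing and taking a modulo.
// SLOT_NUM = 256 is an assumed value for this sketch, not SOFARegistry's actual config.
public class SlotMappingSketch {

    private static final int SLOT_NUM = 256;

    public static int slotOf(String dataInfoId) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(dataInfoId.hashCode(), SLOT_NUM);
    }

    public static void main(String[] args) {
        System.out.println(slotOf("com.example.DemoService#@#1.0#@#DEFAULT"));
    }
}
```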

From the functional analysis of SessionServer and DataServer, it can be concluded that:

The service data cached by the SessionServer needs to be consistent with the service data stored by the DataServer.


DataServer supports multiple copies to ensure high availability, so the service data needs to be kept consistent across the multiple copies of the DataServer.


In SOFARegistry, the above two data consistency guarantees are implemented through the data synchronization mechanism.

PART. 2 - The specific implementation of data synchronization

After understanding the role classification of SOFARegistry, we start to dive into the specific implementation details of data synchronization. I will mainly focus on data synchronization between SessionServer and DataServer, and data synchronization between multiple copies of DataServer.

"Data synchronization between SessionServer and DataServer"

The data synchronization between SessionServer and DataServer is based on the following push-pull combination mechanism:

Push: When data changes, the DataServer actively notifies the SessionServer. The SessionServer checks whether an update is needed (by comparing versions) and then actively fetches the data from the DataServer.

Pull: In addition to the active push from the DataServer described above, the SessionServer periodically queries the DataServer for the version information of all dataInfoIds and compares it with the versions in its own memory. If a version has changed, it actively fetches the data from the DataServer. This "pull" logic mainly complements the "push": if anything is missed or goes wrong during a push, it can be corrected here in time.

The versions checked in push and pull modes differ slightly; for details, see the sections "Data synchronization process in push mode" and "Data synchronization process in pull mode" below.
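In both cases, the decision to fetch data boils down to a version comparison on the session side. The sketch below is a minimal illustration of that check, with hypothetical class and field names rather than the real SOFARegistry classes:

```java
// A minimal sketch of the session-side version check shared by push and pull:
// data is only fetched when the remote version is newer than the cached one.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class VersionCheckSketch {

    /** Locally cached data version per dataInfoId (hypothetical structure). */
    private final Map<String, Long> localVersions = new ConcurrentHashMap<>();

    /** True if the version reported by the DataServer is newer than the cached one. */
    public boolean needsFetch(String dataInfoId, long remoteVersion) {
        return remoteVersion > localVersions.getOrDefault(dataInfoId, 0L);
    }

    /** Record the version once the data has actually been fetched. */
    public void markFetched(String dataInfoId, long remoteVersion) {
        localVersions.merge(dataInfoId, remoteVersion, Long::max);
    }

    public static void main(String[] args) {
        VersionCheckSketch sketch = new VersionCheckSketch();
        String id = "com.example.DemoService#@#1.0#@#DEFAULT"; // illustrative dataInfoId
        System.out.println(sketch.needsFetch(id, 2L)); // true: nothing cached yet
        sketch.markFetched(id, 2L);
        System.out.println(sketch.needsFetch(id, 2L)); // false: already up to date
    }
}
```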

"Data synchronization process in push mode"

Push mode is driven by the daemon thread SyncingWatchDog, which loops continuously to check for data changes and send notifications:

[Figure: SyncingWatchDog]

Data versions are aggregated by slot. Each connection between the data node and a session corresponds to a SyncSessionTask, which is responsible for executing the synchronization. The core synchronization logic is completed in the com.alipay.sofa.registry.server.data.slot.SlotDiffSyncer#sync method; the general process is shown in the following sequence diagram:

[Figure: SlotDiffSyncer#sync sequence diagram]

The fourth step (highlighted in red in the sequence diagram above) updates the in-memory data according to the dataInfoId diff. Note that only removed dataInfoIds are processed here; new and updated dataInfoIds are not handled at this point but are completed in steps 5-7 that follow. The main reason for this is to avoid empty pushes, which could lead to dangerous situations.

In step 5, the pub versions of all changed dataInfoIds are compared. For the specific comparison logic, refer to the diffPublisher section below.
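To make the ordering concrete, here is a rough sketch of this two-phase flow under simplified, hypothetical interfaces (the real logic lives in SlotDiffSyncer#sync): the digest diff only applies removals and collects the changed dataInfoIds, and the publisher diff that follows fills in additions and updates.

```java
// A rough sketch of the two-phase slot sync described above; all names here are
// hypothetical simplifications, not the real SlotDiffSyncer API.
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SlotSyncFlowSketch {

    /** The remote peer being synced from (a session, or a slot leader). */
    interface RemotePeer {
        Map<String, Long> fetchDigests(int slotId);
        Map<String, Map<String, Long>> fetchPublishers(int slotId, Set<String> dataInfoIds);
    }

    /** The local slot storage being brought in line with the remote peer. */
    interface LocalSlotStore {
        Map<String, Long> digests(int slotId);
        void removeDataInfoIds(int slotId, Set<String> dataInfoIds);
        void putPublishers(int slotId, Map<String, Map<String, Long>> publishers);
    }

    public void sync(int slotId, RemotePeer remote, LocalSlotStore local) {
        Map<String, Long> remoteDigests = remote.fetchDigests(slotId);
        Map<String, Long> localDigests = local.digests(slotId);

        // Phase 1: digest diff. Only removals are applied immediately; added or
        // updated dataInfoIds are merely collected, so nothing empty is written.
        Set<String> removed = new HashSet<>(localDigests.keySet());
        removed.removeAll(remoteDigests.keySet());
        local.removeDataInfoIds(slotId, removed);

        Set<String> changed = new HashSet<>();
        remoteDigests.forEach((dataInfoId, digest) -> {
            if (!digest.equals(localDigests.get(dataInfoId))) {
                changed.add(dataInfoId);
            }
        });

        // Phase 2: publisher diff. Fetch the publishers of the changed dataInfoIds
        // from the remote peer and apply additions and updates locally.
        if (!changed.isEmpty()) {
            local.putPublishers(slotId, remote.fetchPublishers(slotId, changed));
        }
    }
}
```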

"Event notification processing of data change"

Data change events are collected in the dataCenter2Changes cache of DataChangeEventCenter; a daemon thread, ChangeMerger, continuously reads from this cache. The collected events are assembled into ChangeNotifier tasks and submitted to a separate thread pool (notifyExecutor) for processing; the whole process is asynchronous.

[Figure: data change event notification flow]
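As a minimal illustration of this producer-consumer pattern, the sketch below models the change buffer, the merger thread, and the notifier pool with plain JDK classes; the names and structure are simplified stand-ins for DataChangeEventCenter, ChangeMerger, and notifyExecutor:

```java
// A minimal producer-consumer sketch of the change-notification path; the buffer,
// merger thread, and notify pool are plain-JDK stand-ins for the real classes.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class ChangeNotificationSketch {

    // changed dataInfoIds waiting to be processed (stands in for dataCenter2Changes)
    private final BlockingQueue<String> changes = new LinkedBlockingQueue<>();
    // thread pool that notifies sessions (stands in for notifyExecutor)
    private final ExecutorService notifyExecutor = Executors.newFixedThreadPool(4);

    /** Producer side: invoked wherever pub data changes. */
    public void onDataChanged(String dataInfoId) {
        changes.offer(dataInfoId);
    }

    /** Consumer side: a daemon loop draining events and handing them to the pool. */
    public void startMerger() {
        Thread merger = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    String dataInfoId = changes.take();
                    // in the real code, events are merged/deduplicated before notifying
                    notifyExecutor.submit(() -> notifySessions(dataInfoId));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "ChangeMerger-sketch");
        merger.setDaemon(true);
        merger.start();
    }

    private void notifySessions(String dataInfoId) {
        System.out.println("notify sessions: data changed for " + dataInfoId);
    }
}
```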

"Data synchronization process in pull mode"

Pull mode is initiated by the SessionServer through

com.alipay.sofa.registry.server.session.registry.SessionRegistry.VersionWatchDog

which by default scans the version data every 5 seconds. If a version change is detected, a pull is actively performed. The process is roughly as follows:

[Figure: pull mode process]
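A minimal sketch of this periodic scan is shown below. The 5-second interval matches the default mentioned above; the fetch and compare methods are hypothetical placeholders for the real session-side logic:

```java
// A minimal sketch of the periodic version scan on the session side; the helper
// methods at the bottom are hypothetical placeholders, not the real SessionRegistry API.
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class VersionWatchDogSketch {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Start the periodic scan; 5 seconds matches the default mentioned above. */
    public void start() {
        scheduler.scheduleWithFixedDelay(this::scanVersions, 5, 5, TimeUnit.SECONDS);
    }

    private void scanVersions() {
        // 1. ask the DataServer for the current version of every dataInfoId (placeholder)
        Map<String, Long> remoteVersions = fetchVersionsFromDataServer();
        // 2. compare with what has already been pushed to subscribers; re-fetch on mismatch
        remoteVersions.forEach((dataInfoId, version) -> {
            if (version > lastPushedVersion(dataInfoId)) {
                fetchDataAndPush(dataInfoId);
            }
        });
    }

    // The methods below are hypothetical placeholders for the real session-side logic.
    private Map<String, Long> fetchVersionsFromDataServer() { return Map.of(); }
    private long lastPushedVersion(String dataInfoId) { return 0L; }
    private void fetchDataAndPush(String dataInfoId) { }
}
```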

Note that pull mode complements the push process. The version used here is the lastPushedVersion of each sub, whereas the version used in push mode is the version of the pub data. For how lastPushedVersion is obtained, refer to:

com.alipay.sofa.registry.server.session.store.SessionInterests#selectSubscribers


"Data synchronization between multiple copies of DataServer"

Data synchronization between multiple copies of DataServer is mainly driven by the slot followers: the follower of the data corresponding to a slot periodically synchronizes data from the slot leader. The synchronization logic is not much different from the data synchronization between SessionServer and DataServer; it is likewise an incremental, diff-based sync against the leader.

[Figure: data synchronization between DataServer copies]
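A minimal sketch of the follower side is shown below, assuming a simple scheduled loop; the class name, interval, and syncSlotFromLeader placeholder are illustrative, not the actual SOFARegistry implementation:

```java
// Illustrative only: for every slot this node follows, periodically run the same
// incremental diff sync against that slot's leader. Interval and names are assumptions.
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SlotFollowerSyncSketch {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void start(List<Integer> followedSlots) {
        scheduler.scheduleWithFixedDelay(() -> {
            for (int slotId : followedSlots) {
                syncSlotFromLeader(slotId);
            }
        }, 3, 3, TimeUnit.SECONDS);
    }

    private void syncSlotFromLeader(int slotId) {
        // same two-phase flow as the session sync: digest diff first (removals only),
        // then publisher diff (additions and updates) against the slot leader
    }
}
```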

"Logical Analysis of Incremental Synchronization Diff Calculation"

Whether it is synchronization between SessionServer and DataServer, or synchronization between multiple copies of DataServer, they are all based on incremental diff synchronization, and will not synchronize the full amount of data at one time.

This section briefly analyzes the calculation logic of incremental synchronization diff. The core code is:

com.alipay.sofa.registry.common.model.slot.DataSlotDiffUtils (it is recommended to read this code together with its test cases).

It mainly includes two calculations: digests and publishers.

diffDigest

The DataSlotDiffUtils#diffDigest method receives two arguments:

- targetDigestMap, which can be understood as the target data

- sourceDigestMap, which can be understood as the baseline data

The core computation logic is analyzed in the code below:

[Code screenshot: DataSlotDiffUtils#diffDigest]
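Since the screenshot is not reproduced here, the sketch below re-implements the digest diff in a simplified form consistent with the scenarios listed after it: dataInfoIds present only in the baseline are treated as added, those present only in the target as removed, and those present in both with different digests as updated. It is a simplified illustration, not the actual DataSlotDiffUtils code.

```java
// A simplified re-implementation of the digest diff for illustration only.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DiffDigestSketch {

    /** Simplified result holder for the three kinds of dataInfoId changes. */
    public record DigestDiff(List<String> addedDataInfoIds,
                             List<String> updatedDataInfoIds,
                             List<String> removedDataInfoIds) {}

    public static DigestDiff diffDigest(Map<String, Long> targetDigestMap,
                                        Map<String, Long> sourceDigestMap) {
        List<String> added = new ArrayList<>();
        List<String> updated = new ArrayList<>();
        List<String> removed = new ArrayList<>();

        // baseline entries missing from the target are new; differing digests mean update
        sourceDigestMap.forEach((dataInfoId, digest) -> {
            Long targetDigest = targetDigestMap.get(dataInfoId);
            if (targetDigest == null) {
                added.add(dataInfoId);
            } else if (!targetDigest.equals(digest)) {
                updated.add(dataInfoId);
            }
        });
        // target entries that no longer exist in the baseline are removed
        for (String dataInfoId : targetDigestMap.keySet()) {
            if (!sourceDigestMap.containsKey(dataInfoId)) {
                removed.add(dataInfoId);
            }
        }
        return new DigestDiff(added, updated, removed);
    }

    public static void main(String[] args) {
        Map<String, Long> baseline = Map.of("a", 1L, "b", 2L);
        // corresponds to scenario 4 below: the target holds a and c
        DigestDiff diff = diffDigest(Map.of("a", 1L, "c", 3L), baseline);
        System.out.println(diff); // b is added, c is removed, nothing updated
    }
}
```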

Based on the above diff calculation logic, the following scenarios arise (assuming the baseline data set contains the dataInfoIds a and b):

1. The target data set is empty: a and b are returned as new items;

2. The target data set equals the baseline data set: the added, updated, and removed items are all empty;

3. The target data set contains the three dataInfoIds a, b, and c: c is returned as an item to be removed;

4. The target data set contains the two dataInfoIds a and c: c is returned as an item to be removed and b as a new item.

diffPublisher

The diffPublisher calculation is slightly different from diffDigest. diffPublisher receives three parameters: in addition to the target data set and the baseline data set, there is a publisherMaxNum (default 400), which limits the amount of data processed in each round. The core code is explained below:

[Code screenshots: DataSlotDiffUtils diffPublisher logic]
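Again, as a stand-in for the screenshots, the sketch below shows a simplified version of the publisher-level diff, with the baseline treated as the source of truth; publisherMaxNum caps how many publishers are handled per round, and the rest are reported back as unprocessed. The class and record names are illustrative, not the real API of DataSlotDiffUtils.

```java
// A simplified sketch of the publisher-level diff; names and structures are illustrative.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DiffPublisherSketch {

    /** Simplified result holder: publishers to update, registerIds to remove, leftovers. */
    public record PublisherDiff(Map<String, Map<String, Long>> updatedPublishers,
                                Map<String, List<String>> removedRegisterIds,
                                List<String> unprocessedDataInfoIds) {}

    /** target/baseline: dataInfoId -> (registerId -> publisher version), a simplification. */
    public static PublisherDiff diffPublisher(Map<String, Map<String, Long>> target,
                                              Map<String, Map<String, Long>> baseline,
                                              int publisherMaxNum) {
        Map<String, Map<String, Long>> updated = new HashMap<>();
        Map<String, List<String>> removed = new HashMap<>();
        List<String> unprocessed = new ArrayList<>();
        int processed = 0;

        for (Map.Entry<String, Map<String, Long>> entry : baseline.entrySet()) {
            String dataInfoId = entry.getKey();
            if (processed >= publisherMaxNum) {
                // over the per-round budget: leave this dataInfoId for a later round
                unprocessed.add(dataInfoId);
                continue;
            }
            Map<String, Long> basePubs = entry.getValue();
            Map<String, Long> targetPubs = target.getOrDefault(dataInfoId, Map.of());
            processed += basePubs.size();

            // registerIds present in the baseline but missing or stale in the target -> update
            Map<String, Long> toUpdate = new HashMap<>();
            basePubs.forEach((registerId, version) -> {
                if (!version.equals(targetPubs.get(registerId))) {
                    toUpdate.put(registerId, version);
                }
            });
            if (!toUpdate.isEmpty()) {
                updated.put(dataInfoId, toUpdate);
            }

            // registerIds the target still holds but the baseline no longer knows -> remove
            List<String> toRemove = new ArrayList<>(targetPubs.keySet());
            toRemove.removeAll(basePubs.keySet());
            if (!toRemove.isEmpty()) {
                removed.put(dataInfoId, toRemove);
            }
        }
        return new PublisherDiff(updated, removed, unprocessed);
    }
}
```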

Several scenarios are analyzed here as well ("update" below refers to updating the publishers of a dataInfoId; a registerId corresponds one-to-one with a publisher):

1. The target data set is the same as the baseline data set and the data does not exceed publisherMaxNum: the returned to-update and to-remove sets are empty, and there is no remaining unprocessed data;

2. Removal is needed when:
- the baseline does not include a registerId that the target data set holds for a dataInfoId (what is removed is the registerId, not the dataInfoId);

3. An update is needed when:
- a registerId exists in the baseline data set but not in the target data set
- a registerId exists in both the target and baseline data sets but with different versions

PART. 3 - Summary

This article mainly introduced the data synchronization module in SOFARegistry. Starting from SOFARegistry's role classification, we laid out the data synchronization problems between the different roles, and then analyzed data synchronization between SessionServer and DataServer as well as between multiple copies of DataServer.

In the analysis of data synchronization between SessionServer and DataServer, the overall process of both push and pull modes was examined. Finally, the incremental diff calculation logic in SOFARegistry was introduced, with the specific scenarios described alongside the relevant core code.

On the whole, SOFARegistry's handling of data synchronization offers three points worth learning from:

1. In terms of consistency, SOFARegistry chooses AP and satisfies eventual consistency. In the actual synchronization logic, combined with the event mechanism, most work is completed asynchronously, which effectively reduces the impact of data synchronization on the core workflow;

2. Both pull mode and data change notification internally use a producer-consumer model. On the one hand, this decouples the production and consumption logic and keeps the code more independent; on the other hand, it avoids the mutual blocking that would otherwise result from different production and consumption speeds;

3. Pull mode complements push mode. Push mode is server -> client: when data changes, if an abnormality on a server -> client link causes a push to fail, different clients may end up holding inconsistent data; pull mode supplements this by letting the client actively complete the data consistency check.

【Reference link】

[1] "Registry Center under Massive Data - Introduction to SOFARegistry Architecture": https://www.sofastack.tech/blog/sofa-registry-introduction/

Original link for this article on the project site: https://www.sofastack.tech/projects/sofa-registry/code-analyze/code-analyze-data-synchronization/

Learn more and star SOFARegistry ✨:
https://github.com/sofastack/sofa-registry
