Cloud Class | Talking About the Data Filtering Feature of DRS


[Abstract] DRS currently supports MySQL, SQL Server, MongoDB, PostgreSQL, and other database engines, whether they run on other clouds, in on-premises IDCs, or as self-built databases on ECS. It provides real-time migration and real-time synchronization of data across several network scenarios, including the public network, Huawei VPN, and Huawei Cloud VPC. DRS also provides a rich set of auxiliary data functions. Today, we will talk about one "little assistant" in the data synchronization process: data filtering.

This article is shared from the HUAWEI CLOUD community post "[Cloud Class] [Lesson 15] Let's talk about the data filtering features of DRS", the original author:

As we all know, Data Replication Service (DRS) is an easy-to-use, stable, and efficient cloud service for online database migration and real-time database synchronization.

At present, DRS supports MySQL, SQL Server, MongoDB, PostgreSQL, and other database engines, whether they run on other clouds, in on-premises IDCs, or as self-built databases on ECS. It provides real-time data migration and real-time synchronization across several network scenarios, including the public network, Huawei VPN, and Huawei Cloud VPC. DRS also provides a rich set of auxiliary data functions. Today, we will talk about one "little assistant" in the data synchronization process: data filtering.

First, let's look at the business scenario in which the data filtering feature is used: data synchronization.

1. Introduction to data synchronization

Function

Data synchronization is one of the core functions of DRS. It copies data from one data source to other databases across different systems and keeps the copies consistent, so that key business data flows in real time.

Common scenarios

Real-time analytics, reporting systems, and data warehouse environments.

Features

Data synchronization works at the level of tables and data and meets a variety of flexibility needs, such as many-to-one synchronization, one-to-many synchronization, dynamically adding or removing synchronized tables, and synchronizing data between tables with different names.
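As a rough illustration of many-to-one synchronization with table-name mapping, the sketch below merges two source tables into one target table. It is a toy model in plain Python, not DRS itself: the function `synchronize`, the database names `shop_east` and `shop_west`, the table names, and the `_source` column are all hypothetical.

```python
from collections import defaultdict


def synchronize(sources, table_mapping):
    """Copy rows from several source tables into (possibly renamed) target tables.

    sources:       {(database, table): [row, ...]}
    table_mapping: {(database, table): target_table}; unmapped tables keep their name.
    """
    target = defaultdict(list)
    for (database, table), rows in sources.items():
        target_table = table_mapping.get((database, table), table)
        for row in rows:
            # Record where each row came from (hypothetical bookkeeping column).
            target[target_table].append({**row, "_source": database})
    return dict(target)


# Two source databases, each with its own "orders" table (hypothetical data).
sources = {
    ("shop_east", "orders"): [{"id": 1, "amount": 30}],
    ("shop_west", "orders"): [{"id": 2, "amount": 45}],
}
# Many-to-one: both source tables map to a target table with a different name.
mapping = {
    ("shop_east", "orders"): "all_orders",
    ("shop_west", "orders"): "all_orders",
}

merged = synchronize(sources, mapping)
print(merged)
```

The point of the sketch is only that the target table name is decoupled from the source table names, which is what allows several sources to land in one summary table.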

With this preliminary understanding, we can see that data synchronization differs from migration. Migration aims to relocate an entire database, whereas data synchronization maintains a continuous flow of data between different businesses.

In application scenarios such as routine data synchronization and summarizing split tables, we often want to obtain only part of the data in each table in real time for summary analysis. In these cases, being able to set synchronization rules becomes especially important, because it lets us synchronize data more accurately and efficiently.

The data filtering feature of DRS is the remedy for this scenario. It processes the synchronized objects by adding rules to the objects you select, which makes it easy to specify the conditions that data must meet to be synchronized. Now, let's look at today's protagonist: data filtering.

2. Data filtering

Principle

With data consistency as the primary goal, let's use an example to look at the scenarios that arise when the source database is updated after a data filtering rule has been added.

Assume the source database contains rows with id 1, 2, 3, 5, and 6, and we set the filter condition to id between 1 and 5. After data synchronization, the target database contains the rows with id 1, 2, 3, and 5.
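The initial state of this example can be reproduced with a minimal sketch. It models each row by its id only; the function names `matches_filter` and `initial_sync` are hypothetical and are not part of DRS.

```python
def matches_filter(row_id):
    """Filter condition from the example: id BETWEEN 1 AND 5 (inclusive)."""
    return 1 <= row_id <= 5


def initial_sync(source_ids):
    """Return the ids that pass the filter and would be copied to the target."""
    return sorted(i for i in source_ids if matches_filter(i))


source_ids = [1, 2, 3, 5, 6]
target_ids = initial_sync(source_ids)
print(target_ids)  # [1, 2, 3, 5]
```

The row with id=6 stays in the source database only, because it falls outside the filter range.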

First, let's look at several common update scenarios:

■ If you insert id=1.5 in the source database, and this id meets the filter condition (id between 1 and 5), the target database performs the same insert operation.

■ If you delete id=2 from the source database, and this id meets the filter condition (id between 1 and 5), the target database performs the same delete operation.

■ If you update id=3 to id=3.5 in the source database, and the updated id still meets the filter condition (id between 1 and 5), the target database performs the same update operation.
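The three common scenarios above can be replayed in a small simulation. This is a sketch under the assumption that a change is replicated when its old or new id satisfies the filter; `apply_event` is a hypothetical helper, and rows are again modelled by id only.

```python
def matches_filter(row_id):
    """Filter condition from the example: id BETWEEN 1 AND 5 (inclusive)."""
    return row_id is not None and 1 <= row_id <= 5


def apply_event(target, before, after):
    """Replay one source change on the target set of ids.

    before is None for an INSERT; after is None for a DELETE.
    Changes that never touch the filter range are skipped. Conflicts
    (a missing row on the target) are not handled in this sketch.
    """
    if not (matches_filter(before) or matches_filter(after)):
        return
    if before is not None:
        target.discard(before)
    if after is not None:
        target.add(after)


target = {1, 2, 3, 5}            # state after the initial synchronization
apply_event(target, None, 1.5)   # INSERT id=1.5
apply_event(target, 2, None)     # DELETE id=2
apply_event(target, 3, 3.5)      # UPDATE id=3 -> id=3.5
print(sorted(target))  # [1, 1.5, 3.5, 5]
```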

These are the scenarios we encounter most often when synchronizing data. Occasionally, we also run into the following, less common update scenarios:

■ Suppose we update id=2 in the source database to id=7. Before the update, id=2 meets the filter condition (id between 1 and 5); after the update, id=7 does not. As synchronization continues, the target database performs the same update operation, and the ids in the target database become 1, 3, 5, 7.

■ Suppose we update id=6 in the source database to id=4. Before the update, id=6 does not meet the filter condition (id between 1 and 5); after the update, id=4 does. As synchronization continues, the target database tries to perform the same update operation, but it has no row with id=6, so no data can be matched.

In this case, DRS handles the update according to the conflict policy selected for the current synchronization task:

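● If the conflict policy is "Overwrite", id=4 is inserted into the target database as new data, and the source and target databases remain consistent.

● If the conflict policy is "Ignore", the update is ignored. The source database then has id=4 but the target database does not, so the target database holds less data than the source database.

● If the conflict policy is "Report error", the task fails and stops immediately.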
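The two special update scenarios and the three conflict policies can be sketched as follows. This is an illustrative model, not DRS code: the policy strings `"overwrite"`, `"ignore"`, and `"error"`, the exception `SyncConflictError`, and the helpers `apply_update` and `demo` are hypothetical names, and the replication rule (replay an update when the old or new id satisfies the filter) is inferred from the examples above.

```python
class SyncConflictError(Exception):
    """Raised under the 'error' policy when the target has no row to update."""


def matches_filter(row_id):
    """Filter condition from the example: id BETWEEN 1 AND 5 (inclusive)."""
    return 1 <= row_id <= 5


def apply_update(target, old_id, new_id, policy):
    """Replay 'UPDATE ... SET id = new_id WHERE id = old_id' on the target ids."""
    if not (matches_filter(old_id) or matches_filter(new_id)):
        return  # the change never touches the filter range
    if old_id in target:
        # Normal case: the row exists, so the same update is applied.
        target.remove(old_id)
        target.add(new_id)
        return
    # Conflict: the target has no matching row.
    if policy == "overwrite":
        target.add(new_id)       # insert the new value as new data
    elif policy == "ignore":
        pass                     # skip the change; target now lags the source
    elif policy == "error":
        raise SyncConflictError(f"no row with id={old_id} on the target")
    else:
        raise ValueError(f"unknown policy: {policy!r}")


def demo(old_id, new_id, policy):
    target = {1, 2, 3, 5}        # state after the initial synchronization
    apply_update(target, old_id, new_id, policy)
    return sorted(target)


print(demo(2, 7, "overwrite"))   # [1, 3, 5, 7]
print(demo(6, 4, "overwrite"))   # [1, 2, 3, 4, 5]
print(demo(6, 4, "ignore"))      # [1, 2, 3, 5]
```

Calling `demo(6, 4, "error")` raises `SyncConflictError`, which mirrors the task failing and stopping.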

Steps

Log in to the management console.
Click the icon in the upper left corner of the management console and select a region and project. DRS also supports creating the required instances on physical machine resources purchased through the Dedicated Computing Cluster service for data migration and data synchronization, so you can select your own dedicated computing cluster.
In "All Services" or "Service List", select "Database > Data Replication Service" to open the Data Replication Service page.
On the "Data Synchronization Management" page, click "Create Synchronization Task".
On the "Scene Selection" page, select the "Source Database Source" and "Target Database Source", and click "Next" to go to the "Synchronization Instance" page.
On the "Synchronization Instance" page, fill in the task name, notification recipient information, description, and synchronization instance information, and click "Next".
After the synchronization instance is created, on the "Source and target libraries" page, fill in the source database and target database information, and click "Test connection" to confirm that both the source database and the target database can be reached. Once both connections succeed, accept the agreement and click "Next".
On the "Set Synchronization" page, select the data conflict policy and the synchronization objects, and click "Next".

Figure 1 Synchronization mode

Table 1 Synchronization modes and objects

On the "Data Processing" page, select "Data Filtering" for "Processing Type".

Figure 2 Data processing

Select the table object to be processed in the "Object Selection" area.
In the filter condition area, enter the filter condition (only the part that follows WHERE in the SQL statement, for example id=1), and click "Verify".

Notes:

● Only one verification rule can be added to each table.

● When the source database is Oracle, data filtering supports up to 20,000 tables at a time. When the source database is MySQL, it supports up to 10,000 tables at a time.

● Filter conditions cannot use packages, functions, variables, or constants that are specific to one database engine. They must use standard, general-purpose SQL.

After the verification passes, click "Generate Processing Rule". The rule then appears in the processing rule table.
After confirming that the rule is correct, click "Next" to continue.
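To get a feel for what a filter condition looks like before entering it in the console, the sketch below checks whether a WHERE fragment parses, using Python's built-in `sqlite3` module. This is only a rough local approximation and an assumption on our part: it does not reproduce the console's "Verify" step, SQLite accepts some non-standard syntax of its own, and the helper `verify_filter` and the table name `t` are hypothetical. Because the fragment is interpolated into the statement, use it only with conditions you wrote yourself.

```python
import sqlite3


def verify_filter(condition, columns=("id",)):
    """Return True if `condition` parses as a WHERE clause in SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        # Scratch table with the column names the condition may reference.
        conn.execute(f"CREATE TABLE t ({', '.join(columns)})")
        # LIMIT 0: only parsing and name resolution matter, not the rows.
        conn.execute(f"SELECT 1 FROM t WHERE {condition} LIMIT 0")
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()


print(verify_filter("id = 1"))              # True
print(verify_filter("id between 1 and 5"))  # True
print(verify_filter("id ==== 1"))           # False: syntax error
print(verify_filter("NVL(id, 0) = 1"))      # False: NVL is not a SQLite function
```

The last call shows why engine-specific functions are a problem: a function that exists in one engine may be unknown to another, so conditions written in general SQL are the safer choice.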

DRS provides multiple functions, including online migration, backup migration, data synchronization, data subscription, and multi-active disaster recovery. Each has its own characteristics and suits a different data flow scenario, giving you a safe and worry-free data replication experience.
