In today's digital age, we all agree that data creates value. To maximize that value, we keep building data migration pipelines: from homogeneous to heterogeneous systems, from relational to non-relational stores, from on-premises to the cloud, from the data warehouse to the data lake, mining the value of data in every scenario. In this criss-crossing data network, logical replication plays an extremely important role.
Let's pull our perspective back from this complex network to one of its endpoints, starting with PostgreSQL and deciphering the principles of its logical replication.
1 Concepts and Principles

Logical replication is a method of replicating data and data changes based on a replica identity (usually a primary key). Unlike physical replication, which describes operations on pages, logical replication describes transactions and data tuples.
Figure - WAL data flow example
As shown in the figure, the physical replication data flow describes block-level operations on tablespace/database/filenode files, while the logical replication stream describes tuples.
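Which columns identify a tuple in that stream is controlled by the table's replica identity. A minimal sketch, using a hypothetical table named orders:

```sql
-- Default: the primary key identifies updated/deleted rows in the change stream
CREATE TABLE orders (id bigint PRIMARY KEY, amount numeric);

-- If a table has no suitable key, the whole old row can be logged instead
ALTER TABLE orders REPLICA IDENTITY FULL;
```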
Next, let's look at several concepts in logical replication:
Replication slot
A replication slot is a set of information that records the state of replication. Since WAL (Write-Ahead Log) files are removed once the data they describe has been flushed to disk, replication slots prevent premature cleanup of the WAL still needed for logical decoding. In logical replication, each slot streams a series of changes from a single database; creating a logical replication slot requires specifying the output plugin it uses, and creating the slot also provides a snapshot.
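As a minimal sketch (the slot and plugin names are only examples), a logical replication slot can be created and inspected at the SQL level:

```sql
-- Create a logical slot bound to the test_decoding output plugin
SELECT * FROM pg_create_logical_replication_slot('my_slot', 'test_decoding');

-- Inspect slot state, including restart_lsn (the oldest WAL the slot still needs)
SELECT slot_name, plugin, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
```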
Output plugin
The output plugin is responsible for decoding the WAL into a readable format. Common plugins include test_decoding (mostly used for testing), pgoutput (the default used by built-in publication/subscription), and wal2json (which outputs JSON). PostgreSQL defines a series of callback functions, so besides using the plugins above, we can write our own output plugin by implementing those callbacks.
Figure - Replication slot data flow
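To see the readable format an output plugin produces, we can pull changes from the slot; a sketch, reusing the hypothetical my_slot created with test_decoding and the orders table above:

```sql
-- Generate a change, then decode it through the slot's output plugin
INSERT INTO orders VALUES (1, 9.90);

-- peek keeps the changes in the slot; the get variant consumes them
SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('my_slot', NULL, NULL);
-- data looks roughly like:
--   table public.orders: INSERT: id[bigint]:1 amount[numeric]:9.90
```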
Replication protocol and messages
Through the replication protocol, we can obtain the WAL data stream from the source. For example, we can establish a replication connection with the psql tool:
psql "dbname=postgres replication=database"
Then start streaming WAL:
START_REPLICATION [ SLOT slot_name ] [ PHYSICAL ] XXX/XXX [ TIMELINE tli ]
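For logical replication the command names the slot and uses the LOGICAL keyword; a sketch, reusing the hypothetical my_slot:

```sql
-- Inside the replication connection opened above;
-- 0/0 lets the server start at the slot's confirmed position
START_REPLICATION SLOT my_slot LOGICAL 0/0;
```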
Whether we use physical or logical replication, and whether streaming replication is built with PostgreSQL's publication/subscription or with pg_basebackup, both sides interact through the replication protocol using defined messages (the data stream content differs between physical and logical replication).
Figure - WAL data flow message types
Figure - XLogData message in logical replication
Workflow
Now that we understand the concepts, let's look at the entire decoding workflow. Since the records of a transaction are not necessarily contiguous in the WAL files, they first pass through the reorder buffer, where they are organized into messages by transaction ID. After the transaction COMMITs, the changes are handed to the output plugin, and the decoded message stream is then sent to the target end.
Figure - Logic Parsing Workflow
2 Problems and Evolution

Having grasped the principles of logical replication and planned to use it to build our data migration application, we still face a few unsolved problems. Let's look at what these pressing problems are and how to deal with them.
Problem 1: Failover slot
For high availability, the database is deployed with at least a primary-standby architecture. When the primary fails and a high-availability switchover takes place, the standby has no corresponding replication slot information; in other words, it lacks a failover slot. This is because the physical files that store slot information are not synchronized to the standby. So how do we create a failover slot manually?
- Create the replication slot on the primary and check that the standby's WAL files are continuous
- Copy the physical files that contain the slot information to the standby, under the pg_replslot directory
- Restart the standby; the replication slot information becomes visible only after the restart, because StartupReplicationSlots, the function that reads the slot's physical files, is called only when the postmaster process starts
- Periodically query the slot status on the primary and use the pg_replication_slot_advance function to advance the replication slot on the standby, as sketched below
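A sketch of that last step, assuming the hypothetical my_slot has already been copied to the standby (the LSN below is a placeholder read from the primary):

```sql
-- On the primary: read how far the slot has been confirmed
SELECT slot_name, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name = 'my_slot';

-- On the standby: advance the copied slot to the LSN read above
SELECT pg_replication_slot_advance('my_slot', '0/3000A88');
```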
With this, the standby also holds the corresponding slot information, and we have implemented a failover slot manually. Patroni, the well-known high-availability software in the PostgreSQL ecosystem, works the same way; the difference is that when Patroni queries the slot status on the primary it writes the information into the DCS, and the standby reads the position from the DCS to advance its slot.
Problem 2: DDL synchronization
Native logical replication does not support decoding DDL statements, but we can handle them with event triggers.
- Use event triggers to capture table structure changes, record them in a DDL_RECORD table, and publish that table through logical replication (see the sketch after this list).
- When the receiving end obtains changes to this table, it can turn them back into the corresponding DDL statements and execute them.
Figure - Event trigger to achieve DDL synchronization
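A minimal sketch of the sending side, with assumed names (ddl_record, log_ddl, ddl_pub); reconstructing the full DDL text on the receiving end may need more information than is captured here:

```sql
-- Table that records captured DDL and is itself published
CREATE TABLE ddl_record (
    id         bigserial PRIMARY KEY,
    event      text,
    command    text,
    created_at timestamptz DEFAULT now()
);

-- Event trigger function: list the DDL commands of the current event
CREATE OR REPLACE FUNCTION log_ddl() RETURNS event_trigger AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        INSERT INTO ddl_record(event, command)
        VALUES (TG_EVENT, r.command_tag || ' ' || r.object_identity);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER capture_ddl ON ddl_command_end EXECUTE FUNCTION log_ddl();

-- Publish the record table so subscribers receive the DDL descriptions
CREATE PUBLICATION ddl_pub FOR TABLE ddl_record;
```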
Problem 3: Two-way synchronization
When data migration involves a bidirectional synchronization pipeline, for example to implement dual-master dual-write, and both sides operate on the same object in the database, the changes loop endlessly through the WAL.
Figure - Two-way synchronization of the same table results in a data loop
To solve this problem, some DTS applications create an auxiliary table and touch it first in each transaction; by decoding the operation on the auxiliary table they can tell that the records were inserted by the DTS application itself, filter out the transaction, and stop it from being decoded in a loop. PostgreSQL records an origin for each transaction, so no auxiliary table is needed: the origin ID can be specified through the pg_replication_origin_session_setup function, or via replorigin_create as is done in publication/subscription.
After the origin ID is specified, besides filtering in the DTS application after decoding, we can also filter during decoding through the FilterByOriginCB callback in the output plugin, which reduces data transmission and is more efficient.
Figure - OriginFilter function DEMO in test_decoding
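A sketch of the origin-based approach at the SQL level, with an assumed origin name (dts_origin); test_decoding exposes the origin filter through its only-local option, shown here against the hypothetical my_slot:

```sql
-- Register an origin once, then mark this session's changes with it
SELECT pg_replication_origin_create('dts_origin');
SELECT pg_replication_origin_session_setup('dts_origin');

-- Changes written by this session now carry a non-local origin and can be
-- skipped during decoding, e.g. via test_decoding's only-local option:
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL,
                                           'only-local', '1');
```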
Other problems
In addition to the three issues above, there are some other usage issues and limitations. A few are listed here with brief explanations rather than full discussion.
TOAST handling: for TOAST values (identifiable from the message format) a placeholder is generally used, and the receiving end does not touch that column when it sees the placeholder. Although this is a bit troublesome, it is the result of a trade-off in how TOAST values are decoded and transmitted.
Heartbeat table: since the xmin recorded by the replication slot is global, when the published tables receive no updates, the slot does not advance and WAL accumulates. We can create a heartbeat table, write to it periodically, and publish it so that the slot keeps advancing.
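A minimal sketch with assumed names (heartbeat, heartbeat_pub); the update would be scheduled periodically, e.g. by cron:

```sql
CREATE TABLE heartbeat (id int PRIMARY KEY, beat_at timestamptz);
INSERT INTO heartbeat VALUES (1, now());
CREATE PUBLICATION heartbeat_pub FOR TABLE heartbeat;

-- Executed periodically so the slot keeps consuming WAL and can advance
UPDATE heartbeat SET beat_at = now() WHERE id = 1;
```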
Large-transaction latency: from the workflow described above we know that, by default, a transaction is not decoded until it COMMITs, which inevitably delays large transactions. PostgreSQL 14 provides a streaming mode for decoding, in which an in-progress transaction is decoded and sent to the receiver before it commits.
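With built-in publication/subscription on PostgreSQL 14 or later, this is enabled on the subscriber side; a sketch with assumed names (mysub, mypub) and a placeholder connection string:

```sql
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=source-host dbname=postgres user=repuser'
    PUBLICATION mypub
    WITH (streaming = on);  -- decode and send large transactions before COMMIT
```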
3 Application and Practice

In the first two sections we demystified PostgreSQL logical replication from the perspective of principles and problems. Next, let's look at how to apply the principles we have mastered to data migration in practice.
Full and incremental synchronization
Most real data migration scenarios require synchronizing both full and incremental data, and once the data transmission channel is opened, new requirements arise around its security, efficiency, and features, such as ETL capabilities like cleansing and desensitization. Let's first look at how to achieve full plus incremental synchronization.
Figure - Schematic diagram of data flow
The main steps include:
- Create a replication slot and export a snapshot (see the sketch after this list)
- Migrate the full data based on the snapshot
- Migrate incremental data based on the replication slot
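A sketch of the first two steps over the replication protocol, with a hypothetical slot name and a placeholder snapshot name (the real name is returned by CREATE_REPLICATION_SLOT):

```sql
-- In a replication connection (psql "dbname=postgres replication=database"):
-- creating the slot returns a consistent point and an exported snapshot name
CREATE_REPLICATION_SLOT migration_slot LOGICAL pgoutput EXPORT_SNAPSHOT;

-- In a normal connection: read the full data exactly as of that snapshot
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000003-0000001B-1';   -- placeholder name
-- COPY / SELECT the published tables here for the full migration
COMMIT;

-- Incremental changes are then streamed from the slot starting at the
-- consistent point, e.g. START_REPLICATION SLOT migration_slot LOGICAL ...
```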
We use a PG database or a message queue (MQ) as the data broker, so full and incremental decoding can proceed at the same time. When the full data has been processed, a state machine notifies the incremental handler to start publishing incremental changes. The data held in the broker can also be preprocessed after decoding.
Practice of migrating self-built instances to the cloud
Finally, let me share a practical case of migrating a self-built instance to the cloud: a self-built PG 10 instance migrated to RDS for PostgreSQL 11 on JD Cloud. Data safety is ensured through incremental data reflow and data verification, allowing the business to switch over smoothly.
Figure - Data Migration to the Cloud
The DTS application is mainly divided into the following stages:
- Data check phase: check primary keys, permissions, and configurations
- Data migration phase: migrate schema, full data, and incremental data; monitor the migration status
- Application migration phase: switch domain names and direct traffic to the new instance
- Rollback phase: incremental data flows back to the source, so if a problem occurs the migration can be rolled back quickly
Author: Jiang Shuai