Abstract: This issue describes in detail how to migrate the community version of Redis, Kvrocks, and Pika to GaussDB (for Redis).

This article is shared from the Huawei Cloud Community post "Huawei Cloud PB-level database GaussDB (for Redis) Demystification Issue Ten: GaussDB (for Redis) Migration Series ( )"; original author: Gauss Redis official blog.

GaussDB (for Redis) is a cloud-native NoSQL database that is built on an architecture separating compute from storage and is compatible with the Redis ecosystem. It relies on the multi-copy strong consistency mechanism of a shared storage pool and supports persistent storage. While ensuring high compatibility, cost-effectiveness, high reliability, and lossless scaling, the GaussDB (for Redis) team provides users with data migration solutions for a variety of database products. This issue describes in detail how to migrate the community version of Redis, Kvrocks, and Pika to GaussDB (for Redis).

1. Migration from Redis to GaussDB (for Redis)

The community version of Redis is a very popular in-memory database, widely used for its high performance and rich data structures. GaussDB (for Redis) is a persistent database compatible with the Redis ecosystem. It provides excellent read and write performance together with data persistence. Relying on its advanced system architecture, it guarantees strong consistency across three copies of the data at very low cost, and it avoids the fork operations and high cost that the community version of Redis incurs.

1.1 Migration principle

The migration uses drs-redis, a migration tool developed by Huawei Cloud, to move data from the source Redis to the target GaussDB (for Redis). During the migration, drs-redis poses as a slave node of the source Redis. After it connects, it triggers Redis master-slave synchronization: the source Redis generates an RDB file and transmits it to drs-redis for full synchronization, then sends all write commands saved in its buffer to drs-redis for incremental synchronization. drs-redis parses the received RDB file and writes the parsed data to GaussDB (for Redis) as Redis commands, then forwards the incremental data to GaussDB (for Redis) through command propagation, which completes the migration.
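The two phases can be illustrated with a small in-memory model. This is only a sketch of the idea, not the drs-redis implementation: plain dictionaries stand in for the source and target databases, the snapshot stands in for the RDB file, and the names `apply_command` and `migrate` are hypothetical.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, str, str]  # (operation, key, value); value is "" for DEL


def apply_command(store: Dict[str, str], cmd: Command) -> None:
    """Apply one write command to a key-value store."""
    op, key, value = cmd
    if op == "SET":
        store[key] = value
    elif op == "DEL":
        store.pop(key, None)
    else:
        raise ValueError(f"unsupported command: {op}")


def migrate(snapshot: Dict[str, str], buffered: List[Command]) -> Dict[str, str]:
    """Full sync from a snapshot, then incremental sync from buffered writes."""
    target: Dict[str, str] = {}
    # Full synchronization: replay every key in the snapshot as a SET.
    for key, value in snapshot.items():
        apply_command(target, ("SET", key, value))
    # Incremental synchronization: replay buffered writes in their original order.
    for cmd in buffered:
        apply_command(target, cmd)
    return target


source = {"a": "1", "b": "2"}
snapshot = dict(source)  # point-in-time copy, standing in for the RDB file
writes = [("SET", "c", "3"), ("DEL", "a", ""), ("SET", "b", "20")]
for w in writes:
    apply_command(source, w)  # the source keeps serving writes after the snapshot

target = migrate(snapshot, writes)
print(target == source)
```

The order of the buffered writes matters: replaying them out of order could leave the target with a stale value for a key that was written more than once.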

1.2 Prerequisites

  • Deploy the migration tool drs-redis.
  • Ensure that the migration tool drs-redis, the source Redis, and the target GaussDB (for Redis) can reach one another over the network.

1.3 Operation steps

  1. Modify the drs-redis configuration files and check that they are correct.
  2. Clean up any data that may remain from the migration process.
  3. Start drs-redis and follow the log to confirm that the migration is proceeding correctly.

1.4 Instructions for use

  • drs-redis poses as a slave node of the source Redis and only reads the full data and incremental commands from the source, so there is no risk of data damage.
  • The source takes on the extra work of sending data to drs-redis, so its performance is slightly affected.
  • GaussDB (for Redis) supports multiple databases. If the source is a single-node Redis and its multiple databases need to be preserved, enable the namespace function on the GaussDB (for Redis) side. This avoids the data loss that would result from migrating several databases into the same space.
  • If the source previously had no slave node, it allocates a replication buffer to cache the incremental commands.

Problem: The replication buffer used by Redis master-slave synchronization is a ring buffer. If data is written to it faster than drs-redis consumes it, data that has not yet been sent to drs-redis is overwritten. To preserve data consistency, the source Redis then actively disconnects, and the migration fails.

Recommendation: Lower the write rate on the source Redis during the migration and migrate during off-peak hours. Also configure the Redis client-output-buffer-limit parameter to increase the size of the replication buffer appropriately.
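To get a feel for how much headroom a given buffer size provides, the following sketch estimates how long a buffer lasts when the source writes faster than the migration tool reads. The function name `seconds_until_full` and the example rates are illustrative assumptions, not measurements from the article.

```python
def seconds_until_full(limit_bytes: int, write_rate: float, drain_rate: float) -> float:
    """Seconds until the backlog reaches limit_bytes; inf if it never does."""
    growth = write_rate - drain_rate  # net bytes per second added to the buffer
    if growth <= 0:
        return float("inf")  # the reader keeps up, so the backlog does not grow
    return limit_bytes / growth


MB = 1024 * 1024
# Hypothetical case: 256 MB limit, source writes 10 MB/s, tool reads 5 MB/s.
print(seconds_until_full(256 * MB, 10 * MB, 5 * MB))
# If the tool reads as fast as the source writes, the buffer never fills.
print(seconds_until_full(256 * MB, 5 * MB, 5 * MB))
```

Under this simple model, either lowering the write rate or raising the limit extends the time available, which is the reasoning behind both parts of the recommendation.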

1.5 Migration performance reference

Environment: The source single-node Redis and the migration tool drs-redis are deployed on a HUAWEI CLOUD 8U32GB elastic cloud server. The target is a 4U16GB, 3-node GaussDB (for Redis) instance.

Scenario one:

− The source replication buffer uses the default limit (slave 268435456 67108864 60). With this default, the source actively disconnects from the slave node when the buffered backlog exceeds 268435456 bytes (256 MB), or when it stays above 67108864 bytes (64 MB) for 60 s.

− The source write rate is 5 MB/s, and the migration continues without any synchronization failure caused by a full source buffer.

− The data read rate of the migration tool is the same as the source write rate.
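The hard-limit and soft-limit rule behind the default value can be written as a small check. This is a simplified model of the Redis behaviour, assuming the caller already tracks how long the backlog has stayed above the soft limit; `OutputBufferLimit`, `parse_limit`, and `should_disconnect` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputBufferLimit:
    hard_bytes: int
    soft_bytes: int
    soft_seconds: int


def parse_limit(value: str) -> OutputBufferLimit:
    """Parse a value such as 'slave 268435456 67108864 60'."""
    _client_class, hard, soft, seconds = value.split()
    return OutputBufferLimit(int(hard), int(soft), int(seconds))


def should_disconnect(limit: OutputBufferLimit, backlog_bytes: int,
                      seconds_over_soft: float) -> bool:
    """Return True if the source would drop the slave connection."""
    # A limit of 0 disables the corresponding check.
    if limit.hard_bytes and backlog_bytes >= limit.hard_bytes:
        return True  # hard limit reached: disconnect immediately
    if limit.soft_bytes and backlog_bytes >= limit.soft_bytes:
        # Soft limit: disconnect only after staying above it long enough.
        return seconds_over_soft > limit.soft_seconds
    return False


default = parse_limit("slave 268435456 67108864 60")
print(should_disconnect(default, 300 * 1024 * 1024, 0))    # above the hard limit
print(should_disconnect(default, 100 * 1024 * 1024, 10))   # above soft, briefly
print(should_disconnect(default, 100 * 1024 * 1024, 61))   # above soft, too long
```

Setting all three values to 0 disables both checks, so the connection is never dropped because of buffer size.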

Scenario two:

− The source replication buffer is unrestricted (config set "client-output-buffer-limit" "slave 0 0 0").

− The source write rate is 10 MB/s, and the migration continues as long as capacity is sufficient.

− The data read rate of the migration tool is the same as the source write rate.

Conclusion: In the HUAWEI CLOUD environment, with the migration tool deployed on an 8U32GB elastic cloud server, the migration can proceed at a source write rate of 5 MB/s when the source replication buffer uses the default limit, and at a source write rate of 10 MB/s when the source places no limit on the replication buffer.

2. Migration from Kvrocks to GaussDB (for Redis)

Kvrocks is an open source NoSQL key-value database compatible with the Redis ecosystem. It is built on RocksDB and provides a namespace function to support data partitioning. Its cluster management is relatively weak, so building a cluster requires external components. Its coverage of Redis commands is also incomplete: for example, it lacks the stream and HyperLogLog data structures that are often used in message flow and statistics scenarios.

GaussDB (for Redis) is compatible with Redis Cluster and 100% compatible with native interfaces, so applications can use it directly without code changes. It can take over Kvrocks workloads while overcoming the weak cluster management and limited Redis compatibility of Kvrocks.

2.1 Migration principle

The migration uses the open source tool kvrocks2redis to migrate Kvrocks to GaussDB (for Redis). On top of this, GaussDB (for Redis) adapts to the Kvrocks namespace function at the source-code level.

The migration has two stages, full and incremental. When the migration starts, the full migration runs first: the tool takes a snapshot of Kvrocks and records the corresponding data version (seq), then parses the full data files into Redis commands and writes them to GaussDB (for Redis). Once the full migration completes, continuous incremental migration begins: the migration tool repeatedly sends the PSYNC command to Kvrocks and forwards the incremental data it receives to GaussDB (for Redis).
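The role of the recorded data version (seq) can be shown with a toy write log. This sketch is an assumption-laden illustration, not kvrocks2redis code: the log is a Python list, and `full_migration` and `incremental_batch` are hypothetical helpers. The point is that the seq marks the cutover, so each write is applied exactly once.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[int, str, str]  # (sequence number, key, value)


def full_migration(log: List[LogEntry], snapshot_seq: int) -> Dict[str, str]:
    """Build the snapshot state: every write with seq <= snapshot_seq."""
    state: Dict[str, str] = {}
    for seq, key, value in log:
        if seq <= snapshot_seq:
            state[key] = value
    return state


def incremental_batch(log: List[LogEntry], last_seq: int) -> List[LogEntry]:
    """Return writes newer than last_seq, as one polling round would fetch them."""
    return [entry for entry in log if entry[0] > last_seq]


log: List[LogEntry] = [(1, "k1", "a"), (2, "k2", "b")]
snapshot_seq = 2                      # data version recorded with the snapshot
target = full_migration(log, snapshot_seq)
last_seq = snapshot_seq

log += [(3, "k1", "c"), (4, "k3", "d")]  # writes that arrive after the snapshot
for seq, key, value in incremental_batch(log, last_seq):
    target[key] = value               # forward the write to the target
    last_seq = seq                    # remember progress for the next round

print(target, last_seq)
```

Because `last_seq` advances after every applied entry, a later polling round that finds nothing newer returns an empty batch and leaves the target unchanged.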

2.2 Prerequisites

  • Deploy kvrocks2redis on an independent host.
  • Ensure network connectivity between the source, the target, and the migration tool.
  • Back up the data of the source Kvrocks instance in advance.
  • Clear all data from the target GaussDB (for Redis) instance.

2.3 Operation steps

  1. Modify the migration tool configuration file: fill in the connection information for the source Kvrocks and the target GaussDB (for Redis), and the mapping from each source Kvrocks namespace to a target GaussDB (for Redis) DB.
  2. Check that the content of the configuration file is correct.
  3. Start the migration tool.
  4. Follow the log to confirm that the full migration completes successfully and that continuous incremental migration begins.
  5. Verify the data. Make sure that after the migration, the target GaussDB (for Redis) has loaded all the data correctly.
  6. After the business traffic has been switched to GaussDB (for Redis), stop the incremental migration by manually stopping the migration tool.

2.4 Instructions for use

  • kvrocks2redis extracts data from Kvrocks into local files, parses commands from those files, and sends them to the target GaussDB (for Redis). This process may affect source performance, but in theory there is no risk of data damage.
  • If a problem occurs while the migration tool is running, the tool stops automatically to make the problem easier to locate.
  • For security reasons, GaussDB (for Redis) does not provide commands that clear the database, so make sure the target contains no data before the migration starts.

3. Migration from Pika to GaussDB (for Redis)

Pika is a persistent, large-capacity Redis-compatible storage service. It addresses the capacity bottleneck that Redis hits when the stored data set is too large for memory. However, its cluster management is relatively weak: it relies on twemproxy or codis for static data sharding, and its data consistency is weak. Because all data is stored on disk, its performance is also significantly lower than that of the community version of Redis.

GaussDB (for Redis) separates hot and cold data, which solves the problem of coordinating access between a cache and a database (DB). When the amount of user data is smaller than the available memory, it achieves the same performance as the community version of Redis. Its proxy layer hides the data migration that the kernel performs during scale-out and scale-in, so upper-layer services are unaware of it.

3.1 Migration principle

The migration uses the open source tool pika-port to migrate Pika to GaussDB (for Redis). pika-port poses as a slave node of Pika and migrates data through master-slave replication. The Pika master node decides whether a full or an incremental migration is needed by comparing the binlog offset of pika-port with its own. If a full migration is required, the Pika master node sends a snapshot of the full data to pika-port, and pika-port parses the snapshot and sends the data to GaussDB (for Redis). After the full migration completes, incremental migration begins: pika-port parses the incremental data and sends it to GaussDB (for Redis) as Redis commands.
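The offset comparison can be pictured as a simple decision function. This is a simplified assumption about how such a check might work, not Pika's actual logic: offsets are modelled as (binlog file number, byte offset) pairs, and the master is assumed to serve incremental data only for offsets that fall within the binlog range it still retains. The name `choose_sync_mode` is hypothetical.

```python
from typing import Tuple

Offset = Tuple[int, int]  # (binlog file number, byte offset within the file)


def choose_sync_mode(replica: Offset, oldest_retained: Offset, current: Offset) -> str:
    """Return 'incremental' if the master can still serve the replica's offset."""
    # Tuples compare lexicographically: file number first, then byte offset.
    if oldest_retained <= replica <= current:
        return "incremental"
    return "full"


# The replica's position is still covered by the retained binlog.
print(choose_sync_mode((5, 100), (3, 0), (7, 42)))
# A fresh replica starts from an offset the master no longer retains.
print(choose_sync_mode((0, 0), (3, 0), (7, 42)))
```

In this model a newly deployed migration tool has no usable offset, so its first connection falls back to a full migration followed by incremental replication.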

3.2 Prerequisites

  • Deploy the migration tool pika-port.
  • Ensure that the source Pika instance, pika-port, and the target GaussDB (for Redis) instance can communicate with one another.

3.3 Operation steps

  1. Modify the pika-port configuration file and check that it is correct;
  2. Start the migration tool pika-port;
  3. Follow the logs to confirm that the full migration has completed, then stop the service and let the incremental migration process begin;
  4. After the incremental migration completes, verify the correctness and completeness of the migrated data;
  5. After verification, switch the business to GaussDB (for Redis).
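The verification in step 4 amounts to comparing the two data sets. The following minimal sketch shows one way to report differences, assuming the key-value pairs from the source and target have already been read into dictionaries (for large data sets, a sample of keys would be compared instead). `compare_stores` is a hypothetical helper, not a feature of pika-port.

```python
from typing import Dict, List, Mapping


def compare_stores(source: Mapping[str, str],
                   target: Mapping[str, str]) -> Dict[str, List[str]]:
    """Report keys that are missing, different, or unexpected on the target."""
    missing = sorted(k for k in source if k not in target)
    mismatched = sorted(k for k in source if k in target and source[k] != target[k])
    extra = sorted(k for k in target if k not in source)
    return {"missing": missing, "mismatched": mismatched, "extra": extra}


source_data = {"user:1": "alice", "user:2": "bob", "user:3": "carol"}
target_data = {"user:1": "alice", "user:2": "BOB", "user:4": "dave"}

report = compare_stores(source_data, target_data)
print(report)
```

Missing keys indicate incomplete data and mismatched values indicate incorrect data, so a migration passes this check only when all three lists are empty.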

3.4 Instructions for use

  • pika-port poses as a slave node of the source Pika and only reads the full and incremental data, so there is no risk of data damage.
  • The source takes on an additional master-slave synchronization process with pika-port, which may affect source performance.
  • The full and incremental migrations together can run without stopping the service. The service is suspended only briefly while the business is switched to GaussDB (for Redis).

3.5 Migration performance reference

Environment: Pika (single node) and pika-port are deployed together on a HUAWEI CLOUD 8U32GB elastic cloud server. The target is an 8U16GB, 3-node GaussDB (for Redis) instance.

Preset data: Use the memtier_benchmark tool to preset 200GB of data.

Migration performance: about 50,000 QPS.

4. Conclusion

Gauss Redis is based on the community version of Redis and combines it with DFV Pool, Huawei's self-developed strongly consistent storage. It offers strong consistency, scaling within seconds, ultra-high availability, and low cost, ensuring the accuracy and reliability of counting.

Author of this article: HUAWEI CLOUD Gauss Redis team.

For more technical articles, follow the official blog of Gauss Redis:

https://bbs.huaweicloud.com/community/usersnew/id_1614151726110813
