Author: Xiao Asia

A member of the DBA team of Aikesen, responsible for database failure and platform problem resolution in the project, with a special liking for database high availability and distributed technology.

Source of this article: original contribution

*The original content is produced by the open source community of Aikesheng, and the original content shall not be used without authorization. For reprinting, please contact the editor and indicate the source.


Background introduction

There have been several problems in the customer environment recently. After discussion, it was decided to make a schema change and migrate mycat to dble. The requirements are: minimal changes.

problem sorting

Here are some things to consider based on the client's situation and requirements:

  • parameter settings
  • Sharding function
  • data node data
  • Business SQL

problem handling

1. Parameter setting

Mycat side parameters are as follows:

 <system>
<property name="defaultSqlParser">druidparser</property>
<property name="processors">4</property>
<property name="processorBufferPool">20480000</property>
<property name="processorBufferLocalPercent">100</property>
<property name="frontSocketSoRcvbuf">10485760</property>
<property name="frontSocketSoSndbuf">41943040</property>
<property name="frontSocketNoDelay">1</property>
<property name="backSocketSoRcvbuf">41943040</property>
<property name="backSocketSoSndbuf">10485760</property>
<property name="backSocketNoDelay">1</property>
<property name="maxPacketSize">2048576000</property>
<property name="memoryPageSize">100m</property>
</system>

The DBLE side recommendations for this parameter are as follows:

Note: The memory management configuration of mycat and dble is quite different. For example: there is no threadlocal concept. It is recommended to read the following documentation: https://actiontech.github.io/dble-docs-cn/2.Function/2.07_memory_manager.html

2. Fragmentation function

View mycat sharding rules:

 <function name="mod-long" class="io.mycat.route.function.PartitionByMod">
<!-- how many data nodes -->
<property name="count">8</property>
</function>

Directly using DBLE's hash algorithm test, it is found that some data query errors are reported:

Check the data in the table:

Sample data: 20210810143211157000000000036

Field: user_code is varchar(32)

Check the mycat environment fragmentation rules: the fragmentation algorithm used is mod-long. The value range of long is -9223372036854774808~9223372036854774807, the query error 20210810143211157000000000036 exceeds the value range of long. So it can be concluded that mycat has done special treatment for this special case.

In this case, the DBLE side recommends the following:

1. It is recommended to replace the fragmentation algorithm with HashString

If the value of the sharding column exceeds the maximum value of Long (9223372036854774807), it needs to be replaced with a sharding algorithm of String type;

Note: Historical data needs to be re-imported after replacing the algorithm shards

2. Using DBLE's custom algorithm

Refer to the PartitionByMod algorithm of mycat and add it to [dble's custom algorithm https://actiontech.github.io/dble-docs-cn/1.config_file/1.09_dble_route_function_spec.html?q= ].

The final configuration is as follows:

 <function name="test_func" class="com.actiontech.dble.custom.sharding.algorithm.PartitionByMod">
    <property name="count">8</property>
</function>

3. Data node data

By introducing the custom splitting algorithm function of DBLE, it is compatible with the splitting algorithm on the original environment mycat, so the storage node data of the original mycat environment does not need to be changed. This greatly reduces our migration workload.

4. Business SQL

After several rounds of testing, we found that mycat often sends SQL directly to the back-end nodes, which caused us to encounter many problems during testing, because DBLE subdivided various situations.

In order to ensure the accuracy of business query data to the greatest extent, DBLE has cut a lot of such SQL support, but in order to minimize business changes, DBLE has also been appropriately opened.

Here are some records:

  • Support for insert into … select … syntax

    Currently DBLE has released support for this syntax for vertical tables.

  • Support for rownum passthrough

    At present, DBLE has released the transparent transmission of the specified table rownum.

  • UPDATE query with sub-query is not supported

    1.Update multi-node update will trigger the problem of distributed transactions, and it is difficult to ensure consistency

    2. According to the current execution plan, mycat uses the source sql to broadcast and deliver the update+ subquery (delivery to all nodes associated with the table). This mechanism cannot guarantee the correctness in some cases.

    3. Multi-table association update is not supported. As a middleware product, dble processes a certain type of SQL. The correct processing method for this type of SQL is to issue sub-queries first, recover the results, and then splicing the outer SQL for processing. It is currently difficult to analyze and implement this processing method, and performance issues also need to be considered.

    4. In order to ensure the correctness of sql such as update+subquery, it must not be done by broadcasting such as mycat, so this kind of sql is currently not supported

  • Global table + sharded table type SQL support

    ERROR 4004 (HY000): Unknown function FN_TREE_PATHNAME

    1. dble in version 2.20.04.x and later has been optimized to directly execute sql in this case.

    2. The global table + fragmentation table has not been processed before because there may be problems with direct delivery. For the special scenario of global table + fragmentation table: global table + vertical table, it is not calculated that these nodes use the same a node. (Actually it can be calculated, dble has been optimized in version 2.20.04.x and later)

    3. mycat will directly issue the statement to the node, and the global table will store the same data in each configured node. If each node and the result of the split table Left Join are combined with a simple UNION ALL, it will cause data corruption. Repeat, the accuracy of the data cannot be guaranteed. Therefore, it cannot be directly transmitted.

More

More enhancements to MyCat for DBLE are documented at: https://actiontech.github.io/dble-docs-cn/0.overview/0.2_dble_enhance_MyCat.html


爱可生开源社区
426 声望207 粉丝

成立于 2017 年,以开源高质量的运维工具、日常分享技术干货内容、持续的全国性的社区活动为社区己任;目前开源的产品有:SQL审核工具 SQLE,分布式中间件 DBLE、数据传输组件DTLE。


引用和评论

0 条评论