Author: Xiao Asia
A member of the DBA team of Aikesen, responsible for database failure and platform problem resolution in the project, with a special liking for database high availability and distributed technology.
Source of this article: original contribution
*The original content is produced by the open source community of Aikesheng, and the original content shall not be used without authorization. For reprinting, please contact the editor and indicate the source.
Background introduction
There have been several problems in the customer environment recently. After discussion, it was decided to make a schema change and migrate mycat to dble. The requirements are: minimal changes.
problem sorting
Here are some things to consider based on the client's situation and requirements:
- parameter settings
- Sharding function
- data node data
- Business SQL
problem handling
1. Parameter setting
Mycat side parameters are as follows:
<system>
<property name="defaultSqlParser">druidparser</property>
<property name="processors">4</property>
<property name="processorBufferPool">20480000</property>
<property name="processorBufferLocalPercent">100</property>
<property name="frontSocketSoRcvbuf">10485760</property>
<property name="frontSocketSoSndbuf">41943040</property>
<property name="frontSocketNoDelay">1</property>
<property name="backSocketSoRcvbuf">41943040</property>
<property name="backSocketSoSndbuf">10485760</property>
<property name="backSocketNoDelay">1</property>
<property name="maxPacketSize">2048576000</property>
<property name="memoryPageSize">100m</property>
</system>
The DBLE side recommendations for this parameter are as follows:
- defaultSqlParser , memoryPageSize , processorBufferLocalPercent have been deprecated in dble, no need to configure
- ProcessorBufferPool parameter name changed to bufferPoolPageSize
- Other parameters in dble can keep the same configuration information as mycat in server.xml. Details of some parameters can be found here: https://github.com/actiontech/dble-docs-cn/blob/2.19.11.0/tag/1 .config_file/1.03_server.xml.md
Note: The memory management configuration of mycat and dble is quite different. For example: there is no threadlocal concept. It is recommended to read the following documentation: https://actiontech.github.io/dble-docs-cn/2.Function/2.07_memory_manager.html
2. Fragmentation function
View mycat sharding rules:
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
<!-- how many data nodes -->
<property name="count">8</property>
</function>
Directly using DBLE's hash algorithm test, it is found that some data query errors are reported:
Check the data in the table:
Sample data: 20210810143211157000000000036
Field: user_code is varchar(32)
Check the mycat environment fragmentation rules: the fragmentation algorithm used is mod-long. The value range of long is -9223372036854774808~9223372036854774807, the query error 20210810143211157000000000036 exceeds the value range of long. So it can be concluded that mycat has done special treatment for this special case.
In this case, the DBLE side recommends the following:
1. It is recommended to replace the fragmentation algorithm with HashString
If the value of the sharding column exceeds the maximum value of Long (9223372036854774807), it needs to be replaced with a sharding algorithm of String type;
Note: Historical data needs to be re-imported after replacing the algorithm shards
2. Using DBLE's custom algorithm
Refer to the PartitionByMod algorithm of mycat and add it to [dble's custom algorithm https://actiontech.github.io/dble-docs-cn/1.config_file/1.09_dble_route_function_spec.html?q= ].
The final configuration is as follows:
<function name="test_func" class="com.actiontech.dble.custom.sharding.algorithm.PartitionByMod">
<property name="count">8</property>
</function>
3. Data node data
By introducing the custom splitting algorithm function of DBLE, it is compatible with the splitting algorithm on the original environment mycat, so the storage node data of the original mycat environment does not need to be changed. This greatly reduces our migration workload.
4. Business SQL
After several rounds of testing, we found that mycat often sends SQL directly to the back-end nodes, which caused us to encounter many problems during testing, because DBLE subdivided various situations.
In order to ensure the accuracy of business query data to the greatest extent, DBLE has cut a lot of such SQL support, but in order to minimize business changes, DBLE has also been appropriately opened.
Here are some records:
Support for insert into … select … syntax
Currently DBLE has released support for this syntax for vertical tables.
Support for rownum passthrough
At present, DBLE has released the transparent transmission of the specified table rownum.
UPDATE query with sub-query is not supported
1.Update multi-node update will trigger the problem of distributed transactions, and it is difficult to ensure consistency
2. According to the current execution plan, mycat uses the source sql to broadcast and deliver the update+ subquery (delivery to all nodes associated with the table). This mechanism cannot guarantee the correctness in some cases.
3. Multi-table association update is not supported. As a middleware product, dble processes a certain type of SQL. The correct processing method for this type of SQL is to issue sub-queries first, recover the results, and then splicing the outer SQL for processing. It is currently difficult to analyze and implement this processing method, and performance issues also need to be considered.
4. In order to ensure the correctness of sql such as update+subquery, it must not be done by broadcasting such as mycat, so this kind of sql is currently not supported
Global table + sharded table type SQL support
ERROR 4004 (HY000): Unknown function FN_TREE_PATHNAME
1. dble in version 2.20.04.x and later has been optimized to directly execute sql in this case.
2. The global table + fragmentation table has not been processed before because there may be problems with direct delivery. For the special scenario of global table + fragmentation table: global table + vertical table, it is not calculated that these nodes use the same a node. (Actually it can be calculated, dble has been optimized in version 2.20.04.x and later)
3. mycat will directly issue the statement to the node, and the global table will store the same data in each configured node. If each node and the result of the split table Left Join are combined with a simple UNION ALL, it will cause data corruption. Repeat, the accuracy of the data cannot be guaranteed. Therefore, it cannot be directly transmitted.
More
More enhancements to MyCat for DBLE are documented at: https://actiontech.github.io/dble-docs-cn/0.overview/0.2_dble_enhance_MyCat.html
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。