Introduction
This article contains study notes from the MOOC.com video course "High Concurrency, High Performance and High Availability MySQL in Action", organized and expanded with personal notes. This section covers the remaining two parts of the three-high architecture: expansion and switching. Expansion refers to sharding (splitting databases and tables) to reduce pressure on the database; because sharding introduces the problem of node downtime, some optimization measures are needed, and the switching part deals with how to switch over when a node goes down. Finally, we describe how to combine this with replication's master-slave switching to build a three-high architecture.
If the content is difficult to follow, you can catch up with the "How MySQL Works" personal reading-notes column.
The address is as follows: Learn MySQL from scratch.
Expansion
Partition Table
An InnoDB partitioned table splits one table into multiple physical tables. Note that this is a different concept from the physical table splitting done in sharding: although InnoDB divides the data at the storage-engine level, the Server layer still treats a partitioned table as a single table.
The construction of the partition table can be seen in the following cases:
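As a minimal sketch, a RANGE-partitioned table can be created like this (the table name `t_order` and its columns are hypothetical, chosen only for illustration):

```sql
-- Hypothetical order table partitioned by year of creation.
-- The Server layer sees one table; InnoDB stores one tablespace per partition.
CREATE TABLE t_order (
    id         BIGINT NOT NULL,
    created_at DATETIME NOT NULL,
    amount     DECIMAL(10, 2),
    PRIMARY KEY (id, created_at)  -- the partition column must be part of every unique key
)
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```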
To verify that the Server layer treats it as a single table, you can check from the command line with the following command. In the screenshot you can see that although it appears to be one table on the surface, the InnoDB storage engine actually splits it into four tables:
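The same fact can also be confirmed with a query against `information_schema` (using a hypothetical partitioned table `t_order` as an example):

```sql
-- Each row returned is one physical partition managed by InnoDB,
-- even though SHOW TABLES lists t_order only once.
-- On disk these correspond to files like t_order#p#p2020.ibd in the data directory.
SELECT PARTITION_NAME, PARTITION_METHOD, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 't_order';
```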
InnoDB supports the following partitioning methods:
- Range partitioning: partition by the value range of a column.
- Hash partitioning: partition by a hash of the partition key.
- List partitioning: partition by an enumerated list of column values.
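As a rough sketch, the methods differ only in the PARTITION BY clause (the table and column names below are invented for illustration):

```sql
-- Hash partitioning: rows are spread over 4 partitions by a hash of the key.
CREATE TABLE t_user_hash (
    id BIGINT NOT NULL PRIMARY KEY
)
PARTITION BY HASH (id) PARTITIONS 4;

-- List partitioning: each partition holds an enumerated set of values.
CREATE TABLE t_user_list (
    id        BIGINT NOT NULL,
    region_id INT NOT NULL,
    PRIMARY KEY (id, region_id)
)
PARTITION BY LIST (region_id) (
    PARTITION p_north VALUES IN (1, 2, 3),
    PARTITION p_south VALUES IN (4, 5, 6)
);
```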
A partition table has the following characteristics:
- Each partition has a shallower B+ tree, which improves search efficiency.
- The first access needs to open all partitions.
- All partitions share a common MDL lock, which means a lock on the partitioned table applies to every partition at once.
- Since a partitioned table is just a table to the server layer, partitioning does not actually improve performance.
Sub-library and sub-table
Strictly speaking, sharding should be divided into splitting databases (sub-database) and splitting tables (sub-table). Database splitting is generally done less often; more common is splitting tables according to the business. Table splitting is usually done in the following ways:
Vertical table splitting: dividing the columns of a table, for example separating hot and cold data, according to certain rules.
Horizontal table splitting: splitting a table by data rows, distributing the real rows across multiple tables to prevent any single table from growing too large. Internally, range values or hash values are used to locate table data. Horizontal splitting is the most commonly used approach.
Note that the table splitting described in this section is different from the partitioned tables above: at the Server layer, these split tables are treated as genuinely separate tables, not as one table.
The concept of database splitting is not used as much nowadays. It covers the following:
Vertical database splitting: data is distributed across multiple databases or multiple nodes.
Horizontal database splitting: tables are divided according to specific business rules, each database is responsible for its own main business, and the basic structural configuration of each database is the same. A typical scenario for horizontal database splitting is placing newer and older data in different databases for lookup, similar to the structure diagram below.
What are the advantages and disadvantages of sub-database sub-table?
advantage:
- Increased isolation and concurrency: the data is seen as separate databases and tables at the Server layer.
- Although it looks very similar to a partitioned table, it is completely different in nature.
shortcoming:
- Partial failures multiply: with more nodes, there are more places for partial failure to appear.
- Single-node transactions are no longer possible; distributed transactions or distributed locks must be introduced for control.
- After vertical splitting, cross-shard join queries cannot be performed, so much more SQL has to be written by hand.
- Range queries across shards become problematic.
Dble and Mycat
Introduction: these two middleware products are the mainstream MySQL sharding middleware on the market. Mycat may be the more familiar one, while Dble is further optimized and extended on the basis of Mycat.
Basic operating principle:
- Parse the incoming SQL statement.
- Distribute the query to multiple databases and tables according to the middleware's routing algorithm, sending it to the relevant data nodes.
- Aggregate and merge the results from the data nodes, and finally return them to the client.
Dble: a high-performance MySQL sharding middleware developed by the Chinese company ActionTech (爱可生); it can fairly be called a highlight of domestic open source. The project is fully open source and is optimized and improved on the basis of another open-source project, Mycat. It is written mainly in Java, so most developers can try to dig into it to solve practical problems.
The design of Dble is shown below. To the client, connecting to the sharded cluster is no different from connecting to an ordinary MySQL instance; this is possible because a series of optimizations is performed inside Dble:
Basic concepts of Dble:
- Schema: Virtual database (different from traditional database schema).
- ShardingTable: Virtual table, which divides the data algorithmically through the virtual table.
- ShardingNode: a virtual node corresponding to a database inside a MySQL instance; one database can be considered one node.
- dbGroup: The actual Mysql cluster.
- Database: Indicates the actual Database.
Finally, let's take a look at what Dble does through a simple table splitting case:
Dble's sub-database and sub-table feature is that both sub-database and sub-table are implemented using sub-table.
As the figure above shows, our physical table is treated by Dble as a sharding table. The virtual table is first distributed across two MySQL nodes. Mysql1 and Mysql2 have no relationship with each other and do not know of each other's existence (unlike the master-master architecture discussed in the previous article). Dble creates virtual nodes in the actual databases on these two nodes, and a horizontal sharding algorithm distributes queries to the different databases. The table here looks small because in practice there may be other virtual nodes inside; Dble splits the data and distributes it across the different MySQL instances to manage. To Mysql1 and Mysql2 these virtual nodes are just ordinary data: MySQL itself has no particular awareness of them and sees no difference from ordinary data. The real merging of data is done by Dble.
Dble installation, construction and use
For the specific operation and use, please refer to the official document: Introduction · Dble manual (actiontech.github.io) , the installation process is skipped here. We focus on the configuration of Dble:
First of all Dble has several important configuration files:
- cluster.cnf : cluster parameter configuration
- bootstrap.cnf : instance parameter configuration, including JVM startup parameters, Dble performance, scheduled tasks, ports, etc.
- user.xml : Dble user configuration
- db.xml : database related configuration
- sharding.xml : data splitting related configuration
Note that all of these files exist as templates in the Dble installation directory. Apart from the parts that must be modified to match your own MySQL setup, you only need to rename them to remove the `_template` suffix.
The next step is to modify db.xml. In this file you need to adjust the configuration for your own database nodes; for example, the boxed lines in the screenshot below must be changed to match your own MySQL nodes.
If the screenshot is not clear, you can refer to the official template below; the places that need changing are marked and fairly easy to understand.
<?xml version="1.0"?>
<Dble:db xmlns:Dble="http://Dble.cloud/">
<dbGroup name="dbGroup1" rwSplitMode="1" delayThreshold="100">
<heartbeat errorRetryCount="1" timeout="10">show slave status</heartbeat>
<dbInstance name="instanceM1" url="ip4:3306" user="your_user" password="your_psw" maxCon="200" minCon="50" primary="true">
<property name="testOnCreate">false</property>
<property name="testOnBorrow">false</property>
<property name="testOnReturn">false</property>
<property name="testWhileIdle">true</property>
<property name="connectionTimeout">30000</property>
<property name="connectionHeartbeatTimeout">20</property>
<property name="timeBetweenEvictionRunsMillis">30000</property>
<property name="idleTimeout">600000</property>
<property name="heartbeatPeriodMillis">10000</property>
<property name="evictorShutdownTimeoutMillis">10000</property>
</dbInstance>
<!-- can have multi read instances -->
<dbInstance name="instanceS1" url="ip5:3306" user="your_user" password="your_psw" maxCon="200" minCon="50" primary="false">
<property name="heartbeatPeriodMillis">60000</property>
</dbInstance>
</dbGroup>
</Dble:db>
The next step is to modify user.xml. Note that there are two roles, managerUser and shardingUser: the former is an administrator account used to manage Dble itself, while the latter serves client requests and needs privileges on the sharded tables.
<?xml version="1.0" encoding="UTF-8"?>
<Dble:user xmlns:Dble="http://Dble.cloud/">
<managerUser name="man1" password="654321" whiteIPs="127.0.0.1,0:0:0:0:0:0:0:1" readOnly="false"/>
<managerUser name="user" usingDecrypt="true" readOnly="true" password="AqEkFEuIFAX6g2TJQnp4cJ2r7Yc0Z4/KBsZqKhT8qSz18Aj91e8lxO49BKQElC6OFfW4c38pCYa8QGFTub7pnw==" />
<shardingUser name="root" password="123456" schemas="testdb" readOnly="false" blacklist="blacklist1" maxCon="20"/>
<shardingUser name="root2" password="123456" schemas="testdb,testdb2" maxCon="20" tenant="tenant1">
<privileges check="true">
<schema name="testdb" dml="0110">
<table name="tb01" dml="0000"/>
<table name="tb02" dml="1111"/>
</schema>
</privileges>
</shardingUser>
<!--rwSplitUser not work for now-->
<rwSplitUser name="rwsu1" password="123456" dbGroup="dbGroup1" blacklist="blacklist1"
maxCon="20"/>
<blacklist name="blacklist1">
<property name="selectAllow">true</property>
</blacklist>
</Dble:user>
Finally, let's look at the core configuration file, sharding.xml. According to the official introduction, it has the following three main elements:
- schema (virtual schema, multiple can be configured)
- shardingNode (virtual shard, multiple can be configured)
- function (split algorithm, multiple can be configured)
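As a hedged sketch of how these three elements fit together (the element names follow the official template, but the schema name, node names, and algorithm parameters below are invented for illustration):

```xml
<?xml version="1.0"?>
<Dble:sharding xmlns:Dble="http://Dble.cloud/">
    <!-- Virtual schema seen by the client. -->
    <schema name="testdb">
        <!-- Rows of tb01 are routed to dn1/dn2 by a hash of the id column. -->
        <shardingTable name="tb01" shardingNode="dn1,dn2"
                       function="func_hash" shardingColumn="id"/>
    </schema>
    <!-- Each virtual shard maps to a real database inside a dbGroup. -->
    <shardingNode name="dn1" dbGroup="dbGroup1" database="db_1"/>
    <shardingNode name="dn2" dbGroup="dbGroup1" database="db_2"/>
    <!-- Split algorithm; parameters are illustrative values. -->
    <function name="func_hash" class="Hash">
        <property name="partitionCount">2</property>
        <property name="partitionLength">512</property>
    </function>
</Dble:sharding>
```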
Supported partitioning algorithms: currently hash, stringhash, enum, numberrange, patternrange, date and jumpstringhash. For the details of each algorithm, please read the relevant section of the documentation; I will not go into them here.
There are relatively few public case studies at the moment, so rather than demonstrating Dble sharding here, I will simply point to a hands-on blog post from the practice part that you can come back to when needed:
Dble sub-database sub-table actual combat_Li Rulei's technology blog_51CTO Blog
How to improve the performance of sub-database and sub-table?
To improve sharding performance, the first thing that comes to mind is to build multiple master-slave node groups, combining replication with Dble.
Secondly, you can configure read-write separation on Dble; refer to the official documentation for the read-write separation configuration.
Problems with sub-database and sub-table
Dble's automatic management of sharding actually has the following problems; since Dble is built on Mycat, sharding setups based on either will have similar issues:
Query statements should carry the split field whenever possible:
- Dble locates the data node based on the split field.
- Without it, Dble cannot determine the node and can only traverse all nodes, which turns the sharded query into a negative optimization.
Insert statements must also carry the split field:
- Dble decides which node the data lands on based on the split field.
Prefer equality conditions on the split field:
- Range conditions on the split field cause too many nodes to be scanned.
- When using IN, keep the number of values in the IN list small.
Reduce full-shard traversal:
The following all hurt performance:
- Queries without the split field.
- Distinct, group by, order by.
Reduce the result set:
- Data movement between nodes hurts query performance.
- A distributed setup leads to a large amount of data interaction among nodes.
Join tables across nodes:
- Tables that are frequently joined need to be split according to fixed rules.
- Use split field as join condition.
- Try to add more filter conditions to the drive table.
- Minimize pagination of data.
- Complex statements are split into simple statements.
The above content is summarized as follows:
- Reduce data interaction.
- Data addition, deletion and modification need to add split fields.
- Join keys for split processing.
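The rules above can be made concrete with a few hedged examples, assuming a hypothetical table `t_order` sharded on the split field `user_id`:

```sql
-- Good: equality on the split field, so the middleware routes to exactly one node.
SELECT * FROM t_order WHERE user_id = 42;

-- Acceptable: IN on the split field with few values touches only a few nodes.
SELECT * FROM t_order WHERE user_id IN (42, 43);

-- Bad: no split field, so every node must be scanned and the results merged.
SELECT * FROM t_order WHERE amount > 100;

-- Inserts must carry the split field so the row can be routed to a node.
INSERT INTO t_order (id, user_id, amount) VALUES (1, 42, 9.99);
```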
Switching
The core question of switching: do we protect the business, or protect the data?
How to switch identities:
- Stop replication on the standby.
- Configure the old master to replicate from the new master.
- After this, the two instances are peers and can operate independently.
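Sketched as the SQL you would run by hand (instance names A and B follow the text; host and credentials are placeholders, and newer MySQL versions use CHANGE REPLICATION SOURCE TO instead of CHANGE MASTER TO):

```sql
-- On B (the standby being promoted): stop replicating from A and open writes.
STOP SLAVE;
RESET SLAVE ALL;
SET GLOBAL read_only = OFF;

-- On A (the old master): become a replica of B.
SET GLOBAL read_only = ON;
CHANGE MASTER TO
    MASTER_HOST = 'host_of_B',      -- hypothetical address
    MASTER_USER = 'repl',           -- hypothetical replication account
    MASTER_PASSWORD = 'repl_password',
    MASTER_AUTO_POSITION = 1;       -- requires GTID mode
START SLAVE;
```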
Switching strategies:
In terms of strategy, there are the following two approaches:
- Reliability-first strategy: seconds_behind_master must not be too large, i.e. the standby must not lag too far behind. Library A is switched to read-only, at which point the business can read but not write new data. When seconds_behind_master = 0, the two libraries are in sync; then A stops, B stops replicating A, and A starts replicating B. "Reliable" here means the data is reliable, but it does not guarantee the business is unaffected, since there is a window in which writes are blocked, and the biggest risk comes from the downtime itself.
- Availability-first strategy: skip the wait for data consistency. A becomes read-only, B's read-only is lifted, B stops replicating A, and A starts replicating B. The advantage is that the system has no unwritable window; the disadvantage is that if the relay log has not been fully replayed at switch time, data inconsistency can easily result.
For most ordinary businesses, try to use the reliability-first strategy. The availability-first strategy can be considered for businesses where high availability is strictly required and a certain amount of data loss or inconsistency is acceptable, for example a logging pipeline.
How does the business switch over?
- Reserve an interface that notifies clients of the new database address.
- Have the microservice framework notify the business, e.g. through the registry.
- Use internal DNS: connect by domain name and refresh the DNS entry after switching (Kubernetes uses this approach).
- Use keepalived for VIP drift, with failure detection triggering the switch.
- Proxy mode: add a layer of load-balancing proxies.
- Active/standby switching inside Dble.
Automatic master-slave switching
How does keepalived switch between active and standby?
Keepalived is a frequently used middleware that handles identity switching and VIP drift in automatic master-slave switching. A VIP (Virtual IP Address) is a building block of HA (high availability) systems. The goal of high availability is to use technical means to avoid stopping external service when the system fails; the usual implementation is to deploy a standby server that takes over the business when the main server fails. The VIP gives clients a fixed "virtual" address to access, shielding them from back-end server switches.
Keepalived is designed to manage VIPs, so using Keepalived to implement VIP configuration is very simple. Keepalived uses the Virtual Router Redundancy Protocol (VRRP) protocol to implement communication and election between the active and standby servers.
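A minimal keepalived.conf sketch for such a VRRP-managed VIP (the interface name, router id, password, and addresses are all placeholders):

```
# Runs on the primary; the standby uses state BACKUP and a lower priority.
vrrp_instance VI_MYSQL {
    state MASTER
    interface eth0            # NIC that carries the VIP
    virtual_router_id 51      # must match on both nodes
    priority 100              # higher priority wins the election
    advert_int 1              # VRRP advertisement interval in seconds
    authentication {
        auth_type PASS
        auth_pass mysql_ha
    }
    virtual_ipaddress {
        192.168.1.100         # the VIP clients connect to
    }
}
```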
Keepalived's master-slave switching works roughly as follows: when, for example, the MysqlA service goes down, traffic automatically switches to the standby A' server, and a new node is elected internally through an election algorithm.
How does MHA (Master High Availability) perform active/standby switchover?
MHA is another commonly used MySQL high-availability component, completing master-slave switchover with its own self-developed switching mechanism. The component was developed by a Facebook engineer and supports GTID mode, under which it downloads the binlog from the crashed server. But MHA's biggest limitation is that it cannot perform VIP drift by itself.
As the figure below shows, in MHA's working mechanism, if node A's service is found to be down, MHA immediately logs in to node A's server to salvage the binlog files. However, replication has a data-synchronization problem: semi-synchronous replication has a timeout after which binlog transmission degrades to asynchronous, so the binlog may not have been fully transmitted, and the other nodes' data may be incomplete and out of sync.
It is worth mentioning that MHA does not access the downed node directly through the client; instead it waits for the slave nodes' data to be flushed, then reaches the master from a slave to salvage the binlog. These behaviors largely come from the designers' day-to-day operational experience with such failures, which is why the tool is so well liked by developers.
After the binlog-salvage task completes, MHA's next step is to elect a new master. Note that it does not try to restart and recover the downed node with a script, because without human intervention that usually achieves little, and even a successful restart may leave the data inconsistent.
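For reference, an MHA deployment is described by an application config file along these lines (hostnames, paths, and accounts below are placeholders, not from the source):

```
# Hypothetical /etc/masterha/app1.cnf for a 1-master, 2-slave group.
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
user=mha                      # MySQL account MHA uses to monitor nodes
password=mha_password
repl_user=repl                # replication account used after failover
repl_password=repl_password
ssh_user=root                 # account used to log in and salvage binlogs

[server1]
hostname=node_a
candidate_master=1            # may be promoted to master

[server2]
hostname=node_b
candidate_master=1

[server3]
hostname=node_c
no_master=1                   # never promoted, read-only slave
```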
Three-high system construction
Before building the three highs, we need to ask: why does a cluster sometimes still go down even after it has been built? The reason is that no mature system relies solely on a single open-source component; instead, a large number of fault-tolerance mechanisms are added in the middle layers to prevent the crash of one component from causing large-scale losses. This involves the concept of DRDS (Distributed Relational Database Service), a distributed MySQL cluster. Such a cluster is usually not a native MySQL cluster, but a set of middleware that together maintains the basic three-high characteristics.
Next, let's take a step-by-step look at how the three-high system is built:
Sharding has a single-point problem: when Dble itself fails, the whole service becomes unavailable. We will discuss Dble's single-point problem below; here, MHA and Dble are linked together to handle MySQL node downtime.
For most developers, it is rare to encounter this in day-to-day practice, so what follows is a general construction process and some basic configuration, for study when needed:
The following is the enhanced structure diagram of Mha combined with DBLE for the entire architecture:
Through the combination of Mha and dble, when the node is down, the node can be switched through Mha to ensure the normal operation of the sub-database and sub-table.
But we also find that Dble itself is a single point, so Dble also needs cluster load balancing to prevent the whole entry point from becoming unavailable. Dble's load distribution can be handled by HAProxy combined with ZooKeeper.
HAProxy is a proxy software that provides high availability, load balancing and applications based on TCP (Layer 4) and HTTP (Layer 7), supports virtual hosting, and is a free, fast and reliable solution. HAProxy is especially useful for heavily loaded web sites that often require session persistence or Layer 7 processing. HAProxy runs on current hardware and can fully support tens of thousands of concurrent connections. And its mode of operation makes it easy and secure to integrate into your current architecture, while protecting your web server from being exposed to the network.
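A hedged haproxy.cfg fragment for balancing TCP traffic across two Dble instances (the addresses and the 8066 service port are placeholders for your own deployment):

```
# Layer-4 (TCP) balancing in front of the Dble service port.
listen dble_cluster
    bind *:8066
    mode tcp
    balance roundrobin
    # 'check' enables health checks; a failed node is taken out of rotation.
    server dble1 10.0.0.11:8066 check inter 2000 rise 2 fall 3
    server dble2 10.0.0.12:8066 check inter 2000 rise 2 fall 3
```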
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these types of services are used by distributed applications in some form. Every time these services are implemented, a lot of work goes into fixing the inevitable bugs and race conditions. Due to the difficulty of implementing such services, applications often ignore them initially, making them brittle and unmanageable in the face of change. Even done correctly, different implementations of these services can lead to management complexity when the application is deployed.
Finally, the client finds the gateway's entrance through keepalive's VIP drift, and finds the corresponding dble after the election and distribution. The dble then sub-databases and sub-tables to find the relevant data for processing, and finally finds the relevant Mysql node to summarize the data and returns it to the client. When there is a problem with the Mysql node, Mha will re-elect the master and slave nodes through a series of scripts to rescue the binlog file.
So far, the complete structure of a three-high distributed MySQL cluster is in place. Drawing the text above as a structure diagram, we can see it is still quite complex:
Summary
This part described the characteristics of MySQL partitioned tables and a domestic open-source sharding middleware. In practice, sharding is basically unnecessary for most small and medium projects; it usually appears in relatively large system architectures. Dble performs well as a domestic open-source component: it improves on Mycat and is written in Java, which suits the preferences of web developers.
In the switching part we covered two more components, MHA and Keepalived. Both are referenced by a large number of resources and case studies, so this was only a brief introduction to how they combine with MySQL to strengthen the high availability of the cluster architecture.
Closing thoughts
The three-high architecture looks complex and imposing, but once we break it into components, the role of each one is quite clear. Of course, operations engineers are more likely to work with this material directly; for developers, a basic understanding of the process and theory is enough.