Foreword

Amazon Aurora is a relational database service developed by Amazon Cloud Technology. It is fully compatible with the open source databases MySQL and PostgreSQL, while providing performance and availability comparable to commercial databases. With the same configuration, Aurora MySQL delivers up to five times the throughput of standard open source MySQL, and Aurora PostgreSQL up to three times the throughput of standard open source PostgreSQL. For scalability, Amazon Aurora has made enhancements and innovations along both the storage and compute dimensions, horizontally and vertically.

Amazon Aurora now supports up to 128 TB of storage per cluster and can dynamically shrink the storage volume. On the compute side, Amazon Aurora supports up to 15 read replicas in a region, a multi-master architecture with up to 4 write nodes in the same region, Aurora Serverless for instance-level scale-up in seconds, and Global Database for low-latency cross-region expansion.

As data volumes grow, Amazon Aurora already offers good scalability. Can it be enhanced further to handle even larger data volumes and more concurrent access? One option is sharding, splitting databases and tables across multiple underlying Amazon Aurora clusters. A series of blogs, including this one, will explore this approach, aiming to provide a reference for proxy selection when you split databases and tables.

This blog focuses on how to build a sharded database cluster with the open source Gaea proxy, a middleware tool for splitting databases and tables. It covers database and table sharding, read/write splitting, association queries across sharded tables, and automatic failover detection.

Introduction to Gaea

Gaea is an open source database middleware based on the MySQL protocol, developed for an Internet e-commerce business. Gaea supports database/table sharding, SQL routing, read/write splitting, hot reloading of configuration, connection pooling, custom SQL interception and filtering, and other basic features. It also supports joins between sharded tables and global tables within a sharded database, and supports multiple sharded tables sharing the same routing rules.

For more details, see the latest documentation on GitHub: https://github.com/XiaoMi/Gaea

1. Design Architecture Diagram

[Figure: Gaea design architecture]

Gaea contains four modules: Gaea-proxy, Gaea-cc, Gaea-agent, and Gaea-web. Gaea-proxy is the online proxy that receives SQL traffic. Gaea-cc is the central control module, responsible for Gaea-proxy configuration management and some background tasks. Gaea-agent is deployed on the machines where MySQL runs and handles instance creation, management, recovery, and similar work. Gaea-web is a management UI that makes Gaea more convenient to operate overall.

2. Basic Concepts

cluster

Clusters are divided according to business importance. A cluster can contain multiple Gaea-proxy instances; the cluster a proxy belongs to is determined by the cluster_name in the configuration file Gaea-proxy reads at startup. Proxy instances in a cluster only serve the namespaces in that cluster, which provides physical isolation. A cluster can serve multiple namespaces.

namespace

Each business system corresponds to one namespace, and one namespace corresponds to multiple databases, among which the business side can switch freely. In principle, each namespace belongs to exactly one cluster; the cluster a namespace belongs to is specified through the Gaea-cc configuration management interface.

slice

A shard, in the sense of a logical grouping of backends; one slice contains one MySQL master and multiple slaves.

shard

Table sharding rules, which determine how a table is split, including the sharding type and the locations of the shards.

proxy

Refers to the proxy itself, which carries online traffic.

gaea_cc

The proxy control module, mainly responsible for configuration distribution, instance monitoring, and so on.

3. Deployment Architecture Diagram

In production, you can refer to the following deployment architecture diagram, which proxies multiple MySQL clusters with centralized configuration management. There is no difference between the Gaea core cluster and an ordinary cluster in the figure.

[Figure: Gaea deployment architecture]

Environment Setup

1. Amazon Aurora cluster

First, create three Aurora MySQL clusters of instance class db.r5.2xlarge, each with one writer node and one reader node (a CLI sketch follows the screenshot).

[Figure: the three Aurora MySQL clusters in the console]
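For reference, a hedged AWS CLI sketch for creating one of the three clusters; the identifiers, password, and region are illustrative, and the same steps repeat for shard-2 and shard-3:

aws rds create-db-cluster \
  --db-cluster-identifier shard-1 \
  --engine aurora-mysql \
  --master-username root \
  --master-user-password Pass1234 \
  --region ap-southeast-1

# The first instance in a cluster becomes the writer; additional
# instances join as readers.
aws rds create-db-instance \
  --db-instance-identifier shard-1-instance-1 \
  --db-cluster-identifier shard-1 \
  --db-instance-class db.r5.2xlarge \
  --engine aurora-mysql

aws rds create-db-instance \
  --db-instance-identifier shard-1-instance-2 \
  --db-cluster-identifier shard-1 \
  --db-instance-class db.r5.2xlarge \
  --engine aurora-mysql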

2. Build the Gaea proxy

1) Install the Go language environment

Refer to https://go.dev/doc/install
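On Amazon Linux, the install boils down to the following (the Go version is illustrative; the build log below happens to show go1.17.6):

wget https://go.dev/dl/go1.17.6.linux-amd64.tar.gz
sudo rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.17.6.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
go version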

2) Download the source code and compile it

git clone https://github.com/XiaoMi/Gaea.git

cd Gaea && make

After completion, gaea and gaea-cc will appear in the bin directory

[ec2-user@ip-172-31-29-68 bin]$ ls

gaea gaea-cc

3. Proxy configuration

The Gaea configuration consists of two parts. The local configuration is used directly by Gaea-proxy and generally does not need to change at runtime. Gaea is multi-tenant; each tenant is called a namespace. Namespace configuration can change at runtime and is generally stored in etcd. The demonstration below uses a JSON file instead of etcd. For detailed configuration instructions, refer to the documentation on GitHub.

1) Prepare the local basic configuration file, which holds Gaea's global configuration. Place it in $Gaea_Home/etc. Refer to the following example:

; Configuration type; currently file/etcd are supported. file mode does not
; support hot reloading, but is a quick way to try out the features
; file mode: namespace config files are read from file_config_path
; etcd mode: namespace config files are read from coordinator_addr/cluster_name
config_type=file
; file config path: the namespace configs go in the namespace directory under
; file_config_path; that subdirectory name is fixed
file_config_path=/data/gaea/Gaea/etc/file
; Configuration center address; currently only etcd is supported
;coordinator_addr=http://127.0.0.1:2379
; Configuration center username and password
;username=test
;password=test
; Environment: test or online
environ=test
;service name
service_name=gaea_proxy
; gaea_proxy: name of the cluster this proxy belongs to
cluster_name=gaea_default_cluster
; Logging configuration
log_path=/data/gaea/Gaea/log
log_level=Notice
log_filename=gaea
log_output=file
; Admin address
admin_addr=0.0.0.0:13399
;basic auth
admin_user=admin
admin_password=admin
; Proxy service listen address
proto_type=tcp4
proxy_addr=0.0.0.0:13306
; Default charset
proxy_charset=utf8
; Slow SQL threshold, in milliseconds
slow_sql_time=100
; Idle session timeout, in seconds
session_timeout=3600
; Metrics/statistics configuration
stats_enabled=true
stats_interval=10
; encrypt key, used to encrypt/decrypt namespace configs stored in etcd
encrypt_key=1234abcd5678efg*
; server_version: version string the server reports to clients
server_version=5.7-gaea
;auth plugin mysql_native_password or caching_sha2_password or ''
; Custom auth plugin, supporting 5.x and 8.x authentication; with
; caching_sha2_password, older clients cannot authenticate
auth_plugin=mysql_native_password


2) Prepare the sharding configuration file

Gaea supports multi-tenancy: each tenant has a separate namespace, and each namespace has its own configuration file.

Gaea supports two sharding schemes, kingshard and mycat. This test uses kingshard's hash method to shard tables.

The namespace configuration file must be placed in the etc/file/namespace directory under the Gaea build directory. The configuration below does the following:

  1. Enables both the kingshard and mycat sharding schemes at the same time (only kingshard is exercised in this test)
  2. Configures the front-end user that accepts requests on Gaea (gaea) and the back-end user that connects to Amazon Aurora (root)
  3. Enables read/write splitting
  4. Defines the data sharding rules:
  • tbl_ks is sharded across three physical clusters (slices) with kingshard's hash method, using the id field as the shard key
  • Each of the 3 physical slices holds one shard table (locations 1, 1, 1)
  • tbl_ks2 is configured as a child table linked to tbl_ks (the proxy automatically routes child-table rows to the same slice per the parent table's sharding rule, so joins and association queries stay local)
  • tbl_global is configured as a global table (every slice holds the full data of this table)
  • SQL that does not touch a sharded table is sent to slice-1 by default ("default_slice": "slice-1")
 {
    "name": "gaea_namespace_1",
    "online": true,
    "read_only": true,
    "allowed_dbs": {
        "db_ks": true,
        "db_mycat": true
    },
    "default_phy_dbs": {
        "db_ks": "db_ks",
        "db_mycat": "db_mycat"
    },
    "slow_sql_time": "1000",
    "black_sql": [
        ""
    ],
    "allowed_ip": null,
    "slices": [
        {
            "name": "slice-1",
            "user_name": "root",
            "password": "Pass1234",
            "master": "shard-1.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com:3306",
            "slaves": ["shard-1.cluster-ro-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com:3306"],
            "capacity": 32,
            "max_capacity": 1024,
            "idle_timeout": 3600
        },
        {
            "name": "slice-2",
            "user_name": "root",
            "password": "Pass1234",
            "master": "shard-2.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com:3306",
            "slaves": ["shard-2.cluster-ro-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com:3306"],
            "capacity": 32,
            "max_capacity": 1024,
            "idle_timeout": 3600
        },
        {
            "name": "slice-3",
            "user_name": "root",
            "password": "Pass1234",
            "master": "shard-3.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com:3306",
            "slaves": ["shard-3.cluster-ro-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com:3306"],
            "capacity": 32,
            "max_capacity": 1024,
            "idle_timeout": 3600
        }
    ],
    "shard_rules": [
        {
            "db": "db_ks",
            "table": "tbl_ks",
            "type": "hash",
            "key": "id",
            "locations": [
                1,
                1,
                1
            ],
            "slices": [
                "slice-1",
                "slice-2",
                "slice-3"
            ]
        },
        {
           "db": "db_ks",
           "table": "tbl_ks2",
           "type": "linked",
           "parent_table": "tbl_ks",
           "key": "id"
        },
        {
            "db": "db_ks",
            "table": "tbl_global",
            "type": "global",
            "locations": [
                1,
                1,
                1
            ],
            "slices": [
                "slice-1",
                "slice-2",
                "slice-3"
            ]
        }
    ],
    "global_sequences": [

    ],
    "users": [
        {
            "user_name": "gaea",
            "password": "gaea_password",
            "namespace": "gaea_namespace_1",
            "rw_flag": 2,
            "rw_split": 1
        }
    ],
    "default_slice": "slice-1",
    "open_general_log": false,
    "max_sql_execute_time": 5000,
    "max_sql_result_size": 100000
}


4. Start Proxy

Start the gaea binary from the bin directory, specifying the path to the static configuration file:

[root@ip-172-31-29-68 etc]# ../bin/gaea -config ./test.ini
Build Version Information:
Version: fbac80acdd922a3d563bc703994f7f9145c2d41b
GitRevision: fbac80acdd922a3d563bc703994f7f9145c2d41b
User: root@ip-172-31-29-68.ap-southeast-1.compute.internal
GolangVersion: go1.17.6
BuildStatus: Clean
BuildTime: 2022-02-09--07:13:01
BuildBranch: master
BuildGitDirty: 0
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
using env: export GIN_MODE=release
using code: gin.SetMode(gin.ReleaseMode)
[GIN-debug] GET /api/proxy/ping --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).ping-fm (2 handlers)
[GIN-debug] PUT /api/proxy/config/prepare/:name --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).prepareConfig-fm (2 handlers)
[GIN-debug] PUT /api/proxy/config/commit/:name --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).commitConfig-fm (2 handlers)
[GIN-debug] PUT /api/proxy/namespace/delete/:name --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).deleteNamespace-fm (2 handlers)
[GIN-debug] GET /api/proxy/config/fingerprint --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).configFingerprint-fm (2 handlers)
[GIN-debug] GET /api/proxy/stats/sessionsqlfingerprint/:namespace --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).getNamespaceSessionSQLFingerprint-fm (2 handlers)
[GIN-debug] GET /api/proxy/stats/backendsqlfingerprint/:namespace --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).getNamespaceBackendSQLFingerprint-fm (2 handlers)
[GIN-debug] DELETE /api/proxy/stats/sessionsqlfingerprint/:namespace --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).clearNamespaceSessionSQLFingerprint-fm (2 handlers)
[GIN-debug] DELETE /api/proxy/stats/backendsqlfingerprint/:namespace --> github.com/XiaoMi/Gaea/proxy/server.(*AdminServer).clearNamespaceBackendSQLFingerprint-fm (2 handlers)
[GIN-debug] GET /api/metric/metrics --> github.com/gin-gonic/gin.WrapH.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/ --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/cmdline --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/profile --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] POST /debug/pprof/symbol --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/symbol --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/trace --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/block --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/goroutine --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/heap --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/mutex --> github.com/gin-gonic/gin.WrapF.func1 (2 handlers)
[GIN-debug] GET /debug/pprof/threadcreate --> github.com/gin-gonic/gin.WrapF.func1


5. Verify the connection

Connect to the proxy with the username and password specified in the dynamic configuration file. Because the configuration above enables both the kingshard and mycat sharding schemes, after logging in you will see two pre-configured databases on the proxy by default:

 [root@ip-172-31-29-68 namespace]# mysql -ugaea -pgaea_password -P13306 -h172.31.29.68
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 10001
Server version: 5.6.20-gaea MySQL Community Server (GPL)
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)]> show databases;
+----------+
| Database |
+----------+
| db_ks    |
| db_mycat |
+----------+
2 rows in set (0.00 sec)


Function Test

1. Sharding verification

Because Gaea's DDL support is limited, the official documentation recommends creating the corresponding databases in the back-end databases in advance and then applying the rules in the proxy. Since we use kingshard's hash rule to shard tables, we pre-create the corresponding database and tables in each Amazon Aurora cluster. Note: the shard table suffixes must start from 0000. A sketch applying this across all three clusters follows the statement below.

CREATE TABLE IF NOT EXISTS tbl_ks_0000 (
    id INT UNSIGNED AUTO_INCREMENT,
    col1 VARCHAR(100) NOT NULL,
    col2 VARCHAR(40) NOT NULL,
    col_date DATE,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

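The statement above has to be applied on every cluster, with the table suffix advancing per slice. A minimal sketch, assuming the slice-to-suffix mapping observed in the verification below (slice N hosts suffix 000(N-1)):

#!/bin/bash
# Pre-create db_ks and one shard table per Aurora cluster.
for i in 1 2 3; do
  host="shard-${i}.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com"
  suffix=$(printf "%04d" $((i - 1)))
  mysql -uroot -pPass1234 -h"$host" -e "
    CREATE DATABASE IF NOT EXISTS db_ks;
    CREATE TABLE IF NOT EXISTS db_ks.tbl_ks_${suffix} (
      id INT UNSIGNED AUTO_INCREMENT,
      col1 VARCHAR(100) NOT NULL,
      col2 VARCHAR(40) NOT NULL,
      col_date DATE,
      PRIMARY KEY (id)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;"
done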

[Figure: databases and shard tables created on each cluster]

  • Insert data and view

[Figure: inserting test data through the proxy]
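The statements behind that screenshot were along these lines (a reconstruction; the values are inferred from the query results below, and rows are inserted one at a time so each routes to its own shard):

mysql -ugaea -pgaea_password -P13306 -h172.31.29.68 db_ks <<'SQL'
insert into tbl_ks (id, col1, col2, col_date) values (1, 'name1', 'addr1', '2022-02-11');
insert into tbl_ks (id, col1, col2, col_date) values (2, 'name2', 'addr2', '2022-02-11');
insert into tbl_ks (id, col1, col2, col_date) values (3, 'name3', 'addr3', '2022-02-11');
insert into tbl_ks (id, col1, col2, col_date) values (4, 'name4', 'addr4', '2022-02-11');
SQL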

MySQL [db_ks]> select * from tbl_ks;
+----+-------+-------+------------+
| id | col1  | col2  | col_date   |
+----+-------+-------+------------+
|  3 | name3 | addr3 | 2022-02-11 |
|  1 | name1 | addr1 | 2022-02-11 |
|  4 | name4 | addr4 | 2022-02-11 |
|  2 | name2 | addr2 | 2022-02-11 |
+----+-------+-------+------------+
4 rows in set (0.03 sec)


  • Log in to the 3 Amazon Aurora clusters to check the data and verify that rows are hash-distributed by id:
[root@ip-172-31-29-68 ec2-user]# mysql -uroot -pPass1234 -hshard-1.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com -P3306 -e "select * from db_ks.tbl_ks_0000"
+----+-------+-------+------------+
| id | col1  | col2  | col_date   |
+----+-------+-------+------------+
|  3 | name3 | addr3 | 2022-02-11 |
+----+-------+-------+------------+
[root@ip-172-31-29-68 ec2-user]# mysql -uroot -pPass1234 -hshard-2.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com -P3306 -e "select * from db_ks.tbl_ks_0001"
+----+-------+-------+------------+
| id | col1  | col2  | col_date   |
+----+-------+-------+------------+
|  1 | name1 | addr1 | 2022-02-11 |
|  4 | name4 | addr4 | 2022-02-11 |
+----+-------+-------+------------+
[root@ip-172-31-29-68 ec2-user]# mysql -uroot -pPass1234 -hshard-3.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com -P3306 -e "select * from db_ks.tbl_ks_0002"
+----+-------+-------+------------+
| id | col1  | col2  | col_date   |
+----+-------+-------+------------+
|  2 | name2 | addr2 | 2022-02-11 |
+----+-------+-------+------------+

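The distribution matches a simple modulo hash: with three slices of one table each, an integer id lands in tbl_ks_(id % 3). A sketch of that routing arithmetic (our reading of kingshard's hash rule for integer keys, inferred from the output above rather than from Gaea's source):

# id % 3 picks the table index; with locations [1,1,1] the index maps 1:1 to a slice.
for id in 1 2 3 4; do
  t=$((id % 3))
  printf "id=%d -> tbl_ks_%04d on slice-%d\n" "$id" "$t" "$((t + 1))"
done
# Output: id=1 -> tbl_ks_0001 on slice-2; id=2 -> tbl_ks_0002 on slice-3;
#         id=3 -> tbl_ks_0000 on slice-1; id=4 -> tbl_ks_0001 on slice-2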

2. Global table test

A global table holds completely consistent data on every slice (more precisely, on every DB of every slice), which makes certain cross-shard queries convenient. If a small table needs to join against a sharded table, using a global table is recommended. The corresponding entry in shard_rules is as follows:

{
    "db": "db_ks",
    "table": "tbl_global",
    "type": "global",
    "locations": [
        1,
        1,
        1
    ],
    "slices": [
        "slice-1",
        "slice-2",
        "slice-3"
    ]
}


The test process is as follows:

  • Create a table in the Amazon Aurora backend database

As before, because Gaea's DDL support is limited, you need to create the table in all back-end databases before applying the rule.

CREATE TABLE IF NOT EXISTS tbl_global (
    id INT UNSIGNED AUTO_INCREMENT,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


  • Connect to the proxy and insert data

Inserting one row reports 3 rows affected, because the write fans out to every slice:

 MySQL [db_ks]> insert into tbl_global (id) values (1);
Query OK, 3 rows affected (0.02 sec)
MySQL [db_ks]> insert into tbl_global (id) values (2);
Query OK, 3 rows affected (0.02 sec)
MySQL [db_ks]> insert into tbl_global (id) values (3);
Query OK, 3 rows affected (0.02 sec)


  • Log in to each Amazon Aurora cluster to check that the 3 rows of data were inserted
 [ec2-user@ip-172-31-29-68 ~]$ mysql -uroot -pPass1234 -hshard-1.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com -e "select * from db_ks.tbl_global"
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+


  • Test a join between the global table and a sharded table
 MySQL [db_ks]> select * from tbl_global,tbl_ks where tbl_ks.id=tbl_global.id;
+----+----+-------+-------+------------+
| id | id | col1  | col2  | col_date   |
+----+----+-------+-------+------------+
|  2 |  2 | name2 | addr2 | 2022-02-11 |
|  3 |  3 | name3 | addr3 | 2022-02-11 |
|  1 |  1 | name1 | addr1 | 2022-02-11 |
+----+----+-------+-------+------------+
3 rows in set (0.05 sec)


  • Validate with the execution plan
 MySQL [db_ks]> explain select * from tbl_global,tbl_ks where tbl_ks.id=tbl_global.id;
+-------+---------+-------+--------------------------------------------------------------------------------------------+
| type  | slice   | db    | sql                                                                                        |
+-------+---------+-------+--------------------------------------------------------------------------------------------+
| shard | slice-1 | db_ks | SELECT * FROM (`tbl_global`) JOIN `tbl_ks_0000` WHERE `tbl_ks_0000`.`id`=`tbl_global`.`id` |
| shard | slice-2 | db_ks | SELECT * FROM (`tbl_global`) JOIN `tbl_ks_0001` WHERE `tbl_ks_0001`.`id`=`tbl_global`.`id` |
| shard | slice-3 | db_ks | SELECT * FROM (`tbl_global`) JOIN `tbl_ks_0002` WHERE `tbl_ks_0002`.`id`=`tbl_global`.`id` |
+-------+---------+-------+--------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)


3. Global table data consistency test

The conclusion first: writes to the global table through the proxy are not atomic across slices.

If the global table on shard 1 already contains id=1 and the other two shards do not, inserting id=1 through the proxy reports an error, yet shards 2 and 3 still write the row locally. You therefore need to check periodically whether the global table's data is consistent across shards; a sketch of such a check follows the demonstration below.

# Manually insert id=4 on shard 1 to create an inconsistency
mysql -uroot -pPass1234 -hshard-1.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com -e "insert into db_ks.tbl_global values (4)"
# Inserting id=4 through the proxy now reports an error
MySQL [db_ks]>  insert into tbl_global (id) values (4);
ERROR 1105 (HY000): unknown error: execute in InsertPlan error: ERROR 1062 (23000): Duplicate entry '4' for key 'PRIMARY'
# Yet the row id=4 shows up in the global table on shards 2 and 3
[ec2-user@ip-172-31-29-68 ~]$ mysql -uroot -pPass1234 -hshard-2.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com -e "select * from db_ks.tbl_global"
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
+----+

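A minimal sketch of such a periodic check (a hypothetical script; it compares a checksum of the global table across the three shards, and any mismatch means the table has drifted):

#!/bin/bash
# CHECKSUM TABLE output is "db_ks.tbl_global <checksum>"; grab column 2.
for i in 1 2 3; do
  host="shard-${i}.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com"
  sum=$(mysql -uroot -pPass1234 -h"$host" -N -e "CHECKSUM TABLE db_ks.tbl_global" | awk '{print $2}')
  echo "shard-${i}: checksum=${sum}"
done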

4. Join queries on parent and child tables

1) Verify the data distribution of the child table
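As with the other tables, the child table's physical shards have to exist on every backend before the linked rule can be exercised. A sketch, assuming a minimal single-column schema matching the inserts below:

#!/bin/bash
# Pre-create tbl_ks2's shard tables on each cluster, same suffix scheme as tbl_ks.
for i in 1 2 3; do
  host="shard-${i}.cluster-c3luzotrlr7v.ap-southeast-1.rds.amazonaws.com"
  suffix=$(printf "%04d" $((i - 1)))
  mysql -uroot -pPass1234 -h"$host" -e "
    CREATE TABLE IF NOT EXISTS db_ks.tbl_ks2_${suffix} (
      id INT UNSIGNED PRIMARY KEY
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;"
done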

# Connect to the proxy and manually insert 3 rows of data into tbl_ks2
MySQL [db_ks]> insert into tbl_ks2 (id) values (1);
Query OK, 1 row affected (0.01 sec)
MySQL [db_ks]>
MySQL [db_ks]> insert into tbl_ks2 (id) values (2);
Query OK, 1 row affected (0.01 sec)
MySQL [db_ks]> insert into tbl_ks2 (id) values (3);
Query OK, 1 row affected (0.01 sec)

MySQL [db_ks]> explain select * from tbl_ks2;
+-------+---------+-------+------------------------------+
| type  | slice   | db    | sql                          |
+-------+---------+-------+------------------------------+
| shard | slice-1 | db_ks | SELECT * FROM `tbl_ks2_0000` |
| shard | slice-2 | db_ks | SELECT * FROM `tbl_ks2_0001` |
| shard | slice-3 | db_ks | SELECT * FROM `tbl_ks2_0002` |
+-------+---------+-------+------------------------------+
3 rows in set (0.00 sec)


The execution plan above confirms that the child table is sharded according to the parent table's rules: the id field hashes to the same slice as in the parent table.

2) Verify a join query between the parent table and the child table

# First, look at the parent table's data
MySQL [db_ks]> select * from tbl_ks;
+----+----------+-------+------------+
| id | col1     | col2  | col_date   |
+----+----------+-------+------------+
|  2 | name2    | addr2 | 2022-02-11 |
|  3 | name3    | addr3 | 2022-02-11 |
|  1 | failover | addr1 | 2022-02-11 |
|  4 | name4    | addr4 | 2022-02-11 |
+----+----------+-------+------------+
4 rows in set (0.04 sec)


Join test:

 MySQL [db_ks]> select * from tbl_ks,tbl_ks2 where tbl_ks.id=tbl_ks2.id;
+----+----------+-------+------------+----+
| id | col1     | col2  | col_date   | id |
+----+----------+-------+------------+----+
|  3 | name3    | addr3 | 2022-02-11 |  3 |
|  1 | failover | addr1 | 2022-02-11 |  1 |
|  2 | name2    | addr2 | 2022-02-11 |  2 |
+----+----------+-------+------------+----+
MySQL [db_ks]> explain select * from tbl_ks,tbl_ks2 where tbl_ks.id=tbl_ks2.id;
+-------+---------+-------+------------------------------------------------------------------------------------------------+
| type  | slice   | db    | sql                                                                                            |
+-------+---------+-------+------------------------------------------------------------------------------------------------+
| shard | slice-2 | db_ks | SELECT * FROM (`tbl_ks_0001`) JOIN `tbl_ks2_0001` WHERE `tbl_ks_0001`.`id`=`tbl_ks2_0001`.`id` |
| shard | slice-3 | db_ks | SELECT * FROM (`tbl_ks_0002`) JOIN `tbl_ks2_0002` WHERE `tbl_ks_0002`.`id`=`tbl_ks2_0002`.`id` |
| shard | slice-1 | db_ks | SELECT * FROM (`tbl_ks_0000`) JOIN `tbl_ks2_0000` WHERE `tbl_ks_0000`.`id`=`tbl_ks2_0000`.`id` |
+-------+---------+-------+------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)


The execution results above show that the parent and child tables can be joined on the sharding key, with each join executing locally on its slice.

5. Verify read/write splitting

Write a script to verify that read requests are all routed to the reader node for execution:

#!/bin/bash
# Issue a read-only query through the proxy every 0.1 s; with rw_split
# enabled, these selects should all be routed to the Aurora reader.
while true; do
    mysql -ugaea -pgaea_password -P13306 -h172.31.29.68 -e "use db_ks;select * from tbl_ks;"
    sleep 0.1
done


As the figure below shows, all read requests are executed on the replica node, and the writer node receives no query traffic.

[Figure: query traffic appears only on the reader node]

6. Failure recovery verification

This test verifies whether, when Aurora fails over, the proxy can automatically recognize the failover and recover without configuration changes.

The test script is as follows:

[root@ip-172-31-29-68 ec2-user]# cat test_failover.sh
#!/bin/bash
# Update one row through the proxy every second and log the time,
# making any write outage during failover easy to spot.
while true
do
    mysql -ugaea -pgaea_password -P13306 -h172.31.29.68 -e "use db_ks; update tbl_ks set col1='failover' where id = 1;"
    now=$(date +"%T")
    echo "update done: $now"
    sleep 1
done


Run the script, then choose Actions → Failover on the Amazon Aurora cluster's writer node to trigger an automatic switchover of the writer and reader nodes. During the switchover, the cluster's read/write and read-only endpoints remain unchanged, but the nodes they map to underneath change.

[Figure: triggering failover from the Amazon RDS console]
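The same failover can also be triggered from the CLI (cluster identifier illustrative):

aws rds failover-db-cluster --db-cluster-identifier shard-1 --region ap-southeast-1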

The Events panel shows Amazon Aurora completing the failover within about 30 seconds.

[Figure: Aurora events showing the failover completed]

Judging from the script output, however, the proxy keeps reporting errors and cannot automatically detect the failover event:

 ERROR 1105 (HY000) at line 1: unknown error: execute in UpdatePlan error: ERROR 1290 (HY000): The MySQL server is running with the --read-only option so it cannot execute this statement
update done: 11:52:09
ERROR 1105 (HY000) at line 1: unknown error: execute in UpdatePlan error: ERROR 1290 (HY000): The MySQL server is running with the --read-only option so it cannot execute this statement


The proxy has to be restarted before writes resume normally; a sketch of the recovery options follows.
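A restart is what we verified; in principle, the namespace could instead be re-pushed through the admin API seen in the startup log (prepare, then commit), which should rebuild the backend connection pools. A sketch using the admin credentials from the local configuration (the hot-reload path is an assumption, not something we tested):

# Option 1: restart the proxy process (verified in this test).
pkill gaea
cd /data/gaea/Gaea/etc && ../bin/gaea -config ./test.ini &

# Option 2 (assumption): hot-reload the namespace via the admin API.
curl -u admin:admin -X PUT http://127.0.0.1:13399/api/proxy/config/prepare/gaea_namespace_1
curl -u admin:admin -X PUT http://127.0.0.1:13399/api/proxy/config/commit/gaea_namespace_1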

Epilogue

This article extends Amazon Aurora with sharding and read/write splitting capabilities by way of database middleware. With Gaea, this solution provides connection pooling, SQL compatibility, cross-shard routing, read/write splitting, joins between sharded tables and global tables, and association queries between parent and child tables. The drawback is that Gaea cannot detect an Aurora failover and recover automatically; the proxy must be restarted, or its configuration manually refreshed, to recover.

In the future, we will continue this series with evaluations and expansions of other middleware.

Author of this article


Zhang Zhenwei

Amazon Cloud Technology APN Solution Architect

Mainly responsible for partner architecture consulting and solution design, he is dedicated to the application and promotion of Amazon Cloud Technology services in China, and is skilled in data migration, database tuning, and data analysis.

