About This article is about the sharing of "Alibaba Cloud Enterprise Autonomous Database RDS Detailed Explanation" by Song Libing, senior technical expert of Alibaba Cloud Database in the 2021 Cloud Home Conference-Enterprise Cloud Native Database Best Practice Forum.

5d88a1bab10ecc516ce869948649d721.jpeg

This article will introduce the enterprise-level autonomous database system from two aspects.

  • RDS MySQL product
  • RDS MySQL self-developed kernel

33.png

1. RDS MySQL products

1) Alibaba Cloud RDS-cloud native autonomous database

image001.png

Alibaba Cloud RDS is the earliest RDS service in China, based on AliSQL, the MySQL branch of Alibaba. After Alibaba completes its IOE goal, Alibaba's entire e-commerce business is supported by RDS. The reliability and stability of RDS have been verified in extremely demanding real-life scenarios such as Double Eleven. After more than ten years of development, in addition to considerable development in terms of stability and performance, Alibaba Cloud RDS is also firmly moving towards cloud-native and intelligent. Now the overall RDS architecture is based on cloud-native K8S for deployment and Management, the bottom layer relies on Alibaba Cloud's high-performance ECS and high-throughput ESSD distributed cloud storage, which truly separates computing and storage. Based on the technology of artificial intelligence and the experience of experts, the RDS autonomy capability of DAS is realized.

2) RDS MySQL product-high service availability

image003.png

From a product perspective, high availability is a classic architecture in the MySQL ecosystem. Through binlog replication, RDS provides high availability across availability zones, which can achieve four nines of availability and second-level switching.

image005.png

In addition to the classic two-node architecture, RDS also provides a three-node high-availability architecture based on the majority agreement. In the three-node high-availability architecture, including two data nodes and one log node, the main and standby data nodes, the RPO of the user's data 0 loss can be equal to 0. This replication does not use MySQL native binlog replication, but uses the majority protocol (Consensuse) implemented by itself for binlog distribution.

image007.png

In addition to hot standby, RDS also provides cold standby capabilities. RDS can periodically back up full data and real-time binlog, and upload the data to OSS. If there is a data recovery requirement, you can quickly perform data recovery through OSS.

3) RDS MySQL product-resource pooling and cloud native

image009.png

Under the architecture of separation of computing and storage, RDS instances can provide up to 32TiB of storage. The full data backup is carried out in the form of snapshots. No matter how large the user data is, it can be backed up in seconds, so that the user is truly insensitive. Based on the second-level data snapshot capability, RDS achieves minute-level instance creation. The creation of a read-only instance can be completed in about two or three minutes.

4) RDS MySQL product-automatic read-write separation

image011.png

Read-only nodes are a classic architecture in MySQL. By creating read-only instances to expand read capabilities, Alibaba Cloud's RDS provides an intermediary to allow user applications to automatically achieve read-write separation without modification. Providing multi-address read-write separation allows different services to be accessed with different addresses. The advantage is that the services will not affect each other and the address isolation performance is very good.

5) RDS MySQL product-enterprise-level data security

image013.png

Nowadays, more and more attention is paid to security. Many users on the cloud have requirements for compliance and auditing. RDS provides users with full-link encryption capabilities and security capabilities such as audit logs. In addition, users can also enjoy Security facilities at the network level and operating system level of Alibaba Cloud. On the whole, we can do very strict safety compliance requirements before, during and after the event.

6) RDS MySQL product-overall architecture

image015.png

RDS is not only the MySQL kernel, but also many modules to serve users. Users only need to control them through the console, or create or manage instances through OpenAPI, and RDS automatically completes the rest.

2. RDS MySQL self-developed kernel

1) Practicability: online solidification of SQL execution plan

image017.png

When using RDS online, users will encounter a situation where SQL is particularly slow. For some reasons, SQL may not select the optimal execution plan. MySQL provides the function of hint. Users can add hints in the SQL statement to prompt whether MySQL should use an index. , Or whether to open a certain optimizer strategy. Improve the efficiency of SQL execution through Hint. Why is this feature rarely used by users? Because when SQL was found to be slow, the application was already running online. Modifying the SQL in the application at this time may require a long process or even impossible to modify. SQL Outline allows users to add Hints to SQL statements on the RDS server side. Only need to call a stored procedure. For example, to add an Index Hint, you only need to specify the SQL statement and the corresponding index strategy in add\_index\_outline. When doing rule matching, the value in the Where condition will be automatically ignored, and the same statement with different values will match this rule. The user's application does not need to make any changes, and the rules will take effect immediately after adding the rules. This is a very practical function.

2) Practicality: diagnosable and measurable

image019.png

We have added a lot of monitoring and diagnostic information, which can be divided into three categories: instance level, object level, and statement level.

image021.png

There are many indicators at the instance level. The indicators in MySQL and the system are stored in the table in seconds, so that users can easily query through MySQL statements. Including the instance's CPU usage, memory usage, disk IO, the number of server layer connections, network access, and a large amount of status information in InnoDB. RDS records these status information in seconds. When a problem occurs, it is easy to accurately locate the problem through the change of the status.

image023.png

The statistics of table usage and index usage are added at the object level, which provides a basis for us to adjust the data structure.

image025.png

Statement level MySQL has a statement level statistics table, RDS adds some very useful information to this table. For example: CPU time used by the statement, MDL and InnoDB lock waiting time, Mutex spin and wait statistics, Read/Write IO statistics. These statistical items help users locate SQL problems quickly and accurately.

3) Stability: Buffer Pool optimization

image027.png

Cloud native needs to be able to adjust online specifications, and Buffer Pool is a very important resource. When the specifications and memory change, the Buffer Pool also needs to change. MySQL provides the ability to adjust the size of the Buffer Pool online. In the test, we found that MySQL Buffer Pool Resize will have a certain impact on business traffic. The frequency and amplitude of service traffic jitter are large (green line). Alibaba Cloud RDS has optimized Buffer Pool Resize. After optimization, the impact of Buffer Pool Resize on business traffic is much better (blue line).

4) Stability: Concurrency control

image029.png

Some SQL statements frequently encountered online will use a lot of CPU resources or memory resources. If you don't limit it, you may consume CPU and memory resources and cause the entire instance to become unstable. Concurrency control This feature can be used to limit the number of concurrency for a particular SQL. Concurrency control strategies can be divided into three dimensions: operation type, operation object, and keywords. Operation types refer to four types: SELECT, INSERT, UPDATE, and DELETE. The operation object refers to the library and the table. Concurrency control is similar to SQL Outline, and is configured on the RDS server side.

5) Security: transparent data encryption

image031.png

Transparent encryption supports AES encryption algorithm and national secret algorithm SM4. Because some units have compliance requirements, they must use the national secret algorithm.

6) Security: Recycle Bin

image033.png

The recycle bin is a tool to quickly retrieve table-level data. When a user deletes a table or Truncate table, the table is not directly deleted from the disk, but placed in the recycle bin. The user can set how long to automatically delete the table in the background automatically. When a misoperation, wrong deletion or emptying of the table occurs, the table can be quickly retrieved from the recycle bin.

7) Security: Flashback Query

image035.png

Flashback Query provides the ability to quickly retrieve rows of data. The row-level data retrieval function uses historical data in Undo. RDS can create a snapshot of historical data in seconds. RDS provides a time-based snapshot query mechanism, which traces historical data back to a specified time through Undo records. When a misoperation occurs, or there is a need to back files, the user can query historical data at a specified time point through SELECT.

8) High performance-Binlog In Redo

image037.png

MySQL needs to persist the Redo and Binlog files when the transaction is committed. In order to ensure that Crash Safe, the two files must be persisted in sequence, which has a great impact on performance. RDS will write Binlog to Redo at the same time, so when the transaction is committed, only the Redo file needs to be persisted. Binlog files only need to be asynchronously persisted in the background. This function can significantly improve the write performance while ensuring data consistency.

image039.png

The above figure is the performance test result of this function, there is a 30%-40% performance peak improvement, and even more than 100% performance improvement in the case of small concurrency.

9) High performance-DDL optimization

image041.png

When doing non-Instant DDL operations, a large number of Pages of DDL tables are often processed. MySQL accomplishes this operation by scanning the LRU linked list of the Buffer Pool. LRU contains all the data pages in the Buffer Pool, especially for large-sized instances, it takes a long time to scan the LRU once, and it will also affect the execution of other SQL statements. After RDS is optimized, DDL can directly hit its Page, no longer need to scan LRU. It can not only improve performance but also maintain the stability of the instance. The performance test in the above figure is to do an Export Table when there is business traffic. The execution time of Export Table has been reduced from 80 seconds to 0.34 seconds.

image043.png

The above picture is a test of the Optimize Table. When there is business traffic, the Optimize Table is made. This table has 600MB of data. The optimization of DDL has increased the performance of OPTIMIZE TABLE by more than ten times, from the original 220 seconds to 17 seconds.

10) High performance-Faster Query Cache

image045.png

We refactored the MySQL-based Query Cache, and made a lot of changes to concurrency control, memory management, and caching mechanism. The performance is improved when the hit rate is high, and there is nothing to the performance when the hit rate is low. Impact, this feature can be turned on by default.

11) RDS MySQL self-developed kernel: enterprise-level three-node

image047.png

Binlog is transmitted based on the self-developed majority protocol. The leader is responsible for transmitting Binlog to the log node or follower node, and after reaching the majority, the transaction is submitted on the leader node. The three nodes choose their own leaders to form an autonomous system. During the whole process, there is no data loss and the RTO is 0.

Copyright Notice: content of this article is contributed spontaneously by Alibaba Cloud real-name registered users. The copyright belongs to the original author. The Alibaba Cloud Developer Community does not own its copyright and does not assume corresponding legal responsibilities. For specific rules, please refer to the "Alibaba Cloud Developer Community User Service Agreement" and the "Alibaba Cloud Developer Community Intellectual Property Protection Guidelines". If you find suspected plagiarism in this community, fill in the infringement complaint form to report it. Once verified, the community will immediately delete the suspected infringing content.

阿里云开发者
3.2k 声望6.3k 粉丝

阿里巴巴官方技术号,关于阿里巴巴经济体的技术创新、实战经验、技术人的成长心得均呈现于此。


引用和评论

0 条评论