1. Introduction

Aurora Serverless is an on-demand autoscaling configuration of Amazon Aurora. Aurora Serverless v2 scales database workloads to hundreds of thousands of transactions in a fraction of a second. It adjusts capacity in fine-grained increments to provide the right amount of database resources for the application. You don't need to manage database capacity, and you pay only for the resources your application consumes. Amazon Aurora first offered a serverless option (V1) back in 2018.

Aurora Serverless V2 is a significant step up from V1. The key improvement is that capacity is expanded in place, which cuts scaling time from minutes in V1 to seconds. V2 also adjusts capacity at a finer granularity, in increments of 0.5 ACU (V1 scales by doubling capacity), and it adjusts capacity along multiple dimensions through continuous monitoring while keeping the buffer pool as large as possible. Compared with V1, Aurora Serverless v2 adds the full Amazon Aurora feature set, including Multi-AZ support, read replicas, and global databases, so it supports cross-AZ and cross-region high-availability deployment as well as read scaling.

Amazon Aurora Serverless v2 suits a variety of applications. For rapid business growth and large-scale multi-tenancy, an enterprise with hundreds of thousands of applications, or a software-as-a-service (SaaS) provider running a multi-tenant environment with hundreds or thousands of databases, can use Amazon Aurora Serverless v2 to manage the capacity of many databases across the whole fleet. It also suits workloads whose throughput fluctuates noticeably, such as games, e-commerce, and test environments, as well as new business systems whose throughput is unpredictable. For business systems that are idle most of the time, Amazon Aurora Serverless v2 can reduce costs effectively.

As a new-generation cloud-native serverless database, Aurora Serverless V2 provides outstanding elastic scalability, moving as nimbly as a rabbit, and at the same time provides robust high availability for enterprise applications, standing as steady as a rock.

This blog focuses on the elastic scaling and high availability features of Aurora Serverless V2, tests and analyzes them, and shows what Aurora Serverless V2 can do.

2. Test

2.1 Expansion test

2.1.1 Test target

  • Aurora Serverless V2 Elastic Scaling Ability with Load Changes
  • Comparison of elastic scaling capabilities between Aurora Serverless V2 and V1

Aurora Serverless scales resources in units of ACUs. An ACU is defined as follows:

  • Aurora Capacity Unit (ACU) is used to measure the resource capacity allocated by Aurora Serverless
  • 1 ACU has 2 GiB of memory and corresponding CPU and network resources. The ratio of CPU, network, and memory is the same as that of a provisioned Aurora instance
  • Aurora Serverless V2 minimum capacity can be set as low as 0.5 ACU (1 GiB of memory), and maximum capacity can be set as high as 128 ACU
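
As a quick illustration of the ACU definition above, the following minimal sketch converts an ACU value into its approximate memory size, using only the 1 ACU = 2 GiB ratio stated in the list. The function name acu_to_gib is a hypothetical helper, not part of any AWS tool.

```shell
# Convert an ACU value to GiB of memory, assuming 1 ACU = 2 GiB as stated above.
# acu_to_gib is a hypothetical helper name used only for this illustration.
acu_to_gib() {
  awk -v acu="$1" 'BEGIN { printf "%.1f\n", acu * 2 }'
}

acu_to_gib 0.5   # minimum capacity: prints 1.0
acu_to_gib 128   # maximum capacity: prints 256.0
```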

2.1.2 Test Results and Analysis

To simulate load peaks and troughs, sysbench runs a read/write workload in five rounds with 10, 100, 50, 600, and 10 threads, each round lasting 120 seconds. The following shows how Aurora Serverless V2 and V1 scale resources during the first 20 seconds of each round:

image.png

CloudWatch Dashboard monitoring metrics during testing:

image.png

Observation results: The V2 CPUUtilization and ServerlessDatabaseCapacity curves track each other very closely. The ACU value follows the CPU metric: when CPU usage rises, the ACU value increases almost instantly; when CPU usage drops, the ACU value decreases relatively gradually.

For V1, ServerlessDatabaseCapacity lags noticeably behind changes in CPUUtilization.

V2/V1 overall performance comparison:

image.png

Because Aurora Serverless V2 scales more quickly, it always obtains a higher resource allocation (ACU) than V1 as the load rises. As a result, V2 delivers 1.5-3 times the performance (TPS and QPS) of V1 across the stress test scenarios. Note that V2 runs MySQL 8.0 while V1 runs MySQL 5.7, which also contributes to the performance difference.

Scaling speed test:

Set the V2 Min ACU to 8 ACU and then to 4 ACU to see whether a higher minimum improves the ACU scaling speed. The test load is sysbench read/write with a constant 100 threads, run for 15 minutes.

image.png

Test observation:

The ACU expansion speed is related to the Min ACU or the ACU value of the current database. The larger the ACU value, the faster the expansion speed.

2.1.3 Extensibility Test Summary

  • Aurora Serverless V2 uses instant in-place scaling to follow load changes, scaling ACUs within seconds
  • Resources scale in fine-grained increments of 0.5 ACU
  • As the ACU count scales, dependent resources such as the database engine buffer pool (innodb_buffer_pool_size) scale dynamically with it
  • The ACU scaling speed depends on the Min ACU or the current ACU value of the database: the larger the ACU value, the faster the scaling
  • Aurora Serverless V2 scales up quickly and scales down gradually, which keeps the system stable under load
  • Aurora Serverless V2 vs V1:
      • Performance improved by 1.5-3 times
      • Resource scaling speed improved by 10-15 times
      • Scaling increments are more fine-grained
      • Runs smoothly in high-concurrency scenarios

2.2 Read replica test

Aurora Serverless V2 adds read replicas: up to 15 read replicas can be created for cross-AZ disaster recovery and read scaling. The ACU of a read replica with high failover priority (Tier 0/1) scales along with the ACU of the master node, so that the replica can take over the master's load quickly after a failover. The ACU of a read replica with low failover priority (Tier 2-15) does not follow the master node; it scales according to the load on the replica itself.

2.2.1 Test target

  • Will Aurora Serverless V2 tier 0/1 read replicas scale as the primary ACU scales?
  • Will Aurora Serverless V2 tier 2-15 read replica load scale independently of the primary node?
  • Aurora Serverless V2 master-slave switchover time

2.2.2 Test Results and Analysis

Create an Aurora Serverless V2 cluster with one master and two read replicas whose failover priorities are Tier 0 and Tier 15 respectively (Min ACU: 4; Max ACU: 32):

image.png

2.2.3 Summary of Read Replica Test

  • Aurora Serverless V2 achieves cross-AZ high availability through read replicas, with master-slave switchover completing in seconds
  • Aurora Serverless V2 achieves horizontal scaling of read load through read replicas
  • The ACU of the Tier 0/1 read replica changes along with the ACU of the master node. The two values stay basically the same, which ensures sufficient resources after a master-slave switchover
  • The ACU of the Tier 2-15 read replica changes independently and does not follow the ACU of the master node

2.3 Global Database Test

Aurora Serverless V2 adds the global database feature, which provides cross-region disaster recovery and local read access. A global database uses physical replication and transmits data over the AWS cross-region backbone network, so cross-region replication latency is less than 1 second. In a disaster, a slave cluster can be promoted to master within minutes. One master cluster can have up to five slave clusters, and up to 90 read replicas in total can be created across the master and slave clusters.

2.3.1 Test target

  • Aurora Serverless V2 global database: run a read/write load on the master cluster and a read-only load on the slave cluster, and observe the ACU changes of both clusters
  • Aurora Serverless V2 global database: run a high-concurrency write-only workload on the master cluster and observe the replication latency on the slave cluster
  • Aurora Serverless V2 global database: execute a managed planned failover and observe the failover time

2.3.2 Test Results and Analysis

Main cluster (4 ACU-32 ACU) in US East 1
Slave cluster (4 ACU – 32 ACU) in US West 2

image.png

2.3.3 Global Database Test Summary

  • Aurora Serverless V2 achieves cross-region high availability through a global database, with master-slave switchover completing in minutes
  • Aurora Serverless V2 realizes cross-regional disaster recovery and nearby data access through a global database
  • The ACU of the slave cluster will change independently with the change of its own load, and will not change with the change of the ACU of the master cluster
  • The replication latency of the master-slave cluster is relatively low, usually around 200 milliseconds

3. Migrate the database to Aurora Serverless V2

3.1 Reasons to choose Aurora Serverless V2

There are many benefits to choosing Aurora Serverless V2. The reasons for choosing Aurora Serverless V2 are summarized in the following four aspects:

  1. Highly scalable

The innovative cloud-native serverless database realizes the elastic scaling of the database, further simplifies the creation, maintenance and expansion of the database for customers, and realizes high scalability and automatic scaling capacity.

Amazon Aurora Serverless v2 uses instant in-place scaling, a step up from the previous generation in scalability. It can scale in less than a second to support the most demanding applications, taking a database workload from hundreds of transactions to hundreds of thousands of transactions.

  2. Provides high availability for enterprise applications

Aurora Serverless V2 provides the full set of Aurora features, including backtrack, cloning, global database, Multi-AZ deployment, and read replicas, to meet the needs of business-critical applications. Creating read replicas gives high availability across multiple Availability Zones with cross-AZ failover in seconds; creating a global database gives cross-region high availability with cross-region failover in minutes.

  3. Easy to manage

Aurora Serverless V2 scales automatically on demand, increasing or decreasing capacity according to the needs of the application. This simplifies creating, maintaining, and scaling databases: there is no longer any need for complex database capacity provisioning and management, because the database scales its resources automatically based on application demand.

  4. Cost-effective

Aurora Serverless V2 scales in fine-grained increments of 0.5 ACU, so the database gets just the amount of resources the application requires, and you pay only for the capacity used. Aurora Serverless V2 is also billed by the second, which makes the billing model more fine-grained.
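
To make per-second billing concrete, here is a minimal sketch that turns a series of per-second ACU samples into ACU-hours and an estimated charge. The helper name acu_cost and the price argument are assumptions for illustration; the actual price per ACU-hour depends on region and engine and is not given in this article.

```shell
# Read one ACU sample per line from stdin (one sample per second), sum them
# into ACU-hours, and multiply by a caller-supplied price per ACU-hour.
# acu_cost and the price value are hypothetical, for illustration only.
acu_cost() {
  awk -v price="$1" '
    NF { total += $1 }          # each sample covers one second of usage
    END {
      hours = total / 3600      # ACU-seconds -> ACU-hours
      printf "acu_hours=%.4f cost=%.4f\n", hours, hours * price
    }'
}

# Usage (the file name and price are placeholders):
# acu_cost 0.12 < acu_samples.txt
```

Because the charge is proportional to the sum of the samples, a database that sits at its minimum capacity most of the time and only spikes briefly accumulates far fewer ACU-hours than one sized for the peak.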

3.2 How to migrate the database to Aurora Serverless V2

Version requirements:

Aurora Serverless V2 MySQL: Aurora MySQL 3.02.0 and above (compatible with MySQL 8.0)

Aurora Serverless V2 PostgreSQL: Aurora PostgreSQL 13.6 and above

Migration:

Migration Scenario 1: Migrating a provisioned Aurora cluster to Aurora Serverless V2

Aurora Serverless V2 supports a flexible mixed configuration within a cluster: the master node can be a provisioned instance while the read node is an Aurora Serverless V2 instance, and the master node can also be an Aurora Serverless V2 instance while the read node is a provisioned instance.

Migration method: Create a mixed-configuration cluster, then replace the provisioned master node with an Aurora Serverless V2 instance through a master-slave switchover:

  • Upgrade the provisioned Aurora master node to the version required for Aurora Serverless V2
  • Set Min ACU and Max ACU at the cluster level
  • Add a read replica of instance type Serverless (failover priority: Tier 0/1)
  • Perform the master-slave switchover: the Provisioned Writer becomes a Provisioned Reader, and the Serverless Reader becomes the Serverless Writer
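
The last three steps above can be sketched with the AWS CLI roughly as follows. This is an outline under assumptions, not a tested runbook: the function name and all identifiers passed to it are hypothetical, the engine is assumed to be Aurora MySQL, and the version upgrade from the first step is assumed to have been done already. Setting AWS=echo prints the calls instead of running them.

```shell
# Sketch of migration scenario 1 (after the engine version upgrade).
# All cluster and instance names are placeholders supplied by the caller.
AWS="${AWS:-aws}"   # override with AWS=echo for a dry run

migrate_provisioned_to_serverless_v2() {
  local cluster="$1" reader="$2" min_acu="$3" max_acu="$4"

  # Set Min ACU and Max ACU at the cluster level.
  $AWS rds modify-db-cluster \
    --db-cluster-identifier "$cluster" \
    --serverless-v2-scaling-configuration "MinCapacity=${min_acu},MaxCapacity=${max_acu}" || return 1

  # Add a Serverless v2 reader with a high failover priority (tier 1).
  $AWS rds create-db-instance \
    --db-cluster-identifier "$cluster" \
    --db-instance-identifier "$reader" \
    --db-instance-class db.serverless \
    --engine aurora-mysql \
    --promotion-tier 1 || return 1
  $AWS rds wait db-instance-available --db-instance-identifier "$reader" || return 1

  # Fail over so that the Serverless v2 reader becomes the writer.
  $AWS rds failover-db-cluster \
    --db-cluster-identifier "$cluster" \
    --target-db-instance-identifier "$reader"
}

# Dry-run usage with placeholder names:
# AWS=echo; migrate_provisioned_to_serverless_v2 my-cluster my-cluster-sv2 4 32
```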

Migration Scenario 2: Migrating Aurora Serverless V1 to Aurora Serverless V2

Migration method: Migrate through a snapshot

  • Create a snapshot of the Aurora Serverless V1 cluster
  • Restore a provisioned Aurora cluster from the snapshot
  • Upgrade the provisioned Aurora master node to the version required for Aurora Serverless V2
  • Set Min ACU and Max ACU at the cluster level
  • Add a read replica of instance type Serverless (failover priority: Tier 0/1)
  • Perform the master-slave switchover: the Provisioned Writer becomes a Provisioned Reader, and the Serverless Reader becomes the Serverless Writer
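
The first two steps of this scenario (snapshot and restore) might look like the following AWS CLI sketch; the remaining steps are the same as in scenario 1. The function name, all identifiers, and the instance class are hypothetical placeholders, and the engine value aurora-mysql is an assumption based on the MySQL-compatible clusters used in this article. Setting AWS=echo prints the calls instead of running them.

```shell
# Sketch: snapshot an Aurora Serverless V1 cluster and restore it as a
# provisioned cluster. All names are placeholders supplied by the caller.
AWS="${AWS:-aws}"   # override with AWS=echo for a dry run

restore_v1_snapshot_as_provisioned() {
  local v1_cluster="$1" snapshot="$2" new_cluster="$3" writer="$4" instance_class="$5"

  # Create a snapshot of the Serverless V1 cluster and wait until it is usable.
  $AWS rds create-db-cluster-snapshot \
    --db-cluster-identifier "$v1_cluster" \
    --db-cluster-snapshot-identifier "$snapshot" || return 1
  $AWS rds wait db-cluster-snapshot-available \
    --db-cluster-snapshot-identifier "$snapshot" || return 1

  # Restore the snapshot into a new cluster in provisioned engine mode.
  $AWS rds restore-db-cluster-from-snapshot \
    --db-cluster-identifier "$new_cluster" \
    --snapshot-identifier "$snapshot" \
    --engine aurora-mysql \
    --engine-mode provisioned || return 1

  # A restored cluster has no instances yet, so add a provisioned writer.
  $AWS rds create-db-instance \
    --db-cluster-identifier "$new_cluster" \
    --db-instance-identifier "$writer" \
    --db-instance-class "$instance_class" \
    --engine aurora-mysql
}
```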

4. Summary

This blog focused on two characteristics of Aurora Serverless V2 as a new-generation cloud-native database: high scalability, moving as nimbly as a rabbit, and high availability for enterprise applications, standing as steady as a rock. When the cloud-native database Aurora is deeply integrated with serverless, database innovation is taken to the extreme!

I hope that after reading this blog you can start using Aurora Serverless V2 to build your future-oriented, innovative modern applications.

5. Appendix: Overall Test Process

5.1 Test Environment

Create two EC2 test machines:

Test region: us-east-1

Test client:

Two c5.4xlarge instances, AMI amzn2-ami-hvm (100 GB root device)

Install and configure sysbench:

Install sysbench on each of the two EC2 test machines

 sudo yum -y install git gcc make automake libtool openssl-devel ncurses-compat-libs
sudo yum -y install https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm
sudo yum repolist
sudo rpm --import https://repo.mysql.com/RPM-GPG-KEY-mysql-2022
sudo yum -y install mysql-community-devel mysql-community-client mysql-community-common 
git clone https://github.com/akopytov/sysbench
cd sysbench
./autogen.sh
./configure
make
sudo make install
sysbench --version

5.2 Expansion test

5.2.1 Test environment preparation

Test environment:

Test client

The two EC2 c5.4xlarge instances with sysbench installed earlier

image.png

Prepare sysbench data

  1. On each of the two EC2 stress test machines, create set_variables.sh to set the relevant environment variables

     host="<Aurora Serverless V2/V1 endpoint>"
     username="<master user name>"
     password="<password>"
     run_time=200
     interval=1

Run:

source set_variables.sh

  2. Create the test database demo on the Aurora Serverless V2 and V1 clusters respectively

     create database demo;

  3. Use sysbench to prepare 500 tables of 50,000 rows each, about 5 GB of data in total

     sysbench --db-driver=mysql --mysql-user=${username} --mysql-password=${password} --mysql-db=demo --mysql-host=${host} --mysql-port=3306 --tables=500 --table-size=50000 --time=${run_time} --forced-shutdown --rand-type=uniform --db-ps-mode=disable --report-interval=${interval} --threads=50 /usr/local/share/sysbench/oltp_write_only.lua prepare
  4. Check the status of the test tables and confirm the overall test data size (about 5 GB)

     /usr/bin/mysqlcheck -u ${username} -p${password} -h ${host} -P 3306 -a -B demo
    
    # Connect and query table sizes
    /usr/bin/mysql -u ${username} -p${password} -h ${host} -P 3306
    
    SELECT TABLE_SCHEMA, count(TABLE_NAME) AS "Table count",
     sum(DATA_LENGTH/1024/1024/1024) AS "Data size in GB" FROM INFORMATION_SCHEMA.TABLES
     WHERE TABLE_SCHEMA='demo' GROUP BY TABLE_SCHEMA;

Create a CloudWatch Dashboard to prepare for subsequent test monitoring:

Name: Aurora Serverless Monitor

Create 6 Widgets in Dashboard Aurora Serverless Monitor:

image.png

5.2.2 Testing

  1. On each of the two EC2 stress test machines, prepare the sysbench read/write stress test script. It runs five rounds with 10, 100, 50, 600, and 10 threads, each for 120 seconds (2 minutes), and prints statistics every second

     $ cat sysbench_read_write.sh
    #!/bin/bash
    host="<Aurora Serverless V2/V1 endpoint>"   # replace with your endpoint
    username=admin
    password="<password>"
    interval=1
    run_time=120
    for threads in 10 100 50 600 10
    do
    echo "start ......................... `date` "
    sysbench --db-driver=mysql --mysql-user=${username} --mysql-password=${password} --mysql-db=demo --mysql-host=${host} --mysql-port=3306 --tables=500 --table-size=50000 --time=${run_time} --forced-shutdown --rand-type=uniform --db-ps-mode=disable --report-interval=${interval} --threads=${threads} /usr/local/share/sysbench/oltp_read_write.lua run
    echo "end ......................... `date` "
    done
  2. On each of the two EC2 stress test machines, prepare a monitoring script that samples the dynamic database variables (innodb_buffer_pool_size and max_connections) every second

     $ cat stats-loop.sh
    host="<Aurora Serverless V2/V1 endpoint>"
    username="<master user name>"
    export MYSQL_PWD="<password>"
    
    while true; do /usr/bin/mysql -u ${username} -h ${host} -P 3306 -e "select NOW() AS 'Time',
    @@max_connections AS 'Max Connections',
    COUNT(host) as 'Current Connections',
    round(@@innodb_buffer_pool_size/1024/1024/1024,2) AS 'Innodb Buffer Pool Size (GiB)',
    COUNT AS 'InnoDB history length'
    From information_schema.innodb_metrics,
    information_schema.processlist
    where name='trx_rseg_history_len'"; sleep 1; done
  3. Run aws configure to set the access key ID, secret access key, and region for the AWS CLI calls that follow
  4. On each of the two EC2 stress test machines, prepare a monitoring script that samples the database ACU every second

     $ cat stats-loop-acu.sh
    cluster_name="aurora-serverless-v2-demo"   # replace with the name of your Aurora Serverless V2 cluster
    export LC_ALL=C
    while true; do
    aws cloudwatch get-metric-statistics --metric-name "ServerlessDatabaseCapacity" \
    --start-time "$(date -d '5 sec ago')" --end-time "$(date -d 'now')" --period 1 \
    --namespace "AWS/RDS" \
    --statistics Average \
    --dimensions Name=DBClusterIdentifier,Value=$cluster_name
    sleep 1; done
  5. On the two EC2 stress test machines, invoke the sysbench stress test script to run the 10/100/50/600/10-thread stress test against Aurora Serverless V2 and V1 respectively. Each round runs for 120 seconds. During the test, track innodb_buffer_pool_size and max_connections of the database every second, and also track the ACU allocated to the database every second (the overall test is run 3 times)

     $ cat run_sysbench.sh
    sh sysbench_read_write.sh > $1_$2_sysbench.log &
    sh stats-loop.sh > $1_$2_buffer_pool.log &
    sh stats-loop-acu.sh > $1_$2_acu.log &
    $1: parameter 1 = v2/v1 (whether the test runs against V2 or V1)
    $2: parameter 2 = 1/2/3 (which run this is)
    Example: sh run_sysbench.sh v2 1 (first round of testing against Aurora Serverless V2). The test writes three logs: v2_1_sysbench.log, v2_1_buffer_pool.log, v2_1_acu.log

After the overall sysbench test completes, use the three monitoring logs above (sysbench.log, buffer_pool.log, acu.log) to collate the data from the first 20 seconds of each thread round, and analyze whether Aurora Serverless V2 scales quickly and on demand when the system load changes.
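
One way to collate the first 20 seconds of each round from the sysbench log is a small awk filter like the sketch below. It assumes the per-second report lines produced by --report-interval look like "[ 1s ] thds: 10 tps: ... qps: ...", which is the usual sysbench 1.0 format; the function name first_seconds is a hypothetical helper.

```shell
# Print "threads second tps qps" for every per-second report line whose
# second counter is within the limit (default 20). Because the counter
# restarts in each round, this yields the first seconds of every round.
first_seconds() {
  awk -v limit="${2:-20}" '
    $1 == "[" && $2 ~ /^[0-9]+s$/ {
      sec = $2 + 0                      # "12s" -> 12
      if (sec > limit) next
      for (i = 3; i < NF; i++) {        # locate values by their labels
        if ($i == "thds:") thds = $(i + 1)
        if ($i == "tps:")  tps  = $(i + 1)
        if ($i == "qps:")  qps  = $(i + 1)
      }
      print thds, sec, tps, qps
    }' "$1"
}

# Usage with the log name from run_sysbench.sh:
# first_seconds v2_1_sysbench.log 20
```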

sysbench thread 10 stress test:

  1. Test data collation (first 20 seconds)

image.png

image.png

sysbench thread 100 stress test:

  1. Test data collation (first 20 seconds)

image.png

image.png

sysbench thread 50 stress test:

  1. Test data collation (first 20 seconds)

image.png

image.png

sysbench thread 600 stress test:

image.png

image.png

sysbench thread 10 stress test:

image.png

image.png

CloudWatch Dashboard monitoring metrics during testing:

image.png

Observation results: The V2 CPUUtilization and ServerlessDatabaseCapacity curves track each other very closely. The ACU value follows the CPU metric: when CPU usage rises as the load increases, the ACU value increases almost instantly; when CPU usage drops, the ACU value decreases relatively gradually

image.png

Observation results: The V2 QueriesPerSec and ServerlessDatabaseCapacity curves track each other fairly closely

image.png

Observation results: The V2 DBConnections and ServerlessDatabaseCapacity curves track each other fairly closely

5.3 Read Replica Test

5.3.1 Test environment preparation

Test environment:

Test client

The test machine installed earlier: EC2 c5.4xlarge

Create an Aurora Serverless V2 cluster with one master and two read replicas whose failover priorities are Tier 0 and Tier 15:

image.png

Prepare sysbench data:

Connect to the master node, create the demo database, and prepare the sysbench test data (500 tables of 50,000 rows each, about 5 GB in total). For the specific steps, refer to the test in the previous chapter.

Create a CloudWatch Dashboard to prepare for subsequent test monitoring:

Dashboard Name: Aurora-Serverless-v2-reader

image.png

Test load:

  • sysbench read and write load (for specific test scripts, please refer to the test in the previous chapter)
  • sysbench write-only load (please refer to the attached script below)
  • sysbench read-only load (please refer to the attached script below)

sysbench write-only load: (cyclically execute sysbench write-only load for 10 minutes each time)

 $ cat same_sysbench_only_write.sh
host="<your Aurora endpoint>"   # replace with your Aurora endpoint
username="admin"
password="****"
interval=1
run_time=600
threads=$1
while true  
do         
        echo $threads
echo "start ......................... `date` "
sysbench --db-driver=mysql --mysql-user=${username} --mysql-password=${password} --mysql-db=demo --mysql-host=${host} --mysql-port=3306 --tables=500 --table-size=50000 --time=${run_time} --forced-shutdown --rand-type=uniform --db-ps-mode=disable --report-interval=${interval} --threads=${threads} /usr/local/share/sysbench/oltp_write_only.lua run
echo "end ......................... `date` "
sleep 1
done

sh same_sysbench_only_write.sh 100 (the parameter is the number of concurrent threads)

sysbench read-only load: (cyclically execute sysbench read-only load for 10 minutes each time)

 $ cat same_sysbench_only_read.sh
host="<your Aurora endpoint>"   # replace with your Aurora endpoint
username="admin"
password="******"
interval=1  
run_time=600 
threads=$1
while true  
do         
echo "start ......................... `date` "
sysbench --db-driver=mysql --mysql-user=${username} --mysql-password=${password} --mysql-db=demo --mysql-host=${host} --mysql-port=3306 --tables=500 --table-size=50000 --time=${run_time} --forced-shutdown --rand-type=uniform --db-ps-mode=disable --report-interval=${interval} --threads=${threads} /usr/local/share/sysbench/oltp_read_only.lua run
echo "end ......................... `date` "
sleep 1
done
sh same_sysbench_only_read.sh 100 (the parameter is the number of concurrent threads)

5.3.2 Testing

Test Scenario 1:

Test: Only add sysbench read and write pressure on the master node:

image.png

Test Scenario 2:

Test: Apply a constant sysbench read/write load to the master node (100 threads) and a constant sysbench read-only load to the Tier 15 read replica (10 threads):

image.png

Test Scenario 3:

Test: Manually trigger a failover on the master node and observe how long the master-slave switchover to the Tier 0 read replica takes:

image.png
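
One simple way to time such a switchover from the client side is to probe the cluster endpoint in a loop and measure the gap between the first failed probe and the next successful one. The sketch below is an illustration under assumptions: measure_outage is a hypothetical helper, the probe command is supplied by the caller, and the result is only as precise as the probe interval plus the probe's own timeout.

```shell
# measure_outage MAX_PROBES INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly; reports the seconds between the first failure
# and the first success that follows it.
measure_outage() {
  local max="$1" interval="$2" down_at="" i=0 now
  shift 2
  while [ "$i" -lt "$max" ]; do
    now=$(date +%s)
    if "$@" >/dev/null 2>&1; then
      if [ -n "$down_at" ]; then
        echo "outage_seconds=$((now - down_at))"
        return 0
      fi
    elif [ -z "$down_at" ]; then
      down_at=$now                  # remember when the first probe failed
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "no completed outage observed"
  return 1
}

# Example with placeholders (MYSQL_PWD is assumed to be exported, as in stats-loop.sh):
# measure_outage 600 1 mysql --connect-timeout=2 -h "<cluster endpoint>" -u admin -e "select 1"
```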

5.4 Global database test

5.4.1 Test environment preparation

Test environment:

Test clients

  • The test machine installed earlier in US East 1: EC2 c5.4xlarge
  • A newly created test machine in US West 2: EC2 c5.4xlarge with the sysbench test software installed (see the previous chapter for the steps)

Database environment

  • Main cluster (4 ACU-32 ACU) in US East 1
  • Slave cluster (4 ACU – 32 ACU) in US West 2

image.png

Prepare sysbench data:

Connect to the master node of the main cluster, create the demo database, and prepare the sysbench test data (500 tables of 50,000 rows each, about 5 GB in total). For the specific steps, refer to the previous chapter.

Create a CloudWatch Dashboard to prepare for subsequent test monitoring:

Dashboard Name: Aurora-Serverless-v2-reader

image.png

Test load:

  • sysbench read and write load (please refer to the previous chapter for the specific test script)
  • sysbench write-only load (please refer to the previous chapter for the specific test script)
  • sysbench read-only load (refer to the previous chapter for specific test scripts)

5.4.2 Testing

Test Scenario 1:

Test: Apply a constant sysbench read/write load to the master cluster (100 threads) and a constant sysbench read-only load to the slave cluster (10 threads):

image.png

Test Scenario 2:

Test: Apply a constant sysbench write-only load to the master node (100 threads) and observe the replication latency on the slave cluster:

image.png
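
Besides the dashboard, the replication lag can also be read from the command line. The sketch below assumes the lag is published as the CloudWatch metric AuroraGlobalDBReplicationLag (in milliseconds) under the DBClusterIdentifier dimension of the slave cluster, and that GNU date is available, as on the Amazon Linux 2 test machines. The function name and the cluster name passed to it are placeholders.

```shell
# Fetch the average and maximum global database replication lag for the
# last 10 minutes, one data point per minute.
AWS="${AWS:-aws}"   # override with AWS=echo for a dry run

global_replication_lag() {
  local cluster="$1" region="$2"
  $AWS cloudwatch get-metric-statistics \
    --region "$region" \
    --namespace "AWS/RDS" \
    --metric-name "AuroraGlobalDBReplicationLag" \
    --dimensions "Name=DBClusterIdentifier,Value=${cluster}" \
    --start-time "$(date -u -d '10 minutes ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 60 \
    --statistics Average Maximum
}

# Usage with a placeholder cluster name:
# global_replication_lag my-secondary-cluster us-west-2
```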

Test Scenario 3:

Test: Execute a managed failover (switch from us-west-2 to us-east-1) and observe the time required for the master-slave switchover:

  • Connect to the slave cluster endpoint and continuously run the script that queries the cluster's max_connections (see the query script in the previous section)
  • Manually start the managed failover operation on the primary cluster
  • Record when the failover operation starts
  • Observe roughly how long it takes until the query against the slave cluster returns information again

image.png
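
The managed failover in this scenario can also be started from the AWS CLI instead of the console. The following is a minimal sketch: the function name, the global cluster name, and the target cluster ARN are hypothetical placeholders, and the timestamp line is only there to record when the operation was requested. Setting AWS=echo prints the call instead of running it.

```shell
# Start a managed planned failover of an Aurora global database and print
# the UTC time at which it was requested.
AWS="${AWS:-aws}"   # override with AWS=echo for a dry run

managed_global_failover() {
  local global_cluster="$1" target_cluster_arn="$2"
  echo "failover requested at $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  $AWS rds failover-global-cluster \
    --global-cluster-identifier "$global_cluster" \
    --target-db-cluster-identifier "$target_cluster_arn"
}

# Usage with placeholders:
# managed_global_failover my-global-db "<ARN of the cluster to promote>"
```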

Author of this article

Bingbing Liu

Liu Bingbing is a database solutions architect at Amazon Web Services, responsible for consulting on and designing database solutions based on AWS, and committed to the research and promotion of big data. Before joining AWS, he worked at Oracle for many years and has extensive experience in database cloud planning, design, operations and optimization, DR solutions, big data and data warehouses, and enterprise applications.

