As microservices are adopted at scale, distributed transactions that span services are becoming more common. How do they perform? How much throughput is lost? Is what remains enough for the business? The answers determine whether distributed transactions can be introduced into production smoothly, so they matter to everyone considering them.
This article analyzes the extra overhead that distributed transactions introduce: which factors in the application affect the final performance, where the bottlenecks are, and how performance can be improved. The benchmark uses Saga transactions in the multi-language distributed transaction manager https://github.com/yedf/dtm, and the results are analyzed in detail.
Test environment
Model | CPU/memory | Storage | OS | MySQL |
---|---|---|---|---|
Alibaba Cloud ecs.c7.xlarge | 4 core 8G | 500G ESSD IOPS 26800 | Ubuntu 20.04 | Docker mysql:5.7 |
Testing process
# In the dtm directory
docker-compose -f helper/compose.mysql.yml up -d # start MySQL
# Run sysbench against MySQL
sysbench oltp_write_only.lua --time=60 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password= --mysql-db=sbtest --table-size=1000000 --tables=10 --threads=10 --events=999999999 --report-interval=10 prepare
sysbench oltp_write_only.lua --time=60 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password= --mysql-db=sbtest --table-size=1000000 --tables=10 --threads=10 --events=999999999 --report-interval=10 run
go run app/main.go bench > /dev/null # start dtm's bench service; it logs heavily, so redirect output to the null device
bench/run-dtm.sh # in a new terminal, run the dtm test cases
PS: If you want to reproduce the tests, consider renting a host in Hong Kong or outside mainland China, where access to GitHub and Docker is much faster and the environment can be set up quickly. The host I bought in China reaches GitHub and Docker very slowly and sometimes cannot connect at all, which made testing difficult.
Test metrics
We compare the following metrics:
- Global-TPS: the number of global transactions completed per second, from the user's perspective.
- DB-TPS: the number of database-level transactions completed per second in each test.
- OPS: the number of SQL statements completed per second in each test.
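The three metrics are linked by simple multiplication, which is a handy way to cross-check the results table. A minimal sketch follows; `derive_metrics` and its parameter names are made up for illustration and are not part of DTM or sysbench.

```python
def derive_metrics(global_tps, db_tx_per_global, sql_per_global):
    """Derive DB-TPS and OPS from a measured Global-TPS.

    db_tx_per_global: database transactions committed per global transaction
    sql_per_global:   SQL statements executed per global transaction
    """
    db_tps = global_tps * db_tx_per_global
    ops = global_tps * sql_per_global
    return db_tps, ops


# DTM-2SQL case from this article: 575 global transactions per second,
# 4 database transactions and 10 SQL statements per global transaction.
print(derive_metrics(575, 4, 10))  # (2300, 5750)
```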
Comparative Results
Metric | MySQL | No DTM-2SQL | DTM-2SQL | DTM-2SQL-Barrier | No DTM-10SQL | DTM-10SQL | DTM-10SQL-Barrier |
---|---|---|---|---|---|---|---|
Global-TPS | - | 1232 | 575 | 531 | 551 | 357 | 341 |
DB-TPS | 2006 | 2464 | 2300 | 2124 | 1102 | 1428 | 1364 |
OPS | 12039 | 4928 | 5750 | 6372 | 10620 | 9282 | 9548 |
MySQL performance
We first measured MySQL on its own. The DTM tests are dominated by writes, so we focused on MySQL write performance.
We used sysbench's oltp_write_only benchmark, in which each transaction contains 6 write statements (insert/update/delete).
Under this benchmark, MySQL completes about 2006 transactions and about 12039 SQL statements per second. Both figures are referenced in the DTM tests that follow.
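As a quick sanity check, the two sysbench figures should agree with the 6 statements per transaction of oltp_write_only. The variable names below are illustrative only.

```python
mysql_tps = 2006   # transactions per second reported by sysbench
mysql_ops = 12039  # SQL statements per second reported by sysbench

# Average number of SQL statements per database transaction.
sql_per_tx = mysql_ops / mysql_tps
print(round(sql_per_tx, 2))  # 6.0
```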
DTM test
Distributed transactions come in many modes. We picked a simple, representative one, Saga, to analyze the performance of DTM.
The Saga transaction contains two sub-transactions: TransOut, which transfers the balance out, and TransIn, which transfers the balance in. Each of them executes two SQL statements: one updates the balance and one records the flow entry.
No DTM-2SQL
We first test without DTM by calling TransOut and TransIn directly. The result is 1232 global transactions per second. Each global transaction contains two sub-transactions (transfer-out and transfer-in), so DB-TPS is 2464; each sub-transaction contains two SQL statements, so OPS is 4928.
Compared with the MySQL benchmark, DB-TPS is higher while OPS is less than half. The main reason is that every transaction must sync its data to disk, which costs extra time regardless of how few statements it contains. At this point the bottleneck is mainly the transaction capacity of the database.
DTM-2SQL
We then tested with DTM. With DTM, the sequence diagram of a Saga transaction is as follows:
A global transaction now involves 4 database transactions: TransIn, TransOut, saving the global transaction together with its branches, and marking the global transaction as completed. Marking each sub-transaction branch as completed would also need a transaction of its own, but DTM merges these updates through asynchronous writes, which reduces the number of transactions.
The SQL statements in each global transaction are: 1 to save the global transaction, 1 to save the branches, 1 to read all branches, 2 to mark the branches as completed, and 1 to mark the global transaction as completed. That is 6 additional statements; together with the 4 statements of the original sub-transactions, the total is 10.
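The statement count can be written as a small formula. The sketch below assumes that saving the branches takes a single statement regardless of how many branches there are, which is what the 4+n figure in the summary implies. The function names are illustrative, not DTM APIs.

```python
def dtm_extra_sql(num_branches):
    """Extra SQL statements DTM issues for one Saga, following the breakdown above."""
    save_global = 1
    save_branches = 1               # assumption: one statement saves all branches
    read_branches = 1
    finish_branches = num_branches  # one update per branch
    finish_global = 1
    return save_global + save_branches + read_branches + finish_branches + finish_global


def sql_per_global(num_branches, sql_per_branch):
    """Total SQL statements per global transaction, business SQL included."""
    return dtm_extra_sql(num_branches) + num_branches * sql_per_branch


print(dtm_extra_sql(2))       # 6
print(sql_per_global(2, 2))   # 10
print(sql_per_global(2, 10))  # 26
```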
In the test results, 575 global transactions complete per second, so DB-TPS is 2300 and OPS is 5750. Compared with the run without DTM, DB-TPS drops slightly and OPS rises somewhat. The bottleneck is still the database.
DTM-2SQL-Barrier
With the sub-transaction barrier enabled, each sub-transaction branch executes one more insert statement, so each global transaction corresponds to 12 SQL statements.
In the test results, 531 global transactions complete per second, so DB-TPS is 2124 and OPS is 6372. Compared with the DTM run without the barrier, DB-TPS drops slightly and OPS rises slightly, which is in line with expectations.
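The extra insert per branch can be pictured roughly as follows. This is a sketch of the general idea only, not DTM's implementation: it uses in-memory SQLite rather than MySQL, and the `barrier` table, its columns, and `run_with_barrier` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical barrier table: one row per (global id, branch, operation).
conn.execute("CREATE TABLE barrier (gid TEXT, branch_id TEXT, op TEXT, UNIQUE (gid, branch_id, op))")
conn.execute("CREATE TABLE flow (user_id INTEGER, amount INTEGER)")
conn.commit()


def run_with_barrier(conn, gid, branch_id, op, business):
    """Run business(conn) at most once per (gid, branch_id, op)."""
    with conn:  # the barrier insert and the business SQL share one database transaction
        cur = conn.execute(
            "INSERT OR IGNORE INTO barrier (gid, branch_id, op) VALUES (?, ?, ?)",
            (gid, branch_id, op),
        )
        if cur.rowcount == 0:
            return False  # the row already exists: duplicate request, skip the business SQL
        business(conn)
        return True


def record_flow(conn):
    conn.execute("INSERT INTO flow (user_id, amount) VALUES (1, -30)")


print(run_with_barrier(conn, "g1", "01", "action", record_flow))  # True
print(run_with_barrier(conn, "g1", "01", "action", record_flow))  # False
```

With two branches per global transaction, one such insert per branch accounts for the step from 10 to 12 SQL statements.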
No DTM-10SQL
We then change the stress test workload: the number of SQL statements in each sub-transaction goes from 2 to 10, by executing the sub-transaction's SQL in a loop 5 times.
Without DTM, the stress test completes 551 global transactions per second, with a DB-TPS of 1102 and an OPS of 10620. Here OPS is close to the MySQL benchmark figure, so the bottleneck is mainly the OPS of the database.
DTM-10SQL
In this stress test, 357 global transactions complete per second, DB-TPS is 1428, and OPS is 9282. OPS is more than ten percent lower than in the case without DTM. The main reason is that the DTM tables have more fields and indexes, so each SQL statement costs more to execute and the total OPS is lower.
DTM-10SQL-Barrier
In the test results, 341 global transactions complete per second, so DB-TPS is 1364 and OPS is 9548. Compared with the DTM run without the barrier, DB-TPS drops slightly and OPS rises slightly, which is in line with expectations.
Summary
Because distributed transactions must save the state of the global transaction and its branches, they generate additional writes: roughly 4+n extra SQL statements (where n is the number of sub-transactions) and 2 extra database transactions per global transaction. When the business logic is very simple and uses few SQL statements, distributed transactions cut transaction throughput by about 50%; when the business logic is more complex and uses more SQL statements, throughput drops by about 35%. The main cause of the drop is saving the global and branch transaction state, which generates the additional SQL operations.
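The two percentages follow directly from the Global-TPS row of the results table. A minimal check, with `throughput_drop` as an illustrative helper name:

```python
def throughput_drop(without_dtm, with_dtm):
    """Relative drop in Global-TPS, as a percentage."""
    return (without_dtm - with_dtm) / without_dtm * 100


# 2 SQL statements per sub-transaction: No DTM-2SQL vs DTM-2SQL
print(round(throughput_drop(1232, 575), 1))  # 53.3
# 10 SQL statements per sub-transaction: No DTM-10SQL vs DTM-10SQL
print(round(throughput_drop(551, 357), 1))   # 35.2
```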
Comparing DTM's stress test results with the MySQL benchmark shows that the extra overhead introduced by DTM itself is very small and that the database is being used to its full capacity.
An Alibaba Cloud ecs.c7.xlarge server with a 500G disk, with MySQL installed, can provide about 300~600 Global-TPS at a monthly cost of 900 yuan (price in October 2021). Relative to the capacity it provides, this cost is already very low.
If you need more performance, you can buy a higher-spec instance or deploy multiple sets of DTM at the application layer. Neither option is expensive, and either is enough to meet the needs of most companies.
You are welcome to visit the https://github.com/yedf/dtm project and give it a star to support our work!