
Overview

Previously, dtm published a performance test report with MySQL as the storage engine. On a commonly configured machine (4 cores, 8 GB of memory, about 26,800 IOPS), it supported 900+ distributed transactions per second, which meets the business needs of most companies.

This report covers the Redis storage engine. On a commonly configured machine, it reaches about 10,800 distributed transactions per second, roughly a 10x improvement over MySQL storage, which again meets the business needs of most companies.

Below, I describe the test steps in detail and analyze the factors that affect performance.

Test environment

The following servers are all from Alibaba Cloud, in the Tokyo region (external network access is more convenient there).

Redis server: ecs.hfc6, 4 cores, 8 GB, CPU clocked at 3.1 GHz/3.5 GHz, intranet packet rate 500,000 PPS, Ubuntu 20.04

Two application servers: ecs.hfc6, 8 cores, 16 GB, CPU clocked at 3.1 GHz/3.5 GHz, intranet packet rate 800,000 PPS, Ubuntu 20.04, Redis 5.x

Test steps:

Prepare Redis

Note: any step that involves the application server must be performed on both application servers.

Prepare Redis on the application server. For maximum performance, Redis is installed with apt install rather than Docker. Run the following commands:

apt update
apt install -y redis
# Edit /etc/redis/redis.conf: find the bind directive and change it to bind 0.0.0.0
systemctl restart redis-server

Configure the application servers

apt update
apt install -y git
git clone https://github.com/dtm-labs/dtm.git && cd dtm && git checkout 5907f99 && cd bench && make

Configure dtm

Modify the conf.sample.yml in the dtm directory to configure Redis, for example:

Store:
  Driver: 'redis'
  Host: 'redis ip'
  Port: 6379

# Also delete the ExamplesDB configuration, because MySQL is not installed

Start the bench server

`
LOG_LEVEL=warn go run bench/main.go redis
`

Start the test

`
ab -n 1000000 -c 10 "http://127.0.0.1:8083/api/busi_bench/benchEmptyUrl"
`

Get results

The ab results show that the two application servers combined complete 10,875 operations per second.

Redis performance analysis

Let's first look at the performance of Redis itself and the factors that affect it, starting with the following test data:

`
redis-benchmark -n 300000 SET 'abcdefg' 'ddddddd'
`

Completed requests per second: about 100,000

`
redis-benchmark -h <intranet IP of another host> -p 6379 -n 300000 SET 'abcdefg' 'ddddddd'
`

Completed requests per second: about 100,000

These two results show no obvious performance difference between testing a local Redis and a remote Redis. I tested more commands and found no obvious differences either, so the tests below run against the local Redis only and no longer compare local with remote.

`
redis-benchmark -n 300000 EVAL "redis.call('SET', 'abcdedf', 'ddddddd')" 0
`

Lua script requests completed per second: about 100,000

`
redis-benchmark -n 300000 EVAL "redis.call('SET', KEYS[1], ARGV[1])" 1 'aaaaaaaaa' 'bbbbbbbbbb'
`

Lua script requests completed per second: about 100,000

`
redis-benchmark -n 3000000 -P 50 SET 'abcdefg' 'ddddddd'
`

With pipelining, about 1,500,000 requests complete per second, a large improvement over issuing single SET operations. Comparing this figure with the single-operation result shows that Redis spends little on the in-memory operation itself and most of its overhead on network IO, so batching requests can greatly increase throughput.

`
redis-benchmark -n 300000 EVAL "for k=1, 10 do redis.call('SET', KEYS[1], ARGV[1]); end" 1 'aaaaaaaaa' 'bbbbbbbbbb'
`

Here the Lua script executes SET 10 times in a row, and about 61,000 requests complete per second. That is the same order of magnitude as executing SET only once (about 100,000), even though each request does ten times the work. This result is within expectations, because the earlier pipeline results show that the memory operation overhead of Redis is significantly smaller than the network overhead.
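
The three measurements above can be cross-checked with a rough linear cost model. This is only a sketch under an assumption that is not part of the benchmark itself: each request costs a fixed per-request overhead (network IO, request handling) plus a fixed cost per SET. The function names are hypothetical, and because redis-benchmark runs many parallel connections, the fitted values are amortized costs rather than latencies.

```python
def fit_cost_model(single_rps, pipelined_rps, pipeline_depth):
    """Fit cost = overhead + n * per_command (in seconds) to two throughput figures."""
    single_cost = 1.0 / single_rps               # one request carrying one SET
    batch_cost = pipeline_depth / pipelined_rps  # one batch carrying `pipeline_depth` SETs
    per_command = (batch_cost - single_cost) / (pipeline_depth - 1)
    overhead = single_cost - per_command
    return overhead, per_command


def predict_rps(overhead, per_command, commands_per_request):
    """Predicted requests per second when each request runs several SETs."""
    return 1.0 / (overhead + commands_per_request * per_command)


# Figures reported above: 100,000 req/s for single SET, 1,500,000 req/s with -P 50.
overhead, per_command = fit_cost_model(100_000, 1_500_000, 50)
overhead_share = overhead / (overhead + per_command)
lua_10_sets_rps = predict_rps(overhead, per_command, 10)

print(f"per-request overhead share of a single SET: {overhead_share:.0%}")
print(f"predicted req/s for a script with 10 SETs: {lua_10_sets_rps:,.0f}")
```

Under this model about 95% of the cost of a single SET is per-request overhead, and the predicted rate for the 10-SET script is roughly 70,000 requests per second. That is close to the measured 61,000; the remaining gap is plausibly the cost of running the Lua script itself, which the model ignores.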


dtm performance analysis

dtm needs to track the progress of global distributed transactions. Taking Saga as an example, this involves roughly the following operations:

  • Save the transaction information, including the global transaction, the transaction branches, and the index used to find expired transactions. dtm uses one Lua script to complete these operations.
  • When each transaction branch completes, update the status of that branch. The update must confirm that the global transaction has not been rolled back, so that a transaction being rolled back does not keep executing. dtm also uses a Lua script for this.
  • When the global transaction completes, mark the global transaction as successful. Here too, the status of a transaction that has already timed out and been rolled back must not be overwritten, so this is also a Lua script.
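
The guards described in the list above can be illustrated with a small in-memory model. This is a simplified sketch, not dtm's actual implementation or its Lua scripts: `ToyStore`, `GlobalTx`, the method names, and the status strings are all hypothetical, and each method stands in for what a single atomic Lua script would do on Redis.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalTx:
    status: str = "prepared"                      # simplified states: prepared / succeed / failed
    branches: dict = field(default_factory=dict)  # branch id -> branch status


class ToyStore:
    """In-memory stand-in for Redis. Each method models one atomic Lua script."""

    def __init__(self):
        self.txs = {}        # gid -> GlobalTx
        self.deadlines = {}  # gid -> deadline, the index for finding expired transactions

    def save(self, gid, branch_ids, deadline):
        # Script 1: store the global transaction, its branches, and the expiry index together.
        if gid in self.txs:
            return False
        self.txs[gid] = GlobalTx(branches={b: "prepared" for b in branch_ids})
        self.deadlines[gid] = deadline
        return True

    def update_branch(self, gid, branch_id, status):
        # Script 2: change a branch only while the global transaction is still in progress.
        tx = self.txs[gid]
        if tx.status != "prepared":
            return False
        tx.branches[branch_id] = status
        return True

    def change_global(self, gid, new_status):
        # Script 3: compare-and-set on the global status, so a rollback is never overwritten.
        tx = self.txs[gid]
        if tx.status != "prepared":
            return False
        tx.status = new_status
        self.deadlines.pop(gid, None)
        return True


store = ToyStore()
store.save("gid-1", ["b1", "b2"], deadline=60)
store.update_branch("gid-1", "b1", "succeed")
store.change_global("gid-1", "failed")                       # e.g. a timeout rolls the transaction back
late_branch = store.update_branch("gid-1", "b2", "succeed")  # rejected: already rolled back
late_success = store.change_global("gid-1", "succeed")       # rejected: rollback is not overwritten
print(late_branch, late_success, store.txs["gid-1"].status)  # prints: False False failed
```

Putting the check and the write in one method mirrors why a Lua script is used: Redis runs a script atomically, so no other command can change the global status between the check and the update.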

The theoretical cost on Redis of a global transaction with two transaction branches is therefore about 4 Lua scripts: one to save the transaction, one per branch to update its status, and one to complete the global transaction. Given that Redis can complete about 60,000 simple Lua scripts per second, the ideal throughput is about 15,000 distributed transactions per second. Since the actual Lua scripts are more complicated than the ones tested here and transfer more data, the final result of 10,800 transactions per second is close to the performance limit. At 10,800 transactions per second, the Redis CPU was observed at 100%, which means the performance bottleneck had been reached.
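
The estimate above can be written out as a short calculation. The figures are the ones reported in this article; the variable names are only illustrative.

```python
BRANCHES = 2
SIMPLE_LUA_SCRIPTS_PER_SEC = 60_000  # approximate rate for simple Lua scripts, as stated above
MEASURED_TX_PER_SEC = 10_800         # measured dtm throughput from the ab test

# One script to save the transaction, one per branch, one to finish the global transaction.
scripts_per_tx = 1 + BRANCHES + 1
ideal_tx_per_sec = SIMPLE_LUA_SCRIPTS_PER_SEC / scripts_per_tx
efficiency = MEASURED_TX_PER_SEC / ideal_tx_per_sec

print(f"scripts per transaction: {scripts_per_tx}")
print(f"ideal transactions/s:    {ideal_tx_per_sec:,.0f}")
print(f"measured / ideal:        {efficiency:.0%}")
```

The measured throughput is 72% of the ideal figure, which fits the observation that the real scripts are heavier than the simple SET scripts used in the benchmark.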

Outlook

10,000 transactions per second is already very high performance, enough for most scenarios, including message queues and flash sales.

Although Redis can sustain such a transaction volume, running at that rate for a long time would soon exhaust the storage space of Redis. Options may be added in the future to clean up completed transactions promptly.

Can the performance of dtm be improved in the future? We can look at it from two aspects:

First, in the current single-process setup, dtm reaches about 10,000 transactions per second. For Redis 6.0, official data shows a performance increase of about 150% with 4 CPUs. In that case, dtm could be expected to support about 25,000 transactions per second (10,000 × 2.5).

Second, dtm could develop in the direction of clustering, providing cluster capabilities, dynamic scaling, and so on. Here we need to see how dtm is used in the future before making concrete plans.

Project address

For more theory and practice on distributed transactions, visit the following project and official account:

https://github.com/dtm-labs/dtm

Follow the [Distributed Transactions] official account to learn more about distributed transactions and to join our community.


叶东富