
JuiceFS is well suited for MySQL physical backups; for details, refer to our official documentation. Recently, a customer reported during testing that the data preparation step (xtrabackup --prepare) used for backup verification was very slow. Using the performance analysis tools provided by JuiceFS, we quickly located the performance bottlenecks, and by iteratively tuning the XtraBackup parameters and the JuiceFS mount options, we shortened the time to one tenth of the original within an hour. This article records our performance analysis and optimization process as a reference for analyzing and optimizing I/O performance.

Data preparation

We use the SysBench tool to generate a single-table database of about 11 GiB, with the table split into 10 partitions. To simulate a normal database read/write scenario, the database is accessed through SysBench at a steady rate of 50 requests per second. Under this load, the database writes 8~10 MiB/s to its data disk. The following command backs up the database to JuiceFS.
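
For reference, a SysBench invocation along these lines could be used to build the data set and keep a steady load on it. The credentials, row count, and thread count below are placeholders rather than the exact values used in this test, and the 10-partition layout of the test table is assumed to be applied separately.

# sysbench oltp_read_write --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest --tables=1 --table-size=50000000 prepare
# sysbench oltp_read_write --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest --tables=1 --table-size=50000000 --threads=4 --rate=50 --time=0 run

Here --rate=50 caps the load at roughly 50 requests per second, and --time=0 keeps it running until stopped manually.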

# xtrabackup --backup --target-dir=/jfs/base/

To ensure that every data preparation run operates on exactly the same data, we use the JuiceFS snapshot feature to generate a snapshot /jfs/base_snapshot/ from the /jfs/base directory. Before each run, the data left over from the previous data preparation is deleted and a fresh snapshot is generated.
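
Assuming the snapshot subcommand provided by the JuiceFS client used here (the exact syntax may differ between versions), each round can be reset roughly as follows:

# rm -rf /jfs/base_snapshot
# ./juicefs snapshot /jfs/base /jfs/base_snapshot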

Use default parameters

# ./juicefs mount volume-demoz /jfs

#  time xtrabackup --prepare --apply-log-only --target-dir=/jfs/base_snapshot

The total execution time is 62 seconds.

JuiceFS supports exporting its operation log (oplog) and visualizing it. Before running the xtrabackup --prepare operation, we open a new terminal session on the server and run:

# cat /jfs/.oplog > oplog.txt

This starts collecting the oplog. We then run the xtrabackup --prepare operation. After it finishes, we download oplog.txt and upload it to the oplog analysis page provided by JuiceFS: https://juicefs.com/oplog/.
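
If the analysis is done from another machine, downloading the captured log is a plain file copy; for example (the hostname and paths are placeholders):

# scp root@db-server:/root/oplog.txt ./oplog.txt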

We visualize the oplog.

Here is a brief introduction to the elements in the figure below. Each oplog entry contains a timestamp, thread ID, file system operation (read, write, fsync, flush, etc.), operation duration, and so on. The numbers on the left are thread IDs, the horizontal axis represents time, and different types of operations are marked in different colors.

Zooming in on part of the image, the different operation types, marked in different colors, can be distinguished at a glance.

Excluding several threads unrelated to this operation, we can see that during data preparation 4 threads read data and 5 threads write data, and the reads and writes overlap in time.

Increase the memory buffer of XtraBackup

According to the official XtraBackup documentation, data preparation is the process of performing crash recovery on the backup data set using the embedded InnoDB.

The --use-memory option increases the memory buffer size of the embedded InnoDB; the default is 100 MB, and we increase it to 4 GB.

# time xtrabackup --prepare --use-memory=4G --apply-log-only --target-dir=/jfs/base_snapshot

The execution time dropped to 33 seconds.

From the oplog we can see that reads and writes no longer overlap: the data is read into memory, processed, and then written to the file system.

Increase the number of XtraBackup reader threads

Increasing the buffer cut the time in half, but the read phase is still time-consuming. Each reader thread is essentially running at full capacity, so we try adding more reader threads.

# time xtrabackup --prepare --use-memory=4G --innodb-file-io-threads=16 --innodb-read-io-threads=16 --apply-log-only --target-dir=/jfs/base_snapshot

The execution time dropped to 23 seconds.

With the number of read threads increased to 16 (default 4), the read phase dropped to about 7 seconds.

JuiceFS enables asynchronous write

In the previous step we substantially optimized the read phase, and now the time spent in the write phase stands out. Analyzing the oplog shows that the fsync calls in the write path cannot be parallelized, so increasing the number of write threads does not improve write efficiency (we also verified this in practice by adding write threads; we will not repeat it here). Analyzing the parameters (offset, write size) of the write operations against the same file (same file descriptor), we found a large number of random writes. We can therefore enable the --writeback option when mounting JuiceFS, so that writes go to the local disk first and are then uploaded to the object storage asynchronously.

# ./juicefs mount --writeback volume-demoz /jfs
# time xtrabackup --prepare --use-memory=4G --innodb-file-io-threads=16 --innodb-read-io-threads=16 --apply-log-only --target-dir=/jfs/base_snapshot

The time dropped to 11.8 seconds.

The writing process has been reduced to about 1.5 seconds.

The reads from the reader threads are still fairly dense, so we try increasing the number of reader threads further. The maximum number of InnoDB read threads is 64, so we set it directly to 64.

# time xtrabackup --prepare --use-memory=4G --innodb-file-io-threads=64 --innodb-read-io-threads=64 --apply-log-only --target-dir=/jfs/base_snapshot

The execution time is 11.2 seconds, which is basically the same as before.

The reads from the reader threads are now relatively sparse. There are presumably dependencies between the data read by different threads that prevent full parallelization, so increasing the number of threads can no longer compress the read phase.

Increase the disk cache of JuiceFS

In the previous step we hit the ceiling on improving read efficiency by adding read threads; the only way left to shorten the read phase is to reduce the latency of the reads themselves.

JuiceFS provides read-ahead and caching to accelerate read operations. Next, we try to reduce read latency by enlarging the local cache of JuiceFS.

We change the local cache disk of JuiceFS from a high-efficiency cloud disk to an SSD cloud disk, and increase the cache size from 1 GB to 10 GB.
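
On the test machine this simply means putting the cache directory on the SSD cloud disk before remounting JuiceFS; the device name below is a placeholder:

# mkfs.ext4 /dev/vdb
# mount /dev/vdb /data/jfsCache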

# ./juicefs mount --writeback volume-demoz --cache-size=10000 --cache-dir=/data/jfsCache /jfs

# time xtrabackup --prepare --use-memory=4G --innodb-file-io-threads=64 --innodb-read-io-threads=64 --apply-log-only --target-dir=/jfs/base_snapshot

The execution time dropped to 6.9 seconds.

By improving cache performance and increasing cache space, the time spent on read operations was further reduced.

To summarize so far: by analyzing the oplog we kept finding points that could be optimized, and step by step reduced the entire data preparation process from 62 seconds to 6.9 seconds. The effect is shown more intuitively in the figure below.

Increase the amount of database data

The above optimizations were obtained on a relatively small data set of about 11 GB by continuously adjusting parameters. For comparison, we generated a 10-partition single-table database of about 115 GB in the same way and, with SysBench maintaining a continuous load of 50 requests per second, performed the backup followed by data preparation.

# time xtrabackup --prepare --use-memory=4G --innodb-file-io-threads=64 --innodb-read-io-threads=64 --apply-log-only --target-dir=/jfs/base_snapshot

This process took 74 seconds.

We see that reading and writing are still separate.

When the amount of data grows about 10 times, the corresponding data preparation time also grows about 10 times (from 6.9 seconds to 74 seconds, roughly in proportion to the data size). This is because the time of the backup (xtrabackup --backup) itself grew about 10 times, and since SysBench kept the same pressure on the database, the xtrabackup_logfile generated during the backup was also about 10 times larger. Data preparation applies all of the data updates in xtrabackup_logfile to the data files, so even with 10 times the data, the time to apply a single log record stays basically the same. This can also be seen in the figure above: after the data size increases, the preparation process is still divided into two distinct phases, reading data and writing data, which indicates that the 4 GB buffer is still sufficient and the whole process can still be done in memory before being written back to the file system.

Summary

We used SysBench, a relatively simple tool, to construct the initial data and to keep a steady update load on the database, simulating how a database behaves while its data is being backed up. We used the JuiceFS oplog to observe the read and write characteristics of XtraBackup as it accesses the backup data during data preparation, and tuned the parameters of XtraBackup and JuiceFS to continuously improve the efficiency of the data preparation process.

In real production scenarios, the situation is far more complicated than our SysBench simulation, and the linear relationship above may not hold strictly. However, the approach of quickly finding optimization points by analyzing the oplog and then tuning the cache and concurrency settings of XtraBackup and JuiceFS is universal.

The entire parameter tuning process took about one hour. The oplog analysis tool played a big role, helping us quickly locate the system's performance bottlenecks so that we could adjust parameters accordingly. We hope this oplog analysis feature can also help you quickly locate and analyze the performance problems you encounter.

If this article is helpful to you, please follow our project Juicedata/JuiceFS! (0ᴗ0✿)

