Author: Liu Kaiyang
The Beijing DBA of Eikesheng delivery service team has a strong learning interest in database and peripheral technology. He likes to read books and pursue technology.
Source of this article: original submission
* Produced by the Aikesheng open source community, original content is not allowed to be used without authorization, please contact the editor and indicate the source for reprinting.
Some time ago, when doing MySQL backup and recovery test on centos 8 environment, I encountered a problem. Let me share it with you.
1. Talk about the environment
Server OS | Database version | Backup tool |
---|---|---|
Centos 8 for X86 | mysql 8.0.18 | xtrabackup 8.0.10 |
The problem scenario of the editor appeared on centos 8, and the verification also used the centos 8 version, but I believe that at the end you will find that this problem has little to do with the OS, let's continue to look down.
2. Talk about the problem
The process from backup to recovery was quite smooth, but the following problem appeared when the database was finally started. Take a closer look, it seems that the binlog file name of the database has been eaten.
2021-03-08T16:48:08.510032+08:00 0 [Warning] [MY-010097] [Server] Insecure configuration for --secure-file-priv: Current value does not restrict location of generated files. Consider setting it to a valid, non-empty path.
2021-03-08T16:48:08.510125+08:00 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.18) starting as process 16556
mysqld: File '/var/lib/mysql/.000003' not found (OS errno 2 - No such file or directory)
2021-03-08T16:48:09.196673+08:00 0 [ERROR] [MY-010958] [Server] Could not open log file.
2021-03-08T16:48:09.196787+08:00 0 [ERROR] [MY-010041] [Server] Can't init tc log
2021-03-08T16:48:09.196961+08:00 0 [ERROR] [MY-010119] [Server] Aborting
2021-03-08T16:48:10.632676+08:00 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.18) MySQL Community Server - GPL.
Don't panic, this phenomenon may be that the binlog index file is modified incorrectly when the database is restored, which will cause the database to fail to start. The solution is simple. Here, MySQL reports the error and outputs the information in the binlog.index file. Just change the binlog The name is rewritten, and the database can be restored afterwards.
Of course, this is not the focus of this article. The focus is on why xtrabackup will only delete and not insert because of inconsistent binlog file names. Because the previous test was negligent and did not confirm whether the configuration files are consistent, and after an error was reported, it was found that the binlog file names in the two backup and recovery environments were inconsistent. Therefore, it is guessed that the fault occurred because of the binlog file name problem. The characteristics of the failure case, let's first perform a review to reproduce it.
3. Engage in testing
The general test process is as follows:
(1) Prepare the MySQL 8.0.18 environment and use the corresponding xtrabackup to make a full backup
[root@yang ~]# xtrabackup --defaults-file=/etc/my.cnf --host=192.168.1.7 --user=root --password=123 --port=3306 --backup --target-dir=/root/backup-full/
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --server-id=1 --log_bin=/var/lib/mysql/mysql-bin
xtrabackup: recognized client arguments: --host=192.168.1.7 --user=root --password=* --port=3306 --backup=1 --target-dir=/root/backup-full/
xtrabackup version 8.0.10 based on MySQL server 8.0.18 Linux (x86_64) (revision id: 94f9645)
210515 12:11:55 version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;host=192.168.1.7;port=3306' as 'root' (using password: YES).
210515 12:11:55 version_check Connected to MySQL server
210515 12:11:55 version_check Executing a version check against the server...
# A software update is available:
210515 12:12:00 version_check Done.
210515 12:12:00 Connecting to MySQL server host: 192.168.1.7, user: root, password: set, port: 3306, socket: not set
Using server version 8.0.18
xtrabackup: uses posix_fadvise().
······
210515 12:12:04 completed OK!
// 备份完成
(2) Modify the name of the binlog file in the configuration file of the new instance to artificially create the point of failure we encountered
[root@yang ~]# cat /etc/my.cnf | grep log_bin
log_bin = /var/lib/mysql/binlog
[root@yang ~]# vim /etc/my.cnf
[root@yang ~]# cat /etc/my.cnf | grep log_bin
log_bin = /var/lib/mysql/mysql-bin
(3) Prepare to restore the environment
[root@yang ~]# pkill mysqld
[root@yang ~]# rm -rf /var/lib/mysql
[root@yang ~]# cd backup-full/
[root@yang backup-full]# ls
backup-my.cnf ib_buffer_pool ibdata1 mysql binlog.000004 binlog.index mysql.ibd
performance_schema sys test undo_001 undo_002 xtrabackup_binlog_info xtrabackup_checkpoints
xtrabackup_info xtrabackup_logfile xtrabackup_tablespaces
[root@yang backup-full]# cat binlog.index
/var/lib/mysql/binlog.000004
(4) The prepare process of xtrabackup, that is, the data preparation phase
[root@yang ~]# xtrabackup --defaults-file=/etc/my.cnf --user=root --prepare --target-dir=/root/backup-full/
······
210515 12:14:28 completed OK!
[root@yang ~]# cat backup-full/binlog.index
/var/lib/mysql/binlog.000004
(5) The copy-back process of xtrabackup, which is the recovery phase of the final data
[root@yang ~]# xtrabackup --defaults-file=/etc/my.cnf --user=mysql --copy-back --target-dir=/root/backup-full/
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --server-id=1 --log_bin=/var/lib/mysql/binlog
xtrabackup: recognized client arguments: --user=mysql --copy-back=1 --target-dir=/root/backup-full/
xtrabackup version 8.0.10 based on MySQL server 8.0.18 Linux (x86_64) (revision id: 94f9645)
210515 12:14:51 [01] Copying undo_001 to /var/lib/mysql/undo_001
······
210515 12:14:51 [01] Copying ./xtrabackup_info to /var/lib/mysql/xtrabackup_info
210515 12:14:51 [01] ...done
210515 12:14:51 [01] Copying ./xtrabackup_master_key_id to /var/lib/mysql/xtrabackup_master_key_id
210515 12:14:51 [01] ...done
210515 12:14:51 [01] Copying ./ibtmp1 to /var/lib/mysql/ibtmp1
210515 12:14:51 [01] ...done
210515 12:14:51 [01] Creating directory ./#innodb_temp
210515 12:14:51 [01] ...done.
210515 12:14:51 completed OK!
[root@yang ~]# cat backup-full/binlog.index
/var/lib/mysql/binlog.000004
(6) Environmental reproduction
[root@yang ~]# chown -R mysql:mysql /var/lib/mysql/*
[root@yang ~]# /usr/sbin/mysqld --defaults-file=/etc/my.cnf --user=mysql &
[1] 1545
[root@yang ~]#
[1]+ Exit 1 /usr/sbin/mysqld --defaults-file=/etc/my.cnf --user=mysql
[root@yang ~]# tail -f /var/log/mysqld.log
2021-05-15T12:20:18.510125+08:00 0 [System] [MY-01016] [Server] /usr/sbin/mysqld (mysqld 8.0.18) starting as process 1545
mysqld: File '/var/lib/mysql/.000004' not found (OS errno 2 - No such file or directory)
2021-05-15T12:20:18.596673+08:00 0 [ERROR] [MY-010958] [Server] Could not open log file.
2021-05-15T12:20:18.596787+08:00 0 [ERROR] [MY-010041] [Server] Can't init tc log
2021-05-15T12:20:18.596961+08:00 0 [ERROR] [MY-010119] [Server] Aborting
2021-05-15T12:20:19.132676+08:00 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.18) MySQL Community Server - GPL.
[root@yang ~]# ls /var/lib/mysql |grep bin
mysql-bin.000004
mysql-bin.index
[root@yang ~]# cat /var/lib/mysql/mysql-bin.index
/var/lib/mysql/.000004
In the above operation, at each step of the operation, we check whether the binlog file in the backup source has been tampered with, and found that it is not. Only the binlog.index file was changed when the data was finally copied-back.
(7) Let's make a guess, will it cause this error even if the binlog name is the same? Let's test it
[root@yang mysql]# cat /etc/my.cnf | grep log_bin // 修改配置文件与备份源的binlog文件名相同
log_bin = /var/lib/mysql/binlog
[root@yang ~]# rm -rf /var/lib/mysql/*
[root@yang ~]# xtrabackup --defaults-file=/etc/my.cnf --user=mysql --copy-back --target-dir=/root/backup-full/
······
[root@yang mysql]# cat /var/lib/mysql/binlog.index
/var/lib/mysql/.000004
A verified guess is not a fortune telling. Then the problem of inconsistent binlog names during xtrabackup backup is eliminated. Then how to further determine the behavior of xtrabackup?
(8) Perform strace monitoring during the process of deleting data again and recovering data to see what xtrabackup has done
// 在copy-back过程中,另开一个会话输出xtrabackup 的strace
[root@yang ~]# ps aux |grep xtrabackup | grep -v 'color=auto' | awk '{print $2}' | xargs strace -o strace_output-copy_write.txt -T -tt -f -y -s 3000 -e trace=all -p
strace: Process 2572 attached
strace: Process 2575 attached
4. Chat analysis
Next, we look at the output analysis of strace, strace_output-copy_write.txt, as shown below:
The description of the above figure points us to a problem with the judgment logic of xtrabackup. If we continue to analyze, we can package xtrabackup into a source code package, and then use gdb to analyze the logic judgment in the source code to continue the analysis. The relevant steps are not here. Too much explanation. Interested students can refer to the 9th, 12th and 32nd questions of the "One Question One Experiment" column in the community, as well as the reference documents at the end of the article.
The problem of logical judgment is generally found in the source code of the following figure. The official requests found that the problem has been resolved in the new version of xtrabackup, attach the link: https://github.com/percona/percona-xtrabackup/pull/ 1094 .
Code changes:
5. Summary
This article summarizes two points:
(1) When MySQL starts to report an error when the binlog name is lost during backup, it may be that the information in the binlog.index file is lost. If this is the case, just rewrite the file and restart to complete the database recovery.
(2) When using the xtrabackup version from percona-xtrabackup-8.0.6 to percona-xtrabackup-8.0.23-16, you need to pay attention to the binlog file name. If the absolute path is configured, this kind of error in the article will be predicted in advance. The official has been on https ://github.com/percona/percona-xtrabackup/blob/percona-xtrabackup-8.0.25-17/storage/innobase/xtrabackup/src/backup_copy.cc version fix.
6. Come for reference
- Skills for using strace: one question and one experiment/12: what is the role of table cache
- The use of gdb: one question and one experiment/09-MySQL crashes inexplicably, how to keep the scene
- Compile xtrabackup with reference to the debug version of MySQL: one question and one experiment/39: how to compile the debug version of MySQL
- xtrabackup compilation: https://www.percona.com/doc/percona-xtrabackup/8.0/installation/compiling_xtrabackup.html (there is a pit in the article, it is recommended that cmake not use the -B parameter)
- Source code changes: https://github.com/percona/percona-xtrabackup/commit/d27028be415b8a1940066d993ca9fa7f1ce1b675
- Bug fix version: https://github.com/percona/percona-xtrabackup/blob/percona-xtrabackup-8.0.25-17/storage/innobase/xtrabackup/src/backup_copy.cc
- Linux system call function explanation: https://man7.org/linux/man-pages/man2/open.2.html
ps: The fault points involved in the problem are easy to deal with. This article is mainly to share with you the use of some tools and the analysis ideas of the problem, learn more about MySQL and make progress together.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。