Author: Jiang Yu

DBA member of Aikesheng, mainly responsible for MySQL fault handling and DMP platform related technical support. Pursuit of skills
technique, enjoy it.

Source of this article: original contribution

*The original content is produced by the open source community of Aikesheng, and the original content shall not be used without authorization. For reprinting, please contact the editor and indicate the source.


In some data compensation scenarios, when using backup to import table data, if it is known that there is conflicting data in the target table, the --force parameter of mysql will be used to skip the conflicting row data, preventing batch processing from being interrupted and exiting due to row conflicts. However, some backup data may not be imported in some cases.

Let's reproduce a specific scenario:

  • First prepare an sbtest1 table
mysql> select count(*) from testdb.sbtest1;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.01 sec)

mysql> desc sbtest1;
+-------+-----------+------+-----+---------+-------+
| Field | Type      | Null | Key | Default | Extra |
+-------+-----------+------+-----+---------+-------+
| id    | int(11)   | NO   | PRI | NULL    |       |
| k     | int(11)   | NO   |     | 0       |       |
| c     | char(120) | NO   |     |         |       |
| pad   | char(60)  | NO   |     |         |       |
+-------+-----------+------+-----+---------+-------+
4 rows in set (0.01 sec)

mysql> select * from sbtest1 limit 10;
+----+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k    | c                                                                                                                       | pad                                                         |
+----+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
|  1 | 4987 | 31451373586-15688153734-79729593694-96509299839-83724898275-86711833539-78981337422-35049690573-51724173961-87474696253 | 98996621624-36689827414-04092488557-09587706818-65008859162 |
|  2 | 4978 | 21472970079-70972780322-70018558993-71769650003-09270326047-32417012031-10768856803-14235120402-93989080412-18690312264 | 04776826683-45880822084-77922711547-29057964468-76514263618 |
|  3 | 4990 | 49376827441-24903985029-56844662308-79012577859-40518387141-60588419212-24399130405-42612257832-29494881732-71506024440 | 26843035807-96849339132-53943793991-69741192222-48634174017 |
|  4 | 5187 | 85762858421-36258200885-10758669419-44272723583-12529521893-95630803635-53907705724-07005352902-43001596772-53048338959 | 37979424284-37912826784-31868864947-42903702727-96097885121 |
|  5 | 5026 | 24805466175-85245528617-94635882649-46305216925-28637832581-03224489581-68883711727-95491561683-91969681472-12022277774 | 19288959552-55556468076-14192290426-55457672510-18043372364 |
|  6 | 5008 | 52892836230-54177743992-01821871718-48412537487-30066596248-87215430797-00375777469-64498831720-58542556455-90784765418 | 59487960480-08453890592-99628797439-16757639138-29377916560 |
|  7 | 5015 | 85820931248-14475640036-11980694501-86588543167-31029306229-09626867980-90685354565-02350460358-25863585366-53793794448 | 26081374730-86321700986-51212137094-30635959762-03880194434 |
|  8 | 4979 | 81578049255-33453976301-67096870761-27658738403-30546242249-53677469854-26594573136-34292002037-52736825353-99165193170 | 64289062455-51067794311-09919261228-11533354367-07401173317 |
|  9 | 4996 | 02844262904-89815504820-46476698406-25828746135-14201395324-78201250152-94654394113-77666987600-97276171313-77528982779 | 38215684217-21734504182-35952570785-14195084201-48480769869 |
| 10 | 5030 | 40220850778-55738450546-96714529889-19538905303-07639037605-88657978350-20035510648-96563475423-99572626664-70428901981 | 34112888944-54615532073-15991265867-97024246980-72648798713 |
+----+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
10 rows in set (0.00 sec)
  • Use mysqldump --no-create-info to back up the table data of sbtest1
mysqldump -h127.0.0.1 -P7777 -uroot -p --default-character-set=utf8mb4 --set-gtid-purged=off --single-transaction --no-create-info --tables testdb sbtest1 > sbtest1.sql
  • Manually delete data and keep only individual data rows to simulate the scenario of data conflict in some rows

    mysql> delete from testdb.sbtest1 where id !=5000;
    Query OK, 9999 rows affected (0.08 sec)
    
    mysql> select * from testdb.sbtest1;
    +------+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
    | id   | k    | c                                                                                                                       | pad                                                         |
    +------+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
    | 5000 | 4997 | 88923911551-62811085158-27667899446-99179940770-00828015942-62702842719-18939469756-03432067751-32588369065-48213701836 | 39063801052-61512609987-53254072797-78688473347-53259431059 |
  • row in set (0.00 sec)

  • Let's restore the table data using the previous backup file
[root@jy-dmp08 ~]# mysql  -h127.0.0.1 -P7777 -uroot -p  testdb --force < sbtest1.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1062 (23000) at line 24: Duplicate entry '5000' for key 'PRIMARY'
#执行报错id为5000的行冲突
  • Let's query how many rows of data have been imported
mysql> select count(*) from testdb.sbtest1;
+----------+
| count(*) |
+----------+
|     4684 |
+----------+
1 row in set (0.00 sec)
  • --force will skip the conflicting line and continue to execute. Why is there only 4684 data, shouldn't it be 10000? In fact, here --force skips the insert statement where the conflicting error data is located.
  • A parameter extended-insert of mysqldump should be mentioned here: this parameter is enabled by default and uses a multi-line batch insert statement; you can use the --skip-extended-insert parameter to turn off multi-line batch insert.
 -e, --extended-insert
                      Use multiple-row INSERT syntax that include several
                      VALUES lists.
                      (Defaults to on; use --skip-extended-insert to disable.)
  • Analysis to this problem almost found. The extended-insert parameter is enabled by default, and the import error will skip the insert statement where the error line is located; the missing part of the data is the data in the skipped multi-line insert statement. Let's check whether the conjecture is correct.
#备份文件共有两个多行insert语句
[root@jy-dmp08 ~]# grep -i 'insert' sbtest1.sql |wc -l
2


  • It can be seen that the error line with id=5000 is in the first multi-line insert statement, and the number of lines inserted by the second insert is 4683 plus the existing line with id=5000 is exactly 4684 lines.
  • Let's use the --skip-extended-insert parameter to see the effect.
mysqldump -h127.0.0.1 -P7777 -uroot -p --default-character-set=utf8mb4 --set-gtid-purged=off --single-transaction --no-create-info  --skip-extended-insert --tables testdb sbtest1 > sbtest1.sql


[root@jy-dmp08 ~]#  mysql  -h127.0.0.1 -P7777 -uroot -p  testdb --force < sbtest1.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1062 (23000) at line 5023: Duplicate entry '5000' for key 'PRIMARY'
#执行报错id为5000的行冲突

mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.00 sec)
  • It can be seen that the data import is normal, and there is no data loss problem.

Recommendations:

--force usage scenarios

  • In general, it is not recommended to use --force to import backup data.
  • Use the --force parameter to skip the SQL statement that reports the error when it is known that there is conflicting data in the import target table to avoid import interruption and exit.
  • Using the --force parameter, you can get the error information of all conflicting row data, which is convenient for checking the conflicting error data later.

--force limit

  • --force ignores conflicting data rows. This scenario is only applicable to the case where only conflicting old data is allowed to be retained during import.
  • The premise of using --force to import table data is to back up through the mysqldump --skip-extended-insert parameter, which is disabled by default; the default multi-row insert mode with the --force parameter will skip the entire location where the conflicting data is located. Insert SQL, resulting in partial data loss.
  • Using the skip-extended-insert single-row batch insert mode will increase the size of the backup file and the time it takes to import the backup file. This method is only suitable for a small amount of table data. If you get something, give it a like

爱可生开源社区
426 声望207 粉丝

成立于 2017 年,以开源高质量的运维工具、日常分享技术干货内容、持续的全国性的社区活动为社区己任;目前开源的产品有:SQL审核工具 SQLE,分布式中间件 DBLE、数据传输组件DTLE。