Foreword:

We all know that whether a SELECT query uses an index makes a big difference: without an index it may run for several seconds or longer, while with an index it may finish almost instantly. What about UPDATE statements? Does using an index make a comparable difference in execution time? This article explores that question.

1. UPDATE statement test

To compare, I created two large tables holding identical data: one with ordinary (secondary) indexes and one without. Let's see how the two behave.
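The article does not show how the two tables were populated. One way to build such a pair, as a sketch that assumes tb_withidx already exists with its data and indexes, is to clone its definition, drop the secondary indexes from the copy, and copy the rows across:

```sql
-- Sketch only: assumes tb_withidx already exists with its data and indexes.
-- Clone the table definition, including all indexes.
CREATE TABLE tb_noidx LIKE tb_withidx;

-- Remove the secondary indexes from the copy; the primary key stays.
ALTER TABLE tb_noidx
  DROP INDEX idx_col1,
  DROP INDEX idx_del;

-- Copy every row so both tables hold identical data.
INSERT INTO tb_noidx SELECT * FROM tb_withidx;
```

The definitions and sizes of the tables actually used in the test follow.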

# tb_noidx: table without secondary indexes
mysql> show create table tb_noidx\G
*************************** 1. row ***************************
       Table: tb_noidx
Create Table: CREATE TABLE `tb_noidx` (
  `increment_id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'auto-increment primary key',
  `col1` char(32) NOT NULL COMMENT 'field 1',
  `col2` char(32) NOT NULL COMMENT 'field 2',
  ...
  `del` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'deleted flag',
  PRIMARY KEY (`increment_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3696887 DEFAULT CHARSET=utf8 COMMENT='table without indexes'

mysql> select count(*) from tb_noidx;
+----------+
| count(*) |
+----------+
|  3590105 |
+----------+

mysql> select concat(round(sum(data_length/1024/1024),2),'MB') as data_length_MB, concat(round(sum(index_length/1024/1024),2),'MB') as index_length_MB
    -> from information_schema.tables where table_schema='testdb' and table_name = 'tb_noidx'; 
+----------------+-----------------+
| data_length_MB | index_length_MB |
+----------------+-----------------+
| 841.98MB       | 0.00MB          |
+----------------+-----------------+

# tb_withidx: table with secondary indexes
mysql> show create table tb_withidx\G
*************************** 1. row ***************************
       Table: tb_withidx
Create Table: CREATE TABLE `tb_withidx` (
  `increment_id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'auto-increment primary key',
  `col1` char(32) NOT NULL COMMENT 'field 1',
  `col2` char(32) NOT NULL COMMENT 'field 2',
  ...
  `del` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'deleted flag',
  PRIMARY KEY (`increment_id`),
  KEY `idx_col1` (`col1`),
  KEY `idx_del` (`del`)
) ENGINE=InnoDB AUTO_INCREMENT=3696887 DEFAULT CHARSET=utf8 COMMENT='table with indexes'

mysql> select count(*) from tb_withidx;
+----------+
| count(*) |
+----------+
|  3590105 |
+----------+

mysql> select concat(round(sum(data_length/1024/1024),2),'MB') as data_length_MB, concat(round(sum(index_length/1024/1024),2),'MB') as index_length_MB
    -> from information_schema.tables where table_schema='testdb' and table_name = 'tb_withidx'; 
+----------------+-----------------+
| data_length_MB | index_length_MB |
+----------------+-----------------+
| 841.98MB       | 210.50MB        |
+----------------+-----------------+

To clarify: tb_noidx and tb_withidx hold exactly the same data. Each table has about 3.6 million rows and takes up about 840 MB. The col1 column is highly selective (it has many distinct values), while the del column has low selectivity. Below, we use each of these two columns as the filter condition in UPDATE statements:

# Filter on col1, update col2
mysql> explain update tb_withidx set col2 = '48348a10d7794d269ecf10f9e3f20b52' where col1 = '48348a10d7794d269ecf10f9e3f20b52';
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key      | key_len | ref   | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
|  1 | UPDATE      | tb_withidx | NULL       | range | idx_col1      | idx_col1 | 96      | const |    1 |   100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)

mysql> update tb_withidx set col2 = '48348a10d7794d269ecf10f9e3f20b52' where col1 = '48348a10d7794d269ecf10f9e3f20b52';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> explain update tb_noidx set col2 = '48348a10d7794d269ecf10f9e3f20b52' where col1 = '48348a10d7794d269ecf10f9e3f20b52';
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | UPDATE      | tb_noidx | NULL       | index | NULL          | PRIMARY | 4       | NULL | 3557131 |   100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)

mysql> update tb_noidx set col2 = '48348a10d7794d269ecf10f9e3f20b52' where col1 = '48348a10d7794d269ecf10f9e3f20b52';
Query OK, 1 row affected (13.29 sec)
Rows matched: 1  Changed: 1  Warnings: 0

# Filter on col1, update col1
mysql> explain update tb_withidx set col1 = 'col1aac4c0f07449c688af42886465b76b' where col1 = '95aac4c0f07449c688af42886465b76b';
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+------------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key      | key_len | ref   | rows | filtered | Extra                        |
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+------------------------------+
|  1 | UPDATE      | tb_withidx | NULL       | range | idx_col1      | idx_col1 | 96      | const |    1 |   100.00 | Using where; Using temporary |
+----+-------------+------------+------------+-------+---------------+----------+---------+-------+------+----------+------------------------------+
1 row in set (0.01 sec)

mysql> update tb_withidx set col1 = 'col1aac4c0f07449c688af42886465b76b' where col1 = '95aac4c0f07449c688af42886465b76b';
Query OK, 1 row affected, 1 warning (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> explain update tb_noidx set col1 = 'col1aac4c0f07449c688af42886465b76b' where col1 = '95aac4c0f07449c688af42886465b76b';
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | UPDATE      | tb_noidx | NULL       | index | NULL          | PRIMARY | 4       | NULL | 3557131 |   100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.01 sec)

mysql> update tb_noidx set col1 = 'col1aac4c0f07449c688af42886465b76b' where col1 = '95aac4c0f07449c688af42886465b76b';
Query OK, 1 row affected, 1 warning (13.15 sec)
Rows matched: 1  Changed: 1  Warnings: 0

# Filter on del, update col2
# About 2.03 million rows have del = 0 and about 1.56 million have del = 1
mysql> select del,count(*) from tb_withidx GROUP BY del;
+-----+----------+
| del | count(*) |
+-----+----------+
| 0   |  2033080 |
| 1   |  1557025 |
+-----+----------+

mysql> explain update tb_withidx set col2 = 'col24c0f07449c68af42886465b76' where del = 0;
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | UPDATE      | tb_withidx | NULL       | index | idx_del       | PRIMARY | 4       | NULL | 3436842 |   100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)

mysql> update tb_withidx set col2 = 'col24c0f07449c68af42886465b76' where del = 0;
Query OK, 2033080 rows affected (47.15 sec)
Rows matched: 2033080  Changed: 2033080  Warnings: 0

mysql> explain update tb_noidx set col2 = 'col24c0f07449c68af42886465b76' where del = 0;
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | UPDATE      | tb_noidx | NULL       | index | NULL          | PRIMARY | 4       | NULL | 3296548 |   100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)

mysql> update tb_noidx set col2 = 'col24c0f07449c68af42886465b76' where del = 0;
Query OK, 2033080 rows affected (49.79 sec)
Rows matched: 2033080  Changed: 2033080  Warnings: 0

# Filter on del, update del
mysql> explain update tb_withidx set del = 2 where del = 0;                                      
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | UPDATE      | tb_withidx | NULL       | index | idx_del       | PRIMARY | 4       | NULL | 3436842 |   100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.03 sec)

mysql> update tb_withidx set del = 2 where del = 0;
Query OK, 2033080 rows affected (2 min 34.96 sec)
Rows matched: 2033080  Changed: 2033080  Warnings: 0

mysql> explain update tb_noidx set del = 2 where del = 0;  
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | UPDATE      | tb_noidx | NULL       | index | NULL          | PRIMARY | 4       | NULL | 3296548 |   100.00 | Using where |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)

mysql> update tb_noidx set del = 2 where del = 0;
Query OK, 2033080 rows affected (50.57 sec)
Rows matched: 2033080  Changed: 2033080  Warnings: 0

From these experiments we can see that whether an index is available can have a large effect on how fast an UPDATE runs. Specifically:

  • With an index on a highly selective column, an update that filters on that column is much faster than without the index (about 0.01 sec versus about 13 sec here), whether it updates that column or another one.
  • With an index on a low-selectivity column, an update that filters on that column and modifies other columns takes about the same time with or without the index (47.15 sec versus 49.79 sec). When the update modifies the low-selectivity column itself, the indexed table is slower (2 min 34.96 sec versus 50.57 sec).
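
In the del tests above, the EXPLAIN output lists idx_del under possible_keys but shows PRIMARY as the chosen key. To see what plan the optimizer would produce if it were steered toward idx_del, one could add an index hint. This is only a sketch; the article did not run it, so no output is shown:

```sql
-- Hypothetical check, not part of the original experiment.
-- EXPLAIN only shows the plan and does not modify any rows;
-- FORCE INDEX asks the optimizer to use idx_del instead of scanning PRIMARY.
EXPLAIN UPDATE tb_withidx FORCE INDEX (idx_del)
SET col2 = 'col24c0f07449c68af42886465b76'
WHERE del = 0;
```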

2. Lessons learned

Let's try to explain these results. First, here is roughly how an UPDATE statement is executed:

  1. The client sends a request to the server and a connection is established.
  2. The server checks the query cache. An update to a table invalidates all cached query results for that table. (The query cache was removed in MySQL 8.0, so this step applies only to earlier versions.)
  3. The parser analyzes the statement, recognizing keywords and verifying that the syntax is valid.
  4. The optimizer optimizes the statement, for example by choosing which index to use, and generates an execution plan.
  5. The executor asks the storage engine for the rows that need to be updated.
  6. The storage engine checks whether the required data is already in the buffer pool. If so, it returns the data directly; otherwise it first loads the data from disk.
  7. The executor calls the storage engine API to update the data.
  8. The storage engine updates the data in memory and writes the undo log and redo log records.
  9. The executor writes the binlog, the transaction is committed, and the process ends.

In other words, to execute an UPDATE, MySQL must first find the rows to be updated. This explains why an update that filters on a highly selective column runs much faster when that column is indexed.
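
Because an UPDATE first has to locate its rows, a quick way to estimate the lookup cost is to run EXPLAIN on a SELECT with the same WHERE clause. A minimal sketch, reusing the value from the test above; the plan may not match the UPDATE plan exactly, since the optimizer plans the two statement types separately:

```sql
-- Same filter as the UPDATE test; only the access path is of interest here.
EXPLAIN SELECT increment_id FROM tb_withidx
WHERE col1 = '48348a10d7794d269ecf10f9e3f20b52';

-- The same lookup on the table without a secondary index on col1.
EXPLAIN SELECT increment_id FROM tb_noidx
WHERE col1 = '48348a10d7794d269ecf10f9e3f20b52';
```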

For a low-selectivity column, having an index makes little difference when other columns are updated: in both tables the optimizer scans the primary key (as the EXPLAIN output above shows), so the number of rows scanned and the time spent finding the rows to update are about the same. When the low-selectivity column itself is updated, however, the index's B+ tree must also be maintained, which slows the update down.

Although indexes speed up queries, they also have a drawback: they must be maintained dynamically. Whenever rows in the table are inserted, deleted, or modified, the indexes must be updated too, which slows down data modification. The results of this experiment confirm this.
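
To see which indexes a given UPDATE has to maintain, one can list the indexes that contain the updated column. A small sketch using information_schema.statistics, with the schema name testdb taken from the size queries earlier in the article:

```sql
-- List every index on tb_withidx that includes the column being updated (del).
-- Each index returned here has to be modified whenever del changes.
SELECT index_name, seq_in_index, column_name
FROM information_schema.statistics
WHERE table_schema = 'testdb'
  AND table_name = 'tb_withidx'
  AND column_name = 'del';
```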

This experiment also suggests some rules of thumb for indexing:

  • Create indexes only on columns used for searching, sorting, grouping, or joining.
  • Prefer to index highly selective columns, and avoid indexing columns with low selectivity.
  • Avoid creating too many indexes on frequently updated tables.
  • Avoid redundant indexes, which add maintenance cost.
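
Selectivity can be checked before deciding whether to index a column. As a rough sketch, the ratio of distinct values to total rows approaches 1 for a highly selective column and 0 for a low-selectivity one. Note that the first query scans the whole table, so on a table of this size it may take a while:

```sql
-- Ratio of distinct values to rows; closer to 1 means more selective.
SELECT COUNT(DISTINCT col1) / COUNT(*) AS col1_selectivity,
       COUNT(DISTINCT del)  / COUNT(*) AS del_selectivity
FROM tb_withidx;

-- For existing indexes, the Cardinality column gives an estimated
-- number of distinct values in each index.
SHOW INDEX FROM tb_withidx;
```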

MySQL技术