9
头图

Incorrect use of numbers for querying string fields will result in implicit type conversions, and will not hit the index pit. I believe most of our friends have stepped on it.
Especially when most of the data stored in the field is numeric, it is easy to preconceived that the field is of int and mistakenly use where file_id=123456789 execute the query. It is better to Explain command in advance, and find that it is useless to hit the index, thereby correcting the error; a large number of slow queries have appeared after the release of the code and the CPU usage and disk IO of the database server have soared. Into a production accident.

Attentive friends will definitely find that although the varchar field cannot hit the index with a numeric search, the int field can usually find the result quickly if it is searched with a string. Why is this?

Below we use actual tests to illustrate the reasons for this phenomenon.
The MySQL version used for testing is 5.7.18, the file is as follows, the storage engine is InnoDB , and the number of table data is 5 million+.

mysql> SELECT VERSION();
+---------------------+
| VERSION()           |
+---------------------+
| 5.7.18-20170830-log |
+---------------------+

mysql> DESC `file`;
+----------+---------------------+------+-----+---------+----------------+
| Field    | Type                | Null | Key | Default | Extra          |
+----------+---------------------+------+-----+---------+----------------+
| id       | int(11)             | NO   | PRI | NULL    | auto_increment |
| fs_id    | varchar(20)         | NO   | MUL | NULL    |                |
| filename | varchar(255)        | NO   |     | NULL    |                |
| shareid  | bigint(20) unsigned | NO   | MUL | NULL    |                |
| uk       | bigint(20) unsigned | NO   |     | NULL    |                |
| pid      | varchar(32)         | NO   |     | NULL    |                |
+----------+---------------------+------+-----+---------+----------------+

mysql> SELECT COUNT(*) FROM `file`;
+----------+
| COUNT(*) |
+----------+
|  5416697 |
+----------+

varchar field is queried with numbers

fs_id field in the data table file varchar , and a common index idx_fs_id is established.
When querying using strings, it takes 0.07 seconds.
View the execution plan through the EXPLAIN command. The result shows that the index of fs_id

mysql> SELECT * FROM `file` WHERE `fs_id`='635341798980956';
+---------+-----------------+-------------+------------+------------+---------+
| id      | fs_id           | filename    | shareid    | uk         | pid     |
+---------+-----------------+-------------+------------+------------+---------+
| 1043170 | 635341798980956 | ⑮MySQL高级 | 3181065465 | 3959617630 | o6RlSp0 |
+---------+-----------------+-------------+------------+------------+---------+
1 row in set (0.07 sec)

mysql> EXPLAIN SELECT * FROM `file` WHERE `fs_id`='635341798980956';
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | file  | NULL       | ref  | idx_fs_id     | idx_fs_id | 62      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+

However, when querying using numbers, it took 7.04 seconds.
View the execution plan through the EXPLAIN command, and found that a full table scan was performed during the query, and the index was not used.

mysql> SELECT * FROM `file` WHERE `fs_id`=635341798980956;
+---------+-----------------+-------------+------------+------------+---------+
| id      | fs_id           | filename    | shareid    | uk         | pid     |
+---------+-----------------+-------------+------------+------------+---------+
| 1043170 | 635341798980956 | ⑮MySQL高级 | 3181065465 | 3959617630 | o6RlSp0 |
+---------+-----------------+-------------+------------+------------+---------+
1 row in set (7.04 sec)

mysql> EXPLAIN SELECT * FROM `file` WHERE `fs_id`=635341798980956;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | file  | NULL       | ALL  | idx_fs_id     | NULL | NULL    | NULL | 4878670 |    10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+

fs_id field is clearly indexed, but when querying with numbers, a full table scan is still required.
The reason for this situation, I believe most friends know, because the fs_id field is a string type, and the input parameter is an integer type, so the implicit type conversion .

int field is queried with string search

shareid field in the data table file bigint , and a general index idx_shareid is established.
When using numbers and strings to query, the time-consuming is both 0.04 seconds.
View the execution plan through the EXPLAIN command. The result shows that no matter whether the quotation marks are added around the number, the index of idx_shareid

mysql> SELECT * FROM `file` WHERE `shareid`=3181065465;
+---------+-----------------+-------------+------------+------------+---------+
| id      | fs_id           | filename    | shareid    | uk         | pid     |
+---------+-----------------+-------------+------------+------------+---------+
| 1043170 | 635341798980956 | ⑮MySQL高级 | 3181065465 | 3959617630 | o6RlSp0 |
+---------+-----------------+-------------+------------+------------+---------+
1 row in set (0.04 sec)

mysql> SELECT * FROM `file` WHERE `shareid`='3181065465';
+---------+-----------------+-------------+------------+------------+---------+
| id      | fs_id           | filename    | shareid    | uk         | pid     |
+---------+-----------------+-------------+------------+------------+---------+
| 1043170 | 635341798980956 | ⑮MySQL高级 | 3181065465 | 3959617630 | o6RlSp0 |
+---------+-----------------+-------------+------------+------------+---------+
1 row in set (0.04 sec)

mysql> EXPLAIN SELECT * FROM `file` WHERE `shareid`=3181065465;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | file  | NULL       | ref  | idx_shareid   | idx_shareid | 8       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+

mysql> EXPLAIN SELECT * FROM `file` WHERE `shareid`='3181065465';
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | file  | NULL       | ref  | idx_shareid   | idx_shareid | 8       | const |    1 |   100.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-----------------------+

For this result, there are now three questions:

  1. What are the rules for implicit type conversion?
  2. Why do you need a full table scan when querying data when implicit type conversion is triggered?
  3. Why can the int field hit the index with a string search?

1. What are the rules for implicit type conversion?

There is a very simple way to verify the rules of implicit type conversion, which is to look at the results of SELECT '10' > 9 and SELECT 9 > '10'

If the rule is "Convert a string into a number" , then it is a digital comparison, SELECT '10' > 9 should be 1, and SELECT 9 > '10' should be 0;
If the rule is "Convert numbers into strings" , then it is a string comparison. SELECT '10' > 9 should be 0, and SELECT 9 > '10' should be 1.

mysql> SELECT '10' > 9;
+----------+
| '10' > 9 |
+----------+
|        1 |
+----------+
mysql> SELECT 9 > '10';
+----------+
| 9 > '10' |
+----------+
|        0 |
+----------+

It can be seen that SELECT '10' > 9 is 1, and SELECT 9 > '10' is 0, so it can be confirmed that the MySQL type conversion rule is: When a string is compared with a number, the string will be converted into a number .

Therefore, when we use the following statement to query:

SELECT * FROM `file` WHERE `fs_id`=635341798980956;

For the MySQL optimizer, this statement is equivalent to converting fs_id to int and then comparing it with the input value:

SELECT * FROM `file` WHERE CAST(`fs_id` AS signed INT)=635341798980956;

As we all know, performs a function operation on the index field when querying, the query process will not be able to use the index .

2. Why is the implicit type conversion triggered and a full table scan is required when querying data?

For InnoDB's B+ tree index structure , I believe most friends have a certain understanding.

The example has the following set of data:

1, 2, 3, 4, 6, 6, 7, 11, 13, 21, 23, 39, 42, 61, 71, 
101, 201, 220, 303, 345, 411, 601, 620, 701, 1402, 3333

When indexing as a numeric type, the B+ tree index structure is as follows:

数值类型B+树索引

When indexing as a string type, the data sequence and B+ tree index structure are as follows:

1, 101, 11, 13, 1402, 2, 201, 21, 220, 23, 3, 303, 3333, 
345, 39, 4, 411, 42, 6, 6, 601, 61, 620, 7, 701, 71

字符串类型B+树索引

In fact, B+ tree index comes from the orderly sibling nodes of the same layer. function operation on the index field, which may destroy the order of the index value .
a string with a number, it will convert the string into a number .

Therefore, when the string type field fs_id receives a numeric type input parameter, fs_id will be converted to a numeric type. The index created by the string type is out of order for the numeric type, so the index of the fs_id It can only be found through a full table scan.

3. Why can the int field hit the index with a string search?

Is it because numeric type field query with string will not trigger implicit type conversion? Not really.

Through the above verification, we know: When a string is compared with a number, the string will be converted into a number .

Therefore, when we use the string type as an input parameter to query a numeric field:

SELECT * FROM `file` WHERE `shareid`='3181065465'

For the MySQL optimizer, this statement is equivalent to converting the input parameter '3181065465' into int and then querying:

SELECT * FROM `file` WHERE `shareid`=CAST('3181065465' AS signed INT);

The function operation on the input parameters after the equal sign does not affect the index usage of the shareid Therefore, although an implicit type conversion occurs when a string query is used for a numeric type field, it does not affect the use of the field index.

to sum up

MySQL query, when a string is compared with a number, an implicit type conversion is triggered. The conversion rule is to convert a string to a number.

When the index field is a string type and the input parameter is a numeric type, the field will be converted to a numeric type and then searched, that is, a functional operation is performed on the index field, which destroys the orderliness of the index, so the index cannot be used.
When the index field is a numeric type and the input parameter is a string type, the input parameter will be converted to a numeric type and then searched. The input parameter after the equal sign is performed functionally, which does not affect the orderliness of the index field, so You can use the index.


白菜1031
5.4k 声望1.6k 粉丝