Incorrect use of numbers for querying string fields will result in implicit type conversions, and will not hit the index pit. I believe most of our friends have stepped on it.
Especially when most of the data stored in the field is numeric, it is easy to preconceived that the field is of int
and mistakenly use where file_id=123456789
execute the query. It is better to Explain
command in advance, and find that it is useless to hit the index, thereby correcting the error; a large number of slow queries have appeared after the release of the code and the CPU usage and disk IO of the database server have soared. Into a production accident.
Attentive friends will definitely find that although the varchar
field cannot hit the index with a numeric search, the int
field can usually find the result quickly if it is searched with a string. Why is this?
Below we use actual tests to illustrate the reasons for this phenomenon.
The MySQL version used for testing is 5.7.18, the file
is as follows, the storage engine is InnoDB , and the number of table data is 5 million+.
mysql> SELECT VERSION();
+---------------------+
| VERSION() |
+---------------------+
| 5.7.18-20170830-log |
+---------------------+
mysql> DESC `file`;
+----------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| fs_id | varchar(20) | NO | MUL | NULL | |
| filename | varchar(255) | NO | | NULL | |
| shareid | bigint(20) unsigned | NO | MUL | NULL | |
| uk | bigint(20) unsigned | NO | | NULL | |
| pid | varchar(32) | NO | | NULL | |
+----------+---------------------+------+-----+---------+----------------+
mysql> SELECT COUNT(*) FROM `file`;
+----------+
| COUNT(*) |
+----------+
| 5416697 |
+----------+
varchar field is queried with numbers
fs_id
field in the data table file
varchar
, and a common index idx_fs_id
is established.
When querying using strings, it takes 0.07 seconds.
View the execution plan through the EXPLAIN command. The result shows that the index of fs_id
mysql> SELECT * FROM `file` WHERE `fs_id`='635341798980956';
+---------+-----------------+-------------+------------+------------+---------+
| id | fs_id | filename | shareid | uk | pid |
+---------+-----------------+-------------+------------+------------+---------+
| 1043170 | 635341798980956 | ⑮MySQL高级 | 3181065465 | 3959617630 | o6RlSp0 |
+---------+-----------------+-------------+------------+------------+---------+
1 row in set (0.07 sec)
mysql> EXPLAIN SELECT * FROM `file` WHERE `fs_id`='635341798980956';
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
| 1 | SIMPLE | file | NULL | ref | idx_fs_id | idx_fs_id | 62 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
However, when querying using numbers, it took 7.04 seconds.
View the execution plan through the EXPLAIN command, and found that a full table scan was performed during the query, and the index was not used.
mysql> SELECT * FROM `file` WHERE `fs_id`=635341798980956;
+---------+-----------------+-------------+------------+------------+---------+
| id | fs_id | filename | shareid | uk | pid |
+---------+-----------------+-------------+------------+------------+---------+
| 1043170 | 635341798980956 | ⑮MySQL高级 | 3181065465 | 3959617630 | o6RlSp0 |
+---------+-----------------+-------------+------------+------------+---------+
1 row in set (7.04 sec)
mysql> EXPLAIN SELECT * FROM `file` WHERE `fs_id`=635341798980956;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | file | NULL | ALL | idx_fs_id | NULL | NULL | NULL | 4878670 | 10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
fs_id
field is clearly indexed, but when querying with numbers, a full table scan is still required.
The reason for this situation, I believe most friends know, because the fs_id
field is a string type, and the input parameter is an integer type, so the implicit type conversion .
int field is queried with string search
shareid
field in the data table file
bigint
, and a general index idx_shareid
is established.
When using numbers and strings to query, the time-consuming is both 0.04 seconds.
View the execution plan through the EXPLAIN command. The result shows that no matter whether the quotation marks are added around the number, the index of idx_shareid
mysql> SELECT * FROM `file` WHERE `shareid`=3181065465;
+---------+-----------------+-------------+------------+------------+---------+
| id | fs_id | filename | shareid | uk | pid |
+---------+-----------------+-------------+------------+------------+---------+
| 1043170 | 635341798980956 | ⑮MySQL高级 | 3181065465 | 3959617630 | o6RlSp0 |
+---------+-----------------+-------------+------------+------------+---------+
1 row in set (0.04 sec)
mysql> SELECT * FROM `file` WHERE `shareid`='3181065465';
+---------+-----------------+-------------+------------+------------+---------+
| id | fs_id | filename | shareid | uk | pid |
+---------+-----------------+-------------+------------+------------+---------+
| 1043170 | 635341798980956 | ⑮MySQL高级 | 3181065465 | 3959617630 | o6RlSp0 |
+---------+-----------------+-------------+------------+------------+---------+
1 row in set (0.04 sec)
mysql> EXPLAIN SELECT * FROM `file` WHERE `shareid`=3181065465;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | file | NULL | ref | idx_shareid | idx_shareid | 8 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
mysql> EXPLAIN SELECT * FROM `file` WHERE `shareid`='3181065465';
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | file | NULL | ref | idx_shareid | idx_shareid | 8 | const | 1 | 100.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-----------------------+
For this result, there are now three questions:
- What are the rules for implicit type conversion?
- Why do you need a full table scan when querying data when implicit type conversion is triggered?
- Why can the int field hit the index with a string search?
1. What are the rules for implicit type conversion?
There is a very simple way to verify the rules of implicit type conversion, which is to look at the results of SELECT '10' > 9
and SELECT 9 > '10'
If the rule is "Convert a string into a number" , then it is a digital comparison, SELECT '10' > 9
should be 1, and SELECT 9 > '10'
should be 0;
If the rule is "Convert numbers into strings" , then it is a string comparison. SELECT '10' > 9
should be 0, and SELECT 9 > '10'
should be 1.
mysql> SELECT '10' > 9;
+----------+
| '10' > 9 |
+----------+
| 1 |
+----------+
mysql> SELECT 9 > '10';
+----------+
| 9 > '10' |
+----------+
| 0 |
+----------+
It can be seen that SELECT '10' > 9
is 1, and SELECT 9 > '10'
is 0, so it can be confirmed that the MySQL type conversion rule is: When a string is compared with a number, the string will be converted into a number .
Therefore, when we use the following statement to query:
SELECT * FROM `file` WHERE `fs_id`=635341798980956;
For the MySQL optimizer, this statement is equivalent to converting fs_id
to int and then comparing it with the input value:
SELECT * FROM `file` WHERE CAST(`fs_id` AS signed INT)=635341798980956;
As we all know, performs a function operation on the index field when querying, the query process will not be able to use the index .
2. Why is the implicit type conversion triggered and a full table scan is required when querying data?
For InnoDB's B+ tree index structure , I believe most friends have a certain understanding.
The example has the following set of data:
1, 2, 3, 4, 6, 6, 7, 11, 13, 21, 23, 39, 42, 61, 71,
101, 201, 220, 303, 345, 411, 601, 620, 701, 1402, 3333
When indexing as a numeric type, the B+ tree index structure is as follows:
When indexing as a string type, the data sequence and B+ tree index structure are as follows:
1, 101, 11, 13, 1402, 2, 201, 21, 220, 23, 3, 303, 3333,
345, 39, 4, 411, 42, 6, 6, 601, 61, 620, 7, 701, 71
In fact, B+ tree index comes from the orderly sibling nodes of the same layer. function operation on the index field, which may destroy the order of the index value .
a string with a number, it will convert the string into a number .
Therefore, when the string type field fs_id
receives a numeric type input parameter, fs_id
will be converted to a numeric type. The index created by the string type is out of order for the numeric type, so the index of the fs_id
It can only be found through a full table scan.
3. Why can the int field hit the index with a string search?
Is it because numeric type field query with string will not trigger implicit type conversion? Not really.
Through the above verification, we know: When a string is compared with a number, the string will be converted into a number .
Therefore, when we use the string type as an input parameter to query a numeric field:
SELECT * FROM `file` WHERE `shareid`='3181065465'
For the MySQL optimizer, this statement is equivalent to converting the input parameter '3181065465'
into int and then querying:
SELECT * FROM `file` WHERE `shareid`=CAST('3181065465' AS signed INT);
The function operation on the input parameters after the equal sign does not affect the index usage of the shareid
Therefore, although an implicit type conversion occurs when a string query is used for a numeric type field, it does not affect the use of the field index.
to sum up
MySQL query, when a string is compared with a number, an implicit type conversion is triggered. The conversion rule is to convert a string to a number.
When the index field is a string type and the input parameter is a numeric type, the field will be converted to a numeric type and then searched, that is, a functional operation is performed on the index field, which destroys the orderliness of the index, so the index cannot be used.
When the index field is a numeric type and the input parameter is a string type, the input parameter will be converted to a numeric type and then searched. The input parameter after the equal sign is performed functionally, which does not affect the orderliness of the index field, so You can use the index.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。