Indexing is a function of speeding up data search by using a certain regular data structure relationship with the actual data. There is a large amount of content stored in our database, and the index can quickly find the location of the actual file corresponding to the node through the data node according to specific rules and algorithms. Simply put, an index is like a directory of books, which can help us locate the specific content of a book accurately.
Recently, I encountered a problem when learning indexing. Let's take a look at it by reproducing it.
First create a test table as follows:
CREATE TABLE `simple_table` (
`id` int NOT NULL AUTO_INCREMENT,
`c1` datetime DEFAULT NULL,
`c2` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `c2__idx` (`c2`),
KEY `fun_c1_idx` ((cast(`c1` as date)))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
fun_c1_idx: is the function index supported by mysql8
Then randomly insert 1000 pieces of data into this table.
select * from simple_table where date(c2) = '2022-01-01';
It can be seen that the above SQL statement cannot go through the index. Because the actual value and primary key value of the column are stored in the index tree, the function operation on the conditional field will invalidate the index. Simply put, if '2022-01-01' is used to match, it will not be able to locate the value in the index tree. Therefore, the correct choice is to give up the index and choose a full table scan.
Let's look at the next SQL.
select id,c2 from simple_table where date(c2) = '2022-01-01';
Unlike the first one, this SQL only returns some of the columns, and these columns are all in the index. Then we use explain to analyze the execution plan of this SQL to determine whether it can go through the index:
It can be clearly seen from the above figure that the key value is c2__idx
, that is, the index is gone.
It's very strange here, doesn't it mean that the function operation on the conditional field will invalidate the index, why is the index gone again?
This is the problem I encountered when I was learning, and I later found out that it was because I didn't figure out what "walking the index" meant. Everyone knows that indexes can speed up queries, but do you know why indexes can speed up queries? The answer is to reduce the number of queries.
Now we go back to the above SQL, we can see that although the key value is c2__idx
, the rows value is 1000. That is, all records of the scanned full table, ie c2__idx
are scanned. But because c2__idx
already contains all the columns that need to be queried, the optimizer chose to walk this index.
Finally, let's think about a question, is it necessarily faster to use an index? Let's take a look at this problem through a specific example:
select * from simple_table;
select * from simple_table where id > 0;
You can see that the first SQL is not indexed, and the second SQL uses the primary key index without the need for explain analysis. It can be seen that the speed of not using the index is faster. This is because although the index is used, the entire index tree is scanned to the right from the leftmost leaf node of the primary key index, and a full table scan is performed, which makes the index meaningless. .
To sum up: whether a query uses an index only indicates the execution process of a SQL statement; and whether it is a slow query is determined by its execution time, that is to say, there is no necessity between whether an index is used and whether it is a slow query. contact. When we use an index, we should not only focus on whether it works, but whether the index reduces the number of data rows scanned by the query, and the efficiency of scanning rows will be improved. For a large table, it is not only necessary to create an index, but also to consider the filterability of the index. If the filterability is good, the execution speed will be faster.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。