Hello everyone, I am Glacier~~
As one of the most widely used relational databases in the Internet industry, MySQL owes much of its popularity to being free and open source. However, many developers have worked for years and still only know how to use MySQL for CRUD operations. As a result, when they try to change jobs and join a big tech company after all those years, they repeatedly hit a wall in interviews.
Here is a simple question: how is a select statement executed in MySQL? Many interviewers like to ask it. If you can't answer even this one, it is time to think seriously about your career plan.
Well, today we will talk about how the select statement is executed in MySQL. The main content of the article is as follows.
Frequently used select statement
To tie the whole article together, we start with the simplest possible select statement. For example, to query the information of the user whose id is 1001 in the user table, we use the following SQL statement.
select * from user where user_id = 1001;
When we enter the above SQL statement on the MySQL command line, how is this SQL statement executed in MySQL? Next, we will take this SQL statement as an example to talk about how the select statement is executed in MySQL.
MySQL logical architecture
Before walking through the execution of the select statement, let's take a look at the logical architecture of MySQL, because every SQL statement is executed by the components of this architecture.
The figure above is a simplified diagram of MySQL's logical architecture, showing the layers and what each part of each layer does. Logically, we can roughly divide MySQL into three layers: the Server layer, the storage engine layer, and the system file layer. The Server layer can be further divided into a network connection layer (the connector) and a data service layer.
The Server layer contains the core components of MySQL, such as the connector, query cache, analyzer, optimizer, and executor. It also contains all built-in functions (date and time functions, encryption and decryption functions, aggregate functions, mathematical functions, and so on) and the features that work across storage engines, such as stored procedures, triggers, and views.
The storage engine layer is mainly responsible for interacting with the system file layer. The storage engine layer itself is a plug-in architecture design and supports storage engines such as InnoDB, MyISAM, Archive, and Memory. In MySQL 5.5.5 and later versions, the default storage engine of MySQL is InnoDB.
The system file layer is mainly responsible for storing the actual data and storing the data on the disk of the server in the form of files.
Next, let's talk about how a select statement is executed in each part of MySQL's logical architecture.
How is the connector authorized?
First, let's take a look at how the MySQL connector verifies a connection when we run the MySQL client on the server's command line. For example, we enter the following command.
mysql -ubinghe -p
After pressing Enter and typing the password of the binghe account, the client connects to MySQL. The connection first has to complete the classic TCP handshake. After that, the connector verifies the identity of the connection, and the most direct way is to check whether the username and password are correct.
If the username or password is wrong, MySQL reports an "Access denied for user" error. If the username and password are correct, the connector looks up the permissions of the current connection in MySQL's permission tables. From then on, as long as the connection stays open, the permission checks on this connection rely on the permissions read at this moment.
In other words, once a user has connected to MySQL, an administrator's change to that user's permissions does not affect the existing connection; it only applies to connections created afterwards. (Strictly speaking, the MySQL manual describes this behavior for global privileges; table-level and column-level changes take effect on the client's next request.)
If the client stays idle for a long time after connecting, the connector automatically disconnects it. How long that takes is controlled by the MySQL parameter wait_timeout, which is 8 hours by default. We can adjust the value of this parameter to suit our actual business scenarios.
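To make the "permissions are read once at connect time" idea concrete, here is a minimal toy model in Python. It is only a sketch of the behavior described above for global privileges; the class names, the password, and the data layout are hypothetical and have nothing to do with MySQL's real internals.

```python
# Toy model only: ToyServer and ToyConnection are hypothetical names,
# not MySQL internals. They imitate "read privileges once at connect time".
class ToyServer:
    def __init__(self):
        # account -> (password, set of global privileges); sample data
        self.accounts = {"binghe": ("secret", {"SELECT"})}

    def connect(self, user, password):
        stored = self.accounts.get(user)
        if stored is None or stored[0] != password:
            raise PermissionError(f"Access denied for user '{user}'")
        # Copy the privileges once; later changes do not touch this copy.
        return ToyConnection(user, frozenset(stored[1]))

    def revoke(self, user, privilege):
        self.accounts[user][1].discard(privilege)


class ToyConnection:
    def __init__(self, user, privileges):
        self.user = user
        self.privileges = privileges

    def can(self, privilege):
        return privilege in self.privileges


server = ToyServer()
old_conn = server.connect("binghe", "secret")
server.revoke("binghe", "SELECT")        # the administrator changes permissions
new_conn = server.connect("binghe", "secret")
print(old_conn.can("SELECT"), new_conn.can("SELECT"))
```

The connection opened before the change still sees the old privileges, while the connection opened afterwards sees the new ones.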
Establishing a connection between the client and MySQL is relatively complicated and time-consuming: it involves the TCP handshake, looking up the permissions of the current connection, and so on. In practice, therefore, we usually use a database connection pool to cache connections, which means that we talk to MySQL over long-lived connections.
But long-lived connections have a problem of their own: sometimes the memory occupied by MySQL rises very quickly. This is because the temporary memory MySQL uses while executing statements is managed inside the connection object, and it is released only when the connection is closed. If connections are kept open for a long time, this temporary memory accumulates. Eventually MySQL may use so much memory that the operating system kills the process (OOM), giving the impression that MySQL restarted unexpectedly.
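As a rough illustration of why a pool saves work, here is a small Python sketch. It is not a real MySQL driver: FakeConnection and SimplePool are hypothetical stand-ins, and the pool simply assumes that a connection left idle for longer than wait_timeout has already been closed by the server.

```python
import time
from collections import deque

WAIT_TIMEOUT = 8 * 60 * 60  # seconds; mirrors the 8-hour default of wait_timeout


class FakeConnection:
    """Hypothetical stand-in for a real MySQL connection."""
    opened = 0  # counts how many expensive "handshakes" happened

    def __init__(self):
        FakeConnection.opened += 1


class SimplePool:
    def __init__(self, clock=time.monotonic, max_idle=WAIT_TIMEOUT):
        self._idle = deque()      # (connection, time it was returned)
        self._clock = clock
        self._max_idle = max_idle

    def acquire(self):
        while self._idle:
            conn, returned_at = self._idle.pop()
            # Skip connections the server has probably closed already.
            if self._clock() - returned_at < self._max_idle:
                return conn
        return FakeConnection()   # pay the connection cost only when needed

    def release(self, conn):
        self._idle.append((conn, self._clock()))


now = [0.0]                       # a fake clock so the demo runs instantly
pool = SimplePool(clock=lambda: now[0])
first = pool.acquire()
pool.release(first)
second = pool.acquire()           # reused, no new handshake
pool.release(second)
now[0] += WAIT_TIMEOUT + 1        # idle for longer than wait_timeout
third = pool.acquire()            # stale connection dropped, new one opened
print(first is second, third is first, FakeConnection.opened)
```

Real connection pools do much more (size limits, health checks, thread safety), so treat this only as a picture of the idea.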
We can use the following solutions to solve this problem:
- Disconnect periodically, or after executing a query that takes up a lot of memory, and re-establish the connection with MySQL later.
- If you use MySQL 5.7 or a newer version, you can call mysql_reset_connection after each memory-heavy operation to reinitialize the connection's resources. This does not reconnect to MySQL, nor does it redo permission verification; it only restores the connection to the state it was in right after it was created.
What is the role of the query cache?
Once the client has connected to MySQL and issues a select statement, MySQL first checks the query cache to see whether this exact statement has been executed before. Previously executed select statements and their results are stored in the query cache as key-value pairs, where the key is the query statement and the value is the result data of the query.
If the query hits the cache, the cached result is returned directly, which is very fast. If the data is not found in the query cache, execution continues with the following stages, and once it completes the result is stored in the query cache.
However, most of the time I don't recommend turning on the query cache. Why? The reason is simple: the query cache is invalidated very frequently. As soon as a table is updated, all cached queries on that table are cleared. And in MySQL 8.0, the query cache was removed entirely (Glacier also confirmed this when reading the MySQL source code).
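The key-value lookup and the table-level invalidation described above can be pictured with a few lines of Python. This is a toy model under my own assumptions: ToyQueryCache and its methods are hypothetical names, and the cached row is made-up sample data.

```python
class ToyQueryCache:
    """Key-value cache keyed by the SQL text, invalidated per table."""

    def __init__(self):
        self._results = {}   # sql text -> result rows
        self._by_table = {}  # table name -> set of cached sql texts

    def get(self, sql):
        return self._results.get(sql)

    def put(self, sql, table, rows):
        self._results[sql] = rows
        self._by_table.setdefault(table, set()).add(sql)

    def invalidate_table(self, table):
        # Any update to the table drops every cached query that reads it.
        for sql in self._by_table.pop(table, set()):
            self._results.pop(sql, None)


cache = ToyQueryCache()
sql = "select * from user where user_id = 1001"
print(cache.get(sql))                        # miss
cache.put(sql, "user", [(1001, "binghe")])   # made-up sample row
print(cache.get(sql))                        # hit
cache.invalidate_table("user")               # e.g. after an update on user
print(cache.get(sql))                        # miss again
```

On a table that is updated often, almost every lookup ends up in the "miss again" case, which is why the cache rarely pays off.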
What did the analyzer do to the select statement?
The analyzer mainly performs lexical analysis and grammatical analysis operations on the select statement.
If the select statement does not hit the cache, the analyzer will first perform a "lexical analysis" operation on it. At this time, MySQL will identify what each string in the select statement represents.
For example, MySQL recognizes from the "select" keyword that this is a query statement, and it recognizes "user" as the table name user and "user_id" as the field name user_id. Next comes "grammatical analysis": MySQL checks, according to its grammar rules, whether the select statement is valid MySQL syntax. If the input SQL statement does not meet the grammar rules, MySQL returns the corresponding error message.
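To get a feel for the two steps, here is a tiny Python sketch: a tokenizer for the lexical step and a checker for the grammatical step. It only understands statements shaped like our example, the function names are hypothetical, and MySQL's real parser is far more elaborate.

```python
import re

KEYWORDS = {"select", "from", "where"}
# A number, a word, or any single non-space symbol, with optional leading spaces.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_][A-Za-z0-9_]*)|(\S))")


def tokenize(sql):
    """Lexical analysis: split the SQL text into (kind, text) pairs."""
    tokens = []
    for number, word, symbol in TOKEN_RE.findall(sql):
        if number:
            tokens.append(("NUMBER", number))
        elif word:
            kind = "KEYWORD" if word.lower() in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        else:
            tokens.append(("SYMBOL", symbol))
    return tokens


def check_syntax(tokens):
    """Grammatical analysis for one shape only:
    select * from <table> where <column> = <number> [;]"""
    expected = [("KEYWORD", "select"), ("SYMBOL", "*"), ("KEYWORD", "from"),
                ("IDENTIFIER", None), ("KEYWORD", "where"),
                ("IDENTIFIER", None), ("SYMBOL", "="), ("NUMBER", None)]
    if tokens and tokens[-1] == ("SYMBOL", ";"):
        tokens = tokens[:-1]
    if len(tokens) != len(expected):
        raise SyntaxError("error in SQL syntax")
    for (kind, text), (want_kind, want_text) in zip(tokens, expected):
        if kind != want_kind or (want_text and text.lower() != want_text):
            raise SyntaxError(f"error in SQL syntax near '{text}'")
    return True


print(tokenize("select * from user where user_id = 1001;"))
try:
    check_syntax(tokenize("elect * from user where user_id = 1001;"))
except SyntaxError as err:
    print(err)
```

The misspelled "elect" is tokenized as an ordinary identifier, so the mistake is only caught in the grammatical step, where a keyword was expected.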
How does the optimizer optimize the select statement?
After lexical analysis and grammatical analysis, the select statement must pass through the optimizer before it can be executed. For example, if several indexes could be used for our select statement, the optimizer decides which index to use to query the data. If the select statement joins multiple tables, the optimizer decides the join order of the tables. Different join orders can differ greatly in execution efficiency, and the optimizer generally chooses the order it estimates to be the most efficient.
Once the optimizer has processed the select statement, it enters the execution phase.
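As a rough picture of "choose the cheapest join order", the Python sketch below tries every order and keeps the one with the lowest estimated cost. The cost formula is a deliberately naive assumption of mine (each driving row probes every other table once through an index), and the orders table and the row counts are invented for the example. MySQL's real optimizer relies on statistics and a much richer cost model.

```python
from itertools import permutations


def nested_loop_cost(order, filtered_rows):
    """Naive estimate: read the driving table's rows, then probe each
    remaining table once per driving row."""
    driving = filtered_rows[order[0]]
    return driving + driving * (len(order) - 1)


def choose_join_order(filtered_rows):
    """Return the table order with the lowest estimated cost."""
    return min(permutations(filtered_rows),
               key=lambda order: nested_loop_cost(order, filtered_rows))


# Invented estimates: rows left in each table after applying its filters.
estimates = {"user": 10, "orders": 100000}
print(choose_join_order(estimates))
```

With these numbers the sketch drives the join from the small user table, which matches the intuition that the order of the tables can change the amount of work dramatically.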
How does the executor execute the select statement?
When the select statement enters the execution phase, the executor first checks the permissions of the current connection. The most direct check is whether the current connection has query permission on the table user. If it does not, a permission error is returned, for example:
ERROR 1142 (42000): SELECT command denied to user 'binghe'@'localhost' for table 'user'
If the current connection has query permission on the table user, execution continues. The table is opened first; at this point the executor calls the interface of the storage engine that the table was created with to carry out the query. Here is an example:
Assuming that there is no index on the user_id field, the executor proceeds roughly as follows.
(1) Read the first row of the table user through the storage engine and check whether its user_id value equals 1001. If it does not, continue with the next row; if it does, put the current row into the result set.
(2) Continue to read the next row of data through the storage engine, and execute the same logical judgment as (1) until all the data in the user table is processed.
(3) After processing all the data, the executor will return the data in the result set to the client.
If there is an index on the user_id field, the overall logic is roughly the same: the executor asks the storage engine for the first row that satisfies the condition, and then repeatedly asks for the next row that satisfies the condition until there are no more.
If the slow query log is enabled, the entry written for a select statement includes a rows_examined field, which indicates how many rows were scanned during the execution of the statement. This value is accumulated each time the executor calls the storage engine to fetch a row. However, in some scenarios a single call from the executor makes the storage engine scan multiple rows internally, so the number of rows actually scanned by the storage engine is not exactly the same as the number in the rows_examined field.
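The scan loop and the row counter can be sketched in Python as follows. ToyEngine is a hypothetical stand-in for a storage engine interface, the sorted list only imitates an index, and the three rows are made-up sample data. The counter here grows by one per row handed to the executor, which is the simple case described above.

```python
from bisect import bisect_left


class ToyEngine:
    """Hypothetical storage engine; rows are (user_id, name) tuples."""

    def __init__(self, rows):
        self.rows = rows
        # A sorted list of (user_id, position) pairs stands in for an index.
        self.index = sorted((row[0], pos) for pos, row in enumerate(rows))

    def scan(self):
        # "Read the first row", then "read the next row", until the end.
        yield from self.rows

    def index_lookup(self, user_id):
        # "First row that satisfies the condition", then "next such row".
        i = bisect_left(self.index, (user_id, -1))
        while i < len(self.index) and self.index[i][0] == user_id:
            yield self.rows[self.index[i][1]]
            i += 1


def execute_select(engine, user_id, use_index):
    result_set, rows_examined = [], 0
    source = engine.index_lookup(user_id) if use_index else engine.scan()
    for row in source:
        rows_examined += 1            # one engine call per row in this toy
        if row[0] == user_id:
            result_set.append(row)
    return result_set, rows_examined


engine = ToyEngine([(1000, "a"), (1001, "b"), (1002, "c")])
print(execute_select(engine, 1001, use_index=False))
print(execute_select(engine, 1001, use_index=True))
```

Both calls return the same row, but the full scan examines every row in the table while the indexed lookup examines only the matching one.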
That's all for today. I'm Glacier, see you in the next issue~~