Interpretation of new features | MySQL 8.0: explain analyze analyzes the SQL execution process - 个人文章

Author: Hu Chengqing
, good at failure analysis and performance optimization, personal blog: 161e8d74154024 https://www.jianshu.com/u/a95ec11f67a8 , welcome to discuss.
Source of this article: original contribution
*The original content is produced by the open source community of Aikesheng, and the original content shall not be used without authorization. For reprinting, please contact the editor and indicate the source.

introduce

MySQL 8.0.16 introduces an experimental feature: explain format=tree , tree-like output execution process, and estimated cost and estimated number of returned rows. In MySQL 8.0.18, EXPLAIN ANALYZE was introduced. On the basis of format=tree, when used, SQL will be executed, and the actual information related to the iterator (it feels easier to understand with "operator" here) is output, such as execution cost, return Number of lines, execution time, number of loops.

Documentation link: https://dev.mysql.com/doc/refman/8.0/en/explain.html#explain-analyze

Example:

mysql> explain format=tree SELECT * FROM t1 WHERE t1.a IN (SELECT t2.b FROM t2 WHERE id < 10);
*************************** 1. row ***************************
  -> Nested loop inner join  (cost=4.95 rows=9)
    -> Filter: (`<subquery2>`.b is not null)  (cost=2.83..1.80 rows=9)
        -> Table scan on <subquery2>  (cost=0.29..2.61 rows=9)
            -> Materialize with deduplication  (cost=3.25..5.58 rows=9)
                -> Filter: (t2.b is not null)  (cost=2.06 rows=9)
                    -> Filter: (t2.id < 10)  (cost=2.06 rows=9)
                        -> Index range scan on t2 using PRIMARY  (cost=2.06 rows=9)
    -> Index lookup on t1 using a (a=`<subquery2>`.b)  (cost=2.35 rows=1)
1 row in set (0.01 sec)

mysql> explain analyze SELECT * FROM t1 WHERE t1.a IN (SELECT t2.b FROM t2 WHERE id < 10)\G
*************************** 1. row ***************************
  -> Nested loop inner join  (cost=4.95 rows=9) (actual time=0.153..0.200 rows=9 loops=1)
    -> Filter: (`<subquery2>`.b is not null)  (cost=2.83..1.80 rows=9) (actual time=0.097..0.100 rows=9 loops=1)
        -> Table scan on <subquery2>  (cost=0.29..2.61 rows=9) (actual time=0.001..0.002 rows=9 loops=1)
            -> Materialize with deduplication  (cost=3.25..5.58 rows=9) (actual time=0.090..0.092 rows=9 loops=1)
                -> Filter: (t2.b is not null)  (cost=2.06 rows=9) (actual time=0.037..0.042 rows=9 loops=1)
                    -> Filter: (t2.id < 10)  (cost=2.06 rows=9) (actual time=0.036..0.040 rows=9 loops=1)
                        -> Index range scan on t2 using PRIMARY  (cost=2.06 rows=9) (actual time=0.035..0.038 rows=9 loops=1)
    -> Index lookup on t1 using a (a=`<subquery2>`.b)  (cost=2.35 rows=1) (actual time=0.010..0.010 rows=1 loops=9)
1 row in set (0.01 sec)

It can be seen that explain format=tree shows a clearer execution process than the traditional execution plan. And explain analyze will output the actual execution time, the number of returned lines and the number of loops on this basis.

reading order

From right to left: Before encountering a parallel iterator, execution starts from the right;
From top to bottom: When encountering parallel iterators, the ones above are executed first.

The reading order of the above example is as follows (note that it is best not to output \G, otherwise the indentation of the first line will be inaccurate), and the execution order of SQL is:

Use Nested loop inner join algorithm;
t2 First take the data (Index range scan), filter (Filter), and materialize it into a temporary table (Materialize) as a driving table;
The drive table data is brought into t1 for query (Index lookup on t1), and the loop is executed 9 times.

Important information

Take the following example:

Index lookup on t1 using a (a=`<subquery2>`.b)  (cost=2.35 rows=1) (actual time=0.015..0.017 rows=1 loops=9)

cost

Estimated cost information, the calculation is more complicated. If you want to know more, you can check: explain format=json Detailed explanation

rows

The first rows are the estimated values and the second rows are the actual number of rows returned.

actual time

"0.015..0.017", note that there are two values here, the first value is the actual time to get the first row, and the second value is the time to get all the rows. If the loop is repeated many times, it is the average time, in milliseconds.

loops

Because the Nested loop inner join algorithm is used here, according to the reading order, t2 is the driving table, and the first query is materialized into a temporary table; the t1 table is used as the driven table, and the number of loop queries is 9, that is, loops=9.

Interpretation of new features | MySQL 8.0: explain analyze analyzes the SQL execution process

introduce

reading order

Important information

cost

rows

actual time

loops

爱可生开源社区

引用和评论

如何巧妙解决 Too many connections 报错？

Open WebUI：开源AI交互平台的全面解析

被 Manus 带火的 MCP 是什么｜一文看懂

MySQL × 向量数据库：大模型时代的黄金组合实战指南

百万级群聊的设计实践

MySQL Binlog 深度解析

MySQL 备份 Shell 脚本：支持远程同步与阿里云 OSS 备份