Author: Hu Chengqing

, good at failure analysis and performance optimization, personal blog: 161e8d74154024 https://www.jianshu.com/u/a95ec11f67a8 , welcome to discuss.

Source of this article: original contribution

*The original content is produced by the open source community of Aikesheng, and the original content shall not be used without authorization. For reprinting, please contact the editor and indicate the source.


introduce

MySQL 8.0.16 introduces an experimental feature: explain format=tree , tree-like output execution process, and estimated cost and estimated number of returned rows. In MySQL 8.0.18, EXPLAIN ANALYZE was introduced. On the basis of format=tree, when used, SQL will be executed, and the actual information related to the iterator (it feels easier to understand with "operator" here) is output, such as execution cost, return Number of lines, execution time, number of loops.

Documentation link: https://dev.mysql.com/doc/refman/8.0/en/explain.html#explain-analyze

Example:

mysql> explain format=tree SELECT * FROM t1 WHERE t1.a IN (SELECT t2.b FROM t2 WHERE id < 10);
*************************** 1. row ***************************
  -> Nested loop inner join  (cost=4.95 rows=9)
    -> Filter: (`<subquery2>`.b is not null)  (cost=2.83..1.80 rows=9)
        -> Table scan on <subquery2>  (cost=0.29..2.61 rows=9)
            -> Materialize with deduplication  (cost=3.25..5.58 rows=9)
                -> Filter: (t2.b is not null)  (cost=2.06 rows=9)
                    -> Filter: (t2.id < 10)  (cost=2.06 rows=9)
                        -> Index range scan on t2 using PRIMARY  (cost=2.06 rows=9)
    -> Index lookup on t1 using a (a=`<subquery2>`.b)  (cost=2.35 rows=1)
1 row in set (0.01 sec)

mysql> explain analyze SELECT * FROM t1 WHERE t1.a IN (SELECT t2.b FROM t2 WHERE id < 10)\G
*************************** 1. row ***************************
  -> Nested loop inner join  (cost=4.95 rows=9) (actual time=0.153..0.200 rows=9 loops=1)
    -> Filter: (`<subquery2>`.b is not null)  (cost=2.83..1.80 rows=9) (actual time=0.097..0.100 rows=9 loops=1)
        -> Table scan on <subquery2>  (cost=0.29..2.61 rows=9) (actual time=0.001..0.002 rows=9 loops=1)
            -> Materialize with deduplication  (cost=3.25..5.58 rows=9) (actual time=0.090..0.092 rows=9 loops=1)
                -> Filter: (t2.b is not null)  (cost=2.06 rows=9) (actual time=0.037..0.042 rows=9 loops=1)
                    -> Filter: (t2.id < 10)  (cost=2.06 rows=9) (actual time=0.036..0.040 rows=9 loops=1)
                        -> Index range scan on t2 using PRIMARY  (cost=2.06 rows=9) (actual time=0.035..0.038 rows=9 loops=1)
    -> Index lookup on t1 using a (a=`<subquery2>`.b)  (cost=2.35 rows=1) (actual time=0.010..0.010 rows=1 loops=9)
1 row in set (0.01 sec)

It can be seen that explain format=tree shows a clearer execution process than the traditional execution plan. And explain analyze will output the actual execution time, the number of returned lines and the number of loops on this basis.

reading order

  1. From right to left: Before encountering a parallel iterator, execution starts from the right;
  2. From top to bottom: When encountering parallel iterators, the ones above are executed first.

The reading order of the above example is as follows (note that it is best not to output \G, otherwise the indentation of the first line will be inaccurate), and the execution order of SQL is:

  1. Use Nested loop inner join algorithm;
  2. t2 First take the data (Index range scan), filter (Filter), and materialize it into a temporary table (Materialize) as a driving table;
  3. The drive table data is brought into t1 for query (Index lookup on t1), and the loop is executed 9 times.

Important information

Take the following example:

Index lookup on t1 using a (a=`<subquery2>`.b)  (cost=2.35 rows=1) (actual time=0.015..0.017 rows=1 loops=9)

cost

Estimated cost information, the calculation is more complicated. If you want to know more, you can check: explain format=json Detailed explanation

rows

The first rows are the estimated values and the second rows are the actual number of rows returned.

actual time

"0.015..0.017", note that there are two values here, the first value is the actual time to get the first row, and the second value is the time to get all the rows. If the loop is repeated many times, it is the average time, in milliseconds.

loops

Because the Nested loop inner join algorithm is used here, according to the reading order, t2 is the driving table, and the first query is materialized into a temporary table; the t1 table is used as the driven table, and the number of loop queries is 9, that is, loops=9.


爱可生开源社区
426 声望207 粉丝

成立于 2017 年,以开源高质量的运维工具、日常分享技术干货内容、持续的全国性的社区活动为社区己任;目前开源的产品有:SQL审核工具 SQLE,分布式中间件 DBLE、数据传输组件DTLE。