如何优化SQL查询中的ORDER BY COUNT(column)以避免Using temporary和Using filesort?

我想要统计某个id和对应的数量并排序,sql如下:

SELECT
    t.label_name AS topic,
    t.label_id,
    COUNT( work_order_id) AS requestWorkOrderNumber
FROM
    biz_work_order_label t
INNER JOIN biz_work_order t1 ON
    (t1.id = t.work_order_id)
WHERE
    t.deleted = 0
    AND t1.deleted = 0
    AND (t.label_type = 8
        AND t1.problem_location_code LIKE '%440000%'
        AND t1.reporting_time BETWEEN '2024-01-01 00:00:00' AND '2024-12-31 23:59:59'
        AND t1.nature IN ('ts', 'jb'))
GROUP BY
    t.label_name,
    t.label_id
ORDER BY
    requestWorkOrderNumber DESC
LIMIT 10

explain的结果:
image.png
数据量大约400万,查询这个sql需要5s多,label_name、label_id、work_order_id都加过索引,没有效果,应该是count函数导致索引失效了。
应该如何优化呢?

阅读 1.2k
avatarAI BotBETA

要优化这个SQL查询,尤其是避免Using temporaryUsing filesort,可以考虑以下几种方法:

  1. 索引优化

    • 确保在连接和过滤条件上使用了合适的索引。
    • 对于ORDER BYGROUP BY字段,考虑创建覆盖索引。
  2. 查询重写

    • 使用子查询或CTE(公用表表达式)来预先计算数量,从而避免在GROUP BYORDER BY中直接使用聚合函数。

以下是优化后的SQL查询示例:

-- 首先,创建一个覆盖索引来加速查询
CREATE INDEX idx_biz_work_order_label_optimized
ON biz_work_order_label (deleted, label_type, work_order_id)
WHERE deleted = 0;

CREATE INDEX idx_biz_work_order_optimized
ON biz_work_order (deleted, problem_location_code, reporting_time, nature, id)
WHERE deleted = 0;

-- 然后,使用子查询来计算数量,并排序
WITH RequestWorkOrderCounts AS (
    SELECT
        t.label_name AS topic,
        t.label_id,
        COUNT(t1.id) AS requestWorkOrderNumber
    FROM
        biz_work_order_label t
    INNER JOIN biz_work_order t1 ON t.work_order_id = t1.id
    WHERE
        t.deleted = 0
        AND t1.deleted = 0
        AND t.label_type = 8
        AND t1.problem_location_code LIKE '%440000%'
        AND t1.reporting_time BETWEEN '2024-01-01 00:00:00' AND '2024-12-31 23:59:59'
        AND t1.nature IN ('ts', 'jb')
    GROUP BY
        t.label_name,
        t.label_id
)
SELECT
    topic,
    label_id,
    requestWorkOrderNumber
FROM
    RequestWorkOrderCounts
ORDER BY
    requestWorkOrderNumber DESC
LIMIT 10;

解释:

  1. 索引创建

    • biz_work_order_label表的索引idx_biz_work_order_label_optimized包括了deleted, label_type, 和work_order_id,并过滤deleted = 0
    • biz_work_order表的索引idx_biz_work_order_optimized包括了deleted, problem_location_code, reporting_time, nature, 和id,并过滤deleted = 0
  2. 查询重写

    • 使用CTE(WITH子句)RequestWorkOrderCounts预先计算每个label_namelabel_id的请求工作单数量。
    • 在外层查询中,仅对预先计算的结果进行排序和限制,从而避免在GROUP BYORDER BY中直接使用聚合函数。

这种方法通过减少查询中的临时表和文件排序操作,可以显著提高查询性能。确保在生产环境中测试这些更改,并根据实际情况调整索引和查询结构。

2 个回答

确保在以下列上有索引

CREATE INDEX idx_work_order_label ON biz_work_order_label (deleted, label_type, label_name, label_id);

CREATE INDEX idx_work_order ON biz_work_order (deleted, problem_location_code, reporting_time, nature);

优化后的SQL语句

SELECT
    t.label_name AS topic,
    t.label_id,
    COUNT(t1.id) AS requestWorkOrderNumber
FROM
    (SELECT label_name, label_id, work_order_id
     FROM biz_work_order_label
     WHERE deleted = 0 AND label_type = 8) t
INNER JOIN
    (SELECT id, problem_location_code, reporting_time, nature
     FROM biz_work_order
     WHERE deleted = 0
       AND problem_location_code LIKE '440000%'
       AND reporting_time BETWEEN '2024-01-01 00:00:00' AND '2024-12-31 23:59:59'
       AND nature IN ('ts', 'jb')) t1
ON t.work_order_id = t1.id
GROUP BY
    t.label_name,
    t.label_id
ORDER BY
    requestWorkOrderNumber DESC
LIMIT 10;

在biz_work_order_label 表上面的work_order_id增加索引,biz_work_order_label 自动会变成被驱动表,会提升效率,现在只需200ms就可以查出来

撰写回答
你尚未登录,登录后可以
  • 和开发者交流问题的细节
  • 关注并接收问题和回答的更新提醒
  • 参与内容的编辑和改进,让解决方法与时俱进
推荐问题
宣传栏