SQL grouping

SQL grouping solves the problem of totals and subtotals in OLAP scenarios. Its syntax is divided into several categories, but the same problem is to be solved:

ROLLUP and CUBE are GROUPING SETS that encapsulate rules, and GROUPING SETS are the most primitive rules.

For ease of understanding, let's start with a problem and work our way up.

bottom table

The above is the sample bottom table, there are 8 pieces of data, city 1, city 2 two cities, below each have area 1~4, each piece of data has the population number of the data.

Now I want to calculate the total population, as well as the subtotals for each city. Before we master the grouping syntax, we can only get it after unioning two select statements:

 SELECT city, sum(people) FROM test GROUP BY city
union
SELECT '合计' as city, sum(people) FROM test

But the two select statements are aggregated twice, and the performance is not a small overhead, so SQL provides the GROUPING SETS syntax to solve this problem.

GROUPING SETS

GROUP BY GROUPING SETS can specify any aggregation item. For example, if we want to calculate the total and the group total at the same time, we need to perform a GROUP BY according to the empty content and perform a sum, and then perform a GROUP BY according to the city and perform a sum, and replace it with GROUPING SETS The description is:

 SELECT 
city, area,
sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))

Among them, GROUPING SETS((), (city, area)) represents the total calculated according to the aggregation of () and (city, area) respectively. The returned result is:

As you can see, the row with a NULL value is the total we want, and its value is calculated without any GROUP BY restrictions.

Similarly, we can also write GROUPING SETS((), (city), (city, area), (area)) and any number and combination of GROUP BY conditions.

The data calculated by this rule is called "super group record". We found that NULL values generated by "super grouping records" are easily confused with real NULL values, so SQL provides the GROUPING function to solve this problem.

function GROUPING

For the NULL generated by the supergroup record, it can be recognized as 1 by the GROUPING() function:

 SELECT 
GROUPING(city),
GROUPING(area),
sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))

The specific effect is shown in the following figure:

As you can see, all fields calculated by super grouping will be identified as 1. We use the SQL CASE expressions we learned before to convert them into totals and subtotals, and then we can get a data analysis table:

 SELECT 
CASE WHEN GROUPING(city) = 1 THEN '总计' ELSE city END,
CASE WHEN GROUPING(area) = 1 THEN '小计' ELSE area END,
sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))

Then when the front-end table is displayed, the "total" and "subtotal" cells in the first row are merged into "total" to complete the BI visual analysis function of total.

ROLLUP

ROLLUP means roll up and is a GROUPING SETS of a specific rule. The following two ways of writing are equivalent:

 SELECT sum(people) FROM test
GROUP BY ROLLUP(city)

-- 等价于
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city))

Look at a set of equivalent descriptions:

 SELECT sum(people) FROM test
GROUP BY ROLLUP(city, area)

-- 等价于
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city), (city, area))

Did you find the pattern? ROLLUP will "roll up" the contents of the GROUP BY in order. Using the GROUPING function to determine super grouped records is also applicable to ROLLUP.

CUBE

CUBE is different, it expands the content in every possible way (hence the name CUBE).

By analogy with the above example, let's write two sets of equivalent expansions:

 SELECT sum(people) FROM test
GROUP BY CUBE(city)

-- 等价于
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city))

In the above example, because only one item is not visible, the following two groups can be seen:

 SELECT sum(people) FROM test
GROUP BY CUBE(city, area)

-- 等价于
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city), (area), (city, area))

The so-called CUBE is a description of a multi-dimensional shape. There are 2^1 expansions in two dimensions, 2^2 expansions in three dimensions, and so on. As you can imagine, if a lot of combinations are described with CUBE, the complexity will explode.

Summarize

After learning the GROUPING syntax, you will not be entangled in this problem in the future as a front-end classmate:

The product has totals and subtotals enabled. Do we take an extra amount or get them together?

The standard answer and principle of this question are in this article. PS: For databases that do not support GROUPING syntax, finding a way to block them is like a front-end polyfill, which is a downgrade solution. As for how to shield, refer to the two SELECT + UNION mentioned at the beginning of the article.

The discussion address is: Intensive Reading "SQL grouping" Issue #406 ascoders/weekly

If you'd like to join the discussion, click here , there are new topics every week, with a weekend or Monday release. Front-end intensive reading - help you filter reliable content.

Follow Front-end Intensive Reading WeChat Official Account

Copyright notice: Free to reprint - non-commercial - non-derivative - keep attribution ( Creative Commons 3.0 license )

SQL grouping

bottom table

GROUPING SETS

function GROUPING

ROLLUP

CUBE

Summarize

黄子毅

引用和评论

精读《算法题 - 二叉树中的最大路径和》

手写一个动态海洋和天空效果的vue hooks

你可能不知道的图片加载相关知识

使用CSS给标题添加书名号并超出省略

原生electron起步-从零到一完成构建和打包

Koa+Typescript起手式(空环境) 不用每次玩node都要搭环境了！

LRU算法，你别跑，我就要吃透你