SQL grouping solves the problem of totals and subtotals in OLAP scenarios. Its syntax is divided into several categories, but the same problem is to be solved:
ROLLUP and CUBE are GROUPING SETS that encapsulate rules, and GROUPING SETS are the most primitive rules.
For ease of understanding, let's start with a problem and work our way up.
bottom table
The above is the sample bottom table, there are 8 pieces of data, city 1, city 2 two cities, below each have area 1~4, each piece of data has the population number of the data.
Now I want to calculate the total population, as well as the subtotals for each city. Before we master the grouping syntax, we can only get it after unioning two select statements:
SELECT city, sum(people) FROM test GROUP BY city
union
SELECT '合计' as city, sum(people) FROM test
But the two select statements are aggregated twice, and the performance is not a small overhead, so SQL provides the GROUPING SETS syntax to solve this problem.
GROUPING SETS
GROUP BY GROUPING SETS can specify any aggregation item. For example, if we want to calculate the total and the group total at the same time, we need to perform a GROUP BY according to the empty content and perform a sum, and then perform a GROUP BY according to the city and perform a sum, and replace it with GROUPING SETS The description is:
SELECT
city, area,
sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))
Among them, GROUPING SETS((), (city, area))
represents the total calculated according to the aggregation of ()
and (city, area)
respectively. The returned result is:
As you can see, the row with a NULL value is the total we want, and its value is calculated without any GROUP BY restrictions.
Similarly, we can also write GROUPING SETS((), (city), (city, area), (area))
and any number and combination of GROUP BY conditions.
The data calculated by this rule is called "super group record". We found that NULL values generated by "super grouping records" are easily confused with real NULL values, so SQL provides the GROUPING function to solve this problem.
function GROUPING
For the NULL generated by the supergroup record, it can be recognized as 1 by the GROUPING()
function:
SELECT
GROUPING(city),
GROUPING(area),
sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))
The specific effect is shown in the following figure:
As you can see, all fields calculated by super grouping will be identified as 1. We use the SQL CASE expressions we learned before to convert them into totals and subtotals, and then we can get a data analysis table:
SELECT
CASE WHEN GROUPING(city) = 1 THEN '总计' ELSE city END,
CASE WHEN GROUPING(area) = 1 THEN '小计' ELSE area END,
sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))
Then when the front-end table is displayed, the "total" and "subtotal" cells in the first row are merged into "total" to complete the BI visual analysis function of total.
ROLLUP
ROLLUP means roll up and is a GROUPING SETS of a specific rule. The following two ways of writing are equivalent:
SELECT sum(people) FROM test
GROUP BY ROLLUP(city)
-- 等价于
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city))
Look at a set of equivalent descriptions:
SELECT sum(people) FROM test
GROUP BY ROLLUP(city, area)
-- 等价于
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city), (city, area))
Did you find the pattern? ROLLUP will "roll up" the contents of the GROUP BY in order. Using the GROUPING function to determine super grouped records is also applicable to ROLLUP.
CUBE
CUBE is different, it expands the content in every possible way (hence the name CUBE).
By analogy with the above example, let's write two sets of equivalent expansions:
SELECT sum(people) FROM test
GROUP BY CUBE(city)
-- 等价于
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city))
In the above example, because only one item is not visible, the following two groups can be seen:
SELECT sum(people) FROM test
GROUP BY CUBE(city, area)
-- 等价于
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city), (area), (city, area))
The so-called CUBE is a description of a multi-dimensional shape. There are 2^1 expansions in two dimensions, 2^2 expansions in three dimensions, and so on. As you can imagine, if a lot of combinations are described with CUBE, the complexity will explode.
Summarize
After learning the GROUPING syntax, you will not be entangled in this problem in the future as a front-end classmate:
The product has totals and subtotals enabled. Do we take an extra amount or get them together?
The standard answer and principle of this question are in this article. PS: For databases that do not support GROUPING syntax, finding a way to block them is like a front-end polyfill, which is a downgrade solution. As for how to shield, refer to the two SELECT + UNION mentioned at the beginning of the article.
The discussion address is: Intensive Reading "SQL grouping" Issue #406 ascoders/weekly
If you'd like to join the discussion, click here , there are new topics every week, with a weekend or Monday release. Front-end intensive reading - help you filter reliable content.
Follow Front-end Intensive Reading WeChat Official Account
<img width=200 src="https://img.alicdn.com/tfs/TB165W0MCzqK1RjSZFLXXcn2XXa-258-258.jpg">
Copyright notice: Free to reprint - non-commercial - non-derivative - keep attribution ( Creative Commons 3.0 license )
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。