When doing data analysis, we often hear the concept of year-on-year and month-on-month. When various enterprises and organizations release statistical data, they usually like to use year-on-year and month-on-month comparisons with previous historical data to illustrate the changes in data. For example, the Bureau of Statistics announced that the CPI in January 2022 will increase by 0.9% year-on-year and 0.6% month-on-month.
In practice, in database-based data analysis scenarios, month-on-month and year-on-year are typical complex computing scenarios, especially before the emergence of analytical functions in commercial databases such as Oracle. Taking MySQL as an example, the Lag and Lead functions were only introduced in version 8.0. These two functions, combined with the windowing function, effectively improve the implementation efficiency of complex operations such as year-on-year and chain-comparison. In the 5.x series, MySQL needed to rely on multiple nested subqueries and self-associations to achieve such computations.
Let's take a simple example to look at it separately. MySql 5.x and 8.0 are calculated on a year-on-year basis and a chain-by-month basis.
For example data of see the table:
CREATE TABLE sales (
`产品ID` varchar(20),
`销售数量` int(20) ,
`销售时间` timestamp(6) NULL DEFAULT NULL
)
INSERT INTO sales VALUES ('C1001', 15, '2020-06-01 10:10:12');
INSERT INTO sales VALUES ('C1002',26, '2020-05-02 0:10:12');
INSERT INTO sales VALUES ('C1003', 21, '2020-04-03 0:10:12');
INSERT INTO sales VALUES ('C1003', 23, '2020-04-04 0:10:12');
INSERT INTO sales VALUES ('C1003', 0, '2020-03-05 0:10:12');
INSERT INTO sales VALUES ('C1001', 16, '2020-02-06 3:0:12');
INSERT INTO sales VALUES ('C1002', 32, '2020-01-07 0:10:12');
INSERT INTO sales VALUES ('C1001', 16, '2019-12-08 0:12:24');
INSERT INTO sales VALUES ('C1001', 32, '2019-06-09 0:12:24');
INSERT INTO sales VALUES ('C1002', 17, '2019-05-09 0:12:24');
1. MySQL 5.x: achieve year-on-year and proportion calculations through sub-queries and associations
Take the total sales value of different years by year and month, and calculate the month-on-month (total sales compared with the previous period) and year-on-year (total sales compared with the same period last year) as an example.
Example table structure and data
Calculate month-on-month and year-on-year through SQL:
select year(c.销售时间) yy,month(c.销售时间) mm,
concat(ifnull(abs(round((sum(c.销售数量)-ss1)/ss1*100,2)),0),'%') 同比,
concat(ifnull(abs(round((sum(c.销售数量)-ss2)/ss2*100,2)),0),'%') 环比
from sales c
left join (select month(a.销售时间) mm1,
year(a.销售时间) yy1,
sum(a.销售数量) ss1
from sales a
GROUP BY mm1,yy1) a
on month(c.销售时间) = a.mm1
and a.yy1 = year(c.销售时间)-1
left join (select month(a.销售时间) mm2,
year(a.销售时间) yy2,
sum(a.销售数量) ss2
from sales a
GROUP BY mm2,yy2) b
on (b.yy2 = year(c.销售时间) and b.mm2+1 = month(c.销售时间) OR (yy2=year(c.销售时间)-1
AND b.mm2 = 12 AND month(c.销售时间) = 1))
group by yy, mm
order by yy,mm asc
Calculation results:
2. MySQL 8.0: achieve year-on-year and proportion calculations through analytical functions**
MySql8.0 supports the Lead and Lag analysis functions. Although it can greatly improve the efficiency of the same and chain calculations, it still needs to write SQL statements for processing.
2.1 Calculation year-on-year
select t2.年份,t2.月份,concat(round((t2.数量-t1.数量)/t1.数量,2)*100,'%') as 同比 from (
SELECT year(销售时间) as 年份,month(销售时间) as 月份,sum(销售数量) as 数量 from sales
group by year(销售时间),month(销售时间) order by year(销售时间) desc, month(销售时间) desc
) t1
,(
SELECT year(销售时间) as 年份,month(销售时间) as 月份,sum(销售数量) as 数量 from sales
group by year(销售时间),month(销售时间) order by year(销售时间) desc, month(销售时间) desc
) t2 where t1.年份=t2.年份-1 and t1.月份=t2.月份
2.2 Calculate the chain ratio
SELECT
mm,
CONCAT(
ROUND(
IFNULL(
(xl - first_xl) / first_xl * 100,
2
),
0
),
'%'
) AS 环比
FROM
(
SELECT
mm,
xl,
lead (xl, 1) over (ORDER BY mm DESC) AS first_xl
FROM
(
SELECT
DATE_FORMAT(销售时间, '%Y-%m') AS mm,
sum(销售数量) AS xl
FROM
sales
GROUP BY
DATE_FORMAT(销售时间, '%Y-%m')
) t
) a
After SqlServer2008R2 and Oracle10g, Lag and Lead analysis functions are provided. The specific calculation logic and usage are similar to those of MySQL 8.0 above.
3. Calculation engine using BI tools
For such complex computing scenarios, business intelligence BI data analysis tools provide more efficient solutions. Take Wyn Enterprise embedded business intelligence software as an example. Its built-in Wax analysis expression and fast calculation engine provide the ability to directly realize complex calculations such as year-on-year and chain-comparison, without the need to write complex and lengthy SQL.
3.1 Use the built-in fast calculation function of year-on-year and month-on-month**
Calculations such as year-on-year and chain ratios are generally standard functions of BI tools, and we can implement them directly through settings.
3.2 Using data analysis expressions
If the built-in fast calculations are not sufficient, more complex calculations can also be achieved by analyzing expressions. Analysis expression is a more flexible and powerful data calculation method. With rich functions, users can freely combine them like Excel formulas to achieve more powerful analysis capabilities. Analysis expressions perform business calculations based on data models, and use some defined functions to use correct syntax to complete a complex business logic calculation. In this way, users can use data more flexibly and make maximum use of data.
Dear bosses, by comparing the differences between SQL and BI data analysis tools in handling complex calculations such as year-on-year and month-on-month, we can find that professional tools are more convenient in data calculation and processing capabilities. In the future, if you have similar analysis and calculation requirements, it is more appropriate to choose BI analysis tool for processing.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。