Author: Liu Chen

Known online as bisal, he has more than ten years of experience in application operations and maintenance and currently focuses on improving database application development capabilities. He is an Oracle ACE and holds Oracle OCM and OCP, EXIN DevOps Master, SCJP, and other international certifications. He is among the first Oracle YEP members in China, an OCMU member, one of the Chinese translators of "DevOps Best Practices", and an expert blogger on CSDN and ITPub. He runs the public account "bisal's personal grocery store", has long shared technical articles, and has given multiple online and offline talks on technical topics.

Source of this article: original submission

* Produced by the Aikesheng open source community. Original content may not be used without authorization; for reprints, please contact the editor and credit the source.


Two days ago, a colleague had a requirement involving grouped MySQL data: for the test data below, find the record with the most recent create_date in each name group:
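The test data itself is not reproduced in this text. As a purely hypothetical stand-in that matches the description later in the article (for name=a the latest record has value=3, for name=d it has value=10), a table like the following could be used. The column types and every row are assumptions made for illustration, not the author's actual data:

```sql
-- Hypothetical schema: column types are assumed, only the column names
-- (name, value, create_date, update_date) come from the article.
create table t1 (
    name        varchar(10),
    value       int,
    create_date datetime,
    update_date datetime
);

-- Hypothetical rows, inserted in ascending create_date order within each name,
-- so the first inserted row of a group is never its most recent one.
insert into t1 (name, value, create_date, update_date) values
    ('a', 1,  '2021-03-01 10:00:00', '2021-03-01 10:00:00'),
    ('a', 2,  '2021-03-02 10:00:00', '2021-03-02 10:00:00'),
    ('a', 3,  '2021-03-03 10:00:00', '2021-03-03 10:00:00'),
    ('b', 4,  '2021-03-01 11:00:00', '2021-03-01 11:00:00'),
    ('b', 5,  '2021-03-04 11:00:00', '2021-03-04 11:00:00'),
    ('c', 6,  '2021-03-02 12:00:00', '2021-03-02 12:00:00'),
    ('c', 7,  '2021-03-05 12:00:00', '2021-03-05 12:00:00'),
    ('d', 8,  '2021-03-01 13:00:00', '2021-03-01 13:00:00'),
    ('d', 9,  '2021-03-03 13:00:00', '2021-03-03 13:00:00'),
    ('d', 10, '2021-03-06 13:00:00', '2021-03-06 13:00:00');
```

With data shaped like this, the desired result is one row per name: the rows with value 3, 5, 7, and 10.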

Note that the MySQL version used here is 5.6. This statement was tried first:

select name, value, create_date, update_date from t1 group by name order by create_date desc;

The query results are as follows. They look right at first glance, but a closer look reveals problems. For example, the most recent create_date for name=a should belong to the record with value=3, and the most recent create_date for name=d should belong to the record with value=10:

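What this SQL returned here was just the first record inserted in each name group, with the groups then sorted by create_date in descending order, which is completely different from the original requirement. (With ONLY_FULL_GROUP_BY disabled, MySQL is free to take the nonaggregated column values from any row in the group, so even this result is not guaranteed.)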

At this point, a divide-and-conquer strategy can be used: sort first, then group:

select * from (select name, value, create_date, update_date from t1 order by create_date desc) t group by t.name;

This returns the data the requirement asked for:

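Note that this approach relies on behavior MySQL does not guarantee: which row supplies the nonaggregated columns of a group is nondeterministic, and from 5.7 on the optimizer may ignore an ORDER BY inside a derived table when the outer query is grouped. Of course, there are other ways to meet this requirement. Interested readers are welcome to write their own and share them.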

A careful reader may notice something odd about the GROUP BY in the SQL above. Normally, every nonaggregated column in the SELECT list must also appear in the GROUP BY clause, so why did the statement above not raise an error?

If we execute the same statement in MySQL 5.7:

select name, value, create_date, update_date from t1 group by name order by create_date desc; 

It fails with this error:

In Oracle, the same statement fails with this message:

Is this a feature of MySQL 5.6? After some searching, the cause turned out to be sql_mode. In this 5.6 environment, the sql_mode parameter is set as follows:

In 5.7, the sql_mode parameter is set as follows. It additionally contains ONLY_FULL_GROUP_BY, which means "columns that appear in the SELECT list, HAVING condition, and ORDER BY clause must be GROUP BY columns or be functionally dependent on GROUP BY columns". This explains why the behavior differs between 5.6 and 5.7: by default, 5.7 validates SQL more strictly:
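To see which modes a particular server actually uses, the setting can be queried directly. A minimal sketch, which should work on both 5.6 and 5.7; the output depends entirely on the server's configuration, so none is shown here:

```sql
-- Show the global default and the value in effect for the current session.
select @@global.sql_mode, @@session.sql_mode;
```

If ONLY_FULL_GROUP_BY appears in the session value, the loosely grouped statement from earlier in the article will be rejected.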

Therefore, after upgrading from 5.6 to 5.7, the same SQL may well behave differently, which places high demands on compatibility testing. The differences come partly from changes in the features themselves and partly from different configuration parameters.

In 5.7, you can remove ONLY_FULL_GROUP_BY from sql_mode to get the same behavior as in 5.6, or rewrite the SQL, for example:

select * from t1 a where create_date = (select max(create_date) from t1 b where a.name = b.name);

or,

select * from t1 a where not exists (select * from t1 b where a.name = b.name and b.create_date > a.create_date);
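For the other option mentioned above, removing ONLY_FULL_GROUP_BY, one commonly used idiom is sketched below. It assumes the change should apply only to the current session; making it permanent would require changing the global value or the server configuration file, which is not shown here:

```sql
-- Strip ONLY_FULL_GROUP_BY from the current session's sql_mode,
-- leaving the other modes in place.
set session sql_mode = (select replace(@@sql_mode, 'ONLY_FULL_GROUP_BY', ''));

-- Confirm the result.
select @@session.sql_mode;
```

Relaxing the mode restores the 5.6 behavior, including the nondeterministic choice of row per group, so rewriting the query is usually the safer of the two options.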

MySQL 8.0 supports the row_number() function, so the approach there should be similar to the Oracle one below.

In Oracle, row_number() can be used to meet this requirement:

select * from (select name, create_date, row_number() over (partition by name order by create_date desc) as r from t1) where r=1;
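A MySQL 8.0 version of the same idea might look like the sketch below. It is not from the original article; the alias t and the wider column list are choices made here for illustration:

```sql
-- MySQL 8.0+ sketch: number the rows of each name group from newest to oldest,
-- then keep only the newest one.
-- Unlike Oracle, MySQL requires an alias (here "t") for the derived table.
select name, value, create_date, update_date
from (
    select name, value, create_date, update_date,
           row_number() over (partition by name order by create_date desc) as r
    from t1
) t
where r = 1;
```

Because row_number() assigns distinct numbers even when two rows share the same create_date, this returns exactly one row per name, whereas the max(create_date) and not exists rewrites return every tied row.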
