delete from bt_ask_to_cate_backup
where (ask_id, cate)
in (SELECT ask_id, cate
FROM bt_ask_to_cate_backup
group by ask_id,cate
having count(*) > 1
)
and id
not in (SELECT min(id)
FROM bt_ask_to_cate_backup
group by ask_id,cate
having count(*) > 1
)
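When I inserted the data, I did not check for duplicates. Now I want to delete the duplicate rows (duplicates being rows that share the same ask_id and cate), keeping only the row with the smallest id in each duplicate group.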
This is what I originally wrote, following the linked article "N ways to delete duplicate records with SQL":
delete from bt_ask_to_cate_backup bb where (bb.ask_id, bb.cate) in (SELECT
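However, MySQL reported an error when I used the alias, so I removed it, which gave the statement at the top. The error it reports now is:
#1093 - You can't specify target table 'bt_ask_to_cate_backup' for update in FROM clause
How can I fix this?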
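You can create a temporary table (holding only id) to store the smallest id of each (ask_id, cate) group in the original table, and then run delete from table where id not in (select * from tmp_table)
This way both tables are accessed only through the primary key column, which is the fastest approach.
The steps are as follows: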
create table tmp_table (
id int unsigned not null primary key
)
Then
insert into tmp_table (select min(id) from bt_ask_to_cate_backup group by ask_id,cate)
Finally
delete from bt_ask_to_cate_backup where id not in (select * from tmp_table)
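Once the delete has run, it may be worth confirming that no duplicates remain and then dropping the helper table. The following is only a sketch under the assumption that tmp_table was created as a regular table with the name used in the steps above; the column alias copies is made up for illustration.

```sql
-- Should return zero rows if every (ask_id, cate) pair is now unique.
SELECT ask_id, cate, COUNT(*) AS copies
FROM bt_ask_to_cate_backup
GROUP BY ask_id, cate
HAVING COUNT(*) > 1;

-- The helper table is no longer needed after the delete.
DROP TABLE tmp_table;
```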
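However, if the bt_ask_to_cate_backup table is very large, you can handle it as follows instead:
First, create a composite index on the two columns ask_id and cate. Building the index will take a long time, but it makes the later operations much faster, so the total time is reduced considerably.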
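The index step could look roughly like this. The index name idx_ask_cate is an assumption chosen for illustration and does not come from the original post.

```sql
-- Composite index covering both columns used to detect duplicates.
-- idx_ask_cate is a hypothetical name.
CREATE INDEX idx_ask_cate ON bt_ask_to_cate_backup (ask_id, cate);
```

With an index on (ask_id, cate), MySQL can usually resolve both the GROUP BY and the per-group lookups from the index rather than scanning the whole table.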
select min(id) from bt_ask_to_cate_backup group by ask_id,cate having count(id)>1
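This finds the smallest id in each group of rows that are duplicated on ask_id and cate. Next, use each smallest id to look up the corresponding ask_id and cate values (iterating with foreach), and run the following statement in a loop: delete from bt_ask_to_cate_backup where ask_id=xxx and cate=xxx and id<>xxx;
A single pass through the loop removes all of the duplicate rows.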
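If you would rather not drive the loop from application code, the same idea can probably be expressed as one set-based statement. This is a different technique from the loop described above, not the original answer's method: a multi-table DELETE joined to a derived table. Because the derived table uses GROUP BY, MySQL should materialize it, which is what normally avoids error #1093. The aliases t, d, and keep_id are made up for illustration; try it on a copy of the table first.

```sql
-- Delete every row in a duplicated (ask_id, cate) group except the one
-- with the smallest id.
DELETE t
FROM bt_ask_to_cate_backup AS t
JOIN (
    -- One row per duplicated group, carrying the id to keep.
    SELECT ask_id, cate, MIN(id) AS keep_id
    FROM bt_ask_to_cate_backup
    GROUP BY ask_id, cate
    HAVING COUNT(id) > 1
) AS d
  ON t.ask_id = d.ask_id
 AND t.cate = d.cate
WHERE t.id <> d.keep_id;
```

Selecting ask_id and cate alongside MIN(id) in the grouped query also removes the need for a separate lookup of those values by id.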