In a recent interview, there was really an interviewer who asked me about a solution to an online fault problem that I had encountered before (when the interviewer said it, I secretly thanked that I had encountered it...), as follows .
Recently, there is a project where the upper computer obtains the data reported by the lower computer. Due to the frequent reporting frequency and the large amount of data, the data grows too fast and the disk is occupied.
In order to save costs, data backup is performed regularly, and table records are deleted through delete.
Obviously delete has been executed, but the size of the table file has not been reduced, which is puzzling
Mysql is used as the database in the project. For tables, it is generally table structure and table data. The space occupied by the table structure is relatively small, generally the space occupied by the table data.
When we use delete to delete data, the data records in the table are indeed deleted, but there is no change in the file size of the table.
Mysql data structure
Anyone who has used mysql must have heard about the B+ tree. In MySQL InnoDB, the B+ tree is used as the structure for storing data, which is often referred to as the index-organized table, and the data is stored in pages. Therefore, when deleting data, there are two situations:
- Delete some records in the data page
- Delete the content of the entire data page
The table file size has not changed and is related to the mysql design
For example, if you want to delete the R4 record:
InnoDB directly marks this record R4 as deleted, which is called a reusable location. If a record with an ID between 300 and 700 is to be inserted later, the position will be reused. It can be seen that the size of the disk file will not decrease.
The general deletion of the entire page of data also deletes the record mark, and the data reuses this location. The difference from deleting the silent record is that the entire page of record is deleted. When the data inserted later is not in the original range, the location can be reused. If only the silent record is deleted, it can be reused only when the inserted data matches the deleted record position.
Therefore, whether it is the deletion of a data row or a data page, it is marked as a deleted state for reuse, so the file will not be reduced.
How can we make the table size smaller
DELETE only deletes the data flag, but does not organize the data file. When new data is inserted, the record space set as the delete flag will be used again. You can use OPTIMIZE TABLE to reclaim the unused space and defragment the data file. .
OPTIMIZE TABLE 表名;
Note: OPTIMIZE TABLE only works on MyISAM, BDB and InnoDB tables.
In addition, you can also rebuild the table through ALTER TABLE
ALTER TABLE 表名 ENGINE=INNODB
Some people may ask what is the difference between OPTIMIZE TABLE and ALTER TABLE?
alter table t engine = InnoDB (that is, recreate), and optimize table t is equal to recreate+analyze
Online DDL
Finally, let’s talk about Online DDL. One of the daily tasks of dba is definitely ddl changes. ddl changes will lock the table. This can be said to be an eternal pain in dba's heart, especially the implementation of ddl changes, resulting in a large number of threads on the library in the "Waiting for meta" data lock" state. Therefore, Online DDL was introduced after version 5.6.
Before the launch of Online DDL, there were two main ways to execute DDL: copy and inplace. The inplace method is also known as (fast index creation). Compared with the copy method, the inplace method does not copy data, so it is faster. But this method only supports adding and deleting indexes, and it needs to lock the table all the way like the copy method, which is not very practical. Compared with the previous two methods, Online mode can not only read, but also support write operations.
When executing an online DDL statement, use the ALGORITHM and LOCK keywords. These two keywords are at the end of our DDL statement and can be separated by commas. Examples are as follows:
ALTER TABLE tbl_name ADD COLUMN col_name col_type, ALGORITHM=INPLACE, LOCK=NONE;
ALGORITHM options
- INPLACE: Replace: perform DDL operations directly on the original table.
- COPY: Copy: Use a temporary table method to clone a temporary table, execute DDL on the temporary table, and then import the data into the temporary table, and then rename it. During this period, double the disk space is needed to support such operations. During execution, the table does not allow DML operations.
- DEFAULT: The default method, MySQL has its own choice, and the INPLACE method is preferred.
LOCK option
- SHARE: Shared lock, the table that executes DDL can be read, but cannot be written.
- NONE: There is no restriction, the table that executes DDL can be read and written.
- EXCLUSIVE: Exclusive lock, the table executing DDL cannot be read or written.
- DEFAULT: The default value, which is the default value used when the LOCK clause is not specified in the DDL statement. If the value of LOCK is specified as DEFAULT, it is to the MySQL clause to determine whether to lock or not to lock the table. It is not recommended to use. If you are sure that your DDL statement will not lock the table, you can not specify lock or specify its value as default, otherwise it is recommended to specify its lock type.
When performing DDL operations, the ALGORITHM option does not need to be specified. At this time, MySQL automatically selects the appropriate mode in the order of INSTANT, INPLACE, and COPY. You can also specify ALGORITHM=DEFAULT, which has the same effect. If the ALGORITHM option is specified but not supported, an error will be reported directly.
OPTIMIZE TABLE and ALTER TABLE table name ENGINE=INNODB both support Oline DDL, but it is still recommended to use it when business visits are low
to sum up
When delete deletes data, the corresponding data row is not actually deleted, but only marked as a reusable state, so the table space will not become smaller.
You can rebuild the table to quickly make the table after the delete data smaller (OPTIMIZE TABLE or ALTER TABLE). After version 5.6, the creation of the table already supports the online operation, but it is best to use it when the business is low.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。