Lifetime best friend redo log, binlog "dead MySQL series II"

Series of articles

Originally, a select statement was executed in MySQL like this "
Lifetime best friend redo log, binlog "dead MySQL series two"

咔咔闲谈

In the previous issue, the overall structure of MySQL was analyzed based on the query process of a query statement. Similarly, this issue also uses a query SQL statement as an introduction. To be sure, the process update statement executed by the query statement will also be executed.

Therefore, the focus of this issue is not on the MySQL architecture diagram. The title of the article also gives everyone the focus, which is to understand redo log and binlog.

One, redo log

The first step is to create a table user, the primary key is id, the following is the creation statement.

CREATE TABLE `user` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(255) NOT NULL,
 `age` tinyint(4) NOT NULL,
 `time` int(11) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

Insert a piece of data

insert into user (`name`,`age`,`time`) values ("咔咔","25",unix_timestamp(now()))

If you want to change the age of the inserted piece of data to 26, you need to execute a statement

update user set age = 26 where id = 1;

The first issue of the article mentioned the execution flow of a query statement, which is the same as the update statement. Bring that picture here to get familiar with it.

The function of each module can be checked back to the first issue of the article.

In MySQL8.0, the redo log and binlog log files are located in /var/lib/mysql , as shown in the figure

The file named ib_logfile is the redo log, and the beginning of undo is the rollback log. The later period of the rollback log is discussed in detail.

redo log (redo log) is an essential element to achieve transaction persistence. When a transaction is committed, it does not directly modify the data in the database, but first ensures that related operations are recorded in the redo log.

The redo log size in the Innodb storage engine is fixed. The above figure shows a set of two files are configured. The default size of each file is 48M. Use the innodb_log_file_size parameter to control the size of a single file. The default is in MySQL 5.6.8 and later versions. 48M.

Then redo log can record 48M operations, redo log is a closed loop writing. The set number of files and file size will no longer increase.

write pos will record the current position and move backward at the same time, after the end of the ib-log-file-3 file, and then return to the ib-logfilg-0 file to start writing.

The check point records the current erased position. To make the file write cyclically, it must be erased at the same time. The premise of clear data is to update the record to the data file.

The green part above is the writable part. If writepos catches up with the checkpoint, what should we do?

You must understand that the advancement of write pos is because the update operation is being performed, so that the update operation can no longer be performed until the record is updated to the data file, and then the check point is erased before the update operation can continue.

There are also some calculation rules for the setting of innodb_log_file_size, which will be introduced for you below.

If the innodb_log_file_size setting is too small, it will cause frequent redo log file switching, frequent database check points (check points), resulting in an increase in the number of record updates to the data file, thereby affecting IO performance.

Similarly, if there is a large transaction, and all redo log logs are full, but not yet completed, the log cannot be switched, which will cause MySQL to directly block it.

The innodb_log_file_size setting is too large, although it greatly improves the IO performance, but when MySQL restarts or crashes, the recovery time will be prolonged because the redo log file is too large. And this recovery time is usually uncontrollable.

After setting a reasonable redo log size and quantity, Innodb can guarantee that even if the database is restarted abnormally, the previously submitted records will not be lost. This is also called crash-safe.

Here, the understanding of crash-safe will not mention what it is, the following articles will let you understand.

`2. How to set innodb_log_file_size according to the project situation`

For the parameter innodb_log_files_in_group, setting 3~4 is enough, no optimization is needed.

Focus on the size setting or optimization setting of innodb_log_file_size.

Before MySQL8.0, the size of the transaction log (redo log) generated in a period of time was usually calculated, and the MySQL log file should carry the business log volume of at least one hour.

here for a period of time must be determined by one's own business conditions. The outside world can calculate the log volume of 1 minute and the log volume of 1 hour.

First look at a command pager of the MySQL client. In the daily operation of MySQL, by setting the display mode of the pager, work efficiency can be greatly improved.

Currently, to view the value of the sequence within one minute, you can execute the pager grep sequence, mysql> show engine innodb status\ G select sleep (60); show engine innodbstatus\ G; returns the result of 061b011b394d6f.

Prohibit the pager setting to execute nopager. If you do not execute this command, the command will only become invalid until the next restart.

Here Kaka is an operation performed on the virtual machine. You can see that there is no operation within one minute, so the value is the same before and after. You can do the test on the test server.

The calculated select (data at the back-data at the front)/1024/1024*60 asMB_per_hour; the value is the size of the redo log one hour later

However, it must be inappropriate to use this method to calculate. The value calculated within one minute of busy business or business idle time will cause a large error.

The appropriate method is to determine several time points in a day, use a script to execute regularly, then record the corresponding value, and then take the average value, the calculated error will be minimized.

What is sequece? When each binlog is generated, the value starts from 1, and then increments. For each additional transaction, the sequenumber is incremented by 1.

`Two, binlog`

You can generally understand that the MySQL architecture is divided into two layers, one is the server layer, and the other is the storage engine layer.

The server layer is of course responsible for functions, while the storage engine layer is responsible for handling storage-related operations.

And the redo log mentioned above is unique to the Innodb storage engine layer, other storage engines do not have it, and the server layer also has its own log record, which is the binlog that will be discussed.

The difference between redo log and binlog

Redo log is unique to the Innodb engine, and binlog is unique to the MySQL server layer, and all engines can be used.

The redo log is the physical log, which records the modification made by an update operation, and the binlog is the logical log, which records the execution logic of an update statement

The redo log is written cyclically, and the space is fixed. For example, four 1GB redo log files are configured above, and the binlog is written additionally. After this file is written, the next file will not overwrite the previous log. This means that you often see that as long as you have a complete binlog file, you can restore the data you want.

Why does MySQL have two logs?

Before the Innodb storage engine, MySQL's default storage engine was MyIsam, but MyIsam did not have the ability to restart and restore, and the binlog log was only used for archiving.

Innodb was introduced to Mysql in the form of a plug-in by another company. Since binlog does not have the ability to restart and restore, then I use redo log to realize the restart and restore function.

This results in two logs being written when you use the Innodb storage engine.

`Three, what is a two-phase commit`

After having a certain understanding of redo log and binlog, let's take a look at the execution flow of an update statement.

update user set age = age + 1 where id = 1;

The executor first goes to the engine layer to find the row with id = 1. Since ID is the primary key, it will find this row in the primary key index tree. If the data page of the row with ID=2 is originally in the memory, it is directly returned to the executor. Otherwise, you need to read the memory from the disk first, and then return.
After the executor gets the result of id = 2 returned by the storage engine, it adds 1 to age, which was originally 25, and now it is 26. Write this new line of data in the call to the engine interface.
The engine updates this row of data to the memory first, and at the same time records the update operation in the redo log. At this time, the redo log is in the prepare state. Then inform the executor that the execution is complete and the transaction can be submitted at any time.
Then the executor generates the binlog of this operation and writes the binlog to disk.
The executor calls the commit transaction interface of the engine, and the engine changes the redo log that has just been written to the commit state, and the update is complete.

At this point, you should be clear. An update SQL will write redo log first and then binlog. This is why the title is called redo log and binlog of lifelong friends.

`Fourth, why two-phase submission is required`

It is to make the logic between redo log and binlog consistent. Look at the following two situations.

Write redo log first and then binlog

The update statement is age = age +1
Write data to the redo log, and the MySQL process restarts abnormally
At this time binlog has not yet started writing
After the system restarts for data recovery, the value at this time is 26
When you need to build a slave library, you need to use binlog to restore data, but at this time, the operation of age = age +1 is not recorded in binlog
Then there will be less update from the slave library at this time, and the restored age will still be 25, resulting in inconsistent data in the master library.

Write binlog first and then redo log

The update statement is age = age +1
Write data to binlog, MySQL restarts abnormally
At this time redo log has not been written yet
The MySQL system restarts, this update operation does not exist for the redo log, so the value after restart is still 25
But the value in binlog will already be 26
When you need to build a slave library, the value of the slave library is 26 and the value of the master library is 25, resulting in inconsistent master-slave data

Therefore, if two-phase commit is not used, the original library and the library data recovered with its binlog log are inconsistent.

`5. "Kong Yiji" lets you understand what redo log is`

Look at the article "Kong Yiji" in a ninth grade Chinese text in junior high school. Even if you don't remember the content, you always remember the title!

This case is also mentioned in the article by Teacher Ding. Why can Ding use this case flexibly to talk about redo log and we can't think of it?

The essential reason is that there is no thorough understanding of the knowledge points, and the use of life cases to explain the technology is the easiest to understand and not difficult to forget.

The protagonist in "Kong Yiji" calls him the shopkeeper of the hotel. The shopkeeper has two magic weapons that make his work much more efficient than other bosses. One is a small blackboard and the other is a ledger.

Imagine if there is a customer who wants to pay on credit, is it efficient to write directly on the blackboard, or is it faster to flip through the dense accounts?

The shopkeeper will definitely choose to record it on the blackboard first, and then write the blackboard record in the ledger when there are few or not busy.

On the other hand, if the boss does not have a blackboard, he can only find the name of the credit person in the dense ledger. If there is a credit record added before, it is found that there is none before adding it.

This process is not only cumbersome but also inefficient, which is unacceptable. If the hotel has many guests, the owner will not be able to record it.

Similarly, this problem also exists in MySQL. Every time an update statement is executed, the record needs to be found and then updated. The IO cost and search cost of the entire process are high. Therefore, MySQL also uses the wisdom of the hotel shopkeeper to use blackboards to improve execution efficiency.

Draw a picture so that everyone can better understand the correspondence between the shopkeeper, blackboard, and MySQL.

`Six, detailed explanation of redo log parameters`

transaction is achieved through the redo log.

After the transaction is committed, the data in the database is not directly modified, but the relevant operations are first ensured to be recorded in the redo log.

The database will flush the dirty page data in the memory to the disk according to the corresponding mechanism.

The figure above is a simple redo log writing process.

Mentioned in the above figure two unfamiliar concepts, Buffer pool, redo log buffer, these two are part of the memory area of the Innodb storage engine.

The redo log file is located on the disk.

In other words, when there is a DML (insert, update, delete) operation, the data will be written to the Buffer pool first, and then written to the redo log buffer.

The redo log buffer will be written to the redo log according to the flashing mechanism.

The setting parameters of this mechanism are innodb_flush_log_at_trx_commit , and the parameters are 0, 1, 2

The figure above shows the redo log write strategy.

When the value of this parameter is 0, after committing the transaction, the data will be stored in the redo log buffer, and then the data will be written to the disk file every second
When the value of this parameter is 1, after committing the transaction, the redo log buffer must be flushed from the memory to the disk file. As long as the transaction is successfully submitted, the redo log must be on the disk.
When the value of this parameter is 2, after the transaction is committed, the redo log buffer log is written to the os cache cache corresponding to the disk file, instead of directly entering the disk file, the data in the os cache will be written after 1 second Into the disk file.

server stops abnormally (transaction writing process)

When the parameter is 0, the log of the previous second is stored in the log buffer, which is the memory. If the machine goes down, 1 second of transaction data may be lost.
When the parameter is 1, the database has very high IO requirements. If the IOPS provided by the underlying hardware is relatively poor, the concurrency of the MySQL database will soon be unable to be improved due to hardware IO problems.
When the parameter is 2, the data is directly written into the os cache. This part belongs to the operating system. If the operating system is partially damaged or the power is cut off, the transaction data within 1 second will be lost. This strategy is relative to the first It is much safer, and the IO requirements are not so high.

Summary

About performance: 0>2>1

Regarding security: 1>2>0

According to the above conclusions, in the MySQL database, the default value of the flashing strategy is 1, which guarantees that the data will never be lost after the transaction is submitted.

Persist in learning, persisting in writing, and persisting in sharing are the beliefs that KaKa has been upholding since his career. I hope that the article on the huge Internet can bring you a little help, I am KaKa, see you in the next issue.

Lifetime best friend redo log, binlog "dead MySQL series II"

Series of articles

One, redo log

`2. How to set innodb_log_file_size according to the project situation`

`Two, binlog`

`Three, what is a two-phase commit`

`Fourth, why two-phase submission is required`

`5. "Kong Yiji" lets you understand what redo log is`

`Six, detailed explanation of redo log parameters`

咔咔

`引用和评论`

网盘变身磁盘：开启存储新时代

MySQL慢查询日志：性能优化的终极指南

做到真正0丢失、0重复：Apache SeaTunnel 实现万亿级数据一致性全解密

MySQL 备份 Shell 脚本：支持远程同步与阿里云 OSS 备份

《SQL应用场景解析：如何通过SQL解决实际业务问题》

Devin 发布 DeepWiki，2 星的项目直接装出万星的气场

好用的开源埋点方案-ClkLog埋点用户分析系统