10

Implementation principle of multi-version concurrency control (MVCC) for MySQL transactions

MySQL version: 8.0.27
Refer to "How MySQL Works"

1. What is multi-version concurrency control?

Multiversion Concurrency Control The full English name of the technology is Multiversion Concurrency Control , referred to as MVCC .

Multi-version concurrency control (MVCC) implements concurrency control by saving snapshots of data at a certain point in time. That is to say, no matter how long the transaction is executed, the data seen inside the transaction is not affected by other transactions. Depending on the time when the transaction starts, each transaction may see different data for the same table at the same time. of.

The idea of multi-version concurrency control is to save the historical version of the data, and realize the concurrency control of the database through the management of multiple versions of the data row. In this way, we can determine whether the data is displayed by comparing the version number, and the isolation effect of the transaction can be guaranteed without locking when reading the data.

Through multi-version concurrency control we can solve the following problems:

  1. For the problem of blocking between read and write, MVCC can prevent read and write from blocking each other, that is, reading does not block writing, and writing does not block reading, which can improve the concurrent processing capability of transactions.
  2. Reduces the probability of deadlock. This is because MVCC adopts an optimistic locking method, which does not require locking when reading data, and only locks necessary rows for writing operations.
  3. Solve the problem of consistent read. Consistent read is also known as snapshot read. When we query the snapshot of the database at a certain point in time, we can only see the results of updates submitted by transactions before this point in time, but not the results of updates submitted by transactions after this point in time. .

2. Version chain

In the table of the InnoDB storage engine, each row of records contains some hidden fields, of which the following two fields play an important role in MySQL's implementation of MVCC:

  • db_trx_id : The transaction ID of the last operation (insert or update) on this record. Each time a transaction changes a record, the transaction ID is assigned to the db_trx_id hidden field.
  • db_roll_ptr : The rollback pointer, that is, the undo log information pointing to this record. Every time a transaction changes a reference record, the old version record will be written to the undo log , and then db_roll_ptr is equivalent to a pointer, through which the information before the modification of the record can be found.

InnoDB记录隐藏字段

undo log is also known rollback log , it can save the data version during the transaction, can be used for rollback, and can provide read operations under multi-version concurrency control (MVCC).

For example, there is the following transcript table:

CREATE TABLE `report`  (
  `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` varchar(10) NOT NULL COMMENT '姓名',
  `score` tinyint(3) UNSIGNED NOT NULL COMMENT '成绩',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB CHARACTER SET=utf8mb4;
INSERT INTO `report` VALUES (1, '小明', '70');

The table now contains only one record:

SELECT * FROM report;
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    70 |
+----+--------+-------+

Suppose the transaction id that inserts the record is 20, and then two transactions perform UPDATE operations on this record respectively, and the transaction ids are 30 and 40 respectively. The operation process is as follows:

Transaction 30Transaction 40
BEGIN;-
-BEGIN;
UPDATE report SET score=80 WHERE id=1;-
UPDATE report SET score=81 WHERE id=1;-
COMMIT;-
-UPDATE report SET score=90 WHERE id=1;
-UPDATE report SET score=91 WHERE id=1;
-COMMIT;

Every time a record is changed, an undo log will be recorded. Each undo log has a db_roll_ptr attribute (the undo log corresponding to the INSERT operation does not have this attribute, because the record does not have an earlier version). The logs are all connected and strung together into a linked list, as shown below:

版本链结构

After each update of the record, the old value will be put into an undo log and saved as an old version of the record. With the increase of the number of updates, all versions will be linked into a linked list by the db_roll_ptr attribute. This linked list is called version chain , the head node of the version chain is the latest value of the current record, and each version also contains the transaction ID ( db_trx_id ) corresponding to when the version was generated.


Three, ReadView (read view)

For transactions using the isolation level of to read and to read repeatedly, it must be guaranteed to read the records modified by the transaction that has committed to , that is to say, if another transaction has modified the record but has not yet committed, it cannot be directly Read the latest version of the record.
The core problem is: it is necessary to determine which version in the version chain is visible to the current transaction. To this end, InnoDB proposed the concept of ReadView (read view) . This ReadView mainly contains 4 more important attributes:

  • m_ids : A list of transaction ids representing read and write transactions that were active in the current system when the ReadView was generated.
  • min_trx_id : Indicates the smallest transaction id among the active read and write transactions in the current system when ReadView is generated, that is, the smallest value in m_ids .
  • max_trx_id : Indicates the id value that should be assigned to the next transaction in the system when ReadView is generated.
  • creator_trx_id : Indicates the transaction id of the transaction that generated the ReadView.
Note that max_trx_id not the maximum value in m_ids , transaction ids are allocated incrementally. For example, now there are three transactions with id 1, 2, and 3, and then the transaction with id 3 is committed. Then when a new read transaction generates ReadView, m_ids includes 1 and 2, min_trx_id is 1, max_trx_id is 4.
A transaction id is assigned to a transaction only when changes are made to the records in the table (when executing INSERT, DELETE, UPDATE statements), otherwise the transaction id value in a read-only transaction defaults to 0.

MVCC Data Visibility Algorithm

With ReadView, when accessing a record, you only need to follow the steps below to determine whether a version of the record is visible:

  1. If the accessed version's db_trx_id value is equal to the creator_trx_id value in ReadView, it means that the current transaction is accessing its own modified record, so this version can be accessed by the current transaction.
  2. If the value of db_trx_id of the accessed version is less than the value of min_trx_id in ReadView, it indicates that the transaction that generated this version has been committed before the current transaction generates ReadView, so this version can be accessed by the current transaction.
  3. If the value of db_trx_id of the accessed version is greater than or equal to the value of max_trx_id in ReadView, it indicates that the transaction that generated this version is opened after the current transaction generates ReadView, so this version cannot be accessed by the current transaction.
  4. If the access version of db_trx_id value ReadView of min_trx_id and max_trx_id between, db_trx_id property value is not in the m_ids list, and if so, explain generation version of the transaction or when creating ReadView active Yes, this version cannot be accessed; if not, it means that the transaction that generated this version when the ReadView was created has been committed, and this version can be accessed.
  5. If the data of a certain version is not visible to the current transaction, follow the version chain to find the data of the previous version, and continue to judge the visibility according to the above steps until the earliest version in the version chain. If the earliest version is also not visible, it means that the record is completely invisible to the transaction, and the query result does not contain the record.

MVCC数据可见性算法判断流程图

The difference between committed read and repeatable read isolation level in ReadView

In MySQL, a very big difference between committed read and repeatable read isolation level is the timing of when they generate ReadView:

  • Committed read Isolation level generates a ReadView before every normal SELECT operation.
  • Repeatable read Only generate a ReadView before the first ordinary SELECT operation, and the subsequent query operations will reuse this ReadView.
Read committed: A ReadView is generated before each data read

For transactions using the committed read isolation level, a ReadView is generated before each read data. Let's use a concrete example to see what the effect is.

Let's take the table report as an example. Now there is only one record in the table report , and the transaction ID of the last modified record is 40:

SELECT * FROM report;
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    91 |
+----+--------+-------+

Now there are two transactions with transaction id 50 and 52 being executed in the system:

# 事务50
BEGIN;

UPDATE report SET score = 70 WHERE id = 1;

UPDATE report SET score = 71 WHERE id = 1;
# 事务52
BEGIN;

# 更新了一些别的表的记录
...
During transaction execution, only when the record is actually modified for the first time (such as using INSERT, DELETE, UPDATE statements), a separate transaction id will be assigned, and the transaction id is incremented. So we update some other table records in transaction 52, the purpose is to let it assign transaction id.

At this moment, the version list obtained by the record with id 1 in table report is as follows:

Suppose there is now a transaction that uses the committed read isolation level to start executing:

# 将当前会话的隔离级别设为读已提交
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SHOW VARIABLES LIKE 'transaction_isolation';
+-----------------------+-----------------+
| Variable_name         | Value           |
+-----------------------+-----------------+
| transaction_isolation | READ-COMMITTED |
+-----------------------+-----------------+
# 使用读已提交隔离级别开启一个新的事务
BEGIN;

# SELECT1:事务 50、52未提交
SELECT * FROM report WHERE id = 1;
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    91 |
+----+--------+-------+
# 得到的列score的值为91

Combined with "MVCC Data Visibility Algorithm Judgment Flowchart" , the execution process of the above SELECT1 statement can be analyzed as follows:

  1. When the SELECT statement is executed, a ReadView will be generated first. The content of the m_ids list of ReadView is [50, 52], min_trx_id is 50, max_trx_id is 53, and creator_trx_id is 0.
  2. Select the visible records from the version chain. As can be seen from the figure, the content of the column score of the latest version is 71. The value of trx_id of this version is 50, which is in the list of m_ids , so it does not meet the visibility requirements. Jump to db_roll_ptr last version.
  3. The content of the column score of the previous version is 70, and the value of trx_id in this version is also 50, which is also in the list of m_ids , so it does not meet the requirements, and continue to skip to the previous version.
  4. The content of the column score of the previous version is 91. The value of trx_id in this version is 40, which is smaller than the value of 062309cca696e5 in min_trx_id , which is 50. Therefore, this version meets the requirements. The version returned to the user is the record whose column score is 91. .

After that, we submit the transaction with transaction id 50:

# 事务 50
BEGIN;

UPDATE report SET score = 70 WHERE id = 1;

UPDATE report SET score = 71 WHERE id = 1;

COMMIT;

Then go to the transaction with transaction id 52 to update the record with id 1 in table report :

# 事务 52
BEGIN;

# 更新了一些别的表的记录
...

UPDATE report SET score = 75 WHERE id = 1;

UPDATE report SET score = 78 WHERE id = 1;

At this moment, the version chain of the record with id 1 in table report like this:

Then go to the transaction with the isolation level of that has been submitted to read and continue to find this record with an id of 1, as follows:

# 使用读已提交隔离级别开启一个新的事务
BEGIN;

# SELECT1:事务 50、52未提交
SELECT * FROM report WHERE id = 1;
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    91 |
+----+--------+-------+
# 得到的列score的值为91

# SELECT2:事务 50 提交,事务 52 未提交
SELECT * FROM report WHERE id = 1; 
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    71 |
+----+--------+-------+
# 得到的列score的值为71

Combined with "MVCC Data Visibility Algorithm Flowchart" , the execution process of the above SELECT2 statement can be analyzed as follows:

  1. Because the isolation level of the current transaction is and the read has been submitted, a separate ReadView will be generated when the SELECT statement is executed. The content of the m_ids list of the ReadView is [52], min_trx_id is 52, max_trx_id is 53, and creator_trx_id is .
  2. Then select the visible records from the version chain. As can be seen from the figure, the content of the column score of the latest version is 78, the value of trx_id of this version is 52, which is in the list of m_ids , so it does not meet the visibility requirements, and jumps according to db_roll_ptr to the previous version.
  3. The content of the column score of the previous version is 75. The value of trx_id in this version is 52, which is also in the list of m_ids , so it does not meet the requirements, and continue to skip to the previous version.
  4. The content of the column score in the previous version is 71. The value of trx_id in this version is 50, which is smaller than the value of 062309cca6986d in min_trx_id , which is 52. So this version meets the requirements, and the final version returned to the user is the record whose column score is 71.

By analogy, if the record with the transaction id of 52 is also submitted later, when you query SELECT * FROM report WHERE id = 1; in the transaction that uses the committed read isolation level, the result is that the value of the column score is 78, and the specific process is no longer analyzed. .
To sum it up: A transaction using the read committed isolation level generates a separate ReadView at the start of each query.

Repeatable Read: Generates a ReadView on the first read of data

For transactions using the repeatable read isolation level, only a ReadView is generated when the query statement is executed for the first time, and subsequent queries will not be generated repeatedly. Let's take a concrete example to see what the effect is.

Let's take the table report as an example. Now there is only one record in the table report , and the transaction ID of the last modified record is 52:

SELECT * FROM report;
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    78 |
+----+--------+-------+

Now there are two transactions with transaction id 60 and 62 executing in the system:

# 事务 60
BEGIN;

UPDATE report SET score = 60 WHERE id = 1;

UPDATE report SET score = 61 WHERE id = 1;
# 事务 62
BEGIN;

# 更新了一些别的表的记录
...

At this moment, the version list obtained by the record with id 1 in table report is as follows:

Suppose there is now a transaction that uses the repeatable read isolation level to start executing:

# 将当前会话的隔离级别设为读已提交
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SHOW VARIABLES LIKE 'transaction_isolation';
+-----------------------+-----------------+
| Variable_name         | Value           |
+-----------------------+-----------------+
| transaction_isolation | REPEATABLE-READ |
+-----------------------+-----------------+
# 使用可重复读隔离级别开启一个新的事务
BEGIN;

# SELECT1:事务 60、62未提交
SELECT * FROM report WHERE id = 1;
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    78 |
+----+--------+-------+
# 得到的列score的值为78

Combined with "MVCC Data Visibility Algorithm Judgment Flowchart" , the execution process of the above SELECT1 statement can be analyzed as follows:

  1. When the SELECT statement is executed, a ReadView will be generated first. The content of the m_ids list of ReadView is [60, 62], min_trx_id is 60, max_trx_id is 63, and creator_trx_id is 0.
  2. Select the visible records from the version chain. As can be seen from the figure, the content of the column score of the latest version is 61. The value of trx_id of this version is 60, which is in the list of m_ids , so it does not meet the visibility requirements. According to db_roll_ptr , jump to last version.
  3. The content of the column score of the previous version is 60, and the value of trx_id in this version is also 60, which is also in the list of m_ids , so it does not meet the requirements. Continue to skip to the previous version.
  4. The content of the column score of the previous version is 78. The value of trx_id in this version is 52, which is smaller than the value of 062309cca69ab3 in min_trx_id , which is 60. Therefore, this version meets the requirements. The version returned to the user is the record whose column score is 78. .

After that, we commit the transaction with transaction id 60, like this:

# 事务 60
BEGIN;

UPDATE report SET score = 60 WHERE id = 1;

UPDATE report SET score = 61 WHERE id = 1;

COMMIT;

Then go to the transaction with transaction id 62 to update the record with id 1 in table report :

# 事务 62
BEGIN;

# 更新了一些别的表的记录
...

UPDATE report SET score = 65 WHERE id = 1;

UPDATE report SET score = 68 WHERE id = 1;

At this moment, the version chain of the record with id 1 in table report like this:

Then go to the transaction with the repeatable read isolation level turned on just now and continue to find the record with the id of 1, as follows:

# 使用可重复读隔离级别开启一个新的事务
BEGIN;

# SELECT1:事务 60、62未提交
SELECT * FROM report WHERE id = 1;
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    78 |
+----+--------+-------+
# 得到的列score的值为78

# SELECT2:事务 60 提交,事务 62 未提交
SELECT * FROM report WHERE id = 1; 
+----+--------+-------+
| id | name   | score |
+----+--------+-------+
|  1 | 小明   |    78 |
+----+--------+-------+
# 得到的列score的值仍为78

The execution process of the above SELECT2 statement is as follows:

  1. Because the isolation level of the current transaction is 162309cca69bc8, can be read , and ReadView has been generated when SELECT1 was executed before, so the previous ReadView is directly reused at this time, and the content of the m_ids list of the previous ReadView is [60, 62], min_trx_id is 60, max_trx_id is 63, and creator_trx_id is 0.
  2. Then select the visible records from the version chain. As can be seen from the figure, the content of the column score of the latest version is 68. The value of trx_id of this version is 62, which is in the list of m_ids , so it does not meet the visibility requirements. Jump according to db_roll_ptr to the previous version.
  3. The content of the column score in the previous version is 65. The value of trx_id in this version is 62, which is also in the list of m_ids , so it does not meet the requirements, and continues to skip to the previous version.
  4. The content of the column score of the previous version is 61, the value of trx_id in this version is 60, and the list of m_ids contains the transaction id value of 60, so this version does not meet the requirements. Similarly, the content of the next column score is The 60 version also doesn't meet the requirements. Continue to skip to the previous version.
  5. The content of the column score of the previous version is 78. The value of trx_id in this version is 52, which is smaller than the value of 062309cca69c5f in min_trx_id , which is 60. Therefore, this version meets the requirements. The version returned to the user is the record whose column score is 78. .

That is to say, the results obtained by the two SELECT queries are repeated, and the value of the record score is 78, which is the meaning of repeatable read. If we submit the record with the transaction id of 62 later, and then use the repeatable read isolation level transaction to continue to search for the record with the id of 1, the result is still a score of 78.


白菜1031
5.4k 声望1.6k 粉丝