The realization principle of MySQL transaction processing features

Abstract: The word transaction comes from the translation of the word transactional in English, and the meaning of this word more refers to "transaction". In database systems or software systems, we usually call transactional transactions

This article is shared from the Huawei Cloud Community " [Database Transaction and Lock Mechanism] Isolation Level 160b99e40689fe", the original author: Technical Torchbearer.

The word transaction comes from the translation of the word transactional in English, and the meaning of this word refers more to "transaction". In database systems or software systems, we usually call transactional transactions.

The four characteristics of database transaction ACID: atomicity, consistency, isolation, and durability. The four major characteristics of database transactions are derived from the ISO standard ISO/IEC 10026-1:1992/COR 1:1996, which defines that transactions need to have the above four characteristics. So how are these features implemented in InnoDB? Below we discuss how MySQL (hereinafter referred to as InnoDB engine) supports transaction features.

Before discussing the implementation principle of MySQL's transaction processing feature, we need to understand MySQL's support for MVCC. Wikipedia has the following explanation about MVCC.

Multiversion concurrency control (Multiversion concurrency control, MCC or MVCC) is a concurrency control commonly used in database management systems, and it is also used in programming languages to implement transactional memory. MVCC intends to solve the problem of multiple, long-term read operations and starvation write operations caused by read-write locks. The data item read by each transaction is a historical snapshot (snapshot) and depends on the isolation level implemented. The write operation does not overwrite the existing data item, but creates a new version, which does not become visible until the operation is submitted. Snapshot isolation allows things to see the data state when it starts

Isolation level of database transactions

In order to achieve transaction isolation, the ISO standard organization has four definitions for the isolation levels that transaction locks need to achieve. Let's briefly explain the four transaction isolation levels below.

READ UNCOMMITTED

RU (READ UNCOMMITTED) is called read uncommitted, and some materials are called browse access (browse access), but the correct translation should be uncommitted read. RU is the minimum standard of isolation. Uncommitted read means that in the case of concurrent transactions, a transaction can be allowed to be read by another transaction without committing the modification, which will result in dirty reads. Happening. The following table is the suppression of dirty reads, phantom reads, and repeatable reads for each transaction isolation level. In fact, RU will not only produce dirty reads, but also the other two reads.

First of all, we need to clarify the concepts of the above three data reading problems. For database transactions, our simple understanding is that a series of database operations are in a transaction. This transaction either succeeds or fails all, but you must know the database In actual use, it is not serial, it is concurrent. In the serial scenario, our business in advance is very simple, just one operation, everyone queues up to execute. However, in the scenario of concurrent transactions, there will be competition for the same data. The simple understanding is that you also need to read and write this data, and I also want to read and write this data, so how do you ensure that when multiple transactions operate on one data? Consistency and completeness of the data? At this time, data dirty reads, phantom reads, and repeated reads will appear.

Dirty read

When a transaction is allowed to read data modified by another transaction but not committed, dirty reads may occur

Dirty read means that multiple transactions read and write a piece of data at the same time. When transaction 1 modifies and reads data A, transaction 2 modifies data A, and then this modification is reflected in transaction A.

Let’s imagine a scenario like this. If both transactions are operating on the same record in the amount table, transaction A needs to get the current amount value and then add 3 to it (used to buy cucumbers), the original value of this amount It is 5, but at this time transaction B modifies the amount of this piece of data to 8, and then this modification is taken by transaction A and then 3 is added to 8 to equal 11. However, I never expected that transaction B would roll back after transaction A completed this operation (I regretted it, the banana money was not given). At this time, after transaction A is completed, the amount of the account inexplicably becomes 11, but in fact it should be 8. This is the case of dirty reads.

Non-repeatable

In a transaction, when a row of data is obtained twice and different results are obtained, it means that non-repeatable reads have occurred.

Before understanding non-repeatable read, first understand what is repeatable read. Repeatable read means that the results of multiple reads of the same data in a transaction should be the same (its value is not modified in this transaction) . The converse means that the value of multiple reads of a data in a transaction is not the same. Under what circumstances will non-repeatable reads occur?

Still in the above example, if transaction A reads the original value which is 5 before adding 3, and then continues other operations, at this time transaction B adds 3 to this record and then commits it. When the transaction When A reads the value again, it is found that the current value has become 8. At this time, the value of the two times before and after is completely different, which is not repeatable reading.

Non-repeatable read is for a single transaction, that is, whether a piece of data can be read repeatedly in a transaction, if not, it means that the requirement of repeatable read is not met.

Non-repeatable read and dirty read are very similar, but the two are different. Dirty read means that transaction 2 did not commit the modification and the modified value was obtained by transaction 1 while non-repeatable read refers to the inconsistency that occurred after the modification was committed. Case.

Phantom reading

In the transaction execution process, when two identical query statements are executed, different result sets are obtained. This phenomenon is called phantom read

A phantom read is actually a scenario that cannot be reread. For example, in transaction 1, 3 pieces of data are read according to a certain condition for the first time, but at this time, transaction 2 adds a piece of data that meets this condition in the table. When I read transaction 1 for the second time, I found one more piece of data (conversely, one piece of data was missing). At this time, it is a bit inexplicable for transaction 1, and it seems to have an illusion (too many goods have been sent), so it is called For phantom reading.

Therefore, for the isolation level of uncommitted reads, these three types of read problems are likely to occur, so it is the lowest level of transaction isolation.

READ COMMITTED

RC (READ COMMITTED) Read submission means that it can be read after submission. Some materials are called submitted read (domestic translation is also drunk). The main scenario for commit reading is the UPDATE statement, that is, for the update, it can only be read after it is submitted. Imagine that when the dirty read was introduced above, if transaction 2 submitted this value after modifying the amount instead of rolling back, it has not been for a long time. Dirty read situation.

This is why submitting reads can only solve the problem of dirty reads but not the other two types of reads. Because obviously even if transaction 2 commits this modification, then for transaction 1, the two reads are inconsistent (non-repeatable reading). Of course, the scenario of phantom reading is even more present, because phantom reading is inherently impossible. Special scene of repeated reading.

REPEATABLE READS can be read repeatedly

RR (REPEATABLE READS) repeatable read is a transaction isolation level second only to SERIALIZABLE (serialization). Usually repeatable reads are realized through locks, which cannot avoid the generation of phantom reads. In InnoDB, the RR transaction isolation level is used by default, but unlike other databases, InnoDB uses the NKL lock algorithm (Next-Key Lock) under the RR transaction isolation level to avoid the generation of phantom reads. This is different from other databases, so the transaction isolation level of RR in InnoDB has reached the serialized transaction isolation standard.

NKL refers to locking a range and the data itself, rather than just locking the data itself, so as to avoid the generation of phantom readings, official documents

SERIALIZABLE can be serialized

It is the highest level of transaction isolation. According to the definition, it means that all transactions are executed in accordance with serialization, that is, no concurrent transactions are generated. This avoids all reading problems, but this is impossible for the database, because any A database can not tolerate this situation, so most people think that the use of this transaction isolation level will have a very large impact on performance, but some papers have concluded through experiments that serialization will not have much impact on performance.

Regarding whether serialization has an impact on performance, it depends on the database's implementation of this transaction isolation level. I can't say that serialization is necessarily slow. Anyway, I don't know if it really has a big impact on performance.

MySQL database transaction isolation level query and modification

Query transaction isolation level

In MySQL, we can query the transaction isolation level adopted by the database in the following ways

show variables like '%tx_isolation%';
 
# 查询回话的事务隔离级别
SELECT @@session.tx_isolation;
#查看全局的隔离级别
SELECT @@global.tx_isolation;

Modify the transaction isolation level

MySQL provides the SET TRANSACTION statement, which can change the isolation level of a single session or a global transaction. The syntax format is as follows:

SET [SESSION | GLOBAL] TRANSACTION ISOLATION LEVEL {READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE}

Among them, the SESSION and GLOBAL keywords are used to specify the scope of the modified transaction isolation level

SESSION: indicates that the modified transaction isolation level will be applied to all transactions in the current session (current cmd window);
GLOBAL: indicates that the modified transaction isolation level will be applied to all transactions in all sessions (global), and the current existing session will not be affected;

If SESSION and GLOBAL are omitted, it means that the modified transaction isolation level will be applied to the next transaction that has not yet started in the current session.

Any user can change the transaction isolation level of the session, but only users with SUPER authority can change the global transaction isolation level

JDBC modify the isolation level of the current connection

connection.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);

Click to follow, and get to know the fresh technology of

The realization principle of MySQL transaction processing features

Isolation level of database transactions

READ UNCOMMITTED

Dirty read

Non-repeatable

Phantom reading

READ COMMITTED

REPEATABLE READS can be read repeatedly

SERIALIZABLE can be serialized

MySQL database transaction isolation level query and modification

Query transaction isolation level

Modify the transaction isolation level

JDBC modify the isolation level of the current connection

华为云开发者联盟

引用和评论

华为云开发者联盟入选 2023 中国技术品牌影响力企业榜，深耕开发者生态

MySQL慢查询日志：性能优化的终极指南

做到真正0丢失、0重复：Apache SeaTunnel 实现万亿级数据一致性全解密

MySQL 备份 Shell 脚本：支持远程同步与阿里云 OSS 备份

《SQL应用场景解析：如何通过SQL解决实际业务问题》

Devin 发布 DeepWiki，2 星的项目直接装出万星的气场

好用的开源埋点方案-ClkLog埋点用户分析系统