By
Liu Yuecai
Seata-go project leader
Beijing Xiaoju Technology Co., Ltd. [DiDi] Development Engineer
Zhao Xin (nickname: Yu Yu)
Head of Open Source of Ant Group Seata Project
This article is 5,343 words long and takes about 14 minutes to read.
Background
Of Seata's four transaction modes, AT mode originated at Alibaba and is non-intrusive to business code. It is also the mode with the largest number of Seata users.
The Seata community is currently pushing hard on its multi-language versions, and the Go, PHP, JS, and Python versions have largely completed their implementations of TCC mode. Drawing on the AT mode implementation in Seata v1.5.2 and on the official Seata documentation, this article walks through the AT transaction mode in detail from a code perspective, so that AT mode can be given priority in subsequent development.
1. What is AT mode?
AT mode is a distributed transaction mode based on two-phase commit. It uses a local undo log to record the state of the data before and after modification, and uses that record to roll back. In terms of performance, because the undo log exists, AT mode can release locks and connection resources as soon as phase one finishes, so its throughput is higher than that of XA mode. To use AT mode, users only need to configure the corresponding data source. Seata completes transaction commit and rollback automatically, which makes the mode almost non-intrusive to business code and convenient to use.
2. AT mode, ACID, and CAP
Discussions of database transaction modes usually start with the ACID properties, but in a distributed scenario the CAP properties also have to be considered.
2.1 AT and ACID
Database transactions must satisfy four properties: atomicity, consistency, isolation, and durability, known together as ACID. In distributed transaction scenarios, atomicity and durability are generally guaranteed first and consistency second. Isolation comes in several levels, because databases differ in their locking, their MVCC mechanisms, and their transaction modes. For example, MySQL's own transactions have four isolation levels: Read Uncommitted, Read Committed, Repeatable Read, and Serializable.
2.1.1 Read isolation in AT mode
When the local database transaction isolation level is Read Committed or above, the default global isolation level of Seata (AT mode) is Read Uncommitted.
If an application in a specific scenario requires global reads to be Read Committed, Seata currently achieves this by proxying the SELECT FOR UPDATE statement.
Executing a SELECT FOR UPDATE statement checks the global lock. If the global lock is held by another transaction, the local lock is released (the local execution of the SELECT FOR UPDATE statement is rolled back) and the statement is retried. During this process the query blocks and does not return until the global lock is obtained, which means the data it reads has been committed.
For the sake of overall performance, Seata does not proxy all SELECT statements, only SELECT FOR UPDATE statements.
For detailed examples, refer to Seata's official website: https://seata.io/zh-cn/docs/dev/mode/at-mode.html
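The retry loop described above can be sketched roughly as follows. This is a minimal in-memory illustration rather than Seata's code: the class `GlobalReadSketch`, its maps, and the `maxRetries`/`retryMillis` parameters are hypothetical, the TC's lock table is modeled with a map, and the local row lock with a `ReentrantLock`. The sketch also gives up after a bounded number of retries, which is an assumption of the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory model of a proxied SELECT ... FOR UPDATE; not Seata's real classes.
final class GlobalReadSketch {
    // rowKey -> xid of the global transaction holding the global lock (stands in for the TC)
    static final Map<String, String> globalLocks = new ConcurrentHashMap<>();
    // rowKey -> stored value (stands in for the business table)
    static final Map<String, Integer> rows = new ConcurrentHashMap<>();
    // stands in for the local row lock taken by SELECT ... FOR UPDATE
    static final ReentrantLock localRowLock = new ReentrantLock();

    static int selectForUpdate(String rowKey, String myXid, int maxRetries, long retryMillis) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            localRowLock.lock();
            try {
                String holder = globalLocks.get(rowKey);
                if (holder == null || holder.equals(myXid)) {
                    // No other global transaction owns the row, so the value is globally committed.
                    return rows.get(rowKey);
                }
            } finally {
                // Releasing the local lock models rolling back the local SELECT ... FOR UPDATE.
                localRowLock.unlock();
            }
            LockSupport.parkNanos(retryMillis * 1_000_000L);
        }
        throw new IllegalStateException("timed out waiting for global lock on " + rowKey);
    }

    public static void main(String[] args) {
        rows.put("stock_tbl:3", 70);
        System.out.println(selectForUpdate("stock_tbl:3", "xid-1", 3, 10L));
    }
}
```

The important point is that the local lock is never held while waiting: each failed attempt releases it before sleeping, so the transaction that owns the global lock can still finish.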
2.1.2 Write isolation in AT mode
AT mode intercepts the SQL of write operations. Before committing the local transaction, it obtains a global lock from the TC. Without the global lock the write cannot be committed, which ensures that there are no write conflicts:
- Before a local transaction is committed in phase one, the global lock must be obtained first;
- If the global lock cannot be obtained, the local transaction cannot be committed;
- Attempts to obtain the global lock are bounded. Once the limit is exceeded, the attempt is abandoned and the local transaction is rolled back to release the local lock.
For detailed examples, refer to Seata's official website: https://seata.io/zh-cn/docs/dev/mode/at-mode.html
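The three rules above can be illustrated with a small sketch. It is an in-memory model under stated assumptions, not Seata's implementation: `GlobalWriteSketch`, `tryCommitLocal`, and the retry parameters are hypothetical names, and a `ConcurrentHashMap` stands in for the lock table kept by the TC.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.LockSupport;

// Hypothetical model of "global lock before local commit"; not Seata's real classes.
final class GlobalWriteSketch {
    // rowKey -> xid of the owning global transaction (stands in for the TC lock table)
    static final Map<String, String> globalLocks = new ConcurrentHashMap<>();

    /** Returns true if the local transaction may commit, false if it must roll back. */
    static boolean tryCommitLocal(String rowKey, String xid, int maxRetries, long retryMillis) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            // putIfAbsent is atomic, so only one xid can own the lock for a row.
            String holder = globalLocks.putIfAbsent(rowKey, xid);
            if (holder == null || holder.equals(xid)) {
                return true;
            }
            LockSupport.parkNanos(retryMillis * 1_000_000L);
        }
        // Retries exhausted: the caller rolls back the local transaction and frees the local lock.
        return false;
    }

    /** Called when the global transaction finishes (phase two). */
    static void releaseGlobalLock(String rowKey, String xid) {
        globalLocks.remove(rowKey, xid);
    }

    public static void main(String[] args) {
        System.out.println(tryCommitLocal("stock_tbl:3", "xid-1", 2, 5L));
        System.out.println(tryCommitLocal("stock_tbl:3", "xid-2", 2, 5L));
    }
}
```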
2.2 AT and CAP
In general, all of Seata's transaction modes need to guarantee CP, that is, consistency and partition tolerance, because the core purpose of distributed transactions is to guarantee data consistency (including weak consistency). For example, in transactions that change monetary amounts across several systems, guaranteeing consistency prevents financial losses.
Distributed systems inevitably run into service unavailability. For example, when Seata's TC is unavailable, users may want to degrade so that the overall service stays available. In that case Seata needs to switch from a CP system to one that guarantees AP.
For example, suppose a service lets users modify their profile information. If the TC service has a problem, we still want the service to be available so that the user experience is not affected. All SQL execution is therefore degraded: instead of joining a global transaction, it runs as a local transaction.
AT mode guarantees CP by default, but it provides configuration options that let users switch between CP and AP:
- The tm.degrade-check parameter in the configuration file: if its value is true, branch transactions guarantee AP; otherwise they guarantee CP;
- Manually setting the service.disableGlobalTransaction property in the configuration center to true turns off global transactions and thereby achieves AP.
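One way to picture the CP/AP switch is a small state machine driven by periodic probes of the TC. The sketch below is only an illustration of the idea behind the two options above; the class `DegradeCheckSketch`, the `allowTimes` threshold, and the probe callback are hypothetical and are not taken from Seata's source.

```java
// Hypothetical degrade switch: after enough consecutive failed TC probes, SQL runs as
// plain local transactions (AP); after enough consecutive successes, global transactions
// are used again (CP).
final class DegradeCheckSketch {
    private final int allowTimes; // hypothetical threshold of consecutive probe results
    private int failures;
    private int successes;
    private boolean degraded;

    DegradeCheckSketch(int allowTimes) {
        this.allowTimes = allowTimes;
    }

    /** Feed in the result of one probe of the TC. */
    synchronized void onProbe(boolean tcReachable) {
        if (tcReachable) {
            failures = 0;
            if (degraded && ++successes >= allowTimes) {
                degraded = false;
                successes = 0;
            }
        } else {
            successes = 0;
            if (!degraded && ++failures >= allowTimes) {
                degraded = true;
                failures = 0;
            }
        }
    }

    /** True means "join the global transaction" (CP); false means "run locally" (AP). */
    synchronized boolean useGlobalTransaction() {
        return !degraded;
    }

    public static void main(String[] args) {
        DegradeCheckSketch check = new DegradeCheckSketch(2);
        check.onProbe(false);
        check.onProbe(false);
        System.out.println(check.useGlobalTransaction());
    }
}
```

Requiring several consecutive results in either direction avoids flapping between the two modes on a single lost heartbeat.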
3. AT data source proxy
In AT mode, the user only needs to configure the AT proxy data source. The entire AT workflow is carried out inside the proxy data source and is transparent to the user.
The overall class structure of the AT data source proxy is as follows:
Class structure diagram of the AT data source proxy [from https://seata.io/zh-cn/docs/dev/mode/xa-mode.html]
The AT data source proxy wraps the DataSource, Connection, and Statement of the target database. Before the target SQL is executed, the proxies carry out operations such as RM resource registration, undo log generation, branch transaction registration, and branch transaction commit/rollback, none of which is visible to the user.
The following sequence diagram shows the action details of these proxy classes during the execution of AT mode:
Note: The pictures are recommended to be viewed on the PC side
4. AT mode process
The following is the overall process of the AT mode. From here, you can see the execution timing of each key action of the distributed transaction. The details of each action will be discussed later:
4.1 Phase one
In phase one of AT mode, Seata intercepts the business SQL that the user executes through the proxy data source. If the user has not opened a transaction, Seata opens a new one automatically. If the business SQL is a write operation (INSERT, DELETE, or UPDATE), Seata parses the statement, generates a SELECT statement, and queries the records that are about to be modified, saving them as the "before image". It then executes the business SQL and, using the same approach, queries the modified records and saves them as the "after image". At this point one undo log record is complete. Next, the RM registers the branch transaction with the TC, and the TC adds a lock record; this lock guarantees the read and write isolation of AT mode. Finally, the RM commits the undo log and the business SQL in the same local transaction, so that the business change and its undo log are persisted atomically.
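The order of the phase-one steps can be sketched with an in-memory model. This is a simplified illustration under stated assumptions: `PhaseOneSketch`, `UndoRecord`, and the maps that stand in for the business table, the undo_log table, and the TC lock table are all hypothetical, and a real implementation runs these steps against a database inside one local transaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory walk-through of phase one for "UPDATE stock_tbl SET count = ? WHERE id = ?".
final class PhaseOneSketch {
    static final class UndoRecord {
        final int id;
        final Integer before;
        final Integer after;

        UndoRecord(int id, Integer before, Integer after) {
            this.id = id;
            this.before = before;
            this.after = after;
        }
    }

    final Map<Integer, Integer> stockTbl = new HashMap<>(); // id -> count
    final List<UndoRecord> undoLog = new ArrayList<>();     // stands in for the undo_log table
    final Map<Integer, String> tcLocks = new HashMap<>();   // id -> xid, stands in for the TC

    boolean update(String xid, int id, int newCount) {
        Integer before = stockTbl.get(id);                   // 1. query the before image
        Integer after = newCount;                            // 2-3. run the business SQL, query the after image
        UndoRecord undo = new UndoRecord(id, before, after); // 4. assemble the undo log record
        String holder = tcLocks.putIfAbsent(id, xid);        // 5. register the branch; the TC records the lock
        if (holder != null && !holder.equals(xid)) {
            return false;                                    // lock conflict: nothing is committed locally
        }
        stockTbl.put(id, after);                             // 6. commit business data and undo log together
        undoLog.add(undo);
        return true;
    }

    public static void main(String[] args) {
        PhaseOneSketch rm = new PhaseOneSketch();
        rm.stockTbl.put(3, 100);
        System.out.println(rm.update("xid-1", 3, 70));
    }
}
```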
4.2 Phase two: commit
When phase two of AT mode is a commit, the TC deletes the locks of the transaction and then notifies the RM to delete the undo log records asynchronously.
4.3 Phase two: rollback
When phase two of AT mode is a rollback, the RM restores the business data modified in phase one. It does so with reverse SQL built from the before image stored in the undo log during phase one.
Before restoring the data, however, a dirty data check is required, because the records may have been changed by other services between the phase-one commit and the rollback. The check compares the after image in the undo log with the current data in the database. If they match, there is no dirty data; if they differ, there is dirty data, which must be handled manually.
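The dirty data check and the restore step can be sketched as follows. This is a minimal in-memory illustration, not Seata's code: `RollbackSketch` and its `Result` enum are hypothetical, and the `ALREADY_RESTORED` case (current data already equals the before image) is an extra branch added for the sketch that the text above does not describe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of phase-two rollback for one row of an id -> count table.
final class RollbackSketch {
    enum Result { ROLLED_BACK, ALREADY_RESTORED, DIRTY }

    static Result undo(Map<Integer, Integer> table, int id, Integer before, Integer after) {
        Integer current = table.get(id);
        if (Objects.equals(current, after)) {
            // Nobody touched the row since phase one: restore the before image.
            if (before == null) {
                table.remove(id);      // the row was inserted in phase one
            } else {
                table.put(id, before); // the row was updated or deleted in phase one
            }
            return Result.ROLLED_BACK;
        }
        if (Objects.equals(current, before)) {
            return Result.ALREADY_RESTORED; // assumption of this sketch, see lead-in
        }
        return Result.DIRTY;                // changed by someone else: needs manual handling
    }

    public static void main(String[] args) {
        Map<Integer, Integer> stockTbl = new HashMap<>();
        stockTbl.put(3, 70);
        System.out.println(undo(stockTbl, 3, 100, 70));
    }
}
```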
5. Key code modules
The following are the main modules of the whole process of AT mode, from which we can understand what needs to be done to develop AT mode:
5.1 Undo log data format
The undo log is stored in the undo_log table. The table structure of the undo_log table is as follows:
The rollback_info column stores the business data before and after modification. The table stores it in compressed form; in plaintext it looks like this:
{
  "branchId": 2828558179596595558,
  "sqlUndoLogs": [
    {
      "afterImage": {
        "rows": [
          {
            "fields": [
              {
                "keyType": "PRIMARY_KEY",
                "name": "id",
                "type": 4,
                "value": 3
              },
              {
                "keyType": "NULL",
                "name": "count",
                "type": 4,
                "value": 70
              }
            ]
          }
        ],
        "tableName": "stock_tbl"
      },
      "beforeImage": {
        "rows": [
          {
            "fields": [
              {
                "keyType": "PRIMARY_KEY",
                "name": "id",
                "type": 4,
                "value": 3
              },
              {
                "keyType": "NULL",
                "name": "count",
                "type": 4,
                "value": 100
              }
            ]
          }
        ],
        "tableName": "stock_tbl"
      },
      "sqlType": "UPDATE",
      "tableName": "stock_tbl"
    }
  ],
  "xid": "192.168.51.102:8091:2828558179596595550"
}
5.2 UndoLogManager
UndoLogManager is responsible for adding, deleting, and rolling back undo logs. Different databases have different implementations (the SQL syntax of different databases will be different). The common logic is placed in the AbstractUndoLogManager abstract class. The overall class inheritance relationship is as follows:
The logic for inserting and deleting undo logs is relatively simple: it operates on the data table directly. Here we focus on the logic for rolling back with the undo log.
The annotated source code is as follows:
@Override
public void undo(DataSourceProxy dataSourceProxy, String xid, long branchId) throws TransactionException {
    Connection conn = null;
    ResultSet rs = null;
    PreparedStatement selectPST = null;
    boolean originalAutoCommit = true;
    for (; ; ) {
        try {
            conn = dataSourceProxy.getPlainConnection();
            // The entire undo process should run in a local transaction.
            // Open a local transaction so that deleting the undo log and restoring
            // the business data are committed together.
            if (originalAutoCommit = conn.getAutoCommit()) {
                conn.setAutoCommit(false);
            }
            // Find UNDO LOG
            selectPST = conn.prepareStatement(SELECT_UNDO_LOG_SQL);
            selectPST.setLong(1, branchId);
            selectPST.setString(2, xid);
            // Load all undo log records of this branch; they are used to restore the business data.
            rs = selectPST.executeQuery();
            boolean exists = false;
            while (rs.next()) {
                exists = true;
                // It is possible that the server repeatedly sends a rollback request to roll back
                // the same branch transaction to multiple processes,
                // ensuring that only the undo_log in the normal state is processed.
                int state = rs.getInt(ClientTableColumnsName.UNDO_LOG_LOG_STATUS);
                // state == 0 (Normal): the branch can be rolled back;
                // state == 1 (GlobalFinished): it cannot.
                if (!canUndo(state)) {
                    if (LOGGER.isInfoEnabled()) {
                        LOGGER.info("xid {} branch {}, ignore {} undo_log", xid, branchId, state);
                    }
                    return;
                }
                String contextString = rs.getString(ClientTableColumnsName.UNDO_LOG_CONTEXT);
                Map<String, String> context = parseContext(contextString);
                byte[] rollbackInfo = getRollbackInfo(rs);
                String serializer = context == null ? null : context.get(UndoLogConstants.SERIALIZER_KEY);
                // Pick the parser that matches the serializer name.
                UndoLogParser parser = serializer == null ? UndoLogParserFactory.getInstance()
                    : UndoLogParserFactory.getInstance(serializer);
                // Deserialize the undo log into the plaintext before and after images.
                BranchUndoLog branchUndoLog = parser.decode(rollbackInfo);
                try {
                    // put serializer name to local
                    setCurrentSerializer(parser.getName());
                    List<SQLUndoLog> sqlUndoLogs = branchUndoLog.getSqlUndoLogs();
                    if (sqlUndoLogs.size() > 1) {
                        Collections.reverse(sqlUndoLogs);
                    }
                    for (SQLUndoLog sqlUndoLog : sqlUndoLogs) {
                        TableMeta tableMeta = TableMetaCacheFactory.getTableMetaCache(dataSourceProxy.getDbType()).getTableMeta(
                            conn, sqlUndoLog.getTableName(), dataSourceProxy.getResourceId());
                        sqlUndoLog.setTableMeta(tableMeta);
                        AbstractUndoExecutor undoExecutor = UndoExecutorFactory.getUndoExecutor(
                            dataSourceProxy.getDbType(), sqlUndoLog);
                        undoExecutor.executeOn(conn);
                    }
                } finally {
                    // remove serializer name
                    removeCurrentSerializer();
                }
            }
            // If undo_log exists, it means that the branch transaction has completed the first phase,
            // we can directly roll back and clean the undo_log
            // Otherwise, it indicates that there is an exception in the branch transaction,
            // causing undo_log not to be written to the database.
            // For example, the business processing timeout, the global transaction is the initiator rolls back.
            // To ensure data consistency, we can insert an undo_log with GlobalFinished state
            // to prevent the local transaction of the first phase of other programs from being correctly submitted.
            // See https://github.com/seata/seata/issues/489
            if (exists) {
                deleteUndoLog(xid, branchId, conn);
                conn.commit();
                if (LOGGER.isInfoEnabled()) {
                    LOGGER.info("xid {} branch {}, undo_log deleted with {}", xid, branchId,
                        State.GlobalFinished.name());
                }
            } else {
                // No undo log exists: the branch transaction may not have finished yet (for example,
                // it timed out) when the TM requested a rollback of the global transaction.
                // Inserting a record into undo_log here makes the branch's later commit fail.
                insertUndoLogWithGlobalFinished(xid, branchId, UndoLogParserFactory.getInstance(), conn);
                conn.commit();
                if (LOGGER.isInfoEnabled()) {
                    LOGGER.info("xid {} branch {}, undo_log added with {}", xid, branchId,
                        State.GlobalFinished.name());
                }
            }
            return;
        } catch (SQLIntegrityConstraintViolationException e) {
            // Possible undo_log has been inserted into the database by other processes, retrying rollback undo_log
            if (LOGGER.isInfoEnabled()) {
                LOGGER.info("xid {} branch {}, undo_log inserted, retry rollback", xid, branchId);
            }
        } catch (Throwable e) {
            if (conn != null) {
                try {
                    conn.rollback();
                } catch (SQLException rollbackEx) {
                    LOGGER.warn("Failed to close JDBC resource while undo ... ", rollbackEx);
                }
            }
            throw new BranchTransactionException(BranchRollbackFailed_Retriable, String
                .format("Branch session rollback failed and try again later xid = %s branchId = %s %s", xid,
                    branchId, e.getMessage()), e);
        } finally {
            try {
                if (rs != null) {
                    rs.close();
                }
                if (selectPST != null) {
                    selectPST.close();
                }
                if (conn != null) {
                    if (originalAutoCommit) {
                        conn.setAutoCommit(true);
                    }
                    conn.close();
                }
            } catch (SQLException closeEx) {
                LOGGER.warn("Failed to close JDBC resource while undo ... ", closeEx);
            }
        }
    }
}
Note: one case deserves special attention. If the rollback finds that the undo log does not exist, a new record must be inserted into the undo_log table. This prevents the RM from successfully committing the branch transaction after the TM has already sent the rollback request.
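Why the marker record blocks a late branch commit can be shown with a small model. The sketch assumes that the undo_log table enforces uniqueness on the pair (xid, branch id), which is what the `SQLIntegrityConstraintViolationException` handling in the listing suggests. `UndoLogGuardSketch` and its methods are hypothetical names, and a map key plays the role of the unique index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the GlobalFinished marker; the map key plays the role of a
// unique index on (xid, branch id) in the undo_log table.
final class UndoLogGuardSketch {
    enum State { NORMAL, GLOBAL_FINISHED }

    final Map<String, State> undoLogTable = new HashMap<>();

    /** Phase one: the branch inserts its undo log; false means its local commit must fail. */
    boolean insertUndoLog(String xid, long branchId) {
        return undoLogTable.putIfAbsent(xid + "/" + branchId, State.NORMAL) == null;
    }

    /** Phase-two rollback: returns true if business data has to be restored. */
    boolean rollback(String xid, long branchId) {
        String key = xid + "/" + branchId;
        State state = undoLogTable.get(key);
        if (state == State.NORMAL) {
            undoLogTable.remove(key); // restore the data, then delete the undo log
            return true;
        }
        if (state == null) {
            // The branch has not committed yet: leave a marker so it cannot commit later.
            undoLogTable.put(key, State.GLOBAL_FINISHED);
        }
        return false;
    }

    public static void main(String[] args) {
        UndoLogGuardSketch rm = new UndoLogGuardSketch();
        rm.rollback("xid-1", 1L);                            // rollback arrives first
        System.out.println(rm.insertUndoLog("xid-1", 1L));   // the late branch is rejected
    }
}
```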
5.3 Compressor compression algorithm
The Compressor interface defines the specification of the compression algorithm used to compress text and save storage space:
public interface Compressor {
    /**
     * compress byte[] to byte[].
     * @param bytes the bytes
     * @return the byte[]
     */
    byte[] compress(byte[] bytes);

    /**
     * decompress byte[] to byte[].
     * @param bytes the bytes
     * @return the byte[]
     */
    byte[] decompress(byte[] bytes);
}
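As an illustration of what an implementation of this interface might look like, here is a minimal gzip-based sketch built only on `java.util.zip`. The class name `GzipSketchCompressor` is hypothetical, the interface is redeclared so the example compiles on its own, and the code is not copied from Seata.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Redeclared here only to keep the example self-contained.
interface Compressor {
    byte[] compress(byte[] bytes);

    byte[] decompress(byte[] bytes);
}

// Hypothetical gzip-backed implementation for illustration.
final class GzipSketchCompressor implements Compressor {
    @Override
    public byte[] compress(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so read the buffer only afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    @Override
    public byte[] decompress(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Compressor compressor = new GzipSketchCompressor();
        byte[] packed = compressor.compress("{\"sqlType\":\"UPDATE\"}".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(compressor.decompress(packed), StandardCharsets.UTF_8));
    }
}
```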
The compression algorithms that have been implemented so far are as follows:
5.4 UndoLogParser Serialization Algorithm
The UndoLogParser interface defines the specification of the serialization algorithm used to serialize undo logs:
public interface UndoLogParser {
    /**
     * Get the name of parser;
     *
     * @return the name of parser
     */
    String getName();

    /**
     * Get default context of this parser
     *
     * @return the default content if undo log is empty
     */
    byte[] getDefaultContent();

    /**
     * Encode branch undo log to byte array.
     *
     * @param branchUndoLog the branch undo log
     * @return the byte array
     */
    byte[] encode(BranchUndoLog branchUndoLog);

    /**
     * Decode byte array to branch undo log.
     *
     * @param bytes the byte array
     * @return the branch undo log
     */
    BranchUndoLog decode(byte[] bytes);
}
The serialization algorithms that have been implemented so far are as follows:
5.5 Executor executor
Executor is the entry class for SQL execution. Before and after the business SQL runs, AT mode has to manage the image records of the undo log. This mainly means building the undo log: assembling the SQL that queries the image records for each kind of business SQL, executing that SQL to obtain the image data, and inserting the undo log (within the still-uncommitted transaction).
public interface Executor<T> {
    /**
     * Execute t.
     *
     * @param args the args
     * @return the t
     * @throws Throwable the throwable
     */
    T execute(Object... args) throws Throwable;
}
Different business SQL has different Executor implementations, mainly because the SQL that generates the undo log differs by operation type and by database type, so each implementation overrides the beforeImage() and afterImage() methods. The overall inheritance relationship is shown in the following figure:
To show the before image SQL and after image SQL generated for different types of SQL, here is a summary. Suppose the structure of the target data table is as follows:
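To give a feel for the image queries an UPDATE executor has to assemble, here is a simplified sketch. It is an illustration under stated assumptions rather than Seata's generated SQL: `ImageSqlSketch` and its two methods are hypothetical, only a single-column primary key is handled, and the exact statements Seata produces vary by database type. The table and column names come from the stock_tbl example in section 5.1.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper that builds image queries for
// "UPDATE stock_tbl SET count = ? WHERE id = ?".
final class ImageSqlSketch {
    /** Before image: reuse the WHERE clause of the business SQL and lock the rows. */
    static String beforeImageSql(String table, List<String> columns, String whereClause) {
        return "SELECT " + String.join(", ", columns) + " FROM " + table
                + " WHERE " + whereClause + " FOR UPDATE";
    }

    /** After image: re-read the same rows by the primary keys found in the before image. */
    static String afterImageSql(String table, List<String> columns, String pkColumn, int pkCount) {
        StringJoiner placeholders = new StringJoiner(", ", "(", ")");
        for (int i = 0; i < pkCount; i++) {
            placeholders.add("?");
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table
                + " WHERE " + pkColumn + " IN " + placeholders;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("id", "count");
        System.out.println(beforeImageSql("stock_tbl", columns, "id = ?"));
        System.out.println(afterImageSql("stock_tbl", columns, "id", 1));
    }
}
```

Querying the after image by primary key rather than by the original WHERE clause matters because the business SQL may have changed the very columns that the WHERE clause filtered on.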
5.6 AsyncWorker
AsyncWorker is used for asynchronous execution, such as branch transaction commit and undo log record deletion.
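The idea of deferring undo log cleanup can be sketched with a queue that is drained in batches. This is a minimal illustration, not Seata's AsyncWorker: `AsyncWorkerSketch`, the `undoLogDeleter` callback, and the flush period are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical asynchronous phase-two commit: acknowledge at once, delete undo logs later in batches.
final class AsyncWorkerSketch {
    private final BlockingQueue<Long> committedBranches = new LinkedBlockingQueue<>();
    private final Consumer<List<Long>> undoLogDeleter; // for example, one batched DELETE statement

    AsyncWorkerSketch(Consumer<List<Long>> undoLogDeleter) {
        this.undoLogDeleter = undoLogDeleter;
    }

    /** Phase-two commit only enqueues the branch id and returns immediately. */
    void branchCommit(long branchId) {
        committedBranches.offer(branchId);
    }

    /** Drains everything queued so far and deletes those undo logs in one batch. */
    void flush() {
        List<Long> batch = new ArrayList<>();
        committedBranches.drainTo(batch);
        if (!batch.isEmpty()) {
            undoLogDeleter.accept(batch);
        }
    }

    /** Runs flush() periodically; the caller shuts the returned executor down. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::flush, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return timer;
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncWorkerSketch worker = new AsyncWorkerSketch(batch -> System.out.println("delete undo logs " + batch));
        ScheduledExecutorService timer = worker.start(50L);
        worker.branchCommit(1L);
        worker.branchCommit(2L);
        Thread.sleep(200L);
        timer.shutdown();
    }
}
```

Because the global locks are already released by the TC at commit time, the undo log rows are pure garbage by then, which is why their deletion can safely be delayed and batched.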
6. About performance
No single distributed transaction mechanism fits every scenario and meets every need perfectly. AT mode, TCC mode, and Saga mode are all, in essence, improvements on the safety or performance shortcomings of the XA specification in particular scenarios. Seata's transaction modes represent different trade-offs among four characteristics: consistency, reliability, ease of use, and performance.
Recently the Seata community noticed that some peers had questioned the performance and data safety of AT mode after running a short-connection stress test against an early Go version of Seata, without analyzing the detailed implementation of AT mode in the Java version. Please check the test method and the test target carefully before accepting that conclusion, and please tell the imitation ("Li Gui") apart from the genuine article ("Li Kui").
In fact, that early Go implementation was modeled only on Seata v1.4.0 and does not strictly implement every feature of Seata AT mode. Moreover, even the Seata XA mode that those peers advocate relies on the XA support of each individual database, and the XA transaction support in the latest MySQL version still has many bugs, so that foundation is not as stable as they assume.
Seata, built jointly by Alibaba and Ant Group, distills many years of our internal engineering practice and technical experience with distributed transactions. Since it was open-sourced, it has been validated in production by more than 150 industry peers. The road of open source is both long and wide, with room for motor vehicle lanes, bicycle lanes, and sidewalks. Let us work together to widen and extend the road, rather than stand on the sidewalk proclaiming how dangerous and slow the motor vehicle lanes are.
7. Summary
Seata AT mode depends on the database drivers that each database vendor ships for its different versions, and the SQL semantics and usage patterns of a database may change with each new release. As users have adopted Seata across a wide range of business scenarios in recent years, and thanks to the efforts of its developers, Seata AT mode has kept its programming interface almost identical to that of XA mode, has been adapted to almost all mainstream databases, and covers the drivers of their most widely used versions. It truly keeps the "complexity" of distributed systems at the framework level and hands ease of use and high performance to the user.
Of course, the XA and AT modes of the Java version of Seata still have plenty of room for improvement, to say nothing of the implementations in other languages. Colleagues who are interested in Seata and its multi-language versions are welcome to take part and work with us to build Seata into a standardized distributed transaction platform.