As businesses grow and become more complex, almost every company's system moves from a monolith to a distributed architecture, especially microservices. At that point the problem of distributed transactions becomes unavoidable.
This article first introduces the relevant basic theory, then summarizes the classic transaction schemes, and finally presents a solution to the out-of-order execution of sub-transactions (the idempotency, empty compensation, and suspension problems).
Basic theory
Before explaining the specific schemes, let's first review the basic theory behind distributed transactions.
Take a transfer as an example: A needs to transfer 100 yuan to B, so A's balance must change by -100 yuan and B's balance by +100 yuan. The transfer as a whole must guarantee that A-100 and B+100 either both succeed or both fail. Let's see how this problem is solved in various scenarios.
Transactions
The ability to execute multiple statements as a single unit is called a database transaction. A database transaction ensures that all operations within its scope either all succeed or all fail.
Transactions have 4 properties: atomicity, consistency, isolation, and durability. These four properties are often referred to as ACID properties.
- Atomicity: All operations in a transaction either complete or do not happen at all; the transaction never stops at an intermediate step. If an error occurs during execution, the database is restored to the state it was in before the transaction started, as if the transaction had never been executed.
- Consistency: The integrity of the database is not compromised before the transaction starts or after it ends. Integrity constraints, including foreign key constraints and application-defined constraints, are not violated.
- Isolation: The ability of a database to let multiple concurrent transactions read, write, and modify its data at the same time. Isolation prevents the data inconsistency that interleaved execution of concurrent transactions would otherwise cause.
- Durability: Once the transaction completes, its changes to the data are permanent and are not lost even if the system fails.
If the business system is simple enough that the transfer can be completed by modifying data in one database from one service, a database transaction is enough to guarantee that the transfer completes correctly.
Distributed transaction
The inter-bank transfer is a typical distributed transaction scenario. Suppose A needs to transfer money to B at another bank: the data of two banks is involved, so the ACID properties of the transfer cannot be guaranteed by a local transaction in one database and can only be achieved through a distributed transaction.
A distributed transaction is one in which the transaction initiator, the resources and resource managers, and the transaction coordinator are located on different nodes of a distributed system. In the transfer above, the A-100 operation and the B+100 operation are not on the same node. In essence, distributed transactions exist to ensure that data operations execute correctly in distributed scenarios.
ACID
Distributed transactions partially follow the ACID specification:
- Atomicity: strictly followed
- Consistency: consistency after the transaction completes is strictly followed; consistency during the transaction may be relaxed appropriately
- Isolation: parallel transactions must not affect each other; the visibility of a transaction's intermediate results may be relaxed
- Durability: strictly followed
Because data is not consistent while the transaction is in progress, but the transaction eventually completes and consistency is eventually reached, we call distributed transactions "eventually consistent".
Doubts about eventual consistency
It must be emphasized that the eventual consistency here is different from the eventual consistency of CAP (C: consistency, A: availability, P: partition tolerance). Most current books and materials confuse the two, so below we focus on what consistency means in each case.
The C of CAP refers to consistency when reading data from multiple replicas in a distributed system. Simply put, suppose I update a piece of data from v1 to v2 and then read it from any replica:
- Strong consistency: every read is guaranteed to return v2
- Weak consistency: a read may return v1 or v2
- Eventual consistency: after a certain period of time, every read is guaranteed to return v2
CAP theory states that a distributed system cannot satisfy all three properties at the same time and can choose at most two. A classic response to this is BASE theory, which pursues AP and relaxes the requirement for C. AWS's Dynamo is one such system, providing eventually consistent reads. For details, see Dynamo Consistent Reads.
In recent years distributed systems theory has developed further, and many systems follow not the BASE scheme but CP+HA (Highly Available). Distributed consensus protocols such as Paxos and Raft fully satisfy CP. Availability is not 100%, but combined with the improved hardware stability of recent years, high availability can be achieved. Public data for Chubby, Google's distributed lock service, shows that a cluster can provide an average availability of 99.99958%, or about 130 seconds of outage a year, which already meets very strict application requirements. Current SQL databases are also based on CP+HA; their availability is lower than Google's extreme figure but generally reaches four nines.
CP+HA means not BASE: as long as a write succeeds, the next read returns the latest result, so developers do not have to worry about reading stale data. Reading and writing across multiple replicas behaves the same as on a standalone node.
Distributed transaction research mainly addresses data consistency across multiple databases, and the data actually lives in databases that are themselves CP+HA. Therefore distributed transactions satisfy the C of CAP but not the C of ACID, and it is this relaxed ACID consistency that we call eventual consistency.
Classic solutions for distributed transactions
Because distributed transaction schemes cannot provide full ACID guarantees, no single scheme solves every business problem. In practice, the most suitable scheme is chosen according to the characteristics of the business.
1. Two-phase commit/XA
XA is a distributed transaction specification proposed by the X/Open organization. It mainly defines the interface between the (global) transaction manager (TM) and the (local) resource manager (RM). Local databases such as MySQL play the RM role in XA.
XA is divided into two phases:
Phase 1 (prepare): every participant RM prepares to execute the transaction and locks the resources it needs. When a participant is ready, it reports to the TM that it is prepared.
Phase 2 (commit/rollback): once the transaction manager (TM) has confirmed that all participants (RM) are prepared, it sends the commit command to all of them.
Most mainstream databases support XA transactions today, including MySQL, Oracle, SQL Server, and PostgreSQL.
An XA transaction consists of one or more resource managers (RM), a transaction manager (TM), and an application program (AP).
The three roles of RM, TM, and AP are the classic division of roles, and they carry through to the later transaction modes such as Saga and TCC.
Taking the above transfer as an example, the sequence diagram of a successfully completed XA transaction is as follows:
If any one participant fails to prepare, the TM will notify all participants who have completed the prepare to roll back.
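The prepare and commit/rollback phases can be sketched in memory as follows. This is only an illustration under stated assumptions: the `Participant` interface, the `account` type, and `runTwoPhase` are hypothetical names invented here, not the XA interface or any database's API, and real resource managers hold locks and persist their prepared state.

```go
package main

import (
	"errors"
	"fmt"
)

// Participant is a hypothetical stand-in for an XA resource manager (RM).
type Participant interface {
	Prepare() error
	Commit()
	Rollback()
}

// account is an in-memory RM: Prepare validates and holds the change,
// Commit applies it, Rollback discards it.
type account struct {
	name    string
	balance int
	delta   int  // change requested by the global transaction
	pending bool // true between Prepare and Commit/Rollback
}

func (a *account) Prepare() error {
	if a.balance+a.delta < 0 {
		return errors.New(a.name + ": insufficient balance")
	}
	a.pending = true // a real RM would hold its locks from here on
	return nil
}

func (a *account) Commit() {
	a.balance += a.delta
	a.pending = false
}

func (a *account) Rollback() { a.pending = false }

// runTwoPhase plays the TM role: phase 1 prepares every participant,
// phase 2 commits all of them, or rolls back those already prepared.
func runTwoPhase(parts ...Participant) error {
	for i, p := range parts {
		if err := p.Prepare(); err != nil {
			for _, done := range parts[:i] {
				done.Rollback()
			}
			return err
		}
	}
	for _, p := range parts {
		p.Commit()
	}
	return nil
}

func main() {
	a := &account{name: "A", balance: 100, delta: -100}
	b := &account{name: "B", balance: 0, delta: 100}
	fmt.Println(runTwoPhase(a, b), a.balance, b.balance)
}
```

The `pending` flag marks the window in which a real RM keeps resources locked, which is where the low concurrency of XA comes from.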
The characteristics of XA transactions are:
- Simple, easy to understand, and relatively easy to develop
- Resources are locked for a long time, so concurrency is low
Readers who want to study XA further, in Go, PHP, Python, Java, C#, Node, and other languages, can refer to DTM.
2. SAGA
Saga is a scheme described in the database paper "Sagas". The core idea is to split a long transaction into multiple short local transactions coordinated by a Saga transaction coordinator. If every step ends normally, the transaction completes normally. If a step fails, the compensation operations are called once each, in reverse order.
Taking the above transfer as an example, the sequence diagram of a successfully completed SAGA transaction is as follows:
Once a Saga enters the Cancel (compensation) stage, Cancel must not fail for business reasons. If success is not returned because of a network or other temporary failure, the TM keeps retrying until Cancel returns success.
Features of Saga Transactions:
- High concurrency: resources are not locked for a long time as in XA transactions
- Both the normal operation and the compensation operation must be defined, so there is more development work than with XA
- Consistency is weaker. In a transfer, user A's money may already have been deducted when the transfer ultimately fails
The paper covers much more about SAGA, including two recovery strategies and concurrent execution of branch transactions. Our discussion here covers only the simplest form of SAGA.
SAGA fits many scenarios: it is suitable for long transactions and for business that is not sensitive to intermediate results.
Readers who want to study SAGA further can refer to DTM, which includes examples of successful SAGA transactions and of rollback on failure, as well as the handling of various network exceptions.
3. TCC
The concept of TCC (Try-Confirm-Cancel) was first proposed by Pat Helland in a paper titled "Life beyond Distributed Transactions: an Apostate's Opinion" published in 2007.
TCC is divided into three phases:
- Try phase: attempt execution, complete all business checks (consistency), and reserve the necessary business resources (quasi-isolation)
- Confirm phase: actually execute the business without any further business checks, using only the resources reserved in the Try phase. Confirm must be idempotent, and it is retried if it fails.
- Cancel phase: cancel execution and release the business resources reserved in the Try phase. Exception handling in the Cancel phase is basically the same as in the Confirm phase, so Cancel must also be idempotent.
Taking the transfer above as an example, the amount is typically frozen, but not deducted, in Try; deducted in Confirm; and unfrozen in Cancel. The sequence diagram of a successfully completed TCC transaction is as follows:
The Confirm/Cancel phases of TCC must not fail for business reasons. If success is not returned because of a network or other temporary failure, the TM keeps retrying until Confirm/Cancel returns success.
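The freeze, deduct, and unfreeze steps of the transfer can be sketched in memory like this. The `account`, `branch`, `runTCC`, and `transfer` names are hypothetical and chosen for illustration only. The sketch also assumes that Confirm and Cancel are called exactly once per branch, so it does not show the idempotency handling a real participant needs.

```go
package main

import (
	"errors"
	"fmt"
)

// account tracks the balance and the amount frozen by Try.
type account struct {
	balance, frozen int
	closed          bool
}

// branch is a hypothetical TCC participant with its three operations.
type branch struct {
	try, confirm, cancel func() error
}

// retry calls op until it succeeds, as the TM does for Confirm and Cancel.
func retry(op func() error) {
	for op() != nil {
	}
}

// runTCC tries every branch; if all succeed it confirms them all,
// otherwise it cancels the branches whose Try already succeeded.
func runTCC(branches []branch) error {
	for i, b := range branches {
		if err := b.try(); err != nil {
			for _, done := range branches[:i] {
				retry(done.cancel)
			}
			return err
		}
	}
	for _, b := range branches {
		retry(b.confirm)
	}
	return nil
}

// transfer builds the two branches of "a pays amount to b".
func transfer(a, b *account, amount int) []branch {
	out := branch{
		try: func() error { // check and freeze, but do not deduct
			if a.balance-a.frozen < amount {
				return errors.New("insufficient balance")
			}
			a.frozen += amount
			return nil
		},
		confirm: func() error { a.frozen -= amount; a.balance -= amount; return nil },
		cancel:  func() error { a.frozen -= amount; return nil },
	}
	in := branch{
		try: func() error { // only validate that b can receive money
			if b.closed {
				return errors.New("account closed")
			}
			return nil
		},
		confirm: func() error { b.balance += amount; return nil },
		cancel:  func() error { return nil },
	}
	return []branch{out, in}
}

func main() {
	a, b := &account{balance: 100}, &account{}
	fmt.Println(runTCC(transfer(a, b, 100)), a.balance, b.balance)
}
```

Because A's money is only frozen until every Try has succeeded, a failed transfer never shows A's balance as deducted.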
TCC features are as follows:
- High concurrency, with no long-term resource locking
- More development work, since Try/Confirm/Cancel interfaces must be provided
- Good consistency: the situation possible with SAGA, where the money has been deducted but the transfer ultimately fails, does not occur
- TCC is suitable for order-type business and for business with constraints on intermediate states
Readers who want to study TCC further can refer to DTM.
4. Local message table
The local message table scheme was originally published in ACM Queue in 2008 by eBay architect Dan Pritchett. The core of the design is to use messages to guarantee, asynchronously, the execution of tasks that require distributed processing.
The general process is as follows:
The local message write and the business operation are placed in one transaction, which guarantees that the business operation and the message are atomic: either both succeed or both fail.
Fault tolerance mechanism:
- If the transaction that deducts the balance fails, it is rolled back directly and no subsequent steps run
- If polling for and producing the message fails, or the transaction that increases the balance fails, the step is retried
Features of the local message table:
- Rollback is not supported
- Polling for messages to produce is hard to implement well. Polling on a timer lengthens the total transaction time; subscribing to the binlog makes development and maintenance difficult
It is suitable for business that can be executed asynchronously and whose subsequent operations never need to be rolled back.
5. Transaction messages
In the local message table scheme above, the producer has to create an extra message table and also poll it, which is a heavy burden on the business. RocketMQ, open-sourced by Alibaba, officially supports transaction messages from version 4.3 onward. In essence this moves the local message table into RocketMQ to solve the atomicity problem between sending the message on the producer side and executing the local transaction.
Sending and committing a transaction message:
- Send a message (the half message)
- The server stores the message and responds with the write result
- Execute the local transaction according to the send result (if the write failed, the half message is not visible to the business and the local logic is not executed)
- Execute Commit or Rollback according to the local transaction status (Commit publishes the message and makes it visible to consumers)
The flow chart of normal sending is as follows:
Compensation process:
For transaction messages that have received neither Commit nor Rollback (messages in the pending state), the server initiates a check-back.
The producer receives the check-back and returns the status of the local transaction corresponding to the message, which is Commit or Rollback.
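The half message and the check-back can be sketched with an in-memory broker. This is not the RocketMQ API: `broker`, `producer`, `sendHalf`, `resolve`, `checkBack`, and `query` are hypothetical names. The sketch also assumes that a message with no local record belongs to a transaction that has finished and failed, which is exactly the kind of assumption that is hard to make safely in practice.

```go
package main

import "fmt"

type state int

const (
	pending    state = iota + 1 // half message stored, invisible to consumers
	committed                   // published to consumers
	rolledBack                  // discarded
)

// broker is a hypothetical in-memory stand-in for the message server.
type broker struct {
	half      map[string]state // message id -> state
	delivered []string         // messages visible to consumers
}

func (b *broker) sendHalf(id string) { b.half[id] = pending }

// resolve applies the producer's Commit or Rollback decision once.
func (b *broker) resolve(id string, s state) {
	if b.half[id] != pending {
		return // unknown or already resolved: ignore
	}
	b.half[id] = s
	if s == committed {
		b.delivered = append(b.delivered, id)
	}
}

// checkBack asks the producer about every message that is still pending.
func (b *broker) checkBack(query func(id string) state) {
	for id, s := range b.half {
		if s == pending {
			b.resolve(id, query(id))
		}
	}
}

// producer records local transaction outcomes so it can answer check-backs.
type producer struct{ localTx map[string]state }

func (p *producer) query(id string) state {
	if s, ok := p.localTx[id]; ok {
		return s
	}
	// Assumption: no record means the local transaction did not commit
	// and is no longer in flight.
	return rolledBack
}

func main() {
	b := &broker{half: map[string]state{}}
	p := &producer{localTx: map[string]state{}}

	b.sendHalf("tx1")
	p.localTx["tx1"] = committed // the local transaction succeeded...
	// ...but the producer crashed before sending Commit to the broker.

	b.checkBack(p.query) // the broker recovers the decision
	fmt.Println(b.delivered)
}
```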
The transaction message scheme is very similar to the local message table mechanism. The main difference is that the local table operations are replaced by a check-back interface.
Transaction message characteristics are as follows:
- A long transaction only needs to be split into multiple tasks, plus a check-back interface, so it is easy to use
- There is no good solution for implementing the check-back of transaction messages, and data errors may occur in extreme cases
It is suitable for business that can be executed asynchronously and whose subsequent operations never need to be rolled back.
Readers who want to study transaction messages further can refer to DTM or RocketMQ.
6. Best-effort notification
The notifying party makes its best effort, through some mechanism, to notify the receiving party of the business processing result. Specifically, this includes:
A repeat-notification mechanism. Because the receiving party may not receive a notification, there must be a mechanism that repeats it.
A message reconciliation mechanism. If the notification still fails to reach the receiving party despite best efforts, or the receiving party needs the message again after consuming it, the receiving party can actively query the notifying party for the message.
The local message table and transaction messages described earlier are both reliable-message schemes. How do they differ from the best-effort notification described here?
Reliable message consistency: the notifying party must ensure that the message is sent and that it reaches the receiving party. The reliability of the message is guaranteed by the notifying party.
Best-effort notification: the notifying party does its best to notify the receiving party of the business processing result, but the message may not be received. In that case the receiving party must actively call the notifying party's interface to query the result, so the reliability of the notification depends on the receiving party.
In terms of the solution, best-effort notification requires:
- An interface through which the receiving party can query the business processing result
- A message queue ACK mechanism: the message queue gradually increases the notification interval, following 1min, 5min, 10min, 30min, 1h, 2h, 5h, and 10h, until the upper limit of the required notification time window is reached, after which no further notifications are sent
Best-effort notification is suited to business notifications. For example, WeChat notifies merchants of transaction results through best-effort notification, offering both callback notifications and a transaction query interface.
7. AT transaction mode
This is a transaction mode in Seata, an open source project from Alibaba, and is also known as FMT at Ant Financial. Its advantage is that it is used much like XA mode: the business does not need to write compensation operations, and rollback is completed automatically by the framework. The mode also has many disadvantages, including problems such as dirty rollback, which can easily lead to data inconsistency. For a comparison between AT and XA, see: XA vs AT
A New Scheme for Distributed Transactions
After studying the various classic solutions, and drawing on the experience of many companies using dtm, https://github.com/dtm-labs/dtm proposes new, easier-to-use solutions that help developers solve data consistency across databases and services better and faster.
Two-phase message
dtm pioneered the two-phase message architecture, which is designed as a better replacement for both local message tables and transaction messages.
The working sequence diagram of the two-phase message is as follows:
Compared with local message tables and transaction messages, two-phase messages have the following advantages:
- No queue is needed and therefore no consumer; the user simply calls the API
- Two-phase messages also have a check-back, but it is handled automatically by the framework, which guarantees that the data is correct
For details, see dtm's two-phase message documentation.
Workflow Mode
The XA, Saga, TCC, and other modes have been introduced above. Each mode has its own advantages and disadvantages and suits different businesses. Is there a way to combine their advantages, using different modes for different parts of the business and fusing them into one global transaction?
The Workflow mode pioneered by dtm supports mixing the three modes above and also allows mixing HTTP, gRPC, and local transactions. It is very flexible and can handle a wide range of business scenarios.
For details, see dtm's Workflow documentation.
Exception handling
Network and business failures can occur at every step of a distributed transaction. To cope with them, the business side of a distributed transaction must provide three properties: empty-rollback handling, idempotency, and suspension prevention.
Abnormal situations
These exceptions are illustrated below using TCC transactions:
Empty rollback:
The second-phase Cancel method is called without the Try method of the TCC resource having been called. The Cancel method needs to recognize that this is an empty rollback and return success directly.
The cause is that when the service hosting a branch transaction is down or the network is abnormal, the branch transaction call is recorded as failed even though the Try phase was never actually executed. When the fault is recovered, the distributed transaction is rolled back and the second-phase Cancel method is called, resulting in an empty rollback.
Idempotency:
Because any request may hit a network exception and be sent again, all distributed transaction branches must guarantee idempotency.
Suspension:
Suspension means that, for a distributed transaction, the second-phase Cancel interface executes before the Try interface.
The cause is that when the branch transaction's Try is called over RPC, the branch transaction is registered first and then the RPC call is made. If the network is congested at that moment, the RPC times out and the TM notifies the RM to roll back the distributed transaction. The rollback may complete before Try's RPC request finally arrives at the participant and is actually executed.
Let's look at a sequence diagram of network exceptions to better understand the problems above:
- When the business processes request 4, Cancel executes before Try, so an empty rollback must be handled
- When the business processes request 6, Cancel is executed again, so idempotency is required
- When the business processes request 8, Try executes after Cancel, so suspension must be handled
Faced with these complex network exceptions, the solution commonly suggested by various companies is for the business side to use a unique key to query whether the associated operation has completed and, if it has, to return success directly. This logic is complex, error-prone, and a heavy burden on the business.
Subtransaction barrier
The project https://github.com/dtm-labs/dtm introduced the sub-transaction barrier technique, which achieves the effect shown in the schematic diagram:
Once requests reach the sub-transaction barrier, abnormal requests are filtered out and normal requests pass through. With the sub-transaction barrier in place, all the exceptions described above are handled properly, and business developers only need to care about the actual business logic, which greatly reduces their burden.
The sub-transaction barrier provides the method CallWithDB, whose prototype is:
func (bb *BranchBarrier) CallWithDB(db *sql.DB, busiCall BusiFunc) error
Business developers write their own logic in busiCall and pass it to this function. CallWithDB guarantees that busiCall is not called in the empty-rollback and suspension scenarios, and that when the business is called repeatedly, idempotency control ensures it is committed only once.
The sub-transaction barrier handles TCC, SAGA, and other modes, and can also be extended to other areas.
Subtransaction barrier principle
The principle of the sub-transaction barrier is to create a branch transaction status table, sub_trans_barrier, in the local database, with a unique key of global transaction id - sub-transaction id - sub-transaction branch name (try|confirm|cancel). Each request is then processed as follows:
- Start a local transaction
- For the current operation op (try|confirm|cancel), insert ignore a row gid-branchid-op. If the insert does not succeed, commit the transaction and return success (the usual idempotency control)
- If the current operation is cancel, additionally insert ignore a row gid-branchid-try. If this insert succeeds (note: succeeds), commit the transaction and return success
- Call the business logic inside the barrier. If the business returns success, commit the transaction and return success; if the business returns failure, roll back the transaction and return failure
This mechanism solves the problems caused by network exceptions:
- Empty compensation control: if Try was not executed and Cancel is executed directly, Cancel's insert of gid-branchid-try succeeds, so the logic inside the barrier is skipped
- Idempotency control: no branch can insert the same unique key twice, so nothing is executed twice
- Suspension control: if Try executes after Cancel, its insert of gid-branchid-try fails, so Try's logic is not executed
A similar mechanism is used for SAGA and other modes.
Subtransaction barrier summary
The sub-transaction barrier technique was pioneered by https://github.com/dtm-labs/dtm. Its significance lies in a simple, easy-to-implement algorithm and an easy-to-use interface. With the help of these two, developers are freed from handling network exceptions themselves.
The technique currently has to be paired with the dtm-labs/dtm transaction manager, and SDKs are available for Go, Python, C#, and Java, with SDKs for other languages planned. For other distributed transaction frameworks, the technique can be implemented quickly by following the principle above, as long as the framework provides the appropriate distributed transaction information.
dtm implements sub-transaction barriers not only on SQL databases but also on Redis and Mongo, so it can combine Redis, Mongo, SQL databases, and other storage engines that support transactions into one global transaction, which gives great flexibility.
Distributed Transaction Practice
We also have many articles that walk you quickly through developing a distributed transaction with practical examples, in versions for various languages. If you are interested, see: dtm tutorial
Summary
This article introduced the basic theory of distributed transactions and explained the commonly used schemes. The second half covered the causes and classification of transaction exceptions and an elegant solution to them, and pointed to practical examples that demonstrate the content in short programs.
dtm-labs/dtm supports TCC, XA, SAGA, two-phase messages, and best-effort notification (through two-phase messages), and provides HTTP and gRPC protocol support, which makes it very easy to integrate.
dtm-labs/dtm supports clients in languages such as Python, Java, PHP, C#, and Node; see: SDK for each language.
You are welcome to visit the https://github.com/dtm-labs/dtm project and give it a star!