
What is a distributed transaction? A cross-bank transfer is a typical scenario. Suppose A transfers money to B at a different bank: the operation touches the data of two banks, so the local transaction of a single database cannot guarantee ACID for the transfer. A distributed transaction is needed.

A distributed transaction is one in which the transaction initiator, the resources and resource managers, and the transaction coordinator are located on different nodes of a distributed system. In the transfer above, deducting 100 from user A and adding 100 to user B do not happen on the same node. In essence, distributed transactions exist to ensure that data operations execute correctly in distributed scenarios.

What is a TCC distributed transaction? TCC stands for Try, Confirm, and Cancel. It was first proposed by Pat Helland in the 2007 paper "Life beyond Distributed Transactions: an Apostate's Opinion".

TCC composition

TCC is divided into three phases:

  • Try phase: attempt the operation, complete all business checks (consistency), and reserve the necessary business resources (quasi-isolation)
  • Confirm phase: if Try succeeds on every branch, the transaction enters the Confirm phase. Confirm actually executes the business operation; it performs no further business checks and uses only the resources reserved in the Try phase
  • Cancel phase: if Try fails on any branch, the transaction enters the Cancel phase. Cancel releases the business resources reserved in the Try phase
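The three phases can be sketched as a tiny in-process coordinator. This is only an illustration under stated assumptions, not how dtm or any other framework is implemented: `Branch`, `run_tcc`, and `make_branch` are hypothetical names, the branches are local callables rather than remote services, and cancelling the failing branch along with the earlier ones is a design choice made here.

```python
from typing import Callable, List, NamedTuple


class Branch(NamedTuple):
    """One TCC branch: three callables supplied by a participant (hypothetical type)."""
    name: str
    try_: Callable[[], bool]       # returns True when resources were reserved
    confirm: Callable[[], None]
    cancel: Callable[[], None]


def run_tcc(branches: List[Branch]) -> str:
    """Run Try on every branch, then Confirm all, or Cancel the attempted ones."""
    attempted = []
    for branch in branches:
        attempted.append(branch)
        if not branch.try_():
            # One Try failed: cancel every branch whose Try was attempted,
            # including the failing one, in reverse order.
            for b in reversed(attempted):
                b.cancel()
            return "rolled back"
    # Every Try succeeded: confirm all branches.
    for branch in branches:
        branch.confirm()
    return "committed"


log = []


def make_branch(name: str, ok: bool = True) -> Branch:
    """Build a branch that only records which phase was called."""
    def try_() -> bool:
        log.append(name + ".try")
        return ok
    return Branch(
        name,
        try_,
        lambda: log.append(name + ".confirm"),
        lambda: log.append(name + ".cancel"),
    )


if __name__ == "__main__":
    print(run_tcc([make_branch("out"), make_branch("in", ok=False)]), log)
```

A real coordinator also has to persist the transaction state and retry Confirm and Cancel until they succeed, which this sketch leaves out.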

In a TCC distributed transaction there are three roles, the same as in a classic XA distributed transaction:

  • AP (application): initiates the global transaction and defines which branches it contains
  • RM (resource manager): manages the resources of the branch transactions
  • TM (transaction manager): coordinates the correct execution of the global transaction, including running Confirm and Cancel and handling network exceptions

Suppose we run a business similar to an inter-bank transfer, where the transfer out (TransOut) and the transfer in (TransIn) live in different microservices. A typical sequence diagram of a successfully completed TCC transaction is as follows:

[Figure: sequence diagram of a successfully completed TCC transaction]

TCC practice

For the inter-bank transfer above, the simplest approach is to adjust the balance in the Try phase, reverse the adjustment in the Cancel phase, and do nothing in the Confirm phase. The problem is this: suppose the deduction from A succeeds but the transfer to B fails, so the transaction rolls back and restores A's balance to its initial value. If A notices in the meantime that the balance has been deducted while payee B has not received the money, A will be confused and worried.

A better approach is to freeze the amount A is transferring in the Try phase, deduct it for real in Confirm, and unfreeze it in Cancel. That way the data the user sees is clear at every stage.
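The freeze, deduct, and unfreeze idea can be illustrated with a small in-memory model. This is a sketch under assumptions: the `Account` class and its method names are hypothetical, and it tracks a positive `frozen` amount, whereas the tables later in this article store a signed `trading_balance`.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Account:
    balance: Decimal                 # funds the account holds
    frozen: Decimal = Decimal("0")   # part of the balance reserved by pending transfers

    @property
    def available(self) -> Decimal:
        """Funds that can still be spent or reserved."""
        return self.balance - self.frozen

    def try_transfer_out(self, amount: Decimal) -> bool:
        """Try: reserve the amount without touching the balance."""
        if amount > self.available:
            return False
        self.frozen += amount
        return True

    def confirm_transfer_out(self, amount: Decimal) -> None:
        """Confirm: deduct the reserved amount for real."""
        self.frozen -= amount
        self.balance -= amount

    def cancel_transfer_out(self, amount: Decimal) -> None:
        """Cancel: release the reservation; the balance was never changed."""
        self.frozen -= amount


if __name__ == "__main__":
    a = Account(Decimal("100"))
    a.try_transfer_out(Decimal("30"))
    print(a.balance, a.frozen, a.available)  # balance is untouched while 30 is frozen
```

Between Try and Confirm the user still sees the full balance plus a frozen amount, so a rollback never looks like money that vanished and came back.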

Below we develop a TCC transaction step by step.

The open source TCC frameworks currently available are mainly written for Java, with Seata as the representative. Our example uses Python, and the distributed transaction framework is https://github.com/yedf/dtm , which supports distributed transactions very elegantly. Let's walk through the composition of a TCC transaction in detail.

We first create two tables: the user balance table and the frozen funds table. The table creation statements are as follows:

CREATE TABLE dtm_busi.`user_account` (
  `id` int(11) AUTO_INCREMENT PRIMARY KEY,
  `user_id` int(11) NOT NULL UNIQUE,
  `balance` decimal(10,2) NOT NULL DEFAULT '0.00',
  `create_time` datetime DEFAULT now(),
  `update_time` datetime DEFAULT now()
);

CREATE TABLE dtm_busi.`user_account_trading` (
  `id` int(11) AUTO_INCREMENT PRIMARY KEY,
  `user_id` int(11) NOT NULL UNIQUE,
  `trading_balance` decimal(10,2) NOT NULL DEFAULT '0.00',
  `create_time` datetime DEFAULT now(),
  `update_time` datetime DEFAULT now()
);

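In the trading table, trading_balance records the amount being traded. Judging from the handlers below, a negative value means funds frozen for an outgoing transfer, and a positive value means funds waiting to be credited.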

We first write the core code that freezes and unfreezes funds. It checks the constraint balance + trading_balance >= 0; if the constraint would be violated, the operation fails.

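def tcc_adjust_trading(cursor, uid, amount):
  # Adjust the amount in trading. The WHERE clause rejects the update when
  # balance + trading_balance would drop below zero. The subquery filters on
  # user_id (not id) so that it matches the account being adjusted.
  affected = utils.sqlexec(cursor, "update dtm_busi.user_account_trading set trading_balance=trading_balance + %d where user_id=%d and trading_balance + %d + (select balance from dtm_busi.user_account where user_id=%d) >= 0" % (amount, uid, amount, uid))
  if affected == 0:
    raise Exception("update error, maybe balance not enough")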

Next comes the function that adjusts the balance:

def tcc_adjust_balance(cursor, uid, amount):
  # Release the amount reserved in the trading table ...
  utils.sqlexec(cursor, "update dtm_busi.user_account_trading set trading_balance=trading_balance + %d where user_id=%d" % (-amount, uid))
  # ... and apply it to the real balance.
  utils.sqlexec(cursor, "update dtm_busi.user_account set balance=balance + %d where user_id=%d" % (amount, uid))
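To see how the guarded update and the balance adjustment behave together, here is a self-contained sketch that uses Python's built-in sqlite3 module instead of MySQL and the sample's utils.sqlexec. The simplified two-column tables, the integer amounts, the parameterized queries, and the names `adjust_trading`, `adjust_balance`, `snapshot`, and `demo` are all assumptions made for illustration.

```python
import sqlite3


def adjust_trading(conn, uid, amount):
    """Reserve funds (amount < 0 freezes); raise if the guard rejects the update."""
    cur = conn.execute(
        "UPDATE user_account_trading SET trading_balance = trading_balance + ? "
        "WHERE user_id = ? AND trading_balance + ? + "
        "(SELECT balance FROM user_account WHERE user_id = ?) >= 0",
        (amount, uid, amount, uid),
    )
    if cur.rowcount == 0:
        raise ValueError("update rejected, balance may be insufficient")


def adjust_balance(conn, uid, amount):
    """Move a previously reserved amount from the trading table into the balance."""
    conn.execute(
        "UPDATE user_account_trading SET trading_balance = trading_balance - ? "
        "WHERE user_id = ?",
        (amount, uid),
    )
    conn.execute(
        "UPDATE user_account SET balance = balance + ? WHERE user_id = ?",
        (amount, uid),
    )


def snapshot(conn, uid):
    """Return (balance, trading_balance) for one user."""
    balance = conn.execute(
        "SELECT balance FROM user_account WHERE user_id = ?", (uid,)
    ).fetchone()[0]
    trading = conn.execute(
        "SELECT trading_balance FROM user_account_trading WHERE user_id = ?", (uid,)
    ).fetchone()[0]
    return balance, trading


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_account (user_id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute("CREATE TABLE user_account_trading (user_id INTEGER PRIMARY KEY, trading_balance INTEGER NOT NULL)")
    conn.execute("INSERT INTO user_account VALUES (1, 100)")
    conn.execute("INSERT INTO user_account_trading VALUES (1, 0)")

    adjust_trading(conn, 1, -30)        # Try: freeze 30
    after_try = snapshot(conn, 1)
    adjust_balance(conn, 1, -30)        # Confirm: deduct the frozen 30
    after_confirm = snapshot(conn, 1)
    try:
        adjust_trading(conn, 1, -80)    # Try for more than the remaining balance
        rejected = False
    except ValueError:
        rejected = True
    return after_try, after_confirm, rejected


if __name__ == "__main__":
    print(demo())
```

Passing the values as query parameters instead of formatting them into the SQL string also avoids injection problems if the amounts or user IDs ever come from a request.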

Next we write the Try/Confirm/Cancel handlers:

@app.post("/api/TransOutTry")
def trans_out_try():
  # transaction and exception handling
  tcc_adjust_trading(c, out_uid, -30)
  return {"dtm_result": "SUCCESS"}

@app.post("/api/TransOutConfirm")
def trans_out_confirm():
  # transaction and exception handling
  tcc_adjust_balance(c, out_uid, -30)
  return {"dtm_result": "SUCCESS"}

@app.post("/api/TransOutCancel")
def trans_out_cancel():
  # transaction and exception handling
  tcc_adjust_trading(c, out_uid, 30)
  return {"dtm_result": "SUCCESS"}

@app.post("/api/TransInTry")
def trans_in_try():
  # transaction and exception handling
  tcc_adjust_trading(c, in_uid, 30)
  return {"dtm_result": "SUCCESS"}

@app.post("/api/TransInConfirm")
def trans_in_confirm():
  # transaction and exception handling
  tcc_adjust_balance(c, in_uid, 30)
  return {"dtm_result": "SUCCESS"}

@app.post("/api/TransInCancel")
def trans_in_cancel():
  # transaction and exception handling
  tcc_adjust_trading(c, in_uid, -30)
  return {"dtm_result": "SUCCESS"}

At this point the handlers for each sub-transaction are ready. Next we start the TCC transaction and call the branches:

@app.get("/api/fireTcc")
def fire_tcc():
    # start the TCC transaction
    gid = tcc.tcc_global_transaction(dtm, utils.gen_gid(dtm), tcc_trans)
    return {"gid": gid}

# the actual work of the TCC transaction
def tcc_trans(t):
    req = {"amount": 30} # payload of the business request
    # call Try|Confirm|Cancel of the transfer-out service
    t.call_branch(req, svc + "/TransOutTry", svc + "/TransOutConfirm", svc + "/TransOutCancel")
    # call Try|Confirm|Cancel of the transfer-in service
    t.call_branch(req, svc + "/TransInTry", svc + "/TransInConfirm", svc + "/TransInCancel")

At this point, a complete TCC distributed transaction has been written.

If you want to run a complete, working example, follow the instructions in the dtmcli-py-sample project and its TCC example.

TCC rollback

What happens if, just as the bank is about to transfer the amount to the payee (User 2), it finds that the payee's account is abnormal and returns a failure? We modify the code to simulate this situation:

@app.post("/api/TransInTry")
def trans_in_try():
  # transaction and exception handling
  tcc_adjust_trading(c, in_uid, 30)
  return {"dtm_result": "FAILURE"}

This is the sequence diagram of the interaction when the transaction fails:

[Figure: sequence diagram of a failed TCC transaction]

The difference from the successful case is that when a sub-transaction returns a failure, the global transaction is rolled back: the Cancel operation of each sub-transaction is called so that the whole global transaction is undone.

TCC network exceptions

Various network exceptions can occur during a TCC global transaction. Typical examples are empty rollback, idempotence, and suspension. Because these exceptions are handled in much the same way in TCC as in SAGA, reliable messages, and other transaction patterns, all the solutions are explained in the exception handling chapter of the article "The seven most classic solutions for distributed transactions".

Summary

In this article we introduced the theory behind TCC and walked through a complete example of writing a TCC transaction, covering both normal successful completion and successful rollback. We hope it has given readers a solid understanding of TCC.

For a more comprehensive treatment of distributed transactions, see the article "The seven most classic solutions for distributed transactions".

The examples in this article are excerpted from yedf/dtm, which supports multiple transaction modes (TCC, SAGA, XA, and transactional messages) across languages, with clients for Go, Python, PHP, Node.js, and Java. It also provides a sub-transaction barrier that elegantly solves problems such as idempotence, suspension, and empty compensation.

If you found this article useful, you are welcome to visit the https://github.com/yedf/dtm project and give it a star!


叶东富