Foreword

Distributed transactions are a hard problem, and there are many different ideas and approaches for handling them.

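Today, any distributed system built on a microservice architecture has to face the topic of distributed transactions. Adopting a distributed transaction solution does not by itself make a service stable or sophisticated, and doing without one is not necessarily bad. Our company currently does not use distributed transactions, and it still runs a stable, flexible system. There is no right or wrong in choosing whether to use distributed transactions; it is a decision made after weighing the trade-offs in a real project.
The origin of distributed transactions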

In the beginning, our application was a single service. Used properly, Spring's transaction support is enough to handle the transactions involved. A brief analysis of Spring transactions: when a proxied method is annotated with @Transactional, the call is intercepted by an AOP aspect. Inside the aspect, Spring uses ThreadLocal to bind one DB connection to the current thread. That connection manages the transaction for us: when the whole method finishes (or throws an exception), everything is committed or rolled back together. At this point, our application looks like this:

image.png
Or like this
image.png
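
The thread-bound connection idea described above can be illustrated with a small sketch. This is not Spring's implementation: FakeConnection and ThreadBoundTransaction are hypothetical names invented for this example, and the "connection" only records the calls made on it. The sketch assumes one business method per thread at a time.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a JDBC connection: it only records what is done to it. */
class FakeConnection {
    final List<String> log = new ArrayList<>();

    void execute(String sql) { log.add("SQL " + sql); }
    void commit() { log.add("COMMIT"); }
    void rollback() { log.add("ROLLBACK"); }
}

/** Hypothetical helper playing the role of the transactional aspect. */
class ThreadBoundTransaction {
    // One connection per thread: every data-access call made on this
    // thread while the method runs sees the same connection.
    private static final ThreadLocal<FakeConnection> CURRENT = new ThreadLocal<>();

    static FakeConnection currentConnection() {
        FakeConnection connection = CURRENT.get();
        if (connection == null) {
            throw new IllegalStateException("no transaction bound to this thread");
        }
        return connection;
    }

    /** Runs the business method, then commits, or rolls back on an exception. */
    static void run(FakeConnection connection, Runnable businessMethod) {
        CURRENT.set(connection);
        try {
            businessMethod.run();
            connection.commit();
        } catch (RuntimeException e) {
            connection.rollback();
            throw e; // the caller still sees the failure
        } finally {
            CURRENT.remove(); // do not leak the binding to the next task on this thread
        }
    }
}

class ThreadBoundTransactionDemo {
    public static void main(String[] args) {
        FakeConnection connection = new FakeConnection();
        ThreadBoundTransaction.run(connection, () -> {
            // Two "DAO calls" that both pick up the thread-bound connection.
            ThreadBoundTransaction.currentConnection().execute("UPDATE user_info ...");
            ThreadBoundTransaction.currentConnection().execute("INSERT INTO order_info ...");
        });
        System.out.println(connection.log);
    }
}
```

Because the binding lives in a ThreadLocal, it cannot follow a call into another JVM, which is exactly the limitation the rest of this article deals with.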

However, as project traffic grows, a single database can no longer hold the data of every service. In real projects, therefore, the relationship between most of our applications and their databases looks like this:
image.png
We know that a Spring transaction is controlled by one DB connection inside one JVM. With multiple servers, each server has its own JVM, and even when one service calls another remotely, the DB connections are certainly not the same. We therefore need a third party to coordinate the database connections of the services. This is the distributed transaction problem we so often talk about.

Background knowledge
  1. ACID

    Atomicity
    Consistency
    Isolation
    Durability
  2. Isolation levels

    Read Uncommitted
    Read Committed
    Repeatable Read
    Serializable
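  3. CAP theorem

    Consistency (C)
     Whether all data replicas in a distributed system hold the same value at the same moment. (Equivalent to every node accessing the same, most recent copy of the data.)
    
    Availability (A)
     Whether the cluster as a whole can still respond to client read and write requests after some of its nodes fail. (High availability for data updates.)
    
    Partition tolerance (P)
     In practical terms, a partition amounts to a time limit on communication. If the system cannot reach data consistency within that limit, a partition has occurred, and the system must choose between C and A for the current operation.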
  4. BASE theory
    BASE stands for Basically Available, Soft state, and Eventually consistent. BASE theory is the result of trading off consistency against availability in CAP. It grew out of the distributed-systems practice of large-scale Internet systems and evolved gradually from the CAP theorem. Its core idea: even when strong consistency cannot be achieved, each application can, according to its own business characteristics, adopt an appropriate approach to bring the system to eventual consistency.
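Basically available
    Basically available means that when unpredictable failures occur, a distributed system is allowed to lose part of its availability. Note that this is by no means the same as the system being unavailable. For example:

(1) Loss in response time: normally an online search engine returns query results to the user within 0.5 seconds, but because of a failure the response time increases by 1 to 2 seconds.

(2) Loss in functionality: normally consumers on an e-commerce site can complete almost every order without trouble, but during holiday sales peaks, when purchases surge, some consumers may be redirected to a degraded page to protect the stability of the shopping system.

Soft state
    Soft state means that data in the system is allowed to be in an intermediate state, and that this intermediate state is assumed not to affect the overall availability of the system. In other words, synchronization between data replicas on different nodes is allowed to be delayed.

Eventual consistency
    Eventual consistency emphasizes that all data replicas reach a consistent state after a period of synchronization. Its essence is that the system must guarantee the data becomes consistent in the end, without guaranteeing strong consistency in real time.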
Open source distributed transaction framework

At present there are many open source frameworks for distributed transactions, such as Seata and LCN, as well as RocketMQ with its transactional messages. Each framework supports several modes (it is worth thinking about why). Today we look in detail at Seata's AT mode, which does not intrude on business code.

First, let's get to know Seata through a simple example.

============ user module =============
/**
 * @author liuliang
 * @date 2021/7/9 2:31 PM
 */
@Slf4j
@Service
public class UserService extends ServiceImpl<UserMapper, UserInfo> {

    @Resource
    private OrderFeign orderFeign;

    @GlobalTransactional
    public void buyProduct() {
        UserInfo userInfo = this.getById(1L);
        userInfo.setAmount(userInfo.getAmount() - 1);
        // Deduct from the user's own balance
        this.updateById(userInfo);
        // Create the order through the order module
        orderFeign.test();
        log.info("Purchase finished");
    }
}

The user module calls the order module

/**
 * @author liuliang
 * @date 2021/7/9 2:58 PM
 */
@FeignClient(name = "mm-order")
public interface OrderFeign {

    @GetMapping("/order/test")
    void test();
}

order module

/**
 * @author liuliang
 * @date 2021/7/9 3:05 PM
 */
@Slf4j
@Service
public class OrderService extends ServiceImpl<OrderMapper, OrderInfo> {


    public void test() {
        OrderInfo orderInfo = new OrderInfo();
        orderInfo.setNum(1);
        orderInfo.setPrice(100L);
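        orderInfo.setSkuName("short video");
        this.save(orderInfo);
        log.info("Short video order saved");
        // Uncomment the next line to simulate a failure in the order module
        // and observe the user module rolling back as well.
//        throw new RuntimeException("111");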
    }

}

As you can see, once @GlobalTransactional is added in the user module, an exception in the order module makes the user module roll back as well: the proxied connections in two different JVMs really do take part in the same transaction! Next we analyze how Seata works.

  • Seata architecture

image.png

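As you can see, Seata has three components:

1. TM (Transaction Manager): defines the scope of a global transaction; it begins, commits, or rolls back the global transaction.
2. RM (Resource Manager): manages the data source that a branch transaction works on; it registers branches with the TC and carries out branch commits and rollbacks.
3. TC (Transaction Coordinator): maintains the state of global and branch transactions and drives the global commit or rollback.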

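In general, a distributed transaction can be broken down into the following steps:

1. The TM asks the TC to begin a global transaction; the global transaction is created and a globally unique XID is generated.
2. The XID is propagated through the context of the microservice call chain.
3. Each RM registers its branch transaction with the TC, placing it under the global transaction identified by the XID.
4. The TM asks the TC to commit or roll back the global transaction identified by the XID.
5. The TC drives all branch transactions under the XID to complete the commit or rollback.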
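
The five steps can be sketched with a toy, in-memory coordinator. This is an illustration only, not Seata's code: ToyCoordinator, ToyBranch, and RecordingBranch are hypothetical names, there is no networking, and rolling branches back in reverse registration order is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** What the toy coordinator needs from a branch transaction. */
interface ToyBranch {
    void commit();
    void rollback();
}

/** Toy TC: remembers which branches belong to which XID. */
class ToyCoordinator {
    private final Map<String, List<ToyBranch>> branchesByXid = new HashMap<>();

    /** Step 1: the TM asks for a new global transaction and gets its XID. */
    String begin() {
        String xid = UUID.randomUUID().toString();
        branchesByXid.put(xid, new ArrayList<>());
        return xid;
    }

    /** Step 3: an RM registers a branch under the XID it received. */
    void registerBranch(String xid, ToyBranch branch) {
        List<ToyBranch> branches = branchesByXid.get(xid);
        if (branches == null) {
            throw new IllegalArgumentException("unknown XID " + xid);
        }
        branches.add(branch);
    }

    /** Steps 4-5, commit: every branch under the XID is told to commit. */
    void globalCommit(String xid) {
        for (ToyBranch branch : finish(xid)) {
            branch.commit();
        }
    }

    /** Steps 4-5, rollback: branches are undone in reverse registration order. */
    void globalRollback(String xid) {
        List<ToyBranch> branches = finish(xid);
        for (int i = branches.size() - 1; i >= 0; i--) {
            branches.get(i).rollback();
        }
    }

    private List<ToyBranch> finish(String xid) {
        List<ToyBranch> branches = branchesByXid.remove(xid);
        if (branches == null) {
            throw new IllegalArgumentException("unknown XID " + xid);
        }
        return branches;
    }
}

/** Branch that only records what it was told to do. */
class RecordingBranch implements ToyBranch {
    private final String name;
    private final List<String> events;

    RecordingBranch(String name, List<String> events) {
        this.name = name;
        this.events = events;
    }

    @Override public void commit() { events.add(name + " commit"); }
    @Override public void rollback() { events.add(name + " rollback"); }
}

class ToyCoordinatorDemo {
    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        ToyCoordinator tc = new ToyCoordinator();
        String xid = tc.begin();                                      // step 1
        tc.registerBranch(xid, new RecordingBranch("user", events));  // step 3, user module
        // Step 2 happens here in a real system: the XID travels with the remote call.
        tc.registerBranch(xid, new RecordingBranch("order", events)); // step 3, order module
        tc.globalRollback(xid);                                       // steps 4-5
        System.out.println(events);
    }
}
```

In the demo earlier in this article, the user module plays the TM role and both modules act as RMs; the string passed around here as xid is what ties their branches together.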

Before analyzing Seata's execution flow in detail, we run the simple demo in several situations:
An exception before user calls order
An exception while user calls order
An exception after user calls order (breakpoint in order)
No exception after user calls order
Next, we work through the open questions one by one:
1. What is undo_log, and when is it created? When is it deleted? When t1 rolls back while t2 holds the local lock and t1 holds the global lock, will there be a deadlock?


io.seata.rm.datasource.undo.mysql.MySQLUndoLogManager#insertUndoLogWithNormal 
io.seata.tm.api.TransactionalTemplate#execute
io.seata.core.rpc.netty.RmMessageListener#onMessage
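Reading the source shows that, in the data source proxy layer, Seata inserts the before and after images of the modified rows into the undo_log table before the local transaction commits.
At the end of the aspect, if the transaction succeeded, a global commit is triggered: the TM notifies the TC, the TC looks up the branch transactions and tells each of them to delete its undo_log records.
If the transaction failed, a global rollback is triggered: the TM notifies the TC, the TC looks up the branch transactions and tells each branch to restore its data to the state before execution, and then to delete its undo_log records.
There is no deadlock: the time t2 spends trying to acquire the global lock is configurable (6 s by default); when it times out, t2 gives up on the global lock and releases the local lock.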
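
The undo_log life cycle described above can be sketched as follows. This is a simplified model, not Seata's implementation: ToyResourceManager and ToyUndoRecord are hypothetical names, the "tables" are in-memory maps, and each global transaction is assumed to touch a single row in this branch. The check against the after image before restoring is part of the sketch's safety logic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Before and after images of one row changed inside a global transaction. */
class ToyUndoRecord {
    final long rowId;
    final Long before; // null means the row did not exist before the change
    final Long after;

    ToyUndoRecord(long rowId, Long before, Long after) {
        this.rowId = rowId;
        this.before = before;
        this.after = after;
    }
}

/** Toy RM holding a business table and an undo_log table in memory. */
class ToyResourceManager {
    final Map<Long, Long> userAmount = new HashMap<>();         // business table: id -> amount
    final Map<String, ToyUndoRecord> undoLog = new HashMap<>(); // undo_log table: xid -> record

    /** Phase one: the business update and the undo record are written together. */
    void updateAmount(String xid, long rowId, long newAmount) {
        Long before = userAmount.get(rowId);
        userAmount.put(rowId, newAmount);
        undoLog.put(xid, new ToyUndoRecord(rowId, before, newAmount));
    }

    /** Phase two, commit: the data is already in place, so only the undo record goes. */
    void branchCommit(String xid) {
        undoLog.remove(xid);
    }

    /** Phase two, rollback: restore the before image, then delete the undo record. */
    void branchRollback(String xid) {
        ToyUndoRecord record = undoLog.get(xid);
        if (record == null) {
            return; // nothing was changed under this XID
        }
        // Refuse to overwrite a row that no longer matches the after image.
        if (!Objects.equals(userAmount.get(record.rowId), record.after)) {
            throw new IllegalStateException("row " + record.rowId + " was changed by someone else");
        }
        if (record.before == null) {
            userAmount.remove(record.rowId);
        } else {
            userAmount.put(record.rowId, record.before);
        }
        undoLog.remove(xid);
    }
}

class ToyUndoLogDemo {
    public static void main(String[] args) {
        ToyResourceManager rm = new ToyResourceManager();
        rm.userAmount.put(1L, 10L);
        rm.updateAmount("xid-1", 1L, 9L);   // phase one
        rm.branchRollback("xid-1");         // the global transaction failed
        System.out.println(rm.userAmount + " undo_log size=" + rm.undoLog.size());
    }
}
```

In the real framework the business write and the undo record share one local database transaction; the single method call here stands in for that.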

2. Can undo_log become inconsistent with the local transaction?

No. The business logic and the undo_log write are part of the same local transaction.

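3. If a row is modified by another transaction during execution, will the data be corrupted? (In other words, isolation.)

This is a question of write isolation and read isolation, and both are explained by the global lock and the local lock. For writes, a branch must obtain the global lock on the rows it changed before its local transaction commits, so another global transaction cannot overwrite those rows until the first one has finished.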
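
The "try for a bounded time, then give up" behaviour of the global lock can be sketched as below. ToyGlobalLockTable and the retryTimes and retryIntervalMillis parameters are hypothetical names chosen for this example; in Seata the lock table lives in the TC and the limits come from client configuration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy global lock table: row key -> XID of the global transaction holding it. */
class ToyGlobalLockTable {
    private final ConcurrentMap<String, String> ownerByRow = new ConcurrentHashMap<>();

    /** Succeeds if the row is free or already held by the same global transaction. */
    boolean tryLock(String rowKey, String xid) {
        String owner = ownerByRow.putIfAbsent(rowKey, xid);
        return owner == null || owner.equals(xid);
    }

    /**
     * Retries a bounded number of times. On false, the caller is expected to
     * roll back its local transaction, which releases the local row lock.
     */
    boolean lockWithRetry(String rowKey, String xid, int retryTimes, long retryIntervalMillis) {
        for (int attempt = 0; attempt < retryTimes; attempt++) {
            if (tryLock(rowKey, xid)) {
                return true;
            }
            try {
                Thread.sleep(retryIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    /** Called when a global transaction ends, after commit or rollback. */
    void releaseAll(String xid) {
        ownerByRow.values().removeIf(xid::equals);
    }
}

class ToyGlobalLockDemo {
    public static void main(String[] args) {
        ToyGlobalLockTable locks = new ToyGlobalLockTable();
        locks.tryLock("user_info:1", "t1");
        // t2 cannot get the global lock while t1 holds it, so it gives up.
        System.out.println("t2 acquired: " + locks.lockWithRetry("user_info:1", "t2", 3, 10L));
        locks.releaseAll("t1");
        System.out.println("t2 acquired: " + locks.lockWithRetry("user_info:1", "t2", 3, 10L));
    }
}
```

Because the waiting side always gives up after a bounded wait and then drops its local lock, the rolling-back transaction can eventually take the local lock it needs, which is why the scenario in question 1 does not deadlock.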

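4. Why do Seata and LCN offer multiple modes for handling distributed transactions?

Because every mode has its capability boundaries. Seata's AT mode, for example, requires the underlying database to support transactions natively and the tables to have a primary key.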

Summary of the AT mode
Seata's AT mode builds on local transactions and records custom rollback logs by intercepting and parsing SQL. This removes the blocking constraint of the XA protocol and strikes a balance between consistency, performance, and ease of use: it achieves deterministic consistency (not eventual consistency) while delivering higher performance, with no intrusion into business code at all.
In most application scenarios Seata's AT mode works well, reducing the cost of supporting distributed transactions in an application to a very low level.


crawler
Focused on technology for many years; formerly a lead developer at JD.com, HAND, and other companies.