Text | Zhao Xin (alias: Yu Yu)
Engineer in Ant Group's Trusted Native Technology Department, Alibaba Open Source Pioneer, and Alibaba Open Source Ambassador
Responsible for developing the DB Mesh system in Ant Group's Trusted Native Technology Department, and for open source work on containers and distributed transactions
This article is 3,846 words long and takes about 10 minutes to read
| Introduction |
Distributed transactions are a critical building block of the microservice technology stack. The most popular high-quality implementation that has been verified in large-scale production is undoubtedly Seata. For the past four years the Seata community has focused on its Java implementation, which has become the de facto standard platform for distributed transactions in the Java world.
Now that microservice frameworks such as gRPC are being built out in multiple languages, distributed transactions should support multiple languages as well. One of the key items in the 2022 Seata roadmap is therefore building Seata's multi-language technology system. After half a year of preparation, and especially now that Seata v1.5.2 has been released, the community's key task for the second half of 2022 is to build out Seata's multi-language implementations in full.
PART. 1--Key technical points
After four years of development, the Seata Java version has grown into a very large technical system. It is not possible to bring the multi-language versions to full feature parity with Seata Java within half a year. The community needs to weigh current urgent needs against future development directions, and first identify the key technical points of the multi-language versions.
1. Transaction Mode
Seata provides four classic transaction modes: TCC, SAGA, AT and XA. The chart below shows when each of the four modes was released.
Each mode has its own characteristics:
AT mode
It is a transaction mode that originated within Alibaba. In essence it is a two-phase transaction whose second-phase commit/rollback is generated automatically by the framework, so it does not intrude on business code. It is more user-friendly than TCC mode and is currently the mode with the most users. As of v1.5.2, AT mode achieves data isolation through global locks, so transactions on the same resource can only run serially and performance is average.
TCC mode
Participants at the business layer must implement the prepare/commit/rollback interfaces. Its performance is the best of the four modes, its data visibility and isolation are also good, and its popularity is second only to AT. It is especially suitable for financial services with high requirements on throughput, performance, and isolation.
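To make the prepare/commit/rollback contract concrete, here is a minimal sketch of what a TCC participant might look like. It assumes a hypothetical in-memory account service; the class and method names (`AccountParticipant`, `prepare`, `commit`, `rollback`) are illustrative and are not Seata's API.

```python
class AccountParticipant:
    """Hypothetical TCC participant: reserves funds in prepare, settles in phase two."""

    def __init__(self, balance):
        self.balance = balance   # funds still available to other transactions
        self.frozen = {}         # xid -> amount reserved by prepare

    def prepare(self, xid, amount):
        # Phase 1: reserve the resource without finalizing the business change.
        if xid in self.frozen:
            return True          # retried prepare is a no-op (idempotent)
        if amount > self.balance:
            return False         # not enough funds, the branch votes "no"
        self.balance -= amount
        self.frozen[xid] = amount
        return True

    def commit(self, xid):
        # Phase 2 on success: consume the reservation. Safe to call repeatedly.
        self.frozen.pop(xid, None)
        return True

    def rollback(self, xid):
        # Phase 2 on failure: release the reservation. Safe to call repeatedly.
        self.balance += self.frozen.pop(xid, 0)
        return True
```

Both phase-two methods are written to be idempotent, because a coordinator may retry them after a timeout.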
SAGA mode
It is a solution for long-running transactions. Its first-phase forward service and second-phase compensation service are both implemented by business developers. It combines easily with microservice frameworks, and its performance is second only to TCC mode. Because services are easy to orchestrate with it, it is widely used in microservice orchestration, although users need to write JSON files to orchestrate the services. Its data isolation is weak, however, and business data is at risk of dirty writes. The community is currently working on annotation-based SAGA ( https://github.com/seata/seata/pull/4577 ), which can further improve its performance and ease of use.
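The forward/compensation idea can be sketched in a few lines. This is a simplified illustration, not Seata's JSON state machine: `run_saga` is a hypothetical helper that takes (forward, compensate) pairs and, when a forward step fails, runs the compensations of the completed steps in reverse order.

```python
def run_saga(steps):
    """Run (forward, compensate) pairs in order.

    Returns True if every forward step succeeds. If one raises, the
    compensations of the steps already completed run in reverse order
    and the function returns False.
    """
    completed = []
    for forward, compensate in steps:
        try:
            forward()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

Note that nothing here hides intermediate results from other transactions, which is the isolation weakness described above.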
XA mode
Unlike the other three "compensating" transaction modes, it provides the strictest isolation: it ensures data consistency from a global perspective, that is, users will not see dirty reads. The interface of Seata's XA mode is basically the same as that of AT mode, so it is also user-friendly, and it is a strict implementation of the XA protocol. The disadvantage of XA mode is that its transactions span a longer scope, so its performance is the lowest of the four.
The release order of the four modes is as follows:
Each transaction mode has its own business scenarios within Alibaba, and each emerged and evolved to address the pain points of those scenarios. AT and XA do not need to understand business semantics and work at the DB driver + DB level, while TCC and SAGA require rollback and idempotency logic to be implemented at the business level. Dividing the four modes by data plane versus business plane, and by how much they intrude on business code, gives the classification shown in the figure below.
The cornerstones of distributed transactions are the communication framework, SQL parsing, and the database connection driver. Thanks to the rich ecosystem of the Java language, the Seata Java version can easily stand on the shoulders of these "giants", an advantage other languages cannot match. For example, all major databases provide Java DB drivers. In a multi-language setting, however, we have to consider how mature each of these technical points is in each language.
Consider the four transaction modes first. AT mode is essentially a transaction at the application layer: the Redo/Undo work that the database does at its own layer has to be done again at the application layer. A key technical point is that AT mode must hook into the data source to intercept and parse SQL. For SQL parsing alone, Java and Python have antlr, and Go has the freely available pingcap/parser from TiDB, but many other languages currently have nothing comparable.
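The application-layer undo idea can be illustrated with a toy example. The sketch below assumes a hypothetical in-memory key/value "table" in place of a real data source; `ToyATResource` is not part of Seata. A real AT implementation derives the before image by parsing the intercepted SQL and also takes global locks, both of which are omitted here.

```python
class ToyATResource:
    """Hypothetical stand-in for a data source proxy over an in-memory table."""

    def __init__(self, rows):
        self.rows = dict(rows)   # primary key -> value
        self.undo_log = {}       # xid -> list of (key, before_image)

    def update(self, xid, key, value):
        # Phase 1: record the before image, then apply the change locally.
        # A before image of None means the key did not exist yet.
        self.undo_log.setdefault(xid, []).append((key, self.rows.get(key)))
        self.rows[key] = value

    def commit(self, xid):
        # Phase 2 commit: data is already applied, so only drop the undo log.
        self.undo_log.pop(xid, None)

    def rollback(self, xid):
        # Phase 2 rollback: restore the before images in reverse order.
        for key, before in reversed(self.undo_log.pop(xid, [])):
            if before is None:
                self.rows.pop(key, None)
            else:
                self.rows[key] = before
```

The business code only calls `update`; the commit and rollback logic is generic, which is why this mode does not intrude on business code.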
Therefore, after weighing the actual situation, the community decided that, apart from the Go and Python versions, language versions that lack a SQL parsing package will not provide AT mode at first.
2. Communication Protocol
Whatever the transaction mode, it is built on Seata's own architecture.
The picture comes from the official Seata website
The overall architecture of Seata consists of the following roles:
Transaction Coordinator
TC for short. It maintains the state of global transactions and branch transactions, and drives the commit or rollback of global transactions.
Transaction Manager
TM for short. It defines the scope of a global transaction, and commits or rolls back the global transaction.
Resource Manager
RM for short. It runs in the same application as the branch transaction; it registers the branch transaction, reports its status, and drives its commit or rollback.
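The division of labor among the three roles can be sketched as follows. This is an in-process toy under stated assumptions: `ToyCoordinator` and `RecordingResource` are hypothetical names, the xid format is made up, and there is no networking, persistence, or status reporting.

```python
class RecordingResource:
    """Hypothetical RM stand-in that records the phase-two calls it receives."""

    def __init__(self, name, events):
        self.name = name
        self.events = events

    def commit(self, xid):
        self.events.append((self.name, "commit", xid))

    def rollback(self, xid):
        self.events.append((self.name, "rollback", xid))


class ToyCoordinator:
    """Hypothetical TC: tracks the branches of each global transaction."""

    def __init__(self):
        self.branches = {}   # xid -> list of enlisted resources
        self.next_id = 0

    def begin(self):
        # Called by the TM to open a global transaction.
        self.next_id += 1
        xid = f"xid-{self.next_id}"
        self.branches[xid] = []
        return xid

    def register_branch(self, xid, resource):
        # Called by an RM to enlist a branch in the global transaction.
        self.branches[xid].append(resource)

    def commit(self, xid):
        # Requested by the TM; the TC drives every branch to commit.
        for resource in self.branches.pop(xid):
            resource.commit(xid)

    def rollback(self, xid):
        # Requested by the TM; branches are rolled back in reverse order.
        for resource in reversed(self.branches.pop(xid)):
            resource.rollback(xid)
```

In this sketch the TM is simply the caller that invokes `begin` and later `commit` or `rollback`.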
TC communicates with the TM and each RM over long-lived connections built on the netty framework. Specifically, the Seata Java version defines a private, bidirectional binary protocol on top of layer-4 TCP. The key point is that the Seata Java version stands on the shoulders of the giant that is netty.
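To show the kind of work a private binary protocol over TCP implies for each language, here is a minimal sketch of length-prefixed framing. The frame layout (2-byte magic number, 1-byte message type, 4-byte body length) and the `MAGIC` value are assumptions made for illustration; they are not Seata's actual wire format.

```python
import struct

MAGIC = 0xCAFE                   # hypothetical magic number
HEADER = struct.Struct(">HBI")   # magic, message type, body length (big-endian)


def encode_frame(msg_type, body):
    """Prefix the body with a fixed-size header."""
    return HEADER.pack(MAGIC, msg_type, len(body)) + body


def decode_frames(buffer):
    """Split a TCP byte stream buffer into complete frames.

    Returns (frames, remaining), where frames is a list of
    (msg_type, body) tuples and remaining holds the bytes of a
    trailing partial frame, to be retried when more data arrives.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        magic, msg_type, length = HEADER.unpack_from(buffer, offset)
        if magic != MAGIC:
            raise ValueError("bad magic number")
        end = offset + HEADER.size + length
        if end > len(buffer):
            break                # body not fully received yet
        frames.append((msg_type, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

Because TCP delivers a byte stream rather than messages, the decoder has to cope with partial frames. This is one of the chores that a framework such as netty takes care of on the Java side.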
In a multi-language context, however, many languages do not have a mature TCP communication framework. For example, when Dubbo was building its Go version, dubbo-go, it needed to speak Dubbo's private binary protocol over TCP. My early work therefore focused first on implementing the TCP communication framework getty ( https://github.com/apache/dubbo-getty ), and only then on the serialization protocol dubbo-go-hessian2 ( https://github.com/apache/dubbo-go-hessian2 ). If the language is JS, PHP or Python instead, building the corresponding communication protocol would require a lot of effort from the community.
Seata adapted gRPC transaction context propagation at the API layer back in 2019. To make multi-language versions easier to build, the Seata Java framework itself is now undertaking an important piece of work: letting the Seata Client (TM and RM) communicate with the Seata Server (TC) cluster over gRPC. The hope is that gRPC's multi-language support will save the multi-language versions work at the communication level.
3. Configuration and Registration Center
Like other microservice frameworks, Seata depends on a registry and a configuration center in addition to the internal components described in the previous section. The upper-layer microservice application finds the registry through the configuration center, and then discovers each Seata service component through the registry. A typical complete Seata service architecture is shown in the figure below.
The Seata Java architecture can be easily embedded and injected into Java microservice frameworks such as Dubbo, SOFARPC, and Spring. Two of the very important external dependencies are:
Configuration Center
According to reference documents 1 and 3, the role of the configuration center is to "place various configuration files", and the client reads "the global transaction switch, transaction session storage mode" and other transaction configuration information from the configuration center.
The configuration center types supported by Seata Java are File, Nacos, Apollo, Zk, Consul, Etcd3, etc.
Registry Center
According to reference document 2, the registry records the mapping between services and their addresses. Services register themselves here, and when a service needs to call another service, it looks up that service's address in the registry and then calls it. For example, the Seata Client (TM, RM) performs service discovery on the Seata Server (TC) cluster through the registry.
The registries supported by Seata Java include File, Nacos, Eureka, Redis, Zk, Consul, Etcd3, SOFA, etc.
Leaving aside the complexity and maintenance cost of this system, the key problem for Seata's multi-language system is that these components have uneven multi-language support. It is unrealistic to expect the Seata community to build multi-language client implementations of these components just so that the other language versions of Seata can support them.
I think the solution is not to cut down the existing Seata Java implementation, but to extend Seata Java to support additional configuration centers and registries. In the cloud-native era, most cloud platforms are based on Kubernetes (K8s), and most microservice systems are building their new technology on K8s. So the Seata multi-language system can build a new name server in the following form:
Use K8s' ConfigMap to store Seata's configuration data
Seata is generally used as a component of other microservice platforms and frameworks such as Spring Cloud Alibaba, Dubbo, HSF, and SOFARPC. These frameworks have their own notion of a configuration center and support popular ones such as Nacos, Apollo, and SOFARegistry. Seata can reuse the configuration centers these frameworks already have to reduce maintenance costs.
The multi-language versions of Seata can implement the File configuration type first. When a microservice application that uses Seata runs on the Kubernetes platform, the Seata configuration can be mounted through a ConfigMap.
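A File-type configuration reader is small enough to sketch. The example below assumes a simple `key=value` file format, a hypothetical environment variable `SEATA_CONFIG_FILE`, and a hypothetical default mount path. None of these are defined by Seata; they only illustrate reading a file that a ConfigMap volume has placed in the container.

```python
import os


def parse_config(text):
    """Parse key=value lines; blank lines and # comments are skipped."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def load_config(path=None):
    # Both the environment variable and the default path are hypothetical;
    # in Kubernetes the file would come from a mounted ConfigMap volume.
    path = path or os.environ.get(
        "SEATA_CONFIG_FILE", "/etc/seata/seata.properties"
    )
    with open(path, encoding="utf-8") as f:
        return parse_config(f.read())
```

With this approach the client needs only file I/O, so no language-specific Nacos or Apollo SDK is required.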
Use K8s API server or DNS component to act as Seata's registry
As with the configuration center, Seata's registry can reuse the registry of the host microservice framework, such as Nacos, Etcd, or Zookeeper. When running on the Kubernetes platform, the API server serves as the registry. The multi-language versions of Seata implement the File type first, in which the API server address can be configured.
The Seata Server (that is, TC) can be deployed in multiple namespaces, and each namespace can contain multiple TC clusters. The Seata Client (TM and RM) can obtain the TC cluster addresses in the form of a service, which not only makes the TC highly available but also makes it convenient to load balance across the TC cluster at the client level.
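Client-side load balancing over a set of TC addresses can be as simple as round-robin. The sketch below assumes the address list has already been obtained (for example from a File-type registry entry or a service lookup); `RoundRobinTC` and the addresses in the usage note are hypothetical.

```python
import itertools


class RoundRobinTC:
    """Hypothetical client-side balancer that cycles through TC addresses."""

    def __init__(self, addresses):
        addresses = list(addresses)
        if not addresses:
            raise ValueError("at least one TC address is required")
        self._cycle = itertools.cycle(addresses)

    def next_address(self):
        # Each call returns the next address, wrapping around at the end.
        return next(self._cycle)
```

For example, a client created with `RoundRobinTC(["tc-0:8091", "tc-1:8091"])` would alternate between the two addresses on successive calls to `next_address()`. A production client would also need to refresh the list when TC instances come and go.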
PART. 2--Summary
From the discussion of transaction modes, communication protocol, and configuration and registration centers, we can see that Seata's multi-language implementations are not built solely by following the Seata Java version. Rather, the Seata Java version co-evolves with them as one part of Seata's multi-language system.
PART. 3--Overall work progress
The TIOBE index regularly ranks the popularity of mainstream programming languages. Taking the top 10 languages in this ranking as candidates, I launched a multi-language poll in Seata's two main DingTalk groups at the beginning of this month (July 2022). The results are as follows:
According to the voting results and the technical reserve of relevant talents in the community, the community decided to focus on the construction of Go, Python and JS. The current progress of work in several languages is as follows:
seata-go
Project address: https://github.com/seata/seata-go
Community DingTalk group: 33069364
seata-python
Project address: https://github.com/opentrx/seata-python
Community DingTalk group: 44788121
seata-js
Project address: https://github.com/seata/seata-js
Community DingTalk group: 44788119
seata-rust
Project address: https://github.com/seata/seata-rust
Community DingTalk group: 44791799
seata-php
Project address: https://github.com/seata/seata-php
Community DingTalk group: 44788115
Historically, Seata had two Go versions: seata-go, contributed by Zhang Xu in 2019, and seata-golang, contributed by Liu Xiaomin in 2020. To unify the effort, the two projects have now been merged. seata-go is the most complete of the multi-language versions and already implements the TCC and AT modes. The XA and SAGA modes are in progress, and the first release is expected in the fall. The next most complete is seata-python, which provides the AT transaction mode.
Considering that most of Seata's developers and users are currently in China, the Seata community has built a number of language community groups to promote the development of language versions.
Seata's multi-language effort is currently in full swing. Industry colleagues are welcome to join the groups and work with us on the various language versions of Seata, raise the level of transaction technology in each language's microservice framework, and open a new chapter in distributed technology!
【Reference document】
1. Analysis of Seata application side startup process - registration center and configuration center modules:
https://seata.io/en-us/blog/seata-client-start-analysis-02.html
2. The realization principle of Seata registration center:
https://seata.io/en-us/blog/seata-config-center.html
3. The realization principle of Seata configuration center:
https://seata.io/en-us/docs/user/registry/index.html
4. Seata Enterprise Edition:
https://developer.aliyun.com/article/928860