Author: vivo Internet Middleware Team - Liu Runyun
Many business systems use message middleware for inter-system decoupling, asynchronous processing, and peak shaving. In its early days, the company built a high-availability message middleware platform on RabbitMQ. As the business kept growing, so did the message volume, placing higher demands on the platform; during operation and maintenance we also ran into problems such as difficulty in guaranteeing high availability and missing functional features. Based on these problems, we decided to introduce RocketMQ as a replacement. This article describes how we built a message middleware platform on RocketMQ and smoothly migrated online business to it without the business noticing.
1. Background Description
In 2016, the vivo Internet middleware team began providing a high-availability messaging middleware platform to business teams based on open source RabbitMQ.
To cope with rapidly growing business traffic, we met the businesses' platform capability requirements through reasonable cluster splitting and dynamic adjustment.
However, as the business continued to develop over the long term, the message volume kept growing, and RabbitMQ's architecture showed clear limitations in high-concurrency, high-traffic scenarios. The main problems are as follows:
1.1 Insufficient high availability
The architecture carries a risk of split-brain; by default the cluster cannot recover automatically after a split-brain, and manual recovery risks data loss.
To mitigate split-brain, the network-partition handling can be switched to pause_minority mode, but this introduces a new problem: a brief network jitter may leave the cluster in an unrecoverable failed state.
1.2 Insufficient performance
After a business message is sent, it is routed through an exchange to the corresponding queue. Each queue is actually hosted by a single node in the cluster, so that node can become a bottleneck under high traffic.
A queue cannot be migrated quickly away from the node that hosts its traffic; forcibly migrating it to a lower-load node may make the queue unavailable. As a result, adding nodes to the cluster does not quickly increase its traffic-carrying capacity.
Cluster performance is low: in our tests, a three-node cluster sustained only tens of thousands of TPS. Because each queue is hosted by a single node, the performance of an individual queue cannot be scaled up, so the cluster cannot support high-traffic business.
When 10 million or more messages accumulate, cluster performance degrades. Even with a large backlog, if consumption TPS is very high, sending performance may drop because of disk overhead; with too large a backlog, recovery takes a long time or the cluster may not recover at all.
1.3 Insufficient features
By default, RabbitMQ immediately redelivers messages whose consumption failed, so even a few abnormal messages can prevent the business from consuming subsequent messages.
In terms of functional features, transactional messages and ordered messages are not supported.
Message tracing can be implemented by the business itself, but it imposes a very large performance penalty on the cluster, so in production the message tracing function cannot be provided based on RabbitMQ's native capabilities.
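To illustrate the immediate-redelivery problem mentioned above, here is a minimal Java sketch of the default behavior with the RabbitMQ Java client: when a consumer rejects a message with requeue enabled, the broker redelivers it right away, so one poison message can block the consumer. Queue name, host, and class names are illustrative only.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

public class ImmediateRequeueDemo {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");                       // illustrative broker address
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();
        channel.queueDeclare("demo.queue", true, false, false, null);

        DeliverCallback onDeliver = (consumerTag, delivery) -> {
            long tag = delivery.getEnvelope().getDeliveryTag();
            try {
                process(delivery.getBody());                // business handling, may throw
                channel.basicAck(tag, false);
            } catch (Exception e) {
                // requeue = true: RabbitMQ redelivers the message immediately, so a
                // persistently failing ("poison") message keeps coming back and can
                // delay consumption of the messages behind it.
                channel.basicNack(tag, false, true);
            }
        };
        channel.basicConsume("demo.queue", false, onDeliver, consumerTag -> { });
    }

    private static void process(byte[] body) { /* business logic */ }
}
```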
2. Project goals of the message middleware platform
Based on the above problems, the middleware team began investigating next-generation message middleware platform solutions in Q4 2020. To make sure the new platform would meet future business needs, we first clarified its construction goals, which fall into two parts:
- Business needs
- Platform requirements
2.1 Analysis of business needs
High performance: support very high TPS and horizontal scaling so that business traffic growth can be absorbed quickly; the message middleware must not become the bottleneck of business request links.
High availability: extremely high platform availability (>99.99%), extremely high data reliability (>99.99999999%).
Rich features: support clustering consumption and broadcast consumption; support transactional, ordered, delayed, and dead-letter messages; support message tracing.
2.2 Analysis of Platform Operation and Maintenance Requirements
- Operation and maintenance: business use permission verification; business production and consumption traffic restrictions; business traffic isolation and rapid migration capabilities.
- Observability: rich performance metrics to observe how the cluster is running.
- Controllability: the ability to quickly do secondary development on the open source components, enrich platform features, and fix related problems.
- Cloud native: in the future, provide cloud-native message middleware based on containerization for greater elasticity and scalability.
- Summary: we need to build a high-performance, high-reliability next-generation message middleware with extremely high data reliability and rich functional features. It must also be fully compatible with the current RabbitMQ platform so that businesses can migrate to the new platform quickly and at low migration cost.
3. Research on open source component selection
Based on the problems of the current RabbitMQ platform and the requirements for the next-generation messaging middleware platform, we investigated the two currently most popular message middlewares: RocketMQ and Pulsar.
In the research process, the comparison mainly focuses on the following two aspects:
3.1 High Availability Analysis and Comparison
3.1.1 Comparison of High Availability Architecture and Load Balancing Capability
Pulsar deployment architecture (source: Pulsar community)
RocketMQ deployment architecture (source: RocketMQ community)
- Pulsar:
- Adopts a compute-storage separation architecture, enabling massive data storage and tiered storage of hot and cold data.
- Broker failover is controlled based on ZK and Manager nodes to achieve high availability.
- Storage adopts a hierarchical, sharded design, which naturally supports load balancing.
- RocketMQ:
- Adopts an integrated compute-storage architecture deployed in master-slave mode; a failed master node does not prevent messages from being read. Topics use a sharded design.
- Secondary development is required to support master-slave switching and achieve high availability.
- Broker-level automatic load balancing is not provided; simple load balancing can be achieved by distributing the top-N traffic topics across different brokers.
3.1.2 Comparison between capacity expansion and failure recovery
- Pulsar
- Broker and BookKeeper scale out independently, and load is rebalanced automatically after scaling.
- Broker nodes are stateless; after a failure, the topics they host are automatically transferred to other brokers, completing failure recovery within seconds.
- BookKeeper's auto-recovery service re-replicates ledger data until the configured write quorum (Qw) is restored.
- During a failure, acknowledged messages are not lost; unacknowledged messages must be resent by the client.
- RocketMQ
- After brokers are scaled out or in, manual intervention is required to rebalance topic traffic. An automatic load-balancing component can be developed that uses topic read/write permission control to rebalance traffic automatically after scaling.
- High availability is achieved through master-slave switching. Because clients refresh routes from NameServer only every 30 seconds, failure recovery takes 30 to 60 seconds; combined with a client-side degradation strategy that proactively removes abnormal broker nodes, recovery can be faster.
- With synchronous replication and asynchronous disk flushing, a small number of messages may be lost in extreme cases; with synchronous replication and synchronous flushing, written messages are not lost.
3.1.3 Performance comparison
- Pulsar
- Can support millions of topics; the practical limit is the metadata capacity of ZooKeeper.
- According to our internal stress tests, it sustains hundreds of thousands of TPS with 1 KB messages.
- RocketMQ
- Logically it can support millions of topics, but in practice, once topics reach the tens of thousands, heartbeat packets between brokers and NameServer may time out; a single cluster is recommended to stay under 50,000 topics.
- According to our stress tests, it sustains 100,000+ TPS with 1 KB message bodies.
3.2 Comparison of functional characteristics
3.3 Summary
From the high-availability analysis: Pulsar separates compute from storage via the BookKeeper component and can recover from failures quickly; RocketMQ uses a master-slave replication architecture, and failure recovery relies on master-slave switching.
From the feature analysis: Pulsar supports rich expiration policies, message deduplication, and exactly-once consumption semantics for real-time computing; RocketMQ has better support for online business in transactional messages, message tracing, and consumption patterns.
Weighing these two aspects, we finally chose RocketMQ to build our next-generation messaging middleware platform.
4. Smooth Migration Construction
Through this research, we decided to build the next-generation messaging middleware platform on RocketMQ.
To migrate business smoothly from RabbitMQ to RocketMQ, a message gateway is needed to convert messages from the AMQP protocol to RocketMQ. In addition, RabbitMQ and RocketMQ differ in metadata semantics and storage, so the metadata semantics must be mapped and the metadata stored independently.
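As a rough illustration of the kind of translation the gateway performs, an AMQP exchange and routing key might be mapped onto a RocketMQ topic and tag. The mapping rule and names below are hypothetical and are not the platform's actual rules:

```java
// Hypothetical illustration only: one simple way a gateway could map AMQP
// addressing metadata onto RocketMQ addressing metadata.
public final class AmqpToRocketMQMapping {

    /** RocketMQ destination derived from AMQP metadata. */
    public record Destination(String topic, String tag) { }

    /**
     * Maps an AMQP (exchange, routingKey) pair to a RocketMQ (topic, tag) pair.
     * A real mapping must also handle exchange types (direct/topic/fanout),
     * queue bindings, and illegal characters in names.
     */
    public static Destination map(String exchange, String routingKey) {
        String topic = exchange.replace('.', '_');                    // one topic per exchange
        String tag = routingKey.isEmpty() ? "*" : routingKey.replace('.', '_');
        return new Destination(topic, tag);
    }
}
```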
There are four main things that need to be done:
4.1 Differences between independent deployment and embedded deployment of message gateways
4.2 Metadata Definition Mapping and Maintenance
4.3 High-performance message push without interfering with each other
RabbitMQ consumes messages in push mode. RocketMQ also supports push consumption, but under the AMQP protocol the number of messages a client may cache is bounded by the prefetch parameter, which keeps the client from exhausting memory by caching too many messages. The message gateway must therefore honor AMQP prefetch semantics when it implements message push.
At the same time, each message gateway has to push messages for thousands or even tens of thousands of queues. The consumption rate differs from queue to queue, and any queue may have messages ready to push at any moment, so pushes must be timely and must not interfere with one another.
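For reference, the prefetch limit is what an AMQP client sets with basicQos on its channel; the gateway must stop pushing once that many messages are unacknowledged. A minimal Java sketch of the client side, with an illustrative queue name and prefetch value:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class PrefetchDemo {
    public static void main(String[] args) throws Exception {
        Connection conn = new ConnectionFactory().newConnection();
        Channel channel = conn.createChannel();

        // At most 100 unacknowledged messages may be outstanding on this channel;
        // the broker (or, here, the message gateway) must stop pushing until the
        // client acknowledges some of them.
        channel.basicQos(100);

        channel.basicConsume("demo.queue", false,
                (consumerTag, delivery) ->
                        channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false),
                consumerTag -> { });
    }
}
```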
To achieve efficient, non-interfering message push, there are two candidate strategies:
- Use an independent thread per queue, which guarantees isolation and timeliness but cannot scale to a massive number of queues.
- Use semaphores, blocking queues, and similar primitives to push on demand, only when there are pushable messages and a client able to consume them, so that a small number of threads can handle the push efficiently.
In the end we chose the second option; the data flow is shown in the following figure:
The consumption flow works as follows: when a client connects to the message gateway, the gateway builds a RocketMQ push-consumer instance for it, injects a custom ConsumeMessageService instance, and uses a semaphore to track how many more messages the client is allowed to receive.
When messages are pushed from the cluster to the gateway, each pushed batch is wrapped as a task and stored in the BlockingQueue of the corresponding ConsumeMessageService instance. A push thread polls all ConsumeMessageService instances; whenever it finds locally cached messages together with a business client able to consume them, it submits push tasks to a thread pool.
To prevent a few queues with very high consumption rates from degrading the push timeliness of the others, each ConsumeMessageService may push only a limited number of messages per round before the push thread moves on to other queues, which keeps pushes timely and non-interfering across all queues.
After the client acks or nacks a message, the semaphore is released to allow the next push. This also ensures that a massive volume of messages can be pushed with only a small amount of thread resources.
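The following is a highly simplified Java sketch of this mechanism, assuming one semaphore per client (its remaining prefetch credit) and one BlockingQueue per queue. The class name ConsumeMessageService follows the description above, but everything else is illustrative rather than the platform's actual implementation:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

// Simplified sketch: per-queue buffer plus per-client semaphore so that a small,
// shared thread pool can push messages for many queues without one fast queue
// starving the others.
class ConsumeMessageService {
    private final BlockingQueue<byte[]> pending = new LinkedBlockingQueue<>();
    private final Semaphore clientCredit;              // messages the client may still receive
    private static final int MAX_BATCH = 32;           // cap per round so other queues get a turn

    ConsumeMessageService(int prefetch) {
        this.clientCredit = new Semaphore(prefetch);
    }

    /** Called when the RocketMQ consumer delivers a batch of messages to the gateway. */
    void submitMessages(List<byte[]> messages) {
        pending.addAll(messages);
    }

    /** Called by the shared push thread; pushes at most MAX_BATCH messages per round. */
    void pushRound(ExecutorService pushPool) {
        int pushed = 0;
        while (pushed < MAX_BATCH && !pending.isEmpty() && clientCredit.tryAcquire()) {
            byte[] body = pending.poll();
            if (body == null) {                         // raced with another round
                clientCredit.release();
                break;
            }
            pushPool.submit(() -> pushToClient(body));
            pushed++;
        }
    }

    /** Called when the client acks/nacks a message: return one unit of credit. */
    void onClientAck() {
        clientCredit.release();
    }

    private void pushToClient(byte[] body) { /* write AMQP basic.deliver frame */ }
}
```

A single polling thread (not shown) would iterate over all ConsumeMessageService instances and call pushRound on each, so a small, fixed thread pool can serve an arbitrary number of queues.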
4.4 Realization of Consumption Start-stop and Consumption Current Limiting Capability
Based on the message gateway, consumption start/stop and consumption rate-limiting logic can be added to the message push path.
Consumption start/stop lets a business quickly pause all consumption, or stop consumption by particular abnormal nodes.
Consumption rate limiting lets a business control its consumption rate and avoid putting excessive pressure on downstream dependencies.
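A sketch of how such controls could sit in front of the push loop above, using Guava's RateLimiter as an example token-bucket limiter (the article does not specify which limiter the gateway actually uses, and the class below is illustrative):

```java
import com.google.common.util.concurrent.RateLimiter;

// Sketch: a per-consumer-group start/stop switch and rate limit checked before each push.
class ConsumptionControl {
    private volatile boolean consumptionEnabled = true;     // start/stop switch
    private final RateLimiter limiter;                      // token-bucket limiter

    ConsumptionControl(double maxMessagesPerSecond) {
        this.limiter = RateLimiter.create(maxMessagesPerSecond);
    }

    void pauseConsumption()  { consumptionEnabled = false; }
    void resumeConsumption() { consumptionEnabled = true; }

    /** Returns true if the gateway may push one more message to this group right now. */
    boolean mayPush() {
        return consumptionEnabled && limiter.tryAcquire();
    }
}
```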
4.5 Platform Architecture
- The above results in the platform architecture shown: a new AMQP-proxy message gateway service converts AMQP messages to RocketMQ messages to support business message production and consumption.
- The mq-meta service is built to maintain the metadata information of the cluster.
- The cluster's master-slave switchover is controlled by mq-controller to keep the cluster highly available. Cluster monitoring and a load-balancing module have also been added to further ensure high availability.
5. Platform construction progress and migration benefits
5.1 Benefits for business use
5.1.1 Higher and more stable message sending performance
Stress-test performance of the native RabbitMQ cluster
Stress-test performance after introducing the message gateway
5.1.2 Richer features
- Unified message expiration time
- Messages that fail consumption are redelivered with a gradient delay
- Direct support for the broadcast consumption mode
- Message tracing is provided on demand across the entire environment
- Supports resetting consumption to an earlier position
5.1.3 Changes in service usage characteristics
- Messages are no longer retained indefinitely; the default retention time is 3 to 7 days (the actual retention depends on cluster configuration)
- Failed consumption no longer triggers immediate redelivery; messages are redelivered with a gradient delay and become dead-letter messages after repeated failures (see the sketch after this list)
- Broadcast consumption is supported directly; note that in broadcast mode there is no redelivery on failure, and each message is consumed once per node
- Business production and consumption performance can support horizontal expansion
- Does not support consumption priority function
- The default consumption timeout is 15 minutes; after the timeout the message is redelivered. The timeout can be adjusted as needed.
- Support consumption start and stop (global or limit some node consumption)
- Support global consumption current limit
- Message body size is limited, currently to 256 KB; messages exceeding the limit are rejected. Traffic governance will be added later to limit business traffic that sends large message bodies.
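For reference, a minimal RocketMQ push-consumer sketch showing the native behaviors behind the list above: returning RECONSUME_LATER triggers gradient (delayed) redelivery rather than an immediate retry, and the consume timeout defaults to 15 minutes and is adjustable. The group, topic, and NameServer address are illustrative, and businesses on this platform reach these behaviors through the AMQP gateway rather than this API directly:

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;

public class RetryDemoConsumer {
    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_group");
        consumer.setNamesrvAddr("127.0.0.1:9876");   // illustrative NameServer address
        consumer.setConsumeTimeout(15);              // consume timeout in minutes (default 15)
        consumer.subscribe("demo_topic", "*");

        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, context) -> {
            try {
                // business handling
                return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
            } catch (Exception e) {
                // Redelivered later according to the delay-level gradient; after the
                // maximum number of retries the message goes to the dead-letter topic.
                return ConsumeConcurrentlyStatus.RECONSUME_LATER;
            }
        });
        consumer.start();
    }
}
```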
5.2 Platform operation and maintenance benefits
After migrating from RabbitMQ to RocketMQ, the supported business traffic rose from 10,000 TPS to 100,000 TPS, the supported message volume rose from hundreds of millions to tens of billions, machine resource consumption dropped by more than 50%, operation and maintenance difficulty and cost fell sharply, and more features can now be built on top of the message gateway.
6. Future Outlook
In the future, the middleware team plans to iteratively evolve the message middleware in three aspects:
- Based on the capabilities of the message gateway, continue to enrich platform features and govern business messages.
- Over the past five years of building high availability on open source RabbitMQ, we found that having business teams access the platform directly with open source SDKs makes SDK upgrades difficult and couples businesses to the back-end middleware type. We plan to offer the message queue engine as a service based on gRPC and the message gateway, so businesses no longer need to care which open source message middleware is used underneath.
- Investigate the compute-storage separation architecture of RocketMQ 5.0 and upgrade the messaging middleware architecture again.