
Author: Wang Yuning (Yan Ran)

RTMS stands for Real Time Message Service. Building on the idea of frame synchronization, we implemented a real-time network communication solution in which the server only forwards messages and performs no logic processing. It is especially suited to business scenarios with high requirements for real-time performance and interactivity.

In terms of implementation, RTMS ships as a server-side SDK that is integrated and deployed together with the business server. This reduces east-west traffic forwarding and removes the dependency on external storage services (such as a distributed cache or a database). All computation happens in memory, and RTMS provides the underlying network capabilities for the business. This simplified design greatly improves real-time performance and stability.

Business background

To improve user activity and stickiness, Alipay runs many interactive products such as Ant Forest, Ant Manor, and Baba Farm. During the 618 and Double 11 promotions in previous years there was a cat-stacking interactive product, and the New Year Wufu (Five Blessings) campaign has, since the year before, included a Nian Beast fighting game. However, interaction in Alipay's in-app interactive products is still limited: users cannot compete against each other in real time, so interactivity and participation are low.

Now in its seventh year, the New Year Wufu campaign, a major IP, urgently needs new gameplay to keep attracting users. User research also showed strong demand for adding "interaction, PK, and battle" elements to Wufu activities. The network capabilities of RTMS fit well with the needs of the online Nian Beast gameplay designed by the Wufu product team.

Business pain points

1. Multi-person real-time interactive scenarios: complex processing and poor real-time performance

The server side is usually a distributed, stateless microservice architecture, and the load balancer sends each client request to a random business server. In a multi-person real-time interactive scenario, requests from users in the same interactive "room" are therefore routed to different servers. To obtain and maintain the state of the "room", the interactive service has to store and exchange data through a distributed cache or a centralized database.

To send the processed result back to the clients, the service also has to rely on the centralized message core service. Because the clients' persistent connections are spread across different servers, the message core manages connection information in a distributed cache so that any message core server can locate the server holding a specific user's persistent connection when pushing. In addition, the message core was originally designed to guarantee that a message eventually reaches the client even if the client is offline, so it persists every message to the database.

With this traditional network communication solution, business processing is therefore complex: it requires multiple accesses to the distributed cache and database and generates considerable east-west traffic and requests inside the server cluster, which increases both processing complexity and latency. Message delivery is reliable, but the solution cannot meet the high-frequency, real-time demands of interactive scenarios.

2. Large-scale message subscription by arbitrary users: complex and time-consuming

Alipay's stock business covers a large number of stocks. While the market is open, the data for each stock changes every second, and the volume of information is huge. Users also switch frequently between the stocks they are viewing, and every day millions of users view quotes for different stocks at the same time. Pushing this rapidly changing market data to all clients accurately and quickly is a hard problem.

The server has to track and maintain the one-to-many relationship between users and stocks in real time, which is complicated. When a message is generated, the business has to traverse the user list and call the message core service to send the same message to the clients one by one. For popular stocks, sending to the full user list in batches takes a long time. As the number of users keeps growing, this invocation pattern becomes heavier and heavier and can no longer support the growth of the business.

Design

After thinking carefully about the demands of these new business forms, we designed RTMS, a new north-south real-time message push solution that meets the business need for highly real-time, fast, large-scale message diffusion.

Decentralization & De-persistence

Unlike the centralized message core, RTMS is fully decentralized. Connection management, uplink message routing and delivery, and downlink message diffusion and distribution all run on the same machine as the business service through the SDK, which reduces east-west traffic between the business service and the message core.

In addition, these scenarios have strict real-time requirements, which also means that messages generated while a client is offline matter much less. We therefore removed server-side message persistence from the design. High-frequency messages are no longer persisted, which improves the real-time performance of the whole link and avoids heavy storage pressure.

Two-way communication

With the two-way communication capability of RTMS, the client sends uplink requests and the server pushes processing results back to the client over the same link. The link is simpler and the programming model is more unified.

Uplink message

An uplink message is a message sent from the client to the server. Uplink messages are first placed in the user's own uplink message queue, and the polling (inspection) task bound to that queue then asynchronously delivers them to the specified destination.
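The queue-then-deliver flow described above can be sketched roughly as follows. This is a minimal illustration, not the RTMS SDK: the class `UplinkQueue`, its methods, the `deliver` callback, and the batch size are all hypothetical names chosen for the example, and the task bound to the queue is modeled as a plain `poll_once` call that some scheduler would invoke periodically.

```python
from collections import deque
from typing import Callable, Deque, Tuple


class UplinkQueue:
    """Per-user uplink queue (hypothetical sketch, not the RTMS API)."""

    def __init__(self, user_id: str,
                 deliver: Callable[[str, str, bytes], None]) -> None:
        self.user_id = user_id
        # Assumed callback signature: (sender, destination, payload).
        self._deliver = deliver
        self._pending: Deque[Tuple[str, bytes]] = deque()

    def enqueue(self, destination: str, payload: bytes) -> None:
        # Called when a message arrives on the connection: only buffers it.
        self._pending.append((destination, payload))

    def poll_once(self, max_batch: int = 64) -> int:
        # One pass of the task bound to this queue: drain up to max_batch
        # messages in FIFO order and hand them to the destination.
        sent = 0
        while self._pending and sent < max_batch:
            destination, payload = self._pending.popleft()
            self._deliver(self.user_id, destination, payload)
            sent += 1
        return sent


delivered = []
queue = UplinkQueue("alice", lambda s, d, p: delivered.append((s, d, p)))
queue.enqueue("meeting-1", b"attack")
queue.enqueue("meeting-1", b"dodge")
queue.poll_once()  # delivery happens here, decoupled from enqueue
```

Separating `enqueue` from `poll_once` keeps the receiving path short, and the per-user queue preserves the order of one user's messages.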

Downlink message

A downlink message is a message sent from the server to the client. Meeting messages and Topic messages are first placed in the message queue of their Meeting or Topic, and a polling (inspection) task then asynchronously delivers them to each user's downlink message queue. Each user downlink message queue has its own polling task, which asynchronously pushes the messages to the client.
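The two-stage downlink path (group queue, then per-user queue, then client) might look like the sketch below. All names here (`GroupQueue`, `UserDownlinkQueue`, `add_member`, `publish`, `poll_once`) are assumptions made for illustration, and each task is again reduced to a `poll_once` call rather than a real scheduler.

```python
from collections import deque
from typing import Callable, Deque, Dict


class UserDownlinkQueue:
    """Per-user downlink queue (hypothetical sketch)."""

    def __init__(self, user_id: str,
                 push: Callable[[str, bytes], None]) -> None:
        self.user_id = user_id
        self._push = push  # assumed to write to the user's connection
        self._pending: Deque[bytes] = deque()

    def enqueue(self, payload: bytes) -> None:
        self._pending.append(payload)

    def poll_once(self) -> int:
        # Stage 2: push everything buffered for this user to the client.
        sent = 0
        while self._pending:
            self._push(self.user_id, self._pending.popleft())
            sent += 1
        return sent


class GroupQueue:
    """Queue of one Meeting or Topic (hypothetical sketch)."""

    def __init__(self) -> None:
        self._members: Dict[str, UserDownlinkQueue] = {}
        self._pending: Deque[bytes] = deque()

    def add_member(self, queue: UserDownlinkQueue) -> None:
        self._members[queue.user_id] = queue

    def publish(self, payload: bytes) -> None:
        self._pending.append(payload)

    def poll_once(self) -> int:
        # Stage 1: copy each group message into every member's queue.
        moved = 0
        while self._pending:
            payload = self._pending.popleft()
            for queue in self._members.values():
                queue.enqueue(payload)
            moved += 1
        return moved
```

With two queues in the path, a slow client only backs up its own user queue and does not block fan-out to the other members.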

Directed routing

With directed routing, the persistent connections of users who logically belong to the same "room" are routed to the RTMS SDK on the same server for processing.

The server generates a Token through a certain mechanism; the Token represents the basic information of a specific server. Clients specify this Token when establishing a persistent connection, and the load balancer recognizes it and routes connections carrying the same Token to the same server. The business logic for all users in a "room" can therefore run on a single server, which reduces business processing complexity, avoids centralized storage, and improves the real-time performance of messages.
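The article does not say how the Token is built, so the following is only one possible sketch. It assumes the Token is the server ID, base64-encoded and signed with an HMAC key shared between the servers and the load balancer. The functions `issue_token`, `resolve_token`, and `route`, and the constant `SECRET`, are hypothetical.

```python
import base64
import hashlib
import hmac
import random
from typing import List, Optional

# Hypothetical key shared by business servers and the load balancer.
SECRET = b"demo-secret"


def _sign(body: str) -> str:
    return hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()


def issue_token(server_id: str) -> str:
    # Encode the server identity and append a signature.
    body = base64.urlsafe_b64encode(server_id.encode()).decode()
    return f"{body}.{_sign(body)}"


def resolve_token(token: str) -> Optional[str]:
    # Return the server ID if the signature is valid, otherwise None.
    body, sep, sig = token.partition(".")
    if not sep:
        return None
    if not hmac.compare_digest(sig.encode(), _sign(body).encode()):
        return None
    try:
        return base64.urlsafe_b64decode(body.encode()).decode()
    except (ValueError, UnicodeDecodeError):
        return None


def route(token: Optional[str], servers: List[str]) -> str:
    # Connections with the same valid Token land on the same server;
    # anything else falls back to ordinary random load balancing.
    target = resolve_token(token) if token else None
    if target in servers:
        return target
    return random.choice(servers)
```

In this sketch the business server would call `issue_token` when it creates a room and return the Token to every client in that room, so all of their persistent connections resolve to one machine.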

Publish-subscribe mode

In addition to the basic mode of pushing messages to specified users, RTMS provides two push modes, Meeting and Topic, for different message diffusion scenarios.

Meeting mode

A Meeting is an abstraction of one instance of a "room" in the business. For example, one round of the online Nian Beast game corresponds to one Meeting, and all users in that round join the same Meeting. A Meeting lives on a single server, so the uplink and downlink messages of all its users are processed centrally on that server.
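Because a Meeting lives on one server, its membership can be held in local memory and messages can be forwarded without any external storage. The sketch below illustrates that idea under assumptions: the `Meeting` class and its `join`, `leave`, and `forward` methods are hypothetical, and the choice to echo a message back to its sender (as frame-synchronization designs often do) is also an assumption.

```python
from typing import Callable, Dict


class Meeting:
    """In-memory room living on a single server (hypothetical sketch)."""

    def __init__(self, meeting_id: str) -> None:
        self.meeting_id = meeting_id
        # user_id -> callback that sends (sender, payload) to that user.
        self._members: Dict[str, Callable[[str, bytes], None]] = {}

    def join(self, user_id: str,
             send: Callable[[str, bytes], None]) -> None:
        self._members[user_id] = send

    def leave(self, user_id: str) -> None:
        self._members.pop(user_id, None)

    def forward(self, sender: str, payload: bytes) -> int:
        # The server only forwards: no game logic, no cache, no database.
        if sender not in self._members:
            raise KeyError(f"{sender} is not in meeting {self.meeting_id}")
        for send in self._members.values():
            send(sender, payload)  # assumed: the sender also gets a copy
        return len(self._members)
```

Room state here is just a dictionary on one machine, which is what makes directed routing a precondition for this mode.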

Topic mode

A Topic is an abstraction of a class of related messages. For example, each stock can correspond to a Topic, and a user who subscribes to a stock's Topic creates a subscription relationship.

With the cluster diffusion capability, the business server calls the interface only once. RTMS spreads the message to every machine in the server cluster through the message queue middleware, and each server then delivers the message to its clients according to the Topic's subscription relationships.
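The "publish once, fan out everywhere" behavior can be simulated in-process as below. This is a sketch under assumptions: `BroadcastBus` stands in for the message queue middleware (which the article does not name), and `ServerNode`, `subscribe`, `on_broadcast`, and `publish` are hypothetical names. Each node only knows the subscriptions of the users connected to it.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set


class ServerNode:
    """One server holding local subscriptions (hypothetical sketch)."""

    def __init__(self, name: str,
                 push: Callable[[str, str, str], None]) -> None:
        self.name = name
        self._push = push  # assumed callback: (user_id, topic, payload)
        self._subscribers: DefaultDict[str, Set[str]] = defaultdict(set)

    def subscribe(self, user_id: str, topic: str) -> None:
        self._subscribers[topic].add(user_id)

    def unsubscribe(self, user_id: str, topic: str) -> None:
        self._subscribers[topic].discard(user_id)

    def on_broadcast(self, topic: str, payload: str) -> None:
        # Deliver only to users connected to this node and subscribed.
        for user_id in sorted(self._subscribers.get(topic, ())):
            self._push(user_id, topic, payload)


class BroadcastBus:
    """Stand-in for the middleware: every node sees every message."""

    def __init__(self) -> None:
        self._nodes: List[ServerNode] = []

    def register(self, node: ServerNode) -> None:
        self._nodes.append(node)

    def publish(self, topic: str, payload: str) -> int:
        # The business server calls this once per message.
        for node in self._nodes:
            node.on_broadcast(topic, payload)
        return len(self._nodes)
```

The publisher never sees a user list: switching stocks only changes a subscription set on the node that holds the user's connection.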

Summary & Outlook

In the field of message push and end-to-end network communication, the programming model and design ideas of RTMS give businesses a new option. In scenarios with high requirements for real-time performance and interactivity, RTMS reduces processing complexity and lets services focus on the business itself.

To further improve the real-time performance and large-scale diffusion capability of messages, RTMS will continue to explore the MESH direction, shortening the link and improving user experience through nearby access and edge deployment.

Follow [Alibaba Mobile Technology] for Alibaba's latest mobile technology insights and practices.

