Distributing messages in groups with no member limit raises several challenges: because of the large number of group members and heavy business demands, message distribution surges and message states become diverse.

To deliver the best possible performance when distributing to ultra-large user bases, Rongyun Supergroup has addressed core problems such as the service deployment model, message delivery methods, and resource isolation from the design stage.

This article shares the architecture design and implementation of unlimited user distribution in Rongyun Supergroup.

Technical Challenges of Unlimited User Distribution

  1. Every message a user sends upstream must be distributed to all group members in real time. If a target user is offline, the message must be converted into a push notification to reach that user.

"Unlimited users" may sound abstract, so take a group with 10 million users as an example. Each message sent by one user becomes 10 million downstream deliveries. During sudden peaks, especially when breaking news hits the group or large numbers of members are swept into a heated discussion, the pressure on data storage and network distribution rises sharply.
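The arithmetic behind this example can be sketched in a few lines. The upstream rates below are hypothetical values chosen only for illustration; only the 10-million group size comes from the text.

```python
def downstream_deliveries(group_size: int, upstream_per_second: float) -> float:
    """Deliveries per second when every upstream message reaches every member."""
    return group_size * upstream_per_second


GROUP_SIZE = 10_000_000  # the 10-million-member group from the example

# Hypothetical upstream rates (messages per second), for illustration only.
for rate in (1, 10, 100):
    total = downstream_deliveries(GROUP_SIZE, rate)
    print(f"{rate:>3} msg/s upstream -> {total:,.0f} deliveries/s downstream")
```

Even a modest upstream rate multiplies into a very large downstream volume, which is why peaks are the main concern.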

  1. Members of a supergroup may be exposed to a massive amount of information. Both the performance of the client and the attention of the user become bottlenecks.

A supergroup with a large number of members has needs that differ from those of an ordinary chat room: users do not want to miss information they need, and do not want to be disturbed by information that is irrelevant to them.

There therefore needs to be a great deal of room for customization: which messages and scenarios trigger a push, and how often and in what aggregated form conversations and messages are notified to the client.

In other words, a communication platform that sits between massive information flow and real-time chat must abstract these capabilities and let the APP adjust them flexibly.

  1. Because a supergroup carries so much information, it must support dividing the group into different channels, similar to traditional topics or channels. Even with the same group and the same members, conversations, messages, and unread counts can be aggregated by category through different channels. Users can focus on the parts they are interested in, which increases user stickiness.
  1. Scenarios that combine information feeds and chat generally involve multiple clients. Different platforms, such as Android, iOS, and Web, have different technical characteristics when requesting and storing massive numbers of messages over the network, and even on the same platform the push channels of different manufacturers behave differently. Each of these must be considered individually.
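The per-channel aggregation of unread counts mentioned in the list above can be illustrated with a minimal sketch. The class and method names (`ChannelUnread`, `on_message`, and so on) are hypothetical and are not part of any Rongyun SDK; the sketch only shows that the same group can report unreads per channel as well as in total.

```python
from collections import defaultdict


class ChannelUnread:
    """Tracks one user's unread counts per (group, channel). Hypothetical sketch."""

    def __init__(self):
        self._unread = defaultdict(int)  # (group_id, channel_id) -> unread count

    def on_message(self, group_id, channel_id):
        # A new message arrived in this channel.
        self._unread[(group_id, channel_id)] += 1

    def mark_read(self, group_id, channel_id):
        # The user opened the channel: clear its counter.
        self._unread.pop((group_id, channel_id), None)

    def channel_unread(self, group_id, channel_id):
        return self._unread.get((group_id, channel_id), 0)

    def group_unread(self, group_id):
        # Aggregate over all channels of the group.
        return sum(n for (g, _), n in self._unread.items() if g == group_id)
```

A user who reads only the "news" channel clears that counter while unreads in the other channels of the same group stay intact.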

Of course, an unlimited-user group must also give every user high-quality network access worldwide, so that messages between the client and the server are not duplicated, lost, or delivered out of order.

In this respect, the Rongyun platform already serves 100 million users and distributes 100 billion messages every day. That provides a solid foundation, so no special consideration is needed here.

Design Architecture and Implementation Scheme

Service Distribution Hierarchical Architecture

From the design stage, Rongyun Supergroup has addressed core issues such as the service deployment model, the message delivery method, and resource isolation.

Finite Diffusion Model:

The master node handles core verification, and the diffusion nodes handle data reads and writes. The master node is highly available, each diffusion node group is highly available internally, and data is kept strongly consistent.

Excellent resource isolation:

Supports public cloud and private cloud deployments, hierarchical resource isolation, and precise flow control strategies.

Dynamic delivery model:

The message delivery model is selected according to the group type, backed by a multi-level message cache structure, linkage with online status, and multiple targeted message delivery strategies.

Storage and distribution

At the storage layer, groups with capped membership differ greatly from groups with unlimited membership: when there is a cap, the design can be based on that upper limit.

For example, ordinary group messages can usually use write diffusion (fan-out on write), which gives better speed and concurrency for real-time delivery. Combined with half-write diffusion (reference distribution), it strikes a balance between time and space.
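As a rough illustration of write diffusion with reference distribution, the sketch below stores each message body once and writes only a message ID into every member's inbox. This reading of "reference distribution" is an assumption, and `WriteDiffusionStore` and its fields are hypothetical names, not Rongyun's implementation.

```python
from collections import defaultdict


class WriteDiffusionStore:
    """Fan-out on write: every member's inbox receives an entry at send time."""

    def __init__(self):
        self.bodies = {}                  # message_id -> body, stored once
        self.inboxes = defaultdict(list)  # user_id -> list of message_ids

    def send(self, message_id, body, member_ids):
        self.bodies[message_id] = body    # one copy of the payload (space saving)
        for user_id in member_ids:        # N small writes, one reference per member
            self.inboxes[user_id].append(message_id)

    def read(self, user_id):
        # Reading is cheap: follow the references already in the user's inbox.
        return [self.bodies[mid] for mid in self.inboxes.get(user_id, [])]
```

The cost is one inbox write per member per message, which is acceptable for a capped group but grows without bound as membership does.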

In the supergroup scenario, however, read diffusion (fan-out on read) is used by default to reduce read and write pressure. In principle each message is written once and read N times. Because upstream and downstream nodes are separated and consistent hashing is used, reads and writes can each be optimized specifically. For hot data, a memory-level message ring and a second-level LRU cache are introduced to ensure read and write performance.
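The "1 write, N read" idea with a message ring and an LRU cache can be sketched as follows. This is a minimal single-process illustration under stated assumptions: `ReadDiffusionChannel`, the page-based LRU layout, and the in-memory `storage` list standing in for the persistent store are all hypothetical, and consistent hashing across nodes is not covered.

```python
from collections import OrderedDict, deque


class ReadDiffusionChannel:
    """One write per message; readers pull everything after their last sequence."""

    def __init__(self, ring_size=1000, lru_size=128, page_size=100):
        self.ring = deque(maxlen=ring_size)  # hot tier: newest (seq, body) pairs
        self.lru = OrderedDict()             # second tier: page_no -> messages
        self.lru_size = lru_size
        self.page_size = page_size
        self.storage = []                    # stand-in for the persistent store
        self.storage_reads = 0               # counts page loads that missed the LRU

    def write(self, body):
        seq = len(self.storage)
        self.storage.append((seq, body))     # the single write
        self.ring.append((seq, body))        # oldest ring entry drops off when full
        return seq

    def _load_page(self, page_no):
        if page_no in self.lru:
            self.lru.move_to_end(page_no)    # mark as most recently used
            return self.lru[page_no]
        self.storage_reads += 1
        lo = page_no * self.page_size
        page = self.storage[lo:lo + self.page_size]
        if len(page) == self.page_size:      # cache only complete (immutable) pages
            self.lru[page_no] = page
            if len(self.lru) > self.lru_size:
                self.lru.popitem(last=False) # evict the least recently used page
        return page

    def read_since(self, last_seq):
        """Return all messages with seq > last_seq, oldest first."""
        start = last_seq + 1
        if start >= len(self.storage):
            return []
        ring_first = self.ring[0][0]
        result = []
        if start < ring_first:               # part of the range is older than the ring
            first_page = start // self.page_size
            last_page = (ring_first - 1) // self.page_size
            for page_no in range(first_page, last_page + 1):
                page = self._load_page(page_no)
                result.extend(m for m in page if start <= m[0] < ring_first)
        result.extend(m for m in self.ring if m[0] >= start)
        return result
```

Readers that are nearly caught up are served from the ring alone; only readers far behind touch the LRU or the store.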

Distribution mode

Faced with a massive number of messages, users do not want to miss information they need, and do not want to be disturbed by information that is irrelevant to them.

Analyzing and implementing these business requirements comes down to the distribution model, which falls into two categories.

The first is message-driven, as in Telegram. A user receives the messages of all conversations in real time, and conversation status, unread counts, and notification reminders are all driven by the messages themselves.

The second is conversation-driven, as in Discord. Users selectively receive messages from certain conversations. For conversations they pay little attention to, they only need notifications such as conversation status, unread counts, and @ mentions. Combining this with the first category also yields a subscription-style, conversation-driven model.
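The difference between the two categories can be sketched as a single delivery decision. The names `Message`, `plan_delivery`, and the mode strings are hypothetical, and the payload shapes are assumptions made only to keep the example concrete.

```python
from dataclasses import dataclass


@dataclass
class Message:
    conversation_id: str
    body: str
    mentions: frozenset = frozenset()  # user IDs that were @-mentioned


def plan_delivery(msg, user_id, mode, subscribed=frozenset()):
    """Decide what a single online user receives for one message."""
    full = {"kind": "message", "conversation_id": msg.conversation_id, "body": msg.body}
    if mode == "message_driven":
        # Every message is delivered; status and unreads are derived from it.
        return full
    if mode == "conversation_driven":
        if msg.conversation_id in subscribed:
            return full  # subscription-style: full messages for chosen conversations
        # Low-attention conversation: only status, unread, and @ information.
        return {
            "kind": "notification",
            "conversation_id": msg.conversation_id,
            "unread_delta": 1,
            "mentioned": user_id in msg.mentions,
        }
    raise ValueError(f"unknown mode: {mode}")
```

In the conversation-driven branch the message body is never sent for unsubscribed conversations, which is where the reduction in downstream volume comes from.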

This distribution mechanism means that a group's management node, conversation node, and message distribution node must each be a separate, highly available logical unit.

Message delivery method

When the user is offline, the supergroup still supports pushing to the user. For the sake of user experience, however, the APP can aggregate pushes by time, push only highly relevant messages such as @ mentions, or let users choose for themselves by setting global, group-level, and channel-level do-not-disturb to reduce disturbance.
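A minimal sketch of such an offline push decision is shown below. `PushSettings` and `should_push` are hypothetical names; the sketch assumes that any do-not-disturb level suppresses all pushes including @ mentions, which is a product decision the text does not specify, and it leaves out time-based aggregation.

```python
from dataclasses import dataclass, field


@dataclass
class PushSettings:
    global_dnd: bool = False
    muted_groups: set = field(default_factory=set)    # group IDs
    muted_channels: set = field(default_factory=set)  # (group_id, channel_id) pairs
    mentions_only: bool = False                       # push only when @-mentioned


def should_push(settings, user_id, group_id, channel_id, mentions=()):
    """Return True if an offline user should get a push for this message."""
    # Assumption: do-not-disturb at any level blocks every push, mentions included.
    if settings.global_dnd:
        return False
    if group_id in settings.muted_groups:
        return False
    if (group_id, channel_id) in settings.muted_channels:
        return False
    if settings.mentions_only and user_id not in mentions:
        return False
    return True
```

The checks run from the broadest scope to the narrowest, so a global setting always wins over a channel-level one.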

When the user is online, delivery over the IM persistent connection generally takes one of three forms: direct push, notification followed by pull, and aggregated notification. Supergroup messages and conversations combine these dynamically. The protocol layer supports QoS and guarantees that each message carries a unique value, so the client can synchronize and compensate for gaps using incremental timestamps.
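Synchronizing and compensating with incremental timestamps might look like the sketch below. It assumes the unique value is a message ID and that server timestamps within a conversation are strictly increasing; `SyncClient` and the `fetch_after` callback are hypothetical and do not correspond to a real SDK call.

```python
class SyncClient:
    """Client-side merge of pushed and pulled messages, de-duplicated by ID."""

    def __init__(self):
        self.sync_ts = 0  # cursor: advanced only by sync(), never by pushes
        self.by_id = {}   # message id -> message

    def on_push(self, msg):
        # Directly pushed message; a duplicate ID is ignored.
        self.by_id.setdefault(msg["id"], msg)

    def sync(self, fetch_after):
        # fetch_after(ts) is assumed to return every message with timestamp > ts.
        for msg in fetch_after(self.sync_ts):
            self.by_id.setdefault(msg["id"], msg)
            self.sync_ts = max(self.sync_ts, msg["ts"])

    def timeline(self):
        return sorted(self.by_id.values(), key=lambda m: (m["ts"], m["id"]))
```

Keeping the cursor separate from pushed messages matters: if a later push advanced the cursor, a message lost before it would never be pulled.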

When the user comes back online after being offline, the client first incrementally synchronizes the supergroup conversation information, then relies on the merging of conversations and messages and on the message-break mechanism to fetch messages quickly while keeping the information complete.

Internalizing some operations

In the ordinary group scenario, most conversation information, such as status, unread counts, and input status, is left to the client by default to preserve flexibility.

In the supergroup scenario, however, because of the massive message history and multi-device usage, storing and retrieving this information needs to be internalized in the supergroup's communication model.

For message changes, Rongyun Supergroup also provides a set of extension and internalized capabilities: extended information can be attached when a message is sent or afterwards, and operations such as message recall, deletion, modification, and reference modification are supported.

For the notification and control signaling that APPs commonly use, Rongyun also provides mechanisms such as online messages, which ensure delivery to online users while reducing the amount of distribution.

Flow Control and Resource Isolation

Because the supergroup model is very flexible and has high peaks, Rongyun, as a communication platform, provides flow control at the APP, group, and signaling levels on the supergroup's upstream nodes to keep the platform stable, and supports individual adjustment for private clouds.
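Multi-level flow control of this kind is often built from token buckets. The sketch below is one possible illustration, not Rongyun's actual mechanism: `TokenBucket` and `admit` are hypothetical names, and the rates and capacities are arbitrary. A request is admitted only if every level that applies to it (for example APP and group) still has a token.

```python
import time


class TokenBucket:
    """Classic token bucket: `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def has_token(self, now):
        # Refill according to elapsed time, then check availability.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        return self.tokens >= 1

    def take(self):
        self.tokens -= 1


def admit(buckets, now=None):
    """Admit a request only if all levels allow it; deduct nothing on rejection."""
    now = time.monotonic() if now is None else now
    if all(b.has_token(now) for b in buckets):
        for b in buckets:
            b.take()
        return True
    return False
```

For example, `admit([app_bucket, group_bucket])` would guard an ordinary group message, while `admit([app_bucket, signaling_bucket])` would guard control signaling. Checking all levels before deducting keeps a busy group from draining the APP-level budget with requests that are rejected anyway.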

With the measures above, Rongyun Supergroup keeps message transmission reliable in unlimited-user scenarios, avoiding message loss, delay, and disorder. Under high message concurrency, users receive pushes when offline and messages when online in an orderly manner, without lag or failed pulls. Internalizing some operations also relieves performance pressure on the client.


融云RongCloud