Rongyun IM Technology Sharing: Thoughts and Practice on Message Delivery for 10,000-Member Group Chats

This article was originally shared by Rongyun's technical team under the title "Technical Practice 丨 Message Distribution Speed Control Scheme for Ten-Thousand-Member Group Chat". The content has been revised to make it easier to understand.

1. Introduction

IM group chat in the traditional sense usually means a group of up to 500 people, as in WeChat, or up to 2,000 people, as in QQ. (QQ also offers 3,000-member groups, but these are a paid feature, so they are not a standard, barrier-free option and few people can use them.)

Ever since a foreign IM product billed as "the safest IM in the world" introduced groups of tens of thousands of members, such groups have quickly been accepted by domestic users. With the development of the mobile Internet, instant messaging services are now widely used across industries, not only in traditional IM social applications. As businesses grow rapidly, traditional group chats capped at hundreds or thousands of members can no longer meet the needs of many business scenarios, so large groups of 10,000 or even 100,000 members have emerged to follow the trend.

▲ A 10,000-member group on "Paper Airplane" (the developers are trembling...)

Group chat has always been one of the harder hot-topic technologies in IM applications. Group chat usually means no more than 500, 1,000, or 2,000 members, and even that is much more complex to implement than one-to-one chat. Groups of 10,000 (or even 100,000) members, however, belong to almost a different technical dimension from groups of hundreds or thousands, and the difficulty is much higher.

Based on the practical experience of the Rongyun technical team, this article summarizes some thoughts and practice on message delivery for 10,000-member group chats, in the hope that it will be useful to you.

Learning and exchange:

Instant messaging/push technology development discussion group 5: 215477170 [recommended]
Introduction to mobile IM development: "One entry is enough for novices: Develop mobile IM from scratch"
Open source IM framework source code: https://github.com/JackJiang2011/MobileIMSDK

(This article was published synchronously at: http://www.52im.net/thread-3687-1-1.html)

2. Related articles

You can also read the following technical articles on 10,000-member group chats:

"Netease Yunxin Technology Sharing: Summary of the Practice of the Ten Thousands of People Chatting Technical Solutions in IM"
"The Secret of the IM Architecture Design of Enterprise WeChat: Message Model, Ten Thousands of People, Read Receipt, Message Withdrawal, etc."
"Ali DingTalk Technology Sharing: Enterprise-level IM King-DingTalk's outstanding features in the back-end architecture"

Other articles shared by Rongyun technical team:

"Rongyun Technology Sharing: Practice of Network Link Keep Alive Technology of Rongyun Android IM Products"
"Rongyun Technology Sharing: Fully Revealing the Reliable Delivery Mechanism of 100 Million-level IM Messages"
"Rongyun Technology Sharing: Real-time Audio and Video First Frame Display Time Optimization Practice Based on WebRTC"
"IM Message ID Technology Topic (3): Decrypting the Chat Message ID Generation Strategy of Rongyun IM Products"

3. Technical challenges of super groups

Compared with groups of hundreds or thousands, a large group of 10,000 or even 100,000 members holds far more people. For many business scenarios, the benefits are self-evident.

However, with so many members in a single group, the traffic impact on the IM system is huge, and the technical difficulty is correspondingly high. Let's first analyze the technical challenges of super groups.

Take a 10,000-member group as an example:

1) If someone in the group sends a message, it must be fanned out and delivered at a ratio of 1:9999. If the normal message processing flow is followed, the message processing service comes under great pressure;
2) When message volume is high, the speed at which the server pushes messages directly to clients becomes a system bottleneck. Once a user's delivery queue backs up, normal message distribution is affected and service cache usage rises sharply;
3) In a microservice architecture, QPS and network traffic between services and storage (DB, cache) also rise sharply;
4) A message cache kept per group has large memory and storage overhead (the storage of the message body is magnified ten thousand times).
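
To make the 1:9999 ratio concrete, the following Python sketch computes how many deliveries per second a single group generates under naive per-member fan-out. The function name and the rate of 10 messages per second are illustrative assumptions, not figures from the Rongyun team.

```python
def fanout_deliveries(group_size: int, messages_per_second: float) -> float:
    """Deliveries per second produced by one group under naive per-member fan-out."""
    recipients = group_size - 1  # the sender does not receive their own message
    return recipients * messages_per_second


# One message in a 10,000-member group becomes 9,999 deliveries.
assert fanout_deliveries(10_000, 1) == 9_999

# Assumed rate of 10 messages per second, for groups of different sizes.
for size in (500, 2_000, 10_000, 100_000):
    print(size, fanout_deliveries(size, messages_per_second=10))
```

The load grows linearly with group size, which is why a flow that is comfortable for a 500-member group can overwhelm the same services at 10,000 members.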

Given these challenges, targeted technical optimizations are necessary to truly support super groups.

4. Message delivery model for ordinary group chats

Let's first look at the message delivery model for ordinary group chats.

Our message delivery model for ordinary group chats is shown in the figure below:

As shown in the figure above, when a user sends a message in an ordinary group, the delivery path is:

1) The message first arrives at the group service;
2) The group service then uses its cached group membership to determine the target users to whom the message must ultimately be distributed;
3) The message is distributed to the message service according to a certain strategy;
4) Based on each user's online status and message status, the message service decides whether to push the message directly, send a notification so that the client pulls it, or hand it over to Push, and finally delivers it to the target user.
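
Step 4 can be sketched roughly as follows in Python. The article only says that the decision depends on the user's online status and message status; the concrete rule below (offline users go to Push, online users with a backlog get a pull notification, everyone else gets a direct push) and all names in it are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Delivery(Enum):
    DIRECT_PUSH = "direct_push"    # send the message body over the live connection
    NOTIFY_PULL = "notify_pull"    # send a small notification; the client pulls
    OFFLINE_PUSH = "offline_push"  # hand the message over to the Push channel


@dataclass
class UserState:
    online: bool
    pending_messages: int  # hypothetical "message status": undelivered backlog


def choose_delivery(state: UserState, max_pending: int = 0) -> Delivery:
    """Pick a delivery mode from online status and backlog (assumed rule)."""
    if not state.online:
        return Delivery.OFFLINE_PUSH
    if state.pending_messages > max_pending:
        # The client is behind; let it pull in order instead of pushing more.
        return Delivery.NOTIFY_PULL
    return Delivery.DIRECT_PUSH


print(choose_delivery(UserState(online=True, pending_messages=0)))
print(choose_delivery(UserState(online=True, pending_messages=12)))
print(choose_delivery(UserState(online=False, pending_messages=0)))
```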

Message delivery for ordinary group chats works much as you would expect, and most implementations are broadly similar. For a 10,000-member group, however, this is clearly not enough.

Let's look at our technical optimizations for delivering messages in 10,000-member group chats.

5. Message delivery optimization for 10,000-member group chats, method 1: speed control

One of our main methods for delivering messages in 10,000-member groups is speed control.

As shown in the figure below:

First, we create multiple group message distribution queues according to the number of CPU cores on the server. These queues are given different sleep times and different consumer threads.

Put simply, this divides the queues into fast, medium, and slow queues.

Second, we map every group to a queue according to its number of members.

The rules are:

1) Small groups are mapped to the fast queue;
2) Large groups are mapped to the corresponding slow queue.

As a result, a small group has few members and little impact on the service, so the service distributes its messages quickly through the fast queue, while messages for large groups are rate-controlled by the relatively high delay of the slow queue.
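
The fast/medium/slow queue idea can be sketched in Python as below. This is a minimal illustration, not Rongyun's implementation: the size thresholds (500 and 3,000 members), the sleep times, and the worker counts are all assumed values, and a real service would derive the number of queues and threads from the server's core count.

```python
import queue
import threading
import time
from dataclasses import dataclass, field


@dataclass
class TierQueue:
    name: str
    max_members: float    # groups up to this size map here (assumed threshold)
    sleep_seconds: float  # pause after each dispatch; longer means slower
    workers: int          # number of consumer threads (assumed)
    items: queue.Queue = field(default_factory=queue.Queue)


TIERS = [
    TierQueue("fast", 500, 0.0, 4),
    TierQueue("medium", 3000, 0.005, 2),
    TierQueue("slow", float("inf"), 0.02, 1),
]


def pick_tier(member_count: int) -> TierQueue:
    """Map a group to a queue by its number of members."""
    for tier in TIERS:
        if member_count <= tier.max_members:
            return tier
    return TIERS[-1]


def consume(tier: TierQueue, deliver, stop: threading.Event) -> None:
    """Consumer loop: dispatch one message, then sleep for the tier's delay."""
    while not stop.is_set():
        try:
            msg = tier.items.get(timeout=0.05)
        except queue.Empty:
            continue
        deliver(msg)
        tier.items.task_done()
        time.sleep(tier.sleep_seconds)


def start_workers(deliver) -> threading.Event:
    """Start the consumer threads for every tier; set the event to stop them."""
    stop = threading.Event()
    for tier in TIERS:
        for _ in range(tier.workers):
            threading.Thread(
                target=consume, args=(tier, deliver, stop), daemon=True
            ).start()
    return stop
```

A caller would enqueue with `pick_tier(member_count).items.put(message)`. Because the slow tier has fewer consumers and a longer sleep, a burst of messages in a large group drains gradually instead of hitting the message service all at once.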

6. Message delivery optimization for 10,000-member group chats, method 2: merging

As mentioned in section 3 of this article, the main challenge is that once a message is fanned out for distribution and delivery, it is cloned into N copies and message traffic is amplified instantly.

For example, when a group message reaches the IM server, it must be delivered from the group service to the message service. If it is delivered once per group member even though the content is identical each time, resources are wasted and the services are put under pressure.

Our solution to this is to merge messages for delivery.

The principle is this: service locations are calculated with consistent hashing, and group membership is relatively stable, so group members that map to the same location can be merged into a single delivery request. This greatly improves delivery efficiency and reduces service pressure.
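
A minimal Python sketch of merged delivery follows. It assumes a simple consistent hash ring with virtual nodes; the node names, the number of virtual nodes, and the use of SHA-256 are illustrative choices, not details taken from the article.

```python
import bisect
import hashlib
from collections import defaultdict


def _hash(key: str) -> int:
    """Stable 64-bit hash of a string key."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring; each node is placed at `vnodes` points."""

    def __init__(self, nodes, vnodes: int = 100):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    def node_for(self, user_id: str) -> str:
        # First ring point clockwise from the user's hash, wrapping around.
        idx = bisect.bisect(self._keys, _hash(user_id)) % len(self._ring)
        return self._ring[idx][1]


def merge_by_node(ring: HashRing, member_ids):
    """Group members by the message node they hash to: one request per node."""
    batches = defaultdict(list)
    for uid in member_ids:
        batches[ring.node_for(uid)].append(uid)
    return dict(batches)


ring = HashRing(["msg-node-1", "msg-node-2", "msg-node-3"])  # hypothetical nodes
members = [f"user-{i}" for i in range(10_000)]
batches = merge_by_node(ring, members)
print({node: len(uids) for node, uids in batches.items()})
```

With three message nodes, the group service sends at most three requests for a group message, each carrying the message body once plus the list of target members, instead of 10,000 separate requests.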

The following figure, shared by the Yunxin team, shows the logic of merged message delivery:

▲ The figure above is quoted from "Summary of the Practice of the Technical Solution of the Ten Thousand People Chatting in IM"

As shown in the figure above, the Yunxin team's merged message delivery scheme for very large groups routes messages per link: all group members on the same link need only one routed message.

7. Handling super groups of hundreds of thousands to millions of members

In real group chat businesses there is another scenario: ultra-large groups whose membership reaches hundreds of thousands or even millions.

If such a group follows the delivery scheme above, it will still put great pressure on the message nodes.

For example, suppose there is a group of 100,000 members and five message nodes, and each message service can process at most 4,000 messages per second. Each message node is then allocated roughly 20,000 deliveries for a single group message, which far exceeds its processing capacity.
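
The arithmetic in this example can be checked with a few lines of Python. The variable names are illustrative, and the calculation assumes members are spread evenly across the message nodes.

```python
group_size = 100_000
message_nodes = 5
node_capacity_per_second = 4_000

# Deliveries each node receives for ONE group message, assuming an even spread.
per_node = group_size // message_nodes

# How many times this exceeds what one node can process in a second.
overload_factor = per_node / node_capacity_per_second

print(per_node)         # 20000
print(overload_factor)  # 5.0
```

In other words, a single message in this group would occupy each node's full capacity for about five seconds, leaving nothing for other traffic.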

To avoid this problem, we classify any group with more than 3,000 online members as a 10,000-member group or super group. This threshold can be adjusted according to the number of servers and the server configuration. For super groups, a dedicated queue handles the delivery of group messages.

The number of messages this dedicated queue delivers to a back-end message service per second is half of that service's processing limit, which reserves capacity for other messages. If a single message service can handle at most 4,000 QPS, the group service delivers at most 2,000 messages per second to that message service.
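
The half-capacity cap can be sketched in Python as a simple fixed-window pacer for one message service. The fixed one-second window is an assumed mechanism chosen for brevity; the article does not say how the dedicated queue enforces its limit, and the function and parameter names are hypothetical.

```python
import time


def is_super_group(online_members: int, threshold: int = 3000) -> bool:
    """Groups above the (adjustable) online-member threshold use the special queue."""
    return online_members > threshold


def throttled_send(deliveries, send, node_qps_limit: int = 4000,
                   reserve_ratio: float = 0.5,
                   sleep=time.sleep, clock=time.monotonic) -> None:
    """Send to one message service, at most limit * ratio items per second."""
    budget = int(node_qps_limit * reserve_ratio)  # 2000 with the defaults
    window_start = clock()
    sent_in_window = 0
    for item in deliveries:
        if sent_in_window >= budget:
            # Budget for this window is used up: wait out the rest of the second.
            remaining = 1.0 - (clock() - window_start)
            if remaining > 0:
                sleep(remaining)
            window_start = clock()
            sent_in_window = 0
        send(item)
        sent_in_window += 1
```

With the numbers from the earlier example, the 20,000 deliveries destined for one node would be spread over roughly ten seconds, and the node keeps half of its capacity free for other messages throughout.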

8. Final thoughts

In the future, we will also adopt reference-based delivery for group messages. For messages with relatively large bodies sent in large groups, we will distribute and cache only an index of the message, such as the MessageID, to group members. The message is assembled and delivered to the client only when a group member actually pulls it. This saves distribution traffic and storage space.
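
Since this is described only as a future plan, the following Python sketch is purely illustrative of the idea: the message body is stored once, each member's inbox holds only the MessageID, and the body is attached at pull time. The class and method names are hypothetical.

```python
class ReferenceStore:
    """In-memory illustration of reference-based delivery."""

    def __init__(self):
        self._bodies = {}   # MessageID -> message body, stored once per message
        self._inboxes = {}  # user id -> list of pending MessageIDs

    def publish(self, message_id: str, body: str, member_ids) -> None:
        """Store the body once and fan out only the MessageID."""
        self._bodies[message_id] = body
        for uid in member_ids:
            self._inboxes.setdefault(uid, []).append(message_id)

    def pull(self, user_id: str):
        """Assemble full messages only when the member actually pulls."""
        ids = self._inboxes.pop(user_id, [])
        return [(mid, self._bodies[mid]) for mid in ids]


store = ReferenceStore()
store.publish("msg-1", "a large message body", ["alice", "bob"])
print(store.pull("alice"))
```

The per-member cost is one short identifier rather than a full copy of the body, which is where the traffic and storage savings would come from.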

As the Internet develops, the scale and pressure of group services keep growing, and more challenges are likely to come. We will keep iterating on our services to deal with them better.

Appendix: More technical articles on IM group chat

"Rapid Fission: Witness the evolution of WeChat's powerful back-end architecture from 0 to 1 (1)"
"How to ensure the "sequence" and "consistency" of IM real-time messages?"
"Should I use "push" or "pull" for online status synchronization in IM single chat and group chat?"
"IM group chat messages are so complicated, how to ensure that they are not lost or repetitive?"
"WeChat background team: optimization and upgrade practice sharing of WeChat background asynchronous message queue"
"How to ensure the efficiency and real-time performance of large-scale group message push in mobile IM?"
"Discussion on the Synchronization and Storage Scheme of Chat Messages in Modern IM System"
"Discussion on the disorder of IM group chat messages"
"How to realize the read receipt function of IM group chat messages?"
"Is IM group chat messages stored in one copy (i.e. diffused reading) or multiple copies (i.e. diffused writing)?"
"A set of high-availability, easy-scalable, and high-concurrency IM group chat and single chat architecture design practices"
"[Technical Brain Hole] Can it be achieved technically if 1.4 billion Chinese people are pulled into a WeChat group?"
"IM group chat mechanism, is there any way besides sending messages in a loop? How to optimize?"
"Netease Yunxin Technology Sharing: Summary of the Practice of the Ten Thousands of People Chatting Technical Solutions in IM"
"Ali DingTalk Technology Sharing: Enterprise-level IM King-DingTalk's outstanding features in the back-end architecture"
"Discussion on the realization of the read and unread functions of IM group chat messages in terms of storage space"
"Live Broadcast System Chat Technology (1): The Road to Real-time Push Technology Practice of Million Online's Meipai Live Barrage System"
"Live broadcast system chat technology (2): Ali e-commerce IM messaging platform, technical practice in group chat and live broadcast scenarios"
"Live Broadcast System Chat Technology (3): The Evolution of 15 Million Online Message Architecture in a Single Room of WeChat Live Chat Room"
"Live broadcast system chat technology (4): Baidu live broadcast massive user real-time messaging system architecture evolution practice"
"The Secret of the IM Architecture Design of Enterprise WeChat: Message Model, Ten Thousands of People, Read Receipt, Message Withdrawal, etc."
"Rongyun IM Technology Sharing: Thinking and Practice of Message Delivery Scheme for Ten Thousands of People Chatting"

This article has been simultaneously published on the official account of "Instant Messaging Technology Circle".
The synchronous publishing link is: http://www.52im.net/thread-3687-1-1.html


JackJiang
