This article was originally written by Li Qiankun of the Himalaya technical team, under the original title "Push System Practice". Thanks to the author for the generous sharing.
1. Introduction
1.1 What is offline message push
For IM developers, offline message push is a familiar requirement. For example, the figure below shows a typical offline IM message notification.
1.2 Android's offline push is really not easy
Offline message push on mobile involves no more than two platforms: iOS and Android. There is little to say about iOS: APNs is the only option.
The Android side is more chaotic (mainly referring to domestic Chinese phones). To achieve offline push, all kinds of keep-alive "black technology" have emerged in an endless stream, and as the difficulty of keeping a process alive has kept escalating, fewer and fewer of these methods still work. Read the following articles I have compiled to get a feel for it (in chronological order, tracking the Android system's ever-increasing keep-alive difficulty):
"Ultimate summary of application keep-alive (1): Dual-process daemon keep-alive practice under Android 6.0"
"Ultimate summary of application keep-alive (2): Android 6.0 and above keep-alive practice (process anti-kill articles)"
"Ultimate Summary of App Keep Alive (3): Keep Alive Practices of Android 6.0 and above (Resurrection from Killed)"
"The official version of Android P is coming: the real nightmare of background application keep alive and message push"
"Comprehensive inventory of the real operating effects of the current Android background keep-alive solution (as of 2019)"
"In 2020, is there still a scene for Android background keep-alive? See how I can achieve it elegantly! 》
"The strongest Android keep-alive idea in history: an in-depth analysis of Tencent TIM's process immortality technology"
"The ultimate reveal of Android process immortality technology: the underlying principle of process killing, APP response to killing skills"
"Android keep-alive from entry to giving up: obediently guide users to add whitelists (with 7 examples of whitening models)"
The above are only some of the articles I have compiled on this topic. Pay special attention to the last one, "Android keep-alive from entry to giving up: obediently guide users to add whitelists (with 7 examples of whitening models)". Indeed, the current Android system has almost zero tolerance for an app keeping itself alive, so on newer system versions almost all of the old keep-alive tricks no longer work.
Since an app can no longer keep itself alive, yet offline message push still has to work, what can be done? The current best practice is to use the ROM-level push channels provided by the phone manufacturers themselves. I won't go into details here; if you're interested, read "The official version of Android P is coming: the real nightmare of background application keep-alive and message push".
In the era when apps could keep themselves alive and build their own push channels (on Android, of course), the architecture of an offline message push system was relatively simple: compute a deviceId for each device, and have the server transparently deliver messages to it over the self-built channel.
Without a self-built channel, however, you can only rely on the manufacturers' push channels. Today there are Xiaomi, Huawei, Meizu, OPPO, vivo (just to name the mainstream ones) and many more phone brands, and each company's push API and design specifications differ (don't bring up the Unified Push Alliance; I waited three years for that thing, see "The much anticipated 'Unified Push Alliance' is on the stage"). This directly means the old offline message push architecture must be redesigned to fit the push requirements of the new era.
1.3 How to design it reasonably
So, facing the different manufacturers' ROM-level push channels, how should our backend push architecture be designed?
The offline message push system design shared in this article is not specific to IM products, but however much the business layer differs, the general technical ideas are the same. I hope this sharing from Himalaya, which delivers offline messages to a very large user base, brings you some inspiration.
- Recommended reading: another article from the Himalaya technical team, "Special Topic on Long Connection Gateway Technology (5): Himalaya Self-developed 100 Million API Gateway Technology Practice", is also worth a look if you are interested.
2. Technical background
First, let me introduce what the push system does in the Himalaya app. The picture below shows a push notification from the news business.
Offline push mainly gives us a way to reach users when they have not opened the app, maintaining the app's presence and improving its daily active users.
The services that currently rely on push mainly include:
- 1) Anchor going live: the company has a live-streaming business, and when an anchor starts a broadcast, a going-live reminder is pushed to all of that anchor's fans;
- 2) Album updates: the platform hosts a great many albums, each of which contains a series of audio tracks; for example, a novel is an album and its chapters are the tracks. When the novel adds new chapters, an update reminder is pushed to every user subscribed to that album;
- 3) Personalized recommendations, news services, and so on.
To send an offline push to a user, the system must have a communication channel to the user's device.
Anyone who has built this knows that a self-built push channel requires the app to stay resident in the background (the application "keep-alive" discussed in the introduction), and because phone manufacturers generally adopt aggressive background process management strategies for power saving and other reasons, a self-built channel's quality is poor. Today the channel is generally maintained by a "push service provider", which means the company's push system does not deliver pushes to users directly (this is exactly the situation described in the article cited in the previous section: "The official version of Android P is coming: the real nightmare of background application keep-alive and message push").
The offline push flow in this case is as follows:
The major domestic manufacturers (Xiaomi, Huawei, Meizu, OPPO, vivo, etc.) each have their own official push channel, but every interface is different, so integrators such as Xiaomi and One Push provide unified interfaces: when sending, the push system sends to the integrator, the integrator forwards to the specific manufacturer's push channel according to the device, and the push finally reaches the user.
When sending a push to a device, you must specify what to send (the title and the message body) and which device to send it to.
We use a token to identify a device, but a token means different things in different contexts. Within the company, a uid or deviceId generally identifies a device, while the integrator and each manufacturer have their own "numbering" for the device. The company's internal push service is therefore responsible for converting uid/deviceId into the integrator's token.
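To make the conversion concrete, below is a minimal sketch of the mapping the push service maintains. All names here are illustrative, not the production API; the real mapping is backed by the sharded device table described in section 5.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal in-memory stand-in for the device mapping: company-internal ids on
// one side, the integrator's token for the same device on the other.
public class TokenResolver {
    private final Map<String, String> tokenByDeviceId = new ConcurrentHashMap<>();

    /** Called by the binding flow (section 3) when the app reports its token. */
    public void bind(String deviceId, String token) {
        tokenByDeviceId.put(deviceId, token);
    }

    /** Returns the integrator token for a device, or null if unbound/invalidated. */
    public String resolve(String deviceId) {
        return tokenByDeviceId.get(deviceId);
    }

    /** Called when the integrator's HTTP callback reports an invalid token. */
    public void invalidate(String deviceId) {
        tokenByDeviceId.remove(deviceId);
    }
}
```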
3. Overall architecture design
As shown in the figure above, the push system as a whole is a queue-based streaming system.
The right side of the figure is the main link: each business party sends pushes through the push interface, which writes the data to a queue for the conversion-and-filtering service to consume. Conversion is the uid/deviceId-to-token mapping mentioned above; filtering is discussed in detail below. After conversion and filtering, messages are handed to the sending module and finally reach the integrator's interface.
When the app starts, it sends a binding request to the server, reporting the binding between uid/deviceId and token. When a token becomes invalid (for example because the app was uninstalled and reinstalled), the integrator notifies the push system through an HTTP callback. Each component also ships its pipeline logs through Kafka to the company's xstream real-time stream-processing cluster, where the data is aggregated and landed in MySQL; finally, Grafana presents the various reports.
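Putting the two paragraphs above together, a compact, hypothetical sketch of one message's trip through the main link might look like the following (the types are illustrative and reuse the TokenResolver sketch from section 2):

```java
import java.util.List;

public class PushPipeline {
    /** One business filter; section 4 lists the concrete rules. */
    interface Filter {
        boolean accept(String deviceId, String msgType, String text);
    }

    private final TokenResolver tokens = new TokenResolver();
    private final List<Filter> filters;

    PushPipeline(List<Filter> filters) {
        this.filters = filters;
    }

    /** Consumes one queued push request: convert, then filter, then send. */
    void handle(String deviceId, String msgType, String title, String text) {
        String token = tokens.resolve(deviceId);
        if (token == null) return;                          // unbound or invalidated
        for (Filter f : filters) {
            if (!f.accept(deviceId, msgType, text)) return; // dropped by a business rule
        }
        sendToIntegrator(token, title, text);               // final hop to the vendor channel
    }

    void sendToIntegrator(String token, String title, String text) {
        // HTTP call to the integrator's push API, omitted here.
    }
}
```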
4. Design of business filtering mechanism
Each business party could push to users indiscriminately, but the push system itself must exercise restraint, so business messages have to be filtered selectively.
The filtering mechanism covers the following points (in the order in which they were supported):
- 1) User switch: the app provides a per-user push switch; if the user turns push off, no push is sent to that user's device;
- 2) Duplicate copy: a user must never receive the same message copy twice; this guards against sending-logic errors in the upstream business party;
- 3) Frequency control: each service corresponds to a msg_type, and at most xx pushes may be sent within xx time;
- 4) Silent hours: no push is sent to users between xx and xx each day, so as not to disturb their rest;
- 5) Hierarchical management: control along the two dimensions of user and message.
Regarding point 5 specifically (a combined sketch of these filter rules follows the list):
- 1) Each msg/msg_type has a level, which gives important/high-level services more opportunities to send;
- 2) Once a user has received xx pushes in a day, unimportant messages are no longer sent to that user.
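The rules above can be pictured as a single filter chain. The sketch below is illustrative only: the quiet hours, daily cap, level threshold, and data structures are invented for the example (the real frequency control lives in pika/ehash, see section 7).

```java
import java.time.LocalTime;
import java.util.HashSet;
import java.util.Set;

public class BusinessFilter {
    private final Set<String> pushDisabledUids = new HashSet<>();      // 1) user switch
    private final Set<String> sentCopyKeys = new HashSet<>();          // 2) uid + copy hash
    private static final LocalTime QUIET_START = LocalTime.of(22, 0);  // 4) illustrative
    private static final LocalTime QUIET_END = LocalTime.of(8, 0);
    private static final int DAILY_CAP = 5;   // 5) illustrative per-user daily cap
    private static final int HIGH_LEVEL = 3;  // 5) level allowed to exceed the cap

    boolean accept(String uid, String text, int msgLevel, int receivedToday, LocalTime now) {
        if (pushDisabledUids.contains(uid)) return false;                  // 1) switch off
        if (!sentCopyKeys.add(uid + "|" + text.hashCode())) return false;  // 2) duplicate copy
        // 3) per-msgType frequency control is done in Redis/pika, see section 7
        if (now.isAfter(QUIET_START) || now.isBefore(QUIET_END)) return false; // 4) silent hours
        if (receivedToday >= DAILY_CAP && msgLevel < HIGH_LEVEL) return false; // 5) level cap
        return true;
    }
}
```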
5. Multi-dimensional queries under database and table sharding
In many cases a design starts from theory and experience, but in practice all kinds of concrete problems always come up.
Himalaya now has 600 million+ users, and the push system's device table (recording the uid/deviceId-to-token mapping) is of a similar magnitude, so the device table is sharded across databases and tables with deviceId as the sharding key.
In practice, however, queries by uid or token are also frequently needed, so mappings from uid/token to deviceId have to be maintained as well. And because uid queries are themselves very frequent, the uid secondary table carries the same fields as the primary table.
Because there are one or two global pushes every day, plus special pushes targeting silent users (users who rarely open the app), this storage effectively has no "hot spot": a cache is used, but its effect is very limited while the space it occupies is huge.
Multiple tables plus caches add up to three or four copies of the data, with different logic reading different copies. Inconsistencies occur frequently (and pursuing consistency hurts performance), and the query code is complex and slow.
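To illustrate the asymmetry, here is a minimal sketch of routing under deviceId sharding; the shard count, table naming, and hash are invented for the example.

```java
import java.util.function.Function;

public class DeviceTableRouter {
    private static final int SHARDS = 64; // illustrative shard count

    /** deviceId is the sharding key, so routing is a single hash. */
    String tableForDevice(String deviceId) {
        return "device_" + Math.floorMod(deviceId.hashCode(), SHARDS);
    }

    /** uid is not the sharding key: first resolve uid -> deviceId through a
        secondary mapping table (one extra query, and one extra copy of the
        data that must be kept consistent). */
    String tableForUid(String uid, Function<String, String> uidToDeviceId) {
        return tableForDevice(uidToDeviceId.apply(uid));
    }
}
```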
In the end we chose to move the device data to TiDB, which greatly simplified the code while still meeting our needs.
6. Timeliness of special business
6.1 Basic concepts
The push system is queue-based: "first come, first pushed". Most services do not need high real-time delivery, but the live-broadcast service requires delivery within half an hour, and the news service is even more demanding: the sooner the better.
If a news push arrives while a huge batch of "album update" pushes is already waiting in the queue, the album update service seriously delays the delivery of the news service.
6.2 Is this an isolation problem?
At first we thought this was an isolation problem: for example, with 10 consumer nodes, 3 would be dedicated to time-critical services while the other 7 handled ordinary services. The queue at the time was RabbitMQ, and spring-rabbit was modified to route messages to specific nodes based on msgType.
This scheme has the following drawbacks:
- 1) While some machines are busy, the other machines just "watch from the sidelines";
- 2) Adding a new service requires additionally configuring the msgType-to-consumer-node mapping, which raises maintenance costs;
- 3) RabbitMQ is memory-based, and during instantaneous push peaks it occupies a large amount of memory, which in turn makes RabbitMQ unstable.
6.3 It's actually a priority issue
Later we realized that this is really a priority problem: high-priority services/messages should be able to jump the queue. So we wrapped Kafka to support priorities, which solved the problems of the isolation scheme much better. Concretely, we create multiple topics, one topic per priority; wrapping Kafka mainly means wrapping the consumer side (that is, building a PriorityConsumer).
Note: for simplicity, this article writes consumer.poll(num) to mean "use the consumer to pull num messages", which does not match the real Kafka API. Please keep this in mind.
There are three candidate implementations of PriorityConsumer, explained one by one below.
1) Re-sort polled messages in memory: Java has ready-made in-memory priority queues, PriorityQueue and PriorityBlockingQueue. The Kafka consumers consume normally and re-insert the polled data into a priority queue.
1.1) With a bounded queue, once the queue is full no further message can be inserted no matter how high its priority, and the "queue jumping" effect is lost;
1.2) With an unbounded queue, the messages that should have piled up in Kafka pile up in memory instead, with a very real risk of OOM.
2) Drain higher-priority topics first: keep consuming a topic as long as it has data, and only move on to lower-priority topics when it is empty; while consuming a lower-priority topic, switch back as soon as a higher-priority message is found to arrive.
This scheme is more complex to implement, and during dense periods such as the evening peak it may cause low-priority services to lose the chance to push entirely.
3) Cycle through the topics from high priority to low, pulling data from each.
One iteration of the loop is:
consumer-1.poll(topic1-num);
consumer-i.poll(topic-i-num);
consumer-max-priority.poll(topic-max-priority-num);
If topic1-num = topic-i-num = topic-max-priority-num, the scheme has no priority effect at all. topic1-num can instead be treated as a weight; we agreed that topic-high-num = 2 * topic-low-num, so every topic is consumed in each iteration, and the per-iteration consumption amounts realize the "queue-jumping effect" in a disguised form (see the sketch below). As a further detail, a "sliding window" strategy is used to optimize overall consumption when a topic of some priority has had no messages for a long time.
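Below is a runnable sketch of this weighted scheme against the real Kafka consumer API. Since the real consumer polls by time rather than by count, max.poll.records plays the role of the per-topic weight here; the topic names, weights, and bootstrap address are illustrative, and the sliding-window optimization is omitted.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class WeightedPriorityConsumer {
    public static void main(String[] args) {
        String[] topics = {"push-high", "push-mid", "push-low"}; // one topic per priority
        int[] weights = {4, 2, 1}; // topic-high-num = 2 * topic-low-num, as agreed above

        List<KafkaConsumer<String, String>> consumers = new ArrayList<>();
        for (int i = 0; i < topics.length; i++) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "push-sender");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            // The weight: at most this many records come back per poll of this topic.
            props.put("max.poll.records", String.valueOf(weights[i]));
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(List.of(topics[i]));
            consumers.add(consumer);
        }

        while (true) {
            // One iteration gives every priority a slice proportional to its weight,
            // so low priorities keep flowing while high priorities dominate.
            for (KafkaConsumer<String, String> consumer : consumers) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    deliver(record.value());
                }
            }
        }
    }

    static void deliver(String message) {
        // Hand off to the sending module / integrator, omitted here.
    }
}
```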
From this you can see how the timeliness problem was first understood as an isolation problem, then as a priority problem, and was finally turned into a weighting problem.
7. Storage and performance issues of the filtering mechanism
In our architecture, the main factors limiting push delivery speed are the TiDB queries and the filtering logic. The filtering mechanism raises both storage and performance issues.
Here we take the frequency-control limit of service xx, "send at most one push per hour", as the example for analysis.
In the first version, the Redis KV structure was <deviceId_msgtype, send count>.
The frequency-control logic is as follows (a minimal sketch follows the list):
- 1) On each send, incr the key, adding 1 to the send count;
- 2) If the limit is exceeded (the incr return value is greater than the allowed number of sends), do not push;
- 3) If the limit is not exceeded and the return value is 1, this is the first send to that deviceId for this msgtype within the frequency-control window, so an expire on the key is required to set the expiry time (equal to the frequency-control window).
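A minimal sketch of this first version with the Jedis client (the key layout and steps follow the list above; the connection setup is simplified and the limits are parameters):

```java
import redis.clients.jedis.Jedis;

public class FrequencyControl {
    private final Jedis jedis = new Jedis("localhost", 6379);

    /** Returns true if a push is still allowed: at most maxSends per periodSeconds. */
    boolean tryAcquire(String deviceId, String msgType, int maxSends, int periodSeconds) {
        String key = deviceId + "_" + msgType;
        long count = jedis.incr(key);          // 1) count this send
        if (count == 1) {
            jedis.expire(key, periodSeconds);  // 3) first send in the window: set the TTL
        }
        return count <= maxSends;              // 2) over the limit: do not push
    }
}
```

Note the two round trips (incr, then expire) on the first send of each window; that cost is exactly drawback 2 below.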
This scheme has the following drawbacks:
- 1) The company currently has 60+ push services and 600 million+ deviceIds, i.e. 600 million * 60 keys in total, which occupies a huge amount of space;
- 2) In many cases, handling one deviceId takes two commands: incr + expire.
Our solution was therefore:
- 1) Replace Redis with pika (a disk-based Redis), whose disk capacity can meet the storage requirement;
- 2) Ask the system architecture team to extend the Redis protocol with a new structure, ehash.
ehash is adapted from the Redis hash: a two-level map <key, field, value> in which, besides the key, each field can also carry an expiry time, and conditional setting of the expiry is supported.
The storage structure of the frequency-control data changes from <deviceId_msgtype, value> to <deviceId, msgtype, value>, so across multiple msgtypes each deviceId is stored only once, saving space.
incr and expire are merged into one command, incr(key, field, expire), which saves one network round trip (a hypothetical interface sketch follows the list):
- 1) If the field has no expiry set yet, the given expiry is set on it;
- 2) If the field has not yet expired, the expiry parameter is ignored.
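Since ehash is an in-house extension of the Redis protocol, there is no public client for it; the interface below is entirely hypothetical and only restates the semantics just described.

```java
// Hypothetical client interface for the in-house ehash structure: a two-level
// map <key, field, value> where each field carries its own TTL.
public interface EhashClient {
    /**
     * Atomically increments <key, field> by 1 and returns the new value.
     * If the field has no TTL yet, expireSeconds is applied to it; if the
     * field's TTL is already set and not expired, expireSeconds is ignored.
     * One network round trip replaces the incr + expire pair of the old scheme.
     */
    long incr(String key, String field, int expireSeconds);
}
```

With the new layout, the frequency-control check becomes a single call: increment <deviceId, msgtype> and compare the returned count with the limit.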
Because the push system uses the incr command heavily, its workload can be regarded as write-dominated, and in most scenarios pipelines are used to batch the writes. We also asked the system architecture team to optimize pika's write performance with a "write mode" (tuning the parameters relevant to write-heavy scenarios), reaching a QPS of 100,000+.
The ehash structure also plays an important role in recording the delivery pipeline, e.g. <deviceId, msgId, 100001002>, where 100001002 is an example value in the format we agreed on: the first, middle, and last parts (3 digits each) respectively record a message's (msgId) send, receive, and click details for that deviceId. For example, a first part of "100" means the send failed because the device was in the silent period at send time.
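Decoding the agreed value is then simple arithmetic. A small sketch (the format follows the example above; the meaning of individual codes such as "100 = send failed: silent period" is internal convention):

```java
public class DeliveryRecord {
    final int send;    // first 3 digits, e.g. 100 = send failed: silent period
    final int receive; // middle 3 digits
    final int click;   // last 3 digits

    DeliveryRecord(int value) {                // e.g. 100001002
        this.send = value / 1_000_000;
        this.receive = (value / 1_000) % 1_000;
        this.click = value % 1_000;
    }

    public static void main(String[] args) {
        DeliveryRecord r = new DeliveryRecord(100001002);
        System.out.printf("send=%03d receive=%03d click=%03d%n", r.send, r.receive, r.click);
        // prints: send=100 receive=001 click=002
    }
}
```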