
1. Features of the push platform

The vivo push platform is a message push service that vivo provides to developers. By maintaining a stable and reliable long-lived connection between the cloud and the client, it gives developers a real-time service for pushing messages to client applications, and supports delivering tens of billions of notifications and messages to mobile users within seconds.

The push platform is characterized by high concurrency, large message volume, and timely delivery. At present, the peak push speed is 1.4 million messages per second (140w/s), the maximum message volume in a single day is 15 billion, and the end-to-end online delivery rate within seconds is 99.9%.

2. How the push platform uses Redis

The vivo push platform has demanding requirements for concurrency and timeliness, handles a large number of messages, and its messages have a short validity period. The push platform therefore uses Redis as middleware for message storage and transfer, as well as for storing token information. Previously, it mainly used two Redis clusters, both in Redis Cluster mode. The two clusters are as follows:

Redis is used mainly in the following ways:

1) In the push link, the access layer stores the message body in the msg Redis cluster. The expiration time of the entry in msg Redis is the expiration time of the message.

2) After a series of processing steps, the push service layer reads the message body from the msg Redis cluster and queries the client information in the client Redis cluster. If the client is online, the message is pushed directly. If the client is offline, the message id is written to the waiting queue.

3) When the client connects, the push service layer reads the messages in the waiting queue and pushes them.

4) The storage management service periodically scans the cii index. If an entry's last update time in cii storage is more than 14 days old, the user is considered inactive, and the token information and the waiting queue messages for that token are cleared.
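Steps 1) to 3) above can be sketched roughly as follows. This is a minimal in-memory illustration, not the platform's actual code: the class `PushFlow`, its method names, and the use of plain dictionaries in place of the msg Redis and client Redis clusters are all assumptions made for the example.

```python
from collections import defaultdict, deque


class PushFlow:
    """In-memory stand-in for the msg Redis and client Redis clusters (hypothetical)."""

    def __init__(self):
        self.msg_store = {}                # message id -> body (plays the role of msg Redis)
        self.online = set()                # online client ids (plays the role of client Redis)
        self.waiting = defaultdict(deque)  # client id -> queued message ids
        self.delivered = []                # (client id, body) pairs that were pushed

    def store_message(self, msg_id, body):
        # Step 1: the access layer stores the message body.
        self.msg_store[msg_id] = body

    def push(self, client_id, msg_id):
        # Step 2: read the body, then push directly or queue the message id.
        body = self.msg_store.get(msg_id)
        if body is None:
            return "expired"
        if client_id in self.online:
            self.delivered.append((client_id, body))
            return "pushed"
        self.waiting[client_id].append(msg_id)
        return "queued"

    def on_connect(self, client_id):
        # Step 3: when the client connects, drain its waiting queue.
        self.online.add(client_id)
        queue = self.waiting.pop(client_id, deque())
        for msg_id in queue:
            body = self.msg_store.get(msg_id)
            if body is not None:
                self.delivered.append((client_id, body))
        return len(queue)
```

Only message ids are queued for offline clients, so the body is read again from the message store when the queue is drained.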

The Redis flow chart of the push link operation is as follows:

3. Production issues on the push platform

As described above, the push platform mainly uses two Redis clusters: the msg cluster and the client cluster. As the business grew, the system's performance requirements rose and Redis became a bottleneck. Before optimization, the msg Redis cluster had grown to 220 masters and 4400G of capacity. As the cluster grew, it became harder to maintain and incidents became more frequent. In particular, in April, when a celebrity's divorce became a trending event, the real-time concurrent message volume reached 520 million, and the msg Redis cluster saw connections and memory surge on a single node. One node reached 24,674 connections and 23.46G of memory, and the surge lasted about 30 minutes. During this period, reads and writes to the msg Redis cluster were slow, with an average response time of about 500ms, which affected the stability and availability of the overall system. Availability dropped to 85%.

4. Push platform Redis optimization

Redis is generally optimized in the following areas:

1) Capacity: Redis is an in-memory store. Compared with disk-based databases, its storage is more expensive. In-memory storage is what gives Redis its high read and write performance, but it also limits storage space. Therefore, the business should store hot data as far as possible, estimate capacity in advance, and preferably set an expiration time. In the storage design, choose the data structure that fits the data, and compress relatively large values before storing them.

2) Hot key skew: Redis Cluster divides the key space into 16384 slots, numbered [0-16383], and each node is responsible for a subset of the slots. For each request, the value of CRC16(key) mod 16384 determines which slot the key belongs to. Because each node serves only part of the slots, keys should be designed to be sufficiently random. In particular, when a hash algorithm is used to map keys, the hash values should be uniformly distributed. In addition, to control concurrency on hot keys, use rate limiting with degradation or local caching, so that excessive concurrent requests for a hot key do not skew load onto a single Redis node.

3) Oversized clusters: Redis Cluster has a decentralized architecture. Each node stores data and the state of the entire cluster, and each node is connected to all other nodes. Each node keeps the mapping between all nodes and slots, so the more nodes there are, the larger the mapping each node keeps and the more data the heartbeat packets between nodes carry. When scaling out or in, the cluster takes relatively long to update its slot mapping (clusterSlots), which risks blocking and affects stability. Therefore, avoid too many nodes in one cluster, and preferably split clusters by business.
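To make the CRC16(key) mod 16384 rule from point 2) concrete, here is a small sketch of the slot calculation. It uses the CRC16 variant that Redis Cluster is documented to use (XMODEM: polynomial 0x1021, initial value 0). The function names are chosen for this example, and hash tags (the `{...}` syntax) are deliberately ignored to keep it short.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot of a key; simplified, does not handle {hash tags}."""
    return crc16_xmodem(key.encode("utf-8")) % 16384


print(key_slot("foo"))  # 12182
```

Because the slot depends only on the key, a single very large or very hot key always lands on one node, which is why key design matters more than the number of nodes.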

This raises a question: why does Redis Cluster use 16384 slots rather than more, and how many nodes can a cluster have at most?

The author of Redis has given an explanation, which also states that Redis Cluster is not recommended to exceed 1000 master nodes.

[Image]
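The reasoning usually cited from that explanation is that every heartbeat packet carries a bitmap with one bit per slot, so the slot count directly sets the size of that part of each packet. The arithmetic is simple enough to check. The helper name below is made up for this illustration.

```python
def slot_bitmap_bytes(slot_count: int) -> int:
    """Bytes needed for a bitmap with one bit per hash slot."""
    return slot_count // 8


print(slot_bitmap_bytes(16384))  # 2048 bytes, i.e. 2 KB per heartbeat
print(slot_bitmap_bytes(65536))  # 8192 bytes, i.e. 8 KB per heartbeat
```

With at most around 1000 masters, 16384 slots still leave each master enough slots to balance, while keeping the heartbeat bitmap at 2 KB instead of 8 KB.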

Based on the above optimization directions and its own business characteristics, the push platform approached Redis optimization from the following aspects.

  • msg Redis cluster capacity optimization;
  • The large msg Redis cluster is split according to business attributes;
  • Redis hotspot key investigation;
  • Client Redis cluster concurrent call optimization.

4.1 msg Redis cluster capacity optimization

As mentioned above, the msg Redis cluster had reached 220 masters and 4400G of capacity. At peak, used capacity reached 3650G, about 83%. If push volume kept growing, the cluster would need to be expanded, at too high a cost. We therefore analyzed what the msg Redis cluster was storing, using RDR, the open-source RDB analysis tool from Xueqiu (Snowball). We will not introduce it in detail here; you can download the tool from its GitHub page. It analyzes a Redis snapshot and reports the capacity of each Redis structure type, the number of keys, the top 100 largest keys, and the number and capacity of keys by prefix.

Conclusion of the analysis: in the msg Redis cluster, structures whose keys start with mi: account for about 80% of capacity, of which single push messages account for 80%. Definitions:

  • Single push: one message is pushed to one user.
  • Group push: one message can be pushed to multiple users, so the message body can be reused.

A single push is one-to-one, so its message body is no longer needed once the push succeeds or fails (for example, because of push control rules or an invalid user).

Optimization scheme:

  • Clean up single push messages promptly. If the user has received the single push message and the puback receipt has arrived, delete the message from Redis directly. If a single push message is blocked from being sent, for example by push control rules, delete its message body directly.
  • Aggregate storage of messages with identical content. Only one copy of a message with the same content is stored, and its message id is reused across pushes.
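The two measures above can be combined in a sketch like the following. It is an illustration under stated assumptions, not the platform's implementation: the `MessageStore` class, the SHA-256 content digest, and the reference count that decides when a shared body can be deleted are all choices made for this example, with a dictionary standing in for msg Redis.

```python
import hashlib


class MessageStore:
    """In-memory stand-in for msg Redis with content aggregation (hypothetical)."""

    def __init__(self):
        self.bodies = {}     # message id -> body
        self.refs = {}       # message id -> number of pushes still pending
        self.by_digest = {}  # content digest -> message id
        self._next_id = 0

    @staticmethod
    def _digest(body: str) -> str:
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def save(self, body: str) -> int:
        # Aggregated storage: identical content is stored once and its id reused.
        digest = self._digest(body)
        msg_id = self.by_digest.get(digest)
        if msg_id is None:
            self._next_id += 1
            msg_id = self._next_id
            self.bodies[msg_id] = body
            self.by_digest[digest] = msg_id
            self.refs[msg_id] = 0
        self.refs[msg_id] += 1
        return msg_id

    def finish(self, msg_id: int) -> None:
        # Timely cleanup: call after a puback receipt or when the push is blocked.
        if msg_id not in self.refs:
            return
        self.refs[msg_id] -= 1
        if self.refs[msg_id] == 0:
            body = self.bodies.pop(msg_id)
            del self.refs[msg_id]
            del self.by_digest[self._digest(body)]
```

The reference count is what keeps the two measures compatible: a shared body is removed only after every push that reuses it has been acknowledged or blocked.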

This optimization shrank capacity noticeably. After full rollout, capacity dropped by 2090G from the original peak of 3650G, a reduction of 58%.

[Image]

4.2 The large msg Redis cluster is split according to business attributes

Although the cluster capacity had been optimized, msg Redis was still under heavy pressure during peak hours.

Main reasons:

1) Many nodes connect to msg Redis, so the number of connections is high during peak periods.

2) The message body and the waiting queue are stored in the same cluster, and both are operated on for every push, so Redis concurrency is high and CPU load during peak periods exceeds 90%.

3) The old cluster runs Redis 3.x. After the split, the new cluster uses 4.x, which has the following advantages over 3.x:

  • PSYNC2.0: fixes the full resynchronization that a master-replica switchover triggered in earlier versions.
  • A new cache eviction algorithm, LFU (Least Frequently Used), plus optimizations to the existing algorithms.
  • Non-blocking deletion (UNLINK) and asynchronous FLUSHALL/FLUSHDB, which effectively avoid the Redis blocking that deleting a big key can cause.
  • The MEMORY command, for more comprehensive memory monitoring and statistics.
  • Better memory efficiency: storing the same amount of data requires less memory.
  • Active memory defragmentation, which gradually reclaims memory. With the Jemalloc allocator, Redis can defragment memory online.

The splitting scheme separates the information stored in msg Redis by business attribute: the message body and the waiting queue are split into two independent clusters. This gives two options.

Scheme one: split the waiting queue out of the old cluster

Only the push nodes need to be modified. However, the waiting queue is written continuously and is stateful: it depends on the online status of the clientId and its values are updated in real time, so switching would cause data loss.

Scheme two: split the message body out of the old cluster

All nodes that connect to msg Redis are pointed to the new address and restarted. The push nodes perform double reads, and switch to reading only the new cluster once the hit rate of the old cluster reaches 0. Because the message body is only ever written and read, never updated, the switch does not need to handle state, as long as writes and reads are guaranteed to work. In addition, message body capacity keeps growing, so the cluster must be easy and quick to expand. The new cluster uses version 4.0, which makes dynamic scaling out and in convenient.

Considering the impact on the business and on service availability, and to ensure that no messages are lost, we chose scheme two, with a double-read, single-write design:

Because the message body is moved to the new cluster, during the switching period (up to 30 days) new message bodies are written to the new cluster while the old cluster still holds the old message bodies. During this period, the push nodes must read from both clusters to ensure no data is lost. To keep double reads efficient, the read rule must be adjustable dynamically, without code changes or service restarts.

There are four rules: read only the old cluster, read only the new cluster, read the old cluster first and then the new one, and read the new cluster first and then the old one.

Design idea: the server supports all four strategies, and the configuration center determines which rule is in effect.

Basis for choosing the rule: the hit count and hit rate of the old cluster. At launch, the rule is set to "read old first, then new". When the hit rate of the old cluster falls below 50%, switch to "read new first, then old". When the hit count of the old cluster reaches 0, switch to "read only new".

The hit rate and hit count of the old cluster are collected by adding them to general monitoring.
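The four read rules and the switching thresholds described above could look roughly like this. It is a sketch only: `ReadRule`, `read_message`, and `suggest_rule` are hypothetical names, dictionaries stand in for the two clusters, and in the real system the active rule comes from the configuration center rather than from a function call.

```python
from enum import Enum


class ReadRule(Enum):
    OLD_ONLY = "old_only"
    NEW_ONLY = "new_only"
    OLD_THEN_NEW = "old_then_new"
    NEW_THEN_OLD = "new_then_old"


# Order in which the clusters are tried under each rule.
READ_ORDER = {
    ReadRule.OLD_ONLY: ("old",),
    ReadRule.NEW_ONLY: ("new",),
    ReadRule.OLD_THEN_NEW: ("old", "new"),
    ReadRule.NEW_THEN_OLD: ("new", "old"),
}


def read_message(msg_id, rule, clusters, stats):
    """Read a message body under the given rule and count hits per cluster."""
    for name in READ_ORDER[rule]:
        body = clusters[name].get(msg_id)
        if body is not None:
            stats[name] = stats.get(name, 0) + 1
            return body
    return None


def suggest_rule(old_hits, total_hits):
    """Apply the thresholds from the text to monitored hit counts."""
    if total_hits == 0:
        return ReadRule.OLD_THEN_NEW  # no data yet: keep the launch rule
    if old_hits == 0:
        return ReadRule.NEW_ONLY
    if old_hits / total_hits < 0.5:
        return ReadRule.NEW_THEN_OLD
    return ReadRule.OLD_THEN_NEW
```

Trying the cluster with the higher hit rate first keeps most reads to a single round trip, which is why the order is flipped once the old cluster's hit rate falls below 50%.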

The flow chart of scheme two is as follows:

Effect after the split:

  • Before the split, the old msg Redis cluster had a peak load of more than 95% during the same period.
  • After the split, the peak load during the same period was reduced to 70%, a drop of 15%.

[Image]

Before the split, the average response time of the msg Redis cluster at peak was 1.2ms, and some Redis calls were slow during peak periods. After the split, the average response time dropped to 0.5ms, and the slow responses during peak periods disappeared.

4.3 Redis hotspot key investigation

As mentioned earlier, during the celebrity trending event in April, connections and memory on a single msg Redis node soared: the node reached 24,674 connections and 23.46G of memory.

Because the Redis cluster runs on virtual machines, we first suspected load on the underlying host. Investigation showed that the host of the problem node carried about 10 Redis master nodes, while other hosts carried about 2 to 4. We therefore rebalanced the masters so that each host carried a more even number of them. After rebalancing, the overall situation improved somewhat. However, during push peaks, especially when pushing to all users at full speed, connections and memory on a single node still spiked occasionally. The inbound and outbound traffic of the host's network card showed no bottleneck, and we also ruled out interference from other business nodes on the host. We therefore suspected that the way the business used Redis produced a hot spot.

Call chain monitoring during the peak period showed (see the figure below) that the hexists command on msg Redis was very slow between 11:49 and 12:59. This command checks whether a message is in the mii index. Call chain analysis showed that most of the slow keys were mii:0. We also analyzed a memory snapshot of the problem node and found that mii:0 accounted for a large share of capacity, confirming that reads of mii:0 were a hot spot.

Further investigation found that the message ids produced by the snowflake algorithm were skewed. The sequence value within each millisecond starts at 0 and the sequence is 12 bits long, so on the management console and API nodes, where concurrency is low, the last 12 bits of the generated messageId are almost always 0. Because the mii index key is mii:${messageId%1024}, a messageId whose last 12 bits are 0 gives messageId%1024 = 0. As a result, the key mii:0 in msg Redis grew very large and was hit by most queries, which caused the Redis hot key problem.

Optimization measures:

1) Modify the snowflake algorithm. The initial sequence value used when generating a message id is no longer 0 but a random number from 0 to 1023, which prevents the hot spot skew.

2) Replace the hexists command: determine the result from the message type in the msg message body and from whether the message body exists.
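The skew and the effect of measure 1) can be reproduced with a few lines. This is a simplified sketch: only the 12-bit sequence length and the messageId%1024 index come from the text, while the 10-bit worker field, the function names, and the fixed worker id are assumptions based on the classic Snowflake layout.

```python
import random

SEQUENCE_BITS = 12    # from the article
WORKER_BITS = 10      # assumption: classic Snowflake layout
INDEX_BUCKETS = 1024  # the mii index uses messageId % 1024


def make_message_id(timestamp_ms: int, worker_id: int, sequence: int) -> int:
    """Compose an id as timestamp | worker | sequence."""
    return (
        (timestamp_ms << (WORKER_BITS + SEQUENCE_BITS))
        | (worker_id << SEQUENCE_BITS)
        | sequence
    )


def index_key(message_id: int) -> str:
    return f"mii:{message_id % INDEX_BUCKETS}"


def initial_sequence(rng: random.Random, randomized: bool) -> int:
    """First sequence value used within a new millisecond."""
    return rng.randrange(INDEX_BUCKETS) if randomized else 0


rng = random.Random(42)
# Low concurrency: one id per millisecond, so only the initial sequence is ever used.
before = {index_key(make_message_id(ts, 3, initial_sequence(rng, False))) for ts in range(1000)}
after = {index_key(make_message_id(ts, 3, initial_sequence(rng, True))) for ts in range(1000)}
print(len(before))  # 1: every id lands in mii:0
print(len(after))   # many distinct index keys
```

Since the timestamp and worker fields sit above bit 12, they are multiples of 4096 and vanish under % 1024, so the index key is decided entirely by the sequence value.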

Final effect: after optimization, the mii index is evenly distributed, the number of Redis connections is stable, memory grows steadily, and the surges in single-node memory and connections no longer occur.

4.4 Client Redis Cluster Concurrent Call Optimization

Upstream nodes call the push nodes using consistent hashing on the clientId. Each push node caches clientInfo locally for 7 days. When pushing, it first queries the local cache to determine whether the client is valid. Important and frequently changing information, however, is queried directly from client Redis, which puts the client Redis cluster under heavy pressure, with high concurrency and high CPU load during push peaks.

Flow chart of push node cache and client Redis operations before optimization:

[Image]

Optimization scheme: split the original clientInfo cache into three caches, in a tiered design.

  • cache still holds the part of the original clientInfo that changes infrequently, with the cache time still 7 days.
  • cache1 holds the frequently changing information of clientInfo, such as online status and cn address.
  • cache2 holds some of the ci encryption parameters. It is used only when encryption is required, and it changes less often, only when the client connects.

Because new caches are introduced, cache consistency has to be considered, so the following measures are added:

1) Verify the cache on push. When the push node calls the broker node, it updates or clears the local cache according to the broker's response, for example when the broker returns a "not online" or "aes mismatch" error code. On the next push or retry, the latest client information is reloaded from Redis.

2) On uplink events from the phone, that is, connect and disconnect, update or clear the local cache. On the next push or retry, the latest client information is reloaded from Redis.

Overall process: when a message is pushed, the local cache is queried first, and client Redis is read only if the entry is missing or has expired. When pushing to the broker, the cache is updated or invalidated according to the broker's response. When uplink disconnect and connect events are received, the cache is updated or invalidated promptly, and the data is reloaded from client Redis on the next push.
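The overall process can be sketched as a set of small local caches with expiry and explicit invalidation. This is an illustrative outline, not the push node's real code: the class and method names are invented, the `loader` callback stands in for a read from client Redis, and the TTLs for cache1 and cache2 are placeholders because the article only gives the 7-day figure.

```python
import time


class TtlCache:
    """Small local cache with a fixed TTL and an injectable clock."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # client id -> (expires_at, value)

    def get(self, client_id, loader):
        now = self.clock()
        entry = self.entries.get(client_id)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(client_id)  # missing or expired: load from client Redis
        self.entries[client_id] = (now + self.ttl, value)
        return value

    def invalidate(self, client_id):
        self.entries.pop(client_id, None)


class ClientCaches:
    """The three tiers described above (TTLs other than 7 days are assumptions)."""

    def __init__(self, clock=time.monotonic):
        self.static_info = TtlCache(7 * 24 * 3600, clock)  # "cache": rarely changing fields
        self.status = TtlCache(60, clock)                   # "cache1": online status, cn address
        self.crypto = TtlCache(3600, clock)                 # "cache2": encryption parameters

    def on_client_changed(self, client_id):
        # Called on broker error codes or on connect/disconnect uplink events.
        self.status.invalidate(client_id)
        self.crypto.invalidate(client_id)
```

Invalidating on events rather than relying only on expiry is what lets the frequently changing tiers be cached at all without serving stale online status after a reconnect.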

Flow chart of push node cache and client Redis operations after optimization:

[Image]

Optimization effect:

1) The hit rate of the new cache1 is 52%, and the hit rate of cache2 is 30%.

2) The concurrent calls of client Redis are reduced by nearly 20%.

3) Redis load is reduced by about 15% during peak periods.

5. Summary

With its high concurrency performance and rich data structures, Redis is a good choice as cache middleware in high-concurrency systems. Whether Redis actually delivers high performance, however, depends on whether the business truly understands Redis and uses it correctly. A few points to note:

1) In Redis cluster mode, each master node is responsible for only part of the slots. When designing Redis keys, the business should make sure keys are sufficiently random so that they are distributed evenly across the Redis nodes, and should avoid large keys. The business should also avoid request hot spots, where many requests hit a small number of nodes at the same time.

2) The actual throughput of Redis also depends on the size of the request packets and on the network card, as the official documentation explains. When a single packet exceeds 1000 bytes, performance drops sharply. Large keys should therefore be avoided as much as possible. It is also best to run performance stress tests based on the actual business scenario, network environment, bandwidth, and network card, so that the actual throughput of the cluster is well understood.

Take our client Redis cluster as an example: (for reference only)

[Image]

Redis offers little support for real-time analysis. Apart from basic metric monitoring, it does not currently support real-time analysis of in-memory data. In practice, when a Redis bottleneck occurs, monitoring data is often missing and the problem is hard to locate. Redis data can only be analyzed by running analysis tools on snapshot files. Using Redis well therefore depends on the business fully understanding it and taking this into account when designing a solution. At the same time, stress test Redis against the business scenario, find where the bottlenecks are, and prepare monitoring and capacity expansion in advance.

Author: vivo internet server team - Yu Quan
