Open a chat app, say good morning to the people who matter, and glance at the group chats whose notifications you have muted. For many of us, this is a fixed ritual that starts the day.

With the rapid development of the mobile internet and communication technology, online communication has become central to how people work and live. Going out without a phone feels like losing touch with the world, and anxiety sets in when the battery drops below 50%, because our contact list holds our connection to everyone we know. Opening a conversation from the conversation list is one of the actions we perform most frequently every day.

The accuracy and real-time behavior of the IM conversation list directly affect the user experience. This article shares the design ideas behind the data read and write paths of Rongyun's IM conversation service.

Technical Challenges of Massive Message Sessions

Take a single (one-to-one) chat as an example. When user A sends a message to user B, two conversation records are generated: one for the sender and one for the receiver. The server persists both records in the database so they survive service restarts and upgrades and can serve later queries.

To reduce interaction between server and client, we normally store only the last message in each conversation record. From then on, every time A sends a message to B, or B sends one to A, the last message and the last-active time of both conversation records are updated accordingly.
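The "two records per message" model above can be sketched as follows. This is a minimal illustration, not Rongyun's actual code; the record shape and function names are assumptions.

```python
import time

# Hypothetical in-memory model: each message between A and B touches two
# conversation records, one owned by each participant.
conversations = {}  # (owner, peer) -> {"last_message": str, "updated_at": float}

def on_message(sender, receiver, text, now=None):
    now = now if now is not None else time.time()
    # The sender's view and the receiver's view are separate records;
    # both store only the last message and its time.
    for owner, peer in ((sender, receiver), (receiver, sender)):
        conversations[(owner, peer)] = {"last_message": text, "updated_at": now}

on_message("A", "B", "hello", now=1.0)
```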

If 100,000 users send one-to-one messages at the same moment, 200,000 conversation records must be inserted (for a first chat) or updated (for a follow-up chat).

Such high-frequency reads and writes demand accurate, fast queries and reliable storage, and they put the service's processing capability under high concurrency to the test.

High-Concurrency Conversation Queries

To provide fast queries, we generally keep frequently accessed hot data in a cache such as Redis, so the system can respond quickly instead of hitting the database on every query.

To further reduce network round trips, lower server load, and improve the user experience, we can also keep hot data in the service's own memory. On each query, we first check memory: if the data exists there, it is returned directly; if not, it is fetched from the database and then cached in memory, so the next query is served straight from memory.
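The lookup path just described is the classic cache-aside pattern. A minimal sketch, with a dict standing in for both the in-process cache and the database (all names are illustrative):

```python
# Cache-aside lookup: memory first, database on a miss, then populate the cache.
memory_cache = {}
database = {"user:1": ["conv-a", "conv-b"]}  # stand-in for the real store

def query_conversations(user_key):
    if user_key in memory_cache:          # cache hit: no database round trip
        return memory_cache[user_key]
    value = database.get(user_key)        # cache miss: read the database
    if value is not None:
        memory_cache[user_key] = value    # cache it for the next query
    return value
```

The second call for the same key is then answered from memory without touching the database.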

(Figure 1 Session query process)

In a distributed architecture, to improve the in-memory cache hit rate, we generally use consistent hashing to route all operations on a given user's conversations to the same service instance. This not only improves performance but also reduces the data-consistency problems that distributed reads and writes would otherwise create.

(Figure 2 Consistent Hash algorithm calculation service location)

Inserts and Updates Under High Concurrency

With query operations covered, let's look at how to optimize write operations: inserts, updates, and deletes.

When a large volume of data must be written to the database, query latency inevitably rises. How do we deal with that?

First, memory is finite, so keeping every conversation in memory is unrealistic. We place newly added conversations in an LRU cache to serve queries, then write them to a queue and persist them to the database asynchronously. At this point you may object: this does not reduce the number of database inserts, it merely defers them. True, this step alone does not reduce the number of database operations. Read on.
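The write path above can be sketched with an `OrderedDict` as a bounded LRU and a `deque` as the persistence queue. This is a simplified single-threaded sketch; the capacity and names are illustrative, and a real service would drain the queue from a background worker.

```python
from collections import OrderedDict, deque

LRU_CAPACITY = 3             # illustrative; real capacity would be much larger
lru = OrderedDict()          # session_id -> (message, msg_time), newest last
persist_queue = deque()      # sessions waiting to be written to the database

def write_session(session_id, message, msg_time):
    lru[session_id] = (message, msg_time)
    lru.move_to_end(session_id)          # mark as most recently used
    if len(lru) > LRU_CAPACITY:
        lru.popitem(last=False)          # evict the least recently used entry
    # Enqueue for asynchronous persistence instead of writing synchronously.
    persist_queue.append((session_id, message, msg_time))

for i in range(1, 5):
    write_session(f"conv-{i}", f"msg-{i}", i)
```

Note that the LRU is bounded while the queue is not: the cache bounds memory for reads, and the queue absorbs write bursts.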

(Figure 3 Asynchronous data landing processing)

Suppose conversation A is to be added. It is first written to the LRU, then appended to the queue to await persistence. If the queue has no backlog, the conversation is written to the database directly. If there is a backlog, then when the queue reaches this conversation, we first compare it against the copy in the LRU, write the newest version to the database, and record that version's time.

(Figure 4 Session data access strategy)

As shown in Figure 4, conversation 1 is taken out of the queue to be persisted. At that moment, the queued copy of conversation 1 has message time 1, while the copy in the LRU has message time 4. The newer LRU copy is written to the database, and its time is recorded. When the queue later reaches the older copy of conversation 1 with time 3, that entry is discarded rather than stored, because its time is earlier than the recorded time. This effectively prevents the repeated writes caused by the same conversation being updated many times while a high-concurrency backlog builds up, reducing database pressure.
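The drain-and-merge step from Figure 4 can be sketched as follows, using the same timestamps as the example (times 1, 3 in the queue; time 4 in the LRU). Plain dicts stand in for the LRU and the database, and the names are illustrative.

```python
from collections import deque

# Backlogged queue: two stale versions of the same conversation.
queue = deque([("conv-1", "m1", 1), ("conv-1", "m3", 3)])
lru = {"conv-1": ("m4", 4)}      # the LRU holds the newest version (time 4)
db = {}                          # stand-in for the database
last_flushed = {}                # session_id -> time of the version in the DB

def drain_once():
    session_id, message, msg_time = queue.popleft()
    if msg_time <= last_flushed.get(session_id, -1):
        return "discarded"                    # stale: a newer copy is already stored
    cached = lru.get(session_id)
    if cached and cached[1] > msg_time:
        message, msg_time = cached            # prefer the fresher LRU copy
    db[session_id] = (message, msg_time)
    last_flushed[session_id] = msg_time       # record the flushed version's time
    return "stored"
```

Draining the first entry stores the LRU's time-4 copy and records time 4; the second entry (time 3) is then discarded, so one database write replaces two.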

In summary, the design ideas behind Rongyun's conversation service data reads and writes are as follows:

Make full use of memory to cache hot data, reducing read pressure on the back-end storage service;
Increase the cache hit rate by computing a stable service placement for each user;
Merge business data to eliminate redundant operations and reduce writes to the storage service;
Decouple business logic from data persistence through asynchronous writes.

As communication capabilities become a basic requirement in more and more scenarios, and high-concurrency scenarios keep multiplying, we will continue to iterate on the service architecture, improve its concurrency capacity, and provide developers with stable, reliable, low-latency communication capabilities.
