Original author: "Shang Wenmo". This version includes revisions and changes.
1. Foreword
Among the many IM technical articles compiled by the Instant Messaging Network (see the "Reference materials" section at the end of this article), articles on message reliability and consistency account for a large share. The reason is that, whatever eye-catching product features and technical characteristics an IM system offers, guaranteeing the reliability and consistency of messages is an essential quality for any IM product.
Imagine an IM app in which you cannot tell whether the other party will receive the message you sent, or in which the chat content you receive looks like "nonsense" (a serious out-of-order problem). Users would not let such an app stay on their phone overnight (they would uninstall it immediately), because without the most basic chat logic the IM software has lost its purpose.
On the other hand, there is no standard for IM systems (the XMPP protocol tried to solve this problem, but that turned out to be unrealistic). Almost every company has its own private protocol and its own implementation logic. As a result, even for the same technical problem, IM rarely has a fixed implementation routine or a standard solution.
Therefore, although the author provides a solution to the "reliability" and "consistency" problems of IM messages in this article, whether that solution is reasonable and whether it suits you is a matter of opinion. Put plainly: the content of this article is for reference only. For specific solutions, the most sensible course is to read more articles on this topic on the Instant Messaging Network and work out the technical solution and approach that suit you best.
Learning and discussion:
- Instant messaging/push technology development discussion group 5: 215477170 [recommended]
- Introduction to mobile IM development: "One Entry Is Enough for Novices: Develop Mobile IM from Scratch"
- Open source IM framework source code: https://github.com/JackJiang2011/MobileIMSDK
(This article was published simultaneously at: http://www.52im.net/thread-3574-1-1.html)
2. Introduction
As we all know, an instant messaging (IM) chat system must solve the problems of message reliability and message consistency (PS: if you are not yet sure what an IM system is, read "Introduction to zero-based IM development (1): What is an IM system?").
In plain terms, these two problems are:
- 1) Message reliability: simply put, no message is lost. When one party in the conversation sends a message, it reaches the other party and is displayed correctly;
- 2) Message consistency: this covers consistency of messages on the sender's side and consistency between the two sides of the conversation; messages must not be duplicated or out of order.
This article starts with the typical IM message sending logic and then explains, in an easy-to-understand way, how message reliability and consistency problems arise and which technical solutions can be used for reference. The solutions may not be perfect, but hopefully they offer some inspiration for your own IM technical problems.
3. Typical IM message sending process
The general process of sending an IM message can be divided into two stages:
- 1) The sender sends a message, the server receives it, and returns a message ACK to the sender;
- 2) The server pushes the message to the receiver.
Whether a message counts as successfully sent is judged mainly by the first stage, that is, whether the server received the message.
For the message sender, the message status can be divided into three categories:
- 1) Sending;
- 2) Successfully sent;
- 3) Sending failed.
Specifically, the three states mean:
- 1) Sending: from the moment the sender triggers the send event until the server returns the ACK for the message;
- 2) Successfully sent: the sender has received the ACK for the message;
- 3) Sending failed: the retransmission limit has been exceeded and no ACK for the message has been received.
This is shown in the following figure:
4. IM message reliability
Due to space limitations, for the basic concepts and detailed principles of IM message reliability, it is recommended to read "Introduction to zero-based IM development (3): What is the reliability of the IM system?". This article focuses on the solution ideas.
4.1 Retransmission mechanism
In the first stage of message sending (see section "3. Typical IM message sending process"), successful delivery is ensured by a retransmission mechanism:
- 1) Decide whether to retransmit a message according to whether its ACK is received within a certain period of time;
- 2) If the preset duration is exceeded, send the message again;
- 3) Once the number of retransmissions exceeds the preset limit, stop retransmitting, mark the message as failed, and update its sending status.
PS: for a complete program-level implementation, refer to the QoS mechanism in MobileIMSDK.
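The three retransmission steps above can be sketched as follows. This is a minimal illustration, not the MobileIMSDK implementation: `OutgoingMessage`, `send_with_retry`, and the `transmit` callback are hypothetical names, and the timeout and retry limit are arbitrary example values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Status(Enum):
    SENDING = "sending"
    SENT = "sent"
    FAILED = "failed"


@dataclass
class OutgoingMessage:
    msg_id: str
    body: str
    status: Status = Status.SENDING
    attempts: int = 0


def send_with_retry(
    msg: OutgoingMessage,
    transmit: Callable[[OutgoingMessage, float], bool],
    ack_timeout: float = 3.0,
    max_retries: int = 3,
) -> Status:
    """Send msg, retransmitting until an ACK arrives or retries run out.

    transmit(msg, timeout) is a hypothetical transport hook: it sends the
    message and returns True only if the server's ACK arrives in time.
    """
    # One initial attempt plus up to max_retries retransmissions.
    for _ in range(1 + max_retries):
        msg.attempts += 1
        if transmit(msg, ack_timeout):
            msg.status = Status.SENT
            return msg.status
    # Retry limit exceeded without an ACK: mark the message as failed.
    msg.status = Status.FAILED
    return msg.status
```

The message keeps the "sending" status for the whole loop and only moves to "sent" or "failed" at the end, which matches the three sender-side states described in section 3.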
4.2 Session record check
In the second stage of message sending (see section "3. Typical IM message sending process"), the server pushes the message to the receiver. If the connection is down at that moment, the message is lost.
Therefore, to ensure that messages are complete, once the connection is re-established the client needs to fetch the session record starting from the timestamp of the last message (ACK), and the server returns all messages in that period in one go (PS: in medium and large applications, pulling messages is not a simple matter; for details, read "IM development and dry goods sharing: how to elegantly realize the reliable delivery of a large number of offline messages").
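A minimal sketch of this catch-up step, assuming the server stores each message with a timestamp and a unique id. The names `StoredMessage`, `messages_since`, and `sync_after_reconnect` are hypothetical, and the server history is modeled as an in-memory list for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredMessage:
    msg_id: str
    timestamp: int  # assumed server-assigned, e.g. milliseconds
    body: str


def messages_since(history: list[StoredMessage], last_ts: int) -> list[StoredMessage]:
    """Server side: return everything recorded at or after last_ts, oldest first."""
    return sorted(
        (m for m in history if m.timestamp >= last_ts),
        key=lambda m: m.timestamp,
    )


def sync_after_reconnect(
    local: list[StoredMessage], history: list[StoredMessage]
) -> list[StoredMessage]:
    """Client side: fill the gap left by a dropped connection."""
    last_ts = max((m.timestamp for m in local), default=0)
    known = {m.msg_id for m in local}
    # Query with >= so a message sharing the last timestamp is not skipped,
    # then drop the ones the client already has.
    missing = [m for m in messages_since(history, last_ts) if m.msg_id not in known]
    return local + missing
```

Querying from the last timestamp inclusively and filtering by id is one way to avoid missing a message that shares a timestamp with the last one received; it also keeps the catch-up from introducing duplicates.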
Another safeguard is to add periodic polling that checks message integrity. The specific idea is shown in the figure below.
Connection flow chart:
4.3 Two issues to consider
Message retransmission and session record checks raise two issues:
- 1) Whether messages will be sent repeatedly;
- 2) Whether the order of messages will be disrupted.
Here are two examples.
Regarding message retransmission:
- 1) If the message is lost before it reaches the server, the server never received it, so when the sender resends it and the server receives it successfully, no duplicate is produced;
- 2) If the server receives the message but the ACK it returns is lost, the sender resends the same message, which may produce a duplicate.
Regarding message order:
- 1) If the sender sends three messages in a row, and the first and third reach the server but the second is lost, should the third message be recorded?
- 2) If the second message then reaches the server, is it ordered before or after the third (the server generally timestamps each record)?
5. IM message consistency
As in the previous section, for the basic concepts and detailed principles of IM message consistency, it is recommended to read "Introduction to zero-based IM development (4): What is the message timing consistency of the IM system?".
5.1 Message de-duplication using uuid
To handle duplicates caused by retransmission, a uuid attribute can be added to each message as its unique identifier. The uuid of a retransmitted message stays unchanged, and the front end de-duplicates by uuid. That is the general idea.
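As a rough sketch of this idea, the snippet below assigns a uuid when a message is created and has the receiving side ignore any uuid it has already seen. `new_message` and `MessageList` are hypothetical names; Python's standard `uuid.uuid4()` is used here only as one convenient way to generate the identifier.

```python
import uuid


def new_message(body: str) -> dict:
    """Assign the uuid once, at creation; retransmissions reuse the same dict."""
    return {"uuid": str(uuid.uuid4()), "body": body}


class MessageList:
    """Front-end list that ignores messages whose uuid was already seen."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.messages: list[dict] = []

    def receive(self, msg: dict) -> bool:
        """Return True if the message was added, False if it was a duplicate."""
        if msg["uuid"] in self._seen:
            return False
        self._seen.add(msg["uuid"])
        self.messages.append(msg)
        return True
```

Note that two messages with identical text but different uuids are both kept: de-duplication is keyed on the identifier, not on the content.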
PS: For IM, message ID is also a big technical topic. If you are interested, you can read the following series:
- "IM Message ID Technology Topic (1): Practice of Generating Massive IM Chat Message Sequence Numbers on WeChat (Principles of Algorithms)"
- "IM Message ID Technology Topic (2): Practice of Generating Massive IM Chat Message Serial Numbers in WeChat (Disaster Recovery Plan)"
- "IM Message ID Technology Topic (3): Decrypting the Chat Message ID Generation Strategy of Rongyun IM Products"
- "IM Message ID Technology Topic (4): Deep Decryption of Meituan's Distributed ID Generation Algorithm"
- "IM Message ID Technology Topic (5): Technical Implementation of Open Source Distributed ID Generator UidGenerator"
- "IM Message ID Technology Topic (6): Deep Decryption Didi's High-Performance ID Generator (Tinyid)"
5.2 Use vector clock for message sorting
On message ordering: in a chat, the order of messages strongly affects how well the sender's meaning comes across; incomplete or reversed messages can cause incoherence and even misunderstanding. Therefore the order of the messages a sender sends must be preserved, while the order of messages between the two parties in the conversation depends on the actual situation.
The general perception is that a message still in the sending state should not yet be visible to the other party; only a successfully sent message is. In the implementation, however, "successfully sent" means that the server received the message and returned an ACK, not that the other party received it.
This raises a question: if one message is still in the sending state and another message is received at that moment, does the received message belong before or after the one being sent?
This is a question of context. The key point is which messages the sender had seen when it sent its message.
One idea is to borrow the vector clock algorithm from distributed systems (see "Vector clock algorithm in distributed systems").
First briefly describe the vector clock algorithm:
The vector clock algorithm is used to establish a partial order of events in a distributed system and to detect causality violations. A system contains N nodes, and the message body generated by each node carries that node's logical clock. The vector clock of the whole system is made up of the N logical clocks and is transmitted in the message body generated by each node.
Simply put, the vector clock algorithm works as follows:
- 1) In the initial state, every entry of the vector is 0;
- 2) Each time a node finishes processing a local event, it increments its own clock by 1;
- 3) Each time a node sends a message, it attaches the system vector clock, including its own clock, to the message;
- 4) Each time a node receives a message, it updates its vector clock: its own entry is incremented by 1, and for every other node's entry it compares the value held locally with the value in the message body and takes the maximum;
- 5) When a node receives multiple messages at the same time, it checks whether a partial order relationship exists between the vector clocks of the received messages.
Regarding point 5) above:
- 1) If there is a partial order relationship, the vector clocks are merged and the larger vector clock under that order is kept;
- 2) If there is no partial order relationship, they cannot be merged.
Partial order relationship: if every dimension of vector A is greater than or equal to the corresponding dimension of vector B (or vice versa), then A and B have a partial order relationship; otherwise they have none, and the two events are concurrent.
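The rules above can be sketched in a few functions. As an assumption made for illustration, a clock is represented as a dict keyed by node name rather than a fixed N-dimensional array (a missing key counts as 0); the function names `tick`, `merge`, `on_receive`, and `compare` are hypothetical.

```python
from typing import Optional


def tick(clock: dict[str, int], node: str) -> dict[str, int]:
    """Return a copy of clock with node's own entry incremented by 1."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(local: dict[str, int], remote: dict[str, int]) -> dict[str, int]:
    """Take the entry-wise maximum of two clocks."""
    return {n: max(local.get(n, 0), remote.get(n, 0)) for n in local.keys() | remote.keys()}


def on_receive(local: dict[str, int], remote: dict[str, int], node: str) -> dict[str, int]:
    """Receive rule: merge with the clock in the message, then bump own entry."""
    return tick(merge(local, remote), node)


def compare(a: dict[str, int], b: dict[str, int]) -> Optional[int]:
    """Return -1 if a precedes b, 1 if a follows b, 0 if equal, None if concurrent."""
    keys = a.keys() | b.keys()
    a_ge_b = all(a.get(k, 0) >= b.get(k, 0) for k in keys)
    b_ge_a = all(b.get(k, 0) >= a.get(k, 0) for k in keys)
    if a_ge_b and b_ge_a:
        return 0
    if a_ge_b:
        return 1
    if b_ge_a:
        return -1
    return None  # no partial order relationship between a and b
```

For example, if node A ticks to `{"A": 1}` and node B then receives that clock, B ends up with `{"A": 1, "B": 1}`, which follows A's clock; a clock `{"C": 1}` produced independently by node C is concurrent with both.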
When an IM system sorts chat messages, it is really handling the context of those messages and determining the causal relationships between them.
Borrowing from the vector clock algorithm: assume the conversation has N parties. The system vector clock then consists of N clocks, one per party. It is carried in the body of every message the parties send, and messages are sorted according to it.
The specific implementation idea is:
- 1) The system vector clock is initialized to the N-dimensional zero vector (0, 0, …, 0);
- 2) When a node sends a message, it updates the system vector clock: its own clock is incremented by 1, and the other nodes' entries remain unchanged;
- 3) When a node receives a message, it updates the system vector clock: its own clock is incremented by 1, and for every other node's entry it compares the value held locally with the value in the message and takes the maximum;
- 4) The order of messages is determined by the partial order relationship of the system vector clocks in the message bodies.
Regarding point 4) above:
- 1) If the partial order relationship can be determined, messages are displayed from smallest to largest according to it;
- 2) If the partial order relationship between several messages cannot be determined, they are displayed in their natural order (the order in which they were received).
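One possible way to realize this display rule is sketched below: messages are given in arrival order, each paired with the vector clock from its body, and at every step the earliest-arrived message that no other pending message precedes is emitted. The names `happened_before` and `display_order` are hypothetical, clocks are assumed to be dicts keyed by party name (a missing key counts as 0), and the quadratic loop is chosen for clarity rather than efficiency.

```python
def happened_before(a: dict[str, int], b: dict[str, int]) -> bool:
    """True if clock a is strictly smaller than clock b in the partial order."""
    keys = a.keys() | b.keys()
    no_entry_larger = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_entry_smaller = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_entry_larger and some_entry_smaller


def display_order(
    messages: list[tuple[str, dict[str, int]]]
) -> list[tuple[str, dict[str, int]]]:
    """Order (body, clock) pairs causally, falling back to arrival order.

    `messages` must be in the order the messages were received.
    """
    remaining = list(messages)
    ordered = []
    while remaining:
        for i, (_, clock) in enumerate(remaining):
            # A message can be shown once nothing still pending precedes it.
            blocked = any(
                happened_before(other, clock)
                for j, (_, other) in enumerate(remaining)
                if j != i
            )
            if not blocked:
                ordered.append(remaining.pop(i))
                break
    return ordered
```

Because the scan always starts from the earliest-arrived pending message, messages whose clocks are not comparable keep the order in which they were received, while a reply that arrived before the message it answers is moved after it.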
In theory the vector clock can solve most message consistency problems, but the implementation must take the actual user experience into account.
One of the most important questions is whether to force sorting: if the order actually displayed disagrees with the partial order given by the vector clocks, should messages be moved?
An example: in a multi-person conversation, one party has a very slow connection and can neither receive nor send messages. After the last message he saw, the others started a new topic. His message about the previous topic is then finally sent and received by the others.
The question now is: should this message about the previous topic be displayed at the end, or moved to an earlier position?
- 1) If it is displayed at the end, its content is unrelated to the current topic, and the others may find it baffling;
- 2) If it is moved to an earlier position, the others may never see it, or may find it jarring when an extra message appears further up.
IM has many scenarios, and they are complicated. More often than not, the issue has to be considered from a product perspective.
On whether messages need to be sorted, only a general suggestion is offered here: do not force sorting in the live conversation, and sort the conversation history according to the partial order relationship of the vector clocks.
6. Summary of this article
To address IM message reliability and consistency, this article uses a message retransmission mechanism to ensure that the server receives each message, and a session record check to ensure that the receiver's messages are complete; together these ensure the reliability of the whole sending process. For consistency, it offers a solution based on uuid message de-duplication and on message sorting modeled on the vector clock algorithm.
In short, systems such as IM look simple but run very deep. If you are new to IM development, start with "One Entry Is Enough for Novices: Develop Mobile IM from Scratch". If you consider yourself an IM veteran, the articles on large-scale IM architecture design compiled here may be a useful reference.
7. Reference materials
[1] Introduction to zero-based IM development (3): What is the reliability of the IM system?
[2] Introduction to zero-based IM development (4): What is the message timing consistency of the IM system?
[3] Implementation of IM message delivery guarantee mechanism (1): Guarantee reliable delivery of online real-time messages
[4] Implementation of IM message delivery guarantee mechanism (2): Guarantee reliable delivery of offline messages
[5] How to ensure the "timing" and "consistency" of IM real-time messages?
[6] Discussion on a low-cost method to ensure the timing of IM messages
[7] The IM group chat message is so complicated, how to ensure that it is not lost or repeated?
[8] How to design a "failure retry" mechanism for a completely self-developed IM?
[9] IM development and dry goods sharing: how to elegantly realize the reliable delivery of a large number of offline messages
[10] Talk about the message reliability and delivery mechanism of the mobile terminal IM from the perspective of the client
[11] A set of IM architecture technical dry goods for hundreds of millions of users (Part 2): reliability, orderliness, weak network optimization, etc.
[12] From novice to expert: How to design a distributed IM system with billions of messages
This article has been simultaneously published on the official account of "Instant Messaging Technology Circle".