Instant Messaging Security (11): End-to-end encryption technology for transmission content of IM chat system security means

This article is shared by the Rongyun technical team. The original title is "End-to-end encryption technology for Internet communication security"; the content has been substantially revised.

1. Introduction

The previous article, "Communication Connection Layer Encryption Technology of IM Chat System Security Means", covered techniques and practices for encrypting the communication connection layer, including enabling TLS link encryption when transmitting instant messaging messages (so that messages cannot be eavesdropped on or tampered with before they reach the server) and using a CA authentication mechanism (to prevent man-in-the-middle attacks). This article focuses on the security of the transmitted IM content itself and, based on practical experience, introduces the "end-to-end" encryption technology used in instant messaging applications.

Learning and discussion:

Introductory article on mobile IM development: "One entry is enough for beginners: developing mobile IM from scratch"
Open source IM framework source code: https://github.com/JackJiang2011/MobileIMSDK

(This article has been published simultaneously at: http://www.52im.net/thread-4026-1-1.html )

2. Series of articles

This article is the 11th in a series of articles on IM communication security knowledge. The general catalogue of this series is as follows:
"Instant Messaging Security (1): Correctly Understand and Use Android-side Encryption Algorithms"
"Instant Messaging Security (2): Discussing the Application of Combined Encryption Algorithms in IM"
"Instant Messaging Security (3): Explanation of Common Encryption and Decryption Algorithms and Communication Security"
"Instant Messaging Security (4): Case Analysis of the Risk of Hard-coding Keys in Android"
"Instant Messaging Security (5): Application Practice of Symmetric Encryption Technology on Android Platform"
"Instant Messaging Security (6): Principles and Application Practices of Asymmetric Encryption Technology"
"Instant Messaging Security (7): If you understand the principle of HTTPS in this way, one article is enough"
"Instant Messaging Security (8): Do you know whether HTTPS uses symmetric encryption or asymmetric encryption?"
"Instant Messaging Security (9): Why Use HTTPS? Explain in simple language, explore the security of short connections"
"Instant Messaging Security (10): Communication Connection Layer Encryption Technology of IM Chat System Security Means"
"Instant Messaging Security (11): End-to-End Encryption Technology for Transmission Content of IM Chat System Security Means" (* this article)

3. Why do we need end-to-end encryption?

The connection layer encryption described in the previous article improves the security of data in transit between the IM client and the server, but it does not address the privacy and security risks of communication between users. Once the data reaches the server, anyone with access to the server, including employees, suppliers and other related personnel (or even attackers who have compromised it), may be able to read users' data. For this reason, end-to-end encryption is widely used in instant messaging, for example in WhatsApp, Signal, Telegram and other instant messaging software.

PS: The basic knowledge about end-to-end encryption can be obtained from these two articles. It is recommended to read:
"The Weapon of Mobile Secure Communication - Detailed Explanation of End-to-End Encryption (E2EE) Technology"
"A brief description of the working principle of end-to-end encryption (E2EE) in real-time audio and video chat"

4. Technical design ideas of end-to-end encryption

4.1 Simplified version of the idea

When it comes to end-to-end encryption, the first solution that comes to mind is this: the sender encrypts the entire message before sending it, and the receiver decrypts it after receiving it. With this approach, the message relay server cannot obtain the content of the message. This is indeed a simplified version of how messages are sent and received under end-to-end encryption, but schemes used in practice are more complex and more secure.

4.2 How to Safely Pass the Key for Message Encryption and Decryption

For end-to-end encryption, the prerequisite security problem we must solve first is how to securely pass the key used for message encryption and decryption.

The answer is to use asymmetric cryptography to convey the key (similar to how keys are exchanged securely in SSL/TLS). The ways of conveying a symmetric encryption key with asymmetric cryptography generally boil down to two:
1) Public key encryption based on RSA, ECC, etc. (encrypt with the public key and decrypt with the private key, which is essentially an encryption and decryption algorithm);
2) Generating a shared key based on DH or ECDH (essentially, both sides arrive at a common key through computation rather than running an encryption and decryption algorithm).

In fact, most end-to-end encryption in instant messaging software uses the shared key generation method to establish the session key. Why is this? The answer involves the DH algorithm (i.e. the Diffie-Hellman key exchange algorithm). For details on the DH algorithm, see "Diffie-Hellman Key Agreement Algorithm" if you are interested; due to space limitations, it is not discussed in depth here.

The security of the Diffie-Hellman key exchange algorithm relies on the fact that modular exponentiation with a prime modulus is relatively easy to compute, whereas computing the discrete logarithm is hard. For sufficiently large primes, computing the discrete logarithm is considered computationally infeasible. The DH shared key process is briefly described as follows:

(where "key S" is the final shared key)
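The exchange described above can be sketched in a few lines of Python. This is only a toy illustration: the modulus `P`, the base `G` and the helper names `dh_keypair` and `dh_shared` are assumptions made for this example, and the 127-bit prime is far too small for real use.

```python
import secrets

# Toy public parameters (assumptions for illustration only).
# Real deployments use DH groups of at least 2048 bits.
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # public base

def dh_keypair():
    """Return (private, public), where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2   # random integer in [2, P-2]
    public = pow(G, private, P)
    return private, public

def dh_shared(own_private, peer_public):
    """Compute the shared key S = peer_public^own_private mod P."""
    return pow(peer_public, own_private, P)

a_priv, a_pub = dh_keypair()   # Alice keeps a_priv, publishes a_pub
b_priv, b_pub = dh_keypair()   # Bob keeps b_priv, publishes b_pub

s_alice = dh_shared(a_priv, b_pub)   # Alice's view of key S
s_bob = dh_shared(b_priv, a_pub)     # Bob's view of key S
print(s_alice == s_bob)              # True
```

Only the public values cross the network; each side combines the other's public value with its own private value and arrives at the same key S.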

4.3 Reasons for using a shared key

There are several reasons for using a shared key to establish session keys in end-to-end encryption:
1) If the key is transmitted using public key encryption and private key decryption (RSA, ECC, etc.), a temporary key must be generated when the session is created, encrypted with the other party's public key, and sent to the receiver. This requires the message to be delivered reliably: if it is lost or corrupted at any step, subsequent communication cannot proceed. Otherwise a more reliable transmission scheme must be adopted; the usual practice is to require the receiver to be online and to use various acknowledgements to guarantee reliability. With the shared key method, you only need to know the other party's public key to generate the shared key, and the other party does not need to be online.

2) If the ephemeral symmetric key that has been generated is lost, the key needs to be renegotiated. In the shared key method, only the public key of the other party needs to be known to complete the generation of the shared key without renegotiation.

3) The method of using public key encryption and private key decryption will at least have one more communication process of symmetric key exchange than the method of generating a shared key.

4) The key negotiation method can not only complete the key negotiation between two points, but also can be extended to the common negotiation of the same key among multiple people, which can meet the needs of multi-person group communication.

5. Preliminary practice plan of end-to-end encryption

Building on the DH algorithm (i.e. the Diffie-Hellman key exchange algorithm), a shared key method in which public keys can be freely disclosed, we first design a simple end-to-end message encryption process.

The logical flow of this process is as follows:
1) When the client app is installed for the first time, it generates its own DH public and private keys based on the two global parameters published by the server;
2) The client uploads its public key to the certificate server, which stores the mapping between the user ID and the public key. The private key stays on the client;
3) When sending a message to the other party for the first time, or receiving the other party's message for the first time, the client queries the certificate server for the other party's public key;
4) The client computes the shared key from the other party's public key and its own private key;
5) All subsequent messages exchanged with the other party are encrypted and decrypted with this key, using the same symmetric encryption algorithm on both sides.

The process of end-to-end message encryption is as follows:

So far, we have completed a simple end-to-end message encryption scheme. It introduces a third-party role that stores users' public keys. Thanks to this role, either party can send encrypted messages to the other at any time without caring whether the other party is online, while the message forwarding server cannot decrypt the messages. Next, we will analyze the security risks in this simple scheme one by one and optimize it step by step.
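The five-step flow of this preliminary scheme can be sketched roughly as follows. The `KeyDirectory` and `Client` classes, the toy parameters `P` and `G`, and the use of SHA-256 to turn the shared secret into a 32-byte key are all assumptions made for illustration; the symmetric encryption of step 5 is left out.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (assumption; far too small for real use)
G = 3            # toy public base (assumption)

class KeyDirectory:
    """Hypothetical stand-in for the certificate server: user ID -> public key."""
    def __init__(self):
        self._public_keys = {}

    def upload(self, user_id, public_key):
        self._public_keys[user_id] = public_key

    def lookup(self, user_id):
        return self._public_keys[user_id]

class Client:
    def __init__(self, user_id, directory):
        self.user_id = user_id
        self.directory = directory
        # Steps 1-2: generate a key pair on first install, upload the public half.
        self._private = secrets.randbelow(P - 3) + 2
        directory.upload(user_id, pow(G, self._private, P))
        self._session_keys = {}

    def session_key(self, peer_id):
        # Steps 3-4: query the peer's public key once, derive and cache the key.
        if peer_id not in self._session_keys:
            peer_public = self.directory.lookup(peer_id)
            shared = pow(peer_public, self._private, P)
            # Hash the shared secret into a 32-byte key for the symmetric cipher.
            self._session_keys[peer_id] = hashlib.sha256(
                shared.to_bytes(16, "big")).digest()
        return self._session_keys[peer_id]

directory = KeyDirectory()
alice = Client("alice", directory)
bob = Client("bob", directory)

key_ab = alice.session_key("bob")    # needs only Bob's stored public key
key_ba = bob.session_key("alice")
print(key_ab == key_ba)              # True
```

Alice derives the key from the directory entry alone, which is why the receiver does not have to be online when the first message is sent.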

6. Further optimization and evolution of the end-to-end encryption practice solution

6.1 Using HMAC as the message integrity authentication algorithm

During message transmission, both parties need to confirm the integrity of each other's messages. The simple approach is to hash the message and append the resulting hash value to the message before sending; the receiving end hashes the received message again and compares the values to check whether the message has been tampered with. This relies on the property that different data should, in practice, produce different hash values. However, a plain hash function always yields the same value for the same data, so anyone who alters the message can simply recompute the hash. To avoid this, an additional key known to both peers is mixed in, so that different keys yield different values; a hash computed with a key in this way is a MAC. This gives us a message integrity authentication algorithm based on a cryptographic hash: the Hash-based MAC (HMAC for short).

Basic Knowledge 1: What is the MAC Algorithm?
MAC stands for Message Authentication Code. In cryptography, a MAC is an authentication mechanism used by both communicating parties and a tool for ensuring the integrity of message data. The security of the MAC algorithms discussed here depends on the underlying hash function, which is why such a MAC is also described as a hash function with a key. The message authentication code is a value computed from the key and the message, and it can be used for data origin authentication and integrity verification.

The specific process for verifying message integrity using MAC is:
1) Assuming that both parties A and B share the key K, A uses the message authentication code algorithm to calculate the message authentication code Mac from K and message M, and then sends Mac and M to B together;
2) After receiving Mac and M, B uses M and K to calculate a new verification code Mac'. If Mac and Mac' are equal, the verification succeeds, which proves that the message has not been tampered with. Since the attacker does not have the key K, the attacker cannot compute the matching message authentication code after modifying the message, so B can detect that the integrity of the message has been damaged.

In a nutshell:
1) The sender calculates the MAC value of the message through the MAC algorithm, and sends it to the recipient together with the message;
2) The recipient uses the same MAC algorithm to calculate the MAC value of the received message and compares the two.

The following figure shows the principle:

Basic Knowledge 2: What is the HMAC algorithm?
HMAC is one kind of MAC algorithm, built on a cryptographic hash function. Any cryptographic hash, such as MD5 or SHA256, can be used to implement HMAC, and the resulting algorithms are called HMAC-MD5, HMAC-SHA256, etc.
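As a small illustration of the MAC verification process above, the following sketch uses HMAC-SHA256 from Python's standard `hmac` module. The key and message values and the helper names `make_tag` and `verify` are made up for this example.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute the MAC of the message with the shared key K."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the MAC and compare in constant time."""
    expected = make_tag(key, message)
    return hmac.compare_digest(expected, tag)

shared_key = b"key-known-to-A-and-B"      # placeholder shared key K
message = b"transfer 100 to Bob"          # message M
tag = make_tag(shared_key, message)       # sent along with M

print(verify(shared_key, message, tag))                  # True
print(verify(shared_key, b"transfer 900 to Bob", tag))   # False
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading bytes of the tag matched.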

6.2 Replacing DH Algorithm with ECDH Algorithm

The DH algorithm is based on the mathematical problem of discrete logarithms. As computing power grows, ever larger numbers must be used to keep it hard to break; it is currently generally believed in the industry that DH needs at least 2048 bits for adequate security. Here we introduce the ECDH algorithm to replace the DH algorithm. The ECDH key agreement algorithm combines the ECC algorithm with the DH key exchange principle. ECC is a cryptosystem based on the elliptic curve discrete logarithm problem. For the same difficulty of cracking, ECC offers shorter keys and faster computation. In our system, ECDH can directly use publicly specified curves such as secp256k1 and Curve25519, without requiring the server to provide public large-number parameters.

6.3 Improve forward security

During message transmission, if the negotiated key is leaked, all messages are exposed. To prevent this, each encryption must use a key different from the previous one, and it must not be possible to derive earlier keys from a later key. A hash algorithm is introduced here: given a key as input, it derives another, seemingly random key. Each time a message is sent, the previous message key is hashed to obtain the current key. Because the hash algorithm is one-way and irreversible, the previous key cannot be derived from the current key. Figuratively, this works like a ratchet, a special kind of gear that can only turn in one direction and never back.

Let's first get an intuitive feel for the ratchet:

Technically, the property that it "can only turn in one direction and never back" is the key to achieving forward security. It ensures that even if the key of a certain round is cracked, earlier keys cannot be computed, so earlier messages cannot be decrypted.
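A minimal sketch of such a hash ratchet is shown below. SHA-256 is an assumed choice of hash, and the starting value is a placeholder for the negotiated key; the point is only that each key is the hash of the previous one.

```python
import hashlib

def ratchet(key: bytes) -> bytes:
    """One ratchet step: the next key is the hash of the current key."""
    return hashlib.sha256(key).digest()

# Placeholder for the negotiated shared key (assumption for illustration).
key = hashlib.sha256(b"negotiated shared key (placeholder)").digest()

message_keys = []
for _ in range(3):
    key = ratchet(key)          # turn the ratchet once per message
    message_keys.append(key)

# Going forward is easy: key 2 is just the hash of key 1.
print(message_keys[1] == ratchet(message_keys[0]))   # True
# Going backward would require inverting SHA-256, for which there is
# no function to call.
```

Someone who learns the key of round N can still compute the keys of rounds N+1, N+2 and so on, which is the gap that backward security has to close.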

6.4 Ensure both forward security and backward security

For the strictest security requirements, both forward security and backward security must be considered: in any given communication, a compromised key must not be able to decrypt earlier messages, and after a certain period the compromised key must stop working for later messages as well. To achieve this we introduce a second ratchet that provides backward security. This is the double ratchet algorithm of the well-known Signal protocol. The Signal protocol is a true end-to-end encrypted communication protocol, often described as one of the most secure communication protocols in the world: no third party, including the server, can view the communication content. The double ratchet algorithm consists of a KDF ratchet and a DH ratchet.

A KDF (key derivation function) is used to derive one or more keys from an original key. It is essentially built on a hash function and is often used to stretch short passwords into longer keys. In addition, a KDF takes a "salt" to defend against rainbow tables. The salt should be random and sufficiently long; a common recommendation is a salt as long as the hash output.

KDF (original key, salt) = derived key

The KDF ratchet uses the KDF algorithm to make the key change with every step. The process is as follows:

First, the KDF algorithm derives a new key from the initial key. The new key is cut into two halves: the first half is used as the input for the next KDF calculation, and the second half is used as the message key. Each iteration (that is, each ratchet step) produces a new message key. Because the KDF is one-way, the key of the previous message cannot be derived from the key of the current message, which ensures the forward security of the key. However, if an attacker who has cracked a key also knows the salt used in the KDF, all future message keys can be computed with this algorithm.
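The KDF ratchet step can be sketched as follows. The KDF here is a stand-in chosen for brevity (HMAC-SHA512 keyed with the salt, which yields 64 bytes that split into two 32-byte halves); real protocols typically use HKDF, and the names `kdf` and `kdf_ratchet_step` are assumptions.

```python
import hashlib
import hmac

def kdf(chain_key: bytes, salt: bytes) -> bytes:
    """Toy KDF (assumption): HMAC-SHA512 keyed with the salt, 64 bytes out."""
    return hmac.new(salt, chain_key, hashlib.sha512).digest()

def kdf_ratchet_step(chain_key: bytes, salt: bytes):
    """Split the KDF output: first half feeds the next step, second half
    is the message key for this step."""
    output = kdf(chain_key, salt)
    next_chain_key, message_key = output[:32], output[32:]
    return next_chain_key, message_key

chain_key = bytes(32)              # placeholder initial key
salt = b"fixed-salt-for-demo"      # constant salt, as in the text above

message_keys = []
for _ in range(3):
    chain_key, mk = kdf_ratchet_step(chain_key, salt)
    message_keys.append(mk)

print(len(set(message_keys)))      # 3 distinct message keys
```

With a constant salt the whole chain is deterministic, so anyone holding a chain key and the salt can replay every later step; the DH ratchet exists to replace this constant with a fresh value.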

To ensure backward security, the salt introduced in each iteration should be random, so that an attacker who holds one key cannot compute the message keys that follow. From the DH algorithm introduced above, we know that two key pairs can produce a secure negotiated key through the DH protocol, and if one of the key pairs is replaced, the negotiated key changes as well.

Based on this, we can devise a way to update the salt securely. We add a temporary public key certificate to the certificate server. This temporary certificate is a temporary key pair created for each receiving party's identifier, that is, each person has a temporary public key for each individual session. In every message round, one party updates its temporary key pair, runs the negotiation using the other party's temporary public key and its own private key, and uses the negotiated key as the salt, so that the message keys generated by the KDF ratchet have backward security.

At the beginning, we cannot predict all the new two-person sessions a person will have. We can therefore stipulate the following for creating a new two-person session. First, the initiator generates a new temporary DH key pair and uploads its temporary DH public key to the server. Second, the sender uses the long-term public key published by the receiver and its own temporary private key to negotiate a key, and uses it as the message key to encrypt the message. Finally, on receiving a message for the first time, the receiver computes the message key from its own long-term private key and the sender's temporary public key; when it replies for the first time, it generates its own temporary key pair and uploads the temporary public key. The problem is this: if the receiver is offline and the sender updates its temporary public key certificate for every message, the messages already sent cannot be decrypted normally once the receiver comes online and receives them.

In order to solve this problem, we need to stipulate that the temporary certificate will only be updated after sending a message and getting a reply from the other party. If the other party does not reply to the message, the temporary certificate will not be updated. If the receiving end can reply to the message, it means that it has been online and has received the message, so as to ensure that the offline message or the out-of-order message can also be parsed by the other party normally.

This method is the other ratchet of the double ratchet algorithm: the DH ratchet.
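The idea of the DH ratchet can be illustrated with a rough sketch: whenever one side replies, it replaces its temporary key pair, and both sides mix the fresh DH result into their root key. Everything below (the toy group `P`/`G`, the helpers `keypair`, `dh` and `mix`, and HMAC-SHA256 as the mixing step) is an assumption for illustration and not the Signal implementation.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1   # toy prime modulus (assumption; far too small for real use)
G = 3            # toy public base (assumption)

def keypair():
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)

def dh(own_priv, peer_pub):
    return pow(peer_pub, own_priv, P).to_bytes(16, "big")

def mix(root_key, dh_output):
    """Use the fresh DH output as the salt of a toy KDF (HMAC-SHA256)."""
    return hmac.new(dh_output, root_key, hashlib.sha256).digest()

root = bytes(32)                    # placeholder for the key from session setup
alice_priv, alice_pub = keypair()   # Alice's current temporary key pair
bob_priv, bob_pub = keypair()       # Bob's current temporary key pair

# Bob replies: he replaces his temporary key pair, then both sides re-derive.
bob_priv, bob_pub = keypair()
root_bob = mix(root, dh(bob_priv, alice_pub))
root_alice = mix(root, dh(alice_priv, bob_pub))

# Alice replies in turn with a fresh temporary key pair of her own.
alice_priv, alice_pub = keypair()
root_alice2 = mix(root_alice, dh(alice_priv, bob_pub))
root_bob2 = mix(root_bob, dh(bob_priv, alice_pub))

print(root_alice == root_bob, root_alice2 == root_bob2)   # True True
```

Because each step depends on a private key that did not exist earlier, an attacker holding an old root key cannot follow the chain past the next key replacement.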

6.5 A more secure key exchange protocol - X3DH
Compared with the original scheme, in order to achieve forward and backward security of messages, we added the double ratchet algorithm and, on top of the basic scheme, a set of session-level temporary DH keys for each person, so that everyone has a long-term key and a set of ephemeral keys. However, since the long-term key cannot be replaced, the scheme still carries security risks. The Signal protocol therefore designs a more complex and more secure DH key exchange process called X3DH (Extended Triple Diffie-Hellman).

In the X3DH protocol, everyone has to create three kinds of key pairs:
1) Identity Key Pair: a long-term key pair that conforms to the DH protocol, created during user registration and bound to the user's identity;
2) Signed pre-shared key (Signed Pre Key): a medium-term key pair that conforms to the DH protocol, created when the user registers, signed by the identity key, and rotated periodically; this key can be regarded as protecting the identity key from being exposed;
3) One-Time Pre-Shared Keys (One-Time Pre Keys): a queue of Curve25519 key pairs for one-time use, generated during installation and replenished when running low.

Everyone uploads the public keys of these three kinds of key pairs to the server for others to use when initiating a session.

If Alice wants to send a message to Bob, she must first determine the message key with Bob. The process is roughly as follows:
1) Alice creates a temporary key pair (ephemeral key), which we denote EPK-A; it takes part in the calculations below and is also prepared for the later ratchet algorithm;
2) Alice obtains the public keys of Bob's three key pairs from the server: the identity key IPK-B, the signed pre-shared key SPK-B, and a one-time pre-shared key OPK-B;
3) Alice uses the DH protocol to compute the negotiated key. The inputs are the private keys of her own two key pairs (identity and ephemeral) and Bob's three public keys. Her own private keys and the other party's public keys are then fed into the DH algorithm in the following combinations:

DH1 = DH(IPK-A, SPK-B)
DH2 = DH(EPK-A, IPK-B)
DH3 = DH(EPK-A, SPK-B)
DH4 = DH(EPK-A, OPK-B)

as the picture shows:

Then concatenate the four computed values in order to get the initial key, as follows:

DH = DH1 || DH2 || DH3 || DH4

Note: "||" represents a connector, such as 456 || 123 = 456123

However, this DH value is too long to be used as a message key, so a KDF is applied to the initial key to derive a fixed-length message key S:

S = KDF (DH1 || DH2 || DH3 || DH4)

With this step, Alice has finally computed the message key S.

Then:
1) Alice uses the message key S to encrypt the message, and sends it to Bob together with her own identity public key IPK-A and temporary public key EPK-A;
2) After Bob receives Alice's message, he takes out Alice's two public keys and, together with his own private keys, uses the same algorithm as Alice to compute the message key S;
3) Bob and Alice use the message key for encrypted communication.
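To see why both sides end up with the same S, the calculation can be sketched with a toy DH group in place of Curve25519. This is a simplified illustration under stated assumptions: the group parameters, the helper names, and SHA-256 as the KDF are made up for the example, verification of the signature on Bob's signed pre key is omitted, and DH4 is computed from Alice's ephemeral key and Bob's one-time key as in the published X3DH specification.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (assumption; real X3DH uses Curve25519)
G = 3            # toy public base (assumption)

def keypair():
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)

def dh(own_priv, peer_pub):
    return pow(peer_pub, own_priv, P).to_bytes(16, "big")

def kdf(material: bytes) -> bytes:
    """Toy KDF (assumption): SHA-256 over the concatenated DH outputs."""
    return hashlib.sha256(material).digest()

# Bob's three key pairs; the public halves are uploaded to the server.
ik_b, IPK_B = keypair()     # identity key
spk_b, SPK_B = keypair()    # signed pre key (signature check omitted here)
opk_b, OPK_B = keypair()    # one-time pre key

# Alice's identity key pair and her fresh ephemeral key pair.
ik_a, IPK_A = keypair()
ek_a, EPK_A = keypair()

# Alice: her private keys combined with Bob's public keys.
s_alice = kdf(dh(ik_a, SPK_B)      # DH1
              + dh(ek_a, IPK_B)    # DH2
              + dh(ek_a, SPK_B)    # DH3
              + dh(ek_a, OPK_B))   # DH4

# Bob: his private keys combined with the public keys Alice sent.
s_bob = kdf(dh(spk_b, IPK_A)       # DH1
            + dh(ik_b, EPK_A)      # DH2
            + dh(spk_b, EPK_A)     # DH3
            + dh(opk_b, EPK_A))    # DH4

print(s_alice == s_bob)            # True
```

Each DHn value is symmetric in the two key pairs involved, so Bob can reproduce all four from his private keys plus the two public keys Alice attaches to her first message.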

It can be seen from the above: X3DH is actually a complex version of the DH protocol.

So far, we have briefly introduced the two core parts of the Signal Protocol, the X3DH protocol and the double ratchet algorithm, which together can basically satisfy forward security and backward security. Of course, the real process is more complex and more secure.

7. End-to-end encryption scheme for IM group chat

In the instant messaging scene, in addition to the chat between two people, another important scene is the group chat. How to do end-to-end encryption for the multi-person messages in the group chat?

Let's go back to the derivation process of the DH key agreement algorithm again: Obviously, the DH key agreement algorithm can still be used in the case of multiple parties, which is the basis of end-to-end encryption in group chats.
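As a worked example of the multi-party case, three members can agree on one key by passing intermediate values around a ring. The sketch below uses toy parameters `P` and `G` and three made-up members; it only demonstrates that all three arrive at G^(abc) mod P.

```python
import secrets

P = 2**127 - 1   # toy prime modulus (assumption; far too small for real use)
G = 3            # toy public base (assumption)

# Private keys of Alice, Bob and Carol.
a, b, c = (secrets.randbelow(P - 3) + 2 for _ in range(3))

# Round 1: each member publishes G^x mod P and passes it to the next member.
ga, gb, gc = pow(G, a, P), pow(G, b, P), pow(G, c, P)

# Round 2: each member raises the value it received to its own private key
# and passes the result on again.
gab = pow(ga, b, P)   # computed by Bob, sent to Carol
gbc = pow(gb, c, P)   # computed by Carol, sent to Alice
gca = pow(gc, a, P)   # computed by Alice, sent to Bob

# Final step: each member applies its private key to the last value received.
key_alice = pow(gbc, a, P)
key_bob = pow(gca, b, P)
key_carol = pow(gab, c, P)

print(key_alice == key_bob == key_carol)   # True
```

The number of rounds grows with the group size, which is one practical reason a group protocol may prefer a different design.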

The design of Signal Protocol for group chat differs from that for two-person chat. Because group chat has relatively lower confidentiality requirements, only KDF chain ratchet + public key signature is used for encrypted communication, which ensures the forward security of encryption.

The encryption and decryption communication process of group chat is as follows:
1) Each group member first generates a random 32-byte KDF Chain Key, which is used to derive message keys and ensures their forward security, and also generates a random Curve25519 signing key pair used for message signing;

2) Each group member encrypts and sends the Chain Key and the signature public key to other members individually. At this point, each member has the chain key and signature public key of all members in the group;

3) When a member sends a message, it first encrypts the message with the message key generated by the KDF chain ratchet algorithm, then signs it with the private key, and then sends the message to the server, and the server sends it to other members;

4) After receiving the encrypted message, other members first use the sender's signature public key to verify, and after the verification is successful, use the corresponding chain key to generate the message key, and decrypt it with the message key;

5) When a group member leaves, all group members clear their own chain key and signature public key, regenerate it, and send it to each member individually.

By doing so, the members who leave will not be able to view the messages in the group.
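The chain key part of this group flow can be sketched as follows. This is a simplified model under stated assumptions: the `Member` class, the `next_keys` and `rekey` helpers and the HMAC-SHA256 chain are made up for the example, chain keys are copied directly instead of being sent over pairwise encrypted sessions, and message signing is omitted because the Python standard library has no Curve25519 signatures.

```python
import hashlib
import hmac
import secrets

def next_keys(chain_key: bytes):
    """Toy chain ratchet (assumption): HMAC-SHA256 with two fixed labels."""
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    next_chain = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    return next_chain, message_key

class Member:
    def __init__(self, name):
        self.name = name
        self.own_chain = secrets.token_bytes(32)   # step 1: random chain key
        self.peer_chains = {}                      # sender name -> chain key

    def share_with(self, others):
        # Step 2: in a real system this travels over pairwise encrypted sessions.
        for other in others:
            other.peer_chains[self.name] = self.own_chain

    def key_for_sending(self):
        self.own_chain, message_key = next_keys(self.own_chain)
        return message_key

    def key_for_receiving(self, sender_name):
        chain = self.peer_chains[sender_name]
        self.peer_chains[sender_name], message_key = next_keys(chain)
        return message_key

def rekey(remaining):
    # Step 5: after someone leaves, everyone regenerates and redistributes.
    for member in remaining:
        member.own_chain = secrets.token_bytes(32)
        member.peer_chains = {}
    for member in remaining:
        member.share_with([m for m in remaining if m is not member])

alice, bob, carol = Member("alice"), Member("bob"), Member("carol")
group = [alice, bob, carol]
for member in group:
    member.share_with([m for m in group if m is not member])

k1 = alice.key_for_sending()                   # steps 3-4 for one message
bob_k1 = bob.key_for_receiving("alice")
carol_k1 = carol.key_for_receiving("alice")

rekey([alice, bob])                            # Carol leaves the group
k2 = alice.key_for_sending()
bob_k2 = bob.key_for_receiving("alice")
carol_guess = next_keys(carol.peer_chains["alice"])[1]   # from her stale copy

print(k1 == bob_k1 == carol_k1, k2 == bob_k2, carol_guess == k2)
# True True False
```

Each member stores one chain per sender, so storage and key distribution grow with the group size.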

It can be seen from the above: a person in different groups will generate different chain keys and signature key pairs to ensure the isolation between groups.

In each group, each member also stores the KDF chain key and signature public key of every other member. If there are too many group members, the amount of encryption and decryption work becomes very large, which affects the speed of sending and receiving. At the same time, the key management database becomes very large and read efficiency drops. Therefore, when group chat uses the Signal Protocol, the number of group members should not be too large.

8. Supplementary explanation of end-to-end encryption scheme

Above we have introduced the entire process of end-to-end encryption for two-person chat and group chat in instant messaging. However, under normal circumstances, end-to-end message encryption only encrypts the actual payload part of the message (that is, only encrypts the "body" part of the message), and the control layer of the message will not be encrypted, because the message forwarding server needs to forward or route the message according to the control information. (Otherwise, it will definitely greatly affect the routing and communication efficiency of the bottom layer of IM, because repeated encryption and decryption are required).
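The split between the readable control layer and the encrypted body can be pictured with a small data structure. The field names, the `route` function and the connection table below are hypothetical and only illustrate that the server can route on control information without touching the body.

```python
from dataclasses import dataclass

@dataclass
class Envelope:
    # Control layer: readable by the message forwarding server.
    sender_id: str
    recipient_id: str
    message_id: int
    # Payload: end-to-end encrypted bytes, opaque to the server.
    body: bytes

def route(envelope: Envelope, online_connections: dict) -> str:
    """Hypothetical server-side routing that reads control fields only."""
    return online_connections.get(envelope.recipient_id, "offline-queue")

# The body is a stand-in for ciphertext produced on the sender's device.
env = Envelope("alice", "bob", 1, body=b"\x8f\x02\x5c\x11")

print(route(env, {"bob": "conn-42"}))   # conn-42
print(route(env, {}))                   # offline-queue
```

The control fields remain visible to the server in this design, which is the metadata exposure that link encryption is meant to limit.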

To prevent messages from being targeted for analysis (for example, working out when a user sent a message to whom, or who received a message), we still need to encrypt and protect the long connection link of the instant messaging system as a whole (this is the communication connection layer encryption technology discussed in the previous article), so that information cannot be intercepted and analyzed by intermediate network devices. In addition, to protect the key server against man-in-the-middle attacks, link encryption must be enabled for it as well.

9. References

[1] The weapon of mobile terminal security communication - detailed explanation of end-to-end encryption (E2EE) technology
[2] Briefly describe the working principle of end-to-end encryption (E2EE) in real-time audio and video chat
[3] HASH, MAC, HMAC learning
[4] An article to understand encryption and decryption, hash function, MAC, digital signature, certificate, CA, etc.
[5] Double ratchet algorithm: end-to-end encryption security protocol, detailed explanation of principle and process
[6] Signal protocol open source protocol understanding
[7] X25519 (Curve25519) elliptic curve reference material

JackJiang
