After DTLS negotiation completes, both sides of an RTC session have agreed on a MasterKey and a MasterSalt. In this article we continue by analyzing how WebRTC uses the exchanged key to encrypt RTP and RTCP for secure data transmission. Along the way, we answer questions that come up when using libsrtp, for example: What is ROC, and why is it 32 bits? Why does libsrtp return error_code=9 or error_code=10? Does the exchanged key have a lifetime, and if so, how long is it? Before reading this article, we recommend reading the DTLS negotiation article; the two work best together.

Author|Jinxue

Review|Taiyi

Problem to be solved

The RTP/RTCP protocol does not protect its payload in any way. An attacker who captures audio and video data with a packet capture tool such as Wireshark can play the stream back directly, which is a serious security problem.

In WebRTC, in order to prevent such things from happening, the RTP/RTCP protocol is not used directly, but the SRTP/SRTCP protocol, which is the secure RTP/RTCP protocol. WebRTC uses the very famous libsrtp library to convert the original RTP/RTCP protocol data into SRTP/SRTCP protocol data.

SRTP problems to be solved:

  • Encrypt the RTP/RTCP payload to ensure data security;
  • Ensure the integrity of RTP/RTCP packets, while preventing replay attacks.

SRTP/SRTCP structure

SRTP structure

As can be seen from the SRTP structure diagram:

  1. The encrypted part, Encrypted Portion, is composed of payload, RTP padding and RTP pad count. In other words, what we usually call SRTP encryption only encrypts the RTP payload data.
  2. The verified part, Authenticated Portion, is composed of RTP Header, RTP Header extension and Encrypted Portion.

Under normal circumstances, only the RTP payload data needs to be encrypted. If the RTP header extension needs to be encrypted, RFC6904 gives a detailed solution, which is also implemented in libsrtp.

SRTCP structure

As can be seen from the SRTCP structure diagram:

  1. The encrypted part, Encrypted Portion, is the part of the compound RTCP packet that follows the first RTCP Header.
  2. E-flag explicitly indicates whether the RTCP packet is encrypted. (PS: How can we tell whether an RTP packet is encrypted?)
  3. SRTCP index explicitly gives the sequence number of the RTCP packet and is used to prevent replay attacks. (PS: Can the 16-bit sequence number of an RTP packet prevent replay attacks?)
  4. The verified part, Authenticated Portion, is composed of RTCP Header and Encrypted Portion, followed by the E-flag and SRTCP index.
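
As a minimal sketch of items 2 and 3, assuming the layout defined in RFC3711 where the E-flag and the 31-bit SRTCP index share a single 32-bit word in network byte order, the word can be packed and parsed as follows. The function names are illustrative and do not come from libsrtp.

```python
import struct
from typing import Tuple

def pack_srtcp_index_word(encrypted: bool, index: int) -> bytes:
    """Build the 32-bit word: 1-bit E-flag followed by the 31-bit SRTCP index."""
    if not 0 <= index < 2**31:
        raise ValueError("SRTCP index must fit in 31 bits")
    return struct.pack("!I", (int(encrypted) << 31) | index)

def unpack_srtcp_index_word(word: bytes) -> Tuple[bool, int]:
    """Split the 32-bit word back into (E-flag, SRTCP index)."""
    (value,) = struct.unpack("!I", word)
    return bool(value >> 31), value & 0x7FFFFFFF
```

Because the index travels in every SRTCP packet, the receiver does not have to estimate it, unlike the SRTP packet index.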

With a basic understanding of the SRTP and SRTCP structures, we next introduce how the Encrypted Portion and the Authenticated Portion are produced.

Key management

In the SRTP/SRTCP protocol, the two-tuple <SRTP destination IP address, SRTP/SRTCP destination port> identifies an SRTP/SRTCP session of a communication participant, which is called the SRTP/SRTCP Session.

In the SRTP protocol, the triple <SSRC, RTP/RTCP destination address, RTP/RTCP destination port> identifies a stream, and an SRTP/SRTCP Session is composed of multiple streams. The set of encryption- and decryption-related parameters of each stream is called its Cryptographic Context.

Cryptographic Context of each stream contains the following parameters:

  • SSRC: SSRC used by Stream.
  • Cipher Parameter: key, salt, algorithm description (type, parameter, etc.) used for encryption and decryption.
  • Authentication Parameter: Key, salt, algorithm description (type, parameter, etc.) used for integrity.
  • Anti-Replay Data: data cached to prevent replay attacks, such as ROC, the maximum sequence number, etc.
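
To make the list concrete, here is a rough sketch of a per-stream context as a Python dataclass. All field names and defaults are hypothetical choices for illustration; libsrtp's internal structures are organized differently.

```python
from dataclasses import dataclass, field
from typing import Set

@dataclass
class CryptoContext:
    """Illustrative per-stream Cryptographic Context (not libsrtp's layout)."""
    ssrc: int                      # SSRC used by the stream
    cipher_key: bytes              # session encryption key
    cipher_salt: bytes             # session salt
    auth_key: bytes                # session authentication key
    auth_tag_bits: int = 80        # authentication tag length
    roc: int = 0                   # rollover counter (anti-replay data)
    s_l: int = 0                   # highest sequence number received so far
    replay_window_size: int = 64   # size of the sliding replay window
    received: Set[int] = field(default_factory=set)  # indices seen in the window
```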

In an SRTP/SRTCP Session, each Stream uses its own encryption/decryption key and authentication key. These keys are all used within the same Session and are called Session Keys. Session Keys are derived from the Master Key by a KDF (Key Derivation Function).

The KDF is the function used to derive the Session Keys; by default it uses the same cipher as encryption and decryption. For example, after DTLS completes, the negotiated SRTP protection profile is:

SRTP_AES128_CM_HMAC_SHA1_80
         cipher: AES_128_CM
         cipher_key_length: 128
         cipher_salt_length: 112
         maximum_lifetime: 2^31
         auth_function: HMAC-SHA1
         auth_key_length: 160
         auth_tag_length: 80

The corresponding KDF is AES128_CM. The Session Key derivation is shown in the following figure:

Session Key depends on the following parameters:

  • key_label: depends on the type of key being derived: 0x00 for the SRTP cipher key, 0x01 for the SRTP auth key, 0x02 for the SRTP salt key, and 0x03, 0x04, 0x05 for the corresponding SRTCP keys.
  • master_key: the key negotiated by DTLS.
  • master_salt: the salt negotiated by DTLS.
  • packet_index: the RTP/RTCP packet index. SRTP uses a 48-bit implicit packet index, and SRTCP uses a 31-bit explicit packet index. See the Serial number management section.
  • key_derivation_rate: the key derivation rate, denoted kdr. The default value is 0, which means the keys are derived only once. Otherwise the value is taken from {1, 2, 4, ..., 2^24}. When key_derivation_rate > 0, keys are derived once before the first packet is encrypted, and again whenever packet_index mod key_derivation_rate = 0.
r = packet_index / kdr
key_id = label || r
x = key_id XOR master_salt
key = KDF(master_key, x)
/: integer division; by convention, C = A/B = 0 when B = 0.
||: concatenation. A, B and C are in network byte order; if C = A||B, then the high-order bytes of C are A and the low-order bytes are B.
XOR: exclusive OR; the operands are aligned on their low-order bytes.
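
The first three steps of the derivation (everything before the AES call) can be sketched with plain integer arithmetic. This is only an illustration under the assumptions that r is 48 bits wide and the label is a single byte placed directly above it; the function name kdf_input is hypothetical, and the final AES-CM step is left out because it needs a cipher library.

```python
def kdf_input(master_salt: bytes, label: int, packet_index: int, kdr: int = 0) -> bytes:
    """Compute x = (label || r) XOR master_salt, aligned on the low-order bytes."""
    r = packet_index // kdr if kdr else 0   # A/B is defined as 0 when B = 0
    key_id = (label << 48) | r              # label || r, with r taken as 48 bits
    x = int.from_bytes(master_salt, "big") ^ key_id
    return x.to_bytes(len(master_salt), "big")

def aes_cm_first_block(x: bytes) -> bytes:
    """x * 2^16: append a 16-bit block counter that starts at zero."""
    return x + b"\x00\x00"

salt = bytes.fromhex("0EC675AD498AFEEBB6960B3AABE6")
print(kdf_input(salt, 0x02, 0).hex().upper())  # KDF input for the salt key
```

With the master_salt used in the worked example that follows and label 0x02, only the byte EB changes to E9, since the label is XORed into the seventh byte from the right.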

The following uses AES128_CM to illustrate the Session Key derivation, assuming that DTLS negotiated:

master_key:  E1F97A0D3E018BE0D64FA32C06DE4139   // 128-bits
master_salt: 0EC675AD498AFEEBB6960B3AABE6           // 112-bits

Export encryption key (cipher key):

packet_index/kdr:              000000000000
label:                       00
master_salt:   0EC675AD498AFEEBB6960B3AABE6
-----------------------------------------------
xor:           0EC675AD498AFEEBB6960B3AABE6     (x, KDF input)
x*2^16:        0EC675AD498AFEEBB6960B3AABE60000 (AES-CM input)
cipher key:    C61E7A93744F39EE10734AFE3FF7A087 (AES-CM output)

Export SALT Key (cipher salt):

packet_index/kdr:              000000000000
label:                       02
master_salt:   0EC675AD498AFEEBB6960B3AABE6
----------------------------------------------
xor:           0EC675AD498AFEE9B6960B3AABE6     (x, KDF input)
x*2^16:        0EC675AD498AFEE9B6960B3AABE60000 (AES-CM input)
               30CBBC08863D8C85D49DB34A9AE17AC6 (AES-CM output)
cipher salt:   30CBBC08863D8C85D49DB34A9AE1

Derive the authentication key (auth key); in this example the auth key is 94 bytes long:

packet_index/kdr:                000000000000
label:                         01
master salt:     0EC675AD498AFEEBB6960B3AABE6
-----------------------------------------------
xor:             0EC675AD498AFEEAB6960B3AABE6     (x, KDF input)
x*2^16:          0EC675AD498AFEEAB6960B3AABE60000 (AES-CM input)

auth key                           AES input blocks
CEBE321F6FF7716B6FD4AB49AF256A15   0EC675AD498AFEEAB6960B3AABE60000
6D38BAA48F0A0ACF3C34E2359E6CDBCE   0EC675AD498AFEEAB6960B3AABE60001
E049646C43D9327AD175578EF7227098   0EC675AD498AFEEAB6960B3AABE60002
6371C10C9A369AC2F94A8C5FBCDDDC25   0EC675AD498AFEEAB6960B3AABE60003
6D6E919A48B610EF17C2041E47403576   0EC675AD498AFEEAB6960B3AABE60004
6B68642C59BBFC2F34DB60DBDFB2       0EC675AD498AFEEAB6960B3AABE60005
For an introduction to AES-CM, refer to AES-CM.

At this point, we have the Session Keys required for SRTP/SRTCP encryption and authentication: the cipher key, the auth key, and the salt key.

Serial number management

SRTP serial number management

The RTP packet format uses 16 bits for the sequence number. Because a packet index is needed for replay protection, message integrity verification, encryption, and Session Key derivation, SRTP tracks the sequence number of each SRTP packet implicitly as a packet index, packet_index, denoted i.

For the sender, i is calculated as follows:

i = 2^16 * ROC + SEQ

Here, SEQ is the 16-bit packet sequence number carried in the RTP packet. ROC (rollover counter) counts how many times the RTP sequence number (SEQ) has rolled over: whenever SEQ wraps around modulo 2^16, ROC is incremented by 1. The initial value of ROC is 0.

For the receiver, which has to cope with packet loss and reordering, it is not enough to maintain ROC; it also maintains s_l, the highest sequence number received so far. When a new packet arrives, the receiver must estimate the actual SRTP packet index that the packet corresponds to. The initial value of ROC is 0, and the initial value of s_l is the SEQ of the first SRTP packet received. The following formula is used to estimate the index i of a received SRTP packet:

i = 2^16 * v + SEQ

Here, v is chosen from { ROC-1, ROC, ROC+1 }, where ROC is the value maintained locally by the receiver and SEQ is the sequence number of the received SRTP packet. Compute i for v = ROC-1, ROC and ROC+1, and compare each result with 2^16*ROC + s_l; v takes the value whose i is closest. After SRTP decryption and the integrity check succeed, ROC and s_l are updated according to the following three cases:

  1. v = ROC-1: ROC and s_l are not updated.
  2. v = ROC: if SEQ > s_l, then update s_l = SEQ.
  3. v = ROC+1: update ROC = v and s_l = SEQ.

The same estimate, described as pseudocode:

if (s_l < 32768)
    if (SEQ - s_l > 32768)
        set v to (ROC-1) mod 2^32
    else
        set v to ROC
    endif
else
    if (s_l - 32768 > SEQ)
        set v to (ROC+1) mod 2^32
    else
        set v to ROC
    endif
endif
return SEQ + v*65536
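
The pseudocode and the three update cases translate almost line by line into Python. The sketch below is illustrative only; the function names estimate_index and update_state are hypothetical, and the update must only be applied after decryption and the integrity check have succeeded.

```python
from typing import Tuple

def estimate_index(roc: int, s_l: int, seq: int) -> Tuple[int, int]:
    """Return (v, i): the guessed rollover counter and the packet index."""
    if s_l < 32768:
        v = (roc - 1) % 2**32 if seq - s_l > 32768 else roc
    else:
        v = (roc + 1) % 2**32 if s_l - 32768 > seq else roc
    return v, seq + v * 65536

def update_state(roc: int, s_l: int, v: int, seq: int) -> Tuple[int, int]:
    """Return the new (ROC, s_l) after a packet has been authenticated."""
    if v == roc:
        return roc, max(s_l, seq)   # case 2: only s_l may advance
    if v == (roc + 1) % 2**32:
        return v, seq               # case 3: SEQ wrapped around
    return roc, s_l                 # case 1: late packet from the previous cycle

# SEQ wraps from 65530 to 5, so the receiver guesses ROC+1.
v, i = estimate_index(roc=0, s_l=65530, seq=5)
print(v, i)  # 1 65541
```

A late packet with SEQ 65530 that arrives after the wrap (ROC = 1, s_l = 5) is mapped back to ROC-1, so its index stays 65530 and the receiver state is left unchanged.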

SRTCP serial number management

RTCP has no sequence number field, so SRTCP carries an explicit 31-bit SRTCP index in every SRTCP packet. See the SRTCP format for details. The SRTCP index space is therefore 2^31.

Serial number and communication duration

It can be seen that the SRTP index space is 2^48 and the SRTCP index space is 2^31. In most applications (assuming that there is at least one RTCP packet for every 128000 RTP packets), the SRTCP index reaches its upper limit first. At a rate of 200 SRTCP packets/second, the 2^31 index space of SRTCP is enough for approximately 4 months of communication.
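
The two figures in this paragraph can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the 200 packets/second rate is the one quoted above, and the variable names are illustrative.

```python
SRTP_INDEX_SPACE = 2**48    # 32-bit ROC plus 16-bit SEQ
SRTCP_INDEX_SPACE = 2**31   # explicit 31-bit SRTCP index

# Ratio of RTP to RTCP packets at which both index spaces run out together.
rtp_per_rtcp = SRTP_INDEX_SPACE // SRTCP_INDEX_SPACE   # 2^17 = 131072 = 128 * 1024

# Time needed to exhaust the SRTCP index space at 200 SRTCP packets/second.
seconds = SRTCP_INDEX_SPACE / 200
days = seconds / 86400

print(rtp_per_rtcp, round(days, 1))
```

The ratio 2^17 is the "128000" in the text read as 128 * 1024, and the duration comes out at a little over 124 days, that is, roughly 4 months.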

Anti-replay attack

An attacker saves intercepted SRTP/SRTCP packets and later resends them to the network, thereby replaying the packets. SRTP receivers prevent this attack by maintaining a replay list (ReplayList). In theory, the ReplayList should store the index of every packet that has been received and verified. In practice, the ReplayList is implemented as a sliding window, and SRTP-WINDOW-SIZE describes the size of that window.

SRTP anti-replay attack

In the sequence number management section, we described in detail how the receiver estimates the packet_index of a received SRTP packet from its SEQ together with the local ROC and s_l. The receiver also records the highest index among the SRTP packets it has received so far as local_packet_index. Calculate the difference delta:

delta =  packet_index - local_packet_index

It is divided into the following 3 situations:

  1. delta > 0: a new packet has been received.
  2. delta < -(SRTP-WINDOW-SIZE - 1) < 0: the index of the received packet is smaller than the minimum index covered by the replay window. When libsrtp receives such a packet, it returns srtp_err_status_replay_old=10, indicating an old replayed packet.
  3. delta <= 0 and delta >= -(SRTP-WINDOW-SIZE - 1): the packet falls within the replay window. If its index is already recorded in ReplayList, it is a replayed packet with a duplicate index, and libsrtp returns srtp_err_status_replay_fail=9. Otherwise, it is an out-of-order packet.
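
The three cases can be sketched as a small sliding-window class. This is a simplified illustration, not libsrtp's implementation: it stores the received indices in a Python set rather than a bitmask, the class and method names are hypothetical, and only the numeric status values 9 and 10 are taken from the text (0 is assumed for success).

```python
from typing import Optional, Set

STATUS_OK = 0            # assumed value for success
STATUS_REPLAY_FAIL = 9   # duplicate index inside the window
STATUS_REPLAY_OLD = 10   # index older than the window

class ReplayWindow:
    def __init__(self, window_size: int = 64) -> None:
        self.window_size = window_size
        self.local_packet_index: Optional[int] = None
        self.received: Set[int] = set()

    def check(self, packet_index: int) -> int:
        """Classify a packet before it is decrypted and verified."""
        if self.local_packet_index is None:
            return STATUS_OK
        delta = packet_index - self.local_packet_index
        if delta > 0:
            return STATUS_OK                      # new packet
        if delta < -(self.window_size - 1):
            return STATUS_REPLAY_OLD              # left of the window
        if packet_index in self.received:
            return STATUS_REPLAY_FAIL             # duplicate in the window
        return STATUS_OK                          # out-of-order packet

    def add(self, packet_index: int) -> None:
        """Record a packet after it has passed authentication."""
        if self.local_packet_index is None or packet_index > self.local_packet_index:
            self.local_packet_index = packet_index
        self.received.add(packet_index)
        lowest = self.local_packet_index - (self.window_size - 1)
        self.received = {i for i in self.received if i >= lowest}
```

Calling check before authentication and add only afterwards keeps forged packets from moving the window.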

The following figure more intuitively illustrates the three areas of anti-replay attack:

The minimum value of SRTP-WINDOW-SIZE is 64. An application can set a larger value as needed, and libsrtp rounds it up to an integer multiple of 32. For example, WebRTC uses SRTP-WINDOW-SIZE = 1024. Users can adjust the value to their needs, as long as it still achieves the purpose of preventing replay attacks.

SRTCP anti-replay attack

In SRTCP, the packet index is given explicitly. In libsrtp, the window size of the SRTCP anti-replay check is 128, and window_start records the first index covered by the window. The check steps for SRTCP anti-replay are as follows:

  1. index >= window_start + 128: a new SRTCP packet has been received.
  2. index < window_start: the index of the received packet is to the left of the replay window, so we can consider it an older packet. When libsrtp receives such a packet, it returns srtp_err_status_replay_old=10.
  3. Otherwise, replay_list_index = index - window_start. If the flag at replay_list_index in ReplayList is 1, the packet has already been received, and libsrtp returns srtp_err_status_replay_fail=9. If the flag is 0, an out-of-order packet has been received.
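
These steps fit in one function when the ReplayList is modeled as an integer bitmask. The sketch below is an assumption-laden illustration rather than libsrtp code: the function name is hypothetical, bit k of the bitmask stands for index window_start + k, and an index exactly at window_start + 128 is treated as new.

```python
STATUS_OK = 0            # assumed value for success
STATUS_REPLAY_FAIL = 9   # flag already set inside the window
STATUS_REPLAY_OLD = 10   # index to the left of the window
WINDOW_BITS = 128        # SRTCP replay window size quoted in the text

def srtcp_replay_check(window_start: int, bitmask: int, index: int) -> int:
    """Classify an SRTCP index against a 128-entry replay window."""
    if index >= window_start + WINDOW_BITS:
        return STATUS_OK                  # beyond the window: new packet
    if index < window_start:
        return STATUS_REPLAY_OLD          # before the window: too old
    replay_list_index = index - window_start
    if (bitmask >> replay_list_index) & 1:
        return STATUS_REPLAY_FAIL         # flag is 1: already received
    return STATUS_OK                      # flag is 0: out-of-order packet
```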

Encryption and verification algorithm

SRTP uses the AES encryption algorithm in CTR (Counter) mode. CTR mode generates a continuous keystream by encrypting successive values of a counter. The counter can be any function that is guaranteed not to repeat its output for a long time. Depending on how the counter is computed, there are two variants:

  • AES-ICM : ICM mode (Integer Counter Mode, integer counting mode), using integer counting operations.
  • AES-GCM : GCM mode (Galois Counter Mode, based on the Galois field counting mode), the counting operation is defined in the Galois field.

In SRTP, AES-ICM performs the encryption and HMAC-SHA1 computes the MAC that is used to check data integrity, so encryption and MAC calculation are two separate steps. AES-GCM is based on the idea of AEAD (Authenticated Encryption with Associated Data): it calculates the MAC while encrypting the data, so encryption and authentication are completed in a single step. The usage of AES-ICM and AES-GCM is introduced below.

AES-ICM

The figure describes the AES-ICM encryption and decryption process. In the figure, K is the Session Key derived by the KDF. Both encryption and decryption work by encrypting the Counter: the ciphertext C is obtained by XORing the encrypted Counter with the plaintext P, and conversely the plaintext P is obtained by XORing it with the ciphertext C. For security, the Counter is generated from the Session Salt, the packet index and the SSRC of the packet. The Counter is a 128-bit value, generated as follows:

one byte
<-->
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|00|00|00|00|   SSRC    |   packet index  | b_c |---+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+   |
                                                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+   v
|                  salt (k_s)             |00|00|->(+)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+   |
                                                    |
                                                    v
                                            +-------------+
                    encryption key (k_e) -> | AES encrypt |
                                            +-------------+
                                                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+   |
|                keystream block                |<--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Here, b_c is the block counter. b_c starts at 0, which corresponds to Counter 0, and increases by 1 for every 128 bits of encrypted data to give the next Counter. The Counters computed from the index and SSRC of an RTP packet produce a keystream, and each encrypted Counter is one keystream block.

By using the AES-ICM algorithm, the RTP/RTCP payload is encrypted to obtain the Encrypted Portion.
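
The Counter layout in the diagram can be reproduced with integer shifts. The sketch below only builds the 128-bit AES input block; the AES encryption itself is omitted because it needs a cipher library, and the function name icm_counter_block is a hypothetical choice.

```python
def icm_counter_block(session_salt: bytes, ssrc: int, packet_index: int, b_c: int) -> bytes:
    """Build the 128-bit Counter fed to AES for keystream block number b_c."""
    if len(session_salt) != 14:
        raise ValueError("the session salt is 112 bits (14 bytes)")
    iv = (
        (int.from_bytes(session_salt, "big") << 16)  # salt in bytes 0..13
        ^ (ssrc << 64)                               # SSRC in bytes 4..7
        ^ (packet_index << 16)                       # packet index in bytes 8..13
    )
    return ((iv + b_c) % 2**128).to_bytes(16, "big")  # b_c in bytes 14..15

block = icm_counter_block(bytes(14), ssrc=0x01020304, packet_index=0x0A0B0C0D0E0F, b_c=1)
print(block.hex())  # 00000000010203040a0b0c0d0e0f0001
```

With an all-zero salt, as in the usage line, the SSRC, packet index and b_c fields show up directly at their byte positions; with a real salt they are XORed into it.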

HMAC-SHA1

A hash-based message authentication code (HMAC) is a message authentication code (MAC) produced by a specific construction that combines a cryptographic hash function with a secret key. It can be used to ensure the integrity of data and to authenticate a message at the same time. HMAC uses a standard algorithm to mix the key into the hash calculation. HMAC is computed as follows:

HMAC(K,M) = H ( (K XOR opad ) + H( (K XOR ipad ) + M ) )
  • H: hash algorithm, for example, MD5, SHA-1, SHA-256.
  • B: The length of the block byte, the block is the basic unit of hash operation. Here B=64.
  • L: The byte length calculated by the hash algorithm. (L=16 for MD5, L=20 for SHA-1).
  • K: Shared key. The length of K can be arbitrary, but for security reasons it is recommended that K be at least L bytes long. When the length of K is greater than B, the hash algorithm is first applied to K, and the resulting L-byte value is used as the new shared key. If the length of K is less than B, 0x00 bytes are appended to K until its length equals B.
  • M: The content to be authenticated.
  • opad: External filling constant, 0x5C repeated B times.
  • ipad: The internal filling constant is 0x36 repeated B times.
  • XOR: Exclusive OR operation.
  • +: Represents "connection" operation.

The calculation steps are as follows:

  1. Fill 0x00 to the back of K until its length is equal to B.
  2. XOR the result of step 1 with ipad.
  3. Append the message M to be authenticated to the result of step 2.
  4. Call the H method.
  5. XOR the result of step 1 with opad.
  6. Attach the result of step 4 to the result of step 5.
  7. Call the H method.
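
The seven steps can be written out with hashlib and compared against Python's built-in hmac module. This is a teaching sketch with a hypothetical function name; real code should simply call the hmac module.

```python
import hashlib
import hmac

def hmac_sha1_manual(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA1 following the numbered steps, with B = 64 and L = 20."""
    block_size = 64
    if len(key) > block_size:                 # keys longer than B are hashed first
        key = hashlib.sha1(key).digest()
    key = key.ljust(block_size, b"\x00")      # step 1: pad K with 0x00 up to B
    inner_pad = bytes(k ^ 0x36 for k in key)  # step 2: K XOR ipad
    inner = hashlib.sha1(inner_pad + message).digest()   # steps 3 and 4
    outer_pad = bytes(k ^ 0x5C for k in key)  # step 5: K XOR opad
    return hashlib.sha1(outer_pad + inner).digest()      # steps 6 and 7

key, message = b"\x0b" * 20, b"Hi There"
assert hmac_sha1_manual(key, message) == hmac.new(key, message, hashlib.sha1).digest()
```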

When SRTP and SRTCP calculate the Authentication tag, K is the RTP auth key or the RTCP auth key described in the Key management section, the hash algorithm is SHA-1, and the Authentication tag is 80 bits long.

When calculating the SRTP Authentication tag, the content M to be authenticated is:

M = Authenticated Portion + ROC

Here, + represents the "connection" operation, and Authenticated Portion is given in the SRTP structure diagram.

When calculating the SRTCP Authentication tag, the content M to be authenticated is:

M=Authenticated Portion

Here, Authenticated Portion is given in the SRTCP structure diagram.

By applying the HMAC-SHA1 algorithm to the Authenticated Portion, we obtain the Authentication tag of the SRTP/SRTCP packet.
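
As a rough sketch of the two cases, assuming the ROC is appended as a 32-bit big-endian integer and the 80-bit tag is the leftmost 10 bytes of the HMAC-SHA1 output (as RFC3711 specifies), the tags could be computed as below. The function names are hypothetical and the inputs in the usage lines are made-up placeholder bytes.

```python
import hashlib
import hmac
import struct

def srtp_auth_tag(auth_key: bytes, authenticated_portion: bytes, roc: int, tag_len: int = 10) -> bytes:
    """M = Authenticated Portion + ROC, truncated to the leftmost tag_len bytes."""
    m = authenticated_portion + struct.pack("!I", roc)
    return hmac.new(auth_key, m, hashlib.sha1).digest()[:tag_len]

def srtcp_auth_tag(auth_key: bytes, authenticated_portion: bytes, tag_len: int = 10) -> bytes:
    """M = Authenticated Portion only; the SRTCP index is already part of it."""
    return hmac.new(auth_key, authenticated_portion, hashlib.sha1).digest()[:tag_len]

key = bytes(range(20))                   # placeholder 160-bit auth key
packet = b"\x80\x60\x00\x01" + bytes(8)  # placeholder header bytes
tag = srtp_auth_tag(key, packet, roc=0)
assert hmac.compare_digest(tag, srtp_auth_tag(key, packet, roc=0))
```

Including the ROC in M means a receiver that guesses the wrong ROC fails the integrity check, even though the ROC itself is never transmitted in the packet.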

AES-GCM

AES-GCM uses counter mode to encrypt data, an operation that can be pipelined efficiently. The operations used for GCM authentication are particularly well suited to efficient hardware implementation. The GCM-SPEC covers the theory of GCM in detail, and its Section 4.2, Hardware, covers the hardware implementation.

The application of AES-GCM to SRTP encryption is described in detail in RFC7714. Key management and sequence number management are the same as described in this article. Note that:

  1. AES-GCM is an AEAD (Authenticated Encryption with Associated Data) encryption algorithm. What are its inputs and outputs? Map them onto the packet structure of SRTP/SRTCP.
  2. The Counter is calculated differently from the method described for AES-ICM, which requires special attention.

libsrtp implements AES-GCM. Interested readers can study it together with the code.

Use of libsrtp

libsrtp is a widely used open source project for SRTP/SRTCP encryption. The frequently used APIs are as follows:

  1. srtp_init: initializes the srtp library and its internal encryption algorithms. It must be called before using srtp.
  2. srtp_create: creates an srtp_session; it can be understood together with the concepts of session and session key introduced in this article.
  3. srtp_unprotect/srtp_protect: the decryption and encryption interface for RTP packets.
  4. srtp_protect_rtcp/srtp_unprotect_rtcp: the encryption and decryption interface for RTCP packets.
  5. srtp_set_stream_roc/srtp_get_stream_roc: set and get the ROC of a stream; these two interfaces were added in the latest version 2.3.

The important structure srtp_policy_t is used to initialize the encryption and decryption parameters, and it is passed to srtp_create. The following parameters need attention:

  1. The MasterKey and MasterSalt obtained from DTLS negotiation are passed to libsrtp through this structure for session key generation.
  2. window_size corresponds to the window size of the srtp anti-replay check described earlier.
  3. allow_repeat_tx: whether to allow retransmission of packets with the same sequence number.

SRS is a new-generation real-time communication server. Readers who are interested in libsrtp can quickly set up a debugging environment on a local machine, run related tests, and gain a deeper understanding of the algorithms.

Summary

This article gave a detailed, in-depth explanation of the principles behind SRTP/SRTCP and answered the questions that come up when using libsrtp. We hope it helps readers who work in real-time audio and video communication.

References

  1. RFC3711: SRTP
  2. RFC6904: Encrypted SRTP Header Extensions
  3. Integer Counter Mode
  4. RFC6188: The Use of AES-192 and AES-256 in Secure RTP
  5. RFC7714: AES-GCM for SRTP
  6. RFC2104: HMAC
  7. RFC2202: Test Cases for HMAC-MD5 and HMAC-SHA-1
  8. GCM-SPEC: GCM