After DTLS negotiation completes, both sides of an RTC communication have agreed on the MasterKey and MasterSalt. Next, we analyze how WebRTC uses the exchanged key to encrypt RTP and RTCP to achieve secure data transmission. This article also answers questions that come up when using libsrtp: What is ROC, and why is ROC 32 bits? Why does libsrtp return error_code=9 or error_code=10? Does the exchanged key have a life cycle, and if so, how long is it? Before reading this article, it is recommended to read the DTLS negotiation article; the two work best together.
Author|Jinxue
Review|Taiyi
Problem to be solved
The RTP/RTCP protocol does not provide any protection for its payload data. Therefore, if an attacker captures the audio and video data with a packet capture tool such as Wireshark, the tool can play the audio and video stream directly, which is a very scary thing.
To prevent this, WebRTC does not use the RTP/RTCP protocol directly but uses the SRTP/SRTCP protocol, that is, secure RTP/RTCP. WebRTC uses the well-known libsrtp library to convert the original RTP/RTCP protocol data into SRTP/SRTCP protocol data.
Problems that SRTP solves:
- Encrypt the payload of RTP/RTCP to ensure data security;
- Ensure the integrity of RTP/RTCP packets, while preventing replay attacks.
SRTP/SRTCP structure
SRTP structure
As can be seen from the SRTP structure diagram:
- The encrypted part, Encrypted Portion, is composed of payload, RTP padding and RTP pad count. In other words, what we usually call SRTP encryption only encrypts the RTP payload data.
- The authenticated part, Authenticated Portion, is composed of RTP Header, RTP Header extension and Encrypted Portion.
Under normal circumstances, only the RTP payload data needs to be encrypted. If the RTP header extension needs to be encrypted, RFC6904 gives a detailed solution, which is also implemented in libsrtp.
SRTCP structure
It can be seen from the SRTCP structure diagram:
- The encrypted part, Encrypted Portion, is everything in the Compound RTCP packet after the first RTCP Header and sender SSRC (the first 8 bytes).
- The E-flag explicitly indicates whether the RTCP packet is encrypted. (PS: How can one tell whether an RTP packet is encrypted?)
- The SRTCP index explicitly gives the sequence number of the RTCP packet, to prevent replay attacks. (PS: Can the 16-bit sequence number of an RTP packet prevent replay attacks?)
- The authenticated part, Authenticated Portion, is composed of RTCP Header and Encrypted Portion.
With this initial understanding of the SRTP and SRTCP structures, we next introduce how the Encrypted Portion and the Authenticated Portion are obtained.
Key management
In the SRTP/SRTCP protocol, the two-tuple <SRTP destination IP address, SRTP/SRTCP destination port> identifies the SRTP/SRTCP session of a communication participant, which is called an SRTP/SRTCP Session.
In the SRTP protocol, the triple <SSRC, RTP/RTCP destination address, RTP/RTCP destination port> identifies a stream, and an SRTP/SRTCP Session is composed of multiple streams. The description of the encryption and decryption parameters of each stream is called its Cryptographic Context.
The Cryptographic Context of each stream contains the following parameters:
- SSRC: the SSRC used by the Stream.
- Cipher Parameter: the key, salt and algorithm description (type, parameters, etc.) used for encryption and decryption.
- Authentication Parameter: the key, salt and algorithm description (type, parameters, etc.) used for integrity checking.
- Anti-Replay Data: cached state used to prevent replay attacks, such as ROC, the maximum sequence number, etc.
In an SRTP/SRTCP Session, each Stream uses its own encryption/decryption Key and Authentication Key. These keys are all used within the same Session and are called Session Keys. The Session Keys are derived from the Master Key by a KDF (Key Derivation Function). The KDF is the function used to derive the Session Keys, and by default it reuses the encryption function. For example, after DTLS is completed, the profile of the negotiated SRTP encryption algorithm is:
SRTP_AES128_CM_HMAC_SHA1_80
cipher: AES_128_CM
cipher_key_length: 128
cipher_salt_length: 112
maximum_lifetime: 2^31
auth_function: HMAC-SHA1
auth_key_length: 160
auth_tag_length: 80
The corresponding KDF is AES128_CM. Session Key derivation is shown in the following figure:
Session Key derivation depends on the following parameters:
- key_label: depends on the type of key being derived. RFC3711 defines the values 0x00 (SRTP encryption key), 0x01 (SRTP authentication key), 0x02 (SRTP salt), 0x03 (SRTCP encryption key), 0x04 (SRTCP authentication key) and 0x05 (SRTCP salt).
- master_key: the key negotiated after DTLS is completed.
- master_salt: the salt negotiated after DTLS is completed.
- packet_index: the RTP/RTCP packet sequence number. SRTP uses a 48-bit implicit packet index, and SRTCP uses a 31-bit packet sequence number. See Serial number management.
- key_derivation_rate: the key derivation rate, denoted kdr. The default value is 0, in which case key derivation is performed only once. Otherwise the value is taken from {1,2,4,...,2^24}. When key_derivation_rate > 0, a key derivation is performed before the first encryption, and again whenever packet_index mod key_derivation_rate = 0.
r = packet_index / kdr
key_id = label || r
x = key_id XOR master_salt
key = KDF(master_key, x)
/: integer division; when B = 0, C = A / B is defined as 0.
||: concatenation. A, B and C are expressed in network byte order; if C = A || B, the high-order bytes of C are A and the low-order bytes are B.
XOR: exclusive OR, with the operands aligned on the low-order byte.
The following uses AES128_CM to illustrate Session Key derivation, assuming that DTLS negotiated:
master_key: E1F97A0D3E018BE0D64FA32C06DE4139 // 128-bits
master_salt: 0EC675AD498AFEEBB6960B3AABE6 // 112-bits
Derive the encryption key (cipher key):
packet_index/kdr: 000000000000
label: 00
master_salt: 0EC675AD498AFEEBB6960B3AABE6
-----------------------------------------------
xor: 0EC675AD498AFEEBB6960B3AABE6 (x, KDF input)
x*2^16: 0EC675AD498AFEEBB6960B3AABE60000 (AES-CM input)
cipher key: C61E7A93744F39EE10734AFE3FF7A087 (AES-CM output)
Derive the salt key (cipher salt):
packet_index/kdr: 000000000000
label: 02
master_salt: 0EC675AD498AFEEBB6960B3AABE6
----------------------------------------------
xor: 0EC675AD498AFEE9B6960B3AABE6 (x, KDF input)
x*2^16: 0EC675AD498AFEE9B6960B3AABE60000 (AES-CM input)
30CBBC08863D8C85D49DB34A9AE17AC6 (AES-CM output)
cipher salt: 30CBBC08863D8C85D49DB34A9AE1
Derive the authentication key (auth key); in this example the auth key is 94 bytes:
packet_index/kdr: 000000000000
label: 01
master salt: 0EC675AD498AFEEBB6960B3AABE6
-----------------------------------------------
xor: 0EC675AD498AFEEAB6960B3AABE6 (x, KDF input)
x*2^16: 0EC675AD498AFEEAB6960B3AABE60000 (AES-CM input)
auth key AES input blocks
CEBE321F6FF7716B6FD4AB49AF256A15 0EC675AD498AFEEAB6960B3AABE60000
6D38BAA48F0A0ACF3C34E2359E6CDBCE 0EC675AD498AFEEAB6960B3AABE60001
E049646C43D9327AD175578EF7227098 0EC675AD498AFEEAB6960B3AABE60002
6371C10C9A369AC2F94A8C5FBCDDDC25 0EC675AD498AFEEAB6960B3AABE60003
6D6E919A48B610EF17C2041E47403576 0EC675AD498AFEEAB6960B3AABE60004
6B68642C59BBFC2F34DB60DBDFB2 0EC675AD498AFEEAB6960B3AABE60005
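The xor lines in the three derivations above can be reproduced with a few lines of code. The following is a minimal sketch of only the KDF input step (key_id and x); it does not run AES-CM itself. The function name kdf_input is made up for illustration, and treating r as a 48-bit value placed after a one-byte label is an assumption that matches the SRTP case shown here.

```python
def kdf_input(label: int, packet_index: int, kdr: int, master_salt: bytes) -> bytes:
    """Return x = (label || r) XOR master_salt, aligned on the low-order byte."""
    # '/' as defined in the text: integer division, with result 0 when kdr == 0.
    r = packet_index // kdr if kdr else 0
    # key_id = label || r: one label byte followed by the 48-bit value r.
    key_id = (label << 48) | r
    x = key_id ^ int.from_bytes(master_salt, "big")
    return x.to_bytes(len(master_salt), "big")


master_salt = bytes.fromhex("0EC675AD498AFEEBB6960B3AABE6")
for name, label in (("cipher key", 0x00), ("auth key", 0x01), ("cipher salt", 0x02)):
    # Each printed value should match the corresponding "xor:" line above.
    print(f"{name:12s} x = {kdf_input(label, 0, 0, master_salt).hex().upper()}")
```

Appending two zero bytes to x gives the first AES-CM input block (the x*2^16 line); later blocks only increment that 16-bit suffix.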
For an introduction to AES-CM, refer to AES-CM.
At this point, we have obtained the Session Keys required for SRTP/SRTCP encryption and authentication: the cipher key, the auth key and the salt key.
Serial number management
SRTP serial number management
In the RTP packet structure definition, 16 bits are used to describe the sequence number. Taking into account the need to prevent replay attacks, verify message integrity, encrypt data and derive Session Keys, the SRTP protocol implicitly records the sequence number of an SRTP packet as the packet index packet_index, and i is used to denote packet_index.
For the sender, i is calculated as follows:
i = 2^16 * ROC + SEQ
where SEQ is the 16-bit packet sequence number carried in the RTP packet. ROC (rollover counter) counts how many times the RTP sequence number (SEQ) has rolled over: each time SEQ wraps around from 65535 to 0, ROC increases by 1. The initial value of ROC is 0.
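As a small illustration of the sender side, the sketch below tracks SEQ and ROC and returns i for each outgoing packet. The class name SenderIndex and its method are hypothetical and are not part of libsrtp.

```python
class SenderIndex:
    """Tracks SEQ and ROC for one outgoing stream (illustrative only)."""

    def __init__(self, first_seq: int) -> None:
        self.roc = 0                  # initial value of ROC is 0
        self.seq = first_seq & 0xFFFF

    def next_index(self) -> int:
        # i = 2^16 * ROC + SEQ for the packet being sent now.
        i = (self.roc << 16) + self.seq
        # Advance SEQ; when it wraps to 0, the rollover counter increases.
        self.seq = (self.seq + 1) & 0xFFFF
        if self.seq == 0:
            self.roc = (self.roc + 1) & 0xFFFFFFFF
        return i


sender = SenderIndex(first_seq=65534)
print([sender.next_index() for _ in range(3)])
```

Starting two packets before the wrap shows the index continuing past 65535 even though SEQ itself restarts at 0.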
For the receiver, packet loss and reordering must be taken into account, so in addition to ROC it also maintains s_l, the highest sequence number received so far. When a new packet arrives, the receiver has to estimate the actual SRTP packet index corresponding to that packet. The initial value of ROC is 0, and the initial value of s_l is the SEQ of the first SRTP packet received. The following formula is used to estimate the received SRTP packet index i:
i = 2^16 * v + SEQ
where v takes one of the values { ROC-1, ROC, ROC+1 }, ROC is the value maintained locally by the receiver, and SEQ is the sequence number of the received SRTP packet. Compute i for v = ROC-1, ROC and ROC+1 and compare each result with 2^16*ROC + s_l; v takes the value whose i is closest. After SRTP decryption and integrity checking complete, ROC and s_l are updated according to one of the following three cases:
- v = ROC-1: ROC and s_l are not updated.
- v = ROC: if SEQ > s_l, update s_l = SEQ.
- v = ROC+1: update ROC = v and s_l = SEQ.
More intuitive code description:
if (s_l < 32768)
    if (SEQ - s_l > 32768)
        set v to (ROC-1) mod 2^32
    else
        set v to ROC
    endif
else
    if (s_l - 32768 > SEQ)
        set v to (ROC+1) mod 2^32
    else
        set v to ROC
    endif
endif
return SEQ + v*65536
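The pseudocode above translates almost line for line into runnable code. The sketch below adds the three update cases as a second function; the names estimate_index and update_state are chosen for illustration only, and the update is assumed to run only after the packet has passed the integrity check.

```python
from typing import Tuple


def estimate_index(seq: int, roc: int, s_l: int) -> Tuple[int, int]:
    """Return (v, i) for a received packet, following the pseudocode above."""
    if s_l < 32768:
        v = (roc - 1) % 2**32 if seq - s_l > 32768 else roc
    else:
        v = (roc + 1) % 2**32 if s_l - 32768 > seq else roc
    return v, seq + v * 65536


def update_state(seq: int, v: int, roc: int, s_l: int) -> Tuple[int, int]:
    """Return the new (ROC, s_l) once the packet has been authenticated."""
    if v == (roc + 1) % 2**32:      # SEQ rolled over: advance ROC
        return v, seq
    if v == roc and seq > s_l:      # newer packet in the same cycle
        return roc, seq
    return roc, s_l                 # late packet (v = ROC-1 or SEQ <= s_l)


# A packet with SEQ=3 arriving just after s_l=65530 belongs to the next cycle.
v, i = estimate_index(seq=3, roc=0, s_l=65530)
print(v, i, update_state(3, v, 0, 65530))
```

Conversely, a late packet with SEQ=65530 arriving when ROC=1 and s_l=5 is assigned v=0 and leaves the state untouched.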
SRTCP serial number management
RTCP has no sequence number field, so SRTCP carries its sequence number explicitly: a 31-bit SRTCP index is included in every SRTCP packet. See the SRTCP format for details. In other words, the maximum sequence number in SRTCP is 2^31.
Serial number and communication duration
It can be seen that the maximum sequence number of SRTP is 2^48 and the maximum sequence number of SRTCP is 2^31. In most applications (assuming at least one RTCP packet for every 128000 RTP packets), the SRTCP sequence number reaches its upper limit first. At a rate of 200 SRTCP packets/second, the 2^31 sequence number space of SRTCP is enough for approximately 4 months of communication.
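The numbers in this paragraph can be checked with simple arithmetic. The sketch below assumes the 200 packets/second rate quoted above and an average month of 30.44 days; the variable names are illustrative.

```python
SRTP_INDEX_SPACE = 2**48    # 32-bit ROC plus 16-bit SEQ
SRTCP_INDEX_SPACE = 2**31   # 31-bit SRTCP index

# Ratio of the two spaces: SRTCP runs out first whenever there is at least
# one RTCP packet per this many RTP packets (2^17, roughly the 128000 quoted).
ratio = SRTP_INDEX_SPACE // SRTCP_INDEX_SPACE

# Time until the SRTCP index space is exhausted at 200 packets per second.
seconds = SRTCP_INDEX_SPACE / 200
days = seconds / 86400
months = days / 30.44

print(f"ratio={ratio}, days={days:.1f}, months={months:.1f}")
```

This is where the "approximately 4 months" lifetime comes from: about 124 days of continuous sending at that rate.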
Anti-replay attack
An attacker can save intercepted SRTP/SRTCP packets and later resend them to the network, replaying the packets. SRTP receivers prevent this attack by maintaining a replay list (ReplayList). In theory, the ReplayList should store the index of every packet that has been received and authenticated. In practice, the ReplayList is implemented as a sliding window, and SRTP-WINDOW-SIZE describes the size of that window.
SRTP anti-replay attack
In the sequence number management section, we described in detail how the receiver estimates the packet_index of an SRTP packet from the SEQ of the received packet, ROC and s_l. The receiver also records the highest index among the SRTP packets it has received as local_packet_index. Calculate the difference delta:
delta = packet_index - local_packet_index
There are three cases:
- delta > 0: a new packet has been received.
- delta < -(SRTP-WINDOW-SIZE - 1) < 0: the sequence number of the received packet is smaller than the minimum sequence number covered by the replay window. When libSRTP receives such a packet, it returns srtp_err_status_replay_old=10, indicating that an old replayed packet has been received.
- -(SRTP-WINDOW-SIZE - 1) <= delta <= 0: the packet falls inside the replay window. If the corresponding packet is found in the ReplayList, it is a replayed packet with a duplicate index, and libSRTP returns srtp_err_status_replay_fail=9. Otherwise, an out-of-order packet has been received.
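The three cases can be sketched as a small sliding-window check. This is an illustration, not libsrtp's implementation: the class ReplayWindow and the constants are made up here (the values 9 and 10 only mirror the libsrtp status codes named above), a Python set stands in for the bitmask, and treating delta = 0 as an in-window packet is an assumption of this sketch.

```python
REPLAY_OK = 0
REPLAY_FAIL = 9    # mirrors srtp_err_status_replay_fail
REPLAY_OLD = 10    # mirrors srtp_err_status_replay_old


class ReplayWindow:
    def __init__(self, window_size: int = 64) -> None:
        self.window_size = window_size
        self.local_packet_index = None   # highest index accepted so far
        self.seen = set()                # accepted indices inside the window

    def check(self, packet_index: int) -> int:
        if self.local_packet_index is None:
            return REPLAY_OK             # first packet of the stream
        delta = packet_index - self.local_packet_index
        if delta > 0:
            return REPLAY_OK             # new packet
        if delta < -(self.window_size - 1):
            return REPLAY_OLD            # left of the window
        if packet_index in self.seen:
            return REPLAY_FAIL           # duplicate inside the window
        return REPLAY_OK                 # out-of-order but not seen yet

    def add(self, packet_index: int) -> None:
        """Record a packet after it has passed authentication."""
        if self.local_packet_index is None or packet_index > self.local_packet_index:
            self.local_packet_index = packet_index
        self.seen.add(packet_index)
        low = self.local_packet_index - (self.window_size - 1)
        self.seen = {i for i in self.seen if i >= low}   # slide the window


window = ReplayWindow(window_size=64)
for index in (100, 101, 103):
    assert window.check(index) == REPLAY_OK
    window.add(index)
print(window.check(102), window.check(101), window.check(39))
```

Note that check and add are separate on purpose: the window should only be updated once the packet has been authenticated, otherwise forged packets could advance it.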
The following figure more intuitively illustrates the three areas of anti-replay attack:
The minimum value of SRTP-WINDOW-SIZE is 64. The application can set a larger value as needed, and libsrtp rounds it up to an integer multiple of 32. For example, SRTP-WINDOW-SIZE = 1024 in WebRTC. Users can adjust it to their needs, as long as the purpose of preventing replay attacks is still achieved.
SRTCP anti-replay attack
In SRTCP, the packet index is given explicitly. In libsrtp, the window size for SRTCP anti-replay is 128, and window_start records the starting sequence number of the anti-replay window. The checks for SRTCP anti-replay are as follows:
- index >= window_start + 128: a new SRTCP packet has been received.
- index < window_start: the sequence number of the received packet is to the left of the replay window, so we can consider it an older packet. When libsrtp receives such a packet, it returns srtp_err_status_replay_old=10.
- replay_list_index = index - window_start: if the flag corresponding to replay_list_index in the ReplayList is 1, the packet has already been received and libsrtp returns srtp_err_status_replay_fail=9. If the flag is 0, an out-of-order packet has been received.
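For comparison with the SRTP case, the following sketch keeps the SRTCP window as a 128-bit mask anchored at window_start. It is an illustration under assumptions rather than libsrtp's code: the class name RtcpReplayDb, the bit ordering (bit k stands for index window_start + k) and the way the window slides are choices made for this example, and an index exactly equal to window_start + 128 is treated as new.

```python
class RtcpReplayDb:
    BITS = 128   # SRTCP replay window size described in the text

    def __init__(self) -> None:
        self.window_start = 0
        self.bitmask = 0   # bit k set means index window_start + k was received

    def check(self, index: int) -> int:
        if index >= self.window_start + self.BITS:
            return 0       # beyond the window: a new packet
        if index < self.window_start:
            return 10      # before the window: replay_old
        if (self.bitmask >> (index - self.window_start)) & 1:
            return 9       # flag already set: replay_fail
        return 0           # inside the window, not seen: out-of-order packet

    def add(self, index: int) -> None:
        """Record an authenticated packet, sliding the window if needed."""
        if index >= self.window_start + self.BITS:
            shift = index - (self.window_start + self.BITS - 1)
            self.bitmask >>= shift          # drop the oldest flags
            self.window_start += shift
        if index >= self.window_start:
            self.bitmask |= 1 << (index - self.window_start)


db = RtcpReplayDb()
for index in (0, 1, 3):
    db.add(index)
print(db.check(1), db.check(2), db.check(200))
```

After a jump far ahead, the window slides so that the newest index occupies the last bit, and everything that fell off the left edge is reported as old.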
Encryption and verification algorithm
SRTP uses the AES encryption algorithm in CTR (Counter) mode. CTR mode generates a continuous keystream by encrypting successive values of a counter. The counter can be any function that is guaranteed not to repeat its output for a long time. Two counter-based modes are used:
- AES-ICM: ICM (Integer Counter Mode), which counts using integer arithmetic.
- AES-GCM: GCM (Galois/Counter Mode), which combines counter-mode encryption with an authentication tag computed in a Galois field.
In SRTP, AES-ICM performs the encryption and HMAC-SHA1 computes the MAC that is used to check data integrity, so encryption and MAC calculation are two separate steps. AES-GCM is based on the idea of AEAD (Authenticated Encryption with Associated Data): it computes the MAC while encrypting the data, completing encryption and authentication in a single step. The usage of AES-ICM and AES-GCM is introduced below.
AES-ICM
The figure describes the AES-ICM encryption and decryption process. In the figure, K is the Session Key derived by the KDF. Both encryption and decryption work by encrypting the Counter: the ciphertext C is obtained by XORing the result with the plaintext P, and conversely the plaintext P is obtained by XORing it with the ciphertext C. For security, Counter generation depends on the Session Salt, the packet index and the SSRC of the packet. The Counter is a 128-bit value, generated as follows:
one byte
<-->
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|00|00|00|00|   SSRC    |   packet index  | b_c |---+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+   |
                                                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+   v
|                  salt (k_s)             |00|00|->(+)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+   |
                                                    |
                                                    v
                                             +-------------+
                     encryption key (k_e) -> | AES encrypt |
                                             +-------------+
                                                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+   |
|                keystream block                |<--+
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
Here b_c is the block counter. The initial b_c is 0, which corresponds to Counter 0. For every 128 bits of data encrypted, b_c increases by 1 to form the next Counter. For a given RTP packet, the Counters computed from its index and SSRC produce the keystream, and each Counter yields one keystream block.
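The layout in the diagram can be expressed with shifts and XORs. The sketch below builds only the 128-bit Counter that is fed to AES; it does not perform the AES encryption, and the function name icm_counter_block is hypothetical. It assumes a 112-bit session salt, a 32-bit SSRC and a 48-bit packet index, as in the diagram.

```python
def icm_counter_block(session_salt: bytes, ssrc: int, packet_index: int, b_c: int) -> bytes:
    """Return the 16-byte Counter for block number b_c of one packet."""
    k_s = int.from_bytes(session_salt, "big")
    # Salt fills bytes 0..13, SSRC bytes 4..7, packet index bytes 8..13;
    # the last two bytes are left free for the block counter b_c.
    iv = (k_s << 16) ^ (ssrc << 64) ^ (packet_index << 16)
    return ((iv + b_c) % 2**128).to_bytes(16, "big")


zero_salt = bytes(14)
block = icm_counter_block(zero_salt, ssrc=0x11223344, packet_index=0x000100000002, b_c=1)
print(block.hex())
```

With an all-zero salt the three fields show up unmasked in the output, which makes it easy to compare the result against the byte positions in the diagram.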
By using the AES-ICM algorithm, the RTP/RTCP payload is encrypted to obtain the Encrypted Portion.
HMAC-SHA1
Hash-based message authentication code (HMAC) is a message authentication code (MAC) produced by a specific construction that combines a cryptographic hash function with a secret key. It can be used to ensure the integrity of data and, at the same time, to authenticate a message. HMAC uses a standard algorithm to mix the key into the hash computation. HMAC is computed as follows:
HMAC(K,M) = H ( (K XOR opad ) + H( (K XOR ipad ) + M ) )
- H: hash algorithm, for example MD5, SHA-1 or SHA-256.
- B: block length in bytes; the block is the basic unit of the hash operation. Here B=64.
- L: length in bytes of the hash output (L=16 for MD5, L=20 for SHA-1).
- K: shared key. The length of K can be arbitrary, but for security it is recommended that the length of K be at least L. When the length of K is greater than B, the hash algorithm is first applied to K and the L-byte result is used as the new shared key. If the length of K is less than B, 0x00 is appended to K until its length equals B.
- M: the content to be authenticated.
- opad: outer padding constant, 0x5C repeated B times.
- ipad: inner padding constant, 0x36 repeated B times.
- XOR: exclusive OR operation.
- +: the "connection" (concatenation) operation.
The calculation steps are as follows:
1. Append 0x00 to K until its length equals B.
2. XOR the result of step 1 with ipad.
3. Append the message M to be authenticated to the result of step 2.
4. Apply H to the result of step 3.
5. XOR the result of step 1 with opad.
6. Append the result of step 4 to the result of step 5.
7. Apply H to the result of step 6.
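The steps above can be written out directly and compared against a library implementation. The sketch below uses SHA-1 with B=64; the function name hmac_sha1 is chosen for illustration, and the sample key and message are test case 1 from RFC2202.

```python
import hashlib
import hmac


def hmac_sha1(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA1 built from the steps above (illustrative, not optimized)."""
    B = 64
    if len(key) > B:                                     # long keys are hashed first
        key = hashlib.sha1(key).digest()
    key = key.ljust(B, b"\x00")                          # step 1
    inner_key = bytes(k ^ 0x36 for k in key)             # step 2 (ipad)
    inner = hashlib.sha1(inner_key + message).digest()   # steps 3 and 4
    outer_key = bytes(k ^ 0x5C for k in key)             # step 5 (opad)
    return hashlib.sha1(outer_key + inner).digest()      # steps 6 and 7


key = b"\x0b" * 20
message = b"Hi There"
tag = hmac_sha1(key, message)
# Cross-check against the standard library implementation.
assert tag == hmac.new(key, message, hashlib.sha1).digest()
print(tag.hex())
```

In real code the standard hmac module (or the crypto library already in use) is the better choice; the hand-written version is only meant to make the steps visible.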
When SRTP and SRTCP calculate the Authentication tag, K is the RTP auth key or RTCP auth key described in the Key management section, the hash algorithm is SHA-1, and the Authentication tag is 80 bits.
When calculating the SRTP Authentication tag, the content M to be authenticated is:
M = Authenticated Portion + ROC
where + represents the "connection" operation, and Authenticated Portion is given in the SRTP structure diagram.
When calculating the SRTCP Authentication tag, the content M to be authenticated is:
M = Authenticated Portion
where Authenticated Portion is given in the SRTCP structure diagram.
By applying the HMAC-SHA1 algorithm to the Authenticated Portion, the Authentication tag of the SRTP/SRTCP packet is obtained.
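Putting the two definitions of M together, tag computation can be sketched as follows. The function names are hypothetical, the ROC is assumed to be appended as a 32-bit value in network byte order, and the 80-bit tag is assumed to be the leftmost 10 bytes of the HMAC-SHA1 output.

```python
import hashlib
import hmac
import struct


def srtp_auth_tag(auth_key: bytes, authenticated_portion: bytes, roc: int,
                  tag_len: int = 10) -> bytes:
    """SRTP: M = Authenticated Portion + ROC, tag truncated to tag_len bytes."""
    m = authenticated_portion + struct.pack("!I", roc)
    return hmac.new(auth_key, m, hashlib.sha1).digest()[:tag_len]


def srtcp_auth_tag(auth_key: bytes, authenticated_portion: bytes,
                   tag_len: int = 10) -> bytes:
    """SRTCP: M = Authenticated Portion, tag truncated to tag_len bytes."""
    return hmac.new(auth_key, authenticated_portion, hashlib.sha1).digest()[:tag_len]


demo_key = bytes(range(20))               # placeholder key, not a real session key
demo_packet = b"\x80\x60" + bytes(30)     # placeholder header plus encrypted payload
tag = srtp_auth_tag(demo_key, demo_packet, roc=0)
# A receiver recomputes the tag and compares it in constant time.
print(hmac.compare_digest(tag, srtp_auth_tag(demo_key, demo_packet, roc=0)))
```

Because ROC is part of M for SRTP, a receiver whose ROC estimate is wrong computes a different tag and the packet fails authentication, even if every byte on the wire is intact.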
AES-GCM
AES-GCM uses counter mode to encrypt data, an operation that can be pipelined efficiently. The operations used for GCM authentication are particularly well suited to efficient hardware implementation. The GCM-SPEC covers the theory of GCM in detail, and its Section 4.2 Hardware covers the hardware implementation.
The application of AES-GCM to SRTP encryption is described in detail in RFC7714. Key management and serial number management are the same as described in this article. Note that:
- AES-GCM is an AEAD (Authenticated Encryption with Associated Data) encryption algorithm. What are its inputs and outputs? Map them onto the packet structure of SRTP/SRTCP.
- The Counter is calculated differently from the method described for AES-ICM, which requires special attention.
libsrtp implements AES-GCM. Interested readers can study it together with the code.
Use of libsrtp
libsrtp is a widely used open source project for SRTP/SRTCP encryption. The frequently used APIs are as follows:
1. srtp_init: initializes the srtp library and its internal encryption algorithms; it must be called before srtp is used.
2. srtp_create: creates an srtp_session; it can be understood together with the concepts of session and session key introduced in this article.
3. srtp_unprotect/srtp_protect: the decryption and encryption interface for RTP packets.
4. srtp_protect_rtcp/srtp_unprotect_rtcp: the encryption and decryption interface for RTCP packets.
5. srtp_set_stream_roc/srtp_get_stream_roc: set and get the stream ROC; these two interfaces were added in version 2.3, the latest at the time of writing.
The important structure srtp_policy_t is used to initialize the encryption and decryption parameters and is passed to srtp_create. The following parameters need attention:
- The MasterKey and MasterSalt obtained after DTLS negotiation are passed to libsrtp through this structure for session key generation.
- window_size corresponds to the window size of the srtp anti-replay attack described earlier.
- allow_repeat_tx: whether to allow retransmission of packets with the same sequence number.
SRS is a new-generation real-time communication server. Readers interested in libsrtp can use it to quickly set up a local debugging environment, run related tests, and gain a deeper understanding of the algorithms involved.
Summary
This article gave a detailed, in-depth interpretation of the principles behind SRTP/SRTCP and answered the problems encountered when using libsrtp, in the hope of helping readers who work in real-time audio and video communication.
References
- RFC3711: SRTP
- RFC6904: Encrypted SRTP Header Extensions
- Integer Counter Mode
- RFC6188: The Use of AES-192 and AES-256 in Secure RTP
- RFC7714: AES-GCM for SRTP
- RFC2104: HMAC
- RFC2202: Test Cases for HMAC-MD5 and HMAC-SHA-1
- GCM-SPEC: GCM