Running into bugs while getting started with WebRTC DTLS? A look at DTLS fragmentation

The previous article, "Explaining the WebRTC Transmission Security Mechanism: An Article to Understand the DTLS Protocol", covered DTLS in general. This article draws on problems encountered while developing with DTLS to explain some basic DTLS concepts and the Fragment mechanism in detail, as a further study of the DTLS protocol.

Author｜Taiyi

Proofreading｜Go to school, Mo Zhan

Preface

Recently I have been working on DTLS-SRTP handshake encryption for two RTC systems, J and G, both of which are required to use certificates issued by a CA. While debugging locally, I found that with the CA certificate the DTLS handshake succeeded on the G system but failed on the J system.

After some debugging and analysis, the cause was located: compared with the G system, the J system has one additional TURN forwarding module. This module caps its receive buffer at 1600 bytes, while the CA certificate is nearly 3000 bytes, so the certificate that the TURN module forwarded to the client was incomplete, causing the DTLS handshake to fail.

WebRTC's DTLS normally uses a self-signed certificate, which is generally small: as shown in the figure below, only 286 bytes.

However, a certificate issued by a CA can be much larger. As shown in the figure below, it reaches 2772 bytes, which clearly exceeds the size of the TURN module's receive buffer.

In the figure above, you may have noticed that the CA certificate is divided into two fragments. This is done by the DTLS protocol layer. But it raises a question: neither fragment of the CA certificate exceeds the 1600-byte limit of the TURN module's receive buffer, so why did the J system's TURN forwarding module still fail to receive it?

The reason is that although the certificate is fragmented, the fragments are not sent to the TURN module independently. They are still packed into the same UDP datagram, so reception is bound to fail.

Next, let's look at how DTLS fragmentation works. First, a few concepts need to be clarified.

Message, Record, Flight

The DTLS protocol is divided into two layers: the Record Protocol at the bottom and, above it, the Handshake Protocol, the Change Cipher Spec Protocol, the Alert Protocol, and the Application Data Protocol.

Remark: The handshake protocol, the change cipher spec protocol, the alert protocol, and the application data protocol all sit above the DTLS record protocol. In this article, the term DTLS handshake protocol is used loosely for this upper layer.
Note: The respective functions of the record and handshake protocols are not repeated here; see DTLS application in WebRTC.

A DTLS Message is a complete DTLS message, for example a handshake message (Client Hello, Certificate, Client Key Exchange, and so on) or a change cipher spec message (Change Cipher Spec).

DTLS Record is a Record Layer concept. It can be thought of as a shell with a DTLS Message loaded inside, as shown below:

Message and Record have a one-to-one or one-to-many relationship. In other words, a Record does not necessarily contain a complete Message, because several Records may be needed to form one complete Message.

If the Message is small and does not exceed the MTU limit, one Record is enough to hold it. If the Message is too large and exceeds the MTU limit, multiple Records are needed: the DTLS Message is divided into multiple Fragments, and each Fragment is loaded into its own Record.

Remark: Maximum transmission unit (MTU) is a data link layer concept. It limits the payload size of a link-layer frame, that is, the size of the upper-layer packet it carries, such as an IP packet. On Ethernet, the MTU is 1500 bytes.

For example, in the Certificate handshake message the certificate can easily exceed the MTU limit. The message is then divided into multiple Fragments stored in multiple DTLS Records, and the size of each Fragment must not exceed the MTU limit. (PS: The second figure in the Preface is a practical example.)

Flight can be read as a "flight" or "voyage". It is one Message or a group of Messages packaged together. The Messages in the group belong to the same "voyage" and are treated as a whole; in the captures in this article, each Flight is sent in a single UDP datagram.

As shown in the figure above, there are a total of 4 flights in this DTLS handshake. Flight2 is a combination of the three Messages of Server Hello, Certificate, and Server Hello Done. The Message of Certificate is divided into two Fragments and loaded into two Records. Flight2 is sent out via a UDP datagram with a size of 2969 bytes.

Remark: The 2969-byte UDP packet of Flight2 was captured while debugging in a local environment. It does not mean that the MTU is that large. On a real network, packets that far exceed the MTU limit will not appear.

At this point, the concepts of Message, Record, and Flight have been covered. The relationship between the three is as follows:

Fragment

Let's talk about why DTLS fragments a DTLS Message.

Because the Ethernet MTU is 1500 bytes, an IP packet carrying a UDP datagram cannot exceed 1500 bytes. If this limit is exceeded, the packet is fragmented by the IP layer (PS: the Ethernet MTU is set to 1500 bytes to maximize channel utilization).

But what if IP-layer fragmentation is disabled? Then UDP datagrams that exceed the limit are discarded at the IP layer. Therefore, DTLS must fragment the message itself to satisfy the IP layer's size limit. The Message Size section of DTLS 1.2 explains this:

By contrast, UDP datagrams are often limited to < 1500 bytes if IP fragmentation is not desired. In order to compensate for this limitation, each DTLS handshake message may be fragmented over several DTLS records, each of which is intended to fit in a single IP datagram.

Therefore, the fragmentation mechanism of DTLS is very simple: divide the DTLS Message into multiple consecutive DTLS Records when sending, and cache the fragments when receiving until you have a complete DTLS Message.
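The receive side described above (cache fragments until the whole message is present) can be sketched roughly as follows. This is only an illustration and not how OpenSSL does it internally: the names `struct reassembly`, `reassembly_init`, `reassembly_add`, and the `MAX_MSG_LEN` cap are all hypothetical, and the offsets and lengths are assumed to come from the fragment offset and fragment length fields of each handshake fragment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: an arbitrary cap chosen for this illustration only. */
#define MAX_MSG_LEN 4096u

/* Hypothetical reassembly state for one handshake message. */
struct reassembly {
    uint8_t  data[MAX_MSG_LEN];  /* message bytes collected so far */
    uint8_t  seen[MAX_MSG_LEN];  /* seen[i] != 0 once byte i has arrived */
    uint32_t total_len;          /* full message length */
    uint32_t received;           /* number of distinct bytes received */
};

static void reassembly_init(struct reassembly *r, uint32_t total_len)
{
    assert(r != NULL && total_len <= MAX_MSG_LEN);
    memset(r, 0, sizeof *r);
    r->total_len = total_len;
}

/*
 * Store one fragment. Fragments may arrive out of order, overlap, or be
 * duplicated (for example after a retransmission).
 * Returns 1 when the message is complete, 0 if bytes are still missing,
 * and -1 if the fragment does not fit inside the message.
 */
static int reassembly_add(struct reassembly *r, uint32_t offset,
                          const uint8_t *frag, uint32_t frag_len)
{
    assert(r != NULL && frag != NULL);
    if (offset > r->total_len || frag_len > r->total_len - offset)
        return -1;
    for (uint32_t i = 0; i < frag_len; i++) {
        if (!r->seen[offset + i]) {
            r->seen[offset + i] = 1;
            r->received++;
        }
        r->data[offset + i] = frag[i];
    }
    return r->received == r->total_len;
}
```

Tracking each byte individually is wasteful but keeps the sketch short; a real implementation would more likely track byte ranges or a bitmap.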

We can use these two APIs of OpenSSL to set the MTU size:

SSL_set_options(dtls, SSL_OP_NO_QUERY_MTU);
SSL_set_mtu(dtls, 1500);

The code above sets the MTU to 1500, so when a DTLS Message exceeds 1500 bytes the DTLS fragmentation mechanism is triggered. Similarly, if the MTU is set to 300, a DTLS Message larger than 300 bytes is fragmented. If you don't set it, the default MTU is used. As shown in the figure below, the certificate message is divided into several Fragments with a fixed size of 288 bytes.

Remark: TLS runs on top of TCP, which is a byte stream, so TLS does not need a handshake message fragmentation mechanism of this kind.

We can also use the following API to set the upper limit of the Fragment size:

SSL_set_max_send_fragment(dtls, 1500);

Finally, let's go back to the problem described in the Preface: the certificate message is indeed divided into two Fragments and stored in two Records, but because both are still packed into one UDP datagram when sent, the datagram is too large and the TURN module cannot receive the complete data.

In more detail: we use a memory BIO, and what the application layer gets from BIO_get_mem_data is one contiguous block of memory holding the DTLS Messages (even though the certificate message in that memory has already been cut by DTLS into two consecutive Fragments carried in two Records). The application layer then passes this memory straight to sendto to send it to the peer. The resulting UDP packet is therefore still very large, and reception fails.

Looking back at the capture of the certificate message fragments in the Preface, the message sequence field of the two Records is the same, which indicates that they are two Fragments of the same DTLS Message. Each Record also carries a fragment offset and a fragment length, which identify the boundaries of the Fragment. Therefore, each independent Fragment can be parsed out from these two fields.
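To make the fields mentioned above concrete, here is a minimal sketch that reads the handshake header at the start of a plaintext handshake Record body. The 12-byte layout (msg_type, 24-bit length, message_seq, 24-bit fragment_offset, 24-bit fragment_length) follows DTLS 1.2; the struct and function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* msg_type(1) + length(3) + message_seq(2) + fragment_offset(3) + fragment_length(3) */
#define DTLS_HANDSHAKE_HEADER_LEN 12u

/* Hypothetical container for the parsed header fields. */
struct dtls_handshake_header {
    uint8_t  msg_type;         /* e.g. 11 for Certificate */
    uint32_t length;           /* length of the whole message (24-bit) */
    uint16_t message_seq;      /* same value for all fragments of one message */
    uint32_t fragment_offset;  /* where this fragment starts in the message */
    uint32_t fragment_length;  /* number of message bytes in this fragment */
};

/* Read a 24-bit big-endian integer. */
static uint32_t read_u24(const uint8_t *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
}

/*
 * Parse the handshake header at the start of a Record body.
 * Returns 0 on success, -1 if the body is too short or the fragment
 * would extend past the end of the message.
 */
static int dtls_parse_handshake_header(const uint8_t *body, size_t len,
                                       struct dtls_handshake_header *out)
{
    assert(body != NULL && out != NULL);
    if (len < DTLS_HANDSHAKE_HEADER_LEN)
        return -1;
    out->msg_type        = body[0];
    out->length          = read_u24(body + 1);
    out->message_seq     = (uint16_t)((body[4] << 8) | body[5]);
    out->fragment_offset = read_u24(body + 6);
    out->fragment_length = read_u24(body + 9);
    if (out->fragment_offset + out->fragment_length > out->length)
        return -1;
    return 0;
}

/* A message is fragmented when this fragment does not cover all of it. */
static int dtls_is_fragmented(const struct dtls_handshake_header *h)
{
    return h->fragment_length != h->length;
}
```

This only works on unencrypted handshake Records, which is the case for the Certificate message in a DTLS 1.2 handshake.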

The Record header also has a Length field, which makes parsing at the application layer even more convenient. So, to solve this problem, the application layer has to parse the message memory obtained from the BIO to find the boundary of each Record, and then send each Record as an independent UDP datagram. The parsing code is simple and is not posted here.

Finally, practice shows that a DTLS Record cannot be sent across UDP datagrams; the Transport Layer Mapping section of DTLS 1.2 also explains this. In other words, the application layer must parse out each Record strictly by its boundary and send it in its own UDP datagram, rather than cutting the data into several UDP datagrams at arbitrary points. Arbitrary cutting could split one DTLS Record across multiple UDP datagrams, and the receiving DTLS side would then be unable to reassemble the received DTLS Records into a complete DTLS Message.
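Since the original parsing code is not shown, the following is a minimal sketch of what such Record splitting could look like, not the author's actual implementation. It relies only on the 13-byte DTLS 1.2 record header (type, version, epoch, sequence number, length); the function names and the `record_cb` callback type are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* DTLS 1.2 record header: type(1) version(2) epoch(2) sequence(6) length(2). */
#define DTLS_RECORD_HEADER_LEN 13u

/*
 * Returns the total size (header + body) of the record starting at buf,
 * or 0 if the remaining `len` bytes do not hold one complete record.
 */
static size_t dtls_record_size(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < DTLS_RECORD_HEADER_LEN)
        return 0;
    /* The 16-bit big-endian length is the last field of the header. */
    size_t body = ((size_t)buf[11] << 8) | buf[12];
    size_t total = DTLS_RECORD_HEADER_LEN + body;
    return total <= len ? total : 0;
}

/* Hypothetical callback, invoked once per record; return 0 to continue. */
typedef int (*record_cb)(const uint8_t *rec, size_t rec_len, void *arg);

/*
 * Walk a buffer of back-to-back records and hand each one to `cb`
 * (which may be NULL to only count them).
 * Returns the number of records, or -1 on truncated data or callback failure.
 */
static int dtls_for_each_record(const uint8_t *buf, size_t len,
                                record_cb cb, void *arg)
{
    int count = 0;
    size_t off = 0;
    while (off < len) {
        size_t n = dtls_record_size(buf + off, len - off);
        if (n == 0)
            return -1;
        if (cb != NULL && cb(buf + off, n, arg) != 0)
            return -1;
        off += n;
        count++;
    }
    return count;
}
```

In the setup described here, `buf` and `len` would be the memory obtained from BIO_get_mem_data, and the callback would wrap a sendto call so that each Record leaves in its own UDP datagram.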

The following figure shows the effect after DTLS fragments are sent independently:

Interested readers can refer to the DTLS demo I wrote, which implements a simple DTLS handshake with independent fragment sending. You can also refer to the open source video server SRS, which is more concise and detailed.

Summary

For a DTLS Message that exceeds the MTU limit, DTLS divides it into multiple Fragments and stores each one in its own DTLS Record, so every Fragment is carried in a DTLS Record. A DTLS Message that does not exceed the MTU limit is not fragmented and is also stored in a DTLS Record. A DTLS Record therefore does not necessarily hold a Fragment; it may hold a complete DTLS Message. In addition, the MTU and the maximum Fragment size can be set through the OpenSSL API.

Because we obtained one contiguous block of memory holding all the DTLS Messages through the memory BIO and sent it, packaged as a whole Flight, in a single UDP datagram, the UDP packet was still that large and exceeded both the TURN module's receive buffer limit and the MTU limit. Therefore, to actually send fragments independently, the application layer has to parse out the Fragments itself (in practice, parse the Record boundaries) and send them in independent UDP packets.

Whenever we solve a problem, we have to ask ourselves whether we have introduced a new one.

Sending each DTLS Record independently solves the problem of a DTLS Message exceeding the MTU limit, but it also increases the number of UDP packets, so the probability of packet loss rises, DTLS retransmissions increase, and the handshake success rate drops. One way to mitigate this: it is not necessary to send every DTLS Record in its own UDP datagram. Several DTLS Records can be sent together, as long as their combined size does not exceed the MTU limit.

At the same time, we have to ask ourselves whether there is a better way.

For example, the current solution has the application layer parse the Records and send them independently. Does OpenSSL have an API that provides similar functionality? For instance, does BIO have an API that reports how many Records are in the memory data that was read and where each Record's boundary is? I will look into this when I have time.
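The idea of grouping several Records per datagram can be sketched as a greedy packer: starting from a position in the buffer, take as many whole Records as fit within a payload budget. This is an illustrative sketch only; `dtls_next_datagram_len` and the `max_payload` parameter are hypothetical names, and the budget is assumed to be the MTU minus IP and UDP header overhead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* DTLS 1.2 record header: type(1) version(2) epoch(2) sequence(6) length(2). */
#define DTLS_RECORD_HEADER_LEN 13u

/*
 * Starting at buf, return how many bytes cover as many whole records as
 * fit within max_payload. A record is never split across datagrams.
 * Returns 0 if the first record is truncated or is by itself larger
 * than max_payload; the caller should treat that as an error.
 */
static size_t dtls_next_datagram_len(const uint8_t *buf, size_t len,
                                     size_t max_payload)
{
    assert(buf != NULL);
    size_t used = 0;
    while (len - used >= DTLS_RECORD_HEADER_LEN) {
        const uint8_t *rec = buf + used;
        size_t rec_len = DTLS_RECORD_HEADER_LEN +
                         (((size_t)rec[11] << 8) | rec[12]);
        if (rec_len > len - used)          /* truncated record */
            break;
        if (used + rec_len > max_payload)  /* would overflow this datagram */
            break;
        used += rec_len;
    }
    return used;
}
```

A caller would loop over the buffer, sending `dtls_next_datagram_len(...)` bytes per sendto call and advancing by that amount, stopping with an error if the function returns 0 while data remains.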

Thanks for reading.

References

DTLS 1.2
TLS 1.2