

5G remote control scenarios place very high requirements on real-time audio and video transmission in terms of delay, stall rate, and weak-network resistance. This article introduces how to jointly optimize the real-time audio and video communication link around the characteristics of 5G networks, in order to meet the remote control needs of industry scenarios and reduce picture delay.


In the previous article, we introduced the key technical points of remote control. Starting with this article, the author will introduce, in turn, the application and optimization of the three major remote control technologies. This article starts with real-time audio and video communication, which in remote control is mainly used to transmit the images and sound of the controlled device, or of the vehicle's surroundings, to the remote control terminal in real time, so that the remote driver or operator can clearly understand the conditions around the controlled device and control it accordingly. For example, the front and rear views of a vehicle while it is moving, and the view of the grab arm while an excavator is operating, all need to be transmitted remotely through real-time audio and video technology.

To keep control real-time and smooth, remote control places much higher requirements on picture transmission than on sound transmission, especially for core indicators such as picture delay, stall rate, and weak-network resistance. Taking low-speed remote driving as an example, the delay needs to be below 200ms and as close to 100ms as possible, the stall rate should preferably be below 2 per thousand, and in extreme cases the link must withstand network fluctuation equivalent to the average RTT delay and a packet loss rate of around 20%-30%. These requirements are often significantly higher than those of earlier application scenarios such as remote conferencing, live streaming, and surveillance. For real-time audio and video technology, reducing latency usually conflicts with reducing the stall rate and improving weak-network resistance, so this is a very big challenge.

Comparison of indicators between remote control and other application scenarios

企业微信截图_16391329328340.png

Following the diagram to find the key optimization points

The following figure is a schematic diagram of a typical video transmission link, which consists mainly of acquisition, encoding, sending, transmission, receiving, decoding, and rendering modules.
企业微信截图_16391330229705.png

Schematic diagram of a typical video transmission link

Acquisition: collect the original image frame data from the camera

Encoding: encode the captured original image frames

Sending: packetize and send the encoded video frames

Transmission: carry the packetized data over the network

Receiving: receive the packetized data and recover the video frames

Decoding: decode the video frames to recover the original image frame data

Rendering: render the original image frame data and output it to the screen

In real-time audio and video communication, the jitterbuffer in the receiving module is mainly responsible for absorbing network fluctuations and reducing the stall rate, and it is also one of the main contributors to delay. Jitterbuffer implementations differ slightly between projects, but they generally include packet reordering, frame detection, and frame buffering. The jitterbuffer receives video frames correctly and buffers them appropriately. Once a frame is confirmed to be decodable, it is smoothed according to the estimated inter-frame delay (the difference between the receiving times of two frames minus the difference between their sending times) and then passed to the subsequent decoding and rendering modules. In this way, even if the network fluctuates to some extent, the smoothing performed by the jitterbuffer lets adjacent video frames be rendered at close to the expected interval, so playback stays smooth. To cope with packet loss, reordering, and delay jitter, the larger the network RTT and delay jitter, the larger the jitterbuffer needs to be, and the larger buffer in turn increases video delay. This is the root cause of the contradiction between the three major indicators.
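
The inter-frame delay described above can be sketched in a few lines of Python. This is only an illustration: the `Frame` type, the exponential smoothing factor `alpha`, and the `safety` multiplier are assumptions made for the example, not the sizing rule of any particular jitterbuffer.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    send_ms: float  # sender-side timestamp of the frame
    recv_ms: float  # time the complete frame arrived at the receiver


def inter_frame_delays(frames):
    """Inter-frame delay: (receive time difference) - (send time difference)."""
    return [
        (cur.recv_ms - prev.recv_ms) - (cur.send_ms - prev.send_ms)
        for prev, cur in zip(frames, frames[1:])
    ]


def target_buffer_ms(delays, alpha=0.1, safety=2.0):
    """Hypothetical sizing rule: smoothed |delay variation| times a safety factor."""
    jitter = 0.0
    for d in delays:
        # Exponential moving average of the absolute delay variation.
        jitter += alpha * (abs(d) - jitter)
    return safety * jitter


# Frames sent every 40 ms; the third frame arrives 15 ms late.
frames = [Frame(0.0, 100.0), Frame(40.0, 140.0), Frame(80.0, 195.0), Frame(120.0, 220.0)]
delays = inter_frame_delays(frames)
print(delays, target_buffer_ms(delays))
```

With a perfectly steady network every inter-frame delay is zero and the target buffer shrinks toward zero; larger jitter pushes the target up, which is exactly the delay cost discussed above.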

Beyond the receiving module, let's look at the other modules. As chip computing power grows, the delay of modules such as encoding, decoding, and rendering has become very small, basically within 10ms and even around 5ms, so there is little room for optimization there and little impact on the three core indicators. The delay of the acquisition and transmission modules is mainly determined by external conditions: the former depends on the camera and the latter on the network. The sending module affects the packet loss, delay, and jitter of data transmission, and therefore what the receiver sees. To reach the three core indicators, it is therefore mainly the sending and receiving modules that need to be optimized: by optimizing the sending module, the receiver's jitterbuffer can be made as small as possible while the stall rate and weak-network resistance are preserved, which reduces delay.

A targeted optimization design

For the joint optimization of the sending and receiving modules, implementations differ between projects, and their complexity and effectiveness also vary considerably. The following is a schematic diagram of a relatively complex implementation of the sending and receiving modules in a real-time audio and video communication architecture. Real-time audio and video communication in Tencent's remote control products also uses this structure.

企业微信截图_16391331489997.png

Send and receive module schematic

The sending module mainly consists of the packetization protocol, congestion control, the sending window, error coding, and so on. To improve transmission efficiency and weak-network resistance, the packetization protocol is usually based on the standard RTP protocol, carried over UDP at the lower layer. Congestion control mainly estimates the state of the network and recommends the pacing of the sending window and the bit rate. Error coding mainly protects against RTP packet loss by providing forward error correction, so that some lost packets can be recovered through error decoding without relying on retransmission.
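
To make the packetization step concrete, here is a minimal sketch that builds and parses the 12-byte fixed RTP header using only the Python standard library. The payload type 96, the SSRC value, and the helper names are chosen for illustration; header extensions, CSRC lists, and frame fragmentation are deliberately left out.

```python
import struct

RTP_VERSION = 2


def build_rtp_packet(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Pack a minimal 12-byte RTP fixed header (no CSRC list, no extension)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII", byte0, byte1, seq & 0xFFFF,
        timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF,
    )
    return header + payload


def parse_rtp_header(packet):
    """Unpack the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Illustrative values only: dynamic payload type 96, 90 kHz video clock.
pkt = build_rtp_packet(96, 1, 90000, 0x1234, b"frame", marker=True)
hdr = parse_rtp_header(pkt)
print(hdr)
```

The resulting bytes would typically be handed to a UDP socket (`socket.SOCK_DGRAM`) with `sendto`; the sequence number is what lets the receiver detect loss and reordering.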

Besides the out-of-order buffer, frame detection buffer, and frame buffer that make up the jitterbuffer, the receiving module also contains modules such as unpacking, error decoding, and link state estimation and feedback. Link state estimation and feedback mainly estimates the link's packet loss, delay, and delay jitter, which guides the sizing of the jitterbuffer and provides a reference for congestion control at the sender.

As mentioned above, the goal of optimization is to reduce the size of the jitterbuffer, and inter-frame delay fluctuation is the core factor that determines that size. Besides network fluctuation, retransmission after packet loss is the main contributor to peaks in delay fluctuation. The first consideration in jointly optimizing sending and receiving is therefore to reduce packet loss and retransmission. For 5G remote control scenarios, Tencent has optimized mainly congestion control and error coding to reduce the probability of packet loss and retransmission.

Congestion control: Among the congestion control methods currently common for real-time audio and video, BBR and GCC are the better choices.

BBR is based mainly on the network's bandwidth-delay product. It probes the maximum bandwidth and the minimum delay of the network separately and treats their product as the maximum amount of data the network can carry. Its advantage is that it resists noise from random fluctuations in network delay and packet loss. Its disadvantages are that throughput drops while the minimum delay is being measured, and that when the network degrades suddenly it takes longer to come down to the actual bandwidth. Moreover, BBR was not originally designed for video transmission, and experience applying it to real-time audio and video is relatively limited.
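
The bandwidth-delay product idea can be illustrated with a small sketch. This is a deliberate simplification rather than BBR itself: it keeps a single sliding window for both measurements, and the class name, window size, and sample values are assumptions made for the example.

```python
from collections import deque


class BdpEstimator:
    """Tracks the windowed maximum delivery rate and minimum RTT."""

    def __init__(self, window=10):
        self.rates_bps = deque(maxlen=window)
        self.rtts_s = deque(maxlen=window)

    def on_sample(self, delivery_rate_bps, rtt_s):
        self.rates_bps.append(delivery_rate_bps)
        self.rtts_s.append(rtt_s)

    def bdp_bytes(self):
        """Bandwidth-delay product: max bandwidth x min RTT, in bytes."""
        if not self.rates_bps:
            return 0.0
        return max(self.rates_bps) * min(self.rtts_s) / 8


est = BdpEstimator()
for rate_bps, rtt_s in [(8e6, 0.05), (10e6, 0.03), (6e6, 0.04)]:
    est.on_sample(rate_bps, rtt_s)
print(est.bdp_bytes())  # about 37500 bytes in flight
```

Because the estimate uses the maximum rate and minimum RTT seen in the window, a sudden drop in real bandwidth only shows up once the old samples age out, which matches the slow-reaction drawback noted above.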

GCC (Google Congestion Control) combines delay-based congestion control with loss-based congestion control and takes the minimum of the two rate estimates. In the delay-based part, GCC uses a Kalman filter to smooth the influence of network fluctuation noise on the delay gradient estimate. The advantage of GCC is that it takes both delay and packet loss into account, and there is good practical experience with it.
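
A minimal sketch of the "take the minimum of the two" step follows. The loss thresholds (2% and 10%) and the 5% increase follow the publicly documented GCC draft as best understood here and should be treated as assumptions; the delay-based estimate is taken as an input rather than computed, and the function names are hypothetical.

```python
def loss_based_rate(prev_rate_bps, loss_fraction):
    """Loss-based controller: back off on heavy loss, probe up on light loss."""
    if loss_fraction > 0.10:
        return prev_rate_bps * (1 - 0.5 * loss_fraction)
    if loss_fraction < 0.02:
        return prev_rate_bps * 1.05
    return prev_rate_bps  # between the thresholds: hold the rate


def gcc_target_rate(delay_based_bps, prev_loss_rate_bps, loss_fraction):
    """Final target is the minimum of the delay-based and loss-based estimates."""
    return min(delay_based_bps, loss_based_rate(prev_loss_rate_bps, loss_fraction))


# Light loss lets the loss-based rate grow, so the delay-based estimate is the limit.
print(gcc_target_rate(800_000, 1_000_000, 0.01))
# Heavy loss pulls the loss-based rate below the delay-based estimate.
print(gcc_target_rate(1_200_000, 1_000_000, 0.20))
```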

Error coding: In network transmission, packet loss can be modeled as an erasure channel in which data packets are randomly erased during transmission. Forward error correction (FEC) codes designed for erasure channels can therefore be used to recover lost packets by adding redundancy when packets are sent. Balancing error correction performance against computational complexity, audio and video transmission mainly uses linear block codes, most commonly XOR codes and Reed-Solomon (RS) codes. Because FEC is designed mainly for random errors, it can resist a certain degree of random packet loss with a short code length (number of coded packets). A short code length still cannot cope with burst packet loss caused by congestion or degraded network quality; in that case the traditional approach is to increase the time interval between packets and lengthen the code to resist burst loss.
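
The simplest of the codes mentioned above, a single XOR parity packet per group, can be sketched as follows. The example assumes all packets in a group have equal length (real implementations pad them), and the function names are hypothetical.

```python
def xor_parity(packets):
    """Return one parity packet: the byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            parity[i] ^= b
    return bytes(parity)


def recover_missing(received, parity):
    """Recover the single missing packet (marked None) in an FEC group."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        return None  # one XOR parity packet can repair exactly one loss
    present = [p for p in received if p is not None]
    # XOR of the surviving packets and the parity yields the lost packet.
    return xor_parity(present + [parity])


group = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(group)
print(recover_missing([group[0], None, group[2]], parity))  # the lost middle packet
print(recover_missing([None, None, group[2]], parity))      # two losses: not recoverable
```

The second call shows why a short code cannot absorb a burst: once two packets of the same group are lost, a single parity packet is no longer enough.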

Optimization and enhancement based on 5G air interface network

In 5G remote control scenarios, the delay and fluctuation of the 5G air interface account for a relatively large share of the network delay, and the network model of the 5G air interface differs somewhat from that of traditional routers. Traditional routing mainly loses packets because of congestion and performs no retransmission of its own; the 5G air interface has both error-induced and congestion-induced packet loss and has its own retransmission. In traditional routing, increases in delay are mainly caused by congestion; on the 5G air interface, the resource scheduling cycle also introduces a certain amount of delay fluctuation, especially for uplink data. The bandwidth of the 5G air interface depends on the signal-to-noise ratio and the air interface load and changes over time; the bandwidth of traditional routing is relatively fixed and is mainly affected by network load.

Comparison of the characteristics of routers and 5G air interface networks
企业微信截图_16391332188378.png

Congestion control optimization: As shown, 5G air interface networks differ considerably from traditional routing. Faced with delay jitter caused by the resource scheduling cycle and bandwidth fluctuation caused by signal quality, BBR congestion control has limited applicability. Because signal quality on the 5G air interface can change the network bandwidth substantially, congestion control based on estimates of the air interface signal-to-interference-plus-noise ratio (SINR) and the network load can be added on top of GCC's delay-based and loss-based congestion control, so that it reacts faster to changes in the 5G air interface network. At the same time, the Kalman filter used for delay gradient estimation in GCC can be modified to better smooth the delay gradient jitter caused by the resource scheduling cycle.
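
The article does not say how the air-interface estimate is computed, so the following is only one possible sketch. It swaps in the Shannon capacity formula as a rough upper bound, scaled by a made-up `efficiency` factor and the fraction of the cell not used by other traffic, and then takes the minimum with the two GCC estimates. All function names, parameters, and numbers are assumptions.

```python
import math


def air_interface_rate_bps(sinr_db, bandwidth_hz, load_fraction, efficiency=0.6):
    """Hypothetical estimate: scaled Shannon bound times the unloaded share."""
    sinr_linear = 10 ** (sinr_db / 10)
    shannon_bps = bandwidth_hz * math.log2(1 + sinr_linear)
    return efficiency * (1 - load_fraction) * shannon_bps


def target_rate_bps(delay_based_bps, loss_based_bps, air_bps):
    """Extend the GCC minimum with the air-interface estimate."""
    return min(delay_based_bps, loss_based_bps, air_bps)


good = air_interface_rate_bps(sinr_db=20, bandwidth_hz=10e6, load_fraction=0.5)
poor = air_interface_rate_bps(sinr_db=0, bandwidth_hz=10e6, load_fraction=0.5)
print(target_rate_bps(8e6, 9e6, good))  # air interface is not the bottleneck
print(target_rate_bps(8e6, 9e6, poor))  # a SINR drop caps the rate immediately
```

The point of the sketch is the reaction speed: a drop in SINR lowers the target in the same update, without waiting for delay gradients or loss reports to accumulate.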

Error coding optimization: Given the characteristics of the 5G air interface network, its built-in retransmission keeps the probability of random packet loss low, so a shorter code length is enough to resist random packet loss. Burst packet loss on the 5G air interface is often caused by a sudden drop in signal quality. The period of this deep fading is usually related to mobility: the faster the movement, the shorter the period, and it is about 10ms when moving at low speed. The traditional approach of simply lengthening the packet interval and increasing the code length cannot handle this effectively, and it increases the amount of data sent, which worsens packet loss. Working together with the congestion control estimate based on the air interface signal-to-interference-plus-noise ratio (SINR), this kind of burst packet loss can be predicted in real time. Reducing the bit rate then stretches the transmission time without increasing the code length and lowers the probability of burst packet loss. In addition, packet interleaving can be introduced to interleave the code blocks to a certain extent and resist burst packet loss.
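
The effect of packet interleaving can be shown with a small sketch. It assumes a simple block interleaver across several short FEC groups; the group count, group size, and burst position are arbitrary values chosen for the example.

```python
def interleave(num_groups, group_size):
    """Send order that takes one packet from each FEC group in turn.

    Returns a list of (group, index_in_group) pairs.
    """
    return [(g, i) for i in range(group_size) for g in range(num_groups)]


def sequential(num_groups, group_size):
    """Send order without interleaving: each group is sent back to back."""
    return [(g, i) for g in range(num_groups) for i in range(group_size)]


def losses_per_group(send_order, burst_start, burst_len, num_groups):
    """Count how many packets each group loses to one burst of consecutive losses."""
    lost = [0] * num_groups
    for g, _ in send_order[burst_start:burst_start + burst_len]:
        lost[g] += 1
    return lost


# A burst wipes out 4 consecutive packets starting at position 2.
plain = losses_per_group(sequential(4, 3), 2, 4, 4)
mixed = losses_per_group(interleave(4, 3), 2, 4, 4)
print(plain)  # one group takes most of the burst
print(mixed)  # the burst is spread to at most one loss per group
```

With interleaving, each group loses at most one packet, which a short code such as a single XOR parity packet can repair. The cost is that a group's packets are spread over a longer time span, so the interleaving depth has to be balanced against the delay budget.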

In general, 5G remote control scenarios place very high requirements on audio and video delay. Although joint optimizations of sending and receiving that take 5G network characteristics into account can meet the remote control needs of some medium- and low-speed industry scenarios, reaching the industry's ideal 100ms target remains challenging, especially in cross-regional remote control scenarios. In the future, more joint optimization methods that work together with the network will need to be introduced. In addition, further gains in camera acquisition and encoding can be explored to maximize the end-to-end result.

8毛峻岭.jpg

Recommended reading: Liberating the distance between people and equipment: how to complete remote control in the 5G era

"Yunjian Big Coffee" is a special column of the Tencent Cloud Plus community. Cloud recommendation officials invite industry leaders to focus on the implementation of cutting-edge technologies and on theory and practice, interpreting hot technologies of the cloud era and exploring new opportunities for industry development.

