Weak-Network Countermeasures in Practice after the Shengwang (Agora) RTE 2020 Conference

VoIP

VoIP, or IP-based audio and video transmission, is a real-time calling technology that uses the Internet Protocol to carry audio and video calls as well as multimedia conferences. It runs on many Internet-connected devices, including VoIP phones, smartphones, and personal computers. It can use cellular networks, Wi-Fi, coaxial cable, fiber optics, and other links to carry signaling, audio and video calls, text messages, and some control information.

Background

Once a mobile phone or monitoring device is connected to the network, the heterogeneity of the Internet and the degraded transmission efficiency of various media make some loss of audio and video packets inevitable, and that loss directly affects what users see and hear and how they judge the experience. TCP uses ACK feedback to confirm that data arrived intact. Real-time media over UDP instead adds NACK feedback to detect and report lost packets, and uses RTCP Sender Reports (SR) and Receiver Reports (RR) to compute RTT and related statistics. WebRTC, with its own JitterBuffer and NetEQ implementations, has made UDP-based audio and video transmission good enough in practice.
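
As a rough illustration of how RTT can be derived from SR and RR reports, the sketch below follows the formula in RFC 3550: the sender subtracts the "last SR" timestamp (LSR) and the "delay since last SR" (DLSR) echoed in a receiver report from the time the report arrived. The function names and the sample timings are assumptions made for this example, not part of any product described here.

```python
def ntp_middle_32(seconds: float) -> int:
    """Convert seconds to 16.16 fixed point (units of 1/65536 s), as RTCP does."""
    return int(round(seconds * 65536)) & 0xFFFFFFFF


def rtt_from_receiver_report(arrival: int, lsr: int, dlsr: int) -> float:
    """Return RTT in seconds using RTT = A - LSR - DLSR (RFC 3550).

    All three inputs are 32-bit values in units of 1/65536 s.
    """
    rtt_units = (arrival - lsr - dlsr) & 0xFFFFFFFF  # wrap-safe subtraction
    return rtt_units / 65536.0


# Hypothetical timings: SR sent at 100.000 s, the receiver held it for 0.250 s
# before sending its RR, and the RR arrived back at 100.330 s.
lsr = ntp_middle_32(100.000)
dlsr = ntp_middle_32(0.250)
arrival = ntp_middle_32(100.330)
rtt = rtt_from_receiver_report(arrival, lsr, dlsr)
print(f"RTT is roughly {rtt * 1000:.1f} ms")
```

Because the receiver's holding time is subtracted out, the two clocks do not need to be synchronized: only the sender's clock is used for both the send and arrival timestamps.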

At the Shengwang (Agora) RTE 2020 conference, I was fortunate to follow the online sessions and learned a great deal. The optimizations described for real-time audio and video transmission, and the results Shengwang achieved with them, left a deep impression on me. Below I summarize the slides I watched at the time and the targeted optimizations we made to our own real-time audio and video afterward.

Data-driven

Zhang Xinggong of the Wangxuan Institute of Computer Technology at Peking University presented the session "Data-driven Real-time Video Transmission Technology". With the Internet booming, real-time video is everywhere: video conferencing, live streaming, VR/AR, 360° panoramic video, audio and video monitoring, audio and video calls, and more.

Real-time video transmission nevertheless faces many challenges: bandwidth limits, delay across different networks, large jitter, and heavy packet loss during network switching or on 4G. The result is low video quality, with freezes, mosaic artifacts, black screens, and green screens that directly hurt the user experience. TCP solves some of these problems, but it is not sensitive enough to changing network conditions and it adds some delay, which is a poor fit for real-time transmission. WebRTC addresses most of the problems with controllers based on packet loss and on delay, which relieve them considerably, and reinforcement learning has since been introduced to improve matters further. Later, the BBR model offered a good transmission solution aimed at low latency and high bandwidth. It is still built on an RTT model, though, has no fairness criterion, and does not adapt especially well. BBR also relies on probing, so its view of the network lags behind the actual state.
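
To make the idea of a loss-based controller concrete, here is a minimal sketch using the thresholds described in the Google Congestion Control (GCC) Internet-Draft: back off when loss exceeds 10%, probe upward when loss is under 2%, and hold in between. This is an illustration only, not the code shipped in WebRTC, and the function name is an assumption for this example.

```python
def loss_based_rate(current_bps: float, loss_fraction: float) -> float:
    """Return the next send-rate estimate from the reported packet loss.

    loss_fraction is the fraction of packets lost (0.0 to 1.0) in the
    most recent feedback interval.
    """
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must be between 0 and 1")
    if loss_fraction > 0.10:
        # Heavy loss: reduce the rate in proportion to the loss.
        return current_bps * (1.0 - 0.5 * loss_fraction)
    if loss_fraction < 0.02:
        # Negligible loss: probe for more bandwidth.
        return current_bps * 1.05
    # Moderate loss: keep the current rate.
    return current_bps


rate = 1_000_000.0
for loss in (0.00, 0.01, 0.05, 0.20):
    rate = loss_based_rate(rate, loss)
    print(f"loss={loss:.0%} -> {rate / 1000:.0f} kbps")
```

A controller like this only reacts after loss has already happened, which is one reason delay-based and data-driven estimators are layered on top of it.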

Reference model

The congestion control (CC) scheme from Zhang's team, which combines a mathematical model with a statistical model, is a useful idea. It pairs a mathematical model that takes fairness as its objective function with a model-free statistical model of the network state.

[Figure from the talk: combined mathematical and statistical model]

The main goal of the model is to address two unknowns and one lag: the user is unknown, the network state is unknown, and feedback about the network state arrives late.

Optimization and improvement

After learning and drawing on relevant experience, we optimized our products, mainly including the following parts:

The first step was to improve the test environment. Most of our products use wired connections, some of them fiber, so the network they normally see is fairly stable. We therefore added support for Traffic Control (TC) commands to our Android and Linux products so that weak-network conditions can be simulated at the sending end. TC can emulate packet loss, jitter, delay, bandwidth limits, and more, which brings the lab setup as close as possible to a real network and makes lab simulation and testing more accurate. To make future weak-network testing more thorough and convenient, we also built a test app on top of the TC commands. Any combination of settings can be applied from the app, so testers with no development experience can run and verify tests on their own.
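
The kind of command such a test app might issue can be sketched as follows. The example only builds `tc` command lines for the Linux `netem` queueing discipline and prints them; it does not run them. The interface name `eth0`, the helper names, and the sample values are assumptions for illustration and do not reflect the actual test app.

```python
from typing import List, Optional


def netem_command(iface: str,
                  loss_pct: Optional[float] = None,
                  delay_ms: Optional[int] = None,
                  jitter_ms: Optional[int] = None,
                  rate_kbit: Optional[int] = None) -> List[str]:
    """Build a `tc qdisc replace ... netem` command for one interface."""
    if jitter_ms is not None and delay_ms is None:
        raise ValueError("netem needs a base delay before a jitter value")
    cmd = ["tc", "qdisc", "replace", "dev", iface, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]
        if jitter_ms is not None:
            cmd.append(f"{jitter_ms}ms")  # variation around the base delay
    if loss_pct is not None:
        cmd += ["loss", f"{loss_pct}%"]
    if rate_kbit is not None:
        cmd += ["rate", f"{rate_kbit}kbit"]
    return cmd


def clear_command(iface: str) -> List[str]:
    """Build the command that removes the emulation again."""
    return ["tc", "qdisc", "del", "dev", iface, "root"]


# Hypothetical weak-network profile: 20% loss, 100 ms delay with 30 ms jitter, 1 Mbit/s.
print(" ".join(netem_command("eth0", loss_pct=20, delay_ms=100,
                             jitter_ms=30, rate_kbit=1000)))
print(" ".join(clear_command("eth0")))
```

Running the printed commands requires root privileges on the device, and `netem` shapes outgoing traffic only, which matches simulating the network at the sending end.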

The second step was to improve the weak-network countermeasures themselves. Our products still run a fairly old WebRTC version that no longer compares with the latest release. For the sake of stability we can only optimize individual functions step by step and ship them after stress testing, which makes maintenance harder for the software engineers. We studied the latest BBR model and the data-driven network model proposed by Zhang's team to make network detection more accurate. We also enabled FEC and NACK to work at the same time, and we changed the decision conditions of some processing paths in JTB to improve processing efficiency. Before the change, in VGA mode with TC set to a 20% packet loss rate, there was no picture at all. After the change, 720P plays smoothly with TC set to 30% packet loss. The algorithm went live after more than a month of stress testing by the Quality Department, and the clear improvement has been well received by users. In a head-to-head comparison with competitors, the better weak-network video quality of our products won us the customer's business, and we signed a long-term memorandum.
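
One way FEC and NACK can cooperate is sketched below with a deliberately simple scheme: a single XOR parity packet protects a group of equal-length media packets, a single loss is repaired locally, and anything FEC cannot repair is put on a NACK list for retransmission. This is a toy model under those assumptions, not the FEC format used by WebRTC or by our products, and all names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def xor_parity(packets: List[bytes]) -> bytes:
    """XOR equal-length packets together into one parity packet."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, byte in enumerate(pkt):
            parity[i] ^= byte
    return bytes(parity)


def recover_group(received: Dict[int, bytes],
                  seqs: List[int],
                  parity: Optional[bytes]) -> Tuple[Dict[int, bytes], List[int]]:
    """Return (packets after FEC repair, sequence numbers still to NACK)."""
    missing = [s for s in seqs if s not in received]
    packets = dict(received)
    if len(missing) == 1 and parity is not None:
        # XOR of the parity with every surviving packet yields the lost one.
        survivors = [received[s] for s in seqs if s in received]
        packets[missing[0]] = xor_parity([parity] + survivors)
        missing = []
    return packets, missing


group = {10: b"AAAA", 11: b"BBBB", 12: b"CCCC"}
seqs = sorted(group)
parity = xor_parity([group[s] for s in seqs])

# One packet lost: FEC repairs it, so no retransmission round trip is needed.
repaired, nack = recover_group({10: group[10], 12: group[12]}, seqs, parity)
print(repaired[11], nack)

# Two packets lost: single-parity FEC cannot help, so both are NACKed.
_, nack_two = recover_group({10: group[10]}, seqs, parity)
print(nack_two)
```

The design trade-off is that FEC spends extra bandwidth up front to avoid a round trip, while NACK spends a round trip only when a loss actually occurs, so combining them covers both light and bursty loss.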

The third step was to tune the H264 encoding and decoding parameters. Different H264 encoding parameters have a large effect on the encoded bitrate. With support from the hardware vendor, and by optimizing our software encoding path, we adjusted and optimized a number of codec parameters. These include choosing between CABAC and CAVLC (the vendor's interface already exposed this option, but the engineers who originally designed and developed the product never used it), investigating and modifying the rate control parameters, introducing and tuning the IDR and Intra-Refresh parameters, and, as a next step, integrating the vendor's LRT and SRT interfaces (adaptation of long and short reference frames). With the codec parameters fine-tuned appropriately, the encoded bitrate can be kept near its optimum without hurting video quality or the user's subjective experience. That greatly reduces the pressure on poor networks, saves bitrate at the source, improves encoding quality, and helps protect the user experience.

Problems and goals

The three parts above are the optimizations and improvements we have been working on recently. They handle packet loss and delay reasonably well, but when jitter is very severe there is little we can do. The Wi-Fi module in our products has limited capability (a cost decision), so data sent over Wi-Fi sees very severe jitter and a certain amount of packet loss. This hardware limitation directly undermines the weak-network countermeasures we developed. Besides switching to a more stable and reliable Wi-Fi module, handling high jitter is the next challenge our weak-network team will face.
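
Before tackling jitter it helps to measure it consistently. The sketch below uses the interarrival jitter estimator defined in RFC 3550, which smooths the change in transit time between consecutive packets with a gain of 1/16. It works in milliseconds for readability (RTP itself uses timestamp units), and the sample timings are made up for illustration.

```python
from typing import Sequence


def interarrival_jitter(send_ms: Sequence[float], recv_ms: Sequence[float]) -> float:
    """Return the RFC 3550 smoothed interarrival jitter in milliseconds."""
    if len(send_ms) != len(recv_ms):
        raise ValueError("send_ms and recv_ms must have the same length")
    jitter = 0.0
    for i in range(1, len(send_ms)):
        # Difference in spacing at the receiver versus at the sender.
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


send = [0, 20, 40, 60, 80]            # one packet every 20 ms
stable = [50, 70, 90, 110, 130]       # constant 50 ms transit time
unstable = [50, 95, 92, 140, 131]     # hypothetical Wi-Fi arrival times

print(interarrival_jitter(send, stable))
print(interarrival_jitter(send, unstable))
```

A constant transit time produces zero jitter no matter how large the delay is, which is why delay and jitter need separate countermeasures.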

In the next R&D cycle we will keep studying and carefully reviewing the material to understand the data-driven models proposed by Zhang's team. We will try to combine them with our own devices and usage scenarios to build our own data model for detecting network conditions, together with congestion control and a network-sensitive feedback system. The aim is more reliable, higher-quality video transmission over Wi-Fi and 4G links, which in turn gives strong support to the rollout of the company's products.

Concluding remarks

The road ahead is long, and I will search high and low. Countering weak networks in real-time audio and video transmission is a long-term effort. We will keep trying and keep learning. We cannot promise to lead the industry, but delivering excellent real-time video call quality is our team's goal, and what we have today is not yet enough.

That is all I have to share. Feel free to get in touch and discuss at any time. If you found this useful, please like, bookmark, and share.
