Introduction

Since the birth of the Internet, impairments such as limited throughput, packet loss, transmission delay, and delay jitter, whether deliberately imposed or accidental, have accompanied its development. Today's Internet is a massive, crowded, and complex system in which different services and applications share the same infrastructure and compete to send and receive traffic. In traditional scenarios such as web browsing, file transfer, or on-demand audio and video, users tolerate delays of seconds or more and do not even notice network jitter, so TCP with a traditional congestion control algorithm such as CUBIC or Reno does the job well. These algorithms tend to fill the network until packets are lost and then back off sharply, which makes the level of congestion swing drastically.

Real-time scenarios, typified by real-time audio and video transmission, care more about low latency at acceptable audio and video quality. That requires a congestion control algorithm that does not always fill the network but instead adjusts the encoding strategy and sends real-time audio and video at a bit rate that delivers a better subjective experience at lower latency. Traditional TCP cannot meet this demand. TCP congestion control based on BBR does exist, but customizing it for a scenario or coupling it with the media layer means modifying protocol stack code, which hinders releasing new features and iterating quickly. As a result, vendors with deep experience in the real-time networking industry almost all combine UDP with self-developed weak-network countermeasure algorithms. Real-time networking products rely on accurate bandwidth estimation, efficient packet loss countermeasures, and sensible forward error correction to improve end-to-end communication quality and reduce end-to-end delay. Reliable and efficient evaluation tools are therefore crucial to improving a product's weak-network resilience.

The complexity of the actual network

The complexity of real-world networks goes well beyond the factors listed below:

  • Competition for bandwidth among multiple traffic flows
  • Different sender and receiver types (performance), different NIC hardware and drivers
  • Different routing paths
  • Different paths made up of different sequences of nodes
  • Differences in hardware performance and software algorithms between nodes
  • Physical distance caused by geographic separation
  • If the data traverses a wireless or mobile network, the effects of RF signals start to be noticeable
  • Signal strength, signal-to-noise ratio, physical occlusion
  • Radio protocol type, FEC or retransmission mechanism of the underlying protocol
  • Multi-device competition, co-channel interference and hidden nodes

Capabilities and Limitations of Network Impairment Tools

In a real network, a data flow may traverse many kinds of network nodes, devices, and protocols, and each device differs in forwarding capacity and forwarding delay:

[Image]

To construct and model the real network, we abstract from it a set of parameters that describe its degree of impairment. A network link abstracted as a black box:

[Image]

A network impairment emulator then provides a stable, controllable weak-network environment:

[Image]

Expectations for network impairment tools

Based on probing and abstracting real networks, we expect a network impairment tool to offer the following capabilities:

  • Bandwidth limiting
  • Queue depth
  • Burst traffic
  • Packet loss
  • Rich packet loss models
  • Fixed delay
  • Packet accumulation (stacking) and burst release
  • Control of the accumulation rate and the burst rate
  • Responsiveness to high-frequency, high-precision changes to emulation parameters
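
To make the first three items concrete, here is a minimal Python sketch of how a bandwidth limit, a burst allowance, and a queue depth interact in a token-bucket limiter. It is an illustrative toy model, not the implementation of any tool discussed in this article; the class name `TokenBucketLink` and all parameter values are hypothetical.

```python
from collections import deque


class TokenBucketLink:
    """Toy rate limiter with a burst allowance and a finite FIFO queue."""

    def __init__(self, rate_bps, burst_bytes, queue_limit):
        self.rate_bps = rate_bps          # sustained bandwidth limit, bits per second
        self.burst_bytes = burst_bytes    # bucket size: bytes allowed back to back
        self.queue_limit = queue_limit    # queue depth in packets
        self.tokens = float(burst_bytes)  # the bucket starts full
        self.last_time = 0.0
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, size_bytes):
        """Offer a packet; return False if the queue is full (tail drop)."""
        if len(self.queue) >= self.queue_limit:
            self.dropped += 1
            return False
        self.queue.append(size_bytes)
        return True

    def dequeue(self, now):
        """Refill tokens up to time `now`, then release what they can pay for."""
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bps / 8)
        sent = []
        while self.queue and self.queue[0] <= self.tokens:
            size = self.queue.popleft()
            self.tokens -= size
            sent.append(size)
        return sent


# Hypothetical example: 800 kbps limit, 3000-byte burst, queue depth of 5.
link = TokenBucketLink(rate_bps=800_000, burst_bytes=3000, queue_limit=5)
accepted = [link.enqueue(1000) for _ in range(10)]
print(accepted.count(True), "queued,", link.dropped, "dropped")
print("released at t=0:", link.dequeue(0.0))
```

In this example, offering ten 1000-byte packets at once drops five, releases three immediately through the burst allowance, and leaves the remaining two to drain at the configured rate. Changing only the burst size or the queue depth changes the outcome even though the bandwidth limit stays the same.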

For ease of use, we also expect the tool to have the following characteristics:

  • Good scalability
  • Good programmability
  • A good interactive experience

Problems that network impairment tools solve

Given a specified set of impairment parameters, a network impairment tool can emulate different network parameters accurately and add reliable impairments to a network link at set times and in set amounts, which makes the network state more controllable and reproducible.

Limitations of network impairment tools

What kind of impairment to configure, and with which parameters, is a question the tool itself cannot answer. Because networks are complex, a real network offers observers only limited visibility, and the industry's current methods for detecting network impairment, and their results, are incomplete. Impairment models differ from one another, and none is fully accurate.

Network impairment tools we have used

Hardware Products

  • S**

[Image]

This is a commercial network impairment emulator that supports modular construction of network topologies, with impairment parameters set separately for each module. Its software includes built-in emulation components for common communication equipment.

The device is characterized by highly precise emulation parameters and flexible topology construction. It also requires users to understand clearly the network topology being emulated and the nodes in it, which amounts to emulating a network link that is close to a white box.

For automation, it supports RESTful API calls.

  • H**

[Image]

This is also a commercial network impairment emulator. It abstracts the network into links in two directions and adds the specified impairments to them. Filters can steer selected traffic into different links to be impaired. Compared with the first device, this design is closer to the way Internet companies think.

When we tried this device, the product was still under development and iterating quickly, adding many features friendly to the Internet industry, such as support for more packet loss models, more types of jitter distribution, and a Python calling interface.

Based on our use, its UI interaction and system stability still have considerable room for improvement.

Software Products

[Image] [Image]

This is a network impairment tool developed by Tencent WeTest that currently supports iOS and Android. It proxies local traffic through a VPN to inject impairments, which suits the weak-network testing needs of web and ordinary app development. System performance constraints, however, reduce its emulation accuracy.

A distinguishing feature is its many built-in weak-network scenario models, such as elevators, high-speed rail, and subway stations. It also uses real-time delay and jitter measured by Tencent servers as a source of emulation data.

For automation, the Android version supports calls through adb; we have not yet used automation on iOS.

[Image]

This is open-source software from a small team and currently supports only Windows.

Its functionality is fairly basic: I could not even find a way to configure jitter, and there is no interface for automated calls.

Still, it fills a gap for low-cost weak-network debugging on a single Windows machine.

[Image] [Image]

This software is an Apple developer tool, with versions for macOS and iOS devices.

Its functionality is also fairly basic. Beyond bandwidth limiting, packet loss, and delay, I found no jitter-related settings. The tool does offer presets for a few simple network types, though their implementation is rudimentary.

The module is well maintained across versions of Apple's operating systems, so usability and stability are assured, and the UI is simple and clear.

Automation is possible through AppleScript, but control precision is low, which makes it unsuitable for high-frequency calls.

The well-known TC (Traffic Control) module is part of the Linux kernel and has no official UI, although all kinds of wrappers and third-party UIs circulate in the community.

Because the Linux kernel is open source, TC offers endless possibilities. The downsides are a relatively steep learning curve and considerable differences in feature implementation between kernel versions. For example, although the documentation says pfifo is supported, most versions do not support it, and in some versions even the jitter distribution does not behave as expected.

Combined with iptables, TC can provide the static impairment functions that most network impairment emulators offer. Dynamic impairment, however, is limited by the low timing precision of executing Linux commands; improving it requires calling the C interface directly or writing your own program.
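
As a sketch of what static impairment with TC can look like, the Python helper below assembles a `tc qdisc ... netem` command line from a few parameters. The function name `netem_command`, the device name `eth0`, and the parameter values are placeholders chosen for illustration. The function only builds the argument list and does not run it, and option behavior can vary between kernel versions.

```python
def netem_command(dev, delay_ms=None, jitter_ms=None, loss_pct=None,
                  rate_kbit=None, limit_pkts=None):
    """Build (but do not execute) a netem qdisc command as an argument list."""
    cmd = ["tc", "qdisc", "replace", "dev", dev, "root", "netem"]
    if limit_pkts is not None:
        cmd += ["limit", str(limit_pkts)]        # queue depth in packets
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]        # fixed delay
        if jitter_ms is not None:
            # jitter around the fixed delay, drawn from a normal distribution
            cmd += [f"{jitter_ms}ms", "distribution", "normal"]
    if loss_pct is not None:
        cmd += ["loss", f"{loss_pct}%"]          # independent random loss
    if rate_kbit is not None:
        cmd += ["rate", f"{rate_kbit}kbit"]      # bandwidth limit
    return cmd


# Placeholder values: 100 ms delay, 20 ms jitter, 5% loss, 1000 kbit/s, 100-packet queue.
cmd = netem_command("eth0", delay_ms=100, jitter_ms=20, loss_pct=5,
                    rate_kbit=1000, limit_pkts=100)
print(" ".join(cmd))
# tc qdisc replace dev eth0 root netem limit 100 delay 100ms 20ms distribution normal loss 5% rate 1000kbit
```

Applying the command requires root privileges, for example through `subprocess.run(cmd, check=True)`. Starting a new process for every parameter change is one plausible reason why command-driven dynamic impairment loses timing precision at high update rates.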

Problems to solve

In practice, the need for a network impairment environment comes mainly from two requirements:

  1. Probe and confirm the boundary of the product's weak-network resilience
  2. Improve the product's weak-network resilience on real networks

For the first requirement, common industry practice is to push one network metric, or a combination of several, to the limit until the product stalls or becomes unusable. Typical tests include gradually reducing the bandwidth limit from 1000 Kbps to 100 Kbps or 50 Kbps, increasing packet loss from 5% to 70% or 80%, and increasing delay (or delay jitter) from 10 ms to 2000 ms. This approach is certainly meaningful. In our experience with a variety of network impairment devices, however, their bandwidth limiting functions differ in default queue depth and in the amount of burst data allowed, and their delay jitter functions differ in default distribution and in whether reordering is allowed. So even with the same bandwidth limit or the same delay jitter, different tools are likely to produce different test results. For packet loss rate and fixed delay, by contrast, the default implementations are essentially the same across tools: random packet loss and a fixed delay in milliseconds.
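
A quick calculation shows why the default queue depth matters so much. The Python sketch below computes the worst-case queuing delay of a full FIFO queue behind a bandwidth limit. The queue depths of 10 and 1000 packets and the 1200-byte packet size are assumed values for illustration, not the defaults of any tool mentioned here.

```python
def max_queue_delay_ms(rate_kbps, queue_packets, packet_bytes=1200):
    """Worst-case queuing delay (ms) of a full FIFO queue drained at rate_kbps."""
    queued_bits = queue_packets * packet_bytes * 8
    # bits divided by kilobits-per-second yields milliseconds
    return queued_bits / rate_kbps


# Same 100 kbps bandwidth limit, two hypothetical default queue depths.
for depth in (10, 1000):
    print(depth, "packets:", max_queue_delay_ms(100, depth), "ms")
# 10 packets: 960.0 ms
# 1000 packets: 96000.0 ms
```

Under these assumptions, the same 100 kbps limit adds just under one second of delay with the shallow queue and more than a minute and a half with the deep one, so a product can pass on one tool and stall on another.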

For the second requirement, the difficulty lies in describing and modeling the real network. Some recently emerging network impairment software, such as QNET, has recognized this need and begun to integrate real network probing data and models, but the form factor and platform limitations of these products still pose many obstacles to large-scale deployment and use.

To address these two problems, Shengwang has been studying network impairment characteristics and steadily advancing the probing and modeling of real networks. Current probing data shows that real-network impairment has the following characteristics, which traditional weak-network test cases do not cover:

  1. Network bandwidth varies over time much more frequently than intuition suggests, and emulating it requires support for queue depth and burst traffic.
  2. Sustained high random packet loss over long periods is uncommon; most packet loss is strongly correlated with network congestion rather than random.
  3. A persistently low bandwidth cap does not occur unless the operator or a shared network imposes one deliberately.
  4. Out-of-order or duplicate packets are essentially absent, except during handover between cells or APs.
After comparing the strengths and weaknesses of the various network impairment tools, Shengwang chose to build its own solution on Linux Traffic Control. This approach is currently the most open and offers the most possibilities: the combination of TC commands, iptables, and in-house development covers everything from boundary testing of weak-network resilience to emulation of real networks, giving Shengwang products a solid way to verify weak-network countermeasure algorithms under more realistic network conditions.

Epilogue

For Internet companies, especially those that need to emulate last-mile networks, the hardware network impairment emulators currently on the market are not very friendly. Traditional hardware emulators grew out of the needs of operators and communication equipment manufacturers, and to this day they focus on test throughput and emulation accuracy while doing little to explore the diversity of network types and scenarios. Software solutions are also available; some of them draw on network probing results during development and ship with built-in templates for certain network scenarios and types, which suits Internet companies better. For products with modest weak-network requirements, emulating weak networks with such software is an efficient, low-cost option. For real-time networking products with higher requirements, provided the team can probe networks and customize impairment scenarios, the open-source Linux Traffic Control offers more autonomy and more possibilities.

Appendix table: Feature comparison of network impairment tools

[Image]

About the Dev for Dev column

Dev for Dev (Developer for Developer) is an interactive innovation and practice program for developers, jointly initiated by Agora and the RTC developer community. Through technology sharing, discussion, and jointly built projects, all from an engineer's perspective, it brings developers together, surfaces and delivers the most valuable technical content and projects, and gives technical creativity full play.

