
Introduction

Simple is beautiful. In the world of network protocols, TCP and UDP are the two most common transport protocols built on IP. The HTTP protocol we use so often is built on TCP and inherits TCP's reliability. UDP, by contrast, does not guarantee delivery, so it is used in specific scenarios such as live streaming, broadcast messages, audio and video streams, and other cases that do not need data integrity checks.

Compared with TCP, UDP is characterized by simplicity: it drops the mechanisms TCP uses to guarantee that messages arrive intact and in order. The payoff of simplicity is speed. Today I will introduce UDT, a high-speed data transfer protocol built on UDP.

UDT protocol

Because it is so simple, UDP can do things that TCP struggles with, such as moving large amounts of data quickly. This is not to say that one protocol is better than the other. Each suits different scenarios, and both are popular because each plays an important role in its own. To paraphrase a Chinese proverb: it doesn't matter whether the cat is black or white; a cat that catches mice is a good cat.

If we make good use of UDP, we can transfer large amounts of data quickly. UDT is a protocol that does exactly this.

As an aside, foundational protocols like these were all invented abroad, while China's Internet giants are busy chasing platform and traffic business. There is really nothing more to say...

The UDT project started in 2001. It was developed by Yunhong Gu during his Ph.D. studies at the National Center for Data Mining (NCDM) at the University of Illinois at Chicago, and he continued to maintain and improve it after graduation.

UDT emerged because faster and cheaper optical fiber networks were replacing the older copper cables and twisted pairs, greatly improving transmission capacity. People then found that TCP ran into serious problems when moving large volumes of data over these networks, and so the UDP-based UDT protocol appeared.

The first version of UDT, also known as SABUL (Simple Available Bandwidth Utility Library), supported bulk data transfer over private networks.

Note that SABUL used UDP to transmit data and a separate TCP connection to transmit control messages.

The initial version of UDT was developed and tested on ultra-high-speed networks (1 Gbit/s, 10 Gbit/s, etc.). In October 2003, NCDM achieved an average transfer rate of 6.8 Gbit/s from Chicago in the United States to Amsterdam in the Netherlands. In the 30-minute test, they transmitted approximately 1.4 TB of data.
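As a quick sanity check on those figures, the short calculation below multiplies the quoted rate by the test duration. It is only arithmetic on the two numbers from the text. Reading "TB" as binary units (TiB) is an assumption made here to reconcile the result with the quoted 1.4 TB.

```python
# Figures quoted in the text: 6.8 Gbit/s sustained for 30 minutes.
RATE_BITS_PER_SECOND = 6.8e9
DURATION_SECONDS = 30 * 60


def transferred_bytes(rate_bps, seconds):
    """Return the number of bytes moved at a constant bit rate."""
    return rate_bps * seconds / 8


total = transferred_bytes(RATE_BITS_PER_SECOND, DURATION_SECONDS)
print(f"{total / 1e12:.2f} TB (decimal units)")
print(f"{total / 2**40:.2f} TiB (binary units)")
```

The result is about 1.53 TB in decimal units, or about 1.39 TiB in binary units, which is in line with the "approximately 1.4 TB" reported.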

With version 2.0, released in 2004, SABUL was renamed UDT, which stands for UDP-based Data Transfer Protocol.

Why the new name? UDT 2.0 removed SABUL's TCP control connection and used UDP for both data and control information. UDT2 also introduced a new congestion control algorithm that let UDT flows run fairly alongside concurrent UDT and TCP flows.

In 2006, UDT was upgraded to version 3, which extended the protocol from private networks to the commercial Internet. Congestion control in UDT3 was tuned so that it could also run in low-bandwidth environments, and users could easily define and install their own congestion control algorithms. UDT3 also significantly reduced the use of system resources (CPU and memory).

In 2007, UDT4 improved performance under high concurrency and made firewall traversal easier. UDT4 allows multiple UDT connections to be bound to the same UDP port, and it supports rendezvous connection setup for UDP hole punching.

What is UDP hole punching?

UDP hole punching is a technique used with network address translation (NAT) to keep UDP packet flows passing through the NAT. It is a way of establishing a two-way UDP connection between Internet hosts that sit in private networks behind network address translators.
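The idea can be illustrated with a toy simulation. The sketch below is not a real NAT. It assumes that each peer already knows the other's public endpoint (for example, from a rendezvous server), and that each NAT admits inbound packets only from endpoints it has already sent to. The class name, function name and addresses are all made up for illustration.

```python
class FilteringNat:
    """Toy NAT: admits inbound packets only from endpoints it has sent to."""

    def __init__(self, public_endpoint):
        self.public_endpoint = public_endpoint
        self.allowed_remotes = set()

    def send(self, remote_endpoint):
        # An outbound packet "punches a hole" for replies from that endpoint.
        self.allowed_remotes.add(remote_endpoint)

    def accepts(self, remote_endpoint):
        return remote_endpoint in self.allowed_remotes


def deliver(sender, receiver):
    """Send one packet through both NATs; return True if it gets in."""
    sender.send(receiver.public_endpoint)
    return receiver.accepts(sender.public_endpoint)


nat_a = FilteringNat(("198.51.100.1", 40000))
nat_b = FilteringNat(("203.0.113.9", 50000))

first = deliver(nat_a, nat_b)   # dropped by B's NAT, but opens A's hole
second = deliver(nat_b, nat_a)  # admitted: A has already sent to B
third = deliver(nat_a, nat_b)   # admitted: B has now sent to A
print(first, second, third)
```

The first packet is lost, but it is what opens the hole. Once both sides have sent at least one packet, traffic flows in both directions.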

What is NAT?

Everyone knows that IPv4 addresses are limited and are running out, so how do we solve this problem?

The permanent solution is, of course, IPv6, but even though IPv6 was introduced many years ago, it still does not seem to have become truly widespread.

Is there a solution that does not require IPv6?

There is: NAT (Network Address Translation).

The principle of NAT is to map each IP address and port on the LAN to the public IP address and a port of the NAT device.

The NAT maintains a translation table internally. In this way, many hosts on the LAN can share a single NAT IP address and be told apart by port.
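A translation table of this kind can be sketched in a few lines. The class below is a simplified illustration, not how any particular NAT device is implemented. The name `NatTable`, the starting port and the addresses are assumptions chosen for the example.

```python
class NatTable:
    """Toy NAT translation table mapping private endpoints to public ports."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Return the public (ip, port) for a LAN endpoint, allocating if new."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_inbound(self, public_port):
        """Return the LAN endpoint for a public port, or None if unmapped."""
        return self.inbound.get(public_port)


nat = NatTable("203.0.113.5")
print(nat.translate_outbound("192.168.1.10", 5000))
print(nat.translate_outbound("192.168.1.11", 5000))
print(nat.translate_inbound(40001))
print(nat.translate_inbound(40002))
```

Two LAN hosts using the same private port end up on different public ports of the same public address. An inbound packet for a port with no table entry has nowhere to go.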

So what's wrong with NAT?

The problem with NAT is that internal clients do not know their own external IP address, only the internal IP address.

With UDP, which is connectionless, the NAT has no handshake to track. It must rewrite the source port in every UDP datagram and the source IP address in the enclosing IP packet.

If the client tells the server its own IP address inside the application data and expects the server to connect back to it, the connection cannot be established, because that address is the private one and the client's public IP is unknown.

Even if the public IP is known, any packet that reaches the external IP of the NAT device must carry a destination port, and the NAT translation table must contain an entry that maps that port to the IP address and port of an internal host. Otherwise, the connection fails, as shown in the figure below.

How to solve it?

The first approach is to use a STUN (Session Traversal Utilities for NAT) server.

A STUN server is a server with a known IP address. Before communicating, the client first asks the STUN server for its own public IP address and port, and then uses that public IP address and port for communication.
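To make the query concrete, here is a minimal sketch of the STUN wire format from RFC 5389: building a Binding request and decoding the XOR-MAPPED-ADDRESS attribute from a response. It handles IPv4 only and does no network I/O. The function names are illustrative and not taken from any library.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442       # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN Binding request with no attributes."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(message):
    """Return (ip, port) from a Binding success response, or None."""
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and attr_len >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            # Port and address are XOR-ed with the magic cookie on the wire.
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        # Attributes are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    return None
```

In practice the client would send the request over UDP to a STUN server it trusts, and then pass the reply to `parse_xor_mapped_address` to learn the public endpoint that the NAT assigned to it.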

Sometimes, however, UDP packets are blocked by firewalls or other applications. In that case, Traversal Using Relays around NAT (TURN) can be used.

Both parties send their data to a relay server, and the relay server forwards it. Note that this is no longer P2P.

Finally, there is an umbrella protocol called ICE (Interactive Connectivity Establishment):

ICE combines direct connection, STUN and TURN. When a direct connection is possible, it connects directly. If not, it uses STUN, and if STUN does not work either, it falls back to TURN.
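ICE expresses this "direct first, relay last" preference through candidate priorities. The sketch below uses the priority formula and the recommended type preferences from RFC 8445 (host 126, server reflexive 100, relayed 0). The function names are illustrative, and a real ICE agent does much more, such as pairing candidates and running connectivity checks.

```python
# Recommended type preferences from RFC 8445.
TYPE_PREFERENCE = {
    "host": 126,   # direct address on a local interface
    "srflx": 100,  # server reflexive: public address learned via STUN
    "relay": 0,    # relayed address allocated on a TURN server
}


def candidate_priority(kind, local_preference=65535, component_id=1):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return (TYPE_PREFERENCE[kind] << 24) + (local_preference << 8) + (256 - component_id)


def order_candidates(kinds):
    """Sort candidate types from most preferred to least preferred."""
    return sorted(kinds, key=candidate_priority, reverse=True)


print(order_candidates(["relay", "host", "srflx"]))
```

Sorting by priority puts the direct candidate first, the STUN-derived one second and the TURN relay last, which matches the fallback order described above.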

When using STUN and ICE, hosts rely on the NAT to establish port mappings and to keep state for each UDP port. That UDP state usually expires after a short idle period, from tens of seconds to a few minutes. To keep the mapping created by UDP hole punching alive, hosts send keep-alive packets at regular intervals, which refreshes the UDP state in the NAT.
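A keep-alive sender can be as small as the sketch below. The one-byte payload, the function name and the interval are assumptions for illustration. Real protocols define their own keep-alive messages, and the interval has to be shorter than the NAT's idle timeout.

```python
import socket
import time


def send_keepalives(sock, peer, count, interval, payload=b"\x00"):
    """Send `count` small datagrams to `peer`, pausing `interval` seconds each."""
    for _ in range(count):
        sock.sendto(payload, peer)
        time.sleep(interval)


if __name__ == "__main__":
    # Loopback demo: no NAT is involved, this only shows the send loop.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_keepalives(sender, receiver.getsockname(), count=3, interval=0.1)
    sender.close()
    receiver.close()
```

In a real application this loop would typically run in a background thread or timer for as long as the connection is open.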

Disadvantages of UDT

UDT is built on UDP, and UDP, being so minimal, offers no security features. UDT therefore lacks them as well, which limits its use in commercial environments.

However, a new version of UDT is already under development, so you can look forward to it.

Summary

UDT is widely used in high-performance computing, for example for high-speed data transfer over optical fiber networks. A later article will show how to use the UDT protocol in Netty.

This article has been included in http://www.flydean.com/11-udt/



flydean

You are welcome to visit my personal website: www.flydean.com