Small Steps to a Thousand Miles: A Summary of the QUIC Protocol Rollout at Ant Group

Author: Kong Lingtao

Since 2015, the QUIC protocol has been going through standardization at the IETF and has been implemented by major vendors in China and abroad. QUIC offers advantages such as "0-RTT connection establishment" and "connection migration", and it is set to become the next-generation Internet protocol as the underlying transport of HTTP3.0. For these reasons, the Ant Group Alipay client team and access gateway team began deploying QUIC in the second half of 2018, in scenarios such as mobile payment and overseas acceleration.

This article is an overview of how QUIC has been deployed at Ant. We chose an overview because the QUIC protocol is very complex: measured against existing protocols, QUIC is roughly HTTP + TLS + TCP, so a single article cannot cover every detail. We therefore present the key points of the deployment in overview form, in the following parts:

QUIC background: a brief but complete introduction to the background knowledge needed for QUIC;
Solution selection and design: a detailed look at how Ant's deployment took an unconventional path to gracefully support QUIC features such as connection migration;
Deployment scenarios: the two scenarios in which Ant has deployed QUIC, namely the Alipay client link and the overseas acceleration link;
Key technologies: the core problems we had to solve while deploying QUIC and the solutions we adopted, including "supporting connection migration", "raising the 0-RTT ratio", "supporting UDP lossless upgrade", and "client intelligent routing";
Key technology patents.

QUIC background introduction

Because readers come from different backgrounds, we briefly introduce QUIC before getting into the main content. If you are interested in more of the protocol's design details, see the related drafts: https://datatracker.ietf.org/wg/quic/documents/

1. What is QUIC?

Simply put, QUIC (Quick UDP Internet Connections) is a secure, reliable transport protocol encapsulated in UDP. Its goal is to replace TCP, with TLS built in, as the standard secure transport protocol. The following figure shows the position of QUIC in the protocol stack. HTTP carried over QUIC is being standardized as HTTP3.0.

2. Why QUIC?

Before QUIC appeared, TCP carried more than 90% of Internet traffic and seemed to have no problems. Why, then, did a disruptor like QUIC appear? The main reason is that TCP, after decades of development, faces the "protocol ossification" problem, which shows up in several ways:

Ossification in network devices: some firewalls and NAT devices treat new TCP features, such as newly added TCP options, as an attack and drop the packets, so new features do not work across old network equipment.
Ossification in operating systems: because upgrading the network stack of an operating system is difficult, some TCP features cannot evolve quickly.
In addition, now that the upper-layer protocols have been optimized up to TLS1.3 and HTTP2.0, optimizing the transport layer is also on the agenda. QUIC builds on the lessons of TCP, keeping the essence and discarding the dross, and has the following core advantages:

3. A brief history of the development of the QUIC ecosystem

The following figure shows some of the more important milestones from the creation of QUIC to the present. In 2021, QUIC v1 is set to become an RFC, ending the period in which many divergent versions coexisted.

With the background covered, we now turn to Ant's deployment as a whole. For ease of explanation, we summarize it as QUIC's "one, two, three, four": "one deployment framework", "two deployment scenarios", "three patented innovations", and "four key technologies".

One deployment framework

Ant's access gateway is built on multi-process NGINX (internally called Spanner, as in the wrench, used for protocol offloading), and UDP poses many challenges in a multi-process programming model, lossless upgrade among them. To design a complete framework, we considered ease of server deployment on the cloud, scalability, and performance before deployment, and designed the following framework to support the different deployment scenarios:

The framework contains two components:

QUIC LB component: built on the NGINX layer-4 UDP Stream module, it routes packets based on the server information carried in the QUIC DCID in order to support connection migration;
NGINX QUIC server: implemented as the newly developed NGINX_QUIC_MODULE, in which each worker listens on two types of ports:
BASE PORT: the same port number for every worker, listened on with reuseport and exposed to QUIC LB to receive the client's packets in the first RTT. The DCID of these packets is generated by the client and carries no routing information;
Working PORT: a different port number for each worker, the real working port that receives QUIC packets after the first RTT. The DCID of these packets is generated by the server process and carries server information.

The framework currently supports the following capabilities:

Full support for QUIC connection migration, including CID updates during migration, entirely in user space and without modifying the kernel;
Support for QUIC lossless upgrade and other operations tasks, entirely in user space and without modifying the kernel;
Support for true 0-RTT, with a higher 0-RTT ratio.

We explain later how these capabilities are achieved.

Two deployment scenarios

Our two deployment scenarios, from near to far, are as follows:

Scenario 1: Deployment on the Alipay mobile client

The following is a schematic diagram of the deployment architecture. The Alipay mobile client carries HTTP requests over QUIC and forwards them through layer-4 gateways such as QUIC LB to Spanner (the layer-7 gateway that Ant developed on NGINX). On Spanner, we proxy each QUIC request into a TCP request and send it to the service gateway (RS).

The specific choices are as follows:

The supported QUIC version is gQUIC Q46;
NGINX QUIC MODULE handles QUIC access and proxies QUIC to TCP;
All RPC requests are supported, including mobile payment, funds, and Ant Forest;
There are currently two ways to select the QUIC link:
- Backup mode: when the TCP link cannot be used, the client falls back to the QUIC link;
- Smart mode: TCP and QUIC race each other, and when TCP performs worse than QUIC, the next request proactively uses the QUIC link.

In this scenario, the benefits of using QUIC include:

When the client's connection migrates, the link can continue to serve;
When the client opens a connection for the first time, it saves the time of the TCP three-way handshake;
Under weak network conditions, QUIC's transmission control can improve transmission performance.

Scenario 2: Deployment for overseas acceleration

Since 2018, Ant Group has been developing its own overseas dynamic acceleration platform, AGNA (Ant Global Network Accelerator), to replace the acceleration services of third-party vendors. AGNA deploys overseas access points, Local Proxy (LP), and domestic access points, Remote Proxy (RP), so that users' overseas requests travel back to the domestic origin over the accelerated link between LP and RP. As shown in the figure below, we deploy QUIC on the link between LP and RP.

At the overseas access point (LP), each TCP connection is proxied as a Stream on QUIC. At the domestic access point (RP), each QUIC Stream is proxied back into a TCP connection. LP and RP communicate over long-lived QUIC connections.
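The one-Stream-per-TCP-connection mapping can be illustrated with a toy multiplexer in Python. This is only a sketch of the idea: the class name StreamMux, the 6-byte frame header, and the sequential stream IDs are all assumptions made for illustration, and none of them reflects the QUIC wire format or AGNA's actual implementation.

```python
import struct

# Assumed toy framing: 4-byte stream id + 2-byte payload length.
HEADER = struct.Struct("!IH")


class StreamMux:
    """LP side: give each TCP connection its own stream on one long connection."""

    def __init__(self):
        self._next_stream_id = 0
        self._stream_of = {}  # TCP connection id -> stream id

    def stream_for(self, tcp_conn_id):
        # A new TCP connection opens a new stream; no new handshake is needed.
        if tcp_conn_id not in self._stream_of:
            self._stream_of[tcp_conn_id] = self._next_stream_id
            self._next_stream_id += 1
        return self._stream_of[tcp_conn_id]

    def encode(self, tcp_conn_id, payload):
        stream_id = self.stream_for(tcp_conn_id)
        return HEADER.pack(stream_id, len(payload)) + payload


def demux(buffer):
    """RP side: split the byte stream back into one payload per stream."""
    streams = {}
    offset = 0
    while offset < len(buffer):
        stream_id, length = HEADER.unpack_from(buffer, offset)
        offset += HEADER.size
        chunk = buffer[offset:offset + length]
        streams.setdefault(stream_id, bytearray()).extend(chunk)
        offset += length
    return {sid: bytes(data) for sid, data in streams.items()}
```

On the RP side, each key returned by demux would correspond to one outgoing TCP connection toward the origin.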

In this scenario, the benefits of using QUIC include:

Carrying TCP requests as Streams on a long-lived QUIC connection avoids establishing a new cross-sea connection for each request;
On cross-sea networks, QUIC's transmission control can improve transmission performance.

Three key patents

So far, we have filed patent applications to protect some of the innovative techniques from this deployment, and we are actively participating in standardization at the IETF to share our research results, including:

Patent One

In Scenario 2, we use QUIC Streams for layer-4 proxying to accelerate overseas back-to-origin traffic. We applied for patent protection on this method under the title "a link acceleration method based on QUIC protocol proxying". This patent has now been granted US patent authorization; patent number: CN110213241A.

Patent Two

We applied for a patent on the QUIC LB component of our deployment framework under the title "a stateless, consistent, and distributed QUIC load balancing device". This application is still pending. Because QUIC LB handles connection migration in the QUIC protocol well, the IETF QUIC WG now has a draft on QUIC LB. We have taken part in the discussion and drafting of that draft, and we plan to bring the related solutions to products on the cloud.

Patent Three

We applied for a patent on our UDP lossless upgrade method under the title "a lossless upgrade scheme for QUIC servers". This application is still pending. UDP lossless upgrade is a known hard problem in the industry; some existing methods need an extra forwarding hop in user space, which costs considerable performance. Our scheme solves the problem within our deployment framework. The details are covered under the key technologies.

Four key technologies

Throughout the deployment, our design centered on solving several core problems, which resulted in four key technologies:

Technical point 1: Graceful support for connection migration

First, the problems that connection migration faces. As mentioned above, one of QUIC's more important features is support for connection migration. Connection migration here means that if the client changes networks while a long-lived connection is held, for example switching from 4G to Wi-Fi, or its five-tuple changes because of NAT rebinding, QUIC can keep the connection state on the new five-tuple. One reason QUIC can do this is that it runs over connectionless UDP. Another important reason is that QUIC identifies a connection by a unique CID rather than by the five-tuple.

The figure below is a schematic diagram of QUIC's support for connection migration. When the client's egress address switches from A to B, the CID stays the same, so the QUIC server can still look up the corresponding session state.

The theory is appealing, but deploying it is hard. In an end-to-end deployment, once load-balancing devices are introduced, every mechanism that forwards by a hash of the five-tuple or associates a session by the five-tuple fails during connection migration. Take LVS as an example: after the connection migrates, LVS still addresses by the five-tuple, so it may select a different server. Even if LVS selects the right server, when the packet arrives there the kernel chooses the process by the five-tuple, so the packet can still reach the wrong process. In addition, the IETF draft requires the CID to be updated when a connection migrates, so a scheme that forwards solely on the CID does not work either.

Now for our solution. To solve this problem, we designed the deployment framework introduced at the beginning. Here we simplify and abstract it; the overall idea is shown in the following figure:

1. On the layer-4 load balancer, we designed the QUIC LoadBalancer mechanism:

We extended the QUIC CID with additional fields (ServerInfo) that carry the IP and Working Port of the QUIC server;
When a connection migrates, QUIC LoadBalancer routes by the ServerInfo in the CID, avoiding the problems caused by associating sessions with the five-tuple;
When the CID needs to be updated, the ServerInfo in the NewCID stays the same, avoiding the addressing inconsistency that would arise if the backend were chosen only by a hash of the CID;

2. In the multi-process mode of the QUIC server, we broke with NGINX's inherent model of multiple workers listening on the same port and designed a multi-port listening mechanism. Each worker is isolated on its own working port, and the port information is carried in the CID of the packet sent in reply to the first Initial packet. The advantages of this approach are:

Whether or not the connection has migrated, QUIC LB can forward the packet to the correct process according to ServerInfo;
The common industry solution is to modify the kernel and replace the reuseport mechanism with a "reuse CID" mechanism, in which the kernel selects the process by CID. Even though this can be done with tools such as eBPF, we believe that modifying kernel behavior depends too heavily on the lower layers, which hinders large-scale deployment and operation of the solution, especially on public clouds;
Using independent ports also helps solve the UDP lossless upgrade problem in multi-process mode, which we cover in technical point 3.
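The routing idea in this technical point can be sketched in Python. The CID layout used here (a 4-byte IPv4 address and a 2-byte working port, followed by random bytes) is an assumption for illustration only; the article does not describe the actual ServerInfo encoding, and a production encoding would presumably obfuscate or encrypt these bytes. The function names and the is_first_rtt flag are likewise hypothetical.

```python
import os
import socket
import struct

# Assumed ServerInfo layout: 4-byte IPv4 address + 2-byte working port.
SERVER_INFO_LEN = 6


def new_cid(server_ip, working_port, random_len=6):
    """Build a server-generated CID whose prefix carries ServerInfo."""
    server_info = socket.inet_aton(server_ip) + struct.pack("!H", working_port)
    return server_info + os.urandom(random_len)


def rotate_cid(old_cid):
    """Issue a NewCID: the random part changes, ServerInfo stays the same."""
    random_len = len(old_cid) - SERVER_INFO_LEN
    return old_cid[:SERVER_INFO_LEN] + os.urandom(random_len)


def route(dcid, is_first_rtt, base_backend):
    """Pick the (ip, port) backend the way a QUIC LB might.

    First-RTT packets carry a client-generated DCID with no routing
    information, so they go to the shared base port. Later packets are
    routed from the ServerInfo in the DCID alone, never from the
    five-tuple, so a change of client address does not affect the result.
    How the LB recognizes a first-RTT packet is outside this sketch.
    """
    if is_first_rtt:
        return base_backend
    ip = socket.inet_ntoa(dcid[:4])
    (port,) = struct.unpack("!H", dcid[4:SERVER_INFO_LEN])
    return (ip, port)
```

Because route never looks at the source address, a packet with the same or a rotated CID reaches the same worker after the client moves from one network to another.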

Technical point 2: Increase the ratio of 0-RTT handshakes

First, the principle of QUIC 0-RTT. As introduced earlier, QUIC allows the transport-layer handshake and the security-layer handshake to complete together in 0-RTT. TLS1.3 itself supports 0-RTT for the security-layer handshake, so that part is not surprising. How, then, does QUIC make the transport-layer handshake 0-RTT? Consider first the purpose of the transport-layer handshake: the server verifies that the client really intends to connect and that its address is not spoofed, which prevents attacks using forged source addresses. In TCP, the server relies on the final ACK of the three-way handshake to verify that the client is real, because only the real client receives the server's SYN-ACK and replies to it.

QUIC also needs to verify the source address of a handshake; otherwise it inherits UDP's exposure to DDoS attacks. How does QUIC do this? It relies on the STK (Source Address Token) mechanism. Note first that, as with TLS, a QUIC 0-RTT handshake builds on an earlier connection to the same server, so a true first connection still needs one RTT to obtain the STK. The principle, shown in the figure below, is as follows:

Similar to a Session Ticket, the server encrypts the client's address and the current timestamp with its own key to generate the STK.
On the next handshake, the client presents the STK. Because the STK cannot be tampered with, the server decrypts it with its own key. If the decrypted address matches the address the client is handshaking from and the timestamp is within the validity period, the client is trusted and the connection can be established.
Because the client has no STK on its first handshake, the server replies to that handshake with a REJ message that carries the STK.

In theory, as long as the client caches the STK and presents it on the next handshake, the server can validate it directly, which achieves 0-RTT at the transport layer. In practice, however, there are two problems:

The STK is encrypted by one server, so if the client is routed to a different server next time, that server must also be able to validate it;
The STK encodes the client's previous address. If the client's address has changed by the next handshake, validation fails. This is very likely on mobile clients, especially with IPv6, where the client's egress address changes often.

Now for our solution. The first problem is easy to solve: we only need to make sure that every machine in the cluster uses the same key to generate STKs. For the second problem, our approach is:

We extended the STK with a Client ID. The Client ID is generated by the client through the Wireless Bodyguard black box and is globally unique, similar to a device's SIM ID. The client passes it to the server in an encrypted Transport Parameter, and the server includes it in the STK;
If STK validation fails because the client IP has changed, the server validates the Client ID instead. Because the ID never changes for a given client, validation can succeed, provided of course that the client is genuine. To guard against leaked Client IDs, we selectively apply rate limiting to Client ID validation.
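The token check with a Client ID fallback can be sketched in Python. Note the substitution: the article describes an encrypted STK, whereas this sketch protects the token with an HMAC tag instead, because Python's standard library has no authenticated cipher. The key value, the validity window, the token layout, and the function names are all assumptions for illustration.

```python
import hashlib
import hmac
import struct
import time

# Hypothetical key; every server in the cluster must share it so that a
# token issued by one machine validates on any other.
CLUSTER_KEY = b"shared-by-every-server-in-the-cluster"
STK_LIFETIME = 24 * 3600  # assumed validity window, in seconds
TAG_LEN = 32              # size of an HMAC-SHA256 tag


def issue_stk(client_ip, client_id, now=None):
    """Bind the client's address, Client ID and issue time into a token."""
    ts = int(time.time() if now is None else now)
    body = struct.pack("!Q", ts) + (client_ip + "|" + client_id).encode()
    tag = hmac.new(CLUSTER_KEY, body, hashlib.sha256).digest()
    return tag + body


def verify_stk(stk, client_ip, client_id, now=None):
    """Accept if the address matches, or else if the Client ID matches."""
    tag, body = stk[:TAG_LEN], stk[TAG_LEN:]
    expected = hmac.new(CLUSTER_KEY, body, hashlib.sha256).digest()
    if len(body) < 8 or not hmac.compare_digest(tag, expected):
        return False  # malformed or tampered token
    (ts,) = struct.unpack("!Q", body[:8])
    current = time.time() if now is None else now
    if not 0 <= current - ts <= STK_LIFETIME:
        return False  # outside the validity period
    token_ip, _, token_id = body[8:].decode().partition("|")
    if token_ip == client_ip:
        return True
    # The address changed (for example a new IPv6 egress address):
    # fall back to the Client ID carried in the token.
    return token_id == client_id
```

A real server would additionally rate-limit the Client ID fallback path, as the text describes; that part is omitted here.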

Technical point 3: Support QUIC lossless upgrade

UDP lossless upgrade is a known hard problem in the industry. Lossless upgrade means that during a reload or a binary update, the old process can exit gracefully after it finishes handling the data on its existing connections. Taking NGINX as an example, TCP handles lossless upgrade in two main steps:

The old process first closes its listening socket, then closes each connection socket once all requests on that connection are complete;
The new process inherits the listening socket from the old process and starts accepting new requests.

UDP cannot achieve lossless upgrade this way, because UDP has only a listening socket and no per-connection socket as TCP does. All packets are sent and received on that one socket, which causes the following problems during a hot upgrade:

During the hot upgrade, after the old process forks the new process, the new process inherits the listening socket and starts calling recvmsg;
If the old process closes the listening socket at this point, it can no longer receive in-flight packets, so it cannot exit gracefully;
If the old process keeps listening, both processes receive packets for new connections at the same time, so the old process can never exit.

Existing solutions. The industry has some methods for this problem, such as carrying the process ID in the packet and forwarding the packet between the new and old processes when it arrives at the wrong one. For performance and other reasons at the access layer, we do not want this extra hop. Building on our deployment architecture, we designed the following lossless upgrade scheme based on multi-port rotation. Put simply, the new and old processes listen on different groups of ports, and the port is carried in the CID so that QUIC LB can forward each packet to the new or old process by port. To simplify operations, the ports rotate: after N reloads, a process reuses a previously chosen port. As shown below:

During a lossless upgrade, the old process closes its base port so that it no longer accepts first Initial packets, which is similar to closing the listening socket in TCP;
The old process's working port keeps working to receive the residual traffic of its existing connections;
The new process's base port starts receiving first Initial packets and opening new connections, which is similar to opening the listening socket in TCP;
The new process's working port is (I + 1) mod N, where N is the number of process generations that can coexist. For example, N = 4 means that four reloads can overlap, with four generations (Old, New1, New2, and New3) alive at the same time. I is the port number of the previous process. The +1 applies because there is only one worker here; with M workers, add M instead;
Connections established through the new process are then forwarded by the Load Balancer to the new process's Working Port.
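The port rotation rule can be sketched in Python. This is one possible reading of the (I + 1) mod N rule, under the assumption that working ports occupy a contiguous range starting at a hypothetical first_port and that the modulo applies to the slot index within that range rather than to the raw port number; with M workers and N generations the range holds M * N slots. The function name and parameters are illustrative only.

```python
def next_working_port(prev_port, first_port, workers, generations):
    """Return the working port a worker listens on after one more reload.

    prev_port    working port used by the same worker before the reload
    first_port   first port of the assumed contiguous working-port range
    workers      number of worker processes (M)
    generations  number of process generations that may coexist (N)
    """
    slots = workers * generations
    prev_slot = prev_port - first_port
    # Step forward by one slot per worker and wrap around after N reloads.
    return first_port + (prev_slot + workers) % slots
```

With one worker and N = 4 starting at port 9000, successive reloads would use 9000, 9001, 9002, 9003 and then return to 9000, so a generation must have drained its connections before its port comes around again.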

Technical point 4: Client intelligent routing

Although the motivation for deploying QUIC is sound, new things rarely develop without setbacks. QUIC runs over UDP, and carriers support UDP less well than TCP, which shows in the following:

When bandwidth is tight, UDP traffic is often throttled;
Some firewalls drop UDP packets outright;
NAT gateways also keep UDP sessions alive for a shorter time.

We have also observed that phones from different manufacturers differ in how well they support UDP, so blindly switching all traffic to QUIC during deployment could lead to unpredictable results. For this reason, we designed on the client the mutual-backup TCP and QUIC links introduced earlier. As shown in the figure below, we measure the RTT, packet loss rate, request completion time, and error rate of the TCP link and the QUIC link in real time and score the two links with a quantitative method. The score decides which link to use, which avoids the problems of committing to a single link.
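The link scoring can be sketched in Python. The article does not disclose the actual quantitative method, so the weights, the lower-is-better convention, and the switching margin below are all made up for illustration; only the four measured inputs come from the text.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    rtt_ms: float          # measured round-trip time
    loss_rate: float       # packet loss rate, 0.0 to 1.0
    completion_ms: float   # request completion time
    error_rate: float      # fraction of failed requests, 0.0 to 1.0


def score(stats):
    """Combine the four measurements into one cost; lower is better.

    The weights are hypothetical and only show the shape of the idea.
    """
    return (stats.rtt_ms
            + stats.completion_ms
            + 1000 * stats.loss_rate
            + 2000 * stats.error_rate)


def choose_link(tcp, quic, current="tcp", margin=0.1):
    """Switch links only when the other one is better by a clear margin.

    The margin (an assumption, not from the article) keeps the client
    from flapping between links when the two scores are close.
    """
    scores = {"tcp": score(tcp), "quic": score(quic)}
    other = "quic" if current == "tcp" else "tcp"
    if scores[other] < scores[current] * (1 - margin):
        return other
    return current
```

A client would refresh both LinkStats from live probes and call choose_link before each request, passing the link it used last as current.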

Summary

This article introduced QUIC's deployment design, scenarios, and key technologies at Ant. Among the key technologies, we focused on how the QUIC LB component and the multi-port listening mechanism we proposed gracefully support QUIC connection migration, lossless upgrade of the QUIC server, and more. With this design, our access gateway does not depend on changes to the underlying kernel as common industry solutions do, which makes our solution much easier to deploy, especially in public cloud scenarios. Beyond connection migration, we also proposed a scheme to raise the 0-RTT connection ratio and a client intelligent routing scheme to maximize the benefit of QUIC on mobile clients. To date, QUIC has been running smoothly in both scenarios, the Alipay mobile client and the global acceleration link, and has brought good business benefits.

Future plans

Over the past two years, we have built mainly on the community's gQUIC, making full use of QUIC's protocol advantages and combining them with Ant's business characteristics to maximize the benefit on mobile clients. We have proposed several original solutions and actively promoted them to the community and the IETF. As Ant explores more lines of business and HTTP3.0/QUIC is about to become a standard, we will keep developing the value of QUIC, mainly in the following directions:

Using QUIC's advantage of running in the application layer, design a unified QUIC transmission control framework that adapts to service type and network type, tuning transmission for each so as to optimize the network experience of each service;
Switch from gQUIC to IETF QUIC and promote further adoption of standard HTTP3.0 at Ant;
Contribute Ant's QUIC LB technique to IETF QUIC LB so that it eventually evolves into a standard QUIC LB;
Explore and deploy MPQUIC (Multipath QUIC) to maximize the benefit on mobile clients;
Continue optimizing QUIC performance with kernel technologies such as UDP GSO, eBPF, and io_uring;
Explore opportunities for QUIC to carry east-west traffic on the internal network.
