0 Preface

For applications to communicate and cooperate to deliver business functions, they need a transmission protocol.

The transmission protocol is the language applications use to talk to each other. There are few hard rules for designing one: it is enough that the applications on both sides can parse the protocol correctly and without ambiguity.

1 Segmentation

1.1 Separator

A transmission protocol is also a kind of language. When transmitting data, the first problem to solve is sentence segmentation.

What does the data received from the transport layer look like?

It is a sequence of bytes, but because of the uncertainty of the network, the segments you receive do not necessarily match the segments the sender wrote.

Wouldn't it be enough to add "punctuation" to the protocol?

And there is no need for as many punctuation marks as in natural language; a single separator is enough.

This method is indeed feasible, and many transmission protocols use it. HTTP 1.0, for example, ends its request line and each header line with \r\n (carriage return plus line feed). But there is a problem: in natural language, punctuation marks are dedicated symbols. They have no other meaning and are naturally distinct from words.
In data transmission, however, whatever character you define as the delimiter can in theory also appear in the transmitted data.

So how do you distinguish a separator that appears in the data from a real delimiter?

When sending, first escape every delimiter that appears inside the data, then append the real delimiter; when receiving, split on the delimiter and unescape the data back.
This is a troublesome process and it costs some performance.
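The escaping round trip described above can be sketched as follows. This is only an illustration under assumed choices: the delimiter byte, the escape byte, and the function names are hypothetical and not taken from any particular protocol.

```python
DELIM = b"\n"   # assumed frame delimiter (hypothetical choice)
ESC = b"\\"     # assumed escape byte (hypothetical choice)

def escape(payload: bytes) -> bytes:
    # Escape the escape byte first, then replace every delimiter in the data
    # with the two-byte sequence ESC + "n".
    return payload.replace(ESC, ESC + ESC).replace(DELIM, ESC + b"n")

def unescape(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 1] == ESC:
            nxt = data[i + 1:i + 2]
            # ESC + "n" stands for a delimiter byte, ESC + ESC for a literal ESC.
            out += DELIM if nxt == b"n" else nxt
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def encode_frames(messages):
    # Each message is escaped and then terminated by the real delimiter.
    return b"".join(escape(m) + DELIM for m in messages)

def decode_frames(stream: bytes):
    # After escaping, a raw DELIM byte can only mark the end of a frame.
    # The last split element is whatever follows the final delimiter.
    return [unescape(part) for part in stream.split(DELIM)[:-1]]
```

Every byte of every message is scanned on both the sending and the receiving side, which is where the extra cost comes from.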

1.2 Preset length

A more practical method.

Precede each sentence with a number indicating the length of the sentence. When you receive the data, read it according to the length.

For example: 03 rainy days 03 guest days 02 days to stay 03 I will not stay
Here a fixed-width 2-digit field stores the length, so each sentence can hold at most 99 words (the counts refer to the characters of the original Chinese sentence, so they do not match the English rendering word for word). Processing on the receiving side is simple: first read the 2-digit number 03, which says that the next 3 words form the first sentence; wait until those 3 words have arrived and take them as the first sentence; then read the second and third sentences in the same way.

This solves the sentence segmentation problem well. It is simpler to implement than the separator method and performs better, and it is a commonly used way of framing data.
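A minimal sketch of the 2-digit length scheme, assuming the length counts characters and that the whole stream is already available as a string. The names encode and decode are hypothetical.

```python
def encode(sentences):
    """Prefix every sentence with its length as exactly 2 decimal digits."""
    parts = []
    for s in sentences:
        if len(s) > 99:
            raise ValueError("a 2-digit length field holds at most 99")
        parts.append(f"{len(s):02d}{s}")
    return "".join(parts)

def decode(stream):
    """Split a complete stream back into sentences by reading length, then data."""
    sentences = []
    pos = 0
    while pos < len(stream):
        length = int(stream[pos:pos + 2])   # the 2-digit length field
        pos += 2
        sentences.append(stream[pos:pos + length])
        pos += length
    return sentences

print(encode(["abc", "defg", "hi"]))   # prints 03abc04defg02hi
print(decode("03abc04defg02hi"))       # prints ['abc', 'defg', 'hi']
```

Unlike the separator approach, the payload is never inspected or rewritten; the decoder only slices it.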

The Redis AOF file appears to use length prefixes as well; this classic solution shows up everywhere.

Does the length prefix have a similar problem? 03 may also appear in the normal text, so should it be escaped?

Think it through, or better, write the code that receives and parses the data yourself, and you will see that the length prefix does not need to be escaped. While parsing, you always know whether the current read position holds a length or real data, so the decision never depends on the content of the data stream.
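To make this concrete, here is a sketch of an incremental parser that receives data in arbitrary chunks. It assumes the 2-digit ASCII length field from the example above, counted in bytes; the class name and its attributes are hypothetical.

```python
class LengthPrefixedParser:
    """Incremental parser for frames of the form <2 ASCII digits><payload>."""

    HEADER_SIZE = 2

    def __init__(self):
        self._buf = bytearray()
        # None: the next bytes are a length header.
        # int:  the next bytes are that many bytes of payload.
        self._need = None

    def feed(self, chunk: bytes):
        """Add newly received bytes and return every frame that is now complete."""
        self._buf += chunk
        frames = []
        while True:
            if self._need is None:
                if len(self._buf) < self.HEADER_SIZE:
                    break                      # header not fully received yet
                header = bytes(self._buf[:self.HEADER_SIZE])
                self._need = int(header.decode("ascii"))
                del self._buf[:self.HEADER_SIZE]
            if len(self._buf) < self._need:
                break                          # payload not fully received yet
            frames.append(bytes(self._buf[:self._need]))
            del self._buf[:self._need]
            self._need = None
        return frames

parser = LengthPrefixedParser()
for chunk in (b"0", b"3030", b"05he", b"llo"):
    print(parser.feed(chunk))
# prints [], then [b'030'], then [], then [b'hello']
```

The first payload is the text 030, which looks like a length field, yet it is never misread: the parser's state, not the content, decides how each byte is interpreted.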

2 Duplex transceiver

2.1 Simplex communication

At any one time, data can only be transmitted in one direction. When one person speaks, the other can only listen.
The HTTP 1.0 protocol works like this. After the client establishes a connection with the server, the client sends a request and then waits until the server returns a response or the request times out. During this time, no other request can be sent on this connection.

This kind of simplex communication is inefficient. To work around the performance problem, many browsers and apps have to open multiple connections between the client and the server at the same time.

In simplex communication, requests and responses are sent and received in sequence, so they correspond to each other naturally. It is like being questioned by your girlfriend: she asks one question, and you answer one question. How efficient can that kind of communication be?

2.2 Duplex communication

The TCP connection is a full-duplex channel, which can send and receive data in both directions at the same time without affecting each other. To improve throughput, the application layer protocol must support duplex communication.

With duplex communication, once the client and server have established a connection, both sides can send and receive messages on the socket at any time; the server is no longer limited to reacting only after it has received a request.

Suppose you and your partner can both talk and listen at the same time. After switching to a duplex protocol, the conversation quickly becomes chaotic, and you can no longer tell whether a sentence answers a question or expresses a new opinion.
Under concurrency, ordering cannot be guaranteed. When actually designing a protocol, you generally do not care about the order; you only need to make sure that each response can be matched to its request.

Solving the correspondence problem

When sending a request, attach a serial number to it. The serial number must be unique within the session. The response then carries the serial number of the request it answers, so the request and the response can be matched.

With serial numbers, even if the answers arrive in a jumbled order, it is clear which question each one belongs to.

You and your partner each number the requests you send, and when replying to the other side, simply include the number of the request being answered. This solves the main problem of duplex communication.
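The matching step can be sketched like this. It is a simplified, single-threaded illustration: messages are plain dictionaries rather than bytes on a socket, and the class name and the "seq" and "body" field names are hypothetical.

```python
import itertools

class Correlator:
    """Matches out-of-order responses to requests by serial number."""

    def __init__(self):
        self._ids = itertools.count(1)   # serial numbers unique within this session
        self._pending = {}               # serial number -> request still awaiting a response

    def send(self, body):
        """Number a request and remember it until its response arrives."""
        seq = next(self._ids)
        self._pending[seq] = body
        return {"seq": seq, "body": body}

    def receive(self, response):
        """Look up the request that a response answers, using the echoed serial number."""
        request_body = self._pending.pop(response["seq"])
        return request_body, response["body"]

client = Correlator()
first = client.send("ping")      # gets serial number 1
second = client.send("time?")    # gets serial number 2

# Responses arrive in the opposite order, but still match their requests.
print(client.receive({"seq": 2, "body": "12:00"}))   # prints ('time?', '12:00')
print(client.receive({"seq": 1, "body": "pong"}))    # prints ('ping', 'pong')
```

In a real client the pending table would typically hold a future or callback per request, and entries would need a timeout so that lost responses do not accumulate.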

Within a message, does the unique serial number come first, followed by the data length and then the content?
If so, how would the receiver know how long the serial number is, so that it can tell the serial number apart from the data length that precedes the content?

The data length must come first, and the serial number is treated as part of the data, so the serial number comes after the data length.
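One possible frame layout that follows this rule is sketched below. The field widths (a 4-byte length and an 8-byte serial number, both big-endian) and the function names are assumptions made for illustration; the article does not prescribe them.

```python
import struct

LENGTH = struct.Struct(">I")   # assumed: 4-byte big-endian length of everything after it
SEQ = struct.Struct(">Q")      # assumed: 8-byte big-endian serial number

def pack_frame(seq: int, content: bytes) -> bytes:
    # The serial number is part of the length-prefixed data.
    body = SEQ.pack(seq) + content
    return LENGTH.pack(len(body)) + body

def unpack_frame(buf: bytes):
    """Return (seq, content, remaining bytes), or None if the frame is incomplete."""
    if len(buf) < LENGTH.size:
        return None
    (length,) = LENGTH.unpack_from(buf, 0)
    if length < SEQ.size:
        raise ValueError("frame too short to contain a serial number")
    end = LENGTH.size + length
    if len(buf) < end:
        return None
    (seq,) = SEQ.unpack_from(buf, LENGTH.size)
    content = buf[LENGTH.size + SEQ.size:end]
    return seq, content, buf[end:]

frame = pack_frame(7, b"hello")
print(len(frame))                    # prints 17
print(unpack_frame(frame + b"xx"))   # prints (7, b'hello', b'xx')
```

Because the serial number has a fixed width and sits inside the length-prefixed data, the receiver first reads the length, then knows that the first 8 bytes of the data are the serial number and the rest is the content.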

3 Summary

When designing a transmission protocol, there is no absolute specification; it is enough that the applications on both sides can recognize the protocol and communicate with each other.

The first thing to solve is sentence segmentation. There are two segmentation schemes: "separator" and "length prefix" (preset length).

Using an ID to match each response to its request is a common way to implement duplex communication, and it can effectively improve the throughput of data transmission.

Once sentence segmentation is solved and duplex communication is in place, combining them with a dedicated serialization method yields a high-performance network communication protocol and, with it, high-performance inter-process communication. Many MQ and RPC frameworks implement their own private application-layer transmission protocols this way.

A simple high-performance communication program as an exercise: three rounds of dialogue between you and your partner. The server is your partner and the client is you. Let the two meet in the living room one million times and record the total time.

https://github.com/WangYangA9/netty-FullDuplex-example
https://sourcegraph.com/github.com/swgithub1006/mqlearning/-/tree/src/main/java/org/coffee/mqlearning


JavaEdge