Introduction to Network Programming for Lazy People (14): What is Socket? Understand it in one sentence!

This article was shared by cxuan under the original title "So this is the Socket", and has been revised.

1. Introduction

The previous articles in this series mainly explained the theoretical basis of computer networks. For application-layer developers of instant messaging (IM), however, what they actually deal with day to day is the various network APIs.

In this article, let's talk about Socket, the API most familiar to network application programmers. Putting aside the obscure computer network theory, we will look at what a Socket is from the perspective of the application layer.

This article covers the following aspects of Socket:

1) What Socket is;
2) How a Socket is created;
3) How a Socket is connected;
4) How a Socket sends and receives data;
5) How a Socket is disconnected;
6) How a Socket is deleted.

Special note: unless otherwise specified, "Socket", "network socket" and "socket" in this article all refer to the same thing.

Learning and exchange:

Introductory article on mobile IM development: "One entry is enough for beginners: developing mobile IM from scratch"
Open source IM framework source code: https://github.com/JackJiang2011/MobileIMSDK

(This article has been published simultaneously at: http://www.52im.net/thread-3821-1-1.html )

2. What is Socket

A data packet is generated by the application and enters the protocol stack, where the various headers are added. The operating system then calls the network card driver, which instructs the hardware to send the data to the peer host.

The general diagram of the whole process is as follows:

As we all know, the protocol stack is actually a stack of some protocols located in the operating system, including TCP, UDP, ARP, ICMP, IP and so on.

Usually a protocol is designed to solve a specific problem, such as:

1) TCP is designed for the reliable transmission of data;
2) UDP is designed for small packets and high transmission efficiency;
3) ARP is designed to look up physical (MAC) addresses from IP addresses;
4) ICMP is designed to return error messages to the host;
5) IP is designed to interconnect hosts on a large scale.

Data generated by applications such as browsers, e-mail clients and file transfer servers is transmitted through transport layer protocols. The application does not deal with the transport layer directly; instead, there is a component that connects the application layer to the transport layer, and this component is the Socket.

In the above picture, the application program includes a socket and a resolver. The function of the resolver is to initiate a query to the DNS server to query the target IP address (for DNS, please refer to "Theory Linking Practice, All-round In-depth Understanding of DNS").

Below the application is the inside of the operating system, which contains the protocol stack, a stack of a series of protocols.

Below the operating system is the network card driver, which controls the network card hardware; driven by it, the hardware does the actual sending and receiving.

Inside the operating system, there is a storage space for storing control information, and this storage space records the control information for controlling communication. In fact, the control information is the entity of the Socket, or the memory space where the control information is stored is the entity of the Socket.

This may still sound abstract, so let me use the netstat command to show you what a Socket looks like.

We enter in the Windows command prompt:

netstat -ano

netstat displays Socket information, and -ano is a combination of three options:
-a displays not only the Sockets that are communicating but also those that have not yet started communication;
-n displays IP addresses and port numbers in numeric form;
-o displays the PID of the program that owns the Socket.

My computer gives the following results:

As shown in the figure:

1) Each line is equivalent to a Socket;
2) Each column is one element of the tuple.

So, a Socket is a five-tuple:

1) Protocol;
2) Local address;
3) Foreign address;
4) State;
5) PID.

PS: Sometimes it is also called a four-tuple, which does not include the protocol.

Let's interpret the data in the above figure, such as the first row in the figure:

1) Its protocol is TCP, and both the local address and the remote address are 0.0.0.0 (this means that the communication has not yet started, and the IP address has not yet been determined).

2) The local port is known to be 135, but the remote port is not yet known. The state at this time is LISTENING (LISTENING means that the application has been opened and is waiting to establish a connection with the remote host. Regarding the transition between various states, you can read "Easy to Understand - Deep Understanding of the TCP Protocol (Part 1): Theoretical Basis").

3) The last element is the PID, the process identifier. A PID is like our ID number: it identifies exactly one process.
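The row interpreted above can be reproduced in a few lines. The following is a minimal sketch using Python's standard socket module; the loopback address and the variable names are chosen only for this illustration. It creates a listening TCP Socket and shows that its local address is known while its remote address is not.

```python
import socket

# A listening TCP socket: protocol and local address are known, the peer is not.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0 asks the OS to pick a free port
listener.listen(1)

local_ip, local_port = listener.getsockname()
try:
    listener.getpeername()
    has_peer = True
except OSError:
    # A listening socket has no remote address yet.
    has_peer = False

print("protocol: TCP")
print("local address:", f"{local_ip}:{local_port}")
print("remote address known:", has_peer)

listener.close()
```

While the listener is open, running netstat in another terminal should list a matching row in the listening state.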

3. How is Socket created?

Through the explanation in the previous section, you may now have a basic understanding of Socket. Take a drink first, take a break, and let us continue to explore Socket.

Now I have a question, how is Socket created?

Sockets are created with the application.

There is a socket component in the application. When the application starts, it calls socket to request the creation of a Socket, and the protocol stack creates one in response. The first step is to allocate the memory a Socket needs. This is like preparing a container for the control information; a container alone is of no use, so the control information then has to be written into it. Conversely, without the allocated memory there would be nowhere to store the control information, so both steps, allocating memory and writing control information, are essential. At this point the Socket has been created.

After the Socket is created, a Socket descriptor is returned to the application. The descriptor is like a number plate that tells different Sockets apart, and the application must supply it whenever it asks the protocol stack to send or receive data.
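To make the descriptor concrete, here is a small sketch in Python (the variable names are ours). Each call to socket.socket() asks the operating system for a new Socket, and fileno() exposes the descriptor that came back.

```python
import socket

# Each call asks the operating system to allocate a new socket.
sock_a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock_b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# fileno() returns the descriptor the OS handed back for each socket.
fd_a = sock_a.fileno()
fd_b = sock_b.fileno()
print("descriptors:", fd_a, fd_b)

sock_a.close()
sock_b.close()

# After close() the descriptor is no longer valid; Python reports -1.
closed_fd = sock_a.fileno()
print("after close:", closed_fd)
```

The actual descriptor values depend on the operating system and on what else the process has open, so they will differ from run to run.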

4. How the Socket is connected

After the Socket is created, it will eventually serve for data sending and receiving. However, before the data is sent and received, there is still a step of "connection" (the term is connect), and there is a whole set of procedures for establishing a connection.

This "connection" is not a real connection (a water hose between two computers? Not what you think...).

In fact, this "connection" is the process by which the applications on two hosts exchange control information over the network medium, following the TCP/IP protocol standard.

After the Socket is just created, there is no data and no communication object.

In this state: even if you let the client application delegate the protocol stack to send data, it doesn't know where to send it. Therefore, the browser needs to query the IP address of the server according to the URL (the protocol that does this work is DNS), and after querying the target host, tell the protocol stack the IP of the target host. At this point, the client side is ready.

On the server side, a Socket needs to be created just as on the client, and it likewise does not know who it will be communicating with, so the client has to tell the server the necessary information about itself: its IP address and port number.

Now that the necessary information for the two communicating parties to establish a connection is available, the "connection" process can begin.

First: the client application needs to call the connect method in the Socket library and provide the socket descriptor and the server IP address and port number.

The following is the pseudocode call of connect:

connect(<descriptor>, <server IP address and port number>)
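In Python's standard socket module, the pseudocode above might look like the following sketch. The loopback server is only a stand-in so that the example can run on a single machine, and all names here are chosen for illustration. The client object wraps the descriptor, and the address tuple carries the server's IP address and port number.

```python
import socket

# Stand-in server on the loopback interface (address chosen for this sketch).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
server.listen(1)
server_addr = server.getsockname()

# connect(<descriptor>, <server IP address and port number>)
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)

# The server side obtains a new socket dedicated to this client.
conn, client_addr = server.accept()
peer_seen_by_client = client.getpeername()
local_seen_by_client = client.getsockname()
print("client sees server at", peer_seen_by_client)
print("server sees client at", client_addr)

for s in (conn, client, server):
    s.close()
```

Once connect returns, both Sockets know their own address and the address of the other side.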

This information is passed to the TCP module in the protocol stack. The TCP module builds the request message and adds the TCP header, passes it to the IP module to add the IP header, and then to the data link layer to add the frame header.

The message then travels over the network medium to the server, where the frame header, IP header and TCP header are parsed in turn to find the corresponding Socket.

After the Socket receives the request, it records the relevant information and changes its state to connecting.

After the request process is completed: the TCP module of the server will return a response, and this process is the same as that of the client (if you are not sure about the encapsulation process of the message header, you can read "A Quick Understanding of the TCP Protocol is Enough").

In a complete request and response process, control information plays a very critical role:

1) SYN is short for synchronize. The client first sends a SYN packet to ask the server to establish a connection;
2) ACK is short for acknowledgment. It is the response to the SYN packet that was received;
3) FIN is short for finish. It means that the client or server wants to terminate the connection.
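These three items are single bits in the flags field of the TCP header. As a rough sketch (the helper name flag_names is ours; the bit values are the ones defined in the TCP specification), they can be decoded like this:

```python
# Bit values of three TCP control flags in the header's flags field.
FIN = 0x01
SYN = 0x02
ACK = 0x10

def flag_names(flags):
    """Return the names of the control flags set in a TCP flags value."""
    names = []
    for name, bit in (("SYN", SYN), ("ACK", ACK), ("FIN", FIN)):
        if flags & bit:
            names.append(name)
    return names

print(flag_names(SYN))        # the client's first packet
print(flag_names(SYN | ACK))  # the server's reply
print(flag_names(FIN | ACK))  # one side asking to close
```

Several flags can be set in the same packet, which is why the server can acknowledge the client's SYN and send its own SYN at the same time.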

Due to the complex and changeable network environment, data packets are often lost, so the two parties need to confirm each other whether the data packets of the other party have arrived when communicating, and the criterion for judgment is the value of ACK.

The above text is not vivid enough, the animation can better illustrate the process:

▲ The above picture is quoted from "Following Animation to Learn TCP Three-way Handshake and Four Waves"

(PS: For the detailed theoretical knowledge of this "connection", you can read "Theoretical Classics: Detailed Explanation of the Three-Way Handshake and Four-Way Wave Process of the TCP Protocol" and "Learn TCP Three-Way Handshake and Four-Way Wave Process with Animation", which will not be repeated here.)

When all the messages for establishing the connection have been sent and received normally, the Socket enters a state in which it can send and receive data. At this point the two Sockets can be thought of as joined by a pipe, although of course no such pipe really exists. Once the connection is established, the protocol stack's connect operation is finished, and control returns to the application.

In addition: If you are more familiar with Socket code, you can read this "Teach you to write TCP-based Socket long connection" first.

5. How does Socket send and receive data

When connect returns control to the application, as described in the previous section, the data sending and receiving stage begins directly.

The data sending and receiving operation starts when the application calls write to send the data to be sent to the protocol stack, and the protocol stack executes the sending operation after receiving the data.
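As a minimal sketch of this step in Python, sendall() plays the role of write and recv() the role of read. The loopback server and the message are assumptions made only so the example is self-contained.

```python
import socket

# Loopback server and client, set up only for this illustration.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
conn, _ = server.accept()

# sendall() hands the bytes to the protocol stack, which does the sending.
message = b"hello socket"
client.sendall(message)

# recv() may return fewer bytes than asked for, so read until complete.
received = b""
while len(received) < len(message):
    chunk = conn.recv(1024)
    if not chunk:
        break
    received += chunk
print(received)

for s in (conn, client, server):
    s.close()
```

The application only sees bytes going in and coming out; everything between the two calls is the protocol stack's job.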

The protocol stack does not care what data the application transmits, because all of it is ultimately converted into a binary sequence. After receiving the data, the protocol stack does not send it immediately; it places the data in the send buffer and waits for the application to hand over the next piece of data.

Why is the received data not sent directly, but placed in the buffer first?

Because if data were sent the moment it was received, a large number of small packets could be sent, reducing network efficiency. The protocol stack therefore accumulates a certain amount of data before sending it.
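Applications can inspect and influence this buffering through socket options. The sketch below is only an illustration: it reads the size of the send buffer and sets TCP_NODELAY, which turns off Nagle's algorithm, one common mechanism for holding back small packets. The article does not name that algorithm, so treat the link to it as our assumption.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Size in bytes of the send buffer the OS keeps for this socket.
send_buffer_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print("send buffer:", send_buffer_size, "bytes")

# TCP_NODELAY asks the stack to send small pieces of data without
# waiting to accumulate more (it disables Nagle's algorithm).
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print("TCP_NODELAY enabled:", nodelay != 0)

sock.close()
```

Latency-sensitive programs such as IM clients sometimes enable this option, trading some network efficiency for faster delivery of small messages.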

Different versions and types of operating systems differ in how much data the protocol stack accumulates in the buffer.

However, all operating systems follow these criteria:

1) The first criterion is the amount of data that each network packet can hold. It is determined by the MTU, the maximum length of a network packet. The MTU includes the headers, so the space left for data is the MTU minus the header length, and this maximum data length is called the MSS.

2) The other criterion is time. When the application produces data slowly, so that the buffer fills slowly, waiting for a full MSS before every send could cause long delays. In that case the data should be sent even if its length has not reached the MSS.

But the protocol stack does not tell us how to balance these two factors. If data length is given priority, sending may be delayed; if time is given priority, network efficiency drops.
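As a worked example of the first criterion, assume a typical Ethernet MTU of 1500 bytes and IPv4 and TCP headers of 20 bytes each with no options. These figures are common defaults, not values taken from the article.

```python
MTU = 1500         # typical Ethernet MTU in bytes (assumed for this example)
IP_HEADER = 20     # IPv4 header without options
TCP_HEADER = 20    # TCP header without options

# The MSS is what remains of the MTU once the headers are subtracted.
MSS = MTU - IP_HEADER - TCP_HEADER
print(MSS)  # 1460
```

Header options, tunnels or a different link type change these numbers, so the MSS is negotiated per connection rather than fixed.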

After a while...

Suppose the length criterion is in use: the buffer is now full and the protocol stack is about to send the data, only to find that it cannot transmit such a (relatively) large amount of data in one go. What then?

In this case the data in the send buffer exceeds the MSS, so it is split into pieces of at most MSS size. TCP, IP and Ethernet headers are added to each piece, and each piece is placed in its own network packet.
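The splitting step can be sketched in a few lines of Python. The function name and the MSS of 1460 bytes are assumptions for illustration, and the real protocol stack does this inside the operating system, not in application code.

```python
def split_into_segments(data, mss):
    """Split data into chunks of at most mss bytes, preserving order."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]

MSS = 1460                 # assumed value for this example
payload = b"x" * 4000      # data waiting in the send buffer
segments = split_into_segments(payload, MSS)
print([len(s) for s in segments])  # [1460, 1460, 1080]
```

Only the last piece may be shorter than the MSS; joining the pieces in order gives back the original data.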

At this point, the network packet is ready to be sent to the server, but the data sending operation is not over, because the server has not yet confirmed whether the network packet has been received. Therefore, after the client sends the data packet, the server also needs to confirm.

When the TCP module splits the data, it calculates the offset of each network packet. This offset is the number of bytes counted from the beginning of the data, and it is written into the TCP header. The TCP module also generates a sequence number (SEQ) for the network packet, and the server uses this sequence number for confirmation.

The server will confirm the data packet sent by the client. After the confirmation is correct, the server will generate a sequence number and an acknowledgment number (ACK) and send it to the client together. After the client confirms, it will send the acknowledgment number to the server.
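The relationship between sequence numbers and acknowledgment numbers can be illustrated with a simplified model, in which each acknowledgment number is the sequence number of the next byte the receiver expects. The function name and the starting value are hypothetical, and the sketch ignores the 32-bit wraparound of real sequence numbers.

```python
def number_segments(segments, initial_seq):
    """Pair each segment's sequence number with the ACK the receiver returns."""
    result = []
    seq = initial_seq
    for seg in segments:
        ack = seq + len(seg)      # next byte the receiver expects
        result.append((seq, ack))
        seq = ack                 # the next segment starts where this one ended
    return result

segments = [b"a" * 1460, b"b" * 1460, b"c" * 1080]
numbered = number_segments(segments, initial_seq=1)
print(numbered)  # [(1, 1461), (1461, 2921), (2921, 4001)]
```

Because each ACK names the next expected byte, the sender can tell from a single number how much of the data has arrived.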

Let's take a look at the actual working process:

First: the client needs to calculate the initial value of the sequence number when connecting, and send this value to the server.

Next: the server calculates the confirmation number from this initial value and returns it to the client (the initial value may be lost in transit, so when the server receives it, it needs to return a confirmation number to confirm).

At the same time: the server also needs to calculate the initial value of the serial number in the direction from the server to the client, and send this value to the client. Then, the client also needs to calculate the confirmation number according to the initial value sent by the server and send it to the server.

At this point: the connection is established, and then you can enter the data sending and receiving stage.
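The exchange of initial values described above can be written out as a small model. The function and the initial values 1000 and 5000 are made up for illustration; real stacks choose their initial sequence numbers themselves.

```python
def three_way_handshake(client_isn, server_isn):
    """Return (flags, seq, ack) for the three packets of the handshake."""
    syn = ("SYN", client_isn, None)                      # client -> server
    syn_ack = ("SYN+ACK", server_isn, client_isn + 1)    # server -> client
    ack = ("ACK", client_isn + 1, server_isn + 1)        # client -> server
    return [syn, syn_ack, ack]

steps = three_way_handshake(client_isn=1000, server_isn=5000)
for flags, seq, ack_no in steps:
    print(flags, "seq =", seq, "ack =", ack_no)
```

Each side confirms the other's initial value by returning it plus one, which is how both directions end up with an agreed starting point.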

In the data sending and receiving stage, both parties can send requests and responses at the same time, and both parties can also confirm the requests at the same time.

The request-confirmation mechanism is very powerful: with it, we can tell whether the receiver has received a given packet and, if not, resend it, so that any error in the network can be detected and remedied.
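The idea of resending until a confirmation arrives can be sketched as follows. This is a deliberately simplified model, not how TCP is implemented: the channel is a hypothetical function that returns True when an ACK comes back, and real TCP uses timers and windows rather than a simple loop.

```python
def send_with_retry(packet, channel, max_attempts=5):
    """Send packet until the channel reports an ACK; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        if channel(packet):      # True means an ACK came back
            return attempt
    raise TimeoutError("no ACK after %d attempts" % max_attempts)

# Hypothetical lossy channel: loses the first two transmissions.
outcomes = iter([False, False, True])
attempts_used = send_with_retry(b"data", lambda pkt: next(outcomes))
print(attempts_used)  # 3
```

The sender needs no knowledge of why a packet was lost; the missing confirmation alone is enough to trigger a resend.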

The above text is not vivid enough, the animation can better understand the request-confirmation mechanism:

▲ The above picture is quoted from "Following Animation to Learn TCP Three-way Handshake and Four Waves"

Network cards, hubs, and routers (see "Introduction to the Functional Principles of the Most Popular Hubs, Switches, and Routers in History") have no error recovery mechanism: once an error is detected, the packet is simply discarded. The application has no such mechanism either. Only the TCP/IP module does this work.

Because the network environment is complex and changeable, packets can be lost, so there are rules governing how sequence numbers and confirmation numbers are sent. TCP manages confirmation numbers through a window. We will not go into it in this article; you can read "Easy to Understand - In-depth understanding of the TCP protocol (below): RTT, sliding window, congestion handling" to find the answer.

PS: Another article, "When we read and write Socket, what are we reading and writing?", explains this process in detail with animations; you can read it if you are interested.

6. How Socket is disconnected

When the two communicating parties no longer need to send and receive data, they need to disconnect. Different applications disconnect at different times.

Take the Web as an example: the browser sends a request message to the Web server, and the Web server returns a response message. At this point the exchange of data is complete. The server may initiate the disconnection first, or the client may; which side disconnects first is decided by the application and has nothing to do with the protocol stack.

No matter which party initiates the disconnection request, the close procedure of the Socket library will be called.

Let's take the server disconnection as an example: the server initiates a disconnection request, and the protocol stack will generate a disconnected TCP header, which is actually to set the FIN bit, and then entrust the IP module to send data to the client. At the same time, the server's Socket will record information about disconnection.

After receiving the FIN from the server, the client's protocol stack marks the Socket as entering the disconnect process, and the client returns a confirmation number to the server. This is the first step of disconnecting. After this step, the application calls read to read data as usual; since the server has finished sending, the protocol stack notifies the client application that all data has been received.

Once all the data returned by the server has been received, the client calls close to end the sending and receiving operation. The client then generates a FIN and sends it to the server, and after a while the server returns an ACK. At this point, the communication between the client and the server is over.
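From the application's side, the sequence above looks roughly like the following Python sketch. The loopback setup and the message are assumptions for illustration. The server closes first, the client keeps reading until recv() returns empty bytes, which is how Python reports that the other side has finished sending, and then the client closes too.

```python
import socket

# Loopback server and client, set up only for this illustration.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
conn, _ = server.accept()

conn.sendall(b"response")
conn.close()                 # server side closes first: its stack sends FIN

data = b""
while True:
    chunk = client.recv(1024)
    if not chunk:            # b"" means the peer has finished sending
        break
    data += chunk
print(data)

client.close()               # now the client closes: its stack sends FIN
server.close()
```

Data sent before close is still delivered, which is why the client reads everything before it sees the end of the stream.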

The above text is not vivid enough, the animation can better illustrate the process:

▲ The above picture is quoted from "Following Animation to Learn TCP Three-way Handshake and Four Waves"

PS: For detailed theoretical knowledge of disconnection, you can read "Theoretical Classics: Detailed Explanation of the 3-Way Handshake and 4-Way Wave Process of TCP Protocol" and "Learn TCP Three-Way Handshake and 4-Way Hand Wave with Animation", which will not be repeated here.

7. Socket deletion

After the above communication process is completed, the Socket used for communication will no longer be used, and we can delete the Socket at this time.

However, at this time, the Socket will not be deleted immediately, but will be deleted after a period of time.

Waiting for this period of time guards against misoperation, the most common case being that the confirmation number returned by the client is lost. How long to wait depends on how packet retransmission works, which we will not discuss in depth here.

For a deeper, system-level look at the whole process of Socket operation, Zhang Yanfei's article "In-depth Operating System, Understanding the Receiving Process of Network Packets from the Kernel (Linux)" is recommended.
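One practical consequence of this delay is that a restarted server may find its port still occupied by Sockets that are waiting to be deleted (TCP calls this state TIME_WAIT, a name the text above does not use). A common workaround on Linux and most Unix-like systems, sketched here as an illustration, is to set SO_REUSEADDR before bind. The option behaves differently on Windows.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Allow bind() to succeed even if sockets from a previous run on the
# same port have not been deleted yet.
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
server.listen(1)

reuse_enabled = server.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0
print("SO_REUSEADDR enabled:", reuse_enabled)

server.close()
```

The option must be set before bind to have any effect on that bind call.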

8. Series of articles

This article is the 14th in a series of articles, the outline of which is as follows:

[1] Introduction to Network Programming for Lazy People (1): Quickly Understand Network Communication Protocols (Part 1)
[2] Introduction to Network Programming for Lazy People (2): Quickly Understand Network Communication Protocols (Part 2)
[3] Introduction to Network Programming for Lazy People (3): A quick understanding of the TCP protocol is enough
[4] Introduction to Network Programming for Lazy People (4): Quickly Understand the Difference Between TCP and UDP
[5] Introduction to Network Programming for Lazy People (5): Quickly understand why UDP is sometimes more advantageous than TCP
[6] Introduction to Network Programming for Lazy People (6): Introduction to the Functional Principles of the Most Popular Hubs, Switches, and Routers in History
[7] Introduction to Network Programming for Lazy People (7): Explain the profound things in simple language and fully understand the HTTP protocol
[8] Introduction to Network Programming for Lazy People (8): Teach you how to write TCP-based Socket long connections
[9] Introduction to Network Programming for Lazy People (9): A popular explanation, why use a MAC address when you have an IP address?
[10] Introduction to Network Programming for Lazy People (10): Quickly read the QUIC protocol in a pee
[11] Introduction to Network Programming for Lazy People (11): Understand what IPv6 is in one article
[12] Introduction to Network Programming for Lazy People (12): Quickly read and understand the Http/3 protocol, one article is enough!
[13] Introduction to Network Programming for Lazy People (13): Quickly understand the difference between TCP and UDP in a pee time
[14] Introduction to Network Programming for Lazy People (14): What is Socket? Understand it in one sentence! (* This article)



JackJiang
