This article was written by Zhang Yanfei and originally titled "How much do you know about the local network communication process of 127.0.0.1?". It was first published in the "Developing Internal Strength Training" public account; please contact the author for permission to reprint. This version contains some changes.
1 Introduction
Following "Do you really understand the difference between 127.0.0.1 and 0.0.0.0?", this is my second article on the basics of local network programming.
This article was originally shared by the author Zhang Yanfei. The reason for writing it is that local network IO is now used very widely: in PHP deployments, Nginx and php-fpm generally communicate through 127.0.0.1, and in microservices, the sidecar pattern makes local network requests more and more common. A deep understanding of this topic is therefore valuable in the technical practice of all kinds of network communication applications.
Today we will clarify the questions around 127.0.0.1 local network communication!
To make the discussion easier, I have split the topic into three questions:
1) Does 127.0.0.1 local network IO need to go through a network card?
2) Compared with the external network communication, what is the difference in the core receiving and sending process?
3) Can 127.0.0.1 be faster than 192.168.x?
I believe these questions, even for veteran developers who have long worked on instant messaging networks, seem familiar yet are hard to explain thoroughly. This time, let's figure them out once and for all!
(This article was published simultaneously at: http://www.52im.net/thread-3600-1-1.html)
2. Series of articles
This article is the 13th article in a series. The outline of this series is as follows:
"The Unknown Network Programming (1): Analysis of the Intractable Diseases in the TCP Protocol (Part 1)"
"The Unknown Network Programming (2): Analysis of the Intractable Diseases in the TCP Protocol (Part 2)"
"Unknown Network Programming (3): Why TIME_WAIT, CLOSE_WAIT When Close TCP Connection"
"The Unknown Network Programming (4): In-depth study and analysis of TCP's abnormal shutdown"
"Unknown Network Programming (5): UDP Connectivity and Load Balancing"
"The Unknown Network Programming (6): Deeply understand the UDP protocol and use it well"
"The Unknown Network Programming (7): How to make the unreliable UDP reliable? 》
"Unknown Network Programming (8): Decrypting HTTP from the Data Transmission Layer"
"The Unknown Network Programming (9): Combining Theory with Practice, and Understanding DNS in an All-round Way"
"The Unknown Network Programming (10): Go Deep into the Operating System and Understand the Process of Receiving Network Packets from the Kernel (Linux)"
"Unknown Network Programming (11): Starting from the bottom, in-depth analysis of the time-consuming secrets of TCP connections"
"Unknown Network Programming (12): Thoroughly understand the KeepAlive keepalive mechanism of the TCP protocol layer"
"The Unknown Network Programming (13): Go Deep into the Operating System and Thoroughly Understand the 127.0.0.1 Native Network Communication" (* This article)
3. For comparison, let’s take a look at cross-machine network communication
Before describing the local communication process, let's first look at cross-machine network communication (taking the implementation in the Linux kernel as the example).
3.1 Cross-machine data transmission
From the send system call until the network card sends the data, the overall process is as follows:
In the picture above, we see that the user data is copied into the kernel, processed by the protocol stack, and then placed into the RingBuffer, from which the network card driver actually sends the data out. When transmission completes, the CPU is notified through a hard interrupt so the RingBuffer can be cleaned up.
However, the above picture does not show the kernel components and source code well. Let's look at it again from the perspective of the code.
When the network card finishes sending, it raises a hard interrupt to notify the CPU. In the handling of this hard interrupt, the memory occupied in the RingBuffer is freed.
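From the application's point of view, this entire path is kicked off by a single send() call on a connected socket. Here is a minimal userspace sketch (the 127.0.0.1 address and port 8888 are illustrative assumptions, and a listener must already be running there):
//sketch: the send() call that enters the kernel path described above
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8888); // illustrative port
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        return 1;
    }
    // This one call traverses: syscall -> protocol stack -> driver
    // RingBuffer -> NIC, with a hard interrupt on completion
    send(fd, "hello", 5, 0);
    close(fd);
    return 0;
}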
3.2 Cross-machine data reception
When the data packet arrives at the other machine, the Linux packet receiving process begins (for a more detailed explanation, please refer to "Go Deep into the Operating System and Understand the Process of Receiving Network Packets from the Kernel (Linux)").
▲ The above picture is quoted from "Go Deep into the Operating System and Understand the Process of Receiving Network Packets from the Kernel (Linux)"
When the network card receives data, it raises a hard interrupt to notify the CPU. On receiving the interrupt request, the CPU calls the interrupt handler registered by the network driver, which triggers a soft interrupt. ksoftirqd detects the soft interrupt request and starts polling to receive packets, handing each one up through the protocol stack layer by layer. After the protocol stack has processed the data and placed it in the receive queue, the user process is woken up (assuming it is in blocking mode).
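On the receiving side, all of this machinery hides behind an ordinary blocking read. A minimal sketch of such a receiver, assuming the same illustrative port 8888 as above:
//sketch: a blocking receiver; the process sleeps in recv() until the
//protocol stack has queued data and woken it up, as described above
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    char buf[128];

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8888); // illustrative port

    bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(lfd, 128);

    int cfd = accept(lfd, NULL, NULL);          // blocks until a client connects
    ssize_t n = recv(cfd, buf, sizeof(buf), 0); // blocks until data is queued
    if (n > 0)
        printf("received %zd bytes\n", n);
    close(cfd);
    close(lfd);
    return 0;
}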
Let's look at it again from the perspective of kernel components and source code.
3.3 Summary of cross-machine network communication
About the understanding of cross-machine network communication, you can use the following picture to summarize:
4. The sending process of local network data
In the previous section, we saw the entire network data sending process when cross-machine.
In local network IO the process differs in a few places. To keep the focus on what matters, this section will not walk through the overall flow again, but only the differences from the cross-machine logic. There are two: routing and the "driver".
4.1 Network layer routing
When sending data, after entering the protocol stack and reaching the network layer, the network layer entry function is ip_queue_xmit. Route selection is performed in the network layer; once it completes, the IP header is filled in, some netfilter filtering is done, and the packet is handed to the neighbor subsystem.
For local network IO, the special feature is that a matching entry can be found in the local routing table, and the corresponding device is the loopback interface, our familiar lo.
Let's look at the routing-related work in the network layer in detail, starting from the entry function ip_queue_xmit.
//file: net/ipv4/ip_output.c
int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl)
{
    // Check whether a routing entry is cached in the socket
    rt = (struct rtable *)__sk_dst_check(sk, 0);
    if (rt == NULL) {
        // No cache: widen the search,
        // find a routing entry and cache it in the socket
        rt = ip_route_output_ports(...);
        sk_setup_caps(sk, &rt->dst);
    }
    ...
}
The function that looks up the routing entry is ip_route_output_ports, which in turn calls ip_route_output_flow, __ip_route_output_key and fib_lookup. Skipping the intermediate calls, let's look directly at the key code of fib_lookup.
//file: include/net/ip_fib.h
static inline int fib_lookup(struct net *net, const struct flowi4 *flp, struct fib_result *res)
{
    struct fib_table *table;

    table = fib_get_table(net, RT_TABLE_LOCAL);
    if (!fib_table_lookup(table, flp, res, FIB_LOOKUP_NOREF))
        return 0;

    table = fib_get_table(net, RT_TABLE_MAIN);
    if (!fib_table_lookup(table, flp, res, FIB_LOOKUP_NOREF))
        return 0;

    return -ENETUNREACH;
}
In fib_lookup, both the local and main routing tables are queried, local first and then main. On Linux we can view these two tables with the ip command; here we only look at the local routing table (because local network IO queries this table and then stops).
ip route list table local
local 10.143.x.y dev eth0 proto kernel scope host src 10.143.x.y
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
From the output above, the route whose destination is 127.0.0.1 is found in the local routing table. fib_lookup's work is done; we return to __ip_route_output_key and continue.
//file: net/ipv4/route.c
struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4)
{
    if (fib_lookup(net, fl4, &res)) {
        ...
    }
    if (res.type == RTN_LOCAL) {
        dev_out = net->loopback_dev;
        ...
    }
    rth = __mkroute_output(&res, fl4, orig_oif, dev_out, flags);
    return rth;
}
For local network requests, the device is always net->loopback_dev, that is, the lo virtual interface.
The rest of the network layer proceeds exactly as for cross-machine IO: the packet eventually passes through ip_finish_output and finally reaches the neighbor subsystem's entry function dst_neigh_output.
Does local network IO need IP fragmentation? Since it passes through ip_finish_output just like normal network layer processing, fragmentation is still performed there if the skb is larger than the MTU. It is just that the MTU of lo is much larger than Ethernet's: as ifconfig shows, a normal network card generally has an MTU of 1500, while the lo virtual interface can have 65535.
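If you want to check the MTU programmatically rather than with ifconfig, the standard SIOCGIFMTU ioctl works. A minimal sketch (the interface name "eth0" is an assumption; substitute the name of your NIC):
//sketch: query interface MTUs via the SIOCGIFMTU ioctl
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

static int mtu_of(const char *ifname)
{
    struct ifreq ifr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    if (ioctl(fd, SIOCGIFMTU, &ifr) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return ifr.ifr_mtu;
}

int main(void)
{
    printf("lo   MTU: %d\n", mtu_of("lo"));   // typically 65536 on recent kernels
    printf("eth0 MTU: %d\n", mtu_of("eth0")); // typically 1500; name is an assumption
    return 0;
}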
After processing in the neighbor subsystem function, it enters the network device subsystem (the entry function is dev_queue_xmit).
4.2 Network equipment subsystem
The entry function of the network device subsystem is dev_queue_xmit. Recall from the cross-machine sending process that, for physical devices, which really do have queues, this function performs a series of complicated queuing operations and then calls dev_hard_start_xmit, from which the driver takes over the actual sending.
In that process, the sending may even be deferred to a soft interrupt, as shown in the figure:
But for a loopback device in the UP state (for which q->enqueue evaluates to false), things are much simpler: there is no queue to deal with, and execution goes straight into dev_hard_start_xmit, then into the send callback loopback_xmit in the loopback device's "driver", which "sends" the skb.
Let's look at the detailed process, starting from the dev_queue_xmit entry of the network device subsystem.
//file: net/core/dev.c
int dev_queue_xmit(struct sk_buff *skb)
{
    q = rcu_dereference_bh(txq->qdisc);
    if (q->enqueue) { // false for the loopback device
        rc = __dev_xmit_skb(skb, q, dev, txq);
        goto out;
    }
    // Loopback device processing starts here
    if (dev->flags & IFF_UP) {
        dev_hard_start_xmit(skb, dev, txq, ...);
        ...
    }
}
In dev_hard_start_xmit, the operation function of the device driver will still be called.
//file: net/core/dev.c
int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev, struct netdev_queue *txq)
{
    // Get the device driver's set of callback functions, ops
    const struct net_device_ops *ops = dev->netdev_ops;
    // Call the driver's ndo_start_xmit to send
    rc = ops->ndo_start_xmit(skb, dev);
    ...
}
4.3 "Driver" program
For a real igb network card, the driver code lives in drivers/net/ethernet/intel/igb/igb_main.c. Following that path, I found the "driver" code of the loopback device: drivers/net/loopback.c.
In drivers/net/loopback.c:
//file: drivers/net/loopback.c
static const struct net_device_ops loopback_ops = {
    .ndo_init = loopback_dev_init,
    .ndo_start_xmit = loopback_xmit,
    .ndo_get_stats64 = loopback_get_stats64,
};
So the call to dev_hard_start_xmit actually executes the loopback_xmit in the loopback "driver".
Why do I put "driver" in quotation marks? Because loopback is a purely software virtual interface; there is no real driver. Its workflow is roughly as shown in the figure.
Let's look at the detailed code again.
//file: drivers/net/loopback.c
static netdev_tx_t loopback_xmit(struct sk_buff *skb, struct net_device *dev)
{
    // Strip off the association with the original socket
    skb_orphan(skb);
    // Call netif_rx
    if (likely(netif_rx(skb) == NET_RX_SUCCESS)) {
        ...
    }
}
In skb_orphan, the socket pointer on the skb is removed (stripped off) first.
Note: in the local network IO send process, the skbs below the transport layer do not need to be freed; they are handed directly to the receiver, which saves a little overhead. Unfortunately, the transport layer's skbs cannot be spared in the same way; they still have to be allocated and freed frequently.
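For reference, skb_orphan itself is only a few lines. A simplified sketch of the helper in include/linux/skbuff.h (field details vary slightly across kernel versions):
static inline void skb_orphan(struct sk_buff *skb)
{
    // Run the skb's destructor (the socket's memory accounting callback), if any
    if (skb->destructor)
        skb->destructor(skb);
    // Detach the skb from the sending socket
    skb->destructor = NULL;
    skb->sk = NULL;
}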
Then netif_rx is called; inside it, execution eventually reaches enqueue_to_backlog (netif_rx -> netif_rx_internal -> enqueue_to_backlog).
//file: net/core/dev.c
static int enqueue_to_backlog(struct sk_buff *skb, int cpu, unsigned int *qtail)
{
    sd = &per_cpu(softnet_data, cpu);
    ...
    __skb_queue_tail(&sd->input_pkt_queue, skb);
    ...
    ____napi_schedule(sd, &sd->backlog);
    ...
}
In enqueue_to_backlog, the skb being sent is inserted into the softnet_data->input_pkt_queue queue, and ____napi_schedule is called to trigger a soft interrupt.
//file: net/core/dev.c
static inline void ____napi_schedule(struct softnet_data *sd, struct napi_struct *napi)
{
    list_add_tail(&napi->poll_list, &sd->poll_list);
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);
}
Once the soft interrupt has been raised, the sending process is complete.
5. The receiving process of local network data
5.1 The main process
In the cross-machine packet receiving process, a hard interrupt has to fire before the soft interrupt can be triggered.
In local network IO, since no network card actually transmits anything, the hard interrupt is skipped: processing starts directly from the soft interrupt, and after process_backlog the packet is handed to the protocol stack. The general process is shown in the figure below.
5.2 Detailed process
Next, we will look at the process in more detail.
After the soft interrupt is triggered, execution enters the handler corresponding to NET_RX_SOFTIRQ, net_rx_action (for details, see section 4.2 of "Go Deep into the Operating System and Understand the Process of Receiving Network Packets from the Kernel (Linux)").
//file: net/core/dev.c
static void net_rx_action(struct softirq_action *h)
{
    while (!list_empty(&sd->poll_list)) {
        work = n->poll(n, weight);
    }
}
We still remember that for the igb network card, poll actually calls the igb_poll function.
So what is the poll function of the loopback device? What was added to poll_list above is sd->backlog, the napi_struct embedded in struct softnet_data, so we find the clue in net_dev_init.
//file: net/core/dev.c
static int __init net_dev_init(void)
{
    for_each_possible_cpu(i) {
        sd->backlog.poll = process_backlog;
    }
}
It turns out that during initialization, the poll of each struct softnet_data's backlog is set to the process_backlog function. Let's see what it does.
//file: net/core/dev.c
static int process_backlog(struct napi_struct *napi, int quota)
{
    while (...) {
        while ((skb = __skb_dequeue(&sd->process_queue))) {
            __netif_receive_skb(skb);
        }
        // skb_queue_splice_tail_init() splices list a onto the tail of
        // list b, forming a new list b and leaving a as an empty list
        qlen = skb_queue_len(&sd->input_pkt_queue);
        if (qlen)
            skb_queue_splice_tail_init(&sd->input_pkt_queue,
                                       &sd->process_queue);
    }
}
Let's look at the call to skb_queue_splice_tail_init first. Rather than listing its source code, I'll simply state what it does: it splices the skbs on sd->input_pkt_queue onto the sd->process_queue list.
Then look at __skb_dequeue: it takes packets off sd->process_queue for processing. This matches up with the end of the sending process above: sending put the packet onto input_pkt_queue, and receiving takes the skb back out of that queue.
Finally, __netif_receive_skb is called to hand the skb (the data) to the protocol stack. From this point on, the calling process is the same as for cross-machine network IO.
The call chain into the protocol stack is __netif_receive_skb => __netif_receive_skb_core => deliver_skb, after which the packet is delivered to ip_rcv (for details, see section 4.3 of "Go Deep into the Operating System and Understand the Process of Receiving Network Packets from the Kernel (Linux)").
After the network layer comes the transport layer, and finally the user process is woken up; I won't expand on that here.
6. Summary of the local network communication process
Let's summarize the kernel execution process of local network communication:
Recall that the process of cross-machine network IO is:
Okay, back to the topic: we can finally answer the three opening questions in their own chapter.
7. Answers to the opening three questions
1) Question 1: Does 127.0.0.1 local network IO need to go through a network card?
Based on this article's walkthrough, we can conclude with certainty that it does not need to go through a network card: even if the network cable is unplugged, local network communication still works normally.
2) Question 2: What is the direction of the data packet in the kernel, and what is the difference in the process compared to the external network transmission?
Overall, compared with cross-machine IO, local network IO does save some overhead. Sending data does not need to enter the driver's RingBuffer queue; the skb is passed directly to the receiving protocol stack (via a soft interrupt).
But nothing is skipped in the other kernel components: system calls, the protocol stack (transport layer, network layer, and so on), the network device subsystem and the neighbor subsystem are all traversed in full. Even the "driver" still runs (although for the loopback device it is purely a software construct). So even for local network IO, don't assume there is no overhead.
3) Question 3: Can 127.0.0.1 be faster than 192.168.x?
Let me state the conclusion first: I believe there is essentially no performance difference between the two.
A considerable number of people assume that using 127.0.0.1 to reach a local server is faster, the intuition being that accessing another of the machine's IPs would go through the network card.
In fact, the kernel knows all the IPs on the machine; as soon as it finds that the destination address is one of the machine's own IPs, it routes the packet over the loopback device. The machine's other local IPs behave just like 127.0.0.1: the physical network card is not involved, so the cost of accessing them is essentially the same!
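If you want to verify this yourself, below is a rough benchmark sketch. It times a burst of UDP sends to 127.0.0.1 and to an address you pass on the command line (assumed to be one of this machine's own IPs); port 9999 is illustrative, and no receiver is needed since only the send side is measured. You should see essentially the same cost for both destinations, since both go over lo:
//rough sketch: compare UDP send cost to 127.0.0.1 vs another local IP
//usage: ./udpbench <local-ip>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

static long long bench_ns(const char *ip, int n)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in a;
    struct timespec t0, t1;

    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_port = htons(9999); // illustrative port
    inet_pton(AF_INET, ip, &a.sin_addr);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n; i++)
        sendto(fd, "x", 1, 0, (struct sockaddr *)&a, sizeof(a));
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return (t1.tv_sec - t0.tv_sec) * 1000000000LL + (t1.tv_nsec - t0.tv_nsec);
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <local-ip>\n", argv[0]);
        return 1;
    }
    int n = 100000;
    printf("127.0.0.1: %lld ns for %d sends\n", bench_ns("127.0.0.1", n), n);
    printf("%s: %lld ns for %d sends\n", argv[1], bench_ns(argv[1], n), n);
    return 0;
}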
Appendix: More Network Programming Articles
If you find this series too advanced, you can read the "Introduction to Network Programming for Lazy People" series instead. Its catalog is as follows:
"Introduction to Network Programming for Lazy People (1): Quickly Understand Network Communication Protocol (Part 1)"
"Lazy Introduction to Network Programming (2): Quickly Understand Network Communication Protocol (Part 2)"
"Lazy Introduction to Network Programming (3): A quick understanding of TCP protocol is enough"
"Lazy Introduction to Network Programming (4): Quickly understand the difference between TCP and UDP"
"Lazy Introduction to Network Programming (5): Quickly understand why UDP is sometimes more advantageous than TCP"
"Introduction to Network Programming for Lazy People (6): Introduction to the Function Principles of the Most Popular Hubs, Switches, and Routers in History"
"Lazy Introduction to Network Programming (7): Explain the basics and understand the HTTP protocol comprehensively"
"Introduction to Network Programming for Lazy People (8): Teach you to write TCP-based Socket long connections"
"Introduction to Network Programming for Lazy People (9): A popular explanation, with an IP address, why use a MAC address? 》
"Introduction to Network Programming for Lazy People (10): A time to soak in the urine to quickly understand the QUIC protocol"
"Introduction to Network Programming for Lazy People (11): One Article to Understand What Is IPv6"
"Lazy Introduction to Network Programming (12): Quickly understand the Http/3 protocol, one is enough! 》
The "Introduction to Brain Disabled Network Programming" on this site is also suitable for introductory learning. The outline of this series is as follows:
"Introduction to Brain Disabled Network Programming (1): Follow the animation to learn TCP three-way handshake and four waved hands"
"Introduction to Brain Disabled Network Programming (2): When we read and write Socket, what exactly are we reading and writing? 》
"Introduction to Brain Disabled Network Programming (3): Some Knowledge That Must Be Known and Known about HTTP Protocol"
"Introduction to Brain Disabled Network Programming (4): Quickly Understand HTTP/2 Server Push"
"Introduction to Brain Disabled Network Programming (5): What is the Ping command that I use every day? 》
"Introduction to Brain Disabled Network Programming (6): What are public IP and intranet IP? What the hell is NAT? 》
"Introduction to Brain Disabled Network Programming (7): Necessary for face-to-face viewing, the most popular computer network hierarchical explanation in history"
"Introduction to Brain Disabled Network Programming (8): Do you really understand the difference between 127.0.0.1 and 0.0.0.0? 》
"Introduction to Brain Disabled Network Programming (9): Interview Compulsory Exam, the most popular and small endian byte order in history"
The following chapters, from the classic "TCP/IP Illustrated", are must-reads for beginners:
"TCP/IP Illustrated - Chapter 11: UDP: User Datagram Protocol"
"TCP/IP Illustrated - Chapter 17: TCP: Transmission Control Protocol"
"TCP/IP Illustrated - Chapter 18: TCP Connection Establishment and Termination"
"TCP/IP Illustrated - Chapter 21: TCP Timeout and Retransmission"
The following series are suitable for server-side network programming developers to read:
"High-performance network programming (1): How many concurrent TCP connections can a single server have"
"High-performance network programming (2): The famous C10K concurrent connection problem in the last 10 years"
"High-Performance Network Programming (3): In the next 10 years, it's time to consider C10M concurrency"
"High-performance network programming (4): Theoretical exploration of high-performance network applications from C10K to C10M"
"High-Performance Network Programming (5): Reading the I/O Model in High-Performance Network Programming in One Article"
"High-Performance Network Programming (6): Understanding the Thread Model in High-Performance Network Programming in One Article"
"High-performance network programming (7): What is high concurrency? Understand in one article! 》
"Understanding high performance and high concurrency from the root (1): Going deep into the bottom of the computer, understanding threads and thread pools"
"Understanding high performance and high concurrency from the root (2): In-depth operating system, understanding I/O and zero copy technology"
"Understanding high performance and high concurrency from the root (3): In-depth operating system, thorough understanding of I/O multiplexing"
"Understanding high performance and high concurrency from the root (4): In-depth operating system, thorough understanding of synchronization and asynchrony"
"Understanding high performance and high concurrency from the root (5): Going deep into the operating system and understanding the coroutines in high concurrency"
"Understanding high performance and high concurrency from the roots (6): easy to understand, how high-performance servers are implemented in the end"
"Understanding high performance and high concurrency from the root (7): Going deep into the operating system, and understanding processes, threads, and coroutines in one article"
The following series are suitable for mobile senior network communication developers to read:
"Introduction to Zero-Basic Communication Technology for IM Developers (1): A Century of Development History of Communication Exchange Technology (Part 1)"
"Introduction to Zero-Basic Communication Technology for IM Developers (2): A Century of Development History of Communication Exchange Technology (Part 2)"
"Introduction to Zero-Basic Communication Technology for IM Developers (3): The Hundred Years of Changes in Chinese Communication Methods"
"Introduction to zero-based communication technology for IM developers (4): The evolution of mobile phones, the most complete mobile terminal development history in history"
"Introduction to zero-based communication technology for IM developers (5): 1G to 5G, 30-year history of mobile communication technology evolution"
"Introduction to zero-based communication technology for IM developers (6): Mobile terminal connector-"base station" technology"
"Introduction to Zero-Basic Communication Technology for IM Developers (7): The Maxima of Mobile Terminals-"Electromagnetic Waves""
"Introduction to zero-based communication technology for IM developers (8): Zero-based, the strongest "antenna" principle literacy in history"
"Introduction to Zero-Basic Communication Technology for IM Developers (9): The Hub of Wireless Communication Network-"Core Network""
"Introduction to zero-based communication technology for IM developers (10): Zero-based, the strongest 5G technology literacy in history"
"Introduction to Zero-Basic Communication Technology for IM Developers (11): Why is the WiFi signal poor? Understand in one article! 》
"Introduction to Zero-Basic Communication Technology for IM Developers (12): Is the Internet stuck? Internet disconected? Understand in one article! 》
"Introduction to Zero-Basic Communication Technology for IM Developers (13): Why is the mobile phone signal poor? Understand in one article! 》
"Introduction to Zero-Basic Communication Technology for IM Developers (14): How difficult is it to surf the Internet on high-speed rail? Understand in one article! 》
"Introduction to Zero-Basic Communication Technology for IM Developers (15): Understanding Positioning Technology, One Enough"
This article has been simultaneously published on the official account of "Instant Messaging Technology Circle".
▲ The synchronous publishing link for this article is: http://www.52im.net/thread-3600-1-1.html