Reading Notes: "How the Internet is Connected"

Hello everyone, I'm Xiao Cai~

Handsome guys and beauties, I know your time is precious, so Xiao Cai will read good books for you, pick out the essence, and share it with you~!

This article mainly shares "How the Internet is Connected" (《网络是怎样连接的》)
If necessary, you can refer to
If it helps, don't forget to like it❥
The WeChat public account "Cai Nong Said" is now open; if you haven't followed it yet, remember to follow!

What I bring today is the reading notes of "How the Internet is Connected"

(The illustrations used in this article are all from the original book)

Before officially getting into the notes, let's take a look at the table of contents of this book

(Figure: outline of "How the Internet is Connected")

This book has 6 chapters and 156,482 words. It is not long, the content is interesting, and it gives you a working understanding of the complex world of network communication.

Book Route:

Chapter 1: Browser Generated Messages

1) Generate HTTP request

The www in https://www.baidu.com is just a name given to the web server. WWW (World Wide Web) is not the name of a protocol; it was the name of the browser and HTML editor first developed by the person who proposed the web.
Network applications such as browsers do not actually control the network themselves; they delegate that work to the operating system.

What is a URL? Strictly speaking, what we call a web address is a URL, which in daily life is usually a string of characters starting with http:.

Several URLs commonly used in our lives:

Although URLs are written in various ways, the beginning part determines how the rest is written. In other words, the beginning names the protocol in use, so there is no confusion in practice.

1. Parse the URL

When we enter a URL, the browser first parses it and then generates the request message to send to the web server. We do not notice this step in everyday use because the browser does everything for us; we only care about the response.

The parsing process includes the following steps:

We first identify http, which means we need to access a web server. We then continue splitting the rest of the URL into the server name, directory name, and file name. At this point we know that the user wants to access the file1.html file in the dir1 directory.
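The splitting described above can be sketched with Python's standard library. This is only an illustration of the idea, not how any particular browser does it; the helper name split_url is made up for this example.

```python
from urllib.parse import urlsplit
import posixpath

def split_url(url):
    """Split a URL into protocol, server name, directory, and file name."""
    parts = urlsplit(url)
    # The path is everything after the server name, e.g. "/dir1/file1.html".
    directory, filename = posixpath.split(parts.path)
    return parts.scheme, parts.hostname, directory, filename

print(split_url("http://www.lab.glasscom.com/dir1/file1.html"))
```

When the file name is omitted (for example a URL ending in /dir/), the last element comes back as an empty string, and it is then up to the server to pick its default file.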

2. Omit the filename

The URL above, http://www.lab.glasscom.com/dir1/file1.html, clearly states that we are accessing the file1.html file. Sometimes, however, we see special URLs like the following:

http://www.lab.glasscom.com/dir/ : the file name is omitted, but this does not mean we cannot access a file. In this case the server is usually configured in advance with a default file name to use, such as index.html or default.html.
http://www.lab.glasscom.com/ : the URL ends with / , which means the root directory is being accessed. Since the directory is known, the file to access is determined as in case 1, using the default file name.
http://www.lab.glasscom.com : even the / is omitted. This means accessing the default file configured in advance for the root directory (in the early days this file was called the home page).
http://www.lab.glasscom.com/dir : there is no / at the end of the URL. If a file named dir exists on the web server, dir is treated as a file name; if a directory named dir exists, it is treated as a directory name.

3. The basic idea of processing

First, the client sends a request message to the server, which consists of two parts: "to what" and "do what"

To what : refers to the URL
Do what : refers to the method

The URL needs little explanation: it is the request address sent to the server. The methods will be familiar to anyone who has debugged an API. The common ones are:

GET : Usually used to get information.
POST : Usually used to add new data.
PUT : Usually used to update data.
DELETE : Usually used to delete data.
HEAD : Basically the same as GET. However, it only returns the HTTP header, not the content of the data.
TRACE : Return the request line and header received by the server directly to the client
OPTIONS : used to notify or query communication options
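To make the "to what" and "do what" parts concrete, here is a minimal sketch of how a request message could be assembled as text. The function name build_request and the choice of headers are assumptions made for illustration; real browsers send many more header fields.

```python
def build_request(method, host, path="/"):
    """Compose a minimal HTTP/1.1 request message as text."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: the method and the target
        f"Host: {host}",              # server name taken from the URL
        "Connection: close",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)

print(build_request("GET", "www.lab.glasscom.com", "/dir1/file1.html"))
```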

4. Response processing

After the request message is sent, the web server returns a response message. The first line of the response contains the status code and the response phrase, which indicate whether the request succeeded or failed. The status code and the response phrase convey the same information but serve different purposes.

The status code is a number that tells the program the result of execution, and the response phrase is a piece of text that tells a person the result of execution

Status code summary :

1xx : Reports the processing progress and status of the request
2xx : The request was successful
3xx : Indicates that further operations are required
4xx : Client error
5xx : Server error
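The summary above maps directly onto the first digit of the status code. A small sketch, with the names CLASSES and parse_status_line invented for this example:

```python
# First digit of the status code -> meaning, following the summary above.
CLASSES = {
    1: "processing progress",
    2: "success",
    3: "further operation required",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Split the first line of a response into code, phrase, and class."""
    _version, code, phrase = line.split(" ", 2)
    code = int(code)
    return code, phrase, CLASSES.get(code // 100, "unknown")

print(parse_status_line("HTTP/1.1 404 Not Found"))
```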

2) Query the DNS server for the IP address of the web server

Before the client sends a request to the server, there is one more job to do: look up the IP address corresponding to the server's domain name in the URL.

This is because the browser itself cannot send messages to the network and has to delegate that operation to the operating system.

And the condition for delegating communication to the operating system is that we provide the IP address of the communication peer, not the domain name.

1. What is an IP address

1. Introduction to TCP/IP

To understand what an IP address is, we need to understand TCP/IP.

TCP/IP structure diagram

This is a diagram of the structure of a TCP/IP network. A TCP/IP network is a large network formed by connecting small subnets (several computers connected by a hub) with routers.

Every device in the network is assigned an address (comparable to a real-world address such as "Number XX, Room XX"). The number corresponds to the value assigned to the whole subnet (the network number), and the room corresponds to the value assigned to a computer within the subnet (the host number). Together they form the address in the network, which is called the IP address.

The process of sending a message is then:

A message from the sender first passes through the hub in its subnet and is forwarded to the router closest to the sender. That router determines the next router based on the destination of the message and sends the message on. After being forwarded multiple times in this way, the message arrives at its final destination.

2. IP address

An IP address is actually a 32-bit number, divided into 4 groups of 8 bits (1 byte) each; each group is written in decimal and the groups are separated by dots.

An IP address is composed of a network number and a host number, but from the number alone we cannot tell which part is the network number and which is the host number, so we also need the help of the subnet mask.

Subnet mask : a 32-bit number, the same length as the IP address, whose left part is all 1s and whose right part is all 0s. The bits where the mask is 1 mark the network number, and the bits where it is 0 mark the host number. The subnet mask thus marks the boundary between the network number and the host number.

When the bits of the host number are all 0 or all 1, the address has a special meaning:

All 0s : indicates the entire subnet
All 1s : indicates that the packet is sent to all devices on the subnet, that is, broadcast
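The relationship between an IP address, its subnet mask, and the two special host numbers can be checked with Python's ipaddress module. The address 10.11.12.13 and the helper name describe are arbitrary choices for this sketch.

```python
import ipaddress

def describe(address, mask):
    """Return (subnet address, host number, broadcast address)."""
    iface = ipaddress.ip_interface(f"{address}/{mask}")
    net = iface.network
    # Bits where the mask is 0 form the host number.
    host_number = int(iface.ip) & int(net.hostmask)
    return str(net.network_address), host_number, str(net.broadcast_address)

# Host bits all 0 -> the subnet itself; host bits all 1 -> broadcast.
print(describe("10.11.12.13", "255.255.255.0"))
```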

2. The emergence of domain names

To reach the final destination, we need to know its IP address. But an IP address is just a string of numbers, and we deal with so many websites every day that remembering them is impractical. Domain names exist to make this simpler.

To bridge the gap between the two, there needs to be a mechanism to look up IP addresses by name, or names by IP address. This mechanism is DNS.

3. Socket library

To send a request, the browser has to delegate to the operating system, and to do that it must tell the operating system the destination IP address. So the browser first needs to look up the IP address for the domain name, yet the browser itself cannot send requests. Isn't this an infinite loop?

In fact, looking up an IP address through DNS is called domain name resolution, and the program responsible for it is called the resolver.

The resolver is just a program, and it is included in the operating system's Socket library. The Socket library is a collection of general-purpose program components that let applications call the network functions of the operating system, and the resolver is one of those components.

4. Resolver

When control passes to the resolver, it generates a query message to send to the DNS server. Again, sending the message is not done by the resolver itself but is delegated to the protocol stack inside the operating system (the resolver itself cannot send or receive over the network).

To send a message to the DNS server we also need to know the DNS server's IP address, but this address is configured in advance, for example in the network settings on Windows.
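From an application's point of view, calling the resolver is a single library call. A minimal sketch using Python's socket module, whose getaddrinfo hands the lookup to the operating system; the wrapper name resolve is made up here, and the result for a real domain depends on your network.

```python
import socket

def resolve(name):
    """Ask the operating system's resolver for the IPv4 addresses of a name."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry ends with a (address, port) pair; keep the unique addresses.
    return sorted({info[4][0] for info in infos})

# A numeric address is returned as-is, without any DNS query.
print(resolve("127.0.0.1"))
```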

3) The great relay of DNS servers around the world

1. Domain name query

A query sent by the client to a DNS server usually contains the following 3 pieces of information:

Domain name : the name being queried
Class : identifies the kind of network; it is always IN, which stands for the Internet
Record Type : Indicates which type of record the domain name corresponds to. For example, when the type is A, it means the IP address corresponding to the domain name, and when it is MX, it means the corresponding mail server. For different record types, the information returned by the server to the client will be different.
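Here is a rough sketch of how those three pieces of information are laid out in a query message, following the standard DNS wire format (a 12-byte header followed by the question). The function name build_query and the fixed query ID are assumptions for illustration only.

```python
import struct

TYPE_A, TYPE_MX, CLASS_IN = 1, 15, 1

def build_query(name, record_type=TYPE_A, query_id=0x1234):
    """Build a DNS query message: header + (domain name, type, class)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The domain name is written as length-prefixed labels ending with 0.
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", record_type, CLASS_IN)
    return header + question

print(build_query("www.lab.glasscom.com").hex())
```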

2. The hierarchy of domain names

There are countless servers on the Internet, and it is impossible to store the information for all of them on one DNS server. The information is therefore distributed across multiple DNS servers, which cooperate to find the information being queried.

Take www.life.cbuc.com as an example: the further to the right a part is, the higher its level. The domain name roughly means "www of the life group of the cbuc department of the com group". This hierarchical domain information is registered in DNS servers, and each domain is handled as a whole; that is, one domain cannot be split and stored across multiple DNS servers.

A domain is indivisible, but we can create subdomains under it and assign them to individual business groups. For example, under life.cbuc.com we can create two subdomains: a1.life.cbuc.com and a2.life.cbuc.com

The resulting structure looks somewhat like a tree.

The IP addresses of the DNS servers responsible for managing lower-level domains are registered with their upper-level DNS servers, and the IP addresses of those upper-level DNS servers are in turn registered with the DNS servers above them, and so on. That is, the IP address of the DNS server responsible for life.cbuc.com needs to be registered with the DNS server of the cbuc.com domain, and the IP address of the DNS server of the cbuc.com domain needs to be registered with the DNS server of the com domain. In this way the IP address of a lower-level DNS server can be looked up through the upper-level DNS server, and the query request can then be sent to the lower-level server.

3. The existence of the root domain

After the explanation above, you might think that domains such as com and cn sit at the very top of the hierarchy, but that is wrong.

Above them there is also the root domain, which is generally omitted in writing. To indicate the root domain explicitly, add a dot at the end of the domain name, as in www.baidu.com. with its trailing dot. Although it is usually not written, the root domain does exist. The DNS servers of the root domain hold the information of the DNS servers for com, cn, and so on, so when resolving a domain name we can find the DNS server of any domain by starting from the root domain and working all the way down.
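The walk from the root domain downwards can be imitated with a toy table. Everything in SERVERS below (the server names and the address 192.0.2.10) is invented for this sketch; it only mirrors the registration chain described above and sends no real DNS traffic.

```python
# Hypothetical data: each server knows only the next level below it.
SERVERS = {
    "root": {"com": "com-server"},
    "com-server": {"cbuc.com": "cbuc-server"},
    "cbuc-server": {"life.cbuc.com": "life-server"},
    "life-server": {"www.life.cbuc.com": "192.0.2.10"},
}

def resolve_from_root(name):
    """Follow the chain root -> com -> cbuc.com -> ... and return the answer."""
    labels = name.rstrip(".").split(".")
    server, visited = "root", ["root"]
    # Work from the right-most (highest-level) label to the left.
    for i in range(len(labels) - 1, -1, -1):
        answer = SERVERS[server][".".join(labels[i:])]
        if i == 0:
            return answer, visited
        server = answer
        visited.append(server)

print(resolve_from_root("www.life.cbuc.com."))
```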

4. Cache to speed up the response

Sometimes it is not necessary to start the search from the root domain at the top, because DNS servers have a cache that remembers domain names queried before. If the domain name and related information being queried are already in the cache, the server can return the response directly.

Similarly, when it is queried that the domain name does not exist, the response result of "does not exist" will also be cached.

All caches have an expiration date. When the information in the cache exceeds the expiration date, the data will be deleted from the cache. Also, when responding to a query, the DNS server also informs the client whether the result of the response came from the cache or from the DNS server responsible for managing the domain name.
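A cache with an expiration date, including cached "does not exist" answers, might look roughly like this. The class DnsCache is a made-up illustration, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
class DnsCache:
    """Toy cache: remembers answers (or 'does not exist') until they expire."""

    def __init__(self):
        self._entries = {}  # name -> (answer or None, expiry time)

    def put(self, name, answer, ttl, now):
        # answer=None records that the name does not exist.
        self._entries[name] = (answer, now + ttl)

    def get(self, name, now):
        """Return (hit, answer); expired entries are deleted."""
        entry = self._entries.get(name)
        if entry is None:
            return False, None
        answer, expires = entry
        if now >= expires:
            del self._entries[name]
            return False, None
        return True, answer

cache = DnsCache()
cache.put("www.life.cbuc.com", "192.0.2.10", ttl=60, now=0)
print(cache.get("www.life.cbuc.com", now=10))
print(cache.get("www.life.cbuc.com", now=60))
```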

4) Delegate the protocol stack to send the message

After obtaining the IP address, the operating system can send and receive messages. The message is actually a kind of digital information . This operation is not limited to browsers, and is common to various applications using the network.

When delegating to the protocol stack inside the operating system, the components in the Socket library need to be called in the specified order.

Before sending and receiving data, the two parties need to establish a pipe between them. The key to the pipe is the data entrance and exit at each end, which are called sockets; connecting the two sockets forms the pipe. Data flows along this pipe (in both directions) and eventually reaches its destination.

When all the data has been sent, the pipe is disconnected. Connecting is initiated by the client, but disconnecting can be initiated by either party.

In summary, the general operation of sending and receiving data is as follows:

Create socket phase : create the socket
Connect phase : connect the pipe to the server-side socket
Communication phase : send and receive data
Disconnect phase : disconnect the pipe and delete the socket
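The four phases can be seen in a small self-contained sketch. To keep it runnable without an external server, it starts a throwaway echo server on 127.0.0.1 in a background thread; the names echo_once and talk are invented for this example.

```python
import socket
import threading

def echo_once(listener):
    """Tiny stand-in server: accept one connection and echo what it receives."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(conn.recv(1024))

def talk(message):
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
    port = listener.getsockname()[1]
    worker = threading.Thread(target=echo_once, args=(listener,))
    worker.start()

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # 1. create socket
    sock.connect(("127.0.0.1", port))                         # 2. connect
    sock.sendall(message)                                     # 3. send ...
    reply = sock.recv(1024)                                   #    ... and receive
    sock.close()                                              # 4. disconnect

    worker.join()
    listener.close()
    return reply

print(talk(b"hello"))
```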

Chapter 2: Transmission of TCP/IP Data with Electrical Signals

1) Create a socket

1. The internal structure of the protocol stack

The structure in the figure is also hierarchical: each upper layer delegates work to the layer below it.

The top part is the network application, that is, the browser, email client, web server, email server, etc., which delegate the work of sending and receiving data to the lower parts.
The lower layer of the application is the Socket library, which includes the resolver, which is used to make queries to the DNS server.
Next is the internals of the operating system, including the protocol stack. The upper part of the protocol stack has two blocks
- The part responsible for sending and receiving data using the TCP protocol
- The part responsible for sending and receiving data using the UDP protocol

The lower half is the part that uses the IP protocol to control the sending and receiving of network packets. When data is transmitted over the Internet, it is divided into individual network packets, and sending those packets to the communication peer is handled by IP. IP also includes the ICMP protocol (used to report errors in packet transmission and to carry various control messages) and the ARP protocol (used to look up the Ethernet MAC address corresponding to an IP address)

The network card driver under the IP is responsible for controlling the network card hardware
The bottom network card is responsible for completing the actual sending and receiving operations, that is, sending and receiving signals in the network cable.

2. The concept of sockets

A socket is a concept rather than a physical entity. What it holds is control information such as the IP address and port number of the communication peer and the status of the communication operation, and the protocol stack consults this control information when performing operations.

Its function is to record various control information used to control communication operations, and the protocol stack needs to judge the next action based on this information

2) Connect to the server

After the socket is created, the application calls connect , and the protocol stack connects the local socket to the server's socket. The connection here refers to the operation process in which the two communicating parties exchange information.

1. The header that carries control information

Control information falls into two categories:

Control information exchanged between the client and the server as they communicate. It is needed not only when connecting but also when sending and receiving data and when disconnecting.
Information stored in the socket to control the operation of the protocol stack. Information passed in by the application, information received from the communication peer, and the execution status of send and receive operations are all saved here.

2. The actual process of connection

The connection starts when the application calls connect in the Socket library

 connect(<descriptor>, <server IP address and port number>, ...)

The above connection information will be passed to the TCP module, and then TCP will exchange control information with the communication object corresponding to the IP address. The general process is as follows:

Create a header representing connection control information at the TCP module
Find the socket to connect to via the TCP header
Pass the information to the IP module and delegate it to send

This stage is in fact the TCP three-way handshake
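As a rough picture of what "a header representing connection control information" contains, the sketch below packs the fixed 20-byte TCP header with the SYN flag set. It is a simplification: the checksum is left at 0, there are no options, and the function name and port numbers are made up for illustration.

```python
import struct

SYN, ACK = 0x02, 0x10  # control bits used during the three-way handshake

def build_tcp_header(src_port, dst_port, seq, ack=0, flags=SYN, window=65535):
    """Pack a 20-byte TCP header (checksum not computed in this sketch)."""
    offset_and_flags = (5 << 12) | flags  # header length: 5 x 32-bit words
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, window, 0, 0)

# The three headers of a handshake: SYN, SYN+ACK, ACK.
syn = build_tcp_header(50000, 80, seq=1000)
syn_ack = build_tcp_header(80, 50000, seq=3000, ack=1001, flags=SYN | ACK)
ack = build_tcp_header(50000, 80, seq=1001, ack=3001, flags=ACK)
print(len(syn), syn.hex())
```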

3) Send and receive data

1. Pass the message to the protocol stack

When control returns from connect to the application, the next step is the data sending and receiving phase.

The data sending and receiving operation starts when the application calls write to send the data to be sent to the protocol stack.

The protocol stack does not send data as soon as it receives it. It first stores the data in an internal send buffer and waits for the application's next piece of data. The reason is that how much data is handed over at a time is decided by the application, not by the protocol stack, so sending immediately every time could produce a large number of small packets. Whether to send right away depends on the following factors:

MTU : The maximum length of a network packet, usually 1500 bytes in Ethernet
MSS : After removing the header, the maximum length of TCP data that a network packet can hold
Time : When the application sends data infrequently, waiting every time until the buffered data approaches the MSS could delay sending for too long. In that case the data should be sent even if the buffer has not reached the MSS. For this purpose the protocol stack has an internal timer, and once a certain period has passed the network packet is sent.

2. Data splitting

When the data in the send buffer exceeds the MSS, it is split into pieces of MSS length, and each piece is placed in a separate network packet. When the protocol stack decides that a piece should be sent, it adds a TCP header in front of it, fills in the sender and receiver port numbers from the control information recorded in the socket, and hands it to the IP module for transmission.
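A small worked example of the splitting, assuming an Ethernet MTU of 1500 bytes and IP and TCP headers of 20 bytes each (that is, headers without options, which is an assumption of this sketch):

```python
MTU = 1500         # maximum packet length on Ethernet
IP_HEADER = 20     # assumed: IP header without options
TCP_HEADER = 20    # assumed: TCP header without options
MSS = MTU - IP_HEADER - TCP_HEADER  # 1460 bytes of TCP data per packet

def split_into_segments(data, mss=MSS):
    """Cut the contents of the send buffer into pieces of at most mss bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]

segments = split_into_segments(b"x" * 3000)
print([len(s) for s in segments])
```

With 3000 bytes in the buffer this gives two full pieces of 1460 bytes and a final piece of 80 bytes.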

4) Disconnect from the server and delete the socket

After communication with the server is over, the socket used for it is no longer needed and can be deleted. However, the socket is not deleted immediately; it is kept for a while first, in order to prevent misoperation.

The specific waiting time is related to the operation mode of packet retransmission. After the network packet is lost, it will be retransmitted. This operation usually lasts for a few minutes. If the retransmission is still invalid after a few minutes, the retransmission will be stopped.

1. Summary of sending and receiving operations

The first step in the data sending and receiving operation is to create a socket. Generally speaking, the application on the server side will create a socket and enter the state of waiting for a connection when it is started. The client generally creates a socket when the user triggers a specific action and needs to access the server.

After the socket is created, the client will initiate a connection operation to the server, which is the classic TCP three-way handshake operation

After the connection is established, it enters the data sending and receiving operation

5) IP and Ethernet packet sending and receiving operations

1. Basic knowledge of packets

A packet is composed of two parts, a header and data:

Header : contains control information such as the destination address, comparable to the shipping label on a parcel
Data : the content sent to the other party, comparable to the goods inside the parcel

There are two kinds of forwarding devices in the network, routers and hubs:

Router : determines the location of the next router based on the destination address
Hub : transmits network packets within the subnet to the next router

In fact, a hub is a device that transmits packets according to Ethernet rules , and a router is a device that transmits packets according to IP rules , so a conclusion can be drawn:

IP Protocol : Determine the location of the next IP forwarding device based on the target address
Ethernet protocol : transmits packets to the next forwarding device in the subnet

TCP/IP packets include two headers:

MAC header : for the Ethernet protocol
IP header : for the IP protocol

2. Packet sending and receiving operations

In fact, the work of transmitting packets from the sender to the receiver is done by network devices such as hubs and routers, so the IP module is only the entrance to the entire packet transmission process.

The IP module is responsible for adding two headers:

MAC Header : Header for Ethernet, containing the MAC address
IP header : The header for IP, including the IP address

Next, the assembled packet is handed to the network hardware (the network card), which converts the digital information into electrical or optical signals and sends them out over the network cable (or optical fiber). These signals then reach forwarding devices such as hubs and routers, which deliver the packet to the receiver step by step.

Regardless of whether the packet to be sent and received is a control packet or a data packet, IP's sending and receiving operations for various types of packets are the same.

3. Generate IP header

The IP module does not work out the recipient's IP address itself; it is specified by the application. Besides the recipient's IP address, the IP header also needs the sender's IP address. Strictly speaking, this is not the IP address of the computer but the IP address of a network card, because a computer may have multiple network cards.

4. Generate MAC header

The receiver's IP address in the IP header indicates the destination of the network packet, and from this address we can determine where to send the packet. In the Ethernet world, however, this TCP/IP idea does not work. Ethernet determines the destination of a network packet differently from TCP/IP, so a method that matches Ethernet must be used to deliver the packet within the Ethernet, and the MAC header is used for this purpose.

The MAC header starts with the receiver's and sender's MAC addresses. These play a role similar to the addresses in the IP header, except that an IP address is 32 bits long and a MAC address is 48 bits.

The sender's MAC address is easy to know, but the receiver's MAC address is more troublesome. We need to query the corresponding MAC address according to the IP address.

5. ARP query MAC address

In Ethernet, there is a method called broadcast, which can send packets to all devices connected to the same Ethernet.

ARP uses broadcast to ask all devices: "Who has the IP address xx? Please tell me your MAC address."

ARP also has a cache, and the ARP cache is checked first when sending.
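The "check the cache first, otherwise broadcast" behaviour can be imitated with a toy model. No real ARP packets are sent here; the Host class and the addresses in SEGMENT are invented for illustration.

```python
# Hypothetical devices on one Ethernet: IP address -> MAC address.
SEGMENT = {
    "10.0.0.1": "00:00:5e:00:53:01",
    "10.0.0.2": "00:00:5e:00:53:02",
}

class Host:
    def __init__(self, segment):
        self.segment = segment
        self.cache = {}            # the ARP cache
        self.broadcasts_sent = 0

    def mac_for(self, ip):
        """Look in the cache first; otherwise 'broadcast' and remember the reply."""
        if ip in self.cache:
            return self.cache[ip]
        self.broadcasts_sent += 1          # ask every device on the segment
        mac = self.segment.get(ip)         # only the owner of the IP answers
        if mac is not None:
            self.cache[ip] = mac
        return mac

host = Host(SEGMENT)
print(host.mac_for("10.0.0.2"), host.mac_for("10.0.0.2"), host.broadcasts_sent)
```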

6. Basic knowledge of Ethernet

Ethernet is a communication technology designed for multiple computers to be able to communicate with each other freely and cheaply.

The essence of the original Ethernet is just a network cable. Small devices called transceivers are attached to it, but their only function is to connect the signals between different cables.

When a computer sends a signal, the signal flows through the entire network over the cable and eventually reaches all devices. For this reason the receiver's information has to be added at the beginning of the signal, so that it is clear who the signal is meant for.

With later development, the original form evolved into one built around repeater hubs, and the transceiver cables were replaced with twisted pair. Although the structure of the network changed, the basic property that a signal is sent to all devices did not.

Later, when switching hubs came into use, the signal was no longer sent to all devices but only to the device with the specified MAC address.

In summary, there are three characteristics:

Send the packet to the receiver of the MAC header
Identify sender by sender MAC address
Identify the contents of the packet with the ether type
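These three fields are exactly what the 14-byte MAC header holds. A minimal sketch of packing it; the MAC addresses and function names are made up, and 0x0800 and 0x0806 are the standard ether type values for IP and ARP.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806

def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes (48 bits)."""
    return bytes(int(part, 16) for part in mac.split(":"))

def build_mac_header(dst_mac, src_mac, ethertype=ETHERTYPE_IPV4):
    """Receiver MAC, sender MAC, then the ether type: 14 bytes in total."""
    return struct.pack("!6s6sH", mac_to_bytes(dst_mac),
                       mac_to_bytes(src_mac), ethertype)

header = build_mac_header("00:00:5e:00:53:02", "00:00:5e:00:53:01")
print(len(header), header.hex())
```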

7. Send network packets to the hub

We can send the packet out through the network cable. There are two ways of sending the signal:

Half-duplex mode, using a hub

To avoid signal collisions, the sender first checks whether a signal from another device is present on the cable. If there is, it has to wait.

Full-duplex mode, using a switch

Sending and receiving can be done simultaneously without collisions

8. Receiving the returned packet

In half-duplex Ethernet using a hub, the signal sent by one device reaches all devices connected to the hub. After the whole signal is received, the network card checks the FCS and the MAC address. Packets that pass are placed in a buffer, and the network card notifies the computer that a packet has been received.
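The two checks can be sketched as follows. The FCS is modelled here with zlib.crc32, which uses the same CRC-32 polynomial as Ethernet; how the bits are ordered on the wire, and broadcast addresses, are not modelled. The function names are invented for this example.

```python
import zlib

def add_fcs(frame):
    """Sender side: append a 4-byte CRC-32 of the frame as the FCS."""
    return frame + zlib.crc32(frame).to_bytes(4, "little")

def accept(frame_with_fcs, my_mac):
    """Receiver side: check the FCS, then check the receiver MAC address."""
    frame, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    if zlib.crc32(frame).to_bytes(4, "little") != fcs:
        return False               # damaged on the way: discard
    return frame[:6] == my_mac     # first 6 bytes are the receiver's MAC

me = bytes.fromhex("00005e005302")
sender = bytes.fromhex("00005e005301")
frame = add_fcs(me + sender + b"\x08\x00" + b"payload")
print(accept(frame, me), accept(frame, sender))
```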

The action to notify the computer uses a mechanism called an interrupt , which needs to interrupt the computer's ongoing task to let the computer notice what's happening on the network card

6) Send and receive operation of UDP protocol

1. Short data for control

The operation of exchanging control information such as DNS query can basically be solved within the size of a packet. In this scenario, UDP can be used instead of TCP.
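For comparison with TCP, a UDP exchange needs no connection phase: a single packet is simply sent to an address. The sketch below sends one datagram to a receiver on 127.0.0.1 so it can run without a real DNS server; the function name udp_round_trip is made up.

```python
import socket

def udp_round_trip(payload):
    """Send one datagram to a local receiver and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No connect step: the whole message travels in one packet.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()

print(udp_round_trip(b"short control message"))
```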

2. Audio and video data

Audio and video data must be delivered within a specified time. If it arrives late, the playback moment is missed and the sound or image stalls. UDP is therefore usually used for more efficient transmission, because even if some packets are lost it does not cause serious problems; it only causes some distortion or brief stalls.

Chapter 3: From Network Cables to Network Devices

1) The signal is transmitted in the network cable and the hub

After a network packet leaves the client computer, it passes through hubs, switches, and routers before finally entering the Internet. In practice, our home routers already integrate the functions of a hub and a switch

2) It is important to prevent signal attenuation in the network cable

When the signal arrives at the hub, it is not exactly the same as when it was sent. The signal the hub receives is attenuated: its energy is gradually lost during transmission, and the longer the network cable, the more serious the attenuation.

Even if the line conditions are good and there is no noise, the signal will still be distorted during transmission. If the influence of noise is added, the distortion will be even worse.

3) Twisted pair is to suppress noise

The "twisted" in twisted pair means that the two signal wires of each pair are twisted around each other. This design suppresses the influence of noise.

Reasons for noise :

There are electromagnetic waves around the network cable, and when the electromagnetic waves come into contact with conductors such as metals, a current is generated in them. Since the signal itself is also a current with voltage changes, its essence is the same as the current generated by the noise, so the current of the signal and the noise will be mixed together, which will cause the waveform of the signal to be distorted, which is the influence of the noise.

Types of electromagnetic waves :

Electromagnetic waves leaked from equipment other than the network cable, such as electric motors, fluorescent lamps, and CRT displays

To suppress this kind of noise: the signal wire is made of metal, so when an electromagnetic wave hits it, a current is generated along the right-handed direction relative to the wave's direction of propagation, and this current distorts the waveform. If the two signal wires are twisted together, they form a spiral, and the noise currents generated in the two wires flow in opposite directions, so they cancel each other out and the noise is suppressed

Electromagnetic waves leaked from adjacent signal wires in the same network cable

This noise is not strong, but its source is very close. It is suppressed by the way the pairs are twisted: within one cable, each pair of signal wires has a slightly different twist pitch, so in some places the positive signal wire is closer and in other places the negative signal wire is closer. Since the noise induced by the positive and negative wires has opposite effects, the two cancel each other out.

4) The hub sends the signal to all lines

When the signal reaches the hub, it is broadcast to the entire network. The basic architecture of Ethernet is to send packets to all devices and let each device decide, based on the receiver's MAC address, which packets it should accept. The hub is a faithful embodiment of this architecture: it simply broadcasts the signal in accordance with the basic Ethernet design.

2) The switch forwards according to the address table

Switches are designed to forward network packets as-is to their destination.

When the signal reaches the network cable port, it is received by the PHY (MAU) module; this part is the same as in a hub. The PHY (MAU) module converts the signal on the cable into a common format and passes it to the MAC module. The MAC module converts the signal into digital information, then checks for errors using the FCS at the end of the packet, and if there is no problem stores the packet in a buffer.

This part works basically the same way as a network card; you can think of each network cable port on the switch as having a network card behind it. Unlike a network card, however, a switch port has no MAC address.

Once the packet is stored in the buffer, the switch checks whether the receiver's MAC address is recorded in the MAC address table, and then sends the packet to the corresponding port through the switching circuit

MAC address table

When the network packet reaches the sending port through the switching circuit, the MAC module and the PHY (MAU) module in the port will perform the sending operation and send the signal to the network cable. This part is the same as the process of sending the signal by the network card.

3) Maintenance of MAC address table

In the process of forwarding packets, the switch also needs to maintain the contents of the MAC address table. There are two maintenance operations:

Adding a record: when a packet is received, the sender's MAC address and the input port number are written into the MAC address table.
Deleting a record: outdated records that have not been used for a period of time are deleted from the address table. This prevents problems when a terminal device is moved (for example, when we carry a computer from the desk to the conference room, the port it is connected to changes).
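The two maintenance operations can be sketched roughly as follows. This is a toy model, not how any real switch is implemented: the class name LearningSwitch, the max_age value of 300 seconds, and the port numbers are all assumptions for illustration. Sending a packet with an unknown destination to every other port is standard switch behavior that these notes do not cover in detail.

```python
class LearningSwitch:
    """Toy model of a switch's MAC address table (illustrative only)."""

    def __init__(self, ports, max_age=300.0):
        self.ports = list(ports)
        self.max_age = max_age  # seconds a record may stay unused (assumed value)
        self.table = {}         # MAC address -> (port number, time last seen)

    def receive(self, src_mac, dst_mac, in_port, now):
        """Handle one packet and return the list of ports to send it out of."""
        # Maintenance 1: record the sender's MAC address and input port.
        self.table[src_mac] = (in_port, now)
        # Maintenance 2: delete records that have not been refreshed recently.
        self.table = {mac: (port, seen)
                      for mac, (port, seen) in self.table.items()
                      if now - seen <= self.max_age}
        entry = self.table.get(dst_mac)
        if entry is None:
            # Unknown receiver: send to every port except the input port.
            return [p for p in self.ports if p != in_port]
        port, _ = entry
        # A packet whose receiver sits on the input port needs no forwarding.
        return [] if port == in_port else [port]

sw = LearningSwitch(ports=[1, 2, 3])
first = sw.receive("A", "B", in_port=1, now=0)      # B is unknown yet
reply = sw.receive("B", "A", in_port=2, now=10)     # A was learned on port 1
moved = sw.receive("A", "B", in_port=3, now=1000)   # A moved; B's record expired
```

In the last call the computer "A" has been moved to port 3 long after "B" was last heard from, so the stale record for "B" is gone and the packet is sent to all other ports again.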

4) Full duplex mode

Full-duplex mode is a feature specific to the switch: it can send and receive at the same time, which a hub cannot do.

5) Auto-negotiation: determine the optimal transmission rate

Auto-negotiation detects whether the other side supports full-duplex mode and switches to the appropriate working mode automatically. Besides the working mode, it can also detect the other side's transmission rate and switch to it automatically.

3) The packet forwarding operation of the router

1. Basic knowledge of routers

After the network packet passes through the hub and switch, it reaches the router.

Routers are designed based on IP and switches are designed based on Ethernet. The router includes a forwarding module and a port module

Forwarding module: responsible for judging the forwarding destination of the packet (similar to the IP module)
Port module: responsible for sending and receiving packets (similar to a network card)

The basic principle of the router:

When the router forwards a packet, it first receives the packet through a port. How this step works depends on the communication technology of the port: an Ethernet port works according to the Ethernet specification, while a wireless LAN port works according to the wireless LAN specification. In short, the port hardware is entrusted with receiving the packet. Next, the forwarding module looks up the receiver's IP address, recorded in the IP header of the received packet, in the routing table to determine the forwarding target. The forwarding module then hands the packet to the port corresponding to that target, and the port sends the packet out according to the rules of its hardware. In other words, the forwarding module entrusts the port module with sending the packet.
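The routing table lookup done by the forwarding module can be sketched with Python's standard ipaddress module. The table contents and port names below are made up for illustration. The rule of preferring the most specific matching entry (longest prefix) is standard router behavior rather than something stated in these notes.

```python
import ipaddress

# Hypothetical routing table: (destination network, output port).
ROUTING_TABLE = [
    (ipaddress.ip_network("10.10.1.0/24"), "port1"),
    (ipaddress.ip_network("10.10.0.0/16"), "port2"),
    (ipaddress.ip_network("0.0.0.0/0"), "port3"),  # default route
]

def lookup(dst_ip):
    """Return the output port for the receiver IP address, or None."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [(net, port) for net, port in ROUTING_TABLE if addr in net]
    if not matches:
        return None
    # When several entries match, pick the one with the longest prefix.
    _, port = max(matches, key=lambda m: m[0].prefixlen)
    return port
```

For example, lookup("10.10.1.5") matches all three entries and picks the /24 one, while an address outside 10.10.0.0/16 falls through to the default route.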

Chapter 4: Entering the Internet through the Access Network

1) Structure and working mode of ADSL access network

1. The basic structure of the Internet is the same as that of home and corporate networks

The Internet also forwards packets through routers. We can understand the Internet as an enlarged version of the home and company network.

The two main differences between the Internet and home and corporate networks are: the difference in distance and the way in which routes are maintained.

2. Access network connecting users to the Internet

The so-called access network refers to the communication line connecting the Internet with home and company networks. For homes, access networks generally include ADSL, FTTH, CATV, telephone lines, ISDN, and so on, while companies may also use dedicated lines.

3. ADSL Modem splits packets into cells

In the figure, network packets travel from right to left. Packets sent by the router on the client side reach the telephone office through the ADSL Modem and the telephone line, and then reach the ADSL network operator (ISP).

The network packets generated by the client first pass through the hub and switch to the Internet access router, where the IP packet is extracted from the Ethernet packet and the forwarding destination is determined. Here, three headers are added to the packet: a MAC header, a PPPoE header, and a PPP header. The packet is then converted into an electrical signal according to Ethernet rules and sent to the ADSL Modem, which splits it into many small pieces. Each piece is called a cell.
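Splitting a packet into cells can be pictured with the sketch below. It assumes ATM-style cells that each carry 48 bytes of data (a real cell also has a 5-byte header, left out here). The function names are hypothetical, and passing the original length to reassemble by hand is a simplification; real equipment records this information inside the cell stream.

```python
CELL_PAYLOAD = 48  # data bytes carried by one cell (ATM-style, assumed)

def split_into_cells(packet: bytes):
    """Cut a packet into fixed-size cells, zero-padding the last one."""
    cells = []
    for start in range(0, len(packet), CELL_PAYLOAD):
        chunk = packet[start:start + CELL_PAYLOAD]
        cells.append(chunk.ljust(CELL_PAYLOAD, b"\x00"))
    return cells

def reassemble(cells, original_length):
    """Join the cells back together and drop the padding."""
    return b"".join(cells)[:original_length]

packet = bytes(range(100))        # a stand-in for MAC + PPPoE + PPP + IP data
cells = split_into_cells(packet)  # 100 bytes do not divide evenly by 48
restored = reassemble(cells, len(packet))
```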

4. ADSL modulates cells into signals

Ethernet uses a square wave signal to represent 0 and 1. ADSL is more complicated: it represents 0 and 1 with a signal synthesized from smooth waveforms (sine waves). This technique is called modulation.

There are many modulation methods. ADSL uses quadrature amplitude modulation (QAM), which combines amplitude modulation (ASK) and phase modulation (PSK).
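Here is a toy sketch of the idea of combining amplitude and phase. One bit selects the amplitude (the ASK part) and two bits select the phase (the PSK part), so a single wave carries 3 bits. The amplitude levels, phase angles, and function names are invented for illustration; the constellations used by real ADSL equipment are different and larger.

```python
import cmath
import math

AMPLITUDES = {0: 1.0, 1: 2.0}                         # ASK: 1 bit -> amplitude
PHASES = {0b00: 45, 0b01: 135, 0b10: 225, 0b11: 315}  # PSK: 2 bits -> degrees

def modulate(bits):
    """Map 3 bits to one wave, written as a complex number (amplitude, phase)."""
    amplitude = AMPLITUDES[bits[0]]
    phase = math.radians(PHASES[(bits[1] << 1) | bits[2]])
    return cmath.rect(amplitude, phase)

def demodulate(symbol):
    """Recover the 3 bits by picking the nearest amplitude and nearest phase."""
    amp_bit = min(AMPLITUDES, key=lambda b: abs(AMPLITUDES[b] - abs(symbol)))
    angle = math.degrees(cmath.phase(symbol)) % 360
    phase_bits = min(PHASES, key=lambda b: abs(PHASES[b] - angle))
    return (amp_bit, phase_bits >> 1, phase_bits & 1)

sent = (1, 0, 1)
received = demodulate(modulate(sent) + 0.1)  # a little noise on the line
```

Because the receiver picks the nearest amplitude and phase, a small amount of noise does not change the recovered bits.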

5. ADSL increases speed by using multiple waves

A signal does not have to be limited to a single frequency. Waves of different frequencies can be combined into one, so a wave synthesized from multiple frequencies can carry several signals at once, multiplying the number of bits that can be represented. ADSL takes advantage of this property, raising its rate by using many waves to increase the number of bits it can represent.

6. The role of the splitter

After the ADSL Modem converts the cells into electrical signals, the signals enter a device called a splitter, where the ADSL signal is mixed with the telephone's voice signal and sent out over the telephone line.

The splitter's real work is in the opposite direction: when a signal arrives from the telephone line, the telephone signal and the ADSL signal need to be separated.

The splitter does this by filtering out signals above a certain frequency, that is, the high-frequency signals used by ADSL. In this way only the telephone signal reaches the telephone, while the ADSL Modem on the other branch receives the original mixed signal as it is.

7. From the user to the telephone office

Beyond the splitter is the socket where the telephone line plugs in. After leaving here, the signal passes through the indoor telephone wiring and then reaches the IDF and MDF of the building. Near the telephone office there are so many underground cables that they are laid in a tunnel, known as a cable tunnel. After entering the telephone office through the cable tunnel, the cables are connected one by one to the MDF of the telephone office.

2) Optical fiber access network

1. Basic knowledge of fiber optics

Besides the ADSL described above, another access network technology is FTTH, which is based on optical fiber. The key to FTTH is the use of optical fibers.

Optical fibers are made of fibrous transparent materials (such as glass and plastic) in a two-layer structure, and transmit digital information by conducting optical signals through the inner core. Light means 1, dark means 0.

2. Single mode and multimode

The key technology of optical fiber communication is the optical fiber that carries the optical signals. Fiber comes in several types, generally the thinner single-mode fiber (8~10 μm) and the thicker multimode fiber (50 μm or 62.5 μm).

Multimode fiber: it can carry multiple light rays at once, which means more light passes through, so the performance requirements for light sources and photosensitive elements are lower.

Single-mode fiber: only one light ray can be carried, so less light passes through, and the performance requirements for light sources and photosensitive elements are higher.

Single-mode fiber has less signal distortion, so it can run over longer distances than multimode fiber. Therefore, multimode fiber is mainly used for connections within a building, while single-mode fiber is used for long-distance connections between buildings.

3) PPP and tunnels used in the access network

1. User authentication and configuration delivery

In both ADSL and FTTH access networks, you need to log in with a user name and password before you can access the Internet, and the BAS is the window for logging in. The BAS uses PPPoE to provide this function.

How PPP works:

First, the user dials the operator's access point and, once the call is connected, enters the user name and password to log in. The RAS forwards the user name and password to the authentication server through the RADIUS protocol, and the authentication server verifies whether they are correct. If they are, the authentication server returns configuration information such as the IP address, which is then delivered to the user.

2. Transmission of PPP messages over Ethernet

ADSL and FTTH access also need to assign a public address to the computer before it can access the Internet, just as dial-up access does.

The PPP protocol does not define elements such as Ethernet's header and FCS, nor does it define a signal format, so a PPP message cannot be converted directly into a signal and sent. To transmit a PPP message, another "container" is needed that provides the header, FCS, signal format, and so on, and the PPP message is carried inside this container. In dial-up access, PPP borrows the HDLC protocol as the container. HDLC was originally designed for transmitting network packets over dedicated lines, and dial-up access uses a slightly modified version of that specification. ADSL and FTTH cannot use HDLC, so Ethernet packets take its place as the container for PPP messages. To fill the gaps this substitution creates, a new specification was designed, namely PPPoE.

3. Send network packets to the operator through the tunnel

In addition to serving as a window for user authentication, the BAS can also use tunneling to transmit network packets.

A tunnel is similar to a TCP connection established between sockets. Put an entire packet, header included, into one end of the tunnel and it comes out of the other end intact, as if a tunnel had been dug through the network for packets to pass through.

How the tunnel is implemented:

TCP-connection-based tunnel. A TCP connection is established between the two tunnel routers on the network, the sockets at both ends of the connection are treated as router ports, and packets are sent and received through these ports.
Encapsulation-based tunnel. The entire packet, header included, is placed inside another packet and sent to the other end of the tunnel. In this way, the original packet arrives at the exit on the other end intact.
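The encapsulation approach is easy to picture in code. In the sketch below, the Packet class, the function names, and all the addresses are hypothetical; real tunneling protocols add their own headers, which are left out here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    src: str         # sender address in the header
    dst: str         # receiver address in the header
    payload: object  # data, or a whole inner packet

def tunnel_enter(inner, tunnel_src, tunnel_dst):
    """Tunnel entrance: put the whole packet, header included, into a new packet."""
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)

def tunnel_exit(outer):
    """Tunnel exit: take the original packet back out, untouched."""
    return outer.payload

original = Packet(src="192.0.2.10", dst="198.51.100.80", payload=b"GET / HTTP/1.1")
outer = tunnel_enter(original, tunnel_src="10.0.0.1", tunnel_dst="10.0.0.2")
delivered = tunnel_exit(outer)
```

Routers along the way only look at the outer header (10.0.0.1 to 10.0.0.2), while the packet that comes out at the exit still carries its original sender and receiver addresses.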

4) Inside the network operator

1. POPs and NOCs

The Internet is made up of the networks of many operators connected to each other. Access networks such as ADSL and FTTH connect to the equipment of the operator the user has contracted with. This equipment is called a POP, and it is the entrance to the Internet.

The structure of a POP varies according to the type of access network and the type of service the operator provides.

5) Network packets across operators

1. Connections between operators

After the network packet arrives at the POP router, if the final destination Web server and the client are connected to the same operator, there should be a corresponding forwarding destination in the routing table of the POP router. The operator's router can exchange routing information with other routers, thereby automatically updating its own routing table and realizing automatic management.

If the server and the client are connected to different operators, the packet must first be sent to the server's operator. This route can also be found in the routing table, because the operator's routers exchange routing information with the routers of other operators as well.

2. Routing information exchange between operators

Once connected routers inform each other of their routing information, each router knows all the networks reachable through the other, can write this information into its own routing table, and can send packets to those networks. The mechanism used for this exchange of routing information is called BGP.

This routing exchange can be divided into two categories:

The operators inform each other of all the routes in the Internet. This method is called transit.
The two operators inform each other only of the routes to their own networks, so that their two networks can exchange packets with each other. This method is called non-transit, also known as peering.

Chapter 5: What are the mysteries in the server-side LAN

1) The deployment location of the web server

1. Deploy a web server in the company

Traditional deployment method: The server is deployed directly on the company network and can be accessed directly from the Internet. In this case, the network packet reaches the server after passing through the router in the nearest POP, the access network, and the server-side router.

This approach has disadvantages:

Not enough IP addresses. This approach requires all devices on the company network, including servers and client computers, to be assigned their own public addresses.
Security problems. Packets from the Internet reach the server without any filtering.

2. Deploy the web server in the data center

Servers can be placed in a data center run by a network operator, or rented directly from the operator.

The data center is directly connected to the NOC, the core of the operator's network, or to an IX, the hub between operators. In other words, the data center is connected to the core of the Internet through high-speed lines, so a server deployed there gets high access speed.

2) The structure and principle of firewall

1. Mainstream packet filtering methods

Wherever the server is deployed, a firewall is deployed in front. If the packet cannot pass through the firewall, it cannot reach the server.

2. How to set the rules of packet filtering

The header of a network packet contains control information used to control the operation of the communication, and by examining this information, a lot of useful content can be obtained.

When setting packet filtering rules, the first thing to look at is how packets flow. The receiver IP address and sender IP address give the start and end points of a packet, so IP addresses can be used as judgment conditions.

3. Qualify applications by port numbers

When we want to limit an application, we can add the port number in the TCP header or the UDP header as the judgment condition.

4. Determine the connection direction through the control bits

With the two conditions above, filtering can be limited to a specific application, but there is still no way to stop the Web server from initiating access to the Internet. The TCP protocol used by the Web sends packets in both directions, so if we simply block packets flowing from the Web server to the Internet, access from the Internet to the Web server is affected as well and stops working. Judging by the flow direction of packets alone is therefore not enough; we also need to judge by the direction of the access, and for that the control bits in the TCP header are used.
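The three kinds of judgment conditions (IP address, port number, and TCP control bits) can be put together in a small sketch. The server address, the Packet fields, and the allow function are all hypothetical. The sketch relies on the fact that the first packet of a new TCP connection has SYN set to 1 and ACK set to 0, so blocking such packets from the server blocks only connections that the server itself tries to open.

```python
from dataclasses import dataclass

WEB_SERVER = "192.0.2.80"  # hypothetical address of the Web server

@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    syn: bool = False  # TCP control bit SYN
    ack: bool = False  # TCP control bit ACK

def allow(pkt):
    """Return True if the firewall lets the packet through."""
    # Internet -> Web server, port 80: always allowed.
    if pkt.dst_ip == WEB_SERVER and pkt.dst_port == 80:
        return True
    if pkt.src_ip == WEB_SERVER and pkt.src_port == 80:
        # SYN=1 and ACK=0 marks an attempt to open a new connection.
        # Coming from the server, that is access in the wrong direction.
        if pkt.syn and not pkt.ack:
            return False
        # Everything else from port 80 is a reply to a client: allowed.
        return True
    # Packets matching no rule are blocked.
    return False

client_syn = Packet("203.0.113.5", WEB_SERVER, 50000, 80, syn=True)
server_reply = Packet(WEB_SERVER, "203.0.113.5", 80, 50000, syn=True, ack=True)
server_initiated = Packet(WEB_SERVER, "203.0.113.5", 80, 50000, syn=True)
```

Here client_syn and server_reply pass, while server_initiated is blocked even though it has the same addresses and ports as the reply.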

3) Content distribution service

1. Leverage content distribution services to share the load

When the cache server is deployed on the server side, traffic across the Internet is not reduced. If the cache server is deployed on the client side instead, access is free from, or less affected by, congestion points in the network, making traffic more stable. But a cache server on the client side is out of the Web server operator's control, so the operator cannot increase or decrease the number of cache servers. A third option is therefore to deploy cache servers at the edge of the Internet.

Three deployment methods:

For a Web server operator, deploying these cache servers on its own is a considerable burden, so there are vendors that specialize in providing this service (content distribution service), known as CDSPs.

2. How to find the nearest cache server

Access is distributed using DNS servers that relay DNS queries to each other. However, DNS can only return IP addresses in order in a round-robin manner, regardless of the distance between the client and the cache server, so it may return the IP address of the cache server farther away from the client.

To direct the client to the nearest cache server, the DNS server should not use round-robin. Instead, it should judge the distance between the client and each cache server, and return the IP address of the cache server closest to the client.
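The difference between the two approaches can be sketched as follows. The server addresses, the client names, and the DISTANCE table (imagine it as a rough hop count estimated from routing information) are all made up for illustration.

```python
from itertools import cycle

CACHE_SERVERS = ["198.51.100.1", "198.51.100.2", "198.51.100.3"]

# Hypothetical distance from each client's network to each cache server.
DISTANCE = {
    "client-a": {"198.51.100.1": 12, "198.51.100.2": 3, "198.51.100.3": 7},
    "client-b": {"198.51.100.1": 2, "198.51.100.2": 9, "198.51.100.3": 5},
}

_next_server = cycle(CACHE_SERVERS)

def resolve_round_robin(client):
    """Plain round-robin: hand out addresses in turn, ignoring who is asking."""
    return next(_next_server)

def resolve_nearest(client):
    """Distance-aware: return the cache server closest to this client."""
    distances = DISTANCE[client]
    return min(CACHE_SERVERS, key=lambda ip: distances[ip])
```

With round-robin, client-a may well be handed 198.51.100.1, the farthest server for it, while resolve_nearest always gives it 198.51.100.2.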

Don't just talk, don't be lazy, and join Xiao Cai in being a programmer who builds architecture while bragging about it~ Follow along and keep me company, so that Xiao Cai is no longer alone. See you next time!

If you work harder today, tomorrow you will be able to say one less thing to ask for help!
I'm Xiao Cai, a man who grows stronger with you. 💋