flow chart
This question looks simple at first glance: it is nothing more than an HTTP request followed by the browser rendering the response. Yet there is a lot to talk about. The order of execution is roughly: user input - start navigation - HTTP request - browser rendering. Of these, user input, starting navigation, and browser rendering are browser knowledge, while the HTTP request is HTTP knowledge.
The following is the entire flow chart from entering the URL to seeing the page
Original image address: https://s2.loli.net/2022/04/26/zv3DJdqoSm4bVsZ.png
foreword
Before understanding "start navigation", you need to know the browser architecture. In short, a modern browser consists of 1 main browser process, 1 GPU process, multiple rendering processes, multiple plug-in processes, a network process, an audio process, and a storage process
The following picture, from Li Bing's "The Working Principle and Practice of Browser", shows the architecture of the Chrome browser
And a schematic diagram of the future modern browser architecture:
The article "Revealing the Inside of Modern Browsers" also has a picture, which it describes like this
That figure shows the main browser process containing a UI thread, a network thread, and a storage thread, which differs from Li Bing's view. Which one is right? Consider the timing: Li Bing's column was written in 2019, while "Revealing the Inside of Modern Browsers" was written in 2018. As of 2022, the UI, network, storage, and so on have all been split out into separate processes rather than threads inside the browser's main process
user input
When the user enters a string in the address bar, the address bar will determine whether the entered keyword is the search content or the requested URL
- If it is search content, the address bar uses the browser's default search engine to build a new URL containing the search keywords (for example, searching for Nagasawa Masami in Chrome)
- If the input conforms to URL rules, for example azhubaby.com, the address bar adds the protocol to it according to the rules to form a full URL, such as https://azhubaby.com
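As a rough illustration of the decision described above, the sketch below classifies the input as a URL or as search keywords. The function name `resolveAddressBarInput`, the regular expressions, and the search endpoint are all assumptions made for this example; real browsers apply far more elaborate rules.

```javascript
// Hypothetical sketch: turn address-bar input into the URL to navigate to.
// The search endpoint is an assumed default, not something a browser hard-codes.
const SEARCH_ENDPOINT = 'https://www.google.com/search?q=';

function resolveAddressBarInput(input) {
  const text = input.trim();

  // Case 1: the input already carries a scheme, e.g. https://azhubaby.com
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(text)) {
    return text;
  }

  // Case 2: no whitespace and it looks like "host.tld[/path]": add the protocol
  if (/^[^\s/]+\.[a-z]{2,}(\/\S*)?$/i.test(text)) {
    return 'https://' + text;
  }

  // Case 3: anything else is treated as search keywords
  return SEARCH_ENDPOINT + encodeURIComponent(text);
}
```

Calling `resolveAddressBarInput('azhubaby.com')` takes the second branch, while `resolveAddressBarInput('Nagasawa Masami')` falls through to the search branch because the input contains a space.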
When the user types a keyword and presses Enter, the current page is about to be replaced by a new one. Before that happens, the browser fires the beforeunload event, which lets the page show a confirmation dialog before the user leaves. A page can use it to give the user a chance to cancel the navigation
// Listen for the event fired before the page is left
window.addEventListener('beforeunload', (event) => {
  // Ask the browser to show its confirmation dialog
  event.preventDefault();
  // Older browsers rely on returnValue being set to trigger the dialog
  event.returnValue = '';
})
Check out the demo of beforeunload here
In terms of the division of labor in the browser architecture, the UI process (the browser main process in older browsers) is the one at work while the user types
start navigating
When Enter is pressed, the UI process hands the request over to the network process. Before sending the request, the network process first checks whether the resource is in the local cache. If it is cached and still valid, the resource is returned directly to the browser process; if it is not found in the cache, the HTTP request stage officially begins
For knowledge about HTTP caching, you can read this article - interview frequent visitor: HTTP caching
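To make the cache check concrete, here is a minimal sketch of how a strong-cache lookup might work before any request goes out. The `httpCache` Map, the entry shape (`storedAt`, `maxAgeSeconds`, `body`), and the function name `lookupCache` are assumptions for illustration; a real network process also handles `Expires`, `no-cache`, revalidation, and more.

```javascript
// Hypothetical sketch of a strong-cache lookup keyed by URL.
// Each entry records when it was stored and the max-age from Cache-Control.
function lookupCache(cache, url, now = Date.now()) {
  const entry = cache.get(url);
  if (!entry) {
    return null; // never cached: go to the network
  }
  const ageSeconds = (now - entry.storedAt) / 1000;
  if (ageSeconds < entry.maxAgeSeconds) {
    return entry.body; // still fresh: serve from cache, no HTTP request
  }
  return null; // stale: fall through to the HTTP request stage
}

// Usage: a response stored at t=0 with Cache-Control: max-age=60
const httpCache = new Map();
httpCache.set('https://azhubaby.com/', {
  storedAt: 0,
  maxAgeSeconds: 60,
  body: '<html>...</html>',
});
```

Passing `now` explicitly keeps the sketch easy to try out: a lookup 30 seconds after storage is served from the cache, while one after 61 seconds is not.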
HTTP request
I previously wrote an article on the TCP/IP protocol and the layered network model. The layers are like building blocks: each layer relies on the support of the layer below it. HTTP is an application-layer protocol, so before an HTTP request can be sent, the transport layer (TCP) and the internet layer (IP) below it must be in place first
And where does the IP address come from? DNS maps the domain name to an IP address
Working backwards, we can sort out the "route":
HTTP request - HTTP protocol - TCP connection - IP protocol - need to know the IP - DNS maps the domain name to the IP
So the first step of an HTTP request is DNS resolution
DNS resolution
I will not go into much detail about DNS here. In short, it lets people use domain names, which are easy to remember, in place of IP addresses. For example, du.azhubaby.com stands for the IP address 47.102.152.19. You can ping the domain name on the command line to verify the result
Before the HTTP request is sent, the first step is to check whether the DNS cache already holds an entry for the domain. If so, the IP address is returned directly; if not, DNS resolution is performed and the resulting IP is stored in the DNS cache
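The check-then-resolve logic can be sketched as follows. This is only an illustration: `createCachedResolver` and the injected `resolve` callback are made-up names, the callback is synchronous for simplicity (real DNS queries are asynchronous), and a real cache would also expire entries according to their TTL.

```javascript
// Hypothetical sketch: consult a DNS cache before doing a real lookup.
// `resolve` stands in for the actual DNS query and is passed in by the caller.
function createCachedResolver(resolve) {
  const dnsCache = new Map(); // hostname -> IP address

  return function lookup(hostname) {
    if (dnsCache.has(hostname)) {
      return dnsCache.get(hostname); // cache hit: return the IP directly
    }
    const ip = resolve(hostname); // cache miss: perform DNS resolution
    dnsCache.set(hostname, ip); // remember the result for next time
    return ip;
  };
}

// Usage with a fake resolver that counts how often it is called
let queryCount = 0;
const lookupHost = createCachedResolver(() => {
  queryCount += 1;
  return '47.102.152.19'; // the address the article gives for du.azhubaby.com
});
lookupHost('du.azhubaby.com'); // miss: calls the resolver
lookupHost('du.azhubaby.com'); // hit: served from the cache
```

After the two calls above, the fake resolver has run only once, which is the whole point of the cache.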
With the IP address in hand, the IP layer can route packets to the server, and the next step is the TCP connection at the transport layer
TCP connection
What happens here depends on the HTTP version. With HTTP/1.1, you have to consider whether the TCP queue is full: browsers such as Chrome open at most 6 TCP connections per domain name for HTTP/1.1, and any further requests wait in a queue until a connection is free. With HTTP/2 this is not a problem, because multiple requests are multiplexed concurrently over a single TCP connection
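The queuing behavior under HTTP/1.1 can be pictured with a small sketch. The class name `HostConnectionPool` and its methods are hypothetical, and requests are represented by plain strings; the only thing modeled is the limit of 6 and the waiting queue.

```javascript
// Hypothetical sketch of a per-host connection limit, as applied to HTTP/1.1.
const MAX_CONNECTIONS_PER_HOST = 6;

class HostConnectionPool {
  constructor(limit = MAX_CONNECTIONS_PER_HOST) {
    this.limit = limit;
    this.active = []; // requests currently holding a TCP connection
    this.waiting = []; // requests queued until a connection frees up
  }

  // Start the request if a connection is free, otherwise queue it.
  submit(request) {
    if (this.active.length < this.limit) {
      this.active.push(request);
    } else {
      this.waiting.push(request);
    }
  }

  // Called when a request finishes: hand its connection to the next in line.
  finish(request) {
    const index = this.active.indexOf(request);
    if (index === -1) {
      return; // not an active request, nothing to release
    }
    this.active.splice(index, 1);
    if (this.waiting.length > 0) {
      this.active.push(this.waiting.shift());
    }
  }
}

// Usage: 8 requests to one host, only 6 run at once
const pool = new HostConnectionPool();
for (let i = 1; i <= 8; i++) {
  pool.submit(`/asset-${i}.js`);
}
```

With 8 submissions, 6 requests are active and 2 are waiting; each call to `finish` promotes the oldest waiting request.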
It is also necessary to consider that if the protocol is the HTTPS protocol, a TLS connection needs to be established
Talking about the actual TCP connection brings to mind a famous interview question: three handshakes, four waves
Three handshakes, four waves
Why a three-way handshake and a four-way wave? Because only in this way can both parties (client and server) confirm each other's ability to send and receive.
The steps of the handshake are:
- The client proposes to establish a connection and sends its own sequence number: seq=client_isn
- After the server receives the message, it returns ack=client_isn+1 together with its own sequence number: seq=server_isn
- After the client receives that, it returns ack=server_isn+1 to indicate that it has received it
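The sequence and acknowledgment arithmetic in these three steps can be written out as a small sketch. Nothing is actually sent over the network here; the function name `threeWayHandshake`, the segment objects, and the example numbers 1000 and 5000 are assumptions for illustration (real initial sequence numbers are chosen unpredictably).

```javascript
// Hypothetical sketch of the three segments exchanged in the handshake.
// Only the seq/ack arithmetic is modeled, nothing is sent.
function threeWayHandshake(clientIsn, serverIsn) {
  // 1. Client -> server: SYN carrying the client's initial sequence number
  const syn = { from: 'client', flags: ['SYN'], seq: clientIsn };

  // 2. Server -> client: SYN+ACK acknowledging client_isn, carrying server_isn
  const synAck = {
    from: 'server',
    flags: ['SYN', 'ACK'],
    seq: serverIsn,
    ack: syn.seq + 1,
  };

  // 3. Client -> server: ACK acknowledging server_isn
  const ack = {
    from: 'client',
    flags: ['ACK'],
    seq: syn.seq + 1,
    ack: synAck.seq + 1,
  };

  return [syn, synAck, ack];
}

const segments = threeWayHandshake(1000, 5000);
```

In `segments`, the second entry carries `ack` equal to client_isn+1 and the third carries `ack` equal to server_isn+1, matching the list above.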
You can think of it as a man and a woman confirming their relationship. Both want to get married, so what do they do? They meet each other's parents to get their approval. As the saying goes, a marriage without the parents' blessing is an unhappy one (of course, some people marry without involving their parents, but that is not the mainstream)
- The man proposes a visit to the woman's home and brings a meeting gift: seq=the man's sincerity
- After receiving the gift, the woman's family returns a red envelope to the man: ack=we approve of you. The woman also brings a meeting gift when she visits the man's home: seq=the woman's sincerity
- After receiving the gift, the man's family returns a red envelope to the woman: ack=server_isn+1
This is what confirming the relationship looks like. It takes three trips back and forth before each side is sure of the other's sincerity and knows that its own sincerity has been received.
So what about the four waves?
Four waves are required to close the connection
Why does it take four?
Mainly to ensure that both parties know the other side has disconnected
The specific steps are:
- The client sends a message (FIN) telling the server that it wants to disconnect
- After receiving it, the server replies (ACK): got it. To make sure it has finished handling all the HTTP requests it received earlier, the server waits a while before disconnecting
- Once the server confirms that all HTTP requests have been handled, it sends its own message (FIN) to the client: all requests on my side have been processed, and I can disconnect too
- After receiving this message, the client replies (ACK): understood, go ahead and disconnect
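The four steps can also be traced as state changes on each side. The sketch below is a simplified illustration in which the client closes first; the function name `fourWayWave` and the log format are made up, while the state names are the standard TCP ones. Timers such as the TIME_WAIT wait are not modeled.

```javascript
// Hypothetical sketch: the four segments of connection teardown and the
// TCP state each side moves to after each one (client closes first).
function fourWayWave() {
  const client = { state: 'ESTABLISHED' };
  const server = { state: 'ESTABLISHED' };
  const log = [];

  // 1. Client sends FIN: "I want to disconnect"
  client.state = 'FIN_WAIT_1';
  log.push('client -> server: FIN');

  // 2. Server acknowledges, but may still have work to finish
  server.state = 'CLOSE_WAIT';
  log.push('server -> client: ACK');
  client.state = 'FIN_WAIT_2';

  // 3. Server has finished its work and sends its own FIN
  server.state = 'LAST_ACK';
  log.push('server -> client: FIN');

  // 4. Client acknowledges, then waits briefly before fully closing
  client.state = 'TIME_WAIT';
  log.push('client -> server: ACK');
  server.state = 'CLOSED';

  return { client, server, log };
}

const teardown = fourWayWave();
```

The gap between steps 2 and 3 is why four messages are needed: the server's ACK and its own FIN cannot always be combined, because it may still be finishing earlier requests.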
The three-way handshake, by comparison, mainly confirms that both parties can send and receive normally, and lets each side set its initial sequence number in preparation for the reliable transmission that follows.
You can think of it as a man and a woman breaking up
- The woman proposes breaking up: you are not good to me, I want to break up
- The man finds her request reasonable and agrees, but before they part he needs to sort out contact details, photos together, and all the other odds and ends
- Once he has sorted everything out, the man messages the woman: everything on my side has been dealt with. From now on you are you and I am me, and we can break up
- After receiving the message, the woman replies: understood, let's break up
With that, the two are cut off and the break-up procedure is complete. For details, see the article "Interviewer, Don't Ask Me About Three Handshakes and Four Waves Again" by Valley of the Apes. In a word: thorough
send HTTP request
With the TCP connection established, the HTTP request is now officially sent. There is plenty to cover here, such as the HTTP message content, request headers, response headers, request methods, and status codes
First of all, an HTTP message is made up of a start line + header + blank line + entity. In short, header + body. An HTTP message may have no body (a GET request, for example), but it must have a header
The request header consists of the request line + header fields, and the response header consists of the status line + header fields
The request line has three parts: the request method, the request target, and the version number
- e.g. GET / HTTP/1.1
The status line also has three parts: the version number, the status code, and the reason phrase
- For example HTTP/1.1 200 OK
In the browser, press F12 to open the developer tools and select any request in the Network panel to see this structure
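The three-part structure of both start lines is easy to see in code. The helpers below are a minimal sketch with made-up names (`parseRequestLine`, `parseStatusLine`), and they assume a well-formed line with all three parts present.

```javascript
// Hypothetical helpers that split an HTTP start line into its three parts.
// Both assume a well-formed line; no validation is attempted.
function parseRequestLine(line) {
  const [method, target, version] = line.trim().split(' ');
  return { method, target, version };
}

function parseStatusLine(line) {
  const text = line.trim();
  const firstSpace = text.indexOf(' ');
  const secondSpace = text.indexOf(' ', firstSpace + 1);
  return {
    version: text.slice(0, firstSpace),
    statusCode: Number(text.slice(firstSpace + 1, secondSpace)),
    // The reason phrase may itself contain spaces, e.g. "Not Found"
    reason: text.slice(secondSpace + 1),
  };
}

const requestLine = parseRequestLine('GET / HTTP/1.1');
const statusLine = parseStatusLine('HTTP/1.1 200 OK');
```

The status line is split on the first two spaces only, since the reason phrase is free text.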
This stage often leads to follow-up questions, such as the difference between the GET and POST request methods, HTTP status codes, and so on
The difference between GET and POST request methods
- From a caching perspective, GET responses can be cached, while POST responses generally are not
- From the perspective of parameters, GET passes parameters in the URL after "?" in the form key=value, joined with "&"; POST puts the data in the request body, where it is not visible in the address bar
- From a security point of view, GET exposes its parameters in the URL, which ends up in browser history and server logs; POST is less exposed than GET, although the body is only protected in transit when HTTPS is used
- From the perspective of encoding, a URL may only contain ASCII characters, so Chinese characters in GET parameters must be percent-encoded and may be garbled if client and server disagree on the encoding; POST can declare a character set for the body and transmit Chinese characters correctly
- From the perspective of data length, GET is limited in practice by the maximum URL length that browsers and servers accept (2048 characters is a commonly cited figure), while the HTTP specification itself sets no limit on the POST body
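To show where the parameters end up in each case, here is a small sketch that builds the same request both ways. The function names, the request-object shape, and the example.com URL are assumptions for illustration; `URLSearchParams` is the standard API available in browsers and Node.js.

```javascript
// Hypothetical sketch: the same parameters sent as GET and as POST.
function buildGetRequest(baseUrl, params) {
  // GET: parameters go after "?" as key=value pairs joined by "&"
  const query = new URLSearchParams(params).toString();
  return { method: 'GET', url: `${baseUrl}?${query}`, body: null };
}

function buildPostRequest(baseUrl, params) {
  // POST: the URL stays clean and the data travels in the request body
  return {
    method: 'POST',
    url: baseUrl,
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    },
    body: new URLSearchParams(params).toString(),
  };
}

const searchParams = { keyword: 'browser', page: '2' };
const getRequest = buildGetRequest('https://example.com/search', searchParams);
const postRequest = buildPostRequest('https://example.com/search', searchParams);
```

`URLSearchParams` also percent-encodes non-ASCII values, which is how Chinese characters survive the trip in a URL.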
HTTP status codes
The RFC standard divides status codes into five categories, with the first digit indicating the category. Since 0~99 are not used, the usable range shrinks considerably, from 000~999 to 100~599.
The specific meanings of these five categories are:
- 1xx: informational, indicating an intermediate state of protocol processing with further operations to follow;
- 2xx: success, the message has been received and processed correctly;
- 3xx: redirection, the resource location has changed and the client needs to resend the request;
- 4xx: client error, the request message is wrong and the server cannot process it;
- 5xx: server error, the server hit an internal error while processing the request.
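Since the category is determined by the first digit alone, the classification fits in a few lines. The function name `classifyStatus` and the category labels are chosen for this sketch.

```javascript
// Hypothetical helper: map a status code to one of the five categories.
function classifyStatus(code) {
  // Only 100~599 is in use; everything else is rejected
  if (!Number.isInteger(code) || code < 100 || code > 599) {
    return 'invalid';
  }
  const categories = {
    1: 'informational',
    2: 'success',
    3: 'redirection',
    4: 'client error',
    5: 'server error',
  };
  // The first digit of the code selects the category
  return categories[Math.floor(code / 100)];
}
```

For example, `classifyStatus(304)` lands in the redirection category even though, as discussed for 304, no actual jump takes place.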
There are a total of 41 status codes in the current RFC standard. Common ones include:
101 - Switching Protocols, sent when the client requests a protocol change with the Upgrade header field
200 - Request succeeded
204 - No Content, the server successfully processed the request but returned no body
206 - Partial Content, generally used for resumable downloads or for loading large files such as video in ranges
301 - Permanent redirect
302 - Temporary redirect
304 - Not Modified, the negotiated cache is still valid, so the cached data is used. It does not carry the usual redirect meaning, but can be understood as redirecting to the cached file (i.e. cache redirection)
400 - Syntax error in request
401 - Unauthorized
403 - The server received the request but refuses to serve it, i.e. access to the resource is forbidden
404 - The requested resource cannot be found
408 - Request Timeout
414 - The request URI is too long (often seen on Sina, as shown in Figure 1)
500 - Internal Server Error
501 - Not Implemented: the server does not support the functionality the request needs
502 - Bad Gateway
503 - Service Unavailable: the server actively answers with 503, for example when Nginx rate limiting is configured and the limit is exceeded
504 - Gateway Timeout
A note on 304: when the request carries the If-Modified-Since or If-None-Match header, the server checks whether the modification time (or the unique identifier) still matches the resource's Last-Modified (or ETag). If it does, the server returns 304 and the browser uses the copy in its cache
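A server-side version of this check might look like the sketch below. The function name `respondToConditionalGet`, the lowercase header keys, and the `resource` shape (`etag`, `lastModified`, `body`) are assumptions for illustration; weak ETags and lists of ETags are not handled.

```javascript
// Hypothetical server-side check for negotiated caching.
// `resource` is assumed to look like { etag, lastModified (Date), body }.
function respondToConditionalGet(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  const ifModifiedSince = requestHeaders['if-modified-since'];

  let notModified = false;
  if (ifNoneMatch !== undefined) {
    // The ETag comparison takes precedence when the header is present
    notModified = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince !== undefined) {
    // HTTP dates have one-second resolution, so compare whole seconds
    const since = Math.floor(Date.parse(ifModifiedSince) / 1000);
    const modified = Math.floor(resource.lastModified.getTime() / 1000);
    notModified = modified <= since;
  }

  return notModified
    ? { status: 304, body: null } // the browser reuses its cached copy
    : { status: 200, body: resource.body };
}

// Usage: a resource last changed on 26 April 2022
const page = {
  etag: '"abc"',
  lastModified: new Date('2022-04-26T08:00:00Z'),
  body: 'hello',
};
```

A request carrying `If-None-Match: "abc"` against `page` gets a 304 with no body, while a request without either header gets the full 200 response.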
request method
HTTP/1.1 specifies eight methods, all of which must be written in uppercase
- GET: get a resource, which can be understood as reading or downloading data. In practice it is the responses to GET requests that get cached
- HEAD: get the meta information of a resource
- POST: submit data to a resource, which is equivalent to writing or uploading data
- PUT: similar to POST, but it replaces the target resource with the submitted data
- DELETE: delete a resource
- CONNECT: establish a special connection tunnel
- OPTIONS: list the methods that can be performed on the resource
- TRACE: trace the request-response transmission path
browser rendering
When the HTTP request is complete, the resource is returned to the client (the browser), and the TCP connection is closed unless it is kept alive for reuse. At this point the browser checks whether the new page belongs to the same site as the page that is already open. If it does, the existing rendering process for that site can be reused to render the page; if not, the browser starts a new rendering process to parse the resource
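The same-site decision can be approximated with the sketch below. It is a deliberate simplification: `siteOf` and `isSameSite` are made-up names, and taking the last two host labels as the site is an assumption that breaks for suffixes such as .co.uk; real browsers consult the Public Suffix List.

```javascript
// Hypothetical, simplified same-site check: same scheme and same last two
// host labels. Real browsers use the Public Suffix List instead.
function siteOf(url) {
  const { protocol, hostname } = new URL(url);
  const registrable = hostname.split('.').slice(-2).join('.');
  return `${protocol}//${registrable}`;
}

function isSameSite(urlA, urlB) {
  return siteOf(urlA) === siteOf(urlB);
}
```

Under this approximation, https://azhubaby.com and https://du.azhubaby.com count as the same site and could share a rendering process, while https://example.com would get a new one.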
The general process of browser rendering is shown in the following figure:
We can divide page rendering into three steps:
Parse
- HTML is parsed into a DOM tree, CSS is parsed into a CSS rule tree, and JavaScript operates on the DOM tree and the CSS rule tree through the DOM API and CSSOM API
Render
- The browser engine builds the render tree from the DOM tree and the CSS rule tree; a lot of reflow and repaint happens along the way
Reflow and Repaint
- Reflow: the geometric size of an element has changed, and the render tree needs to be re-validated and recalculated
- Repaint: part of the screen needs to be painted again, for example because an element's background color has changed while its geometric size has not
- Reflow costs more than repaint
Draw
- Finally, the page is drawn through the native GUI API of the operating system (browser)
This is where the questions about repaint and reflow come from. One way to improve performance is to reduce the browser's rendering time, and one of the optimization points is to reduce repaints and reflows.
Ways to reduce reflows and repaints
- Don't modify DOM styles one by one. It is better to predefine a CSS class and change the element's class instead
- Modify the DOM after taking it "offline":
  - Use a DocumentFragment object to manipulate the DOM in memory
  - First set the DOM node to display:none (which triggers one reflow), change it as much as you want, and then display it again
  - Clone a DOM node into memory, change it as much as you want, and then swap it with the live one
- Don't read a DOM node's property values inside a loop as loop variables, otherwise the node's properties are read and written over and over
- Modify DOM nodes as low in the tree as possible
- Don't use table layout
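Two of the tips above, working "offline" with a DocumentFragment and switching a predefined class, can be combined in a short sketch. The function name `appendItems` and the class name `item` are made up; `doc` is passed in as a parameter so the function can be tried outside a browser, and in a page you would pass the global `document`.

```javascript
// Hypothetical sketch: build many nodes off-document, then attach them once.
function appendItems(doc, listElement, labels) {
  const fragment = doc.createDocumentFragment();
  for (const label of labels) {
    const item = doc.createElement('li');
    item.textContent = label;
    item.className = 'item'; // one predefined class instead of many style writes
    fragment.appendChild(item); // happens in memory, outside the live DOM
  }
  listElement.appendChild(fragment); // a single insertion into the live DOM
}
```

In a page this could be called as `appendItems(document, document.querySelector('ul'), ['a', 'b', 'c'])`. However many items there are, the live DOM is touched once instead of once per item.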
Properties that cause reflow:
width, height, padding, margin, border, position, top, left, bottom, right, float, clear, text-align, vertical-align, line-height, font-weight, font-size, font-family, overflow, white-space
Properties that cause repaints:
color, border-style, border-radius, text-decoration, box-shadow, outline, background
Remember that reflow depends on geometric size, while repaint does not
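The two property lists above can be turned into a quick lookup, for instance to flag expensive style changes during code review. This is only a sketch built from the lists in this article; the names `REFLOW_PROPERTIES`, `REPAINT_PROPERTIES`, and `effectOfChanging` are made up, and actual behavior varies between browser engines.

```javascript
// Hypothetical lookup built from the two property lists in this article.
const REFLOW_PROPERTIES = new Set([
  'width', 'height', 'padding', 'margin', 'border', 'position',
  'top', 'left', 'bottom', 'right', 'float', 'clear', 'text-align',
  'vertical-align', 'line-height', 'font-weight', 'font-size',
  'font-family', 'overflow', 'white-space',
]);

const REPAINT_PROPERTIES = new Set([
  'color', 'border-style', 'border-radius', 'text-decoration',
  'box-shadow', 'outline', 'background',
]);

function effectOfChanging(property) {
  // Geometry changes trigger a reflow (which is then followed by a repaint)
  if (REFLOW_PROPERTIES.has(property)) return 'reflow';
  // Appearance-only changes trigger just a repaint
  if (REPAINT_PROPERTIES.has(property)) return 'repaint';
  return 'unknown';
}
```

Changing `width` is reported as a reflow and changing `color` as a repaint, in line with the rule of thumb above.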
This completes the whole process from entering the URL to seeing the page.
Summary
This one question can branch into many others, so it tests the interviewee's knowledge of both HTTP and browsers. As the saying goes, "The Peng takes flight in fury, its wings like clouds hanging from the sky; it strikes the water for three thousand li, and the sky is nine thousand zhang; the good wind, with its strength, sends me to the blue sky." There is good reason why this has become a classic question.
Here the author summarizes and lists, one by one, the knowledge points that can be derived from this question, leaving them for you to think over
Browser aspect
browser architecture
- What is it composed of? Browser main process, GPU process, multiple rendering processes, multiple plug-in processes, network process, audio process, storage process, etc.
- Which threads are in the renderer process? GUI rendering thread, JS engine thread, event trigger thread, network asynchronous thread, timer thread
- Difference between a process and a thread? A process is a running instance of an application; a thread lives inside a process and is the smallest unit of execution that the operating system schedules
browser rendering
- rendering process? Parse, render, draw
Redraw and Reflow
- The difference between the two
- Properties for Redraw and Reflow
- How to reduce redraws and reflows and improve rendering performance
HTTP side
HTTP caching
Strong cache
- HTTP/1.1 Cache-Control
- HTTP/1.0 Expires
- Cache-Control > Expires
Negotiate cache
- HTTP/1.1 ETag/If-None-Match
- HTTP/1.0 Last-Modified/If-Modified-Since
- Accuracy: ETag > Last-Modified
- Performance: Last-Modified > ETag
TCP/IP connection
- Three handshakes, four waves
Performance optimization at the network level
- HTTP/1.1 approach
- The HTTP/2 approach
- The HTTP/3 approach
- The performance optimization used at each stage is different
References
- Introduction to Browser Rendering Principles
- Deep understanding of modern browsers
- 10,000-word detailed text: In-depth understanding of browser principles
- The inner workings of modern browsers (with detailed flowchart)
- The working principle of the browser that everyone should know about the front end