Foreword
This article is a translation of Onion's "Inside look at modern web browser" series by Mariko Kosaka. It is not a literal translation: it conveys what the author means based on my own understanding, and adds some related material where that helps readers.
What happens when navigating
This is part 2 of a 4-part blog series that explores the inner workings of Chrome. In the previous article, we explored the high-level architecture of browsers and the benefits of a multi-process architecture. We also discussed technologies closely related to that architecture, such as servicification and site isolation. Next, we start digging into how these processes and threads work together to display our web pages.
Let's look at a simple browsing example: you type a URL into the browser's address bar and press Enter, and the browser fetches the relevant data from the Internet and displays the page. In this article, we focus on this simple scenario of requesting data from a site and the preparation the browser does before rendering the page, that is, the process of navigation.
Everything starts with the browser process
We mentioned in Part 1: CPU, GPU, Memory and Multi-Process Architecture that everything outside the browser tab is controlled by the browser process. The browser process has many threads responsible for different tasks, including the UI thread, which draws components such as the browser's top buttons and the address bar input box; the network thread, which manages network requests; and the storage thread, which controls file reads and writes. When you enter a URL in the address bar, it is the UI thread that handles your input.
Figure 1: The browser UI is at the top; the diagram of the browser process at the bottom contains the UI, network, and storage threads
A simple navigation
Step 1: Process the input
When the user starts typing in the address bar, the first thing the UI thread asks is: "Is the string you entered a search query or a URL?" In Chrome, the address bar input can be either a URL to request directly or keywords the user wants to look up in a search engine (for example, Google), so the UI thread needs to parse the input and decide whether to send it to the search engine or to request the site you entered.
Figure 1: The UI thread asking whether the input string is a search query or a URL
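The decision described in Step 1 can be sketched as a small classifier. This is only an illustrative heuristic, not Chrome's actual address bar logic; the function name `classifyAddressBarInput` and its rules (whitespace means search, a scheme or a dotted host name means URL) are assumptions made for this example.

```javascript
// Hypothetical heuristic: NOT Chrome's real address bar logic.
function classifyAddressBarInput(input) {
  const text = input.trim();
  if (text === "") return { kind: "empty" };

  // Anything containing whitespace is treated as a search query.
  if (/\s/.test(text)) return { kind: "search", query: text };

  // An explicit scheme such as "https://" marks a URL.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(text)) return { kind: "url", url: text };

  // "localhost" or a dotted host name, optionally with a port and a path.
  if (/^(localhost|([a-z0-9-]+\.)+[a-z]{2,})(:\d+)?(\/.*)?$/i.test(text)) {
    return { kind: "url", url: "https://" + text };
  }

  // Everything else goes to the search engine.
  return { kind: "search", query: text };
}
```

With these rules, `mysite.com` is treated as a URL, while `how do browsers work` is handed to the search engine.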
Step 2: Start Navigation
When the user presses Enter, the UI thread asks the network thread to initiate a network request to fetch the site's content. At this point the tab displays a loading spinner, and the network thread performs a series of operations for the request, such as a DNS lookup and establishing a TLS connection.
Figure 2: The UI thread tells the network thread to navigate to mysite.com
At this point, if the network thread receives a redirect response from the server, such as HTTP 301, it tells the UI thread that the request is being redirected and then initiates a new network request.
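The redirect handling in Step 2 amounts to a loop: request, check for a redirect, request again. The sketch below models it synchronously. The `respond` callback stands in for the network and is a hypothetical helper, `Location` values are assumed to be absolute URLs, and the limit of 20 redirects is an assumption for this example.

```javascript
// Simplified model of redirect handling, not a real network stack.
const REDIRECT_STATUSES = [301, 302, 303, 307, 308];

// `respond(url)` is a hypothetical stand-in for the server:
// it returns { status, location? } for a given URL.
function navigateWithRedirects(startUrl, respond, maxRedirects = 20) {
  const visited = [startUrl];
  let url = startUrl;
  for (let hop = 0; hop <= maxRedirects; hop++) {
    const response = respond(url);
    const isRedirect =
      REDIRECT_STATUSES.includes(response.status) && Boolean(response.location);
    if (!isRedirect) return { finalUrl: url, status: response.status, visited };
    // The redirect is reported, then a new request is issued for the new URL.
    url = response.location;
    visited.push(url);
  }
  throw new Error("Too many redirects");
}
```

The `visited` list records every URL that was requested, which makes the extra round trip caused by each redirect visible.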
Step 3: Read the response
When the network thread starts receiving the stream of the HTTP response body (the payload), it inspects the first few bytes if necessary to determine the media type (MIME type) of the response body. The media type can generally be determined from the Content-Type header of the HTTP response, but Content-Type is sometimes missing or wrong. In that case the browser performs MIME type sniffing to determine the response type. MIME type sniffing is not easy: the comments in Chrome's source code describe how different browsers decide the media type of a response body for a given Content-Type.
When the MIME type is missing, or when the client believes the declared MIME type is wrong, the browser may perform MIME sniffing by looking at the bytes of the resource. Each browser does this differently and under different circumstances. Sniffing raises security concerns, because some MIME types represent executable content and others do not. Servers can send the X-Content-Type-Options response header to stop browsers from MIME sniffing, so that the declared Content-Type is used as is.
Figure 3: The response header carries the Content-Type information, and the response body carries the actual data
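To make the idea of sniffing concrete, here is a deliberately tiny sketch that guesses a type from the first bytes of the body. It is not Chrome's algorithm: real browsers cover far more cases, and unlike the text above this sketch sniffs only when the declared type is missing. The function name `sniffMimeType` and the short signature table are assumptions for illustration.

```javascript
// Tiny illustrative subset of MIME sniffing; real browsers handle many more cases.
const SIGNATURES = [
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { mime: "application/zip", bytes: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04"
];

// `body` is an array (or Uint8Array) holding the first bytes of the response.
function sniffMimeType(declaredType, body) {
  // Simplification: trust a declared type and sniff only when it is missing.
  if (declaredType) return declaredType.split(";")[0].trim().toLowerCase();

  for (const { mime, bytes } of SIGNATURES) {
    if (bytes.every((b, i) => body[i] === b)) return mime;
  }

  // Look for an HTML marker at the start of the text.
  const head = String.fromCharCode(...body.slice(0, 64)).trimStart().toLowerCase();
  if (head.startsWith("<!doctype html") || head.startsWith("<html")) return "text/html";

  return "application/octet-stream";
}
```

A body that starts with `%PDF-` is reported as `application/pdf` even without a Content-Type header, which is exactly the kind of guess that `X-Content-Type-Options: nosniff` is meant to switch off.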
If the response body is an HTML file, the browser hands the response data to the renderer process for further work. If the response data is a zip file or some other type of file, it is handed to the download manager instead.
Figure 4: The network thread asking whether the response data is an HTML file from a safe source
The network thread also performs a SafeBrowsing check before handing content to the renderer process. If the requested domain or the response content matches a known malicious site, the network thread displays a warning page to the user. In addition, the network thread performs a CORB (Cross-Origin Read Blocking) check to make sure that sensitive cross-site data is not sent to the renderer process.
Cross-Origin Read Blocking (hereinafter CORB) is not an HTTP header but part of the site isolation mechanism [6]. As mentioned above, site isolation lets different sites run in different processes, but this alone is not enough, because a malicious site can still legitimately request cross-origin resources. For example, a malicious site could use an img element to request a JSON file containing sensitive information such as a bank balance: <img src="https://your-bank.example/balance.json">
This JSON file ends up in the memory of the malicious site's renderer process. The renderer sees that it is not a valid image format and does not render it, but with the help of vulnerabilities such as Spectre, an attacker can still manage to read that memory and obtain the sensitive information.
CORB is used to prevent such access. If a response is blocked by CORB, it never even reaches the process that hosts the malicious site, which is stricter [8] than the opaque response described earlier [7] (which scripts cannot access but which can still appear in the renderer process).
CORB does not inspect the following two types of requests:
- Navigation requests and the various embedding requests made by cross-origin embedded elements. These embedded elements have their own independent security context, and with site isolation their data is kept in a different process from the malicious document's data, which is safe enough
- Download requests. The response data of such requests is stored directly on disk without passing through the context of a cross-origin document, so it does not need CORB protection
CORB inspects all other requests, including:
- XHR and fetch()
- ping and navigator.sendBeacon()
- <link rel="prefetch" ...>
- Requests for the following resources:
  - Image requests, such as <img> elements, a site's /favicon.ico, <image> in SVG, background-image in CSS, etc.
  - Script requests, such as <script>, importScripts(), navigator.serviceWorker.register(), audioWorklet.addModule(), etc.
  - Audio, video and subtitle requests
  - Font requests
  - Style requests
  - Report requests, such as CSP reports, NEL reports, etc.
The core idea of CORB is to ask whether a resource would be unusable in every one of the scenarios above. If, in all of them, the resource would produce a CORS error, a syntax or decoding error, or an opaque response, then CORB should block it from loading. In other words, CORB only goes further in blocking resources that would have been unusable anyway; resources that would have been usable (including cross-origin resources that correctly implement CORS) keep working as before, so CORB has almost no effect on compatibility.
Currently, CORB protects three types of content: JSON, HTML, and XML (protection here means preventing the response from reaching the malicious site's process).
The Fetch specification stipulates that when the request's cross-origin mode is no-cors [9]:
- A response that does not declare a Content-Type header is not protected by CORB
- If the response status code is 206 and the MIME type in the Content-Type header is HTML, JSON, or XML (except image/svg+xml), the response is protected by CORB
- If the response declares the X-Content-Type-Options: nosniff header and the MIME type in the Content-Type header is one of the three above or text/plain, the response is protected by CORB
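The three rules for no-cors requests can be read as a small decision function. The sketch below is an interpretation of those rules only, not Chromium's implementation, and it leaves out any content sniffing. The response shape (`status` plus a `headers` object with lowercase keys) and the function names are assumptions for this example.

```javascript
// Sketch of the three no-cors rules; not Chromium's implementation.

// HTML, JSON and XML MIME types are protected, except image/svg+xml.
function isProtectedType(type) {
  if (type === "image/svg+xml") return false;
  const isHtml = type === "text/html";
  const isJson = type === "application/json" || type === "text/json" || type.endsWith("+json");
  const isXml = type === "text/xml" || type === "application/xml" || type.endsWith("+xml");
  return isHtml || isJson || isXml;
}

// `response` is a hypothetical plain object: { status, headers } with lowercase header names.
function isBlockedByCorb(response) {
  const contentType = response.headers["content-type"];
  // Rule 1: no declared Content-Type, no protection.
  if (!contentType) return false;
  const type = contentType.split(";")[0].trim().toLowerCase();

  // Rule 2: a 206 response with a protected type is blocked.
  if (response.status === 206 && isProtectedType(type)) return true;

  // Rule 3: with nosniff, protected types and text/plain are blocked.
  const nosniff = (response.headers["x-content-type-options"] || "").toLowerCase() === "nosniff";
  return nosniff && (isProtectedType(type) || type === "text/plain");
}
```

Under these rules alone, a `text/html` response without `nosniff` and with status 200 is not blocked, which is the gap that an additional sniffing step would have to cover.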
Chromium also adds a sniffing mechanism that further checks whether the type declared in the Content-Type header is correct (responses that do not declare a Content-Type header are still not protected by CORB), which is more fine-grained than the Fetch specification [10]. Since the sniffing mechanism is not perfect, Google recommends that developers send the correct Content-Type header and declare the X-Content-Type-Options: nosniff header to avoid sniffing [11].
In addition, WHATWG members are discussing adding more content types to CORB, such as PDF and CSV.
Step 4: Find a renderer process
After the network thread has done all the checks and is confident that the browser should navigate to the requested site, it tells the UI thread that the data is ready. On receiving this confirmation, the UI thread finds a renderer process to render the page for this site.
Figure 5: The network thread tells the UI thread to find a renderer process to render the page
Since a network request may take several hundred milliseconds to complete, the browser applies an optimization in the earlier steps to shorten navigation. In step 2, when the UI thread sends the URL to the network thread, it already knows which site is being navigated to, so while the network thread is working, the UI thread proactively starts a renderer process for this request. If all goes well (no redirects or the like), the renderer process is already standing by when the network thread has the data ready, which saves the time of creating a new one. But if something unexpected happens, such as a redirect to a different site, the renderer process that was started cannot be used: it is discarded and a new renderer process is started.
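The speculative start and the discard on a cross-site redirect can be modeled in a few lines. This is a toy model under stated assumptions: `siteOf` uses a naive "scheme plus host" rule, which is simpler than how a browser actually determines a site, and `pickRenderer` and the returned objects are hypothetical names.

```javascript
// Toy model of the speculative renderer start; not browser code.

// Naive assumption: a "site" is the scheme plus the host name.
function siteOf(url) {
  const match = /^([a-z][a-z0-9+.-]*):\/\/([^\/:?#]+)/i.exec(url);
  if (!match) throw new Error("Not an absolute URL: " + url);
  return match[1].toLowerCase() + "://" + match[2].toLowerCase();
}

// `requestedUrl` is what the UI thread sent to the network thread;
// `finalUrl` is where the request ended up after any redirects.
function pickRenderer(requestedUrl, finalUrl) {
  // Started by the UI thread while the network request is still in flight.
  const speculative = { site: siteOf(requestedUrl) };

  if (siteOf(finalUrl) === speculative.site) {
    return { renderer: speculative, reusedSpeculative: true };
  }
  // Redirected to a different site: discard it and start a new process.
  return { renderer: { site: siteOf(finalUrl) }, reusedSpeculative: false };
}
```

When the final URL stays on the same site, the process that was started early is reused; otherwise the early start is wasted work, which is the trade-off the optimization accepts.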
Step 5: Commit Navigation
Now that the data and the renderer process are ready, the browser process tells the renderer process through IPC to commit the navigation. The browser process also passes on the response data stream it has been receiving, so that the renderer process can keep receiving the HTML data. Once the browser process receives confirmation from the renderer process that the navigation has been committed, the navigation is complete and the document loading phase begins.
At this point, the address bar is updated, and the security indicator and site settings UI show the site information of the new page. The session history of the tab is also updated, so that the browser's back and forward buttons step through the site you just navigated to. So that the tab and session can be restored after you close a tab or window, the session history is saved on disk.
Figure 6: The browser process asks the renderer process to render the page via IPC
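The session history update at commit time can be pictured as a list with a cursor. The class below is a minimal model made up for illustration, not how a browser actually stores history; the name `SessionHistory`, its methods, and the JSON format are all assumptions.

```javascript
// Minimal model of a tab's session history; an assumption for illustration.
class SessionHistory {
  constructor() {
    this.entries = [];
    this.index = -1;
  }

  // Called when a navigation commits: any forward entries are dropped.
  commit(url) {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index = this.entries.length - 1;
  }

  // Move one entry back, staying put at the oldest entry.
  back() {
    if (this.index > 0) this.index--;
    return this.entries[this.index];
  }

  // Move one entry forward, staying put at the newest entry.
  forward() {
    if (this.index < this.entries.length - 1) this.index++;
    return this.entries[this.index];
  }

  // Serialized form that could be written to disk for session restore.
  serialize() {
    return JSON.stringify({ entries: this.entries, index: this.index });
  }
}
```

Committing a new navigation after going back discards the forward entries, which matches the familiar behavior of the forward button becoming disabled.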
Extra step: Initial load complete
Once the navigation is committed, the renderer process begins loading resources and rendering the page. I'll cover the details of how the renderer process renders pages in a later article. Once the renderer process "finishes" rendering, it notifies the browser process via IPC (this happens after the onload event has fired on every frame of the page and the corresponding handlers have finished executing). The UI thread then stops the loading spinner on the tab.
I put "finishes" in quotes here because client-side JavaScript can still load additional resources and change the view after this point.
Figure 7: The rendering process tells the browser process that the page has finished loading via IPC
Navigate to different sites
That completes one of the simplest navigation scenarios! But what happens if the user then enters a different URL in the address bar? The browser process repeats the previous steps to navigate to the new site. Before it does so, however, it needs to let the currently rendered page wrap up, specifically by asking the current renderer process whether it needs to handle the beforeunload event.
beforeunload can show an "Are you sure you want to leave the current page?" confirmation dialog when the user navigates away or closes the current tab. The browser process has to check with the current renderer process when navigating away because everything that happens inside the page (including the page's JavaScript execution) is controlled by the renderer process rather than by the browser process, so the browser process does not know what is going on inside.
Note: Do not add beforeunload event listeners casually. The listener you define runs whenever the page is navigated away from, so it adds latency to that navigation. Add a beforeunload listener only when absolutely necessary, for example when the user has entered data on the page that would be lost along with the page.
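One way to follow the advice in the note is to register the listener only while there is unsaved data and to remove it again afterwards. The helper below is a sketch: `guardUnsavedChanges`, `markDirty` and `markClean` are made-up names, and `target` is expected to be `window` in a real page (it is a parameter here so the sketch can also run outside a browser).

```javascript
// Sketch: keep a beforeunload listener registered only while data is unsaved.
function guardUnsavedChanges(target) {
  let registered = false;

  const onBeforeUnload = (event) => {
    // Ask the browser to show its confirmation dialog.
    event.preventDefault();
    event.returnValue = ""; // legacy way to request the same dialog
  };

  return {
    // Call when the user has entered something that would be lost.
    markDirty() {
      if (!registered) {
        target.addEventListener("beforeunload", onBeforeUnload);
        registered = true;
      }
    },
    // Call after saving, so later navigations are not delayed.
    markClean() {
      if (registered) {
        target.removeEventListener("beforeunload", onBeforeUnload);
        registered = false;
      }
    },
  };
}
```

In a page this might be used as `const guard = guardUnsavedChanges(window)`, calling `guard.markDirty()` from an input handler and `guard.markClean()` after a successful save.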
Figure 8: The browser process tells the rendering process that it is going to leave the current page and navigate to a new page via IPC
What if the navigation is initiated from within the page, for example when the user clicks a link or client-side JavaScript runs code such as window.location = "https://newsite.com"? In this case, the renderer process first checks whether it has a registered beforeunload event listener and runs it if so. What happens afterwards is no different from the previous case; the only difference is that this time the navigation request is sent from the renderer process to the browser process.
If the new navigation goes to a different site, a separate renderer process is started to handle it, while the current renderer process is kept around to finish up work on the current page, such as running unload event listeners. The article Overview of page lifecycle states covers all the lifecycle states of a page, and the Page Lifecycle API shows how to listen for page state changes from within the page.
Figure 9: The browser process tells the new renderer to render the new page and tells the current renderer to do the finishing touches
Service Worker Scenario
A recent change to this navigation process is the introduction of service workers. Because a service worker lets developers write a network proxy for their site, they gain more control over network requests, such as deciding which data is cached locally and which must be fetched from the network. If the service worker is set up to load the current page from the cache, rendering the page needs no network request at all, which greatly speeds up the whole navigation.
The key thing to note here is that a service worker is just JavaScript code running in a renderer process. So the question is: when navigation starts, how does the browser process determine whether the site being navigated to has a service worker, and then start a renderer process to run it?
When a service worker is registered, its scope is recorded (you can learn more about scope in the article The Service Worker Lifecycle). When navigation starts, the network thread checks the requested domain against the scopes of the registered service workers. If a service worker is registered for that URL, the UI thread starts a renderer process for that service worker to execute its code. The service worker may then use previously cached data, or it may initiate a new network request.
Figure 10: After receiving a navigation task, the network thread looks for a matching service worker
Figure 11: The UI thread starts a renderer process to run the code of the service worker that was found; the code is executed by a worker thread in the renderer process
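The scope lookup can be sketched as a longest-prefix match of the navigation URL against the registered scopes. The `registrations` list here is a hypothetical in-memory structure, not the browser's real storage, and `findServiceWorker` is a made-up name.

```javascript
// Longest-prefix scope matching over a hypothetical list of registrations.
// Each registration is assumed to look like { scope, script }.
function findServiceWorker(registrations, navigationUrl) {
  let best = null;
  for (const registration of registrations) {
    // A registration applies when its scope is a prefix of the URL.
    if (!navigationUrl.startsWith(registration.scope)) continue;
    // Prefer the most specific (longest) matching scope.
    if (best === null || registration.scope.length > best.scope.length) {
      best = registration;
    }
  }
  return best; // null means no service worker handles this navigation
}
```

If both `https://mysite.com/` and `https://mysite.com/app/` are registered as scopes, a navigation to `https://mysite.com/app/page` is handled by the more specific registration.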
Navigation Preload
In the example above, you may have noticed that if the service worker ends up deciding to send a network request anyway, the round trip between the browser process and the renderer process, including the service worker's startup time, actually adds latency to the navigation. Navigation preload speeds up the navigation by loading the corresponding resources in parallel while the service worker starts up. Preload requests carry a special request header that lets the server decide whether to send completely new content to the client or only the updated data.
Figure 12: While starting a renderer process to run the service worker code, the UI thread also sends the network request in parallel
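The benefit of running the two steps in parallel can be shown with simple arithmetic. The model below is a rough sketch with made-up numbers: it assumes the service worker always goes to the network, and `navigationLatency`, `swStartupMs` and `networkMs` are hypothetical names.

```javascript
// Toy latency model; the timings passed in are made-up illustration values.
function navigationLatency({ swStartupMs, networkMs }, preloadEnabled) {
  return preloadEnabled
    ? Math.max(swStartupMs, networkMs) // startup and request run in parallel
    : swStartupMs + networkMs; // the request waits for the startup to finish
}
```

With a 50 ms startup and a 200 ms request, the model gives 250 ms without preload and 200 ms with it, so the saving is at most the startup time.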
Summary
In this article, we discussed what happens during navigation and some of the techniques browsers use to make navigation faster. In the next article, we will take a deeper look at how browsers parse our HTML/CSS/JavaScript to render the content of a web page.