This article is a translation of Onion's "Inside look at modern web browser" series by Mario Kosaka. It is not a literal translation: it expresses what the author means based on personal understanding, and adds some related content to help readers understand better.
What happens when navigating
This is part 2 of a 4-part blog series that explores the inner workings of Chrome. In the last article, we explored the high-level architecture of browsers and the benefits of a multi-process architecture. We also discussed technologies closely related to that architecture, such as servicification and site isolation. Next, we start digging into how these processes and threads work together to display a website.
Let's look at a simple web browsing example: you type a URL in the address bar of your browser and press Enter, and the browser then fetches the relevant data from the Internet and displays the web page. In this article, we focus on this simple scenario: requesting data from a website and the preparation the browser does before rendering the page, that is, the process of navigation.
Everything starts with the browser process
We mentioned in Part 1: CPU, GPU, Memory and Multi-Process Architecture that everything outside a browser tab is controlled by the browser process. The browser process has many threads (worker threads) responsible for different tasks, including the UI thread, which draws components such as the browser's top buttons and the address bar input box, the network thread, which manages network requests, and the storage thread, which controls file reading and writing. When you enter a URL in the address bar, it is the UI thread that handles your input.
Figure 1: The browser user interface at the top and a diagram of the browser process at the bottom, which contains the UI, network, and storage threads
A simple navigation
Step 1: Process the input
When the user starts typing in the address bar, the first thing the UI thread asks is "is the string you entered a search query or a URL?". In Chrome, the address bar input may be either a URL to request directly or a query that the user wants to send to a search engine. The UI thread therefore needs to parse the input and decide whether to send it to the search engine or to request the site you entered directly.
The UI thread asking whether the input string is a search keyword or a URL
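The decision described in Step 1 can be sketched as a small classifier. This is only an illustrative heuristic under simple assumptions, not Chrome's actual address bar logic; the function name `classifyInput` and the rules it applies are hypothetical.

```typescript
// Hypothetical heuristic; a real browser's address bar logic is far more involved.
type InputKind = "url" | "search";

function classifyInput(input: string): InputKind {
  const text = input.trim();
  // Empty input or input containing whitespace is treated as a search phrase.
  if (text.length === 0 || /\s/.test(text)) {
    return "search";
  }
  // An explicit scheme such as "https://" marks a URL.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(text)) {
    return "url";
  }
  // Otherwise accept host-like text: dot-separated labels, optional port and path.
  const hostLike = /^([a-z0-9-]+\.)+[a-z]{2,}(:\d+)?(\/.*)?$/i;
  return hostLike.test(text) ? "url" : "search";
}
```

With these rules, `classifyInput("example.com")` is treated as a URL, while `classifyInput("how do browsers work")` is treated as a search query.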
Step 2: Start Navigation
When the user presses Enter, the UI thread asks the network thread to initiate a network request to fetch the site's content. At this point the tab shows a loading spinner, and the network thread performs a series of operations for the request, such as DNS lookup and establishing a TLS connection.
The UI thread tells the network thread to navigate to the requested site
At this point, if the network thread receives an HTTP 301 redirect response from the server, it tells the UI thread that the server is requesting a redirect, and a new network request is then initiated for the new URL.
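The redirect handling in Step 2 amounts to a loop that re-issues the request until a non-redirect response arrives. The sketch below models this with an in-memory table instead of a real network; `FakeServer`, `FakeResponse`, `navigate`, and the redirect limit are all hypothetical names introduced for illustration.

```typescript
// Hypothetical stand-in for the network: maps a URL to its response.
interface FakeResponse {
  status: number;
  location?: string; // redirect target, if any
  body?: string;
}

type FakeServer = { [url: string]: FakeResponse | undefined };

interface NavigationResult {
  finalUrl: string;
  hops: string[]; // every URL requested, in order
  body: string;
}

function navigate(server: FakeServer, startUrl: string, maxRedirects: number): NavigationResult {
  const hops: string[] = [];
  let url = startUrl;
  for (let i = 0; i <= maxRedirects; i++) {
    hops.push(url);
    const response = server[url];
    if (response === undefined) {
      throw new Error("No response for " + url);
    }
    if (response.status === 301 && response.location !== undefined) {
      // The server asked for a redirect: start a new request for the target.
      url = response.location;
      continue;
    }
    return { finalUrl: url, hops: hops, body: response.body === undefined ? "" : response.body };
  }
  throw new Error("Too many redirects");
}
```

The redirect limit is a design choice of this sketch: it keeps a misconfigured server that redirects in a cycle from looping forever.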
Step 3: Read the response
When the network thread starts receiving the response body (payload) stream, it inspects the first few bytes of the stream if necessary to determine the media type (MIME type) of the response. The media type is normally given by the Content-Type header of the HTTP response, but Content-Type is sometimes missing or wrong. In that case the browser performs MIME type sniffing to determine the response type.
MIME type sniffing is not an easy task. The comments in Chrome's source code describe how different browsers decide, for different Content-Type values, which media type a response body belongs to.
If the MIME type is missing, or the client believes the declared MIME type is wrong, the browser may perform MIME sniffing by inspecting the resource. Each browser behaves differently in different situations. This operation carries security risks, because some MIME types represent executable content and others do not. Servers can prevent MIME sniffing by sending the X-Content-Type-Options: nosniff response header.
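As a rough illustration of the idea, the sketch below only sniffs when the declared type is missing or generic and nosniff is not set, and then looks at a few well-known leading byte signatures. It is a heavy simplification and not the algorithm any browser actually uses; `resolveMimeType`, `GENERIC_TYPES`, and the choice of signatures are assumptions made for this example.

```typescript
// Simplified sketch; real browsers check many more signatures and edge cases.
const GENERIC_TYPES: string[] = ["text/plain", "application/octet-stream"];

function startsWithBytes(data: number[], signature: number[]): boolean {
  if (data.length < signature.length) {
    return false;
  }
  for (let i = 0; i < signature.length; i++) {
    if (data[i] !== signature[i]) {
      return false;
    }
  }
  return true;
}

function resolveMimeType(declared: string | null, nosniff: boolean, firstBytes: number[]): string {
  const fallback = declared === null ? "application/octet-stream" : declared;
  // Assumption of this sketch: only a missing or generic type is considered suspect.
  const mayBeWrong = declared === null || GENERIC_TYPES.indexOf(declared) !== -1;
  if (!mayBeWrong || nosniff) {
    return fallback; // trust the header, or sniffing is disabled
  }
  if (startsWithBytes(firstBytes, [0x89, 0x50, 0x4e, 0x47])) {
    return "image/png"; // "\x89PNG"
  }
  if (startsWithBytes(firstBytes, [0x50, 0x4b, 0x03, 0x04])) {
    return "application/zip"; // "PK\x03\x04"
  }
  if (startsWithBytes(firstBytes, [0x3c, 0x21, 0x44, 0x4f])) {
    return "text/html"; // "<!DO", as in "<!DOCTYPE"
  }
  return fallback; // nothing recognized
}
```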
Figure 3: The response header contains Content-Type information, while the response body contains the actual data
If the response body is an HTML file, the browser hands the response data to the renderer process for further work. If the response is a zip file or some other type of file, the data is handed to the download manager for processing.
Figure 4: The network thread is asking if the response data is from a secure source
Before handing content to the renderer process, the network thread also performs a SafeBrowsing check. If the requested domain or the response content matches a known malicious site, the network thread displays a warning page to the user. In addition, the network thread performs a CORB (Cross Origin Read Blocking) check to make sure that sensitive cross-site data is not sent to the renderer process.
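The routing described in Step 3 can be summarized in a few lines. The sketch below is a toy model: the blocklist array stands in for the SafeBrowsing service, and `routeResponse`, `ResponseInfo`, and `Destination` are hypothetical names.

```typescript
type Destination = "renderer" | "download-manager" | "warning-page";

interface ResponseInfo {
  host: string;
  mimeType: string;
}

// Hypothetical blocklist lookup; the real SafeBrowsing check is not modeled here.
function routeResponse(response: ResponseInfo, unsafeHosts: string[]): Destination {
  if (unsafeHosts.indexOf(response.host) !== -1) {
    return "warning-page"; // known malicious site: warn the user instead
  }
  if (response.mimeType === "text/html") {
    return "renderer"; // HTML goes to the renderer process
  }
  return "download-manager"; // zip and other file types are downloaded
}
```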
Cross-Origin Read Blocking (hereinafter CORB) is not an HTTP header but part of the site isolation mechanism. As mentioned above, site isolation lets different sites run in different processes, but this alone is not enough, because a malicious site can still legitimately request cross-origin resources. For example, a malicious website could use an img element to request a JSON file containing sensitive information such as a bank balance. The JSON file would then end up in the memory of the malicious site's renderer process. The renderer sees that it is not a valid image format and does not render it, but with the help of vulnerabilities like Spectre, an attacker could still read that part of memory and obtain the sensitive information.
CORB is used to prevent such access. If a response is blocked by CORB, it never even reaches the process hosting the malicious site, which is stricter than the opaque response described earlier (inaccessible to script, but still present in the renderer process).
CORB does not inspect the following two types of requests:
- Navigation requests and the various embedded requests, such as cross-origin embedded documents. These embedded elements have their own independent security context. With site isolation, their data and the malicious document's data live in different processes, which is safe enough.
- Download requests. The response data of such requests is stored directly on disk without passing through the context of a cross-origin document, so it does not need CORB protection.
CORB inspects all other requests, including:
- <link rel="prefetch" ...>
- Requests for the following resources:
  - Script requests
  - Audio, video, and subtitle requests
  - Font requests
  - Style requests
  - Report requests
The core idea of CORB is to consider whether a resource would be unusable in all of the above scenarios. If in every one of them the resource would produce a CORS error, a syntax or decoding error, or an opaque response, then CORB should block the resource from loading. In other words, CORB only blocks resources that would have been unusable anyway, while resources that would be usable continue to work as usual (including cross-origin resources that correctly implement CORS), so CORB has almost no effect on compatibility.
CORB protects 3 types of content: JSON, HTML, and XML (protection here means preventing the response from reaching the process of the malicious site).
The Fetch specification stipulates that when the request's cross-origin mode is […], a response without a Content-Type header is not protected by CORB, and a response whose type is one of the three above (image/svg+xml excepted) is protected by CORB in the following cases:
- If the response declares the X-Content-Type-Options: nosniff header, and the type determined by the Content-Type header is text/plain or one of the above three, or the response is […]
Chromium also added a sniffing mechanism to further check whether the type declared in the Content-Type header is correct (responses that do not declare a Content-Type header are still not protected by CORB), which is more detailed than the Fetch specification. Since the sniffing mechanism is not perfect, Google recommends that developers send the correct Content-Type header and declare the X-Content-Type-Options: nosniff header to avoid sniffing.
WHATWG members are discussing adding more content types to CORB.
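The rules in these notes can be condensed into a small decision function. This is a sketch of the description above rather than the Fetch specification's or Chromium's actual algorithm: it assumes the request is one of the kinds CORB inspects, it reports "needs-sniffing" where Chromium would go on to sniff the body, and the names `corbDecision` and `isProtectedType` as well as the exact MIME lists are assumptions.

```typescript
type CorbDecision = "blocked" | "allowed" | "needs-sniffing";

// JSON, HTML, and XML types are protected; image/svg+xml is the stated exception.
function isProtectedType(mime: string): boolean {
  if (mime === "image/svg+xml") {
    return false;
  }
  if (mime === "text/html") {
    return true;
  }
  if (mime === "application/json" || mime === "text/json" || /\+json$/.test(mime)) {
    return true;
  }
  if (mime === "text/xml" || mime === "application/xml" || /\+xml$/.test(mime)) {
    return true;
  }
  return false;
}

function corbDecision(contentType: string | null, nosniff: boolean): CorbDecision {
  // Responses without a Content-Type header are not protected.
  if (contentType === null) {
    return "allowed";
  }
  // Drop parameters such as "; charset=utf-8".
  const semicolon = contentType.indexOf(";");
  const raw = semicolon === -1 ? contentType : contentType.substring(0, semicolon);
  const mime = raw.trim().toLowerCase();
  if (!isProtectedType(mime) && mime !== "text/plain") {
    return "allowed";
  }
  // With nosniff the declared type is final; otherwise the body must be sniffed.
  return nosniff ? "blocked" : "needs-sniffing";
}
```

This also shows why the recommendation to send a correct Content-Type together with nosniff matters: it is the only path in this sketch that reaches a decision without inspecting the body.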
Step 4: Find a renderer process
After the network thread has completed all checks and is confident that the browser should navigate to the requested site, it tells the UI thread that the data is ready. After receiving this confirmation, the UI thread finds a renderer process for the site to render the page.
Figure 5: The network thread tells the UI thread to find a renderer process to render the page
Since a network request may take several hundred milliseconds to complete, the browser applies an optimization to shorten navigation time. In step 2, when the UI thread sends the URL to the network thread, it already knows which site is being navigated to, so while the network thread works, the UI thread proactively starts a renderer process in parallel. If all goes well (no redirects or the like), the renderer process is already on standby when the network thread has the data ready, which saves the time of creating one. But if the navigation is redirected to a different site, the pre-started renderer process cannot be used; it is discarded and a new renderer process is started.
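Whether the pre-started renderer process can be kept comes down to comparing the site that was requested with the site that finally responded. In the sketch below, a "site" is simplified to scheme plus host, which is coarser than the definition browsers actually use; `siteOf` and `canReuseSpeculativeRenderer` are hypothetical helpers.

```typescript
// Hypothetical simplification: treat scheme + host as the "site".
function siteOf(url: string): string {
  const schemeEnd = url.indexOf("://");
  if (schemeEnd === -1) {
    throw new Error("Not an absolute URL: " + url);
  }
  const rest = url.substring(schemeEnd + 3);
  const hostEnd = rest.search(/[\/?#]/);
  const host = hostEnd === -1 ? rest : rest.substring(0, hostEnd);
  return (url.substring(0, schemeEnd) + "://" + host).toLowerCase();
}

// The renderer started at request time is only usable if the navigation
// ends on the same site it started on (for example, no cross-site redirect).
function canReuseSpeculativeRenderer(requestedUrl: string, finalUrl: string): boolean {
  return siteOf(requestedUrl) === siteOf(finalUrl);
}
```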
Step 5: Commit Navigation
At this point the data and the renderer process are both ready, and the browser process tells the renderer process, via IPC, to commit the navigation. The browser process also passes along the response data stream so that the renderer process can keep receiving the incoming HTML data. Once the browser process receives confirmation from the renderer process that the navigation has been committed, the navigation is complete and the document loading phase begins.
At this point, the address bar is updated, and the security indicator and site settings UI show information about the new page. The session history of the current tab is also updated, so that the browser's back and forward buttons can navigate to the page you just visited. To let you restore tabs and sessions after closing a tab or window, the session history is saved on disk.
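The session history behaviour described here can be modeled as a list of entries plus a cursor. The class below is a hypothetical in-memory model for illustration only; a real browser stores far more per entry than a URL.

```typescript
// Hypothetical in-memory model of one tab's session history.
class SessionHistory {
  private entries: string[] = [];
  private index: number = -1;

  // Record a committed navigation; any "forward" entries are discarded.
  commit(url: string): void {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index = this.entries.length - 1;
  }

  // Move one entry back; returns the URL to load, or null if there is none.
  back(): string | null {
    if (this.index <= 0) {
      return null;
    }
    this.index -= 1;
    return this.entries[this.index];
  }

  // Move one entry forward; returns the URL to load, or null if there is none.
  forward(): string | null {
    if (this.index >= this.entries.length - 1) {
      return null;
    }
    this.index += 1;
    return this.entries[this.index];
  }

  // Text form that could be written to disk for later restoration.
  serialize(): string {
    return JSON.stringify({ entries: this.entries, index: this.index });
  }
}
```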
Figure 6: The browser process asks the renderer process to render the page via IPC
Extra step: Initial load complete
Once the navigation is committed, the renderer process carries on loading resources and rendering the page. I'll cover the details of how the renderer process renders pages in a later article. Once the renderer process "finishes" rendering, it notifies the browser process via IPC (note that this happens after the onload events of all frames on the page have fired and their handlers have finished executing), and the UI thread then stops the loading spinner on the tab.
I say "finishes" here because client-side JavaScript could still load additional resources and render new views after this point.
Figure 7: The renderer process tells the browser process via IPC that the page has finished loading
Navigate to different sites
The simplest navigation scenario is now complete! But what happens if the user enters a different URL in the address bar at this point? The browser process repeats the steps above to navigate to the new site. Before doing so, however, it needs to check with the currently rendered page, specifically asking whether the current renderer process needs to handle the beforeunload event.
beforeunload lets a page show a confirmation dialog ("Are you sure you want to leave the current page?") when the user navigates away or closes the tab. The browser process must check with the renderer process because everything that happens inside the tab, including the page's JavaScript execution, is handled by the renderer process rather than the browser process, so the browser process does not know what is going on inside.
Note: Do not add beforeunload event listeners casually. The listener you define runs before the page navigates away, which increases navigation latency. Add a beforeunload listener only when absolutely necessary, for example when the user has entered data on the page that would be lost if the page went away.
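One way to follow this advice is to attach the listener only while there is unsaved data and to remove it again once the data is saved. The sketch below uses a minimal `WindowLike` interface so that it can run outside a browser; that interface and `createUnsavedChangesGuard` are hypothetical names, and in a real page the target would be the global `window` object.

```typescript
// Minimal shapes standing in for the browser's event and window objects.
interface UnloadEventLike {
  preventDefault(): void;
}
type UnloadListener = (event: UnloadEventLike) => void;
interface WindowLike {
  addEventListener(type: "beforeunload", listener: UnloadListener): void;
  removeEventListener(type: "beforeunload", listener: UnloadListener): void;
}

function createUnsavedChangesGuard(target: WindowLike) {
  let attached = false;
  // Calling preventDefault() asks the browser to show its confirmation dialog.
  const listener: UnloadListener = (event) => {
    event.preventDefault();
  };
  return {
    // Call with true when the user has unsaved input, false once it is saved.
    setDirty(dirty: boolean): void {
      if (dirty && !attached) {
        target.addEventListener("beforeunload", listener);
        attached = true;
      } else if (!dirty && attached) {
        target.removeEventListener("beforeunload", listener);
        attached = false;
      }
    }
  };
}
```

Keeping the listener detached by default means ordinary navigations away from the page never pay for it.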
Figure 8: The browser process tells the renderer process via IPC that it is about to leave the current page and navigate to a new one
If the navigation is initiated from the renderer process, the renderer process first checks whether there is a beforeunload event listener and executes it if there is one. What happens afterwards is no different from the previous case. The only difference is that this time the navigation request is sent from the renderer process to the browser process.
If the new navigation goes to a different site, a separate renderer process is started to handle it, while the current renderer process handles the finishing work on the current page, such as running unload event listeners. The article Overview of page lifecycle states covers all of a page's lifecycle states, and the Page Lifecycle API shows how to observe page state changes.
Figure 9: The browser process tells the new renderer to render the new page and tells the current renderer to do the finishing touches
Service Worker Scenario
A recent change to this navigation process is the introduction of service workers. Because a service worker lets developers write a network proxy for their site, they gain more control over network requests, such as deciding which data to cache locally and which to fetch from the network. If the service worker is set to serve the page from the cache, rendering the page needs no network request at all, which greatly speeds up navigation.
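The key thing to note here is that a service worker is just JavaScript code that runs in a renderer process. So when a navigation request comes in, how does the browser process know that the site has a service worker, and start a renderer process to execute it?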
When a service worker is registered, its scope is recorded (you can learn more about scope in the article The Service Worker Lifecycle). When a navigation starts, the network thread checks the requested domain name against the scopes of registered service workers. If a matching service worker is found, the UI thread starts a renderer process to execute its code. The service worker may use previously cached data or initiate a new network request.
Figure 10: After receiving a navigation task, the network thread looks for a corresponding service worker
The UI thread starts a renderer process to run the service worker code that was found; the code is executed by a worker thread in the renderer process
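The scope lookup can be pictured as a prefix match of the navigation URL against the registered scopes, with the longest matching scope winning. The sketch below shows only that matching step; `findServiceWorkerScope` and the plain string comparison are simplifications made for this example.

```typescript
// Returns the registered scope that best matches the URL, or null if none does.
function findServiceWorkerScope(registeredScopes: string[], url: string): string | null {
  let best: string | null = null;
  for (let i = 0; i < registeredScopes.length; i++) {
    const scope = registeredScopes[i];
    const matches = url.indexOf(scope) === 0; // the scope must be a prefix of the URL
    if (matches && (best === null || scope.length > best.length)) {
      best = scope; // prefer the most specific (longest) scope
    }
  }
  return best;
}
```

A null result corresponds to the ordinary navigation path, where no service worker is involved.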
Navigation Preload
In the example above, you may have noticed that if the service worker ends up deciding to send a network request anyway, the round trip between the browser process and the renderer process, plus the service worker startup time, adds latency to the navigation. Navigation preload is a mechanism that speeds this up by loading the resource in parallel while the service worker starts. The preload request carries a special header that lets the server decide whether to send the full content or only updated data to the client.
The UI thread starts a renderer process to run the service worker code while the network request is sent in parallel
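The benefit of navigation preload can be seen with a toy latency model: without preload the service worker startup and the network request happen one after the other, while with preload they overlap. The function and the numbers used with it are purely illustrative assumptions, not measurements.

```typescript
// Toy model: time until the response is available, in milliseconds.
function navigationLatency(workerStartupMs: number, networkMs: number, preload: boolean): number {
  if (preload) {
    // Startup and request run in parallel, so the slower one dominates.
    return Math.max(workerStartupMs, networkMs);
  }
  // The request is only sent after the service worker has started.
  return workerStartupMs + networkMs;
}
```

For example, with a hypothetical 50 ms startup and a 200 ms request, the model gives 250 ms without preload and 200 ms with it.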
In this article, we discussed what happens during navigation and some of the techniques browsers use to make navigation faster. In the next article, we will take a deeper look at how browsers parse us