Foreword

This article is a translation of the "Inside look at modern web browser" series by Mariko Kosaka. It is not a literal translation: I try to convey what the author means based on my own understanding, and I add related material where it helps readers understand better.

What happens when navigating

This is part 2 of a 4-part blog series that explores the inner workings of Chrome. In the previous article, we explored the high-level architecture of browsers and the benefits of a multi-process architecture. We also discussed technologies closely related to that architecture, such as servicification and site isolation. Next, we start digging into how these processes and threads work together to render our web pages.

Let's look at a simple web browsing example: you type a URL into your browser's address bar and press Enter, and the browser fetches the relevant data from the Internet and displays the page. In this article, we focus on this simple scenario of requesting data from a site and the preparation the browser does before rendering the page, that is, the process of navigation.

Everything starts with the browser process

We mentioned in Part 1: CPU, GPU, Memory and Multi-Process Architecture that everything outside a browser tab is controlled by the browser process. The browser process has many worker threads responsible for different tasks, including the UI thread, which draws components such as the browser's top buttons and the address bar input field, the network thread, which manages network requests, and the storage thread, which controls file reads and writes. When you enter a URL in the address bar, it is the UI thread that handles your input.

Browser processes

Figure 1: The browser UI is at the top, and a diagram of the browser process, containing the UI, network, and storage threads, is at the bottom

A simple navigation

Step 1: Process the input

When the user starts typing in the address bar, the first thing the UI thread asks is: "Is the input a search query or a URL?" In Chrome, the address bar is also a search box, so the input may be either a URL to request directly or keywords the user wants to look up in a search engine such as Google. The UI thread therefore needs to parse the input and decide whether to send it to the search engine or to request the site the user entered.

Handling user input
Figure 1: The UI thread asks whether the input string is a search query or a URL
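
The decision described above can be sketched as a small classifier. This is only an illustration under simplifying assumptions, not Chrome's actual address bar logic: the names classifyInput and tryParseUrl, the heuristics, and the search endpoint search.example are all hypothetical.

```javascript
// Hypothetical search endpoint, used only for illustration.
const SEARCH_URL = "https://search.example/search?q=";

// Returns the normalized URL string, or null if the candidate does not parse.
function tryParseUrl(candidate) {
  try {
    return new URL(candidate).href;
  } catch {
    return null;
  }
}

// Assumed heuristics: whitespace means a query; an explicit http(s) scheme or
// a bare host name such as "mysite.com" means a URL; anything else is a query.
function classifyInput(text) {
  const input = text.trim();
  const asSearch = { kind: "search", target: SEARCH_URL + encodeURIComponent(input) };
  if (input === "" || /\s/.test(input)) return asSearch;

  const hasScheme = /^https?:\/\//i.test(input);
  const looksLikeHost = /^(localhost|[a-z0-9-]+(\.[a-z0-9-]+)+)(:\d+)?(\/.*)?$/i.test(input);
  if (!hasScheme && !looksLikeHost) return asSearch;

  // Bare host names get a default scheme before parsing.
  const url = tryParseUrl(hasScheme ? input : "https://" + input);
  return url ? { kind: "url", target: url } : asSearch;
}
```

With these assumptions, classifyInput("mysite.com") is treated as a URL, while classifyInput("how browsers work") is sent to the search endpoint.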

Step 2: Start Navigation

When the user presses Enter, the UI thread calls the network thread to initiate a network request for the site's content. The tab shows a loading spinner to indicate that the resource is loading, and the network thread goes through the steps the request needs, such as DNS lookup and establishing a TLS connection.

Navigation start
Figure 2: The UI thread tells the network thread to navigate to mysite.com

At this point, if the network thread receives a redirect response such as HTTP 301 from the server, it tells the UI thread that the server is requesting a redirect, and then initiates a new network request.
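
The redirect handling can be sketched as a loop that keeps issuing requests until a non-redirect response arrives. This is a simplified model, not browser code: followRedirects is a hypothetical name, respond is a stand-in for the real network stack (it maps a URL to an object with status and location fields), and the cap of 20 redirects is an assumption chosen for the sketch.

```javascript
// HTTP status codes that ask the client to request a different URL.
const REDIRECT_STATUSES = new Set([301, 302, 303, 307, 308]);

// `respond` is a hypothetical synchronous stand-in for the network stack.
function followRedirects(startUrl, respond, maxRedirects = 20) {
  const chain = [startUrl];
  let current = startUrl;
  for (;;) {
    const response = respond(current);
    const isRedirect = REDIRECT_STATUSES.has(response.status) && response.location;
    if (!isRedirect) {
      return { finalUrl: current, chain, response };
    }
    if (chain.length > maxRedirects) {
      throw new Error("Too many redirects");
    }
    // Location may be relative, so resolve it against the current URL.
    current = new URL(response.location, current).href;
    chain.push(current);
  }
}
```

The returned chain records every URL that was requested, which is what lets a later step notice that the navigation ended up on a different site than the one it started with.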

Step 3: Read the response

When the network thread starts receiving the stream of the HTTP response body (payload), it inspects the first few bytes of the stream if necessary to determine the media type (MIME type) of the response body. The media type can generally be determined from the Content-Type header of the HTTP response, but Content-Type is sometimes missing or wrong, in which case the browser performs MIME type sniffing to determine the response type. MIME type sniffing is not easy: the comments in Chrome's source code describe how different browsers map Content-Type values and payloads to media types.

When the MIME type is missing, or the client believes the declared MIME type is wrong, the browser may perform MIME sniffing by inspecting the resource's bytes. Each browser behaves differently in different situations. This operation has security implications, because some MIME types represent executable content and others do not. Servers can prevent MIME sniffing by sending the X-Content-Type-Options response header alongside Content-Type.

HTTP response
Figure 3: The response header carries the Content-Type information, while the response body carries the actual data
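
To make the idea of sniffing concrete, here is a minimal sketch that trusts a declared Content-Type and otherwise looks at the leading bytes. It is not Chrome's algorithm: the function names are hypothetical, only a handful of well-known byte signatures are checked, and, unlike real browsers, the sketch sniffs only when the header is missing, never when it merely looks wrong.

```javascript
// A few well-known byte signatures (magic numbers).
const SIGNATURES = [
  { type: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { type: "image/gif", bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { type: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { type: "application/zip", bytes: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04"
];

// Guess a type from the first bytes of the body (a Uint8Array).
function sniffBytes(bytes) {
  for (const { type, bytes: signature } of SIGNATURES) {
    if (signature.every((byte, i) => bytes[i] === byte)) return type;
  }
  // Skip leading whitespace, then look for common HTML openers.
  const head = String.fromCharCode(...bytes.subarray(0, 64)).replace(/^[\t\n\f\r ]+/, "");
  if (/^(<!doctype html|<html|<head|<body)/i.test(head)) return "text/html";
  return "application/octet-stream";
}

// Simplification: a declared type is always trusted; sniffing happens only
// when Content-Type is missing and nosniff is not set.
function resolveMimeType(contentType, bytes, nosniff = false) {
  const declared = contentType ? contentType.split(";")[0].trim().toLowerCase() : "";
  if (declared) return declared;
  if (nosniff) return "application/octet-stream";
  return sniffBytes(bytes);
}
```

The nosniff parameter models a response that carries X-Content-Type-Options: nosniff, in which case the sketch refuses to guess.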

If the response body is an HTML file, the browser hands the response data to the renderer process for further work. If the response is a compressed file (zip file) or some other type of file, the data is handed to the download manager instead.

MIME type sniffing
Figure 4: The network thread asks whether the response data is an HTML file from a safe source

The network thread also performs a SafeBrowsing check before handing the content to the renderer process. If the requested domain or the response data matches a known malicious site, the network thread displays a warning page to the user. In addition, the network thread performs a CORB (Cross-Origin Read Blocking) check to make sure that sensitive cross-site data does not reach the renderer process.
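
The hand-off decisions in this step can be summarized in a few lines. This is a rough sketch under stated assumptions: dispatchResponse is a hypothetical name, the blocklist is a plain set of host names (the real SafeBrowsing service is far more elaborate), and the set of types sent to the renderer is deliberately tiny.

```javascript
// Hypothetical blocklist of known malicious hosts, for illustration only.
const KNOWN_BAD_HOSTS = new Set(["malware.example"]);

// Assumed set of types handed to the renderer; real browsers render many more.
const RENDERABLE_TYPES = new Set(["text/html", "application/xhtml+xml"]);

// Decide where a response goes once its media type is known.
function dispatchResponse({ url, mimeType }) {
  // The safety check comes first: a flagged host never reaches a renderer.
  if (KNOWN_BAD_HOSTS.has(new URL(url).hostname)) return "warning-page";
  return RENDERABLE_TYPES.has(mimeType) ? "renderer-process" : "download-manager";
}
```

For example, an application/zip response from an unflagged host is routed to the download manager, while any response from malware.example produces the warning page.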

Cross-Origin Read Blocking (hereinafter CORB) is not an HTTP header but a part of the site isolation mechanism [6]. As mentioned above, site isolation lets different sites run in different processes, but this alone is not enough, because a malicious site can still legitimately request cross-origin resources. For example, a malicious site could use an img element to request a JSON file containing sensitive information such as a bank balance:

 <img src="https://your-bank.example/balance.json">

This JSON file ends up in the memory of the malicious site's renderer process. The renderer sees that it is not a valid image format and does not render it, but with the help of a vulnerability such as Spectre, an attacker can still manage to read that part of memory and obtain the sensitive information.

CORB is designed to prevent such access. If a response is blocked by CORB, it never even reaches the process where the malicious site runs, which is stricter [8] than the opaque response described earlier [7] (which scripts cannot read, but which can still appear in the renderer process).

CORB does not inspect the following two types of requests:

  • Navigation requests and various embedding requests, such as cross-origin <iframe>, <object>, and <embed>. These embedded elements have their own independent security context, and with site isolation their data is kept in a different process from the malicious document's data, which is safe enough
  • Download requests. The response data of such requests is saved directly to disk without passing through the context of a cross-origin document, so it does not need CORB protection

CORB inspects all remaining requests, including:

  • XHR and fetch()
  • ping, navigator.sendBeacon()
  • <link rel="prefetch" ...>
  • Requests for the following resources:
    • Image requests, such as the <img> element, a site's /favicon.ico, <image> in SVG, background-image in CSS, etc.
    • Script requests, such as <script>, importScripts(), navigator.serviceWorker.register(), audioWorklet.addModule(), etc.
    • Audio, video, and subtitle requests
    • Font requests
    • Style requests
    • Report requests, such as CSP reports, NEL reports, etc.

The core idea of CORB is to ask whether a resource is unusable in every one of the scenarios above. If, in all of these scenarios, the resource would produce a CORS error, a syntax or decoding error, or an opaque response, then CORB should block it from loading. In other words, CORB only goes further in blocking resources that were already unusable, while usable resources (including cross-origin resources that correctly implement CORS) work as usual, so CORB has almost no impact on compatibility.

Currently, CORB protects three types of content: JSON, HTML, and XML (protection here means preventing the response from reaching the malicious site's process).

The Fetch specification stipulates that, when the request mode is no-cors [9]:

  • A response that does not declare a Content-Type header is not protected by CORB
  • If the response status code is 206 and the MIME type in the Content-Type header is HTML, JSON, or XML (except image/svg+xml), the response is protected by CORB
  • If the response declares the X-Content-Type-Options: nosniff header and the MIME type in the Content-Type header is one of the three types above or text/plain, the response is protected by CORB
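
The three rules above translate almost directly into code. The sketch below covers only these rules for a no-cors request; it is not Chromium's implementation, which adds its own sniffing step. The names corbCheck and isProtectedType are hypothetical, and the recognition of JSON and XML types is simplified to a few common spellings.

```javascript
// HTML, JSON, or XML MIME types, with image/svg+xml explicitly excluded.
function isProtectedType(essence) {
  if (essence === "image/svg+xml") return false;
  const isHtml = essence === "text/html";
  const isJson =
    essence === "application/json" || essence === "text/json" || essence.endsWith("+json");
  const isXml =
    essence === "application/xml" || essence === "text/xml" || essence.endsWith("+xml");
  return isHtml || isJson || isXml;
}

// Returns "blocked" when the response must not reach the renderer process.
function corbCheck({ status, contentType, nosniff = false }) {
  // Rule 1: no declared Content-Type means no CORB protection.
  if (!contentType) return "allowed";
  const essence = contentType.split(";")[0].trim().toLowerCase();
  // Rule 2: partial (206) responses with a protected type are blocked.
  if (status === 206 && isProtectedType(essence)) return "blocked";
  // Rule 3: with nosniff, protected types and text/plain are blocked.
  if (nosniff && (isProtectedType(essence) || essence === "text/plain")) return "blocked";
  return "allowed";
}
```

Under these rules alone, a plain 200 response labeled application/json without nosniff is allowed through, which is the gap that a browser's additional sniffing is meant to narrow.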

Chromium additionally applies a sniffing step to check whether the type declared in the Content-Type header is correct (responses that do not declare a Content-Type header are still not protected by CORB), which is more fine-grained than the Fetch specification [10]. Since the sniffing is not perfect, Google recommends that developers send the correct Content-Type header together with X-Content-Type-Options: nosniff to avoid sniffing [11].

In addition, WHATWG members are discussing adding more content types to CORB, such as PDF and CSV.

Overview of Cross-Origin Related Mechanisms (4): Spectre Attack and Improvement of Cross-Origin Mechanisms

Step 4: Find a renderer process

After the network thread has completed all the checks and is confident that the browser should navigate to the requested site, it tells the UI thread that the data is ready. On receiving this confirmation, the UI thread finds a renderer process to render the page for this site.

Finding a renderer process
Figure 5: The network thread tells the UI thread to find a renderer process to render the page

Since a network request may take several hundred milliseconds to get a response, the browser applies an optimization to shorten navigation time. In step 2, when the UI thread sends the URL to the network thread, it already knows which site the navigation is heading to, so while the network thread is working, the UI thread proactively starts a renderer process for this request. If all goes well (no redirects or the like), the renderer process is already standing by when the network thread has the data ready, which saves the time of creating one. But if something unexpected happens, such as a redirect to a different site, the renderer process started earlier cannot be used: it is discarded and a new renderer process is started.
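
The optimization can be modeled as "start early, then keep or discard". The sketch below is a toy model rather than browser code: createNavigator and siteOf are hypothetical names, spawnRenderer is a caller-supplied stand-in for process creation that returns an object with a site field, and a "site" is approximated by scheme plus host, whereas real site isolation groups by scheme plus registrable domain.

```javascript
// Simplification: treats scheme + host as the "site".
function siteOf(url) {
  const { protocol, hostname } = new URL(url);
  return `${protocol}//${hostname}`;
}

// `spawnRenderer(site)` is a hypothetical factory returning { id, site }.
function createNavigator(spawnRenderer) {
  return {
    // Step 2: start a renderer in parallel with the network request.
    start(requestedUrl) {
      return { requestedUrl, early: spawnRenderer(siteOf(requestedUrl)) };
    },
    // Step 4: the response has arrived, possibly after redirects.
    finish(navigation, finalUrl) {
      const finalSite = siteOf(finalUrl);
      if (navigation.early.site === finalSite) {
        // Same site: the renderer started earlier is reused as is.
        return { renderer: navigation.early, discarded: null };
      }
      // Different site: discard the early renderer and start a new one.
      return { renderer: spawnRenderer(finalSite), discarded: navigation.early };
    },
  };
}
```

In the common case finish returns the renderer that was started in start, so process creation overlaps with the network wait instead of following it.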

Step 5: Commit Navigation

Now that the data and the renderer process are ready, the browser process tells the renderer process via IPC to commit the navigation. The browser process also passes along the response data stream it has been receiving, so that the renderer process can keep receiving the incoming HTML data. Once the browser process hears back from the renderer process that the navigation has been committed, the navigation is complete and the document loading phase begins.

At this point, the address bar is updated, and the security indicator and site settings UI show information about the new page. The session history of the current tab is also updated, so that the browser's back and forward buttons can step through the page you just navigated to. So that the tab and session can be restored after you close a tab or window, the session history is saved on disk.

Commit the navigation
Figure 6: The browser process asks the renderer process via IPC to render the page
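
The session history update can be pictured as a list of entries plus a cursor. The class below is a toy model with hypothetical names, not the browser's data structure; serialize and restore stand in for saving the history to disk and loading it back.

```javascript
class SessionHistory {
  constructor(entries = [], index = -1) {
    this.entries = entries;
    this.index = index;
  }

  // Committing a navigation drops any forward entries and appends the URL.
  commit(url) {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index = this.entries.length - 1;
  }

  get current() {
    return this.entries[this.index];
  }

  back() {
    if (this.index > 0) this.index -= 1;
    return this.current;
  }

  forward() {
    if (this.index < this.entries.length - 1) this.index += 1;
    return this.current;
  }

  // Stand-ins for persisting the history so a closed tab can be restored.
  serialize() {
    return JSON.stringify({ entries: this.entries, index: this.index });
  }

  static restore(json) {
    const { entries, index } = JSON.parse(json);
    return new SessionHistory(entries, index);
  }
}
```

Committing after going back discards the forward entries, which matches what the back and forward buttons do in practice.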

Extra step: Initial load complete

Once the navigation is committed, the renderer process carries on loading resources and rendering the page. I'll cover the specifics of how the renderer process renders pages in a later article. Once the renderer process "finishes" rendering, it notifies the browser process via IPC (this happens after the onload events have fired on all frames in the page and their handlers have finished executing). The UI thread then stops the loading spinner on the tab.

I put "finishes" in quotes because client-side JavaScript can still load additional resources and change the view after this point.

Page finished loading
Figure 7: The renderer process tells the browser process via IPC that the page has finished loading

Navigate to different sites

That covers the simplest navigation scenario! But what happens if the user now enters a different URL in the address bar? The browser process goes through the same steps to navigate to the new site. Before it does so, however, it needs to let the currently rendered page wrap up, specifically by asking the current renderer process whether it needs to handle the beforeunload event.

beforeunload can show a confirmation dialog ("Are you sure you want to leave this page?") when the user navigates away or closes the current tab. The browser process has to check with the current renderer process when navigating away because everything that happens inside the page (including the page's JavaScript execution) is controlled by the renderer process rather than the browser process, so the browser process does not know what is going on inside.

Note: Do not add beforeunload event listeners casually. The listener you define runs whenever the page is navigated away from, so it adds latency to the navigation. Add a beforeunload listener only when absolutely necessary, for example when the user has entered data on the page that would be lost along with the page.

beforeunload event handler
Figure 8: The browser process tells the renderer process via IPC that it is about to leave the current page and navigate to a new one
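
One way to follow the advice in the note above is to register the beforeunload listener only while there is unsaved data and to remove it as soon as the data is saved. The helper below is a sketch with hypothetical names (createUnsavedChangesGuard, markDirty, markSaved); it takes the event target as a parameter, so in a page you would pass window.

```javascript
// `target` is any EventTarget; in a page this would be `window`.
function createUnsavedChangesGuard(target) {
  let attached = false;

  // Cancelling the event asks the browser to show its confirmation dialog.
  const onBeforeUnload = (event) => {
    event.preventDefault();
  };

  return {
    // Call when the user starts editing: the listener is added on demand.
    markDirty() {
      if (!attached) {
        target.addEventListener("beforeunload", onBeforeUnload);
        attached = true;
      }
    },
    // Call after a successful save: the listener is removed again.
    markSaved() {
      if (attached) {
        target.removeEventListener("beforeunload", onBeforeUnload);
        attached = false;
      }
    },
  };
}
```

With this pattern a page that has nothing to lose carries no beforeunload listener at all, so navigating away from it is not delayed.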

What if the navigation is initiated from within the page? For example, the user clicks a link, or client-side JavaScript runs code such as window.location = "https://newsite.com". In this case, the renderer process first checks whether it has any registered beforeunload listeners and runs them if so. What happens afterwards is the same as before; the only difference is that the navigation request is sent from the renderer process to the browser process.

If the navigation goes to a different site, a separate renderer process is started to handle it, while the current renderer process is kept around to finish up the current page, for example by running unload event listeners. The article Overview of page lifecycle states describes all the lifecycle states of a page, and the Page Lifecycle API shows how to listen for page state changes from within the page.

New navigation and unload
Figure 9: The browser process tells the new renderer process to render the new page and tells the current renderer process to finish up

Service Worker Scenario

A recent change to this navigation process is the introduction of service workers. A service worker lets developers write a network proxy for their site, which gives them more control over network requests, such as deciding which data is cached locally and which must be fetched from the network. If the service worker is set up to serve the current page from the cache, rendering the page requires no network request at all, which greatly speeds up the whole navigation.

The key thing to note here is that a service worker is just JavaScript code running in a renderer process. So the question is: when a navigation starts, how does the browser process know that the target site has a service worker, and start a renderer process to execute it?

When a service worker is registered, its scope is recorded (you can learn more about scope in the article The Service Worker Lifecycle). When a navigation starts, the network thread checks the requested domain against the scopes of the registered service workers. If a service worker is registered for that URL, the UI thread finds a renderer process for this service worker in order to execute its code. The service worker may use previously cached data or initiate a new network request.

Service worker scope lookup
Figure 10: After receiving a navigation task, the network thread looks up a matching service worker

Service worker navigation
Figure 11: The UI thread starts a renderer process to run the service worker code that was found; the code is executed by a worker thread in the renderer process
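
The scope lookup can be sketched as a prefix match in which the most specific (longest) scope wins. This is a simplified model with hypothetical names: findRegistration is not a browser API, and each registration is assumed to be an object whose scope field is an absolute URL string.

```javascript
// Returns the registration whose scope is the longest prefix of the URL,
// or null when no registered scope covers it.
function findRegistration(registrations, navigationUrl) {
  const href = new URL(navigationUrl).href;
  let best = null;
  for (const registration of registrations) {
    const matches = href.startsWith(registration.scope);
    if (matches && (best === null || registration.scope.length > best.scope.length)) {
      best = registration;
    }
  }
  return best;
}
```

When the function returns null, no service worker is involved and the navigation proceeds as an ordinary network request.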

Navigation Preload

In the example above, you may have noticed that if the service worker ends up deciding to send a network request anyway, the round trip between the browser process and the renderer process, including the service worker's startup time, actually adds latency to the navigation. Navigation preload is a mechanism that speeds this up by loading the resource in parallel with service worker startup. Requests for preloaded resources carry a special header that lets the server decide whether to send entirely new content or only updated data to the client.

Navigation preload
Figure 12: The UI thread starts a renderer process to run the service worker code while the network request is sent in parallel
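
A back-of-the-envelope model shows why running the two steps in parallel helps, and how a server might react to a preload request. Both functions are hypothetical sketches: the timings passed in are made-up inputs, and the header check assumes the preload request is marked with the Service-Worker-Navigation-Preload request header.

```javascript
// Without preload the network request waits for the service worker to start;
// with preload both proceed in parallel, so the slower one dominates.
function navigationLatencyMs({ workerStartupMs, networkMs, preload }) {
  return preload ? Math.max(workerStartupMs, networkMs) : workerStartupMs + networkMs;
}

// Server side: detect a preload request so that only updated data is sent.
// `headers` is a plain object of request header names and values.
function isNavigationPreload(headers) {
  return Object.keys(headers).some(
    (name) => name.toLowerCase() === "service-worker-navigation-preload"
  );
}
```

For instance, with an assumed 50 ms of service worker startup and 200 ms of network time, the model gives 250 ms without preload and 200 ms with it.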

Summary

In this article, we looked at what happens during navigation and at some of the techniques browsers use to make navigation faster. In the next article, we will take a deeper look at how browsers parse our HTML/CSS/JavaScript to present the content of a web page.

