How You Are Tracked in Browser Incognito Mode: A Front-End Perspective

This article introduces browser incognito mode from both a general-interest and a technical perspective. Readers can jump to whichever chapters interest them.

Understanding privacy mode

What privacy mode hides

Most modern web browsers offer a private browsing mode intended to protect user privacy. Chrome calls it incognito mode; Opera, Safari, and Firefox generally call it private browsing. The mode is marked by a dark theme and a masked-figure icon, which can give users the impression that they are browsing anonymously. Researchers from the University of Chicago and Leibniz University Hannover found that people hold many misconceptions about private browsing or incognito mode. Many users believe that private browsing protects them from malware, advertisements, tracking scripts, and monitoring by Internet service providers (ISPs).

In fact, private browsing only aims to avoid leaving traces of a browsing session on the computer. When you open a private window, the cookies and browsing history of the main browsing window are not carried over. When you close the private window, its browsing history, saved passwords, and whatever you typed into text fields (username, phone number, etc.) are erased. This means that the next person who uses your computer and opens the browser will not be able to find out which sites you visited during the private session. Even you yourself will appear as a new user the next time you visit those sites and log in to your account.

Private browsing is a useful and convenient tool for quick browsing sessions that leave no trace on your computer. It protects your privacy from other people who use the same computer and reduces some of the information you reveal about yourself when you visit a website. But private browsing does not make you anonymous, nor does it protect you from surveillance and large-scale technical snooping.

When it comes to privacy leaks, many people think that as long as they do not log in, do not use cookies, and use the browser's incognito mode, their data is safe and only they know what they browsed. So let us take a look at the browser's own official definition.

Incognito mode is a setting of the Chrome browser. In the Chrome browser, it is described as follows:

[Figure: incognito_mode]

Simply put, Chrome's incognito mode only deletes your local search and browsing history for you; it merely looks "incognito". To open a page in incognito mode in Chrome, click the three-dot icon in the upper right corner and choose to open a new incognito window from the drop-down menu, or press Ctrl+Shift+N. A new window with a dark theme opens and displays the notice "You have entered incognito mode". The notice explains what incognito mode does and does not do. By default, third-party cookies (used to track you across different sites) are blocked.

As for how to turn on incognito mode, I believe most people already know. If you do not, or do not know how to do it in a particular browser, you can refer to the link below.

How do I set my browser to Incognito or private mode?

In practice, even in private mode, websites can still identify you through other information, such as your IP address, device type, and browsing habits (time of day, pages visited, etc.). Private browsing does not hide any of this data. Large technology companies such as Facebook and Google hold a lot of information about users, and by connecting these dots they can identify you even if you are not logged in to your account.

Can browser incognito mode really be incognito?

After the introduction above, you probably already have an answer: no.

We can verify this with Nothing Private, a sample website that demonstrates tracking in private browsing. The test works as follows: you first submit some identifying information, then visit the website again in the browser's incognito mode and see whether the website still recognizes you.

Here, I first filled in "Lone Fishing in Hanjiang Snow". When I submitted the information, in addition to the "Lone Fishing in Hanjiang Snow" I filled in, the browser also sent a finger field with the value 061107e214da8d.

[Figure: nothingprivate1]

When I opened the site again in incognito mode, the browser sent the same finger field to the server, so I was identified.

[Figure: nothingprivate2]

In summary: browser incognito mode does not stop the website's server from obtaining your data. To be precise, incognito mode just avoids leaving traces on your own computer. If you are interested, you can try Nothing Private yourself. How it works is explained in a later chapter.

What privacy mode cannot do

It will not protect you from viruses or malware;
It will not prevent your Internet Service Provider (ISP) from seeing which sites you visit (in fact, whatever you do, your ISP can see almost all of your browsing activity);
It will not prevent websites from seeing your actual location;
Bookmarks you save in private browsing or incognito mode do not disappear when you close the window; they are added to your normal bookmarks;
Files you download to your computer while browsing privately are not deleted when you close the window.

Current usage of privacy mode

In 2017, DuckDuckGo surveyed 5,710 Americans about private browsing to understand how well people know this common privacy feature and how they use it. For the full report, see: A Study on Private Browsing: Consumer Usage, Knowledge, and Thoughts.

A brief summary is as follows:

46% of Americans have used private browsing;
The number one reason people use private browsing is "embarrassing searches";
76% of Americans who use private browsing cannot accurately identify the privacy benefits it provides;
65% of respondents said that after learning about the limitations of private browsing (it only prevents your browsing history from being recorded on your computer and provides no additional protection), they felt "surprised", "misled", "confused", or "vulnerable";
84% of Americans would consider trying another major web browser if it offered more features to help protect their privacy.

Reference: Is Private Browsing Really Private?

Incognito Mode from a Technical Perspective

What happens when a user visits a website

Under normal circumstances, the process of a user visiting a website is shown in the following figure:

[Figure: browsing_process1]

When browsing a webpage, a user generally goes through the following steps:

Open the browser and enter the URL. At this point the browser silently records the visit in its history;
The connection request leaves through the network cable in the user's home and travels layer by layer to the backbone network of the Internet provider, and then on to the website the user requested. At this point the website can obtain the user's IP address;
The website returns data to the user, and most of the page content is stored on the user's computer as temporary files;
If the user registers or logs in, the user information is saved or updated on the server side, and cookies are kept locally to authenticate the user and avoid repeated logins. The phone number, email address, and home address filled in during registration are also recorded by the browser so that they can be filled in automatically next time.

As you can see, three kinds of data are generally stored on the user's computer during this process: browsing history, temporary files and cookies, and form contents. One or two kinds of data are kept by the website: the IP address and, if any, the registration information filled in by the user.

Nowadays, many companies and schools run their own private network, which exposes a single IP address externally and forwards returned data to the corresponding internal IP. Employers and schools can still see what someone on the intranet is viewing if they want to. For HTTP websites, they can learn exactly which websites the user visited, what content was read, how long the user stayed, and which links were followed. For HTTPS websites, the certificate and its authentication mechanism generally make it difficult to decrypt the traffic (a man-in-the-middle attack), so they can only learn which websites the user visited. Likewise, an unsuitable network environment, such as free public Wi-Fi, can expose your browsing history to others.

[Figure: browsing_process2]
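
The difference between what a network observer sees for HTTP and for HTTPS can be sketched roughly as follows. This is only an illustration: the function name visibleToNetworkObserver and the example URLs are hypothetical, and the sketch assumes that nobody is intercepting TLS and that the hostname still leaks through DNS lookups or the TLS handshake, which is the common case.

```javascript
// Hypothetical helper: which parts of a request can an on-path observer
// (employer, school, public Wi-Fi operator) read?
// Assumes no TLS interception; for HTTPS the hostname is still visible,
// but the path, the query string, and the page content are not.
function visibleToNetworkObserver(rawUrl) {
  const url = new URL(rawUrl);
  if (url.protocol === 'http:') {
    return { host: url.hostname, path: url.pathname, query: url.search, content: true };
  }
  if (url.protocol === 'https:') {
    return { host: url.hostname, path: null, query: null, content: false };
  }
  throw new Error(`Unsupported protocol: ${url.protocol}`);
}

console.log(visibleToNetworkObserver('http://example.com/articles/42?ref=home'));
console.log(visibleToNetworkObserver('https://example.com/articles/42?ref=home'));
```

Incognito mode changes neither result, because it only affects what is stored on the local machine.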

The Cat-and-Mouse Game of Incognito Mode Detection

Before Chrome 76, there was a loophole that many websites used to detect whether a user was visiting in Chrome's incognito mode. A site only had to try to use the FileSystem API, which stores temporary or persistent files. This API was disabled in incognito mode but available in normal mode, and that difference was used to detect incognito visitors and block them from viewing the site's content.

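// The unprefixed name is requestFileSystem (lowercase r); Chrome exposes the webkit-prefixed variant.
const fs = window.requestFileSystem || window.webkitRequestFileSystem;
if (!fs) {
  // The API is missing entirely, so this check cannot decide either way.
  console.log('check failed?');
} else {
  fs(
    window.TEMPORARY,
    100, // requested size in bytes
    // Success callback: the API is usable, so this is a normal window.
    console.log.bind(console, 'not in incognito mode'),
    // Error callback: before Chrome 76 the request failed in incognito mode.
    console.log.bind(console, 'incognito mode')
  );
}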

Google later fixed this loophole. Unfortunately, the fix opened up two other methods that can still be used to detect when visitors are browsing privately.

Incognito mode detection based on file system size: This method relies on the amount of storage reserved for the internal file system used by the browser. Security researcher Vikas Mishra found that Chrome's storage quota differs between incognito and normal mode. If the temporary storage quota is roughly 120MB or less, the window can be assumed to be an incognito window. The method mainly uses navigator.storage.estimate() to read the quota and make the judgment.

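// Wrapped in an async function so that `await` is valid outside ES modules.
async function detectIncognitoByQuota() {
  if (!('storage' in navigator && 'estimate' in navigator.storage)) {
    console.log('Can not detect');
    return;
  }
  const { usage, quota } = await navigator.storage.estimate();
  console.log(`Using ${usage} out of ${quota} bytes.`);

  // Incognito windows were observed to get a quota below roughly 120MB.
  if (quota < 120000000) {
    console.log('Incognito');
  } else {
    console.log('Not Incognito');
  }
}

detectIncognitoByQuota();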

Incognito mode detection based on access time: When reading and writing data, an in-memory file system is always faster than a disk file system. In incognito mode, Chrome stores data written through the API in memory instead of persisting it to disk as in normal mode. This detection method was discovered by researcher Jesse Li; it times a series of writes to the browser's file system. Based on the write speeds, a website can in theory determine whether the browser is in incognito mode. The only way to prevent this detection is to use the same storage medium for incognito and normal mode, so that the API runs at the same speed either way.

[Figure: timings]
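
The timing idea can be sketched as follows. This is a minimal illustration rather than the researcher's actual code: writeFn is a hypothetical callback that performs one write to the browser's file system, and the 1 ms threshold is a placeholder that would have to be calibrated against real measurements.

```javascript
// Time a series of writes and summarize the durations.
// `writeFn` is a hypothetical async callback that performs one write;
// it is injected so the sketch does not depend on a specific storage API.
async function timeWrites(writeFn, rounds = 20) {
  const durations = [];
  for (let i = 0; i < rounds; i++) {
    const start = performance.now();
    await writeFn(i);
    durations.push(performance.now() - start);
  }
  const mean = durations.reduce((sum, d) => sum + d, 0) / rounds;
  return { rounds, mean, max: Math.max(...durations) };
}

// Assumed decision rule: memory-backed storage answers faster on average.
// The threshold is a placeholder, not a measured value.
function looksLikeMemoryBacked(stats, maxMeanMs = 1) {
  return stats.mean <= maxMeanMs;
}
```

A page would call timeWrites with a function that writes a fixed-size blob, then pass the result to looksLikeMemoryBacked. In practice a single threshold is fragile, because disk speed varies widely between devices.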

Chrome developers were aware of both issues. In a March 2018 design document, they identified the risk of detecting private mode based on timing and file system size, and proposed an alternative implementation: keep only the metadata in memory and encrypt the files on disk. This would remove the timing difference between memory and disk storage that websites could measure, and eliminate differences based on file system size and file system type (temporary versus persistent).

However, such a solution has its own trade-offs. Although it resists private mode detection, it leaves metadata behind: even if the data itself cannot be decrypted, its existence is evidence that incognito mode was used.

If we consider that the incognito threat model is about protecting your privacy from other users of the same device, not from the websites you visit, this trade-off may not be worthwhile.

How to identify users in privacy mode

Unique device identification and browser fingerprint

We all know that browser incognito mode can prevent other people from knowing which websites you visited and what you did. In incognito mode, the pages you open and the files you download are not recorded in your browsing history or download history. After you close all incognito windows, all newly created cookies are deleted. However, as programmers, we may run into scenarios like the following:

When product managers and data analysts need more accurate data;
When pages that do not require login (such as community articles) need to exclude incognito-mode visits from their UV counts;
When voting sites that do not require login need to stop repeated votes and likes cast through incognito mode;
When a questionnaire site that does not require login must restrict each user to a single submission, or must show the previous submission when the questionnaire is opened a second time;
...

This is undoubtedly a big headache for us. We may all know about unique device identifiers, but in the browser, in incognito mode, how can we obtain a unique device identifier without additional authorization from the user?

In development, uniquely identifying a device is a basic capability with many applications, such as software authorization (ensuring that your software can only be used on a specific machine once authorized), software licensing, device identification, and device authentication.

If you want to obtain a unique device identifier, you may think of the IMEI, the Android ID, the MAC address, and so on, but these have limitations; see the official Android 10 documentation, and note the following two points:

A computer may have several network cards and therefore several MAC addresses. An even more serious weakness of the MAC address is that it is easy to change manually.

As for the Android ID, it is not truly unique. Rooting, flashing, factory resets, and applications with different signatures all cause the Android ID to change, and bugs in vendor-customized systems can cause different devices to produce the same Android ID.

For other ways of obtaining a unique device identifier, this article has a more comprehensive discussion:

Obtain the unique identifier of the device (Unique Identifier): Windows system

The article "advertisers track us? How to protect privacy in daily use of mobile phones" has a picture that summarizes these methods well:

[Figure: identifying]

Back in our front-end scenario, the methods above have many limitations. For example, some require special privileges, and some depend on cooperation from native code. So is there a unique identification scheme that requires only the front end and still achieves good accuracy? This is where the browser fingerprint comes in.

A fingerprint, in the usual sense, refers to fingerprint recognition, which uses the ridge pattern on the pads of the fingers and thumb to authenticate identity. Fingerprints are a reliable and unique means of identification, because the ridge pattern on each finger of each person is different and does not change with growth or age. A browser fingerprint is a string built from various pieces of browser information, such as the number of CPU cores, graphics card information, system fonts, screen resolution, and browser plug-ins, and it can almost uniquely identify a user. Even the browser's private window mode cannot prevent it.

This is a passive form of identification. In theory, once you visit a website, that website can recognize you: it does not know who you are, but you have a unique fingerprint. From then on, advertising, targeted pushes, security protection, and other privacy-related uses all become very convenient.

Technical Points and Classification of Browser Fingerprint

Basic fingerprint: The basic fingerprint consists of characteristics that every browser has, such as the UserAgent, screen resolution, number of CPU cores, memory size, browser plug-ins and extensions, browser settings, language, hardware type, operating system, time zone, geographic location, DNS, SSL certificate, and so on. Like a person's height or age, these values are shared by many users and collide with high probability, so they can only serve as auxiliary identification. You can view the basic characteristics of your local browser at this URL.
Advanced fingerprint: If the basic fingerprint is like a person's outward appearance (gender, height, weight), which cannot uniquely identify a person, then basic fingerprints alone likewise cannot establish that a client is unique. Advanced fingerprints can be generated from many of the advanced features of HTML5, and include Canvas fingerprints, WebGL fingerprints, AudioContext fingerprints, WebRTC fingerprints, font fingerprints, and so on;
Integrated fingerprint: Scattered pieces of fingerprint information cannot truly pin down a unique user, and cannot represent a user's unique identity (user fingerprint). The integrated fingerprint combines all of the user's browser information: putting the basic and advanced fingerprints together yields a comprehensive fingerprint (user fingerprint) that can identify a unique user with an accuracy of close to 99%.
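
To make the three categories concrete, here is a small sketch in which two visitors share every basic attribute and differ only in an advanced one. All attribute values and the names visitorA, visitorB, and fingerprintKey are made up for illustration; real fingerprints use many more data points.

```javascript
// Two hypothetical visitors with identical basic attributes but different
// canvas fingerprints. All values are invented for the example.
const visitorA = { userAgent: 'UA-1', screen: '1920x1080', timezone: 'UTC+8', canvas: 'c41f', webgl: '9be2' };
const visitorB = { userAgent: 'UA-1', screen: '1920x1080', timezone: 'UTC+8', canvas: '7d03', webgl: '9be2' };

const BASIC_KEYS = ['userAgent', 'screen', 'timezone'];
const ADVANCED_KEYS = ['canvas', 'webgl'];

// Join the selected attributes into one comparable key.
function fingerprintKey(visitor, keys) {
  return keys.map((k) => `${k}=${visitor[k]}`).join('|');
}

const integratedKeys = [...BASIC_KEYS, ...ADVANCED_KEYS];
const basicCollision =
  fingerprintKey(visitorA, BASIC_KEYS) === fingerprintKey(visitorB, BASIC_KEYS);
const integratedCollision =
  fingerprintKey(visitorA, integratedKeys) === fingerprintKey(visitorB, integratedKeys);

console.log({ basicCollision, integratedCollision });
// → { basicCollision: true, integratedCollision: false }
```

The basic key cannot tell the two visitors apart, while the integrated key can, which is why the categories are combined.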

For more details on advanced fingerprints and how they work, see the article "Explore browser fingerprints".

BrowserLeaks

For a long time, people believed that IP addresses and cookies were the only reliable digital fingerprints for tracking people online. But once modern web technology allowed interested organizations to identify and track users in new ways, without their knowledge and with no way to avoid it, things got out of hand.

BrowserLeaks is about browsing privacy and web browser fingerprinting. There you will find a collection of web technology security testing tools that show you what kinds of personally identifiable data may be leaked and how to protect yourself from such leaks. The website gives an overview of various types of fingerprints, including IP address, geographic location, Canvas, WebGL, WebRTC, and fonts, along with their basic principles.

[Figure: fingerPrint]

If you are interested in the technical principles, you can open BrowserLeaks and click the title of the corresponding card to learn more. For example, the HTML5 Canvas Fingerprinting page shows your Canvas fingerprint, how unique it is, and other information.

[Figure: fingerPrintCanvas]

How Nothing Private identifies you

The earlier chapter "Can browser incognito mode really be incognito?" introduced Nothing Private. We also saw that the requests for submitting information and for verification both carry a finger field. This field can be regarded as the "browser fingerprint".

Looking at the source code of Nothing Private on GitHub, you can see how it implements the core logic of "browser fingerprinting".

Nothing Private uses ClientJS (a device information and digital fingerprint library written in pure JavaScript) to obtain the fingerprint of your web browser. The core method is getFingerprint. When you submit the form, this fingerprint is saved together with the ID you filled in, in a MySQL database behind a PHP back end. The next time you visit the website, your browser fingerprint is matched against the records in the database and the ID you filled in is returned.
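
The store-then-match flow described above can be sketched with an in-memory map. This is not Nothing Private's code: the class FingerprintRegistry and its method names are hypothetical, and a Map stands in for the PHP and MySQL back end.

```javascript
// In-memory stand-in for the fingerprint table. Class and method names are
// hypothetical; the real site is described as using PHP and MySQL.
class FingerprintRegistry {
  constructor() {
    this.records = new Map(); // fingerprint -> name the visitor typed in
  }

  // Called when the form is submitted from a normal window.
  register(fingerprint, name) {
    this.records.set(String(fingerprint), name);
  }

  // Called on a later visit; returns the stored name or null.
  identify(fingerprint) {
    return this.records.get(String(fingerprint)) ?? null;
  }
}

const registry = new FingerprintRegistry();
registry.register('061107e214da8d', 'Lone Fishing in Hanjiang Snow');

// The incognito window sends the same fingerprint, so the name comes back.
console.log(registry.identify('061107e214da8d'));
// An unknown fingerprint is not recognized.
console.log(registry.identify('ffffffffffffff'));
```

Nothing in this flow depends on cookies or local storage, which is why clearing them, or using incognito mode, does not break the match.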

The data points currently used by ClientJS to generate fingerprints include:

user agent, screen print, color depth, current resolution, available resolution, device XDPI, device YDPI, plugin list,
font list, local storage, session storage, timezone, language, system language, cookies, canvas print

Let us look at the basic logic of ClientJS's getFingerprint:

[Figure: fingerprint2]

getFingerprint collects the UA, cookie, local storage, canvas fingerprint, and other information, hashes the result with the MurmurHash algorithm, and finally returns a "browser fingerprint" intended to uniquely identify the browser and device.

MurmurHash is a non-cryptographic hash function suited to general hash-based lookup. It was created by Austin Appleby in 2008 and has several variants, all of which have been released into the public domain. Compared with other popular hash functions, MurmurHash distributes keys with strong regularity more randomly.
ClientJS official website address
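
The general shape of "collect data points, join them, hash the result" can be sketched as below. This is not ClientJS's implementation: to keep the example short it uses 32-bit FNV-1a as a stand-in for MurmurHash, and the function names, the "|" delimiter, and the sample data points are all assumptions.

```javascript
// 32-bit FNV-1a, used here only as a compact stand-in for MurmurHash.
function fnv1a32(text) {
  const bytes = new TextEncoder().encode(text); // hash the UTF-8 bytes
  let hash = 0x811c9dc5; // FNV offset basis
  for (const byte of bytes) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime (32-bit)
  }
  return hash >>> 0; // convert to an unsigned 32-bit integer
}

// Hypothetical helper: join the data points and hash them into one number.
function sketchFingerprint(dataPoints, delimiter = '|') {
  return fnv1a32(dataPoints.join(delimiter));
}

// Made-up data points in the spirit of the list above.
const dataPoints = ['Mozilla/5.0 (example)', '1920x1080', '24', 'UTC+8', 'en-US'];
console.log(sketchFingerprint(dataPoints));
```

Because the hash is deterministic, the same set of data points always yields the same number, whether the page is opened in a normal or an incognito window.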

FingerprintJS

FingerprintJS is a fast browser fingerprinting library written in pure JavaScript with no dependencies. By default it uses the MurmurHash algorithm to return a 32-bit integer, and the hash function can easily be replaced. It is also very lightweight: only 843 bytes when gzipped, with an anonymous browser identification accuracy of up to 94%.

The use of FingerprintJS is also relatively simple:

import FingerprintJS from '@fingerprintjs/fingerprintjs';

// Initialize an agent at application startup.
const fpPromise = FingerprintJS.load();

(async () => {
  // Get the visitor identifier when you need it.
  const fp = await fpPromise;
  const result = await fp.get();

  // This is the visitor identifier:
  const visitorId = result.visitorId;
  console.log(visitorId);
})();

For more information, see the FingerprintJS documentation.

The methods above can produce a unique browser fingerprint in more than 90% of cases, so the result may not be completely unique. For example, overriding the relevant canvas methods, or using Owl Browser, will still defeat them. However, technical measures are usually only a general-purpose solution that raises the barrier and cost of circumvention. I think this is sufficient for common development scenarios.

With a unique browser fingerprint, we can attach it to requests for things like UV statistics, likes, and votes. This makes it possible to determine, to a large extent, whether users are stuffing votes or inflating traffic. However, browser fingerprinting is a double-edged sword: while solving the problems above, it inevitably exposes users to more information leakage.

Implement Canvas Fingerprinting

Canvas fingerprinting draws an image with specific content on a Canvas and uses the canvas.toDataURL() method to obtain the base64-encoded string of the image. A PNG file is divided into chunks, and each chunk ends with a 32-bit CRC checksum. Extracting the CRC near the end of the file, as the code that follows does with the chunk right before the closing IEND chunk, gives a value that can be used to identify the user. Canvas fingerprinting uses the HTML5 canvas API and JavaScript to generate the image dynamically. Like other tracking technologies, this method has been adopted by thousands of websites, notably in the advertising field.

The following is a simple implementation of a Canvas fingerprint. The principle is fairly simple; refer to the comments if anything is unclear:

// In PHP, bin2hex() converts a string of ASCII characters to hexadecimal values; the string can be converted back with pack().
// Below is a JavaScript implementation of PHP's bin2hex.
function bin2hex(s) {
  let n,
    o = '';
  s += '';
  for (let i = 0, l = s.length; i < l; i++) {
    n = s.charCodeAt(i).toString(16);
    o += n.length < 2 ? '0' + n : n;
  }

  return o;
}

// Get the fingerprint UUID
function getUUID(domain) {
  // Create a <canvas> element
  let canvas = document.createElement('canvas');
  // getContext() returns an object that provides methods and properties for drawing on the canvas
  let ctx = canvas.getContext('2d');
  // Set the text baseline used when drawing text
  ctx.textBaseline = 'top';
  // Set the font used for text
  ctx.font = "14px 'Arial'";
  // Set the color, gradient, or pattern used to fill the drawing
  ctx.fillStyle = '#f60';
  // Draw a filled rectangle
  ctx.fillRect(125, 1, 62, 20);
  ctx.fillStyle = '#069';
  // Draw filled text on the canvas
  ctx.fillText(domain, 2, 15);
  ctx.fillStyle = 'rgba(102, 204, 0, 0.7)';
  ctx.fillText(domain, 4, 17);

  // toDataURL() returns a data URI containing a representation of the image
  let b64 = canvas.toDataURL().replace('data:image/png;base64,', '');
  // atob() decodes a base-64 encoded string; btoa() encodes one. Both are global methods on window.
  let crc = bin2hex(atob(b64).slice(-16, -12));
  return crc;
}

// When calling, you can pass any string you like, not just a domain; the domain is used here only to tell sites apart.
console.log(getUUID('https://www.baidu.com/'));

PHP bin2hex() function
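
To see which bytes slice(-16, -12) actually picks out, the sketch below walks the chunk structure of a PNG byte array. It is an illustration under assumptions: the function names listPngChunks and crcBeforeIend are hypothetical, the CRC values are read but not verified, and the file is assumed to end with a standard 12-byte IEND chunk.

```javascript
// Walk the chunks of a PNG byte array and return each chunk's type and CRC.
// Layout per chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
function listPngChunks(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const chunks = [];
  let offset = 8; // skip the 8-byte PNG signature
  while (offset + 12 <= bytes.length) {
    const length = view.getUint32(offset);
    const type = String.fromCharCode(...bytes.subarray(offset + 4, offset + 8));
    const crcOffset = offset + 8 + length;
    if (crcOffset + 4 > bytes.length) break; // truncated chunk, stop walking
    const crc = view.getUint32(crcOffset).toString(16).padStart(8, '0');
    chunks.push({ type, length, crc });
    offset = crcOffset + 4;
  }
  return chunks;
}

// The CRC of the chunk right before IEND, as an 8-digit hex string.
function crcBeforeIend(bytes) {
  const chunks = listPngChunks(bytes);
  const iend = chunks.findIndex((chunk) => chunk.type === 'IEND');
  return iend > 0 ? chunks[iend - 1].crc : null;
}
```

In a browser, the byte array could be obtained with Uint8Array.from(atob(b64), (c) => c.charCodeAt(0)). Since the IEND chunk occupies the last 12 bytes of the file, the four bytes before it are the CRC of the preceding chunk, which is normally the last block of image data.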

Test results show that the CRC generated when the same browser draws the same content never changes. Put simply, the same HTML5 Canvas drawing code produces image content that is not exactly the same across different operating systems and different browsers. There are several likely reasons for this:

In terms of image format, different web browsers use different graphics processing engines, different image export options, and different default compression levels.
At the pixel level, the operating systems each use different settings and algorithms for anti-aliasing and sub-pixel rendering operations.
Even with the same drawing operation, the final image data is still different at the hash level.

How to better protect personal privacy

When should we use private/incognito mode?

Private mode is meant to keep your browsing history from being seen by other people, and to protect your accounts from malicious logins when several people share a computer. In addition, private mode can spare us some of the annoyance of malicious advertisements.

Using private browsing mode does not mean that you can do bad things;
You may want to keep your work and personal life separate;
You may share a computer or device and not want your family, friends, or colleagues to snoop;
You may be buying gifts and not want anything to spoil the surprise;
You may simply value privacy and want to limit the amount of data that companies collect about you;
You may be using a computer in a public place.

How to prevent "browser fingerprints" from being generated?

In the previous sections, we discussed how websites use various techniques to generate "browser fingerprints" that identify unique users. Now let us talk about how to avoid having a unique user fingerprint generated for you.

The common approach is to use a browser extension to stop the website from obtaining various pieces of information, or to return false data. The extension runs a piece of JS code before the page loads that changes, overrides, or hooks the relevant JS functions; the flexibility of JS makes this possible. But this approach only works at the surface. JS-level modification can stop most websites from generating a unique fingerprint, but there are ways for a website to detect this kind of "cheating".
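
Both halves of that paragraph, the hook and one way of detecting it, can be sketched in a few lines. The helper names hookMethod and looksNative are hypothetical, and a plain object stands in for the canvas so that the example runs outside a browser; in a page the target would be something like HTMLCanvasElement.prototype.toDataURL. The toString check is only one of several detection tricks.

```javascript
// Replace obj[name] with a wrapper that post-processes the original result.
function hookMethod(obj, name, transform) {
  const original = obj[name];
  obj[name] = function hooked(...args) {
    return transform(original.apply(this, args));
  };
  return original; // returned so the caller can restore it later
}

// One check a site can run: built-in functions stringify to "[native code]",
// while a plain JS replacement exposes its own source text.
function looksNative(fn) {
  return Function.prototype.toString.call(fn).includes('[native code]');
}

// Stand-in object so the sketch runs outside a browser.
const fakeCanvas = { toDataURL: () => 'data:image/png;base64,AAAA' };
hookMethod(fakeCanvas, 'toDataURL', (url) => url + 'noise');

console.log(fakeCanvas.toDataURL()); // the site now receives altered data
console.log(looksNative(Math.max), looksNative(fakeCanvas.toDataURL)); // true false
```

A hooked built-in no longer stringifies like native code, which is the kind of signal that exposes JS-level spoofing and motivates changing the browser itself.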

A better approach is to work at the lowest level of the browser and modify the APIs there, so that the information available to the JS layer is not unique and no combination of it can produce a fingerprint that uniquely represents the user. One example is Owl Browser.

Owl Browser is a browser built from modified Chromium source code. Various APIs have been changed at a low level so that users can customize the data they return, such as Canvas, WebGL, AudioContext, WebRTC, fonts, UserAgent, screen resolution, number of CPU cores, memory size, plug-in information, language, and so on, which lets you avoid having a unique user fingerprint generated. Online companies, advertisers, and developers like to track your online activity in order to serve you targeted advertising, and people generally regard this as an infringement of user privacy.

How to avoid surveillance and tracking by ad trackers

Disable third-party cookies. In 2020, Chrome introduced SameSite cookie handling to reduce the sending of third-party cookies, but website owners can still opt out (SameSite=None); please refer to the figure below. Chrome's ultimate goal is to completely eliminate third-party cookies by 2022, as Safari and Brave have already done; SameSite cookies are the first step.
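
As a rough illustration of what the SameSite attribute looks like on the wire, the sketch below builds a Set-Cookie header value. The function buildSetCookie and its option names are hypothetical; the rule it enforces, that SameSite=None must be accompanied by Secure, reflects how browsers applying the 2020 behavior treat such cookies.

```javascript
// Build a Set-Cookie header value. Function and option names are hypothetical.
function buildSetCookie(name, value, { sameSite = 'Lax', secure = false } = {}) {
  const allowed = ['Strict', 'Lax', 'None'];
  if (!allowed.includes(sameSite)) {
    throw new Error(`Unknown SameSite value: ${sameSite}`);
  }
  // Browsers enforcing the 2020 rules reject SameSite=None without Secure.
  if (sameSite === 'None' && !secure) {
    throw new Error('SameSite=None requires the Secure attribute');
  }
  const parts = [`${name}=${encodeURIComponent(value)}`, `SameSite=${sameSite}`];
  if (secure) parts.push('Secure');
  return parts.join('; ');
}

// A first-party session cookie: not sent on cross-site subrequests.
console.log(buildSetCookie('session', 'abc123'));
// → session=abc123; SameSite=Lax

// A cookie that explicitly opts back in to cross-site use.
console.log(buildSetCookie('tracker', 'xyz', { sameSite: 'None', secure: true }));
// → tracker=xyz; SameSite=None; Secure
```

The second form is the opt-out mentioned above: a site that wants its cookie sent in third-party contexts has to say so explicitly.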

Disable JavaScript. Well, these days you can probably forget about this one. Under the current development model where the front end and back end are separated, most websites show nothing once JavaScript is disabled: the worries are gone, but so is the website content.

To hide your Internet traffic from monitoring and tracking, you can use a virtual private network (VPN). Your ISP will know that you are using a VPN, but it cannot determine which websites you are visiting. The VPN service routes your traffic through a remote server, so it looks as if you are browsing from another location, or from several. However, the VPN provider can see which sites you visit, so it is best to choose a company you trust to delete or lock away your browsing activity. A VPN does not block third-party cookies from advertisers, but those cookies will not be able to pinpoint your location, which makes ad trackers less effective or ineffective.

Friendly reminder: VPN is a neutral technology. VPNs built and registered by authorized entities are legal, but setting one up privately is illegal (that is, using an illegal VPN is illegal); simply using a VPN to connect to the international network for necessary work and to access necessary information is not illegal; if you use a VPN to produce, copy, access, or disseminate illegal content, you will be held accountable under the law.

Tor Browser can really mask your online activities. Tor Browser is software for accessing the Internet anonymously; users can communicate anonymously on the Internet through Tor. To achieve anonymity, Tor links computers scattered around the world into an encrypted circuit. When you access the Internet through the Tor network, your data is relayed in a roundabout way through several computers, wrapped in layers like an onion around its core, which hides your network activity and makes the traffic difficult to trace. The website you visit really does not know where you are; it only knows the approximate location of the last server through which your request was routed. The transmission is encrypted at every step, so no single step can know both where you are and where the information is going. This is why Tor Browser is also known as the Onion Browser. Tor Browser deletes all cookies when it is closed, but even the Tor proxy does not stop third-party advertisers from placing cookies in your browser.

This article was first published on the author's personal blog; follows and stars are welcome.