
On October 18th, W3C's Web Platform Incubator Community Group announced the draft specification of the HTML Sanitizer API. The draft addresses how browsers themselves can help defend against XSS attacks.

One class of security problem that gives developers a headache is cross-site scripting (XSS). This type of attack exploits loopholes left during web page development to inject malicious code into a page, so that users load and execute a script crafted by the attacker.

The malicious code is not filtered out and is mixed with the normal code of the website. The browser cannot tell which content is trustworthy, so the malicious script is executed. An XSS attack has two core steps: 1. The attacker submits malicious code; 2. The browser executes the malicious code.

To stop this two-step attack, the following measures are commonly used:

  1. Filter the input
  2. Render purely on the front end, keeping data separate from code
  3. Fully escape HTML
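
As a rough illustration of the third measure, the sketch below escapes the characters that have special meaning in HTML before user text is placed into markup. The name `escapeHTML` and the exact character table are assumptions made for this example, not part of any specification discussed here.

```javascript
// Minimal sketch: escape the five characters that matter in HTML text
// and attribute values. escapeHTML is a hypothetical helper name.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHTML(text) {
  // Replace each special character with its entity equivalent.
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const user_input = `<img src="" onerror=alert(0)>`;
console.log(escapeHTML(user_input));
```

Escaping like this turns the whole input into inert text, which is safe but also discards any harmless formatting the user intended to keep.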

These measures are cumbersome, and there are many details to get right. To make it easier for developers to deal with XSS attacks, browsers now provide a native sanitization capability.

The HTML Sanitizer API, jointly initiated by Google, Mozilla and Cure53, is about to be finalized. With this browser-native API, we can protect web applications from XSS attacks more easily.

Next, let's take a look at this security API.

Introduction to Sanitizer API

The Sanitizer API allows the browser to remove malicious code directly from dynamically updated markup on a website. When potentially malicious HTML strings, Document objects, or DocumentFragment objects are about to be inserted into the existing DOM, we can use the HTML Sanitizer API to clean them first. It works a bit like a security guard application on a computer, removing risky content.

There are three advantages to using Sanitizer API:

  • Reduce the number of cross-site scripting attacks in web applications
  • Ensure the safe use of HTML output content in the current user agent
  • Sanitizer API is very usable

Features of Sanitizer API

The Sanitizer API opens up new possibilities for HTML string security. Its functionality can be broadly divided into three main features:

1. Sanitizing user input

The main function of the Sanitizer API is to accept strings and convert them into safer ones. The converted strings will not execute any additional JavaScript, which protects the application from XSS attacks.

2. Built into the browser

The library ships with the browser and is updated when a bug is discovered or a new attack appears. In effect, the browser has built-in protection, and no external library needs to be imported.

3. Simple and safe to use

With the Sanitizer API, sanitization relies on the browser's own robust and secure parser. The browser already knows how each element in the DOM behaves. In contrast, external parsers written in JavaScript are costly and can easily fall behind as the front-end environment evolves.

Having covered these features, let's take a look at how the API is actually used.

Using the Sanitizer API

The Sanitizer API is configured through the Sanitizer class, whose instances are created with the Sanitizer() constructor.

The specification provides three basic sanitization methods:

1. Sanitize a string with implicit context

Element.setHTML() parses and sanitizes a string and inserts it into the DOM immediately. This method is suitable when the target DOM element is known and the HTML content is a string.

const $div = document.querySelector('div')
const user_input = `<em>Hello There</em><img src="" onerror=alert(0)>` // The user string.
const sanitizer = new Sanitizer() // Our Sanitizer
// We want to insert the HTML in user_input into the target element $div.
// That is, we want the equivalent of $div.innerHTML = user_input, except
// without the XSS risks.
$div.setHTML(user_input, sanitizer) // <div><em>Hello There</em><img src=""></div>


2. Sanitize a string for a given context

Sanitizer.sanitizeFor() parses and sanitizes a string and prepares it to be added to the DOM later.

It is suitable when the HTML content is a string and the type of the target DOM element is known (for example, div or span).

const user_input = `<em>Hello There</em><img src="" onerror=alert(0)>`
const sanitizer = new Sanitizer()
// Later:
// The first parameter describes the node type this result is intended for.
sanitizer.sanitizeFor("div", user_input) // HTMLDivElement <div>

Note that sanitizeFor() returns an HTMLElement; to get the sanitized result as a string, read its .innerHTML property.

sanitizer.sanitizeFor("div", user_input).innerHTML // <em>Hello There</em><img src="">

3. Sanitize nodes

For a DocumentFragment that is under user control, Sanitizer.sanitize() can sanitize the DOM tree nodes directly.

// Case: The input data is available as a tree of DOM nodes.
const sanitizer = new Sanitizer()
const $userDiv = ...;
// $div is the target element selected in the first example.
$div.replaceChildren(sanitizer.sanitize($userDiv));

In addition to the three methods mentioned above, the Sanitizer API modifies HTML strings by removing elements and by filtering attributes and tags.

For example, by default:

  • Certain tags (script, marquee, head, frame, menu, object, etc.) are removed, while content tags are kept.
  • Most attributes are removed; only href on <a> and colspan on <td> and <th> are preserved.
  • Content that could lead to the execution of risky scripts is filtered out.

With the default settings, this security API only aims to prevent XSS. In some cases, however, we also need custom settings. Here are some commonly used configurations.

Custom sanitization

Create a configuration object and pass it to the constructor when creating the Sanitizer.

const config = {
  allowElements: [],
  blockElements: [],
  dropElements: [],
  allowAttributes: {},
  dropAttributes: {},
  allowCustomElements: true,
  allowComments: true
};
// sanitized result is customized by configuration
new Sanitizer(config)

Here are some common options:

  • allowElements keeps only the specified elements; any other element is removed while its children are kept
  • blockElements removes the specified elements but keeps their children
  • dropElements removes the specified elements together with their children
const str = `hello <b><i>there</i></b>`

new Sanitizer().sanitizeFor("div", str)
// <div>hello <b><i>there</i></b></div>

new Sanitizer({allowElements: [ "b" ]}).sanitizeFor("div", str)
// <div>hello <b>there</b></div>

new Sanitizer({blockElements: [ "b" ]}).sanitizeFor("div", str)
// <div>hello <i>there</i></div>

new Sanitizer({allowElements: []}).sanitizeFor("div", str)
// <div>hello there</div>
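
The difference between allowing, blocking and dropping elements can be illustrated with a toy model in plain JavaScript. This is only a sketch of the behaviour shown in the examples above, not the browser's implementation: the node shape (`{ tag, children }`), and the helper names `filterNodes` and `render`, are assumptions made for illustration.

```javascript
// Toy model only: a node is either a string (text) or { tag, children }.
// filterNodes and render are hypothetical helpers, not part of the API.
function filterNodes(nodes, config = {}) {
  const { allowElements, blockElements = [], dropElements = [] } = config;
  const result = [];
  for (const node of nodes) {
    if (typeof node === 'string') {
      result.push(node); // text is always kept
      continue;
    }
    // Dropped: the element and its whole subtree disappear.
    if (dropElements.includes(node.tag)) continue;

    const children = filterNodes(node.children, config);
    const blocked =
      blockElements.includes(node.tag) ||
      (Array.isArray(allowElements) && !allowElements.includes(node.tag));
    if (blocked) {
      result.push(...children); // element removed, children kept
    } else {
      result.push({ tag: node.tag, children });
    }
  }
  return result;
}

// Serialize the toy tree back to a string for inspection.
function render(nodes) {
  return nodes
    .map((n) => (typeof n === 'string' ? n : `<${n.tag}>${render(n.children)}</${n.tag}>`))
    .join('');
}

// hello <b><i>there</i></b>
const tree = ['hello ', { tag: 'b', children: [{ tag: 'i', children: ['there'] }] }];

console.log(render(filterNodes(tree, { allowElements: ['b'] }))); // hello <b>there</b>
console.log(render(filterNodes(tree, { blockElements: ['b'] }))); // hello <i>there</i>
console.log(render(filterNodes(tree, { dropElements: ['b'] })));  // hello 
console.log(render(filterNodes(tree, { allowElements: [] })));    // hello there
```

The toy model has no built-in baseline of safe elements, so it should be read as an explanation of the three options rather than as a sanitizer.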
  • allowAttributes and dropAttributes specify which attributes are kept or removed, and on which elements.
const str = `<span id=foo class=bar style="color: red">hello there</span>`

new Sanitizer().sanitizeFor("div", str)
// <div><span id="foo" class="bar" style="color: red">hello there</span></div>

new Sanitizer({allowAttributes: {"style": ["span"]}}).sanitizeFor("div", str)
// <div><span style="color: red">hello there</span></div>

new Sanitizer({dropAttributes: {"id": ["span"]}}).sanitizeFor("div", str)
// <div><span class="bar" style="color: red">hello there</span></div>
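
The attribute options map an attribute name to the list of elements it applies to. The toy function below sketches that lookup for a single element; `filterAttributes` and the plain-object attribute format are assumptions for illustration, and the browser's default attribute baseline is deliberately left out.

```javascript
// Toy model only: attrs is a plain object of attribute name -> value.
// filterAttributes is a hypothetical helper, not part of the API.
function filterAttributes(tag, attrs, config = {}) {
  const { allowAttributes, dropAttributes = {} } = config;
  // True when the option lists this element for the given attribute.
  const listsTag = (list) => Array.isArray(list) && list.includes(tag);

  const kept = {};
  for (const [name, value] of Object.entries(attrs)) {
    if (listsTag(dropAttributes[name])) continue; // explicitly dropped
    if (allowAttributes && !listsTag(allowAttributes[name])) continue; // not on the allow list
    kept[name] = value;
  }
  return kept;
}

const attrs = { id: 'foo', class: 'bar', style: 'color: red' };

console.log(filterAttributes('span', attrs, { allowAttributes: { style: ['span'] } }));
// { style: 'color: red' }
console.log(filterAttributes('span', attrs, { dropAttributes: { id: ['span'] } }));
// { class: 'bar', style: 'color: red' }
```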
  • allowCustomElements controls whether custom elements are allowed
const str = `<elem>hello there</elem>`

new Sanitizer().sanitizeFor("div", str);
// <div></div>

new Sanitizer({ allowCustomElements: true,
                allowElements: ["div", "elem"]
              }).sanitizeFor("div", str);
// <div><elem>hello there</elem></div>

If no configuration is passed, the default configuration is used.

This API looks able to solve many problems for us, but browser support is still limited, and more features are still being refined. We look forward to seeing a more complete Sanitizer API.

If you are interested, you can enable it in Chrome 93+ via about://flags/#enable-experimental-web-platform-features. In Firefox it is still at the experimental stage and can be enabled through about:config.
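
Because support is limited, code that wants to use the API today will probably need feature detection. The sketch below is one possible approach, under the assumption that calling setHTML() with only the string argument applies the default configuration; `insertUntrustedHTML` is a hypothetical helper name. When setHTML() is missing, it falls back to textContent, which shows the input as plain text instead of parsing it.

```javascript
// Minimal sketch: use Element.setHTML() when the browser provides it,
// otherwise fall back to inserting the input as inert text.
// insertUntrustedHTML is a hypothetical helper, not part of the API.
function insertUntrustedHTML(element, html) {
  if (typeof element.setHTML === 'function') {
    element.setHTML(html); // assumption: default sanitizer configuration
    return 'sanitized';
  }
  element.textContent = html; // never parsed as HTML
  return 'text';
}

// In a browser this would be called with a real element, for example:
// insertUntrustedHTML(document.querySelector('div'), user_input)
```

The fallback loses formatting such as <em>, so a JavaScript sanitizing library may be a better choice where rich text must be preserved.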

For more information, please check: https://developer.mozilla.org/en-US/docs/Web/API/HTML_Sanitizer_API

Concerns about data security

According to the Verizon 2020 Data Breach Investigations Report (Verizon Business, 2020), approximately 90% of data breaches are caused by cross-site scripting (XSS) and security vulnerabilities. Facing increasingly frequent network attacks, front-end developers can, in addition to relying on security mechanisms such as the Sanitizer API, also consider using front-end spreadsheet controls such as SpreadJS.


GrapeCity Technical Team

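Founded in 1980, GrapeCity is a professional provider of software development technology and low-code platforms. With the mission of "empowering developers", GrapeCity is committed to innovating development models, improving development efficiency, and advancing the software industry through a range of software development tools and services, helping to accelerate the building of a "Digital China".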