On October 18th, W3C's Web Platform Incubator Community Group announced the draft specification of the HTML Sanitizer API. The draft addresses how browsers can defend against XSS attacks.
One class of security problem that gives developers a headache is cross-site scripting (XSS). This type of attack exploits loopholes left during web page development to inject malicious code into a page, so that users load and execute a program crafted by the attacker.
The malicious code is not filtered out and is mixed with the site's normal code. The browser cannot tell which content is trustworthy, so it executes the malicious script. At its core, an XSS attack has two steps: 1. the attacker submits malicious code; 2. the browser executes the malicious code.
To break either of these two steps, the usual countermeasures are:
- Add filter conditions
- Render purely on the front end, keeping data separate from code
- Fully escape HTML
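As a rough illustration of the last item, full escaping can be sketched in a few lines of JavaScript. The helper name escapeHtml and the HTML_ESCAPES table are assumptions made for this example and do not come from any library; the sketch only covers text placed between tags or inside quoted attribute values.

```javascript
// Hypothetical helper (not part of any library): escape the five characters
// that have special meaning in HTML text and quoted attribute values.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtml(untrusted) {
  return String(untrusted).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const payload = '<img src="" onerror=alert(0)>';
const unsafeMarkup = `<p>${payload}</p>`;           // payload stays live markup
const safeMarkup = `<p>${escapeHtml(payload)}</p>`; // payload becomes inert text

console.log(safeMarkup);
// <p>&lt;img src=&quot;&quot; onerror=alert(0)&gt;</p>
```

Escaping of this kind turns every tag into visible text, so it cannot be used when some user-supplied formatting is supposed to survive.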
These measures are cumbersome, and there are many details to get right. To let developers deal with XSS attacks more conveniently, browsers now provide a native sanitization capability.
The HTML Sanitizer API, jointly initiated by Google, Mozilla and Cure53, is about to be finalized. With this browser-native API, we can protect web applications from XSS attacks more easily.
Next, let's take a look at this security API.
Introduction to Sanitizer API
The Sanitizer API allows the browser to remove malicious code directly from the dynamically updated markup of a website. When a potentially malicious HTML string, document or document fragment is about to be inserted into the existing DOM, we can use the HTML Sanitizer API to clean it up first. It works a bit like an antivirus application on a computer, removing risky content.
There are three advantages to using Sanitizer API:
- Reduce the number of cross-site scripting attacks in web applications
- Ensure the safe use of HTML output content in the current user agent
- The Sanitizer API is easy to use
Features of Sanitizer API
The Sanitizer API opens the door to a new world of HTML string security. Broadly speaking, its functionality can be grouped into the following three main features:
1. Sanitizing user input
The main function of the Sanitizer API is to accept strings and convert them into safer strings. The converted strings will not execute additional JavaScript, which protects the application from XSS attacks.
2. Built into the browser
The library ships with the browser and is updated when a bug is discovered or a new attack appears. In effect, the browser has built-in protection, and no external library needs to be imported.
3. Simple and safe to use
With the Sanitizer API, sanitization relies on the browser's own robust and safe parser. A mature browser already knows how each element in the DOM behaves. In contrast, external parsers written in JavaScript are not only costly, but can also easily fall behind the pace of change in the front-end environment.
With these highlights covered, let's look at how the API is actually used.
Use of Sanitizer API
The Sanitizer API is configured through the Sanitizer class, which is instantiated with the Sanitizer() constructor.
The specification provides three basic sanitization methods:
1. Sanitize a string with an implicit context
Element.setHTML() parses and sanitizes the string and inserts it into the DOM immediately. This method is suitable when the target DOM element is known and the HTML content is a string.
const $div = document.querySelector('div') // The target element.
const user_input = `<em>Hello There</em><img src="" onerror=alert(0)>` // The untrusted user string.
const sanitizer = new Sanitizer() // Our Sanitizer
// We want to insert the HTML in user_input into the target element $div.
// That is, we want the equivalent of $div.innerHTML = user_input, except
// without the XSS risks.
$div.setHTML(user_input, sanitizer) // <div><em>Hello There</em><img src=""></div>
2. Sanitize a string for a given context
Sanitizer.sanitizeFor() parses and sanitizes a string and prepares it to be added to the DOM later.
It is suitable when the HTML content is a string and the type of the target DOM element is known (for example, div or span).
const user_input = `<em>Hello There</em><img src="" onerror=alert(0)>`
const sanitizer = new Sanitizer()
// Later:
// The first parameter describes the node type this result is intended for.
sanitizer.sanitizeFor("div", user_input) // HTMLDivElement <div>
Note that sanitizeFor() returns an HTMLElement. To get the sanitized result as a string, read its .innerHTML property.
sanitizer.sanitizeFor("div", user_input).innerHTML // <em>Hello There</em><img src="">
3. Sanitize nodes
For a DocumentFragment under user control, Sanitizer.sanitize() can sanitize the DOM tree nodes directly.
// Case: The input data is available as a tree of DOM nodes.
const sanitizer = new Sanitizer()
const $userDiv = ...;
$div.replaceChildren(sanitizer.sanitize($userDiv));
Beyond these three methods, the Sanitizer API modifies HTML strings by deleting and filtering elements and attributes.
For example:
- Certain elements (_script, marquee, head, frame, menu, object, etc._) are removed, while content elements are kept.
- Most attributes are removed; only a few, such as href on <a> and colspan on <td> and <th>, are kept.
- Content that could lead to the execution of risky scripts is filtered out.
With the default settings, this security API is only concerned with preventing XSS. In some cases we also need custom settings. Here are some commonly used configuration options.
Custom sanitization
Create a configuration object and pass it to the constructor when creating the Sanitizer.
const config = {
  allowElements: [],
  blockElements: [],
  dropElements: [],
  allowAttributes: {},
  dropAttributes: {},
  allowCustomElements: true,
  allowComments: true
};
// sanitized result is customized by configuration
new Sanitizer(config)
Here are some common options:
- allowElements keeps only the listed elements; other elements are removed, but their content is kept
- blockElements removes the listed elements but keeps their content
- dropElements removes the listed elements together with their content
const str = `hello <b><i>there</i></b>`
new Sanitizer().sanitizeFor("div", str)
// <div>hello <b><i>there</i></b></div>
new Sanitizer({allowElements: [ "b" ]}).sanitizeFor("div", str)
// <div>hello <b>there</b></div>
new Sanitizer({blockElements: [ "b" ]}).sanitizeFor("div", str)
// <div>hello <i>there</i></div>
new Sanitizer({allowElements: []}).sanitizeFor("div", str)
// <div>hello there</div>
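The difference between keeping, blocking and dropping an element can be sketched with a toy model that runs outside the browser. This is only an approximation chosen to reproduce the outputs shown above, not the algorithm browsers implement; the functions filterElements and toHtml and the { tag, children } node shape are assumptions made for this example.

```javascript
// Toy model, NOT the browser's algorithm: a node is either a text string
// or an object of the form { tag, children }.
function filterElements(nodes, config = {}) {
  const { allowElements, blockElements = [], dropElements = [] } = config;
  const result = [];
  for (const node of nodes) {
    if (typeof node === 'string') {
      result.push(node); // text is always kept
      continue;
    }
    if (dropElements.includes(node.tag)) {
      continue; // drop: the element and everything inside it disappear
    }
    const children = filterElements(node.children, config);
    const notAllowed =
      allowElements !== undefined && !allowElements.includes(node.tag);
    if (blockElements.includes(node.tag) || notAllowed) {
      result.push(...children); // block: the tag is removed, its children stay
    } else {
      result.push({ tag: node.tag, children });
    }
  }
  return result;
}

// Serialize the toy tree back into an HTML-like string.
function toHtml(nodes) {
  return nodes
    .map((n) =>
      typeof n === 'string' ? n : `<${n.tag}>${toHtml(n.children)}</${n.tag}>`
    )
    .join('');
}

// Tree for: hello <b><i>there</i></b>
const tree = [
  'hello ',
  { tag: 'b', children: [{ tag: 'i', children: ['there'] }] },
];

console.log(toHtml(filterElements(tree, { allowElements: ['b'] }))); // hello <b>there</b>
console.log(toHtml(filterElements(tree, { blockElements: ['b'] }))); // hello <i>there</i>
console.log(toHtml(filterElements(tree, { allowElements: [] })));    // hello there
console.log(JSON.stringify(toHtml(filterElements(tree, { dropElements: ['b'] })))); // "hello "
```

In this model, an element missing from allowElements is treated like a blocked element, which is what the allowElements outputs above suggest.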
- allowAttributes and dropAttributes customize which attributes are kept or removed. Each maps an attribute name to the list of elements it applies to.
const str = `<span id=foo class=bar style="color: red">hello there</span>`
new Sanitizer().sanitizeFor("div", str)
// <div><span id="foo" class="bar" style="color: red">hello there</span></div>
new Sanitizer({allowAttributes: {"style": ["span"]}}).sanitizeFor("div", str)
// <div><span style="color: red">hello there</span></div>
new Sanitizer({dropAttributes: {"id": ["span"]}}).sanitizeFor("div", str)
// <div><span class="bar" style="color: red">hello there</span></div>
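The attribute options can be modelled in the same spirit. The sketch below is again a toy approximation rather than the browser's algorithm: filterAttributes and appliesTo are hypothetical helper names, attributes are represented as a plain object, and the default attribute list built into the real API is ignored.

```javascript
// Toy model, NOT the browser's algorithm. Both options map an attribute
// name to the list of element names it applies to, as in the examples above.
function appliesTo(map, attrName, tag) {
  const hasEntry = Object.prototype.hasOwnProperty.call(map, attrName);
  return hasEntry && map[attrName].includes(tag);
}

function filterAttributes(tag, attrs, config = {}) {
  const { allowAttributes, dropAttributes = {} } = config;
  const kept = {};
  for (const [name, value] of Object.entries(attrs)) {
    // Without an allow list, every attribute is a candidate to keep.
    const allowed =
      allowAttributes === undefined || appliesTo(allowAttributes, name, tag);
    const dropped = appliesTo(dropAttributes, name, tag);
    if (allowed && !dropped) kept[name] = value;
  }
  return kept;
}

const attrs = { id: 'foo', class: 'bar', style: 'color: red' };

console.log(filterAttributes('span', attrs, { allowAttributes: { style: ['span'] } }));
// { style: 'color: red' }
console.log(filterAttributes('span', attrs, { dropAttributes: { id: ['span'] } }));
// { class: 'bar', style: 'color: red' }
```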
- allowCustomElements controls whether custom elements are allowed
const str = `<elem>hello there</elem>`
new Sanitizer().sanitizeFor("div", str);
// <div></div>
new Sanitizer({ allowCustomElements: true,
allowElements: ["div", "elem"]
}).sanitizeFor("div", str);
// <div><elem>hello there</elem></div>
If no configuration is passed, the default configuration is used.
This API looks set to solve a lot of problems for us, but browser support is still limited and more features are still being worked out. We look forward to seeing a more complete Sanitizer API.
If you are interested, you can enable it in Chrome 93+ via about://flags/#enable-experimental-web-platform-features. In Firefox it is still at the experimental stage and can be enabled through about:config.
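Because support is limited, a page that wants to try the API would probably feature-detect it first. The following is a minimal sketch under stated assumptions: setUntrustedHTML is a hypothetical helper name, the call follows the setHTML(input, sanitizer) form used in this article (later drafts of the specification may differ), and the fallback simply shows the markup as literal text.

```javascript
// Hypothetical helper: insert untrusted HTML with the Sanitizer API when it
// is available, otherwise fall back to inserting the input as plain text.
// SanitizerCtor is a parameter so the helper can be exercised outside a browser.
function setUntrustedHTML(element, html, SanitizerCtor = globalThis.Sanitizer) {
  const supported =
    typeof SanitizerCtor === 'function' &&
    typeof element.setHTML === 'function';
  if (supported) {
    // Call form taken from this article's examples (an assumption for newer drafts).
    element.setHTML(html, new SanitizerCtor());
    return 'sanitized';
  }
  // Safe fallback: textContent never parses markup, so nothing can execute.
  element.textContent = html;
  return 'text';
}

// Usage with a stand-in object; in a page this would be a real DOM element.
const fakeElement = {};
console.log(setUntrustedHTML(fakeElement, '<em>Hello There</em>', undefined)); // text
console.log(fakeElement.textContent); // <em>Hello There</em>
```

The textContent fallback loses formatting, but it fails safe, which seems a reasonable default while the API is still experimental.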
For more information, please check: https://developer.mozilla.org/en-US/docs/Web/API/HTML_Sanitizer_API
Concerns about data security
According to the Verizon 2020 Data Breach Investigation Report (Verizon Business, 2020), approximately 90% of data breaches are caused by cross-site scripting (XSS) and security vulnerabilities. Facing increasingly frequent network attacks, front-end developers can rely on security mechanisms such as the Sanitizer API, and can also consider using front-end table controls such as SpreadJS.