Hi everyone, I'm Kasong.

In day-to-day work we often need to handle risky DOM content, for example:

  • The text paste feature of various tools
  • Scenarios where an HTML string needs to be rendered

In order to prevent a potential XSS attack, there are two options:

  • escape (escaping)
  • sanitize (sanitizing)

This article introduces the difference between the two, as well as the DOM sanitizing API: Sanitizer.

The content of this article comes from Safe DOM manipulation with the Sanitizer API

Escaping and sanitizing

Suppose we want to insert this HTML string into the DOM:

const str = "<img src='' onerror='alert(0)'>";

If you assign it directly to an element's innerHTML, the img element's onerror callback can execute JS code, which brings XSS risk.

A common solution is to escape the string.

What is escape

The browser parses some reserved characters as HTML code, such as:

  • < is parsed as the beginning of the tag
  • > is parsed as the end of the tag
  • " is parsed as the beginning and end of an attribute value

In order to display these reserved characters as text (not parsed as HTML code), you can replace them with the corresponding entity ( HTML entity):

  • < entity is &lt;
  • > entity is &gt;
  • " entity is &quot;

Replacing reserved characters with their HTML entities in this way is called escaping.
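
The replacement described above can be sketched as a small function. This is a minimal illustration rather than a complete escaping library: the names escapeHtml and HTML_ENTITIES are made up for this example, and the entries for & and ' (&amp; and &#39;) are additions beyond the three characters listed above.

```javascript
// Map each reserved character to its HTML entity.
// "&" and "'" are not in the list above; they are added here as an assumption,
// because "&" starts an entity and "'" can also delimit attribute values.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical helper: replace every reserved character with its entity.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

const risky = "<img src='' onerror='alert(0)'>";
console.log(escapeHtml(risky));
// &lt;img src=&#39;&#39; onerror=&#39;alert(0)&#39;&gt;
```

After escaping, the browser displays the string as plain text instead of creating an img element, so the onerror callback never runs.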

What is sanitize

For the HTML string above:

const str = "<img src='' onerror='alert(0)'>";

Besides escaping the string to avoid XSS, there is a more intuitive idea: filter out the onerror attribute directly.

This method of directly removing harmful code from an HTML string (such as <script>) is called sanitizing.

It requires an API: Sanitizer.

First, we construct a Sanitizer instance:

const sanitizer = new Sanitizer();

Call the sanitizeFor method of the instance, passing in the container element type and the HTML string to be sanitized:

sanitizer.sanitizeFor("div", str);

This returns an HTMLDivElement (the container element type we passed in) that contains an img without the onerror attribute.

By default, Sanitizer removes everything that could cause JS to execute.

Rich configuration

Sanitizer not only works out of the box, but also provides a wealth of whitelist and blacklist configurations:

const config = {
  allowElements: [],
  blockElements: [],
  dropElements: [],
  allowAttributes: {},
  dropAttributes: {},
  allowCustomElements: true,
  allowComments: true
};

new Sanitizer(config)

For example, allowElements defines an element whitelist, and only the elements in the list are kept. The corresponding blockElements is an element blacklist:

const str = `hello <b><i>world</i></b>`

new Sanitizer().sanitizeFor("div", str)
// <div>hello <b><i>world</i></b></div>

new Sanitizer({allowElements: [ "b" ]}).sanitizeFor("div", str)
// <div>hello <b>world</b></div>

new Sanitizer({blockElements: [ "b" ]}).sanitizeFor("div", str)
// <div>hello <i>world</i></div>

new Sanitizer({allowElements: []}).sanitizeFor("div", str)
// <div>hello world</div>

allowAttributes is the attribute whitelist, and the corresponding dropAttributes is the attribute blacklist. For the following configuration:

{
  allowAttributes: {"style": ["span"]},
  dropAttributes: {"id": ["*"]}
}

This means that in the sanitized HTML:

  • only span elements are allowed to have the style attribute
  • the id attribute is removed from all elements (the * wildcard represents all elements)

Compatibility

So how is the browser support for such an appealing API?

Currently it is only available in Chrome 93 and later, after enabling the experimental flag:

about://flags/#enable-experimental-web-platform-features

Although the native Sanitizer is far from stable, you can use the DOMPurify library to achieve similar functionality.
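
Because the native API is experimental, code that relies on it would normally check for it first. The sketch below is one possible way to do that, not an official pattern: sanitizeHtml and its env parameter are hypothetical names, and it assumes DOMPurify has been loaded as a global named DOMPurify. Since the Sanitizer API may still change, the check also verifies that sanitizeFor exists.

```javascript
// Hypothetical helper: pick whichever sanitizer is available.
// `env` defaults to the global object; passing it in makes the check easy to test.
function sanitizeHtml(html, env = globalThis) {
  if (typeof env.Sanitizer === "function") {
    const sanitizer = new env.Sanitizer();
    if (typeof sanitizer.sanitizeFor === "function") {
      // Experimental native path: returns a container element (here a div).
      return { source: "Sanitizer", result: sanitizer.sanitizeFor("div", html) };
    }
  }
  if (env.DOMPurify && typeof env.DOMPurify.sanitize === "function") {
    // Assumption: DOMPurify is loaded globally; sanitize() returns an HTML string by default.
    return { source: "DOMPurify", result: env.DOMPurify.sanitize(html) };
  }
  // Neither sanitizer is available; the caller decides what to do (for example, escape instead).
  return null;
}
```

Note that the two paths return different types (an element versus a string), so the caller has to handle each case according to the source field.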

Postscript

Do you prefer escaping or sanitizing?

You are welcome to join the human high-quality front-end framework research group and take off together.
