Author: imyzf

Overview

Content as structured data.

(Tagline on the unified official website)

unified is an ecosystem of text-processing tools. Together with the plugins in its ecosystem, it can process Markdown, HTML, natural language, and more. The unified library itself serves as a common execution interface: it acts as the executor and calls those plugins to carry out processing tasks.

As the unified official website shows, unified is widely used: Prettier, the Node.js official website, and Gatsby all rely on it for some of their functionality.

Figure: Usage examples shown on the unified official website

Common usage scenarios include:

  • Generating HTML pages and sites from Markdown
  • Processing Markdown/HTML content
  • Checking and formatting Markdown syntax
  • Serving as a low-level library for tools built around specific scenarios

Because few articles in China cover the unified ecosystem, this article introduces its plugin ecosystem and working principle, and walks through several usage examples to help readers understand what unified can do, how it works, and where it is used.

Plugin ecosystem

Figure: Plugins in the unified ecosystem

remark

remark is a collection of Markdown-related plugins that can parse and modify Markdown and convert it into HTML.

Some commonly used plugins currently provided:

For the complete list, see the remark plugin list, which offers more than 150 plugins to choose from.

We can call remark in a project with this shorthand:

import { remark } from 'remark'

remark() // Initialize a Markdown processor in one call
  .processSync('# Hello, world!') // Process the text synchronously

This is equivalent to:

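import { unified } from 'unified'
import remarkParse from 'remark-parse'
import remarkStringify from 'remark-stringify'

unified() // Use the common unified interface
  .use(remarkParse)  // Use the Markdown parser plugin
  .use(remarkStringify) // Use the Markdown serializer plugin
  .processSync('# Hello, world!')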

Figure: remark usage and conversion example

Note that GitHub also hosts a project with the same name, gnab/remark, whose official website is remarkjs.com. Although it is also a Markdown-related tool, it has nothing to do with the remark of the unified ecosystem, whose official website is remark.js.org. Take care not to confuse the two when searching for information.

rehype

Like remark, rehype is a collection of HTML-related plugins that provide capabilities such as HTML formatting, minification, and document generation.

By comparison, rehype has fewer plugins, just over 40. For details, see its plugin list document.

We can also use rehype-remark and remark-rehype to convert between the two plugin systems. For example, the following converts HTML read from stdin into Markdown:

import { unified } from 'unified'
import { stream } from 'unified-stream'
import rehypeParse from 'rehype-parse'
import rehypeRemark from 'rehype-remark'
import remarkStringify from 'remark-stringify'

const processor = unified()
  .use(rehypeParse)     // Parse HTML
  .use(rehypeRemark)    // Convert to the remark system
  .use(remarkStringify) // Serialize the syntax tree to a Markdown string

process.stdin.pipe(stream(processor)).pipe(process.stdout)

Others

retext and redot are two smaller systems that are less widely used and less actively developed than the two above. Their uses are as follows:

  • retext: Provides natural language processing capabilities, including spell checking, error correction, readability checking, etc.
  • redot: Provides parsing capabilities for Graphviz

Also in the Markdown area, there are two projects whose names do not begin with "re", mdx and micromark, each targeting specific Markdown usage scenarios:

  • mdx: Lets you write JSX in Markdown documents, so you can bring components into a document and write interactive documentation
  • micromark: A minimalist Markdown conversion library that supports a small number of extensions and suits simple Markdown-to-HTML scenarios; remark also reuses micromark's parsing capabilities

See each project's documentation for details; they are not covered further here.

Working principle

The core mechanism of unified is based on the AST (abstract syntax tree). When a plugin runs, it receives the AST and can process it in any way it needs. An AST can also be converted between languages: for example, after parsing a Markdown document, we can convert it to HTML for processing and then convert it back to Markdown.

Figure: unified workflow
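To make the workflow concrete, here is a minimal, dependency-free sketch of the same three-phase shape: parse text into a tree, run transformers on the tree, then serialize it. This is not unified's implementation. createProcessor, toyParse, shoutHeadings, and toyHtml are hypothetical names invented for illustration, and the toy parser only recognizes lines starting with "# ".

```javascript
// Toy processor (hypothetical, NOT the real unified code): same parse -> transform -> stringify shape.
function createProcessor() {
  const transformers = []
  const processor = {
    use(plugin) {
      // A plugin may attach a parser or compiler to the processor, or return a transformer.
      const transformer = plugin.call(processor)
      if (typeof transformer === 'function') transformers.push(transformer)
      return processor
    },
    processSync(text) {
      let tree = processor.parser(text)                      // 1. text -> AST
      for (const t of transformers) tree = t(tree) || tree   // 2. AST -> AST
      return processor.compiler(tree)                        // 3. AST -> text
    }
  }
  return processor
}

// Toy parser plugin: "# ..." lines become heading nodes, other non-empty lines become paragraphs.
function toyParse() {
  this.parser = text => ({
    type: 'root',
    children: text.split('\n').filter(Boolean).map(line =>
      line.startsWith('# ')
        ? { type: 'heading', depth: 1, value: line.slice(2) }
        : { type: 'paragraph', value: line }
    )
  })
}

// Toy transformer plugin: upper-cases every heading in place.
function shoutHeadings() {
  return tree => {
    for (const node of tree.children) {
      if (node.type === 'heading') node.value = node.value.toUpperCase()
    }
  }
}

// Toy compiler plugin: serializes the tree to HTML.
function toyHtml() {
  this.compiler = tree => tree.children
    .map(n => (n.type === 'heading' ? `<h1>${n.value}</h1>` : `<p>${n.value}</p>`))
    .join('\n')
}

const html = createProcessor()
  .use(toyParse)
  .use(shoutHeadings)
  .use(toyHtml)
  .processSync('# Hello, world!\nSome text')

console.log(html)
```

The point of the sketch is that plugins only ever talk to the tree, so a parser, any number of transformers, and a compiler can be mixed freely as long as they agree on the tree format.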

For example, a plugin can traverse the AST and print every heading node:

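import { visit } from 'unist-util-visit'

// A plugin returns a transformer, which receives the AST
export default () => tree => {
  visit(tree, 'heading', node => {
    console.log(node) // Print each heading node
  })
}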

The visit function in the example above comes from the unist-util-visit utility, which traverses nodes. unified uses an AST standard called unist (Universal Syntax Tree, UST) so that the same tools work across different languages. For example, because the ASTs for Markdown and HTML follow the same standard, we can use the same visit API on both:

visit(markdownAST, 'image', transformImages) // mdast: image nodes have type 'image'
visit(htmlAST, { type: 'element', tagName: 'img' }, transformImgs) // hast: match <img> elements
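The reason one traversal helper can serve both languages is that every unist node has a type and parent nodes keep their children in a children array. The sketch below is a simplified stand-in for unist-util-visit, not its real implementation: visitByType and collectText are hypothetical helpers, and the two trees are hand-written to resemble mdast and hast.

```javascript
// Simplified stand-in for unist-util-visit (hypothetical): depth-first walk,
// calling the visitor on every node whose type matches.
function visitByType(node, type, visitor) {
  if (node.type === type) visitor(node)
  for (const child of node.children || []) visitByType(child, type, visitor)
}

// Hand-written tree shaped like a Markdown AST (mdast).
const markdownTree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Title' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'Body' }] }
  ]
}

// Hand-written tree shaped like an HTML AST (hast).
const htmlTree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [{ type: 'text', value: 'Title' }]
    }
  ]
}

// The same helper collects text from either tree, because both use 'text' nodes with a value.
function collectText(tree) {
  const out = []
  visitByType(tree, 'text', node => out.push(node.value))
  return out
}

const markdownText = collectText(markdownTree)
const htmlText = collectText(htmlTree)
console.log(markdownText, htmlText)
```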

Usage examples

Next, we look at several usage scenarios built on the unified ecosystem to help you better understand how it is used.

Node.js official website

The Node.js official website uses unified in two main ways, syntax checking and documentation building:

  • Markdown documents are checked with remark-cli; see the script configuration in package.json
  • Documentation is built with unified; see the code in generate.mjs

dumi

dumi is a documentation tool tailored to component development. Its core function is to convert Markdown documents into HTML pages. Its source code (remark/index.ts) shows that unified is used as the converter, calling a series of custom and community-provided plugins.

Because it uses many custom plugins, the dumi source code is an excellent reference for unified plugin development. For example, link.ts shows how to modify the AST of external links in Markdown so that a small link icon is added on the generated page, reminding users that the link leads to an external site.

Document source code:

[云音乐官网](https://music.163.com/)

is converted to:

<a target="_blank" rel="noopener noreferrer" href="https://music.163.com/">
  云音乐官网
  <svg class="__dumi-default-external-link-icon">……</svg>
</a>
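A transformation like this can be sketched as a small tree walk. The code below is not dumi's link.ts. It is a hypothetical, dependency-free illustration on a hand-written hast-like tree; markExternalLinks, the "starts with http(s)://" test for external links, and the external-link-icon class name are all assumptions made for the example.

```javascript
// Hypothetical sketch (not dumi's actual code): mark external links in a hast-like tree.
function markExternalLinks(node) {
  if (node.type === 'element' && node.tagName === 'a') {
    const props = node.properties || (node.properties = {})
    // Assumption: a link counts as external when its href starts with http:// or https://.
    if (/^https?:\/\//.test(String(props.href || ''))) {
      props.target = '_blank'
      props.rel = ['noopener', 'noreferrer']
      // Append a placeholder icon element after the link text.
      node.children.push({
        type: 'element',
        tagName: 'svg',
        properties: { className: ['external-link-icon'] },
        children: []
      })
      return
    }
  }
  for (const child of node.children || []) markExternalLinks(child)
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'a',
      properties: { href: 'https://music.163.com/' },
      children: [{ type: 'text', value: '云音乐官网' }]
    },
    {
      type: 'element',
      tagName: 'a',
      properties: { href: '/guide' },
      children: [{ type: 'text', value: 'Guide' }]
    }
  ]
}

markExternalLinks(tree)
const [externalLink, internalLink] = tree.children
console.log(externalLink.properties, externalLink.children.length)
```

In a real plugin the same logic would sit inside a transformer and would usually use a traversal utility rather than hand-written recursion.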

react-markdown

react-markdown is part of the remark family: a higher-level package built on the unified ecosystem that provides a React component for rendering Markdown. In React, compared with using remark directly to convert Markdown to HTML and then rendering it with dangerouslySetInnerHTML, react-markdown is safer and more reliable, and simpler to use.

Figure: How react-markdown works

The figure above shows how react-markdown works. The process is as follows:

  1. Parse Markdown into its AST (mdast) with remark
  2. Process the mdast with remark plugins
  3. Convert the mdast into an HTML AST (hast) with remark-rehype
  4. Process the hast with rehype plugins
  5. Render the hast to React elements using React components

This is in fact the general process for rendering Markdown into HTML, and it can serve as a reference when implementing similar libraries.
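The last step, turning hast into elements instead of an HTML string, is what removes the need for dangerouslySetInnerHTML. Below is a rough, dependency-free sketch of that idea, not react-markdown's actual code. hastToElements is a hypothetical helper that accepts any factory with the shape h(type, props, ...children), and the stub h stands in for React.createElement so the example runs without React.

```javascript
// Hypothetical sketch: convert a hast-like tree into elements through a
// createElement-style factory h(type, props, ...children).
function hastToElements(node, h) {
  if (node.type === 'text') return node.value // Text stays a plain string child
  const children = (node.children || []).map(child => hastToElements(child, h))
  if (node.type === 'root') return children
  if (node.type === 'element') return h(node.tagName, node.properties || {}, ...children)
  return null // Other node types (comments, doctypes) are dropped in this sketch
}

// Stub factory standing in for React.createElement; it just records its arguments.
const h = (type, props, ...children) => ({ type, props, children })

const hast = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [{ type: 'text', value: 'Hello <b>world</b>' }]
    }
  ]
}

const elements = hastToElements(hast, h)
console.log(JSON.stringify(elements))
```

Because the text is handed over as a string child rather than spliced into markup, a real React renderer would display `<b>world</b>` literally instead of interpreting it as HTML.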

About the author of unified

As of 2022-01-05, the unified ecosystem contains 333 open source projects in total, and its core developer is Titus Wormer. According to Wormer's personal website, he is from the Netherlands, graduated from the Amsterdam University of Applied Sciences, and once served as a lecturer there. As a full-time open source contributor, he maintains more than 535 projects in total, with 50% devoted to unified projects. Making such a large contribution to the open source community single-handedly is admirable. For how he manages the unified organization, see the unified collective document.

This article is published by the NetEase Cloud Music Front-end Team; reprinting in any form without authorization is prohibited. We recruit front-end, iOS, and Android engineers all year round. If you are ready to change jobs and happen to like Cloud Music, join us at grp.music-fe(at)corp.netease.com!
