pnpm stands for "Performant NPM", that is, a high-performance npm. It combines soft links and hard links with a new way of organizing dependencies, which greatly improves the efficiency of package management. It also solves the "phantom dependency" problem, making package management more disciplined and reducing potential risks.

pnpm is easy to start using. You can install it with npm:

 npm i pnpm -g

After that, you can use the pnpm command in place of npm. For example, for the most important step, installing packages, use pnpm i instead of npm i.

Advantages of pnpm

An easy way to remember pnpm's advantages is three words, "fast, accurate, ruthless":

  • Fast: Installation is fast.
  • Accurate: Installed dependencies are precisely reused and cached. Even when a package is upgraded, only the files that changed are stored, so no disk space is wasted and the logic stays rigorous.
  • Ruthless: It abolishes phantom dependencies outright. Between logical correctness and vague convenience, it mercilessly chooses correctness.

The ideas behind these advantages are all shown in this diagram from the official website:

  • All npm packages are installed into the global directory ~/.pnpm-store/v3/files. The same version of a package is stored only once, and for different versions of a package only the files that differ are stored.
  • Each project's node_modules contains a .pnpm directory that holds the actual contents of every package version in a flat structure, with each file hard-linked to its address in pnpm-store.
  • The packages installed directly under each project's node_modules form a tree that follows Node's nearest-first lookup rule, and each entry is a soft link to the package in node_modules/.pnpm.

Therefore, every package lookup passes through three layers: node_modules/package-a > soft link node_modules/.pnpm/package-a@1.0.0/node_modules/package-a > hard link ~/.pnpm-store/v3/files/00/xxxxxx.

What are the benefits of going through these three layers of addressing? Why three layers instead of two or four?

The purpose of three-layer addressing for dependency files

The first layer

Consider the example above. The first layer of dependency lookup is performed by the runtime or bundler, such as Node.js or webpack. These tools look for dependencies in the node_modules folder and follow the nearest-first rule, so the first-layer entry must be written to node_modules/package-a. This follows the standard lookup path, and at the same time dependencies are neither hoisted to a parent directory nor flattened. The goal is to reproduce the package.json definition as faithfully as possible: a package can depend only on the packages it declares, and on nothing it does not declare. The sub-dependencies of each package are likewise looked up inside that package, which solves multi-version management and also gives node_modules a stable structure: the directory layout depends only on the package.json definitions and has nothing to do with the order in which packages are installed.

If we stopped here, this would simply be the package management scheme of npm@2.x. Because the npm@2.x scheme is the least ambiguous, the first layer follows its design.
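
To make the nearest-first rule concrete, here is a minimal Python sketch that lists the node_modules directories a lookup would check, nearest first. The function name node_modules_candidates and the /project path are made up for illustration, and this is a simplification of what Node.js does rather than its actual implementation.

```python
from pathlib import PurePosixPath
from typing import List


def node_modules_candidates(start_dir: str) -> List[str]:
    """List the node_modules directories checked for a bare import, nearest first."""
    candidates = []
    current = PurePosixPath(start_dir)
    while True:
        # A directory that is itself named node_modules is skipped, so the
        # search never probes node_modules/node_modules.
        if current.name != "node_modules":
            candidates.append(str(current / "node_modules"))
        if current == current.parent:  # reached the filesystem root
            break
        current = current.parent
    return candidates


# Hypothetical location of a file inside package-a
paths = node_modules_candidates("/project/node_modules/package-a/lib")
for path in paths:
    print(path)
```

Because a package's own node_modules is checked before the project's, each package can carry its own version of a sub-dependency, which is what makes the nested layout unambiguous.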

The second layer

From the second layer on, the problems caused by the npm@2.x design must be solved, chiefly the problem of package reuse. So the second-layer addressing, node_modules/package-a > soft link node_modules/.pnpm/package-a@1.0.0/node_modules/package-a, uses soft links to avoid duplicating the same code. Compared with npm@3's design of flattening packages, soft links keep the package structure stable while using file pointers to avoid occupying disk space repeatedly.

If we stopped here, package management within a single project would be solved. But there is usually more than one project, and keeping a separate copy of the same package for each project is still too wasteful, so a third layer of mapping is needed.

The third layer

The third-layer mapping, node_modules/.pnpm/package-a@1.0.0/node_modules/package-a > hard link ~/.pnpm-store/v3/files/00/xxxxxx, leaves the current project path and points to a globally managed path, which is the inevitable choice for cross-project reuse. pnpm goes one step further: instead of storing a package's source tree directly in pnpm-store, it breaks the package into individual file blobs, as explained in detail later.
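
The three layers can be reproduced by hand in a temporary directory. The Python sketch below is only an illustration of the structure described above, not how pnpm itself creates it: the store file name 00/abc123 is a made-up placeholder, and the soft link uses an absolute target for simplicity.

```python
import os
import tempfile

root = tempfile.mkdtemp()

# Layer 3: a global store holding file content under a hash-like name.
# The name "00/abc123" is a made-up placeholder, not a real pnpm hash.
store_file = os.path.join(root, "pnpm-store", "v3", "files", "00", "abc123")
os.makedirs(os.path.dirname(store_file))
with open(store_file, "w") as f:
    f.write("module.exports = 'package-a';\n")

# Layer 2: the project's flat .pnpm directory; its files are hard links into the store.
project = os.path.join(root, "project")
real_pkg = os.path.join(
    project, "node_modules", ".pnpm", "package-a@1.0.0", "node_modules", "package-a"
)
os.makedirs(real_pkg)
os.link(store_file, os.path.join(real_pkg, "index.js"))

# Layer 1: the entry that the runtime looks up; a soft link to the layer-2 directory.
top_level = os.path.join(project, "node_modules", "package-a")
os.symlink(real_pkg, top_level)

# Follow the chain: soft link -> .pnpm directory -> same inode as the store file.
resolved_dir = os.path.realpath(top_level)
same_content = os.path.samefile(os.path.join(top_level, "index.js"), store_file)
link_count = os.stat(store_file).st_nlink
print(resolved_dir)
print(same_content, link_count)
```

A second project that hard-links the same store file would raise the link count again without storing the content a second time.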

Phantom dependency

A phantom dependency is a package that project code references without declaring it directly in package.json; it is present only because some other package installed it as a sub-dependency. The biggest hidden danger of relying on phantom dependencies is that semantic versioning cannot reach through to sub-packages: a patch-level change to package a may bring a major-level breaking change to its sub-dependency b.

With the three-layer addressing design, the first layer contains only the packages declared in package.json, so packages not declared there cannot be found through node_modules. This naturally solves the phantom dependency problem.

There is, however, a harder phantom dependency problem: when a package is installed in the root directory of a Monorepo project, code in a sub-package may still be able to resolve it. Solving this completely requires using pnpm together with Rush, which eliminates the problem at the engineering level through dependency-problem detection.
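
The difference can be seen with a small simulation. The Python sketch below builds a flattened layout and a pnpm-style layout in a temporary directory and runs a simplified upward lookup against both. The resolve helper, the package names a and b, and the stop_at boundary are all hypothetical; real Node.js resolution has more rules than this.

```python
import os
import tempfile
from typing import Optional


def resolve(name: str, from_dir: str, stop_at: str) -> Optional[str]:
    """Simplified upward lookup of node_modules/<name>, bounded by stop_at."""
    current = os.path.realpath(from_dir)  # soft links are resolved before walking up
    boundary = os.path.realpath(stop_at)
    while True:
        candidate = os.path.join(current, "node_modules", name)
        if os.path.isdir(candidate):
            return os.path.realpath(candidate)
        if current == boundary or current == os.path.dirname(current):
            return None
        current = os.path.dirname(current)


root = tempfile.mkdtemp()

# Flattened layout: the project declares only "a", but its sub-dependency "b"
# is hoisted next to it.
flat = os.path.join(root, "flat")
os.makedirs(os.path.join(flat, "node_modules", "a"))
os.makedirs(os.path.join(flat, "node_modules", "b"))

# pnpm-style layout: only "a" is linked at the top level; "b" is reachable
# only from inside a's own .pnpm entry.
strict = os.path.join(root, "strict")
a_real = os.path.join(strict, "node_modules", ".pnpm", "a@1.0.0", "node_modules", "a")
b_real = os.path.join(strict, "node_modules", ".pnpm", "b@2.0.0", "node_modules", "b")
os.makedirs(a_real)
os.makedirs(b_real)
os.symlink(b_real, os.path.join(os.path.dirname(a_real), "b"))
os.symlink(a_real, os.path.join(strict, "node_modules", "a"))

phantom_in_flat = resolve("b", flat, flat)
phantom_in_strict = resolve("b", strict, strict)
b_seen_from_a = resolve("b", os.path.join(strict, "node_modules", "a"), strict)
print(phantom_in_flat is not None, phantom_in_strict, b_seen_from_a is not None)
```

In the flattened layout the project code can reach b even though it never declared it; in the pnpm-style layout the same lookup finds nothing, while package a can still reach its own dependency.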

Installation rules for peer dependencies

pnpm applies a strict set of installation rules to peer dependencies. A package that declares peer dependencies is sensitive to what those peers resolve to; the subtext is that, with different peer dependencies, the package may behave differently. So pnpm may create multiple copies of the same package, one for each distinct peer-dependency environment.

For example, suppose package foo declares peer dependencies on bar@^1.0.0 and baz@^1.0.0. What happens if the two packages of a Monorepo install different versions of these dependencies?

 - foo-parent-1
  - bar@1.0.0
  - baz@1.0.0
  - foo@1.0.0
- foo-parent-2
  - bar@1.0.0
  - baz@1.1.0
  - foo@1.0.0

The result is this (citing the official website document example):

 node_modules
└── .pnpm
    ├── foo@1.0.0_bar@1.0.0+baz@1.0.0
    │   └── node_modules
    │       ├── foo
    │       ├── bar   -> ../../bar@1.0.0/node_modules/bar
    │       ├── baz   -> ../../baz@1.0.0/node_modules/baz
    │       ├── qux   -> ../../qux@1.0.0/node_modules/qux
    │       └── plugh -> ../../plugh@1.0.0/node_modules/plugh
    ├── foo@1.0.0_bar@1.0.0+baz@1.1.0
    │   └── node_modules
    │       ├── foo
    │       ├── bar   -> ../../bar@1.0.0/node_modules/bar
    │       ├── baz   -> ../../baz@1.1.0/node_modules/baz
    │       ├── qux   -> ../../qux@1.0.0/node_modules/qux
    │       └── plugh -> ../../plugh@1.0.0/node_modules/plugh
    ├── bar@1.0.0
    ├── baz@1.0.0
    ├── baz@1.1.0
    ├── qux@1.0.0
    ├── plugh@1.0.0

As you can see, foo is installed twice at the same version. The contents are exactly the same, but the two copies have different names: foo@1.0.0_bar@1.0.0+baz@1.0.0 and foo@1.0.0_bar@1.0.0+baz@1.1.0. This is another manifestation of pnpm's strict rules. A package should not have global side effects, or should at least implement its singleton carefully, because pnpm may install it more than once.
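
The naming in the listing above can be imitated with a few lines of Python. This is only a sketch of the idea that the resolved peer versions become part of a package instance's identity; instance_dir_name is a hypothetical helper, not pnpm's code, and the exact naming scheme may differ between pnpm versions.

```python
from typing import Dict


def instance_dir_name(name: str, version: str, resolved_peers: Dict[str, str]) -> str:
    """Build a directory name like the ones in the listing above (hypothetical helper)."""
    base = f"{name}@{version}"
    if not resolved_peers:
        return base  # no peers: one instance per version is enough
    # Sort the peers so the same peer set always yields the same name.
    suffix = "+".join(f"{peer}@{ver}" for peer, ver in sorted(resolved_peers.items()))
    return f"{base}_{suffix}"


in_parent_1 = instance_dir_name("foo", "1.0.0", {"bar": "1.0.0", "baz": "1.0.0"})
in_parent_2 = instance_dir_name("foo", "1.0.0", {"bar": "1.0.0", "baz": "1.1.0"})
print(in_parent_1)
print(in_parent_2)
```

Because the two parents resolve baz to different versions, the two names differ, and so foo ends up with two separate directories even though its own version is identical.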

The principle of hard links and soft links

To understand how pnpm uses soft and hard links, we first need to review how the operating system's file system implements them.

A hard link is created with ln originFilePath newFilePath, for example ln ./my.txt ./hard.txt. The resulting hard.txt and my.txt point to the same file storage address, so no matter which of the two is modified, the contents of both change at the same time, because the content at the original address is modified directly. Furthermore, all N files created through hard links are equivalent. Viewing the file attributes with ls -li ./ shows that the two hard-linked files have the same inode number:

 ls -li ./
84976912 -rw-r--r-- 2 author staff 489 Jun 9 15:41 my.txt
84976912 -rw-r--r-- 2 author staff 489 Jun 9 15:41 hard.txt

The third column, 2, indicates that the storage address the file points to has two hard link references. Hard links to directories are more troublesome: the first problem is that the parent directory of the files becomes ambiguous, and in addition every child file would have to be hard-linked as well, which is complicated to implement. Linux therefore does not provide this capability.

A soft link is created with ln -s originFilePath newFilePath. It can be thought of as a pointer to a file's address pointer: it has a new inode number of its own, but its content is only the path of the file it points to, for example:

 84976913 lrwxr-xr-x 1 author staff 6 Jun 9 15:41 soft.txt -> my.txt

When the source file is deleted, a soft link becomes invalid but a hard link does not, and a soft link can also point to a folder. So although pnpm combines soft and hard links to achieve code reuse, a soft link itself takes up very little extra storage, and a hard link takes up no extra disk space for file content at all. For the same package, the additional storage that pnpm uses is therefore approximately zero.
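
The behaviour described above can be checked with Python's standard os module, which exposes the same operations as ln and ln -s. This is a minimal sketch run in a temporary directory; the file names follow the examples above.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "my.txt")
hard = os.path.join(workdir, "hard.txt")
soft = os.path.join(workdir, "soft.txt")

with open(original, "w") as f:
    f.write("hello")

os.link(original, hard)     # like: ln ./my.txt ./hard.txt
os.symlink("my.txt", soft)  # like: ln -s my.txt ./soft.txt (relative target)

same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
link_count = os.stat(original).st_nlink
# lstat inspects the link itself instead of following it.
soft_has_own_inode = os.lstat(soft).st_ino != os.stat(original).st_ino
soft_target = os.readlink(soft)

# Writing through the hard link changes what the original name sees.
with open(hard, "a") as f:
    f.write(" world")
with open(original) as f:
    content_via_original = f.read()

# Remove the original name: the hard link still works, the soft link dangles.
os.remove(original)
hard_still_readable = os.path.exists(hard)
soft_still_resolves = os.path.exists(soft)  # follows the link, so False when dangling
print(same_inode, link_count, soft_target, content_via_original)
print(hard_still_readable, soft_still_resolves)
```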

How the global installation directory pnpm-store is organized

pnpm uses hard links for the third layer of addressing, but one question remains: the target of a hard link is not an ordinary npm package source tree but a file named by a hash. This way of organizing files is called content-addressable storage (addressing based on content).

Put simply, the advantage of content-based addressing over filename-based addressing is that even when a package is upgraded, only the files that changed (the diff) need to be stored rather than the complete contents of the new version, which further reduces the storage needed to keep multiple versions.

The pnpm-store is organized like this:

 ~/.pnpm-store
- v3
  - files
    - 00
      - e4e13870602ad2922bfc7..
      - e99f6ffa679b846dfcbb1..
      ..
    - 01
      ..
    - ..
      ..
    - ff
      ..

In other words, files are addressed by their content rather than by their location. This works because the content of an npm package never changes once it is published, and fixed content is exactly the scenario content addressing suits. Content addressing also ignores the structure of the package: after a new package is downloaded and unpacked, any file whose hash already exists in the store can be discarded and only files whose hashes are new are stored. This naturally delivers what was mentioned at the beginning: for different versions of the same package, pnpm stores only the incremental changes.
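
A toy content-addressable store shows why upgrades cost so little. The Python sketch below is an illustration under stated assumptions, not pnpm's implementation: add_package is a hypothetical helper, SHA-256 is chosen here only for convenience (the article does not say which hash pnpm uses), and the two package versions are invented.

```python
import hashlib
import os
import tempfile
from typing import Dict


def add_package(store_dir: str, files: Dict[str, bytes]) -> int:
    """Store each file under its content hash; return how many new files were written."""
    written = 0
    for content in files.values():
        digest = hashlib.sha256(content).hexdigest()
        # The first two hex characters pick the bucket directory (00 .. ff).
        target = os.path.join(store_dir, digest[:2], digest[2:])
        if os.path.exists(target):
            continue  # identical content is already stored, whatever package it came from
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as f:
            f.write(content)
        written += 1
    return written


store = os.path.join(tempfile.mkdtemp(), "files")

# Two made-up versions of one package; only index.js differs between them.
v1 = {"index.js": b"exports.v = 1;\n", "README.md": b"# demo\n", "LICENSE": b"MIT\n"}
v2 = {"index.js": b"exports.v = 2;\n", "README.md": b"# demo\n", "LICENSE": b"MIT\n"}

written_v1 = add_package(store, v1)
written_v2 = add_package(store, v2)
stored_total = sum(len(names) for _, _, names in os.walk(store))
print(written_v1, written_v2, stored_total)
```

The second version adds a single file to the store, because the unchanged README.md and LICENSE hash to content that is already present. File names play no part in the store; a separate index would be needed to map each package's paths to hashes.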

Summary

pnpm's three-layer addressing fits the default node_modules lookup, solves the problem of installing duplicate files, and solves phantom dependencies along the way. It can fairly be called the best innovation in package management so far, bar none.

However, its strict package management logic means that when we use pnpm alone to manage a large Monorepo, it is easy to run into situations that are logical but awkward. For example, if the packages end up referencing different versions of the same dependency, peer dependencies may cause multiple instances of those dependencies to be created, and such version divergence may well be accidental. We may need Monorepo management tools such as Rush to keep versions consistent.

The discussion address is: Intensive Reading "pnpm" Issue #435 dt-fe/weekly

Copyright notice: Free to reprint - non-commercial - non-derivative - keep attribution ( Creative Commons 3.0 license )

黄子毅