
Preface

The author's team currently manages all of its business projects in a Monorepo. As the number of projects has grown, stability and the development experience have come under pressure and many problems have surfaced. It is clear that the existing infrastructure can no longer support the increasingly large set of business projects in the Monorepo.

The existing Monorepo is built on yarn workspace, which links the packages in the repository so that code can be reused across projects. Yarn was also chosen as the package manager as a matter of course. Lerna is a dependency as well, but it is rarely used because publishing packages is a relatively rare scenario.

The setup can be summarized in the following three points:

  • Link the packages in the repository through yarn workspace
  • Use yarn as the package manager to manage dependencies in the projects
  • Use lerna to build the dependent packages in dependency order before an app is built

The problem

Commands are not uniform

There are three kinds of commands

  1. yarn
  2. yarn workspace
  3. lerna

Newcomers are easily confused when getting started, and some of the commands overlap in functionality.

Slow release

monorepo1

If we need to release app1, the following happens:

  1. All dependencies are installed: those of app1, app2 and app3, as well as those of package1 through package6;
  2. All packages are built, not just package1 and package2 that app1 depends on.

Phantom dependencies

A library that uses a package it does not declare in its own dependencies is said to have phantom dependencies (also called ghost dependencies or implicit dependencies). In the existing Monorepo architecture this problem is magnified by dependency hoisting.

monorepo-2

Because the version of a phantom dependency cannot be guaranteed, it introduces uncontrollable risk at runtime. Suppose app depends on lib-a, and lib-a depends on lib-x. Because of dependency hoisting, app can import lib-x directly. This is unreliable: whether app can import lib-x at all, and which version of lib-x it gets, depends entirely on the developers of lib-a.
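As a rough illustration of the idea, the sketch below flags imports that are not declared in a package's own manifest. The helper names (`packageNameOf`, `findPhantomDependencies`) and the sample manifest are hypothetical, and the check is deliberately simplified: it does not know about Node built-ins imported without the `node:` prefix.

```javascript
// Hypothetical helper: map an import specifier to the package that must be declared.
// "lib-x/utils" -> "lib-x", "@scope/pkg/sub" -> "@scope/pkg"
function packageNameOf(specifier) {
  const parts = specifier.split("/");
  return specifier.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0];
}

// Return the imported packages that the manifest does not declare anywhere.
function findPhantomDependencies(manifest, importedSpecifiers) {
  const declared = new Set([
    ...Object.keys(manifest.dependencies || {}),
    ...Object.keys(manifest.devDependencies || {}),
    ...Object.keys(manifest.peerDependencies || {}),
  ]);
  const phantom = new Set();
  for (const specifier of importedSpecifiers) {
    // Relative paths, absolute paths and node: built-ins are not package dependencies.
    if (/^(\.|\/|node:)/.test(specifier)) continue;
    const name = packageNameOf(specifier);
    if (!declared.has(name)) phantom.add(name);
  }
  return [...phantom].sort();
}

// Assumed example: app declares lib-a only, but also imports lib-x.
const appManifest = { name: "app", dependencies: { "lib-a": "^1.0.0" } };
const phantoms = findPhantomDependencies(appManifest, ["lib-a", "lib-x/utils", "./local"]);
console.log(phantoms);
```

In this assumed setup only lib-x is reported, because it is reachable through hoisting but never declared by app.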

NPM doppelgangers

Multiple copies of the same version of a package may be installed, and multiple copies may end up in the bundle.

Suppose the following dependencies exist

monorepo-3

There are two possible outcomes for the final installation:

  1. lib-x@^1 * 1 copy, lib-x@^2 * 2 copies
  2. lib-x@^2 * 1 copy, lib-x@^1 * 2 copies

Either way, 3 copies of lib-x are installed locally, and the bundle contains three instances as well. If lib-x must be a singleton, this may cause problems.

Yarn duplicate

Yarn duplicate and solution

Suppose the following dependencies exist

monorepo-4

When (p)npm installs a module that is already installed, it checks whether the installed version satisfies the version range required by the new dependent. If it does, the installation is skipped; if it does not, the module is installed under the node_modules of the dependent module. That is, lib-a will reuse the lib-b@1.1.0 that app depends on.

With Yarn v1 as the package manager, however, lib-a gets its own separate copy of lib-b@1.2.0.

peerDependencies risk

Yarn's dependency hoisting may cause bugs in the peerDependencies scenario.

  1. app1 depends on A@1.0.0
  2. app2 depends on B@2.0.0
  3. B@2.0.0 declares A@2.0.0 as a peerDependency, so app2 should also install A@2.0.0

If app2 forgets to install A@2.0.0, the structure is as follows:

--apps
    --app1
    --app2
--node_modules
    --A@1.0.0
    --B@2.0.0

In this case, B@2.0.0 will incorrectly resolve to A@1.0.0.
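The lookup behind this can be sketched with a simplified, in-memory version of Node's "walk up the directory tree" module resolution. The paths and the `resolveFrom` helper are hypothetical stand-ins for the hoisted layout above, not Node's actual implementation.

```javascript
// In-memory stand-in for the hoisted layout: install path -> installed version.
const installed = {
  "/repo/node_modules/A": "1.0.0",
  "/repo/node_modules/B": "2.0.0",
};

// Simplified lookup: try node_modules in the importing directory,
// then in each parent directory until the filesystem root is reached.
function resolveFrom(importerDir, name) {
  let dir = importerDir;
  while (true) {
    const candidate = (dir === "/" ? "" : dir) + "/node_modules/" + name;
    if (candidate in installed) {
      return { path: candidate, version: installed[candidate] };
    }
    if (dir === "/") return null;
    dir = dir.slice(0, dir.lastIndexOf("/")) || "/";
  }
}

// B asks for its peer dependency A and silently gets the hoisted 1.0.0.
const resolvedPeer = resolveFrom("/repo/node_modules/B", "A");
console.log(resolvedPeer);
```

Nothing in the lookup checks the peer range declared by B, which is why the wrong version goes unnoticed until runtime.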

Package reference specification is missing

There are currently three ways of referencing a package in the project:

  1. Source code reference by package name. The build script of the host project must be configured so that the package is included in its build process. This is similar to publishing a TypeScript source package directly: projects that reference the package need to adapt to it.
  2. Source code reference by file path. This can be understood as "source files of the host project that happen to live outside its own src" rather than as a package. The host has to provide all the dependencies, and cross-project reuse only works thanks to Yarn's dependency hoisting, which carries considerable risk.
  3. Product reference. After the package has been built, its build output is referenced directly by package name.

Package reference version uncertainty

Assuming that package1 in the Monorepo has been published to the npm registry, which version should app1 in the Monorepo write in its package.json to reference package1?

package1/package.json

{
  "name": "package1",
  "version": "1.0.0"
}

app1/package.json

{
  "name": "app1",
  "dependencies": {
    "package1": "?" // What version should go here? `*` or `1.0.0`
  }
}

When handling references between projects in the Monorepo, Yarn makes the following decisions:

  1. Check whether the current Monorepo contains a package1 whose version satisfies the range required by app1;
  2. If it does, perform the link operation, and app1 uses the local package1 directly;
  3. If it does not, pull a matching version of package1 from the remote npm registry for app1 to use.

Note in particular that * does not match prerelease versions 👉 Workspace package with prerelease version and wildcard dep version #6719.

Suppose the following scenario:

  1. package1 has previously published version 1.0.0; at this point the remote registry is consistent with the code in the local Monorepo;
  2. The product team raises a requirement that only serves applications inside the Monorepo;
  3. package1 is iterated on while staying at version 1.0.0, since there is no need to bump the version number and publish;
  4. Yarn decides that the version of package1 in the Monorepo satisfies the range required by app1 ( * or 1.0.0 );
  5. app1 happily uses the latest features of package1.

Until one day, this feature needs to be provided to external business teams as well.

  1. package1 changes its version to 1.0.0-beta.0 and publishes it;
  2. Yarn decides that the version of package1 in the Monorepo no longer satisfies the range required by app1;
  3. Yarn pulls package1@1.0.0 from the remote registry for app1 to use;
  4. The remote package1@1.0.0 is far behind the local package1@1.0.0 that app1 was using before;
  5. Time to prepare the incident report and postmortem.
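The two scenarios above can be sketched as a small decision function. This is a toy model under stated assumptions, not Yarn's implementation: `satisfies` only understands `*` and exact versions, and all function names are hypothetical.

```javascript
// A version with a "-" suffix such as 1.0.0-beta.0 is a prerelease.
function isPrerelease(version) {
  return version.includes("-");
}

// Toy range check: "*" matches any non-prerelease version,
// anything else is treated as an exact version.
function satisfies(localVersion, range) {
  if (range === "*") return !isPrerelease(localVersion);
  return localVersion === range;
}

// Decide whether the workspace copy is linked or the registry copy is fetched.
function resolveWorkspaceDep(localVersion, range) {
  return satisfies(localVersion, range) ? "link-local" : "fetch-remote";
}

// Before the bump: the local package1@1.0.0 is linked.
console.log(resolveWorkspaceDep("1.0.0", "*"));
// After the bump to 1.0.0-beta.0: neither "*" nor "1.0.0" matches any more.
console.log(resolveWorkspaceDep("1.0.0-beta.0", "*"));
console.log(resolveWorkspaceDep("1.0.0-beta.0", "1.0.0"));
```

The same dependency declaration therefore flips from local to remote purely because of a version bump in another project.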

This uncertainty leads to frequent grumbling when referencing such packages: Am I referencing the local version or the remote one? Why is it sometimes local and sometimes remote? If I want the latest content of package1, I have to keep my version range in sync with package1's version at all times, so why use a Monorepo at all?

yarn.lock conflict

(p)npm supports automatic resolution of lockfile conflicts, whereas with Yarn they have to be handled manually. In a large Monorepo, almost every branch merge runs into yarn.lock conflicts.

  • If the conflict is left unresolved and yarn is simply run blindly, yarn.lock is effectively invalidated and every dependency is updated to the latest version allowed by package.json, which is too risky and defeats the purpose of a lockfile
  • Resolving the conflict manually often runs into "Git conflict with binary files", leaving no choice but to take the yarn.lock committed on master and run yarn again, which is a cumbersome process.

Automatically resolve conflicts in lockfile · Issue #2036 · pnpm/pnpm

As the above shows, the existing way of managing the Monorepo has too many defects. As the number of projects keeps growing, builds get slower and slower and the robustness of the programs cannot be guaranteed. Relying on developers' self-discipline alone is unreliable; we need a solution.

Recommended reading: node_modules dilemma

Solution

pnpm

Fast, disk space efficient package manager

Before npm@3, the structure of node_modules was clean and predictable, because each dependency in node_modules had its own node_modules folder, and all its dependencies were specified in package.json.

node_modules
└─ foo
   ├─ index.js
   ├─ package.json
   └─ node_modules
      └─ bar
         ├─ index.js
         └─ package.json

But this brings two serious problems:

  1. If the dependency tree is too deep, path lengths cause problems on Windows;
  2. When the same package is a dependency of several other packages, it is copied many times.

To solve these two problems, npm@3 rethought the structure of node_modules and introduced a flattened layout, which gives the familiar structure below.

node_modules
├─ foo
|  ├─ index.js
|  └─ package.json
└─ bar
   ├─ index.js
   └─ package.json

Unlike npm@3, pnpm solves the problems of npm@2 in a different way, without flattening node_modules.

In the node_modules folder created by pnpm, every package is grouped together with its own dependencies (isolated), yet the directory hierarchy never gets too deep because symlinks point to the real location elsewhere.

-> - a symlink (or junction on Windows)

node_modules
├─ foo -> .registry.npmjs.org/foo/1.0.0/node_modules/foo
└─ .registry.npmjs.org
   ├─ foo/1.0.0/node_modules
   |  ├─ bar -> ../../bar/2.0.0/node_modules/bar
   |  └─ foo
   |     ├─ index.js
   |     └─ package.json
   └─ bar/2.0.0/node_modules
      └─ bar
         ├─ index.js
         └─ package.json

  1. The non-flat node_modules structure solves phantom dependencies: a package can only reach its own dependencies.
  2. Packages of the same version are reused through symlinks, which avoids bundling the same version more than once, solves NPM doppelgangers, and saves disk space along the way.
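To make the isolation concrete, the following sketch recreates the layout shown above in a temporary directory and checks what Node can resolve from where. It is only a hand-built imitation of the directory tree in this article, not something pnpm itself runs; the `writePackage` helper and the temp directory prefix are made up for the example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const { createRequire } = require("module");

// Build the layout in a throwaway directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "pnpm-layout-"));
const store = path.join(root, "node_modules", ".registry.npmjs.org");

// Hypothetical helper: write a minimal package with a single entry file.
function writePackage(dir, name, version, source) {
  fs.mkdirSync(dir, { recursive: true });
  const manifest = { name, version, main: "index.js" };
  fs.writeFileSync(path.join(dir, "package.json"), JSON.stringify(manifest));
  fs.writeFileSync(path.join(dir, "index.js"), source);
}

const fooDir = path.join(store, "foo/1.0.0/node_modules/foo");
const barDir = path.join(store, "bar/2.0.0/node_modules/bar");
writePackage(barDir, "bar", "2.0.0", "module.exports = 'bar';");
writePackage(fooDir, "foo", "1.0.0", "module.exports = 'foo uses ' + require('bar');");

// bar is linked next to foo inside the store; only foo is linked at the top level.
// The "junction" type only matters on Windows and is ignored elsewhere.
fs.symlinkSync(barDir, path.join(store, "foo/1.0.0/node_modules/bar"), "junction");
fs.symlinkSync(fooDir, path.join(root, "node_modules/foo"), "junction");

// Resolve modules as if from a file in the project root.
const appRequire = createRequire(path.join(root, "index.js"));
const fooResult = appRequire("foo");

let barVisibleFromApp = true;
try {
  appRequire("bar");
} catch (err) {
  barVisibleFromApp = false; // bar was never declared by the app, so it is unreachable
}

console.log(fooResult, barVisibleFromApp);
fs.rmSync(root, { recursive: true, force: true });
```

foo can load bar because Node resolves from foo's real location inside the store, while the project root has no bar entry to fall back on.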

It can be found that many problems related to the package manager can be easily solved.

Rush

a scalable monorepo manager for the web
  1. Command uniformity.

Everything goes through rush(x) xxx, which lowers the onboarding cost for newcomers. rush add and rushx xxx must be run inside a specific project; all other commands are global and can be executed from any directory inside the repository, which avoids constantly switching project paths in the terminal.

monorepo-5

  2. Strong dependency analysis capabilities.

Many commands in Rush support the analysis of dependencies, such as the -t (to) parameter:

$ rush install -t @monorepo/app1

This command only installs the dependencies of app1 and of the packages that app1 depends on; in other words, dependencies are installed on demand.

$ rush build -t @monorepo/app1

This command will execute the build scripts of app1 and the packages that app1 depends on.

Similarly, there is the -f (from) parameter, which makes the command apply only to the given package and the packages that depend on it.
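The selection described for -t and -f can be pictured as two traversals of the project dependency graph. The sketch below follows the description in this article rather than Rush's actual implementation; the graph contents and the names `selectTo` and `selectFrom` are assumptions made for illustration.

```javascript
// Hypothetical project graph: project -> local projects it depends on.
const graph = {
  app1: ["package1", "package2"],
  app2: ["package2", "package3"],
  package1: ["package2"],
  package2: [],
  package3: [],
};

// "-t target": the target plus everything it depends on, transitively.
function selectTo(projectGraph, target) {
  const selected = new Set();
  const visit = (name) => {
    if (selected.has(name)) return;
    selected.add(name);
    (projectGraph[name] || []).forEach(visit);
  };
  visit(target);
  return [...selected].sort();
}

// "-f target": the target plus everything that depends on it, transitively.
function selectFrom(projectGraph, target) {
  const selected = new Set([target]);
  let changed = true;
  while (changed) {
    changed = false;
    for (const [name, deps] of Object.entries(projectGraph)) {
      if (!selected.has(name) && deps.some((dep) => selected.has(dep))) {
        selected.add(name);
        changed = true;
      }
    }
  }
  return [...selected].sort();
}

console.log(selectTo(graph, "app1"));
console.log(selectFrom(graph, "package2"));
```

With this assumed graph, selecting to app1 leaves out app2 and package3 entirely, which is where the time savings for install and build come from.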

  3. Ensure dependency version consistency

Projects in the Monorepo should keep their dependency versions as consistent as possible; otherwise duplicate bundling and other problems are likely to occur.

Rush provides several capabilities to ensure this, such as rush check, rush add -p package-name -m and ensureConsistentVersions.
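The kind of check involved can be sketched as follows. This is a minimal illustration of detecting inconsistent version ranges across projects, not how rush check is implemented; the manifests and the `findVersionMismatches` name are hypothetical.

```javascript
// Hypothetical manifests standing in for the package.json files of Monorepo projects.
const manifests = [
  { name: "app1", dependencies: { react: "^17.0.2", "lib-a": "^1.1.0" } },
  { name: "package1", dependencies: { react: "^17.0.2", "lib-a": "^1.2.0" } },
];

// Group the projects by the range they declare for each dependency and
// report every dependency that is declared with more than one range.
function findVersionMismatches(projects) {
  const rangesByDependency = new Map();
  for (const project of projects) {
    const all = { ...project.dependencies, ...project.devDependencies };
    for (const [dependency, range] of Object.entries(all)) {
      if (!rangesByDependency.has(dependency)) {
        rangesByDependency.set(dependency, new Map());
      }
      const users = rangesByDependency.get(dependency);
      if (!users.has(range)) users.set(range, []);
      users.get(range).push(project.name);
    }
  }
  const mismatches = {};
  for (const [dependency, users] of rangesByDependency) {
    if (users.size > 1) mismatches[dependency] = Object.fromEntries(users);
  }
  return mismatches;
}

const mismatches = findVersionMismatches(manifests);
console.log(JSON.stringify(mismatches, null, 2));
```

Here react is consistent and is not reported, while lib-a shows up with the projects behind each of its two ranges.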

Interested readers can consult Rush's official documentation, which is very detailed and covers some common problems.

Package reference specification

monorepo-12

Product reference

This is the traditional way of referencing: after the build completes, the app references the build output of the package directly. During development, the output can be kept up to date through the build tool's watch capability (such as tsc --watch).

  • Advantages: standard and friendly to the app.
  • Disadvantages: as the number of modules grows, hot updates of packages may become unbearably slow.

Source code reference

The main field in package.json points to the entry file of the source code, and the app that references the package has to include the package in its own compilation.

  • Advantages: the package rides on the app's hot update capability and has no build step of its own, so hot updates are fast.
  • Disadvantages: the app has to adapt to it, and configuring aliases is cumbersome.

Reference norm

  1. Packages that are only used inside the Monorepo are called features and must not be published externally. Set their main field directly to the source entry and configure the app's webpack so that they are compiled together with the app.
  2. Packages that need to be published externally must not reference features and must have a build process. If source-based development is needed to speed up hot updates, add a custom entry field and have the app's webpack configuration resolve that field first.

Supplement: the rush build command supports a build cache. If apps are split finely enough, there are enough reusable packages, and the build image supports storing and retrieving the build cache, apps can be built incrementally.
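One way the custom entry field from point 2 might be wired up is through webpack's resolve.mainFields option. The fragment below is a sketch only: the field name "source" is an assumption, as is the idea that packages declare it next to main in their package.json.

```javascript
// webpack.config.js of the app (fragment).
// "source" is a hypothetical custom field; a package would declare e.g.
// { "main": "dist/index.js", "source": "src/index.ts" } in its package.json.
const webpackConfig = {
  resolve: {
    // Fields are tried in order, so "source" wins whenever a package defines it.
    mainFields: ["source", "browser", "module", "main"],
  },
};

module.exports = webpackConfig;
```

The app still needs loader rules that compile the resolved source files; packages without the custom field keep resolving through their regular entry.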

Workspace protocol (workspace:)

Rush was born before PNPM and Yarn supported workspaces. Rush's approach is to install all packages centrally in the common/temp folder and then create symbolic links from each project into common/temp, which is essentially equivalent to a PNPM workspace.

Once PNPM workspace support is enabled, the workspace: protocol can be used. It guarantees a deterministic reference version: a package referenced through this protocol only ever uses the content inside the Monorepo.

{
  "dependencies": {
    "foo": "workspace:*",
    "bar": "workspace:~",
    "qar": "workspace:^",
    "zoo": "workspace:^1.5.0"
  }
}

It is recommended to use this protocol uniformly when referencing packages inside the Monorepo, so that the latest local content is always used and changes propagate to other projects in time, which is exactly the advantage of a Monorepo.
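For orientation, pnpm's documentation describes replacing workspace: specifiers with regular ranges when a package is packed or published. The sketch below mimics that mapping for the four forms shown above; `publishedRange` is a hypothetical name and the function is an illustration, not pnpm's code.

```javascript
// Map a workspace: specifier to the range that would appear in the published manifest,
// given the current version of the referenced workspace package.
function publishedRange(spec, localVersion) {
  if (!spec.startsWith("workspace:")) return spec; // ordinary ranges are left untouched
  const range = spec.slice("workspace:".length);
  if (range === "*") return localVersion; // exact current version
  if (range === "~" || range === "^") return range + localVersion;
  return range; // e.g. workspace:^1.5.0 -> ^1.5.0
}

// Assume every referenced workspace package is currently at 1.5.0.
const specs = ["workspace:*", "workspace:~", "workspace:^", "workspace:^1.5.0"];
const published = specs.map((spec) => publishedRange(spec, "1.5.0"));
console.log(published);
```

Inside the Monorepo the local copy is always linked; the rewritten range only matters to consumers of the published package.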

If you really must use the remote version, configure cyclicDependencyProjects for the specific project in rush.json; see rush.json.

Fortunately, workspace:* in a PNPM workspace does match prerelease versions 👉 Local prerelease version of packages should be linked only if the range is *

Problem record

Monorepo Project Dependencies Duplicate

This problem is similar to the Yarn duplicate mentioned earlier, but it is not unique to Yarn.

Suppose the following dependencies exist (the Yarn duplicate example, adapted to the Monorepo scenario)

app1 and package1 are both projects inside the Monorepo.

monorepo-8

In a Rush (pnpm) or Yarn project, installation strictly follows the versions declared in the package.json of each project in the Monorepo; that is, app1 installs lib-a@1.1.0 and package1 installs lib-a@1.2.0.

If app1 is then bundled, both lib-a@1.1.0 and lib-a@1.2.0 end up in the bundle.

This result may be surprising at first, but on reflection it is natural.

Put another way, the entire Monorepo is one large virtual project, and all of our projects are direct dependencies of this virtual project.

{
  "name": "fake-project",
  "version": "1.0.0",
  "dependencies": {
    "@fake-project/app1": "1.0.0",
    "@fake-project/package1": "1.0.0"
  }
}

When installing dependencies, (p)npm downloads direct dependencies first and indirect dependencies afterwards. When it encounters a module that is already installed, it checks whether the installed version (a direct dependency) satisfies the version range required by the new (indirect) dependency; if so it is skipped, otherwise the module is installed under the node_modules of the dependent module.

lib-a, a direct dependency of app1 and package1, is only an indirect dependency of fake-project, so the condition above never applies and each copy is installed according to the version declared in the corresponding package.json.

Solution: Rush: Preferred versions

Rush can avoid duplicating two compatible versions through a manually specified preferredVersions. Here, the version of lib-a in preferredVersions is set to 1.2.0, which is equivalent to installing that version of the module directly under the virtual project as a direct dependency.

{
  "name": "fake-project",
  "version": "1.0.0",
  "dependencies": {
    "@fake-project/app1": "1.0.0",
    "@fake-project/package1": "1.0.0",
    "lib-a": "1.2.0"
  }
}
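Whether a preferred version actually removes the duplication depends on it satisfying every project's declared range. The sketch below assumes app1 declares ^1.1.0 and package1 declares ^1.2.0 for lib-a (the figure's exact ranges are not reproduced here, so these are assumptions), and uses a deliberately simplified caret check for plain x.y.z versions with a major version of 1 or higher.

```javascript
function parseVersion(version) {
  return version.split(".").map(Number);
}

// Simplified caret check: same major version, and not lower than the range's base version.
// Real semver has more rules (0.x versions, prereleases) that are ignored here.
function satisfiesCaret(version, range) {
  const [major, minor, patch] = parseVersion(version);
  const [baseMajor, baseMinor, basePatch] = parseVersion(range.replace(/^\^/, ""));
  if (major !== baseMajor) return false;
  if (minor !== baseMinor) return minor > baseMinor;
  return patch >= basePatch;
}

// Which projects could reuse the preferred version instead of installing their own copy?
function projectsReusing(preferredVersion, requiredRanges) {
  return Object.entries(requiredRanges)
    .filter(([, range]) => satisfiesCaret(preferredVersion, range))
    .map(([project]) => project);
}

// Assumed ranges declared for lib-a by the two projects.
const requiredRanges = { app1: "^1.1.0", package1: "^1.2.0" };

console.log(projectsReusing("1.2.0", requiredRanges));
console.log(projectsReusing("1.1.0", requiredRanges));
```

Under these assumptions only 1.2.0 is shared by both projects; preferring 1.1.0 would still leave package1 with its own copy.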

With Yarn, because of the Yarn duplicate problem, installing a specific version of lib-a in the root directory has no effect.
There are still two ways to deal with it:

  1. Use yarn-deduplicate to make targeted modifications to yarn.lock;
  2. Use the resolutions field. This is too coarse: unlike preferredVersions, it does not allow incompatible versions to coexist, so it is not recommended.

Keep in mind: when eliminating duplicate dependencies under Yarn, handle them one package at a time; better safe than sorry.

  1. For public libraries with side effects, it is best to keep the same version;
  2. For other public libraries that are small in size (or support on-demand loading) and have no side effects, repeated packaging is acceptable to a certain extent.

prettier

Since node_modules no longer exists in the root directory, each project would need to install prettier as a devDependency and write its own .prettierrc.js file.

In the spirit of laziness, a single .prettierrc.js (which does not rely on any third-party package) is created in the root directory and prettier is installed globally to solve this problem.

eslint

Let's look at a scenario first. If eslint-config-react-app is used in the project, then in addition to installing eslint-config-react-app itself, a series of peerDependencies plug-ins also need to be installed.

monorepo-10

monorepo-11

Why does eslint-config-react-app declare this series of plug-ins as peerDependencies rather than dependencies? After all, users should not have to care which plug-ins the preset configuration references.

For specific discussions, you can check the issue, which has related discussions: Support having plugins as dependencies in shareable config #3458 .

In short, it comes down to how eslint looks up plug-ins. If a required plug-in ends up in a non-root node_modules because hoisting failed (a multi-version conflict), problems may occur; by installing the peerDependencies themselves, users can make sure this problem does not happen.

Of course, some open source eslint presets do not require peerDependencies to be installed. These presets rely on the flat node_modules structure of yarn and npm, that is, on dependency hoisting: the installed packages are hoisted to the root node_modules, so they work normally. Even so, in a Yarn-based Monorepo, once the dependencies become complicated the plug-in may not be found, and the fact that it works at all is more of a happy coincidence.

In Rush there is no dependency hoisting (and hoisting is not necessarily reliable anyway), while installing the whole series of plug-ins is too cumbersome. This can be worked around with a patch.

git hooks

husky is used in the project to register the pre-commit and commit-msg hooks to verify the code style and commit information.

Obviously, in a Rush project the root directory has no node_modules, so husky cannot be used directly.

We can use rush init-autoinstaller to achieve the same effect. This section mainly follows the official documents Installing Git hooks and Enabling Prettier.

# Initialize an autoinstaller named rush-lint

$ rush init-autoinstaller --name rush-lint

$ cd common/autoinstallers/rush-lint

# Install the dependencies needed for linting

$ pnpm i @commitlint/cli @commitlint/config-conventional @microsoft/rush-lib eslint execa prettier lint-staged

# Update the pnpm-lock.yaml of rush-lint

$ rush update-autoinstaller --name rush-lint

Add commit-lint.js and commitlint.config.js to the rush-lint directory, with the following contents:

commit-lint.js

const path = require('path');
const fs = require('fs');
const execa = require('execa');

const gitPath = path.resolve(__dirname, '../../../.git');
const configPath = path.resolve(__dirname, './commitlint.config.js');
const commitlintBinPath = path.resolve(__dirname, './node_modules/.bin/commitlint');

if (!fs.existsSync(gitPath)) {
    console.error('no valid .git path');
    process.exit(1);
}

main();

async function main() {
    try {
        await execa('bash', [commitlintBinPath, '--config', configPath, '--cwd', path.dirname(gitPath), '--edit'], {
            stdio: 'inherit',
        });
    } catch (_e) {
        process.exit(1);
    }
}

commitlint.config.js

const rushLib = require("@microsoft/rush-lib");

const rushConfiguration = rushLib.RushConfiguration.loadFromDefaultLocation();

const packageNames = [];
const packageDirNames = [];

rushConfiguration.projects.forEach((project) => {
  packageNames.push(project.packageName);
  const temp = project.projectFolder.split("/");
  const dirName = temp[temp.length - 1];
  packageDirNames.push(dirName);
});
// Ensure that scope can only be all, a package name, or a package directory name
const allScope = ["all", ...packageDirNames, ...packageNames];

module.exports = {
  extends: ["@commitlint/config-conventional"],
  rules: {
    "scope-enum": [2, "always", allScope],
  },
};

Note: there is no need to add .prettierrc.js (it already exists in the root directory) or .eslintrc.js (each project already has one).

Add a .lintstagedrc file to the root directory:

.lintstagedrc

{
  "{apps,packages,features}/**/*.{js,jsx,ts,tsx}": [
    "eslint --fix --color",
    "prettier --write"
  ],
  "{apps,packages,features}/**/*.{css,less,md}": ["prettier --write"]
}

With the dependencies installed and the configuration written, the next step is to register the corresponding commands in rush.

Modify the commands field in common/config/rush/command-line.json.

{
  "commands": [
    {
      "name": "commitlint",
      "commandKind": "global",
      "summary": "Used by the commit-msg Git hook. This command invokes commitlint to lint commit message.",
      "autoinstallerName": "rush-lint",
      "shellCommand": "node common/autoinstallers/rush-lint/commit-lint.js"
    },
    {
      "name": "lint",
      "commandKind": "global",
      "summary": "Used by the pre-commit Git hook. This command invokes eslint to lint staged changes.",
      "autoinstallerName": "rush-lint",
      "shellCommand": "lint-staged"
    }
  ]
}

Finally, bind the rush commitlint and rush lint commands to the commit-msg and pre-commit hooks respectively.
Add the commit-msg and pre-commit scripts under the common/git-hooks directory:

commit-msg

#!/bin/sh

node common/scripts/install-run-rush.js commitlint || exit $? #++

pre-commit

#!/bin/sh

node common/scripts/install-run-rush.js lint || exit $? #++

With this, the requirement is met.

Avoid installing eslint and prettier globally

After the previous section, eslint and prettier are installed in the rush-lint directory, so there is no need to install them globally; configuring VSCode is enough.

{
  // ...
  "npm.packageManager": "pnpm",
  "eslint.packageManager": "pnpm",
  "eslint.nodePath": "common/autoinstallers/rush-lint/node_modules/eslint",
  "prettier.prettierPath": "common/autoinstallers/rush-lint/node_modules/prettier"
  // ...
}

Appendix

Common commands

yarn | rush(x) | detail
--- | --- | ---
yarn install | rush install | Install dependencies
yarn upgrade | rush update | rush update installs dependencies based on the lock file; rush update --full updates everything to the latest version that satisfies package.json
yarn add package-name | rush add -p package-name | yarn add installs with a ^ range by default, accepting minor version updates; rush add installs with a ~ range by default, accepting only patch updates; rush add can match the behavior of yarn add by adding the --caret parameter; rush add cannot install multiple packages at once
yarn add package-name --dev | rush add -p package-name --dev | -
yarn remove package-name | - | rush does not provide a remove command
- | rush build | rush build -t @monorepo/app1 builds only @monorepo/app1 and the packages it depends on; rush build -T @monorepo/app1 builds only the packages that @monorepo/app1 depends on, excluding itself
- | rush rebuild | Executes the build scripts of all projects by default
yarn xxx (custom script) | rushx xxx (custom script) | yarn xxx executes the xxx script (npm scripts) from package.json in the current directory, and so does rushx xxx; run rushx without arguments to list the scripts supported by the current project

Workflow

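# Pull the latest changes from git
$ git pull

# Update NPM dependencies
$ rush update

# Rebuild the projects that @monorepo/app1 depends on (excluding the app itself)
$ rush rebuild -T @monorepo/app1

# Enter the project directory
$ cd ./apps/app1

# Start the project
$ rushx start # or rushx dev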



海秋