NodeJS currently has two module systems: CommonJS (CJS for short) and ECMAScript modules (ESM for short). This article covers three topics:
- The internals of CommonJS
- The ESM module system on the NodeJS platform
- The differences between CommonJS and ESM, and how to interoperate between the two systems
First, let's talk about why a module system is needed at all.
Why have a modular system
A good language needs a module system, because it addresses basic needs that come up in software engineering:
- Splitting functionality into modules makes the code more organized and easier to understand, and lets us develop and test each submodule independently
- Functionality can be packaged once and then imported directly by other modules, which improves reusability
- Encapsulation: a module only needs to expose a simple, documented interface to the outside world, while its internal implementation stays hidden, which lowers the cost of understanding it
- Dependency management: a good module system lets developers easily build new modules on top of existing third-party modules. It also lets users import the modules they want, together with the modules further down the dependency chain
At the beginning, JavaScript did not have a good module system, and pages mainly pulled in different resources through multiple script tags. As systems grew more complex, the traditional script tag approach could no longer meet business needs, so the community began to define module systems such as AMD and UMD.
NodeJS is a server-side JavaScript runtime. Unlike a browser, it has no HTML script tags for importing files, and it relies entirely on JS files in the local file system. So NodeJS implemented a module system based on the CommonJS specification.
The ES2015 specification was released in 2015. From that point, JS had a formal standard for modules. The module system built according to this standard is called ESM, and it makes module management more consistent between the browser and the server.
CommonJS modules
CommonJS is built on two basic ideas:
- Users can import a module from the local file system through the require function
- A module publishes its functionality to the outside through the two special variables exports and module.exports
module loader
Here is a simple implementation of a module loader.
The first piece is the function that loads the content of a module. We wrap the module's code in a function so that it runs in a private scope and does not pollute the global environment, and then run that function with eval:
// Assumes fs (the built-in NodeJS file system module) is already in scope
function loadModule(filename, module, require) {
  // Wrap the module source in a function so its variables stay private
  const wrappedSrc = `
    (function (module, exports, require) {
      ${fs.readFileSync(filename, 'utf-8')}
    })(module, module.exports, require)
  `;
  eval(wrappedSrc);
}
In the code we read the module content with readFileSync. Generally speaking, the synchronous versions of file system APIs should be avoided, but here the synchronous call is deliberate: CommonJS loads modules synchronously so that, when several modules are imported, they are loaded in the expected dependency order.
Now we implement the require function:
function require(moduleName) {
  const id = require.resolve(moduleName);
  if (require.cache[id]) {
    return require.cache[id].exports;
  }
  // Module metadata
  const module = {
    exports: {},
    id,
  };
  require.cache[id] = module;
  loadModule(id, module, require);
  // Return the exported variables
  return module.exports;
}
require.cache = {};
require.resolve = (moduleName) => {
  // Resolve the full module ID from moduleName
};
The above implements a simple require function. Several parts of this home-made module system need explaining:
- After receiving moduleName, first resolve the full path of the module (how this is done is discussed later), and save the result in the id variable
- If the module has already been loaded, return the cached result immediately
- If the module has not been loaded yet, set up an environment for it. Specifically, create a module object that contains an exports property. This object will be populated by the module's own code when it exports its API
- Cache the module object
- Execute the loadModule function, passing in the newly created module object, so that the code of the requested module can fill it in
- Return the exported content of the requested module
Module Resolution Algorithm
Resolving the full path of a module was mentioned earlier. Given a module name, the resolve function returns the corresponding full path. That path is then used both to load the module's code and to identify the module.
The resolve function mainly deals with the following three cases:
- Is it a file module? If moduleName starts with /, it is treated as an absolute path and returned as is. If moduleName starts with ./, it is treated as a relative path, which is calculated from the directory of the module that issued the request
- Is it a core module? If moduleName does not start with / or ./, the algorithm first tries to find it among the NodeJS core modules
- Is it a package module? If no core module matches moduleName, the algorithm starts from the directory of the requesting module and searches upward, level by level, for a node_modules directory, checking whether it contains a module matching moduleName. If it does, that module is loaded. If not, the search continues with the next node_modules directory further up, all the way to the root of the file system
In this way, two modules can depend on different versions of the package, but they can still be loaded normally
For example, consider the following directory structure:
myApp
├── index.js
└── node_modules
    ├── depA
    │   └── index.js
    ├── depB
    │   ├── index.js
    │   └── node_modules
    │       └── depA
    └── depC
        ├── index.js
        └── node_modules
            └── depA
In the above example, although myApp, depB, and depC all depend on depA, the modules they actually load are different. For example:
- In /myApp/index.js, the module loaded is /myApp/node_modules/depA
- In /myApp/node_modules/depB/index.js, the module loaded is /myApp/node_modules/depB/node_modules/depA
- In /myApp/node_modules/depC/index.js, the module loaded is /myApp/node_modules/depC/node_modules/depA
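The upward search through node_modules directories can be sketched as a small path computation. This is a simplified illustration, assuming POSIX-style absolute paths. `nodeModulesPaths` is a hypothetical helper name, and it only lists the candidate directories without checking whether they exist.

```javascript
// Hypothetical helper: list the node_modules directories to search,
// starting at startDir and walking up to the file system root.
// Assumes POSIX-style absolute paths such as '/myApp/node_modules/depB'.
function nodeModulesPaths(startDir) {
  const parts = startDir.split('/').filter(Boolean);
  const dirs = [];
  for (let i = parts.length; i >= 0; i--) {
    // Skip directories that are themselves named node_modules, so that
    // '/myApp/node_modules/node_modules' is never searched.
    if (i > 0 && parts[i - 1] === 'node_modules') continue;
    const base = parts.slice(0, i).join('/');
    dirs.push(base ? `/${base}/node_modules` : '/node_modules');
  }
  return dirs;
}

const fromDepB = nodeModulesPaths('/myApp/node_modules/depB');
const fromApp = nodeModulesPaths('/myApp');
console.log(fromDepB);
console.log(fromApp);
```

For a request made from depB, the first candidate is depB's own node_modules directory, which is why depB picks up its private copy of depA before the one that belongs to myApp.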
NodeJS can manage dependencies well because of this module resolution algorithm at its core, which makes it possible to handle thousands of packages without conflicts or version incompatibilities.
Circular dependency
Many people think of circular dependencies as a theoretical design problem, but they are likely to show up in real projects, so you should know how CommonJS handles them. The risk becomes apparent when you look at the require function implemented earlier. The following example illustrates it.
There is a module main.js, which depends on two modules, a.js and b.js. At the same time, a.js depends on b.js, and b.js in turn depends on a.js, which creates a circular dependency. Here is the source code:
// a.js
exports.loaded = false;
const b = require('./b');
module.exports = { b, loaded: true };

// b.js
exports.loaded = false;
const a = require('./a');
module.exports = { a, loaded: false };

// main.js
const a = require('./a');
const b = require('./b');
console.log('A ->', JSON.stringify(a));
console.log('B ->', JSON.stringify(b));
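Running main.js gives the following result:
A -> {"b":{"a":{"loaded":false},"loaded":false},"loaded":true}
B -> {"a":{"loaded":false},"loaded":false}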
As the results show, circular dependencies are a real risk in CommonJS. When the b module imports the a module, the content it gets is incomplete. Specifically, it only reflects the state of the a.js module at the moment it requested the b.js module, not the state of the a.js module once it has finished loading.
The following is an example diagram to illustrate this process
The specific steps are as follows:
- The whole process starts from main.js, which begins by importing the a.js module
- The first thing a.js does is export a value called loaded and set it to false
- a.js then requires the b.js module
- Like a.js, b.js first exports the variable loaded as false
- b.js continues to execute and requires a.js
- Since the system has already started processing a.js, b.js immediately receives whatever a.js has exported so far
- b.js then replaces its exports with an object that contains a and its final loaded value
- Since b.js has finished executing, control returns to a.js, which receives the current state of the b.js module
- a.js continues to execute and sets its exported loaded value to true
- Finally, the rest of main.js is executed
As can be seen above, because execution is synchronous, the a.js module that b.js imports is incomplete and does not reflect the final state of a.js.
The example shows the consequences of a circular dependency, and they become more serious in large projects.
Basic usage is relatively simple and, given the limited space, is not covered in this article.
ESM
ESM is part of the ECMAScript 2015 specification, which establishes a unified module system for JavaScript that fits various execution environments. An important difference between ESM and CommonJS is that ES modules are static: import statements must be written at the top level of the module. In addition, the module specifier must be a constant string and cannot be an expression that is evaluated at runtime.
For example, we cannot import ES modules in the following way:
if (condition) {
  import module1 from 'module1'
} else {
  import module2 from 'module2'
}
CommonJS, on the other hand, can import different modules based on a condition:
let module = null
if (condition) {
  module = require("module1")
} else {
  module = require("module2")
}
ESM seems stricter than CommonJS, but it is precisely this static import mechanism that makes it possible to analyze dependencies statically and remove code that will never be executed. This is called tree shaking.
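To illustrate why static imports help, here is a toy sketch that finds unused exports by reading source text only, without running it. Everything in it is an assumption made for illustration: the `sources` table, the helper names, and the regular expressions, which only understand simple named imports and exports. Real bundlers use a full parser.

```javascript
// Hypothetical module sources; only static import/export syntax is inspected.
const sources = {
  './math.js': `
    export function add(a, b) { return a + b; }
    export function multiply(a, b) { return a * b; }
  `,
  './main.js': `
    import { add } from './math.js';
    console.log(add(1, 2));
  `,
};

// Collect names declared with "export function/const/let/class".
function exportedNames(source) {
  const names = [];
  const pattern = /export\s+(?:function|const|let|class)\s+([A-Za-z_$][\w$]*)/g;
  for (const match of source.matchAll(pattern)) names.push(match[1]);
  return names;
}

// Collect names imported from one specifier via "import { x, y } from '...'".
function importedNames(source, specifier) {
  const names = [];
  const pattern = /import\s*\{([^}]*)\}\s*from\s*['"]([^'"]+)['"]/g;
  for (const match of source.matchAll(pattern)) {
    if (match[2] !== specifier) continue;
    for (const name of match[1].split(',')) {
      const trimmed = name.trim();
      if (trimmed) names.push(trimmed);
    }
  }
  return names;
}

const used = new Set(importedNames(sources['./main.js'], './math.js'));
const unusedExports = exportedNames(sources['./math.js']).filter((name) => !used.has(name));
console.log(unusedExports); // [ 'multiply' ]
```

Since the specifier and the imported names are plain constants in the source, the analysis needs no knowledge of runtime values. A conditional require with a computed name could not be analyzed this way.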
module loading process
To understand how the ESM system works and how it handles circular dependencies, we need to understand how the system parses and executes JavaScript code.
Stages of loading modules
The goal of the interpreter is to build a graph that describes the dependencies between the modules to be loaded. This graph is called the dependency graph.
The interpreter uses this dependency graph to work out how modules depend on each other and in which order their code should be executed. For example, when we run a js file, the interpreter starts from the entry point and looks for all import statements. Whenever it finds an import statement, it recurses into the imported module in a depth-first manner until all the code has been parsed.
This process can be divided into three phases:
- Parsing: find all import statements and recursively load the content of each module from the corresponding file
- Instantiation: for each exported entity, keep a named reference in memory, but do not assign a value to it yet. Dependencies are established from the import and export keywords, and no js code is executed at this point
- Execution: NodeJS finally runs the code, so that the exported entities obtain their actual values
In CommonJS, files are executed while dependencies are being resolved. So when a require call is reached, all the code before it has already run. This is why require does not have to be at the beginning of the file and can appear anywhere.
The ESM system is different. The three phases are separate, and the complete dependency graph must be built before any code is executed.
Circular dependency
Let's rewrite the earlier CommonJS circular dependency example using ESM:
// a.js
import * as bModule from './b.js';
export let loaded = false;
export const b = bModule;
loaded = true;

// b.js
import * as aModule from './a.js';
export let loaded = false;
export const a = aModule;
loaded = true;

// main.js
import * as a from './a.js';
import * as b from './b.js';
console.log("A =>", a)
console.log("B =>", b)
Note that the JSON.stringify method cannot be used here, because the exported objects contain circular references.
In the execution results, you can see that a.js and b.js each observe the other's complete state, unlike in CommonJS, where a module may receive an incomplete state.
Parsing
Let's analyze the process below, taking the picture above as an example:
- Parsing starts from main.js. The first import statement found leads into a.js
- In a.js, another import statement is found, which leads into b.js
- In b.js, an import statement that refers to a.js is found. Because a.js has already been visited, this path is not followed again
- b.js has no other import statements, so the traversal returns to a.js, which has no other import statements either, and then returns to the entry file main.js. There, an import of b.js is found, but this module has already been visited, so this path is not followed again
After this depth-first traversal, the module dependency graph forms a tree-like structure, which the interpreter later uses to execute the code.
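The traversal just described can be sketched as a depth-first search that skips modules it has already seen. The `imports` table below is a hand-written assumption that mirrors the import statements of the example; a real interpreter would obtain it by parsing each file.

```javascript
// Hypothetical dependency table: module -> modules named in its import statements.
const imports = {
  'main.js': ['a.js', 'b.js'],
  'a.js': ['b.js'],
  'b.js': ['a.js'],
};

function discover(entry) {
  const visited = new Set();
  const order = [];
  function visit(name) {
    if (visited.has(name)) return; // already explored: do not follow this path again
    visited.add(name);
    order.push(name);
    for (const dep of imports[name]) visit(dep);
  }
  visit(entry);
  return order;
}

const discoveryOrder = discover('main.js');
console.log(discoveryOrder); // [ 'main.js', 'a.js', 'b.js' ]
```

The visited set is what keeps the cycle between a.js and b.js from turning into an endless loop.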
At this stage, the interpreter starts from the entry point and analyzes the dependencies between the modules. It only cares about import statements, loads the modules that those statements refer to, and explores the dependency graph in a depth-first manner. Traversing the dependencies in this way yields a tree-like structure.
Instantiation
At this stage, the interpreter starts at the bottom of the tree and works its way to the top. For each module it reaches, it looks up all the properties that the module exports and builds an internal table in memory that maps each exported name to the value it will eventually have.
As shown below:
The figure above shows the order in which the modules are instantiated:
- The interpreter starts with the b.js module and finds that it exports loaded and a
- Then the interpreter analyzes the a.js module and finds that it exports loaded and b
- Finally, it analyzes the main.js module and finds that it does not export anything
The exports map built in the instantiation phase only records the relationship between each exported name and the value that name will have. The values themselves are not initialized in this phase.
After the above process, the interpreter makes another pass. This time, it links the names exported by each module to the modules that import them, as shown in the following figure:
The steps this time are:
- The module b.js is linked with the content exported by a.js; this link is called aModule
- The module a.js is linked with the content exported by b.js; this link is called bModule
- Finally, the module main.js is linked with the content exported by a.js and b.js, under the names a and b
At this stage, none of the values are initialized. We only establish the links so that they can point to the corresponding values. The values themselves are determined in the next stage.
Execution
At this stage, the system finally executes the code in each file. It walks the original dependency graph bottom-up, in depth-first post-order, and executes the files one by one as it reaches them. In this example, main.js is executed last. This order ensures that by the time the program runs its main logic, the values exported by every module have been initialized.
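The post-order walk can be sketched as follows. As a simplifying assumption, `dependencyTable` is a hand-written copy of the example's import relationships, and `executionOrder` is an illustrative helper that records each module only after all of its dependencies.

```javascript
// Hypothetical dependency table mirroring the example's import statements.
const dependencyTable = {
  'main.js': ['a.js', 'b.js'],
  'a.js': ['b.js'],
  'b.js': ['a.js'],
};

function executionOrder(entry) {
  const visited = new Set();
  const order = [];
  function visit(name) {
    if (visited.has(name)) return; // part of a cycle or already scheduled
    visited.add(name);
    for (const dep of dependencyTable[name]) visit(dep);
    order.push(name); // post-order: a module is listed after its dependencies
  }
  visit(entry);
  return order;
}

const order = executionOrder('main.js');
console.log(order); // [ 'b.js', 'a.js', 'main.js' ]
```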
The specific steps in the above figure are:
- Execution starts from b.js. The first line to run initializes the loaded export of this module to false
- Next, aModule is assigned to the export a. At this point a holds a reference to the a.js module
- Then loaded is set to true. At this point all the exported values of the b module are determined
- Now a.js is executed. First, its export loaded is initialized to false
- Next, the b export of the module gets its value, which is a reference to bModule
- Finally, loaded is changed to true. At this point all the values exported by the a.js module are determined
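The effect of linking first and assigning values later can be imitated in plain JavaScript with getters. This is only an approximation made for illustration: `makeNamespace`, `bindingsA`, and `bindingsB` are hypothetical names, and real ES module namespaces are created by the engine. One known difference is that reading an uninitialized let or const export in a real ES module throws a ReferenceError, whereas this sketch would return undefined.

```javascript
// Instantiation: create the bindings and namespace objects (links) first.
const bindingsA = { loaded: undefined, b: undefined };
const bindingsB = { loaded: undefined, a: undefined };

// A namespace exposes getters, so readers always see the current value.
function makeNamespace(bindings) {
  const ns = {};
  for (const name of Object.keys(bindings)) {
    Object.defineProperty(ns, name, { enumerable: true, get: () => bindings[name] });
  }
  return ns;
}

const aModule = makeNamespace(bindingsA); // what b.js and main.js import from a.js
const bModule = makeNamespace(bindingsB); // what a.js and main.js import from b.js

// Execution, bottom-up: b.js first, then a.js, then main.js.
function runB() {
  bindingsB.loaded = false;
  bindingsB.a = aModule; // a reference to the namespace of a.js, not a copy
  bindingsB.loaded = true;
}

function runA() {
  bindingsA.loaded = false;
  bindingsA.b = bModule;
  bindingsA.loaded = true;
}

runB();
runA();

// main.js: each module observes the final state of the other.
const summary = {
  aLoaded: aModule.loaded,
  bLoaded: bModule.loaded,
  aSeenFromB: bModule.a.loaded,
  bSeenFromA: aModule.b.loaded,
};
console.log(summary);
```

All four values in summary are true. When b.js ran, a.js had not executed yet, but because b holds a link rather than a snapshot, reading it later yields the final value.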
After completing these steps, the system can finally execute the main.js file. By this time, all the properties exported by every module have been evaluated. Because the system imports modules by reference rather than by copying, each module can see the final state of the other even when there is a circular dependency between them.
The differences between CommonJS and ESM, and how to use them together
Here are some important differences between CommonJS and ESM, and how to use both module systems together when necessary.
ESM does not support some references provided by CommonJS
CommonJS provides some key references that are not supported by ESM, including require, exports, module.exports, __filename, and __dirname. If these are used in an ES module, the program throws a ReferenceError.
In the ESM system, we can use the special object import.meta to obtain a reference to the current file. Specifically, import.meta.url holds the URL of the current module, in a form like file:///path/to/current_module.js. Based on this URL, we can reconstruct the two absolute paths represented by __filename and __dirname:
import { fileURLToPath } from 'url';
import { dirname } from 'path';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
The require function of CommonJS can also be recreated in an ESM module in the following way:
import { createRequire } from 'module';

const require = createRequire(import.meta.url);
Now you can use this require() function to load CommonJS modules from within the ES module system.
Using modules from the other module system
As shown above, the module.createRequire function can be used in an ESM module to load a CommonJS module. Besides this method, you can also import a CommonJS module with the standard import syntax. However, only the default export is guaranteed to be available this way. Named imports work only when NodeJS can detect the export names by static analysis, so they may fail:
import pkg from 'commonJS-module';
import { method1 } from 'commonJS-module'; // may throw an error
The reverse, however, is not possible: we cannot import an ESM module from CommonJS.
In addition, ESM does not support importing JSON files as modules with a plain import statement, which is easy to do in CommonJS.
The following import statement reports an error:
import json from 'data.json'
If you need to import a JSON file, you can again use the createRequire function:
import { createRequire } from 'module';

const require = createRequire(import.meta.url);
const data = require("./data.json");
console.log(data)
Summary
This article mainly explains how the two module systems in NodeJS work. Understanding these mechanisms can help us avoid some hard-to-diagnose bugs.