tree-shaking
Tree-shaking was first proposed by Rich Harris in Rollup.
It was created to reduce the final bundle size.
Here is the description from MDN:
tree-shaking is a term commonly used to describe the behavior of removing dead-code in the context of JavaScript.
It relies on the import and export statements in ES2015 to detect whether code modules are exported, imported, and used by JavaScript files.
In modern JavaScript applications, we use module bundlers (like webpack or Rollup) to automatically remove unreferenced code when bundling multiple JavaScript files into a single file. This is important for preparing code for release, so that the final file has a clean structure and minimal size.
tree-shaking VS dead code elimination
Any discussion of tree-shaking has to mention dead code elimination, or DCE for short.
Many people regard tree-shaking as just one technique for achieving DCE. If the two are the same thing with the same end goal (less code), why give it the separate name tree-shaking?
Rich Harris, who popularized the term, answers this in his article "tree-shaking versus dead code elimination".
He uses the analogy of baking a cake. The original text is as follows:
Bad analogy time: imagine that you made cakes by throwing whole eggs into the mixing bowl and smashing them up, instead of cracking them open and pouring the contents out. Once the cake comes out of the oven, you remove the fragments of eggshell, except that's quite tricky so most of the eggshell gets left in there.
You'd probably eat less cake, for one thing.
That's what dead code elimination consists of — taking the finished product, and imperfectly removing bits you don't want. tree-shaking, on the other hand, asks the opposite question: given that I want to make a cake, which bits of what ingredients do I need to include in the mixing bowl?
Rather than excluding dead code, we're including live code. Ideally the end result would be the same, but because of the limitations of static analysis in JavaScript that's not the case. Live code inclusion gets better results, and is prima facie a more logical approach to the problem of preventing our users from downloading unused code.
To put it simply: DCE is like putting whole eggs into the cake and then picking the eggshell out of the finished cake, while tree-shaking is like removing the eggshell first and then making the cake. Both aim for the same result, but the process is completely different.
dead code
Dead code generally has the following characteristics:
- The code is unreachable and will never be executed
- The result of executing the code is never used
- The code only affects dead variables (written but never read)
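To make the three characteristics concrete, here is a small sketch. The function total and its variable names are made up for illustration; each marked line is dead in one of the senses listed above, and removing all of them would not change what the function returns.

```javascript
// Illustrative only: each marked line is dead code in one of the three senses.
function total(prices) {
  let sum = 0;
  let debugLabel = 'start';     // (3) write-only variable: assigned but never read
  for (const p of prices) {
    sum += p;
    debugLabel = 'item ' + p;   // (3) still only written, never read
  }
  Math.max(sum, 0);             // (2) the result is computed but never used
  return sum;
  sum = -1;                     // (1) unreachable: it follows the return
}

console.log(total([1, 2, 3])); // 6
```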
Use webpack to bundle the following code with mode: 'development':
function app() {
  var test = '我是app'; // "I am app"
  function set() {
    return 1;
  }
  return test;
  test = '无法执行'; // "cannot be executed"
  return test;
}
export default app;
The resulting bundle:
eval(
"function app() {\n var test = '我是app';\n function set() {\n return 1;\n }\n return test;\n test = '无法执行';\n return test;\n}\n\napp();\n\n\n//# sourceURL=webpack://webpack/./src/main.js?"
);
You can see that the bundle still contains code that can never be executed.
Does webpack not support dead code elimination? Correct, webpack itself does not.
It turns out that dead code elimination in a webpack build is not done by webpack itself, but by a minifier, the famous uglify.
Reading the source code shows that in mode: 'development' the terser-webpack-plugin plugin is never applied.
// lib/config/defaults.js
D(optimization, 'minimize', production);
A(optimization, 'minimizer', () => [
  {
    apply: (compiler) => {
      // Lazy load the Terser plugin
      const TerserPlugin = require('terser-webpack-plugin');
      new TerserPlugin({
        terserOptions: {
          compress: {
            passes: 2
          }
        }
      }).apply(compiler);
    }
  }
]);
// lib/WebpackOptionsApply.js
if (options.optimization.minimize) {
  for (const minimizer of options.optimization.minimizer) {
    if (typeof minimizer === 'function') {
      minimizer.call(compiler, compiler);
    } else if (minimizer !== '...') {
      minimizer.apply(compiler);
    }
  }
}
The terser-webpack-plugin plugin in turn relies on Terser, a fork of uglify-es, to do the actual work.
Now we bundle with mode: 'production'.
// Result after formatting
(() => {
  var r = {
    225: (r) => {
      r.exports = '我是app';
    }
  },
  // ...
})();
In the final result, the unreachable part of the code has been removed. The minifier also compressed the code and stripped the comments.
tree shaking doesn't work
Tree shaking essentially eliminates unused code by statically analysing ES modules.
ES module features:
- An import can only appear as a statement at the top level of a module, not inside a function or inside an if. (ECMA-262 15.2)
- Imported module names can only be string constants. (ECMA-262 15.2.2)
- Regardless of where the import statement appears, all imports must already be done when the module is initialized. (ECMA-262 15.2.1.16.4 - 8.a)
- Import bindings are immutable, similar to const. For example, you can't import { a } from './a' and then assign a to something else. (ECMA-262 15.2.1.16.4 - 12.c.3)
Quoted from You Yuxi
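The first rule can be observed directly from the parser. The sketch below uses new Function purely as a way to ask the JavaScript engine whether a snippet parses as a function body; the helper name parsesAsFunctionBody is made up for this example, and nothing is ever executed or imported.

```javascript
// Returns true when `source` parses as a function body.
// new Function only parses here; the resulting function is never called.
function parsesAsFunctionBody(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// A static import declaration is only legal at the top level of a module,
// so the parser rejects it when it is nested inside an if block.
const nestedImport = parsesAsFunctionBody("if (flag) { import a from './a'; }");

// A dynamic import() call is an ordinary expression and may appear anywhere.
const dynamicImport = parsesAsFunctionBody("if (flag) { return import('./a'); }");

console.log(nestedImport, dynamicImport); // false true
```

Because static imports are rejected everywhere except the top level, a bundler can list every dependency of a module without running any of its code.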
Let's take a look at the effect of tree shaking.
We have a module:
// ./src/app.js
export const firstName = 'firstName'
export function getName ( x ) {
  return x.a
}
getName({ a: 123 })
export function app ( x ) {
  return x * x * x;
}
export default app;
Below are 7 instances.
// 1*********************************************
// import App from './app'
// export function main() {
// var test = '我是index';
// return test;
// }
// console.log(main)
// 2*********************************************
// import App from './app'
// export function main() {
// var test = '我是index';
// console.log(App(1))
// return test;
// }
// console.log(main)
// 3*********************************************
// import App from './app'
// export function main() {
// var test = '我是index';
// App.square(1)
// return test;
// }
// console.log(main)
// 4*********************************************
// import App from './app'
// export function main() {
// var test = '我是index';
// let methodName = 'square'
// App[methodName](1)
// return test;
// }
// console.log(main)
// 6*********************************************
// import * as App from './app'
// export function main() {
// var test = '我是index';
// App.square(1)
// return test;
// }
// console.log(main)
// 7*********************************************
// import * as App from './app'
// export function main() {
// var test = '我是index';
// let methodName = 'square'
// App[methodName](1)
// return test;
// }
// console.log(main)
Bundle with a minimal webpack configuration:
// webpack.config.js
module.exports = {
  entry: './src/index.js',
  output: {
    filename: 'dist.js'
  },
  mode: 'production'
};
The results show that dead code was eliminated in every case except the seventh, where elimination failed.
/* ... */
const r = 'firstName';
function o(e) {
  return e.a;
}
function n(e) {
  return e * e * e;
}
o({ a: 123 });
const a = n;
console.log(function () {
  return t.square(1), '我是index';
});
I have not studied this in detail, so I can only guess. JavaScript's dynamic nature makes static analysis difficult. The bundler analyses the code statically, and it is still unable to resolve a whole-module import (import * as ...) combined with a property name that is only computed at run time.
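The guess can be illustrated with a toy model. This is not webpack's algorithm; usedExports here is a hypothetical helper that receives the export names a module provides and a description of how the importer accesses its namespace object, and decides what must be kept.

```javascript
// Hypothetical helper, not webpack's implementation.
// provided: export names of the module
// accesses: how the importer touches the namespace object
function usedExports(provided, accesses) {
  const used = new Set();
  for (const access of accesses) {
    if (access.computed) {
      // App[methodName]: the key is only known at run time, so a static
      // analyser has to assume that every export may be used.
      return new Set(provided);
    }
    if (provided.includes(access.name)) used.add(access.name);
  }
  return used;
}

const provided = ['firstName', 'getName', 'app'];

// App.app(1): the property name is visible in the source text.
const staticUse = usedExports(provided, [{ computed: false, name: 'app' }]);

// App[methodName](1): only the variable name is visible, not its value.
const dynamicUse = usedExports(provided, [{ computed: true }]);

console.log([...staticUse]);  // [ 'app' ]
console.log([...dynamicUse]); // [ 'firstName', 'getName', 'app' ]
```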
For more on how tree shaking is carried out, refer to the following link:
Of course, savvy programmers won't be stumped by this. Since static analysis falls short here, it is up to the developer to manually mark files as side-effect-free.
tree shaking and sideEffects
sideEffects supports two forms: false or an array.
- If none of the code has side effects, simply set the property to false
- If some of your code does have side effects, provide an array listing those files instead
It can be set in package.json.
// boolean
{
  "sideEffects": false
}
// array
{
  "sideEffects": ["./src/app.js", "*.css"]
}
It can also be set in module.rules.
module.exports = {
  module: {
    rules: [
      {
        test: /\.jsx?$/,
        exclude: /(node_modules)/,
        use: {
          loader: 'babel-loader',
        },
        // false marks the matched modules as side-effect-free
        sideEffects: false
      }
    ]
  },
}
With sideEffects: false set, bundle again:
var e = {
  225: (e, r, t) => {
    (e = t.hmd(e)).exports = '我是main';
  }
},
Only the code of the main.js module is left; the code of app.js has been eliminated.
usedExports
Besides sideEffects, webpack provides another way of marking code for elimination: the usedExports configuration option.
The information collected by optimization.usedExports is used by other optimizations or by code generation: exports are not generated for unused exports, and export names are mangled to single-character identifiers when all usages are compatible. Dead code elimination in minimizers benefits from this option and can remove unused exports.
It is enabled by default in mode: 'production'.
module.exports = {
  //...
  optimization: {
    usedExports: true,
  },
};
usedExports relies on terser to detect whether code has side effects. If an export is never used and has no side effects, it is marked as an unused harmony export at bundling time.
Finally, DCE tools such as Terser and UglifyJS "shake" this dead code out.
The principle of tree shaking
Tree shaking itself also relies on static analysis.
Static code analysis is a technique that scans program code without running it, using lexical analysis, syntax analysis, control flow analysis, data flow analysis and other techniques, to verify whether the code meets standards for conformance, security, reliability, maintainability and other indicators.
Tree shaking requires ES6 module syntax, because it depends on the static ES6 statements import and export.
Next, let's take a look at how an early version of rollup implements tree shaking.
- Initialize a Module from the content of the entry module, and use acorn to convert it into an AST
- Analyse the AST: look for the import and export keywords to establish dependencies
- Analyse the AST: collect the functions, variables and other information that exist in the current module
- Analyse the AST again and collect how each function and variable is used. Because code is collected by following dependencies, a function or variable that is never used is simply never collected
- Based on the collected function and variable identifiers, make a decision: if an identifier is an import, create a Module for it and repeat the steps above; otherwise store the corresponding code in a unified result
- Generate the bundle from the final result
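The steps above can be sketched in a few lines if the parsing is skipped. In this sketch the modules are written out by hand as if they had already been analysed; the shape of the records (imports, definitions, dependsOn) and the file names are assumptions made for illustration and are much simpler than rollup's real data structures.

```javascript
// Hypothetical, already-analysed modules: for every top-level name we know
// the source of its statement and the names that statement reads.
const modules = {
  './main.js': {
    imports: { cube: { source: './math.js', name: 'cube' } },
    definitions: {
      main: { code: 'function main() { return cube(3); }', dependsOn: ['cube'] }
    }
  },
  './math.js': {
    imports: {},
    definitions: {
      square: { code: 'function square(x) { return x * x; }', dependsOn: [] },
      cube: { code: 'function cube(x) { return x * multiply(x, x); }', dependsOn: ['multiply'] },
      multiply: { code: 'function multiply(a, b) { return a * b; }', dependsOn: [] }
    }
  }
};

// Include the statement that defines `name` in module `path`,
// placing the statements it depends on before it.
function include(path, name, result, seen) {
  const key = path + '#' + name;
  if (seen.has(key)) return;
  seen.add(key);

  const mod = modules[path];
  const imported = mod.imports[name];
  if (imported) {
    // The name comes from another module: continue the walk over there.
    include(imported.source, imported.name, result, seen);
    return;
  }
  const definition = mod.definitions[name];
  if (!definition) return; // e.g. a global such as console: nothing to include
  for (const dep of definition.dependsOn) include(path, dep, result, seen);
  result.push(definition.code);
}

function bundle(entryPath, entryName) {
  const result = [];
  include(entryPath, entryName, result, new Set());
  return result.join('\n');
}

const output = bundle('./main.js', 'main');
console.log(output);
```

The output contains multiply, cube and main, in that order. square never appears, not because anything removed it, but because nothing ever asked for it.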
Source version: v0.3.1
Create a Bundle from the entry file, and call its build method to start bundling.
export function rollup ( entry, options = {} ) {
  const bundle = new Bundle({
    entry,
    resolvePath: options.resolvePath
  });
  return bundle.build().then( () => {
    return {
      generate: options => bundle.generate( options ),
      write: ( dest, options = {} ) => {
        let { code, map } = bundle.generate({
          dest,
          format: options.format,
          globalName: options.globalName
        });
        code += `\n//# ${SOURCEMAPPING_URL}=${basename( dest )}.map`;
        return Promise.all([
          writeFile( dest, code ),
          writeFile( dest + '.map', map.toString() )
        ]);
      }
    };
  });
}
build internally calls the fetchModule method, which reads the file content by file name with readFile and creates a Module.
build () {
  return this.fetchModule( this.entryPath, null )
    .then( entryModule => {
      this.entryModule = entryModule;
      if ( entryModule.exports.default ) {
        let defaultExportName = makeLegalIdentifier( basename( this.entryPath ).slice( 0, -extname( this.entryPath ).length ) );
        while ( entryModule.ast._scope.contains( defaultExportName ) ) {
          defaultExportName = `_${defaultExportName}`;
        }
        entryModule.suggestName( 'default', defaultExportName );
      }
      return entryModule.expandAllStatements( true );
    })
    .then( statements => {
      this.statements = statements;
      this.deconflict();
    });
}
fetchModule ( importee, importer ) {
  return Promise.resolve( importer === null ? importee : this.resolvePath( importee, importer ) )
    .then( path => {
      /*
        cache handling
      */
      this.modulePromises[ path ] = readFile( path, { encoding: 'utf-8' })
        .then( code => {
          const module = new Module({
            path,
            code,
            bundle: this
          });
          return module;
        });
      return this.modulePromises[ path ];
    });
}
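The excerpt above leaves the cache handling out. As a rough idea of what caching by path could look like, here is a sketch; it is my own assumption and not rollup's code. loadModule is a hypothetical stand-in for readFile plus new Module, and the counter only exists to show how often a file would be read.

```javascript
// Hypothetical loader standing in for readFile + new Module(...).
let loads = 0;
function loadModule(path) {
  loads += 1;
  return Promise.resolve({ path, code: '/* source of ' + path + ' */' });
}

// One promise per resolved path.
const modulePromises = {};

function fetchModuleCached(path) {
  // Reuse the pending or settled promise so that each file is loaded only
  // once, even when several modules import it.
  if (!Object.prototype.hasOwnProperty.call(modulePromises, path)) {
    modulePromises[path] = loadModule(path);
  }
  return modulePromises[path];
}

const first = fetchModuleCached('./a.js');
const second = fetchModuleCached('./a.js');
console.log(first === second, loads); // true 1
```

Storing the promise rather than the finished module means a second request that arrives while the first read is still in flight shares the same read.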
Based on the content that was read from the file, use the acorn parser to convert the source into an AST.
export default class Module {
  constructor ({ path, code, bundle }) {
    /*
      initialization
    */
    this.ast = parse(code, {
      ecmaVersion: 6,
      sourceType: 'module',
      onComment: (block, text, start, end) =>
        this.comments.push({ block, text, start, end })
    });
    this.analyse();
  }
  // ...
}
Traverse the nodes and look for the import and export keywords. This step is what is usually meant by analysing the static structure of ESM.
The information about each import is collected into this.imports, and the information about each export into this.exports.
this.ast.body.forEach( node => {
let source;
if ( node.type === 'ImportDeclaration' ) {
source = node.source.value;
node.specifiers.forEach( specifier => {
const isDefault = specifier.type === 'ImportDefaultSpecifier';
const isNamespace = specifier.type === 'ImportNamespaceSpecifier';
const localName = specifier.local.name;
const name = isDefault ? 'default' : isNamespace ? '*' : specifier.imported.name;
if ( has( this.imports, localName ) ) {
const err = new Error( `Duplicated import '${localName}'` );
err.file = this.path;
err.loc = getLocation( this.code.original, specifier.start );
throw err;
}
this.imports[ localName ] = {
source, // module id
name,
localName
};
});
}
else if ( /^Export/.test( node.type ) ) {
if ( node.type === 'ExportDefaultDeclaration' ) {
const isDeclaration = /Declaration$/.test( node.declaration.type );
this.exports.default = {
node,
name: 'default',
localName: isDeclaration ? node.declaration.id.name : 'default',
isDeclaration
};
}
else if ( node.type === 'ExportNamedDeclaration' ) {
// export { foo } from './foo';
source = node.source && node.source.value;
if ( node.specifiers.length ) {
node.specifiers.forEach( specifier => {
const localName = specifier.local.name;
const exportedName = specifier.exported.name;
this.exports[ exportedName ] = {
localName,
exportedName
};
if ( source ) {
this.imports[ localName ] = {
source,
localName,
name: exportedName
};
}
});
}
else {
let declaration = node.declaration;
let name;
if ( declaration.type === 'VariableDeclaration' ) {
name = declaration.declarations[0].id.name;
} else {
name = declaration.id.name;
}
this.exports[ name ] = {
node,
localName: name,
expression: declaration
};
}
}
}
}
analyse () {
// imports and exports, indexed by ID
this.imports = {};
this.exports = {};
// traverse the ast and record the corresponding import and export relationships
this.ast.body.forEach( node => {
let source;
// import foo from './foo';
// import { bar } from './bar';
if ( node.type === 'ImportDeclaration' ) {
source = node.source.value;
node.specifiers.forEach( specifier => {
const isDefault = specifier.type === 'ImportDefaultSpecifier';
const isNamespace = specifier.type === 'ImportNamespaceSpecifier';
const localName = specifier.local.name;
const name = isDefault ? 'default' : isNamespace ? '*' : specifier.imported.name;
if ( has( this.imports, localName ) ) {
const err = new Error( `Duplicated import '${localName}'` );
err.file = this.path;
err.loc = getLocation( this.code.original, specifier.start );
throw err;
}
this.imports[ localName ] = {
source, // module id
name,
localName
};
});
}
else if ( /^Export/.test( node.type ) ) {
// export default function foo () {}
// export default foo;
// export default 42;
if ( node.type === 'ExportDefaultDeclaration' ) {
const isDeclaration = /Declaration$/.test( node.declaration.type );
this.exports.default = {
node,
name: 'default',
localName: isDeclaration ? node.declaration.id.name : 'default',
isDeclaration
};
}
// export { foo, bar, baz }
// export var foo = 42;
// export function foo () {}
else if ( node.type === 'ExportNamedDeclaration' ) {
// export { foo } from './foo';
source = node.source && node.source.value;
if ( node.specifiers.length ) {
// export { foo, bar, baz }
node.specifiers.forEach( specifier => {
const localName = specifier.local.name;
const exportedName = specifier.exported.name;
this.exports[ exportedName ] = {
localName,
exportedName
};
if ( source ) {
this.imports[ localName ] = {
source,
localName,
name: exportedName
};
}
});
}
else {
let declaration = node.declaration;
let name;
if ( declaration.type === 'VariableDeclaration' ) {
name = declaration.declarations[0].id.name;
} else {
name = declaration.id.name;
}
this.exports[ name ] = {
node,
localName: name,
expression: declaration
};
}
}
}
}
// find functions, variables, classes, block scopes, etc., and link them according to their references
analyse( this.ast, this.code, this );
}
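To see what ends up in this.imports, the import branch can be run on a single node written out by hand. The sketch below repeats the same bookkeeping in a standalone function called collectImports (a name made up here); the node follows the ESTree shape that acorn produces, but it is typed in manually rather than parsed.

```javascript
// Same bookkeeping as the import branch above, as a standalone function.
function collectImports(node, imports = {}) {
  const source = node.source.value;
  for (const specifier of node.specifiers) {
    const isDefault = specifier.type === 'ImportDefaultSpecifier';
    const isNamespace = specifier.type === 'ImportNamespaceSpecifier';
    const localName = specifier.local.name;
    const name = isDefault ? 'default' : isNamespace ? '*' : specifier.imported.name;
    imports[localName] = { source, name, localName };
  }
  return imports;
}

// Hand-written ESTree-shaped node for:
//   import App, { getName as readName } from './app';
const node = {
  type: 'ImportDeclaration',
  source: { type: 'Literal', value: './app' },
  specifiers: [
    { type: 'ImportDefaultSpecifier', local: { type: 'Identifier', name: 'App' } },
    {
      type: 'ImportSpecifier',
      imported: { type: 'Identifier', name: 'getName' },
      local: { type: 'Identifier', name: 'readName' }
    }
  ]
};

const imports = collectImports(node);
console.log(imports.App.name, imports.readName.name); // default getName
```

The records are keyed by the local name, which is the name the rest of the module actually uses, while name remembers what to ask the other module for.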
Next, find functions, variables, classes, block scopes, etc., and link them according to their reference relationships.
Use magicString to give each statement node the ability to modify its source content.
Traverse the entire AST. First initialize a Scope as the namespace of the current module. Whenever a function or block scope is entered, create a new Scope. Each Scope is linked to its parent, so the scopes form a tree.
If a node declares a variable or a function, associate it with the current Scope by adding its identifier name to that Scope. At this point, the functions and variables declared in each node have been collected.
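The Scope class itself is not shown in this article. The sketch below is a minimal version written to match how it is used in the code that follows (add, contains, findDefiningScope, depth, parent); the details, including the hoisting of non-block declarations, are my assumptions rather than a copy of rollup's source.

```javascript
class Scope {
  constructor({ parent = null, params = [], block = false } = {}) {
    this.parent = parent;
    this.depth = parent ? parent.depth + 1 : 0; // 0 means module level
    this.names = params.slice();
    this.isBlockScope = block;
  }

  // isBlockDeclaration is true for let, false for var and functions.
  add(name, isBlockDeclaration) {
    if (!isBlockDeclaration && this.isBlockScope && this.parent) {
      // Non-block declarations are hoisted to the enclosing scope.
      this.parent.add(name, isBlockDeclaration);
    } else {
      this.names.push(name);
    }
  }

  contains(name) {
    return this.findDefiningScope(name) !== null;
  }

  // Walk up the parent chain until a scope that declares `name` is found.
  findDefiningScope(name) {
    if (this.names.includes(name)) return this;
    return this.parent ? this.parent.findDefiningScope(name) : null;
  }
}

const moduleScope = new Scope();
moduleScope.add('app', false);
const fnScope = new Scope({ parent: moduleScope, params: ['x'] });
const blockScope = new Scope({ parent: fnScope, block: true });
blockScope.add('tmp', true);      // like: let tmp
blockScope.add('hoisted', false); // like: var hoisted, lands in fnScope

console.log(blockScope.findDefiningScope('app').depth); // 0
console.log(blockScope.findDefiningScope('hoisted') === fnScope); // true
```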
Next, traverse the AST again to find out, for each variable and function, whether it is only read or also modified.
Identifiers are located through nodes of type Identifier. If an identifier resolves to the top-level Scope, or to no Scope at all, and the current statement does not define it itself, it counts as a read and is stored in the _dependsOn collection.
Then AssignmentExpression, UpdateExpression and CallExpression nodes are used to collect the identifiers that are assigned to, updated, or passed as call arguments. The result is stored in _modifies.
function analyse(ast, magicString, module) {
var scope = new Scope();
var currentTopLevelStatement = undefined;
function addToScope(declarator) {
var name = declarator.id.name;
scope.add(name, false);
if (!scope.parent) {
currentTopLevelStatement._defines[name] = true;
}
}
function addToBlockScope(declarator) {
var name = declarator.id.name;
scope.add(name, true);
if (!scope.parent) {
currentTopLevelStatement._defines[name] = true;
}
}
// first we need to generate comprehensive scope info
var previousStatement = null;
var commentIndex = 0;
ast.body.forEach(function (statement) {
currentTopLevelStatement = statement; // so we can attach scoping info
Object.defineProperties(statement, {
_defines: { value: {} },
_modifies: { value: {} },
_dependsOn: { value: {} },
_included: { value: false, writable: true },
_module: { value: module },
_source: { value: magicString.snip(statement.start, statement.end) }, // TODO don't use snip, it's a waste of memory
_margin: { value: [0, 0] },
_leadingComments: { value: [] },
_trailingComment: { value: null, writable: true } });
var trailing = !!previousStatement;
// attach leading comment
do {
var comment = module.comments[commentIndex];
if (!comment || comment.end > statement.start) break;
// attach any trailing comment to the previous statement
if (trailing && !/\n/.test(magicString.slice(previousStatement.end, comment.start))) {
previousStatement._trailingComment = comment;
}
// then attach leading comments to this statement
else {
statement._leadingComments.push(comment);
}
commentIndex += 1;
trailing = false;
} while (module.comments[commentIndex]);
// determine margin
var previousEnd = previousStatement ? (previousStatement._trailingComment || previousStatement).end : 0;
var start = (statement._leadingComments[0] || statement).start;
var gap = magicString.original.slice(previousEnd, start);
var margin = gap.split('\n').length;
if (previousStatement) previousStatement._margin[1] = margin;
statement._margin[0] = margin;
walk(statement, {
enter: function (node) {
var newScope = undefined;
magicString.addSourcemapLocation(node.start);
switch (node.type) {
case 'FunctionExpression':
case 'FunctionDeclaration':
case 'ArrowFunctionExpression':
var names = node.params.map(getName);
if (node.type === 'FunctionDeclaration') {
addToScope(node);
} else if (node.type === 'FunctionExpression' && node.id) {
names.push(node.id.name);
}
newScope = new Scope({
parent: scope,
params: names, // TODO rest params?
block: false
});
break;
case 'BlockStatement':
newScope = new Scope({
parent: scope,
block: true
});
break;
case 'CatchClause':
newScope = new Scope({
parent: scope,
params: [node.param.name],
block: true
});
break;
case 'VariableDeclaration':
node.declarations.forEach(node.kind === 'let' ? addToBlockScope : addToScope); // TODO const?
break;
case 'ClassDeclaration':
addToScope(node);
break;
}
if (newScope) {
Object.defineProperty(node, '_scope', { value: newScope });
scope = newScope;
}
},
leave: function (node) {
if (node === currentTopLevelStatement) {
currentTopLevelStatement = null;
}
if (node._scope) {
scope = scope.parent;
}
}
});
previousStatement = statement;
});
// then, we need to find which top-level dependencies this statement has,
// and which it potentially modifies
ast.body.forEach(function (statement) {
function checkForReads(node, parent) {
if (node.type === 'Identifier') {
// disregard the `bar` in `foo.bar` - these appear as Identifier nodes
if (parent.type === 'MemberExpression' && node !== parent.object) {
return;
}
// disregard the `bar` in { bar: foo }
if (parent.type === 'Property' && node !== parent.value) {
return;
}
var definingScope = scope.findDefiningScope(node.name);
if ((!definingScope || definingScope.depth === 0) && !statement._defines[node.name]) {
statement._dependsOn[node.name] = true;
}
}
}
function checkForWrites(node) {
function addNode(node, disallowImportReassignments) {
while (node.type === 'MemberExpression') {
node = node.object;
}
// disallow assignments/updates to imported bindings and namespaces
if (disallowImportReassignments && has(module.imports, node.name) && !scope.contains(node.name)) {
var err = new Error('Illegal reassignment to import \'' + node.name + '\'');
err.file = module.path;
err.loc = getLocation(module.code.toString(), node.start);
throw err;
}
if (node.type !== 'Identifier') {
return;
}
statement._modifies[node.name] = true;
}
if (node.type === 'AssignmentExpression') {
addNode(node.left, true);
} else if (node.type === 'UpdateExpression') {
addNode(node.argument, true);
} else if (node.type === 'CallExpression') {
node.arguments.forEach(function (arg) {
return addNode(arg, false);
});
}
// TODO UpdateExpressions, method calls?
}
walk(statement, {
enter: function (node, parent) {
// skip imports
if (/^Import/.test(node.type)) return this.skip();
if (node._scope) scope = node._scope;
checkForReads(node, parent);
checkForWrites(node, parent);
//if ( node.type === 'ReturnStatement')
},
leave: function (node) {
if (node._scope) scope = scope.parent;
}
});
});
ast._scope = scope;
}
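The second pass is easier to follow on a single statement. The sketch below is a heavily reduced version of it: there is no scope handling, only assignments count as writes, and the walker is a few lines of my own instead of the real walk helper. The AST is written by hand in the ESTree shape.

```javascript
// Minimal recursive walker over ESTree-shaped objects.
function walk(node, visit, parent = null) {
  visit(node, parent);
  for (const key of Object.keys(node)) {
    const child = node[key];
    if (Array.isArray(child)) {
      child.forEach(c => {
        if (c && typeof c.type === 'string') walk(c, visit, node);
      });
    } else if (child && typeof child.type === 'string') {
      walk(child, visit, node);
    }
  }
}

// Simplified: every identifier is a read, assignment targets are writes.
function readsAndWrites(statement) {
  const dependsOn = {};
  const modifies = {};
  walk(statement, (node, parent) => {
    if (node.type === 'Identifier') {
      // disregard the `bar` in `foo.bar`
      if (parent && parent.type === 'MemberExpression' && node !== parent.object) return;
      dependsOn[node.name] = true;
    }
    if (node.type === 'AssignmentExpression') {
      let target = node.left;
      while (target.type === 'MemberExpression') target = target.object;
      if (target.type === 'Identifier') modifies[target.name] = true;
    }
  });
  return { dependsOn: Object.keys(dependsOn), modifies: Object.keys(modifies) };
}

// Hand-written AST for: count = count + step;
const statement = {
  type: 'ExpressionStatement',
  expression: {
    type: 'AssignmentExpression',
    operator: '=',
    left: { type: 'Identifier', name: 'count' },
    right: {
      type: 'BinaryExpression',
      operator: '+',
      left: { type: 'Identifier', name: 'count' },
      right: { type: 'Identifier', name: 'step' }
    }
  }
};

const info = readsAndWrites(statement);
console.log(info.dependsOn, info.modifies); // [ 'count', 'step' ] [ 'count' ]
```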
In the previous step, we attached the declarations of functions, variables, classes, block scopes, etc. to each statement node. Now we need to gather this information from the nodes and store it on the Module.
this.ast.body.forEach( statement => {
  Object.keys( statement._defines ).forEach( name => {
    this.definitions[ name ] = statement;
  });
  Object.keys( statement._modifies ).forEach( name => {
    if ( !has( this.modifications, name ) ) {
      this.modifications[ name ] = [];
    }
    this.modifications[ name ].push( statement );
  });
});
From this, we can tell for each statement which names it depends on and which names it modifies.
Once the entry module has been processed, we traverse its statement nodes and call define for each name recorded in _dependsOn.
If a name in _dependsOn also appears in this.imports, the identifier comes from an imported module: call the fetchModule method and repeat the logic above.
If it is an ordinary function, variable or the like, collect the corresponding statement. When this finishes, every related statement has been collected; anything that was not collected is unused code and is thereby filtered out.
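Within a single module, the collection step can be sketched as follows. The statement records are written by hand, and the rule that statements modifying an included name are kept as well is my assumption about why modifications is tracked; the function names expand and includeStatement are made up for this example.

```javascript
// Hypothetical, already-analysed statements of one module.
const statements = [
  { code: 'var counter = 0;', defines: ['counter'], modifies: [], dependsOn: [] },
  { code: 'counter += 1;', defines: [], modifies: ['counter'], dependsOn: ['counter'] },
  { code: 'var unused = 42;', defines: ['unused'], modifies: [], dependsOn: [] },
  { code: 'function read() { return counter; }', defines: ['read'], modifies: [], dependsOn: ['counter'] }
];

// Index the statements by the names they define and modify.
const definitions = {};
const modifications = {};
for (const s of statements) {
  for (const name of s.defines) definitions[name] = s;
  for (const name of s.modifies) {
    if (!modifications[name]) modifications[name] = [];
    modifications[name].push(s);
  }
}

// Include the statement defining `name`, plus every statement that modifies it.
function expand(name, included) {
  const definition = definitions[name];
  if (!definition) return;
  includeStatement(definition, included);
  for (const s of modifications[name] || []) includeStatement(s, included);
}

function includeStatement(statement, included) {
  if (included.has(statement)) return;
  included.add(statement);
  for (const dep of statement.dependsOn) expand(dep, included);
}

const included = new Set();
expand('read', included);

// Emit in source order, skipping everything that was never reached.
const kept = statements.filter(s => included.has(s)).map(s => s.code);
console.log(kept.join('\n'));
```

Starting from read, the walk reaches counter and the statement that increments it, while var unused = 42; is never visited and therefore never emitted.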
Finally, the collected statements are reassembled into the bundle and written to a file through fs.
Closing thoughts
There are many more points worth exploring in tree shaking, such as:
- CSS tree shaking
- Webpack's tree shaking implementation
- How to keep tree shaking from becoming ineffective
- ...