Debugging Memory Leaks in Node.js Applications
Node.js is a platform built on Chrome's V8 JavaScript engine for easily building fast and scalable web applications.
Google's V8, the JavaScript engine behind Node.js, delivers incredible performance, and there are many reasons why Node.js works well in many use cases, but you are always limited by the heap size. When you need to handle more requests in a Node.js application, you have two choices: scale vertically or scale horizontally. Horizontal scaling means running more concurrent application instances; if you do it well, you will eventually be able to fulfill more requests. Vertical scaling means improving the memory usage and performance of a single instance, or increasing the resources available to that instance.
Node.js Memory Leak Debugging Arsenal
MEMWATCH
If you search for "how to find leaks in node.js", the first tool you might find is memwatch. The original package has long been abandoned and is no longer maintained. However, you can easily find an updated version of it in the GitHub repository's fork list. This module is useful because it can emit a leak event when it sees the heap keep growing over 5 consecutive garbage collections.
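The heuristic memwatch uses can be sketched in plain JavaScript: sample heap usage after each GC cycle and report a leak once it has grown five times in a row. The LeakDetector class below is an illustrative assumption, not memwatch's actual implementation.

```javascript
// Illustrative sketch of memwatch's leak heuristic (not its real code):
// flag a leak when heap usage grows over 5 consecutive GC samples.
class LeakDetector {
    constructor(consecutiveGrowths = 5) {
        this.limit = consecutiveGrowths;
        this.growths = 0;
        this.lastHeapUsed = null;
        this.onLeak = null; // callback, analogous to memwatch's "leak" event
    }

    // Call this after each garbage collection with the current heap usage.
    sample(heapUsed) {
        if (this.lastHeapUsed !== null && heapUsed > this.lastHeapUsed) {
            this.growths += 1;
            if (this.growths >= this.limit && this.onLeak) {
                this.onLeak({ growth: heapUsed - this.lastHeapUsed, heapUsed });
            }
        } else {
            this.growths = 0; // heap shrank or stayed flat, reset the streak
        }
        this.lastHeapUsed = heapUsed;
    }
}
```

Feeding it six strictly increasing samples (five consecutive growths) would trigger the callback, while any dip in between resets the counter.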
HEAPDUMP
A great tool: it allows Node.js developers to take heap snapshots and inspect them later in Chrome Developer Tools.
NODE-INSPECTOR
It is an even more useful alternative to heapdump, because it allows you to connect to a running application, take a heap dump, and even debug and recompile it on the fly.
Taking “node-inspector” for a Spin
Unfortunately, you will not be able to connect to production applications running on Heroku because it does not allow sending signals to running processes. However, Heroku is not the only hosting platform.
In order to experience the actual operation of node-inspector, we will use restify to write a simple Node.js application and place some memory leak sources in it. All experiments here are performed with Node.js v0.12.7, which is compiled against V8 v3.28.71.19.
var restify = require('restify');

var server = restify.createServer();

var tasks = [];

server.pre(function(req, res, next) {
    tasks.push(function() {
        return req.headers;
    });

    // Synchronously get user from session, maybe jwt token
    req.user = {
        id: 1,
        username: 'Leaky Master',
    };

    return next();
});

server.get('/', function(req, res, next) {
    res.send('Hi ' + req.user.username);
    return next();
});

server.listen(3000, function() {
    console.log('%s listening at %s', server.name, server.url);
});
The application here is very simple and has an obvious leak: the tasks array will grow over the application's lifetime, causing it to slow down and eventually crash. The problem is that we are leaking not only the closures, but the entire request objects as well.
GC in V8 uses a stop-the-world strategy, which means that the more objects there are in memory, the longer it takes to collect garbage. In a GC trace you can clearly see that collection takes an average of 20 milliseconds at the beginning of the application's life cycle, but around 230 milliseconds after hundreds of thousands of requests. Because of GC, people trying to access our application must now wait 230 milliseconds. You can also see that GC runs every few seconds, which means that every few seconds users will experience delays accessing the application. The delays will keep growing until the application crashes.
These trace lines are printed when you start a Node.js application with the --trace_gc flag:
node --trace_gc app.js
Let us assume that we have started our Node.js application with this flag. Before connecting node-inspector to the application, we need to send the SIGUSR1 signal to the running process. If you are running Node.js in a cluster, make sure you connect to one of the worker processes.
kill -SIGUSR1 $pid # Replace $pid with the actual process ID
By doing this, we put the Node.js application (V8 to be precise) into debug mode. In this mode, the application will automatically open port 5858 using the V8 debugging protocol.
Our next step is to run node-inspector, which will connect to the debugging interface of the running application and open another web interface on port 8080.
$ node-inspector
Node Inspector v0.12.2
Visit http://127.0.0.1:8080/?ws=127.0.0.1:8080&port=5858 to start debugging.
If the application is running in a production environment behind a firewall, we can forward remote port 8080 to localhost through an SSH tunnel:
ssh -L 8080:localhost:8080 admin@example.com
Now you can open the Chrome web browser and get full access to Chrome Developer Tools attached to the remote production application.
Let’s Find a Leak!
A memory leak in V8 is not a real memory leak as we know it from C/C++ applications. In JavaScript, variables do not disappear into the void, they are just "forgotten". Our goal is to find the variables that developers have forgotten about.
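A typical "forgotten" reference looks like the minimal sketch below (the sessionCache name and handlers are hypothetical): data is stashed in a long-lived structure and never removed, so the GC can never reclaim it even though the application no longer needs it.

```javascript
// A long-lived map that outlives every request (hypothetical example).
var sessionCache = {};

function handleRequest(sessionId, payload) {
    // The payload is "remembered" forever: nothing ever deletes the entry,
    // so every new session pins more memory that the GC cannot reclaim.
    sessionCache[sessionId] = payload;
    return 'ok';
}

// The fix is simply to forget on purpose, e.g. when the session ends:
function endSession(sessionId) {
    delete sessionCache[sessionId];
}
```

The GC is doing its job correctly in both cases; the difference is purely whether our code still holds a reachable reference.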
In Chrome Developer Tools we have access to multiple profilers. We are particularly interested in recording heap allocations, which runs over time and takes multiple heap snapshots along the way. This gives us a clear view of which objects are leaking.
To start recording heap allocation, let's use Apache Benchmark to simulate 50 concurrent users on our homepage.
ab -c 50 -n 1000000 -k http://example.com/
Before taking a new snapshot, V8 performs mark-sweep garbage collection, so we can be sure there is no old garbage in the snapshot.
Fixing the Leak on the Fly
After recording heap allocations for 3 minutes, we get the following results:
We can clearly see that there are some huge arrays in the heap, as well as many IncomingMessage, ReadableState, ServerResponse and Domain objects. Let us try to analyze the source of the leak.
After selecting the interval from 20 to 40 seconds on the chart, we will only see objects that were allocated during that window. This way you can exclude all the normal, long-lived data.
Note how many objects of each type are in the system. Extending the filter from 20 seconds to 1 minute, we can see that the already huge array is still growing. Under "(array)" we can see many equally sized "(object properties)" entries. These objects are the source of our memory leak.
We can also see that "(closure)" objects are also growing rapidly.
It may also be useful to look at the strings. There are many "Hi Leaky Master" phrases in the string list. These may also give us some clues.
In our example, we know that the string "Hi Leaky Master" can only be assembled under the "GET /" route.
If you open the retainer path, you will see that this string is somehow referenced via req, that a context is then created, and that all of this is added to some giant array of closures.
So at this point we know that we have some kind of huge array of closures. Let's give all the closures names on the fly under the "Sources" tab.
After finishing the code editing, we can press CTRL+S to save and recompile the code!
Now let's record another heap allocation snapshot to see which closures are occupying memory.
Obviously SomeKindOfClojure() is our target. Now we can see that the SomeKindOfClojure() closure is added to some arrays called tasks in the global space.
It is easy to see that this array is useless, so we can comment it out. But how do we release the memory it already occupies? Very simple: we just assign an empty array to tasks. The old array will no longer be referenced after the next request, and its memory will be released after the next GC event.
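Applied to our restify example (the tasks array and middleware shape are taken from the sample app above, slightly simplified), the on-the-fly fix can be as small as this:

```javascript
var tasks = [];

// Simulate the leak: each request pushes a closure that retains req.
function leakyMiddleware(req) {
    tasks.push(function() { return req.headers; });
}

leakyMiddleware({ headers: { host: 'example.com' } });

// The fix: stop pushing (comment the push out in the real middleware),
// and release what has already accumulated by replacing the array with
// a fresh empty one. The old array and every request object it retained
// become unreachable and are reclaimed on the next GC cycle.
tasks = [];
```

Note that clearing the array only helps because nothing else holds a reference to the old one; if another module had captured it, the memory would still be pinned.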
The V8 heap is divided into several different spaces:
- new space: This space is relatively small, between 1MB and 8MB in size. Most objects are allocated here.
- old pointer space: Contains objects that may have pointers to other objects. If an object survives in new space long enough, it is promoted to old pointer space.
- old data space: Contains only raw data such as strings, boxed numbers, and arrays of unboxed doubles. Objects that have survived GC in new space long enough are also moved here.
- large object space: Objects that are too large to fit in the other spaces are created here. Each object gets its own mmap'ed region of memory.
- code space: Contains the assembly code generated by the JIT compiler.
- cell space, property cell space, map space: These spaces contain cells, property cells, and maps, respectively. Each contains objects of a single size, which simplifies garbage collection.
Each space is composed of pages. A page is an area of memory allocated from the operating system using mmap. Except for the pages in the large object space, the size of each page is always 1MB.
V8 has two built-in garbage collection mechanisms: Scavenge, and Mark-Sweep & Mark-Compact.
Scavenge is a very fast garbage collection technique that operates on objects in New Space. Scavenge is an implementation of Cheney's algorithm. The idea is simple: New Space is divided into two equal semi-spaces, To-Space and From-Space. A scavenge GC occurs when To-Space is full. It swaps the To and From spaces, then copies all live objects into To-Space, or promotes them to one of the old spaces if they have survived two scavenges; dead objects are simply left behind and discarded. Scavenges are very fast, but they carry the overhead of keeping a double-sized heap and constantly copying objects in memory. The reason for using scavenges is that most objects die young.
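The copying step can be sketched as a toy model. This is purely illustrative and nothing like V8's real implementation: live objects are copied into the other semi-space, survivors are aged, and anything that has survived two scavenges is promoted to old space.

```javascript
// Toy semi-space scavenger: not V8's code, just the idea.
// Each "object" tracks how many scavenges it has survived (age).
function scavenge(fromSpace, isLive, oldSpace) {
    var toSpace = [];
    for (var i = 0; i < fromSpace.length; i++) {
        var obj = fromSpace[i];
        if (!isLive(obj)) continue;   // dead: left behind, effectively freed
        obj.age += 1;
        if (obj.age >= 2) {
            oldSpace.push(obj);       // survived two scavenges: promote
        } else {
            toSpace.push(obj);        // copy into the other semi-space
        }
    }
    return toSpace;                   // toSpace becomes the next fromSpace
}
```

Running this twice over a small "heap" shows the behavior: an object that stays live through two scavenges ends up in oldSpace, while dead objects never get copied at all, which is exactly why scavenging is cheap when most objects die young.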
Mark-Sweep and Mark-Compact make up the other garbage collector used in V8, also called the full garbage collector. It marks all live nodes, then sweeps away all dead nodes and defragments memory.
GC Performance and Debugging Tips
Although GC performance may not be a big issue for every web application, you still want to avoid leaks at all costs. During the marking phase of a full GC, the application is actually paused until garbage collection completes. This means that the more objects there are in the heap, the longer the GC takes, and the longer users have to wait.
ALWAYS GIVE NAMES TO CLOSURES AND FUNCTIONS
When all closures and functions have names, it is much easier to check the stack trace and heap.
db.query('GIVE THEM ALL', function GiveThemAllAName(error, data) {
    ...
})
AVOID LARGE OBJECTS IN HOT FUNCTIONS
Ideally, you want to avoid holding large objects inside hot functions, so that all data fits in New Space. All CPU- and memory-bound operations should be performed in the background. Also avoid deoptimization triggers in hot functions: optimized hot functions use less memory than unoptimized ones.
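As an illustration (the checksum functions here are hypothetical stand-ins for any hot path), a hot function that allocates a large temporary buffer on every call forces constant New-Space churn; hoisting the allocation out lets every call reuse the same memory:

```javascript
// Hot path, called thousands of times per second (hypothetical example).
// Bad: a fresh 64KB buffer is allocated on every call and becomes garbage.
function checksumBad(bytes) {
    var scratch = new Uint8Array(64 * 1024);
    scratch.set(bytes);
    var sum = 0;
    for (var i = 0; i < bytes.length; i++) sum = (sum + scratch[i]) & 0xff;
    return sum;
}

// Better: allocate the scratch buffer once and reuse it on every call,
// so the hot function itself allocates nothing.
var scratch = new Uint8Array(64 * 1024);
function checksumGood(bytes) {
    scratch.set(bytes);
    var sum = 0;
    for (var i = 0; i < bytes.length; i++) sum = (sum + scratch[i]) & 0xff;
    return sum;
}
```

The trade-off is that the shared buffer makes checksumGood non-reentrant, which is fine on Node's single-threaded main loop but worth keeping in mind.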
AVOID POLYMORPHISM FOR IC’S IN HOT FUNCTIONS
Inline caches (ICs) are used to speed up the execution of certain code blocks by caching the results of property accesses like obj.key, or of some simple function calls.
function x(a, b) {
    return a + b;
}

x(1, 2); // monomorphic
x(1, "string"); // polymorphic, level 2
x(3.14, 1); // polymorphic, level 3
When x(a,b) runs for the first time, V8 creates a monomorphic IC. When you call x the second time with a different type, V8 erases the old IC and creates a new polymorphic IC that supports both integer and string operands. When you call x the third time, V8 repeats the same process and creates another, level-3 polymorphic IC.
However, there is a limit. After the IC level reaches 5 (adjustable with the --max_inlining_levels flag), the function becomes megamorphic and is no longer considered optimizable.
Intuitively, monomorphic functions run the fastest and also have a smaller memory footprint.
DON’T ADD LARGE FILES TO MEMORY
This is obvious and well known. If you have large files to process, such as a big CSV file, read and process them line by line in small chunks instead of loading the entire file into memory. Only in rare cases will a single line of CSV be larger than 1MB, which lets it fit in New Space.
DO NOT BLOCK MAIN SERVER THREAD
If you have a popular API that takes time to process, such as an image resizing API, move the work to a separate thread or convert it to a background job. CPU-intensive operations block the main thread, forcing all other clients to wait while their requests keep arriving. Unprocessed request data accumulates in memory, forcing full GCs to take longer to complete.
DO NOT CREATE UNNECESSARY DATA
I have had a strange experience with restify. If you send hundreds of thousands of requests to invalid URLs, the application memory will quickly grow to hundreds of megabytes, until a complete GC starts in a few seconds, at which point everything will return to normal. It turns out that for each invalid URL, restify generates a new error object with a long stack trace. This forces newly created objects to be allocated in the large object space instead of the new space.
Accessing this data during development can be very helpful, but it is clearly not needed in production. So the rule is simple: do not generate data unless you really need it.
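One cheap, general V8-level mitigation (not a restify-specific fix) is to cap stack trace collection in production via Error.stackTraceLimit, which bounds how much data every new Error captures:

```javascript
// V8 collects up to Error.stackTraceLimit frames for each new Error
// (the default is 10). Capping it in production makes error creation
// cheaper and the resulting objects smaller.
if (process.env.NODE_ENV === 'production') {
    Error.stackTraceLimit = 5; // illustrative value
}

function makeError(message) {
    return new Error(message);
}
```

Setting the limit to 0 disables stack collection entirely, which is the extreme version of the same trade-off: errors become very cheap but much harder to diagnose.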
Summary
Understanding how V8's garbage collection and code optimizer work is key to improving application performance. V8 compiles JavaScript to native machine code, and in some cases well-written code can achieve performance comparable to applications compiled with GCC.
More original articles by Jerry can be found at "Wang Zixi".