WeChat public account: [Front end cook in a pot]
A little technique, a little thinking.
For questions or suggestions, please leave a message on the official account.

The Node.js main thread is single-threaded. If we use the node app.js method to run, we start a process, which can only perform operations on one CPU, and cannot use the server's multi-core CPU. To solve this problem, we can use a multi-process distribution strategy, that is, the main process receives all requests, and then distributes them to different Node.js child processes through a certain load balancing strategy.

There are 2 different schemes for this realization:

  • The main process listens to a port, the child process does not listen to the port, the main process distributes requests to the child process through load balancing technology;
  • The main process and the child process monitor different ports respectively, and distribute the request to the child process through the main process.

The first solution is that multiple Node processes monitor the same port. The advantage is that the communication between processes is relatively simple and the resource waste of the port is reduced. However, at this time, the stability of the service process must be ensured, especially for the stability of the Master process. Higher, the coding will be complicated.

The problem with the second solution is that it occupies multiple ports and causes waste of resources. Since multiple instances run independently, inter-process communication is not easy to do. The advantage is that it has high stability and no impact between instances.

The Cluster module that comes with Node.js uses the first solution.

cluster mode

The cluster mode is actually a main process and multiple child processes, thus forming the concept of a cluster. Let's take a look at an application example of the cluster mode first.

First implement a simple app.js, the code is as follows:

const http = require('http');
const cluster = require('cluster');
const instances = 2; // 启动进程数量

if (cluster.isMaster) {
  for (let i = 0; i < instances; i++) { // 使用 cluster.fork 创建子进程
    cluster.fork();
  }
} else {
  // 创建 http 服务,简单返回
  const server = http.createServer((req, res) => {
    res.write(`hello world, start with cluster ${process.pid}`);
    res.end();
  });

  // 启动服务
  server.listen(8000, () => {
    console.log('server start http://127.0.0.1:8000');
  });
  console.log(`Worker ${process.pid} started`);
}

First determine whether it is the main process:

  • If so, use cluster.fork to create a child process;
  • If not, the specific app.js is required for the child process.

Then run the following command to start the service.

node cluster.js

After the startup is successful, open another command line window and run the following commands several times:

curl "http://127.0.0.1:8000/"

You can see the following output:

hello world, start with cluster 8553
hello world, start with cluster 8552
hello world, start with cluster 8553
hello world, start with cluster 8552

The latter process ID is a relatively regular random number, sometimes output 8553, sometimes output 8552, 8553 and 8552 are the two child processes of the fork above. Let's see why this is the case.

principle

First of all, we need to clarify two issues:

  • How does Node.js cluster monitor one port by multiple processes;
  • How does Node.js perform load balancing request distribution.

Main process judgment

In the cluster mode, there is the concept of master and worker. The master is the main process and the worker is the child process.

The logic to determine whether the main process or the child process is in the source code is as follows:

'use strict';
const childOrPrimary = 'NODE_UNIQUE_ID' in process.env ? 'child' : 'primary';
module.exports = require(`internal/cluster/${childOrPrimary}`);

Judge by the process environment variable setting:

  • If it is not set, it is the master process;
  • If there is a setting, it is a child process.

Therefore, the first call to the cluster module is the master process, and all subsequent processes are child processes.

Multi-process port problem

Run the above app.js, and successfully started 1 Master process and 2 Worker processes.

Because there is only one port 8000, we need to see which processes are listening on it.

lsof -i:8000
node 8551 qianduanyiguozhu 23u IPv6 0xb5e3cbb6deb4d65d 0t0 TCP *:irdmi (LISTEN)

Now we know that port 8000 is not monitored by all processes, only by the Master process. Let us look at another message below.

ps -ef | grep 8551
501  8552  8551   0  5:53下午 ??  0:00.10
501  8553  8551   0  5:53下午 ??  0:00.10

This clearly shows the relationship between Worker and Master. Master is created by the cluster.fork() method. How to realize port sharing between processes?

In the previous example, the servers created in multiple workers listened on the same port 8000. Generally speaking, if multiple processes listen on the same port, the system will report an error. Why is our example okay?

The secret is that the server.listen() method is treated specially in the net module. According to whether the current process is a master process or a worker process:

  • Master process: Normally listen for requests on this port. (No special treatment)
  • Worker process: Create server instance. Then through the send method, send a message to the master process, let the master process also create a server instance, and listen for requests on this port. When a request comes in, the master process forwards the request to the server instance of the worker process.

The summary is: the port will only be bound and monitored once by the main process. The master process monitors a specific port and forwards client requests to each worker process through load balancing technology.

Principles of Load Balancing

The Node.js cluster module uses the main child process method, so how does it perform load balancing processing? There are mainly two modules involved here.

  • round_robin_handle.js (non-Windows platform application mode), this is a round-robin processing mode, that is, round-robin scheduling is distributed to idle child processes, and after processing is completed, it returns to the worker’s free pool. What should be noted here is if it has been bound The child process will be reused. If not, it will be re-judged. You can test it with the above app.js code. Use a browser to access it. You will find that the child process ID you call will not change every time you call it.
  • shared_handle.js (Windows platform application mode), by passing the file descriptor, port and other information to the child process, the child process creates the corresponding SocketHandle / ServerHandle through the information, and then performs the corresponding port binding and monitoring, and processing the request.

前端一锅煮
852 声望31 粉丝

积极阳光前端一枚,爱学习,爱分享,全栈进行中~


« 上一篇
Koa 洋葱模型