
1. Overview

As we all know, Redis is a high-performance data store and a key component in the design of high-concurrency systems: it is one of the best tools we have for improving system performance. Understanding the principles behind Redis's high performance in depth is becoming more and more important. Of course, Redis's high-performance design is a systematic effort that covers a lot of ground; this article focuses on the Redis IO model and the threading model built on top of it.

We start from the origins of IO, covering blocking IO, non-blocking IO, and multiplexed IO. Building on multiplexed IO, we walk through several different Reactor models and analyze their advantages and disadvantages. On top of the Reactor model, we then analyze Redis's IO model and threading model, summarize the strengths and weaknesses of the Redis threading model, and finish with the newer Redis multi-threaded design. The focus of this article is to trace the design ideas behind the Redis threading model; once the design ideas are clear, the rest falls into place.

Note: The code in this article is pseudo-code, mainly for illustration, and cannot be used in a production environment.

2. History of network IO model development

The network IO models we usually talk about are blocking IO, non-blocking IO, multiplexed IO, signal-driven IO, and asynchronous IO. This article focuses on Redis-related content, so we concentrate on blocking IO, non-blocking IO, and multiplexed IO, which will help you understand the Redis network model later.


2.1 Blocking IO

The blocking IO we usually talk about actually comes in two flavors: single-threaded blocking and multi-threaded blocking. Two concepts are involved here: blocking and threads.

blocking : the current thread is suspended until the call returns; the calling thread resumes only once the result is available;

threads : the number of threads making system calls.

Establishing a connection, reading, and writing all involve system calls, which are themselves blocking operations.

2.1.1 Single thread blocking

The server uses a single thread to process. When a client request comes, the server uses the main thread to handle operations such as connection, reading, and writing.

The following code simulates the single-threaded blocking mode;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;
 
public class BioTest {
 
    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(8081);
        while (true) {
            // accept() blocks until a client connects
            Socket socket = server.accept();
            System.out.println("accept port:" + socket.getPort());
            BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            String inData = null;
            try {
                // readLine() blocks until data arrives, so the main thread
                // cannot accept new connections while this loop runs
                while ((inData = in.readLine()) != null) {
                    System.out.println("client port:" + socket.getPort());
                    System.out.println("input data:" + inData);
                    if ("close".equals(inData)) {
                        break;
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                try {
                    socket.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

We use two clients to initiate connection requests at the same time to observe the single-threaded blocking behavior. Both connections are initiated simultaneously, but the server log shows that only one of them is accepted: the main thread is blocked in the read call of the first connection.

We try to close the first connection and look at the second connection. What we want to see is that the main thread returns and the new client connection is accepted.

The log shows that after the first connection is closed, the second connection's request is processed. In other words, the second connection request waits in the backlog until the main thread returns and accepts the next request, which matches our expectation.

We can't help but ask: why?

The main reason is that accept, read, and write are all blocking calls. While the main thread is inside one of these system calls, it is blocked and cannot respond to other client connections.

From the above process, the defect is easy to see: the server can only handle one connection request at a time, the CPU is underutilized, and performance is low. How can we make full use of multi-core CPUs? The natural answer is multi-threading.

2.1.2 Multi-thread blocking

For engineers, code explains everything, so let's go straight to the code.

BIO multi-threaded

package net.io.bio;
 
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;
 
public class BioTest {
 
    public static void main(String[] args) throws IOException {
        final ServerSocket server = new ServerSocket(8081);
        while (true) {
            // the main thread only accepts; each connection gets its own thread
            final Socket socket = server.accept();
            System.out.println("accept port:" + socket.getPort());
            new Thread(new Runnable() {
                public void run() {
                    try {
                        BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                        String inData = null;
                        while ((inData = in.readLine()) != null) {
                            System.out.println("client port:" + socket.getPort());
                            System.out.println("input data:" + inData);
                            if ("close".equals(inData)) {
                                break;
                            }
                        }
                    } catch (IOException e) {
                        e.printStackTrace();
                    } finally {
                        try {
                            socket.close();
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    }
                }
            }).start();
        }
    }
 
}

Similarly, we initiate two requests in parallel;

Both requests are accepted, and the server adds two threads to handle the client's connection and subsequent requests.

Multi-threading solves the problem that the server could only handle one request at a time, but it introduces a new one: if there are many client connections, the server creates a large number of threads. Threads are themselves expensive resources, and both their creation and the context switching between them consume significant resources. How do we solve this?

2.2 Non-blocking IO

What if we put all Sockets (file handles; from here on we use Socket in place of fd to keep the number of concepts, and the reading burden, to a minimum) into a queue and use just one thread to poll the state of every Socket, taking one out only when it is ready? Wouldn't that reduce the number of threads on the server?

Let's look at the code. A purely non-blocking mode is rarely used in practice; to demonstrate the logic, we simulate it as follows;

package net.io.bio;
 
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
 
 
public class NioTest {
 
    public static void main(String[] args) throws IOException {
        final ServerSocket server = new ServerSocket(8082);
        server.setSoTimeout(1000);
        List<Socket> sockets = new ArrayList<Socket>();
        while (true) {
            Socket socket = null;
            try {
                socket = server.accept();
                socket.setSoTimeout(500);
                sockets.add(socket);
                System.out.println("accept client port:" + socket.getPort());
            } catch (SocketTimeoutException e) {
                System.out.println("accept timeout");
            }
            // Simulate non-blocking IO: poll every connected socket. Each read
            // waits at most 500 ms; if data is ready it is processed, otherwise
            // we move on to the next socket and keep polling.
            Iterator<Socket> it = sockets.iterator();
            while (it.hasNext()) {
                Socket socketTemp = it.next();
                try {
                    BufferedReader in = new BufferedReader(new InputStreamReader(socketTemp.getInputStream()));
                    String inData = null;
                    while ((inData = in.readLine()) != null) {
                        System.out.println("input data client port:" + socketTemp.getPort() + " data:" + inData);
                        if ("close".equals(inData)) {
                            socketTemp.close();
                            it.remove(); // drop closed sockets from the polling list
                            break;
                        }
                    }
                } catch (SocketTimeoutException e) {
                    System.out.println("input client loop " + socketTemp.getPort());
                } catch (IOException e) {
                    it.remove(); // socket broken; stop polling it
                }
            }
        }
 
    }
}

System initialization, waiting for connection;

Two client connections are initiated, and the thread starts to poll whether there is data in the two connections.

After the two connections input data separately, the polling thread finds that there is data ready, and starts the related logic processing (single-threaded or multi-threaded).

Let's use a flowchart to help explain (the system actually uses file handles, and Socket is used instead to facilitate everyone's understanding).

The server has a dedicated thread that polls all Sockets, asking the operating system whether the relevant events have completed. If one has, it is handed off for processing; if not, polling continues. Now let's think together: what is the problem here?

CPU spinning and repeated system calls (every poll involves a system call into the kernel to check whether data is ready) waste resources. Is there a mechanism that solves this problem?

2.3 IO multiplexing

What if the server no longer uses a dedicated thread to poll (on the application side, not in the kernel), and is instead driven by events: when a read, write, or connection event occurs, the kernel notifies the server thread, which then runs the corresponding logic? The simulated code is as follows;

IO multiplexing

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.Charset;
import java.util.Iterator;
import java.util.Set;
 
public class NioServer {
 
    private static Charset charset = Charset.forName("UTF-8");
    public static void main(String[] args) {
        try {
            Selector selector = Selector.open();
            ServerSocketChannel channel = ServerSocketChannel.open();
            channel.bind(new InetSocketAddress(8083));
            channel.configureBlocking(false);
            channel.register(selector, SelectionKey.OP_ACCEPT);
 
            while (true){
                // blocks until the kernel reports at least one ready event
                int select = selector.select();
                if(select == 0){
                    System.out.println("select loop");
                    continue;
                }
                System.out.println("os data ok");
                 
                Set<SelectionKey> selectionKeys = selector.selectedKeys();
                Iterator<SelectionKey> iterator = selectionKeys.iterator();
                while (iterator.hasNext()){
                    SelectionKey selectionKey = iterator.next();
                     
                    if(selectionKey.isAcceptable()){
                        ServerSocketChannel server = (ServerSocketChannel)selectionKey.channel();
                        SocketChannel client = server.accept();
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                        // keep accepting new connection events
                        selectionKey.interestOps(SelectionKey.OP_ACCEPT);
                    }else if(selectionKey.isReadable()){
                        // get the client SocketChannel
                        SocketChannel client = (SocketChannel)selectionKey.channel();
                        // allocate a read buffer
                        ByteBuffer buffer = ByteBuffer.allocate(1024);
                        StringBuilder content = new StringBuilder();
                        int read;
                        while ((read = client.read(buffer)) > 0){
                            buffer.flip();
                            content.append(charset.decode(buffer));
                            // clear the buffer for the next read
                            buffer.clear();
                        }
                        if (read == -1) {
                            // the client closed the connection; deregister it
                            selectionKey.cancel();
                            client.close();
                        } else {
                            System.out.println("client port:"+client.getRemoteAddress().toString()+",input data: "+content.toString());
                        }
                    }
                    iterator.remove();
                }
            }
 
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Create two connections at the same time;

Two connections are created without blocking;

Non-blocking receiving and reading;

Let's use a flowchart to help explain (the system actually uses file handles, and Socket is used instead to facilitate everyone's understanding).

Of course, the operating system offers several multiplexing implementations: the commonly used select() and poll(), and the more scalable epoll, which we won't explain in depth here; interested readers can consult the relevant documents. Beyond multiplexing, IO has continued to evolve toward asynchronous and event-driven models, which we also won't repeat here; our goal is to explain the evolution of the Redis threading model.

3. NIO thread models

We have now talked about the blocking, non-blocking, and IO multiplexing modes. Which one does Redis use?

Redis uses IO multiplexing, so let's focus on understanding the multiplexing mode and how to implement it well in our own systems. That inevitably brings us to the Reactor pattern.

First, let's explain related terms;

Reactor : similar to the Selector in NIO programming, responsible for dispatching I/O events;

Acceptor : the branch of logic that handles connection events once the Reactor receives them;

Handler : the operation classes that handle message reading, writing, and processing.

3.1 Single Reactor single thread model

process flow

  • Reactor monitors connection events and Socket events. When a connection event comes over, it is handed over to the Acceptor for processing, and when a Socket event comes over, it is handed over to a corresponding Handler for processing.
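The flow above can be sketched in plain Java. This is an in-memory simulation only, not real networking; the event types and handler names are our own, chosen for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class SingleReactor {
    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    public SingleReactor() {
        // Acceptor: handles connection events
        handlers.put("ACCEPT", conn -> "acceptor handled " + conn);
        // Handler: handles Socket read/write events
        handlers.put("READ", data -> "handler read " + data);
    }

    // The single Reactor thread demultiplexes events and dispatches by type.
    // Everything, including the handler logic, runs on this one thread.
    public String dispatch(String eventType, String payload) {
        return handlers.getOrDefault(eventType, p -> "ignored").apply(payload);
    }
}
```

Because `dispatch` runs the handler inline, a slow handler stalls every other event, which is exactly the disadvantage discussed next.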

Advantages

  • The model is relatively simple; all processing happens in a single thread;
  • It is easy to implement, and the modules are well decoupled: the Reactor handles multiplexing and event dispatch, the Acceptor handles connection events, and the Handler handles Socket read and write events.

Disadvantages

  • There is only one thread; connection handling and business processing share it, so the multi-core advantage of the CPU cannot be exploited.
  • The system performs well when traffic is moderate and business processing is fast. When traffic is heavy or read/write events are time-consuming, the single thread easily becomes a performance bottleneck.

How do we solve these problems? Since the business processing logic may become the system bottleneck, can we pull it out and hand it to a thread pool? On one hand this reduces the load on the main thread, on the other it takes advantage of multi-core CPUs. It is worth understanding this idea thoroughly, because it is exactly how we will understand Redis's evolution from a single-threaded to a multi-threaded model.

3.2 Single Reactor multi-threaded model

Compared with the single-Reactor single-threaded model, this model hands the business logic off to a thread pool for processing.

processing flow

  • Reactor monitors connection events and Socket events. When a connection event comes over, it is handed over to the Acceptor for processing, and when a Socket event comes over, it is handed over to a corresponding Handler for processing.
  • After the Handler finishes the read event, it packages the request into a task object and submits it to the thread pool, handing the business processing logic to a worker thread.
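The hand-off in the second step can be sketched like this. This is a minimal sketch; the request/response types, the `process` stand-in, and the pool size are illustrative assumptions, not a real protocol:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReadHandler {
    // Business logic runs in a worker pool, not in the Reactor thread.
    private final ExecutorService workerPool = Executors.newFixedThreadPool(4);

    // Called by the Reactor thread once a full request has been read.
    // The Reactor thread returns immediately and can go back to select().
    public CompletableFuture<String> onRead(String request) {
        return CompletableFuture.supplyAsync(() -> process(request), workerPool);
    }

    // Stand-in for the real business logic.
    private String process(String request) {
        return "processed:" + request;
    }

    public void shutdown() {
        workerPool.shutdown();
    }
}
```

The key design point: the Reactor thread only reads and enqueues; the slow work happens on pool threads, so one slow request no longer blocks all connections.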

Advantages

  • Let the main thread focus on the processing of common events (connection, reading, writing), and further decoupling from the design;
  • Take advantage of the multi-core CPU.

Disadvantages

  • This model seems perfect, but think again: if there are many clients and traffic is very high, the handling of the basic events (reads and writes) can itself become the main thread's bottleneck, because every read and write involves a system call.

Is there a good way to solve this? From the analysis above you may have noticed a pattern: when some step becomes the system bottleneck, pull it out and hand it to another thread. Does that apply here?

3.3 Multi-Reactor multi-threaded model

Compared with the single-Reactor multi-threaded model, this model moves the Socket read/write handling out of the mainReactor and hands it to subReactor threads.

processing flow

  • The mainReactor main thread is responsible for the monitoring and processing of connection events. When the Acceptor processes the connection process, the main thread assigns the connection to the subReactor;
  • The subReactor is responsible for the monitoring and processing of the Socket allocated by the mainReactor, and when there is a Socket event, it will be handed over to a corresponding Handler for processing;

After the Handler finishes the read event, it packages the request into a task object and submits it to the thread pool, handing the business processing logic to a worker thread.
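The assignment of new connections from the mainReactor to the subReactors is typically round-robin; a minimal sketch follows (the subReactor count and the integer "slot" return value are illustrative assumptions):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class MainReactor {
    private final int subReactorCount;
    private final AtomicInteger next = new AtomicInteger(0);

    public MainReactor(int subReactorCount) {
        this.subReactorCount = subReactorCount;
    }

    // After the Acceptor finishes the handshake, the main thread picks a
    // subReactor for the new connection in round-robin order. That
    // subReactor's event loop then owns all reads and writes on it.
    public int assign() {
        return next.getAndIncrement() % subReactorCount;
    }
}
```

In a real implementation each slot would be a thread running its own Selector; here we only show the distribution policy that spreads connections across cores.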

Advantages

  • Let the main thread focus on the processing of connection events, and the sub-threads focus on reading and writing events, further decoupling from the design;
  • Take advantage of the multi-core CPU.

Disadvantages

  • The implementation is more complex; consider it for scenarios that pursue extreme single-machine performance.

4. Redis's threading model

4.1 Overview

Above we talked about the development history of the IO network model, and also talked about the reactor mode of IO multiplexing. Which reactor mode does Redis use? Before answering this question, let's sort out a few conceptual questions.

There are two types of events in the Redis server, file events and time events.

file event : Here you can understand the file as Socket-related events, such as connection, reading, writing, etc.;

time event : can be understood as a scheduled-task event, such as periodic RDB persistence operations.

This article focuses on Socket-related events.

4.2 Model diagram

First, let's look at the thread model diagram of the Redis service;

IO multiplexing is responsible for monitoring each event (connect, read, write, etc.). When an event occurs, it is placed in the event queue, and the event dispatcher distributes it according to its type;

If it is a connection event, it is dispatched to the connection response processor; Redis commands such as GET and SET are dispatched to the command request processor.

After a command is executed, a command reply event is generated; it goes through the event queue to the event dispatcher and on to the command reply processor, which writes the response back to the client.
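The dispatch described above can be sketched as a simple routing table. The event and processor names follow the text; this is an illustration of the routing idea, not Redis source code:

```java
public class FileEventDispatcher {
    // Routes a file event to the processor described in the model diagram:
    // connection events, command requests, and command replies each have
    // their own dedicated processor.
    public static String dispatch(String eventType) {
        switch (eventType) {
            case "CONNECT":
                return "connection response processor";
            case "COMMAND_REQUEST":
                return "command request processor";
            case "COMMAND_REPLY":
                return "command reply processor";
            default:
                return "unknown";
        }
    }
}
```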

4.3 A process of interaction between client and server

4.3.1 Connection process

connection process

  • The main thread of the Redis server monitors the fixed port and binds the connection event to the connection response processor.
  • After the client initiates a connection, the connection event fires; the IO multiplexing program packages the connection event, puts it into the event queue, and the event dispatcher distributes it to the connection response processor.
  • The connection response processor creates a client object and a Socket object. We focus on the Socket object: an ae_readable event is generated and associated with the command request processor, marking that this Socket is interested in readable events from now on, i.e. it starts accepting command operations from the client.
  • This entire process is handled by a single main thread.

4.3.2 Command execution process

SET command execution process

  • The client initiates a SET command. After the IO multiplexing program listens to the event (read event), it packs the data into an event and throws it into the event queue (the event is bound to the command request processor in the previous process);
  • The event distribution processor distributes the event to the corresponding command request processor according to the event type;
  • The command request processor reads the data in the Socket, executes the command, then generates the ae_writable event, and binds the command reply processor;
  • After the IO multiplexing program listens to the write event, it packs the data into an event and throws it into the event queue, and the event distribution processor distributes it to the command reply processor according to the event type;
  • The command reply processor writes the data into the Socket and returns it to the client.

4.4 Advantages and disadvantages of the model

From the above process analysis, we can see that Redis adopts the single-threaded Reactor model. We also analyzed the advantages and disadvantages of this model. Why does Redis adopt this model?

Redis itself

Command execution operates on memory, and the business processing logic is correspondingly fast, so a single command-processing thread can sustain high performance.

Advantages

  • For the advantages of the Reactor single-threaded model, refer to the above.

Disadvantages

  • The shortcomings of the single-threaded Reactor model also show up in Redis; the only difference is that business logic processing (command execution) is not the system bottleneck.
  • As traffic grows, the time spent on IO becomes more and more visible (reads copy data from the kernel to the application; writes copy data from the application to the kernel). Past a certain threshold, IO becomes the system bottleneck.

How does Redis solve it?

You guessed it: pull the time-consuming IO out of the main thread. Is that what the new version of Redis does? Let's take a look together.

4.5 Redis multi-threaded mode

Redis's multi-threaded model differs from both the "multi-Reactor multi-threaded model" and the "single-Reactor multi-threaded model", yet it borrows ideas from both, as follows;

Redis makes the IO operations multi-threaded while the logic processing (command execution) remains single-threaded; this borrows the single-Reactor idea, although the implementation differs.

Making the IO multi-threaded follows the same idea as deriving multiple Reactors from a single Reactor: take the IO operations out of the main thread.

command execution general flow

  • The client sends a request command, triggering a read-ready event. The server's main thread puts the Socket (to simplify, we use Socket uniformly to represent the connection) into a queue; the main thread itself does not read;
  • The IO threads read the client's request commands from the Sockets; the main thread busy-polls, waiting for all IO threads to finish their read tasks. IO threads only read and parse, they do not execute commands;
  • The main thread executes all the commands in one batch, exactly as in the single-threaded model, then puts the connections that need replies into another queue; the IO threads are responsible for writing out (the main thread also writes);
  • The main thread busy-polls until all IO threads have finished their write tasks.
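The four steps can be sketched in Java. This is an in-memory simulation under stated assumptions: "reading" is simulated by trimming the raw request string, command execution by prefixing a reply, and the thread count and names are illustrative, not Redis's actual values:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MultiThreadedIo {
    public static List<String> handle(List<String> rawRequests) {
        ExecutorService ioPool = Executors.newFixedThreadPool(4);
        try {
            // Step 2: IO threads read/parse the requests in parallel;
            // no command execution happens here.
            List<CompletableFuture<String>> parsed = new ArrayList<>();
            for (String raw : rawRequests) {
                parsed.add(CompletableFuture.supplyAsync(() -> raw.trim(), ioPool));
            }
            // Steps 3-4: the main thread waits for all reads to complete, then
            // executes every command serially, exactly as in the single-threaded
            // model, so no locking is needed around the data store.
            List<String> replies = new ArrayList<>();
            for (CompletableFuture<String> f : parsed) {
                replies.add("OK:" + f.join());
            }
            return replies;
        } finally {
            ioPool.shutdown();
        }
    }
}
```

The point of the design is visible in the shape of the code: parallelism is confined to the IO boundary, while the ordering and atomicity guarantees of single-threaded command execution are preserved.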

5. Summary

To understand a component is, above all, to understand its design ideas: why it is designed this way, what the background of its technical choices was, and what lessons it offers for future system architecture design. Understand one thing thoroughly and many others become clear. I hope this is useful to everyone.

Author: vivo internet server team-Wang Shaodong
