The previous article analyzed the initialization process of Netty server startup. Today we will analyze the Reactor thread model in Netty.
Before diving into the source code, let's first ask: where is the EventLoop actually used?
- Registering the NioServerSocketChannel to listen for connections
- Registering I/O events for the NioSocketChannel
NioServerSocketChannel connection monitoring
In the initAndRegister() method of the AbstractBootstrap class, after the NioServerSocketChannel is created and initialized, register() is called to register the channel with the boss EventLoop's Selector.
final ChannelFuture initAndRegister() {
Channel channel = null;
try {
channel = channelFactory.newChannel();
init(channel);
} catch (Throwable t) {
}
//register the channel with the selector of the boss thread
ChannelFuture regFuture = config().group().register(channel);
if (regFuture.cause() != null) {
if (channel.isRegistered()) {
channel.close();
} else {
channel.unsafe().closeForcibly();
}
}
return regFuture;
}
AbstractNioChannel.doRegister
Following the call chain, the registration is eventually performed in the doRegister() method of AbstractNioChannel.
@Override
protected void doRegister() throws Exception {
boolean selected = false;
for (;;) {
try {
//call ServerSocketChannel's register method to register this server channel with the boss thread's selector (interest ops are 0 for now, with the channel itself as the attachment)
selectionKey = javaChannel().register(eventLoop().unwrappedSelector(), 0, this);
return;
} catch (CancelledKeyException e) {
if (!selected) {
// Force the Selector to select now as the "canceled" SelectionKey may still be
// cached and not removed because no Select.select(..) operation was called yet.
eventLoop().selectNow();
selected = true;
} else {
// We forced a select operation on the selector before but the SelectionKey is still cached
// for whatever reason. JDK bug ?
throw e;
}
}
}
}
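Note that the channel is registered with interest ops 0; the actual OP_ACCEPT / OP_READ bit is only added later, when the channel starts reading. For context, the snippet below is an abridged sketch of AbstractNioChannel.doBeginRead (not analyzed further in this article):
protected void doBeginRead() throws Exception {
// Channel.read() or ChannelHandlerContext.read() was called
final SelectionKey selectionKey = this.selectionKey;
if (!selectionKey.isValid()) {
return;
}
readPending = true;
final int interestOps = selectionKey.interestOps();
if ((interestOps & readInterestOp) == 0) {
//add OP_ACCEPT (server channel) or OP_READ (client channel) to the interest set
selectionKey.interestOps(interestOps | readInterestOp);
}
}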
The startup process of NioEventLoop
NioEventLoop wraps a single thread; its startup process is as follows.
In the doBind0 method of AbstractBootstrap, the NioEventLoop bound to the NioServerSocketChannel is obtained and used to execute the port-binding task.
private static void doBind0(
final ChannelFuture regFuture, final Channel channel,
final SocketAddress localAddress, final ChannelPromise promise) {
//submitting this task starts the EventLoop thread if it is not already running
channel.eventLoop().execute(new Runnable() {
@Override
public void run() {
if (regFuture.isSuccess()) {
channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
} else {
promise.setFailure(regFuture.cause());
}
}
});
}
SingleThreadEventExecutor.execute
The call eventually reaches SingleThreadEventExecutor.execute, which invokes the startThread() method to start the thread.
private void execute(Runnable task, boolean immediate) {
boolean inEventLoop = inEventLoop();
addTask(task);
if (!inEventLoop) {
startThread(); //start the thread
if (isShutdown()) {
boolean reject = false;
try {
if (removeTask(task)) {
reject = true;
}
} catch (UnsupportedOperationException e) {
// The task queue does not support removal so the best thing we can do is to just move on and
// hope we will be able to pick-up the task before its completely terminated.
// In worst case we will log on termination.
}
if (reject) {
reject();
}
}
}
if (!addTaskWakesUp && immediate) {
wakeup(inEventLoop);
}
}
startThread
private void startThread() {
if (state == ST_NOT_STARTED) {
if (STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
boolean success = false;
try {
doStartThread(); //perform the actual startup
success = true;
} finally {
if (!success) {
STATE_UPDATER.compareAndSet(this, ST_STARTED, ST_NOT_STARTED);
}
}
}
}
}
doStartThread() then submits a task through executor.execute(), inside which the NioEventLoop thread is started.
private void doStartThread() {
assert thread == null;
executor.execute(new Runnable() { //submit a task to the executor; this is where the thread is created
@Override
public void run() {
thread = Thread.currentThread();
if (interrupted) {
thread.interrupt();
}
boolean success = false;
updateLastExecutionTime();
try {
SingleThreadEventExecutor.this.run(); //call the NioEventLoop's run method and start polling
}
//... omitted ...
}
});
}
NioEventLoop's polling process
When the NioEventLoop thread is started, it directly enters the run method of NioEventLoop.
protected void run() {
int selectCnt = 0;
for (;;) {
try {
int strategy;
try {
strategy = selectStrategy.calculateStrategy(selectNowSupplier, hasTasks());
switch (strategy) {
case SelectStrategy.CONTINUE:
continue;
case SelectStrategy.BUSY_WAIT:
case SelectStrategy.SELECT:
long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
if (curDeadlineNanos == -1L) {
curDeadlineNanos = NONE; // nothing on the calendar
}
nextWakeupNanos.set(curDeadlineNanos);
try {
if (!hasTasks()) {
strategy = select(curDeadlineNanos);
}
} finally {
// This update is just to help block unnecessary selector wakeups
// so use of lazySet is ok (no race condition)
nextWakeupNanos.lazySet(AWAKE);
}
// fall through
default:
}
} catch (IOException e) {
// If we receive an IOException here its because the Selector is messed up. Let's rebuild
// the selector and retry. https://github.com/netty/netty/issues/8566
rebuildSelector0();
selectCnt = 0;
handleLoopException(e);
continue;
}
selectCnt++;
cancelledKeys = 0;
needsToSelectAgain = false;
final int ioRatio = this.ioRatio;
boolean ranTasks;
if (ioRatio == 100) {
try {
if (strategy > 0) {
processSelectedKeys();
}
} finally {
// Ensure we always run tasks.
ranTasks = runAllTasks();
}
} else if (strategy > 0) {
final long ioStartTime = System.nanoTime();
try {
processSelectedKeys();
} finally {
// Ensure we always run tasks.
final long ioTime = System.nanoTime() - ioStartTime;
ranTasks = runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
}
} else {
ranTasks = runAllTasks(0); // This will run the minimum number of tasks
}
if (ranTasks || strategy > 0) {
if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS && logger.isDebugEnabled()) {
logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
selectCnt - 1, selector);
}
selectCnt = 0;
} else if (unexpectedSelectorWakeup(selectCnt)) { // Unexpected wakeup (unusual case)
selectCnt = 0;
}
} catch (CancelledKeyException e) {
// Harmless exception - log anyway
if (logger.isDebugEnabled()) {
logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector {} - JDK bug?",
selector, e);
}
} catch (Error e) {
throw (Error) e;
} catch (Throwable t) {
handleLoopException(t);
} finally {
// Always handle shutdown even if the loop processing threw an exception.
try {
if (isShuttingDown()) {
closeAll();
if (confirmShutdown()) {
return;
}
}
} catch (Error e) {
throw (Error) e;
} catch (Throwable t) {
handleLoopException(t);
}
}
}
}
The execution process of NioEventLoop
The run method in NioEventLoop is an infinite loop, in which three things are mainly done, as shown in Figure 9-1.
<center>Figure 9-1</center>
- Poll for I/O ready events (select): poll the Selector for I/O events that are ready on any of the registered Channels
- Process I/O events: if a Channel has ready I/O events, call processSelectedKeys to handle them
- Run asynchronous tasks (runAllTasks): the Reactor thread is also responsible for the non-I/O tasks in the task queue. Netty provides the ioRatio parameter to adjust the ratio of I/O processing time to task processing time (see the configuration sketch below).
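For reference, ioRatio is configured on the NioEventLoopGroup. A minimal sketch, with the value 70 chosen purely as an example:
NioEventLoopGroup workerGroup = new NioEventLoopGroup();
//reserve roughly 70% of each loop iteration for I/O and the rest for queued tasks
//(the default is 50; with 100 the task queue is simply drained after the I/O work)
workerGroup.setIoRatio(70);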
Polling for I/O ready events
Let's first look at the code related to polling for I/O events:
- The current execution strategy is obtained through selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())
- The strategy then controls what each iteration of the polling loop does.
protected void run() {
int selectCnt = 0;
for (;;) {
try {
int strategy;
try {
strategy = selectStrategy.calculateStrategy(selectNowSupplier, hasTasks());
switch (strategy) {
case SelectStrategy.CONTINUE:
continue;
case SelectStrategy.BUSY_WAIT:
// fall-through to SELECT since the busy-wait is not supported with NIO
case SelectStrategy.SELECT:
long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
if (curDeadlineNanos == -1L) {
curDeadlineNanos = NONE; // nothing on the calendar
}
nextWakeupNanos.set(curDeadlineNanos);
try {
if (!hasTasks()) {
strategy = select(curDeadlineNanos);
}
} finally {
// This update is just to help block unnecessary selector wakeups
// so use of lazySet is ok (no race condition)
nextWakeupNanos.lazySet(AWAKE);
}
// fall through
default:
}
}
//... omitted ...
}
}
}
selectStrategy processing logic
@Override
public int calculateStrategy(IntSupplier selectSupplier, boolean hasTasks) throws Exception {
return hasTasks ? selectSupplier.get() : SelectStrategy.SELECT;
}
If hasTasks is true, meaning the current NioEventLoop has pending asynchronous tasks, selectSupplier.get() is called; otherwise SELECT is returned directly.
selectSupplier.get() is defined as follows:
private final IntSupplier selectNowSupplier = new IntSupplier() {
@Override
public int get() throws Exception {
return selectNow();
}
};
This calls the selectNow() method, a non-blocking method of the Selector that returns immediately:
- If there are ready channels, it returns the number of ready channels
- Otherwise, it returns 0.
Branch processing
After obtaining the strategy in the previous step, branch processing will be performed according to different results.
- CONTINUE: retry the loop immediately.
- BUSY_WAIT: busy-waiting is not supported with NIO, so this falls through to the same logic as SELECT.
- SELECT: the set of ready channels must be obtained through the select method. This strategy is returned when the NioEventLoop has no asynchronous tasks, i.e. the task queue is empty.
switch (strategy) {
case SelectStrategy.CONTINUE:
continue;
case SelectStrategy.BUSY_WAIT:
// fall-through to SELECT since the busy-wait is not supported with NIO
case SelectStrategy.SELECT:
long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
if (curDeadlineNanos == -1L) {
curDeadlineNanos = NONE; // nothing on the calendar
}
nextWakeupNanos.set(curDeadlineNanos);
try {
if (!hasTasks()) {
strategy = select(curDeadlineNanos);
}
} finally {
// This update is just to help block unnecessary selector wakeups
// so use of lazySet is ok (no race condition)
nextWakeupNanos.lazySet(AWAKE);
}
// fall through
default:
}
SelectStrategy.SELECT
When there is no asynchronous task in the NioEventLoop thread, the SELECT strategy is executed.
//deadline of the next scheduled task; when there is no scheduled task, -1L is returned
long curDeadlineNanos = nextScheduledTaskDeadlineNanos();
if (curDeadlineNanos == -1L) {
curDeadlineNanos = NONE; // nothing on the calendar
}
nextWakeupNanos.set(curDeadlineNanos);
try {
if (!hasTasks()) {
//2. the taskQueue is empty, so perform select (which may block)
strategy = select(curDeadlineNanos);
}
} finally {
// This update is just to help block unnecessary selector wakeups
// so use of lazySet is ok (no race condition)
nextWakeupNanos.lazySet(AWAKE);
}
The select method is defined as follows. By default deadlineNanos == NONE, so the blocking selector.select() is called.
private int select(long deadlineNanos) throws IOException {
if (deadlineNanos == NONE) {
return selector.select();
}
//compute the blocking timeout for the select() call
long timeoutMillis = deadlineToDelayNanos(deadlineNanos + 995000L) / 1000000L;
return timeoutMillis <= 0 ? selector.selectNow() : selector.select(timeoutMillis);
}
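To make the timeout calculation concrete, here is a small illustrative computation (plain arithmetic, not Netty code). The extra 995000L compensates for the truncating integer division, so the computed millisecond timeout does not wake the selector noticeably before the deadline:
//assume the next scheduled task is due in 3.2 ms (3,200,000 ns from now)
long delayNanos = 3_200_000L;
long timeoutMillis = (delayNanos + 995_000L) / 1_000_000L; // = 4
//so selector.select(4) is called, and the loop wakes up at or just after the deadline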
Finally, the number of ready channels is returned, and the subsequent logic decides what to do based on that number.
Business processing in NioEventLoop.run
The logic of business processing is relatively easy to understand
- If there are ready channels, handle their I/O events
- After the I/O processing completes, the tasks in the asynchronous queue are executed
- In addition, to work around the idle-spin problem in Java NIO, the number of empty iterations is recorded in selectCnt. If an iteration does no useful work (no I/O processed and no tasks run), selectCnt is incremented; if empty iterations keep occurring (selectCnt reaches a threshold), Netty assumes the NIO bug has been triggered and handles it in unexpectedSelectorWakeup.
Java NIO has a well-known bug on Linux: the epoll empty-polling problem. Even when the number of ready channels is 0, select() may return instead of staying blocked, so the loop keeps spinning and CPU usage can reach 100%.
@Override
protected void run() {
int selectCnt = 0;
for (;;) {
//... omitted ...
selectCnt++;//selectCnt counts selects that returned without doing any work, i.e. empty eventLoop iterations; used to work around the NIO bug
cancelledKeys = 0;
needsToSelectAgain = false;
final int ioRatio = this.ioRatio;
boolean ranTasks;
if (ioRatio == 100) { //the share of time reserved for I/O is 100% (the default is 50)
try {
if (strategy > 0) { //strategy > 0 means there are ready SocketChannels
processSelectedKeys(); //process the ready SocketChannels
}
} finally {
//note: setting ioRatio to 100 does not mean tasks are skipped; on the contrary, the whole task queue is drained every iteration
ranTasks = runAllTasks(); //make sure the queued tasks always run
}
} else if (strategy > 0) { //strategy > 0 means there are ready SocketChannels
final long ioStartTime = System.nanoTime(); //start time of the I/O processing
try {
processSelectedKeys(); //process the ready I/O events
} finally {
//end time of the I/O processing
final long ioTime = System.nanoTime() - ioStartTime;
//based on the time spent on I/O in this iteration and ioRatio, compute the upper bound on how long asynchronous tasks may run
ranTasks = runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
}
} else {
//this branch means strategy == 0 and ioRatio < 100; the task time budget is 0, i.e. run as few asynchronous tasks as possible
//it is effectively the same as the strategy > 0 branch, just simplified
ranTasks = runAllTasks(0); // This will run the minimum number of tasks
}
if (ranTasks || strategy > 0) { //ranTasks == true or strategy > 0 means the eventLoop did real work (no empty spin), so reset selectCnt
if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS && logger.isDebugEnabled()) {
logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
selectCnt - 1, selector);
}
selectCnt = 0;
}
//unexpectedSelectorWakeup handles the NIO bug
else if (unexpectedSelectorWakeup(selectCnt)) { // Unexpected wakeup (unusual case)
selectCnt = 0;
}
}
}
processSelectedKeys
When the select step reports ready I/O events (strategy > 0), the processSelectedKeys method is executed.
private void processSelectedKeys() {
if (selectedKeys != null) {
processSelectedKeysOptimized();
} else {
processSelectedKeysPlain(selector.selectedKeys());
}
}
When dealing with I/O events, there are two logical branches:
- one processes the selectedKeys set optimized by Netty,
- the other is the plain processing logic.
By default selectedKeys is not null, because Netty replaces the Selector's internal key set with its optimized SelectedSelectionKeySet (described later), so the processSelectedKeysOptimized branch is taken.
processSelectedKeysOptimized
private void processSelectedKeysOptimized() {
for (int i = 0; i < selectedKeys.size; ++i) {
//1. take out the SelectionKey (the I/O event) and its channel
final SelectionKey k = selectedKeys.keys[i];
selectedKeys.keys[i] = null;//null out the reference so it can be GC'ed and the event is not processed twice
final Object a = k.attachment(); //the attachment saved at registration time; here it is the NioServerSocketChannel (or a NioSocketChannel on a worker loop)
//process the current channel
if (a instanceof AbstractNioChannel) {
//for the boss NioEventLoop, the polled events are mostly accept events; its pipeline hands the new connection over to a worker NioEventLoop
//for a worker NioEventLoop, the polled events are mostly read/write events; its pipeline passes the read bytes to each ChannelHandler
processSelectedKey(k, (AbstractNioChannel) a);
} else {
@SuppressWarnings("unchecked")
NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
processSelectedKey(k, task);
}
if (needsToSelectAgain) {
// null out entries in the array to allow to have it GC'ed once the Channel close
// See https://github.com/netty/netty/issues/2363
selectedKeys.reset(i + 1);
selectAgain();
i = -1;
}
}
}
processSelectedKey
private void processSelectedKey(SelectionKey k, AbstractNioChannel ch) {
final AbstractNioChannel.NioUnsafe unsafe = ch.unsafe();
if (!k.isValid()) {
final EventLoop eventLoop;
try {
eventLoop = ch.eventLoop();
} catch (Throwable ignored) {
//the channel is no longer registered with an event loop, so there is nothing to close here
return;
}
if (eventLoop == this) {
// close the channel if the key is not valid anymore
unsafe.close(unsafe.voidPromise());
}
return;
}
try {
int readyOps = k.readyOps(); //the ready operation set of the current key
if ((readyOps & SelectionKey.OP_CONNECT) != 0) {//connect event
int ops = k.interestOps();
ops &= ~SelectionKey.OP_CONNECT;
k.interestOps(ops);
unsafe.finishConnect();
}
if ((readyOps & SelectionKey.OP_WRITE) != 0) { //write event
ch.unsafe().forceFlush();
}
//read or ACCEPT event: call unsafe.read(); for the server channel the unsafe instance is NioMessageUnsafe
if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
unsafe.read();
}
} catch (CancelledKeyException ignored) {
unsafe.close(unsafe.voidPromise());
}
}
NioMessageUnsafe.read()
Assuming the event is a read, or a client establishing a connection, the execution logic is as follows.
@Override
public void read() {
assert eventLoop().inEventLoop();
final ChannelConfig config = config();
final ChannelPipeline pipeline = pipeline(); //for a newly accepted connection on the boss loop, the pipeline here contains ServerBootstrapAcceptor
final RecvByteBufAllocator.Handle allocHandle = unsafe().recvBufAllocHandle();
allocHandle.reset(config);
boolean closed = false;
Throwable exception = null;
try {
try {
do {
int localRead = doReadMessages(readBuf);
if (localRead == 0) {
break;
}
if (localRead < 0) {
closed = true;
break;
}
allocHandle.incMessagesRead(localRead);
} while (continueReading(allocHandle));
} catch (Throwable t) {
exception = t;
}
int size = readBuf.size();
for (int i = 0; i < size; i ++) {
readPending = false;
pipeline.fireChannelRead(readBuf.get(i)); //invoke the channelRead method of the handlers in the pipeline
}
readBuf.clear();
allocHandle.readComplete();
pipeline.fireChannelReadComplete();
if (exception != null) {
closed = closeOnReadError(exception);
pipeline.fireExceptionCaught(exception); //invoke the exceptionCaught method of the handlers in the pipeline
}
if (closed) {
inputShutdown = true;
if (isOpen()) {
close(voidPromise());
}
}
} finally {
if (!readPending && !config.isAutoRead()) {
removeReadOp();
}
}
}
Optimization of SelectedSelectionKeySet
Netty implements a SelectedSelectionKeySet to optimize the structure of the Selector's original selectedKeys set. How is it optimized? Look at its definition first.
final class SelectedSelectionKeySet extends AbstractSet<SelectionKey> {
SelectionKey[] keys;
int size;
SelectedSelectionKeySet() {
keys = new SelectionKey[1024];
}
@Override
public boolean add(SelectionKey o) {
if (o == null) {
return false;
}
keys[size++] = o;
if (size == keys.length) {
increaseCapacity();
}
return true;
}
}
The SelectedSelectionKeySet uses the SelectionKey array internally, and all ready I/O events can be retrieved directly by traversing the array in the processSelectedKeysOptimized method.
The original Set<SelectionKey> is a HashSet. Compared with it, the SelectionKey[] array does not need to deal with hash conflicts, so add runs in O(1), and iteration is a plain array traversal.
Initialization of SelectedSelectionKeySet
Through reflection, Netty replaces the selectedKeys and publicSelectedKeys fields inside the Selector object with a SelectedSelectionKeySet.
Both fields are originally of HashSet type; after the replacement, whenever a key becomes ready it is appended directly to the SelectedSelectionKeySet array, and later processing only needs to traverse that array.
private SelectorTuple openSelector() {
final Class<?> selectorImplClass = (Class<?>) maybeSelectorImplClass;
final SelectedSelectionKeySet selectedKeySet = new SelectedSelectionKeySet();
//use reflection (inside a privileged action)
Object maybeException = AccessController.doPrivileged(new PrivilegedAction<Object>() {
@Override
public Object run() {
try {
//the selectedKeys field inside the Selector
Field selectedKeysField = selectorImplClass.getDeclaredField("selectedKeys");
//the publicSelectedKeys field inside the Selector
Field publicSelectedKeysField = selectorImplClass.getDeclaredField("publicSelectedKeys");
if (PlatformDependent.javaVersion() >= 9 && PlatformDependent.hasUnsafe()) {
//offset of the selectedKeys field
long selectedKeysFieldOffset = PlatformDependent.objectFieldOffset(selectedKeysField);
//offset of the publicSelectedKeys field
long publicSelectedKeysFieldOffset =
PlatformDependent.objectFieldOffset(publicSelectedKeysField);
if (selectedKeysFieldOffset != -1 && publicSelectedKeysFieldOffset != -1) {
//replace both fields with the selectedKeySet
PlatformDependent.putObject(
unwrappedSelector, selectedKeysFieldOffset, selectedKeySet);
PlatformDependent.putObject(
unwrappedSelector, publicSelectedKeysFieldOffset, selectedKeySet);
return null;
}
// We could not retrieve the offset, lets try reflection as last-resort.
}
Throwable cause = ReflectionUtil.trySetAccessible(selectedKeysField, true);
if (cause != null) {
return cause;
}
cause = ReflectionUtil.trySetAccessible(publicSelectedKeysField, true);
if (cause != null) {
return cause;
}
selectedKeysField.set(unwrappedSelector, selectedKeySet);
publicSelectedKeysField.set(unwrappedSelector, selectedKeySet);
return null;
} catch (NoSuchFieldException e) {
return e;
} catch (IllegalAccessException e) {
return e;
}
}
});
if (maybeException instanceof Exception) {
selectedKeys = null;
Exception e = (Exception) maybeException;
logger.trace("failed to instrument a special java.util.Set into: {}", unwrappedSelector, e);
return new SelectorTuple(unwrappedSelector);
}
selectedKeys = selectedKeySet;
}
The execution flow of asynchronous tasks
Having analyzed the polling process, let's continue with how asynchronous tasks are executed in the run method of NioEventLoop.
@Override
protected void run() {
int selectCnt = 0;
for (;;) {
//... the select and I/O processing shown earlier is omitted ...
ranTasks = runAllTasks();
}
}
runAllTasks
Note that NioEventLoop also supports scheduled (timed) tasks, which are submitted through nioEventLoop.schedule(); a usage sketch follows the runAllTasks code below.
protected boolean runAllTasks() {
assert inEventLoop();
boolean fetchedAll;
boolean ranAtLeastOne = false;
do {
fetchedAll = fetchFromScheduledTaskQueue(); //move due scheduled tasks into the ordinary task queue
if (runAllTasksFrom(taskQueue)) { //run the tasks in taskQueue one by one
ranAtLeastOne = true;
}
} while (!fetchedAll);
if (ranAtLeastOne) { //if at least one task ran, record the completion time
lastExecutionTime = ScheduledFutureTask.nanoTime();
}
afterRunningAllTasks();//run the finishing (tail) tasks
return ranAtLeastOne;
}
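As mentioned above, scheduled tasks enter the scheduledTaskQueue through schedule(). A minimal usage sketch, assuming an already-registered channel (TimeUnit is java.util.concurrent.TimeUnit; the delay and the printed messages are purely illustrative):
//run a one-shot task on the channel's event loop after 5 seconds
channel.eventLoop().schedule(() -> {
System.out.println("heartbeat check"); //illustrative work
}, 5, TimeUnit.SECONDS);
//an ordinary (non-scheduled) task goes straight into taskQueue
channel.eventLoop().execute(() -> System.out.println("plain task"));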
fetchFromScheduledTaskQueue
Traverse the tasks in scheduledTaskQueue and add them to taskQueue.
private boolean fetchFromScheduledTaskQueue() {
if (scheduledTaskQueue == null || scheduledTaskQueue.isEmpty()) {
return true;
}
long nanoTime = AbstractScheduledEventExecutor.nanoTime();
for (;;) {
Runnable scheduledTask = pollScheduledTask(nanoTime);
if (scheduledTask == null) {
return true;
}
if (!taskQueue.offer(scheduledTask)) {
// No space left in the task queue add it back to the scheduledTaskQueue so we pick it up again.
scheduledTaskQueue.add((ScheduledFutureTask<?>) scheduledTask);
return false;
}
}
}
Task add method execute
NioEventLoop maintains two important asynchronous task queues internally: the ordinary task queue and the scheduled task queue. Two methods are provided for adding tasks to them:
- execute()
- schedule()
Among them, the execute method is defined as follows.
private void execute(Runnable task, boolean immediate) {
boolean inEventLoop = inEventLoop();
addTask(task); //add the task to the blocking queue
if (!inEventLoop) { //if the caller is not the NioEventLoop thread itself
startThread(); //start the thread
if (isShutdown()) { //if this NioEventLoop has already been shut down
boolean reject = false;
try {
if (removeTask(task)) {
reject = true;
}
} catch (UnsupportedOperationException e) {
// The task queue does not support removal so the best thing we can do is to just move on and
// hope we will be able to pick-up the task before its completely terminated.
// In worst case we will log on termination.
}
if (reject) {
reject();
}
}
}
if (!addTaskWakesUp && immediate) {
wakeup(inEventLoop);
}
}
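The inEventLoop() check above is the heart of Netty's thread-confinement model: work on a channel always ends up running on that channel's own EventLoop thread. A minimal sketch of the same pattern in user code, where handleBusiness is a hypothetical method:
EventLoop loop = channel.eventLoop();
if (loop.inEventLoop()) {
//already on the channel's thread, run directly
handleBusiness(channel);
} else {
//otherwise hand the work over to the channel's thread
loop.execute(() -> handleBusiness(channel));
}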
The NIO empty-polling problem
Normally, when selector.select() is executed and no SocketChannel is ready, the current thread blocks. Empty polling means the thread is woken up even though no SocketChannel is ready; since the wake-up carries no read or write events, the loop spins uselessly and CPU usage climbs towards 100%.
The root cause: on some Linux 2.6 kernels, when a connected socket is interrupted abruptly, poll and epoll set the returned event set to POLLHUP (or POLLERR). Because the event set changes, the Selector is woken up even though there is nothing to read or write. This is operating-system behaviour, and although the JDK is supposed to shield applications from such platform differences, early JDK 5 and JDK 6 releases did not fix the problem and left the blame with the operating system. That is why the bug was not finally fixed until around 2013, by which time its impact was already very wide.
How does Netty solve this problem? Let’s go back to the run method of NioEventLoop
@Override
protected void run() {
int selectCnt = 0;
for (;;) {
//selectCnt counts selects that returned without doing any work, i.e. empty eventLoop iterations; used to work around the NIO bug
selectCnt++;
//ranTasks == true or strategy > 0 means the eventLoop did real work (no empty spin), so selectCnt is cleared
if (ranTasks || strategy > 0) {
//if the counter exceeds the minimum premature-return threshold, log it
if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS && logger.isDebugEnabled()) {
logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
selectCnt - 1, selector);
}
selectCnt = 0;
}
//unexpectedSelectorWakeup handles the NIO bug
else if (unexpectedSelectorWakeup(selectCnt)) { // Unexpected wakeup (unusual case)
selectCnt = 0;
}
}
}
unexpectedSelectorWakeup
private boolean unexpectedSelectorWakeup(int selectCnt) {
if (Thread.interrupted()) {
if (logger.isDebugEnabled()) {
logger.debug("Selector.select() returned prematurely because " +
"Thread.currentThread().interrupt() was called. Use " +
"NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
}
return true;
}
//if the auto-rebuild threshold is greater than 0 (the default is 512) and the empty-poll count has reached it, trigger a rebuild
if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
// The selector returned prematurely many times in a row.
// Rebuild the selector to work around the problem.
logger.warn("Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.",
selectCnt, selector);
rebuildSelector();
return true;
}
return false;
}
rebuildSelector()
public void rebuildSelector() {
if (!inEventLoop()) { //if not called from the event loop thread, run the rebuild on it asynchronously
execute(new Runnable() {
@Override
public void run() {
rebuildSelector0();
}
});
return;
}
rebuildSelector0();
}
rebuildSelector0
The main job of this method is to create a new Selector and replace the Selector of the current event loop with it.
private void rebuildSelector0() {
final Selector oldSelector = selector; //the old selector
final SelectorTuple newSelectorTuple; //the new selector
if (oldSelector == null) { //if there is no old selector, return directly
return;
}
try {
newSelectorTuple = openSelector(); //create a new selector
} catch (Exception e) {
logger.warn("Failed to create a new Selector.", e);
return;
}
// Register all channels to the new Selector.
int nChannels = 0;
for (SelectionKey key: oldSelector.keys()) {//iterate over all keys registered with the old selector
Object a = key.attachment();
try {
//skip this key if it is invalid or its channel is already registered with the new selector
if (!key.isValid() || key.channel().keyFor(newSelectorTuple.unwrappedSelector) != null) {
continue;
}
//the interest ops of the key
int interestOps = key.interestOps();
key.cancel();//cancel the old key
//register the channel with the new selector
SelectionKey newKey = key.channel().register(newSelectorTuple.unwrappedSelector, interestOps, a);
if (a instanceof AbstractNioChannel) {//if it is an NIO channel, update its selectionKey
// Update SelectionKey
((AbstractNioChannel) a).selectionKey = newKey;
}
nChannels ++;
} catch (Exception e) {
logger.warn("Failed to re-register a Channel to the new Selector.", e);
if (a instanceof AbstractNioChannel) {
AbstractNioChannel ch = (AbstractNioChannel) a;
ch.unsafe().close(ch.unsafe().voidPromise());
} else {
@SuppressWarnings("unchecked")
NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
invokeChannelUnregistered(task, key, e);
}
}
}
//swap in the new selector for this event loop
selector = newSelectorTuple.selector;
unwrappedSelector = newSelectorTuple.unwrappedSelector;
try {
// time to close the old selector as everything else is registered to the new one
oldSelector.close(); //close the old selector
} catch (Throwable t) {
if (logger.isWarnEnabled()) {
logger.warn("Failed to close the old Selector.", t);
}
}
if (logger.isInfoEnabled()) {
logger.info("Migrated " + nChannels + " channel(s) to the new Selector.");
}
}
From the above process we can see that Netty works around the NIO empty-polling problem by rebuilding the Selector object: all SelectionKeys registered with the old Selector are re-registered with a brand-new Selector, which neatly sidesteps the JDK epoll spin bug.
Connection establishment and processing process
Section 9.2.4.3 mentioned that when a client connects, or a read event reaches the server, the read() method of the NioMessageUnsafe class is called.
public void read() {
assert eventLoop().inEventLoop();
final ChannelConfig config = config();
final ChannelPipeline pipeline = pipeline();
final RecvByteBufAllocator.Handle allocHandle = unsafe().recvBufAllocHandle();
allocHandle.reset(config);
boolean closed = false;
Throwable exception = null;
try {
try {
do {
//if a client connection comes in, localRead is 1; otherwise 0 is returned
int localRead = doReadMessages(readBuf);
if (localRead == 0) {
break;
}
if (localRead < 0) {
closed = true;
break;
}
allocHandle.incMessagesRead(localRead); //accumulate the number of messages read
} while (continueReading(allocHandle));
} catch (Throwable t) {
exception = t;
}
int size = readBuf.size(); //number of accepted connections / read messages
for (int i = 0; i < size; i ++) {
readPending = false;
pipeline.fireChannelRead(readBuf.get(i)); //invoke the channelRead method of the handlers in the pipeline
}
readBuf.clear(); //clear the list
allocHandle.readComplete();
pipeline.fireChannelReadComplete(); //trigger the channelReadComplete method of the handlers in the pipeline
if (exception != null) {
closed = closeOnReadError(exception);
pipeline.fireExceptionCaught(exception);
}
if (closed) {
inputShutdown = true;
if (isOpen()) {
close(voidPromise());
}
}
} finally {
if (!readPending && !config.isAutoRead()) {
removeReadOp();
}
}
}
pipeline.fireChannelRead(readBuf.get(i))
Let's continue with how the pipeline propagates this event. For a connection (accept) event at this point, the handler that is invoked in the pipeline is ServerBootstrap$ServerBootstrapAcceptor.
static void invokeChannelRead(final AbstractChannelHandlerContext next, Object msg) {
final Object m = next.pipeline.touch(ObjectUtil.checkNotNull(msg, "msg"), next);
EventExecutor executor = next.executor();
if (executor.inEventLoop()) {
next.invokeChannelRead(m); //get the next node in the pipeline and invoke its channelRead method
} else {
executor.execute(new Runnable() {
@Override
public void run() {
next.invokeChannelRead(m);
}
});
}
}
ServerBootstrapAcceptor
ServerBootstrapAcceptor is a special handler in the NioServerSocketChannel pipeline, dedicated to handling client connection events. The core purpose of its channelRead method is to add the configured handler chain to the pipeline of the newly accepted NioSocketChannel and register that channel with a worker EventLoop.
public void channelRead(ChannelHandlerContext ctx, Object msg) {
final Channel child = (Channel) msg;
child.pipeline().addLast(childHandler); //add the childHandler configured on the server side to the pipeline of this NioSocketChannel
setChannelOptions(child, childOptions, logger); //set the options of the NioSocketChannel
setAttributes(child, childAttrs);
try {
//register this NioSocketChannel with a worker's Selector and listen for the asynchronous result
childGroup.register(child).addListener(new ChannelFutureListener() {
@Override
public void operationComplete(ChannelFuture future) throws Exception {
if (!future.isSuccess()) {
forceClose(child, future.cause());
}
}
});
} catch (Throwable t) {
forceClose(child, t);
}
}
The pipeline construction process
In Section 9.6.2, child is actually a NioSocketChannel; it is created inside the NioServerSocketChannel when a new connection is accepted.
@Override
protected int doReadMessages(List<Object> buf) throws Exception {
SocketChannel ch = SocketUtils.accept(javaChannel());
try {
if (ch != null) {
buf.add(new NioSocketChannel(this, ch)); //the NioSocketChannel is created here
return 1;
}
} catch (Throwable t) {
logger.warn("Failed to create a new channel from an accepted socket.", t);
try {
ch.close();
} catch (Throwable t2) {
logger.warn("Failed to close a socket.", t2);
}
}
return 0;
}
When NioSocketChannel is constructed, it calls the construction method in the parent class AbstractChannel to initialize a pipeline.
protected AbstractChannel(Channel parent) {
this.parent = parent;
id = newId();
unsafe = newUnsafe();
pipeline = newChannelPipeline();
}
DefaultChannelPipeline
The default instance of pipeline is DefaultChannelPipeline, which is constructed as follows.
protected DefaultChannelPipeline(Channel channel) {
this.channel = ObjectUtil.checkNotNull(channel, "channel");
succeededFuture = new SucceededChannelFuture(channel, null);
voidPromise = new VoidChannelPromise(channel, true);
tail = new TailContext(this);
head = new HeadContext(this);
head.next = tail;
tail.prev = head;
}
A head node and a tail node are initialized to form a doubly linked list, as shown in Figure 9-2
<center>Figure 9-2</center>
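To make the structure concrete, here is a minimal sketch of what the list looks like after user handlers are added (StringDecoder is Netty's built-in decoder; MyBusinessHandler is a hypothetical user handler):
ChannelPipeline p = channel.pipeline();
p.addLast("decoder", new StringDecoder()); //inbound handler
p.addLast("business", new MyBusinessHandler()); //hypothetical user handler
//resulting list: head <-> decoder <-> business <-> tail
//inbound events (channelRead, ...) are propagated from head towards tail
//outbound operations (write, flush, ...) are propagated from tail towards head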
The composition of the handler chain in NioSocketChannel
Going back to the channelRead method of ServerBootstrapAcceptor: when a client connection is received, building the pipeline of the NioSocketChannel is triggered.
The following code is the addLast method of DefaultChannelPipeline.
@Override
public final ChannelPipeline addLast(EventExecutorGroup executor, ChannelHandler... handlers) {
ObjectUtil.checkNotNull(handlers, "handlers");
for (ChannelHandler h: handlers) { //iterate over the handlers; here the handler is the ChannelInitializer callback
if (h == null) {
break;
}
addLast(executor, null, h);
}
return this;
}
addLast
This adds the ChannelHandler configured on the server to the pipeline. Note that what is stored in the pipeline at this point is the ChannelInitializer callback.
@Override
public final ChannelPipeline addLast(EventExecutorGroup group, String name, ChannelHandler handler) {
final AbstractChannelHandlerContext newCtx;
synchronized (this) {
checkMultiplicity(handler); //check for duplicate handlers
//create a new DefaultChannelHandlerContext node
newCtx = newContext(group, filterName(name, handler), handler);
addLast0(newCtx); //append the new DefaultChannelHandlerContext to the ChannelPipeline
if (!registered) {
newCtx.setAddPending();
callHandlerCallbackLater(newCtx, true);
return this;
}
EventExecutor executor = newCtx.executor();
if (!executor.inEventLoop()) {
callHandlerAddedInEventLoop(newCtx, executor);
return this;
}
}
callHandlerAdded0(newCtx);
return this;
}
When is this callback actually triggered? In the channelRead method of ServerBootstrapAcceptor, the newly accepted NioSocketChannel is registered with the worker group:
childGroup.register(child).addListener(new ChannelFutureListener() {...}
Following the same source-analysis approach as in the previous article, we locate the register0 method in AbstractChannel.
private void register0(ChannelPromise promise) {
try {
// check if the channel is still open as it could be closed in the mean time when the register
// call was outside of the eventLoop
if (!promise.setUncancellable() || !ensureOpen(promise)) {
return;
}
boolean firstRegistration = neverRegistered;
doRegister();
neverRegistered = false;
registered = true;
//trigger the handlerAdded callbacks that were deferred until the channel was registered
pipeline.invokeHandlerAddedIfNeeded();
}
}
callHandlerAddedForAllHandlers
Following pipeline.invokeHandlerAddedIfNeeded() downward, execution enters the callHandlerAddedForAllHandlers method of the DefaultChannelPipeline class.
private void callHandlerAddedForAllHandlers() {
final PendingHandlerCallback pendingHandlerCallbackHead;
synchronized (this) {
assert !registered;
// This Channel itself was registered.
registered = true;
pendingHandlerCallbackHead = this.pendingHandlerCallbackHead;
// Null out so it can be GC'ed.
this.pendingHandlerCallbackHead = null;
}
//take the tasks from the list of pending handler callbacks and execute them
PendingHandlerCallback task = pendingHandlerCallbackHead;
while (task != null) {
task.execute();
task = task.next;
}
}
The pendingHandlerCallbackHead singly linked list is populated in the callHandlerCallbackLater method, and callHandlerCallbackLater is called from addLast. So the two sides form a complete asynchronous loop: addLast records the pending handlerAdded callback, and channel registration later executes it.
ChannelInitializer.handlerAdded
The execution path of task.execute() is:
callHandlerAdded0 -> ctx.callHandlerAdded
-> AbstractChannelHandlerContext.callHandlerAdded()
-> ChannelInitializer.handlerAdded
handlerAdded then calls the initChannel method to initialize the NioSocketChannel.
@Override
public void handlerAdded(ChannelHandlerContext ctx) throws Exception {
if (ctx.channel().isRegistered()) {
// This should always be true with our current DefaultChannelPipeline implementation.
// The good thing about calling initChannel(...) in handlerAdded(...) is that there will be no ordering
// surprises if a ChannelInitializer will add another ChannelInitializer. This is as all handlers
// will be added in the expected order.
if (initChannel(ctx)) {
// We are done with init the Channel, removing the initializer now.
removeState(ctx);
}
}
}
Then, call the initChannel abstract method, which is completed by the concrete implementation class.
private boolean initChannel(ChannelHandlerContext ctx) throws Exception {
if (initMap.add(ctx)) { // Guard against re-entrance.
try {
initChannel((C) ctx.channel());
} catch (Throwable cause) {
// Explicitly call exceptionCaught(...) as we removed the handler before calling initChannel(...).
// We do so to prevent multiple calls to initChannel(...).
exceptionCaught(ctx, cause);
} finally {
ChannelPipeline pipeline = ctx.pipeline();
if (pipeline.context(this) != null) {
pipeline.remove(this);
}
}
return true;
}
return false;
}
The concrete ChannelInitializer here is the anonymous inner class defined in our own server bootstrap code; its callback therefore completes the pipeline construction of the current NioSocketChannel.
public static void main(String[] args) throws Exception {
//1. thread group that accepts client connections
EventLoopGroup boss = new NioEventLoopGroup();
//2. thread group that handles read/write for the accepted connections
EventLoopGroup work = new NioEventLoopGroup();
ServerBootstrap b = new ServerBootstrap();
b.group(boss, work) //bind the two thread groups
.channel(NioServerSocketChannel.class) //use the NIO transport
//initialize the pipeline of each accepted channel
.childHandler(new ChannelInitializer<SocketChannel>() {
@Override
protected void initChannel(SocketChannel sc) throws Exception {
sc.pipeline()
.addLast(
new LengthFieldBasedFrameDecoder(1024,
9,4,0,0))
.addLast(new MessageRecordEncoder())
.addLast(new MessageRecordDecode())
.addLast(new ServerHandler());
}
});
//bind the listening port and block until the server channel closes (the port number is illustrative)
ChannelFuture f = b.bind(8080).sync();
f.channel().closeFuture().sync();
}
Copyright statement: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit "Mic takes you to learn architecture" when reposting!
If this article helped you, please follow and like; your support is the motivation for my continued writing. You are also welcome to follow the WeChat public account of the same name for more technical content!