The core principle and practice of JDK ThreadPoolExecutor

One, content summary

This article centers on the JDK's ThreadPoolExecutor. It first describes how a ThreadPoolExecutor is constructed and how it manages its internal state, then spends most of its length on thread allocation, task processing, rejection strategies, and startup and shutdown. The built-in Worker class gets particular attention, covering both how it works and why it is designed that way. Alongside the source walkthrough, the article discusses design ideas and secondary development practice.

Two, construct ThreadPoolExecutor

2.1 Thread pool parameter list

You can create a thread pool with the constructor below. ThreadPoolExecutor has other constructors as well (see the source code), but they all end up calling this one:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    ...
}

The constructor parameters work as follows:

corePoolSize : the number of core threads. When a task is submitted and the pool has fewer than corePoolSize threads, a new core thread is created to run the task. Once the number of threads reaches corePoolSize, the task is added to the task queue instead.
maximumPoolSize : the maximum number of threads. When a task is submitted, the task queue is full, and the pool has fewer than maximumPoolSize threads, a non-core thread is created to run the task. Once the pool already holds maximumPoolSize threads, the rejection policy is applied.
keepAliveTime : how long an idle non-core thread survives before it is destroyed.
unit : the time unit of keepAliveTime.
workQueue : the task queue (a blocking queue).
threadFactory : the thread factory that the pool uses to create new threads.
handler : the rejection strategy. When the pool cannot accept a task, it hands the task to this handler, which decides what to do with it, for example throw an exception or discard the task.
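As a minimal sketch of how the seven parameters fit together, the following builds a small pool. The class name, the namedFactory helper, the thread-name prefix, and the concrete sizes are all illustrative assumptions, not recommendations from this article.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolConstructionSketch {
    // Hypothetical helper: a factory that gives worker threads readable names.
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger seq = new AtomicInteger(1);
        return task -> new Thread(task, prefix + "-" + seq.getAndIncrement());
    }

    // Illustrative values only; real sizes depend on the workload.
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                2,                                      // corePoolSize
                4,                                      // maximumPoolSize
                30L, TimeUnit.SECONDS,                  // keepAliveTime and its unit
                new ArrayBlockingQueue<Runnable>(100),  // bounded workQueue
                namedFactory("demo-pool"),              // threadFactory
                new ThreadPoolExecutor.AbortPolicy());  // handler: throw on rejection
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        pool.execute(() -> System.out.println("running on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

A bounded queue is used here so that maximumPoolSize and the handler can actually come into play; with an unbounded queue they never would.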

2.2 Overview of the execution process

The constructor parameters point to three important components in ThreadPoolExecutor: the core thread pool, the idle (non-core) thread pool, and the blocking queue. The core execution flow chart of the thread pool is given here first so that you have an overall impression; the source code will be easier to follow afterwards.

Some notes on the labels in the flowchart: cap is the capacity of a pool, and size is the number of threads running in it. For the blocking queue, cap is the capacity of the queue, and size is the number of tasks already enqueued. cpS<cpc means that the number of running core threads is less than the number of core threads configured for the pool.

1) When the core thread pool is not "full", a new core thread is created to execute the submitted task. "Not full" here means that the number of threads in the core thread pool (size) is less than its capacity (cap). In that case a new thread is created through the thread factory and executes the submitted task.

2) When the core thread pool is "full", the submitted task is pushed into the task queue to wait for a core thread to become free. Once a core thread finishes its current task, it pulls the next task from the queue and executes it. Because the queue is a blocking queue, a core thread that finds the queue empty blocks until a task arrives.

3) When the task queue is also full (really full; unbounded queues are not considered for now), new threads are created in the idle thread pool to execute the submitted tasks. Threads in the idle thread pool have a survival time (keepAliveTime): after finishing a task, such a thread may stay idle for at most keepAliveTime, and it is destroyed if no new task arrives within that time.

4) When the idle thread pool has grown until the total number of threads in ThreadPoolExecutor has reached maximumPoolSize, further tasks are rejected and handed over to the RejectedExecutionHandler for subsequent processing.

The core thread pool and idle thread pool mentioned above are just abstract concepts, and we will analyze their specific content later.
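The four steps above can be observed with a deliberately tiny pool. This is a sketch under assumed settings (1 core thread, at most 2 threads, a queue of capacity 1, and tasks that block on a latch so nothing finishes early); the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecutionFlowSketch {
    /** Returns {poolSizeAfterTask1, queueSizeAfterTask2, poolSizeAfterTask3, rejectedTasks}. */
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        CountDownLatch release = new CountDownLatch(1);
        // Every task blocks until released, so no thread becomes free during the test.
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int[] seen = new int[4];
        try {
            pool.execute(blocker);               // step 1: a core thread is created
            seen[0] = pool.getPoolSize();
            pool.execute(blocker);               // step 2: core pool full, task is queued
            seen[1] = pool.getQueue().size();
            pool.execute(blocker);               // step 3: queue full, a non-core thread is created
            seen[2] = pool.getPoolSize();
            try {
                pool.execute(blocker);           // step 4: pool at maximum, task is rejected
            } catch (RejectedExecutionException e) {
                seen[3] = 1;                     // the default handler (AbortPolicy) throws
            }
        } finally {
            release.countDown();                 // let the blocked tasks finish
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```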

2.3 Common thread pool

Before analyzing the source code of ThreadPoolExecutor, let's first introduce the common thread pools (strictly speaking they are not used that often, but they ship with the JDK). They are created through the utility class Executors, which acts as a thread pool factory.

2.3.1 FixedThreadPool

The method of creating a thread pool with a fixed number of threads is as follows: the number of core threads and the maximum number of threads are fixed and equal, and an unbounded blocking queue with a linked list as the underlying structure is adopted.

public static ExecutorService newFixedThreadPool(int nThreads, ThreadFactory threadFactory) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>(),
                                  threadFactory);
}

Features:

The number of core threads equals the maximum number of threads, so no idle (non-core) threads are ever created, and the value of keepAliveTime makes no difference.
The queue is unbounded, so tasks can pile up without limit until the JVM runs out of memory (OOM).
Since the unbounded queue can never fill up, a task is never rejected before execution (provided that the thread pool stays in the running state).

Application scenarios:

Suitable for scenarios that need a fixed number of threads.
Suitable for servers with heavy loads.

2.3.2 SingleThreadExecutor

The single-threaded thread pool is created in the following way: the number of core threads and the maximum number of threads are both 1, and an unbounded blocking queue with a linked list as the underlying structure is adopted.

public static ExecutorService newSingleThreadExecutor(ThreadFactory threadFactory) {
    return new FinalizableDelegatedExecutorService
        (new ThreadPoolExecutor(1, 1,
                                0L, TimeUnit.MILLISECONDS,
                                new LinkedBlockingQueue<Runnable>(),
                                threadFactory));
}

Features:

Similar to FixedThreadPool, except that the number of threads is 1.

Application scenarios:

Suitable for single-threaded scenarios.
Suitable for scenarios where submitted tasks must be processed in order.

2.3.3 CachedThreadPool

The cached thread pool is created as follows: the number of core threads is 0, the maximum number of threads is Integer.MAX_VALUE (effectively unlimited), and a synchronous blocking queue (SynchronousQueue) is used.

public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory) {
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>(),
                                  threadFactory);
}

Features:

The number of core threads is 0, so every thread is an idle (non-core) thread. A thread that has finished its task waits at most 60s for a new one; if no task arrives within 60s, the thread is destroyed.
The maximum number of threads is effectively unlimited, so a burst of submissions can create a huge number of threads running at the same time, drive the CPU load too high, and crash the application.
The synchronous blocking queue does not store tasks: each submitted task is handed directly to a thread. Since the maximum number of threads is effectively unlimited, every submitted task gets a thread (until the application crashes).

Application scenarios:

Suitable for programs that execute many short-lived asynchronous tasks.
Suitable for servers with lighter loads.
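The configurations described in 2.3.1 and 2.3.3 can be read back from a live pool. The sketch below assumes that the objects returned by newFixedThreadPool and newCachedThreadPool can be cast to ThreadPoolExecutor, which matches the factory source shown above; newSingleThreadExecutor returns a wrapper and is therefore left out. The class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class FactoryConfigSketch {
    // Assumption: the service was created by newFixedThreadPool or newCachedThreadPool,
    // so the cast to ThreadPoolExecutor succeeds.
    static String describe(ExecutorService service) {
        ThreadPoolExecutor pool = (ThreadPoolExecutor) service;
        String summary = "core=" + pool.getCorePoolSize()
                + " max=" + pool.getMaximumPoolSize()
                + " keepAliveSec=" + pool.getKeepAliveTime(TimeUnit.SECONDS)
                + " queue=" + pool.getQueue().getClass().getSimpleName();
        pool.shutdown();  // no tasks were submitted, so this terminates immediately
        return summary;
    }

    public static void main(String[] args) {
        System.out.println(describe(Executors.newFixedThreadPool(4)));
        System.out.println(describe(Executors.newCachedThreadPool()));
    }
}
```

When the unbounded queue or the unlimited thread count is a concern, constructing ThreadPoolExecutor directly with a bounded queue and an explicit maximum is a common alternative.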

Three, thread pool status and the number of active threads

ThreadPoolExecutor has two very important internal values: the thread pool state (rs) and the number of active threads (wc). The former identifies the current state of the pool and determines what the pool is allowed to do in that state; the latter counts the active threads and determines whether a new thread should be created in the core thread pool or in the idle thread pool.

ThreadPoolExecutor stores both values in a single integer variable (ctl). A Java int is 32 bits on every platform. ThreadPoolExecutor uses the upper 3 bits (31~29) for the thread pool state and the lower 29 bits (28~0) for the number of active threads.

What is the purpose of this design?

Maintaining two variables at the same time in a concurrent scenario is expensive: a lock is often needed to make changes to both of them atomic. When one variable carries both values, a single atomic operation on that variable updates them together, which greatly reduces concurrency problems.

With the above concepts, let's take a look at the several states of ThreadPoolExecutor from the source code level, and how ThreadPoolExecutor operates the two parameters of state and number of active threads at the same time.

The source code of ThreadPoolExecutor about state initialization is as follows:

private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;
 
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;

ThreadPoolExecutor defines ctl as an AtomicInteger. ctl packs two values, the number of active threads and the run state of the thread pool, into one int. As a consequence, the number of threads in ThreadPoolExecutor is limited to 2^29-1 (about 500 million) rather than 2^31-1 (about 2 billion), because the upper 3 bits hold the state. If that limit ever becomes a problem, ctl can be changed to an AtomicLong and the masks adjusted accordingly.

COUNT_BITS conceptually marks the boundary between the state bits and the thread-count bits, and in practice serves as the shift distance for the state constants. Its value is Integer.SIZE - 3 = 32 - 3 = 29.

CAPACITY is the maximum number of threads ThreadPoolExecutor can hold. As can be seen from the figure below, after the shift and subtraction the lower 29 bits of the int are all 1s. These 29 bits represent the number of active threads, so all 1s means the maximum has been reached. The upper 3 bits are 0, which means that the value carries only thread-count information and no state. This also makes CAPACITY usable as a mask in the bit operations that follow.

RUNNING, SHUTDOWN, STOP, TIDYING, and TERMINATED are the 5 states of ThreadPoolExecutor. What the pool does in each state is as follows:

RUNNING : accepts new tasks and processes the tasks in the blocking queue.
SHUTDOWN : does not accept new tasks, but continues to process the tasks in the blocking queue.
STOP : does not accept new tasks, does not process the tasks in the blocking queue, and interrupts the tasks in progress.
TIDYING : all tasks have terminated and the number of active threads is 0; the terminated() hook is about to run.
TERMINATED : terminated() has completed and the thread pool is closed.

The calculation of these 5 states is shown in the figure below. After the shift, the lower 29 bits of each value are all 0, and the upper 3 bits encode the state.

With these definitions, ThreadPoolExecutor keeps the state and the number of threads in separate, contiguous bit ranges of one int value, which makes the operations below very convenient.
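To make the bit layout concrete, here is a small sketch that re-declares the constants from the listing above in a standalone class (CtlBitsSketch and its bits helper are hypothetical names) and prints each value as 3 state bits followed by 29 count bits.

```java
class CtlBitsSketch {
    // Same definitions as in the ThreadPoolExecutor listing above, copied for illustration.
    static final int COUNT_BITS = Integer.SIZE - 3;
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    // Formats a value as 32 binary digits, with a space after the 3 state bits.
    static String bits(int value) {
        String padded = String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
        return padded.substring(0, 3) + " " + padded.substring(3);
    }

    public static void main(String[] args) {
        System.out.println("CAPACITY   " + bits(CAPACITY));
        System.out.println("RUNNING    " + bits(RUNNING));
        System.out.println("SHUTDOWN   " + bits(SHUTDOWN));
        System.out.println("STOP       " + bits(STOP));
        System.out.println("TIDYING    " + bits(TIDYING));
        System.out.println("TERMINATED " + bits(TERMINATED));
    }
}
```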

Next, let's take a look at how ThreadPoolExecutor obtains the status and the number of threads.

3.1 runStateOf(c) method

private static int runStateOf(int c) {
    return c & ~CAPACITY;
}

runStateOf() obtains the state of the thread pool. The parameter c is normally the value of ctl, which contains both the state and the number of threads. The bit calculation performed by runStateOf() is shown in the figure below.

After CAPACITY is inverted, its upper 3 bits are 1 and its lower 29 bits are 0. The inverted value is ANDed with ctl. Since any bit ANDed with 1 keeps its value and any bit ANDed with 0 becomes 0, the upper 3 bits of ctl are preserved and the lower 29 bits are cleared. This extracts the state from ctl.

3.2 workerCountOf(c) method

private static int workerCountOf(int c) {
    return c & CAPACITY;
}

workerCountOf(c) follows the same idea: it ANDs ctl with CAPACITY to keep the lower 29 bits and obtain the number of active threads. The calculation is shown in the figure below and is not repeated here.

3.3 ctlOf(rs, wc) method

private static int ctlOf(int rs, int wc) {
    return rs | wc;
}

ctlOf(rs, wc) computes the ctl value from a state value and a thread count. rs is short for runState, and wc is short for workerCount. The lower 29 bits of rs are 0, and the upper 3 bits of wc are 0, so ORing the two keeps the upper 3 bits of rs and the lower 29 bits of wc, which is exactly the ctl value.
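The three helpers are private inside ThreadPoolExecutor, so the sketch below copies them into a standalone class (CtlPackingSketch is a hypothetical name) to show that packing and unpacking round-trip.

```java
class CtlPackingSketch {
    static final int COUNT_BITS = Integer.SIZE - 3;
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;

    // Copies of the private helpers shown above.
    static int runStateOf(int c)     { return c & ~CAPACITY; }
    static int workerCountOf(int c)  { return c & CAPACITY; }
    static int ctlOf(int rs, int wc) { return rs | wc; }

    public static void main(String[] args) {
        int ctl = ctlOf(RUNNING, 5);  // running pool with 5 workers
        System.out.println("state is RUNNING: " + (runStateOf(ctl) == RUNNING));
        System.out.println("worker count:     " + workerCountOf(ctl));
    }
}
```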

ThreadPoolExecutor has a few other methods that operate on ctl. They can be analyzed in the same way; if you are interested, read them in the source.

To close this section, let's look at the state transitions of ThreadPoolExecutor, which can also be understood as its life cycle.
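The internal state is not directly visible, but part of the life cycle can be watched through the public methods isShutdown(), isTerminating(), and isTerminated(), plus the protected terminated() hook. The sketch below is one way to do that; the class name, the log texts, and the latch-based task are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class LifecycleSketch {
    static List<String> observe() {
        List<String> log = Collections.synchronizedList(new ArrayList<String>());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>()) {
            @Override
            protected void terminated() {
                log.add("terminated() hook");  // runs while the pool is TIDYING
            }
        };
        try {
            log.add("RUNNING: isShutdown=" + pool.isShutdown());
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();    // the worker is now inside the task
            pool.shutdown();    // RUNNING -> SHUTDOWN; the busy worker keeps running
            log.add("SHUTDOWN: isShutdown=" + pool.isShutdown()
                    + " isTerminating=" + pool.isTerminating()
                    + " isTerminated=" + pool.isTerminated());
            release.countDown();  // the task ends, the pool can tidy up and terminate
            pool.awaitTermination(5, TimeUnit.SECONDS);
            log.add("TERMINATED: isTerminated=" + pool.isTerminated());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<String>(log);
    }

    public static void main(String[] args) {
        observe().forEach(System.out::println);
    }
}
```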

Four, execute() execution process

4.1 execute method

The source code of execute() is as follows:

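public void execute(Runnable command) {
  // A null task is rejected immediately with a NullPointerException; without a task nothing below applies.
  if (command == null) throw new NullPointerException();
  // Read ctl, which packs runState and workerCount into one int.
  int c = ctl.get();
  // Is workerCount (the number of worker threads) below the number of core threads?
  if (workerCountOf(c) < corePoolSize) {
    // addWorker() is analyzed in detail below; for now, read it as "add a worker thread to run the task".
    // Passing true means the worker is checked against corePoolSize, i.e. a core thread is added.
    if (addWorker(command, true))
      // Added successfully; return.
      return;
    // Adding failed; re-read ctl in case the state changed while the worker was being added.
    c = ctl.get();
  }
  // Reaching this point means the core threads are full (or adding one failed), so the addWorker()
  // calls below pass false. If the pool is running, try to put the task into the task queue.
  if (isRunning(c) && workQueue.offer(command)) {
    // Re-read ctl for a double-check.
    int recheck = ctl.get();
    // If the pool is no longer running, try to remove the task from the task queue.
    if (! isRunning(recheck) && remove(command))
      // Removed successfully; apply the rejection strategy.
      reject(command);
    // The pool is still running, or the task could not be removed: check whether any worker is left.
    else if (workerCountOf(recheck) == 0)
      // Add a non-core thread (an idle thread with a survival time) and no first task; it takes work from the queue.
      addWorker(null, false);
  }
  // The pool is not running, or the task queue refused the task: try addWorker() once more.
  else if (!addWorker(command, false))
    // addWorker() failed; apply the rejection strategy.
    reject(command);
}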

The source code can be followed directly through the comments, which annotate almost every line; verbose, but detailed.

As the source shows, execute() mainly encapsulates the decision logic for creating threads: when core threads are created, when idle threads are created, and when the rejection strategy runs are all decided in this method. The flowchart below summarizes the source code above.

The logic that creates a thread to execute the submitted task is encapsulated in addWorker(), which the next section analyzes in detail. The other methods called by execute() are explained here.

4.1.1 workerCountOf()

Obtains the number of active threads from ctl, as already introduced in Section Three.

4.1.2 isRunning()

private static boolean isRunning(int c) {
    return c < SHUTDOWN;
}

isRunning() determines from the value of ctl whether ThreadPoolExecutor is running. The source simply tests whether ctl < SHUTDOWN. This works because the highest bit of ctl is 1 in the RUNNING state, so the value is negative, while in every other state the highest bit is 0, so the value is zero or positive. Comparing ctl with SHUTDOWN (whose value is 0) is therefore enough to tell whether the pool is running.

4.1.3 reject()

final void reject(Runnable command) {
    handler.rejectedExecution(command, this);
}

reject() directly calls rejectedExecution() on the RejectedExecutionHandler that was passed in at construction time. This is a typical use of the strategy pattern: the actual rejection behavior is encapsulated in the class that implements the RejectedExecutionHandler interface. It is not expanded on here.
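Because the handler is just a strategy object, a custom one is easy to plug in. The following sketch uses a hypothetical handler that only counts rejected tasks; the pool sizes and the latch-based blocking task are assumptions chosen to make the outcome predictable.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CountingRejectionSketch {
    /** Submits the given number of blocking tasks and returns how many were rejected. */
    static int countRejections(int submissions) {
        AtomicInteger rejected = new AtomicInteger();
        // Hypothetical strategy: count the rejected task instead of throwing.
        RejectedExecutionHandler handler = (task, executor) -> rejected.incrementAndGet();
        CountDownLatch release = new CountDownLatch(1);
        // 1 thread + 1 queue slot: at most 2 tasks are accepted while nothing completes.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<Runnable>(1), handler);
        try {
            for (int i = 0; i < submissions; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } finally {
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected.get();
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + countRejections(5));
    }
}
```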

4.2 addWorker method

The source code analysis of addWorker() is as follows:

private boolean addWorker(Runnable firstTask, boolean core) {
  retry:
  // Infinite loop, so that under concurrency the method leaves the loop only under the expected conditions.
  for (;;) {
    // Read ctl and extract the run state of the pool.
    int c = ctl.get();
    int rs = runStateOf(c);
    // If rs > SHUTDOWN, the pool accepts no new tasks and runs no queued tasks: return false.
    // If rs == SHUTDOWN, firstTask is null, and the work queue is not empty, continue with the
    // logic below so that the queued tasks can be executed.
    // firstTask == null is required because a pool in SHUTDOWN accepts no new tasks and may only
    // run the tasks left in the work queue.
    if (rs >= SHUTDOWN &&
        ! (rs == SHUTDOWN &&
           firstTask == null &&
           ! workQueue.isEmpty()))
      return false;
    for (;;) {
      // Number of active threads.
      int wc = workerCountOf(c);
      // If the number of active threads >= CAPACITY, no worker may be added.
      // If core is true, a core thread is requested: refuse when active threads >= corePoolSize.
      // If core is false, an idle thread is requested: refuse when active threads >= maximumPoolSize.
      if (wc >= CAPACITY ||
          wc >= (core ? corePoolSize : maximumPoolSize))
        return false;
      // Try to increment the worker count; on success leave the outer loop and create the worker.
      // On failure keep going inside the loop.
      if (compareAndIncrementWorkerCount(c))
        break retry;
      // Re-read ctl and check whether the run state has changed.
      c = ctl.get();
      // If the run state changed, restart the outer loop.
      // If it did not, continue with the inner loop.
      if (runStateOf(c) != rs)
        continue retry;
    }
  }
  // Flags that record the progress of the worker.
  boolean workerStarted = false;
  boolean workerAdded = false;
  Worker w = null;
  try {
    // Create a new Worker; each Worker holds the thread that actually runs tasks.
    w = new Worker(firstTask);
    final Thread t = w.thread;
    if (t != null) {
      // Lock, so that the update guarded by workerAdded is atomic.
      final ReentrantLock mainLock = this.mainLock;
      mainLock.lock();
      try {
        // Read the pool state.
        int rs = runStateOf(ctl.get());
        // If the pool is running, register the worker.
        // If the pool is in SHUTDOWN and firstTask == null, register a worker to run the queued tasks.
        if (rs < SHUTDOWN ||
            (rs == SHUTDOWN && firstTask == null)) {
          // The thread must not be alive before it has been started here.
          if (t.isAlive())
            throw new IllegalThreadStateException();
          // Cache the worker in the workers set.
          workers.add(w);
          int s = workers.size();
          if (s > largestPoolSize)
            largestPoolSize = s;
          // The worker was added; set the flag.
          workerAdded = true;
        }
      } finally {
        mainLock.unlock();
      }
      // Start the worker thread only after it was added.
      if (workerAdded) {
        // Start the worker thread.
        t.start();
        // Record that it was started.
        workerStarted = true;
      }
    }
  } finally {
    // If the worker was not started, run the failure handling.
    if (! workerStarted)
      addWorkerFailed(w);
  }
  return workerStarted;
}

addWorker() uses two nested infinite loops to check the run state of ThreadPoolExecutor and to increment the number of active threads through CAS. The loops guarantee that, under concurrent calls, each caller leaves only under the expected conditions: either the count was incremented successfully or the worker may not be added.

The method then creates a new Worker and starts the Worker's internal thread. The two flags workerAdded and workerStarted record whether the worker was successfully cached and started.

The block that sets workerAdded runs under ThreadPoolExecutor's mainLock, so that adding to workers and reading its size cannot interleave with other threads in a concurrent environment.

addWorker() starts a worker thread by first creating a Worker object, then obtaining the thread from it, and then calling start() on that thread, so what the started thread actually runs is defined by the Worker object.

Here is a summary of addWorker() as a flowchart:

Several methods called by addWorker() are analyzed here:

4.2.1 runStateOf()

Obtains the ThreadPoolExecutor state from ctl. See Section Three for a detailed analysis.

4.2.2 workerCountOf()

Obtains the number of active threads of ThreadPoolExecutor from ctl. See Section Three for a detailed analysis.

4.2.3 compareAndIncrementWorkerCount()

int c = ctl.get();
if (compareAndIncrementWorkerCount(c)) {...}
private boolean compareAndIncrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect + 1);
}

This method uses CAS to increase the number of active threads in ctl by 1. Why does adding 1 to ctl change only the number of threads? Because the thread count is stored in the lower 29 bits of ctl: as long as those bits do not overflow, adding 1 affects only them and increases the thread count by exactly 1, without touching the thread pool state.
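The same CAS step can be tried on a plain AtomicInteger. In this sketch the constants are copied from the listing in Section Three, and the method takes the AtomicInteger as a parameter (an adaptation for illustration; the real method works on the pool's own ctl field).

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasIncrementSketch {
    static final int COUNT_BITS = Integer.SIZE - 3;
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;
    static final int RUNNING    = -1 << COUNT_BITS;

    // Same body as the method above, with ctl passed in explicitly.
    static boolean compareAndIncrementWorkerCount(AtomicInteger ctl, int expect) {
        return ctl.compareAndSet(expect, expect + 1);
    }

    public static void main(String[] args) {
        AtomicInteger ctl = new AtomicInteger(RUNNING | 3);  // RUNNING with 3 workers
        int c = ctl.get();
        System.out.println("first CAS:  " + compareAndIncrementWorkerCount(ctl, c));
        // c is now stale, so a second attempt with the same expected value fails
        // and the caller has to re-read ctl, as the inner loop of addWorker() does.
        System.out.println("stale CAS:  " + compareAndIncrementWorkerCount(ctl, c));
        System.out.println("workers:    " + (ctl.get() & CAPACITY));
        System.out.println("still RUNNING: " + ((ctl.get() & ~CAPACITY) == RUNNING));
    }
}
```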

4.2.4 addWorkerFailed()

private void addWorkerFailed(Worker w) {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        if (w != null)
            // Remove the worker.
            workers.remove(w);
        // Decrement the number of active threads by 1.
        decrementWorkerCount();
        // Try to terminate the thread pool.
        tryTerminate();
    } finally {
        mainLock.unlock();
    }
}
 
private void decrementWorkerCount() {
    do {} while (! compareAndDecrementWorkerCount(ctl.get()));
}

This method runs when the worker thread fails to start. When can that happen? One case is that, after the number of active threads has been incremented and the Worker has been created, the thread pool state changes to a state above SHUTDOWN, in which the pool neither accepts new tasks nor executes the tasks remaining in the queue. At that point the pool should move towards termination.

In this case the method:

Removes the newly created Worker from the workers set;
Decrements the number of active threads by 1 through an infinite loop + CAS;
Calls tryTerminate() to try to terminate the thread pool.

If tryTerminate() finds the conditions for termination satisfied, the thread pool enters the TERMINATED state.

4.2.5 tryTerminate()

final void tryTerminate() {
    for (;;) {
        int c = ctl.get();
        // In any of the following states the pool cannot move to TERMINATED now, so the attempt
        // is abandoned and the method returns.
        if (isRunning(c) || runStateAtLeast(c, TIDYING) ||
            (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
            return;
        // If the number of active threads is not 0, interrupt at most one idle worker (ONLY_ONE) and return.
        // This is explained in detail below; it is related to why Worker extends AQS but does not use its CLH queue.
        if (workerCountOf(c) != 0) {
            interruptIdleWorkers(ONLY_ONE);
            return;
        }
        // Take the global lock.
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            // First use CAS to change ctl to (rs=TIDYING, wc=0); the checks above guarantee that the pool can reach this state.
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    // Hook method; users can extend ThreadPoolExecutor and override it with custom logic.
                    terminated();
                } finally {
                    // Change ctl to (rs=TERMINATED, wc=0); the pool is now closed.
                    ctl.set(ctlOf(TERMINATED, 0));
                    // Wake up the threads waiting for termination; once awake, they see that the pool is TERMINATED and return.
                    termination.signalAll();
                }
                return;
            }
        } finally {
            // Release the global lock.
            mainLock.unlock();
        }
    }
}

Five, Worker built-in class analysis

5.1 Worker Object Analysis

Source code analysis of Worker object:

private final class Worker extends AbstractQueuedSynchronizer implements Runnable {
  // The worker thread.
  final Thread thread;
  // The submitted task to run first.
  Runnable firstTask;
  // Number of tasks completed so far.
  volatile long completedTasks;
  Worker(Runnable firstTask) {
    // Initial state; -1 keeps interruptIfStarted() from interrupting until runWorker() calls unlock().
    setState(-1);
    this.firstTask = firstTask;
    // Create the thread through the thread factory.
    this.thread = getThreadFactory().newThread(this);
  }
  // Runs the submitted tasks; the actual logic lives in runWorker(). Executed after t.start() in addWorker().
  public void run() {
    runWorker(this);
  }
  // Implementations of some AQS methods.
  protected boolean isHeldExclusively() { ... }
  protected boolean tryAcquire(int unused) { ... }
  protected boolean tryRelease(int unused) { ... }
  public void lock()        { ... }
  public boolean tryLock()  { ... }
  public void unlock()      { ... }
  public boolean isLocked() { ... }
  // Interrupt the thread held by this worker.
  void interruptIfStarted() {
    Thread t;
    if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
      try { t.interrupt(); }
      catch (SecurityException ignore) {}
    }
  }
}

The source above shows two things. Worker implements the Runnable interface, so a Worker is itself a task. Worker also extends AQS, so it has the properties of a lock. Unlike ReentrantLock and similar lock tools, however, Worker does not rely on the CLH wait queue, because the pool has no scenario in which multiple threads queue up for the same Worker. Only the state-maintenance part of AQS is used. This is explained in detail below.

Each Worker object holds one worker thread, thread. When the Worker is initialized, it creates the worker thread through the thread factory and passes itself to that thread as the task. So the thread pool does not directly run the run() method of the submitted task; the thread runs the Worker's run() method, which in turn calls the run() method of the submitted task.

The run() method in Worker delegates to runWorker() in ThreadPoolExecutor, which carries out the actual logic.

Here is a summary:

A Worker is itself a task, and it holds both the task submitted by the user and a worker thread.
The task held by the worker thread is the Worker itself (this), so calling start() on the worker thread runs the Worker's own run() method.
The Worker's run() delegates to the enclosing pool's runWorker() method to carry out the actual logic.
runWorker() calls the run() method of the task submitted by the user, which executes the user's logic.
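The "worker is a task that owns its thread" idea can be sketched without AQS or pool state. MiniWorker below is a hypothetical, heavily simplified stand-in for Worker: it passes itself to the thread factory, runs its first task, then drains a queue. Unlike the real worker it uses a non-blocking poll() and simply exits when the queue is empty.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;

class MiniWorker implements Runnable {
    final Thread thread;               // the thread that runs this worker
    private Runnable firstTask;        // the task handed over at creation
    private final BlockingQueue<Runnable> queue;
    int completedTasks;

    MiniWorker(Runnable firstTask, BlockingQueue<Runnable> queue, ThreadFactory factory) {
        this.firstTask = firstTask;
        this.queue = queue;
        this.thread = factory.newThread(this);  // the worker itself is the thread's task
    }

    @Override
    public void run() {
        Runnable task = firstTask;
        firstTask = null;
        // Simplification: poll() returns null on an empty queue, which ends the loop.
        while (task != null || (task = queue.poll()) != null) {
            try {
                task.run();            // direct call on this thread, no new thread per task
            } finally {
                task = null;
                completedTasks++;
            }
        }
    }

    static List<String> demo() {
        List<String> out = Collections.synchronizedList(new ArrayList<String>());
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
        queue.add(() -> out.add("queued"));
        MiniWorker worker = new MiniWorker(() -> out.add("first"), queue, Thread::new);
        worker.thread.start();         // runs MiniWorker.run(), which runs the user tasks
        try {
            worker.thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        out.add("completed=" + worker.completedTasks);
        return new ArrayList<String>(out);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```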

5.2 runWorker method

The source code of runWorker() is as follows:

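final void runWorker(Worker w) {
  Thread wt = Thread.currentThread();
  // Copy the submitted task and set the Worker's firstTask to null so that it can be reassigned later.
  Runnable task = w.firstTask;
  w.firstTask = null;
  // Move the state from -1 to 0, so that the worker can be interrupted from here on.
  w.unlock();
  boolean completedAbruptly = true;
  try {
    // After the first task is done, keep fetching tasks from the task queue through getTask().
    while (task != null || (task = getTask()) != null) {
      w.lock();
      try {
        // Hook method of ThreadPoolExecutor: users can extend ThreadPoolExecutor and override
        // beforeExecute() to run custom logic before the task executes.
        beforeExecute(wt, task);
        Throwable thrown = null;
        try {
          // Call the run() method of the submitted task.
          task.run();
        } catch (RuntimeException x) {
          ...
        } finally {
          // Hook method of ThreadPoolExecutor, like beforeExecute(), but run after the task has executed.
          afterExecute(task, thrown);
        }
      } finally {
        // Helps the task get garbage-collected.
        task = null;
        w.completedTasks++;
        w.unlock();
      }
    }
    completedAbruptly = false;
  } finally {
    // Reaching this point means the task queue has no more tasks or the pool is shutting down;
    // the worker has to be removed from the workers set.
    processWorkerExit(w, completedAbruptly);
  }
}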

runWorker() is the method that actually executes the submitted task. It does not start the task through Thread.start(); it calls the task's run() method directly on the worker thread.

runWorker() keeps taking tasks from the task queue and executing them.

runWorker() provides two hook methods. If the JDK's ThreadPoolExecutor does not meet a developer's needs, the developer can extend ThreadPoolExecutor and override beforeExecute() and afterExecute() to run custom logic before and after each task, for example to record monitoring metrics or print logs.
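A minimal sketch of such a subclass follows. It only counts hook invocations; HookCountingPool and its counters are hypothetical names, and a real monitoring extension would record timings or push metrics instead.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HookCountingPool extends ThreadPoolExecutor {
    final AtomicInteger before = new AtomicInteger();
    final AtomicInteger after = new AtomicInteger();

    HookCountingPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        before.incrementAndGet();   // runs on the worker thread, just before r.run()
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        after.incrementAndGet();    // runs on the worker thread, after r.run() returned or threw
    }

    /** Runs the given number of no-op tasks and returns {beforeCount, afterCount}. */
    static int[] demo(int tasks) {
        HookCountingPool pool = new HookCountingPool(2);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { pool.before.get(), pool.after.get() };
    }

    public static void main(String[] args) {
        int[] counts = demo(3);
        System.out.println("before=" + counts[0] + " after=" + counts[1]);
    }
}
```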

5.2.1 getTask()

private Runnable getTask() {
    boolean timedOut = false;
    // Retry loop: keep trying until a task is obtained.
    for (;;) {
        ...
        try {
            // Fetch a task from the task queue: wait at most keepAliveTime when timed is set, otherwise block until one arrives.
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}

5.2.2 processWorkerExit()

private void processWorkerExit(Worker w, boolean completedAbruptly) {
    ...
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        completedTaskCount += w.completedTasks;
        // Remove the worker from the workers set.
        workers.remove(w);
    } finally {
        mainLock.unlock();
    }
    // Try to terminate the thread pool.
    tryTerminate();
    ...
}

Six, shutdown () execution process

The thread pool has two methods for actively shutting it down:

shutdown() : changes the thread pool state to SHUTDOWN and interrupts all idle Worker threads in the pool;
shutdownNow() : changes the thread pool state to STOP, interrupts all Worker threads in the pool, and returns a list of the tasks that were still waiting to be processed.

Why are Worker threads divided into idle and non-idle?

From runWorker() above, we know that a Worker thread ideally keeps taking tasks from the task queue and executing them in its while loop. A Worker thread that is executing a task is non-idle; one that is not executing a task is idle. In the SHUTDOWN state the pool accepts no new tasks but still executes the tasks remaining in the queue, so only the idle Worker threads need to be interrupted, while the non-idle threads keep executing queued tasks until the queue is empty. In the STOP state the pool neither accepts new tasks nor executes the remaining ones, so all Worker threads, including those currently running a task, need to be interrupted.
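The difference between the two methods shows up in what happens to queued tasks. The sketch below assumes a single-threaded pool whose only thread is busy with a blocking task while two more tasks wait in the queue; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ShutdownModesSketch {
    /** Returns {queuedTasksThatRan, tasksReturnedByShutdownNow}. */
    static int[] run(boolean useShutdownNow) {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger ran = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        pool.execute(() -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                // shutdownNow() interrupts the running task; just let it end.
            }
        });
        pool.execute(() -> ran.incrementAndGet());  // waits in the queue
        pool.execute(() -> ran.incrementAndGet());  // waits in the queue
        int returned = 0;
        try {
            started.await();                        // the only worker is now busy
            if (useShutdownNow) {
                returned = pool.shutdownNow().size();  // STOP: queued tasks are handed back
            } else {
                pool.shutdown();                       // SHUTDOWN: queued tasks still run
            }
            release.countDown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { ran.get(), returned };
    }

    public static void main(String[] args) {
        System.out.println("shutdown():    " + Arrays.toString(run(false)));
        System.out.println("shutdownNow(): " + Arrays.toString(run(true)));
    }
}
```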

6.1 shutdown()

The source code of shutdown() is as follows:

public void shutdown() {
  // Take the global lock.
  final ReentrantLock mainLock = this.mainLock;
  mainLock.lock();
  try {
    // Check the permission to shut the pool down; this mainly uses the SecurityManager to check
    // the "modifyThread" permission of the current thread against each Worker thread.
    checkShutdownAccess();
    // Change the thread pool state.
    advanceRunState(SHUTDOWN);
    // Interrupt all idle threads.
    interruptIdleWorkers();
    // Hook method; ScheduledThreadPoolExecutor overrides it with its own implementation.
    onShutdown();
  } finally {
    mainLock.unlock();
  }
  // Try to terminate the thread pool.
  tryTerminate();
}

shutdown() splits the shutdown of ThreadPoolExecutor into several steps, each in its own method, and holds the global lock so that only one thread at a time can shut the pool down. It also calls the hook onShutdown(); in the JDK source this hook is package-private, so it serves JDK subclasses rather than user code. For example, ScheduledThreadPoolExecutor uses it to clean up its task queue on shutdown.

The methods it calls are analyzed below.

checkShutdownAccess()

private static final RuntimePermission shutdownPerm = new RuntimePermission("modifyThread");

private void checkShutdownAccess() {
  SecurityManager security = System.getSecurityManager();
  if (security != null) {
    // Check the caller's permission; shutdownPerm is a RuntimePermission created with the name "modifyThread"
    security.checkPermission(shutdownPerm);
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
      for (Worker w : workers)
        // Check that the caller is allowed to modify each worker thread
        security.checkAccess(w.thread);
    } finally {
      mainLock.unlock();
    }
  }
}

advanceRunState()

// targetState = SHUTDOWN
private void advanceRunState(int targetState) {
  for (;;) {
    int c = ctl.get();
    // If the current run state is already >= targetState, do nothing; otherwise update it with CAS
    if (runStateAtLeast(c, targetState) ||
        ctl.compareAndSet(c, ctlOf(targetState, workerCountOf(c))))
      break;
  }
}
private static boolean runStateAtLeast(int c, int s) {
  return c >= s;
}

This method checks whether the current run state of the thread pool is already at least targetState (here, SHUTDOWN), and relies on the way the thread pool states were defined earlier. RUNNING has the high three bits 111, which makes its ctl value negative. All other states are non-negative, and their high three bits are ordered TERMINATED (011) > TIDYING (010) > STOP (001) > SHUTDOWN (000). Because the run state occupies the highest bits, it dominates the numeric value of ctl: for two different states, the order of the ctl values is determined by the state alone, regardless of the number of worker threads stored in the low bits. For example, any ctl value in the TERMINATED state is greater than any ctl value in the TIDYING state.
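The ordering argument can be checked with a few lines of arithmetic. The sketch below copies the state constants and helper methods as they are defined in the JDK source into a standalone class; the class name CtlEncodingDemo is made up for illustration.

```java
// Hypothetical standalone copy of the ctl encoding, for illustration only.
final class CtlEncodingDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 low bits hold the worker count
    static final int CAPACITY   = (1 << COUNT_BITS) - 1; // largest possible worker count

    // Run states live in the high three bits.
    static final int RUNNING    = -1 << COUNT_BITS;      // 111...
    static final int SHUTDOWN   =  0 << COUNT_BITS;      // 000...
    static final int STOP       =  1 << COUNT_BITS;      // 001...
    static final int TIDYING    =  2 << COUNT_BITS;      // 010...
    static final int TERMINATED =  3 << COUNT_BITS;      // 011...

    static int ctlOf(int rs, int wc)               { return rs | wc; }
    static int workerCountOf(int c)                { return c & CAPACITY; }
    static boolean runStateAtLeast(int c, int s)   { return c >= s; }

    public static void main(String[] args) {
        // RUNNING with the maximum worker count is still below SHUTDOWN with zero workers.
        System.out.println(runStateAtLeast(ctlOf(RUNNING, CAPACITY), SHUTDOWN));
        // SHUTDOWN with the maximum worker count is still below STOP with zero workers.
        System.out.println(runStateAtLeast(ctlOf(SHUTDOWN, CAPACITY), STOP));
        // Any STOP value is at least SHUTDOWN.
        System.out.println(runStateAtLeast(ctlOf(STOP, 0), SHUTDOWN));
    }
}
```

Even with the worker count at its maximum, a ctl value never crosses into the range of the next state, which is why a plain integer comparison is enough.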

interruptIdleWorkers()

private void interruptIdleWorkers() {
  interruptIdleWorkers(false);
}
private void interruptIdleWorkers(boolean onlyOne) {
  final ReentrantLock mainLock = this.mainLock;
  mainLock.lock();
  try {
    for (Worker w : workers) {
      Thread t = w.thread;
      // If the worker thread has not been interrupted yet, try to acquire the worker's lock
      if (!t.isInterrupted() && w.tryLock()) {
        try {
          // Interrupt the thread
          t.interrupt();
        } catch (SecurityException ignore) {
        } finally {
          w.unlock();
        }
      }
      // If onlyOne is true, interrupt at most one thread
      if (onlyOne)
        break;
    }
  } finally {
    mainLock.unlock();
  }
}

The method only tries to acquire each worker's lock with tryLock(), and interrupts the thread only if the acquisition succeeds. This is related to the point mentioned earlier that Worker extends AQS without using its CLH wait queue, which will be analyzed later.

The tryTerminate() method has been analyzed before, so I won’t go into more detail here.

6.2 shutdownNow()

public List<Runnable> shutdownNow() {
  List<Runnable> tasks;
  final ReentrantLock mainLock = this.mainLock;
  mainLock.lock();
  try {
    // Check that the caller is allowed to shut down the pool
    checkShutdownAccess();
    // Change the thread pool state to STOP
    advanceRunState(STOP);
    // Interrupt all worker threads
    interruptWorkers();
    // Collect all tasks that are still waiting in the queue
    tasks = drainQueue();
  } finally {
    mainLock.unlock();
  }
  // Try to terminate the thread pool
  tryTerminate();
  // Return the list of unexecuted tasks
  return tasks;
}

This method is similar to shutdown(): it encapsulates the core steps in several methods, of which checkShutdownAccess() and advanceRunState() are the same as before. The methods that differ are explained below.

interruptWorkers()

private void interruptWorkers() {
  final ReentrantLock mainLock = this.mainLock;
  mainLock.lock();
  try {
    // Traverse all Workers and interrupt every one that has been started
    for (Worker w : workers)
      w.interruptIfStarted();
  } finally {
    mainLock.unlock();
  }
}
void interruptIfStarted() {
  Thread t;
  // state >= 0 means the worker has been started. If the worker has been started, its thread
  // is not null, and the thread has not been interrupted yet, interrupt the thread
  if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
    try {
      t.interrupt();
    } catch (SecurityException ignore) {
    }
  }
}

Unlike interruptIdleWorkers(), this method does not try to acquire the worker's lock; it interrupts every started worker thread directly. The reason is that a thread pool in the STOP state must neither process the tasks waiting in the task queue nor leave the tasks in progress undisturbed.
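Note that t.interrupt() only sets the interrupt flag of the worker thread (or wakes it from a blocking call); a running task stops early only if it cooperates. The following sketch, with the hypothetical class name InterruptOnStopDemo, shows a busy task that polls the flag and therefore ends as soon as shutdownNow() is called.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, for illustration only.
final class InterruptOnStopDemo {

    /** Returns true if the running task observed the interrupt sent by shutdownNow(). */
    static boolean taskSawInterrupt() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean sawInterrupt = new AtomicBoolean();

        pool.execute(() -> {
            started.countDown();
            // A busy (non-idle) task that polls the interrupt flag instead of blocking.
            while (!Thread.currentThread().isInterrupted()) {
                Thread.yield();
            }
            sawInterrupt.set(true);
        });
        try {
            started.await();    // make sure the worker is inside the task
            pool.shutdownNow(); // interrupts the worker even though it is busy
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sawInterrupt.get();
    }

    public static void main(String[] args) {
        System.out.println("task saw interrupt: " + taskSawInterrupt());
    }
}
```

A task that never checks the flag and never blocks would simply keep running, so long-running tasks should be written to respond to interruption.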

drainQueue()

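// Moves the tasks in the task queue into a list and returns it. drainTo() is normally enough, but if the queue is a DelayQueue or another kind of queue for which drainTo() may fail to transfer some elements, the remaining tasks are moved one by one in a loop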
private List<Runnable> drainQueue() {
  ...
}
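The comment above describes a two-step approach. A minimal standalone sketch of that approach is shown below; it mirrors the idea (drainTo() first, then a remove loop for whatever is left) rather than quoting the elided method body, and the class name DrainQueueSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical standalone sketch, for illustration only.
final class DrainQueueSketch {

    /** Moves every task out of the queue and returns them in a list. */
    static List<Runnable> drain(BlockingQueue<Runnable> queue) {
        List<Runnable> taskList = new ArrayList<>();
        // Fast path: bulk transfer.
        queue.drainTo(taskList);
        // Fallback: some queues may refuse to hand over certain elements via drainTo(),
        // so remove whatever is left one element at a time.
        if (!queue.isEmpty()) {
            for (Runnable r : queue.toArray(new Runnable[0])) {
                if (queue.remove(r)) {
                    taskList.add(r);
                }
            }
        }
        return taskList;
    }

    public static void main(String[] args) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        queue.add(() -> { });
        queue.add(() -> { });
        System.out.println("drained " + drain(queue).size() + " tasks, queue empty: " + queue.isEmpty());
    }
}
```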

7. Why Worker extends AQS

The conclusion first: Worker extends AQS in order to use its state management, and, unlike ReentrantLock, does not make use of the CLH wait queue of AQS.

Let's first take a look at the methods related to AQS in Worker:

// The parameter is named unused, which makes it clear that it is not used
protected boolean tryAcquire(int unused) {
  // Change the state from 0 to 1 with CAS
  if (compareAndSetState(0, 1)) {
    // Mark the current thread as the exclusive owner
    setExclusiveOwnerThread(Thread.currentThread());
    return true;
  }
  return false;
}
// lock() is only used in runWorker()
public void lock()        { acquire(1); }
public boolean tryLock()  { return tryAcquire(1); }

tryAcquire() in Worker only changes the state from 0 to 1 and ignores its parameter, so we can conclude that the state of a Worker takes one of the values 0 or 1. The initial state of -1 is not considered here, to avoid confusion.

Now look at lock(). Its only call site is in runWorker(), immediately before the worker runs each task. runWorker() is called from run() in Worker, and a Worker is passed as the task only to the worker thread it owns, so run() can only be invoked by that thread after start(). As a result, runWorker(), and therefore lock(), is only ever called by the worker's own thread. Multiple threads never compete for the same lock through lock(), so no thread ever has to be added to a wait queue because another thread holds the lock. That is why Worker does not use the CLH queue of AQS.

This also explains why tryAcquire() does not use the parameter passed to it: a Worker has only two states, locked (non-idle, state=1) or not locked (idle, state=0), so there is no need to pass a parameter to set any other state.

final void runWorker(Worker w) {
  ...
  try {
    while (task != null || (task = getTask()) != null) {
      // The only place where lock() is called
      w.lock();
      ...
    }
  }
}

The above analysis shows that Worker does not use the CLH function of AQS. So how does Worker use the state management function?

In the shutdown() method, one step is to interrupt all idle worker threads. Before interrupting a worker thread, the pool checks whether the worker's lock can be acquired: tryLock() -> tryAcquire() tests whether the worker's state is 0. Only a worker whose lock can be acquired is interrupted, and a worker whose lock can be acquired is an idle worker (state=0). A Worker whose lock cannot be acquired has already executed lock(); it is running a task it took from the blocking queue in the while loop, and shutdown() will not interrupt it.

private void interruptIdleWorkers(boolean onlyOne) {
    ...
  try {
    for (Worker w : workers) {
      Thread t = w.thread;
      if (!t.isInterrupted() && w.tryLock()) { ... }
    }
  }
}

Therefore, the state management of Worker comes down to judging whether the Worker is idle from the value of state (0 or 1). An idle worker (state=0) can be interrupted when the thread pool is shut down. A busy worker (state=1) is left alone: it finishes its current task, releases its lock, and goes back to the while loop to fetch the next task from the blocking queue, exiting only once the queue is empty.
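The idle check can be reproduced with a very small AQS subclass. The sketch below is not the JDK's Worker; MiniWorkerLock is a made-up class that keeps only the 0/1 state logic described above and omits the initial state of -1.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical reduction of Worker's locking logic, for illustration only.
final class MiniWorkerLock extends AbstractQueuedSynchronizer {

    // state 0 = idle, state 1 = busy; the argument is ignored, as in Worker.
    @Override
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    void lock()       { acquire(1); }           // called by the worker before running a task
    boolean tryLock() { return tryAcquire(1); } // called by the pool to test for idleness
    void unlock()     { release(1); }           // called by the worker after the task

    public static void main(String[] args) {
        MiniWorkerLock worker = new MiniWorkerLock();
        System.out.println("idle, tryLock succeeds: " + worker.tryLock());
        worker.unlock();
        worker.lock(); // the worker starts a task
        System.out.println("busy, tryLock succeeds: " + worker.tryLock());
        worker.unlock();
    }
}
```

Because tryAcquire() only accepts the 0 to 1 transition, the lock is not reentrant: tryLock() fails whenever the worker is inside a task, no matter which thread asks, and that is exactly the signal interruptIdleWorkers() needs.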

8. Rejection strategy

This chapter only discusses the four built-in rejection policy handlers of ThreadPoolExecutor.

8.1 CallerRunsPolicy

public static class CallerRunsPolicy implements RejectedExecutionHandler {
  public CallerRunsPolicy() { }
  public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    // If the thread pool has not been shut down, run the task directly in the current thread
    if (!e.isShutdown()) {
      r.run();
    }
  }
}

Executes the rejected task directly in the calling thread. As long as the thread pool is in the RUNNING state, the task is still executed. If the pool is in any non-RUNNING state, the task is silently dropped, which is consistent with the semantics of the thread pool states.
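A small experiment makes the "calling thread" part visible. In the sketch below (the class name CallerRunsDemo is hypothetical) the pool has one thread and a SynchronousQueue, which has no capacity, so a second task submitted while the worker is busy is rejected and handed to CallerRunsPolicy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, for illustration only.
final class CallerRunsDemo {

    /** Returns true if the rejected task was executed by the submitting thread. */
    static boolean rejectedTaskRanInCaller() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);

        // Occupies the only worker thread.
        pool.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // No free worker and no queue capacity: this task is rejected,
        // and CallerRunsPolicy runs it synchronously in the current thread.
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        pool.execute(() -> ranOn.set(Thread.currentThread()));

        release.countDown();
        pool.shutdown();
        return ranOn.get() == Thread.currentThread();
    }

    public static void main(String[] args) {
        System.out.println("ran in caller: " + rejectedTaskRanInCaller());
    }
}
```

One consequence of this policy is that the submitting thread is slowed down while it runs the task, which acts as a simple form of back pressure.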

8.2 AbortPolicy

public static class AbortPolicy implements RejectedExecutionHandler {
  public AbortPolicy() { }
  public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    // Throw a rejection exception
    throw new RejectedExecutionException("Task " + r.toString() +
                                         " rejected from " +
                                         e.toString());
  }
}

Throw a rejection exception directly after the task is rejected.
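The point at which the exception appears can be shown with a pool of one thread and a queue of capacity one. This is a minimal sketch with the made-up class name AbortPolicyDemo: the first task occupies the worker, the second fills the queue, and the third is rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, for illustration only.
final class AbortPolicyDemo {

    /** Submits three tasks and returns how many of them were rejected. */
    static int countRejections() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        int rejected = 0;
        for (int i = 0; i < 3; i++) {
            try {
                // 1st task: runs on the worker; 2nd: waits in the queue; 3rd: rejected.
                pool.execute(blocker);
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        release.countDown();
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected tasks: " + countRejections());
    }
}
```

The exception is thrown in the thread that calls execute(), so the caller has to be prepared to catch RejectedExecutionException.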

8.3 DiscardPolicy

public static class DiscardPolicy implements RejectedExecutionHandler {
  public DiscardPolicy() { }
  // Empty method, does nothing
  public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
  }
}

Discards the task. The rejection method is empty, so nothing is executed and no exception is thrown, which is equivalent to silently discarding the task.

8.4 DiscardOldestPolicy

public static class DiscardOldestPolicy implements RejectedExecutionHandler {
  public DiscardOldestPolicy() { }
  public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    if (!e.isShutdown()) {
      // Take (remove) the task at the head of the blocking queue
      e.getQueue().poll();
      // Try to execute() the current task again
      e.execute(r);
    }
  }
}

Removes the task that entered the queue first (the head of the queue) from the blocking queue, and then calls execute() again to enqueue the current task. This is a typical "out with the old, in with the new" strategy.
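The effect on the queue can be traced with three tasks. In the following sketch (the class name DiscardOldestDemo is hypothetical) task A occupies the single worker, task B waits in a queue of capacity one, and task C triggers the policy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, for illustration only.
final class DiscardOldestDemo {

    /** Returns the names of the tasks that were actually executed, in order. */
    static List<String> executedTasks() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.DiscardOldestPolicy());
        List<String> executed = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch release = new CountDownLatch(1);

        // A occupies the only worker until released.
        pool.execute(() -> {
            try {
                release.await();
                executed.add("A");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        pool.execute(() -> executed.add("B")); // waits in the queue
        pool.execute(() -> executed.add("C")); // rejected: B is dropped, C is queued instead

        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(executed);
    }

    public static void main(String[] args) {
        System.out.println("executed: " + executedTasks());
    }
}
```

Task B is dropped without any notification, so this policy only suits workloads in which the oldest pending task is the least valuable one.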

9. ThreadPoolExecutor secondary development practice

After introducing the core principles of ThreadPoolExecutor, let's take a look at how NexTask, the concurrency framework developed by vivo, builds on the thread pool to speed up both development work and code execution for business developers.

NexTask abstracts common business models, algorithms, and scenarios, and implements them in the form of components. It provides a fast, lightweight, and easy-to-use way of working that shields the underlying technical details, so that developers can write concurrent programs quickly.

First, we give the NexTask architecture diagram, and then we conduct a detailed analysis of where ThreadPoolExecutor is used in the architecture diagram.

// Part of the Executor code:
public class Executor {
  ...
  private static DefaultTaskProcessFactory taskProcessFactory =
      new DefaultTaskProcessFactory();
  // Public API that lets users quickly create a task processor
  public static TaskProcess getCommonTaskProcess(String name) {
    return TaskProcessManager.getTaskProcess(name, taskProcessFactory);
  }
  public static TaskProcess getTransactionalTaskProcess(String name) {
    return TaskProcessManager.getTaskProcessTransactional(name, taskProcessFactory);
  }
  ...
}

Executor is the entry point exposed to callers. Developers can use its simple API to quickly create a task processor, TaskProcess, through the task manager TaskProcessManager.

// Part of the TaskProcessManager code:
public class TaskProcessManager {
  // Cache map: <business name, task processor for that business>
  private static Map<String, TaskProcess> taskProcessContainer =
      new ConcurrentHashMap<String, TaskProcess>();
  ...
}

TaskProcessManager holds a ConcurrentHashMap that serves as a local cache of task processors, in which each task processor is mapped one-to-one to a specific business name. When a task processor is requested, it is looked up in the cache by its business name. This not only keeps the task processing of different businesses isolated from each other, but also avoids the resource cost of repeatedly creating and destroying thread pools.
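The lookup logic of TaskProcessManager is elided in the listing above. The following is only a sketch of how a per-name cache of this kind is commonly written with ConcurrentHashMap.computeIfAbsent(); the class ProcessorRegistry and its get() method are hypothetical and are not part of NexTask.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical registry, for illustration only; not NexTask code.
final class ProcessorRegistry<P> {
    // <business name, processor for that business>
    private final ConcurrentMap<String, P> cache = new ConcurrentHashMap<>();

    /** Returns the cached processor for the name, creating it at most once. */
    P get(String businessName, Function<String, P> factory) {
        return cache.computeIfAbsent(businessName, factory);
    }

    public static void main(String[] args) {
        ProcessorRegistry<Object> registry = new ProcessorRegistry<>();
        AtomicInteger created = new AtomicInteger();
        Function<String, Object> factory = name -> {
            created.incrementAndGet();
            return new Object();
        };
        Object first = registry.get("order", factory);
        Object second = registry.get("order", factory);
        System.out.println("same instance: " + (first == second) + ", created: " + created.get());
    }
}
```

computeIfAbsent() runs the factory atomically per key, so two threads asking for the same business name at the same time still share a single processor and therefore a single thread pool.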

// Part of the TaskProcess code:
public class TaskProcess {
  // Thread pool
  private ExecutorService executor;
  // Initialize the thread pool
  private void createThreadPool() {
    executor = new ThreadPoolExecutor(coreSize, poolSize, 60, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>(2048), new DefaultThreadFactory(domain),
        new ThreadPoolExecutor.AbortPolicy());
  }
  // Submit tasks for processing by multiple threads
  public <T> List<T> executeTask(List<TaskAction<T>> tasks) {
    int size = tasks.size();
    // Create a CountDownLatch sized to the number of tasks, so that the results
    // are returned together after all tasks have been processed
    final CountDownLatch latch = new CountDownLatch(size);
    // Initialize the result containers
    List<Future<T>> futures = new ArrayList<Future<T>>(size);
    List<T> resultList = new ArrayList<T>(size);
    // Traverse all tasks and submit them to the thread pool
    for (final TaskAction<T> runnable : tasks) {
      Future<T> future = executor.submit(new Callable<T>() {
        @Override
        public T call() throws Exception {
          // Run the actual task logic
          try { return runnable.doInAction(); }
          // When the task is done, count the latch down by 1
          finally { latch.countDown(); }
        }
      });
      futures.add(future);
    }
    try {
      // Wait for all tasks to finish
      latch.await(50, TimeUnit.SECONDS);
    } catch (Exception e) {
      log.info("Executing Task is interrupt.");
    }
    // Collect the results and return them
    for (Future<T> future : futures) {
      try {
        T result = future.get(); // wait
        if (result != null) {
          resultList.add(result);
        }
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    }
    return resultList;
  }
  ...
}

Each TaskProcess holds a thread pool. From the initialization of the thread pool, we can see that TaskProcess uses a bounded blocking queue that stores at most 2048 tasks. Once the queue is full and the pool has already reached its maximum number of threads, the pool refuses further tasks and throws a rejection exception.

TaskProcess traverses the list of tasks submitted by the user and submits each of them to the thread pool through submit(). Under the hood, submit() calls ThreadPoolExecutor#execute(), but it first wraps the task in a RunnableFuture. That belongs to the FutureTask framework, so it is not expanded here.

TaskProcess creates a CountDownLatch for each batch of tasks and calls CountDownLatch.countDown() when each task ends. The calling thread blocks on the latch (for at most 50 seconds) and then on each Future, so the results are collected and returned together only after all tasks have been processed.
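The same "submit everything, wait, then gather" pattern can be tried out in isolation. The sketch below is not NexTask code: BatchRunner and runAll() are hypothetical names, and it uses the standard ExecutorService.invokeAll() method in place of the explicit CountDownLatch, since invokeAll() already returns only after every task has completed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not NexTask code.
final class BatchRunner {

    /** Runs all tasks on the pool and returns their non-null results in submission order. */
    static <T> List<T> runAll(ExecutorService pool, List<Callable<T>> tasks) {
        List<T> results = new ArrayList<>(tasks.size());
        try {
            // invokeAll() blocks until every task has completed.
            for (Future<T> future : pool.invokeAll(tasks)) {
                T result = future.get();
                if (result != null) {
                    results.add(result);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
        return results;
    }

    /** Runs three small tasks, one of which returns null, on a two-thread pool. */
    static List<Integer> demo() {
        ExecutorService pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            tasks.add(() -> 1);
            tasks.add(() -> null);
            tasks.add(() -> 3);
            return runAll(pool, tasks);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

As in executeTask(), null results are skipped and a failure in any task is rethrown to the caller as a RuntimeException.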

10. Summary

Although the JDK provides developers with the Executors utility class and a variety of built-in thread pools, those thread pools are quite limited and cannot meet the needs of increasingly complex business scenarios. Alibaba's Java coding guidelines also recommend that developers not use the thread pools that come with the JDK directly, but create thread pools through ThreadPoolExecutor according to their own business scenarios. Therefore, understanding the internal principles of ThreadPoolExecutor is also crucial for using thread pools proficiently in daily development.

This article mainly explores the core principles of ThreadPoolExecutor. It introduces the constructor and the detailed meaning of each construction parameter, as well as the way the core ctl value of the thread pool is encoded and decoded. It then spends a lot of space going through the ThreadPoolExecutor source code to introduce the startup and shutdown process of the thread pool, the core built-in class Worker, and so on. There are other methods of ThreadPoolExecutor that have not been introduced in this article. Readers can go on to read the rest of the source code after reading this article; I believe it will be helpful.

Author: vivo internet server team-Xu Weiteng