
Keywords of this article:

thread, thread pool, single-threaded, multi-threaded, thread pool benefits, thread recycling, creation methods, core parameters, underlying mechanism, rejection policy, parameter settings, dynamic monitoring, thread isolation

Threads and thread pools come up constantly in Java learning and interviews. In this article we work from threads and processes, parallelism and concurrency, and single-threading versus multithreading, all the way to the thread pool itself: its benefits, creation methods, important core parameters, key methods, underlying implementation, rejection strategies, parameter settings, dynamic adjustment, thread isolation, and so on. The main outline is as follows:

Benefits of thread pool

The thread pool manages threads with the idea of pooling. Pooling technology puts resources together and manages them centrally, to maximize benefit and minimize risk. The same idea appears in many places beyond computing, such as finance, business management, and equipment management.

Why a thread pool? In a concurrent scenario, if coders create threads on their own as needed, the following problems can appear:

  • It is hard to determine how many threads are running in the system. If a thread is created whenever it is needed and destroyed as soon as it is done, the cost of constantly creating and destroying threads is high.
  • If a flood of requests arrives, perhaps from crawlers, threads get created frantically and system resources can be exhausted.

What are the benefits of using a thread pool?

  • Reduce resource consumption: pooling reuses threads that have already been created, reducing the cost of thread creation and destruction.
  • Improve response speed: tasks are handled by existing threads, saving the time it takes to create a thread.
  • Manage threads centrally: threads are a scarce resource and cannot be created without limit; the thread pool allocates and monitors them uniformly.
  • Extend with other features: for example, a scheduled thread pool can execute tasks periodically.

In fact, pooling technology is used in many places, such as:

  • Database connection pool: database connections are a scarce resource; creating them in advance improves response speed, and existing connections are reused.
  • Object/instance pool: objects are created ahead of time and recycled through the pool, reducing the cost of creating and destroying them over and over.

Thread pool related classes

The following is the inheritance relationship of the classes related to the thread pool:

Executor

Executor is the top-level interface. It has only one method, execute(Runnable command) , which defines how the thread pool schedules and runs a task. It lays down the basic contract of a thread pool: executing tasks is its fundamental duty.

ExecutorService

ExecutorService extends Executor . It is still an interface, but it adds a number of methods:

  • void shutdown() : initiates an orderly shutdown; previously submitted tasks are still executed, but no new tasks are accepted (the call itself does not block).
  • List<Runnable> shutdownNow() : shuts the pool down immediately, attempts to stop all actively executing tasks, halts the processing of waiting tasks, and returns the list of tasks that were waiting to be executed (never started).
  • boolean isShutdown() : whether the pool has been shut down; threads may still be finishing tasks.
  • boolean isTerminated() : true only if, after shutdown/shutdownNow, all tasks have completed.
  • boolean awaitTermination(long timeout, TimeUnit unit) : after shutdown has been called, blocks until the pool reaches the terminated state, unless the timeout expires or the thread is interrupted.
  • <T> Future<T> submit(Callable<T> task) : submits a task with a return value and returns a Future that does not yet hold a result; calling future.get() returns the result once the task completes.
  • <T> Future<T> submit(Runnable task, T result) : submits a task together with a result to return; the result does not influence execution, it merely fixes the type and the value handed back on completion.
  • Future<?> submit(Runnable task) : submits a task and returns a Future.
  • <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks) : submits tasks in batch and returns the list of Futures.
  • <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks, long timeout, TimeUnit unit) : submits tasks in batch with a timeout.
  • <T> T invokeAny(Collection<? extends Callable<T>> tasks) : blocks and returns the result of the first task to complete.
  • <T> T invokeAny(Collection<? extends Callable<T>> tasks, long timeout, TimeUnit unit) : blocks, returns the result of the first task to complete, with a timeout.

Some readers may wonder about <T> Future<T> submit(Runnable task, T result) : what is this result for?

In fact it has no effect on execution. The pool simply holds on to it, and after the task finishes, future.get() still returns this result. The result and the Runnable are wrapped into a ftask via the Runnable wrapper class RunnableAdapter ; the call() method does nothing special with the result and just returns it. (The concrete implementation lives in Executors.)

    public <T> Future<T> submit(Runnable task, T result) {
        if (task == null) throw new NullPointerException();
        RunnableFuture<T> ftask = newTaskFor(task, result);
        execute(ftask);
        return ftask;
    }

    static final class RunnableAdapter<T> implements Callable<T> {
        final Runnable task;
        final T result;
        RunnableAdapter(Runnable task, T result) {
            this.task = task;
            this.result = result;
        }
        public T call() {
            task.run();
            // return the result that was passed in
            return result;
        }
    }

Another method worth mentioning is invokeAny() : in ThreadPoolExecutor (via ExecutorService ), invokeAny() returns the outcome of the first task that completes; once the first task finishes, the remaining tasks are cancelled by calling interrupt() on them.

Note that ExecutorService is an interface: it contains only definitions, no implementation. The explanations above are based on the method names (the specified contract) and the common implementations.

As you can see, ExecutorService defines the basic operations of a thread pool: shutting it down, checking whether it is shut down or terminated, submitting tasks, submitting tasks in batch, and so on.
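As a quick illustration, here is a minimal usage sketch (the class name, tasks and pool size are made up for this example) that exercises submit(Callable) , submit(Runnable, T result) and invokeAll() :

    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.*;

    public class ExecutorServiceDemo {
        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(2);

            // submit(Callable): the Future carries the computed value
            Future<Integer> sum = pool.submit(() -> 1 + 2);
            System.out.println("sum = " + sum.get());        // blocks until done -> 3

            // submit(Runnable, T result): the given result is returned as-is on completion
            Future<String> marker = pool.submit(() -> System.out.println("side effect"), "done");
            System.out.println("marker = " + marker.get());  // "done"

            // invokeAll: submit a batch and wait for all of them
            List<Callable<Integer>> tasks = Arrays.asList(() -> 10, () -> 20, () -> 30);
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                System.out.println(f.get());
            }

            pool.shutdown();
        }
    }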

AbstractExecutorService

AbstractExecutorService is an abstract class that implements the ExecutorService interface. It is the basic implementation behind most thread pools (we will not pay attention to the scheduled thread pool here). Its main methods are as follows:

It not only implements submit , invokeAll , invokeAny and the other methods, but also provides a newTaskFor method for building RunnableFuture objects; every object from which a task result can be obtained comes from newTaskFor . Without walking through all the source code, take the submit() method as an example:

    public Future<?> submit(Runnable task) {
        if (task == null) throw new NullPointerException();
        // wrap the task
        RunnableFuture<Void> ftask = newTaskFor(task, null);
        // execute the task
        execute(ftask);
        // return the RunnableFuture object
        return ftask;
    }

However, AbstractExecutorService does not implement the most important method, execute() . How the thread pool actually runs tasks is left to the concrete implementations, and different thread pools can implement it differently. They generally extend AbstractExecutorService (scheduled tasks have additional interfaces); the one we use most often is ThreadPoolExecutor .

ThreadPoolExecutor

Here it is!!! ThreadPoolExecutor is the thread pool class we normally use. When people talk about creating a thread pool, unless it is a scheduled thread pool, this is the class involved.

Let's first look at the internal structure (fields) of ThreadPoolExecutor :

public class ThreadPoolExecutor extends AbstractExecutorService {
    // State control: packs the pool state and the worker count into one value;
    // it is the core field, implemented with an atomic class
    private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
    // Number of bits used for the worker count (bit tricks: part of the int holds
    // the worker count, part holds the pool state)
    // Integer.SIZE = 32 bits, so COUNT_BITS is 29
    private static final int COUNT_BITS = Integer.SIZE - 3;
    // Capacity of the pool, i.e. the maximum value representable in 29 bits
    private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

    // Run states, stored in the high-order 3 bits of the 32-bit int
    // 111 (the first bit is the sign bit, 1 means negative): pool is running
    private static final int RUNNING    = -1 << COUNT_BITS; 
    // 000
    private static final int SHUTDOWN   =  0 << COUNT_BITS;
    // 001
    private static final int STOP       =  1 << COUNT_BITS;
    // 010
    private static final int TIDYING    =  2 << COUNT_BITS;
    // 011
    private static final int TERMINATED =  3 << COUNT_BITS;

    // extract the run state
    private static int runStateOf(int c)     { return c & ~CAPACITY; }
    // extract the worker count
    private static int workerCountOf(int c)  { return c & CAPACITY; }
    // build ctl from a run state and a worker count
    private static int ctlOf(int rs, int wc) { return rs | wc; }

    // queue of waiting tasks
    private final BlockingQueue<Runnable> workQueue;
    // reentrant main lock (guards the thread-safety of several operations)
    private final ReentrantLock mainLock = new ReentrantLock();
    // set of worker threads
    private final HashSet<Worker> workers = new HashSet<Worker>();

    // Condition: await() replaces wait(), signal() replaces notify(), signalAll()
    // replaces notifyAll(); everything traditional thread communication can do,
    // Condition can do, and its strength is that several Conditions can be created
    // for the threads sharing one lock
    private final Condition termination = mainLock.newCondition();

    // largest pool size ever reached
    private int largestPoolSize;
    // number of completed tasks
    private long completedTaskCount;
    // thread factory
    private volatile ThreadFactory threadFactory;
    // handler for rejected tasks
    private volatile RejectedExecutionHandler handler;
    // keep-alive time for non-core threads
    private volatile long keepAliveTime;
    // whether core threads are allowed to time out
    private volatile boolean allowCoreThreadTimeOut;
    // number of core threads
    private volatile int corePoolSize;
    // maximum number of worker threads
    private volatile int maximumPoolSize;
    // default rejection handler (AbortPolicy: throws an exception)
    private static final RejectedExecutionHandler defaultHandler =
        new AbortPolicy();
    // runtime permission required to shut the pool down
    private static final RuntimePermission shutdownPerm =
        new RuntimePermission("modifyThread");
    // access control context
    private final AccessControlContext acc;
    // only one thread
    private static final boolean ONLY_ONE = true;
}

Thread pool status

As the code above shows, a single 32-bit integer holds both the state of the thread pool and the number of threads: the top 3 bits are the pool state, and the remaining 29 bits are the worker count:

    // Run states, stored in the high-order 3 bits of the 32-bit int
    // 111 (the first bit is the sign bit, 1 means negative): pool is running
    private static final int RUNNING    = -1 << COUNT_BITS; 
    // 000
    private static final int SHUTDOWN   =  0 << COUNT_BITS;
    // 001
    private static final int STOP       =  1 << COUNT_BITS;
    // 010
    private static final int TIDYING    =  2 << COUNT_BITS;
    // 011
    private static final int TERMINATED =  3 << COUNT_BITS;

Each state has a different meaning, and the transitions between them are as follows:

  • RUNNING: running; new tasks can be accepted and queued tasks processed
  • SHUTDOWN: new tasks are not accepted, but queued tasks are still processed
  • STOP: new tasks are not accepted, queued tasks are not processed, and in-progress tasks are interrupted
  • TIDYING: all tasks have terminated and the worker count is zero
  • TERMINATED: the final state of the thread pool
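To make the bit packing concrete, the following standalone sketch (not the JDK source, just a copy of the same constants and masks) packs a state/count pair into one int and takes it apart again:

    // mirrors how ThreadPoolExecutor packs the run state and worker count into one int
    public class CtlDemo {
        static final int COUNT_BITS = Integer.SIZE - 3;        // 29
        static final int CAPACITY   = (1 << COUNT_BITS) - 1;   // low 29 bits: max worker count

        static final int RUNNING  = -1 << COUNT_BITS;          // 111 in the high 3 bits
        static final int SHUTDOWN =  0 << COUNT_BITS;          // 000
        static final int STOP     =  1 << COUNT_BITS;          // 001

        static int ctlOf(int rs, int wc) { return rs | wc; }        // pack
        static int runStateOf(int c)     { return c & ~CAPACITY; }  // high 3 bits
        static int workerCountOf(int c)  { return c & CAPACITY; }   // low 29 bits

        public static void main(String[] args) {
            int c = ctlOf(RUNNING, 5);                         // RUNNING with 5 workers
            System.out.println(runStateOf(c) == RUNNING);      // true
            System.out.println(workerCountOf(c));              // 5
        }
    }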

Worker implementation

A thread pool must of course have a pool, a place to keep the threads. In ThreadPoolExecutor this shows up as Worker , an inner class:

What the pool actually holds is a set of Workers (labourers that keep taking tasks and completing them), stored in a HashSet :

private final HashSet<Worker> workers = new HashSet<Worker>();

How is Worker implemented?

Besides extending AbstractQueuedSynchronizer (AQS), Worker implements Runnable . AQS is essentially a queue-based lock, used here as a simple mutual-exclusion lock, typically when interrupting a worker or changing its state.

AQS is brought in for thread safety. When a thread executes tasks it calls runWorker(Worker w) ; note that this is not a method of Worker but of ThreadPoolExecutor . As the code below shows, every modification of a Worker is done in a thread-safe way. A Worker holds a Thread , so it can be understood as a wrapper around a thread.

As for how runWorker(Worker w) works, hold that question for now; it is explained in detail later.

    // implements Runnable and wraps a thread
    private final class Worker
        extends AbstractQueuedSynchronizer
        implements Runnable
    {
        // serialization id
        private static final long serialVersionUID = 6138294804551838833L;

        // the thread this worker runs on
        final Thread thread;
        
        // initial task, possibly null; when it is not null it is run directly
        // instead of being added to the task queue
        Runnable firstTask;
        // per-thread completed-task counter
        volatile long completedTasks;

        // give the worker an initial task to keep it busy; the task may be null
        Worker(Runnable firstTask) {
            // initialize the AQS lock state
            setState(-1); // inhibit interrupts until runWorker
            this.firstTask = firstTask;
            // obtain a new thread from the thread factory
            this.thread = getThreadFactory().newThread(this);
        }

        // what actually runs is runWorker
        public void run() {
            // loop, fetching and executing tasks
            runWorker(this);
        }

        // 0 means unlocked
        // 1 means locked
        protected boolean isHeldExclusively() {
            return getState() != 0;
        }
        // exclusive mode: try to acquire the lock, true on success, false on failure
        protected boolean tryAcquire(int unused) {
            // optimistic CAS
            if (compareAndSetState(0, 1)) {
                // success: the current thread owns the lock exclusively
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }
        // exclusive mode: try to release the lock
        protected boolean tryRelease(int unused) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }
        // lock, delegating to AQS
        public void lock()        { acquire(1); }
        // try to lock
        public boolean tryLock()  { return tryAcquire(1); }
        // unlock
        public void unlock()      { release(1); }
        // whether the lock is held
        public boolean isLocked() { return isHeldExclusively(); }

        // interrupt the thread if it has started
        void interruptIfStarted() {
            Thread t;
            if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
                try {
                    t.interrupt();
                } catch (SecurityException ignore) {
                }
            }
        }
    }

Task queue

Besides a place to keep threads, when there are more tasks than threads there has to be somewhere to park tasks as a buffer: the task queue, which appears in the code as:

private final BlockingQueue<Runnable> workQueue;

Rejection policy and handler

A computer's memory is finite, so we cannot keep adding to the queue forever; the thread pool therefore lets us pick from several kinds of queue. At the same time, when there are too many tasks, all threads are busy, and the task queue is also full, some response is needed: reject the task, throw an error, or discard it? And which tasks get dropped? These are the things we may need to customize.

How to create a thread pool

As for how to create a thread pool, ThreadPoolExecutor provides constructors whose main parameters are listed below; any parameter not supplied falls back to a default:

  • Core thread count: the number of core threads, i.e. resident threads that normally are not destroyed when there is no work
  • Maximum thread count: the largest number of threads the pool is allowed to create
  • Non-core thread keep-alive time: how long a non-core thread may stay alive with no task
  • Time unit: unit of the keep-alive time
  • Task queue: where waiting tasks are stored
  • Thread factory
  • Rejection handler: invoked when a task cannot be accepted
    // specify core threads, maximum threads, idle keep-alive time for non-core threads,
    // time unit, and task queue
    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue) {
        this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
             Executors.defaultThreadFactory(), defaultHandler);
    }
    // same as above, plus a thread factory
    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory) {
        this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
             threadFactory, defaultHandler);
    }
    // same as above, plus a rejection handler
    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              RejectedExecutionHandler handler) {
        this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
             Executors.defaultThreadFactory(), handler);
    }
    // in the end they all delegate to this constructor
    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory,
                              RejectedExecutionHandler handler) {
      ...
    }

In fact, besides specifying the parameters above yourself, the JDK also wraps up some factory methods that create a thread pool for you, in Executors :

    // fixed number of threads, unbounded queue
    public static ExecutorService newFixedThreadPool(int nThreads) {
        return new ThreadPoolExecutor(nThreads, nThreads,
                                      0L, TimeUnit.MILLISECONDS,
                                      new LinkedBlockingQueue<Runnable>());
    }
    // single-thread pool, unbounded queue, tasks run serially in submission order
    public static ExecutorService newSingleThreadExecutor(ThreadFactory threadFactory) {
        return new FinalizableDelegatedExecutorService
            (new ThreadPoolExecutor(1, 1,
                                    0L, TimeUnit.MILLISECONDS,
                                    new LinkedBlockingQueue<Runnable>(),
                                    threadFactory));
    }
    // grows and shrinks dynamically: no core threads, all threads are non-core,
    // each idle thread lives 60s, and a SynchronousQueue (direct hand-off, no buffer) is used
    public static ExecutorService newCachedThreadPool() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                      60L, TimeUnit.SECONDS,
                                      new SynchronousQueue<Runnable>());
    }
    // scheduled (timed) task thread pool
    public static ScheduledExecutorService newSingleThreadScheduledExecutor() {
        return new DelegatedScheduledExecutorService
            (new ScheduledThreadPoolExecutor(1));
    }

However, using these pre-packaged thread pools is generally not recommended!!!

The underlying parameters and core methods of the thread pool

If the creation parameters above left you a little confused, don't worry; let's go through them one by one:

When a task comes in, the pool first checks whether the core thread count has been reached; if not, it keeps creating threads. Note the subtlety: suppose a task arrives, a thread is created to run it, the task finishes, and the thread becomes idle. If another task now arrives, will the pool reuse the idle thread or create a new one?

The answer: as long as the core size has not been reached, it creates a new thread, so that the pool quickly ramps up to the core thread count and can respond rapidly to subsequent tasks.

If the thread count has reached the core size, a task arrives, and no thread in the pool is idle, the pool then checks whether the queue is full. If the queue has room, the task is placed in the queue and waits for a thread to pick it up and execute it.

If the task queue is full and no more tasks can be queued, the pool checks whether the thread count has reached the maximum. If not, it keeps creating threads to execute tasks; the threads created at this point are non-core threads.

If the maximum thread count has been reached, no more threads can be created and the only option left is the rejection policy. The default policy (AbortPolicy) throws an exception, and we can also plug in a custom rejection policy.

It is also worth noting that if a burst of tasks caused non-core threads to be created, then once the load drops and a non-core thread cannot get a task for a certain period of time, it is destroyed, until only the core threads remain. That period is the keepAliveTime mentioned above.
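The dispatch order just described (core threads, then the queue, then non-core threads, then rejection) can be observed with a small sketch; the pool sizes, queue capacity and sleep time below are invented purely for illustration:

    import java.util.concurrent.*;

    public class DispatchOrderDemo {
        public static void main(String[] args) {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    2, 4,                         // 2 core threads, at most 4 threads
                    60, TimeUnit.SECONDS,
                    new ArrayBlockingQueue<>(2)); // bounded queue holding 2 tasks

            Runnable longTask = () -> {
                try { Thread.sleep(5_000); } catch (InterruptedException ignored) { }
            };

            for (int i = 1; i <= 7; i++) {
                try {
                    pool.execute(longTask);
                    // tasks 1-2 start core threads, 3-4 wait in the queue, 5-6 start non-core threads
                    System.out.printf("task %d accepted, poolSize=%d, queued=%d%n",
                            i, pool.getPoolSize(), pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    // task 7: 4 threads busy and the queue full -> default AbortPolicy rejects
                    System.out.println("task " + i + " rejected");
                }
            }
            pool.shutdown();
        }
    }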

Submit task

To submit a task we look at execute() . It first reads the pool state and thread count. If the thread count has not reached the core size, a thread is added directly; otherwise the task is placed in the task queue; and if the queue cannot take it, the pool keeps adding threads, but beyond the core size only up to the maximum.

    public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        // read the state and worker count
        int c = ctl.get();
        // if the worker count is below the core size
        if (workerCountOf(c) < corePoolSize) {
            // add a worker directly
            if (addWorker(command, true))
                return;
            // adding failed, re-read ctl
            c = ctl.get();
        }
        // if the pool is running, offer the task to the queue
        if (isRunning(c) && workQueue.offer(command)) {
            // double-check
            int recheck = ctl.get();
            // is the pool still running?
            if (! isRunning(recheck) && remove(command))
                // if not, remove the task and reject it
                reject(command);
            else if (workerCountOf(recheck) == 0)
                // still running but no workers: add one
                addWorker(null, false);
        }else if (!addWorker(command, false))
            // queueing failed and no more workers can be added: reject
            reject(command);
    }

The source above calls an important method: addWorker(Runnable firstTask, boolean core) . Its job is to add a worker thread; let's see how it executes:

    private boolean addWorker(Runnable firstTask, boolean core) {
        // label to retry from here
        retry:
        for (;;) {
            // read the state
            int c = ctl.get();
            int rs = runStateOf(c);

            // rs >= SHUTDOWN means the pool is no longer RUNNING
            // ! (rs == SHUTDOWN && firstTask == null && ! workQueue.isEmpty())
            // means at least one of these three conditions fails:
            //   rs == SHUTDOWN (i.e. the state is beyond SHUTDOWN)
            //   firstTask == null (a new task is being submitted)
            //   ! workQueue.isEmpty() (the queue is empty)
            if (rs >= SHUTDOWN &&
                ! (rs == SHUTDOWN &&
                   firstTask == null &&
                   ! workQueue.isEmpty()))
                return false;

            for (;;) {
                // current worker count
                int wc = workerCountOf(c);
                // check against capacity and the core/maximum bound
                if (wc >= CAPACITY ||
                    wc >= (core ? corePoolSize : maximumPoolSize))
                    return false;
                // CAS increment succeeded: break out of both loops
                if (compareAndIncrementWorkerCount(c))
                    break retry;
                c = ctl.get();  // Re-read ctl
                // the run state changed: retry from the outer loop
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }
        }

        // the worker count was incremented successfully
        boolean workerStarted = false;
        boolean workerAdded = false;
        Worker w = null;
        try {
            // create a worker that wraps the task
            w = new Worker(firstTask);
            final Thread t = w.thread;
            // the thread was created successfully
            if (t != null) {
                // take the main lock
                final ReentrantLock mainLock = this.mainLock;
                mainLock.lock();
                try {
                    // re-check the state
                    int rs = runStateOf(ctl.get());
                    if (rs < SHUTDOWN ||
                        (rs == SHUTDOWN && firstTask == null)) {
                        // fail if the thread has somehow already started
                        if (t.isAlive()) // precheck that t is startable
                            throw new IllegalThreadStateException();
                        // add the worker to the set
                        workers.add(w);
                        // record the size
                        int s = workers.size();
                        // track the largest pool size
                        if (s > largestPoolSize)
                            largestPoolSize = s;
                        // the worker has been added
                        workerAdded = true;
                    }
                } finally {
                    // release the lock
                    mainLock.unlock();
                }
                // if the worker was added
                if (workerAdded) {
                    // start its thread
                    t.start();
                    workerStarted = true;
                }
            }
        } finally {
            // if the thread was not started
            if (! workerStarted)
                // roll back
                addWorkerFailed(w);
        }
        return workerStarted;
    }

Processing task

When we introduced the Worker class earlier, we noted that its run() method actually calls the outer runWorker() method, so let's look at runWorker() :

First of all, it runs its own firstTask directly; that task never goes through the task queue, the worker simply holds it:

final void runWorker(Worker w) {
        // the current thread
        Thread wt = Thread.currentThread();
        // the initial task
        Runnable task = w.firstTask;
        // reset it to null
        w.firstTask = null;
        // allow interrupts
        w.unlock();
        boolean completedAbruptly = true;
        try {
            // the held task is not null, or a task can be fetched from the queue
            while (task != null || (task = getTask()) != null) {
                // lock the worker
                w.lock();
                // If the pool is stopping, ensure the thread is interrupted;
                // if not, ensure it is not interrupted. This requires a recheck
                // in the second case to deal with a shutdownNow race while
                // clearing the interrupt flag
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    // callback before execution (we can override it)
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        // run the task
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        // callback after execution
                        afterExecute(task, thrown);
                    }
                } finally {
                    // clear the task
                    task = null;
                    // bump the completed-task counter
                    w.completedTasks++;
                    w.unlock();
                }
            }
            // finished normally
            completedAbruptly = false;
        } finally {
            // handle the worker's exit
            processWorkerExit(w, completedAbruptly);
        }
    }

As shown above, if the current task is null the worker tries to fetch one. Let's look at getTask() . Two conditions are involved: whether core threads are allowed to time out, and whether the thread count exceeds the core size. When either holds, the task is taken from the queue with a timeout; if nothing is fetched within the timeout, null is returned, meaning no task was obtained. In that case the loop above ends and the worker moves on to the thread-destruction path, processWorkerExit() .

private Runnable getTask() {
    boolean timedOut = false; // Did the last poll() time out?

    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // in SHUTDOWN, keep processing queued tasks but accept no new ones;
        // at STOP or beyond, or with an empty queue, give up
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            decrementWorkerCount();
            return null;
        }
        // worker count
        int wc = workerCountOf(c);

        // should the poll time out: core threads may time out, or we are above the core size
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            // if decrementing the count succeeds, return null; processWorkerExit() cleans up
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
        }

        try {
            // if core threads may die, or we are above the core size, poll with the
            // keep-alive timeout; otherwise block on take()
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            // got a task: it will definitely be executed
            if (r != null)
                return r;
            // otherwise no task was obtained: we timed out
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}

Destroy thread

As mentioned earlier, if a thread has no current task and either core threads are allowed to time out or the thread count exceeds the core size, and after waiting for the keep-alive time no task is obtained from the queue, the worker breaks out of its loop and reaches the thread-destruction (exit) code below. So what happens when a worker is destroyed?

    private void processWorkerExit(Worker w, boolean completedAbruptly) {
        // if the worker ended abruptly, the worker count has not been adjusted yet,
        // so adjust it here
        if (completedAbruptly)
            decrementWorkerCount();
        // take the main lock
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
      
        try {
            // accumulate the completed-task count
            completedTaskCount += w.completedTasks;
            // remove the worker
            workers.remove(w);
        } finally {
            // release the lock
            mainLock.unlock();
        }
        // try to terminate the pool
        tryTerminate();
        // read the state
        int c = ctl.get();
        // below STOP, i.e. RUNNING or SHUTDOWN
        if (runStateLessThan(c, STOP)) {
            // if the worker did not end abruptly
            if (!completedAbruptly) {
                // the minimum is either 0 or the core size; if core threads may time out, it is 0
                int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
                // if the minimum is 0 but the queue is not empty, keep at least one thread
                if (min == 0 && ! workQueue.isEmpty())
                    min = 1;
                // as long as the worker count is at least the minimum, let this worker die
                if (workerCountOf(c) >= min)
                    return; // replacement not needed
            }
            // otherwise a replacement worker may be needed
            addWorker(null, false);
        }
    }

How to stop the thread pool

To stop the thread pool you can use shutdown() or shutdownNow() : shutdown() keeps processing the tasks already in the queue, whereas shutdownNow() clears the queue immediately and returns the tasks that were never executed.

    public void shutdown() {
        // take the main lock
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            // check shutdown permission
            checkShutdownAccess();
            // advance the state to SHUTDOWN
            advanceRunState(SHUTDOWN);
            // interrupt idle workers
            interruptIdleWorkers();
            // callback hook
            onShutdown(); // hook for ScheduledThreadPoolExecutor
        } finally {
            mainLock.unlock();
        }
        tryTerminate();
    }
    // stop immediately
   public List<Runnable> shutdownNow() {
        List<Runnable> tasks;
        // take the main lock
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            // check shutdown permission
            checkShutdownAccess();
            // advance the state to STOP
            advanceRunState(STOP);
            // interrupt all workers
            interruptWorkers();
            // drain the queue
            tasks = drainQueue();
        } finally {
            mainLock.unlock();
        }
        tryTerminate();
        // return the list of (unexecuted) tasks
        return tasks;
    }
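A typical graceful-shutdown sequence combines the two methods with awaitTermination() ; a minimal sketch (the pool, the task and the 30-second grace period are assumptions for this example):

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ShutdownDemo {
        public static void main(String[] args) {
            ExecutorService pool = Executors.newFixedThreadPool(2);
            pool.submit(() -> System.out.println("work"));

            pool.shutdown();              // stop accepting new tasks, keep draining the queue
            try {
                if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                    // still not terminated after the grace period: interrupt running tasks
                    List<Runnable> neverRan = pool.shutdownNow();
                    System.out.println(neverRan.size() + " queued tasks were never executed");
                }
            } catch (InterruptedException e) {
                pool.shutdownNow();
                Thread.currentThread().interrupt();
            }
        }
    }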

execute() and submit() methods

  • The execute() method submits tasks that need no return value; there is no way to tell whether the task was executed successfully by the pool.
  • The submit() method submits tasks that need a return value. The pool returns a Future, and calling get() on it blocks until the task finishes and its result is available. We can also use the timed get(long timeout, TimeUnit unit) : it waits at most the given time and, if the task has not finished, throws a TimeoutException instead of blocking indefinitely (a sketch follows the interface definition below). What submit() hands back is a RunnableFuture object, which extends both Runnable and Future<V> :
public interface RunnableFuture<V> extends Runnable, Future<V> {
    /**
     * Sets this Future to the result of its computation
     * unless it has been cancelled.
     */
    void run();
}
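A minimal sketch of the difference (the 500 ms timeout and the sleeping task are invented for illustration):

    import java.util.concurrent.*;

    public class SubmitVsExecuteDemo {
        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(1);

            // execute(): fire and forget, no handle to the result
            pool.execute(() -> System.out.println("executed"));

            // submit(): returns a Future; get(timeout) throws TimeoutException
            // instead of waiting forever
            Future<String> future = pool.submit(() -> {
                Thread.sleep(2_000);
                return "result";
            });
            try {
                System.out.println(future.get(500, TimeUnit.MILLISECONDS));
            } catch (TimeoutException e) {
                System.out.println("not done within 500 ms, cancel it");
                future.cancel(true);
            }
            pool.shutdown();
        }
    }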

Why do thread pools use blocking queues?

A blocking queue is first of all a queue, with first-in, first-out semantics.

Blocking is an evolution of that model. An ordinary queue already works for the producer-consumer pattern, i.e. data sharing: some threads put tasks in, others keep taking tasks out. That is the ideal situation.

But reality is rarely ideal: tasks are produced and consumed at different rates. If tasks pile up in the queue and consumption is slow, the consumer can simply take its time, or the producer can be made to pause (blocking the producer thread). With offer(E o, long timeout, TimeUnit unit) you set a waiting time, and if the element cannot be added to the BlockingQueue within that time, failure is returned; with put(Object) the call blocks until there is room and the element can be inserted.

If consumption is fast and the producer cannot keep up, the consumer can use poll(time) to fetch a task: if there is data it is returned immediately, otherwise the call waits up to time and then returns null . It can also use take() , which removes the head of the queue and blocks until a task appears.
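A small sketch of these four methods on an ArrayBlockingQueue (the capacity of 2 and the timeouts are chosen arbitrarily):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.TimeUnit;

    public class BlockingQueueDemo {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);

            queue.put("task-1");                                   // blocks if the queue is full
            boolean ok = queue.offer("task-2", 100, TimeUnit.MILLISECONDS); // waits at most 100 ms
            System.out.println("offer accepted: " + ok);           // true, there was room

            System.out.println(queue.take());                      // blocks until an element exists
            String next = queue.poll(100, TimeUnit.MILLISECONDS);  // null if nothing within 100 ms
            System.out.println("polled: " + next);                 // "task-2"
        }
    }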

Those are the properties of a blocking queue; so why does the thread pool use one?

  • If every task were accepted and buffered without limit as it arrives, resources would easily be exhausted; a bounded blocking queue provides back-pressure.
  • Creating a thread requires acquiring the pool's global lock; if every idle thread kept locking, unlocking and context-switching while polling for work, the overhead would be large. It is better to block and wait when the queue is empty.

Common blocking queue

  • ArrayBlockingQueue : array-based, with a fixed-length array inside; it tracks the head and tail positions of the queue.
  • LinkedBlockingQueue : linked-list based; producers and consumers use separate locks, so it has good parallelism. If no capacity is given, the default is Integer.MAX_VALUE, effectively unbounded, which can exhaust memory.
  • DelayQueue : a delay queue with no size limit; producing never blocks, consuming does, and an element can only be taken once its delay has expired.
  • PriorityBlockingQueue : a priority-based blocking queue; elements are consumed in priority order, with an internal lock for synchronization.
  • SynchronousQueue : has no buffer at all; the producer hands the task directly to the consumer, removing the intermediate buffering step.

How does the thread pool reuse threads? What happens to a thread when it finishes?

The earlier source-code walkthrough already answers this. The run() method of a pool thread actually calls runWorker() , which loops endlessly unless it fails to obtain a task: when there is no firstTask and fetching from the queue times out with nothing returned, the pool checks whether core threads may be destroyed or whether the count exceeds the core size, and if the conditions are met, the current thread is terminated.

Otherwise, it will always be in a loop and will not end.

We know start() can be called only once per thread, so the thread's run() calls the outer runWorker() , and inside runWorker() it keeps looping to obtain tasks; for each task it fetches, it calls that task's run() method directly rather than starting a new thread.

A worker that finishes calls processWorkerExit() . As analyzed earlier, that method takes the lock, decrements the thread count, and removes the worker from the set. After removal it checks whether there are now too few threads and, if so, adds one back; personally I see this as a remedial step.

How to configure thread pool parameters?

Generally speaking there is a formula: for computation (CPU) intensive tasks, set the core thread count to roughly the number of processor cores + 1; for IO-intensive work (many network requests), it can be set to about 2 * the number of processor cores. But this is no silver bullet; everything has to be grounded in reality, and it is best to load-test in a test environment to learn from practice. In many cases one machine runs more than one thread pool plus other threads, so the parameters should not be pushed to the limit.

As a rough guide, on an 8-core machine 10-12 core threads is about right, but the numbers must be worked out from the actual workload. Too many threads means context switching and fierce contention; too few means the machine's resources cannot be fully used.

Computation (CPU) intensive work mainly consumes CPU. The thread count can be set to N (the number of CPU cores) + 1; the one extra thread exists so that when a thread occasionally pauses, due to a page fault or some other reason, the CPU does not sit idle: the extra thread can use that idle time.

IO-intensive systems spend most of their time on I/O interaction. While a thread waits on I/O it does not occupy the CPU, which can be handed to other threads, so IO-intensive applications can be configured with more threads; the usual rule of thumb is 2N.
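A trivial sketch of these rules of thumb, reading the core count at runtime; the resulting numbers are starting points to be validated by load testing, not fixed answers:

    public class PoolSizing {
        public static void main(String[] args) {
            int n = Runtime.getRuntime().availableProcessors();

            int cpuBound = n + 1;   // CPU-bound: roughly one thread per core plus one spare
            int ioBound  = 2 * n;   // IO-bound: rule of thumb only, verify under real load

            System.out.println("cores=" + n + ", cpuBound=" + cpuBound + ", ioBound=" + ioBound);
        }
    }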

Why not recommend the default thread pool creation method?

In Alibaba's programming guidelines, creating pools via the default factory methods is discouraged: pools created that way rely on default parameters, the creator often does not know them well, and problems follow easily. It is best to create the pool with new ThreadPoolExecutor(...) directly, which makes it easy to control every parameter. The problems with the default factories are:

  • Executors.newFixedThreadPool(): unbounded queue; memory can blow up
  • Executors.newSingleThreadExecutor(): a single thread with an unbounded queue; strictly serial, and the queue can also grow without limit
  • Executors.newCachedThreadPool(): no core threads, and the maximum thread count is effectively unlimited, so memory can blow up

Creating the pool with explicit parameters forces developers to understand what each parameter does, so parameters are not set blindly, which reduces memory overflows and similar problems.

This generally comes down to a few questions:

  • How to set up the task queue?
  • How many core threads?
  • What is the maximum number of threads?
  • How to reject the task?
  • Threads are created without names, which makes tracing a problem back to its source difficult (see the sketch after this list).
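Putting answers to those questions together, here is a minimal sketch of manual creation; the pool name "order-pool", the sizes, the queue capacity and the choice of CallerRunsPolicy are all illustrative, not recommendations:

    import java.util.concurrent.*;
    import java.util.concurrent.atomic.AtomicInteger;

    public class ManualPoolDemo {
        public static void main(String[] args) {
            // a ThreadFactory that names threads so they can be traced in logs / thread dumps
            ThreadFactory named = new ThreadFactory() {
                private final AtomicInteger seq = new AtomicInteger(1);
                @Override
                public Thread newThread(Runnable r) {
                    return new Thread(r, "order-pool-" + seq.getAndIncrement());
                }
            };

            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    4,                                  // core threads
                    8,                                  // maximum threads
                    60, TimeUnit.SECONDS,               // idle time for non-core threads
                    new ArrayBlockingQueue<>(200),      // bounded queue, no unbounded growth
                    named,
                    new ThreadPoolExecutor.CallerRunsPolicy()); // push back on the submitter

            pool.execute(() -> System.out.println(Thread.currentThread().getName() + " running"));
            pool.shutdown();
        }
    }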

Thread pool rejection strategy

The thread pool ships with the following four rejection policies, which you can see in its inner classes:

  • AbortPolicy: do not run the new task; throw a RejectedExecutionException to signal that the pool is full (this is the default)
  • DiscardPolicy: do not run the new task and do not throw anything; fail silently
  • DiscardOldestPolicy: discard the oldest task in the queue and enqueue the new task in its place
  • CallerRunsPolicy: run the task directly on the thread that called execute()

Generally speaking, none of these policies is ideal on its own. If the pool is full, the first thing to check is whether the task really matters. If it is not essential and not core, rejecting it with an error message is acceptable; if it must not be lost, then whether through MQ messages or some other means, the task has to be preserved. Throughout, logging is essential: protect the thread pool, but also stay accountable to the business.
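If none of the built-in policies fits, RejectedExecutionHandler can be implemented directly. The sketch below only logs and retries the queue once; the "persist or send to MQ" branch is a placeholder for whatever durability mechanism the business actually requires:

    import java.util.concurrent.RejectedExecutionHandler;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // custom rejection handler: log the event, retry the queue briefly, and hand the task
    // off elsewhere if it still cannot be accepted
    public class LogAndRetryPolicy implements RejectedExecutionHandler {
        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
            System.err.println("task rejected, pool=" + executor); // replace with real logging
            try {
                // try once more to enqueue with a short timeout (only meaningful while
                // the pool is still running)
                boolean requeued = executor.getQueue().offer(r, 100, TimeUnit.MILLISECONDS);
                if (!requeued) {
                    // last resort: persist or send to MQ so the task is not silently dropped
                    System.err.println("could not requeue, task must be saved elsewhere");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }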

Thread pool monitoring and dynamic adjustment

The thread pool exposes APIs for reading its state dynamically and also for changing its parameters and state:

View the status of the thread pool:

Modify the state of the thread pool:
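A sketch of the main getters and setters involved; all of the methods are real ThreadPoolExecutor APIs, while the pool parameters themselves are arbitrary example values:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class PoolMonitorDemo {
        public static void main(String[] args) {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    2, 4, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(100));

            // read-only view of the pool
            System.out.println("poolSize        = " + pool.getPoolSize());
            System.out.println("activeCount     = " + pool.getActiveCount());
            System.out.println("queueSize       = " + pool.getQueue().size());
            System.out.println("completedTasks  = " + pool.getCompletedTaskCount());
            System.out.println("largestPoolSize = " + pool.getLargestPoolSize());

            // parameters can be changed at runtime
            pool.setCorePoolSize(4);
            pool.setMaximumPoolSize(8);
            pool.setKeepAliveTime(30, TimeUnit.SECONDS);
            pool.allowCoreThreadTimeOut(true);

            pool.shutdown();
        }
    }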

On this topic, Meituan's thread pool article explains it very clearly and even implements real-time adjustment of thread pool parameters, with tracking and monitoring of pool activity, task execution (frequency, time spent), rejection exceptions, internal pool statistics, and so on. I will not expand on it here; the original article is https://tech.meituan.com/2020/04/02/java-pooling-pratice-in-meituan.html , and it is an approach well worth referring to.

Thread pool isolation

Thread isolation: many readers may know that different tasks are run on different threads; thread pool isolation is usually split by business type. For example, order processing tasks go into one thread pool, and member-related processing goes into another.

Pools can also be split into core and non-core: the core processing flows share one pool and the non-core flows share another, each with different parameters and different rejection policies. The goal is that the pools do not affect each other, the core flow keeps running as much as possible, and the non-core flow can tolerate failure.

Hystrix uses this technique: its thread isolation prevents an avalanche spreading between different network requests, so even if the thread pool of one dependent service is exhausted, the rest of the application is not affected.
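A minimal sketch of that idea (the pool sizes, queue capacities and the order/member split are invented): each business type gets its own pool and its own rejection policy, so a backlog in one cannot starve the other.

    import java.util.concurrent.*;

    public class IsolationDemo {
        // orders (core business) and member updates (non-core) run in separate pools
        private final ExecutorService orderPool = new ThreadPoolExecutor(
                8, 16, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(500),
                new ThreadPoolExecutor.CallerRunsPolicy());   // core: push back rather than drop
        private final ExecutorService memberPool = new ThreadPoolExecutor(
                2, 4, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(100),
                new ThreadPoolExecutor.DiscardPolicy());      // non-core: tolerate dropped tasks

        void handleOrder(Runnable task)  { orderPool.execute(task); }
        void handleMember(Runnable task) { memberPool.execute(task); }
    }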

About the author

Qin Huai, author of [Qinhuai Grocery Store]. The road of technology is not conquered in a day; the mountains are high and the rivers long, and even if slow, I will not stop. I write mainly about Java source code analysis, JDBC, Mybatis, Spring, redis, distributed systems, offer, LeetCode, etc. I write every article carefully; I dislike clickbait and flashy gimmicks, and I mostly write series. I cannot guarantee that everything I write is completely correct, but I do guarantee that it has been practiced or checked against sources. Corrections for any omissions or errors are welcome.

What did I write in 2020?

open source programming notes

