A glimpse of the Java thread pool source code

What is a thread pool:

A thread pool is a pattern for using threads. Threads are created once and kept in a pool, so a task can borrow an existing thread instead of paying the cost of creating and destroying one every time. The pool also caps and manages the number of threads, which avoids creating so many that the JVM fails with an OutOfMemoryError (OOM).

How to use multithreading in Java

1. Extend the Thread class, or implement the Runnable/Callable interface and hand the task to a Thread (a Callable has to be wrapped in a FutureTask first), then call start() to start the thread.

2. Use a thread pool, which is what this article is about.

The problem with the first approach is obvious: creating and destroying threads over and over is expensive, and nothing limits how many threads get created. Real projects should not use multithreading this way.
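To make the first approach concrete, here is a minimal sketch of the three hand-rolled variants. The class name ManualThreads and its runAll helper are made up for this illustration; only Thread, Runnable, Callable and FutureTask come from the JDK.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;

class ManualThreads {
    // Starts one thread per style and returns the Callable's result.
    static int runAll() throws Exception {
        // 1a. Extend Thread and override run().
        Thread subclassed = new Thread() {
            @Override
            public void run() {
                System.out.println("subclass running on " + getName());
            }
        };

        // 1b. Implement Runnable and hand it to a Thread.
        Runnable runnable = () -> System.out.println("runnable running");
        Thread fromRunnable = new Thread(runnable);

        // 1c. A Callable cannot be given to a Thread directly,
        //     so wrap it in a FutureTask (which is a Runnable).
        Callable<Integer> callable = () -> 1 + 2;
        FutureTask<Integer> future = new FutureTask<>(callable);
        Thread fromCallable = new Thread(future);

        // start() is what creates the new thread. Calling run() or call()
        // directly would simply execute on the current thread.
        subclassed.start();
        fromRunnable.start();
        fromCallable.start();

        subclassed.join();
        fromRunnable.join();
        return future.get(); // blocks until the Callable has finished
    }

    public static void main(String[] args) throws Exception {
        System.out.println("callable result = " + runAll());
    }
}
```

Every call to runAll creates three threads and throws them away afterwards, which is exactly the overhead a pool is meant to remove.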

Thread pool Executor framework in Java

The Executor framework is Java's thread pool implementation. It separates the unit of work (the task) from the unit of execution (the thread). The framework has three parts:

1. The task: a class that implements the Runnable or Callable interface. The work you actually want done goes in its run/call method.

2. The execution unit: the threads cached in the pool, which run your tasks.

3. The result of the asynchronous computation: when the task implements Callable, the submit method wraps the return value in a Future object and returns it to the caller.
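The three parts can be seen side by side in a short sketch. ExecutorParts and sumViaPool are hypothetical names chosen for this example, and the pool parameters (one thread, a queue of 10) are arbitrary.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecutorParts {
    static int sumViaPool(int a, int b) throws Exception {
        // Part 2, the execution unit: a pool holding a single worker thread.
        ExecutorService pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(10));
        try {
            // Part 1, the task: the real work lives in call().
            Callable<Integer> task = () -> a + b;

            // Part 3, the result: submit() returns a Future right away.
            Future<Integer> result = pool.submit(task);

            // get() blocks until the worker thread has produced the value.
            return result.get();
        } finally {
            pool.shutdown(); // let the worker thread exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("2 + 3 = " + sumViaPool(2, 3));
    }
}
```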

Pre-thinking:

A classic interview question asks: can start() be called on the same thread more than once? The answer is no, and the source code shows why:

public synchronized void start() {
        /**
         * This method is not invoked for the main method thread or "system"
         * group threads created/set up by the VM. Any new functionality added
         * to this method in the future may have to also be added to the VM.
         *
         * A zero status value corresponds to state "NEW".
         */
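        /**
          start() only proceeds when the thread status is 0, i.e. NEW.
          Once start() has been called the status is no longer NEW
          (the thread becomes RUNNABLE and eventually TERMINATED),
          so a second call throws IllegalThreadStateException.
          The possible states are listed in the nested enum State
          and are not repeated here.
        */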
        if (threadStatus != 0)
            throw new IllegalThreadStateException();

        /* Notify the group that this thread is about to be started
         * so that it can be added to the group's list of threads
         * and the group's unstarted count can be decremented. */
        group.add(this);

        boolean started = false;
        try {
            start0();
            started = true;
        } finally {
            try {
                if (!started) {
                    group.threadStartFailed(this);
                }
            } catch (Throwable ignore) {
                /* do nothing. If start0 threw a Throwable then
                  it will be passed up the call stack */
            }
        }
    }
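The status check above is easy to observe from the outside. The following sketch (StartTwice is a made-up class name) starts a thread, waits for it to finish, and then tries to start it again.

```java
class StartTwice {
    // Returns true if the second start() is refused.
    static boolean secondStartFails() throws InterruptedException {
        Thread t = new Thread(() -> { });
        System.out.println("before start: " + t.getState()); // NEW

        t.start();
        t.join(); // wait until the thread has finished
        System.out.println("after join:   " + t.getState()); // TERMINATED

        try {
            t.start(); // status is no longer NEW, so this must fail
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second start refused: " + secondStartFails());
    }
}
```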

This raises a question. A thread pool reuses its threads, so how does it keep a thread from terminating after it finishes a task, and have it wait for the next one instead? How would we build that ourselves? A reasonable guess is a producer-consumer model.
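To test that guess before reading the JDK code, here is a deliberately tiny producer-consumer pool. It is only a sketch of the idea, not how ThreadPoolExecutor is written: TinyPool and all of its members are invented for this example, there is no rejection or resizing, and a task that throws would kill its worker.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

class TinyPool {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread[] workers;

    TinyPool(int size) {
        workers = new Thread[size];
        for (int i = 0; i < size; i++) {
            workers[i] = new Thread(this::workLoop, "tiny-worker-" + i);
            workers[i].setDaemon(true);
            workers[i].start();
        }
    }

    // Consumer side: each worker loops forever instead of ending after one task.
    private void workLoop() {
        try {
            while (true) {
                Runnable task = queue.take(); // blocks while the queue is empty
                task.run();
            }
        } catch (InterruptedException e) {
            // interrupted by shutdownNow(): fall through and let the thread end
        }
    }

    // Producer side: callers only enqueue work.
    void submit(Runnable task) {
        queue.add(task);
    }

    void shutdownNow() {
        for (Thread w : workers) {
            w.interrupt();
        }
    }

    // Runs `tasks` tasks on `size` workers and reports how many distinct threads ran them.
    static int distinctThreadsUsed(int size, int tasks) throws InterruptedException {
        TinyPool pool = new TinyPool(size);
        Set<String> names = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                names.add(Thread.currentThread().getName());
                done.countDown();
            });
        }
        done.await();
        pool.shutdownNow();
        return names.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("threads used for 10 tasks: " + distinctThreadsUsed(2, 10));
    }
}
```

Ten tasks never use more than the two worker threads, because each worker goes back to the queue for more work instead of terminating.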

Use thread pool:

Using a thread pool is not complicated. There are two ways to create one:

1. The Executors utility class (a thread pool factory) creates several preconfigured pools and hides the parameter settings from you. Getting a pool this way has drawbacks: the defaults it picks may not suit your workload, a poor fit can end in an OOM, and it does nothing for your understanding of how the pool works. Interested readers can study the preconfigured pools on their own, but it is best not to create pools this way in project code; use the next approach instead.

2. Create the pool directly with ThreadPoolExecutor.

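/**corePoolSize: number of core threads
   maximumPoolSize: maximum number of threads
   keepAliveTime: how long an idle thread waits for work before it is destroyed (by default this applies only to non-core threads, but allowCoreThreadTimeOut lets core threads time out too)
   unit: time unit of keepAliveTime
   workQueue: blocking queue that buffers tasks (a task that arrives while all core threads are busy is placed in this queue first)
   threadFactory: factory used to create the threads
   handler: rejection policy (invoked when a task can neither be executed nor queued)
*/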
public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory,
                              RejectedExecutionHandler handler) {
        if (corePoolSize < 0 ||
            maximumPoolSize <= 0 ||
            maximumPoolSize < corePoolSize ||
            keepAliveTime < 0)
            throw new IllegalArgumentException();
        if (workQueue == null || threadFactory == null || handler == null)
            throw new NullPointerException();
        this.acc = System.getSecurityManager() == null ?
                null :
                AccessController.getContext();
        this.corePoolSize = corePoolSize;
        this.maximumPoolSize = maximumPoolSize;
        this.workQueue = workQueue;
        this.keepAliveTime = unit.toNanos(keepAliveTime);
        this.threadFactory = threadFactory;
        this.handler = handler;
    }
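As an illustration of the second approach, the sketch below builds a pool with every constructor parameter spelled out. PoolFactory, the thread-name prefix and all the numbers are arbitrary choices for this example. The second method shows one reason the Executors defaults can lead to an OOM; it relies on newFixedThreadPool returning a ThreadPoolExecutor, which holds for the OpenJDK implementation but is not promised by the method's declared return type.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolFactory {
    static ThreadPoolExecutor newBoundedPool() {
        AtomicInteger seq = new AtomicInteger();
        // Naming the threads makes thread dumps much easier to read.
        ThreadFactory namedThreads =
                r -> new Thread(r, "demo-pool-" + seq.incrementAndGet());

        return new ThreadPoolExecutor(
                2,                                      // corePoolSize
                4,                                      // maximumPoolSize
                30L, TimeUnit.SECONDS,                  // keepAliveTime + unit
                new ArrayBlockingQueue<>(100),          // bounded workQueue
                namedThreads,                           // threadFactory
                new ThreadPoolExecutor.AbortPolicy());  // handler: throw on overflow
    }

    // Remaining capacity of the queue behind Executors.newFixedThreadPool.
    // Assumption: the returned object is a ThreadPoolExecutor (true for OpenJDK).
    static int fixedPoolQueueCapacity() {
        ThreadPoolExecutor fixed = (ThreadPoolExecutor) Executors.newFixedThreadPool(2);
        try {
            return fixed.getQueue().remainingCapacity();
        } finally {
            fixed.shutdown();
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBoundedPool();
        System.out.println("bounded queue capacity: " + pool.getQueue().remainingCapacity());
        System.out.println("fixed pool queue capacity: " + fixedPoolQueueCapacity());
        pool.shutdown();
    }
}
```

The fixed pool's queue reports a capacity of Integer.MAX_VALUE, so tasks can pile up until memory runs out, while the hand-built pool starts rejecting work once 100 tasks are waiting and all 4 threads are busy.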

The general logic of the thread pool (from Baidu pictures)

[Figure: thread pool execution logic (from Baidu Images)]
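The usual order is: fill the core threads first, then the queue, then extra threads up to the maximum, and finally reject. That order can be checked with a small experiment. FlowDemo is a hypothetical class, and the sizes (1 core thread, 2 threads at most, a queue of 1) are chosen only to keep the numbers small.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class FlowDemo {
    // Submits 4 tasks that all block, then returns
    // {pool size, queued tasks, rejected tasks}.
    static int[] submitFour() throws InterruptedException {
        // The 5-argument constructor uses the default AbortPolicy.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));

        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        int rejected = 0;
        for (int i = 0; i < 4; i++) {
            try {
                // task 1 -> new core thread, task 2 -> queue,
                // task 3 -> new non-core thread, task 4 -> rejected
                pool.execute(blocker);
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }

        int[] snapshot = { pool.getPoolSize(), pool.getQueue().size(), rejected };

        release.countDown(); // let everything finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return snapshot;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] s = submitFour();
        System.out.println("threads=" + s[0] + " queued=" + s[1] + " rejected=" + s[2]);
    }
}
```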

When are threads in the thread pool created:

Tasks are handed to the pool by calling its execute method, and that is also where worker threads get created.

 public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        
        int c = ctl.get();
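             // The pool's designers pack both the pool state and the current thread count into this single int:
            // the high 3 bits hold the pool state and the low 29 bits hold the number of threads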
        if (workerCountOf(c) < corePoolSize) {
            // Fewer threads than corePoolSize: try to add a core thread to run this task
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }
        if (isRunning(c) && workQueue.offer(command)) {
            // The pool is RUNNING and the task was queued successfully
            int recheck = ctl.get();
            if (! isRunning(recheck) && remove(command))
                // Double-check the pool state: if it is no longer RUNNING, remove the task just queued and run the rejection policy
                reject(command);
            else if (workerCountOf(recheck) == 0)
                // No worker threads are left at this point: add a non-core thread (with no first task) to drain the queue
                addWorker(null, false);
        }
       // Queueing failed as well: try to add a non-core thread, and reject the task if that fails too
        else if (!addWorker(command, false))
            reject(command);
    }
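The comment in execute about packing the state and the thread count into one int can be tried out in isolation. The sketch below mirrors the constant and helper names used in the JDK 8 ThreadPoolExecutor source, reproduced here from memory for illustration; CtlPacking itself is a made-up wrapper class.

```java
class CtlPacking {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 bits for the worker count
    static final int CAPACITY = (1 << COUNT_BITS) - 1;   // low 29 bits all set

    // Run states live in the high 3 bits.
    static final int RUNNING = -1 << COUNT_BITS;
    static final int SHUTDOWN = 0 << COUNT_BITS;
    static final int STOP = 1 << COUNT_BITS;

    // Combine a run state and a worker count into one value.
    static int ctlOf(int rs, int wc) {
        return rs | wc;
    }

    // Keep only the high 3 bits.
    static int runStateOf(int c) {
        return c & ~CAPACITY;
    }

    // Keep only the low 29 bits.
    static int workerCountOf(int c) {
        return c & CAPACITY;
    }

    // RUNNING is the only negative state, so a simple comparison is enough.
    static boolean isRunning(int c) {
        return c < SHUTDOWN;
    }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println("binary ctl   = " + Integer.toBinaryString(c));
        System.out.println("worker count = " + workerCountOf(c));
        System.out.println("is running   = " + isRunning(c));
    }
}
```

Keeping both values in one int means a single compare-and-set can update the count while also detecting a concurrent state change, which is what addWorker depends on.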

addWorker method:

Worker is an inner class of the thread pool. It wraps a Thread object and itself implements the Runnable interface; the wrapped thread is the execution unit for tasks. When that thread is instantiated, the current Worker object is passed in as its Runnable, so starting the thread ends up running the Worker's own run method.

 private boolean addWorker(Runnable firstTask, boolean core) {
        retry:
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN &&
                ! (rs == SHUTDOWN &&
                   firstTask == null &&
                   ! workQueue.isEmpty()))
                return false;

            for (;;) {
                int wc = workerCountOf(c);
                if (wc >= CAPACITY ||
                    wc >= (core ? corePoolSize : maximumPoolSize))
                    return false;
                if (compareAndIncrementWorkerCount(c))
                    break retry;
                c = ctl.get();  // Re-read ctl
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }
        }

        boolean workerStarted = false;
        boolean workerAdded = false;
        Worker w = null;
        try {
            w = new Worker(firstTask);
            // The thread that will actually run tasks
            final Thread t = w.thread;
            if (t != null) {
                final ReentrantLock mainLock = this.mainLock;
                // workers is a plain HashSet, so take the lock to keep access thread-safe
                mainLock.lock();
                try {
                    // Recheck while holding lock.
                    // Back out on ThreadFactory failure or if
                    // shut down before lock acquired.
                    int rs = runStateOf(ctl.get());

                    if (rs < SHUTDOWN ||
                        (rs == SHUTDOWN && firstTask == null)) {
                        if (t.isAlive()) // precheck that t is startable
                            throw new IllegalThreadStateException();
                        // Add the worker to the pool
                        workers.add(w);
                        int s = workers.size();
                        if (s > largestPoolSize)
                            // Keep track of the largest number of threads the pool has ever held
                            largestPoolSize = s;
                        workerAdded = true;
                    }
                } finally {
                    mainLock.unlock();
                }
                if (workerAdded) {
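                    // This is where task execution begins. As noted above, t is a field of the worker and was constructed with the worker itself as its Runnable, so starting t runs the run method that Worker overrides, which in turn calls the pool's runWorker method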
                    t.start();
                    workerStarted = true;
                }
            }
        } finally {
            if (! workerStarted)
                addWorkerFailed(w);
        }
        return workerStarted;
    }
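The relationship between a Worker and its thread can be reduced to a few lines. WorkerSketch below is a simplified stand-in written for this article, not the JDK class: it omits the locking that the real Worker inherits from AbstractQueuedSynchronizer and runs only its first task.

```java
class WorkerSketch implements Runnable {
    final Thread thread;          // the execution unit
    private Runnable firstTask;   // the task this worker was created for
    volatile String ranOn;        // demo only: which thread executed run()

    WorkerSketch(Runnable firstTask) {
        this.firstTask = firstTask;
        // The worker passes itself as the thread's Runnable,
        // so thread.start() ends up calling this.run().
        this.thread = new Thread(this, "sketch-worker");
    }

    @Override
    public void run() {
        // The real Worker.run() delegates to runWorker(this), which loops over getTask().
        Runnable task = firstTask;
        firstTask = null;
        if (task != null) {
            task.run();
        }
        ranOn = Thread.currentThread().getName();
    }

    static String demo() throws InterruptedException {
        WorkerSketch w = new WorkerSketch(() -> System.out.println("task body"));
        w.thread.start(); // corresponds to t.start() in addWorker
        w.thread.join();
        return w.ranOn;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("run() executed on: " + demo());
    }
}
```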

final void runWorker(Worker w) {
        Thread wt = Thread.currentThread();
        // Take the task wrapped inside the worker
        Runnable task = w.firstTask;
        // Clear the worker's task. The field is well named: a worker's first task is the Runnable it was constructed with. It can also be null, for example when threads are pre-created with prestartAllCoreThreads after the pool is built
        w.firstTask = null;
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;
        try {
            while (task != null || (task = getTask()) != null) {
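                // Run the task if there is a first task or getTask returns one.
                // As you have probably guessed, getTask fetches tasks from the blocking queue.
                // A blocking queue blocks the caller while it is empty, which answers the earlier question:
                // the pool parks idle threads on the queue's blocking take method, and that is how threads get reused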
                w.lock();
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        // Where the business logic actually runs
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;
        } finally {
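            // processWorkerExit tears the worker down, including removing it from the pool's HashSet. When is it reached?
            // 1. The logic above threw an exception, i.e. completedAbruptly == true.
            // 2. The loop above exited normally. But wasn't the loop condition supposed to block in take()? How can it exit normally? Let's look at getTask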
            processWorkerExit(w, completedAbruptly);
        }
    }
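runWorker wraps each task in calls to beforeExecute and afterExecute, which are protected hooks meant to be overridden. The sketch below counts started and failed tasks; HookedPool and its counters are invented for this example, and the thread factory silences uncaught exceptions only to keep the demo output quiet.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HookedPool extends ThreadPoolExecutor {
    final AtomicInteger started = new AtomicInteger();
    final AtomicInteger failed = new AtomicInteger();

    HookedPool() {
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(10),
                r -> {
                    Thread t = new Thread(r, "hooked-worker");
                    // Demo only: swallow the stack trace of the failing task.
                    t.setUncaughtExceptionHandler((thread, error) -> { });
                    return t;
                });
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        started.incrementAndGet();
    }

    @Override
    protected void afterExecute(Runnable r, Throwable thrown) {
        super.afterExecute(r, thrown);
        // For tasks passed to execute(), thrown is the exception from task.run().
        if (thrown != null) {
            failed.incrementAndGet();
        }
    }

    // Runs one good and one failing task, returns {started, failed}.
    static int[] demo() throws InterruptedException {
        HookedPool pool = new HookedPool();
        pool.execute(() -> { });
        pool.execute(() -> { throw new IllegalStateException("boom"); });
        pool.shutdown(); // queued tasks still run before the pool terminates
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] { pool.started.get(), pool.failed.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] counts = demo();
        System.out.println("started=" + counts[0] + " failed=" + counts[1]);
    }
}
```

Note that this only holds for execute. A task handed to submit is wrapped in a FutureTask that captures the exception, so afterExecute would see thrown as null.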
private Runnable getTask() {
       // Remember the constructor parameter that says how long an idle thread may wait before being destroyed?
        boolean timedOut = false; // Did the last poll() time out?

        for (;;) {
            // Infinite loop
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                decrementWorkerCount();
                return null;
            }

            int wc = workerCountOf(c);

            // Timeouts apply if core threads are allowed to time out, or if non-core threads currently exist
            boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
            // Setting aside the edge case wc > maximumPoolSize, the first pass never enters this branch because timedOut is still false
            if ((wc > maximumPoolSize || (timed && timedOut))
                && (wc > 1 || workQueue.isEmpty())) {
                // CAS-decrement the worker count. This does not destroy the thread object; that happens in processWorkerExit, which removes the worker from the pool's HashSet
                if (compareAndDecrementWorkerCount(c))
                    return null;
                continue;
            }

            try {
                // Block here waiting for a task from the queue
                Runnable r = timed ?
                    workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    workQueue.take();
                if (r != null)
                    // Got a task
                    return r;
                // Timed out without getting a task: set timedOut so the next pass takes the branch above that decrements the worker count
                timedOut = true;
            } catch (InterruptedException retry) {
                timedOut = false;
            }
        }
    }
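The effect of the timed poll in getTask can be observed from the outside. In this sketch (KeepAliveDemo is a hypothetical class, and the 50 ms keep-alive is arbitrary) core threads are allowed to time out, so the only worker should disappear shortly after it runs out of work. The loop polls for up to five seconds rather than sleeping a fixed time, to stay tolerant of slow machines.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {
    // Returns true if the pool created one thread and later dropped back to zero.
    static boolean coreThreadTimesOut() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 50L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());

        // Makes `timed` true in getTask even when wc <= corePoolSize,
        // so the worker uses poll(keepAliveTime) instead of take().
        pool.allowCoreThreadTimeOut(true);

        pool.execute(() -> { }); // creates the single worker

        // Wait for the idle worker to time out and be removed.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }

        boolean shrunk = pool.getLargestPoolSize() == 1 && pool.getPoolSize() == 0;
        pool.shutdown();
        return shrunk;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("idle core thread was removed: " + coreThreadTimesOut());
    }
}
```

Without the allowCoreThreadTimeOut call the worker would block in take() indefinitely and the pool size would stay at 1.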

Final question:

We now roughly understand how the thread pool works, but a hard question remains: how should its parameters be set? Before answering, consider why we use multithreading at all. The answer, as most readers know, is to improve CPU utilization. A single core can run only one task at a time. We can listen to music while browsing Zhihu because the CPU's time-slicing keeps switching between processes and threads, which gives the illusion that they run concurrently. That context switching is not cheap, in part because it involves a switch from user mode to kernel mode.

We focus here on how to set the number of core threads.

Since context switching has a cost, more threads is not always better. The right number depends on the kind of task:

CPU-intensive tasks: the work is computation that keeps the CPU busy, so CPU utilization is already high. Adding more threads only adds context-switching overhead. In theory the thread count should equal the number of CPUs; in practice it is commonly set to the number of CPUs + 1, so that a spare thread is available if one is unexpectedly paused.

IO-intensive tasks: most of the time is spent waiting for IO to return, for example network calls or file reads and writes. The CPU is underused while threads wait, so in theory the thread count can be set higher, for example twice the number of CPUs.

Parameter tuning is largely empirical, because your application is rarely the only thing running on the server. It is enough to understand which type of task you have and the rule of thumb that goes with it, then adjust flexibly within that framework.
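The two rules of thumb above translate directly into code. PoolSizing and its method names are made up for this sketch, and the results are starting points to measure against, not fixed answers.

```java
class PoolSizing {
    // CPU-intensive work: one thread per CPU, plus one spare.
    static int cpuBound(int cpus) {
        return cpus + 1;
    }

    // IO-intensive work: threads spend most of their time waiting, so allow more of them.
    static int ioBound(int cpus) {
        return cpus * 2;
    }

    public static void main(String[] args) {
        // Number of processors available to this JVM right now.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("CPUs:              " + cpus);
        System.out.println("CPU-bound threads: " + cpuBound(cpus));
        System.out.println("IO-bound threads:  " + ioBound(cpus));
    }
}
```

Either value would typically be passed as corePoolSize when constructing the ThreadPoolExecutor, then adjusted after observing the application under real load.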

Summary:

The idea of the thread pool can be told as a story. A company's project team has 5 programmers (the core threads). Their daily job is to fix bugs reported by the testers, who file bug reports on a bug tracking platform (the blocking queue). When things are quiet there are few bugs, and the programmers sit at their computers waiting for the testers to file new ones (blocking while waiting for new tasks). When things get busy, the five programmers cannot keep up even working 996, and the testers file so many bugs that the platform cannot accept any more (the queue is full). So the boss hires a few contractors (non-core threads) to help the team fix bugs. Even then the testers have more bugs than can be filed. The team lead loses his temper: "Do you even know how to use the feature I wrote? What kind of bug is 'does not display properly in IE8'? Get out!" (the rejection policy). Later the project quiets down, there are no longer enough bugs to keep the contractors busy, and they pack up and leave (threads destroyed on timeout).


Edson
