Distributed | Dynamically Adjusting Thread Pool Sizes in DBLE

Author: Guo Aomen
Member of the Aikesheng DBLE R&D team, responsible for developing new features of the distributed database middleware and for answering questions from the community, customers, and internal teams.
Source of this article: original contribution
*This original content is produced by the Aikesheng open source community and may not be used without authorization. To reprint it, please contact the editor and credit the source.

Background

In a real production environment, a project's initial traffic is usually small. When traffic grows later, the original thread configuration in dble may no longer withstand the upstream pressure, and a series of performance problems can appear. At that point, thread pool parameters such as processors and backendProcessors need to be increased and then tuned over several rounds, based on expected metrics and actual thread usage, until they reach an optimal value.

Previously, after changing the thread counts in the configuration, dble had to be restarted for the change to take effect. That approach is inflexible and may even affect upstream usage. dble takes this into account: version 3.21.06.* provides a way to adjust thread pool sizes without a restart.

Command

update dble_information.dble_thread_pool set core_pool_size = 2 where name = 'BusinessExecutor';

Notes:

The thread pools that support dynamic adjustment are: businessExecutor, writeToBackendExecutor, processors (NIO scenario only, that is, usingAIO is 0 in bootstrap.cnf), backendProcessors (NIO scenario only), backendBusinessExecutor, and complexQueryExecutor.
In the AIO scenario, the thread pools behind processors and backendProcessors cannot be adjusted dynamically.
Because of how the JDK native thread pool (ThreadPoolExecutor) grows and shrinks, newly created threads and threads about to be recycled take some time to be processed. A new core_pool_size may therefore not be visible immediately when querying the dble_thread_pool table, but use of the thread pool is not affected in the meantime.
Although thread pool sizes can be adjusted without downtime, it is recommended not to adjust them under heavy traffic, to avoid unknown problems.

Principle Interpretation

How dble uses threads

Currently, dble uses thread pools in two main ways: the JDK built-in thread pool on its own, and an external queue combined with the JDK built-in thread pool. The principles of both are described briefly below.

External queue + thread pool

To achieve high concurrency and fast response, DBLE adopts the classic master-slave Reactor multi-threaded model at the network IO layer. Only the way DBLE implements the Reactor model is described here. Readers unfamiliar with the model itself can refer to "Thoroughly understand Reactor model and Proactor model" (link at the end of the article).

The current DBLE network model (shown in the original article's figure) works as follows:

1. Reactor main thread (NIOAcceptor): handles client connections through select and accept events, performs preliminary processing, and passes them to a Reactor sub-thread through the frontRegisterQueue queue.

2. Reactor sub-thread (RW): takes the connection from the external queue (a queue outside the thread pool's internal queue; it is called the external queue here to tell the two apart), registers it with the current sub-thread, reads data packets through the read method, and passes them to a worker thread through the frontHandlerQueue queue or a local queue.

3. A thread in the worker thread pool takes the task from the external queue. After a series of parsing and processing steps, the result is either passed through the writeQueue queue to the writeToBackendExecutor thread and then sent to the front-end client, or passed through a local queue to the backendBusinessExecutor/complexQueryExecutor thread, which returns the result directly to the front end.

The DBLE network model shows that processing tasks are distributed through queues combined with thread pools. In addition, DBLE uses many thread pools to handle time-consuming tasks during business processing.
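The external queue + thread pool pattern can be illustrated with a minimal sketch. This is not DBLE source code: the class name ExternalQueueDemo, the integer "packets", and the pool size are assumptions made for illustration. Each pool thread runs one long-lived loop that takes work from a queue living outside the pool, instead of receiving tasks through the pool's internal queue.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class; not DBLE source code. */
final class ExternalQueueDemo {

    /** Handles `count` packets with `threads` resident pollers and returns the results. */
    static List<String> run(int threads, int count) {
        BlockingQueue<Integer> externalQueue = new LinkedBlockingQueue<>();
        List<String> handled = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(count);
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        // Each pool thread runs one long-lived polling loop over the external queue.
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        Integer packet = externalQueue.take(); // blocks until work arrives
                        handled.add("handled-" + packet);
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdownNow() ends the loop
                }
            });
        }

        try {
            // The producer side (an IO thread in the model above) hands work over
            // through the external queue, not through the pool's internal queue.
            for (int i = 0; i < count; i++) {
                externalQueue.put(i);
            }
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("packets were not handled in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // interrupts the pollers
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(run(2, 5).size() + " packets handled");
    }
}
```

Because the pollers never return on their own, the pool's internal queue stays empty and the pool cannot tell busy pollers from idle ones. That is why resizing such a pool needs extra handling beyond the JDK setters.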

JDK built-in thread pool

The thread pools in dble are based on Java's ThreadPoolExecutor class, which runs as follows:

Internally, the thread pool builds a producer-consumer model that decouples threads from tasks, so that tasks can be buffered and threads reused. Its operation has two parts: task management and thread management. Task management acts as the producer. When a task is submitted, the thread pool decides what happens to it next:

(1) Directly apply for the thread to perform the task;

(2) Buffer into the queue and wait for the thread to execute;

(3) Reject the task.

Thread management is the consumer. Threads are maintained centrally in the thread pool and allocated according to task requests. When a thread finishes a task, it goes on to fetch new tasks to execute; when it can no longer obtain a task, it is recycled.
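The three outcomes can be observed with a deliberately tiny pool. The sketch below is only an illustration: the class name TaskFlowDemo and the sizes (one core thread, at most two threads, an internal queue of capacity one) are assumptions chosen so that four blocking tasks hit every branch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class showing how ThreadPoolExecutor routes submitted tasks. */
final class TaskFlowDemo {

    /** Submits four blocking tasks and records what the pool did with each one. */
    static List<String> run() {
        // core = 1, max = 2, internal queue capacity = 1
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await(); // keep the thread busy until the demo is over
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        List<String> outcome = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocking);
                    outcome.add("task" + i + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    // Default AbortPolicy: queue full and maximumPoolSize reached.
                    outcome.add("task" + i + ": rejected");
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return outcome;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first task gets a new core thread, the second is buffered in the queue, the third forces a second thread because the queue is full, and the fourth is rejected.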

Given dble's current structure, its main internal threads are used as follows:

Threads of businessExecutor and writeToBackendExecutor are scheduled through an external queue + thread pool. The thread pool is initialized when dble starts, and its threads obtain tasks by polling (threads created at startup are called resident threads).
Threads of processors and backendProcessors are scheduled through an external queue + thread pool in the NIO case; in the AIO case, thread management relies on the AsynchronousChannelGroup mechanism (which has a built-in thread pool).
Threads of backendBusinessExecutor are scheduled through an external queue + thread pool in performance mode (usePerformanceMode=1); otherwise, tasks are scheduled directly by the thread pool as they arrive.
Threads of complexQueryExecutor are scheduled directly by the thread pool (threads not created at startup are called non-resident threads; they live and die with their tasks).

Implementation

To adjust thread pool sizes dynamically while ensuring that tasks are still processed normally after growing or shrinking, the two approaches above must be handled separately. The specific implementations are as follows:

Thread Pool

The JDK native thread pool ThreadPoolExecutor provides the following public setter methods, as shown in the following figure:

The JDK lets users of a thread pool change its core policy at runtime through the ThreadPoolExecutor instance. Take setCorePoolSize as an example. When it is called at runtime, the thread pool directly overwrites the original corePoolSize value and then acts according to how the new value compares with the old one. If the new value is smaller than the current number of worker threads, there are surplus workers: the pool sends an interrupt request to the currently idle workers so they can be recycled, and surplus workers that are still busy are recycled once their tasks finish (with a delay). If the new value is larger than the original value and tasks are waiting in the queue, the pool creates new worker threads to execute them. If the pool's internal queue has no waiting tasks, a new worker thread is created only when the next task needs to be executed (with a delay).
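A small experiment makes the grow and shrink behavior of setCorePoolSize visible. It is a sketch under stated assumptions, not DBLE code: the class name CorePoolResizeDemo, the pool sizes, and the 100 ms keepAliveTime are all chosen for illustration. It also keeps corePoolSize within maximumPoolSize, which newer JDK versions enforce.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class for ThreadPoolExecutor.setCorePoolSize. */
final class CorePoolResizeDemo {

    /** Returns {threads before growing, right after growing, after shrinking}. */
    static int[] run() {
        // core = 1, max = 4, keepAliveTime = 100 ms, unbounded internal queue
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 4, 100, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < 3; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            int before = pool.getPoolSize();   // one task running, two waiting in the queue

            pool.setCorePoolSize(3);           // queued tasks exist: workers are added at once
            int grown = pool.getPoolSize();

            release.countDown();               // let the tasks finish
            pool.setCorePoolSize(1);           // surplus workers become eligible for recycling
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 1 && System.nanoTime() < deadline) {
                Thread.sleep(20);              // recycling is asynchronous, so poll for it
            }
            return new int[] {before, grown, pool.getPoolSize()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] sizes = run();
        System.out.println(sizes[0] + " -> " + sizes[1] + " -> " + sizes[2]);
    }
}
```

Growing takes effect immediately here only because tasks are already queued. The shrink shows up only after the surplus workers have finished their tasks and waited out keepAliveTime, which matches the delay described above.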

External queue + thread pool

The pool size itself can be changed with the setter methods the JDK provides. In this scenario, when growing the pool, each newly created thread must also be bound to the external queue so that it can receive and process subsequent tasks from that queue. In code, this means passing a reference to the external queue when creating the new thread.

When shrinking, ThreadPoolExecutor recycles idle threads as described above. For non-resident threads, setting the size is enough. Running (resident) threads, however, are not recycled by the pool on its own initiative. Since dble started their polling loops, dble has to stop them itself: it first marks the thread as interrupted through the interrupt method, lets the thread decide from its internal state value whether to exit the polling loop, and then relies on the thread pool's own shrink strategy to recycle the thread.
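The grow and shrink steps for an external queue + thread pool can be sketched as follows. This is a minimal illustration, not the DBLE implementation: ResizablePollerPool, Poller, resize, and demo are hypothetical names, and choosing the most recently added pollers as the ones to stop is an arbitrary assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch; names are illustrative and not taken from DBLE. */
final class ResizablePollerPool {
    private final BlockingQueue<Runnable> externalQueue = new LinkedBlockingQueue<>();
    private final Deque<Poller> pollers = new ArrayDeque<>();
    private final ThreadPoolExecutor pool;

    /** Resident task: polls the external queue until it is told to stop. */
    private final class Poller implements Runnable {
        private volatile boolean running = true;
        private volatile Thread thread;

        @Override public void run() {
            thread = Thread.currentThread();
            try {
                while (running) {
                    externalQueue.take().run();
                }
            } catch (InterruptedException e) {
                // Woken by stop() or shutdownNow(): return so the pool can recycle the thread.
            }
        }

        void stop() {
            running = false;              // state value checked by the polling loop
            Thread t = thread;
            if (t != null) t.interrupt(); // wakes the thread if it is blocked in take()
        }
    }

    ResizablePollerPool(int size) {
        pool = new ThreadPoolExecutor(size, size, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        addPollers(size);
    }

    private void addPollers(int n) {
        for (int i = 0; i < n; i++) {
            Poller p = new Poller();  // bound to the external queue via the enclosing instance
            pollers.push(p);
            pool.execute(p);          // pool is below corePoolSize, so a new thread starts
        }
    }

    synchronized void resize(int newSize) {
        if (newSize < 1) throw new IllegalArgumentException("newSize must be >= 1");
        int delta = newSize - pollers.size();
        if (delta > 0) {
            pool.setMaximumPoolSize(newSize); // raise max first so core <= max always holds
            pool.setCorePoolSize(newSize);
            addPollers(delta);
        } else if (delta < 0) {
            for (int i = 0; i < -delta; i++) {
                pollers.pop().stop();         // end the polling loop manually
            }
            pool.setCorePoolSize(newSize);    // the pool then recycles the idle workers
            pool.setMaximumPoolSize(newSize);
        }
    }

    private static void waitUntil(BooleanSupplier condition) {
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        try {
            while (!condition.getAsBoolean() && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns {threads after growing, threads after shrinking, tasks handled}. */
    static int[] demo() {
        ResizablePollerPool p = new ResizablePollerPool(2);
        AtomicInteger handled = new AtomicInteger();
        try {
            p.resize(4);
            int grown = p.pool.getPoolSize();
            p.resize(1);
            waitUntil(() -> p.pool.getPoolSize() == 1);
            int shrunk = p.pool.getPoolSize();
            for (int i = 0; i < 10; i++) p.externalQueue.add(handled::incrementAndGet);
            waitUntil(() -> handled.get() == 10);
            return new int[] {grown, shrunk, handled.get()};
        } finally {
            p.pool.shutdownNow();
        }
    }
}
```

Calling ResizablePollerPool.demo() grows the pool from two to four pollers, shrinks it to one, and then checks that the remaining poller still drains the external queue. The order of the two setters differs between growing and shrinking so that corePoolSize never exceeds maximumPoolSize.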

These two methods make it possible to change thread pool sizes dynamically. However, for the IO threads behind processors and backendProcessors to transition smoothly when growing or shrinking, some additional "aftermath" work is required.

"The Aftermath"

The threads behind processors and backendProcessors are DBLE's IO threads. They handle requests from the connections registered to them and receive results from the back end. When growing, the newly created IO threads must therefore be able to handle subsequent IO requests. Correspondingly, when shrinking, the connections bound to a recycled thread must be transferred to other IO threads so that later requests on those connections are processed normally.

In DBLE, the aftermath work first unregisters the connections from the current thread and then re-registers them with a new IO thread. The selection strategy for removal and re-registration is as follows: when removing, the thread with the fewest bound connections is chosen first; when re-registering, the thread with the largest number of connections is chosen first, with the choice made according to the removed thread.

Summary

dble 3.21.06.* and later provides a command that modifies thread parameters without a restart. Because dble does not simply use the JDK built-in thread pool, the dynamic modification command is not implemented with the JDK's built-in methods alone; extra work was done to stay compatible with external queues and IO connections.

Although our concurrency tests found no abnormal behavior when thread pool sizes were adjusted dynamically, we still recommend making adjustments when concurrency is low, both for a smooth transition between threads and to reduce resource usage during the adjustment. If you run into problems with this command, please report them at actiontech/dble: A High Scalability Middle-ware for MySQL Sharding (github.com) to help us improve.

References

Thoroughly understand Reactor model and Proactor model: https://cloud.tencent.com/developer/article/1488120
Java thread pool implementation principle and its practice in business: https://tech.meituan.com/2020/04/02/java-pooling-pratice-in-meituan.html
