Author: Huang Xiaomeng (Xueren)
Scheduled tasks are a common requirement in nearly every business: scanning for orders whose payment has timed out every minute, purging historical database data every hour, aggregating the previous day's data into a report every day, and so on.
Built-in solutions in Java
Using Timer
Create a java.util.TimerTask and implement the business logic in its run method, then schedule it with java.util.Timer, which supports repeated execution at a fixed interval. All TimerTasks of one Timer run serially on the same thread and therefore affect each other: if one TimerTask is still running, the others must wait in line even when their scheduled time has arrived. If a task throws an uncaught exception, the thread exits and every task scheduled on that Timer stops running.
import java.util.Timer;
import java.util.TimerTask;

public class TestTimerTask {
    public static void main(String[] args) {
        TimerTask timerTask = new TimerTask() {
            @Override
            public void run() {
                System.out.println("hello world");
            }
        };
        Timer timer = new Timer();
        // First run after 10 ms, then repeat every 3000 ms
        timer.schedule(timerTask, 10, 3000);
    }
}
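The failure mode described above can be reproduced in a few lines. The following is a minimal sketch, not part of the original sample: the class name TimerFailureDemo and its method are made up for illustration. It schedules one task that throws, then keeps trying to schedule a harmless task until the Timer rejects it.

```java
import java.util.Timer;
import java.util.TimerTask;

class TimerFailureDemo {

    /** Returns true once the Timer refuses new tasks because an earlier task threw. */
    static boolean timerDiesAfterException() {
        Timer timer = new Timer("demo-timer", true); // daemon thread, so the JVM can still exit

        // A task that fails with an unchecked exception on its first run.
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                throw new IllegalStateException("simulated business failure");
            }
        }, 0);

        // Poll for up to about two seconds: once the timer thread has died,
        // schedule() itself throws IllegalStateException.
        for (int attempt = 0; attempt < 100; attempt++) {
            try {
                timer.schedule(new TimerTask() {
                    @Override
                    public void run() {
                        // a healthy no-op task
                    }
                }, 0);
                Thread.sleep(20);
            } catch (IllegalStateException timerIsDead) {
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Timer unusable after exception: " + timerDiesAfterException());
    }
}
```

The stack trace of the simulated failure is printed by the dying timer thread. A common defensive measure is to catch exceptions inside run so that one bad execution does not take down every other task on the same Timer.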
Using ScheduledExecutorService
ScheduledExecutorService is a scheduling solution built on a thread pool. Each scheduled task is handed to a thread in the pool for execution, which solves Timer's inability to run tasks concurrently. It supports both fixedRate and fixedDelay.
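The difference from Timer can be checked with a small experiment. The sketch below is only an illustration; the class PoolConcurrencyDemo and the method tasksOverlap are hypothetical names. It schedules two tasks for the same moment and reports whether both were in flight at the same time. With a pool of one thread the behavior degrades to Timer-style serial execution.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class PoolConcurrencyDemo {

    /** Returns true if two tasks scheduled for the same instant ran at the same time. */
    static boolean tasksOverlap(int poolSize) {
        ScheduledExecutorService ses = Executors.newScheduledThreadPool(poolSize);
        CountDownLatch bothStarted = new CountDownLatch(2);

        // Each task announces that it has started, then waits briefly for the other one.
        Callable<Boolean> task = () -> {
            bothStarted.countDown();
            return bothStarted.await(500, TimeUnit.MILLISECONDS);
        };

        try {
            ScheduledFuture<Boolean> first = ses.schedule(task, 0, TimeUnit.MILLISECONDS);
            ScheduledFuture<Boolean> second = ses.schedule(task, 0, TimeUnit.MILLISECONDS);
            boolean firstSawOther = first.get();
            boolean secondSawOther = second.get();
            return firstSawOther && secondSawOther;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            ses.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool of 2 overlaps: " + tasksOverlap(2));
        System.out.println("pool of 1 overlaps: " + tasksOverlap(1));
    }
}
```

With one thread, the first task times out waiting for a partner that cannot start until it finishes, which is exactly the queueing behavior of Timer.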
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TestTimerTask {
    public static void main(String[] args) {
        ScheduledExecutorService ses = Executors.newScheduledThreadPool(5);
        // Fixed rate: run once every 5 seconds
        ses.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                System.out.println("hello fixedRate");
            }
        }, 0, 5, TimeUnit.SECONDS);
        // Fixed delay: run again 3 seconds after the previous run has finished
        ses.scheduleWithFixedDelay(new Runnable() {
            @Override
            public void run() {
                System.out.println("hello fixedDelay");
            }
        }, 0, 3, TimeUnit.SECONDS);
    }
}
Solutions that come with Spring
Spring Boot applications can use Spring Task, a lightweight scheduling tool that is easy to configure through annotations and supports cron expressions, fixedRate, and fixedDelay.
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@EnableScheduling
public class MyTask {
    /**
     * Runs at second 30 of every minute
     */
    @Scheduled(cron = "30 * * * * ?")
    public void task1() throws InterruptedException {
        System.out.println("hello cron");
    }

    /**
     * Runs once every 5 seconds
     */
    @Scheduled(fixedRate = 5000)
    public void task2() throws InterruptedException {
        System.out.println("hello fixedRate");
    }

    /**
     * Runs again 3 seconds after the previous run has finished
     */
    @Scheduled(fixedDelay = 3000)
    public void task3() throws InterruptedException {
        System.out.println("hello fixedDelay");
    }
}
Compared with the two solutions above, the biggest advantage of Spring Task is its support for cron expressions, which handle tasks that must run at fixed wall-clock times, such as at a given hour every day.
Solutions for business idempotency
Most applications today are deployed in a distributed manner, with every machine running the same code. The Java and Spring solutions described above are all process-level, so every machine fires the scheduled task at the same time. This is a problem for scheduled tasks that must be idempotent at the business level: a monthly push message to users, for example, would be sent several times.
Many applications therefore turn to distributed locks. Before each scheduled run, every machine tries to grab a lock; the machine that gets the lock runs the task, and the others skip it. The lock can be implemented in several ways, for example with a DB, ZooKeeper, or Redis.
## Use DB or ZooKeeper to grab locks
The architecture is similar whether the lock is grabbed through a DB or through ZooKeeper. It works as follows:
- When the scheduled time arrives, the callback method first tries to grab the lock.
- If the lock is acquired, the method continues; otherwise it returns immediately.
- After the method has finished, the lock is released.
The sample code is as follows:
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@EnableScheduling
public class MyTask {
    /**
     * Runs at second 30 of every minute
     */
    @Scheduled(cron = "30 * * * * ?")
    public void task1() throws InterruptedException {
        String lockName = "task1";
        if (tryLock(lockName)) {
            try {
                System.out.println("hello cron");
            } finally {
                // Release the lock once the task has finished
                releaseLock(lockName);
            }
        }
        // Otherwise another machine holds the lock: skip this run
    }

    private boolean tryLock(String lockName) {
        // TODO: acquire the lock through the DB or ZooKeeper
        return true;
    }

    private void releaseLock(String lockName) {
        // TODO: release the lock
    }
}
A careful reader will notice that this design can still execute a task more than once. Suppose the task runs very quickly: machine A grabs the lock, finishes the task, and releases the lock almost immediately. When machine B then tries to grab the lock, it succeeds and executes the task again.
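This race can be made concrete with an in-memory simulation. The sketch below is an assumption-laden illustration: a concurrent set stands in for the shared DB row or ZooKeeper node, and the names ReleaseTooEarlyDemo, onTrigger, and simulate are hypothetical. Machine B fires a moment after machine A has already released the lock.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ReleaseTooEarlyDemo {
    // Stand-in for the shared lock store (a DB row or a ZooKeeper node in a real system).
    private static final Set<String> LOCKS = ConcurrentHashMap.newKeySet();
    // Counts how many times the business task actually ran.
    private static final AtomicInteger EXECUTIONS = new AtomicInteger();

    static boolean tryLock(String name) {
        return LOCKS.add(name); // true only if nobody holds the lock right now
    }

    static void releaseLock(String name) {
        LOCKS.remove(name);
    }

    /** What every machine does when its local timer fires. */
    static void onTrigger(String machine) {
        String lockName = "task1";
        if (!tryLock(lockName)) {
            return; // someone else is running the task
        }
        try {
            EXECUTIONS.incrementAndGet(); // the (very fast) business task
        } finally {
            releaseLock(lockName);
        }
    }

    /** Machine A fires, finishes and releases; machine B fires slightly later. */
    static int simulate() {
        EXECUTIONS.set(0);
        onTrigger("A");
        onTrigger("B");
        return EXECUTIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("executions for one trigger: " + simulate()); // prints 2
    }
}
```

The lock only prevents overlapping executions. It does not remember that this particular trigger has already been handled, which is why the task runs twice.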
Use Redis to grab locks
Grabbing a lock with Redis is structurally similar to the DB/ZooKeeper approach, but a Redis lock supports an expiration time, so it does not have to be released explicitly. The expiration time can be used to solve the problem above, where a task that finishes too quickly releases the lock and gets executed again. The architecture is as follows:
The sample code is as follows:
@Component
@EnableScheduling
public class MyTask {
    /**
     * Runs at second 30 of every minute
     */
    @Scheduled(cron = "30 * * * * ?")
    public void task1() throws InterruptedException {
        String lockName = "task1";
        // Hold the lock for 30 seconds and let it expire instead of releasing it,
        // so that a machine firing slightly later cannot acquire it for the same trigger
        if (tryLock(lockName, 30)) {
            System.out.println("hello cron");
        }
    }

    private boolean tryLock(String lockName, long expiredTime) {
        // TODO: acquire the lock in Redis with an expiration time in seconds
        return true;
    }
}
At this point some readers may have another question: is adding an expiration time rigorous enough, or can the task still be executed more than once?
Indeed it can. If a machine suddenly goes through a long full GC, or its previous run has not finished yet (Spring Task and ScheduledExecutorService still execute tasks through a thread pool), the task may be triggered on that machine more than 30 seconds late. By then the lock has expired, so the machine acquires it and runs the task again.
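Both the improvement and the remaining gap can be shown with a simulated clock. This is a sketch under stated assumptions: an in-memory map stands in for Redis (where such a lock is commonly taken with SET key value NX EX 30), the current time is passed in explicitly to keep the run deterministic, and ExpiringLockDemo and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class ExpiringLockDemo {
    // Lock name -> second at which the lock expires (stand-in for a Redis key with a TTL).
    private final Map<String, Long> expiresAt = new HashMap<>();

    /** Acquires the lock if it is free or already expired at the given time. */
    synchronized boolean tryLock(String name, long ttlSeconds, long nowSeconds) {
        Long expiry = expiresAt.get(name);
        if (expiry != null && nowSeconds < expiry) {
            return false; // still held by someone else
        }
        expiresAt.put(name, nowSeconds + ttlSeconds);
        return true;
    }

    /** Three machines react to the same trigger at different moments. */
    static boolean[] simulate() {
        ExpiringLockDemo lock = new ExpiringLockDemo();
        boolean machineA = lock.tryLock("task1", 30, 0);   // on time: acquires the lock
        boolean machineB = lock.tryLock("task1", 30, 1);   // one second late: rejected
        boolean machineC = lock.tryLock("task1", 30, 31);  // stalled 31 s (e.g. full GC): acquires again
        return new boolean[] {machineA, machineB, machineC};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(simulate())); // prints [true, false, true]
    }
}
```

Machine B is now correctly rejected even though machine A finished long ago, but machine C still slips through once the expiration time has passed.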
Using Quartz
Quartz[1] is a lightweight task scheduling framework. You only need to define a Job (the task), a Trigger, and a Scheduler to obtain scheduling capability. It supports a database-based cluster mode, which achieves idempotent execution of tasks.
Under the hood, Quartz's idempotent execution still amounts to grabbing DB locks. Let's take a look at Quartz's table structure:
Among these tables, QRTZ_LOCKS is the row-lock table that a Quartz cluster uses for synchronization. Its structure is as follows:
-- QRTZ_LOCKS table structure
CREATE TABLE `QRTZ_LOCKS` (
  `LOCK_NAME` varchar(40) NOT NULL,
  PRIMARY KEY (`LOCK_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- QRTZ_LOCKS records
+-----------------+
| LOCK_NAME       |
+-----------------+
| CALENDAR_ACCESS |
| JOB_ACCESS      |
| MISFIRE_ACCESS  |
| STATE_ACCESS    |
| TRIGGER_ACCESS  |
+-----------------+
As you can see, QRTZ_LOCKS contains 5 records, representing 5 locks, which synchronize access to Jobs, Triggers, and Calendars across multiple Quartz nodes.
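How a row lock keeps several nodes from firing the same trigger can be sketched in memory. The code below is not Quartz code and makes several assumptions: a ReentrantLock plays the role of the TRIGGER_ACCESS row (which a database-backed scheduler would typically lock with a SELECT ... FOR UPDATE on that row), threads play the role of cluster nodes, and RowLockDemo and fireAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.locks.ReentrantLock;

class RowLockDemo {

    /** Returns how many times each trigger fired when several nodes compete for it. */
    static int[] fireAll(int nodes, int triggers) {
        // Stand-in for the TRIGGER_ACCESS row: whoever holds it may change trigger state.
        ReentrantLock triggerAccess = new ReentrantLock();
        String[] state = new String[triggers]; // guarded by triggerAccess
        Arrays.fill(state, "WAITING");
        AtomicIntegerArray fired = new AtomicIntegerArray(triggers);

        List<Thread> cluster = new ArrayList<>();
        for (int n = 0; n < nodes; n++) {
            Thread node = new Thread(() -> {
                for (int t = 0; t < triggers; t++) {
                    boolean claimed = false;
                    triggerAccess.lock(); // plays the role of locking the row
                    try {
                        if ("WAITING".equals(state[t])) {
                            state[t] = "ACQUIRED";
                            claimed = true;
                        }
                    } finally {
                        triggerAccess.unlock(); // plays the role of committing the transaction
                    }
                    if (claimed) {
                        fired.incrementAndGet(t); // run the job outside the lock
                    }
                }
            });
            cluster.add(node);
            node.start();
        }
        for (Thread node : cluster) {
            try {
                node.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        int[] result = new int[triggers];
        for (int t = 0; t < triggers; t++) {
            result[t] = fired.get(t);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fireAll(3, 5))); // prints [1, 1, 1, 1, 1]
    }
}
```

Unlike the release-after-run lock shown earlier, the lock here protects a persistent state change, so a node that arrives late sees that the trigger is already taken. The cost is that every node serializes on the same lock.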
Open source task scheduling middleware
The solutions above share an architectural problem: every scheduling round has to grab a lock. With DB or ZooKeeper locks in particular, performance is relatively poor, and once the number of tasks grows beyond a certain point, scheduling delays become noticeable. Another pain point is that changing a schedule or adding a task requires modifying the code and redeploying the application.
As a result, a number of task scheduling middleware projects have emerged in the open source community, which let you create, modify, and schedule tasks through a scheduling system. The most popular ones in China are XXL-JOB and ElasticJob.
### ElasticJob
ElasticJob[2] is a lightweight, decentralized distributed task scheduling framework built on Quartz that relies on ZooKeeper as its registry. It is now an open source project under the Apache Software Foundation.
Functionally, ElasticJob's biggest difference from Quartz is support for sharding: the sharding parameters of one task can be distributed to different machines for execution. Architecturally, the biggest difference is the use of ZooKeeper as the registry: different tasks are assigned to different nodes for scheduling, so no lock has to be grabbed to trigger a task, and performance is much better than Quartz's. The architecture diagram is as follows:
Development is also fairly simple, and it integrates well with Spring Boot. Tasks can be defined in the configuration file as follows:
elasticjob:
  regCenter:
    serverLists: localhost:2181
    namespace: elasticjob-lite-springboot
  jobs:
    simpleJob:
      elasticJobClass: org.apache.shardingsphere.elasticjob.lite.example.job.SpringBootSimpleJob
      cron: 0/5 * * * * ?
      timeZone: GMT+08:00
      shardingTotalCount: 3
      shardingItemParameters: 0=Beijing,1=Shanghai,2=Guangzhou
    scriptJob:
      elasticJobType: SCRIPT
      cron: 0/10 * * * * ?
      shardingTotalCount: 3
      props:
        script.command.line: "echo SCRIPT Job: "
    manualScriptJob:
      elasticJobType: SCRIPT
      jobBootstrapBeanName: manualScriptJobBean
      shardingTotalCount: 9
      props:
        script.command.line: "echo Manual SCRIPT Job: "
Implement the task interface as follows:
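// Imports assume ElasticJob 3.x package names
import org.apache.shardingsphere.elasticjob.api.ShardingContext;
import org.apache.shardingsphere.elasticjob.simple.job.SimpleJob;
import org.springframework.stereotype.Component;

@Component
public class SpringBootShardingJob implements SimpleJob {
    @Override
    public void execute(ShardingContext context) {
        // Print the total shard count, this instance's shard number, and its shard parameter
        System.out.println("sharding total count=" + context.getShardingTotalCount()
                + ", sharding item=" + context.getShardingItem()
                + ", sharding parameter=" + context.getShardingParameter());
    }
}
The results are as follows:
sharding total count=3, sharding item=0, sharding parameter=Beijing
sharding total count=3, sharding item=1, sharding parameter=Shanghai
sharding total count=3, sharding item=2, sharding parameter=Guangzhou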
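How shard numbers might be spread over the running instances can be sketched as follows. This is an illustration in the spirit of an average-allocation strategy, not ElasticJob's own code: each instance first receives an equal contiguous block, and leftover shards are handed out one by one starting from the first instance. The class AverageShardingDemo and the instance names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AverageShardingDemo {

    /** Assigns shard numbers 0..shardingTotalCount-1 to the given instances. */
    static Map<String, List<Integer>> allocate(List<String> instances, int shardingTotalCount) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("at least one instance is required");
        }
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int perInstance = shardingTotalCount / instances.size();

        // Step 1: every instance gets an equal contiguous block of shards.
        for (int i = 0; i < instances.size(); i++) {
            List<Integer> items = new ArrayList<>();
            for (int item = i * perInstance; item < (i + 1) * perInstance; item++) {
                items.add(item);
            }
            result.put(instances.get(i), items);
        }

        // Step 2: leftover shards go to the first instances, one each.
        int assigned = perInstance * instances.size();
        for (int extra = 0; assigned + extra < shardingTotalCount; extra++) {
            result.get(instances.get(extra)).add(assigned + extra);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three shards (Beijing, Shanghai, Guangzhou) over two instances
        System.out.println(allocate(Arrays.asList("node-a", "node-b"), 3));
        // prints {node-a=[0, 2], node-b=[1]}
    }
}
```

When an instance joins or leaves, the same function can simply be run again with the new instance list, which is what makes the scheduling elastic.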
ElasticJob also provides a simple UI for viewing the task list, with support for modifying, triggering, stopping, enabling, and disabling tasks.
Unfortunately, ElasticJob does not currently support dynamic creation of tasks.
XXL-JOB
XXL-JOB[3] is an out-of-the-box, lightweight distributed task scheduling system. Its core design goals are rapid development, ease of learning, a lightweight footprint, and easy extension, and it is widely used in the open source community.
XXL-JOB has a master-slave architecture: the master is responsible for task scheduling, and the slaves are responsible for task execution. The architecture diagram is as follows:
Integrating XXL-JOB is also very convenient. Unlike ElasticJob, where you define a task implementation class, a JobHandler is defined through the @XxlJob annotation.
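// Imports assume the XXL-JOB 2.2.x API used by this sample
import com.xxl.job.core.biz.model.ReturnT;
import com.xxl.job.core.handler.annotation.XxlJob;
import com.xxl.job.core.log.XxlJobLogger;
import com.xxl.job.core.util.ShardingUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import java.util.concurrent.TimeUnit;

@Component
public class SampleXxlJob {
    private static Logger logger = LoggerFactory.getLogger(SampleXxlJob.class);

    /**
     * 1. Simple task example (Bean mode)
     */
    @XxlJob("demoJobHandler")
    public ReturnT<String> demoJobHandler(String param) throws Exception {
        XxlJobLogger.log("XXL-JOB, Hello World.");
        for (int i = 0; i < 5; i++) {
            XxlJobLogger.log("beat at:" + i);
            TimeUnit.SECONDS.sleep(2);
        }
        return ReturnT.SUCCESS;
    }

    /**
     * 2. Sharding broadcast task
     */
    @XxlJob("shardingJobHandler")
    public ReturnT<String> shardingJobHandler(String param) throws Exception {
        // Sharding parameters
        ShardingUtil.ShardingVO shardingVO = ShardingUtil.getShardingVo();
        XxlJobLogger.log("Sharding parameters: current shard index = {}, total shards = {}", shardingVO.getIndex(), shardingVO.getTotal());
        // Business logic
        for (int i = 0; i < shardingVO.getTotal(); i++) {
            if (i == shardingVO.getIndex()) {
                XxlJobLogger.log("Shard {}: hit, start processing", i);
            } else {
                XxlJobLogger.log("Shard {}: ignored", i);
            }
        }
        return ReturnT.SUCCESS;
    }
}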
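In a sharding broadcast task every executor receives the same call and only differs in its shard index, so the business logic has to decide which part of the data belongs to it. A common way to do that is to partition by modulo. The sketch below only illustrates this idea and does not use the XXL-JOB API; ModuloShardingDemo, idsForShard, and the sample IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ModuloShardingDemo {

    /** Returns the records this executor should process: those whose id maps to its shard index. */
    static List<Long> idsForShard(List<Long> allIds, int shardIndex, int shardTotal) {
        List<Long> mine = new ArrayList<>();
        for (long id : allIds) {
            // floorMod keeps the result non-negative even for negative ids
            if (Math.floorMod(id, (long) shardTotal) == shardIndex) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Long> orderIds = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L);
        int shardTotal = 3;
        for (int shardIndex = 0; shardIndex < shardTotal; shardIndex++) {
            System.out.println("shard " + shardIndex + " -> " + idsForShard(orderIds, shardIndex, shardTotal));
        }
        // prints:
        // shard 0 -> [3, 6, 9]
        // shard 1 -> [1, 4, 7, 10]
        // shard 2 -> [2, 5, 8]
    }
}
```

In a real handler the shard index and total would come from the sharding parameters shown in the sample above, and the filter would usually be pushed into the database query instead of being applied in memory.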
Compared with ElasticJob, XXL-JOB stands out for its richer features and stronger operations capabilities: it supports creating tasks dynamically in the console and also provides scheduling logs, run reports, and other functions.
The history records, run reports, and scheduling logs of XXL-JOB are all implemented on top of the database:
As this shows, all XXL-JOB functions depend on the database, and the scheduling center does not support a distributed architecture, so performance bottlenecks appear when the number of tasks and the scheduling volume are large. However, if you have no demanding requirements for task volume, high availability, monitoring and alarms, visualization, and so on, XXL-JOB can basically meet the need for scheduled tasks.
Enterprise Solutions
Open source software only provides basic scheduling capabilities, and its management and control capabilities are generally weak. For log services, for example, the industry usually adopts ELK; SMS alarms require an SMS platform; for monitoring dashboards, the mainstream solution today is Prometheus; and so on. An enterprise that wants these capabilities faces not only extra development costs but also expensive resource costs.
Using open source software also carries a stability risk: when something goes wrong, there may be no one who can handle it, and reporting the issue to the community and waiting for a fix takes so long that an outage has already happened by then.
Alibaba Cloud SchedulerX[4] is a one-stop task scheduling platform developed by Alibaba on top of the Akka architecture. It is compatible with the open source XXL-JOB, ElasticJob, and Quartz (planned), and supports cron scheduling, one-time tasks, task orchestration, and distributed batch processing. It offers high availability, visualization, operability, and low latency, and comes with enterprise-grade services such as a monitoring dashboard, log service, and SMS alarms.
Advantages
Security
• Multi-layered security protection: supports HTTPS and VPC access, and is covered by Alibaba Cloud's multi-layered security protection against malicious attacks.
• Multi-tenant isolation: supports isolation at the region, namespace, and application levels.
• Permission control: supports permission management for console read and write operations, and authentication of client access.
Enterprise-grade high availability
SchedulerX 2.0 adopts a high-availability architecture and keeps multiple backups of each task. Having been through years of Alibaba Group's Double Eleven events and disaster recovery drills, it ensures that task scheduling is not affected when any single data center goes down.
Commercial-grade alarms and operations
• Alarms: supports email, DingTalk, SMS, and phone calls (other channels are planned), and can alert on task failure, task timeout, and no available machine. The alarm message shows the reason for the task failure directly; a DingTalk robot is one example.
• Operations: rerun in place, refresh data, mark as successful, view the stack, stop tasks, specify machines, and so on.
Rich visualization
SchedulerX has rich visualization capabilities, such as:
• User dashboard
• View task execution history
• View task run logs
• View the task run stack
• View task operation records
Compatible with open source
SchedulerX is compatible with the open source XXL-JOB, ElasticJob, and Quartz (planned). A business can host its tasks on the SchedulerX scheduling platform without changing a line of code and gain enterprise-grade visualization and alarm capabilities.
Spring native
SchedulerX supports creating tasks dynamically through the console and the API, as well as declarative task definitions in Spring. The same task configuration can be brought up in any environment with one click. The configuration is as follows:
spring:
  schedulerx2:
    endpoint: acm.aliyun.com   # fill in the endpoint of the corresponding region
    namespace: 433d8b23-06e9-xxxx-xxxx-90d4d1b9a4af   # globally unique within the region; generating a UUID is recommended
    namespaceName: Xueren test
    appName: myTest
    groupId: myTest.group   # must be unique within the namespace
    appKey: myTest123@alibaba   # application key; do not make it too simple, and keep it safe
    regionId: public   # fill in the corresponding regionId
    aliyunAccessKey: xxxxxxx   # AccessKey ID of the Alibaba Cloud account
    aliyunSecretKey: xxxxxxx   # AccessKey secret of the Alibaba Cloud account
    alarmChannel: sms,ding   # alarm channels: SMS and DingTalk
    jobs:
      simpleJob:
        jobModel: standalone
        className: com.aliyun.schedulerx.example.processor.SimpleJob
        cron: 0/30 * * * * ?   # cron expression
        jobParameter: hello
        overwrite: true
      shardingJob:
        jobModel: sharding
        className: com.aliyun.schedulerx.example.processor.ShardingJob
        oneTime: 2022-06-02 12:00:00   # one-time task expression
        jobParameter: 0=Beijing,1=Shanghai,2=Guangzhou
        overwrite: true
      broadcastJob:   # neither cron nor oneTime is set, so this is an API task
        jobModel: broadcast
        className: com.aliyun.schedulerx.example.processor.BroadcastJob
        jobParameter: hello
        overwrite: true
      mapReduceJob:
        jobModel: mapreduce
        className: com.aliyun.schedulerx.example.processor.MapReduceJob
        cron: 0 * * * * ?
        jobParameter: 100
        overwrite: true
    alarmUsers:   # alarm contacts
      user1:
        userName: Zhang San
        userPhone: 12345678900
      user2:
        userName: Li Si
        ding: https://oapi.dingtalk.com/robot/send?access_token=xxxxx
Distributed batch processing
SchedulerX provides a rich set of distributed models that cover a variety of distributed business scenarios, including standalone, broadcast, sharding, and MapReduce[5]. The architecture is as follows:
With the MapReduce model of SchedulerX, a few lines of code are enough to distribute a massive number of sub-tasks across multiple machines for batch processing. Compared with big data batch processing, it is fast, keeps data secure, costs little, and is easy to learn.
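The general shape of such a map/reduce batch job can be sketched with plain Java. This is explicitly not the SchedulerX API: a local thread pool stands in for the worker machines, and MapReduceSketch, sumOfSquares, and the chunking scheme are assumptions chosen for illustration. A large job is split into sub-tasks, the sub-tasks run in parallel, and the partial results are merged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MapReduceSketch {

    /** Computes 1^2 + 2^2 + ... + n^2 by splitting the range into chunks processed in parallel. */
    static long sumOfSquares(int n, int chunkSize, int workers) {
        if (chunkSize <= 0 || workers <= 0) {
            throw new IllegalArgumentException("chunkSize and workers must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers); // stands in for the machines
        try {
            // Map phase: split the job into sub-tasks and hand them to the workers.
            List<Future<Long>> partials = new ArrayList<>();
            for (int start = 1; start <= n; start += chunkSize) {
                final int from = start;
                final int to = Math.min(n, start + chunkSize - 1);
                partials.add(pool.submit(() -> {
                    long sum = 0;
                    for (long i = from; i <= to; i++) {
                        sum += i * i;
                    }
                    return sum;
                }));
            }
            // Reduce phase: merge the partial results.
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sub-tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a sub-task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10, 3, 4)); // prints 385
    }
}
```

In a distributed setting the sub-tasks would be dispatched to other machines and would need retry and failover handling, which is the part a scheduling platform takes over.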
Task orchestration
SchedulerX orchestrates tasks through workflows and provides a visual interface that is easy to operate: a workflow can be configured by dragging and dropping. The detailed task state diagram shows at a glance why a downstream task is not running, which makes problems easy to locate.
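At its core, a workflow is a dependency graph in which a task may start only after all of its upstream tasks have finished. The following sketch shows one standard way to derive an execution order from such a graph (a topological sort). It is not SchedulerX code; WorkflowOrderDemo and the task names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WorkflowOrderDemo {

    /** upstreams maps each task to the tasks that must finish before it may start. */
    static List<String> executionOrder(Map<String, List<String>> upstreams) {
        Map<String, Integer> pending = new TreeMap<>();          // unfinished upstream count per task
        Map<String, List<String>> downstreams = new HashMap<>(); // reverse edges
        for (Map.Entry<String, List<String>> entry : upstreams.entrySet()) {
            pending.putIfAbsent(entry.getKey(), 0);
            for (String upstream : entry.getValue()) {
                pending.putIfAbsent(upstream, 0);
                pending.merge(entry.getKey(), 1, Integer::sum);
                downstreams.computeIfAbsent(upstream, k -> new ArrayList<>()).add(entry.getKey());
            }
        }

        // Tasks without upstream dependencies can start right away.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pending.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String task = ready.poll();
            order.add(task);
            // Finishing a task may unblock its downstream tasks.
            for (String downstream : downstreams.getOrDefault(task, Collections.emptyList())) {
                if (pending.merge(downstream, -1, Integer::sum) == 0) {
                    ready.add(downstream);
                }
            }
        }
        if (order.size() != pending.size()) {
            throw new IllegalArgumentException("workflow contains a cycle");
        }
        return order;
    }

    /** A small example workflow: extract -> transform -> (report, alert). */
    static List<String> sampleOrder() {
        Map<String, List<String>> upstreams = new LinkedHashMap<>();
        upstreams.put("extract", Collections.emptyList());
        upstreams.put("transform", Arrays.asList("extract"));
        upstreams.put("report", Arrays.asList("transform"));
        upstreams.put("alert", Arrays.asList("transform"));
        return executionOrder(upstreams);
    }

    public static void main(String[] args) {
        System.out.println(sampleOrder()); // prints [extract, transform, report, alert]
    }
}
```

The pending counter is also what answers the question of why a downstream task is not running: a task with a count above zero is still waiting for at least one upstream task.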
Preemptible task priority queue
A common scenario is offline reporting at night. Many report tasks start at 1 or 2 a.m., so the maximum number of concurrent tasks of the application must be limited (otherwise the business cannot cope), and tasks beyond the concurrency limit wait in a queue. At the same time, the KPI report must be finished before 9:00 a.m. KPI tasks can be given a high priority so that they are scheduled ahead of lower-priority tasks.
SchedulerX supports preemptible task priority queues, which can be configured dynamically in the console:
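The queueing behavior described above can be sketched with a priority queue. This is a simplified illustration and not SchedulerX's implementation: PriorityQueueDemo, the task names, and the priority values are assumptions. Waiting tasks are ordered by priority first and by submission order second, so a KPI task submitted last is still dispatched first when a slot becomes free.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PriorityQueueDemo {

    /** A task waiting for a free execution slot. */
    static final class QueuedTask {
        final String name;
        final int priority;  // larger value = more urgent
        final long sequence; // submission order, used as a tie-breaker

        QueuedTask(String name, int priority, long sequence) {
            this.name = name;
            this.priority = priority;
            this.sequence = sequence;
        }
    }

    // Highest priority first; among equal priorities, first come, first served.
    private final PriorityQueue<QueuedTask> waiting = new PriorityQueue<>(
            Comparator.comparingInt((QueuedTask t) -> t.priority).reversed()
                    .thenComparingLong(t -> t.sequence));
    private long nextSequence;

    void submit(String name, int priority) {
        waiting.add(new QueuedTask(name, priority, nextSequence++));
    }

    /** Returns the order in which waiting tasks are dispatched as slots free up. */
    List<String> dispatchOrder() {
        List<String> order = new ArrayList<>();
        while (!waiting.isEmpty()) {
            order.add(waiting.poll().name);
        }
        return order;
    }

    /** Two ordinary reports are already queued when the KPI report arrives. */
    static List<String> sample() {
        PriorityQueueDemo queue = new PriorityQueueDemo();
        queue.submit("daily-report-1", 1);
        queue.submit("daily-report-2", 1);
        queue.submit("kpi-report", 10);
        return queue.dispatchOrder();
    }

    public static void main(String[] args) {
        System.out.println(sample()); // prints [kpi-report, daily-report-1, daily-report-2]
    }
}
```

Tasks that are already running are not interrupted in this sketch; only the order of the waiting queue changes.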
Q&A
- Can Kubernetes applications access SchedulerX?
Yes. Whether the application runs on a physical machine, in a container, or in a Kubernetes pod, it can access SchedulerX.
- My application is not on Alibaba Cloud, can I use SchedulerX?
Yes. An application on any cloud platform or local machine can access SchedulerX as long as it can reach the public network.