1. Symptom
[WI-1581][TI-824] - [ERROR] 2024-06-07 15:26:25.763 +0800 o.a.d.s.m.r.WorkflowExecuteRunnable:[423] - Task finish failed, get a exception, will remove this taskInstance from completeTaskSet
java.lang.NullPointerException: null
at org.apache.dolphinscheduler.server.master.utils.WorkflowInstanceUtils.logTaskInstanceInDetail(WorkflowInstanceUtils.java:74)
at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.taskFinished(WorkflowExecuteRunnable.java:420)
at org.apache.dolphinscheduler.server.master.event.TaskStateEventHandler.handleStateEvent(TaskStateEventHandler.java:78)
at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.handleEvents(WorkflowExecuteRunnable.java:288)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[WI-1581][TI-824] - [ERROR] 2024-06-07 15:26:25.763 +0800 o.a.d.s.m.r.WorkflowExecuteRunnable:[312] - State event handle error, get a unknown exception, will retry this event: TaskStateEvent(processInstanceId=1581, taskInstanceId=824, taskCode=0, status=TaskExecutionStatus{code=7, desc='success'}, type=TASK_STATE_CHANGE, key=null, channel=null, context=null)
java.lang.NullPointerException: null
at org.apache.dolphinscheduler.server.master.utils.WorkflowInstanceUtils.logTaskInstanceInDetail(WorkflowInstanceUtils.java:74)
at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.taskFinished(WorkflowExecuteRunnable.java:420)
at org.apache.dolphinscheduler.server.master.event.TaskStateEventHandler.handleStateEvent(TaskStateEventHandler.java:78)
at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.handleEvents(WorkflowExecuteRunnable.java:288)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Looking at the code of org.apache.dolphinscheduler.server.master.utils.WorkflowInstanceUtils#logTaskInstanceInDetail:
public String logTaskInstanceInDetail(TaskInstance taskInstance) {
    StringBuilder logBuilder = new StringBuilder();
    // set the length for '*'
    int horizontalLineLength = 80;
    // Append the title and the centered "Task Instance Detail"
    int titleLength = 40;
    int leftSpaces = (horizontalLineLength - titleLength) / 2;
    String centeredTitle = String.format("%" + leftSpaces + "s%s", "", "Task Instance Detail");
    logBuilder.append("\n").append(Strings.repeat("*", horizontalLineLength)).append("\n")
            .append(centeredTitle).append("\n")
            .append(Strings.repeat("*", horizontalLineLength)).append("\n")
            .append("Task Name: ").append(taskInstance.getName()).append("\n")
            .append("Workflow Instance Name: ").append(taskInstance.getProcessInstance().getName()).append("\n")
            // The NPE is thrown on the next line: TaskExecuteType was not specified when the
            // workflow definition was created, and the API does not validate it.
            .append("Task Execute Type: ").append(taskInstance.getTaskExecuteType().getDesc()).append("\n")
            .append("Execute State: ").append(taskInstance.getState().getDesc()).append("\n")
            .append("Host: ").append(taskInstance.getHost()).append("\n")
            .append("Task Type: ").append(taskInstance.getTaskType()).append("\n")
            .append("Priority: ").append(taskInstance.getTaskInstancePriority().getDescp())
            .append("\n")
            .append("Tenant: ").append(taskInstance.getProcessInstance().getTenantCode())
            .append("\n")
            .append("First Submit Time: ").append(taskInstance.getFirstSubmitTime()).append("\n")
            .append("Submit Time: ").append(taskInstance.getSubmitTime()).append("\n")
            .append("Start Time: ").append(taskInstance.getStartTime()).append("\n")
            .append("End Time: ").append(taskInstance.getEndTime()).append("\n");
    return logBuilder.toString();
}
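The failing call chains getDesc() directly onto a getter that can return null. One defensive option, separate from the fix described later in this post, is to format such fields through a small null-safe helper so that a missing value prints a placeholder instead of aborting the whole log message. This is only a minimal sketch: DemoTaskExecuteType, NullSafeLogFields and the "N/A" placeholder are hypothetical names for illustration and are not part of DolphinScheduler.

```java
import java.util.function.Function;

// Hypothetical stand-in for the real TaskExecuteType enum; only getDesc() is assumed.
enum DemoTaskExecuteType {
    BATCH("batch"),
    STREAM("stream");

    private final String desc;

    DemoTaskExecuteType(String desc) {
        this.desc = desc;
    }

    String getDesc() {
        return desc;
    }
}

// Hypothetical helper, not part of DolphinScheduler.
final class NullSafeLogFields {

    private NullSafeLogFields() {
    }

    // Returns mapper.apply(value), or the placeholder "N/A" when value is null,
    // so a missing field no longer aborts the whole log message.
    static <T> String describe(T value, Function<T, String> mapper) {
        return value == null ? "N/A" : mapper.apply(value);
    }

    // Builds one line in the same "Label: value" style as the method above.
    static String taskExecuteTypeLine(DemoTaskExecuteType type) {
        return "Task Execute Type: " + describe(type, DemoTaskExecuteType::getDesc) + "\n";
    }
}
```

With this helper, NullSafeLogFields.taskExecuteTypeLine(null) yields a line containing "N/A" rather than throwing. It hides the symptom in the log code only; the invalid definition still needs to be rejected at the source.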
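Summary:
In short, a NullPointerException was thrown, so the TaskInstance was removed from completeTaskSet and the task was executed again. It failed again, was removed again, and was scheduled again. On the page, the task showed as successful, then running, then successful, then running again, and so on.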
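The loop described above can be illustrated with a toy simulation. It is a rough sketch under stated assumptions, not DolphinScheduler's actual implementation: RetryLoopDemo, the event string and the state names are all hypothetical, and the model assumes that the success status becomes visible before the detail log is built and that a failed event stays queued for retry. The maxPasses parameter exists only to keep the demo finite.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy model of the retry behaviour; not DolphinScheduler code.
final class RetryLoopDemo {

    private RetryLoopDemo() {
    }

    // Handles the head event up to maxPasses times and records what a user
    // would observe on the page after each step.
    static List<String> run(int maxPasses) {
        Deque<String> events = new ArrayDeque<>();
        events.add("TASK_STATE_CHANGE(status=success)");
        List<String> observedStates = new ArrayList<>();

        for (int pass = 0; pass < maxPasses && !events.isEmpty(); pass++) {
            try {
                // Assumption: the success status is visible before logging happens.
                observedStates.add("SUCCESS");
                buildDetailLog(null); // throws, like the missing TaskExecuteType
                events.poll();        // only reached when handling succeeds
            } catch (NullPointerException e) {
                // Handling failed: the task is no longer counted as complete
                // and the event stays in the queue to be retried.
                observedStates.add("RUNNING");
            }
        }
        return observedStates;
    }

    // Mimics the unguarded getter chain in the log method.
    static String buildDetailLog(Object taskExecuteType) {
        return "Task Execute Type: " + taskExecuteType.toString();
    }
}
```

Calling RetryLoopDemo.run(2) returns the alternating sequence SUCCESS, RUNNING, SUCCESS, RUNNING, which matches the flip-flopping status seen on the page. Because the event is never removed, only the artificial pass limit stops the loop here.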
2. Solution
2.1. Write a null-checking deserializer class
import com.fasterxml.jackson.core.JsonParseException;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.JsonDeserializer;

import java.io.IOException;

// Wraps another deserializer and rejects the value if the delegate returns null.
public class NullCheckingDeserializer extends JsonDeserializer<Object> {

    // The deserializer that does the actual work.
    private final JsonDeserializer<Object> defaultDeserializer;

    // Note: @JsonDeserialize(using = ...) normally needs a no-argument constructor,
    // so in that case the delegate would have to be supplied some other way.
    public NullCheckingDeserializer(JsonDeserializer<Object> defaultDeserializer) {
        this.defaultDeserializer = defaultDeserializer;
    }

    @Override
    public Object deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
        Object value = defaultDeserializer.deserialize(p, ctxt);
        if (value == null) {
            // Fail fast instead of letting a null value reach the master later.
            throw new JsonParseException(p, "Value cannot be null.");
        }
        return value;
    }
}
2.2. Then add it to the field in the definition entity
org.apache.dolphinscheduler.dao.entity.TaskDefinition
@JsonDeserialize(using = JSONUtils.NullCheckingDeserializer.class)
private TaskExecuteType taskExecuteType;
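Since the root cause noted earlier is that the API accepted a definition without TaskExecuteType, an explicit check before the definition is saved can complement the deserializer. It also covers the case where the field is absent from the JSON entirely, which a field-level deserializer would not normally see. The following is a minimal sketch, assuming a trimmed-down definition object: DemoTaskDefinition and TaskDefinitionValidator are hypothetical names, not DolphinScheduler classes, and the execute type is kept as a plain String to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, trimmed-down stand-in for a task definition; the real entity has many more fields.
final class DemoTaskDefinition {
    final String name;
    final String taskExecuteType; // kept as a String here to stay self-contained

    DemoTaskDefinition(String name, String taskExecuteType) {
        this.name = name;
        this.taskExecuteType = taskExecuteType;
    }
}

// Hypothetical validator, not part of DolphinScheduler.
final class TaskDefinitionValidator {

    private TaskDefinitionValidator() {
    }

    // Collects one message per missing required field; an empty list means valid.
    static List<String> validate(DemoTaskDefinition definition) {
        List<String> errors = new ArrayList<>();
        if (definition.name == null || definition.name.isEmpty()) {
            errors.add("name is required");
        }
        if (definition.taskExecuteType == null) {
            errors.add("taskExecuteType is required");
        }
        return errors;
    }

    // Rejects an invalid definition before it would be persisted.
    static void requireValid(DemoTaskDefinition definition) {
        List<String> errors = validate(definition);
        if (!errors.isEmpty()) {
            throw new IllegalArgumentException(String.join("; ", errors));
        }
    }
}
```

Calling TaskDefinitionValidator.requireValid(...) at the point where the definition is created would turn the later NullPointerException in the master into an immediate, descriptive error for the caller.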