1. Symptom

[WI-1581][TI-824] - [ERROR] 2024-06-07 15:26:25.763 +0800 o.a.d.s.m.r.WorkflowExecuteRunnable:[423] - Task finish failed, get a exception, will remove this taskInstance from completeTaskSet
java.lang.NullPointerException: null
        at org.apache.dolphinscheduler.server.master.utils.WorkflowInstanceUtils.logTaskInstanceInDetail(WorkflowInstanceUtils.java:74)
        at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.taskFinished(WorkflowExecuteRunnable.java:420)
        at org.apache.dolphinscheduler.server.master.event.TaskStateEventHandler.handleStateEvent(TaskStateEventHandler.java:78)
        at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.handleEvents(WorkflowExecuteRunnable.java:288)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
[WI-1581][TI-824] - [ERROR] 2024-06-07 15:26:25.763 +0800 o.a.d.s.m.r.WorkflowExecuteRunnable:[312] - State event handle error, get a unknown exception, will retry this event: TaskStateEvent(processInstanceId=1581, taskInstanceId=824, taskCode=0, status=TaskExecutionStatus{code=7, desc='success'}, type=TASK_STATE_CHANGE, key=null, channel=null, context=null)
java.lang.NullPointerException: null
        at org.apache.dolphinscheduler.server.master.utils.WorkflowInstanceUtils.logTaskInstanceInDetail(WorkflowInstanceUtils.java:74)
        at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.taskFinished(WorkflowExecuteRunnable.java:420)
        at org.apache.dolphinscheduler.server.master.event.TaskStateEventHandler.handleStateEvent(TaskStateEventHandler.java:78)
        at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.handleEvents(WorkflowExecuteRunnable.java:288)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

The stack trace points to org.apache.dolphinscheduler.server.master.utils.WorkflowInstanceUtils#logTaskInstanceInDetail:

public String logTaskInstanceInDetail(TaskInstance taskInstance) {
        StringBuilder logBuilder = new StringBuilder();
        // set the length for '*'
        int horizontalLineLength = 80;
        // Append the title and the centered "Task Instance Detail"
        int titleLength = 40;
        int leftSpaces = (horizontalLineLength - titleLength) / 2;
        String centeredTitle = String.format("%" + leftSpaces + "s%s", "", "Task Instance Detail");
        logBuilder.append("\n").append(Strings.repeat("*", horizontalLineLength)).append("\n")
                .append(centeredTitle).append("\n")
                .append(Strings.repeat("*", horizontalLineLength)).append("\n")
                .append("Task Name:              ").append(taskInstance.getName()).append("\n")
                .append("Workflow Instance Name: ").append(taskInstance.getProcessInstance().getName()).append("\n")
                // The next line throws the NPE: TaskExecuteType was never set when the workflow definition was created, and the API does not validate it
                .append("Task Execute Type:      ").append(taskInstance.getTaskExecuteType().getDesc()).append("\n")
                .append("Execute State:          ").append(taskInstance.getState().getDesc()).append("\n")
                .append("Host:                   ").append(taskInstance.getHost()).append("\n")
                .append("Task Type:              ").append(taskInstance.getTaskType()).append("\n")
                .append("Priority:               ").append(taskInstance.getTaskInstancePriority().getDescp())
                .append("\n")
                .append("Tenant:                 ").append(taskInstance.getProcessInstance().getTenantCode())
                .append("\n")
                .append("First Submit Time:      ").append(taskInstance.getFirstSubmitTime()).append("\n")
                .append("Submit Time:            ").append(taskInstance.getSubmitTime()).append("\n")
                .append("Start Time:             ").append(taskInstance.getStartTime()).append("\n")
                .append("End Time:               ").append(taskInstance.getEndTime()).append("\n");
        return logBuilder.toString();
    }
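
To make the failure mode concrete, here is a minimal sketch of the chained call that fails and a null-safe alternative. `ExecuteType`, `DemoTaskInstance`, and `NullSafeLogDemo` are hypothetical stand-ins invented for this illustration, not DolphinScheduler classes, and the "UNKNOWN" placeholder is an assumption.

```java
import java.util.Optional;

// Hypothetical stand-in for the real enum, reduced to what the sketch needs.
enum ExecuteType {
    BATCH("batch"),
    STREAM("stream");

    private final String desc;

    ExecuteType(String desc) {
        this.desc = desc;
    }

    String getDesc() {
        return desc;
    }
}

// Hypothetical stand-in for TaskInstance with a single nullable field.
class DemoTaskInstance {
    private final ExecuteType taskExecuteType;

    DemoTaskInstance(ExecuteType taskExecuteType) {
        this.taskExecuteType = taskExecuteType;
    }

    ExecuteType getTaskExecuteType() {
        return taskExecuteType;
    }
}

class NullSafeLogDemo {

    // Mirrors the failing line: throws NullPointerException when the enum is null.
    static String unsafeDesc(DemoTaskInstance task) {
        return task.getTaskExecuteType().getDesc();
    }

    // Null-safe variant: falls back to a placeholder instead of throwing.
    static String safeDesc(DemoTaskInstance task) {
        return Optional.ofNullable(task.getTaskExecuteType())
                .map(ExecuteType::getDesc)
                .orElse("UNKNOWN");
    }

    public static void main(String[] args) {
        DemoTaskInstance broken = new DemoTaskInstance(null);
        System.out.println("Task Execute Type:      " + safeDesc(broken));
        try {
            unsafeDesc(broken);
        } catch (NullPointerException e) {
            System.out.println("unsafeDesc threw NullPointerException");
        }
    }
}
```

A guard like `safeDesc` only keeps the log helper from throwing; the missing value itself still has to be rejected or defaulted where the definition is created.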

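Summary:
A NullPointerException is thrown, so the TaskInstance is removed from completeTaskSet and the state event is retried. The task is then executed again, fails again, is removed again, and is scheduled again. On the page the task shows as succeeded, then running, then succeeded, then running, and so on.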

2. Fix

2.1. Write a null-checking deserializer class

import com.fasterxml.jackson.core.JsonParseException;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.JsonDeserializer;

import java.io.IOException;

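// Wraps a delegate deserializer and rejects a null result.
public class NullCheckingDeserializer extends JsonDeserializer<Object> {

    private final JsonDeserializer<Object> defaultDeserializer;

    // Caveat: @JsonDeserialize(using = ...) instantiates the class through a
    // no-argument constructor, so the delegate has to be supplied some other way.
    public NullCheckingDeserializer(JsonDeserializer<Object> defaultDeserializer) {
        this.defaultDeserializer = defaultDeserializer;
    }

    @Override
    public Object deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
        // Caveat: Jackson does not call deserialize() for a JSON null literal
        // (it uses getNullValue()), nor for a property absent from the payload.
        Object value = defaultDeserializer.deserialize(p, ctxt);
        if (value == null) {
            throw new JsonParseException(p, "Value cannot be null.");
        }
        return value;
    }
}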

2.2. Then add it to the field in the definition entity

org.apache.dolphinscheduler.dao.entity.TaskDefinition

@JsonDeserialize(using = JSONUtils.NullCheckingDeserializer.class)
private TaskExecuteType taskExecuteType;
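
The same fail-fast idea can also be applied as an explicit check before a definition is saved, since the root cause above was that the API accepted a definition without TaskExecuteType. The following is only a sketch under stated assumptions: `DemoTaskDefinition` and `TaskDefinitionValidator` are hypothetical names made up for this example, the entity is reduced to two fields, and the execute type is kept as a String to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified task definition; the real entity has many more fields.
class DemoTaskDefinition {
    final String name;
    final String taskExecuteType; // a String here only to keep the sketch self-contained

    DemoTaskDefinition(String name, String taskExecuteType) {
        this.name = name;
        this.taskExecuteType = taskExecuteType;
    }
}

class TaskDefinitionValidator {

    // Returns every problem found; an empty list means the definition is acceptable.
    static List<String> validate(DemoTaskDefinition def) {
        List<String> errors = new ArrayList<>();
        if (def.name == null || def.name.trim().isEmpty()) {
            errors.add("name must not be empty");
        }
        if (def.taskExecuteType == null) {
            errors.add("taskExecuteType must not be null");
        }
        return errors;
    }

    public static void main(String[] args) {
        // A definition created without an execute type is rejected up front,
        // instead of failing later inside the master's logging code.
        DemoTaskDefinition missingType = new DemoTaskDefinition("demo-task", null);
        System.out.println(validate(missingType));
    }
}
```

Collecting all errors in a list, rather than throwing on the first one, lets the caller report every missing field in a single response.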

journey