Is DolphinScheduler's AOP approach to getting the YARN applicationId elegant?

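To state my personal view up front: no, it is not elegant. Parsing the log worked perfectly well; does bolting an AOP interceptor on top really make it better? Worse, it does not run: I tested it on DolphinScheduler 3.2.1 and could not get it to work. So what exactly is the point?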

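1. Original source

I went through the original code that the original contributor keeps on GitHub, https://github.com/gabrywu/Aop2YarnClient/tree/master, and could not get it to run in my tests. Maybe the problem is on my side. To be fair, gabrywu has quite a few GitHub stars and is clearly an active community enthusiast.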

2. DolphinScheduler YarnClientAspect

According to the official documentation, this is configured in common.properties:

# way to collect applicationId: log(original regex match), aop
# Note: the default is log, i.e. regex matching on the task log. That works fine for me; with aop configured I could not get it to run
appId.collect=log
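
To make the effect of this setting concrete, here is a minimal sketch of resolving the collection mode from properties text. It uses only `java.util.Properties`; the class `AppIdCollectConfig` and the method `collectWay` are hypothetical names for illustration, and the constant values are assumed from the key and default shown above, not copied from DolphinScheduler.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class AppIdCollectConfig {
    // Assumed values: the key as it appears in common.properties and the "log" default noted above.
    static final String APPID_COLLECT = "appId.collect";
    static final String DEFAULT_COLLECT_WAY = "log";

    /** Parses properties text and returns the configured collection mode, or "log" if the key is absent. */
    static String collectWay(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Properties keeps trailing whitespace in values, so trim before comparing.
        return props.getProperty(APPID_COLLECT, DEFAULT_COLLECT_WAY).trim();
    }

    public static void main(String[] args) {
        System.out.println(collectWay("# way to collect applicationId\nappId.collect=aop\n"));
        System.out.println(collectWay("data.basedir.path=/tmp/dolphinscheduler\n"));
    }
}
```

With the key present the first call yields `aop`; when the key is missing the second call falls back to `log`.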

Core code:

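import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Collections;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.aspectj.lang.annotation.AfterReturning;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@Aspect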
public class YarnClientAspect {

    /**
     * The current application report when application submitted successfully
     */
    private ApplicationReport currentApplicationReport = null;

    private final String appInfoFilePath;

    protected final Logger logger = LoggerFactory.getLogger(getClass());

    public YarnClientAspect() {
        // Is the path really always <user.dir>/appInfo.log, i.e. under the process working directory?
        appInfoFilePath = String.format("%s/%s", System.getProperty("user.dir"), "appInfo.log");
    }

    /**
     * Trigger submitApplication when invoking YarnClientImpl.submitApplication
     *
     * @param appContext     application context when invoking YarnClientImpl.submitApplication
     * @param submittedAppId the submitted application id returned by YarnClientImpl.submitApplication
     * @throws Throwable exceptions
     */
    @AfterReturning(pointcut = "execution(ApplicationId org.apache.hadoop.yarn.client.api.impl.YarnClientImpl." +
            "submitApplication(ApplicationSubmissionContext)) && args(appContext)", returning = "submittedAppId", argNames = "appContext,submittedAppId")
    public void registerApplicationInfo(ApplicationSubmissionContext appContext, ApplicationId submittedAppId) {
        if (appInfoFilePath != null) {
            try {
                // The aspect appends the submitted applicationId to the appInfo file here
                Files.write(Paths.get(appInfoFilePath),
                        Collections.singletonList(submittedAppId.toString()),
                        StandardOpenOption.CREATE,
                        StandardOpenOption.WRITE,
                        StandardOpenOption.APPEND);
            } catch (IOException ioException) {
                logger.error(
                        "YarnClientAspect[registerAppInfo]: can't output current application information, because {}",
                        ioException.getMessage());
            }
        }
        logger.info("YarnClientAspect[submitApplication]: current application context {}", appContext);
        logger.info("YarnClientAspect[submitApplication]: submitted application id {}", submittedAppId);
        logger.info(
                "YarnClientAspect[submitApplication]: current application report {}", currentApplicationReport);
    }

    /**
     * Trigger getAppReport only when invoking getApplicationReport within submitApplication
     * This method will invoke many times, however, the last ApplicationReport instance assigned to currentApplicationReport
     *
     * @param appReport current application report when invoking getApplicationReport within submitApplication
     * @param appId     current application id, which is the parameter of getApplicationReport
     */
    @AfterReturning(pointcut = "cflow(execution(ApplicationId org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(ApplicationSubmissionContext))) "
            +
            "&& !within(YarnClientAspect) && execution(ApplicationReport org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(ApplicationId)) && args(appId)", returning = "appReport", argNames = "appReport,appId")
    public void registerApplicationReport(ApplicationReport appReport, ApplicationId appId) {
        currentApplicationReport = appReport;
    }
}

What cflow does here: submitApplication calls getApplicationReport inside a while (true) loop.
Put plainly, the goal is to intercept a method that is called, repeatedly, from inside another method.

org.apache.hadoop.yarn.client.api.impl.YarnClientImpl#submitApplication

  public ApplicationId
      submitApplication(ApplicationSubmissionContext appContext)
          throws YarnException, IOException {
    ApplicationId applicationId = appContext.getApplicationId();
    if (applicationId == null) {
      throw new ApplicationIdNotProvidedException(
          "ApplicationId is not provided in ApplicationSubmissionContext");
    }
    SubmitApplicationRequest request =
        Records.newRecord(SubmitApplicationRequest.class);
    request.setApplicationSubmissionContext(appContext);

    // Automatically add the timeline DT into the CLC
    // Only when the security and the timeline service are both enabled
    if (isSecurityEnabled() && timelineV1ServiceEnabled) {
      addTimelineDelegationToken(appContext.getAMContainerSpec());
    }

    //TODO: YARN-1763:Handle RM failovers during the submitApplication call.
    rmClient.submitApplication(request);

    int pollCount = 0;
    long startTime = System.currentTimeMillis();
    EnumSet<YarnApplicationState> waitingStates = 
                                 EnumSet.of(YarnApplicationState.NEW,
                                 YarnApplicationState.NEW_SAVING,
                                 YarnApplicationState.SUBMITTED);
    EnumSet<YarnApplicationState> failToSubmitStates = 
                                  EnumSet.of(YarnApplicationState.FAILED,
                                  YarnApplicationState.KILLED);        
    while (true) {
      try {
        ApplicationReport appReport = getApplicationReport(applicationId);
        YarnApplicationState state = appReport.getYarnApplicationState();
        if (!waitingStates.contains(state)) {
          if(failToSubmitStates.contains(state)) {
            throw new YarnException("Failed to submit " + applicationId + 
                " to YARN : " + appReport.getDiagnostics());
          }
          LOG.info("Submitted application " + applicationId);
          break;
        }

        long elapsedMillis = System.currentTimeMillis() - startTime;
        if (enforceAsyncAPITimeout() &&
            elapsedMillis >= asyncApiPollTimeoutMillis) {
          throw new YarnException("Timed out while waiting for application " +
              applicationId + " to be submitted successfully");
        }

        // Notify the client through the log every 10 poll, in case the client
        // is blocked here too long.
        if (++pollCount % 10 == 0) {
          LOG.info("Application submission is not finished, " +
              "submitted application " + applicationId +
              " is still in " + state);
        }
        try {
          Thread.sleep(submitPollIntervalMillis);
        } catch (InterruptedException ie) {
          String msg = "Interrupted while waiting for application "
              + applicationId + " to be successfully submitted.";
          LOG.error(msg);
          throw new YarnException(msg, ie);
        }
      } catch (ApplicationNotFoundException ex) {
        // FailOver or RM restart happens before RMStateStore saves
        // ApplicationState
        LOG.info("Re-submit application " + applicationId + "with the " +
            "same ApplicationSubmissionContext");
        rmClient.submitApplication(request);
      }
    }

    return applicationId;
  }
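
The cflow condition described above can be illustrated without AspectJ. The following is a plain-Java emulation, not how AspectJ implements cflow and not DolphinScheduler code: a per-thread depth counter marks "currently inside submitApplication", and the stand-in advice only fires while that counter is positive. `CflowSketch`, its string-based reports, and the `polledStates` parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CflowSketch {
    // Depth > 0 means this thread is currently executing inside submitApplication.
    private final ThreadLocal<Integer> submitDepth = ThreadLocal.withInitial(() -> 0);

    // Plays the role of currentApplicationReport in the aspect: the last report wins.
    String currentApplicationReport = null;
    final List<String> intercepted = new ArrayList<>();

    /** Stand-in for getApplicationReport; the "advice" runs only within the submit flow. */
    String getApplicationReport(String appId, String state) {
        String report = appId + ":" + state;
        if (submitDepth.get() > 0) { // emulates the cflow(execution(... submitApplication ...)) test
            currentApplicationReport = report;
            intercepted.add(report);
        }
        return report;
    }

    /** Stand-in for submitApplication: polls until the state is no longer a waiting state. */
    String submitApplication(String appId, String... polledStates) {
        submitDepth.set(submitDepth.get() + 1);
        try {
            for (String state : polledStates) {
                getApplicationReport(appId, state);
                boolean waiting = state.equals("NEW")
                        || state.equals("NEW_SAVING")
                        || state.equals("SUBMITTED");
                if (!waiting) {
                    break;
                }
            }
            return appId;
        } finally {
            submitDepth.set(submitDepth.get() - 1);
        }
    }

    public static void main(String[] args) {
        CflowSketch client = new CflowSketch();
        client.getApplicationReport("application_1_0001", "RUNNING"); // outside the flow: ignored
        client.submitApplication("application_1_0001", "NEW", "SUBMITTED", "ACCEPTED");
        System.out.println(client.intercepted);
        System.out.println(client.currentApplicationReport);
    }
}
```

The call made outside submitApplication is ignored, while every poll inside it is captured and the last one overwrites the stored report, which is the behaviour the aspect's Javadoc describes.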

Where it is used:
org.apache.dolphinscheduler.plugin.task.api.AbstractYarnTask#getApplicationIds

org.apache.dolphinscheduler.server.worker.utils.TaskExecutionContextUtils#createTaskInstanceWorkingDirectory

// Set the appInfo path
taskExecutionContext.setAppInfoPath(FileUtils.getAppInfoPath(taskInstanceWorkingDirectory));

What is taskInstanceWorkingDirectory?

String taskInstanceWorkingDirectory = FileUtils.getTaskInstanceWorkingDirectory(
                taskExecutionContext.getTenantCode(),
                taskExecutionContext.getProjectCode(),
                taskExecutionContext.getProcessDefineCode(),
                taskExecutionContext.getProcessDefineVersion(),
                taskExecutionContext.getProcessInstanceId(),
                taskExecutionContext.getTaskInstanceId());


The prefix comes from common.properties: data.basedir.path=/tmp/dolphinscheduler

Batch:
/tmp/dolphinscheduler/exec/process/{tenantCode}/{projectCode}/{processDefineCode}_{processDefineVersion}/{processInstanceId}/{taskInstanceId}

Streaming:
/tmp/dolphinscheduler/exec/process/{tenantCode}/{projectCode}/{processDefineCode}_{processDefineVersion}/0/{taskInstanceId}

So for a batch task, for example, the applicationId is read from /tmp/dolphinscheduler/exec/process/{tenantCode}/{projectCode}/{processDefineCode}_{processDefineVersion}/{processInstanceId}/{taskInstanceId}/appInfo.log
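
The path layout above can be sketched as follows. This is only an illustration of the directory pattern listed in this post; `AppInfoPathSketch` and its methods are hypothetical and are not the actual `FileUtils` implementation, and the parameter types are assumptions.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class AppInfoPathSketch {
    /**
     * Builds {baseDir}/exec/process/{tenantCode}/{projectCode}/
     * {processDefineCode}_{processDefineVersion}/{processInstanceId}/{taskInstanceId}.
     * For a streaming task, pass 0 as processInstanceId, matching the pattern shown above.
     */
    static Path taskWorkingDirectory(String baseDir, String tenantCode, long projectCode,
                                     long processDefineCode, int processDefineVersion,
                                     int processInstanceId, int taskInstanceId) {
        return Paths.get(baseDir, "exec", "process", tenantCode,
                String.valueOf(projectCode),
                processDefineCode + "_" + processDefineVersion,
                String.valueOf(processInstanceId),
                String.valueOf(taskInstanceId));
    }

    /** The appInfo file sits directly inside the task working directory. */
    static Path appInfoPath(Path taskWorkingDirectory) {
        return taskWorkingDirectory.resolve("appInfo.log");
    }

    public static void main(String[] args) {
        Path batch = taskWorkingDirectory("/tmp/dolphinscheduler", "tenant", 1L, 2L, 3, 4, 5);
        Path stream = taskWorkingDirectory("/tmp/dolphinscheduler", "tenant", 1L, 2L, 3, 0, 5);
        System.out.println(appInfoPath(batch));
        System.out.println(appInfoPath(stream));
    }
}
```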

Where it is actually consumed:
org.apache.dolphinscheduler.plugin.task.api.AbstractYarnTask#getApplicationIds

public List<String> getApplicationIds() throws TaskException {
        // Reads appId.collect from common.properties: "aop" reads the appInfo file, anything else (default "log") parses the task log
        return LogUtils.getAppIds(taskRequest.getLogPath(), taskRequest.getAppInfoPath(),
                PropertyUtils.getString(APPID_COLLECT, DEFAULT_COLLECT_WAY));
    }

org.apache.dolphinscheduler.plugin.task.api.utils.LogUtils#getAppIds

public List<String> getAppIds(String logPath, String appInfoPath, String fetchWay) {
        if (!StringUtils.isEmpty(fetchWay) && fetchWay.equals("aop")) {
            log.info("Start finding appId in {}, fetch way: {} ", appInfoPath, fetchWay);
            // aop: read the IDs from the file written by the AOP interceptor
            return getAppIdsFromAppInfoFile(appInfoPath);
        } else {
            log.info("Start finding appId in {}, fetch way: {} ", logPath, fetchWay);
            // log: extract the IDs from the task log by regex matching
            return getAppIdsFromLogFile(logPath);
        }
    }
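
To show the difference between the two branches, here is a minimal sketch that works on already-read lines instead of files. The class `AppIdFetchSketch`, both helper methods, and the regex `application_\d+_\d+` are assumptions for illustration; the real `getAppIdsFromAppInfoFile` and `getAppIdsFromLogFile` may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppIdFetchSketch {
    // Assumed pattern for a YARN application id: application_<clusterTimestamp>_<sequence>.
    private static final Pattern APP_ID = Pattern.compile("application_\\d+_\\d+");

    /** "log" mode: scan every log line and collect each distinct id in first-seen order. */
    static List<String> fromLogLines(List<String> logLines) {
        Set<String> ids = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher matcher = APP_ID.matcher(line);
            while (matcher.find()) {
                ids.add(matcher.group());
            }
        }
        return new ArrayList<>(ids);
    }

    /** "aop" mode: the aspect writes one id per line, so non-blank lines are taken as ids. */
    static List<String> fromAppInfoLines(List<String> appInfoLines) {
        Set<String> ids = new LinkedHashSet<>();
        for (String line : appInfoLines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                ids.add(trimmed);
            }
        }
        return new ArrayList<>(ids);
    }

    /** Mirrors the dispatch shown above: only the exact value "aop" selects the appInfo branch. */
    static List<String> getAppIds(List<String> logLines, List<String> appInfoLines, String fetchWay) {
        if ("aop".equals(fetchWay)) {
            return fromAppInfoLines(appInfoLines);
        }
        return fromLogLines(logLines);
    }

    public static void main(String[] args) {
        List<String> log = List.of("INFO Submitted application application_1700000000000_0001", "noise");
        List<String> appInfo = List.of("application_1700000000000_0002");
        System.out.println(getAppIds(log, appInfo, "log"));
        System.out.println(getAppIds(log, appInfo, "aop"));
    }
}
```

In a real setup the lines would come from something like `Files.readAllLines` on the log path or the appInfo path; that step is left out here to keep the sketch free of file I/O.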

