3. azkaban-3.51.0: conditional workflows (flow) and parameter passing
  nNPyvzOmRTFq, November 2, 2023

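This article introduces Azkaban's conditional workflows and parameter passing. It has three parts: the flag file that every example in the article requires, conditional workflows with examples, and parameter passing with examples. It assumes a working Azkaban environment.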

I. Global flag file

All of the examples below need a flow20.project file (the file name does not matter; the content is as follows):

azkaban-flow-version: 2.0

II. Conditional workflow (flow)

Conditional workflow feature allows users to specify whether to run certain jobs based on conditions. Users can run or disable certain jobs based on runtime parameters like the output from previous jobs. Azkaban provides users with some predefined macros to specify the condition based on previous jobs’ status. With those conditions, users can obtain more flexibility in deciding the job execution logic. For example, they can run the current job as long as one of the parent jobs has succeeded. They can achieve branching logic inside a workflow.

Conditional workflow feature leverages Azkaban Flow 2.0 design (see Creating Flows). The condition is defined inside a flow YAML file.

How to define a condition?

A valid condition is a combination of multiple conditions on job runtime parameters and one condition on job status macro. Comparison and logical operators can be used to connect individual condition components. Supported operators are: ==, !=, >, >=, <, <=, &&, ||, !

Condition on job runtime parameter

Variable substitution ${jobName:param} can be used to define the condition on the job runtime parameter. : is used to separate the jobName and the parameter. The runtime parameter can be compared with a string or a number in the condition. Users need to write the value of the parameter into the $JOB_OUTPUT_PROP_FILE (this output file is available to most Azkaban jobs).

Condition on job status macro

This condition will be evaluated on all the parent jobs, i.e. the dependsOn section in YAML file.

Currently supported macros:

  • all_success (default)
  • all_done
  • all_failed
  • one_success (at least one parent job succeeded)
  • one_failed (at least one parent job failed)

Corresponding job status for each macro:

  • all_done: FAILED, KILLED, SUCCEEDED, SKIPPED, FAILED_SUCCEEDED, CANCELLED
  • all_success / one_success: SUCCEEDED, SKIPPED, FAILED_SUCCEEDED
  • all_failed / one_failed: FAILED, KILLED, CANCELLED

Users are not allowed to combine multiple conditions on job status macros in one single condition because they might have conflict with each other.

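Azkaban ships with several special built-in conditions, called predefined macros. A predefined macro is evaluated against the completion status of all parent jobs to decide whether the current job runs. The available predefined macros are:

  • all_success: run only if all parent jobs succeeded (default)
  • all_done: run only if all parent jobs have finished
  • all_failed: run only if all parent jobs failed
  • one_success: run if at least one parent job succeeded
  • one_failed: run if at least one parent job failed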
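
As a rough illustration of how these macros relate to parent job statuses, the following shell sketch evaluates a macro against a list of statuses. The function names (is_success, is_failed, eval_macro) are hypothetical and are not part of Azkaban; the status groupings follow the status lists quoted earlier in this section, and the sample statuses at the end are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not Azkaban code: they only model the status lists above.

# is_success STATUS: true for statuses counted by all_success / one_success.
is_success() {
  case "$1" in
    SUCCEEDED|SKIPPED|FAILED_SUCCEEDED) return 0 ;;
    *) return 1 ;;
  esac
}

# is_failed STATUS: true for statuses counted by all_failed / one_failed.
is_failed() {
  case "$1" in
    FAILED|KILLED|CANCELLED) return 0 ;;
    *) return 1 ;;
  esac
}

# eval_macro MACRO STATUS...: returns 0 when MACRO holds for the parent statuses.
eval_macro() {
  macro="$1"
  shift
  success=0
  failed=0
  total=0
  for status in "$@"; do
    total=$((total + 1))
    if is_success "$status"; then success=$((success + 1)); fi
    if is_failed "$status"; then failed=$((failed + 1)); fi
  done
  case "$macro" in
    all_success) [ "$success" -eq "$total" ] ;;
    all_failed)  [ "$failed" -eq "$total" ] ;;
    all_done)    [ $((success + failed)) -eq "$total" ] ;;
    one_success) [ "$success" -ge 1 ] ;;
    one_failed)  [ "$failed" -ge 1 ] ;;
    *) return 2 ;;   # unknown macro
  esac
}

# Assumed scenario: one parent SUCCEEDED, the other ended up CANCELLED.
if eval_macro one_success SUCCEEDED CANCELLED; then
  echo "one_success holds"
else
  echo "one_success does not hold"
fi
```

With the same two statuses, all_success would not hold, while all_done and one_failed would.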

1. Examples

  • Condition expression examples
${JobA:param1} == 1 && ${JobB:param2} > 5
one_success
all_done && ${JobC:param3} != "foo"
(!${JobD:param4} || !${JobE:param5}) && all_success || ${JobF:param6} == "bar"

A concrete example follows.

  • sample.flow
nodes:
 - name: JobA
   type: command
   config:
     command: bash ./write_to_props.sh  # writes the output parameter, i.e. assigns a value to param1

 - name: JobB
   type: command
   dependsOn:
     - JobA
   config:
     command: echo “This is JobB.”
   condition: ${JobA:param1} == 1 # runs only when this condition holds

 - name: JobC
   type: command
   dependsOn:
     - JobA
   config:
     command: echo “This is JobC.”
   condition: ${JobA:param1} == 2 # runs only when this condition holds

 - name: JobD
   type: command
   dependsOn:
     - JobB
     - JobC
   config:
     command: pwd
   condition: one_success # runs when at least one parent job succeeded
  • write_to_props.sh
echo '{"param1":"1"}' > "$JOB_OUTPUT_PROP_FILE"
  • Run result: [two images not preserved]

  • Run logs: JobA

18-08-2022 15:48:16 CST JobA INFO - Starting job JobA at 1660808896589
18-08-2022 15:48:16 CST JobA INFO - job JVM args: -Dazkaban.flowid=sample -Dazkaban.execid=46 -Dazkaban.jobid=JobA
18-08-2022 15:48:16 CST JobA INFO - user.to.proxy property was not set, defaulting to submit user azkaban
18-08-2022 15:48:16 CST JobA INFO - Building command job executor. 
18-08-2022 15:48:16 CST JobA INFO - Memory granted for job JobA
18-08-2022 15:48:16 CST JobA INFO - 1 commands to execute.
18-08-2022 15:48:16 CST JobA INFO - cwd=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46
18-08-2022 15:48:16 CST JobA INFO - effective user is: azkaban
18-08-2022 15:48:16 CST JobA INFO - Command: bash ./write_to_props.sh
18-08-2022 15:48:16 CST JobA INFO - Environment variables: {JOB_OUTPUT_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobA_output_5311332079979767272_tmp, JOB_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobA_props_1033652882785987662_tmp, KRB5CCNAME=/tmp/krb5cc__mutilexec__sample__JobA__46__azkaban, JOB_NAME=JobA}
18-08-2022 15:48:16 CST JobA INFO - Working directory: /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46
18-08-2022 15:48:16 CST JobA INFO - Process completed successfully in 0 seconds.
18-08-2022 15:48:16 CST JobA INFO - output properties file=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobA_output_5311332079979767272_tmp
18-08-2022 15:48:16 CST JobA INFO - Finishing job JobA at 1660808896602 with status SUCCEEDED

JobB

18-08-2022 15:48:16 CST JobB INFO - Starting job JobB at 1660808896739
18-08-2022 15:48:16 CST JobB INFO - job JVM args: -Dazkaban.flowid=sample -Dazkaban.execid=46 -Dazkaban.jobid=JobB
18-08-2022 15:48:16 CST JobB INFO - user.to.proxy property was not set, defaulting to submit user azkaban
18-08-2022 15:48:16 CST JobB INFO - Building command job executor. 
18-08-2022 15:48:16 CST JobB INFO - Memory granted for job JobB
18-08-2022 15:48:16 CST JobB INFO - 1 commands to execute.
18-08-2022 15:48:16 CST JobB INFO - cwd=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46
18-08-2022 15:48:16 CST JobB INFO - effective user is: azkaban
18-08-2022 15:48:16 CST JobB INFO - Command: echo “This is JobB.”
18-08-2022 15:48:16 CST JobB INFO - Environment variables: {JOB_OUTPUT_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobB_output_4506541865054375691_tmp, JOB_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobB_props_1246594669203301991_tmp, KRB5CCNAME=/tmp/krb5cc__mutilexec__sample__JobB__46__azkaban, JOB_NAME=JobB}
18-08-2022 15:48:16 CST JobB INFO - Working directory: /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46
18-08-2022 15:48:16 CST JobB INFO - “This is JobB.”
18-08-2022 15:48:16 CST JobB INFO - Process completed successfully in 0 seconds.
18-08-2022 15:48:16 CST JobB INFO - output properties file=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobB_output_4506541865054375691_tmp
18-08-2022 15:48:16 CST JobB INFO - Finishing job JobB at 1660808896751 with status SUCCEEDED

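JobC is not executed because its condition is not satisfied.

JobD (condition: it depends on JobB and JobC, but runs as long as one of them succeeds)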

18-08-2022 15:48:16 CST JobD INFO - Starting job JobD at 1660808896767
18-08-2022 15:48:16 CST JobD INFO - job JVM args: -Dazkaban.flowid=sample -Dazkaban.execid=46 -Dazkaban.jobid=JobD
18-08-2022 15:48:16 CST JobD INFO - user.to.proxy property was not set, defaulting to submit user azkaban
18-08-2022 15:48:16 CST JobD INFO - Building command job executor. 
18-08-2022 15:48:16 CST JobD INFO - Memory granted for job JobD
18-08-2022 15:48:16 CST JobD INFO - 1 commands to execute.
18-08-2022 15:48:16 CST JobD INFO - cwd=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46
18-08-2022 15:48:16 CST JobD INFO - effective user is: azkaban
18-08-2022 15:48:16 CST JobD INFO - Command: pwd
18-08-2022 15:48:16 CST JobD INFO - Environment variables: {JOB_OUTPUT_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobD_output_4201588014871968596_tmp, JOB_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobD_props_1359000998414188665_tmp, KRB5CCNAME=/tmp/krb5cc__mutilexec__sample__JobD__46__azkaban, JOB_NAME=JobD}
18-08-2022 15:48:16 CST JobD INFO - Working directory: /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46
18-08-2022 15:48:16 CST JobD INFO - /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46
18-08-2022 15:48:16 CST JobD INFO - Process completed successfully in 0 seconds.
18-08-2022 15:48:16 CST JobD INFO - output properties file=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/46/JobD_output_4201588014871968596_tmp
18-08-2022 15:48:16 CST JobD INFO - Finishing job JobD at 1660808896792 with status SUCCEEDED

III. Parameter passing

1. Parameter types

Parameters in an Azkaban workflow fall into the following types:

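  • parameters entered on the Azkaban UI page
  • environment variable parameters
  • parameters defined in the job file
  • user-defined property files of the workflow, and parameters passed from upstream jobs to downstream jobs
  • system parameters generated while the workflow runs
  • common parameters of a job

The parameter types and their corresponding scopes are as follows: [image not preserved]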

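2. Job parameter overview: common parameters

Besides the basic parameters type, command, and dependencies (the last is only needed when a job has upstream jobs), the following parameters can be configured for each job: [image not preserved]

For a flow's email properties, only the configuration of the last job is used; the email settings of the other jobs are ignored.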

3. Parameter passing between jobs

  • Parameter Passing: There is often a desire to pass these parameters to the executing job code. The method of passing these parameters is dependent on the jobtype that is run, but usually Azkaban writes these parameters to a temporary file that is readable by the job. The path of the file is set in JOB_PROP_FILE environment variable. The format is the same key value pair property files. Certain built-in job types do this automatically for you. The java type, for instance, will invoke your Runnable and given a proper constructor, Azkaban can pass parameters to your code automatically.
  • Parameter Output: Properties can be exported to be passed to its dependencies. A second environment variable JOB_OUTPUT_PROP_FILE is set by Azkaban. If a job writes a file to that path, Azkaban will read this file and then pass the output to the next jobs in the flow. The output file should be in json format. Certain built-in job types can handle this automatically, such as the java type.

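JOB_OUTPUT_PROP_FILE and JOB_PROP_FILE are both environment variables that point to file paths.

  • Passing parameters in: an upstream node writes the values it wants to export to the JOB_OUTPUT_PROP_FILE file in JSON format. While a job runs, Azkaban stores the temporary parameters passed in by upstream jobs, the runtime parameters, the parameters from the project's property files, the parameters from the job definition, and so on in the ${JOB_PROP_FILE} file as key=value pairs. They can be passed as arguments when the job's shell command is executed.
  • Passing parameters out: when an Azkaban job finishes, it can write parameters to the ${JOB_OUTPUT_PROP_FILE} file. Azkaban passes these parameters into the ${JOB_PROP_FILE} parameter file of the downstream jobs that depend on it, where those jobs can reference them. The parameters written to ${JOB_OUTPUT_PROP_FILE} must be in JSON format, otherwise a JSON parsing error is reported. The downstream node then sees the output as key-value pairs in JOB_PROP_FILE and can use a variable as ${key}.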
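
The two directions described above can be sketched in a single job script. This is a minimal sketch, not Azkaban's own code: the get_prop helper, the firstName key, and the greeting output key are hypothetical, and the temporary-file fallbacks exist only so the script can be tried outside Azkaban. It also assumes simple key=value lines without escaped characters.

```shell
#!/bin/sh
# Outside Azkaban these variables are unset, so fall back to temp files
# (assumption for local testing only; Azkaban sets both variables itself).
: "${JOB_PROP_FILE:=$(mktemp)}"
: "${JOB_OUTPUT_PROP_FILE:=$(mktemp)}"

# Simulate a value passed in by an upstream job (hypothetical key and value).
[ -s "$JOB_PROP_FILE" ] || printf 'firstName=John\n' > "$JOB_PROP_FILE"

# get_prop KEY: print the value of KEY from the key=value file.
# Note: a dot in KEY is treated as a regex wildcard by sed in this simple sketch.
get_prop() {
  sed -n "s/^$1=//p" "$JOB_PROP_FILE" | head -n 1
}

# Parameter in: read what upstream jobs and property files provided.
first_name=$(get_prop firstName)

# Parameter out: write one JSON object for downstream jobs.
printf '{"greeting":"Hello %s"}\n' "$first_name" > "$JOB_OUTPUT_PROP_FILE"

cat "$JOB_OUTPUT_PROP_FILE"
```

The output file is overwritten with > rather than appended to, so that it always holds exactly one JSON object.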

4. Example of parameter passing between jobs

  • baseflow.flow
# baseflow.flow
nodes:
  - name: jobB
    type: command
    dependsOn:
      - jobA
    config:
      command: sh commandB.sh "${firstName}"
      # command: sh commandB.sh "${lastName}"-"${firstName}"   only lastName was displayed; reason unknown
      # command: sh commandB.sh "${lastName}-${firstName}"     displayed lastName-firstName (the variant seen in the logs below)
  - name: jobA
    type: command
    config:
      command: sh commandA.sh
  • commandA.sh
#!/bin/bash
echo '{ "firstName":"John" , "lastName":"Doe" }' >> "${JOB_OUTPUT_PROP_FILE}"
  • commandB.sh
#!/bin/bash
cat "${JOB_PROP_FILE}" >> /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/logs/azkabantest.txt

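jobB depends on jobA. When jobA runs, it writes a JSON string to the file that ${JOB_OUTPUT_PROP_FILE} points to. jobB can run only after jobA has finished; when jobB runs, it writes the content output by jobA to /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/logs/azkabantest.txt and appends the firstName parameter to that file.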

  • Run result
[root@localhost logs]# pwd
/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/logs
[root@localhost logs]# cat azkabantest.txt
Doe-John
  • Workflow run log
19-08-2022 10:03:12 CST baseflow INFO - Assigned executor : localhost:12321
19-08-2022 10:03:12 CST baseflow INFO - Running execid:55 flow:baseflow project:6 version:19
19-08-2022 10:03:12 CST baseflow INFO - Updating initial flow directory.
19-08-2022 10:03:12 CST baseflow INFO - Fetching job and shared properties.
19-08-2022 10:03:12 CST baseflow INFO - Starting flows
19-08-2022 10:03:12 CST baseflow INFO - Running flow 'baseflow'.
19-08-2022 10:03:12 CST baseflow INFO - Configuring Azkaban metrics tracking for jobrunner object
19-08-2022 10:03:12 CST baseflow INFO - Submitting job 'jobA' to run.
19-08-2022 10:03:12 CST baseflow INFO - Created file appender for job jobA
19-08-2022 10:03:12 CST baseflow INFO - Attached file appender for job jobA
19-08-2022 10:03:12 CST baseflow INFO - Job Started: jobA
19-08-2022 10:03:12 CST baseflow INFO - No attachment file for job jobA written.
19-08-2022 10:03:12 CST baseflow INFO - Job jobA finished with status SUCCEEDED in 0 seconds
19-08-2022 10:03:12 CST baseflow INFO - Configuring Azkaban metrics tracking for jobrunner object
19-08-2022 10:03:12 CST baseflow INFO - Submitting job 'jobB' to run.
19-08-2022 10:03:12 CST baseflow INFO - Created file appender for job jobB
19-08-2022 10:03:12 CST baseflow INFO - Attached file appender for job jobB
19-08-2022 10:03:12 CST baseflow INFO - Job Started: jobB
19-08-2022 10:03:12 CST baseflow INFO - No attachment file for job jobB written.
19-08-2022 10:03:12 CST baseflow INFO - Job jobB finished with status SUCCEEDED in 0 seconds
19-08-2022 10:03:12 CST baseflow INFO - Flow '' is set to SUCCEEDED in 0 seconds
19-08-2022 10:03:12 CST baseflow INFO - Finishing up flow. Awaiting Termination
19-08-2022 10:03:12 CST baseflow INFO - Finished Flow
19-08-2022 10:03:12 CST baseflow INFO - Setting end time for flow 55 to 1660874592807
  • jobB run log
19-08-2022 10:03:12 CST jobB INFO - Starting job jobB at 1660874592783
19-08-2022 10:03:12 CST jobB INFO - job JVM args: -Dazkaban.flowid=baseflow -Dazkaban.execid=55 -Dazkaban.jobid=jobB
19-08-2022 10:03:12 CST jobB INFO - user.to.proxy property was not set, defaulting to submit user azkaban
19-08-2022 10:03:12 CST jobB INFO - Building command job executor. 
19-08-2022 10:03:12 CST jobB INFO - Memory granted for job jobB
19-08-2022 10:03:12 CST jobB INFO - 1 commands to execute.
19-08-2022 10:03:12 CST jobB INFO - cwd=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/55
19-08-2022 10:03:12 CST jobB INFO - effective user is: azkaban
19-08-2022 10:03:12 CST jobB INFO - Command: sh commandB.sh "Doe-John"
19-08-2022 10:03:12 CST jobB INFO - Environment variables: {JOB_OUTPUT_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/55/jobB_output_2783919976351931342_tmp, JOB_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/55/jobB_props_8122716689014464679_tmp, KRB5CCNAME=/tmp/krb5cc__mutilexec__baseflow__jobB__55__azkaban, JOB_NAME=jobB}
19-08-2022 10:03:12 CST jobB INFO - Working directory: /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/55
19-08-2022 10:03:12 CST jobB INFO - Process completed successfully in 0 seconds.
19-08-2022 10:03:12 CST jobB INFO - output properties file=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/55/jobB_output_2783919976351931342_tmp
19-08-2022 10:03:12 CST jobB INFO - Finishing job jobB at 1660874592796 with status SUCCEEDED

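5. Job parameters: property inheritance

Files with the .properties suffix are loaded as parameter files and are shared by every job in the flow. Property files are inherited through the directory hierarchy. For example, suppose the zip package has the following structure: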

system.properties
system.job
testflow/sit.properties
testflow/prd.properties
testflow/sit.job
testflow/prd.job

system.properties holds global properties; it is used by system.job and by sit.job and prd.job under the testflow directory. However, system.job does not inherit the properties in sit.properties and prd.properties.
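
The scoping rule just described can be modelled with a small shell sketch. It is only an illustration under stated assumptions: props_in_scope is a hypothetical helper, not an Azkaban command, and it simply lists the .properties files found in a job's own directory and in each parent directory up to the package root.

```shell
#!/bin/sh
# Recreate the example layout in a temporary directory (empty files suffice here).
root=$(mktemp -d)
mkdir -p "$root/testflow"
touch "$root/system.properties" "$root/system.job" \
      "$root/testflow/sit.properties" "$root/testflow/prd.properties" \
      "$root/testflow/sit.job" "$root/testflow/prd.job"

# props_in_scope ROOT JOB: hypothetical helper that lists the .properties files
# in the job's directory and in every parent directory up to ROOT.
props_in_scope() {
  scope_root="$1"
  dir=$(dirname "$2")
  while :; do
    for f in "$scope_root/$dir"/*.properties; do
      if [ -e "$f" ]; then
        echo "$dir/$(basename "$f")"
      fi
    done
    if [ "$dir" = "." ]; then
      break
    fi
    dir=$(dirname "$dir")
  done
}

echo "system.job sees:"
props_in_scope "$root" system.job
echo "testflow/sit.job sees:"
props_in_scope "$root" testflow/sit.job
```

The sketch models directory scoping only. It says nothing about which value wins when two property files in the same directory define the same key.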

6. Job parameters: property inheritance example

  • system.properties
druid.initialSize=10
druid.minIdle=10
druid.maxActive=50
druid.maxWait=60000
druid.timeBetweenEvictionRunsMillis=60000
druid.minEvictableIdleTimeMillis=300000
druid.validationQuery=SELECT 1 from dual
druid.testWhileIdle=true
druid.testOnBorrow=false
druid.testOnReturn=false
druid.poolPreparedStatements=true
druid.maxPoolPreparedStatementPerConnectionSize=20
druid.filters=wall,stat
  • system.job
type=command
command=echo ${druid.validationQuery}-${druid.filters}
  • testflow/sit.properties
connection.url=jdbc:mysql://192.168.10.44:3306/smp?autoReconnect=true&useUnicode=true&characterEncoding=utf-8&serverTimezone=GMT%2B8&useSSL=false
connection.username=sit_username
connection.password=sit_password
  • testflow/sit.job
type=command
dependencies=system
command=echo ${druid.validationQuery}-${druid.filters}-${connection.username}-${connection.password}
  • testflow/prd.properties
connection.url=jdbc:mysql://192.168.10.37:3306/smp?autoReconnect=true&useUnicode=true&characterEncoding=utf-8&serverTimezone=GMT%2B8&useSSL=false
connection.username=prd_username
connection.password=prd_password
  • testflow/prd.job
type=command
dependencies=system
command=echo ${druid.validationQuery}-${druid.filters}-${connection.username}-${connection.password}
  • Uploading the project: [image not preserved]
  • Run of the prd task (the sit task runs similarly and is not repeated here): [image not preserved]
19-08-2022 10:27:17 CST prd INFO - Assigned executor : localhost:12321
19-08-2022 10:27:17 CST prd INFO - Running execid:56 flow:prd project:6 version:20
19-08-2022 10:27:17 CST prd INFO - Updating initial flow directory.
19-08-2022 10:27:17 CST prd INFO - Fetching job and shared properties.
19-08-2022 10:27:17 CST prd INFO - Starting flows
19-08-2022 10:27:17 CST prd INFO - Running flow 'prd'.
19-08-2022 10:27:17 CST prd INFO - Configuring Azkaban metrics tracking for jobrunner object
19-08-2022 10:27:17 CST prd INFO - Submitting job 'system' to run.
19-08-2022 10:27:17 CST prd INFO - Created file appender for job system
19-08-2022 10:27:17 CST prd INFO - Attached file appender for job system
19-08-2022 10:27:17 CST prd INFO - Job Started: system
19-08-2022 10:27:17 CST prd INFO - No attachment file for job system written.
19-08-2022 10:27:17 CST prd INFO - Job system finished with status SUCCEEDED in 0 seconds
19-08-2022 10:27:17 CST prd INFO - Configuring Azkaban metrics tracking for jobrunner object
19-08-2022 10:27:17 CST prd INFO - Submitting job 'prd' to run.
19-08-2022 10:27:17 CST prd INFO - Created file appender for job prd
19-08-2022 10:27:17 CST prd INFO - Attached file appender for job prd
19-08-2022 10:27:17 CST prd INFO - Job Started: prd
19-08-2022 10:27:17 CST prd INFO - No attachment file for job prd written.
19-08-2022 10:27:17 CST prd INFO - Job prd finished with status SUCCEEDED in 0 seconds
19-08-2022 10:27:17 CST prd INFO - Flow '' is set to SUCCEEDED in 0 seconds
19-08-2022 10:27:17 CST prd INFO - Finishing up flow. Awaiting Termination
19-08-2022 10:27:17 CST prd INFO - Finished Flow
19-08-2022 10:27:17 CST prd INFO - Setting end time for flow 56 to 1660876037151

  • system.job run result

19-08-2022 10:27:17 CST system INFO - Starting job system at 1660876037104
19-08-2022 10:27:17 CST system INFO - job JVM args: -Dazkaban.flowid=prd -Dazkaban.execid=56 -Dazkaban.jobid=system
19-08-2022 10:27:17 CST system INFO - user.to.proxy property was not set, defaulting to submit user azkaban
19-08-2022 10:27:17 CST system INFO - Building command job executor. 
19-08-2022 10:27:17 CST system INFO - Memory granted for job system
19-08-2022 10:27:17 CST system INFO - 1 commands to execute.
19-08-2022 10:27:17 CST system INFO - cwd=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56
19-08-2022 10:27:17 CST system INFO - effective user is: azkaban
19-08-2022 10:27:17 CST system INFO - Command: echo SELECT 1 from dual-wall,stat
19-08-2022 10:27:17 CST system INFO - Environment variables: {JOB_OUTPUT_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56/system_output_549638214031155746_tmp, JOB_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56/system_props_6040607731603943516_tmp, KRB5CCNAME=/tmp/krb5cc__mutilexec__prd__system__56__azkaban, JOB_NAME=system}
19-08-2022 10:27:17 CST system INFO - Working directory: /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56
19-08-2022 10:27:17 CST system INFO - SELECT 1 from dual-wall,stat
19-08-2022 10:27:17 CST system INFO - Process completed successfully in 0 seconds.
19-08-2022 10:27:17 CST system INFO - output properties file=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56/system_output_549638214031155746_tmp
19-08-2022 10:27:17 CST system INFO - Finishing job system at 1660876037120 with status SUCCEEDED
  • prd.job run result
19-08-2022 10:27:17 CST prd INFO - Starting job prd at 1660876037128
19-08-2022 10:27:17 CST prd INFO - job JVM args: -Dazkaban.flowid=prd -Dazkaban.execid=56 -Dazkaban.jobid=prd
19-08-2022 10:27:17 CST prd INFO - user.to.proxy property was not set, defaulting to submit user azkaban
19-08-2022 10:27:17 CST prd INFO - Building command job executor. 
19-08-2022 10:27:17 CST prd INFO - Memory granted for job prd
19-08-2022 10:27:17 CST prd INFO - 1 commands to execute.
19-08-2022 10:27:17 CST prd INFO - cwd=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56/testflow
19-08-2022 10:27:17 CST prd INFO - effective user is: azkaban
19-08-2022 10:27:17 CST prd INFO - Command: echo SELECT 1 from dual-wall,stat-sit_username-sit_password
19-08-2022 10:27:17 CST prd INFO - Environment variables: {JOB_OUTPUT_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56/testflow/prd_output_3426270791861719483_tmp, JOB_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56/testflow/prd_props_3327412481427109800_tmp, KRB5CCNAME=/tmp/krb5cc__mutilexec__prd__prd__56__azkaban, JOB_NAME=prd}
19-08-2022 10:27:17 CST prd INFO - Working directory: /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56/testflow
19-08-2022 10:27:17 CST prd INFO - SELECT 1 from dual-wall,stat-sit_username-sit_password
19-08-2022 10:27:17 CST prd INFO - Process completed successfully in 0 seconds.
19-08-2022 10:27:17 CST prd INFO - output properties file=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/56/testflow/prd_output_3426270791861719483_tmp
19-08-2022 10:27:17 CST prd INFO - Finishing job prd at 1660876037139 with status SUCCEEDED

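7. Job parameters: parameter substitution

Azkaban supports parameter substitution. The substitution syntax is ${parameterName}: Azkaban replaces the parameter named inside ${}. Wherever ${parameterName} is found, whether in a job file, in a property file, or in runtime parameters, it is replaced with the corresponding value.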

# shared.properties
replaceparameter=bar
# myjob.job
param1=mytest
# ${replaceparameter} is replaced with bar
foo=${replaceparameter}
# ${param1} is replaced with mytest
param2=${param1}

Before the myjob job runs, foo is assigned the value bar and param2 is assigned the value mytest.
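
To make the substitution concrete, here is a small shell sketch that mimics it with sed. It is not how Azkaban implements substitution: resolve is a hypothetical helper, the property values are the ones from the example above, and the sketch assumes values that contain no | or & characters.

```shell
#!/bin/sh
# Known properties, taken from the example above.
props=$(mktemp)
cat > "$props" <<'EOF'
replaceparameter=bar
param1=mytest
EOF

# resolve TEXT: hypothetical helper that replaces every ${key} in TEXT
# with the value of key from the properties file.
resolve() {
  text="$1"
  while IFS='=' read -r key val; do
    # [$] matches a literal dollar sign, so the pattern is the literal ${key}.
    text=$(printf '%s\n' "$text" | sed "s|[$]{$key}|$val|g")
  done < "$props"
  printf '%s\n' "$text"
}

resolve 'foo=${replaceparameter}'
resolve 'param2=${param1}'
```

The single quotes around the arguments keep the shell itself from expanding ${...}, so the placeholders reach resolve unchanged.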

8. Passing parameters dynamically to shell

  • Step 1: specify the following in the job file test.job
type=command
command=echo ${test.ui.param}
  • Step 2: define the input parameter on the UI page: [image not preserved]

  • Step 3: edit test.job, add the parameter test.ui.param=test_ui_param, then execute

  • Run result

19-08-2022 10:45:37 CST test INFO - Starting job test at 1660877137088
19-08-2022 10:45:37 CST test INFO - job JVM args: -Dazkaban.flowid=test -Dazkaban.execid=59 -Dazkaban.jobid=test
19-08-2022 10:45:37 CST test INFO - user.to.proxy property was not set, defaulting to submit user azkaban
19-08-2022 10:45:37 CST test INFO - Building command job executor. 
19-08-2022 10:45:37 CST test INFO - Memory granted for job test
19-08-2022 10:45:37 CST test INFO - 1 commands to execute.
19-08-2022 10:45:37 CST test INFO - cwd=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/59
19-08-2022 10:45:37 CST test INFO - effective user is: azkaban
19-08-2022 10:45:37 CST test INFO - Command: echo test_ui_param
19-08-2022 10:45:37 CST test INFO - Environment variables: {JOB_OUTPUT_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/59/test_output_6446797515661898923_tmp, JOB_PROP_FILE=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/59/test_props_6734213231611895973_tmp, KRB5CCNAME=/tmp/krb5cc__mutilexec__test__test__59__azkaban, JOB_NAME=test}
19-08-2022 10:45:37 CST test INFO - Working directory: /usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/59
19-08-2022 10:45:37 CST test INFO - test_ui_param
19-08-2022 10:45:37 CST test INFO - Process completed successfully in 0 seconds.
19-08-2022 10:45:37 CST test INFO - output properties file=/usr/local/bigdata/azkaban3.51.0/exec-server/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/59/test_output_6446797515661898923_tmp
19-08-2022 10:45:37 CST test INFO - Finishing job test at 1660877137098 with status SUCCEEDED
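  • Notes on using parameters in shell: runtime parameters re-entered on the UI page can override the parameter values that the system generates by default. Runtime parameters and parameters entered in the UI can both be regarded as global parameters; anywhere in the workflow's job configuration they can be referenced as ${parameterName}.

  • Referencing common parameters, runtime system parameters, or UI input parameters directly inside a shell script does not work.
  • Only environment variables can be used directly in a shell script.
  • Common parameters, runtime system parameters, and UI input parameters can only be passed in as arguments to the shell script.
  • Environment variable parameters defined in the job file can be referenced directly in the shell script, but they are valid only for the current job.
  • On parameters between jobs, see https://github.com/azkaban/azkaban/issues/1897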
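
The points above suggest a simple pattern: let Azkaban substitute the value in the job's command line and read it inside the script as a positional argument. The sketch below assumes a hypothetical script named show_param.sh and a job command such as sh show_param.sh "${test.ui.param}"; only JOB_NAME is taken from the environment, as seen in the logs earlier.

```shell
#!/bin/sh
# show_param.sh (hypothetical name). Assumed job file entry:
#   command=sh show_param.sh "${test.ui.param}"
# Azkaban replaces ${test.ui.param} before the shell starts, so the script
# receives the value as its first positional argument.

show_param() {
  ui_param="${1:-}"
  if [ -z "$ui_param" ]; then
    echo "usage: show_param.sh <value>" >&2
    return 1
  fi
  # The substituted parameter arrives as an argument ...
  echo "param from Azkaban: $ui_param"
  # ... while environment variables such as JOB_NAME can be read directly.
  echo "job name from environment: ${JOB_NAME:-unset}"
}

# Sample call with the value used in this section's example.
show_param test_ui_param
```

Inside a real script the last line would be show_param "$@" so that the argument supplied by the job command is used.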

Last edited on November 8, 2023
