AbstractCommandBuilder

AbstractCommandBuilder is the base class for the specialized command builders SparkSubmitCommandBuilder and SparkClassCommandBuilder.

AbstractCommandBuilder expects its subclasses to define buildCommand.
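
The relationship between the base builder and a specialized builder can be sketched as follows. This is a simplified illustration rather than Spark's source: the class names BaseBuilder, EchoBuilder and BuilderDemo are hypothetical, and the buildCommand signature (an environment map in, a list of command elements out) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the base builder: shared state plus one abstract method.
abstract class BaseBuilder {
    protected final List<String> appArgs = new ArrayList<>();

    // Subclasses decide how the final command is assembled.
    abstract List<String> buildCommand(Map<String, String> env);
}

// Hypothetical subclass playing the role of a specialized command builder.
class EchoBuilder extends BaseBuilder {
    @Override
    List<String> buildCommand(Map<String, String> env) {
        List<String> cmd = new ArrayList<>();
        cmd.add("echo");
        cmd.addAll(appArgs);
        return cmd;
    }
}

class BuilderDemo {
    public static void main(String[] args) {
        EchoBuilder builder = new EchoBuilder();
        builder.appArgs.add("hello");
        System.out.println(builder.buildCommand(new HashMap<>()));
    }
}
```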

Table 1. AbstractCommandBuilder Methods

Method              Description
buildCommand        The only abstract method that subclasses have to define.
buildJavaCommand    Builds the Java command for a Spark application.
getConfDir          Computes the configuration directory of a Spark application.
loadPropertiesFile  Loads the configuration file for a Spark application, be it the user-specified properties file or the spark-defaults.conf file under the Spark configuration directory.

buildJavaCommand Internal Method

buildJavaCommand builds the Java command for a Spark application: a list of elements consisting of the path to the java executable, the JVM options from the java-opts file, and the class path.

If javaHome is set, buildJavaCommand adds [javaHome]/bin/java to the resulting Java command. Otherwise, it uses the JAVA_HOME environment variable and, if that is not set either, falls back to the java.home Java system property.

Caution
FIXME Who sets javaHome internal property and when?

buildJavaCommand loads extra Java options from the java-opts file in the configuration directory, if the file exists, and adds them to the resulting Java command.

Finally, buildJavaCommand builds the class path (including the extra class path, if non-empty) and adds it with -cp to the resulting Java command.
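
The three steps described in this section (java executable, java-opts, class path) can be sketched as below. This is a minimal sketch under stated assumptions, not Spark's implementation: the class JavaCommandSketch and its method signatures are hypothetical, the inputs are passed in explicitly to keep the example self-contained, and options in java-opts are split on plain whitespace without any quoting support.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class JavaCommandSketch {

    // Returns the first candidate that is neither null nor empty, or null if none qualifies.
    static String firstNonEmpty(String... candidates) {
        for (String c : candidates) {
            if (c != null && !c.isEmpty()) {
                return c;
            }
        }
        return null;
    }

    // javaHome, env, confDir and classPath are explicit parameters (an assumption of this sketch).
    static List<String> buildJavaCommand(String javaHome, Map<String, String> env,
                                         File confDir, List<String> classPath) throws IOException {
        List<String> cmd = new ArrayList<>();

        // 1. Path to the java executable: javaHome, then JAVA_HOME, then the java.home property.
        String home = firstNonEmpty(javaHome, env.get("JAVA_HOME"), System.getProperty("java.home"));
        cmd.add(String.join(File.separator, home, "bin", "java"));

        // 2. Extra JVM options from the java-opts file, if it exists.
        //    Assumption: whitespace-separated options, no quoting.
        File javaOpts = new File(confDir, "java-opts");
        if (javaOpts.isFile()) {
            for (String line : Files.readAllLines(javaOpts.toPath(), StandardCharsets.UTF_8)) {
                for (String opt : line.trim().split("\\s+")) {
                    if (!opt.isEmpty()) {
                        cmd.add(opt);
                    }
                }
            }
        }

        // 3. The class path, joined with the OS-specific path separator.
        cmd.add("-cp");
        cmd.add(String.join(File.pathSeparator, classPath));
        return cmd;
    }

    public static void main(String[] args) throws IOException {
        List<String> cmd = buildJavaCommand(null, System.getenv(), new File("conf"),
                                            Arrays.asList("app.jar"));
        System.out.println(cmd);
    }
}
```

Passing javaHome as null exercises the fallback chain; the printed executable path then depends on the local JAVA_HOME setting or the running JVM.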

buildClassPath Method

buildClassPath builds the classpath for a Spark application.

Note
Directory entries always end with the OS-specific file separator.

buildClassPath adds the following entries, in this order:

  1. SPARK_CLASSPATH environment variable

  2. The input appClassPath

  3. The configuration directory

  4. (only with SPARK_PREPEND_CLASSES set or SPARK_TESTING being 1) Locally compiled Spark classes in classes, test-classes and Core’s jars.

    Caution
    FIXME Elaborate on “locally compiled Spark classes”.
  5. (only with SPARK_SQL_TESTING being 1) …​

    Caution
    FIXME Elaborate on the SQL testing case
  6. HADOOP_CONF_DIR environment variable

  7. YARN_CONF_DIR environment variable

  8. SPARK_DIST_CLASSPATH environment variable
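
The ordering above can be sketched with an insertion-ordered set. This is a simplified illustration, not Spark's code: ClassPathSketch and its methods are hypothetical names, the environment is passed in as a map, duplicate entries are dropped (an assumption of this sketch), and the testing-only steps 4 and 5 are left out because the text does not describe them in detail.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

class ClassPathSketch {

    // Splits a path-separator-delimited string and appends each non-empty entry.
    // Entries that are existing directories get a trailing file separator.
    static void addToClassPath(Set<String> cp, String entries) {
        if (entries == null || entries.isEmpty()) {
            return;
        }
        for (String entry : entries.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue;
            }
            if (new File(entry).isDirectory() && !entry.endsWith(File.separator)) {
                entry += File.separator;
            }
            cp.add(entry);
        }
    }

    static List<String> buildClassPath(Map<String, String> env, String appClassPath, String confDir) {
        // LinkedHashSet keeps insertion order and ignores repeated entries.
        Set<String> cp = new LinkedHashSet<>();
        addToClassPath(cp, env.get("SPARK_CLASSPATH"));       // step 1
        addToClassPath(cp, appClassPath);                     // step 2
        addToClassPath(cp, confDir);                          // step 3
        // Steps 4 and 5 (testing-only entries) are omitted from this sketch.
        addToClassPath(cp, env.get("HADOOP_CONF_DIR"));       // step 6
        addToClassPath(cp, env.get("YARN_CONF_DIR"));         // step 7
        addToClassPath(cp, env.get("SPARK_DIST_CLASSPATH"));  // step 8
        return new ArrayList<>(cp);
    }

    public static void main(String[] args) {
        System.out.println(buildClassPath(System.getenv(), "app.jar", "."));
    }
}
```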

Note
childEnv is queried first, before the environment variables of the process. It is always empty for AbstractCommandBuilder (and SparkSubmitCommandBuilder, too).
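
The lookup order described in this note can be sketched as a two-step lookup. The names EnvLookupSketch and getenv are hypothetical, and the second map is an assumption standing in for whatever source is consulted after childEnv (the main method below passes the process environment).

```java
import java.util.HashMap;
import java.util.Map;

class EnvLookupSketch {

    // Returns the childEnv value when it is non-empty, otherwise the fallback value (may be null).
    static String getenv(Map<String, String> childEnv, Map<String, String> fallback, String key) {
        String fromChild = childEnv.get(key);
        if (fromChild != null && !fromChild.isEmpty()) {
            return fromChild;
        }
        return fallback.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> childEnv = new HashMap<>();  // empty, as the note describes
        System.out.println(getenv(childEnv, System.getenv(), "SPARK_HOME"));
    }
}
```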

Loading Properties File — loadPropertiesFile Internal Method

loadPropertiesFile is part of the AbstractCommandBuilder private API. It loads Spark settings from a properties file (when specified on the command line) or from spark-defaults.conf in the configuration directory.

It checks the following locations in order and loads the settings from the first properties file found:

  1. propertiesFile (if specified using --properties-file command-line option or set by AbstractCommandBuilder.setPropertiesFile).

  2. [SPARK_CONF_DIR]/spark-defaults.conf

  3. [SPARK_HOME]/conf/spark-defaults.conf

Note
loadPropertiesFile reads a properties file using UTF-8.
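
A minimal sketch of this lookup and of reading the file as UTF-8 follows. It is an illustration under assumptions, not Spark's code: PropertiesLoaderSketch is a hypothetical name, the configuration directory is assumed to be resolved already, and a missing file simply yields an empty result here, which may differ from what Spark does.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesLoaderSketch {

    // propertiesFile may be null; confDir is the already-resolved configuration directory.
    static Properties loadPropertiesFile(String propertiesFile, String confDir) throws IOException {
        // Prefer the user-specified file, otherwise fall back to spark-defaults.conf.
        File file = propertiesFile != null
            ? new File(propertiesFile)
            : new File(confDir, "spark-defaults.conf");

        Properties props = new Properties();
        if (file.isFile()) {
            // Read through a Reader so that the content is decoded as UTF-8.
            try (InputStreamReader reader =
                     new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8)) {
                props.load(reader);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(loadPropertiesFile(null, "conf"));
    }
}
```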

Spark’s Configuration Directory — getConfDir Internal Method

AbstractCommandBuilder uses getConfDir to compute the current configuration directory of a Spark application.

It uses SPARK_CONF_DIR (from childEnv, which is always empty anyway, or from an environment variable) and falls back to [SPARK_HOME]/conf (with SPARK_HOME from the getSparkHome internal method).
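
The fallback can be sketched in a few lines. ConfDirSketch is a hypothetical name, and both the environment map and the Spark home directory are passed in as parameters, which is an assumption made to keep the example self-contained.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

class ConfDirSketch {

    // Returns SPARK_CONF_DIR when set and non-empty, otherwise [sparkHome]/conf.
    static String getConfDir(Map<String, String> env, String sparkHome) {
        String confDir = env.get("SPARK_CONF_DIR");
        if (confDir != null && !confDir.isEmpty()) {
            return confDir;
        }
        return sparkHome + File.separator + "conf";
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        System.out.println(getConfDir(env, "/opt/spark"));
    }
}
```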

Spark’s Home Directory — getSparkHome Internal Method

AbstractCommandBuilder uses getSparkHome to compute Spark’s home directory for a Spark application.

It uses SPARK_HOME (from childEnv, which is always empty anyway, or from an environment variable).

If SPARK_HOME is not set, Spark throws an IllegalStateException.
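
The failure mode can be sketched as follows. SparkHomeSketch is a hypothetical name, the environment is passed in as a single map for simplicity, and the exception message is a placeholder invented for this example rather than Spark's actual message.

```java
import java.util.HashMap;
import java.util.Map;

class SparkHomeSketch {

    // Returns SPARK_HOME from the given environment or fails when it is missing or empty.
    static String getSparkHome(Map<String, String> env) {
        String path = env.get("SPARK_HOME");
        if (path == null || path.isEmpty()) {
            // Placeholder message (hypothetical, not Spark's wording).
            throw new IllegalStateException("SPARK_HOME is not set");
        }
        return path;
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put("SPARK_HOME", "/opt/spark");
        System.out.println(getSparkHome(env));
    }
}
```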
