关注 spark技术分享,
撸spark源码 玩spark最佳实践

spark-class shell script

spark-class shell script

spark-class shell script is the Spark application command-line launcher that is responsible for setting up JVM environment and executing a Spark application.

Note
Ultimately, any shell script in Spark, e.g. spark-submit, calls spark-class script.

You can find spark-class script in bin directory of the Spark distribution.

When started, spark-class first loads $SPARK_HOME/bin/load-spark-env.sh, collects the Spark assembly jars, and executes org.apache.spark.launcher.Main.

Depending on the Spark distribution (or rather lack thereof), i.e. whether RELEASE file exists or not, it sets SPARK_JARS_DIR environment variable to [SPARK_HOME]/jars or [SPARK_HOME]/assembly/target/scala-[SPARK_SCALA_VERSION]/jars, respectively (with the latter being a local build).

If SPARK_JARS_DIR does not exist, spark-class prints the following error message and exits with the code 1.

spark-class sets LAUNCH_CLASSPATH environment variable to include all the jars under SPARK_JARS_DIR.

If SPARK_PREPEND_CLASSES is enabled, [SPARK_HOME]/launcher/target/scala-[SPARK_SCALA_VERSION]/classes directory is added to LAUNCH_CLASSPATH as the first entry.

Note
Use SPARK_PREPEND_CLASSES to have the Spark launcher classes (from [SPARK_HOME]/launcher/target/scala-[SPARK_SCALA_VERSION]/classes) to appear before the other Spark assembly jars. It is useful for development so your changes don’t require rebuilding Spark again.

SPARK_TESTING and SPARK_SQL_TESTING environment variables enable test special mode.

Caution
FIXME What’s so special about the env vars?

spark-class uses org.apache.spark.launcher.Main command-line application to compute the Spark command to launch. The Main class programmatically computes the command that spark-class executes afterwards.

Tip
Use JAVA_HOME to point at the JVM to use.

Launching org.apache.spark.launcher.Main Standalone Application

org.apache.spark.launcher.Main is a Scala standalone application used in spark-class to prepare the Spark command to execute.

Main expects that the first parameter is the class name that is the “operation mode”:

  1. org.apache.spark.deploy.SparkSubmit — Main uses SparkSubmitCommandBuilder to parse command-line arguments. This is the mode spark-submit uses.

  2. anything — Main uses SparkClassCommandBuilder to parse command-line arguments.

Main uses buildCommand method on the builder to build a Spark command.

If SPARK_PRINT_LAUNCH_COMMAND environment variable is enabled, Main prints the final Spark command to standard error.

If on Windows it calls prepareWindowsCommand while on non-Windows OSes prepareBashCommand with tokens separated by

赞(0) 打赏
未经允许不得转载:spark技术分享 » spark-class shell script
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏