SparkConf — Programmable Configuration for Spark Applications

SparkConf — Spark Application’s Configuration

Tip
Refer to Spark Configuration in the official documentation for an extensive coverage of how to configure Spark and user programs.
Caution

TODO

  • Describe SparkConf object for the application configuration.

  • the default configs

  • system properties

There are three ways to configure Spark and user programs:

  • Spark Properties – use Web UI to learn the current properties.

  • …​

setIfMissing Method

Caution
FIXME

isExecutorStartupConf Method

Caution
FIXME

set Method

Caution
FIXME

Mandatory Settings – spark.master and spark.app.name

Every Spark application has two mandatory settings that must be defined before it can run: spark.master and spark.app.name.

Spark Properties

Every user program starts by creating an instance of SparkConf that holds the master URL to connect to (spark.master), the name of your Spark application (spark.app.name, later displayed in web UI) and any other Spark properties the application needs to run properly. The SparkConf instance is then used to create SparkContext.
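
The flow described above can be sketched as follows. This is a minimal illustration, not the only way to bootstrap an application: the master URL local[2] and the application name are arbitrary example values, and the snippet assumes a standalone program or Scala REPL with Spark on the classpath (in Spark shell a SparkContext already exists as sc, so you would not create another one).

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The two mandatory settings: spark.master and spark.app.name.
// Both values below are example values chosen for this sketch.
val conf = new SparkConf()
  .setMaster("local[2]")        // stored as spark.master
  .setAppName("SparkConf demo") // stored as spark.app.name

// The SparkConf instance is handed over to SparkContext.
val sc = new SparkContext(conf)

// Capture the effective values before shutting the context down.
val effectiveMaster = sc.master
val effectiveAppName = sc.appName

sc.stop()
```

setMaster and setAppName are convenience methods; conf.set("spark.master", "local[2]") and conf.set("spark.app.name", "SparkConf demo") would have the same effect.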

Tip

Start Spark shell with --conf spark.logConf=true to log the effective Spark configuration as INFO when SparkContext is started.

Use sc.getConf.toDebugString to have a richer output once SparkContext has finished initializing.

You can query for the values of Spark properties in Spark shell through sc.getConf, e.g. sc.getConf.get("spark.master") or sc.getConf.getOption("spark.local.dir").
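
A short sketch of the common lookup methods follows. To keep it runnable outside Spark shell, it builds a stand-in SparkConf by hand; that stand-in and its two values are assumptions of this example. In Spark shell you would use val conf = sc.getConf instead.

```scala
import org.apache.spark.SparkConf

// Stand-in for sc.getConf so the snippet also runs outside Spark shell.
// `false` skips loading spark.* system properties, so only the two
// settings below are present.
val conf = new SparkConf(false)
  .setMaster("local[*]")
  .setAppName("Spark shell")

// get throws NoSuchElementException when the property is not set.
val master: String = conf.get("spark.master")

// getOption returns None instead of throwing.
val localDir: Option[String] = conf.getOption("spark.local.dir")

// get with a second argument falls back to the given default.
val uiPort: String = conf.get("spark.ui.port", "4040")

// contains only checks for presence.
val hasAppName: Boolean = conf.contains("spark.app.name")

// toDebugString lists all properties as key=value, one per line.
println(conf.toDebugString)
```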

Setting up Spark Properties

A Spark application looks for Spark properties in the following places (ordered from the least important to the most important):

  • conf/spark-defaults.conf – the configuration file with the default Spark properties. Read spark-defaults.conf.

  • --conf or -c – the command-line option used by spark-submit (and other shell scripts that use spark-submit or spark-class under the covers, e.g. spark-shell)

  • SparkConf
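
The ordering above can be illustrated with a small sketch. It rests on one assumption made for the sake of a self-contained example: a property coming from spark-defaults.conf or --conf is simulated by setting a spark.* JVM system property by hand, rather than by actually launching the program through spark-submit. The property values are made up.

```scala
import org.apache.spark.SparkConf

// Simulate a property supplied from outside the program
// (e.g. spark-submit --conf spark.app.name=from-spark-submit).
System.setProperty("spark.app.name", "from-spark-submit")

// A new SparkConf picks up spark.* system properties.
val conf = new SparkConf()
val beforeOverride = conf.get("spark.app.name")

// A value set explicitly on SparkConf replaces the earlier one.
conf.set("spark.app.name", "from-code")
val afterOverride = conf.get("spark.app.name")

// Clean up so the simulated property does not leak into other code.
System.clearProperty("spark.app.name")
```

This is why hard-coding a setting on SparkConf makes it impossible to change from the command line; leaving it out of the code keeps it configurable at submit time.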

Default Configuration

The default Spark configuration is created when you create a SparkConf with no arguments, i.e. new SparkConf().

It simply loads spark.* system properties.

You can use conf.toDebugString or conf.getAll to print out the spark.* system properties that were loaded.
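
As a rough illustration of the default configuration, the sketch below sets a system property and compares a default SparkConf with one created with loading of defaults switched off. The property name spark.demo.flag is hypothetical and exists only for this example.

```scala
import org.apache.spark.SparkConf

// Hypothetical property, used only to make the loading visible.
System.setProperty("spark.demo.flag", "true")

// Default configuration: loads every spark.* system property.
val conf = new SparkConf()

// Passing false skips loading system properties altogether.
val bare = new SparkConf(false)

// getAll returns the properties as an array of (key, value) pairs.
val loaded: Array[(String, String)] = conf.getAll
loaded.foreach { case (key, value) => println(s"$key=$value") }

// toDebugString renders the same information as one string.
println(conf.toDebugString)

val seenByDefault = conf.contains("spark.demo.flag")
val seenByBare = bare.contains("spark.demo.flag")

System.clearProperty("spark.demo.flag")
```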

Unique Identifier of Spark Application — getAppId Method

getAppId returns the value of the spark.app.id Spark property or throws a NoSuchElementException if the property is not set.
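
Both outcomes can be seen in the following sketch. The identifier app-demo-0001 is a made-up value; in a running application Spark sets spark.app.id itself, so setting it by hand here is purely for demonstration.

```scala
import org.apache.spark.SparkConf
import scala.util.Try

// Start from an empty configuration (no system properties loaded),
// so spark.app.id is guaranteed to be unset.
val conf = new SparkConf(false)

// With spark.app.id missing, getAppId throws NoSuchElementException.
val missing = Try(conf.getAppId)

// Made-up identifier, set by hand only for this example.
conf.set("spark.app.id", "app-demo-0001")
val appId = conf.getAppId
```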

Note

getAppId is used when:

Settings

Table 1. Spark Properties

Spark Property | Default Value | Description
spark.master | (none) | Master URL
spark.app.id | TaskScheduler.applicationId() | Unique identifier of a Spark application that Spark uses to uniquely identify metric sources. Set when SparkContext is created (right after TaskScheduler is started, which actually gives the identifier).
spark.app.name | (none) | Application Name
