SparkConf — Spark Application’s Configuration
Tip: Refer to Spark Configuration in the official documentation for extensive coverage of how to configure Spark and user programs.
Caution: TODO
There are three ways to configure Spark and user programs:

- Spark Properties – use the web UI to learn the current properties.
- …
setIfMissing Method

Caution: FIXME
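As a hedged sketch of typical usage (the property names and values are illustrative): setIfMissing assigns a value to a configuration key only when no value has been set for it yet, so explicit settings are preserved.

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.app.name", "explicitly-set")       // illustrative value
  .setIfMissing("spark.app.name", "fallback")    // ignored: the key already has a value
  .setIfMissing("spark.executor.memory", "1g")   // applied: the key was not set before

assert(conf.get("spark.app.name") == "explicitly-set")
assert(conf.get("spark.executor.memory") == "1g")
```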
isExecutorStartupConf Method

Caution: FIXME
set Method

Caution: FIXME
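A minimal sketch of set (the property names and values are illustrative): it assigns a value to a configuration key and returns the SparkConf itself, so calls can be chained.

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.master", "local[*]")
  .set("spark.app.name", "SparkConf demo")    // illustrative application name
  .set("spark.executor.memory", "2g")

assert(conf.get("spark.executor.memory") == "2g")
```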
Mandatory Settings – spark.master and spark.app.name
There are two mandatory settings that every Spark application has to define before it can be run — spark.master and spark.app.name.
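A minimal sketch of defining both mandatory settings programmatically before creating a SparkContext (the application name is an arbitrary example):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("local[*]")               // sets spark.master
  .setAppName("My Spark Application")  // sets spark.app.name (illustrative name)

val sc = new SparkContext(conf)
```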
Spark Properties
Every user program starts with creating an instance of SparkConf that holds the master URL to connect to (spark.master), the name of your Spark application (which is later displayed in the web UI and becomes spark.app.name) and other Spark properties required for proper runs. The instance of SparkConf can then be used to create SparkContext.
Tip: Start Spark shell with … Use …
You can query for the values of Spark properties in Spark shell as follows:
```
scala> sc.getConf.getOption("spark.local.dir")
res0: Option[String] = None

scala> sc.getConf.getOption("spark.app.name")
res1: Option[String] = Some(Spark shell)

scala> sc.getConf.get("spark.master")
res2: String = local[*]
```
Setting up Spark Properties
A Spark application looks for Spark properties in the following places, in order of importance from the least to the most important (a short sketch of the precedence follows the list):

- conf/spark-defaults.conf – the configuration file with the default Spark properties. Read spark-defaults.conf.
- --conf or -c – the command-line option used by spark-submit (and other shell scripts that use spark-submit or spark-class under the covers, e.g. spark-shell)
- SparkConf
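A minimal sketch of the precedence, under the assumption that a spark.* Java system property stands in for a value coming from spark-defaults.conf or --conf (spark-submit ultimately makes such values visible to the driver JVM): a value set directly on SparkConf wins over a value loaded from system properties. The property values are illustrative.

```scala
import org.apache.spark.SparkConf

// Pretend this value arrived from spark-defaults.conf or --conf.
System.setProperty("spark.app.name", "from-defaults")

val conf = new SparkConf()  // loadDefaults = true: picks up spark.* system properties
assert(conf.get("spark.app.name") == "from-defaults")

// An explicit setter takes precedence over the loaded value.
conf.set("spark.app.name", "from-sparkconf")
assert(conf.get("spark.app.name") == "from-sparkconf")
```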
Default Configuration
The default Spark configuration is created when you execute the following code:
```scala
import org.apache.spark.SparkConf

val conf = new SparkConf
```
It simply loads the spark.* system properties.

You can use conf.toDebugString or conf.getAll to print out the spark.* system properties that have been loaded.
```
scala> conf.getAll
res0: Array[(String, String)] = Array((spark.app.name,Spark shell), (spark.jars,""), (spark.master,local[*]), (spark.submit.deployMode,client))

scala> conf.toDebugString
res1: String =
spark.app.name=Spark shell
spark.jars=
spark.master=local[*]
spark.submit.deployMode=client

scala> println(conf.toDebugString)
spark.app.name=Spark shell
spark.jars=
spark.master=local[*]
spark.submit.deployMode=client
```
Unique Identifier of Spark Application — getAppId Method
```
getAppId: String
```
getAppId returns the value of the spark.app.id Spark property, or throws a NoSuchElementException if it is not set.
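A hypothetical spark-shell session sketching both outcomes; the actual identifier differs from run to run, and a freshly created SparkConf has no spark.app.id yet:

```
scala> sc.getConf.getAppId   // spark.app.id is set once SparkContext is up and running
res0: String = local-1474019937829

scala> new org.apache.spark.SparkConf().getAppId   // spark.app.id has not been set
java.util.NoSuchElementException: spark.app.id
```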
Settings
| Spark Property | Default Value | Description |
|---|---|---|
| spark.master | | Master URL |
| spark.app.id | | Unique identifier of a Spark application that Spark uses to uniquely identify metric sources. Set when … |
| spark.app.name | | Application Name |