SQLConf — Internal Configuration Store
SQLConf is an internal key-value configuration store for parameters and hints used in Spark SQL.
Note: Spark SQL configuration is also available through RuntimeConfig (the user-facing configuration management interface), which you can access using SparkSession.
You can access a SQLConf using:

- SQLConf.get (preferred) – the SQLConf of the current active SparkSession
- SessionState – direct access through the SessionState of the SparkSession of your choice (which gives you more flexibility over which SparkSession is used, and that can be different from the current active SparkSession)
```scala
import org.apache.spark.sql.internal.SQLConf

// Use type-safe access to configuration properties
// using SQLConf.get.getConf
val parallelFileListingInStatsComputation =
  SQLConf.get.getConf(SQLConf.PARALLEL_FILE_LISTING_IN_STATS_COMPUTATION)

// or even simpler
SQLConf.get.parallelFileListingInStatsComputation
```
SQLConf offers methods to get, set, unset and clear values of configuration properties, as well as accessor methods to read the current value of a configuration property or hint.
```scala
scala> :type spark
org.apache.spark.sql.SparkSession

// Direct access to the session SQLConf
val sqlConf = spark.sessionState.conf
scala> :type sqlConf
org.apache.spark.sql.internal.SQLConf

scala> println(sqlConf.offHeapColumnVectorEnabled)
false

// Or simply import the conf value
import spark.sessionState.conf

// accessing properties through accessor methods
scala> conf.numShufflePartitions
res1: Int = 200

// Prefer SQLConf.get (over direct access)
import org.apache.spark.sql.internal.SQLConf
val cc = SQLConf.get
scala> cc == conf
res4: Boolean = true

// setting properties using aliases
import org.apache.spark.sql.internal.SQLConf.SHUFFLE_PARTITIONS
conf.setConf(SHUFFLE_PARTITIONS, 2)
scala> conf.numShufflePartitions
res2: Int = 2

// unset aka reset properties to the default value
conf.unsetConf(SHUFFLE_PARTITIONS)
scala> conf.numShufflePartitions
res3: Int = 200
```
Name | Description
---|---
… | Used exclusively in JoinSelection execution planning strategy
… | Used exclusively in BroadcastExchangeExec (for broadcasting a table to executors)
… | Used exclusively in the pivot operator
… | Used exclusively in RelationalGroupedDataset when creating the result …
… | Used when ReuseSubquery and ReuseExchange physical optimizations are executed
… | Used exclusively in CostBasedJoinReorder logical plan optimization
… | Used exclusively when a physical operator is requested the first n rows as an array
spark.sql.statistics.parallelFileListingInStatsComputation.enabled | Used exclusively when …
… | Used exclusively when InsertIntoHadoopFsRelationCommand logical command is executed
… | Used exclusively in JoinSelection execution planning strategy to prefer sort merge join over shuffle hash join
… | Used exclusively in ReorderJoin logical plan optimization (and indirectly in …)
Getting Parameters and Hints
You can get the current parameters and hints using the following family of get
methods.
```scala
getConf[T](entry: ConfigEntry[T], defaultValue: T): T
getConf[T](entry: ConfigEntry[T]): T
getConf[T](entry: OptionalConfigEntry[T]): Option[T]
getConfString(key: String): String
getConfString(key: String, defaultValue: String): String
getAllConfs: immutable.Map[String, String]
getAllDefinedConfs: Seq[(String, String, String)]
```
Setting Parameters and Hints
You can set parameters and hints using the following family of set
methods.
```scala
setConf(props: Properties): Unit
setConfString(key: String, value: String): Unit
setConf[T](entry: ConfigEntry[T], value: T): Unit
```
Unsetting Parameters and Hints
You can unset parameters and hints using the following family of unset
methods.
```scala
unsetConf(key: String): Unit
unsetConf(entry: ConfigEntry[_]): Unit
```
Clearing All Parameters and Hints
```scala
clear(): Unit
```

You can use clear to remove all the parameters and hints in SQLConf.
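Taken together, the four families above behave like a mutable key-value store where every registered entry carries a default that unset falls back to. The following is a minimal pure-Scala sketch of that contract, not Spark's actual implementation; the names MiniConf and ConfigEntry's shape here are illustrative only:

```scala
import scala.collection.mutable

// Illustrative stand-in for a config entry: a key plus its default value.
final case class ConfigEntry[T](key: String, defaultValue: T)

// Minimal model of the get/set/unset/clear families described above.
final class MiniConf {
  private val settings = mutable.Map.empty[String, String]

  def setConfString(key: String, value: String): Unit = settings(key) = value
  def setConf[T](entry: ConfigEntry[T], value: T): Unit =
    settings(entry.key) = value.toString

  def getConfString(key: String, defaultValue: String): String =
    settings.getOrElse(key, defaultValue)
  def getConf[T](entry: ConfigEntry[T]): String =
    settings.getOrElse(entry.key, entry.defaultValue.toString)

  // unset removes the explicit value, so reads fall back to the default
  def unsetConf(key: String): Unit = settings -= key
  // clear removes all explicitly-set values at once
  def clear(): Unit = settings.clear()

  def getAllConfs: Map[String, String] = settings.toMap
}

val conf = new MiniConf
val shufflePartitions = ConfigEntry("spark.sql.shuffle.partitions", 200)

conf.setConf(shufflePartitions, 2)
assert(conf.getConf(shufflePartitions) == "2")

conf.unsetConf(shufflePartitions.key)            // back to the default
assert(conf.getConf(shufflePartitions) == "200")

conf.setConfString("k", "v")
conf.clear()
assert(conf.getAllConfs.isEmpty)
```

The key design point the sketch captures is that defaults live with the entry, not in the store, which is why unsetConf and clear "reset" rather than erase a property.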
Redacting Data Source Options with Sensitive Information — redactOptions Method
```scala
redactOptions(options: Map[String, String]): Map[String, String]
```
redactOptions takes the values of the spark.sql.redaction.options.regex and spark.redaction.regex configuration properties.

For every regular expression (in that order), redactOptions redacts sensitive information: it looks for the first match of the pattern in every option key and value, and if either matches, replaces the value with ***(redacted).
Note: redactOptions is used exclusively when SaveIntoDataSourceCommand logical command is requested for the simple description.
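The matching rule described above can be sketched in plain Scala. This is an illustrative model of the behavior, not Spark's code; the helper name redactOptions, the sample regex, and the sample options are all assumptions for the example:

```scala
import scala.util.matching.Regex

// Sketch of the rule: for each regex, if it matches an option's key or
// value, replace that option's value with ***(redacted).
def redactOptions(regexes: Seq[Regex],
                  options: Map[String, String]): Map[String, String] =
  regexes.foldLeft(options) { (opts, r) =>
    opts.map { case (k, v) =>
      if (r.findFirstIn(k).isDefined || r.findFirstIn(v).isDefined)
        k -> "***(redacted)"
      else
        k -> v
    }
  }

// Stand-in for the configured redaction pattern
val secretPattern: Regex = "(?i)secret|password".r

val redacted = redactOptions(
  Seq(secretPattern),
  Map("url" -> "jdbc:postgresql://db", "password" -> "hunter2"))

assert(redacted("password") == "***(redacted)")   // key matched the pattern
assert(redacted("url") == "jdbc:postgresql://db") // untouched
```

Note that the key itself is left intact; only the value is replaced, so a redacted option list still shows which options were set.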