SessionState — State Separation Layer Between SparkSessions
`SessionState` is the state separation layer between Spark SQL sessions, including SQL configuration, tables, functions, UDFs, the SQL parser, and everything else that depends on a SQLConf.

`SessionState` is available as the `sessionState` property of a `SparkSession`.
```scala
scala> :type spark
org.apache.spark.sql.SparkSession

scala> :type spark.sessionState
org.apache.spark.sql.internal.SessionState
```
`SessionState` is created when `SparkSession` is requested to instantiateSessionState (i.e. when requested for the `SessionState` for the first time, per the spark.sql.catalogImplementation configuration property).
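The dispatch on spark.sql.catalogImplementation can be sketched as follows. This is a minimal stand-in for what `SparkSession.instantiateSessionState` does (the two builder class names are the real ones from Spark 2.x; the helper function itself is hypothetical):

```scala
// Sketch: mapping spark.sql.catalogImplementation to the session state
// builder class that SparkSession instantiates reflectively (Spark 2.x names).
def sessionStateBuilderClassName(catalogImplementation: String): String =
  catalogImplementation match {
    case "hive"      => "org.apache.spark.sql.hive.HiveSessionStateBuilder"
    case "in-memory" => "org.apache.spark.sql.internal.SessionStateBuilder"
    case other =>
      throw new IllegalArgumentException(s"Unknown catalog implementation: $other")
  }

println(sessionStateBuilderClassName("in-memory"))
```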
Note: When requested for the `SessionState`, `SparkSession` uses spark.sql.catalogImplementation to choose between the two session state builders: `SessionStateBuilder` (for the in-memory catalog) and `HiveSessionStateBuilder` (for hive).
| Name | Type | Description |
|---|---|---|
| analyzer | Analyzer | Initialized lazily (i.e. only when requested the first time) using the analyzerBuilder factory function. Used when…FIXME |
| catalog | SessionCatalog | Metastore of tables and databases. Used when…FIXME |
| streamingQueryManager | StreamingQueryManager | Used to manage streaming queries in Spark Structured Streaming |
| udfRegistration | UDFRegistration | Interface to register user-defined functions. Used when…FIXME |
Note: `SessionState` is a `private[sql]` class and, given the package `org.apache.spark.sql.internal`, should be considered internal.
Creating SessionState Instance

`SessionState` takes the following when created:

- `catalogBuilder` function to create a SessionCatalog (i.e. `() ⇒ SessionCatalog`)
- `analyzerBuilder` function to create an Analyzer (i.e. `() ⇒ Analyzer`)
- `optimizerBuilder` function to create an Optimizer (i.e. `() ⇒ Optimizer`)
- `resourceLoaderBuilder` function to create a `SessionResourceLoader` (i.e. `() ⇒ SessionResourceLoader`)
- `createQueryExecution` function to create a QueryExecution given a LogicalPlan (i.e. `LogicalPlan ⇒ QueryExecution`)
- `createClone` function to clone the `SessionState` given a SparkSession (i.e. `(SparkSession, SessionState) ⇒ SessionState`)
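The builder functions above let each component be created lazily, on first access. A minimal sketch of that pattern (with stub types, not Spark's real classes):

```scala
// Stub stand-ins for the real Spark classes, for illustration only.
class Analyzer
class SessionCatalog

// Each component is produced by a builder function passed at construction
// time and cached in a lazy val, mirroring how SessionState defers work
// until a component is first requested.
class MiniSessionState(
    catalogBuilder: () => SessionCatalog,
    analyzerBuilder: () => Analyzer) {
  lazy val catalog: SessionCatalog = catalogBuilder()
  lazy val analyzer: Analyzer = analyzerBuilder()
}

var analyzerBuilt = 0
val state = new MiniSessionState(
  () => new SessionCatalog,
  () => { analyzerBuilt += 1; new Analyzer })

println(analyzerBuilt) // builder not invoked yet
state.analyzer         // first access triggers the builder
state.analyzer         // second access reuses the cached value
println(analyzerBuilt)
```

The lazy vals guarantee each builder runs at most once per `SessionState`, which matters because building an Analyzer or SessionCatalog is not free.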
apply Factory Methods

Caution: FIXME

```scala
apply(sparkSession: SparkSession): SessionState  // (1)
apply(sparkSession: SparkSession, sqlConf: SQLConf): SessionState
```

1. Passes `sparkSession` to the other `apply` with a new `SQLConf`

Note: `apply` is used when `SparkSession` is requested for the `SessionState`.
createAnalyzer Internal Method

```scala
createAnalyzer(
  sparkSession: SparkSession,
  catalog: SessionCatalog,
  sqlConf: SQLConf): Analyzer
```

`createAnalyzer` creates a logical query plan Analyzer with rules specific to a non-Hive `SessionState`.
| Method | Rules | Description |
|---|---|---|
| extendedResolutionRules | FindDataSourceTable | Replaces InsertIntoTable (with…) |
| | ResolveSQLOnFile | |
| postHocResolutionRules | PreprocessTableInsertion | |
| extendedCheckRules | PreWriteCheck | |
| | HiveOnlyCheck | |
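The rule-extension mechanism in the table above can be sketched with stub types (this is a simplified model, not Spark's real Analyzer, and `findDataSourceTable` here is a hypothetical stand-in):

```scala
// A logical plan and a rule are stubbed as a String and a String transform.
type LogicalPlan = String
type Rule = LogicalPlan => LogicalPlan

// The base analyzer applies its standard rules plus whatever the
// session-specific builder contributes via extendedResolutionRules.
class MiniAnalyzer(extendedResolutionRules: Seq[Rule]) {
  private val baseRules: Seq[Rule] = Seq(plan => s"resolved($plan)")
  def execute(plan: LogicalPlan): LogicalPlan =
    (baseRules ++ extendedResolutionRules).foldLeft(plan)((p, rule) => rule(p))
}

// Stand-in for a session-specific rule such as FindDataSourceTable.
val findDataSourceTable: Rule = plan => s"withTables($plan)"

val analyzer = new MiniAnalyzer(Seq(findDataSourceTable))
println(analyzer.execute("select"))
```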
“Executing” Logical Plan (Creating QueryExecution For LogicalPlan) — executePlan Method

```scala
executePlan(plan: LogicalPlan): QueryExecution
```

`executePlan` simply executes the createQueryExecution function on the input logical plan, which creates a QueryExecution with the current SparkSession and that plan.
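A minimal sketch of that delegation, with stub types standing in for Spark's classes:

```scala
// Stub stand-ins for the real Spark classes, for illustration only.
case class LogicalPlan(name: String)
case class QueryExecution(plan: LogicalPlan)

// executePlan just applies the createQueryExecution function that the
// SessionState was given at construction time.
class MiniSessionState(createQueryExecution: LogicalPlan => QueryExecution) {
  def executePlan(plan: LogicalPlan): QueryExecution = createQueryExecution(plan)
}

val state = new MiniSessionState(plan => QueryExecution(plan))
println(state.executePlan(LogicalPlan("Range")).plan.name)
```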
Creating New Hadoop Configuration — newHadoopConf Method

```scala
newHadoopConf(): Configuration
newHadoopConf(hadoopConf: Configuration, sqlConf: SQLConf): Configuration
```

`newHadoopConf` returns a new Hadoop Configuration (with the `SparkContext.hadoopConfiguration` and all the configuration properties of the SQLConf).
Note: `newHadoopConf` is used by `ScriptTransformation`, `ParquetRelation`, `StateStoreRDD`, `SessionState` itself, and a few other places.
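The overlay semantics can be sketched with plain Maps standing in for Hadoop `Configuration` and `SQLConf` (a simplified model, not the real API):

```scala
// Sketch: newHadoopConf starts from the SparkContext's Hadoop configuration
// and overlays every SQL configuration property on top of it.
def newHadoopConf(
    hadoopConf: Map[String, String],
    sqlConf: Map[String, String]): Map[String, String] =
  hadoopConf ++ sqlConf // SQL properties win on key collisions

val merged = newHadoopConf(
  Map("fs.defaultFS" -> "hdfs://nn:8020", "io.file.buffer.size" -> "4096"),
  Map("spark.sql.shuffle.partitions" -> "200"))
println(merged.size)
```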
Creating New Hadoop Configuration With Extra Options — newHadoopConfWithOptions Method

```scala
newHadoopConfWithOptions(options: Map[String, String]): Configuration
```

`newHadoopConfWithOptions` creates a new Hadoop Configuration with the input `options` set (except the `path` and `paths` options, which are skipped).
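The filtering step can be sketched with a plain Map standing in for Hadoop `Configuration` (a simplified model, not the real API):

```scala
// Sketch: the input options are copied onto the configuration, except the
// "path" and "paths" options, which are skipped.
def newHadoopConfWithOptions(
    hadoopConf: Map[String, String],
    options: Map[String, String]): Map[String, String] =
  hadoopConf ++ options.filter { case (key, _) => key != "path" && key != "paths" }

val conf = newHadoopConfWithOptions(
  Map.empty,
  Map("path" -> "/data/in", "compression" -> "snappy"))
println(conf.keySet)
```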