Builder — Building SparkSession using Fluent API
Builder is the fluent API to create a SparkSession.
Method | Description
---|---
enableHiveSupport | Enables Hive support
getOrCreate | Gets the current SparkSession or creates a new one.
withExtensions | Access to the SparkSessionExtensions
Builder is available using the builder object method of a SparkSession.
```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("My Spark Application") // optional and will be autogenerated if not specified
  .master("local[*]")              // only for demo and testing purposes, use spark-submit instead
  .enableHiveSupport()             // self-explanatory, isn't it?
  .config("spark.sql.warehouse.dir", "target/spark-warehouse")
  .withExtensions { extensions =>
    extensions.injectResolutionRule { session => ... }
    extensions.injectOptimizerRule { session => ... }
  }
  .getOrCreate
```
Note: You can have multiple SparkSessions in a single Spark application for different data catalogs (through relational entities).
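As a sketch of the multiple-sessions note above, a second session can be created from an existing one with newSession; the sessions share the same SparkContext but each has its own session-scoped state (configuration, temporary views, registered functions):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[*]") // demo only
  .getOrCreate()

// newSession shares the SparkContext but not the session state
val another = spark.newSession()
assert(spark.sparkContext == another.sparkContext)

// a temporary view in one session is not visible in the other
spark.range(1).createOrReplaceTempView("t")
assert(!another.catalog.tableExists("t"))
```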
Name | Description
---|---
 | Used when…FIXME
 | Used when…FIXME
Getting Or Creating SparkSession Instance — getOrCreate Method
```scala
getOrCreate(): SparkSession
```
getOrCreate…FIXME
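Pending the FIXME above, the observable behaviour of getOrCreate can be sketched as follows: it returns the active (or default) SparkSession if one already exists, and only builds a new one otherwise, applying any options set on the builder:

```scala
import org.apache.spark.sql.SparkSession

// First call creates the session
val s1 = SparkSession.builder
  .master("local[*]") // demo only
  .getOrCreate()

// A later call on the same thread returns the existing session
val s2 = SparkSession.builder.getOrCreate()
assert(s1 eq s2)
```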
Enabling Hive Support — enableHiveSupport Method
```scala
enableHiveSupport(): Builder
```
enableHiveSupport enables Hive support, i.e. running structured queries on Hive tables (with a persistent Hive metastore, support for Hive serdes and Hive user-defined functions).
Note: You do not need any existing Hive installation to use Spark's Hive support. Refer to SharedState.
Internally, enableHiveSupport makes sure that the Hive classes are on the CLASSPATH (i.e. that Spark SQL can load org.apache.hadoop.hive.conf.HiveConf) and sets the spark.sql.catalogImplementation internal configuration property to hive.
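The effect described above can be observed on a running session; a minimal sketch (assuming the Hive classes are on the classpath):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[*]")  // demo only
  .enableHiveSupport() // requires Hive classes on the CLASSPATH
  .getOrCreate()

// enableHiveSupport sets the internal catalog property to "hive"
// (it is "in-memory" for a session built without Hive support)
println(spark.conf.get("spark.sql.catalogImplementation"))
```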
withExtensions Method
```scala
withExtensions(f: SparkSessionExtensions => Unit): Builder
```
withExtensions simply executes the input f function with the SparkSessionExtensions.