StateStoreProvider Contract
StateStoreProvider is the abstraction of state store providers that manage StateStores for a given StateStoreId.
Note
The StateStoreProvider helper object uses the spark.sql.streaming.stateStore.providerClass internal configuration property for the class name of the StateStoreProvider implementation.
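As an illustration of how the property might be set, the following sketch builds a local SparkSession with the provider class spelled out explicitly. The master URL and application name are placeholders. Because the property is internal, setting it this way is an assumption about typical usage, not a documented recommendation.

```scala
import org.apache.spark.sql.SparkSession

// Placeholders: master URL and application name are for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("state-store-provider-sketch")
  // Name the provider class explicitly instead of relying on the default.
  .config(
    "spark.sql.streaming.stateStore.providerClass",
    "org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider")
  .getOrCreate()

// Read the value back from the session's runtime configuration.
println(spark.conf.get("spark.sql.streaming.stateStore.providerClass"))
```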
Note
HDFSBackedStateStoreProvider is the only available implementation of the StateStoreProvider Contract in Spark Structured Streaming.
Method | Description
---|---
close | Closes the state store provider
doMaintenance | Optional maintenance task
getStore | Returns the StateStore for a given version
init | Initializes the state store provider
stateStoreId | Returns the StateStoreId (that was used at initialization)
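The shape of the contract can be sketched with a simplified, Spark-free model. Everything below is hypothetical: StoreId, VersionedStore, ProviderContract and InMemoryProvider are stand-ins invented for this example, and the method signatures are reduced to the bare minimum (the real init also takes schemas and configuration).

```scala
// Simplified stand-in for StateStoreId (an assumption, not Spark's class).
final case class StoreId(checkpointLocation: String, operatorId: Long, partitionId: Int)

// Read-only view of the state at one version (the real StateStore does more).
trait VersionedStore {
  def id: StoreId
  def version: Long
  def get(key: String): Option[String]
}

// Shape of the contract: one provider serves every version of one store.
trait ProviderContract {
  def init(id: StoreId): Unit
  def stateStoreId: StoreId
  def getStore(version: Long): VersionedStore
  def doMaintenance(): Unit = () // optional, so the default does nothing
  def close(): Unit
}

// Hypothetical in-memory provider keeping one immutable snapshot per version.
class InMemoryProvider extends ProviderContract {
  private var storeId: Option[StoreId] = None
  private var snapshots: Map[Long, Map[String, String]] =
    Map(0L -> Map.empty[String, String])
  private var closed = false

  override def init(id: StoreId): Unit = { storeId = Some(id) }

  override def stateStoreId: StoreId =
    storeId.getOrElse(throw new IllegalStateException("provider not initialized"))

  // Helper for the example only (not part of the contract).
  def putSnapshot(version: Long, data: Map[String, String]): Unit = {
    snapshots = snapshots.updated(version, data)
  }

  override def getStore(requested: Long): VersionedStore = {
    require(!closed, "provider is closed")
    val data = snapshots.getOrElse(
      requested, throw new IllegalArgumentException(s"unknown version $requested"))
    val owner = stateStoreId
    new VersionedStore {
      val id: StoreId = owner
      val version: Long = requested
      def get(key: String): Option[String] = data.get(key)
    }
  }

  override def close(): Unit = { closed = true; snapshots = Map.empty }
}
```

The model keeps the one property the contract description stresses: a provider is bound to a single store ID at initialization and hands out stores per version.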
Lifecycle of StateStoreProvider
The lifecycle of a StateStoreProvider starts when the StateStore helper object (on a Spark executor) is requested to retrieve a StateStore by provider ID and version.
Note
The StateStore and StateStoreProvider helper objects are Scala objects, so there is exactly one instance of each per JVM, which here is the JVM of a Spark executor.
The StateStore helper object requests the StateStoreProvider helper object to createAndInit, which creates the StateStoreProvider implementation (as given by the spark.sql.streaming.stateStore.providerClass internal configuration property) and requests it to initialize.
The initialized StateStoreProvider is cached in the loadedProviders internal lookup table (for a StateStoreId) for later lookups.
The StateStore helper object then requests the StateStoreProvider to getStore for the version.
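The create-once, look-up-later flow described above can be sketched as a small registry. This is a simplified model, not Spark's code: ProviderKey, CachedProvider and StoreRegistry are hypothetical names, stores are represented by plain strings, and the created counter exists only so the caching is observable.

```scala
import scala.collection.mutable

// Hypothetical key identifying one provider (stand-in for the provider ID).
final case class ProviderKey(operatorId: Long, partitionId: Int)

// Hypothetical provider; a store is modelled as a descriptive string.
class CachedProvider {
  private var key: Option[ProviderKey] = None
  def init(k: ProviderKey): Unit = { key = Some(k) }
  def getStore(version: Long): String = s"store@v$version"
}

object StoreRegistry {
  // Plays the role of the loadedProviders lookup table.
  private val loadedProviders = mutable.HashMap.empty[ProviderKey, CachedProvider]

  // Counts provider creations so the caching behaviour can be checked.
  var created = 0

  private def createAndInit(key: ProviderKey): CachedProvider = {
    val provider = new CachedProvider
    provider.init(key)
    created += 1
    provider
  }

  // Creates and caches the provider on first use, then asks it for the version.
  def get(key: ProviderKey, version: Long): String = {
    require(version >= 0, "version must be non-negative")
    val provider = loadedProviders.synchronized {
      loadedProviders.getOrElseUpdate(key, createAndInit(key))
    }
    provider.getStore(version)
  }
}
```

Asking for two versions under the same key creates a single provider; a different key creates a second one.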
A StateStoreProvider is requested to do its own maintenance, or to close when the corresponding StateStore is inactive, in the MaintenanceTask daemon thread, which runs periodically at the interval given by the spark.sql.streaming.stateStore.maintenanceInterval configuration property (default: 60s).
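A periodic maintenance pass of this kind could be sketched with a scheduled executor backed by a daemon thread. The sketch is an assumption about the general technique, not Spark's MaintenanceTask: TrackedProvider and MaintenanceSketch are hypothetical, and whether a provider is active is reduced to a constructor flag.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical provider that only records what the maintenance pass asked of it.
class TrackedProvider(val active: Boolean) {
  val maintenanceRuns = new AtomicInteger(0)
  @volatile var closed = false
  def doMaintenance(): Unit = { maintenanceRuns.incrementAndGet() }
  def close(): Unit = { closed = true }
}

object MaintenanceSketch {
  // One pass: maintain active providers, close inactive ones.
  def runOnce(providers: Seq[TrackedProvider]): Unit =
    providers.foreach { p => if (p.active) p.doMaintenance() else p.close() }

  // Schedules runOnce on a single daemon thread at a fixed interval.
  def start(providers: Seq[TrackedProvider], intervalMs: Long): ScheduledExecutorService = {
    val daemonFactory = new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "state-store-maintenance-sketch")
        t.setDaemon(true) // a daemon thread does not keep the JVM alive
        t
      }
    }
    val executor = Executors.newSingleThreadScheduledExecutor(daemonFactory)
    val pass = new Runnable { def run(): Unit = runOnce(providers) }
    executor.scheduleAtFixedRate(pass, intervalMs, intervalMs, TimeUnit.MILLISECONDS)
    executor
  }
}
```

With the default of 60s from the text, intervalMs would be 60000. The caller is expected to shut the returned executor down when the providers are no longer needed.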
Creating and Initializing StateStoreProvider: createAndInit Factory Method
```scala
createAndInit(
  stateStoreId: StateStoreId,
  keySchema: StructType,
  valueSchema: StructType,
  indexOrdinal: Option[Int],
  storeConf: StateStoreConf,
  hadoopConf: Configuration): StateStoreProvider
```
createAndInit creates a new StateStoreProvider (per the spark.sql.streaming.stateStore.providerClass internal configuration property).
createAndInit then requests the StateStoreProvider to initialize.
Note
createAndInit is used exclusively when the StateStore helper object is requested to retrieve a StateStore by provider ID and version.
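A factory that picks the implementation from a configured class name can be sketched with plain reflection. This illustrates the general technique under stated assumptions, not Spark's source: PluggableProvider, DefaultProvider and ProviderFactory are hypothetical, the configuration value is modelled as an Option[String], and the implementation is assumed to have a public no-argument constructor.

```scala
// Hypothetical contract and default implementation (stand-ins for Spark's classes).
trait PluggableProvider {
  def init(storeName: String): Unit
  def name: String
}

class DefaultProvider extends PluggableProvider {
  private var current = "<uninitialized>"
  def init(storeName: String): Unit = { current = storeName }
  def name: String = current
}

object ProviderFactory {
  // providerClass mimics the optional configuration value; None selects the default.
  def createAndInit(providerClass: Option[String], storeName: String): PluggableProvider = {
    val clazz: Class[_] = providerClass match {
      case Some(className) => Class.forName(className)
      case None            => classOf[DefaultProvider]
    }
    // Requires a public no-argument constructor on the chosen class.
    val provider = clazz.getDeclaredConstructor().newInstance().asInstanceOf[PluggableProvider]
    provider.init(storeName) // create first, then initialize
    provider
  }
}
```

Separating construction from initialization is what lets the factory instantiate any implementation reflectively and pass the store-specific arguments afterwards.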