关注 spark技术分享,
撸spark源码 玩spark最佳实践

StreamingAggregationStateManager Contract — State Managers for Streaming Aggregation

StreamingAggregationStateManager Contract — State Managers for Streaming Aggregation

StreamingAggregationStateManager is the contract of state managers that are used in streaming aggregations (with StateStoreSaveExec and StateStoreRestoreExec physical operators).

Table 1. StreamingAggregationStateManager Contract
Method Description

commit

Used exclusively when StateStoreSaveExec physical operator is executed.

get

Gets the saved state for a non-null key from a state store

Used exclusively when StateStoreRestoreExec physical operator is executed.

getKey

Used when:

getStateValueSchema

The schema for the state values

Used when StateStoreRestoreExec and StateStoreSaveExec physical operators are executed.

iterator

Returns all the UnsafeRow key-value pairs in a state store

Used exclusively when StateStoreSaveExec physical operator is executed.

keys

Returns all keys in a state store (as an iterator)

Used exclusively when physical operators with WatermarkSupport are requested to removeKeysOlderThanWatermark (i.e. exclusively when StateStoreSaveExec physical operator is executed).

put

Stores a row in a state store

Used exclusively when StateStoreSaveExec physical operator is executed.

remove

Used exclusively when StateStoreSaveExec physical operator is executed (directly or indirectly as a WatermarkSupport)

values

Returns all the values in a state store

Used exclusively when StateStoreSaveExec physical operator is executed.

StreamingAggregationStateManager supports two versions of state managers for streaming aggregations:

  • 1 (legacy)

  • 2 (default)

Note
The version of a state manager is controlled using spark.sql.streaming.aggregation.stateFormatVersion internal configuration property.
Note
StreamingAggregationStateManagerBaseImpl is the one and only known base implementation of the StreamingAggregationStateManager Contract in Spark Structured Streaming.
Note
StreamingAggregationStateManager is a Scala sealed trait which means that all the implementations are in the same compilation unit (a single file).

Creating StreamingAggregationStateManager Instance — createStateManager Factory Method

createStateManager creates a new StreamingAggregationStateManager for a given stateFormatVersion:

createStateManager throws a IllegalArgumentException for any other stateFormatVersion:

Note
createStateManager is used when StateStoreRestoreExec and StateStoreSaveExec physical operators are created.
赞(0) 打赏
未经允许不得转载:spark技术分享 » StreamingAggregationStateManager Contract — State Managers for Streaming Aggregation
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏