关注 spark技术分享,
撸spark源码 玩spark最佳实践

StreamingAggregationStateManagerImplV2 — Default State Manager for Streaming Aggregation

StreamingAggregationStateManagerImplV2 — Default State Manager for Streaming Aggregation

StreamingAggregationStateManagerImplV2 is the default state manager for streaming aggregations.

Note
The version of a state manager is controlled using spark.sql.streaming.aggregation.stateFormatVersion internal configuration property.

StreamingAggregationStateManagerImplV2 is created exclusively when StreamingAggregationStateManager is requested for a new StreamingAggregationStateManager.

Table 1. StreamingAggregationStateManagerImplV2’s Internal Properties
Name Description

joiner

keyValueJoinedExpressions

needToProjectToRestoreValue

restoreValueProjector

valueExpressions

valueProjector

Storing Row in State Store — put Method

Note
put is part of the StreamingAggregationStateManager Contract to store a row in a state store.

put…​FIXME

Creating StreamingAggregationStateManagerImplV2 Instance

StreamingAggregationStateManagerImplV2 takes the following when created:

  • Attribute expressions for keys (Seq[Attribute])

  • Attribute expressions for input rows (Seq[Attribute])

StreamingAggregationStateManagerImplV2 initializes the internal registries and counters.

Getting Saved State for Non-Null Key from State Store — get Method

Note
get is part of the StreamingAggregationStateManager Contract to get the saved state for a given non-null key from a given state store.

get requests the given StateStore for the current state value for the given key.

get returns null if the key could not be found in the state store. Otherwise, get restoreOriginalRow (for the key and the saved state).

restoreOriginalRow Internal Method

restoreOriginalRow…​FIXME

Note
restoreOriginalRow is used when StreamingAggregationStateManagerImplV2 is requested to get the saved state for a given non-null key from a state store, iterator and values.

getStateValueSchema Method

Note
getStateValueSchema is part of the StreamingAggregationStateManager Contract to…​FIXME.

getStateValueSchema simply requests the valueExpressions for the schema.

iterator Method

Note
iterator is part of the StreamingAggregationStateManager Contract to…​FIXME.

iterator simply requests the input state store for the iterator that is mapped to an iterator of UnsafeRowPairs with the key (of the input UnsafeRowPair) and the value as a restored original row.

Note
scala.collection.Iterator is a data structure that allows to iterate over a sequence of elements that are usually fetched lazily (i.e. no elements are fetched from the underlying store until processed).

values Method

Note
values is part of the StreamingAggregationStateManager Contract to…​FIXME.

values…​FIXME

赞(0) 打赏
未经允许不得转载:spark技术分享 » StreamingAggregationStateManagerImplV2 — Default State Manager for Streaming Aggregation
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏