关注 spark技术分享,
撸spark源码 玩spark最佳实践

StateStoreOps — Implicits Methods for Creating StateStoreRDD

StateStoreOps — Extension Methods for Creating StateStoreRDD

StateStoreOps is a Scala implicit class to create StateStoreRDD when the following physical operators are executed:

Note
Implicit Classes are a language feature in Scala for implicit conversions with extension methods for existing types.

Creating StateStoreRDD (with storeUpdateFunction Aborting StateStore When Task Fails) — mapPartitionsWithStateStore Method

  1. Uses sqlContext.streams.stateStoreCoordinator to access StateStoreCoordinator

Internally, mapPartitionsWithStateStore requests SparkContext to clean storeUpdateFunction function.

Note
mapPartitionsWithStateStore uses the enclosing RDD to access the current SparkContext.
Note
Function Cleaning is to clean a closure from unreferenced variables before it is serialized and sent to tasks. SparkContext reports a SparkException when the closure is not serializable.

mapPartitionsWithStateStore then creates a (wrapper) function to abort the StateStore if state updates had not been committed before a task finished (which is to make sure that the StateStore has been committed or aborted in the end to follow the contract of StateStore).

Note
mapPartitionsWithStateStore uses TaskCompletionListener to be notified when a task has finished.

In the end, mapPartitionsWithStateStore creates a StateStoreRDD (with the wrapper function, SessionState and StateStoreCoordinatorRef).

Note

mapPartitionsWithStateStore is used when the following physical operators are executed:

赞(0) 打赏
未经允许不得转载:spark技术分享 » StateStoreOps — Implicits Methods for Creating StateStoreRDD
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏