StreamingSymmetricHashJoinExec Binary Physical Operator — Stream-Stream Joins

StreamingSymmetricHashJoinExec is a binary physical operator that represents a Join logical operator of two streaming structured queries with equality predicates at execution time.

Note

Join logical operator represents Dataset.join operator.

Find out more in the Join Logical Operator section of The Internals of Spark SQL book.

StreamingSymmetricHashJoinExec is created exclusively when the StreamingJoinStrategy execution planning strategy is requested to plan a logical query plan with a Join logical operator of two streaming structured queries with equality predicates (i.e. EqualTo and EqualNullSafe).
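As a rough illustration of the planning rule above, the following sketch joins two streaming Datasets on an equality predicate and prints the physical plan. It assumes Spark SQL 2.3 or later on the classpath (for example inside spark-shell); the application name and the column aliases are made up for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: Spark SQL (2.3 or later) is available; in spark-shell,
// getOrCreate() simply returns the session that is already running.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("stream-stream-join-demo") // hypothetical application name
  .getOrCreate()

// Two streaming Datasets from the built-in rate source (columns: timestamp, value).
// The aliases leftKey, rightKey, leftTime and rightTime are illustrative only.
val left = spark.readStream.format("rate").load()
  .select(col("value").as("leftKey"), col("timestamp").as("leftTime"))
val right = spark.readStream.format("rate").load()
  .select(col("value").as("rightKey"), col("timestamp").as("rightTime"))

// An equality predicate (EqualTo) between two streaming queries.
val joined = left.join(right, col("leftKey") === col("rightKey"))

// explain() prints to standard output; capture it so the plan is available as a string.
val buffer = new java.io.ByteArrayOutputStream()
Console.withOut(buffer) { joined.explain() }
val planText = buffer.toString("UTF-8")
println(planText)
```

The printed physical plan is expected to contain a StreamingSymmetricHashJoin node, since Spark drops the Exec suffix when it displays operator names. Nothing is started here: explain only plans the query, so no state store is touched.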

StreamingSymmetricHashJoinExec is a stateful physical operator that writes to a state store.
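To see why a stream-stream join has to keep state, here is a minimal plain-Scala model of the general symmetric hash join technique. It is not Spark's implementation: Event, SideState, SymmetricHashJoin and the in-memory maps are all hypothetical stand-ins, and Spark keeps the buffered rows in a state store instead.

```scala
import scala.collection.mutable

// Illustrative model only: every name below is hypothetical, none of it is Spark code.
final case class Event(key: Int, payload: String)

// Buffered rows of one join side, grouped by join key.
final class SideState {
  private val rowsByKey = mutable.Map.empty[Int, Vector[Event]]

  def store(event: Event): Unit =
    rowsByKey.update(event.key, rowsByKey.getOrElse(event.key, Vector.empty) :+ event)

  def matches(key: Int): Vector[Event] = rowsByKey.getOrElse(key, Vector.empty)
}

final class SymmetricHashJoin {
  private val leftState = new SideState
  private val rightState = new SideState

  // A left row is first stored in the left state, then probed against the right state.
  def onLeft(event: Event): Vector[(Event, Event)] = {
    leftState.store(event)
    rightState.matches(event.key).map(r => (event, r))
  }

  // A right row is handled the same way, with the roles of the two sides swapped.
  def onRight(event: Event): Vector[(Event, Event)] = {
    rightState.store(event)
    leftState.matches(event.key).map(l => (l, event))
  }
}

val join = new SymmetricHashJoin
val first = join.onLeft(Event(1, "L-a"))   // nothing buffered on the right yet
val second = join.onRight(Event(1, "R-x")) // matches the buffered left row
val third = join.onLeft(Event(1, "L-b"))   // matches the buffered right row
val fourth = join.onRight(Event(2, "R-y")) // no left row with key 2
println(Seq(first, second, third, fourth).mkString("\n"))
```

Because a row may arrive on either side before its match does, both sides must be buffered. This model never evicts anything, so its state grows without bound.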

StreamingSymmetricHashJoinExec uses the performance metrics of StateStoreWriter.

When executed, StreamingSymmetricHashJoinExec…​FIXME

The output schema of StreamingSymmetricHashJoinExec is…​FIXME

The output partitioning of StreamingSymmetricHashJoinExec is…​FIXME

Executing Physical Operator — doExecute Method

Note
doExecute is part of the SparkPlan contract to produce the result of a physical operator as an RDD of internal binary rows (i.e. InternalRow).

doExecute…​FIXME

Creating StreamingSymmetricHashJoinExec Instance

StreamingSymmetricHashJoinExec takes the following to be created:

  • Catalyst expressions of the keys on the left side

  • Catalyst expressions of the keys on the right side

  • JoinType

  • Join condition (JoinConditionSplitPredicates)

  • Optional StatefulOperatorStateInfo

  • Optional event-time watermark

  • State watermark (JoinStateWatermarkPredicates)

  • Physical operator on the left side (SparkPlan)

  • Physical operator on the right side (SparkPlan)

StreamingSymmetricHashJoinExec initializes the internal registries and counters.

processPartitions Internal Method

processPartitions…​FIXME

Note
processPartitions is used exclusively when StreamingSymmetricHashJoinExec physical operator is requested to doExecute.