关注 spark技术分享,
撸spark源码 玩spark最佳实践

UnsupportedOperationChecker

UnsupportedOperationChecker

UnsupportedOperationChecker checks whether the logical plan of a streaming query uses supported operations only.

Note
UnsupportedOperationChecker is used exclusively when the internal spark.sql.streaming.unsupportedOperationCheck Spark property is enabled (which is by default).
Note

UnsupportedOperationChecker comes actually with two methods, i.e. checkForBatch and checkForStreaming, whose names reveal the different flavours of Spark SQL (as of 2.0), i.e. batch and streaming, respectively.

The Spark Structured Streaming gitbook is solely focused on checkForStreaming method.

checkForStreaming Method

checkForStreaming asserts that the following requirements hold:

checkForStreaming…​FIXME

checkForStreaming finds all streaming aggregates (i.e. Aggregate logical operators with streaming sources).

Note
Aggregate logical operator represents groupBy and groupByKey aggregations (and SQL’s GROUP BY clause).

checkForStreaming asserts that there is exactly one streaming aggregation in a streaming query.

Otherwise, checkForStreaming reports a AnalysisException:

checkForStreaming asserts that watermark was defined for a streaming aggregation with Append output mode (on at least one of the grouping expressions).

Otherwise, checkForStreaming reports a AnalysisException:

Caution
FIXME

checkForStreaming counts all FlatMapGroupsWithState logical operators (on streaming Datasets with isMapGroupsWithState flag disabled).

Note
FlatMapGroupsWithState logical operator represents mapGroupsWithState and flatMapGroupsWithState operators.
Note
FlatMapGroupsWithState.isMapGroupsWithState flag is disabled when…​FIXME

checkForStreaming asserts that multiple FlatMapGroupsWithState logical operators are only used when:

  • outputMode is Append output mode

  • outputMode of the FlatMapGroupsWithState logical operators is also Append output mode

Caution
FIXME Reference to an example in flatMapGroupsWithState

Otherwise, checkForStreaming reports a AnalysisException:

Caution
FIXME
Note
checkForStreaming is used exclusively when StreamingQueryManager is requested to create a StreamingQueryWrapper (for starting a streaming query), but only when the internal spark.sql.streaming.unsupportedOperationCheck Spark property is enabled (which is by default).
赞(0) 打赏
未经允许不得转载:spark技术分享 » UnsupportedOperationChecker
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏