UnsupportedOperationChecker
UnsupportedOperationChecker checks whether the logical plan of a streaming query uses supported operations only.
|
Note
|
UnsupportedOperationChecker is used exclusively when the internal spark.sql.streaming.unsupportedOperationCheck Spark property is enabled (which is by default).
|
|
Note
|
The Spark Structured Streaming gitbook is solely focused on checkForStreaming method. |
checkForStreaming Method
|
1 2 3 4 5 |
checkForStreaming(plan: LogicalPlan, outputMode: OutputMode): Unit |
checkForStreaming asserts that the following requirements hold:
checkForStreaming…FIXME
checkForStreaming finds all streaming aggregates (i.e. Aggregate logical operators with streaming sources).
|
Note
|
Aggregate logical operator represents groupBy and groupByKey aggregations (and SQL’s GROUP BY clause).
|
checkForStreaming asserts that there is exactly one streaming aggregation in a streaming query.
Otherwise, checkForStreaming reports a AnalysisException:
|
1 2 3 4 5 |
Multiple streaming aggregations are not supported with streaming DataFrames/Datasets |
checkForStreaming asserts that watermark was defined for a streaming aggregation with Append output mode (on at least one of the grouping expressions).
Otherwise, checkForStreaming reports a AnalysisException:
|
1 2 3 4 5 |
Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets without watermark |
|
Caution
|
FIXME |
checkForStreaming counts all FlatMapGroupsWithState logical operators (on streaming Datasets with isMapGroupsWithState flag disabled).
|
Note
|
FlatMapGroupsWithState logical operator represents mapGroupsWithState and flatMapGroupsWithState operators. |
|
Note
|
FlatMapGroupsWithState.isMapGroupsWithState flag is disabled when…FIXME |
checkForStreaming asserts that multiple FlatMapGroupsWithState logical operators are only used when:
-
outputModeis Append output mode -
outputMode of the
FlatMapGroupsWithStatelogical operators is also Append output mode
|
Caution
|
FIXME Reference to an example in flatMapGroupsWithState
|
Otherwise, checkForStreaming reports a AnalysisException:
|
1 2 3 4 5 |
Multiple flatMapGroupsWithStates are not supported when they are not all in append mode or the output mode is not append on a streaming DataFrames/Datasets |
|
Caution
|
FIXME |
|
Note
|
checkForStreaming is used exclusively when StreamingQueryManager is requested to create a StreamingQueryWrapper (for starting a streaming query), but only when the internal spark.sql.streaming.unsupportedOperationCheck Spark property is enabled (which is by default).
|
spark技术分享