
ExplainCommand

ExplainCommand Logical Command

ExplainCommand is a logical command with side effects that lets users see how a structured query is structured and how it will eventually be executed: it shows the logical and physical plans, optionally with details about codegen and cost statistics.

When executed, ExplainCommand computes a QueryExecution that is then used to output a single-column DataFrame with one of the following:

  • codegen explain, i.e. WholeStageCodegen subtrees, if the codegen flag is enabled.

  • extended explain, i.e. the parsed, analyzed, and optimized logical plans together with the physical plan, if the extended flag is enabled.

  • cost explain, i.e. the optimized logical plan with stats, if the cost flag is enabled.

  • simple explain, i.e. the physical plan only, when none of the codegen, extended, and cost flags is enabled.
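The choice among the four outputs above can be sketched in plain Scala. This is a simplified model rather than Spark code: `OutputKind` and `selectKind` are hypothetical names, and the precedence (codegen first, then extended, then cost) is an assumption that follows the order in which the list presents the modes.

```scala
// Simplified model of how the three flags select one kind of explain output.
// OutputKind and selectKind are hypothetical names, not Spark APIs.
object ExplainOutputSketch {
  sealed trait OutputKind
  case object Codegen  extends OutputKind // WholeStageCodegen subtrees
  case object Extended extends OutputKind // logical plans plus the physical plan
  case object Cost     extends OutputKind // optimized logical plan with stats
  case object Simple   extends OutputKind // physical plan only

  // Assumed precedence: codegen, then extended, then cost, otherwise simple.
  // All flags are disabled by default, as in the text.
  def selectKind(
      extended: Boolean = false,
      codegen: Boolean = false,
      cost: Boolean = false): OutputKind =
    if (codegen) Codegen
    else if (extended) Extended
    else if (cost) Cost
    else Simple

  def main(args: Array[String]): Unit = {
    println(selectKind())                                // Simple
    println(selectKind(extended = true))                 // Extended
    println(selectKind(extended = true, codegen = true)) // Codegen
  }
}
```

With no flags set the sketch falls through to the simple output, which matches the "disabled by default" behaviour of the flags.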

ExplainCommand is created by Dataset’s explain operator and by the EXPLAIN SQL statement (which accepts the EXTENDED and CODEGEN options).
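Both entry points can be tried from a small local application. The following is a minimal sketch that assumes Spark SQL is on the classpath and a local master is acceptable; the object name, the helper `explainSql`, and the sample queries are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ExplainUsageSketch {
  // Hypothetical helper: runs an EXPLAIN statement and returns the resulting DataFrame.
  def explainSql(spark: SparkSession, options: String, query: String): DataFrame =
    spark.sql(s"EXPLAIN $options $query")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-usage-sketch")
      .getOrCreate()

    val ds = spark.range(5).filter("id > 2")

    // Dataset operator without arguments: prints the physical plan only.
    ds.explain()
    // Dataset operator with extended = true: prints the logical plans and the physical plan.
    ds.explain(true)

    // SQL statement: the plan text comes back as the content of a DataFrame.
    val explained = explainSql(spark, "EXTENDED", "SELECT 1 AS one")
    explained.show(false)

    spark.stop()
  }
}
```

The Dataset operator prints to standard output and returns nothing, whereas the SQL statement hands back a DataFrame that can be inspected like any other.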

The following EXPLAIN variants in SQL queries are not supported:

  • EXPLAIN FORMATTED

  • EXPLAIN LOGICAL

The output schema of an ExplainCommand is…FIXME

Creating ExplainCommand Instance

ExplainCommand takes the following when created:

  • LogicalPlan

  • extended flag that controls whether to include extended details in the output when ExplainCommand is executed (disabled by default)

  • codegen flag that controls whether to include codegen details in the output when ExplainCommand is executed (disabled by default)

  • cost flag that controls whether to include cost statistics in the output when ExplainCommand is executed (disabled by default)

ExplainCommand initializes the output attribute.
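For illustration, the command can also be instantiated by hand. This sketch assumes a Spark 2.3 or 2.4 classpath, where the constructor takes the logical plan followed by the three boolean flags described above; later Spark versions changed this signature, so treat the code as version-specific. The object and method names are hypothetical.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.execution.command.ExplainCommand

object ExplainCommandSketch {
  // Assumption: Spark 2.3/2.4, i.e. ExplainCommand(logicalPlan, extended, codegen, cost).
  def explainRows(spark: SparkSession, extended: Boolean): Seq[Row] = {
    // Any Dataset can supply a LogicalPlan; range(3) is just a small sample query.
    val logicalPlan = spark.range(3).queryExecution.logical

    // codegen and cost keep their defaults, i.e. stay disabled.
    val command = ExplainCommand(logicalPlan, extended = extended)

    // The output attribute is initialized when the command is created.
    println(command.output)

    // Execute the command; the result carries the plan text.
    command.run(spark)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-command-sketch")
      .getOrCreate()

    explainRows(spark, extended = true).foreach(row => println(row.getString(0)))

    spark.stop()
  }
}
```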

Note
ExplainCommand is created when…​FIXME

Executing Logical Command (Computing Text Representation of QueryExecution) — run Method

Note
run is part of the RunnableCommand contract to execute (run) a logical command.

run computes QueryExecution and returns its text representation in a single Row.

Internally, run creates an IncrementalExecution directly for a streaming dataset; otherwise it requests SessionState to execute the LogicalPlan.

Note
Streaming Dataset is part of Spark Structured Streaming.

run then requests QueryExecution to build the output text representation, i.e. codegened, extended (with logical and physical plans), with stats, or simple.

In the end, run creates a Row with the text representation.
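The batch path of these steps can be approximated with public APIs. This is a rough sketch, not the actual body of run: it skips the streaming branch and the codegen and cost variants, and the object and method names are hypothetical. It assumes a Spark version in which `sessionState` is accessible on SparkSession (2.2 or later).

```scala
import org.apache.spark.sql.{Row, SparkSession}

object RunStepsSketch {
  // Approximates the batch path: compute a QueryExecution, render it as text,
  // and wrap the text in a single Row.
  def explainAsRows(spark: SparkSession, extended: Boolean): Seq[Row] = {
    val logicalPlan = spark.range(3).queryExecution.logical

    // Step 1: request SessionState to execute the LogicalPlan.
    val queryExecution = spark.sessionState.executePlan(logicalPlan)

    // Step 2: build the text representation (only extended and simple are shown here).
    val text =
      if (extended) queryExecution.toString
      else queryExecution.simpleString

    // Step 3: create a Row with the text representation.
    Seq(Row(text))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("run-steps-sketch")
      .getOrCreate()

    explainAsRows(spark, extended = false).foreach(row => println(row.getString(0)))

    spark.stop()
  }
}
```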
