ExplainCommand Logical Command
ExplainCommand is a logical command with side effects that lets users see how a structured query is structured and will eventually be executed, i.e. it shows the logical and physical plans, with or without details about codegen and cost statistics.
When executed, ExplainCommand computes a QueryExecution that is then used to output a single-column DataFrame with one of the following:
- codegen explain, i.e. WholeStageCodegen subtrees, if the codegen flag is enabled.
- extended explain, i.e. the parsed, analyzed and optimized logical plans with the physical plan, if the extended flag is enabled.
- cost explain, i.e. the optimized logical plan with stats, if the cost flag is enabled.
- simple explain, i.e. the physical plan only, when none of the codegen, extended and cost flags is enabled.
ExplainCommand is created by Dataset's explain operator and by the EXPLAIN SQL statement (which accepts the EXTENDED and CODEGEN options).
```scala
// Explain in SQL
scala> sql("EXPLAIN EXTENDED show tables").show(truncate = false)
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|plan                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|== Parsed Logical Plan ==
ShowTablesCommand

== Analyzed Logical Plan ==
tableName: string, isTemporary: boolean
ShowTablesCommand

== Optimized Logical Plan ==
ShowTablesCommand

== Physical Plan ==
ExecutedCommand
   +- ShowTablesCommand|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
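The Dataset explain operator leads to the same command. The snippet below is a minimal sketch, assuming a throwaway local SparkSession (the local[*] master, the application name and the sample query are arbitrary choices for illustration, not taken from the text above). It calls explain with and without the extended flag, then runs EXPLAIN CODEGEN through SQL to get the plan text back as a DataFrame.

```scala
import org.apache.spark.sql.SparkSession

object ExplainDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session created just for this demo;
    // any existing SparkSession would work as well
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-demo")
      .getOrCreate()

    // Arbitrary sample query
    val query = spark.range(5).filter("id > 2")

    // Simple explain: prints the physical plan only
    query.explain()

    // Extended explain: prints the logical plans and the physical plan
    query.explain(extended = true)

    // EXPLAIN in SQL returns the plan text as a single-column DataFrame
    val explained = spark.sql("EXPLAIN CODEGEN SELECT 1")
    explained.printSchema()
    explained.show(truncate = false)

    spark.stop()
  }
}
```

Note that explain prints to standard output and returns nothing, so the SQL form is the more convenient one when the plan text has to be captured or processed further.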
The following EXPLAIN variants in SQL queries are not supported:
- EXPLAIN FORMATTED
- EXPLAIN LOGICAL
```scala
scala> sql("EXPLAIN LOGICAL show tables")
org.apache.spark.sql.catalyst.parser.ParseException:
Operation not allowed: EXPLAIN LOGICAL(line 1, pos 0)

== SQL ==
EXPLAIN LOGICAL show tables
^^^
...
```
The output schema of an ExplainCommand is…FIXME
Creating ExplainCommand Instance
ExplainCommand takes the following when created:
- extended flag that controls whether to include extended details in the output when ExplainCommand is executed (disabled by default)
- codegen flag that controls whether to include codegen details in the output when ExplainCommand is executed (disabled by default)
- cost flag that controls whether to include cost statistics in the output when ExplainCommand is executed (disabled by default)
ExplainCommand initializes the output attribute.
Note: ExplainCommand is created when…FIXME
Executing Logical Command (Computing Text Representation of QueryExecution): run Method
```scala
run(sparkSession: SparkSession): Seq[Row]
```
Note: run is part of the RunnableCommand Contract to execute (run) a logical command.
run computes a QueryExecution and returns its text representation in a single Row.
Internally, run creates an IncrementalExecution directly for a streaming dataset or, for any other dataset, requests SessionState to execute the LogicalPlan.
Note: Streaming Dataset is part of Spark Structured Streaming.
run then requests the QueryExecution to build the output text representation, i.e. codegened, extended (with logical and physical plans), with stats, or simple.
In the end, run creates a Row with the text representation.
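The choice of text representation can be modelled without Spark on the classpath. The snippet below is an illustrative sketch only: PlanTexts, ExplainFlags and ExplainSketch are hypothetical names introduced here, a Seq[String] stands in for Seq[Row], and the order in which the flags are checked (codegen first, then extended, then cost, otherwise simple) is an assumption that follows the order used on this page rather than a quote of the Spark source.

```scala
// Hypothetical stand-in for the four text representations
// that a QueryExecution can be asked to build
final case class PlanTexts(
    codegen: String,
    extended: String,
    withStats: String,
    simple: String)

// Hypothetical holder for the three flags (all disabled by default)
final case class ExplainFlags(
    extended: Boolean = false,
    codegen: Boolean = false,
    cost: Boolean = false)

object ExplainSketch {
  // Assumed precedence: codegen, then extended, then cost, else simple
  def chooseText(flags: ExplainFlags, texts: PlanTexts): String =
    if (flags.codegen) texts.codegen
    else if (flags.extended) texts.extended
    else if (flags.cost) texts.withStats
    else texts.simple

  // Mirrors run returning a single row that holds the chosen text
  def run(flags: ExplainFlags, texts: PlanTexts): Seq[String] =
    Seq(chooseText(flags, texts))

  def main(args: Array[String]): Unit = {
    val texts = PlanTexts("<codegen>", "<extended>", "<stats>", "<simple>")
    // No flags enabled: the simple text is chosen
    println(run(ExplainFlags(), texts))
    // extended wins over cost under the assumed precedence
    println(run(ExplainFlags(extended = true, cost = true), texts))
  }
}
```

Under this model exactly one representation is ever returned, so enabling several flags at once does not concatenate outputs; the first enabled flag in the assumed order decides.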