explain Operator — Explaining Query Plan
    explain(): Unit (1)
    explain(extended: Boolean): Unit

(1) Calls explain with the extended flag disabled
    val records = spark.
      readStream.
      format("rate").
      load

    scala> records.explain
    == Physical Plan ==
    StreamingRelation rate, [timestamp#0, value#1L]

    scala> records.explain(extended = true)
    == Parsed Logical Plan ==
    StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4071aa13,rate,List(),None,List(),None,Map(),None), rate, [timestamp#0, value#1L]

    == Analyzed Logical Plan ==
    timestamp: timestamp, value: bigint
    StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4071aa13,rate,List(),None,List(),None,Map(),None), rate, [timestamp#0, value#1L]

    == Optimized Logical Plan ==
    StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4071aa13,rate,List(),None,List(),None,Map(),None), rate, [timestamp#0, value#1L]

    == Physical Plan ==
    StreamingRelation rate, [timestamp#0, value#1L]
Internally, explain creates an ExplainCommand runnable command with the logical plan and the extended flag.
explain then executes the plan with the ExplainCommand runnable command, collects the resulting rows, and prints them to the standard output.
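Because explain returns Unit and only prints, the plan text is not directly available to the caller. The following is a minimal sketch of one way to capture it, assuming explain writes through Scala's println (and so honours Console.withOut) and that a local SparkSession is acceptable. The names ExplainCapture and capture are hypothetical helpers introduced here for illustration; they are not part of Spark.

```scala
import java.io.ByteArrayOutputStream

import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical helper (not part of Spark): captures what explain prints.
object ExplainCapture {

  // Redirects Console.out while explain runs and returns the printed text.
  // Assumption: Dataset.explain prints via Scala's println.
  def capture(ds: Dataset[_], extended: Boolean): String = {
    val buffer = new ByteArrayOutputStream()
    Console.withOut(buffer) {
      ds.explain(extended)
    }
    buffer.toString("UTF-8")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-capture")
      .getOrCreate()

    // Same streaming Dataset as in the example above
    val records = spark.readStream.format("rate").load()

    val planText = capture(records, extended = true)
    println(planText)

    spark.stop()
  }
}
```

Capturing the text this way can be handy in tests or logs, where the plan should be stored rather than sent to the console.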
Note: For streaming Datasets, the ExplainCommand command simply creates an IncrementalExecution for the SparkSession and the logical plan.
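Whether a Dataset counts as streaming for this purpose can be checked from user code with Dataset.isStreaming. The sketch below contrasts a streaming and a batch Dataset. It assumes a local SparkSession, and the object name StreamingVsBatchExplain is a hypothetical name chosen for illustration. It does not reach into ExplainCommand or IncrementalExecution, which are internal classes whose signatures may differ between Spark versions.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object (not part of Spark)
object StreamingVsBatchExplain {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-streaming-vs-batch")
      .getOrCreate()

    // Streaming Dataset: explained through the streaming path described above
    val streaming = spark.readStream.format("rate").load()

    // Batch Dataset: explained through the regular batch path
    val batch = spark.range(5)

    println(s"streaming.isStreaming = ${streaming.isStreaming}") // true
    println(s"batch.isStreaming = ${batch.isStreaming}")         // false

    // Neither call starts a streaming query; both only print plans
    streaming.explain()
    batch.explain()

    spark.stop()
  }
}
```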
Note: For the purpose of explain, IncrementalExecution is created with output mode Append, checkpoint location <unknown>, a random run id, current batch id 0, and empty offset metadata. These values do not really matter when explaining the load part of a streaming query.