explain Operator — Explaining Query Plan



explain(): Unit (1)
explain(extended: Boolean): Unit

(1) Calls explain with the extended flag disabled

explain prints the physical plan to the console and, with the extended flag enabled, also the parsed, analyzed and optimized logical plans.



val records = spark.
  readStream.
  format("rate").
  load
scala> records.explain
== Physical Plan ==
StreamingRelation rate, [timestamp#0, value#1L]

scala> records.explain(extended = true)
== Parsed Logical Plan ==
StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4071aa13,rate,List(),None,List(),None,Map(),None), rate, [timestamp#0, value#1L]

== Analyzed Logical Plan ==
timestamp: timestamp, value: bigint
StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4071aa13,rate,List(),None,List(),None,Map(),None), rate, [timestamp#0, value#1L]

== Optimized Logical Plan ==
StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4071aa13,rate,List(),None,List(),None,Map(),None), rate, [timestamp#0, value#1L]

== Physical Plan ==
StreamingRelation rate, [timestamp#0, value#1L]
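
Because explain only prints and returns Unit, it can be handy to capture its output as a string, for example to compare the two variants in a test. The following is a minimal sketch, not part of the original text: the object name ExplainCapture, the helper captured and the local SparkSession settings are all illustrative assumptions. It relies on explain printing through Scala's Console, which Console.withOut can redirect.

```scala
import java.io.ByteArrayOutputStream
import org.apache.spark.sql.SparkSession

object ExplainCapture {
  // Hypothetical helper: runs `body` and returns everything it printed
  // through Scala's Console (which is what println writes to).
  def captured(body: => Unit): String = {
    val out = new ByteArrayOutputStream()
    Console.withOut(out)(body)
    out.toString("UTF-8")
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session; in spark-shell use the provided `spark`.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-capture")
      .getOrCreate()

    val records = spark.readStream.format("rate").load()

    val simple = captured(records.explain())
    val extended = captured(records.explain(extended = true))

    // The simple form shows the physical plan only.
    assert(simple.contains("== Physical Plan =="))
    assert(!simple.contains("== Parsed Logical Plan =="))

    // The extended form adds the logical plans.
    assert(extended.contains("== Parsed Logical Plan =="))
    assert(extended.contains("== Optimized Logical Plan =="))

    spark.stop()
  }
}
```

The helper is independent of Spark, so it can be reused for any code that prints with println.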

Internally, explain creates an ExplainCommand runnable command with the logical plan and the extended flag.

explain then executes the plan with the ExplainCommand runnable command, collects the result rows and prints them to the standard output.

Note

explain uses SparkSession to access the current SessionState to execute the plan.



import org.apache.spark.sql.execution.command.ExplainCommand
val explain = ExplainCommand(...)
spark.sessionState.executePlan(explain)
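
The steps described above can be spelled out end to end. This is only a sketch under stated assumptions: it targets Spark 2.2 to 2.4, where ExplainCommand takes the logical plan and an extended flag and where sessionState is publicly accessible; later Spark versions changed the ExplainCommand constructor, so this will likely not compile there. The object name ExplainByHand and the helper explainLines are hypothetical names introduced for illustration.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.execution.command.ExplainCommand

object ExplainByHand {
  // Hypothetical helper mirroring what explain is described to do:
  // build the command, execute it, and collect the rows with the plan text.
  def explainLines(spark: SparkSession, ds: Dataset[_], extended: Boolean): Seq[String] = {
    // Assumption: Spark 2.2-2.4 constructor (logicalPlan, extended, ...).
    val explain = ExplainCommand(ds.queryExecution.logical, extended = extended)
    spark.sessionState
      .executePlan(explain)   // QueryExecution for the command
      .executedPlan
      .executeCollect()       // rows holding the plan description
      .map(_.getString(0))
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session; in spark-shell use the provided `spark`.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-by-hand")
      .getOrCreate()

    val records = spark.readStream.format("rate").load()

    // Printing the collected rows is the last step explain performs.
    explainLines(spark, records, extended = true).foreach(println)

    spark.stop()
  }
}
```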

For streaming Datasets, the ExplainCommand command simply creates an IncrementalExecution for the SparkSession and the logical plan.

Note

For the purpose of explain, IncrementalExecution is created with the output mode Append, checkpoint location <unknown>, a random run id, current batch id 0 and empty offset metadata. These values do not really matter when explaining the load part of a streaming query.
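
Whether a Dataset takes the streaming path can be checked from user code with Dataset.isStreaming. The sketch below is an illustration only: the object name StreamingOrBatch and the local SparkSession settings are assumptions, and it presumes that the streaming check inside ExplainCommand matches what isStreaming reports.

```scala
import org.apache.spark.sql.SparkSession

object StreamingOrBatch {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session; in spark-shell use the provided `spark`.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("streaming-or-batch")
      .getOrCreate()

    val batch = spark.range(5)                              // batch Dataset
    val streaming = spark.readStream.format("rate").load()  // streaming Dataset

    assert(!batch.isStreaming)
    assert(streaming.isStreaming)

    // Both can be explained without starting a streaming query.
    batch.explain()
    streaming.explain()

    spark.stop()
  }
}
```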
