QueryPlan — Structured Query Plan
QueryPlan is the part of Catalyst that models a structured query as a tree of relational operators.
In Scala terms, QueryPlan is an abstract class that is the base class of LogicalPlan and SparkPlan (for logical and physical plans, respectively).
A QueryPlan has output attributes (that serve as the base for the schema), a collection of expressions and a schema.
QueryPlan has a statePrefix that is used when displaying a plan: ! indicates an invalid plan and ' indicates an unresolved plan.
A QueryPlan is invalid if it has missing input attributes and it has child nodes.
A QueryPlan is unresolved if its column names have not been verified and its column types have not been looked up in the Catalog.
A QueryPlan has zero, one or more Catalyst expressions.
Note: QueryPlan is a tree of operators, each of which has a tree of expressions.
QueryPlan has a references property that holds the attributes that appear in the expressions of this operator.
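As a rough illustration of expressions and references, the following sketch inspects the analyzed plan of a small query. It assumes a local Spark installation (for example spark-shell, where getOrCreate() returns the existing session); the query and the application name are made up for the example.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell, getOrCreate() hands back the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("queryplan-references") // hypothetical name
  .getOrCreate()

// Analyzed logical plan of a query whose root operator is a filter.
val plan = spark.range(3).filter("id > 1").queryExecution.analyzed

// The expressions of the root operator (here, the filter condition).
plan.expressions.foreach(println)

// The attributes that those expressions refer to.
println(plan.references)

// Every referenced attribute is provided by the child operator,
// so no input is missing.
assert(plan.references.subsetOf(plan.inputSet))
assert(plan.missingInput.isEmpty)
```

Note that expressions and references describe a single operator only; the operators below it have their own.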
QueryPlan Contract

    abstract class QueryPlan[T] extends TreeNode[T] {
      def output: Seq[Attribute]
      def validConstraints: Set[Expression]
      // FIXME
    }

| Method | Description |
|---|---|
| output | Attribute expressions |
Transforming Expressions — transformExpressions Method

    transformExpressions(rule: PartialFunction[Expression, Expression]): this.type

transformExpressions simply executes transformExpressionsDown with the input rule.

Note: transformExpressions is used when…FIXME
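The following is a minimal sketch of how a rule could be passed to transformExpressions. The query and the rewrite (adding 10 to integer literals) are hypothetical and only meant to make the effect visible; the exact shape of the analyzed condition may differ across Spark versions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.types.IntegerType

// In spark-shell, getOrCreate() hands back the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("queryplan-transform-expressions") // hypothetical name
  .getOrCreate()

val plan = spark.range(3).filter("id > 1").queryExecution.analyzed

// The rule is a partial function: expressions it does not match are kept as they are.
// Only the expressions of this operator (the root) are visited, not those of its children.
val rewritten = plan.transformExpressions {
  case Literal(v: Int, IntegerType) => Literal(v + 10)
}

// Plans are immutable, so the original plan is unchanged.
println(plan)
println(rewritten)
```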
Transforming Expressions — transformExpressionsDown Method

    transformExpressionsDown(rule: PartialFunction[Expression, Expression]): this.type

transformExpressionsDown applies the rule to each expression in the query operator.

Note: transformExpressionsDown is used when…FIXME
Applying Transformation Function to Each Expression in Query Operator — mapExpressions Method

    mapExpressions(f: Expression => Expression): this.type

mapExpressions…FIXME

Note: mapExpressions is used when…FIXME
Output Schema Attribute Set — outputSet Property

    outputSet: AttributeSet

outputSet simply returns an AttributeSet for the output schema attributes.

Note: outputSet is used when…FIXME
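A small sketch of what the AttributeSet gives you over the plain output sequence: membership checks that compare attributes by identity (expression ID) rather than by name. The query is an arbitrary example.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell, getOrCreate() hands back the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("queryplan-outputset") // hypothetical name
  .getOrCreate()

val plan = spark.range(3).queryExecution.analyzed

// The only output attribute of the plan.
val id = plan.output.head
assert(plan.outputSet.contains(id))

// A fresh instance has the same name but a new expression ID,
// so it is treated as a different attribute.
val other = id.newInstance()
assert(other.name == id.name)
assert(!plan.outputSet.contains(other))
```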
Missing Input Attributes — missingInput Property

    def missingInput: AttributeSet

missingInput is the set of attributes that are referenced in expressions but are neither provided by this node’s children (as inputSet) nor produced by this node (as producedAttributes).
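To see a non-empty missingInput, one option is to build a deliberately broken plan by hand. The sketch below needs only the Catalyst classes on the classpath; the attribute names a and b are invented for the illustration.

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, Project}
import org.apache.spark.sql.types.IntegerType

// Two distinct attributes (hypothetical names).
val a = AttributeReference("a", IntegerType)()
val b = AttributeReference("b", IntegerType)()

// The child provides only a, but the projection asks for b.
val child = LocalRelation(a)
val project = Project(Seq(b), child)

println(project.references)   // attributes used by the projection's expressions
println(project.inputSet)     // attributes provided by the children
println(project.missingInput) // referenced but not provided

assert(project.missingInput.contains(b))
assert(!project.missingInput.contains(a))
```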
Output Schema — schema Property
You can request the schema of a QueryPlan using the schema property, which builds a StructType from the output attributes.

    // the query
    val dataset = spark.range(3)

    scala> dataset.queryExecution.analyzed.schema
    res6: org.apache.spark.sql.types.StructType = StructType(StructField(id,LongType,false))

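The relationship between output and schema can be checked by building the StructType by hand. This is a sketch of the idea, not necessarily how Spark implements schema internally.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StructField, StructType}

// In spark-shell, getOrCreate() hands back the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("queryplan-schema") // hypothetical name
  .getOrCreate()

val plan = spark.range(3).queryExecution.analyzed

// One StructField per output attribute: name, type, nullability and metadata.
val manual = StructType(plan.output.map { a =>
  StructField(a.name, a.dataType, a.nullable, a.metadata)
})

assert(manual == plan.schema)
```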
Output Schema Attributes — output Property

    output: Seq[Attribute]

output is a collection of Catalyst attribute expressions that represents the result of a projection in a query and is later used to build the output schema.

Note: The output property is also called output schema or result schema.

    val q = spark.range(3)

    scala> q.queryExecution.analyzed.output
    res0: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

    scala> q.queryExecution.withCachedData.output
    res1: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

    scala> q.queryExecution.optimizedPlan.output
    res2: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

    scala> q.queryExecution.sparkPlan.output
    res3: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

    scala> q.queryExecution.executedPlan.output
    res4: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)

Tip: You can build a StructType from
Simple (Basic) Description with State Prefix — simpleString Method

    simpleString: String

Note: simpleString is part of TreeNode Contract for the simple text description of a tree node.

simpleString adds a state prefix to the node’s simple text description.
State Prefix — statePrefix Method

    statePrefix: String

Internally, statePrefix gives ! (exclamation mark) when the node is invalid, i.e. missingInput is not empty and the node has children. Otherwise, statePrefix gives an empty string.

Note: statePrefix is used exclusively when QueryPlan is requested for the simple text node description.
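The prefixes can be observed on hand-built logical plans. The sketch below assumes only the Catalyst classes on the classpath and uses made-up attribute names; the ' prefix for unresolved plans comes from the logical-plan side rather than from the rule described above.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, Project}
import org.apache.spark.sql.types.IntegerType

// Hypothetical attributes; the child relation provides only a.
val a = AttributeReference("a", IntegerType)()
val b = AttributeReference("b", IntegerType)()
val child = LocalRelation(a)

val valid      = Project(Seq(a), child)                        // input is available
val invalid    = Project(Seq(b), child)                        // b is missing from the input
val unresolved = Project(Seq(UnresolvedAttribute("a")), child) // name not resolved yet

// Print the first line of each tree description, i.e. the root operator.
Seq(valid, invalid, unresolved).foreach { p =>
  println(p.toString.split("\n").head)
}

assert(invalid.toString.startsWith("!"))
assert(unresolved.toString.startsWith("'"))
```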
Transforming All Expressions — transformAllExpressions Method

    transformAllExpressions(rule: PartialFunction[Expression, Expression]): this.type

transformAllExpressions…FIXME

Note: transformAllExpressions is used when…FIXME
Simple (Basic) Description with State Prefix — verboseString Method

    verboseString: String

Note: verboseString is part of TreeNode Contract to…FIXME.

verboseString simply returns the simple (basic) description with state prefix.
innerChildren Method

    innerChildren: Seq[QueryPlan[_]]

Note: innerChildren is part of TreeNode Contract to…FIXME.

innerChildren simply returns the subqueries.
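Since innerChildren may not be accessible from user code, a way to look at the same information is the subqueries property of an operator. The sketch below uses an arbitrary SQL query with a scalar subquery in the WHERE clause and assumes that the analyzed plan contains a Filter operator holding that subquery.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Filter

// In spark-shell, getOrCreate() hands back the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("queryplan-subqueries") // hypothetical name
  .getOrCreate()

val plan = spark
  .sql("SELECT * FROM range(3) WHERE id > (SELECT max(id) FROM range(2))")
  .queryExecution
  .analyzed

// Subqueries belong to the operator whose expressions contain them,
// so look up the Filter operator first.
val filter = plan.collectFirst { case f: Filter => f }.get

// Each subquery is itself a query plan.
filter.subqueries.foreach(println)
assert(filter.subqueries.size == 1)

// The tree description of the whole plan also renders the subquery
// nested under the operator that owns it.
println(plan)
```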