TreeNode — Node in Catalyst Tree
    package org.apache.spark.sql.catalyst.trees

    abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
      self: BaseType =>

      // only required properties (vals and methods) that have no implementation
      // the others follow
      def children: Seq[BaseType]
      def verboseString: String
    }
TreeNode is a recursive data structure that can have zero or more children that are again TreeNodes.
Tip: Read up on the <: type operator in Scala in Upper Type Bounds.
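To make the self-referential type parameter concrete, here is a minimal stand-alone sketch that does not depend on Spark. `MiniNode`, `Expr`, `Lit`, `Add` and `nodeCount` are hypothetical names made up for illustration; only the shape of the class declaration mirrors TreeNode.

```scala
// A simplified stand-in for TreeNode (hypothetical, not Spark's class).
// The bound BaseType <: MiniNode[BaseType] keeps children in the same tree family.
abstract class MiniNode[BaseType <: MiniNode[BaseType]] { self: BaseType =>
  def children: Seq[BaseType]

  // Counts this node plus all descendants; possible because children are BaseType again.
  def nodeCount: Int = 1 + children.map(_.nodeCount).sum
}

// One family of trees: toy arithmetic expressions.
sealed abstract class Expr extends MiniNode[Expr]

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil // a leaf has no children
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
}

object MiniNodeDemo {
  val tree: Expr = Add(Lit(1), Add(Lit(2), Lit(3)))
  val total: Int = tree.nodeCount // two Add nodes plus three Lit leaves
}
```

Because of the upper bound, `children` of an `Expr` is a `Seq[Expr]` rather than a sequence of some unrelated node type, which is what lets recursive helpers such as `nodeCount` be written once in the base class.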
In Scala terms, TreeNode is an abstract class that is the base class of the Catalyst Expression and QueryPlan abstract classes.
TreeNode therefore allows for building entire trees of TreeNodes, e.g. generic query plans with concrete logical and physical operators that both use Catalyst expressions (which are TreeNodes again).
Note: Spark SQL uses TreeNode for query plans and Catalyst expressions, which can be combined to build more advanced trees, e.g. Catalyst expressions can have query plans as subquery expressions.
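The way the two kinds of trees can embed one another may be easier to see in a miniature model. This is only a sketch under stated assumptions: `PlanNode`, `ExprNode`, `Column`, `InSubquery`, `Table` and `Where` are hypothetical names and are not Spark classes.

```scala
// Two hypothetical tree families that can reference each other.
sealed trait PlanNode { def children: Seq[PlanNode] }
sealed trait ExprNode { def children: Seq[ExprNode] }

case class Column(name: String) extends ExprNode {
  def children: Seq[ExprNode] = Nil
}

// An expression that embeds a whole plan, in the spirit of a subquery expression.
case class InSubquery(value: ExprNode, plan: PlanNode) extends ExprNode {
  def children: Seq[ExprNode] = Seq(value)
}

case class Table(name: String) extends PlanNode {
  def children: Seq[PlanNode] = Nil
}

// A plan operator that carries an expression as its condition.
case class Where(condition: ExprNode, child: PlanNode) extends PlanNode {
  def children: Seq[PlanNode] = Seq(child)
}

object MixedTreeDemo {
  // A plan whose condition is an expression that in turn holds another plan.
  val query: Where =
    Where(InSubquery(Column("id"), Table("allowed_ids")), Table("events"))
}
```

In this model the embedded plan is not one of the expression's `children`; it is a separate field, so each family keeps a homogeneous child list.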
TreeNode can itself be a node in a tree or a collection of nodes, i.e. itself and the children nodes. Not only does TreeNode come with the methods that you may have used in the Scala Collection API (e.g. map, flatMap, collect, collectFirst, foreach), but also specialized ones for more advanced tree manipulation, e.g. mapChildren, transform, transformDown, transformUp, foreachUp, numberedTreeString, p, asCode, prettyJson.
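As a rough illustration of the two groups of methods, the following stand-alone sketch models one collection-like method (`collect`) and one tree-specific method (`transformUp`) on a toy tree. `Term`, `Num` and `Plus` are hypothetical names, and the method bodies are a simplified model rather than Spark's implementation.

```scala
// Toy expression tree (hypothetical names), independent of Spark.
sealed trait Term {
  def children: Seq[Term]
  def withNewChildren(newChildren: Seq[Term]): Term

  // Collection-like: gather a result for every node the partial function accepts,
  // visiting the node itself before its children.
  def collect[B](pf: PartialFunction[Term, B]): Seq[B] =
    pf.lift(this).toList ++ children.flatMap(_.collect(pf))

  // Tree-specific: rewrite the children first, then the node itself (bottom-up).
  def transformUp(rule: PartialFunction[Term, Term]): Term = {
    val afterChildren =
      if (children.isEmpty) this
      else withNewChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(afterChildren, identity[Term])
  }
}

case class Num(value: Int) extends Term {
  def children: Seq[Term] = Nil
  def withNewChildren(newChildren: Seq[Term]): Term = this
}

case class Plus(left: Term, right: Term) extends Term {
  def children: Seq[Term] = Seq(left, right)
  def withNewChildren(newChildren: Seq[Term]): Term =
    Plus(newChildren(0), newChildren(1))
}

object TreeOpsDemo {
  val tree: Term = Plus(Num(1), Plus(Num(2), Num(3)))

  // All literal values in the order the nodes are visited.
  val literals: Seq[Int] = tree.collect { case Num(v) => v }

  // Constant folding: a Plus of two literals becomes a single literal.
  val folded: Term = tree.transformUp { case Plus(Num(a), Num(b)) => Num(a + b) }
}
```

Working bottom-up matters here: the inner `Plus(Num(2), Num(3))` is folded first, so by the time the rule reaches the root both of its children are already literals.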
withNewChildren
Method
    withNewChildren(newChildren: Seq[BaseType]): BaseType

withNewChildren…FIXME

Note: withNewChildren is used when…FIXME
Simple Node Description — simpleString
Method
    simpleString: String

simpleString gives a simple one-line description of a TreeNode.

Note: simpleString is used when TreeNode is requested for argString (of child nodes) and tree text representation (with verbose flag off).
Numbered Text Representation — numberedTreeString
Method
    numberedTreeString: String

numberedTreeString adds numbers to the text representation of all the nodes.
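The idea of numbering the lines of a tree's text can be sketched without Spark as follows. `NumberedTreeSketch`, `numbered` and the sample text are assumptions made for this example, and the two-digit zero padding is a formatting choice of the sketch.

```scala
object NumberedTreeSketch {
  // Prefixes every line of a tree's text representation with a zero-based,
  // two-digit index (the padding width is a choice made for this sketch).
  def numbered(treeText: String): String =
    treeText.split("\n").zipWithIndex
      .map { case (line, i) => f"$i%02d $line" }
      .mkString("\n")

  // Sample input shaped like the text of a small physical plan.
  val sample: String = "Project [id, rand]\n+- Range (0, 10, step=1, splits=8)"
  val result: String = numbered(sample)
}
```

The numbers give every node a stable position in the printed tree, which is what makes it possible to ask for a node by its number afterwards.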
Getting n-th TreeNode in Tree (for Interactive Debugging) — apply
Method
    apply(number: Int): TreeNode[_]

apply gives the number-th tree node in a tree.

Note: apply can be used for interactive debugging.

Internally, apply gets the node at the number position, or null if there is no such node.
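A minimal sketch of the "node at a position, or null" contract is shown below. It assumes that positions follow a pre-order walk (a node first, then its children) and it flattens the tree eagerly for brevity, which is a simplification chosen for this example. `Op`, `preOrder` and `nodeAt` are hypothetical names.

```scala
// Hypothetical toy node; positions follow a pre-order walk.
case class Op(name: String, children: Seq[Op] = Nil) {
  // Flattens the tree in pre-order so that index n is the n-th node.
  def preOrder: Seq[Op] = this +: children.flatMap(_.preOrder)

  // Returns the node at the given position, or null when the position is out of range.
  def nodeAt(number: Int): Op = preOrder.lift(number).orNull
}

object NthNodeDemo {
  val plan: Op = Op("Project", Seq(Op("Filter", Seq(Op("Range")))))

  val first: Op = plan.nodeAt(0)    // the root itself
  val second: Op = plan.nodeAt(1)   // the Filter node
  val missing: Op = plan.nodeAt(10) // no such node, so null
}
```

Returning `null` is convenient at an interactive prompt but easy to trip over in regular code, where keeping the `Option` from `lift` would be the safer choice.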
Getting n-th BaseType in Tree (for Interactive Debugging) — p
Method
    p(number: Int): BaseType

p gives the number-th tree node in a tree as BaseType for interactive debugging.
Text Representation — toString
Method
    toString: String

Note: toString is part of Java’s Object Contract for the string representation of an object, e.g. TreeNode.

toString simply returns the text representation of all nodes in the tree.
Text Representation of All Nodes in Tree — treeString
Method
    treeString: String  // (1)
    treeString(verbose: Boolean, addSuffix: Boolean = false): String

(1) Turns verbose flag on

treeString gives the string representation of all the nodes in the TreeNode.

    import org.apache.spark.sql.{functions => f}
    val q = spark.range(10).withColumn("rand", f.rand())
    val executedPlan = q.queryExecution.executedPlan

    val output = executedPlan.treeString(verbose = true)

    scala> println(output)
    *(1) Project [id#0L, rand(6790207094253656854) AS rand#2]
    +- *(1) Range (0, 10, step=1, splits=8)
Verbose Description with Suffix — verboseStringWithSuffix
Method
    verboseStringWithSuffix: String

verboseStringWithSuffix simply returns verbose description.

Note: verboseStringWithSuffix is used exclusively when TreeNode is requested to generateTreeString (with verbose and addSuffix flags enabled).
Generating Text Representation of Inner and Regular Child Nodes — generateTreeString
Method
    generateTreeString(
      depth: Int,
      lastChildren: Seq[Boolean],
      builder: StringBuilder,
      verbose: Boolean,
      prefix: String = "",
      addSuffix: Boolean = false): StringBuilder

Internally, generateTreeString appends the following node descriptions per the verbose and addSuffix flags:

- verbose description with suffix when both are enabled (i.e. verbose and addSuffix flags are both true)
- verbose description when only verbose is enabled (i.e. verbose is true and addSuffix is false)
- simple description when verbose is disabled (i.e. verbose is false)

In the end, generateTreeString calls itself recursively for the innerChildren and the child nodes.

Note: generateTreeString is used exclusively when TreeNode is requested for text representation of all nodes in the tree.
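The flag handling and the recursion described for generateTreeString can be sketched on a toy node as follows. Everything here is an assumption made for illustration: `DescNode`, `describe` and `render` are hypothetical names, the three descriptions are stored as plain strings, the suffix is simply appended to the verbose text, inner children are left out, and plain two-space indentation stands in for the real tree prefixes.

```scala
// Hypothetical toy node carrying pre-computed descriptions.
case class DescNode(
    simple: String,
    verboseText: String,
    suffix: String,
    children: Seq[DescNode] = Nil) {

  // Picks the description for one node according to the two flags.
  def describe(verbose: Boolean, addSuffix: Boolean): String =
    if (verbose && addSuffix) verboseText + suffix
    else if (verbose) verboseText
    else simple

  // Appends this node's line, then recurses into the children one level deeper.
  def render(
      depth: Int,
      builder: StringBuilder,
      verbose: Boolean,
      addSuffix: Boolean = false): StringBuilder = {
    builder.append("  " * depth).append(describe(verbose, addSuffix)).append("\n")
    children.foreach(_.render(depth + 1, builder, verbose, addSuffix))
    builder
  }
}

object RenderDemo {
  val tree: DescNode = DescNode("Project", "Project [id#0L]", " (suffix)",
    Seq(DescNode("Range", "Range (0, 10)", " (suffix)")))

  val simpleOut: String = tree.render(0, new StringBuilder, verbose = false).toString
  val verboseOut: String = tree.render(0, new StringBuilder, verbose = true).toString
  val suffixOut: String =
    tree.render(0, new StringBuilder, verbose = true, addSuffix = true).toString
}
```

Passing one shared `StringBuilder` down the recursion, as the signature above also does, avoids building and concatenating a separate string for every subtree.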
Inner Child Nodes — innerChildren
Method
1 2 3 4 5 |
innerChildren: Seq[TreeNode[_]] |
innerChildren
returns the inner nodes that should be shown as an inner nested tree of this node.
innerChildren
simply returns an empty collection of TreeNodes
.
Note
|
innerChildren is used when TreeNode is requested to generate the text representation of inner and regular child nodes, allChildren and getNodeNumbered.
|
allChildren
Property
1 2 3 4 5 |
allChildren: Set[TreeNode[_]] |
Note
|
allChildren is a Scala lazy value which is computed once when accessed and cached afterwards.
|
allChildren
…FIXME
Note
|
allChildren is used when…FIXME
|
getNodeNumbered
Internal Method
1 2 3 4 5 |
getNodeNumbered(number: MutableInt): Option[TreeNode[_]] |
getNodeNumbered
…FIXME
Note
|
getNodeNumbered is used when…FIXME
|
foreach
Method
1 2 3 4 5 |
foreach(f: BaseType => Unit): Unit |
foreach
applies the input function f
to itself (this
) first and then (recursively) to the children.
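The visiting order described above (the node itself first, then each child recursively) can be sketched on a toy tree. `Step` and `ForeachDemo` are hypothetical names, and the visit order is recorded in a buffer only to make the traversal observable.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical toy node; foreach visits the node itself first, then its children.
case class Step(name: String, children: Seq[Step] = Nil) {
  def foreach(f: Step => Unit): Unit = {
    f(this)                        // this node first
    children.foreach(_.foreach(f)) // then each subtree, recursively
  }
}

object ForeachDemo {
  val plan: Step =
    Step("Project", Seq(Step("Join", Seq(Step("ScanA"), Step("ScanB")))))

  // Record the names in the order the nodes are visited.
  val visited: ArrayBuffer[String] = ArrayBuffer.empty[String]
  plan.foreach(step => visited += step.name)
}
```

Since parents are always visited before their children, the recorded names come out in the same top-to-bottom order in which a tree is usually printed.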
collectFirst
Method
1 2 3 4 5 |
collectFirst[B](pf: PartialFunction[BaseType, B]): Option[B] |
collectFirst
…FIXME
transform
Method
1 2 3 4 5 |
transform(rule: PartialFunction[BaseType, BaseType]): BaseType |
transform
…FIXME
Transforming Nodes Downwards — transformDown
Method
1 2 3 4 5 |
transformDown(rule: PartialFunction[BaseType, BaseType]): BaseType |
transformDown
…FIXME
transformUp
Method
1 2 3 4 5 |
transformUp(rule: PartialFunction[BaseType, BaseType]): BaseType |
transformUp
…FIXME
nodeName
Method
1 2 3 4 5 |
nodeName: String |
nodeName
returns the name of the class with Exec
suffix removed (that is used as a naming convention for the class name of physical operators).
Note
|
nodeName is used when TreeNode is requested for simpleString and asCode.
|
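The suffix handling in nodeName can be illustrated with a small stand-alone helper. This is a sketch that works on plain strings rather than on real classes; `NodeNameSketch` and `stripExec` are hypothetical names, and using a regular expression anchored at the end of the name is an assumption of this example.

```scala
object NodeNameSketch {
  // Removes a trailing "Exec" from a class name; other names pass through unchanged.
  def stripExec(className: String): String = className.replaceAll("Exec$", "")

  val physical: String = stripExec("ProjectExec")  // physical operator naming convention
  val logical: String = stripExec("Project")       // no suffix, nothing to remove
  val notSuffix: String = stripExec("ExecutorNode") // "Exec" not at the end, left alone
}
```

With this rule a physical operator and its logical counterpart end up with the same node name, which keeps the text representations of logical and physical plans easy to compare.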