PushDownOperatorsToDataSource Logical Optimization
PushDownOperatorsToDataSource
is a logical optimization that pushes down operators to underlying data sources (i.e. DataSourceV2Relations) (before planning so that data source can report statistics more accurately).
Technically, PushDownOperatorsToDataSource
is a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan]
.
PushDownOperatorsToDataSource
is part of the Push down operators to data source scan once-executed rule batch of the SparkOptimizer.
Executing Rule — apply
Method
1 2 3 4 5 |
apply(plan: LogicalPlan): LogicalPlan |
Note
|
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).
|
apply
…FIXME
pushDownRequiredColumns
Internal Method
1 2 3 4 5 |
pushDownRequiredColumns(plan: LogicalPlan, requiredByParent: AttributeSet): LogicalPlan |
pushDownRequiredColumns
branches off per the input logical operator (that is supposed to have at least one child node):
-
For Project unary logical operator,
pushDownRequiredColumns
takes the references of the project expressions as the required columns (attributes) and executes itself recursively on the child logical operatorNote that the input
requiredByParent
attributes are not considered in the required columns. -
For Filter unary logical operator,
pushDownRequiredColumns
adds the references of the filter condition to the inputrequiredByParent
attributes and executes itself recursively on the child logical operator -
For DataSourceV2Relation unary logical operator,
pushDownRequiredColumns
…FIXME -
For other logical operators,
pushDownRequiredColumns
simply executes itself (using TreeNode.mapChildren) recursively on the child nodes (logical operators)
Note
|
pushDownRequiredColumns is used exclusively when PushDownOperatorsToDataSource logical optimization is requested to execute.
|
Destructuring Logical Operator — FilterAndProject.unapply
Method
1 2 3 4 5 |
unapply(plan: LogicalPlan): Option[(Seq[NamedExpression], Expression, DataSourceV2Relation)] |
unapply
is part of FilterAndProject
extractor object to destructure the input logical operator into a tuple with…FIXME
unapply
works with (matches) the following logical operators:
-
For a Filter with a DataSourceV2Relation leaf logical operator,
unapply
…FIXME -
For a Filter with a Project over a DataSourceV2Relation leaf logical operator,
unapply
…FIXME -
For others,
unapply
returnsNone
(i.e. does nothing / does not match)
Note
|
unapply is used exclusively when PushDownOperatorsToDataSource logical optimization is requested to execute.
|