关注 spark技术分享,
撸spark源码 玩spark最佳实践

ColumnPruning

admin阅读(1532)

ColumnPruning Logical Optimization

ColumnPruning is a base logical optimization that FIXME.

ColumnPruning is part of the RewriteSubquery once-executed batch in the standard batches of the Catalyst Optimizer.

ColumnPruning is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

Example 1

Example 2

Executing Rule — apply Method

Note
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).

apply…​FIXME

CollapseWindow

admin阅读(1481)

CollapseWindow Logical Optimization

CollapseWindow is a base logical optimization that FIXME.

CollapseWindow is part of the Operator Optimization fixed-point batch in the standard batches of the Catalyst Optimizer.

CollapseWindow is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

Executing Rule — apply Method

Note
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).

apply…​FIXME

WindowsSubstitution

admin阅读(1587)

WindowsSubstitution Logical Evaluation Rule

WindowsSubstitution is a logical evaluation rule (i.e. Rule[LogicalPlan]) that the logical query plan analyzer uses to resolve (aka substitute) WithWindowDefinition unary logical operators with UnresolvedWindowExpression to their corresponding WindowExpression with resolved WindowSpecDefinition.

WindowsSubstitution is part of Substitution fixed-point batch of rules.

Note
It appears that WindowsSubstitution is exclusively used for pure SQL queries because WithWindowDefinition unary logical operator is created exclusively when AstBuilder parses window definitions.

If a window specification is not found, WindowsSubstitution fails analysis with the following error:

Note
The analysis failure is unlikely to happen given AstBuilder builds a lookup table of all the named window specifications defined in a SQL text and reports a ParseException when a WindowSpecReference is not available earlier.

For every WithWindowDefinition, WindowsSubstitution takes the child logical plan and transforms its UnresolvedWindowExpression expressions to be a WindowExpression with a window specification from the WINDOW clause (see WithWindowDefinition Example).

WindowFrameCoercion

admin阅读(1450)

WindowFrameCoercion Type Coercion Logical Rule

Coercing Types in Logical Plan — coerceTypes Method

Note
coerceTypes is part of the TypeCoercionRule Contract to coerce types in a logical plan.

coerceTypes traverses all Catalyst expressions (in the input LogicalPlan) and replaces the frameSpecification of every WindowSpecDefinition with a RangeFrame window frame and the single order specification expression resolved with the lower and upper window frame boundary expressions cast to the data type of the order specification expression.

createBoundaryCast Internal Method

createBoundaryCast returns a Catalyst expression per the input boundary Expression and the dt DataType (in the order of execution):

  • The input boundary expression if it is a SpecialFrameBoundary

  • The input boundary expression if the dt data type is DateType or TimestampType

  • Cast unary operator with the input boundary expression and the dt data type if the result type of the boundary expression is not the dt data type, but the result type can be cast to the dt data type

  • The input boundary expression

Note
createBoundaryCast is used exclusively when WindowFrameCoercion type coercion logical rule is requested to coerceTypes.

UpdateOuterReferences

admin阅读(1284)

UpdateOuterReferences Logical Rule

UpdateOuterReferences is…​FIXME

Applying UpdateOuterReferences to Logical Plan — apply Method

Note
apply is part of Rule Contract to apply a rule to a logical plan.

apply…​FIXME

TimeWindowing

admin阅读(1376)

TimeWindowing Logical Resolution Rule

TimeWindowing is a logical resolution rule that FIXME in a logical query plan.

TimeWindowing is part of the Resolution fixed-point batch in the standard batches of the Analyzer.

TimeWindowing is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

Executing Rule — apply Method

Note
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).

apply…​FIXME

ResolveWindowOrder

admin阅读(2289)

ResolveWindowOrder Logical Resolution Rule

ResolveWindowOrder is…​FIXME

ResolveWindowFrame

admin阅读(1609)

ResolveWindowFrame Logical Resolution Rule

ResolveWindowFrame is a logical resolution rule that the logical query plan analyzer uses to validate and resolve WindowExpression expressions in an entire logical query plan.

Technically, ResolveWindowFrame is just a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveWindowFrame is part of Resolution fixed-point batch of rules.

ResolveWindowFrame takes a logical plan and does the following:

  1. Makes sure that the window frame of a WindowFunction is unspecified or matches the SpecifiedWindowFrame of the WindowSpecDefinition expression.

    Reports a AnalysisException when the frames do not match:

  2. Copies the frame specification of WindowFunction to WindowSpecDefinition

  3. Creates a new SpecifiedWindowFrame for WindowExpression with the resolved Catalyst expression and UnspecifiedFrame

Note
ResolveWindowFrame is a Scala object inside Analyzer class.

Applying ResolveWindowFrame to Logical Plan — apply Method

Note
apply is part of Rule Contract to apply a rule to a logical plan.

apply…​FIXME

ResolveSubquery

admin阅读(1563)

ResolveSubquery Logical Resolution Rule

ResolveSubquery is a logical resolution that resolves subquery expressions (ScalarSubquery, Exists and In) when transforming a logical plan with the following logical operators:

  1. Filter operators with an Aggregate child operator

  2. Unary operators with the children resolved

ResolveSubquery is part of Resolution fixed-point batch of rules of the Spark Analyzer.

Technically, ResolveSubquery is a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

Resolving Subquery Expressions (ScalarSubquery, Exists and In) — resolveSubQueries Internal Method

resolveSubQueries requests the input logical plan to transform expressions (down the operator tree) as follows:

  1. For ScalarSubquery expressions with subquery plan not resolved and resolveSubQuery to create resolved ScalarSubquery expressions

  2. For Exists expressions with subquery plan not resolved and resolveSubQuery to create resolved Exists expressions

  3. For In expressions with ListQuery not resolved and resolveSubQuery to create resolved In expressions

Note
resolveSubQueries is used exclusively when ResolveSubquery is executed.

resolveSubQuery Internal Method

resolveSubQuery…​FIXME

Note
resolveSubQuery is used exclusively when ResolveSubquery is requested to resolve subquery expressions (ScalarSubquery, Exists and In).

Applying ResolveSubquery to Logical Plan (Executing ResolveSubquery) — apply Method

Note
apply is part of Rule Contract to apply a rule to a TreeNode, e.g. logical plan.

apply transforms the input logical plan as follows:

  1. For Filter operators with an Aggregate operator (as the child operator) and the children resolved, apply resolves subquery expressions (ScalarSubquery, Exists and In) with the Filter operator and the plans with the Aggregate operator and its single child

  2. For unary operators with the children resolved, apply resolves subquery expressions (ScalarSubquery, Exists and In) with the unary operator and its single child

ResolveSQLOnFile

admin阅读(1349)

ResolveSQLOnFile Logical Evaluation Rule for…​FIXME

ResolveSQLOnFile is…​FIXME

maybeSQLFile Internal Method

maybeSQLFile is enabled (i.e. true) where the following all hold:

  1. FIXME

Note
maybeSQLFile is used exclusively when…​FIXME

Applying Rule to Logical Plan — apply Method

Note
apply is part of Rule Contract to apply a rule to a logical plan.

apply…​FIXME

关注公众号:spark技术分享

联系我们联系我们