ResolveRelations

ResolveRelations Logical Resolution Rule — Resolving UnresolvedRelations With Tables in Catalog

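ResolveRelations is a logical resolution rule that the logical query plan analyzer uses to resolve UnresolvedRelations in a logical query plan, i.e. table references that appear either in InsertIntoTable operators or as standalone UnresolvedRelation leaf operators.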

Technically, ResolveRelations is just a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveRelations is part of Resolution fixed-point batch of rules.
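
To make the idea of a rule inside a fixed-point batch more concrete, here is a minimal Python model. It is not Spark code: the operator classes, transform_up, resolve_relations, run_fixed_point and the dictionary-based catalog are all hypothetical stand-ins that only mirror the Catalyst names used above. The sketch assumes a rule is simply a function from a plan to a plan, and that a fixed-point batch re-applies its rules until the plan stops changing.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Toy logical operators; hypothetical stand-ins for the Catalyst classes.
@dataclass(frozen=True)
class UnresolvedRelation:
    table: str

@dataclass(frozen=True)
class ResolvedRelation:
    table: str
    columns: Tuple[str, ...]

@dataclass(frozen=True)
class Project:
    columns: Tuple[str, ...]
    child: object

def transform_up(plan, fn):
    """Rewrite the child first, then the node itself (bottom-up)."""
    if isinstance(plan, Project):
        plan = Project(plan.columns, transform_up(plan.child, fn))
    return fn(plan)

def resolve_relations(catalog: Dict[str, Tuple[str, ...]]):
    """Build a rule: a function that maps a plan to a (possibly) new plan."""
    def resolve(node):
        if isinstance(node, UnresolvedRelation) and node.table in catalog:
            return ResolvedRelation(node.table, catalog[node.table])
        return node  # every other operator is left untouched
    return lambda plan: transform_up(plan, resolve)

def run_fixed_point(plan, rules, max_iterations=100):
    """Apply all rules repeatedly until the plan stops changing."""
    for _ in range(max_iterations):
        new_plan = plan
        for rule in rules:
            new_plan = rule(new_plan)
        if new_plan == plan:  # fixed point reached
            break
        plan = new_plan
    return plan

catalog = {"t1": ("id", "name")}
plan = Project(("id",), UnresolvedRelation("t1"))
analyzed = run_fixed_point(plan, [resolve_relations(catalog)])
print(analyzed)
```

In this model a table reference that the catalog does not know is simply left unresolved, whereas the real rule fails the analysis in that situation.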

Applying ResolveRelations to Logical Plan — apply Method

Note
apply is part of Rule Contract to apply a rule to a logical plan.

apply…​FIXME

Resolving Relation — resolveRelation Method

resolveRelation…​FIXME

Note
resolveRelation is used when…​FIXME

isRunningDirectlyOnFiles Internal Method

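isRunningDirectlyOnFiles is enabled (i.e. true) when all of the following conditions hold:

  • The database of the input table identifier is defined

  • spark.sql.runSQLOnFiles configuration property is enabled

  • The table is not a temporary table (view)

  • The database or the table does not exist in the SessionCatalog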

Note
isRunningDirectlyOnFiles is used exclusively when ResolveRelations resolves a relation (as an UnresolvedRelation leaf logical operator for a table reference).

Finding Table in Session-Scoped Catalog of Relational Entities — lookupTableFromCatalog Internal Method

lookupTableFromCatalog simply requests SessionCatalog to find the table in relational catalogs.

Note
lookupTableFromCatalog requests Analyzer for the current SessionCatalog.
Note
The table is described using TableIdentifier of the input UnresolvedRelation.

lookupTableFromCatalog fails the analysis phase (by reporting an AnalysisException) when the table or the table’s database cannot be found.

Note
lookupTableFromCatalog is used when ResolveRelations is executed (for InsertIntoTable with UnresolvedRelation operators) or resolves a relation (for “standalone” UnresolvedRelations).
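
The lookup-and-fail behaviour described above can be sketched in a few lines of Python. This is a simplified model, not Spark's implementation: ToySessionCatalog, NoSuchDatabase, NoSuchTable, the string "relations" and the error message text are assumptions made for illustration. The only point is that catalog lookup errors surface as an AnalysisException.

```python
from typing import Dict, NamedTuple, Optional

class AnalysisException(Exception):
    """Stand-in for Spark's AnalysisException."""

class NoSuchDatabase(Exception):
    pass

class NoSuchTable(Exception):
    pass

class TableIdentifier(NamedTuple):
    table: str
    database: Optional[str] = None

class ToySessionCatalog:
    """Hypothetical catalog: database name -> table name -> relation."""
    def __init__(self, databases: Dict[str, Dict[str, str]], current_db: str = "default"):
        self.databases = databases
        self.current_db = current_db

    def lookup_relation(self, ident: TableIdentifier) -> str:
        db = ident.database or self.current_db
        if db not in self.databases:
            raise NoSuchDatabase(db)
        if ident.table not in self.databases[db]:
            raise NoSuchTable(f"{db}.{ident.table}")
        return self.databases[db][ident.table]

def lookup_table_from_catalog(catalog: ToySessionCatalog, ident: TableIdentifier) -> str:
    """Delegate to the catalog and turn lookup failures into analysis failures."""
    try:
        return catalog.lookup_relation(ident)
    except (NoSuchDatabase, NoSuchTable) as e:
        # The message text is illustrative only.
        raise AnalysisException(f"Table or view not found: {ident.table}") from e

catalog = ToySessionCatalog({"default": {"people": "relation(default.people)"}})
print(lookup_table_from_catalog(catalog, TableIdentifier("people")))
```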

ResolveReferences

ResolveReferences Logical Resolution Rule

ResolveReferences is a logical resolution rule that the logical query plan analyzer uses to resolve FIXME in a logical query plan, i.e.

  1. Resolves…​FIXME

Technically, ResolveReferences is just a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveReferences is part of Resolution fixed-point batch of rules.

Resolving Expressions of Logical Plan — resolve Internal Method

resolve resolves the input expression per type:

  1. UnresolvedAttribute expressions

  2. UnresolvedExtractValue expressions

  3. All other expressions

Note
resolve is used exclusively when ResolveReferences is requested to resolve reference expressions in a logical query plan.

Resolving Reference Expressions In Logical Query Plan (Applying ResolveReferences to Logical Plan) — apply Method

Note
apply is part of Rule Contract to apply a rule to a logical plan.

apply resolves the following logical operators:

  • Project logical operator with a Star expression to…​FIXME

  • Aggregate logical operator with a Star expression to…​FIXME

  • ScriptTransformation logical operator with a Star expression to…​FIXME

  • Generate logical operator with a Star expression to…​FIXME

  • Join logical operator with duplicateResolved…​FIXME

  • Intersect logical operator with duplicateResolved…​FIXME

  • Except logical operator with duplicateResolved…​FIXME

  • Sort logical operator unresolved with child operators resolved…​FIXME

  • Generate logical operator resolved…​FIXME

  • Generate logical operator unresolved…​FIXME

In the end, apply resolves the expressions of the input logical operator.

apply skips logical operators that:

  • Use UnresolvedDeserializer expressions

  • Have child operators unresolved

Expanding Star Expressions — buildExpandedProjectList Internal Method

buildExpandedProjectList expands (converts) Star expressions in the input named expressions recursively (down the expression tree) per expression:

  • For a Star expression, buildExpandedProjectList requests it to expand given the input child logical plan

  • For an UnresolvedAlias expression with a Star child expression, buildExpandedProjectList requests the Star to expand given the input child logical plan (similarly to a Star expression alone in the above case)

  • For exprs with Star expressions down the expression tree, buildExpandedProjectList calls expandStarExpression, passing the input exprs and child

Note
buildExpandedProjectList is used when ResolveReferences is requested to resolve reference expressions (in Project and Aggregate operators with Star expressions).
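
The first two cases can be illustrated with a small Python model. It is a sketch under stated assumptions rather than Catalyst code: Attribute, Star, UnresolvedAlias and build_expanded_project_list are hypothetical stand-ins, the child plan is reduced to its list of output attributes, and the third case (stars nested deeper in an expression) is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Attribute:
    name: str
    qualifier: Optional[str] = None

@dataclass(frozen=True)
class Star:
    target: Optional[str] = None  # None models "*", "t" models "t.*"

    def expand(self, child_output: List[Attribute]) -> List[Attribute]:
        # Keep every attribute, or only those with the requested qualifier.
        return [a for a in child_output
                if self.target is None or a.qualifier == self.target]

@dataclass(frozen=True)
class UnresolvedAlias:
    child: object

def build_expanded_project_list(exprs: list, child_output: List[Attribute]) -> list:
    expanded = []
    for e in exprs:
        if isinstance(e, Star):
            expanded.extend(e.expand(child_output))
        elif isinstance(e, UnresolvedAlias) and isinstance(e.child, Star):
            expanded.extend(e.child.expand(child_output))
        else:
            expanded.append(e)  # nested stars are not handled in this sketch
    return expanded

child_output = [Attribute("id", "t"), Attribute("name", "t"), Attribute("city", "u")]
print(build_expanded_project_list([Star()], child_output))
print(build_expanded_project_list(
    [Attribute("id", "t"), UnresolvedAlias(Star("u"))], child_output))
```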

expandStarExpression Method

expandStarExpression expands (transforms) the following expressions in the input expr expression:

  • For UnresolvedFunction expressions with Star child expressions, expandStarExpression requests the Star expressions to expand given the input child logical plan and the resolver.

  • For CreateNamedStruct expressions with Star child expressions among the values, expandStarExpression…FIXME

  • For CreateArray expressions with Star child expressions, expandStarExpression…FIXME

  • For Murmur3Hash expressions with Star child expressions, expandStarExpression…FIXME

For any other uses of Star expressions, expandStarExpression fails analysis with an AnalysisException.

Note
expandStarExpression is used exclusively when ResolveReferences is requested to expand Star expressions (in Project and Aggregate operators).
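
A rough Python model of the UnresolvedFunction case and of the failure for unsupported star usage follows. Everything here is an assumption made for illustration: the Add class stands for "any other expression", column names are plain strings, and the error message text is made up. CreateNamedStruct, CreateArray and Murmur3Hash are not modelled.

```python
from dataclasses import dataclass
from typing import Tuple

class AnalysisException(Exception):
    """Stand-in for Spark's AnalysisException."""

@dataclass(frozen=True)
class Star:
    pass

@dataclass(frozen=True)
class UnresolvedFunction:
    name: str
    args: Tuple[object, ...]

@dataclass(frozen=True)
class Add:
    left: object
    right: object

def expand_star_expression(expr, child_output: Tuple[str, ...]):
    if isinstance(expr, UnresolvedFunction):
        new_args = []
        for arg in expr.args:
            if isinstance(arg, Star):
                new_args.extend(child_output)  # f(*) becomes f(col1, col2, ...)
            else:
                new_args.append(expand_star_expression(arg, child_output))
        return UnresolvedFunction(expr.name, tuple(new_args))
    if isinstance(expr, Add):
        if isinstance(expr.left, Star) or isinstance(expr.right, Star):
            # Illustrative message; a star is not allowed in this position.
            raise AnalysisException("Invalid usage of '*' in Add")
        return Add(expand_star_expression(expr.left, child_output),
                   expand_star_expression(expr.right, child_output))
    return expr  # leaves (plain column names here) are returned unchanged

print(expand_star_expression(UnresolvedFunction("hash", (Star(),)), ("id", "name")))
```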

dedupRight Internal Method

dedupRight…​FIXME

Note
dedupRight is used when…​FIXME

dedupOuterReferencesInSubquery Internal Method

dedupOuterReferencesInSubquery…​FIXME

Note
dedupOuterReferencesInSubquery is used when…​FIXME

ResolveOrdinalInOrderByAndGroupBy

ResolveOrdinalInOrderByAndGroupBy Logical Resolution Rule

ResolveOrdinalInOrderByAndGroupBy is part of the Resolution fixed-point batch in the standard batches of the Analyzer.

ResolveOrdinalInOrderByAndGroupBy is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveOrdinalInOrderByAndGroupBy takes no arguments when created.

Executing Rule — apply Method

Note
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).

apply walks the logical plan from children up the tree and looks for Sort and Aggregate logical operators with UnresolvedOrdinal leaf expressions (in ordering and grouping expressions, respectively).

For a Sort logical operator with UnresolvedOrdinal expressions, apply replaces every SortOrder expression that has an UnresolvedOrdinal child with a SortOrder expression over the expression at the index - 1 position in the output schema of the child logical operator.

For an Aggregate logical operator with UnresolvedOrdinal expressions, apply replaces every grouping expression with an UnresolvedOrdinal with the expression at the index - 1 position in the aggregate named expressions of the current Aggregate logical operator.

apply throws an AnalysisException (and hence fails the analysis) if the ordinal is outside the range of valid positions, i.e. less than 1 or greater than the number of available expressions.
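
The ordinal-to-expression mapping is easy to get off by one, so here is a small Python sketch of it. It is a simplified model, not the rule's source: expressions are plain strings, resolve_ordinal, resolve_sort and resolve_group_by are hypothetical helper names, and the error message text is illustrative.

```python
from typing import List, Sequence, Tuple

class AnalysisException(Exception):
    """Stand-in for Spark's AnalysisException."""

def resolve_ordinal(ordinal: int, candidates: Sequence[str], clause: str) -> str:
    """Map a 1-based ordinal to the expression at position ordinal - 1."""
    if not 1 <= ordinal <= len(candidates):
        raise AnalysisException(
            f"{clause} position {ordinal} is outside the range [1, {len(candidates)}]")
    return candidates[ordinal - 1]

def resolve_sort(order: Sequence[Tuple[int, str]],
                 child_output: Sequence[str]) -> List[Tuple[str, str]]:
    # ORDER BY ordinals point into the output of the child operator.
    return [(resolve_ordinal(o, child_output, "ORDER BY"), direction)
            for o, direction in order]

def resolve_group_by(grouping: Sequence[int],
                     aggregate_exprs: Sequence[str]) -> List[str]:
    # GROUP BY ordinals point into the aggregate (select list) expressions.
    return [resolve_ordinal(o, aggregate_exprs, "GROUP BY") for o in grouping]

# SELECT name, age FROM people ORDER BY 2 DESC
print(resolve_sort([(2, "DESC")], ["name", "age"]))   # [('age', 'DESC')]
# SELECT city, count(1) FROM people GROUP BY 1
print(resolve_group_by([1], ["city", "count(1)"]))    # ['city']
```

An ordinal of 0 or 3 in either query would raise the AnalysisException stand-in, since only positions 1 and 2 exist.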

ResolveMissingReferences

ResolveMissingReferences is…​FIXME

resolveExprsAndAddMissingAttrs Internal Method

resolveExprsAndAddMissingAttrs…​FIXME

Note
resolveExprsAndAddMissingAttrs is used when…​FIXME

ResolveInlineTables

ResolveInlineTables Logical Resolution Rule

ResolveInlineTables is part of the Resolution fixed-point batch in the standard batches of the Analyzer.

ResolveInlineTables is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveInlineTables takes a SQLConf when created.

Executing Rule — apply Method

Note
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).

apply walks the input plan bottom-up to find UnresolvedInlineTable logical operators whose rows expressions are all resolved.

For every such UnresolvedInlineTable logical operator, apply runs validateInputDimension and validateInputEvaluable before converting the operator to a LocalRelation (using convert).

validateInputDimension Internal Method

validateInputDimension…​FIXME

Note
validateInputDimension is used exclusively when ResolveInlineTables logical resolution rule is executed.

validateInputEvaluable Internal Method

validateInputEvaluable…​FIXME

Note
validateInputEvaluable is used exclusively when ResolveInlineTables logical resolution rule is executed.

Converting UnresolvedInlineTable to LocalRelation — convert Internal Method

convert…​FIXME

Note
convert is used exclusively when ResolveInlineTables logical resolution rule is executed.

ResolveHiveSerdeTable

ResolveHiveSerdeTable Logical Resolution Rule

ResolveHiveSerdeTable is a logical resolution rule (i.e. Rule[LogicalPlan]) that the Hive-specific logical query plan analyzer uses to resolve the metadata of a Hive table for CreateTable logical operators.

ResolveHiveSerdeTable is part of additional rules in Resolution fixed-point batch of rules.

Applying ResolveHiveSerdeTable Rule to Logical Plan — apply Method

Note
apply is part of Rule Contract to apply a rule to a logical plan.

apply…​FIXME

ResolveFunctions

ResolveFunctions Logical Resolution Rule — Resolving grouping__id UnresolvedAttribute, UnresolvedGenerator And UnresolvedFunction Expressions

ResolveFunctions is a logical resolution rule that the logical query plan analyzer uses to resolve grouping__id UnresolvedAttribute, UnresolvedGenerator and UnresolvedFunction expressions in an entire logical query plan.

Technically, ResolveFunctions is just a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveFunctions is part of Resolution fixed-point batch of rules.

Note
ResolveFunctions is a Scala object inside Analyzer class.

Resolving grouping__id UnresolvedAttribute, UnresolvedGenerator and UnresolvedFunction Expressions In Entire Query Plan (Applying ResolveFunctions to Logical Plan) — apply Method

Note
apply is part of Rule Contract to apply a rule to a logical plan.

apply takes a logical plan and transforms each expression (for every logical operator found in the query plan) as follows:

  • For UnresolvedAttributes named grouping__id, apply creates an Alias (with a GroupingID child expression and the grouping__id name).

    That case seems mostly for compatibility with Hive as grouping__id attribute name is used by Hive.

  • For UnresolvedGenerators, apply requests the SessionCatalog to find a Generator function by name.

    If some other non-generator function is found for the name, apply fails the analysis phase by reporting an AnalysisException.

  • For UnresolvedFunctions, apply requests the SessionCatalog to find a function by name and handles the result per type:

    • AggregateWindowFunctions are returned directly, unless the UnresolvedFunction has the isDistinct flag enabled, in which case apply fails the analysis phase by reporting an AnalysisException.

    • AggregateFunctions are wrapped in an AggregateExpression (with Complete aggregate mode).

    • All other functions are returned directly, unless the UnresolvedFunction has the isDistinct flag enabled, in which case apply fails the analysis phase by reporting an AnalysisException.

apply skips expressions whose child expressions are not resolved yet.
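
The dispatch on the kind of function found in the catalog can be modelled roughly as follows. This is a Python sketch with assumptions: the registry dictionary stands in for the SessionCatalog function lookup, the "kind" strings replace the Scala class hierarchy, the error messages are made up, and carrying the isDistinct flag into the AggregateExpression is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict

class AnalysisException(Exception):
    """Stand-in for Spark's AnalysisException."""

@dataclass(frozen=True)
class Function:
    name: str
    kind: str  # "generator", "aggregate", "window_aggregate" or "scalar"

@dataclass(frozen=True)
class AggregateExpression:
    function: Function
    mode: str
    is_distinct: bool

def resolve_generator(registry: Dict[str, Function], name: str) -> Function:
    """UnresolvedGenerator case: the function found must be a generator."""
    fn = registry[name]
    if fn.kind != "generator":
        raise AnalysisException(f"{name} is not a generator")  # illustrative text
    return fn

def resolve_function(registry: Dict[str, Function], name: str, is_distinct: bool = False):
    """UnresolvedFunction case: wrap aggregates, reject DISTINCT elsewhere."""
    fn = registry[name]
    if fn.kind == "aggregate":
        return AggregateExpression(fn, "Complete", is_distinct)
    if is_distinct:
        # Applies to window aggregates and to all other functions alike.
        raise AnalysisException(f"{name} does not support DISTINCT")
    return fn

registry = {
    "explode": Function("explode", "generator"),
    "sum": Function("sum", "aggregate"),
    "rank": Function("rank", "window_aggregate"),
    "upper": Function("upper", "scalar"),
}
print(resolve_function(registry, "sum", is_distinct=True))
print(resolve_function(registry, "upper"))
```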

ResolveCreateNamedStruct

ResolveCreateNamedStruct Logical Resolution Rule — Resolving NamePlaceholders In CreateNamedStruct Expressions

ResolveCreateNamedStruct is part of the Resolution fixed-point batch in the standard batches of the Analyzer.

ResolveCreateNamedStruct is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

Executing Rule — apply Method

Note
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).

apply traverses all Catalyst expressions (in the input LogicalPlan) that are CreateNamedStruct expressions which are not resolved yet and replaces NamePlaceholders with Literal expressions.

In other words, apply finds unresolved CreateNamedStruct expressions with NamePlaceholder expressions among the children and replaces each placeholder with the name of the corresponding NamedExpression, but only if that NamedExpression is resolved.

In the end, apply creates a CreateNamedStruct with new children.
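
The placeholder replacement can be sketched in Python as below. The sketch assumes that the children of a CreateNamedStruct alternate between a name and a value (name1, value1, name2, value2, and so on); the classes and the resolve_create_named_struct function are hypothetical stand-ins, not Catalyst code.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class NamePlaceholder:
    pass

@dataclass(frozen=True)
class Literal:
    value: str

@dataclass(frozen=True)
class NamedExpression:
    name: str
    resolved: bool = True

def resolve_create_named_struct(children: List[object]) -> List[object]:
    """Return new children with placeholders replaced where possible."""
    new_children = []
    # Walk the children as (name, value) pairs.
    for name, value in zip(children[0::2], children[1::2]):
        if (isinstance(name, NamePlaceholder)
                and isinstance(value, NamedExpression) and value.resolved):
            new_children += [Literal(value.name), value]
        else:
            new_children += [name, value]  # keep the pair as it is
    return new_children

children = [
    NamePlaceholder(), NamedExpression("id"),
    NamePlaceholder(), NamedExpression("tmp", resolved=False),
    Literal("x"), NamedExpression("y"),
]
print(resolve_create_named_struct(children))
```

Only the first pair changes here: the second value is not resolved yet, and the third pair already carries an explicit name.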

ResolveCoalesceHints

ResolveCoalesceHints Logical Resolution Rule — Resolving UnresolvedHint Operators with COALESCE and REPARTITION Hints

ResolveCoalesceHints is a logical resolution rule that the Spark Analyzer uses to resolve UnresolvedHint logical operators with COALESCE or REPARTITION hints (case-insensitive) to Repartition logical operators (with shuffle disabled for COALESCE and enabled for REPARTITION).

COALESCE or REPARTITION hints expect a partition number as the only parameter.

Technically, ResolveCoalesceHints is a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveCoalesceHints is part of Hints fixed-point batch of rules (that is executed before any other rule).

ResolveCoalesceHints takes no input parameters when created.

ResolveBroadcastHints

ResolveBroadcastHints Logical Resolution Rule — Resolving UnresolvedHint Operators with BROADCAST, BROADCASTJOIN and MAPJOIN Hint Names

ResolveBroadcastHints is a logical resolution rule that the Spark Analyzer uses to resolve UnresolvedHint logical operators with BROADCAST, BROADCASTJOIN or MAPJOIN hints (case-insensitive) to ResolvedHint operators.

Technically, ResolveBroadcastHints is a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveBroadcastHints is part of Hints fixed-point batch of rules (that is executed before any other rule).

ResolveBroadcastHints takes a SQLConf when created.

Resolving UnresolvedHint with BROADCAST, BROADCASTJOIN or MAPJOIN Hint Names (Applying ResolveBroadcastHints to Logical Plan) — apply Method

Note
apply is part of Rule Contract to apply a rule to a logical plan.

apply transforms UnresolvedHint operators whose hint name is BROADCAST, BROADCASTJOIN or MAPJOIN (case-insensitive) into ResolvedHint operators.

For UnresolvedHints with no parameters, apply marks the entire child logical plan as eligible for broadcast, i.e. creates a ResolvedHint with the child operator and a HintInfo with the broadcast flag on.

For UnresolvedHints with parameters defined, apply treats the parameters as the names of the tables to apply the broadcast hint to.

Note
The table names can be of String or UnresolvedAttribute types.

apply reports an AnalysisException for the parameters that are not of String or UnresolvedAttribute types.
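
The parameter handling described in this section can be summarised with a short Python model. It is a sketch under assumptions: BroadcastDecision is a hypothetical summary of the outcome (it is not ResolvedHint or HintInfo), resolve_broadcast_hint is an invented helper, and the error message text is illustrative. How table names are matched against the child plan is not modelled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

class AnalysisException(Exception):
    """Stand-in for Spark's AnalysisException."""

BROADCAST_HINT_NAMES = {"BROADCAST", "BROADCASTJOIN", "MAPJOIN"}

@dataclass(frozen=True)
class UnresolvedAttribute:
    name: str

@dataclass(frozen=True)
class BroadcastDecision:
    whole_child: bool
    tables: Tuple[str, ...] = ()

def resolve_broadcast_hint(hint_name: str,
                           parameters: Sequence[object]) -> Optional[BroadcastDecision]:
    if hint_name.upper() not in BROADCAST_HINT_NAMES:
        return None  # not a broadcast hint; this rule leaves it alone
    if not parameters:
        return BroadcastDecision(whole_child=True)  # broadcast the entire child
    tables = []
    for p in parameters:
        if isinstance(p, str):
            tables.append(p)
        elif isinstance(p, UnresolvedAttribute):
            tables.append(p.name)
        else:
            raise AnalysisException(f"Unsupported broadcast hint parameter: {p!r}")
    return BroadcastDecision(whole_child=False, tables=tuple(tables))

print(resolve_broadcast_hint("mapjoin", []))
print(resolve_broadcast_hint("BROADCAST", ["t1", UnresolvedAttribute("t2")]))
```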

applyBroadcastHint Internal Method

applyBroadcastHint…​FIXME

Note
applyBroadcastHint is used exclusively when ResolveBroadcastHints is requested to execute.
