UnresolvedInlineTable

UnresolvedInlineTable Logical Operator

UnresolvedInlineTable is a leaf logical operator (it has no child operator) that represents an inline table (aka virtual table in Apache Hive).

UnresolvedInlineTable is created when AstBuilder is requested to parse an inline table in a SQL statement.

UnresolvedInlineTable can never be resolved and is converted to a LocalRelation by the ResolveInlineTables logical resolution rule.

UnresolvedInlineTable has an empty output schema (no output attributes).

UnresolvedInlineTable uses expressionsResolved flag that is on (true) only when all the Catalyst expressions in the rows are resolved.
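The properties above can be checked directly on an operator instance. The following is a minimal sketch that assumes the spark-catalyst module is on the classpath and that UnresolvedInlineTable has the two-argument constructor (column names and rows of expressions) found in the Spark 2.x and 3.x sources; the object name InlineTableSketch is made up for illustration.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedInlineTable
import org.apache.spark.sql.catalyst.expressions.Literal

// Hypothetical demo object, not part of Spark.
object InlineTableSketch {
  // Roughly the operator a parser would build for:
  //   VALUES (1, 'one'), (2, 'two') AS t(id, name)
  // Assumption: constructor shape (names, rows) as in Spark 2.x/3.x.
  val table: UnresolvedInlineTable = UnresolvedInlineTable(
    names = Seq("id", "name"),
    rows = Seq(
      Seq(Literal(1), Literal("one")),
      Seq(Literal(2), Literal("two"))))

  def main(args: Array[String]): Unit = {
    // The operator itself never reports as resolved...
    println(s"resolved            = ${table.resolved}")
    // ...even though every expression in the rows (literals here) is resolved.
    println(s"expressionsResolved = ${table.expressionsResolved}")
    // No output attributes until the analyzer replaces the operator.
    println(s"output              = ${table.output}")
  }
}
```

Literals are used for the rows because they need no analysis, which keeps the expressionsResolved flag on without a running SparkSession.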

Creating UnresolvedInlineTable Instance

UnresolvedInlineTable takes the following when created:

UnresolvedHint

UnresolvedHint Unary Logical Operator — Attaching Hint to Logical Plan

UnresolvedHint is a unary logical operator that represents a hint (by name and parameters) for the child logical plan.

UnresolvedHint is created and added to a logical plan when:

When created UnresolvedHint takes:

UnresolvedHint can never be resolved and is supposed to be converted to a ResolvedHint unary logical operator during query analysis (or simply removed from a logical plan).

Note

Spark Analyzer uses the following logical rules to analyze logical plans with the UnresolvedHint logical operator:

  1. ResolveBroadcastHints resolves UnresolvedHint operators with BROADCAST, BROADCASTJOIN, MAPJOIN hints to a ResolvedHint

  2. ResolveCoalesceHints resolves UnresolvedHint logical operators with COALESCE or REPARTITION hints

  3. RemoveAllHints simply removes all UnresolvedHint operators

The order of executing the above rules matters.
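One way to watch the hint rules at work is to compare the parsed and analyzed plans of a query with a hint. The sketch below assumes a local (classic, non-Connect) SparkSession and uses the public Dataset.hint operator; the object and helper names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, ResolvedHint, UnresolvedHint}

// Hypothetical demo object, not part of Spark.
object HintResolutionSketch {
  lazy val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("hint-resolution-sketch")
    .getOrCreate()

  // Dataset.hint wraps the logical plan in an UnresolvedHint operator.
  lazy val qe = spark.range(1).hint("broadcast").queryExecution

  def hasUnresolvedHint(plan: LogicalPlan): Boolean =
    plan.collectFirst { case h: UnresolvedHint => h }.isDefined

  def hasResolvedHint(plan: LogicalPlan): Boolean =
    plan.collectFirst { case h: ResolvedHint => h }.isDefined

  def main(args: Array[String]): Unit = {
    // Plan before analysis: the hint is still unresolved.
    println(qe.logical.numberedTreeString)
    // Plan after analysis: the hint rules have replaced UnresolvedHint.
    println(qe.analyzed.numberedTreeString)
    spark.stop()
  }
}
```

The name of the rule that handles the BROADCAST hint differs between Spark versions, so the sketch only inspects the resulting operators rather than the rule itself.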

UnresolvedHint uses the child operator’s output schema as its own.

Tip

Use hint operator from Catalyst DSL to create a UnresolvedHint logical operator, e.g. for testing or Spark SQL internals exploration.
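As an illustration of the tip, the hint operator could be used as follows. This is a sketch that assumes spark-catalyst on the classpath; the hint name myHint, its parameters and the object name are arbitrary placeholders.

```scala
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
import org.apache.spark.sql.types.IntegerType

// Hypothetical demo object, not part of Spark.
object HintDslSketch {
  // A one-column relation to attach the hint to.
  val relation: LocalRelation = LocalRelation(AttributeReference("id", IntegerType)())

  // The Catalyst DSL adds hint(name, parameters*) to any logical plan.
  // "myHint" and its parameters are made-up values.
  val hinted: LogicalPlan = relation.hint("myHint", 100, true)

  def main(args: Array[String]): Unit = {
    println(hinted.numberedTreeString)
    println(s"resolved = ${hinted.resolved}")
    println(s"output   = ${hinted.output}")
  }
}
```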

UnresolvedCatalogRelation

UnresolvedCatalogRelation Leaf Logical Operator — Placeholder of Catalog Tables

UnresolvedCatalogRelation is a leaf logical operator that acts as a placeholder in a logical query plan until FindDataSourceTable logical evaluation rule resolves it to a concrete relation logical plan (i.e. a LogicalRelation for a data source table or a HiveTableRelation for a Hive table).

UnresolvedCatalogRelation is created when SessionCatalog is requested to find a relation (for DescribeTableCommand logical command or ResolveRelations logical evaluation rule).

When created, UnresolvedCatalogRelation asserts that the database is specified.

UnresolvedCatalogRelation can never be resolved and is converted to a LogicalRelation for a data source table or a HiveTableRelation for a Hive table at analysis phase.

UnresolvedCatalogRelation uses an empty output schema.

UnresolvedCatalogRelation takes a single CatalogTable when created.

Union

Union Logical Operator

Union is…​FIXME

TypedFilter

TypedFilter Logical Operator

TypedFilter is…​FIXME

SubqueryAlias

SubqueryAlias Unary Logical Operator

SubqueryAlias is a unary logical operator that represents an aliased subquery (i.e. the child logical query plan with the alias in the output schema).

SubqueryAlias is created when:

SubqueryAlias simply requests the child logical operator for the canonicalized version.

When requested for output schema attributes, SubqueryAlias requests the child logical operator for them and adds the alias as a qualifier.

Note
EliminateSubqueryAliases logical optimization eliminates (removes) SubqueryAlias operators from a logical query plan.
Note
RewriteCorrelatedScalarSubquery logical optimization rewrites correlated scalar subqueries with SubqueryAlias operators.

Catalyst DSL — subquery And as Operators

subquery and as operators in Catalyst DSL create a SubqueryAlias logical operator, e.g. for testing or Spark SQL internals exploration.
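A short sketch of the as operator, assuming spark-catalyst on the classpath. It also shows the alias turning up as the qualifier of the output attributes. The relation, the alias t and the object name are placeholders.

```scala
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
import org.apache.spark.sql.types.IntegerType

// Hypothetical demo object, not part of Spark.
object SubqueryAliasSketch {
  val relation: LocalRelation = LocalRelation(AttributeReference("id", IntegerType)())

  // The Catalyst DSL's as operator wraps the plan in a SubqueryAlias.
  val aliased: LogicalPlan = relation.as("t")

  def main(args: Array[String]): Unit = {
    println(aliased.numberedTreeString)
    // Each output attribute of the child now carries the alias as a qualifier.
    aliased.output.foreach(a => println(s"${a.name} qualifier=${a.qualifier}"))
  }
}
```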

Creating SubqueryAlias Instance

SubqueryAlias takes the following when created:

Sort

Sort Unary Logical Operator

Sort is a unary logical operator that represents the following in a logical plan:

Sort takes the following when created:

  • SortOrder ordering expressions

  • global flag for global (true) or partition-only (false) sorting

  • Child logical plan

The output schema of a Sort operator is the output of the child logical operator.

The maxRows of a Sort operator is the maxRows of the child logical operator.

Tip
Use orderBy or sortBy operators from the Catalyst DSL to create a Sort logical operator, e.g. for testing or Spark SQL internals exploration.
Note
Sorting is supported for columns of orderable type only (which is enforced at analysis when CheckAnalysis is requested to checkAnalysis).
Note
Sort logical operator is planned as SortExec unary physical operator when BasicOperators execution planning strategy is executed.

Catalyst DSL — orderBy and sortBy Operators

orderBy and sortBy create a Sort logical operator with the global flag on and off, respectively.
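The difference between the two operators can be sketched as follows, assuming spark-catalyst on the classpath. The relation, the ordering and the object name are illustrative only.

```scala
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.expressions.{Ascending, AttributeReference, SortOrder}
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, Sort}
import org.apache.spark.sql.types.IntegerType

// Hypothetical demo object, not part of Spark.
object SortDslSketch {
  val id: AttributeReference = AttributeReference("id", IntegerType)()
  val relation: LocalRelation = LocalRelation(id)
  val ordering: SortOrder = SortOrder(id, Ascending)

  // orderBy: global sort (global flag on).
  val globalSort: Sort = relation.orderBy(ordering).asInstanceOf[Sort]
  // sortBy: partition-only sort (global flag off).
  val localSort: Sort = relation.sortBy(ordering).asInstanceOf[Sort]

  def main(args: Array[String]): Unit = {
    println(s"orderBy global = ${globalSort.global}")
    println(s"sortBy  global = ${localSort.global}")
    // The output schema and maxRows are taken from the child operator.
    println(s"output  = ${globalSort.output}")
    println(s"maxRows = ${globalSort.maxRows}")
  }
}
```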

ShowTablesCommand

ShowTablesCommand Logical Command

ShowTablesCommand is a logical command for…​FIXME

Executing Logical Command — run Method

Note
run is part of RunnableCommand Contract to execute (run) a logical command.

run…​FIXME

ShowCreateTableCommand

ShowCreateTableCommand Logical Command

ShowCreateTableCommand is a logical command that executes a SHOW CREATE TABLE SQL statement (with a data source / non-Hive or a Hive table).

ShowCreateTableCommand is created when SparkSqlAstBuilder is requested to parse SHOW CREATE TABLE SQL statement.

ShowCreateTableCommand uses a single createtab_stmt column (of type StringType) for the output schema.

ShowCreateTableCommand takes a single TableIdentifier when created.

Executing Logical Command — run Method

Note
run is part of RunnableCommand Contract to execute (run) a logical command.

run requests the SparkSession for the SessionState that is used to access the SessionCatalog.

run then calls showCreateDataSourceTable for a data source / non-Hive table or showCreateHiveTable for a Hive table (per the table metadata).

In the end, run returns the CREATE TABLE statement in a single Row.
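From the user's side, the command can be exercised with a SHOW CREATE TABLE statement. The sketch below assumes a local SparkSession with the default in-memory catalog and a writable warehouse directory; the table name demo_show_create and the object name are made up. Which command class actually runs the statement may vary between Spark versions, so the sketch only looks at the result.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical demo object, not part of Spark.
object ShowCreateTableSketch {
  lazy val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("show-create-table-sketch")
    .getOrCreate()

  // Creates a small data source (non-Hive) table and asks for its DDL.
  // demo_show_create is a made-up table name.
  def showCreate(): DataFrame = {
    spark.sql("CREATE TABLE IF NOT EXISTS demo_show_create (id INT) USING parquet")
    spark.sql("SHOW CREATE TABLE demo_show_create")
  }

  def main(args: Array[String]): Unit = {
    val df = showCreate()
    df.printSchema()            // a single string column
    df.show(truncate = false)   // a single row with the CREATE TABLE statement
    spark.sql("DROP TABLE IF EXISTS demo_show_create")
    spark.stop()
  }
}
```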

showHiveTableNonDataColumns Internal Method

showHiveTableNonDataColumns…​FIXME

Note
showHiveTableNonDataColumns is used exclusively when ShowCreateTableCommand logical command is requested to showCreateHiveTable.

showCreateHiveTable Internal Method

showCreateHiveTable…​FIXME

Note
showCreateHiveTable is used exclusively when ShowCreateTableCommand logical command is executed (with a Hive table).

showHiveTableHeader Internal Method

showHiveTableHeader…​FIXME

Note
showHiveTableHeader is used exclusively when ShowCreateTableCommand logical command is requested to showCreateHiveTable.

SaveIntoDataSourceCommand

SaveIntoDataSourceCommand Logical Command

SaveIntoDataSourceCommand is a logical command that, when executed, FIXME.

SaveIntoDataSourceCommand is created exclusively when DataSource is requested to create a logical command for writing (to a CreatableRelationProvider data source).

SaveIntoDataSourceCommand returns the logical query plan when requested for the inner nodes (that should be shown as an inner nested tree of this node).

SaveIntoDataSourceCommand redacts the options for the simple description with state prefix.

Executing Logical Command — run Method

Note
run is part of RunnableCommand Contract to execute (run) a logical command.

In the end, run returns an empty Seq[Row] (just to follow the signature and please the Scala compiler).

Creating SaveIntoDataSourceCommand Instance

SaveIntoDataSourceCommand takes the following when created: