ResolveRelations Logical Resolution Rule — Resolving UnresolvedRelations With Tables in Catalog
ResolveRelations is a logical resolution rule that the logical query plan analyzer uses to resolve UnresolvedRelations (in a logical query plan), i.e.
- UnresolvedRelation logical operators in InsertIntoTable operators
- Other uses of UnresolvedRelation

Technically, ResolveRelations is just a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

ResolveRelations is part of the Resolution fixed-point batch of rules.
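To make the idea of a rule in a fixed-point batch more concrete, here is a minimal sketch in plain Scala. It is a toy model only: ToyPlan, ToyRule, ToyResolveRelations and FixedPointSketch are hypothetical names that do not exist in Spark, and the iteration cap of 100 is an assumed value for illustration. The sketch shows a rule as a function from plan to plan that is re-applied until the plan stops changing.

```scala
// Toy model of a Catalyst-style rule; none of these types are Spark's.
sealed trait ToyPlan
final case class Unresolved(name: String) extends ToyPlan
final case class Resolved(name: String) extends ToyPlan
final case class Project(child: ToyPlan) extends ToyPlan

// A rule is just a transformation from one plan to another
trait ToyRule { def apply(plan: ToyPlan): ToyPlan }

// Resolves the names it knows about and leaves everything else untouched
final class ToyResolveRelations(known: Set[String]) extends ToyRule {
  def apply(plan: ToyPlan): ToyPlan = plan match {
    case Unresolved(n) if known(n) => Resolved(n)
    case Project(child)            => Project(apply(child))
    case other                     => other
  }
}

object FixedPointSketch {
  // Applies all rules in order, repeating until the plan no longer changes
  // or the (hypothetical) iteration cap is reached
  @annotation.tailrec
  def run(plan: ToyPlan, rules: Seq[ToyRule], maxIterations: Int = 100): ToyPlan = {
    val next = rules.foldLeft(plan)((p, rule) => rule(p))
    if (next == plan || maxIterations <= 1) next
    else run(next, rules, maxIterations - 1)
  }
}
```

With a rule that knows about t1, FixedPointSketch.run(Project(Unresolved("t1")), Seq(new ToyResolveRelations(Set("t1")))) yields Project(Resolved("t1")), while an unknown name is left unresolved for other rules (or a later failure) to deal with.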
// Example: InsertIntoTable with UnresolvedRelation
import org.apache.spark.sql.catalyst.dsl.plans._
val plan = table("t1").insertInto(tableName = "t2", overwrite = true)
scala> println(plan.numberedTreeString)
00 'InsertIntoTable 'UnresolvedRelation `t2`, true, false
01 +- 'UnresolvedRelation `t1`

// Register the tables so the following resolution works
sql("CREATE TABLE IF NOT EXISTS t1(id long)")
sql("CREATE TABLE IF NOT EXISTS t2(id long)")

// ResolveRelations is a Scala object of the Analyzer class
// We need an instance of the Analyzer class to access it
import spark.sessionState.analyzer.ResolveRelations
val resolvedPlan = ResolveRelations(plan)
scala> println(resolvedPlan.numberedTreeString)
00 'InsertIntoTable 'UnresolvedRelation `t2`, true, false
01 +- 'SubqueryAlias t1
02    +- 'UnresolvedCatalogRelation `default`.`t1`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe

// Example: Other uses of UnresolvedRelation
// Use a temporary view
val v1 = spark.range(1).createOrReplaceTempView("v1")
scala> spark.catalog.listTables.filter($"name" === "v1").show
+----+--------+-----------+---------+-----------+
|name|database|description|tableType|isTemporary|
+----+--------+-----------+---------+-----------+
|  v1|    null|       null|TEMPORARY|       true|
+----+--------+-----------+---------+-----------+

import org.apache.spark.sql.catalyst.dsl.expressions._
val plan = table("v1").select(star())
scala> println(plan.numberedTreeString)
00 'Project [*]
01 +- 'UnresolvedRelation `v1`

val resolvedPlan = ResolveRelations(plan)
scala> println(resolvedPlan.numberedTreeString)
00 'Project [*]
01 +- SubqueryAlias v1
02    +- Range (0, 1, step=1, splits=Some(8))

// Example
import org.apache.spark.sql.catalyst.dsl.plans._
val plan = table(db = "db1", ref = "t1")
scala> println(plan.numberedTreeString)
00 'UnresolvedRelation `db1`.`t1`

// Register the database so the following resolution works
sql("CREATE DATABASE IF NOT EXISTS db1")

val resolvedPlan = ResolveRelations(plan)
scala> println(resolvedPlan.numberedTreeString)
00 'SubqueryAlias t1
01 +- 'UnresolvedCatalogRelation `db1`.`t1`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
Applying ResolveRelations to Logical Plan — apply Method
apply(plan: LogicalPlan): LogicalPlan
Note: apply is part of the Rule Contract to apply a rule to a logical plan.

apply …FIXME
Resolving Relation — resolveRelation Method
resolveRelation(plan: LogicalPlan): LogicalPlan
resolveRelation …FIXME

Note: resolveRelation is used when…FIXME
isRunningDirectlyOnFiles Internal Method
isRunningDirectlyOnFiles(table: TableIdentifier): Boolean
isRunningDirectlyOnFiles is enabled (i.e. true) when all of the following conditions hold:
- The database of the input table is defined
- spark.sql.runSQLOnFiles internal configuration property is enabled
- The table is not a temporary table
- The database or the table does not exist (in the SessionCatalog)
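The four conditions above combine into a single boolean conjunction, which the following plain-Scala sketch illustrates. It is not Spark code: TableId, ToyCatalog and RunOnFilesSketch are hypothetical stand-ins for TableIdentifier and SessionCatalog, the configuration property is passed in as a plain flag, and the rule that only unqualified names can be temporary tables is an assumption made for the example.

```scala
// Toy stand-ins; these are NOT Spark's TableIdentifier or SessionCatalog.
final case class TableId(table: String, database: Option[String] = None)

final case class ToyCatalog(
    databases: Set[String],
    tables: Set[(String, String)], // (database, table) pairs
    tempViews: Set[String]) {
  def databaseExists(db: String): Boolean = databases.contains(db)
  def tableExists(id: TableId): Boolean =
    id.database.exists(db => tables.contains((db, id.table)))
  // Assumption for this sketch: only unqualified names can be temporary tables
  def isTemporaryTable(id: TableId): Boolean =
    id.database.isEmpty && tempViews.contains(id.table)
}

object RunOnFilesSketch {
  // All four conditions from the list above must hold at once
  def isRunningDirectlyOnFiles(
      id: TableId,
      runSQLOnFiles: Boolean, // stands in for spark.sql.runSQLOnFiles
      catalog: ToyCatalog): Boolean =
    id.database.isDefined &&
      runSQLOnFiles &&
      !catalog.isTemporaryTable(id) &&
      (!catalog.databaseExists(id.database.get) || !catalog.tableExists(id))
}
```

Under these assumptions, an identifier such as TableId("/tmp/people.parquet", Some("parquet")) passes the check when no database named parquet is registered, whereas TableId("t1", Some("default")) does not pass once default.t1 exists in the catalog.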
Note: isRunningDirectlyOnFiles is used exclusively when ResolveRelations resolves a relation (as an UnresolvedRelation leaf logical operator for a table reference).
Finding Table in Session-Scoped Catalog of Relational Entities — lookupTableFromCatalog Internal Method
lookupTableFromCatalog(
  u: UnresolvedRelation,
  defaultDatabase: Option[String] = None): LogicalPlan
lookupTableFromCatalog simply requests SessionCatalog to find the table in relational catalogs.
Note: lookupTableFromCatalog requests Analyzer for the current SessionCatalog.

Note: The table is described using TableIdentifier of the input UnresolvedRelation.
lookupTableFromCatalog fails the analysis phase (by reporting an AnalysisException) when the table or the table’s database cannot be found.
Note: lookupTableFromCatalog is used when ResolveRelations is executed (for InsertIntoTable with UnresolvedRelation operators) or resolves a relation (for “standalone” UnresolvedRelations).