RelationConversions Logical PostHoc Evaluation Rule — Converting Hive Tables
RelationConversions is a logical posthoc resolution rule that the Hive-specific logical query plan analyzer uses to convert a Hive table…FIXME.
Note: A Hive table is a table whose provider in the table metadata is hive.
Caution: FIXME Show example of a hive table, e.g. spark.table(…)
RelationConversions is created exclusively when the Hive-specific logical query plan analyzer is created.
Executing Rule — apply Method
apply(plan: LogicalPlan): LogicalPlan
Note: apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).
apply traverses the input logical plan looking for an InsertIntoTable logical operator with a HiveTableRelation, or a HiveTableRelation logical operator alone.
For an InsertIntoTable with a non-partitioned HiveTableRelation (that can be converted), apply converts the HiveTableRelation to a LogicalRelation.
For a HiveTableRelation logical operator alone apply…FIXME
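The InsertIntoTable case described above can be sketched with a toy plan model. This is a minimal illustration only: the case classes below are hypothetical stand-ins that merely borrow the names of Spark's operators, `applyRule`, `Project` and the `convertible` flag are assumptions made for the example, and the standalone HiveTableRelation case is deliberately left untouched because the text above does not describe it.

```scala
object ApplySketch {
  // Hypothetical, simplified stand-ins for Spark's logical operators
  sealed trait Plan
  final case class HiveTableRelation(
      table: String,
      partitioned: Boolean,
      convertible: Boolean) extends Plan // `convertible` models the isConvertible check
  final case class LogicalRelation(table: String) extends Plan
  final case class InsertIntoTable(target: Plan, query: Plan) extends Plan
  final case class Project(child: Plan) extends Plan // any other unary operator

  // Models convert: swaps the Hive relation for a data source relation
  def convert(r: HiveTableRelation): LogicalRelation = LogicalRelation(r.table)

  // Models apply: walks the plan and rewrites only the matching insert targets
  def applyRule(plan: Plan): Plan = plan match {
    case InsertIntoTable(r: HiveTableRelation, query) if !r.partitioned && r.convertible =>
      InsertIntoTable(convert(r), applyRule(query))
    case InsertIntoTable(target, query) =>
      InsertIntoTable(target, applyRule(query))
    case Project(child) =>
      Project(applyRule(child))
    case other =>
      other // standalone relations are not handled in this sketch
  }
}
```

In this model, an insert into a partitioned table, or into a table that is not convertible, comes back unchanged.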
Does Table Use Parquet or ORC SerDe? — isConvertible Internal Method
isConvertible(relation: HiveTableRelation): Boolean
isConvertible is true when the input HiveTableRelation is a parquet or ORC table (and the corresponding configuration property is enabled).
Internally, isConvertible takes the Hive SerDe of the table (from table metadata) if available or assumes no SerDe.
isConvertible is true when either of the following conditions holds:
- The Hive SerDe is parquet (aka parquet table) and the spark.sql.hive.convertMetastoreParquet configuration property is enabled (which it is by default)
- The Hive SerDe is orc (aka orc table) and the spark.sql.hive.convertMetastoreOrc internal configuration property is enabled (which it is by default)
Note: isConvertible is used when RelationConversions is executed.
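The two conditions can be written down as a small predicate. This is a sketch under stated assumptions, not Spark's code: `Conf` is a hypothetical holder for the two configuration properties, and matching the SerDe by a case-insensitive substring of its class name is an assumption of this example.

```scala
import java.util.Locale

object IsConvertibleSketch {
  // Hypothetical holder for spark.sql.hive.convertMetastoreParquet and
  // spark.sql.hive.convertMetastoreOrc (both enabled by default per the text)
  final case class Conf(
      convertMetastoreParquet: Boolean = true,
      convertMetastoreOrc: Boolean = true)

  // `serde` is the Hive SerDe class name from table metadata, if any
  def isConvertible(serde: Option[String], conf: Conf): Boolean = {
    // No SerDe is treated as the empty string, which matches neither format
    val name = serde.getOrElse("").toLowerCase(Locale.ROOT)
    // Assumption: a substring match identifies parquet and orc tables
    (name.contains("parquet") && conf.convertMetastoreParquet) ||
    (name.contains("orc") && conf.convertMetastoreOrc)
  }
}
```

For instance, a table stored with org.apache.hadoop.hive.ql.io.orc.OrcSerde is convertible in this sketch only while the ORC property stays enabled.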
Converting HiveTableRelation to LogicalRelation — convert Internal Method
convert(relation: HiveTableRelation): LogicalRelation
convert takes the SerDe of (the storage of) the input HiveTableRelation and converts the HiveTableRelation to a LogicalRelation, i.e.
- For a parquet serde, convert adds the mergeSchema option with the value of the spark.sql.hive.convertMetastoreParquet.mergeSchema configuration property (disabled by default) and requests HiveMetastoreCatalog to convertToLogicalRelation (with ParquetFileFormat as fileFormatClass).
For a non-parquet serde, convert assumes the ORC format.
- When the spark.sql.orc.impl configuration property is native (default), convert requests HiveMetastoreCatalog to convertToLogicalRelation (with org.apache.spark.sql.execution.datasources.orc.OrcFileFormat as fileFormatClass).
- Otherwise, convert requests HiveMetastoreCatalog to convertToLogicalRelation (with org.apache.spark.sql.hive.orc.OrcFileFormat as fileFormatClass).
Note: convert uses HiveSessionCatalog to access the HiveMetastoreCatalog.
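The choice that convert makes can be summarized as a small decision function. This is only a sketch of the branching described above: `Conversion` and `choose` are hypothetical names, class names are plain strings rather than real classes, the SerDe is matched by substring as an assumption, and the actual call to convertToLogicalRelation is not modeled.

```scala
import java.util.Locale

object ConvertSketch {
  // Hypothetical result type: what convert would hand to convertToLogicalRelation
  final case class Conversion(fileFormatClass: String, options: Map[String, String])

  // mergeSchema models spark.sql.hive.convertMetastoreParquet.mergeSchema
  // orcImpl models spark.sql.orc.impl
  def choose(serde: String, mergeSchema: Boolean = false, orcImpl: String = "native"): Conversion = {
    val name = serde.toLowerCase(Locale.ROOT)
    if (name.contains("parquet")) {
      // Parquet: pass the mergeSchema setting along as a data source option
      Conversion(
        "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat",
        Map("mergeSchema" -> mergeSchema.toString))
    } else if (orcImpl == "native") {
      // Anything that is not parquet is assumed to be ORC
      Conversion("org.apache.spark.sql.execution.datasources.orc.OrcFileFormat", Map.empty)
    } else {
      Conversion("org.apache.spark.sql.hive.orc.OrcFileFormat", Map.empty)
    }
  }
}
```

Note that the non-parquet branch never inspects the SerDe again, which mirrors the "assumes the ORC format" wording: callers are expected to have checked convertibility first.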