HiveClientImpl — The One and Only HiveClient
HiveClientImpl is the only available HiveClient in Spark SQL that does/uses…FIXME
HiveClientImpl is created exclusively when IsolatedClientLoader is requested to create a new Hive client. When created, HiveClientImpl is given the location of the default database for the Hive metastore warehouse (i.e. warehouseDir, which is the value of the hive.metastore.warehouse.dir Hive-specific Hadoop configuration property).
Note: The location of the default database for the Hive metastore warehouse is /user/hive/warehouse by default.
Note: If you use Spark before 2.1 (which you should not, as it is no longer supported), you may be interested in SPARK-19664 put ‘hive.metastore.warehouse.dir’ in hadoopConf place.
Note: The Hadoop configuration is what HiveExternalCatalog was given when created, i.e. the default Hadoop configuration from Spark Core’s SparkContext.hadoopConfiguration together with the Spark properties that use the spark.hadoop prefix.
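To make the role of the spark.hadoop prefix concrete, here is a minimal sketch in plain Scala (no Spark on the classpath). It assumes that a Spark property named spark.hadoop.X ends up in the Hadoop configuration as X. The object and method names (SparkHadoopPrefixSketch, hadoopOverrides) are hypothetical and only illustrate the idea; this is not Spark's own code.

```scala
// A sketch only: models Spark properties as a plain Map.
object SparkHadoopPrefixSketch {
  // Assumption: properties with this prefix are handed to Hadoop with the prefix removed.
  val Prefix = "spark.hadoop."

  // Keeps only the prefixed properties and strips the prefix from their keys.
  def hadoopOverrides(sparkProps: Map[String, String]): Map[String, String] =
    sparkProps.collect {
      case (key, value) if key.startsWith(Prefix) => key.stripPrefix(Prefix) -> value
    }
}
```

With this sketch, a property such as spark.hadoop.hive.metastore.warehouse.dir would surface as hive.metastore.warehouse.dir, while properties without the prefix (e.g. spark.app.name) are left out.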
Tip: Enable DEBUG logging level for the HiveClientImpl logger to see what happens inside. Refer to Logging.
renamePartitions Method
```scala
renamePartitions(
  db: String,
  table: String,
  specs: Seq[TablePartitionSpec],
  newSpecs: Seq[TablePartitionSpec]): Unit
```
Note: renamePartitions is part of HiveClient Contract to…FIXME.
renamePartitions …FIXME
alterPartitions Method
```scala
alterPartitions(
  db: String,
  table: String,
  newParts: Seq[CatalogTablePartition]): Unit
```
Note: alterPartitions is part of HiveClient Contract to…FIXME.
alterPartitions …FIXME
getPartitions Method
```scala
getPartitions(
  table: CatalogTable,
  spec: Option[TablePartitionSpec]): Seq[CatalogTablePartition]
```
Note: getPartitions is part of HiveClient Contract to…FIXME.
getPartitions …FIXME
getPartitionsByFilter Method
```scala
getPartitionsByFilter(
  table: CatalogTable,
  predicates: Seq[Expression]): Seq[CatalogTablePartition]
```
Note: getPartitionsByFilter is part of HiveClient Contract to…FIXME.
getPartitionsByFilter …FIXME
getPartitionOption Method
```scala
getPartitionOption(
  table: CatalogTable,
  spec: TablePartitionSpec): Option[CatalogTablePartition]
```
Note: getPartitionOption is part of HiveClient Contract to…FIXME.
getPartitionOption …FIXME
Creating HiveClientImpl Instance
HiveClientImpl takes the following when created:
HiveClientImpl initializes the internal registries and counters.
Retrieving Table Metadata If Available — getTableOption
Method
```scala
def getTableOption(dbName: String, tableName: String): Option[CatalogTable]
```
Note: getTableOption is part of HiveClient Contract to…FIXME.
When executed, getTableOption prints out the following DEBUG message to the logs:
```
Looking up [dbName].[tableName]
```
getTableOption then requests the Hive client to retrieve the metadata of the table and, if the table is available, creates a CatalogTable from it.
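The "if available" behaviour can be sketched in plain Scala as follows. The sketch assumes that the underlying client signals a missing table with null, which is then wrapped in an Option. TableMeta, rawLookup and the in-memory store are hypothetical stand-ins for CatalogTable and the Hive client; they are not Spark or Hive APIs.

```scala
// Hypothetical stand-in for CatalogTable.
final case class TableMeta(db: String, name: String)

object GetTableOptionSketch {
  // Hypothetical raw lookup that mimics a Java-style client returning null for a missing table.
  def rawLookup(store: Map[(String, String), TableMeta], db: String, table: String): TableMeta =
    store.get((db, table)).orNull

  // Logs the lookup (println stands in for a DEBUG log call), then wraps the possibly-null result.
  def getTableOption(
      store: Map[(String, String), TableMeta],
      db: String,
      table: String): Option[TableMeta] = {
    println(s"Looking up $db.$table")
    Option(rawLookup(store, db, table))
  }
}
```

Callers can then pattern match on Some and None instead of checking for null themselves.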
Creating Table Statistics from Hive’s Table or Partition Parameters — readHiveStats
Internal Method
```scala
readHiveStats(properties: Map[String, String]): Option[CatalogStatistics]
```
readHiveStats creates a CatalogStatistics from the input Hive table or partition parameters (if available and greater than 0).
Hive Parameter | Table Statistics |
---|---|
numRows | rowCount |
rawDataSize | sizeInBytes |
totalSize | sizeInBytes |
Note: The totalSize Hive parameter takes precedence over rawDataSize for the sizeInBytes table statistic.
Note: readHiveStats is used when HiveClientImpl is requested for the metadata of a table or table partition.
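The precedence rule for readHiveStats can be sketched in plain Scala as below. The sketch assumes the Hive parameter keys are totalSize, rawDataSize and numRows, and that a statistic is only produced when a positive size is present. Stats and ReadHiveStatsSketch are hypothetical names standing in for CatalogStatistics and the internal method; this is an illustration, not the actual implementation.

```scala
import scala.util.Try

// Hypothetical stand-in for CatalogStatistics.
final case class Stats(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

object ReadHiveStatsSketch {
  // Reads a parameter as BigInt and keeps it only if it parses and is greater than 0.
  private def positive(props: Map[String, String], key: String): Option[BigInt] =
    props.get(key).flatMap(v => Try(BigInt(v)).toOption).filter(_ > 0)

  // totalSize wins over rawDataSize; without a positive size there are no statistics at all.
  def readStats(props: Map[String, String]): Option[Stats] = {
    val rowCount = positive(props, "numRows")
    positive(props, "totalSize")
      .orElse(positive(props, "rawDataSize"))
      .map(size => Stats(size, rowCount))
  }
}
```

Using orElse keeps the precedence explicit: rawDataSize is consulted only when totalSize is missing, unparsable or not positive.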
Retrieving Table Partition Metadata (Converting Table Partition Metadata from Hive Format to Spark SQL Format) — fromHivePartition
Method
```scala
fromHivePartition(hp: HivePartition): CatalogTablePartition
```
fromHivePartition simply creates a CatalogTablePartition with the following:
- spec from Hive’s Partition.getSpec if available
- storage from Hive’s StorageDescriptor of the table partition
- parameters from Hive’s Partition.getParameters if available
- stats from Hive’s Partition.getParameters if available and converted to table statistics format
Note: fromHivePartition is used when HiveClientImpl is requested for getPartitionOption, getPartitions and getPartitionsByFilter.
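The field-by-field conversion listed above can be sketched in plain Scala with simplified stand-in types. The sketch assumes that "if available" means the Hive-side value may be null, and it derives the statistics from a totalSize parameter only. HivePartitionLike, Storage, PartitionStats and CatalogPartitionLike are hypothetical types, not the Hive or Spark classes.

```scala
import scala.util.Try

// Hypothetical, simplified stand-ins; fields of HivePartitionLike may be null.
final case class HivePartitionLike(
    spec: Map[String, String],
    location: String,
    parameters: Map[String, String])
final case class Storage(location: Option[String])
final case class PartitionStats(sizeInBytes: BigInt)
final case class CatalogPartitionLike(
    spec: Map[String, String],
    storage: Storage,
    parameters: Map[String, String],
    stats: Option[PartitionStats])

object FromHivePartitionSketch {
  def convert(hp: HivePartitionLike): CatalogPartitionLike = {
    // Null-safe access: a missing parameter map becomes an empty map.
    val params = Option(hp.parameters).getOrElse(Map.empty[String, String])
    CatalogPartitionLike(
      spec = Option(hp.spec).getOrElse(Map.empty[String, String]),
      storage = Storage(Option(hp.location)),
      parameters = params,
      // Statistics are derived from the same parameters, only for a positive size.
      stats = params.get("totalSize")
        .flatMap(v => Try(BigInt(v)).toOption)
        .filter(_ > 0)
        .map(size => PartitionStats(size)))
  }
}
```

Note that the parameters are read once and reused for both the parameters field and the statistics, which mirrors the last two items of the list.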
Converting Native Table Metadata to Hive’s Table — toHiveTable
Method
```scala
toHiveTable(table: CatalogTable, userName: Option[String] = None): HiveTable
```
toHiveTable simply creates a new Hive Table and copies the properties from the input CatalogTable.