AnalyzePartitionCommand

AnalyzePartitionCommand Logical Command — Computing Partition-Level Statistics (Total Size and Row Count)

AnalyzePartitionCommand is a logical command that computes statistics (i.e. total size and row count) for table partitions and stores the stats in a metastore.

AnalyzePartitionCommand is created exclusively for ANALYZE TABLE statements that have a PARTITION specification and no FOR COLUMNS clause.
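To make the statement shape concrete, here is a minimal sketch that renders such an ANALYZE TABLE statement as a string. The helper name analyze_partition_sql and the sample table and column names are hypothetical, and the sketch assumes partition values are plain strings with no quote characters in them. The resulting text is what one would hand to spark.sql(...) in a Spark session.

```python
from typing import Dict, Optional


def analyze_partition_sql(table: str,
                          spec: Dict[str, Optional[str]],
                          noscan: bool = False) -> str:
    """Render an ANALYZE TABLE ... PARTITION statement (hypothetical helper)."""
    # A column mapped to None is listed without a value (a partial specification).
    # Assumption: values are simple strings that need no escaping.
    parts = [col if val is None else f"{col}='{val}'" for col, val in spec.items()]
    sql = f"ANALYZE TABLE {table} PARTITION ({', '.join(parts)}) COMPUTE STATISTICS"
    # NOSCAN is optional; this helper never emits a FOR COLUMNS clause.
    return sql + " NOSCAN" if noscan else sql


# Sample (made-up) table "sales" partitioned by ds and hr.
print(analyze_partition_sql("sales", {"ds": "2019-01-01", "hr": None}, noscan=True))
```

Leaving a partition column without a value (hr above) asks for every partition that matches the remaining columns.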

Executing Logical Command (Computing Partition-Level Statistics and Altering Metastore) — run Method

Note
run is part of the RunnableCommand contract for executing a logical command.

run requests the session-specific SessionCatalog for the metadata of the table and makes sure that it is not a view.

Note
run uses the input SparkSession to access the session-specific SessionState that in turn is used to access the current SessionCatalog.

run requests the session-specific SessionCatalog for the partitions per the partition specification.

run finishes early, without touching the metastore, when the table has no partitions defined in the metastore.

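Unless the noscan flag is enabled, run first computes the row count of every partition (using calculateRowCountsPerPartition).

run then calculates the total size (in bytes) (aka partition location size) of every table partition and creates a CatalogStatistics with the new statistics (total size and, if computed, row count) only when they differ from the statistics recorded in the metastore.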
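The "only if different" rule can be illustrated with a small model. This is a simplified Python sketch, not Spark's Scala code: PartitionStats and new_stats_if_changed are hypothetical names, and the handling of a stale row count is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartitionStats:
    """Hypothetical stand-in for CatalogStatistics (size and optional row count)."""
    size_in_bytes: int
    row_count: Optional[int] = None


def new_stats_if_changed(old: Optional[PartitionStats],
                         new_size: int,
                         new_row_count: Optional[int] = None) -> Optional[PartitionStats]:
    """Return new statistics only when they differ from the recorded ones."""
    size_changed = old is None or old.size_in_bytes != new_size
    # A row count is compared only when one was actually computed (no NOSCAN).
    rows_changed = new_row_count is not None and (
        old is None or old.row_count != new_row_count)
    if not (size_changed or rows_changed):
        return None  # nothing to write back to the metastore
    # Assumption of this sketch: if the size changed but no row count was
    # computed, the old row count is treated as stale and is not carried over.
    return PartitionStats(new_size, new_row_count)


recorded = PartitionStats(size_in_bytes=100, row_count=10)
print(new_stats_if_changed(recorded, 100))      # unchanged size, no scan
print(new_stats_if_changed(recorded, 100, 12))  # same size, new row count
```

Returning None for unchanged partitions is what lets the caller skip them when it later alters partition metadata.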

In the end, run alters the partition metadata in the metastore for the partitions whose statistics changed.

run reports a NoSuchPartitionException when the partition specification matches no partitions in the metastore.

run reports an AnalysisException when executed on a view.
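The guard conditions of run can be sketched as a small in-memory model. Everything here is hypothetical: the dictionary-based table, the partitions_to_analyze function, and the two exception classes merely stand in for the catalog and for Spark's AnalysisException and NoSuchPartitionException. The split between "a specification was given but nothing matched" (error) and "no specification and no partitions" (nothing to do) is this sketch's reading of the description above.

```python
from typing import Dict, List, Optional


class AnalysisError(Exception):
    """Stand-in for Spark's AnalysisException."""


class NoSuchPartitionError(Exception):
    """Stand-in for Spark's NoSuchPartitionException."""


def partitions_to_analyze(table: dict,
                          spec: Optional[Dict[str, str]] = None) -> List[dict]:
    """Apply the guards and return the partitions whose stats would be computed."""
    if table["is_view"]:
        raise AnalysisError("views cannot be analyzed")
    wanted = spec or {}
    # Keep partitions that agree with every column given in the specification.
    matched = [p for p in table["partitions"]
               if all(p["spec"].get(col) == val for col, val in wanted.items())]
    if not matched and wanted:
        raise NoSuchPartitionError(f"no partition matches {wanted}")
    # An empty list means there is nothing to analyze and the command finishes.
    return matched


table = {"is_view": False,
         "partitions": [{"spec": {"ds": "2019-01-01"}},
                        {"spec": {"ds": "2019-01-02"}}]}
print(partitions_to_analyze(table, {"ds": "2019-01-02"}))
```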

Computing Row Count Statistics Per Partition — calculateRowCountsPerPartition Internal Method

calculateRowCountsPerPartition…​FIXME

Note
calculateRowCountsPerPartition is used exclusively when AnalyzePartitionCommand is executed.

getPartitionSpec Internal Method

getPartitionSpec…​FIXME

Note
getPartitionSpec is used exclusively when AnalyzePartitionCommand is executed.

Creating AnalyzePartitionCommand Instance

AnalyzePartitionCommand takes the following when created:

  • TableIdentifier

  • Partition specification

  • noscan flag (enabled by default) that indicates whether the NOSCAN option was used
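As a rough picture of these three inputs, the following Python dataclass mirrors them. It is a hypothetical model, not the Scala class itself: the field names and types are assumptions, in particular the representation of the partition specification as a mapping from column name to an optional value.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AnalyzePartitionCommandModel:
    """Hypothetical model of the values the command takes when created."""
    table_ident: str                          # stands in for TableIdentifier
    partition_spec: Dict[str, Optional[str]]  # column -> value, or None if not given
    noscan: bool = True                       # enabled by default

    @property
    def computes_row_counts(self) -> bool:
        # Row counts require scanning the data, so they are skipped with NOSCAN.
        return not self.noscan


cmd = AnalyzePartitionCommandModel("sales", {"ds": "2019-01-01"})
print(cmd.noscan, cmd.computes_row_counts)
```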
