HashPartitioner

HashPartitioner is a Partitioner that distributes keys by hash code across a configurable number of partitions (the partitions constructor argument) when data is shuffled.

Table 1. HashPartitioner Attributes and Methods

| Property | Description |
|---|---|
| numPartitions | Exactly partitions, the number of partitions given at construction. |
| getPartition | 0 for null keys. For non-null keys, the key's hash code (Java's Object.hashCode) modulo numPartitions; a negative remainder has numPartitions added to it, so the result always falls in the range 0 to numPartitions - 1. |
| equals | true when the other object is a HashPartitioner with the same numPartitions. Otherwise, false. |
| hashCode | Exactly numPartitions. |
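The getPartition, equals, and hashCode behaviour in the table can be sketched in plain Java without a Spark dependency. This is a minimal illustration, not Spark's source: the class name HashPartitionerSketch and the helper nonNegativeMod are names chosen for this sketch, and rejecting a non-positive partition count is an assumption made here to keep the modulo well defined.

```java
// Illustrative stand-in for the behaviour described above; not Spark's class.
final class HashPartitionerSketch {
    private final int numPartitions;

    HashPartitionerSketch(int partitions) {
        // Assumption of this sketch: require at least one partition.
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive: " + partitions);
        }
        this.numPartitions = partitions;
    }

    int numPartitions() {
        return numPartitions;
    }

    // null keys go to partition 0; other keys go to hashCode modulo numPartitions.
    int getPartition(Object key) {
        if (key == null) {
            return 0;
        }
        return nonNegativeMod(key.hashCode(), numPartitions);
    }

    // Java's % can return a negative remainder; shift it back into [0, mod).
    static int nonNegativeMod(int x, int mod) {
        int rawMod = x % mod;
        return rawMod < 0 ? rawMod + mod : rawMod;
    }

    // Two partitioners are equal when they have the same number of partitions.
    @Override
    public boolean equals(Object other) {
        return other instanceof HashPartitionerSketch
            && ((HashPartitionerSketch) other).numPartitions == numPartitions;
    }

    @Override
    public int hashCode() {
        return numPartitions;
    }

    public static void main(String[] args) {
        HashPartitionerSketch p = new HashPartitionerSketch(4);
        System.out.println(p.getPartition(null)); // null key
        System.out.println(p.getPartition(7));    // Integer 7 has hash code 7
        System.out.println(p.getPartition(-1));   // negative hash code
        System.out.println(p.equals(new HashPartitionerSketch(4)));
    }
}
```

With four partitions, the Integer key 7 maps to partition 3, and the key -1 also maps to partition 3 because its remainder of -1 is shifted up by 4.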

Note
HashPartitioner is the default Partitioner for the coalesce transformation with shuffle enabled, which is what calling repartition does.

Data may be re-shuffled even when all the records for a key k already sit on a single Spark executor (on a single BlockManager, to be precise). When HashPartitioner's result for a key k1 is 3, the records for k1 go to the partition with index 3, and the executor that holds that partition may not be the one where the records currently live.
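To see why records can move even when they start out in one place, the following sketch assigns a handful of String keys to partitions for two different partition counts. The class ReshuffleSketch and its methods partitionFor and assign are hypothetical names for this illustration and are not part of Spark. It models only the key-to-partition mapping, not executors or the BlockManager.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: models hash partitioning of keys, not Spark's shuffle machinery.
final class ReshuffleSketch {

    // Same rule as described for getPartition: null -> 0, otherwise a non-negative modulo.
    static int partitionFor(Object key, int numPartitions) {
        if (key == null) {
            return 0;
        }
        int rawMod = key.hashCode() % numPartitions;
        return rawMod < 0 ? rawMod + numPartitions : rawMod;
    }

    // Groups keys by the partition index they hash to, ordered by index.
    static Map<Integer, List<String>> assign(List<String> keys, int numPartitions) {
        Map<Integer, List<String>> byPartition = new TreeMap<>();
        for (String key : keys) {
            int index = partitionFor(key, numPartitions);
            byPartition.computeIfAbsent(index, i -> new ArrayList<>()).add(key);
        }
        return byPartition;
    }

    public static void main(String[] args) {
        // All records start in one list, standing in for one location.
        List<String> keys = Arrays.asList("k1", "k2", "k1", "k3", "k1");
        System.out.println(assign(keys, 4));
        System.out.println(assign(keys, 8));
    }
}
```

Every occurrence of k1 lands in the same partition for a given partition count, and that index depends only on the key's hash code and the count, not on where the record currently sits. Changing the count from 4 to 8 changes the index for k1, which is why a repartition can move records that were already together.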
