spark技术分享: dig into the Spark source code and explore Spark best practices

spark-core, page 26

Map/Reduce-side Aggregator

admin · 1318 views · 0 likes

Aggregator is a set of functions used to aggregate distributed data sets: ...

RDD Dependencies

admin · 1515 views · 0 likes

The Dependency class is the abstract base class that models a dependency relationship between two or m ...

Checkpointing

admin · 1242 views · 0 likes

Checkpointing is the process of truncating an RDD lineage graph and saving it to a reliable distributed ( ...

Shuffling

admin · 1486 views · 0 likes

RDD shuffling. Tip: Read the official documentation on the topic, Shuffle operations. It is still better than ...

HashPartitioner

admin · 1508 views · 0 likes

HashPartitioner is a Partitioner that uses a configurable number of partitions (partitions) to shuffle ...

Partitioner

admin · 1472 views · 0 likes

Caution: FIXME. Partitioner captures data distribution at the output. A scheduler can optimize ...

Partition

admin · 1462 views · 0 likes

Partition is a contract of a partition index of an RDD. Note: A partition is missing when it has no ...
