ShuffleExchangeExec

ShuffleExchangeExec Unary Physical Operator

ShuffleExchangeExec is an Exchange unary physical operator that performs a shuffle.

ShuffleExchangeExec corresponds to Repartition (with shuffle enabled) and RepartitionByExpression logical operators (as resolved in BasicOperators execution planning strategy).

Note
ShuffleExchangeExec shows as Exchange in physical plans.
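The mapping from the two logical operators to ShuffleExchangeExec can be observed from user code. The following is a minimal sketch, assuming Spark 2.3 or later (where the operator carries this class name) and the spark-sql module on the classpath. The object name ShuffleExchangeDemo and the helper shuffleExchanges are made up for illustration. The printed plan text differs between Spark versions, so no output is shown.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.execution.exchange.ShuffleExchangeExec
import org.apache.spark.sql.functions.col

object ShuffleExchangeDemo {
  // Hypothetical helper: collects ShuffleExchangeExec operators from the
  // physical plan produced by the planning strategies (sparkPlan), i.e.
  // before the physical preparation rules are applied.
  def shuffleExchanges(ds: Dataset[_]): Seq[ShuffleExchangeExec] =
    ds.queryExecution.sparkPlan.collect { case e: ShuffleExchangeExec => e }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("ShuffleExchangeDemo")
      .getOrCreate()
    try {
      val ids = spark.range(100)
      // repartition(n) gives a Repartition logical operator with shuffle enabled
      val byNumber = ids.repartition(4)
      // repartition(column) gives a RepartitionByExpression logical operator
      val byExpression = ids.repartition(col("id"))

      Seq(byNumber, byExpression).foreach { ds =>
        shuffleExchanges(ds).foreach { e =>
          println(s"${e.nodeName} -> ${e.outputPartitioning}")
        }
      }

      // The operator is displayed as Exchange in the printed physical plan
      byNumber.explain()
    } finally {
      spark.stop()
    }
  }
}
```

Inspecting sparkPlan rather than executedPlan is a deliberate choice in this sketch: it shows the operator exactly as the planning strategy created it, before later rules wrap or rewrite the plan.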

When created, ShuffleExchangeExec takes a Partitioning, a single child physical operator and an optional ExchangeCoordinator.

Table 1. ShuffleExchangeExec’s Performance Metrics
Key        Name (in web UI)    Description
dataSize   data size

Figure 1. ShuffleExchangeExec in web UI (Details for Query)

nodeName is Exchange, followed by (coordinator id: [coordinator-hash-code]) when an ExchangeCoordinator is defined.
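The naming rule can be modelled in a few lines of plain Scala. This is a sketch only, not Spark's code: Coordinator is a hypothetical stand-in for ExchangeCoordinator, and both the identity hash code and the exact spacing between the prefix and the parenthesised part are assumptions.

```scala
object NodeNameSketch {
  // Hypothetical stand-in for Spark's ExchangeCoordinator
  final class Coordinator

  // "Exchange" prefix, plus the coordinator id when a coordinator is defined.
  // Using the identity hash code as the id is an assumption of this sketch.
  def nodeName(coordinator: Option[Coordinator]): String = {
    val extraInfo = coordinator match {
      case Some(c) => s"(coordinator id: ${System.identityHashCode(c)})"
      case None    => ""
    }
    s"Exchange$extraInfo"
  }
}
```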

outputPartitioning is the input Partitioning.

While preparing execution (using doPrepare), ShuffleExchangeExec registers itself with the ExchangeCoordinator if available.

When executed (using doExecute), ShuffleExchangeExec computes a ShuffledRowRDD and caches it, so that the RDD can be reused and possibly expensive re-execution is avoided.

Table 2. ShuffleExchangeExec’s Internal Registries and Counters
Name               Description
cachedShuffleRDD   ShuffledRowRDD that is cached after ShuffleExchangeExec has been executed.

Executing Physical Operator (Generating RDD[InternalRow]) — doExecute Method

Note
doExecute is part of the SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. RDD[InternalRow]).

doExecute creates a new ShuffledRowRDD or reuses the cached one.

doExecute branches depending on whether an ExchangeCoordinator is defined.

If an ExchangeCoordinator was specified, doExecute requests the ExchangeCoordinator for a ShuffledRowRDD.

Otherwise (with no ExchangeCoordinator specified), doExecute calls prepareShuffleDependency followed by preparePostShuffleRDD.
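The control flow described above (registering with the coordinator in doPrepare, then computing once, caching, and branching on the optional coordinator in doExecute) can be sketched in plain Scala. None of the types below are Spark classes: FakeShuffledRDD, FakeDependency, Coordinator, RecordingCoordinator, and Exchange are hypothetical stand-ins. The coordinator method names and the passing of the dependency into preparePostShuffleRDD are assumptions of this sketch.

```scala
object DoExecuteSketch {
  // Hypothetical stand-ins for ShuffledRowRDD and ShuffleDependency
  final case class FakeShuffledRDD(origin: String)
  final case class FakeDependency(description: String)

  // Hypothetical stand-in for ExchangeCoordinator
  trait Coordinator {
    def register(exchange: Exchange): Unit
    def postShuffleRDD(exchange: Exchange): FakeShuffledRDD
  }

  // A coordinator that only counts registrations, for demonstration
  final class RecordingCoordinator extends Coordinator {
    var registered = 0
    def register(exchange: Exchange): Unit = registered += 1
    def postShuffleRDD(exchange: Exchange): FakeShuffledRDD =
      FakeShuffledRDD("coordinator")
  }

  final class Exchange(coordinator: Option[Coordinator]) {
    // Plays the role of the cachedShuffleRDD internal registry
    private var cachedShuffleRDD: Option[FakeShuffledRDD] = None
    // Counts how many times the RDD was actually computed
    var computations = 0

    // Registers this exchange with the coordinator, if one is available
    def doPrepare(): Unit = coordinator.foreach(_.register(this))

    private def prepareShuffleDependency(): FakeDependency =
      FakeDependency("shuffle")

    private def preparePostShuffleRDD(dep: FakeDependency): FakeShuffledRDD =
      FakeShuffledRDD("local")

    // Returns the cached RDD when present; otherwise computes and caches it
    def doExecute(): FakeShuffledRDD = cachedShuffleRDD.getOrElse {
      computations += 1
      val rdd = coordinator match {
        case Some(c) => c.postShuffleRDD(this)
        case None    => preparePostShuffleRDD(prepareShuffleDependency())
      }
      cachedShuffleRDD = Some(rdd)
      rdd
    }
  }
}
```

Calling doExecute twice on the same Exchange returns the same instance and computes it only once, which is the point of the cache when one exchange is shared by several parent plans.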

preparePostShuffleRDD Method

Caution
FIXME

prepareShuffleDependency Internal Method

Caution
FIXME

prepareShuffleDependency Helper Method

prepareShuffleDependency creates a ShuffleDependency.

Note
prepareShuffleDependency is used when ShuffleExchangeExec prepares a ShuffleDependency (as part of…FIXME), and when CollectLimitExec and TakeOrderedAndProjectExec physical operators are executed.