spark-sql - Spark Technology Sharing - Page 19

SortMergeJoinExec

2013-02-02, admin, 1995 views

SortMergeJoinExec Binary Physical Operator for Sort Merge Join

SortMergeJoinExec is a binary physical operator to execute a sort merge join.

SortMergeJoinExec is selected to represent a Join logical operator when JoinSelection execution planning strategy is executed for joins with left join keys that are orderable, i.e. that can be ordered (sorted).

Note

A join key is orderable when it is of one of the following data types:

NullType
AtomicType (that represents all the available types except NullType, StructType, ArrayType, UserDefinedType, MapType, and ObjectType)
StructType with orderable fields
ArrayType of orderable type
UserDefinedType of orderable type

Therefore, a join key is not orderable when it is of one of the following data types:

MapType
ObjectType
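
The orderability rules above are recursive, which a small example may make easier to see. The following is a minimal sketch in plain Scala, assuming a simplified, hypothetical model of data types (the nested `DataType` hierarchy and `isOrderable` below are illustrative stand-ins, not Spark's own classes; UserDefinedType and ObjectType are left out for brevity).

```scala
object OrderableSketch {
  // Hypothetical, simplified model of data types (NOT Spark's own classes)
  sealed trait DataType
  case object NullType extends DataType
  case object IntegerType extends DataType // stands in for any AtomicType
  case object StringType extends DataType  // stands in for any AtomicType
  final case class StructType(fields: Seq[DataType]) extends DataType
  final case class ArrayType(elementType: DataType) extends DataType
  final case class MapType(keyType: DataType, valueType: DataType) extends DataType

  // Follows the rules listed above: maps are never orderable,
  // structs and arrays are orderable only when their parts are
  def isOrderable(dt: DataType): Boolean = dt match {
    case NullType | IntegerType | StringType => true
    case StructType(fields)                  => fields.forall(isOrderable)
    case ArrayType(elementType)              => isOrderable(elementType)
    case MapType(_, _)                       => false
  }
}
```

Under this model, a struct that contains a map anywhere inside it is not orderable, so a join on such a key could not be planned as a sort merge join.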

Note	spark.sql.join.preferSortMergeJoin is an internal configuration property and is enabled by default. That means that JoinSelection execution planning strategy (and so Spark Planner) prefers sort merge join over shuffled hash join.

SortMergeJoinExec supports Java code generation (aka codegen) for inner and cross joins.

Tip	Enable `DEBUG` logging level for `org.apache.spark.sql.catalyst.planning.ExtractEquiJoinKeys` logger to see the join condition and the left and right join keys.



// Disable auto broadcasting so Broadcast Hash Join won't take precedence
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", -1)

val tokens = Seq(
  (0, "playing"),
  (1, "with"),
  (2, "SortMergeJoinExec")
).toDF("id", "token")

// all data types are orderable
scala> tokens.printSchema
root
 |-- id: integer (nullable = false)
 |-- token: string (nullable = true)

// Spark Planner prefers SortMergeJoin over Shuffled Hash Join
scala> println(spark.conf.get("spark.sql.join.preferSortMergeJoin"))
true

val q = tokens.join(tokens, Seq("id"), "inner")
scala> q.explain
== Physical Plan ==
*(3) Project [id#5, token#6, token#10]
+- *(3) SortMergeJoin [id#5], [id#9], Inner
   :- *(1) Sort [id#5 ASC NULLS FIRST], false, 0
   :  +- Exchange hashpartitioning(id#5, 200)
   :     +- LocalTableScan [id#5, token#6]
   +- *(2) Sort [id#9 ASC NULLS FIRST], false, 0
      +- ReusedExchange [id#9, token#10], Exchange hashpartitioning(id#5, 200)

Table 1. SortMergeJoinExec’s Performance Metrics
Key	Name (in web UI)	Description
`numOutputRows`	number of output rows

Figure 1. SortMergeJoinExec in web UI (Details for Query)

Note	The prefix for variable names for `SortMergeJoinExec` operators in CodegenSupport-generated code is smj.



scala> q.queryExecution.debug.codegen
Found 3 WholeStageCodegen subtrees.
== Subtree 1 / 3 ==
*Project [id#5, token#6, token#11]
+- *SortMergeJoin [id#5], [id#10], Inner
   :- *Sort [id#5 ASC NULLS FIRST], false, 0
   :  +- Exchange hashpartitioning(id#5, 200)
   :     +- LocalTableScan [id#5, token#6]
   +- *Sort [id#10 ASC NULLS FIRST], false, 0
      +- ReusedExchange [id#10, token#11], Exchange hashpartitioning(id#5, 200)

Generated code:
/* 001 */ public Object generate(Object[] references) {
/* 002 */   return new GeneratedIterator(references);
/* 003 */ }
/* 004 */
/* 005 */ final class GeneratedIterator extends org.apache.spark.sql.execution.BufferedRowIterator {
/* 006 */   private Object[] references;
/* 007 */   private scala.collection.Iterator[] inputs;
/* 008 */   private scala.collection.Iterator smj_leftInput;
/* 009 */   private scala.collection.Iterator smj_rightInput;
/* 010 */   private InternalRow smj_leftRow;
/* 011 */   private InternalRow smj_rightRow;
/* 012 */   private int smj_value2;
/* 013 */   private org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray smj_matches;
/* 014 */   private int smj_value3;
/* 015 */   private int smj_value4;
/* 016 */   private UTF8String smj_value5;
/* 017 */   private boolean smj_isNull2;
/* 018 */   private org.apache.spark.sql.execution.metric.SQLMetric smj_numOutputRows;
/* 019 */   private UnsafeRow smj_result;
/* 020 */   private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder smj_holder;
/* 021 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter smj_rowWriter;
...

The output schema of a SortMergeJoinExec is…FIXME

The outputPartitioning of a SortMergeJoinExec is…FIXME

The outputOrdering of a SortMergeJoinExec is…FIXME

The partitioning requirements of the input of a SortMergeJoinExec (aka child output distributions) are HashClusteredDistributions of left and right join keys.

Table 2. SortMergeJoinExec’s Required Child Output Distributions
Left Child	Right Child
HashClusteredDistribution (per left join key expressions)	HashClusteredDistribution (per right join key expressions)

The ordering requirements of the input of a SortMergeJoinExec (aka child output ordering) is…FIXME

Note	`SortMergeJoinExec` operator is chosen in JoinSelection execution planning strategy (after BroadcastHashJoinExec and ShuffledHashJoinExec physical join operators have not met the requirements).
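
To illustrate why both children have to be clustered and sorted by the join keys, here is a minimal sketch of an inner sort merge join in plain Scala. It assumes both inputs are already sorted by key in ascending order and ignores null keys; the `SortMergeJoinSketch` object and its `join` method are hypothetical names for illustration, not Spark's implementation. The `matches` buffer plays a role similar to the `smj_matches` field in the generated code shown earlier.

```scala
object SortMergeJoinSketch {
  // Inner join of two inputs that are ALREADY sorted by key (ascending)
  def join[K, A, B](left: Vector[(K, A)], right: Vector[(K, B)])
                   (implicit ord: Ordering[K]): Vector[(K, A, B)] = {
    val out = Vector.newBuilder[(K, A, B)]
    var i = 0
    var j = 0
    while (i < left.length && j < right.length) {
      val cmp = ord.compare(left(i)._1, right(j)._1)
      if (cmp < 0) i += 1      // left key is smaller: advance the left side
      else if (cmp > 0) j += 1 // right key is smaller: advance the right side
      else {
        val key = left(i)._1
        // Buffer all right rows that share the current key
        val start = j
        while (j < right.length && ord.equiv(right(j)._1, key)) j += 1
        val matches = right.slice(start, j)
        // Emit one output row per (left row, buffered right row) pair
        while (i < left.length && ord.equiv(left(i)._1, key)) {
          val a = left(i)._2
          matches.foreach { case (_, b) => out += ((key, a, b)) }
          i += 1
        }
      }
    }
    out.result()
  }
}
```

Because each side is scanned once from start to end, the merge step itself needs no hash table; the cost is paid up front in the Exchange and Sort operators visible in the physical plan.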

Generating Java Source Code for Produce Path in Whole-Stage Code Generation — `doProduce` Method



doProduce(ctx: CodegenContext): String

Note	`doProduce` is part of CodegenSupport Contract to generate the Java source code for produce path in Whole-Stage Code Generation.

doProduce…FIXME

Executing Physical Operator (Generating RDD[InternalRow]) — `doExecute` Method



doExecute(): RDD[InternalRow]

Note	`doExecute` is part of SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. `RDD[InternalRow]`).

doExecute…FIXME

Creating SortMergeJoinExec Instance

SortMergeJoinExec takes the following when created:

Left join key expressions
Right join key expressions
Join type
Optional join condition expression
Left physical operator
Right physical operator

SortAggregateExec

2013-02-01, admin, 2233 views

SortAggregateExec Aggregate Physical Operator for Sort-Based Aggregation

Caution

FIXME

Executing Physical Operator (Generating RDD[InternalRow]) — `doExecute` Method



doExecute(): RDD[InternalRow]

Note	`doExecute` is part of SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. `RDD[InternalRow]`).

doExecute…FIXME

SerializeFromObjectExec

2013-01-31, admin, 2893 views

SerializeFromObjectExec Unary Physical Operator

SerializeFromObjectExec is a unary physical operator (i.e. with one child physical operator) that supports Java code generation.

SerializeFromObjectExec supports Java code generation with the doProduce, doConsume and inputRDDs methods.

SerializeFromObjectExec is an ObjectConsumerExec.

SerializeFromObjectExec is created exclusively when BasicOperators execution planning strategy is requested to plan a SerializeFromObject logical operator.

SerializeFromObjectExec uses the child physical operator when requested for the input RDDs and the outputPartitioning.

SerializeFromObjectExec uses the serializer for the output schema attributes.

Creating SerializeFromObjectExec Instance

SerializeFromObjectExec takes the following when created:

Serializer (as Seq[NamedExpression])
Child physical operator (that supports Java code generation)

Generating Java Source Code for Consume Path in Whole-Stage Code Generation — `doConsume` Method



doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String

Note	`doConsume` is part of CodegenSupport Contract to generate the Java source code for consume path in Whole-Stage Code Generation.

doConsume…FIXME

Generating Java Source Code for Produce Path in Whole-Stage Code Generation — `doProduce` Method



doProduce(ctx: CodegenContext): String

Note	`doProduce` is part of CodegenSupport Contract to generate the Java source code for produce path in Whole-Stage Code Generation.

doProduce…FIXME

Executing Physical Operator (Generating RDD[InternalRow]) — `doExecute` Method



doExecute(): RDD[InternalRow]

Note	`doExecute` is part of SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. `RDD[InternalRow]`).

doExecute requests the child physical operator to execute (that triggers physical query planning and generates an RDD[InternalRow]) and transforms it by executing the following function on internal rows per partition with index (using RDD.mapPartitionsWithIndexInternal that creates another RDD):

Creates an UnsafeProjection for the serializer
Requests the UnsafeProjection to initialize (for the partition index)
Executes the UnsafeProjection on all internal binary rows in the partition
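
The three steps above can be sketched in plain Scala without Spark. This is only an illustration under stated assumptions: partitions are modelled as in-memory sequences, and `Projection` is a hypothetical stand-in for an UnsafeProjection that turns a string into a two-column "row".

```scala
object SerializeSketch {
  // Hypothetical stand-in for an UnsafeProjection (NOT Spark's class)
  final class Projection(serializer: String => Seq[Any]) {
    private var partitionIndex: Int = -1
    def initialize(index: Int): Unit = partitionIndex = index
    def initializedFor: Int = partitionIndex
    def apply(obj: String): Seq[Any] = serializer(obj)
  }

  // Stand-in for RDD.mapPartitionsWithIndexInternal over in-memory partitions
  def execute(partitions: Seq[Seq[String]]): Seq[Seq[Seq[Any]]] =
    partitions.zipWithIndex.map { case (rows, index) =>
      val projection = new Projection(s => Seq(s, s.length)) // 1. create the projection
      projection.initialize(index)                           // 2. initialize it for this partition
      rows.map(row => projection(row))                       // 3. apply it to every row
    }
}
```

The projection is created once per partition rather than once per row, which is the point of doing the work inside a per-partition function.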

Note	`doExecute` (by `RDD.mapPartitionsWithIndexInternal`) adds a new `MapPartitionsRDD` to the RDD lineage. Use `RDD.toDebugString` to see the additional `MapPartitionsRDD`.

ShuffledHashJoinExec

2013-01-30, admin, 2910 views

ShuffledHashJoinExec Binary Physical Operator for Shuffled Hash Join

ShuffledHashJoinExec is a binary physical operator to execute a shuffled hash join.

ShuffledHashJoinExec performs a hash join of two child relations by first shuffling the data using the join keys.

ShuffledHashJoinExec is selected to represent a Join logical operator when JoinSelection execution planning strategy is executed and spark.sql.join.preferSortMergeJoin configuration property is off.

Note

spark.sql.join.preferSortMergeJoin is an internal configuration property and is enabled by default.

That means that JoinSelection execution planning strategy (and so Spark Planner) prefers sort merge join over shuffled hash join.

In other words, you will hardly see shuffled hash joins in your structured queries unless you turn spark.sql.join.preferSortMergeJoin off.

Besides the spark.sql.join.preferSortMergeJoin configuration property, one of the following requirements has to hold:

(For a right build side, i.e. BuildRight) canBuildRight and canBuildLocalHashMap hold for the right join side, and the right join side is at least three times smaller than the left side
(For a right build side, i.e. BuildRight) Left join keys are not orderable, i.e. cannot be sorted
(For a left build side, i.e. BuildLeft) canBuildLeft and canBuildLocalHashMap hold for the left join side, and the left join side is at least three times smaller than the right side
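
The two size-based checks in the requirements above can be written down as simple predicates. The following is a hedged sketch in plain Scala that works on raw byte sizes instead of logical plans; the object name and the `muchSmaller` helper are illustrative assumptions, and the `canBuildLocalHashMap` formula follows the comment in the example that comes next.

```scala
object ShuffledHashJoinSelectionSketch {
  // A join side qualifies for a per-partition hash map when its estimated size
  // is below autoBroadcastJoinThreshold * numShufflePartitions
  def canBuildLocalHashMap(
      sizeInBytes: BigInt,
      autoBroadcastJoinThreshold: Long,
      numShufflePartitions: Int): Boolean =
    sizeInBytes < BigInt(autoBroadcastJoinThreshold) * BigInt(numShufflePartitions)

  // "At least three times smaller": a is much smaller than b
  def muchSmaller(a: BigInt, b: BigInt): Boolean = a * 3 <= b
}
```

With a threshold of `-1` the product is negative, so no side can ever qualify; that is why a shuffled hash join needs a threshold of at least 1.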

Tip	Enable `DEBUG` logging level for `org.apache.spark.sql.catalyst.planning.ExtractEquiJoinKeys` logger to see the join condition and the left and right join keys.



// Use ShuffledHashJoinExec's selection requirements
// 1. Effectively disable auto broadcasting (a threshold of 1 byte)
// JoinSelection (canBuildLocalHashMap specifically) requires that
// plan.stats.sizeInBytes < autoBroadcastJoinThreshold * numShufflePartitions
// so autoBroadcastJoinThreshold has to be at least 1 (and cannot be -1)
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 1)

scala> println(spark.sessionState.conf.numShufflePartitions)
200

// 2. Disable preference on SortMergeJoin
spark.conf.set("spark.sql.join.preferSortMergeJoin", false)

val dataset = Seq(
  (0, "playing"),
  (1, "with"),
  (2, "ShuffledHashJoinExec")
).toDF("id", "token")
// Self LEFT SEMI join
val q = dataset.join(dataset, Seq("id"), "leftsemi")

val sizeInBytes = q.queryExecution.optimizedPlan.stats.sizeInBytes
scala> println(sizeInBytes)
72

// 3. canBuildRight is on for leftsemi (see BuildRight in the plan below)

// the right join side is at least three times smaller than the left side
// Even though it's a self LEFT SEMI join there are two different join sides
// How is that possible?

// BINGO! ShuffledHashJoin is here!

// Enable DEBUG logging level
import org.apache.log4j.{Level, Logger}
val logger = "org.apache.spark.sql.catalyst.planning.ExtractEquiJoinKeys"
Logger.getLogger(logger).setLevel(Level.DEBUG)

// ShuffledHashJoin with BuildRight
scala> q.explain
== Physical Plan ==
ShuffledHashJoin [id#37], [id#41], LeftSemi, BuildRight
:- Exchange hashpartitioning(id#37, 200)
:  +- LocalTableScan [id#37, token#38]
+- Exchange hashpartitioning(id#41, 200)
   +- LocalTableScan [id#41]

scala> println(q.queryExecution.executedPlan.numberedTreeString)
00 ShuffledHashJoin [id#37], [id#41], LeftSemi, BuildRight
01 :- Exchange hashpartitioning(id#37, 200)
02 :  +- LocalTableScan [id#37, token#38]
03 +- Exchange hashpartitioning(id#41, 200)
04    +- LocalTableScan [id#41]

Table 1. ShuffledHashJoinExec’s Performance Metrics
Key	Name (in web UI)	Description
`avgHashProbe`	avg hash probe
`buildDataSize`	data size of build side
`buildTime`	time to build hash map
`numOutputRows`	number of output rows

Figure 1. ShuffledHashJoinExec in web UI (Details for Query)

Table 2. ShuffledHashJoinExec’s Required Child Output Distributions
Left Child	Right Child
HashClusteredDistribution (per left join key expressions)	HashClusteredDistribution (per right join key expressions)

Executing Physical Operator (Generating RDD[InternalRow]) — `doExecute` Method



doExecute(): RDD[InternalRow]

Note	`doExecute` is part of SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. `RDD[InternalRow]`).

doExecute requests streamedPlan physical operator to execute (and generate an RDD[InternalRow]).

doExecute requests buildPlan physical operator to execute (and generate an RDD[InternalRow]).

doExecute requests streamedPlan physical operator’s RDD[InternalRow] to zip partition-wise with buildPlan physical operator’s RDD[InternalRow] (using RDD.zipPartitions method with preservesPartitioning flag disabled).

Note

doExecute generates a ZippedPartitionsRDD2 that you can see in an RDD lineage.



scala> println(q.queryExecution.toRdd.toDebugString)
(200) ZippedPartitionsRDD2[8] at toRdd at <console>:26 []
  |   ShuffledRowRDD[3] at toRdd at <console>:26 []
  +-(3) MapPartitionsRDD[2] at toRdd at <console>:26 []
     |  MapPartitionsRDD[1] at toRdd at <console>:26 []
     |  ParallelCollectionRDD[0] at toRdd at <console>:26 []
  |   ShuffledRowRDD[7] at toRdd at <console>:26 []
  +-(3) MapPartitionsRDD[6] at toRdd at <console>:26 []
     |  MapPartitionsRDD[5] at toRdd at <console>:26 []
     |  ParallelCollectionRDD[4] at toRdd at <console>:26 []

doExecute uses RDD.zipPartitions with a function applied to zipped partitions that takes two iterators of rows from the partitions of streamedPlan and buildPlan.

For every pair of zipped partitions, the function builds a HashedRelation from the buildPlan partition (using buildHashedRelation) and then joins the streamedPlan partition iterator with that HashedRelation (passing in the numOutputRows and avgHashProbe SQL metrics).
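
The per-partition flow described above can be sketched in plain Scala. This is a simplified illustration rather than Spark's code: rows are `(Int, String)` pairs, an ordinary `Map` stands in for the HashedRelation, only a LEFT SEMI join is shown, and the SQL metrics are left out. All names in the object are assumptions made for the example.

```scala
object ShuffledHashJoinSketch {
  type Row = (Int, String)

  // Stand-in for a HashedRelation: build-side rows grouped by join key
  def buildHashedRelation(buildIter: Iterator[Row]): Map[Int, Seq[Row]] =
    buildIter.toVector.groupBy(_._1)

  // LEFT SEMI join of one pair of zipped partitions: keep a streamed row
  // when its key has at least one match on the build side
  def leftSemiJoin(streamedIter: Iterator[Row], buildIter: Iterator[Row]): Iterator[Row] = {
    val hashed = buildHashedRelation(buildIter)
    streamedIter.filter { case (key, _) => hashed.contains(key) }
  }

  // Stand-in for RDD.zipPartitions over in-memory partitions
  def zipPartitions(streamed: Seq[Seq[Row]], build: Seq[Seq[Row]]): Seq[Seq[Row]] =
    streamed.zip(build).map { case (s, b) =>
      leftSemiJoin(s.iterator, b.iterator).toVector
    }
}
```

The sketch only gives correct results when rows with the same key sit in partitions with the same index on both sides, which is what the HashClusteredDistribution requirement on both children guarantees.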

Building HashedRelation for Internal Rows — `buildHashedRelation` Internal Method



buildHashedRelation(iter: Iterator[InternalRow]): HashedRelation

buildHashedRelation creates a HashedRelation (for the input iter iterator of InternalRows, buildKeys and the current TaskMemoryManager).

Note	`buildHashedRelation` uses `TaskContext.get()` to access the current `TaskContext` that in turn is used to access the `TaskMemoryManager`.

buildHashedRelation records the time to create the HashedRelation as buildTime.

buildHashedRelation requests the HashedRelation for estimatedSize that is recorded as buildDataSize.

Note	`buildHashedRelation` is used exclusively when `ShuffledHashJoinExec` is requested to execute (when streamedPlan and buildPlan physical operators are executed and their RDDs zipped partition-wise using `RDD.zipPartitions` method).

Creating ShuffledHashJoinExec Instance

ShuffledHashJoinExec takes the following when created:

Left join key expressions
Right join key expressions
Join type
BuildSide
Optional join condition expression
Left physical operator
Right physical operator

ShuffleExchangeExec

2013-01-29, admin, 5313 views

ShuffleExchangeExec Unary Physical Operator

ShuffleExchangeExec is an Exchange unary physical operator to perform a shuffle.

ShuffleExchangeExec corresponds to Repartition (with shuffle enabled) and RepartitionByExpression logical operators (as resolved in BasicOperators execution planning strategy).

Note	`ShuffleExchangeExec` shows as Exchange in physical plans.



// Uses Repartition logical operator
// ShuffleExchangeExec with RoundRobinPartitioning
val q1 = spark.range(6).repartition(2)
scala> q1.explain
== Physical Plan ==
Exchange RoundRobinPartitioning(2)
+- *Range (0, 6, step=1, splits=Some(8))

// Uses RepartitionByExpression logical operator
// ShuffleExchangeExec with HashPartitioning
val q2 = spark.range(6).repartition(2, 'id % 2)
scala> q2.explain
== Physical Plan ==
Exchange hashpartitioning((id#38L % 2), 2)
+- *Range (0, 6, step=1, splits=Some(8))

When created, ShuffleExchangeExec takes a Partitioning, a single child physical operator and an optional ExchangeCoordinator.
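
The difference between the two Partitionings seen in the plans above can be sketched in plain Scala. This is a simplification under stated assumptions: `hashCode` stands in for the hash function Spark applies to the partitioning expressions, round-robin always starts at partition 0, and all names are hypothetical.

```scala
object PartitioningSketch {
  // Non-negative modulo, so negative hash codes still map to a valid partition
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Hash partitioning: rows with equal keys always land in the same partition
  def hashPartition[K, V](rows: Seq[(K, V)], numPartitions: Int): Map[Int, Seq[(K, V)]] =
    rows.groupBy { case (key, _) => nonNegativeMod(key.hashCode, numPartitions) }

  // Round-robin partitioning: rows are dealt out evenly regardless of content
  def roundRobin[T](rows: Seq[T], numPartitions: Int): Map[Int, Seq[T]] =
    rows.zipWithIndex
      .groupBy { case (_, i) => i % numPartitions }
      .map { case (p, indexed) => p -> indexed.map(_._1) }
}
```

Hash partitioning is what joins and aggregations rely on, since it co-locates equal keys; round-robin only balances partition sizes.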

Table 1. ShuffleExchangeExec’s Performance Metrics
Key	Name (in web UI)	Description
`dataSize`	data size

Figure 1. ShuffleExchangeExec in web UI (Details for Query)

nodeName is computed based on the optional ExchangeCoordinator with Exchange prefix and possibly (coordinator id: [coordinator-hash-code]).

outputPartitioning is the input Partitioning.

While preparing execution (using doPrepare), ShuffleExchangeExec registers itself with the ExchangeCoordinator if available.

When executed (using doExecute), ShuffleExchangeExec computes a ShuffledRowRDD and caches it (so that repeated executions reuse it and avoid a possibly expensive recomputation).

Table 2. ShuffleExchangeExec’s Internal Registries and Counters
Name	Description
`cachedShuffleRDD`	ShuffledRowRDD that is cached after `ShuffleExchangeExec` has been executed.

Executing Physical Operator (Generating RDD[InternalRow]) — `doExecute` Method



doExecute(): RDD[InternalRow]

Note	`doExecute` is part of SparkPlan Contract to generate the runtime representation of a structured query as a distributed computation over internal binary rows on Apache Spark (i.e. `RDD[InternalRow]`).

doExecute creates a new ShuffledRowRDD or takes the cached one.

doExecute branches off per optional ExchangeCoordinator.

If ExchangeCoordinator was specified, doExecute requests ExchangeCoordinator for a ShuffledRowRDD.

Otherwise (with no ExchangeCoordinator specified), doExecute calls prepareShuffleDependency followed by preparePostShuffleRDD.

In the end, doExecute saves the resulting ShuffledRowRDD for later use.
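
The compute-once-then-reuse behaviour described above is a plain memoization pattern. Here is a minimal sketch in Scala, assuming a hypothetical `ExchangeLike` class in which a `Seq[Int]` stands in for the ShuffledRowRDD and the `compute` function stands in for the branch that actually prepares the shuffle.

```scala
object CachedShuffleSketch {
  final class ExchangeLike(compute: () => Seq[Int]) {
    // Plays the role of cachedShuffleRDD: empty until the first execution
    private var cached: Option[Seq[Int]] = None

    def doExecute(): Seq[Int] = {
      if (cached.isEmpty) {
        // First execution: do the (possibly expensive) work and remember it
        cached = Some(compute())
      }
      cached.get
    }
  }
}
```

A second call to `doExecute` returns the remembered result without invoking `compute` again, which is what lets a reused exchange in a plan avoid shuffling the same data twice.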

`preparePostShuffleRDD` Method

Caution

FIXME

`prepareShuffleDependency` Internal Method



prepareShuffleDependency(): ShuffleDependency[Int, InternalRow, InternalRow]

Caution

FIXME

`prepareShuffleDependency` Helper Method



prepareShuffleDependency(
  rdd: RDD[InternalRow],
  outputAttributes: Seq[Attribute],
  newPartitioning: Partitioning,
  serializer: Serializer): ShuffleDependency[Int, InternalRow, InternalRow]

prepareShuffleDependency creates a ShuffleDependency.

Note	`prepareShuffleDependency` is used when `ShuffleExchangeExec` prepares a `ShuffleDependency` (as part of…FIXME) and when `CollectLimitExec` and `TakeOrderedAndProjectExec` physical operators are executed.

SampleExec

2013-01-28, admin, 2644 views

SampleExec

SampleExec is…FIXME

RowDataSourceScanExec

2013-01-27, admin, 2067 views

RowDataSourceScanExec Leaf Physical Operator

RowDataSourceScanExec is a DataSourceScanExec (and so indirectly a leaf physical operator) for scanning data from a BaseRelation.

RowDataSourceScanExec is created to represent a LogicalRelation with the following scan types when DataSourceStrategy execution planning strategy is executed:

CatalystScan, PrunedFilteredScan, PrunedScan (indirectly when DataSourceStrategy is requested to pruneFilterProjectRaw)
TableScan

RowDataSourceScanExec marks the filters that are included in the handledFilters with * (star) in the metadata that is used for a simple text representation.



// DEMO RowDataSourceScanExec with a simple text representation with stars

Generating Java Source Code for Produce Path in Whole-Stage Code Generation — `doProduce` Method



doProduce(ctx: CodegenContext): String

Note	`doProduce` is part of CodegenSupport Contract to generate the Java source code for produce path in Whole-Stage Code Generation.

doProduce…FIXME

Creating RowDataSourceScanExec Instance

RowDataSourceScanExec takes the following when created:

Output schema attributes
Indices of required columns
Filter predicates
Handled filter predicates
RDD of internal binary rows
BaseRelation
TableIdentifier

Note	The input filter predicates and handled filters predicates are used exclusively for the metadata property that is part of DataSourceScanExec Contract to describe a scan for a simple text representation (in a query plan tree).

`metadata` Property



metadata: Map[String, String]

Note	`metadata` is part of DataSourceScanExec Contract to describe a scan for a simple text representation (in a query plan tree).

metadata marks the filter predicates that are included in the handled filters predicates with * (star).

Note	Filter predicates marked with `*` (star) denote filters that are pushed down to a relation (aka data source).

In the end, metadata creates the following mapping:

ReadSchema with the output converted to catalog representation
PushedFilters with the marked and unmarked filter predicates
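
How the star marking ends up in the mapping can be sketched in plain Scala. This is an illustration only: filters are modelled as strings, and the schema string and filter texts used with it are made-up examples, not output copied from Spark.

```scala
object MetadataSketch {
  def metadata(
      readSchema: String,
      filters: Seq[String],
      handledFilters: Set[String]): Map[String, String] = {
    // Prefix every filter that the relation handles itself with a star
    val marked = filters.map(f => if (handledFilters.contains(f)) s"*$f" else f)
    Map(
      "ReadSchema" -> readSchema,
      "PushedFilters" -> marked.mkString("[", ", ", "]"))
  }
}
```

For instance, with filters `IsNotNull(id)` and `GreaterThan(id,1)` where only the latter is handled, the `PushedFilters` entry becomes `[IsNotNull(id), *GreaterThan(id,1)]`.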

ReusedExchangeExec

2013-01-26, admin, 1791 views

ReusedExchangeExec Leaf Physical Operator

ReusedExchangeExec is a leaf physical operator that…FIXME

RDDScanExec

2013-01-25, admin, 1544 views

RDDScanExec Leaf Physical Operator

RDDScanExec is a leaf physical operator that…FIXME

RangeExec

2013-01-24, admin, 1513 views

RangeExec Leaf Physical Operator

RangeExec is a leaf physical operator that…FIXME

Generating Java Source Code for Produce Path in Whole-Stage Code Generation — `doProduce` Method



doProduce(ctx: CodegenContext): String

Note	`doProduce` is part of CodegenSupport Contract to generate the Java source code for produce path in Whole-Stage Code Generation.

doProduce…FIXME
