LocalRelation Leaf Logical Operator
LocalRelation is a leaf logical operator that allows functions like collect or take to be executed locally, i.e. without using Spark executors.
LocalRelation is created when…FIXME
Note: When Dataset operators can be executed locally, the Dataset is considered local.
LocalRelation represents Datasets that were created from local collections using SparkSession.emptyDataset or SparkSession.createDataset methods and their derivatives like toDF.
scala> val dataset = Seq(1).toDF

scala> dataset.explain(true)
== Parsed Logical Plan ==
LocalRelation [value#216]

== Analyzed Logical Plan ==
value: int
LocalRelation [value#216]

== Optimized Logical Plan ==
LocalRelation [value#216]

== Physical Plan ==
LocalTableScan [value#216]
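Whether a Dataset is local can also be checked programmatically. The following is a minimal sketch, assuming a Spark 2.x or 3.x spark-sql dependency and a spark-shell or Scala script session. The describe helper and the application name are made up for illustration; Dataset.isLocal and queryExecution.logical are existing Spark members.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation

// Reuses the active session in spark-shell, otherwise starts a local one (assumption: local master).
val spark = SparkSession.builder().master("local[*]").appName("local-relation-check").getOrCreate()
import spark.implicits._

// Hypothetical helper: is the Dataset's logical plan a LocalRelation?
def isBackedByLocalRelation(ds: Dataset[_]): Boolean =
  ds.queryExecution.logical.isInstanceOf[LocalRelation]

// Hypothetical helper: report both checks for a Dataset.
def describe(name: String, ds: Dataset[_]): Unit =
  println(s"$name: isLocal=${ds.isLocal}, LocalRelation=${isBackedByLocalRelation(ds)}")

val fromToDF = Seq(1).toDF                        // derivative of createDataset
val created  = spark.createDataset(Seq("a", "b")) // local collection
val empty    = spark.emptyDataset[Int]            // empty local Dataset

describe("toDF", fromToDF)
describe("createDataset", created)
describe("emptyDataset", empty)
```

All three Datasets are expected to report as local, since each is built directly from an in-memory collection on the driver.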
LocalRelation can only be constructed when all output attributes are resolved.
The size of the objects (in statistics) is the sum of the default sizes of the attributes multiplied by the number of records.
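The arithmetic behind that size estimate can be sketched in plain Scala. This is an illustration of the formula as stated above, not Spark's code: Attr and estimatedSizeInBytes are hypothetical names, and the estimate a given Spark version actually reports may be computed differently. The sizes used (8 bytes for a long, 4 bytes for an int) match the default sizes of Spark's LongType and IntegerType.

```scala
// Hypothetical stand-in for a Catalyst attribute:
// a column name plus the default size of its data type in bytes.
final case class Attr(name: String, defaultSize: Int)

// Size estimate as described in the text:
// (sum of the attributes' default sizes) * (number of records).
def estimatedSizeInBytes(output: Seq[Attr], numRows: Int): BigInt =
  output.map(a => BigInt(a.defaultSize)).sum * numRows

val output = Seq(Attr("id", 8), Attr("value", 4))    // a long column and an int column
val size = estimatedSizeInBytes(output, numRows = 3) // (8 + 4) * 3 = 36
println(size)
```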
When executed, LocalRelation is translated to the LocalTableScanExec physical operator.
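The translation can be observed on the physical plan of a local Dataset. A minimal sketch, assuming a Spark 2.x or 3.x spark-sql dependency and a spark-shell or Scala script session (the application name is arbitrary):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.LocalTableScanExec

// Reuses the active session in spark-shell, otherwise starts a local one (assumption: local master).
val spark = SparkSession.builder().master("local[*]").appName("local-table-scan").getOrCreate()
import spark.implicits._

val dataset = Seq(1).toDF

// sparkPlan is the physical plan before preparation rules are applied.
// Collect every LocalTableScanExec node found in it.
val scans = dataset.queryExecution.sparkPlan.collect {
  case scan: LocalTableScanExec => scan
}

println(s"LocalTableScanExec nodes found: ${scans.size}")
```

For a Dataset built from a single local collection, one LocalTableScanExec node is expected, matching the LocalTableScan entry in the physical plan of the explain output.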
Creating LocalRelation Instance
LocalRelation takes the following when created:
- Output schema attributes
- Collection of internal binary rows
- isStreaming flag that indicates whether the data comes from a streaming source (disabled by default)
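The three inputs listed above can be supplied directly when experimenting with Catalyst. The sketch below assumes a Spark 2.3 to 3.x spark-sql dependency, where the case class is understood to have the shape LocalRelation(output, data, isStreaming). These are internal APIs that may change between releases, and the column name and row values are made up for illustration.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.IntegerType

// A resolved output attribute (illustrative column name).
val value = AttributeReference("value", IntegerType, nullable = false)()

// Rows in Catalyst's internal row format (illustrative data).
val rows = Seq(InternalRow(1), InternalRow(2))

// isStreaming is left at its default (false).
val relation = LocalRelation(Seq(value), rows)

println(relation.output.map(_.name))
println(relation.data.length)
println(relation.isStreaming)
```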
LocalRelation initializes the internal registries and counters.