LogicalRelation Leaf Logical Operator — Representing BaseRelations in Logical Plan
LogicalRelation is a leaf logical operator that represents a BaseRelation in a logical query plan.
val q1 = spark.read.option("header", true).csv("../datasets/people.csv")
scala> println(q1.queryExecution.logical.numberedTreeString)
00 Relation[id#72,name#73,age#74] csv

val q2 = sql("select * from `csv`.`../datasets/people.csv`")
scala> println(q2.queryExecution.optimizedPlan.numberedTreeString)
00 Relation[_c0#175,_c1#176,_c2#177] csv
LogicalRelation is created when:
- DataFrameReader loads data from a data source that supports multiple paths (through SparkSession.baseRelationToDataFrame)
- DataFrameReader is requested to load data from an external table using JDBC (through SparkSession.baseRelationToDataFrame)
- TextInputCSVDataSource and TextInputJsonDataSource are requested to infer schema
- ResolveSQLOnFile converts a logical plan
- FindDataSourceTable logical evaluation rule is executed
- RelationConversions logical evaluation rule is executed
- CreateTempViewUsing logical command is requested to run
- Structured Streaming’s FileStreamSource creates batches of records
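The first two cases above go through SparkSession.baseRelationToDataFrame, so the wrapping can be observed by calling that method directly. The following is a minimal sketch, assuming Spark SQL is on the classpath; NumbersRelation and BaseRelationDemo are hypothetical names made up for illustration and are not part of Spark.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext, SparkSession}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.execution.datasources.LogicalRelation
import org.apache.spark.sql.sources.{BaseRelation, TableScan}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

object BaseRelationDemo {

  // Hypothetical relation (illustration only): a single int column "n" with rows 1, 2, 3.
  final class NumbersRelation(override val sqlContext: SQLContext)
      extends BaseRelation with TableScan {

    override def schema: StructType =
      StructType(Seq(StructField("n", IntegerType, nullable = false)))

    override def buildScan(): RDD[Row] =
      sqlContext.sparkContext.parallelize(1 to 3).map(n => Row(n))
  }

  // Builds a DataFrame from the relation and returns its (unanalyzed) logical plan.
  def logicalPlanOf(spark: SparkSession): LogicalPlan =
    spark
      .baseRelationToDataFrame(new NumbersRelation(spark.sqlContext))
      .queryExecution
      .logical

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("logical-relation-demo")
      .getOrCreate()
    try {
      val plan = logicalPlanOf(spark)
      // Expected to report that the plan is a LogicalRelation wrapping the relation.
      println(plan.isInstanceOf[LogicalRelation])
      println(plan.numberedTreeString)
    } finally {
      spark.stop()
    }
  }
}
```

The text after `Relation[...]` in the printed tree comes from the relation's own toString, so a custom relation like this one will not print a short format name such as csv.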
The simple text representation of a LogicalRelation (aka simpleString) is Relation[output] [relation] (that uses the output and BaseRelation).
val q = spark.read.text("README.md")
val logicalPlan = q.queryExecution.logical
scala> println(logicalPlan.simpleString)
Relation[value#2] text
refresh Method
refresh(): Unit
Note
refresh is part of the LogicalPlan Contract to refresh itself.
Note
refresh takes effect for HadoopFsRelation relations only; for any other BaseRelation it does nothing.
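The dispatch described in the note can be sketched as a pattern match on the wrapped relation. This is a toy model in plain Scala, not Spark's source code: every Toy* type is a hypothetical stand-in, and it assumes that "the work" amounts to refreshing a file listing held by the file-based relation.

```scala
// Toy stand-ins for illustration only; none of these are Spark classes.
trait ToyRelation

// Stand-in for a file listing that can be re-scanned; counts refresh calls.
final class ToyFileIndex {
  var refreshCount: Int = 0
  def refresh(): Unit = refreshCount += 1
}

// Stand-in for a file-based relation (the HadoopFsRelation role).
final case class ToyFileRelation(location: ToyFileIndex) extends ToyRelation

// Stand-in for any relation that is not file-based.
case object ToyJdbcRelation extends ToyRelation

final case class ToyLogicalRelation(relation: ToyRelation) {
  // Only the file-based relation is refreshed; every other relation is a no-op.
  def refresh(): Unit = relation match {
    case fs: ToyFileRelation => fs.location.refresh()
    case _                   => ()
  }
}

object RefreshSketch {
  def main(args: Array[String]): Unit = {
    val index = new ToyFileIndex
    ToyLogicalRelation(ToyFileRelation(index)).refresh() // refreshes the index
    ToyLogicalRelation(ToyJdbcRelation).refresh()        // does nothing
    println(index.refreshCount) // prints 1
  }
}
```

Matching on the relation type keeps refresh safe to call on any plan: callers do not need to know which kind of relation a leaf wraps.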
Creating LogicalRelation Instance
LogicalRelation takes the following when created:
- Optional CatalogTable