LogicalRelation Leaf Logical Operator — Representing BaseRelations in Logical Plan
LogicalRelation is a leaf logical operator that represents a BaseRelation in a logical query plan.
val q1 = spark.read.option("header", true).csv("../datasets/people.csv")
scala> println(q1.queryExecution.logical.numberedTreeString)
00 Relation[id#72,name#73,age#74] csv

val q2 = sql("select * from `csv`.`../datasets/people.csv`")
scala> println(q2.queryExecution.optimizedPlan.numberedTreeString)
00 Relation[_c0#175,_c1#176,_c2#177] csv
LogicalRelation is created when:
- DataFrameReader loads data from a data source that supports multiple paths (through SparkSession.baseRelationToDataFrame)
- DataFrameReader is requested to load data from an external table using JDBC (through SparkSession.baseRelationToDataFrame)
- TextInputCSVDataSource and TextInputJsonDataSource are requested to infer schema
- ResolveSQLOnFile converts a logical plan
- FindDataSourceTable logical evaluation rule is executed
- RelationConversions logical evaluation rule is executed
- CreateTempViewUsing logical command is requested to run
- Structured Streaming's FileStreamSource creates batches of records
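The first case in the list above (DataFrameReader loading a file-based source) can be observed from a session. The following is a minimal sketch, not Spark's own code: it assumes a Spark version in which the CSV source still goes through the V1 data source path, and the temporary file, its contents, and the application name are made up for illustration.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.LogicalRelation

// In spark-shell a session already exists and getOrCreate simply returns it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("logical-relation-demo") // hypothetical name
  .getOrCreate()

// Hypothetical sample data written to a temporary directory
val dir = Files.createTempDirectory("people")
val file = dir.resolve("people.csv")
Files.write(file, "id,name,age\n0,Alice,42\n".getBytes("UTF-8"))

val q = spark.read.option("header", true).csv(file.toString)

// Collect every LogicalRelation leaf of the analyzed logical plan
val relations = q.queryExecution.analyzed.collect {
  case lr: LogicalRelation => lr
}

relations.foreach { lr =>
  println(lr.relation.getClass.getSimpleName)   // the BaseRelation implementation
  println(lr.output.map(_.name).mkString(", ")) // column names of the relation
  println(lr.catalogTable.isDefined)            // whether a CatalogTable is attached
}
```

Matching on the LogicalRelation type rather than on its constructor arguments keeps the sketch independent of how many fields the case class has in a given Spark release.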
The simple text representation of a LogicalRelation (aka simpleString) is Relation[output] [relation] (that uses the output and BaseRelation).
val q = spark.read.text("README.md")
val logicalPlan = q.queryExecution.logical
scala> println(logicalPlan.simpleString)
Relation[value#2] text
refresh Method
refresh(): Unit
Note: refresh is part of the LogicalPlan Contract to refresh itself.
Note: refresh does the work for HadoopFsRelation relations only.
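The two notes can be illustrated from a session. This is a sketch under assumptions rather than a copy of Spark's implementation: it assumes that refreshing a HadoopFsRelation means refreshing the FileIndex it exposes as location, and that every other kind of relation is left alone. The temporary file and application name are hypothetical.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}

// In spark-shell a session already exists and getOrCreate simply returns it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("refresh-demo") // hypothetical name
  .getOrCreate()

// Hypothetical text file to read from
val dir = Files.createTempDirectory("refresh-demo")
val file = dir.resolve("lines.txt")
Files.write(file, "hello\n".getBytes("UTF-8"))

val plan = spark.read.text(file.toString).queryExecution.analyzed

// refresh is part of the LogicalPlan contract, so it can be called on any plan
plan.refresh()

// Assumed equivalent, spelled out: only file-based relations have anything to refresh
plan.foreach {
  case lr: LogicalRelation =>
    lr.relation match {
      case fs: HadoopFsRelation => fs.location.refresh() // re-list files in the FileIndex
      case _                    => ()                    // nothing to do for other relations
    }
  case _ => ()
}

// The leaves that refresh actually affects
val fileBased = plan.collect {
  case lr: LogicalRelation if lr.relation.isInstanceOf[HadoopFsRelation] => lr
}
println(fileBased.size)
```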
Creating LogicalRelation Instance
LogicalRelation takes the following when created:
- Optional CatalogTable
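As a rough illustration of the optional CatalogTable, the sketch below wraps a hand-written BaseRelation in a LogicalRelation. It assumes a Spark version (2.3 or later) whose LogicalRelation companion object offers an apply method that accepts just a BaseRelation. DemoRelation, its schema, and the application name are hypothetical and exist only for this example.

```scala
import org.apache.spark.sql.{SQLContext, SparkSession}
import org.apache.spark.sql.execution.datasources.LogicalRelation
import org.apache.spark.sql.sources.BaseRelation
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

// In spark-shell a session already exists and getOrCreate simply returns it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("create-logical-relation") // hypothetical name
  .getOrCreate()

// Hypothetical BaseRelation that only describes a schema
class DemoRelation(session: SparkSession) extends BaseRelation {
  override def sqlContext: SQLContext = session.sqlContext
  override def schema: StructType = StructType(Seq(
    StructField("id", LongType),
    StructField("name", StringType)))
}

// Assumed factory method: no CatalogTable is given, so none is attached
val lr = LogicalRelation(new DemoRelation(spark))

println(lr.output.map(_.name).mkString(", ")) // attributes derived from the schema
println(lr.catalogTable.isEmpty)              // no CatalogTable for an ad hoc relation
```

A relation resolved from a table in the catalog would carry its CatalogTable instead; this sketch only covers the case where it is absent.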