CreateViewCommand Logical Command
CreateViewCommand is a logical command for creating or replacing a view or a table.
CreateViewCommand is created to represent the following:
- CREATE VIEW AS SQL statements
- Dataset operators: Dataset.createTempView, Dataset.createOrReplaceTempView, Dataset.createGlobalTempView and Dataset.createOrReplaceGlobalTempView
Caution: FIXME What’s the difference from CreateTempViewUsing?
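CreateViewCommand works with different view types.
View Type | Description / Side Effect |
---|---|
LocalTempView | A session-scoped local temporary view that is available until the session that created it is stopped. When executed, CreateViewCommand requests the SessionCatalog to create or replace a local temporary view. |
GlobalTempView | A cross-session global temporary view that is available until the Spark application stops. When executed, CreateViewCommand requests the SessionCatalog to create or replace a global temporary view. |
PersistedView | A cross-session persisted view that is available until dropped. When executed, CreateViewCommand requests the SessionCatalog to create the view, or to drop and re-create it when replacing an existing one. |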
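The scopes of the two temporary view types can be probed with a second session of the same Spark application. The following is a minimal sketch, assuming Spark SQL is on the classpath and the default `global_temp` database name for global temporary views; `ViewScopeSketch`, `probe`, `local_v` and `global_v` are hypothetical names used only for illustration.

```scala
import org.apache.spark.sql.SparkSession

object ViewScopeSketch {
  /** Returns (local view visible to its creating session,
    *          local view visible to another session,
    *          rows read from the global view by the other session). */
  def probe(spark: SparkSession): (Boolean, Boolean, Long) = {
    val ds = spark.range(3)
    ds.createOrReplaceTempView("local_v")        // local temporary view
    ds.createOrReplaceGlobalTempView("global_v") // global temporary view

    // A second session that shares the same Spark application
    val other = spark.newSession()

    val seenByCreator = spark.catalog.tableExists("local_v")
    val seenByOther = other.catalog.tableExists("local_v")
    // Assumption: global temporary views live in the default global_temp database
    val globalRows = other.table("global_temp.global_v").count()
    (seenByCreator, seenByOther, globalRows)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("view-scope-sketch")
      .getOrCreate()
    try println(probe(spark)) finally spark.stop()
  }
}
```

On a fresh local session, the probe is expected to report the local temporary view as visible only to the session that created it, while the other session can read all 3 rows of the global temporary view.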
```scala
/* CREATE [OR REPLACE] [[GLOBAL] TEMPORARY] VIEW [IF NOT EXISTS] tableIdentifier
     [identifierCommentList] [COMMENT STRING] [PARTITIONED ON identifierList]
     [TBLPROPERTIES tablePropertyList] AS query */

// Demo table for "AS query" part
spark.range(10).write.mode("overwrite").saveAsTable("t1")

// The "AS" query
val asQuery = "SELECT * FROM t1"

// The following queries should all work fine
val q1 = "CREATE VIEW v1 AS " + asQuery
sql(q1)

val q2 = "CREATE OR REPLACE VIEW v1 AS " + asQuery
sql(q2)

// Note the AS keyword, which the syntax above requires
val q3 = "CREATE OR REPLACE TEMPORARY VIEW v1 AS " + asQuery
sql(q3)

val q4 = "CREATE OR REPLACE GLOBAL TEMPORARY VIEW v1 AS " + asQuery
sql(q4)

val q5 = "CREATE VIEW IF NOT EXISTS v1 AS " + asQuery
sql(q5)

// The following queries should all fail

// The number of user-specified columns does not match the schema of the AS query
val qf1 = "CREATE VIEW v1 (c1 COMMENT 'comment', c2) AS " + asQuery
scala> sql(qf1)
org.apache.spark.sql.AnalysisException: The number of columns produced by the SELECT clause (num: `1`) does not match the number of column names specified by CREATE VIEW (num: `2`).;
  at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:134)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3254)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3253)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:641)
  ... 49 elided

// CREATE VIEW ... PARTITIONED ON is not allowed
val qf2 = "CREATE VIEW v1 PARTITIONED ON (c1, c2) AS " + asQuery
scala> sql(qf2)
org.apache.spark.sql.catalyst.parser.ParseException:
Operation not allowed: CREATE VIEW ... PARTITIONED ON(line 1, pos 0)

// Use the same name of t1 for a new view
val qf3 = "CREATE VIEW t1 AS " + asQuery
scala> sql(qf3)
org.apache.spark.sql.AnalysisException: `t1` is not a view;
  at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:156)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3254)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3253)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:641)
  ... 49 elided

// View already exists
val qf4 = "CREATE VIEW v1 AS " + asQuery
scala> sql(qf4)
org.apache.spark.sql.AnalysisException: View `v1` already exists. If you want to update the view definition, please use ALTER VIEW AS or CREATE OR REPLACE VIEW AS;
  at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:169)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3254)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3253)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:641)
  ... 49 elided
```
CreateViewCommand returns the child logical query plan when requested for the inner nodes (that should be shown as an inner nested tree of this node).

```scala
val sqlText = "CREATE VIEW v1 AS " + asQuery
val plan = spark.sessionState.sqlParser.parsePlan(sqlText)
scala> println(plan.numberedTreeString)
00 CreateViewCommand `v1`, SELECT * FROM t1, false, false, PersistedView
01    +- 'Project [*]
02       +- 'UnresolvedRelation `t1`
```
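To illustrate why the query still shows up under the command in the tree output, here is a simplified tree printer in plain Scala. It is only a sketch of the idea and not Spark's TreeNode implementation: `Node`, `treeLines` and the indentation scheme are hypothetical, and line numbering is left out.

```scala
object InnerChildrenSketch {
  // Hypothetical stand-in for a plan node: regular children plus inner children
  final case class Node(
      label: String,
      children: Seq[Node] = Nil,
      innerChildren: Seq[Node] = Nil) {

    // Renders this node, then its inner children and regular children, one level deeper
    def treeLines(depth: Int = 0): Seq[String] = {
      val prefix = if (depth == 0) "" else "   " * (depth - 1) + "+- "
      (prefix + label) +: (innerChildren ++ children).flatMap(_.treeLines(depth + 1))
    }
  }

  val relation = Node("'UnresolvedRelation `t1`")
  val project = Node("'Project [*]", children = Seq(relation))
  // The command has no regular children; the query is exposed as an inner child only
  val command = Node("CreateViewCommand `v1`", innerChildren = Seq(project))

  def main(args: Array[String]): Unit =
    command.treeLines().foreach(println)
}
```

In this model the command reports no regular children, yet the printer still descends into the query because it is listed among the inner children.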
Creating CatalogTable — prepareTable
Internal Method
```scala
prepareTable(session: SparkSession, analyzedPlan: LogicalPlan): CatalogTable
```

prepareTable …FIXME

Note: prepareTable is used exclusively when the CreateViewCommand logical command is executed.
Executing Logical Command — run
Method
```scala
run(sparkSession: SparkSession): Seq[Row]
```

Note: run is part of the RunnableCommand Contract to execute (run) a logical command.
run requests the input SparkSession for the SessionState that is in turn requested to execute the child logical plan (which simply creates a QueryExecution).
run requests the input SparkSession for the SessionState that is in turn requested for the SessionCatalog.
run then branches off per the ViewType:
- For local temporary views, run aliases the analyzed plan and requests the SessionCatalog to create or replace a local temporary view
- For global temporary views, run also aliases the analyzed plan and requests the SessionCatalog to create or replace a global temporary view
- For persisted views, run asks the SessionCatalog whether the table exists or not (given TableIdentifier).
  - If the table exists and the allowExisting flag is on, run simply does nothing (and exits)
  - If the table exists and the replace flag is on, run requests the SessionCatalog for the table metadata and replaces the table, i.e. run requests the SessionCatalog to drop the table followed by re-creating it (with a new CatalogTable)
  - If however the table does not exist, run simply requests the SessionCatalog to create it (with a new CatalogTable)
run throws an AnalysisException for persisted views when they already exist, the allowExisting flag is off and the table type is not a view.

```
[name] is not a view
```
run throws an AnalysisException for persisted views when they already exist and the allowExisting and replace flags are off.

```
View [name] already exists. If you want to update the view definition, please use ALTER VIEW AS or CREATE OR REPLACE VIEW AS
```
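The outcomes described above for persisted views can be summarized as one decision function. This is a sketch in plain Scala rather than the actual Spark code: `PersistedViewDecision`, `Outcome` and `decide` are hypothetical names, the catalog lookup is reduced to an `Option[Boolean]`, and the order of the checks (allowExisting first, then the table type, then replace) is an assumption read off the conditions stated in the text.

```scala
object PersistedViewDecision {
  // Hypothetical model of what run ends up doing for a persisted view
  sealed trait Outcome
  case object CreateView extends Outcome
  case object DoNothing extends Outcome
  case object ReplaceView extends Outcome
  final case class Fail(message: String) extends Outcome

  /** @param existing None if the catalog has nothing under that name,
    *                 otherwise Some(isView) for the existing entry
    */
  def decide(
      name: String,
      existing: Option[Boolean],
      allowExisting: Boolean,
      replace: Boolean): Outcome =
    existing match {
      case None => CreateView
      case Some(isView) =>
        if (allowExisting) DoNothing // CREATE VIEW IF NOT EXISTS
        else if (!isView) Fail(s"$name is not a view")
        else if (replace) ReplaceView // drop, then re-create
        else Fail(
          s"View $name already exists. If you want to update the view definition, " +
            "please use ALTER VIEW AS or CREATE OR REPLACE VIEW AS")
    }
}
```

For instance, `PersistedViewDecision.decide("t1", Some(false), allowExisting = false, replace = false)` models the earlier `CREATE VIEW t1` attempt over an existing table and yields the "is not a view" failure.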
run throws an AnalysisException if the userSpecifiedColumns are defined and their number differs from the number of output schema attributes of the analyzed logical plan.

```
The number of columns produced by the SELECT clause (num: `[output.length]`) does not match the number of column names specified by CREATE VIEW (num: `[userSpecifiedColumns.length]`).
```
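The column-count check can be sketched in a few lines of plain Scala. `ColumnCountCheck` and `check` are hypothetical names, and the `Either` result stands in for throwing an AnalysisException; only the condition and the message format come from the text above.

```scala
object ColumnCountCheck {
  /** Left(error message) when user-specified columns are given and their number
    * differs from the number of output columns of the query; Right(()) otherwise. */
  def check(output: Seq[String], userSpecifiedColumns: Seq[String]): Either[String, Unit] =
    if (userSpecifiedColumns.nonEmpty && userSpecifiedColumns.length != output.length)
      Left(
        s"The number of columns produced by the SELECT clause (num: `${output.length}`) " +
          "does not match the number of column names specified by CREATE VIEW " +
          s"(num: `${userSpecifiedColumns.length}`).")
    else
      Right(())
}
```

With a single output column and two user-specified names, as in the failing `qf1` query earlier, `check(Seq("id"), Seq("c1", "c2"))` returns a `Left`. An empty list of user-specified columns always passes.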
Creating CreateViewCommand Instance
CreateViewCommand takes the following when created:
- Child logical plan
verifyTemporaryObjectsNotExists
Internal Method
```scala
verifyTemporaryObjectsNotExists(sparkSession: SparkSession): Unit
```

verifyTemporaryObjectsNotExists …FIXME

Note: verifyTemporaryObjectsNotExists is used exclusively when the CreateViewCommand logical command is executed.
aliasPlan
Internal Method
```scala
aliasPlan(session: SparkSession, analyzedPlan: LogicalPlan): LogicalPlan
```

aliasPlan …FIXME

Note: aliasPlan is used when the CreateViewCommand logical command is executed (and in prepareTable).