WindowSpecDefinition Unevaluable Expression
WindowSpecDefinition
is an unevaluable expression (i.e. with no support for eval
and doGenCode
methods).
WindowSpecDefinition
is created when:
-
AstBuilder
is requested to parse a window specification in a SQL query -
Column.over operator is used
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
import org.apache.spark.sql.expressions.Window val byValueDesc = Window.partitionBy("value").orderBy($"value".desc) val q = table.withColumn("count over window", count("*") over byValueDesc) import org.apache.spark.sql.catalyst.expressions.WindowExpression val windowExpr = q.queryExecution .logical .expressions(1) .children(0) .asInstanceOf[WindowExpression] scala> windowExpr.windowSpec res0: org.apache.spark.sql.catalyst.expressions.WindowSpecDefinition = windowspecdefinition('value, 'value DESC NULLS LAST, UnspecifiedFrame) |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 |
import org.apache.spark.sql.catalyst.expressions.WindowSpecDefinition Seq((0, "hello"), (1, "windows")) .toDF("id", "token") .createOrReplaceTempView("mytable") val sqlText = """ SELECT count(*) OVER myWindowSpec FROM mytable WINDOW myWindowSpec AS ( PARTITION BY token ORDER BY id RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) """ import spark.sessionState.{analyzer,sqlParser} scala> val parsedPlan = sqlParser.parsePlan(sqlText) parsedPlan: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan = 'WithWindowDefinition Map(myWindowSpec -> windowspecdefinition('token, 'id ASC NULLS FIRST, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) +- 'Project [unresolvedalias(unresolvedwindowexpression('count(1), WindowSpecReference(myWindowSpec)), None)] +- 'UnresolvedRelation `mytable` import org.apache.spark.sql.catalyst.plans.logical.WithWindowDefinition val myWindowSpec = parsedPlan.asInstanceOf[WithWindowDefinition].windowDefinitions("myWindowSpec") scala> println(myWindowSpec) windowspecdefinition('token, 'id ASC NULLS FIRST, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) scala> println(myWindowSpec.sql) (PARTITION BY `token` ORDER BY `id` ASC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) scala> sql(sqlText) res4: org.apache.spark.sql.DataFrame = [count(1) OVER (PARTITION BY token ORDER BY id ASC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW): bigint] scala> println(analyzer.execute(sqlParser.parsePlan(sqlText))) Project [count(1) OVER (PARTITION BY token ORDER BY id ASC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#25L] +- Project [token#13, id#12, count(1) OVER (PARTITION BY token ORDER BY id ASC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#25L, count(1) OVER (PARTITION BY token ORDER BY id ASC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#25L] +- Window [count(1) windowspecdefinition(token#13, id#12 ASC NULLS FIRST, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS count(1) OVER (PARTITION BY token ORDER BY id ASC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#25L], [token#13], [id#12 ASC NULLS FIRST] +- Project [token#13, id#12] +- SubqueryAlias mytable +- Project [_1#9 AS id#12, _2#10 AS token#13] +- LocalRelation [_1#9, _2#10] |
Name | Description | ||
---|---|---|---|
Window partition and order specifications (for which |
|||
|
Unsupported (i.e. reports a |
||
|
Disabled (i.e. |
||
|
Enabled (i.e. |
||
|
Enabled when children are and the input DataType is valid and the input frameSpecification is a |
||
|
Contains
|
Creating WindowSpecDefinition Instance
WindowSpecDefinition
takes the following when created:
-
Expressions for window partition specification
-
Window order specifications (as
SortOrder
unary expressions)
Validating Data Type Of Window Order– isValidFrameType
Internal Method
1 2 3 4 5 |
isValidFrameType(ft: DataType): Boolean |
isValidFrameType
is positive (true
) when the data type of the window order specification and the input ft
data type are as follows:
-
DateType and IntegerType
-
Equal
Otherwise, isValidFrameType
is negative (false
).
Note
|
isValidFrameType is used exclusively when WindowSpecDefinition is requested to checkInputDataTypes (with RangeFrame as the window frame specification)
|
Checking Input Data Types — checkInputDataTypes
Method
1 2 3 4 5 |
checkInputDataTypes(): TypeCheckResult |
Note
|
checkInputDataTypes is part of the Expression Contract to checks the input data types.
|
checkInputDataTypes
…FIXME