RankLike Contract
PrettyAttribute
PlanExpression
PlanExpression Contract for Expressions with Query Plans
PlanExpression
is the contract for Catalyst expressions that contain a QueryPlan.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
package org.apache.spark.sql.catalyst.expressions abstract class PlanExpression[T <: QueryPlan[_]] extends Expression { // only required methods that have no implementation // the others follow def exprId: ExprId def plan: T def withNewPlan(plan: T): PlanExpression[T] } |
Method | Description |
---|---|
|
|
|
|
|
PlanExpression | Description |
---|---|
ParseToTimestamp
ParseToTimestamp Expression
ParseToTimestamp
is a RuntimeReplaceable expression that represents the to_timestamp function (in logical query plans).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
// DEMO to_timestamp(s: Column): Column import java.sql.Timestamp import java.time.LocalDateTime val times = Seq(Timestamp.valueOf(LocalDateTime.of(2018, 5, 30, 0, 0, 0)).toString).toDF("time") scala> times.printSchema root |-- time: string (nullable = true) import org.apache.spark.sql.functions.to_timestamp val q = times.select(to_timestamp($"time") as "ts") scala> q.printSchema root |-- ts: timestamp (nullable = true) val plan = q.queryExecution.analyzed scala> println(plan.numberedTreeString) 00 Project [to_timestamp('time, None) AS ts#29] 01 +- Project [value#16 AS time#18] 02 +- LocalRelation [value#16] import org.apache.spark.sql.catalyst.expressions.ParseToTimestamp val ptt = plan.expressions.head.children.head.asInstanceOf[ParseToTimestamp] scala> println(ptt.numberedTreeString) 00 to_timestamp('time, None) 01 +- cast(time#18 as timestamp) 02 +- time#18: string // FIXME DEMO to_timestamp(s: Column, fmt: String): Column |
As a RuntimeReplaceable
expression, ParseToTimestamp
is replaced by Catalyst Optimizer with the child expression:
-
Cast(left, TimestampType)
forto_timestamp(s: Column): Column
function -
Cast(UnixTimestamp(left, format), TimestampType)
forto_timestamp(s: Column, fmt: String): Column
function
1 2 3 4 5 6 |
// FIXME DEMO Conversion to `Cast(left, TimestampType)` // FIXME DEMO Conversion to `Cast(UnixTimestamp(left, format), TimestampType)` |
Creating ParseToTimestamp Instance
ParseToTimestamp
takes the following when created:
-
Left expression
-
format
expression -
Child expression
ParseToDate
ParseToDate Expression
ParseToDate
is a RuntimeReplaceable expression that represents the to_date function (in logical query plans).
1 2 3 4 5 6 |
// DEMO to_date(e: Column): Column // DEMO to_date(e: Column, fmt: String): Column |
As a RuntimeReplaceable
expression, ParseToDate
is replaced by Catalyst Optimizer with the child expression:
-
Cast(left, DateType)
forto_date(e: Column): Column
function -
Cast(Cast(UnixTimestamp(left, format), TimestampType), DateType)
forto_date(e: Column, fmt: String): Column
function
1 2 3 4 5 6 |
// FIXME DEMO Conversion to `Cast(left, DateType)` // FIXME DEMO Conversion to `Cast(Cast(UnixTimestamp(left, format), TimestampType), DateType)` |
Creating ParseToDate Instance
ParseToDate
takes the following when created:
-
Left expression
-
format
expression -
Child expression
OffsetWindowFunction Contract — Unevaluable Window Function Expressions
OffsetWindowFunction Contract — Unevaluable Window Function Expressions
OffsetWindowFunction
is the base of window function expressions that are unevaluable and ImplicitCastInputTypes
.
Note
|
An unevaluable expression cannot be evaluated to produce a value (neither in interpreted nor code-generated expression evaluations) and has to be resolved (replaced) to some other expressions or logical operators at analysis or optimization phases or they fail analysis. |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
package org.apache.spark.sql.catalyst.expressions abstract class OffsetWindowFunction ... { // only required properties (vals and methods) that have no implementation // the others follow val default: Expression val direction: SortDirection val input: Expression val offset: Expression } |
Property | Description |
---|---|
|
|
|
|
|
|
|
OffsetWindowFunction
is not foldable.
OffsetWindowFunction
is nullable when the default is not defined or the default or the input expressions are.
When requested for the dataType, OffsetWindowFunction
simply requests the input expression for the data type.
When requested for the inputTypes, OffsetWindowFunction
returns the AnyDataType
, IntegerType with the data type of the input expression and the NullType.
OffsetWindowFunction
uses the following text representation (i.e. toString
):
1 2 3 4 5 |
[prettyName]([input], [offset], [default]) |
OffsetWindowFunction | Description |
---|---|
Lag |
|
Lead |
frame
Lazy Property
1 2 3 4 5 |
frame: WindowFrame |
Note
|
frame is part of the WindowFunction Contract to define the WindowFrame for function expression execution.
|
frame
…FIXME
Verifying Input Data Types — checkInputDataTypes
Method
1 2 3 4 5 |
checkInputDataTypes(): TypeCheckResult |
Note
|
checkInputDataTypes is part of the Expression Contract to verify (check the correctness of) the input data types.
|
checkInputDataTypes
…FIXME
Nondeterministic Contract
Nondeterministic Expression Contract
Nondeterministic
is a contract for Catalyst expressions that are non-deterministic and non-foldable.
Nondeterministic
expressions require explicit initialization (with the current partition index) before evaluating a value.
1 2 3 4 5 6 7 8 9 10 11 |
package org.apache.spark.sql.catalyst.expressions trait Nondeterministic extends Expression { // only required methods that have no implementation protected def initializeInternal(partitionIndex: Int): Unit protected def evalInternal(input: InternalRow): Any } |
Method | Description |
---|---|
|
Initializing the Used exclusively when |
|
Evaluating the Used exclusively when |
Note
|
Nondeterministic expressions are the target of PullOutNondeterministic logical plan rule.
|
Expression | Description |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Name | Description |
---|---|
Always turned off (i.e. |
|
Always turned off (i.e. |
|
Controls whether a Turned off by default. |
Initializing Expression — initialize
Method
1 2 3 4 5 |
initialize(partitionIndex: Int): Unit |
Internally, initialize
initializes itself (with the input partition index) and turns the internal initialized flag on.
Note
|
initialize is used exclusively when InterpretedProjection and InterpretedMutableProjection are requested to initialize themselves.
|
Evaluating Expression — eval
Method
1 2 3 4 5 |
eval(input: InternalRow): Any |
Note
|
eval is part of Expression Contract for the interpreted (non-code-generated) expression evaluation, i.e. evaluating a Catalyst expression to a JVM object for a given internal binary row.
|
eval
is just a wrapper of evalInternal that makes sure that initialize has already been executed (and so the expression is initialized).
Internally, eval
makes sure that the expression was initialized and calls evalInternal.
eval
reports a IllegalArgumentException
exception when the internal initialized flag is off, i.e. initialize has not yet been executed.
1 2 3 4 5 |
requirement failed: Nondeterministic expression [name] should be initialized before eval. |
NamedExpression Contract
NamedExpression Contract — Catalyst Expressions with Name, ID and Qualifier
NamedExpression
is a contract of Catalyst expressions that have a name, exprId, and optional qualifier.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
package org.apache.spark.sql.catalyst.expressions trait NamedExpression extends Expression { // only required methods that have no implementation def exprId: ExprId def name: String def newInstance(): NamedExpression def qualifier: Option[String] def toAttribute: Attribute } |
Method | Description |
---|---|
|
|
|
|
|
|
|