

Stack Generator Expression

Stack is…​FIXME
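Stack is the generator expression behind the stack SQL function, which arranges a list of values into rows. A minimal sketch in spark-shell (the generated column names col0, col1, … come from the expression's output schema):

```scala
// stack(n, v1, ..., vk) arranges the k values into n rows
spark.sql("SELECT stack(2, 1, 2, 3, 4)").show()
// +----+----+
// |col0|col1|
// +----+----+
// |   1|   2|
// |   3|   4|
// +----+----+
```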

Generating Java Source Code (ExprCode) For Code-Generated Expression Evaluation — doGenCode Method

Note
doGenCode is part of the Expression Contract to generate Java source code (ExprCode) for code-generated expression evaluation.

doGenCode…​FIXME


SortOrder Unevaluable Unary Expression

SortOrder is a unary expression that represents the sort ordering of an expression in a logical plan (e.g. for ORDER BY and SORT BY clauses).

SortOrder is used to specify the output data ordering requirements of a physical operator.

SortOrder is an unevaluable expression and cannot be evaluated (i.e. produce a value given an internal row).

Note
An unevaluable expression cannot be evaluated to produce a value (neither in interpreted nor code-generated expression evaluation) and has to be resolved to (replaced with) some other expression or logical operator at the analysis or optimization phase, or analysis fails.

SortOrder is never foldable (as an unevaluable expression with no evaluation).

Tip
Use asc, asc_nullsLast, desc or desc_nullsFirst operators from the Catalyst DSL to create a SortOrder expression, e.g. for testing or Spark SQL internals exploration.

Creating SortOrder Instance — apply Factory Method

apply is a convenience method to create a SortOrder with the defaultNullOrdering of the SortDirection.

Note
apply is used exclusively for window functions.

Catalyst DSL — asc, asc_nullsLast, desc and desc_nullsFirst Operators

asc and asc_nullsLast create a SortOrder expression with the Ascending sort direction, while desc and desc_nullsFirst use Descending.
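A minimal sketch in spark-shell (the 'id Symbol is turned into an expression by the DSL's implicits):

```scala
import org.apache.spark.sql.catalyst.dsl.expressions._

// asc uses the default null ordering of Ascending; asc_nullsLast overrides it
val sortNullsLast = 'id.asc_nullsLast
println(sortNullsLast.sql)  // `id` ASC NULLS LAST
```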

Creating SortOrder Instance

SortOrder takes the following when created:

SortDirection Contract

SortDirection is the base of sort directions.

Table 1. SortDirection Contract
Method                Description
defaultNullOrdering   Used when…​FIXME
sql                   Used when…​FIXME

Ascending and Descending Sort Directions

There are two sorting directions available, i.e. Ascending and Descending.


SizeBasedWindowFunction Contract — Declarative Window Aggregate Functions with Window Size

SizeBasedWindowFunction is the extension of the AggregateWindowFunction Contract for window functions that require the size of the current window for calculation.

Table 1. SizeBasedWindowFunction Contract
Property   Description
n          Size of the current window, as an AttributeReference expression with the name window__partition__size, IntegerType data type, and non-nullable

Table 2. SizeBasedWindowFunctions (Direct Implementations)
SizeBasedWindowFunction   Description
CumeDist                  Window function expression for the cume_dist standard function (Dataset API) and the cume_dist SQL function
NTile
PercentRank
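As a quick illustration of why these functions need the window size, here is cume_dist over a made-up DataFrame df (the DataFrame and column names are assumptions for the example):

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.cume_dist

// cume_dist divides the number of rows up to and including the current one
// by the window partition size -- the n property above
val byGroup = Window.partitionBy("group").orderBy("value")
val withCumeDist = df.withColumn("dist", cume_dist() over byGroup)
```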


SimpleTypedAggregateExpression

SimpleTypedAggregateExpression is…​FIXME

SimpleTypedAggregateExpression is created when…​FIXME

Table 1. SimpleTypedAggregateExpression's Internal Properties (e.g. Registries, Counters and Flags)
Name                 Description
evaluateExpression   Expression
resultObjToRow       UnsafeProjection

Creating SimpleTypedAggregateExpression Instance

SimpleTypedAggregateExpression takes the following when created:


ScalaUDAF — Catalyst Expression Adapter for UserDefinedAggregateFunction

ScalaUDAF is a Catalyst expression adapter that manages the lifecycle of a UserDefinedAggregateFunction and hooks it into Spark SQL's Catalyst execution path.
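For context, here is a minimal sketch of the kind of UserDefinedAggregateFunction (a plain sum) that ScalaUDAF adapts; registering it is one way to get it wrapped in a ScalaUDAF:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

// A minimal sketch of a UserDefinedAggregateFunction (a plain sum over longs)
class MySum extends UserDefinedAggregateFunction {
  def inputSchema: StructType = new StructType().add("value", LongType)
  def bufferSchema: StructType = new StructType().add("sum", LongType)
  def dataType: DataType = LongType
  def deterministic: Boolean = true
  def initialize(buffer: MutableAggregationBuffer): Unit = buffer.update(0, 0L)
  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    buffer.update(0, buffer.getLong(0) + input.getLong(0))
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit =
    buffer1.update(0, buffer1.getLong(0) + buffer2.getLong(0))
  def evaluate(buffer: Row): Any = buffer.getLong(0)
}

// Registering the function lets it be used in SQL, wrapped in a ScalaUDAF
spark.udf.register("mysum", new MySum)
```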

ScalaUDAF is created when:

ScalaUDAF is an ImperativeAggregate.

Table 1. ScalaUDAF's ImperativeAggregate Methods
Method Name   Behaviour
initialize    Requests UserDefinedAggregateFunction to initialize
merge         Requests UserDefinedAggregateFunction to merge
update        Requests UserDefinedAggregateFunction to update

When evaluated, ScalaUDAF…​FIXME

ScalaUDAF has no representation in SQL.

Table 2. ScalaUDAF's Properties
Name                       Description
aggBufferAttributes        AttributeReferences of aggBufferSchema
aggBufferSchema            bufferSchema of UserDefinedAggregateFunction
dataType                   DataType of UserDefinedAggregateFunction
deterministic              deterministic of UserDefinedAggregateFunction
inputAggBufferAttributes   Copy of aggBufferAttributes
inputTypes                 Data types from inputSchema of UserDefinedAggregateFunction
nullable                   Always enabled (i.e. true)

Table 3. ScalaUDAF's Internal Registries and Counters
Name                     Description
inputAggregateBuffer     Used when…​FIXME
inputProjection          Used when…​FIXME
inputToScalaConverters   Used when…​FIXME
mutableAggregateBuffer   Used when…​FIXME

Creating ScalaUDAF Instance

ScalaUDAF takes the following when created:

ScalaUDAF initializes the internal registries and counters.

initialize Method

initialize sets the given input aggregation buffer (an internal binary row) as the underlyingBuffer of MutableAggregationBufferImpl and requests the UserDefinedAggregateFunction to initialize (with the MutableAggregationBufferImpl).

Figure 1. ScalaUDAF initializes UserDefinedAggregateFunction
Note
initialize is part of ImperativeAggregate Contract.

update Method

update sets the given input aggregation buffer (an internal binary row) as the underlyingBuffer of MutableAggregationBufferImpl and requests the UserDefinedAggregateFunction to update.

Note
update uses inputProjection on the given input row and converts it using inputToScalaConverters.
Figure 2. ScalaUDAF updates UserDefinedAggregateFunction
Note
update is part of ImperativeAggregate Contract.

merge Method

merge first sets:

Figure 3. ScalaUDAF requests UserDefinedAggregateFunction to merge
Note
merge is part of ImperativeAggregate Contract.


ScalaUDF — Catalyst Expression to Manage Lifecycle of User-Defined Function

ScalaUDF is a Catalyst expression that manages the lifecycle of a user-defined function (and hooks it into Spark SQL's Catalyst execution path).

ScalaUDF is an ImplicitCastInputTypes and a UserDefinedExpression.

ScalaUDF has no representation in SQL.
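A minimal sketch of how a ScalaUDF comes about from the Dataset API (the function and column names are made up):

```scala
import org.apache.spark.sql.functions.{lit, udf}

// udf wraps a Scala function in a UserDefinedFunction
val upper = udf { s: String => s.toUpperCase }

// Applying it to a column creates a Column backed by a ScalaUDF expression
val q = spark.range(1).select(upper(lit("hello")) as "upper")
```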

ScalaUDF is created when:

Note
Spark SQL Analyzer uses HandleNullInputsForUDF logical evaluation rule to…​FIXME

Generating Java Source Code (ExprCode) For Code-Generated Expression Evaluation — doGenCode Method

Note
doGenCode is part of the Expression Contract to generate Java source code (ExprCode) for code-generated expression evaluation.

doGenCode…​FIXME

Evaluating Expression — eval Method

Note
eval is part of Expression Contract for the interpreted (non-code-generated) expression evaluation, i.e. evaluating a Catalyst expression to a JVM object for a given internal binary row.

eval executes the Scala function on the input internal row.

Creating ScalaUDF Instance

ScalaUDF takes the following when created:

  • A Scala function (as Scala’s AnyRef)

  • Output data type

  • Child Catalyst expressions

  • Input data types (if available)

  • Name (if defined)

  • nullable flag (turned on by default)

  • udfDeterministic flag (turned on by default)

ScalaUDF initializes the internal registries and counters.


ScalarSubquery (ExecSubqueryExpression) Expression

ScalarSubquery is an ExecSubqueryExpression that gives exactly one value, i.e. the value of executing a SubqueryExec subquery that results in a single row and a single column, or null if no rows were computed.

Important
Spark SQL uses the name ScalarSubquery twice, to represent both an ExecSubqueryExpression (this page) and a SubqueryExpression. It is confusing, so make sure you know which one you are dealing with.

ScalarSubquery is created exclusively when PlanSubqueries physical optimization is executed (and plans a ScalarSubquery expression).

ScalarSubquery expression cannot be evaluated, i.e. produce a value given an internal row.

ScalarSubquery uses…​FIXME…​for the data type.
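A scalar subquery in SQL text is the simplest way to get this expression into a physical plan (range is Spark SQL's built-in table-valued function):

```scala
// The subquery returns a single row and a single column
val q = spark.sql("SELECT (SELECT max(id) FROM range(10)) AS max_id")
q.explain()  // the physical plan should show a scalar-subquery expression
```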

Table 1. ScalarSubquery's Internal Properties (e.g. Registries, Counters and Flags)
Name      Description
result    The value of the single column in a single row after collecting the rows from executing the subquery plan, or null if no rows were collected
updated   Flag that says whether ScalarSubquery was updated with the collected result of executing the subquery plan

Creating ScalarSubquery Instance

ScalarSubquery takes the following when created:

Updating ScalarSubquery With Collected Result — updateResult Method

Note
updateResult is part of the ExecSubqueryExpression Contract to fill a Catalyst expression with a collected result from executing a subquery plan.

updateResult requests the SubqueryExec physical plan to execute and collect internal rows.

updateResult sets result to the value of the only column of the single row, or null if no rows were collected.

In the end, updateResult marks the ScalarSubquery instance as updated.

updateResult reports a RuntimeException when the result contains more than one row.

updateResult reports an AssertionError when the number of fields is not exactly 1.
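For example, a subquery that produces two rows passes analysis but fails at execution time (the exact message may differ across Spark versions):

```scala
// range(2) produces two rows, so updateResult reports a RuntimeException
spark.sql("SELECT (SELECT id FROM range(2)) AS v").collect()
// java.lang.RuntimeException: more than one row returned by a subquery used as an expression
```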

Evaluating Expression — eval Method

Note
eval is part of Expression Contract for the interpreted (non-code-generated) expression evaluation, i.e. evaluating a Catalyst expression to a JVM object for a given internal binary row.

eval simply returns the result value.

eval reports an IllegalArgumentException if the ScalarSubquery expression has not been updated yet.

Generating Java Source Code (ExprCode) For Code-Generated Expression Evaluation — doGenCode Method

Note
doGenCode is part of the Expression Contract to generate Java source code (ExprCode) for code-generated expression evaluation.

doGenCode first makes sure that the updated flag is on (true). If not, doGenCode throws an IllegalArgumentException.

doGenCode then creates a Literal (for the result and the dataType) and simply requests it to generate the Java source code.


ScalarSubquery (SubqueryExpression) Expression

ScalarSubquery is a SubqueryExpression that returns a single row and a single column only.

ScalarSubquery represents a structured query that can be used as a “column”.

Important
Spark SQL uses the name of ScalarSubquery twice to represent a SubqueryExpression (this page) and an ExecSubqueryExpression. You’ve been warned.

ScalarSubquery is created exclusively when AstBuilder is requested to parse a subquery expression.
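You can watch AstBuilder create the expression by parsing SQL text with the session's parser:

```scala
// parsePlan gives the (unresolved) logical plan; the projection contains
// a scalar-subquery expression created by AstBuilder
val plan = spark.sessionState.sqlParser.parsePlan("SELECT (SELECT 1) AS one")
println(plan.numberedTreeString)
```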

Creating ScalarSubquery Instance

ScalarSubquery takes the following when created:


RuntimeReplaceable Contract — Replaceable SQL Expressions

RuntimeReplaceable is the marker contract for unary expressions that are replaced by Catalyst Optimizer with their child expression (that can then be evaluated).

Note
Catalyst Optimizer uses ReplaceExpressions logical optimization to replace RuntimeReplaceable expressions.

The RuntimeReplaceable contract allows for expression aliases, i.e. expressions that look simple on the outside but are fairly complex on the inside, and is used to provide compatibility with other SQL databases by supporting SQL functions through more complex Catalyst expressions (that are already supported by Spark SQL).

Note
RuntimeReplaceables are tied to their SQL functions in FunctionRegistry.

RuntimeReplaceable expressions cannot be evaluated (i.e. produce a value given an internal row) and therefore have to be replaced in the query execution pipeline.
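You can see the replacement by comparing the analyzed and optimized plans, e.g. for nvl, whose child is a coalesce expression:

```scala
// In the analyzed plan the projection shows nvl(...); after ReplaceExpressions
// (part of optimization) only its child, coalesce(...), remains
val q = spark.sql("SELECT nvl(NULL, 'hello') AS value")
q.explain(extended = true)
```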

Note
To make sure the explain plan and expression SQL work correctly, a RuntimeReplaceable implementation should override the flatArguments and sql methods.
Table 1. RuntimeReplaceables
RuntimeReplaceable   Standard Function   SQL Function
IfNull                                   ifnull
Left                                     left
NullIf                                   nullif
Nvl                                      nvl
Nvl2                                     nvl2
ParseToDate          to_date             to_date
ParseToTimestamp     to_timestamp        to_timestamp
Right                                    right
