First Aggregate Function Expression
First is a DeclarativeAggregate function expression that is created when:
- AstBuilder is requested to parse a FIRST statement
- first standard function is used
- first and first_value SQL functions are used
    val sqlText = "FIRST (organizationName IGNORE NULLS)"
    val e = spark.sessionState.sqlParser.parseExpression(sqlText)
    scala> :type e
    org.apache.spark.sql.catalyst.expressions.Expression

    import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression
    val aggExpr = e.asInstanceOf[AggregateExpression]

    import org.apache.spark.sql.catalyst.expressions.aggregate.First
    val f = aggExpr.aggregateFunction
    scala> println(f.simpleString)
    first('organizationName) ignore nulls
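The other two creation paths listed above (the first standard function and the first and first_value SQL functions) can be exercised with a small sketch like the following. The local SparkSession, the sample data, the orgs view name and the helper names viaStandardFunction and viaSql are all hypothetical and exist only for illustration. Each id has exactly one non-null name so that the result does not depend on row order.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.first

object FirstCreationPaths {
  // Local session for illustration only (an assumption, not part of the original text)
  lazy val spark: SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("first-creation-paths")
    .getOrCreate()

  import spark.implicits._

  // Hypothetical sample data: every id has exactly one non-null organization name
  lazy val orgs = Seq[(Int, Option[String])](
    (1, None), (1, Some("Acme")), (2, Some("Globex"))
  ).toDF("id", "organizationName")

  // Path 1: the first standard function, skipping nulls
  def viaStandardFunction(): Map[Int, String] =
    orgs.groupBy("id")
      .agg(first("organizationName", ignoreNulls = true).as("org"))
      .as[(Int, String)]
      .collect()
      .toMap

  // Path 2: the first or first_value SQL function; the second argument asks to ignore nulls
  def viaSql(fn: String): Map[Int, String] = {
    orgs.createOrReplaceTempView("orgs")
    spark.sql(s"SELECT id, $fn(organizationName, true) AS org FROM orgs GROUP BY id")
      .as[(Int, String)]
      .collect()
      .toMap
  }
}
```

Calling FirstCreationPaths.viaStandardFunction(), FirstCreationPaths.viaSql("first") and FirstCreationPaths.viaSql("first_value") should give the same mapping from id to organization name.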
When requested to evaluate (and return the final value), First simply returns an AttributeReference (with first name and the data type of the child expression).
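To make the role of the first attribute more concrete, here is a plain-Scala model of the aggregation that does not use Spark at all. It is a sketch under assumptions: the valueSet flag and the update rule reflect how First is understood to behave with and without IGNORE NULLS, and the names FirstSemantics, Buffer, update, evaluate and firstOf are hypothetical. Nullable inputs are modelled as Option values.

```scala
object FirstSemantics {
  // Aggregation buffer: the value picked so far and whether it has been picked yet
  final case class Buffer[T](first: Option[T], valueSet: Boolean)

  // Nothing has been seen yet
  def initial[T]: Buffer[T] = Buffer(None, valueSet = false)

  // Keep the buffer once a value is set; otherwise take the input,
  // unless nulls are ignored and the input is null (None)
  def update[T](buf: Buffer[T], input: Option[T], ignoreNulls: Boolean): Buffer[T] =
    if (buf.valueSet) buf
    else if (ignoreNulls && input.isEmpty) buf
    else Buffer(input, valueSet = true)

  // Evaluation simply reads the first slot of the buffer
  def evaluate[T](buf: Buffer[T]): Option[T] = buf.first

  // Fold all rows through the buffer and evaluate the result
  def firstOf[T](rows: Seq[Option[T]], ignoreNulls: Boolean): Option[T] =
    evaluate(rows.foldLeft(initial[T])((buf, in) => update(buf, in, ignoreNulls)))
}
```

In this model, a leading null is returned as the result unless ignoreNulls is enabled, in which case the first non-null value wins.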
Tip
Use the first operator from the Catalyst DSL to create a First aggregate function expression, e.g. for testing or Spark SQL internals exploration.
Catalyst DSL: first Operator
    first(e: Expression): Expression
first creates a First expression and requests it to convert to an AggregateExpression.
    import org.apache.spark.sql.catalyst.dsl.expressions._
    val e = first('orgName)
    scala> println(e.numberedTreeString)
    00 first('orgName, false)
    01 +- first('orgName)()
    02    :- 'orgName
    03    +- false

    import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression
    val aggExpr = e.asInstanceOf[AggregateExpression]

    import org.apache.spark.sql.catalyst.expressions.aggregate.First
    val f = aggExpr.aggregateFunction
    scala> println(f.simpleString)
    first('orgName)()
Creating First Instance
First takes the following when created:
- Child expression
- ignoreNullsExpr flag expression