BoundReference Leaf Expression — Reference to Value in Internal Binary Row
BoundReference is a leaf expression that evaluates to a value in an internal binary row at a specified position and of a given data type.
BoundReference takes the following when created:
- Ordinal, i.e. the position of the value in the internal binary row
- Data type of the value
- nullable flag that controls whether the value can be null or not
```scala
import org.apache.spark.sql.catalyst.expressions.BoundReference
import org.apache.spark.sql.types.LongType
val boundRef = BoundReference(ordinal = 0, dataType = LongType, nullable = true)

scala> println(boundRef.toString)
input[0, bigint, true]

import org.apache.spark.sql.catalyst.InternalRow
val row = InternalRow(1L, "hello")

val value = boundRef.eval(row).asInstanceOf[Long]
```
You can also create a BoundReference using Catalyst DSL’s at method.
```scala
import org.apache.spark.sql.catalyst.dsl.expressions._
val boundRef = 'hello.string.at(4)

scala> println(boundRef)
input[4, string, true]
```
Evaluating Expression — eval Method
```scala
eval(input: InternalRow): Any
```
Note: eval is part of the Expression Contract for interpreted (non-code-generated) expression evaluation, i.e. evaluating a Catalyst expression to a JVM object for a given internal binary row.
eval returns the value at the ordinal position of the input internal binary row, read as the declared data type.
Internally, eval returns null if the value at the position is null.
Otherwise, eval uses the methods of InternalRow per the defined data type to access the value.
| DataType | InternalRow’s Method |
|---|---|
| others | |
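The null check followed by a type-directed read can be sketched as below. This is a minimal illustration, not Spark's actual implementation: `readAt` is a hypothetical helper introduced only for this example, and it covers just a handful of data types. The accessors it calls (`isNullAt`, `getBoolean`, `getInt`, `getLong`, `getDouble`, `getUTF8String` and the generic `get`) are regular InternalRow methods. The row is built with `UTF8String` for the string field on the assumption that string values in internal rows use that representation.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.BoundReference
import org.apache.spark.sql.types._
import org.apache.spark.unsafe.types.UTF8String

// Hypothetical helper (not part of Spark): check for null first, then
// read the value with the accessor that matches the data type.
def readAt(row: InternalRow, ordinal: Int, dataType: DataType): Any =
  if (row.isNullAt(ordinal)) null
  else dataType match {
    case BooleanType => row.getBoolean(ordinal)
    case IntegerType => row.getInt(ordinal)
    case LongType    => row.getLong(ordinal)
    case DoubleType  => row.getDouble(ordinal)
    case StringType  => row.getUTF8String(ordinal)
    case other       => row.get(ordinal, other) // generic fallback
  }

// A row with a long, a string (in its internal form) and a null
val row = InternalRow(1L, UTF8String.fromString("hello"), null)

val longRef   = BoundReference(ordinal = 0, dataType = LongType, nullable = true)
val stringRef = BoundReference(ordinal = 1, dataType = StringType, nullable = true)
val nullRef   = BoundReference(ordinal = 2, dataType = StringType, nullable = true)

// eval and the sketch should agree for every position
val longValue   = longRef.eval(row)   // boxed Long
val stringValue = stringRef.eval(row) // UTF8String
val nullValue   = nullRef.eval(row)   // null
```

Since eval returns Any, callers usually cast the result to the type they expect, as in the `asInstanceOf[Long]` example earlier.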
Generating Java Source Code (ExprCode) For Code-Generated Expression Evaluation — doGenCode Method
```scala
doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode
```
Note: doGenCode is part of the Expression Contract to generate Java source code (ExprCode) for code-generated expression evaluation.
doGenCode…FIXME
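One way to see what the code generation path produces for a BoundReference is to request an ExprCode through the public `genCode` method, which delegates to doGenCode. The sketch below assumes that `CodegenContext` can be created with its no-argument constructor and used with its default settings. The exact generated Java text differs between Spark versions, so no output is shown.

```scala
import org.apache.spark.sql.catalyst.expressions.BoundReference
import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext
import org.apache.spark.sql.types.LongType

val boundRef = BoundReference(ordinal = 0, dataType = LongType, nullable = true)

// A fresh context with default settings (an assumption of this sketch)
val ctx = new CodegenContext

// genCode is the public entry point that ends up calling doGenCode
val ev = boundRef.genCode(ctx)

// ExprCode carries the generated Java snippet together with the names
// of the variables that hold the null flag and the value
println(ev.code)
println(ev.isNull)
println(ev.value)
```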